From idea to app in less than three weeks

Three weeks ago I had an idea: free, weekly, themed mobile photo competitions. Now I have an app! Here’s how I did it.

Design

Graphic design is one of my weaker skills, so I’ve learned to just keep things as simple as possible. I use default controls when available and apply subtle shadows, gradients, and textures to make simple interface elements beautiful. I began my quest for a unique title font at The League of Moveable Type and chose Raleway. After applying a gentle shadow, I set the title image of my app and now I have branding! Using the free icon set from Glyphish, I then experimented with button placement and design until I was satisfied with the look and usability of the front page. Here’s the full evolution, which took place over a few days. You can see I started out with small buttons but switched to much larger versions after realizing the tap zones could be greatly expanded.

[Image: design evolution]

Hey, it won’t win any design awards, but at least it’s classy and usable.

Code

The app wouldn’t be very useful without any data, so I had to write code to handle it. Lots of code. In lots of places. This is where it helps to have experience.

Behind the scenes, I use PHP running on Lighttpd, along with a MySQL backend, all on a single Amazon EC2 instance. Photo uploads go straight to S3, so I don’t have to worry about any filesystem wrangling or quota limits.

Each time the app launches, it communicates with an API on my web server to retrieve the current theme and status. To allow voting, the app also needs a list of photos and past themes. Here’s a sample of how the API data is structured:
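A launch response along these lines might look like the sketch below. The field names (`theme`, `status`, `photos`, `past_themes`) and every value are assumptions made up for illustration, not the app's actual schema, and the parsing is shown in Python rather than in the app's own code.

```python
import json

# Hypothetical payload: every field name and value here is an assumption
# made for illustration, not the app's real API schema.
SAMPLE_RESPONSE = """
{
  "theme": {"id": 12, "title": "Reflections", "status": "voting"},
  "photos": [
    {"id": 101, "url": "https://example.com/photos/101.jpg", "votes": 4},
    {"id": 102, "url": "https://example.com/photos/102.jpg", "votes": 7}
  ],
  "past_themes": [{"id": 11, "title": "Shadows"}]
}
"""


def parse_launch_response(raw):
    """Decode the JSON text and pull out what the front page needs."""
    data = json.loads(raw)
    theme = data["theme"]
    # Sort photos so the most-voted entry comes first.
    photos = sorted(data["photos"], key=lambda p: p["votes"], reverse=True)
    return {
        "title": theme["title"],
        "status": theme["status"],
        "photo_ids": [p["id"] for p in photos],
        "past_titles": [t["title"] for t in data["past_themes"]],
    }


if __name__ == "__main__":
    print(parse_launch_response(SAMPLE_RESPONSE))
```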

The app then takes this JSON data and turns it into a nice pretty display:

Similar communication occurs between the app and web server for every action: uploads, votes, profile edits, etc. Everything a user does must be synchronized online, so the app must constantly manage communications like this behind the scenes.

Conclusion

Hopefully this gives you a better idea of what goes into making internet-connected applications. This particular app took about 80 hours to create over 3 weeks, and required knowledge from a lot of different areas. Even simple ideas can cover the full range from frontend design to backend databases.

I submitted the app for review yesterday, and now it just needs users!

Join the beta list or follow me on Twitter.

S3 Hosting vs. EC2 Micro

Today Amazon announced their new website endpoint feature for hosting static websites completely on S3. In short, it allows S3 buckets to have their own index and error documents. It had been an often-requested feature, and I’m happy to see Amazon continue to improve their web services.

Setting this up is pretty simple: just log in to your AWS console, select the bucket you wish to use, and check the Enabled box.

While the ability to host static content on S3 isn’t new by any means, I’m sure this new built-in support, along with the ridiculously simple setup, will cause many people to consider using S3 as their exclusive web host.

Let’s run some benchmarks to see how performance compares. I copied the index page of my website (running on a micro instance in Amazon’s west-coast datacenter served by lighttpd) to an S3 bucket in the Northern California region, and ran the Apache benchmark tool from an offsite location to simulate traffic. The most important numbers here are requests per second.

EC2:

S3:

Impressive! While EC2 wins with 1440.90 requests per second, S3 still manages to keep up, serving 1328.10 requests per second. Man, static content is fast. I’m calling it a draw.
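To put a number on "a draw", here is a quick sketch of the relative gap between the two results quoted above. The helper name `percent_slower` is just something made up for this example.

```python
def percent_slower(baseline_rps, other_rps):
    """How far other_rps falls short of baseline_rps, as a percentage."""
    return (baseline_rps - other_rps) / baseline_rps * 100


EC2_RPS = 1440.90  # requests per second, from the EC2 run
S3_RPS = 1328.10   # requests per second, from the S3 run

if __name__ == "__main__":
    gap = percent_slower(EC2_RPS, S3_RPS)
    print(f"S3 served {gap:.1f}% fewer requests per second than EC2")
```

That works out to a gap of a little under 8%, which is small for a single benchmark run.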

If you don’t serve any dynamic pages and don’t need any advanced server features, it looks like S3 would make an excellent hosting provider.

Using genetic algorithms to maximize Lending Club performance

I’ve had an account with Lending Club for over 3 years. In case you’re unfamiliar, Lending Club offers borrowers lower interest rates than a credit card company and provides individual lenders (like me) higher returns than a bank. They take about a 1% cut, but handle all the details like automatic withdrawals, payment-plan negotiation, and collections. It’s a win-win-win situation. They boast an impressive average of 9.68% APY for investors, but I wanted to see if I could beat that without spending hours picking through individual loans every day.

So I wrote some code.

I picked about a dozen search filter options I felt might strongly influence loan performance, and then built a system to apply and analyze these filters on the entire set of Lending Club history (freely available here). Are New York residents better borrowers than those in California? Let’s find out:

Answer: Yes, not only do New York residents have a lower default rate, but they also repay a greater amount of the loan before they fail. Good to know.
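A comparison like this boils down to grouping loans by one attribute and computing two numbers per group. Here is a minimal sketch, assuming each loan is a dict with hypothetical keys `amount`, `repaid`, and `defaulted`; the real history file uses its own column names, and the records in the usage note are invented placeholders.

```python
from collections import defaultdict


def summarize(loans, key):
    """Group loans by loans[i][key]; report default rate and the average
    fraction of principal repaid by the loans that went on to default."""
    groups = defaultdict(list)
    for loan in loans:
        groups[loan[key]].append(loan)

    summary = {}
    for value, members in groups.items():
        failed = [m for m in members if m["defaulted"]]
        repaid = [m["repaid"] / m["amount"] for m in failed]
        summary[value] = {
            "default_rate": len(failed) / len(members),
            # None when no loan in the group defaulted.
            "repaid_before_default": sum(repaid) / len(repaid) if repaid else None,
        }
    return summary
```

Calling `summarize(loans, "state")` on a list such as `[{"state": "XX", "amount": 2000.0, "repaid": 1500.0, "defaulted": True}, ...]` returns one entry per state, which is enough to compare any two of them side by side.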

Armed with this new ability to analyze 4+ years of Lending Club history, I implemented a genetic algorithm to seek out the best combination of search filters. It wouldn’t be enough to test each filter individually, because they may interact with each other in interesting ways. Perhaps California homeowners are better borrowers than New York renters (Answer: they’re not). As there are literally billions of possibilities though, it would not be feasible to check them all.

Genetic algorithms work by creating a bunch of candidates, measuring their success, and then combining and mutating the best ones to produce the next batch. It’s a kind of guided randomness. This happens over and over again, each generation slightly better than the last, until the result converges on the best solution it can find (or until I get bored). On the first run, which tested 500 completely random filter sets, my program found one option that yields a net 10.32% APY. Excellent! Let’s run it overnight and see what else it can find:

This graph resembles a logarithmic curve; it improves over time, quickly at first, but then slows near the end where it sticks around 12-13% APY. I let it run for another 12 hours just to be sure, but my program didn’t find anything else outside this range. These results are in line with Lending Club’s own statistics, where you’ll see only the top 15% of lenders earning these returns at sufficient volume.
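The create, measure, and improve loop described above can be sketched in a few dozen lines. This is a toy version, not the code from my repository: the population size, mutation rate, single-point crossover, and especially `toy_fitness` are assumptions chosen to keep the example self-contained. A real fitness function would replay each candidate's filters against the loan history and return its net APY.

```python
import random

NUM_FILTERS = 12       # roughly "a dozen" on/off filter options (assumption)
POPULATION_SIZE = 50
GENERATIONS = 40
MUTATION_RATE = 0.05

# Hidden "ideal" filter set, used only by the toy fitness function below.
TARGET = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1]


def toy_fitness(candidate):
    """Placeholder score: how many filters agree with TARGET."""
    return sum(1 for gene, want in zip(candidate, TARGET) if gene == want)


def random_candidate(rng):
    return [rng.randint(0, 1) for _ in range(NUM_FILTERS)]


def crossover(mom, dad, rng):
    """Single-point crossover: head of one parent, tail of the other."""
    point = rng.randint(1, NUM_FILTERS - 1)
    return mom[:point] + dad[point:]


def mutate(candidate, rng):
    """Flip each on/off gene with a small probability."""
    return [1 - g if rng.random() < MUTATION_RATE else g for g in candidate]


def evolve(seed=0):
    rng = random.Random(seed)
    population = [random_candidate(rng) for _ in range(POPULATION_SIZE)]
    best_per_generation = []
    for _ in range(GENERATIONS):
        population.sort(key=toy_fitness, reverse=True)
        best_per_generation.append(toy_fitness(population[0]))
        # Keep the top 20% unchanged, then breed them to refill the pool.
        survivors = population[: POPULATION_SIZE // 5]
        children = []
        while len(survivors) + len(children) < POPULATION_SIZE:
            mom, dad = rng.sample(survivors, 2)
            children.append(mutate(crossover(mom, dad, rng), rng))
        population = survivors + children
    return max(population, key=toy_fitness), best_per_generation


if __name__ == "__main__":
    best, history = evolve()
    print("best candidate:", best)
    print("best score per generation:", history)
```

Because the top candidates survive each round untouched, the best score can never go down from one generation to the next, which is why the curve climbs quickly and then flattens.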

A quick word about volume: my simulation was aimed at producing filter sets that could sustain investments of over $1,000 per month at the minimum allowed $25 per loan. This requirement also ensures that my analysis has statistical significance. As the number of matching loans increases, the resulting net APY decreases. This makes sense because the best loans are rare, and increasing the loan pool dilutes their significance. Here’s a quick graph of the relationship between loan volume and net APY from the 24-hour run:

So enough explanation and illustration… let’s see some hard results! Here’s the set of search filters for the best one I found (12.5% net APY, 2% default rate):

  • Credit Grade: C, D, E
  • Debt-to-Income Ratio: <25%
  • Home Ownership: Mortgage, Own
  • Inquiries in the last 6 months: <1
  • Loan Purpose: refinance credit card, consolidate debt, home improvement, vacation, moving expenses, wedding expenses, home down payment, renewable energy
  • Months since last delinquency: >24
  • Exclude loans with public records: Yes
  • Exclude States: AZ, CA, FL, GA, IL, MD, NV
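If you'd rather apply this filter set in code than in the search form, it can be written as a single predicate. This is only a sketch: the dict keys (`grade`, `dti_percent`, `home_ownership`, and so on) and the purpose labels are hypothetical names, not Lending Club's actual field names, and treating "never delinquent" (`None`) as passing the delinquency filter is my assumption.

```python
ALLOWED_GRADES = {"C", "D", "E"}
ALLOWED_HOME = {"MORTGAGE", "OWN"}
# Hypothetical labels for the eight loan purposes listed above.
ALLOWED_PURPOSES = {
    "credit_card", "debt_consolidation", "home_improvement", "vacation",
    "moving", "wedding", "home_down_payment", "renewable_energy",
}
EXCLUDED_STATES = {"AZ", "CA", "FL", "GA", "IL", "MD", "NV"}


def matches(loan):
    """Return True when a loan passes every filter in the list above."""
    months = loan["months_since_delinquency"]
    return (
        loan["grade"] in ALLOWED_GRADES
        and loan["dti_percent"] < 25
        and loan["home_ownership"] in ALLOWED_HOME
        and loan["inquiries_6m"] < 1
        and loan["purpose"] in ALLOWED_PURPOSES
        # Assumption: no delinquency on record counts as passing.
        and (months is None or months > 24)
        and not loan["has_public_record"]
        and loan["state"] not in EXCLUDED_STATES
    )
```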

At the time of writing, this search brings up 23 available loans totaling over $400,000; enough volume to support 700 investors contributing $575 each at the $25 per loan minimum (to maximize diversity), and it should continue matching new loans at the rate of about 50 per month. If not, you can also include borrowers with a B credit grade for an extra 25 loans per month (and a resulting 11.77% net APY).
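The volume figures above are simple multiplication, spelled out here as a quick check. The variable names are just for this example.

```python
MIN_NOTE = 25            # minimum investment per loan, in dollars

loans_available = 23     # loans matching the filters at the time of writing
investors = 700

per_investor = loans_available * MIN_NOTE        # one $25 note in every loan
total_invested = per_investor * investors

loans_per_month = 50                             # expected new matches
monthly_capacity = loans_per_month * MIN_NOTE    # per investor, per month

with_b_grade = (loans_per_month + 25) * MIN_NOTE # adding B-grade borrowers

if __name__ == "__main__":
    print(per_investor)      # 575
    print(total_invested)    # 402500
    print(monthly_capacity)  # 1250
    print(with_b_grade)      # 1875
```

At about 50 new loans per month, one $25 note per loan comes to $1,250 per month, which clears the $1,000 monthly target.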

I’ve posted the source code on GitHub if you’re interested or want to learn more about genetic algorithms.

UPDATE: I originally had the list of states reversed, so when I should have excluded CA, I mistakenly included it. Whoops. This post has been updated with all correct information and results validated by lendstats.com. 2011/02/07 @ 2pm PST