I’ve had an account with Lending Club for over three years. In case you’re unfamiliar, Lending Club offers borrowers lower interest rates than a credit card company and gives individual lenders (like me) higher returns than a bank. They take about a 1% cut, but handle all the details like automatic withdrawals, payment-plan negotiation, and collections. It’s a win-win-win situation. They boast an impressive average APY of 9.68% for investors, but I wanted to see if I could beat that without spending hours picking through individual loans every day.
So I wrote some code.
I picked about a dozen search filter options I felt might strongly influence loan performance, and then built a system that applies these filters to the entire Lending Club loan history (freely available here) and analyzes how the matching loans performed. Are New York residents better borrowers than those in California? Let’s find out:
Answer: Yes, not only do New York residents have a lower default rate, but they also repay a greater amount of the loan before they fail. Good to know.
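A minimal Python sketch of this kind of comparison, for illustration only. The `Loan` fields and the sample records are made up; they are not the actual Lending Club export format or real data, and this is not the code from the project.

```python
from dataclasses import dataclass


@dataclass
class Loan:
    state: str        # borrower's state (hypothetical field name)
    amount: float     # principal funded
    repaid: float     # principal repaid so far
    defaulted: bool   # True if the loan failed


def summarize(loans, state):
    """Return (default_rate, avg_fraction_repaid_before_default) for one state."""
    subset = [loan for loan in loans if loan.state == state]
    if not subset:
        return None
    failed = [loan for loan in subset if loan.defaulted]
    default_rate = len(failed) / len(subset)
    # Average share of principal recovered from loans that later defaulted.
    avg_repaid = (
        sum(loan.repaid / loan.amount for loan in failed) / len(failed)
        if failed
        else None
    )
    return default_rate, avg_repaid


# Made-up records, only to show the mechanics.
SAMPLE = [
    Loan("NY", 1000, 1000, False),
    Loan("NY", 1000, 600, True),
    Loan("NY", 2000, 2000, False),
    Loan("NY", 1000, 1000, False),
    Loan("CA", 1000, 200, True),
    Loan("CA", 1000, 1000, False),
]

for state in ("NY", "CA"):
    print(state, summarize(SAMPLE, state))
```

The same two numbers (default rate, and how much gets repaid before a default) can be computed for any filter, not just state.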
Armed with this new ability to analyze 4+ years of Lending Club history, I implemented a genetic algorithm to seek out the best combination of search filters. Testing each filter individually wouldn’t be enough, because filters may interact with each other in interesting ways. Perhaps California homeowners are better borrowers than New York renters (Answer: they’re not). With literally billions of possible combinations, though, checking them all would not be feasible.
Genetic algorithms work by creating a population of candidates, measuring their success, and breeding new candidates from the most successful ones. It’s a kind of guided randomness. This happens over and over again, with each generation slightly better than the last, until the result converges on the best solution (or until I get bored). On the first run through, testing 500 completely random filter sets, my program found one option that yields a net 10.32% APY. Excellent! Let’s run it overnight and see what else it can find:
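To make the create, measure, improve loop concrete, here is a minimal sketch of a generic genetic algorithm in Python. It is not the project's code: the bit-list encoding (one bit per filter option), the keep-the-top-half selection, the single-point crossover, and the toy fitness function are all assumptions chosen for brevity. In the real thing, fitness would be the net APY of the loans a filter set matches.

```python
import random


def evolve(fitness, n_bits, pop_size=50, generations=40,
           mutation_rate=0.02, seed=0):
    """Evolve bit lists toward higher fitness; return (best, best_per_generation)."""
    rng = random.Random(seed)
    # Start with completely random candidates.
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        # Measure success and rank the candidates.
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        # Keep the better half unchanged (elitism), breed the rest from it.
        survivors = population[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            mom, dad = rng.sample(survivors, 2)
            cut = rng.randrange(1, n_bits)          # single-point crossover
            child = mom[:cut] + dad[cut:]
            # Occasionally flip a bit so new options keep getting explored.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = survivors + children
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


# Toy stand-in for "net APY": how many bits match a hidden target.
TARGET = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1]


def toy_fitness(candidate):
    return sum(a == b for a, b in zip(candidate, TARGET))


best, history = evolve(toy_fitness, n_bits=len(TARGET))
print(best, history[0], history[-1])
```

Because the better half always survives unchanged, the best score in this sketch can never get worse from one generation to the next.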
This graph resembles a logarithmic curve: returns improve quickly at first, then level off at around 12-13% APY. I let it run for another 12 hours just to be sure, but my program didn’t find anything outside this range. These results are in line with Lending Club’s own statistics, where you’ll see only the top 15% of lenders earning these returns at sufficient volume.
A quick word about volume: my simulation was aimed at producing filter sets that could sustain investments of over $1,000 per month at the minimum allowed $25 per loan, which works out to at least 40 matching loans per month. This requirement also ensures that each filter set matches enough loans for the analysis to be statistically meaningful. As the number of matching loans increases, the resulting net APY decreases. This makes sense because the best loans are rare, and widening the loan pool dilutes their contribution. Here’s a quick graph of the relationship between loan volume and net APY from the 24-hour run:
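One way to fold that volume requirement into a genetic algorithm's scoring is sketched below in Python. The hard cut-off (rejecting any filter set that matches too few loans) is an assumption for illustration; the post doesn't say how the actual program enforces it, and the function names are hypothetical.

```python
import math


def required_loans_per_month(monthly_investment, note_size=25):
    """Loans needed each month to place the whole investment in minimum-size notes."""
    return math.ceil(monthly_investment / note_size)


def score(net_apy, loans_per_month, monthly_investment=1000, note_size=25):
    """Return net APY if the filter set has enough volume, else reject it outright."""
    if loans_per_month < required_loans_per_month(monthly_investment, note_size):
        return float("-inf")  # assumed hard cut-off; a soft penalty would also work
    return net_apy


print(required_loans_per_month(1000))   # $1,000 per month in $25 notes
print(score(12.5, loans_per_month=50))  # enough volume: scored by APY
print(score(14.0, loans_per_month=10))  # too few loans: rejected
```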
So enough explanation and illustration… let’s see some hard results! Here’s the best set of search filters I found (12.5% net APY, 2% default rate):
- Credit Grade: C, D, E
- Debt-to-Income Ratio: <25%
- Home Ownership: Mortgage, Own
- Inquiries in the last 6 months: <1
- Loan Purpose: refinance credit card, consolidate debt, home improvement, vacation, moving expenses, wedding expenses, home down payment, renewable energy
- Months since last delinquency: >24
- Exclude loans with public records: Yes
- Exclude States: AZ, CA, FL, GA, IL, MD, NV
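For anyone who would rather apply these filters to a data export than click through the search form, the list above could be written as a predicate along these lines. This is a sketch only: the dictionary keys, purpose labels, and loan field names are hypothetical (not Lending Club's real column names), and treating "never delinquent" as passing the delinquency filter is an assumption.

```python
# Hypothetical field names and labels; the real export uses its own columns.
BEST_FILTER = {
    "grades": {"C", "D", "E"},
    "max_dti": 25.0,                    # debt-to-income in percent, exclusive
    "home_ownership": {"MORTGAGE", "OWN"},
    "max_inquiries_6m": 0,              # "<1" means zero inquiries
    "purposes": {
        "credit_card", "debt_consolidation", "home_improvement", "vacation",
        "moving", "wedding", "house", "renewable_energy",
    },
    "min_months_since_delinq": 24,      # exclusive
    "exclude_public_records": True,
    "excluded_states": {"AZ", "CA", "FL", "GA", "IL", "MD", "NV"},
}


def matches(loan, f=BEST_FILTER):
    """Return True if a loan (a dict) passes every filter in f."""
    delinq = loan["months_since_delinq"]  # None = never delinquent (assumed to pass)
    return (
        loan["grade"] in f["grades"]
        and loan["dti"] < f["max_dti"]
        and loan["home_ownership"] in f["home_ownership"]
        and loan["inquiries_6m"] <= f["max_inquiries_6m"]
        and loan["purpose"] in f["purposes"]
        and (delinq is None or delinq > f["min_months_since_delinq"])
        and not (f["exclude_public_records"] and loan["public_records"] > 0)
        and loan["state"] not in f["excluded_states"]
    )


# Made-up loan record for demonstration.
EXAMPLE = {
    "grade": "D", "dti": 18.2, "home_ownership": "MORTGAGE",
    "inquiries_6m": 0, "purpose": "debt_consolidation",
    "months_since_delinq": None, "public_records": 0, "state": "NY",
}

print(matches(EXAMPLE))                     # passes every filter
print(matches({**EXAMPLE, "state": "CA"}))  # fails the state exclusion
```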
At the time of writing, this search brings up 23 available loans totaling over $400,000. That is enough volume to support about 700 investors contributing $575 each (23 loans at the $25 per loan minimum, to maximize diversification), and it should continue matching new loans at the rate of about 50 per month. If you need more volume, you can also include borrowers with a B credit grade for an extra 25 loans per month (and a resulting 11.77% net APY).
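The capacity figures above are simple arithmetic, spelled out here in Python as a quick check. The one assumption is that "over $400,000" is treated as exactly $400,000, so the investor count is a lower bound.

```python
NOTE_SIZE = 25              # minimum investment per loan, in dollars
AVAILABLE_LOANS = 23        # loans matching the search at the time of writing
TOTAL_AVAILABLE = 400_000   # "over $400,000", used here as a lower bound
NEW_LOANS_PER_MONTH = 50    # expected rate of new matching loans

# One minimum-size note in every matching loan.
per_investor = AVAILABLE_LOANS * NOTE_SIZE

# How many investors could each take that full stake.
investors_supported = TOTAL_AVAILABLE // per_investor

# Dollars one investor can place each month going forward.
monthly_capacity = NEW_LOANS_PER_MONTH * NOTE_SIZE

print(per_investor, investors_supported, monthly_capacity)
```

That gives $575 per investor, room for roughly 700 investors, and $1,250 per month of ongoing capacity per investor, which clears the $1,000 per month target.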
I’ve posted the source code on GitHub if you’re interested or want to learn more about genetic algorithms.
UPDATE: I originally had the list of states reversed, so when I should have excluded CA, I mistakenly included it. Whoops. This post has been updated with all correct information and results validated by lendstats.com. 2011/02/07 @ 2pm PST