We all know how Uber works: you click a button, pay for a ride, wait for your driver, get in, arrive at your destination, and go on your way. After that, Uber takes care of paying the driver. It’s simple, right? Well, that simplicity changes when you add 100 million users.
The Challenge of Scale
With billions of rides and hundreds of billions of dollars involved, managing transactions became challenging. As Uber’s popularity surged, they evolved their tech stack to keep up. They went from using a single, monolithic application to a more complex structure, called microservices, which meant breaking the application into thousands of smaller services to handle each part of the process separately.
Building a Reliable System
Uber chose databases like Postgres and MySQL for storing data and built technology around them. However, when it comes to money, you need precision and reliability, and Uber’s system was not quite cutting it. Issues started arising: it became unstable, fragmented across different services, and was painfully slow at handling transactions. Reconciling these transactions, that is, making sure everything balanced, became a nightmare, leading to discrepancies and major delays.
Imagine being an Uber driver who has been giving rides for two days, only to face a cyberattack that disrupts payments. Now, all the money you expected is gone. To avoid issues like these, Uber had to rebuild its payment system from scratch.
Introducing the Job-Order System
They moved to a job-order system, where a “job” represents a trip or an Uber Eats order. This system provided flexibility, letting users add multiple stops along the way. For example, if someone needs to grab something from McDonald’s on their way home, the job-order system makes it possible.
Within each job are entries, which represent money moving between the payer, Uber, and the payee. These entries always balance out to zero, meaning no extra money is created or destroyed. For example, if a rider pays $20 for a trip, the $20 taken from the rider is matched by two entries on the other side: $18 for the driver and $2 as Uber’s service fee. Those amounts go into the driver’s account and Uber’s account, respectively. This is essentially how double-entry accounting works, making sure that debits and credits balance.
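The balancing rule can be sketched in a few lines of Python. This is only an illustration using the $20 trip from the example: the `Entry` class, the account names, and the sign convention (negative for money leaving an account, positive for money arriving) are assumptions, not Uber’s actual data model.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Entry:
    account: str     # hypothetical account name
    amount: Decimal  # negative = money leaves the account, positive = money arrives


def is_balanced(entries):
    """Return True when all entries in an order sum to exactly zero."""
    return sum((e.amount for e in entries), Decimal("0")) == 0


# The $20 trip: the rider pays, the driver and Uber receive.
trip_order = [
    Entry("rider", Decimal("-20.00")),
    Entry("driver", Decimal("18.00")),
    Entry("uber_service_fee", Decimal("2.00")),
]

print(is_balanced(trip_order))      # the full order balances
print(is_balanced(trip_order[:2]))  # dropping the fee entry breaks the balance
```

Using `Decimal` rather than floating-point numbers avoids rounding surprises, which matters when the check is an exact comparison against zero.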
The Double-Entry System
This double-entry system, similar to traditional ledgers, allows for financial transparency and control. If your account starts at $100 and ends at $1,000, but your debits and credits don’t match, you know something went wrong.
Addressing Database Limitations
But here’s where standard databases fell short. Most are “mutable,” meaning data is updated directly in place. For example, if you update your profile name, the system simply overwrites the old one. This works for basic information but not for money, which demands an immutable (append-only) approach. This way, instead of deleting or overwriting past records, corrections are added as new entries.
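To make the difference concrete, here is a minimal append-only ledger in Python. It is a sketch under stated assumptions: the `AppendOnlyLedger` class, its method names, and the use of integer cents are all hypothetical and only illustrate the idea of correcting a record by adding a new one.

```python
class AppendOnlyLedger:
    """Toy ledger: records can be added but never changed or removed."""

    def __init__(self):
        self._records = []  # each record is (account, amount_cents, note)

    def append(self, account, amount_cents, note=""):
        self._records.append((account, amount_cents, note))

    def balance(self, account):
        # The balance is derived by replaying every record for the account.
        return sum(cents for acc, cents, _ in self._records if acc == account)

    def history(self, account):
        return [r for r in self._records if r[0] == account]


ledger = AppendOnlyLedger()
ledger.append("driver", 1800, "trip fare")

# The fare was overstated by $2.00. Instead of overwriting the first
# record, a correcting entry is appended after it.
ledger.append("driver", -200, "correction: fare overstated")

print(ledger.balance("driver"))
print(ledger.history("driver"))
```

The balance reflects the correction, while the history still shows both the original record and the adjustment.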
Creating Ledger Store
To solve this, Uber created Ledger Store, a layer that sits on top of their database and introduces three core features: sealing, manifest, and revision.
- Sealing: Closes off changes in a specific time range, marking them as complete and making them read-only, so they can’t be tampered with.
- Manifest: A security file generated during the sealing process, allowing only Ledger Store to access past, sealed transactions.
- Revision: This function allows corrections to be made to past entries without removing the original record.
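The three features can be pictured with a toy in-memory store. This is a rough sketch, not Ledger Store’s real design: the `MiniLedgerStore` class and its methods are hypothetical, and the manifest here is simply a record count plus a SHA-256 digest of the sealed records, which is an assumption standing in for whatever the real manifest contains.

```python
import hashlib
import json


class SealedRangeError(Exception):
    """Raised when a write targets a time range that is already sealed."""


class MiniLedgerStore:
    def __init__(self):
        self.records = []      # append-only list of dicts
        self.sealed_until = 0  # timestamps below this value are read-only
        self.manifests = []    # one summary per sealed range

    def write(self, ts, record_id, amount_cents, revises=None):
        if ts < self.sealed_until:
            raise SealedRangeError(f"timestamp {ts} falls in a sealed range")
        self.records.append(
            {"ts": ts, "id": record_id, "amount_cents": amount_cents, "revises": revises}
        )

    def seal(self, until_ts):
        """Mark everything before until_ts read-only and record a manifest."""
        if until_ts <= self.sealed_until:
            raise ValueError("range is already sealed")
        sealed = [r for r in self.records if self.sealed_until <= r["ts"] < until_ts]
        digest = hashlib.sha256(json.dumps(sealed, sort_keys=True).encode()).hexdigest()
        self.manifests.append(
            {"from": self.sealed_until, "until": until_ts, "count": len(sealed), "sha256": digest}
        )
        self.sealed_until = until_ts

    def revise(self, now_ts, original_id, new_amount_cents):
        """Correct a past record by appending a new one that points at it."""
        self.write(now_ts, f"{original_id}-rev", new_amount_cents, revises=original_id)


store = MiniLedgerStore()
store.write(ts=1, record_id="trip-1", amount_cents=1800)
store.seal(until_ts=10)

try:
    store.write(ts=5, record_id="trip-late", amount_cents=500)
    late_write_rejected = False
except SealedRangeError:
    late_write_rejected = True

# The correction is written at the current time and references the original.
store.revise(now_ts=11, original_id="trip-1", new_amount_cents=1600)
```

After this runs, the write into the sealed range has been rejected, and the store holds both the original `trip-1` record and a revision that points back to it.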
Migration and Transition
Ledger Store made transaction management more reliable, but Uber still had to migrate over 250 billion records to their new database system. Processing all 300 terabytes of data at once was risky, because any error would mean starting over. Instead, Uber split the data, processed each part individually, and used checkpoints to track progress, so a failure only meant redoing the affected part. This approach took just a few weeks, saving both time and money.
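The checkpoint idea can be sketched as follows. Everything here is illustrative: the function names, the tiny in-memory chunks, and the simulated failure are assumptions, and a real migration would keep its checkpoints in durable storage rather than a Python set.

```python
def migrate_in_chunks(chunks, write_chunk, done):
    """Copy each chunk once, skipping chunks already recorded in `done`."""
    for chunk_id in sorted(chunks):
        if chunk_id in done:
            continue  # finished in an earlier run, so skip it
        write_chunk(chunk_id, chunks[chunk_id])
        done.add(chunk_id)  # checkpoint only after the chunk succeeds


chunks = {0: ["a", "b"], 1: ["c"], 2: ["d", "e"]}
target = {}      # stands in for the new database
done = set()     # stands in for a durable checkpoint store
writes = []      # order of successful chunk writes
fail_once = {1}  # chunk 1 fails on its first attempt


def write_chunk(chunk_id, records):
    if chunk_id in fail_once:
        fail_once.discard(chunk_id)
        raise IOError(f"simulated failure on chunk {chunk_id}")
    target[chunk_id] = list(records)
    writes.append(chunk_id)


try:
    migrate_in_chunks(chunks, write_chunk, done)  # first run stops at chunk 1
except IOError:
    pass
done_after_first_run = set(done)

migrate_in_chunks(chunks, write_chunk, done)      # second run resumes at chunk 1
```

The second run picks up at the failed chunk instead of starting over, so chunk 0 is written only once.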
Once they had historical data in place, Uber needed a seamless way to transition to a live system on one of the busiest applications. To do this, they introduced the Shadow Writer, a tool that sits between the old and new databases. The Shadow Writer writes data to the primary database instantly while updating the new database in the background. This approach keeps the main system fast, and if a background write fails, a “spot repair job” fixes it immediately.
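A simplified picture of this dual-write pattern is shown below. It is a sketch under stated assumptions, not Uber’s implementation: the `ShadowWriter` and `FlakyStore` classes are hypothetical, plain dictionaries stand in for the two databases, and the background work is triggered by an explicit method call rather than a real asynchronous worker.

```python
class FlakyStore(dict):
    """Dict-backed store whose first write fails for selected keys."""

    def __init__(self, fail_once=()):
        super().__init__()
        self._fail_once = set(fail_once)

    def put(self, key, value):
        if key in self._fail_once:
            self._fail_once.discard(key)
            raise ConnectionError(f"simulated failure writing {key}")
        self[key] = value


class ShadowWriter:
    def __init__(self, primary, shadow):
        self.primary = primary  # old database, written synchronously
        self.shadow = shadow    # new database, written in the background
        self.pending = []       # keys waiting for a background write
        self.failed = []        # keys whose background write failed

    def write(self, key, value):
        self.primary[key] = value  # the caller only waits for this write
        self.pending.append(key)

    def flush_background(self):
        for key in self.pending:
            try:
                self.shadow.put(key, self.primary[key])
            except ConnectionError:
                self.failed.append(key)  # leave it for spot repair
        self.pending.clear()

    def spot_repair(self):
        # Re-copy each failed key from the primary, which is the source of truth.
        for key in self.failed:
            self.shadow.put(key, self.primary[key])
        self.failed.clear()


primary, shadow = {}, FlakyStore(fail_once={"trip-2"})
writer = ShadowWriter(primary, shadow)
for key, cents in [("trip-1", 1800), ("trip-2", 2250), ("trip-3", 990)]:
    writer.write(key, cents)

writer.flush_background()
missing_before_repair = sorted(set(primary) - set(shadow))

writer.spot_repair()
```

Before the repair, one record is missing from the shadow store. After it, both stores hold the same data.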
Phasing in the New Database
Gradually, Uber phased in their new database, with fewer requests going to the old system and more to the new one. In the end, they moved to DocStore as the primary database, with DynamoDB as a backup.
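One common way to shift traffic gradually is to route a configurable percentage of requests to the new system. The sketch below assumes that approach. The source does not say how Uber split the traffic, so the `choose_database` function and the hash-bucket scheme are purely illustrative.

```python
import zlib


def choose_database(request_key, new_db_percent):
    """Route a request to the 'new' or 'old' database.

    Each key hashes to a stable bucket from 0 to 99, so a key that has
    moved to the new database stays there as the percentage increases.
    """
    bucket = zlib.crc32(request_key.encode("utf-8")) % 100
    return "new" if bucket < new_db_percent else "old"


keys = [f"order-{i}" for i in range(1000)]

# Fraction of sample keys served by the new database at each rollout stage.
share_at = {
    percent: sum(choose_database(k, percent) == "new" for k in keys) / len(keys)
    for percent in (0, 10, 50, 100)
}
print(share_at)
```

Raising `new_db_percent` step by step moves more requests to the new database, and lowering it again offers a quick way to roll back.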
Managing Indexes Efficiently
Managing data at this scale also involves a complex system of indexes. Imagine a movie store that categorizes films by genre; an index works similarly by categorizing entries to speed up search results. Uber handles trillions of indexes: some need precise, instant access (like those for credit card authorizations), while others don’t (like your previous ride history).
To manage these indexes efficiently, Uber created a system where new data backfills itself with historical information, with checksums verifying data integrity. This level of control is necessary to manage the enormous amounts of information we might take for granted.
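Checksum verification of a backfill can be sketched like this. The function names, the record layout, and the choice of SHA-256 are assumptions for illustration, since the source does not describe how Uber computes its checksums.

```python
import hashlib


def checksum(records):
    """Order-independent digest of a mapping of record id to value."""
    digest = hashlib.sha256()
    for record_id in sorted(records):
        digest.update(f"{record_id}:{records[record_id]};".encode("utf-8"))
    return digest.hexdigest()


def backfill(historical, index):
    """Copy historical records into the index, then verify them.

    Existing index entries are left alone, so live data is not overwritten.
    Returns True when the backfilled range matches the source exactly.
    """
    for record_id, value in historical.items():
        index.setdefault(record_id, value)
    backfilled = {record_id: index[record_id] for record_id in historical}
    return checksum(backfilled) == checksum(historical)


historical = {"trip-1": 1800, "trip-2": 2250}

live_index = {"trip-3": 990}  # already holds fresh data
clean_result = backfill(historical, live_index)

corrupt_index = {"trip-1": 9999}  # conflicts with the historical value
corrupt_result = backfill(historical, corrupt_index)

print(clean_result, corrupt_result)
```

The first backfill verifies cleanly. The second is flagged because an existing index entry disagrees with the historical record.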
The Power of Custom Solutions
The technology powering Uber’s platform is astounding. At this scale, they must create custom solutions that may not work for everyone but fit their unique needs perfectly.