Uber Slashes Storage Costs by 70% with MyRocks Differential Backups
By [Your Name], Staff Writer
Uber has reduced its storage costs by 70% by implementing a differential backup system for its distributed databases, according to a recent blog post. The system addresses a problem the ride-sharing company faced after migrating its Schemaless and Docstore services to MyRocks, a RocksDB-based MySQL storage engine.
Uber’s Schemaless and Docstore databases, which handle tens of petabytes of operational data and millions of requests per second, are crucial to its global operations. MyRocks is optimized for write operations and storage efficiency, but the migration initially presented a problem: the engine lacked incremental backup support. Every backup of a database partition therefore had to be a full backup, which led to substantial redundant data and escalating blob storage costs.
The newly developed differential backup system takes advantage of the immutable nature of MyRocks’ SSTable (Sorted String Table) files. Because an SSTable file never changes once written, a file that was captured by an earlier backup does not need to be copied again. Instead of replicating all files during each backup, the system maintains a shared pool of SSTable files and adds only newly created files to it. A manifest file acts as an index, recording the list of files that belong to each backup so that the backup can be restored when needed.
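The pooling idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Uber's implementation: the in-memory `pool` set, the `differential_backup` function, and the SSTable file names are all hypothetical, and the sketch assumes file names are never reused, which is what immutability makes possible.

```python
def differential_backup(live_sst_files, pool):
    """Simulate one backup of a partition against a shared pool.

    live_sst_files: names of the SSTable files currently in the partition.
    pool: set of SSTable names already stored in the shared pool (mutated).
    Returns (manifest_files, uploaded): every file the backup references,
    and the subset that actually had to be copied.
    """
    live = sorted(live_sst_files)
    # Only files the pool has never seen need to be uploaded.
    uploaded = [name for name in live if name not in pool]
    pool.update(uploaded)
    return live, uploaded


pool = set()

# Initial full backup: the pool is empty, so every file is uploaded.
manifest1, uploaded1 = differential_backup(["000001.sst", "000002.sst"], pool)

# Later backup: compaction dropped 000001.sst and produced 000003.sst.
manifest2, uploaded2 = differential_backup(["000002.sst", "000003.sst"], pool)

print(uploaded1)  # ['000001.sst', '000002.sst']
print(uploaded2)  # ['000003.sst']
print(manifest2)  # ['000002.sst', '000003.sst']
```

The second backup references two files but copies only one, which is where the storage savings come from.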
As detailed in a technical blog post by Adithya Reddy, the process begins with an initial full backup, which stores all metadata and SSTable files in a shared pool within the blob storage. Subsequent differential backups upload only new SSTable files to this pool and reuse the files already stored by previous backups. The backup manifest, implemented as a JSON document, tracks the backup type, success status, timing details, and file checksums, which together provide the metadata needed for recovery.
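A manifest of this kind might look like the following Python sketch. The field names (`backup_type`, `success`, `started_at`, `finished_at`, `files`), the use of SHA-256, and the timestamps are assumptions made for illustration; the source states only that the manifest is JSON and records backup type, success status, timing, and file checksums.

```python
import hashlib
import json


def build_manifest(backup_type, files, started_at, finished_at, success=True):
    """Build a manifest dict. files maps SSTable name -> file content (bytes)."""
    return {
        "backup_type": backup_type,
        "success": success,
        "started_at": started_at,
        "finished_at": finished_at,
        # One checksum per file, so a restore can detect missing or corrupt data.
        "files": {
            name: hashlib.sha256(data).hexdigest()
            for name, data in sorted(files.items())
        },
    }


def verify_restore(manifest, pool):
    """Return the names of files that are missing from pool or fail their checksum."""
    bad = []
    for name, checksum in manifest["files"].items():
        data = pool.get(name)
        if data is None or hashlib.sha256(data).hexdigest() != checksum:
            bad.append(name)
    return bad


# Placeholder contents and timestamps, for illustration only.
pool_contents = {"000002.sst": b"rows-a", "000003.sst": b"rows-b"}
manifest = build_manifest(
    "differential", pool_contents,
    started_at="2024-01-01T00:00:00Z", finished_at="2024-01-01T00:05:00Z",
)

# Round-trip through JSON, as the manifest would be stored alongside the pool.
restored = json.loads(json.dumps(manifest, indent=2))

print(verify_restore(restored, pool_contents))             # []
print(verify_restore(restored, {"000002.sst": b"rows-a"}))  # ['000003.sst']
```

Checking every listed file against its checksum before restoring matters more in a shared pool than in standalone full backups, because a single damaged file may be referenced by many manifests.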
The system is managed by a stateless service called Backup Scheduler, which determines backup timing and frequency based on each partition's backup status. The backups themselves run in ephemeral backup containers, which use Percona XtraBackup where necessary.
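One way a stateless scheduler could decide which partitions to back up is sketched below in Python. The `partitions_due` function, the 24-hour interval, and the partition names are hypothetical; the source does not describe Backup Scheduler's actual policy. The sketch is stateless in the sense that every decision is computed from the status passed in, with nothing remembered between calls.

```python
from datetime import datetime, timedelta


def partitions_due(status_by_partition, now, interval):
    """Return the partitions that need a backup.

    status_by_partition: maps partition name -> datetime of the last
    successful backup, or None if the partition was never backed up.
    A partition is due when it has no successful backup or when the
    last one is at least `interval` old.
    """
    due = []
    for partition, last_success in sorted(status_by_partition.items()):
        if last_success is None or now - last_success >= interval:
            due.append(partition)
    return due


now = datetime(2024, 1, 2, 12, 0)
status = {
    "partition-a": datetime(2024, 1, 2, 6, 0),  # backed up 6 hours ago
    "partition-b": datetime(2024, 1, 1, 6, 0),  # backed up 30 hours ago
    "partition-c": None,                        # never backed up
}

due = partitions_due(status, now, timedelta(hours=24))
print(due)  # ['partition-b', 'partition-c']
```

Each partition in the result would then be handed to a short-lived backup container.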
The approach demonstrates how optimizing a backup strategy can dramatically reduce operational costs, and it suggests that other organizations managing large-scale distributed databases may be able to achieve similar improvements. The implications extend beyond cost savings: the faster backup times also contribute to improved operational efficiency and resilience. Adapting this methodology to other database systems could yield comparable benefits across the industry.
References:
- Reddy, A. (Year). *[