From YouTube: FamilySearch: Huge Online Genealogical Database Driven by Cassandra

Description

Speaker: John Sumsion, Software Developer at FamilySearch

FamilySearch hosts a collaborative family tree with over a billion editable records. The tree currently serves as many as 10,000 concurrent users at peak weekly load. These users come from across the globe and collectively maintain and enhance the tree around the clock. Recent efforts to port the tree from a relational database to Cassandra have resulted in drastically improved performance and scalability. The database consists of more than 5 billion records in journaled form, and we anticipate having over 10 TB of live data available for users to view and edit, with that data size growing significantly as our user base grows. The dataset has resisted sharding in the past, so the port involved rethinking the core data model. The model we chose retains the consistency that our users demand and can be implemented without requiring ACID transactions. Specifically, it combines convergent and commutative replicated data types (CvRDTs and CmRDTs) with Cassandra's atomic batch implementation to meet the demanding needs of the family tree application.
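
To make the CvRDT idea concrete, here is a minimal Python sketch of a state-based last-writer-wins register. It is an illustration only and not FamilySearch's actual data model: the class name `LWWRegister`, its fields, and the tie-breaking rule are all assumptions made for this example. The point it shows is that when the merge function is commutative, associative, and idempotent, replicas converge to the same state regardless of the order in which they exchange updates, which is what lets such a design avoid ACID transactions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LWWRegister:
    """Hypothetical state-based (convergent) last-writer-wins register.

    Assumes each (timestamp, replica_id) pair is used for at most one write.
    """
    value: str
    timestamp: int
    replica_id: str

    def merge(self, other: "LWWRegister") -> "LWWRegister":
        # Keep the state with the larger (timestamp, replica_id) pair.
        # Ties on timestamp are broken by replica_id so that every
        # replica picks the same winner.
        if (other.timestamp, other.replica_id) > (self.timestamp, self.replica_id):
            return other
        return self


# Three replicas hold different edits of the same field.
a = LWWRegister("Jon Smith", 1, "r1")
b = LWWRegister("John Smith", 2, "r2")
c = LWWRegister("John Smyth", 2, "r1")

# Merging in any order yields the same final state.
left = a.merge(b).merge(c)
right = c.merge(a).merge(b)
print(left == right)
```

A real system would need a more careful notion of time than a bare integer, and a family tree needs richer types than a single register, but the merge properties are the same ones a production CvRDT relies on.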