From YouTube: Atigeo: Cassandra in Large Scale Enterprise Grade xPatterns Deployments


Speaker: Claudiu Barbura, Senior Director of Engineering at Atigeo

xPatterns is a big data analytics platform-as-a-service that enables rapid development of enterprise-grade analytical applications. It provides tools, API sets and a management console for building an ELT pipeline with data monitoring and quality gates; a data warehouse for ad-hoc and scheduled querying, analysis, model building and experimentation; tools for exporting data to Cassandra and SolrCloud clusters for real-time access through automatically generated low-latency, high-throughput APIs; and dashboard and visualization APIs and tools that leverage the available data and models.

In this talk I'll share some of the hard lessons we've learned in the past three years while leveraging Cassandra (and Hector) in large-scale enterprise-grade deployments. We will focus on three specific areas in which we identified consistent best practices and design patterns: (1) data model optimization as a result of exporting data from HDFS/Hive/Shark into Cassandra through Spark/Hadoop MR jobs under Mesos, with throttling, instrumentation and resilience features; (2) automatically publishing geo-replicated, instrumented and monitored REST APIs on top of the exported Cassandra data; and (3) lessons learned from running Cassandra at scale from version 0.6 to 2.0.6, including performance tuning and tips and tricks.

You will see live demos of our Publish to NoSQL tools (Spark/Shark, Mesos, Hive, Cassandra), a dashboard application built on top of generated data APIs (D3.js, Cassandra), and xPatterns' monitoring and instrumentation consoles (Graphite, Ganglia, Nagios).