12 Aug 2014
SlideShare: http://www.slideshare.net/JonHaddad/crash-course-intro-to-cassandra
This is a crash course in the Cassandra storage model. We will start with how data is laid out on disk to understand why it's fast for real-time workloads, and finish with an introduction to CQL, which we'll use to query Cassandra. If you have not used Cassandra previously, this will bring you up to speed for the talks that follow throughout the day.
Jon Haddad is an Apache Cassandra Evangelist at DataStax. Prior to DataStax, he worked as a Senior Architect at SHIFT, and he is one of the maintainers of CQLengine. Jon has spent the last decade at various startups in the LA area. His previous projects include Answerbag.com and Livestrong.com.
- 8 participants
- 58 minutes

12 Aug 2014
SlideShare: http://www.slideshare.net/planetcassandra/cool-stuff-we-use-cassandra-on
Zipwhip is the world's first text-only carrier and focuses on extending the phone number to all internet-connected devices. Billions of text messages traverse Zipwhip’s network each day. Similar to what users have come to expect from modern communication services, Zipwhip coordinates the state across all connected devices, and this presents a unique problem set. Zipwhip leverages Cassandra to store immense amounts of real-time data across multiple datacenters and to provide carrier-grade uptime and responsiveness. Zipwhip is currently trialing/proving Cassandra with signal coordination and has the ultimate goal of transitioning the storage of text messages from MySQL to Cassandra.
Michael is the cofounder and SVP of Architecture and Design for Zipwhip. With over 10 years’ experience building and designing carrier-class network infrastructure for high-volume transaction environments, he leads a team of over 20 developers that designed the world’s first cloud texting solution. With over 10 million active users, Zipwhip is leading the charge for IP-based texting solutions.
- 1 participant
- 33 minutes

12 Aug 2014
SlideShare: http://www.slideshare.net/planetcassandra/luke-tillman-getting-started-with-data-stax-net-driver-light-code-samples
So you’ve grabbed the latest 2.0 version of the DataStax C# driver from NuGet. Now what? In this talk, Luke will walk you through some of the basics of the C# driver--how to bootstrap the driver and connect to a cluster, execute CQL, and retrieve the results. Wondering what the difference between a PreparedStatement and a SimpleStatement is? Not sure what the appropriate lifetime is for a Cluster or a Session object? What about ADO.NET and LINQ support? We’ll cover this and more, so that you can get on with building applications on top of Cassandra. Even if you’re not a C# developer (or think that C# is the handiwork of the devil), many of the concepts we’ll cover will help you get started with the other DataStax drivers as well (Python, Java, and C++).
- 1 participant
- 40 minutes

12 Aug 2014
SlideShare: http://www.slideshare.net/planetcassandra/high-throughput-analytics-with-cassandra-azure
MetricsHub is a monitoring and scalability service for the public cloud, allowing customers to gather large amounts of data, analyze it, and act on it in real time. Taking advantage of Cassandra’s rapid ingestion rates and the elastic scale of Azure, MetricsHub analyzes billions of data points every day to reduce cost and improve availability for its customers.
Charles currently works on the Windows Azure monitoring team to define the next generation of cloud monitoring and management. Charles was responsible for technical and business areas at MetricsHub. He was a member of the founding team and developed the company from the idea stage to revenue and then to exit. MetricsHub was acquired by Microsoft on March 4th, 2013. The premium MetricsHub product was offered as a no-charge service following the acquisition.
- 2 participants
- 41 minutes

12 Aug 2014
SlideShare: http://www.slideshare.net/planetcassandra/c-data-modeling-37298974
Shehaaz Saif: Software Development Engineer, Expedia Inc.
- 6 participants
- 28 minutes

12 Aug 2014
SlideShare: http://www.slideshare.net/planetcassandra/olap-with-spark-and-cassandra
How do you rapidly derive complex insights on top of really big data sets in Cassandra? This session draws upon Evan's experience building a distributed, interactive, columnar query engine on top of Cassandra and Spark. We will start by surveying the existing query landscape of Cassandra and discuss ways to integrate Cassandra and Spark. We will dive into the design and architecture of a fast, column-oriented query architecture for Spark, and why columnar stores are so advantageous for OLAP workloads. I will present a schema for Parquet-like storage of analytical datasets on Cassandra. Find out why Cassandra and Spark are the perfect match for enabling fast, scalable, complex querying and storage of big analytical data.
Evan Chan is a Principal Systems Engineer at Socrata. In his own words: I love to design, build, and improve bleeding-edge distributed data and backend systems using the latest in open source technologies. I am a big believer in GitHub, open source, and meetups, and have given talks at conferences such as the Cassandra Summit 2013 and will be presenting at Cassandra Summit 2014.
- 1 participant
- 37 minutes

12 Aug 2014
SlideShare: http://www.slideshare.net/ClaudiuBarbura/lessons-learned-from-embedding-cassandra-in-xpatterns
In this talk we’ll share some of the hard lessons we’ve learned while leveraging Cassandra in large-scale enterprise-grade deployments. We will focus on three specific areas, in which we identified consistent best practices & design patterns, which over time we’ve embedded & automated as part of our big data analytics platform.
The first is data model optimization, in particular when exporting data from HDFS into Cassandra. The second is publishing & operating REST APIs on top of Cassandra stores, including throttling, instrumentation & automated retries. The third is geo-replication, specifically when highly sensitive data is involved and extra security controls must be in place. The talk includes a hands-on demo of the tools we’ve built, plus an open discussion about the key design choices and recommendations for future projects.
Claudiu is Atigeo’s Senior Director of Engineering, Platform Services, and oversees agile engineering teams in the US and Romania while also acting as Lead Architect in building and operating xPatterns, an enterprise-class Big Data Analytics platform. Claudiu has 17 years of industry experience in various roles, with a strong passion for Software Architecture, leveraging industry best patterns and practices and contributing a significant level of innovation. His experience spans the Open Source, Big Data and Microsoft’s Windows/.Net technology stacks.
As Atigeo’s Senior Vice President of Engineering, David Talby leads product development and management of teams in the US and Europe. David was a featured speaker at Strata Rx, AHIMA, and Software Practice Advancement. Prior to Atigeo, David managed Microsoft’s US and European business operations teams for Bing Shopping, and built and ran distributed teams at Amazon which scaled Amazon’s financial systems. David has 25 published papers & patents to date, and holds both a PhD in Computer Science and an MBA from Hebrew University of Jerusalem.
- 3 participants
- 29 minutes
