27 Oct 2014
PagerDuty had the misfortune of watching its abused, underprovisioned Cassandra cluster collapse. This talk will cover the lessons learned from that experience, such as:
• Which of the many, many metrics we learned to watch
• What mistakes we made that led to this catastrophe
• How we have changed our usage to make our Cassandra cluster more stable
Owen Kim is a Software Engineer at PagerDuty and enjoys whiskey, riding his Honda Shadow 600 (named "Chie") and discussing the finer points in narrative and expression in video games.
- 1 participant
- 27 minutes

27 Oct 2014
Stephen Portanova: Narmal is an iOS app that saves you time (and sanity) by making email more chat-like. This talk is about why a startup chose Cassandra as its primary data store, demonstrated with real-life examples. In particular:
• Cassandra's crazy powerful data model
• Cassandra's operational flexibility
• The Cassandra Community
Stephen Portanova speaks Haskell to God, Scala to women, Objective-C to men, and Javascript to his horse.
- 1 participant
- 17 minutes

10 Jun 2014
About Web Python User Group
This time around, we will have the pleasure of hearing from Tyler Hobbs: Python developer at DataStax, Cassandra wizard, and all 'round good fellow.
He will be showing us how to use Cassandra in the context of Python and Django. We'll start off with an introduction to the Python driver (which he wrote, and which recently received its 1.0 update!), and then we will move on to Twissandra, which is an example Django app showing how this all gets implemented.
As always we will meet in the Capital Factory in Downtown Austin at 7 PM. Refreshments and Tacos will be generously provided by Indeed.com: "One search. All jobs."
Afterwards, we will continue to talk about Python on the web at a nearby watering hole.
About Capital Factory
Capital Factory is the entrepreneurial center of gravity in Austin, Texas. Located in the middle of downtown, Capital Factory has 50,000 square feet full of startups and entrepreneurs. Take classes to learn the skills that startups need, attend meet ups to find a co-founder, rent a desk for your startup or apply for funding and mentorship in the Incubator.
- 7 participants
- 1:28 hours

7 Apr 2014
On April 1st we hosted a very interesting presentation on Apache Cassandra by Colin Clark.
- 8 participants
- 1:11 hours

24 Mar 2014
This topic will introduce the Cassandra native protocol, native drivers and Cassandra Query Language (CQL). It is important for developers to be aware of this new way of integrating with and querying Cassandra -- without using Thrift or RPC. There are various ways of tuning that integration and modeling your data - all intended to make it easier and more productive to build against Cassandra with some additional performance benefits. This is a technical session with code abstracts using the Java driver.
- 1 participant
- 47 minutes

26 Feb 2014
If you've ever wondered how to utilize Apache Cassandra for real-time analytics, then this is a meetup you won't want to miss!
We're excited to have Stephane Legay, CTO at LoopLogic, joining us to present on how LoopLogic uses Apache Cassandra for their real-time analytics use case. Stephane will be covering the LoopLogic/Cassandra architecture + how they process their data using a 2-pass system.
Go Daddy has graciously offered to host this event. Be sure to come straight from the office (or sofa) as food/beverage will be served.
Hope you can make it!
What You Will Learn in this Session
• Architectural Overview of Cassandra at LoopLogic (3,000 to 5,000 writes per second on a 2-node cluster)
• Processing Data Using a 2-Pass Process
- Event log messages get stored in Cassandra as well as Amazon SQS upon reception of the event.
- Log processor processes the SQS queue.
- Message updates hundreds of counters using Memcached counters.
- We then batch-write those hundreds of results in Cassandra and update indices.
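The two-pass flow listed above can be sketched roughly as follows. This is only an illustrative toy, not LoopLogic's code: every external service (Cassandra, Amazon SQS, Memcached) is replaced by an in-memory stand-in, and the class name `TwoPassPipeline`, the helper `counter_keys`, and the event fields are all hypothetical.

```python
from collections import Counter, deque


def counter_keys(event):
    """Hypothetical fan-out: the counters a single event should increment."""
    return [
        ("video", event["video_id"]),
        ("video_viewer", event["video_id"], event["viewer_id"]),
        ("video_hour", event["video_id"], event["hour"]),
    ]


class TwoPassPipeline:
    """Toy model of the two-pass flow using in-memory stand-ins only."""

    def __init__(self):
        self.event_log = []             # stand-in for the raw event log in Cassandra
        self.queue = deque()            # stand-in for the Amazon SQS queue
        self.live_counters = Counter()  # stand-in for the Memcached counters
        self.stored_counters = {}       # stand-in for aggregated results in Cassandra

    def receive(self, event):
        # Pass 1: store the raw event and enqueue it for later processing.
        self.event_log.append(event)
        self.queue.append(event)

    def process_queue(self):
        # Pass 2a: drain the queue and increment the fast in-memory counters.
        touched = set()
        while self.queue:
            event = self.queue.popleft()
            for key in counter_keys(event):
                self.live_counters[key] += 1
                touched.add(key)
        # Pass 2b: write every touched counter back in a single batch.
        batch = {key: self.live_counters[key] for key in touched}
        self.stored_counters.update(batch)
        return len(batch)


pipeline = TwoPassPipeline()
pipeline.receive({"video_id": "v1", "viewer_id": "u1", "hour": 13})
pipeline.receive({"video_id": "v1", "viewer_id": "u2", "hour": 13})
written = pipeline.process_queue()  # number of counters written in the batch
```

The point of the split is that the write path at event reception stays cheap (one append plus one enqueue), while the expensive fan-out into many counters happens asynchronously and reaches the durable store as one batch.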
About Stephane Legay
Stephane is the founder and CTO of the Phoenix-based startup LoopLogic. LoopLogic provides users with tools to create and publish videos and to track who watches them. Before creating LoopLogic, Stephane was the CTO of OCEG. Stephane's skill set includes web application development, user interface design, C# programming, and project management in a wide variety of business applications.
Apache Cassandra at LoopLogic Case Study
http://www.planetcassandra.org/blog/post/apache-cassandra-at-looplogic-case-study
- 6 participants
- 56 minutes

22 Jan 2014
ABOUT DATA COUNCIL:
Data Council (https://www.datacouncil.ai/) is a community and conference series that provides data professionals with the learning and networking opportunities they need to grow their careers. Make sure to subscribe to our channel for more videos, including DC_THURS, our series of live online interviews with leading data professionals from top open source projects and startups.
FOLLOW DATA COUNCIL:
Twitter: https://twitter.com/DataCouncilAI
LinkedIn: https://www.linkedin.com/company/datacouncil-ai
Facebook: https://www.facebook.com/datacouncilai
Eventbrite: https://www.eventbrite.com/o/data-council-30357384520
- 1 participant
- 31 minutes

8 Jan 2014
More on realtime data analytics here: http://www.hakkalabs.co/articles/realtime-data-analytics-at-datadog
- 2 participants
- 1:09 hours

6 Nov 2013
In this talk, Tim Moreton, Founder and CTO at Acunu Analytics, and Nicolas Favre-Felix, Software Engineer at Acunu Analytics, share the concept, implementation and benefits of virtual nodes in Apache Cassandra 1.2 and 2.0. They also go over why virtual nodes are a replacement for token management, and how to use Acunu Analytics to collect event data, build OLAP-style cubes and ask SQL-like queries via a RESTful API, on top of Cassandra. This talk was recorded at the DataStax Cassandra SF users group meetup.
Understanding and (Not) Managing Virtual Nodes in Cassandra
Virtual nodes were added to Cassandra in version 1.2 and are now the default distribution model for the newly released Cassandra 2.0. This talk will explain the concept and implementation of virtual nodes in Cassandra and the numerous benefits they bring. We will show you how virtual nodes make token management a thing of the past, improve failure characteristics, speed up bootstrapping and decommissioning, make incremental cluster growth and shrinkage possible, and much more.
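As a rough illustration of the idea behind virtual nodes, the sketch below gives each node many tokens on a hash ring instead of one. It is a simplified model and not Cassandra's implementation: the `Ring` class, the `token` helper, the MD5-derived 64-bit tokens, and the `node#i` token-naming scheme are all assumptions made for this example.

```python
import bisect
import hashlib


def token(value):
    """Map a string to a 64-bit ring position (hash choice is an assumption)."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


class Ring:
    """Toy hash ring where every node owns several tokens (virtual nodes)."""

    def __init__(self, nodes, tokens_per_node):
        self.tokens_per_node = tokens_per_node
        self.entries = []  # sorted list of (token, node) pairs
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # Each node claims many small ranges scattered around the ring.
        for i in range(self.tokens_per_node):
            bisect.insort(self.entries, (token(f"{node}#{i}"), node))

    def owner(self, key):
        # A key belongs to the first token at or after its position, wrapping around.
        idx = bisect.bisect_left(self.entries, (token(key), ""))
        return self.entries[idx % len(self.entries)][1]


ring = Ring(["A", "B", "C"], tokens_per_node=64)
keys = [f"key-{i}" for i in range(2000)]
before = {k: ring.owner(k) for k in keys}

ring.add_node("D")  # no manual token calculation is needed
after = {k: ring.owner(k) for k in keys}

moved = [k for k in keys if before[k] != after[k]]
donors = {before[k] for k in moved}  # existing nodes that handed data to D
```

Every key in `moved` ends up on the new node, and because the new node's tokens are scattered around the ring, the data it takes over comes from several existing nodes rather than from a single neighbour. That is the intuition behind faster bootstrapping and incremental cluster growth.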
Under the Hood of Acunu Analytics
Cassandra is a great fit for building real-time analytics applications, but getting from atomic increments to live dashboards and streaming queries is quite a stretch. In this talk, we'll cover how and why we built Acunu Analytics, which allows you to collect event data, build OLAP-style cubes, and ask SQL-like queries via a RESTful API on top of Cassandra. We'll dive into how it works and show how it follows Cassandra's spirit of denormalization under the hood.
- 3 participants
- 59 minutes

17 Sep 2013
From the Austin, TX Apache Cassandra users group:
http://www.meetup.com/Austin-Cassandra-Users/events/131802852/
Presentation slides:
http://www.slideshare.net/cassandraatx/bv-emodb
Overview:
Fahd Siddiqui from Bazaarvoice describes an internal datastore built on Apache Cassandra.
Description:
Introducing Bazaarvoice datastore (EmoDB)
EmoDB is a RESTful HTTP server used by Bazaarvoice for storing JSON objects and for watching for changes to those events. It also supports a blob store, a queueing service, and a data bus to track events.
It is designed to span multiple data centers, using eventual consistency (AP) and multi-master conflict resolution. It relies on Apache Cassandra for persistence and cross-data center replication.
- 11 participants
- 1:05 hours

25 May 2013
MySQL to Cassandra: Big Data, High Scale, Data Migration... Oh My!
by
Scott Bonneau, CTO & EVP of Engineering, Bazaarvoice
and
RC Johnson, NYC Engineering Manager, Bazaarvoice
Hosted by Eric David Benari
A Database Month event http://www.NYCSQL.com/events/114879742/
Bazaarvoice has over 100 million pieces of user-generated content in our MySQL instances, and serves billions of page views every month to hundreds of millions of unique users across thousands of different web sites across the Internet. After nearly eight years of growth, the company is poised to open up this content to a host of new brands and retailers, but faces the daunting challenge of merging all of this content into one data infrastructure that scales both reads and writes horizontally as we continue to grow.
This session will cover a number of the lessons learned as we have moved from a multi-cluster MySQL and Solr infrastructure to a new multi-master data storage system based on Cassandra and ElasticSearch.
Scott Bonneau, CTO & EVP of Engineering, Bazaarvoice
As Chief Technology Officer and Executive Vice President of Engineering, Scott Bonneau is responsible for overseeing the strategic roadmap and building of Bazaarvoice's products and critical infrastructure systems, which serve 10+ billion interactions on average each month. Scott brings more than 13 years of experience in building highly scalable software systems in the enterprise software, consumer internet, and financial services industries to Bazaarvoice, including more than 7 years in senior leadership roles. He has a passion for building robust, scalable technologies and building fast-growing, agile, and innovative engineering teams.
Prior to joining Bazaarvoice in early 2011, Scott spent nearly 4 years at Google in engineering leadership roles under the AdWords product umbrella. While at Google, Scott built teams and technologies on Google's most substantial revenue generating product line and was recognized internally as both a technology and team-building innovator. Scott has also held senior leadership and individual contributor roles at RGM Advisors, a high-frequency securities trading company, Lombardi Software (acquired by IBM), MessageOne (acquired by Dell), and Trilogy Software.
Scott graduated cum laude from Rensselaer Polytechnic Institute in Troy, NY with a Bachelor of Science degree in Computer Science. Outside of the office, Scott is a touring guitarist, having recorded multiple studio albums and played on stages across the country. He enjoys spending time with his three young children and playing poker, and is an avid video gamer.
- 5 participants
- 1:26 hours

26 Oct 2012
Christian Carollo is the director of cloud and alternative platform development at GameFly. He is focused on availability, reliability and scalability in cloud computing and how mobile, tablet and other non-traditional platforms can leverage cloud-based services.
Previously, Christian worked at Fandango as the Director of Data Systems and, over the last 15 years, at several other internet companies.
Follow him on Twitter at @supernaut.
- 8 participants
- 1:19 hours

11 Mar 2012
Jeremiah Jordan
Using Apache Cassandra from Python is easy to do. This talk will cover setting up and using a local development instance of Cassandra from Python. It will cover using the low level thrift interface, as well as using the higher level
- 2 participants
- 31 minutes

2 Mar 2012
Description
Using Apache Cassandra from Python is easy to do. This talk will cover setting up and using a local development instance of Cassandra from Python. It will cover using the low-level Thrift interface, as well as using the higher-level pycassa library.
Abstract
• Very brief intro to Apache Cassandra
- What is Apache Cassandra and where do I get it?
- Using the Cassandra CLI to set up a keyspace (table) to hold our data
• Installing the Cassandra Thrift API module
• Using Cassandra from the Thrift API
- Connecting
- Writing
- Reading
- Batch operations
• Installing the pycassa module
• Using Cassandra from the pycassa module
- Connecting
- Reading
- Writing
- Batch operations
• Indexing in Cassandra
- Automatic vs. rolling your own
• Using Composite Columns
- Setting them up from the CLI
- How to use them from pycassa
• Lessons learned
Speaker Profile
Jeremiah Jordan is a Software Developer at Morningstar, Inc. His team created an Operational Data Store using Python along with Apache Cassandra, MySQL, Google Protocol Buffers, ActiveMQ, ZeroMQ, and Zookeeper (all being used from Python).
- 4 participants
- 44 minutes

1 Sep 2011
In this presentation, SriSatish Ambati is going to talk about The Apache Cassandra Project, a highly scalable second-generation distributed database.
He'll cover:
- Use cases
- Why Cassandra
- Brisk and Hadoop
- FUD: Consistency
- Facebook and Cassandra
- Community, Code, and tools
- 1 participant
- 46 minutes

21 Oct 2010
Presented by Jake Luciani of Riptano. Slides: http://bit.ly/9Rbuyp
The talk covers:
- Use cases for search and types of search applications
- Problems scaling and maintaining Lucene/Solr
- Cassandra
- Lucandra (Lucene + Cassandra)
- 4 participants
- 37 minutes
