Apache Cassandra NYC* 2013 Open Meetings

10 Apr 2013

Speaker: Edward Capriolo
SlideShare: http://www.slideshare.net/slideshow/embed_code/18573117
The ColumnFamily data model and wide-row support provide the ability to store and access data efficiently in a de-normalized state. Recent enhancements for CQL's sparse tables and built-in indexing provide the capability to store data in a manner similar to that of relational databases. For many use cases hybrid approaches are needed, because complete de-normalization is appropriate for some access patterns whereas more structured data is appropriate for others. At times a single logical event becomes multiple insertions across multiple column families. Likewise, a user request might require several reads across different column families. This talk describes some of these scenarios and demonstrates how advanced operations such as multi-step procedures, filtering, intersection, and paging can be implemented client side or server side with the help of the IntraVert plugin.

7 participants
39 minutes

query

database

cassandra

primary

uuid

named

users

proprietary

key

mutate

10 Apr 2013

Speaker: Nathan Milford
SlideShare: http://www.slideshare.net/nmilford/the-automation-factory
Automation Stack at Outbrain

3 participants
43 minutes

servers

manages

provisioning

capacity

cassandra

automation

diablo

ops

io

maven

10 Apr 2013

Speaker: Michael Figuiere
SlideShare: http://www.slideshare.net/planetcassandra/nyc-tech-day-new-cassandra-drivers-in-depth-17867623
Cassandra 1.2 finalizes CQL3 and introduces a new binary protocol for client/server communication. These two components are the foundation of the new line of drivers developed by DataStax. Based on years of experience with Cassandra, these new drivers for Java, .Net and Python come with an asynchronous and lightweight architecture, a clean and simple API, and a standardized way to discover nodes and to manage load balancing and failover. This presentation will give an in-depth look at these new drivers, which will make your Cassandra-based applications even more robust, efficient and simple to write.

3 participants
51 minutes

casino

somewhat

developing

debate

casa

strategy

tax

clients

processing

drivers

10 Apr 2013

Speaker: Dave Finnegan
SlideShare:
An April 2012 InformationWeek special report entitled "Why NoSQL Equals No Security" began by stating: "If it seems security is an afterthought at best in the big data ecosystem, you're right." DataStax Enterprise 3.0 overcomes this perception and is the first big data platform in the NoSQL industry to bring the kind of enterprise security used in traditional RDBMSs to the big data/NoSQL market, securing systems and important data. This presentation will describe each aspect of DataStax Enterprise 3.0's security feature set. Note that all security features are optional; the administrator can decide to use none, some, or all of them depending on their specific application. Features described include Internal Authentication, External Authentication, Permission Management, Transparent Data Encryption, Data Auditing, and Client-to-Node Encryption.

3 participants
37 minutes

security

secure

enhancements

datastax

ss

authentication

encryption

configured

cassandra

provides

10 Apr 2013

Speakers: Jeff Sigman & Sanjay Sharma of Volly

https://www.volly.com

6 participants
36 minutes

mail

bali

providers

digital

addressing

strategizing

volley

data

gateway

big

9 Apr 2013

Speakers: Matt Pfeil (DataStax), Eddie Satterly (Splunk), Edward Capriolo (Media6Degrees), Matt Conway (Backupify), Russell Bradberry (SimpleReach), Jake Luciani (BlueMountain Capital)

16 participants
53 minutes

cassandra

servers

handle

fail

clients

datastore

manager

backup

query

dashboards

8 Apr 2013

Speakers: Jake Luciani and Carl Yeksigian, BlueMountain Capital
SlideShare: http://www.slideshare.net/planetcassandra/nyc-tech-day
This talk will focus on our approach to building a scalable TimeSeries database for financial data using Cassandra 1.2 and CQL3. We will discuss how we deal with a heavy mix of reads and writes as well as how we monitor and track performance of the system.

9 participants
47 minutes

query

queries

cassandra

data

databases

profiling

stored

problems

intervals

careful

8 Apr 2013

Speaker: Jay Edwards
Topic: Amazon Web Services

1 participant
5 minutes

amazon

aws

services

vpc

utilization

bandwidth

database

throughput

balancer

provisioned

8 Apr 2013

Speakers: Brian O'Neill and Taylor Goetz
A successful Big Data platform combines distributed processing and polyglot persistence into a single cohesive infrastructure. Over the past few years, Health Market Science has transitioned from traditional relational databases and enterprise systems to a massively scalable Big Data platform that combines Cassandra and Storm to ingest thousands of feeds of data from the health market industry and produce a single high-quality masterfile. Hear how we applied event processing and NoSQL to deliver real-time analytics and fuzzy/geospatial search while accommodating structural change over time.

2 participants
51 minutes

complexity

vechta

insights

discussion

cassandra

big

quad

decision

leveraging

plan

8 Apr 2013

Speaker: Rick Branson of Instagram
It's upsetting whenever we hear that we can't have things that we want. It'd be nice to live in a world where it was possible to have things like ACID transactions, uniqueness guarantees, and sequential counters that were globally and always available. What makes this worse is that when we're told we can't have them, people just wave their arms around in the air and shout things like "CAP theorem." In this talk, I'll walk through some of these "ponies" and demonstrate the points at which things start falling apart with practical, real-world examples.

8 participants
49 minutes

protocols

hosting

availability

transactional

foundation

concerning

proposal

infrastructure

database

devops

8 Apr 2013

Speaker: Thomas Pinckney, Senior Director of Engineering at eBay
SlideShare: http://www.slideshare.net/planetcassandra/e-bay-nyc
Recommendation and personalization systems are an important part of many modern websites. Graphs provide a natural way to represent the behavioral data that is the core input to many recommendation algorithms. Thomas Pinckney and his colleagues at Hunch (recently acquired by eBay) built a large-scale recommendation system, and then ported the technology to eBay. Thomas will be discussing how his team uses Cassandra to provide the high-I/O storage for their fifty-billion-edge graphs and how they generate new recommendations in real time as users click around the site.

9 participants
48 minutes

preferences

opinions

taste

recommendation

customers

intuitively

analyzing

predictive

cassandra

curious

8 Apr 2013

Speaker: Eddie Garcia, VP of Development at Gazzang
Topic: Security at Gazzang (DataStax Partner)

1 participant
5 minutes

encryption

encrypting

encrypt

encrypted

secure

security

authentication

trusty

drm

key

8 Apr 2013

Speaker: John McCann, Senior Director of Applications Engineering at Comcast
SlideShare: http://www.slideshare.net/planetcassandra/nyc-tech-day-using-cassandra-for-dvr-scheduling-at-comcast
Comcast is developing a highly scalable cloud DVR scheduling system on top of Cassandra. The system is responsible for managing all DVR data and scheduling logic for devices on the X1 platform. This talk will cover the overall architecture of the scheduling system, data model, message queue and notification software that have been developed as part of this ambitious project. We'll take a deep dive into the details of our data model and review the implementation of Comcast's open-source, Cassandra-based clones of Amazon SQS and SNS.

4 participants
42 minutes

comcast

cable

tv

channel

netflix

dvr

xfinity

platform

moved

configuring

8 Apr 2013

Speaker: Jonathan Ellis, Apache Cassandra Project Chair and CTO/Co-Founder of DataStax
SlideShare Presentation: http://www.slideshare.net/planetcassandra/nyc-jonathan-ellis-keynote-cassandra-12-20

2 participants
47 minutes

cassandra

community

people

audience

increasingly

important

technology

support

esoteric

client

20 Mar 2013

Speaker: Eric Lubow
SlideShare: http://www.slideshare.net/planetcassandra/big-data-revolution-is-an-evolution-ny-cassandra-20130320
Dealing with data doesn't only require a data store; it requires an infrastructure. At SimpleReach, we have 5 data storage layers to service all of our data needs. These range from high-volume, high-velocity data ingestion with real-time analytics to ad-hoc style historical analysis with search capabilities. To communicate effectively between applications, data stores sit behind a service architecture for consistent data access patterns and failover/redundancy. This talk is a story of how we came to this architecture and some of the lessons we learned along the way.

5 participants
40 minutes

twitter

analytics

monitoring

content

media

information

web

publish

users

social
