From YouTube: Activision: How to turn Massive Amounts of Streamed Data into Real Time Personalized Experiences
Description
Speaker: Darryl Kanouse, Senior Director - Consumer Technology
In 2014, with the launch of Call of Duty: Advanced Warfare, Activision released a system for messaging its users with highly personalized and contextually relevant communications designed to enhance the user experience and deepen user engagement. Now, in year two, the system is being extended to serve all Activision titles and to deliver these experiences in reaction to player behaviors. The key to success is Activision's ability to process massive amounts of data in real-time using a data center built around Cassandra as the primary user profile store.
So, hello. I'm here to talk about how to turn massive amounts of streamed data into real-time personalized experiences for millions of users. Just quickly about me: my name's Darryl Kanouse. I am a Senior Director of Consumer Technology at Activision. Before that I was a Principal Solution Architect at a marketing agency called Rosetta. Activision was our client at that time, so I've been with Activision for a couple of years, but I've worked with them for about four or five years.

I also do wire art and music, and I have kids that I love. This is my wire art, and I have to put it up, because I do one of them every day. Today is day 181, and that was the one that I did. If you're like me and you google your presenters, you'll see a lot of this stuff. But anyway.

So let's talk about massive amounts of streamed data for real-time personalized experiences. I have sort of a subtitle of "architect for success." I think it's probably a bit of a cliché to say that.
But that's definitely my perspective: there are a lot of things you can do with optimizations, low-level tuning, that kind of thing, but the decisions that you make up front about how you're going to build your system tend to be the ones that determine your success or failure in these kinds of scenarios. So we'll talk a little bit about how, at Activision, we went about the process of architecting our solution for streaming data, and the role Cassandra plays in it.
So there are some important things to think about when you're talking architecture. There are technical considerations, and then there are non-technical considerations. I think the non-technical considerations are almost as important as the technical ones. The technical ones you can do something about; generally speaking, the non-technical ones you cannot do anything about, and they sort of define the constraints around the system that you're trying to build. So we're talking about stuff like business timelines. At a company like Activision, we release video games on an annual cycle, and November is the big release month for us.
Whatever solution we come up with, and however we architect things, they need to fit within the constraints of the business timelines. You can't have your solution deployed in December; that just doesn't work, and I'm sure most businesses have a similar type of constraint around product releases and how stuff works. There's also the financial environment that you're working within. Again, this is often stuff we don't have a lot of control over: budgets. I think that's pretty normal; I think we're all sort of familiar with that.
We also have legacy systems, something else we consider a lot. When we're talking about taking a personalization platform and adding it to a video game, we're talking about bringing a new system into a pre-existing ecosystem of a bunch of other applications. Understanding what they do and how they work is important in helping us decide what decisions we're going to make and how we're going to approach our solution. And then finally, there's organizational politics. I'm sure everybody is familiar with some level of that; at Activision it's not really that bad.
So I guess that's all to say that, when we talk about how we're going to do things like large-scale personalization in real time, the context under which we're building this application is very important. So with respect to Activision and the context there, let's talk a little bit about the history of Activision and how we got to the things we're doing this year. I don't know, maybe you guys are familiar with the company. We started in 1979 as the first third-party developer for any games at all.
It was for the Atari console cartridge system. Up until Activision got started, all of the games that you got were made by the manufacturers of the consoles, so that's a legacy that sort of continues on today: we are still the largest third-party game developer. As far as the industry as a whole, just to give you a sense when we talk about large scale: Activision is a big player in this industry, and this industry is very, very big. 155 million Americans play video games.
51 percent of all US households own a dedicated console, whether it's an Xbox or a PlayStation. Forty-two percent of Americans play more than three hours a week. Twenty-two billion dollars is spent on the video game industry. It's big, and Activision is one of the biggest players in that industry. We currently have over 4,000 employees and 38 locations; that's a mix of offices, development studios, and support teams. It kind of depends on the day that you're asking, since it's something in flux, but that's about where we stand today.
We release a lot of video games, a lot, but there are three at the moment that are probably the ones most people have heard of. They are blockbusters, the kind of tent-pole titles in the organization. Call of Duty, you've probably heard of that one. Destiny was released last year, one of the biggest new IP releases ever in the video game world, which is not hard to do.
I don't think that's very controversial; I'm sure there are lots of statistics that can support it. But I think at this point we all sort of acknowledge that that's the expectation: you just have personalized experiences, that's the thing. But we don't just do personalized experiences for nothing. Obviously, the reason they're valuable is that they increase player engagement. They make people happy, and in the video game world we can measure that engagement through a couple of key metrics, session frequency and duration: basically, getting people to play more.
Presumably, because they're having a good time and they're enjoying it. And also, we are a business, so average revenue per user is a metric that we also look at. Just as a side note, the acronym for average revenue per user is ARPU, which still elicits giggles when we talk about it. There is another metric that we look at, which is average revenue per paying user, which has the unfortunate "ARP-poo" acronym; people still giggle when we talk about it in the office.
Historically, we've done personalization across a lot of channels. It started with email; we do website stuff; the mobile app has personalization. Most of what we'll talk about architecturally is the in-game, on-console stuff, but I think it's important to understand that the platform we built is multi-channel and supports a lot of different stuff. So here's some background on those channels and how we got, from a development team standpoint, to where we are today.
That's kind of the story of CRM. If you guys are familiar with CRM as a traditional discipline, customer relationship management, outreach, that's the type of organization you would expect to be responsible for personalized experiences and driving engagement, making people repeat customers. At Activision, about four or five years ago, there were the beginnings of the CRM group that has now evolved into the group that I'm a part of today. At that time, this was 2011, with Modern Warfare.
It was very basic. We were just doing basic segmentation strategies, like: do you play a lot or do you play a little? Are you good, by your KDR, or are you bad? And then we would send you emails related to that. It was kind of putting our toe in the water. The technologies we used at that time: Hadoop for all the basic data collection, Infobright, and Oracle. Oracle is kind of a legacy system that the IT group had, but we used it.
We had an email service provider that would actually send out the emails, and we had a lot of guys who would write SQL code against Oracle to produce CSV outputs that we would FTP off to the email service provider to deliver emails. That's where we started, and oh, those were the days. So now, moving on to the next year, Black Ops 2: we realized it had become important to stabilize the data environment.
This was the first time we'd actually looked at millions of users' data all coming in at the same time and tried to deal with it. Now, at this point we're still not talking real time; we're talking about capturing the data and analyzing it offline. Automation clearly was becoming a problem: having people writing SQL code to generate ad hoc lists to send emails is not an ideal scenario. So this was the beginning of an automation process, and we refined our targeting so that we weren't using the same basic segmentation around whether you're good or not.
We actually started to look at stuff like: well, what are you good at, and maybe that's something we could use to talk to you. Are you really good at a particular mode of play versus some other mode? Our technology platforms evolved only really to the extent that we ditched Oracle, not a year too soon, and we used Infobright as the basic data mart for all of the data points we were using to drive this.
So, moving on to 2013, we learned a lot from the previous year. One of the lessons was that Infobright was terrible. I hope there's nobody from Infobright here, but it did not work for us; we had lots of problems keeping it up. What we did want to do was continue on the path of advancing our segmentation and advancing the personalization efforts by looking at more and more data, and we also began to introduce some of these other technologies to increase the automation. This was a progressive step along the path.
So one of the key things that happened in 2013 was the development of a personalization engine that, as a modular component, was the type of thing we could throw data at, throw a set of rules and some content at, and it could smash it all together and produce an experience. That experience was email at the time, but even now we still use the same basic principles in how we do the stuff in game.
Those are our mandates for personalization, and now we're starting to see some hints at how our architecture is going to need to conform to some of the constraints we have. We have to maximize relevance and user value. We can't just talk to people about nothing; we can't, you know, put suggestions in front of them to buy stuff they're not interested in. We really need to make sure that people feel there's a value add from this personalization effort.
This is in email form, and I think this is a good representation, more or less, of the general approach to personalization that we still have today, even in the game. One of the things that's worth looking at (look, I got my little pointer yesterday, it works): if you compare the guy over here on the left to the guy over there on the right, you can see that the way we've stacked content is different. As a rules process, there are perhaps stats we want to give to people.
There are achievements we want to congratulate them about, and obviously those things are going to have some degree of customization; they'll have the relevant data points. But the question for us is: do we show them stats at all? So if you look at the person over here on the left, that guy would be somebody we might consider to be new, and for him we would say: he had a good week, let's show him some stats. He might get kind of a kick out of that and think that's great. Then there's the guy over there on the right.
Perhaps a vet, perhaps not as impressed with his own stats; however, he would like to know that he prestiged, because he's spent a lot of time playing the game. So these are the types of ways that we manifest experiences, and this is a good visual for how that works. It all kind of starts with having a lot of content. The kind of personalization we talk about doesn't really work if you only have three or four things to talk about. We actually have thousands.
I think ten thousand by last count, of individual types of things we can talk to people about. So, as you can imagine, there's a pretty massive production machine behind all this stuff that generates this content. You can see some of the types of things we're talking about: stats, congratulating people, giving them tips, all geared towards making them feel good and want to play more. Okay, so last year was 2014. This was the big shift out of the emails and into the game.
I can't really overstate what a significant evolutionary step this was for us, because it meant that our systems now had to be production ready. It's one thing when you have stuff that's working offline and you're writing applications that maybe send an email, maybe don't. But when you're talking about putting something in the game at Activision, you need to tread very carefully, and this was the year.
Incidentally, this was the year that Cassandra came into play for us. We started to realize that in game you're talking about communicating with everybody, whereas in email you've really only got a small subset. You're also talking about the need for reliability, the need for performance; everything starts to get a lot trickier. And so: Cassandra. We still use Hadoop and Greenplum as an analytics source of sorts.
But the key here is Cassandra, and I'll talk more about how that Cassandra stuff worked. This is what it looked like inside the game. Like I mentioned before, you can see a lot of the recurring themes: here's somebody who got an achievement for something; there's a tip on how to use some of the scorestreaks. Again, these are all targeted specifically. I think these were mine.
I think I used the missile very poorly, and so it was telling me how I could use it better. You can start to see it, and we got very positive feedback on this as well. So now we're talking about this year, and this is where the real-time piece came into play: making it happen in real time. If getting into the game was an incremental step forward in terms of general stress and anxiety about the application, making it happen in real time really ratchets things up, as you can see.
We have slightly different technology stacks that we're using now. You'll recognize, and I probably don't need to say it, that Amazon is playing a role in what we're doing now that it wasn't before, but we still have Cassandra. One thing I will note, if you were paying attention across all of the slides: this is the first year that we actually stuck with a data platform for profile management two years in a row, and it was Cassandra, and that's why we're here. It's great; it works out for us. Okay, so I think that's it.
Excuse me. Okay, so we have a thing that we want to build: personalization, streaming, real time, and we need to put it into the game environment. We are putting it into a very, very noisy place that has a lot of stuff happening. In order to keep a game like Call of Duty active and online, there are systems and services that are constantly running that we need to be aware of, so that we don't step on them. This is a bit of the legacy-system consideration: authentication, matchmaking, marketplace.
We have peak concurrent users somewhere around two to three million, give or take, depending on, you know, whether a game just launched or whether it's April. We have a service gateway that acts as kind of a proxy and caching tier between all the backend services and everything the games interact with. So all games talk to the gateway, and behind the gateway is a whole set of services. I don't think this is a particularly unique or different kind of architecture.
I think you'll see this in a lot of places; this is sort of how it works. Also, just like in a lot of other places, the mess is behind the service tier, where you have a complete hodgepodge of a bunch of different things that have developed like barnacles over the years: core applications that need to run, but that haven't been touched in a long time, all kind of mixed and matched.
But the truth is, there's actually some rhyme or reason to it. There are owners of the systems; they can tell you why they are the way they are and why they need to be there. And as it happens, we are basically building one of those things; we're another one of the barnacled bits in there. Okay.
"Will it scale?" is the question about our application. Making sure that the answer can be yes is where we start looking at the requirements in a little bit more depth, and one of those is understanding what personalization means. For us, I think it's important to lay out up front what personalization is not. In a game like Call of Duty, or any of the games we do, it is not messing with progression systems, or changing weapon performance on a personal basis, or monkeying with maps, or anything like that.
As much as we would like to make the game easier for people who are bad, because we know they will play more, that sort of violates the notion of the integrity of the game and the level playing field that we need to support. However, we do have a lot of options and things we can talk about that will result in people having a better time, getting better, and enjoying it. I don't know how many of you guys play Call of Duty; I know my first experience of it was brutal.
It was terrible and it was frustrating, but after many hours of play and lots of tips, I feel like I'm a little bit better now. Mainly it's the tips and the congratulatory, you know, attaboy type of stuff that keeps people coming back, despite the fact that it can be a relatively brutal experience. So the other thing to consider.
That was what personalization kind of is: it's sort of the in-game manifestation of stuff that we talk about in email. Then there's also the consideration of what real time actually means, and there is nothing that is instant in anything in the world. There's the speed-of-light constraint, but in addition to that, you've got to move data from one place to another, and you've got to act on it.
So the window that we have to work with is what happens at the end of the match. When you're playing a game, there's data that gets packaged up and sent off, and that starts the clock ticking. After that, from a user experience standpoint, there's a killcam, which is sort of a replay from the last guy who killed somebody; it's fun if you're that guy, not as fun if you're the guy that got killed. Then there's an after-action report in the game.
That's kind of a list of stats. All this stuff is happening in the game while we are taking that data, processing it, and trying to decide what we're going to do next. Then they drop back into a lobby; the lobby is kind of the waiting ground where people sit and wait for the next match to start, and that's really the opportunity to communicate with them that we want to take the most advantage of.
What is too long for us to wait? For us, it's 29 seconds. That 29 seconds may seem like a long way from real time, but in any system like this, when you talk about real time, you're measuring it by the experience of the user. In essence, the experience will feel real time, because they played a match, and at the first opportunity we had to talk to them, we have something that's relevant and related to that data.
So, the pipeline: parse the inbound match stats; determine whether or not a profile update needs to happen; archive the inbound data for analysis and for replay; deliver the user to the business rules engine that can assign the content. That rules engine will then load the user profile, including all the stats that are generated offline, apply the updated segmentation, determine the appropriate messaging treatment and a bunch of other stuff, and then eventually send all of that back out to the game. And we do that at peak concurrent volumes.
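The loop just described can be sketched end to end. This is a minimal, self-contained illustration of the steps (parse, archive, update, load profile, apply rules, respond); every class, function, and message string here is a hypothetical stand-in, not Activision's actual code.

```python
# Minimal sketch of the per-match real-time loop described above.
# All names and rules are illustrative assumptions.

class ProfileStore:
    """In-memory stand-in for the Cassandra-backed user profile store."""
    def __init__(self):
        self.profiles = {}  # user_id -> {attribute: value}

    def apply(self, user_id, deltas):
        prof = self.profiles.setdefault(user_id, {})
        for attr, delta in deltas.items():
            prof[attr] = prof.get(attr, 0) + delta

    def load(self, user_id):
        return dict(self.profiles.get(user_id, {}))

def pick_message(profile):
    """Toy rules engine: congratulate big matches, otherwise send a tip."""
    if profile.get("kills", 0) >= 100:
        return "Congrats on 100+ kills!"
    return "Tip: try a different scorestreak."

def handle_match_end(match_stats, store, archive):
    """Process one inbound match summary within the ~29-second window."""
    archive.append(match_stats)                 # archive for analysis/replay
    results = {}
    for player in match_stats["players"]:
        uid, deltas = player["user_id"], player["deltas"]
        if deltas:                              # profile update only if needed
            store.apply(uid, deltas)
        profile = store.load(uid)               # full profile, incl. offline stats
        results[uid] = pick_message(profile)    # back out toward the game
    return results

store, archive = ProfileStore(), []
out = handle_match_end(
    {"players": [{"user_id": "u1", "deltas": {"kills": 120}},
                 {"user_id": "u2", "deltas": {"kills": 3}}]},
    store, archive)
```

In the real system each of these steps is a separate, independently scaled component; here they are collapsed into one function purely to show the data flow.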
That's really what our challenge is, and when the question is "will it scale," it's: whatever you can do to make this happen, is it something you can still do when there are two or three million people trying to do the same thing? The answer is yes, and we can do that through three core architectural principles. The first one is a low-latency, high-volume datastore, and I don't think there's any mystery about what that is. Then there's stateless, bespoke processing nodes; we'll talk a little bit about that. And then queues. Queues! Total side note: "queueing" is an eight-letter word with five vowels in a row, the only one in the English language. I take any opportunity to write it, so there it is. So first, let's talk about the low-latency, high-volume data store.
I think it's pretty self-evident that the more you know about somebody, the better you can be at communicating with them. So let's just say, hypothetically, that what we want to do is recommend or talk to somebody about a particular treat, a cupcake, for example. Knowing a little bit about them, we can narrow down that, in this case, they like frosting. What we're talking about from a game standpoint is just understanding: what console do you play on? How long have you played? What's your XP?
This is not representative of all of the data that we capture, you know, the match-level stuff for purposes of analysis and replay; this is really only the profile for an individual. So when you're talking about 30 million users and you've got a thousand attributes, you're talking about a profile store of sorts that's going to need to hold 30 billion records.
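The arithmetic behind that record count is straightforward, with one record per (user, attribute) pair:

```python
# One record per (user, attribute) pair in the profile store.
users = 30_000_000            # ~30 million users
attributes_per_user = 1_000   # ~1,000 attributes each
records = users * attributes_per_user
print(records)                # 30,000,000,000 rows
```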
So one of the things that is important, when we try to understand how we're going to deal with this volume and how we're going to scale, is to be judicious about where we get the data: which things really have to be real time versus which things don't. The fact is, there's a lot of stuff that really doesn't have to be real time in order to provide the right experience.
So we look at our data sources. The real-time data source, the stream that comes in off the gateway while everybody is playing, provides us with this kind of stuff: session starts; equipment setup, the weapons and things you choose when you start a match; the match summary data, which is a critical piece, the actual metrics for an individual; clickstream stuff; marketplace bids. Contained within that real-time stream is the source data for everything else that we're trying to do.
In addition to that, there are things like: hey, somebody just got 100 kills in a match. That's awesome; we need to talk to them about that. That is a real-time trigger; we need to do something about it right then. But there's other stuff that does not happen in that same sort of time frame: churn propensity models, anything that's model-oriented, or trending analysis.
So those are the two bits. We take those two parts, and we have to smash them into a profile that we can access very quickly. As a database design principle, what informed how we set up our Cassandra instance is that we took the notion of decoupling the schema design from the business model, which is not an unusual approach when you're talking about NoSQL databases versus relational databases.
For some people it is a shift in how they do things. They hear requirements like, say, for a video game: you have things like maps, you have things like weapons, you have users. So you would have a maps table and a users table, then a user-maps table that would have your stats about that, and you have a lot of joins and stuff you'd have to do to assemble this master profile. Rather, what you can do is just call anything that's an attribute a thing, give it an ID, and say: this thing represents your kills in Domination on the map called Turbine, and so on. You can just stack all of those together, and what that does is create flexibility for adding new attributes. Things can happen in the game, we can track them, and we can put them into our infrastructure without the need to change anything in our data model.
It gives us consistency on the interfaces with the data access layer, so how we access the profiles doesn't have to change. There are no new SQL queries, none of that kind of work to do. We're basically just striping out a ton of attributes keyed on a user ID, and that optimizes the select operations.
This is essentially the case for NoSQL databases, more or less: the flexibility, the ability to just kind of dump things in and do what you want with them. So what you end up with is a user profile that might look something like this, where you have the user ID and then a bunch of rows.
This kind of defines the write profile. In our case it's probably no surprise that our writes outnumber our reads by about ten to one. Our writes stripe in a full line of data for an individual user. Our access to the profiles has one query type, and one query type only: give me all the attributes for a particular user.
Our writes are obviously a lot more complicated, and this is why we're here, because this is why, in our minds, Cassandra was the only thing that has really worked for us to handle the volume of this balance of reads and writes. Like I said earlier, I come from the perspective of architecture. There are certainly optimizations that make this stuff go faster, and there's always adding more nodes.
You know, that kind of stuff, but essentially you're still talking about a data schema that is going to support it in a way other setups wouldn't. So this is an example of what an actual schema looks like. The previous one was sort of an oversimplification; in the real business case we have our player fact table. We use the sort of legacy terms of fact and dimension relationships from warehousing.
This is a representation of the kind of legacy and the context that we came from, where warehousing was the thing, so we've still adopted the same naming convention. Although, if you think of fact and dimension tables in a traditional warehousing way, that is not at all the way we actually store data in Cassandra. But it is representative of the fact that, basically, all we're storing is integers of some type or another, because the applications don't really need to know specifically what each value is; they're all sort of the same.
We have a particular user ID and network ID, because this is the only way we ever access profiles; all the reads are very, very simple. Then we have the clustering key that represents what the actual attribute itself is. That gives us a level of uniqueness and distinction around each of those things, which lets us take shortcuts and just absorb whatever comes in. It gives our application developers a lot of leeway; it's sort of forgiving. And then all that they're really updating is the value itself. So that was the low-latency, high-volume datastore: it's Cassandra, and it's awesome.
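A rough sketch of that wide-row layout: the partition key is (user ID, network ID), the clustering key identifies the attribute, and the value is an integer. The table and column names below are assumptions based on the description, not the actual Activision schema, and the in-memory class only models the access pattern, not Cassandra itself.

```python
# Hypothetical model of the attribute-striped profile table described above.
# Partition key: (user_id, network_id); clustering key: attribute_id.

PLAYER_FACT_DDL = """
CREATE TABLE player_fact (
    user_id      bigint,
    network_id   int,
    attribute_id int,     -- e.g. "kills in Domination on Turbine"
    value        bigint,
    PRIMARY KEY ((user_id, network_id), attribute_id)
);
"""

class PlayerFact:
    """In-memory stand-in for the wide-row profile store."""
    def __init__(self):
        self.rows = {}  # (user_id, network_id) -> {attribute_id: value}

    def upsert(self, user_id, network_id, attribute_id, value):
        # Writes are blind upserts: a brand-new attribute_id needs
        # no schema change and no new query.
        self.rows.setdefault((user_id, network_id), {})[attribute_id] = value

    def read_profile(self, user_id, network_id):
        # The one and only query type: all attributes for one user.
        return dict(self.rows.get((user_id, network_id), {}))

store = PlayerFact()
store.upsert(42, 1, 1001, 37)   # 1001 might mean "kills on Turbine"
store.upsert(42, 1, 2002, 5)    # 2002: some newly added attribute
profile = store.read_profile(42, 1)
```

The point of the layout is that the single-partition read stays identical no matter how many attribute types the games invent later.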
Stateless, bespoke processing nodes: I'll try to talk about this stuff quickly, because I'm already running low on time. In a traditional architecture, or not traditional, but the way you see this a lot, the stand-alone application model: we are building a thing, the thing has an application code base, we have a lot of shared modules, and it all kind of runs together.
You can sequence all the things that are supposed to happen in those 29 seconds in one continuous, procedural set of code, but the problem there is that you don't have isolated scaling controls. When you have performance impacts that are, say, the result of too much data coming in, you can scale horizontally, yes, but the efficiency of your resource usage suffers.
You're now standing up your rules engine across all those additional nodes, and you're not really maximizing the opportunity to separate that stuff out. The bottom line is that the impact of load isn't equal in all parts of the system. The things that might make the rules engine run slower are stuff like seasonality, you know, Christmas, when we want to talk to people more, whereas an event reader is going to be impacted by the number of concurrent users, which goes up and down.
So these are the three basic parts of the application, the real-time processing bits. I think this would be a pretty standard delineation of how you would set this type of thing up: inbound event streaming and the parsing of that, which is to some extent self-contained; real-time rules execution; and then the offline data and messaging processing. So the architecture gets updated and it looks kind of like this: inbound event streaming reads off of the gateway and updates Cassandra and its profiles.
There's the rules engine that now knows: hey, I've got a user and I've got to process this user, so I will read from the user profiles and update back into the gateway to get back out to the client. And then there's an offline data processing bit that is completely decoupled from the rest of the application, still doing core and important stuff like updating the profiles with all the offline data.
Then, just quickly on the last point here: queues. This may go without saying, but we would have a very hard time scaling our infrastructure to support the maximum load, because the peaks and valleys in the load that we see are extreme. What we see on November 7th, the day after a Call of Duty title gets launched, versus June 7th, after people have been playing it for six months, is on the order of ten to one.
We can't support an infrastructure that is sized for the spikes one hundred percent of the time; it's just not cost-effective. And this is not even accounting for time of day, which has a huge impact on how data volumes come in. So we use queues. Queues let us set up the infrastructure and all these individual components in a way that gives them a little bit of wiggle room.
We can plan capacity around what we think is ninety-five percent of the spike and let the queues kind of squish the rest, so that we don't lose any data, nothing gets overheated, and we don't end up spending a lot of money on infrastructure that we don't really need. So when we add queues, the updated picture looks a little bit like this.
We
can
use
canisius
to
start
breaking
us
stuff
up
into
categories
and
to
give
it
the
kind
of
springiness
that,
let's
just
kind
of
cue
through
it,
well
we're
going
to
eat
up
some
of
our
29
seconds,
but
at
least
we're
still
going
to
be
we're
not
to
be
losing
any
data.
Nothing's
gonna
get
overheated
and
we're
not
going
to
pay
for
a
massive
infrastructure
unnecessarily.
RabbitMQ is good for that in part because, while the data volume that Kinesis needs to support is massive, the data profile in Rabbit is very small: we're just sending in batches of user IDs without any other data, and we're letting the rules engine query Cassandra itself to figure out what needs to go where.
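Keeping the queue messages tiny, as described above, might look something like this: only small batches of user IDs travel through the queue, serialized as compact payloads, and the consumer re-reads full profiles from the store. The batch size and payload shape are assumptions for illustration.

```python
import json

# Sketch of keeping queue messages small: only batches of user IDs go
# through the queue; the consumer re-reads full profiles from Cassandra.
# Batch size and payload shape here are illustrative assumptions.
def make_batches(user_ids, batch_size=100):
    """Split user IDs into small JSON payloads suitable for a queue."""
    return [
        json.dumps({"user_ids": user_ids[i:i + batch_size]})
        for i in range(0, len(user_ids), batch_size)
    ]

payloads = make_batches(["user-%d" % n for n in range(250)], batch_size=100)
print(len(payloads))                              # 3 messages
print(len(json.loads(payloads[-1])["user_ids"]))  # 50 IDs in the last batch
```

Because the payload carries no profile data, a message stays small no matter how rich the profiles get, which is what keeps the queue cheap at this volume.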
And this is what our real-time streaming data-processing architecture for personalized communication looks like. We have game clients that talk to a service gateway. The data that comes in off the game goes into Kinesis.
That data goes into the warehouse, where it gets archived. We update our user profiles, and then the event reader hands off the set of users that we need to process into a Rabbit queue. The rules engine, reading the queue, says: I've got a new set of users I need to process. It queries Cassandra and gets new sets of profiles, all striped-out attributes, ready to go.
It processes some rules, marries them with some content, and sends it back out to the service gateway to go back into the game. And this whole loop can happen because of the springiness, because we split the stuff out, and, in no small part, because we have Cassandra as a back-end. We can do all this stuff really fast at really high volumes, and so, yay for us.
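The whole loop just described can be compressed into a toy end-to-end version: event stream in, profile update, user-ID queue, rules engine, outbound gateway. Every component here is an in-memory stand-in, and the rule and message names are invented for illustration.

```python
from collections import deque

# Toy end-to-end version of the loop: ingest -> profile update -> user-ID
# queue -> rules engine -> outbound gateway. All components are stand-ins.
profiles = {}
user_queue = deque()
outbound = []  # stands in for messages sent back through the service gateway

def ingest(event):
    """Inbound path: update the profile, enqueue just the user ID."""
    profile = profiles.setdefault(event["user_id"], {"wins": 0})
    profile["wins"] += event.get("wins", 0)
    user_queue.append(event["user_id"])

def run_rules():
    """Rules engine: drain the queue, re-read profiles, emit messages."""
    seen = set()
    while user_queue:
        uid = user_queue.popleft()
        if uid in seen:                    # dedupe IDs repeated within a batch
            continue
        seen.add(uid)
        if profiles[uid]["wins"] >= 2:     # example rule, purely illustrative
            outbound.append({"user_id": uid, "message": "win_streak"})

ingest({"user_id": "u1", "wins": 1})
ingest({"user_id": "u1", "wins": 1})
run_rules()
print(outbound)  # [{'user_id': 'u1', 'message': 'win_streak'}]
```

Note that the rules engine never receives event data directly; it re-reads the profile, so it always acts on the latest state even when a user appears in the queue more than once.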
So again, just sort of a review. The low-latency, high-volume data store: I could just replace that phrase with "Cassandra." I was talking to somebody earlier today, and they were asking me, so, what are the benefits of Cassandra versus some of the other stuff that you had used to try to do some of these bits? I was kind of giving her some of the history, and I couldn't really think of a good answer to the question of what the benefits are.
What does Cassandra bring versus something else? Mostly I couldn't think of anything else that we could have fit into this particular use case. So I don't know what the benefits are; I don't know that there even is an alternative. I ended up telling her that the benefit is we could do it, and with the alternatives, we couldn't. Scalable, fault-tolerant processing nodes and queues: we just talked about that. And then also, perhaps if there's some interest, some of the other technologies we use, just to run down the list: Cassandra, obviously, and Kinesis.
B
We
talked
about
RabbitMQ
all
the
court
application
development,
the
rules
engine
the
event
parser
this,
how
a
Python
37
we
may
move
to
three
eventually,
when
everybody
else
catches
up.
I
think
it's
only
been
out
for
like
five
years
or
something
so
we're
sticking
with
27
for
now
s3
red
chip
EMR.
This
is
our
Amazon
replacement
for
Hadoop.
It could be anything else, but we have a Django interface that connects to MySQL. The knobs that I talked about earlier, in terms of implementing strategy, setting up content, who to target, all of that happens in that interface. There's tooling for server monitoring, and Ansible for infrastructure deployment. So I guess the last point to make here is that we're in a good place. 2016 is coming up fast, and with every new year you can be sure there will be another Call of Duty title.
There will be more Skylanders; there will be more of everything. And so, what are we going to do? I don't know that we can continue this pace of innovation. Maybe we can, you know, predictively guess what your stats are going to be in the next game. I'm not really sure. But one thing I can say for sure: Cassandra is going to play a role in it. We're pretty well embedded; our use case is so well coupled to what Cassandra does for us, it's hard to really imagine getting outside of that.
(Audience question: can you say more about the personalization in the rules engine?) Yes, sure. Just at a high level, I think one of the things that differentiates us from some of the other approaches to personalization is a notion of segmentation and event-driven mixing and matching. So there are lots of vendors that offer personalization solutions and stuff, but we have kind of a brain trust from marketing agencies, who've worked in this space for a long time, that have put this stuff together. It's a bit of secret sauce.
But I think, if you consider those emails that I was showing, it's really important to us that we add the extra dimension of not just the customization of a stat, or the choice of a color, or a user-defined type of personalization, or even the right recommendation in the right spot, but the question of: do we even show a recommendation at all?
Do we even show stats at all, and what's the order in which we place those? And to me, you know, in all of my experience in general communications, in CRM and marketing, all of that, that feels like a big differentiator. It's also resource-intensive on the back end to process rules like that, so a large part of that 29 seconds is evaluating users against content and working through the matrix to make sure that you end up with an experience that works.
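That "do we show it at all, and in what order" idea can be sketched as an eligibility-plus-scoring pass: each candidate module is either dropped entirely or ranked, rather than always shown with default content. The modules, rules, and scores below are invented for illustration, not the actual rules-engine logic.

```python
# Sketch of "do we show this at all, and in what order": each candidate
# content module gets an eligibility check plus a score, and ineligible
# modules are dropped entirely rather than filled with default content.
# Rules and scores here are illustrative assumptions.
def assemble(profile, modules):
    eligible = [m for m in modules if m["eligible"](profile)]
    ranked = sorted(eligible, key=lambda m: -m["score"](profile))
    return [m["name"] for m in ranked]

modules = [
    {"name": "stats",
     "eligible": lambda p: p["matches"] > 0,       # only show stats if any exist
     "score": lambda p: p["matches"]},
    {"name": "recommendation",
     "eligible": lambda p: p["matches"] >= 10,     # no recs for brand-new players
     "score": lambda p: 50},
]

print(assemble({"matches": 3}, modules))    # ['stats'] -- no recommendation shown
print(assemble({"matches": 100}, modules))  # ['stats', 'recommendation']
```

Evaluating every user against every module like this is what makes the step resource-intensive at scale, which is why it dominates that 29-second budget.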