From YouTube: C* Summit EU 2013: Performance: It's All About the App
Description
Speaker: Michael Masterson, Director of Strategic Business Development at Compuware APM
Even with the fastest Cassandra cluster beneath the hood, it's the app that ultimately governs performance. Learn from customer examples on how to address the root cause of slow performance.
And more importantly, when we talk about performance, you often feel like you're trying to find a needle in a haystack, right? You look around and try out a bunch of things that might work. You may have read about it on a blog. You may be an expert and actually think you've got the right answer. But I'm here to share a couple of stories about how you can really get insight into your application running on top of Cassandra, and we'll take a quick look at MapReduce as well.

So if you're dabbling in Hadoop, perhaps running MapReduce on top of DataStax Enterprise, which adds that layer on top of Cassandra, I'll give you some insights and some best practices on how to take that holistic view. And at the end of the day, what we're seeing with Cassandra, with any NoSQL database, even with Hadoop, is that it's a level of complexity you're adding. Some of you may be working on a brand-new project, say pure Python and Cassandra.

But I'd guess most of you are adding Cassandra into an existing application. Maybe you're looking to take an expensive transactional system and reduce costs and increase efficiency with something like Cassandra. At the end of the day, you're looking at something that needs to pull together different views in a transaction. It may end up in your Cassandra cluster, but it probably started somewhere else, so you need insight into where things really come from.

That complexity is probably reducing the visibility into your app, so how do you close that gap and get an end-to-end view into what really happens and what matters? I know there are a couple of you here who are just getting started, and I'm not going to try to pigeonhole Cassandra in one slide, but I want to frame today's discussion from a performance perspective.
For the kinds of cases we'll discuss today, Cassandra is great when you need fast read access or fast writes. You don't necessarily want to read and write at the exact same time at the transactional level, but it's great at taking those core concepts and scaling them horizontally; as you saw from Jonathan earlier today, you get really great linear scaling, so you can just add more servers.

There's high availability built in, but it also needs you to have some best practices and patterns built into your design. You want to make sure you've got a fairly even distribution of data, so that when requests come into the cluster you're not seeing wildly different response times for your reads and writes. We also want to make sure you don't have applications with a poor pattern of access.
For example, a single request to your application triggering four, five, ten requests to the database, into Cassandra. You want to know about that, because it's not necessarily a Cassandra issue; it's more a question of whether your application is doing the right thing. So how do you get insight into what your apps are really doing?

Think about a batch workflow versus a transactional workload: little things that go wrong get really wrong as they start to scale across the cluster. Hadoop in general is great at that batch side, but we'll talk about MapReduce specifically, since MapReduce runs on top of DataStax Enterprise and you can leverage the data that resides within your Cassandra cluster. What works well, and what are some of the challenges?

Specifically, things like the distribution of data within MapReduce, and wasting resources when you think about what's efficient and what's not. We've got a case study with an application where one line of code, loading a string library to prepare data objects, had a massive impact on performance, something you would never think about from a development perspective until you actually see the impact running in production. It's those types of things that really give you insight.
How are people really building the application that runs on my cluster and my environment, and then making the best decisions for tuning and scaling the database from there? Let's start with one of our ad-tech customers in New York. This is a company, and I believe there are even a couple of you in the ad-tech business here as well, serving ads and looking to find the best match.

Milliseconds matter here, because you need to decide whether it's worth it: is my client paying X cents per million impressions to put that ad in front of this visitor? The request comes in from a browser, goes through an exchange, and ultimately hits somebody who's in charge of selling and brokering ads on behalf of their customers, and then that analysis takes place very quickly. So the use case is high-speed availability on that type of data: the impression, the graphic, and the targeting.

Of course, there are some things in place that really drive whether you're going to win and make money for yourself or your customers, and that's all about how fast you can profile the incoming user against the data set you already have about them. So they chose Cassandra for a couple of reasons: one, high-speed, low-latency access.
They wanted to make sure that when a request came in, they weren't serving the same ad to the same person, so you want to keep track of everything you've previously served to that person so you can get the most interactive kind of participation, but you also want to be able to segment and try certain offers and certain products. So it's a fairly typical use case, and of course there's a service-level agreement: in this case they had about a 100 millisecond SLA with the brokerage.

If they don't respond within 100 milliseconds, they lose their chance to bid. This isn't a database requirement; this is the end-to-end system: 100 milliseconds is really the operating window. So when they looked at Cassandra they said, well, we're going to try to follow best practices when we build out this data schema, because that's important. Cassandra was the right fit, and they had the right data: it's mostly schemaless.
They know the type of data they're going to be expecting. They're going to do mostly writes, because they'll see all the requests coming in; when they need to read, it's largely outside the real-time path. They're not going to be a transactional system waiting to read back what they just wrote into Cassandra before they bid. They're optimizing around different types of puts and gets and never doing a scan across the data.
So when you start to tune your cluster, there are best practices you can find and go through. Denormalize your data is one you'll hear; make sure you've got your multi-column keys set up so you'll have distribution across different nodes; and, more importantly, you'll try things like: does compression help or hurt? Does it make a difference whether I've actually got it enabled? Depending on your use cases it's going to vary, and so will the type of caching you choose.
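To make those schema best practices concrete, here is a minimal sketch (not the customer's schema; the keyspace, table, and column names are made up, and it assumes the 2.x-era DataStax Java driver) of creating a denormalized table with a composite primary key and compression enabled:

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class SchemaSketch {
    public static void main(String[] args) {
        // Contact point and keyspace are illustrative.
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build()) {
            Session session = cluster.connect("adtech");

            // Denormalized table: partitioned by user_id, clustered by served_at,
            // so all impressions for a user live together and reads never scan.
            session.execute(
                "CREATE TABLE IF NOT EXISTS impressions_by_user (" +
                "  user_id   text," +
                "  served_at timestamp," +
                "  campaign  text," +
                "  creative  text," +
                "  PRIMARY KEY (user_id, served_at)" +
                ") WITH compression = {'sstable_compression': 'LZ4Compressor'}");
        }
    }
}
```

Toggling the compression option (or the caching options) on a table like this and measuring before and after is exactly the kind of trial-and-error experiment the talk describes.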
How much caching depends on how much data is actually in there, and there's a variety of cache settings that can be set. In general it's kind of a trial-and-error process for a lot of people, figuring out: does this work? This particular customer went and tried a bunch of things, but at the end of the day they still ran into a few problems. Caching didn't necessarily solve the problem.

Memtables and everything else they tried tuning in front of other parts of their application still left them with some critical issues. One: they were missing some bids, so there's that hundred-millisecond SLA we talked about. They still didn't know exactly why they'd see well over a hundred, up to 350 milliseconds, on some of those requests. They could see the requests coming in and out of the cluster, but that didn't necessarily give them insight, because there's more going on than just the request to Cassandra.

They've got some processing that happens in flight before things even reach the cluster, so they needed visibility from the request that came in, through every hop in their system, into the cluster, to figure out why those occasional bids were getting missed when everything they designed was supposed to be ultra-efficient. They saw their query times for most requests in the 10 to 20 millisecond range, so relatively fast, and what you would think would be enough buffer for the other 80 or so milliseconds of everything else that happens within the system. Occasionally, though, they failed.
There's a chance that if something is backed up in a queue, you might actually time out before you can get the response back. So it comes down to little things in the data access patterns of the application, even if you don't think you're doing them; as you actually look at how the application is really accessing the cluster, you discover those things very quickly and can say: that's really what happened on this stalled request, or this transaction that timed out, or this one that came back with an error. And then, lastly, looking at variances: why do I see ten to twenty milliseconds on average, but every now and then 40, 50, 60 milliseconds coming back from the cluster, which ultimately ends up driving the overall transaction time above that threshold? So it's about trying to understand the outliers in a system that is designed for real-time access into the bidding network.

They looked at it from OpsCenter, using DataStax Enterprise, to try to see that holistic view of how the cluster was doing, and nothing really jumped out. They read the blogs, they tried to do their own investigation on the web: what's the type of tuning I can do, can I look at specific JMX metrics, can I look at requests per second?

Can I look at the number of column families in play, and try to do that triage? But that's still looking inside the cluster, which, as it turned out, wasn't necessarily the only story to look at, because as much as you think it's the database's fault, the reality is that's not always the case. How you access it, and really how you built your application patterns, determines a lot of the end-to-end performance. And of course, since I'm here today, you should all be deploying 2.0.2 into production, right?
It always gets better and faster, but the reality is there are regressions and there are things you need to check. Just because you've moved up to 2.0 or 2.0.2 doesn't mean it's going to be the best fit for your application, so being able to baseline between releases is important, and so is having a process in place, because unless you actually measure, you don't know whether it's effective to roll out. So they started to look outside the Cassandra box, because they felt like they were done tuning.

They felt like they had pushed as far as they could to optimize Cassandra, even though they knew there was some discrepancy between request times and some outliers where things would just be slow occasionally. So they tried to look outside and say, all right: can we take the transactions that are coming into Cassandra and follow them end to end, essentially looking for that black sheep they knew was out there?
Something was causing problems somewhere, but they had gaps, and they were trying to close that gap in visibility between what was actually happening in their entire application and what was happening all the way out at Cassandra. As you take that view of what's actually coming in, there are some general best practices you can apply from a performance-management perspective: look at the overall response time, and then start to break down each tier so that you can understand where the bottlenecks are within your application.

Is data distribution affecting things? I might have data that's not evenly distributed, and certain nodes have different characteristics: they're running in different data centers, on different hardware and software, with a different network in between. So look at data access patterns and, as I mentioned earlier, the consistency level. We didn't think we were accessing things with consistency equal to ALL; but if you are, how do you know? How do you find those cases to make sure?
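One way to answer that "how do you know?" question is to stop relying on driver defaults and set (and log) the consistency level explicitly on every statement. A minimal sketch, again with a made-up table and the DataStax Java driver:

```java
import com.datastax.driver.core.*;

public class ConsistencyCheck {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build()) {
            Session session = cluster.connect("adtech");

            // Be explicit instead of inheriting whatever default is configured.
            Statement stmt = new SimpleStatement(
                    "SELECT campaign FROM impressions_by_user WHERE user_id = 'u123'")
                    .setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM);

            // Log what each statement actually requests, so CL=ALL can never sneak in unnoticed.
            System.out.println("Consistency for this query: " + stmt.getConsistencyLevel());

            session.execute(stmt);
        }
    }
}
```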
Maybe my request wasn't served because it came back in 120 milliseconds. So start there: understand what your end user or client sees as the data comes in. Make sure you account for any third-party requests that are part of the transaction flow. In this case they weren't hitting a Facebook login screen, and this is an area where we couldn't get permission to share all the proprietary details of how they built the system.

Are there other pieces of data coming and going? In the end-to-end view of your application, you probably mapped it out when you built it; it's probably on a whiteboard somewhere, but it's probably different from how it actually lives in code. So getting insight into what's really happening inside your app is going to let you explore what really contributes to that end-to-end transaction time. And then, finally, once you understand where my entry point into Cassandra is, I can start to look there and say...
So, starting to break that down and looking purely at the Cassandra side, the access pattern is important. We mentioned a little bit about understanding how your application interacts with Cassandra. You can do things like instrumenting each call to say: I've got a certain IP address calling the cluster, and build up sort of your own view. There are also tools that give you this kind of visibility out of the box, so depending on how you're looking to instrument your cluster, there are lots of different ways.
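Rolling your own instrumentation can be as simple as a thin wrapper that times every call and flags the slow ones. A minimal sketch (class name and threshold are made up; a real APM agent does this, and much more, automatically):

```java
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.Statement;

/** Times every Cassandra call and flags the ones that blow the latency budget. */
public class TimedSession {
    private final Session session;
    private final long slowThresholdMillis;

    public TimedSession(Session session, long slowThresholdMillis) {
        this.session = session;
        this.slowThresholdMillis = slowThresholdMillis;
    }

    public ResultSet execute(Statement stmt) {
        long start = System.nanoTime();
        try {
            return session.execute(stmt);
        } finally {
            long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
            if (elapsedMillis > slowThresholdMillis) {
                // In production you would feed this to a metrics pipeline instead of stderr.
                System.err.printf("SLOW query (%d ms): %s%n", elapsedMillis, stmt);
            }
        }
    }
}
```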
Some are a great fit for rolling your own; others can give you a view that shows you, for each transaction that comes in, how many times am I hitting each service. We'll go into a little more detail on this particular pattern, where we saw some interesting behavior that was completely unexpected from their application. And as you start to look inside a call into Cassandra, that's important too.

You saw today there's some tracing that's now built in, where, maybe in an offline or non-production environment, you can start to trace individual calls. That starts to give you some insight into what's actually happening: I can see where the data is being pulled from, where the coordinator is sending each request, and it gives you that level of detail, starting with: what's the query that's being executed?

Which node is going to supply the data for it? And then things like consistency level: making sure you check that the behavior you're expecting is actually what you intended. So a starting point is to ask what the query coming in is and what the associated characteristics behind it are.
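That built-in tracing can also be requested per query from the client side. A minimal sketch with the Java driver, which you would normally only enable for a sampled or non-production subset of queries:

```java
import com.datastax.driver.core.*;

public class TraceOneQuery {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build()) {
            Session session = cluster.connect("adtech");

            Statement stmt = new SimpleStatement(
                    "SELECT * FROM impressions_by_user WHERE user_id = 'u123'");
            stmt.enableTracing();  // ask Cassandra to record a trace for this query

            ResultSet rs = session.execute(stmt);
            QueryTrace trace = rs.getExecutionInfo().getQueryTrace();

            System.out.println("Coordinator: " + trace.getCoordinator()
                    + ", total: " + trace.getDurationMicros() + " us");
            for (QueryTrace.Event e : trace.getEvents()) {
                // Each event shows which node did what along the read path.
                System.out.println(e.getSource() + "  " + e.getDescription());
            }
        }
    }
}
```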
Again, data access patterns: all of this contributes to your understanding of the application running on top of Cassandra. What we found looking inside, in this particular case, was that the incoming transaction, even though it was one outside transaction, was generating about five calls into the cluster. Four of those were coming back at 15 milliseconds or so, about what's expected, but there were a couple that were averaging 50 to 80 milliseconds.

So if you're trying to achieve 100 milliseconds, having something like this happen occasionally is going to completely break your application. It's still going to return, but you're not going to be able to serve the needs of, in this case, a real-time bidding system. Being able to understand that gets you to the point where you've isolated, in this case, some external dependencies that might have been contributing, where you're calling a third-party service, and inside the cluster you've got four out of five calls that are working well, but one particular node, in this case, that isn't.

I can look at those particular nodes and ask what's really driving the difference between queries. What data is extracted from node 3 versus node 4 or the other ones? Now we're one level down, on the node, trying to understand, whether it's through tracing or another tool, what exactly is causing that 80-millisecond response versus 15 milliseconds, and getting inside the JVM to say: here's what's actually happening. Is it reading and writing from disk? Is it waiting on memory?
Is it contending for some kind of lock somewhere in there? Getting to that next level down of what's really happening behind the scenes can be relatively complex, but once you can isolate the issue and know where that performance pain is coming from, you have the ability to debug it yourself, or to go to the vendor providing your support and say: here's what we're seeing, how do we work through this issue? It's about isolating exactly where that pain is and exactly what's contributing to it.

Latency is truly at the heart of most performance endeavors, and how long it takes you to get there is, of course, how long you spend working through and debugging the issue. In this case we were seeing a couple of odd things, like 113 milliseconds of latency on certain calls, and in those calls it was attributed to wait time: waiting on memory to be allocated.

It wasn't disk I/O, it wasn't computational. Those kinds of characteristics tell you this is something where a configuration parameter can be tuned: if I'm just waiting, there's often a knob available to tweak and tune the cluster or the application to solve that. And then, if you have a little bit deeper insight, you can actually start to see hot spots across the cluster. You can ask: by and large, are my requests mostly bound by memory, by I/O, or by something in between?

Are we waiting on certain data to be returned? This type of analysis lets you know that on the node that was returning 80 milliseconds, maybe the disk is slowing down, maybe there's something else happening, or, if you're running on a managed service, maybe you need to provision more IOPS for just that particular node.
It's also helpful to look across your cluster at some basic things like: when is garbage collection running, and when is compaction running? If you can correlate the times when compaction is running with some of your transactions, sometimes that's an immediate hit. You can do things like tune your garbage collection settings so that, instead of running every so often and really impacting the cluster, it runs on a more regular basis.
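One inexpensive way to get the "when did GC run and how long did it take" half of that correlation is to poll the JVM's own garbage-collector beans. The sketch below reads them in-process; for the Cassandra JVM itself you would read the same beans over remote JMX (polling interval and output format are illustrative):

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcWatcher {
    public static void main(String[] args) throws InterruptedException {
        while (true) {
            for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
                // Counts and times are cumulative; diff successive samples and line
                // them up with slow transactions to spot GC-induced outliers.
                System.out.printf("%tT %s: collections=%d, time=%dms%n",
                        System.currentTimeMillis(), gc.getName(),
                        gc.getCollectionCount(), gc.getCollectionTime());
            }
            Thread.sleep(10_000);
        }
    }
}
```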
Likewise with compaction: you can schedule when the right time is to run it, to make sure that during your critical time periods you avoid those things. So, getting some high-level views into your cluster: some of these come out of the box when you think about that view of the whole cluster; others are a little bit tougher to get to, like getting inside the individual transaction and understanding why this read request from node 3 takes 50 milliseconds. It's a little tougher to get to that level of detail.

But this type of view is a great starting point and gives you that broad-level view of what's happening over time to your entire cluster, and then you'll start to see things like: well, there's an interesting pattern here. We see something like garbage collection giving us a lot of memory back, but in one pretty big drop, not a very linear stair-step, and it looks like it's happening on a fairly regular basis. We're also seeing other patterns that you wouldn't necessarily want.

For example, a very large garbage collection running at about two o'clock in the afternoon on the cluster; likewise, a real spike in requests. So is there a behavior in my application that's driving that instant hammering of Cassandra for some particular reason? Again, not by design, but things happen: other things hit your cluster, perhaps other traffic, or a query coming in for a report.
When we had a chance to do this end-to-end analysis with the client, a couple of things emerged as the reasons they were missing some of the SLA requirements around bidding. First, there was read-time degradation that had occurred. There was a fair amount of data that had been written over and over to the same row, which was extending the row size and causing that row to span multiple SSTables over time, and as some of that data was deleted, tombstones ended up slowing down data access for certain customers and certain requests. So it's not a guaranteed pattern; there are lots of ways you can experience that type of behavior, but one of the root causes they discovered was that a few of those data access approaches, the long reads across rows, as well as the delete operations they didn't realize they were doing on a regular basis, were causing some of those requests to just take a little bit longer.
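A common way to keep a hot row from growing without bound like that is to fold a time bucket into the partition key, so writes roll over to a fresh partition instead of piling onto, and later tombstoning, the same one. A minimal sketch of the pattern, not their actual schema:

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;

public class DayBucketedWriter {
    /*
     * Schema sketch: the partition key is (user_id, day), so a busy user's data
     * rolls over to a new partition each day instead of one ever-growing row:
     *
     * CREATE TABLE impressions_by_user_day (
     *     user_id   text,
     *     day       text,
     *     served_at timestamp,
     *     campaign  text,
     *     PRIMARY KEY ((user_id, day), served_at)
     * );
     */
    private final Session session;
    private final PreparedStatement insert;

    public DayBucketedWriter(Session session) {
        this.session = session;
        this.insert = session.prepare(
            "INSERT INTO impressions_by_user_day (user_id, day, served_at, campaign) " +
            "VALUES (?, ?, ?, ?)");
    }

    public void record(String userId, String campaign) {
        // Bucket by calendar day; the bucket size is a design choice, not a rule.
        String day = new SimpleDateFormat("yyyy-MM-dd").format(new Date());
        session.execute(insert.bind(userId, day, new Date(), campaign));
    }
}
```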
The second one was around some of the coordination, and I mentioned the consistency level. They discovered that, despite what they thought, there were some requests running at consistency level ALL, and as those nodes were queuing those requests, they would slow down and eventually time out. That type of timeout, of course, propagates back, ends up failing, and gives them a block, so to speak, on their ability to deliver that bidding response.
In this case, they discovered that while they thought a node was actually down, in fact it was just timing out, and in some cases it's better for a node to fail entirely. So be aware of environments where you might actually want the node to be killed when it shows that type of behavior, and, if you can, build your application or some of your monitoring to look for those patterns and kill the node entirely.

Let it come back up, and don't have it sit there in limbo holding up requests. And then, lastly, the data sets they were returning: they saw that some of those access patterns, when they were retrieving requests, were retrieving a much larger data set than expected. So again, it comes back to this: the culprit isn't always within the application or within the database, but when you put the two together, you can see why the behavior of the application is so closely tied to the end-to-end performance of the system.
It's a great question. I don't know, in their case, how large the actual row size was. I do know that, depending on the type of request they were serving, they were keeping track of every request that was served to a particular user and type in their environment. I'm not sure, I don't know, but I do know they were appending to the rows: they were adding data onto the row rather than rolling over to additional rows or having some type of maximum value. So off the cuff I'm not quite sure.

There are some specifics here where we couldn't share the full details; things like those transaction flows are close to what their application was doing, but they're not the exact topology or their exact design. So, great question: there are probably some experts in the room who could tell you what the biggest row size you should be dealing with is, or how far to let it grow, but I'm not sure what the best practice would be and I hate to guess at that.
OK, the next application is one that we've actually developed internally, which one of our other divisions runs. It's basically watching around the world for outages and, any time there's an outage, trying to figure out who the downstream dependencies are. For example, when Facebook's authentication and login APIs go out, how many other third-party services are affected? It's keeping track around the world, which gives you some insight when you're using your own application.

If you're seeing a problem with, say, page load, you can quickly ask: is this a third party? Am I using third-party ad or login type services in constructing my page front end? You're trying to understand whether there are multiple dependencies that could be in play when something out there goes down. The application happens to use a bunch of technology in the background, including Hadoop and Cassandra, to sample that data and persist it for long-term analysis and to give some trending and reporting, so that we can tell you how long, over the last month, a particular authentication API was down. As we know, the web is a very dynamic place, and it's often surprising to see how many short outages there are that you may not be aware of, but depending on how you build your application, it could be directly impacted by them.

So we looked at this and said: OK, we built this, and it's basically running every night and every hour, setting up a new job, essentially MapReduce across the data set. So we looked internally at the behavior of our cluster, and these are some of the patterns that jumped out right away.
As we looked at that high-level view of the cluster, a couple of things stood out. When you look at the top left, CPU spikes like that: if you're used to any kind of massively parallel processing, you don't necessarily want lots of peaks and valleys; you'd like to have things fairly evenly distributed, so you know you're using as much CPU, bandwidth, and disk as possible. That's kind of a high-level view into the efficiency of whether my stuff is working well. In a transactional system, maybe that's okay; you're going to have peaks and valleys depending on end-user load. But if you're running batch processing, you really want to be maximizing the capacity of that hardware. So that looks a little bit off. Looking at some of the network and disk I/O, there are a few spikes in there that are a little concerning: a relatively consistent load on disk, but every now and then things get backed up, which probably corresponds to some of the workload that's running on there, and then overall CPU spikes and memory.

Maybe it's what's being written, maybe it's just the tuning of the cluster itself. So a couple of things jumped out at that high-level view, but it didn't really give you insight into what was happening within the jobs themselves. With MapReduce, which essentially takes an application and distributes parts of it to run across the cluster, understanding what those pieces are doing as they run, as they're taxing the cluster, gives you insight into whether that's a heavy query or poorly written code.

What's really going on behind the scenes? Start by getting a view of who's actually using your cluster. There are a couple of projects that let you do that: White Elephant from LinkedIn gives you the ability to see the jobs that are running, and there are also other tools that give you that end-to-end view in addition to a cluster view, and that can help you drill into what
they're running. You've got a batch environment: jobs get scheduled, they run, and they finish. So if you're looking at a performance view of it, start with your biggest consumer. Don't start with the guy who's running a job once a day or once a week; start with somebody who's really utilizing Hadoop, and you'll get the most bang for your buck, so to speak, trying to optimize what they're doing in their jobs. Try to understand how much time you're spending mapping, shuffling, and reducing.

That, of course, is the key to optimizing something like MapReduce: understanding where your time is being spent, how much data you're having to move between those phases, and, ultimately, back to our original thesis, why we're seeing such peaks and valleys within the jobs, if you can identify the particular area responsible.

In this case that meant part of the shuffle phase. One thing I noted here is that as we analyzed the job and looked at CPU, disk, and network utilization, there's a third tier in there called wait, and that's your classic: you're waiting for memory allocation, you're basically waiting for the infrastructure to come back to you. What we found during the shuffle phase was that there was a relatively large amount of time spent waiting for memory to be allocated.
What that means, as you start to diagnose it: once you can understand where that wait is occurring, you can say, all right, in this case we're seeing the wait on the shuffle task, and, one more level of detail here, we pulled it down to these routines: the shuffle input buffer memory bytes and the parallel copies. Once you understand where that wait is occurring, then you can act on it.

Sometimes, once you get that little detail, you can google that particular method, say "input buffer bytes", and you'll find a lot of feedback on what things are set to by default. But depending on how jobs are constructed, we know the defaults don't always work, and once you've narrowed down where that wait is occurring, then you can say, all right: in this case, maybe I should increase the buffer size. Logically, if we've been waiting on memory in this particular part of the application, increasing the buffer size, or in this case increasing the overall memory allocation for this step, should help. If you recall, when we looked at the overall memory utilization of the cluster, we saw it was around thirty to forty percent, so there's plenty of room; it's just a matter of figuring out which knob to tune to get at it.
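Those shuffle-side knobs are ordinary job configuration in Hadoop. A minimal sketch using Hadoop 2.x-style property names (1.x uses different names, and the values here are purely illustrative; measure before and after rather than copying them):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class ShuffleTuningSketch {
    public static Job configure() throws Exception {
        Configuration conf = new Configuration();

        // Fraction of reducer heap used to buffer shuffled map output.
        conf.setFloat("mapreduce.reduce.shuffle.input.buffer.percent", 0.80f);
        // Number of parallel fetches pulling map output to the reducers.
        conf.setInt("mapreduce.reduce.shuffle.parallelcopies", 10);
        // Give reduce tasks more heap so the shuffle buffer has room to grow.
        conf.set("mapreduce.reduce.java.opts", "-Xmx2048m");

        return Job.getInstance(conf, "outage-rollup");
    }
}
```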
By doing that, there was about a sixty-five percent reduction in the shuffle time, because we were taking away that memory contention. So little tuning tweaks in environments that are heavily parallelized, like MapReduce, can make a big difference. Once we had moved beyond that, the next thing was to look at the mapping phase; in this case it's one of the first steps that runs to really crunch the data.

There we found, again, a little bit of wait time, not as bad as before, but a relatively large amount of CPU time, and, diving in, another place where the mapper was essentially waiting for memory allocation. We found the spilling thread and the sort buffer were ultimately contributing to it, so by increasing that buffer size and giving it a little more room, we eliminated the wait cycle. A little configuration tuning, in this case, and another fourteen percent dropped out of the map time.
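The map-side equivalent is the in-memory sort buffer that the spill thread drains; again a sketch with 2.x-era property names and made-up values:

```java
import org.apache.hadoop.conf.Configuration;

public class MapSpillTuningSketch {
    public static Configuration configure() {
        Configuration conf = new Configuration();
        // Size of the map-side sort buffer in MB; a bigger buffer means the
        // spill thread writes to disk less often while the mapper runs.
        conf.setInt("mapreduce.task.io.sort.mb", 256);
        // How full the buffer gets before spilling starts.
        conf.setFloat("mapreduce.map.sort.spill.percent", 0.90f);
        return conf;
    }
}
```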
That was just by adjusting the buffers, once we understood the behavior of the job. Again, it's about context: what is my application doing? Hadoop is certainly different, and MapReduce is different from pure Cassandra node access, because you're running a lot more code on each particular node. So after these first two fixes, which were more configuration tuning, we found a couple of things. One, the CPU allocation looks a lot better.

So we look at what's running out there, and we see that our code is using the most amount of time, which is a good thing. It means we're not wasting many cycles on the framework; rather, the code that we've written is consuming the time, and, more importantly, it's also CPU bound.

If we look at those three colors again: one was CPU, one was waiting, and the others were disk and network I/O. So understanding the behavior of our application running across the cluster let us then start to dive in and understand where it was spending time. I mentioned that in this case there were a couple of methods that were chewing up most of the cycles, and this happens to be a string algorithm that splits data.
As you go in and write code that's doing parallel string processing on that data, it's pretty common that the decision to use a particular library ends up chewing a lot of cycles. In this case, about thirty percent of our time was spent splitting strings, which doesn't really make sense, and so, as you look at where that string split is occurring and what function you're using, you can then make an assessment and ask: is this the right way to do it? And see what's really happening behind the scenes.
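A classic example of this is regex-backed splitting (String.split with a pattern, or a general-purpose splitting utility) being called once per record, when a fixed single-character delimiter only needs a plain indexOf scan. A minimal sketch of that kind of swap; the delimiter and record shape are made up:

```java
import java.util.ArrayList;
import java.util.List;

public class CheapSplit {
    // Convenient, but pays regex overhead on every single record.
    static String[] slow(String record) {
        return record.split(",");
    }

    // One linear scan, no regex machinery, fewer allocations.
    static List<String> fast(String record) {
        List<String> fields = new ArrayList<>();
        int start = 0;
        int comma;
        while ((comma = record.indexOf(',', start)) >= 0) {
            fields.add(record.substring(start, comma));
            start = comma + 1;
        }
        fields.add(record.substring(start));
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(fast("2013-10-17,facebook-auth,outage,41s"));
    }
}
```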
The second one we looked at accounted for about another seventeen percent of the total time of the job. Each one of these chunks, because you're starting with the most expensive part, takes a pretty good cut out of the overall experience.

The last piece we took a look at: we're now at a point where we're seeing a lot of CPU utilization, which is a good thing from a performance perspective. Any time you have high CPU utilization, small code changes can make a big impact; think recursion, things that are just really impactful for an application. So understanding that we're still CPU-bound, not network-bound or disk-bound, is a good thing. Moving on, in this case we had one other string-related routine.
It was comparing dates, and in this case it was using a resource bundle to look up and convert date formats back and forth, and that date compare happened to load a factory for resource bundles, which is part of localization. So it was doing a date compare based on the specific geography, or whatever locale was associated with the JVM, which is a very expensive way to compare two date objects. So it's about understanding: here's an inefficient way to compare dates; is there a better way that preloads any locale-specific information? That little change added about a seventy percent reduction to the reduce phase.
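The expensive shape the talk describes looks roughly like formatting both dates through a locale-aware DateFormat, which pulls in the resource-bundle machinery, just to compare them; comparing the Date objects (or their epoch millis) directly avoids all of that. A purely illustrative before-and-after, not their code:

```java
import java.text.DateFormat;
import java.util.Date;

public class DateCompareSketch {
    // Expensive: builds a locale-aware formatter for every comparison, which walks
    // the ResourceBundle / localization machinery each time (and comparing the
    // formatted strings lexically is fragile on top of that).
    static int slowCompare(Date a, Date b) {
        DateFormat fmt = DateFormat.getDateInstance(DateFormat.MEDIUM);
        return fmt.format(a).compareTo(fmt.format(b));
    }

    // Cheap: Date already knows how to order itself; no formatting, no locale lookup.
    static int fastCompare(Date a, Date b) {
        return a.compareTo(b);
    }

    public static void main(String[] args) {
        Date earlier = new Date(0);
        Date later = new Date();
        System.out.println(slowCompare(earlier, later) + " vs " + fastCompare(earlier, later));
    }
}
```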
In this case, again, it's about knowing where to look, understanding what's tying up your cycles, and then making an adjustment based on it. So, overall: before we looked at our own code, we started off with a cluster profile approximately like this, and we knew CPU usage was pretty good.

Once we fixed the buffer issues, we were still running in about six and a half hours, and there was still some behavior where our overall average would drop off. In the end we got it down to just about three and a half hours: same hardware, same jobs. Configuration tuning and two code changes to some string algorithms made a significant difference. You could have tried to throw more hardware at the problem and increase the size of the cluster.

If you're running in the cloud, sometimes that works, sometimes it doesn't, and it brings me back to that opening scenario: what are you going to try first, and do you have some insight into where performance issues actually come from? In some of these particular areas, that's a six-times-faster reduce phase, so you're not always looking at percentage gains; sometimes you're looking at order-of-magnitude gains, which matter even more on a distributed system like MapReduce, where little changes affect the entire cluster, but certainly inside Cassandra as well.
You have different components, but if you can get a view into your system, you'll understand what really contributes to the end application. Sometimes it is a database configuration issue; sometimes there are challenges within the Cassandra cluster; other times it's really attributable to the access pattern that has evolved over time, or to how that data has been distributed.

You'll find that as you work more and more with newer technologies, there are always things that get discovered and fixed, and once you start to really understand what your usage patterns look like, you'll be able to either improve things yourself or go back to your vendor and say: here's where things are going wrong, can you help me work through and fix this particular access problem? And that's not specific to just Cassandra.

If you're looking at a NoSQL database, you'll learn as you start to use some of them: make sure you pick the right technology for the right job and for the different profiles that you have, whether transactional, read-heavy, or write-heavy; you'll figure out which ones have the best performance for your environment. Cassandra, in fact, would be a great fit for a lot of use cases.

It's not perfect for everything; that doesn't mean you should get rid of Postgres or MySQL entirely, but for the right use case it can be extremely performant. Likewise in the MapReduce world, understanding how things run in a distributed environment is a little bit different, and if you've got developers just starting to learn that kind of parallel framework, the impacts they may not be able to see on their desktop when they run their job might actually end up being quite a bit different out in production.

So little optimizations can make huge impacts when you run across a distributed environment like MapReduce or other parallel environments. With that, I want to open it up to a few questions. I know it's been a bit of a broad view here, across both NoSQL and Hadoop, but are there any particular questions or, more importantly, any insights that anybody would like to share with the audience here? Obviously there are some experts in the room and some people that are just getting started with some of the products.
You mean the ability to go in and stitch together the transactions? When you drop in the APM agent, think of it as a JVM library that loads on each node in the application; it then discovers the application and gives you that deep profiling capability. There's a pretty wide segment of tooling in the market, things that have been built over the years.

I spent ten years building software and never used that kind of tool, because the one we had in-house just wasn't easy to use from a developer perspective; it was very heavy to instrument. The company that I joined actually builds this kind of profiling tool, and so what's been really interesting is to see the ability to drop it in, discover an application, and profile it across a DataStax or Hadoop cluster, and really get that same insight you're used to from running a desktop profiler, but in a production environment.
Quick poll: how many people have written a Cassandra or NoSQL application that's running in production, or maybe coming into production in, say, the next three to six months? A couple, good, a few hands, all right. And what about on the Hadoop MapReduce side? A little bit. And what about using MapReduce on top of, say, DataStax Enterprise on your Cassandra cluster? Has anybody started to do that yet? A little bit. And what's been your experience so far? Is that working? Is it something that's valuable?