Description
Link to blog: http://bit.ly/1VMPXNu
A: Our guest from SimpleReach is going to be talking to us about some advanced architecture patterns that they've implemented over at SimpleReach, some of which may be a little confusing at first and non-obvious, but Russell's going to give us the lowdown on what they've been doing, why it works, and why it's awesome. Before we get going on that, let's talk a little bit about some of the stuff that's been in the news, if you will. There was an interesting post this week on visualizing data with Zipkin, by Mick over at The Last Pickle. I'm pretty excited about this, because I just really like Zipkin a lot, and I'm pretty pumped to see it. Have you guys taken a look at this?
B: [inaudible]
B: If you're working and you've got a request going through multiple systems, whether it's 4 or 40, I don't know how you could go back after using a tool like Zipkin, where you can look at an individual trace, see all the servers where time was spent, and just narrow things down: oh, the problem obviously isn't here; I had this giant span of time over there.
B: There are tools out there doing this already, the APM stuff, application performance management, and they just didn't give you the right picture. So, in a nutshell, what Zipkin does is set up a trace ID that follows a request through a distributed system, from the original request all the way down the chain of microservices. As we do more microservices and spread these things out, all those requests have to go from one point to another point to another point. So what happens when one of them is slow?
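The trace-ID mechanism described here can be sketched in a few lines: the edge service mints an ID, and every downstream hop reuses whatever ID arrives with the request. This is a toy sketch of the idea, not Zipkin's actual API; the header name, service names, and in-memory collector are all made up for illustration.

```python
import time
import uuid

# Hypothetical in-memory collector standing in for a Zipkin server.
COLLECTED_SPANS = []

def handle_request(service, downstream=None, headers=None, work_seconds=0.0):
    """Simulate one service in the chain: reuse the incoming trace ID
    (or mint one at the edge), do some work, call downstream, and
    record a span for this hop."""
    headers = dict(headers or {})
    trace_id = headers.setdefault("X-Trace-Id", uuid.uuid4().hex)
    start = time.time()
    time.sleep(work_seconds)          # stand-in for real work
    if downstream:
        downstream(headers)           # the trace ID rides along on the call
    COLLECTED_SPANS.append({
        "trace_id": trace_id,
        "service": service,
        "duration": time.time() - start,
    })
    return trace_id

# One request flowing edge -> middle -> storage: one trace ID, three spans.
trace = handle_request(
    "edge",
    downstream=lambda h: handle_request(
        "middle",
        downstream=lambda h2: handle_request("storage", headers=h2),
        headers=h,
    ),
)
assert all(s["trace_id"] == trace for s in COLLECTED_SPANS)
```

Because every hop forwards the same header, all three spans share one trace ID and can be stitched back together later.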
B: Unless you've got a lot of debug code and you feel like going through a lot of logs, you're never going to find it. That's what Zipkin promises, or at least what the Dapper paper promised: with this transaction ID it takes the traces from each point, says "okay, this is how long it took at each place," rolls it back up for you, and tells you: okay, I had some middle service call that was really long, and everything else was super fast.
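The rollup described here, where per-hop timings come back and the slow middle call stands out, amounts to a max over span durations. A minimal sketch, with made-up service names and timings:

```python
# Toy spans as (service, milliseconds): the shape of data a
# Dapper-style tracer hands back for one request.
spans = [
    ("api-gateway", 12),
    ("auth-service", 8),
    ("pricing-service", 950),   # the "really long" middle call
    ("cassandra", 15),
]

# Total request time and the hop that dominated it.
total_ms = sum(ms for _, ms in spans)
slowest_service, slowest_ms = max(spans, key=lambda s: s[1])

print(f"total: {total_ms} ms, slowest: {slowest_service} ({slowest_ms} ms)")
```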
C: You know, Patrick, we were talking about that. You were at a software architecture conference a couple of weeks ago, and you said all the talk there was about microservices; that's been the super hot thing for about a year now. I can't imagine building a system like that, a microservices architecture, right now and not having a tool like this to give you insight into what it's doing.
D: We have a full suite of reactive services, and, actually, before Zipkin came out we kind of hand-rolled a similar thing: as the events come in, every so often we would tag an event with a tracer, and then each time it went down the line it would send out span data. But Zipkin is really cool. It's just a really interesting piece of software, and I'm excited to try it out.
A: Yeah, it's so funny you bring that up. I'm actually going to be talking about that up in Vancouver at the Apache conference that's happening next week. Part of my deck is that in Cassandra 3.4, tracing was made pluggable. Previously, for pretty much everybody in the world, when you wanted to do tracing with Cassandra you were just limited to whatever showed up in the tracing table, and there's really no context.
A: You have no idea who made the request or what was going on. So with pluggable tracing, and there's another article by Mick that came out somewhere around last year that goes into this, using some of the new stuff that was added to the CQL protocol, with 3.4 we can pass along that trace ID you were talking about into Cassandra, and then it'll report all that tracing data.
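One plausible way to carry a trace ID alongside a CQL request is the native protocol's custom payload, a string-to-bytes map that a server-side plugin can read. This sketch only shows packing Zipkin-style 64-bit IDs into such a map; the "zipkin" key name and the commented-out driver call are assumptions for illustration, not a documented contract.

```python
import struct

def make_trace_payload(trace_id: int, span_id: int) -> dict:
    """Pack two 64-bit IDs into the string->bytes map shape that the
    CQL native protocol's custom payload carries. The "zipkin" key
    name is an assumption, not a documented contract."""
    return {"zipkin": struct.pack(">QQ", trace_id, span_id)}

payload = make_trace_payload(0x1234, 0x5678)

# With a real cluster and driver you might then do (hypothetical wiring):
#   session.execute(statement, custom_payload=payload)

# Round-trip check: the server-side plugin would unpack the same way.
trace_id, span_id = struct.unpack(">QQ", payload["zipkin"])
assert (trace_id, span_id) == (0x1234, 0x5678)
```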
A: So you can see: this request came in over here from whatever user, all these microservices were hit, and all these Cassandra nodes were hit. You can even see it at the granular level of, oh, it took this amount of time to read the SSTable data. So you have a really solid understanding, and then, using a tool like the new sstabledump, or the old sstable2json, you can take a look at the actual data.
A: That data is in the SSTables, and you can figure out what happened there. It's very, very cool, and I think this is going to be one of those things that really ramps up in popularity, because if I were building a system right now I could not imagine doing it without Zipkin. You'd just bake it in from the start, and it's going to save you hundreds of hours of debug time. So I'm excited. Yeah, and hey.
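For reference, inspecting SSTable contents from the command line looks roughly like this; the keyspace, table, and file names below are made up:

```shell
# Cassandra 3.x ships sstabledump, which prints an SSTable's partitions
# and rows as JSON; earlier versions had sstable2json instead.
sstabledump /var/lib/cassandra/data/myks/mytable-*/mc-1-big-Data.db
```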
C: Actually, I've been thinking about that for a while. I was going to see what the state of it is as far as the various programming languages you can plug it in with, because right now the KillrVideo app that's getting refactored has its web servers in Node, so I'm guessing they've probably got JavaScript support; anything and everything is being written in JavaScript these days.
A: The interesting part about Zipkin, and Dapper, at least in the original paper, is that the data is written at a sample rate to a local file. So as long as there's an understood file format, you're not looking at a situation where you have to debug a bunch of Java classes and some magic thing. It should be pretty straightforward to write that debug information yourself, even if the library isn't there for you. Cool.
A: Well, the next thing I know we wanted to talk about is that the MVP nominations are open. This is always a fun time, right, because we get to acknowledge the people in the community that have done a lot of work to help other people learn how to use Cassandra, to teach things, and to build cool projects. And Russ, you've been an MVP for, what, like a thousand years running now?
B: What I was going to say, and I should stress this, is what makes an MVP; why is Russ an MVP forever? We have the nomination form up, and this is probably a good time to pitch it; it's in the show notes, so you can take a look. But the nomination is just that: a nomination. It's not me and John sitting there drinking and talking about who we think is cool; it comes from the community. Even though we do that too; that does happen.
B: What we do is take this form that we've linked, and that form is for you in the community to say: hey, I know someone who's doing some really cool stuff in the Cassandra community. And then we have some criteria, so it's not just "hey, they're cool." We're actually looking for people who are contributing to the community as a whole: blog posts, presentations, doing meetups, contributing code to Cassandra. There are a few criteria.
B: If you know someone who meets those criteria, then by all means nominate them, because we want to make sure they're recognized. And if you look at our current MVPs, they really do have the M and the V going on. Russ, you guys talk all the time about the stuff you're doing, like today, and some of the others that have been in there for a long time are the same story. It really makes the community better.
D: So to work around that bug, we decided to put in what we call the coordinator-only tier. That's simply starting up a Cassandra node with join_ring set to false, so it takes over all the coordinator functions. That allowed us to do the upgrade, because the bug was within the coordinator functions, and we just ran the coordinator tier on a lower version of Cassandra that didn't have it.
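The coordinator-only trick described here comes down to one startup flag. A minimal sketch, assuming a stock Cassandra install; this is a config fragment, not a full procedure:

```shell
# Start a node that coordinates requests but owns no token ranges:
# join_ring=false keeps it out of the ring while it still speaks the
# native protocol to clients and proxies reads/writes to data nodes.
cassandra -Dcassandra.join_ring=false
```

Clients then point their contact points at these coordinator-only nodes instead of the data nodes.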
D: But what we ended up realizing was that it dramatically reduced the CPU load on the actual cluster. We had a 48-node cluster, we added 6 coordinator nodes, and those nodes were running on cheaper hardware because they didn't need the high-throughput SSD volumes.
D: Our average load was sitting around eight on these 8-core machines, and it went down to around 2.2 or 2.3. We were just blown away that such a small amount of hardware cost led to a dramatic decrease in the utilization of the cluster. And a big shout-out goes to Rick Branson for that, because I think he was one of the first to come up with the idea, and he actually helped me through it a little bit with some of the JVM tuning.
D: Because it's a coordinator-only node, you can tune the JVM for coordinator-specific functions, like having a very large heap where you can create a lot of small objects in memory. And then on the data tier you can tune the JVM for data functions, like memtables and IO throughput. So it really gives you a bit of versatility there.
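The split tuning described here might look something like the following jvm.options fragments. The numbers are purely illustrative assumptions, not recommendations; the point is only that the two tiers can be tuned independently:

```
# coordinator tier: lots of short-lived request objects,
# so favor a large heap / young generation (illustrative values)
-Xms24G
-Xmx24G
-XX:+UseG1GC

# data tier: leave headroom for memtables, compaction, and IO buffers
-Xms8G
-Xmx8G
```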
A: Interesting. One thing I'd be very curious to see: I've kind of dealt with going completely against the common pattern you would expect. I remember when I was deploying Titan in production, the advice was to colocate Titan with Cassandra. I did the exact opposite and found that I got way better performance. I'd be interested to see what the difference would be.
D: Yeah, if you had a small number of app servers, I think that would be really doable, and preferred. We have hundreds of app servers, so having hundreds of coordinators and managing that would be kind of a maintenance nightmare, especially if one coordinator goes down; then you have to rework the way we keep the apps connecting to the main cluster through the drivers.
A: It's pretty interesting. I would love to dig in, because when I hear stuff like that I think: oh, that's counterintuitive; what is the root cause? I think it'd be fun to dig into that a lot more and understand exactly what's going on. Why would it have such a dramatic decrease? What would happen if you put the coordinator tier on the same server as the data? What if you ran two JVMs, one just for the data and one for the coordinator; would that have an impact? Those are really interesting questions. I love to see people trying stuff that doesn't seem obvious and may even seem like a bad idea.
D: I suppose it's also important to note that we do cross-partition batching, so it's probably a bit of an apples-to-oranges comparison.
D: And I'd like to say: running EBS, the coordinator nodes, all this stuff, it's all use-case specific. You have to know your business and know your use case in order to make these decisions. Now, about the EBS volumes: we're running the new st1 EBS volumes, which are high throughput, low IOPS.
D: They only have 500 IOPS, but they can do 500 MB a second of throughput. The reason we get such good, well, I won't say performance, because the performance is the same or probably a little less than the ephemerals; it's the cost savings, the dramatic cost savings. And the reason it works for us is that we're 95% writes, and, as you know, writes are sequential to disk, and five percent reads, because we're an analytics company.
D: So we can use these cheap EBS volumes and store a lot of data on them, because we rarely need to read the data, and when we do need to read it we want high throughput, because we're typically running Spark jobs against it. Having that high throughput, a one-megabyte read-ahead and 500 MB a second, actually helps the Spark jobs perform better.
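On Linux, the read-ahead mentioned here is typically set per block device in 512-byte sectors, so 1 MB is 2048 sectors. A sketch, with a made-up device name:

```shell
# Set a 1 MB read-ahead (2048 * 512-byte sectors) on the data volume.
sudo blockdev --setra 2048 /dev/xvdf

# Verify the current read-ahead value.
blockdev --getra /dev/xvdf
```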
A: Interesting. I appreciate not just the "hey, we use EBS, whatever," but the reasoning behind it, because sometimes people will see "they use EBS, we might as well do it too" and not even think about their own use case. To me that's the most important thing: understanding, exactly as you said, the business case behind it, and understanding things like how much of your data is in page cache. It's so important. So, yeah.
A: Right, cool. I know, Russ, you have a hard stop coming up, so I don't want to screw up your day. I just appreciate you coming on. Dude, it's so cool to talk about all this. Thank you, and this is why you're definitely in that MVP category; this is really, really cool stuff.
B: It's funny, I was just thinking about writing a blog post on what makes a great CFP, and a great CFP is not just "cool stuff I did; it's awesome, trust me." Actually, we did see a CFP like that one year: "You will not believe what I'm going to talk about. Trust me."