From YouTube: Database Transactions in Kubernetes and OpenShift, Spencer Kimball (Cockroach Labs), OpenShift Commons

Description
Database Transactions in Kubernetes and OpenShift
Spencer Kimball, CEO, Cockroach Labs
OpenShift Commons Operator Hour
OpenShift Commons Briefing
September 9, 2020
A
All right, everybody, welcome back to yet another OpenShift Commons briefing. Today we're gonna kick off a series we're doing with folks who built operators that run on OpenShift, but we're not necessarily gonna talk about the operators specifically. We might talk about them a little bit and get you guys to tell us what you did and what you learned building them, but today we're gonna do something kind of special. The title is Database Transactions in Kubernetes and OpenShift, and the guests are from Cockroach Labs. We have with us Spencer Kimball, the CEO and co-founder, and Jim Walker, who's the VP of product marketing at Cockroach, and we're going to have an interesting conversation with them, because Spencer comes out of Google, and a lot of the genesis of CockroachDB comes out of Google. We're going to talk a little bit about that, we're going to talk about distributed SQL, we're going to talk about working with OpenShift to make all that work for you, and kind of the impact of cloud, 5G, serverless and all those kinds of things.
B
That is great, and thank you, Diane. Thank you for having us, sincerely. This is a little bit of a homecoming for me as well, because I was a member of the CoreOS team, and so, you know, CoreOS being part of Red Hat and part of OpenShift, and this whole shift to Kubernetes and everything else. I'll introduce Spencer, or rather I'll let him introduce himself, but I gotta say, the very first time I ever met Spencer...
B
...you know, this wonderful demo, I think. But when I found this database I was like, wow, that's what's needed, that is right for this modern architecture. And so, hi, my name is Jim Walker. I am the VP of product marketing here at Cockroach Labs. So, Spencer, do you want to introduce yourself? Has the fire engine gone by yet?
B
Yes, I believe it has. That's the risk of conducting this interview on top of a roof in Manhattan, so I will try to mute if it happens again. Hopefully it doesn't, yeah.
B
Yeah, I've been in the industry now, wrestling with databases, for about 30 years. It started even when I was at university. I wasn't very interested in databases, but the reality is, if you try to build anything, you run headfirst into them. I did a startup and then ended up at Google in 2002, and I was there for 10 years, and it was really a front-row seat on the evolution of Google's interest in databases.
B
One of the first things I worked on was around the AdWords product. They were using MySQL, and they ended up having to shard it, which just means that you have some customers on one MySQL instance, but it gets too big, so then you employ another MySQL instance, and then another, and then another as you grow. By the time I left that project it was about 32 shards, and it eventually went to about a thousand shards before they replaced it. Then Google built Bigtable, which was just super interesting. That was about, hey...
B
We don't want SQL, or we do want SQL, that's not really the point: we want scale, we want elastic scale. And then they built something called Megastore, which started to introduce transactions a couple of years later, and then finally Spanner. I actually worked on some of that distributed infrastructure, a product called Colossus, which is an exascale distributed file storage system, kind of the successor to GFS, but I also worked on applications in between, and so I got this really nice cycle, right?
B
You work on applications, and you decide, well, there are some things that are really missing from what the infrastructure is providing. Then you work on infrastructure and you get to provide those things, and then you go back to applications. You realize, that's great, I've got that now, but I need these other things, and so the cycle continues. When we left Google (I say we because I've got two co-founders at Cockroach, and all of us worked at Google for those 10 years together)...
B
It started as sort of a manifesto. We weren't trying to build a database when we left Google; we were trying to build a private photo-sharing application. But it became pretty obvious that what Google had evolved internally was exactly what we needed for our startup. Then we got acquired by Square, and we realized, wow, Square needs a lot of the same things that we needed at Viewfinder, and that Google needed for the 10 years we were at Google. And then you look around at people...
B
I knew people at Dropbox and Pinterest and Yelp, and everyone needed a database like this. So that's where we started the open source project, and you know, the rest is sort of history. It's been five and a half years now with Cockroach, and we have some of the world's biggest companies and some of the world's smallest companies as our customers. It's very exciting. But the right way to think about Cockroach is that it's a relational, SQL-based database.
B
From the most basic parts up, the overall system is fully distributed, and what that really buys you is what we're going to talk about fundamentally in this interview. But it's been a long journey, and you know, I was never interested in databases when I was at university, but that's what I've basically spent all my time on since. So, Spencer, thank you for that. That was, like, basically the first half of our conversation.
B
In a nutshell. I love it, buddy. So there are two things I actually want to ask you about, too. In university, you also had a pretty popular...
B
You know, coming out in '93, everyone had Windows or Mac at home, Mac OS, one of those earlier versions. We had Windows 3.1, or I can't remember exactly what it was at that time. And we got to Berkeley, and it was, wow, look at these Unix systems. By today's standards they weren't so good, but we were using these Sun Solaris workstations, and we were so impressed by the Unix ecosystem and all of the free software.
B
What really was missing, though, was photo editing, which we were used to doing with Photoshop and the like. There was xv, and there was something called XPaint, and they were, you know, pretty sad compared to what Photoshop was at that point in time. So we kind of decided one night, after a couple of beers I guess, that, you know, hey, this would be a wonderful thing to build as sort of free software. And so that's where the idea of the GIMP came from, and it was, you know, titled the GNU Image Manipulation Program.
B
For basically our entire undergraduate career we were often skipping classes and not doing a great job on some of our projects, in lieu of working on this, like, full-time. And then, when we left college, the beauty of it is that we stopped working on it in '97 and it still exists, and it's going strong, right? Since we stopped working on it, it's been 23 years.
B
It's hard to imagine, but 23 years, and the open source community has inherited it and promulgated it and improved it continuously, and that's the beauty of open source. And that is the beauty of open source; it's exactly right. It's years and years later, and here you are, still doing open source as well. But the question that you and I always get, no matter where we go...
B
Well, as you can tell, I think Peter came up with the name, and Peter and I happen to have similar senses of humor, and so we like things that are a little bit darker, maybe, is the right way to say it. It just kind of tickles our funny bone.
B
Rather than the normal things like MySQL and Postgres, we said, you know, we want something that's like Google Spanner, but it should be open source. And if we're going to think about how it's going to work, it's like these individual sort of pods or nodes that are all greedily optimizing, but making sure that the data they have is replicated elsewhere, sort of greedily managing that. If you give them more space, they colonize it, right? And so the evocative concept was a bunch of cockroaches.
B
Well, funny enough, I think, love it or hate it, people remember it, that's for sure. It definitely has that going for it. So let's talk about Spanner and Google a little bit, because that is really the genesis of CockroachDB. I mean, the white paper that Google published, right, and some of those sorts of things. You were kind of on the front lines, in terms of, like, they had Bigtable. You mentioned this in your intro.
B
They had Bigtable, they had a couple of other solutions. Why did Google need to build yet another database on top of Borg and everything else at the time? I mean, I know you weren't on that team, but you were kind of adjacent to that, right? I think you were working within, like, the Colossus team at that time, like the file system? Correct, that's right.
B
Google has had a long and storied history of databases, and I'm sure it's continued in the eight years I've been gone. You know, Google is not afraid to do pretty involved R&D, and part of the reason was that they had to, if they wanted to realize their ambitions. If you think about Google in 2002, scale was an unbelievably pressing concern for them. They had the entire world basically starting to do, you know, daily, maybe hourly, maybe by-the-minute...
B
...actions that involved their systems, and so they needed very large scale databases. At the beginning it was really read-only databases, or write-once, read-many-times. They had these systems that were really just read-only indexes, and then they started moving to something that could kind of gradually layer in additional changes, called the RT server, and then the indexing pipeline was kind of its own database and index, and so forth.
B
It was very custom-purpose, but then they started branching off into other things like AdWords, for example. So they needed something to manage all the creatives and, you know, the places where people come in, and that's where they started using MySQL. But at the same time they started building other things, like storing data for every, you know...
B
If you had a cookie and you started searching on Google, they wanted to associate data with that so that they could build better search, so that they knew what you were probably looking for, based on past things. And so there was a huge need for massive-scale...
B
...data. What's interesting is that original MySQL AdWords project, because it started to scale way beyond what one instance of MySQL could handle. I got put on that project to help, because it started to fall over when it got to more than four shards, with just too many connections coming in from too many application servers, and they kept crashing the databases. They had this ads war room, because every morning we'd be in there, and Jeff Huber was running it (he's had quite a career), and we'd...
B
...look at the problems that had happened since the last day, and we'd figure out how to solve them: what the short-term fixes were, and what the medium- and long-term fixes were. That went on for months, because it was such a big problem. We got it somewhat stable, and we built lots of interesting things, and as I mentioned before, it went to a thousand shards. Oh wow. So, you know, it was hard enough getting past eight, and getting to 32...
B
They went to a thousand, and if you actually add up all the engineering involved in that over that 10-year period, it's probably enough to build a couple of databases. So that's interesting, but this is a pattern that repeats itself. At Facebook, for example, they have hundreds of thousands of shards of MySQL (not a thousand: hundreds of thousands), and Facebook has spent engineering millennia knitting those together into a big meta-database.
B
Very custom, super-snowflake technology, only good for Facebook. And, you know, what Cockroach is building probably wouldn't ever be able to apply to Facebook, because their scale is so enormous, and they have a careful balance of consistency and eventual consistency; it's a very complex system. But what's interesting is that Google's experience with AdWords and MySQL and sharding led them to create a moratorium on that kind of an architecture. They said, no, we're not going to do this again. Once is enough, right?
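To make the sharding mechanics concrete, here is a toy sketch (purely illustrative, not Google's actual system) of application-level sharding: each customer is pinned to one database instance chosen by hashing the customer id. The hashing scheme and names are hypothetical; the point is why each growth step in shard count was so painful.

```python
# Toy sketch of application-level sharding: each customer is pinned to
# one database instance, chosen by hashing the customer id.
import hashlib

def shard_for(customer_id: str, num_shards: int) -> int:
    """Map a customer to one of `num_shards` database instances."""
    digest = hashlib.sha256(customer_id.encode()).hexdigest()
    return int(digest, 16) % num_shards

# With 4 shards, every query for this customer must be routed to the
# same instance, and cross-shard transactions get awkward fast.
print(shard_for("customer-42", num_shards=4))

# The growth problem: changing the shard count changes the mapping for
# most customers, so their rows must be migrated between instances.
moved = sum(
    shard_for(f"customer-{i}", 4) != shard_for(f"customer-{i}", 32)
    for i in range(1000)
)
print(f"{moved} of 1000 customers move when resharding from 4 to 32")
```

With naive modulo placement, roughly 7 out of 8 customers land on a different instance after a 4-to-32 reshard, which is one reason every jump in shard count consumed so much engineering.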
B
So if you want to build a large-scale system, you've got to use something else, and around 2004 is when Bigtable came out. Bigtable is really geared towards that problem I was talking about, of associating data with cookies and search: just massive scale. And Bigtable wasn't like, you know, we don't like SQL, we don't want to use SQL. It was really, well, SQL is really complicated and we've got a bigger problem right now, which is scale.
B
Back then, the question was how you knit a big database together out of all of these commodity servers, and that's what Bigtable was. It was groundbreaking, and that paper was definitely one of the widely read papers at the time, and it spawned the entire NoSQL movement.
B
Yeah, they started a lot of them, and internally it was very exciting too. It was an amazing piece of technology, and everyone was pretty jazzed about it. I remember there was an ask early on, I think from the infrastructure team: okay, AdWords, you guys are having trouble with your MySQL sharding, just use Bigtable. This will handle your scale issues. And there was a huge pushback, because they said, okay guys, we have, like, I don't know what it was at the time...
B
...probably a couple hundred tables. We need transactions; we're talking about people's financial information. There's essentially a ledger happening. There's a huge impedance mismatch between us coming from a relational database system, even if we have a very awkward sharding mechanism, to going to Bigtable, which is just not set up for this. Bigtable is great for certain kinds of tasks, but AdWords required something that, you know, fundamentally...
B
The SQL relational database had been evolving for 40 years by that time. Forty years, right? So you can't just replace it in one fell swoop. And so, interestingly, Google responded to that. The infrastructure teams responded back, and they created something called Megastore, and Megastore came out two years later. This is very interesting, because MongoDB didn't get transactions until, like, 2019 or something like that, right? Google had transactions, a limited form of them to be fair, in 2006, because Bigtable kind of fell flat for certain uses.
B
We need to have some kinds of consistency on top of Bigtable, and Megastore introduced that. They also introduced the idea of consensus-based replication, so a big step forward in Megastore, two years after Bigtable. But quickly after that, people realized, you know what, there are more things we need to do. We want general-purpose transactions, not these limited ones that Megastore brought. And in some ways you can think of Spanner as sort of closing the circle; like, the circle got opened, right?
B
It's like, we need to deal with massive scale, and then, okay, it's starting to close: we need to get some kind of transactions, and now we need really general-purpose transactions. And then eventually they created something called F1, which is, we actually want to bring SQL back, because we want to replace AdWords. We want to replace the sharded MySQL mess with the new, next-generation system. But in order to do that, we can't just make them rewrite everything for some weird new API; it actually should be SQL.
B
It should look enough like SQL that we can make that transition. So they came full circle; it's very interesting. And so, you know, when we left Google, we realized, oh, we want a relational database too, and we also want these capabilities, and we know it needs to be open source. So we had the enviable opportunity to fast-follow Google, which is...
B
...a lot easier than doing it all for the first time. Well, I think we're all benefiting from that period of time at Google, right? I mean, today, Spencer, you and I get in conversations with customers, and your conversation right now is almost exactly some of the stuff that I'm hearing in some large organizations: well, we've scaled this database and it's falling over and we're having problems with it; we can only go so far, and then...
B
...legacy tech. And knowing that, well, there is actually prior art, there is actually somebody who has solved this. And, you know, to me, Borg turning into Kubernetes, that whole movement, just the wherewithal and the vision at Google to actually understand what the future of compute was going to be, resulting, what are we, 15, 17 years later, or 12, you know, 20 years later, after that journey... it really, really kind of took off, right?
B
The thing that's really interesting is that you've mentioned Borg and Kubernetes. I mean, for a while, when Kubernetes was new and didn't really do stateful workloads, there was a non-trivial contingent of people that would opine frequently that, well, Kubernetes is really just for stateless services; stateful isn't going to work, that's not what it's for.
B
Well, you looked at Borg back in 2005 and 2006, and believe me, Borg ran everything. It ran Bigtable, ran Megastore, ran Spanner, ran Colossus, ran D. It ran stateful services, and it ran them with ease, and that's, of course, what orchestration needs to do. It's not just the stateless stuff, right? It's everything. So there's a really good match between orchestration technologies like Kubernetes and database systems like Cockroach; they too are made for each other.
B
So it's good to see Kubernetes evolving and maturing and being able to handle that much wider spectrum of use cases and infrastructure. Yeah, that's right. And Brandon Philips, a long time ago, once called this out: it's GIFEE, Google infrastructure for everybody. I think that's what he wanted to have, and I think that's where we're headed. But one of the things that had come out of that whole conversation...
B
I remember this, like four years ago, the whole concept of operators, and actually simplifying the life of how all this stuff works. How much did it take to actually make all this work back then? I mean, just from a pure resources point of view, Spencer: if I'm gonna go from four shards to a thousand shards at Google, or at Facebook, we have teams and teams of people that have to manage this, right? I mean, were people...
B
...using, like, things like engineering years and engineering...

B
Centuries? You know, I'm sure of it. I...
B
...think the system matured to a point where they didn't have to do much work on it at some point in that interim. I really didn't follow it that closely, but I am aware that Facebook has spent engineering millennia on their database systems. Yeah.
B
It's an eye-opening stat, but if you think about it, obviously Oracle had engineering millennia put into it, and so did DB2. And, you know...
B
A database takes seven years to mature, as the rule of thumb of Stonebraker fame goes, and then they have a lifetime which can go decades beyond that, and...
B
There's such a vast surface area to relational databases that follow the sort of modern SQL standards. It's kind of an unfillable pit: you just keep pouring, and it just keeps going, and the thing gets better and better, but you look down and you're like, wow, we can't even see the cement, and we keep pouring. So yeah, it's a huge task, but it's absurdly valuable, right?
B
Every service and application in the world is backed by some kind of a database, and many of them are backed by relational databases. I mean, we have a customer that's one of the big, big financial services companies, and just in one of their arms, just one of their business units, they have 6,500 externally facing applications. That's just mind-blowing. Yeah, yeah, and the number of databases that you hit in your day, just between waking up and this moment, and it's only been a couple of hours...
B
It's just insanity, right? Like, I joke about that and the legacy databases all the time, but I love the concept of an engineering millennium, and it's like, well, there's a DevOps millennium and there's an SRE millennium as well that's put into these systems, and I don't think people really understand what it actually takes. That's one of the reasons...
B
I love that I'm on here with what the OpenShift team is doing at Red Hat, simplifying the operations side, that side of the world. But we talked a little bit about how CockroachDB is aligned very well with Kubernetes, Spencer, just being a distributed database on top of a, you know, distributed orchestration platform, right?
B
That is absolutely what originally attracted me to this company. If you look at the future of the database, to me, I believe it's got to be distributed SQL.
B
You know, I think it was, is it Stonebraker that says it takes seven years for a database to fully gestate and mature, right? Yeah. So, like, building a database takes seven years, right? Building a distributed system adds a whole other layer on top of that. That's complexity on complexity.
B
You know, going back to the beginning of what we've done at Cockroach, architecting from the ground up: what were the biggest challenges to actually get this to work? I mean, I think we had a really good example in the Spanner white paper, but technically there are some pretty hefty challenges that we faced early on, and that we still face today. I mean, we've been building for five and a half, almost six years now. So what were...
B
At the very beginning, well, the distributed transaction model required a lot of work. Also the consensus replication. I mean, it's not unusual for undergraduates to implement Raft, which is a variant of Paxos, and is what we use. One of my co-founders, Ben Darnell, likes to quip that it took him a week to implement Raft and, you know, probably another several years to make it actually work in production properly. So it's kind of shocking to hear that.
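For readers who haven't seen it, the heart of the consensus-based replication mentioned here can be compressed into a toy sketch. To be clear, this is not Raft itself (no leaders, terms, log matching, or recovery, which is exactly where the years of production hardening go); it only shows the majority-quorum idea that lets a write survive node failures:

```python
# Minimal sketch of the majority-quorum idea behind Raft/Paxos-style
# replication: a write is considered committed once a majority of
# replicas have acknowledged it, so a minority of nodes can be down
# without losing availability or acknowledged data.
class Replica:
    def __init__(self, up: bool = True):
        self.up = up
        self.log: list[str] = []

    def append(self, entry: str) -> bool:
        """Acknowledge the entry only if this node is reachable."""
        if self.up:
            self.log.append(entry)
        return self.up

def replicate(entry: str, replicas: list[Replica]) -> bool:
    """Commit `entry` iff a majority of replicas acknowledge it."""
    acks = sum(r.append(entry) for r in replicas)
    return acks > len(replicas) // 2

group = [Replica(), Replica(), Replica(up=False)]  # one node down
print(replicate("x=1", group))  # True: a 3-replica group survives 1 failure
```

With three replicas, one failure is tolerated; with two failures, writes stall rather than diverge, which is the safety property the real protocols work so hard to preserve.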
B
But it's incredible just how much of the devil is in the details in...
B
...these things. And the transaction model is another fascinating thing that we worked on, because Cockroach isn't just distributed in terms of, hey, you can shovel in more commodity hardware within a data center. Cockroach is meant to be distributed across data centers, for example within a region, like different availability zones on the US East Coast. That would be how you really want to do the geo-replication, so you have low latency but consensus replication. But it's also built to be replicated across continents, across vast geographic distances.
B
So how do you build a transaction model that gives you serializable isolation and minimizes all of the round-tripping? SQL is a very chatty protocol: you open up a transaction, you read stuff, and then you write stuff, and every time you do a read, you're actually hitting the database. You'd certainly never want that chattiness to go, for example, from Australia over to Virginia; that would absolutely destroy your transaction time. So how do you actually build...
B
How do you plan for that? What does the topology of the cluster look like, and how does the transaction model work? For anyone that's really interested in the details of this, there are a lot of interesting posts on our blog that describe sort of the state of the art in terms of how we're managing that. But those two things, transactions and replication, are by far the stickiest points, and they continue to be, honestly.
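The round-tripping concern is easy to make concrete with back-of-the-envelope arithmetic. The round-trip times below are assumed, ballpark figures for illustration, not measurements:

```python
# Back-of-the-envelope arithmetic on chatty SQL transactions. The RTTs
# are assumed ballpark figures: ~200 ms Australia <-> US East Coast,
# ~2 ms between availability zones within one region.
RTT_CROSS_CONTINENT = 0.200   # seconds, assumed
RTT_SAME_REGION = 0.002       # seconds, assumed

def txn_time(statements: int, rtt: float) -> float:
    """Each statement in an interactive SQL transaction is its own
    round trip to the database, plus one more round trip to commit."""
    return (statements + 1) * rtt

# The same 9-statement transaction:
print(txn_time(9, RTT_CROSS_CONTINENT))        # 2.0 seconds
print(round(txn_time(9, RTT_SAME_REGION), 3))  # 0.02 seconds
```

A hundredfold difference for identical SQL, which is why the placement of replicas relative to the application matters as much as the transaction model itself.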
B
There's a lot of stuff on the drawing board for how we're gonna really make global transactions something that is trivial for everybody. That's right!
B
That's right, and I always point people to our docs as well, Spencer, because if you really want to get into it, what Jesse and team do on that side is tremendous; there's really, really good stuff there. So, but coming back to distributed transactions: you also mentioned serializable, and I think that's a big task to take on, because, you know, our ultimate competition here is the speed of light, right?
B
Jim, I really like that. Like, the next time some investor asks me, you know, tell me about your competitors, I'll say, well, you know, our only real competitor... well...
B
Well, I use it all the time, because, I mean, look: building a distributed transaction model from the ground up is not simple. I think there are a couple of companies that are trying to do this as well. It's not a simple task. You know how long it took at Google; I go through the beginning of everything at Google, how long it took to get to Spanner, Spencer, because it...
B
...takes a lot of expertise, because these are not simple things to do. You know, building Oracle from the ground up, from way back: what an incredible database. I mean, it is incredible technology. I still think that and Photoshop are two of the most incredibly complex... I mean...
B
It's a fantastic database system, no question. I mean, it's running most of the world's high-value use cases, or at least a plurality of them, and it's definitely an extraordinarily successful company as well. There are some things that, you know, people don't love about their vendor relationship with Oracle, but you can't argue with their success; it's a great product. Yeah, I mean, a lot of the advances that our society has are due to that, because of this transactional model and being able to actually implement serializable, like our banking...
B
...like, applications, right, Spencer? I mean, there are a couple of things that we did that I think are interesting here: like, the way that we implemented it, but also geo-partitioning, which is kind of part of that whole conversation as well. Can you just describe a little bit about, I mean, I guess, you know, we're only at the half hour, go a little deeper into kind of how that works with Raft and everything, and how the geo-partitioning works, I guess.
B
One thing that's just really important: there's a dichotomy in the database that I think people sometimes conflate. There's geo-replication, or just replication (generally it doesn't have to be geographic), and then on top of that is the transaction model.
B
So when you write a single key, or even a group of keys that are kind of close together, they can all be written atomically as part of the replication protocol, Raft or Paxos, it doesn't matter which. The problem is that when you have a truly distributed system that's quite large, you've got lots of different, for lack of a better term, shards.
B
We call them ranges, but they're essentially chunks of the key space that are replicated between, say, any three nodes you have. But you might have a hundred nodes, and so you've got lots of different replicated shards of the key space. And so, in order to actually write to multiple parts of that key space that live on different nodes in the larger distributed system, that's where a distributed transaction comes in. So at the low level you've got replication, through something like Paxos or Raft, and at the top level, the distributed transaction.
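A hypothetical sketch of the range idea just described: split the key space into contiguous chunks and place a few replicas of each chunk on the cluster's nodes. Every name and placement policy here is illustrative, not CockroachDB's actual implementation:

```python
# Illustrative sketch (not CockroachDB internals): the key space is
# split into contiguous "ranges", and each range is replicated on 3
# of the cluster's nodes.
import bisect

NODES = [f"node{i}" for i in range(100)]   # a 100-node cluster
SPLIT_POINTS = ["g", "n", "t"]             # range boundaries, in key order

def range_for(key: str) -> int:
    """Which range (chunk of the key space) owns this key?"""
    return bisect.bisect_right(SPLIT_POINTS, key)

def replicas_for(range_id: int, replication_factor: int = 3) -> list[str]:
    """Pick 3 of the 100 nodes to hold this range. Round-robin here;
    a real placement policy would weigh zones, load, and disk."""
    start = (range_id * replication_factor) % len(NODES)
    return [NODES[(start + i) % len(NODES)] for i in range(replication_factor)]

# A write to "apple" runs consensus among range 0's three replicas.
# A write to "zebra" involves a different trio of nodes, so touching
# both keys atomically needs a distributed transaction on top.
print(range_for("apple"), replicas_for(range_for("apple")))
print(range_for("zebra"), replicas_for(range_for("zebra")))
```

The two layers are visible even in this toy: per-range consensus keeps each chunk's replicas in agreement, while the transaction layer coordinates writes that span chunks on different nodes.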
B
You know, so you mentioned serializable. As probably most of the people listening to this are aware from their experience with databases, in the ANSI SQL standard there are a number of different isolation levels. These isolation levels are really difficult to understand, because they're not based on something that developers are thinking about. What they're really based on is the trade-offs database developers have been able to create between how much...
B
...latency there might be in a transaction versus how much isolation you're getting. And so you can have less isolation and really let the database blaze through transactions that might be overlapping or intersecting in some way; they don't really care about each other. Some people call that, you know, I shouldn't use the word, but it starts with an f... let's say "effort mode" transactions.
B
Let's say, you know, that's like read uncommitted, and then it goes through these different levels, all the way up to serializable. And when people talk about ACID transactions, that I, isolation, means serializable, let's just be honest, right? All those other modes are about trading off isolation in order to get more efficiency in the database. So, you know, what we decided to do... one of our sort of overall missions for Cockroach Labs was to make data easy.
B
It's actually a great mission, because you're only ever going to asymptotically approach it: data is not easy, it's very, very much not easy. So by having that mission, we always know what our north star is. Can we make the database simpler for people? And I talked about how to make it so that people could truly do global transactions easily; that's kind of one of our north stars. But in the early days of Cockroach, it was like...
B
How can we make it so that when you're using distributed transactions, you're not trying to understand the distributed transaction model, so you don't screw things up, but it's still efficient? So we said there's going to be one mode of isolation in Cockroach: serializable. We started with two, actually; we had snapshot isolation as well. We ended up with just the one, serializable.
B
Still, we made a huge amount of progress in the first three years, which really allowed us to release with only serializable and to make our customers happy with that. And that's great, because as a developer, trying to understand what it means if you do repeatable read versus read committed versus snapshot versus serializable is a very difficult prospect. In fact, people never get it right, and there was a paper about this that came out of Stanford.
B
It's called ACIDRain, and it analyzes available open source e-commerce applications, which actually happen to run more than 50% of the e-commerce on the web. They analyzed them with this very intelligent tracing mechanism to see whether there are what are called anomalies, due to weaker isolation levels, in the transactions that the e-commerce systems were doing. I mean, some of these e-commerce systems weren't even doing transactions some of the time.
B
It would let you do things that should be illegal if you had the right isolation level, and that allowed people to check out multiple items and pay once, or to use coupon codes multiple times. And the reality is that this is just a small sampling of e-commerce systems that happen to be very heavily used.
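The coupon anomaly he describes can be sketched in a few lines. This is an illustrative simulation, not code from the ACIDRain paper: two interleaved checkouts both read a one-use coupon as unredeemed before either writes back, so the discount is applied twice, which is exactly the kind of anomaly a serializable database would refuse to commit.

```python
# Illustrative sketch of the coupon anomaly under weak isolation:
# two checkouts interleave their read-check-write steps, so a
# one-use coupon ends up redeemed twice.

def redeem_interleaved(coupon):
    # Step 1: both "transactions" read the coupon state; both see unredeemed.
    read_a = coupon["redeemed"]
    read_b = coupon["redeemed"]
    applied = 0
    # Step 2: each writes back based on its own (now stale) read.
    if not read_a:
        coupon["redeemed"] = True
        applied += 1
    if not read_b:  # stale read: still False, so the second checkout also applies it
        coupon["redeemed"] = True
        applied += 1
    return applied

coupon = {"code": "SAVE10", "redeemed": False}
print(redeem_interleaved(coupon))  # 2 -> the discount was applied twice
```

Under serializable isolation the second transaction would be forced to re-read (or abort), see the coupon as already redeemed, and apply nothing.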
B
Basically, these applications out there are just filled with holes; I mean, they're leaky buckets. So just to give you a practical example of the risks here: there was a bitcoin exchange (I don't want to say the wrong one, I can't exactly recall which it was, but it's in the paper) that was completely emptied out, and it was a currency-based, ACIDRain-type attack. So this is actually a huge risk, and there's real money in it.
B
So you know, this is something where we wanted to remove that difficulty of cognition from the developer. It's like: no, I use Cockroach, I always get exactly the right isolation level, and I don't have to play games with it trying to make my thing faster. So that's:
B
That's kind of a state of mind that we enter, like the product ideation: where are we going to set our standards? How are we going to set them? And it's just very helpfully aided by what our true north is, which is really to make data easy. That's right, and don't make compromises, because we want to actually have the serializable isolation, right.
B
They want it to work; they expect it's going to work. I mean, databases work really well, right? The problem is how these isolation defaults get set: if they defaulted to serializable, which would be the safe thing to do, and let people opt out of that, then what would happen is people would do a load test and say, this database sucks, it's really slow.
B
And so, I know it's a little cynical, but in the end, what you end up with is a lot of people that don't understand isolation levels, use the default isolation level, and are leaving a big gap in the security of their system. That's, I think, a fundamental problem, and something we wanted to let users off the hook for, that extra knowledge. We do want developers to do what they do, which is like: hey,
I've got to solve this problem, I want to build this application, I expect the database to work, and it should work. Well, that's right! That is exactly what they should think. They shouldn't have to understand isolation levels, but other databases either force you to understand them, if you're going to be a very responsible programmer, or they let you do the wrong thing. That's:
B
You were mentioning, Jim, that we just had a paper in SIGMOD, which, that's right, is a wonderful, wonderfully concise description, in 10 pages, of what we've been talking about, for people that are interested to read that. And if you really want the gory details, the blog posts from the engineers that worked on these things are really explanatory. That's right, and if nothing else, just to explore the inner workings of Cockroach database, which I think is really interesting.
B
But as people go and build distributed systems, there are some really good examples there of how you actually think about these things from a software engineering point of view as well. I think that's one of the reasons that being part of an open source company, contributing back to a community, matters: we do share all these ideas, and I think there are some really novel ways of thinking through these things.
B
That's what this banner paper did for us, right? So it's only right that we also publish these sorts of things back. But coming back to that: people can use NoSQL to get scale across the planet, right, but they can't get serializable isolation, and that's one of those things. That's why, like at AdWords, right, way back in the day, Spencer, that's why they couldn't use Bigtable, right, and:
B
You know, how are organizations using CockroachDB today? We have, what, a couple hundred customers, so we have some really huge, massive implementations of this, some great brand names, some really high-powered use cases. When people ask you, how are people using you today, what do you typically respond with? Well, I think the easiest answer is that Cockroach is a system of record.
B
It's a cloud native relational database, and it's for any application that needs a system of record, which is most of them. It's particularly, I think, an appropriate choice for workloads that are very high value. I mean, there are certain workloads where you can absolutely use something like Cassandra (it's not necessarily a bad choice) or MongoDB; these
are good systems in their own right. But relational database, high value use cases, the right kind of consistency: that's what Cockroach is really geared towards. There are three primary capabilities, or differentiators, that Cockroach is bringing, and some of our customers need all three.
B
Many of them need just two out of the three, and it's somewhat rare that you just need one, although that does happen. All three of these differentiators are fundamentally born from the distributed nature of its architecture, and that's why it's distributed, right? You create a new architecture not for the heck of it, but so that you get new capabilities. Those three are scale, resilience, and global (or multi-region at least), and I'll
just kind of go through them. Scale is pretty simple to understand: a relational database that's monolithic, you can scale it up, but it's a super-linear cost curve with a ceiling that's actually not very high when you look at global workloads. So as an example, if you're a food delivery service during COVID (and we have a customer that moved; they were using Amazon Aurora at
the XXL size), if you start hitting the limits there, your database starts to cavitate and then falls over. So it became code red, and they had to get off Amazon Aurora, and they tried Cockroach, and Cockroach can scale the writes out arbitrarily. So getting that elastic scale, which is the thing NoSQL brought to the world, to the market, but doing it in the context of a relational database:
that's special. We're not the only ones that do it right now; obviously Spanner can do it, and there are some other, smaller competitors. But it's significant and it really matters, especially as, I think, we're entering the era of what I call transactional big data. If you think about the history of what databases have supported: back in the 90s it was enterprise scale, so within a company,
the kinds of customers that an average company used to have. But then the web hit, this web-scale idea, and that was, okay, at web scale you can get 100 million customers that are banging on this every day; you can get a billion; you can even go beyond that. But there are only so many humans that are interacting with computers, and interacting with the databases that are backing services and applications.
B
I think the next era that we're already entering is really when you have non-human entities with some agency that are connecting to services. As an example, IoT, or virtual agents that aren't even physical pieces of hardware, but just virtual things that are checking, say, the prices on something, but doing it at machine speeds. And now you're talking about not on the order of 10 billion human beings banging on keyboards and mice; you're talking about 100 billion,
maybe a trillion. You're talking about orders of magnitude of increase, and so scale is going to become a much bigger problem even than it is today, and it's already becoming one for many companies. Resilience: I kind of mentioned this with geo-replication. Fundamentally, you want to be able to think of a data center going away not as a disaster recovery scenario, but as IT resilience. A data center going away should be okay: we had three seconds of additional latency, no data loss, no postmortems for our application team.
B
Everything is coming along; we'll get that data center back up, things will re-replicate, and then we'll be back in the nominal green. If you look at Google and their high value use cases, they'll put it across five data centers, even across the different power grids in the United States, because they want to be able to say: okay, we've got five sites, everything's nominal, everything's up and running, lots of redundancy here, things are working fine. Okay,
we need to take a data center out for planned maintenance; we're going to replace something. All right, that's just not going to be available; we know about it, we're planning ahead, so we take that out, and now we're down to four. With consensus-based replication, as long as you have a majority (so if you have five, that's three out of five; if you've got three data centers for replication, that's two out of those three), you're never going to lose data, and you're going to have forward operation in your system.
B
So if you've got five, you have this amazing property where you can take one out for planned maintenance, and then all hell breaks loose somewhere else and you lose another one. Now you've only got three of your replication sites, but you still have complete business continuity. So it's a new way of thinking. It does introduce latency, so it comes at a cost.
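The majority arithmetic above is simple to write down: with n replication sites, consensus keeps making progress as long as a majority survives, so the number of simultaneous site failures you can tolerate is floor((n - 1) / 2). A quick sketch:

```python
# Fault tolerance of consensus-based replication: with n replicas,
# progress requires a majority, so you can lose floor((n - 1) / 2)
# of them and keep operating with no data loss.

def quorum(n: int) -> int:
    return n // 2 + 1          # smallest majority of n replicas

def tolerated_failures(n: int) -> int:
    return (n - 1) // 2        # replicas you can lose and still have a quorum

for n in (3, 5):
    print(f"{n} sites: quorum={quorum(n)}, can lose {tolerated_failures(n)}")
# 3 sites: quorum=2, can lose 1
# 5 sites: quorum=3, can lose 2  (one planned maintenance plus one failure)
```

This is why five sites give the property described here: one site down for maintenance plus one unplanned loss still leaves a three-of-five majority.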
B
You can manage that: you can put the three replication sites very close to each other, say on the East Coast of the United States, or you can spread them out. If you spread them out, you get better non-correlated failure domains, but you have more latency; you pay a price. And when I talk about our transaction model, we're getting it down so that we have absolutely minimal latency on these even geographically distributed transactions.
B
You can do a huge distributed transaction modifying hundreds of different tables, and we can get that all down to one consensus latency. Just one, not even a two-step, which is normal for a distributed transaction; you're able to hide that second phase of the transaction from the end user. You mentioned parallel commits: that's what that is. And then the third thing I mentioned: global, or at least multi-region.
B
This is the most exciting one. This is the 2020s; this is going to be on everyone's roadmap. But the question is: how do you build a global data architecture? What I mean by that is, even as a startup you have the opportunity to have customers in Australia, customers in Brazil. How do you give those customers a first-class user experience? I'll tell you what you don't do: you don't put your entire application in one availability zone. You can't have Australian users hitting Virginia every time they want to
do something in their app. It's going to be, at minimum, like a half-second lag when they hit a button, and half a second you can notice very easily. In fact, the Department of Defense figured out it's 100 milliseconds or less for command and control systems; that's the threshold for something to register as instantaneous.
B
Build it the old-fashioned way, with users in Brazil, users in Australia, users in Japan, and those users are going to feel like this is kind of an antique app. It's kind of like when the iPhone got bigger and you had those black bars on the side, because the app maker hadn't updated their application. You're like: well, this app maker is kind of cheesy; everyone else has gotten with the program; these guys,
obviously, must not have enough engineers, because they're not using my full screen, and it's not as good an experience as it could be. That same thing is going to happen, and part of that's driven by 5G, you mentioned that earlier. But the other big part of it is not just latency; it's about data sovereignty.
B
You've got things like GDPR. You even have states like California and New York threatening to create data privacy regulations that are different from other states; we could have balkanization of data privacy laws in the United States. Vietnam requires you to keep a copy of the data there; South Korea's got its laws; Brazil's got very stringent laws; China and Russia require that all your data is domiciled there. Obviously, if you're Facebook you can fight those, but smaller companies can get bumped out of business pretty quickly.
B
So there's a lot of liability right now in the world, in terms of: hey, I've got a business and I'd like to embrace a global user base, but can I do that responsibly? Can I do that in a way that meets any kind of legal requirements that I have? But also, what's the user preference? Let's say you're a SaaS business and you would like to have customers in Asia.
B
You can bet that their preference, if you're storing their employment records or their sales data or something, is that they want that stored in their legal jurisdiction. And if you're going to compete with a more local operator, one that's Asia-based and that is going to give them that guarantee, you're not going to have a very good story to tell. So how do you build those holistically global data architectures? Facebook can build them; Uber builds them; Netflix builds them.
B
But can an average company, one that might have more than a billion dollars in revenue, build a global data architecture? The answer is they can't, okay, because that's not their expertise; they don't hire the kinds of people that are on that cutting edge. So what we fundamentally think of ourselves as doing is: okay, it's very complex building a truly global application or service, but the database is the hardest part of that, no question. We want to make that easy, right.
B
Scale, resilience, and global: the big three, yeah. And that's exactly right, Spencer. I think what we're looking at is the beginning of a massive, huge transformation in the way that all of us think about applications. I mean, you and I joke about 5G; do you remember when we went from dial-up to DSL, and how fast the internet was at home?
B
It changed what people expect out of you. I think the 100 millisecond rule is hugely important for organizations as they think about applications at that scale. But this resilience thing too: I think this is why a lot of people are moving towards Kubernetes and doing this whole thing, this whole concept of having this
orchestration platform that's doing that for you, but then surrounding it with the enterprise capabilities that they need to basically manage and monitor these things. I think that's where our hosts, Red Hat OpenShift, are doing such a great job pushing down that path: how do we make it easy to install things? How do we make it easy to manage things? How do you sort out a rolling upgrade of software, right?
B
Because what all of this stuff is, from the public cloud to all of the new orchestration technologies, Kubernetes, OpenShift: these things are fundamentally allowing companies to not spend those engineering centuries, or at least DevOps centuries, on it.
B
That allows everyone to do more with less. But ultimately, what you find is pretty interesting: the cloud's a lot cheaper, right, and taking infrastructure as a service, the total cost of ownership is actually less expensive. But, ironically, people end up spending more, and it's not necessarily a bad thing.
B
It's kind of a growth mindset: you can do more with less, and so you have additional engineers, additional capability, which means you can build another service in that same amount of time, on top.
B
You can further improve the experience for your customers, and I think that's the world that we're entering. It's extremely exciting, and fundamentally, I think it's once in a generation that you see this kind of shift, and it's a paradigm shift, a big paradigm shift.
B
The whole thing worked because I was used to the old world, you know; it was a different way. And I think there is prior art; there are organizations like Google and others that have kind of done this sort of thing. So, Spencer, one of the questions that came in was about backup and restore: how do you think about that in distributed databases?
B
Potentially, you know, 250 nodes, and they might be in different countries, and so you've got this geo-partitioning: some data might be replicated only within Europe, some data only replicated elsewhere, and you can control all that at the row level of a table in Cockroach. So yeah, that vastly complicates the burden on backup and restore, and what we actually end up doing,
what we found to work best, is to make the backup and restore policies definable at the same level as we do the domiciling policy. Wherever the data is stored, you also need the backup and restore to work that way. We didn't start that way, but it's moved in that direction because, exactly as you say, Jim, if you have data that's only supposed to live in Europe, you definitely want to back it up
only in Europe too. You don't want that to stream over to the US, for lots of reasons. But one really important point to bring up here is that we don't have to solve all these problems on our own. You realize that in this new ecosystem, all of your customers, well, not all of them, but most of them, have access to all the other cloud infrastructure that's out there. So really, what you want to be able to allow them to do, as an example,
and this is just one of many things you could do, is, for the European data, back it up to an S3 bucket that is sitting in Europe specifically, right. And so you're actually using what AWS is providing, geographically spread out around the world, to augment Cockroach's capabilities and to complement them. And I think that's critical, because if you try to solve this all in the database, the amount of work and the complexity:
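The idea of matching backup destinations to data domicile can be sketched as a small policy: each partition's locality selects a bucket in the same jurisdiction, so EU rows never stream to a US backup target. The bucket URIs and the region map below are made-up examples, not CockroachDB syntax:

```python
# Sketch: choose a backup destination in the same jurisdiction as the data.
# The bucket URIs and locality names here are hypothetical examples.

BACKUP_BUCKETS = {
    "eu":   "s3://backups-eu-central-1/crdb",
    "us":   "s3://backups-us-east-1/crdb",
    "apac": "s3://backups-ap-southeast-1/crdb",
}

def backup_target(partition_locality: str) -> str:
    """Return the in-jurisdiction bucket, refusing cross-border backups."""
    try:
        return BACKUP_BUCKETS[partition_locality]
    except KeyError:
        # Better to fail loudly than silently back up to the wrong region.
        raise ValueError(f"no in-jurisdiction bucket for {partition_locality!r}")

print(backup_target("eu"))  # s3://backups-eu-central-1/crdb
```

The point of the lookup-or-fail shape is exactly what's described above: the domicile policy and the backup policy are defined at the same level, so there is no default path that ships European data to a US bucket.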
it's counterproductive, and it's ultimately not what customers want. You want to use a best-of-breed approach, so we do try to integrate with everything in the ecosystem. Yeah, and to be multi-cloud and run everywhere and take advantage of whatever it is in each of the public cloud providers that they use. It is a big piece. So, Dan, Mike, was there any other question that you guys wanted to throw in here? There was one; I see a question about lifecycle support, yeah.
B
So the question is software lifecycle support: how do you do a software upgrade without downtime? That's a good question. It's actually not just the software upgrade; it's also things like backup and restore. You don't want to lock tables for schema changes, so we do online schema changes. Cockroach is
very much predicated on the idea that any kind of thing you do to manage the database, whether it's an upgrade, or a backup, or changing the underlying data model, requires no downtime, including scaling the database up, scaling the database down, and moving from one cloud to another. All these things can be done live. The software upgrade:
B
I won't go into too much detail, but basically what you're allowed to do is have multiple nodes on successive versions, so you can have a mix of versions. Certain kinds of features don't mind that there's a different version, because they're not really predicated on the version, but sometimes you have a new feature that's never going to work in a mixed-version situation. So what will happen is, while you have that mix, you can't use that new feature
yet. It's available on the nodes that are the newer version, but even on those nodes, you're not actually able to access that feature. Then you get to all of the nodes being the newest version, and at that point you make sure that the system works; you can run it that way for a while. If it doesn't work, you can downgrade back to the previous version. And again, it's really just about restarting each node, sort of in a rolling-restart fashion.
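The mixed-version rule described here, where a new feature stays disabled until every node in the cluster has reached the version that introduced it, can be sketched as a simple gate. The version numbers are illustrative, and this is a simplification of any real cluster-version mechanism:

```python
# Sketch of mixed-version feature gating during a rolling upgrade:
# a feature introduced in some version is usable only once every node
# in the cluster is at or above that version.

def feature_enabled(node_versions, introduced_in):
    """True only when the whole cluster has reached `introduced_in`."""
    return min(node_versions) >= introduced_in

# Mid-upgrade: one node still on the old version, so the feature stays off
# everywhere, even on nodes already running the new binary.
print(feature_enabled([(20, 2), (20, 2), (20, 1)], introduced_in=(20, 2)))  # False
# Upgrade complete: all nodes on the new version, so the feature turns on.
print(feature_enabled([(20, 2), (20, 2), (20, 2)], introduced_in=(20, 2)))  # True
```

Gating on the minimum version across the cluster is what makes the rolling restart safe to pause, or to roll back, at any point.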
B
But when you do that, there are some concerns within the application itself, in how you build it to be backwards compatible with features, right. And so I think one of the benefits of Red Hat OpenShift and Kubernetes is that you can do these rolling upgrades across compute. Think about it: CockroachDB is just a simple application running in a pod in Kubernetes. That's how we get deployed on Kubernetes; we actually fit this environment very, very well. But I think, just for people who are casually listening to this,
B
there are things you have to think about inside the application itself to actually make that work, and work very well. The orchestration platform will surely do this beautiful rolling upgrade for you, which is a huge, massive, awesome capability. So there's another question in here, Spencer; it's a little bit in the weeds, I don't know, is there one you're looking at in particular? I was going to do the atomic clock one; I was looking at the one about benchmarking Kubernetes storage solutions, yeah.
B
Yeah, this is from Walid Shari. It asks: have we benchmarked Kubernetes storage solutions, and what kind of storage would we prefer to run on top of? I'm pretty sure that some of our customers have benchmarked Kubernetes storage solutions; I don't know the actual details there, unfortunately, so I can't answer that part of the question. But the storage
we prefer to run on top of is what we have the most experience with, and right now that's GCP's persistent SSD and AWS EBS. We've tried a lot of different EBS variants, and we've had different experiences with each; it's usually a cost-versus-performance trade-off. Those persistent SSDs and EBS:
B
the really beautiful thing about those storage solutions, versus using raw SSD, is what happens if you actually experience downtime on one of your nodes. Let's say you just lose a node: a node attached to raw SSD would usually require cross-data-center bandwidth to re-replicate that missing raw SSD device. When you use something like persistent SSD in GCP, you actually rely on GCP to do that within the data center, so it saves you any kind of cross-data-center bandwidth.
B
So it's very nice; that's how we run Cockroach Cloud, which is our hosted service. When we have customers that self-host us, if they're running in AWS or GCP, they can do the same thing we're doing with Cockroach Cloud. If they're running in a private data center, I actually think the right thing to use is probably some sort of Kubernetes storage solution that's doing something similar: a distributed file system that can handle intra-data-center re-replication if a node is lost.
B
So I just don't know exactly what the best solutions are there. I know we've got a fairly large number of customers that are doing things in private data centers, with a variety of solutions, from raw SSD to distributed storage systems that can be run with Kubernetes. Yeah, and I've seen people using Ceph under Rook; there's a Rook operator for that. I mean, you know, for us:
A
We're at the end of our hour, and you seem to have covered off everything; you took a lot of the questions that I had out of it. We've done the data sovereignty stuff. A colleague of ours back in the day, Dave McCrory, talked about data gravity and the need for all of our, the GDPR:
A
I'm trying to find them as you mention them, because they're just spurring my memory of these points in time, when things changed. And I'm curious, maybe to end on: where do you think we'll be in, I don't even want to say 2025, four years from now, you know?
B
Yeah, listen: the thing that is my guiding north star for where the ecosystem is going, and I want Cockroach to play very nicely in that ecosystem, is, as a developer myself, how I would like to write my next application. You know, kind of my earlier point about iterating between infrastructure and applications.
B
I want to be able to write something on my laptop and then, when I'm done with it, when it's working when I test it, say: okay, I'm going to push this to the cloud, and I'm not going to have to deal with VMs or any kind of infrastructure myself. Whatever works on my laptop is not just going to work in the cloud; it's going to scale elastically.
B
If I've got 100 million users, it's going to scale to 100 million users. If I've got users in Brazil, all of a sudden it's going to start storing their data in Brazil, and it's going to do all that in serverless fashion in terms of my relation, as a developer, to the whole project. What that means, fundamentally, is: yes, I don't deal with VMs or the hands-on infrastructure and monitoring and all that stuff,
none of that, but also, I only pay for what I use. Because, you know, I've done a number of startups, back when I had to do Exodus co-location and build my own servers and put them there, all the way up to more recently, where I get VMs in EC2 or in GCP. And I'm not very good at that stuff. But worse, what you end up doing when you launch your startup:
B
you've got like 10 people poking at it, because you don't have product-market fit yet, and you're paying for these VMs to just sit there. You've got a VM for your database, which is a big one, because you're planning on success. So you start with all these things, and you're just immediately paying out hundreds of dollars a month. But for what? Honestly, you should be paying cents per month. And so that is the goal.
You should have consumption-based billing on everything, and it should scale globally to any size, and you should never lose your data; it should just always be up, and you shouldn't have to deal with any kind of monitoring. So that's the future: a truly serverless future, where you can take your idea, develop it, and then it just becomes a global reality. The way that Google builds things, right,
the way that Uber has evolved their system, so that when you go to Paris, your Uber account just works as it normally would. That is the kind of system that everyone should be building, where you get that 100 millisecond rule essentially for free.
A
Right, so you heard it here first. If you read lots of people's marketing messaging, that's already here; the reality is it's not, and we still have to build it together. So it's been wonderful getting to hear your story, hear your point of view on how the evolution of distributed databases came about and where you're at right now. We really look forward to having the Cockroach operator working on OpenShift, getting more use cases, more,
A
you know, feedback to you guys, and we want to hear more someday soon about Cockroach Cloud. And I love cockroaches in a way that you wouldn't expect: they never die, and they will be here until the end of time. So I think your naming and branding is brilliant; keep going. Whatever the next product is, I want to know what name you choose, because you're dead on with your naming and branding as well as your technology.
A
So thank you very much for joining us. Michael Waite has got a number of other operators coming up, folks coming in the next couple of weeks, Instana and a few others, so look forward to more conversations around this, and to learning more about what people are building that runs on OpenShift. So thanks for your time, Spencer, and your wonderful view, and Jim, for your great Q&A and for keeping the conversation rolling. Thanks again, guys.