From YouTube: Observability: What Got You Here Won’t Get You There
Description
In the first wave of DevOps, practitioners embraced change to incorporate development into design and build processes. To successfully build the next generation of software, practitioners will need to catch the second wave of DevOps to focus on controlling and fine-tuning evolving architectures. Honeycomb CEO, Charity Majors, discusses how to ensure buildability, empower developers, and make a truly observable architecture that’s primed for success.
Learn more about Kong: https://konghq.com/
Well, what I'm going to talk about is observability in high-performing teams. As the lovely gentleman just said, my name is Charity. You can tell I'm from ops because that's how I feel about software: the only good diff is a red diff.

I've worked in a bunch of places. These days I work at Honeycomb. We just announced this week that we raised another round, so we're going to be around for two more years.
Have you all read Jez and Nicole and Gene's amazing book, Accelerate? I feel like we've been cargo-culting a lot of best practices for as long as I've been in the industry: someone saw something work pretty well at a company that they used to work at, and so it becomes almost like dogma, like "oh, that's better!" Anyway, they surveyed thousands of teams, and they realized that you can actually tell how high-performing a team is by asking just four metrics.
How often do you deploy? How long does it take between when you merge to master and when your code is live? How quickly can you recover from an outage? And how many of your deploys fail? That just makes it much more concrete and clear for us, right? It's not, "well, it feels good, you know, we're spending like 30 percent of our time..."
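A minimal sketch of what computing those four metrics could look like, assuming you keep simple deploy and incident records; the record shapes and field names here are invented for illustration.

```python
from datetime import datetime
from statistics import median

# Invented record shapes: a deploy knows when its change merged to master,
# when it went live, and whether it caused a failure; an incident knows
# when it started and when service was restored.
deploys = [
    {"merged": datetime(2019, 9, 1, 10, 0), "live": datetime(2019, 9, 1, 10, 20), "failed": False},
    {"merged": datetime(2019, 9, 2, 14, 0), "live": datetime(2019, 9, 2, 14, 15), "failed": True},
]
incidents = [{"start": datetime(2019, 9, 2, 14, 30), "end": datetime(2019, 9, 2, 15, 5)}]

def four_key_metrics(deploys, incidents, period_days):
    return {
        # 1. deploy frequency
        "deploys_per_day": len(deploys) / period_days,
        # 2. lead time: merge to master -> live in production
        "median_lead_time": median(d["live"] - d["merged"] for d in deploys),
        # 3. time to restore service after an outage
        "median_time_to_restore": median(i["end"] - i["start"] for i in incidents),
        # 4. change failure rate
        "change_failure_rate": sum(d["failed"] for d in deploys) / len(deploys),
    }

print(four_key_metrics(deploys, incidents, period_days=7))
```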
It's really just these four things. And what that means is that if we put energy and pressure into these four things and make them better over time, the ripple effects for the rest of your engineering organization, for the rest of your company, can be huge, right? Oh, and PS: teams that use platforms as a service are one and a half times more likely to be elite. The way they defined elite is, you know... well, here, I'll show you. Deploying on demand:
very much achievable, right? But it really, really, really pays off, because if you look at the delta between teams that are high-performing and teams that aren't, you can see how quickly you can get out-competed by teams that can move faster and with more confidence.

So what does it mean? What are elite teams made up of? Well, obviously, they're made up of all ex-Facebook employees and MIT grads, says the music school dropout.
It doesn't matter how much will and desire you bring to this if you can't actually see what's going on. So instead of "elite," I really like to use the term "excellent," right? What we need is production excellence. My coworker Liz Fong-Jones has a great talk about this, which you should all watch. We need it because it means happier customers and happier teams.
Like I said, two constituencies. And a lot of times people, and they'll be very hard-nosed business folks, will say it's all about users: "we're obsessed with users." Right? Amazon: their mission is all about being obsessed with users. They never say anything about being obsessed with happy teams, and this is short-sighted. This is short-sighted, and in the end it will damage you, because engineering excellence is tightly correlated with happy, well-rested, empowered engineers.
So next, let's talk about the way things are changing and why this suddenly matters. Systems are changing, problems are changing, and thus people have to change. This is always scary, right? We hate change, as human beings, as animals; it's scary. So let's walk through the hows and whys. System complexity is going through the roof, as everyone here knows, I'm sure. Our systems are now ephemeral, dynamic, loosely coupled, distributed.
Basically, if you look at the architectural diagrams: on one side we see our humble LAMP stack; in the middle is a diagram of Parse, from when I was working there a few years ago; and on the other side is the national electrical grid. When you're building systems these days, I would argue that what you should have fixed in your mind is that grid. You should be building systems like they're the national grid: far-flung, loosely coupled, distributed, with no ability to predict what's going to go on. Failure is a given.
You have to make friends with it. You know, some problems are going to be only hyperlocal, right? You have to zoom way in to see that, say, a tree fell over on Market Street in San Francisco. Could you have predicted that? No. Should you have written a monitoring check for it? Also no. Should you invest in gathering detail at the right level of abstraction, so that you can ask any question of your systems and understand any state they can get into, whether or not you could have predicted it?
Yes, right. Some problems are hyperlocal. Some you can only see if you zoom way, way out: say, every bolt that was manufactured in 2013 is rusting ten times as fast as all the other bolts. This happens all the time in our data centers; last year's batch of RAM, let's say, has to be proactively replaced a lot faster. It happened at the Facebook data centers. This means that our tools for understanding these systems have to shift away from the known unknowns.
With a LAMP stack, you know, I could look at it, eyeball it, and guess eighty percent of the ways it would ever fail. So I'd write a bunch of monitoring checks for those things, right? I would page myself, I would write up runbooks, and then over the next six months we'd kind of bake it in and I'd learn the other 20%. And maybe twice a year I'd be really, truly stumped: what the hell is going on in my LAMP stack? Very, very rare.
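That known-unknowns workflow in miniature: you predict the failure mode up front and encode a threshold. Here is a sketch of a classic Nagios-style check; the thresholds are invented.

```python
import shutil
import sys

# Classic known unknown: "the disk will fill up eventually."
WARN, CRIT = 0.80, 0.90  # invented thresholds

usage = shutil.disk_usage("/")
fraction = usage.used / usage.total

# Nagios-style exit codes: 0 = OK, 1 = WARNING, 2 = CRITICAL
if fraction >= CRIT:
    print(f"CRITICAL: disk {fraction:.0%} full")
    sys.exit(2)
if fraction >= WARN:
    print(f"WARNING: disk {fraction:.0%} full")
    sys.exit(1)
print(f"OK: disk {fraction:.0%} full")
sys.exit(0)
```

This model works exactly as long as the list of failure modes stays short enough to enumerate in advance.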
A
Well,
it
happens
every
goddamn
day
with
modern
systems
right
in
fact,
it's
so
much
so
that,
like
it's,
this
savings
of
shift
from
monitoring
to
observability,
obviously
right
from
a
world
where
you
could
predict
what
was
going
to
happen.
You'd,
write
checks
and
you
write,
run
books
to
a
world
where
you've
pretty
much
automated
those
things
away
and
every
time
you
get
paged
should
be
something
new
that
takes
your
creative
engineering
brain
like
that
again
shouldn't
be
that
should
be
I'm
stumped.
every time you get paged, right? Well, this means that our toolset needs to change in other ways too: to make it more explorable, so that we can start by assuming we know nothing and quickly break things down to figure out exactly what's going on, every time. Because, you know, we're just bad at understanding our systems. We're really bad at understanding our systems. Every one of you, I assume, has had the experience of getting paged, starting to try and figure out what was wrong,
getting the recovery, and everyone kind of looks at each other like: we don't know what happened. Are we going to invest the rest of our day into trying to figure it out? Probably not; very rarely, right? It's fine, it's probably fine. But we really don't understand what's happening most of the time, and I will illustrate this. Here is a problem: photos are loading slowly for some people. Not everyone.
I could do this all day. You know, that one in Romania was probably one of my favorite outages ever. We start getting these reports: "pushes are down." And I'm like, pushes are not down; I'm getting pushes, and they're in a queue, so pushes are not down. And days later they're still very upset, so we go and start figuring it out. I don't know if this is still true, but it used to be that Android devices all have to keep a socket open to,
you know, the push notification service, so they could subscribe to pushes. So we would run an autoscaling group, you know, load balanced using round-robin DNS, so they'd all find their way to this service. And so we add capacity one day. No big deal; we do this all the time. It all came up, started serving traffic, looked normal. Turns out the round-robin DNS record had exceeded the length limit.
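For context on that failure mode: a classic DNS response over plain UDP is limited to 512 bytes, so a round-robin record set that grows past that gets truncated, and clients have to retry over TCP, which some resolvers handle badly. A hedged sketch of checking a record set for this, using the third-party dnspython package; the name and resolver here are placeholders.

```python
import dns.flags
import dns.message
import dns.query

NAME, RESOLVER = "api.example.com", "8.8.8.8"  # placeholders

query = dns.message.make_query(NAME, "A")
# ignore_trunc lets us inspect a truncated reply instead of raising
response = dns.query.udp(query, RESOLVER, timeout=5, ignore_trunc=True)

size = len(response.to_wire())
truncated = bool(response.flags & dns.flags.TC)
print(f"{size} bytes on the wire, truncated={truncated}")
if truncated or size > 512:
    print("over the classic 512-byte UDP limit; clients will retry over TCP")
```

Adding one more host to the pool looks completely harmless right up until the response crosses that line.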
So I ask you: what exactly am I supposed to monitor for, right? Instead of having a few things to monitor for that are well understood and that we can write runbooks for, it's like we have this infinitely long, thin list of things that almost never happen, so once they do... Or, you know, it's five impossible things that all have to converge in order for this one bug to get triggered. Yes, this makes staging basically useless.
They're all completely unknown unknowns: they may never have happened before and will probably never happen again. So, the workflow that we've had. For years we've been like: okay, we encountered a new error, cool. Let's have a retrospective, let's post-mortem it, let's write a monitoring check for it, let's build a graph for it so we can find it immediately the next time I'm looking at this one dashboard, and then we'll write a monitoring check so that when we get the page, we know what it is, right?
Okay, fast forward a few years later: we're getting page bombs of things that are in no way related to the actual problem, and we have tens of thousands of dashboards, and you can't jump directly to the one that describes this exact problem, because whatever was serving that status probably stopped shipping it at some point, and it just doesn't work. Yes, welcome to distributed systems. We are all distributed systems engineers now; I think that means you get to ask for a raise.
Look at the LAMP stack: it's famous for having the database, you know, the one database, and the application, right? And because it had those characteristics, we grew up with monitoring and metrics, which are inherently incapable of handling high-cardinality data. So, anyway: if you ever start using Datadog or whatever, and you think, "it'd be nice to be able to group my metrics by a hostname tag," that works until you have more than a couple few dozen hosts, and then you've suddenly blown out the cardinality of the keyspace, and Datadog is like, pay us billions of dollars, and it still won't work well. Cardinality is important, it turns out. Now everything is a high-cardinality problem, because there are many of everything.
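To make the blow-up concrete: a metrics system stores one time series per unique combination of tag values, so the series count is the product of each tag's distinct values. A back-of-the-envelope sketch, with made-up counts:

```python
# Made-up distinct-value counts for the tags on one request-latency metric.
tags = {
    "hostname": 200,    # a modest autoscaled fleet
    "endpoint": 50,
    "status_code": 12,
    "build_id": 30,     # a month or so of deploys
}

series = 1
for tag, distinct_values in tags.items():
    series *= distinct_values
    print(f"grouped by {tag:<12} -> {series:>12,} time series")

# 200 * 50 * 12 * 30 = 3,600,000 series for ONE metric. Add a user_id
# tag and it is game over. A wide structured event sidesteps this:
# all of these are just fields on one row per request.
```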
We mostly felt... we had the illusion of control, because we felt like we had a handle on the ways that our systems were going to break. We saw failure as a thing that should be prevented, right? We put all this energy into, like, preventing it.
We taught people to fear production; we taught them to leave it to us, right? Which led to all of the, yeah, the self-abuse that ops people are known for. Masochistic on-call culture, that's what I'm looking for. And we treated deploys like they were scary, like they were things to be feared and prepared for and gripped really tightly. I think of it as: we built this glass castle, and it was beautiful.
But it was fragile, a forbidding edifice, and it was very hostile to exploration and experimentation. Well, the world is changing in many, many, many ways, right? So let's look at the technical aspects and cultural associations of distributed systems. We've got many storage systems. I'm not saying that all of these are good things; as someone with a database background, it makes my eyes bleed how many databases people like to use these days, but fine, I'm down, I'm hip.
There are a lot of services. Every check should be for an unknown unknown; every alert should be a new thing. It's entirely an instrumentation game. In fact, this is a big part of what's driving DevOps: the people with the original intent in their head, who are writing the code, have to see it through. You have to see it all the way out, to watch users using it in production.
There is so much that can go wrong in the process. And then deployment: instead of being these big bangs, where we pull hard and heave it over the wall, it's more like baking cookies. You have to think of deployments more like you're putting them in the oven and beginning the process of gaining confidence in your code. It's just the start, right? It's just the start. You should never trust your code when you're putting it out in prod.
This is why we build, you know, progressive deployment stuff and automated checks, auto-rollback when errors exceed a threshold, you know. All of these safety things that we baked in to do these things automatically are done because we know we can't trust this code, right?
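A minimal sketch of that gaining-confidence loop: a canary rollout that widens traffic in steps and rolls back automatically if the error rate exceeds a budget. The hooks here (set_traffic_fraction, error_rate, rollback) are hypothetical stand-ins for whatever your platform actually provides.

```python
import time

STEPS = (0.01, 0.05, 0.25, 1.0)  # fraction of traffic on the new build
ERROR_BUDGET = 0.01              # roll back past 1% errors (invented)
SOAK_SECONDS = 300               # let the cookies bake at each step

def progressive_deploy(build, set_traffic_fraction, error_rate, rollback):
    """Widen the rollout step by step; bail out on the first bad signal."""
    for fraction in STEPS:
        set_traffic_fraction(build, fraction)
        time.sleep(SOAK_SECONDS)
        observed = error_rate(build)
        if observed > ERROR_BUDGET:
            rollback(build)
            raise RuntimeError(
                f"rolled back at {fraction:.0%} traffic: "
                f"error rate {observed:.2%}")
        print(f"{build}: healthy at {fraction:.0%} of traffic")
    print(f"{build}: fully rolled out")
```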
But I think it's almost like the development process is smearing all the way over into production, and the production process is smearing all the way back into dev, so that it's more like a continuum, right? Production is where your users live. Failures have to become your friend, frankly, just because there are too many of them. We have to embrace them. We have to embrace them; we have to make friends of them.
A
We
have
to,
you,
know,
become
a
at
home
with
the
idea
that
in
reality,
you've
got
so
many
things
failing
right
now
that
you
don't
even
know
about
so
sleep
well,
you
know
like
it's
a
different.
It's
it's
it's
kind
of
black
humor,
but
it's
also
true
that
your
software
is
way
more
broken
and
it
is
correct
at
any
point
in
time.
That's
fine!
That
means
deploys
opportunities.
That
means
every
outage
and
opportunity.
I
find
this.
A
Actually,
very
cheering
I
I
mean,
like
I,
believe
that
this
is
the
only
way
that
we
move
forward
to
a
more
humane
industry
is
by
just
accepting
the
human
frailty
of
all
of
our
systems,
best
practices,
you
build
it.
You
run
it
three
years
ago,
when
I
started
talking
about
putting
software
engineers
on
call
people
got
very
angry.
You
were
very
upset
and
now
I
think
that
debate
is
actually
over
it's
more.
The only question now is how, right? But the trade-off here is that, in exchange for everyone agreeing to be on call to support their code, management has an equal responsibility to make sure it does not suck. It should not. Those of us who are over 30, who don't want to get woken up in the middle of the night anymore: we should be able to not plan our lives around being on call, right? It should not be life-impacting.
Don't reward bad management with the gift of your labor. So here's the dirty little secret: the next generation of systems is not going to be built and run by burned-out, exhausted, tired people who can't communicate with each other, and they're not going to be run by teams who are just following orders, who just check in at the beginning of the day and check out at the end and don't really bring their full creative selves. These systems are just too complex now; they're too hard, they're too chaotic. You can't just learn the system and then watch it.
It should be lovable, too, right? But I like this, because when I think about our tools: our tools were designed for a much more predictable world, and you might have looked into this future and thought that this would be scary and terrible and doom, doom, doom. And it could be. But I actually think that all of the incentives are kind of lining up in the right direction to make it a better place. Anyway, philosophical blah blah blah.
Let's talk about tools. One of the things about tools is that the last couple of generations of tooling for understanding our systems, like monitoring, metrics, and logging, were really organized around the idea of answering questions quickly. If you had a question, they would answer it fast, and they do that very well.
But when I think about the kinds of problems that I'm having, I'm usually dealing with a situation where I have maybe some problem reports that I may or may not trust. I have a hazy idea that something over there reminds me of something I might have seen once, which might be wrong. I've got some leaps of fancy; I've got some suspicions.
What I don't have is a question. And in fact, by the time I've worked my way to the point where I have a question that I can ask, I probably know the answer, right? Because finding the answer, it turns out, is much easier than figuring out what the question is. And so we need to shift tooling to a much more exploratory, open-ended way of interacting with our systems, and we also need to pull all the state about these systems out of our heads and put it into a tool where everyone can access it.
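A toy sketch of the difference: if you keep raw, wide events instead of pre-aggregated answers, any hunch can become a group-by after the fact; the question is decided at query time, not at instrumentation time. The event fields here are invented.

```python
from collections import defaultdict
from statistics import median

# Raw wide events, one per request; fields are invented for illustration.
events = [
    {"endpoint": "/photos", "region": "eu", "build": "v212", "duration_ms": 3400},
    {"endpoint": "/photos", "region": "us", "build": "v211", "duration_ms": 90},
    {"endpoint": "/login",  "region": "eu", "build": "v212", "duration_ms": 80},
]

def breakdown(events, field):
    """Median latency per value of *any* field, chosen on the fly."""
    buckets = defaultdict(list)
    for e in events:
        buckets[e[field]].append(e["duration_ms"])
    return {value: median(durations) for value, durations in buckets.items()}

# "Photos are loading slowly for some people," so start slicing:
for hunch in ("region", "build", "endpoint"):
    print(hunch, breakdown(events, hunch))
```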
First of all, we can't fit them in our heads anymore. Did I mention they're ephemeral and dynamic and changing every 15 seconds? If you're trying to reason in your head about how a request is making it through the system, you've forgotten something; you've left something out, right? And more to the point, what's in your head is not available to your team, and this will hold us back.
I think it's really important that we kind of let go of the... like, I love being the genius, you know. Because I remember I could look at a dashboard at Parse and just by the curve of that graph I could tell you the fault is Redis. Like a freaking genius. I kind of miss that, I do, I'm not going to lie. But what I don't miss is the fact that when I was on my honeymoon, I got called like half a dozen times, because nobody else could fix MongoDB, right? There are upsides, there are downsides; on balance, I think it's fair. So next I want to talk about how observability directly leads to these high-performing teams, and, in fact, why you can't really do it without observability. And listen:
I came up with a maturity model, and we distilled everything that we've ever seen or experienced about high-performing teams into these five things. I'll go through them really quickly. They come with pairs of lists, right? One list is: if you're doing well in this area, then this should resonate with you; you should be like, "yes, that describes me," in which case, cool, move on to the next one. Because the other list is pathologies.
Those are, you know, signs that you're not doing well. So, for resiliency: if you're like, "yes, all of that describes me," cool, you don't need to invest in resiliency. We all have a limited number of engineering cycles, and part of what the next generation of systems involves is learning to think ruthlessly about where to spend those cycles. This is why I'm always telling people to kill staging, you know? Because it's a black hole for your engineering time.
It's probably not going to get you the answers that you want, and by the time you do get answers, they're probably out of date. Spend your scarce engineering cycles on production. So, for resiliency: ultimately, on call is not extremely stressful, and that leads to low turnover, which is nice. On the flip side, if your outages are frequent, if you have alert fatigue, if troubleshooting is hard, then this is a place to invest. And observability helps here, because it gives you context.
There is such a difference between being on a team that has invested in instrumentation, where you don't have to have experienced a thing before to debug it, versus teams that are just pattern-matching based on shared trauma, or the last outage. You know, they see one spike in a graph and they're like, "I know what this means," and so they've only got a corner of the answer.
A
It
is
honestly
it's
really
hard
to
describe
to
people
with
this.
What
this
looks
and
feels
like
because,
like
we
found,
we've
had
like
zero
churn
ever
a
honeycomb,
the
heart,
the
heart
part,
is
getting
people
to
experience
it
on
their
own
systems,
because
everybody
knows
that
demos
can
be
faked.
Everybody
knows
that
you
know
fake
data
isn't
real,
but
once
you've
experienced
the
ability
to
interact
and
like
explore
and
just
ask
any
question
understand
any
any
result.
It's
like
night
and
day.
It really is. It's like putting your glasses on before you drive down the road. I'm really blind, and you do not want me to drive without glasses. That's how I feel about having observable systems. High-quality code is another one, right? If you're doing well, you spend your time on customer happiness, not on customer bugs (subtle distinction), and cascading failures don't bite you all the time.
The thing where you're like, "oh, there's a bug here," and you go to fix it, and suddenly your week is shot because nobody understands the code, it's a mess, you can't reproduce anything. When you're not doing well, this also manifests as fear of the deploy process, because any time something goes wrong, you basically have to go learn the code base from scratch, right? It's not intuitive, it doesn't have a consistent look and feel, and you're debugging problems that are nowhere near your area of expertise. And observability helps:
people should be able to answer the question, "how will I know if this isn't working?" That's the bar for instrumentation: how will I know if this thing that I'm about to push is or is not working? And then you go and you look at it: is it doing what you wanted it to do, and does anything else look weird? That catches like 85% of all problems before users can ever even notice them, right? Because that original intent is going to decay from your head.
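One hedged sketch of what meeting that bar can look like: stamp every event your code path emits with the build and flag it ran under, so "is my change working?" becomes a single query right after you push. The emit() helper and the field names are invented.

```python
import json
import sys
import time

BUILD_ID = "v213"  # invented; imagine it stamped in at deploy time

def emit(**fields):
    """Invented helper: one structured, wide event per unit of work."""
    fields.update(build_id=BUILD_ID, ts=time.time())
    json.dump(fields, sys.stdout)
    sys.stdout.write("\n")

def resize_photo(photo_id, new_resizer=False):
    start, error = time.time(), None
    try:
        ...  # the change you are about to push lives here
    except Exception as exc:
        error = repr(exc)
        raise
    finally:
        emit(op="resize_photo", photo_id=photo_id, new_resizer=new_resizer,
             error=error, duration_ms=(time.time() - start) * 1000)

# After deploying: compare error and duration for new_resizer=True versus
# False, and for build v213 versus v212. That answers "how will I know?"
```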
If it's a week from now and you're trying to figure it out, who knows if you're going to remember, let alone someone else who's trying to debug what you just did, right? You know, the best way to catch bugs: at Honeycomb, we have everyone in one on-call rotation, and about once a week someone will get paged out of hours, and we've got a giant multi-tenant system with unpredictable traffic spikes and everything. And I'm not even talking about getting woken up, just paged out of hours.
No one is admitting it? Excellent. Yeah, I mean, sometimes that's a legit step to take temporarily, right? But you should be kind of ashamed of it. You should be like, "here's a band-aid that we have to use for now," and you can't wait till you've progressed to the point where you don't have to do it anymore. Because the number one thing that will make your deploys terrible is batching up multiple changes at once.
Do not do that. All of this fear that people feel about deploys is actually legitimate; it's just that what they should be afraid of is batching changes together in a deploy, not deploys, period. If you're just shipping one change at a time, oh my god, they're not scary. People should not avoid doing deploys. And observability helps here, because everything about this can be instrumented; everything about this should be transparent. It should be receptive to a curious person exploring.
Complexity and tech debt need to be managed; that's the fourth thing. You should be able to answer any question without shipping new code, because if you have to ship new code, well, you have to restart everything and the bug is going to go away. You know, the haunted graveyard: that was my favorite thing. This absolutely helps you do the right work at the right time, and honestly, this is what helps teams win.
You need to be up to your elbows in prod every day, right? Understanding how users are interacting with what you're doing, improving things, making sure that the work you're putting in is having an actual impact. I don't know how people expect to pick the right thing to do, the right work, if they aren't constantly interrogating production. And finally, understanding user behavior: this is everyone's job, right? This isn't just the PM's job, because your code meets reality where users interact with it, in production. That's a magical place.
A
That
is
the
only
thing
that
matters
it
does
not
matter.
If
your
test
can
pass
if
it
breaks
and
prod
it
doesn't
matter
nothing
matters,
but
production
teams
should
share
useful
for
a
view.
A
single
view
of
reality
and
observability
will
ground
you
in
that
reality,
and
sometimes
people
will
protest,
but
I
don't
have
the
time
to
invest
in
it,
which
is
a
chicken
and
the
egg
problem,
and
in
fact
you
can't
afford
not
to
because
you
waste
so
much
time.
Let
me
tell
you
about
all
the
time
that
you
waste.
There was a Stripe developer report recently, where they surveyed thousands of developers, and it showed that about 42% of every engineer's time goes to doing things that do not move the product forward, things that are often frustrating and time-consuming. You know, trying to figure out where in the code the right piece to fix is: fixing it is probably pretty easy; figuring out what needs to be fixed, that can take all week, right?
A lot of that could just be cut out if you could just see what you're doing and where you're going. You really can't afford not to. Engineering quality of life is tightly linked to high-performing teams and resilient systems, and if you're into all of these newfangled things like chaos engineering, observability is kind of a prerequisite for them. In conclusion, thinking about the future: I think we now have consensus that on call is everyone's job. Everybody has to support what they're writing in production, and it's actually key to doing a good job
as an engineer: actually observing your code all the way out to the end of its life cycle. And on call must not be miserable. I think on call is going to be less like a heart attack and more like diabetes, something you manage. It might not ever be pleasurable, although I have seen it be pleasurable; it can be a really nice break in the routine. It can be permission to go off on six other little things that you can't really justify doing during your normal week.
Right? I think it's a good idea to have a policy of: you do not do normal project work during your on-call week. That is a week where you get to work on all the other things. Work on your deploy scripts, work on your deploy pipeline, work on all of the little things that kind of bug you about everyone else's stuff (but, you know, don't take that too far). Work on operability, work on resiliency, work on the things that contribute to on call not being a nightmare.
Often people ask me how to think about instrumentation, and honestly I just say: look at how serverless does it. That's the correct lens, right? You should be observing the world through the lens of the instrumentation that you write. You can instrument almost any code, and deployless is coming.
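One way to read that serverless lens: the invocation is the unit of work, and everything you will ever know about it has to ride along in the one event you emit for it. A sketch of that shape as a decorator; the handler and the event sink are invented.

```python
import functools
import json
import sys
import time
import uuid

def observed(handler):
    """Wrap a handler so every invocation emits exactly one wide event."""
    @functools.wraps(handler)
    def wrapper(request):
        event = {"request_id": str(uuid.uuid4()), "handler": handler.__name__}
        start = time.time()
        try:
            result = handler(request)
            event["status"] = "ok"
            return result
        except Exception as exc:
            event["status"], event["error"] = "error", repr(exc)
            raise
        finally:
            event["duration_ms"] = (time.time() - start) * 1000
            json.dump(event, sys.stdout)  # stand-in for a real event sink
            sys.stdout.write("\n")
    return wrapper

@observed
def get_photo(request):  # invented handler
    return {"photo_id": request.get("photo_id")}

get_photo({"photo_id": 42})
```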
I don't know if y'all have been paying attention to the stuff that Dark is doing. So cool. They're just writing code live on production. It saves it and runs unit tests, and, oh, this is the cool stuff that I've seen people doing: it'll embed a sparkline graph, so when you're modifying a function it shows you an estimate of how long it took to run, and it estimates how long it will take to run after the change that you just made. It's so freaking cool. Tightening these loops is better for everyone, right? Invest in your deploys. Democratize access to this data. Don't be scared by regulations.
That was a mistake. We should have built the playground. We should have built a place where people feel invited to come and experiment and see the consequences of what they've done, of what they've built. Because, as human beings, we deeply crave impact, right? We crave autonomy, we crave meaning, we crave impact, and production is where you get all of these things.
Make production a playground, with guardrails. I mean, we expect children to get a bloody nose at the playground; we don't expect them to die. So invest in tooling to make it safer (we're all partners in this), but guardrails encourage curiosity. Emphasize ownership. Don't punish people for making mistakes, because you make mistakes too; we all do, and our stuff is failing all the time. It's fine, it's probably fine, and it'll be fine if we practice failures. And finally, senior engineers: your job is to amplify the costs that are hidden, right?
Your job is to surface things. Decision-makers are making the best decision that they can with the information that they have, and they often lack what's in your head. A lot of the time, when you're grumbling about something, about a decision being bad, that's because you know something the decision-maker did not. They did not know that they were building something that was going to be impossible to maintain. They did not know that they weren't allocating enough time to testing, or whatever. They didn't know.
You know, they didn't know these things, and it's your job not to just say it once, but to be a squeaky wheel, to be constantly providing the right amount of signal so that the decisions that get made are good. Communication is absolutely your job. So, in conclusion: all this stuff is changing, but I actually think it's going to be all right. I think it's going to be good.
I think the shape of the changes all points to a more democratized, more empowered, more necessary labor force of engineers, who get to come to work and bring their full creative selves and feel very invested in what they're doing. Every engineering org has two constituencies, and this really matters: nines don't matter if users aren't happy. But great teams build great systems, and burned-out people do not make great teams. So everyone should get enough sleep, everyone should feel like an owner, and we have the opportunity to make things better.