Description
Context is King in OpenShift
Matthias Luebken (Instana)
September 23, 2020
OpenShift Commons Operator Hours
OpenShift Commons Briefing
A
All right, everybody, welcome back to another OpenShift Commons Operator Hour, which we like to do on Wednesdays. We have one of the many folks who have built operators that run on OpenShift come in and talk about what they're doing, why they built it, and what their operators do. Today we're really pleased to have Instana here, with Matthias Luebken, and he's going to talk about using Instana's offerings to successfully manage applications running in Kubernetes. So, as he likes to say in his title, context is king. I'm going to let him explain that and introduce himself, and then we'll have live Q&A at the end. Thank you all for joining us today, and take it away, Matthias.
B
Yeah, thank you very much, Diane. My name is Matthias Luebken, and today I'd like to talk a little bit about our experience of running our operator and running Kubernetes workloads, and about what we see from our customers, the new challenges they have running these. Just a few words about myself: I'm the PM for Kubernetes and infrastructure at Instana, and I've got some experience in software development all over the place. I've actually been with Red Hat and done some interesting stuff over there, and now at Instana we're focusing on helping developers and DevOps teams manage all these crazy things that we're seeing, and operators help a lot. I hope this talk helps a little bit with some of those experiences. All right, so basically it's one talk in one slide: this is the agenda, this is the talk.
B
What I would like to do today is give you an understanding of what to look for if you're running an application in Kubernetes. We separated the different aspects into different perspectives and came up with three basic perspectives, three different views on this, and that's what I'd like to share today, along with very tangible ways of doing it. The slides contain a lot of further links for deep dives on how to get it going for yourself, so I hope there are a lot of things to take away. This next bit is actually something somebody brought up in May, and I love the phrasing of it: Kubernetes is very good at solving the problems it introduces in your environment.
B
Kubernetes
and
openshift
are
awesome
platforms,
our
awesome
platform
for
distributed
for
for
managing
distributed,
distributed
setups,
but
at
the
same
time
it.
It
also
introduces
a
lot
of
the
complexity
that
we
might
or
might
have
not
been
exposed
to
initially-
and
I
think
this
is
a
this
is
a
real
challenge
right.
It's
it's
a
it's.
Yes,
we're
techies
and
we
we
want
to
get
this
all
solved,
but
it's
not
it's
not
always
simple
right.
B
So if we take a step back and look at what a Kubernetes application actually introduces, there are quite a few new attributes that make these things really challenging. First of all, we've talked about microservices-based applications before, but to be honest, that wasn't at the scale I've seen with Kubernetes. With Kubernetes, a decent application easily consists of hundreds or thousands of microservices, which is very, very different. You can talk about the pros and cons of microservices, but the fact of the matter is that Kubernetes allows this and gives us a lot of means for doing so.
B
So
you
know
people
are
taking
advantage
of
it,
but
they
also
need
to
manage
these
right.
The
change
also
increases
dramatically
what
with
developers
openshift
provides
a
great
platform
for
continuous
deployment,
and
you
know
the
increased
change
of
these
is
just
just
tremendous
and
parts
come
and
go
right.
So
you've
got
these
auto
scalers,
you've
got
rolling
deployments
and
everything
is
getting
ephemeral
right.
So
you
know
also
something
that
you
know
at
least
at
least
for
me,
but
also
what
I'm
seeing
from
customers
is
something
that
they've
not
not
been.
B
You
know
really
used
to
it,
and
you
know,
containers
are
awesome
right.
We've
got
now
the
whole
fleet
of
suddenly.
You
know
just
packing
everything
in
a
container
putting
the
runtime
on
it
and
then
just
pick
the
technology
you
want
everything
is
polyglot
now,
but
who
manages
this
right,
who's
who's,
taking
care
of
this.
B
So
when
you
are
then
looking
off
of
what?
What
does
that
mean?
What
are
the
problems?
What
are
the
challenges
that
that
you
face
that
these
are?
These
are
some
of
them
that
I
I
hear
hear
most
as
if,
if
there
are
using
astana
or
any
other
other
tool
of
looking
up
of
what
what
what
the
kubernetes
applications
are
doing
is
like
okay,
so,
first
of
all
what
is
actually
affected
like
like
I'm
building
my
application,
it
has
a
problem
what
are
actually
affected
and
what
what
metrics
do?
B
I
need
to
look
at
like
well,
what
is
the
cpu
utilization?
What
are
the
request
rates,
and
where
do
I
start
looking
at
what
would
be?
Actually
all
these
metrics
actually
mean
right.
Okay,
now
I've
got
I've
got
a
list,
but
how
do
these
work
together
right
now,
I've
got
a
couple.
What
else
do
I
need
to
look
at
and
what?
What
is
the
root
cause
to
all
this
problem
and
again
right?
This
is
about
you,
know,
developers
devops
that
are
are
now.
You
know,
working
with
this
new
environment
for
them
new
right.
B
You
know,
maybe,
for
some
of
the
industries
like
you
know,
been
working
with
with
this
for
quite
a
while,
but
this
is
like
you
know,
being
new
right.
It's
new
new
for
them,
and
operators
put
another
level
on
it
right.
B
They
do
codify
a
lot
of
these
things,
but
they
put
another
level
on
this
and
stitching
these
things
together
might
have
their
own
means
of
bearing
that
distributed
load
and
putting
their
own
custom
resource
definitions
and
again
like
which
metrics
do
I
look
at
what
do
these
mean
right,
so
another
layer
challenges
and
and
managing
the
company's
application?
B
That
that
you
know
just
you
know
to
make
it
a
little
bit
more
more
tangible
to
to
to
to
discuss
what
we
were
talking
about.
So
this
is
a
very
simple
I
don't
know
if
it's
a
very
simple
example.
B
The Elasticsearch node itself runs on the JVM, so that JVM is running in a container in a pod. Most likely we're running this on a Linux host, which is itself a Kubernetes node with its own properties, and all of this is running in a Kubernetes cluster with some availability setup. So the question is: what do I need to know about what happens? And I guess, if you're looking at a problem, then all of these layers surface.
B
That was the simple stack. The more complete example is that we're actually talking about a cluster: Elasticsearch is not a single node but rather a cluster, and each of these Elasticsearch clusters obviously runs on a couple of nodes. And we might have a Spring Boot application, or another Java application, most likely with a similar stack.
B
So let's look at a problem that could happen. Let's start in the lower right-hand corner with the first one: let's say the node has an I/O problem, and that particular shard in the Elasticsearch cluster gets into a state worth looking out for. Then maybe the thread pool of the JVM on that Elasticsearch node has a problem, and there's a warning sign, so I'm circling these. The yellow circles are warnings in your system. Then another Elasticsearch node needs to take over, and there the thread pool is so overloaded that it goes beyond a certain threshold, and requests are queued up so much that it can't fulfill the service it was supposed to provide. So the overall cluster throughput is decreasing, and the performance of my service, of my application, is decreasing as well.
B
So the message I want to bring across is that this is not a too-complicated, far-off application, it's a rather simple one, but you can see that specific problems could sit anywhere in this setup. I would like to bring a little order into this and give a little guidance on where to start and what to look for. Here's a little example from our tool.
B
So, Instana is a monitoring and observability tool. We gather data from all sorts of sources, and we've got this dynamic graph that hooks everything up. This is actually just a visualization, a fun project: seeing the "death star" of all these components hooked up. Quite fun, actually; the live version is even more fun, because it's dynamic and you can see things moving. All right, so how do we get started?
B
Everyone kind of agrees that the first thing to reason about is our services. A service is what the user, an internal user, or a dependency within the microservice landscape needs; it's the logical abstraction that we're looking at, and we're going to talk about it. But we can't stop there, as I tried to show earlier. The second perspective, and that's hopefully obvious for this audience, is the Kubernetes environment, the OpenShift environment: understanding what's happening there, getting an overview of the namespaces, the pods, the deployments, the other workloads, and so on.
B
Now, the third one, and let's see what we can get out of the conversation here: what I would like to argue is that the infrastructure level is not going away. We've tried to build up abstractions, but from what I'm seeing, looking into examples, there is always a point where you start looking at how a particular container is behaving on a particular host. So infrastructure, as I showed earlier with the I/O example, is still something that we need to look into.
B
But as the talk says: context is king. You need to understand how all these things relate to each other. That's the fourth dimension, or perspective, I'm going to talk about. So, in one slide: if you reason about how to manage Kubernetes applications, I would start by looking at these three core perspectives: the application, the Kubernetes layer itself, and the infrastructure layer, and at how all these things tie together. It's interesting: when I started talking about this, it turned out you could map these to specific roles, and I think certain roles are naturally bound to one or the other. You can think of the services as being for the developer, the Kubernetes side of things as the DevOps side, and the infrastructure as the ops side.
B
I just put them there, but I also think the nice thing about DevOps is that we're not building up walls. We don't want to cut things off and say, "I don't care about the rest, just give me a host and I'm done with it." We want to combine these, so I think it's also important to share this perspective: share these views, share these metrics, share dashboards between the different parts of the organization. I'm looking forward to what you think, but that's my, that's our perspective. Now, just a word: this is one view of it, the one we brought to Instana, but there are obviously lots of other perspectives.
B
Like end-user monitoring, business and custom metrics, synthetic monitoring, networking, security, yada yada yada. When I was preparing the talk, I actually had an argument with a colleague of mine that something else is becoming more and more important than the three perspectives I'm talking about here. But it's just one view, and if you think others are more important, I'm happy to reason and talk about it; still, I think these three are generally applicable. All right, so we've got these three perspectives.
B
What should I look at, and how should I look at it? And then, last but not least, let's put them in context. All right, let's start with the service: what am I looking for? The definition of a service, for me, is something that has a logical context to it. It's implementation- and infrastructure-independent, and we care about what the service provides to its user. We've also got SLIs and SLOs, which work perfectly with this, and the important piece is that it's a logical unit that serves a user.
B
Now, the important piece is that we're looking at an implementation-independent definition here, because technology-specific KPIs can be misleading, and maybe you'll want to exchange the service for a different technology at some point. There are so many things you could consider about the technology, so let's look at the logical unit itself, especially in Kubernetes, which, as I briefly mentioned, is a polyglot environment. So let's take the logical view on this and abstract. If we're then looking at what to observe, the first question you need to answer for yourself and for your team is the granularity of the service.
B
You're not bound, and I think you shouldn't bind yourself, to a Kubernetes Service object as such; think of a service as a logical unit of your own. If you have an old web application on a web server, then you might split out different endpoints; you might have different granularities of what works for you. In Instana, the default is something that is named and has a certain type, like HTTP, database, and the like.
B
The
other
part
is
that
you
know
need
to
consider
is
some
sort
of
higher
level
assemblies
you
know,
given
the
operator
right
operator
is
a
good
means
of
stitching
these
things
together,
there's
also
the
application
crd
or
or
helm,
but
you
know
it
doesn't
have
to
be
on
the
kubernetes
side
of
things.
You
can
also
talk.
Think
about
it
differently
of
you
know
many.
Maybe
some
other
back-end
unit
is
also
belonging
to
that
you
know
logical
assembly
and
I'm
and
I'm
making
this
term
very
very
loosely
we're
calling
these
application
perspectives.
B
There's a pretty good understanding right now of what to observe. The four golden signals from the Google SRE book introduced latency, traffic, errors, and saturation. Tom Wilkie, formerly of Weaveworks and now at Grafana, introduced the RED method, which resonates more with me, because it takes out saturation, which is often a very technology-specific component. That said, both work; I'm going to go with RED, rate, errors, and duration, throughout the talk.
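To make RED concrete, here is a minimal sketch (pure Python, with invented request records; not Instana's implementation) that derives rate, errors, and duration from a window of requests:

```python
from dataclasses import dataclass

@dataclass
class Request:
    duration_ms: float  # how long the request took
    status: int         # HTTP status code

def red_summary(requests, window_seconds):
    """Compute the three RED signals over a time window."""
    rate = len(requests) / window_seconds                 # R: requests per second
    errors = sum(1 for r in requests if r.status >= 500)  # E: failed requests
    durations = sorted(r.duration_ms for r in requests)
    # D: median latency; real systems keep full histograms and percentiles
    p50 = durations[len(durations) // 2] if durations else 0.0
    return {"rate_rps": rate, "error_count": errors, "p50_ms": p50}

window = [Request(12.0, 200), Request(250.0, 500), Request(30.0, 200), Request(18.0, 200)]
print(red_summary(window, window_seconds=2))  # {'rate_rps': 2.0, 'error_count': 1, 'p50_ms': 30.0}
```

In practice these three numbers per service are what the dashboards in the next slide show.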
B
So these are just some examples. On the left-hand side you see the Instana dashboard, where we show that information for the service, but obviously this is not bound to any particular tool: on the right-hand side I've put a Grafana dashboard that shows similar stats. And in OpenShift you get similar views showing the traffic and the errors, so again RED: rate, errors, and duration.
B
I guess the most common approach out there is capturing these natively, or with some library, out of the workload itself. In the OpenShift and Kubernetes space, Prometheus is the standard, and if we're looking at Java again, there are tons of options: using a specific library such as the Prometheus Java client, the JMX exporter, or the Micrometer exporter. That's the most common approach we're seeing, and it's probably also the most straightforward.
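Whatever library does the capturing, what Prometheus ultimately scrapes is the text exposition format. Here is a small illustrative sketch of producing that format by hand (the metric and label names are invented, not from any specific library):

```python
def render_prometheus(metrics):
    """Render {(name, labels): value} as Prometheus text exposition format."""
    lines = []
    for (name, labels), value in sorted(metrics.items()):
        label_str = ",".join(f'{k}="{v}"' for k, v in labels)
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines) + "\n"

# A hypothetical request counter split by status code, as a client library would keep it.
counters = {
    ("http_requests_total", (("service", "catalog"), ("status", "200"))): 1027,
    ("http_requests_total", (("service", "catalog"), ("status", "500"))): 3,
}
print(render_prometheus(counters))
```

A real client library maintains these counters for you and serves them on a `/metrics` endpoint; this only shows the wire format the scraper sees.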
B
A different way of doing this is capturing it from distributed traces. With distributed tracing, you look at the traces between the different services and use those traces to calculate the different KPIs. The advantage is that you don't have to do anything extra: if you're using a service mesh such as Istio, for example, you can capture them automatically. The other advantage, at least within Instana and some other tools, is that you can dynamically change the service composition and granularity, something I talked about earlier.
B
So that's an advantage of working that way. Just one note: if you're sampling the traces, please be careful about this and store the metrics separately.
B
That's always the starting point: that's where we're basing our SLIs, our service level indicators, and our SLOs, the objectives we strive for. It's the starting point for it all, and if you do only one thing, then do this. But for understanding the whole picture, I think the other perspectives are equally important, which brings us to Kubernetes.
B
I don't need to talk about Kubernetes too much for this audience, but the orchestrator of distributed workloads has a lot of new things to take care of that might have been hidden earlier. Kubernetes opens up this environment for us, schedules the workloads across the fleet, and makes the resources available to the actual workload. Something that I haven't included in the earlier example is persistent volumes: persistent volumes for the Elasticsearch environment, for the Elasticsearch application, so that the database can actually be stored. That's the job of Kubernetes: it has these great APIs everyone is talking about, and it makes sure the different setups, and the different and new workloads, run, together with operators.
B
Now the question is: what do I need to look at? I guess the first thing is just the cluster itself, depending on where you are; even if someone else is managing it, you probably need to look at the cluster itself, at the control plane, just making sure the cluster runs, so that if there is a problem you can correlate things with it. Is etcd behaving as expected? Has the state been distributed through etcd? That is just one indicator, one piece of information you need to gather. Now, on the workloads themselves, the distribution state is essential: how many of my desired workload pods are actually running?
B
If it's a DaemonSet, is it evenly distributed on all the nodes that I want covered? If I have a Deployment, is it at the scale that I need? And if something is becoming unavailable, is that still within my budget, or is it something I need to act on? On the workload side of things, for the scheduler to make sure the workloads are distributed, we have requests and limits, but we need to put these in context with the others to make sure we've got that covered. All right, again two examples here: looking at the different CPU resources, the requests and limits, and at their utilization, is the starting point for investigating things from the Kubernetes perspective. That's also something that I found very interesting.
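As a sketch of the kind of check meant here (the millicore numbers are invented), relating a container's actual CPU usage to its request and its limit:

```python
def cpu_pressure(usage_m, request_m, limit_m):
    """Relate actual CPU usage (millicores) to the scheduler request and the hard limit."""
    return {
        # above 1.0 means the pod uses more CPU than it asked the scheduler for
        "vs_request": usage_m / request_m,
        # approaching 1.0 means the container is close to being throttled
        "vs_limit": usage_m / limit_m,
    }

p = cpu_pressure(usage_m=450, request_m=250, limit_m=500)
print(p)  # {'vs_request': 1.8, 'vs_limit': 0.9}
```

A pod like this one runs fine in isolation, but because it uses far more than its request, it is the first candidate for trouble when the node fills up, which is exactly why the numbers need to be read in context.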
B
These three perspectives are in no way a strict order. We need to measure the services, as I said earlier, but it's also very natural for people coming from a different background to start somewhere else. In the Kubernetes environment, maybe I'm more on the DevOps side: I need to make sure that a new namespace is running smoothly, so I'll start with the namespace. But I guess the important piece is that when you're starting with Kubernetes, you also understand the environment, how things are running on the host itself and on the cluster itself, that you have some means of getting there, and that you also have some means of understanding the applications: what are the developers actually putting on there? So it's about having good starting points.
B
I think that's important, and so is linking them, which I'll talk about. As for how to measure: that's actually pretty nice in Kubernetes, because it's basically all there. We've got kube-state-metrics, which covers everything around the workloads and the configurations themselves and provides those metrics; on the control plane we have the individual metrics endpoints; and, just for completeness, there's the metrics server, too, for doing auto-scaling.
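For instance, kube-state-metrics exposes desired versus available replica counts per Deployment (`kube_deployment_spec_replicas` and `kube_deployment_status_replicas_available` are real metric names; the sample scrape below is made up). A minimal parser for that distribution-state check could look like:

```python
import re

# A made-up fragment of a kube-state-metrics scrape.
SAMPLE = """\
kube_deployment_spec_replicas{deployment="shop",namespace="prod"} 5
kube_deployment_status_replicas_available{deployment="shop",namespace="prod"} 3
"""

def replica_gap(exposition):
    """Return desired-minus-available replicas per deployment name."""
    values = {}
    for metric, labels, value in re.findall(r'(\w+)\{([^}]*)\} (\d+)', exposition):
        pairs = dict(kv.split("=") for kv in labels.replace('"', "").split(","))
        values.setdefault(pairs.get("deployment"), {})[metric] = int(value)
    return {
        d: v["kube_deployment_spec_replicas"] - v["kube_deployment_status_replicas_available"]
        for d, v in values.items()
    }

print(replica_gap(SAMPLE))  # {'shop': 2}
```

A gap of two here means two desired pods are not available, which is exactly the "is my workload at the scale I need" question from above.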
B
So everything is already there, and the dashboards are there. With these metrics being provided as standard, there are also a lot of prebuilt dashboards ready to give you this perspective easily; we have that in Grafana, and we have that in OpenShift. So it's pretty easy to get started with and to enhance with more metrics. All right: infrastructure. Why do we now need to look at infrastructure? The example that I gave should hint at it: the I/O problem on the host is something you need for the troubleshooting. But even beyond troubleshooting, I think it's something we shouldn't be afraid of; developers and DevOps folks shouldn't be afraid of keeping this in mind, and of not only looking at my pod and my JVM, but also understanding how the JVM runs: what are the threads doing, how is it running on the host?
B
So maybe that's one takeaway from this talk: encouraging developers to look into this and to understand what's happening there; hopefully this talk gives a little bit of guidance on what to look at. Now, very importantly: we talked about the services being the starting point, and that's still the case. CPU utilization, as Adrian Cockcroft says, is virtually useless as a metric in itself. There are so many assumptions in there that if you just look at the CPU utilization and try to act on it, you will most likely be wrong. But putting it in context and understanding what the service impact of a possible problem on the host is: that's the point I'm trying to get at.
There's a great method, similar to the RED method, by Brendan Gregg, an awesome performance engineer who gives lots of talks and has written great books. The USE method says: for all physical server components, so we're looking at CPUs, memory, and storage, look at basically three things.
B
First of all, look at the errors: if there's an easy way to get at the errors, look at them and at what they tell you. Then look at the utilization, so how busy the resource was serving the work, and at the saturation, so how much work is queued up for this resource to work on. On the host level we have a gazillion of these resources, on the host itself or connected to it. So something that probably everyone should have a look at is CPU and memory, on the usage side of things and on the load side of things. Again, dashboards for this are all over the place, and I guess it's just important to get familiar with them.
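A minimal sketch of that USE checklist for one resource (the numbers are invented; real inputs would come from something like the node exporter):

```python
def use_report(resource, utilization, saturation, errors):
    """USE method: for each resource, check Utilization, Saturation, Errors."""
    findings = []
    if errors:
        findings.append(f"{resource}: {errors} errors, investigate first")
    if utilization > 0.9:
        findings.append(f"{resource}: {utilization:.0%} utilized")
    if saturation > 0:
        findings.append(f"{resource}: queue depth {saturation}")
    return findings or [f"{resource}: ok"]

# e.g. a disk that is busy and has requests queued, but reports no hard errors
for line in use_report("disk", utilization=0.95, saturation=12, errors=0):
    print(line)
```

Errors come first deliberately: they are the cheapest signal to check and the least ambiguous, while utilization and saturation need the service context discussed above.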
B
This is another example that I found interesting, on the JVM side: the JVM being such an important part of our system, we should look into it more deeply and at the different metrics there, be it threads, be it the heap and the memory pools, and especially the garbage collection. That is something to understand and to have ready when you're looking at problems.
B
So again, we've got the infrastructure metrics: how would I get them? In Kubernetes, the best way is again to work with some exporters: there's the node exporter and the JMX exporter. Also, cAdvisor provides a couple of good metrics on the infrastructure side of things. But it's important to note that for performance reasons you sometimes need to look at the instrumentation itself, because more native information might be needed. That's true for our sensor, and probably for other sensors as well: we do more natively.
B
That's
roughly
50
of
the
instrumentation
we're
getting
and
out
of
in
a
native
way
choose
just
to
be
more
more
performant
all
right,
so
we've
got
the
service
kubernetes
and
infrastructure
and
the
they're
all
needed,
and
I
guess
you
know
to
be
taken
care
of
in
itself
now.
What
do
we
do
with
the
context?
How
do
we
stitch
these
things
together
and
that's
something
that
you
know
we
within
the
standard
we've.
You
know
basically
built
our
our
tool
about
upon,
but
it's
it's
something
that
you
know
you.
B
But it's also something you can do yourself. When I was preparing the talk, I actually realized that there's a pretty good upcoming standard that hints at a lot of this, and that's OpenTelemetry.
B
So
there's
lots
and
lots
of
things
in
the
open
telemetry,
but
something
that
for
for
this
context,
I
would
like
to
highlight:
is
the
resource
semantic
conventions,
so
the
the
resource
semantics?
We
mentioned
the
notification
they
describe
how
a
resource
should
be
considered
of
in
a
consistent
manner,
and
so
there's
a
couple
of
you
know:
kind
of
tagging,
suggestions
and
open
telemetry.
B
There
are
they're,
not
only
suggestions,
but
there's
also
some
mandatory
required
ones
and
some
optional
ones,
but
I
think
that's
a
pretty
pretty
decent,
pretty
good
starting
point
if
we
are
thinking
about
how
to
correlate
these
ft
together.
So
if
you
are
working
with
a
if
you,
if
you
are,
if
you've
got
a
service-
and
you
picked
like
a
service
name
and
open
telemetry
talks
about
the
service
name
space
that
makes
these
things
unique
together
and
then
you
correlate
it
to
a
service
instance
id.
B
So
something
that
serves
the
service,
then
you've
got
an
unique
identifier
of
what
what
this
thing
is
that
actually
serves
this
right
again
and
then
sana.
We
also
have
the
service
type
but
which
you
know,
I
think,
makes
a
certain
use
cases
easier
and
easier
to
to
get
at.
But
you
know
open.
Telemetry
does
not
not
now
so
we've
got
the
service
right
and
if
we're
using
this
tagging
theme,
then
we
can
start
correlating
things
together.
We
can
see
okay.
This
service
belongs
to
this
container
to
this
host
to
this
kubernetes.
B
You
know
part,
for
example,
and
the
other
way
also
around
right.
The
the
information
that
we've
gathered
from
all
these
different
other
instances
are
common
tagging
schemes
that
we
can
use
to.
Excuse
me
that
we
can
use
to
correlate
one
to
the
other
now.
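As a sketch of that correlation (`service.name`, `service.namespace`, `service.instance.id`, and `k8s.pod.name` are real OpenTelemetry resource attribute keys; the data and the assumption that the instance ID equals the pod name are invented for illustration):

```python
# One attribute set per telemetry source, tagged with the shared convention keys.
service_resource = {
    "service.namespace": "shop",
    "service.name": "checkout",
    "service.instance.id": "checkout-7d4b9-xkq2p",
}
k8s_pods = [
    {"k8s.pod.name": "checkout-7d4b9-xkq2p", "k8s.node.name": "node-3"},
    {"k8s.pod.name": "catalog-5f6c8-mm9zl", "k8s.node.name": "node-1"},
]

def pod_for_service(service, pods):
    """Join service telemetry to its pod via the shared instance identifier."""
    return next(p for p in pods if p["k8s.pod.name"] == service["service.instance.id"])

pod = pod_for_service(service_resource, k8s_pods)
print(pod["k8s.node.name"])  # node-3
```

The join key is the whole point: because both signals carry the same identifier, a service-level alert can be walked down to the pod, and from there to the host, without guesswork.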
A different way of looking at this is the trace-based world, and I'm mentioning this explicitly because, as I said earlier, the service metrics that we gather are based on traces. So we infer a lot of this; we and others do, it's not unique to Instana, but those who work that way infer a lot of this information from the traces. And the way to look at it is: if you've got traces, you can use the trace ID itself to correlate things.
B
Here's the first example, from Grafana: they talk about how to use the trace ID in logs and to use common service tags throughout the system (their service name tag is slightly different), so that they can then navigate to the service. Something interesting that I found with Zipkin traces: the tags of the spans had the pod ID, and for the service naming they were doing a reverse lookup on the pod ID and then enriching the span data.
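A minimal sketch of that trace-ID-in-logs idea with Python's standard-library logging, stamping a (made-up) trace ID onto every log line so logs and traces can be joined later:

```python
import logging

class TraceIdFilter(logging.Filter):
    """Attach the current trace id to every record so log lines can be joined to traces."""
    def __init__(self, trace_id):
        super().__init__()
        self.trace_id = trace_id

    def filter(self, record):
        record.trace_id = self.trace_id
        return True

logger = logging.getLogger("checkout")
handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter("trace_id=%(trace_id)s %(message)s"))
logger.addHandler(handler)
# In a real service the id would come from the active span's context.
logger.addFilter(TraceIdFilter("4bf92f3577b34da6"))

logger.warning("payment backend slow")  # emits: trace_id=4bf92f3577b34da6 payment backend slow
```

With the ID on every line, a log search for one slow request and the trace view of that same request become two doors into the same data.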
B
OpenTelemetry talks about making this more and more automatic in an open standard; we already do this, as do others. So this is an example, and we could also do a short demo if you like, but this is how we are doing it and how it is visualized in Instana. It's our example, but basically we separate these three different perspectives, the application perspectives, the Kubernetes perspectives, and the infrastructure perspectives, and from any entity you're looking at, you can link to the others. Conceptually, though, it's not bound to us: you can rebuild this with your own tools, or again, just give it a try within Instana. So, key takeaways.
If you like: we've got service, Kubernetes, and infra. Please consider all of them. Please also consider them independently, and make the best use of each when you're looking at them independently, because there's always someone coming from that particular background, and if you overload them with information from different perspectives, they might be overwhelmed. These different perspectives: share them. Make them shareable within the team to ensure a common understanding of what to measure and of why you measure it. Why is this particular saturation metric the most important one for your workload? And last but not least: context is king. Link these together, make them visible to everyone, so that everyone has the same understanding of what you can do. All right.
A
And that would get them, you know, at least to know where you live and breathe at Instana as well, and where all the docs are.
B
Right, so a couple of words about the operator. We've been really stoked about the operator, and the Instana operator is basically available wherever you like: it's obviously listed in the OperatorHub, you can also get it directly, we've included it in our documentation on how to install the agent, or you can just get the source and everything from GitHub directly. Now, the operator, the agent operator, does a lot of nice things for us, and it helps us distribute what we do with our agent.
B
So, going back to what I talked about, correlating all these different things together: that's something the operator and the agent do. And if we take this a step further, we've not only got infrastructure, Kubernetes, and services; you've got all of that mixed in with all the cloud stuff, with different operating environments, with different runtimes.
B
So
there's
lots
and
lots
of
things
to
do,
and
the
operator
helps
us
on
distributing
that
workload
to
throughout
the
throughout
the
cluster
and
just
picking
or
selecting
different
nodes,
putting
some
intelligence
to
our
operator
and
making
the
operator
very
making
the
agents
very
dynamic
and
and
in
what
they
do.
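The distribution pattern described here, one agent pod on each selected node, is what Kubernetes expresses as a DaemonSet. The following is a generic sketch of that pattern, not Instana's actual manifest; the names and image are placeholders.

```yaml
# Generic sketch (placeholder names) of the per-node agent pattern:
# the DaemonSet controller schedules one pod onto every matching node.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: monitoring-agent
spec:
  selector:
    matchLabels:
      app: monitoring-agent
  template:
    metadata:
      labels:
        app: monitoring-agent
    spec:
      nodeSelector:
        kubernetes.io/os: linux    # only schedule onto matching nodes
      tolerations:
        - operator: Exists         # also run on tainted nodes
      containers:
        - name: agent
          image: example.com/agent:latest   # placeholder image
```

An operator adds intelligence on top of this: it creates and reconciles such resources from a single custom resource, instead of the team maintaining the manifest by hand.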
A
Okay, so a little "context is king" here: OperatorHub.io hosts all of the open source operators, and they run anywhere, on any Kubernetes. Michael, maybe you can explain a little bit what the catalog.redhat.com operators are, what that's all about.
C
Yeah, sure, and I'm so sorry that I'm late; I have no control over this. I've been working from my cabin in the mountains for the last seven months, and it's a DSL phone line running through the woods, so when a moose gets crazy, things can go down. So I apologize for being so late. But I did link the... and hi, Matthias, how are you? It's nice to see you.
C
Nice to see you wearing an Instana name badge these days. But yeah, I did link the Red Hat catalog, because our team works with companies like Instana and others to run their operators through the Red Hat certification process, which really allows customers to know that all the parts and the internals of it are...
C
You know, like the Pillsbury muffin man seal of approval: that the Red Hat components and the Instana components are all supportable and they can be used in a production environment. That's where our customers can go to download something and make sure that they're getting genuine "Intel Inside" parts.
B
Yeah, and again, you can think of the operator in multiple dimensions. So far we don't have dedicated support in our tool for monitoring operators themselves; an operator just shows up as a custom resource definition. But obviously, as operators get used more and more by developers, that would become one of the additional perspectives.
B
Okay, we've got the services, we've got Kubernetes, we've got infrastructure. Taking a deeper, more intelligent look at the Kubernetes layer, the operator gives us a means of even better understanding, of better linking these together and putting some semantics into the operations. Take Elasticsearch, for example: that's exactly a layer I can think of adding. And, you know, we are running our operator.
C
His name was Pete Abrams, a terrific guy, and I was talking to him about what we were doing. Instana was probably one of the first APM-type vendors that ever certified a container for the Red Hat portfolio and built an operator, and that was because we were working with them very closely. Pete invited me to your sales kickoff in Miami.
C
It was probably two or three years ago now, so my team and I flew down there; we bought appetizers and drinks for the entire Instana sales organization. So we've actually had a really good, close working relationship with your whole team, including your marketing people, for a number of years. This doesn't just happen by accident, and we're doing these types of things together to make the overall customer experience as good as it possibly can be in a cloud-native environment.
B
Yeah, and we continue to do so. This is on the agent side, but we also have a lot of backend components. That's a conversation for another time down the road, and I'd need to get my colleague online for it, but we're going to continue investing there.
A
There is one question: someone is asking about the backend operator status that you referred to. Can you give us any hints on when that's coming?
B
Exactly. So for Instana itself, you obviously need to have the agent running, but we also have an on-prem solution, or a self-hosted solution as we like to call it, and Kubernetes has for a very long time been our primary way of shipping this. With our experience with operators on the agent side, we're also looking into what we can do on the backend side to make the on-prem install easier and faster.
C
Okay, and is that being driven by customers saying, you know, we have certain requirements where we need to have the full APM solution inside our infrastructure, from a security perspective or something like that?
B
Right, so the traditional on-prem questions apply here: making sure that it's secure and in-house, but also performance reasons, ensuring that it's near the actual workloads. So there are multiple ways of reasoning about or motivating this. We don't take a stance there; we just try to make it as easy as possible, and operators, together with a homogeneous environment like Kubernetes or OpenShift, give us the means of installing it.
C
Yep. Hey, we've got another question; I'm going to read it here, and maybe you can translate for me, Matthias. Jeffrey says: in New Relic, which we're using right now, we have the Apdex score, which measures satisfaction on response time against a set threshold, to get insight into application health. Does Instana have any corresponding features?
B
That is a really, really important aspect of monitoring your applications. Something that we are leaning more towards is the SLI/SLO way of looking at this. We've just introduced the ability to define service level indicators on your customer journeys, to define these and alert on these; and with our application perspectives you have much more fine-grained control of which traffic, which aspect, you're looking at and alerting on.
B
So right now we don't have the exact equivalent of what Apdex is, but I think if you look at what Instana provides, in the end you may even prefer the way we translate these things.
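For readers comparing the two models: Apdex is an open standard rather than a New Relic-specific feature, and its score relates directly to a threshold-based latency SLI. A minimal sketch, with sample data invented for illustration:

```python
# Apdex buckets response times against a target threshold T:
# "satisfied" (<= T), "tolerating" (<= 4T), "frustrated" (> 4T).

def apdex(latencies_ms, threshold_ms):
    """Apdex score = (satisfied + tolerating / 2) / total."""
    satisfied = sum(1 for t in latencies_ms if t <= threshold_ms)
    tolerating = sum(1 for t in latencies_ms
                     if threshold_ms < t <= 4 * threshold_ms)
    return (satisfied + tolerating / 2) / len(latencies_ms)

def latency_sli(latencies_ms, threshold_ms):
    """SLI-style view: fraction of requests at or under the threshold."""
    return sum(1 for t in latencies_ms if t <= threshold_ms) / len(latencies_ms)

samples = [120, 180, 250, 900, 2100]  # response times in ms, made up
print(apdex(samples, 500))        # 3 satisfied, 1 tolerating, 1 frustrated -> 0.7
print(latency_sli(samples, 500))  # 3 of 5 requests at or under 500 ms -> 0.6
```

The SLI view drops the "tolerating" middle bucket, which is why SLO-based alerting is often easier to reason about: either a request met the objective or it did not.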
C
Okay. Hopefully, Jeffrey, that addresses your question; if it doesn't, I'm pretty sure that we can get you answers to just about anything. Where would we send people if they have follow-up questions? I mean, my email address is waite at redhat dot com, it's just w-a-i-t-e at redhat dot com, and I can connect people with just about anyone at any level of the organization at Instana. I am, from top to bottom, very close with everyone over there. Matthias, do you have... there you go.
B
That's my email address at instana.com. There are a gazillion ways to reach out; ping anyone and we'll get back to you. And Jeffrey, if you would like to talk more about the Apdex standard, I'm happy to go into some detail with you, especially looking at use cases: why are you looking at that specific metric?
C
Hey Matthias, I really wanted to ask you this at the very beginning, but as I said, I've been dealing with the legacy internet issues.
C
But I was going to ask you: you were at Red Hat for several years, and then you moved to Instana. How lucky are you? I mean, I think, being able to be a part of that team at this time, when everyone needs APM to help them get visibility and insight into running their business in the hybrid cloud... are you just absolutely thrilled to be there?
B
Well, first of all, I was thrilled to be at Red Hat too. Red Hat was really a great time, and we built some really awesome tools. I was more on the developer side of things there, in CodeReady, and we really did build an awesome tool for analyzing dependencies. So a shout-out to all my ex-colleagues.
B
That was a tremendous time. And yes, obviously Instana is great, because we're challenged in a new way: going up against other players in the market, but using this new microservices movement to our advantage and building something very unique that is just very well suited to this new environment. Just one example:
B
I think what's really important, and what's also very dear to my heart, is that it's really, really easy to get started. Our agent discovers everything and throws it all on the dashboard. Yes, you need to tweak and configure things, but everything is just there, right? And as we were talking about the different perspectives of what people are looking at:
B
I think for myself, but I also hear it from customers: it's just great that you've got a platform that you look at, you see the majority of things already there, and then you can dive into the details and start tweaking. You're not lost at the beginning, and that's something I value a lot about Instana. And the whole distributed tracing topic is just fun; it's just a fun technology.
C
Cool. Well, I don't see any more questions coming in, and I know Chris Short is going to let us know that we're just about out of time. So thank you so much for coming on. I mean, I reached out to Star, who's my marketing contact over there, and I said: you've got to find me someone really, really good to be a part of this.
C
This is, you know, one of our early-on OpenShift Commons briefings, and we're really glad that you guys could help be a part of this today.
B
Glad to. Thank you very much for the invitation, always a pleasure. I'm happy to come back, and yeah, it's some great technologies mixing together. Happy to be here.