Description
In this talk, Lili will walk through what prometheus-operator does and the resources it configures and manages. To make the story complete, we will have a look at the kube-prometheus project, which is a set of manifests that allows you to easily set up full Kubernetes monitoring in your clusters and get complete insight into the cluster workloads. We will conclude with a guided walkthrough of some examples of how to monitor your applications using prometheus-operator custom resources.
Okay, so, as Lexi said, my name is Lili, and today we'll be talking about monitoring Kubernetes with the Prometheus Operator.
So, as Lexi already mentioned, I am currently a principal software engineer at Red Hat, but I used to work at Kinvolk as well. I'm an engineer on the OpenShift in-cluster monitoring team, and we work on OpenShift, Red Hat's Kubernetes product. I'm also a maintainer of and contributor to prometheus-operator, kube-prometheus and kube-state-metrics, to name a few, just to mention why I'm, sort of, qualified to do this talk.
So let's get started then. We'll briefly talk about Prometheus, because I'm sure most of you have come across Prometheus by now, and there are many great talks and resources out there that explain the inner workings of Prometheus very well. So we'll just do a brief summary here.
I borrowed this graph from the Prometheus website, so let's just do a very quick summary. Essentially, Prometheus takes care of target discovery, it being a monitoring project, with a target usually being your application, for example, or your infrastructure workload. It also pulls the metrics at every scrape interval and stores those metrics as time series in a custom time series database; we call it the TSDB. Another important thing that it does: it also evaluates alerting rules and pushes those alerts to Alertmanager.
Alertmanager then takes care of proxying these alerts to the correct receivers that you configure, but Prometheus does the actual alert evaluation, and that's something a lot of people don't necessarily realize, because the Alertmanager name is so misleading. The alerting rules are actually stored in Prometheus itself, and this will become relevant at a later point; basically, Prometheus evaluates them against the data that is stored in the time series database, so essentially the metrics.
So that's briefly about Prometheus, but let's move on to the topic of today, which is prometheus-operator. prometheus-operator is part of the prometheus-operator org, actually, and it consists of two projects right now: prometheus-operator and kube-prometheus. We have over 5k stars, which is of course the most important metric out there, both projects have a huge adoption rate, and we recently even added an ADOPTERS file.
The operator actually was one of the first Kubernetes operators in general. It was created by CoreOS, which also coined the term "operator" as we know it today. Essentially, what prometheus-operator does is it manages, configures and operates monitoring components within your Kubernetes clusters.
It also provides very powerful multi-tenancy knobs and features and gives you the ability to, essentially, self-service your monitoring. We also have a really cool logo that was donated to us by Bianca, and yeah, we're now a very fancy project. For the operator to be able to actually manage the monitoring components, you, as a user, need to create custom resources, and the prometheus-operator custom resources are the following ones; we'll look at each one of those and what they do in the next few slides.
So let's start with the most important one, which is the Prometheus custom resource. What the Prometheus custom resource does is it configures and manages Prometheus via a StatefulSet. Basically, the operator, based on your configuration, creates a StatefulSet, depending on how many replicas you choose, and it deploys those and manages them for you. Some interesting fields in the Prometheus spec that you should be configuring are the selectors, which basically select which objects it should pick up; you can match on labels there.
Another one is the alerting field, which tells Prometheus which Alertmanager endpoints it should push the alerts to, which is the thing we mentioned earlier. Another one which is very, very useful is resources, because, as we know, Prometheus is really just a database and you should treat it as such: figure out how much data you'll be working with and size it accordingly. And finally, as we mentioned, the replicas.
With no sharding enabled, you should choose roughly two replicas to have a highly available setup, especially if you have more than a one-node cluster. For a full list of all the APIs and all the fields we have, you can have a look at our prometheus-operator API doc; there's a link to it at the very end. These are just some of the things that you can configure.
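A minimal Prometheus custom resource touching the fields just mentioned might look like this sketch (the names, namespaces and label values are illustrative, not from the talk):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: example
  namespace: monitoring
spec:
  replicas: 2                   # two replicas for a highly available setup
  serviceMonitorSelector:       # pick up ServiceMonitors carrying this label
    matchLabels:
      team: frontend
  alerting:                     # where Prometheus pushes the evaluated alerts
    alertmanagers:
      - namespace: monitoring
        name: alertmanager-example
        port: web
  resources:                    # size Prometheus like the database it is
    requests:
      memory: 2Gi
```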
The next one to look at is the Alertmanager resource. Essentially, as with the Prometheus one, Alertmanager is configured, deployed and managed via a StatefulSet. What it also does is configure the Alertmanager instances to talk to each other.
Alertmanagers have a gossip protocol to synchronize the instances of an Alertmanager cluster, and the operator sets that up because you don't want duplicated notifications sent out. So essentially the instances gossip amongst each other to prevent that, as you don't want to be paged three times for the same thing, and prometheus-operator handles that configuration.
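An Alertmanager custom resource can be as small as this sketch; the operator wires up the gossip mesh between the replicas for you (names are illustrative):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Alertmanager
metadata:
  name: example
  namespace: monitoring
spec:
  replicas: 3   # the operator configures the three instances to gossip with
                # each other so notifications are deduplicated
```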
So once you have your Prometheus up and your Alertmanager up via the custom resources, next you're interested in actually monitoring your things. For that, the ServiceMonitor and PodMonitor custom resources are needed.
What they actually do is let you configure the targets to monitor without needing to learn the Prometheus-specific configuration. Essentially, it's a few very simple fields which let you monitor things very easily. We often get asked the question: what's the difference between the ServiceMonitor and the PodMonitor? Really, it's just that the PodMonitor is relatively new and we historically had only the ServiceMonitor, so people defaulted, of course, to the ServiceMonitor, but we now frequently get those questions. What a ServiceMonitor does is it selects, via label matchers, and we'll see that at the very end, all the services that match those labels and, in turn, scrapes each of the pods that back those services, while the PodMonitor directly selects pods. So that is the difference between them; which one to use depends on your setup. A couple of interesting things you should be configuring, especially if you control all the ServiceMonitors or PodMonitors, are the sample and target limits. Basically, what they do is put an upper bound on what a scrape job can ingest. It's super useful, like I said, in a multi-tenant environment, or when you basically don't know what your users are up to, to prevent some kind of high, unbounded cardinality series, and they're a fairly recent addition.
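A hedged sketch of a ServiceMonitor using those limits (the label values and the port name are made up for illustration):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: example-app     # scrape every Service carrying this label
  endpoints:
    - port: web            # named port on the Service to scrape
      interval: 30s
  sampleLimit: 10000       # fail the scrape if a target exposes more samples
  targetLimit: 50          # fail if more targets than this are discovered
```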
So how does it actually work? How do the ServiceMonitor and the PodMonitor actually work after the resource is created by the user? Let's say you specify a ServiceMonitor or PodMonitor for your application. That gets picked up by prometheus-operator, which in turn creates a Secret, a Kubernetes Secret, with the content of the target discovery; as we mentioned, it translates everything to the Prometheus configuration format. Then the config-reloader sidecar, which runs alongside Prometheus, watches that target discovery Secret and reloads Prometheus if there were any changes to it. So there is no real magic to it; it just basically boils down to actual Kubernetes objects, and we use a Secret because your targets can contain sensitive information.
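Under the hood that Secret just holds ordinary Prometheus configuration. For a ServiceMonitor the generated scrape config looks roughly like this sketch (heavily simplified, with illustrative names; the real output contains many more relabeling steps):

```yaml
scrape_configs:
  - job_name: serviceMonitor/monitoring/example-app/0
    kubernetes_sd_configs:      # discover targets through the Kubernetes API
      - role: endpoints
        namespaces:
          names:
            - monitoring
    relabel_configs:            # keep only endpoints backing matching Services
      - source_labels: [__meta_kubernetes_service_label_app]
        regex: example-app
        action: keep
```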
The next resource is the PrometheusRule. With recording rules you don't recompute expensive queries every time, so you basically save compute. And how do those work? When you create a PrometheusRule, whether it has alerting or recording rules, prometheus-operator, depending on which namespaces it watches and if you've created one in such a namespace, picks up that custom resource and then bin-packs all the rules that you've specified into a ConfigMap, or multiple ConfigMaps depending on the size of the rules, and it essentially mounts those ConfigMaps.
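A sketch of a PrometheusRule carrying one recording rule and one alerting rule (the names, labels and expressions are illustrative):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-rules
  namespace: monitoring
  labels:
    role: alert-rules        # must match the ruleSelector of your Prometheus
spec:
  groups:
    - name: example.rules
      rules:
        - record: job:http_requests:rate5m     # recording rule: precomputed and stored
          expr: sum(rate(http_requests_total[5m])) by (job)
        - alert: HighErrorRate                 # alerting rule: evaluated by Prometheus,
          expr: job:http_requests:rate5m > 100 # the firing alert goes to Alertmanager
          for: 10m
          labels:
            severity: warning
```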
So again, this is really important information for whenever you need to debug something that goes wrong, in case something doesn't get picked up, like a PrometheusRule or a ServiceMonitor. One of the newer custom resources that we added recently is the Probe custom resource. Essentially, as the name suggests, it lets you configure how groups of static targets or Ingresses should be monitored.
You do need to deploy something like the blackbox exporter, for example, for it to work. It's one of our newest resources, we have been using it ourselves as well, and it's really powerful.
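A hedged Probe sketch pointing at a blackbox exporter deployment (the prober address, module name and target URLs are made up):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Probe
metadata:
  name: example-websites
  namespace: monitoring
spec:
  prober:
    url: blackbox-exporter.monitoring.svc:19115  # the exporter doing the probing
  module: http_2xx                               # blackbox exporter module to use
  targets:
    staticConfig:                                # a group of static targets to probe
      static:
        - https://example.com
        - https://example.org
```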
The latest one, which is not v1 yet (technically all the ones that I've mentioned so far are stable resources), is the AlertmanagerConfig, which is also a custom resource. It is really great for multi-tenant environments, and we plan on using it in OpenShift as well.
It also allows you to configure inhibition rules. Inhibition rules in Alertmanager are things that mute all specified alerts that match them whenever another group of alerts is firing. So, for example, when a node-down alert is firing, you don't want 10 other alerts firing after that, so inhibition rules are really powerful for that.
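A sketch of an AlertmanagerConfig carrying such an inhibition rule (the receiver, alert and label names are illustrative):

```yaml
apiVersion: monitoring.coreos.com/v1alpha1
kind: AlertmanagerConfig
metadata:
  name: example-inhibition
  namespace: team-a
spec:
  route:
    receiver: team-a-pager
  receivers:
    - name: team-a-pager
  inhibitRules:
    - sourceMatch:           # while a NodeDown alert is firing...
        - name: alertname
          value: NodeDown
      targetMatch:           # ...mute all warning-severity alerts...
        - name: severity
          value: warning
      equal: ['node']        # ...that share the same node label
```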
And finally, the last custom resource that we have right now is the ThanosRuler. Thanos is a project that's part of the CNCF, and we use the Thanos sidecar, which I'll mention later, and the ThanosRuler in our prometheus-operator, but you can deploy Thanos on its own as well to make, like, the full story. Essentially, the ThanosRuler is really powerful when you connect it to multiple Prometheus instances, for example. What it does is it evaluates Prometheus rules, so the recording and the alerting rules, and you can connect it to a Thanos Query API, so you can connect it to any of the Prometheus instances you have. And again, as with all the other components, it's really useful in a multi-tenant setup, where multiple instances of Prometheus are deployed, or where you essentially have one very powerful, huge cluster and you turn it into a very specific multi-tenant environment.
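A minimal ThanosRuler sketch wired to a Thanos Query endpoint (the endpoint address and labels are illustrative):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ThanosRuler
metadata:
  name: example
  namespace: monitoring
spec:
  replicas: 2
  queryEndpoints:                  # the Thanos Query API used to evaluate PromQL
    - dnssrv+_http._tcp.thanos-query.monitoring.svc
  ruleSelector:                    # pick up PrometheusRule objects with this label
    matchLabels:
      role: thanos-rules
```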
Often people end up using just some custom resources out of the box, but one of the things that is really cool is automated sharding. Sharding already exists in Prometheus, but what we do is essentially distribute the load automatically across the number of shards specified, and that's really, really nice. We do get some users using it, but it definitely needs a bit more love.
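Sharding is just another field on the Prometheus spec; a sketch (two shards with two replicas each, so four pods in total, names illustrative):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: example
  namespace: monitoring
spec:
  replicas: 2   # replicas per shard, for high availability
  shards: 2     # the operator splits the scraped targets across the shards
```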
The next one is the enforced namespace label. This is part of the Prometheus spec, and essentially what it does is enforce adding a namespace label to each of the alerting rules and metrics that the user creates. This is really great for enforcing tenancy per namespace, so a user can never alert on something that is outside of their namespace, essentially.
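In the spec that is the enforcedNamespaceLabel field; a sketch:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: tenant-prometheus
  namespace: monitoring
spec:
  enforcedNamespaceLabel: namespace  # every user-created rule and scrape gets a
                                     # namespace label set to the namespace of the
                                     # originating custom resource
```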
Another one to note, as I mentioned earlier, is the Thanos sidecar. You essentially add it to Prometheus, you enable object storage, and you can connect it all into a really powerful configuration; this is something we heavily use in OpenShift as well.
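Enabling the sidecar is again part of the Prometheus spec; a hedged sketch (the Secret holding the object storage configuration is assumed to already exist):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: example
  namespace: monitoring
spec:
  replicas: 2
  thanos:                            # run the Thanos sidecar next to Prometheus
    objectStorageConfig:             # upload TSDB blocks to object storage
      name: thanos-objstore-config   # name of the pre-existing Secret
      key: thanos.yaml               # key inside that Secret
```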
So now that we've learned about prometheus-operator, let's have a look at the other project in the same organization, which is kube-prometheus. kube-prometheus, as the name implies, is essentially a group of manifests that lets you easily monitor your Kubernetes workloads out of the box, so things like etcd, the API server, the kubelet, the monitoring components themselves and many more things. What it also does is provide all of these manifests in the form of jsonnet.
For those of you who don't know, jsonnet is a data templating language that extends JSON, and by doing this it really allows you to customize your experience. For example, in OpenShift we bring in kube-prometheus, but we do very specific OpenShift customizations, and we do have links on how this can be done for your environment, so you can essentially customize your workload monitoring.
So, as we mentioned, it brings in a lot of things; let's see exactly what those things are. As mentioned before, these are the Kubernetes own workloads, but it really all starts with deploying prometheus-operator. So the kube-prometheus stack first deploys the prometheus-operator Deployment with all the custom resource definitions it registers, and out of the box it creates an HA setup of Prometheus and an HA setup of Alertmanager: two replicas of Prometheus and three replicas of Alertmanager.
We also install kube-state-metrics, which helps provide insight into your Kubernetes cluster by exporting metrics about all the Kubernetes resources, and the node exporter, which provides the OS and hardware metrics. And we monitor all the Kubernetes components, so essentially the kubelet, etcd, all of those, and the monitoring components themselves, because you should always be monitoring your monitoring system as well; otherwise you can't know it's reliable.
We also provide a bunch of Grafana dashboards and alerting and recording rules out of the box for all the Kubernetes things as well, so to visualize this a bit better.
So now that we've seen how the Kubernetes own workloads are monitored, let's see how you would monitor your own applications. We monitor the monitoring system, we monitor Kubernetes, but how do you monitor your applications? It's actually fairly simple.
Really, as we mentioned, we have our ServiceMonitor on the very right side here, which has the matching labels highlighted, and what that matches is our Service, which is in the middle. The Service in the middle has the selector that matches all the pods which, in turn, are deployed by the example Deployment. We also expose the metrics port in the Service, and that's it, that's all the magic.
Everything else is taken care of by prometheus-operator, and your application only needs to expose the /metrics endpoint, and then it's picked up by Prometheus itself.
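The chain described on that slide can be sketched as two manifests whose labels line up (all names, namespaces and labels are illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: example-app
  namespace: default
  labels:
    app: example-app       # <- the label the ServiceMonitor matches on
spec:
  selector:
    app: example-app       # <- matches the pods of the example Deployment
  ports:
    - name: web            # <- the named port the ServiceMonitor scrapes
      port: 8080
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
  namespace: default
spec:
  selector:
    matchLabels:
      app: example-app     # select Services carrying this label
  endpoints:
    - port: web            # scrape /metrics on this port of each backing pod
```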
So, often things don't work out of the box, if you, for example, misconfigured some labels, and the best way to go about troubleshooting is to go to the Prometheus UI, to the /targets page. Here you will see all the targets that Prometheus discovered or, in turn, couldn't discover. In the case of the screenshot here, for example, Prometheus just couldn't scrape the Thanos sidecar.
You can also use this handy command to see what is actually in the Secret itself that is created by prometheus-operator; you can do that by grepping for the ServiceMonitor, PodMonitor or other resource name. We also have a linter tool, which is essentially a way to validate your custom resources; you can add it as part of your CI or just run it locally.
So, as a conclusion, I think we all learned a bit about the basics of Kubernetes monitoring and how to monitor your workloads with prometheus-operator. Let's also have a look at some helpful docs and where you can ask for help. Essentially, we have a new website coming soon; I think Matthias is working on it now, so you can bookmark it and we'll let you know when it's ready.
It should contain things like guides and blog posts, any talks we do, and just, in general, be a good place for prometheus-operator, but also Prometheus itself. We also have a section in our GitHub where we have troubleshooting docs, so feel free to contribute as well if you found any good troubleshooting tips. We have a Slack channel on the Kubernetes Slack, prometheus-operator, and also prometheus-operator-dev in case you're contributing something, and you can always open an issue on GitHub on either of the repos.
So I think I'll share my slides afterwards and you can click around. We also recently started creating a wiki of runbooks for alerts; it's located in the kube-prometheus project, and anyone can edit it and add their own runbooks, so it's essentially community-driven.