From YouTube: Webinar: K8s Audit Logging Deep Dive
Description
Many people know that Kubernetes can report API activity to logging back ends and that auditing is a powerful security tool, but what happens in the real world when you have:
- Multiple API servers
- Mutating Admission Controller Webhooks
- Aggregated APIs
- Webhook audit log backends
- Massive API throughput requirements
The short answer is, things get tricky. In this short seminar, we’ll take a brief look at the more complex and deeper issues faced by Kubernetes operators when seeking to implement comprehensive, efficient, and secure Kubernetes auditing.
Presenter:
Randy Abernethy, Managing Partner @RX-M
A: Okay, let's get started. I'd like to welcome everyone to today's CNCF webinar, K8s Auditing in Depth. My name is Jerry Fallon and I will be hosting today's webinar. We would like to welcome our presenter today, Randy Abernethy, managing partner at RX-M. We just have a few housekeeping items before we get started. During the webinar, you are not able to talk as an attendee. There is a Q&A box at the bottom of your screen, so please feel free to drop your questions in there and we'll get to as many as we can.

B: Hey, thanks a lot for that. Good morning, afternoon, or evening, and welcome. This is the K8s Auditing in Depth session; a bunch of interesting stuff to take a look at over the next half hour or so. So why don't we go ahead and just jump right in? All right, so my name is Randy Abernethy. I'm a cloud native geek of the first order, I'm a big fan of microservices and Apache Thrift and things in that area, and I work for RX-M; we're cloud native folks over here. That's just a quick note on me.
B: The session today is going to cover auditing. We're going to start from the start and go through some of the basics, but we're going to quickly move into more advanced concepts, and we're going to talk about some of the challenges and issues with multiple API servers and mutating admission controller webhooks, how to deal with different audit log backends in scenarios where you have massive throughput requirements, and what exactly you can expect from audit logging from a throughput and capacity standpoint.

B: So let's start off: what is audit logging, if we just start at the very start? Well, the definition of audit is "an official inspection of an individual's or organization's accounts, typically by an independent body," and this is kind of an interesting parallel to what Kubernetes auditing is. And the context sentence, I think, is actually even more telling: "audits can't be expected to detect every fraud."

B: This is exactly the spirit of auditing in Kubernetes. Logging happens in the services that you run in Kubernetes, including the control plane services, like your controller managers, the scheduler, the API server, kubelets, and so on. They all log to standard out and standard error, and if it's a systemd service, that's going to be manageable through journalctl and all that stuff.

B: If it's actually running in a pod, as it may be if you're running a kubeadm-style setup for your control plane, then kubectl logs would be able to show you the log output of these different services, and it can be managed with plugins that forward the logs off to backends: Loki, Elasticsearch.
B: What have you. And then you've also got events taking place inside the cluster, control plane events, and those events are going to be visible through kubectl get events, for example; or if you describe an object, you'll be able to see information about it. But auditing is a different beast, right? Auditing is designed to give you the ability to inspect an individual's or organization's accounts, and to be able to detect activity that might be fraudulent. For example, some of the other verb uses here: "companies must have their accounts audited."

B: "He made use of knowledge gleaned from the economics class he audited." So being able to watch and oversee something is sort of the idea of the audit log, and it's different in kind, because what it's designed to do is capture the who, what, and whys of activity going on in the cluster, and it's usually at a far more granular level than these other types of logs. Application-level logging that you get on standard out and standard error is going to be things like "I created this,"

B: "I did that," "it did this other thing," and some of the details may be obscured for security reasons or something like that. But an audit log is designed to capture all of the details. It is designed for no-holds-barred inspection of what's going on in the cluster, and so generally only privileged individuals should be looking at the audit log, because it can expose a lot of stuff.

B: You can look at the exact manifest posted by every user for all of the resources that they're creating, you can see the responses in detail from the cluster, and so on and so forth. So it's really a function like you would have with a security audit; that's really what the audit log is for, it's for facilitating those types of activities.
B: So let's start off with just some of the basics. This is the definition straight out of the kubernetes.io docs: Kubernetes auditing provides a security-relevant, chronological set of records documenting the sequence of activities by individual users, administrators, or other components affecting the system.

B: And, in fact, how frequently certain types of activities are taking place. You can always go and dig around in config files and things like that to sort of figure out how things are set up and what they're doing, but going to the horse's mouth is always the authoritative answer, because you might see a configuration file that says this thing is supposed to happen every five minutes, and then you look at the audit log and it's happening every three seconds.

B: Well, that's putting a lot of load on the control plane; maybe you want to look into why that is. Is it because the config file is mistyped? Is it because there's a default that is at play in some scenario where there isn't a config file?

B: There are all sorts of interesting things you can glean from digging through the audit log. It can be used by security professionals for forensics and things like that, but it can also be used for cluster debugging and performance tuning; it's just a really all-around powerful facility.
B: So if we were to look at the architecture picture of the audit log, it probably looks something like this: all roads lead to the API server. The API server is the state manager for the cluster. At the end of the day, it's the microservice that owns all of the metadata describing what the cluster is doing.

B: Now, the API servers are stateless themselves, but they have the logic that is there to handle authentication, authorization, admission control, and all of these types of things that decide whether something an end user would like to create as a specification is going to be accepted or not. Now, if that specification is accepted, it's going to be dropped into etcd. Etcd is a highly consistent key-value store sitting behind the API server, and this is a simplified model.

B: You generally have multiple API servers and a cluster of etcd nodes, but the communications channels are the same: everybody talks to the API server, and only the API server talks to etcd. So if you want to know the status of something, or if you want to create something, delete something, update something, modify something, you do it through the API server. Essentially the API server is the gatekeeper of all state.

B: So if we enable the API server to log all of this activity at the API level, and since the API server's API is the gateway to all state in the Kubernetes cluster, then we're really creating a place where we can see everything happening. Now, that's not completely true; it is a distributed system.

B: Kubelets keep a cache of the pods that they're supposed to be running, and there are little pockets of information throughout the system, but at the end of the day, if the API server is right and good, most things in your cluster are going to be right and good, and if there's something wrong, the API server is going to be able to see that in most cases. So it's the perfect place to be capturing this kind of detailed logging.
B: Now, when you run an API server, by default there is no audit log, and so we don't have this facility. It's therefore very good to know that the audit log exists and to start thinking ahead. You might say, "hey, today we don't need the audit log," but after something catastrophic happens, it's too late, right? You want to see what caused this big problem and you have no record of the activity. So there are a lot of different ways to think about this.

B: When you think about the cluster picture that you see here, you can quickly rationalize the amount of activity that's going to be coming into the API server. Imagine that you've got a maximal-sized cluster; currently, upstream Kubernetes supports 5,000 nodes. So if you've got a 5,000-node cluster, you've got 5,000 kubelets reporting status to the API server on a regular basis.

B: The reporting and the constant reconciliation, the self-healing behaviors that Kubernetes is performing all the time, can create massive flows of messages into the audit log, depending on how you've got it tuned. So the API server is really the centerpiece: all the requests to view or modify state go through that API server, and so it's the central place where we want to do auditing. So what's in the audit log?
B: Well, it tells you what happened. So if I, for example, create a pod or create a deployment or create a service, that's going to be recorded. It's going to record when it happened, so the API server will timestamp the audit log event, as they're called. It's going to specify who initiated it, so it'll capture my identity, and this is very detailed: if I am an administrator and I'm impersonating another user when I create this pod, it'll capture both of those identities, my identity and the party I'm impersonating.
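To make that concrete, here's a hypothetical fragment of the user portion of such an event; `user` and `impersonatedUser` are the real audit v1 field names, but the usernames and groups shown are invented for illustration:

```shell
# Hypothetical audit event fragment showing both identities captured when an
# administrator impersonates another user (usernames and groups are made up):
cat > /tmp/impersonation-fragment.json <<'EOF'
{
  "user": {
    "username": "admin@example.com",
    "groups": ["system:masters", "system:authenticated"]
  },
  "impersonatedUser": {
    "username": "jane",
    "groups": ["developers", "system:authenticated"]
  }
}
EOF

# Sanity-check that it is valid JSON and display it:
python3 -m json.tool /tmp/impersonation-fragment.json
```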
B: What object did this happen on? The pod, the deployment, whatever; you can get the exact identity of that object. And where was it observed? As information is being processed by the API server, we have different stages, and so on. And then: from where was it initiated?

B: Where did this request come in from, and where is it going? If there's any destination for this thing, that can also be identified in the audit log. So an example of an audit event might look something like this. As you can see here, we're just tailing the audit log file and grabbing one line, and the modern audit log is JSON based. The old audit logging, which is deprecated at this point, the legacy

B: audit logging, was a text-based format, and audit events were a single line of text. The current approach is JSON, which is a lot easier to parse and process and store and search and index, and all that kind of stuff. So if we just clean up the formatting, space it and indent it a little bit with jq, our JSON query tool, we get a nice dump like this, and you can see this looks a lot like a Kubernetes resource.
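As a sketch of what one such line might contain (all values invented, though the field names match the audit.k8s.io/v1 Event type), and how you could pretty-print it; the talk uses jq, but any JSON formatter works:

```shell
# One hypothetical audit event line, roughly the shape the API server emits:
cat > /tmp/audit-sample.json <<'EOF'
{"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"Metadata","auditID":"4a2f0000-0000-0000-0000-000000000000","stage":"ResponseComplete","requestURI":"/api/v1/nodes/node-1","verb":"get","user":{"username":"system:node:node-1","groups":["system:nodes"]},"sourceIPs":["10.0.0.5"],"responseStatus":{"code":200}}
EOF

# Pretty-print it; the presenter pipes through jq, for example:
#   tail -1 /var/log/audit/audit.log | jq .
# python3's json.tool gives a similar indented dump without jq installed:
python3 -m json.tool /tmp/audit-sample.json
```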
B: You know, a Kubernetes manifest or a Kubernetes spec. It follows the same exact principles as everything else in Kubernetes, where it is sort of a declarative approach with key-value pairs, and support for nesting and collections and things like that. Any Kubernetes object that you would try to create would have a kind, and so that's the type of thing it is; this is an Event.

B: However, the kind of a thing doesn't create a unique definition, and this is because parties can create custom resources, for example. So if my company, let's say RX-M, creates an Event type of resource, and then let's say another company creates an Event type of resource, how would you disambiguate? Well, there's an API group that you would organize those kinds under, and so, as you can see here, this is an audit.k8s.io-based Event, and it also has a version. So it takes three pieces to put together a complete

B: kind, actually: you need the group, you need the kind itself, which is subordinate to the group, and then you need a version. And so there are other types of events, and this can be confusing to people first getting into audit logging. The typical Kubernetes event that you deal with is a control plane event; that is not an audit event.

B: So there are different kinds of events; keep an eye on that API group to know which type of event you're dealing with. And then these guys have a number of other bits of information, which makes them a little bit different from a typical Kubernetes resource. Generally, Kubernetes resources would support metadata, and the object in question would have a name. These events have identity, per se, but they're not named; they're just events in a stream, and so it's not like a pod

B: that would have a name. Also, you'll note that these events aren't labeled. They're emitted only; we don't create them. They're an artifact of activity in the cluster, and so they're created by the API server in a stream as things happen. So they're a little bit different from your traditional resource, but the format is kind of similar. Now, you will note down at the bottom here that we have annotations, and these are just exactly like annotations in a typical Kubernetes resource.

B: They give us the ability to expand on the functionality of the audit log event without damaging the overall spec. So, for example, if you create a pod and you want to tell some CNI plugin something special, maybe the CNI plugin has some tricky dual networking functionality and you want to tell it to put you on the B network, you could use an annotation.

B: Kubernetes doesn't know anything about multiple networks, but by plugging an annotation in there you're creating a key-value pair that Kubernetes basically passes around everywhere but just ignores, and so all of the plug-in components and extension points in Kubernetes are often going to use these annotations to augment the functionality of a particular thing. In the case of audit logging, that's exactly true. So if we have, for example, an admission controller that we've added to our cluster through a webhook, well, Kubernetes can create audit events that say, "hey,

B: this thing got denied because the pod security policy denied it." But if the webhook, which is not part of Kubernetes, denies this, we need to maybe have some reasons why; or if it mutates the request, we might want to know what the mutation was. All those types of things can be represented in annotations. So annotations really give us a lot of flexibility here.
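For instance, an event's annotations block might look like the following; the two authorization.k8s.io keys are annotations the API server genuinely adds, while the webhook key is a made-up example of the kind of thing an admission webhook might attach:

```shell
# Hypothetical annotations fragment from an audit event:
cat > /tmp/annotations-fragment.json <<'EOF'
{
  "annotations": {
    "authorization.k8s.io/decision": "allow",
    "authorization.k8s.io/reason": "RBAC: allowed by ClusterRoleBinding",
    "mutation.webhook.example.com/patched": "true"
  }
}
EOF
python3 -m json.tool /tmp/annotations-fragment.json
```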
B: Some other things that you'll note, which we're going to talk about in a second, are that there are stages of processing, and so you can record events at a given stage of processing, or multiple stages of processing if you like. We have the user, as we described, who was involved here; this is a particular node, so that's the hostname of the node that made this API get request, and you can also see the URL, so this was the request URI.

B: This particular node was getting the /api/v1/nodes information on itself, which it is allowed to do, and which it is going to do on a regular basis to get updated information about itself. This is an interesting thing about Kubernetes: you have to remember that when you submit a specification to Kubernetes, the API server basically verifies it from a security standpoint and then dumps it into etcd.

B: There is no guarantee that that means it's going to be okay, or work. Things asynchronously kick off after that: the scheduler assigns pods to nodes, and if there's no node available, your pod might be pending. If the pod does end up on a node, but the image that you've specified in the pod is no good,

B: that's going to cause the kubelet to not be able to pull the image, and it's not going to run. But in all cases, as far as the API server was concerned, the spec was good and it saved it to etcd. So you also have to have a fair amount of understanding of Kubernetes to be able to follow through with some of these events, because the ways that you would find out these other things would be after the fact.

B: The user posted that pod spec, sure, and there were no errors, but that doesn't mean it's okay. The scheduler might attempt to do something and report that it couldn't be scheduled. The kubelet might report a status of ImagePullBackOff and be failing to pull the image and continue retrying and reporting that. So you can find lots of different pieces of the puzzle, and wiring that all together is definitely a skill that you develop through practice.
B: So one would suggest, then, that if you find that audit logging is going to be an important part of your operational environment, working with audit logs and starting to craft some experience and dashboards and things like that through your backend log management systems, whatever they may be (Splunk, or Elasticsearch and Kibana, or Grafana on Loki, or whatever it is), is worth doing.

B: Getting prepared and developing some skills ahead of time can really pay dividends when you're in a scenario where there's a failure or some security event that you need to deal with. So what is the definition of the fields in the audit event, and how are they all organized? Well, that's a good question, and I'm just going to pull up something that you're probably familiar with here: go to the kubernetes.io docs and pull up the API reference.

B: So if you want to know how to specify resources, the API for Kubernetes is essentially these JSON documents; I mean, you get, post, put, and delete these things, but all of the activity that's taking place is in response to these documents. And so, as I mentioned, if I go ahead and search for "event" and we look down the left-hand side here, you can see that there's a metadata API section and there's an Event defined there.

B: These are all the API resources known to this particular API server, and so these are things you can post and put and delete through the API server's API. But you'll see that there's a bunch of these resource types, pods and the early guys that were there with version one, that don't have an API group; that's the core group, so those guys are always in the core group. And then you've got a bunch of these guys that have different groups, depending on what they do.

B: Look for events: you can see that we've got Events and Event, right, events.k8s.io. So this is a completely different group; it's not the same resource as the audit log Event resource. That one's not part of the API; it's just a format for audit information being emitted. So we're kind of stranded here, because you can search around and you're just going to find these legacy, old API versions.
B: So there's limited information here about the format of the audit events and things like that, and so, whenever you're in doubt, go to the sources, because Kubernetes being open source is a huge benefit. The quality of the code in Kubernetes is pretty dang high, because there's all of this governance involved in how changes are made, and reviews, and minimum requirements for documentation in the code.

B: So, as you can see here, all we really need to do is go to Kubernetes on GitHub, move down to the apiserver package, and then look in the audit v1 types.go file, and you're going to find definitions of all of the types of things that the audit subsystem uses.

B: So you'll find information here, for example, on the Event struct, and every single field is described, and you can of course even see the data types that are being used. If you can read Go, which isn't that hard to ingest if you know any programming language, you've really got a leg up. So you can find detailed definitions for events and all the different fields here if you're running into something that you're not familiar with. But let's cover a few of the key things. API servers process requests in stages: they authenticate users, they authorize users, they admit resources as a final stage in the security processing, and there are other things that happen as well. And so, from an audit standpoint, receiving the request is something that we can log.
B: If we want to. This is the first stage: RequestReceived is generated as soon as the audit handler receives the request, and so if you're interested in every single request that's made, that's your stage. Next, ResponseStarted: this is after the response headers are sent, but before the response body is sent, so this would apply to a long-running request like a watch request or something like that. And then ResponseComplete; again, this is the response body complete,

B: so after there are no bytes left to send. And then there's also Panic: in Go programming, a panic means something pretty catastrophic has happened, and while you might be able to recover from that, most of the time it means the API server would crash, so it's pretty serious. So those are the stages that you can see identified in the events, and you can also use these stages for filtering events, too.
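As a quick sketch of that kind of stage-based filtering, here's one way to count events per stage in a log file; the three sample lines are fabricated, but the stage names are the real ones:

```shell
# Fabricated three-line audit log for illustration:
cat > /tmp/audit-sample.log <<'EOF'
{"stage":"RequestReceived","verb":"get"}
{"stage":"ResponseComplete","verb":"get"}
{"stage":"ResponseComplete","verb":"create"}
EOF

# Count events per stage (python3 used so this works without jq installed):
python3 - <<'PY'
import json, collections
counts = collections.Counter()
with open("/tmp/audit-sample.log") as f:
    for line in f:
        counts[json.loads(line)["stage"]] += 1
for stage, n in sorted(counts.items()):
    print(stage, n)
PY
# prints:
#   RequestReceived 1
#   ResponseComplete 2
```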
B: As we'll see. So audit levels control the amount of data emitted for an event. None means you're not going to log anything; this is the default, so if there's no policy specifying what to do, nothing's going to happen. Then you've got Metadata: this is basically going to log all the high-level stuff, like the header type of information that you would have in an HTTP sort of exchange. So it's going to log the user, the timestamp, resource information, the verb used.

B: So you can basically see what's going on, but you won't be able to see the details. You'd see, for example, that somebody's creating a particular pod, but you wouldn't see what the pod spec is. Now, that's going to do two things: by sticking with Metadata, you're going to be able to see broad activity in your cluster, you're going to be able to know what kinds of things are happening and which objects they're happening to, but you won't have the details; you won't know specifically.

B: Then there's the Request level, which adds the request body on top of the metadata. If you're going to capture the request, which is often a big piece of the puzzle, then you might want to think about capturing the response as well, with RequestResponse, so that you can see the response body coming back. Though again, if there's a lot of activity on your cluster that's looking things up constantly, then the responses could be large, and that could not just incrementally but potentially multiplicatively increase the amount of logging. So each of these levels gives you progressively more log output, and that means you're going to have to have more capacity and throughput in your log backend.
B: So here's the audit policy. An audit policy is, again, a lot like a typical Kubernetes resource, and it is saved in a file and provided to the API server in order to allow it to decide what kind of auditing you'd like to do, and the audit policy file is incredibly powerful.

B: It allows you to get very, very specific about the types of things that you want to capture. For example, you've got high-level specifications that you can add, like omitting the RequestReceived stage; that's sort of a global policy. Then you've got individual rules, and these rules specify the level: so in this case RequestResponse, down here Metadata, down here None. And then you've got the resources, and so you've got the group. This is the core group; the empty string is the core group.
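Pulled together, a policy file along those lines might look like this; a minimal sketch, with the particular resource and level choices being illustrative rather than a recommendation:

```shell
cat > /tmp/audit-policy.yaml <<'EOF'
apiVersion: audit.k8s.io/v1
kind: Policy
# Global setting: skip the noisy RequestReceived stage entirely.
omitStages:
  - "RequestReceived"
rules:
  # Full request and response bodies for pod operations:
  - level: RequestResponse
    resources:
      - group: ""              # "" selects the core API group
        resources: ["pods"]
  # Header-level information only for configmaps:
  - level: Metadata
    resources:
      - group: ""
        resources: ["configmaps"]
  # Drop everything not matched above:
  - level: None
EOF
```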
B: By doing some forensics around what you actually need, you can sometimes carve out 10, 20, 30, 40 percent of the I/O just by carefully blocking off certain bits of logging using level None. So it's a really useful tool, very powerful, and it gives you lots of granular control. So again, where do we get the details? This is not a Kubernetes API server resource; it's an audit subsystem configuration file.

B: But the rules are kind of the interesting part. You've got the users component of the rule, for example, and user groups. If you're familiar with RBAC, if you've done any Kubernetes security (and I would say that's almost a prerequisite to working with the audit log; if you're in this space and you're doing this kind of stuff, you may be a security professional, or that's a hat you wear), audit logging involves similar types of constructs.

B: In RBAC you're going to give a particular principal, a user, a group, or a service account, some capability: some verb on some object, some resource type. And those resource types, again, can be scoped by the group, and then they can even be scoped down to a specific named resource of a kind, and that just kind of carries on here. So if you're familiar with RBAC, the audit rules are very similar.
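A couple of illustrative rules showing that principal-based scoping; the rule fields (level, userGroups, verbs, resources) are the real ones, while the particular choices are just examples:

```shell
cat > /tmp/audit-principal-rules.yaml <<'EOF'
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # Don't log the constant read-only traffic from the node group:
  - level: None
    userGroups: ["system:nodes"]
    verbs: ["get", "list", "watch"]
  # Log secrets access at Metadata only, so request/response bodies
  # never put secret data into the audit log:
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets"]
EOF
```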
B: So, as we mentioned, auditing's not cheap: it increases the memory consumption of the API server. Remember from the model that we saw, only the API server is involved in auditing, so you don't really have to worry about the activity from the controller managers or the other guys.

B: It's really the API server, and so running top on your system, getting some baselines of your server without auditing, and then maybe progressively ratcheting up the policy to increase the amount of auditing while watching your resource consumption on the API server side, is not a bad thing to do. That way you can sort of get a sense for where the diminishing returns are. If you basically log everything, it's going to be crazy:

B: every byte in is going to be magnified by two, because you're going to be writing it to the audit log with a bunch of extra metadata. So getting a sense for the throughput capabilities of your system, and the memory capabilities, is going to be important. Memory is a key piece of the puzzle, because the API server's audit subsystem is going to capture these events in memory.
B: So how do we set up the API server? Well, the API server has some 30 audit logging options, and the most important one is perhaps --audit-policy-file; that's the file to actually use to define your audit policy, and a lot of times people start their thinking there.

B: This file, you can put it in a bunch of different places, but at the end of the day, in a kubeadm scenario, you would probably put it in a protected directory and then you would hostPath mount it into the API server container. That would be a typical scenario.

B: So the next thing that we've got... and let me just maybe show you a quick example of that first.

B: There you go, and so you can see this guy. This guy's got a host path for /var/log/audit; that's where the log output is going. But the policy file, as it turns out, is also there; a lot of times the policy file would be in /etc/kubernetes or something like that. And there's the mount path inside the container, so it's the same as on the host, and then in the configuration of the API server, we run the kube-apiserver with our configuration.
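A sketch of the relevant pieces of such a kubeadm-style static pod manifest; the flags are real kube-apiserver audit switches, while the paths are typical choices rather than requirements:

```shell
cat > /tmp/kube-apiserver-audit-fragment.yaml <<'EOF'
spec:
  containers:
  - command:
    - kube-apiserver
    - --audit-policy-file=/etc/kubernetes/audit-policy.yaml
    - --audit-log-path=/var/log/audit/audit.log
    - --audit-log-maxage=30      # days of old audit files to retain
    - --audit-log-maxbackup=10   # rotated files to keep
    - --audit-log-maxsize=100    # megabytes per file before rotation
    volumeMounts:
    - mountPath: /etc/kubernetes/audit-policy.yaml
      name: audit-policy
      readOnly: true
    - mountPath: /var/log/audit
      name: audit-log
  volumes:
  - name: audit-policy
    hostPath:
      path: /etc/kubernetes/audit-policy.yaml
      type: File
  - name: audit-log
    hostPath:
      path: /var/log/audit
      type: DirectoryOrCreate
EOF
```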
B: So those are two key configurations, and you can run the Kubernetes API server with --help and that'll dump out all the switches, and there are a lot. You can also use the documentation for reference if you want to, but this is a complete list of the current (as of Kubernetes 1.19.3) audit log options.

B: Now, you have two possible backends for the API server's audit logging. One of them is a local file (it doesn't actually have to be local, but a file), and the other one would be a webhook; there's a posting protocol for the webhook to receive all the events, and in either scenario there are lots of settings.
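The webhook backend is configured with a kubeconfig-format file handed to the API server; a sketch, where the collector URL and names are placeholders:

```shell
cat > /tmp/audit-webhook.yaml <<'EOF'
apiVersion: v1
kind: Config
clusters:
- name: audit-collector
  cluster:
    server: https://audit-collector.example.com/events   # placeholder URL
contexts:
- name: default
  context:
    cluster: audit-collector
    user: ""
current-context: default
EOF

# The API server is then pointed at it with flags such as:
#   --audit-webhook-config-file=/etc/kubernetes/audit-webhook.yaml
#   --audit-webhook-mode=batch   # buffer events and POST them in batches
```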
B: You know, the basic configuration stuff. So now that we've kind of got an idea of audit logging, we've seen some events, we know how to configure servers, and we understand this policy thing, what are some of the concerns that you run into in practice with this? Well, one of the first things that you run into is having multiple API servers, because nobody in a production environment is going to have one API server; if the API server goes down, your cluster is dead.

B: Two is probably fine for most clusters; that way, if one of them goes down, you still have the other one, and it's pretty unlikely you're going to lose two. And when you add API servers, you get some scaling ability, because the logic being processed by the API servers is now distributed. But really, at the end of the day... maybe I'll just go back here to this previous model.

B: Etcd is already a bottleneck just keeping up with the configuration of the cluster and the events that are happening, because those control-plane-level events are actually being stored in etcd for a period of time. But the audit log data, that's massive, so we have to have a completely different channel, totally independent,

B: for the audit log. And when you think about this, etcd is, in a production system, often running on a different cluster. So in a rolling production setup, you'd probably have a five- or seven-node etcd cluster, and so the API server cardinality is independent of etcd, if you're not running them on the same boxes. A collapsed etcd/API server topology, where the etcd nodes are running on the same machines as the API servers, is an okay way to do things, but if you're doing things that way, then what defines the number of API servers you have is the etcd cluster size, not the API service itself.
B
Two
api
servers
is
fine
for
high
availability
for
most
clusters,
but
here's
a
problem
when
you
have
multiple
api
servers
and
let's
say
you
have
three
you're
going
to
need
a
load
balancer,
and
so
you
might
set
up
you
know
if
you're
using
google
cloud,
you
might
use
a
network
load
balancer
on
amazon
or
something
google
cloud
use,
their
load,
balance
or
azure
use
their
load
balancer
to
basically
front
end.
B
your API servers. All your kubeconfigs are going to refer to the TCP load balancer, so clients hit it and it just forwards the traffic on to one of the API servers, and you don't know the difference. But there's a health check, so that when one of those servers crashes, you only get sent to the ones that are alive.
B
Well, that's great, but the downside is you don't know which one of those servers is going to be dumping out the audit log information you're interested in. So you could do something like run a pod, as you can see down here, and then tail the audit log looking for some sort of pod activity and not see it, because you're looking at the wrong audit log: you came in and hit this server, this server logged the activity, and the other two logs don't have anything about it.
B
So the API servers are shared-nothing, right, they're microservices. They don't know what's going on in the other servers; they're really focused on being as independent as possible. So the downside is your audit log is now sharded, basically, and to unshard it you're going to run something like a Fluentd, a Logstash, a Beat, a Fluent Bit, or a Splunk forwarder,
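As an illustration of that unsharding step (the paths, host, and index here are made-up assumptions; the real values depend on your cluster), a minimal Fluent Bit sketch that tails one API server's audit log file and forwards it to a shared back end might look like this, with one such shipper running beside each API server:

```shell
# Hypothetical Fluent Bit config: tail this server's audit log (the file
# set by --audit-log-path) and ship it to a shared Elasticsearch back end.
# The host, port, path, and index name are illustrative assumptions.
cat > /tmp/fluent-bit-audit.conf <<'EOF'
[INPUT]
    Name    tail
    Path    /var/log/kubernetes/audit.log
    Parser  json
    Tag     audit.*

[OUTPUT]
    Name    es
    Match   audit.*
    Host    elasticsearch.logging.svc
    Port    9200
    Index   k8s-audit
EOF
```

With one of these next to every API server, a query against the shared index sees every server's events, so the "wrong audit log" problem above goes away.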
B
something to move that log data into a back end where you can get a complete picture of what's going on in the cluster. And that's important. Another thing about these distributed API servers is that on the upside you get scaling. So if you've got huge throughput going into your audit log, you just divided it by three by having three API servers. So as long as your network can handle it and you've got
B
the bandwidth on the actual wire, you're using three NICs, three sets of memory, three sets of disks, whatever the case may be. You've really got scale there. So this is one way in which having multiple API servers can in fact have a dramatic impact on your scaling challenges, because when you're not using audit logging,
B
I mean, one API server can handle a pretty honking big cluster; it's the etcd cluster that's always the bottleneck. So two API servers is good, three is even better, but if you had audit log challenges you might want to go to four or five and get your audit logs scaled out using more API servers. And remember, the cardinality can be completely different from the etcd cluster, because usually that's a separate cluster of servers.
B
Another thing people often stub their toe on for a day or two is using ConfigMaps. ConfigMaps are awesome for configuring things, and you might say, hey, I'm running the API server in a pod, so why don't I set up the audit policy as a ConfigMap? You could. But what happens when you start the cluster?
B
You need to make a request to an API server, which is going to hit etcd and give you the ConfigMap, but there's no API service yet. There's a chicken-and-egg problem. So most people skip that and use some other technique to standardize their policies. But this is another pitfall, right? What if API server one has one policy, and server two has a different policy, and server three has a different policy?
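Because a ConfigMap can't be fetched before the API server is up, the policy is normally a plain file on each control plane node's disk, passed to kube-apiserver with --audit-policy-file and kept byte-identical across servers. A minimal sketch of such a policy file (audit.k8s.io/v1 and the level names are the real API, but which resources get which level is an illustrative choice, not a recommendation):

```shell
# Example audit policy file; keep this identical on every control plane node.
cat > /tmp/audit-policy.yaml <<'EOF'
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  - level: Metadata          # never record secret payloads in the log
    resources:
      - group: ""
        resources: ["secrets", "configmaps"]
  - level: Request           # record request bodies for pod changes
    resources:
      - group: ""
        resources: ["pods"]
  - level: Metadata          # everything else: who, what, when only
EOF
```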
B
I
mean
there.
There
could
be
excuses
for
doing
that.
It
is
totally
possible,
but
it's
going
to
give
you
weird
asymmetric
log
output
right
from
the
different
servers,
so
in
most
cases
that
I've
run
across
you
probably
want
them
to
be
the
same,
and
you
might
want
to
have
some.
You
know
sort
of
immutable
infrastructure,
ansible
salt.
You
know
whatever
type
of
things
it's
you
know,
keeping
those
files
in
sync
or
have
them
from
a
shared
disk.
You
know
or
something
so
next
thing
to
talk
about.
We've
got
a
few
more
things
here.
B
I
know
I
think
we
have
to
wrap
up,
but
I'll
try
to
hit
a
couple
more
things
here
and
then
we'll
see.
If
we
can
get
some
time
for
questions
so
mutating
admission
controller
webhooks,
it
can
be
useful
to
know
which
mutating
web
hook
mutated
an
object
in
an
api
request
and
if
you've
got
a
you
know
if
you've
got
a
bunch
of
plugins
into
the
api
server
that
are
potentially
changing
the
nature
of
a
resource
that
somebody
created
by
default.
The
api
server
won't
know
anything
about
it
right.
B
It's
going
to
call
these
guys,
and
you
know
it's.
It's
they
do
what
they
do
and
then
the
the
api
server
just
you
know,
moves
on
to
the
next
unit
in
the
chain,
and
so
what
we
want
to
be
able
to
do
is
see
where
in
the
chain
you
know,
change
a
happened
and
where,
in
the
chain
change
b
happens.
So
a
popular
example
would
be
istio,
for
example,
and
the
istio
proxy
injector.
B
So
I
create
a
pod,
I'm
oblivious,
I
just
you,
know,
wrote
my
app
and
I
put
it
in
a
pod
and
I
go
to
deploy
it.
And
now
the
the
api
server
says.
Oh,
we
have
a.
We
have
a
you
know,
a
mutating
admission
controller
here
that
wants
to
mess
with
this
pot
and
what
is
he
going
to
do
he's
going
to
add
to
the
pod
of
an
init
container?
B
But
that
mutation
is
fairly
complex
and
it
could
interact
in
weird
ways
with
other
mutations
and
it
becomes
hard
to
sort
of
figure
out
what's
going
on
unless
you
have
some
way
to
introspect,
and
this
is
exactly
what
mutating
admission
controller
web
hooks
can
do
by
using
annotations.
So
you
can
see
in
the
example
here,
mutation
web
hook
and
mission
controller
case
io
round
one
index
two.
So
if
you're
familiar
with
admission
controllers,
we
have
you
know
different
phases
so
round
one
is,
is
the
first
pass,
but
then
we
once
everything's
mutated.
B
You
might
also
have
you
know
an
admission
controller,
that's
going
to
allow
or
disallow
only
and
so
that
that
machine
controller
you
know,
would
come
up
in
another
round,
and
so
we
have
these
different
rounds
and
then
we
have
the
indexes.
So
in
this
case
we're
the
the
second
of
the
of
the
mutating
controllers,
and
so
we
have
the
configuration
and
we
specify
some
configuration
data.
We
have
the
web
hook
information
and
then
we
have
the
status
of
whether
this
guy
mutated
it
or
not.
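As a sketch of what that looks like in an audit event (the configuration and webhook names below are invented for illustration), the annotation key encodes the phase and the webhook's position in the chain, and the value records whether this webhook actually changed anything:

```shell
# Invented sample of a mutating-webhook audit annotation. The key names the
# round and the index in the chain; the value reports mutated true/false.
cat > /tmp/audit-mutation.json <<'EOF'
{
  "annotations": {
    "mutation.webhook.admission.k8s.io/round_1_index_2":
      "{\"configuration\":\"example-injector.example.com\",\"webhook\":\"inject.example.com\",\"mutated\":false}"
  }
}
EOF
grep -o 'round_1_index_2' /tmp/audit-mutation.json   # prints: round_1_index_2
```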
B
So
in
this
case,
this
this
controller
did
not
mutate
the
resource,
and
so
that's
a
nice.
You
know
piece
of
documentation
that
you
can
now
get
from
your
system.
You
can,
you
can
see,
you
know
if
you're
mucking,
around
with
admission
controllers,
you
can
really
look
in
and
get
a
deep
dive
into.
What's
going
on,
another
thing
that
we
can
do
is
we
can
specify
the
actual
mutation,
and
so,
if
you
have
a
request
level,
you
know
audit
or
higher.
B
This
is
all
you'll
get
by
the
way,
if
you
are
just
a
metadata
level,
so
just
yes
or
no,
I
mutated
this
object,
but
if
you're
at
request
level
or
higher,
which
is
more
detailed,
then
as
you
can
see,
we
get
the
patch
right,
and
so
you
actually
have
the
you
know
the
the
information
about
how
things
were
changed,
which
is,
you
know,
really
really
can
be
useful
again
in
debugging
scenarios.
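At Request level and above, a companion annotation carries the patch itself. A made-up example of what that can look like (the webhook names and patch content here are invented; the key format follows the slide's round/index convention):

```shell
# Invented sample of the patch annotation available at Request level or
# higher: the JSONPatch the webhook applied, e.g. adding an init container.
cat > /tmp/audit-patch.json <<'EOF'
{
  "annotations": {
    "patch.webhook.admission.k8s.io/round_1_index_2":
      "{\"configuration\":\"example-injector.example.com\",\"webhook\":\"inject.example.com\",\"patchType\":\"JSONPatch\",\"patch\":[{\"op\":\"add\",\"path\":\"/spec/initContainers\",\"value\":[{\"name\":\"init-proxy\"}]}]}"
  }
}
EOF
```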
B
Okay,
so
I
think
a
couple
more
things
and
then
we'll
wrap
up
here
so
auto
log
monitoring
the
the
api
server.
It
has
two
open
metrics
style,
metrics,
endpoints
or
metrics
metrics
in
its
slash,
metrics
endpoint,
and
one
of
them
is
api
server,
audit
event
total.
So
that's
the
cumulative
total
of
audit
events
and
then
there's
api
server.
Audit
error
total.
So
that's
the
total
number
of
events
that
were
dropped
due
to
an
error
in
exporting,
so
I
hadn't
talked
about
this
just
yet.
B
But
if
you're,
if
you're,
you
know
dumping
huge
amounts
of
information
to
a
back
end
like
an
elastic
search
or
a
fluency
aggregator
or
something
and
you're
overwhelming
it
with
ios
one
way
to
fix
that
problem
is
to
batch
a
bunch
of
events
together
and
do
a
single
io
with
a
collection
of
events,
and
so
you
can
reduce
the
number
of
ios
by
a
factor
of
10
by
just
collecting
every
10
events
and
submitting
them
as
a
unit
and
that
often
solves
problems.
B
Another
thing
that
you
can
have
is
you
can
have
you
know,
sort
of
up
and
down.
You
know
performance
in
these
aggregators
because
they
might
be
servicing
lots
of
other.
You
know
log
streams,
and
so
you
might
need
to
buffer
your
output.
You
might
send
them
a
batch
and
then
a
whole
nother
batch
and
another
batch
another
batch.
You
might
have
five
or
six
batches
waiting
and
then,
as
soon
as
they
process
that
first
one
then
they
might
catch
up.
B
So
you
need
to
sort
of
look
at
what
the
lag
is
and
figure
out
a
buffer
size
that
also
works
for
them,
and
so,
if
you
end
up
running
out
of
buffer
you're
going
to
drop
events-
and
this
will
tell
you
if
you're
doing
that,
so
those
are
both
really
important,
because
the
first
one
event
total
is
going
to
give
you
an
ability
to
sort
of
estimate
and
discover
spikes.
So
if
you
have
a
prometheus
or
something
system,
monitoring
the
metrics
from
your
api
server,
you
can
plot
that
and
look
for
anomalies
or
trends.
B
So,
there's
just
an
example:
dump
running
a
cube,
kiddo
proxy
on
a
machine
because
you
know
to
avoid
all
the
tls
stuff
that
you
need
to
get
the
metrics
and
then
just
curling
the
proxy
to
get
through
to
the
api
server
on
the
metrics
endpoint,
and
you
can
see
that
we've
got
the
audit
event
in
this
case
is
what
we're
grabbing
for
the
total
and
you
could
pull
up
the
error
in
the
same
way.
So
that's
some
of
the
metrics
other
things
handling
massive
throughput.
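Roughly what that example dump does (the first two commands assume kubectl access to a live cluster, so they're shown as comments): proxy locally to sidestep client TLS, then scrape and filter /metrics. The same filter works against a saved sample of the output:

```shell
# Against a live cluster you would run something like:
#   kubectl proxy --port=8001 &
#   curl -s http://127.0.0.1:8001/metrics | grep apiserver_audit
# Here the identical grep is applied to a saved sample of that output:
cat > /tmp/metrics-sample.txt <<'EOF'
apiserver_audit_event_total 14991
apiserver_audit_error_total{plugin="log"} 0
EOF
grep apiserver_audit /tmp/metrics-sample.txt
```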
B
So
we
talked
about
batching,
blocking
and
and
strict
blocking
batching
is
where
you're
gonna
buffer
events
asynchronously
blocking
you're
gonna,
actually
block
the
api
server
responses
until
the
event
is
processed.
So
that's
you
know,
that's
pretty
draconian
and
will
impact
users
and
then
blocking
strict
is
the
same
as
blocking.
But
when
there's
a
failure
during
the
audit
logging,
the
whole
request
is
rejected
to
the
user.
So
that's
even
more
strict,
so
batches.
B
You
know
typically
what
what
people
would
set
to
for
their
the
buffering
strategy
and
then
there's
a
you
know:
buffer
sizes
wait
times
throttling
all
sorts
of
things
that
you
can
set
up
here
to
help
control
the
throughput.
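For reference, those knobs are kube-apiserver flags on the webhook back end; a sketch of a batched setup follows (the values are illustrative, not recommendations, and this is a config fragment rather than something to run as-is):

```shell
# kube-apiserver audit webhook flags (sketch; tune values to your back end):
#   --audit-webhook-mode               batch | blocking | blocking-strict
#   --audit-webhook-batch-max-size     events per POST, i.e. one I/O per N
#   --audit-webhook-batch-max-wait     flush a partial batch after this long
#   --audit-webhook-batch-buffer-size  events held while the back end lags;
#                                      overflow is dropped and counted in
#                                      apiserver_audit_error_total
kube-apiserver \
  --audit-webhook-config-file=/etc/kubernetes/audit-webhook.yaml \
  --audit-webhook-mode=batch \
  --audit-webhook-batch-max-size=10 \
  --audit-webhook-batch-max-wait=5s \
  --audit-webhook-batch-buffer-size=10000
```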
B
Other
considerations
remember
that
each
api
server
is
independent,
shared
nothing
and
so
scaling
them
can
can
give
you
some
scale.
So
if
your
back
end
web
hook
is
the
bottleneck,
you're
gonna
have
to
think
about
that
as
well,
but
you
can
scale
the
api
servers
to
scale
the
stuff
out.
Okay,
I
think
we're
getting
pretty
close
to
the
end
of
time
here,
so
I'll
wrap
up
thanks
a
bunch
really
appreciate
your
time
and
maybe
we'll
see
about
some
questions.
A
B
types.go is the source that defines the structure, so I'm not positive. I'll see if I can find out, though, and maybe I can post an answer with a follow-up in the
C
When we upload things.
A
B
Sure, okay, let me
B
find this picture. So, the etcd cluster.
C
Just
gonna
see
if
I
could
draw
something,
but
I
don't
think
I
oh
yeah.
I
can't
here
we
go.
B
I
used
so
many
different
darn
presentation
tools.
These
days
I
have
to
keep
track.
So
if
you
have
an
ntd
clutter,
let's
say:
let's
do
it
simple?
Let's
just
keep
an
example
of
three
that
ftd
cluster
with
three
nodes
is
going
to
use
the
raft
consensus
protocol
to
elect
a
leader,
and
so
let's
say
this
guy's
the
leader.
B
So
if
you've
got,
let's
say
three,
let's
say:
you've
got
five
these
guys
just
to
make
it
a
little
bit
more
interesting,
so
say:
you've
got
five
sed
nodes
and
you've
got
three
api
servers,
so
the
api
servers
are
all
going
to
write
to
the
leader
and
in
general
they're
going
to
read
from
the
leader
too,
and
you
might
say:
oh
my
god,
that's
terrible
because
the
more
api
servers
you
add,
the
more
load
you're
creating
on
that
leader
and
while
that's
true
at
the
end
of
the
day,
the
etsyd
cluster
is
highly
consistent.
B
When
you
write
to
the
leader,
it
has
to
write
that
data
to
all
of
the
other
nodes
in
the
cluster
and
furthermore,
because
it's
a
highly
consistent
key
value
store.
It
has
to
know
that
a
quorum
of
the
nodes
have
committed
the
data
and
so
etd
becomes
the
bottleneck
in
most
cases,
when
you're,
when
you're
you
know
experiencing
control,
plane,
challenges,
and
so
the
you
know,
the
api
servers
are,
are
they're
anonymous,
faceless,
identity-less,
microservices,
you
put
a
load
balancer
in
front
of
them.
B
You
hit
the
load
balancer,
you
don't
care
which
one
you
get
because
no
matter
which
one
you
talk
to.
It's
always
going
to
give
you
the
same
picture
of
the
world,
because
you
have
this
highly
consistent
key
value
store
that
stores
all
the
state.
No,
the
api
servers
have
some.
You
know
caching
and
things
like
that,
but
at
the
end
of
the
day,
everything
comes
from
fcd.
B
So
if
you
ask
number
one
two
or
three
of
those
masters,
what
pods
are
out
there
they're
all
going
to
give
you
the
same
answer,
and
so
the
the
state
in
etd
is
is,
is
the
the
real
you
know,
bottleneck
the
management
of
that
and,
unfortunately
adding
more
nodes
to
cd
slows
it
down
the
fastest
scd
cluster
is
a
single
node
because
he
doesn't
have
to
copy
the
data
to
anybody,
and
so
the
reason
you
need
to
have
multiple
nodes
and
that
production
systems
are
like,
usually
five
or
seven.
B
Is
that
if
you,
for
example,
want
you
know,
diversity
and
and
failure,
tolerance
resilience
which
you
want.
You
know.
In
most
cases
you
can
have
three
availability
zones,
for
example
in
the
cloud,
and
you
can
run
your
std
cluster
across
all
three
and
if
you
lose
any
az,
you
still
have
a
quorum
right.
Quorum
is
n
over
two
greater
than
so.
If,
in
this
case,
five
over
two
that's
two
and
a
half,
so
three
would
be
the
next
higher
integer.
So
any
three
of
these
guys
and
we're
good.
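That arithmetic (the quorum is the smallest majority, floor(n/2) + 1) is easy to sanity-check; shell integer division gives it directly:

```shell
# Quorum = floor(n/2) + 1; failures tolerated = n - quorum.
# Matches the talk's example: 5 / 2 -> 2, plus 1 -> quorum of 3.
for n in 1 3 5 7; do
  echo "members=$n quorum=$(( n / 2 + 1 )) tolerates=$(( n - (n / 2 + 1) ))"
done
```

So a five-member cluster tolerates two failures, and seven tolerates three, at the cost of replicating every write to more followers.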
B
So
we
could
lose
a
whole,
a
z
or
if
a
node
crashes
you
can-
and
let's
say
you
take
a
node
down
for
maintenance.
You
can
still
have
another
node
crash
and
be
okay.
Seven
is
a
little
bit
safer
than
that,
but
it's
a
little
bit
slower.
So
that's
sort
of
the
relationship
right,
stateless
microservices,
the
api
server,
usually
behind
a
load.
B
Balancer
like
this,
would
be
like
a
kubernetes
service
sort
of
right,
but
in
the
case
of
a
load
of
api
servers,
you'd
usually
use
not
always
but
usually
use
something
else,
because
you
want
people
to
be
able
to
access
the
cluster
who
are
not
in
it,
and
so
you
know,
services
a
load,
balancer
service,
you
know
you
know,
could
make
sense.
But
again
you
you
got
to
worry
about
chicken
and
egg
problems
when
you're
creating
resources
in
the
cluster
to
access
the
api
server.
B
The
api
server
is
the
thing
that
gives
you
access
to
those
resources,
so
usually
some
sort
of
external
load
balancer
in
front
of
the
api
servers
and
then
the
api
servers
communicating
with
the
leader
of
the
std
cluster
and
the
entity.
Cluster
leadership
is
dynamic,
so
all
the
api
servers
are
typically
going
to
know
about
all
the
fcd
servers.
C
A
Thank
you
very
much
randy,
and
I
want
to
thank
you
again
for
a
wonderful
presentation.
Unfortunately,
we
are
out
of
time.
I
would
like
to
thank
everybody
for
joining
us
today
and,
as
I
said
before,
today's
webinar
and
slides
from
today's
presentation
will
be
available
on
the
cncf
webinar
page
at
cncf,
dot,
io
webinars.
A
Thank
you.
Everyone
for
attending.
Thank
you
again,
randy
for
a
wonderful
presentation.
Everybody
take
care,
stay
safe
and
we
will
see
you
at
the
next
cncf
webinar.