From YouTube: Stormpath: Infinite Session Clustering with Cassandra
Description
Speaker: Les Hazlewood, CTO at Stormpath and the Apache Shiro PMC Chair
In this session Les Hazlewood, the Apache Shiro PMC Chair, will cover Shiro's enterprise session management capabilities, how it can be used across any application (not just web or JEE applications) and how to use Cassandra as Shiro's session store, enabling a distributed session cluster supporting hundreds of thousands or even millions of concurrent sessions. As a working example, Les will show how to set up a session cluster in under 10 minutes using Cassandra. If you need to scale user session load, you won't want to miss this!
Okay, so this presentation is about using Cassandra as a data store for the storage and maintenance of sessions, and when I say sessions, I really mean anything session-like. It can be a user session, it can be a device session, it can be anything for which you want to maintain state that is temporal in nature. So it's not necessarily limited to user sessions, but that's primarily what we're going to talk about and demonstrate in this presentation, just because it seems to be one of the most widely adopted use cases.
So my name is Les Hazlewood and I'm from Stormpath. We are a cloud-hosted identity management service. We focus on authentication, user management, security workflows, automating all of the stuff that tends to go wrong when you're implementing applications exposed to the world at large. How many people here were here for the presentation just before this one, on OAuth2 as a service? Okay, cool. So if you didn't want to build and maintain that yourselves, you could use Stormpath.
For example. Now, we do a lot of other things in addition to that, but we handle all of these things and offer developer tools and SDKs and libraries, and of course, as you might expect, we use Cassandra on the back end. During the course of this presentation we'll be talking about Apache Shiro. That's the implementation I'm going to use to demonstrate how to do session clustering in this particular use case. How many people here have not heard of Apache Shiro?
Okay, a few people. So Apache Shiro is the Apache Foundation's open-source security framework for the JVM platform. It handles user management at the application level, session management, authentication, authorization and best practices, cryptography and encryption and security digests. But in the course of this presentation, we're only going to be talking about session management. The other things, the web support and auxiliary features like testing support and the crypto features, are out of scope for this presentation. Of course, this is what I do most of the day.
So if you have any questions on this stuff, please feel free to ask me after the presentation is over. Okay, before we jump into Cassandra-related code, let's cover some quick Shiro concepts. The idea with Shiro is that, as a security framework, everything is user-centric, or in this case, subject-centric. "Subject" is just a security term that means the currently executing user, or the currently executing individual or thing, that's interacting with a service or an application.
So this can be a human being; most of the time it is. But it can also be a device, a daemon, a third-party service; it's anything that's currently interacting with the service. And Shiro, in almost all of its security operations, takes a subject-centric approach to its API. We find that people tend to think about things in user use cases: you know, if I'm accessing this REST API endpoint, can I do X, Y or Z? Or am I allowed to click this button?
So everything takes a user-centric approach, and there are all sorts of convenience methods on a Subject that represents the current user. For example, you can log in the user, you can do permission checks, authorization checks, role-based access control and whatnot. We're going to be referencing the Subject here to show you what this looks like in code when we start doing session management. So what do I mean by session management? It's really managing the life cycle of a subject-specific temporal data context, and that's a big mouthful.
That just means state that is associated with some identity over a period of time. Most people think of HTTP web sessions when they hear about session management, and that is the most widely used use case. But of course, as I mentioned before, this can actually pertain to any kind of state managed over time. So devices, time-series data attributed to a particular field device — anything that fits that mold can apply here. So what can Shiro do in regards to session management?
One of the cool things about Shiro is that it supports this notion of heterogeneous client access, and that means you can access the same session from different devices or different web browsers. This is a feature that doesn't exist in Java or JEE anywhere that I'm aware of. For example, you can't access the same stateful session bean from a web browser as you can from a server-side component, because there's inherently connection state and other things associated with those access patterns. But Shiro allows you to do this via session ID access.
Everything in Shiro is POJO-based and J2SE-based, so it's very IoC-friendly for things like Guice and Spring and JBoss and other dependency injection frameworks. It's got an event listener mechanism, so you can listen for relevant session events: expiration, attributes being added or removed, creation, all sorts of good stuff. It also supports this thing called host address retention.
So, unlike most servlet requests or other generic session infrastructure, you can retain the IP address from where the session was initiated, and sometimes that can be useful for access control policies, especially on intranets. There's, of course, inactivity and expiration support, as you might expect, but Shiro also supports this notion of a touch method.
So what that means is that if you already have code that programs to the session API in the servlet spec, you don't have to change any of that code; everything still works as expected. Shiro implements the servlet specification, so you don't have to change your source code. And one of the biggest reasons people use Shiro — and this presentation kind of reflects that — is that you can get container-independent clustering.
So if you drop Shiro in an app and point it to a clustered session store like Cassandra, that will work the same in any web container, whether it's JBoss or GlassFish or Tomcat or Jetty; you don't have to change how you cluster sessions. Today, almost all of those mechanisms are container specific and require you to know how that container's configuration operates. With Shiro, you can test in Jetty and deploy Tomcat in production, or any other combination, and you don't have to change your source code.
That's a really big benefit for teams, to minimize fluctuation during development. Okay, so how do you acquire a session? How do you create them? How do you access them? As I said, Shiro has a subject-centric API, and there are really two ways. You can call the subject's getSession method; that will guarantee a session exists.
So if one does not exist at the time this method is called, a new one is created and returned; if one already exists, or is already associated with that subject, it just returns the existing one. And then, of course, just like the servlet API, you can pass in a boolean to indicate whether or not a session should be created. So in certain cases, for example REST APIs, you want to ensure that your REST API can remain stateless, but the caller may already have a session via the user interface, for example an admin console.
Paralleling the servlet request API, you can get and set attributes, you can set the timeout for an individual session, and there is this notion of touch, which the servlet API does not have. But these are pretty common things, things that you would expect.
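Put together, the Subject and Session calls described above look roughly like this in Java. This is a sketch based on the Shiro API as presented in the talk; `SecurityUtils.getSubject()` is Shiro's standard entry point, and it assumes a configured SecurityManager is already in place:

```java
import org.apache.shiro.SecurityUtils;
import org.apache.shiro.session.Session;
import org.apache.shiro.subject.Subject;

public class SessionApiExample {
    public static void demo() {
        Subject currentUser = SecurityUtils.getSubject();

        // Guarantees a session exists: creates one if necessary.
        Session session = currentUser.getSession();

        // Like the servlet API, pass false to avoid creating one
        // (e.g. to keep a REST endpoint stateless). May return null.
        Session existing = currentUser.getSession(false);

        // Attribute access and timeout parallel HttpSession.
        session.setAttribute("accountId", "12345");
        Object id = session.getAttribute("accountId");
        session.setTimeout(30 * 60 * 1000L); // 30 minutes, in millis

        // touch() resets the inactivity clock without other side effects;
        // the servlet API has no equivalent.
        session.touch();
    }
}
```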
So let's talk about how this actually works inside Shiro, the internal architecture, because this is going to be important when we talk about how to plug in Cassandra.
So when you call subject.getSession and it returns a Session — this is an interface, by the way, not a concrete class — what's actually returned is a very lightweight proxy to Shiro's internal SessionManager. The SessionManager, as you might expect, manages all sessions for a particular application, and so all operations on the Session interface itself are actually delegated to the SessionManager.
The SessionManager does the real heavy lifting, and it in turn uses a SessionFactory to create brand-new session instances at the time they're requested. It also has this notion of a SessionDAO. Is everybody here familiar with the data access object design pattern? It's just a thin tier layer to access your underlying data store. So there's a SessionDAO, which in turn references a session ID generator, so you can customize what your session IDs are.
Also, you can utilize a session cache, so you don't have to hit the data store every time and can keep things in memory. That's especially useful if your cache is a clustered, distributed cache like Coherence or Hazelcast or any of these other clustered cache frameworks. And then finally, the cache can proxy to a data store; of course, if a cache is not enabled, the data store is accessed directly. This data store, as you might expect, is completely pluggable. It can be almost anything; it can be a disk.
Session managers also have a session validation scheduler, which I'll talk about in a little bit, and they support the notion of session listeners: the ability to listen to various events during a session's lifetime, so you can react to them and perform business logic. So all of this represents how you access the session, how the session works, and the underlying architecture of how Shiro thinks about sessions. But the most relevant parts for this conversation are the purple parts.
A
You
know
sorting
by
most
recently
accessed
or
oldest
session,
or
what
have
you
the
datastore
is
Cassandra.
We're
not
worried
about
that,
and
one
of
the
interesting
things
is
this
notion
of
a
validation
scheduler.
So
it's
every
session
framework,
that's
out
there,
whether
a
Shiro
or
tom
cat
or
anything
else,
has
this
thing
called
a
validation
scheduler.
The
main
name
might
be
different
across
the
frameworks,
but
ultimately
the
scheduler
is
required
for
running
periodically
and
deleting
orphan
data
out
of
the
data
store.
So you want to make sure that old sessions — those that have expired or have been implicitly terminated, but not explicitly terminated — are cleaned up, so they don't fill up your data store. If you don't have this scheduler, or some kind of validation mechanism, the store will fill up over time, and that's never a good thing: your disks will run out of space and whatnot. So this is an important concept; everything, Shiro included, has it. And we'll talk specifically, in regards to Cassandra, about how this is important, or whether it's important at all.
So how do I enable this inside an application? Because we're talking about web apps in this particular case, you can do everything we're talking about here with simple web.xml configuration — or, granted, with the advent of the Servlet 3.0 spec, you can do these things programmatically as well. What we want to do is make sure that we protect all URLs, so any request that goes into the system, we want to intercept it and make sure that we can leverage Shiro's session implementation instead of the servlet container's.
Because, again, as I said, Shiro implements the servlet spec, so we need to make sure that is used instead of the default servlet behavior. Shiro can also, in addition to sessions, protect all URLs; of course, there's authentication and access control and authorization that you can do at a URL-specific level. Shiro supports this notion of defining filter chains in a very concise manner. It's probably the most succinct and easiest-to-use mechanism I've ever seen for a web app, much easier than defining a ton of filters in web.xml, and we'll talk a little bit about what that looks like in a minute.
There's also JSP tag support, so you can control whether or not elements in JSPs or JSF pages are rendered based on session state or user state, and all sorts of other things. And, as I mentioned before, we implement the servlet spec, so you don't have to change your session-based code. So this is how you enable Shiro in a web app: either XML or programmatic config.
You enable a listener that loads up Shiro and its environment, and then you specify a filter so it can intercept all requests that come into your web app. In addition, you want to make sure that the filter mapping for that filter can intercept every type of request that is performed by the servlet container dispatcher. So there are request types, forward types, include types, error handling — Shiro wants to filter all of them so it can inject the appropriate behavior.
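A minimal web.xml setup along those lines looks like this; the listener and filter class names match Shiro's documented web support, and the four dispatcher entries cover the request types just mentioned:

```xml
<listener>
    <listener-class>org.apache.shiro.web.env.EnvironmentLoaderListener</listener-class>
</listener>

<filter>
    <filter-name>ShiroFilter</filter-name>
    <filter-class>org.apache.shiro.web.servlet.ShiroFilter</filter-class>
</filter>

<filter-mapping>
    <filter-name>ShiroFilter</filter-name>
    <url-pattern>/*</url-pattern>
    <dispatcher>REQUEST</dispatcher>
    <dispatcher>FORWARD</dispatcher>
    <dispatcher>INCLUDE</dispatcher>
    <dispatcher>ERROR</dispatcher>
</filter-mapping>
```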
So this just gets it set up, and then finally, Shiro has its own config format. We use INI as our lowest-common-denominator format; if you're using Spring or annotation-based configuration, those are definitely a great choice and probably preferable, but this will always work regardless of what kind of programming environment you might be deploying in. The breakdown of this is that Shiro has a main section, which is basically an object-graph config; you can define some static users and some static roles; and the URLs section is for defining filter chains.
So for any given request that comes in, you can define a set of filters that will perform security operations: either guarantee a session exists, or make sure users are authenticated and, if they're not, redirect them to a login page. All sorts of that stuff can be configured here in a more succinct way than using web.xml. So how do we leverage this config to implement session clustering, specifically using Cassandra?
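A small shiro.ini illustrating those sections might look like the following; the user, role, and URL entries are the placeholder examples from Shiro's own quickstart, not from this talk's sample app:

```ini
[main]
# object-graph config: instantiate objects and set JavaBeans-style properties

[users]
# username = password, role1, role2, ...
lonestarr = vespa, goodguy

[roles]
# role = comma-separated permissions
goodguy = winnebago:drive:eagle5

[urls]
# URL path pattern = filter chain to apply
/login.jsp  = authc
/account/** = authc
/**         = anon
```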
You want to be able to handle this at extreme scale, so we're going to leverage Cassandra for that. How do you do that? There are two approaches in Shiro. You can either write a SessionDAO — as I mentioned before, that's the thing that manages access to a data store — or you can leverage Shiro's out-of-the-box EnterpriseCacheSessionDAO implementation and just write a cache manager. So again, a cache manager can talk to a clustered, distributed cache; that's perfectly fine.
The only things we have to worry about with this particular approach are the DAO implementation, the session ID generator, and then the actual data store that we have to talk to. The best approach when you do this is probably to extend Shiro's AbstractSessionDAO, which implements a lot of logic for you, like session ID generation and delegation to an ID factory and whatnot. So you basically just have to override a couple of methods that perform the CRUD operations against the underlying data store.
The way to enable native session management — session management controlled by Shiro instead of the servlet container — is to just tell Shiro: hey, I want the session manager that's in use to be a DefaultWebSessionManager. This overrides the configured or implicit default manager that's used in web apps at startup, which piggybacks on, or rather wraps, the servlet container's session manager. So now we're saying: I don't want to use the servlet container's sessions, I want to use Shiro's, so I can leverage these clustering features.
Then we're going to do some more config that allows us to interact with the actual underlying Cassandra data store. These are just some utility classes; this is from a sample application I wrote. It's all available, Apache licensed, on GitHub, and I'll show you the URLs at the end of the presentation if you want to download it yourself and hack on it and play with it. We're creating a Cluster object — this really creates a Cluster object using the DataStax Cassandra driver — and this Cluster object is then used to perform all of our queries and CRUD operations. So the DAO here is just being configured: here's the cluster; here's the keyspace I'm going to interact with; this is the particular table, or column family, that I want to interact with. These are all the defaults, pretty simple stuff. Once this DAO is defined, you just have to configure it on the session manager.
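That wiring looks roughly like this in shiro.ini. `DefaultWebSessionManager` is Shiro's real class; the CassandraSessionDAO class and its property names here are illustrative stand-ins for the sample project being described:

```ini
[main]
# use Shiro's native session manager instead of the servlet container's
sessionManager = org.apache.shiro.web.session.mgt.DefaultWebSessionManager
securityManager.sessionManager = $sessionManager

# hypothetical names mirroring the sample app's Cassandra-backed DAO
sessionDAO = com.stormpath.shiro.cassandra.CassandraSessionDAO
sessionDAO.keyspaceName = shiro
sessionDAO.tableName = sessions
sessionManager.sessionDAO = $sessionDAO
```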
So, as you can kind of see from some of these things, the main section of Shiro is really just a simple object-graph definition: I'm defining objects, I'm defining properties on objects. This is nothing particularly interesting, but it is very convenient as a text config. These all use JavaBeans properties, as you might expect; again, if you have Spring or Guice or annotation-based config, that's probably better, but this works for everybody. Okay, so now we've got our DAO plugged into Shiro. What does the table look like? This is a CQL definition of what our sessions column family looks like in Cassandra. Of course, there's an ID; every session has an identifier that can be used to access it on subsequent requests. But we also keep track of other information that might be useful, especially for sorting reasons. So there's a start timestamp, a stop timestamp, and the last time the session was accessed by an end user — that's really important for session validation reasons.
The host is really just an optional thing: the host from where the session was created, you know, its IP address. And the serialized value is really the serialized attribute map from within the web app — what you'd get back from session.getAttribute. These are all app-specific things. And it might be prudent to mention that, as is the case in this implementation, as well as pretty much every session implementation I've ever seen, you want to keep your serialized attributes — the session state itself — extremely minimal.
The more session state you have, especially if it's large, the more you have to spread that state over the cluster, and the more I/O you incur serializing it and reading it into memory. Session state can kill large-scale apps. So the only things that should really be stored in sessions, if you want to scale, are simple pointers: an identifier, or a couple of identifiers, to something else that you can look up from a cached data store. The idea is to keep this stuff minimal, and your performance with Cassandra will be much, much better.
No big, huge object graphs in the session; don't store the entire UI state in the session. Okay. So we talked before about a session validation scheduler, and if you noticed in the config previously, we didn't specify any session validation scheduler. One is defined by default, but we don't need one in this case. There's a lot of overhead in querying a data store, finding all of the sessions that have expired and, if they have, either purging them from the data store or marking them as deleted.
Again, as I mentioned, every session mechanism out there has to do this to prevent orphans. But we don't need one when we use Cassandra, which is really great, because we can use Cassandra's TTL feature. If you specify a TTL equal to your session timeout for that particular session, you can trust that Cassandra will automatically delete that session from your data store, and these orphans won't accumulate over time. So again, this pertains to sessions, it pertains to OpenID tokens.
It pertains to anything that is temporally stateful, and so we're going to leverage the same behavior here. The way you tell Shiro not to enable its default scheduler is to just turn it off. So we're telling the session manager: hey, I know session validation is really important, but in this case I'm explicitly turning it off, because I know that my data store will do it for me, and that way the application doesn't bear the overhead of having to do it itself.
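In shiro.ini that's a single flag; `sessionValidationSchedulerEnabled` is Shiro's documented property for this:

```ini
[main]
sessionManager = org.apache.shiro.web.session.mgt.DefaultWebSessionManager
# Cassandra's TTL handles expired-session cleanup, so disable Shiro's scheduler
sessionManager.sessionValidationSchedulerEnabled = false
securityManager.sessionManager = $sessionManager
```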
So, given the table, what happens when a new session is created? Here is a CQL query we can use to populate a row in that table: we're going to update the session table using a particular TTL. The dollar-sign timeout is just my own little identifier token for this presentation, to show you where the real value would go.
That is not valid CQL syntax, but it shows you where the real value would be substituted at runtime — ideally it would be something like a question mark, but that's where you want the TTL. Then you set all the values, and then you specify the ID of the row where those values should be saved. This is commonly referred to as an upsert: Cassandra doesn't care whether this is the first time the record is being written or whether it's an update. Upserts work really well with Cassandra.
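The upsert described above looks roughly like this, with `$timeout` again standing in for the bound TTL value as on the talk's slide, and the column names assumed to match the table sketched earlier:

```sql
-- $timeout is a placeholder for the session's timeout, bound at runtime
UPDATE sessions USING TTL $timeout
SET start_ts = ?, last_access_ts = ?, timeout = ?, host = ?, attributes = ?
WHERE id = ?;
```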
So the question is: if you have these TTLs, what about tombstones? That's an amazing question. What about tombstones?
The reality is that, yes, if there is a lot of reading and writing and deleting, you're going to create tombstones, and tombstones invariably take up more space and need to be cleaned out as well. So when using Cassandra as a time-based, or temporal, data store for session state, you want to be careful with your GC grace and the compaction strategy that you use.
As many of you probably know, GC grace by default in Cassandra is 10 days, which means tombstones can live up to 10 days in the data store. If sessions are expiring with, say, a session timeout of 30 minutes or an hour, that's an awful lot of time to keep around data that's probably never going to be used. So we recommend a GC grace of, in this case, 86400 seconds, which is one day, 24 hours, but you can go even much lower than that depending on your load.
The more concurrent sessions you have, and the more scale you have, the more tombstones you might be incurring, so you might want to reduce this even further. We found that a day works pretty well in practice for most people, but your mileage may vary. Tuning is important, so get some data and make adjustments as necessary. Compaction here is really important, though. How many people here do not know the difference between size-tiered compaction and leveled compaction in Cassandra?
Okay, so we've got some folks. Size-tiered compaction basically compacts SSTables — for more efficient disk storage and utilization by Cassandra — based on the size of the SSTable files themselves: as tables reach a certain size, more SSTables get created and merged. That's a fine strategy, and efficient for space utilization, when you have a write-mostly workload, like time-series data that's not manipulated afterward.
That's actually a really good use case for the size-tiered compaction strategy. Leveled is really important if you have a read-heavy workload or an update-heavy workload. What it does is compact SSTables based on kind of an LRU strategy, or frequency of use: the things that are most recently accessed go into level 0, then level 1, then level 2, and so on. So there are different levels that act as persistence tiers for the SSTables.
That's really important for sessions, because the things that are most active, or most frequently used, you want to have in level 0 if possible. The number of SSTables you'll have to read on average is about 1.1111: you have a ninety percent chance of finding your data in level 0, and if it's not there and you hit level 1, you have a ninety percent chance of hitting it there.
So leveled is much more efficient for a read-heavy or update-heavy kind of flow — make sure you're using the leveled compaction strategy. That being said, this does incur more I/O: compaction happens more frequently, there's more I/O on disk, and you have to be aware of that when you're customizing or testing your sessions at scale. In practice this isn't really that big of a deal, especially if you're on SSDs, but just run some tests and make sure this works for your particular scenario.
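Both tuning knobs just discussed are standard CQL table properties and can be applied like this; `86400` is the one-day GC grace recommended above:

```sql
ALTER TABLE sessions
WITH gc_grace_seconds = 86400
AND compaction = { 'class' : 'LeveledCompactionStrategy' };
```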
What about row caching? This is a question that comes up sometimes: hey, can I turn on the row cache and have it work sort of like a memcache, so I don't really need to be hitting disk all that often? Until today's announcement of 2.1, you probably didn't really need it; most of the time, row caching is used in very, very specific use cases, and the DataStax core engineers are probably the best people to ask about when it's valid to use.
The SSTable is already likely to be in the operating system page cache, and the cache is off-heap in Cassandra, so your data is already likely to be in memory; you don't really need the row cache — at least, for versions less than 2.1. You do want to use the key cache: it's really important to make sure your keys are cached so they can be accessed efficiently when accessing a partition row.
This is really important, but it's enabled by default on Cassandra 1.2 and above, so most of the time you don't have to worry about it. That being said, I'm going to run some tests now that 2.1 is out, using the row cache for these specific use cases, because the row cache in 2.1 is really interesting: you can keep the most frequently used entries in a particular memory cache, which makes a lot of sense for sessions.
Those are the things that are constantly being updated and accessed, and so for this particular case, maybe you might want to keep a thousand or 5,000 sessions in the row cache, and you can do that with the 2.1-specific row cache. The code is here; feel free to check out the project, download it and update it. I don't really have a whole lot of time left.
What's my cut-off time — 11:15? Anybody know? 11:30? Perfect, so I can show you a demo. Awesome — all the perils of real-time demos. Okay, so I'm going to show you guys basically how this project is set up, what the code looks like, and how you can run it yourselves. So, why did I choose Cassandra?
So the question is: why did I choose Cassandra, given the alternatives? There are different alternatives, and for me one of the important things was that we wanted session state to be persistent and geographically distributed in case of a data center disruption. That means our sessions are fully fault tolerant across geographic zones, and in my experience that calls for a persistent, eventually consistent data store.
Additionally, the maintenance overhead for the devops team is so low with Cassandra. There are other alternatives, other systems, that require more operational overhead, and that just wasn't worth it for us. And we found Cassandra scaled to hundreds of millions of sessions with no noticeable impact on performance. It was unbelievable, so it was kind of a no-brainer for us. Also, we were using Cassandra for other things, so the more we could consolidate to a single data store, the easier it was for operations.
Yes — how many replicas? We run a minimum of four replicas across our data cluster; we have seven nodes that are online at all times. We write with quorum consistency, but we read with a consistency level of one, which is actually interesting. If you guys get a chance, go see the Netflix talk — I don't know if Christos covers this in his presentation — but the probability of something going wrong at a consistency level of one is so incredibly low, like at the millisecond level.
The odds are high that it's not really worth incurring the network overhead of quorum consistency, especially if it's data that's not super critical — it depends on what's in your session, in this case, but most of the time it's not that big of a deal. You can also implement retry logic on the client side: if anything fails at consistency level one, then you retry at quorum, and that usually fixes everything. But we found performance to be much, much better, as did Netflix, by using a consistency level of one. Yes?
So the question is: if I'm writing with a consistency level of quorum, how do I handle failures? Almost always, you're expected to implement that logic in your application, and it's really easy. For example — I don't know if you guys use Guava, the Java toolkit from the Google team — you can use that, or you can write it yourself, but it's really easy to wrap a call in a function that implements exponential backoff. So the first thing is: try. If it fails, try it again.
You know, 50 milliseconds later; if that fails, try again 200 milliseconds later, and you keep doing that up to a certain max. There are utility functions — I know Guava has them — and I'm sure it's easy enough to write yourself if you don't want to depend on the library. But that's definitely the best way to do it, and your code can block until it succeeds. When you have that many nodes online, the odds are very, very high that it's just going to work. It doesn't always work that way.
So this is basically the DAO. It's a POJO, nothing very special; again, it subclasses that abstract one. It's got some lifecycle methods associated with it. This isn't ideally designed — it was just done quickly for the demo, and I'd probably extract some of this stuff out into proper OO components — but the idea here is that we're setting a time-UUID session ID generator, and if I go into it you'll see that it just uses the DataStax library's time-based UUIDs.
There's nothing special about this; you don't have to depend on a third-party library, but — let me go back here — that's already built into the DataStax driver, so you can leverage time-based UUID generation. Cluster is the DataStax driver object; then there's what keyspace I'm using and what table we're going to persist this stuff into, and the init method kind of lazily creates the keyspace if it's not there.
Let me see if I can find that. You know, in this case I'm just using a simple strategy with a replication factor of one, because this is a local test. Ideally, you'd use a replication factor of three, or whatever N divided by 2 plus 1 is for your cluster. And we're going to go ahead and create our table lazily as well; this is what I was showing you guys before.
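The lazy keyspace creation just mentioned could look like this in CQL; the keyspace name is illustrative:

```sql
-- local test: SimpleStrategy with replication factor 1; in production,
-- use 3 (roughly N/2 + 1 for the cluster, as noted above)
CREATE KEYSPACE IF NOT EXISTS shiro
WITH replication = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
```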
All right, we're going to create a table with a timeuuid for the identifier, some different values, here's my GC grace and the leveled compaction strategy, and this is going to create the table on startup if it does not exist. And here's some of our creation logic: we're going to generate a session ID and assign it to that particular session.
We're binding our argument, which is the session ID, to that statement; we're going to serialize some of the data — these are the session attributes — and store it all as a byte buffer; and then we execute the statement against the data store. And here, actually — we saw this already. The idea is that all of this is encapsulated in a single class. It's really easy to use, and I'll show you quickly how to use it.
We didn't test it extensively to really find out if there is a difference between the two. For us, upsert worked fine in all cases, especially for Shiro, because when a session is not there, we want to add it to the store, and if it is there, we just want to update it. Because that's implemented by Cassandra by default, there was no extra logic we had to add in our app. So we didn't even try to test the performance difference of insert versus update, because it always just worked.
Yes — honestly, I don't know; I would actually ask a DataStax engineer. I'm not sure insert and update are any different at all, with the exception of the WHERE clause; it's basically using the same identifier to hit the row key in the database. So I don't know for sure, but I would venture to say that there's no performance difference at all — again, confirm that with a DataStax engineer. So this is just the very simple sample application for Shiro.
I'm going to log in; these are some sample accounts. This is our normal Shiro demo web app quick start. So, as you just saw, I just logged in; I can visit an account-specific page; I can return to the page; I can see some roles and things that this account has and doesn't have; I can log out; and I can go back in and log in as a different account.
And so now you can see my roles have changed, because my user account is different. All of this is using Cassandra under the hood to store sessions, and you could fire up any number of web nodes and point them all at the same Cassandra store. In practice, this is crazy fast — really, really fast. Let's see if I can run a quick demo.
Stormpath actually runs one hundred percent on Amazon, so ours are not physical machines, but there you go: I just created 10,000 sessions. And I want to indicate what's going on here. For each session, I'm creating a new one, then I'm reading it out of the data store, I'm updating the data and making that update to the data store, then I'm deleting it, because I don't need it anymore, and then I'm doing yet another read to assert that it's gone. That's five I/O operations on the DataStax driver, each one of them independently hitting Cassandra, and I'm doing that for 10,000 different sessions. So that basically equates to 50,000 operations, on a 2011 laptop with a crap-ton of stuff running, and it did it in about four seconds. That is a ridiculous quantity of operations for a non-optimized application on a non-optimized platform. You can take this and scale it linearly — and we've tested this with millions of sessions — with almost no noticeable impact on the application.