Description
Vector search engines have recently been gaining popularity. In this session, we will explain what they can achieve, peek under the hood to see what is inside, and show why Rust is a good fit for the job.
---
Arnaud Gourlay is a software engineer at Qdrant, where he contributes to the Qdrant vector search engine. A long-time OSS user and maintainer, he enjoys programming in Rust and sharing his knowledge.
So first, who am I? Now you know: I'm Arno, a software engineer and a full-time contributor on the Qdrant project. My two favorite things are open source software and Rust, so I'm very grateful to be able to work on the Qdrant project. I myself maintain a bunch of open source projects; you can find those on my GitHub profile. I also have a blog where I write about my projects and about Rust, and I particularly enjoy writing about the performance aspects of developing in Rust. Cool.
So, the agenda. Of course we're going to talk about Qdrant, but this won't be a sales pitch. I will give you a general introduction to vector search engines, so that you can transfer what you learn today to other products. We'll also go over a couple of use cases for the technology, because good technology without good use cases is not really interesting.
Okay. So what is Qdrant? Now you know: it's an open source vector search engine fully written in Rust, and we do embrace the Rust ecosystem, that is, we have a lot of dependencies. We do suffer from time to time from long build times, but that comes with the territory. Concretely, you interact with Qdrant via an HTTP JSON API, or gRPC if you prefer, and we have three official clients for that: Python, Rust and Go. It is also possible to deploy Qdrant in a distributed setup, which is nice for scaling.
To understand what Qdrant does, it's good to have a look at the evolution of search over the past years. Traditionally, search has been about having an exact match on a text keyword; if not exact, then maybe relying on a list of synonyms, or on a fuzzy match with something like an edit distance to get a better match. However, if we look at the internet, we have a growing amount of unstructured data. We have collections of text files, images, videos, audio files, and we have trouble making sense of all this data.
Also, if you use the internet, I'm sure you've noticed that all websites now do recommendations. You go on e-commerce websites, they offer recommendations for products. You want to watch a video? Here you go, you should watch this video, you should listen to this podcast, and so on. This changing trend can be described under the umbrella of similarity search. This is what we're going to have a look at.
The easiest example of similarity search is the Google Lens project, which performs reverse image search. Concretely, you take a photo, you upload it to the Google Lens service, and it will first find pictures that visually match your photo, but it will also give you additional metadata. In that case, I take a picture and Google tells me: this picture is this flower. The first time I used this service I was really blown away, and I was wondering how it works. I don't know if you've thought about it yourself: how would you implement something like that? Would you maybe do a pixel-by-pixel comparison, which sounds a bit difficult, or maybe rely on some complicated edge-detection computer vision library? I'm not sure this would be as general as what vector search proposes.
On the left-hand side we have classical lexical search with BM25, which relies on keywords, and on the right-hand side we have the new neural search approach. We can see that for the classic approach, the first result is about capital punishment as it has existed in the United States since some date, I guess, and the second and third results are not correct either. It seems that the lexical search has removed the stopwords in the query and has really focused on a couple of keywords. The problem here is that keywords, I mean words, have different meanings depending on the context, and this really trips up the lexical search approach. Whereas on the right-hand side we have the neural search, and here we see the top hit is what we want: Washington DC. The neural search has been able to really understand the query in its entirety, according to the context, so neural search is pretty good at answering long queries.
Similarity search is done by using so-called similarity models, and those models are trained via machine learning. There are a lot of models online that you can take off the shelf, on websites like Hugging Face, but you have to be careful: a specific model will not necessarily fit your domain. When you pick a model, you have to make sure that it has been trained on data that is really close to what you're doing. The alternative is otherwise to build the model yourself, and if the model is large, that can get pretty expensive. For the remainder of this presentation, I will assume that we have a working model; otherwise, those are more machine learning concerns.
What is a model, anyway? Concretely, your model takes a complex object, let's say a picture or an MP3 file, and it reduces it into a vector. In ML jargon, those vectors are called embeddings. And now the secret sauce of similarity models, maybe the most important information to remember from this talk: similar inputs produce vectors that are close to each other in space. Here we have an example.
We have a query and a document that we both encode using a neural encoder. We then have a query embedding and a document embedding. Those are two vectors, and we're able to compare those vectors by relying on the cosine similarity metric. Cosine similarity relies on the angle formed between the two vectors to compare them. That's the magic. Here is the first, let's say, checkpoint, a little executive summary about vector search engines.
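As a minimal sketch of that comparison (my own illustration, not Qdrant code), cosine similarity between two embeddings can be computed like this:

```rust
/// Cosine similarity: dot(a, b) / (|a| * |b|).
/// 1.0 means the vectors point in the same direction, 0.0 means orthogonal.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len());
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|y| y * y).sum::<f32>().sqrt();
    dot / (norm_a * norm_b)
}

fn main() {
    // Same direction: similarity 1.
    println!("{}", cosine_similarity(&[1.0, 0.0], &[2.0, 0.0]));
    // Orthogonal: similarity 0.
    println!("{}", cosine_similarity(&[1.0, 0.0], &[0.0, 1.0]));
}
```

Only the angle between the vectors matters here, not their magnitudes, which is why scaling an embedding does not change its cosine score.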
Now we know that vector search engines enable similarity search use cases like semantic search for text, similar images, and recommendation systems. They mostly have two roles. They perform storage, that is, persisting vector embeddings durably. And they serve the search use case, that is, finding the vectors that are the most similar to an input vector. Obviously, the whole game here is to do those things very efficiently.
Let's make this very concrete with an example that is taken from a blog post. The goal is not to look at every line, because we don't need all of it; we're just going to understand what's going on. We're going to use the Python driver here, because this is what most data scientists are using. I picked this first snippet because I wanted you to really understand that it's really that easy to load a model in Python using the sentence-transformers package: just one line, model = SentenceTransformer(name), and now you have a model to work with. The next step is to create a Qdrant collection. The collection is the abstraction that holds your vectors; that's what you're going to work with. On top of doing the networking configuration, we create the collection by giving it a name, and then the most important part is the vector configuration: we have to inform Qdrant of the dimension of our vectors.
Here the size is 384, and that's absolutely not a random number. This number comes from the model: the model produces vectors that have this size, so it must be an exact match. Then we also tell Qdrant which distance metric we're going to use. Previously we've seen cosine similarity; here we use the dot product, so it's possible to pick whichever distance we want.
Once we've created our collection, we can ingest data into it, and the interesting bit is to perform a search. Here we have a search term and we want to search our collection with that term. We cannot just send that search term, "Earth observation", to Qdrant: Qdrant doesn't understand strings, it understands embeddings. This means you have to encode the search term as well. Then you can send it via the search function on the driver.
Something specific about Qdrant: we are not dealing only with vectors. So that you can have more interesting use cases, you can attach additional data to your vectors. On the right-hand side we have an example where we insert three points into a collection. A point is something that has an ID, a vector and a payload, and the payload is just a JSON object: additional data you can attach, which can then be used for filtering.
So if you think about it, not only can you ask the question "please find a restaurant that is similar to this restaurant", you can also say "and please make sure that it is within a radius of around 500 meters from my place". This is pretty neat. We support different types of payload; here I mentioned geo coordinates, but we have of course text, numbers and so on. All right, let's dig a bit deeper now: how would you implement vector search yourself?
I always ask myself this when I use a tool. One way you could do it is simply to use a SQL database. You create a table; for instance, if we have vectors with four dimensions, a table with just four columns w, x, y, z, and you just insert all your vectors in there, and you're done. Now we handle a search: what do we do? We have an input vector coming in. We need to compute the similarity between our vector and each vector in our storage. So we go through the table, and on our way we make sure to remember which of those vectors has the highest similarity. Then we have it, and we can return it to the user. I guess you understood: we just performed a full table scan.
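That full-table-scan approach can be sketched in a few lines (my own illustration, not Qdrant's storage layer): score every stored vector against the query and keep the best one.

```rust
/// Dot-product similarity between two vectors.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Brute-force nearest-neighbour search: a "full table scan" over the
/// storage, remembering the vector with the highest similarity so far.
fn nearest(query: &[f32], storage: &[Vec<f32>]) -> Option<(usize, f32)> {
    storage
        .iter()
        .enumerate()
        .map(|(id, v)| (id, dot(query, v)))
        .max_by(|(_, s1), (_, s2)| s1.total_cmp(s2))
}

fn main() {
    let storage = vec![vec![1.0, 0.0], vec![0.0, 1.0], vec![0.7, 0.7]];
    let query = [0.6, 0.8];
    // Prints the id of the most similar vector and its score.
    println!("{:?}", nearest(&query, &storage));
}
```

This is O(n) per query in the number of stored vectors, which is exactly why it stops being viable as the collection grows.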
This kind of approach can work fine if you have a small data set, I would say between 10,000 and 100,000 vectors, depending on the dimension of the vectors, but quickly you're going to run into problems. Generally, if you are in the database business and you are running into full table scans, you think: I know, I need an index. We need an index here as well, but we need a special index.
So, a little reminder here: an index is really a structure that is optimized for reads and that we build ahead of time to support our queries. Unfortunately, you cannot just use traditional indexes here, like slapping on a posting list or a B-tree. Fortunately, there are already specialized indexes for vectors; that's not something new. We have the entire geospatial world there: you have indexes like KD-trees, which enable you to do geo-coordinate queries.
Once you have those indexes, you're able to perform what is called a k-nearest-neighbours search, that is, a search that finds the nearest neighbours of a point. However, the problem with those space-partitioning techniques is that they are too expensive in high dimensions.
This is known as the curse of dimensionality in the literature. Basically, all the tools we have regarding algorithms and data structures start to break apart as the number of dimensions increases, but we need to handle very large dimensions. So how are we going to go about it? As usual, one way to fix your problem is to relax the constraints, and this is what we're going to do with something called ANN search. ANN stands for approximate nearest neighbours, and here the constraint we relax is that we're not going to ask for the exact, precise result; we're going to say: I'm okay if this is an approximate result. So the trade-off you're making here is really precision versus speed. Here I have a screenshot that shows various algorithms, because this is really an active area of academic research.
A lot of people are working on this, and you can see the precision on the bottom: if you can read it, you see one half, so you have 50 percent, and it goes to 90 percent, 99, 99.9. So you see, we're talking about approximate results, but we are really well over 99 percent. It's highly dependent on your data set, but this is the kind of precision we're talking about. There are many good algorithms there. Globally, there are three categories: you have the tree-based ones, the hashes, and the graphs. For Qdrant, we decided to go with a graph-based algorithm because it's more flexible, and the one we picked has pretty good performance.
HNSW stands for Hierarchical Navigable Small World. Let me try to explain it properly.
The trick here is that we separate the links according to their length scale into different layers. On top we have the long links, in the middle the middle-length links, and at the bottom we have the short links. This way, we're able to use our graph as a skip list. If you remember skip lists, you jump to a denser representation as you go down. If you want a metaphor: let's say you want to travel to the other side of the planet. First you're going to take a plane, that's your layer 2. Then maybe you're going to take a train, that's your layer 1, and then maybe you're going to take a taxi, that's your layer 0. That's how to think about it. Then you go through this graph performing a greedy search: you take your query and you find the point that is the most similar to your query on layer 2. Once you have the most similar point, you go down one layer and repeat this until you reach layer 0, where hopefully you find the best result. This process is approximate because we encode only at most M friends, that is, the M nearest neighbours per vector, and also because at each layer we don't perform an exhaustive search: we only look at ef_search nodes per layer.
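To make the greedy part concrete, here is a toy single-layer version (my own sketch with made-up data, not Qdrant's implementation): start from an entry point and keep moving to whichever neighbour is more similar to the query, stopping at a local optimum.

```rust
/// Dot-product similarity.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Greedy search on one HNSW-style layer: `graph[i]` lists the
/// neighbours of node i; hill-climb towards the node most similar
/// to the query until no neighbour improves on the current node.
fn greedy_search(graph: &[Vec<usize>], vectors: &[Vec<f32>], entry: usize, query: &[f32]) -> usize {
    let mut current = entry;
    let mut best = dot(query, &vectors[current]);
    loop {
        // Among the neighbours, find the one that improves similarity most.
        let improved = graph[current]
            .iter()
            .map(|&n| (n, dot(query, &vectors[n])))
            .filter(|&(_, s)| s > best)
            .max_by(|(_, a), (_, b)| a.total_cmp(b));
        match improved {
            Some((n, s)) => {
                current = n;
                best = s;
            }
            None => return current, // local optimum: greedy search stops here
        }
    }
}

fn main() {
    // A tiny chain graph: 0 - 1 - 2, with vectors of increasing similarity.
    let graph = vec![vec![1], vec![0, 2], vec![1]];
    let vectors = vec![vec![0.1, 0.0], vec![0.5, 0.0], vec![0.9, 0.0]];
    println!("{}", greedy_search(&graph, &vectors, 0, &[1.0, 0.0])); // walks 0 -> 1 -> 2
}
```

In real HNSW this greedy descent runs on every layer, using the result of one layer as the entry point of the next, and the "local optimum" stopping rule is one of the sources of approximation.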
There is a reference implementation of HNSW, but we could not use it, for two reasons. First, the reference implementation is in C++, and second, we need to support payload filters. Remember, we do not only look for the closest points; we also want to make sure that they match specific criteria. This enables many more interesting use cases. If you think about it, there are two approaches to do filtering on HNSW. You could do what is called post-filtering: first you go through your index, you find all the similar points, and then you filter that result set. This has a problem, because it forces you to over-fetch from your index and then hope that you'll have enough points left after applying the filters. The other approach is pre-filtering: first you go through some kind of index to find all the points that match a specific filter, and once you have this result set, you check which ones are the closest to your input query.
In that case, though, you do not even use your HNSW index, which is a bit sub-optimal. Let's say those two approaches were not good enough for us, so we have a third approach, where we actually enrich our graph with additional links based on the payload of the indexed points. This approach enables us to perform the search in a single pass, call it single-stage search, and that's really one of the key points that makes Qdrant fast.
If you are interested in the details of this approach, I can recommend having a look at this blog article. All right, let's go back to the distance metrics. By now you should understand that computing the distance is a very, very common operation at runtime: we do it a lot during indexing, when building those layers, and also at search time, when going through the graph.
We have several similarity metrics available; which one to use kind of depends on your encoder, on your model. Internally, we've decided to use a trait to abstract away the concrete metric.
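The shape of that abstraction looks roughly like this (an illustrative sketch with hypothetical names, not Qdrant's actual trait):

```rust
/// Hide the concrete distance metric behind a trait, so the index and
/// search code don't care which metric a collection was created with.
trait Metric {
    fn similarity(a: &[f32], b: &[f32]) -> f32;
}

struct DotProduct;
impl Metric for DotProduct {
    fn similarity(a: &[f32], b: &[f32]) -> f32 {
        a.iter().zip(b).map(|(x, y)| x * y).sum()
    }
}

struct Cosine;
impl Metric for Cosine {
    fn similarity(a: &[f32], b: &[f32]) -> f32 {
        let dot = DotProduct::similarity(a, b);
        let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
        let nb = b.iter().map(|y| y * y).sum::<f32>().sqrt();
        dot / (na * nb)
    }
}

/// Generic code can then score candidates with any metric.
fn score<M: Metric>(query: &[f32], candidate: &[f32]) -> f32 {
    M::similarity(query, candidate)
}

fn main() {
    let (a, b) = ([1.0_f32, 2.0], [3.0_f32, 4.0]);
    println!("{}", score::<DotProduct>(&a, &b)); // 11
}
```

Because the metric is a type parameter, the choice can be monomorphized away, so the abstraction costs nothing in the hot loop.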
So, the dot product, a quick reminder from high school maybe: when you have two vectors and you want to compute a dot product, you can do it by summing the pairwise multiplications. That's the simple definition, and here we are able to do it using the nice iterators from Rust. First we take the first vector, we zip it with the second vector, and as we go through it we perform the multiplications, and at the end we reduce it with a sum.
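That iterator chain, written out (a sketch of the idea rather than Qdrant's exact source):

```rust
/// Dot product as described: zip the two vectors, multiply pairwise,
/// and reduce with a sum.
fn dot_iter(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b.iter()).map(|(x, y)| x * y).sum()
}

fn main() {
    println!("{}", dot_iter(&[1.0, 2.0, 3.0], &[4.0, 5.0, 6.0])); // 32
}
```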
Let's say we ship this and we start doing a lot of search or indexing on it, and let's have a look at how the system responds. You're doing a lot of searches, you want to see what's going on on your CPU, so you start profiling it, and then you find out that 70% of your CPU time is spent in the sum function of the iterator.
As you can see, we can go through the stack from bottom to top. We start at the bottom with our Tokio runtime, then we go up through our vector storage abstraction until we reach something called the HNSW index graph layers. That should ring a bell if you remember what we talked about. And then this is the point where we try to find the points that are the most similar to each other, by calling the dot similarity function, which is basically our iterator implementation.
So you see this, and you wonder: is it slow? Well, to be honest, you don't know if it's slow. To know if it's slow, you would need a different implementation to compare it with, right? For now, you just know that this implementation is CPU-bound and that it is the bottleneck, and this means that if you remove the bottleneck, you most likely have huge gains. Of course, this is what we did, and to do so we used SIMD. SIMD stands for Single Instruction, Multiple Data.
For x86 you have the old SSE family, or AVX, with the latest one being AVX-512; for ARM you have NEON. Those instruction sets all have different widths. The more modern instructions have larger widths, which tell you how much data you can stick into them, and as those instruction sets get better, they gain new functions to combine results.
There are two ways to use SIMD: either you hand-code the intrinsics yourself, or you rely on auto-vectorization from the compiler. To be honest, if you really want to make sure you're using SIMD, you have to hand-code it; I would never rely on auto-vectorization.
Broadly speaking, there are two classes of SIMD instructions. You have the horizontal ones, which take a single vector and apply an operation to it in order to reduce it; in that example, a horizontal add, we add all the elements. Or you have the vertical operations, which take two vectors and apply the operation in a pairwise fashion, and at the end you get a vector; in this example, a vertical add.
Different CPUs support different instruction sets, and there are two ways to solve this problem. You could do it statically, that is, produce different binaries for each microarchitecture. This is doable, but it will put a lot of stress on your release pipeline. We went with a different approach: we detect the features supported by the CPU dynamically, at runtime. You can see we have those configuration blocks; the first one is for x86 AVX, the other one for x86 SSE.
This is what SIMD code looks like. First, it's pretty unsafe, even internally: to make it work, you have to rely on raw pointers to load the data. In this case this is AVX2, I believe, so we use 256-bit registers. The trick to remember here is that we have a 256-bit register and we're working with f32 floats, so we're able to fit eight floats into the register, and the magic is that we're going to process the dot product eight floats at a time. The for loop here goes through the two arrays and performs the multiplication and addition with the FMA (fused multiply-add) instruction. I won't go through the details, because the code can be a bit tedious, but at the end we do a sum: in that 256-bit register we have eight floats that we somehow need to reduce, so there are many casts, and out comes an f32.
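Put together with the runtime feature detection mentioned earlier, a simplified version of this could look as follows (my own sketch, not the production implementation; it assumes an x86_64 target and falls back to the scalar version elsewhere):

```rust
#[cfg(target_arch = "x86_64")]
mod simd {
    use std::arch::x86_64::*;

    /// AVX2/FMA dot product: eight f32 lanes per 256-bit register.
    /// Safety: the caller must ensure the CPU supports AVX2 and FMA.
    #[target_feature(enable = "avx2", enable = "fma")]
    pub unsafe fn dot_avx2(a: &[f32], b: &[f32]) -> f32 {
        assert_eq!(a.len(), b.len());
        let mut acc = _mm256_setzero_ps();
        let chunks = a.len() / 8;
        for i in 0..chunks {
            // Load 8 floats from each slice via raw pointers.
            let va = _mm256_loadu_ps(a.as_ptr().add(i * 8));
            let vb = _mm256_loadu_ps(b.as_ptr().add(i * 8));
            // acc += va * vb, eight lanes at a time (fused multiply-add).
            acc = _mm256_fmadd_ps(va, vb, acc);
        }
        // Reduce the 8 accumulator lanes to a single f32.
        let mut lanes = [0.0_f32; 8];
        _mm256_storeu_ps(lanes.as_mut_ptr(), acc);
        let mut sum: f32 = lanes.iter().sum();
        // Handle the tail that didn't fill a full register.
        for i in chunks * 8..a.len() {
            sum += a[i] * b[i];
        }
        sum
    }
}

/// Plain iterator version, used as the portable fallback.
fn dot_scalar(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Runtime dispatch: use SIMD only when the CPU actually has it.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") && is_x86_feature_detected!("fma") {
            return unsafe { simd::dot_avx2(a, b) };
        }
    }
    dot_scalar(a, b)
}

fn main() {
    let a: Vec<f32> = (0..20).map(|i| i as f32).collect();
    let b: Vec<f32> = a.iter().map(|x| x * 2.0).collect();
    assert_eq!(dot(&a, &b), dot_scalar(&a, &b));
    println!("{}", dot(&a, &b)); // 4940
}
```

Note that the tail loop matters: real vector dimensions are rarely a multiple of eight.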
Oh, by the way, this is a simplified version. The real version is called dot_avx; you can find it in our code base. It doesn't fit on a slide, but if you want to go fast, sometimes you have to get your hands dirty.
Is it any good? Well, here we have a Criterion benchmark that compares our iterator-based dot product function with the SIMD one. It's important to be precise here: the data set uses vectors with 1000 dimensions, so that's pretty large, and we see a 10x improvement. As we are processing eight floats at a time, we could already expect 8x, and we get a little bit more because we are also making the addition a bit faster. When you see benchmarks like that, you're always very happy, but the question is always: how does this perform in the end-to-end system?
Now we're going to see how our SIMD impacts indexing. For the indexing benchmark, I took 100,000 vectors of dimension 500 and indexed them on my laptop. The absolute numbers here are not very interesting, it's done on my laptop and I'm pretty sure other processes were running at the same time, but we're building this HNSW index with our dot products and we can compare the two results. We can see that the iterator-based dot product takes about 300 seconds, and when using the SIMD version we are under 100 seconds.
So it's a really great gain; we basically changed around 150 lines of code. You would really miss out by not using SIMD for this use case. And interestingly, before using SIMD we were using the OpenBLAS project, which is kind of the reference in terms of speed for linear algebra operations. It's written in C and Fortran, and thanks to the work of the team in optimizing our SIMD operations, we now have similar performance to that project, which is really a great achievement.
A bit about the architecture of the project. As I said, we have a collection that holds vectors. Internally, a collection is composed of several segments, and those segments are created on demand as you add more points. Each segment is independent in terms of storage. Inside a segment we have the vector storage and the payload storage, handled independently; both currently sit within RocksDB, and we use the Rust RocksDB bindings there. As for the HNSW index, we save it in a binary file that we handle ourselves. Those segments have a life cycle: they can be full, and then you need to create a new one, and you still need to be able to delete things in there. So we have a couple of optimizers running in the background that perform batch deletions, go through the tombstones, sometimes merge two segments together, and so on.
Concurrent programming. I think by now you understand that Qdrant is inherently stateful. We have to juggle segments, indexes, different storages, and it's quite complex; at the same time, we need to manage concurrent access to this data. In Rust you have, broadly speaking, two fashions of doing concurrent programming. Either you share data between threads via channels, which is a bit akin to the actor model, where the mutable state is enclosed within threads, or you do thread synchronization with concurrency primitives such as Mutex and RwLock.
We use both flavors depending on the use case, but we have had a lot more, let's say, fun with the mutexes and RwLocks, so this is where I'm going to go a bit deeper. Before talking about the RwLock: a Mutex protects a piece of data and enforces a single thread accessing it at a time, so it's very strict. If you want something less strict, you have the RwLock, which gives you several readers at a time, or one writer.
The beautiful thing about those APIs in Rust is that proper usage is enforced by the type system. On the right-hand side is an example that I shamelessly took from the standard library: you can see there is no unlock method when you acquire a lock. There's no unlock to do, and this is a bit difficult to understand for people coming to Rust. What's happening here is that when you acquire a lock, if you do lock.read() you get what's called a read guard, and if you do lock.write() you get a write guard. As long as you hold this guard, you are holding the lock: either you drop the guard explicitly, or you wait for it to go out of scope, to release your lock.
The nice thing about this API as well is that when you have a read guard, you can access the data, because the read guard implements Deref, and in the same way, when you have a write guard, you can access the data and get an exclusive reference out of it, because it implements DerefMut. Those APIs are really nice. We use a lot of locks in our code base, but mostly locks from parking_lot, which is another crate with more concurrency primitives; they are generally faster and lighter.
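The guard behaviour described here can be seen in a tiny example using the std RwLock (parking_lot's API is similar but its guards don't return a Result):

```rust
use std::sync::RwLock;

// Guards, not unlock(): holding the guard is holding the lock.
fn main() {
    let lock = RwLock::new(vec![1, 2, 3]);

    {
        // read() returns a RwLockReadGuard; Deref gives shared access.
        let r = lock.read().unwrap();
        println!("{}", r.len()); // 3
    } // guard dropped here: the read lock is released

    {
        // write() returns a RwLockWriteGuard; DerefMut gives exclusive access.
        let mut w = lock.write().unwrap();
        w.push(4);
    } // write lock released when the guard goes out of scope

    assert_eq!(lock.read().unwrap().len(), 4);
}
```

There is no way to touch the protected data without going through a guard, which is how the type system enforces correct locking.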
Deadlocks can happen, though. One way is if you perform a double read lock on the same thread: locks in Rust are not re-entrant, so you can run into this issue. And I've been talking a lot about locks, but you can also get deadlocks with channels: if you start playing around with bounded channels and cycles, that is very dangerous as well.
Those deadlock issues are not caught by rustc, and this sometimes surprises people. They expect that all concurrency issues are magically taken care of by the Rust compiler, but this is not the case. This means that avoiding those issues, not just fixing them, requires discipline from yourself.
Luckily, there are some tips I can give you to help, because we did have, I would say, a dozen deadlocks, and they are absolutely not fun to track down. The first one is to use a runtime deadlock detector. This is a feature from parking_lot; it's a build feature, so you need to enable it in your Cargo.toml file.
We do it only for CI builds, so our CI builds will check for deadlocks, and it saved us one time: we were able to find a deadlock in our cluster state handling. When the deadlock happens, you get a really nice stack trace, which is really useful. In my opinion, the deadlock detection is such a killer feature that it alone justifies using parking_lot.
If you want to go further than this and try some bleeding-edge technology, you can try a static deadlock detector. There is the lockbud project from BurtonQin, which analyzes your code and runs simulations to find potential issues in the interactions within the code. We discovered this project because the author just came to our GitHub issue tracker and said: you potentially have a double read lock in your proxy segment. That was very nice of him, and we fixed it right away.
Let's go distributed. You can really go a long way with Qdrant on a single node: I myself went to more than 30 million vectors on a single beefy node, and it was blazing fast. However, that's maybe not the architecture you want to go with in production. First, you want to stay available: if you have only a single machine and it goes down, you're going to have a bad time.
Also, when you scale, you want to scale on commodity machines; scaling by hosting only really beefy machines gets very expensive. The way we did it in Qdrant is a classical shard/replica architecture. When you create your collection, you say how many shards you want and how many replicas. It's pretty straightforward from a user perspective.
We do not have leader shards for writes. What we do is send your request to any node, and it will figure out where the write should be executed. Any of the replicas can handle a write, and the write is then propagated to the other replicas. By default, we do not have transactional operations on the vectors, so if you need to support concurrent writes on the same points, that's something you have to parameterize yourself: by passing a little extra option, we will know that we need to serialize those writes.
To make this work, we had to synchronize some kind of cluster state and have a distributed collection topology, and to do so we use the Raft consensus algorithm. A quick 101 on Raft: the Raft consensus algorithm enables your nodes to agree on a sequence of operations. This is called your log, and your log has entries, and you want this log to be replicated everywhere. Everyone needs to agree on what's in the log; it's critical. Once this log is replicated, you can execute it. This is the classic scheme of state machine replication.
If you have the same state machine everywhere and they run on the same data, you should get identical results. In Raft, there is a single leader that is elected, and the role of the leader is to receive the proposals for new entries and send them to the followers. Once a follower has replicated an entry, it acknowledges it and the leader tracks it, and when a majority of the followers has replicated an entry, the leader will commit it. Saying "I'm committing this" means: followers, please process this entry.
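The majority rule can be sketched in a few lines (my own illustration of the Raft commit rule, not Qdrant's consensus code):

```rust
/// Raft commit rule sketch: an entry is committed once it is replicated
/// on a majority of the nodes. `match_index[i]` is the highest log index
/// known to be replicated on node i (leader included).
fn commit_index(match_index: &mut Vec<u64>) -> u64 {
    // Sort descending; the value at position (majority - 1) is an index
    // replicated on at least a majority of the nodes.
    match_index.sort_unstable_by(|a, b| b.cmp(a));
    let majority = match_index.len() / 2 + 1;
    match_index[majority - 1]
}

fn main() {
    // 5 nodes; the highest replicated index on each one.
    let mut m = vec![7, 5, 5, 4, 2];
    println!("{}", commit_index(&mut m)); // 5: replicated on 3 of the 5 nodes
}
```

This also shows why slow followers don't block progress: only a majority needs to have an entry before the leader can commit it.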
Now, a common misunderstanding with Raft: people seem to think that this is some kind of atomic, global, distributed consensus, but you will basically have nodes lagging behind.
The library we use was the most mature one we found at the time. It's definitely not easy to integrate into a code base: there is a complicated dance to process the messages, plus you have to implement your own persistence layer for the log, and you have to be really careful there. We had some questions, but luckily the maintainers are very nice and very responsive, so kudos to them for going along with us on this kind of adventure.
We went through a couple of indexing schemes to support approximate nearest neighbours, and the specific implementation of HNSW. Now you know you can apply SIMD to your bottlenecks; if it's possible, you should do it. Here we're doing a lot of f32 float business, but even if you write backend applications, you can parse JSON with simd-json, so give it a try. Modern hardware is really fantastic. Fearless concurrency in Rust? Sure, except for deadlocks, and it is the responsibility of the developer to make sure those do not happen.
Lastly, distributed systems are hard. That's not very new, but I had to put it in. And basically this whole presentation is, I guess, another data point in favor of Rust for building concurrent, highly distributed systems.