From YouTube: 2017-FEB-23 -- Ceph Tech Talks: Big Data Analytics
Description
Adit Madan talks about Big Data Analytics on Ceph using Alluxio.
http://ceph.com/ceph-tech-talks/
Good, thanks, Patrick. Thanks, everyone, for joining. As we go through the presentation today, feel free to stop me at any time if you have questions or would like to talk about something. Today, like Patrick said, I'm going to talk about how Alluxio can be used to speed up data analytics on top of a Ceph storage cluster.
Okay, before we start, a little bit about myself. I'm a software engineer at Alluxio, the company behind the open-source project. I graduated from CMU in 2013, where I worked on different distributed and storage system problems, and before that I was an undergrad at the Indian Institute of Technology in Delhi. Feel free to get in touch with me after the talk if you are interested; my email is right there on the screen.
So I'll start with a brief introduction of what Alluxio is and the ecosystem it's typically used in. Alluxio, the open-source project, is actually one of the fastest-growing open-source projects in the big data space. The graph that we're looking at shows the number of contributors for different projects in the early stages of each project.
So the essence of what Alluxio does is connect any application to any storage, at memory speed, at any scale. But to give you a little more context on where we are coming from: we started with a world in which we had one compute framework, which was Hadoop MapReduce, and there was one storage system typically used with MapReduce, which was the Hadoop Distributed File System.
Applications then had to deal with a variety of storage systems, which was not an easy task. Now, what you can do with Alluxio is configure your compute framework to work with Alluxio, and Alluxio itself handles all of the communication with the different kinds of storage systems. So, as an application developer, you only worry about connecting with Alluxio, and Alluxio can connect with the different storage systems underneath.
So Alluxio provides different kinds of interfaces, but the recommended one is the native file-system-like API that we have, which gives you access to any of these systems. To connect to the systems underneath, Alluxio speaks each system's own interface: if you're connecting to the Hadoop Distributed File System, it uses the HDFS interface; in our case, when we connect to Ceph, we use the Swift interface to connect to Ceph.
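As a rough sketch of what that looks like in configuration terms (the property names and the endpoint below are assumptions for illustration, not values from the talk; check the Alluxio and Ceph documentation for your versions), mounting Ceph through its Swift-compatible RADOS Gateway means pointing Alluxio's under storage at a swift:// URI:

```
# alluxio-site.properties -- hypothetical values for illustration only
# Point the Alluxio under storage at a Swift container served by the
# Ceph RADOS Gateway (the Swift-compatible endpoint in front of Ceph).
alluxio.underfs.address=swift://demo-container/
fs.swift.user=demo:demo
fs.swift.auth.url=http://storage-manager:8090/auth/v1.0
fs.swift.auth.method=swiftauth
```

With a mount like this in place, the compute framework never talks to Ceph directly; it only sees Alluxio paths.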
Alluxio, like I said, brings three main benefits. The first is that it unifies different storage systems. The second is high performance, by running jobs at memory speed, since Alluxio is co-located with the compute. And you also save money by paying for only the compute and storage you need, with the flexibility that Alluxio provides through the separation of compute and storage.
You can use cost-efficient object stores, you can scale the resources independently on an as-needed basis, and you can also use big-data-native frameworks without sharing any resources with the underlying storage. However, there is a disadvantage to the separation between compute and storage.
Whenever the compute framework has to access data, it takes longer, because the storage is farther away: the network latency is high and the throughput is low. This is exactly where Alluxio comes into the picture, as the compute-side data management layer.
So let's look at the use case without Alluxio. In the example that I'm going to present today, I'm using Spark as the compute framework, and I'll use Ceph as the storage system; the Ceph box at the end could be replaced with HDFS. Now, whenever you're accessing data from the storage in the compute framework, you observe high latency, and you're bounded by the network throughput that is available from the storage system.
Here's an example use case: people at Baidu were using Alluxio to accelerate data access from the Baidu file system cluster. Alluxio was managing over 2 petabytes of data, both in memory and on hard disk drives, and the size of the deployment was over 200 nodes. In this particular use case, Alluxio was able to bring a performance benefit of over 30x, through the benefits that I have already outlined.
I'll show you experiment results and a demo video of running Spark on top of Ceph. In the configuration that I have, I'm running everything on EC2, using four types of machines. The first type of machine is called the compute master, which runs the Spark and Alluxio master processes. The second type of machine is the compute workers, which run the Spark and Alluxio worker processes; I use three of these workers. The third type of node is the storage manager, which runs the Ceph RADOS Gateway daemon and also the monitor process. Lastly, the actual data lies on nodes named storage devices, which are essentially the Ceph OSDs. I use the r3.xlarge instance type, and all of the machines have been launched in the same availability zone. Also note that it is not a requirement that everything be in the same availability zone.
The versions that I've used are Ceph Hammer and the recently released Alluxio open-source version 1.4. I've used a custom JOSS library, in case anyone is interested in reproducing the numbers; JOSS is essentially the client library that allows users to communicate with a storage backend which supports the Swift API.
I'll show you a quick five-minute video of some of the things that I described. Before I start the video: Spark, Alluxio, and Ceph have been pre-deployed with the configuration that I showed you, and Ceph has been pre-populated with a 60 GB data set which is not present in Alluxio memory, so it's only in Ceph. When I start the video, I will show you a sample application running queries in Spark using the Spark shell.
What we'll do is run a simple Spark count job, which counts the number of lines in a file, in this case the 60 GB data set. Then I will run a second Spark count job, which will show you the caching effects in Spark itself, and also compare that with storing the data in Alluxio memory. I will then restart the shell and show you the performance of a third count.
The implication of restarting the shell is that whatever data Spark had cached is lost, but this limitation is not present when you use Alluxio, as you will see in the performance results that I share: for the third count, you will see significantly higher performance with Alluxio. In the video, to end the demo, I will show ad-hoc queries using Alluxio; what I mean by that is that I'll store some intermediate results and then issue a word count.
The first thing that I did is I already have Ceph configured as a storage system that Alluxio communicates with. "Not in memory", in the text that we see right now, means that the data is not being managed in Alluxio, but Alluxio is aware of these files existing in Ceph storage. So there is a folder named "data", which holds files of 3 gigabytes each, making up our total sample data set of 60 gigabytes.
So I'm going to pause there. The way Spark communicates with Alluxio is through the Alluxio file-system-like interface. Here, what we did was reference the file with an alluxio:// path, where demo-master is the host name running the Alluxio master process, and "data" is the directory that we are running our compute job on.
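To make that path concrete, the file the Spark job reads is addressed with an alluxio:// URI. This tiny sketch just assembles such a URI (the hostname comes from the demo; port 19998 is Alluxio's default master RPC port, which is an assumption here, not something stated in the talk); in the Spark shell, the resulting string would be handed to a call like sc.textFile(uri):

```python
# Assemble the kind of Alluxio URI the demo's Spark job reads from.
# "demo-master" is the hostname from the talk; 19998 is Alluxio's
# default master RPC port (an assumption, check your deployment).
def alluxio_uri(host: str, path: str, port: int = 19998) -> str:
    return f"alluxio://{host}:{port}/{path.strip('/')}"

print(alluxio_uri("demo-master", "/data"))  # alluxio://demo-master:19998/data
```

The point is that Spark only sees an Alluxio path; which under storage actually holds the bytes (Ceph here) is invisible to the application.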
I have fast-forwarded the actual running time of the compute job; this is a job which runs for 12 minutes. You can see that Spark is running with process-local locality, which also means that a lot of the data is not being fetched from Alluxio memory at the moment.
For the sake of comparison, both Alluxio and the direct access on Ceph have been configured with a 512-megabyte block size. What that means is that for the 60 GB data set, there are 120 tasks created by Spark. And we saw that the first count job that we did on Alluxio took seven hundred and fifty seconds, which is approximately 12 minutes.
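The task count follows directly from the block size: Spark creates roughly one input task per block, so a 60 GB data set at a 512 MB block size yields 120 tasks. A quick back-of-the-envelope check:

```python
import math

# One Spark input task per block (simplified model of input splits).
data_set_bytes = 60 * 1024**3   # the demo's 60 GiB data set
block_bytes = 512 * 1024**2     # the configured 512 MiB block size
tasks = math.ceil(data_set_bytes / block_bytes)
print(tasks)  # 120
```

This is why the block size matters for parallelism: halving it would double the number of tasks over the same data set.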
The next thing that we did was re-reference the file in Alluxio; this step is essentially fetching the block locations from Alluxio, that is, the locations of the Alluxio worker processes which store the data now that it has been fetched from the remote Ceph storage cluster into Alluxio.
You will see that when we perform the count job again, it finishes much faster, and you can also see that the tasks were run with node-local locality, which means that the tasks were launched on the nodes which hold the data.
Okay, so the next thing that I'm going to do is perform some word count operations on top of the same data set and store the intermediate count results in Alluxio. What I mean by the intermediate count results is that once we calculate the count for each word, we store this information in Alluxio, which I call intermediate data, and then I will perform subsequent queries on the intermediate data, which avoids accessing the entire 60 GB data set from remote Ceph storage.
We saw that the job took about 400 seconds to complete; as you can see on the screen over there, it took four hundred and twenty-two seconds. The next thing that we do in the demo is store the intermediate data in Alluxio, in a file named 60GB-counts, which stores key-value pairs: the key is the word, and the value is the count of that word. Now we will issue subsequent queries on the 60GB-counts file.
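Conceptually, the intermediate data is just the reduced (word, count) pairs, and subsequent ad-hoc queries hit that much smaller table instead of re-reading 60 GB from Ceph. A pure-Python sketch of the same idea (no Spark involved; this only illustrates the shape of the data):

```python
from collections import Counter

# Stand-in for the 60 GB text input: a few lines of sample data.
lines = ["big data on ceph", "data in alluxio", "ceph and alluxio"]

# The word count job reduces the raw text to (word, count) pairs...
counts = Counter(word for line in lines for word in line.split())

# ...and later ad-hoc queries consult the small intermediate result,
# not the original input (in the demo, these pairs are what the
# 60GB-counts file in Alluxio holds).
print(counts["alluxio"])  # 2
print(counts["data"])     # 2
```

In the demo, the payoff is that every query after the first one is served from this reduced data set in Alluxio memory.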
Note that I also exited and restarted the Spark shell, to demonstrate that once the data is in Alluxio, it can be shared across different applications. This is also relevant because typical big data workloads share data across different jobs, and the benefits that we see with Spark caching are only applicable within the same job itself.
We observed that the job took approximately the same time. Now, when we do the same operation in the same Spark shell for a second time, Alluxio took a little bit less time than it took to run Spark on top of the data which is cached in Spark. The difference in the second count that we have over there is that in the blue bar the data is stored in Alluxio, and in the red bar the data is cached in Spark.
What we did next was restart the Spark shell, which simulates another application, and we performed the count job again. You can see that when the data was being accessed from Alluxio memory, it took approximately 20 times less time than it took to access the same data from the remote Ceph storage cluster, and this is the performance benefit that we see when accessing data from Alluxio for repeated accesses.
We also have a white paper on the use case that I just described. If you're interested in looking at how you would set up Alluxio on top of Ceph with Spark, the white paper has detailed instructions on doing that, and the blog post gives you a brief introduction to what the white paper talks about.