From YouTube: Slingshot Phase 2.3: Closing Ceremony
Description
Slingshot 2.3 closing ceremony, held on June 9th 2021.
00:00 Welcome & Agenda - Deep Kapur
1:36 Phase 2.3 Closing
4:19 Phase 2.3 Judging & Results
13:22 Phase 2.4 Kickoff
14:58 FileDrive and Go-Graphsplit - Laura
23:35 Common Crawler data processing with FilSwan - Charles Cao
35:11 Storing the A2D2 dataset - Fei Yan
42:51 Twin Quasar - Julien Noel
49:01 Q&A and Closing - Deep
A: Pretty exciting lineup of sessions for you today, so I'm excited to get started. Thanks to those of you that waited, and to those of you attending from various parts of the world at a time zone that may not be the most convenient. Of course, this is also being recorded and will be available for viewing for weeks, months, and years to come. But yeah, really excited to be here, and thanks to those that could make it out.
A: If you have any questions or comments, please feel free to drop them in the chat; I'll be monitoring it, and so will the wonderful team that's here today. With that, let's kick things off. On the agenda today we've got brief closing comments from me on phase 2.3, as well as phase 2.4, which is actually already underway and crossed a milestone today. Then we'll have a brief presentation from Laura on FileDrive, and Charles will be presenting about his work with Common Crawl.

A: Fei is here to present as well, on his work with the A2D2 dataset, and then we've got a video recording from Julien, since it's an inconvenient time in France for him to present live; he'll be talking about his work with the Free Rainbow Tables dataset. We'll also try to keep some time at the end for a live Q&A.
A: You can also ask questions of the participating teams that are presenting, so if you have any questions for them as they go through their presentations, please feel free to drop those in chat. Great, let's get started. First up: congratulations on completing Slingshot phase 2.3! This was a pretty exciting phase. We had lots of data onboarded, close to nine pebibytes over the course of the phase. You can take a look at this graph here.
A: It's really illustrative: all the way from September, when Slingshot 2 started, the rate at which data is onboarded onto the network has continued to increase, and the amount has grown substantially. Phase 2.1 was about 0.8 pebibytes, phase 2.2 was over three pebibytes, and phase 2.3 was close to nine pebibytes, so it's growing substantially.

A: This phase also had 300,000 deals made, which is massive, for those of you that are more intimately familiar with slingshot.io, the website that we have.
A: There are interesting cases where just loading the data takes a little longer now, because there's so much information being shared. So I'm really excited to share this progress with all of you, and congratulations to all those of you that participated in this phase and made it such a massive success. A quick overview for those of you that are newer to the competition: there are still tons of datasets that were eligible for phase 2.3 and will continue to be eligible afterwards.
A: The full list can be found at the Slingshot repo; it's somewhere in the mid-60s at the moment, I believe. Landsat 8 was disallowed from Slingshot phase 2.3, because lots of teams had already onboarded it, and during the course of 2.3 we achieved similar amounts, lots of replicas, of Common Crawl, GenomeArk, and the Linux ISOs. So moving forward those will no longer be eligible, but there are lots of copies of them available thanks to the progress in phase 2.3, and hopefully we can find ways for the community to engage with that data. As usual, we're also continuing to receive lots of contributions from the community, so if you have a dataset that you think should be eligible for Slingshot, please feel free to open up an issue or a PR at that repo.
A: The one thing you must know is that the dataset should be public, open, and accessible to all: no special permission should be required to retrieve the data contained in the dataset. Generally we tend to index toward datasets that have some sort of public value and can be useful either for development on or around Filecoin, or within the ecosystem in general. So datasets that include data for modeling and training, data that can be utilized on its own, or data that helps advance studies, research, and science tend to be the ones that get accepted. But we're happy to entertain any ideas you have, so feel free to create an issue if you'd like to kick off a discussion about including your dataset, or a dataset you found that you'd like to encourage being onboarded onto the network. Cool. So, as we discussed, we crossed 13 pebibytes at the end of this phase, which unlocked a total rewards pool of a hundred and sixty-five thousand FIL, which is absolutely massive.
A: 25 teams were eligible at the end of this. That includes teams that complied with all of the rules for how the data needed to be stored, which meant no team was violating the maximum number of replicas of each piece that could be stored onto the network, teams were using at least four miners to store the data, and no more than 30 percent of the data was being stored with a single miner, so there was some distribution of data across miners.
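For illustration, here is a minimal sketch of checking the last two of those rules over a team's deal list; the deal tuples and miner IDs below are hypothetical, not the actual scoring code:

```python
from collections import Counter

def check_distribution(deals, min_miners=4, max_share=0.30):
    """Check the miner-count and single-miner-share rules described above.

    `deals` is a list of (miner_id, bytes_stored) tuples; returns a list of
    rule violations (an empty list means the team passes these two checks).
    """
    per_miner = Counter()
    for miner, size in deals:
        per_miner[miner] += size
    total = sum(per_miner.values())

    violations = []
    if len(per_miner) < min_miners:
        violations.append(f"only {len(per_miner)} miners used, need at least {min_miners}")
    for miner, size in per_miner.items():
        if size / total > max_share:
            violations.append(f"{miner} holds {size / total:.0%} of the data, above {max_share:.0%}")
    return violations

# Hypothetical team: three miners, one of them holding most of the data.
print(check_distribution([("f01001", 75), ("f01002", 10), ("f01003", 15)]))
```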
A: This is a quick screenshot. I know it isn't particularly big or particularly useful, but I just want to show you what it looks like. If you head over to this link, it's shared in the Slingshot announcements channel; if you haven't seen it, want to see it, and can't find it, feel free to DM me on Slack, I'm Deep Kapur on Slack, and I'll send you a link. But this should give you an idea of how it looks and how you should read it.
A: So you have the project names, which are the names of the participating teams that are onboarding data onto the network. The deal metadata score reflects the information submitted by these teams about the deals that they made. The documentation score reflects the quality of the docs written and provided by the team, so that anybody can understand how to use what they've onboarded onto the network, to ensure it's useful for people in the future.

A: Then there's the deal UI score: teams for this phase actually created websites for each of their projects that you can check out. And demo videos: you'll actually see one of these later during this presentation, but in general, each team now has a few-minute video with a brief explanation of the data, their logic for how they onboarded content onto the network, how you can find it, and how you can retrieve and use it. All of the videos are available on YouTube.
A: If you're looking for any links, the easiest way is to head over to YouTube to find the playlist; if you're not able to access YouTube, there are links available elsewhere, and you should also be able to check them out from the Slingshot website. Happy to discuss this a little bit further if you have any questions. I'm going to dig into a couple of the details here, so for those of you that are interested in participating in future rounds especially, definitely pay attention to this.
A: We found that pushing teams to create this kind of content was incredibly valuable, both for the community and for the teams themselves, in terms of how they prepared the content being stored and ensured that it was retrievable and usable. Teams were also initially capped at 10 percent of the total rewards available in the pool. Once we had made sure the right amount of rewards was allocated for each of the teams, with the excess FIL remaining in the pool we actually went back and awarded bonuses for exceptional content submissions: for documentation, UIs, and videos that scored at 100 percent of the criteria for each of those categories, a 500 FIL bonus was awarded. Whatever FIL remained at the end was then allocated back to the participating teams proportionally. So if you look at the previous table, those little cells in green are the scores where we thought, wow, this met every single criterion we asked for; the standard was really high, and that team deserved an additional bonus.
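As a rough sketch of that payout logic, with illustrative numbers rather than the actual phase 2.3 figures:

```python
def allocate(pool, base_rewards, perfect_scores, bonus=500.0, cap_share=0.10):
    """Illustrative Slingshot-style payout: cap each team at a share of the
    pool, pay fixed bonuses for perfect content scores, then redistribute
    the remaining FIL proportionally to the capped rewards."""
    cap = pool * cap_share
    capped = {t: min(r, cap) for t, r in base_rewards.items()}
    payouts = {t: capped[t] + bonus * perfect_scores.get(t, 0) for t in capped}
    remaining = pool - sum(payouts.values())
    total_capped = sum(capped.values())
    return {t: payouts[t] + remaining * capped[t] / total_capped for t in payouts}

# Hypothetical three-team pool: teamA exceeds the 10% cap and earned two bonuses.
print(allocate(165_000, {"teamA": 20_000, "teamB": 9_000, "teamC": 6_000},
               {"teamA": 2, "teamC": 1}))
```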
A: Once again, congratulations to all the teams that participated, and to the 25 teams that ended up being eligible for rewards: really, really well done. It was awesome; it was a privilege to get to score the content that you created, and thanks for the time that you put into this. Now, some quick takeaways; some of these were also shared in Slack.
A: We did get a retrieval success rate published, but it was not actually included in the scoring for phase 2.3, primarily because we didn't have a large enough sample size for all the participating teams. However, we have started publishing it on the website, so you can always check it out: if you're a project team, you can go and see what our attempts at retrieving your data have looked like. This will certainly be part of the scoring for phase 2.4, and likely future phases as well. In terms of documentation:
A: Sometimes a team just said, "go to the network and get it." It's important to cover both cases, because clients coming to get the data that teams have onboarded through Slingshot may prefer getting it directly from miners on the network, or they may prefer going through the UIs you've built. But also, once they've retrieved the individual pieces, reconstructing them back into a usable dataset is key, and many times that content was missing.
A: Laura's team has also been doing some awesome work open-sourcing tooling that has made storing large amounts of data specifically for Slingshot more viable, so her presentation will include some of that as well. On deal UIs: I personally really enjoyed seeing some really awesome JavaScript and front-end work, so congratulations to those of you that put in the time to do that. It was awesome; I really enjoyed it.
A: The main flag that I'd call out here is that search features tended to be inconsistent across sites. Some sites had phenomenal search, where I could put anything in and figure out exactly what I needed to go retrieve; on other sites the search button didn't work, or I could only search by specific fields, or I had to make sure I pasted the entire text in correctly, otherwise I wouldn't be able to find what I was looking for. And this is more of a general comment: individual files tended to be the way we expected clients to look for things, but a lot of the websites were actually targeted at a user who knew exactly which piece they were looking for. That's not super intuitive, especially if a piece includes multiple files, so indexing the pieces and making them discoverable on a per-file basis is probably going to be beneficial moving forward.
A: I'm sorry, I'm rushing through some of this; I want to make sure we get through the rest of the content. If you have any questions about what I'm talking about, please feel free to put them in chat and I'll definitely address them as we reach the end of this particular session. On the videos: also fantastic work from those of you that submitted videos. There were really great explanations.
A: They showed understanding of the dataset and the entire flow that you went through. The main flag here is similar to the documentation one: most videos went all the way to showing the retrieval actually being done and a file being obtained from the network, but then stopped short of actually opening the retrieved content. Don't just show what's in the CAR file; put it back together and open it up in a viable application. A good example would be the people that stored the Linux ISO dataset.

A: They would just show the CAR file, and then you'd pop it open and you'd see, yeah, there's a bunch of images here. But what do you actually do with that? The client is going to want to do something, so mounting it somewhere, or showing for 30 seconds what it would look like to extract an ISO and use it to create a VM, or anything else, would probably be useful. But yeah, overall this was one of the most difficult phases to participate in, and expectations were really high.
A: Teams definitely met those expectations. 27 bonuses were awarded across these teams; as a reminder, those bonuses were based on exceptional quality of the content submitted. So congratulations to the teams that met that bar across multiple fronts; hopefully you feel like your work was well rewarded, and we hope to continue seeing that high quality from you. So what happens next? Slingshot phase 2.4. Phase 2.4 actually started pretty much as soon as phase 2.3 ended, so there are already teams onboarding data.
A: I want to call out that, as of today, congratulations to those that are participating: you hit the milestone of 15 pebibytes, which means 25,000 FIL being distributed in rewards. There will be a continued focus in phase 2.4 on usability; submissions will include deal metadata, documentation, deal UI, and demo videos, similar to phase 2.3, and we'll definitely include the RSR (retrieval success rate) number as well.
A: One thing we're doing, though, is reintroducing a limit of at most three teams using the same dataset. We want to continue encouraging diversity in the information being onboarded onto the network. And of course, if there's a dataset you specifically want to onboard as part of your participation in Slingshot and it's not on the list of datasets, please feel free to head over and create an issue so we can have a discussion about it.
A: If you're interested in participating, I would say head over to slingshot.filecoin.io to see the latest on the competition, and register there to participate. If you have any questions, reach out in the slingshot channel on the Filecoin Slack, and I'll be sure, or somebody more knowledgeable than me will be sure, to address them.
A: So with that, thanks so much for being here. I'm going to kick it over first to Laura. Laura actually submitted a video that I'll be playing for her presentation, but she's also here, so if you have any questions for her, please feel free to share them in the chat. So yeah, I'm going to play a quick video from Laura. Laura, thanks so much for making time and for sharing this with us.
C: Welcome to the Slingshot phase 2.3 closing ceremony, and congratulations to all the participants; we did a truly extraordinary job. And thanks for inviting me to talk about our Slingshot project. Firstly, let me give an overview of my team. My team, FileDrive, has been in the Filecoin ecosystem since 2019.

C: We have already developed a data preparation tool, go-graphsplit, and the first Filecoin Plus dashboard, filplus.info. As for the Slingshot competition, we have been participating since Slingshot phase one last year, and have stored over two PiB of data from a number of public data sites all over the world.
C: For this reason we developed graphsplit, a tool for slicing large datasets into slices that fit deal-making on the Filecoin network. It is now an open-source project on GitHub, and everyone can use it, or create an issue or a PR to improve it. We were so glad to learn that a number of Slingshot teams used this tool in Slingshot 2.3, and we have also received a lot of feedback from community members. It means a lot to us. Now, back to this tool.
C: Well, why do we need it? During the Slingshot competition we found that when storing a large dataset, we need to slice it into small pieces to fit the size of a sector, which is generally 32 GiB or 64 GiB. At first, especially in Slingshot 2.2, we made the data into a large tarball, chunked the tarball into small pieces, and then made storage deals with miners for those pieces. On the storage side this way is quite efficient, and it allowed us to store hundreds of TiB of data in a matter of months.
C: Graphsplit takes advantage of the IPLD protocol, following its unified data-structure format. It regards the dataset and its subdirectories as one big graph and cuts it into small graphs; each small graph keeps the file-system structure as close as possible to what it used to be. After that, we only need to organize each of these small graphs into a CAR file.
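For a sense of what that looks like in practice, here is a minimal sketch of driving the chunking step from Python. The flag names follow my reading of the go-graphsplit README and may differ between versions, and all paths are placeholders:

```python
import subprocess

# Slice a dataset directory into sub-graphs that fit a sector, writing each
# one out as a CAR file. Flags are assumptions from the go-graphsplit README;
# check `graphsplit chunk --help` on your build before relying on them.
subprocess.run([
    "graphsplit", "chunk",
    "--car-dir", "/tmp/car-output",       # where the CAR slices are written
    "--slice-size", str(17 * 1024**3),    # stay comfortably under a 32 GiB sector
    "--parent-path", "/data/dataset",     # root dir, so the file tree is preserved
    "--graph-name", "dataset-slice",      # name prefix for the generated slices
    "/data/dataset",                      # the dataset to slice
], check=True)
```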
C: If a data piece contains a complete file and we need to retrieve that file, we only need to use the payload CID to retrieve the data piece through the Lotus client, fetch it back, and get the specific file we need.
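That retrieval step amounts to a single Lotus client call; a sketch, where the payload CID, miner ID, and output path are placeholders:

```python
import subprocess

payload_cid = "bafy..."   # payload CID taken from the graphsplit manifest (placeholder)
miner_id = "f01234"       # a miner that holds the piece (placeholder)

# Fetch the piece from that miner and unpack it locally; the specific file
# you need can then be pulled out of the restored directory structure.
subprocess.run(
    ["lotus", "client", "retrieve", "--miner", miner_id, payload_cid, "/tmp/slice"],
    check=True,
)
```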
C: Besides that, a manifest CSV will be created to save the mapping between the graph slice name, the payload CID, and the piece CID; the inner file structure will be supported in the next release, which is version 0.4.1. Another advantage of this tool is that it matches IPFS perfectly: if you build an IPFS website as your deal UI website, the internal file structure of each data piece can be shown on it, and it is easy for users to retrieve and download the data that you stored during the Slingshot competition.
C: So this is a brief introduction to the tool; more information can be found in our GitHub code base, and we do welcome your ideas, feedback, and suggestions. The next part is the demo video of what FileDrive did in Slingshot 2.3. Please take a look.
A: Awesome, thank you so much for sharing, Laura, really appreciate it. Congrats again on your participation in this competition; your team did extremely well, and we're really grateful for the work you're putting in to improve the state of the network, bringing data onto Filecoin and creating awesome resources like go-graphsplit for the other participating teams as well.

A: With that, I'll kick it over to Charles from the FilSwan team. Charles, feel free to take over with the screen and present. Thanks so much.
E: Hi everyone, this is Charles from the FilSwan team. It's been very nice working with you all during Slingshot 2.3, and since lots of teams have already presented how to download data and participate in Slingshot, we are bringing you another interesting topic: how to use the data. First of all, to introduce ourselves: we have participated in Slingshot since phase 1.1. In the beginning we were working on our deep learning platform; then we gradually went deeper and deeper.

E: We got close to the Filecoin network, so we finally started the FilSwan project, and just last month we were happy to announce that FilSwan is now not just a project but also a company. We are going to build the Swan cloud computing company, dedicated to a decentralized edge infrastructure marketplace, providing processing based on the Filecoin network.
E: For this phase of Slingshot, we are using the dataset from Common Crawl, a non-profit organization that has provided petabytes of data collected from the internet since 2008. An interesting detail is that they provide their own data format, WARC, for web archive files, with different sizes, tags, and information including the website itself.

E: While we were using this dataset and starting to analyze it, we ran into some challenges as a client. For example, if the Common Crawl organization wanted to back up their dataset to the Filecoin network, they would face questions like: with Filecoin sectors at 32 GiB or 64 GiB, how can we chunk or merge data effectively?
E: All of these factors determine whether a client is willing to come to the Filecoin network or not. On the other side, as a miner, we need to import a large number of deals every day.

E: This is the typical process for clients to onboard their data. First, they need to prepare the dataset: either they download it from the internet, or they have petabytes of hard drives ready. They need to chunk and merge files, then upload those files to a server, or export them to HDDs, which get shipped to data centers, like the Filecoin Discover drives. After they have those pieces ready, they need to find a couple of miners, and then they need to make deals.
E: This is the whole lifecycle on the client side, and this is what it looks like if you do it all manually, from the website and a Lotus terminal. You need to calculate how many chunks you need and the length of each deal, download the data, and tar the files together to the right size. Then you need to compute the CID and the file's piece CID, and after that you can start doing the CAR import. You can use lotus client deal with the piece CID to send the deal proposal to the miner, and at the end you get a deal ID, which you can use for tracking your data.
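A sketch of that manual flow against the Lotus CLI; the CAR path, CIDs, miner ID, price, and duration are all placeholders, and flags can differ between Lotus versions:

```python
import subprocess

def lotus(*args):
    """Run a Lotus client subcommand and return its output."""
    out = subprocess.run(["lotus", "client", *args],
                         check=True, capture_output=True, text=True)
    return out.stdout.strip()

# 1. Compute the piece CID (commP) of a prepared CAR file.
print(lotus("commP", "/data/chunk-000.car"))

# 2. Import the CAR file locally and note the data (payload) CID it reports.
print(lotus("import", "--car", "/data/chunk-000.car"))

# 3. Propose the deal: data CID, miner, price per GiB per epoch, duration in epochs.
#    All four values here are placeholders.
print(lotus("deal", "bafy...", "f01234", "0.0000001", "518400"))
```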
E: At FilSwan we're doing it another way, with batch deals. We open-sourced our project, and on filswan.com you can check out the documentation site; it will give you a guide on how to do this step by step. With FilSwan we created a Python repository that allows you to send deals together as a batch. In FilSwan we define deals as tasks: with a small dataset, you can put all the deals in one task. For example, you can have three or thirty deals in one task, and you give it a name that is easy for you to remember.
E: You can also give it a tag for the different datasets, to help you fetch it in the future, and we provide a function to download the entire dataset with all the deal CIDs together. So when you have that information ready, you can use the Python CLI tool, point it to the directory which contains your CAR files, and send all the batches in one shot.
E: On the other side, when a miner gets those kinds of deals, instead of running the online or offline CSV import manually, you can download it directly from the FilSwan website. It can give you the CSV file containing all the information, or you can use an API key to connect directly to the FilSwan website.

E: We provide the tool's code, and with just one command, a pip install of the FilSwan miner tool, it installs all the dependencies, and the built-in downloader can monitor and download deals from the address you configure. Once a client uploads on the website, it will automatically download all the deals and start importing as you wish; in the configuration file you can define when you want it to start, and the waiting time between each task.
E: As a data consumer, we provide certain ways to reorganize the data. For example, with Common Crawl, we have a website that gives you all the information about the deals; all the deals are combined into the CSV, as we showed before. With each dataset you will know which miner ID holds it and which kind of files it contains, and if you click it, it will give you the detailed information about all the original files that were merged into one CAR file.
E: Its purpose is to search all the web pages that contain YouTube videos, to see, during this time frame, this month, how many websites contain YouTube videos. It takes a couple of minutes to process the data. In the beginning, you just set up a regex formula for the Python code to search with, take the data you downloaded from the Filecoin network, and use the command tool to start a subprocess which decompresses the data.

E: You set the dataset to read mode, then run the matcher to search through all the pages for content that matches the YouTube links. The process takes a little bit of time, and then you will see that we found about 4,242 YouTube links matched across these 56,000 websites.
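A stdlib-only sketch of that analysis over one retrieved chunk; the file name is a placeholder, the record splitting is deliberately crude, and the regex only approximates real YouTube link forms:

```python
import gzip
import re

# Placeholder: a Common Crawl WARC chunk already retrieved from Filecoin.
WARC_PATH = "crawl-chunk.warc.gz"

# Rough pattern for YouTube video links embedded in page bodies.
YOUTUBE = re.compile(rb"https?://(?:www\.)?(?:youtube\.com/watch\?v=|youtu\.be/)[\w-]+")

with gzip.open(WARC_PATH, "rb") as f:
    raw = f.read()

# Crude per-record split on the WARC version header, good enough for a count.
records = raw.split(b"WARC/1.0")
hits = sum(len(YOUTUBE.findall(rec)) for rec in records)
print(f"{hits} YouTube links across {len(records)} records")
```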
E: This shows how the Filecoin network can continuously provide value for the entire community. Thanks, everyone; this is the part of the solution we really enjoy, and we're going to participate in 2.4. Thank you.
A: Thank you so much, Alex, thanks for also being here and presenting, and for continuing to contribute to this competition through all the phases; looking forward to seeing what you pull off in the next phase. Cool. With that, I'd like to introduce our next speaker, Fei, who actually worked on storing two different datasets, but he'll be speaking specifically about the A2D2 dataset. Feel free to take over.
F: Now, this is the project I'm going to focus on this time: A2D2. It is a dataset from the official Slingshot list of datasets, this one. If you go to the website, in the download section there is a list of all the files that are available for download, these ones, and it's very simple to retrieve them, because each file in the dataset has a URL on AWS: you can just put the prefix before each file name and be able to retrieve them.
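So retrieval from the source is just prefix plus file name; a tiny sketch, where the bucket prefix and file name are placeholders rather than the real A2D2 URLs:

```python
import urllib.request

# Placeholders: the real prefix and file names come from the A2D2 download page.
PREFIX = "https://example-a2d2-bucket.s3.amazonaws.com/"
FILENAME = "camera_lidar_semantic.tar"

# Build the full URL from prefix + file name and download it locally.
urllib.request.urlretrieve(PREFIX + FILENAME, FILENAME)
```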
F: I also made a web UI specifically for each dataset that I worked on. I will demo the A2D2 one; it's on this little website here. I focused on data usability and ease of retrieval. I made a navigation pane on the left, which looks very similar to the official download page; it contains the preview and segmentation sections, and you can find them organized very similarly.
F: If you want this particular file, you can just search here; it will show up, and you can see the details of which miner it has been stored with, and you can click on the miner to get a retrieval command that you can easily copy and paste into your terminal.
F: That's all the website is about.
A: Awesome, thank you so much, Fei, and nice work working through the technical glitches; it looks like I came back just in time as well, so we caught some of the great work from you too. The search feature is fantastic and the website is great; folks, I recommend you definitely check it out, especially if you're interested in seeing what a great deal UI looks like. Thanks for being here, Fei, really appreciate it.
A: So with that, I'd like to go into the last session from a presenting team, which is on the Free Rainbow Tables dataset, and Julien is the person who was participating; some of you may have seen him on Slack as sonic42. Julien, I know you couldn't make it, but thanks so much for submitting this video that we're able to show instead. And for quality, I think I might just play my local copy of it; hopefully you can all hear this.
H: We are trusted by Protocol Labs: selected as a Fil+ notary, receiving two dev grants, a member of the miner program, and frequently invited as a presenter for Filecoin events. We are trusted by the community: more than 400 miners have deployed Forecaster for their day-to-day activities, and by leading actors like Textile and Fleek.
H: Now, let's go through the process we applied to store the rainbow tables on Filecoin. First, we downloaded and bundled each set of rainbow tables separately; we kept the same organization as the original dataset. If you are familiar with the rainbow tables and already have an incomplete set, you can download the remaining pieces from Filecoin.
H: To retrieve a file, simply go to the Free Rainbow Tables page on the Twin Quasar website, select the dataset in the left pane, select the file you want to download, see the deal ID, copy and paste the Lotus command line into a terminal, and wait for the download to complete. Repeat the process for all the files you want to download.
A: Thank you very much; thanks, Julien, for sharing your video. I know that you couldn't make it at this time, but it was great, and I appreciate you stepping up and creating the recording for us to share. Excellent work as well showing the retrieval all the way through, as well as the application in which the files can actually be used.
A: So with that, we're into the last part of today's summit. We've kept a few minutes for Q&A, but as far as I can see, all the questions that have come in via chat so far were addressed in chat as well. I'll probably just hold for a minute to see if there are any other questions in the room; if you're here, please feel free to raise your hand and share a question. If not, I've got one or two last things I'd like to chat about, and then we can wrap up.

A: Okay, so far it looks like there are no immediate questions, so I would love to show you this new thing as well that we've added to the site.
A: If you head over to slingshot.filecoin.io, for those of you that aren't as familiar with the site, this is sort of the home base for the competition. You can see the latest statistics, the rewards being distributed, the rules to participate, the prizes, and upcoming events such as this one. But I'd like to introduce you to a new tab that was just added, called Explore. This is Slingshot's data explorer.
A: Here you can check out the data that's being onboarded onto the network, in the form of the datasets that are eligible, through the various projects that are storing that data on the network, and you can discover where you can actually obtain the files for yourself. So we could probably grab one of these; perhaps we can grab Linux ISO.
A: This one, as you can see, is being stored in a bunch of different geographies; lots of different flags being covered here. The nice part about this one is that the file names are familiar: Debian, different versions of it, Fedora. So based on this, you can actually use it as an explorer to find the exact ISO file that you'd like to download, see which CID it fits into, and figure out where the miner is that's near to you, that you can actually grab this content from, and run a retrieval.

A: This has just recently been added and is still somewhat a work in progress as we continue to flesh it out. But if you have any feedback, suggestions, or ideas on ways we can make this useful, as well as just generally ensure that the data being onboarded onto Filecoin is being used, as Charles was talking about, it's all about making sure that people who can use the data have access to it and can benefit from it.