From YouTube: Harbor Community Meeting - Jan 13, 2021
Description
CNCF Harbor's Community Zoom Meeting
A
Hey guys, thanks for joining the Harbor community meeting. For today's agenda, first I'm going to give you an update on the release and the progress of v2.2, and after that I'll pass it to Shikhar, my colleague, who will give us a presentation on how he hosts Harbor and provides it as a service to VMware employees. This week we released version 2.1.3, which includes two important fixes. The first one is a security issue.
A
We found that a user with the developer role could delete an image via the v2 API. Thanks to the community for reporting it. Let me share my screen.
A
This is the one, and in addition to that, we also fixed the issue where Docker 20.10 hit an empty-header error when pulling through a proxy cache project. So we highly recommend you upgrade Harbor to pick up these fixes. That's it for the patch release; now for version 2.2.
A
So that's the update for version 2.2. Meanwhile, we are starting to plan for the next version, 2.3. If you have any feature or idea that you think is important and that you want implemented in version 2.3, it's a good opportunity to open an issue or raise the requirement with us, so that we can discuss it together and see whether we can implement it in 2.3.
A
Okay. So that's the update for the open source release, and now I'm going to pass it to my colleague Shikhar to give us his sharing. Shikhar?
B
Yeah, very good. Thank you, Daniel. Welcome everyone, and thank you for taking the time to hear me. Let me share my screen.
C
B
Good. So let me start off by talking about the "why": why does VMware need its own container registry service? First of all, VMware business units often use a variety of external services to distribute their containers. They use Bintray from JFrog, they use Docker Hub, they use a registry from your favorite public cloud provider, ECR, GCR, and so forth. So the universe of how containers are distributed at VMware is very distributed.
B
Obviously, many of these services are commercial, so they cost money, and probably the biggest issue is that they add risk to the overall business. Most everyone, I hope, is aware that, for example, Docker Hub announced that as of November 1st, 2020, all free accounts would be limited in terms of their pull volume, by number of pulls per unique IP address.
B
For an enterprise like VMware, where the externally visible IPs are few in number, this causes a lot of hardship. Similarly, even on the commercial side, Bintray from JFrog has recently been EOL'd and is being replaced by a new service, Artifactory. That causes friction among the development teams, who have to change all of their pipelines because they have to migrate to a new service.
B
And finally, VMware wants better control over its own digital supply chain and external distribution. We want to understand much better, from a governance point of view, what's being distributed and when. When we're pulling containers, what are the different layers, what are the different artifacts we're pulling from the different sources, so that we can track and make sure we have full visibility into what we distribute to our customers and there's nothing being pulled that's hidden from us.
B
So that's the "why". What did we do? We stood up two separate instances of Harbor for separation of concerns and risk mitigation. This happened in mid-October of last year. We deployed them in VMware data centers in the US, one in California and the other in Washington.
B
These are our two main data centers for the company, and the instances are deployed in an active-active mode, so we have failover if one data center or one instance has an issue. We actually call these instances by different names for ease of differentiation, so that we can understand what people are using them for. One is called Corporate Harbor.
B
This is purely for internal access, for builds. It's also configured as a proxy cache to Docker Hub. As of last week, we were seeing 400,000-plus weekly pulls through the proxy cache. That gives you an indication of the amount of volume VMware was pulling from a third party, which in a way was very surprising to us.
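For readers following along, here is a minimal sketch of how a Docker Hub proxy cache project like the one described can be set up, assuming Harbor's v2.0 REST API endpoints for registries and projects; the hostname, credentials, and project name are placeholders, not details from the talk.

```python
# Hedged sketch: create a Docker Hub proxy cache project via the Harbor API.
# Hostname, credentials, and names below are illustrative assumptions.
import requests

HARBOR = "https://corporate-harbor.example.com/api/v2.0"  # hypothetical URL
AUTH = ("admin", "CHANGE_ME")

# 1. Register Docker Hub as a registry endpoint.
registry = {
    "name": "docker-hub",
    "type": "docker-hub",
    "url": "https://hub.docker.com",
    "description": "Upstream for the proxy cache",
}
r = requests.post(f"{HARBOR}/registries", json=registry, auth=AUTH)
r.raise_for_status()
registry_id = int(r.headers["Location"].rsplit("/", 1)[-1])

# 2. Create a project that proxies that endpoint.
project = {
    "project_name": "dockerhub-proxy",
    "registry_id": registry_id,          # this is what makes it a proxy cache
    "metadata": {"public": "true"},
}
requests.post(f"{HARBOR}/projects", json=project, auth=AUTH).raise_for_status()

# Clients would then pull through it, e.g.:
#   docker pull corporate-harbor.example.com/dockerhub-proxy/library/ubuntu:20.04
```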
B
We knew as a company that we made extensive use of Docker Hub, but seeing these kinds of numbers, and this has been growing on a weekly basis since we implemented the proxy cache in mid-October. As of last Friday, there were 220 projects hosted within this instance and almost 1,900 repos, and those numbers just continue to grow week over week.
B
We have a second instance, which we call Distribution Harbor. This one is externally exposed, and it allows artifacts to be accessed by customers and partners. Again, this is to remove the dependency on third-party registries, particularly for open source projects. Many of the VMware business units are looking to potentially start using Distribution Harbor instead of other third-party services for distributing open source artifacts into the public domain.
B
So those are the two instances we have. They're both identically configured in terms of release; they're both running 2.1.1 right now, and we are actively waiting for 2.2 to be released and go GA because of features beyond the Docker 20.10 client fix, which is in 2.2 and also in 2.1.3.
B
This is a very busy slide, but the main point I would like to get across to the community is that within VMware we are very strong proponents of dogfooding. In other words, we like to use what we build and what we sell to our customers.
B
This is the underlying infrastructure our Harbor instances are deployed on. We have deployed on vSphere with Kubernetes, with three master nodes, and we are able to scale up the worker nodes as we need to. The underlying networking is NSX-T, and since we are a large enterprise, as I mentioned, redundancy and availability are key for us.
B
We do this by deploying multiple master nodes in the control plane, as well as multiple worker nodes. We also take it a step further. For example, we have anti-affinity rules set up on the vSphere cluster to ensure that the master node VMs are not all on the same host, so that no single point of failure can take down our entire instance. And finally, we know that for Kubernetes, etcd is the heartbeat of the entire cluster.
B
What we do is regularly, not manually but on a schedule, snapshot the etcd database so that we're able to roll back and reinstall if anything happens to etcd. I can go into details on each one of these points, but just to reiterate: for our deployment we are running the entire VMware Tanzu stack for Harbor, and since we went live in mid-October we have not seen any downtime yet, so we're very pleased with that.
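As an illustration of the "regularly snapshot etcd" idea, not the team's actual tooling, a scheduled backup can be as simple as a small wrapper around etcdctl run from cron or a Kubernetes CronJob; the endpoint and certificate paths below are placeholders.

```python
# Hedged sketch: periodic etcd snapshot via `etcdctl snapshot save`.
import datetime
import os
import subprocess

ENDPOINT = "https://127.0.0.1:2379"          # hypothetical etcd endpoint
CERTS = {
    "--cacert": "/etc/kubernetes/pki/etcd/ca.crt",
    "--cert": "/etc/kubernetes/pki/etcd/server.crt",
    "--key": "/etc/kubernetes/pki/etcd/server.key",
}

def snapshot_etcd(backup_dir: str = "/var/backups/etcd") -> str:
    """Take one etcd snapshot and return the file path it was saved to."""
    stamp = datetime.datetime.utcnow().strftime("%Y%m%dT%H%M%SZ")
    target = f"{backup_dir}/etcd-{stamp}.db"
    cmd = ["etcdctl", "--endpoints", ENDPOINT]
    for flag, path in CERTS.items():
        cmd += [flag, path]
    cmd += ["snapshot", "save", target]
    subprocess.run(cmd, check=True, env={**os.environ, "ETCDCTL_API": "3"})
    return target

if __name__ == "__main__":
    print("Saved snapshot to", snapshot_etcd())
```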
B
In terms of operational support, as I mentioned, this is maintained as a production-level service within IT. What does that mean? We have 24x7x365 monitoring of the entire infrastructure. That includes the underlying Kubernetes, the storage, which is Cloud Native Storage, and CPU, as well as the entire network path, which is checked regularly: firewall, load balancer, the WAF, DNS.
B
All of that is regularly checked. We have heartbeats going through that do pulls every minute through our Harbor instances, as well as checking for performance.
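For context, a rough sketch of what a once-a-minute "heartbeat pull" check against a Harbor instance could look like follows; the registry hostname, project, and image are assumptions, not details from the talk, and a real setup would feed a metrics or alerting system rather than print.

```python
# Hedged sketch: heartbeat check that pulls a small test image every minute.
import subprocess
import time

REGISTRY = "corporate-harbor.example.com"          # hypothetical hostname
IMAGE = f"{REGISTRY}/dockerhub-proxy/library/alpine:3.12"

def heartbeat() -> float:
    """Pull a small test image and return how long the pull took in seconds."""
    start = time.monotonic()
    subprocess.run(["docker", "pull", IMAGE], check=True, capture_output=True)
    return time.monotonic() - start

while True:
    try:
        elapsed = heartbeat()
        print(f"OK  pull took {elapsed:.1f}s")
    except subprocess.CalledProcessError as err:
        print("FAIL", err.stderr.decode(errors="replace"))
    time.sleep(60)  # once per minute, as described in the talk
```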
B
We have Slack; on the right I'm just showing a little bit of a screenshot. We have a notifications page where we show all of the different major services that we offer within our engineering services, and both Harbor instances are listed there, so users can subscribe to that.
B
If there's any change in operational performance, they'll get a notification through Slack and email. On the bottom I'm showing you a picture of the Tanzu monitoring, just to show how we're monitoring the Kubernetes layer as well. So we are actively treating this as a key, critical IT service, just like any of the other services that we offer and support within VMware for the engineering community.
B
So that's where we are today. Where are we going tomorrow? From the Harbor registry side, we are actively looking at and waiting for support for additional proxy cache upstreams, such as ECR and GCR; that's targeted for 2.2, which, as I mentioned, we are very interested in. We're also waiting for the Kubernetes operator enabling day-one and day-two operations, so that we can replace the Helm chart for deployment. That's of interest to us.
B
Of particular interest is the Prometheus integration in 2.2, again to enhance our telemetry of what we're seeing and how we're doing active monitoring of these instances as we grow. That is of very big importance to us. Finally, more granular role-based access control is something we want, especially since we have multiple groups that are actively monitoring and looking at the service; this will enable us to have better governance of who's accessing Harbor.
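As a hedged sketch of how that telemetry might be consumed, the snippet below reads a Prometheus-style metrics endpoint of the kind the 2.2 exporter is expected to provide; the port, path, and metric-name prefix are assumptions for illustration, not confirmed by the talk.

```python
# Hedged sketch: scrape and group Prometheus text-format metrics from Harbor.
import requests

METRICS_URL = "http://corporate-harbor.example.com:9090/metrics"  # placeholder

def fetch_metrics(url: str = METRICS_URL) -> dict:
    """Parse the text exposition format into {metric_name: [raw sample lines]}."""
    samples = {}
    for line in requests.get(url, timeout=10).text.splitlines():
        if not line or line.startswith("#"):
            continue
        name = line.split("{", 1)[0].split(" ", 1)[0]
        samples.setdefault(name, []).append(line)
    return samples

if __name__ == "__main__":
    metrics = fetch_metrics()
    # Print whatever Harbor-related series the exporter exposes.
    for name in sorted(m for m in metrics if m.startswith("harbor_")):
        print(name, len(metrics[name]), "samples")
```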
B
From the service side, we are looking to expand our geographic footprint as needed. As I mentioned in the beginning, we're deployed in two VMware data centers for redundancy, but both of them are on the west coast of the US, so that doesn't provide us with geographic coverage in terms of performance if we start to see issues with latency and throughput.
B
So we are prepared to expand into the EMEA and APAC regions with redundant nodes and then do replication across all of the different sites if needed. At this point we are not seeing the need so far. We actively run scale tests, and throughput performance has been sustained for large-scale tests.
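To make the "replication across sites" option concrete, here is a minimal sketch assuming Harbor's v2.0 replication API, pushing content from the existing instance to a hypothetical future EMEA node; hostnames, IDs, and the cron expression are illustrative only.

```python
# Hedged sketch: register a remote Harbor and create a scheduled push policy.
import requests

HARBOR = "https://corporate-harbor.example.com/api/v2.0"   # placeholder
AUTH = ("admin", "CHANGE_ME")

# 1. Register the remote (EMEA) Harbor as a registry endpoint.
remote = {
    "name": "emea-harbor",
    "type": "harbor",
    "url": "https://emea-harbor.example.com",
}
r = requests.post(f"{HARBOR}/registries", json=remote, auth=AUTH)
r.raise_for_status()
remote_id = int(r.headers["Location"].rsplit("/", 1)[-1])

# 2. Create a push-based replication policy on a nightly schedule.
policy = {
    "name": "replicate-to-emea",
    "dest_registry": {"id": remote_id},
    "trigger": {"type": "scheduled",
                "trigger_settings": {"cron": "0 0 2 * * *"}},
    "enabled": True,
}
requests.post(f"{HARBOR}/replication/policies", json=policy,
              auth=AUTH).raise_for_status()
```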
B
For example, it averaged between two and two and a half gigabits per region when we ran large tests of a thousand pulls from Hong Kong or Amsterdam, so we're keeping an eye on it, but we're prepared to expand as needed. The other aspect we're actively looking at is vulnerability scanning. Harbor comes packaged with Trivy and Clair, which is being deprecated, but VMware also has its own specific scanning requirements around open source and other things.
B
So we're also looking at and working with some of the teams within VMware to see how we could add those scanning options via the scanning plug-in, an open API within Harbor, so that when artifacts are uploaded and pushed into Harbor, the scanning happens according to VMware requirements.
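In the spirit of that pluggable-scanner idea, a hedged sketch of registering a custom scanner adapter with Harbor follows; the adapter name, URL, and request shape are assumptions, not the team's actual integration.

```python
# Hedged sketch: register a custom scanner adapter via Harbor's scanner API.
import requests

HARBOR = "https://corporate-harbor.example.com/api/v2.0"   # placeholder
AUTH = ("admin", "CHANGE_ME")

scanner = {
    "name": "vmware-oss-scanner",                       # hypothetical adapter
    "description": "Internal open source scanning adapter",
    "url": "https://oss-scanner-adapter.internal.example.com",
    "auth": "",            # or "Basic"/"Bearer" plus an access credential
}
resp = requests.post(f"{HARBOR}/scanners", json=scanner, auth=AUTH)
resp.raise_for_status()
print("Registered scanner at:", resp.headers.get("Location"))
```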
B
That's really all I have at this point. It was a short update on Harbor and how we're using it. We are very pleased with the response from the community, from the Harbor team, and from our users. The growth has been essentially exponential in terms of users and volume, and from the feedback we're getting, the users of the service have generally been very pleased and appreciative of it. So I'm open to questions.
B
If anyone has any questions about the service or about Harbor, please feel free.
C
I have one question regarding the deployment. You talked about mid-October, when you created distinct instances for separation of concerns, but of course this was being planned long before that, and around mid-October we also got the announcement from Docker about the Docker Hub throttling. What was the response from your users when that announcement happened, and what was your initial response?
B
The Corporate Harbor internal instance was already running for a while before that; it wasn't like we just stood it up in mid-October. It was there, but by mid-October we went live in a big way. We evangelized to the internal community that the registry was up, that the proxy cache was configured and put into place, and published all of the readmes and documentation on how to create a registry.
B
How to create your own service, the automation around creating your own project in a self-service manner; that all happened around mid-October. What we saw is that before mid-October we probably had 50 to 75 projects that people were using on a playground basis.
B
But what happened in mid-October is that people started realizing they couldn't just, I would say, without thinking, go to Docker Hub and have everything come in, and their pipelines started breaking. People started moving their production pipelines directly onto the Harbor registries, so it became much more than just a playground to try things out with.
C
So yeah, you could say it was kind of a forcing function as well to move onto this.
B
It was a forcing function as well, and from a VMware perspective, from a big-company perspective, many of the groups were also looking at it as: without something like this, maybe I'll just get a paid account with Docker. Multiply that out by a number of different groups and there's the monetary aspect.
B
But again, one of the points of launching the service is that we, as an organization, as a large enterprise, want better governance and understanding of what we're using. By having everything go through the proxy cache and having teams use it, we're getting a lot of visibility into what people are using, which is something we never had before.
B
We can see the pull volumes, and for me, as an administrator of the service, I find it interesting because you can see that a new project must have started up, because now I see, for example, the golang pulls going through the roof, and this other project's pull volume has died off, I don't see anything there, so maybe they have what they need.
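For readers who want to do something similar, a small, hedged sketch of pulling per-repository pull counts via the Harbor v2.0 API follows; the hostname and credentials are placeholders, and pagination is simplified to the first page of results.

```python
# Hedged sketch: list the most-pulled repositories across projects.
import requests

HARBOR = "https://corporate-harbor.example.com/api/v2.0"   # placeholder
AUTH = ("admin", "CHANGE_ME")

def top_repositories(limit: int = 20):
    """Return (repository, pull_count) pairs sorted by pull count."""
    rows = []
    projects = requests.get(f"{HARBOR}/projects", auth=AUTH,
                            params={"page_size": 100}).json()
    for project in projects:
        name = project["name"]
        repos = requests.get(f"{HARBOR}/projects/{name}/repositories",
                             auth=AUTH, params={"page_size": 100}).json()
        for repo in repos:
            rows.append((repo["name"], repo.get("pull_count", 0)))
    return sorted(rows, key=lambda r: r[1], reverse=True)[:limit]

if __name__ == "__main__":
    for repo, pulls in top_repositories():
        print(f"{pulls:>8}  {repo}")
```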
B
That's an interesting aspect; yeah, it is. For the folks that are in VMware, I publish the weekly pulls and the top pulled things, just as an interesting data point. There are a few things that come out of it, because sometimes we're pulling things that we're trying to get away from for corporate reasons, for various legal or other reasons, yet we're still pulling artifacts from them. So that raises eyebrows as well.
A
Yeah, and it would be interesting, you know, to use the data you have in the Corporate Harbor to analyze what technology is popular within our company. Interesting topics, maybe.
B
Oh, absolutely. As an example, for a while we saw the maven pulls creeping up, and now they've slowed down to a crawl. So either it means people are getting away from maven, or maybe they're just hitting the cache.
B
Because it's already populated. I mean, it's hard to make clear cause-and-effect determinations, but you can see things like the number of ubuntu pulls going through the roof out of nowhere in mid-November.
A
D
Yeah, sure. This is Steven Zou from the Harbor team. You mentioned in your slides that there will be some other scanning requirements from VMware. Do we already have some requirements collected, or are we just planning to collect more requirements?
B
Actually, from the VMware side, the ASPI team is already starting to experiment with the scanning API, and I know there are a few things they're working on. I think there are some tickets they've opened with the Harbor team for the things they don't understand, so I can put you into that thread if you're not aware. But they have already started to play around with the API to understand how they could potentially use OSP, which is, for the folks outside of VMware...
B
OSP is essentially our wrapper around Black Duck for open source scanning, so that we understand, when we're shipping any type of commercial product, what open source packages and what versions we are using, both from a visibility and commercial aspect and to make sure we are using the right versions.
B
So, Steven, to your question: they have started to engage. Milan's team has started to engage with the API, and I know there have been some tickets opened with Harbor for some issues they've had.
B
No, this is actually the CPE team, so this is not Carbon Black. This is open source scanning that they're starting to look at, not so much vulnerability or static code scanning.
D
Okay, because I talked to the Carbon Black team; they are also working on an adapter to introduce their Carbon Black capability into Harbor to do the scanning work, so I thought, yeah.
B
Okay, so actually, put me in touch with them; I did not know that was happening.
D
A
B
Yeah, okay. Actually, we are running on a shared cluster, so we don't have a dedicated cluster for Harbor; this is a shared IT cluster that runs a lot of critical IT applications, like HelpNow, which is our internal ticketing system, and a few other things. So this is a critical cluster, and we maintain it because a lot of other things depend upon it.
B
A
B
In the past the answer is probably yes, but I don't know; we can talk to Manas about that if you want some more data around that point.
A
B
A
Okay, thank you for the great sharing, Shikhar. Next, I think I'll open the mic to the other participants to see if there is any topic we want to discuss in this meeting. If not, I'm going to close this meeting and finish early. Do we have any other topic?
D
A
B
Is 2.3 looking to be about three months later, sort of a May time frame?
A
Yeah, we are considering that. We need to roll out three or four releases a year, so depending on the workload we plan for 2.3, it may be targeting May or June or even July.
B
A
So if you have any requirements, now is a good time; I think in the next few weeks you can open issues. I'm also going to leverage the Discussions feature on GitHub to engage other community members to discuss what the high priorities are for 2.3.
B
Okay, that's fair. Actually, the team here, myself, Manas, and Govind, is trying to put together, I'll call it a PRD, of things we would like to see from Harbor, from both a feature and an operational perspective, just from our experiences.
A
Okay, if there's no other topic, I'm going to end this meeting. Thanks everyone for joining us. See you guys next time. Take care.