1 Jun 2022
This is a demo of deploying GitLab Observability (Opstrace) locally and using a test application to send and query traces.
The demo uses our new Golang operator code, GitLab auth, a ClickHouse backend, and an early version of our Observability UI (a Grafana fork).
- 3 participants
- 21 minutes
14 Apr 2020
This is a recording of a session between OpenTelemetry project maintainers and the GitLab Monitor Stage team members on what OpenTelemetry is and does, and how GitLab and the community, in general, can contribute to OpenTelemetry.
- 5 participants
- 58 minutes
1 Jan 2020
Speaker: Andrew Newdigate
GitLab.com’s monolithic Rails application experiences high week-on-week traffic growth. To ensure availability, GitLab’s Infrastructure team track and plan ahead in order to avoid hitting capacity limits in the application, whether these limits be CPU, database connection pools, memory, storage or any number of other finite resources. Hitting these limits could result in hours, or days, of degraded service while workarounds are put in place. With this in mind, the team set about building a set of tools on top of Prometheus recording rules and alerts to provide them with the information they need to be sufficiently forewarned, up to a month in advance, of potential resource saturation issues. If you’ve ever felt that you’re reactively responding to resource saturation issues, this session will provide practical examples of how we’re building a framework for resource planning into our SRE team workflow. We’ll be presenting our open-source solution and explaining how it works for us.
Slides: https://promcon.io/2019-munich/slides/practical-capacity-planning-using-prometheus.pdf
- 7 participants
- 28 minutes
4 Dec 2019
Andrew and Marin pair on adding a new saturation metric to the GitLab.com monitoring suite. The session resulted in merge request https://gitlab.com/gitlab-com/runbooks/merge_requests/1679.
This change was a corrective action following production incident https://gitlab.com/gitlab-com/gl-infra/production/issues/1419, which recurred exactly one week later in https://gitlab.com/gitlab-com/gl-infra/production/issues/1437.
- 3 participants
- 60 minutes
4 Dec 2019
Hordur and Andrew discuss how AutoDevOps can be better monitored using the key metrics framework used for monitoring the components of GitLab.com.
This follows an outage in the feature: https://gitlab.com/gitlab-org/configure/general/issues/9
- 2 participants
- 50 minutes
30 Jul 2019
A presentation I prepared at the request of the Self-managed Scalability Working Group.
Slides: https://docs.google.com/presentation/d/1xx8sOoWsRvw8_wHqBKbujrWQwmK-meDQrSs4zGNY53I/edit?usp=sharing
Issue: https://gitlab.com/gitlab-com/gl-infra/infrastructure/issues/7217
- 1 participant
- 17 minutes