Ceph / Ceph Virtual 2022

These are all the meetings we have in "Ceph Virtual 2022" (part of the organization "Ceph"). Click into individual meeting pages to watch the recording and search or read the transcript.

23 Nov 2022

Presented by: Jiffin Tony Thottan

Introduction to Container Object Storage Interface aka COSI for ceph RGW

For applications in Kubernetes, CSI provides a way to consume file/block storage for their workloads. The main motivation behind the Container Object Storage Interface (COSI) is to provide a similar experience for object stores. The basic idea is to provide a generic, dynamic provisioning API for consuming object storage, so that app pods can access a bucket in the underlying object store much like a PVC. The major challenge for this implementation is that there is no standard protocol defined for objects, and the COSI project needs to be vendor agnostic. It won't handle the orchestration/management of the object store; rather, it acts as another client and provides bucket access on behalf of applications running in Kubernetes. The initial version of the ceph-cosi driver can be found at https://github.com/ceph/ceph-cosi.
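
As an illustration of the "bucket as a PVC" idea, here is a minimal sketch of how an application pod might consume a COSI-provisioned bucket, assuming the bucket coordinates and credentials are injected into the pod by the driver (the environment variable names below are hypothetical, not part of the COSI spec):

    import os
    import boto3

    # Hypothetical variable names: how the bucket endpoint and credentials reach
    # the pod (env vars vs. a mounted secret) depends on the COSI driver in use.
    endpoint = os.environ["BUCKET_ENDPOINT"]        # e.g. the RGW service URL
    bucket = os.environ["BUCKET_NAME"]              # bucket provisioned by ceph-cosi
    access_key = os.environ["BUCKET_ACCESS_KEY"]
    secret_key = os.environ["BUCKET_SECRET_KEY"]

    s3 = boto3.client(
        "s3",
        endpoint_url=endpoint,
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
    )

    # The application just uses the bucket; provisioning and access grants were
    # handled by the COSI controller and the ceph-cosi driver.
    s3.put_object(Bucket=bucket, Key="hello.txt", Body=b"hello from a pod")
    print(s3.list_objects_v2(Bucket=bucket).get("KeyCount", 0), "objects in bucket")
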
  • 2 participants
  • 27 minutes
storage
csa
kubernetes
csi
topics
project
handled
volume
cefcocet
ssf

23 Nov 2022

Presented by: Jonas Pfefferle
NVMe-over-Fabrics support for Ceph

NVMe-over-Fabrics (NVMeoF) is an open, widely adopted, de facto standard for high-performance remote block storage access. More and more storage vendors are introducing NVMeoF target support, with hardware offloads for both NVMeoF targets and initiators. Ceph does not support the NVMeoF protocol for block storage access; its clients use the Ceph RADOS protocol to access RBD images for good reason: RADOS is a distributed m-to-n protocol that provides reliable object access to sharded and replicated (or erasure-coded) Ceph storage. However, there are good reasons to enable NVMeoF for Ceph: to enable its use in datacenters that already deploy hardware with NVMeoF offload capabilities, and to allow existing NVMeoF storage users to easily migrate to Ceph. In this talk we present our effort to integrate a native NVMeoF target for Ceph RBD. We discuss some of the challenges of implementing such support in Ceph, including subsystem/namespace discovery, multi-pathing for fault tolerance and performance, and authentication and access control (e.g., namespace masking). Furthermore, we describe how the NVMeoF target design can be extended to reduce additional network hops by leveraging the Ceph CRUSH algorithm (ADNN).
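
As a hedged sketch of what consuming such a target could look like from a client host, the standard nvme-cli discovery/connect flow is shown below via Python's subprocess module; the address, port, and subsystem NQN are placeholders that a real Ceph NVMeoF gateway would advertise:

    import subprocess

    # Placeholder values; a real deployment would advertise these through the
    # NVMe-oF discovery service exposed by the Ceph gateway.
    target_addr = "192.0.2.10"
    target_port = "4420"
    subsystem_nqn = "nqn.2016-06.io.example:rbd-image"   # hypothetical NQN

    # Standard nvme-cli flow: discover the target's subsystems, then connect.
    subprocess.run(["nvme", "discover", "-t", "tcp",
                    "-a", target_addr, "-s", target_port], check=True)
    subprocess.run(["nvme", "connect", "-t", "tcp", "-n", subsystem_nqn,
                    "-a", target_addr, "-s", target_port], check=True)

    # The RBD image now appears as a local /dev/nvmeXnY block device.
    subprocess.run(["nvme", "list"], check=True)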

Event link: https://ceph.io/en/community/events/2022/ceph-virtual/
  • 2 participants
  • 41 minutes
nvm
routers
fabric
vpn
threads
connection
users
protocols
envy
offloading

23 Nov 2022

Presented by: Deepika Upadhyay, Gaurav Sitlani, and Subham K Rai

Troubleshooting and Debugging in the rook-ceph cluster

This session shares an overview of troubleshooting a rook-ceph cluster and discusses some troubleshooting scenarios. This is followed by an introduction to and overview of the kubectl-rook-ceph krew plugin, and how it makes managing containers easier from a troubleshooting perspective. We'll also discuss the future issues we're planning to solve with it, followed by a short roadmap of the Rook project. Finally, we look forward to discussing and gathering feedback from users about common and challenging problems they face while troubleshooting their clusters.
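
For context, the plugin's basic pattern is to forward Ceph commands into the cluster so you don't have to exec into the toolbox pod yourself. A minimal sketch, assuming the krew plugin is installed and Rook runs in its default namespaces:

    import subprocess

    # Assumes the kubectl-rook-ceph krew plugin is installed and Rook runs in
    # its default namespaces; the plugin forwards "ceph ..." commands into the
    # cluster on your behalf.
    def rook_ceph(*args):
        cmd = ["kubectl", "rook-ceph", "ceph", *args]
        return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout

    # Typical first steps in a troubleshooting session: overall health, then OSD usage.
    print(rook_ceph("status"))
    print(rook_ceph("osd", "df"))
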
  • 4 participants
  • 42 minutes
troubleshooting
rook-ceph
rook
debugging
monitors
cluster
kotak
workflow
logs
prepared

22 Nov 2022

Presented by: Nizamudeen A
Event: https://ceph.io/en/community/events/2022/ceph-virtual/
  • 4 participants
  • 32 minutes
dashboard
dashboards
advanced
manage
monitoring
safe
version
workflow
troubleshooting
sf

15 Nov 2022

Presented by: Federico Lucifredi & Ana McTaggart

Data Security and Storage Hardening In Rook and Ceph

We explore the security model exposed by Rook with Ceph, the leading software-defined storage platform of the Open Source world. Digging progressively deeper into the stack, we examine hardening options for Ceph storage appropriate for a variety of threat profiles. Options include defining a threat model, limiting the blast radius of an attack by implementing separate security zones, the use of encryption at rest and in flight with FIPS 140-2 validated ciphers, hardened builds and default configurations, as well as user access controls and key management. Data retention and secure deletion are also addressed. The very process of containerization creates additional security benefits through lightweight separation of domains. Rook makes applying hardening options easier: it becomes a matter of simply modifying a .yaml file with the appropriate security context upon creation, making it a snap to apply Ceph's standard hardening options to a container-based storage system.
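
To make two of those layers concrete, here is a minimal sketch using the underlying Ceph tooling directly; the device path is a placeholder, and Rook exposes equivalent settings declaratively in its CRDs rather than via these commands:

    import subprocess

    # Encryption at rest: create an OSD on a dm-crypt encrypted device.
    # (/dev/sdX is a placeholder; with Rook this is requested in the cluster CRD.)
    subprocess.run(["ceph-volume", "lvm", "create", "--data", "/dev/sdX", "--dmcrypt"],
                   check=True)

    # Encryption in flight: require msgr2 "secure" mode on all connection types.
    for opt in ("ms_cluster_mode", "ms_service_mode", "ms_client_mode"):
        subprocess.run(["ceph", "config", "set", "global", opt, "secure"], check=True)
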
  • 5 participants
  • 43 minutes
administrator
hosts
hacking
mike
protocols
security
ceph
staff
linux
acknowledgment

15 Nov 2022

Presented by: Gabryel Mason-Williams

DisTRaC: Accelerating High-Performance Compute Processing for Temporary Data Storage

There is a growing desire within scientific and research communities to start using object stores to store and process their data in high-performance computing (HPC) clusters. However, object stores are not necessarily designed for performance and are better suited to long-term storage. Therefore, users often use a high-performance file system when processing data. However, networked filesystems have the issue that one user can potentially thrash the network and affect the performance of everyone else's data-processing jobs in the cluster. This talk presents DisTRaC ((Dis)tributed (T)ransient (Ra)m (C)eph), which offers a solution to this problem by providing a method for users to deploy Ceph onto their HPC clusters using RAM. Intermediate data processing can then be done in RAM, taking the pressure off the networked filesystem by using the node interconnect to transfer data. In addition, all the data is localized, creating a hyper-converged HPC cluster for the duration of the job. DisTRaC reduces the I/O overhead of the networked filesystem and offers a potential data-processing performance increase.
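
A rough sketch of the underlying idea only (not DisTRaC itself; see the project link below): stage a RAM-backed block device and hand it to ceph-volume, so the OSD's data lives entirely in memory for the duration of the job. Device paths and sizes are placeholders:

    import os
    import subprocess

    RAM_MOUNT = "/mnt/ramdisk"      # placeholder mount point
    IMG = f"{RAM_MOUNT}/osd.img"    # file backing the loop device
    SIZE = "16G"                    # placeholder size; must fit in node RAM

    # Back a loop device with a file on tmpfs, i.e. with RAM.
    os.makedirs(RAM_MOUNT, exist_ok=True)
    subprocess.run(["mount", "-t", "tmpfs", "-o", f"size={SIZE}", "tmpfs", RAM_MOUNT],
                   check=True)
    subprocess.run(["truncate", "-s", SIZE, IMG], check=True)
    loopdev = subprocess.run(["losetup", "--find", "--show", IMG],
                             capture_output=True, text=True, check=True).stdout.strip()

    # Create an OSD on the RAM-backed device; tear everything down when the job ends.
    subprocess.run(["ceph-volume", "lvm", "create", "--data", loopdev], check=True)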

Learn more about DisTRaC: https://github.com/rosalindfranklininstitute/DisTRaC
Event: https://ceph.io/en/community/events/2022/ceph-virtual/
  • 2 participants
  • 16 minutes
distract
institute
research
project
mechanism
facility
maintainable
informatics
workflow
hpc

15 Nov 2022

Presented by: Josh Salomon & Laura Flores

New workload balancer in Ceph

One of the new features in the Quincy release is the introduction of a new workload balancer (aka primary balancer). While capacity balancing has existed and worked well since the introduction of the upmap balancer, primary balancing, which evens out the load across all the OSDs, was never handled. This proves to be a performance problem, especially in small clusters and in pools with fewer PGs. In this presentation we will discuss the difference (and sometimes the contradiction) between capacity balancing and workload balancing, explain what we did for Quincy, and outline future plans to further improve the Ceph balancing process.
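
To make the capacity-vs-workload distinction concrete, the sketch below counts how many PGs each OSD is acting primary for, i.e. how the read-serving work is spread today. It assumes a running cluster; the JSON field names follow recent ceph pg ls output and may vary slightly between releases:

    import json
    import subprocess
    from collections import Counter

    # Capacity balancing evens out data; workload (primary) balancing evens out
    # which OSD is the acting primary, i.e. which OSD serves the reads.
    raw = subprocess.run(["ceph", "pg", "ls", "--format", "json"],
                         capture_output=True, text=True, check=True).stdout
    data = json.loads(raw)
    pg_stats = data["pg_stats"] if isinstance(data, dict) else data

    primaries = Counter(pg["acting_primary"] for pg in pg_stats)
    for osd, count in sorted(primaries.items()):
        print(f"osd.{osd} is primary for {count} PGs")
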
  • 3 participants
  • 55 minutes
balancer
balancers
rebalancer
capacity
workloads
operation
storage
important
reads
upstream

15 Nov 2022

Presented by: Chunsong Feng

Optimize Ceph messenger Performance

1. NIC SR-IOV is used; each OSD gets an exclusive VF NIC.
2. A DPDK interrupt mode is added.
3. A single CPU core handling multiple NIC queues is implemented to improve performance.
4. An admin socket command is added to obtain NIC status, collect statistics, and locate faults.
5. Ceph throttling parameters and the TCP and DPDK packet send/receive buffer sizes are tuned to prevent packet loss and retransmission.
6. The Crimson messenger component uses Seastar DPDK.
  • 1 participant
  • 21 minutes
optimize
efficient
optimal
tcpm
compression
rdm
maintenance
tester
ost
dps

15 Nov 2022

Presented by: Daniel Gryniewicz

RGW Zipper

RGW was developed to provide object access (S3/Swift) to a Ceph cluster. The Zipper abstraction API divides RGW into an upper half containing the Operations (Ops) for S3 and Swift, and a lower half, called a Store, containing the details of how to store data and metadata. This allows the same Ops code to provide correct S3 and Swift semantics on top of a variety of storage platforms. The primary Store is the current RadosStore, which provides access to a Ceph cluster via RADOS. However, new Stores are possible that keep the data in any desired platform. One such Store, DBStore, has been developed; it stores data in SQL, specifically in a local SQLite database. Additional Stores, such as an S3 Store, are planned to provide further flexibility. Zipper also allows intermediate Filter layers that can transform Ops, apply policy (such as directing different objects to different Stores), or cache data and metadata. The first planned Filter is a LuaFilter, which will allow rapid prototyping and testing of other filters. An individual instance of RGW will consist of a stack of Filters, along with one or more Stores providing the actual data. This presentation covers Zipper, the existing DBStore, and plans for the future.
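
The layering described above can be pictured with a small conceptual sketch; Python is used purely for illustration, and the class and method names are made up (the real Zipper interfaces are C++ inside RGW):

    # Conceptual sketch of the Zipper layering only; not the real C++ API.
    class Store:
        """Lower half: knows how to persist data and metadata somewhere."""
        def put_object(self, bucket, key, data):
            raise NotImplementedError

    class RadosLikeStore(Store):
        def put_object(self, bucket, key, data):
            print(f"store {bucket}/{key} in RADOS pools")              # RadosStore-like

    class DBLikeStore(Store):
        def put_object(self, bucket, key, data):
            print(f"store {bucket}/{key} in a local SQLite database")  # DBStore-like

    class SizePolicyFilter(Store):
        """A 'Filter': sits between the Ops layer and Stores, routing or transforming ops."""
        def __init__(self, small_store, big_store, threshold=1 << 20):
            self.small, self.big, self.threshold = small_store, big_store, threshold
        def put_object(self, bucket, key, data):
            target = self.small if len(data) < self.threshold else self.big
            target.put_object(bucket, key, data)

    # The upper half (S3/Swift Ops code) only ever talks to the top of the stack.
    stack = SizePolicyFilter(small_store=DBLikeStore(), big_store=RadosLikeStore())
    stack.put_object("photos", "cat.jpg", b"\xff" * 4096)
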
  • 1 participant
  • 14 minutes
storage
s3
rs3
zipper
rgw
apps
cache
protocol
compressed
unzipping

15 Nov 2022

Presented by: Satoru Takeuchi

Revealing BlueStore Corruption Bugs in Containerized Ceph Clusters

Cybozu has been running and testing their Rook/Ceph clusters for two years. During this time, they have suffered a number of BlueStore corruptions (e.g., #51034 and #53184). Most corruptions happened just after OSD creation or on restarting OSDs. They have been able to detect these problems because the nodes in their clusters are restarted frequently and many OSDs are created for each integration test. These scenarios are not so common in traditional Ceph clusters but are typical of containerized Ceph clusters. They will share what the known problems are in detail and how they have overcome them together with the Ceph community. In addition, they will propose improvements to the QA process to prevent similar problems in the future.
  • 1 participant
  • 24 minutes
cyborg
cyborgs
cyber
containers
virtualization
infrastructures
kubernetes
machines
modern
firmware

14 Nov 2022

Presented by: Ziye Yang

Accelerating PMEM Device operations in bluestore with hardware based memory offloading technique

With more and more fast devices (especially persistent memory) being deployed in data centers, there is great pressure on the CPU to drive those devices (e.g., Intel Optane DC persistent memory) for persistency purposes under heavy workloads, because persistent memory, unlike HDDs and SSDs, provides no DMA-related capability. The same issue exists in Ceph when using persistent memory. We would like to address this pain point by leveraging memory offloading devices (e.g., DSA). In this talk we will: 1) explain why persistent memory integration has not been very successful in Ceph, due to the high CPU overhead of performing I/O operations on the persistency device; 2) introduce memory offloading devices (e.g., DSA) that can offload the CPU pressure of doing I/O; 3) describe the main changes in the pmem device code (i.e., src/blk/pmemdevice.cc) and how we achieve the offloading, including the challenges; and 4) share some early performance results if Intel's SPR platform is available to the public.
  • 1 participant
  • 28 minutes
cpus
computing
hdm
memory
virtual
devices
ssds
configuration
io80
bottleneck

14 Nov 2022

Presented by: Gal Salomon

S3select: Computational Storage in S3

S3 Select is an S3 operation (introduced by Amazon in 2018) that implements a pushdown paradigm: it pulls out only the data you need from an object, which can dramatically improve the performance and reduce the cost of applications that need to access data in S3. The talk will introduce the s3select operation and architecture. It will describe what the pushdown technique is and why and where it benefits the user. It will cover s3select's supported features and their integration with analytics applications. It will discuss the main differences between columnar and non-columnar formats (CSV vs. Parquet). We'll also discuss recent developments for ceph/s3select. The presentation will show how easy it is to use ceph/s3select.
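
A minimal sketch of the pushdown in practice with boto3, assuming an RGW endpoint with s3select enabled and a CSV object already uploaded; the endpoint, credentials, bucket, and key are placeholders:

    import boto3

    # Placeholders: point boto3 at an RGW endpoint with s3select enabled.
    s3 = boto3.client("s3", endpoint_url="http://rgw.example.com:8000",
                      aws_access_key_id="ACCESS", aws_secret_access_key="SECRET")

    resp = s3.select_object_content(
        Bucket="sales",
        Key="orders.csv",
        ExpressionType="SQL",
        # Pushdown: only the matching rows and columns leave the object store.
        Expression="SELECT s._1, s._3 FROM S3Object s WHERE CAST(s._3 AS FLOAT) > 100.0",
        InputSerialization={"CSV": {"FileHeaderInfo": "NONE"}, "CompressionType": "NONE"},
        OutputSerialization={"CSV": {}},
    )

    # The result arrives as an event stream of record chunks.
    for event in resp["Payload"]:
        if "Records" in event:
            print(event["Records"]["Payload"].decode(), end="")
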
  • 1 participant
  • 30 minutes
push
pushing
s3
ssx
sd
efficient
workflow
querying
allocation
manipulating

14 Nov 2022

Presented by: Yingxin Cheng

Understanding SeaStore through profiling

SeaStore is the new ObjectStore designed to complement Crimson OSD and support a new generation of storage interfaces/technologies (NVMe, ZNS, persistent memory, etc.). As SeaStore matures, profiling becomes increasingly critical to understand the comprehensive performance impact of design choices and to set the direction moving forward as the backend moves to the mainstream. Profiling infrastructure will also help new contributors understand the inner workings of SeaStore. In this session, we will talk about SeaStore's support for performance profiling, optimizations made based on the initial analysis, and the current status and gaps versus BlueStore, along with performance data.

Event link: https://ceph.io/en/community/events/2022/ceph-virtual/
  • 3 participants
  • 28 minutes
profiling
advanced
throughput
implementation
performance
analysis
crimson
monitoring
cpus
inefficient

10 Nov 2022

Event: https://ceph.io/en/community/events/2022/ceph-virtual/
Presented by: Matt Vandermeulen

How we operate Ceph at scale

As clusters grow in both size and quantity, operator effort should not grow at the same pace. In this talk, Matt Vandermeulen will discuss strategies and challenges for operating clusters of varying sizes in a rapidly growing environment, for both RBD and object storage workloads, based on DigitalOcean's experiences.
  • 1 participant
  • 24 minutes
servers
deployments
capacity
storage
digitalocean
dbas
centralized
systems
workflow
cloud

10 Nov 2022

Presented by: Curt Bruns and Anthony D'Atri

Optimizing RGW Object Storage Mixed Media through Storage Classes and Lua Scripting

Ceph enables flexible and scalable object storage of unstructured data for a wide variety of workloads. RGW (RADOS GateWay) deployments experience a wide variety of object sizes and must balance workload, cost, and performance requirements. S3 storage classes are an established way to steer data onto underlying media that meet specific resilience, cost, and performance requirements. One might, for example, define RGW back-end storage classes for SSD or HDD media; for non-redundant, replicated, or erasure-coded pools; etc. Diverting individual objects or entire buckets into a non-default storage class usually requires specific client action. Compliance, however, can be awkward to request and impossible to enforce, especially in multi-tenant deployments that may include paying customers as well as internal users. This work enables the RGW back end to enforce a storage class on uploaded objects based on specific criteria, without requiring client action. For example, one might define a default storage class on performant TLC or Optane media for resource-intensive small S3 objects, while assigning larger objects to dense and cost-effective QLC SSD media.
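
For contrast, the "specific client action" mentioned above looks like this on the S3 API side; the server-side enforcement described in this talk removes the need for clients to pass a storage class at all. Endpoint, credentials, names, and the class name are placeholders, and the class must already be defined in the RGW zone placement configuration:

    import boto3

    # Placeholders: a client explicitly steering one object to a non-default class.
    s3 = boto3.client("s3", endpoint_url="http://rgw.example.com:8000",
                      aws_access_key_id="ACCESS", aws_secret_access_key="SECRET")

    s3.put_object(
        Bucket="media",
        Key="large/video.bin",
        Body=b"\0" * (64 * 1024 * 1024),   # stand-in for a large object
        StorageClass="QLC_ARCHIVE",        # placeholder class backed by QLC media
    )
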
  • 1 participant
  • 23 minutes
cluster
capacity
workloads
servers
bottlenecks
gateway
optimizing
iops
ai
disk

10 Nov 2022

Presented by: Samuel Just

What's new with Crimson and Seastore?

Next-generation storage devices require a change in strategy, so the community has been developing Crimson, an eventual replacement for ceph-osd intended to minimize CPU overhead and improve throughput and latency. Seastore is a new backing store for crimson-osd targeted at emerging storage technologies, including persistent memory and ZNS devices. This talk will explain recent developments in the Crimson project and Seastore.

Event: https://ceph.io/en/community/events/2022/ceph-virtual/
  • 1 participant
  • 26 minutes
iop
throughput
cpu
iops
workloads
threading
efficiency
monitor
parallel
cores

9 Nov 2022

Join us November 3-16th for different Ceph presentation topics! https://ceph.io/en/community/events/2022/ceph-virtual/

Ceph Crash Telemetry - Observability in Action

To increase product observability and robustness, Ceph's telemetry module allows users to automatically report anonymized crash dumps. Ceph's telemetry backend runs tools that detect similarities among these reported crash events and feed them into Ceph's bug tracking system. In this session we will explore Ceph crash telemetry end-to-end and how it helps the developer community detect emerging and frequent issues encountered by production systems in the wild. We will share our insights so far, how users benefit from this module, and how they can contribute.
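
As a quick way to see the raw material the module works with, the sketch below lists the crash reports a cluster has collected locally; these are the events that telemetry, once the crash channel is enabled, reports upstream in anonymized form. Field names follow recent releases and may vary:

    import json
    import subprocess
    from collections import Counter

    # List the crash reports collected by the cluster's crash module.
    raw = subprocess.run(["ceph", "crash", "ls", "--format", "json"],
                         capture_output=True, text=True, check=True).stdout
    crashes = json.loads(raw)

    # Group by daemon type (osd, mon, mgr, ...) to spot recurring offenders.
    by_daemon = Counter(c.get("entity_name", "unknown").split(".")[0] for c in crashes)
    for daemon, count in by_daemon.most_common():
        print(f"{daemon}: {count} crash reports")
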
  • 2 participants
  • 31 minutes
dashboard
cluster
telemetry
self
bot
triage
device
access
privacy
deploying

3 Nov 2022

Join us November 3-16th for different Ceph presentation topics! https://ceph.io/en/community/events/2022/ceph-virtual/
  • 4 participants
  • 34 minutes
stuff
research
enhancements
milestones
initiative
significantly
monitoring
squidward
ceph
quincy