From YouTube: Lightning Talk: Data Flow Control in Cluster Logging Pipeline - Pranjal Gupta & Eran Raichstein, IBM

Description

Logging pipelines are crucial for ensuring that container logs are reliably collected and routed to persistent storage. Logs generated by workloads (container processes) are written to files by container monitor processes (e.g. Conmon). In production environments, where Fluentd deals with a massive volume of logs, the log generation rate often exceeds the log collection rate, which causes log loss. Application logs therefore need to be prioritised so that administrators can collect logs from high-priority workloads in a controlled manner. In this talk, we introduce a new feature in the in_tail input plugin that uses group rules to rate limit log collection. We share insights from our systematic study of log loss in Fluentd plugins, carried out with our open-source benchmarking framework. We also present a Log Flow Control framework that allows users to define and enforce log rate limit policies so that log loss can be controlled predictably.
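
To make the idea of group rules concrete, here is a minimal Python sketch of rate limiting by group. It is not the in_tail implementation or its configuration syntax. The names `GroupRule` and `GroupRateLimiter`, the matching of rules against the log file path, and the fixed-window counting are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass, field


@dataclass
class GroupRule:
    """Hypothetical rule: files whose path matches `pattern` share one budget."""
    pattern: re.Pattern  # matched against the log file path (assumption)
    limit: int           # maximum lines collected per period for this group


@dataclass
class GroupRateLimiter:
    """Fixed-window limiter with one line counter per rule (illustrative only)."""
    rules: list
    period: float = 60.0  # window length in seconds
    _window_start: dict = field(default_factory=dict)
    _count: dict = field(default_factory=dict)

    def _rule_index(self, path):
        # The first matching rule wins; None means the file is not rate limited.
        for i, rule in enumerate(self.rules):
            if rule.pattern.search(path):
                return i
        return None

    def allow(self, path, now):
        """Return True if one more line from `path` may be collected at time `now`."""
        i = self._rule_index(path)
        if i is None:
            return True
        start = self._window_start.get(i)
        if start is None or now - start >= self.period:
            # Start a new window and reset the group's counter.
            self._window_start[i] = now
            self._count[i] = 0
        if self._count[i] >= self.rules[i].limit:
            # Budget exhausted: the caller should wait for the next window.
            return False
        self._count[i] += 1
        return True
```

In this sketch a low-priority group can be given a small `limit` so that it cannot crowd out collection for high-priority workloads. Files that match no rule are left unlimited here, which is a design choice of the sketch rather than something stated in the talk description.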