Road to FOSDEM 2024

Co-authored by Denis Germain, Armelle Bengochea, Benjamin Martin, and Weeking

What is FOSDEM?

We had the opportunity to attend FOSDEM once again this year. If you’re not familiar with it, FOSDEM stands for Free and Open source Software Developers’ European Meeting.

FOSDEM is one of the largest European conferences and it takes place every year at the Université Libre de Bruxelles (The Free University of Brussels) in Belgium.

This event is always one of our favorites to attend! Why? Because there are nearly 60 different tracks and 855 talks, and so many different communities coming together for two days in the same place!

Don’t worry if you missed it, we’ve prepared a recap of the talks that stood out the most for us!

Our FOSDEM Survival Guide

There were over 8,000 people there this year, all sharing the same passion, so it’s essential to come prepared! Whether you’re a returning veteran or attending for the first time, here are our FOSDEM survival tips:

  1. The number one rule of FOSDEM is: you won’t be able to see all the talks that interest you. Don’t get frustrated, the talks will be available for replay (and are even live-streamed if you don’t have time to change rooms!)
  2. Interested in a talk? Get ahead by trying to enter the room before the previous session is over. This way, you’ll surely get a comfortable seat while avoiding the crowds as much as possible.
  3. If you want to eat at the food trucks while avoiding the crowd, steer clear of the 12pm-2pm slot.
  4. You’ll find the majority of goodies in building K 🙂

Our selection of talks

Modern application observability with Grafana and Quickwit

By François Massot · Video & slides

Before attending this wonderful talk presented by François Massot, co-founder of Quickwit and one of the core developers of the Quickwit engine (written in Rust), we were expecting to see yet another new search engine. As it turns out, we were pleasantly surprised by Quickwit’s potential.

But first, let us introduce observability in a modern context. We live in a world of SOA (Service-Oriented Architecture) and distributed systems, with access to different telemetry data sources: metrics, logs and traces. OpenTelemetry emerged as a standard, with SDKs and tools to instrument applications and to generate, export and collect these telemetry signals, which enabled new observability solutions. Quickwit is one of them.
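
To make this concrete, here is a minimal sketch (ours, not from the talk) of what instrumenting a Go service with the OpenTelemetry SDK can look like. The service name, span attributes and OTLP endpoint are placeholders; any OTLP-compatible backend (Quickwit, Tempo, etc.) could sit on the receiving end.

```go
// Minimal OpenTelemetry instrumentation sketch: create spans and export them
// over OTLP/gRPC. Endpoint and names are illustrative placeholders.
package main

import (
	"context"
	"log"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
)

func main() {
	ctx := context.Background()

	// Export spans to an OTLP gRPC endpoint (placeholder address).
	exporter, err := otlptracegrpc.New(ctx,
		otlptracegrpc.WithEndpoint("quickwit.example.com:7281"),
		otlptracegrpc.WithInsecure(),
	)
	if err != nil {
		log.Fatal(err)
	}

	tp := sdktrace.NewTracerProvider(sdktrace.WithBatcher(exporter))
	defer tp.Shutdown(ctx)
	otel.SetTracerProvider(tp)

	// Instrument a unit of work: one span carrying a couple of attributes.
	tracer := otel.Tracer("payment-service")
	_, span := tracer.Start(ctx, "charge-card")
	span.SetAttributes(
		attribute.String("user.id", "42"),
		attribute.String("request.id", "req-1337"),
	)
	span.End()
}
```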

Let’s start by explaining what Quickwit is. It’s a powerful search engine built on a Rust library named Tantivy for full-text search; both projects are maintained by the Quickwit team. The solution natively supports OTEL traces and logs, and the people behind it told us about their dream of supporting metrics too (spoiler alert: they’re currently working on it). Quickwit is compatible with the Elasticsearch API, meaning any ES client can take advantage of it, like Fluent Bit and its ES output plugin. Trust us, it’s not just another search engine: Quickwit (written in Rust) consumes less CPU and memory than Elasticsearch or OpenSearch for the same job, and does it faster, particularly on large datasets. It’s also schemaless, so you can ingest any document, and it’s Kubernetes-ready with a Helm chart. For storage, it uses an S3 backend combined with indexing, which keeps request latency really low even though the data lives in remote object storage. And of course, there is a Grafana data source plugin, heavily inspired by the Elasticsearch one.
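
As an illustration of what that Elasticsearch compatibility means in practice, here is a hedged sketch of querying Quickwit from Go with nothing but the standard HTTP client. The host, port, index name and field names are placeholders, and the /api/v1/_elastic path reflects the Elasticsearch-compatible API as we understood it from the documentation, so double-check against the current docs before copying.

```go
// Sketch: send a Lucene-style query string to Quickwit's ES-compatible
// search endpoint. URL, index and field names are illustrative only.
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
	"net/url"
)

func main() {
	base := "http://quickwit.example.com:7280/api/v1/_elastic/otel-logs/_search"
	query := url.Values{"q": {`severity_text:ERROR AND service_name:payments`}}

	resp, err := http.Get(base + "?" + query.Encode())
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(body)) // ES-style JSON hits, consumable by any ES client
}
```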

Now that you better understand the potential of Quickwit, let’s jump to François’ talk. He started by explaining the observability challenges in modern contexts and gave us details about the specific storage backend types currently used for traces, logs and metrics. He invited us to pay attention to cardinality, because while traces and metrics can both be stored in a TSDB backend, traces are not suited to it. Why? Two reasons:

  • A TSDB is optimised for monitoring and suffers from high cardinality, so you need to keep labels under control to avoid your compute and storage costs exploding.
  • Trace telemetry, on the other hand, is not meant to be controlled, label-filtered or thinned down. You need to keep everything in order to understand the full context, with every piece of information available (userID, requestID, details, etc.), as the sketch after this list illustrates.
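
Here is a contrived Go sketch (our illustration, not François’) of that cardinality trap. The Prometheus client stands in for any TSDB-backed metric: the same userID that would create one time series per user as a metric label is perfectly at home as a span attribute.

```go
// Cardinality illustration: a per-user metric label vs. a span attribute.
package main

import (
	"context"

	"github.com/prometheus/client_golang/prometheus"
	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
)

var (
	// Anti-pattern: a per-user label means one time series per user in the TSDB.
	loginsByUser = prometheus.NewCounterVec(
		prometheus.CounterOpts{Name: "logins_total_by_user"},
		[]string{"user_id"},
	)
	// Better: keep metric labels low-cardinality...
	logins = prometheus.NewCounter(prometheus.CounterOpts{Name: "logins_total"})
)

func handleLogin(ctx context.Context, userID string) {
	loginsByUser.WithLabelValues(userID).Inc() // cardinality grows with the user base
	logins.Inc()

	// ...and put the high-cardinality detail on the trace, where it belongs.
	_, span := otel.Tracer("auth").Start(ctx, "login")
	span.SetAttributes(attribute.String("user.id", userID))
	span.End()
}

func main() {
	prometheus.MustRegister(loginsByUser, logins)
	handleLogin(context.Background(), "user-42")
}
```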

François wanted to explain why Grafana Tempo, Quickwit and other distributed tracing backends exist, and why they’re better suited to traces than a classical TSDB backend.

Then François described why Quickwit, a search engine, is also a great backend for traces and logs. The main reasons lie in the needs around traces: we generate a lot of them, they contain a lot of data, we query them often, and we need low-latency queries across billions of spans. All of that makes it hard to scale the storage while keeping requests fast. That’s exactly where Quickwit comes in: it uses an S3 storage backend, which is scalable and resilient; it uses indexing to keep request latency under control (under one second, even on large datasets); and its decoupled storage and compute architecture lets you scale searchers and indexers independently.

Now, I’ll try to explain how remote object storage combined with indexing provides low-latency responses on very large datasets, based on the details François gave us. This is the Quickwit architecture schema François used in his slides:

On the left, we can see the write path of Quickwit’s architecture, with indexing; on the right, the read path, with the search part. At an interval of a few seconds, Quickwit builds and pushes to the object storage a batch of splits containing document data and metadata. Then, for each split, a line containing the split metadata is added to the metastore, so that searchers can find it. Now, let’s look at split details:

A split contains three data structures:

  • The doc store: a row-oriented storage structure that, given a DocumentID, returns the full document content.
  • The inverted index: given a field value such as a userID, it returns the list of DocumentIDs containing that userID.
  • The columnar store: used for aggregations and analytics.

A split also stores the metadata of these three data structures in a hotcache called the “split footer”. This hotcache is designed to always sit in the searchers’ cache, and with its pointers to the data structures, it allows searchers to find data in one or two requests at most. That’s how Quickwit is optimised for searching logs and traces on object storage.
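
To fix the idea, here is a deliberately naive Go model of a split and its read path. It has nothing to do with Quickwit’s actual Rust internals; it only illustrates how a footer/hotcache plus an inverted index and a doc store can answer a query in one or two object-storage reads.

```go
// Toy model of a split's read path (illustrative only, not Quickwit code).
package main

import "fmt"

// Footer ("hotcache"): tiny, kept in the searcher's memory, holds byte-range
// pointers to the three data structures inside the split file on S3.
type Footer struct {
	InvertedIndexRange [2]int64
	DocStoreRange      [2]int64
	ColumnarRange      [2]int64
}

type Split struct {
	Footer        Footer
	InvertedIndex map[string][]int // term (e.g. a userID) -> doc IDs
	DocStore      map[int]string   // doc ID -> full document (row-oriented)
	// Columnar store omitted: it serves aggregations and analytics.
}

// Search mimics the read path: fetch the posting list for the term
// (object-store request #1), then the matching docs (request #2).
func (s *Split) Search(term string) []string {
	docIDs := s.InvertedIndex[term]
	var docs []string
	for _, id := range docIDs {
		docs = append(docs, s.DocStore[id])
	}
	return docs
}

func main() {
	split := &Split{
		InvertedIndex: map[string][]int{"user-42": {1}},
		DocStore:      map[int]string{1: `{"span":"GET /playlists","user.id":"user-42"}`},
	}
	fmt.Println(split.Search("user-42"))
}
```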

After the Quickwit technical presentation, François moved on to a demo of how it stores spans based on the OTEL data model. We said it’s schemaless, so you can store anything you want, but for spans you really want to follow the OTEL standard. François generated a lot of spans and traces to show how he handles application monitoring with Quickwit. He used a Grafana Labs project named “xk6”, which lets him build k6 with extensions, along with the tracegen Golang package to generate traces easily. He deployed a Quickwit cluster on Kubernetes and created a Grafana instance alongside it.

First, he demonstrated the Quickwit web UI, which can list indexes, show their details, display cluster details and show node information. He showed us how much Quickwit compresses data: the splits weighed around 40 GB while the uncompressed published docs amounted to 290 GB, roughly a 7x reduction. And yes, Quickwit does compaction too, merging splits, which explains the CPU spikes visible on the panels. François put more than 400 million spans into the otel-trace index and kept sending around 15k docs/s live (~1 TB of data per day).

He used the Grafana Explore tab to query Quickwit (with a Lucene-style query syntax, like ES) and, thanks to the inverted index, showed us how easily you can search and navigate through spans. It was very fast. Then he demonstrated how to dig into one specific trace with the Jaeger plugin in Explore, targeting the Quickwit cluster. It works very well and returns all spans for a given trace_id. Here is the debug view from Grafana:

For reference, a trace can be seen as a group of spans forming the complete view of a request; it becomes a distributed trace when it travels across several services.

Unfortunately, François didn’t have enough time for questions because of a small technical issue at the beginning of the talk. He would have liked to dive deeper into debugging with Grafana and Quickwit, but we think he did an excellent job. He held our attention and motivated us to do a “quick Quickwit proof of concept”.

If we had to pick out a few points to remember, here’s what they would be:

  • Observability, thanks to the OpenTelemetry standard, is maturing and looks promising for the future. New, very good solutions will keep appearing, and Quickwit is proof of that.
  • Traces and spans are now essential in SOA to debug failures, crashes and errors.
  • Quickwit is a great tool for storing large amounts of logs and traces. Thanks to its indexing, the independent scalability of indexers and searchers, and S3 storage, it looks impressive and seems easy to maintain.
  • Quickwit comes with a Kubernetes-ready architecture, a very efficient and performant search engine (written in Rust) and real momentum, which makes us strongly believe in the project.


Multithreading and other developments in the ffmpeg transcoder

By Anton Khirnov · Video & slides

In this insightful talk, Anton Khirnov, one of the key developers of ffmpeg and the main architect of its transcoder, explained how he and his team refactored the whole transcoder codebase for the upcoming ffmpeg version 7. We found this talk to be an excellent example of tackling daunting technical debt and rewriting a significant codebase through a modular, incremental approach.

To better comprehend the extent of the changes, let’s first talk about the ffmpeg transcoder, i.e. the command-line tool.

FFmpeg is arguably the most widely used multimedia transcoding toolkit. Virtually any software that involves audio or video content depends on it somehow, including extensive use here at Deezer. Technically, a typical modern ffmpeg process can be summarised with the following graph:

From a set of input multimedia streams or files (audio and/or video, represented in green), decoders are applied, then filters apply effects or transformations (in yellow). The outputs are then fed through encoders (in blue), before being multiplexed into output streams or files (in red). Connections can be skipped, split towards multiple separate blocks or fed from multiple sources, which makes transcoding pipelines rather complex DAGs (directed acyclic graphs).
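
As a trivial illustration of that pipeline (ours, not Anton’s), here is how you might drive the ffmpeg CLI from Go: one input gets demuxed and decoded, run through an audio filter, re-encoded, then muxed into the output file. The file names and the loudnorm filter are arbitrary examples.

```go
// Invoke the ffmpeg CLI to run a decode -> filter -> encode -> mux pipeline.
package main

import (
	"log"
	"os/exec"
)

func main() {
	cmd := exec.Command("ffmpeg",
		"-i", "input.flac",   // demux + decode the input stream
		"-af", "loudnorm",    // filter: loudness normalisation
		"-c:a", "libmp3lame", // encode with the MP3 encoder
		"-b:a", "320k",
		"output.mp3",         // mux into the output container
	)
	cmd.Stdout, cmd.Stderr = log.Writer(), log.Writer()
	if err := cmd.Run(); err != nil {
		log.Fatal(err)
	}
}
```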

As Anton described, open source contributions to ffmpeg have flowed relentlessly over the last 20+ years from a significant number of distinct contributors. Yet each contribution was always made with the same objective in mind: minimise both the code diff and the amount of work necessary to add, fix or review product features. The overall design and the ease of future development, however, were not key concerns of the contribution process. Moreover, the number of features exploding year after year led to a “ridiculous number of options that absolutely no one, even developers, can remember”, as Anton confirmed.

For these reasons, the ffmpeg transcoder suffered from substantial technical debt, with many complex yet critical parts of the code being very hard to comprehend and maintain.

To tackle this problem, Khirnov decided to rewrite the project in an object-oriented paradigm, with the firm objective of making the codebase mirror the graph presented above, as much as possible at least. This came on top of a huge refactoring task that also included “removing old things that didn’t work and nobody could understand”, as Khirnov put it. The general methodology: list all instances and their dependencies, clean up each of them, and isolate components while respecting the graph design above. This massive refactor, reportedly the project’s most complex in decades, took the form of a series of 700+ patch sets spanning two years of work.

Besides redesigning the transcoding pipeline, Khirnov leveraged the new modular system to isolate components and make them runnable in parallel. Other advantages of modularity include predictable, deterministic outputs, as well as better probing capabilities, for latency estimation in particular. All of these are major new features for the transcoding tool, obtained as byproducts of a cleaner architecture.

We found this talk particularly inspiring as a great example of how the architecture of a complex, entangled and heavily patched piece of software can be entirely redesigned, no matter how popular and critical it is, with a systematic and methodical approach. A significant effort, but one that brings key outcomes: not only is the resulting codebase much easier to maintain and to build future developments on, but it also yields new features critical to the application itself.

Putting an end to Makefiles in Go projects with GoReleaser

By Denis Germain · Video & slides

For the first time in his career, Deezer’s Senior Lead SRE Denis Germain had the opportunity to present a talk at FOSDEM, in the Golang devroom.

The presentation was on GoReleaser, a tool built in Golang that aims to automate and ease all the tedious tasks involved in releasing Golang software. Most of the talk was actually live-coded.

Despite a minor glitch during the demo, he showed us that automating the whole delivery of a new release of a Golang app (cross-building the binary, building and pushing a Docker image, creating a release in GitLab with archives and checksums, sending a Mastodon toot) could be as easy as writing a few lines of YAML, in under 15 minutes.

Even though most attendees enjoyed this beginner-friendly talk, FOSDEM’s Golang devroom is generally geared toward more advanced topics, which led to some comments at the end of the presentation. We promise we will bring more expert material next time 😉.

An idea for improvement

While this event is very enjoyable and informative (although tiring), there’s certainly room for improvement.

In particular, the level of knowledge recommended for understanding each talk is not clearly communicated. We felt that some talks were either too advanced or not advanced enough. We think adding information about the intended audience level for each talk would help attendees make the most of their FOSDEM experience in the future!

“All I’m askin’…is for a little respect,” says Aretha

FOSDEM is an annual event that only takes place thanks to the participation of many volunteers. It’s a free event, intended to be non-commercial, which, in our humble opinion, embodies what open source is by definition: the free sharing of knowledge without barriers.

However, in their race to make the most out of the weekend, it seems some participants forgot their manners. In some devrooms, in particular the Containers track, one of the most visited tracks of the event, Q&A sessions had to be shortened because people were rushing out as soon as the talk ended, talking loudly as they went. This kind of incivility surely had a negative impact on other participants’ experience, which is a shame.

It’s worth remembering that speakers also take part voluntarily, some traveling thousands of kilometers to attend what is one of the biggest tech events in Europe. So, out of respect for them and for the audience, it seems reasonable to expect all attendees to exit (and enter) rooms as quietly as possible while a talk is ongoing.

Conclusion

Once again this year, FOSDEM was rich in experiences, inspiring talks, and of course, waffles. We hope to participate again next year!

Until then, we would like to thank all organizers and volunteers for allowing us to spend two wonderful days in Brussels!

Keep up the great work to ensure that FOSDEM remains an exceptional event that all of our teams eagerly await 🙂