Metrics are one of the main building blocks in the topic of observability.
Insights, experiences and learnings from trivago's tech teams.
At trivago, we generate a huge amount of logs and we have our own custom setup for shipping logs using mostly Protocol Buffers. Eventually we end up with some fields in Elasticsearch (ES) that contain partial (or full) URLs. For instance, in our specific case we store the query component of the URL in a field called `query` and the path component in a field named `url_path`. Sample values for these fields could be:
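As a rough illustration of that split, the sketch below uses Python's standard `urllib.parse` module to separate a URL into its path and query components. The URL is invented for this example and is not taken from trivago's logs. The variable names mirror the field names mentioned above.

```python
from urllib.parse import urlsplit

# Hypothetical URL, made up purely for illustration.
url = "https://www.example.com/search?city=berlin&page=2"

parts = urlsplit(url)

# The path component, as it might be stored in a field like url_path.
url_path = parts.path

# The query component (without the leading '?'), as it might be stored in a field like query.
query = parts.query

print(url_path)
print(query)
```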
tl;dr: continuously monitor your CDN and origin servers on layer 3 with tools like MTR. Layer 3 issues on external middleware can have a significant impact on layer 7 web performance.
Hello from trivago's performance & monitoring team. One important part of our job is to ship more than a terabyte of logs and system metrics per day from various data sources into Elasticsearch, several time series databases and other data sinks. We do so by reading most of the data from multiple Kafka clusters and processing it with nearly 100 Logstash instances. Our cluster currently consists of ~30 machines running Debian 7 with bare-metal installations of the aforementioned services. This summer we decided to migrate all of this to an on-premise [Nomad](https://www.nomadproject.io/) cluster.
Back in April 2015, I felt the need to do some work and earn money besides my studies in Computer Science at the University of Düsseldorf. After doing some research and crawling a few job platforms, I finally applied for a job in IT-Support at trivago. The job offer looked very appealing and life at trivago promised to be fun.
We're a data-driven company. At trivago we love measuring everything. Collecting metrics and making decisions based on them comes naturally to all our engineers. This workflow also applies to performance, which is key to succeeding on the modern Internet.
At trivago we store a subset of our realtime metric data in InfluxDB and we are quite impressed by the load it can handle. Despite all the joy, we had to learn some lessons the hard way. It is pretty easy to overload the database or the web browser by executing queries that return too many datapoints. To prevent that, we wrote Protector - a circuit breaker for time series databases that blocks malicious queries.
At trivago we rely heavily on the ELK stack for our log processing. We stream our webserver access logs, error logs, performance benchmarks and all kinds of diagnostic data into Kafka and process it from there into Elasticsearch using Logstash. Our preferred encoding within this pipeline is Google's Protocol Buffers, or protobuf for short. In this blog post, we will explain with an example how to read protobuf-encoded messages from Kafka using Logstash.