Monitoring at trivago

How we improved reporting and monitoring of test automation results

Over the last few years, we completely refactored what was described in our previous article about how we use the ELK stack for an overview of our test automation results, but some core concepts remain valid and applicable.

Giuseppe Donati · 15 Feb 2023 · 6 min read

How we scaled our Prometheus setup

In 2020 we started to migrate one of our most significant workloads, our Node.js based GraphQL API and many of its microservices, from our datacenter to Google Kubernetes Engine. We deploy it in three GCP regions, each having its Kubernetes cluster. Since then, our monitoring infrastructure has changed due to various periods of instability and pandemic induced scaling challenges.

Simon Brüggen · 23 Aug 2022 · 9 min read

SRE: On-Call Procedure at trivago

One of the many responsibilities of a Site Reliability Engineer (SRE), is to ensure uptime, availability and in some cases, consistency of the product. In this context, the product refers to the website, APIs, microservices, and servers. This responsibility of keeping the product up and running becomes particularly interesting if the product is used around the world 24 hours every day like trivago. And just like in the medical profession, someone has to be on call to react on failures and outages outside of the office hours.

Kenechukwu Nnamani · 18 Jul 2022 · 7 min read

Continuous Performance Monitoring for PHP - The tale of Blackfire at trivago

We're a data-driven company. At trivago we love measuring everything. Collecting metrics and making decisions based on them comes naturally to all our engineers. This workflow also applies to performance, which is key to succeed in the modern Internet.

Jorge Luis Betancourt · 27 Oct 2017 · 7 min read

Better Log Parsing with Logstash and Google Protocol Buffers

At trivago we rely heavily on the ELK stack for our log processing. We stream our webserver access logs, error logs, performance benchmarks and all kind of diagnostic data into Kafka and process it from there into Elasticsearch using Logstash. Our preferred encoding within this pipeline is Google's Protocol Buffers, short protobuf. In this blog post, we will explain with an example how to read protobuf encoded messages from Kafka using Logstash.

Inga Feick · 19 Jan 2016 · 4 min read

Monitoring at trivago

How we improved reporting and monitoring of test automation results

How we scaled our Prometheus setup

SRE: On-Call Procedure at trivago

Continuous Performance Monitoring for PHP - The tale of Blackfire at trivago

Better Log Parsing with Logstash and Google Protocol Buffers

Popular tags

Featured articles

Implementing Data Validation with Great Expectations in Hybrid Environments

How we scaled our Prometheus setup

Being on-call as a software engineer - a challenging and fast learning experience

Java Reactive Programming - Effective Usage in a Real World Application

Learn Redis the hard way (in production)

trivago tech newsletter

Popular tags

Featured articles

Career? trivago.