We're a data-driven company. At trivago we love measuring everything. Collecting metrics and making decisions based on them comes naturally to all our engineers. This workflow also applies to performance, which is key to succeed in the modern Internet.
I love to take complex and tedious processes and automate the pain out of them until they are reduced to three or four steps!
As part of the release team of trivago, one of our roles is to create tools that make the lives of our developers easier so that they can create amazing features for our website.
For our products, like the trivago hotel search, we are using Redis a lot. The use cases vary: Caching, temporary storage of data before moving those into another storage or a typical database for hotel meta data including persistence.
Configuration management tools have recently gained a lot of popularity. At trivago we use SaltStack to automate our infrastructure. As the complexity of configuration files and formulas is increasing, we need a fast, reliable way to test our changes.
At trivago we store a subset of our realtime metric data in InfluxDB and we are quite impressed by the load it can handle. Despite all the joy, we had to learn some lessons the hard way. It is pretty easy to overload the database or the web browser by executing queries that return too many datapoints. To prevent that, we wrote Protector - a circuit breaker for Time series databases that blocks malicious queries.
At trivago we rely heavily on the ELK stack for our log processing. We stream our webserver access logs, error logs, performance benchmarks and all kind of diagnostic data into Kafka and process it from there into Elasticsearch using Logstash. Our preferred encoding within this pipeline is Google's Protocol Buffers, short protobuf. In this blog post, we will explain with an example how to read protobuf encoded messages from Kafka using Logstash.
The advances and growth of our Selenium based automated testing infrastructure generated an unexpected number of test results to evaluate. We had to rethink our reporting systems. Combining the power of Selenium with Kibana's graphing and filtering features totally changed our way of working. Now we have real-time testing feedback and the ability of filtering between thousands of tests, all in one Dashboard.
tCache takes a creative approach for near lock-free evictions and supports data-aware evictions. Its key features are:
Configuration of features is individual per Cache instance, by using a cache Builder:
Learn how we managed to move fast and create a new Symfony application without breaking our old legacy session handling. We write to our legacy session (which is file based) from our new project which uses PDO as the session storage.
Here at trivago we write a huge number of log messages every day that need to be stored and monitored. To handle all these messages we created Gollum, a tool that enables us to conveniently send messages from multiple sources to different services. While initially only covering log messages Gollum quickly evolved to a routing framework for all kinds of data. This blogpost is a short introduction to Gollum and how we use it at trivago.
At trivago we love hotels above everything else, but we also like metrics, we love to measure everything, compare, decide, improve and then rinse and repeat. In this blog entry we are going to describe our experience with InfluxDB, a time series database that we are using to store some real time metrics.
Tackling hard problems is like going on an adventure. Solving a technical challenge feels like finding a hidden treasure. Want to go treasure hunting with us?
View all job openings
Follow us on