How to Survive a Regional Outage

As I’m writing this, we’re in the middle of our yearly load testing process.
For a couple of years now, trivago has conducted regular production load tests. We do this to check whether all our services can sustain the increased load we experience during the summer and winter months.
This year is the second time we are also running another test: a "regional failover" test.

Google Cloud Workload-Placement-Guide

At trivago we operate a hybrid infrastructure of both on-premises machines and clusters on Google Cloud. As more and more of our workloads migrate to Google Cloud, we have come up with a set of deployment guidelines for running them. These are not strict rules, but rather suggestions to best serve each team's needs.

Introducing Gollum: A NxM message multiplexer written in Go

Here at trivago we write a huge number of log messages every day that need to be stored and monitored. To handle all these messages we created Gollum, a tool that enables us to conveniently send messages from multiple sources to different services. While it initially covered only log messages, Gollum quickly evolved into a routing framework for all kinds of data. This blog post is a short introduction to Gollum and how we use it at trivago.