The price of reliability is the pursuit of the utmost simplicity. — C.A.R. Hoare, Turing Award lecture Introduction Have you ever enthusiastically released a new, delightful version to production and then suddenly started hearing a concerning number of notification sounds? Gets your heart beating right? After all, you didn’t really expect this to happen because it worked in the development environment. This “But it worked in the development environment!
Posts about DevOps
On a normal day, we ingest a lot of data into our ELK clusters (~6TB across all of our data centers). This is mostly operational data (logs) from different components in our infrastructure. This data ranges from purely technical info (logs from our services) to data about which pages our users are loading (intersection between business and technical data). At trivago,we use Kafka as a central hub for moving data between our systems (including logs).
“The Cloud Native Computing Foundation (CNCF) hosts critical components of the global technology infrastructure. CNCF brings together the world’s top developers, end users, and vendors and runs the largest open source developer conferences. CNCF is part of the nonprofit Linux Foundation.” - CNCF Last year, when visiting CloudNativeCon/KubeCon Europe in Barcelona (one of the biggest cloud-focused conferences in Europe), I noticed that there were some companies present in the exhibition space whose primary focus wasn’t software development.
To the outside, trivago appears to be one single software product providing our popular hotel meta search. Behind the scenes, however, it is home to dozens of projects and tools to support it. Teams are encouraged to choose the programming languages and frameworks that will get the job done best. Only few restrictions are placed on the teams in these decisions, primarily long-term maintainability. As a result, trivago has a largely polyglot code base that fosters creativity and diverse thinking.
Make was created in 1976 by Stuart Feldman at Bell Labs to help build C programs. But how can this 40+ year old piece of software help us develop and maintain our ever-growing amount of cloud-based microservices?
Hello from trivago’s performance & monitoring team. One important part of our job is to ship more than a terabyte of logs and system metrics per day, from various data sources into elasticsearch, several time series databases and other data sinks. We do so by reading most of the data from multiple Kafka clusters and processing them with nearly 100 Logstashes. Our clusters currently consists of ~30 machines running Debian 7 with bare-metal installations of the aforementioned services.
Testing your functionality is important, but what happens if other factors come into play? In this blog post we show how trivago handles non-functional testing for every commit and how we scaled it.
We do think that our tech blog is full of interesting things powered by our engineers’ great stories. Let us take you on a journey of how we maintain trivago tech blog from the technical perspective and how we recently automated its deployment process.
We’re a data-driven company. At trivago we love measuring everything. Collecting metrics and making decisions based on them comes naturally to all our engineers. This workflow also applies to performance, which is key to succeed in the modern Internet.
Tackling hard problems is like going on an adventure. Solving a technical challenge feels like finding a hidden treasure. Want to go treasure hunting with us?View all current job openings