Latest articles

Technical Decision-Making

As part of our series of posts on WARP - A Web Application Rewrite Project, we are sharing our process for making technical decisions. We hope you find it helpful, and perhaps you can adapt parts of it for your own projects.

Tom Bartel, Radovan Janjic, David Meyer · 22 Feb 2023 · 6 min read

How we improved reporting and monitoring of test automation results

Over the last few years, we completely refactored the setup described in our previous article on how we use the ELK stack for an overview of our test automation results. Some core concepts, however, remain valid and applicable.

Giuseppe Donati · 15 Feb 2023 · 6 min read

How continuous product discovery works for us

Hello, I am a product manager here at trivago. I have worked on different parts of the product such as apps, alternative accommodations, landing pages, and search & flow. We work in cross-functional teams that solve problems within a defined scope. My key responsibilities as a product manager are developing a product vision & strategy, defining objectives & outcomes, driving product discovery efforts (user research, solution ideation, and solution testing), and supporting engineering in product delivery. Overall, it is about helping the team ship the right product.

Sören Weber · 1 Feb 2023 · 8 min read

What Have I Even Been Doing Today?

You have always been an engineer, solving problems and writing code. Now, there is an opportunity to become an engineering manager. You are interested.

However, questions arise.

Tom Bartel · 3 Jan 2023 · 11 min read

Marketing Attribution: Evaluating The Path to Purchase in the Product Ecosystem

When working with data and analyzing how our users interact with our products today, it is essential to understand their behavior by tracking their past actions, such as opening notifications, interacting with a blog, or creating a new login on the platform. In that context, an attribution study is a method of grouping all of those actions into a specific pattern that leads to one desired end result.

Shelly Leal · 6 Dec 2022 · 9 min read

Explore-exploit dilemma in Ranking model

Imagine that, out of thousands of accommodations that match a user search, you have to select the “best” 25 to show to the user. Which ones would you show: the ones you know perform well, or ones that have never been shown before, so that you can discover new high-potential accommodations? In the Data Science world, this is known as the exploitation (continue doing what works well) versus exploration (try something new to discover hidden potential) problem, and it is often explained using the well-known multi-armed bandit problem. The objective is to divide a fixed amount of resources between competing choices so as to maximize the expected gain, given that the properties of each choice are not fully known at the time of allocation.

Aida Orujova · 4 Nov 2022 · 8 min read
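
The multi-armed bandit framing in the teaser above can be illustrated with an epsilon-greedy strategy. This is a minimal, generic sketch and not the ranking model described in the article; the function names, the click rates, and the parameter values are all assumptions chosen for illustration.

```python
import random


def epsilon_greedy_select(estimates, epsilon, rng):
    """Return an arm index: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        # Explore: pick any arm uniformly at random.
        return rng.randrange(len(estimates))
    # Exploit: pick the arm with the highest estimated reward so far.
    return max(range(len(estimates)), key=lambda i: estimates[i])


def run_bandit(true_rates, rounds=5000, epsilon=0.1, seed=0):
    """Simulate `rounds` pulls against arms with hypothetical success rates."""
    rng = random.Random(seed)
    pulls = [0] * len(true_rates)
    estimates = [0.0] * len(true_rates)
    for _ in range(rounds):
        arm = epsilon_greedy_select(estimates, epsilon, rng)
        # Bernoulli reward: 1.0 with the arm's (unknown to the agent) rate.
        reward = 1.0 if rng.random() < true_rates[arm] else 0.0
        pulls[arm] += 1
        # Incremental update of the running mean reward for this arm.
        estimates[arm] += (reward - estimates[arm]) / pulls[arm]
    return pulls, estimates


if __name__ == "__main__":
    # Hypothetical click rates for three accommodations (illustrative only).
    pulls, estimates = run_bandit([0.02, 0.05, 0.20])
    print("pulls per arm:", pulls)
    print("estimated rates:", [round(e, 3) for e in estimates])
```

A larger `epsilon` spends more of the fixed budget on exploration, while a smaller one leans on what is already known to work; choosing that balance is the dilemma the teaser describes.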

How to substantially slow down your Node.js server

Back in March 2022, after spending considerable effort migrating our monolithic Node.js GraphQL server from Express to Fastify, we noticed absolutely no performance improvement in production. That came as a bombshell, especially because Fastify had performed exceptionally well in our k6 load tests in staging, where it responded to HTTP requests 107% faster on average than Express, that is, more than twice as fast!

Abdelrahman Abdelhafez · 15 Sept 2022 · 7 min read

Powering ML-Based Systems With Reliable Data: The Data Annotation Journey

In the last few years, organisations have been increasing their investments in building Machine Learning (ML) based systems. In practice, such systems often took longer than expected to be built or failed to deliver the promised outcome. Data availability and quality have been among the most significant reasons behind this phenomenon. Since each organisation had its custom problems, open datasets or even logged data were not always directly usable. As a result, data collection and annotation processes became more crucial, yet remained under-documented.

Omayma Said, Srinivas Ramesh Kamath · 1 Sept 2022 · 13 min read

How we scaled our Prometheus setup

In 2020, we started to migrate one of our most significant workloads, our Node.js-based GraphQL API and many of its microservices, from our datacenter to Google Kubernetes Engine. We deploy it in three GCP regions, each with its own Kubernetes cluster. Since then, our monitoring infrastructure has changed in response to various periods of instability and pandemic-induced scaling challenges.

Simon Brüggen · 23 Aug 2022 · 9 min read

How to Survive a Regional Outage

As I’m writing this, we’re in the middle of our yearly load testing process.
For a couple of years now, trivago has conducted regular production load tests. We do this to check whether all our services can sustain the increased load we experience during the summer and winter months.
This year is the second time that we are also running another test: a "regional failover" test.

Arne Claus · 15 Aug 2022 · 7 min read

3 Things We Learned When Switching to TypeScript

With the rewrite of our core product web application, we moved from a PHP/JavaScript tech stack to a Next.js stack. One of the most significant changes for developers was the switch to TypeScript, which most of us had not had much experience with previously.

Tom Bartel · 1 Aug 2022 · 6 min read

SRE: On-Call Procedure at trivago

One of the many responsibilities of a Site Reliability Engineer (SRE) is to ensure uptime, availability and, in some cases, consistency of the product. In this context, the product refers to the website, APIs, microservices, and servers. This responsibility of keeping the product up and running becomes particularly interesting if the product is used around the world 24 hours a day, like trivago. And just like in the medical profession, someone has to be on call to react to failures and outages outside office hours.

Kenechukwu Nnamani · 18 Jul 2022 · 7 min read

WARP - A Web Application Rewrite Project

From April 2020 until the end of 2021, we moved trivago’s web frontend to a new tech stack. Having moved away from a quite large PHP codebase and our home-grown JavaScript framework Melody, trivago now runs on a Next.js application written in TypeScript.

Tom Bartel · 16 May 2022 · 13 min read

How we got on top of our data

Scalability and availability are key aspects of cloud native computing. If your microservice takes five minutes to start up, it becomes very difficult to meet these expectations, because reacting to traffic changes, regional failovers, hot-fixes, and rollbacks is simply too slow. In this article, we show how we solved this and a few other problems by taking control of the process of updating our data and storing it in a highly available Redis setup.

Kevin Beineke · 4 May 2022 · 9 min read

Improving Evaluation Practices in Natural Language Generation

Throughout last year, I had the opportunity to participate in and collaborate on multiple research initiatives in the field of Natural Language Generation (NLG) in addition to my responsibilities as a Data Scientist at trivago. NLG is the process of automatically generating text from text, non-linguistic data inputs, or both. Some NLG applications include chatbots, image captioning, and report generation. These application areas are of high interest within trivago as we seek to leverage our rich data environment to enrich the user experience with potential NLG applications.

Saad Mahamood · 31 Mar 2022 · 8 min read

Featured articles

Implementing Data Validation with Great Expectations in Hybrid Environments

How we scaled our Prometheus setup

Being on-call as a software engineer - a challenging and fast learning experience

Java Reactive Programming - Effective Usage in a Real World Application

Learn Redis the hard way (in production)
