Building Our First GraphQL Server with Go: An Implementation Guide

trivago provides travelers with an extensive collection of hotels, empowering them to compare prices and uncover the best vacation deals. With so many exceptional options available, we have introduced a new feature called "Favorites" to streamline the navigation process. This feature enables users to effortlessly save their preferred accommodations and access them later, ensuring ease of use. To access this feature, visit https://www.trivago.com/en-US/favorites.

Implementing Data Validation with Great Expectations in Hybrid Environments

Data validation is an essential step in any data processing pipeline, as it ensures the integrity and accuracy of the data to be used across all subsequent processing steps. Great Expectations (GX) is an open-source framework that provides a flexible and efficient way to perform data validation, allowing data scientists and analysts to quickly identify and correct any issues with their data. In this article, we share our experience implementing Great Expectations for data validation in our Hadoop environment, and our take on its benefits and limitations.

Tech IT Up - Growth and Learning for trivago Techies

A tech conference is a gathering of tech enthusiasts, geeks, and wizards who come together to share their magic spells (aka tech knowledge), cast some illusions (aka demos), and talk about the future of technology in a professional yet humorous way. It's a place where you can explore the latest tech trends, make new connections, and have a great time with like-minded individuals. So, pack your wizard hat and prepare to be inspired by our tech conference called trivago Tech GetTogether (TGT)!

How continuous product discovery works for us

Hello, I am a product manager here at trivago. I have worked on different parts of the product such as apps, alternative accommodations, landing pages, and search & flow. We work in cross-functional teams that solve problems within a defined scope. My key responsibilities as a product manager are developing a product vision & strategy, defining objectives & outcomes, driving product discovery efforts (such as user research, solution ideation, and solution testing), and supporting engineering in product delivery. Overall, it is about helping the team to ship the right product.

Marketing Attribution: Evaluating The Path to Purchase in the Product Ecosystem

While working with data and analyzing the interactions of our users with the products we have today, it is essential to understand their behaviors by tracking their past actions, such as opening notifications, interacting with a blog, or creating a new login in the platform. In that context, the attribution study refers to the method of grouping together all of those actions in a specific pattern to generate one desired end result.

Explore-exploit dilemma in Ranking model

Imagine that, out of thousands of accommodations that match a user search, you have to select the “best” 25 to show to the user. Which ones would you show: the ones you know perform well, or ones that have never been shown before, so that you can discover new high-potential accommodations? In the Data Science world, this is known as the exploitation (continue doing what works well) versus exploration (try something new to discover hidden potential) problem, and it is often explained using the well-known multi-armed bandit problem. The objective of the problem is to divide a fixed set of resources between competing choices to maximize the expected gain, given that the properties of each choice are not fully known at the time of allocation.
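
To make the trade-off concrete, here is a minimal sketch of epsilon-greedy, one common bandit strategy. It is an illustration only and not necessarily the approach the article takes; the function name, the click-through rates, and the parameters are all hypothetical.

```python
import random


def epsilon_greedy(true_rates, rounds=5000, epsilon=0.1, seed=0):
    """Simulate epsilon-greedy selection over arms with hidden click rates.

    true_rates are hypothetical click-through rates, unknown to the policy.
    Returns how often each arm was shown and how often it was clicked.
    """
    rng = random.Random(seed)
    shows = [0] * len(true_rates)
    clicks = [0] * len(true_rates)

    for _ in range(rounds):
        if rng.random() < epsilon:
            # Explore: pick any arm uniformly at random.
            arm = rng.randrange(len(true_rates))
        else:
            # Exploit: pick the arm with the best observed click rate.
            # Arms that have never been shown are scored as 0.0.
            estimates = [c / s if s else 0.0 for c, s in zip(clicks, shows)]
            arm = max(range(len(true_rates)), key=estimates.__getitem__)

        shows[arm] += 1
        # Simulated user feedback: a click with the arm's true probability.
        if rng.random() < true_rates[arm]:
            clicks[arm] += 1

    return shows, clicks


if __name__ == "__main__":
    shows, clicks = epsilon_greedy([0.02, 0.05, 0.08])
    print("shows per arm:", shows)
    print("clicks per arm:", clicks)
```

A larger `epsilon` spends more impressions on discovery, while a smaller one leans on what is already known to perform well.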

How to substantially slow down your Node.js server

Back in March 2022, after spending a considerable amount of effort migrating our monolithic Node.js GraphQL server from Express to Fastify, we noticed absolutely no performance improvements in production. That hit us like a bombshell, especially because Fastify performed exceptionally well in our k6 load tests in staging, where it responded to HTTP requests 107% (more than two times) faster on average than Express!

Powering ML-Based Systems With Reliable Data: The Data Annotation Journey

In the last few years, organisations have been increasing their investments in building Machine Learning (ML) based systems. In practice, such systems often took longer than expected to be built or failed to deliver the promised outcome. Data availability and quality have been among the most significant reasons behind this phenomenon. Since each organisation had its custom problems, open datasets or even logged data were not always directly usable. As a result, data collection and annotation processes became more crucial, yet remained under-documented.

How we scaled our Prometheus setup

In 2020, we started to migrate one of our most significant workloads, our Node.js-based GraphQL API and many of its microservices, from our datacenter to Google Kubernetes Engine. We deploy it in three GCP regions, each with its own Kubernetes cluster. Since then, our monitoring infrastructure has changed due to various periods of instability and pandemic-induced scaling challenges.

How to Survive a Regional Outage

As I’m writing this, we’re in the middle of our yearly load testing process.
For a couple of years now, trivago has conducted regular production load tests. We do this to check whether all our services can sustain the increased load we experience during the summer and winter months.
This year is the second time that we are also running another test: a "regional failover" test.

SRE: On-Call Procedure at trivago

One of the many responsibilities of a Site Reliability Engineer (SRE) is to ensure uptime, availability and, in some cases, consistency of the product. In this context, the product refers to the website, APIs, microservices, and servers. This responsibility of keeping the product up and running becomes particularly interesting if the product is used around the world 24 hours a day, like trivago. And just like in the medical profession, someone has to be on call to react to failures and outages outside office hours.