RecSys Challenge 2019

Our data scientists and engineers love the challenges that their work presents to them on a daily basis and thrive in our agile environment where they can share their knowledge, learn from others, and work together to solve any problems that arise. We are always looking for ways to share the unique problem settings we encounter and to inspire a productive exchange on algorithm development and evaluation.

A New Functional Approach to Complex Types in Apache Hive

When faced with the challenge of storing, retrieving, and processing small or large amounts of data, structured query languages are typically not far away. These languages serve as a nice abstraction between the goal that is to be achieved and how it is actually done. The list of successful applications of this extra layer is long. MySQL users could switch from MyISAM to InnoDB or use new algorithms like Multi-Range-Read without a change to their application. We, as Hive users, can effortlessly switch our complete processing from MapReduce to, say, Tez or Spark. All this is possible because SQL serves as an abstraction layer in between. However, in this article, I will outline what happens when SQL, specifically HiveQL, misbehaves and which steps we are taking to recover.

Nomad - our experiences and best practices

Hello from trivago's performance & monitoring team. One important part of our job is to ship more than a terabyte of logs and system metrics per day from various data sources into Elasticsearch, several time series databases, and other data sinks. We do so by reading most of the data from multiple Kafka clusters and processing it with nearly 100 Logstash instances. Our clusters currently consist of ~30 machines running Debian 7 with bare-metal installations of the aforementioned services. This summer we decided to migrate all of this to an on-premise [Nomad](https://www.nomadproject.io/) cluster.

My Journey to trivago

I’m Behrang Yarahmadi from Iran and I’m a 3rd-year Computer Engineering student at the University of Duisburg-Essen.

Sometimes, when I look back over the time I have spent working at trivago, I see how it changed my life and how lucky I’ve been to have the chance to work with this amazing community, to live and to learn with them. I look back and see a younger version of myself looking desperately for something different and, through sheer luck, getting it.

Building fast and reliable web applications

Test, test, test. If you don’t, an issue is bound to crop up in production sooner or later.

We’ve all heard this mantra in one form or another. The importance of testing your software has been covered by countless articles, books and conferences. You have worked hard on your code coverage, and your downtime due to regression-related bugs has decreased significantly.

Nine Nations, United in Code

Ten participants from nine countries: India, Cuba, Tunisia, England, Poland, Spain, Indonesia, Malaysia, and Brazil. Even by trivago's standards, this kind of diversity was impressive.

These were the software developers who were selected for the trivago Tech Camp 2018, an eight-day event taking place at the trivago campus in Düsseldorf, Germany. The event is aimed primarily at IT students, but the admission rules are not terribly strict — basic-to-intermediate coding and problem-solving skills suffice, and many candidates sent in code samples which were so advanced that we were quite impressed. In the end, we also had a physicist on board.

Efficient Image Recovery at Scale Using Amazon S3 Versioning

If you’re using Amazon Web Services, chances are high that you’re familiar with Amazon S3. Amazon S3 (Simple Storage Service) is a widely used service where we can store a theoretically unlimited amount of data with a high availability of 99.99%. That’s why we, the Visual Content team at trivago, use Amazon S3 to store the images which you see on our website and in many other tools.

Improving Your Data Layer with Rebase on Python

Technology keeps getting better and better, which at some point makes us think, "Should I migrate to the latest version/technology or not?" Well, when you decide to use a better technology for your application, you also have to consider rewriting the code that your application runs on. The business logic remains the same in most cases, but the data model will definitely change if, for example, you are switching from SQL to some NoSQL technology.

Win a Spot in a 5-day JavaScript Workshop With Kyle Simpson!

trivago engineering is excited and looking forward to welcoming Kyle Simpson to the spectacular new trivago Campus.

Kyle will give a 5-day JavaScript workshop starting on the 6th of August, 2018. While the workshop is primarily for trivago employees, we want to share this special occasion with the community as well. Therefore, we have reserved three spots for JavaScript enthusiasts who share our love for open source projects.

AWS Kinesis with Lambdas: Lessons Learned

Almost six months ago, our team started the journey to replicate some of our data stored in on-premise MySQL machines to AWS. This included over a billion records stored in multiple tables. The new system had to be responsive enough to transfer any new incoming data from the MySQL database to AWS with minimal latency.