At trivago, we have several workflows that interact with external services. The health and availability of those services affect whether our workflows stay alive and responsive. Think of an API call made to an external service that is down. Our workflows have to expect such errors and adapt to them.
Almost six months ago, our team started the journey to replicate some of our data stored on on-premises MySQL machines to AWS. This included over a billion records stored in multiple tables. The new system had to be responsive enough to transfer any new incoming data from the MySQL database to AWS with minimal latency.