Streaming data to S3 is surprisingly hard
Arroyo 0.5 added the FileSystem connector, a high-performance, transactional sink that writes pipeline outputs to file systems and object stores like S3, which makes Arroyo a great tool for real-time ETL. Writing streaming data to object storage reliably turns out to be surprisingly tricky to do well. Read on for a deep dive into how Arroyo solved this with a new checkpointing strategy and some clever Parquet tricks.

Jackson Newhouse
CTO of Arroyo
