What is stream processing?
Modern businesses operate in the moment. Consumers expect their apps to always be up to date. Threats and attacks can happen in seconds. And operations teams need to respond to issues in real time.
Stream processing operates on events as they arrive, providing answers in seconds instead of hours or days.
But today stream processing is too hard. Existing tools like Apache Flink are complex. They require deep expertise to build and operate correct, reliable, and performant pipelines.
Arroyo is a new kind of stream processing engine, built to make real-time as easy as batch.
SQL that just works
Optimized from the SQL planner to the storage layer for excellent, unsurprising SQL support. Build reliable, efficient streaming pipelines without specialized streaming knowledge.
Designed for the cloud
Designed from the ground up to run in modern, elastic cloud environments.
Run on the Arroyo Cloud, or self-host with Kubernetes or Nomad.
Arroyo comes with an automated control plane out of the box, so you don't need to manage pipelines manually. Reliable and efficient state checkpointing prevents data loss.
How it works
Real-time with Arroyo
Arroyo lets you build streaming pipelines by writing the same analytical SQL queries you are already running in your data warehouse, with a few extensions for real-time. See our SQL docs for the details.
CREATE VIEW tags AS (
  SELECT tag FROM (
    SELECT extract_json_string(value, '$.tags[*].name') AS tag
    FROM mastodon)
  WHERE tag IS NOT NULL
);

SELECT * FROM (
  SELECT *, ROW_NUMBER() OVER (
    PARTITION BY window
    ORDER BY count DESC) AS row_num
  FROM (
    SELECT count(*) AS count, tag,
      hop(interval '5 seconds', interval '15 minutes') AS window
    FROM tags
    GROUP BY tag, window))
WHERE row_num <= 5;
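To make the windowing in the query above concrete, here is a minimal Python sketch of what a hopping window with a top-N ranking computes. It is an illustration of the semantics only, not how Arroyo implements them. It assumes that the first argument to hop is the slide (5 seconds) and the second is the window width (15 minutes), that windows start at multiples of the slide, and that the width is a whole multiple of the slide. The function names hop_windows and top_tags_per_window are hypothetical and exist only for this example.

```python
from collections import Counter, defaultdict


def hop_windows(ts, slide, width):
    """Return the start times of every hopping window that contains ts.

    Assumes windows start at multiples of `slide`, cover
    [start, start + width), and that width is a multiple of slide.
    """
    last_start = (ts // slide) * slide        # latest window starting at or before ts
    first_start = last_start - width + slide  # earliest window that still covers ts
    return list(range(first_start, last_start + 1, slide))


def top_tags_per_window(events, slide, width, k=5):
    """Count tags per hopping window and keep the k most frequent.

    events: iterable of (timestamp_in_seconds, tag) pairs.
    Returns {window_start: [(tag, count), ...]} ordered by window start.
    """
    counts = defaultdict(Counter)
    for ts, tag in events:
        if tag is None:  # mirrors the null filter in the tags view
            continue
        for start in hop_windows(ts, slide, width):
            counts[start][tag] += 1
    # most_common(k) plays the role of ROW_NUMBER() ... WHERE row_num <= k
    return {start: c.most_common(k) for start, c in sorted(counts.items())}


if __name__ == "__main__":
    # Tiny example with a 5 second slide and a 15 second width.
    sample = [(1, "rust"), (2, "rust"), (7, "sql"), (12, "rust"), (13, None)]
    for start, ranked in top_tags_per_window(sample, slide=5, width=15).items():
        print(start, ranked)
```

With a 5 second slide and a 15 minute (900 second) width, each event falls into 180 overlapping windows, which is why the query produces a fresh top-5 list every 5 seconds. Unlike this batch sketch, a streaming engine processes unbounded input and has to decide when each window is complete before emitting its result.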