Arroyo
About Arroyo
Arroyo is a serverless stream processing platform that enables users to execute SQL queries on real-time data streams from Kafka, allowing for the transformation, filtering, and aggregation of data with sub-second latency. It eliminates the need for a dedicated operations team by scaling automatically to handle millions of events per second while ensuring exactly-once processing semantics.
```xml <problem> Traditional stream processing systems often require dedicated operations teams to manage infrastructure and ensure scalability, adding complexity and overhead. Existing solutions can also struggle to provide both analytical SQL capabilities and sub-second latency for real-time data processing. </problem> <solution> Arroyo is a serverless stream processing platform designed for cloud-native environments, enabling users to execute analytical SQL queries on real-time data streams with sub-second latency. The platform automatically scales from zero to millions of events per second without requiring a dedicated operations team. Built on Rust and the Arrow in-memory analytics format, Arroyo offers high performance and exactly-once processing semantics. It allows data scientists and engineers to build end-to-end real-time applications and dashboards using familiar SQL syntax. </solution> <features> - Analytical SQL support for transforming, filtering, aggregating, and joining data streams - Serverless architecture that automatically scales to handle varying workloads - Exactly-once processing semantics to ensure data accuracy and consistency - Native support for JSON, Avro, Parquet, and raw text/binary formats - User-defined functions (UDFs) for extending SQL functionality with Rust (and soon Python) - Web UI for managing connections, developing SQL queries, and monitoring pipelines - REST API for declarative orchestration and management at scale - Support for time windows (sliding, tumbling, and session) with watermark processing - Connectors for integrating with various data sources and sinks, including Kafka, Kinesis, Confluent Cloud, and more </features> <target_audience> Arroyo is targeted towards data scientists, data engineers, and application developers who need to build real-time data pipelines and applications with low latency and high scalability. </target_audience> ```
What does Arroyo do?
Arroyo is a serverless stream processing platform that enables users to execute SQL queries on real-time data streams from Kafka, allowing for the transformation, filtering, and aggregation of data with sub-second latency. It eliminates the need for a dedicated operations team by scaling automatically to handle millions of events per second while ensuring exactly-once processing semantics.
Where is Arroyo located?
Arroyo is based in Berkeley, United States.
When was Arroyo founded?
Arroyo was founded in 2022.
How much funding has Arroyo raised?
Arroyo has raised 500000.
Who founded Arroyo?
Arroyo was founded by Micah Wylde and Jackson Newhouse.
- Micah Wylde - Founder
- Jackson Newhouse - Founder
- Location
- Berkeley, United States
- Founded
- 2022
- Funding
- 500000
- Employees
- 2 employees
- Major Investors
- Y Combinator