# Introduction
runqy is a distributed task queue system designed for machine learning inference and other workloads that require:
- Stateless workers that receive configuration at startup
- Server-driven bootstrap for centralized control
- Long-running Python processes that stay warm between tasks
## Key Concepts

### Server-Driven Bootstrap
Unlike traditional task queues where workers are pre-configured, runqy workers start with minimal configuration (just the server URL and API key). At startup, they:
1. Register with the server via `POST /worker/register`
2. Receive Redis credentials and deployment configuration
3. Clone the task code from a git repository
4. Set up a Python virtual environment
5. Start the Python process and wait for it to be ready
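The registration step above can be sketched as follows. This is an illustrative sketch only: the response shape, field names, and `WorkerConfig` type are assumptions for this example, not the actual runqy API.

```python
# Hypothetical sketch of runqy's server-driven bootstrap: the worker
# starts with only a server URL and API key, then derives everything
# else (Redis credentials, deployment config) from the registration
# response. Field names below are assumed for illustration.

from dataclasses import dataclass


@dataclass
class WorkerConfig:
    redis_url: str
    repo_url: str
    queue: str


def parse_registration(response: dict) -> WorkerConfig:
    """Turn a POST /worker/register response into a local worker config."""
    return WorkerConfig(
        redis_url=response["redis"]["url"],
        repo_url=response["deployment"]["git_repo"],
        queue=response["deployment"]["queue"],
    )


# Example of what a server might return at registration time:
example_response = {
    "redis": {"url": "redis://10.0.0.5:6379/0"},
    "deployment": {
        "git_repo": "https://example.com/tasks.git",
        "queue": "inference",
    },
}
cfg = parse_registration(example_response)
```

After parsing, the worker would use `cfg.repo_url` to clone the task code and `cfg.redis_url` to connect to the queue backend.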
### One Worker = One Parent Queue = One Process
Each worker instance processes tasks from a single parent queue using a single supervised Python process. Sub-queues of the same parent (e.g., `inference.high` and `inference.low`) share the same code deployment and runtime.
Queue configuration options:
- `queue: "inference"`: listen on all sub-queues of `inference`
- `queues: [inference.high, inference.low]`: listen only on specific sub-queues
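As a sketch, the two options above might appear in a worker's deployment configuration like this (YAML layout assumed for illustration; only the `queue`/`queues` fields come from the options above):

```yaml
# Listen on all sub-queues of "inference":
queue: "inference"

# Or instead, listen only on specific sub-queues:
# queues: [inference.high, inference.low]
```

The two fields are alternatives: a worker either subscribes to a whole parent queue or to an explicit list of its sub-queues.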
This constraint ensures:
- Predictable resource usage
- Simple failure isolation
- Easy horizontal scaling (add more workers for more capacity)
- Efficient resource sharing across sub-queues with different priorities
### Long-Running vs One-Shot Modes
runqy supports two execution modes:
- Long-running (`long_running`): the Python process stays alive between tasks, ideal for ML inference where model loading is expensive
- One-shot (`one_shot`): a new Python process is spawned for each task
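The trade-off between the two modes can be illustrated with a minimal sketch. This is not runqy's worker protocol; `load_model` and `run_task` are stand-ins for a real (expensive) model load and a real inference call.

```python
# Illustrative contrast between long-running and one-shot execution.
# In a real ML workload, load_model() might take minutes; here it is a
# cheap stand-in so the example stays runnable.

def load_model():
    # Stand-in for an expensive model load (e.g., reading weights from disk).
    return {"weights": [1, 2]}


def run_task(model, task):
    # Stand-in for inference: a trivial computation over the "weights".
    return sum(model["weights"]) * task["x"]


def long_running(tasks):
    # Long-running mode: the model is loaded once and stays warm,
    # so its cost is amortized across every task.
    model = load_model()
    return [run_task(model, t) for t in tasks]


def one_shot(tasks):
    # One-shot mode: each task pays the full startup cost again
    # (modeled here as a fresh load_model() call per task).
    return [run_task(load_model(), t) for t in tasks]


results = long_running([{"x": 1}, {"x": 10}])
```

Both modes produce the same results; they differ only in how often the startup cost is paid, which is why `long_running` is the natural fit for ML inference.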
## Next Steps
- Quick Start — Set up a local development environment
- Architecture — Deep dive into how the components interact