If you've ever built an application that needs to process tasks in the background: sending emails, generating reports, processing images, or calling AI APIs, you've probably encountered job queues. They're the backbone of scalable applications, allowing you to offload work from your main application thread and process it asynchronously.
flashQ is a high-performance job queue built specifically for modern AI workloads. It's designed to be fast, simple, and reliable, without requiring you to manage Redis or any external infrastructure.
## The Problem with Traditional Job Queues
Most job queues in the Node.js ecosystem rely on Redis. Tools like BullMQ, Bull, and Bee-Queue are excellent, but they come with a significant operational overhead:
- Redis management: You need to provision, configure, and maintain a Redis instance
- Memory costs: Redis stores everything in memory, which gets expensive at scale
- Persistence concerns: Redis persistence (RDB/AOF) requires careful tuning
- Network latency: Every job operation requires a network round-trip to Redis
- Payload limitations: Redis has practical limits on value sizes (~512MB, but performance degrades much earlier)
For AI workloads specifically, these limitations become even more painful. AI applications often need to:
- Send large payloads (embeddings, images, long text contexts)
- Chain multiple operations together (embed → search → generate)
- Rate limit API calls to avoid hitting provider quotas
- Handle long-running jobs (minutes, not seconds)
## Enter flashQ
flashQ was built from the ground up to solve these problems. Here's what makes it different:
### 1. No Redis Required
flashQ is a standalone server written in Rust. You run a single binary, and you're done. No Redis to provision, no connection strings to manage, no memory to monitor.
```bash
# Start flashQ server
./flashq-server

# Or with Docker
docker run -p 6789:6789 flashq/flashq
```
For persistence, flashQ can optionally connect to PostgreSQL. But for many use cases, the in-memory mode is perfectly sufficient.
### 2. BullMQ-Compatible API
If you're already using BullMQ, switching to flashQ is trivial. The API is intentionally compatible:
```typescript
// Before (BullMQ)
import { Queue, Worker } from 'bullmq';

// After (flashQ)
import { Queue, Worker } from 'flashq';
```
That's it. Your existing code works with minimal changes.
### 3. 10x Faster
Because flashQ eliminates the network hop to Redis and is written in Rust with careful attention to performance, it's significantly faster:
| Metric | flashQ | BullMQ + Redis |
|---|---|---|
| Push throughput | 1.9M jobs/sec | ~50K jobs/sec |
| Processing throughput | 280K jobs/sec | ~30K jobs/sec |
| Latency (p99) | <1ms | ~5-10ms |
### 4. Built for AI Workloads
flashQ has features specifically designed for AI applications:
- 10MB payload limit: Send embeddings, images, and large contexts without workarounds
- Job dependencies: Chain jobs with `depends_on` for RAG pipelines
- Rate limiting: Built-in token bucket rate limiting per queue
- Long timeouts: Jobs can run for minutes without being marked as stalled
- Progress tracking: Report progress for long-running inference jobs
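The "token bucket" named in the list above is a standard rate-limiting algorithm worth unpacking: a bucket holds up to a fixed number of tokens, refills at a steady rate, and each dispatch spends one token. Here's a minimal TypeScript sketch of the algorithm itself; this is an illustration, not flashQ's actual Rust implementation, and the class and parameter names are my own:

```typescript
// Minimal token bucket: holds up to `capacity` tokens, refills at
// `refillPerSec` tokens per second. Each dispatch spends one token;
// when the bucket is empty, the caller must wait for a refill.
class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private capacity: number,
    private refillPerSec: number,
    now: number = Date.now()
  ) {
    this.tokens = capacity; // start full, allowing an initial burst
    this.lastRefill = now;
  }

  // Refill based on elapsed time, then try to take one token.
  tryTake(now: number = Date.now()): boolean {
    const elapsedSec = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(
      this.capacity,
      this.tokens + elapsedSec * this.refillPerSec
    );
    this.lastRefill = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

// Allow bursts of 3 jobs, refilling 1 token per second.
const bucket = new TokenBucket(3, 1, 0);
console.log(bucket.tryTake(0)); // true
console.log(bucket.tryTake(0)); // true
console.log(bucket.tryTake(0)); // true
console.log(bucket.tryTake(0)); // false: bucket empty
console.log(bucket.tryTake(2000)); // true: ~2 tokens refilled after 2s
```

The nice property for AI workloads is that bursts up to `capacity` go through immediately, while sustained throughput is capped at the refill rate, which maps directly onto provider quotas like "requests per minute."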
## How flashQ Works
At its core, flashQ is a TCP server that accepts commands and manages job queues. Here's the architecture:
```
┌─────────────────┐     ┌─────────────────┐
│    Your App     │     │     Workers     │
│   (Producer)    │     │   (Consumers)   │
└────────┬────────┘     └────────┬────────┘
         │                       │
         │     TCP/HTTP/gRPC     │
         └───────────┬───────────┘
                     │
              ┌──────▼──────┐
              │   flashQ    │
              │   Server    │
              └──────┬──────┘
                     │
              ┌──────▼──────┐
              │ PostgreSQL  │  (optional)
              └─────────────┘
```
The server maintains queues in memory using efficient data structures:
- 32 shards for parallel access without lock contention
- Priority queues (binary heaps) for job ordering
- Hash maps for O(1) job lookups
- Atomic counters for job IDs and metrics
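flashQ's internals are Rust and aren't shown here, but the sharding idea from the list above is easy to sketch in TypeScript: hash each queue name to one of 32 shards so concurrent producers rarely touch the same structure, and keep a binary min-heap per shard for priority ordering. The FNV-1a hash, the `Job` shape, and the heap layout below are illustrative assumptions, not flashQ's actual code:

```typescript
// Sketch of a sharded priority queue: queue names hash to one of
// 32 shards, and each shard orders its jobs with a binary min-heap
// (lower priority value = dispatched first).
interface Job {
  id: number;
  queue: string;
  priority: number;
}

const SHARDS = 32;

// FNV-1a string hash to pick a shard for a queue name.
function shardFor(queue: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < queue.length; i++) {
    h ^= queue.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h % SHARDS;
}

class MinHeap {
  private items: Job[] = [];

  push(job: Job): void {
    this.items.push(job);
    let i = this.items.length - 1;
    while (i > 0) {
      const parent = (i - 1) >> 1; // sift up while smaller than parent
      if (this.items[parent].priority <= this.items[i].priority) break;
      [this.items[parent], this.items[i]] = [this.items[i], this.items[parent]];
      i = parent;
    }
  }

  pop(): Job | undefined {
    const top = this.items[0];
    const last = this.items.pop();
    if (this.items.length > 0 && last !== undefined) {
      this.items[0] = last; // move last item to root, then sift down
      let i = 0;
      for (;;) {
        const l = 2 * i + 1, r = 2 * i + 2;
        let smallest = i;
        if (l < this.items.length && this.items[l].priority < this.items[smallest].priority) smallest = l;
        if (r < this.items.length && this.items[r].priority < this.items[smallest].priority) smallest = r;
        if (smallest === i) break;
        [this.items[smallest], this.items[i]] = [this.items[i], this.items[smallest]];
        i = smallest;
      }
    }
    return top;
  }
}

const shards: MinHeap[] = Array.from({ length: SHARDS }, () => new MinHeap());

function push(job: Job): void {
  shards[shardFor(job.queue)].push(job);
}

function pop(queue: string): Job | undefined {
  return shards[shardFor(queue)].pop();
}

push({ id: 1, queue: 'emails', priority: 5 });
push({ id: 2, queue: 'emails', priority: 1 });
console.log(pop('emails')?.id); // 2: lower priority value first
```

The payoff of sharding is that two producers pushing to different queues almost always land on different shards, so there's no single global lock serializing every push.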
## A Simple Example
Let's build a simple AI pipeline that generates embeddings and stores them:
```typescript
import { Queue, Worker } from 'flashq';
import OpenAI from 'openai';

const openai = new OpenAI();

// Create a queue
const queue = new Queue('embeddings');

// Add a job
await queue.add('generate', {
  text: 'The quick brown fox jumps over the lazy dog',
  model: 'text-embedding-3-small'
});

// Process jobs
const worker = new Worker('embeddings', async (job) => {
  const { text, model } = job.data;

  // Call the OpenAI embeddings API
  const response = await openai.embeddings.create({
    input: text,
    model
  });

  // Return the embedding
  return response.data[0].embedding;
});
```
## When Should You Use flashQ?
flashQ is ideal for:
- AI/ML pipelines: LLM calls, embeddings, image generation, batch inference
- Startups and small teams: No infrastructure to manage
- High-throughput applications: When you need more than 30K jobs/sec
- Large payloads: When your job data exceeds a few KB
- Development environments: No Docker Compose file just to run Redis
You might prefer BullMQ + Redis if:
- You're already running Redis for other purposes (caching, sessions)
- You need Redis-specific features (pub/sub, streams)
- Your team has deep Redis expertise
## Getting Started
Ready to try flashQ? It takes about 5 minutes to get started:
```bash
# Install the SDK
npm install flashq

# Start the server (using Docker)
docker run -d -p 6789:6789 flashq/flashq

# Or download the binary
curl -L https://github.com/egeominotti/flashq/releases/latest/download/flashq-linux -o flashq
chmod +x flashq
./flashq
```
Then in your code:
```typescript
import { Queue, Worker } from 'flashq';

const queue = new Queue('my-queue');
await queue.add('task', { hello: 'world' });

const worker = new Worker('my-queue', async (job) => {
  console.log(job.data); // { hello: 'world' }
});
```
Check out the documentation for advanced features like job dependencies, rate limiting, and clustering.
## Conclusion
flashQ represents a new approach to job queues, one that prioritizes simplicity and performance without sacrificing features. If you're building AI applications and tired of managing Redis, give flashQ a try.
We're open source and actively developing new features. Join us on GitHub and let us know what you think!