A distributed computing platform that solves the N-Queens problem by leveraging web browsers as compute nodes. Workers run as WebAssembly modules using both CPU (Web Workers) and GPU (WebGPU compute shaders), orchestrated through a RabbitMQ message queue and a Java WebSocket-to-AMQP bridge.
+-------------------+ +----------+ +--------------------+ +----------------+
| Orchestrator | AMQP | RabbitMQ | AMQP | WebSocket2AMQP | WSS | Browser(s) |
| (Java) |------>| |------>| Bridge (Java) |------>| |
| | | "Task" | | Port 8443 | | GPU Worker |
| Generates tasks | | "Return"|<------| |<------| (WebGPU/WGSL) |
| Collects results |<------| | | Routes messages | | |
+-------------------+ +----------+ +--------------------+ | CPU Workers |
| (Web Workers) |
+----------------+
Data flow:
- The Orchestrator generates N-Queens partial states via recursive exploration and publishes batches of 256 tasks to the
Taskqueue - The WebSocket2AMQP Bridge pulls tasks from the queue (with explicit ACK) and distributes them to connected browser clients via WebSocket
- Browser workers solve the tasks using backtracking with bitmask optimization (CPU) or GPU compute shaders (WebGPU)
- Results are sent back through the bridge to the
Returnqueue - The Orchestrator aggregates solution counts and reports the final result
hivequeen/
├── application/ # Task orchestrator (Java/Gradle)
│ └── app/src/main/java/org/hivequeen/
│ ├── App.java # CLI entry point (n, depth args)
│ ├── Problem.java # Generic distributed problem interface
│ ├── NQueenProblem.java # N-Queens implementation
│ ├── Orchestrator.java # Task distribution & result aggregation
│ └── State.java # N-Queens board state (bitmask)
│
├── websocket2amqp/ # WebSocket-to-AMQP bridge (Java/Gradle)
│ └── app/src/main/java/org/hivequeen/
│ ├── App.java # SSL/TLS setup
│ ├── WebSocket2AmqpServer.java # Bridge with explicit ACK & timeout
│ └── WebSocketClient.java # Client state & task tracking
│
├── wasm_cpu_worker/ # CPU solver (Rust -> WebAssembly)
│ └── src/lib.rs # Backtracking with bitmask optimization
│
├── wasm_gpu_worker/ # GPU solver (Rust + wgpu -> WebAssembly)
│ └── src/
│ ├── lib.rs # WebGPU setup & adaptive benchmarking
│ └── nqueen_compute_shader.wgsl # WGSL compute shader
│
├── site/ # Frontend dashboard
│ └── html/
│ ├── index.html # Dashboard layout
│ ├── style.css # Dark theme styling
│ ├── script.js # WebSocket, workers, real-time UI
│ └── worker.js # Web Worker loader for CPU solver
│
├── docker-compose.yml # Full system deployment
└── test_e2e.sh # E2E integration test script
- Java JDK 21+
- Rust with WASM target:
rustup target add wasm32-unknown-unknown cargo install wasm-bindgen-cli
- Docker (for RabbitMQ and deployment)
The easiest way to run the full system:
docker-compose up --buildThis starts RabbitMQ (with healthcheck), the WebSocket bridge, the orchestrator, and the frontend. Configure credentials via a .env file at the project root:
AMQP_USERNAME=guest
AMQP_PASSWORD=guest
CERTIFICATE_PSWD=changeitThe orchestrator accepts CLI arguments for problem size:
# Default: n=18, depth=5
docker-compose run orchestrator ./bin/app 12 3docker run -d --name rabbitmq -p 5672:5672 -p 15672:15672 rabbitmq:3-managementManagement UI: http://localhost:15672 (guest / guest).
# CPU worker
cd wasm_cpu_worker
.\run.ps1
# GPU worker
cd ..\wasm_gpu_worker
.\run.ps1Each script compiles Rust to WebAssembly and copies the output to site/html/pkg/.
Create websocket2amqp/app/.env:
AMQP_HOST=localhost
AMQP_USERNAME=guest
AMQP_PASSWORD=guest
CERTIFICATE_PSWD=<your_certificate_password>Place your SSL certificate as websocket2amqp/app/websocket.p12, then:
cd websocket2amqp
.\gradlew.bat shadowJar
cd app
java -jar build/libs/app-all.jarCreate application/app/.env:
AMQP_HOST=localhost
AMQP_USERNAME=guest
AMQP_PASSWORD=guestcd application
.\gradlew.bat build
cd app
java -jar build/libs/app.jar [n] [preplacedQueens]Default: n=18, preplacedQueens=5. Minimum: n >= 4, preplacedQueens >= 0 && < n.
The frontend auto-detects the WebSocket address from the page hostname. For local development, use the ?ws= query parameter:
https://localhost/?ws=wss://localhost:8443
Serve site/html/ with any HTTPS-capable web server, or use Docker:
cd site
docker build -t hivequeen-site .
docker run -p 443:443 hivequeen-siteOpen in a WebGPU-compatible browser (Chrome 113+, Edge 113+). Non-WebGPU browsers (Firefox, Safari) automatically fall back to CPU-only mode.
The browser frontend includes a real-time monitoring dashboard:
- Solutions counter — total solutions found with live rate (solutions/s)
- Elapsed timer — time since first computation batch
- Connection status — GPU and CPU WebSocket state with animated indicators
- Compute mode — GPU + CPU or CPU-only (auto-detected)
- Throughput chart — 60-second rolling GPU vs CPU batch/s sparkline with area fill
- Mini chessboard — visualization of the last solved task with queen positions
- Activity log — timestamped feed of connections, disconnections, and events
- GPU info — WebGPU adapter name and configuration
The orchestrator recursively pre-places queens (configurable depth, default 5) on the board, filtering invalid placements at each level. Each valid partial state becomes a task containing:
- Board state as bitmasks (
column_status,left_diagonal_status,right_diagonal_status) - Search history array for backtracking
- A unique
task_id(thread-safe viaAtomicInteger)
Tasks are batched in groups of 256 and published to the Task AMQP queue. The system uses a generic Problem interface, making it possible to add new problem types without modifying the orchestrator or bridge.
Each browser spawns navigator.hardwareConcurrency Web Workers. Each worker runs a Rust-compiled WASM module that performs iterative backtracking with bitmask-based conflict detection for O(1) safety checks.
A single WebGPU compute shader processes 256 tasks in parallel (one per thread in a workgroup). The Rust host code includes adaptive benchmarking that:
- Finds the optimal number of workgroups by doubling until performance degrades
- Finds the optimal iteration count per dispatch by incrementing until throughput stabilizes
- Explicit ACK: the bridge uses
basicGetwith manual acknowledgement — tasks are only ACKed after the client returns results - Task timeout: a scheduler checks every 30s for tasks older than 60s and re-queues them via
basicNack - Stale client detection: clients with multiple consecutive timeouts are disconnected
- WebSocket reconnection: browser clients automatically reconnect with exponential backoff (1s-30s) and jitter
- GPU fallback: if WebGPU is unavailable, the browser operates in CPU-only mode
| Parameter | Location | Default |
|---|---|---|
| N-Queens size | CLI arg 1 | 18 |
| Pre-placed queens | CLI arg 2 | 5 |
| Batch size | Multiple files | 256 |
| WebSocket URL | Auto-detect or ?ws= param |
wss://<hostname>:8443 |
| Bridge port | websocket2amqp/.../App.java |
8443 |
| AMQP credentials | .env files or env vars |
guest/guest |
| Task timeout | WebSocket2AmqpServer.java |
60s |
| Reconnect backoff | script.js |
1s-30s |
The system's sequential fraction is minimal:
| Phase | Type | Time (n=18) | Fraction |
|---|---|---|---|
Task generation (App.java) |
Sequential | ~2s | < 1% |
AMQP publish (Orchestrator.runTasks) |
Sequential | ~3s | < 1% |
| Result aggregation (consume callback) | Sequential (per message) | ~0.001s/msg | < 0.1% |
| Solving (WASM workers) | Parallel | minutes-hours | > 98% |
With sequential fraction s ~ 0.02:
-
Amdahl's law (fixed problem size):
speedup = 1 / (s + (1-s)/N)- N=4 workers: ~3.8x
- N=16 workers: ~12.3x
- N=inf: ~50x (limited by s)
-
Gustafson's law (scaled problem size):
speedup = s + (1-s) * N- N=4: ~3.9x
- N=16: ~15.7x
- Scales nearly linearly because s is so small
The N-Queens backtracking exhibits divergent control flow (different threads take different branch paths), which is not ideal for GPU. However, the system mitigates this by:
- Bounded iterations: each dispatch runs a fixed
max_iterations, keeping threads roughly synchronized - Re-queuing: incomplete tasks return to the queue for further processing
- Massive parallelism: 256 threads per workgroup x multiple workgroups compensates for divergence inefficiency
- Network latency: WebSocket round-trips add overhead per batch (~5-50ms)
- JSON serialization: tasks are serialized as JSON over WebSocket (~1-2ms per batch)
- Browser volatility: workers can disappear at any time (mitigated by task timeout + re-queue)
- GPU divergence: backtracking causes warp divergence, reducing GPU efficiency to ~30-60% of theoretical peak
# Java unit tests (19 tests: State + NQueenProblem)
cd application && ./gradlew test
# Rust unit tests (7 tests: solver + JSON handling)
cd wasm_cpu_worker && cargo test
# E2E integration test (requires Docker + RabbitMQ)
./test_e2e.sh| Component | Technology |
|---|---|
| GPU compute | WebGPU, wgpu 24.0, WGSL |
| CPU compute | Web Workers, WebAssembly |
| WASM compilation | Rust 2021, wasm-bindgen |
| Message broker | RabbitMQ (AMQP) |
| Bridge server | Java 21, java-websocket, amqp-client |
| Build system | Gradle 8.12 (Java), Cargo (Rust) |
| Frontend | Vanilla JS, Canvas API |
| Deployment | Docker, docker-compose |

