How It Works
How Fusionaly's SQLite-based ingestion pipeline handles hundreds of thousands of events per day
Fusionaly uses SQLite with deliberate buffering and backpressure. No heavyweight cluster. Every layer—browser SDK, HTTP ingress, WAL-backed writes, and async aggregation—embraces eventual consistency. Bursts slow down briefly instead of overwhelming the database. Dashboards can lag by a minute during spikes while raw events stay durable.
Key guarantees:
- Multi-layer retries prevent dropped events (browser → server → database)
- Ingestion writes straight to disk (no volatile buffers)
- Background jobs convert raw events into hourly aggregates
- Backpressure favors delays over failed writes
Retry Strategy
Every layer has retries. Events don’t get lost.
| Layer | Retry Behavior |
|---|---|
| Browser SDK | 3 retries with exponential backoff (1s, 2s, 4s). Caches up to 100 events in localStorage when offline. |
| HTTP Ingress | Returns 503 + Retry-After when saturated. SDK respects this. |
| Database Writes | PerformWrite retries up to 10 times with exponential backoff on SQLITE_BUSY. |
| Background Jobs | Failed batches stay in queue. No row advances until successfully processed. |
The browser SDK uses navigator.sendBeacon() for page unload events—these survive even if the user closes the tab.
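The database-side retry loop can be sketched as follows. This is a minimal Python illustration of the PerformWrite pattern described above, not the actual implementation (which is Go); the function name and delay values here are assumptions for illustration:

```python
import sqlite3
import time

def perform_write(write_fn, max_retries=10, base_delay=0.01):
    """Run write_fn, retrying with exponential backoff when SQLite
    reports lock contention (SQLITE_BUSY / "database is locked")."""
    for attempt in range(max_retries):
        try:
            return write_fn()
        except sqlite3.OperationalError as exc:
            msg = str(exc)
            if "locked" not in msg and "busy" not in msg:
                raise  # only lock contention is worth retrying
            if attempt == max_retries - 1:
                raise  # out of retries: surface the error
            # back off exponentially: 10ms, 20ms, 40ms, ...
            time.sleep(base_delay * 2 ** attempt)
```

The key property is that a transient `SQLITE_BUSY` never surfaces to the caller unless all ten attempts fail; any other error propagates immediately.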
Data Flow
```
Browser SDK (200ms batches, 100-event localStorage cache)
        |
        v
/x/api/v1/events (HTTP ingress + concurrency limiter)
        |
        v
ingested_events table (durable queue)
        |
        v
Background job (100-row batches)
        |
        v
events table (authoritative log)
        |
        v
Aggregate upserts
        |
        v
site_stats / page_stats / ref_stats / ... (hourly buckets)
        |
        v
Dashboards (eventually consistent)
```

This embraces SQLite’s single-writer model instead of fighting it. Buffers are explicit, backpressure is intentional, and aggregates can lag briefly without threatening correctness.
Storage Layer
SQLite with WAL mode, busy_timeout = 5000, and a connection pool (10 open / 5 idle). The PerformWrite helper retries transactions up to 10 times with exponential backoff.
Three queueing layers:
- HTTP handler - Concurrency limiter returns 503 + Retry-After when saturated
- Connection pool - Goroutines wait for a connection instead of hammering the DB
- WAL queue - Serializes writes while allowing concurrent reads
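The first queueing layer can be sketched as a counting semaphore. The real limiter is Go middleware; this Python sketch only illustrates the admission/shedding idea, and the class name, slot count, and status tuple are hypothetical:

```python
import threading

class ConcurrencyLimiter:
    """Admit up to max_in_flight requests at once; shed the excess with
    503 + Retry-After so clients back off instead of piling onto the DB."""

    def __init__(self, max_in_flight):
        self._slots = threading.BoundedSemaphore(max_in_flight)

    def handle(self, process):
        # Non-blocking acquire: a saturated server answers immediately
        # rather than queueing the request in memory.
        if not self._slots.acquire(blocking=False):
            return 503, {"Retry-After": "1"}
        try:
            return 202, process()
        finally:
            self._slots.release()
```

Shedding with an explicit Retry-After (rather than blocking) keeps memory bounded and pushes the wait onto the SDK, which already knows how to back off and retry.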
Processing Pipeline
- Browser SDK - Batches events (200ms intervals, ≤10 per batch), retries with backoff, caches up to 100 events in localStorage when offline
- Ingestion API - Validates, normalizes URLs, filters excluded IPs, persists to ingested_events
- Background processor - Runs every 60s, drains pending rows in 100-row batches into events
- Aggregate upserts - INSERT ... ON CONFLICT into hourly tables (site_stats, page_stats, etc.)
No row is dropped. Every failure triggers a retry before advancing.
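The drain-and-upsert steps can be sketched end to end against an in-memory SQLite database. The batch size and table names follow the description above, but the column layout is a simplified assumption, and counting every event as a visitor is a stand-in for the real visitor logic:

```python
import sqlite3

BATCH_SIZE = 100

def drain(conn):
    """Move pending rows from ingested_events into events in 100-row
    batches, upserting hourly counts into site_stats. Loops until the
    queue is empty; a failed batch simply stays queued for the next run."""
    while True:
        rows = conn.execute(
            "SELECT id, website_id, hour FROM ingested_events "
            "ORDER BY id LIMIT ?", (BATCH_SIZE,)).fetchall()
        if not rows:
            break
        with conn:  # one transaction per batch
            conn.executemany(
                "INSERT INTO events (website_id, hour) VALUES (?, ?)",
                [(w, h) for _, w, h in rows])
            conn.executemany(
                "INSERT INTO site_stats (website_id, hour, visitors) "
                "VALUES (?, ?, 1) "
                "ON CONFLICT(website_id, hour) "
                "DO UPDATE SET visitors = visitors + 1",
                [(w, h) for _, w, h in rows])
            conn.executemany(
                "DELETE FROM ingested_events WHERE id = ?",
                [(i,) for i, _, _ in rows])

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ingested_events (id INTEGER PRIMARY KEY, website_id INT, hour TEXT);
    CREATE TABLE events (id INTEGER PRIMARY KEY, website_id INT, hour TEXT);
    CREATE TABLE site_stats (website_id INT, hour TEXT, visitors INT,
                             PRIMARY KEY (website_id, hour));
""")
conn.executemany(
    "INSERT INTO ingested_events (website_id, hour) VALUES (?, ?)",
    [(1, "2024-01-01 10:00:00")] * 250)
conn.commit()
drain(conn)  # 250 queued rows drain in three batches: 100, 100, 50
```

Because rows are only deleted in the same transaction that copies and aggregates them, a crash mid-batch leaves the queue intact rather than losing events.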
Hot vs Cold Data
| Path | Tables | Consistency | Use case |
|---|---|---|---|
| Hot | ingested_events, events | Synchronous | Visitor Transparency, exact queries |
| Cold | *_stats tables | Eventually consistent | Dashboards, reports |
Cold tables are hourly buckets—even multi-year queries read only ~24 rows per day per dimension.
Burst Handling
During a spike (1,000 events/second):
- Browsers coalesce requests → ~100 HTTP req/s actually hit the server
- Ingestion accepts everything into ingested_events, responds fast
- Background processor drains queue in 100-row chunks, loops until empty
- Aggregates lag briefly but remain ordered
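The coalescing arithmetic above works out as a back-of-the-envelope check, assuming every SDK batch fills to its 10-event cap:

```python
events_per_second = 1_000
max_batch_size = 10  # SDK caps each request at 10 events

# Full batches: each HTTP request carries 10 events,
# so request volume is one tenth of event volume.
requests_per_second = events_per_second // max_batch_size  # 100

# A 60-second spike leaves 60,000 rows queued in ingested_events;
# the background processor drains them in 100-row transactions.
queued_rows = events_per_second * 60
drain_batches = queued_rows // 100  # 600 batches to catch up
```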
Performance
Reads: Most dashboard queries < 10ms. Hourly aggregates keep queries cheap.
Writes: WAL mode + batch transactions (100 events/commit) + client/server retries. No RAM buffer needed.
```sql
-- Typical dashboard query (single-digit ms)
SELECT strftime('%Y-%m-%d', hour) AS day,
       SUM(visitors) AS visitors
FROM site_stats
WHERE website_id = ?
  AND hour BETWEEN datetime('now', '-30 days') AND datetime('now')
GROUP BY day;
```

Storage Estimates
~240 bytes per event (including indexes and aggregate overhead).
| Traffic | Daily Events | Daily Growth | 1-year | 5-year |
|---|---|---|---|---|
| 1K visits/day | ~3,750 | ~0.7 MB | ~0.3 GB | ~2 GB |
| 10K visits/day | ~37,500 | ~7 MB | ~2.6 GB | ~19 GB |
| 100K visits/day | ~375,000 | ~72 MB | ~26 GB | ~190 GB |
Storage grows linearly with time, not exponentially with traffic spikes.
On SQLite limits: SQLite theoretically supports 281 TB databases. You will never hit this. A site with 100K daily visits accumulates ~26 GB/year. At that rate, reaching SQLite’s limit would take about 10,000 years. Plan for disk space, not database limits.
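The 10,000-year figure checks out with the document's own rounded numbers:

```python
sqlite_max_bytes = 281 * 10**12  # SQLite's theoretical ~281 TB database limit
yearly_growth = 26 * 10**9       # ~26 GB/year at 100K visits/day

# Roughly 10,800 years before the database file hits the limit.
years_to_limit = sqlite_max_bytes / yearly_growth
```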
Summary
SQLite works here because hot tables stay lean, dashboards query compact hourly aggregates, and writes never block readers. A modest instance handles ~400 req/sec—enough for hundreds of thousands of events per day.