How It Works

How Fusionaly's SQLite-based ingestion pipeline handles hundreds of thousands of events per day

Fusionaly uses SQLite with deliberate buffering and backpressure. No heavyweight cluster. Every layer—browser SDK, HTTP ingress, WAL-backed writes, and async aggregation—embraces eventual consistency. Bursts slow down briefly instead of overwhelming the database. Dashboards can lag by a minute during spikes while raw events stay durable.

Key guarantees:

  • Multi-layer retries prevent dropped events (browser → server → database)
  • Ingestion writes straight to disk (no volatile buffers)
  • Background jobs convert raw events into hourly aggregates
  • Backpressure favors delays over failed writes

Every layer has retries. Events don’t get lost.

Layer | Retry Behavior
Browser SDK | 3 retries with exponential backoff (1s, 2s, 4s). Caches up to 100 events in localStorage when offline.
HTTP Ingress | Returns 503 + Retry-After when saturated. SDK respects this.
Database Writes | PerformWrite retries up to 10 times with exponential backoff on SQLITE_BUSY.
Background Jobs | Failed batches stay in queue. No row advances until successfully processed.

The browser SDK uses navigator.sendBeacon() for page unload events—these survive even if the user closes the tab.

Browser SDK (200ms batches, 100-event localStorage cache)
|
v
/x/api/v1/events (HTTP ingress + concurrency limiter)
|
v
ingested_events table (durable queue)
|
v Background job (100-row batches)
events table (authoritative log)
|
v Aggregate upserts
site_stats / page_stats / ref_stats / ... (hourly buckets)
|
v
Dashboards (eventually consistent)

The design embraces SQLite’s single-writer model instead of fighting it. Buffers are explicit, backpressure is intentional, and aggregates can lag briefly without threatening correctness.

The storage layer is SQLite with WAL mode, busy_timeout = 5000, and a connection pool (10 open / 5 idle). The PerformWrite helper retries transactions up to 10 times with exponential backoff.
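A minimal sketch of what a PerformWrite-style helper could look like. The names, backoff values, and the busy-error stand-in are assumptions; the real helper wraps a transaction against the pooled SQLite connection and checks the driver's actual busy/locked error codes:

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

var errBusy = errors.New("SQLITE_BUSY") // stand-in for the driver's busy error

// performWrite runs fn, retrying up to maxRetries times with exponential
// backoff, but only when the failure is a busy/locked error. Any other
// error (or success) returns immediately.
func performWrite(fn func() error, maxRetries int) error {
	delay := 10 * time.Millisecond
	var err error
	for attempt := 0; attempt <= maxRetries; attempt++ {
		if err = fn(); err == nil || !errors.Is(err, errBusy) {
			return err
		}
		time.Sleep(delay)
		delay *= 2
	}
	return fmt.Errorf("write failed after %d retries: %w", maxRetries, err)
}

func main() {
	attempts := 0
	err := performWrite(func() error {
		attempts++
		if attempts < 3 {
			return errBusy // simulate a locked database on the first two tries
		}
		return nil
	}, 10)
	fmt.Println(attempts, err) // 3 <nil>
}
```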

Three queueing layers:

  1. HTTP handler - Concurrency limiter returns 503 + Retry-After when saturated
  2. Connection pool - Goroutines wait for a connection instead of hammering the DB
  3. WAL queue - Serializes writes while allowing concurrent reads

Four pipeline stages:

  1. Browser SDK - Batches events (200ms intervals, ≤10 per batch), retries with backoff, caches up to 100 events in localStorage when offline
  2. Ingestion API - Validates, normalizes URLs, filters excluded IPs, persists to ingested_events
  3. Background processor - Runs every 60s, drains pending rows in 100-row batches into events
  4. Aggregate upserts - INSERT ... ON CONFLICT into hourly tables (site_stats, page_stats, etc.)

No row is dropped. Every failure triggers a retry before advancing.

Path | Tables | Consistency | Use case
Hot | ingested_events, events | Synchronous | Visitor Transparency, exact queries
Cold | *_stats tables | Eventually consistent | Dashboards, reports

Cold tables are hourly buckets—even multi-year queries read only ~24 rows per day per dimension.

During a spike (1,000 events/second):

  1. Browsers coalesce requests → ~100 HTTP req/s actually hit the server
  2. Ingestion accepts everything into ingested_events, responds fast
  3. Background processor drains queue in 100-row chunks, loops until empty
  4. Aggregates lag briefly but remain ordered
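Step 3 above, draining in fixed-size chunks until the queue is empty, can be sketched as follows (the queue is simulated; the real job reads pending rows from ingested_events and commits one transaction per batch):

```go
package main

import "fmt"

// drain processes the pending queue in fixed-size batches, looping until
// empty, so a spike clears in a few passes instead of one giant transaction.
func drain(pending []int, batchSize int, process func([]int)) int {
	passes := 0
	for len(pending) > 0 {
		n := batchSize
		if len(pending) < n {
			n = len(pending)
		}
		process(pending[:n]) // one transaction per batch in the real job
		pending = pending[n:]
		passes++
	}
	return passes
}

func main() {
	queue := make([]int, 250) // 250 backlogged rows after a spike
	total := 0
	passes := drain(queue, 100, func(batch []int) { total += len(batch) })
	fmt.Println(passes, total) // 3 250
}
```

Keeping batches small bounds transaction size, so concurrent reads under WAL never wait long.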

Reads: Most dashboard queries < 10ms. Hourly aggregates keep queries cheap.

Writes: WAL mode + batch transactions (100 events/commit) + client/server retries. No RAM buffer needed.

-- Typical dashboard query (single-digit ms)
SELECT strftime('%Y-%m-%d', hour) AS day, SUM(visitors) AS visitors
FROM site_stats
WHERE website_id = ? AND hour BETWEEN datetime('now', '-30 days') AND datetime('now')
GROUP BY day;

~240 bytes per event (including indexes and aggregate overhead).

Traffic | Daily Events | Daily Growth | 1-year | 5-year
1K visits/day | ~3,750 | ~0.7 MB | ~0.3 GB | ~2 GB
10K visits/day | ~37,500 | ~7 MB | ~2.6 GB | ~19 GB
100K visits/day | ~375,000 | ~72 MB | ~26 GB | ~190 GB

Storage grows linearly with time, not exponentially with traffic spikes.

On SQLite limits: SQLite theoretically supports 281 TB databases. You will never hit this. A site with 100K daily visits accumulates ~26 GB/year. At that rate, reaching SQLite’s limit would take about 10,000 years. Plan for disk space, not database limits.
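The back-of-the-envelope arithmetic behind that figure:

```go
package main

import "fmt"

// yearsToLimit estimates how long a database growing at yearlyBytes per
// year takes to reach maxBytes.
func yearsToLimit(maxBytes, yearlyBytes float64) float64 {
	return maxBytes / yearlyBytes
}

func main() {
	// ~281 TB theoretical SQLite limit vs ~26 GB/year at 100K daily visits
	fmt.Printf("%.0f years\n", yearsToLimit(281e12, 26e9)) // prints "10808 years"
}
```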

SQLite works because: hot tables stay lean, dashboards query compact hourly aggregates, writes never block readers. A modest instance handles ~400 req/sec—enough for hundreds of thousands of events per day.