Edge
Frontend
Cloudflare
Argo routing, HTTP/3
App
Streaming SSR,
500ms latency
SERP
Search results, knowledge panels, AI assistant
Ingestion
Crawling
Fleet
50K pages/sec
, multi-level stochastic queues
Queue
queued
Custom RocksDB-based queue,
300K ops/sec
Processing
Text
Parser
WHATWG spec, semantic extraction
Normalizer
Chrome removal, canonicalization
Chunker
spaCy sentencizer, context preservation
Statement chainer
DistilBERT classifier, anaphora resolution
Compute
GPU
Runpod
200+ RTX 4090s, SBERT model,
100K embeddings/sec
, 90% utilization
CPU
Query vectorizer
Real-time query embeddings
Storage
Records + blobs
Distributed RocksDB
64 shards,
82TB SSDs
, BlobDB,
200K writes/sec
, xxHash routing
Vector index
Distributed HNSW
64 shards,
4TB RAM
,
3B embeddings
, parallel queries
Knowledge graph
Wikipedia
Extracts, images
Wikidata
Entity graph, properties
Infrastructure
Service mesh
mTLS + HTTP/2
Universal auth, encryption, MessagePack RPC
CoreDNS
Internal DNS resolution, service discovery
systemd
cgroups, journald