distsoc Discovery Protocol v2 — Design Review

Three-layer architecture: L1 Peer Discovery, L2 File Storage + Content Routing, L3 Social Routing.

101 persistent QUIC connections (81 social + 20 wide). Single ALPN. 2-min diffs. ~350K 3-hop map.

1. Architecture Overview
1.1 The Three Layers
| Layer | Purpose | Map contents | Update mechanism |
| --- | --- | --- | --- |
| L1 Peer Discovery | Find any node's address | NodeIds + hop distance + addresses (1-hop only); ~350K entries | 2-min diffs (1-hop forwarding), worm lookup |
| L2 File Storage | Content-addressed storage + author update propagation | node:postid + media + author_recent_posts (256 KB max) | File replication, piggybacked updates, staleness pulls |
| L3 Social Routing | Direct routes to follows / audience | Cached routes to socially-connected nodes | Push (audience), pull (follows), route validation |
Design Principle

Layer 1 answers "where is this node?" Layer 2 answers "where is this content?" Layer 3 answers "how do I reach the people I care about?" Each layer operates independently but they reinforce each other — a file layer hit can bypass a Layer 1 worm entirely.

1.2 Follow vs Audience
| | Follow | Audience |
| --- | --- | --- |
| Initiation | Unilateral — no request needed | Requires request + author approval |
| Delivery | Pull only — follower pulls updates | Push — author pushes via push worm |
| Author awareness | Author does not know | Author knows (approved the request) |
| Latency | Minutes (pull cycle or file-chain propagation) | Seconds (direct push) |
| Resource cost | Follower bears cost | Author bears cost |
| Scale | Unlimited followers (pull is distributed) | Author pushes to approved list |
Key Distinction

Follows are private and passive — the author never learns who follows them. Content reaches followers via the file layer (author_recent_posts propagates through stored files) or via periodic pull. Audience is consented and active — the author pushes in real-time.

1.3 Connection Model — 101 Persistent QUIC
┌──────────────────────────────────┐
│         This Node (101 conns)    │
├──────────────────────────────────┤
│  81 Social Peers                 │
│  ├─ Mutual follows               │
│  ├─ Audience (granted)           │
│  ├─ Users we follow (online)     │
│  ├─ Recent sync partners         │
│  └─ (evicted by priority)        │
│                                  │
│  20 Wide Peers                   │
│  ├─ Diversity-maximizing         │
│  ├─ Re-evaluated every 10 min    │
│  └─ At least 2 must be anchors   │
└──────────────────────────────────┘

Mobile: 10 social + 5 wide = 15 connections.

All connections use a single ALPN (distsoc/2) with multiplexed message types. One TLS handshake per peer. QUIC keep-alive every 20 seconds.

| Resource | Desktop (101) | Mobile (15) |
| --- | --- | --- |
| Memory (connection state) | ~1.5 MB | ~250 KB |
| Keep-alive bandwidth | ~22 MB/day | ~3.2 MB/day |
| CPU | Negligible | Negligible |
1.4 Unified Protocol — Single ALPN

All communication over one ALPN distsoc/2. Message types via 1-byte header per QUIC stream:

Layer 1: Peer Discovery
  0x01  RoutingDiff          2-min gossip diff
  0x02  InitialMapSync       Full map exchange on new connection
  0x10  WormRequest          Forwarded search
  0x11  WormQuery            Fan-out to peers (local check)
  0x12  WormResponse         Results to originator
  0x20  AddressRequest       Resolve NodeId → address
  0x21  AddressResponse

Layer 2: File / Content
  0x30  FileRequest          Request a post by PostId
  0x31  FileResponse
  0x32  AuthorUpdateRequest  Request fresh author_recent_posts
  0x33  AuthorUpdateResponse
  0x34  AuthorUpdatePush     Push updated author_recent_posts
  0x35  PostNotification     Real-time new post notification

Layer 3: Social
  0x40  PullSyncRequest      Follower requests posts since seq N
  0x41  PullSyncResponse
  0x42  PushPost             Audience push delivery
  0x43  AudienceRequest      Request to join audience
  0x44  AudienceResponse

General
  0x50  ProfileUpdate
  0x51  DeleteRecord
  0x52  VisibilityUpdate
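With the 1-byte header, stream dispatch reduces to a single match on the first byte. A minimal Rust sketch — the `Layer` enum and its grouping are illustrative assumptions, not the shipped API; only the numeric codes come from the table above:

```rust
// Illustrative dispatch of the 1-byte message-type header that prefixes
// each QUIC stream. Codes mirror the table above; the enum is hypothetical.
#[derive(Debug, PartialEq)]
pub enum Layer {
    PeerDiscovery, // 0x01-0x21 range
    FileContent,   // 0x30-0x35
    Social,        // 0x40-0x44
    General,       // 0x50-0x52
    Unknown,
}

/// Map the first byte of a stream to the layer that should handle it.
pub fn dispatch(header: u8) -> Layer {
    match header {
        0x01 | 0x02 | 0x10..=0x12 | 0x20 | 0x21 => Layer::PeerDiscovery,
        0x30..=0x35 => Layer::FileContent,
        0x40..=0x44 => Layer::Social,
        0x50..=0x52 => Layer::General,
        _ => Layer::Unknown,
    }
}
```

Adding a new message type is then a one-line change to the match, which is the "easy to add new message types" property claimed below.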
Why Single ALPN

Previous design used 4 ALPNs (sync/6, addr/1, gossip/1, worm/1) requiring separate connections. Single ALPN means one TLS handshake per peer, connection reuse for all message types, simpler accept loop, easy to add new message types.

2. Peer Discovery (Layer 1) — 3-Hop Map + Worm
2.1 The 3-Hop Discovery Map
┌─────────────────────────────────────────────────────────┐
│ Hop 1: 101 direct peers                                 │
│   Stored: NodeId + SocketAddr + is_anchor + is_wide     │
│   Source: Direct QUIC connection observation             │
│                                                         │
│ Hop 2: ~5,500 unique nodes                              │
│   Stored: NodeId + reporter_peer_id + is_anchor         │
│   Source: Peers' 1-hop diffs (their direct connections) │
│                                                         │
│ Hop 3: ~350,000 unique nodes                            │
│   Stored: NodeId only                                   │
│   Source: Peers' 2-hop diffs (their derived knowledge)  │
└─────────────────────────────────────────────────────────┘

Storage: ~11.6 MB (101×96 B + 5,500×66 B + 350K×32 B)
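The footprint can be sanity-checked from the per-hop entry sizes in the figure (a back-of-envelope helper; the entry sizes are taken as given):

```rust
// Back-of-envelope check of the 3-hop map footprint: hop-1 entries carry
// address + flags (~96 B), hop-2 entries carry a reporter reference (~66 B),
// hop-3 entries are NodeId only (32 B). Sizes come from the figure above.
pub fn map_storage_bytes(hop1: u64, hop2: u64, hop3: u64) -> u64 {
    hop1 * 96 + hop2 * 66 + hop3 * 32
}
```

101×96 B + 5,500×66 B + 350,000×32 B = 11,572,696 bytes, i.e. on the order of 11-12 MB, dominated almost entirely by the hop-3 NodeIds.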

Why ~350K with 20 Wide Peers

| Source (2-hop) | Raw | After dedup |
| --- | --- | --- |
| Social intra-cluster (81 × ~50) | 4,050 | ~118 (rest of our cluster) |
| Social inter-cluster (81 × ~51) | 4,131 | ~3,500 |
| Wide intra-cluster (20 × ~50) | 1,000 | ~1,000 |
| Wide inter-cluster (20 × ~51) | 1,020 | ~1,000 |
| Total 2-hop | | ~5,500 |

3-hop: Each of ~5,500 2-hop nodes has ~100 connections not yet counted. Wide-wide-wide paths contribute ~80K+ unique nodes from completely different parts of the graph. Total after dedup: ~350,000.

Wide Peer Multiplier

Without dedicated wide peers, 101 random social connections in a clustered graph reach ~150-200K. With 20 wide peers: ~350K. The wide peers cascade diversity — their wide peers escape their neighborhoods, and so on through 3 levels.

2.2 Diff-Based Gossip (2-min cycles)
Every 2 minutes, each node sends a diff to each of its 100 other peers:

  RoutingDiff {
    hop1_changes: [Added/Removed/AddressChanged],  // our direct observations
    hop2_changes: [Added/Removed],                  // derived from received diffs
    seq: u64,
  }

How Diffs Propagate (1-hop forwarding only)

T=0     Node X goes offline. X's 101 direct peers detect the connection drop.
T=0-2m  X's peers include "X removed" in their next 1-hop diff.
        → X's peers' neighbors learn "X gone from 2-hop".
T=2-4m  Those neighbors include "X removed" in their 2-hop diff.
        → Nodes 3 hops from X learn "X gone from 3-hop".
T=4-6m  Further propagation for deeper views.

3-hop propagation: ~6 min worst case, ~3 min average.

No amplification: You don't re-forward received diffs. You compute your own view's changes and report those. Each change is re-derived at every hop.
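The no-amplification rule can be sketched as a set diff between a node's previous and current views — each node reports only changes it derived itself (NodeIds shown as `u64` for brevity; real NodeIds are 32-byte keys):

```rust
use std::collections::BTreeSet;

// Sketch of the "no amplification" rule: a node never re-forwards a
// received diff. It diffs its own previous and current views and reports
// only those changes. Types are illustrative stand-ins.
pub fn derive_diff(prev: &BTreeSet<u64>, curr: &BTreeSet<u64>) -> (Vec<u64>, Vec<u64>) {
    let added = curr.difference(prev).copied().collect();
    let removed = prev.difference(curr).copied().collect();
    (added, removed)
}
```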

Bandwidth

| Churn rate | Diff size/peer | Per day (Layer 1) |
| --- | --- | --- |
| 1% hourly (low) | ~200 bytes | ~50 MB |
| 5% hourly (mobile-heavy) | ~700 bytes | ~122 MB |

Previous design: ~318 MB/day. This is 2.5-6x better.

2.3 Worm Lookup with Fan-Out

Used when target NodeId is not in local 3-hop map.

Worm arrives at node A looking for targets [T1, T2, T3]:

Step 1: LOCAL CHECK
  A checks its own 3-hop map (~350K entries). O(1) per target.
  Found T2 → send WormResponse directly to originator.

Step 2: FAN-OUT CHECK (parallel, 500 ms timeout)
  A sends WormQuery to all 100 peers. Each peer checks its ~350K map. O(1) per target.
  Peer P7 finds T1 → resolves address → WormResponse to originator.

Step 3: FORWARD remaining targets
  Select best forwarding peer (wide, not visited, not queried).
  Forward WormRequest { ttl: ttl-1 }. That peer repeats Steps 1-3.
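One worm hop reduces to partitioning targets into found vs. still-to-forward. A simplified sketch that ignores timeouts and address resolution (NodeIds as `u64` stand-ins):

```rust
use std::collections::HashSet;

// Illustrative single worm hop: check the local 3-hop map (Step 1), then
// the fan-out results from peers (Step 2), and return the targets that
// remain for forwarding with ttl - 1 (Step 3).
pub fn worm_hop(
    local_map: &HashSet<u64>,
    peer_maps: &[HashSet<u64>],
    targets: &[u64],
) -> (Vec<u64>, Vec<u64>) {
    let mut found = Vec::new();
    let mut remaining = Vec::new();
    for &t in targets {
        if local_map.contains(&t) || peer_maps.iter().any(|m| m.contains(&t)) {
            found.push(t);
        } else {
            remaining.push(t); // forwarded to the next hop
        }
    }
    (found, remaining)
}
```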

Coverage Per Hop

| Component | Entries checked |
| --- | --- |
| Local 3-hop map | ~350,000 |
| 100 peers' maps (fan-out) | 100 × ~350,000 = ~35,000,000 |
| After overlap dedup | ~25,000,000 (1.25% of 2B) |

With iterative routing (each hop guided toward target):

  • TTL=3: ~75M entries — finds most socially-proximate targets
  • TTL=5: ~125M entries — finds virtually any reachable target
  • Expected resolution: 3-5 hops, 1.5-2.5 seconds
vs Previous Design

Previous worm checked only local map (~2M per hop with 2-hop tables). Fan-out to 100 peers gives 12x more coverage per hop.

2.4 Address Resolution Chain
1. DIRECT       — T in 1-hop → have address                    (instant)
2. 2-HOP REF    — T in 2-hop → ask reporter for address        (1 RTT)
3. 3-HOP REF    — T in 3-hop → ask peers who's closer → chain  (2 RTT)
4. WORM         — T not in map → worm search                   (1.5-5 sec)
5. ANCHOR        — Worm fails → profile anchor or bootstrap     (1-5 sec)
| Tier | Nodes covered | % of 2B |
| --- | --- | --- |
| 1-hop | 101 | 0.000005% |
| 2-hop | ~5,500 | 0.00028% |
| 3-hop | ~350,000 | 0.018% |
| Worm (1 hop) | ~25,000,000 | 1.25% |
| Worm (5 hops) | ~125,000,000 | 6.25% |
3. File Storage + Content Routing (Layer 2)
3.1 Core Concept — Files Carry Their Own Routing

Every stored file (post + media) carries a small metadata blob: author_recent_posts (max 256 KB, author-signed). This blob lists the author's recent post IDs.

Key Insight

If you have any file by author X, you passively know X's recent posts. You can then request specific posts from any peer who has them — you don't need to find author X.

This creates a natural CDN: popular authors' post updates propagate through the file storage network as each copy of their files carries the latest post list.

StoredFile {
  post: Post,                          // content-addressed, immutable
  post_id: PostId,                     // blake3(content)
  media_blobs: Vec<MediaBlob>,

  author_recent_posts: AuthorRecentPosts {
    author_id: NodeId,
    posts: Vec<RecentPostEntry>,       // newest first
    updated_at: u64,                   // ms timestamp
    signature: Signature,              // author signs this blob
    // Max 256 KB total → ~8,000 recent post entries
  }
}
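The "~8,000 recent post entries" comment follows directly from the 256 KB cap and 32-byte PostIds (ignoring per-entry framing and the signature overhead):

```rust
// Capacity check for the author_recent_posts cap: 256 KiB divided by
// 32-byte PostId entries. Framing/signature overhead is ignored, so this
// is an upper bound on entry count.
pub fn max_entries(blob_cap_bytes: usize, entry_bytes: usize) -> usize {
    blob_cap_bytes / entry_bytes
}
```

256 × 1024 / 32 = 8,192, i.e. the "~8,000" figure in the struct comment.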
3.2 Update Propagation — Three Paths
Author A publishes a new post.

Path 1: DIRECT PUSH (audience, seconds)
  A pushes the new post to audience members via push worm.
  Recipients update their stored copies of A's files with the new author_recent_posts blob.

Path 2: FILE-CHAIN PROPAGATION (followers, <12 min typical)
  A's 101 persistent peers receive the update.
  When ANY peer accesses a file by A, it sees the fresh author_recent_posts and can request the new post.
  Propagates naturally as files are accessed/synced.

Path 3: STALENESS PULL (>1 hour fallback)
  If author_recent_posts.updated_at is older than 1 hour, the holder triggers an update pull:
  - Check Layer 3 social route to the author
  - Check other peers who hold the author's files
  - Worm request for the latest author_recent_posts
Result

Popular authors' updates reach most file holders within minutes. Unpopular authors' updates reach followers within 1 hour (staleness pull).

3.3 Popular Author Scale (1M Audience)
Author A has 1,000,000 audience members and posts a new photo.

Layer 1: A has 101 persistent connections. PostNotification sent to all.
  → 101 audience members get it instantly.
  → 101 copies of the updated author_recent_posts now exist.

Layer 2: Those 101 peers hold files by A, and each has ~100 further peers.
  → T+2m: 101 × 100 ≈ 10,000 peers see the fresh author_recent_posts
  → T+4m: 10,000 × 100 ≈ 1,000,000 peers reached
  → Natural file-chain propagation covers the full audience.

No destination declared. The file layer IS the CDN. The author does 101 pushes; O(log N) hops reach everyone.
vs Previous Design

Previous design: author splits audience into chunks of 10, tries to push to each chunk leader. 1M audience = 100K chunks = author's machine saturated for hours.

New design: author does 101 pushes total. File layer handles the rest via natural propagation. O(1) work for the author, O(log N) time to reach everyone.

3.4 Requesting Posts via File Layer
You see author A has new post P in author_recent_posts.
You don't have post P stored locally.

1. Check if any persistent peer has P:
   → Fan-out WormQuery to 100 peers (they check local storage)
   → Any peer with the file can serve it

2. Request posts newest-to-oldest:
   → Prioritize catching up on recent content
   → Older posts can wait or be skipped

3. No need to contact author A at all.
File Authority Chain

Each node caches a route back to the author for each file they hold. When author_recent_posts is stale (>1 hour), follow the authority chain hop-by-hop toward the author. Each hop may have a fresher copy — you don't need to reach the author, just a fresher copy.

3.5 File Keep Priority

Formula

priority = pin_bonus + (relationship × heart_recency × post_age / (peer_copies + 1))

Scoring Tables

Relationship (to file's author)

| Relationship | Score |
| --- | --- |
| Self (our own content) | ∞ (never evicted) |
| We are audience of author | 10 |
| We follow author | 8 |
| Author has >10 hearts from network | 5 |
| Author has >3 hearts | 3 |
| Author has >2 hearts | 2 |
| No relationship | 1 |

| Time window | Heart recency score | Post age score |
| --- | --- | --- |
| < 72 hours | 100 | 100 |
| 3-14 days | 50 | 50 |
| 14-45 days | 25 | 25 |
| 45-90 days | 12 | 12 |
| 90-365 days | 6 | 6 |
| 1-3 years | 3 | 3 |
| 4-10 years | 1 | 1 |

Peer Copies: Divides priority. More copies nearby = lower urgency to keep ours. 0 copies → full priority. 10 copies → 1/11 priority.

Pin: 99,999 bonus. Even pins compete when storage is full.

Examples

YOUR OWN post:
  ∞ → never evicted

Audience author, yesterday, hearted today, 0 copies:
  0 + (10 × 100 × 100 / 1) = 100,000 → very high

Followed author, last week, hearted 2 days ago, 2 copies:
  0 + (8 × 100 × 50 / 3) = 13,333 → high

Popular stranger (>10 hearts), yesterday, 20 copies:
  0 + (5 × 100 × 100 / 21) = 2,381 → moderate

Random, 6 months old, 3 hearts, 8 copies:
  0 + (3 × 6 × 6 / 9) = 12 → very low

Unknown, no hearts, old, many copies:
  0 + (1 × 1 × 1 / 11) = 0.09 → first evicted
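The formula and the worked examples above can be checked directly. In this sketch the ∞ score for own content is left to the caller; only the numeric path is shown:

```rust
// The keep-priority formula from 3.5:
//   priority = pin_bonus + relationship × heart_recency × post_age / (peer_copies + 1)
// Own content (∞) is handled by the caller; pin_bonus is 99,999 for pins.
pub fn keep_priority(
    pin_bonus: f64,
    relationship: f64,
    heart_recency: f64,
    post_age: f64,
    peer_copies: u32,
) -> f64 {
    pin_bonus + relationship * heart_recency * post_age / (peer_copies as f64 + 1.0)
}
```

Plugging in the example inputs reproduces the scores above: the audience-author case gives 100,000, the six-month-old random post gives 12, and the unknown many-copies case lands near 0.09 (first evicted).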
Storage Budget

Default: 10 GB. At 256 KB avg: ~40K files. At 1 MB avg (with media): ~10K files. The formula ensures: own posts always kept, audience/follow prioritized, rare content preserved, old/common/unrelated content evicted first.

4. Social Routing (Layer 3)
4.1 Purpose — Cached Routes to People You Care About

Layer 3 is a personal routing cache for follows and audience. It stores recently-working routes so you can push/pull content without going through the Layer 1 worm every time.

SocialRoute {
  target: NodeId,
  relationship: Follow | Audience | Mutual,
  last_route: Vec<NodeId>,          // path that worked
  last_success: u64,
  address_hint: Option<SocketAddr>, // if direct worked
}
4.2 Follow Pull Path
Follower F wants updates from author A:

1. Is A a persistent peer? (Layer 1, 1-hop)
   → Yes: content flows in real-time. Done.
2. Check the social route cache (Layer 3)
   → Have a recent route? Follow it to A.
   → Pull author_recent_posts + new posts.
   → Update the route cache.
3. Check the file layer (Layer 2)
   → Have any of A's files? Check author_recent_posts freshness.
   → If <1 hour old: up to date. Request missing posts via worm.
   → If >1 hour old: follow the file authority chain for fresher data.
4. Fall back to the Layer 1 worm.

Typical: step 1 or 2 (fast, no worm needed).
4.3 Audience Push Path
Author A creates a post and has approved audience members.

1. Audience members who are persistent peers (1-hop):
   → Push PostNotification on the persistent connection. Instant.
2. Audience members with social routes (Layer 3):
   → Follow the cached route. Push the post via push worm.
   → Update the route cache on success.
3. Audience members with no cached route:
   → Layer 1 worm to find the address.
   → Push the post. Cache the route for next time.

For audiences >101: the post is also pushed to the file layer, and the file storage network handles further propagation. No destination declared — the CDN effect takes over.
4.4 Route Maintenance
  • On successful push/pull: update route + address hint
  • On failure: clear route, fall back to Layer 1
  • Every 30 min: validate routes for top-priority follows/audience
  • Routes older than 2 hours without verification → stale
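The 2-hour staleness rule above is a timestamp comparison against the route's last success (field name borrowed from the SocialRoute struct in 4.1; the helper itself is illustrative):

```rust
// Route staleness check from 4.4: a route unverified for more than
// 2 hours is considered stale and falls back to Layer 1.
pub const STALE_AFTER_MS: u64 = 2 * 60 * 60 * 1000;

pub fn is_stale(last_success_ms: u64, now_ms: u64) -> bool {
    now_ms.saturating_sub(last_success_ms) > STALE_AFTER_MS
}
```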
5. How It All Fits Together — Lifecycles
5.1 Public Post — From Creation to Feed
T=0   Author A creates post P. PostId = blake3(content). Stores in local DB.
      Updates own author_recent_posts.

      Persistent peers (Layer 1):
        → PostNotification on all 101 connections.
        → All persistent peers have P + updated author_recent_posts.

      Audience push (Layer 3):
        → Persistent audience members: already done.
        → Social-route audience: push via cached route.
        → No route: worm to find, then push.

T<2m  Peers' peers see the updated author_recent_posts (Layer 2):
        → File-chain propagation begins.

T<12m File layer reaches most file holders (Layer 2):
        → Anyone accessing any file by A sees the new post listed.
        → Can request P from any peer who has it.

T=60m Pull cycle for distant followers (Layer 3 fallback):
        → Follower checks author_recent_posts → sees P → requests it.
5.2 Encrypted Post (DM / Circle)
T=0 Author A creates a post with VisibilityIntent::Direct([R]).

1. Generate a random CEK
2. Encrypt content with ChaCha20-Poly1305
3. Wrap the CEK per-recipient via X25519 DH
4. PostId = blake3(encrypted_content)

Push to recipient R:
  → R is a persistent peer? Push directly.
  → Social route? Push via route.
  → Otherwise: worm to find R.

author_recent_posts is updated (includes PostId + VisibilityHint::Encrypted). Peers see there's a new post but can't read it without the wrapped key.

On receipt, R:
  → DH to derive the shared secret
  → Unwrap the CEK
  → Decrypt the content
5.3 Discovering a New User to Follow
User has author X's NodeId (from out-of-band sharing).

1. Layer 1 map: X in 3-hop? (~350K entries)
   → If yes: resolve address via referral chain. Connect. Done.
2. Worm search (Layer 1): fan-out to 100 peers. ~25M entries checked per hop, 3-5 hops.
   → Found? Connect to X. Pull profile + recent posts. Done.
3. File layer (Layer 2): does anyone we know have X's files?
   → Check if any peer has author_recent_posts for X.
   → If yes: get X's recent posts without finding X directly.
4. Anchor fallback: contact bootstrap anchors.
5. Once connected to X (or X's file holders):
   → Cache a social route (Layer 3) for future pulls.
   → Store X's files → future updates arrive via the file layer.
Three layers cooperate

Layer 1 finds the person. Layer 2 lets you get their content even without finding them. Layer 3 remembers the route for next time. Each layer is a fallback for the others.

5.4 Popular Author with 1M Audience
Author A: 1,000,000 audience members. Posts a photo.

T=0s   A pushes to 101 persistent peers. Author's work: 101 sends. Done.
T=0-2s 101 audience members have the post + fresh author_recent_posts.
T=2m   101 × 100 ≈ 10,000 peers see the updated author_recent_posts via file-chain propagation (Layer 2).
T=4m   10,000 × 100 ≈ 1,000,000 peers reached. Full audience covered.

Total author effort: 101 pushes. Total time to full coverage: ~4 minutes.
Total bandwidth (author): 101 × post_size.
No audience member list transmitted. No destination declared.
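The timeline's reach numbers come from multiplying the author's 101 direct pushes by a ~100× fan-out per 2-minute round — an idealized model that ignores overlap between peer sets:

```rust
// Fan-out arithmetic for the 1M-audience timeline: reach after k rounds
// is direct_pushes × fanout^k. Purely illustrative; real propagation
// overlaps and saturates, so this is an upper bound per round.
pub fn reach_after_rounds(direct_pushes: u64, fanout: u64, rounds: u32) -> u64 {
    direct_pushes * fanout.pow(rounds)
}
```

One round gives 10,100 (~the 10,000 quoted at T=2m); two rounds give 1,010,000, covering the full 1M audience by T=4m.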
The File Layer IS the CDN

Popular content replicates because many peers have the author's files. The author_recent_posts blob travels with every file copy. The author doesn't need to know or manage the delivery — the storage network handles it.

6. Bandwidth & Resource Budget
6.1 Per-Node Summary
| Metric | Desktop (101 conns) | Mobile (15 conns) |
| --- | --- | --- |
| Layer 1 map | ~350K entries, ~11 MB | ~15K entries, ~500 KB |
| Layer 2 files | 10K-40K files, ~10 GB | 1K-5K files, ~1 GB |
| Layer 3 routes | ~200-500 entries, ~50 KB | same |
| Layer 1 bandwidth | ~50 MB/day | ~8 MB/day |
| Layer 2 bandwidth | ~50-100 MB/day (varies) | ~10-30 MB/day |
| Layer 3 bandwidth | ~10-20 MB/day | ~5-10 MB/day |
| Total bandwidth | ~110-170 MB/day | ~23-48 MB/day |
| Worm coverage/hop | ~25M (1.25%) | ~2M (0.1%) |
| Worm hops to find any target | 3-5 | 5-8 (or anchor) |
6.2 Comparison to Previous Design
| Aspect | Phase F (current code) | Protocol v2 (this spec) |
| --- | --- | --- |
| Connections | Ephemeral (connect/sync/disconnect) | 101 persistent |
| ALPNs | 4 | 1 |
| Gossip | Full peer list each time | 2-min diffs, 1-hop forward |
| Map depth | 2-hop (~5K) | 3-hop (~350K) |
| Content delivery | Pull-only (60 min) | 3 layers: push + pull + file propagation |
| File storage | Not managed | Priority-based with keep formula |
| Worm coverage/hop | ~2M | ~25M |
| Daily bandwidth | ~318 MB | ~110-170 MB |
| Popular author scale | Author pushes to all (O(N) work) | File layer propagates (O(log N) time) |
| First-contact latency | 10-30 seconds | 1-5 seconds |
6.3 Anchor Node Costs

A well-connected anchor (listed by 10,000 users as profile anchor):

| Activity | Cost |
| --- | --- |
| Persistent connections (~200) | ~30 MB RAM |
| Gossip diffs (200 peers) | ~10 MB/day |
| Map storage (larger map) | ~50 MB disk |
| Worm forwarding (~100/hr) | ~5 MB/day |
| Address lookups (~500/hr) | ~2 MB/day |
| Content relay | ~100 MB/day |
| Total | ~170 MB/day, ~80 MB RAM |

A $5/month VPS handles this comfortably.

7. Bootstrap — Entering the Network
New node, first launch:

1. Read bootstrap anchors from anchors.json (shipped with the app)
2. Connect to 1-2 bootstrap anchors
3. Exchange Layer 1 maps (InitialMapSync) — learn ~350K NodeIds
4. Begin wide peer selection from learned nodes
5. Connect to 20 wide peers
6. Fill social peer slots based on the follow list
7. Worm search for followed users not yet found
8. Within ~10 minutes: fully operational with 101 connections
Lightweight Bootstrap (Future)

New nodes don't need full map exchange. "I'm new, give me 200 diverse peers" (~15 KB response). Connect to received peers, build maps via normal gossip. Reduces anchor load from ~11 MB to ~15 KB per new node.

8. Implementation Order

Phase 1: Foundation

  1. Single ALPN (distsoc/2) with message type multiplexing
  2. Persistent connection manager (81 social + 20 wide slots)
  3. discovery_map table (Layer 1)
  4. 1-hop map population from persistent connections

Phase 2: Layer 1 Gossip + Worm

  1. RoutingDiff and 2-min gossip cycle
  2. 2-hop + 3-hop derivation from diffs
  3. Wide peer diversity scoring
  4. Worm v2 with fan-out
  5. Address resolution chain

Phase 3: Layer 2 File Storage

  1. stored_files + author_recent_posts tables
  2. File keep priority calculation + eviction
  3. author_recent_posts update propagation
  4. File authority chain routing
  5. Post request via file layer (fetch from any holder)

Phase 4: Layer 3 Social Routing

  1. social_routes table
  2. Follow pull via cached routes
  3. Audience push via cached routes + worm fallback
  4. Route maintenance

Phase 5: Integration + Optimization

  1. Popular author file-chain propagation
  2. Lazy 3-hop streaming on connection
  3. Mobile mode (15 conns, smaller maps)
  4. Delta sync for content (sequence numbers)
  5. Bloom filter caching (optional)
9. Open Questions & Decisions Needed

Q1: Peer Eviction Policy

When all 81 social slots are full and a higher-priority peer comes online, which peer gets dropped? Need to prevent thrashing (repeatedly connecting/disconnecting borderline peers).

Q2: author_recent_posts Authenticity

The blob is author-signed, but a malicious peer could serve a stale (valid but old) blob. Include sequence numbers? If you see seq 50 from one peer and seq 45 from another, seq 45 is stale.

Q3: Peer Copy Counting for Keep Priority

How do we learn peer_copies? Passively from worm responses and file requests. Exact counting isn't needed — order-of-magnitude is sufficient.

Q4: File Layer Bandwidth

If every file carries 256 KB of author_recent_posts, that's substantial overhead on small posts. Compact format (just PostIds at 32 bytes each = ~8000 entries per 256 KB) or fetch separately on demand?

Q5: Storage Quotas / 3x Hosting Rule

Design spec mentions 3x hosting quota. How does this interact with the keep priority formula? Quota sets overall budget, priority decides what fills it?

Q6: Global Lookup for Isolated Nodes

Worms + anchors handle most cases. For truly isolated nodes (~0.01% of lookups that fail), do we need a structured DHT layer? Or is anchor fallback sufficient at scale?

10. Glossary
| Term | Definition |
| --- | --- |
| NodeId | ed25519 public key (32 bytes, 64 hex chars). Permanent identity. |
| Connect string | NodeId@host:port. Enough to establish first contact. |
| Anchor | Node with a stable public address. Network entry point + relay. |
| Wide peer | One of 20 peers selected for maximum graph diversity. |
| Social peer | One of 81 peers selected by social relationship priority. |
| 3-hop map | (L1) ~350K NodeIds reachable within 3 hops of this node. |
| Worm | (L1) Bounded-depth search with fan-out. ~25M entries checked per hop. |
| author_recent_posts | (L2) 256 KB signed blob listing the author's recent posts. Travels with every stored file. |
| File authority chain | (L2) Cached route back to a file's author for freshness updates. |
| Keep priority | (L2) Score determining which files to keep vs evict when storage is limited. |
| Social route | (L3) Cached working path to a followed user or audience member. |
| Follow | Unilateral pull-only. Author doesn't know. Follower bears cost. |
| Audience | Consented push. Author knows + approves. Author pushes in real-time. |
| CEK | Content Encryption Key. Random per-post, ChaCha20-Poly1305. |
| RoutingDiff | (L1) 2-min gossip message. 1-hop + 2-hop changes only. |
| Push worm | Worm that delivers content (not just searches). Used for audience push. |
| Heart | User endorsement of a post. Affects file keep priority. |
| Peer copies | Number of copies of a file within 3-hop range. More copies = lower keep priority. |