v0.3.1 — 2026-03-13

Design Document

This is the canonical technical reference for ItsGoin. It describes the vision, the architecture, and the current state of every subsystem — with full implementation detail. This document is versioned; each update records what changed.

Changelog

v0.3.1 (2026-03-13): Share links + QUIC proxy + content search. Share link format: itsgoin.net/p/<postid_hex>/<author_nodeid_hex> — simple, no host encoding needed. itsgoin.net web handler acts as QUIC proxy: receives browser request, searches the network for the post, fetches it on-demand via PostFetch (0xD4/0xD5), renders HTML, serves to browser. No permanent storage of fetched content. Extended worm search — WormQuery now carries optional post_id and blob_id fields for unified node/post/blob search. Each peer checks local storage, CDN downstream tree (up to 100 hosts per post), and blob store. WormResponse gains post_holder and blob_holder fields. Nova fan-out pattern — burst peers include one N2 wide referral; referred peer does its own 101-burst, reaching ~10K nodes with ~202 relay hops. PostFetch (0xD4/0xD5) — lightweight single-post retrieval after worm finds a holder, much lighter than full PullSync. itsgoin.net node deployed as anchor + web handler (--web 8080). “Unavailable” page with honest network model explanation + install CTA. Universal Links / App Links planned for native app interception. | Engagement sync — pull sync now fetches reactions, comments, and policies via BlobHeaderRequest/Response after every sync. Profile push fix — profile updates now sent to all connected mesh peers (not just audience). Auto-sync on follow — following a peer triggers immediate post pull + engagement fetch. Popover UI — notifications settings, network diagnostics, and message threads now open as popovers. Notification settings — per-key settings table in SQLite, configurable message/post/nearby notifications with JS Notification API. Tiered DM polling — smart message refresh based on conversation recency. Reaction display — posts show top 5 most popular emoji + total response count. UI cleanup — removed Suggested Peers and Find Nearby sections, placeholder text changed to “How’s it goin?”, clickable node IDs in activity log.

v0.3.0 (2026-03-12): Full rename distsoc → ItsGoin. ALPN, crypto contexts, data paths, Android package ID all changed. Clean break — incompatible with prior versions.

v0.2.11 (2026-03-12): Engagement system — reactions (public + private encrypted via X25519 DH + ChaCha20-Poly1305), inline comments with ed25519 signatures, author-controlled comment/react policies (audience-only, public, none), blocklist enforcement. CDN tree for all posts — new post_downstream table (keyed by PostId, max 100 peers) gives every post a propagation tree; PostDownstreamRegister (0xD3) sent when any peer stores a post. 4 new wire messages: BlobHeaderDiff (0xD0) for incremental engagement propagation, BlobHeaderRequest/Response (0xD1/0xD2), PostDownstreamRegister (0xD3). 6 new SQLite tables, 9 new IPC commands. Thread splitting — headers exceeding 16KB auto-split oldest comments into linked thread posts. Frontend: emoji picker, reaction pills, comment threads, policy selects in compose area.

v0.2.10 (2026-03-12): Per-family NAT classification — IPv4 and IPv6 public reachability now detected independently. Previously, a public IPv6 address incorrectly set has_public_v4=true, causing nodes behind IPv4 NAT to skip hole punching. STUN now always runs (unless --bind) so IPv6-only anchors correctly classify their IPv4 NAT. Anchor advertised address fallback — anchors without --bind or UPnP now advertise their first public bound address (e.g. IPv6 SLAAC), so peers store them in known_anchors for preferential reconnection. Bootstrap anchor deprioritization — startup connection sequence now tries discovered (non-bootstrap) anchors first, falling back to hardcoded bootstrap anchors only when no discovered anchor is reachable. Reduces load on bootstrap infrastructure as the network grows.

v0.2.9 (2026-03-12): ConnectionManager actor redesign — replaced single Arc<Mutex<ConnectionManager>> with two-layer actor pattern: ConnHandle (cheap-to-clone command sender) + ConnectionActor (dedicated tokio task, owns state, processes commands via mpsc/oneshot channels). Eliminated lock contention from 14 code paths that previously held the mutex during network I/O (up to 15s for QUIC connects). All network.rs and node.rs callers now use ConnHandle (~60 call sites migrated). I/O-heavy functions extracted as standalone: broadcast_diff, push_circle_profile, push_visibility, pull_from_peer, send_relay_introduce, send_anchor_register, request_anchor_referrals. Public conn_mgr() accessor removed — Arc<Mutex> is now an internal implementation detail of the actor.

v0.2.8 (2026-03-11): NAT filter probe (0xC6/0xC7) — anchor probes node’s filtering type by attempting QUIC connect from a different source port; address-restricted (Open) vs port-restricted determined in 2s, eliminating unnecessary scanning for most connections. Role-based NAT traversal — EIM nodes punch every 2s (stable port visible to peer scanner), EDM/Unknown nodes walk outward at ~100 ports/sec (opening firewall entries for peer punches to land). Steady scan replaces burst tiers (was 37K tasks, now ~20 in-flight). IPv4 vs IPv6 public differentiation — startup reports v4-only/v6-only/v4+v6, “Public” no longer assumes Open filtering. Task cleanup via JoinSet::abort_all().

v0.2.7 (2026-03-11): Port scanning refinement — scan only the anchor-observed IP (relay-injected first address) instead of all self-reported addresses, avoiding wasted scan budget on unreachable VPN/cellular IPs. Scanning now triggers when peer NAT type is unknown, not just when explicitly EDM.

v0.2.6 (2026-03-11): Anchor self-verification implemented (Section 8) — AnchorProbeRequest/Result (0xC3/0xC4) wire messages, witness-based cold reachability testing via N2 strangers, candidacy checklist (UPnP/public + 50 connections + 2h uptime + non-mobile), periodic re-probe in anchor register cycle, 2-failure revocation. Advanced NAT traversal implemented (Section 10) — NatMapping (EIM/EDM) + NatFiltering (Open/PortRestricted) profile types, hole_punch_with_scanning() replaces hard+hard skip at all 5 call sites, tiered port scanning (±500, ±2000, full ephemeral) at 50 concurrent probes, behavioral filtering inference from connection outcomes, PortScanHeartbeat (0xC5) message type. NAT profile shared in InitialExchange (nat_mapping/nat_filtering fields).

v0.2.5 (2026-03-11): Advanced NAT traversal design (Section 10) — relay-assisted port scanning protocol for EDM/symmetric NATs, full NAT combination matrix (mapping × filtering), tiered scan from observed port at 250/sec, 2s relay heartbeat feedback loop, makes hard+hard pairs solvable without full relay. Reconnection race fix — run_mesh_streams checks stable_id() before cleanup to prevent reconnecting peers from losing their connection entry.

v0.2.4 (2026-03-11): Anchor self-verification probe design (Section 8) — witness-based cold reachability testing via N2 strangers, candidacy checklist, periodic re-probe. Anchor selection simplified to LIFO on last_seen, removed success_count weighting, stale anchor cleanup (7-day probe). BlobHeader separation from blob content (Section 18) — immutable BLAKE3-addressed blobs require separate mutable headers, BlobHeader struct replaces CdnManifest, 25+25 post neighborhood, BlobHeaderDiff incremental propagation. Removed 3x hosting quota — CDN is attention-driven delivery infrastructure, not storage; author owns durability. Keep-alive session ceilings (Section 16) — desktop ~300-500, mobile ~25-50, mobile priority stack, hysteresis for borderline reachability. Mesh stranger controls — mutual mesh blacklist for targeted stranger relationships, --max-mesh CLI flag for topology testing. Phase 2 reciprocity simplified — attention model makes quota enforcement unnecessary.

v0.2.3 (2026-03-11): NAT type detection implemented (Section 10) — raw STUN probing classifies NAT as Public/Easy/Hard/Unknown on startup, shared in InitialExchange, stored per-peer, skip hole punch for hard+hard NAT pairs. LAN Discovery spec (Section 12) — mDNS scan loop for automatic LAN peer connection, keep-alive LAN sessions, local relay design. Pruning & timeout tuning — preferred peer prune 24h→7d, watcher expiry 24h→30d, N2/N3 startup sweep. Growth loop lock fix — resolve_address no longer blocks conn_mgr during network I/O.

v0.2.2 (2026-03-10): Hole punch fixes (Section 10) — session peers now fully participate in relay introduction (observed address injection for both requester and target), all hole punch paths use hole_punch_parallel() (parallel addresses, no more sequential timeouts), requester self-reported addresses filtered to publicly-routable only.

v0.2.1 (2026-03-10): Added UPnP port mapping (Section 11) — best-effort NAT traversal for desktop/home networks, external address in N+10 and peer advertisements, lease renewal cycle.

v0.2.0 (2026-03-09): Major design updates — three-layer architecture (Mesh/Social/File), N+10 identification, keep-alive sessions, 3-tier revocation, multi-device identity, growth loop redesign, pull sync from social/file layers, relay pipes default to own-device-only, remove anchor register loop.

v0.1.0 (2026-03-09): First versioned edition. Consolidated from ARCHITECTURE.md, code review, and gap analysis into a single source of truth.

1. The Vision

"A decentralized fetch-cache-re-serve content network that supports public and private sharing without a central server. It replaces 'upload to a platform' with 'publish into a swarm' where attention creates distribution, privacy is client-side encryption, and availability comes from caching, not money."

The honest promise: The CDN is an attention-driven delivery amplifier, not a storage guarantee. Hot content spreads naturally through demand; cold content decays unless intentionally hosted. Authors are responsible for their own content durability — a post backup/export tool is the author's safety net, not the network's job. The system is a loss-risk network — best-effort availability, not durability guarantees.

Guiding principles

2. Identity & Bootstrap

First startup

  1. Identity: Load or generate ed25519 keypair from {data_dir}/identity.key. NodeId = 32-byte public key. A unique device identity is also generated for multi-device coordination (see Section 23).
  2. Storage: Open SQLite database (distsoc.db), auto-migrate schema.
  3. Blob store: Create {data_dir}/blobs/ with 256 hex-prefix shards (00/ through ff/).
  4. UPnP mapping: Attempt UPnP/NAT-PMP port mapping (2s timeout). If successful, store external address for advertisements. Do not block startup if unavailable. See Section 11.
  5. NAT type detection: STUN probes to two public servers (3s timeout each). Classifies as Public/Easy/Hard/Unknown. UPnP success overrides to Public. Anchors skip probing. Result stored on ConnectionManager, shared in InitialExchangePayload, stored per-peer. See Section 10.
  6. Stale N2/N3 sweep: Remove all N2/N3 entries tagged to peers not in the current mesh. Clears stale reach data from previous sessions (e.g., unclean shutdown).
  7. Bootstrap anchors: Load from {data_dir}/anchors.json. If missing, use hardcoded default anchor.
  8. Bootstrap: If peers table is empty, connect to a bootstrap anchor. Request referrals and matchmaking (unless self or the other node is an anchor). Persist on that anchor's referral list until released (at referral count limit) while beginning the growth loop immediately.
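Step 3's 256-shard layout can be sketched as a small path helper. This is a hedged illustration, not the actual implementation: the function name and signature are assumptions, on the premise that the first two hex characters of the blob's digest select the shard directory.

```rust
use std::path::PathBuf;

// Sketch only: derive the on-disk location of a blob from its hex digest,
// assuming the {data_dir}/blobs/<2-hex-prefix>/ layout described in step 3.
fn blob_shard_path(data_dir: &str, blob_hex: &str) -> PathBuf {
    let prefix = &blob_hex[..2]; // 256 shards: 00/ through ff/
    PathBuf::from(data_dir).join("blobs").join(prefix).join(blob_hex)
}

fn main() {
    let p = blob_shard_path("/data", "a3f1c0ffee");
    assert_eq!(p, PathBuf::from("/data/blobs/a3/a3f1c0ffee"));
    println!("{}", p.display());
}
```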

Startup cycles

Spawned after bootstrap completes:

| Cycle | Interval | Purpose |
| --- | --- | --- |
| Pull sync | On demand (3h Self Last Encounter threshold) | Pull new posts from social + upstream file peers |
| Routing diff | 120s (2 min) | Broadcast N1/N2 changes to mesh + keep-alive sessions |
| Rebalance | 600s (10 min) | Clean dead connections, reconnect preferred, signal growth |
| Growth loop | 60s + reactive (on N2/N3 receipt) | Fill empty mesh slots until 101 (90% threshold for reactive mode) |
| Recovery loop | Reactive (mesh empty) | Emergency reconnect via anchors |
| Social/File connectivity check | 60s | Verify <N4 access to N+10 of active social + file peers; open keep-alive sessions as needed |
| UPnP lease renewal | 2700s (45 min) | Refresh UPnP port mapping before TTL expiry (desktop only) |
Removed: Anchor register loop. Anchors are for forming initial mesh connections when bootstrapping, not for ongoing registration. Nodes only connect to anchors during bootstrap or recovery.

3. N+10 Identification

Concept

Every node is identified not just by its NodeId but by its N+10: the node's own NodeId plus the NodeIds of its 10 preferred peers. This makes any node far easier to locate: if you can reach any of the 11 nodes in someone's N+10, you can find them.

Where N+10 appears

| Context | What's included |
| --- | --- |
| Self identification | All self-identification messages include the sender's N+10 |
| Following someone | When you follow a peer, you store and maintain their N+10 in your social routes |
| Post headers | Every post header includes the author's current N+10. Updated whenever they post. |
| Blob headers | Blob/file headers include: (1) the author's N+10, (2) the upstream file source's N+10 (if not the author), (3) N+10s of up to 100 downstream file hosts |
| Recent post lists | Author manifests include the author's N+10 alongside their recent post list |

Why this works

Preferred peers are bilateral agreements — stable, long-lived connections. By including them in identification, any node that can find any of your 10 preferred peers can transitively find you within one hop. This eliminates most discovery cascades for socially-connected nodes.
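The one-hop findability property can be sketched as a tiny check. The struct shape here is illustrative only (not the wire format):

```rust
use std::collections::HashSet;

// Hedged sketch of an N+10 identity: the node's own id plus up to 10
// preferred-peer ids. Shape is an assumption for illustration.
struct NPlus10 {
    node_id: [u8; 32],
    preferred: Vec<[u8; 32]>, // up to 10 entries
}

// A node is findable if we can reach ANY of the 11 ids in its N+10.
fn findable(target: &NPlus10, reachable: &HashSet<[u8; 32]>) -> bool {
    reachable.contains(&target.node_id)
        || target.preferred.iter().any(|p| reachable.contains(p))
}

fn main() {
    let target = NPlus10 { node_id: [1; 32], preferred: vec![[2; 32], [3; 32]] };
    let mut reachable = HashSet::new();
    reachable.insert([3; 32]); // we can reach just one preferred peer
    assert!(findable(&target, &reachable));
    reachable.clear();
    assert!(!findable(&target, &reachable));
    println!("ok");
}
```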

Status: Partial

N+10 is partially implemented — preferred peers exist and are tracked, but N+10 is not yet included in all identification contexts (post headers, blob headers, self-identification messages). Currently preferred_tree in social routes provides similar functionality for relay selection.

4. Connections & Growth

Connection types

Slot architecture

| Slot kind | Desktop | Mobile | Purpose |
| --- | --- | --- | --- |
| Preferred | 10 | 3 | Bilateral agreements, eviction-protected |
| Non-preferred | 91 | 12 | Growth loop fills these with diverse peers |
| Total mesh | 101 | 15 | Long-lived routing backbone |
| Keep-alive sessions | No hard limit | No hard limit | Social/file layer peers not in mesh (max 50% of session capacity reserved for keep-alive) |
| Sessions (interactive) | No hard limit | No hard limit | Active DM, group interaction, anchor matchmaking |
| Relay pipes | 10 | 2 | Own-device relay by default; opt-in for relaying for others |
v0.2.0 change: Removed the distinction between "local" (71) and "wide" (20) non-preferred slots. The growth loop goes wide by default. Session counts are no longer hard-limited — an average computer can sustain ~1000 QUIC sessions without strain. The 50% keep-alive reservation ensures sessions remain available for interactive use.

MeshConnection struct

Each mesh connection tracks: node_id, connection (QUIC), slot_kind (Preferred or NonPreferred), remote_addr (captured from Incoming before accept), last_activity (AtomicU64), created_at.

Mutual mesh blacklist Planned

Targeted two-node stranger relationship. Both nodes opt in, maintaining genuine N2 stranger status indefinitely regardless of growth loop behavior. Stored in a local mesh_blacklist { node_id } table.

Production utility: Operators maintaining intentional stranger relationships for network diversity, preventing specific nodes from becoming preferred peers, or any scenario where two nodes want to cooperate at session level without mesh entanglement.

--max-mesh <n> CLI flag Planned

Topology control at network scale. Forces a node to cap its mesh connections, keeping it permanently in N2 of other nodes. Testing affordance only — not for production use.

Keepalive

5. Connection Lifecycle

5.1 Growth Loop (60s timer + reactive on N2/N3 receipt)

Timer: Fires every 60 seconds. Checks current mesh count. If < 101, runs a growth cycle.

Reactive trigger: Fires immediately after receiving a peer's N2/N3 list (from initial exchange or routing diff). Continues firing on each new N2/N3 receipt until mesh is 90% full (~91 connections). After 90%, switches to timer-only mode.

Candidate selection (N2 diversity scoring):

score = 1.0 / reporter_count + (0.3 if not_in_N3)
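The formula above can be written out as a small function (a sketch; the real implementation's signature is assumed). Fewer reporters means a rarer, more diverse candidate, and peers not yet visible in N3 get a flat bonus:

```rust
// Sketch of the N2 diversity score: peers reported by fewer mesh neighbors
// score higher, with a +0.3 bonus for candidates not already in our N3.
fn diversity_score(reporter_count: u32, in_n3: bool) -> f64 {
    let base = 1.0 / reporter_count.max(1) as f64;
    if in_n3 { base } else { base + 0.3 }
}

fn main() {
    assert!((diversity_score(1, true) - 1.0).abs() < 1e-9);
    assert!((diversity_score(2, false) - 0.8).abs() < 1e-9);
    // A rare, unseen peer outscores a widely-reported known one.
    assert!(diversity_score(1, false) > diversity_score(5, true));
    println!("ok");
}
```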

Connection attempt cascade:

  1. Direct connect (15s timeout) — use stored/resolved address
  2. Introduction fallback — find N2 reporters who know this peer, ask each to relay-introduce us

Failure handling: Track consecutive failures. After 3 consecutive failures, back off (break loop, wait for next signal). Mark unreachable peers for future skipping.

5.2 Rebalance Cycle (every 600s)

Executed in priority order:

  1. Dead connection removal: Remove connections with close_reason() set, or idle > 600s (zombie)
  2. Stale entry pruning: N2/N3 entries tagged to a peer that is no longer connected are pruned immediately (on disconnect and on startup sweep). Age-based fallback: entries older than 7 days. Social route watchers older than 30 days.
  3. Priority 0 — Preferred peer reconnection: Iterate preferred_peers table, reconnect any that are disconnected. If at capacity, evict the lowest-diversity non-preferred peer to make room. Prune preferred peers unreachable for 7+ days (slot released, does NOT auto-return on reconnect — must re-negotiate via MeshPrefer). After 7 days, social check-in frequency drops from 1–3 hours to daily until the 30-day reconnect watcher expires.
  4. Priority 1 — Reconnect recently dead: Re-establish dropped non-preferred connections. Skip blacklisted nodes — do not attempt reconnection to peers in mesh_blacklist.
  5. Priority 2 — Signal growth loop: Fill remaining empty slots via growth loop
  6. Idle session cleanup: Reap interactive sessions idle > 300s (5 min). Keep-alive sessions are NOT reaped by idle timeout.
  7. Relay intro dedup pruning: Clear seen_intros entries older than 30s, cap at 500
Note: Low diversity score alone does NOT trigger eviction. The only eviction path is Priority 0 (making room for a preferred peer).

5.3 Recovery Loop (reactive, mesh empty)

Trigger: disconnect_peer() fires when last mesh connection drops.

  1. Debounce 2 seconds (wait for cascading disconnects to settle)
  2. Gather anchors: known_anchors table ordered by last_seen DESC (LIFO — most recently seen is most likely still reachable) → fallback to hardcoded default anchor(s) only if known_anchors empty or exhausted
  3. For each anchor: connect, request referrals and matchmaking, try direct connect to each referral, fallback to hole punch via anchor for unreachable referrals
  4. Persist on anchor's referral list until released, begin growth loop immediately
  5. Post-bootstrap stale anchor cleanup: After successful bootstrap/recovery, probe known_anchors entries where last_seen > 7 days. Success: update last_seen. Failure: DELETE from known_anchors. Reuses existing anchor probe machinery (0xC3/0xC4). No new cycle or timer — runs as final step of bootstrap/recovery.

5.4 Initial Exchange (on every new connection)

When two nodes connect, they exchange:

Processing: Their N1 → our N2 table (tagged to reporter). Their N2 → our N3 table (tagged to reporter). Store profile, apply deletes, record replica overlaps. Trigger growth loop immediately with new N2/N3 candidates if mesh < 90% full.

5.5 Incremental Routing Diffs (every 120s + on change)

NodeListUpdate (0x01) contains N1 added/removed, N2 added/removed. Sent via uni-stream to all mesh peers and keep-alive sessions. Receiver processes: their N1 adds → our N2 adds, their N2 adds → our N3 adds, etc.

6. Network Knowledge Layers (N1/N2/N3)

| Layer | Source | Contains | Shared? | Stored in |
| --- | --- | --- | --- | --- |
| N1 | Our connections + social contacts | NodeIds only | Yes (as "N1 share") | mesh_peers + social_routes |
| N2 | Peers' N1 shares | NodeIds tagged by reporter | Yes (as "N2 share") | reachable_n2 |
| N3 | Peers' N2 shares | NodeIds tagged by reporter | Never | reachable_n3 |

<N4 access

A node has <N4 access to a target if the target appears in its N1, N2, or N3 tables. This means the target is reachable within 3 hops without needing worm search or relay introduction. The social/file connectivity check (see Section 16) uses <N4 access to determine whether keep-alive sessions are needed.

What is NEVER shared

Address resolution cascade (connect_by_node_id)

| Step | Method | Timeout | Details |
| --- | --- | --- | --- |
| 0 | Social route cache | n/a | social_routes table (cached addresses for follows/audience) |
| 1 | Peers table | n/a | Stored address from previous connection |
| 2 | N2 ask reporter | varies | Ask the mesh peer who reported target in their N1 |
| 3 | N3 chain resolve | varies | Ask reporter's reporter (2-hop chain) |
| 4 | Worm search | 3s total | Burst to all peers → nova to N2 referrals (each does own burst) |
| 5 | Relay introduction | 15s | Hole punch via intermediary relay |
| 6 | Session relay | n/a | Pipe traffic through intermediary (own-device or opt-in) |

7. Three-Layer Architecture (Mesh / Social / File)

The network operates across three distinct layers, each with its own connections, routing, and purpose. The separation enables specialized behavior without the layers interfering with each other.

| Layer | Purpose | Connections | Sync trigger |
| --- | --- | --- | --- |
| Mesh | Structural backbone: N1/N2/N3 routing, diversity, discovery | 101 mesh slots (preferred + non-preferred) | N/A — mesh is infrastructure, not content |
| Social | Follows, audience, DMs — the human relationships | Social routes + keep-alive sessions as needed | Pull posts when Self Last Encounter > 3 hours |
| File | Content storage and distribution — blobs, CDN trees | Upstream/downstream file peers + keep-alive sessions as needed | Pull on blob request, push on post creation |

Key principle: mesh is not for content

Pull sync does not pull posts from mesh peers. Mesh connections exist for routing diversity and discovery. Content flows through the social layer (posts from people you follow) and the file layer (blobs from upstream/downstream hosts). This separation means mesh connections can be optimized purely for network topology without social bias.

Cross-layer benefits

Each layer's connections contribute to finding nodes and referrals for the other layers. Keep-alive sessions from the social and file layers participate in N2/N3 routing, which improves <N4 access for all three layers. A social keep-alive session might provide the N2 entry that helps the mesh growth loop find a diverse new peer, and vice versa.

8. Anchors

Intent

Anchors are "just peers that are directly reachable" — standard ItsGoin nodes with a routable address. They run the same code with no special protocol. Their value comes from being directly connectable for bootstrapping new nodes into the network and matchmaking (introducing peers to each other). Anchors include VPS-deployed nodes (always-on) and desktop nodes with UPnP port mappings (see Section 11).

Each profile can carry a preferred anchor list — infrastructure addresses, not social signals.

Status: Complete (with gaps)

When anchors are used

Anchor referral mechanics

When a bootstrapping node connects, the anchor provides referrals from its mesh and referral list. The node persists on the anchor's referral list until released at the referral count limit. During this time, the anchor can matchmake — introducing the new node to other peers requesting referrals.

Anchor selection order

  1. known_anchors table: ORDER BY last_seen DESC (LIFO). The most recently seen anchor is most likely still reachable, particularly given short-lived home desktop anchors.
  2. Hardcoded default anchor(s) — only if known_anchors is empty or exhausted. A brand-new node hits hardcoded anchors once on first bootstrap, populates known_anchors from that session, and the hardcoded list recedes to pure fallback.

No scoring, no success counting, no prediction. Attempt, move to next on failure. The known_anchors table stores only: node_id, addresses, last_seen.
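The selection order above can be sketched in a few lines. Struct and function names here are illustrative assumptions, not the real schema:

```rust
// Sketch of the anchor attempt order: known anchors LIFO on last_seen,
// hardcoded bootstrap addresses strictly as fallback.
struct KnownAnchor {
    addr: String,
    last_seen: u64, // unix seconds
}

fn anchor_attempt_order(mut known: Vec<KnownAnchor>, bootstrap: Vec<String>) -> Vec<String> {
    known.sort_by(|a, b| b.last_seen.cmp(&a.last_seen)); // most recent first
    known.into_iter().map(|a| a.addr).chain(bootstrap).collect()
}

fn main() {
    let known = vec![
        KnownAnchor { addr: "old:4433".into(), last_seen: 100 },
        KnownAnchor { addr: "fresh:4433".into(), last_seen: 900 },
    ];
    let order = anchor_attempt_order(known, vec!["bootstrap:4433".into()]);
    assert_eq!(order, vec!["fresh:4433", "old:4433", "bootstrap:4433"]);
    println!("{:?}", order);
}
```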

Anchor self-verification Complete

Nodes with UPnP-mapped IPv4 or IPv6 public addresses cannot self-certify as anchors — they need external verification that they are genuinely reachable by cold direct connect. A node is a viable anchor only if a complete stranger can connect to it directly with no introduction, no hole punch, and no relay.

Witness selection

Node A (candidate anchor) selects a witness from its own N2 table entries NOT present in its N1. These are genuine strangers — no prior connection, no cached address, no warm path. A selects one (call it C) and knows C's address via the N1 reporter (call it B) who reported C in their N1 share.

Probe message flow

A → B (N1 reporter of C): AnchorProbeRequest {
    target_addr,     // A's external address to test
    witness,         // C's NodeId
    return_via,      // B's NodeId (for failure reporting)
}

B → C: forward AnchorProbeRequest

C: cold direct QUIC connect to target_addr
   — MUST use only raw QUIC connect (step 1 of connect_by_node_id)
   — MUST skip entire resolution cascade, hole punch, introduction, relay
   — 15s timeout

SUCCESS: C → A directly (on new connection): AnchorProbeResult { reachable: true }
FAILURE: C → B → A: AnchorProbeResult { reachable: false }

Asymmetric return path: If cold connect fails, by definition there is no direct path from C to A. C reports failure through B (who has a live connection to A). On success, C has a fresh direct connection and uses it. The return_via field tells C which node to route failure through.

Why bypass the cascade: The normal connect_by_node_id cascade has 7 steps including hole punch and relay. If C uses the full cascade, a successful result via relay is a false positive. The probe handler must be a special code path: raw QUIC connect only.

Anchor candidacy checklist

is_anchor_candidate():
  - has UPnP mapping OR has IPv6 public address
  - probe succeeded within last 30 minutes
  - mesh ≥ 50 peers (sufficient N2 density)
  - uptime ≥ 2 hours continuous
  - NOT mobile (platform check at build time)
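The checklist can be expressed directly as a predicate. This is a hedged Rust sketch; the field names are assumptions, not the real struct:

```rust
// Illustrative state snapshot for the candidacy check (field names assumed).
struct CandidateState {
    has_upnp: bool,
    has_public_v6: bool,
    secs_since_probe_ok: u64,
    mesh_peers: usize,
    uptime_secs: u64,
    is_mobile: bool,
}

fn is_anchor_candidate(s: &CandidateState) -> bool {
    (s.has_upnp || s.has_public_v6)         // externally mapped or public
        && s.secs_since_probe_ok <= 30 * 60 // probe succeeded within 30 min
        && s.mesh_peers >= 50               // sufficient N2 density
        && s.uptime_secs >= 2 * 3600        // 2h continuous uptime
        && !s.is_mobile                     // desktop/VPS only
}

fn main() {
    let mut s = CandidateState {
        has_upnp: true, has_public_v6: false, secs_since_probe_ok: 60,
        mesh_peers: 80, uptime_secs: 3 * 3600, is_mobile: false,
    };
    assert!(is_anchor_candidate(&s));
    s.mesh_peers = 10; // below the 50-peer floor
    assert!(!is_anchor_candidate(&s));
    println!("ok");
}
```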

Probe refresh schedule

| Trigger | Action |
| --- | --- |
| Startup (after UPnP attempt) | Run initial probe |
| UPnP renewal if address changed | Re-probe |
| Every 30 minutes while anchor-declared | Periodic re-probe |
| Any failed inbound connection | Immediate re-probe |
| Two consecutive probe failures | Stop advertising as anchor, revert to normal peer |

Session fallback for full anchors

When an anchor's mesh is full (101/101), new nodes fall back to a session connection for matchmaking. The anchor accepts referral requests over session connections, not just mesh.

Remaining gaps

| Gap | Impact |
| --- | --- |
| Profile anchor lists not used for discovery | Profiles have an anchors field but it's not consulted during address resolution |
| No anchor-to-anchor awareness | Anchors don't discover each other unless they connect through normal mesh growth |
| Bootstrap chicken-and-egg | A fresh anchor with few peers produces few N2 candidates for new nodes. Growth stalls because there's nothing to grow from. |

9. Referrals

Status: Complete

Referral list mechanics (anchor side)

Anchors maintain an in-memory HashMap of registered peers. Each entry: { node_id, addresses, use_count, disconnected_at }.

| Property | Value |
| --- | --- |
| Tiered usage caps | 3 uses if list < 50, 2 uses at 50+, 1 use at 100+ |
| Disconnect grace | 2 minutes before pruning |
| Sort order | Least-used first (distributes load) |
| Auto-supplement | When explicit list is sparse (< 3 entries), supplement with random mesh peers |
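The tiered caps reduce to a small lookup. A sketch (function name assumed):

```rust
// Tiered usage caps: smaller referral lists let each entry be handed out
// more times; large lists spread referrals thinner.
fn referral_use_cap(list_len: usize) -> u32 {
    match list_len {
        0..=49 => 3,  // list < 50
        50..=99 => 2, // 50+
        _ => 1,       // 100+
    }
}

fn main() {
    assert_eq!(referral_use_cap(10), 3);
    assert_eq!(referral_use_cap(49), 3);
    assert_eq!(referral_use_cap(50), 2);
    assert_eq!(referral_use_cap(100), 1);
    println!("ok");
}
```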

10. Relay & NAT Traversal

Status: Complete

Relay selection (find_relays_for)

Find up to 3 relay candidates, prioritized:

  1. Preferred tree intersection: Target's preferred_tree (from social_routes, ~100 NodeIds) intersected with our connections. Prefer our own preferred peers within that tree. TTL=0.
  2. N2 reporters: Our mesh peers who reported the target in their N1 share. TTL=0.
  3. N3 via preferred tree: Target's preferred_tree intersected with N3 reporters. TTL=1.
  4. N3 reporters: Any N3 reporter for the target. TTL=1.

RelayIntroduce flow (0xB0/0xB1)

  1. Requester → opens bi-stream to relay, sends RelayIntroduce { target, requester, requester_addresses, ttl }
  2. Relay handles three cases:
    • We ARE the target: Return our addresses, spawn hole punch to requester
    • Target is our mesh or session peer: Forward request to target on new bi-stream, relay response back. Inject observed public addresses for both parties (session peers carry remote_addr from their inbound connection).
    • TTL > 0 and target in our N2: Forward to the reporter with TTL-1 (chain forwarding, max TTL=2)
  3. Requester receives RelayIntroduceResult { target_addresses, relay_available }, then:
    • hole_punch_parallel(): Try all returned addresses in parallel, retry every 2s, 30s total timeout
    • If hole punch fails and relay_available: open SessionRelay (0xB2) pipe through the intermediary

Session relay (relay pipes)

Intermediary splices bi-streams between requester and target. Desktop: max 10 concurrent pipes. Mobile: max 2. Each pipe has a 50MB byte cap and 2-min idle timeout.

v0.2.0 change: Relay pipes are own-device-only by default. A node will only relay traffic between its own devices (same identity key, different device identity). Users can opt in to relaying for others in Settings, but this is not enabled automatically. This prevents nodes from unknowingly burning bandwidth for random peers while still enabling personal multi-device routing.

Deduplication & cooldowns

| Mechanism | Window | Purpose |
| --- | --- | --- |
| seen_intros | 30s | Prevents forwarding loops |
| relay_cooldowns | 5 min per target | Prevents relay spamming |

Hole punch mechanics

Both sides filter self-reported addresses to publicly-routable only (no Docker bridge, VPN, or LAN IPs) and prepend UPnP external address if available. The relay injects each party's observed public address (from the QUIC connection) at the front of the list. All paths use hole_punch_parallel(): parse returned addresses into QUIC EndpointAddr, spawn parallel connect attempts to every address simultaneously. Each attempt: 2s timeout, retried until 30s total deadline. First successful connection wins.
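The candidate-address assembly above can be sketched as follows. Helper names are illustrative; the IPv6 unique-local check is done manually since `Ipv6Addr::is_unique_local` is not stabilized:

```rust
use std::net::{IpAddr, SocketAddr};

// Sketch: keep only publicly-routable addresses from the self-reported set.
fn is_publicly_routable(ip: &IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => {
            !(v4.is_private() || v4.is_loopback() || v4.is_link_local() || v4.is_unspecified())
        }
        // fc00::/7 unique-local range checked via the first segment
        IpAddr::V6(v6) => {
            !(v6.is_loopback() || v6.is_unspecified() || (v6.segments()[0] & 0xfe00) == 0xfc00)
        }
    }
}

// Relay-observed address first, then UPnP external address, then the
// filtered self-reported list — all tried in parallel by the punch.
fn punch_candidates(
    observed: SocketAddr,
    upnp: Option<SocketAddr>,
    self_reported: Vec<SocketAddr>,
) -> Vec<SocketAddr> {
    let mut out = vec![observed];
    out.extend(upnp);
    out.extend(self_reported.into_iter().filter(|a| is_publicly_routable(&a.ip())));
    out.dedup();
    out
}

fn main() {
    let observed: SocketAddr = "203.0.113.9:4433".parse().unwrap();
    let lan: SocketAddr = "192.168.1.5:4433".parse().unwrap();
    let public: SocketAddr = "198.51.100.7:4433".parse().unwrap();
    let c = punch_candidates(observed, None, vec![lan, public]);
    assert_eq!(c, vec![observed, public]); // LAN address filtered out
    println!("{:?}", c);
}
```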

NAT type detection

Status: Complete (interim: public STUN servers)

On startup, each node classifies its NAT type as one of four categories:

Current implementation (interim)

Raw STUN Binding Requests (20 bytes, no crate dependency) sent to stun.l.google.com:19302 and stun.cloudflare.com:3478 from a single UDP socket. XOR-MAPPED-ADDRESS parsed from each response (IPv4 + IPv6 supported). Comparison: same mapped port = Easy, different = Hard, matches local = Public. 3s timeout per server. UPnP success overrides to Public. Anchors skip probing entirely (already Public).
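The comparison rule can be sketched as a pure function over the two mapped addresses and the local bound address (a simplified sketch; the real code also handles per-server timeouts and the UPnP override):

```rust
use std::net::SocketAddr;

#[derive(Debug, PartialEq)]
enum NatType {
    Public, // mapped address equals our local address: no NAT
    Easy,   // same mapped port at both servers: endpoint-independent mapping
    Hard,   // different mapped ports: endpoint-dependent mapping
}

fn classify_nat(local: SocketAddr, mapped_a: SocketAddr, mapped_b: SocketAddr) -> NatType {
    if mapped_a == local && mapped_b == local {
        NatType::Public
    } else if mapped_a.port() == mapped_b.port() {
        NatType::Easy
    } else {
        NatType::Hard
    }
}

fn main() {
    let local: SocketAddr = "192.0.2.1:5000".parse().unwrap();
    let a: SocketAddr = "203.0.113.1:6000".parse().unwrap();
    let b_diff: SocketAddr = "203.0.113.1:6001".parse().unwrap();
    assert_eq!(classify_nat(local, a, a), NatType::Easy);
    assert_eq!(classify_nat(local, a, b_diff), NatType::Hard);
    assert_eq!(classify_nat(local, local, local), NatType::Public);
    println!("ok");
}
```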

Target design (multi-anchor STUN)

When the network has enough anchors, replace public STUN servers with anchor-reported your_observed_addr from InitialExchange. Connecting to two or more anchors at different public IPs provides the same classification without external dependencies.

NAT type sharing

NAT type is included as a string field ("public"/"easy"/"hard"/"unknown") in InitialExchangePayload. Stored per-peer in the peers table (nat_type TEXT column). Available for hole punch decisions before any connection attempt.

Hole punch strategy

| Peer A | Peer B | Strategy |
| --- | --- | --- |
| Public / Easy | Any | Hole punch (likely success) |
| Hard NAT | Easy NAT | Hole punch (B's port is predictable) |
| Hard NAT | Hard NAT | Port scanning: hole_punch_with_scanning() tries standard punch first, then escalates to tiered port scanning (±500, ±2000, full ephemeral range) |

All hole punch paths use hole_punch_with_scanning() which replaces the former hard+hard skip. NAT profiles (NatMapping + NatFiltering) from InitialExchange determine whether scanning is attempted. Behavioral inference updates filtering classification from connection outcomes.

Advanced NAT traversal

Status: Complete

NAT "hardness" has two independent dimensions:

STUN probing at startup classifies mapping (EIM/EDM). Filtering is determined reliably via the anchor filter probe.

NAT filter probe (0xC6/0xC7)

After anchor registration, each node with Unknown filtering sends a NatFilterProbe bi-stream request to its anchor. The anchor creates a temporary QUIC endpoint on a random port and attempts to connect to the node’s observed address (2s timeout). If the connection succeeds, the node is Open (address-restricted or better — accepts packets from any port on the anchor’s IP). If it times out, the node is PortRestricted.

This probe runs once at startup (during anchor register cycle) and the result feeds into all subsequent InitialExchange payloads, so peers know each other’s exact filtering type.

Note: “Public” NAT type does not automatically mean Open filtering. A node may be public on IPv6 but NATed on IPv4. The filter probe tests actual reachability from a different port, regardless of self-declared NAT type. Startup logs now report public (v4 only), public (v6 only), or public (v4+v6).

NAT combination matrix

Side A | Side B | Result
addr-restricted, EIM | addr-restricted, EDM | Basic hole punch
port-restricted, EIM | addr-restricted, EDM | A scans to find+open port; B punches A's stable port regularly
addr-restricted, EDM | port-restricted, EDM | B scans to find+open port; A waits then responds
port-restricted, EDM | port-restricted, EDM | Both scan+punch alternately
addr-restricted, EIM | addr-restricted, EIM | Basic hole punch
port-restricted, EIM | addr-restricted, EIM | Basic hole punch
addr-restricted, EDM | port-restricted, EIM | B scans to find+open port; A punches B's stable port regularly
port-restricted, EDM | port-restricted, EIM | B scans to find+open port; A punches B's stable port regularly

Key insight: if both sides have Open (address-restricted) filtering, scanning is never needed — should_try_scanning() returns false and basic hole punch handles it.
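The matrix collapses to one compact rule. The sketch below is a derived transcription of the table, not the codebase's should_try_scanning(); enum and function names are illustrative:

```rust
#[derive(Clone, Copy, PartialEq)]
enum Filtering {
    AddrRestricted, // "Open": accepts packets from any port on a known IP
    PortRestricted,
}

#[derive(Clone, Copy, PartialEq)]
enum Mapping {
    Eim, // endpoint-independent: stable, predictable external port
    Edm, // endpoint-dependent: new port per destination
}

/// Rule consistent with every row of the matrix above: scanning is needed
/// only when at least one side filters by port AND at least one side's
/// mapping is endpoint-dependent. Both-addr-restricted pairs never scan.
fn needs_scanning(a: (Filtering, Mapping), b: (Filtering, Mapping)) -> bool {
    let any_port_restricted =
        a.0 == Filtering::PortRestricted || b.0 == Filtering::PortRestricted;
    let any_edm = a.1 == Mapping::Edm || b.1 == Mapping::Edm;
    any_port_restricted && any_edm
}
```

An EIM port-restricted side keeps a stable external port, so the other side can punch it directly; only EDM makes a port unpredictable enough to require scanning.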

Role-based scanning protocol

Each side independently determines its role based on its own NAT profile:

The scanner opens ports on its own firewall. The other side’s periodic punch (one every 2s to the scanner’s observed address) checks if the scanner has opened a port matching the puncher’s actual port. For both-EDM pairs, both sides scan and punch simultaneously.

Scan parameters

Why 5-minute scan duration is acceptable

The cost is time, not resources (~20 in-flight at any time, ~100 probes/sec). For connections that would otherwise be impossible (both EDM + port-restricted), accepting a longer setup time is far better than giving up entirely. Most successful connections resolve within the first 40 seconds (±2000 port range).
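A minimal sketch of the tier ordering described above (±500, then ±2000, then the full ephemeral range 49152–65535), assuming nearest-port-first within each tier and a 1024 lower bound; the real scanner's ordering and pacing may differ:

```rust
use std::collections::HashSet;

/// Produce the tiered scan order around the peer's observed port.
/// Each port appears at most once; earlier tiers win.
fn scan_order(observed: u16) -> Vec<u16> {
    let mut seen: HashSet<u16> = HashSet::new();
    let mut out: Vec<u16> = Vec::new();

    fn push(seen: &mut HashSet<u16>, out: &mut Vec<u16>, p: i32) {
        // Assumed bounds: skip well-known ports and anything out of range.
        if (1024..=65535).contains(&p) && seen.insert(p as u16) {
            out.push(p as u16);
        }
    }

    push(&mut seen, &mut out, observed as i32);
    // Tier 1 (±500) then tier 2 (±2000), nearest offsets first.
    for &radius in &[500i32, 2000] {
        for d in 1..=radius {
            push(&mut seen, &mut out, observed as i32 + d);
            push(&mut seen, &mut out, observed as i32 - d);
        }
    }
    // Tier 3: full ephemeral range.
    for p in 49152..=65535i32 {
        push(&mut seen, &mut out, p);
    }
    out
}
```

With ~100 probes/sec, tier 2 (4000 ports) finishes in roughly 40 seconds, matching the observation that most successes land within the ±2000 range.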

Design principle: This protocol eliminates the need for full relay in virtually all NAT scenarios. Session relay remains opt-in only — it is never used as an automatic fallback. The scanning approach respects the user’s intent that peers communicate directly whenever physically possible.

11. UPnP Port Mapping

Status: Complete

Purpose

UPnP (Universal Plug and Play) allows a node to request its home router to forward an external port to its local QUIC port. This makes the node directly reachable from the internet without hole punching — any peer with the external address can connect immediately. This dramatically improves connection success rates for desktop nodes on home networks.

Startup flow

bind Endpoint → attempt UPnP mapping (2s timeout) → store external addr → bootstrap
  1. Discover gateway: Search for UPnP/NAT-PMP gateway with a 2-second timeout. If no gateway found, proceed without — do not block startup.
  2. Request mapping: Map both UDP and TCP for the local QUIC port to the same external port (or next available). UDP is required for QUIC (existing). TCP enables HTTP post delivery (see Section 25). Both use the same external port number. If the router supports one but not the other, accept the partial mapping gracefully — QUIC connectivity is not affected by TCP mapping failure. Request lease TTL of 3600s.
  3. Store external address: The resulting external SocketAddr is stored alongside iroh's observed addresses. It feeds into N+10 identification, InitialExchange, anchor registration, and all peer address advertisements.
  4. Log result: Clearly log whether UPnP succeeded, failed, or was unavailable. This is critical for diagnosing connectivity issues.

Lease renewal cycle (every 2700s / 45 min)

UPnP mappings have a TTL (typically 3600s but varies by router). A renewal loop runs every 45 minutes to refresh the mapping before it expires. If renewal fails, the external address is removed from advertisements and the node falls back to hole punch / relay paths gracefully.

Shutdown

Explicitly release the UPnP mapping on clean shutdown. Routers have finite mapping tables — releasing is good citizenship. Tauri's shutdown hook handles this.

Integration with existing address logic

The UPnP external address is treated the same as any other address the node knows about. It feeds into N+10 identification, InitialExchange, anchor registration, and all peer address advertisements.

Why this matters for mobile

Mobile devices on cellular networks cannot use UPnP (carrier NAT doesn't expose it). However, if the peers they're trying to reach (especially desktop nodes and anchors) have UPnP mappings, those peers become directly reachable from the phone without hole punching. The phone doesn't need UPnP — the other side does.

Honest limitations

Limitation | Impact
UPnP disabled on router | Some ISPs ship routers with UPnP off. Mapping silently fails; fall back to hole punch.
Double NAT | ISP modem + user router: the mapping reaches the inner router but not the outer one. Partial help at best.
Cellular networks | No UPnP at all. This is purely a desktop/home-network feature.
Carrier-grade NAT (CGNAT) | The ISP shares one public IP across many customers. UPnP maps to the ISP's NAT, not the internet. Same as double NAT.

Design principle: UPnP is a best-effort enhancement that improves direct connection reliability for the common case. It is not a dependency. The hole punch + relay fallback chain already handles all failure cases — UPnP just reduces how often you fall back to them.

UPnP nodes are anchors

A node with a successful UPnP mapping is directly reachable from the internet — which is the only thing that makes an anchor an anchor. When UPnP mapping succeeds, the node self-declares as an anchor (is_anchor = true). Other peers will add it to their known_anchors table, providing diverse bootstrap paths back into the network.

When the UPnP mapping is lost (lease renewal fails, shutdown), the node reverts to non-anchor. Peers that stored it as an anchor will naturally age it out via last_seen — LIFO ordering means stale anchors drop to the bottom. The 7-day post-bootstrap cleanup probes stale entries and removes failures. No special cleanup needed beyond the existing anchor infrastructure.

This means any desktop on a home network with UPnP-capable router becomes a potential bootstrap point for the network, dramatically increasing the number of available anchors without any manual server deployment.

Implementation

Crate: igd-next (async support, well-maintained fork of igd). Implementation lives in network.rs alongside the iroh Endpoint — UPnP mapping is an Endpoint concern, not a connection concern.

12. LAN Discovery

Status: Planned

iroh's mDNS address lookup broadcasts peer presence on the local network via multicast DNS (service name "irohv1", backed by the swarm-discovery crate). Currently this is configured as a passive address resolver — if we already know a peer's NodeId, mDNS can resolve its LAN address. But mDNS also discovers unknown peers on the same network, and iroh exposes this via MdnsAddressLookup::subscribe().

Discovery flow

  1. Hold the mDNS handle: Build MdnsAddressLookup explicitly (not via the endpoint builder) so we retain a clone for subscribing.
  2. Spawn a LAN scan loop: Call mdns.subscribe().await to get a stream of DiscoveryEvent::Discovered and DiscoveryEvent::Expired events.
  3. On discovery: Extract NodeId + LAN addresses from the event. If not already connected, initiate a direct connection + initial exchange. Register as a LAN session (a keep-alive session tagged as local).
  4. On expiry: Clean up the LAN session. Peer left the network or powered off.

LAN sessions

LAN peers are special: zero-cost bandwidth, sub-millisecond latency, and very likely someone you know (same household/office). They deserve their own treatment beyond regular mesh or session slots:

Design rationale

Today, two distsoc devices on the same WiFi network can only find each other if they happen to share a peer that reports them in N2. This is absurd — they're on the same network segment. LAN discovery turns mDNS from a passive address resolver into an active peer source, exploiting the fact that local bandwidth is essentially unlimited.

The keep-alive + relay pattern means a household with one well-connected desktop and several phones creates its own mini-mesh: the desktop provides anchor-like connectivity, the phones stay connected through it, and everyone syncs instantly over the LAN even when the internet connection drops.

Implementation note: iroh's MdnsAddressLookup::subscribe() returns a Stream<DiscoveryEvent>. The DiscoveryEvent::Discovered variant includes EndpointInfo with NodeId + IP addresses. Custom user_data can be set via endpoint.set_user_data_for_address_lookup() to embed distsoc-specific metadata (e.g., display name) in the mDNS TXT record.

13. Worm Search

Status: Complete

Used at step 4 of connect_by_node_id, after N2/N3 resolution fails.

Algorithm

  1. Build needles: target NodeId + target's N+10 (up to 10 preferred peers from their profile/cached N+10)
  2. Local check: Search own connections + N2/N3 for any of the 11 needles. Also check local storage, CDN downstream tree, and blob store for any requested post/blob content.
  3. Burst (500ms timeout): Send WormQuery{ttl=0} (0x60) to all mesh peers in parallel. Each peer checks their local connections + N2/N3, plus local storage and CDN tree for post/blob content.
  4. Nova (1.5s timeout): Each burst response includes a random "wide referral" — an N2 peer. Connect to those referrals and send WormQuery{ttl=1}. The referred peer does its own 101-burst (fans out to all its mesh peers with ttl=0). This reaches ~10K nodes with only ~202 relay hops, keeping network pressure low by expanding one hop at a time rather than flooding.
  5. Total timeout: 3 seconds for the entire search.
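The ~10K node and ~202 relay hop figures follow from the fan-out arithmetic, sketched here with the desktop mesh size of 101 (function name illustrative):

```rust
/// Fan-out arithmetic for the burst + nova pattern: returns
/// (nodes reached, relay hops) for a given mesh size.
fn nova_reach(mesh_slots: u32) -> (u32, u32) {
    let burst = mesh_slots; // our own ttl=0 burst
    let referred = mesh_slots * mesh_slots; // each wide referral's own 101-burst
    let nodes = burst + referred;
    let hops = mesh_slots + mesh_slots; // one query per burst peer + one per referral
    (nodes, hops)
}
```

Expanding one referral hop at a time keeps pressure low: coverage grows quadratically while query volume grows only linearly.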

Content search

WormQuery carries optional post_id and blob_id fields, enabling unified search for nodes, posts, and blobs in a single query. Each peer checks:

  • Local post storage for the requested post_id
  • Its CDN downstream tree (up to 100 hosts per post)
  • Its blob store for the requested blob_id

WormResponse carries post_holder and blob_holder fields alongside the existing node search results. A content hit (post or blob holder found) is treated as a successful response even without a node match.

The CDN layer is the key multiplier: each node's downstream tree can cover hundreds of posts across dozens of hosts, giving every peer thousands of "I know where that is" answers. Combined with social layer knowledge, even a 202-hop nova covers enormous content space.

PostFetch (0xD4/0xD5)

Lightweight single-post retrieval after worm search identifies a holder. Opens a bi-stream to the holder and requests one post by ID. Much lighter than full PullSync — no follow filtering, no batch processing, just the target post.

Dedup & cooldown

Mechanism | Window | Purpose
seen_worms | 10s | Prevents loops during fan-out
Miss cooldown | 5 min (in DB) | Prevents repeated searches for unreachable targets

14. Preferred Peers

Status: Complete

Negotiation (MeshPrefer, 0xB3)

Properties

15. Social Routing

Status: Complete

Caches addresses for follows and audience members, separate from mesh connections.

social_routes table

Field | Purpose
node_id | The social contact's NodeId
nplus10 | Their N+10 (NodeId + 10 preferred peers)
addresses | Their known IP addresses
peer_addresses | Their N+10 contacts (PeerWithAddress list)
relation | Follow / Audience / Mutual
status | Online / Disconnected
last_connected_ms | When we last connected
reach_method | Direct / Relay / Indirect
preferred_tree | ~100 NodeIds for relay tree

Wire messages

Code | Name | Stream | Purpose
0x70 | SocialAddressUpdate | Uni | Sent when a social contact's address changes or they reconnect
0x71 | SocialDisconnectNotice | Uni | Sent when a social contact disconnects
0x72 | SocialCheckin | Bi | Keepalive with address + N+10 updates

Reconnect watchers

reconnect_watchers table: when peer A asks about disconnected peer B, A is registered as a watcher. When B reconnects, A gets a SocialAddressUpdate notification. Watchers pruned after 30 days. Low priority — daily check frequency for watchers older than 7 days.

Social route lifecycle

16. Keep-Alive Sessions

Status: Planned

Purpose

When the mesh 101 doesn't provide <N4 access to all the nodes we need for social and file operations, keep-alive sessions bridge the gap. These are long-lived connections that participate in N2/N3 routing but are not part of the mesh 101.

Social/File connectivity check (every 60s)

Periodically check whether we can reach every node we need. A node is considered reachable if either:

Only when neither condition is met do we open a keep-alive session. With UPnP auto-anchors (see Section 11) scattered throughout the network, the odds of an anchor being within N2 of any given peer increase significantly, reducing the number of keep-alive sessions needed.

Nodes to check:

For any node whose N+10 is NOT reachable within N3, open a keep-alive session to the closest available node in their N+10 (or to them directly if possible). This ensures we can always find and reach our social and file contacts without worm search.

Keep-alive session behavior

Practical ceilings

Platform | Ceiling | Binding constraint
Desktop | ~300–500 | Routing diff broadcast overhead — NodeListUpdate to all sessions every 120s. Memory and connection count are not the bottleneck.
Mobile | ~25–50 | Battery (radio wake-ups per heartbeat cycle) and OS background restrictions (iOS/Android will kill background sockets).

Mobile priority stack

When approaching the mobile ceiling, keep-alive sessions are prioritized:

  1. DMs in the last 30 min — active conversations take highest priority
  2. Follows — people you follow
  3. Audience — people following you
  4. File peers — upstream/downstream blob hosts

Lower-priority sessions are closed first to make room.
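The stack above can be sketched as an ordered enum plus a cut at the ceiling. Names are illustrative, not the codebase's:

```rust
/// Mobile keep-alive priority stack: lower discriminant = higher priority.
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum KeepAlivePriority {
    RecentDm = 0, // DM within the last 30 min
    Follow = 1,
    Audience = 2,
    FilePeer = 3, // upstream/downstream blob hosts
}

/// Returns the session ids to close to get back under `ceiling`.
/// Sorting is stable, so order within a tier is preserved.
fn close_candidates(
    mut sessions: Vec<(u64, KeepAlivePriority)>,
    ceiling: usize,
) -> Vec<u64> {
    sessions.sort_by_key(|&(_, priority)| priority);
    let keep = ceiling.min(sessions.len());
    sessions.drain(keep..).map(|(id, _)| id).collect()
}
```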

Hysteresis

Don't open a keep-alive session for a contact who just barely fell outside N3. Wait for persistent unreachability — the contact must be absent from N1/N2/N3 for multiple consecutive connectivity checks (e.g., 3 checks = 3 minutes) before opening a keep-alive. This prevents churn from nodes that transiently appear and disappear at the N3 boundary.
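A minimal sketch of that hysteresis rule, assuming a per-contact miss counter (names illustrative); at the 60s check cadence, 3 misses is 3 minutes:

```rust
/// Consecutive misses required before opening a keep-alive session.
const MISS_THRESHOLD: u32 = 3;

struct ReachTracker {
    consecutive_misses: u32,
}

impl ReachTracker {
    /// Feed one connectivity-check result; returns true once the contact
    /// has been persistently unreachable and a keep-alive should open.
    fn observe(&mut self, reachable_within_n3: bool) -> bool {
        if reachable_within_n3 {
            self.consecutive_misses = 0; // any sighting resets the counter
        } else {
            self.consecutive_misses += 1;
        }
        self.consecutive_misses >= MISS_THRESHOLD
    }
}
```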

Reject + redirect

When a node is at its keep-alive session capacity (50% of total sessions), it refuses new keep-alive requests with a redirect — offering a random N2 node that also has <N4 access to the target. Same pattern as mesh RefuseRedirect but for the keep-alive pool. The requester tries the suggested peer instead.

Cross-layer benefit

Keep-alive sessions from the social and file layers feed N2/N3 entries back into the mesh layer. A social keep-alive to a friend's preferred peer might provide N2 entries that help the mesh growth loop. Similarly, a file keep-alive to an upstream host might provide access to nodes the mesh has never seen. The three layers compound each other's reach.

17. Content Propagation

Intent

"Attention creates propagation": when you view something, you cache it. The cache is optionally offered for serving. Hot content spreads naturally through demand. Cold content decays unless intentionally hosted.

The CDN vision: every file by author X carries an author manifest with the author's N+10 and recent post list. If you hold any file by author X, you passively know X's recent posts and can find X through their N+10.

Status: Partial

Passive discovery via neighborhood diffs

Passive file-chain propagation is enabled through BlobHeader neighborhood diffs. Every blob header carries the author's 25+25 post neighborhood (25 previous + 25 following). When a host receives a BlobHeaderDiff (0x96), they learn about the author's newer posts without explicit subscription. Hosts of old content are pulled toward new content by the same author naturally — attention creates propagation.

Remaining gaps

Gap | Impact
N+10 not yet in file headers | Blob headers should include author N+10, upstream N+10, and downstream N+10s. Currently only AuthorManifest travels with blobs.
No "fetch from any peer who has it" | Blobs are fetched from specific peers. No content-addressed routing ("who has blob X?").

18. Files & Storage

Blob storage: Complete

Property | Value
CID format | BLAKE3 hash of blob data (32 bytes, hex-encoded)
Filesystem path | {data_dir}/blobs/{hex[0..2]}/{hex} (256 shards)
Metadata table | blobs (cid, post_id, author, size_bytes, created_at, last_accessed_at, pinned)
Max blob size | 10 MB
Max attachments per post | 4

Blob content immutability

Blob data is BLAKE3-addressed — the CID is the hash of the content. This means blob content is immutable by definition. Any mutable metadata (neighborhood, host lists, signatures) MUST be stored separately in a BlobHeader. Inline mutable headers are architecturally incompatible with content addressing.

BlobHeader: Planned

Formal mutable structure replacing/extending CdnManifest. Stored and transmitted separately from blob data.

BlobHeader {
    cid,                    // BLAKE3 hash of blob content
    author_nplus10,         // Author's N+10 (NodeId + 10 preferred peers)
    author_recent_posts,    // 25 previous + 25 following PostIds (neighborhood)
    upstream_nplus10,       // Upstream file source's N+10 (if not author)
    downstream_hosts,       // Up to min(100, floor(170MB / blob_size)) downstream hosts
    author_signature,       // ed25519 signature over author fields
    host_signature,         // ed25519 signature by current host
    updated_at,             // Timestamp of last header update
}

Blob transfer flow (0x90/0x91)

  1. Requester sends BlobRequest { cid, requester_addresses }
  2. Host checks local BlobStore:
    • Has blob: Return base64-encoded data + CDN manifest + file header (N+10s, recent posts). Try to register requester as downstream (max 100). If full, return existing downstream as redirect candidates.
    • No blob: Return found: false
  3. Requester verifies CID, stores blob locally, records upstream in blob_upstream table. Updates Self Last Encounter for the author based on file header.

CDN hosting tree: Complete

Blob eviction: Complete

priority = pin_boost + (relationship * heart_recency * freshness / (peer_copies + 1))
Factor | Calculation
pin_boost | 1000.0 if pinned, else 0.0. Own blobs auto-pinned.
relationship | 5.0 (us), 3.0 (mutual follow+audience), 2.0 (follow), 1.0 (audience), 0.1 (stranger)
heart_recency | Linear decay over 30 days: max(0, 1 - age/30d)
freshness | 1 / (1 + post_age_days)
peer_copies | Known replica count (from post_replicas, only if < 1 hour old)
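The formula transcribes directly to code. Struct fields below are illustrative, not the actual blobs-table schema:

```rust
/// Inputs to the eviction priority formula above.
struct BlobMeta {
    pinned: bool,
    relationship: f64,   // 5.0 (us) down to 0.1 (stranger)
    heart_age_days: f64, // days since last reaction
    post_age_days: f64,
    peer_copies: u32,    // known replicas
}

fn eviction_priority(b: &BlobMeta) -> f64 {
    let pin_boost = if b.pinned { 1000.0 } else { 0.0 };
    // Linear decay over 30 days, clamped at zero.
    let heart_recency = (1.0 - b.heart_age_days / 30.0).max(0.0);
    let freshness = 1.0 / (1.0 + b.post_age_days);
    pin_boost + b.relationship * heart_recency * freshness / (b.peer_copies as f64 + 1.0)
}
```

Pinned blobs always outrank unpinned ones (the unpinned maximum is 5.0), and widely replicated blobs are evicted first among otherwise equal candidates.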

Pin modes: Planned

The CDN is delivery infrastructure, not storage. Authors own durability. Pinning extends content in the local delivery pool — it is not a network obligation.

Concept | Status
Anchor pin vs Fork pin | Not started. Anchor pin = host the original (author retains control). Fork pin = independent copy (you become key owner).
Personal vault | Not started. Private durability for saved/pinned items.

19. Sync Protocol

Wire format

[1 byte: MessageType] [4 bytes: length (big-endian)] [length bytes: JSON payload]

Max payload: 16 MB. ALPN: itsgoin/3.
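The framing is simple enough to sketch inline; this is an illustrative encoder/decoder, not the codebase's:

```rust
/// Wire framing: [1 byte: type][4 bytes: big-endian length][JSON payload].
const MAX_PAYLOAD: usize = 16 * 1024 * 1024; // 16 MB cap

fn frame(msg_type: u8, payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(5 + payload.len());
    out.push(msg_type);
    out.extend_from_slice(&(payload.len() as u32).to_be_bytes());
    out.extend_from_slice(payload);
    out
}

/// Returns (message type, payload) if `buf` holds a complete, in-bounds frame.
fn parse(buf: &[u8]) -> Option<(u8, &[u8])> {
    if buf.len() < 5 {
        return None;
    }
    let len = u32::from_be_bytes([buf[1], buf[2], buf[3], buf[4]]) as usize;
    if len > MAX_PAYLOAD || buf.len() < 5 + len {
        return None;
    }
    Some((buf[0], &buf[5..5 + len]))
}
```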

Pull sync: social + file layers, not mesh

v0.2.0 change: Pull sync pulls posts from social layer peers (follows, audience) and upstream file peers, NOT from mesh peers. Mesh connections exist for routing diversity, not content. This separates infrastructure from content flow.

Self Last Encounter: For each peer we sync with, we track the timestamp of our last successful sync. When Self Last Encounter ages beyond 3 hours, a pull sync is triggered. Self Last Encounter is updated to the newer of: (a) what's currently stored, or (b) the "file last update" timestamp from file headers received during blob transfers. Since file headers include the author's recent post list, downloading a blob from any peer hosting that author's content can update Self Last Encounter for the author.

Pull sync filtering

Message types (47 listed below; the NatFilterProbe pair 0xC6/0xC7 used during anchor registration is documented in its own section above)

Hex | Name | Stream | Purpose
0x01 | NodeListUpdate | Uni | Incremental N1/N2 diff broadcast
0x02 | InitialExchange | Bi | Full state exchange on connect
0x03 | AddressRequest | Bi | Resolve NodeId → address via reporter
0x04 | AddressResponse | Bi | Address resolution reply
0x05 | RefuseRedirect | Uni | Refuse mesh + suggest alternative
0x40 | PullSyncRequest | Bi | Request posts filtered by follows
0x41 | PullSyncResponse | Bi | Respond with filtered posts
0x42 | PostNotification | Uni | Lightweight "new post" push to social contacts
0x43 | PostPush | Uni | Direct encrypted post delivery to recipients
0x44 | AudienceRequest | Bi | Request audience member list
0x45 | AudienceResponse | Bi | Audience list reply
0x50 | ProfileUpdate | Uni | Push profile changes
0x51 | DeleteRecord | Uni | Signed post deletion
0x52 | VisibilityUpdate | Uni | Re-wrapped visibility after revocation
0x60 | WormQuery | Bi | Burst/nova search for nodes, posts, or blobs beyond N3
0x61 | WormResponse | Bi | Worm search reply (node + post_holder + blob_holder)
0x70 | SocialAddressUpdate | Uni | Social contact address changed
0x71 | SocialDisconnectNotice | Uni | Social contact disconnected
0x72 | SocialCheckin | Bi | Keepalive + address + N+10 update
0x90 | BlobRequest | Bi | Fetch blob by CID
0x91 | BlobResponse | Bi | Blob data + CDN manifest + file header
0x92 | ManifestRefreshRequest | Bi | Check manifest freshness
0x93 | ManifestRefreshResponse | Bi | Updated manifest reply
0x94 | ManifestPush | Uni | Push updated manifests downstream
0x95 | BlobDeleteNotice | Uni | CDN tree healing on eviction
0xA0 | GroupKeyDistribute | Uni | Distribute circle group key to member
0xA1 | GroupKeyRequest | Bi | Request group key for a circle
0xA2 | GroupKeyResponse | Bi | Group key reply
0xB0 | RelayIntroduce | Bi | Request relay introduction
0xB1 | RelayIntroduceResult | Bi | Introduction result with addresses
0xB2 | SessionRelay | Bi | Splice bi-streams (own-device default)
0xB3 | MeshPrefer | Bi | Preferred peer negotiation
0xB4 | CircleProfileUpdate | Uni | Encrypted circle profile variant
0xC0 | AnchorRegister | Uni | Register with anchor (bootstrap/recovery only)
0xC1 | AnchorReferralRequest | Bi | Request peer referrals from anchor
0xC2 | AnchorReferralResponse | Bi | Referral list reply
0xC3 | AnchorProbeRequest | Bi | A → B → C: test cold reachability of address
0xC4 | AnchorProbeResult | Bi | C → A (success) or C → B → A (failure)
0xD0 | BlobHeaderDiff | Uni | Incremental engagement update (reactions, comments, policy, thread splits)
0xD1 | BlobHeaderRequest | Bi | Request full engagement header for a post
0xD2 | BlobHeaderResponse | Bi | Full engagement header response (JSON)
0xD3 | PostDownstreamRegister | Uni | Register as downstream for a post (CDN tree entry)
0xD4 | PostFetchRequest | Bi | Request a single post by ID from a known holder
0xD5 | PostFetchResponse | Bi | Single post response (SyncPost or not-found)
0xD6 | TcpPunchRequest | Bi | Ask holder to punch TCP toward browser IP
0xD7 | TcpPunchResult | Bi | Punch result + HTTP address for redirect
0xE0 | MeshKeepalive | Uni | 30s connection heartbeat

Engagement propagation

Reactions, comments, and policy changes propagate via BlobHeaderDiff (0xD0) through the CDN tree:

20. Encryption

Envelope encryption (1-layer): Complete

  1. Generate random 32-byte CEK (Content Encryption Key)
  2. Encrypt content: ChaCha20-Poly1305(plaintext, CEK, random_nonce)
  3. Store as: base64(nonce[12] || ciphertext || tag[16])
  4. For each recipient (including self):
    • X25519 DH: our_ed25519_private (as X25519) * their_ed25519_public (as montgomery)
    • Derive wrapping key: BLAKE3_derive_key("distsoc/cek-wrap/v1", shared_secret)
    • Wrap CEK: ChaCha20-Poly1305(CEK, wrapping_key, random_nonce) → 60 bytes per recipient
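The 60-byte figure and the ~500-recipient cap are simple arithmetic; the ~500-byte JSON size per WrappedKey (base64 plus metadata) is the figure from Appendix B:

```rust
/// Raw bytes per WrappedKey, from the wrapping steps above.
fn wrapped_key_raw_bytes() -> usize {
    let nonce = 12; // ChaCha20-Poly1305 nonce
    let cek_ciphertext = 32; // encrypted CEK matches the 32-byte CEK length
    let tag = 16; // Poly1305 authentication tag
    nonce + cek_ciphertext + tag
}

/// Recipients that fit under the visibility metadata cap.
fn max_recipients(cap_bytes: usize, json_bytes_per_key: usize) -> usize {
    cap_bytes / json_bytes_per_key
}
```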

Visibility variants

Variant | Overhead | Audience limit
Public | None | Unlimited
Encrypted { recipients } | ~60 bytes per recipient | ~500 (256KB cap)
GroupEncrypted { group_id, epoch, wrapped_cek } | ~100 bytes total | Unlimited (one CEK wrap for the group)

PostId integrity

PostId = BLAKE3(Post) covers the ciphertext, NOT the recipient list. Visibility is separate metadata. This means visibility can be updated (re-wrapped) without changing the PostId.

Group keys (circles): Complete

Three-tier access revocation

Three levels of revocation, chosen based on threat level:

Tier 1: Remove Going Forward (default)

Revoked member is excluded from future posts automatically. They retain access to anything they already received. This is the default behavior when removing a circle member — no special action needed.

When to use: Normal membership changes. Someone leaves a group, you unfollow someone. The common case.

Cost: Zero. Just stop including them in future recipient lists.

Tier 2: Rewrap Old Posts (cleanup)

Same CEK, re-wrap for remaining recipients only. The revoked member can no longer unwrap the CEK even if they later obtain the ciphertext. Propagate updated visibility headers via VisibilityUpdate (0x52).

When to use: Revoked member never synced the post (common with pull-based sync — encrypted posts only sent to recipients). You want to clean up access lists.

Cost: One WrappedKey operation per remaining recipient, no content re-encryption.

Tier 3: Delete & Re-encrypt (nuclear)

Generate new CEK, re-encrypt content, wrap new CEK for remaining recipients, push delete for old post ID, repost with new content but same logical identity. Well-behaved nodes honor the delete.

When to use: Revoked member already has the ciphertext and could unwrap the old CEK. Only for content that poses an actual danger/risk if the revoked member retains access. Recommended against in most cases.

Cost: Full re-encryption + delete propagation + new post propagation. Heavy.

Trust model: The app honors delete requests from content authors by default. A modified client could ignore deletes, but this is true of any decentralized system. For legal purposes: the author has proof they issued the delete and revoked access.

Private profiles (Phase D-4): Complete

Different profile versions per circle, encrypted with the circle/group key. A peer sees the profile version for the most-privileged circle they belong to. CircleProfileUpdate (0xB4) wire message. Public profiles can be hidden (public_visible=false strips display_name/bio).

21. Delete Propagation

Status: Complete

Delete records

DeleteRecord { post_id, author, timestamp_ms, signature } — ed25519-signed by author. Stored in deleted_posts table (INSERT OR IGNORE). Applied: DELETE from posts table WHERE post_id AND author match.

Propagation paths

  1. InitialExchange: All delete records exchanged on connect
  2. DeleteRecord message (0x51): Pushed via uni-stream to connected peers on creation
  3. PullSync: Included in responses for eventual consistency

CDN cascade on delete

  1. Send BlobDeleteNotice to all downstream hosts (with our upstream info for tree healing)
  2. Send BlobDeleteNotice to upstream
  3. Clean up blob metadata, manifests, downstream/upstream records
  4. Delete blob from filesystem

22. Social Graph Privacy

Status: Complete

Known temporary weakness: An observer who diffs your N1 share over time can infer your social contacts (they're the stable members while mesh peers rotate). This will be addressed when CDN file-swap peers are added to N1, making the stable set larger and harder to distinguish.

23. Multi-Device Identity

Status: Planned

Concept

Multiple devices share the same identity key (ed25519 keypair, same NodeId). All devices ARE the same node from the network's perspective. Posts from any device appear as the same author.

Device identity

Each device also generates a unique device identity (separate ed25519 keypair). This device-specific key is used to:

Setup

Export identity.key from one device, import on another. The device identity is generated automatically on each device. Once two devices share an identity key, they can discover each other through normal network routing (same NodeId appears at multiple addresses).

24. Phase 2: Reciprocity (Reconsidered)

Status: Reconsidered

The original Phase 2 design centered on hosting quotas (3x rule), chunk audits, and tit-for-tat QoS. On reflection, the attention-driven delivery model makes quota enforcement unnecessary. The CDN is a delivery amplifier, not a storage system — hot content propagates through demand, cold content decays. Authors are responsible for their own content durability.

Tit-for-tat QoS solves the wrong problem: it optimizes for fairness in a storage-obligation model that no longer exists. What matters is that the delivery network functions efficiently, which it does through natural attention dynamics.

If reciprocity mechanisms are needed at scale, they should address delivery quality (bandwidth, latency, uptime) rather than storage quotas. This remains an open design area.

25. HTTP Post Delivery

Intent

Every ItsGoin node that is publicly reachable can serve its cached public posts directly to browsers over HTTP — no extra infrastructure, no additional dependencies, no new binary. The same QUIC UDP port used for app traffic is accompanied by a TCP listener on the same port number. UDP goes to the QUIC stack as always. TCP goes to a minimal raw HTTP/1.1 handler baked into the binary.

This makes every publicly-reachable node a browser-accessible content endpoint, enabling share links that deliver content peer-to-browser without routing any post bytes through itsgoin.net.

Dual listener architecture

<port>/UDP  →  QUIC (existing app protocol)
<port>/TCP  →  HTTP/1.1 (new, read-only, single route)

Both listeners bind on the same port. The OS routes UDP and TCP to separate sockets — no conflict, no protocol ambiguity.

HTTP handler

The handler is intentionally minimal — implemented with raw tokio::net::TcpListener, no HTTP crate, no new dependencies. Approximately 150–200 lines of Rust.

Single valid route: GET /p/<postid_hex> HTTP/1.1

Response: Minimal HTML page containing the post content with a small footer:

<footer>
  This post is on the ItsGoin network — content lives on people's devices,
  not servers. <a href="https://itsgoin.com">Get ItsGoin</a>
</footer>

The footer HTML is a static string constant compiled into the binary (~2KB). No template engine, no dynamic footer generation.
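The route check itself is a few lines. A sketch of the single-route parser under the constraints above (hex post ids, one shared failure path); the function name is illustrative:

```rust
/// Accept only "GET /p/<postid_hex> HTTP/1.x" and return the post id.
/// Every failure returns None, which the handler maps to a hard close —
/// no distinguishable errors, no enumeration oracle.
fn parse_route(request_line: &str) -> Option<&str> {
    let mut parts = request_line.split_whitespace();
    if parts.next()? != "GET" {
        return None;
    }
    let path = parts.next()?;
    if !parts.next()?.starts_with("HTTP/1.") {
        return None;
    }
    let id = path.strip_prefix("/p/")?;
    if !id.is_empty() && id.bytes().all(|b| b.is_ascii_hexdigit()) {
        Some(id)
    } else {
        None
    }
}
```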

Security constraints

Concern | Mitigation
Connection exhaustion | Hard cap: 20 concurrent HTTP connections. New connections over the cap are immediately closed. No queue, no wait.
Slow HTTP attacks | 5-second read timeout for complete request headers. Exceeded → hard close.
Content enumeration | Identical response (hard close) for "post not found" and "post not public." No timing oracle, no distinguishable error codes.
Malformed requests | Hard close only. No error response.
Encrypted content | Never served. Public visibility check is mandatory before any response.

Which nodes serve HTTP

A node serves HTTP only if it is publicly TCP-reachable: a native public address (typically IPv6) or a successful UPnP TCP mapping (see Section 11).

302 load shedding via CDN tree

When a node is overwhelmed (at the 20-connection cap) or chooses to redirect:

  1. Query post_downstream table for the requested postid
  2. Filter downstream hosts to those with a known public address (IPv6 or UPnP-mapped IPv4)
  3. 302 → http://[their_address]:<port>/p/<postid>

The receiving node applies the same logic recursively if needed. This mirrors the app-layer CDN tree behavior at the HTTP layer — the same attention-driven propagation model, the same tree structure, now accessible to browsers.

Binary size impact

Zero new dependencies. Negligible compiled size delta (~10–20KB). No App Store size concerns. No install size impact for existing users.

Appendix A: Timeout Reference

| Constant | Value | Purpose |
| --- | --- | --- |
| MESH_KEEPALIVE_INTERVAL | 30s | Ping to prevent zombie detection |
| ZOMBIE_TIMEOUT | 600s (10 min) | No activity → dead connection |
| SESSION_IDLE_TIMEOUT | 300s (5 min) | Reap idle interactive sessions (NOT keep-alive) |
| SELF_LAST_ENCOUNTER_THRESHOLD | 10800s (3 hours) | Trigger pull sync when last encounter exceeds this |
| QUIC_CONNECT_TIMEOUT | 15s | Direct connection establishment |
| HOLE_PUNCH_TIMEOUT | 30s | Overall hole punch window |
| HOLE_PUNCH_ATTEMPT | 2s | Per-address attempt within window |
| RELAY_INTRO_TIMEOUT | 15s | Relay introduction request |
| RELAY_PIPE_IDLE | 120s (2 min) | Relay pipe idle before close |
| RELAY_COOLDOWN | 300s (5 min) | Per-target relay cooldown |
| RELAY_INTRO_DEDUP | 30s | Dedup intro forwarding |
| WORM_TOTAL_TIMEOUT | 3s | Entire worm search |
| WORM_FAN_OUT_TIMEOUT | 500ms | Per-peer fan-out query |
| WORM_BLOOM_TIMEOUT | 1.5s | Bloom round to wide referrals |
| WORM_DEDUP | 10s | In-flight worm dedup |
| WORM_COOLDOWN | 300s (5 min) | Miss cooldown before retry |
| REFERRAL_DISCONNECT_GRACE | 120s (2 min) | Anchor keeps peer in referral list after disconnect |
| N2/N3_STALE_PRUNE | Immediate on disconnect + 7 day fallback | Remove reach entries tagged to disconnected peers; age-based fallback for stragglers |
| N2/N3_STARTUP_SWEEP | On boot | Remove all N2/N3 entries tagged to peers not in current mesh |
| PREFERRED_UNREACHABLE_PRUNE | 7 days | Release preferred slot (must re-negotiate MeshPrefer on reconnect) |
| RECONNECT_WATCHER_EXPIRY | 30 days | Low-priority reconnect awareness; daily check after 7 days |
| GROWTH_LOOP_TIMER | 60s | Periodic growth loop check |
| CONNECTIVITY_CHECK | 60s | Social/file <N4 access check for keep-alive sessions |
| DM_RECENCY_WINDOW | 14400s (4 hours) | DM'd nodes included in connectivity check |
| UPNP_DISCOVERY_TIMEOUT | 2s | Gateway discovery on startup (do not block) |
| UPNP_LEASE_RENEWAL | 2700s (45 min) | Refresh port mapping before TTL expiry |
| ANCHOR_PROBE_INTERVAL | 1800s (30 min) | Periodic re-probe while anchor-declared |
| ANCHOR_PROBE_TIMEOUT | 15s | Cold connect attempt by witness |
| ANCHOR_STALE_THRESHOLD | 7 days | Post-bootstrap cleanup probes known_anchors older than this |
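
As Rust constants, a subset of the table might look like the following sketch. Names mirror the table (the values come from it), but this is not a claim about how the codebase actually declares them.

```rust
use std::time::Duration;

// A few Appendix A timeouts as illustrative Rust constants.
const MESH_KEEPALIVE_INTERVAL: Duration = Duration::from_secs(30);
const ZOMBIE_TIMEOUT: Duration = Duration::from_secs(600);
const WORM_TOTAL_TIMEOUT: Duration = Duration::from_secs(3);
const WORM_FAN_OUT_TIMEOUT: Duration = Duration::from_millis(500);
const WORM_BLOOM_TIMEOUT: Duration = Duration::from_millis(1_500);

/// The keep-alive interval must fire well inside the zombie window,
/// or healthy peers would be reaped as dead.
fn keepalive_fits_zombie_window() -> bool {
    MESH_KEEPALIVE_INTERVAL < ZOMBIE_TIMEOUT
}
```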

Appendix B: Design Constraints

| Constraint | Value | Notes |
| --- | --- | --- |
| Visibility metadata cap | 256 KB | Applies to WrappedKey lists in encrypted posts |
| Max recipients (per-recipient wrapping) | ~500 | 256KB / ~500 bytes JSON per WrappedKey |
| Max blob size | 10 MB | Per attachment |
| Max attachments per post | 4 | |
| Public post encryption overhead | Zero | No WrappedKeys, no sharding, unlimited audience |
| Max payload (wire) | 16 MB | Length-prefixed JSON framing |
| Mesh slots | 101 (Desktop) / 15 (Mobile) | Preferred + non-preferred, no local/wide distinction |
| Keep-alive session cap | 50% of session capacity | Ensures interactive sessions remain available |
| Keep-alive ceiling (desktop) | ~300–500 | Binding constraint: routing diff broadcast overhead |
| Keep-alive ceiling (mobile) | ~25–50 | Binding constraint: battery + OS background restrictions |
| mesh_blacklist table | { node_id } | Targeted mutual stranger relationships for testing/diversity |
| known_anchors table | { node_id, addresses, last_seen } | LIFO ordered, 7-day stale cleanup via probe |
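
The ~500-recipient figure follows directly from the first two rows. A back-of-envelope check, with illustrative constant names:

```rust
// Recipient cap derivation from the table above: the 256 KB visibility
// metadata cap divided by ~500 bytes of JSON per WrappedKey.
const VISIBILITY_METADATA_CAP_BYTES: usize = 256 * 1024;
const WRAPPED_KEY_JSON_BYTES: usize = 500; // approximate per-recipient cost

fn max_recipients() -> usize {
    VISIBILITY_METADATA_CAP_BYTES / WRAPPED_KEY_JSON_BYTES
}
```

262,144 / 500 ≈ 524, which the table rounds to "~500"; the exact bound depends on the real serialized size of a WrappedKey.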

Appendix C: Implementation Scorecard

| Area | Status |
| --- | --- |
| Mesh connection architecture (101 slots, preferred/non-preferred) | Complete |
| N1/N2/N3 knowledge layers | Complete |
| Growth loop (60s timer + reactive on N2/N3) | Partial (timer exists, reactive trigger needs update) |
| Preferred peers + bilateral negotiation | Complete |
| N+10 identification | Partial (preferred peers exist, N+10 not in all headers) |
| Worm search (nodes + content search for posts/blobs) | Complete |
| Relay introduction + hole punch | Complete |
| Session relay (own-device default) | Partial (relay works, own-device restriction not implemented) |
| Social routing cache | Complete |
| Three-layer architecture (Mesh/Social/File) | Partial (layers exist conceptually, pull sync still uses mesh) |
| Keep-alive sessions | Planned |
| Self Last Encounter sync trigger | Planned |
| Algorithm-free reverse-chronological feed | Complete |
| Envelope encryption (1-layer) | Complete |
| Group keys for circles | Complete |
| Three-tier access revocation | Partial (Tier 1+2 work, Tier 3 crypto exists but no UI) |
| Private profiles per circle | Complete |
| Pull-based sync with follow filtering | Complete |
| Push notifications (post/profile/delete) | Complete |
| Blob storage + transfer | Complete |
| CDN hosting tree + manifests | Complete |
| Blob eviction with priority scoring | Complete |
| Anchor bootstrap + referrals | Complete |
| Delete propagation + CDN cascade | Complete |
| Multi-device identity | Planned |
| UPnP port mapping (desktop) | Complete |
| NAT type detection (STUN) + hard+hard skip | Complete |
| Advanced NAT traversal (role-based scanning + filter probe) | Complete |
| LAN discovery (mDNS scan + auto-connect) | Planned |
| Content propagation via attention | Partial |
| BlobHeader separation from blob content | Complete |
| 25+25 neighborhood with HeaderDiff propagation | Partial (engagement diffs work, neighborhood diffs planned) |
| BlobHeaderDiff message (engagement) | Complete |
| Reactions (public + private encrypted) | Complete |
| Comments + author policy enforcement | Complete |
| Engagement sync via BlobHeaderRequest after pull sync | Complete |
| Notification settings (messages/posts/nearby) | Complete |
| Tiered DM polling (recency-based schedule) | Complete |
| Auto-sync on follow | Complete |
| Post CDN tree (post_downstream) | Complete |
| Anchor self-verification (reachability probe) | Complete |
| Mutual mesh blacklist | Planned |
| --max-mesh flag (test affordance) | Planned |
| Audience sharding | Planned |
| Custom feeds | Planned |
| HTTP post delivery (TCP listener, single route, load shedding) | Planned |
| Share link generation (postid + author NodeId) | Complete |
| itsgoin.net QUIC proxy handler (on-demand fetch + render) | Complete |
| PostFetch (0xD4/0xD5) single-post retrieval | Complete |
| Universal Links / App Links (itsgoin.net/p/*) | Planned |
| itsgoin.net ItsGoin node (anchor + web handler) | Complete |
| UPnP TCP port mapping alongside UDP | Planned |

Appendix D: Critical Path Forward

The highest-impact items, in priority order:

1. Three-layer separation (pull sync from social/file, not mesh)

Implement Self Last Encounter tracking and move pull sync to social + upstream file peers. This is the foundation for the layered architecture.
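
The trigger itself is a simple predicate over timestamps, using the SELF_LAST_ENCOUNTER_THRESHOLD value from Appendix A. A minimal sketch (function and parameter names are illustrative):

```rust
/// Appendix A threshold: 3 hours, in seconds.
const SELF_LAST_ENCOUNTER_THRESHOLD: u64 = 10_800;

/// Fire a pull sync when the last encounter with our own content/peers
/// is older than the threshold. Timestamps are Unix seconds.
fn needs_pull_sync(now: u64, last_encounter: u64) -> bool {
    now.saturating_sub(last_encounter) > SELF_LAST_ENCOUNTER_THRESHOLD
}
```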

2. N+10 in all identification

Add N+10 (NodeId + 10 preferred peers) to self-identification, post headers, blob headers, and social routes. Dramatically improves findability.
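
A hypothetical shape for the N+10 record as it might travel in headers and routes; the real wire format is not specified here, so field names and the string encoding are assumptions.

```rust
/// Hypothetical N+10 identification record: the node's own id plus up
/// to 10 preferred peers, attached to self-identification, post headers,
/// blob headers, and social routes.
struct NPlus10 {
    node_id: String,        // hex-encoded NodeId
    preferred: Vec<String>, // up to 10 preferred-peer NodeIds
}

impl NPlus10 {
    fn new(node_id: String, mut preferred: Vec<String>) -> Self {
        preferred.truncate(10); // enforce the "+10" cap regardless of input
        Self { node_id, preferred }
    }
}
```

The findability gain comes from the cap being generous but bounded: any holder of a header can attempt up to 11 contact points for the author instead of one.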

3. Keep-alive sessions

Implement the social/file connectivity check and keep-alive sessions for peers not reachable within N3, with cross-layer N2/N3 routing from keep-alive sessions.

4. UPnP port mapping

Best-effort NAT traversal for desktop/home networks. Makes nodes directly reachable without hole punching. External address feeds into N+10 and all peer advertisements. Especially impactful for mobile-to-desktop connectivity.

5. Growth loop reactive trigger

Fire growth loop immediately on N2/N3 receipt until 90% full. Currently only timer-based.
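
The reactive condition can be expressed as one integer comparison against the mesh slot capacity (101 on desktop, per Appendix B). A sketch under that assumption:

```rust
/// Fire the growth loop on N2/N3 receipt while the mesh is under 90%
/// of slot capacity; above that, fall back to the 60s timer alone.
fn should_fire_growth_loop(connected: usize, slot_cap: usize) -> bool {
    // Integer form of `connected < 0.9 * slot_cap`, avoiding floats.
    connected * 10 < slot_cap * 9
}
```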

6. Multi-device identity

Same identity key across devices with device-specific identity for self-discovery and own-device relay.

7. File-chain propagation

Make AuthorManifest with N+10 and recent posts work passively. Enable discovery of new content from any blob holder.

8. Share links + HTTP post delivery

The viral growth mechanism. Every share becomes a product demo for non-app users and opens natively for app users. Dependencies in order:

  1. UPnP TCP mapping (small addition to existing UPnP code)
  2. Raw TCP HTTP listener (150–200 lines, zero new dependencies)
  3. Host list generation at share time (query post_downstream, encode, embed in URL)
  4. itsgoin.net redirect handler + known_good DB (server-side, independent of app releases)
  5. itsgoin.net loading screen
  6. Universal Links / App Links registration (static JSON files + Tauri config)
  7. itsgoin.net ItsGoin node (run the binary, configure as anchor)

Steps 4–7 are itsgoin.net infrastructure, deployable independently of app releases. Steps 1–3 ship in the app. Step 6 requires an app store release to activate but can be deployed to itsgoin.net ahead of time.
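
For the link itself, the v0.3.1 changelog fixes the format as itsgoin.net/p/<postid_hex>/<author_nodeid_hex>. A sketch of the builder; the https scheme is an assumption, since the document gives only host and path.

```rust
/// Build a share link in the v0.3.1 format:
/// itsgoin.net/p/<postid_hex>/<author_nodeid_hex>
fn share_link(post_id: &[u8], author_node_id: &[u8]) -> String {
    // Lowercase hex encoding of raw id bytes.
    fn hex(bytes: &[u8]) -> String {
        bytes.iter().map(|b| format!("{:02x}", b)).collect()
    }
    format!("https://itsgoin.net/p/{}/{}", hex(post_id), hex(author_node_id))
}
```

Because the link carries only the post id and author NodeId, the itsgoin.net handler must locate a holder at request time via content search, which is exactly the QUIC-proxy flow described in the changelog.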

9. Own-device relay restriction

Restrict relay pipes to own-device by default, opt-in for relaying for others.

Appendix E: Features Designed But Not Built

| Feature | Source | Status |
| --- | --- | --- |
| Three-layer pull sync (social/file, not mesh) | v0.2.0 design | Planned |
| N+10 in all identification & headers | v0.2.0 design | Planned |
| Keep-alive sessions | v0.2.0 design | Planned |
| Multi-device identity | v0.2.0 design | Planned |
| Own-device relay restriction | v0.2.0 design | Planned |
| Self Last Encounter sync trigger | v0.2.0 design | Planned |
| Anchor pin vs Fork pin distinction | project discussion.txt | Planned |
| Audience sharding for groups > 250 | ARCHITECTURE.md | Planned |
| Repost as first-class post type | project discussion.txt | Planned |
| Custom feeds (keyword/media/family rules) | project discussion.txt | Planned |
| Bounce routing (social graph as routing) | ARCHITECTURE.md | Planned |
| Reactions (public + private encrypted) | v0.2.11 | Complete |
| RefuseRedirect handling (retry suggested peer) | protocol.rs | Partial (send-only) |
| Profile anchor list used for discovery | ARCHITECTURE.md | Partial (field exists) |
| File-chain propagation (passive post discovery) | Design | Partial (manifest exists) |
| Anchor-to-anchor gossip/registry | Observed gap | Planned |
| BlobHeader as separate mutable structure | v0.2.11 | Complete |
| BlobHeaderDiff incremental propagation (engagement) | v0.2.11 | Complete |
| Post export/backup tooling (author durability) | v0.2.4 design | Planned |
| Anchor reachability probe (self-verification) | v0.2.6 | Complete |
| Mutual mesh blacklist | v0.2.4 design | Planned |
| --max-mesh flag (test topology control) | v0.2.4 design | Planned |
| Relay-assisted port scanning (advanced NAT traversal) | v0.2.6 | Complete |

Appendix F: File Map

crates/core/
  src/
    lib.rs          — module registration, parse_connect_string, parse_node_id_hex
    types.rs        — Post, PostId, NodeId, PublicProfile, PostVisibility, WrappedKey,
                      VisibilityIntent, Circle, PeerRecord, Attachment
    content.rs      — compute_post_id (BLAKE3), verify_post_id
    crypto.rs       — X25519 key conversion, DH, encrypt_post, decrypt_post, BLAKE3 KDF
    blob.rs         — BlobStore, compute_blob_id, verify_blob
    storage.rs      — SQLite: posts, peers, follows, profiles, circles, circle_members,
                      mesh_peers, reachable_n2/n3, social_routes, blobs, group_keys,
                      preferred_peers, known_anchors; auto-migration
    protocol.rs     — MessageType enum (39 types), ALPN (itsgoin/3),
                      length-prefixed JSON framing, read/write helpers
    connection.rs   — ConnectionManager + ConnHandle/ConnectionActor (actor pattern):
                      mesh QUIC connections (MeshConnection), session connections,
                      slot management, initial exchange, N1/N2 diff broadcast,
                      pull sync, relay introduction. All external access via ConnHandle.
    network.rs      — iroh Endpoint, accept loop, connect_to_peer,
                      connect_by_node_id (7-step cascade), mDNS discovery
    node.rs         — Node struct (ties identity + storage + network), post CRUD,
                      follow/unfollow, profile CRUD, circle CRUD, encrypted post creation,
                      startup cycles, bootstrap, anchor register cycle
    web.rs          — itsgoin.net web handler: QUIC proxy for share links,
                      on-demand post fetch via content search, blob serving
    http.rs         — HTML rendering for shared posts (render_post_html)

crates/cli/
  src/main.rs       — interactive REPL + anchor mode (--bind, --daemon, --web)

crates/tauri-app/
  src/lib.rs        — Tauri v2 commands (38 IPC handlers), DTOs

frontend/
  index.html        — single-page UI: 5 tabs (Feed / My Posts / People / Messages / Settings)
  app.js            — Tauri invoke calls, rendering, identicon generator, circle CRUD
  style.css         — dark theme, post cards, visibility badges, transitions

License

ItsGoin is released under the Apache License, Version 2.0. You may use, modify, and distribute this software freely under the terms of that license.

This is a gift. Use it well.