From 9040d70bf6b76dea4c8cbfa4e06bcd7a90f8fd3f Mon Sep 17 00:00:00 2001 From: Scott Reimers Date: Tue, 12 May 2026 21:47:44 -0400 Subject: [PATCH] =?UTF-8?q?docs:=20correct=20padding=20rule=20=E2=80=94=20?= =?UTF-8?q?bucketed=20throughout,=20not=20random=20above=20256?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Prior round misread Opus's recommendation: I wrote "rand(0..=256) above 256" for slots and "round up to nearest 256KB above 256KB" for body. Body was right; slots were wrong. Correct rule: bucketed throughout. Slot buckets: 8, 16, 32, 64, 128, 256 (power-of-2 sub-256), then 384, 512, 640, 768, ... (+128 steps above). Body buckets: 1KB, 2KB, ..., 256KB (power-of-2 sub-256KB), then 512KB, 768KB, 1024KB, ... (+256KB steps above; aligns with future chunk size). Stronger privacy than random: observer learns bucket, never position within it. Stable across posts; no min-over-many-posts floor attack. Co-Authored-By: Claude Opus 4.7 (1M context) --- docs/fof-spec/layer-2-mode2-fof-comments.md | 12 ++++-------- docs/fof-spec/layer-3-mode1-fof-closed.md | 10 +++++----- sessions.md | 15 +++++++++++---- 3 files changed, 20 insertions(+), 17 deletions(-) diff --git a/docs/fof-spec/layer-2-mode2-fof-comments.md b/docs/fof-spec/layer-2-mode2-fof-comments.md index 3c579d0..4bb837d 100644 --- a/docs/fof-spec/layer-2-mode2-fof-comments.md +++ b/docs/fof-spec/layer-2-mode2-fof-comments.md @@ -35,7 +35,7 @@ Cost: `pub_x_index` is a per-post pseudonym for the voucher-chain — leaks "the - **Comments are encrypted.** `CEK_comments = HKDF(CEK, "comments")`. Non-FoF observers see only ciphertext + signatures on comments. Read-gating is a side effect of the same slot unwrap. - **CDN propagation verifies before forwarding.** Four-check accept rule: valid `pub_x_index`, not in `revocation_list`, `group_sig` validates, `identity_sig` validates. Any failure → drop, do not propagate. - **Revocation is retroactive.** Revocation diff is author-signed, appends a revoked `pub_x` entry. Every file-holder that receives the diff **deletes all comments on this post signed by that `pub_x`** from local storage, then forwards the diff. Comments in flight before the diff arrived get deleted when it catches up. Stronger than "stop forwarding" — prior garbage also goes away. -- **Hybrid slot-count padding.** Up to 256 real slots: pad to next power of 2 (smallest authors get strong bucket-grouping; e.g., 5 real → 8 published, 100 real → 128 published, 200 real → 256 published). Above 256 real slots: pad to `real_count + rand(0..=256)` (large authors get probabilistic noise; power-of-2 jumps would waste up to 50%). Dummy `pub_x` entries are included in `pub_post_set` 1:1 so `pub_post_set.len() == wrap_slots.len()`; dummy pubkeys are 32B random bytes (no one holds the matching `priv_x`, so `group_sig` verify against a dummy always fails — benign). +- **Bucketed slot-count padding.** Deterministic buckets throughout. Up to 256 real slots: next power of 2 (8, 16, 32, 64, 128, 256). Above 256: next +128-step bucket (384, 512, 640, 768, …). Author publishes `next_bucket(real_count)` slots; dummy slots fill the gap. Dummy `pub_x` entries are included in `pub_post_set` 1:1 so `pub_post_set.len() == wrap_slots.len()`; dummy pubkeys are 32B random bytes (no one holds the matching `priv_x`, so `group_sig` verify against a dummy always fails — benign). - **Post-hoc read-access grant via author comment.** Author can widen the read-set of an already-published post by publishing an author-signed special comment that appends a new `WrapSlot` (and its `pub_post_set` entry) for a newly-vouched persona. No rotation, no republish; retroactive inclusion. See "Access-grant author comment" below. - **`vouch_mac` retained.** Inside ciphertext. Enables author render-time and intra-circle attribution complementary to CDN-level `pub_x_index` attribution. - **Author has their own `pub_me/priv_me`.** Treated as one of the entries in `pub_post_set`. Author signs their own comments through the same path. @@ -246,7 +246,7 @@ struct AccessGrantComment { | `prefilter_tag` | 2 | | **Subtotal per slot** | **~154** | -At 500 real vouchees (above the 256 inflection point): pad to `500 + rand(0..=256)` = 500–756 slots × ~154B = ~77KB to ~117KB. At 200 real vouchees: pad to 256 = ~39KB. At 9 real vouchees: pad to 16 = ~2.5KB. Acceptable for a header carried once per post. +At 500 real vouchees → next bucket above 256 is 512 → 512 slots × ~154B ≈ 79KB. At 200 real vouchees → bucket 256 ≈ 39KB. At 9 real vouchees → bucket 16 ≈ 2.5KB. Worst-case in-bucket overhead: just-under-boundary case (e.g., 257 real → 384 bucket → 33% padded; 129 real → 256 bucket → 49% padded; well below pure power-of-2 doubling). Acceptable for a header carried once per post. Per-comment CDN overhead (vs shared-keypair baseline): +64B (`group_sig`) + 4B (`pub_x_index`) + 64B (`identity_sig` — already in baseline) ≈ 68B additional. Negligible. @@ -266,11 +266,7 @@ Decision: **defer Merkle variant to PQ migration**. v1 ships inline. The reason this is acceptable: the alternative is no propagation-level verification, which means any one admitted FoF member can DoS the mesh. Per-post pseudonym is the minimum viable disclosure for CDN-level filtering. -**Vouch-set size:** hybrid padding behaves differently in two regimes. -- **Small authors (≤256 real slots)** publish a power-of-2 count. Observer learns the bucket (e.g., `published == 128` → `real_count ∈ (64, 128]`). Bucket boundaries are 8, 16, 32, 64, 128, 256. Within-bucket count is fully hidden. -- **Large authors (>256 real slots)** publish `real_count + rand(0..=256)`. Each post leaks an upper bound on real_count. Across many posts, an observer's best floor estimate is `min(published_i) - 256`, which doesn't converge tighter without independent side information. - -Small-author bucket boundaries are a known cost; large authors trade off a coarser noise floor against bandwidth. Both regimes hide the true count within meaningful brackets. +**Vouch-set size:** bucketed padding throughout. Observer learns the bucket; position within the bucket is fully hidden. Buckets are 8, 16, 32, 64, 128, 256 (power-of-2 sub-256) then 384, 512, 640, 768, … (+128 steps above 256). The same author's bucket is stable across posts unless they cross a boundary — so observers see "this author has between X and Y real vouchees" with no way to converge tighter from multiple posts. **Non-FoF reader UX:** non-FoF readers see the public body + "Comments are private" affordance. Comment count is not shown (no engagement leak). Optional: a "Request access via DM" button that sends the author a note; author can respond by publishing an access-grant author comment (above) that retroactively admits the requester. @@ -306,6 +302,6 @@ Small-author bucket boundaries are a known cost; large authors trade off a coars - Revocation diff format + CDN honor path (retroactive delete + forward). - Access-grant author comment mechanism for post-hoc read widening. - Per-post ephemeral-keypair generation. -- Hybrid dummy padding on both `wrap_slots` and `pub_post_set` (power-of-2 up to 256, then `count + rand(0..=256)`), shuffled together. +- Bucketed dummy padding on both `wrap_slots` and `pub_post_set` (power-of-2 up to 256, then +128 steps above), shuffled together. - Back-compat: old clients can still render existing non-FoF posts; old comments on new FoF posts are rejected at CDN (missing required fields). - Integration test: 3-node FoF chain (A→B→C). A posts Mode 2 FoF. B comments (accepted, propagates). C comments (accepted via B's chain). D (unrelated) signs junk with fake `pub_x_index` pointing at a real entry (rejected at first-hop CDN: `group_sig` fails). D signs junk with `pub_x_index` pointing at a dummy (rejected: `group_sig` fails against a dummy random pubkey). A revokes B's `pub_x`: B's already-propagated comments are deleted on each file-holder as the diff sweeps through; B's subsequent comments rejected at first hop. A vouches for E after the post is live and publishes access-grant author comment; E can now read + comment retroactively. diff --git a/docs/fof-spec/layer-3-mode1-fof-closed.md b/docs/fof-spec/layer-3-mode1-fof-closed.md index 5fd81a1..a06cbbf 100644 --- a/docs/fof-spec/layer-3-mode1-fof-closed.md +++ b/docs/fof-spec/layer-3-mode1-fof-closed.md @@ -23,8 +23,8 @@ Builds on Layer 2's `pub_post` / `priv_post` / `wrap_slot` primitives — same s - **New variant, not extended Encrypted.** `PostVisibility::FoFClosed` is its own variant. Existing `Encrypted{recipients}` wraps per-recipient NodeIds — visible on the wire. FoF wraps anonymously under symmetric keys — no NodeIds. - **One wrap slot per unique `V_x`.** Dedup at the `V_x` byte level — if multiple personas hold the same `V_x`, include one slot. Friends-only post: one slot under `V_me`. FoF post: own `V_me` + every distinct `V_x` the author holds. Custom: subset chosen by author (deferred to v2 power-user UI; v1 ships three presets only: Public / Friends-only / Friends-of-Friends). -- **Hybrid slot-count padding.** Up to 256 real slots: pad to next power of 2 (smallest authors get strong bucket-grouping). Above 256: pad to `real_count + rand(0..=256)` (large authors get probabilistic noise; power-of-2 jumps would waste up to 50% bandwidth). Dummy slots are byte-identical to real ones, AEAD-fails on any `V_x`. Dummy entries also added to `pub_post_set` 1:1 (see Layer 2). -- **Hybrid body-size padding.** Same shape: up to 256KB of body ciphertext, pad to next power of 2 (1KB, 2KB, 4KB, …, 256KB). Above 256KB, round up to nearest 256KB. Aligns large posts with the storage chunking-block size; small posts get strong bucket-grouping against length-based classification. +- **Bucketed slot-count padding.** Deterministic bucket boundaries throughout — observers learn the bucket, not the position within it. Buckets: 8, 16, 32, 64, 128, 256 (power-of-2 up to 256), then 384, 512, 640, 768, … (linear +128 steps above 256). Author publishes `next_bucket(real_count)` slots with random dummies filling the gap. Power-of-2 sub-256 keeps small-author overhead bounded; +128 steps above 256 avoid the 2× waste of pure power-of-2 at scale. Dummy slots are byte-identical to real ones, AEAD-fails on any `V_x`. Dummy entries also added to `pub_post_set` 1:1 (see Layer 2). +- **Bucketed body-size padding.** Same shape applied to body ciphertext bytes: power-of-2 buckets up to 256KB (1KB, 2KB, 4KB, …, 256KB), then 256KB-step buckets above (512KB, 768KB, 1024KB, …). 256KB above is the future storage chunk size — once large enough to chunk, padding aligns to chunk boundaries naturally. - **Each slot carries both CEK and priv_x (Layer 2 dual-derivation).** Layer 2's `WrapSlot` dual-derivation (read → CEK, sign → priv_x) is the canonical form. Mode 1 simply also uses the CEK to encrypt the body, where Mode 2 leaves the body plaintext. - **Prefilter tag is `HMAC(V_x, post_id)[:2B]`.** Readers precompute a 2-byte tag for each key in their keyring and skip slots that don't match. Cuts trial-decrypt cost by ~2^16 on average. - **Order of slots is randomized.** No positional leak about which slot corresponds to which voucher. Re-shuffled on every header revision (including access-grant appends from Layer 2 — TBD whether append-only ordering is acceptable, or whether the entire set is re-shuffled at each grant). @@ -130,7 +130,7 @@ Ciphertext `FoFClosed` posts ride the same CDN propagation as other encrypted po ## Resolved (2026-04-24) -- **Slot count padding**: hybrid scheme — up to 256, next power of 2; above 256, `real_count + rand(0..=256)`. Body-size padding follows the same shape with 256KB as the inflection point. +- **Slot count padding**: bucketed throughout. Power-of-2 buckets to 256 (8, 16, 32, 64, 128, 256), then linear +128 buckets (384, 512, 640, …). Body-size padding follows the same shape with 256KB as the power-of-2 ceiling and 256KB linear steps above. - **Custom mode UI**: deferred. v1 ships only the three presets (Public / Friends-only / FoF). Power-user custom-subset UI is v2. - **Slot deduplication**: dedup at the `V_x` byte level. One slot per unique key. - **Body length padding**: yes — pad to next power of 2 up to 256KB, then 256KB chunks above. @@ -140,8 +140,8 @@ Ciphertext `FoFClosed` posts ride the same CDN propagation as other encrypted po ## Ship criteria for Layer 3 - `PostVisibility::FoFClosed` exists end-to-end. -- Author creation path generates per-post keypairs, wraps CEK+priv_x under each unique `V_x` (deduped), and pads per the hybrid rule: power-of-2 up to 256 real slots, then `real_count + rand(0..=256)` above. -- Body-size padded: power-of-2 up to 256KB, then nearest 256KB above. +- Author creation path generates per-post keypairs, wraps CEK+priv_x under each unique `V_x` (deduped), and pads to the next slot bucket: power-of-2 up to 256, then +128 steps above. +- Body-size padded to next body bucket: power-of-2 up to 256KB, then 256KB steps above. - Reader decryption path iterates personas × keyring with prefilter tag. - `receive_post` accepts FoFClosed ciphertext without decrypting. - UI surface: post composer has three presets — Public / Friends-only / Friends-of-Friends. Custom subset is v2. diff --git a/sessions.md b/sessions.md index d01c3f6..d349278 100644 --- a/sessions.md +++ b/sessions.md @@ -65,13 +65,20 @@ See `CONTRIBUTING.md` for the protocol. See `AGENTS.md` for the Claude-specific **Stopping point**: commit `b8b38a6` (Layer 1) + new commit for Layer 2 both on branch; not merged. Awaiting Scott. -### Update 2026-04-24 — Layer 3 round 1 + cross-cutting padding rule +### Update 2026-04-24 — Layer 3 round 1 + cross-cutting padding rule (corrected) Scott talked to Opus and resolved Layer 3 open questions + introduced a unified padding rule that supersedes Layer 2 round 2's `rand(32..=128)`. -**Hybrid padding rule (applies to both slot count and body size)**: -- ≤256 real units → pad to next power of 2 (1, 2, 4, …, 256). Strong bucket-grouping for small authors / small bodies. -- >256 real units → pad to `real + rand(0..=256)` (slots) / round up to nearest 256KB (bodies). Probabilistic noise for large authors / large bodies; avoids 2× bandwidth waste of pure power-of-2 at scale. +**First-pass misread (corrected by Scott):** I initially wrote the rule as "power-of-2 up to 256, then `real + rand(0..=256)` above." That's wrong — the rule is **bucketed throughout**, not random above the threshold. + +**Bucketed padding rule (applies to both slot count and body size)**: +- ≤256 real units → next power-of-2 bucket (8, 16, 32, 64, 128, 256). +- >256 real units → next linear-step bucket: +128 step for slots (384, 512, 640, …), +256KB step for body bytes (512KB, 768KB, 1024KB, …). +- Deterministic. Author publishes `next_bucket(real)`; dummies fill the gap. + +Why this is stronger than random: observers learn the bucket but never the position within it. Across multiple posts from the same author, the bucket is stable until the author crosses a boundary — so no "min over many posts" attack converges tighter than the bucket bound. Random padding would have leaked `min(observed) - max_noise` as a floor. + +Linear-step above 256 vs pure power-of-2: avoids the 2× waste of jumping 256→512 for an author with 257 vouchees. Above 256, step buckets are 128 (slots) or 256KB (body) so worst-case in-bucket overhead is bounded (~33% at the worst spot). Applies uniformly to slot count and body size.