Fix: GroupKeyDistribute admin forgery + cap concurrent port scanners
Two pre-release fixes found during audit.

1) GroupKeyDistribute admin forgery (critical)

`group_key_distribution::try_apply_distribution_post` trusted the `admin` field inside the decrypted payload without verifying it matched the post's author.

Exploit: any peer who learns a victim's posting NodeId (public: it appears as a recipient on any DM/group post) and observes a target group_id in the wild could craft an encrypted distribution post claiming to be from the legitimate admin. The victim's storage uses INSERT OR REPLACE on group_keys, so a successful forgery would overwrite the victim's legitimate group key record and stored seed, breaking future rotations / key distributions from the real admin.

Fix: reject the distribution post when `content.admin != post.author`. Added test `forged_admin_is_rejected` that seeds a legitimate record, attempts a forgery, and asserts the legitimate record is untouched.

2) Cap concurrent port-scan hole punches at 1 (bandwidth)

`hole_punch_with_scanning` fires ~100 QUIC ClientHellos/sec for up to SCAN_MAX_DURATION_SECS (300s), ~1 Mbps per active scanner. With no cap, the growth loop / anchor referrals / replication paths could spawn several scanners at once and drive sustained multi-Mbps upload. This is particularly pathological on obfuscated VPNs, where every probe stalls at a proxy timeout, and it explains the reported 10 Mbps sustained upload after anchor connect.

Fix: module-level `tokio::sync::Semaphore(1)` guarding entry to the scanning loop. Second-and-beyond callers fall back to the cheaper `hole_punch_parallel` (standard punching, no 100/sec port walk) instead of spawning another scanner. The permit is held for the scanner lifetime and released on return. Added unit test `scanner_semaphore_caps_concurrent_scans_at_one`.

Both changes leave the successful-call path untouched (single scanner still runs; legitimate key distributions still apply). 120 / 120 core tests pass.
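The diff for the `group_key_distribution` change is not shown on this page, so the following is only a minimal sketch of the check described in fix 1, not the actual implementation. All types here (`Post`, `DistributionContent`, `GroupKeyRecord`, `Storage`, `ApplyError`) and the function `try_apply_distribution` are hypothetical stand-ins; a `HashMap` plays the role of the `group_keys` table because `insert` overwrites an existing entry much like INSERT OR REPLACE. The sketch also assumes `post.author` has already been authenticated elsewhere.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for the real identifier types.
type NodeId = [u8; 32];
type GroupId = [u8; 32];

/// Hypothetical stored record for one group.
#[derive(Debug, Clone, PartialEq)]
pub struct GroupKeyRecord {
    pub admin: NodeId,
    pub seed: [u8; 32],
}

/// Hypothetical post envelope; `author` is assumed to be authenticated already.
pub struct Post {
    pub author: NodeId,
}

/// Hypothetical decrypted payload of a distribution post.
pub struct DistributionContent {
    pub group_id: GroupId,
    pub admin: NodeId,
    pub seed: [u8; 32],
}

#[derive(Debug, PartialEq)]
pub enum ApplyError {
    AdminMismatch,
}

/// In-memory stand-in for the `group_keys` table.
#[derive(Default)]
pub struct Storage {
    pub group_keys: HashMap<GroupId, GroupKeyRecord>,
}

pub fn try_apply_distribution(
    storage: &mut Storage,
    post: &Post,
    content: &DistributionContent,
) -> Result<(), ApplyError> {
    // The payload's claimed admin must be the post's author. Reject before
    // touching storage so a forgery cannot overwrite an existing record.
    if content.admin != post.author {
        return Err(ApplyError::AdminMismatch);
    }
    // Overwrites any existing record for this group (INSERT OR REPLACE semantics).
    storage.group_keys.insert(
        content.group_id,
        GroupKeyRecord { admin: content.admin, seed: content.seed },
    );
    Ok(())
}
```

The ordering is the point of the sketch: the comparison runs before the overwriting write, so a rejected post leaves the existing record exactly as it was, which is what `forged_admin_is_rejected` is described as asserting.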
parent f88618bb6f
commit dfd3253734
2 changed files with 127 additions and 0 deletions
@@ -155,6 +155,20 @@ const SCAN_PUNCH_INTERVAL_SECS: u64 = 2;
 /// Maximum scan duration (seconds) — accept the cost for otherwise-impossible connections
 const SCAN_MAX_DURATION_SECS: u64 = 300; // 5 minutes
 
+/// Global cap on concurrent port-scan hole punches. Each scanner fires
+/// ~100 QUIC ClientHellos/sec for up to `SCAN_MAX_DURATION_SECS`, which
+/// is ~1 Mbps per active scanner. Without a cap, multiple parallel
+/// referrals (growth loop, anchor referrals, replication) can spawn
+/// several scanners at once and drive sustained multi-Mbps upload —
+/// especially pathological on obfuscated VPNs where every probe stalls
+/// at proxy timeouts. A permit is acquired before the scanning loop
+/// starts and held until the scanner returns; extra callers fall back
+/// to the cheaper `hole_punch_parallel`.
+fn scanner_semaphore() -> &'static tokio::sync::Semaphore {
+    static SEM: std::sync::OnceLock<tokio::sync::Semaphore> = std::sync::OnceLock::new();
+    SEM.get_or_init(|| tokio::sync::Semaphore::new(1))
+}
+
 /// Advanced hole punch with port scanning fallback for EDM/port-restricted NAT.
 ///
 /// **Role-based behavior** (each side calls this independently):
@@ -188,6 +202,21 @@ pub(crate) async fn hole_punch_with_scanning(
         return hole_punch_parallel(endpoint, target, addresses).await;
     }
 
+    // v0.6.2: cap to one concurrent port scanner per node. Additional
+    // callers fall back to the cheaper `hole_punch_parallel` instead of
+    // spawning another 100-probes-per-second scanner. The permit is held
+    // for the lifetime of the scanner loop below (dropped on return).
+    let _scan_permit = match scanner_semaphore().try_acquire() {
+        Ok(p) => p,
+        Err(_) => {
+            tracing::debug!(
+                peer = hex::encode(target),
+                "another port scan already in progress — falling back to parallel punch"
+            );
+            return hole_punch_parallel(endpoint, target, addresses).await;
+        }
+    };
+
     // Filter to reachable families, then use observed address (first in list, injected by relay)
     let reachable = filter_reachable_families(endpoint, addresses);
     let observed_addr = reachable.first()
@@ -8379,3 +8408,21 @@ fn now_ms() -> u64 {
         .unwrap_or_default()
         .as_millis() as u64
 }
+
+#[cfg(test)]
+mod tests {
+    use super::scanner_semaphore;
+
+    #[test]
+    fn scanner_semaphore_caps_concurrent_scans_at_one() {
+        let sem = scanner_semaphore();
+        // Fresh — one permit should be available.
+        let p1 = sem.try_acquire().expect("first scan should acquire");
+        // Second concurrent caller must be rejected.
+        assert!(sem.try_acquire().is_err(), "second scan must not acquire while first holds permit");
+        // Dropping the first permit returns it to the pool.
+        drop(p1);
+        let p2 = sem.try_acquire().expect("after release, next scan should acquire");
+        drop(p2);
+    }
+}
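
The non-blocking "acquire or fall back" pattern used in the hunks above can be sketched without any dependencies. This is an illustration only: it swaps in a std `AtomicBool` gate for `tokio::sync::Semaphore::new(1)` so that it compiles on its own, and the names `ScanGate`, `ScanPermit`, `Strategy`, and `choose_strategy` are hypothetical and do not appear in the commit.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Single-slot gate: a std-only stand-in for a one-permit semaphore.
pub struct ScanGate {
    busy: AtomicBool,
}

/// RAII permit: the slot is released when this value is dropped.
pub struct ScanPermit<'a> {
    gate: &'a ScanGate,
}

impl ScanGate {
    pub const fn new() -> Self {
        Self { busy: AtomicBool::new(false) }
    }

    /// Non-blocking acquire: returns `None` immediately if the slot is taken.
    pub fn try_acquire(&self) -> Option<ScanPermit<'_>> {
        self.busy
            .compare_exchange(false, true, Ordering::Acquire, Ordering::Relaxed)
            .ok()
            .map(|_| ScanPermit { gate: self })
    }
}

impl Drop for ScanPermit<'_> {
    fn drop(&mut self) {
        // Free the slot so the next caller can scan.
        self.gate.busy.store(false, Ordering::Release);
    }
}

#[derive(Debug, PartialEq)]
pub enum Strategy {
    Scan,
    ParallelFallback,
}

/// Decide which punching strategy a caller gets. The permit, if any, must be
/// kept alive by the caller for as long as the scan runs.
pub fn choose_strategy(gate: &ScanGate) -> (Strategy, Option<ScanPermit<'_>>) {
    match gate.try_acquire() {
        Some(permit) => (Strategy::Scan, Some(permit)),
        None => (Strategy::ParallelFallback, None),
    }
}
```

As in the commit, the losing caller never waits: it takes the cheaper path right away, and the slot frees itself when the winner's permit goes out of scope, including on early return.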