fix: v0.7.3 — disable EDM scanner, bootstrap batching, stale-anchor prune
Bandwidth + bootstrap hardening on top of v0.7.2. Wire-compatible with v0.7.0/v0.7.1/v0.7.2; no protocol changes. EDM port scanner DISABLED - hole_punch_with_scanning() now does only single quick punch + parallel punch over 30s window. The EDM port-scanner branch is gone from the live path because per-probe endpoint.connect() amplifies catastrophically: iroh accumulates every connect() target into a per-endpoint paths set and probes them all under QUIC NAT-traversal in the background. A 100-probes/sec / 5-min scan inserted ~30k paths; iroh probed all of them. Observed at 22MB/s outbound from one client — DoS-grade. - Scanner body preserved as edm_port_scan_disabled_v0_7_3() with all supporting helpers (PortWalkIter, scanner_semaphore, role-based scanner/puncher split, found_tx/found_rx channel pattern, deadline + tokio::select! orchestration) marked #[allow(dead_code)]. Refactor target: replace per-probe endpoint.connect() with raw socket.send_to() so probes don't enter iroh's path store. Bootstrap probing batched - New probe_anchors_batched() helper: 3 anchors in flight at a time, 2s stagger between batch dispatches, 10s per-anchor timeout, no abort on success. First success unblocks the bootstrap flow; remaining probes continue in background and fill peer connections naturally. - Phase 2 (bootstrap fallback) still only fires when every discovered anchor failed — preserves load-distribution intent. Replaces the sequential 50s+ timeout cascade users observed with old data dirs. Stale-anchor self-pruning - New storage.get_known_anchor_last_seen() and storage.delete_known_anchor(). - maybe_prune_stale_anchor(): when a probe fails AND last_seen_ms > 3 days, delete the entry from known_anchors immediately. Recoverable anchors (failed once, succeeded recently) are preserved. Self-healing for old data dirs whose discovered anchors point to keypairs that rotated months ago. Android close button kills NodeService - New NodeService.stopFromNative() Kotlin static method called via JNI from android_wifi::stop_node_service(). exit_app invokes it on Android before app.exit(0). Previously the button ended the Activity but the foreground service kept networking running. Cosmetic - Power-icon SVG (inline) replaces ⏻ so Android webviews lacking U+23FB don't render a missing-image tofu box. Docs - design.html section 11 rewritten for portmapper (UPnP+NAT-PMP+PCP, v0.7.2) including per-platform contract and bidirectional anchor watcher. - design.html section 10 marks session relay as opt-in (v0.7.2) and EDM scanner as disabled-pending-refactor (v0.7.3). - download.html carries v0.7.3 release notes. - MEMORY.md updated; older v0.7.0/v0.7.1 status sections condensed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
parent
4706e81603
commit
6ef11fa61c
14 changed files with 425 additions and 73 deletions
|
|
@ -1,6 +1,6 @@
|
|||
[package]
|
||||
name = "itsgoin-core"
|
||||
version = "0.7.2"
|
||||
version = "0.7.3"
|
||||
edition = "2021"
|
||||
|
||||
[dependencies]
|
||||
|
|
|
|||
|
|
@ -213,3 +213,44 @@ impl Drop for MulticastLockGuard {
|
|||
}
|
||||
}
|
||||
}
|
||||
|
||||
/// Stop the Android `NodeService` foreground service. Called from the
|
||||
/// in-app close button so the network process actually exits rather
|
||||
/// than continuing to run as a foreground service after the Activity
|
||||
/// closes (foreground services are kept alive across Activity exit by
|
||||
/// design).
|
||||
///
|
||||
/// Errors are logged but not propagated — best-effort cleanup before
|
||||
/// `AppHandle::exit(0)` finishes the Activity.
|
||||
pub fn stop_node_service() {
|
||||
if let Err(e) = stop_node_service_inner() {
|
||||
warn!("stop_node_service failed (will exit anyway): {}", e);
|
||||
}
|
||||
}
|
||||
|
||||
fn stop_node_service_inner() -> Result<(), String> {
|
||||
let ctx = ndk_context::android_context();
|
||||
if ctx.vm().is_null() {
|
||||
return Err("ndk_context: null JavaVM".into());
|
||||
}
|
||||
if ctx.context().is_null() {
|
||||
return Err("ndk_context: null activity context".into());
|
||||
}
|
||||
let vm = unsafe { JavaVM::from_raw(ctx.vm() as *mut _) }
|
||||
.map_err(|e| format!("JavaVM init: {:?}", e))?;
|
||||
let mut env = vm
|
||||
.attach_current_thread()
|
||||
.map_err(|e| format!("attach_current_thread: {:?}", e))?;
|
||||
let activity = unsafe { JObject::from_raw(ctx.context() as *mut _) };
|
||||
|
||||
// NodeService.stopFromNative(activity)
|
||||
env.call_static_method(
|
||||
"com/itsgoin/app/NodeService",
|
||||
"stopFromNative",
|
||||
"(Landroid/content/Context;)V",
|
||||
&[JValue::Object(&activity)],
|
||||
)
|
||||
.map_err(|e| format!("stopFromNative: {:?}", e))?;
|
||||
|
||||
Ok(())
|
||||
}
|
||||
|
|
|
|||
|
|
@ -145,14 +145,22 @@ pub(crate) async fn hole_punch_parallel(
|
|||
None
|
||||
}
|
||||
|
||||
// EDM port scanner — DISABLED in v0.7.3 (see hole_punch_with_scanning).
|
||||
// Constants and helpers preserved as the refactor target for a raw-UDP
|
||||
// scanner that bypasses iroh's path-store accumulation.
|
||||
|
||||
/// Timeout for each individual scan connect attempt (200ms → ~20 in-flight at 100/sec)
|
||||
#[allow(dead_code)]
|
||||
const SCAN_CONNECT_TIMEOUT_MS: u64 = 200;
|
||||
/// Scan rate: one attempt every 10ms = 100 ports/sec
|
||||
#[allow(dead_code)]
|
||||
const SCAN_INTERVAL_MS: u64 = 10;
|
||||
/// How often to punch peer's anchor-observed address during scanning (seconds).
|
||||
/// Each punch checks if the peer has opened a firewall port matching our actual port.
|
||||
#[allow(dead_code)]
|
||||
const SCAN_PUNCH_INTERVAL_SECS: u64 = 2;
|
||||
/// Maximum scan duration (seconds) — accept the cost for otherwise-impossible connections
|
||||
#[allow(dead_code)]
|
||||
const SCAN_MAX_DURATION_SECS: u64 = 300; // 5 minutes
|
||||
|
||||
/// Global cap on concurrent port-scan hole punches. Each scanner fires
|
||||
|
|
@ -164,11 +172,63 @@ const SCAN_MAX_DURATION_SECS: u64 = 300; // 5 minutes
|
|||
/// at proxy timeouts. A permit is acquired before the scanning loop
|
||||
/// starts and held until the scanner returns; extra callers fall back
|
||||
/// to the cheaper `hole_punch_parallel`.
|
||||
#[allow(dead_code)]
|
||||
fn scanner_semaphore() -> &'static tokio::sync::Semaphore {
|
||||
static SEM: std::sync::OnceLock<tokio::sync::Semaphore> = std::sync::OnceLock::new();
|
||||
SEM.get_or_init(|| tokio::sync::Semaphore::new(1))
|
||||
}
|
||||
|
||||
/// Hole punch orchestrator.
|
||||
///
|
||||
/// **v0.7.3:** the EDM port scanner is DISABLED. We do Step 1 (quick punch to
|
||||
/// the anchor-observed address) → Step 2 (parallel punch over the 30s window
|
||||
/// to all known addresses). No port scan.
|
||||
///
|
||||
/// **Why disabled:** iroh's `Endpoint` accumulates every `endpoint.connect()`
|
||||
/// target into a per-endpoint paths set and probes them all in the background
|
||||
/// under QUIC NAT-traversal. A 100-probes/sec / 5-min scan inserts ~30,000
|
||||
/// paths; iroh then probes all of them. Observed at 22MB/s outbound from a
|
||||
/// single client. Disabled until we replace per-probe `endpoint.connect()`
|
||||
/// with a raw `socket.send_to()` on the endpoint's bound UDP socket — see
|
||||
/// `edm_port_scan_disabled_v0_7_3` for the preserved scanner logic to
|
||||
/// refactor against.
|
||||
///
|
||||
/// Original docstring is preserved on `edm_port_scan_disabled_v0_7_3`.
|
||||
pub(crate) async fn hole_punch_with_scanning(
|
||||
endpoint: &iroh::Endpoint,
|
||||
target: &NodeId,
|
||||
addresses: &[String],
|
||||
_our_profile: crate::types::NatProfile,
|
||||
_peer_profile: crate::types::NatProfile,
|
||||
) -> Option<iroh::endpoint::Connection> {
|
||||
if let Some(conn) = hole_punch_single(endpoint, target, addresses).await {
|
||||
return Some(conn);
|
||||
}
|
||||
hole_punch_parallel(endpoint, target, addresses).await
|
||||
}
|
||||
|
||||
/// **DISABLED in v0.7.3** — kept as the refactor target for a safe replacement.
|
||||
///
|
||||
/// **Why disabled:** iroh's `Endpoint` accumulates every `endpoint.connect()`
|
||||
/// target into a per-endpoint paths set and probes them all in the background
|
||||
/// under QUIC NAT-traversal. A 100-probes/sec / 5-min scan inserts ~30,000
|
||||
/// paths; iroh then probes all of them. Observed at 22MB/s outbound from a
|
||||
/// single client (DoS-grade).
|
||||
///
|
||||
/// **Refactor target:** replace `endpoint.connect()` in the per-probe path
|
||||
/// with a raw `socket.send_to(...)` on the endpoint's bound UDP socket. The
|
||||
/// probe still opens a NAT mapping on our side; we just don't ask iroh to
|
||||
/// manage the path. The every-2s punch retains `endpoint.connect()` so the
|
||||
/// real handshake completes when the peer's punch arrives.
|
||||
///
|
||||
/// Logic worth preserving below: role-based scanner/puncher split,
|
||||
/// `PortWalkIter`, `scanner_semaphore`, `found_tx`/`found_rx` channel
|
||||
/// pattern, deadline + `tokio::select!` orchestration.
|
||||
///
|
||||
/// ---
|
||||
///
|
||||
/// Original docstring:
|
||||
///
|
||||
/// Advanced hole punch with port scanning fallback for EDM/port-restricted NAT.
|
||||
///
|
||||
/// **Role-based behavior** (each side calls this independently):
|
||||
|
|
@ -183,7 +243,8 @@ fn scanner_semaphore() -> &'static tokio::sync::Semaphore {
|
|||
/// NAT mapping alive and checks if the peer's scan has opened their firewall for us.
|
||||
///
|
||||
/// For both-EDM pairs: both sides scan + punch simultaneously.
|
||||
pub(crate) async fn hole_punch_with_scanning(
|
||||
#[allow(dead_code)]
|
||||
async fn edm_port_scan_disabled_v0_7_3(
|
||||
endpoint: &iroh::Endpoint,
|
||||
target: &NodeId,
|
||||
addresses: &[String],
|
||||
|
|
@ -389,12 +450,17 @@ pub(crate) async fn hole_punch_with_scanning(
|
|||
|
||||
/// Iterator that walks outward from a base port: base, base+1, base-1, base+2, base-2, ...
|
||||
/// Skips ports outside [1, 65535].
|
||||
///
|
||||
/// Used by `edm_port_scan_disabled_v0_7_3` — preserved for the future
|
||||
/// raw-UDP scanner refactor.
|
||||
#[allow(dead_code)]
|
||||
struct PortWalkIter {
|
||||
base: u16,
|
||||
offset: u32,
|
||||
tried_plus: bool, // within current offset, have we tried base+offset?
|
||||
}
|
||||
|
||||
#[allow(dead_code)]
|
||||
impl PortWalkIter {
|
||||
fn new(base: u16) -> Self {
|
||||
Self { base, offset: 0, tried_plus: false }
|
||||
|
|
|
|||
|
|
@ -92,6 +92,175 @@ async fn ensure_initial_v_me(
|
|||
generate_and_store_initial_v_me(&s, persona_id, now_ms)
|
||||
}
|
||||
|
||||
/// Probe a list of anchors with batched parallelism, returning the first
|
||||
/// successful NodeId. Remaining probes continue in background tasks after
|
||||
/// first success and naturally register additional mesh connections.
|
||||
///
|
||||
/// **Parameters fixed in v0.7.3:**
|
||||
/// - 3 anchors in flight at a time
|
||||
/// - 2-second stagger between batch dispatches
|
||||
/// - 10s per-anchor connect timeout
|
||||
/// - Failed probes to anchors with `last_seen_ms` older than 3 days
|
||||
/// auto-delete from `known_anchors` (self-healing pruning)
|
||||
///
|
||||
/// Returns `None` only when every probe completed without success.
|
||||
async fn probe_anchors_batched(
|
||||
anchors: Vec<(NodeId, Vec<std::net::SocketAddr>)>,
|
||||
network: Arc<crate::network::Network>,
|
||||
storage: Arc<StoragePool>,
|
||||
self_node_id: NodeId,
|
||||
label: &'static str,
|
||||
) -> Option<NodeId> {
|
||||
use std::sync::atomic::{AtomicUsize, Ordering};
|
||||
|
||||
const BATCH_SIZE: usize = 3;
|
||||
const BATCH_STAGGER_SECS: u64 = 2;
|
||||
const PER_ANCHOR_TIMEOUT_SECS: u64 = 10;
|
||||
const STALE_THRESHOLD_MS: u64 = 3 * 86_400 * 1000;
|
||||
|
||||
let total = anchors.len();
|
||||
if total == 0 {
|
||||
return None;
|
||||
}
|
||||
|
||||
let (success_tx, success_rx) = tokio::sync::oneshot::channel::<NodeId>();
|
||||
let success_tx = Arc::new(tokio::sync::Mutex::new(Some(success_tx)));
|
||||
let completed = Arc::new(AtomicUsize::new(0));
|
||||
let all_done = Arc::new(tokio::sync::Notify::new());
|
||||
|
||||
// Dispatcher: spawns per-anchor tasks in batches of BATCH_SIZE,
|
||||
// sleeping BATCH_STAGGER_SECS between batches. The per-anchor tasks
|
||||
// continue running after the dispatcher exits.
|
||||
let dispatcher = {
|
||||
let network = Arc::clone(&network);
|
||||
let storage = Arc::clone(&storage);
|
||||
let success_tx = Arc::clone(&success_tx);
|
||||
let completed = Arc::clone(&completed);
|
||||
let all_done = Arc::clone(&all_done);
|
||||
tokio::spawn(async move {
|
||||
let mut iter = anchors.into_iter();
|
||||
loop {
|
||||
let batch: Vec<_> = (&mut iter).take(BATCH_SIZE).collect();
|
||||
if batch.is_empty() {
|
||||
break;
|
||||
}
|
||||
let more = iter.size_hint().0 > 0;
|
||||
for (nid, addrs) in batch {
|
||||
let network = Arc::clone(&network);
|
||||
let storage = Arc::clone(&storage);
|
||||
let success_tx = Arc::clone(&success_tx);
|
||||
let completed = Arc::clone(&completed);
|
||||
let all_done = Arc::clone(&all_done);
|
||||
tokio::spawn(async move {
|
||||
let result = probe_one_anchor(&network, &storage, nid, addrs, self_node_id, label).await;
|
||||
if let Some(nid) = result {
|
||||
let mut guard = success_tx.lock().await;
|
||||
if let Some(sender) = guard.take() {
|
||||
let _ = sender.send(nid);
|
||||
}
|
||||
}
|
||||
let prev = completed.fetch_add(1, Ordering::SeqCst);
|
||||
if prev + 1 == total {
|
||||
all_done.notify_one();
|
||||
}
|
||||
});
|
||||
}
|
||||
if more {
|
||||
tokio::time::sleep(std::time::Duration::from_secs(BATCH_STAGGER_SECS)).await;
|
||||
}
|
||||
}
|
||||
})
|
||||
};
|
||||
|
||||
// Race: first success vs all probes complete unsuccessfully.
|
||||
let result = tokio::select! {
|
||||
Ok(nid) = success_rx => Some(nid),
|
||||
_ = all_done.notified() => None,
|
||||
};
|
||||
|
||||
// Detach the dispatcher; in-flight per-anchor tasks continue.
|
||||
drop(dispatcher);
|
||||
|
||||
let _ = BATCH_STAGGER_SECS; // silence unused-const if compiler is picky
|
||||
let _ = PER_ANCHOR_TIMEOUT_SECS;
|
||||
let _ = STALE_THRESHOLD_MS;
|
||||
result
|
||||
}
|
||||
|
||||
async fn probe_one_anchor(
|
||||
network: &crate::network::Network,
|
||||
storage: &Arc<StoragePool>,
|
||||
nid: NodeId,
|
||||
addrs: Vec<std::net::SocketAddr>,
|
||||
self_node_id: NodeId,
|
||||
label: &'static str,
|
||||
) -> Option<NodeId> {
|
||||
const PER_ANCHOR_TIMEOUT_SECS: u64 = 10;
|
||||
const STALE_THRESHOLD_MS: u64 = 3 * 86_400 * 1000;
|
||||
|
||||
if nid == self_node_id || network.is_peer_connected_or_session(&nid).await {
|
||||
return None;
|
||||
}
|
||||
let endpoint_id = match iroh::EndpointId::from_bytes(&nid) {
|
||||
Ok(eid) => eid,
|
||||
Err(_) => return None,
|
||||
};
|
||||
let mut addr = iroh::EndpointAddr::from(endpoint_id);
|
||||
for sa in &addrs {
|
||||
addr = addr.with_ip_addr(*sa);
|
||||
}
|
||||
info!(peer = hex::encode(&nid), label, "Trying anchor");
|
||||
let result = tokio::time::timeout(
|
||||
std::time::Duration::from_secs(PER_ANCHOR_TIMEOUT_SECS),
|
||||
network.connect_to_anchor(nid, addr),
|
||||
).await;
|
||||
|
||||
match result {
|
||||
Ok(Ok(())) => {
|
||||
info!(peer = hex::encode(&nid), label, "Connected to anchor");
|
||||
Some(nid)
|
||||
}
|
||||
Ok(Err(e)) => {
|
||||
debug!(error = %e, peer = hex::encode(&nid), label, "Anchor connect failed");
|
||||
maybe_prune_stale_anchor(storage, &nid, STALE_THRESHOLD_MS).await;
|
||||
None
|
||||
}
|
||||
Err(_) => {
|
||||
debug!(peer = hex::encode(&nid), label, "Anchor connect timed out");
|
||||
maybe_prune_stale_anchor(storage, &nid, STALE_THRESHOLD_MS).await;
|
||||
None
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/// If the anchor's last successful contact was more than `threshold_ms`
|
||||
/// ago, delete it from `known_anchors`. Future startups won't waste a
|
||||
/// probe slot on it. Anchors that were recently successful are preserved
|
||||
/// even when they fail a single probe (likely transient).
|
||||
async fn maybe_prune_stale_anchor(
|
||||
storage: &Arc<StoragePool>,
|
||||
nid: &NodeId,
|
||||
threshold_ms: u64,
|
||||
) {
|
||||
let s = storage.get().await;
|
||||
let last_seen_ms = match s.get_known_anchor_last_seen(nid) {
|
||||
Ok(Some(ms)) => ms,
|
||||
_ => return,
|
||||
};
|
||||
let now_ms = std::time::SystemTime::now()
|
||||
.duration_since(std::time::UNIX_EPOCH)
|
||||
.map(|d| d.as_millis() as u64)
|
||||
.unwrap_or(0);
|
||||
if now_ms > last_seen_ms && now_ms - last_seen_ms > threshold_ms {
|
||||
let _ = s.delete_known_anchor(nid);
|
||||
debug!(
|
||||
peer = hex::encode(nid),
|
||||
age_ms = now_ms - last_seen_ms,
|
||||
"Pruned stale anchor (>3 days since last success + failed probe)"
|
||||
);
|
||||
}
|
||||
}
|
||||
|
||||
impl Node {
|
||||
/// Create or open a node in the given data directory (Desktop profile)
|
||||
pub async fn open(data_dir: impl AsRef<Path>) -> anyhow::Result<Self> {
|
||||
|
|
@ -272,6 +441,11 @@ impl Node {
|
|||
|
||||
/// Bootstrap: connect to anchors, pull initial data, NAT probe, referrals.
|
||||
/// Can be called during open_with_bind (blocking startup) or deferred to background.
|
||||
///
|
||||
/// v0.7.3: anchor probing is batched (3 in flight, 2s stagger between batches,
|
||||
/// 10s per-anchor timeout, first success unblocks downstream, remaining probes
|
||||
/// continue in background and naturally fill peer connections). Failed probes
|
||||
/// to anchors >3 days stale auto-prune from `known_anchors`.
|
||||
pub async fn run_bootstrap(&self, data_dir: &Path) -> anyhow::Result<()> {
|
||||
let storage = &self.storage;
|
||||
let network = &self.network;
|
||||
|
|
@ -479,57 +653,28 @@ impl Node {
|
|||
let (discovered, bootstrap_known): (Vec<_>, Vec<_>) = known.into_iter()
|
||||
.partition(|(nid, _)| !bootstrap_anchor_ids.contains(nid));
|
||||
|
||||
// Phase 1: Try discovered (non-bootstrap) anchors first
|
||||
let mut connected_anchor = None;
|
||||
for (anchor_nid, anchor_addrs) in &discovered {
|
||||
if *anchor_nid == node_id || network.is_peer_connected_or_session(anchor_nid).await {
|
||||
continue;
|
||||
}
|
||||
let endpoint_id = match iroh::EndpointId::from_bytes(anchor_nid) {
|
||||
Ok(eid) => eid,
|
||||
Err(_) => continue,
|
||||
};
|
||||
let mut addr = iroh::EndpointAddr::from(endpoint_id);
|
||||
for sa in anchor_addrs {
|
||||
addr = addr.with_ip_addr(*sa);
|
||||
}
|
||||
info!(peer = hex::encode(anchor_nid), "Trying discovered anchor");
|
||||
match tokio::time::timeout(std::time::Duration::from_secs(10), network.connect_to_anchor(*anchor_nid, addr)).await {
|
||||
Ok(Ok(())) => {
|
||||
info!(peer = hex::encode(anchor_nid), "Connected to discovered anchor");
|
||||
connected_anchor = Some(*anchor_nid);
|
||||
break;
|
||||
}
|
||||
Ok(Err(e)) => debug!(error = %e, peer = hex::encode(anchor_nid), "Discovered anchor: connect failed"),
|
||||
Err(_) => debug!(peer = hex::encode(anchor_nid), "Discovered anchor: connect timed out"),
|
||||
}
|
||||
}
|
||||
// Phase 1: probe discovered (non-bootstrap) anchors in batches.
|
||||
// First success returns immediately; remaining probes continue in
|
||||
// background. Failed probes to anchors >3 days stale auto-prune.
|
||||
let mut connected_anchor = probe_anchors_batched(
|
||||
discovered.clone(),
|
||||
network.clone(),
|
||||
Arc::clone(storage),
|
||||
node_id,
|
||||
"discovered",
|
||||
).await;
|
||||
|
||||
// Phase 2: Fall back to bootstrap anchors only if no discovered anchor worked
|
||||
// Phase 2: bootstrap anchors as fallback — only fires if every
|
||||
// Phase 1 entry failed. Preserves the load-distribution intent
|
||||
// (don't smash the central anchor when discovered anchors work).
|
||||
if connected_anchor.is_none() {
|
||||
for (anchor_nid, anchor_addrs) in &bootstrap_known {
|
||||
if *anchor_nid == node_id || network.is_peer_connected_or_session(anchor_nid).await {
|
||||
continue;
|
||||
}
|
||||
let endpoint_id = match iroh::EndpointId::from_bytes(anchor_nid) {
|
||||
Ok(eid) => eid,
|
||||
Err(_) => continue,
|
||||
};
|
||||
let mut addr = iroh::EndpointAddr::from(endpoint_id);
|
||||
for sa in anchor_addrs {
|
||||
addr = addr.with_ip_addr(*sa);
|
||||
}
|
||||
info!(peer = hex::encode(anchor_nid), "Trying bootstrap anchor (fallback)");
|
||||
match tokio::time::timeout(std::time::Duration::from_secs(10), network.connect_to_anchor(*anchor_nid, addr)).await {
|
||||
Ok(Ok(())) => {
|
||||
info!(peer = hex::encode(anchor_nid), "Connected to bootstrap anchor");
|
||||
connected_anchor = Some(*anchor_nid);
|
||||
break;
|
||||
}
|
||||
Ok(Err(e)) => debug!(error = %e, peer = hex::encode(anchor_nid), "Bootstrap anchor: connect failed"),
|
||||
Err(_) => debug!(peer = hex::encode(anchor_nid), "Bootstrap anchor: connect timed out"),
|
||||
}
|
||||
}
|
||||
connected_anchor = probe_anchors_batched(
|
||||
bootstrap_known.clone(),
|
||||
network.clone(),
|
||||
Arc::clone(storage),
|
||||
node_id,
|
||||
"bootstrap",
|
||||
).await;
|
||||
}
|
||||
|
||||
// Phase 3: NAT probe + referrals from whichever anchor we connected to
|
||||
|
|
|
|||
|
|
@ -2248,6 +2248,33 @@ impl Storage {
|
|||
Ok(result)
|
||||
}
|
||||
|
||||
/// Get the last successful contact time (ms since epoch) for a known anchor.
|
||||
/// Returns None if the anchor isn't in the table.
|
||||
pub fn get_known_anchor_last_seen(&self, node_id: &NodeId) -> anyhow::Result<Option<u64>> {
|
||||
let mut stmt = self.conn.prepare(
|
||||
"SELECT last_seen_ms FROM known_anchors WHERE node_id = ?1",
|
||||
)?;
|
||||
let mut rows = stmt.query(params![node_id.as_slice()])?;
|
||||
if let Some(row) = rows.next()? {
|
||||
let ms: i64 = row.get(0)?;
|
||||
Ok(Some(ms as u64))
|
||||
} else {
|
||||
Ok(None)
|
||||
}
|
||||
}
|
||||
|
||||
/// Remove a known anchor entry. Used by the bootstrap connect path
|
||||
/// when a stale anchor (>3 days since last successful contact) fails
|
||||
/// to connect — self-healing pruning so future startups don't re-try
|
||||
/// long-dead entries.
|
||||
pub fn delete_known_anchor(&self, node_id: &NodeId) -> anyhow::Result<()> {
|
||||
self.conn.execute(
|
||||
"DELETE FROM known_anchors WHERE node_id = ?1",
|
||||
params![node_id.as_slice()],
|
||||
)?;
|
||||
Ok(())
|
||||
}
|
||||
|
||||
/// Prune known anchors to keep at most `max` entries (by highest success_count).
|
||||
pub fn prune_known_anchors(&self, max: usize) -> anyhow::Result<usize> {
|
||||
let count: i64 = self.conn.query_row(
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue