Phase 2e (0.6.1-beta): drop legacy upstream/downstream tables

The file_holders table is now the only tracker of per-file peer
relationships. post_upstream, post_downstream, blob_upstream, and
blob_downstream are dropped at first launch after the seed migration
copies any existing entries.

Schema:
- DROP TABLE IF EXISTS on all four legacy tables after seeding
- Seed migration guards with sqlite_master table_exists check so fresh
  installs don't crash trying to read non-existent sources
- Remove CREATE TABLE statements for the four tables from init
- Remove Protocol v4 Phase 6 post_upstream priority migration (dead)
- Remove blob_upstream preferred_tree column migration (dead)
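The seeding-plus-drop sequence above can be sketched against a plain SQLite database. This is an illustrative sketch only: the file_holders columns (cid, peer, direction, touched_at) and the legacy tables' column layout are assumptions, not the repo's actual DDL.

```python
import sqlite3

def table_exists(conn, name):
    # The sqlite_master guard described above: fresh installs have no
    # legacy tables, so the copy step must be skippable.
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (name,),
    ).fetchone()
    return row is not None

def seed_and_drop_legacy(conn):
    # Hypothetical file_holders schema for this sketch.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS file_holders ("
        "cid TEXT NOT NULL, peer TEXT NOT NULL, "
        "direction TEXT NOT NULL, touched_at INTEGER NOT NULL, "
        "PRIMARY KEY (cid, peer))"
    )
    for table in ("post_upstream", "post_downstream",
                  "blob_upstream", "blob_downstream"):
        if table_exists(conn, table):
            direction = "upstream" if table.endswith("upstream") else "downstream"
            # Copy any surviving rows before the table disappears.
            conn.execute(
                f"INSERT OR IGNORE INTO file_holders "
                f"SELECT cid, peer, ?, 0 FROM {table}",
                (direction,),
            )
        # Safe on both upgraded and fresh installs, matching the
        # DROP TABLE IF EXISTS in the commit.
        conn.execute(f"DROP TABLE IF EXISTS {table}")
    conn.commit()
```

Because both the existence check and the drop are no-ops on a fresh install, running the migration twice is harmless.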

Rust:
- Remove add/get/remove post_upstream, post_downstream,
  blob_upstream, blob_downstream methods
- Remove get_blob_upstream_preferred_tree / update variant
- Rewrite the downstream_count subquery in get_eviction_candidates to
  count file_holders entries
- Rewrite the cascade cleanup in apply_delete to clear file_holders
  instead of post_upstream/post_downstream
- cleanup_cdn_for_blob now clears file_holders for the CID
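The reworked downstream_count subquery can be sketched as below. Table and column names (a blobs table with cid/size) are assumptions for illustration; per the commit, the subquery simply counts file_holders entries per CID rather than filtering by direction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE blobs (cid TEXT PRIMARY KEY, size INTEGER);
    CREATE TABLE file_holders (cid TEXT, peer TEXT, direction TEXT, touched_at INTEGER);
    INSERT INTO blobs VALUES ('a', 10), ('b', 20);
    INSERT INTO file_holders VALUES
        ('a', 'p1', 'downstream', 1),
        ('a', 'p2', 'downstream', 2);
""")

# Rank eviction candidates by how many file_holders entries reference
# each blob; the least-referenced blobs surface first.
candidates = conn.execute("""
    SELECT b.cid,
           (SELECT COUNT(*) FROM file_holders fh WHERE fh.cid = b.cid)
               AS downstream_count
    FROM blobs b
    ORDER BY downstream_count ASC
""").fetchall()
```

Here `candidates` comes back as `[('b', 0), ('a', 2)]`: blob `b` has no holders and is the first eviction candidate.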

Callers:
- All dual-write sites in connection.rs and node.rs now call
  touch_file_holder only (legacy writes removed)
- get_stale_manifests replaced with get_stale_manifest_cids; caller
  in node.rs picks a refresh source from file_holders
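A single-write touch_file_holder could look like the upsert below, again against the hypothetical schema from this sketch (the real method lives in the Rust storage layer). The ON CONFLICT clause also shows how a later touch can promote an entry's direction.

```python
import sqlite3
import time

def touch_file_holder(conn, cid, peer, direction):
    # One upsert replaces the old dual write (legacy table plus
    # file_holders): refresh the timestamp and take the latest direction.
    conn.execute(
        """INSERT INTO file_holders (cid, peer, direction, touched_at)
           VALUES (?, ?, ?, ?)
           ON CONFLICT (cid, peer) DO UPDATE SET
               touched_at = excluded.touched_at,
               direction = excluded.direction""",
        (cid, peer, direction, int(time.time() * 1000)),
    )
```

Repeated touches for the same (cid, peer) pair collapse into one row, which is what lets the callers drop the legacy writes without losing information.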

Tests:
- Remove blob_upstream_crud, blob_downstream_crud_and_limit,
  blob_upstream_preferred_tree, remove_blob_upstream,
  post_downstream_crud
- Add file_holders_lru_cap and file_holders_direction_promotion tests
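The LRU-cap behaviour a file_holders_lru_cap test would exercise can be sketched as a prune that keeps only the most recently touched holders per CID; the cap semantics and schema here are assumptions, not the test's actual code.

```python
import sqlite3

def enforce_holder_cap(conn, cid, cap):
    # Delete every holder for this CID except the `cap` entries with the
    # newest touched_at values.
    conn.execute(
        """DELETE FROM file_holders
           WHERE cid = ? AND peer NOT IN (
               SELECT peer FROM file_holders
               WHERE cid = ? ORDER BY touched_at DESC LIMIT ?)""",
        (cid, cid, cap),
    )
```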

All 110 core tests passing. Workspace compiles clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Scott Reimers 2026-04-21 21:42:15 -04:00
parent 60463d1817
commit 5d9ba22427
3 changed files with 112 additions and 504 deletions


@@ -1350,7 +1350,6 @@ impl Node {
         let source_addrs: Vec<String> = response.manifest.as_ref()
             .map(|m| m.host_addresses.clone())
             .unwrap_or_default();
-        let _ = storage.store_blob_upstream(cid, from_peer, &source_addrs);
         let _ = storage.touch_file_holder(
             cid,
             from_peer,
@@ -1419,7 +1418,6 @@ impl Node {
                 let _ = storage.store_cdn_manifest(cid, &author_json, &cdn_manifest.author_manifest.author, cdn_manifest.author_manifest.updated_at);
             }
         }
-        let _ = storage.store_blob_upstream(cid, &lateral, &[]);
         let _ = storage.touch_file_holder(
             cid,
             &lateral,
@@ -3095,20 +3093,27 @@ impl Node {
             .duration_since(std::time::UNIX_EPOCH)
             .unwrap_or_default()
             .as_millis() as u64 - max_age_ms;
-        let stale = {
+        let stale_cids = {
             let s = storage.get().await;
-            s.get_stale_manifests(cutoff).unwrap_or_default()
+            s.get_stale_manifest_cids(cutoff).unwrap_or_default()
         };
-        for (cid, upstream_nid, _upstream_addrs) in &stale {
-            // Get current updated_at for this manifest
-            let current_updated_at = {
+        for cid in &stale_cids {
+            // Get current updated_at + pick a holder to refresh from
+            let (current_updated_at, refresh_source) = {
                 let s = storage.get().await;
-                s.get_cdn_manifest(cid).ok().flatten()
+                let updated_at = s.get_cdn_manifest(cid).ok().flatten()
                     .and_then(|json| serde_json::from_str::<crate::types::AuthorManifest>(&json).ok())
                     .map(|m| m.updated_at)
-                    .unwrap_or(0)
+                    .unwrap_or(0);
+                let source = s.get_file_holders(cid)
+                    .unwrap_or_default()
+                    .into_iter()
+                    .next()
+                    .map(|(nid, _)| nid);
+                (updated_at, source)
             };
-            match network.request_manifest_refresh(cid, upstream_nid, current_updated_at).await {
+            let Some(upstream_nid) = refresh_source else { continue; };
+            match network.request_manifest_refresh(cid, &upstream_nid, current_updated_at).await {
                 Ok(Some(cdn_manifest)) => {
                     if crypto::verify_manifest_signature(&cdn_manifest.author_manifest) {
                         let author_json = serde_json::to_string(&cdn_manifest.author_manifest).unwrap_or_default();
@@ -3135,7 +3140,7 @@ impl Node {
                 Err(e) => {
                     tracing::debug!(
                         cid = hex::encode(cid),
-                        upstream = hex::encode(upstream_nid),
+                        upstream = hex::encode(&upstream_nid),
                         error = %e,
                         "Manifest refresh from upstream failed"
                     );