Design an interface for storing and requesting (inbound) references to an object
When processing takedown requests, and deciding which objects should and can be removed, we need to look up, in the main storage instance, whether the candidates for removal are referenced by other objects (and, if so, avoid removing them).
To do so, we first use swh.graph to take care of "old enough" nodes; then, to bridge the gap between swh.graph and real time, we use ad-hoc indexes on the PostgreSQL database and raw SQL queries. This approach doesn't scale to the full size of the current archive (we aren't able to maintain these indexes for all objects), and doesn't work at all in Cassandra (where there is no way to create an index on a non-partitioning column).
The proposal is to maintain a dedicated table (or a set of tables sharded by ingestion time) containing the edges between objects, indexed "backwards", that is, from the referenced object to the objects referencing it. This table is populated when new objects are inserted in the storage, and can be pruned on a regular basis once its edges are guaranteed to exist in an up-to-date swh.graph export.
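
To make the discussion concrete, here is a minimal in-memory sketch of what such an interface could look like. Everything in it is an assumption made for illustration: the names `ObjectReference`, `InMemoryObjectReferences`, `add`, `find_referrers` and `prune_before` are hypothetical, objects are identified by plain SWHID strings, and shards are keyed by ingestion date. It is not the actual swh.storage API, only a starting point for the design.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from typing import Iterable


@dataclass(frozen=True)
class ObjectReference:
    """One edge of the archive graph: `source` references `target`.

    Hypothetical type; both ends are SWHIDs stored as plain strings.
    """

    source: str
    target: str


class InMemoryObjectReferences:
    """Hypothetical stand-in for the proposed table(s).

    Edges are stored "backwards" (target -> sources) and grouped into
    shards keyed by ingestion date, so that a whole shard can be dropped
    once a swh.graph export is known to cover it.
    """

    def __init__(self) -> None:
        # ingestion date -> target SWHID -> set of source SWHIDs
        self._shards: dict[date, dict[str, set[str]]] = {}

    def add(self, refs: Iterable[ObjectReference], ingested_on: date) -> None:
        """Record edges for objects inserted in the storage on `ingested_on`."""
        shard = self._shards.setdefault(ingested_on, defaultdict(set))
        for ref in refs:
            shard[ref.target].add(ref.source)

    def find_referrers(self, target: str) -> set[str]:
        """Return every known object that references `target`."""
        found: set[str] = set()
        for shard in self._shards.values():
            found |= shard.get(target, set())
        return found

    def prune_before(self, cutoff: date) -> int:
        """Drop shards ingested strictly before `cutoff`; return how many."""
        stale = [day for day in self._shards if day < cutoff]
        for day in stale:
            del self._shards[day]
        return len(stale)


# Usage with placeholder SWHIDs (not real archive objects).
content = "swh:1:cnt:" + "0" * 40
directory = "swh:1:dir:" + "1" * 40

references = InMemoryObjectReferences()
references.add([ObjectReference(directory, content)], date(2024, 1, 1))

# A takedown candidate is only removable if nothing references it.
removable = not references.find_referrers(content)
```

In a takedown workflow, swh.graph would still answer for older objects, and a lookup like `find_referrers` would only need to cover edges more recent than the latest export. Sharding by ingestion time keeps pruning cheap in this sketch, since it amounts to dropping whole shards rather than deleting individual rows.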