Cassandra warnings about tombstones on raw_extrinsic_metadata queries
It looks like the raw_extrinsic_metadata table has a lot of tombstones.
Relevant nodetool tablestats output for the table:
Percent repaired: 99.97
Bytes repaired: 788.622GiB
Bytes unrepaired: 258.229MiB
Bytes pending repair: 0.000KiB
Average tombstones per slice (last five minutes): NaN
Maximum tombstones per slice (last five minutes): 0
Droppable tombstone ratio: 0.25052
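To keep an eye on these figures over time, the "Key: value" lines can be parsed mechanically. The following is only a minimal sketch, assuming the text has the same shape as the excerpt above; the function name parse_tablestats and the ALERT_RATIO threshold are hypothetical and not part of any Cassandra or swh tooling.

```python
def parse_tablestats(text):
    """Parse 'Key: value' lines into a dict mapping key to raw string value."""
    stats = {}
    for raw_line in text.splitlines():
        # Split on the first colon only; lines without a colon are skipped
        key, sep, value = raw_line.partition(":")
        if sep:
            stats[key.strip()] = value.strip()
    return stats


# Values copied from the excerpt above
sample = """\
Percent repaired: 99.97
Bytes repaired: 788.622GiB
Bytes unrepaired: 258.229MiB
Bytes pending repair: 0.000KiB
Average tombstones per slice (last five minutes): NaN
Maximum tombstones per slice (last five minutes): 0
Droppable tombstone ratio: 0.25052
"""

# Hypothetical alerting threshold, chosen only for illustration
ALERT_RATIO = 0.2

stats = parse_tablestats(sample)
ratio = float(stats["Droppable tombstone ratio"])
print(f"droppable tombstone ratio={ratio}, above threshold={ratio > ALERT_RATIO}")
```

Values are kept as strings because some of them carry units (GiB, MiB) or are NaN; only the ratio is converted here.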
The checker script is spamming the logs:
[2024-03-08 18:27:57,292] Server warning: Read 5000 live rows and 5805 tombstone cells for query SELECT directory, fetcher_name, fetcher_version, format, metadata, origin, path, release, revision, snapshot, type, visit FROM swh.raw_extrinsic_metadata WHERE target = swh:1:dir:0c00174e41033b337f67575ce08c7492eedb5f9a AND authority_type = forge AND authority_url = https://opam.ocaml.org AND discovery_date > 2024-02-16T03:53:45.000Z AND id > 7339181526ca084c8826499c2f589ab313236a23 LIMIT 5000; token 3702119759087180356 (see tombstone_warn_threshold)
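To quantify how tombstone-heavy such reads are, the counts can be pulled out of the warning text. This is a rough sketch that assumes the warning always contains the "Read N live rows and M tombstone cells" wording seen above; the helper name tombstone_share is made up for this example, and the query text is shortened with "..." in the sample line.

```python
import re

# Matches the row and tombstone counts in the server warning
TOMBSTONE_WARNING = re.compile(
    r"Read (?P<live>\d+) live rows and (?P<tombstones>\d+) tombstone cells"
)


def tombstone_share(log_line):
    """Return (live_rows, tombstone_cells, tombstones / (live + tombstones)), or None."""
    match = TOMBSTONE_WARNING.search(log_line)
    if match is None:
        return None
    live = int(match.group("live"))
    tombstones = int(match.group("tombstones"))
    total = live + tombstones
    share = tombstones / total if total else 0.0
    return live, tombstones, share


line = (
    "[2024-03-08 18:27:57,292] Server warning: Read 5000 live rows and "
    "5805 tombstone cells for query SELECT ... LIMIT 5000; "
    "token 3702119759087180356 (see tombstone_warn_threshold)"
)
print(tombstone_share(line))
```

For the warning above, more tombstone cells than live rows were read for a single page of 5000 rows.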
The checker retrieves the object with the following code:
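from functools import partial
from swh.model.model import RawExtrinsicMetadata

# Rebuild the model object, then prepare the same lookup by id on both backends
obj_model = RawExtrinsicMetadata.from_dict(obj)
ids = [obj_model.id]
cs_get = partial(cs_storage.raw_extrinsic_metadata_get_by_ids, ids)  # Cassandra
pg_get = partial(pg_storage.raw_extrinsic_metadata_get_by_ids, ids)  # PostgreSQL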
Edited by Vincent Sellier