Content replayer may try to copy objects before they are available from an objstorage
There is going to be an issue with the content replayer in steady state: content objects (without the data) are written to Kafka before the data is written to the objstorage. So if content replayers are fast enough (and they probably will be), they'll try to access the data in the objstorage before it's there.
(And if we wrote to Kafka after writing to the objstorage, then there would be a risk of Kafka missing some content in case of failure, which is worse.)
A possible solution is to have the content replayer retry reading objects, until they are available.
There is, however, the issue of missing objects (swh/devel/experiments/swh-db-audit#1817 (moved)), so the replayer can't retry forever for every object or it will get stuck. We see two possible solutions:
- a retry timeout, but it means that some objects might be skipped when they shouldn't be (e.g. if the object takes a long time to become available in the objstorage)
- "hardcoding" a list of missing objects in the configuration, but it could grow large over time (hopefully it won't)
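To make the two options concrete, here is a rough sketch of how they could be combined in the replayer's read path. It is only an illustration, not the actual swh code: `fetch_with_retry`, `ObjectNotYetAvailable`, the `fetch` callable and the `known_missing` set are all hypothetical names, and the timeout and interval values are arbitrary.

```python
import time
from typing import Callable, Optional, Set


class ObjectNotYetAvailable(Exception):
    """Hypothetical error raised by `fetch` when the object is not (yet) in the objstorage."""


def fetch_with_retry(
    obj_id: bytes,
    fetch: Callable[[bytes], bytes],
    known_missing: Set[bytes],
    timeout: float = 60.0,
    interval: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[bytes]:
    """Return the object's data, or None if the object is skipped.

    An object is skipped either because it is in the configured list of
    known-missing objects, or because it did not show up before the timeout.
    """
    # Option 2: objects listed as missing in the configuration are never retried.
    if obj_id in known_missing:
        return None

    # Option 1: retry until the object is available or the timeout expires.
    deadline = clock() + timeout
    while True:
        try:
            return fetch(obj_id)
        except ObjectNotYetAvailable:
            if clock() >= deadline:
                # Give up: this object may be skipped when it shouldn't be.
                return None
            sleep(interval)
```

With this shape, the known-missing list only needs to contain objects that are hit often enough for the timeout to hurt throughput; everything else falls back to the timeout. Objects skipped on timeout would still need to be logged somewhere so they can be replayed later.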
Migrated from T2003 (view on Phabricator)