Allow partial snapshot creation during ingestion
This introduces a create_partial_snapshot parameter to the base loader constructor.
When activated, each call to the store_data method creates a partial snapshot if there is more data to fetch.
The final loop iteration behaves as before: it creates the last visit with status 'full' targeting the snapshot.
The main difference between the two behaviors is that an ingestion with that parameter on is more verbose in terms of origin_visit_status. This, in turn, allows subsequent visits of the same origin to be incremental. This may be especially useful when loading fails because of resource issues (e.g. large svn or git repositories).
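A minimal sketch of the two behaviors follows. It is a toy model only: ToyLoader, its batches field, fetch_data, load and the visit_statuses list are hypothetical names made up for illustration and are not the actual base loader API. Only the create_partial_snapshot flag and the 'partial' / 'full' statuses come from the description above.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class ToyLoader:
    """Hypothetical stand-in for the base loader (not the real API)."""

    batches: List[List[str]]
    create_partial_snapshot: bool = False
    # Recorded (status, snapshot content) pairs, mimicking origin_visit_status.
    visit_statuses: List[Tuple[str, Tuple[str, ...]]] = field(default_factory=list)

    def fetch_data(self) -> Iterator[Tuple[List[str], bool]]:
        """Yield each batch with a flag telling whether more data remains."""
        last = len(self.batches) - 1
        for index, batch in enumerate(self.batches):
            yield batch, index < last

    def load(self) -> None:
        stored: List[str] = []
        for batch, more_data in self.fetch_data():
            stored.extend(batch)  # plays the role of store_data
            if more_data and self.create_partial_snapshot:
                # Intermediate snapshot of everything stored so far.
                self.visit_statuses.append(("partial", tuple(stored)))
        # End of the loop: unchanged behavior, a single 'full' visit status.
        self.visit_statuses.append(("full", tuple(stored)))


default = ToyLoader(batches=[["r1"], ["r2"], ["r3"]])
default.load()

verbose = ToyLoader(batches=[["r1"], ["r2"], ["r3"]], create_partial_snapshot=True)
verbose.load()
```

In this sketch, default records a single 'full' status, while verbose records one 'partial' status per intermediate batch before the final 'full' one. If a load were interrupted midway, the last recorded 'partial' snapshot is what a later visit could start from.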
This is required to allow performance improvements in the git loader [1].
- [1] D6386
Related to swh-loader-git#3625 (closed)
Test Plan
tox
Migrated from D6380 (view on Phabricator)