Allow partial snapshot creation during ingestion

Antoine R. Dumont requested to merge generated-differential-D6380-source into master

This introduces a create_partial_snapshot parameter to the base loader constructor. When it is enabled, each call to the store_data method creates a partial snapshot if there is more data to fetch.

The final loop iteration behaves as before: it creates the last visit with status 'full', targeting the snapshot.

The main difference between the two behaviors is that an ingestion with this parameter enabled is more verbose in terms of origin_visit_status. This, in turn, allows subsequent visits of the same origin to be incremental. This may be especially useful when loading fails due to resource issues (e.g. large svn or git repositories).
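
To illustrate the behavior described above, here is a minimal, self-contained sketch. It is not the actual base loader code: ToyLoader, batches, visit_statuses and the more_data_to_fetch argument are hypothetical names used only for illustration. Only create_partial_snapshot, store_data and the 'partial'/'full' statuses are meant to mirror this change.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ToyLoader:
    """Hypothetical stand-in for the base loader (not the real API)."""

    # Each batch stands for the data fetched by one loop iteration.
    batches: List[List[str]]
    create_partial_snapshot: bool = False
    # Recorded (status, snapshot content) pairs, standing in for
    # origin_visit_status entries.
    visit_statuses: List[Tuple[str, Tuple[str, ...]]] = field(default_factory=list)
    _stored: List[str] = field(default_factory=list)

    def store_data(self, batch: List[str], more_data_to_fetch: bool) -> None:
        self._stored.extend(batch)
        # With the flag on, record a partial snapshot as long as more
        # data remains to be fetched.
        if self.create_partial_snapshot and more_data_to_fetch:
            self.visit_statuses.append(("partial", tuple(self._stored)))

    def load(self) -> List[Tuple[str, Tuple[str, ...]]]:
        for i, batch in enumerate(self.batches):
            more_data_to_fetch = i < len(self.batches) - 1
            self.store_data(batch, more_data_to_fetch)
        # Final step, unchanged by the flag: one 'full' status targeting
        # the complete snapshot.
        self.visit_statuses.append(("full", tuple(self._stored)))
        return self.visit_statuses


if __name__ == "__main__":
    data = [["rev1"], ["rev2"], ["rev3"]]
    print(ToyLoader(data).load())
    print(ToyLoader(data, create_partial_snapshot=True).load())
```

In this sketch, an ingestion with the flag off records a single 'full' status, while the same ingestion with the flag on records one 'partial' status per intermediate batch before the final 'full' one. If the loading were interrupted midway, the last 'partial' entry is what a subsequent visit could start from.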

This is required to allow performance improvements in the git loader [1].

Related to swh-loader-git#3625 (closed)

Test Plan

tox


Migrated from D6380 (view on Phabricator)
