Analyze svn externals repository ingestion with loader.svn v1.0
Let's analyze the result of the ingestion of subversion repositories with external definitions submitted on staging two days ago.
First, let's compute some statistics about the visit statuses of these repositories from the save_origin_request table of the staging webapp database.
13:51 $ ssh anlambert@webapp.internal.staging.swh.network
Linux webapp 4.19.0-18-amd64 #1 SMP Debian 4.19.208-1 (2021-09-29) x86_64
The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.
Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
Last login: Fri Jan 21 12:29:24 2022 from 192.168.101.11
anlambert@webapp:~$ django-admin dumpdata --settings=swh.web.settings.production > webapp.staging.db.json
anlambert@carnavalet:~$ scp anlambert@webapp.internal.staging.swh.network:webapp.staging.db.json .
anlambert@carnavalet:~$ cd ~/swh/swh-environment/swh-web
(swh) ✘-PIPE ~/swh/swh-environment/swh-web [master|⚑ 106]
13:56 $ rm swh/web/settings/db.sqlite3
(swh) ✘-PIPE ~/swh/swh-environment/swh-web [master|⚑ 106]
13:56 $ make run-migrations-dev
python3 swh/web/manage.py migrate --settings=swh.web.settings.development -v0 2>/dev/null
(swh) ✔ ~/swh/swh-environment/swh-web [master|⚑ 106]
13:57 $ django-admin loaddata --settings=swh.web.settings.development ~/webapp.staging.db.json
Once the copy of the webapp staging database was done, we manually removed (using sqlitebrowser) the rows of the save_origin_request table that do not concern subversion repositories with externals.
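For the record, the manual cleanup could also be scripted. The snippet below is only a sketch of that idea, not what was actually run: it assumes the save_origin_request table has visit_type and origin_url columns, and that the URLs of the submitted repositories are available as a Python list (the helper name keep_only_requests is made up for illustration).

```python
import sqlite3


def keep_only_requests(conn, origin_urls):
    """Delete every save_origin_request row that is not an svn request
    for one of the given origin URLs; return the number of deleted rows.

    Assumption: the table has visit_type and origin_url columns.
    """
    # Load the URLs to keep into a temporary table so that the list
    # can be arbitrarily long.
    conn.execute("CREATE TEMP TABLE kept_origins (origin_url TEXT PRIMARY KEY)")
    conn.executemany(
        "INSERT OR IGNORE INTO kept_origins VALUES (?)",
        [(url,) for url in origin_urls],
    )
    cur = conn.execute(
        "DELETE FROM save_origin_request "
        "WHERE visit_type != 'svn' "
        "OR origin_url NOT IN (SELECT origin_url FROM kept_origins)"
    )
    deleted = cur.rowcount
    conn.execute("DROP TABLE kept_origins")
    conn.commit()
    return deleted
```

It would be called with a connection to the local copy, e.g. sqlite3.connect("swh/web/settings/db.sqlite3"), and the list of submitted origin URLs.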
We can now compute the statistics about visit statuses.
(swh) ✔ ~/swh/swh-environment/swh-web [master|⚑ 106]
13:45 $ sqlite3 swh/web/settings/db.sqlite3
SQLite version 3.34.1 2021-01-20 14:10:07
Enter ".help" for usage hints.
sqlite> select visit_status, count(*) from save_origin_request group by visit_status;
|241
created|4
failed|7
full|521
not_found|1
partial|117
So at the time of writing, out of the 891 save code now requests for subversion repos with externals:
- 241 have not been visited yet
- 4 have the created visit status
- 7 failed
- 521 succeeded with a full repository loading
- 1 has the not_found visit status
- 117 succeeded with a partial repository loading
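To follow how these numbers evolve as the remaining requests get processed, the same query can be wrapped in a few lines of Python. This is a minimal sketch that assumes only the save_origin_request table and its visit_status column, both shown above; the function name visit_status_stats and the "(not visited yet)" label are made up for illustration.

```python
import sqlite3


def visit_status_stats(conn):
    """Return {visit_status: (count, percentage)} for save_origin_request.

    Rows with an empty or NULL visit_status are reported under the
    "(not visited yet)" key.
    """
    rows = conn.execute(
        "SELECT coalesce(visit_status, ''), count(*) "
        "FROM save_origin_request GROUP BY 1"
    ).fetchall()
    total = sum(count for _, count in rows)
    if total == 0:
        return {}
    return {
        status or "(not visited yet)": (count, round(100 * count / total, 1))
        for status, count in rows
    }


# Usage: visit_status_stats(sqlite3.connect("swh/web/settings/db.sqlite3"))
```

With the counts listed above, full loadings account for about 58.5% of the requests and partial ones for about 13.1%.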
For the record, a partial repository loading can mean:
- the reconstructed filesystem for the last loaded revision differs from the one obtained with a subversion export operation on that revision; that check is performed by the post_load hook of the subversion loader
- svnrdump could not dump the whole repository (network issue for instance), so only a partial set of revisions was loaded
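To make the first case more concrete, here is a simplified illustration of that kind of check. It is not the loader's actual post_load implementation: it is a stand-in that hashes two directory trees (relative paths, file contents and symlink targets) and reports full when they match and partial otherwise. The function names tree_digest and visit_status_for are made up for illustration.

```python
import hashlib
import os
from pathlib import Path


def tree_digest(root):
    """Return a digest of a directory tree: relative paths, file
    contents and symlink targets, visited in sorted order."""
    root = Path(root)
    digest = hashlib.sha256()
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root).as_posix().encode()
        if path.is_symlink():
            data = os.readlink(path).encode()
            kind = b"link"
        elif path.is_dir():
            data = b""
            kind = b"dir"
        else:
            data = path.read_bytes()
            kind = b"file"
        # The size is included so that entry boundaries are unambiguous.
        digest.update(kind + b"\0" + rel + b"\0" + str(len(data)).encode() + b"\0")
        digest.update(data)
    return digest.hexdigest()


def visit_status_for(reconstructed_dir, exported_dir):
    """Return "full" if both trees are identical, "partial" otherwise."""
    same = tree_digest(reconstructed_dir) == tree_digest(exported_dir)
    return "full" if same else "partial"
```

Here reconstructed_dir would be the filesystem rebuilt by the loader for the last revision and exported_dir the result of a subversion export of that same revision.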
So the results are not bad so far, but there are still some issues in the externals support implementation of the loader. Let's find them and fix them.
Related to swh/infra/sysadm-environment#3864 (closed)
Migrated from T3870 (view on Phabricator)