svn_repo: Optimize export of a remote subversion sub-path
Previously, when exporting a sub-path of a remote Subversion repository
over the network, the full repository was exported and the local path
targeting the sub-path was returned. This is not optimal in terms of
network bandwidth if the repository filesystem is large, but it was
implemented this way to ensure that all tests related to sub-path exports
passed regardless of the Subversion loader class used: either SvnLoader
or SvnLoaderFromRemoteDump.
After some analysis, it turned out that, when using the SvnLoader class,
it is possible to export only the requested sub-path instead of the full
repository. So modify the SvnRepo class to ensure that behavior and save
network bandwidth when dealing with a large repository.
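
As a rough illustration of the difference (this is not the actual SvnRepo
code), the sketch below builds the `svn export` command line for both
strategies. The helper name `build_export_command`, its parameters and the
`/tmp/export` destination are hypothetical; the flags and URLs are the ones
visible in the loader logs of this description.

```python
import posixpath
from typing import List


def build_export_command(
    repo_url: str, sub_path: str, revision: int, dest_dir: str, full_export: bool
) -> List[str]:
    """Return the argv of an `svn export` call (hypothetical helper)."""
    sub_path = sub_path.strip("/")
    if full_export or not sub_path:
        # Previous behavior: export the whole repository; the caller then
        # has to locate the sub-path inside the exported tree.
        url = repo_url.rstrip("/")
    else:
        # Optimized behavior: only the requested sub-path is transferred.
        url = posixpath.join(repo_url.rstrip("/"), sub_path)
    # The export lands in a directory named after the last URL component.
    local_path = posixpath.join(dest_dir, posixpath.basename(url))
    return [
        "svn", "export", "-r", str(revision),
        "--depth", "infinity", "--ignore-keywords",
        url, local_path,
    ]


repo = "svn://svn.savannah.gnu.org/apl"
before = build_export_command(repo, "trunk", 1550, "/tmp/export", full_export=True)
after = build_export_command(repo, "trunk", 1550, "/tmp/export", full_export=False)
print(" ".join(before))
print(" ".join(after))
```

In the first case the caller still has to append `trunk` to the exported
path; in the second case the exported path is already the sub-path.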
These changes in the SvnRepo class require some adjustments in the replay
module to ensure all tests still pass, and they also make it possible to
remove a no longer needed optional parameter from the class constructor.
This also improves the performance of the SvnDirectoryLoader class
introduced in !221 (merged). See the example below, where the load task
duration drops from about 235 to about 164 seconds:
- before that optimization:
[2023-05-31 12:08:29,333: INFO/MainProcess] Task swh.loader.svn.tasks.LoadSvnDirectory[57730b45-5151-4217-bbd4-0d17314cc910] received
[2023-05-31 12:08:29,334: INFO/MainProcess] loader@f1b5a8eddc92 ready.
[2023-05-31 12:08:29,437] Loading config file /loader.yml
[2023-05-31 12:08:29,451] Loader checksums computation: nar
[2023-05-31 12:08:31,494] Load origin 'svn://svn.savannah.gnu.org/apl/trunk' with type 'svn-export'
[2023-05-31 12:08:31,495] lister_not provided, skipping extrinsic origin metadata
[2023-05-31 12:08:52,163] svn export -r 1550 --depth infinity --ignore-keywords svn://svn.savannah.gnu.org/apl /tmp/tmp6r328lle/check-revision-1550.ah6tp1ks/apl
[2023-05-31 12:11:45,023] Artifact <svn-export> with path /tmp/tmp6r328lle/check-revision-1550.ah6tp1ks/apl/trunk
[2023-05-31 12:11:45,023] Artifact <svn-export> to check nar hashes: /tmp/tmp6r328lle/check-revision-1550.ah6tp1ks/apl/trunk
[2023-05-31 12:11:49,591] Number of skipped contents: 0
[2023-05-31 12:11:49,591] Number of contents: 4877
[2023-05-31 12:11:49,601] Flushing 4877 objects of type content (162059838 bytes)
[2023-05-31 12:12:21,042] Number of directories: 46
[2023-05-31 12:12:21,052] Flushing 46 objects of type directory (4978 entries)
[2023-05-31 12:12:22,373] Flushing 1 objects of type snapshot
[2023-05-31 12:12:23,201] Flushing 1 objects of type extid
[2023-05-31 12:12:24,223] cleanup /tmp/tmp6r328lle
[2023-05-31 12:12:24,909] Task swh.loader.svn.tasks.LoadSvnDirectory[57730b45-5151-4217-bbd4-0d17314cc910] succeeded in 235.46256677999918s: {'status': 'eventful'}
- after that optimization:
[2023-05-31 11:54:52,325] Task swh.loader.svn.tasks.LoadSvnDirectory[4efdfa16-59fe-498d-8db6-71e20b449076] received
[2023-05-31 11:54:52,328] Loading config file /loader.yml
[2023-05-31 11:54:52,344] Loader checksums computation: nar
[2023-05-31 11:54:55,388] Load origin 'svn://svn.savannah.gnu.org/apl/trunk' with type 'svn-export'
[2023-05-31 11:54:55,388] lister_not provided, skipping extrinsic origin metadata
[2023-05-31 11:55:07,639] svn export -r 1550 --depth infinity --ignore-keywords svn://svn.savannah.gnu.org/apl/trunk /tmp/tmp1hh2q8uz/check-revision-1550._nadhqvl/trunk
[2023-05-31 11:57:01,453] Artifact <svn-export> with path /tmp/tmp1hh2q8uz/check-revision-1550._nadhqvl/trunk
[2023-05-31 11:57:01,453] Artifact <svn-export> to check nar hashes: /tmp/tmp1hh2q8uz/check-revision-1550._nadhqvl/trunk
[2023-05-31 11:57:05,161] Number of skipped contents: 0
[2023-05-31 11:57:05,162] Number of contents: 4877
[2023-05-31 11:57:05,179] Flushing 4877 objects of type content (162059838 bytes)
[2023-05-31 11:57:33,333] Number of directories: 46
[2023-05-31 11:57:33,348] Flushing 46 objects of type directory (4978 entries)
[2023-05-31 11:57:34,596] Flushing 1 objects of type snapshot
[2023-05-31 11:57:34,749] Flushing 1 objects of type extid
[2023-05-31 11:57:35,768] cleanup /tmp/tmp1hh2q8uz
[2023-05-31 11:57:36,056] Task swh.loader.svn.tasks.LoadSvnDirectory[4efdfa16-59fe-498d-8db6-71e20b449076] succeeded in 163.7220381780062s: {'status': 'eventful'}
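
To put numbers on the two runs above, the following sketch computes the time
spent in the export step and the overall saving. It assumes that the
`svn export ...` log line marks the start of the export and the first
`Artifact <svn-export>` line marks its end; the helper name `elapsed` is
hypothetical. Each figure comes from a single run, so it is only indicative.

```python
from datetime import datetime

# Timestamp format used by the loader logs, e.g. "2023-05-31 12:08:52,163".
FMT = "%Y-%m-%d %H:%M:%S,%f"


def elapsed(start: str, end: str) -> float:
    """Seconds between two log timestamps (hypothetical helper)."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds()


# Export step: from the "svn export" line to the first "Artifact" line.
export_before = elapsed("2023-05-31 12:08:52,163", "2023-05-31 12:11:45,023")
export_after = elapsed("2023-05-31 11:55:07,639", "2023-05-31 11:57:01,453")

# Whole task durations, copied from the "succeeded in ..." log lines.
task_before = 235.46256677999918
task_after = 163.7220381780062
saved_ratio = (task_before - task_after) / task_before

print(f"export step: {export_before:.1f}s before, {export_after:.1f}s after")
print(f"whole task: {saved_ratio:.0%} less time")
```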