loader-svn: googlecode import: UnicodeDecodeError on user svn properties makes the load fail
dump: /srv/storage/space/mirrors/code.google.com/sources/v2/code.google.com/9/9i00/9i00-repo.svndump.gz
$ python3
Python 3.6.4 (default, Jan 5 2018, 02:13:53)
[GCC 7.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> dump = '9i00-repo.svndump.gz'
>>> origin_url = 'http://%s.googlecode.com' % dump
>>>
>>> import logging
>>> logging.basicConfig(level=logging.DEBUG)
>>>
>>> from swh.loader.svn.tasks import MountAndLoadSvnRepositoryTsk
>>>
>>> t = MountAndLoadSvnRepositoryTsk()
>>> t.run(archive_path=dump, origin_url=origin_url, visit_date='2016-05-03T15:16:32+00:00')
INFO:swh.loader.svn.SvnLoader:Archive to mount and load 9i00-repo.svndump.gz
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Creating svn origin for http://9i00-repo.svndump.gz.googlecode.com
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Done creating svn origin for http://9i00-repo.svndump.gz.googlecode.com
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Creating origin_visit for origin 12 at time 2016-05-03T15:16:32+00:00
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Done Creating origin_visit for origin 12 at time 2016-05-03T15:16:32+00:00
INFO:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:[revision_start-revision_end]: [1-35]
INFO:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Processing {'remote_url': 'file:///tmp/swh.loader.svn.q093b5py.tmp', 'local_url': b'/tmp/swh.loader.svn.n650d22c.tmp/swh.loader.svn.q093b5py.tmp', 'uuid': b'5d117054-d222-43b4-afd8-064e7e915043', 'swh-origin': 12}.
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:rev: 1, swhrev: b6694c21369e5108b68bd0434740140b42adf042, dir: 75ed58f260bfa4102d0e09657803511f5f0ab372
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:rev: 2, swhrev: 0d7c42ea166ae91fc0023a72c25d13896d722258, dir: e64d3ccab08070f74c9a4577dce5b1e8b531ce55
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:rev: 3, swhrev: 2cf883ee78002af3cdfd3513005b15620b585fe2, dir: 85b2a5eaf76ac0cdc42efcd7f4e293c03c27b8ae
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:rev: 4, swhrev: 1ead2ed0bc20326ea38d6a121190b21e85d7d629, dir: cc19a1ab046114d03b465082556cb55731a6fe6e
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:rev: 5, swhrev: c7b9409849750ce074a75f5b4212b72fbdb66f3b, dir: 84a0b4786ce67c65bef611cf6b3a2c640b641763
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:rev: 6, swhrev: d3b567d6904cb944e30177e006fd28fcfc77f4a8, dir: f6f53fc35b557199494cf777160abc4a36dcbef5
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:rev: 7, swhrev: efcab6300195ccd0aaefa2dee2d5642e22db2344, dir: 31af4c42188972543ae2a548015f866567526b65
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:rev: 8, swhrev: 022aa6dccf16a9706876960cb3969903e0427343, dir: cd54af8d241cc86eea8872b4296c1bfc9a0f6b0c
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:rev: 9, swhrev: 8b03663681c85c94ae815624ef0d5bf2040fbcf6, dir: 4a5c0dd34fc1d9585d7bc741bfcfa12d709733c0
ERROR:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Loading failure, updating to `partial` status
Traceback (most recent call last):
  File "/home/tony/work/inria/repo/swh/swh-environment/swh-loader-core/swh/loader/core/loader.py", line 862, in load
    self.store_data()
  File "/home/tony/work/inria/repo/swh/swh-environment/swh-loader-svn/swh/loader/svn/loader.py", line 341, in store_data
    start_from_scratch=self.start_from_scratch)
  File "/home/tony/work/inria/repo/swh/swh-environment/swh-loader-svn/swh/loader/svn/loader.py", line 539, in process_repository
    svnrepo, revision_start, revision_end, revision_parents)
  File "/home/tony/work/inria/repo/swh/swh-environment/swh-loader-svn/swh/loader/svn/loader.py", line 262, in process_swh_revisions
    raise e
  File "/home/tony/work/inria/repo/swh/swh-environment/swh-loader-svn/swh/loader/svn/loader.py", line 241, in process_swh_revisions
    self.config['revision_packet_size']):
  File "/home/tony/work/inria/repo/swh/swh-environment/swh-core/swh/core/utils.py", line 40, in grouper
    for _data in itertools.zip_longest(*args, fillvalue=None):
  File "/home/tony/work/inria/repo/swh/swh-environment/swh-loader-svn/swh/loader/svn/loader.py", line 185, in process_svn_revisions
    for rev, nextrev, commit, new_objects, root_directory in gen_revs:
  File "/home/tony/work/inria/repo/swh/swh-environment/swh-loader-svn/swh/loader/svn/svn.py", line 278, in swh_hash_data_per_revision
    objects = self.swhreplay.compute_hashes(rev)
  File "/home/tony/work/inria/repo/swh/swh-environment/swh-loader-svn/swh/loader/svn/ra.py", line 374, in compute_hashes
    self.replay(rev)
  File "/home/tony/work/inria/repo/swh/swh-environment/swh-loader-svn/swh/loader/svn/ra.py", line 359, in replay
    self.conn.replay(rev, rev+1, self.editor)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb5 in position 0: invalid start byte
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Updating origin_visit for origin 12 with status partial
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Done updating origin_visit for origin 12 with status partial
DEBUG:swh.scheduler.task.MountAndLoadSvnRepositoryTsk:Clean up temp directory /tmp/swh.loader.svn.q093b5py.tmp for project
{'status': 'failed'}
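The error itself can be reproduced without the dump: 0xb5 is a UTF-8 continuation byte, so it can never start a character. The sketch below shows that and one possible tolerant decoding. `decode_svn_property` is a hypothetical helper for illustration, not an existing function of swh-loader-svn, and it assumes the failing value can be obtained as raw bytes, which the traceback does not confirm.

```python
def decode_svn_property(raw: bytes) -> str:
    """Decode a raw property value, tolerating non-UTF-8 bytes.

    Hypothetical helper, not part of swh-loader-svn.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: keeping undecodable bytes visible as \xNN escapes
        # is preferable to aborting the whole load.
        return raw.decode("utf-8", errors="backslashreplace")


# Same failure as in the traceback: 0xb5 at position 0.
try:
    b"\xb5".decode("utf-8")
except UnicodeDecodeError as exc:
    failure = (exc.start, exc.reason)

valid = decode_svn_property("µm".encode("utf-8"))
invalid = decode_svn_property(b"\xb5m")
```

In Latin-1, 0xb5 is the micro sign (µ), so the property value may have been written in a legacy single-byte encoding; that is a guess and would need to be checked against the dump.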
Migrated from T946