race condition during concurrent loading of the same objects from multiple origins
Looking through the Kibana logs, we found the following error occurring frequently on the storage side:
[2019-08-20 00:39:25,728: ERROR/ForkPoolWorker-88373] Loading failure, updating to `partial` status
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/swh/loader/core/loader.py", line 896, in load
    self.store_data()
  File "/usr/lib/python3/dist-packages/swh/loader/core/loader.py", line 1003, in store_data
    self.send_batch_contents(self.get_contents())
  File "/usr/lib/python3/dist-packages/swh/loader/core/loader.py", line 649, in send_batch_contents
    packet_size_bytes=packet_size_bytes)
  File "/usr/lib/python3/dist-packages/swh/loader/core/loader.py", line 41, in send_in_packets
    sender(formatted_objects)
  File "/usr/lib/python3/dist-packages/retrying.py", line 49, in wrapped_f
    return Retrying(*dargs, **dkw).call(f, *args, **kw)
  File "/usr/lib/python3/dist-packages/retrying.py", line 206, in call
    return attempt.get(self._wrap_exception)
  File "/usr/lib/python3/dist-packages/retrying.py", line 247, in get
    six.reraise(self.value[0], self.value[1], self.value[2])
  File "/usr/lib/python3/dist-packages/six.py", line 686, in reraise
    raise value
  File "/usr/lib/python3/dist-packages/retrying.py", line 200, in call
    attempt = Attempt(fn(*args, **kwargs), attempt_number, False)
  File "/usr/lib/python3/dist-packages/swh/loader/core/loader.py", line 400, in send_contents
    result = self.storage.content_add(content_list)
  File "/usr/lib/python3/dist-packages/swh/storage/api/client.py", line 24, in content_add
    return self.post('content/add', {'content': content})
  File "/usr/lib/python3/dist-packages/swh/core/api/__init__.py", line 198, in post
    return self._decode_response(response)
  File "/usr/lib/python3/dist-packages/swh/core/api/__init__.py", line 230, in _decode_response
    raise pickle.loads(decode_response(response))
swh.storage.HashCollision: sha1
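For illustration, here is a toy sketch of one way such a race could produce this error. It assumes (this is not confirmed by the traceback) that `content_add` first checks whether the content is missing and then inserts it under a uniqueness constraint on `sha1`. `ToyStorage`, `load`, `run_demo` and the local `HashCollision` class are hypothetical stand-ins, not the actual swh.storage code; the barrier only exists to force the unlucky interleaving every time.

```python
import threading


class HashCollision(Exception):
    """Hypothetical stand-in for swh.storage.HashCollision."""


class ToyStorage:
    """In-memory store keyed by sha1, mimicking a unique index (assumption)."""

    def __init__(self, n_writers):
        self._rows = {}
        self._lock = threading.Lock()
        # Makes every writer finish its "missing?" check before any insert,
        # so the race is reproduced deterministically.
        self._barrier = threading.Barrier(n_writers)

    def content_add(self, sha1, data):
        with self._lock:
            missing = sha1 not in self._rows  # step 1: check
        self._barrier.wait()  # at this point all writers saw "missing"
        if not missing:
            return
        with self._lock:  # step 2: insert
            if sha1 in self._rows:
                # Uniqueness violation, surfaced like the error in the log.
                raise HashCollision("sha1")
            self._rows[sha1] = data


def load(storage, origin, sha1, data, results):
    """Simulate one loader sending the same content for a given origin."""
    try:
        storage.content_add(sha1, data)
        results[origin] = "ok"
    except HashCollision as exc:
        results[origin] = f"HashCollision: {exc}"


def run_demo():
    storage = ToyStorage(n_writers=2)
    results = {}
    threads = [
        threading.Thread(
            target=load,
            args=(storage, origin, "abc123", b"same bytes", results),
        )
        for origin in ("origin-a", "origin-b")
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    print(run_demo())
```

In this sketch exactly one of the two loaders succeeds and the other fails, even though both sent identical content. Which origin loses depends on thread scheduling.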
Occurrences:
root@uffizi:~# zgrep -c "swh.storage.HashCollision" /var/log/syslog.*
/var/log/syslog.1:168
/var/log/syslog.2.gz:152
/var/log/syslog.3.gz:136
/var/log/syslog.4.gz:127
/var/log/syslog.5.gz:168
/var/log/syslog.6.gz:112
/var/log/syslog.7.gz:137
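If the same count is needed from a script, a minimal Python sketch of what `zgrep -c` does above could look like this. Like `zgrep -c`, it counts matching lines rather than individual matches. `count_matching_lines` is a hypothetical helper name, and picking the opener from the `.gz` suffix is an assumption about how the rotated files are named.

```python
import gzip


def count_matching_lines(path, needle):
    """Count lines containing `needle` in a plain or gzip-compressed file."""
    # Assumption: compressed rotations end in ".gz", as in the listing above.
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt", errors="replace") as fh:
        return sum(1 for line in fh if needle in line)
```

For example, `count_matching_lines("/var/log/syslog.2.gz", "swh.storage.HashCollision")` would correspond to one line of the output above.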
Migrated from T2019 (view on Phabricator)