Fix deposits stuck in "verified" status, which is usually a symptom of a loading failure
This happened in staging, but it could just as well happen in production.
The following error occurred during the ingestion of a deposit:
Feb 24 20:36:42 worker0 python3[657402]: [2021-02-24 20:36:42,387: INFO/MainProcess] Received task: swh.loader.package.deposit.tasks.LoadDeposit[e3d3bc13-4ce4-4d5d-b6de-aa41f561cba3]
Feb 24 20:36:43 worker0 python3[657407]: [2021-02-24 20:36:43,888: ERROR/ForkPoolWorker-1] Failed loading branch HEAD for https://doi.org/10.5281/6a78d227-ae11-4b78-be69-100ba7faf725
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/swh/loader/package/loader.py", line 426, in load
res = self._load_revision(p_info, origin)
File "/usr/lib/python3/dist-packages/swh/loader/package/loader.py", line 541, in _load_revision
dl_artifacts = self.download_package(p_info, tmpdir)
File "/usr/lib/python3/dist-packages/swh/loader/package/deposit/loader.py", line 186, in download_package
return [self.client.archive_get(self.deposit_id, tmpdir, p_info.filename)]
File "/usr/lib/python3/dist-packages/swh/loader/package/deposit/loader.py", line 336, in archive_get
return download(url, dest=tmpdir, filename=filename, auth=self.auth)
File "/usr/lib/python3/dist-packages/swh/loader/package/utils.py", line 81, in download
raise ValueError("Fail to query '%s'. Reason: %s" % (url, response.status_code))
ValueError: Fail to query 'https://deposit-rp.internal.staging.swh.network/1/private/98/raw/'. Reason: 500
Feb 24 20:36:45 worker0 python3[657407]: [2021-02-24 20:36:45,179: WARNING/ForkPoolWorker-1] 1 failed branches
Feb 24 20:36:45 worker0 python3[657407]: [2021-02-24 20:36:45,180: WARNING/ForkPoolWorker-1] Failed branches: HEAD
Deposit information:
swh-deposit=> \conninfo
You are connected to database "swh-deposit" as user "guest" on host "db1.internal.staging.swh.network" (address "192.168.130.11") at port "5432".
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off)
swh-deposit=> select * from deposit where id = 98;
-[ RECORD 1 ]--+-------------------------------------------------------------
id | 98
reception_date | 2021-02-24 20:35:25.262905+00
complete_date | 2021-02-24 20:36:11.063726+00
external_id |
swhid |
status | verified
client_id | 9
collection_id | 8
parent_id |
status_detail |
swhid_context |
check_task_id | 18736121
load_task_id | 18736136
origin_url | https://doi.org/10.5281/6a78d227-ae11-4b78-be69-100ba7faf725
The associated scheduler task is nonetheless marked as completed:
swh-scheduler=> select * from task where id=18736136;
-[ RECORD 1 ]----+------------------------------------------------------------------------------------------------------------------
id | 18736136
type | load-deposit
arguments | {"args": [], "kwargs": {"url": "https://doi.org/10.5281/6a78d227-ae11-4b78-be69-100ba7faf725", "deposit_id": 98}}
next_run | 2021-02-24 20:36:30.43889+00
current_interval | 1 day
status | completed
policy | oneshot
retries_left | 3
priority |
There are 2 problems:
- the deposit should have been marked as "failed" instead of staying in the "verified" state
- the actual loading issue (the 500 error returned when fetching the raw archive)
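
For the first problem, the intended behavior could be sketched roughly as follows. This is only an illustration, not the loader's actual code: the function name `deposit_status_after_load` and its parameters are hypothetical, and "done" is an assumed name for the success status (only "verified" and "failed" appear in this report).

```python
from typing import List


def deposit_status_after_load(
    failed_branches: List[str], load_raised: bool = False
) -> str:
    """Return the status to report for a deposit once a load attempt ends.

    Hypothetical helper: any failed branch, or an exception during loading,
    should move the deposit out of "verified" and into "failed".
    """
    if load_raised or failed_branches:
        return "failed"
    # Assumed name for the success status; not confirmed by this report.
    return "done"


# Case from the log above: one failed branch, HEAD.
status = deposit_status_after_load(["HEAD"])
```

With a rule like this, deposit 98 would end up as "failed" rather than remaining "verified" while its task is "completed".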
Migrated from T3070 (view on Phabricator)