cvsclient: Optimize use of temporary files

I encountered this issue while testing the loading of a CVS repository submitted to Save Code Now:

$ swh -l DEBUG loader run cvs pserver://anonymous@cvs.openacs.org/cvsroot/openacs-4

DEBUG:swh.loader.cvs.loader.CvsLoader:Fetching CVS rlog from cvs.openacs.org:/cvsroot/openacs-4
ERROR:swh.loader.cvs.loader.CvsLoader:Loading failure, updating to `failed` status
Traceback (most recent call last):
  File "/home/anlambert/.virtualenvs/swh/lib/python3.11/site-packages/swh/loader/core/loader.py", line 441, in load
  File "/home/anlambert/swh/swh-environment/swh-loader-cvs/build/__editable__.swh.loader.cvs-0.8.1-cp311-cp311-linux_x86_64/swh/loader/cvs/loader.py", line 585, in prepare
  File "/home/anlambert/swh/swh-environment/swh-loader-cvs/build/__editable__.swh.loader.cvs-0.8.1-cp311-cp311-linux_x86_64/swh/loader/cvs/cvsclient.py", line 356, in fetch_rlog
  File "/home/anlambert/swh/swh-environment/swh-loader-cvs/build/__editable__.swh.loader.cvs-0.8.1-cp311-cp311-linux_x86_64/swh/loader/cvs/cvsclient.py", line 294, in _parse_rlog_response
  File "/usr/lib/python3.11/tempfile.py", line 796, in TemporaryFile
  File "/usr/lib/python3.11/tempfile.py", line 789, in opener
  File "/usr/lib/python3.11/tempfile.py", line 395, in _mkstemp_inner
OSError: [Errno 24] Too many open files: '/tmp/tmpy5h2de_1'

When using the pserver protocol, the CVS loader can create a large number of temporary files, which exhausts the available file descriptors and makes loading fail for large repositories. To mitigate this, use SpooledTemporaryFile instead of TemporaryFile: data is kept in memory and only written to a file on disk once its size exceeds a cutoff value.
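As a rough illustration of the idea (not the loader's actual code), the sketch below buffers response data in a `tempfile.SpooledTemporaryFile`. The helper name `buffer_response` and the cutoff value are assumptions made for this example only; the cutoff actually used in the change is not stated here.

```python
import tempfile

# Hypothetical cutoff for this sketch; not the value used by the loader.
MAX_IN_MEMORY_SIZE = 1024 * 1024  # 1 MiB


def buffer_response(chunks, max_size=MAX_IN_MEMORY_SIZE):
    """Hypothetical helper: store byte chunks in a spooled temporary file.

    The data stays in memory until more than max_size bytes have been
    written; only then does Python create a real file on disk.
    """
    buf = tempfile.SpooledTemporaryFile(max_size=max_size)
    for chunk in chunks:
        buf.write(chunk)
    buf.seek(0)  # rewind so the caller can read from the start
    return buf


# Small payload: stays below the cutoff, so no file descriptor is consumed.
small = buffer_response([b"head 1.1;\n"])

# Large payload relative to a tiny cutoff: rolls over to an on-disk file.
large = buffer_response([b"x" * 64], max_size=16)
```

With many small rlog or file responses, most buffers would stay in memory under this approach, so the number of open file descriptors grows only with the number of responses that exceed the cutoff.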

The loader's checkout implementation for the pserver protocol was also simplified.

Note that these are quick fixes: we urgently need to archive the CVS repositories hosted on OSDN.net (using the pserver protocol) before the upcoming takedown. The loader implementation should be improved later to better follow Python best practices.

Edited by Antoine Lambert
