Oops, I accidentally wrote a shell in Python
Luigi workflows contain plenty of ad-hoc bash scripts like 'zstdcat | java | pv | zstdmt > output.tmp'.
This is a source of problems, because:
- a command failing in the middle of a pipeline won't fail the script (by default, bash only reports the exit status of the last command and eats the other errors), so the Python code believes the temporary output was successfully written, commits it, and proceeds: #4773 (closed)
- most arguments aren't properly escaped
- a future change to CountPaths will need to build a pipeline dynamically ('zstdcat' in one case, 'zstdcat | cat - <(cat other_file.txt | sed)' in another, 'zstdcat | cat <(cat other_file.txt | sed) -' in another one), which would be pretty unreadable with string substitutions
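
To illustrate the three points above, here is a minimal sketch of a pipeline built from argument lists instead of a bash string. It is not the code from this merge request: `run_pipeline`, `build_commands` and `PipelineError` are hypothetical names, and only the standard `subprocess` module is assumed. Every command's exit status is checked, arguments never go through a shell, and the pipeline is assembled with ordinary list operations.

```python
import subprocess


class PipelineError(Exception):
    """Raised when any command of a pipeline exits with a non-zero status."""


def run_pipeline(commands):
    """Run `cmd1 | cmd2 | ...` and return the last command's stdout as bytes.

    Each command is a list of arguments, so nothing is parsed by a shell
    and no escaping is needed. (Hypothetical helper, for illustration only.)
    """
    procs = []
    prev_stdout = None
    for argv in commands:
        proc = subprocess.Popen(argv, stdin=prev_stdout, stdout=subprocess.PIPE)
        if prev_stdout is not None:
            # Close our copy of the pipe, so the upstream command gets
            # SIGPIPE if this command exits early.
            prev_stdout.close()
        prev_stdout = proc.stdout
        procs.append(proc)

    output, _ = procs[-1].communicate()
    # Unlike a plain bash pipeline, check the exit status of every command.
    statuses = [(proc.args, proc.wait()) for proc in procs]
    failed = [(args, code) for args, code in statuses if code != 0]
    if failed:
        raise PipelineError(f"commands failed: {failed}")
    return output


def build_commands(text, sort_output):
    """Build the pipeline dynamically, without string substitutions."""
    commands = [["printf", "%s", text]]
    if sort_output:
        commands.append(["sort"])
    return commands
```

With this shape, `run_pipeline([["false"], ["cat"]])` raises `PipelineError` even though the last command succeeds, which is the case a bash one-liner silently accepts. Note that this sketch also treats an upstream command killed by SIGPIPE as a failure, which may or may not be desirable.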
For comparison, this is what the change looks like in other modules: https://gitlab.softwareheritage.org/-/snippets/1561 ; and it enables this change: https://gitlab.softwareheritage.org/-/snippets/1562
Edited by vlorentz