Draft: Add a learning/replaying proxy for loaders and listers
In order to make our test suite more reproducible, we configure a proxy for the loaders and listers. It can work in two modes: learning and replaying. In learning mode, HTTP(S) requests will be forwarded to their corresponding servers while recording the exchange. In replaying mode, the record will be used to replay the same exchanges (while denying any other requests), without reaching out to the wider Internet.
The idea is that the test suite should be run on Jenkins only in replay mode, so it won’t be affected by network or service issues.
The implementation uses mitmproxy, run in a new proxy Docker service. mitmproxy is able to intercept HTTPS requests and generate new X.509 certificates on the fly. The certificate of the authority signing these generated certificates is installed as trusted in our swh/stack image.
Network records are written to assets/mitmproxy/swh-tests. A placeholder file is added so that this otherwise empty directory gets created.
Usage of the proxy is fully contained in docker-compose.proxy.yml, which is meant to be used as an override. Changing the mode of the proxy is done by changing the command stanza from learn to replay and vice versa.
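To illustrate what the two modes could map to, here is a minimal sketch of a wrapper that turns the mode name into a mitmdump command line. The function name and the record path are hypothetical and not taken from this merge request. The sketch assumes the -w and --server-replay flags of mitmdump and the server_replay_kill_extra option, which newer mitmproxy releases may spell differently.

```python
from typing import List


def mitmdump_args(mode: str, record_path: str) -> List[str]:
    """Build a hypothetical mitmdump command line for the given proxy mode."""
    if mode == "learn":
        # Forward requests to the real servers and write each flow to the record.
        return ["mitmdump", "-w", record_path]
    if mode == "replay":
        # Answer from the record only; requests without a recorded response
        # are killed instead of being forwarded to the Internet.
        return [
            "mitmdump",
            "--server-replay", record_path,
            "--set", "server_replay_kill_extra=true",
        ]
    raise ValueError(f"unknown proxy mode: {mode!r}")


if __name__ == "__main__":
    # The record path below is a placeholder, not the actual file name.
    print(" ".join(mitmdump_args("replay", "/records/example.flows")))
```

With such a wrapper, switching modes only requires changing one word in the service command, which matches the learn/replay toggle described above.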
Variables in env/proxy.env are set in the swh-loader and swh-lister containers. Internal hosts should be added to the no_proxy variable.
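As a rough illustration of how a no_proxy list is interpreted, the sketch below uses the standard library's urllib.request.proxy_bypass_environment. The host names are hypothetical examples, and HTTP libraries used by the loaders and listers may apply slightly different matching rules.

```python
from urllib.request import proxy_bypass_environment


def bypasses_proxy(host: str, no_proxy: str) -> bool:
    """Tell whether urllib would skip the proxy for host, given a no_proxy value."""
    # urllib stores the no_proxy setting under the "no" key of its proxies mapping.
    return bool(proxy_bypass_environment(host, proxies={"no": no_proxy}))


if __name__ == "__main__":
    # Hypothetical internal service names; use the ones from the compose file.
    internal = "swh-storage,swh-scheduler"
    print(bypasses_proxy("swh-storage", internal))  # internal host, proxy skipped
    print(bypasses_proxy("example.org", internal))  # external host, goes through proxy
```

An internal host missing from the list would be sent through the proxy and, in replay mode, denied unless it was recorded.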
We also have to set PYTHONHASHSEED to a fixed value in order to (at least) have swh-loader-git send the same network requests every time.
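The effect of a fixed PYTHONHASHSEED can be checked with a small sketch like the one below. It only shows that string hashes become stable across interpreter runs. That this is what makes the requests of swh-loader-git reproducible (for example through set iteration order) is an assumption, not something verified here.

```python
import os
import subprocess
import sys


def string_hash_in_child(seed: str) -> str:
    """Run a fresh interpreter with PYTHONHASHSEED=seed and return hash('swh')."""
    env = dict(os.environ, PYTHONHASHSEED=seed)
    result = subprocess.run(
        [sys.executable, "-c", "print(hash('swh'))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # With a fixed seed, two separate runs agree on the hash value.
    print(string_hash_in_child("0") == string_hash_in_child("0"))
```

Without a fixed seed, Python randomizes string hashes per process, so anything derived from the iteration order of a set of strings may differ from one run to the next.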