Improve 'swh db' commands for easier usage in test environments and better consistency
Warning: I still have revisions to push in this MR to fit the description below. Update: this should be OK now.
The main idea is to make it easy to manage the whole lifecycle of swh.core.db-based backends, taking their configuration from the config file.
Using this configuration file:
$ cat conf.yml
storage:
  cls: pipeline
  steps:
    - cls: masking
      masking_db: postgresql:///?service=swh-masking-proxy
    - cls: buffer
    - cls: postgresql
      db: postgresql://user:passwd@pghost:5433/swh-storage
      objstorage:
        cls: memory
scheduler:
  cls: postgresql
  db: postgresql:///?service=swh-scheduler
For each swh db command, the main argument (previously only the backend 'module') can be:
- a (swh) package name without the --all option: this is the backward-compatibility mode, in which the config entry is looked up in the config file using the current logic (in particular, the last entry of a pipeline is used, if any),
- a (swh) package name with the --all option: run the command for all the backends found in the config file under the package section,
- the actual backend (using the syntax <package>:<cls>, e.g. storage:masking),
- a "path" targeting the config entry (from the config file) that should be used, like storage.steps.0 (see the config example above),
- when the --dbname option is used, the config file is not used at all: the value is an explicit db connection (libpq) string.
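To make the lookup modes above concrete, here is a minimal Python sketch of two of them: following a dotted "path" such as storage.steps.0, and the backward-compatible "last entry of a pipeline" rule. The helper names (resolve_path, last_pipeline_step) are hypothetical and this is not the actual swh.core implementation. The config dict mirrors the conf.yml example, with objstorage assumed to be nested under the postgresql step.

```python
from typing import Any

# Mirror of the conf.yml example as a plain dict (no YAML parser needed).
# Assumption: "objstorage" belongs to the postgresql step of the pipeline.
CONFIG = {
    "storage": {
        "cls": "pipeline",
        "steps": [
            {
                "cls": "masking",
                "masking_db": "postgresql:///?service=swh-masking-proxy",
            },
            {"cls": "buffer"},
            {
                "cls": "postgresql",
                "db": "postgresql://user:passwd@pghost:5433/swh-storage",
                "objstorage": {"cls": "memory"},
            },
        ],
    },
    "scheduler": {
        "cls": "postgresql",
        "db": "postgresql:///?service=swh-scheduler",
    },
}


def resolve_path(config: Any, path: str) -> Any:
    """Follow a dotted path through nested dicts and lists (hypothetical helper).

    List items are addressed by their integer index, e.g. "storage.steps.0".
    """
    node = config
    for part in path.split("."):
        if isinstance(node, list):
            node = node[int(part)]
        else:
            node = node[part]
    return node


def last_pipeline_step(config: dict, package: str) -> dict:
    """Backward-compatible lookup (hypothetical helper).

    If the package entry is a pipeline, use its last step; otherwise use
    the entry itself.
    """
    entry = config[package]
    if entry.get("cls") == "pipeline":
        return entry["steps"][-1]
    return entry
```

With this sketch, resolve_path(CONFIG, "storage.steps.0") selects the masking entry, while last_pipeline_step(CONFIG, "storage") selects the postgresql entry, which is what a bare package name is described as doing in the backward-compatibility mode.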
Examples:
$ export SWH_CONFIG_FILENAME=conf.yml
$ swh db init -p storage.steps.0
$ swh db init --all storage
$ swh db init --dbname postgresql:///?service=swh-scrubber scrubber # this has no entry in the config file
$ swh db init --dbname postgresql:///?service=masking swh:masking # this does not look into the config file
Or
$ swh db version -a storage
module: storage:masking
current code version: 194
version: 194
module: storage:postgresql
flavor: default
current code version: 193
version: 193
This should help deployment in integration testing environments (like docker), especially in cases like the example above, where the storage consists of several PostgreSQL-backed layers.
It de facto deprecates the usage of the --module-config-key option.
This MR also changes the way the dbmodule table is managed. We used to store either only the package name (e.g. 'storage' for the postgresql backend of swh.storage), with some implicit business logic, or the module name (e.g. 'storage.proxies.masking', without the swh. prefix). We now store the backend as <package>:<cls>, where <package> is the swh.<package> in which the backend is defined and <cls> is the value of the cls config entry for said backend. This generalizes the idea of declaring these cls in swh.<package>.classes entry points.
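As an illustration of the <package>:<cls> naming, the following Python sketch builds and parses such identifiers and looks up the matching entry point in the swh.<package>.classes group. The names BackendId, parse_backend_id and find_entry_point are hypothetical, not part of swh.core. The lookup uses the standard importlib.metadata.entry_points(group=...) call, which assumes Python 3.10 or later.

```python
from importlib.metadata import entry_points
from typing import NamedTuple


class BackendId(NamedTuple):
    """A backend identifier of the form <package>:<cls> (hypothetical type)."""

    package: str
    cls: str

    def __str__(self) -> str:
        return f"{self.package}:{self.cls}"


def parse_backend_id(value: str) -> BackendId:
    """Split 'storage:masking' into its package and cls parts."""
    package, sep, cls = value.partition(":")
    if not sep or not package or not cls:
        raise ValueError(f"expected '<package>:<cls>', got {value!r}")
    return BackendId(package, cls)


def find_entry_point(backend: BackendId):
    """Return the entry point declared for this backend, or None.

    Looks in the 'swh.<package>.classes' group for an entry named <cls>.
    Requires Python 3.10+ for the group= keyword.
    """
    group = f"swh.{backend.package}.classes"
    for ep in entry_points(group=group):
        if ep.name == backend.cls:
            return ep
    return None
```

For example, parse_backend_id("storage:masking") yields the package "storage" and the cls "masking", and converting the result back with str() gives the value that would be stored in the dbmodule table under the new scheme.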
The swh db upgrade command should take care of updating the dbmodule table accordingly.