Implementation of recovery bundles

Recovery bundles are now created with everything needed to restore what will be deleted from the archive. This comes with tooling to restore a bundle, extract its content, display information about it, perform key rollovers, and run remote operations.

This merge request depends on !2 (merged).

Operations are done using the command line group swh alter recovery-bundle.

Dependency updates

As we want to save SkippedContent objects before removal, the dependency on swh-storage is updated to at least version 1.16, which introduces the required skipped_content_find() method.

Encryption is done using the rage command line tool, a Rust implementation of the age encryption system. See: https://github.com/str4d/rage

Optional support for YubiKeys is provided via age-plugin-yubikey. See: https://github.com/str4d/age-plugin-yubikey#installation

Using rage (implemented in Rust) instead of age (the reference implementation in Go) is a matter of convenience. As age-plugin-yubikey is implemented in Rust, requiring only one extra development platform instead of two felt more reasonable. All the required tools (rage, rage-keygen, age-plugin-yubikey) are also made available by their authors as pre-built, self-contained executables for common operating systems.

Secret sharing is implemented in Python by the shamir-mnemonic package.

Some implementation details

swh alter remove now requires the --identifier and --recovery-bundle options. The former specifies our identifier for the removal operation, and the latter the path of the recovery bundle. Both are mandatory because our recovery bundles require an identifier, and no removal should be made without a recovery bundle.

For each recovery bundle, we generate a new keypair. The public key is discarded after the bundle has been created, while the secret key is split into shares using secret sharing. To avoid feeding shamir-mnemonic unnecessary data, we first decode the age secret key from Bech32 (see BIP-0173 for details). The Python reference implementation of the encoding by Pieter Wuille has been trimmed and added in a dedicated file. It felt short, specific and stable enough to avoid adding a new dependency.
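To illustrate the Bech32 step, here is a minimal sketch of a BIP-0173 codec in Python. It is not the trimmed file added by this merge request. The function names are hypothetical, and the "age-secret-key-" human-readable part is an assumption based on how age prints secret keys (in upper case, as AGE-SECRET-KEY-1...).

```python
"""Minimal Bech32 codec (BIP-0173). Illustrative sketch, hypothetical names."""

CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"
GENERATOR = (0x3B6A57B2, 0x26508E6D, 0x1EA119FA, 0x3D4233DD, 0x2A1462B3)


def _polymod(values):
    """Compute the Bech32 checksum polynomial over 5-bit values."""
    chk = 1
    for value in values:
        top = chk >> 25
        chk = ((chk & 0x1FFFFFF) << 5) ^ value
        for i, gen in enumerate(GENERATOR):
            if (top >> i) & 1:
                chk ^= gen
    return chk


def _hrp_expand(hrp):
    """Expand the human-readable part for checksum computation."""
    return [ord(c) >> 5 for c in hrp] + [0] + [ord(c) & 31 for c in hrp]


def convertbits(data, frombits, tobits, pad):
    """Regroup a sequence of frombits-wide integers into tobits-wide ones."""
    acc, bits, out = 0, 0, []
    maxv = (1 << tobits) - 1
    for value in data:
        acc = (acc << frombits) | value
        bits += frombits
        while bits >= tobits:
            bits -= tobits
            out.append((acc >> bits) & maxv)
        acc &= (1 << bits) - 1  # keep only the bits not yet emitted
    if pad and bits:
        out.append((acc << (tobits - bits)) & maxv)
    elif not pad and (bits >= frombits or acc):
        raise ValueError("invalid padding")
    return out


def bech32_encode(hrp, payload):
    """Encode raw bytes under a lower-case human-readable part."""
    data = convertbits(payload, 8, 5, pad=True)
    polymod = _polymod(_hrp_expand(hrp) + data + [0] * 6) ^ 1
    checksum = [(polymod >> 5 * (5 - i)) & 31 for i in range(6)]
    return hrp + "1" + "".join(CHARSET[d] for d in data + checksum)


def bech32_decode(text):
    """Return (hrp, raw bytes), raising ValueError on malformed input."""
    if text.lower() != text and text.upper() != text:
        raise ValueError("mixed case")
    text = text.lower()
    pos = text.rfind("1")
    if pos < 1 or pos + 7 > len(text):
        raise ValueError("separator misplaced")
    hrp, encoded = text[:pos], text[pos + 1:]
    if any(c not in CHARSET for c in encoded):
        raise ValueError("invalid character")
    data = [CHARSET.index(c) for c in encoded]
    if _polymod(_hrp_expand(hrp) + data) != 1:
        raise ValueError("bad checksum")
    return hrp, bytes(convertbits(data[:-6], 5, 8, pad=False))


if __name__ == "__main__":
    # Assumption: 32 raw key bytes, shown in upper case as age does.
    secret = bytes(range(32))
    printable = bech32_encode("age-secret-key-", secret).upper()
    hrp, raw = bech32_decode(printable)
    print(len(printable), "characters ->", len(raw), "bytes")
```

Decoding first means that only the raw key bytes are handed to the secret sharing step, instead of the longer printable string with its prefix and checksum.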

The keys required to decrypt shared secrets can be stored in age identity files or on YubiKeys (using age-plugin-yubikey). With YubiKeys, we can recreate the needed identity files on the fly (using age-plugin-yubikey --identity). In that case, users do not need to keep any files other than the bundles themselves. We do require a specific format (e.g. “YubiKey serial 1234567 slot 8”) for the identifiers of YubiKeys, as encrypted payloads created for plain identity files and for YubiKeys are otherwise indistinguishable.
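As a rough sketch of how such an identifier could be recognized, the snippet below parses the example format with a regular expression. The names YubiKeyRef and parse_yubikey_identifier are hypothetical, and the exact pattern accepted by the actual code may differ.

```python
import re
from typing import NamedTuple, Optional


class YubiKeyRef(NamedTuple):
    """Serial number and slot extracted from a share holder identifier."""

    serial: int
    slot: int


# Assumption: the identifier matches the example format exactly.
_YUBIKEY_ID = re.compile(r"YubiKey serial (\d+) slot (\d+)")


def parse_yubikey_identifier(identifier: str) -> Optional[YubiKeyRef]:
    """Return the YubiKey reference, or None for a plain identity file."""
    match = _YUBIKEY_ID.fullmatch(identifier)
    if match is None:
        return None
    return YubiKeyRef(serial=int(match.group(1)), slot=int(match.group(2)))


if __name__ == "__main__":
    print(parse_yubikey_identifier("YubiKey serial 1234567 slot 8"))
    print(parse_yubikey_identifier("Alice"))
```

Any identifier that does not match would be treated as referring to a plain age identity file.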

The Content objects we used to create for our tests were more bogus than they should have been. Their sha1_git hash did not match their SWHID, which prevented retrieving them from the storage once added. This is now fixed. We also added some assertions in sample_populated_storage to ensure that objects are indeed added to the storage before the fixture gets used.
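For context on why the two must agree: the SWHID of a content object embeds its sha1_git, which is the SHA-1 of the data prefixed with a Git blob header. The sketch below uses only hashlib, and the helper names are hypothetical rather than taken from the test suite.

```python
import hashlib


def sha1_git(data: bytes) -> str:
    """Hash data the way Git hashes a blob: SHA-1 over a header plus the data."""
    header = b"blob %d\x00" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def content_swhid(data: bytes) -> str:
    """Build the core SWHID of a content object from its sha1_git."""
    return f"swh:1:cnt:{sha1_git(data)}"


if __name__ == "__main__":
    print(content_swhid(b"some test content\n"))
```

A test fixture that sets sha1_git to an arbitrary value therefore ends up with objects that cannot be looked up by their SWHID.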

Edited by Jérémy Bobbio (Lunar)