template: Add graph for deployment in kubernetes clusters
After review and further discussion, the following changes were made:
- Only reindex when the graph.ef file is missing
- Allow rpc to hit another grpc instance
- Normalize volume dataset and graph paths (instead of key configuration)
- Make the job able to prepare the in-memory volume directly
- Simplify the extraVolumes section (no need to spell out the persistent volume definition)
- Adapt for graph v6.0 (done since I had to adapt some instructions in Dockerfile)
This adds the graph template to the swh charts to deploy graph rpc/grpc instances.
It does not take care of the compression pipeline, which is out of scope for this MR.
Specification: https://hedgedoc.softwareheritage.org/mOk8M7znToWeZoMTiMb1Gw?both
Summary
This MR declares the following graph instances per cluster:
- local-cluster:
  - 1 graph with the tiny 'test' example dataset (1 persistent volume)
  - 1 graph with the python3k dataset [~30g] (1 persistent volume)
  - 1 graph with the tiny 'test' example dataset & the same setup as the future production (1 in-memory & 1 persistent volume)
- next-version cluster: 1 graph with the tiny 'test' example dataset & the same setup as the future production (1 in-memory & 1 persistent volume)
Configuration
To initialize datasets, a configuration key "dataset" can be set per graph instance. When set, it triggers a job that fetches the graph dataset and installs it in a persistent volume if the volume is not already populated [4]; otherwise the job does nothing [3].
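The "fetch once, then skip" behaviour described above can be sketched as follows. This is only an illustration run in a throwaway directory: the function name fetch_dataset_once is hypothetical, and the mkdir command stands in for the real fetch step (git clone or swh graph download). The witness file name matches the one visible in the generated deployment further down.

```shell
#!/usr/bin/env bash
# Sketch only: fetch_dataset_once is a hypothetical helper, not part of the chart.
set -eu

# Run the given fetch command only when the witness file is absent,
# then create the witness file so later runs become no-ops.
fetch_dataset_once() {
    local witness_file="$1"
    shift
    if [ -f "${witness_file}" ]; then
        echo "Dataset already present. Skip."
        return 0
    fi
    # "$@" stands in for the real fetch step; under `set -e` a failing
    # fetch aborts the script before the witness file is created
    "$@"
    touch "${witness_file}"
}

# Demo in a temporary directory (assumption: no real dataset is involved)
demo_dir="$(mktemp -d)"
witness="${demo_dir}/.graph-is-initialized"

# First run: the stand-in fetch is executed and the witness file is created
fetch_dataset_once "${witness}" mkdir -p "${demo_dir}/compressed"

# Second run: the witness file exists, so the command (`false`) is never run
fetch_dataset_once "${witness}" false
```

Creating the witness file only after a successful fetch means a job interrupted mid-download is retried from scratch on the next run rather than being treated as complete.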
The rpc (or grpc) service is enabled with the key "startService". When true, the service starts once the dataset is ready and waits until then.
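The waiting behaviour amounts to polling for the witness file. A minimal sketch, assuming a hypothetical helper wait_for_witness; the max_tries bound is an addition for the demo only (the script shipped in this MR loops until the file appears), and the background subshell simulates the fetch job finishing.

```shell
#!/usr/bin/env bash
# Sketch only: wait_for_witness and max_tries are hypothetical, for illustration.
set -eu

# Poll every `period` seconds until the witness file exists,
# giving up after `max_tries` unsuccessful checks.
wait_for_witness() {
    local witness_file="$1" period="$2" max_tries="$3"
    local tries=0
    while [ ! -f "${witness_file}" ]; do
        tries=$((tries + 1))
        if [ "${tries}" -gt "${max_tries}" ]; then
            echo "Gave up waiting for ${witness_file}" >&2
            return 1
        fi
        echo "${witness_file} not present, wait for it to start the graph..."
        sleep "${period}"
    done
}

demo_dir="$(mktemp -d)"
witness="${demo_dir}/.graph-is-initialized"

# Simulate the fetch job completing about one second later
( sleep 1; touch "${witness}" ) &

wait_for_witness "${witness}" 1 10
echo "Dataset ready, the service can start."
wait
```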
When the (g)rpc service is started, depending on the key "prepareMemoryVolume" (and the "extraVolumes" setup), the memory volume is prepared with the *.graph files (copied from the persistent volume) and symlinks to the other files (linked from the persistent volume).
If the volume is not to be prepared (prepareMemoryVolume set to false, as for small to medium datasets retrieved from s3), the graph starts using the dataset files present in the persistent volume (files are assumed consistent and complete).
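The layout of a prepared memory volume can be illustrated with a small sketch on temporary directories. The function name prepare_memory_volume and the empty placeholder files are assumptions for the demo. It uses rm followed by cp rather than GNU cp --remove-destination so that it also runs where that flag is unavailable.

```shell
#!/usr/bin/env bash
# Sketch only: prepare_memory_volume is a hypothetical helper using placeholder files.
set -eu

# Symlink every dataset file into the destination, then replace the two
# *.graph entries with real copies.
prepare_memory_volume() {
    local src="$1" dst="$2" graph_name="$3"
    local f
    mkdir -p "${dst}"
    ln -sf "${src}"/* "${dst}/"
    for f in "${graph_name}.graph" "${graph_name}-transposed.graph"; do
        # Remove the symlink first: copying onto it would target the source file
        rm -f "${dst}/${f}"
        cp "${src}/${f}" "${dst}/${f}"
    done
}

work="$(mktemp -d)"
mkdir -p "${work}/dataset"
# Empty placeholder files standing in for a real compressed graph
touch "${work}/dataset/example.graph" \
      "${work}/dataset/example-transposed.graph" \
      "${work}/dataset/example.stats"

prepare_memory_volume "${work}/dataset" "${work}/memory" example

# example.stats shows up as a symlink, the two *.graph files as regular files
ls -l "${work}/memory"
```

Only the *.graph files occupy space in the memory volume; everything else is still read from the persistent volume through the symlinks.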
Tests status
- local-cluster: ok (pod crash, pod restart, remove pod, ...)
- next-version: ok [2]
Next step
Once merged, declare the graph instance for staging (probably with the python3k dataset).
Impacts
The webapp instances have so far all been implicitly configured to use the granet instance. One of the early commits of this MR makes that configuration explicit; it will be adapted when new graph instances are deployed in their respective clusters.
[3]
2024-08-28T13:39:37.852948616Z Dataset <2021-03-23-popular-3k-python> already present. Skip.
2024-08-28T13:39:37.852957652Z + echo 'Dataset <2021-03-23-popular-3k-python> already present. Skip.'
2024-08-28T13:39:37.852959776Z + exit 0
[4] Implementation detail (the tiny dataset is cloned out of the swh-graph repository, other datasets are retrieved through the swh graph download cli)
(I've dropped the test sample to demo what I've described so it goes faster but it'd go the same way with another dataset [tested already, no cli output though] ;)
2024-08-28T13:59:00.749342044Z + '[' -d /srv/graph/test/compressed ']'
2024-08-28T13:59:00.749363664Z + case "${DATASET_NAME}" in
2024-08-28T13:59:00.749365257Z + git clone --depth 1 https://gitlab.softwareheritage.org/swh/devel/swh-graph.git/ /tmp/swh-graph
2024-08-28T13:59:00.750556653Z Cloning into '/tmp/swh-graph'...
2024-08-28T13:59:01.264259345Z + mkdir -p /srv/graph/test/compressed
2024-08-28T13:59:01.266005009Z + rmdir /srv/graph/test/compressed
2024-08-28T13:59:01.267289847Z + cp -r /tmp/swh-graph/swh/graph/example_dataset/compressed /srv/graph/test/compressed
[1]
swh@graph-rpc-example-569955c457-dj4sw:~$ curl http://localhost:5009
<html>
<head><title>Software Heritage graph server</title></head>
<body>
<p>You have reached the <a href="https://www.softwareheritage.org/">
Software Heritage</a> graph API server.</p>
<p>See its
<a href="https://docs.softwareheritage.org/devel/swh-graph/api.html">API
documentation</a> for more information.</p>
</body>
helm diff
[swh] Comparing changes between branches production and mr/deploy-graph-in-kube (per environment)...
Your branch is up to date with 'origin/production'.
[swh] Generate config in production branch for environment staging, namespace swh...
[swh] Generate config in production branch for environment staging, namespace swh-cassandra...
[swh] Generate config in production branch for environment staging, namespace swh-cassandra-next-version...
Your branch is ahead of 'origin/staging' by 37 commits.
(use "git push" to publish your local commits)
[swh] Generate config in mr/deploy-graph-in-kube branch for environment staging...
[swh] Generate config in mr/deploy-graph-in-kube branch for environment staging...
[swh] Generate config in mr/deploy-graph-in-kube branch for environment staging...
Your branch is up to date with 'origin/production'.
[swh] Generate config in production branch for environment production, namespace swh...
[swh] Generate config in production branch for environment production, namespace swh-cassandra...
[swh] Generate config in production branch for environment production, namespace swh-cassandra-next-version...
Your branch is ahead of 'origin/staging' by 37 commits.
(use "git push" to publish your local commits)
[swh] Generate config in mr/deploy-graph-in-kube branch for environment production...
[swh] Generate config in mr/deploy-graph-in-kube branch for environment production...
[swh] Generate config in mr/deploy-graph-in-kube branch for environment production...
------------- diff for environment staging namespace swh -------------
_ __ __
_| |_ _ / _|/ _| between /tmp/swh-chart.swh.W2ixmoFC/staging-swh.before, 139 documents
/ _' | | | | |_| |_ and /tmp/swh-chart.swh.W2ixmoFC/staging-swh.after, 139 documents
| (_| | |_| | _| _|
\__,_|\__, |_| |_| returned 15 differences
|___/
data (v1/ConfigMap/swh/backend-utils)
+ three map entries added:
graph-prepare-memory-volume.sh: |
#!/usr/bin/env bash
# Uses env variables WITNESS_FILE, DATASET_SOURCE, DATASET_LOCATION, GRAPH_NAME
set -eux
[ -z "${DATASET_LOCATION}" ] && \
echo "<DATASET_LOCATION> env variable must be set" && exit 1
[ -z "${DATASET_SOURCE}" ] && \
echo "<DATASET_SOURCE> env variable must be set" && exit 1
[ -z "${WITNESS_FILE}" ] && \
echo "<WITNESS_FILE> env variable must be set" && exit 1
[ -z "${WITNESS_SOURCE_FILE}" ] && \
echo "<WITNESS_SOURCE_FILE> env variable must be set" && exit 1
[ -z "${PERIOD}" ] && \
echo "<PERIOD> env variable must be set" && exit 1
[ -z "${GRAPH_NAME}" ] && \
echo "<GRAPH_NAME> env variable must be set" && exit 1
[ -f ${WITNESS_FILE} ] && echo "Graph ready, do nothing." && exit 0
while [ ! -f ${WITNESS_SOURCE_FILE} ]; do
echo "${WITNESS_SOURCE_FILE} not present, wait for it to prepare the graph dataset..."
sleep $PERIOD
done
# Create empty dataset location destination for copy to be ok
mkdir -p ${DATASET_LOCATION}
graph_stats=${GRAPH_NAME}.stats
# Symlink all files from dataset source to the destination (including the *.graph)
[ -L "${DATASET_LOCATION}/${graph_stats}" ] || \
ln -sf ${DATASET_SOURCE}/* ${DATASET_LOCATION}/
graph_name=${GRAPH_NAME}.graph
# We hard-copy the *.graph file
if [ -L "${DATASET_LOCATION}/${graph_name}" ] || ! [ -f ${DATASET_LOCATION}/${graph_name} ]; then
cp -v --remove-destination ${DATASET_SOURCE}/${graph_name} ${DATASET_LOCATION}/;
fi
graph_transposed_name=${GRAPH_NAME}-transposed.graph
if [ -L ${DATASET_LOCATION}/${graph_transposed_name} ] || ! [ -f ${DATASET_LOCATION}/${graph_transposed_name} ]; then
cp -v --remove-destination ${DATASET_SOURCE}/${graph_transposed_name} ${DATASET_LOCATION}/;
fi
# Finally, we make explicit the graph is ready
touch ${WITNESS_FILE}
graph-wait-for-dataset.sh: |
#!/usr/bin/env bash
# Uses env variables WITNESS_FILE
[ -z "${WITNESS_FILE}" ] && \
echo "<WITNESS_FILE> env variable must be set" && exit 1
while [ ! -f ${WITNESS_FILE} ]; do
echo "${WITNESS_FILE} not present, wait for it to start the graph..."
sleep $PERIOD
done
graph-fetch-dataset.sh: |
#!/usr/bin/env bash
# Uses env variables WITNESS_FILE, DATASET_LOCATION, DATASET_NAME, GRAPH_NAME
[ -z "${DATASET_LOCATION}" ] && \
echo "<DATASET_LOCATION> env variable must be set" && exit 1
[ -z "${DATASET_NAME}" ] && \
echo "<DATASET_NAME> env variable must be set" && exit 1
[ -z "${WITNESS_FILE}" ] && \
echo "<WITNESS_FILE> env variable must be set" && exit 1
[ -z "${GRAPH_NAME}" ] && \
echo "<GRAPH_NAME> env variable must be set" && exit 1
set -eux
[ -f ${WITNESS_FILE} ] && \
echo "Dataset <${DATASET_NAME}> already present. Skip." && \
exit 0
case "${DATASET_NAME}" in
test|example)
# For test (or example) dataset sample, clone the source repository of
# swh.graph and use the example dataset within
git clone \
--depth 1 \
https://gitlab.softwareheritage.org/swh/devel/swh-graph.git/ \
/tmp/swh-graph
# Create empty dataset location destination for copy to be ok
mkdir -p ${DATASET_LOCATION} && rmdir ${DATASET_LOCATION}
# Actual copy of the test dataset
cp -r /tmp/swh-graph/swh/graph/example_dataset/compressed \
${DATASET_LOCATION}
# Make explicit the graph is ready
touch ${WITNESS_FILE}
;;
*)
# Otherwise, download the dataset locally
swh graph download \
--name ${DATASET_NAME} \
${DATASET_LOCATION}
# Reindex graph dataset (for those anterior to 2024). This should not be
# necessary for most recent graph datasets.
# For old datasets missing a .ef though, this just fails with
# `2024-09-02T14:11:56.190692004Z graph-rpc-python3k 0: Cannot map
# Elias-Fano pointer list .../graph.ef`, so we trigger a reindex step
reindex_witness_file=${DATASET_LOCATION}/${GRAPH_NAME}.ef
[ ! -f $reindex_witness_file ] && \
swh graph reindex ${DATASET_LOCATION}/${GRAPH_NAME}
# Make explicit the graph is ready
touch ${WITNESS_FILE}
;;
esac
spec.template.metadata.annotations.checksum/config_utils (apps/v1/Deployment/swh/storage-replayer-content)
± value change
- 054addbf40d1e68df86e403b5f07088bde1d888a1525118fad302837b54edec6
+ 3c9e1487729c0e9273fa3383e6022842b87e8decf8f6c77c145d36d0ba551213
spec.template.metadata.annotations.checksum/config_utils (apps/v1/Deployment/swh/storage-replayer-directory)
± value change
- 054addbf40d1e68df86e403b5f07088bde1d888a1525118fad302837b54edec6
+ 3c9e1487729c0e9273fa3383e6022842b87e8decf8f6c77c145d36d0ba551213
spec.template.metadata.annotations.checksum/config_utils (apps/v1/Deployment/swh/storage-replayer-extid)
± value change
- 054addbf40d1e68df86e403b5f07088bde1d888a1525118fad302837b54edec6
+ 3c9e1487729c0e9273fa3383e6022842b87e8decf8f6c77c145d36d0ba551213
spec.template.metadata.annotations.checksum/config_utils (apps/v1/Deployment/swh/storage-replayer-metadata)
± value change
- 054addbf40d1e68df86e403b5f07088bde1d888a1525118fad302837b54edec6
+ 3c9e1487729c0e9273fa3383e6022842b87e8decf8f6c77c145d36d0ba551213
spec.template.metadata.annotations.checksum/config_utils (apps/v1/Deployment/swh/storage-replayer-origin)
± value change
- 054addbf40d1e68df86e403b5f07088bde1d888a1525118fad302837b54edec6
+ 3c9e1487729c0e9273fa3383e6022842b87e8decf8f6c77c145d36d0ba551213
spec.template.metadata.annotations.checksum/config_utils (apps/v1/Deployment/swh/storage-replayer-origin-visit)
± value change
- 054addbf40d1e68df86e403b5f07088bde1d888a1525118fad302837b54edec6
+ 3c9e1487729c0e9273fa3383e6022842b87e8decf8f6c77c145d36d0ba551213
spec.template.metadata.annotations.checksum/config_utils (apps/v1/Deployment/swh/storage-replayer-origin-visit-status)
± value change
- 054addbf40d1e68df86e403b5f07088bde1d888a1525118fad302837b54edec6
+ 3c9e1487729c0e9273fa3383e6022842b87e8decf8f6c77c145d36d0ba551213
spec.template.metadata.annotations.checksum/config_utils (apps/v1/Deployment/swh/storage-replayer-raw-extrinsic-metadata)
± value change
- 054addbf40d1e68df86e403b5f07088bde1d888a1525118fad302837b54edec6
+ 3c9e1487729c0e9273fa3383e6022842b87e8decf8f6c77c145d36d0ba551213
spec.template.metadata.annotations.checksum/config_utils (apps/v1/Deployment/swh/storage-replayer-release)
± value change
- 054addbf40d1e68df86e403b5f07088bde1d888a1525118fad302837b54edec6
+ 3c9e1487729c0e9273fa3383e6022842b87e8decf8f6c77c145d36d0ba551213
spec.template.metadata.annotations.checksum/config_utils (apps/v1/Deployment/swh/storage-replayer-revision)
± value change
- 054addbf40d1e68df86e403b5f07088bde1d888a1525118fad302837b54edec6
+ 3c9e1487729c0e9273fa3383e6022842b87e8decf8f6c77c145d36d0ba551213
spec.template.metadata.annotations.checksum/config_utils (apps/v1/Deployment/swh/storage-replayer-skipped-content)
± value change
- 054addbf40d1e68df86e403b5f07088bde1d888a1525118fad302837b54edec6
+ 3c9e1487729c0e9273fa3383e6022842b87e8decf8f6c77c145d36d0ba551213
spec.template.metadata.annotations.checksum/config_utils (apps/v1/Deployment/swh/storage-replayer-snapshot)
± value change
- 054addbf40d1e68df86e403b5f07088bde1d888a1525118fad302837b54edec6
+ 3c9e1487729c0e9273fa3383e6022842b87e8decf8f6c77c145d36d0ba551213
spec.template.metadata.annotations.checksum/backend-utils (apps/v1/Deployment/swh/storage-postgresql-read-only)
± value change
- 054addbf40d1e68df86e403b5f07088bde1d888a1525118fad302837b54edec6
+ 3c9e1487729c0e9273fa3383e6022842b87e8decf8f6c77c145d36d0ba551213
spec.template.metadata.annotations.checksum/backend-utils (apps/v1/Deployment/swh/storage-postgresql-read-write)
± value change
- 054addbf40d1e68df86e403b5f07088bde1d888a1525118fad302837b54edec6
+ 3c9e1487729c0e9273fa3383e6022842b87e8decf8f6c77c145d36d0ba551213
------------- diff for environment staging namespace swh-cassandra -------------
_ __ __
_| |_ _ / _|/ _| between /tmp/swh-chart.swh.W2ixmoFC/staging-swh-cassandra.before, 438 documents
/ _' | | | | |_| |_ and /tmp/swh-chart.swh.W2ixmoFC/staging-swh-cassandra.after, 438 documents
| (_| | |_| | _| _|
\__,_|\__, |_| |_| returned 17 differences
|___/
data (v1/ConfigMap/swh-cassandra/backend-utils)
+ three map entries added:
graph-prepare-memory-volume.sh: |
#!/usr/bin/env bash
# Uses env variables WITNESS_FILE, DATASET_SOURCE, DATASET_LOCATION, GRAPH_NAME
set -eux
[ -z "${DATASET_LOCATION}" ] && \
echo "<DATASET_LOCATION> env variable must be set" && exit 1
[ -z "${DATASET_SOURCE}" ] && \
echo "<DATASET_SOURCE> env variable must be set" && exit 1
[ -z "${WITNESS_FILE}" ] && \
echo "<WITNESS_FILE> env variable must be set" && exit 1
[ -z "${WITNESS_SOURCE_FILE}" ] && \
echo "<WITNESS_SOURCE_FILE> env variable must be set" && exit 1
[ -z "${PERIOD}" ] && \
echo "<PERIOD> env variable must be set" && exit 1
[ -z "${GRAPH_NAME}" ] && \
echo "<GRAPH_NAME> env variable must be set" && exit 1
[ -f ${WITNESS_FILE} ] && echo "Graph ready, do nothing." && exit 0
while [ ! -f ${WITNESS_SOURCE_FILE} ]; do
echo "${WITNESS_SOURCE_FILE} not present, wait for it to prepare the graph dataset..."
sleep $PERIOD
done
# Create empty dataset location destination for copy to be ok
mkdir -p ${DATASET_LOCATION}
graph_stats=${GRAPH_NAME}.stats
# Symlink all files from dataset source to the destination (including the *.graph)
[ -L "${DATASET_LOCATION}/${graph_stats}" ] || \
ln -sf ${DATASET_SOURCE}/* ${DATASET_LOCATION}/
graph_name=${GRAPH_NAME}.graph
# We hard-copy the *.graph file
if [ -L "${DATASET_LOCATION}/${graph_name}" ] || ! [ -f ${DATASET_LOCATION}/${graph_name} ]; then
cp -v --remove-destination ${DATASET_SOURCE}/${graph_name} ${DATASET_LOCATION}/;
fi
graph_transposed_name=${GRAPH_NAME}-transposed.graph
if [ -L ${DATASET_LOCATION}/${graph_transposed_name} ] || ! [ -f ${DATASET_LOCATION}/${graph_transposed_name} ]; then
cp -v --remove-destination ${DATASET_SOURCE}/${graph_transposed_name} ${DATASET_LOCATION}/;
fi
# Finally, we make explicit the graph is ready
touch ${WITNESS_FILE}
graph-wait-for-dataset.sh: |
#!/usr/bin/env bash
# Uses env variables WITNESS_FILE
[ -z "${WITNESS_FILE}" ] && \
echo "<WITNESS_FILE> env variable must be set" && exit 1
while [ ! -f ${WITNESS_FILE} ]; do
echo "${WITNESS_FILE} not present, wait for it to start the graph..."
sleep $PERIOD
done
graph-fetch-dataset.sh: |
#!/usr/bin/env bash
# Uses env variables WITNESS_FILE, DATASET_LOCATION, DATASET_NAME, GRAPH_NAME
[ -z "${DATASET_LOCATION}" ] && \
echo "<DATASET_LOCATION> env variable must be set" && exit 1
[ -z "${DATASET_NAME}" ] && \
echo "<DATASET_NAME> env variable must be set" && exit 1
[ -z "${WITNESS_FILE}" ] && \
echo "<WITNESS_FILE> env variable must be set" && exit 1
[ -z "${GRAPH_NAME}" ] && \
echo "<GRAPH_NAME> env variable must be set" && exit 1
set -eux
[ -f ${WITNESS_FILE} ] && \
echo "Dataset <${DATASET_NAME}> already present. Skip." && \
exit 0
case "${DATASET_NAME}" in
test|example)
# For test (or example) dataset sample, clone the source repository of
# swh.graph and use the example dataset within
git clone \
--depth 1 \
https://gitlab.softwareheritage.org/swh/devel/swh-graph.git/ \
/tmp/swh-graph
# Create empty dataset location destination for copy to be ok
mkdir -p ${DATASET_LOCATION} && rmdir ${DATASET_LOCATION}
# Actual copy of the test dataset
cp -r /tmp/swh-graph/swh/graph/example_dataset/compressed \
${DATASET_LOCATION}
# Make explicit the graph is ready
touch ${WITNESS_FILE}
;;
*)
# Otherwise, download the dataset locally
swh graph download \
--name ${DATASET_NAME} \
${DATASET_LOCATION}
# Reindex graph dataset (for those anterior to 2024). This should not be
# necessary for most recent graph datasets.
# For old datasets missing a .ef though, this just fails with
# `2024-09-02T14:11:56.190692004Z graph-rpc-python3k 0: Cannot map
# Elias-Fano pointer list .../graph.ef`, so we trigger a reindex step
reindex_witness_file=${DATASET_LOCATION}/${GRAPH_NAME}.ef
[ ! -f $reindex_witness_file ] && \
swh graph reindex ${DATASET_LOCATION}/${GRAPH_NAME}
# Make explicit the graph is ready
touch ${WITNESS_FILE}
;;
esac
spec.template.metadata.annotations.checksum/backend-utils (apps/v1/Deployment/swh-cassandra/indexer-storage-rpc)
± value change
- c43dc08ca2b696716ab6bab9434a3bb8e6aafe02c326ca2ddbe2cb0b946c2e1f
+ 40879a26fb269aa7fb852180b3de6f7a3ab7a6910d5a00be3a6efe09cee885c1
spec.template.metadata.annotations.checksum/backend-utils (apps/v1/Deployment/swh-cassandra/search-rpc)
± value change
- c43dc08ca2b696716ab6bab9434a3bb8e6aafe02c326ca2ddbe2cb0b946c2e1f
+ 40879a26fb269aa7fb852180b3de6f7a3ab7a6910d5a00be3a6efe09cee885c1
spec.template.metadata.annotations.checksum/config_utils (apps/v1/Deployment/swh-cassandra/storage-replayer-content)
± value change
- c43dc08ca2b696716ab6bab9434a3bb8e6aafe02c326ca2ddbe2cb0b946c2e1f
+ 40879a26fb269aa7fb852180b3de6f7a3ab7a6910d5a00be3a6efe09cee885c1
spec.template.metadata.annotations.checksum/config_utils (apps/v1/Deployment/swh-cassandra/storage-replayer-directory)
± value change
- c43dc08ca2b696716ab6bab9434a3bb8e6aafe02c326ca2ddbe2cb0b946c2e1f
+ 40879a26fb269aa7fb852180b3de6f7a3ab7a6910d5a00be3a6efe09cee885c1
spec.template.metadata.annotations.checksum/config_utils (apps/v1/Deployment/swh-cassandra/storage-replayer-extid)
± value change
- c43dc08ca2b696716ab6bab9434a3bb8e6aafe02c326ca2ddbe2cb0b946c2e1f
+ 40879a26fb269aa7fb852180b3de6f7a3ab7a6910d5a00be3a6efe09cee885c1
spec.template.metadata.annotations.checksum/config_utils (apps/v1/Deployment/swh-cassandra/storage-replayer-metadata)
± value change
- c43dc08ca2b696716ab6bab9434a3bb8e6aafe02c326ca2ddbe2cb0b946c2e1f
+ 40879a26fb269aa7fb852180b3de6f7a3ab7a6910d5a00be3a6efe09cee885c1
spec.template.metadata.annotations.checksum/config_utils (apps/v1/Deployment/swh-cassandra/storage-replayer-origin)
± value change
- c43dc08ca2b696716ab6bab9434a3bb8e6aafe02c326ca2ddbe2cb0b946c2e1f
+ 40879a26fb269aa7fb852180b3de6f7a3ab7a6910d5a00be3a6efe09cee885c1
spec.template.metadata.annotations.checksum/config_utils (apps/v1/Deployment/swh-cassandra/storage-replayer-origin-visit)
± value change
- c43dc08ca2b696716ab6bab9434a3bb8e6aafe02c326ca2ddbe2cb0b946c2e1f
+ 40879a26fb269aa7fb852180b3de6f7a3ab7a6910d5a00be3a6efe09cee885c1
spec.template.metadata.annotations.checksum/config_utils (apps/v1/Deployment/swh-cassandra/storage-replayer-origin-visit-status)
± value change
- c43dc08ca2b696716ab6bab9434a3bb8e6aafe02c326ca2ddbe2cb0b946c2e1f
+ 40879a26fb269aa7fb852180b3de6f7a3ab7a6910d5a00be3a6efe09cee885c1
spec.template.metadata.annotations.checksum/config_utils (apps/v1/Deployment/swh-cassandra/storage-replayer-raw-extrinsic-metadata)
± value change
- c43dc08ca2b696716ab6bab9434a3bb8e6aafe02c326ca2ddbe2cb0b946c2e1f
+ 40879a26fb269aa7fb852180b3de6f7a3ab7a6910d5a00be3a6efe09cee885c1
spec.template.metadata.annotations.checksum/config_utils (apps/v1/Deployment/swh-cassandra/storage-replayer-release)
± value change
- c43dc08ca2b696716ab6bab9434a3bb8e6aafe02c326ca2ddbe2cb0b946c2e1f
+ 40879a26fb269aa7fb852180b3de6f7a3ab7a6910d5a00be3a6efe09cee885c1
spec.template.metadata.annotations.checksum/config_utils (apps/v1/Deployment/swh-cassandra/storage-replayer-revision)
± value change
- c43dc08ca2b696716ab6bab9434a3bb8e6aafe02c326ca2ddbe2cb0b946c2e1f
+ 40879a26fb269aa7fb852180b3de6f7a3ab7a6910d5a00be3a6efe09cee885c1
spec.template.metadata.annotations.checksum/config_utils (apps/v1/Deployment/swh-cassandra/storage-replayer-skipped-content)
± value change
- c43dc08ca2b696716ab6bab9434a3bb8e6aafe02c326ca2ddbe2cb0b946c2e1f
+ 40879a26fb269aa7fb852180b3de6f7a3ab7a6910d5a00be3a6efe09cee885c1
spec.template.metadata.annotations.checksum/config_utils (apps/v1/Deployment/swh-cassandra/storage-replayer-snapshot)
± value change
- c43dc08ca2b696716ab6bab9434a3bb8e6aafe02c326ca2ddbe2cb0b946c2e1f
+ 40879a26fb269aa7fb852180b3de6f7a3ab7a6910d5a00be3a6efe09cee885c1
spec.template.metadata.annotations.checksum/backend-utils (apps/v1/Deployment/swh-cassandra/storage-cassandra)
± value change
- c43dc08ca2b696716ab6bab9434a3bb8e6aafe02c326ca2ddbe2cb0b946c2e1f
+ 40879a26fb269aa7fb852180b3de6f7a3ab7a6910d5a00be3a6efe09cee885c1
spec.template.metadata.annotations.checksum/backend-utils (apps/v1/Deployment/swh-cassandra/storage-cassandra-read-only)
± value change
- c43dc08ca2b696716ab6bab9434a3bb8e6aafe02c326ca2ddbe2cb0b946c2e1f
+ 40879a26fb269aa7fb852180b3de6f7a3ab7a6910d5a00be3a6efe09cee885c1
------------- diff for environment staging namespace swh-cassandra-next-version -------------
_ __ __
_| |_ _ / _|/ _| between /tmp/swh-chart.swh.W2ixmoFC/staging-swh-cassandra-next-version.before, 345 documents
/ _' | | | | |_| |_ and /tmp/swh-chart.swh.W2ixmoFC/staging-swh-cassandra-next-version.after, 355 documents
| (_| | |_| | _| _|
\__,_|\__, |_| |_| returned 23 differences
|___/
(file level)
---
# Source: swh/templates/graph/configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: graph-grpc-example-configuration-template
namespace: swh-cassandra-next-version
data:
config.yml.template: |
graph:
cls: local_rust
grpc_server:
port: 50091
# Source: swh/templates/graph/configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
namespace: swh-cassandra-next-version
name: graph-rpc-example-configuration-template
data:
config.yml.template: |
graph:
cls: remote
grpc_server:
port: 50091
url: graph-grpc-next-version.internal.staging.swh.network:50091
# Source: swh/templates/graph/persistent-volume-claims.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: swh-graph-grpc-dataset-example-pvc
namespace: swh-cassandra-next-version
labels:
app: graph-grpc-example
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
storageClassName: local-persistent
volumeMode: Filesystem
# Source: swh/templates/graph/persistent-volume-claims.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: swh-graph-grpc-inmemory-pvc
namespace: swh-cassandra-next-version
labels:
app: graph-grpc-example
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
storageClassName: local-path
volumeMode: Filesystem
# Source: swh/templates/graph/service.yaml
apiVersion: v1
kind: Service
metadata:
name: graph-grpc-example
namespace: swh-cassandra-next-version
labels:
app: graph-grpc-example
spec:
type: ClusterIP
selector:
app: graph-grpc-example
ports:
- port: 50091
targetPort: 50091
name: grpc
# Source: swh/templates/graph/service.yaml
apiVersion: v1
kind: Service
metadata:
name: graph-rpc-example
namespace: swh-cassandra-next-version
labels:
app: graph-rpc-example
spec:
type: ClusterIP
selector:
app: graph-rpc-example
ports:
- port: 5009
targetPort: 5009
name: rpc
# Source: swh/templates/graph/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
namespace: swh-cassandra-next-version
name: graph-grpc-example
labels:
app: graph-grpc-example
spec:
revisionHistoryLimit: 2
selector:
matchLabels:
app: graph-grpc-example
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 1
template:
metadata:
labels:
app: graph-grpc-example
annotations:
checksum/config: 6dea70875aa6e50ce0c2984ed612dbaaa9a6d9795aa5c8df143371ad56c6e0c6
checksum/config-utils: 94d255131467f84bef964a4c72b2b792c5ebaf711bb1c77829d7cd1007a8ac22
checksum/backend-utils: eca303b18ba9654db85a0f8d4838403916b673d7f06c09aedd8771b5028c7f62
spec:
nodeSelector:
kubernetes.io/hostname: rancher-node-staging-rke2-metal01
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: swh/graph
operator: In
values:
- "true"
priorityClassName: swh-cassandra-next-version-frontend-rpc
initContainers:
- name: prepare-configuration
image: "container-registry.softwareheritage.org/swh/infra/swh-apps/utils:20231211.1"
imagePullPolicy: IfNotPresent
command:
- /entrypoints/prepare-configuration.sh
env:
volumeMounts:
- name: configuration
mountPath: /etc/swh
- name: configuration-template
mountPath: /etc/swh/configuration-template
- name: config-utils
mountPath: /entrypoints
readOnly: true
- name: graph-prepare-memory-volume
image: "container-registry.softwareheritage.org/swh/infra/swh-apps/utils:20231211.1"
imagePullPolicy: IfNotPresent
command:
- /entrypoints/graph-prepare-memory-volume.sh
env:
- name: WITNESS_FILE
value: /srv/graph/test/compressed/.graph-is-initialized
- name: WITNESS_SOURCE_FILE
value: /srv/dataset/test/compressed/.graph-is-initialized
- name: PERIOD
value: 3
- name: GRAPH_NAME
value: example
- name: DATASET_SOURCE
value: /srv/dataset/test/compressed
- name: DATASET_LOCATION
value: /srv/graph/test/compressed
volumeMounts:
- name: backend-utils
mountPath: /entrypoints
readOnly: true
- name: swh-graph-grpc-dataset-example
mountPath: /srv/dataset
readOnly: false
- name: swh-graph-grpc-inmemory
mountPath: /srv/graph
readOnly: false
containers:
- name: graph-grpc-example
resources:
requests:
memory: 512Mi
cpu: 500m
image: "container-registry.softwareheritage.org/swh/infra/swh-apps/graph:20240920.2"
imagePullPolicy: IfNotPresent
ports:
- containerPort: 50091
name: grpc
readinessProbe:
tcpSocket:
port: grpc
initialDelaySeconds: 15
failureThreshold: 30
periodSeconds: 5
livenessProbe:
tcpSocket:
port: grpc
initialDelaySeconds: 10
periodSeconds: 5
command:
- /bin/bash
args:
- "-c"
- /opt/swh/entrypoint.sh
env:
- name: GRAPH_TYPE
value: grpc
- name: PORT
value: 50091
- name: GRAPH_PATH
value: /srv/graph/test/compressed/example
- name: STATSD_HOST
value: prometheus-statsd-exporter
- name: STATSD_PORT
value: 9125
- name: STATSD_TAGS
value: "deployment:graph-grpc-example"
- name: STATSD_SERVICE_TYPE
value: graph-grpc-example
- name: SWH_LOG_LEVEL
value: INFO
- name: SWH_CONFIG_FILENAME
value: /etc/swh/config.yml
- name: SWH_SENTRY_ENVIRONMENT
value: staging
- name: SWH_MAIN_PACKAGE
value: swh.graph
- name: SWH_SENTRY_DSN
valueFrom:
secretKeyRef:
name: common-secrets
key: storage-sentry-dsn
# 'name' secret should exist & include key
# if the setting doesn't exist, sentry pushes will be disabled
optional: true
- name: SWH_SENTRY_DISABLE_LOGGING_EVENTS
value: "true"
volumeMounts:
- name: configuration
mountPath: /etc/swh
- name: swh-graph-grpc-dataset-example
mountPath: /srv/dataset
readOnly: false
- name: swh-graph-grpc-inmemory
mountPath: /srv/graph
readOnly: false
volumes:
- name: configuration
emptyDir: {}
- name: configuration-template
configMap:
name: graph-grpc-example-configuration-template
items:
- key: config.yml.template
path: config.yml.template
- name: config-utils
configMap:
name: config-utils
defaultMode: 0555
- name: backend-utils
configMap:
name: backend-utils
defaultMode: 0555
- name: swh-graph-grpc-dataset-example
persistentVolumeClaim:
claimName: swh-graph-grpc-dataset-example-pvc
- name: swh-graph-grpc-inmemory
persistentVolumeClaim:
claimName: swh-graph-grpc-inmemory-pvc
# Source: swh/templates/graph/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
namespace: swh-cassandra-next-version
name: graph-rpc-example
labels:
app: graph-rpc-example
spec:
revisionHistoryLimit: 2
selector:
matchLabels:
app: graph-rpc-example
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 1
template:
metadata:
labels:
app: graph-rpc-example
annotations:
checksum/config: 32f6eb85c0a9c4544695fc9c3f7b511625a667c21670e037995eba0b9c55292b
checksum/config-utils: 94d255131467f84bef964a4c72b2b792c5ebaf711bb1c77829d7cd1007a8ac22
checksum/backend-utils: eca303b18ba9654db85a0f8d4838403916b673d7f06c09aedd8771b5028c7f62
spec:
nodeSelector:
kubernetes.io/hostname: rancher-node-staging-rke2-metal01
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: swh/graph
operator: In
values:
- "true"
priorityClassName: swh-cassandra-next-version-frontend-rpc
initContainers:
- name: prepare-configuration
image: "container-registry.softwareheritage.org/swh/infra/swh-apps/utils:20231211.1"
imagePullPolicy: IfNotPresent
command:
- /entrypoints/prepare-configuration.sh
env:
volumeMounts:
- name: configuration
mountPath: /etc/swh
- name: configuration-template
mountPath: /etc/swh/configuration-template
- name: config-utils
mountPath: /entrypoints
readOnly: true
containers:
- name: graph-rpc-example
resources:
requests:
memory: 512Mi
cpu: 500m
image: "container-registry.softwareheritage.org/swh/infra/swh-apps/graph:20240920.2"
imagePullPolicy: IfNotPresent
ports:
- containerPort: 5009
name: rpc
readinessProbe:
httpGet:
path: /
port: rpc
initialDelaySeconds: 15
failureThreshold: 30
periodSeconds: 5
livenessProbe:
tcpSocket:
port: rpc
initialDelaySeconds: 10
periodSeconds: 5
command:
- /bin/bash
args:
- "-c"
- /opt/swh/entrypoint.sh
env:
- name: GRAPH_TYPE
value: rpc
- name: PORT
value: 5009
- name: GRAPH_PATH
value: /srv/graph/graph/compressed/example
- name: STATSD_HOST
value: prometheus-statsd-exporter
- name: STATSD_PORT
value: 9125
- name: STATSD_TAGS
value: "deployment:graph-rpc-example"
- name: STATSD_SERVICE_TYPE
value: graph-rpc-example
- name: SWH_LOG_LEVEL
value: INFO
- name: SWH_CONFIG_FILENAME
value: /etc/swh/config.yml
- name: SWH_SENTRY_ENVIRONMENT
value: staging
- name: SWH_MAIN_PACKAGE
value: swh.graph
- name: SWH_SENTRY_DSN
valueFrom:
secretKeyRef:
name: common-secrets
key: storage-sentry-dsn
# 'name' secret should exist & include key
# if the setting doesn't exist, sentry pushes will be disabled
optional: true
- name: SWH_SENTRY_DISABLE_LOGGING_EVENTS
value: "true"
volumeMounts:
- name: configuration
mountPath: /etc/swh
volumes:
- name: configuration
emptyDir: {}
- name: configuration-template
configMap:
name: graph-rpc-example-configuration-template
items:
- key: config.yml.template
path: config.yml.template
- name: config-utils
configMap:
name: config-utils
defaultMode: 0555
- name: backend-utils
configMap:
name: backend-utils
defaultMode: 0555
# Source: swh/templates/graph/ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
namespace: swh-cassandra-next-version
name: graph-grpc-example-ingress-default
labels:
app: graph-grpc-example
endpoint-definition: default
annotations:
nginx.ingress.kubernetes.io/backend-protocol: GRPC
nginx.ingress.kubernetes.io/client-body-buffer-size: 128K
nginx.ingress.kubernetes.io/proxy-body-size: 4G
nginx.ingress.kubernetes.io/proxy-buffering: on
nginx.ingress.kubernetes.io/service-upstream: "true"
nginx.ingress.kubernetes.io/ssl-redirect: "true"
nginx.ingress.kubernetes.io/whitelist-source-range: "10.42.0.0/16,10.43.0.0/16,192.168.100.29/32,192.168.101.0/24,192.168.130.0/24,192.168.50.0/24"
spec:
ingressClassName: nginx
rules:
- host: graph-grpc-next-version.internal.staging.swh.network
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: graph-grpc-example
port:
number: 50091
# Source: swh/templates/graph/ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
namespace: swh-cassandra-next-version
name: graph-rpc-example-ingress-default
labels:
app: graph-rpc-example
endpoint-definition: default
annotations:
nginx.ingress.kubernetes.io/client-body-buffer-size: 128K
nginx.ingress.kubernetes.io/proxy-body-size: 4G
nginx.ingress.kubernetes.io/proxy-buffering: on
nginx.ingress.kubernetes.io/service-upstream: "true"
nginx.ingress.kubernetes.io/whitelist-source-range: "10.42.0.0/16,10.43.0.0/16,192.168.100.29/32,192.168.101.0/24,192.168.130.0/24,192.168.50.0/24"
spec:
rules:
- host: graph-next-version.internal.staging.swh.network
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: graph-rpc-example
port:
number: 5009
data.config.yml.template (v1/ConfigMap/swh-cassandra-next-version/provenance-graph-granet-configuration-template)
± value change in multiline text (one insert, one deletion)
- url: graph.internal.softwareheritage.org:50091
+ url: grpc-next-version.internal.staging.swh.network:50091
data (v1/ConfigMap/swh-cassandra-next-version/backend-utils)
+ three map entries added:
graph-prepare-memory-volume.sh: |
#!/usr/bin/env bash
# Uses env variables WITNESS_FILE, WITNESS_SOURCE_FILE, DATASET_SOURCE,
# DATASET_LOCATION, GRAPH_NAME, PERIOD
set -eux
[ -z "${DATASET_LOCATION}" ] && \
echo "<DATASET_LOCATION> env variable must be set" && exit 1
[ -z "${DATASET_SOURCE}" ] && \
echo "<DATASET_SOURCE> env variable must be set" && exit 1
[ -z "${WITNESS_FILE}" ] && \
echo "<WITNESS_FILE> env variable must be set" && exit 1
[ -z "${WITNESS_SOURCE_FILE}" ] && \
echo "<WITNESS_SOURCE_FILE> env variable must be set" && exit 1
[ -z "${PERIOD}" ] && \
echo "<PERIOD> env variable must be set" && exit 1
[ -z "${GRAPH_NAME}" ] && \
echo "<GRAPH_NAME> env variable must be set" && exit 1
[ -f ${WITNESS_FILE} ] && echo "Graph ready, do nothing." && exit 0
while [ ! -f ${WITNESS_SOURCE_FILE} ]; do
echo "${WITNESS_SOURCE_FILE} not present, wait for it to prepare the graph dataset..."
sleep $PERIOD
done
# Create empty dataset location destination for copy to be ok
mkdir -p ${DATASET_LOCATION}
graph_stats=${GRAPH_NAME}.stats
# Symlink all files from dataset source to the destination (including the *.graph)
[ -L "${DATASET_LOCATION}/${graph_stats}" ] || \
ln -sf ${DATASET_SOURCE}/* ${DATASET_LOCATION}/
graph_name=${GRAPH_NAME}.graph
# We hard-copy the *.graph file
if [ -L "${DATASET_LOCATION}/${graph_name}" ] || ! [ -f ${DATASET_LOCATION}/${graph_name} ]; then
cp -v --remove-destination ${DATASET_SOURCE}/${graph_name} ${DATASET_LOCATION}/;
fi
graph_transposed_name=${GRAPH_NAME}-transposed.graph
if [ -L ${DATASET_LOCATION}/${graph_transposed_name} ] || ! [ -f ${DATASET_LOCATION}/${graph_transposed_name} ]; then
cp -v --remove-destination ${DATASET_SOURCE}/${graph_transposed_name} ${DATASET_LOCATION}/;
fi
# Finally, we make explicit the graph is ready
touch ${WITNESS_FILE}
graph-wait-for-dataset.sh: |
#!/usr/bin/env bash
# Uses env variables WITNESS_FILE, PERIOD
[ -z "${WITNESS_FILE}" ] && \
echo "<WITNESS_FILE> env variable must be set" && exit 1
[ -z "${PERIOD}" ] && \
echo "<PERIOD> env variable must be set" && exit 1
while [ ! -f ${WITNESS_FILE} ]; do
echo "${WITNESS_FILE} not present, wait for it to start the graph..."
sleep $PERIOD
done
graph-fetch-dataset.sh: |
#!/usr/bin/env bash
# Uses env variables WITNESS_FILE, DATASET_LOCATION, DATASET_NAME, GRAPH_NAME
[ -z "${DATASET_LOCATION}" ] && \
echo "<DATASET_LOCATION> env variable must be set" && exit 1
[ -z "${DATASET_NAME}" ] && \
echo "<DATASET_NAME> env variable must be set" && exit 1
[ -z "${WITNESS_FILE}" ] && \
echo "<WITNESS_FILE> env variable must be set" && exit 1
[ -z "${GRAPH_NAME}" ] && \
echo "<GRAPH_NAME> env variable must be set" && exit 1
set -eux
[ -f ${WITNESS_FILE} ] && \
echo "Dataset <${DATASET_NAME}> already present. Skip." && \
exit 0
case "${DATASET_NAME}" in
test|example)
# For test (or example) dataset sample, clone the source repository of
# swh.graph and use the example dataset within
git clone \
--depth 1 \
https://gitlab.softwareheritage.org/swh/devel/swh-graph.git/ \
/tmp/swh-graph
# Create empty dataset location destination for copy to be ok
mkdir -p ${DATASET_LOCATION} && rmdir ${DATASET_LOCATION}
# Actual copy of the test dataset
cp -r /tmp/swh-graph/swh/graph/example_dataset/compressed \
${DATASET_LOCATION}
# Make explicit the graph is ready
touch ${WITNESS_FILE}
;;
*)
# Otherwise, download the dataset locally
swh graph download \
--name ${DATASET_NAME} \
${DATASET_LOCATION}
# Reindex the graph dataset when needed (datasets produced before 2024).
# This should not be necessary for recent graph datasets.
# For old datasets missing the .ef file, the graph service fails with
# `2024-09-02T14:11:56.190692004Z graph-rpc-python3k 0: Cannot map
# Elias-Fano pointer list .../graph.ef`, so we trigger a reindex step
reindex_witness_file=${DATASET_LOCATION}/${GRAPH_NAME}.ef
[ ! -f $reindex_witness_file ] && \
swh graph reindex ${DATASET_LOCATION}/${GRAPH_NAME}
# Make explicit the graph is ready
touch ${WITNESS_FILE}
;;
esac
data.config.yml.template (v1/ConfigMap/swh-cassandra-next-version/web-cassandra-configuration-template)
± value change in multiline text (one insert, no deletions)
+ graph:
+ url: http://graph-next-version.internal.staging.swh.network/graph
spec.template.metadata.annotations.checksum/backend-utils (apps/v1/Deployment/swh-cassandra-next-version/indexer-storage-rw)
± value change
- 9d24f08f61c4b5483850e8d39f00d4d75dd52bda92d0caaae604f3fba223f192
+ eca303b18ba9654db85a0f8d4838403916b673d7f06c09aedd8771b5028c7f62
spec.template.metadata.annotations.checksum/config (apps/v1/Deployment/swh-cassandra-next-version/provenance-graph-granet)
± value change
- ec9e1a625460a25ae508384d7a89e4d30632178678968db8b7f222d620955e0c
+ c32d2ea55573517d9fb19e347a8710eb38312879154cd1ec77d3d37a436ed2c3
spec.template.metadata.annotations.checksum/backend-utils (apps/v1/Deployment/swh-cassandra-next-version/search-rpc)
± value change
- 9d24f08f61c4b5483850e8d39f00d4d75dd52bda92d0caaae604f3fba223f192
+ eca303b18ba9654db85a0f8d4838403916b673d7f06c09aedd8771b5028c7f62
spec.template.metadata.annotations.checksum/config_utils (apps/v1/Deployment/swh-cassandra-next-version/storage-replayer-content)
± value change
- 9d24f08f61c4b5483850e8d39f00d4d75dd52bda92d0caaae604f3fba223f192
+ eca303b18ba9654db85a0f8d4838403916b673d7f06c09aedd8771b5028c7f62
spec.template.metadata.annotations.checksum/config_utils (apps/v1/Deployment/swh-cassandra-next-version/storage-replayer-directory)
± value change
- 9d24f08f61c4b5483850e8d39f00d4d75dd52bda92d0caaae604f3fba223f192
+ eca303b18ba9654db85a0f8d4838403916b673d7f06c09aedd8771b5028c7f62
spec.template.metadata.annotations.checksum/config_utils (apps/v1/Deployment/swh-cassandra-next-version/storage-replayer-extid)
± value change
- 9d24f08f61c4b5483850e8d39f00d4d75dd52bda92d0caaae604f3fba223f192
+ eca303b18ba9654db85a0f8d4838403916b673d7f06c09aedd8771b5028c7f62
spec.template.metadata.annotations.checksum/config_utils (apps/v1/Deployment/swh-cassandra-next-version/storage-replayer-metadata)
± value change
- 9d24f08f61c4b5483850e8d39f00d4d75dd52bda92d0caaae604f3fba223f192
+ eca303b18ba9654db85a0f8d4838403916b673d7f06c09aedd8771b5028c7f62
spec.template.metadata.annotations.checksum/config_utils (apps/v1/Deployment/swh-cassandra-next-version/storage-replayer-origin)
± value change
- 9d24f08f61c4b5483850e8d39f00d4d75dd52bda92d0caaae604f3fba223f192
+ eca303b18ba9654db85a0f8d4838403916b673d7f06c09aedd8771b5028c7f62
spec.template.metadata.annotations.checksum/config_utils (apps/v1/Deployment/swh-cassandra-next-version/storage-replayer-origin-visit)
± value change
- 9d24f08f61c4b5483850e8d39f00d4d75dd52bda92d0caaae604f3fba223f192
+ eca303b18ba9654db85a0f8d4838403916b673d7f06c09aedd8771b5028c7f62
spec.template.metadata.annotations.checksum/config_utils (apps/v1/Deployment/swh-cassandra-next-version/storage-replayer-origin-visit-status)
± value change
- 9d24f08f61c4b5483850e8d39f00d4d75dd52bda92d0caaae604f3fba223f192
+ eca303b18ba9654db85a0f8d4838403916b673d7f06c09aedd8771b5028c7f62
spec.template.metadata.annotations.checksum/config_utils (apps/v1/Deployment/swh-cassandra-next-version/storage-replayer-raw-extrinsic-metadata)
± value change
- 9d24f08f61c4b5483850e8d39f00d4d75dd52bda92d0caaae604f3fba223f192
+ eca303b18ba9654db85a0f8d4838403916b673d7f06c09aedd8771b5028c7f62
spec.template.metadata.annotations.checksum/config_utils (apps/v1/Deployment/swh-cassandra-next-version/storage-replayer-release)
± value change
- 9d24f08f61c4b5483850e8d39f00d4d75dd52bda92d0caaae604f3fba223f192
+ eca303b18ba9654db85a0f8d4838403916b673d7f06c09aedd8771b5028c7f62
spec.template.metadata.annotations.checksum/config_utils (apps/v1/Deployment/swh-cassandra-next-version/storage-replayer-revision)
± value change
- 9d24f08f61c4b5483850e8d39f00d4d75dd52bda92d0caaae604f3fba223f192
+ eca303b18ba9654db85a0f8d4838403916b673d7f06c09aedd8771b5028c7f62
spec.template.metadata.annotations.checksum/config_utils (apps/v1/Deployment/swh-cassandra-next-version/storage-replayer-skipped-content)
± value change
- 9d24f08f61c4b5483850e8d39f00d4d75dd52bda92d0caaae604f3fba223f192
+ eca303b18ba9654db85a0f8d4838403916b673d7f06c09aedd8771b5028c7f62
spec.template.metadata.annotations.checksum/config_utils (apps/v1/Deployment/swh-cassandra-next-version/storage-replayer-snapshot)
± value change
- 9d24f08f61c4b5483850e8d39f00d4d75dd52bda92d0caaae604f3fba223f192
+ eca303b18ba9654db85a0f8d4838403916b673d7f06c09aedd8771b5028c7f62
spec.template.metadata.annotations.checksum/backend-utils (apps/v1/Deployment/swh-cassandra-next-version/storage-ro-postgresql)
± value change
- 9d24f08f61c4b5483850e8d39f00d4d75dd52bda92d0caaae604f3fba223f192
+ eca303b18ba9654db85a0f8d4838403916b673d7f06c09aedd8771b5028c7f62
spec.template.metadata.annotations.checksum/backend-utils (apps/v1/Deployment/swh-cassandra-next-version/storage-rw-cassandra)
± value change
- 9d24f08f61c4b5483850e8d39f00d4d75dd52bda92d0caaae604f3fba223f192
+ eca303b18ba9654db85a0f8d4838403916b673d7f06c09aedd8771b5028c7f62
spec.template.metadata.annotations.checksum/backend-utils (apps/v1/Deployment/swh-cassandra-next-version/storage-rw-postgresql)
± value change
- 9d24f08f61c4b5483850e8d39f00d4d75dd52bda92d0caaae604f3fba223f192
+ eca303b18ba9654db85a0f8d4838403916b673d7f06c09aedd8771b5028c7f62
spec.template.metadata.annotations.checksum/config (apps/v1/Deployment/swh-cassandra-next-version/web-cassandra)
± value change
- 4a74939ceecbd3da02d7c345f7d6f311f4e8ee90489b243880cf38e16b63e1a9
+ ea048a85eec059c93a401f5a5f35789f611058f186766aacbf49602bec831a5a
------------- diff for environment production namespace swh -------------
dyff between /tmp/swh-chart.swh.W2ixmoFC/production-swh.before, 449 documents
and /tmp/swh-chart.swh.W2ixmoFC/production-swh.after, 449 documents
returned ten differences
data (v1/ConfigMap/swh/backend-utils)
+ three map entries added:
graph-prepare-memory-volume.sh: |
#!/usr/bin/env bash
# Uses env variables WITNESS_FILE, WITNESS_SOURCE_FILE, DATASET_SOURCE,
# DATASET_LOCATION, GRAPH_NAME, PERIOD
set -eux
[ -z "${DATASET_LOCATION}" ] && \
echo "<DATASET_LOCATION> env variable must be set" && exit 1
[ -z "${DATASET_SOURCE}" ] && \
echo "<DATASET_SOURCE> env variable must be set" && exit 1
[ -z "${WITNESS_FILE}" ] && \
echo "<WITNESS_FILE> env variable must be set" && exit 1
[ -z "${WITNESS_SOURCE_FILE}" ] && \
echo "<WITNESS_SOURCE_FILE> env variable must be set" && exit 1
[ -z "${PERIOD}" ] && \
echo "<PERIOD> env variable must be set" && exit 1
[ -z "${GRAPH_NAME}" ] && \
echo "<GRAPH_NAME> env variable must be set" && exit 1
[ -f ${WITNESS_FILE} ] && echo "Graph ready, do nothing." && exit 0
while [ ! -f ${WITNESS_SOURCE_FILE} ]; do
echo "${WITNESS_SOURCE_FILE} not present, wait for it to prepare the graph dataset..."
sleep $PERIOD
done
# Create empty dataset location destination for copy to be ok
mkdir -p ${DATASET_LOCATION}
graph_stats=${GRAPH_NAME}.stats
# Symlink all files from dataset source to the destination (including the *.graph)
[ -L "${DATASET_LOCATION}/${graph_stats}" ] || \
ln -sf ${DATASET_SOURCE}/* ${DATASET_LOCATION}/
graph_name=${GRAPH_NAME}.graph
# We hard-copy the *.graph file
if [ -L "${DATASET_LOCATION}/${graph_name}" ] || ! [ -f ${DATASET_LOCATION}/${graph_name} ]; then
cp -v --remove-destination ${DATASET_SOURCE}/${graph_name} ${DATASET_LOCATION}/;
fi
graph_transposed_name=${GRAPH_NAME}-transposed.graph
if [ -L ${DATASET_LOCATION}/${graph_transposed_name} ] || ! [ -f ${DATASET_LOCATION}/${graph_transposed_name} ]; then
cp -v --remove-destination ${DATASET_SOURCE}/${graph_transposed_name} ${DATASET_LOCATION}/;
fi
# Finally, we make explicit the graph is ready
touch ${WITNESS_FILE}
graph-wait-for-dataset.sh: |
#!/usr/bin/env bash
# Uses env variables WITNESS_FILE, PERIOD
[ -z "${WITNESS_FILE}" ] && \
echo "<WITNESS_FILE> env variable must be set" && exit 1
[ -z "${PERIOD}" ] && \
echo "<PERIOD> env variable must be set" && exit 1
while [ ! -f ${WITNESS_FILE} ]; do
echo "${WITNESS_FILE} not present, wait for it to start the graph..."
sleep $PERIOD
done
graph-fetch-dataset.sh: |
#!/usr/bin/env bash
# Uses env variables WITNESS_FILE, DATASET_LOCATION, DATASET_NAME, GRAPH_NAME
[ -z "${DATASET_LOCATION}" ] && \
echo "<DATASET_LOCATION> env variable must be set" && exit 1
[ -z "${DATASET_NAME}" ] && \
echo "<DATASET_NAME> env variable must be set" && exit 1
[ -z "${WITNESS_FILE}" ] && \
echo "<WITNESS_FILE> env variable must be set" && exit 1
[ -z "${GRAPH_NAME}" ] && \
echo "<GRAPH_NAME> env variable must be set" && exit 1
set -eux
[ -f ${WITNESS_FILE} ] && \
echo "Dataset <${DATASET_NAME}> already present. Skip." && \
exit 0
case "${DATASET_NAME}" in
test|example)
# For test (or example) dataset sample, clone the source repository of
# swh.graph and use the example dataset within
git clone \
--depth 1 \
https://gitlab.softwareheritage.org/swh/devel/swh-graph.git/ \
/tmp/swh-graph
# Create empty dataset location destination for copy to be ok
mkdir -p ${DATASET_LOCATION} && rmdir ${DATASET_LOCATION}
# Actual copy of the test dataset
cp -r /tmp/swh-graph/swh/graph/example_dataset/compressed \
${DATASET_LOCATION}
# Make explicit the graph is ready
touch ${WITNESS_FILE}
;;
*)
# Otherwise, download the dataset locally
swh graph download \
--name ${DATASET_NAME} \
${DATASET_LOCATION}
# Reindex the graph dataset when needed (datasets produced before 2024).
# This should not be necessary for recent graph datasets.
# For old datasets missing the .ef file, the graph service fails with
# `2024-09-02T14:11:56.190692004Z graph-rpc-python3k 0: Cannot map
# Elias-Fano pointer list .../graph.ef`, so we trigger a reindex step
reindex_witness_file=${DATASET_LOCATION}/${GRAPH_NAME}.ef
[ ! -f $reindex_witness_file ] && \
swh graph reindex ${DATASET_LOCATION}/${GRAPH_NAME}
# Make explicit the graph is ready
touch ${WITNESS_FILE}
;;
esac
data.config.yml.template (v1/ConfigMap/swh/web-app1-configuration-template)
± value change in multiline text (one insert, no deletions)
+ graph:
+ max_edges:
+ anonymous: 1000
+ staff: 0
+ user: 100000
+ url: http://graph.internal.softwareheritage.org:5009/graph
data.config.yml.template (v1/ConfigMap/swh/web-archive-configuration-template)
± value change in multiline text (one insert, no deletions)
+ graph:
+ max_edges:
+ anonymous: 1000
+ staff: 0
+ user: 100000
+ url: http://graph.internal.softwareheritage.org:5009/graph
spec.template.metadata.annotations.checksum/backend-utils (apps/v1/Deployment/swh/indexer-storage-read-only)
± value change
- 054addbf40d1e68df86e403b5f07088bde1d888a1525118fad302837b54edec6
+ 3c9e1487729c0e9273fa3383e6022842b87e8decf8f6c77c145d36d0ba551213
spec.template.metadata.annotations.checksum/backend-utils (apps/v1/Deployment/swh/indexer-storage-read-write)
± value change
- 054addbf40d1e68df86e403b5f07088bde1d888a1525118fad302837b54edec6
+ 3c9e1487729c0e9273fa3383e6022842b87e8decf8f6c77c145d36d0ba551213
spec.template.metadata.annotations.checksum/backend-utils (apps/v1/Deployment/swh/search-rpc)
± value change
- 054addbf40d1e68df86e403b5f07088bde1d888a1525118fad302837b54edec6
+ 3c9e1487729c0e9273fa3383e6022842b87e8decf8f6c77c145d36d0ba551213
spec.template.metadata.annotations.checksum/backend-utils (apps/v1/Deployment/swh/storage-postgresql-azure-readonly)
± value change
- 054addbf40d1e68df86e403b5f07088bde1d888a1525118fad302837b54edec6
+ 3c9e1487729c0e9273fa3383e6022842b87e8decf8f6c77c145d36d0ba551213
spec.template.metadata.annotations.checksum/backend-utils (apps/v1/Deployment/swh/storage-postgresql-winery)
± value change
- 054addbf40d1e68df86e403b5f07088bde1d888a1525118fad302837b54edec6
+ 3c9e1487729c0e9273fa3383e6022842b87e8decf8f6c77c145d36d0ba551213
spec.template.metadata.annotations.checksum/config (apps/v1/Deployment/swh/web-app1)
± value change
- 01464e602c77d5caa1a616af08f74224f6ee16175ac522724d8eb7330665dccc
+ 21ef18d9e65dae9b1494c87ffd1327183bdc106b0bb2c8c14be13f89fdac4c07
spec.template.metadata.annotations.checksum/config (apps/v1/Deployment/swh/web-archive)
± value change
- 063dad00b39ebd1276f4ac81b195c7346156b55e068aca8cce6939e5dad5ade5
+ 4534c9e9d8c7596f068533d88755b20175cf9e25dce347d43d67dbf1887b07b8
------------- diff for environment production namespace swh-cassandra -------------
dyff between /tmp/swh-chart.swh.W2ixmoFC/production-swh-cassandra.before, 219 documents
and /tmp/swh-chart.swh.W2ixmoFC/production-swh-cassandra.after, 219 documents
returned 21 differences
data (v1/ConfigMap/swh-cassandra/backend-utils)
+ three map entries added:
graph-prepare-memory-volume.sh: |
#!/usr/bin/env bash
# Uses env variables WITNESS_FILE, WITNESS_SOURCE_FILE, DATASET_SOURCE,
# DATASET_LOCATION, GRAPH_NAME, PERIOD
set -eux
[ -z "${DATASET_LOCATION}" ] && \
echo "<DATASET_LOCATION> env variable must be set" && exit 1
[ -z "${DATASET_SOURCE}" ] && \
echo "<DATASET_SOURCE> env variable must be set" && exit 1
[ -z "${WITNESS_FILE}" ] && \
echo "<WITNESS_FILE> env variable must be set" && exit 1
[ -z "${WITNESS_SOURCE_FILE}" ] && \
echo "<WITNESS_SOURCE_FILE> env variable must be set" && exit 1
[ -z "${PERIOD}" ] && \
echo "<PERIOD> env variable must be set" && exit 1
[ -z "${GRAPH_NAME}" ] && \
echo "<GRAPH_NAME> env variable must be set" && exit 1
[ -f ${WITNESS_FILE} ] && echo "Graph ready, do nothing." && exit 0
while [ ! -f ${WITNESS_SOURCE_FILE} ]; do
echo "${WITNESS_SOURCE_FILE} not present, wait for it to prepare the graph dataset..."
sleep $PERIOD
done
# Create empty dataset location destination for copy to be ok
mkdir -p ${DATASET_LOCATION}
graph_stats=${GRAPH_NAME}.stats
# Symlink all files from dataset source to the destination (including the *.graph)
[ -L "${DATASET_LOCATION}/${graph_stats}" ] || \
ln -sf ${DATASET_SOURCE}/* ${DATASET_LOCATION}/
graph_name=${GRAPH_NAME}.graph
# We hard-copy the *.graph file
if [ -L "${DATASET_LOCATION}/${graph_name}" ] || ! [ -f ${DATASET_LOCATION}/${graph_name} ]; then
cp -v --remove-destination ${DATASET_SOURCE}/${graph_name} ${DATASET_LOCATION}/;
fi
graph_transposed_name=${GRAPH_NAME}-transposed.graph
if [ -L ${DATASET_LOCATION}/${graph_transposed_name} ] || ! [ -f ${DATASET_LOCATION}/${graph_transposed_name} ]; then
cp -v --remove-destination ${DATASET_SOURCE}/${graph_transposed_name} ${DATASET_LOCATION}/;
fi
# Finally, we make explicit the graph is ready
touch ${WITNESS_FILE}
graph-wait-for-dataset.sh: |
#!/usr/bin/env bash
# Uses env variables WITNESS_FILE, PERIOD
[ -z "${WITNESS_FILE}" ] && \
echo "<WITNESS_FILE> env variable must be set" && exit 1
[ -z "${PERIOD}" ] && \
echo "<PERIOD> env variable must be set" && exit 1
while [ ! -f ${WITNESS_FILE} ]; do
echo "${WITNESS_FILE} not present, wait for it to start the graph..."
sleep $PERIOD
done
graph-fetch-dataset.sh: |
#!/usr/bin/env bash
# Uses env variables WITNESS_FILE, DATASET_LOCATION, DATASET_NAME, GRAPH_NAME
[ -z "${DATASET_LOCATION}" ] && \
echo "<DATASET_LOCATION> env variable must be set" && exit 1
[ -z "${DATASET_NAME}" ] && \
echo "<DATASET_NAME> env variable must be set" && exit 1
[ -z "${WITNESS_FILE}" ] && \
echo "<WITNESS_FILE> env variable must be set" && exit 1
[ -z "${GRAPH_NAME}" ] && \
echo "<GRAPH_NAME> env variable must be set" && exit 1
set -eux
[ -f ${WITNESS_FILE} ] && \
echo "Dataset <${DATASET_NAME}> already present. Skip." && \
exit 0
case "${DATASET_NAME}" in
test|example)
# For test (or example) dataset sample, clone the source repository of
# swh.graph and use the example dataset within
git clone \
--depth 1 \
https://gitlab.softwareheritage.org/swh/devel/swh-graph.git/ \
/tmp/swh-graph
# Create empty dataset location destination for copy to be ok
mkdir -p ${DATASET_LOCATION} && rmdir ${DATASET_LOCATION}
# Actual copy of the test dataset
cp -r /tmp/swh-graph/swh/graph/example_dataset/compressed \
${DATASET_LOCATION}
# Make explicit the graph is ready
touch ${WITNESS_FILE}
;;
*)
# Otherwise, download the dataset locally
swh graph download \
--name ${DATASET_NAME} \
${DATASET_LOCATION}
# Reindex the graph dataset when needed (datasets produced before 2024).
# This should not be necessary for recent graph datasets.
# For old datasets missing the .ef file, the graph service fails with
# `2024-09-02T14:11:56.190692004Z graph-rpc-python3k 0: Cannot map
# Elias-Fano pointer list .../graph.ef`, so we trigger a reindex step
reindex_witness_file=${DATASET_LOCATION}/${GRAPH_NAME}.ef
[ ! -f $reindex_witness_file ] && \
swh graph reindex ${DATASET_LOCATION}/${GRAPH_NAME}
# Make explicit the graph is ready
touch ${WITNESS_FILE}
;;
esac
data.config.yml.template (v1/ConfigMap/swh-cassandra/web-cassandra-configuration-template)
± value change in multiline text (one insert, no deletions)
+ graph:
+ max_edges:
+ anonymous: 1000
+ staff: 0
+ user: 100000
+ url: http://graph.internal.softwareheritage.org:5009/graph
spec.template.metadata.annotations.checksum/backend-utils (apps/v1/Deployment/swh-cassandra/indexer-storage-read-only)
± value change
- c43dc08ca2b696716ab6bab9434a3bb8e6aafe02c326ca2ddbe2cb0b946c2e1f
+ 40879a26fb269aa7fb852180b3de6f7a3ab7a6910d5a00be3a6efe09cee885c1
spec.template.metadata.annotations.checksum/backend-utils (apps/v1/Deployment/swh-cassandra/indexer-storage-read-write)
± value change
- c43dc08ca2b696716ab6bab9434a3bb8e6aafe02c326ca2ddbe2cb0b946c2e1f
+ 40879a26fb269aa7fb852180b3de6f7a3ab7a6910d5a00be3a6efe09cee885c1
spec.template.metadata.annotations.checksum/backend-utils (apps/v1/Deployment/swh-cassandra/search-rpc)
± value change
- c43dc08ca2b696716ab6bab9434a3bb8e6aafe02c326ca2ddbe2cb0b946c2e1f
+ 40879a26fb269aa7fb852180b3de6f7a3ab7a6910d5a00be3a6efe09cee885c1
spec.template.metadata.annotations.checksum/config_utils (apps/v1/Deployment/swh-cassandra/storage-replayer-content)
± value change
- c43dc08ca2b696716ab6bab9434a3bb8e6aafe02c326ca2ddbe2cb0b946c2e1f
+ 40879a26fb269aa7fb852180b3de6f7a3ab7a6910d5a00be3a6efe09cee885c1
spec.template.metadata.annotations.checksum/config_utils (apps/v1/Deployment/swh-cassandra/storage-replayer-directory)
± value change
- c43dc08ca2b696716ab6bab9434a3bb8e6aafe02c326ca2ddbe2cb0b946c2e1f
+ 40879a26fb269aa7fb852180b3de6f7a3ab7a6910d5a00be3a6efe09cee885c1
spec.template.metadata.annotations.checksum/config_utils (apps/v1/Deployment/swh-cassandra/storage-replayer-extid)
± value change
- c43dc08ca2b696716ab6bab9434a3bb8e6aafe02c326ca2ddbe2cb0b946c2e1f
+ 40879a26fb269aa7fb852180b3de6f7a3ab7a6910d5a00be3a6efe09cee885c1
spec.template.metadata.annotations.checksum/config_utils (apps/v1/Deployment/swh-cassandra/storage-replayer-metadata)
± value change
- c43dc08ca2b696716ab6bab9434a3bb8e6aafe02c326ca2ddbe2cb0b946c2e1f
+ 40879a26fb269aa7fb852180b3de6f7a3ab7a6910d5a00be3a6efe09cee885c1
spec.template.metadata.annotations.checksum/config_utils (apps/v1/Deployment/swh-cassandra/storage-replayer-origin)
± value change
- c43dc08ca2b696716ab6bab9434a3bb8e6aafe02c326ca2ddbe2cb0b946c2e1f
+ 40879a26fb269aa7fb852180b3de6f7a3ab7a6910d5a00be3a6efe09cee885c1
spec.template.metadata.annotations.checksum/config_utils (apps/v1/Deployment/swh-cassandra/storage-replayer-origin-visit)
± value change
- c43dc08ca2b696716ab6bab9434a3bb8e6aafe02c326ca2ddbe2cb0b946c2e1f
+ 40879a26fb269aa7fb852180b3de6f7a3ab7a6910d5a00be3a6efe09cee885c1
spec.template.metadata.annotations.checksum/config_utils (apps/v1/Deployment/swh-cassandra/storage-replayer-origin-visit-status)
± value change
- c43dc08ca2b696716ab6bab9434a3bb8e6aafe02c326ca2ddbe2cb0b946c2e1f
+ 40879a26fb269aa7fb852180b3de6f7a3ab7a6910d5a00be3a6efe09cee885c1
spec.template.metadata.annotations.checksum/config_utils (apps/v1/Deployment/swh-cassandra/storage-replayer-raw-extrinsic-metadata)
± value change
- c43dc08ca2b696716ab6bab9434a3bb8e6aafe02c326ca2ddbe2cb0b946c2e1f
+ 40879a26fb269aa7fb852180b3de6f7a3ab7a6910d5a00be3a6efe09cee885c1
spec.template.metadata.annotations.checksum/config_utils (apps/v1/Deployment/swh-cassandra/storage-replayer-release)
± value change
- c43dc08ca2b696716ab6bab9434a3bb8e6aafe02c326ca2ddbe2cb0b946c2e1f
+ 40879a26fb269aa7fb852180b3de6f7a3ab7a6910d5a00be3a6efe09cee885c1
spec.template.metadata.annotations.checksum/config_utils (apps/v1/Deployment/swh-cassandra/storage-replayer-revision)
± value change
- c43dc08ca2b696716ab6bab9434a3bb8e6aafe02c326ca2ddbe2cb0b946c2e1f
+ 40879a26fb269aa7fb852180b3de6f7a3ab7a6910d5a00be3a6efe09cee885c1
spec.template.metadata.annotations.checksum/config_utils (apps/v1/Deployment/swh-cassandra/storage-replayer-skipped-content)
± value change
- c43dc08ca2b696716ab6bab9434a3bb8e6aafe02c326ca2ddbe2cb0b946c2e1f
+ 40879a26fb269aa7fb852180b3de6f7a3ab7a6910d5a00be3a6efe09cee885c1
spec.template.metadata.annotations.checksum/config_utils (apps/v1/Deployment/swh-cassandra/storage-replayer-snapshot)
± value change
- c43dc08ca2b696716ab6bab9434a3bb8e6aafe02c326ca2ddbe2cb0b946c2e1f
+ 40879a26fb269aa7fb852180b3de6f7a3ab7a6910d5a00be3a6efe09cee885c1
spec.template.metadata.annotations.checksum/backend-utils (apps/v1/Deployment/swh-cassandra/storage-cassandra-azure-readonly)
± value change
- c43dc08ca2b696716ab6bab9434a3bb8e6aafe02c326ca2ddbe2cb0b946c2e1f
+ 40879a26fb269aa7fb852180b3de6f7a3ab7a6910d5a00be3a6efe09cee885c1
spec.template.metadata.annotations.checksum/backend-utils (apps/v1/Deployment/swh-cassandra/storage-cassandra-readonly)
± value change
- c43dc08ca2b696716ab6bab9434a3bb8e6aafe02c326ca2ddbe2cb0b946c2e1f
+ 40879a26fb269aa7fb852180b3de6f7a3ab7a6910d5a00be3a6efe09cee885c1
spec.template.metadata.annotations.checksum/backend-utils (apps/v1/Deployment/swh-cassandra/storage-cassandra-readonly-internal)
± value change
- c43dc08ca2b696716ab6bab9434a3bb8e6aafe02c326ca2ddbe2cb0b946c2e1f
+ 40879a26fb269aa7fb852180b3de6f7a3ab7a6910d5a00be3a6efe09cee885c1
spec.template.metadata.annotations.checksum/config (apps/v1/Deployment/swh-cassandra/web-cassandra)
± value change
- a6bfbe7ca18bd6e42ba07d19d60e52c836a0d76e3fd07891d69be61e13fb4395
+ 2f816135212d6f85adea22863dc0ce641f43d702219f848c9540f5ca9ad4cd0f
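Side note on the three backend-utils scripts in the diffs above: they all repeat the same two patterns, namely checking that required environment variables are set and polling for a witness file. A minimal sketch of how these could be factored follows. The names `require_env` and `wait_for_witness` and the optional `max_tries` bound are hypothetical and not part of this MR; the chart scripts wait indefinitely.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of the chart: illustration only.
set -eu

# Fail when any of the named environment variables is unset or empty.
require_env() {
  local name
  for name in "$@"; do
    # ${!name:-} is bash indirect expansion; ":-" keeps `set -u` from
    # aborting before the explicit message below is printed.
    if [ -z "${!name:-}" ]; then
      echo "<${name}> env variable must be set" >&2
      return 1
    fi
  done
}

# Poll until the witness file exists, sleeping `period` seconds between checks.
# max_tries (assumption, default 0 = wait forever) bounds the number of checks.
wait_for_witness() {
  local witness=$1 period=$2 max_tries=${3:-0} tries=0
  while [ ! -f "$witness" ]; do
    tries=$((tries + 1))
    if [ "$max_tries" -gt 0 ] && [ "$tries" -ge "$max_tries" ]; then
      echo "${witness} still missing after ${tries} checks" >&2
      return 1
    fi
    echo "${witness} not present, waiting..."
    sleep "$period"
  done
}
```

With such helpers, a script like graph-wait-for-dataset.sh would reduce to `require_env WITNESS_FILE PERIOD` followed by `wait_for_witness "$WITNESS_FILE" "$PERIOD"`. Quoting the variables also avoids word splitting if a path ever contains spaces.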