plot: Add support for pandas >= 2

pandas.DataFrame.append was removed in pandas 2.0, so pandas.concat
must be used instead.

We missed the issue because our CI still uses Python 3.7, and the latest
pandas version available on PyPI for that Python version is 1.3.5.
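
For illustration, the replacement can be sketched as follows. This is a
minimal example, not the actual code from swh/scanner/plot.py: the helper
name append_frames and the sample rows are hypothetical, and only the column
names and the append call are taken from the failure output.

```python
import pandas as pd


def append_frames(base, frames):
    """Concatenate `frames` after `base`, renumbering the index.

    Hypothetical helper: it stands in for the removed call
        base.append(frames, ignore_index=True)
    which only works with pandas < 2.
    """
    # pandas.concat takes a single list of frames, so the base frame
    # has to be placed in front of the frames being appended.
    return pd.concat([base, *frames], ignore_index=True)


# Made-up sample data using the column names seen in the failing test.
complete_df = pd.DataFrame(
    {"id": ["root"], "parent": [""], "contents": [5], "known": [4]}
)
df_tree_list = [
    pd.DataFrame({"id": ["bar"], "parent": ["root"], "contents": [2], "known": [1]}),
    pd.DataFrame({"id": ["foo"], "parent": ["root"], "contents": [3], "known": [3]}),
]

complete_df = append_frames(complete_df, df_tree_list)
```

pandas.concat is also available in pandas 1.x, so the same call should work
on both sides of the 2.0 boundary.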

Test failure with pandas 2 installed:

$ pytest -svv swh/scanner/tests/test_plot.py::test_build_hierarchical_df
================================================================================================================================== test session starts ==================================================================================================================================
platform linux -- Python 3.9.2, pytest-7.3.0, pluggy-1.0.0 -- /home/anlambert/.virtualenvs/swh/bin/python3
cachedir: .pytest_cache
hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase('/home/anlambert/swh/swh-environment/swh-scanner/.hypothesis/examples')
rootdir: /home/anlambert/swh/swh-environment/swh-scanner
configfile: pytest.ini
plugins: case-1.5.3, mock-3.10.0, postgresql-3.1.3, requests-mock-1.10.0, subprocess-1.5.0, anyio-3.6.2, forked-1.6.0, dash-2.9.2, redis-3.0.1, flask-1.2.0, asyncio-0.21.0, django-test-migrations-1.2.0, cov-4.0.0, httpserver-1.0.6, Faker-18.5.1, docker-compose-3.2.1, xdist-3.2.1, hypothesis-6.71.0, testinfra-7.0.0, swh.journal-1.3.1, swh.core-2.22.0
asyncio: mode=auto
collected 1 item
swh/scanner/tests/test_plot.py::test_build_hierarchical_df FAILED
======================================================================================================================================= FAILURES ========================================================================================================================================
______________________________________________________________________________________________________________________________ test_build_hierarchical_df _______________________________________________________________________________________________________________________________
source_tree = Directory(id=0a7b61ef5780b03aa274d11069564980246445ce, entries=[b'some-binary', b'link-to-another-quote', b'toexclude', b'bar', b'link-to-foo', b'foo'])
source_tree_dirs = [PosixPath('toexclude'), PosixPath('bar'), PosixPath('bar/barfoo'), PosixPath('bar/barfoo2'), PosixPath('foo')]
nodes_data = {CoreSWHID.from_string('swh:1:dir:0a7b61ef5780b03aa274d11069564980246445ce'): {'known': True}, CoreSWHID.from_string('...34c9a'): {'known': True}, CoreSWHID.from_string('swh:1:cnt:acac326ddd63b0bc70840659d4ac43619484e69f'): {'known': True}}
def test_build_hierarchical_df(source_tree, source_tree_dirs, nodes_data):
root = Path(source_tree.data["path"].decode())
dirs = [Path(dir_path) for dir_path in source_tree_dirs]
dirs_data = get_directory_data(root, source_tree, nodes_data)
max_depth = compute_max_depth(dirs)
metrics_columns = ["contents", "known"]
levels_columns = ["lev" + str(i) for i in range(max_depth)]
df_columns = levels_columns + metrics_columns
actual_df = generate_df_from_dirs(dirs_data, df_columns, max_depth)
> actual_result = build_hierarchical_df(
actual_df, levels_columns, metrics_columns, root
)
swh/scanner/tests/test_plot.py:60:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
swh/scanner/plot.py:147: in build_hierarchical_df
complete_df = complete_df.append(df_tree_list, ignore_index=True)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = Empty DataFrame
Columns: [id, parent, contents, known]
Index: [], name = 'append'
def __getattr__(self, name: str):
"""
After regular attribute access, try looking up the name
This allows simpler access to columns for interactive use.
"""
# Note: obj.x will always call obj.__getattribute__('x') prior to
# calling obj.__getattr__('x').
if (
name not in self._internal_names_set
and name not in self._metadata
and name not in self._accessors
and self._info_axis._can_hold_identifiers_and_holds_name(name)
):
return self[name]
> return object.__getattribute__(self, name)
E AttributeError: 'DataFrame' object has no attribute 'append'
../../../.virtualenvs/swh/lib/python3.9/site-packages/pandas/core/generic.py:5989: AttributeError
================================================================================================================================ short test summary info ================================================================================================================================
FAILED swh/scanner/tests/test_plot.py::test_build_hierarchical_df - AttributeError: 'DataFrame' object has no attribute 'append'
=================================================================================================================================== 1 failed in 0.75s ===================================================================================================================================