
common/converters: Harmonize UTF-8 decoding errors handling

Remove the special handling of UTF-8 decoding errors for the "message" value of a swh revision dictionary and use the global decoding error handler instead.

As a reminder, the global handler for UTF-8 decoding errors performs the following actions:

  • It collects the names of all keys in a dictionary whose values failed UTF-8 decoding into a list and stores that list under a new key, "decoding_failures".

  • A string that could not be decoded will have the bytes of its invalid UTF-8 sequences escaped.
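
The two actions above can be illustrated with a minimal sketch. This is not the actual swh converter code: the function names `decode_value` and `decode_dict` are hypothetical, and the escaping is assumed here to behave like Python's built-in `backslashreplace` error handler.

```python
from typing import Any, Dict, List, Tuple


def decode_value(value: bytes) -> Tuple[str, bool]:
    """Decode bytes as UTF-8; return the text and whether decoding failed.

    Hypothetical helper: on failure, invalid bytes are escaped (e.g. \\xe9)
    using Python's built-in "backslashreplace" error handler.
    """
    try:
        return value.decode("utf-8"), False
    except UnicodeDecodeError:
        return value.decode("utf-8", errors="backslashreplace"), True


def decode_dict(data: Dict[str, Any]) -> Dict[str, Any]:
    """Decode all bytes values and record the keys that failed to decode."""
    result: Dict[str, Any] = {}
    failures: List[str] = []
    for key, value in data.items():
        if isinstance(value, bytes):
            text, failed = decode_value(value)
            result[key] = text
            if failed:
                failures.append(key)
        else:
            # Non-bytes values are passed through unchanged.
            result[key] = value
    if failures:
        # Only added when at least one value could not be decoded.
        result["decoding_failures"] = failures
    return result


revision = decode_dict({"message": b"caf\xe9", "author": b"Jane"})
```

In this sketch, a revision whose "message" contains an invalid byte keeps a readable, escaped message and lists "message" under "decoding_failures", while cleanly decoded dictionaries get no extra key.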

Also add a section about UTF-8 decoding errors to the top-level API documentation.

Closes #2617 (closed)


Migrated from D4003 (view on Phabricator)
