Closed
Brief Description
I'd like to suggest adding an option to the clean_names function that removes accents from column names.
It could be implemented with the normalize function from the standard library's unicodedata module, as discussed in the Stack Overflow question "What is the best way to remove accents in a Python unicode string?"
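For illustration, here is a minimal sketch of the unicodedata approach. The helper name strip_accents is hypothetical and is not necessarily how the branch implements it; the sketch only assumes that NFD decomposition followed by dropping combining marks is the intended technique.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Hypothetical helper: remove accents via Unicode decomposition."""
    # NFD splits an accented character into its base character
    # followed by one or more combining marks (e.g. "ã" -> "a" + U+0303).
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() returns a non-zero class for combining marks,
    # so keeping only the zero-class characters drops the accents.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))
```

With this sketch, strip_accents("Käfer") returns "Kafer". Characters that have no canonical decomposition, such as "ø", pass through unchanged, so this approach would not cover every diacritic-like letter.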
I have created a branch called strip_accents and checked that the code addition does not break any tests.
mralbu/pyjanitor/tree/strip_accents
Example API
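import pandas as pd
import janitor  # importing pyjanitor registers clean_names as a DataFrame method

# create test DataFrame with accented column names
df = pd.DataFrame({"João": [1, 2], "Лука́ся": [1, 2], "Käfer": [1, 2]})
# remove column name accents (strip_accents is the proposed option)
df = df.clean_names(strip_accents=True)
# clean_names also lowercases, so the expected names are unaccented and lowercase
expected_columns = ["joao", "лукася", "kafer"]
assert set(df.columns) == set(expected_columns)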