Description
In Python 3, identifiers can include non-ASCII characters. See Identifiers and keywords in the Python Reference.
Roughly speaking, the definition of letters is extended from [A-Za-z] to all Unicode letter categories. The standard re module, however, does not support Unicode character properties such as Lu, Ll, etc., and I’m not sure its standard Unicode support will be enough.
Alternatives include the drop-in, compatible regex module (which supports Unicode properties with \p{}), or manually compiling a list of character ranges to use (which gets very ugly very quickly).
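To illustrate the difference, here is a minimal sketch using only the standard library. The pattern ASCII_SNAKE_CASE is a hypothetical ASCII-only check written for this example (it is not pylint's actual regular expression), and is_unicode_snake_case is an assumed approximation of "snake_case" built on the Unicode-aware \w that re provides for str patterns in Python 3.

```python
import re

# Hypothetical ASCII-only pattern, in the spirit of a snake_case check.
# It is an illustration, not pylint's actual regular expression.
ASCII_SNAKE_CASE = re.compile(r"[a-z_][a-z0-9_]*\Z")

# Rough Unicode-aware alternative using only the standard library:
# [^\W\d] is "a word character that is not a decimal digit", which is
# roughly a letter or an underscore; \w* then allows more word characters.
UNICODE_WORD = re.compile(r"[^\W\d]\w*\Z")


def is_unicode_snake_case(name):
    """Return True if name looks like a lowercase, Unicode-aware identifier.

    Assumption: "snake_case" is approximated as "word characters only,
    not starting with a decimal digit, and unchanged by str.lower()".
    """
    return bool(UNICODE_WORD.match(name)) and name == name.lower()


if __name__ == "__main__":
    for name in ("validar_contraseña", "contraseña_válida", "ContraseñaVálida"):
        print(
            name,
            name.isidentifier(),                 # does Python accept it?
            bool(ASCII_SNAKE_CASE.match(name)),  # ASCII-only check
            is_unicode_snake_case(name),         # Unicode-aware sketch
        )
```

This is only an approximation: \w is broader than the set of characters Python allows in identifiers, and names written with combining marks (rather than precomposed characters such as ñ) would not match, so str.isidentifier() remains the more reliable test of validity.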
Steps to reproduce
- Create a UTF-8 file with, for example, the following contents:

      """Python 3 code with non-ASCII identifiers."""

      def validar_contraseña(cadena):
          """Validates a password, in Spanish.
          """
          contraseña_válida = len(cadena) >= 8
          return contraseña_válida

  (N.B.: I don’t write code with non-ASCII characters, but my students do.)
- Run pylint3 on it.
Current behavior
$ pylint3 test.py
test.py:3:0: C0103: Function name "validar_contraseña" doesn't conform to snake_case naming style (invalid-name)
test.py:6:4: C0103: Variable name "contraseña_válida" doesn't conform to snake_case naming style (invalid-name)
(I encounter this when I run pylint on my students’ code.)
Expected behavior
$ pylint3 test.py
Your code has been rated at 10.00/10
(This is what I want their code to look like. 😄)
pylint --version output
Tested with:
$ pylint3 --version
pylint3 2.2.2
astroid 2.1.0
Python 3.7.2 (default, Jan 3 2019, 02:55:40)
[GCC 8.2.0]
$ ~/.local/bin/pylint --version
pylint 2.3.0-dev1
astroid 2.2.0-dev
Python 3.7.2 (default, Jan 3 2019, 02:55:40)
[GCC 8.2.0]
Many thanks in advance.