Rust bindings to the utf8proc library supporting normalization, case-folding, and character class testing.
A statically linked binary is under 350K, yet can replace the functionality of the unicode-width and unicode-normalization crates, among others.
The utf8proc library is used for the Unicode implementation in the Julia programming language.
The underlying utf8proc library does not support any "derived properties", including the XID_Start/XID_Continue properties.
For this purpose, the unicode-ident crate is recommended.
It is very fast and needs only ~10KiB of static storage.
Emulating this functionality would be slow and/or require additional static storage.
The utf8proc library also does not support lookup or resolution of character names.
For that, consider the unicode_names2 crate.
The safe bindings do not (yet) wrap all the functionality that the C library does. PRs are welcome.
This project is licensed under the MIT License. The utf8proc project is also licensed under the MIT license, but contains data licensed under the Unicode 3.0 license. See the LICENSE.md file for more details.
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in this project by you shall be licensed under the MIT license, without any additional terms or conditions.