Better handling of unicode characters in sample names #292

Closed
isaacovercast opened this issue May 3, 2018 · 2 comments

@isaacovercast
Collaborator

Writing pandas DataFrames to file buffers fails when sample names contain unicode (non-ASCII) characters. Each assembly step completes, but writing the stats files then fails with:

"Encountered an unexpected error (see ./ipyrad_log.txt)
Error message is below -------------------------------
writelines() argument must be a sequence of strings"

To reproduce:
Edit any of the simulated barcodes files and swap one of the letters in a sample name for ö, then run the assembly steps.
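The edit described above can be scripted. This is only a rough sketch: it assumes each line of the barcodes file is a sample name followed by whitespace and a barcode, and the helper name `inject_unicode`, the file names, and the two example rows are made up for illustration rather than taken from ipyrad.

```python
import io
import os
import tempfile


def inject_unicode(src_path, dst_path, replacement=u"\u00f6"):
    """Copy a barcodes file, swapping the first letter of the first sample name.

    Assumes each line is "<sample name><whitespace><barcode>".
    """
    with io.open(src_path, "r", encoding="utf-8") as infile:
        lines = infile.read().splitlines()

    # Split only on the first run of whitespace so the barcode is left untouched.
    name, rest = lines[0].split(None, 1)
    for i, ch in enumerate(name):
        if ch.isalpha():
            name = name[:i] + replacement + name[i + 1:]
            break
    lines[0] = u"{}\t{}".format(name, rest)

    with io.open(dst_path, "w", encoding="utf-8") as out:
        out.write(u"\n".join(lines) + u"\n")


# Demo on a throwaway file with two made-up rows.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "barcodes.txt")
dst = os.path.join(workdir, "barcodes_unicode.txt")
with io.open(src, "w", encoding="utf-8") as out:
    out.write(u"1A_0\tCATCAT\n1B_0\tGGGAAT\n")

inject_unicode(src, dst)
with io.open(dst, "r", encoding="utf-8") as infile:
    mutated = infile.read()
```

Pointing the params file at the modified barcodes file and rerunning the steps should then exercise the failing stats-writing path.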

Useful/related:
pandas-dev/pandas#680
https://stackoverflow.com/questions/38786936/pandas-convert-unicode-strings-to-string

@isaacovercast
Collaborator Author

Use io.open to open the output file, which lets you set the encoding to utf-8. This was useful: https://stackoverflow.com/questions/6048085/writing-unicode-text-to-a-text-file
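For reference, a minimal sketch of the io.open approach, using only the standard library. The function names `write_stats` and `read_stats`, the sample names, and the tab-separated layout are hypothetical; this is not the actual ipyrad stats-writing code, just an illustration of opening the file with an explicit encoding so that unicode sample names can be written.

```python
import io
import os
import tempfile

# Hypothetical sample names; the second one contains a non-ASCII letter (ö).
sample_names = [u"sample_A", u"sample_\u00f6"]


def write_stats(path, names):
    """Write one tab-separated line per sample name, encoded as UTF-8."""
    # io.open accepts an encoding argument on both Python 2 and Python 3.
    with io.open(path, "w", encoding="utf-8") as out:
        for i, name in enumerate(names):
            out.write(u"{}\t{}\n".format(name, i))


def read_stats(path):
    """Read the file back, decoding it as UTF-8."""
    with io.open(path, "r", encoding="utf-8") as infile:
        return infile.read()


stats_path = os.path.join(tempfile.mkdtemp(), "stats.txt")
write_stats(stats_path, sample_names)
text = read_stats(stats_path)
```

The same file handle could be given to anything that writes unicode text to a buffer, with the encoding handled by the handle rather than by each caller.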

@isaacovercast
Collaborator Author

Fixed in v.0.7.24
