Skip to content

Added emitUTF8 setting. #1045

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Oct 17, 2019
Merged

Conversation

nicolaswilson
Copy link
Contributor

This PR adds a new setting 'emitUTF8' which avoids unicode escaping the emitted JSON.

@@ -267,7 +267,7 @@ static String toHex16Bit(unsigned int x) {
return result;
}

static String valueToQuotedStringN(const char* value, unsigned length) {
static String valueToQuotedStringN(const char* value, unsigned length, bool emitUTF8 = false ) {
Copy link
Member

@dota17 dota17 Oct 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's better to overload it instead of rewriting it, I guess?

etc, use it outside the method

if (_emitUTF8) {
  ....
} else {
  valueToQuotedStringN(value, length);
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to use the escaping of the tabs, newlines etc. in the first part of the function and then either write the UTF-8 characters or unicode escape them in the second part. Perhaps I should split the function into those separate pieces? What's your view?

Copy link
Contributor

@baylesj baylesj Oct 10, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually prefer it as is. There's nothing wrong with the default argument usage here, especially since most of the downsides of default arguments don't apply to static methods.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm glad that you're all considering binary compatibility. Thank you.

Yes, internal "statics" are hidden from the public API, so either way is fine with me.

This change is a nice idea.

@dota17
Copy link
Member

dota17 commented Oct 9, 2019

In addition, it's better to write some testcases to test this change.

@nicolaswilson
Copy link
Contributor Author

In addition, it's better to write some testcases to test this change.

I'll add some tests this evening.

Copy link
Contributor

@baylesj baylesj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to add this feature, thanks for writing this up. If you can address my feedback, I'll look at submitting it.

@@ -267,7 +267,7 @@ static String toHex16Bit(unsigned int x) {
return result;
}

static String valueToQuotedStringN(const char* value, unsigned length) {
static String valueToQuotedStringN(const char* value, unsigned length, bool emitUTF8 = false ) {
Copy link
Contributor

@baylesj baylesj Oct 10, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually prefer it as is. There's nothing wrong with the default argument usage here, especially since most of the downsides of default arguments don't apply to static methods.

@fseasy
Copy link

fseasy commented Oct 11, 2019

Great! Thanks!
How about change the emitUTF8 to some other name? It is not clearly at the first glance for the emit word mean. How about noUTF8Decode or toUnicode (may be toUnicodeFromUTF8 most clearly, but too long...) ?
Just a naive option, Thanks 🚀

@dota17
Copy link
Member

dota17 commented Oct 14, 2019

Good for it.
Could you fix the clang-format error of travis-ci?Recently, we added clang format support to the travis bots.

@Sublihim
Copy link

Sublihim commented Jun 1, 2020

why didn't we add a description to jsoncpp/include/json/writer.h?

@Arminius
Copy link

Thanks so much for adding in this functionality, it's desperately needed. But I'm still baffled why it's opt-in?
Have a look at the JSON spec (RFC 8259):

8.1. Character Encoding

JSON text exchanged between systems that are not part of a closed
ecosystem MUST be encoded using UTF-8 [RFC3629].

UTF-8 encoding absolutely should be the defaut.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants