Skip to content

turning json option into macro parameter #308

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Mar 10, 2025

Conversation

lemire
Copy link
Member

@lemire lemire commented Mar 9, 2025

We have added a few options (such as json_fmt). These trigger a short series of branches in our parse_number_string function. Unfortunately, this is costing us a bit of performance. We can get it back by turning the runtime parameters into template parameters... there is still a branch, but it is more direct and likely extremely cheap.

Building:

cmake -B build -D FASTFLOAT_BENCHMARKS=ON
cmake --build build

Let us benchmark on an Apple M2 processor (LLVM 16). These benchmarks are part of our library.

Before:

⚡  ./build/benchmarks/realbenchmark
####
# reading /Users/dlemire/CVS/github/fast_float/bui/benchmarks/data/canada.txt
####
# read 111126 lines
ASCII volume = 1.93374 MB
fastfloat (64)                          :  1166.05 MB/s (+/- 8.5 %)    67.01 Mfloat/s
fastfloat (32)                          :  1211.52 MB/s (+/- 7.5 %)    69.62 Mfloat/s
UTF-16 volume = 3.86749 MB
fastfloat (64)                          :  2187.60 MB/s (+/- 5.6 %)    62.86 Mfloat/s
fastfloat (32)                          :  2269.15 MB/s (+/- 4.6 %)    65.20 Mfloat/s
####
# reading /Users/dlemire/CVS/github/fast_float/bui/benchmarks/data/mesh.txt
####
# read 73019 lines
ASCII volume = 0.536009 MB
fastfloat (64)                          :   773.74 MB/s (+/- 4.7 %)   105.40 Mfloat/s
fastfloat (32)                          :   686.86 MB/s (+/- 4.3 %)    93.57 Mfloat/s
UTF-16 volume = 1.07202 MB
fastfloat (64)                          :  1551.87 MB/s (+/- 4.8 %)   105.70 Mfloat/s
fastfloat (32)                          :  1396.92 MB/s (+/- 6.0 %)    95.15 Mfloat/s

After:

⚡  ./build/benchmarks/realbenchmark
####
# reading /Users/dlemire/CVS/github/fast_float/build/benchmarks/data/canada.txt
####
# read 111126 lines
ASCII volume = 1.93374 MB
fastfloat (64)                          :  1175.62 MB/s (+/- 4.9 %)    67.56 Mfloat/s
fastfloat (32)                          :  1224.92 MB/s (+/- 4.1 %)    70.39 Mfloat/s
UTF-16 volume = 3.86749 MB
fastfloat (64)                          :  2220.99 MB/s (+/- 4.4 %)    63.82 Mfloat/s
fastfloat (32)                          :  2292.35 MB/s (+/- 4.4 %)    65.87 Mfloat/s
####
# reading /Users/dlemire/CVS/github/fast_float/build/benchmarks/data/mesh.txt
####
# read 73019 lines
ASCII volume = 0.536009 MB
fastfloat (64)                          :   873.69 MB/s (+/- 2.0 %)   119.02 Mfloat/s
fastfloat (32)                          :   785.65 MB/s (+/- 4.3 %)   107.03 Mfloat/s
UTF-16 volume = 1.07202 MB
fastfloat (64)                          :  1638.44 MB/s (+/- 1.8 %)   111.60 Mfloat/s
fastfloat (32)                          :  1548.04 MB/s (+/- 4.2 %)   105.44 Mfloat/s

On the mesh dataset, we get a performance boost of about 15%.

This might be related to PR #307 by @IRainman

@dalle
Copy link
Collaborator

dalle commented Mar 9, 2025

Looks good. Do you think there be a macro for if-constexpr for C++17 enabled compilers?

Something like this?

#if __cplusplus >= 201703L || (defined(_MSVC_LANG) && _MSVC_LANG >= 201703L)
#define FASTFLOAT_IF_CONSTEXPR17(x) if constexpr (x)
#else
#define FASTFLOAT_IF_CONSTEXPR17(x) if (x)
#endif

...
FASTFLOAT_IF_CONSTEXPR17(basic_json_fmt)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you move the #include "event_counter.h" at the top of the file outside of the ifdef? That way it will compile on Windows too.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok

@IRainman
Copy link

IRainman commented Mar 9, 2025

Nice! Also the code definitely more cache friendly, it's shown by more constant time results on output.

Copy link
Collaborator

@dalle dalle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I commented somewhere else that we could explore using if-constexpr, but it's up to you. I'm a bit worried of "constant expression in if-statement" warnings appearing.

@IRainman
Copy link

IRainman commented Mar 9, 2025

Also about std::from_chars and compatibility, as you can remember, the standard regexp isn't ideal and the fmt library, especially with compile time parsing macros
fmt::format(FMT_COMPILE("some value: {}"), value));
much faster than std implementation and generates very compact code.

@IRainman
Copy link

IRainman commented Mar 9, 2025

Looks good. I commented somewhere else that we could explore using if-constexpr, but it's up to you. I'm a bit worried of "constant expression in if-statement" warnings appearing.

It's probably fine because it's old code and in the std function should be consteval in used in constexpr if. Maybe I should write a defect report for standard std::is_constant_evaluated. What do you think?

@lemire
Copy link
Member Author

lemire commented Mar 9, 2025

@dalle

I am adding

#if defined(__cpp_if_constexpr) && __cpp_if_constexpr >= 201606L
#define FASTFLOAT_IF_CONSTEXPR17(x) if constexpr (x)
#else
#define FASTFLOAT_IF_CONSTEXPR17(x) if (x)
#endif

@IRainman
Copy link

IRainman commented Mar 9, 2025

Maybe it's better to convert all options into a template parameter?

@lemire
Copy link
Member Author

lemire commented Mar 10, 2025

Maybe it's better to convert all options into a template parameter?

Well. That's an empirical question (i.e., see benchmarks).

@lemire lemire merged commit e019768 into main Mar 10, 2025
27 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants