uucore: format: num_parser: Use ExtendedBigDecimal #7556

drinkcat · 2025-03-24T08:32:37Z

Stacks on top of #7514.

The main direct advantage of this change is that the printf parser is a bit more accurate (more corner cases are covered), and can now print floats with arbitrary precision.

A follow-up CL will move seq to use these parsing functions as well, so that we do not have to maintain and fix 2 different sets of parsing functions like we do now.

Change the integer parsing code to use ExtendedBigDecimal internally.
Add support for scientific number parsing.
Fix a few corner cases (leading +, no binary floats, etc.).

printf "%.24f %.24a\n" 0.1 0.1 after this PR:

0.100000000000000000000000 0xc.cccccccccccccccccccccccdp-7

main used to print (f64 precision):

0.100000000000000005551115 0xc.cccccccccccd000000000000p-7

This is more precise than what GNU coreutils can do (f80 precision on x86):

0.100000000000000000001355 0xc.ccccccccccccccd000000000p-7
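The difference between the first two outputs can be reproduced with the Rust standard library alone. The following is a minimal sketch, not the uucore code: `format_exact_decimal` and `format_via_f64` are hypothetical helpers, and the exact variant simply keeps the decimal digits of the argument (truncating, no sign or exponent support) instead of going through a binary float.

```rust
/// Formats a plain decimal literal such as "0.1" with `places` fractional
/// digits by working on the digits directly, so no binary rounding occurs.
/// Hypothetical helper: truncates rather than rounds, and assumes an
/// unsigned literal without an exponent.
fn format_exact_decimal(literal: &str, places: usize) -> String {
    let (int_part, frac_part) = literal.split_once('.').unwrap_or((literal, ""));
    let mut frac: String = frac_part.chars().take(places).collect();
    // The decimal value is exact, so every missing digit is a zero.
    while frac.len() < places {
        frac.push('0');
    }
    format!("{}.{}", int_part, frac)
}

/// Formats the same literal after converting it to f64 first. The nearest
/// f64 to 0.1 is slightly larger than 0.1, which shows up in the output.
fn format_via_f64(literal: &str, places: usize) -> String {
    let value: f64 = literal.parse().expect("valid float literal");
    format!("{:.*}", places, value)
}
```

Calling `format_via_f64("0.1", 24)` gives the old f64-precision output quoted above, while `format_exact_decimal("0.1", 24)` gives a 1 followed by 23 zeros after the decimal point.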

uucore: format: num_parser: Make it clear that scale can only be positive

After scratching my head a bit about why the hexadecimal code works,
it seems better to make scale a u64 to clarify.

Note that this u64 may exceed i64 capacity, but that can only
happen if the number of digits provided is > 2**63 (impossible).

uucore: format: num_parser: Parse exponent part of floating point numbers

Parse numbers like 123.15e15 and 0xfp-2, and add some tests
for that.

parse is becoming more and more of a monster: we should consider
splitting it into multiple parts.

Fixes #7474.
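To illustrate what parsing the exponent part involves for the decimal case, here is a minimal sketch. It is not the `parse` function from this PR: `parse_decimal_with_exponent` is a hypothetical helper, it uses `u128` where the real code works with ExtendedBigDecimal, and it ignores signs on the mantissa, hexadecimal floats and special values.

```rust
/// Parses an unsigned decimal literal with an optional exponent, such as
/// "123.15e15", into (digits, exponent) where value = digits * 10^exponent.
/// Hypothetical helper for illustration only.
fn parse_decimal_with_exponent(text: &str) -> Option<(u128, i64)> {
    // Split off the exponent part, if there is one.
    let (mantissa, exp_text) = match text.find(|c: char| c == 'e' || c == 'E') {
        Some(pos) => (&text[..pos], Some(&text[pos + 1..])),
        None => (text, None),
    };
    // An 'e' with nothing valid after it is rejected in this sketch.
    let explicit_exp: i64 = match exp_text {
        Some(e) => e.parse().ok()?,
        None => 0,
    };
    let (int_part, frac_part) = mantissa.split_once('.').unwrap_or((mantissa, ""));
    if int_part.is_empty() && frac_part.is_empty() {
        return None;
    }
    // Accumulate all mantissa digits as one integer.
    let mut digits: u128 = 0;
    for c in int_part.chars().chain(frac_part.chars()) {
        let d = c.to_digit(10)? as u128;
        digits = digits.checked_mul(10)?.checked_add(d)?;
    }
    // Each fractional digit lowers the power of ten by one.
    let scale = frac_part.len() as i64;
    Some((digits, explicit_exp.checked_sub(scale)?))
}
```

With this sketch, "123.15e15" becomes digits 12315 with exponent 13. The hexadecimal form follows the same idea with a power of two instead: 0xfp-2 is 15 * 2^-2 = 3.75.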

uucore: format: num_parser: Fix large hexadecimal float parsing

Large numbers can overflow u64 when doing 16u64.pow(scale):
do the operation on BigInt/BigDecimal instead.

Also, add a test. Wolfram Alpha can help confirm the decimal number
is correct (16-16**-21).
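The overflow is easy to see with checked arithmetic. The sketch below is an assumption-laden illustration rather than the actual fix: `hex_mantissa`, `denominator_u64` and `denominator_wide` are hypothetical names, and `u128` stands in for the BigInt/BigDecimal types that the real code uses.

```rust
/// Splits a hex float mantissa such as "f.ff" (no "0x" prefix) into
/// (digits, scale), where the value equals digits / 16^scale.
/// `u128` stands in for a big integer in this sketch.
fn hex_mantissa(text: &str) -> Option<(u128, u32)> {
    let mut digits: u128 = 0;
    let mut scale: u32 = 0;
    let mut seen_point = false;
    for c in text.chars() {
        if c == '.' && !seen_point {
            seen_point = true;
            continue;
        }
        // A second '.' or any non-hex character makes to_digit fail.
        let d = c.to_digit(16)? as u128;
        digits = digits.checked_mul(16)?.checked_add(d)?;
        if seen_point {
            scale += 1;
        }
    }
    Some((digits, scale))
}

/// 16^scale computed in u64; None once the result no longer fits,
/// which already happens at scale = 16 (16^16 = 2^64).
fn denominator_u64(scale: u32) -> Option<u64> {
    16u64.checked_pow(scale)
}

/// The same power in a wider type; sufficient here up to scale = 31.
fn denominator_wide(scale: u32) -> Option<u128> {
    16u128.checked_pow(scale)
}
```

For a mantissa of one f, a point, and 21 more f digits, the digits are 16^22 - 1 and the scale is 21, so the value is (16^22 - 1) / 16^21 = 16 - 16^-21, matching the number mentioned above. `denominator_u64(21)` returns None while `denominator_wide(21)` succeeds.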

uucore: format: Use ExtendedBigDecimal in argument code

Provides arbitrary precision for float parsing in printf.

Also add a printf test for that.

uucore: format: num_parser: Add parser for ExtendedBigDecimal

Very simple as the f64 parser actually uses that as an intermediary
value.

Add a few tests too.

uucore: format: num_parser: Disallow binary number parsing for floats

Fixes #7487.

Also, add more tests for leading zeros not getting parsed as octal
when dealing with floats.
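A rough sketch of the prefix rule this commit describes, under stated assumptions: `Base`, `detect_base` and the `integer_target` flag are hypothetical names for illustration, and the input is assumed to have its sign already removed. Binary and octal prefixes are only honoured for integer targets, while hexadecimal is accepted for both.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum Base {
    Binary,
    Octal,
    Decimal,
    Hexadecimal,
}

/// Picks the base from the prefix of a sign-free literal and returns the
/// remaining text. `integer_target` is a hypothetical flag: true when the
/// caller wants an integer, false when it wants a float.
fn detect_base(text: &str, integer_target: bool) -> (Base, &str) {
    // Look at the first two bytes, case-insensitively ("0X" and "0B" too).
    let prefix = text.get(..2).map(|p| p.to_ascii_lowercase());
    match prefix.as_deref() {
        Some("0x") => (Base::Hexadecimal, &text[2..]),
        // Binary is only recognised for integers.
        Some("0b") if integer_target => (Base::Binary, &text[2..]),
        // A leading zero means octal for integers only.
        _ if integer_target && text.len() > 1 && text.starts_with('0') => {
            (Base::Octal, &text[1..])
        }
        // For floats, "010" stays decimal and "0b1" is left for the
        // decimal parser, which would stop at the 'b'.
        _ => (Base::Decimal, text),
    }
}
```

In this sketch `detect_base("010", true)` selects octal, while `detect_base("010", false)` keeps the whole text as decimal.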

uucore: format: num_parser: allow leading + sign when parsing

Leading plus signs are allowed for all formats.

Add tests (including some tests for negative i64 values, and mixed
case special values that sprang to mind).

Fixes #7473.
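As an illustration of the sign handling and of why negative i64 values deserve their own tests, here is a small sketch. `split_sign` and `parse_i64` are hypothetical helpers, not functions from this PR, and only plain decimal digits are handled.

```rust
/// Strips at most one leading sign and reports whether it was a minus.
fn split_sign(text: &str) -> (bool, &str) {
    if let Some(rest) = text.strip_prefix('-') {
        (true, rest)
    } else if let Some(rest) = text.strip_prefix('+') {
        (false, rest)
    } else {
        (false, text)
    }
}

/// Parses a signed decimal integer into i64, or None if it is malformed
/// or out of range.
fn parse_i64(text: &str) -> Option<i64> {
    let (negative, rest) = split_sign(text);
    // Only digits may follow the sign, so inputs like "+-5" are rejected.
    if rest.is_empty() || !rest.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    // Accumulate in a wider type: the magnitude of i64::MIN does not fit
    // in i64, so the sign must be applied before the range check.
    let magnitude: i128 = rest.parse().ok()?;
    let signed = if negative { -magnitude } else { magnitude };
    i64::try_from(signed).ok()
}
```

Here "+42" parses to 42, "-9223372036854775808" parses to `i64::MIN`, and the same digits without the minus sign are rejected as out of range.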

uucore: format: num_parser: Turn parser into a trait

We call the function extended_parse, so that we do not clash
with other parsing functions in other traits.

Also implement parser for ExtendedBigDecimal (straightforward).
Base doesn't need to be public anymore.
Rename the error to ExtendedParserError.
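The shape of such a trait might look like the sketch below. Only the names `extended_parse` and `ExtendedParserError` come from this commit message; the trait name `ExtendedParser`, the signature, the error variants and the implementations (which just delegate to the standard library) are simplified assumptions and may differ from the real code.

```rust
use std::num::IntErrorKind;

/// Simplified stand-in for the error type; the real variants may differ.
#[derive(Debug, PartialEq)]
enum ExtendedParserError {
    NotNumeric,
    Overflow,
}

/// One trait with a distinctly named method, implemented per target type,
/// so the call does not collide with `str::parse` or `FromStr::from_str`.
trait ExtendedParser: Sized {
    fn extended_parse(input: &str) -> Result<Self, ExtendedParserError>;
}

impl ExtendedParser for i64 {
    fn extended_parse(input: &str) -> Result<Self, ExtendedParserError> {
        // Delegates to the standard parser purely for illustration.
        input.parse::<i64>().map_err(|e| match e.kind() {
            IntErrorKind::PosOverflow | IntErrorKind::NegOverflow => {
                ExtendedParserError::Overflow
            }
            _ => ExtendedParserError::NotNumeric,
        })
    }
}

impl ExtendedParser for f64 {
    fn extended_parse(input: &str) -> Result<Self, ExtendedParserError> {
        input
            .parse::<f64>()
            .map_err(|_| ExtendedParserError::NotNumeric)
    }
}
```

Callers then pick the target type at the call site, for example `i64::extended_parse("42")` or `f64::extended_parse("1.5e3")`.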

uucore: format: num_parser: Fold special value parsing in main parsing function

uucore: format: num_parser: Use ExtendedBigDecimal for internal representation

ExtendedBigDecimal already provides everything we need, use that
instead of a custom representation.
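To make "everything we need" more concrete, here is a simplified model of such a type together with special value parsing folded into the same code path. This is a sketch under assumptions: `Number` is a stand-in, not the real ExtendedBigDecimal definition, its variant names are guesses, and a pair of integers replaces the big decimal payload. `parse_special` is a hypothetical helper.

```rust
/// Simplified stand-in for an extended decimal: a finite value plus the
/// special values a plain decimal type cannot represent.
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum Number {
    /// value = digits / 10^scale (the real type would hold a big decimal).
    Finite { digits: i128, scale: u64 },
    Infinity,
    MinusInfinity,
    /// A decimal zero carries no sign, so negative zero needs its own case.
    MinusZero,
    Nan,
    MinusNan,
}

/// Matches the special spellings case-insensitively, after the sign has
/// been stripped by the caller. Returns None for ordinary numbers.
fn parse_special(negative: bool, rest: &str) -> Option<Number> {
    match rest.to_ascii_lowercase().as_str() {
        "inf" | "infinity" => Some(if negative {
            Number::MinusInfinity
        } else {
            Number::Infinity
        }),
        "nan" => Some(if negative {
            Number::MinusNan
        } else {
            Number::Nan
        }),
        _ => None,
    }
}
```

Mixed-case inputs such as "Inf" or "NaN" map to the same variants as their lowercase forms in this sketch.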

github-actions · 2025-03-24T09:08:18Z

GNU testsuite comparison:

Congrats! The gnu test tests/misc/stdbuf is no longer failing!

github-actions · 2025-03-25T08:59:20Z

GNU testsuite comparison:

Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

github-actions · 2025-03-25T13:25:51Z

GNU testsuite comparison:

Skip an intermittent issue tests/misc/stdbuf (fails in this run but passes in the 'main' branch)

Copilot

Pull Request Overview

This PR updates the number parsing functionality to use ExtendedBigDecimal, allowing arbitrary precision formatting and improved handling of corner cases in float parsing. Key changes include:

Updating printf parsing to use ExtendedBigDecimal for increased precision.
Adding support for scientific notation and fixing corner cases (leading plus sign, no binary floats, large hexadecimal floats).
Refactoring functions and traits (such as replacing get_f64 with get_extended_big_decimal) to consistently work with ExtendedBigDecimal.

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

File	Description
tests/by-util/test_printf.rs	Added a test to verify high precision float formatting using ExtendedBigDecimal
src/uucore/src/lib/features/format/spec.rs	Updated parsing logic to obtain ExtendedBigDecimal from arguments
src/uucore/src/lib/features/format/extendedbigdecimal.rs	Added Default implementation for ExtendedBigDecimal
src/uucore/src/lib/features/format/argument.rs	Refactored argument parsing to use ExtendedBigDecimal and ExtendedParserError

github-actions · 2025-03-29T15:18:31Z

GNU testsuite comparison:

Skip an intermittent issue tests/misc/stdbuf (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

drinkcat · 2025-03-29T16:22:43Z

(network flake in CI, unrelated)


github-actions · 2025-03-31T08:38:12Z

GNU testsuite comparison:

Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

Copilot

Pull Request Overview

This PR refactors the numerical parsing and formatting in uucore to use ExtendedBigDecimal, resulting in increased precision and support for scientific notation while addressing various corner cases.

Converts integer and float parsing to use ExtendedBigDecimal for arbitrary precision.
Adds support for scientific number parsing and corrects edge cases (e.g., leading '+' signs, prevention of binary float parsing).
Updates tests and adjusts error handling to align with the new parser design.

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

File	Description
tests/by-util/test_printf.rs	Adds a new test for high-precision float output using ExtendedBigDecimal.
src/uucore/src/lib/features/format/spec.rs	Refactors float parsing to use get_extended_big_decimal, removing earlier conversion from f64.
src/uucore/src/lib/features/format/extendedbigdecimal.rs	Introduces the Default trait implementation for ExtendedBigDecimal.
src/uucore/src/lib/features/format/argument.rs	Updates argument extraction to use ExtendedBigDecimal and its associated parsing functions.

RenjiSann · 2025-04-01T08:53:06Z

Thanks!

drinkcat marked this pull request as ready for review March 24, 2025 08:52

drinkcat force-pushed the parse-bigdecimal branch from 9c71ab0 to b7f0ca5 Compare March 25, 2025 08:20

drinkcat mentioned this pull request Mar 25, 2025

%a does not support setting precision (e.g. %.6a) #7429

Closed

drinkcat force-pushed the parse-bigdecimal branch 2 times, most recently from 3d8ef6b to 7cf6b2b Compare March 25, 2025 12:26

drinkcat mentioned this pull request Mar 27, 2025

uucore: format: num_format: add fmt function tests, and workaround 0e10 printing. #7514

Merged

sylvestre force-pushed the parse-bigdecimal branch from 7cf6b2b to 0529905 Compare March 29, 2025 14:43

sylvestre requested a review from Copilot March 29, 2025 15:03

Copilot AI reviewed Mar 29, 2025

View reviewed changes

drinkcat added 10 commits March 31, 2025 10:04

uucore: format: num_parser: Use ExtendedBigDecimal for internal repre…

20add88

…sentation ExtendedBigDecimal already provides everything we need, use that instead of a custom representation.

uucore: format: num_parser: Fold special value parsing in main parsin…

8bbec16

…g function

uucore: format: num_parser: allow leading + sign when parsing

97e333c

Leading plus signs are allowed for all formats. Add tests (including some tests for negative i64 values, and mixed case special values that sprang to mind). Fixes uutils#7473.

uucore: format: num_parser: Disallow binary number parsing for floats

d7502e4

Fixes uutils#7487. Also, add more tests for leading zeros not getting parsed as octal when dealing with floats.

uucore: format: num_parser: Add parser for ExtendedBigDecimal

71a2854

Very simple as the f64 parser actually uses that as an intermediary value. Add a few tests too.

uucore: format: Use ExtendedBigDecimal in argument code

b5a6585

Provides arbitrary precision for float parsing in printf. Also add a printf test for that.

uucore: format: num_parser: Fix large hexadecimal float parsing

55773e9

Large numbers can overflow u64 when doing 16u64.pow(scale): do the operation on BigInt/BigDecimal instead. Also, add a test. Wolfram Alpha can help confirm the decimal number is correct (16-16**-21).

uucore: format: num_parser: Parse exponent part of floating point num…

bd68eb8

…bers Parse numbers like 123.15e15 and 0xfp-2, and add some tests for that. `parse` is becoming more and more of a monster: we should consider splitting it into multiple parts. Fixes uutils#7474.

drinkcat force-pushed the parse-bigdecimal branch from 0529905 to 30c89af Compare March 31, 2025 08:04

sylvestre requested a review from Copilot March 31, 2025 10:40

Copilot AI reviewed Mar 31, 2025

View reviewed changes

drinkcat mentioned this pull request Mar 31, 2025

uucore: format: Collection of small parser fixes #7623

Merged

RenjiSann merged commit ace92dc into uutils:main Apr 1, 2025
105 of 106 checks passed

BrewTestBot mentioned this pull request May 24, 2025

uutils-coreutils 0.1.0 Homebrew/homebrew-core#224645

Merged

moonfruit mentioned this pull request May 26, 2025

uutils-selected 0.1.0 moonfruit/homebrew-tap#243

Closed
