cat: Formatting performance improvement #7642

Merged: 2 commits, Apr 4, 2025

karlmcdowall
Contributor

Use the memchr library in cat to speed up newline detection.
This significantly improves performance when running with the -n, -s, -E, or -b flags.
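
For readers unfamiliar with the approach, here is a minimal sketch of the general idea, not the code from this PR: rather than inspecting every byte, jump straight to the next newline and write the bytes in between as one chunk. The helper names `find_byte` and `write_show_ends` are hypothetical. `find_byte` is a std-only stand-in with the same shape as `memchr::memchr(needle, haystack)`, so the sketch compiles without dependencies; the speedup in practice comes from swapping in the crate's optimized search.

```rust
use std::io::{self, Write};

/// Stand-in with the same signature shape as `memchr::memchr(needle, haystack)`.
/// Assumption: in a real build this would be replaced by the memchr crate.
fn find_byte(needle: u8, haystack: &[u8]) -> Option<usize> {
    haystack.iter().position(|&b| b == needle)
}

/// Hypothetical helper: copy `buf` to `out`, writing `$` before each
/// newline, similar in spirit to `cat -E`.
fn write_show_ends<W: Write>(out: &mut W, mut buf: &[u8]) -> io::Result<()> {
    // Each iteration writes everything up to the next newline in one call.
    while let Some(pos) = find_byte(b'\n', buf) {
        out.write_all(&buf[..pos])?;
        out.write_all(b"$\n")?;
        buf = &buf[pos + 1..];
    }
    // Trailing bytes with no newline are passed through unchanged.
    out.write_all(buf)
}

fn main() -> io::Result<()> {
    let stdout = io::stdout();
    let mut out = stdout.lock();
    write_show_ends(&mut out, b"first\nsecond\n\nno trailing newline")
}
```

The same scan-then-write-chunk loop would plausibly extend to -n, -b, and -s, since each of those also only needs to act at line boundaries.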

@karlmcdowall
Contributor Author

karlmcdowall commented Apr 3, 2025

Some examples of performance gains...

-E flag...

$ hyperfine -L cat /usr/bin/cat,./target/release/cat.original,./target/release/cat "{cat} -E ./wikidatawiki-20240901-pages-logging27.xml"
Benchmark 1: /usr/bin/cat -E ./wikidatawiki-20240901-pages-logging27.xml
  Time (mean ± σ):      4.649 s ±  0.060 s    [User: 4.249 s, System: 0.397 s]
  Range (min … max):    4.545 s …  4.731 s    10 runs
 
Benchmark 2: ./target/release/cat.original -E ./wikidatawiki-20240901-pages-logging27.xml
  Time (mean ± σ):      4.788 s ±  0.082 s    [User: 4.226 s, System: 0.560 s]
  Range (min … max):    4.703 s …  4.982 s    10 runs
 
Benchmark 3: ./target/release/cat -E ./wikidatawiki-20240901-pages-logging27.xml
  Time (mean ± σ):      2.934 s ±  0.039 s    [User: 2.393 s, System: 0.541 s]
  Range (min … max):    2.895 s …  2.995 s    10 runs
 
Summary
  ./target/release/cat -E ./wikidatawiki-20240901-pages-logging27.xml ran
    1.58 ± 0.03 times faster than /usr/bin/cat -E ./wikidatawiki-20240901-pages-logging27.xml
    1.63 ± 0.04 times faster than ./target/release/cat.original -E ./wikidatawiki-20240901-pages-logging27.xml
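
As a sanity check on the summary above, the ratios can be recomputed from the reported means and standard deviations. The sketch below assumes the uncertainty is combined with ordinary first-order error propagation for a quotient; the function name `speedup` is hypothetical, and the timings are copied from the -E runs.

```rust
/// Ratio of two mean timings plus its propagated uncertainty.
/// Each argument is (mean, standard deviation) in seconds.
/// Assumption: relative errors are combined in quadrature.
fn speedup(slow: (f64, f64), fast: (f64, f64)) -> (f64, f64) {
    let (slow_mean, slow_sd) = slow;
    let (fast_mean, fast_sd) = fast;
    let ratio = slow_mean / fast_mean;
    let rel_err = ((slow_sd / slow_mean).powi(2) + (fast_sd / fast_mean).powi(2)).sqrt();
    (ratio, ratio * rel_err)
}

fn main() {
    // Benchmark 3 (patched cat) is the baseline for both comparisons.
    let patched = (2.934, 0.039);
    let others = [
        ("/usr/bin/cat", (4.649, 0.060)),
        ("./target/release/cat.original", (4.788, 0.082)),
    ];
    for (name, timing) in others {
        let (ratio, err) = speedup(timing, patched);
        println!("{:.2} ± {:.2} times faster than {}", ratio, err, name);
    }
}
```

Under that assumption the computed values round to the same 1.58 ± 0.03 and 1.63 ± 0.04 shown in the summary.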

-s flag...

$ hyperfine -L cat /usr/bin/cat,./target/release/cat.original,./target/release/cat "{cat} -s ./wikidatawiki-20240901-pages-logging27.xml"
Benchmark 1: /usr/bin/cat -s ./wikidatawiki-20240901-pages-logging27.xml
  Time (mean ± σ):      4.730 s ±  0.051 s    [User: 4.312 s, System: 0.417 s]
  Range (min … max):    4.653 s …  4.811 s    10 runs
 
Benchmark 2: ./target/release/cat.original -s ./wikidatawiki-20240901-pages-logging27.xml
  Time (mean ± σ):      4.839 s ±  0.079 s    [User: 4.264 s, System: 0.574 s]
  Range (min … max):    4.734 s …  4.972 s    10 runs
 
Benchmark 3: ./target/release/cat -s ./wikidatawiki-20240901-pages-logging27.xml
  Time (mean ± σ):      3.001 s ±  0.046 s    [User: 2.456 s, System: 0.545 s]
  Range (min … max):    2.928 s …  3.076 s    10 runs
 
Summary
  ./target/release/cat -s ./wikidatawiki-20240901-pages-logging27.xml ran
    1.58 ± 0.03 times faster than /usr/bin/cat -s ./wikidatawiki-20240901-pages-logging27.xml
    1.61 ± 0.04 times faster than ./target/release/cat.original -s ./wikidatawiki-20240901-pages-logging27.xml

karlmcdowall force-pushed the cat_write_to_end_perf branch from 76708e3 to cea1634 on April 3, 2025 15:53
github-actions bot commented Apr 3, 2025

GNU testsuite comparison:

Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

@sylvestre
Contributor

Impressive!

github-actions bot commented Apr 4, 2025

GNU testsuite comparison:

Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)

karlmcdowall and others added 2 commits April 4, 2025 15:09
Use memchr library in `cat` to improve performance when detecting
newlines.
Significantly improves performance when running with -n, -s, -E, -b
flags.
Co-authored-by: Sylvestre Ledru <[email protected]>
karlmcdowall force-pushed the cat_write_to_end_perf branch from a8acb54 to 464d172 on April 4, 2025 21:09
github-actions bot commented Apr 4, 2025

GNU testsuite comparison:

Skipping an intermittent issue tests/misc/stdbuf (passes in this run but fails in the 'main' branch)
Congrats! The gnu test tests/misc/tee is no longer failing!

@sylvestre sylvestre merged commit e6ff6d5 into uutils:main Apr 4, 2025
68 checks passed
eduardorittner pushed a commit to eduardorittner/coreutils that referenced this pull request Apr 4, 2025
* cat: Formatting performance improvement

Use memchr library in `cat` to improve performance when detecting
newlines.
Significantly improves performance when running with -n, -s, -E, -b
flags.


Co-authored-by: Sylvestre Ledru <[email protected]>

---------

Co-authored-by: Sylvestre Ledru <[email protected]>