feat: benchmark suite #10804

Open: wants to merge 51 commits into base: master

yash-atreya (Member) commented Jun 18, 2025

Motivation

Closes #10548

Adds an aggregated benchmark suite for measuring the performance of forge commands such as test, build, and coverage across multiple Foundry versions and repositories.

Solution

  • Benchmarks reside in /benches.
  • Each command gets its own criterion benchmark, e.g. bench/forge_test.rs.
  • BenchmarkProject is the utility type that clones the target repos and runs forge commands on them.
  • Benchmarks are invoked via the foundry-bench binary, a CLI with --versions, --repos, and --benchmarks flags.
  • The BenchmarkResults type aggregates the criterion results from the target/criterion/* cache.
  • Finally, an aggregated results file is written that looks like this
  • Adds a benchmarks.yml workflow that enables running benchmarks manually.
  • Refer to the README.md for how to run the benchmarks.
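
For readers who want a feel for the last aggregation step, here is a minimal sketch of turning per-repo, per-version timings into the kind of markdown table the results file contains. The `Aggregated` alias and `render_table` function are hypothetical names for illustration only, not the PR's actual `BenchmarkResults` API, and the sketch assumes the mean wall times have already been read out of the criterion cache.

```rust
use std::collections::BTreeMap;

/// Hypothetical aggregate: repository name -> (version -> mean wall time in seconds).
/// This is an illustrative stand-in, not the PR's actual `BenchmarkResults` type.
type Aggregated = BTreeMap<String, BTreeMap<String, f64>>;

/// Render one benchmark's results as a markdown table with one column per version.
fn render_table(versions: &[&str], results: &Aggregated) -> String {
    let mut out = String::new();
    // Header row followed by the separator row.
    out.push_str("| Repository |");
    for v in versions {
        out.push_str(&format!(" {v} |"));
    }
    out.push('\n');
    out.push_str("| --- |");
    for _ in versions {
        out.push_str(" --- |");
    }
    out.push('\n');
    // One row per repository; a missing measurement is shown as "N/A".
    for (repo, by_version) in results {
        out.push_str(&format!("| {repo} |"));
        for v in versions {
            match by_version.get(*v) {
                Some(secs) => out.push_str(&format!(" {secs:.2} s |")),
                None => out.push_str(" N/A |"),
            }
        }
        out.push('\n');
    }
    out
}

fn main() {
    let mut results = Aggregated::new();
    results.insert(
        "solady".to_string(),
        BTreeMap::from([("stable".to_string(), 3.68), ("nightly".to_string(), 2.92)]),
    );
    print!("{}", render_table(&["stable", "nightly"], &results));
}
```

Using a `BTreeMap` keeps the repository rows in a stable, sorted order, which makes diffs between successive results files easier to read.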

Currently included

  • forge test
  • forge test - fuzz only
  • forge build with cache
  • forge build with no cache
  • forge coverage

To be addressed in a followup

  • Allowing for repo specific config such as env variables.
  • benchmarks.toml, which allows for specifying repo and version config like this
  • forge build with dynamic_test_linking
  • Invariant benches

PR Checklist

  • Added Tests
  • Added Documentation
  • Breaking changes

yash-atreya and others added 7 commits June 10, 2025 16:55
- Automated benchmarking across multiple Foundry versions using hyperfine
- Supports stable, nightly, and specific version tags (e.g., v1.0.0)
- Benchmarks 5 major Foundry projects: account, v4-core, solady, morpho-blue, spark-psm
- Tests forge test, forge build (no cache), and forge build (with cache)
- Generates comparison tables in markdown format
- Uses foundryup for version management
- Exports JSON data for detailed analysis

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
- Fix relative path issue causing JSON files to fail creation
- Convert benchmark directories to absolute paths using SCRIPT_DIR
- Improve markdown table formatting with proper column names and alignment
- Use unified table generation with string concatenation for better formatting
- Increase benchmark runs from 3 to 5 for more reliable results
- Use --prepare instead of --cleanup for better cache management
- Remove stderr suppression to catch hyperfine errors
- Update table headers to show units (seconds) for clarity

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
Contributor

@0xrusowsky 0xrusowsky left a comment

i think that the only "must" is support for env vars, the rest are nice-to-have features

yash-atreya and others added 19 commits June 25, 2025 16:56
- run forge build in parallel for forge-test bench
- switch foundry versions
- README specifying prereqs
- Add `get_benchmark_versions()` helper to read versions from env var
- Update all benchmarks to use version helper for consistency
- Add `--versions` and `--force-install` flags to shell script
- Enable all three benchmarks (forge_test, build_no_cache, build_with_cache)
- Improve error handling for corrupted forge installations
- Remove complex workarounds in favor of clear error messages

The benchmarks now support custom versions via:
  ./run_benchmarks.sh --versions stable,nightly,v1.2.0

🤖 Generated with Claude Code

Co-Authored-By: Claude <[email protected]>
* feat: criterion benches

* - setup benchmark repos in parallel
- run forge build in parallel for forge-test bench
- switch foundry versions
- README specifying prereqs

* feat: shell script to run benches

* feat: ci workflow, fix script

* update readme

* feat: enhance benchmarking suite with version flexibility

- Add `get_benchmark_versions()` helper to read versions from env var
- Update all benchmarks to use version helper for consistency
- Add `--versions` and `--force-install` flags to shell script
- Enable all three benchmarks (forge_test, build_no_cache, build_with_cache)
- Improve error handling for corrupted forge installations
- Remove complex workarounds in favor of clear error messages

The benchmarks now support custom versions via:
  ./run_benchmarks.sh --versions stable,nightly,v1.2.0

🤖 Generated with Claude Code

Co-Authored-By: Claude <[email protected]>

* latest bench

* rm notes

* remove shell based bench suite

---------

Co-authored-by: Claude <[email protected]>
* main.rs
* forge version is controlled by the bin
* parses criterion json to collect results - writes to LATEST.md
@yash-atreya yash-atreya marked this pull request as ready for review July 1, 2025 12:20
@yash-atreya yash-atreya requested a review from 0xrusowsky July 1, 2025 13:22
0xrusowsky previously approved these changes Jul 1, 2025
Contributor

@0xrusowsky 0xrusowsky left a comment

looks like a great initial version to get the ball rolling!
for now i can't think of any more improvements than the listed ones in the follow-up section 👍

if: github.event.inputs.pr_number != ''
uses: actions/github-script@v7
with:
  script: |
Member

can we move as much logic as possible away from the yaml file into separate files

jobs:
  forge-test:
    name: Run forge_test and forge_fuzz_test benchmarks
    runs-on: ubuntu-latest
Member

we should not rely on github runners for accurate measurements

| Repository | stable | nightly |
| ----------------- | ------ | ------- |
| ithacaxyz-account | 4.34 s | 3.69 s |
| solady | 3.68 s | 2.92 s |
Member

this is ok for now, but walltime is extremely unreliable and should not be the only metric we use

/// - "v0.2.0" - Specific version tag
/// - "commit-hash" - Specific commit hash
/// - "nightly-rev" - Nightly build with specific revision
pub static FOUNDRY_VERSIONS: &[&str] = &["stable", "nightly"];
Member

these are the defaults correct?

Member Author

yeah
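
To make the "defaults" point concrete, here is a rough sketch of how a helper in the spirit of `get_benchmark_versions()` could fall back to `FOUNDRY_VERSIONS` when no override is given. The function names `parse_versions` and `resolve_versions` and the `FOUNDRY_BENCH_VERSIONS` variable are assumptions made for illustration; the PR's real helper may read a different variable and behave differently.

```rust
use std::env;

/// Defaults, mirroring the `FOUNDRY_VERSIONS` static quoted above.
pub static FOUNDRY_VERSIONS: &[&str] = &["stable", "nightly"];

/// Parse a comma-separated version list such as "stable,nightly,v1.2.0".
/// Empty entries and surrounding whitespace are dropped.
fn parse_versions(raw: &str) -> Vec<String> {
    raw.split(',')
        .map(str::trim)
        .filter(|v| !v.is_empty())
        .map(String::from)
        .collect()
}

/// Resolve versions from an optional override, falling back to the defaults
/// when the override is absent or contains no usable entries.
fn resolve_versions(override_list: Option<&str>) -> Vec<String> {
    let parsed = override_list.map(parse_versions).unwrap_or_default();
    if parsed.is_empty() {
        FOUNDRY_VERSIONS.iter().map(|v| v.to_string()).collect()
    } else {
        parsed
    }
}

fn main() {
    // `FOUNDRY_BENCH_VERSIONS` is a hypothetical variable name used for illustration.
    let from_env = env::var("FOUNDRY_BENCH_VERSIONS").ok();
    println!("{:?}", resolve_versions(from_env.as_deref()));
}
```

Treating an empty or whitespace-only override the same as a missing one avoids silently running zero benchmarks when the variable is set but blank.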

Contributor

github-actions bot commented Jul 4, 2025

📊 Foundry Benchmark Results

Foundry Benchmark Results

Generated at: 2025-07-04 08:09:08 UTC

Date: 2025-07-04 08:08:57

Summary

Benchmarked 2 Foundry versions across 2 repositories.

Repositories Tested

  1. ithacaxyz/account
  2. Vectorized/solady

Foundry Versions

  • stable: forge Version: 1.2.3-stable (a813a2c 2025-06-08)
  • nightly: forge Version: 1.2.3-nightly (6092317 2025-07-04)

Forge Fuzz Test

| Repository | stable | nightly |
| ----------------- | ------- | ------- |
| ithacaxyz-account | 41.62 s | 35.77 s |
| solady | 38.38 s | 33.33 s |

Forge Test

| Repository | stable | nightly |
| ----------------- | ------- | ------- |
| ithacaxyz-account | 41.72 s | 35.01 s |
| solady | 38.99 s | 34.21 s |

System Information

  • OS: linux
  • CPU: 4
  • Rustc: rustc 1.88.0 (6b00bc388 2025-06-23)

Date: 2025-07-04 08:02:32

Summary

Benchmarked 2 Foundry versions across 2 repositories.

Repositories Tested

  1. ithacaxyz/account
  2. Vectorized/solady

Foundry Versions

  • stable: forge Version: 1.2.3-stable (a813a2c 2025-06-08)
  • nightly: forge Version: 1.2.3-nightly (6092317 2025-07-04)

Forge Build (With Cache)

| Repository | stable | nightly |
| ----------------- | ------ | ------- |
| ithacaxyz-account | 5.75 s | 5.58 s |
| solady | 8.13 s | 8.14 s |

Forge Build (No Cache)

| Repository | stable | nightly |
| ----------------- | ------ | ------- |
| ithacaxyz-account | 5.70 s | 5.61 s |
| solady | 8.06 s | 8.12 s |

System Information

  • OS: linux
  • CPU: 4
  • Rustc: rustc 1.88.0 (6b00bc388 2025-06-23)

Date: 2025-07-04 08:03:26

Summary

Benchmarked 2 Foundry versions across 1 repositories.

Repositories Tested

  1. ithacaxyz/account

Foundry Versions

  • stable: forge Version: 1.2.3-stable (a813a2c 2025-06-08)
  • nightly: forge Version: 1.2.3-nightly (6092317 2025-07-04)

Forge Coverage

| Repository | stable | nightly |
| ----------------- | ------- | ------- |
| ithacaxyz-account | 33.46 s | 33.47 s |

System Information

  • OS: linux
  • CPU: 4
  • Rustc: rustc 1.88.0 (6b00bc388 2025-06-23)

🤖 This comment was automatically generated by the Foundry Benchmarks workflow.

To run benchmarks manually: Go to Actions → "Run workflow"

@yash-atreya
Member Author

Benchmarks ran in CI, but they took very long and the results do not look accurate. Probably because of the GitHub runner? @DaniPopes

Contributor

github-actions bot commented Jul 4, 2025

📊 Foundry Benchmark Results

Foundry Benchmark Results

Generated at: 2025-07-04 10:01:09 UTC

Date: 2025-07-04 10:00:55

Summary

Benchmarked 2 Foundry versions across 2 repositories.

Repositories Tested

  1. ithacaxyz/account
  2. Vectorized/solady

Foundry Versions

  • stable: forge Version: 1.2.3-stable (a813a2c 2025-06-08)
  • nightly: forge Version: 1.2.3-nightly (6092317 2025-07-04)

Forge Fuzz Test

| Repository | stable | nightly |
| ----------------- | ------- | ------- |
| ithacaxyz-account | 40.42 s | 35.22 s |
| solady | 36.40 s | 32.22 s |

Forge Test

| Repository | stable | nightly |
| ----------------- | ------- | ------- |
| ithacaxyz-account | 40.30 s | 34.61 s |
| solady | 39.02 s | 33.67 s |

System Information

  • OS: linux
  • CPU: 4
  • Rustc: rustc 1.88.0 (6b00bc388 2025-06-23)

Date: 2025-07-04 09:54:36

Summary

Benchmarked 2 Foundry versions across 2 repositories.

Repositories Tested

  1. ithacaxyz/account
  2. Vectorized/solady

Foundry Versions

  • stable: forge Version: 1.2.3-stable (a813a2c 2025-06-08)
  • nightly: forge Version: 1.2.3-nightly (6092317 2025-07-04)

Forge Build (No Cache)

| Repository | stable | nightly |
| ----------------- | ------ | ------- |
| ithacaxyz-account | 5.74 s | 5.62 s |
| solady | 8.13 s | 8.17 s |

Forge Build (With Cache)

| Repository | stable | nightly |
| ----------------- | ------ | ------- |
| ithacaxyz-account | 5.73 s | 5.66 s |
| solady | 8.17 s | 8.19 s |

System Information

  • OS: linux
  • CPU: 4
  • Rustc: rustc 1.88.0 (6b00bc388 2025-06-23)

Date: 2025-07-04 09:54:42

Summary

Benchmarked 2 Foundry versions across 1 repositories.

Repositories Tested

  1. ithacaxyz/account

Foundry Versions

  • stable: forge Version: 1.2.3-stable (a813a2c 2025-06-08)
  • nightly: forge Version: 1.2.3-nightly (6092317 2025-07-04)

Forge Coverage

| Repository | stable | nightly |
| ----------------- | ------- | ------- |
| ithacaxyz-account | 34.96 s | 35.31 s |

System Information

  • OS: linux
  • CPU: 4
  • Rustc: rustc 1.88.0 (6b00bc388 2025-06-23)

🤖 This comment was automatically generated by the Foundry Benchmarks workflow.

To run benchmarks manually: Go to Actions → "Run workflow"
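
Since two CI runs are now posted, it may help to put a number on how much the wall times move between them. The sketch below is only an illustration: `percent_change` is a hypothetical helper, and the inputs are the Forge Test timings for ithacaxyz-account copied from the two bot comments above. It computes the nightly-vs-stable change within each run and the stable-vs-stable change across runs.

```rust
/// Relative change from `baseline` to `candidate`, in percent.
/// Negative values mean `candidate` took less time.
fn percent_change(baseline: f64, candidate: f64) -> f64 {
    (candidate - baseline) / baseline * 100.0
}

fn main() {
    // Forge Test, ithacaxyz-account, wall time in seconds (from the two bot comments).
    let (run1_stable, run1_nightly) = (41.72, 35.01);
    let (run2_stable, run2_nightly) = (40.30, 34.61);

    println!("run 1 nightly vs stable: {:+.1}%", percent_change(run1_stable, run1_nightly));
    println!("run 2 nightly vs stable: {:+.1}%", percent_change(run2_stable, run2_nightly));
    // Same forge version and repository in a different CI run, so this
    // difference reflects run-to-run variation rather than a code change.
    println!("stable run 2 vs run 1:   {:+.1}%", percent_change(run1_stable, run2_stable));
}
```

Comparing the cross-run figure with the within-run figures gives a rough sense of how much of a reported difference could be runner noise, which is the concern raised in the review about relying on wall time alone.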

Development

Successfully merging this pull request may close these issues.

feat: benchmark suite and update README with latest benchmarks
3 participants