feat: allow re-using a database across benchmark runs #1174

mr-salty · 2020-01-08T23:55:36Z

If the user specifies a non-empty --database <database_name> on the command
line, it will be re-used across runs. The database will be created and
populated if it doesn't already exist (i.e. on the first run) but it
will not be dropped at the end of the run as it normally would be. This
saves a lot of setup time if you're running the benchmark repeatedly.

Otherwise, the behavior is unchanged.

This change is

mr-salty · 2020-01-09T00:00:54Z

@coryan I debated introducing a--reuse_database flag and having the program use a non-random name, I don't have too much of a preference either way, LMK if you feel differently.

coryan

I really like the feature, thanks for implementing it. The code could be a smidge cleaner, made some suggestions below, WDYT?

Reviewed 3 of 3 files at r1.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @mr-salty)

google/cloud/spanner/benchmarks/multiple_rows_cpu_benchmark.cc, line 73 at r1 (raw file):

  virtual std::string AdditionalDdlStatement() = 0;
  virtual Status SetUp(Config const& config, cs::Database const& database,
                       bool fill) = 0;

This change fills slightly odd to me, basically in all cases passing fill == false has the same effect as not calling the function... Maybe we can just not call the function?

google/cloud/spanner/benchmarks/multiple_rows_cpu_benchmark.cc, line 160 at r1 (raw file):

  auto db = create_future.get();
  if (!db) {
    if (user_specified_database &&

FYI: alternatively we could do a admin_client.GetDatabase() if user_specified_database==true and only create the database if that fails. Both are fine by me, just something for you to consider.

google/cloud/spanner/benchmarks/multiple_rows_cpu_benchmark.cc, line 1501 at r1 (raw file):

  std::string AdditionalDdlStatement() override { return {}; }
  Status SetUp(Config const&, cs::Database const&, bool fill) override {
    fill_ = fill;

I do not think you need to save the value here. The main reason to have RunAllExperiment is to smoke test all the benchmarks in the CI builds, so calling it with true all the time is good enough.

google/cloud/spanner/benchmarks/single_row_throughput_benchmark.cc, line 48 at r1 (raw file):

  virtual void SetUp(Config const& config,
                     cloud_spanner::Database const& database, bool fill) = 0;

Same comment I think fill==false really has the same effect as not calling the function.

codecov · 2020-01-09T20:11:52Z

Codecov Report

Merging #1174 into master will decrease coverage by 0.02%.
The diff coverage is 66.12%.

@@            Coverage Diff             @@
##           master    #1174      +/-   ##
==========================================
- Coverage   95.71%   95.68%   -0.03%     
==========================================
  Files         175      175              
  Lines       13527    13567      +40     
==========================================
+ Hits        12947    12982      +35     
- Misses        580      585       +5

Impacted Files	Coverage Δ
...ogle/cloud/spanner/benchmarks/benchmarks_config.cc	`96.26% <0%> (-0.91%)`	⬇️
.../spanner/benchmarks/multiple_rows_cpu_benchmark.cc	`92.62% <64.86%> (-0.43%)`	⬇️
...nner/benchmarks/single_row_throughput_benchmark.cc	`92.73% <70.83%> (-0.8%)`	⬇️
google/cloud/spanner/internal/transaction_impl.h	`100% <0%> (ø)`	⬆️
google/cloud/spanner/transaction.h	`100% <0%> (ø)`	⬆️
...anner/integration_tests/client_integration_test.cc	`98.47% <0%> (ø)`	⬆️
google/cloud/spanner/value.h	`92.8% <0%> (+0.1%)`	⬆️
google/cloud/spanner/samples/samples.cc	`86.63% <0%> (+0.19%)`	⬆️
google/cloud/spanner/client.cc	`97.41% <0%> (+0.86%)`	⬆️
...ud/spanner/integration_tests/client_stress_test.cc	`83.33% <0%> (+1.75%)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fbdab30...560fe62. Read the comment docs.

mr-salty

Reviewable status: 1 of 3 files reviewed, 3 unresolved discussions (waiting on @coryan)

google/cloud/spanner/benchmarks/multiple_rows_cpu_benchmark.cc, line 73 at r1 (raw file):

Previously, coryan (Carlos O'Ryan) wrote…

This change fills slightly odd to me, basically in all cases passing fill == false has the same effect as not calling the function... Maybe we can just not call the function?

Done. (see comment in the other file)

google/cloud/spanner/benchmarks/multiple_rows_cpu_benchmark.cc, line 160 at r1 (raw file):

Previously, coryan (Carlos O'Ryan) wrote…

FYI: alternatively we could do a admin_client.GetDatabase() if user_specified_database==true and only create the database if that fails. Both are fine by me, just something for you to consider.

yeah, I thought about it but this seemed simpler to me.

google/cloud/spanner/benchmarks/multiple_rows_cpu_benchmark.cc, line 1501 at r1 (raw file):

Previously, coryan (Carlos O'Ryan) wrote…

I do not think you need to save the value here. The main reason to have RunAllExperiment is to smoke test all the benchmarks in the CI builds, so calling it with true all the time is good enough.

It still seems better to propagate it, just in case people use all with manual tests. I reworked it so just keep track of whether SetUp was called and call the experiment SetUp only if so.

google/cloud/spanner/benchmarks/single_row_throughput_benchmark.cc, line 48 at r1 (raw file):

Previously, coryan (Carlos O'Ryan) wrote…

Same comment I think fill==false really has the same effect as not calling the function.

Yeah, I did this file first, and didn't look close enough at what SetUp() was doing, so I passed the parameter to skip part of it, and essentially did the same thing in the multiple_rows one for "futurproofing". But, I see even in this case the code outside if (fill) is not necessary, so I will just take your suggestion and not call SetUp() at all.

If the user specifies a non-empty `--database <database_name>` on the command line, it will be re-used across runs. The database will be created and populated if it doesn't already exist (i.e. on the first run) but it will not be dropped at the end of the run as it normally would be. This saves a lot of setup time if you're running the benchmark repeatedly. Otherwise, the behavior is unchanged.

mr-salty

Reviewable status: 1 of 3 files reviewed, 4 unresolved discussions (waiting on @coryan)

google/cloud/spanner/benchmarks/multiple_rows_cpu_benchmark.cc, line 182 at r2 (raw file):

      exit_status = EXIT_FAILURE;
    } else {
      status = experiment->Run(config, database);

oops, I completely didn't mean to put this inside the 'if' - let me fix it!

mr-salty

Reviewable status: 1 of 3 files reviewed, 3 unresolved discussions (waiting on @coryan)

google/cloud/spanner/benchmarks/multiple_rows_cpu_benchmark.cc, line 182 at r2 (raw file):

Previously, mr-salty (Todd Derr) wrote…

oops, I completely didn't mean to put this inside the 'if' - let me fix it!

fixed (and tested)

coryan

Thank you, this will be a really useful thing to have.

Reviewed 1 of 2 files at r2, 1 of 1 files at r3.
Reviewable status: complete! all files reviewed, all discussions resolved

mr-salty · 2020-01-09T22:38:29Z

yeah, it really helped me immensely when debugging that deadlock issue the other night... well, luckily (for my motivation to do this) it wasn't until later that I realized I could have used a much smaller --table_size to reduce the overhead ;) but this is still better regardless.

…gle-cloud-cpp-spanner#1174) * feat: allow re-using a database across benchmark runs If the user specifies a non-empty `--database <database_name>` on the command line, it will be re-used across runs. The database will be created and populated if it doesn't already exist (i.e. on the first run) but it will not be dropped at the end of the run as it normally would be. This saves a lot of setup time if you're running the benchmark repeatedly. Otherwise, the behavior is unchanged. * address review comments. * the experiments need to run when using an existing database

mr-salty requested a review from coryan January 8, 2020 23:55

googlebot added the cla: yes This human has signed the Contributor License Agreement. label Jan 8, 2020

coryan suggested changes Jan 9, 2020

View reviewed changes

mr-salty commented Jan 9, 2020

View reviewed changes

mr-salty added 2 commits January 9, 2020 16:02

address review comments.

f4ed269

mr-salty commented Jan 9, 2020

View reviewed changes

the experiments need to run when using an existing database

560fe62

mr-salty commented Jan 9, 2020

View reviewed changes

coryan approved these changes Jan 9, 2020

View reviewed changes

mr-salty merged commit fd06727 into googleapis:master Jan 9, 2020

mr-salty deleted the bm_reuse branch January 9, 2020 22:34

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat: allow re-using a database across benchmark runs #1174

feat: allow re-using a database across benchmark runs #1174

Uh oh!

mr-salty commented Jan 8, 2020 •

edited by coryan

Loading

Uh oh!

mr-salty commented Jan 9, 2020

Uh oh!

coryan left a comment

Uh oh!

codecov bot commented Jan 9, 2020 •

edited

Loading

Uh oh!

mr-salty left a comment

Uh oh!

mr-salty left a comment

Uh oh!

mr-salty left a comment

Uh oh!

coryan left a comment

Uh oh!

mr-salty commented Jan 9, 2020

Uh oh!

Uh oh!

feat: allow re-using a database across benchmark runs #1174

feat: allow re-using a database across benchmark runs #1174

Uh oh!

Conversation

mr-salty commented Jan 8, 2020 • edited by coryan Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mr-salty commented Jan 9, 2020

Uh oh!

coryan left a comment

Choose a reason for hiding this comment

Uh oh!

codecov bot commented Jan 9, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

mr-salty left a comment

Choose a reason for hiding this comment

Uh oh!

mr-salty left a comment

Choose a reason for hiding this comment

Uh oh!

mr-salty left a comment

Choose a reason for hiding this comment

Uh oh!

coryan left a comment

Choose a reason for hiding this comment

Uh oh!

mr-salty commented Jan 9, 2020

Uh oh!

Uh oh!

mr-salty commented Jan 8, 2020 •

edited by coryan

Loading

codecov bot commented Jan 9, 2020 •

edited

Loading