Describe the bug
CSV read speed degrades drastically when the number of params is large (>10^5).
To Reproduce
@billgillespie and I ran into a model with 10^6 params on which read_cmdstan_csv
stalls. I was able to reproduce it with the following dummy model:
parameters {
  real k;
}
transformed parameters {
  real x[100000] = rep_array(k, 100000);
}
model {
  k ~ normal(0, 1);
}
The output file contains only one sample:
./dummy sample algorithm=fixed_param num_warmup=0 num_samples=1 output file=dummy.csv
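To confirm that the slowdown comes from the width of the file rather than the number of draws, the shape of the CSV can be checked before reading it in R. The following is a minimal sketch, assuming the usual Stan CSV layout in which comment lines start with `#` and the first non-comment line is the header. The helper names `count_columns` and `count_draws` are hypothetical and are not part of CmdStan or cmdstanr.

```shell
#!/bin/sh
# Hypothetical helpers (not part of CmdStan or cmdstanr) for inspecting a Stan CSV.
# Assumption: comment lines start with '#', and the first non-comment line is the header.

# Print the number of columns in the header row.
count_columns() {
    # Drop comments, keep the header, put one field per line, count the lines.
    grep -v '^#' "$1" | head -n 1 | tr ',' '\n' | wc -l | tr -d ' '
}

# Print the number of draw rows (non-comment, non-empty lines after the header).
count_draws() {
    grep -v '^#' "$1" | grep . | tail -n +2 | wc -l | tr -d ' '
}

# Usage:
#   count_columns dummy.csv
#   count_draws dummy.csv
```

For the dummy model above, `count_draws dummy.csv` should report a single draw, while `count_columns dummy.csv` should report a little over 100000 columns.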
rstan::read_stan_csv vs cmdstanr::read_cmdstan_csv:
> system.time(fit <- cmdstanr::read_cmdstan_csv("dummy.csv"))
user system elapsed
55.873 26.429 67.056
> system.time(fit <- rstan::read_stan_csv("dummy.csv"))
user system elapsed
5.449 0.341 5.847
Expected behavior
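Read time comparable to rstan::read_stan_csv in this scenario (about 6 seconds elapsed above, versus about 67 seconds for cmdstanr::read_cmdstan_csv).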
Operating system
macOS Mojave
> sessionInfo()
R version 3.6.2 (2019-12-12)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.6
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] compiler_3.6.2 tools_3.6.2 tcltk_3.6.2
CmdStanR version number
> library("cmdstanr")
This is cmdstanr version 0.1.3
- Online documentation and vignettes at mc-stan.org/cmdstanr
- Use set_cmdstan_path() to set the path to CmdStan
- Use install_cmdstan() to install CmdStan