read_cmdstan_csv efficiency issue for large number of parameters #299

Closed
@yizhang-yiz

Describe the bug
CSV read speed degrades drastically when the number of parameters is large (>10^5).

To Reproduce
@billgillespie and I ran into a model with 10^6 parameters for which read_cmdstan_csv stalls. I was able to reproduce the problem with the following dummy model:

parameters {
  real k;
}

transformed parameters {
  real x[100000] = rep_array(k, 100000);
}

model {
  k ~ normal(0, 1);
}

The output file contains only one sample:

./dummy sample algorithm=fixed_param num_warmup=0 num_samples=1 output file=dummy.csv
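
As a quick sanity check on the reproduction, the width of the output file can be confirmed from the shell before timing the readers. This is only a sketch: `count_csv_columns` is a hypothetical helper name, and it assumes that comment lines in the CSV start with `#` and that the first non-comment line is the header row. For the dummy model the count should come out a little over 100,000 (the 100,000 elements of x, plus k and a few sampler columns).

```shell
# Count the columns in the header row of a Stan CSV file.
# Assumption: comment lines start with '#', so they are skipped first,
# and the first remaining line is the comma-separated header.
count_csv_columns() {
  grep -v '^#' "$1" | head -n 1 | awk -F',' '{ print NF }'
}

# dummy.csv is the output file from the run above; skip quietly if it is absent.
if [ -f dummy.csv ]; then
  count_csv_columns dummy.csv
fi
```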

rstan::read_stan_csv vs cmdstanr::read_cmdstan_csv:

> system.time(fit <- cmdstanr::read_cmdstan_csv("dummy.csv"))
   user  system elapsed 
 55.873  26.429  67.056 
> system.time(fit <- rstan::read_stan_csv("dummy.csv"))
   user  system elapsed 
  5.449   0.341   5.847 

Expected behavior
Reading speed closer to that of rstan::read_stan_csv in this scenario (roughly 6 s rather than 67 s elapsed in the timings above).

Operating system
macOS Mojave

> sessionInfo()
R version 3.6.2 (2019-12-12)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.6

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] compiler_3.6.2 tools_3.6.2    tcltk_3.6.2   

CmdStanR version number

> library("cmdstanr")
This is cmdstanr version 0.1.3
- Online documentation and vignettes at mc-stan.org/cmdstanr
- Use set_cmdstan_path() to set the path to CmdStan
- Use install_cmdstan() to install CmdStan


Labels: bug
