Repository for generating large-scale linear programming (LP) test instances for the PDLP MPC paper.
The code was tested using Julia 1.10.0. To instantiate all the required packages execute
$ julia --project=. -e "import Pkg; Pkg.instantiate()"
To create small versions of the problem instances (for testing) run:
$ ./create_small_instances.sh
To create large versions of the problem instances run:
$ ./create_large_instances.sh
This is going to take a while and you will need a machine
with
You can modify the instance generation properties and seeds.
Take a look at the above shell files for examples of parameters
that you can pass to the generators. For more information
on the parameters for each generator you can use --help
,
for example, for generate-multicommodity-flow.jl
you can run
julia --project generate-multicommodity-flow.jl --help
This problem is motivated by optimizing the supply chain of a large retailer
over a relatively moderate time period (e.g., a month). The retailer sells
We have tried to make this as simple as possible while maintaining some realism. In practice, one can imagine more complicated and larger models incorporating, for example, the multiperiod aspect of the problem or the issue of deciding where to build new warehouses.
For the create_large_instances.sh
we use 20000 commodities, 100 warehouses and 1000 stores. For context, Walmart has over 4000 stores in the United States, and the company has over 200 distribution centers [E]. Moreover, the average Walmart
store sells around 120,000 different products [D].
Let
The locations below belong to the 2D square
Let
Let
Let
Minimize shipping costs plus overtime costs:
With the following constratints. Flow from each factory
For each warehouse
For each commodity
Demand for each commodity
For small instances you can also solve the problem using HiGHS and generate plots of the optimal solution for a random selection of the commodities:
$ julia --project generate-multicommodity-flow.jl \
--output_file problem-instances/multicommodity-flow-tiny-test-instance.mps.gz \
--optimize_model \
--folder_for_plots plot_optimal_flows
This was used during the development process to verify the model was producing reasonable optimal solutions.
Consider a homogenous cube
where
The temperature of the material at location
As there is relatively few heat sources we minimize the sum of
We then build a finite element approximation of Possion's equation:
and setup the boundary conditions
Then at the measurement locations
We generate
One can also validate recovery of the true solution by using the script validate-heat-source-solution.jl
.
This script compares the temperature profile
Statistical matching with covariate balancing constraints for making causal inference on observational data
In observational studies we often wish to perform causal inference where we try to understand the
impact of particular treatment on patient outcomes. However, one difficulty is that the patients
that receive the treatment can be a very different group of patients from those that did not receive
the treatement. For example, perhaps there is a confounding variable -- how sick the patient is
which affects whether they recieve the treatement and their outcome.
Matching is a popular approach in statistics where treated patients are matched to control patients
with similar covariate values. This produces a new smaller sample where the patients in both the
treatment and control are similar -- resembling a randomized experiment.
However, one problem with this approach is that the covariates in the treatment group may not match covariates in the subsampled control group [C].
For this reason, it has been proposed that ones finds the optimal matching subject
to additional constraints that the in the covariates in the subsampled control group match the treatment [A,B].
This can be modelled as integer program [A] but in practice the linear programming
relaxation can be used to quickly generate matchings for large
datasets [B].
We generate synthetic versions of this problem for testing.
There are
Furthermore, define the first moment of the treatment sample as
Also, define the second moment of the treatment sample as
We consider a bipartite graph with vertices given by
To turn it into a linear program, we relax the integrality requirements and only require that
Minimize the total weight of the assignment:
Subject to the following constraints. Matching constraints for treatment group:
Matching constraints for control group:
Covariate first moments of subsampled control approximately match the treatment:
Covariate second moments approximately match original sample:
We generate
and then set
This shift vector ensures that the covariate values of
We consider the classic robust production-inventory problems from [F]. The class of problems considers a firm with a central warehouse and
In each time period, the firm sequentially performs the following three steps:
-
The firm replenishes the inventory level at the central warehouse by producing additional products at their factories. Each factory produces the additional units with zero lead time, and the additional units are stored immediately in the central warehouse.
-
The firm observes the customer demand at the central warehouse.
-
The firm verifies that the remaining inventory in the warehouse lies within a pre-specified interval.
In addition to satisfying the constraints on the inventory level in the central warehouse at the end of each time period, the firm's production decisions must be in a certain range.
The goal of the firm is to satisfy the customer demand at minimal cost while satisfying production and warehouse constraints.
Let
The demand at the central warehouse is denoted by
which must be satisfied immediately without backlogging from the inventory in the central warehouse. The lower and upper bounds in the uncertainty set, denoted by
The remaining inventory level in the central warehouse at the end of each time period
where
We utilize a linear decision rule for the above robust optimization problems. More specifically, we set
The decision variable is the linear decision rule parameter
Minimize the worst-case cost with uncertainty:
Subject to the following constraints. Maximal total production level for each factory:
Maximal and minimal production level for each factor at a time period:
The remaining inventory in the warehouse lies within a pre-specified interval:
We generate the instance following [G], which generalized those from [F]. More specifically, we generate instances in which the customer demand and production costs of a new product follow a cyclic pattern due to seasonality over a selling horizon of one year.
Given a discretization of the selling season into
Given
and the capacities and initial inventory at the central warehouse are
[A] Zubizarreta, José R. "Using mixed integer programming for matching in an observational study of kidney failure after surgery." Journal of the American Statistical Association 107.500 (2012): 1360-1371.
[B] https://cran.r-project.org/web/packages/designmatch/index.html
[C] Ali, M.S., Groenwold, R.H., Belitser, S.V., Pestman, W.R., Hoes, A.W., Roes, K.C., de Boer, A. and Klungel, O.H., 2015. Reporting of covariate selection and balance assessment in propensity score analysis is suboptimal: a systematic review. Journal of clinical epidemiology, 68(2), pp.122-131.
[F] Aharon Ben-Tal, Alexander Goryashko, Elana Guslitzer, and Arkadi Nemirovski. Adjustable robust solutions of uncertain linear programs. Mathematical Programming, 99(2):351–376, 2004.
[G] Lu, Haihao, and Brad Sturt. "On the Sparsity of Optimal Linear Decision Rules in Robust Inventory Management." arXiv preprint arXiv:2203.10661 (2022).