Stream-Based Active Distillation for Scalable Model Deployment

[paper] [poster] [test_set_WALT_cam1] [test_set_WALT_cam2]

Pipeline

Table of Contents

  1. Installation
  2. Datasets
  3. Getting Started
  4. Testing

Installation

The code was developed on Ubuntu 20.04.

Setup your virtual environment

We recommend working in a virtualenv or conda environment.

conda create -y --name SBAD python pip
conda activate SBAD

Requirements

To reproduce the results, you need to install the requirements of the YOLOv8 framework and, in addition, the following requirements:

cd ..
pip install -r requirements.txt
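
To quickly verify that the YOLOv8 framework is installed correctly, a minimal check such as the following can be run (this snippet is a sketch and not part of the repository):

# Sanity check (sketch): confirm the YOLOv8 framework is importable.
import ultralytics

print(ultralytics.__version__)
ultralytics.checks()  # prints Python, torch, and GPU information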

Configure wandb

We use wandb to log all experiments. You can either use your own account or create a team. Either way, you will need to log in and set up an entity to which logs and model versions are pushed.

  1. Create a wandb entity
  2. Set up wandb to send logs:
wandb login
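
If you want to confirm that logging to your entity works before launching long runs, a short throwaway run can be pushed with the wandb Python API (the entity and project names below are placeholders, not values used by the repository):

# Sketch: push a tiny throwaway run to check that the wandb entity is reachable.
import wandb

run = wandb.init(entity="your-entity", project="sbad-setup-check", job_type="sanity-check")
run.log({"setup_ok": 1})
run.finish()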

Datasets

This is the required dataset structure:

Dataset Structure

Slight modifications to the structure are possible, but they should be configured accordingly in the experiment configuration (a small layout-check sketch follows the trees below).

Here is how we used WALT:

  • WALT
WALT-challenge
├── cam{1}
│   ├── week{1}
│   │   ├── bank
│   │   │   ├── images
│   │   │   └── labels
│   │   └── test
│   │       ├── images
│   │       └── labels
│   ├── ...
│   └── week{i}
│       └── ...
├── ...
└── cam{j}
    └── ...
  • Your Dataset
Dataset
├── bank
│   ├── images
│   └── labels
├── test
│   ├── images
│   └── labels
├── train (auto-generated through sampling)
│   ├── images
│   └── labels
└── val (auto-generated through sampling)
    ├── images
    └── labels
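
As a quick check that a dataset folder matches this layout, something like the following sketch can be used (the dataset path is a placeholder and the snippet is not part of the repository):

# Sketch: verify that the bank/ and test/ splits contain images and labels.
from pathlib import Path

dataset = Path("YOURPATH/Dataset")  # placeholder, adapt to your dataset root
for split in ("bank", "test"):
    for sub in ("images", "labels"):
        folder = dataset / split / sub
        count = len(list(folder.glob("*"))) if folder.is_dir() else 0
        status = "OK" if count else "MISSING OR EMPTY"
        print(f"{folder}: {count} files ({status})")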

Getting Started

Generation of the pseudo labels (populate bank)

To generate the pseudo labels, execute the following command:

python annotation/generate_pseudo_labels.py --parent "YOURPATH/WALT" --extension "jpg-or-png"

Note: the 'bank' folder must contain an 'images' folder with all the images. If you are on Windows, use only double quotes (") and not single quotes (').
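
Conceptually, this step runs a pretrained teacher model over the bank images and stores its detections as YOLO-format label files. The repository script takes care of this; the sketch below only illustrates the idea using the ultralytics API, with a placeholder path and an assumed teacher checkpoint:

# Illustrative sketch of pseudo-labelling, not the repository's script.
from ultralytics import YOLO

teacher = YOLO("yolov8x.pt")  # assumed teacher checkpoint
teacher.predict(
    source="YOURPATH/WALT/cam1/week1/bank/images",  # placeholder path
    save_txt=True,   # write YOLO-format .txt label files
    save_conf=True,  # keep confidence scores alongside the boxes
)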

Conduct an Experiment

Before conducting an experiment, ensure your wandb entity and the project are correctly set up in the experiments/model/yolov8.yaml Hydra config file.

To conduct an experiment, follow these steps:

  1. Populate a val folder based on the data contained in the bank folder.
  2. Populate a train folder based on a strategy applied to the data contained in the bank folder. The strategy ensures that no images are duplicated between the train and val sets (see the sketch after this list).
  3. Launch a training of a yolov8n model on the previously generated sets. The scripts automatically log everything to wandb.
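
The sketch below illustrates the kind of bank-to-train/val split described in step 2, with placeholder paths and illustrative budgets; the actual sampling strategies are defined by the repository's code and configuration:

# Illustrative split of the bank into train/ and val/ with no duplicates.
import random
import shutil
from pathlib import Path

bank_images = Path("YOURPATH/Dataset/bank/images")  # placeholder path
images = sorted(bank_images.glob("*.jpg"))

n_val, n_train = 50, 100                      # illustrative budgets
val_set = set(random.sample(images, n_val))   # images reserved for validation
train_set = [p for p in images if p not in val_set][:n_train]  # e.g. a first-n pick

for split, split_images in (("val", val_set), ("train", train_set)):
    for img in split_images:
        label = bank_images.parent / "labels" / f"{img.stem}.txt"
        for src, sub in ((img, "images"), (label, "labels")):
            dst = bank_images.parent.parent / split / sub
            dst.mkdir(parents=True, exist_ok=True)
            shutil.copy(src, dst / src.name)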

You can launch an experiment by executing the main script:

python train.py

For debugging, you can set HYDRA_FULL_ERROR=1 as an environment variable to see the full traceback.

HYDRA_FULL_ERROR=1 python train.py

Modify the configs to change the experiments

Hydra is a configuration management tool that reads the information in the *.yaml files to configure the application.

To modify an experiment, edit the configuration file experiments/experiment.yaml. On first use, you will have to set the paths to the dataset and your wandb username.

You need to modify:

  1. experiments/model/yolov8.yaml (WANDB entity)
  2. experiments/experiment.yaml (insert your data folder)

The logs and outputs of the runs are stored in the output folder.

Remark: if you are using Windows, do not forget to adapt your paths by using / instead of \ or \\.

IMPORTANT!

We use a specific run naming format to track the experiments in wandb and run testing. We do that using the name attribute in the dataset config file. Look at experiments/dataset/WALT.yaml for an example.

If you add parameters during training, record them in the run name. For example, if you use a batch size of 32 instead of the default 16, set your run name to: S05c016-firstn-100-batch-32. You should add this behavior to your Hydra config files if you use your own dataset and experiment config.
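
Assuming the layout shown in the example name (camera id, strategy, budget, then optional extra parameters), the name can be split back into its parts as follows; the field names are an interpretation for illustration, not something defined by the repository:

# Sketch: split a run name such as "S05c016-firstn-100-batch-32" into parts.
name = "S05c016-firstn-100-batch-32"
camera, strategy, budget, *extras = name.split("-")
print(camera, strategy, int(budget), extras)  # S05c016 firstn 100 ['batch', '32']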

Testing

You can use the download tool to get all the models of a specific project from wandb. Then use the inference tool to test the models on the dataset. Finally, use the inference_coco tool to generate the same testing metrics for the student and teacher models. All testing results are concatenated into a single file.

We have created a test.py file that executes all of these steps in a single command.

python test.py --run-prefix WALT --entity YourEntity --project WALT --template testing/templates/WALT.yaml --dataset_path "YOURPATH/WALT/"

Flags:

  • --entity : wandb entity (user or team)
  • --project : wandb project name
  • --run-prefix : project name used as a prefix in the run names (sometimes it differs from the wandb project name, as in the case of "study")
  • --template : template file for data.yaml, used to specify the test "sub" datasets
  • --dataset_path : parent path containing all dataset folders (camX in the case of WALT)
  • --query_filter : download and test only specific models by filtering on characters or words in the run names
  • --wandb-download : set this to false to run the whole testing pipeline without the download step

Results will be in testdir/project, where you will find:

  1. a wandb folder containing the downloaded weights
  2. an inference_results.csv file containing the inference results (a small inspection sketch follows this list)
  3. a plots folder containing the generated plots for each metric in the inference results (to modify the plots, look at testing/plot.py)
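
To inspect the aggregated results programmatically, the CSV can simply be loaded with pandas; the exact column names depend on the testing scripts, so the sketch below only prints whatever is there:

# Sketch: inspect the aggregated testing results.
import pandas as pd

df = pd.read_csv("./testdir/WALT/inference_results.csv")
print(df.columns.tolist())  # which metrics/columns were logged
print(df.head())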

You can also run each script/tool individually:

  1. Download models:
python testing/download.py -e YourEntity -p WALT -f ./testdir/WALT/wandb -lf -d
  2. Test downloaded models on the test set:
python3 ./testing/inference.py -w ./testdir/WALT/wandb -d "YOURPATH/WALT/" -p WALT -y testing/templates/WALT.yaml -f test -c ./testdir/WALT/inference_results.csv
  3. Test pretrained student and teacher models on the dataset:
python3 testing/inference_coco.py --model yolov8n --csv_path ./testdir/WALT/inference_results.csv --dataset "s05c016->YOURPATH/WALT/s05c016/" --data-template testing/templates/Ai-city.yaml --folder test
  4. Plot graphs:
python testing/plot.py --csv_path ./testdir/WALT/inference_results.csv --save_path ./testdir/Ai-city/plots

Use the --help flag for more information on the usage of each script.
