This is the official implementation for the CVPR 2025 paper:
AnyCam: Learning to Recover Camera Poses and Intrinsics from Casual Videos
Felix Wimbauer1,2,3, Weirong Chen1,2,3, Dominik Muhle1,2, Christian Rupprecht3, and Daniel Cremers1,2
1Technical University of Munich, 2MCML, 3University of Oxford
If you find our work useful, please consider citing our paper:
@inproceedings{wimbauer2025anycam,
title={AnyCam: Learning to Recover Camera Poses and Intrinsics from Casual Videos},
author={Wimbauer, Felix and Chen, Weirong and Muhle, Dominik and Rupprecht, Christian and Cremers, Daniel},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
year={2025}
}
2025/05 Added PyTorch Hub integration and uploaded model files to Hugging Face.
2025/04 Improved demo script and COLMAP export.
2025/04 Initial code release.
- Project page with interactive demos
- Demo script
- PyTorch Hub integration
- HuggingFace space
- Scripts for training data
To set up the environment, follow these steps individually or see the combined command below:

- Create a new conda environment with Python 3.11:
  conda create -n anycam python=3.11
- Activate the conda environment:
  conda activate anycam
- Install PyTorch according to your CUDA version:
  pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu124
- Install the corresponding cudatoolkit for compilation:
  conda install -c nvidia cuda-toolkit
- Install the required packages from requirements.txt:
  pip install -r requirements.txt
Combined, this yields the following command. Building might take a few minutes.
conda create -n anycam python=3.11 -y && \
conda activate anycam && \
pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu124 && \
conda install -c nvidia cuda-toolkit -y && \
pip install -r requirements.txt
We use slightly customized forks of UniMatch and UniDepth (to ensure backward compatibility). Furthermore, we use the minipytorch3d variant by VGGSfM.
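After installation, a quick sanity check (a suggestion, not part of the repository) is to confirm that the expected PyTorch build is active and that CUDA is visible:

import torch

# Verify the installed PyTorch version and CUDA availability.
print("PyTorch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())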
To download pretrained models, you can use the download_checkpoints.sh script. Follow these steps:

- Open a terminal and navigate to the root directory of the repository.
- Run the download_checkpoints.sh script with the desired model name. For example, to download the final anycam_seq8 model, use the following command:
  ./download_checkpoints.sh anycam_seq8

This will download and unpack the pretrained model into the pretrained_models directory. You can then use the downloaded model for evaluation or further training.
You can use the demo script to process custom videos and extract camera trajectories, depth maps, and 3D point clouds.
We provide a simple script to run the basic functionality of AnyCam and export the results. The results can either be visualized in rerun.io, or exported to the Colmap format.
To run the model in feed-forward-only mode, turn off the ba_refinement flag.
If the provided video has a high framerate, we recommend subsampling it to a lower framerate by adding the fps=10 flag.
# Full model
python anycam/scripts/anycam_demo.py \
++input_path=/path/to/video.mp4 \
++model_path=pretrained_models/anycam_seq8 \
++visualize=true
# Feed-forward only without refinement
python anycam/scripts/anycam_demo.py \
++input_path=/path/to/video.mp4 \
++model_path=pretrained_models/anycam_seq8 \
++ba_refinement=false \
++visualize=true
If you are developing on a remote server, you can start rerun.io as a webserver. First, open a new terminal on your remote machine and start the viewer:
rerun --serve-web
Then, forward port 9090 to your local machine. Finally, make sure to launch the script with the ++rerun_mode=connect flag.
You should be able to view the results in your browser under:
http://localhost:9090/?url=ws://localhost:9877
Export to COLMAP format:
python anycam/scripts/anycam_demo.py \
++input_path=/path/to/video.mp4 \
++model_path=pretrained_models/anycam_seq8 \
++export_colmap=true \
++output_path=/path/to/output_dir
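If you want to work with the COLMAP export programmatically, pycolmap can read a standard COLMAP model directory. The snippet below is a sketch under the assumption that the export produces such a directory inside the output path; adjust the path to wherever the cameras/images/points3D files end up:

import pycolmap

# Load the exported COLMAP model (point this at the directory that
# contains the cameras/images/points3D files written by the export).
reconstruction = pycolmap.Reconstruction("/path/to/output_dir")

print("Cameras:  ", len(reconstruction.cameras))
print("Images:   ", len(reconstruction.images))
print("3D points:", len(reconstruction.points3D))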
Save trajectory, depth maps, and other results:
python anycam/scripts/anycam_demo.py \
++input_path=/path/to/video.mp4 \
++model_path=pretrained_models/anycam_seq8 \
++output_path=/path/to/output_dir
AnyCam is also available through PyTorch Hub, making it easy to use the model in your own projects:
import torch

# Load the model
anycam = torch.hub.load('Brummi/anycam', 'AnyCam', version="1.0", training_variant="seq8", pretrained=True)
# Process a list of frames (H,W,3) [0,1] with or without bundle adjustment refinement
results = anycam.process_video(frames, ba_refinement=True)
# Access the results
trajectory = results["trajectory"] # Camera poses
depths = results["depths"] # Depth maps
uncertainties = results["uncertainties"] # Uncertainty maps
projection_matrix = results["projection_matrix"] # Camera intrinsics
The process_video function accepts the following parameters:
- frames: List of frames as numpy arrays with shape (H,W,3) and values in [0,1]
- config: Optional configuration dictionary for processing
- ba_refinement: Whether to perform bundle adjustment refinement (default: True)
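Putting this together, here is a minimal end-to-end sketch. Only the torch.hub.load call, process_video, and the result keys above come from this README; loading the video with imageio (which needs an FFmpeg/pyav backend) is an assumption, and any loader that yields (H,W,3) float arrays in [0,1] works just as well:

import imageio.v3 as iio
import numpy as np
import torch

# Load the pretrained AnyCam model from PyTorch Hub.
anycam = torch.hub.load('Brummi/anycam', 'AnyCam', version="1.0", training_variant="seq8", pretrained=True)

# Read a video and convert each frame to float32 (H,W,3) in [0,1].
video = iio.imread("/path/to/video.mp4")  # shape (N,H,W,3), uint8
frames = [frame.astype(np.float32) / 255.0 for frame in video]

# Run AnyCam, optionally with bundle adjustment refinement.
results = anycam.process_video(frames, ba_refinement=True)

print("Estimated", len(results["trajectory"]), "camera poses")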
To evaluate the AnyCam model, run the following command:
python anycam/scripts/evaluate_trajectories.py -cn evaluate_trajectories ++model_path=pretrained_models/anycam_seq8
You can also enable the with_rerun option during evaluation to visualize the process in rerun.io:
python anycam/scripts/evaluate_trajectories.py -cn evaluate_trajectories ++model_path=pretrained_models/anycam_seq8 ++fit_video.ba_refinement.with_rerun=true
You can use the Jupyter notebook anycam/scripts/anycam_4d_plot.ipynb
for visualizing the results.
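If you only need a quick look at the recovered trajectory outside the notebook, a small matplotlib sketch like the one below can help. It assumes that each entry of results["trajectory"] from process_video is a 4x4 pose matrix with the camera position in the last column, which may differ from the actual output format:

import numpy as np
import matplotlib.pyplot as plt

# Stack the per-frame pose matrices and extract the camera positions
# (assuming 4x4 matrices with the translation in the last column).
poses = np.stack([np.asarray(p) for p in results["trajectory"]])
positions = poses[:, :3, 3]

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.plot(positions[:, 0], positions[:, 1], positions[:, 2], marker="o", markersize=2)
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_zlabel("z")
ax.set_title("Estimated camera trajectory")
plt.show()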
For more details, refer to the individual scripts and configuration files in the repository.
We use five datasets to train AnyCam:
We will soon release instructions on how to set up the data.
To train the AnyCam model, run the following commands. The provided setup assumes two A100 40GB GPUs. If your setup is different, modify the ++nproc_per_node and ++backend flags.
First stage (2 frames):
python train_anycam.py -cn anycam_training \
++nproc_per_node=2 \
++backend=nccl \
++name=anycam_seq2 \
++output.unique_id=baseline
Second stage (8 frames):
python train_campred.py -cn anycam_training \
++nproc_per_node=2 \
++backend=nccl \
++name=anycam_seq8 \
++output.unique_id=baseline \
++batch_size=4 \
++dataset_params.frame_count=8 \
++training.optimizer.args.lr=1e-5 \
++training.from_pretrained=out/anycam_training/anycam_seq2_backend-nccl-2_baseline/training_checkpoint_247500.pt \
~dataloading.staged_datasets \
++validation.validation.fit_video_config=cam_pred/configs/eval_cfgs/train_eval.yaml \
++loss.0.lambda_label_scale=100