
mao-code/SmartSpatial


SmartSpatial: Enhancing the 3D Spatial Arrangement Capabilities of Stable Diffusion Models and Introducing a Novel 3D Spatial Evaluation Framework


Overview

This is the implementation code for SmartSpatial.

SmartSpatial is a novel framework that enhances 3D spatial arrangement capabilities in Stable Diffusion models. It introduces:

  1. SmartSpatial: A method that improves object placement precision in text-to-image generation by incorporating 3D-aware conditioning and cross-attention mechanisms.

  2. SmartSpatialEval: A novel evaluation framework leveraging vision-language models and graph-based spatial analysis to assess generated images.

Structure of SmartSpatial:

SmartSpatial Overview

Structure of SmartSpatialEval:

SmartSpatial Evaluation

Below are some example outputs:

Example Generated Images

Installation

To set up the environment, install the required dependencies using:

pip install -r requirements.txt

Reproduction & Evaluation

Configuration

Modify parameters in conf/base_config.yaml according to your needs. This includes:

  1. Random seed for reproducibility
  2. API key for external services (e.g., OpenAI)
  3. Loss thresholds and hyperparameters
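As a rough orientation, the three items above might appear in conf/base_config.yaml along these lines. The key names and values here are assumptions for illustration only; consult the shipped config file for the authoritative names:

```yaml
# Hypothetical key names -- check conf/base_config.yaml for the real ones.
seed: 42                   # random seed for reproducibility
openai_api_key: "sk-..."   # API key for SmartSpatialEval's VLM calls (placeholder)
loss_threshold: 0.2        # example attention-guidance loss threshold
guidance_scale: 7.5        # standard Stable Diffusion hyperparameter
```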

Evaluation

To reproduce the results, refer to the scripts in script/benchmarks, which generate images on the different benchmarks.

To evaluate the results, run the scripts in script/evaluation. In that folder, eval_trad.sh computes traditional metrics, while eval_smart_spatial_eval evaluates images via SmartSpatialEval.

Inference

The SmartSpatial modules live in the my_model folder. To use SmartSpatial for spatially aware text-to-image generation, refer to our pipeline code in the SmartSpatial folder. For inference, simply create a SmartSpatialPipeline object like this:

smart_spatial = SmartSpatialPipeline(conf, device)

Afterwards, call the "generate" function on this object with the proper parameters. You can create your own reference images using matplotlib or Blender and convert them to depth maps via our depth estimator code, the "preprocess_depth_map" function in the SmartSpatialPipeline class.
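The matplotlib route mentioned above can be sketched as follows. This is a minimal, self-contained example of drawing a reference layout image; the object names, canvas size, and box coordinates are illustrative, and the depth-map conversion itself is left to the repo's "preprocess_depth_map" function (not reproduced here):

```python
# Sketch: render a simple reference layout with matplotlib.
# All coordinates and object choices below are hypothetical.
import matplotlib
matplotlib.use("Agg")  # headless rendering, no display needed
import matplotlib.pyplot as plt
import matplotlib.patches as patches

fig, ax = plt.subplots(figsize=(5.12, 5.12), dpi=100)  # ~512x512 canvas
ax.set_xlim(0, 1)
ax.set_ylim(0, 1)
ax.axis("off")

# Hypothetical layout: a "cup" placed in front of a "laptop"
ax.add_patch(patches.Rectangle((0.15, 0.45), 0.50, 0.40, color="gray"))   # laptop
ax.add_patch(patches.Rectangle((0.45, 0.15), 0.25, 0.25, color="black"))  # cup

fig.savefig("reference_layout.png", bbox_inches="tight", pad_inches=0)
plt.close(fig)
```

The saved image can then be fed to the pipeline's depth estimator to obtain the depth map used for conditioning.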

For SmartSpatialEval, after placing your OpenAI API key in conf/base_config.yaml, you can create the SmartSpatialEval pipeline like this:

smart_spatial_eval = SmartSpatialEvalPipeline(conf, device)

and execute the "evaluate" function with the proper parameters to evaluate your own dataset with SmartSpatialEval. Note that your data must follow our dataset format; you can find an example in dataset/spatial_prompt.py.
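For orientation, a dataset entry might look roughly like the following. This is a hypothetical sketch only; the authoritative schema is defined in dataset/spatial_prompt.py, and every field name below is an assumption:

```python
# Hypothetical spatial-prompt entry -- field names are illustrative,
# not the actual schema from dataset/spatial_prompt.py.
example_entry = {
    "prompt": "a cup in front of a laptop",   # text prompt to evaluate
    "objects": ["cup", "laptop"],             # objects whose placement is checked
    "relation": "in front of",                # expected spatial relation
}
```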
