How to properly handle class imbalance in YOLOv5m #13589

rmarkovic00 opened this issue May 13, 2025 · 6 comments
Labels
detect (Object Detection issues, PR's), question (Further information is requested)

Comments

@rmarkovic00
Search before asking

Question

Hi,

I'm working on a YOLOv5m object detection project using a custom dataset with three classes: car, truck, and person. The class distribution is unbalanced (e.g. many more cars than trucks or persons).

To address this, I added the following line in my dataset.yaml file: weights: [1.0, 2.0, 3.0]
And in hyp.no-augmentation.yaml, I set: cls_pw: 1.5

My goal is to help the model better learn underrepresented classes by giving them more weight during training.

Is this the correct and recommended approach in YOLOv5 for handling class imbalance?
Or is there a better way to improve performance on rare classes?

Thank you!

Additional

No response

@rmarkovic00 rmarkovic00 added the question Further information is requested label May 13, 2025
@UltralyticsAssistant UltralyticsAssistant added the detect Object Detection issues, PR's label May 13, 2025
@UltralyticsAssistant
Member

👋 Hello @rmarkovic00, thank you for your interest in YOLOv5 🚀! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.

If this is a 🐛 Bug Report, please provide a minimum reproducible example to help us debug it.

Requirements

Python>=3.8.0 with all requirements.txt installed including PyTorch>=1.8. To get started:

git clone https://github.com/ultralytics/yolov5  # clone
cd yolov5
pip install -r requirements.txt  # install

Environments

YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Status

YOLOv5 CI
If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training, validation, inference, export and benchmarks on macOS, Windows, and Ubuntu every 24 hours and on every commit.

🔎 This is an automated response. An Ultralytics engineer will review your question and provide further assistance soon!

@pderrenger
Member

Hi @rmarkovic00,

Yes, your approach for handling class imbalance in YOLOv5 is valid. Using class weights in dataset.yaml and adjusting cls_pw in your hyperparameters are good starting points.
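
To make the effect of cls_pw concrete, here is a minimal sketch. It assumes cls_pw is passed as the pos_weight of the binary cross-entropy classification loss, which is how utils/loss.py appears to use it. The helper bce_with_pos_weight is hypothetical and written in plain Python for illustration; it is not YOLOv5 code. Whether train.py reads a weights key from dataset.yaml at all is worth verifying separately.

```python
import math


def bce_with_pos_weight(logit: float, target: float, pos_weight: float = 1.0) -> float:
    """Binary cross-entropy on a single logit, with the positive term scaled by pos_weight.

    Hypothetical helper mirroring the usual pos_weight definition:
    loss = -(pos_weight * y * log(p) + (1 - y) * log(1 - p)), where p = sigmoid(logit).
    """
    p = 1.0 / (1.0 + math.exp(-logit))
    return -(pos_weight * target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


# Positive target: the loss is multiplied by pos_weight, whatever the class is.
base = bce_with_pos_weight(0.0, 1.0, pos_weight=1.0)
weighted = bce_with_pos_weight(0.0, 1.0, pos_weight=1.5)

# Negative target: pos_weight has no effect.
negative = bce_with_pos_weight(0.0, 0.0, pos_weight=1.5)
```

Under that assumption, cls_pw: 1.5 scales the positive classification term for car, truck, and person alike. It is a single scalar, so on its own it does not favor the rare classes over the common one.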

Some additional strategies you could implement:

  1. Data augmentation focused on minority classes (trucks and persons)
  2. Targeted dataset expansion by collecting more samples of underrepresented classes
  3. Adjust your IoU thresholds during evaluation to be class-specific
  4. Consider using the --rect flag during training, which may help with detecting objects at different scales

For severe imbalance, you might also try undersampling the majority class (cars) or implementing focal loss by adjusting the hyperparameters.

YOLOv5 handles bias fairly well with proper configuration, but monitoring slice-level performance during training (how the model performs on each class separately) will help you identify if further adjustments are needed.
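
Before tuning anything, it can help to measure the imbalance. The sketch below counts instances per class from YOLO-format label text (one object per line, starting with the class index) and derives inverse-frequency weights normalized to sum to 1. The class order 0 = car, 1 = truck, 2 = person, the sample labels, and both helper names are assumptions made for illustration.

```python
from collections import Counter


def count_instances(label_texts: list[str]) -> Counter:
    """Count objects per class index across the contents of YOLO label files."""
    counts: Counter = Counter()
    for text in label_texts:
        for line in text.splitlines():
            parts = line.split()
            if parts:  # skip blank lines
                counts[int(parts[0])] += 1
    return counts


def inverse_frequency_weights(counts: Counter, num_classes: int) -> list[float]:
    """Return 1/count per class, normalized so the weights sum to 1."""
    inverse = [1.0 / max(counts.get(c, 0), 1) for c in range(num_classes)]
    total = sum(inverse)
    return [w / total for w in inverse]


# Made-up label contents for three images (assumed: 0 = car, 1 = truck, 2 = person).
labels = [
    "0 0.5 0.5 0.2 0.2\n0 0.3 0.3 0.1 0.1\n1 0.7 0.7 0.3 0.2",
    "0 0.4 0.6 0.2 0.2\n2 0.5 0.5 0.1 0.3",
    "0 0.2 0.2 0.1 0.1\n0 0.8 0.8 0.1 0.1\n1 0.5 0.5 0.4 0.3",
]

counts = count_instances(labels)
weights = inverse_frequency_weights(counts, num_classes=3)
```

In practice the label text would be read from the .txt files next to your training images. The resulting weights are only a diagnostic here; nothing in this sketch feeds them into YOLOv5.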

@soham-2006

If you want to actually guide the model to better learn rare classes during training, you need to modify the loss computation or re-balance the dataset or sampling.

  1. Use Focal Loss
    YOLOv5 supports Focal Loss, which is great for class imbalance.

  2. Oversampling Rare Classes (Data Level)
    Duplicate images with rare classes to balance the dataset. Tools like Albumentations or custom scripts can help automate this.

  3. Manual Class Weighting in the Loss Function (Advanced)
    YOLOv5 doesn’t support per-class loss weighting natively during training, but you can modify the loss function in the code (loss.py) to manually apply class weights. Requires knowledge of PyTorch and understanding of YOLO loss structure.

  4. Synthetic Data Augmentation for Rare Classes
    Use tools like imgaug, Albumentations, or Roboflow to generate more samples for rare classes. Helps increase generalization without full duplication.
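
The oversampling idea can be sketched with a small custom script. This is one possible heuristic, not a YOLOv5 feature: each image is repeated according to the rarest class it contains, capped at max_repeat. The function name, the cap, and the example file names are all hypothetical.

```python
from collections import Counter


def oversample(image_classes: dict[str, list[int]], max_repeat: int = 4) -> list[str]:
    """Build a training list in which images with rare classes appear several times."""
    counts = Counter(c for classes in image_classes.values() for c in classes)
    most_common = max(counts.values())
    train_list: list[str] = []
    for image, classes in image_classes.items():
        if not classes:
            train_list.append(image)  # background image: keep a single copy
            continue
        # Ratio of the most frequent class count to the rarest class in this image.
        factor = max(most_common / counts[c] for c in classes)
        repeats = min(max_repeat, max(1, round(factor)))
        train_list.extend([image] * repeats)
    return train_list


# Made-up example: class indices per image (assumed: 0 = car, 1 = truck, 2 = person).
images = {
    "a.jpg": [0, 0],
    "b.jpg": [0, 1],
    "c.jpg": [0, 2],
    "d.jpg": [0],
}
train_list = oversample(images)
```

The repeated paths could then be written to a text file used as the training list. Note that repeating an image also repeats the common-class objects in it, so the balance improves but never becomes exact.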

@pderrenger
Member

Hi @soham-2006, thanks for those additional suggestions.

Building on this discussion, I'd like to add that class imbalance is a form of representation bias that can significantly impact model performance. Per our documentation on dataset bias, analyzing performance across slices is critical - a model with 90% overall accuracy might only achieve 60% on underrepresented classes.

@rmarkovic00 - You might also consider:

  1. Using the Albumentations integration with YOLOv5 for advanced augmentations targeting minority classes - we have documentation on this at https://docs.ultralytics.com/integrations/albumentations/

  2. For severe imbalance, try adjusting the fl_gamma parameter in your hyperparameters to increase the focus on hard examples (higher values like 1.5-2.0 can help with rare classes)

  3. Monitoring class-specific metrics to catch issues early, for example by running val.py with the --verbose flag, which reports mAP per class

Remember that the most effective approach often combines multiple strategies rather than relying on weights alone.
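
To see why a higher fl_gamma shifts attention to hard examples, here is a minimal sketch of the standard focal-loss modulating factor applied to binary cross-entropy. It omits the alpha balancing term and is not a copy of YOLOv5's own FocalLoss class; focal_bce is a hypothetical helper for illustration.

```python
import math


def focal_bce(logit: float, target: int, gamma: float = 0.0) -> float:
    """Binary cross-entropy scaled by (1 - p_t) ** gamma, where p_t is the
    predicted probability of the true label. gamma = 0 gives plain BCE."""
    p = 1.0 / (1.0 + math.exp(-logit))
    p_t = p if target == 1 else 1.0 - p
    return (1.0 - p_t) ** gamma * -math.log(p_t)


# An easy positive (confidently correct) and a hard positive (mostly wrong).
easy_plain = focal_bce(3.0, 1, gamma=0.0)
easy_focal = focal_bce(3.0, 1, gamma=1.5)
hard_plain = focal_bce(-1.0, 1, gamma=0.0)
hard_focal = focal_bce(-1.0, 1, gamma=1.5)

# Fraction of the plain BCE loss that survives the focal scaling.
easy_ratio = easy_focal / easy_plain
hard_ratio = hard_focal / hard_plain
```

The easy example keeps a much smaller share of its loss than the hard one, so well-classified objects (often the majority class) contribute less to the gradient. The effect depends on how hard an example is, not on which class it belongs to, which is why it is usually combined with the data-level strategies above.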

@soham-2006

Thanks

@pderrenger
Member

@rmarkovic00 - Just checking if the suggestions we've provided have been helpful for your class imbalance issue. If you've had a chance to implement any of these approaches, we'd be interested to hear about your results. Feel free to let us know if you encounter any challenges or have additional questions about implementing these strategies with your YOLOv5 model.


4 participants