PR: Fix Duplicate Metric Logging in MLFlowLogger to Prevent MLflow Database Errors #20871

KAVYANSHTYAGI · 2025-06-02T13:30:20Z

What does this PR do?

This PR fixes a long-standing issue in PyTorch Lightning’s MLFlowLogger where logging the same metric (with the same name and step) more than once in a run causes a unique constraint violation on certain MLflow backends (e.g., PostgreSQL).
Now, MLFlowLogger tracks (metric, step) pairs and skips any duplicate metric logs within a run, preventing database errors and improving robustness.

This change also updates the class docstring to document this new behavior and adds a unit test to verify that duplicate metric logs are ignored as expected.

Fixes #20865

Motivation and Context

Some MLflow tracking servers (such as those backed by PostgreSQL) enforce a unique constraint on metrics.

If the same metric (with identical name and step) is logged more than once, MLflow returns an error and metric logging fails, potentially halting training.
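To illustrate the failure mode, here is a minimal sketch using an in-memory SQLite database. The table layout, the `make_db` and `insert_metric` helpers, and the choice of constraint columns are all assumptions made for illustration; this is not MLflow's actual schema, only a simplified picture of how a unique constraint rejects a second row for the same run, name, and step.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create a simplified, hypothetical metrics table (not MLflow's real schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE metrics ("
        "run_id TEXT, name TEXT, step INTEGER, value REAL, "
        "UNIQUE (run_id, name, step))"
    )
    return conn


def insert_metric(
    conn: sqlite3.Connection, run_id: str, name: str, step: int, value: float
) -> bool:
    """Insert one metric row; return False if the unique constraint rejects it."""
    try:
        # The connection context manager commits on success, rolls back on error.
        with conn:
            conn.execute(
                "INSERT INTO metrics (run_id, name, step, value) VALUES (?, ?, ?, ?)",
                (run_id, name, step, value),
            )
        return True
    except sqlite3.IntegrityError:
        # Same (run_id, name, step) already exists in the table.
        return False
```

In this sketch the second insert for the same run, name, and step is rejected by the database, which is the kind of error the PR aims to avoid by never sending the duplicate in the first place.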

This situation often arises when users call .log() in multiple hooks or callbacks.

The deduplication logic ensures only the first log of a metric per (name, step) is recorded per run.
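The idea can be sketched roughly as follows. This is an illustrative stand-in, not the code in this PR: the class name `DedupingMetricLogger`, the `_logged` set, and the `sent` list (which stands in for the calls made to the MLflow backend) are all hypothetical.

```python
from typing import Dict, List, Set, Tuple


class DedupingMetricLogger:
    """Hypothetical logger that records each (name, step) pair at most once."""

    def __init__(self) -> None:
        # (metric name, step) pairs already logged in this run.
        self._logged: Set[Tuple[str, int]] = set()
        # Stand-in for the tracking backend: what would actually be sent.
        self.sent: List[Tuple[str, float, int]] = []

    def log_metrics(self, metrics: Dict[str, float], step: int) -> None:
        for name, value in metrics.items():
            key = (name, step)
            if key in self._logged:
                # Duplicate for this run: skip instead of hitting the backend.
                continue
            self._logged.add(key)
            self.sent.append((name, value, step))
```

With this sketch, logging `loss` twice at step 1 keeps only the first value, while a later log at step 2 goes through as usual. One consequence worth noting is that a second, different value for the same name and step is dropped rather than overwriting the first.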

Dependencies

No new dependencies are introduced.

Does your PR introduce any breaking changes?

No breaking changes. Existing behavior is preserved, except that duplicate metric logs are now skipped instead of being sent to MLflow (users may see a log message when a duplicate is skipped).

Other Checklist Items

Documentation updated: yes (see class docstring in MLFlowLogger)

New test added for deduplication: yes

Fun fact:
This change will help Lightning users avoid subtle training failures, especially with remote or production MLflow tracking servers!

📚 Documentation preview 📚: https://pytorch-lightning--20871.org.readthedocs.build/en/20871/


Update mlflow.py

dc49140

KAVYANSHTYAGI requested a review from williamFalcon as a code owner June 2, 2025 13:30

github-actions bot added the pl Generic label for PyTorch Lightning package label Jun 2, 2025

pre-commit-ci bot and others added 2 commits June 2, 2025 13:30

[pre-commit.ci] auto fixes from pre-commit.com hooks

7d786af

for more information, see https://pre-commit.ci

Update test_mlflow.py

fd4bafa

KAVYANSHTYAGI requested review from lantiga, Borda, tchaton, justusschock and ethanwharris as code owners June 2, 2025 13:32

[pre-commit.ci] auto fixes from pre-commit.com hooks

49f8f4c

for more information, see https://pre-commit.ci
