
Unable to use Silhouette Visualizer with Gaussian Mixture Model #1303

Open
@Thecave3

Describe the bug
Silhouette scores and their visualization can be computed for the output of a Gaussian Mixture Model, but SilhouetteVisualizer currently rejects GMM estimators.

To Reproduce
I took the linked SilhouetteVisualizer example code and changed the model from KMeans to GMM.

# Steps to reproduce the behavior (code snippet):
# Should include imports, dataset loading, and execution
from sklearn.mixture import GaussianMixture as GMM

from yellowbrick.cluster import SilhouetteVisualizer
from yellowbrick.datasets import load_nfl

# Load a clustering dataset
X, y = load_nfl()

# Specify the features to use for clustering
features = ['Rec', 'Yds', 'TD', 'Fmb', 'Ctch_Rate']
X = X.query('Tgt >= 20')[features]

# Instantiate the clustering model and visualizer
model = GMM(5, random_state=42)
visualizer = SilhouetteVisualizer(model, colors='yellowbrick')

visualizer.fit(X)        # Fit the data to the visualizer
visualizer.show()        # Finalize and render the figure

Dataset
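The choice of dataset does not affect the outcome: as the traceback shows, the exception is raised when the visualizer is constructed, before any data is passed in.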

Expected behavior
I expect the visualizer to fit the data and render the silhouette scores in the figure.

Traceback

Traceback (most recent call last):
  File "sil_testet.py", line 15, in <module>
    visualizer = SilhouetteVisualizer(model, colors='yellowbrick')
  File "/usr/local/lib/python3.8/dist-packages/yellowbrick/cluster/silhouette.py", line 118, in __init__
    super(SilhouetteVisualizer, self).__init__(estimator, ax=ax, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/yellowbrick/cluster/base.py", line 45, in __init__
    raise YellowbrickTypeError(
yellowbrick.exceptions.YellowbrickTypeError: The supplied model is not a clustering estimator; try a classifier or regression score visualizer instead!

Desktop (please complete the following information):

  • OS: Ubuntu 20.04
  • Python Version: 3.8
  • Yellowbrick Version: unknown (installed with pip; I do not know how to retrieve it)
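
For reference, the version of a pip-installed package can usually be read from its installed metadata. The following is a minimal sketch using only the standard library (Python 3.8+); the helper name `installed_version` is hypothetical and only for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string if Yellowbrick is installed, otherwise None
print(installed_version("yellowbrick"))
```

Running `pip show yellowbrick` in a shell reports the same information.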

Additional context

I believe SilhouetteVisualizer should support GMM, since a Gaussian Mixture Model can be used as a clustering method (see, e.g., "Gaussian Mixture Models Clustering Algorithm Explained").
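
To illustrate the point, silhouette values can be computed with scikit-learn alone by treating each sample's most probable mixture component as its cluster label. This is a minimal sketch, not a Yellowbrick fix: the synthetic `make_blobs` data and the choice of three components are assumptions made for illustration and stand in for the NFL data used above.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_samples, silhouette_score
from sklearn.mixture import GaussianMixture

# Synthetic stand-in data (assumption: three well-separated blobs)
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.6, random_state=42)

# Fit the mixture and take the most probable component as a hard cluster label
gmm = GaussianMixture(n_components=3, random_state=42)
labels = gmm.fit_predict(X)

# Mean silhouette over all samples, plus the per-sample values a plot would use
avg_score = silhouette_score(X, labels)
sample_scores = silhouette_samples(X, labels)

print(f"mean silhouette: {avg_score:.3f}")
for k in np.unique(labels):
    # Average silhouette of the samples assigned to component k
    print(f"component {k}: {sample_scores[labels == k].mean():.3f}")
```

The per-sample values in `sample_scores`, grouped by label, are the quantities a silhouette plot displays, so the only missing piece is the visualizer accepting the estimator.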
