Fix bad input and deployment container crash error in notebook tests #3609


Open
wants to merge 3 commits into
base: main

javed-73
Contributor

@javed-73 javed-73 commented Jun 3, 2025

Description

This PR fixes two issues.

  1. Bad input error.
    The image-object-detection notebook test run is failing because of bad input.

image_object_detection_job = automl.image_object_detection(
    compute=compute_name,
    experiment_name=exp_name,
    training_data=my_training_data_input,
    validation_data=my_validation_data_input,
    target_column_name="label",
    primary_metric="mean_average_precision",
    tags={"my_custom_tag": "My custom value"},
)

image_object_detection_job.set_limits(
    max_trials=2,
    max_concurrent_trials=2,
)

Issue: the sweep pipeline job runs with max_trials=2, which fails with "Error: Input request is invalid", as can be seen in the job linked below.

Setting max_trials to 3 fixes it.
Job

  2. Error: container crash at the deployment step.
    Failed build
    The deployment logs show the following error:
    RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
    Job
    Fix: use a GPU machine for deployment.
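
For context, the RuntimeError above names an alternative to a GPU machine: passing map_location to torch.load. This PR does not take that route, and nothing here says whether the deployment's scoring code can be changed, so the following is only a minimal sketch of the pattern. The helper names choose_map_location and load_checkpoint are hypothetical.

```python
def choose_map_location(cuda_available: bool) -> str:
    """Pick the device string to pass as torch.load's map_location."""
    return "cuda" if cuda_available else "cpu"


def load_checkpoint(path):
    """Load a checkpoint onto the GPU when present, otherwise onto the CPU."""
    import torch  # imported here so choose_map_location has no torch dependency

    device = choose_map_location(torch.cuda.is_available())
    # map_location remaps storages that were saved on a CUDA device to `device`.
    return torch.load(path, map_location=device)
```

On a CPU-only machine, load_checkpoint resolves the device to "cpu", which is the case the error message describes.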

Checklist

  • I have read the contribution guidelines.
  • I have coordinated with the docs team ([email protected]) if this PR deletes files or changes any file names or file extensions.
  • Pull request includes test coverage for the included changes.
  • This notebook or file is added to the CODEOWNERS file, pointing to the author or the author's team.

@javed-73
Contributor Author

javed-73 commented Jun 4, 2025

The image-object-detection test still fails later, at the deployment step; that will be addressed in a separate PR. As far as the bad input failure is concerned, this change works.

SamGos93 previously approved these changes Jun 4, 2025
@yeshsurya yeshsurya left a comment

Please merge once the gates are successful.

@javed-73 javed-73 changed the title from "fix bad input error in notebook test" to "Fix bad input and deployment container crash error in notebook tests" Jun 5, 2025
@yeshsurya yeshsurya self-requested a review June 5, 2025 05:57
@@ -85,6 +85,7 @@ jobs:
 source "${{ github.workspace }}/infra/bootstrapping/init_environment.sh";
 bash "${{ github.workspace }}/infra/bootstrapping/sdk_helpers.sh" generate_workspace_config "../../.azureml/config.json";
 bash "${{ github.workspace }}/infra/bootstrapping/sdk_helpers.sh" replace_template_values "automl-image-object-detection-task-fridge-items.ipynb";
+sed -i 's/max_trials=2/max_trials=3/g' automl-image-object-detection-task-fridge-items.ipynb

Why does this fail with max_trials=2 but work with 3?

The documentation describes max_trials as follows:
Parameter for maximum number of configurations to sweep. Must be an integer between 1 and 1000. When exploring just the default hyperparameters for a given model algorithm, set this parameter to 1. Default value is 1.

I'm trying to understand why.
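
For reference, here is a small Python sketch of what the sed line in this hunk does to the notebook text, next to the range quoted from the documentation. The helper names in_documented_range and patch_max_trials are hypothetical and are not part of the workflow or the SDK. Both 2 and 3 fall inside the documented range, so the range alone does not account for the failure.

```python
# Range quoted from the documentation in the comment above.
MAX_TRIALS_MIN, MAX_TRIALS_MAX = 1, 1000


def in_documented_range(max_trials: int) -> bool:
    """Check a value against the documented 1..1000 range for max_trials."""
    return MAX_TRIALS_MIN <= max_trials <= MAX_TRIALS_MAX


def patch_max_trials(notebook_text: str, old: int = 2, new: int = 3) -> str:
    """Mimic `sed 's/max_trials=2/max_trials=3/g'` on a string."""
    # Like the sed pattern, this is a plain substring replacement, so
    # `max_concurrent_trials=2` is left alone (it does not contain
    # the substring `max_trials=2`).
    return notebook_text.replace(f"max_trials={old}", f"max_trials={new}")


source = "max_trials=2,\nmax_concurrent_trials=2,"
patched = patch_max_trials(source)
print(patched)
print(in_documented_range(2), in_documented_range(3))
```

The substitution changes only the max_trials argument; max_concurrent_trials keeps its value of 2.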

3 participants