Restore default CI configuration for VAE and Siamese examples using accelerator API #1342


Merged
merged 8 commits into from
Jun 15, 2025
Changes from 7 commits
10 changes: 5 additions & 5 deletions siamese_network/README.md
@@ -21,17 +21,17 @@ Optionally, you can add the following arguments to customize your execution.
--epochs number of epochs to train (default: 14)
--lr learning rate (default: 1.0)
--gamma learning rate step gamma (default: 0.7)
- --accel use accelerator
+ --no-accel disables accelerator
--dry-run quickly check a single pass
--seed random seed (default: 1)
--log-interval how many batches to wait before logging training status
--save-model Saving the current Model
```

- To execute in an GPU, add the --accel argument to the command. For example:
+ If a hardware accelerator device is detected, the example will execute on the accelerator; otherwise, it will run on the CPU.
Contributor
I advise still changing this readme a little. Issues I currently see are:

  1. The only example given in the readme uses --no-accel. That's a little weird, since we actually want to use the accelerator rather than not :).
  2. "Optionally, you can add the following arguments" seems out of place. You kind of expect some usage examples before it, but they actually come after.

So, I suggest changing the order of the sections and giving a basic example first. Something like this (a sketch):

# Siamese Network Example
Siamese network <...>

To run the example, execute:
   python main.py
If a hardware accelerator device is detected <...>

To force execution on the CPU <...>
  python main.py --no-accel

Optionally, you can add the following arguments <...>

Contributor Author
The changes have been made according to your suggestion.


To force execution on the CPU, use `--no-accel` command line argument:

```bash
- python main.py --accel
+ python main.py --no-accel
```

- This command will execute the example on the detected GPU.
21 changes: 9 additions & 12 deletions siamese_network/main.py
@@ -247,8 +247,8 @@ def main():
                         help='learning rate (default: 1.0)')
     parser.add_argument('--gamma', type=float, default=0.7, metavar='M',
                         help='Learning rate step gamma (default: 0.7)')
-    parser.add_argument('--accel', action='store_true',
-                        help='use accelerator')
+    parser.add_argument('--no-accel', action='store_true',
+                        help='disables accelerator')
     parser.add_argument('--dry-run', action='store_true', default=False,
                         help='quickly check a single pass')
     parser.add_argument('--seed', type=int, default=1, metavar='S',
@@ -258,16 +258,13 @@ def main():
     parser.add_argument('--save-model', action='store_true', default=False,
                         help='For Saving the current Model')
     args = parser.parse_args()

+    use_accel = not args.no_accel and torch.accelerator.is_available()
+
     torch.manual_seed(args.seed)

-    if args.accel and not torch.accelerator.is_available():
-        print("ERROR: accelerator is not available, try running on CPU")
-        sys.exit(1)
-    if not args.accel and torch.accelerator.is_available():
-        print("WARNING: accelerator is available, run with --accel to enable it")
-
-    if args.accel:
+    if use_accel:
         device = torch.accelerator.current_accelerator()
     else:
         device = torch.device("cpu")
@@ -276,12 +273,12 @@ def main():

     train_kwargs = {'batch_size': args.batch_size}
     test_kwargs = {'batch_size': args.test_batch_size}
-    if device=="cuda":
-        cuda_kwargs = {'num_workers': 1,
+    if use_accel:
+        accel_kwargs = {'num_workers': 1,
                        'pin_memory': True,
                        'shuffle': True}
-        train_kwargs.update(cuda_kwargs)
-        test_kwargs.update(cuda_kwargs)
+        train_kwargs.update(accel_kwargs)
+        test_kwargs.update(accel_kwargs)

     train_dataset = APP_MATCHER('../data', train=True, download=True)
     test_dataset = APP_MATCHER('../data', train=False)
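The diff above reduces device choice to a single boolean: use the accelerator whenever one is detected, unless the user passes `--no-accel`. As a minimal torch-free sketch of that pattern, with `torch.accelerator.is_available()` stubbed out as a plain parameter and the helper name `select_device` invented for illustration:

```python
import argparse

def select_device(argv, accelerator_available):
    # Hypothetical helper mirroring the pattern in this diff: the
    # accelerator is the default whenever one is detected, and
    # --no-accel opts out. `accelerator_available` stands in for the
    # torch.accelerator.is_available() call in the real code.
    parser = argparse.ArgumentParser()
    parser.add_argument('--no-accel', action='store_true',
                        help='disables accelerator')
    args = parser.parse_args(argv)
    use_accel = not args.no_accel and accelerator_available
    return 'accelerator' if use_accel else 'cpu'
```

Because the default is opt-out rather than opt-in, a bare `python main.py` automatically benefits from whatever accelerator PyTorch detects, which is what the reviewer's suggested README ordering emphasizes.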
12 changes: 10 additions & 2 deletions vae/README.md
@@ -13,7 +13,15 @@ The main.py script accepts the following optional arguments:
```bash
--batch-size input batch size for training (default: 128)
--epochs number of epochs to train (default: 10)
- --accel use accelerator
+ --no-accel disables accelerator
Contributor
I suggest adding text similar to the one in the siamese example after the options list.

Contributor Author
Updated

--seed random seed (default: 1)
--log-interval how many batches to wait before logging training status
```

If a hardware accelerator device is detected, the example will execute on the accelerator; otherwise, it will run on the CPU.

To force execution on the CPU, use `--no-accel` command line argument:

```bash
python main.py --no-accel
```
14 changes: 5 additions & 9 deletions vae/main.py
@@ -13,31 +13,27 @@
                     help='input batch size for training (default: 128)')
 parser.add_argument('--epochs', type=int, default=10, metavar='N',
                     help='number of epochs to train (default: 10)')
-parser.add_argument('--accel', action='store_true',
-                    help='use accelerator')
+parser.add_argument('--no-accel', action='store_true',
+                    help='disables accelerator')
 parser.add_argument('--seed', type=int, default=1, metavar='S',
                     help='random seed (default: 1)')
 parser.add_argument('--log-interval', type=int, default=10, metavar='N',
                     help='how many batches to wait before logging training status')
 args = parser.parse_args()

+use_accel = not args.no_accel and torch.accelerator.is_available()
+
 torch.manual_seed(args.seed)

-if args.accel and not torch.accelerator.is_available():
-    print("ERROR: accelerator is not available, try running on CPU")
-    sys.exit(1)
-if not args.accel and torch.accelerator.is_available():
-    print("WARNING: accelerator is available, run with --accel to enable it")
-
-if args.accel:
+if use_accel:
     device = torch.accelerator.current_accelerator()
 else:
     device = torch.device("cpu")

 print(f"Using device: {device}")

-kwargs = {'num_workers': 1, 'pin_memory': True} if device=="cuda" else {}
+kwargs = {'num_workers': 1, 'pin_memory': True} if use_accel else {}
 train_loader = torch.utils.data.DataLoader(
     datasets.MNIST('../data', train=True, download=True,
                    transform=transforms.ToTensor()),
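In both files, the same boolean also gates the DataLoader extras. A torch-free sketch of that merge step (the helper name `loader_kwargs` is illustrative, not part of the diff):

```python
def loader_kwargs(batch_size, use_accel):
    # Base kwargs shared by CPU and accelerator runs.
    kwargs = {'batch_size': batch_size}
    if use_accel:
        # pin_memory requests page-locked host memory, which speeds up
        # host-to-device copies; a worker process keeps batches queued.
        kwargs.update({'num_workers': 1, 'pin_memory': True})
    return kwargs
```

Keying the extras on `use_accel` rather than `device=="cuda"` also fixes a latent bug the diffs remove: `device` is a `torch.device` object, so the old string comparison was always false and the CUDA kwargs were never applied.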