Feature: Parallel iterations #63

Open: wants to merge 4 commits into main

Conversation

MashAliK
Contributor

@MashAliK MashAliK commented Jun 9, 2025

Adds parallel iterations, along with a few bug fixes and improvements. I consider this an important feature because running iterations in parallel speeds up training significantly. This is the implementation I tried, and it has worked for my use cases.

Primary changes:

  • Use concurrent.futures in controller.py to spawn worker processes that run the training iterations
  • Moved most of the iteration logic into the function run_iteration_sync in a new file, iteration.py, which the workers run to allow concurrent execution
  • Pass config into the spawned processes so each can initialize its own instances of LLMEnsemble and ProgramDatabase (these classes are not picklable, so this is the best approach I could think of for using them in each worker process)
  • As a consequence of this approach, the root process needs to save a snapshot of the database every iteration, and each new process needs to load it
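The pattern in the list above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `Config`, `ToyDatabase`, and `run_parallel` are hypothetical stand-ins, the body of `run_iteration_sync` is a placeholder, and the snapshot is written once per batch rather than every iteration to keep the sketch short.

```python
import json
import os
import tempfile
from concurrent.futures import ProcessPoolExecutor, as_completed
from dataclasses import asdict, dataclass


@dataclass
class Config:
    # Hypothetical stand-in for the project's config: plain data, so it pickles.
    seed: int = 0


class ToyDatabase:
    """Hypothetical stand-in for ProgramDatabase, rebuilt inside each worker."""

    def __init__(self, config):
        self.config = config
        self.programs = {}  # program id -> score

    def save(self, path):
        with open(path, "w") as f:
            json.dump(self.programs, f)

    def load(self, path):
        with open(path) as f:
            self.programs = json.load(f)


def run_iteration_sync(iteration, config_dict, snapshot_path):
    """Runs in a worker: rebuild state from config, load the snapshot, do one step."""
    config = Config(**config_dict)
    db = ToyDatabase(config)
    db.load(snapshot_path)
    # Placeholder for "sample a parent, query the LLM, evaluate the child".
    parent_score = max(db.programs.values(), default=0.0)
    return f"child-{iteration}", parent_score + 1.0


def run_parallel(config, db, iterations, max_workers=2):
    """Root process: snapshot the database, fan out iterations, merge results."""
    fd, snapshot_path = tempfile.mkstemp(suffix=".json")
    os.close(fd)
    try:
        db.save(snapshot_path)
        with ProcessPoolExecutor(max_workers=max_workers) as pool:
            futures = [
                pool.submit(run_iteration_sync, i, asdict(config), snapshot_path)
                for i in range(iterations)
            ]
            for future in as_completed(futures):
                child_id, score = future.result()
                db.programs[child_id] = score  # only the root mutates the database
    finally:
        os.remove(snapshot_path)
    return db
```

Only plain data (an integer, a dict, and a file path) crosses the process boundary, which is what sidesteps the pickling problem. When trying this out, call `run_parallel` from under an `if __name__ == "__main__":` guard so that worker processes can import the module safely.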

Minor changes:

  • The calculate_edit_distance function was crashing the database when I used it, and since there are already libraries that implement this routine, I ended up using one of them (levenshtein)
  • Replaced the Levenshtein distance with a ratio in _calculate_island_diversity, since it's easier to read
  • Replaced the Levenshtein distance with a ratio in _calculate_feature_coords, since it's normalized for code length
  • Introduced allowed_population_overflow, since otherwise the database was adding and removing a program every iteration once it reached the allowed program limit
  • Added logging for MAP-Elites features
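To illustrate why a ratio is easier to compare across programs than a raw distance, here is a small pure-Python sketch. It does not use the levenshtein library mentioned above, and the normalization shown (one minus the distance divided by the longer length) is an assumption chosen for illustration; the library's own ratio may be defined differently. The names `edit_distance` and `similarity_ratio` are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance using two rolling rows."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


def similarity_ratio(a: str, b: str) -> float:
    """Similarity in [0, 1]; assumed normalization by the longer length."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are identical
    return 1.0 - edit_distance(a, b) / longest


if __name__ == "__main__":
    short_a, short_b = "x = 1", "x = 2"
    long_a = "y = 0\n" * 100 + "x = 1"
    long_b = "y = 0\n" * 100 + "x = 2"
    # Same raw distance, very different ratios.
    print(edit_distance(short_a, short_b), similarity_ratio(short_a, short_b))
    print(edit_distance(long_a, long_b), similarity_ratio(long_a, long_b))
```

A single-character change gives the same raw distance for a five-character program and a six-hundred-character one, while the ratio reflects how small that change is relative to the code length.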

@MashAliK
Contributor Author

MashAliK commented Jun 9, 2025

Solves: #32
I believe it also contains the fix for #60, because I remember that moving the call to _enforce_population_limit later in add fixed an error where a program was removed from the list while it was still being added.
Taking a closer look, this bug looks different from the one I saw (I encountered mine when calling add, not sample).
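The ordering described in this comment, combined with the allowed_population_overflow setting from the PR description, might look roughly like the sketch below. `ToyPopulation` is a hypothetical stand-in and not the project's ProgramDatabase; in particular, pruning back to the base limit and ranking by a single score are assumptions made for illustration.

```python
class ToyPopulation:
    """Hypothetical stand-in; not the project's ProgramDatabase."""

    def __init__(self, population_size, allowed_population_overflow=0):
        self.population_size = population_size
        self.allowed_population_overflow = allowed_population_overflow
        self.programs = {}  # program id -> score

    def add(self, program_id, score):
        # Finish inserting first, so pruning never sees a half-added program.
        self.programs[program_id] = score
        # Enforce the limit as the last step of add.
        self._enforce_population_limit()

    def _enforce_population_limit(self):
        limit = self.population_size + self.allowed_population_overflow
        # Tolerate some overflow so a full database does not prune on every add.
        if len(self.programs) <= limit:
            return
        # Assumed policy: prune back to the base limit, keeping the best scores.
        ranked = sorted(self.programs, key=self.programs.get, reverse=True)
        for program_id in ranked[self.population_size:]:
            del self.programs[program_id]
```

With a base size of 3 and an overflow of 2, the population grows to 5 entries untouched, and the sixth add triggers a single pruning pass back down to 3, instead of one removal per iteration.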

@codelion
Owner

Thanks for contributing. Can you rebase from main? I can then test and review this PR.

2 participants