Nightly build 2025-06-28
Pre-release
HyperQueue dev
Breaking change
- In `--crash-limit`, the value 0 is no longer allowed; use `--crash-limit=unlimited` instead.
- The `--workers-per-alloc` flag of the `hq alloc add` command has been replaced with `--max-workers-per-alloc`,
  which determines the maximum number of workers to spawn in each allocation. Previously the flag caused the
  allocator to (almost) always spawn the determined number of workers per allocation, regardless of actual
  computational load.
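For scripts that still build `hq` command lines with the old flags, the two renames above can be sketched as a small argument rewriter. This is only an illustration: `migrate_hq_args` is a hypothetical helper, not part of HyperQueue, and it assumes (as the note above suggests) that `--crash-limit=0` previously meant "no limit". Keep in mind that `--max-workers-per-alloc` is an upper bound rather than an exact count, so the rewritten command may spawn fewer workers than before.

```python
def migrate_hq_args(args):
    """Rewrite a list of hq CLI arguments from the old flags to the new ones.

    Hypothetical helper for illustration; not part of HyperQueue.
    """
    old_flag = "--workers-per-alloc"
    new_flag = "--max-workers-per-alloc"
    migrated = []
    i = 0
    while i < len(args):
        arg = args[i]
        if arg == "--":
            # Everything after `--` belongs to the wrapped command; copy it as-is.
            migrated.extend(args[i:])
            break
        if arg == old_flag or arg.startswith(old_flag + "="):
            # Covers both `--workers-per-alloc 4` and `--workers-per-alloc=4`.
            arg = new_flag + arg[len(old_flag):]
        elif arg == "--crash-limit=0":
            arg = "--crash-limit=unlimited"
        elif arg == "--crash-limit" and i + 1 < len(args) and args[i + 1] == "0":
            # Space-separated form: consume the flag and its value together.
            migrated.append("--crash-limit=unlimited")
            i += 2
            continue
        migrated.append(arg)
        i += 1
    return migrated
```

Arguments after a literal `--` are left untouched so that flags belonging to the submitted program are never rewritten by accident.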
Changes
The automatic allocator has finally been reimplemented and is now much better:
- It now uses information from the scheduler to determine how many allocations to spawn, and thus it can react to the
  current computational load much more accurately. It should also be less "eager".
- It properly supports multi-node tasks.
- It considers computational load across all allocation queues (before, each queue was treated separately, which led to
  creating too many submissions).
- It now exposes a `min-utilization` parameter, which can be used to avoid spawning an allocation that couldn't be
  utilized enough.
As this is a large behavioral change, we would be happy to hear your feedback!
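To make the idea behind the `min-utilization` parameter more concrete, here is a rough sketch of the kind of check such a threshold implies. It is not the allocator's actual algorithm: the function, its parameters, and the choice of measuring load in CPU cores are all assumptions made for illustration.

```python
def should_spawn_allocation(waiting_cores, max_workers_per_alloc,
                            cores_per_worker, min_utilization):
    """Decide whether a new allocation would be utilized enough.

    Hypothetical model: all names and the core-based load metric are
    assumptions, not HyperQueue internals.
    """
    capacity = max_workers_per_alloc * cores_per_worker
    if capacity <= 0:
        return False
    # Fraction of the new allocation that the waiting work could occupy.
    expected_utilization = min(waiting_cores, capacity) / capacity
    return expected_utilization >= min_utilization
```

Under this model, an allocation of 4 workers with 8 cores each would not be spawned for 2 waiting cores at a threshold of 0.5, but would be spawned once 20 cores' worth of work is waiting.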
New features
- New command `hq task explain <job_id> <task_id>` explains why a task cannot be run on a given worker.
- The server scheduler now slightly prioritizes tasks from older jobs and finishing partially-computed task graphs.
- New values for `--crash-limit`:
  - `never-restart` - task is never restarted, even if it "crashes" on a worker that was explicitly terminated.
  - `unlimited` - unlimited crash limit
- `hq worker info` contains more information
- `hq job forget` tries to free more memory
- You can now configure Job name in the Python API.
- `hq job progress` now displays all jobs and tasks that you wait for, rather than those that were unfinished at the
  time when the command was executed.
Fixes
- Fixed a problem with journal loading when task dependencies are used
- Fixed restoring crash counters and instance ids from journal
- Fixed some corner cases of load balancing in server scheduler
Docs
- CLI documentation (when `--help` is used) was cleaned up and improved
- Our documentation now contains an automatically generated reference of all available HQ CLI commands and options.
Experimental
- Added direct data transfers between tasks. The user API is not yet stabilized.
Artifact summary:
- hq-vdev-*: Main HyperQueue build containing the `hq` binary. Download this archive to
  use HyperQueue from the command line.
- hyperqueue-dev-*: Wheel containing the `hyperqueue` package with HyperQueue Python
  bindings.