Description
Context
At our check-in today, Jeff and I investigated a user issue involving preparing a system with Interchange, running some early prep simulations in GROMACS, and then re-ingesting coordinates from GROMACS back into the Interchange in order to export the prepared system to OpenMM for production simulation.
The user was trying to use PDB files as the interchange format, which is kind of a nightmare for a whole host of reasons. I think using Interchange to perform one phase of a simulation in one engine and then the next phase in another engine is something we should support. In OpenMM this is trivial - you just set the positions attribute. GROMACS offers the GRO file for storing/exchanging coordinates (and optionally velocities) of an existing system/topology within GROMACS world, and I think support for ingesting coordinates from a GRO file into the Interchange that produced the GROMACS system would provide the same flexibility for that engine without the complexity of a PDB file.
More generally, making it easy to get positions and box vectors back into the Interchange from exported systems, as we currently do for OpenMM, provides good leverage on making Interchange more useful. These methods are easy to write, can be made very strict in what input they accept (because we largely control both sides of the equation), and amplify the ease that Interchange introduces to mixed engine workflows.
Proposal
To that end, I propose the Interchange.positions_from_gro(file)
instance method, which would take either a filename or file object of a GRO file, and set the positions and box vectors of self
to the positions and box vectors in the GRO file. It would instead raise an exception if the atom names in the GRO file did not exactly match the atom names of the interchange, in order, including virtual sites. It would be documented to only work on GRO files created by GROMACS from simulations whose topologies were generated by the receiving Interchange instance (ie, self
). For maximum user ease, it would also make any molecules broken over PBCs whole again, though I think it's reasonable to expect users to do this in GROMACS.
Example of use
Users could then do something like:
# Python
interchange = create_my_interchange()
interchange.to_gromacs("my_gromacs_export")
# Shell
gmx grompp -f my_incredibly_complicated_simulation.mdp -c my_gromacs_export.gro -p my_gromacs_export.top -o my_topol.tpr
gmx mdrun -s my_topol.tpr -c final_coords.gro
gmx trjconv -f final_coords.gro -s my_topol.tpr -o final_coords_wrapped.gro -pbc whole #Unnecessary if `positions_from_gro()` treats PBCs
# Python
interchange.positions_from_gro("final_coords_wrapped.gro")
sim = interchange.to_openmm_simulation(...)
sim.run(forever)
Alternatives
Currently, users could do this by creating a new Interchange with from_gromacs
from the generated top file, but updating the existing Interchange with the new positions would be faster, would support directives Interchange can write but not read, and avoids the photocopier effect. Users could call from_gromacs
and then copy the resulting positions back to the original interchange, but this is more work for them and doesn't address the speed and directive support concerns.
Parsing a GRO file with readlines
and split
is pretty trivial, so users could do that themselves and then update the positions and box attributes directly, but this requires them to know technical details about the GRO format, including that it may be more appropriate for this use case than PDB. The proposed method could be described in examples as the way to get coordinates back from GROMACS and would generally just work.