Topology.identical_molecule_groups is too slow #2035

Open
mattwthompson opened this issue Mar 18, 2025 · 0 comments
Assignees
Labels
polymer-performance Runtime of loading and/or parametrizing (bio)polymers

Comments

@mattwthompson
Member

Describe the bug

Subgraph isomorphism is central to using the toolkit on multi-molecule systems, but it is slow, especially for large and complicated systems.
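For readers unfamiliar with why this step dominates, here is a minimal sketch of grouping identical molecules, assuming a toy representation (a tuple of element symbols plus a list of bonded index pairs). It is not the toolkit's implementation: `fingerprint`, `brute_force_isomorphic`, and `group_identical` are hypothetical names, and the return shape is a simplification. The idea it illustrates is that a cheap invariant can bucket molecules first, so the expensive isomorphism check only runs between candidates that could plausibly match.

```python
from collections import Counter
from itertools import permutations


def fingerprint(elements, bonds):
    """Cheap invariant: identical molecules always share it (the converse may fail)."""
    degree = Counter()
    for i, j in bonds:
        degree[i] += 1
        degree[j] += 1
    return tuple(sorted((el, degree[i]) for i, el in enumerate(elements)))


def brute_force_isomorphic(mol_a, mol_b):
    """Exact but factorial-time check; only viable for toy molecules."""
    (el_a, bonds_a), (el_b, bonds_b) = mol_a, mol_b
    if len(el_a) != len(el_b) or len(bonds_a) != len(bonds_b):
        return False
    edges_b = {frozenset(bond) for bond in bonds_b}
    for perm in permutations(range(len(el_a))):
        elements_match = all(el_a[i] == el_b[perm[i]] for i in range(len(el_a)))
        if elements_match and all(
            frozenset((perm[i], perm[j])) in edges_b for i, j in bonds_a
        ):
            return True
    return False


def group_identical(molecules, are_isomorphic=brute_force_isomorphic):
    """Map each representative index to the indices of molecules identical to it."""
    groups = {}
    reps_by_fingerprint = {}
    for idx, mol in enumerate(molecules):
        candidates = reps_by_fingerprint.setdefault(fingerprint(*mol), [])
        for rep in candidates:
            # The expensive check only runs inside a fingerprint bucket.
            if are_isomorphic(molecules[rep], mol):
                groups[rep].append(idx)
                break
        else:
            candidates.append(idx)
            groups[idx] = [idx]
    return groups


water = (("O", "H", "H"), [(0, 1), (0, 2)])
peroxide = (("H", "O", "O", "H"), [(0, 1), (1, 2), (2, 3)])
water_reordered = (("H", "O", "H"), [(0, 1), (1, 2)])

print(group_identical([water, peroxide, water_reordered]))
# {0: [0, 2], 1: [1]}
```

In this sketch a pre-filter helps most when a system contains many chemically distinct molecules. It does not reduce the cost of comparing two large molecules that share a fingerprint, which may be the dominant cost for a handful of large polymers.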

To Reproduce

topology.json.zip

With a topology of fairly modest size (4119 atoms in 10 molecules), and after refactoring to use a Rust re-implementation of networkx (#2033), this takes about 11 minutes:

In [1]: from openff.toolkit import Topology

In [2]: topology = Topology.from_json(open("topology.json").read())

In [3]: %timeit -o -r1 -n1 topology.identical_molecule_groups
11min 2s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)
Out[3]: <TimeitResult : 11min 2s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)>

In [4]: !open .

In [5]: topology.n_atoms, topology.n_molecules
Out[5]: (4119, 10)
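To reproduce the measurement outside IPython, a single-run timer along the following lines should behave roughly like `%timeit -r1 -n1`. This is a sketch: `time_once` is a hypothetical helper, the runnable part uses a stand-in workload, and the commented-out lines assume the toolkit is installed and `topology.json` has been unpacked from the attached archive.

```python
import time


def time_once(func):
    """Call func exactly once; return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func()
    return result, time.perf_counter() - start


# Stand-in workload so the snippet runs without the toolkit installed.
result, elapsed = time_once(lambda: sum(range(1000)))
print(f"stand-in workload took {elapsed:.6f} s")

# With the toolkit available, the measurement from this report would be:
# from openff.toolkit import Topology
# topology = Topology.from_json(open("topology.json").read())
# groups, elapsed = time_once(lambda: topology.identical_molecule_groups)
# print(f"identical_molecule_groups took {elapsed / 60:.1f} min")
```

Timing a single call matters here because the result may be cached after the first access, in which case repeated runs would not measure the expensive computation.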

Using the current main branch, the same call takes at least twice as long (24 minutes and still running):

Image

Output

Computing environment (please complete the following information):

  • Operating system
  • Output of running conda list

Additional context

#1143 #1734 #353 #2008 openforcefield/openff-interchange#1156 etc.

@mattwthompson mattwthompson added the polymer-performance Runtime of loading and/or parametrizing (bio)polymers label Mar 18, 2025