Use `openff-forcebalance` for optimizations #221

mattwthompson · 2023-01-25T22:57:30Z

No description provided.

codecov · 2023-01-26T00:01:24Z

Codecov Report

Merging #221 (8d1c2d2) into main (656dca3) will decrease coverage by 1.47%.
The diff coverage is 42.85%.

❗ Current head 8d1c2d2 differs from pull request most recent head 6417800. Consider uploading reports for the commit 6417800 to get more accurate results

Additional details and impacted files

Yoshanuikabundi

Looks good to me! Thanks! Test are passing, the original ForceBalance is not in the solved CI environment, and RTD isn't throwing up any import error warnings, so I think you got em all!

I have a few questions about some changes to the environments, but nothing blocking :)

devtools/conda-envs/test-env.yaml

Yoshanuikabundi

Whoops! Tried a test-fit locally and it's not working. Sorry, I should've done this before submitting my approving review but I only thought of it immediately after clicking submit.

{"type": "FileNotFoundError", "message": "[Errno 2] No such file or directory", "traceback":       
 "Traceback (most recent call last):\n  File                                                        
 \"/home/joshmitchell/Documents/openff/bespokefit/.soap/forksbalance/lib/python3.10/site-packages/ce
 lery/app/trace.py\", line 451, in trace_task\n    R = retval = fun(*args, **kwargs)\n  File        
 \"/home/joshmitchell/Documents/openff/bespokefit/.soap/forksbalance/lib/python3.10/site-packages/ce
 lery/app/trace.py\", line 734, in __protected_call__\n    return self.run(*args, **kwargs)\n  File 
 \"/home/joshmitchell/Documents/openff/bespokefit/openff/bespokefit/executor/services/optimizer/work
 er.py\", line 63, in optimize\n    result = optimizer.optimize(\n  File                            
 \"/home/joshmitchell/Documents/openff/bespokefit/openff/bespokefit/optimizers/model.py\", line 233,
 in optimize\n    results = cls._optimize(schema, initial_force_field)\n  File                      
 \"/home/joshmitchell/Documents/openff/bespokefit/openff/bespokefit/optimizers/forcebalance/forcebal
 ance.py\", line 113, in _optimize\n    results = cls._collect_results(\"\")\n  File                
 \"/home/joshmitchell/Documents/openff/bespokefit/openff/bespokefit/optimizers/forcebalance/forcebal
 ance.py\", line 135, in _collect_results\n    results_dictionary =                                 
 cls._read_output(root_directory)\n  File                                                           
 \"/home/joshmitchell/Documents/openff/bespokefit/openff/bespokefit/optimizers/forcebalance/forcebal
 ance.py\", line 179, in _read_output\n    with open(os.path.join(root_directory, \"optimize.out\"))
 as log:\nFileNotFoundError: [Errno 2] No such file or directory: 'optimize.out'\n"}

Has the default output file name changed?

Invoking command:

openff-bespoke executor run --smiles                 'CC'               \
                            --workflow               'default'          \
                            --n-qc-compute-workers   12                 \
                            --qc-compute-n-cores     1                  \
                            --default-qc-spec        xtb gfn2xtb none   \
                            --target-torsion         '[C:1]-[C:2]'

Environment was test-env.yaml + xtb-python

mattwthompson · 2023-01-26T02:20:21Z

I haven't changed any file I/O (yet), just the name of the executable and how it's called. I get the same errors as you. I can only guess, but I guess the optimization is actually failing to converge at some point.

How should I go about working on this? I don't see any new files on my computer, so it'd be hard to reproduce just the optimization step. I'd rather avoid figuring out how to run XTB myself, though with some effort I could poke through this codebase and figure out how to generate the QC data and set up an optimization that looks similar.

Yoshanuikabundi · 2023-01-26T04:29:43Z

Ah damn. Yeahhh so this is why I started #199. These files are pretty aggressively cleaned up. You can ask politely to keep things around:

BEFLOW_OPTIMIZER_KEEP_FILES=True openff-bespoke executor run
    --smiles                 'CC'                  \
    --workflow               'default'             \
    --n-qc-compute-workers   12                    \
    --qc-compute-n-cores     1                     \
    --default-qc-spec        xtb gfn2xtb none      \
    --target-torsion         '[C:1]-[C:2]'         \
    --directory              bespoke-test-fit

But the code that saves stuff doesn't get called if the optimizer fails, and by default everything is saved to temporary folders. Let me merge #222 and take another crack at #199 before we progress this.

mattwthompson · 2023-01-26T05:52:07Z

Sounds good to me - I'm pretty confident the problem is somewhere in openff-forcebalance

mattwthompson · 2023-01-26T15:16:51Z

After a decent night's sleep, I realized a simple shutil.copytree can freeze out a minimal reproduction of the optimization failure. This is what I'm working off of now - #199 is still of value but not necessarily a blocker to me working on this.

pr-221.tar.gz

mattwthompson · 2023-01-26T20:49:43Z

The above tarball is now working standalone on my machine but the integration test is failing (in CI and locally). Something changed with the actual logging, which I'm looking into now.

mattwthompson · 2023-01-27T00:08:39Z

I don't know why the test openff/bespokefit/tests/optimizers/forcebalance/test_forcebalance.py::test_forcebalance_optimize is failing. When I drop a debugger in, the file optimize.out exists but is empty. When I attempt to reproduce it as closely as possible in an interpreter, it exists and includes the output I expect (including the optimization converged tag bring printed multiple times): https://gist.github.com/mattwthompson/872cfd173f4a4c1683172a4e4d9f66b8

The integration test succeeds locally, not sure if it will in CI.

mattwthompson · 2023-01-27T17:26:11Z

I'm not sure why the non-OpenEye tests are failing. I made a new environment from the same file and the integration test runs fine locally.

j-wags · 2023-01-27T19:23:23Z

@Yoshanuikabundi - I consider you the "product owner" for bespokefit at this point (and you're doing a good job of it!). I'd caution that merging this PR effectively constitutes a decision to make the next release of BespokeFit use OpenFF-Forcebalance. If you're up to take that risk feel free to merge this, but if it needs more thought, it may be good to decide with @mattwthompson what the conditions are to undo this PR if a quick release of BespokeFit needs to be made. If it were me, I might leave this PR in a do-not-merge state until the decision is made to make a new BespokeFit release using the new dep.

mattwthompson · 2023-01-27T20:16:35Z

Jeff and I view this somewhat differently (I think merging any PR commits to the direction of development and the topic of how stable a package and its dependencies must be in order to be released is a separate discussion) - but the decision ultimately rests with you and I'm happy to go back and forth on the planning here or in another context.

If we decide to merge this and need to back off before a release is made - which could plausibly happen for a number of different reasons - I'll take responsibility for reverting this.

In the current state, though, I could use help with

Figuring out why the noted unit test is failing
- When I run optimizations, the log file is created just as expected and contains the optimization complete line that's needed. I can't tell if something is actually wrong with the code or if something needs to be changed in how the behavior is mocked inside tests.
Figuring out why the integration test fails with the non-OpenEye environment
- I ... got nothin.

I might have a eureka moment next week on either or both of these problems, but I've slept on them once without any movement.

Yoshanuikabundi · 2023-01-30T05:55:29Z

Thanks for the advice @j-wags, that settles a thought I'd had running around my head... I believe BespokeFit is designed to work with multiple different optimizers, it's just that ForceBalance is the only one that's been implemented - maybe @jthorton can confirm if I can't figure it out today? So I think once this is ready to merge, we should be able to relatively easily refactor it to live alongside the existing ForceBalance implementation. This would make reversion easy and allow users the choice after upstream ForceBalance gets fixed - I can imagine that might be desirable? We can have BespokeFit require FB>=1.9.5 or something.

Am I wrong in thinking that with this merged and working, we'd be wanting to re-release 0.2.0 in pretty short order? What's the plan on making a release of ForksBalance?

@mattwthompson I'll take a look :)

Yoshanuikabundi · 2023-01-30T09:41:26Z

OK I took a crack at that refactoring. It's not too bad (see #223) but I need to think about how we want to do tests and whether the additional maintenance burden is worth it.

The test you mentioned @mattwthompson is still failing. With the changes I made to the test suite in #223 I managed to get it to fail with both FBs, which at least suggests it's a BespokeFit problem... but I don't know what. I think there might have been a change in a dependency that changed some behavior we were relying on or something, possibly in Pytest's monkeypatch apparatus. I'll look further tomorrow.

I did get the integration tests to run when unit tests fail!

jthorton · 2023-01-30T13:04:11Z

Right the idea was to add more as we developed alternatives to the current methods so I would say if openff-forcebalance is going to offer something different then it would be good to split them. However, if it's going to be the same package with the same methods I don't see the point in doing that work but I could be convinced otherwise if I am missing something!

For the integration testing, I had a go at trying to reproduce the issue using the CI environment without openeye and get the same convergence issue problem, it seems that for this test the gradient norm hasn't converged before the step size gets too small. I compared the charges in the openmm systems using openeye and ambertools and they seem fine (integer 0 charge as expected not too different), this is where I expected there could be some difference. After a lot of trial and error I found that I could get the test to work if I replaced the scan.xyz from the targtes folder with the one produced when openeye is in the environment, the only difference I think between the two is that one uses rdkit to generate the seed conformer for the torsiondrive and the other uses openeye?! I don't understand why this would be an issue, on the other hand, I found that changing the test to use methanol and the target smirks to match worked fine!

openff-bespoke executor run --smiles                 'CO'               \
                                      --workflow               'default'          \
                                      --default-qc-spec        xtb gfn2xtb none   \
                                      --target-torsion         '[C:1]-[O:2]

mattwthompson · 2023-01-30T14:51:33Z

Am I wrong in thinking that with this merged and working, we'd be wanting to re-release 0.2.0 in pretty short order?

These decisions and all planning around releases are squarely up to you, and you're the arbiter of any changes I'd wish to make to this codebase in between releases. I have suggestions on how we should target releases if you're open to receiving them, but at the end of the day it's up to you

What's the plan on making a release of ForksBalance?

I'm not totally sure; so far it's a totally solo project that hasn't been hardened by use. This does come with upsides - I offer no guarantees of any stability with respect to the API or any behavior and no users that would be upset by sweeping changes, therefore I can rapidly develop and apply changes in response to user features. The intent is for it to serve as a minimal replacement for the software products we produce, which currently is only this project. The primary benefits are it being a lighter-weight distribution of the SMIRNOFF targets than the canonical ForceBalance and the ability to rapidly develop new features and apply bugfixes.

If using openff-forcebalance as the optimizer passes all unit tests and the integration tests - a state I believe we're near - I argue there are only two conclusions that can be drawn

It is ready for production
More tests must be written

I'm still working on some changes to that codebase and applying more tests, but that's a statement that may be true for a while. Releases and packages can be made soon, kinda-soon, or much later; I'm hoping to get more feedback from users to decide how early I want to be responsible for caring for downstream changes.

I believe BespokeFit is designed to work with multiple different optimizers

Is this exposed as an option in the CLI/API? I don't see anything in openff-bespoke executor run --help but I could be missing something, and I don't know the source code all that well.

j-wags · 2023-01-31T02:55:31Z

@Yoshanuikabundi Matt pointed out that I sprung this notion of "product ownership" without really laying out what it means, but it looks like you've picked up exactly what I was thinking. Thank you for helpfully reading my mind.

The other piece of relevant information is "when will we get leeping/forcebalance>=1.9.5?". I've opened the PR with (I think) the appropriate changes, but we're just getting back into the school year here so Lee Ping may not have a ton of time to make a new release.

Yoshanuikabundi · 2023-01-31T07:22:56Z

@mattwthompson I just spent 5 hours writing #224 to discuss release strategy! I would love to hear your suggestions there.

The intent is for it to serve as a minimal replacement for the software products we produce, which currently is only this project.

Just for my curiosity - what about our mainline force field fits?

Is this exposed as an option in the CLI/API? I don't see anything in openff-bespoke executor run --help but I could be missing something, and I don't know the source code all that well.

Good question. I don't think it is, but only because I've never seen it. I don't wanna jinx it... but it might be easy.

@j-wags Haha you said project owner so I decided I would own it :D Now I can finally execute my crazy up-pin every dependency that uses semver plan...

If leeping/forcebalance>=1.9.5 is right around the corner then it might be better to wait for that for 0.2.0, and support openff-forcebalance in a future release. I'm very anxious about changing code too much with the current test suite and debugging experience; this PR seems like it should be pretty easy but it's still broken.

Yoshanuikabundi · 2023-01-31T07:33:46Z

@jthorton At present the two packages have different namespaces and different CLI commands, so we can't support both without splitting the optimizer implementations. I can just imagine users being upset with either choice, and we don't have to make a choice... so why not offer both!?

I guess the other option would be to just use either depending on what's available, but I think that just moves the problem into the future when they do diverge?

Wow, amazing detective work! It's good to know what's going wrong but I wanna experiment a bit with tweaking the settings of the conformer generation before we change the molecule.

Also I pinged you again on #194 - I know it's a lot of questions but every little bit would help. I can write up what my understanding is so you can just correct my misunderstandings if that would be easier?

I also have written up what I want to do with BespokeFit re: improving the developer experience in #224. A lot of my ideas might have very obvious downsides that I'm just not aware of yet, so if you have any thoughts on any of it I'd love to hear them!

mattwthompson · 2023-02-03T00:52:37Z

Just for my curiosity - what about our mainline force field fits?

That's up to @lilyminium and her team, not me. I don't know their plans, but it's easy to imagine overlapping interests in rapid development, integration with our stack, possible performance improvements, and even experimental features.

lilyminium · 2023-02-03T01:54:08Z

Yes, I imagine it would be a lot easier to iterate on FF fits with tweaks and new features, working in partnership with the OpenFF fork with all the nice things Matt just mentioned :)

* attempt to support forcebalance and forksbalance * Replace whitespace in optimizer name with hyphen * Uncomment generate() * Fix tests * Add --optimizer switch to CLI

mattwthompson · 2023-02-17T04:07:47Z

A pretty inconvenient time to run into issues with the QCArchive stack. I can get the --optimizer openff-forcebalance flag working, though, just as advertised elsewhere!

So this works ... with the caveat that running the optimization with the forked version actually uses the upstream ForceBalance to write out targets. Right now they're the same but in the future it might make sense for them to diverge. The class that writes out the targets should have access to the optimizer from the schema of some factory, so it should be possible to branch the logic there. I just won't get to that tonight.

In terms of what's next,

I can continue working on this in this PR, or it can be merged in and the next steps can be in follow-up PRs. Either way I'm not blocked and there's some framework for making sure I don't break other things.
This shouldn't be considered a blocker to making a release if it seems like 0.2.1 with a forcebalance >=1.9.5 constraint brings everything online, passes tests, satisfies other requirements, etc.
- Since --optimizer openff-forcebalance is (at best) experimental, it doesn't need to be packaged in a release
- We might want to throw out some warnings to the user if that flag is used (I think around here?)

mattwthompson · 2023-02-17T16:40:08Z

Another question - do we need another schema for this optimizer?

https://github.com/openforcefield/openff-bespokefit/blob/main/openff/bespokefit/schema/optimizers.py

Yoshanuikabundi · 2023-02-20T05:51:11Z

with the caveat that running the optimization with the forked version actually uses the upstream ForceBalance to write out targets.

I believe this is true only for targets other than torsions - which can be demonstrated by running a default fit in an environment that doesn't have upstream ForceBalance.

I can continue working on this in this PR, or it can be merged in and the next steps can be in follow-up PRs. Either way I'm not blocked and there's some framework for making sure I don't break other things.

I think I'd like to leave this open for the time being, because....

This shouldn't be considered a blocker to making a release if it seems like 0.2.1 with a forcebalance >=1.9.5 constraint brings everything online, passes tests, satisfies other requirements, etc.

I think this is the way to go. But I need to focus on NAGL docs for the next couple weeks.

Another question - do we need another schema for this optimizer?

At the moment, ForceBalanceSchema is pulling double duty. So ForksBalance can be configured without needing its own schema. As the two optimizers diverge, this'll become unworkable, so yes at some point we'll need a second schema. But we're planning on breaking the configuration schemas soon anyway so I don't think this is urgent, and the design of these schemas possibly needs some more thought. But if it stops working, or you want to be able to configure a fork-only feature, feel free to duplicate the schema!

jthorton · 2023-02-20T09:48:16Z

I believe this is true only for targets other than torsions

I think they all need to use the molecule class from ForceBalance, see the super call in torsions here.

Yoshanuikabundi · 2023-07-08T04:08:06Z

Hey Matt - if you're working on this again, I just wanted to mention that I've come around on continuing to support original Force Balance - if it's easier to focus on our fork and require users to use an old version for the original, that's fine by me.

mattwthompson changed the title ~~[DNM] Use openff-forcebalance for optimizations~~ Use openff-forcebalance for optimizations Jan 26, 2023

mattwthompson requested a review from Yoshanuikabundi January 26, 2023 00:07

mattwthompson marked this pull request as ready for review January 26, 2023 00:08

Yoshanuikabundi approved these changes Jan 26, 2023

View reviewed changes

devtools/conda-envs/test-env.yaml Outdated Show resolved Hide resolved

devtools/conda-envs/test-env.yaml Outdated Show resolved Hide resolved

Yoshanuikabundi requested changes Jan 26, 2023

View reviewed changes

mattwthompson force-pushed the openff-forcebalance branch from f1b7d6c to 3ee15cd Compare January 26, 2023 02:28

mattwthompson marked this pull request as draft January 26, 2023 14:51

mattwthompson marked this pull request as ready for review January 27, 2023 02:21

Use openff-forcebalance for optimizations

8d1c2d2

mattwthompson force-pushed the openff-forcebalance branch from 6ae2937 to 8d1c2d2 Compare January 27, 2023 21:12

Run integration tests even when unit tests fail

7a574bf

Yoshanuikabundi mentioned this pull request Jan 30, 2023

Attempt to support forcebalance and forksbalance #223

Merged

3 tasks

Yoshanuikabundi and others added 4 commits February 17, 2023 09:11

Attempt to support forcebalance and forksbalance (#223)

8cc1409

* attempt to support forcebalance and forksbalance * Replace whitespace in optimizer name with hyphen * Uncomment generate() * Fix tests * Add --optimizer switch to CLI

Merge remote-tracking branch 'upstream/main' into openff-forcebalance

ad7c6a7

Lint

de10e1f

Run integration test with each optimizer

d36e70c

Merge remote-tracking branch 'upstream/main' into openff-forcebalance

6417800

mattwthompson closed this Feb 8, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use `openff-forcebalance` for optimizations #221

Use `openff-forcebalance` for optimizations #221

mattwthompson commented Jan 25, 2023 •

edited

Loading

codecov bot commented Jan 26, 2023 •

edited

Loading

Yoshanuikabundi left a comment

Yoshanuikabundi left a comment •

edited

Loading

mattwthompson commented Jan 26, 2023

Yoshanuikabundi commented Jan 26, 2023

mattwthompson commented Jan 26, 2023

mattwthompson commented Jan 26, 2023

mattwthompson commented Jan 26, 2023 •

edited

Loading

mattwthompson commented Jan 27, 2023 •

edited

Loading

mattwthompson commented Jan 27, 2023

j-wags commented Jan 27, 2023

mattwthompson commented Jan 27, 2023 •

edited

Loading

Yoshanuikabundi commented Jan 30, 2023

Yoshanuikabundi commented Jan 30, 2023 •

edited

Loading

jthorton commented Jan 30, 2023

mattwthompson commented Jan 30, 2023

j-wags commented Jan 31, 2023

Yoshanuikabundi commented Jan 31, 2023

Yoshanuikabundi commented Jan 31, 2023

mattwthompson commented Feb 3, 2023

lilyminium commented Feb 3, 2023

mattwthompson commented Feb 17, 2023

mattwthompson commented Feb 17, 2023

Yoshanuikabundi commented Feb 20, 2023

jthorton commented Feb 20, 2023 •

edited

Loading

Yoshanuikabundi commented Jul 8, 2023

Use openff-forcebalance for optimizations #221

Use openff-forcebalance for optimizations #221

Conversation

mattwthompson commented Jan 25, 2023 • edited Loading

codecov bot commented Jan 26, 2023 • edited Loading

Codecov Report

Yoshanuikabundi left a comment

Choose a reason for hiding this comment

Yoshanuikabundi left a comment • edited Loading

Choose a reason for hiding this comment

mattwthompson commented Jan 26, 2023

Yoshanuikabundi commented Jan 26, 2023

mattwthompson commented Jan 26, 2023

mattwthompson commented Jan 26, 2023

mattwthompson commented Jan 26, 2023 • edited Loading

mattwthompson commented Jan 27, 2023 • edited Loading

mattwthompson commented Jan 27, 2023

j-wags commented Jan 27, 2023

mattwthompson commented Jan 27, 2023 • edited Loading

Yoshanuikabundi commented Jan 30, 2023

Yoshanuikabundi commented Jan 30, 2023 • edited Loading

jthorton commented Jan 30, 2023

mattwthompson commented Jan 30, 2023

j-wags commented Jan 31, 2023

Yoshanuikabundi commented Jan 31, 2023

Yoshanuikabundi commented Jan 31, 2023

mattwthompson commented Feb 3, 2023

lilyminium commented Feb 3, 2023

mattwthompson commented Feb 17, 2023

mattwthompson commented Feb 17, 2023

Yoshanuikabundi commented Feb 20, 2023

jthorton commented Feb 20, 2023 • edited Loading

Yoshanuikabundi commented Jul 8, 2023

Use `openff-forcebalance` for optimizations #221

Use `openff-forcebalance` for optimizations #221

mattwthompson commented Jan 25, 2023 •

edited

Loading

codecov bot commented Jan 26, 2023 •

edited

Loading

Yoshanuikabundi left a comment •

edited

Loading

mattwthompson commented Jan 26, 2023 •

edited

Loading

mattwthompson commented Jan 27, 2023 •

edited

Loading

mattwthompson commented Jan 27, 2023 •

edited

Loading

Yoshanuikabundi commented Jan 30, 2023 •

edited

Loading

jthorton commented Feb 20, 2023 •

edited

Loading