Use openff-forcebalance for optimizations (#221)
Conversation
Looks good to me! Thanks! Tests are passing, the original ForceBalance is not in the solved CI environment, and RTD isn't throwing any import error warnings, so I think you got 'em all!
I have a few questions about some changes to the environments, but nothing blocking :)
Whoops! Tried a test-fit locally and it's not working. Sorry, I should've done this before submitting my approving review but I only thought of it immediately after clicking submit.
{"type": "FileNotFoundError", "message": "[Errno 2] No such file or directory"}

Traceback (most recent call last):
  File "/home/joshmitchell/Documents/openff/bespokefit/.soap/forksbalance/lib/python3.10/site-packages/celery/app/trace.py", line 451, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/home/joshmitchell/Documents/openff/bespokefit/.soap/forksbalance/lib/python3.10/site-packages/celery/app/trace.py", line 734, in __protected_call__
    return self.run(*args, **kwargs)
  File "/home/joshmitchell/Documents/openff/bespokefit/openff/bespokefit/executor/services/optimizer/worker.py", line 63, in optimize
    result = optimizer.optimize(
  File "/home/joshmitchell/Documents/openff/bespokefit/openff/bespokefit/optimizers/model.py", line 233, in optimize
    results = cls._optimize(schema, initial_force_field)
  File "/home/joshmitchell/Documents/openff/bespokefit/openff/bespokefit/optimizers/forcebalance/forcebalance.py", line 113, in _optimize
    results = cls._collect_results("")
  File "/home/joshmitchell/Documents/openff/bespokefit/openff/bespokefit/optimizers/forcebalance/forcebalance.py", line 135, in _collect_results
    results_dictionary = cls._read_output(root_directory)
  File "/home/joshmitchell/Documents/openff/bespokefit/openff/bespokefit/optimizers/forcebalance/forcebalance.py", line 179, in _read_output
    with open(os.path.join(root_directory, "optimize.out")) as log:
FileNotFoundError: [Errno 2] No such file or directory: 'optimize.out'
Has the default output file name changed?
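For what it's worth, a defensive wrapper around the log read would at least surface a clearer error when the optimizer exits without writing its log. This is only an illustrative sketch, not BespokeFit's actual code; the function name `read_optimize_output` is hypothetical:

```python
import os


def read_optimize_output(root_directory, filename="optimize.out"):
    """Read an optimizer log, raising a clearer error when the optimizer
    exited without writing one (e.g. it crashed or failed to converge)."""
    path = os.path.join(root_directory, filename)
    if not os.path.isfile(path):
        raise FileNotFoundError(
            f"{path} was not produced; the optimization likely failed before "
            f"any output was written. Re-run with file retention enabled to "
            f"inspect the working directory."
        )
    with open(path) as log:
        return log.read()
```

With something like this, the worker would report that the optimization produced no output rather than a bare missing-file error.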
Invoking command:
openff-bespoke executor run --smiles 'CC' \
--workflow 'default' \
--n-qc-compute-workers 12 \
--qc-compute-n-cores 1 \
--default-qc-spec xtb gfn2xtb none \
--target-torsion '[C:1]-[C:2]'
Environment was test-env.yaml plus xtb-python.
I haven't changed any file I/O (yet), just the name of the executable and how it's called. I get the same errors as you. I can only guess, but my guess is that the optimization is actually failing to converge at some point. How should I go about working on this? I don't see any new files on my computer, so it'd be hard to reproduce just the optimization step. I'd rather avoid figuring out how to run XTB myself, though with some effort I could poke through this codebase and figure out how to generate the QC data and set up an optimization that looks similar.
(force-pushed from f1b7d6c to 3ee15cd)
Ah damn. Yeah, so this is why I started #199. These files are pretty aggressively cleaned up. You can ask politely to keep things around:

BEFLOW_OPTIMIZER_KEEP_FILES=True openff-bespoke executor run \
    --smiles 'CC' \
    --workflow 'default' \
    --n-qc-compute-workers 12 \
    --qc-compute-n-cores 1 \
    --default-qc-spec xtb gfn2xtb none \
    --target-torsion '[C:1]-[C:2]' \
    --directory bespoke-test-fit

But the code that saves stuff doesn't get called if the optimizer fails, and by default everything is saved to temporary folders. Let me merge #222 and take another crack at #199 before we progress this.
Sounds good to me - I'm pretty confident the problem is somewhere in
After a decent night's sleep, I realized a simple
The above tarball is now working standalone on my machine, but the integration test is failing (in CI and locally). Something changed with the actual logging, which I'm looking into now.
I don't know why the test fails. The integration test succeeds locally; not sure if it will in CI.
I'm not sure why the non-OpenEye tests are failing. I made a new environment from the same file, and the integration test runs fine locally.
@Yoshanuikabundi - I consider you the "product owner" for bespokefit at this point (and you're doing a good job of it!). I'd caution that merging this PR effectively constitutes a decision to make the next release of BespokeFit use openff-forcebalance. If you're up to take that risk, feel free to merge this; but if it needs more thought, it may be good to decide with @mattwthompson what the conditions are for undoing this PR if a quick release of BespokeFit needs to be made. If it were me, I might leave this PR in a do-not-merge state until the decision is made to make a new BespokeFit release using the new dep.
Jeff and I view this somewhat differently (I think merging any PR commits to the direction of development, and the topic of how stable a package and its dependencies must be in order to be released is a separate discussion) - but the decision ultimately rests with you, and I'm happy to go back and forth on the planning here or in another context. If we decide to merge this and need to back off before a release is made - which could plausibly happen for a number of different reasons - I'll take responsibility for reverting this. In the current state, though, I could use help with

I might have a eureka moment next week on either or both of these problems, but I've slept on them once without any movement.
(force-pushed from 6ae2937 to 8d1c2d2)
Thanks for the advice @j-wags, that settles a thought I'd had running around my head...

I believe BespokeFit is designed to work with multiple different optimizers, it's just that ForceBalance is the only one that's been implemented - maybe @jthorton can confirm if I can't figure it out today? So I think once this is ready to merge, we should be able to relatively easily refactor it to live alongside the existing ForceBalance implementation. This would make reversion easy and allow users the choice after upstream ForceBalance gets fixed - I can imagine that might be desirable? We can have BespokeFit require FB>=1.9.5 or something.

Am I wrong in thinking that with this merged and working, we'd be wanting to re-release 0.2.0 in pretty short order? What's the plan on making a release of ForksBalance?

@mattwthompson I'll take a look :)
OK, I took a crack at that refactoring. It's not too bad (see #223), but I need to think about how we want to do tests and whether the additional maintenance burden is worth it. The test you mentioned, @mattwthompson, is still failing. With the changes I made to the test suite in #223 I managed to get it to fail with both FBs, which at least suggests it's a BespokeFit problem... but I don't know what. I think there might have been a change in a dependency that changed some behavior we were relying on, possibly in Pytest's monkeypatch apparatus. I'll look further tomorrow. I did get the integration tests to run when unit tests fail!
Right, the idea was to add more as we developed alternatives to the current methods, so I would say if

For the integration testing, I had a go at reproducing the issue using the CI environment without OpenEye and got the same convergence problem; it seems that for this test the gradient norm hasn't converged before the step size gets too small. I compared the charges in the OpenMM systems using OpenEye and AmberTools and they seem fine (integer 0 total charge as expected, and not too different) - this is where I expected there could be some difference. After a lot of trial and error, I found that I could get the test to work if I replaced the
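The failure mode described above - the step size collapsing before the gradient norm converges - can be sketched abstractly. The tolerances and names here are illustrative only, not ForceBalance's actual convergence criteria:

```python
def converged(gradient_norm, step_size, grad_tol=1e-4, step_tol=1e-6):
    """Return True once the gradient norm is below tolerance; raise if the
    trust-region step shrinks to nothing while the gradient is still large,
    which is the "hasn't converged before the step size gets too small" case."""
    if gradient_norm < grad_tol:
        return True
    if step_size < step_tol:
        raise RuntimeError(
            f"step size {step_size:g} collapsed before the gradient norm "
            f"({gradient_norm:g}) reached the tolerance {grad_tol:g}"
        )
    return False
```

In that framing, the OpenEye and AmberTools runs would land in different branches only because their objective surfaces differ slightly, which matches the observation that the charges themselves look fine.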
These decisions and all planning around releases are squarely up to you, and you're the arbiter of any changes I'd wish to make to this codebase in between releases. I have suggestions on how we should target releases, if you're open to receiving them, but at the end of the day it's up to you.
I'm not totally sure; so far it's a totally solo project that hasn't been hardened by use. This does come with upsides - I offer no guarantees of any stability with respect to the API or any behavior, and there are no users that would be upset by sweeping changes, so I can rapidly develop and apply changes in response to user feedback. The intent is for it to serve as a minimal replacement for the software products we produce, which currently is only this project. The primary benefits are a lighter-weight distribution of the SMIRNOFF targets than the canonical ForceBalance and the ability to rapidly develop new features and apply bugfixes. If using
I'm still working on some changes to that codebase and adding more tests, but that's a statement that may be true for a while. Releases and packages can be made soon, kinda-soon, or much later; I'm hoping to get more feedback from users to decide how early I want to be responsible for caring for downstream changes.
Is this exposed as an option in the CLI/API? I don't see anything in
@Yoshanuikabundi Matt pointed out that I sprung this notion of "product ownership" on you without really laying out what it means, but it looks like you've picked up exactly what I was thinking. Thank you for helpfully reading my mind. The other piece of relevant information is: when will we get leeping/forcebalance>=1.9.5? I've opened the PR with (I think) the appropriate changes, but we're just getting back into the school year here, so Lee-Ping may not have a ton of time to make a new release.
@mattwthompson I just spent 5 hours writing #224 to discuss release strategy! I would love to hear your suggestions there.
Just for my curiosity - what about our mainline force field fits?
Good question. I don't think it is, but only because I've never seen it. I don't wanna jinx it... but it might be easy.

@j-wags Haha, you said project owner so I decided I would own it :D Now I can finally execute my crazy "up-pin every dependency that uses semver" plan...

If leeping/forcebalance>=1.9.5 is right around the corner, then it might be better to wait for that for 0.2.0 and support openff-forcebalance in a future release. I'm very anxious about changing code too much with the current test suite and debugging experience; this PR seems like it should be pretty easy, but it's still broken.
@jthorton At present the two packages have different namespaces and different CLI commands, so we can't support both without splitting the optimizer implementations. I can just imagine users being upset with either choice, and we don't have to make a choice... so why not offer both!? I guess the other option would be to just use either depending on what's available, but I think that just moves the problem into the future, when they do diverge?

Wow, amazing detective work! It's good to know what's going wrong, but I wanna experiment a bit with tweaking the settings of the conformer generation before we change the molecule.

Also, I pinged you again on #194 - I know it's a lot of questions, but every little bit would help. I can write up what my understanding is so you can just correct my misunderstandings, if that would be easier? I've also written up what I want to do with BespokeFit re: improving the developer experience in #224. A lot of my ideas might have very obvious downsides that I'm just not aware of yet, so if you have any thoughts on any of it I'd love to hear them!
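One way to offer both, sketched with hypothetical names (this is not BespokeFit's actual plugin machinery), is a small registry keyed on the optimizer name, so the CLI's choice maps to an implementation at runtime:

```python
# Registry mapping an optimizer name to its implementation class.
OPTIMIZERS = {}


def register_optimizer(name):
    """Class decorator that records an optimizer under a lowercase key."""
    def wrap(cls):
        OPTIMIZERS[name.lower()] = cls
        return cls
    return wrap


@register_optimizer("forcebalance")
class ForceBalanceOptimizer:
    # Upstream ForceBalance's entry point.
    executable = "ForceBalance.py"


@register_optimizer("openff-forcebalance")
class OpenFFForceBalanceOptimizer:
    # The fork's entry point.
    executable = "openff-forcebalance"


def get_optimizer(name):
    """Look up an optimizer, failing loudly with the known choices."""
    try:
        return OPTIMIZERS[name.lower()]
    except KeyError:
        raise KeyError(
            f"Unknown optimizer {name!r}; known: {sorted(OPTIMIZERS)}"
        ) from None
```

A `--optimizer` CLI switch would then only need to call `get_optimizer`, and dropping either implementation later is a one-decorator change.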
That's up to @lilyminium and her team, not me. I don't know their plans, but it's easy to imagine overlapping interests in rapid development, integration with our stack, possible performance improvements, and even experimental features.
Yes, I imagine it would be a lot easier to iterate on FF fits with tweaks and new features, working in partnership with the OpenFF fork with all the nice things Matt just mentioned :)
* attempt to support forcebalance and forksbalance
* Replace whitespace in optimizer name with hyphen
* Uncomment generate()
* Fix tests
* Add --optimizer switch to CLI
A pretty inconvenient time to run into issues with the QCArchive stack. I can get the

So this works ... with the caveat that running the optimization with the forked version actually uses the upstream ForceBalance to write out targets. Right now they're the same, but in the future it might make sense for them to diverge. The class that writes out the targets should have access to the optimizer from the schema of some factory, so it should be possible to branch the logic there. I just won't get to that tonight.

In terms of what's next,
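Branching the target-writing logic on the optimizer recorded in the schema could look roughly like this. The class and field names here are hypothetical stand-ins, not BespokeFit's real schema:

```python
class FittingSchema:
    """Hypothetical stand-in for a fitting schema; the real object carries
    far more state than just the optimizer name."""

    def __init__(self, optimizer_name):
        self.optimizer_name = optimizer_name


def targets_header(schema):
    """Pick the target-file dialect based on which optimizer the schema
    requests, so the fork and upstream can diverge later without the
    writer needing to be replaced wholesale."""
    if schema.optimizer_name == "openff-forcebalance":
        return "# targets written for openff-forcebalance"
    return "# targets written for upstream ForceBalance"
```

The point is only that the writer consults the schema rather than importing one ForceBalance flavour directly, which is the branch point described above.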
Another question - do we need another schema for this optimizer? https://github.com/openforcefield/openff-bespokefit/blob/main/openff/bespokefit/schema/optimizers.py |
I believe this is true only for targets other than torsions - which can be demonstrated by running a default fit in an environment that doesn't have upstream ForceBalance.
I think I'd like to leave this open for the time being, because....
I think this is the way to go. But I need to focus on NAGL docs for the next couple weeks.
At the moment,
Hey Matt - if you're working on this again, I just wanted to mention that I've come around on continuing to support the original ForceBalance: if it's easier to focus on our fork and require users to use an old version for the original, that's fine by me.