Select model feature in process_density_map script #150

Merged: 8 commits, Nov 15, 2024


kvr2007
Contributor

@kvr2007 kvr2007 commented Jun 6, 2023

Implements an improved map/model comparison and provides an easy way to select a single well-fitting model from a trajectory (for downstream model building). This adds one new dependency, gemmi, a robust and fast library for handling PDB and MRC files (see https://gemmi.readthedocs.io/en/latest/).

In addition, the CryoFold2 tutorial code is converted into a simple command-line script, cryofold2_setup, so that users can try the workflow on their own data.

To test, run the following commands with the CryoFold2 tutorial data (https://github.com/maccallumlab/meld/tree/master/docs/tutorial/cryofold_tutorial):

cryofold2_setup 4ake_t_s2.mrc 1ake_s.pdb
launch_remd --console-log --platform CUDA # this command may differ if running on a cluster
process_density_map select_model 4ake_t_s2.mrc Data/trajectory.pdb --resolution 4

The outputs are written to a new subfolder Postprocess:

Postprocess/best_model.pdb: Auto-selected model

Postprocess/plot/rmsd.png: RMSD to the starting model during the simulation; use to judge convergence

Postprocess/plot/cc_ifsc.png: Real-space CC and iFSC between the experimental map and MD trajectory models; use to select the best model

Postprocess/plot/fsc_step_???.png: Map/model FSC plots for corresponding MD steps
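
For readers unfamiliar with the FSC plots listed above, here is a minimal NumPy sketch of a map/model FSC curve and an integrated FSC score. It is an illustration under stated assumptions, not the code in this PR: the function names `fsc_curve` and `integrated_fsc` are hypothetical, both maps are assumed to be cubic and on the same grid, and iFSC is approximated as the mean FSC over shells in a chosen resolution range (the exact definition and range used by the script may differ).

```python
import numpy as np


def fsc_curve(map_a, map_b, voxel_size):
    """Fourier shell correlation between two cubic maps on the same grid.

    Returns (spatial_freq, fsc): spatial frequency in 1/Angstrom and the
    correlation for shells 1 .. n//2 (shell 0, the DC term, is skipped).
    """
    a = np.asarray(map_a, dtype=float)
    b = np.asarray(map_b, dtype=float)
    n = a.shape[0]
    fa = np.fft.fftn(a)
    fb = np.fft.fftn(b)

    # Integer shell index of every Fourier voxel (radius in grid units).
    freq = np.fft.fftfreq(n)  # cycles per voxel
    kx, ky, kz = np.meshgrid(freq, freq, freq, indexing="ij")
    idx = np.rint(np.sqrt(kx**2 + ky**2 + kz**2) * n).astype(int).ravel()

    # Per-shell sums of the cross term and of the two power spectra.
    cross = np.bincount(idx, weights=(fa * np.conj(fb)).real.ravel())
    power_a = np.bincount(idx, weights=(np.abs(fa) ** 2).ravel())
    power_b = np.bincount(idx, weights=(np.abs(fb) ** 2).ravel())

    shells = np.arange(1, n // 2 + 1)
    fsc = cross[shells] / np.sqrt(power_a[shells] * power_b[shells])
    spatial_freq = shells / (n * voxel_size)
    return spatial_freq, fsc


def integrated_fsc(spatial_freq, fsc, d_max, d_min):
    """Mean FSC over shells with resolution between d_max and d_min (Angstrom)."""
    selected = (spatial_freq >= 1.0 / d_max) & (spatial_freq <= 1.0 / d_min)
    return float(fsc[selected].mean())
```

A single number such as the integrated FSC is convenient for ranking trajectory frames, while the full curve shows at which resolution the model and map stop agreeing.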

kvr2007 added 2 commits June 6, 2023 16:52
…script

Using process_density_map select_model it is now possible to select a best-fitting model from the MD trajectory:

1. RMSF is calculated for each atom over a 10-frame window
2. RMSF is converted to B-factors and saved in a PDB file
3. A map is simulated from the model using the gemmi library: the simulated map accounts for atomic scattering factors and the B-factors determined in the previous step
4. Optionally, a mask is created from the model to exclude the solvent signal in the experimental map
5. The experimental and simulated maps are compared using real-space cross-correlation or integrated FSC (iFSC, as defined by Wang et al., doi:10.7554/eLife.17219). The model with the best CC or iFSC is symlinked. In addition, map/model FSC plots are generated for each PDB file
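
Steps 1, 2 and 5 above can be sketched roughly as follows. This is a minimal NumPy illustration, not the code in this PR: the function names are hypothetical, frames are assumed to be superposed already and given in Angstrom, windows are assumed to be consecutive and non-overlapping, and the conversion uses the standard isotropic relation B = (8π²/3)·RMSF².

```python
import numpy as np


def rmsf_to_bfactors(coords, window=10):
    """Per-atom B-factors (Angstrom^2) for each consecutive block of frames.

    coords: array of shape (n_frames, n_atoms, 3), in Angstrom, pre-aligned.
    Returns an array of shape (n_frames // window, n_atoms).
    """
    coords = np.asarray(coords, dtype=float)
    n_windows = coords.shape[0] // window
    bfactors = []
    for i in range(n_windows):
        block = coords[i * window:(i + 1) * window]
        mean_pos = block.mean(axis=0)
        # Mean-square fluctuation per atom, i.e. RMSF squared.
        msf = ((block - mean_pos) ** 2).sum(axis=2).mean(axis=0)
        bfactors.append(8.0 * np.pi**2 / 3.0 * msf)
    return np.array(bfactors)


def real_space_cc(map_a, map_b, mask=None):
    """Pearson correlation between two maps, optionally inside a boolean mask."""
    a = np.asarray(map_a, dtype=float)
    b = np.asarray(map_b, dtype=float)
    if mask is not None:
        mask = np.asarray(mask, dtype=bool)
        a, b = a[mask], b[mask]
    else:
        a, b = a.ravel(), b.ravel()
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))
```

Passing a model-derived mask to `real_space_cc` restricts the comparison to voxels near the model, which is the role of the optional mask in step 4.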

The script cryofold2_setup is a command-line adaptation of the CryoFold2 tutorial (https://github.com/ccccclw/meld/blob/master/docs/tutorial/cryofold_tutorial/setup_cryofold.py)
@jlmaccal
Contributor

jlmaccal commented Jun 6, 2023

This looks cool, but I'm a bit leery of the gemmi dependency. How easy is it to install? Is it packaged through conda, etc? It would be great if we can figure out how to make it an optional dependency that's only needed for cryoEM.
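
One common way to keep such a dependency optional is to import it lazily, only inside the code path that needs it. The sketch below is an assumption about how that could look, not something implemented in this PR; the helper `require_optional` and the function `select_model_entry_point` are hypothetical names.

```python
import importlib


def require_optional(module_name, feature):
    """Import an optional dependency, or fail with an actionable message."""
    try:
        return importlib.import_module(module_name)
    except ImportError as err:
        raise ImportError(
            f"'{module_name}' is required for {feature}; install it with "
            f"e.g. `conda install -c conda-forge {module_name}`"
        ) from err


def select_model_entry_point():
    # gemmi is only imported when the cryo-EM code path actually runs,
    # so the rest of the package works without it.
    gemmi = require_optional("gemmi", "cryo-EM map/model comparison")
    return gemmi
```

With this pattern, users who never run the cryo-EM tools are not affected if gemmi is missing, and those who do get a clear installation hint.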

@kvr2007
Contributor Author

kvr2007 commented Jun 7, 2023 via email

@jlmaccal
Contributor

I missed that there were conda-forge binaries. Can you add those to the packaging requirements and then we'll test that everything works.

@kvr2007
Contributor Author

kvr2007 commented Jun 16, 2023

Sure, but where do I find the packaging requirements file? Is it somewhere in the CI section? Or should I just place the requirement in meld/setup.py?

@jlmaccal
Contributor

Well, the first place would be the CI/testing configuration. Have a look at build-ubuntu-latest.yml. Once that is working, we would need to update the packaging scripts: https://github.com/conda-forge/meld-feedstock.

Add python dependencies for process_density script:
- mdtraj
- gemmi
- mrcfile
- matplotlib
- progressbar2
@kvr2007
Contributor Author

kvr2007 commented Jun 16, 2023

OK, I just updated build-ubuntu-latest.yml in my branch. I also added the other Python dependencies that are needed to run the process_density_map script. I hope I did everything correctly; I have never really done CI myself :)

It would be great to have the newest version of meld available on conda-forge; installation from source was not trivial for me.

Maybe it would also help to add the python dependencies to meld/setup.py? I think right now it does not check for this.

@jlmaccal
Contributor

Yes, can you add this to the dependencies in setup.py? Can you also add this to CHANGELOG?

We should probably cut a new release. It's fairly straightforward, but I just need to find the time.

@kvr2007
Contributor Author

kvr2007 commented Jun 20, 2023

OK, all done. I also fixed a bug in process_density_map select_model and added temperature and number-of-steps options to cryofold2_setup.

It is now possible to set density weighting in cryofold2_setup; temperature settings (ramp between replicas, alphamin/max) are added and properly described in the CLI help
@kvr2007
Contributor Author

kvr2007 commented Nov 13, 2024

@jlmaccal, apologies for bugging you. I just wanted to check on the status of this PR: is there anything else that prevents it from being merged?

@jlmaccal
Contributor

Just some merge conflicts in process_density_map. Can you resolve them?

@kvr2007
Contributor Author

kvr2007 commented Nov 14, 2024

@jlmaccal, thanks for the swift response! I've just fixed the merge conflicts.

@jlmaccal jlmaccal merged commit a7700e0 into maccallumlab:master Nov 15, 2024
1 of 3 checks passed