deprecated_README.md

File metadata and controls

81 lines (67 loc) · 4.71 KB

Protein-TDA

[For multiple PDBs] To compute Wasserstein distances, run the following in a bash shell:
python -m test_persistent_homology --pdbs 3CLN 1CFD 1L7Z --selections "backbone and segid A"
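For readers unfamiliar with the quantity being computed, here is a minimal sketch of a Wasserstein distance between two persistence diagrams. It is not the code in test_persistent_homology (which is not shown here); the function names are hypothetical, and the choice of order 1 with an L-infinity ground metric is an assumption. The brute-force matching is only practical for a handful of points.

```python
from itertools import permutations


def diagonal_cost(point):
    """Cost of matching a (birth, death) point to the diagonal (L-infinity metric)."""
    birth, death = point
    return (death - birth) / 2.0


def pair_cost(p, q):
    """L-infinity distance between two (birth, death) points."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))


def wasserstein_1(diagram_a, diagram_b):
    """Order-1 Wasserstein distance between two small diagrams (assumed convention)."""
    n, m = len(diagram_a), len(diagram_b)
    size = n + m
    # Rows: points of A, then m diagonal slots.
    # Columns: points of B, then n diagonal slots.
    cost = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(size):
            if i < n and j < m:
                cost[i][j] = pair_cost(diagram_a[i], diagram_b[j])
            elif i < n:
                cost[i][j] = diagonal_cost(diagram_a[i])
            elif j < m:
                cost[i][j] = diagonal_cost(diagram_b[j])
            # diagonal slot matched to diagonal slot costs nothing
    # Try every perfect matching and keep the cheapest one.
    return min(
        sum(cost[i][perm[i]] for i in range(size))
        for perm in permutations(range(size))
    )


# Toy diagrams, made up for illustration only.
diagram_x = [(0.0, 1.0)]
diagram_y = [(0.0, 1.2), (0.5, 0.6)]
print(wasserstein_1(diagram_x, diagram_y))
```

In practice a TDA library would replace the brute-force search with an optimal-transport or assignment solver; the cost-matrix layout stays the same.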

[For a reference PDB and trajectories, single CPU] To compute Wasserstein distances, run the following in a bash shell:
python -m test_persistent_homology --data_dir /Scr/hyunpark/Monster/vaegan_md_gitlab/data --psf reference_autopsf.psf --pdb reference_autopsf.pdb --trajs adk.dcd --selection "backbone"

[For a reference PDB and trajectories, multiple CPUs] To compute Wasserstein distances, run the following in a bash shell:
python -m test_persistent_homology --data_dir /Scr/hyunpark/Monster/vaegan_md_gitlab/data --psf reference_autopsf.psf --pdb reference_autopsf.pdb --trajs adk.dcd --selection "backbone" --multip --filename TEST.pickle

For now, use the conda environment below:

/Scr/hyunpark/anaconda3/envs/deeplearning/bin/python

Timing: 228.52 seconds on a single CPU versus 14.78 seconds on multiple CPUs, roughly a 15x speedup.

[To get a PyTorch DataLoader in graph format using PyTorch Geometric] This extracts XYZ coordinates and computes persistent homology (PH) with multiprocessing. The resulting Dataset is subscriptable by index and can be loaded by a DataLoader.
python -m data_utils --psf reference_autopsf.psf --pdb reference_autopsf.pdb --trajs adk.dcd --save_dir . --data_dir /Scr/hyunpark/Monster/vaegan_md_gitlab/data --multiprocessing --filename temp2.pickle
python -m main --which_mode preprocessing --pdb_database /Scr/arango/Sobolev-Hyun/2-MembTempredict/testing/ --save_dir /Scr/hyunpark/Protein-TDA/pickled --filename dppc.pickle --multiprocessing --optimizer torch_adam --log --gpu --epoches 1000 --batch_size 16 --ce_re_ratio 1 0.1
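To illustrate what "subscriptable by index" means for a dataset backed by a pickle file, here is a minimal sketch using only the standard library. The class name, the record keys ("coords", "ph"), and the pickle layout are all assumptions made for illustration; the actual format written by data_utils is not documented here.

```python
import os
import pickle
import tempfile


class PickledFrames:
    """Map-style dataset over a pickled list of per-frame records (hypothetical layout)."""

    def __init__(self, path):
        with open(path, "rb") as handle:
            self.records = pickle.load(handle)

    def __len__(self):
        # Number of frames available.
        return len(self.records)

    def __getitem__(self, index):
        # Return the coordinates and persistence pairs of one frame.
        record = self.records[index]
        return record["coords"], record["ph"]


def write_demo_pickle(path):
    """Write two toy frames so the example runs without any trajectory files."""
    records = [
        {"coords": [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]], "ph": [(0.0, 1.0)]},
        {"coords": [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]], "ph": [(0.0, 2.0)]},
    ]
    with open(path, "wb") as handle:
        pickle.dump(records, handle)


demo_path = os.path.join(tempfile.mkdtemp(), "demo.pickle")
write_demo_pickle(demo_path)
dataset = PickledFrames(demo_path)
print(len(dataset), dataset[1])
```

Any object with `__getitem__` and `__len__` like this can serve as a map-style dataset for `torch.utils.data.DataLoader`. For graph batches, the records would typically be converted to PyTorch Geometric data objects first.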

[To train from saved pickle/dat files] Assuming the pickle/dat files for coordinates, PH, and temperature have been saved, we can start training the neural network model:

python -m main --ignore_topologicallayer --optimizer torch_adamw --which_mode train --gpu --log --batch_size 8 --epoches 100
python -m main --which_mode train --load_ckpt_path /Scr/hyunpark/Protein-TDA/saved --name vit_model --backbone vit --filename dppc.pickle --multiprocessing --optimizer torch_adam --log --gpu --epoches 1000 --batch_size 16 --ce_re_ratio 1 0.1



For distributed data parallel training:

python -m torch.distributed.run --nnodes=1 --nproc_per_node=gpu --max_restarts 0 --module main --gpu --log --ignore_topologicallayer --optimizer torch_adam --which_mode train --batch_size 8 --epoches 100
python -m torch.distributed.run --nnodes=1 --nproc_per_node=gpu --max_restarts 0 --module main --which_mode train --name vit_model --backbone vit --filename dppc.pickle --multiprocessing --optimizer torch_adam --log --gpu --epoches 1000 --batch_size 16 --ce_re_ratio 1 0.1
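The launcher used above starts one worker process per GPU and passes each worker its identity through the environment variables RANK, LOCAL_RANK, and WORLD_SIZE. The sketch below shows how a script might read them and split sample indices across workers. The helper names are hypothetical, and the strided split is a simplification of what a distributed sampler does (no shuffling or padding); it is not taken from main.py.

```python
import os


def read_launch_env(environ=None):
    """Read the per-process variables set by the launcher, with single-process defaults."""
    environ = os.environ if environ is None else environ
    return {
        "rank": int(environ.get("RANK", 0)),
        "local_rank": int(environ.get("LOCAL_RANK", 0)),
        "world_size": int(environ.get("WORLD_SIZE", 1)),
    }


def shard_indices(num_samples, rank, world_size):
    """Indices this rank would process under simple strided sharding."""
    return list(range(rank, num_samples, world_size))


info = read_launch_env()
my_indices = shard_indices(10, info["rank"], info["world_size"])
print(info, my_indices)
```

Run without the launcher, the defaults describe a single process that handles every sample, so the same script works in both modes.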



For DGX-3 submission, assuming submit_local is set up with proper job scheduling:
submit_local main.py main main dgx-test

To continue training...
python -m main --which_mode train --name vit_model --backbone vit --filename dppc.pickle --multiprocessing --optimizer torch_adam --log --gpu --epoches 1000 --batch_size 16 --ce_re_ratio 1 0.1 --resume



To infer on all data...
python -m main --which_mode infer --name vit_model --backbone vit --filename dppc.pickle --multiprocessing --optimizer torch_adam --log --gpu --epoches 1000 --batch_size 16 --ce_re_ratio 1 0.1 --resume



To infer temperatures for the PDB patches inside a directory such as inference_save/T.123, and to save the results as pickles inside the inference_save directory:
python -m main --which_mode infer_custom --name convnext_model --filename dppc.pickle --multiprocessing --optimizer torch_adam --log --gpu --epoches 1000 --batch_size 512 --ce_re_ratio 1 0.1 --backbone convnext --resume --pdb_database inference_folder --save_dir inference_save --search_temp 123

Train/Inference databases


Patch Lipids for training:
   /Scr/arango/Sobolev-Hyun/2-MembTempredict/testing/
Patch Lipids for inference:
   /Scr/arango/Sobolev-Hyun/5-Analysis/DPPC-CHL/inference_folder/
Individual Lipids for training:
   /Scr/arango/Sobolev-Hyun/2-MembTempredict/indiv_lips_H/
Individual Lipids for inference:
   /Scr/arango/Sobolev-Hyun/5-Analysis/AQP5-PC
   /Scr/arango/Sobolev-Hyun/5-Analysis/LAINDY-PE
   /Scr/arango/Sobolev-Hyun/5-Analysis/B2GP1-PC
   /Scr/arango/Sobolev-Hyun/5-Analysis/B2GP1-PS