Create a file in the config directory containing the database connection information for pdbe_test (db, user and password)
Create a schema under pdbe_test in the Oracle DBMS
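As an illustration, the connection file could be a small INI file read with Python's configparser. The section name and the key names (db, user, password) used below are assumptions; check what the main program actually expects before relying on them.

```python
import configparser


def load_db_config(path, section="pdbe_test"):
    """Return (db, user, password) read from an INI-style file.

    The section and key names are hypothetical; adapt them to the
    format the main program actually expects.
    """
    parser = configparser.ConfigParser()
    # ConfigParser.read() returns the list of files it managed to parse.
    if not parser.read(path):
        raise FileNotFoundError(f"cannot read config file: {path}")
    entry = parser[section]
    return entry["db"], entry["user"], entry["password"]
```

The file itself would then contain a [pdbe_test] section with one "key = value" line per field.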
Main program is
All generated files are written under the MDA_results/CATH_4_1 directory
- Get the data for the different PDB entries from the ENTITY_CATH and ENTITY_SCOP tables (PDBEREAD_PDBE_LIVE schema)
- Aggregate data from both tables on
- Compare start/end residue numbers
- Calculate percentage and overlap for domains
- Insert data into DOMAIN_MAPPING
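The residue comparison above can be sketched as follows. This is a minimal example, assuming each domain is a single contiguous segment with inclusive start/end residue numbers and that the percentage is the shared length divided by each domain's own length; the function names are illustrative and the program may define these quantities differently.

```python
def overlap_length(start_a, end_a, start_b, end_b):
    """Number of residues shared by two inclusive ranges (0 if disjoint)."""
    return max(0, min(end_a, end_b) - max(start_a, start_b) + 1)


def overlap_percentages(cath, scop):
    """Return the overlap as a percentage of each domain's length.

    cath and scop are (start, end) tuples with inclusive residue numbers.
    The result is (percent of CATH domain, percent of SCOP domain).
    """
    shared = overlap_length(cath[0], cath[1], scop[0], scop[1])
    cath_len = cath[1] - cath[0] + 1
    scop_len = scop[1] - scop[0] + 1
    return 100.0 * shared / cath_len, 100.0 * shared / scop_len
```

For a CATH domain spanning residues 10-50 and a SCOP domain spanning 30-80, the two domains share 21 residues (30 to 50), and each percentage is that count relative to the domain's own length.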
Calculate percentage, overlap and medals for nodes (superfamilies)
If the sequence coverage by the CATH domain and by the SCOP domain are both > 25% => insert data into NODE_MAPPING
Cluster nodes using DOMAIN_MAPPING and NODE_MAPPING
If the average domain coverage at superfamily (SF) level > 25% and the overlap between CATH and SCOP > 25% => insert data into CLUSTER
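These notes do not say how nodes are grouped into clusters. One plausible reading, shown here purely as an assumption, is that a cluster is a connected component of the CATH/SCOP node pairs that passed the thresholds. The function name and the input format (a list of node pairs) are hypothetical.

```python
from collections import defaultdict


def cluster_nodes(pairs):
    """Group nodes linked by mapped (cath_node, scop_node) pairs.

    Assumption: a cluster is a connected component of the mapping graph.
    Returns a list of sets, one per component.
    """
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for cath_node, scop_node in pairs:
        root_a, root_b = find(cath_node), find(scop_node)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = defaultdict(set)
    for node in list(parent):
        clusters[find(node)].add(node)
    return list(clusters.values())
```

Node labels should be unique across the two classifications (for example by prefixing them with the database name), otherwise unrelated CATH and SCOP nodes with the same identifier would be merged.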
Determine MDA blocks for each cluster (an MDA block is a sequence of consecutive CATH and SCOP domain superfamilies)
If the overlap between CATH and SCOP > 50% => insert data into MDA_BLOCKS with the begin-end positions of the different CATH and SCOP domains
Link with the cluster in the CLUSTER_BLOCK table
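To make the notion of an MDA block concrete, the sketch below orders the domains of one chain by their start position and returns the resulting sequence of superfamilies. The input format and the function name are assumptions, and the superfamily labels in the example are placeholders, not real identifiers.

```python
def mda_block(domains):
    """Return the superfamilies of a chain's domains in sequence order.

    domains is an iterable of (start, end, superfamily) tuples;
    the domains are ordered by start residue, then by end residue.
    """
    ordered = sorted(domains, key=lambda d: (d[0], d[1]))
    return tuple(superfamily for _, _, superfamily in ordered)


# Example with placeholder labels: the domain starting at residue 1
# comes first even though it is listed second.
block = mda_block([(120, 250, "SF_B"), (1, 110, "SF_A")])
```

Two chains whose domains yield the same tuple would then belong to the same MDA block under this reading.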
- For each MDA block, get the chain IDs with their UniProt IDs and the percentage of the UniProt sequence covered by the corresponding chain
- Insert data into the BLOCK_CHAIN and BLOCK_UNIPROT tables
- Write mda_blocks.list (UniProt entries with the number of chains for each block) and mda_info.list (for each UniProt entry in each block: chain and coverage)
chopping: equivalence split, one instance, class4...
homology differences
Write info in files
All generated files are written under the MDA_results/CATH_ECOD directory
The mapping process is the same as the CATH/SCOP mapping above, with SCOP data replaced by ECOD data and _ECOD appended to the table names
To build the SEGMENT_ECOD table, the program calls a Python script located under the update_database directory. ECOD data are obtained from the ECOD website (