cgkmc

Available since 0.8.0

Generate KMC-CG mappings of large biomolecules

Description

The cgkmc module implements K-Means Clustering Coarse-Graining (KMC-CG), a method for generating optimal CG mappings of large biomolecules. KMC-CG removes the sequence-dependent constraints of ED-CG, so it can explore a larger space of candidate mappings and find more physically optimal ones. In addition, the K-means clustering algorithm optimizes the CG mapping variationally, with good efficiency and stability.

Usage

Syntax for running the cgkmc module

Required arguments:
  beta        parameter controlling the positional residual
  gamma       parameter controlling the continuity residual
  frames      coordinates of the CA atoms used for initialization

For additional parameters and functions, and for instructions on running KMC-CG, see the example notebook arp23.ipynb in the OpenMSCG/examples folder.

Notes

The KMC-CG method reads in aligned all-atom (AA) trajectories, that is, trajectories from which the translational and rotational motions relative to the reference frame have been removed. It then generates the optimal CG mapping with K-means clustering by minimizing a variational function that consists of a positional term, a fluctuation term, and a penalty term.
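The alignment step mentioned above can be sketched as follows. This is a minimal illustration that assumes NumPy is available and uses the standard Kabsch superposition; it is not taken from the cgkmc source, and the function names align_to_reference and align_trajectory are hypothetical. The actual preprocessing used with KMC-CG may differ (for example, it may be done with an external trajectory tool).

```python
import numpy as np


def align_to_reference(frame, reference):
    """Superimpose one frame (N, 3) onto a reference (N, 3) with the Kabsch algorithm.

    Hypothetical helper for illustration; not part of the cgkmc module.
    """
    frame = np.asarray(frame, dtype=float)
    reference = np.asarray(reference, dtype=float)

    # Remove translational motion: centre both coordinate sets at the origin.
    ref_centroid = reference.mean(axis=0)
    frame_c = frame - frame.mean(axis=0)
    ref_c = reference - ref_centroid

    # Remove rotational motion: optimal rotation from the SVD of the covariance.
    covariance = frame_c.T @ ref_c
    u, _, vt = np.linalg.svd(covariance)

    # Flip the last axis if needed so the result is a proper rotation,
    # not a reflection.
    d = 1.0 if np.linalg.det(u @ vt) >= 0 else -1.0
    rotation = u @ np.diag([1.0, 1.0, d]) @ vt

    # Rotate the centred frame (row-vector convention), then place it on
    # the reference centroid.
    return frame_c @ rotation + ref_centroid


def align_trajectory(frames, reference):
    """Align every frame of a trajectory (T, N, 3) to the same reference."""
    return np.stack([align_to_reference(f, reference) for f in frames])
```

After alignment, the remaining differences between frames reflect internal motion only, which is the kind of input the method description above expects.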

For more information about the KMC-CG method, see the following paper: Wu, J.; Xue, W.; Voth, G. A. K-Means Clustering Coarse-Graining (KMC-CG): A Next Generation Methodology for Determining Optimal Coarse-Grained Mappings of Large Biomolecules. J. Chem. Theory Comput. 2023.

A sample result is saved in the following format:

0
[0, 1, 2, 8, 9]
1
[3, 4, 5, 6, 7]

Each number is the index of a CG site, and the list that follows it contains the indices of the residues belonging to that site. All indices start from 0.
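For downstream use, a result in this format can be read back into a dictionary. The sketch below uses only the Python standard library and assumes the layout is exactly as in the sample: a site-index line followed by a bracketed list line. The helper names parse_cg_mapping and residue_to_site are hypothetical and are not part of the cgkmc module.

```python
import ast


def parse_cg_mapping(text):
    """Parse alternating 'site index' / 'residue list' lines into a dict.

    Hypothetical helper; assumes the two-line-per-site layout of the sample.
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if len(lines) % 2 != 0:
        raise ValueError("expected an index line followed by a list line per CG site")

    mapping = {}
    for site_line, residues_line in zip(lines[0::2], lines[1::2]):
        site = int(site_line)
        # literal_eval safely evaluates the bracketed list without running code.
        residues = ast.literal_eval(residues_line)
        mapping[site] = [int(r) for r in residues]
    return mapping


def residue_to_site(mapping):
    """Invert the mapping: residue index -> CG site index."""
    return {res: site for site, residues in mapping.items() for res in residues}


sample = """0
[0, 1, 2, 8, 9]
1
[3, 4, 5, 6, 7]
"""

mapping = parse_cg_mapping(sample)
lookup = residue_to_site(mapping)
```

In the sample, site 0 contains residues 0, 1, 2, 8, and 9, which are not contiguous in sequence. This is the kind of grouping that becomes possible once the sequence-dependent constraints are removed.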