Updates:

Nov 01, 2016:
An updated version of ArbMatch has been posted here. Please send any questions, comments and suggestions to ...



Munkres Match - Align any two molecules

This code uses the Kuhn-Munkres (Hungarian) algorithm to optimally align two arbitrarily ordered isomers. Given two isomers A and B whose Cartesian coordinates are supplied in XYZ format, it aligns B on A so as to minimize the Kabsch root-mean-square deviation (RMSD) between structures A and B after either:

  1. a Kuhn-Munkres assignment/reordering (quick)
  2. a Kuhn-Munkres assignment/reordering factoring in axes swaps and reflections (~48x slower, since the 6 axis permutations combined with the 8 sign combinations give 48 orientations to test)

We recommend the second method, although the first is still better than an RMSD calculation without any atom reordering.
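The two methods above can be sketched as follows. This is a minimal illustration, not the distributed script: it assumes SciPy's `linear_sum_assignment` in place of the Hungarian module, ignores element labels, and the function names `kabsch_rmsd`, `reorder` and `best_match` are hypothetical.

```python
import itertools

import numpy as np
from scipy.optimize import linear_sum_assignment


def kabsch_rmsd(A, B):
    """RMSD between N x 3 arrays A and B after the best proper rotation of B onto A."""
    A = A - A.mean(axis=0)
    B = B - B.mean(axis=0)
    U, _, Vt = np.linalg.svd(B.T @ A)
    d = np.sign(np.linalg.det(U @ Vt))       # guard against improper rotations
    R = U @ np.diag([1.0, 1.0, d]) @ Vt
    return np.sqrt(np.mean(np.sum((B @ R - A) ** 2, axis=1)))


def reorder(A, B):
    """Indices into B such that B[idx] minimizes the summed squared distance to A."""
    cost = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    _, cols = linear_sum_assignment(cost)
    return cols


def best_match(A, B):
    """Method 2: try all 48 axis swaps/reflections of B, reordering atoms each time."""
    A = A - A.mean(axis=0)
    B = B - B.mean(axis=0)
    best_rmsd, best_order = np.inf, None
    for perm in itertools.permutations(range(3)):                # 6 axis swaps
        for signs in itertools.product((1.0, -1.0), repeat=3):   # 8 sign choices
            Bt = B[:, list(perm)] * np.array(signs)
            order = reorder(A, Bt)
            rmsd = kabsch_rmsd(A, Bt[order])
            if rmsd < best_rmsd:
                best_rmsd, best_order = rmsd, order
    return best_rmsd, best_order


# Toy data: B is A with its atoms shuffled and rotated by 90 degrees about z.
rng = np.random.default_rng(0)
A = rng.normal(size=(6, 3))
Rz = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
B = A[rng.permutation(6)] @ Rz.T

rmsd_plain = kabsch_rmsd(A, B)          # no reordering
rmsd_best, order = best_match(A, B)     # reordering plus swaps/reflections
print(rmsd_plain, rmsd_best)
```

Method 1 corresponds to calling `reorder` once on the input orientation. Method 2 repeats that step for every orientation, which is where the extra cost comes from.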

A web server with this implementation is available at http://marcy.furman.edu/munkres-match.html.

While this script is kept as minimal as possible to ensure ease of use and portability, it does require two Python packages beyond those included in a standard Python installation:

 

  1. Python NumPy module
  2. Python Hungarian module by Harold Cooper
    (Hungarian: Munkres' Algorithm for the Linear Assignment Problem in Python)
    (https://github.com/Hrldcpr/Hungarian)
    This is a wrapper around a fast C++ implementation of the Kuhn-Munkres algorithm. Installation instructions are given at https://github.com/Hrldcpr/Hungarian

    Alternatively, one can use Brian Clapper's Munkres module or a similar routine included in SciPy. This may require small changes to the current script.
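As an illustration of the assignment step with an alternative solver, the sketch below uses SciPy's `scipy.optimize.linear_sum_assignment` on a small, made-up cost matrix. It is not taken from the script. Adapting the script would mean swapping its call to the Hungarian module for a call like this one.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Made-up cost matrix: cost[i, j] is the cost of pairing
# atom i of molecule A with atom j of molecule B.
cost = np.array([[4.0, 1.0, 3.0],
                 [2.0, 0.0, 5.0],
                 [3.0, 2.0, 2.0]])

# rows[k] is paired with cols[k]; the pairing minimizes the total cost.
rows, cols = linear_sum_assignment(cost)
total = cost[rows, cols].sum()
print(cols, total)
```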

Other optional tools are:

  1. Prin_coords.py - using principal coordinates generally yields better alignment (lower RMSDs).  A Python script to convert molecules from arbitrary to principal coordinate system is included.
  2. In cases where one wants to use atom types including connectivity and hybridization information, it is necessary to use OpenBabel to convert the Cartesian coordinates to SYBYL Mol2 (sy2) and MNA (mna) formats. 
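The conversion to principal coordinates mentioned in item 1 can be sketched as below. This is not the included Prin_coords.py. It assumes the principal axes are those of the mass-weighted inertia tensor, and the function names `inertia_tensor` and `to_principal_axes` are hypothetical.

```python
import numpy as np


def inertia_tensor(coords, masses):
    """Inertia tensor of point masses about the origin."""
    r2 = (coords ** 2).sum(axis=1)
    outer = masses[:, None, None] * coords[:, :, None] * coords[:, None, :]
    return (masses * r2).sum() * np.eye(3) - outer.sum(axis=0)


def to_principal_axes(coords, masses):
    """Shift to the center of mass and rotate onto the principal axes."""
    coords = np.asarray(coords, dtype=float)
    masses = np.asarray(masses, dtype=float)
    com = (masses[:, None] * coords).sum(axis=0) / masses.sum()
    centered = coords - com
    # Eigenvectors of the symmetric inertia tensor define the principal axes.
    _, axes = np.linalg.eigh(inertia_tensor(centered, masses))
    return centered @ axes


# Toy example: a water-like geometry in an arbitrary frame (made-up numbers).
masses = np.array([15.999, 1.008, 1.008])
coords = np.array([[1.00, 2.00, 3.00],
                   [1.96, 2.00, 3.00],
                   [0.76, 2.93, 3.00]])
principal = to_principal_axes(coords, masses)
I_principal = inertia_tensor(principal, masses)
print(np.round(I_principal, 6))
```

In the new frame the inertia tensor is diagonal, so two copies of the same structure given in different orientations end up in frames that differ only by axis swaps and reflections.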

The best way to take advantage of these two optional tools is probably to use the attached driver script (driver-script.csh). The syntax is as follows.

Usage:

driver-script.csh -<flag> <filename_1.xyz> <filename_2.xyz>
where <flag> is one of:
  -N   match by atom or element name
  -T   match by SYBYL atom type
  -C   match by MNA atom connectivity type

E.g.

driver-script.csh -N cluster1.xyz cluster2.xyz
driver-script.csh -T cluster1.xyz cluster2.xyz
driver-script.csh -C cluster1.xyz cluster2.xyz

This matches the Cartesian coordinates of file1 and file2 using the Kuhn-Munkres algorithm based on atom name (-N), type (-T) or connectivity (-C).

It produces s-file1.xyz and s-file2-matched.xyz, which are the sorted file1.xyz and the matched file2.xyz, respectively.
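Matching by label (name, type or connectivity) can be pictured as solving one assignment problem per group of atoms sharing a label, so that atoms are only ever swapped with atoms of the same kind. The sketch below illustrates that idea under stated assumptions: it uses SciPy's `linear_sum_assignment` rather than the Hungarian module, and `match_by_label` is a hypothetical helper, not a function from the script.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def match_by_label(labels_a, coords_a, labels_b, coords_b):
    """Return idx such that atom idx[i] of B is matched to atom i of A."""
    labels_a, labels_b = np.asarray(labels_a), np.asarray(labels_b)
    coords_a = np.asarray(coords_a, dtype=float)
    coords_b = np.asarray(coords_b, dtype=float)
    if sorted(labels_a.tolist()) != sorted(labels_b.tolist()):
        raise ValueError("A and B must contain the same labels")
    idx = np.empty(len(labels_a), dtype=int)
    for label in np.unique(labels_a):
        in_a = np.flatnonzero(labels_a == label)
        in_b = np.flatnonzero(labels_b == label)
        # Squared distances between same-label atoms only.
        diff = coords_a[in_a][:, None, :] - coords_b[in_b][None, :, :]
        rows, cols = linear_sum_assignment((diff ** 2).sum(axis=2))
        idx[in_a[rows]] = in_b[cols]
    return idx


# Toy example: the same water-like geometry listed in two atom orders.
labels_a = ["O", "H", "H"]
coords_a = [[0.00, 0.00, 0.00], [0.96, 0.00, 0.00], [-0.24, 0.93, 0.00]]
labels_b = ["H", "O", "H"]
coords_b = [[-0.24, 0.93, 0.00], [0.00, 0.00, 0.00], [0.96, 0.00, 0.00]]

idx = match_by_label(labels_a, coords_a, labels_b, coords_b)
matched_labels = [labels_b[i] for i in idx]
print(idx, matched_labels)
```

With -T or -C, the same grouping would apply with SYBYL or MNA types in place of element names.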

The code will provide the following:

  1. The initial Kabsch RMSD
  2. The final Kabsch RMSD after the application of the Kuhn-Munkres algorithm
  3. The coordinates corresponding to the best alignment of B on A to a file called B-aligned_to-A.xyz

If you find this script useful for any publishable work, please cite the corresponding paper:

  Berhane Temelso, Joel M. Mabey, Toshiro Kubota, Nana Appiah-padi, George C. Shields. 
J. Chem. Info. Model. X(Y), 2016
