Abbreviated groups are stored in a TAB-delimited text file called default.abbrevgroup. The basic format is:
Ac	CC=O	2
AcAc	CC(=O)CC(=O)	5
Acet	CC=O	2
Ade	NC1=C2N=CNC2=NC=N1	6	1
J	C*.CCC[C@H](n)C=O |r,m:1:3.4|	7	8
Warning: Please make sure the words are separated by TAB characters, not by spaces.
Codename: abbrevgroup. See also CXSMILES and CXSMARTS.
In these lines the very first word is the abbreviation and the second is the CXSMILES string representing the molecule fragment depicted by the abbreviation. These are followed by the attachment atom numbers (atom positions in the CXSMILES string). In the first line, which defines the Ac abbreviation, the second carbon is the attachment atom, so if the group is connected to another part of the molecule, this atom makes the connection. If no number follows the CXSMILES string, the abbreviated group cannot be connected to other atoms. (Since Marvin 6.0 there is also no upper limit on the number of attachment atoms.)
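As a minimal sketch of how such a line could be read (this is not Marvin's own parser, and the function name `parse_basic_line` is hypothetical), the following Python splits a line on TAB characters and converts the attachment atom numbers to integers. It assumes that every purely numeric field after the CXSMILES string is an attachment atom number.

```python
def parse_basic_line(line):
    """Split one TAB-delimited line into (abbreviation, cxsmiles, attachments)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 2:
        # A line separated by spaces instead of TABs ends up here.
        raise ValueError("expected abbreviation and CXSMILES separated by TAB")
    abbreviation, cxsmiles = fields[0], fields[1]
    # Assumption: purely numeric trailing fields are attachment atom numbers.
    attachments = [int(field) for field in fields[2:] if field.isdigit()]
    return abbreviation, cxsmiles, attachments


print(parse_basic_line("Ac\tCC=O\t2"))  # ('Ac', 'CC=O', [2])
```

A CXSMILES string that contains a space, such as the one in the J line, stays in one piece because only TAB characters are treated as separators.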
Usually the bond points towards the middle of the abbreviation, but when the string contains atom symbols, it is usually preferable for the bond to point to the symbol of the bonding atom. Furthermore, it is desirable to flip the abbreviation when the group is on the opposite side.
To achieve the flipping effect, one has to provide the alternative name of the abbreviated group that will be printed on the left side of the molecule:
CN	C#N	1	leftName=NC
CO2Et	CCOC=O	4	leftName=EtO2C
CO2H	OC=O	2	leftName=HO2C
COOH	OC=O	2	leftName=HOOC
COOiAm	CC(C)CCOC=O	7	leftName=iAmOOC
If the abbreviation contains numbers, those will be treated as subscripts:
C10H21	CCCCCCCCCC	1	leftName=H21C10
CBr3	BrC(Br)Br	2	leftName=Br3C
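The subscript rule can be mimicked in plain text with Unicode subscript digits. This is only an illustrative sketch: how Marvin actually draws subscripts is not described here, and `display_label` is a hypothetical helper name.

```python
# Map ASCII digits to the Unicode subscript digits U+2080..U+2089.
SUBSCRIPT_DIGITS = str.maketrans("0123456789", "₀₁₂₃₄₅₆₇₈₉")


def display_label(abbreviation):
    """Return the abbreviation with every digit shown as a subscript."""
    return abbreviation.translate(SUBSCRIPT_DIGITS)


print(display_label("C10H21"))  # C₁₀H₂₁
print(display_label("Br3C"))    # Br₃C
```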
There are also groups for which a flipping abbreviation is desirable but whose abbreviation string is the form used on the left side. For these groups (for example AcO, MeO) the rightName specifier can be used:
BnNH	NCC1=CC=CC=C1	1	rightName=HNBn
BnO	OCC1=CC=CC=C1	1	rightName=OBn
BnO2C	O=COCC1=CC=CC=C1	2	rightName=CO2Bn
BnOOC	O=COCC1=CC=CC=C1	2	rightName=COOBn
If you do not want to flip an abbreviation but want the bond to point to an atom symbol rather than to the middle of the string, you can still define the center specifier:
c-C10H19	C1CCCCCCCCC1	1	center=AUTO
c-C11H21	C1CCCCCCCCCC1	1	center=AUTO
c-C12H23	C1CCCCCCCCCCC1	1	center=AUTO
With center=AUTO the bond points to the very first character in the abbreviated group string that is the same as the atom symbol of the binding atom. The center specifier also makes it possible to fine-tune the position of the bond so that it points to any of the characters.
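The leftName, rightName and center specifiers can be read as key=value fields that follow the attachment atom numbers. The sketch below is an illustration under that assumption, not Marvin's implementation; `parse_line` and `label_for_side` are hypothetical names, and the rule for choosing a label simply follows the description of leftName and rightName given here.

```python
def parse_line(line):
    """Parse one TAB-delimited line, including key=value specifiers."""
    fields = line.rstrip("\n").split("\t")
    entry = {"name": fields[0], "cxsmiles": fields[1],
             "attachments": [], "options": {}}
    for field in fields[2:]:
        if field.isdigit():
            entry["attachments"].append(int(field))
        elif "=" in field:
            # Specifiers such as leftName=..., rightName=..., center=...
            key, value = field.split("=", 1)
            entry["options"][key] = value
    return entry


def label_for_side(entry, group_on_left):
    """Pick the text to print depending on which side the group is on."""
    options = entry["options"]
    if group_on_left:
        return options.get("leftName", entry["name"])
    return options.get("rightName", entry["name"])


ester = parse_line("CO2Et\tCCOC=O\t4\tleftName=EtO2C")
print(label_for_side(ester, group_on_left=True))   # EtO2C
print(label_for_side(ester, group_on_left=False))  # CO2Et
```

The CXSMILES field is never scanned for "=", so double bonds in the fragment are not mistaken for specifiers.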
Properties of the atoms inside the abbreviated group are stored in the CXSMILES description part of the molecule. Properties of the abbreviation atom itself cannot be stored there: they are stored in a separate field that uses the same syntax as the CXSMILES format. In the example below the abbreviation atom of Alanine has two properties with keys 'property 1' and 'property 2' and values 'value 1' and 'value 2', respectively. Properties are separated by the character ':'. The characters '.' and ':' are escaped when they occur in property keys and values.
Ala	C[C@H](n)C=O |r|	3	4	abbrevAtomProperties=property 1.value 1:property 2.value 2
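A rough sketch of reading the abbrevAtomProperties field follows. It assumes, based on the example, that each key is separated from its value by '.', and it deliberately does not handle escaped '.' or ':' characters because the escape syntax is not specified here. The function name `parse_abbrev_atom_properties` is hypothetical.

```python
def parse_abbrev_atom_properties(field):
    """Turn 'abbrevAtomProperties=k1.v1:k2.v2' into a dict.

    Assumption: keys and values contain no escaped '.' or ':' characters.
    """
    prefix = "abbrevAtomProperties="
    if not field.startswith(prefix):
        raise ValueError("not an abbrevAtomProperties field")
    properties = {}
    for pair in field[len(prefix):].split(":"):
        # The first '.' separates the key from the value.
        key, _, value = pair.partition(".")
        properties[key] = value
    return properties


field = "abbrevAtomProperties=property 1.value 1:property 2.value 2"
print(parse_abbrev_atom_properties(field))
# {'property 1': 'value 1', 'property 2': 'value 2'}
```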
Since the 5.10 release, the default built-in abbreviated groups can be extended.
A user can define abbreviated groups to be used in MarvinSketch in a file called user.abbrevgroup. This file has to be placed in the ChemAxon settings directory that is located in the chemaxon or .chemaxon folder inside the home directory of the user.
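To check whether such a file is present, a small script can look in both possible folders. This is a sketch that assumes user.abbrevgroup sits directly inside the chemaxon or .chemaxon folder; the function name `find_user_abbrevgroup` and its `home` parameter are hypothetical.

```python
from pathlib import Path


def find_user_abbrevgroup(home=None):
    """Return the path of user.abbrevgroup if it exists, otherwise None."""
    home = Path(home) if home is not None else Path.home()
    # Assumption: the file is placed directly in one of these two folders.
    for folder in ("chemaxon", ".chemaxon"):
        candidate = home / folder / "user.abbrevgroup"
        if candidate.is_file():
            return candidate
    return None


print(find_user_abbrevgroup())
```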
Developers who build on the MarvinBeans library or use the Marvin Applets package can extend the abbreviated group list with additional files, which are made available to MarvinSketch through the customAbbrevGroups parameter.