Molconverter is a command line program in Marvin Suite and JChem that converts between various file types.
molconvert [options] outformat[:exportoptions] [files...]
The outformat argument is one of the supported output formats listed in the tables below.
Chemical Formats
Format owner | Format name | Format type | Outformat value |
---|---|---|---|
Chemaxon | Marvin Document (MRV) | Document | mrv |
Chemaxon | Chemaxon Object Notation (CXON) | Document | cxon |
Chemaxon | Chemaxon Compressed Molfile | Molecule | csmol csrxn cssdf csrdf |
Chemaxon | Chemaxon Extended SMILES | Molecule | cxsmiles |
Chemaxon | Chemaxon Extended SMARTS | Molecule | cxsmarts |
Chemaxon | Chemaxon SMILES Abbreviated Groups | Molecule | abbrevgroup |
PerkinElmer Informatics | ChemDraw sketch file (CDX) | Document | cdx |
Dassault Systemes | ISIS/Draw sketch file (SKC) | Document | skc |
Dassault Systemes | CTFile formats | Molecule | mol rgf rxn sdf rdf |
Daylight | SMILES | Molecule | smiles |
Daylight | SMARTS | Molecule | smarts |
IUPAC/InChI Trust | IUPAC InChI | Molecule | inchi |
IUPAC/InChI Trust | IUPAC InChIKey | Hash | inchikey |
IUPAC | IUPAC Name | Molecule | name |
| | Peptide Sequence | Molecule | peptide |
| | CSV | N/A | csv |
Image Formats
Format name | Outformat value |
---|---|
Portable Network Graphics | png |
MS Bitmap | bmp |
JPEG | jpeg |
Enhanced Windows Metafile | emf |
Tag Image File Format | tiff |
Encapsulated PostScript | eps |
Option | Description |
---|---|
-o file | Write output to the specified file instead of standard output |
-m | Produce multiple output files |
-e charset | Set the input character encoding. The encoding must be supported by Java. |
-e [in]..[out] | Set the input (in) and/or output (out) character encodings. Examples: UTF-8, ASCII, Cp1250 (Windows Eastern European), Cp1252 (Windows Latin 1), ms932 (Windows Japanese). |
-s string | Read molecule from the specified SMILES, SMARTS or peptide string (the format is recognized automatically) |
-s string{format:options} | Read molecule from the string in the specified format (can be omitted), using the specified import options (can be omitted) |
-f string | Specify the import format and options |
--peptide string | Read molecule from the specified peptide string |
-g | Continue with the next molecule on error (default: exit on error) |
-Y | Remove explicit H atoms |
-I <range> | Process input molecules whose molecule index (1-based) falls into the specified range (e.g. 5-8,15 refers to molecules 5,6,7,8,15) |
-U | Fuse input molecules and output the union |
-R <file>[:<range>] | Fuse fragments to input molecule(s) from file, within the specified molecule index range. Range syntax: "-5,10-20,25,26,38-" (e.g. -R frags.mrv:20-) |
-R<i> <file>[:<range>] | Fuse R definition members to input molecule(s) from file, within the specified index range (e.g. -R1 rdef1.mrv:5-8,19) |
-R<i>:<1\|2> <file>[:<range>] | Fuse R definition members to input molecule(s) from file, within the specified index range, keeping only molecules that have 1 (or 2, respectively) attachment points (e.g. -R1:2 rdef1.mrv:-3,8-10) |
-T "<f1>:<f2>:..." | Export molecule properties <f1>, <f2>, ... with the result, separated by tab characters. Supported in SMILES, SMARTS, CXSMILES, CXSMARTS, InChI, InChIKey (where the result is a single line). All properties can be exported with -T "*". |
-F | Remove small fragments, keep the largest |
-c "f1 OP value&f2 OP value..." | Filter by field values when importing SDF. OP may be: =, <, >, <=, >= |
--mol-fields-to-records | Convert molecule type fields to separate records. |
-v | Verbose |
-vv | Very verbose (print stack trace on error) |
-2[:options][:F<i1>,<i2>,...,<iN>] | Calculate 2D coordinates, optionally with options for coordinate calculation. If the F parameter is specified, performs a partial clean with fixed atom coordinates for atoms <i1>,<i2>,...,<iN> (1-based indexes). |
-3[:options] | Calculate 3D coordinates, optionally with options for coordinate calculation. |
-H3D | Help on options for 3D calculations. For a detailed list, see Clean 3D Options. |
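The range syntax described for -I and -R above can be read as a comma-separated list of single indexes and dash-separated intervals, where a missing start or end leaves the interval open. The sketch below only illustrates that reading in POSIX shell; it is not molconvert's own parser. The function name expand_range and its second argument (the total number of molecules, needed to close an open-ended interval) are assumptions made for this example.

```shell
# expand_range SPEC TOTAL  (hypothetical helper, not part of molconvert)
# Prints the 1-based indexes selected by SPEC, one per line.
# An open start ("-5") begins at 1; an open end ("38-") runs to TOTAL.
expand_range() {
    spec=$1
    total=$2

    # Split SPEC on commas into the positional parameters.
    old_ifs=$IFS
    IFS=,
    set -- $spec
    IFS=$old_ifs

    for part in "$@"; do
        case $part in
            *-*)
                first=${part%%-*}   # text before the dash (may be empty)
                last=${part#*-}     # text after the dash (may be empty)
                [ -n "$first" ] || first=1
                [ -n "$last" ] || last=$total
                ;;
            *)
                first=$part
                last=$part
                ;;
        esac
        i=$first
        while [ "$i" -le "$last" ]; do
            printf '%s\n' "$i"
            i=$((i + 1))
        done
    done
}

# The -I example from the option list, assuming 20 input molecules.
expand_range 5-8,15 20
```

With these inputs the function prints 5, 6, 7, 8 and 15 on separate lines, which matches the -I example. How molconvert itself treats edge cases such as overlapping intervals is not covered here.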
Format-specific export options can be specified with the format descriptor. The outformat value and the options are separated by a colon; the options are separated from each other by commas.
The following example creates a 100x100 pixel JPEG image on a yellow background, with 95% quality:
molconvert jpeg:w100,Q95,#ffff00 nice.mol -o nice.jpg
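When the export options are assembled in a script, the descriptor can be built from its parts. The following is a minimal sketch in POSIX shell; export_descriptor is a hypothetical helper written for this example, and the script only prints the resulting command line instead of running molconvert.

```shell
# export_descriptor FORMAT [OPTION...]  (hypothetical helper)
# Joins the outformat and its export options into "FORMAT:opt1,opt2,...".
export_descriptor() {
    format=$1
    shift
    if [ "$#" -eq 0 ]; then
        # No export options: the descriptor is just the outformat.
        printf '%s\n' "$format"
        return
    fi
    # "$*" joins the remaining arguments with the first character of IFS.
    old_ifs=$IFS
    IFS=,
    options="$*"
    IFS=$old_ifs
    printf '%s:%s\n' "$format" "$options"
}

# '#ffff00' is quoted here because a word that starts with '#' would
# otherwise begin a shell comment.
descriptor=$(export_descriptor jpeg w100 Q95 '#ffff00')
printf 'molconvert %s nice.mol -o nice.jpg\n' "$descriptor"
```

The printed line is the same command as in the example above. Inside the joined descriptor the # character is in the middle of a word, so the shell does not treat it as a comment.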
The format specific import options can be specified between braces, in one of the following forms:
Form | Description |
---|---|
filename{options} | |
filename{MULTISET,options} | to merge molecules into one that contains multiple atom sets |
filename{format:} | to skip automatic format recognition |
filename{format:options} | to skip automatic format recognition |
filename{format:MULTISET,options} | |
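The forms above all follow one pattern: the file name, then braces containing an optional format followed by a colon, then optional import options. The sketch below builds such a specification in POSIX shell. The helper name import_spec is hypothetical, and the option string f1.4C4 is borrowed from the XYZ examples in this document purely as sample input.

```shell
# import_spec FILE [FORMAT] [OPTIONS]  (hypothetical helper)
# Prints FILE{FORMAT:OPTIONS}, FILE{OPTIONS}, or plain FILE.
import_spec() {
    file=$1
    format=${2:-}
    options=${3:-}
    if [ -n "$format" ]; then
        # A format is given: the colon is always written, even when
        # no options follow it (e.g. "foo.xyz{xyz:}").
        printf '%s{%s:%s}\n' "$file" "$format" "$options"
    elif [ -n "$options" ]; then
        printf '%s{%s}\n' "$file" "$options"
    else
        printf '%s\n' "$file"
    fi
}

import_spec foo.xyz xyz
import_spec foo.xyz "" f1.4C4
import_spec foo.xyz xyz MULTISET,f1.4C4
```

Because braces and commas can trigger brace expansion in shells such as bash, the resulting specification is best passed to molconvert in double quotes, for example as "$(import_spec foo.xyz xyz)".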
Format name | Export options | Import options |
---|---|---|
MRV | >> | |
CXSMARTS,CXSMILES | >> | >> |
CTFile formats (MOL, SDF, RXN, RDF, RGF) | >> | >> |
SMILES, SMARTS | >> | >> |
InChI,InChIKey | >> | |
Name | >> | |
Peptide | >> | >> |
You can also use the Basic export options for all formats.
Examples
Printing the SMILES string of a molecule in a molfile:
molconvert smiles caffeine.mol
Dearomatizing an aromatic molecule:
molconvert smiles:-a -s "c1ccccc1"
Aromatizing a molecule:
molconvert smiles:a -s "C1=CC=CC=C1"
(The default general aromatization is used.)
Aromatizing a molecule using the basic algorithm:
molconvert smiles:a_bas -s "CN1C=NC2=C1C(=O)N(C)C(=O)N2C"
Converting a SMILES file to MDL Molfile:
molconvert mol caffeine.smiles -o caffeine.mol
Making an SDF from molfiles:
molconvert sdf *.mol -o molecules.sdf
Printing the encodings of SDfiles in the working directory:
molconvert query-encoding *.sdf
SMILES to Molfile with optimized 2D coordinate calculation, converting double bonds with unspecified cis/trans to "either":
molconvert -2:2e mol caffeine.smiles -o caffeine.mol
2D coordinate calculation with optimization and fixed atom coordinates for atoms 1, 5, 6:
molconvert -2:2:F1,5,6 mol caffeine.mol
Import a file as XYZ, do not try to recognize the file format:
molconvert smiles "foo.xyz{xyz:}"
Note: This is just an example. XYZ and other formats known by Marvin are always recognized (send us a bug report otherwise), so the specification of the input format is usually not needed. It is only relevant if a user-defined import module is used.
Import a file as XYZ, with bond-length cut-off = 1.4, and max. number of Carbon connections = 4, export to SMILES:
molconvert smiles "foo.xyz{f1.4C4}"
Import a file as Gzipped XYZ, with the same import options as in the previous example:
molconvert smiles "foo.xyz.gz{gzip:xyz:f1.4C4}"
Like the previous example but merge the molecules into one molecule that contains multiple atom sets. MDL molfile is exported.
molconvert mol "foo.xyz.gz{gzip:xyz:MULTISET,f1.4C4}"
Import an SDF and export a table containing selected molecules with columns: SMILES, ID, and logP:
molconvert smiles -c "ID<=1000&logP>=-2&logP<=4" -T "ID:logP" foo.sdf
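The -c expression joins its conditions with & and the -T list joins its field names with a colon. Since &, < and > are special characters in the shell, both arguments need quoting. The sketch below assembles the two strings from separate pieces in POSIX shell; join_with is a hypothetical helper, and the script prints the command line rather than running molconvert.

```shell
# join_with SEP ITEM...  (hypothetical helper)
# Joins the items with the single-character separator SEP.
join_with() {
    sep=$1
    shift
    # "$*" joins the arguments with the first character of IFS.
    old_ifs=$IFS
    IFS=$sep
    joined="$*"
    IFS=$old_ifs
    printf '%s\n' "$joined"
}

# Each condition is single-quoted so the shell does not interpret < and >.
filter=$(join_with '&' 'ID<=1000' 'logP>=-2' 'logP<=4')
fields=$(join_with ':' ID logP)

printf 'molconvert smiles -c "%s" -T "%s" foo.sdf\n' "$filter" "$fields"
```

The printed line reproduces the command shown in the example above.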
Fuse R2 definition from file, filter fragments with 1 attachment point:
molconvert mrv in.mrv -R2:1 rdef.mrv
Fuse fragments from file (note that the input molecule to which the fragments are fused must also be specified):
molconvert mrv in.mrv -R frags.mrv
Generate all common names for a structure:
molconvert "name:common,all" -s tylenol
Generate the most popular common name for a structure (it fails if none is known):
molconvert name:common -s viagra
Generate SMILES for the molecules whose names are mentioned in the file foo.html:
molconvert smiles foo.html