In addition to its own options listed below, Document to Structure accepts all Name to Structure format options, which configure which name conversions are attempted.
Codename | Explanation |
---|---|
cas | enable the conversion of CAS Registry Numbers® (off by default, uses a webservice, read the notice about CAS Registry Numbers®) |
smiles | enable the conversion of SMILES strings (on by default) |
inchi | enable the conversion of InChI strings (on by default) |
ocr | enable the processing of scanned text in PDF documents (on by default) |
osr | enable the conversion of structure drawings by any available OSR external tool (on by default if any such tool is installed) |
osra | enable the conversion of structure drawings by the OSRA external tool (on by default if OSRA is installed). Using this option will specify that OSRA should be used even if other OSR tools are available. |
clide | enable the conversion of structure drawings by the CLiDE external tool (on by default if CLiDE is installed). Using this option will specify that CLiDE should be used even if other OSR tools are available. |
imago | enable the conversion of structure drawings by the Imago external tool (on by default if Imago is installed). Using this option will specify that Imago should be used even if other OSR tools are available. |
timeout=N | the maximum number of seconds to run, with 0 for no timeout (default: no timeout) |
osraTimeout=N | configure the maximum number of seconds to run OSRA on an image (default: 20 seconds) |
clideTimeout=N | configure the maximum number of seconds to run CLiDE on an image (default: 20 seconds) |
imagoTimeout=N | configure the maximum number of seconds to run Imago on an image (default: 20 seconds) |
filterOSR | enable the filtering of OSR structures for incomplete recognition (on by default) |
text | enable the processing of the textual content of the document (on by default). The text is searched for text-based formats: names, SMILES and InChI (all on by default), and CAS Registry Numbers® (off by default, see the cas option above) |
acronyms | enable the conversion of acronyms, such as ATP for Adenosine TriPhosphate (off by default) |
vernacular | enable the conversion of everyday terms like "water" or "steam" (off by default) |
OLE | enable the conversion of structures embedded in office documents (on by default) |
startPage=N | start processing document at page N (can be combined with endPage to process a range of pages) |
endPage=N | stop processing document at page N |
insideTag=<tag> | for markup formats, enable the conversion only inside the given tag (typically insideTag=body for HTML). Off by default. |
contextRadius=N | maximum number of characters of context to include, on each side of the hit (default = 40). |
contextIndex | whether to include the index of the hit in the context. Off by default. |
Each option can be preceded by a minus sign (for instance, -smiles) to disable it. To enable an option, both the bare form (smiles) and the plus form (+smiles) are accepted.
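
The enable, disable, and `name=N` forms described above can be illustrated with a small sketch that renders option tokens from a Python dictionary. This is not part of the product's API: the helper names `render_option` and `render_options` are hypothetical, and the comma used to join the tokens is an assumption, so check how your tool or API expects multiple options to be combined.

```python
from typing import Dict, Union

# A flag is True/False; a valued option (timeout=N, startPage=N, ...) is an int or str.
OptionValue = Union[bool, int, str]


def render_option(name: str, value: OptionValue) -> str:
    """Render a single option token (hypothetical helper)."""
    # bool must be tested first, because bool is a subclass of int in Python.
    if isinstance(value, bool):
        # Enabled flags use the bare name; disabled flags get a leading minus.
        return name if value else f"-{name}"
    # Valued options use the name=N form from the table.
    return f"{name}={value}"


def render_options(options: Dict[str, OptionValue], separator: str = ",") -> str:
    """Join rendered tokens; the separator is an assumption, not a documented fact."""
    return separator.join(render_option(n, v) for n, v in options.items())


if __name__ == "__main__":
    # Disable SMILES, enable CAS lookup, and restrict processing to pages 2 to 5.
    print(render_options({
        "smiles": False,
        "cas": True,
        "startPage": 2,
        "endPage": 5,
        "contextRadius": 60,
    }))
```

Checking for `bool` before treating a value as a number keeps `timeout=0` (no timeout) distinct from a disabled flag.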