Reactivity rules

    These examples show how to specify a reactivity rule on reactants and/or products: reaction products are accepted only if the reactivity rule is satisfied. Rules are defined as chemical terms and evaluated by the Chemical Terms Evaluator. See the Reaction rules section of the Reactor Manual for the syntax of these rules.

    1. The reaction file for our first example, amine+isocyanate.rdf, describes an amine-isocyanate reaction. It contains the reactivity rule in the REACTIVITY RDF tag (you can see it in MView by setting Table / Show Fields):

      images/download/attachments/1806400/Amine_isocyanate_react.png
      (match(ratom(1), "[#6][N:1]", 1)
        || match(ratom(1), "[#6][N:1][N:2]", 2))
        && !match(ratom(1), "[#6][N:1][N:2]", 1)
        && !match(ratom(1), "[N:1][C,P,S]=O", 1)

      The reactivity rule in this example describes amine-type nitrogens: an N atom is taken as an amine-type nitrogen if and only if it is either an amine or a hydrazine (N2), but not a hydrazine (N1) and not an amide.
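      The boolean structure of this rule can be sketched in plain Python. This is only an illustration of how the four terms combine, not how the Chemical Terms Evaluator works: the function name and its four flags are hypothetical stand-ins for the results of the four match(...) calls.

```python
# Hypothetical stand-ins for the four match(...) terms of the rule: each flag
# says whether the nitrogen matches the corresponding query pattern.
def is_amine_type(amine: bool, hydrazine_n2: bool,
                  hydrazine_n1: bool, amide: bool) -> bool:
    """Mirror the boolean structure of the REACTIVITY rule."""
    return (amine or hydrazine_n2) and not hydrazine_n1 and not amide


# A plain amine nitrogen is accepted.
plain_amine = is_amine_type(amine=True, hydrazine_n2=False,
                            hydrazine_n1=False, amide=False)

# An amide nitrogen is attached to a carbon too, so the first term holds,
# but the last term rejects it.
amide_nitrogen = is_amine_type(amine=True, hydrazine_n2=False,
                               hydrazine_n1=False, amide=True)

print(plain_amine, amide_nitrogen)
```

      The parentheses around the first two terms matter: without them the && terms would bind more tightly than ||, and the exclusions would no longer apply to plain amines.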

      hydrazine

      images/download/attachments/1806400/Hydrazyne.png

      Our reactant files are amines.smiles and isocyanates.smiles:

      Amines

      Nc1ccccc1OC(F)F
      N[C@H]1CCOC1=O
      c1ccncc1
      CC(C)N1CCCCC1
      c1ccc2[nH]ccc2c1
      CC(=O)NC1CCCCC1
      images/download/attachments/1806400/amines.png


      Isocyanates

      CCOC(=O)CCCN=C=O
      CCOc1ccc(N=C=O):c(c1)N(=O)=O
      O=C=NC1c2ccccc2-c3ccccc13
      O=C=Nc1ccc2OCOc2c1
      C[C@H](N=C=O)c1cccc2ccccc12
      O=C=N[C@@H]1C[C@H]1c2ccccc2
      images/download/attachments/1806400/Isocyanates.png


      Look at the amines. The first and the second amine satisfy the condition. In the next two, the nitrogen is not attached to a hydrogen, so they do not satisfy the reaction equation itself. The remaining two do not satisfy the reactivity rule above. In the examples below, Reactor is run with multiple reactants in sequential mode: the first amine is paired with the first isocyanate, then the second amine with the second isocyanate, and so on.
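      The sequential pairing described above corresponds to walking the two reactant lists in lockstep. A minimal Python sketch of the pairing only (no chemistry is performed; the variable names are illustrative):

```python
# Reactant lists copied from amines.smiles and isocyanates.smiles above.
amines = [
    "Nc1ccccc1OC(F)F",
    "N[C@H]1CCOC1=O",
    "c1ccncc1",
    "CC(C)N1CCCCC1",
    "c1ccc2[nH]ccc2c1",
    "CC(=O)NC1CCCCC1",
]
isocyanates = [
    "CCOC(=O)CCCN=C=O",
    "CCOc1ccc(N=C=O):c(c1)N(=O)=O",
    "O=C=NC1c2ccccc2-c3ccccc13",
    "O=C=Nc1ccc2OCOc2c1",
    "C[C@H](N=C=O)c1cccc2ccccc12",
    "O=C=N[C@@H]1C[C@H]1c2ccccc2",
]

# Sequential mode: the i-th amine meets the i-th isocyanate, nothing else.
pairs = list(zip(amines, isocyanates))
for index, (amine, isocyanate) in enumerate(pairs, start=1):
    print(index, amine, isocyanate)
```

      Six amines and six isocyanates therefore give six candidate pairs; whether a pair produces a result row depends on the reaction equation and the reactivity rule.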

      Run Reactor by:

      react -r amine+isocyanate.rdf amines.smiles isocyanates.smiles -t reaction

      The result is:

      Nc1ccccc1OC(F)F.CCOC(=O)CCCN=C=O>>CCOC(=O)CCCNC(=O)Nc1ccccc1OC(F)F
      
      N[C@H]1CCOC1=O.CCOc1ccc(N=C=O)c(c1)N(=O)=O>>CCOc1ccc(NC(=O)N[C@H]2CCOC2=O)c(c1)N(=O)=O

      You can see that only the first two amines have been processed:

      images/download/attachments/1806400/Amine_isocyanate_result.png

      For comparison, run Reactor without reaction rules (option -n):

      react -r amine+isocyanate.rdf amines.smiles isocyanates.smiles -t reaction -n r

      We have 4 result rows, corresponding to the first two and the last two amines, since the reaction equation is not satisfied for the third and the fourth amines:

      Nc1ccccc1OC(F)F.CCOC(=O)CCCN=C=O>>CCOC(=O)CCCNC(=O)Nc1ccccc1OC(F)F
      
      N[C@H]1CCOC1=O.CCOc1ccc(N=C=O)c(c1)N(=O)=O>>CCOc1ccc(NC(=O)N[C@H]2CCOC2=O)c(c1)N(=O)=O
      
      c1ccc2[nH]ccc2c1.C[C@H](N=C=O)c1cccc2ccccc12>>C[C@H](NC(=O)[nH]1ccc2ccccc12)c3cccc4ccccc34
      
      CC(=O)NC1CCCCC1.O=C=N[C@@H]1C[C@H]1c2ccccc2>>CC(=O)N(C1CCCCC1)C(=O)N[C@@H]2C[C@H]2c3ccccc3
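      The two row counts follow directly from the per-amine discussion. The bookkeeping can be sketched as below; the flags are transcribed from the text of this example rather than computed, and None marks a value the text does not state.

```python
# (amine, satisfies reaction equation, satisfies reactivity rule)
amine_status = [
    ("Nc1ccccc1OC(F)F", True, True),
    ("N[C@H]1CCOC1=O", True, True),
    ("c1ccncc1", False, None),          # no N-H: the equation itself fails
    ("CC(C)N1CCCCC1", False, None),     # no N-H: the equation itself fails
    ("c1ccc2[nH]ccc2c1", True, False),  # rejected by the reactivity rule
    ("CC(=O)NC1CCCCC1", True, False),   # rejected by the reactivity rule
]

# Default run: both the equation and the rule must hold.
with_rules = [smiles for smiles, equation, rule in amine_status
              if equation and rule]

# Run with "-n r": only the equation must hold.
without_rules = [smiles for smiles, equation, rule in amine_status
                 if equation]

print(len(with_rules), len(without_rules))
```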
    2. In our next example the reaction is stored in acid-halide+nucleophile.rdf and the corresponding conditional rule is applied to nucleophiles, the second reactant of the reaction:

      !match(ratom(3), "[#6][N,O,S:1][N,O,S:2]", 1) && !match(ratom(3), "[N,O,S:1][C,P,S]=[N,O,S]", 1)

      The reaction with atom mapping is shown below:

      images/download/attachments/1806400/Acilhalide_nucleophile.png

      The two matching conditions say that the reaction center of the nucleophile (reactant atom with map 3) should not match the atom with map 1 in any of the following molecule structures:

      images/download/attachments/1806400/Acilhalide_nucleophile_m1.png images/download/attachments/1806400/Acilhalide_nucleophile_m2.png

      The reactant files are acid-halides.smiles and nucleophiles.smiles. Several sample molecules from each are shown below:

      Acid-halides

      CCCCCCCCCC(Cl)=O
      FC(F)(F)c1cccc(\C=C\C(Cl)=O)c1
      ClC(=O)c1ccc2ccccc2n1
      ClC(=O)C(=O)c1c[nH]c2ccccc12
      images/download/attachments/1806400/acilhalides.png

      Nucleophiles

      CCCCN
      CCN(CC)CCNCCN
      CC1CC1CO
      CC(O)CS
      images/download/attachments/1806400/nucleophiles.png

      Run Reactor in combinatorial mode (-m comb), extract the first product (-x 1):

      react -r acid-halide+nucleophile.rdf acid-halides.smiles nucleophiles.smiles -m comb -x 1 -o acid-halide+nucleophile_result.smiles

      The result is stored in acid-halide+nucleophile_result.smiles. Some sample products are shown below:

      CCCCCCCCCC(=O)NCCCC
      CCCCCCCCCC(=O)N(CCN)CCN(CC)CC
      CCCCCCCCCC(=O)OCC1CC1C
      CCN(CC)CCNCCNC(=O)\C=C\c1cccc(c1)C(F)(F)F
      images/download/attachments/1806400/Acilhalide_nucleophile_results.png
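      In contrast to sequential mode, combinatorial mode pairs every molecule of the first reactant file with every molecule of the second. A minimal Python sketch of the pairing for the four sample molecules of each file listed above (pairing only, no chemistry; the variable names are illustrative):

```python
from itertools import product

# Sample molecules copied from the listings above. The raw string keeps the
# backslashes of the double-bond stereo notation intact.
acid_halides = [
    "CCCCCCCCCC(Cl)=O",
    r"FC(F)(F)c1cccc(\C=C\C(Cl)=O)c1",
    "ClC(=O)c1ccc2ccccc2n1",
    "ClC(=O)C(=O)c1c[nH]c2ccccc12",
]
nucleophiles = ["CCCCN", "CCN(CC)CCNCCN", "CC1CC1CO", "CC(O)CS"]

# Combinatorial mode: every acid halide meets every nucleophile.
combinations = list(product(acid_halides, nucleophiles))
print(len(combinations))
```

      The number of pairs is only an upper bound on the number of reacting pairs. A pair contributes no product when the reaction equation or the rule is not satisfied, and several products when the nucleophile has more than one accepted reaction center.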

      By default, Reactor filters product repetitions resulting from processing symmetric reaction centers. To increase efficiency, it is sometimes useful to allow product repetitions by skipping this duplicate check with the -w option. Run Reactor with this option:

      react -r acid-halide+nucleophile.rdf acid-halides.smiles nucleophiles.smiles -m comb -x 1 -w -o acid-halide+nucleophile_result_dup.smiles

      Observe that the result file with duplicates, acid-halide+nucleophile_result_dup.smiles, contains 190 structures, while the result file without duplicates, acid-halide+nucleophile_result.smiles, contains only 130 structures.
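      The effect of the duplicate check can be pictured with a small Python sketch. It is an assumption-laden illustration, not Reactor's actual check: it compares product strings literally, which only works if identical structures are written as identical SMILES, and the symmetric diamine product used here is a hypothetical example that does not come from the reactant files.

```python
def filter_repetitions(products):
    """Drop repeated product strings, keeping first occurrences in order."""
    return list(dict.fromkeys(products))


# Hypothetical case: a symmetric diamine (NCCN) can react at either nitrogen,
# and both reaction centers lead to the same product.
raw_products = [
    "CCCCCCCCCC(=O)NCCN",   # reaction at the first nitrogen
    "CCCCCCCCCC(=O)NCCN",   # reaction at the second, equivalent nitrogen
    "CCCCCCCCCC(=O)NCCCC",  # product of a different nucleophile
]
unique_products = filter_repetitions(raw_products)
print(len(raw_products), len(unique_products))
```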

      Now run Reactor without reaction rules (option -n):

      react -r acid-halide+nucleophile.rdf acid-halides.smiles nucleophiles.smiles -m comb -x 1 -n r -o acid-halide+nucleophile_result_norules.smiles

      The result is stored in acid-halide+nucleophile_result_norules.smiles.

      Observe that with the application of reaction rules we have 130 products in acid-halide+nucleophile_result.smiles, while ignoring reaction rules results in 180 products in acid-halide+nucleophile_result_norules.smiles.
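      To check such counts yourself, count the structures in each result file. The helper below is a small sketch that assumes one SMILES per line and ignores blank lines; count_structures is a hypothetical name, not part of Reactor.

```python
from pathlib import Path


def count_structures(path):
    """Count non-empty lines, assuming the file holds one SMILES per line."""
    lines = Path(path).read_text().splitlines()
    return sum(1 for line in lines if line.strip())
```

      For example, after the runs above, count_structures("acid-halide+nucleophile_result.smiles") and count_structures("acid-halide+nucleophile_result_norules.smiles") can be compared directly.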

    3. A generalization of the amine+isocyanate reaction in our first reactivity rule example can be obtained if we look at the nucleophile+isocyanate reaction and use the following rule:

      !match(ratom(3), "[#6][N,O,S:1][N,O,S:2]", 1)
      && !match(ratom(3), "[N,O,S:1][C,P,S]=[N,O,S]", 1)
      images/download/attachments/1806400/Isocyanate_nucleophile.png

      The reaction file is isocyanate+nucleophile.rdf. The reactant files are isocyanates_more.smiles and nucleophiles.smiles. Four sample molecules from each are shown below:

      Isocyanates

      CCOC(=O)CCCN=C=O
      CCOc1ccc(N=C=O):c(c1)N(=O)=O
      CCN=C=O
      S=C=Nc1cccnc1
      images/download/attachments/1806400/Isocyanates_more.png

      Nucleophiles

      CCCCN
      CCN(CC)CCNCCN
      CC1CC1CO
      CC(O)CS
      images/download/attachments/1806400/nucleophiles.png

      Run Reactor in combinatorial mode (-m comb):

      react -r isocyanate+nucleophile.rdf isocyanates_more.smiles nucleophiles.smiles -m comb -t reaction -o isocyanate+nucleophile_result.smiles

      The result is stored in isocyanate+nucleophile_result.smiles. The two sample results below show that the same reactants can be transformed to different products by choosing different reaction centers:

      CCOC(=O)CCCN=C=O.CCN(CC)CCNCCN>>CCOC(=O)CCCNC(=O)N(CCN)CCN(CC)CC
      CCOC(=O)CCCN=C=O.CCN(CC)CCNCCN>>CCOC(=O)CCCNC(=O)NCCNCCN(CC)CC
      images/download/attachments/1806400/Isocyanate_nucleophile_result.png
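      Reactant sets that lead to more than one product can be found by grouping the reaction SMILES on their reactant side. A minimal Python sketch using the two sample lines above; it assumes each line has the form reactants>>products with no agents between the two ">" characters.

```python
from collections import defaultdict

reactions = [
    "CCOC(=O)CCCN=C=O.CCN(CC)CCNCCN>>CCOC(=O)CCCNC(=O)N(CCN)CCN(CC)CC",
    "CCOC(=O)CCCN=C=O.CCN(CC)CCNCCN>>CCOC(=O)CCCNC(=O)NCCNCCN(CC)CC",
]

# Map each reactant side to the list of product sides seen for it.
products_by_reactants = defaultdict(list)
for reaction in reactions:
    reactant_side, product_side = reaction.split(">>")
    products_by_reactants[reactant_side].append(product_side)

for reactant_side, product_sides in products_by_reactants.items():
    print(reactant_side, "->", len(product_sides), "products")
```

      Here both lines share the same reactant side, so the grouping yields a single key with two products, one for each reaction center of the nucleophile that was used.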

      For comparison, run Reactor without reaction rules (option -n):

      react -r isocyanate+nucleophile.rdf isocyanates_more.smiles nucleophiles.smiles -m comb -t reaction -n r -o isocyanate+nucleophile_result_norules.smiles

      Compare the number of results: with the application of the rules we have 169 results, while without the rules we have 234 results.
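      The effect of the rules in the second and third examples can be put side by side with a few lines of Python. The counts are the ones reported in the text; the dictionary layout and names are illustrative.

```python
# (results with rules, results without rules), as reported in the text.
counts = {
    "acid-halide+nucleophile": (130, 180),
    "isocyanate+nucleophile": (169, 234),
}

# Number of results that the reactivity rule filters out in each example.
rejected_by_rules = {
    name: without_rules - with_rules
    for name, (with_rules, without_rules) in counts.items()
}

for name, (with_rules, without_rules) in counts.items():
    share = rejected_by_rules[name] / without_rules
    print(f"{name}: {rejected_by_rules[name]} rejected ({share:.1%})")
```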

    Note that reactions with rules can be defined either in an RDF/MRV file, with the rule specified in an RDF/MRV tag as shown above, or in the reaction string, as shown in the Selectivity rule examples below.

    The use and meaning of command-line options in the above commands:

    Option  Description                                           Default
    -n      ignore reaction rules ('r', 's', 't', 'rs' or 'rt')   process reaction rules
    -w      allow duplicate product lists                         filter product repetitions
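    When the same run is repeated with and without these options, it can help to assemble the command line programmatically. The sketch below only builds the command string and does not execute anything; react_command is a hypothetical helper, and it uses only options that appear in the commands above.

```python
import shlex


def react_command(reaction_file, reactant_files, *options):
    """Build a react command line from the pieces used in the examples."""
    return shlex.join(["react", "-r", reaction_file, *reactant_files, *options])


reactants = ["amines.smiles", "isocyanates.smiles"]

# Default behaviour: reaction rules are processed.
with_rules = react_command("amine+isocyanate.rdf", reactants,
                           "-t", "reaction")

# Same run, ignoring the reactivity rule.
without_rules = react_command("amine+isocyanate.rdf", reactants,
                              "-t", "reaction", "-n", "r")

print(with_rules)
print(without_rules)
```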