Task Manager

    {info} Task Manager schedules special long-running tasks for other web services.

    This documentation describes installation, administration and usage of Task Manager.

    Task Manager is available in two modes:

    1. As part of a microservices system
    2. As a standalone web application

    In both modes, it has to be connected to one or more managed modules.

    Managed chemical functional modules

    Task Manager does not provide chemical functionality on its own, but rather manages other chemical functional modules, so these modules must be running along with Task Manager.

    The managed modules you have to start depend on the endpoints of Task Manager that you want to use:

    | Endpoints | Managed module |
    |---|---|
    | /rest-v1/work-queue/db/* | DB Web Services |
    | /rest-v1/work-queue/reactor/* | Reactor Web Services |

    Note: more modules will be supported in the future.

    Microservices system mode

    In microservices system mode, Task Manager runs together with the Config, Discovery and Gateway services. These three services are mandatory, and optionally other services can also be part of the system, including the chosen managed modules. All configuration must be done in the Config service.

    The default configuration applies to the microservices system mode.

    The web application runs on host <server-host> and listens on port <gateway-server-port>.

    In microservices system mode, Task Manager and the managed modules connect automatically.

    Standalone web application mode

    In standalone web application mode, Task Manager and the managed modules run without the Config, Discovery and Gateway services (however, the installer installs them as well).

    The default configuration must be changed for standalone web application mode: set eureka.client.enabled=false in the application.properties file of Task Manager and of each managed module.

    The address of each managed module must be set in http://<server-host>:<server-port> format in the application.properties file of Task Manager.

    | Managed module | application.properties | Example |
    |---|---|---|
    | DB Web Services | com.chemaxon.taskmanager.service.jwsdb | http://localhost:8062 |
    | Reactor Web Services | com.chemaxon.taskmanager.service.jws-reactor | http://localhost:8067 |

    The managed modules also need access to Task Manager. The com.chemaxon.taskmanager.service property in their application.properties file needs to be set in http://<server-host>:<server-port> format. Example: http://localhost:8068
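
    The standalone settings described above can be collected in one place. The following is a minimal sketch, not part of the product: the helper name standalone_properties is hypothetical, and the addresses are the example values from this page. It only assembles the property lines; writing them into the application.properties files is left to you.

```python
def standalone_properties(task_manager_url, module_urls):
    """Return (Task Manager lines, managed module lines) for standalone mode.

    module_urls maps a Task Manager property key to the address of a managed module.
    """
    # Both sides switch off service discovery.
    task_manager_lines = ["eureka.client.enabled=false"]
    # Task Manager needs the address of every managed module it should reach.
    for key, url in module_urls.items():
        task_manager_lines.append(f"{key}={url}")
    # Each managed module needs the address of Task Manager.
    module_lines = [
        "eureka.client.enabled=false",
        f"com.chemaxon.taskmanager.service={task_manager_url}",
    ]
    return task_manager_lines, module_lines


task_manager_lines, module_lines = standalone_properties(
    "http://localhost:8068",
    {
        "com.chemaxon.taskmanager.service.jwsdb": "http://localhost:8062",
        "com.chemaxon.taskmanager.service.jws-reactor": "http://localhost:8067",
    },
)
print("\n".join(task_manager_lines))
print("\n".join(module_lines))
```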

    Download

    See here.

    Software requirements

    See here.

    Installation

    See here.

    The module is installed into the folder jws/jws-taskmanager.

    Licenses

    See here.

    Logging

    See here.

    Configuration

    Default configuration:

    application.properties:

    server.port=8068
    logging.file.name=../logs/jws-taskmanager.log
    eureka.client.enabled=true

    bootstrap.properties:

    spring.cloud.config.failFast=true
    spring.cloud.config.uri=${CONFIG_SERVER_URI:http\://localhost\:8888/}
    spring.cloud.config.retry.initialInterval=3000
    spring.cloud.config.retry.multiplier=1.2
    spring.cloud.config.retry.maxInterval=60000
    spring.cloud.config.retry.maxAttempts=100

    For more configuration options, see the Spring documentation page.

    Database configuration

    The Task Manager service has its own database, which stores the added structures and the task statuses. H2 and PostgreSQL databases are supported.

    Default configuration:

    | application.properties | Description |
    |---|---|
    | spring.datasource.url=${CXN_TASK_JDBC_URL:jdbc:h2:file:./data/task_db} | |
    | spring.datasource.driverClassName=${CXN_TASK_DRIVER:org.h2.Driver} | Use org.postgresql.Driver for PostgreSQL |
    | spring.datasource.username=${CXN_TASK_JDBC_USER:user} | |
    | spring.datasource.password=${CXN_TASK_JDBC_PASSWORD:password} | |
    | spring.jpa.database-platform=${CXN_TASK_DIALECT:org.hibernate.dialect.H2Dialect} | Use org.hibernate.dialect.PostgreSQLDialect for PostgreSQL |
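
    The ${NAME:default} values above are placeholders: the part before the first colon is an environment variable name, and the rest is the value used when that variable is not set. The following sketch imitates this resolution for illustration only. It is a simplification (no nested placeholders), the function name resolve is hypothetical, and the PostgreSQL URL is an invented example value.

```python
import re

# ${NAME} or ${NAME:default}; the default may itself contain colons.
PLACEHOLDER = re.compile(r"\$\{([^:}]+)(?::([^}]*))?\}")


def resolve(value, env):
    """Replace each placeholder with env[NAME], or with its default if NAME is unset."""
    def substitute(match):
        name, default = match.group(1), match.group(2)
        if name in env:
            return env[name]
        if default is not None:
            return default
        raise KeyError(f"no value and no default for {name}")
    return PLACEHOLDER.sub(substitute, value)


setting = "${CXN_TASK_JDBC_URL:jdbc:h2:file:./data/task_db}"

# Without the environment variable, the H2 default applies.
default_url = resolve(setting, {})

# With the variable set (hypothetical PostgreSQL address), the default is ignored.
postgres_url = resolve(
    setting, {"CXN_TASK_JDBC_URL": "jdbc:postgresql://localhost:5432/taskdb"}
)
print(default_url)
print(postgres_url)
```

    In practice this means that switching to PostgreSQL can be done by setting the CXN_TASK_* environment variables, without editing application.properties.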

    File and S3 import configuration

    Task Manager can import structures from tables that were exported from DB Web Services. The exported data can be uploaded from a file or from an S3 bucket.

    AWS credential configuration details can be found here.

    The import can be configured with the properties below.

    | application.properties | Description |
    |---|---|
    | com.chemaxon.webservices.db.import.dbImportStrategy=FILE | Specifies whether to use FILE-based or S3-based import. Default: FILE |
    | com.chemaxon.webservices.db.import_export.dir=./data/export | Path of the import files when the FILE import strategy is used. |
    | com.chemaxon.webservices.db.import.s3BuckeBasetUrl:s3://export-bucket/ | S3 bucket of the import when the S3 import strategy is used. |

    DB Web Services configuration

    The following configuration can be added to the application.properties file of DB Web Services to control its communication with Task Manager.

    | application.properties | Description |
    |---|---|
    | com.chemaxon.taskmanager.scheduler.enabled=false | When Task Manager starts a job in DB Web Services, only one DB Web Services instance starts processing, regardless of how many instances are running. If this attribute is true, all DB Web Services instances check for running tasks and join the processing (e.g. molecule import). If a DB Web Services instance is restarted while a task is in progress, the processing continues only if this attribute is true. Default value is false. Enabling it is highly recommended when Task Manager is used. |
    | com.chemaxon.taskmanager.scheduler.frequency=60000 | Interval, in milliseconds, at which DB Web Services tries to join a running task if com.chemaxon.taskmanager.scheduler.enabled=true. |
    | com.chemaxon.taskmanager.service=http://${TASKMANAGER_SERVICE_URL} | When DB Web Services runs in standalone mode, it connects to Task Manager via this URL. |
    | com.chemaxon.taskmanager.db.batchSize:5000 | Number of structures that DB Web Services requests from Task Manager for processing in one batch. |

    Reactor Web Services configuration

    The Task Manager service has its own algorithm that creates batches based on the requestLimit and responseLimit properties. The tighter of the two constraints determines the batch.

    Default configuration:

    | application.properties | Description |
    |---|---|
    | com.chemaxon.taskmanager.reactor.requestLimit=1000 | Maximum number of molecules in a batch |
    | com.chemaxon.taskmanager.reactor.responseLimit=1000 | Maximum number of results that an executed batch can produce |

    The following configuration can be added to the application.properties file of Reactor Web Services to control its communication with Task Manager.

    | application.properties | Description |
    |---|---|
    | com.chemaxon.reactor.task.service.scheduler.enabled=false | When Task Manager starts a job in Reactor Web Services, only one Reactor Web Services instance starts processing, regardless of how many instances are running. If this attribute is true, all Reactor Web Services instances check for running tasks and join the processing (e.g. react). If a Reactor Web Services instance is restarted while a task is in progress, the processing continues only if this attribute is true. Default value is false. |
    | com.chemaxon.reactor.task.service.scheduler.frequency=60000 | Interval, in milliseconds, at which Reactor Web Services tries to join a running task if com.chemaxon.reactor.task.service.scheduler.enabled=true. |
    | com.chemaxon.taskmanager.service=http://${TASKMANAGER_SERVICE_URL} | When Reactor Web Services runs in standalone mode, it connects to Task Manager via this URL. |

    Retry mechanism

    Task Manager uses a retry mechanism when it communicates with a managed module. It can be adjusted with these two properties in the application.properties file of Task Manager:

    | application.properties | Description |
    |---|---|
    | com.chemaxon.retrytemplate.backOffPeriod=10000 | Time in milliseconds that the service waits between two attempts to send the request to the other service when no response was received. |
    | com.chemaxon.retrytemplate.maxAttempts=20 | Maximum number of attempts, including the very first call. |

    With the default values above, the service tries to get a response for about 3 minutes.
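
    The "about 3 minutes" figure follows from the two properties. The sketch below shows the arithmetic under two assumptions: the waiting time between attempts is constant, and the time spent on the requests themselves is ignored. The function name retry_window_ms is hypothetical.

```python
def retry_window_ms(back_off_period_ms, max_attempts):
    """Total waiting time between the first and the last attempt."""
    # maxAttempts includes the first call, so there is one wait fewer than attempts.
    waits = max(max_attempts - 1, 0)
    return waits * back_off_period_ms


window_ms = retry_window_ms(back_off_period_ms=10000, max_attempts=20)
print(window_ms / 1000, "seconds")        # 19 waits of 10 seconds each
print(round(window_ms / 60000, 2), "minutes")
```

    The same calculation can be used in reverse: to tolerate a longer restart of a managed module, raise maxAttempts or backOffPeriod until the window covers the expected downtime.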

    When a managed module is restarted during task execution, structures can get stuck in the IN_PROGRESS state. Task Manager reprocesses these structures according to the configuration below.

    | application.properties | Description |
    |---|---|
    | com.chemaxon.taskmanager.scheduler.frequency=60000 | Interval, in milliseconds, of the Task Manager job that identifies stuck structures. |
    | com.chemaxon.taskmanager.scheduler.timelimit=600000 | Time limit before Task Manager tries to reprocess a stuck structure. |
    | com.chemaxon.taskmanager.scheduler.retrylimit=3 | Maximum number of retries for processing a structure. |
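
    One way to read these three properties together is sketched below. This is an illustration of the described behavior, not the actual implementation: the StructureState class, its field names, the assumption that timelimit is given in milliseconds, and the exact comparison rules are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class StructureState:
    status: str           # for example "IN_PROGRESS"
    last_update_ms: int   # hypothetical: when processing of the structure started
    retries: int          # hypothetical: how often it has been reprocessed so far


def should_reprocess(state, now_ms, time_limit_ms=600000, retry_limit=3):
    """Decide whether a periodic check would pick this structure up again."""
    if state.status != "IN_PROGRESS":
        return False
    # Stuck means: in progress for at least the configured time limit.
    stuck = now_ms - state.last_update_ms >= time_limit_ms
    # Give up once the retry limit has been reached.
    return stuck and state.retries < retry_limit


fresh = StructureState("IN_PROGRESS", last_update_ms=0, retries=0)
print(should_reprocess(fresh, now_ms=60000))    # one minute in: not stuck yet
print(should_reprocess(fresh, now_ms=600000))   # ten minutes in: reprocess
```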

    High Availability (HA)

    Running multiple instances of the Task Manager service provides HA and load balancing.

    HA mode requires a PostgreSQL database. It is not supported with an H2 database.

    Running the server

    Prerequisites in case of microservices system mode:

    1. Config service is running
    2. Discovery service is running
    3. Gateway service is running

    Run the service from the command line in the folder jws/jws-taskmanager:

    jws-taskmanager-service start (on Windows)

    jws-taskmanager-service start (on Linux)

    or

    run-jws-taskmanager.exe (on Windows)

    run-jws-taskmanager (on Linux)

    API documentation

    Find and try out the API on the Swagger UI.

    | Mode | URL of Swagger UI | Default URL of Swagger UI |
    |---|---|---|
    | microservices system | <serverhost>:<gateway-port>/jws-taskmanager/API/ | localhost:8080/jws-taskmanager/API/ |
    | standalone web application mode | <serverhost>:<server-port>/API/ | localhost:8068/API/ |

    Usage

    The Swagger UI API documentation of your installed module lists the methods and the syntax for accessing the functionality of Task Manager.

    DB Web Services tasks

    Task Manager can be used to import structures into an already existing table. The table must be created at the /rest-v1/db/additional/createTable/{tableName} endpoint of DB Web Services.

    The structures can be added via a REST endpoint, or can be uploaded from file or from an S3 bucket.

    All uploaded structures have an automatically generated, unique key, which is only used by Task Manager and does not affect the behavior of the import. It can be used to query or delete structures.

    If the structures have their own ID property, this property can also be set as an identifier in the table. The Add structure endpoint (PUT - /rest-v1/work-queue/db/task/{taskId}) has an optional id attribute for this purpose. It is recommended to provide the id property.

    Example - Import structure

    1. Create task

    PUT - /rest-v1/work-queue/db/task

    {
      "params": {
        "isDuplicateFiltering": "true",
        "tableName": "mytable"
      },
      "task": "batchInsert"
    }
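
    The call above can be prepared from a script. The sketch below builds the PUT request with the Python standard library without sending it. BASE_URL assumes the default standalone port from this page, the Content-Type header and the helper name create_db_task_request are assumptions, and the response format is not covered here.

```python
import json
import urllib.request

# Assumption: standalone mode with the default server.port of Task Manager.
BASE_URL = "http://localhost:8068"


def create_db_task_request(table_name, duplicate_filtering=True):
    """Build the request that creates a batchInsert task for an existing table."""
    body = {
        "params": {
            # The documented example passes this flag as a string.
            "isDuplicateFiltering": "true" if duplicate_filtering else "false",
            "tableName": table_name,
        },
        "task": "batchInsert",
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/rest-v1/work-queue/db/task",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = create_db_task_request("mytable")
print(request.get_method(), request.full_url)
# To send it to a running server: urllib.request.urlopen(request)
```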

    2. Add structures with ID

    PUT - /rest-v1/work-queue/db/task/{taskId}

    {
      "inputFormat": "smiles",
      "structures": [
        {
          "id": 1,
          "structure": "O=C1CCCC=C1"
        },
        {
          "id": 2,
          "structure": "C1CCCCC1"
        }
      ]
    }

    Note: the provided structure ID is used as the ID in the DB Web Services import.

    3. Manage task (Start and Pause)

    You can start the task with the POST call below.

    POST - /rest-v1/work-queue/db/task/{taskId}/manage

    {
      "action": "START"
    }

    After the start call, the task status becomes "WAITING", which means that the task is in the execution queue. Once the task starts executing, its status becomes "IN_PROGRESS". A task in "WAITING" or "IN_PROGRESS" status can be paused with the call below; it then transitions to the "PAUSED" status and its execution stops. A "PAUSED" task can be resumed by calling start again as shown above.

    Pause call:

    POST - /rest-v1/work-queue/db/task/{taskId}/manage

    {
      "action": "PAUSE"
    }
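
    The status changes caused by the two actions can be summarized as a small lookup table. This sketch only covers the transitions described above. "NEW" is a placeholder for a task that has not been started yet (the actual status name is not given here), and raising an error for other combinations is an assumption, not documented server behavior.

```python
# (current status, action) -> status after the manage call
TRANSITIONS = {
    ("NEW", "START"): "WAITING",       # "NEW" is a placeholder name
    ("PAUSED", "START"): "WAITING",    # resuming puts the task back in the queue
    ("WAITING", "PAUSE"): "PAUSED",
    ("IN_PROGRESS", "PAUSE"): "PAUSED",
}


def next_status(status, action):
    """Return the status that follows a manage call, per the description above."""
    try:
        return TRANSITIONS[(status, action)]
    except KeyError:
        raise ValueError(f"{action} is not described for status {status}") from None


print(next_status("NEW", "START"))
print(next_status("IN_PROGRESS", "PAUSE"))
print(next_status("PAUSED", "START"))
```

    The move from "WAITING" to "IN_PROGRESS" is not in the table because it is triggered by the execution queue, not by a manage call.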

    4. Monitor task progress

    GET - /rest-v1/work-queue/db/task/{taskId}

    While the task is in progress, the response contains progress information.

    {
      "status": "IN_PROGRESS",
      "processedPercentage": 12
    }

    When the import is done, the task status becomes READY.

    {
      "status": "READY"
    }

    If the task has any issues, the status is updated to ERROR, and the response contains the number of processed and failed inputs.

    {
      "status": "ERROR",
      "processedSuccess": 0,
      "processedError": 2
    }
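
    A client usually repeats the monitoring call until the task reaches READY or ERROR. The following is a minimal polling sketch. The function wait_for_task and its parameters are hypothetical, and the HTTP call is passed in as fetch_status so that the loop can be shown here with canned responses instead of a running server.

```python
import time

TERMINAL_STATUSES = {"READY", "ERROR"}


def wait_for_task(fetch_status, poll_seconds=5.0, max_polls=120, sleep=time.sleep):
    """Call fetch_status() until the task is READY or ERROR, then return the response.

    fetch_status must return the parsed JSON of the monitoring endpoint.
    """
    for _ in range(max_polls):
        response = fetch_status()
        if response["status"] in TERMINAL_STATUSES:
            return response
        sleep(poll_seconds)
    raise TimeoutError("task did not reach READY or ERROR in time")


# Demonstration with the example responses from above instead of real HTTP calls.
canned = iter([
    {"status": "IN_PROGRESS", "processedPercentage": 12},
    {"status": "READY"},
])
final = wait_for_task(lambda: next(canned), sleep=lambda seconds: None)
print(final["status"])
```

    Note that a PAUSED task is not treated as finished in this sketch, so polling a paused task ends with the TimeoutError.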

    5. Task result

    GET - /rest-v1/work-queue/db/task/{taskId}/results?resultIdType=ALL

    When the task has finished, the results can be retrieved.

    Example result in case of success:

    {
      "failedIds": [],
      "failedKeys": [],
      "duplicatedIds": [],
      "duplicatedKeys": [],
      "successfulIds": [
        1,
        2
      ],
      "successfulKeys": [
        "s100",
        "s105"
      ]
    }

    Note: if IDs were not provided when the structures were added, the "successfulIds" list will contain the automatically generated IDs.

    Example result in case of error:

    {
      "failedIds": [
        1,
        2
      ],
      "failedKeys": [
        "s100",
        "s105"
      ],
      "duplicatedIds": [],
      "duplicatedKeys": [],
      "successfulIds": [],
      "successfulKeys": []
    }
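
    For larger imports, the result lists are easier to handle as counts. The sketch below summarizes a result of the shape shown above; the helper name summarize_results is hypothetical, and the input is the documented success example.

```python
def summarize_results(result):
    """Count the structures per outcome in a task result."""
    return {
        "successful": len(result["successfulIds"]),
        "duplicated": len(result["duplicatedIds"]),
        "failed": len(result["failedIds"]),
    }


success_example = {
    "failedIds": [],
    "failedKeys": [],
    "duplicatedIds": [],
    "duplicatedKeys": [],
    "successfulIds": [1, 2],
    "successfulKeys": ["s100", "s105"],
}
summary = summarize_results(success_example)
print(summary)
```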

    Reactor Web Services tasks

    Task Manager can be used to execute a reaction on structures that have already been added.

    The structures can be added via a REST endpoint, or can be uploaded from file or from an S3 bucket.

    All uploaded structures have an automatically generated, unique key, which is only used by Task Manager and does not affect the behavior of the import. It can be used to query or delete structures.

    The structures may have their own ID property, which is the identifier in the reaction results. The Add structure endpoint (PUT - /rest-v1/work-queue/reactor/task/{taskId}) has an optional id attribute for this purpose. It is recommended to provide the id property.

    When uploading structures, it is mandatory to specify the position, which indicates the position of the uploaded reactant lists in relation to each other.

    Example - Execute reaction

    1. Create task

    PUT - /rest-v1/work-queue/reactor/task

    {
      "params": {
        "copyPropertyByReactant": [
          {
            "copyAs": "NewPropertyName",
            "copyFrom": 1,
            "propertyName": "PropertyName"
          }
        ],
        "outputFormat": "smiles",
        "productIndexes": [
          1
        ],
        "ratio": [
          1
        ],
        "reactantInputFormat": "smiles",
        "reaction": "[#6:8]-[#6:1](=[O:3])-[#6:2](-[#6:9])=[O:6]>>[H:5][#8:4]-[#6:1](=[O:3])[C:2]([#6:8])([#6:9])[#8:6][H:7]",
        "resultType": "product",
        "showUnsuccessfulReactions": true,
        "unambiguousOnly": false
      },
      "task": "react"
    }

    2. Add structures with ID and position

    PUT - /rest-v1/work-queue/reactor/task/{taskId}

    {
      "inputFormat": "smiles",
      "position": 0,
      "structures": [
        {
          "id": 1,
          "structure": "OC(=O)CC(=O)C(=O)CC(O)=O"
        },
        {
          "id": 2,
          "structure": "C1CCCCC1"
        }
      ]
    }

    Note: the provided structure ID appears in the result.
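
    When a reaction takes several reactant lists, each list is uploaded in its own request with its own position. The sketch below prepares one request body per list. It assumes that the position is the zero-based index of the reactant list, as the example value 0 suggests. The helper name reactant_payloads is hypothetical, and the second reactant list is invented for illustration.

```python
def reactant_payloads(reactant_lists, input_format="smiles"):
    """Build one request body per reactant list; the list index becomes the position."""
    payloads = []
    for position, structures in enumerate(reactant_lists):
        payloads.append({
            "inputFormat": input_format,
            "position": position,
            "structures": [
                {"id": structure_id, "structure": structure}
                for structure_id, structure in structures
            ],
        })
    return payloads


payloads = reactant_payloads([
    [(1, "OC(=O)CC(=O)C(=O)CC(O)=O"), (2, "C1CCCCC1")],  # documented example list
    [(3, "CCO")],                                        # hypothetical second list
])
for payload in payloads:
    print(payload["position"], len(payload["structures"]))
```

    Each body would then be sent in a separate PUT call to /rest-v1/work-queue/reactor/task/{taskId}.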

    3. Manage task (Start and Pause)

    You can start the task with the POST call below.

    POST - /rest-v1/work-queue/reactor/task/{taskId}/manage

    {
      "action": "START"
    }

    After the start call, the task status becomes "WAITING", which means that the task is in the execution queue. Once the task starts executing, its status becomes "IN_PROGRESS". A task in "WAITING" or "IN_PROGRESS" status can be paused with the call below; it then transitions to the "PAUSED" status and its execution stops. A "PAUSED" task can be resumed by calling start again as shown above.

    Pause call:

    POST - /rest-v1/work-queue/reactor/task/{taskId}/manage

    {
      "action": "PAUSE"
    }

    4. Monitor task progress

    GET - /rest-v1/work-queue/reactor/task/{taskId}

    While the task is in progress, the response contains progress information.

    {
      "status": "IN_PROGRESS",
      "processedPercentage": 12
    }

    When the reaction is done, the task status becomes READY.

    {
      "status": "READY"
    }

    If the task has any issues, the status is updated to ERROR, and the response contains the number of processed and failed inputs.

    {
      "status": "ERROR",
      "processedSuccess": 0,
      "processedError": 2
    }

    5. Task result

    GET - /rest-v1/work-queue/reactor/task/3/results?resultIdType=SUCCESS&pageNumber=0&pageSize=20

    When the task has finished, the results can be retrieved.

    Example result in case of success:

    {
      "products": [
        "{\"result\":\"OC(=O)CC(O)(CC(O)=O)C(O)=O\",\"reactantIds\":[\"1\"]}"
      ]
    }

    Note: if IDs were not provided when the structures were added, the "reactantIds" list will contain null values.
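
    In the success response shown above, each entry of "products" is itself a JSON document encoded as a string, so it needs a second parsing step. The sketch below decodes the documented example; the helper name decode_products is hypothetical.

```python
import json


def decode_products(response):
    """Parse each JSON-encoded entry of the products list into a dictionary."""
    return [json.loads(item) for item in response["products"]]


response = {
    "products": [
        "{\"result\":\"OC(=O)CC(O)(CC(O)=O)C(O)=O\",\"reactantIds\":[\"1\"]}"
    ]
}
products = decode_products(response)
for product in products:
    # reactantIds entries can be None when no IDs were provided at upload time.
    print(product["result"], product["reactantIds"])
```

    In this example the reactant ID comes back as the string "1", although it was uploaded as the number 1, so compare IDs as strings when matching products to inputs.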

    Example result in case of error:

    GET - /rest-v1/work-queue/reactor/task/3/results?resultIdType=ERROR&pageNumber=0&pageSize=20

    {
      "results": [
        {
          "errorMessage": "Unable to use specified reaction.",
          "structures": [
            {
              "id": 1,
              "structure": "OC(=O)CC(=O)C(=O)CC(O)=O",
              "position": 0
            },
            {
              "id": 2,
              "structure": "C1CCCCC1",
              "position": 0
            }
          ]
        }
      ]
    }
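
    The error response groups the failed input structures under the message that explains the failure. The sketch below flattens a response of this shape into one row per structure. The helper name failed_inputs is hypothetical, and the input is the documented error example.

```python
def failed_inputs(response):
    """Return (id, position, error message) for every structure in an error result."""
    rows = []
    for result in response["results"]:
        for structure in result["structures"]:
            rows.append(
                (structure["id"], structure["position"], result["errorMessage"])
            )
    return rows


error_response = {
    "results": [
        {
            "errorMessage": "Unable to use specified reaction.",
            "structures": [
                {"id": 1, "structure": "OC(=O)CC(=O)C(=O)CC(O)=O", "position": 0},
                {"id": 2, "structure": "C1CCCCC1", "position": 0},
            ],
        }
    ]
}
rows = failed_inputs(error_response)
for row in rows:
    print(row)
```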