2020.10 Release Notes
askcos-core
Developer notes:
- Add functions for reordering precursors using SCScore (MR askcos-core!11)
- Add SCScore termination criteria to tree builder (Issue ASKCOS#224, MR askcos-core!13, Epic &11)
- Enable disabling of price lookup in tree builder (Issue ASKCOS#191, MR askcos-core!13, Epic &11)
- Filter templates by attribute filter (Issue askcos-core#1, MR askcos-core!14, Epic &13)
- Update Docker image Python version to 3.7 and RDKit version to 2020.03.06 (MR askcos-core!15)
- Add solvent scores to the condition recommender (Issue askcos-core#2, MR askcos-core!16, Epic &14)
- Add new version of MCTS tree builder (MR askcos-core!17)
- New pathway enumeration algorithm for tree builder (MR askcos-core!18)
- Add pathway ranking and clustering model (MR askcos-core!20)
- Improve flexibility of mongodb connection parameters (MR askcos-core!22)
- Return template rank with one-step predictions (MR askcos-core!23)
- Add QM descriptor model (MR askcos-core!24)
- Add QM-based universal regioselectivity model (MR askcos-core!24)
- Switch to C++ version of rdchiral for improved performance (MR askcos-core!25)
- Return graph attributes, including tree depth, in treedata json (Issue ASKCOS#284, MR askcos-core!26, Epic &11)
Bug fixes:
- Fix template drawing for aromatic structures (MR askcos-core!12)
- Improved tracking of different buyable sources (Issue ASKCOS#150, MR askcos-core!13)
- Tree builder no longer returns cyclic pathways (Issue ASKCOS#39, MR askcos-core!18)
- Pass template_set to retro transformer (MR askcos-core!19)
- Pass template_set to retrieve_template_metadata (Issue askcos-core#3, MR askcos-core!21, Epic &11)
Breaking changes:
- Return values changed for neural net context recommender get_n_conditions method (MR askcos-core!16)
- Return values changed for MCTS tree builder tree_status method (MR askcos-core!18)
- Return values changed for MCTS tree builder get_buyable_paths method (MR askcos-core!18)
askcos-data
Release notes:
- Update buyables data with sources (MR askcos-data!4)
- Add solvent EHS score data (MR askcos-data!5)
- Add retrosynthetic pathway ranking model (MR askcos-data!6)
- Add pistachio template relevance model and data (MR askcos-data!7)
- Add QM descriptor model (MR askcos-data!8)
- Add QM-based universal regioselectivity model (MR askcos-data!8)
askcos-site
User notes:
- Implemented Ketcher molecule editor in IPP (Issue ASKCOS#315, MR askcos-site!6, Epic &4)
- New feature to filter IPP results by reacting atoms (Issue askcos-site#2, MR askcos-site!21, Epic &4)
- Add SCScore termination criteria to tree builder (Issue ASKCOS#224, MR askcos-site!22, Epic &11)
- Enable selection of buyables source in IPP and tree builder (Issue ASKCOS#150, MR askcos-site!22)
- Expose number of pathways to return as a user option (Issue ASKCOS#235, MR askcos-site!22, Epic &11)
- Enable disabling of price lookup in tree builder (Issue ASKCOS#191, MR askcos-site!22, Epic &11)
- Support template attribute filters in IPP (Issue askcos-site#8, MR askcos-site!23, Epic &13)
- Add splash page during IPP loading (MR askcos-site!25)
- Display solvent scores in condition recommender results (Issue askcos-site#10, MR askcos-site!28, Epic &14)
- Add support for new version of MCTS tree builder (Issue ASKCOS#280, MR askcos-site!29, Epic &11)
- Implement pathway ranking and clustering model and API (MR askcos-site!34)
- Redesign tree results page using jspanel (MR askcos-site!34)
- Display precursor template rank in IPP (MR askcos-site!39)
- Add QM descriptor model and API (MR askcos-site!40)
- Add QM-based universal regioselectivity model and update general selectivity API (MR askcos-site!40)
- Add new regioselectivity page in forward synthesis module (MR askcos-site!40)
- Update navigation menu organization (MR askcos-site!42)
- Add support for specifying tree builder result format (MR askcos-site!43)
- Return tree lengths from tree builder API (Issue ASKCOS#284, MR askcos-site!43, Epic &11)
- Add support for new regioselectivity model in IPP (MR askcos-site!45)
Developer notes:
- Support custom django settings file (MR askcos-site!19)
- Support celery task priorities (Issue ASKCOS#287, MR askcos-site!20)
- Update Docker image Python version to 3.7 and RDKit version to 2020.03.06 (MR askcos-site!24)
- Move some processing of tree builder results from client to server (MR askcos-site!30)
- Switch to api/v2 in forward prediction page (Issue askcos-site#9, MR askcos-site!31, Epic &14)
- Improved flexibility of mongodb, rabbitmq, and redis connections (MR askcos-site!38)
- Switch to C++ version of rdchiral for improved performance (MR askcos-site!41)
Bug fixes:
- Enable precursor scoring using SCScore in IPP (Issue askcos-site#6, MR askcos-site!18)
- Do not load template in tree builder celery workers (MR askcos-site!38)
- Canonicalize SMILES input in IPP (Issue ASKCOS#303, MR askcos-site!44)
- Fix rabbitmq connection environment variable names (MR askcos-site!46)
Breaking changes:
- API parameters changed for /api/v2/general-selectivity/ endpoint (MR askcos-site!40)
- Graph format changed for saved tree builder results (MR askcos-site!30)
askcos-deploy
User notes:
- Add support for custom django settings file (MR askcos-deploy!40)
- Introduce Helm chart for Kubernetes deployment (Issue ASKCOS#338, ASKCOS#339, MR askcos-deploy!48)
Developer notes:
- Add new tree builder services to deployment (MR askcos-deploy!44)
- Add pathway ranking services to deployment (MR askcos-deploy!45)
- Upgrade backend service images and switch to alpine where available (MR askcos-deploy!46)
- Add pistachio template relevance TFX service to deployment (MR askcos-deploy!49)
- Add QM descriptor and new regioselectivity services to deployment (MR askcos-deploy!50)
Bug fixes:
- Use COMPOSE_PROJECT_NAME in seed-db function (MR askcos-deploy!41)
- Clear rabbitmq data when updating deployment (Issue askcos-deploy#15, MR askcos-deploy!43)
- Fix rabbitmq connection parameters in Helm chart (Issue askcos-deploy#16, MR askcos-deploy!52)
- Fix service status checking in init containers in Helm chart (Issue askcos-deploy#17, MR askcos-deploy!52)
Deprecation:
- Previous Kubernetes deployment configuration and script deprecated in favor of Helm chart (MR askcos-deploy!48)
Docker Compose Deployment
We currently support two methods for deploying ASKCOS: Docker Compose and Kubernetes. Docker Compose is a simpler method for deploying on a single workstation, while Kubernetes is more complex but is suitable for scaling across multiple nodes.
Software Prerequisites
To deploy ASKCOS using Docker Compose, you must have the following installed on your machine:
- git
- Docker (installation instructions)
- Docker Compose (installation instructions)
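Before deploying, it can help to confirm that these tools are on your PATH. The following is a minimal sketch and is not part of askcos-deploy; the missing_commands helper name is hypothetical, and the docker-compose executable name assumes the standalone Compose binary is installed.

```shell
# Report which of the given commands are not available on PATH.
# Prints one missing command name per line; prints nothing if all are found.
missing_commands() {
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "$cmd"
    fi
  done
}

# Check the tools listed above.
missing=$(missing_commands git docker docker-compose)
if [ -n "$missing" ]; then
  echo "Missing prerequisites:" $missing
else
  echo "All prerequisites found."
fi
```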
Quickstart
ASKCOS can be downloaded using deploy tokens, which provide read-only access to the source code and our container registry in GitLab. Below is a complete example showing how to deploy the ASKCOS application using deploy tokens (omitted in this example). The deploy tokens can be found on the MLPDS Member Resources ASKCOS Versions Page.
$ export DEPLOY_TOKEN_USERNAME=
$ export DEPLOY_TOKEN_PASSWORD=
$ docker login registry.gitlab.com -u $DEPLOY_TOKEN_USERNAME -p $DEPLOY_TOKEN_PASSWORD
$ git clone https://$DEPLOY_TOKEN_USERNAME:$DEPLOY_TOKEN_PASSWORD@gitlab.com/mlpds_mit/askcos/askcos-deploy.git
$ cd askcos-deploy
$ git checkout 2020.10
$ bash deploy.sh deploy
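Because the token variables are left blank in the example above, a small guard can catch the case where they were never filled in. The sketch below is illustrative only: require_env and registry_login are hypothetical helper names, not part of askcos-deploy. It also shows the standard --password-stdin option of docker login, which avoids placing the password on the command line.

```shell
# Return success only if every named environment variable is set and non-empty.
# Prints the name of each empty variable to stderr.
require_env() {
  rc=0
  for name in "$@"; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "Environment variable $name is empty" >&2
      rc=1
    fi
  done
  return $rc
}

# Log in to the registry without exposing the password in the process list.
# Defined here but not invoked; call it once the tokens are exported.
registry_login() {
  require_env DEPLOY_TOKEN_USERNAME DEPLOY_TOKEN_PASSWORD || return 1
  printf '%s' "$DEPLOY_TOKEN_PASSWORD" |
    docker login registry.gitlab.com -u "$DEPLOY_TOKEN_USERNAME" --password-stdin
}
```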
Upgrade Information
The askcos-deploy repository also provides scripts to upgrade an existing ASKCOS deployment in-place.
From v0.3.1 or above
$ git checkout 2020.10
$ bash deploy.sh update -v 2020.10
If you have not seeded the database before (e.g., when upgrading from v0.3.1), you will need to do so:
$ bash deploy.sh set-db-defaults seed-db
If you have not already done so, you should re-index the database because new index types were introduced in 2020.07 to significantly improve lookup efficiency:
$ bash deploy.sh index-db --drop-indexes
The 2020.10 release includes a new Pistachio template set and template relevance model. In order to use it, you will need to seed some new data into the mongo database:
$ bash deploy.sh seed-db -c pistachio -r pistachio --append
The 2020.10 release also includes an updated set of default buyables data. If desired, you can import the new data using the following command:
$ bash deploy.sh seed-db -b default --append
Note that this will result in some duplicate data. If you have not added custom buyables data, you can drop the existing buyables database and import the updated data by omitting the --append argument.
!>In some cases, we have seen issues with resetting rabbitmq data while upgrading to 2020.10. If you see celery workers restarting after updating and inequivalent arg 'x-max-priority' errors in worker logs, you should restart rabbitmq again using docker-compose rm -fsv rabbit && docker-compose up -d rabbit.
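The check described in this note could be scripted roughly as follows. This is a sketch under assumptions: has_priority_error and restart_rabbit_if_needed are hypothetical helper names, and scanning the last 200 log lines of all services is an arbitrary choice. Only the error text and the restart commands come from the note itself.

```shell
# Succeeds if the log text on stdin contains the x-max-priority error.
has_priority_error() {
  grep -q "inequivalent arg 'x-max-priority'"
}

# Restart rabbitmq only when the error shows up in recent logs.
# Defined here but not invoked; call it from the askcos-deploy directory.
restart_rabbit_if_needed() {
  if docker-compose logs --tail=200 2>&1 | has_priority_error; then
    docker-compose rm -fsv rabbit && docker-compose up -d rabbit
  else
    echo "No x-max-priority errors found in recent logs."
  fi
}
```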
From v0.2.x or v0.3.0
Upgrading from earlier versions of ASKCOS directly to 2020.10 has not been thoroughly tested. Instead, we suggest upgrading to v0.4.1 as an intermediate step.
$ git checkout v0.4.1
$ bash backup.sh
$ bash deploy.sh update -v 0.4.1
$ bash deploy.sh set-db-defaults seed-db
$ bash restore.sh
After upgrading to v0.4.1, you should follow the above instructions to upgrade to 2020.10.
!>Note: A large amount of data was migrated to the mongodb in v0.4.1 (chemhistorian), and seeding may take some time to complete. We send this seeding task to the background so the rest of the application can start and become functional without having to wait. If using the default set of data (i.e., using the exact commands above), you can monitor the progress of mongodb seeding using bash deploy.sh count-mongo-docs, which will tell you how many documents have been seeded out of the expected number. Complete seeding is not necessary for application functionality unless you use the chemical popularity logic in the tree builder.
First Time Deployment
Deploying the Web Application
Deployment is initiated by a bash script that runs a few docker-compose commands in a specific order. Several database services need to be started first, and more importantly seeded with data, before other services (which rely on the availability of data in the database) can start. The deploy.sh script is provided in the askcos-deploy repository and should be run as follows:
$ bash deploy.sh command [optional arguments]
There are a number of available commands, including the following for common deploy tasks:
- deploy: runs standard first-time deployment tasks, including seed-db
- update: pulls new docker image from GitLab repository and restarts all services
- seed-db: seed the database with default or custom data files
- start: start a deployment without performing first-time tasks
- stop: stop a running deployment
- clean: stop a running deployment and remove all docker containers and volumes
For a running deployment, new data can be seeded into the database using the seed-db command along with arguments indicating the types of data to be seeded. Note that by default this will replace the existing data in the database; the --append argument used in the upgrade commands above adds to the existing data instead. The available arguments are as follows:
- -b, --buyables: specify buyables data to seed, either default or path to data file
- -c, --chemicals: specify chemicals data to seed, either default or path to data file
- -x, --reactions: specify reactions data to seed, either default or path to data file
- -r, --retro-templates: specify retrosynthetic templates to seed, either default or path to data file
- -f, --forward-templates: specify forward templates to seed, either default or path to data file
For example, to seed default buyables data and custom retrosynthetic templates, run the following from the deploy folder:
$ bash deploy.sh seed-db --buyables default --retro-templates /path/to/my.retro.templates.json.gz
To update a deployment, run the following from the deploy folder:
$ bash deploy.sh update --version x.y.z
To stop a currently running application, run the following from the deploy folder:
$ bash deploy.sh stop
If you would like to clean up and remove everything from a previous deployment (NOTE: you will lose user data), run the following from the deploy folder:
$ bash deploy.sh clean
Backing Up User Data
From v0.3.1 or above
If you are upgrading from v0.3.1 or later, the backup/restore process is no longer needed unless you are moving deployments to a new machine.
New backup and restore functions were added in askcos-deploy 2020.07 to provide more robust backup/restore capabilities based on Docker volumes. The commands can be used whether the site is running or not; the only requirement is that the mongo_data and mysql_data Docker volumes exist.
To back up:
$ bash deploy.sh backup [-d /absolute/path/to/backup/dir]
To restore:
$ bash deploy.sh restore [-d /absolute/path/to/backup/dir]
!>Note: These backup and restore processes are run in a bare alpine linux image which will be automatically pulled by Docker.
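Since the -d option takes an absolute path, a helper that builds one can reduce mistakes when scripting regular backups. This is a minimal sketch: dated_backup_dir is a hypothetical helper, the askcos-backup- prefix is arbitrary, and creating the directory beforehand with mkdir -p is a precaution rather than a documented requirement.

```shell
# Print an absolute, date-stamped directory path under an existing base directory.
# The "askcos-backup-" prefix is an arbitrary choice for this example.
dated_backup_dir() {
  base=$(cd "$1" && pwd) || return 1
  echo "$base/askcos-backup-$(date +%Y%m%d)"
}

# Example (run from the askcos-deploy directory):
#   dir=$(dated_backup_dir "$HOME") && mkdir -p "$dir"
#   bash deploy.sh backup -d "$dir"
```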
From v0.2.x or v0.3.0
If you are upgrading the deployment from a previous version (prior to v0.3.1), or moving the application to a different server, you may want to retain user accounts and user-saved data/results. The provided backup.sh and restore.sh scripts in the askcos-deploy/utils/legacy directory are capable of handling the backup and restoring process. Please read the following carefully so as to not lose any user data:
- Start by making sure the previous version you would like to back up is currently up and running with docker-compose ps.
- Check out the newest version of askcos-deploy: git checkout 2020.10
- Run $ bash utils/legacy/backup.sh
- Make sure that the deploy/backup folder is present, and that it contains a folder named with a long string of numbers (year+month+date+time) corresponding to the time you just ran the backup command
- If the backup was successful (db.json and user_saves (<v0.3.1) or results.mongo (>=0.3.1) should be present), you can safely tear down the old application with docker-compose down [-v]
- Deploy the new application with bash deploy.sh deploy or update with bash deploy.sh update -v x.y.z
- Restore user data with bash utils/legacy/restore.sh
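The check for a successful backup in the steps above can be sketched as a pair of shell helpers. Both latest_backup and backup_looks_complete are hypothetical names, and the sketch assumes the timestamped folder names consist only of digits with a fixed width, so that sorting them as text also sorts them by time.

```shell
# Print the newest timestamped folder name inside a backup directory.
latest_backup() {
  ls -1 "$1" | grep -E '^[0-9]+$' | sort | tail -n 1
}

# Succeeds if the folder holds db.json plus user_saves or results.mongo.
backup_looks_complete() {
  [ -e "$1/db.json" ] && { [ -e "$1/user_saves" ] || [ -e "$1/results.mongo" ]; }
}

# Example (from the directory that contains the backup folder):
#   newest=$(latest_backup backup)
#   backup_looks_complete "backup/$newest" && echo "Backup $newest looks complete"
```

Running a check like this before docker-compose down -v is worthwhile, since that command removes the volumes holding the original data.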
!>Note: For versions >=0.3.1, user data persists in docker volumes and is not tied to the lifecycle of the container services. If the [-v] flag is not used with docker-compose down, volumes do not get removed, and user data is safe. In this case, the backup/restore procedure is not necessary as the containers that get created upon an install/upgrade will continue to use the docker volumes that contain all the important data. If the [-v] flag is used, all data will be removed and a restore will be required to recover user data.
Add Customization
There are a few parts of the application that you can customize:
- Header sub-title next to ASKCOS (to designate this as a local deployment at your organization)
- Email addresses for the support form
- Whether to enable the chemical name to SMILES resolver
- Whether authorization is required to modify the buyables database.
These are handled as environment variables that can change upon deployment (and are therefore not tied into the image directly). They can be found in the customization file, which is created automatically during deployment from the customization.example file. Please let us know what other degrees of customization you would like.
Managing Django
If you'd like to manage the Django app (i.e., run python manage.py ...), for example, to create an admin superuser, you can run commands in the running app service as follows:
$ docker-compose exec app bash -c "python /usr/local/askcos-site/manage.py createsuperuser"
In this case you'll be presented an interactive prompt to create a superuser with your desired credentials.
Scaling Workers
Only 1 worker per queue is deployed by default, with limited concurrency. This is not ideal when many users are active at once. The scaling of each worker is defined at the top of the deploy.sh script. To scale a desired worker, change the appropriate value in deploy.sh, for example:
n_tb_c_worker=N # Tree builder chiral worker
where N is the number of workers you want. Then run bash deploy.sh start [-v <version>].
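If you prefer to script this change rather than edit deploy.sh by hand, something like the following could work. It is a sketch under assumptions: set_worker_count is a hypothetical helper, and it assumes the variable is assigned an integer at the start of a line, as in the example above.

```shell
# Rewrite an integer "name=value" assignment at the start of a line in a script.
# Usage: set_worker_count FILE VARIABLE COUNT
set_worker_count() {
  file=$1
  var=$2
  count=$3
  tmp=$(mktemp) || return 1
  # Write to a temp file first, then copy back so file permissions are kept.
  sed "s/^${var}=[0-9][0-9]*/${var}=${count}/" "$file" > "$tmp" && cat "$tmp" > "$file"
  rc=$?
  rm -f "$tmp"
  return $rc
}

# Example (run from the askcos-deploy directory):
#   set_worker_count deploy.sh n_tb_c_worker 2
#   bash deploy.sh start
```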
Kubernetes Deployment
In ASKCOS 2020.10, we're introducing a new Helm chart to make it easier to deploy ASKCOS on Kubernetes. The previous Kubernetes configuration can still be used for 2020.07 or earlier but will no longer be updated.
Software Prerequisites
In addition to git and Docker, we will assume that you are using a cluster which already has Kubernetes configured. You will also need to install Helm 3: https://helm.sh/docs/intro/install/.
Quickstart
Similar to the Docker Compose deployment, you will need to obtain the ASKCOS deploy tokens in order to clone the askcos-deploy repository and access the GitLab image registry. The deploy tokens can be found on the MLPDS Member Resources ASKCOS Versions Page.
$ export DEPLOY_TOKEN_USERNAME=
$ export DEPLOY_TOKEN_PASSWORD=
$ git clone https://$DEPLOY_TOKEN_USERNAME:$DEPLOY_TOKEN_PASSWORD@gitlab.com/mlpds_mit/askcos/askcos-deploy.git
$ cd askcos-deploy
$ git checkout 2020.10
$ helm install --set imageCredentials.username=$DEPLOY_TOKEN_USERNAME --set imageCredentials.password=$DEPLOY_TOKEN_PASSWORD mydeploy ./helm/askcos
For more configuration options, please check out the values file at askcos-deploy/helm/askcos/values.yaml.
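As an alternative to repeating --set flags, overrides can be collected in a separate values file and passed with Helm's -f option. The sketch below writes only the two imageCredentials keys used in the command above; write_credentials_values and my-values.yaml are hypothetical names, and any other keys should be taken from the chart's values.yaml rather than guessed.

```shell
# Write a values override file containing the registry credentials.
# Usage: write_credentials_values OUTPUT_FILE
write_credentials_values() {
  cat > "$1" <<EOF
imageCredentials:
  username: "${DEPLOY_TOKEN_USERNAME:-}"
  password: "${DEPLOY_TOKEN_PASSWORD:-}"
EOF
}

# Example (run from the askcos-deploy directory):
#   write_credentials_values my-values.yaml
#   helm install -f my-values.yaml mydeploy ./helm/askcos
```

The generated file contains the token in plain text, so keep it out of version control. Tokens containing double quotes or backslashes would need escaping to remain valid YAML.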
Add Customization
For Kubernetes, the same customizations can be applied as for the Docker Compose deployment:
- Header sub-title next to ASKCOS (to designate this as a local deployment at your organization)
- Email addresses for the support form
- Whether to enable the chemical name to SMILES resolver
- Whether authorization is required to modify the buyables database.
The environment variables for these customizations can be adjusted in the env block of the values.yaml file.
Managing Django
If you'd like to manage the Django app (i.e., run python manage.py ...), for example, to create an admin superuser, you can run commands in the running app container as follows:
$ kubectl exec [ASKCOS POD] -c app -i -t -- python /usr/local/askcos-site/manage.py createsuperuser
In this case you'll be presented an interactive prompt to create a superuser with your desired credentials.
Scaling Workers
For Kubernetes, worker replicas can also be set in the values.yaml file. Celery workers are defined in the celery block as a list, and each item has a replicaCount field for setting the number of replicas.
(Optional) Building Docker Images
If you would like to build the askcos-site Docker image yourself, you will need to download the appropriate repositories depending on where you want to start.
To only build askcos-site using a pre-built askcos-core image:
$ git clone https://gitlab.com/mlpds_mit/askcos/askcos-site
$ cd askcos-site
$ make [TAG=my_tag]
A Makefile is provided to make it easier to build the image with a default image name. You can also use the docker build command directly:
$ docker build -t <image name>:<tag> .
!>Note: The image name should correspond with what exists in the docker-compose.yml file. By default, the image name is the value of the ASKCOS_IMAGE_REGISTRY environment variable followed by askcos-site. If you choose to use a custom image name, make sure to modify the ASKCOS_IMAGE_REGISTRY variable or the docker-compose.yml file accordingly. For Kubernetes deployment, the image registry and tag are defined in the values.yaml file.
Similarly, if you also want to build askcos-core:
$ git clone https://gitlab.com/mlpds_mit/askcos/askcos-core
$ cd askcos-core
$ make [TAG=my_tag]
Note that you will need to specify the appropriate askcos-core version when building askcos-site afterwards:
$ cd askcos-core
$ make TAG=my_tag
$ cd ../askcos-site
$ make CORE_VERSION=my_tag TAG=my_tag
ASKCOS Development
Software package for the prediction of feasible synthetic routes towards a desired compound and associated tasks related to synthesis planning. Originally developed under the DARPA Make-It program and now being developed under the MLPDS Consortium.