2021.01 Release Notes
askcos-site
User notes:
- New reaction condition recommendation model added to API v2 and Forward Synthesis Planner (MR askcos-site!53)
- Add search bar to home page as a new single-entry point to all tasks (MR askcos-site!51)
- Replace aromatic site-selectivity page with new interactive interface in forward synthesis planner (askcos-site#20, ASKCOS#126, MR askcos-site!51)
- Replace atom mapping page with home page utility (askcos-site#21, ASKCOS#126, MR askcos-site!51)
- Switch draw page JS drawer to Ketcher (askcos-site#4, MR askcos-site!51, Epic &4)
- Switch forward predictor JS drawer to Ketcher (askcos-site#3, MR askcos-site!51, Epic &4)
- Add new "tree view" in IPP to highlight individual pathways (MR askcos-site!54)
- Add feature to view all recommended templates in IPP (ASKCOS#328, MR askcos-site!9, Epic &6)
- Add pagination to tree list view panel in tree builder results page (MR askcos-site!56)
- Update module overview page and add references (ASKCOS#293, MR askcos-site!58)
- Add option to invert reaction atoms filter to select conserved reaction atoms (ASKCOS#217, MR askcos-site!60)
Developer notes:
- Refactor IPP frontend code to use new graph data structure (MR askcos-site!50)
- Add support for reaction SMILES to /api/v2/rdkit/smiles/canonicalize endpoint (MR askcos-site!51)
- Reduce celery sub-tasks in impurity predictor worker for efficiency (ASKCOS#333, MR askcos-site!55)
Bug fixes:
- Fix incorrect template attribute comparison operators in IPP (MR askcos-site!49)
- Fix precursor re-clustering when loading tree builder results in IPP (ASKCOS#312, MR askcos-site!50)
- Fix treatment of duplicate chemicals in IPP (askcos-site#1, MR askcos-site!50)
- Fix loading of tree builder results in treedata format (askcos-site#27, MR askcos-site!57)
Deprecated (for removal in 2021.04):
- API v1 will no longer be maintained and has been replaced completely by API v2
- Ajax-based frontend pages will no longer be maintained and have been replaced by Vue.js alternatives
askcos-core
User notes:
- Add conda environment file to facilitate local installation of dependencies (MR askcos-core!30)
- Add new quantitative reaction condition recommendation model (askcos-core!33)
Developer notes:
- Adjust impurity predictor inspector to use forward predictor score (askcos-core!34)
Bug fixes:
- Do not run pathway ranking when there is 1 tree (MR askcos-core!31)
askcos-data
- Add graph model for new reaction condition recommender (MR askcos-data!10)
- Add script to download fingerprint model for new reaction condition recommender (MR askcos-data!10)
askcos-deploy
- Set up GitLab CI to push helm chart to registry (MR askcos-deploy!55)
- Add new condition recommender to deployment configuration (MR askcos-deploy!59)
- Switch mysql deployment from replication to standalone in Helm chart (MR askcos-deploy!56)
- Increase nginx client_max_body_size in Helm chart (MR askcos-deploy!56)
- Add depends_on settings to docker-compose configuration (MR askcos-deploy!61)
- Add init containers to control pod startup order in Helm chart (MR askcos-deploy!61)
Changes to askcos-data in 2021.01
ASKCOS 2021.01 includes two versions of a new reaction condition recommendation model. Due to the large model size and limitations on GitLab repository size, the fingerprint-based model is not included in the askcos-data git repository, but can be downloaded from our server.
Effects on deployment using Docker Compose
For deploying ASKCOS using the askcos-site Docker image, the workflow is unchanged from before because the fingerprint model is included in the Docker image. Please note that downloading the image will take substantially longer than before. The overall resource requirements for deploying ASKCOS are also higher as a result: for the default configuration, a minimum of 45 GB of memory and 60 GB of disk space is recommended.
During the deploy/upgrade process, a Docker volume is populated with askcos-data to facilitate seeding of the mongo database. This process now also takes longer, which can lead to Read timed out errors from Docker Compose. To avoid this error, you can increase the value of COMPOSE_HTTP_TIMEOUT, either in the .env file or by running export COMPOSE_HTTP_TIMEOUT=240 (or another value) to set it temporarily.
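To make the larger timeout persistent, the variable can be written to the .env file. The following is a minimal sketch, not part of askcos-deploy: set_env_var is a hypothetical helper name, and it assumes the .env file uses plain KEY=VALUE lines.

```shell
# set_env_var FILE KEY VALUE
# Hypothetical helper (not shipped with askcos-deploy): sets KEY=VALUE in FILE,
# replacing any existing assignment of KEY so the entry is never duplicated.
set_env_var() {
    file=$1
    key=$2
    value=$3
    tmp=$(mktemp)
    touch "$file"
    # Copy every line except an existing assignment of KEY.
    # grep exits non-zero when nothing is left to print, which is fine here.
    grep -v "^${key}=" "$file" > "$tmp" || true
    # Append the new assignment.
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    # Write back in place so the original file permissions are kept.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

For example, running set_env_var .env COMPOSE_HTTP_TIMEOUT 240 from the deploy folder would set the timeout to 240 seconds for subsequent docker-compose calls.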
If you are updating an existing deployment, populating this volume can be skipped since the mongo database has already been seeded. To do so, comment out or delete line 55 in docker-compose.yml, containing 'appdata:/usr/local/askcos-core/askcos/data'. Note that this is a temporary workaround, and we are working on a permanent solution for the next release.
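The workaround can also be applied by matching the volume entry by its content rather than by line number, which avoids editing the wrong line if the file has been modified locally. This is only a sketch: comment_out_appdata is a hypothetical helper, not part of askcos-deploy, and it assumes the entry appears on a single line of docker-compose.yml.

```shell
# comment_out_appdata FILE
# Hypothetical helper (not shipped with askcos-deploy): comments out every line
# of FILE that contains the appdata volume mapping, preserving indentation.
# A copy of the file as it was before the call is kept as FILE.bak.
comment_out_appdata() {
    file=$1
    cp "$file" "$file.bak"
    # On lines containing the mapping that are not already commented out,
    # insert "# " after the leading whitespace.
    sed '\|appdata:/usr/local/askcos-core/askcos/data| {
        /^[[:space:]]*#/!s|^\([[:space:]]*\)|\1# |
    }' "$file.bak" > "$file"
}
```

Running comment_out_appdata docker-compose.yml from the deploy folder leaves all other volume entries untouched; restoring docker-compose.yml.bak undoes the change.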
Effects on local use of askcos-core and askcos-data
For users who have or would like to download the askcos-data repository directly for local use, there is now an extra step needed to download the fingerprint model from our server. For ease-of-use, we have provided a script to perform the download automatically and place the model files in the appropriate locations. The full process for cloning the askcos-data repository is now as follows:
$ export DATA_TOKEN_USERNAME=
$ export DATA_TOKEN_PASSWORD=
$ git clone https://$DATA_TOKEN_USERNAME:$DATA_TOKEN_PASSWORD@gitlab.com/mlpds_mit/ASKCOS/askcos-data.git
$ cd askcos-data
$ git lfs pull
$ bash get-extra-models.sh -u $DATA_TOKEN_USERNAME -p $DATA_TOKEN_PASSWORD
The data tokens can be found on the MLPDS Member Resources ASKCOS Versions Page.
Docker Compose Deployment
We currently support two methods for deploying ASKCOS: Docker Compose and Kubernetes. Docker Compose is a simpler method for deploying on a single workstation, while Kubernetes is more complex but is suitable for scaling across multiple nodes.
Software Prerequisites
To deploy ASKCOS using Docker Compose, you must have the following installed on your machine:
- git
- Docker (installation instructions)
- Docker Compose (installation instructions)
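Before deploying, it can be useful to confirm that the prerequisites above are available on the PATH. The following is a small sketch; missing_tools is a hypothetical helper name and is not part of askcos-deploy.

```shell
# missing_tools NAME...
# Hypothetical helper: prints each named command that cannot be found on PATH,
# one per line. Prints nothing when every command is available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

For example, missing_tools git docker docker-compose prints nothing when all three prerequisites are installed, and otherwise lists the ones that still need to be installed.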
Quickstart
ASKCOS can be downloaded using deploy tokens, which provide read-only access to the source code and our container registry in GitLab. Below is a complete example showing how to deploy the ASKCOS application using deploy tokens (omitted in this example). The deploy tokens can be found on the MLPDS Member Resources ASKCOS Versions Page.
$ export DEPLOY_TOKEN_USERNAME=
$ export DEPLOY_TOKEN_PASSWORD=
$ docker login registry.gitlab.com -u $DEPLOY_TOKEN_USERNAME -p $DEPLOY_TOKEN_PASSWORD
$ git clone https://$DEPLOY_TOKEN_USERNAME:$DEPLOY_TOKEN_PASSWORD@gitlab.com/mlpds_mit/askcos/askcos-deploy.git
$ cd askcos-deploy
$ git checkout 2021.01
$ bash deploy.sh deploy
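Because the token values are omitted in the example above, an empty variable is an easy mistake to make. The sketch below shows one way to check that the tokens are set before running the login and clone steps; require_env is a hypothetical helper and is not part of askcos-deploy.

```shell
# require_env NAME...
# Hypothetical helper: returns non-zero (and reports the first offender on
# stderr) if any of the named environment variables is unset or empty.
require_env() {
    for name in "$@"; do
        # Indirectly read the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            printf 'missing environment variable: %s\n' "$name" >&2
            return 1
        fi
    done
}
```

For example, require_env DEPLOY_TOKEN_USERNAME DEPLOY_TOKEN_PASSWORD can be run before docker login. Note also that docker login accepts --password-stdin as an alternative to -p, which keeps the token out of the process list.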
Upgrade Information
The askcos-deploy repository also provides scripts to upgrade an existing ASKCOS deployment in-place.
From v0.3.1 or above
$ git checkout 2021.01
$ bash deploy.sh update -v 2021.01
If you have not seeded the database before (if you're upgrading from v0.3.1), you will need to do so:
$ bash deploy.sh set-db-defaults seed-db
If you have not already done so, you should re-index the database because new index types were introduced in 2020.07 to significantly improve lookup efficiency:
$ bash deploy.sh index-db --drop-indexes
The 2020.10 release included a new Pistachio template set and template relevance model. If you have not already done so, you will need to seed some new data into the mongo database:
bash deploy.sh seed-db -c pistachio -r pistachio --append
The 2020.10 release also included an updated set of default buyables data. If you have not already done so, you can import the new data using the following command:
bash deploy.sh seed-db -b default --append
Note that this will result in some duplicate data. If you have not added custom buyables data, you can drop the existing buyables database and import the updated data by omitting the --append argument.
!>In some cases, we have seen issues with resetting rabbitmq data while upgrading from 2020.07 to 2020.10. If you see celery workers restarting after updating and inequivalent arg 'x-max-priority' errors in worker logs, you should restart rabbitmq again using docker-compose rm -fsv rabbit && docker-compose up -d rabbit.
From v0.2.x or v0.3.0
Upgrading from earlier versions of ASKCOS directly to 2021.01 has not been thoroughly tested. Instead, we suggest upgrading to v0.4.1 as an intermediate step.
$ git checkout v0.4.1
$ bash backup.sh
$ bash deploy.sh update -v 0.4.1
$ bash deploy.sh set-db-defaults seed-db
$ bash restore.sh
After upgrading to v0.4.1, you should follow the above instructions to upgrade to 2021.01.
!>Note: A large amount of data was migrated to the mongodb in v0.4.1 (chemhistorian), and seeding may take some time to complete. We send this seeding task to the background so the rest of the application can start and become functional without having to wait. If using the default set of data (i.e., using the exact commands above), you can monitor the progress of mongodb seeding using bash deploy.sh count-mongo-docs, which will tell you how many documents have been seeded out of the expected number. Complete seeding is not necessary for application functionality unless you use the chemical popularity logic in the tree builder.
First Time Deployment
Deploying the Web Application
Deployment is initiated by a bash script that runs a few docker-compose commands in a specific order. Several database services need to be started first, and more importantly seeded with data, before other services (which rely on the availability of data in the database) can start. The deploy.sh script is provided in the askcos-deploy repository and should be run as follows:
$ bash deploy.sh command [optional arguments]
There are a number of available commands, including the following for common deploy tasks:
- deploy: runs standard first-time deployment tasks, including seed-db
- update: pulls new docker image from GitLab repository and restarts all services
- seed-db: seed the database with default or custom data files
- start: start a deployment without performing first-time tasks
- stop: stop a running deployment
- clean: stop a running deployment and remove all docker containers and volumes
For a running deployment, new data can be seeded into the database using the seed-db command along with arguments indicating the types of data to be seeded. Note that this will replace the existing data in the database. The available arguments are as follows:
- -b, --buyables: specify buyables data to seed, either default or path to data file
- -c, --chemicals: specify chemicals data to seed, either default or path to data file
- -x, --reactions: specify reactions data to seed, either default or path to data file
- -r, --retro-templates: specify retrosynthetic templates to seed, either default or path to data file
- -f, --forward-templates: specify forward templates to seed, either default or path to data file
For example, to seed default buyables data and custom retrosynthetic templates, run the following from the deploy folder:
$ bash deploy.sh seed-db --buyables default --retro-templates /path/to/my.retro.templates.json.gz
To update a deployment, run the following from the deploy folder:
$ bash deploy.sh update --version x.y.z
To stop a currently running application, run the following from the deploy folder:
$ bash deploy.sh stop
If you would like to clean up and remove everything from a previous deployment (NOTE: you will lose user data), run the following from the deploy folder:
$ bash deploy.sh clean
Backing Up User Data
From v0.3.1 or above
If you are upgrading from v0.3.1 or later, the backup/restore process is no longer needed unless you are moving deployments to a new machine.
New backup and restore functions were added in askcos-deploy 2020.07 to provide more robust backup/restore capabilities based on Docker volumes. The commands can be used whether the site is running or not; the only requirement is that the mongo_data and mysql_data Docker volumes exist.
To backup:
bash deploy.sh backup [-d /absolute/path/to/backup/dir]
To restore:
bash deploy.sh restore [-d /absolute/path/to/backup/dir]
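Since the -d option is documented above as taking an absolute path, a relative directory has to be converted first. The following is a minimal sketch; abs_backup_dir is a hypothetical helper, not part of askcos-deploy, and creating the directory in advance is an assumption rather than a documented requirement of deploy.sh.

```shell
# abs_backup_dir DIR
# Hypothetical helper: creates DIR if it does not exist yet and prints its
# absolute path, suitable for passing to the -d option.
abs_backup_dir() {
    mkdir -p "$1" && (cd "$1" && pwd)
}
```

For example, bash deploy.sh backup -d "$(abs_backup_dir backups/2021.01)" would store the backup in a backups/2021.01 folder below the current directory; the folder name is only an illustration.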
!>Note: These backup and restore processes are run in a bare alpine linux image which will be automatically pulled by Docker.
From v0.2.x or v0.3.0
If you are upgrading the deployment from a previous version (prior to v0.3.1), or moving the application to a different server, you may want to retain user accounts and user-saved data/results. The provided backup.sh and restore.sh scripts in the askcos-deploy/utils/legacy directory are capable of handling the backup and restoring process. Please read the following carefully so as to not lose any user data:
- Start by making sure the previous version you would like to back up is currently up and running with docker-compose ps.
- Check out the newest version of askcos-deploy: git checkout 2021.01
- Run $ bash utils/legacy/backup.sh
- Make sure that the deploy/backup folder is present, and that it contains a folder named with a long string of numbers (year+month+date+time) corresponding to the time you just ran the backup command.
- If the backup was successful (db.json and user_saves (<v0.3.1) or results.mongo (>=0.3.1) should be present), you can safely tear down the old application with docker-compose down [-v]
- Deploy the new application with bash deploy.sh deploy or update with bash deploy.sh update -v x.y.z
- Restore user data with bash utils/legacy/restore.sh
!>Note: For versions >=0.3.1, user data persists in docker volumes and is not tied to the lifecycle of the container services. If the [-v] flag is not used with docker-compose down, volumes do not get removed, and user data is safe. In this case, the backup/restore procedure is not necessary as the containers that get created upon an install/upgrade will continue to use the docker volumes that contain all the important data. If the [-v] flag is used, all data will be removed and a restore will be required to recover user data.
Add Customization
There are a few parts of the application that you can customize:
- Header sub-title next to ASKCOS (to designate this as a local deployment at your organization)
- Email addresses for the support form
- Whether to enable the chemical name to SMILES resolver
- Whether authorization is required to modify the buyables database.
These are handled as environment variables that can change upon deployment (and are therefore not tied into the image directly). They can be found in the customization file, which is created automatically during deployment from the customization.example file. Please let us know what other degrees of customization you would like.
Managing Django
If you'd like to manage the Django app (i.e. - run python manage.py ...), for example, to create an admin superuser, you can run commands in the running app service as follows:
$ docker-compose exec app bash -c "python /usr/local/askcos-site/manage.py createsuperuser"
In this case you'll be presented an interactive prompt to create a superuser with your desired credentials.
Scaling Workers
Only 1 worker per queue is deployed by default, with limited concurrency. This is not ideal when many users submit tasks at the same time. The scaling of each worker is defined at the top of the deploy.sh script. To scale a desired worker, change the appropriate value in deploy.sh, for example:
n_tb_c_worker=N # Tree builder chiral worker
where N is the number of workers you want. Then run bash deploy.sh start [-v <version>].
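The edit to deploy.sh can also be scripted. The sketch below is not part of askcos-deploy: set_worker_count is a hypothetical helper, and it assumes the worker variables are defined as NAME=<number> at the start of a line, as in the n_tb_c_worker example above.

```shell
# set_worker_count FILE NAME COUNT
# Hypothetical helper (not shipped with askcos-deploy): replaces the number in
# a "NAME=<number>" assignment at the start of a line in FILE with COUNT.
# Trailing comments on the line are preserved, and a copy of the file as it
# was before the call is kept as FILE.bak.
set_worker_count() {
    file=$1
    name=$2
    count=$3
    cp "$file" "$file.bak"
    sed "s/^\(${name}=\)[0-9][0-9]*/\1${count}/" "$file.bak" > "$file"
}
```

For example, set_worker_count deploy.sh n_tb_c_worker 3 followed by bash deploy.sh start would request three tree builder chiral workers.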
Kubernetes Deployment
ASKCOS 2021.01 includes a Helm chart to make it easier to deploy ASKCOS on Kubernetes. The previous Kubernetes configuration can still be used for 2020.07 or earlier but will no longer be updated.
Software Prerequisites
In addition to git and Docker, we will assume that you are using a cluster which already has Kubernetes configured. You will also need to install Helm 3: https://helm.sh/docs/intro/install/.
Quickstart
Similar to the Docker Compose deployment, you will need to obtain the ASKCOS deploy tokens in order to clone the askcos-deploy repository and access the GitLab image registry. The deploy tokens can be found on the MLPDS Member Resources ASKCOS Versions Page.
$ export DEPLOY_TOKEN_USERNAME=
$ export DEPLOY_TOKEN_PASSWORD=
$ git clone https://$DEPLOY_TOKEN_USERNAME:$DEPLOY_TOKEN_PASSWORD@gitlab.com/mlpds_mit/askcos/askcos-deploy.git
$ cd askcos-deploy
$ git checkout 2021.01
$ helm install --set imageCredentials.username=$DEPLOY_TOKEN_USERNAME --set imageCredentials.password=$DEPLOY_TOKEN_PASSWORD mydeploy ./helm/askcos
For more configuration options, please check out the values file at askcos-deploy/helm/askcos/values.yaml.
Add Customization
For Kubernetes, the same customizations can be applied as for the Docker Compose deployment:
- Header sub-title next to ASKCOS (to designate this as a local deployment at your organization)
- Email addresses for the support form
- Whether to enable the chemical name to SMILES resolver
- Whether authorization is required to modify the buyables database.
The environment variables for these customizations can be adjusted in the env block of the values.yaml file.
Managing Django
If you'd like to manage the Django app (i.e. - run python manage.py ...), for example, to create an admin superuser, you can run commands in the running app container as follows:
$ kubectl exec [ASKCOS POD] -c app -i -t -- python /usr/local/askcos-site/manage.py createsuperuser
In this case you'll be presented an interactive prompt to create a superuser with your desired credentials.
Scaling Workers
For Kubernetes, worker replicas can also be set in the values.yaml file. Celery workers are defined in the celery block as a list, and each item has a replicaCount field for setting the number of replicas.
(Optional) Building Docker Images
If you would like to build the askcos-site Docker image yourself, you will need to download the appropriate repositories depending on where you want to start.
To only build askcos-site using a pre-built askcos-core image:
$ git clone https://gitlab.com/mlpds_mit/askcos/askcos-site
$ cd askcos-site
$ make [TAG=my_tag]
A Makefile is provided to make it easier to build the image with a default image name. You can also use the docker build command directly:
$ docker build -t <image name>:<tag> .
!>Note: The image name should correspond with what exists in the docker-compose.yml file. By default, the image name is the environment variable ASKCOS_IMAGE_REGISTRY + askcos-site. If you choose to use a custom image name, make sure to modify the ASKCOS_IMAGE_REGISTRY variable or the docker-compose.yml file accordingly. For Kubernetes deployment, the image registry and tag are defined in the values.yaml file.
Similarly, if you also want to build askcos-core:
$ git clone https://gitlab.com/mlpds_mit/askcos/askcos-core
$ cd askcos-core
$ make [TAG=my_tag]
Note that you will need to specify the appropriate askcos-core version when building askcos-site afterwards:
$ cd askcos-core
$ make TAG=my_tag
$ cd ../askcos-site
$ make CORE_VERSION=my_tag TAG=my_tag
ASKCOS Development
Software package for the prediction of feasible synthetic routes towards a desired compound and associated tasks related to synthesis planning. Originally developed under the DARPA Make-It program and now being developed under the MLPDS Consortium.