Potency Challenge (Sample Dataset)
This dataset is made available as part of the ASAP Discovery x OpenADMET competition. It is a small portion of the training data, released early so that teams can prepare data loaders and other utilities.
Structure
This dataset has the following columns:
Column | Dtype | Description |
---|---|---|
Molecule Name | str | Internal identifier at ASAP Discovery for this molecule |
CXSMILES | str | Text representation of the 2D molecular structure |
pIC50 (SARS-CoV-2 Mpro) | float | Negative log10 of the IC50 values of the dose-response curve |
pIC50 (MERS-CoV Mpro) | float | Negative log10 of the IC50 values of the dose-response curve |
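
As a rough illustration of the "negative log10" relationship described in the table, the sketch below converts between IC50 and pIC50. It assumes IC50 is expressed in molar units, which is the common convention but is not stated in this card; the function names are hypothetical.

```python
import math


def ic50_to_pic50(ic50_molar: float) -> float:
    """Return pIC50 = -log10(IC50), assuming IC50 is given in mol/L."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be positive")
    return -math.log10(ic50_molar)


def pic50_to_ic50(pic50: float) -> float:
    """Invert the transform: IC50 (mol/L, by assumption) = 10 ** (-pIC50)."""
    return 10.0 ** (-pic50)


# Under the molar assumption, an IC50 of 1 micromolar (1e-6 M)
# corresponds to a pIC50 of about 6.
one_micromolar = ic50_to_pic50(1e-6)
```

Because the scale is logarithmic, a difference of one pIC50 unit corresponds to a tenfold difference in IC50.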
For the challenge, we will provide all these columns for the train set. At test time, we will only provide the CXSMILES.
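
A minimal data loader sketch that copes with both cases (all columns present for training, only CXSMILES at test time) might look like the following. It assumes the data is available as CSV text with the column names from the table above and that a missing measurement appears as an empty field; the file format, the `Record` class, the helper names, and the sample rows are all hypothetical placeholders, not actual competition data.

```python
import csv
import io
from dataclasses import dataclass
from typing import Dict, List, Optional

# Target columns as named in the table above.
TARGET_COLUMNS = ("pIC50 (SARS-CoV-2 Mpro)", "pIC50 (MERS-CoV Mpro)")


@dataclass
class Record:
    name: Optional[str]                  # None when the column is absent
    cxsmiles: str
    targets: Dict[str, Optional[float]]  # None when a value is missing


def _to_float(value: Optional[str]) -> Optional[float]:
    """Parse a float, treating an absent column or empty field as missing."""
    if value is None or value.strip() == "":
        return None
    return float(value)


def load_records(handle) -> List[Record]:
    """Read records from an open CSV handle; only CXSMILES is required."""
    records = []
    for row in csv.DictReader(handle):
        records.append(
            Record(
                name=row.get("Molecule Name"),
                cxsmiles=row["CXSMILES"],
                targets={col: _to_float(row.get(col)) for col in TARGET_COLUMNS},
            )
        )
    return records


# Made-up rows for illustration only (not real competition data).
TRAIN_CSV = """Molecule Name,CXSMILES,pIC50 (SARS-CoV-2 Mpro),pIC50 (MERS-CoV Mpro)
EXAMPLE-001,CCO,5.2,4.8
EXAMPLE-002,c1ccccc1,6.1,
"""
TEST_CSV = """CXSMILES
CCN
"""

train = load_records(io.StringIO(TRAIN_CSV))
test = load_records(io.StringIO(TEST_CSV))
```

Keeping the targets optional lets the same loader serve both splits, so downstream code only needs to check for `None` before using a label.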
📦 Raw data package
We've sacrificed the completeness of the scientific data to improve ease of use. However, if you are interested, you can also access the raw data package from which this dataset was created here.