taps.apps.docking.train
Protein docking model training.
Module adapted from ParslDock.
MorganFingerprintTransformer
Bases: BaseEstimator, TransformerMixin
Class that converts SMILES strings to fingerprint vectors.
Source code in taps/apps/docking/train.py
fit
fit(
X: list[str], y: NDArray[bool] | None = None
) -> MorganFingerprintTransformer
Train model.
Parameters:
- X (list[str]) – List of SMILES strings.
- y (NDArray[bool] | None, default: None) – Array of true fingerprints.
Returns:
- MorganFingerprintTransformer – The trained model.
transform
Compute the fingerprints.
Parameters:
- X (list[str]) – List of SMILES strings.
- y (NDArray[bool] | None, default: None) – Array of true fingerprints.
Returns:
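The `fit` and `transform` methods above follow the scikit-learn transformer convention. A minimal, dependency-free sketch of that contract might look like the following. It is not the code in `taps/apps/docking/train.py`: `ToyFingerprintTransformer` and `char_bits` are hypothetical names, `char_bits` is a character-based stand-in rather than a Morgan fingerprint, and the idea that `fit` learns nothing and simply returns the transformer is an assumption.

```python
def char_bits(smiles: str, length: int) -> list[bool]:
    """Stand-in featurizer (NOT a Morgan fingerprint): one bit per character code."""
    bits = [False] * length
    for ch in smiles:
        bits[ord(ch) % length] = True
    return bits


class ToyFingerprintTransformer:
    """Hypothetical transformer that mirrors the fit/transform signatures."""

    def __init__(self, length: int = 32) -> None:
        self.length = length

    def fit(self, X: list[str], y=None) -> "ToyFingerprintTransformer":
        # Assumption: featurization is stateless, so there is nothing to learn.
        return self

    def transform(self, X: list[str], y=None) -> list[list[bool]]:
        # One fixed-length bit vector per input SMILES string.
        return [char_bits(smiles, self.length) for smiles in X]


transformer = ToyFingerprintTransformer(length=32)
features = transformer.fit(["CCO", "c1ccccc1"]).transform(["CCO", "c1ccccc1"])
print(len(features), len(features[0]))
```

Returning the transformer itself from `fit` is what lets the two calls be chained, and it is also what allows such a step to sit at the front of a pipeline ahead of an estimator.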
compute_morgan_fingerprints
compute_morgan_fingerprints(
smiles: str,
fingerprint_length: int,
fingerprint_radius: int,
) -> NDArray[bool]
Get Morgan Fingerprint of a specific SMILES string.
Adapted from: https://github.com/google-research/google-research/blob/dfac417/mol_dqn/chemgraph/dqn/deep_q_networks.py#L750
Parameters:
- smiles (str) – The molecule as a SMILES string.
- fingerprint_length (int) – Bit-length of fingerprint.
- fingerprint_radius (int) – Radius used to compute fingerprint.
Returns:
- NDArray[bool] – Array containing the Morgan fingerprint with shape [hparams, fingerprint_length].
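The `fingerprint_length` parameter fixes the size of the returned bit vector. In hashed fingerprints such as Morgan fingerprints, each atom environment (up to `fingerprint_radius` bonds from the central atom) is hashed to an integer identifier, and the identifiers are folded into the vector, commonly by taking them modulo the length. The sketch below illustrates only that folding step in plain Python. `fold_identifiers` is a hypothetical helper and the identifier values are made up; a real implementation would obtain identifiers from a cheminformatics toolkit such as RDKit.

```python
def fold_identifiers(identifiers: list[int], fingerprint_length: int) -> list[bool]:
    """Fold integer substructure identifiers into a fixed-length bit vector."""
    bits = [False] * fingerprint_length
    for identifier in identifiers:
        # Distinct identifiers can collide on the same bit when the vector is short.
        bits[identifier % fingerprint_length] = True
    return bits


# Made-up identifiers standing in for hashed atom environments.
example_ids = [10, 18, 1035]
narrow = fold_identifiers(example_ids, 8)     # 10 and 18 both land on bit 2
wide = fold_identifiers(example_ids, 2048)    # every identifier gets its own bit
print(sum(narrow), sum(wide))
```

A longer fingerprint reduces such collisions at the cost of a wider feature vector for the downstream model.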
train_model
Train a machine learning model using Morgan Fingerprints.
Parameters:
- training_data (DataFrame) – DataFrame with 'smiles' and 'score' columns that contain the molecule structure and docking score, respectively.
Returns:
- Pipeline – A trained model.
run_model
Run a model on a list of SMILES strings.
Parameters:
- model (Pipeline) – Trained model that takes SMILES strings as inputs.
- smiles (list[str]) – List of molecules to evaluate.
Returns:
- DataFrame – A dataframe with the molecules and their predicted outputs.
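Taken together, `train_model` and `run_model` describe a fit-then-predict workflow: fit on a table of SMILES strings and docking scores, then score new molecules. The sketch below mimics that flow in plain Python so that it runs without pandas, scikit-learn, or RDKit. Everything in it is an assumption made for illustration: the names `featurize`, `ToyModel`, `toy_train_model`, and `toy_run_model`, the character-set features, the nearest-neighbor regressor over Jaccard distance, the output keys, and the use of lists of dicts in place of DataFrames. The documentation above does not state which estimator the real `Pipeline` contains.

```python
def featurize(smiles: str) -> frozenset[str]:
    """Stand-in features (NOT a Morgan fingerprint): the set of characters."""
    return frozenset(smiles)


def jaccard_distance(a: frozenset[str], b: frozenset[str]) -> float:
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


class ToyModel:
    """Hypothetical 1-nearest-neighbor regressor on the stand-in features."""

    def fit(self, smiles: list[str], scores: list[float]) -> "ToyModel":
        self.examples = [(featurize(s), y) for s, y in zip(smiles, scores)]
        return self

    def predict(self, smiles: list[str]) -> list[float]:
        predictions = []
        for s in smiles:
            features = featurize(s)
            # Use the score of the closest training molecule.
            _, score = min(
                self.examples, key=lambda ex: jaccard_distance(features, ex[0])
            )
            predictions.append(score)
        return predictions


def toy_train_model(training_data: list[dict]) -> ToyModel:
    # Rows carry 'smiles' and 'score' keys, mirroring the documented columns.
    smiles = [row["smiles"] for row in training_data]
    scores = [row["score"] for row in training_data]
    return ToyModel().fit(smiles, scores)


def toy_run_model(model: ToyModel, smiles: list[str]) -> list[dict]:
    # One output row per molecule; the key names are assumed.
    predictions = model.predict(smiles)
    return [{"smiles": s, "score": p} for s, p in zip(smiles, predictions)]


# Made-up docking scores, for illustration only.
training_data = [
    {"smiles": "CCO", "score": -3.1},
    {"smiles": "c1ccccc1", "score": -4.7},
]
model = toy_train_model(training_data)
results = toy_run_model(model, ["CCCO", "c1ccccc1C"])
print(results)
```

Keeping featurization inside the model means callers pass raw SMILES strings at both training and prediction time, which matches the documented `run_model` signature.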