taps.apps.docking.train
Protein docking model training.
Module adapted from ParslDock.
MorganFingerprintTransformer
Bases: BaseEstimator, TransformerMixin
Class that converts SMILES strings to fingerprint vectors.
Source code in taps/apps/docking/train.py
fit()
fit(
X: list[str], y: array[int] | None = None
) -> MorganFingerprintTransformer
Train model.
Parameters:
- X (list[str]) – List of SMILES strings.
- y (array[int] | None, default: None) – Array of true fingerprints.
Returns:
- MorganFingerprintTransformer – The trained model.
transform()
Compute the fingerprints.
Parameters:
- X (list[str]) – List of SMILES strings.
- y (array[int] | None, default: None) – Array of true fingerprints.
Returns:
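The fit/transform interface documented for this class can be illustrated with a small, dependency-free sketch. Everything in it is an assumption made for illustration: `ToyFingerprintTransformer` and `toy_fingerprint` are hypothetical names, the hashed character-bigram featurizer is a stand-in rather than a Morgan fingerprint, and the sketch assumes that `fit()` has nothing to learn and simply returns the transformer. It shows only the shape of the interface: one fixed-length vector per input SMILES string.

```python
import hashlib


def toy_fingerprint(smiles: str, length: int = 16) -> list[int]:
    """Hypothetical stand-in featurizer: hash character bigrams into bits.

    This is NOT a Morgan fingerprint; it only mimics the output shape.
    """
    bits = [0] * length
    for i in range(len(smiles) - 1):
        digest = hashlib.sha256(smiles[i:i + 2].encode()).digest()
        bits[int.from_bytes(digest[:4], 'big') % length] = 1
    return bits


class ToyFingerprintTransformer:
    """Stateless transformer sketch with a scikit-learn-style interface."""

    def __init__(self, length: int = 16) -> None:
        self.length = length

    def fit(self, X: list[str], y=None) -> 'ToyFingerprintTransformer':
        # Assumption: nothing is learned from the data, so return self.
        return self

    def transform(self, X: list[str], y=None) -> list[list[int]]:
        # One fixed-length bit vector per input string.
        return [toy_fingerprint(s, self.length) for s in X]


transformer = ToyFingerprintTransformer(length=16)
features = transformer.fit(['CCO']).transform(['CCO', 'c1ccccc1'])
```

Returning the transformer from `fit()` is what allows calls to be chained and lets the object sit as a step inside a larger pipeline.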
compute_morgan_fingerprints()
compute_morgan_fingerprints(
smiles: str,
fingerprint_length: int,
fingerprint_radius: int,
) -> tuple[int, int]
Get Morgan Fingerprint of a specific SMILES string.
Adapted from: https://github.com/google-research/google-research/blob/dfac417/mol_dqn/chemgraph/dqn/deep_q_networks.py#L750
Parameters:
- smiles (str) – The molecule as a SMILES string.
- fingerprint_length (int) – Bit-length of fingerprint.
- fingerprint_radius (int) – Radius used to compute fingerprint.
Returns:
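To make the role of `fingerprint_length` concrete, the sketch below shows the folding step that hashed fingerprints commonly use: each integer identifier of an atom environment sets the bit at `identifier % fingerprint_length`. The identifiers are made-up values and `fold_identifiers` is a hypothetical helper. In practice a cheminformatics toolkit derives the identifiers from the molecule and the chosen radius, so this is an illustration of the idea, not this module's implementation.

```python
def fold_identifiers(identifiers: list[int], fingerprint_length: int) -> list[int]:
    """Fold arbitrary integer identifiers into a fixed-length bit vector."""
    if fingerprint_length <= 0:
        raise ValueError('fingerprint_length must be positive')
    bits = [0] * fingerprint_length
    for identifier in identifiers:
        # Distinct identifiers can collide on the same bit when the
        # vector is short; a longer vector makes collisions less likely.
        bits[identifier % fingerprint_length] = 1
    return bits


# Made-up identifiers standing in for hashed atom environments.
example_ids = [3, 11, 19, 42]
short = fold_identifiers(example_ids, 8)
long = fold_identifiers(example_ids, 64)
```

With 8 bits, three of the four identifiers collide on bit 3, so only two bits are set. With 64 bits, all four identifiers map to distinct positions.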
train_model()
Train a machine learning model using Morgan Fingerprints.
Parameters:
- training_data (DataFrame) – DataFrame with 'smiles' and 'score' columns containing the molecule structure and docking score, respectively.
Returns:
- Pipeline – A trained model.
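The input contract above (a 'smiles' column and a 'score' column) can be sketched without any third-party packages. The example below is a rough illustration under stated assumptions: a list of dictionaries stands in for the DataFrame, `featurize` is a hypothetical bigram featurizer rather than a Morgan fingerprint, and the 1-nearest-neighbor regressor is an assumed stand-in for whatever estimator the real pipeline uses. The scores are made-up values.

```python
def featurize(smiles: str, length: int = 64) -> frozenset[int]:
    """Hypothetical stand-in featurizer (hashed character bigrams).

    NOT a Morgan fingerprint; it only plays the same role in the sketch.
    """
    return frozenset(
        (ord(a) * 31 + ord(b)) % length for a, b in zip(smiles, smiles[1:])
    )


def tanimoto(a: frozenset[int], b: frozenset[int]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of on-bits."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


class NearestNeighborRegressor:
    """Toy 1-nearest-neighbor regressor, an assumed stand-in model."""

    def fit(self, smiles: list[str], scores: list[float]) -> 'NearestNeighborRegressor':
        # Store the featurized training molecules with their scores.
        self.train_ = [(featurize(s), y) for s, y in zip(smiles, scores)]
        return self

    def predict(self, smiles: list[str]) -> list[float]:
        # Return the score of the most similar training molecule.
        return [
            max(self.train_, key=lambda item: tanimoto(featurize(s), item[0]))[1]
            for s in smiles
        ]


def train_model_sketch(rows: list[dict]) -> NearestNeighborRegressor:
    """Fit the toy model on rows that each have 'smiles' and 'score'."""
    for row in rows:
        missing = {'smiles', 'score'} - set(row)
        if missing:
            raise KeyError(f'training row is missing {sorted(missing)}')
    smiles = [row['smiles'] for row in rows]
    scores = [float(row['score']) for row in rows]
    return NearestNeighborRegressor().fit(smiles, scores)


# Made-up scores, for illustration only.
training_rows = [
    {'smiles': 'CCO', 'score': -5.0},
    {'smiles': 'c1ccccc1', 'score': -7.0},
]
model = train_model_sketch(training_rows)
```

Because featurization happens inside the model, callers pass raw SMILES strings at both training and prediction time, which matches the description of a trained model "that takes SMILES strings as inputs".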
run_model()
Run a model on a list of SMILES strings.
Parameters:
- model (Pipeline) – Trained model that takes SMILES strings as inputs.
- smiles (list[str]) – List of molecules to evaluate.
Returns:
- DataFrame – DataFrame with the molecules and their predicted outputs.
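The shape of the result can be sketched as follows. This is an illustration only: `LengthModel` is a hypothetical stub that exposes `predict()`, a list of dictionaries stands in for the returned DataFrame, and the output column names 'smiles' and 'score' are assumptions, since the reference above does not name them.

```python
class LengthModel:
    """Hypothetical stub exposing predict(); stands in for a trained model."""

    def predict(self, smiles: list[str]) -> list[float]:
        # Placeholder rule with no chemical meaning: score by string length.
        return [-float(len(s)) for s in smiles]


def run_model_sketch(model, smiles: list[str]) -> list[dict]:
    """Pair each input molecule with the model's prediction, in order."""
    predictions = model.predict(smiles)
    return [
        {'smiles': s, 'score': float(p)} for s, p in zip(smiles, predictions)
    ]


results = run_model_sketch(LengthModel(), ['CCO', 'c1ccccc1'])
```

Keeping the input molecule next to its prediction in each row means the results can be sorted or filtered by score without losing track of which molecule a score belongs to.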