schrodinger.active_learning.al_report module¶

schrodinger.active_learning.al_report.get_ligand_ml_metric(ligand_ml_model_file)¶

Extract the test set metrics, test set labels, and predictions from a ligand_ml model file.

Parameters: ligand_ml_model_file (str) – ligand_ml .qzip model file.
Returns: r2, mae, rmse, labels and prediction of the test set
Return type: float, float, float, 1d numpy array, 2d numpy array
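
As an illustration of the three returned metrics, the sketch below computes them with numpy from a label array and a prediction array. It is not the ligand_ml implementation: the helper name regression_metrics is hypothetical, R2 is taken to be the coefficient of determination, and the 2d prediction array is assumed to hold the predicted value in column 0 and the uncertainty in column 1.

```python
import numpy as np


def regression_metrics(labels, predictions):
    """Return (r2, mae, rmse) for 1d arrays of labels and predictions.

    Hypothetical helper; r2 is the coefficient of determination (assumption).
    """
    labels = np.asarray(labels, dtype=float)
    predictions = np.asarray(predictions, dtype=float)
    residuals = labels - predictions
    mae = float(np.mean(np.abs(residuals)))
    rmse = float(np.sqrt(np.mean(residuals ** 2)))
    ss_res = float(np.sum(residuals ** 2))
    ss_tot = float(np.sum((labels - labels.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot
    return r2, mae, rmse


# Toy data. Assumed layout: column 0 = predicted value, column 1 = uncertainty.
labels = np.array([1.0, 2.0, 3.0, 4.0])
prediction = np.array([[1.0, 0.1], [2.0, 0.2], [3.0, 0.1], [5.0, 0.3]])
r2, mae, rmse = regression_metrics(labels, prediction[:, 0])
```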

schrodinger.active_learning.al_report.make_train_report(ligand_ml_model_file, report_path, iter_num)¶

Generate a pdf file that records the test set metrics of the ligand_ml model.

Parameters

ligand_ml_model_file (str) – ligand_ml .qzip model file.
report_path (str) – path of the pdf report
iter_num (int) – current iteration number

schrodinger.active_learning.al_report.get_image(path, width=72.0)¶

Convert an image file to a reportlab image object that keeps the original aspect ratio at the specified width.

Parameters

path (str) – path of the image file.
width (float) – width of the reportlab image.

Returns

reportlab image

Return type

reportlab.platypus.Image
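
The aspect-ratio arithmetic can be sketched without reportlab. The helper name scaled_size is hypothetical, and the pixel dimensions are assumed to be known already; the resulting pair could then be passed as the width and height of a reportlab.platypus.Image.

```python
def scaled_size(orig_width, orig_height, width=72.0):
    """Return (width, height) that keeps the original aspect ratio.

    Hypothetical helper, not part of al_report.
    """
    if orig_width <= 0 or orig_height <= 0:
        raise ValueError("image dimensions must be positive")
    # Scale the height by the same factor that maps orig_width to width.
    height = width * orig_height / orig_width
    return float(width), float(height)


# A 400 x 200 image scaled to a width of 72 points.
size = scaled_size(400, 200, width=72.0)
```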

schrodinger.active_learning.al_report.get_report_maker(active_learning_job)¶

Get the report maker that corresponds to the active learning job. Returns None for an evaluate task, since no report exists for that task type yet.

Parameters: active_learning_job (ActiveLearningJob) – active learning job to be processed.
Returns: corresponding report maker
Return type: ALPilotReportMaker, or None for an evaluate task

schrodinger.active_learning.al_report.get_time_cost(nodes, node_name)¶

Return the time cost of a node. It returns ‘Unavailable’ if the time cost is not available.

Parameters

nodes (dict{str: ActiveLearningNode}) – dict that maps node name to node object.
node_name (str) – name of the active learning node of interest.

Returns

time cost in h/m/s format.

Return type

str
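
A minimal sketch of the h/m/s formatting, assuming the time cost is available as a number of seconds and that a missing value is represented by None. The helper name format_time_cost and the exact output layout are assumptions; only the 'Unavailable' fallback comes from the description above.

```python
def format_time_cost(seconds):
    """Format a duration in seconds as 'Xh Ym Zs' (assumed layout)."""
    if seconds is None:
        # Fallback string for a node with no recorded time cost.
        return "Unavailable"
    total = int(round(seconds))
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}h {minutes}m {secs}s"


formatted = format_time_cost(3725)
missing = format_time_cost(None)
```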

schrodinger.active_learning.al_report.get_score_pred_as_array(title_to_score, pred_score_file, discard_cutoff, ascending=True)¶

Return the score, predicted score, and prediction uncertainty of the ligands as an N X 3 numpy array.

Parameters

title_to_score (dict(str:float)) – dict that maps ligand title to score.
pred_score_file (str) – path of the ligand ml prediction .csv file.
discard_cutoff (float) – score cutoff for excluding the ligands in ML training set.
ascending (bool) – lower value means better ligand if ascending is True

Returns

numpy array with one row per ligand, holding (score, predicted score, uncertainty)

Return type

N X 3 numpy array
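
The shape of the result can be illustrated by joining a title-to-score dict with prediction rows read from a CSV. This is only a sketch: the column names title, pred, and uncertainty are hypothetical, since the layout of the ligand_ml prediction file is not described here, and the discard_cutoff and ascending handling is left out because its exact semantics are not specified.

```python
import csv
import io

import numpy as np


def score_pred_array(title_to_score, pred_rows):
    """Build an N x 3 array of (score, pred, uncertainty).

    Hypothetical helper; the column names are assumptions.
    """
    rows = []
    for record in pred_rows:
        title = record["title"]
        if title in title_to_score:
            rows.append((title_to_score[title],
                         float(record["pred"]),
                         float(record["uncertainty"])))
    # reshape keeps the result N x 3 even when no ligand matches.
    return np.array(rows, dtype=float).reshape(-1, 3)


# Toy prediction file held in memory instead of on disk.
csv_text = "title,pred,uncertainty\nlig1,-8.5,0.3\nlig2,-6.0,0.5\nlig3,-7.0,0.4\n"
title_to_score = {"lig1": -9.0, "lig2": -8.0}
score_pred = score_pred_array(title_to_score,
                              csv.DictReader(io.StringIO(csv_text)))
```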

schrodinger.active_learning.al_report.calculate_recovery_ratio(label_pred, top_ratio)¶

Calculate the fraction of the best ligands, ranked by label, that is recovered within increasing numbers of the top ligands predicted by ligand_ml. A more negative value means a better ligand.

Parameters

label_pred ((number of ligands X 2) numpy array) – numpy array containing (label, prediction) for each ligand.
top_ratio (float) – fraction of the ligands, ranked by label, that counts as the top ligands.

Returns

(screen ratio, recovery ratio of top ligands defined by top_ratio) of all the ligands.

Return type

(number of ligands X 2) numpy array
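
One way to read this calculation is sketched below. It is an interpretation, not the module's code: the helper name recovery_curve is hypothetical, the number of top ligands is assumed to be the rounded product of top_ratio and the ligand count, and ligands are screened in order of increasing predicted value.

```python
import numpy as np


def recovery_curve(label_pred, top_ratio):
    """Return an N x 2 array of (screen ratio, recovery ratio).

    Hypothetical helper; more negative values are treated as better.
    """
    label_pred = np.asarray(label_pred, dtype=float)
    n = label_pred.shape[0]
    n_top = max(1, int(round(n * top_ratio)))
    # Mark the n_top ligands with the most negative labels.
    top_idx = np.argsort(label_pred[:, 0], kind="stable")[:n_top]
    is_top = np.zeros(n, dtype=bool)
    is_top[top_idx] = True
    # Screen ligands from best to worst prediction and count the hits.
    order = np.argsort(label_pred[:, 1], kind="stable")
    recovered = np.cumsum(is_top[order])
    screen_ratio = np.arange(1, n + 1) / n
    return np.column_stack([screen_ratio, recovered / n_top])


# Toy data: each row is (label, prediction).
toy = np.array([[-9.0, -8.5], [-8.0, -6.0], [-5.0, -7.0],
                [-4.0, -3.0], [-2.0, -4.0]])
curve = recovery_curve(toy, top_ratio=0.4)
```

In this toy case the two best ligands by label are both found once three of the five ligands have been screened.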

schrodinger.active_learning.al_report.plot_regression(y_true, y_pred, fname)¶

Generate a regression plot. This function is slightly modified from ligand_ml/plotting.py to change the axis labels.

Parameters

y_true (1d numpy array) – test set label.
y_pred (2d numpy array) – ligand_ml prediction and uncertainty
fname (str) – filename to save the image

schrodinger.active_learning.al_report.plot_recovery(recovery_results, fname)¶

Generate and save recovery plot image.

Parameters

recovery_results (dict{float:np.array}) – dict that maps top ratio to the recovery ratio numpy array.
fname (str) – path of the saved image.

schrodinger.active_learning.al_report.make_regress_recovery_plots(y_true, y_pred_uncertain, top_ratio_samples, regress_text, recovery_text)¶: Generate the regression plot and the recovery plot and place both in a table. Also return the recovery results for the sampled top ratios as a dict.

schrodinger.active_learning.al_report.make_recovery_table(recovery_results, screen_ratio_samples)¶

Generate a list of lists that contains the recovery ratio for the given top ratios and screen ratios.

Parameters

recovery_results (dict{float:np.array}) – dict that maps top ratio to the recovery ratio numpy array.
screen_ratio_samples (list(float)) – list of screen ratios

Returns

table as a list of lists, table caption, and largest enrichment in the table.

Return type

list(list(str)), str, float
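
A rough sketch of how such a table could be assembled from recovery curves is given below. Several points are assumptions: the helper name recovery_table, the header wording, the definition of enrichment as recovery ratio divided by screen ratio, and the choice of the first curve row whose screen ratio reaches each sample. The caption is omitted because its wording is not documented.

```python
import numpy as np


def recovery_table(recovery_results, screen_ratio_samples):
    """Return (table, best_enrichment) from recovery curves.

    Hypothetical helper; enrichment = recovery / screen ratio (assumption).
    """
    header = ["Top ratio"] + [f"Screen {s:.0%}" for s in screen_ratio_samples]
    table = [header]
    best_enrichment = 0.0
    for top_ratio in sorted(recovery_results):
        curve = recovery_results[top_ratio]
        row = [f"{top_ratio:.0%}"]
        for s in screen_ratio_samples:
            # First row whose screen ratio is at least the sampled value.
            idx = min(int(np.searchsorted(curve[:, 0], s)), len(curve) - 1)
            recovery = float(curve[idx, 1])
            row.append(f"{recovery:.2f}")
            best_enrichment = max(best_enrichment, recovery / s)
        table.append(row)
    return table, best_enrichment


# Toy curve: columns are (screen ratio, recovery ratio).
toy_curve = np.array([[0.2, 0.5], [0.4, 0.5], [0.6, 1.0],
                      [0.8, 1.0], [1.0, 1.0]])
table, best = recovery_table({0.2: toy_curve}, [0.2, 0.6])
```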

schrodinger.active_learning.al_report.get_conclusion_string(best_enrichment, job_type, high_enrich=10, low_enrich=2)¶: Return the conclusion string based on the job type and the highest enrichment in the recovery ratio table.

class schrodinger.active_learning.al_report.ALReportMaker(active_learning_job)¶

Bases: object

Base class for different types of AL report maker.

__init__(active_learning_job)¶: Initialize the report maker for an active learning job

initReport(header)¶: Initialize the report and add header information

class schrodinger.active_learning.al_report.ALPilotReportMaker(active_learning_job)¶

Bases: schrodinger.active_learning.al_report.ALReportMaker

__init__(active_learning_job)¶: Initialize the report maker for an active learning job

report()¶: Function for building the report

addRunDetail()¶: Add job specifications and running time cost information to the report

addRecoveryResults()¶: Add the regression plot, recovery plot, recovery table and conclusion to the report.

initReport(header)¶: Initialize the report and add header information

class schrodinger.active_learning.al_report.ALScreenReportMaker(active_learning_job)¶

Bases: schrodinger.active_learning.al_report.ALReportMaker

__init__(active_learning_job)¶: Initialize the report maker for an active learning job

report()¶: Function for building the report

addRunDetail()¶: Add job specifications and running time cost information to the report

addRecoveryResults()¶: Add the regression plot, recovery plot, recovery table and conclusion to the report.

initReport(header)¶: Initialize the report and add header information