schrodinger.active_learning.al_report module

schrodinger.active_learning.al_report.get_ligand_ml_metric(ligand_ml_model_file)[source]

Extract the test set metrics, labels, and predictions from a ligand_ml model file.

Parameters

ligand_ml_model_file (str) – ligand_ml .qzip model file.

Returns

r2, mae, rmse, labels and prediction of the test set

Return type

float, float, float, 1d numpy array, 2d numpy array
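The three metrics can be reproduced from the returned labels and predictions. The following is a minimal sketch, not the ligand_ml implementation: `regression_metrics` is a hypothetical helper, r2 is assumed to be the coefficient of determination, and column 0 of the 2d prediction array is assumed to hold the predicted value (with the uncertainty in column 1).

```python
import numpy as np


def regression_metrics(labels, preds):
    """Return (r2, mae, rmse) for 1d labels and an (N, 2) prediction array.

    Assumption: column 0 of `preds` is the predicted value and column 1
    is the prediction uncertainty.
    """
    y_true = np.asarray(labels, dtype=float)
    y_hat = np.asarray(preds, dtype=float)[:, 0]
    residuals = y_true - y_hat
    ss_res = np.sum(residuals ** 2)                  # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    mae = np.mean(np.abs(residuals))
    rmse = np.sqrt(np.mean(residuals ** 2))
    return float(r2), float(mae), float(rmse)


# Made-up values for illustration only.
labels = np.array([-8.0, -7.0, -6.0, -5.0])
preds = np.array([[-7.5, 0.2], [-7.0, 0.1], [-6.5, 0.3], [-5.0, 0.2]])
r2, mae, rmse = regression_metrics(labels, preds)
```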

schrodinger.active_learning.al_report.make_train_report(ligand_ml_model_file, report_path, iter_num)[source]

Generate a PDF file that records the test set metrics of the ligand_ml model.

Parameters
  • ligand_ml_model_file (str) – ligand_ml .qzip model file.

  • report_path (str) – path of the pdf report

  • iter_num (int) – current iteration number

schrodinger.active_learning.al_report.get_image(path, width=72.0)[source]

Convert an image file to a reportlab image object that keeps the original aspect ratio at the specified width.

Parameters
  • path (str) – path of the image file.

  • width (float) – width of the reportlab image.

Returns

reportlab image

Return type

reportlab.platypus.Image
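The scaling step amounts to deriving the height from the requested width. The sketch below shows only that arithmetic; `scaled_size` is a hypothetical helper and the pixel dimensions are made up. The resulting pair could then be passed as the `width` and `height` arguments of `reportlab.platypus.Image`.

```python
def scaled_size(orig_width, orig_height, width=72.0):
    """Return (width, height) that keeps the original aspect ratio."""
    if orig_width <= 0 or orig_height <= 0:
        raise ValueError("image dimensions must be positive")
    aspect = orig_height / float(orig_width)
    return float(width), width * aspect


# A 4:3 image scaled to the default width.
size = scaled_size(800, 600)
```

reportlab measures lengths in points, so the default width of 72.0 corresponds to one inch.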

schrodinger.active_learning.al_report.get_report_maker(active_learning_job)[source]

Get the report maker that corresponds to the active learning job. Returns None for an evaluate task, since no report is available for it yet.

Parameters

active_learning_job (ActiveLearningJob) – active learning job to be processed.

Returns

corresponding report maker

Return type

ALPilotReportMaker

schrodinger.active_learning.al_report.get_time_cost(nodes, node_name)[source]

Return the time cost of a node. It returns ‘Unavailable’ if the time cost is not available.

Parameters
  • nodes (dict{str: ActiveLearningNode}) – dict that maps node name to node object.

  • node_name (str) – name of the active learning node of interest.

Returns

time cost in h/m/s format.

Return type

str
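A formatter of this kind might look like the sketch below. The exact string layout used by `get_time_cost` is not documented here, so the `'1h 2m 3s'` style, the helper name `format_time_cost`, and the use of `None` to signal a missing time are all assumptions.

```python
def format_time_cost(seconds):
    """Format a duration in seconds as 'Xh Ym Zs', or 'Unavailable'."""
    if seconds is None:
        # Assumption: a missing time cost is represented by None.
        return "Unavailable"
    total = int(round(seconds))
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}h {minutes}m {secs}s"


formatted = format_time_cost(3723)
```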

schrodinger.active_learning.al_report.get_score_pred_as_array(title_to_score, pred_score_file, discard_cutoff, ascending=True)[source]

Return the score, predicted score, and prediction uncertainty of the ligands as an N X 3 numpy array.

Parameters
  • title_to_score (dict(str:float)) – dict that maps ligand title to score.

  • pred_score_file (str) – path of the ligand ml prediction .csv file.

  • discard_cutoff (float) – score cutoff for excluding the ligands in ML training set.

  • ascending (bool) – if True, a lower value means a better ligand.

Returns

numpy array of (num_of_ligands X (score, pred, uncertain))

Return type

N X 3 numpy array
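The shape of the result can be illustrated by joining the score dict with per-ligand predictions. This is a sketch only: `score_pred_array` is a hypothetical helper, the `predictions` dict stands in for the parsed .csv file (whose column layout is not described here), and the `discard_cutoff` and `ascending` handling is deliberately left out because its exact semantics are not specified above.

```python
import numpy as np


def score_pred_array(title_to_score, predictions):
    """Build an (N, 3) array of (score, prediction, uncertainty).

    `predictions` maps ligand title -> (prediction, uncertainty) and is
    an assumed stand-in for the contents of the prediction file.
    """
    rows = []
    for title, (pred, uncertainty) in predictions.items():
        score = title_to_score.get(title)
        if score is None:
            continue  # skip ligands that have no known score
        rows.append((score, pred, uncertainty))
    # reshape keeps the (0, 3) shape even when no ligand matches
    return np.array(rows, dtype=float).reshape(-1, 3)


# Made-up values for illustration only.
title_to_score = {"lig1": -9.1, "lig2": -7.4}
predictions = {"lig1": (-8.8, 0.3), "lig2": (-7.9, 0.5), "lig3": (-6.0, 0.9)}
arr = score_pred_array(title_to_score, predictions)
```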

schrodinger.active_learning.al_report.calculate_recovery_ratio(label_pred, top_ratio)[source]

Calculate the recovery ratio of the best ligands, as ranked by label, within increasing numbers of the top ligands predicted by ligand_ml. A more negative value means a better ligand.

Parameters
  • label_pred ((number of ligands X 2) numpy array) – numpy array that contains the (label, prediction) pairs.

  • top_ratio (float) – top ratio of the ligands by label.

Returns

(screen ratio, recovery ratio of top ligands defined by top_ratio) of all the ligands.

Return type

(number of ligands X 2) numpy array
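One way to compute such a curve is sketched below. It is not the module's implementation: `recovery_curve` is a hypothetical helper, the best set is assumed to be the ceil(top_ratio * N) ligands with the most negative labels, and ligands are assumed to be screened in order of increasing predicted value.

```python
import numpy as np


def recovery_curve(label_pred, top_ratio):
    """Return an (N, 2) array of (screen ratio, recovery ratio).

    Assumptions: lower values are better in both columns, and the best
    set is the ceil(top_ratio * N) ligands with the lowest labels.
    """
    data = np.asarray(label_pred, dtype=float)
    n = len(data)
    n_top = max(1, int(np.ceil(top_ratio * n)))
    # Indices of the true best ligands, ranked by label.
    best = set(np.argsort(data[:, 0], kind="stable")[:n_top].tolist())
    # Screening order: best predicted ligands first.
    order = np.argsort(data[:, 1], kind="stable").tolist()
    # Running count of true best ligands found after screening k ligands.
    hits = np.cumsum([idx in best for idx in order])
    screen_ratio = np.arange(1, n + 1) / n
    return np.column_stack([screen_ratio, hits / n_top])


# Made-up (label, prediction) pairs for illustration only.
label_pred = np.array([[-10.0, -9.0], [-9.0, -6.0], [-8.0, -8.0], [-7.0, -7.0]])
curve = recovery_curve(label_pred, top_ratio=0.5)
```

In this toy data the two best ligands by label are rows 0 and 1, but row 1 has the worst prediction, so full recovery is reached only after every ligand has been screened.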

schrodinger.active_learning.al_report.plot_regression(y_true, y_pred, fname)[source]

Generate a regression plot. This function is slightly modified from ligand_ml/plotting.py to change the axis labels.

Parameters
  • y_true (1d numpy array) – test set label.

  • y_pred (2d numpy array) – ligand_ml prediction and uncertainty

  • fname (str) – filename to save the image

schrodinger.active_learning.al_report.plot_recovery(recovery_results, fname)[source]

Generate and save the recovery plot image.

Parameters
  • recovery_results (dict{float:np.array}) – dict that maps top ratio to the recovery ratio numpy array.

  • fname (str) – path of the saved image.

schrodinger.active_learning.al_report.make_regress_recovery_plots(y_true, y_pred_uncertain, top_ratio_samples, regress_text, recovery_text)[source]

Generate the regression plot and the recovery plot and place both in a table. Also return the recovery results for the sampled top ratios as a dict.

schrodinger.active_learning.al_report.make_recovery_table(recovery_results, screen_ratio_samples)[source]

Generate a list of lists that contains the recovery ratio for each sampled top ratio and screen ratio.

Parameters
  • recovery_results (dict{float:np.array}) – dict that maps top ratio to the recovery ratio numpy array.

  • screen_ratio_samples (list(float)) – list of screen ratios

Returns

table as a list of lists, table caption, largest enrichment in the table.

Return type

list(list(str)), str, float
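The table layout can be sketched as follows. This is an illustration under stated assumptions rather than the module's code: `recovery_at` and `build_table` are hypothetical helpers, each recovery array is assumed to be an (N, 2) array of (screen ratio, recovery ratio) sorted by screen ratio, enrichment is assumed to mean recovery ratio divided by screen ratio, and the caption is omitted.

```python
import numpy as np


def recovery_at(curve, screen_ratio):
    """Recovery ratio at the first row whose screen ratio reaches the sample."""
    idx = int(np.searchsorted(curve[:, 0], screen_ratio, side="left"))
    idx = min(idx, len(curve) - 1)
    return float(curve[idx, 1])


def build_table(recovery_results, screen_ratio_samples):
    """Return (rows, best_enrichment); rows are lists of strings."""
    header = ["top ratio"] + [f"{s:.0%}" for s in screen_ratio_samples]
    rows, best = [header], 0.0
    for top_ratio, curve in sorted(recovery_results.items()):
        row = [f"{top_ratio:.0%}"]
        for s in screen_ratio_samples:
            rec = recovery_at(curve, s)
            best = max(best, rec / s)  # assumed definition of enrichment
            row.append(f"{rec:.2f}")
        rows.append(row)
    return rows, best


# Made-up recovery curve for illustration only.
curve = np.array([[0.25, 0.5], [0.5, 0.5], [0.75, 0.5], [1.0, 1.0]])
rows, best = build_table({0.5: curve}, [0.25, 0.5])
```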

schrodinger.active_learning.al_report.get_conclusion_string(best_enrichment, job_type, high_enrich=10, low_enrich=2)[source]

Return the conclusion string based on the job type and the highest enrichment found in the recovery ratio table.
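The two thresholds suggest a three-way split of the enrichment value. The sketch below shows only that split: `enrichment_band` is a hypothetical helper, the band names are placeholders rather than the actual conclusion text, the inclusive comparisons are an assumption, and the job type is not considered.

```python
def enrichment_band(best_enrichment, high_enrich=10, low_enrich=2):
    """Classify an enrichment value against the two thresholds."""
    if best_enrichment >= high_enrich:
        return "high"
    if best_enrichment >= low_enrich:
        return "moderate"
    return "low"


bands = [enrichment_band(value) for value in (12.0, 5.0, 1.5)]
```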

class schrodinger.active_learning.al_report.ALReportMaker(active_learning_job)[source]

Bases: object

Base class for the different types of AL report makers.

__init__(active_learning_job)[source]

Initialize the report maker for an active learning job

initReport(header)[source]

Initialize the report and add header information

class schrodinger.active_learning.al_report.ALPilotReportMaker(active_learning_job)[source]

Bases: schrodinger.active_learning.al_report.ALReportMaker

__init__(active_learning_job)[source]

Initialize the report maker for an active learning job

report()[source]

Build the report.

addRunDetail()[source]

Add job specifications and running time cost information to the report

addRecoveryResults()[source]

Add the regression plot, recovery plot, recovery table and conclusion to the report.

initReport(header)

Initialize the report and add header information

class schrodinger.active_learning.al_report.ALScreenReportMaker(active_learning_job)[source]

Bases: schrodinger.active_learning.al_report.ALReportMaker

__init__(active_learning_job)[source]

Initialize the report maker for an active learning job

report()[source]

Build the report.

addRunDetail()[source]

Add job specifications and running time cost information to the report

addRecoveryResults()[source]

Add the regression plot, recovery plot, recovery table and conclusion to the report.

initReport(header)

Initialize the report and add header information