schrodinger.test.stu.outcomes.compare_csv module

See compare_csv docstring

$Revision 0.2 $

@copyright: (c) Schrodinger, LLC. All rights reserved

schrodinger.test.stu.outcomes.compare_csv.DEFAULT_TOLERANCE = 0.005

Default tolerance if none is provided.

schrodinger.test.stu.outcomes.compare_csv.help()
schrodinger.test.stu.outcomes.compare_csv.compare_csv(test_file, ref_file, tolerance=None, reltol=None, lines=None, delimiter=None, ignore_cols=None, sort_by=None, skip_rows=None, comment=None)

Workup for comparing two CSV files.

Example use::

outcome_workup = compare_csv('test.csv', 'ref.csv', tolerance=0.05, lines=3, delimiter=' ')

Numeric values will be compared using tolerance or reltol, as described below. Equality is required for values that cannot be cast as floats (strings for example).
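
The comparison rule above can be sketched in plain Python. This is only an illustration, not the module's implementation: the helper name values_match is hypothetical, and it assumes that tolerance is an absolute bound, that reltol is measured relative to the reference value, and that reltol is used when it is given.

```python
DEFAULT_TOLERANCE = 0.005  # mirrors the documented module-level default


def values_match(test_value, ref_value, tolerance=None, reltol=None):
    """Hypothetical helper: compare two CSV cells given as strings."""
    try:
        test_num = float(test_value)
        ref_num = float(ref_value)
    except ValueError:
        # At least one cell cannot be cast to float: require exact equality.
        return test_value == ref_value

    if reltol is not None:
        # Assumption: the relative deviation is measured against the reference.
        return abs(test_num - ref_num) <= reltol * abs(ref_num)

    if tolerance is None:
        # Neither tolerance nor reltol was given: fall back to the default.
        tolerance = DEFAULT_TOLERANCE
    return abs(test_num - ref_num) <= tolerance
```

Under these assumptions, values_match("101", "100", reltol=0.02) passes because the values differ by 1%, while values_match("abc", "abd") fails because strings must be identical.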

The lines in each CSV file are expected to line up (i.e., line 1 of test.csv is compared with line 1 of ref.csv). If a line is missing from test.csv, every subsequent line is compared against the wrong reference line and fails, so one failure message is printed for each line after the gap.
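
To illustrate why a single missing line produces many failures, here is a minimal sketch of a positional comparison. The helper compare_rows_positionally is hypothetical and uses exact row equality for brevity; it is not the module's own logic.

```python
def compare_rows_positionally(test_rows, ref_rows):
    """Hypothetical helper: return 1-based line numbers whose rows differ."""
    mismatches = []
    # Rows are paired strictly by position: line N is compared with line N.
    for line_no, (test_row, ref_row) in enumerate(zip(test_rows, ref_rows), start=1):
        if test_row != ref_row:
            mismatches.append(line_no)
    return mismatches


ref_rows = [["a", "1"], ["b", "2"], ["c", "3"], ["d", "4"]]
# The "b" row is missing from the test data, so later rows shift up by one.
test_rows = [["a", "1"], ["c", "3"], ["d", "4"]]

mismatches = compare_rows_positionally(test_rows, ref_rows)
```

Here every paired line after the gap (lines 2 and 3) is reported as a mismatch, even though the rows themselves are correct.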

Parameters
  • test_file (str) – Filename of csv to be tested.

  • ref_file (str) – Filename of reference csv.

  • tolerance (float|None) – Maximum allowed deviation from the reference CSV for numeric values. If this and reltol are both None, a default value is used (see DEFAULT_TOLERANCE).

  • reltol (float|None) – Maximum allowed deviation from the reference CSV, expressed as a relative value. For example, if this is 0.02, values may differ by up to 2%.

  • lines (int) – Number of lines to compare. The default is to compare all lines and to require that the reference and test files contain the same number of lines.

  • delimiter (str) – Delimiter to use while reading the CSV. The default delimiter is ','.

  • ignore_cols (list) – List of column names to ignore.

  • sort_by (list of str) – Before comparing, sort the ref_file and test_file based on the column name(s) specified in sort_by.

  • skip_rows (list or int) – If a list, the line numbers to skip (0-indexed). If an integer, the number of lines to skip from the start of the file.

  • comment (str) – Indicates that commented lines should not be parsed. If found at the beginning of a line, the line will be ignored altogether. This parameter must be a single character.
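
As a rough illustration of how the preprocessing options above could interact, the sketch below uses only the standard csv module. The function load_rows is hypothetical and is not part of this module; the order in which the options are applied, and the plain string sort, are assumptions made for the example.

```python
import csv


def load_rows(text, delimiter=",", comment=None, skip_rows=None,
              ignore_cols=None, sort_by=None):
    """Hypothetical loader illustrating the documented options."""
    lines = text.splitlines()

    # skip_rows: an int skips that many leading lines,
    # a list skips the given 0-indexed line numbers.
    if isinstance(skip_rows, int):
        lines = lines[skip_rows:]
    elif skip_rows:
        skipped = set(skip_rows)
        lines = [ln for i, ln in enumerate(lines) if i not in skipped]

    # comment: ignore any line that begins with the comment character.
    if comment is not None:
        lines = [ln for ln in lines if not ln.startswith(comment)]

    # The first remaining line is treated as the header row.
    rows = [dict(row) for row in csv.DictReader(lines, delimiter=delimiter)]

    # ignore_cols: drop the named columns before any comparison.
    for col in ignore_cols or []:
        for row in rows:
            row.pop(col, None)

    # sort_by: order rows by the named columns (compared as strings here).
    if sort_by:
        rows.sort(key=lambda row: [row[col] for col in sort_by])
    return rows


text = "# generated\nname,score,time\nb,2.0,10\na,1.0,12\n"
rows = load_rows(text, comment="#", ignore_cols=["time"], sort_by=["name"])
```

With this input, the comment line is dropped, the time column is removed, and the remaining rows are ordered by name before any comparison would take place.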

schrodinger.test.stu.outcomes.compare_csv.compare_dfs(ref_df, test_df, tolerance_args, lines)

Compare the test and reference dfs using the specified tolerance. Return a list of any violations.