silico package

Submodules

silico.analysis module

silico.analysis.df_agg_mean(df, group_cols, raw=False)[source]

Aggregate a dataframe to summarize it with the mean and its error

Parameters:
  • df (pd.DataFrame) – The dataframe.

  • group_cols (list of str) – Columns used as index for the aggregation.

  • raw (bool) – If False, the result is a table of strings representing each number with its error. If True, the columns will have an additional level providing both the mean and its error (sem).

Returns:

The summarizing dataframe

Return type:

pd.DataFrame
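
The raw=True layout described above resembles what a plain pandas group-by produces. The following is a minimal sketch of that idea, not the implementation of df_agg_mean; the column names and data are hypothetical, and dropping the "seed" column before aggregating is an assumption made for the example.

```python
import pandas as pd

# Hypothetical results: two repetitions (seeds) per model.
df = pd.DataFrame({
    "model": ["a", "a", "b", "b"],
    "seed": [0, 1, 0, 1],
    "accuracy": [0.80, 0.90, 0.70, 0.74],
})

# Group by the chosen columns and compute the mean and its standard error.
# The columns get a second level: (metric, "mean") and (metric, "sem").
summary = df.drop(columns="seed").groupby(["model"]).agg(["mean", "sem"])
print(summary)
```

With raw=False, the documented function goes one step further and renders each mean/sem pair as a single string.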

silico.analysis.format_mag_err(mag, err, sep=' ± ', increase=0, increase_ones=True)[source]

Format a magnitude and its error as a string

Parameters:
  • mag (float) – Value of the magnitude

  • err (float) – Value of the associated error

  • sep (str) – Characters to use to join the numbers. Include spaces if needed.

  • increase (int) – A number to increase (or decrease if negative) the number of significant digits.

  • increase_ones (bool) – Whether the number of significant digits increases by one when the leading digit is one.

Returns:

The representation of the magnitude with its error.

Return type:

str
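
To illustrate the kind of output involved, here is a rough sketch of one common convention: round the error to a fixed number of significant digits and print the magnitude with the same number of decimals. The helper format_with_error and its sig_digits parameter are hypothetical; the sketch does not reproduce the increase or increase_ones behaviour of format_mag_err.

```python
import math

def format_with_error(mag, err, sep=" ± ", sig_digits=1):
    """Hypothetical helper: show err with sig_digits digits and mag to match."""
    if err <= 0 or not math.isfinite(err):
        # No meaningful error to derive a precision from; print the values as they are.
        return f"{mag}{sep}{err}"
    # Position of the leading digit of the error (e.g. 0.0123 -> -2).
    exponent = math.floor(math.log10(err))
    # Decimals needed to show the requested digits of the error.
    # Errors of 10 or more fall back to zero decimals in this sketch.
    decimals = max(sig_digits - 1 - exponent, 0)
    return f"{mag:.{decimals}f}{sep}{err:.{decimals}f}"

print(format_with_error(1.23456, 0.0123))
print(format_with_error(12.3456, 0.5))
```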

silico.analysis.paired_t_test(df, col_left, col_right, common_col='seed')[source]

Perform and summarize one-sided (unilateral) paired t-tests to decide whether the difference between two columns is significant.

Consider calling .round(5) or similar on the output for clearer reading.

Parameters:
  • df (pd.DataFrame) – The results of the experiment.

  • col_left (str) – Name of the “left” column to compare

  • col_right (str) – Name of the “right” column to compare

  • common_col (str) – Identifier of the column indexing the repetitions of the experiments.

Returns:

Dataframe with the mean values of the left and right column, as well as the p-values of unilateral tests. p-value-less corresponds to the test with alternative hypothesis col_left < col_right.

Return type:

pd.DataFrame
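
For readers who want to see the underlying statistics, the sketch below runs the two one-sided paired tests directly with SciPy. It is an illustration under assumptions, not the code of paired_t_test: the data and column names are made up, the input is assumed to have one row per seed, and the alternative argument of scipy.stats.ttest_rel requires SciPy 1.6 or later.

```python
import pandas as pd
from scipy import stats

# Hypothetical results: one row per seed, two methods measured on the same seeds.
df = pd.DataFrame({
    "seed": [0, 1, 2, 3, 4],
    "baseline": [0.70, 0.72, 0.68, 0.71, 0.69],
    "improved": [0.74, 0.75, 0.73, 0.74, 0.72],
})

# Align the measurements by seed so that each pair shares a repetition.
paired = df.set_index("seed").sort_index()

# One-sided paired t-tests: "less" tests baseline < improved, "greater" the opposite.
p_less = stats.ttest_rel(paired["baseline"], paired["improved"], alternative="less").pvalue
p_greater = stats.ttest_rel(paired["baseline"], paired["improved"], alternative="greater").pvalue

print(f"mean baseline={paired['baseline'].mean():.3f}, mean improved={paired['improved'].mean():.3f}")
print(f"p-value-less={p_less:.5f}, p-value-greater={p_greater:.5f}")
```

The two p-values are complementary (they add up to one), so a small p-value-less supports col_left < col_right and a small p-value-greater supports the opposite.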

silico.base module

class silico.base.Experiment(variables, f, store, base_name=None, add_stats=True, strategy='grid', mid_point=None)[source]

Bases: object

An experiment

get_result(kwargs)[source]

Get the result of a certain configuration, running it if not available

get_results_df(skip_errors=True)[source]

Get a dataframe with the available results

Parameters:

skip_errors (bool) – Whether to ignore errors. If False, an “_error” column with the trace will be available.

Returns:

The dataframe with the results.

Return type:

pd.DataFrame

invalidate(only_grid=False)[source]

Remove all existing trial data

Parameters:

only_grid (bool) – True to remove only files which correspond to grid values. Otherwise, all .pkl files are removed.

iter_results(skip_errors=True)[source]

Iterate pairs of kwargs, results

If a result is not available, it is skipped. Error behaviour depends on the skip_errors parameter.

Parameters:

skip_errors (bool) – Whether to ignore errors. If False, an “_error” key mapping to the trace will be available.

Yields:

2-tuple of dict – Pairs of kwargs and results of trials.

iter_values()[source]

Iterate all combinations of kwargs
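
As an illustration of what iterating all combinations of kwargs can look like, the sketch below enumerates the Cartesian product of per-variable grids. It assumes that the default strategy='grid' means a full product; the helper iter_grid and the variable names are hypothetical and not part of silico.

```python
from itertools import product

def iter_grid(grids):
    """Yield one kwargs dict per point of the Cartesian product of the grids."""
    names = list(grids)
    for values in product(*(grids[name] for name in names)):
        yield dict(zip(names, values))

# Hypothetical variables and their grid values.
grids = {"lr": [0.1, 0.01], "depth": [2, 4, 8]}
combos = list(iter_grid(grids))
print(len(combos))
print(combos[0])
```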

run_all(method='sequential', threads=2)[source]

Run all trials. Results of trials that have already been run are kept.

status()[source]

Report the status of the experiment

Returns:

A mapping of statistics of the process, including:
  • total: The total number of instances.

  • done: The trials already completed.

  • errors: The number of detected errors in the trials already completed.

Return type:

dict of str

class silico.base.GridVariable(name, grid, standard=None)[source]

Bases: Variable

A variable whose values are defined on some grid points

get_standard()[source]

iter_values()[source]

class silico.base.SubExperiment(original, fixed)[source]

Bases: Experiment

A restriction of an experiment, where some of its variables are fixed

class silico.base.Trial(kwargs, f, base_path='', base_name=None)[source]

Bases: object

A Trial able to provide a result from a dict of parameters

delete()[source]

Remove the stored results of the trial

get_file_name(extension='.pkl')[source]

Get a unique filename for the trial

get_hash()[source]

Get a hash identifying the trial

load()[source]

Load the results of the trial if available

load_or_run(add_stats=True)[source]

Load the results if available; otherwise run the trial, store the results, and return them
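
The load-or-run pattern can be sketched with the standard library alone. This is a simplified illustration, not the Trial implementation: the helpers kwargs_hash and load_or_run are hypothetical, the hashing scheme (SHA-1 of the JSON-serialized kwargs) is an assumption, and only the pickle file is written here, without any database.

```python
import hashlib
import json
import os
import pickle
import tempfile

def kwargs_hash(kwargs):
    """Hash a dict of JSON-serializable parameters, independent of key order."""
    payload = json.dumps(kwargs, sort_keys=True).encode("utf-8")
    return hashlib.sha1(payload).hexdigest()

def load_or_run(f, kwargs, base_path):
    """Return the cached result for kwargs, computing and pickling it if missing."""
    path = os.path.join(base_path, kwargs_hash(kwargs) + ".pkl")
    if os.path.exists(path):
        with open(path, "rb") as fh:
            return pickle.load(fh)
    result = f(**kwargs)
    with open(path, "wb") as fh:
        pickle.dump(result, fh)
    return result

calls = []

def simulate(a, b):
    calls.append((a, b))  # record every real execution
    return {"sum": a + b}

with tempfile.TemporaryDirectory() as tmp:
    first = load_or_run(simulate, {"a": 1, "b": 2}, tmp)
    second = load_or_run(simulate, {"b": 2, "a": 1}, tmp)  # same trial, served from disk

print(first, second, len(calls))
```

Sorting the keys before hashing makes the file name depend only on the parameter values, so the same configuration maps to the same stored result.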

run()[source]

Execute the trial

run_and_save(add_stats=True)[source]

Execute the trial and store the results as a pickle and in the db

class silico.base.Variable(name, standard=None)[source]

Bases: object

A variable taking part in an experiment

iter_values()[source]

silico.base.ensure_dir_exists(path)[source]

Ensure a directory exists, creating it if needed.

Parameters:

path (str) – The path to the directory.

silico.base.implicit_variable_cast(variable)[source]

silico.cli module

silico.cli.get_experiment(file, experiment=None, report=True)[source]

Load an experiment from a script where it is defined

silico.common module

silico.common.is_notebook()[source]

Detect if running in a notebook
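
A widely used recipe for this kind of check inspects the class of the active IPython shell. The sketch below shows that recipe; whether is_notebook works this way is an assumption, and running_in_notebook is a hypothetical name.

```python
def running_in_notebook():
    """Best-effort check for a Jupyter kernel (hypothetical helper)."""
    try:
        from IPython import get_ipython
    except ImportError:
        return False  # IPython is not installed, so this cannot be a notebook
    shell = get_ipython()
    # Jupyter kernels use ZMQInteractiveShell; terminal IPython uses another class,
    # and get_ipython() returns None in a plain Python interpreter.
    return shell is not None and type(shell).__name__ == "ZMQInteractiveShell"

print(running_in_notebook())
```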

silico.common.set_kwargs(f, fixed_kwargs)[source]

Closure of a function fixing some kwargs
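
The standard library offers comparable behaviour through functools.partial, shown here as a point of reference. Whether set_kwargs allows the fixed values to be overridden at call time is not stated, so this sketch makes no claim about that; the function power is made up for the example.

```python
from functools import partial

def power(base, exponent=2):
    return base ** exponent

# Fix `exponent` once; the returned callable only needs `base`.
cube = partial(power, exponent=3)
print(cube(2))
```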

silico.metrics module

silico.metrics.get_classification_metrics(y, predictions, classes=None)[source]

silico.metrics.plot_confusion_matrix(m, labels=None, figure_kwargs=None, normalize=None)[source]

Parameters:
  • m (list of list of float) – The confusion matrix.

  • labels (list of str) – Names of the classes in the order of the confusion matrix.

  • figure_kwargs (dict) – Parameters to create the Figure.

  • normalize (str) – A normalization method to use with the matrix:

      - None: No normalization (number of instances).
      - “row”: Normalize by rows (true in scikit-learn convention).
      - “col”: Normalize by columns (predicted in scikit-learn convention).
      - “all”: Normalize by total instances.

Returns:

An axes instance for further tweaking.

Return type:

Axes
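
The normalization options can be made concrete with a few lines of NumPy. This is a sketch of the arithmetic only, with a hypothetical helper normalize_confusion; it assumes rows hold the true classes and columns the predicted ones, and it does not handle rows or columns that sum to zero.

```python
import numpy as np

def normalize_confusion(m, normalize=None):
    """Normalize a confusion matrix by "row", "col", "all", or not at all (None)."""
    m = np.asarray(m, dtype=float)
    if normalize is None:
        return m  # raw number of instances
    if normalize == "row":
        return m / m.sum(axis=1, keepdims=True)  # each row sums to 1
    if normalize == "col":
        return m / m.sum(axis=0, keepdims=True)  # each column sums to 1
    if normalize == "all":
        return m / m.sum()  # the whole matrix sums to 1
    raise ValueError(f"Unknown normalization: {normalize!r}")

m = [[8, 2], [1, 9]]
print(normalize_confusion(m, "row"))
```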

silico.ml module

Common sklearn predictors

silico.plot module

silico.plot.highlight_max(data, levels=(0, 1), color='red')[source]

Highlight the maximum in a pandas dataframe.

Use with df.style.apply(highlight_max,axis=<behavior>). Set axis=0 or axis=1 for per column/per row highlighting. Set axis=None with some levels set (e.g., (0, 1)) to highlight on some levels of a multiindex.

Parameters:
  • data – Dataframe or series to highlight.

  • levels (tuple of int) – Levels to highlight by. Ignored if data is a series.

  • color – Color to set

Returns:

silico.plot.highlight_threshold(s, threshold, column, greater=True, color='red')[source]

Highlight rows such that a value is greater (or less) than a threshold

Typical use:

    df_out.style.apply(
        highlight_threshold, threshold=0.1, greater=False, column=["p-value-less"], axis=1, color="green"
    ).apply(
        highlight_threshold, threshold=0.1, greater=False, column=["p-value-greater"], axis=1, color="red"
    )

Parameters:
  • s – Series to highlight

  • threshold – Value used as threshold.

  • column – Name of the column to check.

  • greater (bool) – If True, highlight values greater than the threshold; otherwise, highlight values less than it.

  • color – Color to set

Returns:

silico.urinal module

Urinal protocol iteration

silico.urinal.distance_matrix(indices, shape, p=2)[source]

Returns an array whose elements are the p-norm distances to the point with the given indices.

Parameters:
  • indices (tuple of int) – The indices of the point.

  • shape (tuple of int) – The shape of the array.

  • p (int) – The p-norm to use (defaults to 2, i.e. Euclidean)

Returns:

An array of the given shape whose elements are the p-norm distances to the point of the given indices.
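
One way to compute such an array with NumPy is sketched below. It is an illustration under assumptions rather than the library's code; the name distance_matrix_sketch is hypothetical.

```python
import numpy as np

def distance_matrix_sketch(indices, shape, p=2):
    """Array of `shape` holding the p-norm distance of each cell to `indices`."""
    # One coordinate grid per axis, each with the requested shape.
    grids = np.indices(shape)
    # Absolute offset of every cell from the reference point, axis by axis.
    offsets = [np.abs(g - i).astype(float) for g, i in zip(grids, indices)]
    return np.sum([o ** p for o in offsets], axis=0) ** (1.0 / p)

d = distance_matrix_sketch((0, 0), (2, 3), p=2)
print(d)
```

The cell at the given indices always holds 0, and with p=1 the values reduce to Manhattan distances.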

silico.urinal.urinal_iteration(dims, p=1)[source]

Yield the coordinates of urinal-like iteration
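
The "urinal protocol" is commonly understood as always choosing the free position farthest from every occupied one. The sketch below implements that greedy rule on a grid. The starting point, the tie-breaking rule (first cell in row-major order), and the name urinal_order are all assumptions, so the order may differ from what urinal_iteration yields.

```python
import numpy as np

def urinal_order(dims, p=1):
    """Yield grid coordinates, each time picking the cell farthest from those visited."""
    grids = np.indices(dims).astype(float)
    # Distance from every cell to its nearest visited cell (infinite while none is visited).
    nearest = np.full(dims, np.inf)
    for _ in range(int(np.prod(dims))):
        # argmax returns the first maximum in row-major order, which breaks ties.
        point = np.unravel_index(np.argmax(nearest), dims)
        yield tuple(int(i) for i in point)
        # p-norm distance from the new point to every cell.
        dist = sum(np.abs(g - i) ** p for g, i in zip(grids, point)) ** (1.0 / p)
        nearest = np.minimum(nearest, dist)

order = list(urinal_order((5,)))
print(order)
```

Visited cells end up at distance 0, so they are never chosen again while unvisited cells remain, and every cell is yielded exactly once.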

Module contents

silico - Python package to handle in silico experiments