silico package

Submodules

silico.analysis module

silico.analysis.df_agg_mean(df, group_cols, raw=False)[source]

Aggregate a dataframe to summarize it with the mean and its error

Parameters:

df (pd.DataFrame) – The dataframe.
group_cols (list of str) – Columns used as index for the aggregation.
raw (bool) – If False, the result is a table of strings representing each number with its error. If True, the columns will have an additional level providing both the mean and its error (sem).

Returns:

The summarizing dataframe

Return type:

pd.DataFrame
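
The kind of aggregation described above can be sketched with plain pandas. This is a minimal illustration, not silico's implementation: the data and column names are made up, and it only mirrors the two-level column layout that raw=True is documented to produce.

```python
import pandas as pd

# Hypothetical results: accuracy of two models over three seeds.
df = pd.DataFrame({
    "model": ["a", "a", "a", "b", "b", "b"],
    "seed": [0, 1, 2, 0, 1, 2],
    "accuracy": [0.80, 0.82, 0.84, 0.70, 0.75, 0.80],
})

# Mean and standard error of the mean (sem) per group. The result has
# two column levels: (value column, statistic).
summary = df.groupby(["model"])[["accuracy"]].agg(["mean", "sem"])
print(summary)
```

Here group_cols would be ["model"]; the seed column is left out of the aggregation on purpose, since it only indexes the repetitions.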

silico.analysis.format_mag_err(mag, err, sep=' ± ', increase=0, increase_ones=True)[source]

Format a magnitude and its error as a string

Parameters:

mag (float) – Value of the magnitude
err (float) – Value of the associated error
sep (str) – Characters to use to join the numbers. Include spaces if needed.
increase (int) – A number to increase (or decrease if negative) the number of significant digits.
increase_ones (bool) – Whether the number of significant digits increases by one when the leading digit is one.

Returns:

The representation of the magnitude with its error.

Return type:

str
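
One common convention for this kind of formatting is to keep one significant digit of the error (two when its leading digit is one) and round the magnitude to the same decimal place. The sketch below follows that convention; format_with_error is a hypothetical helper written for illustration, and silico's actual rounding rules may differ.

```python
import math


def format_with_error(mag, err, sep=" ± ", increase=0, increase_ones=True):
    """Hypothetical helper, not silico's implementation."""
    if err <= 0:
        # No meaningful error: fall back to the plain representation.
        return f"{mag}{sep}{err}"
    exponent = math.floor(math.log10(err))  # decimal position of the leading digit
    leading = int(err / 10 ** exponent)     # leading digit of the error
    digits = 1 + increase
    if increase_ones and leading == 1:
        digits += 1
    decimals = digits - 1 - exponent        # may be negative for large errors
    shown = max(decimals, 0)
    mag_r = round(mag, decimals)
    err_r = round(err, decimals)
    return f"{mag_r:.{shown}f}{sep}{err_r:.{shown}f}"


print(format_with_error(1.23456, 0.0321))
print(format_with_error(1.23456, 0.0123))
print(format_with_error(1234.5, 32))
```

The second call keeps an extra digit because the leading digit of the error is one, which is the behaviour the increase_ones parameter is documented to control.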

silico.analysis.paired_t_test(df, col_left, col_right, common_col='seed')[source]

Perform and summarize unilateral (one-sided) t-tests to decide whether a difference between two columns is significant.

Consider calling .round(5) or similar on the output for clearer reading.

Parameters:

df (pd.DataFrame) – The results of the experiment.
col_left (str) – Name of the “left” column to compare
col_right (str) – Name of the “right” column to compare
common_col (str) – Identifier of the column indexing the repetitions of the experiments.

Returns:

Dataframe with the mean values of the left and right columns, as well as the p-values of the unilateral tests. p-value-less corresponds to the test with alternative hypothesis col_left < col_right.

Return type:

pd.DataFrame
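
The statistic behind a paired test can be worked out by hand. The sketch below uses only the standard library and made-up scores; it shows the computation on per-seed differences and is not a description of how paired_t_test is implemented.

```python
import math
import statistics

# Hypothetical scores of two methods evaluated on the same five seeds.
left = [0.70, 0.72, 0.68, 0.71, 0.69]
right = [0.80, 0.83, 0.79, 0.80, 0.81]

# A paired test works on the per-seed differences, not on the raw columns.
diffs = [a - b for a, b in zip(left, right)]
n = len(diffs)
mean_diff = statistics.mean(diffs)
sem_diff = statistics.stdev(diffs) / math.sqrt(n)

# t statistic with n - 1 degrees of freedom. A strongly negative value
# supports the alternative hypothesis left < right.
t_stat = mean_diff / sem_diff
print(f"mean difference = {mean_diff:.3f}, t = {t_stat:.2f}, dof = {n - 1}")
```

The one-sided p-value is the corresponding tail probability of Student's t distribution; scipy.stats.ttest_rel computes it directly through its alternative argument.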

silico.base module

class silico.base.Experiment(variables, f, store, base_name=None, add_stats=True, strategy='grid', mid_point=None)[source]

Bases: object

An experiment

get_result(kwargs)[source]: Get the result of a certain configuration, running it if not available

get_results_df(skip_errors=True)[source]

Get a dataframe with the available results

Parameters:

skip_errors (bool) – Whether to ignore errors. If False, an “_error” column with the trace will be available.

Returns:

The dataframe with the results.

Return type:

pd.DataFrame

invalidate(only_grid=False)[source]

Remove all existing trial data

Parameters:

only_grid (bool) – True to remove only files which correspond to grid values. Otherwise, all .pkl files are removed.

iter_results(skip_errors=True)[source]

Iterate pairs of kwargs, results

If a result is not available, it is skipped. Error behaviour depends on the skip_errors parameter.

Parameters:

skip_errors (bool) – Whether to ignore errors. If False, an “_error” key mapping to the trace will be available.

Yields:

2-tuple of dict – Pairs of kwargs and results of trials.

iter_values()[source]: Iterate all combinations of kwargs

run_all(method='sequential', threads=2)[source]: Run all trials. Results of trials that have already been run are kept.

status()[source]

Report the status of the experiment

Returns:

A mapping of statistics of the process, including:

total: The total number of instances.
done: The trials already completed.
errors: The number of detected errors in the trials already completed.

Return type:

dict of str

class silico.base.GridVariable(name, grid, standard=None)[source]

Bases: Variable

A variable whose values are defined on some grid points

get_standard()[source]

iter_values()[source]

class silico.base.SubExperiment(original, fixed)[source]

Bases: Experiment

A restriction of an experiment, where some of its variables are fixed

class silico.base.Trial(kwargs, f, base_path='', base_name=None)[source]

Bases: object

A Trial able to provide a result from a dict of parameters

delete()[source]: Remove the stored results of the trial

get_file_name(extension='.pkl')[source]: Get a unique filename for the trial

get_hash()[source]: Get a hash identifying the trial

load()[source]: Load the results of the trial if available

load_or_run(add_stats=True)[source]: Load the results if available; otherwise run the trial, store the results, and return them

run()[source]: Execute the trial

run_and_save(add_stats=True)[source]: Execute the trial and store the results as a pickle and in the db
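
The load-or-run pattern that these methods describe can be sketched as a pickle cache keyed by a hash of the parameters. Everything below (kwargs_hash, load_or_run as a free function, the hashing scheme, the file layout) is an assumption made for illustration and is not taken from silico's source.

```python
import hashlib
import json
import os
import pickle
import tempfile


def kwargs_hash(kwargs):
    """Stable hash of a parameter dict (keys are sorted, so order is irrelevant)."""
    payload = json.dumps(kwargs, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:16]


def load_or_run(f, kwargs, base_path):
    """Return the cached result for kwargs, computing and storing it if absent."""
    path = os.path.join(base_path, kwargs_hash(kwargs) + ".pkl")
    if os.path.exists(path):
        with open(path, "rb") as handle:
            return pickle.load(handle)
    result = f(**kwargs)
    with open(path, "wb") as handle:
        pickle.dump(result, handle)
    return result


calls = []


def slow_square(x):
    calls.append(x)  # record every real execution
    return x * x


with tempfile.TemporaryDirectory() as store:
    first = load_or_run(slow_square, {"x": 3}, store)
    second = load_or_run(slow_square, {"x": 3}, store)  # served from disk

print(first, second, len(calls))
```

Hashing the sorted JSON form keeps the file name independent of the order in which the parameters were given, which is one simple way to obtain a unique name per configuration.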

class silico.base.Variable(name, standard=None)[source]

Bases: object

A variable taking part in an experiment

iter_values()[source]

silico.base.ensure_dir_exists(path)[source]

Ensure a directory exists, creating it if needed.

Parameters:

path (str) – The path to the directory.
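
In the standard library, the same effect is usually obtained with os.makedirs and exist_ok=True. The ensure_dir helper below is a hypothetical stand-in used to show the behaviour, not the function documented above.

```python
import os
import tempfile


def ensure_dir(path):
    """Create path (and any missing parents) if it does not exist yet."""
    os.makedirs(path, exist_ok=True)
    return path


with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, "results", "run1")
    ensure_dir(target)
    ensure_dir(target)  # the second call is a no-op
    created = os.path.isdir(target)

print(created)
```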

silico.base.implicit_variable_cast(variable)[source]

silico.cli module

silico.cli.get_experiment(file, experiment=None, report=True)[source]: Load an experiment from a script where it is defined

silico.common module

silico.common.is_notebook()[source]: Detect if running in a notebook

silico.common.set_kwargs(f, fixed_kwargs)[source]: Closure of a function fixing some kwargs
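
Fixing some keyword arguments of a function is what functools.partial does in the standard library. The sketch below illustrates the idea with a made-up power function; whether set_kwargs is built on functools.partial is not stated in this reference.

```python
import functools


def power(base, exponent=2, scale=1.0):
    return scale * base ** exponent


# Fix some keyword arguments; the remaining ones stay free.
cube = functools.partial(power, exponent=3)

print(cube(2))            # scale keeps its default
print(cube(2, scale=10))  # other kwargs can still be passed
```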

silico.metrics module

silico.metrics.get_classification_metrics(y, predictions, classes=None)[source]

silico.metrics.plot_confusion_matrix(m, labels=None, figure_kwargs=None, normalize=None)[source]

Parameters:

m (list of list of float) – The confusion matrix.
labels (list of str) – Names of the classes in the order of the confusion matrix.
figure_kwargs (dict) – Parameters to create the Figure.
normalize (str) – A normalization method to use with the matrix:
None: No normalization (number of instances).
“row”: Normalize by rows (true in scikit-learn convention).
“col”: Normalize by columns (predicted in scikit-learn convention).
“all”: Normalize by total instances.

Returns:

An axes instance for further tweaking.

Return type:

Axes
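
The four normalization options can be illustrated on a small matrix without any plotting. normalize_matrix is a hypothetical helper that mirrors the documented option names; it assumes every row and column has a non-zero sum and says nothing about how the plotting function handles that case.

```python
def normalize_matrix(m, normalize=None):
    """Hypothetical helper mirroring the four documented options."""
    if normalize is None:
        return [list(row) for row in m]
    if normalize == "row":
        # Each row (true class) sums to 1.
        return [[v / sum(row) for v in row] for row in m]
    if normalize == "col":
        # Each column (predicted class) sums to 1.
        col_sums = [sum(col) for col in zip(*m)]
        return [[v / s for v, s in zip(row, col_sums)] for row in m]
    if normalize == "all":
        # The whole matrix sums to 1.
        total = sum(sum(row) for row in m)
        return [[v / total for v in row] for row in m]
    raise ValueError(f"unknown normalization: {normalize!r}")


m = [[8, 2], [1, 9]]
print(normalize_matrix(m, "row"))
print(normalize_matrix(m, "all"))
```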

silico.ml module

Common sklearn predictors

silico.plot module

silico.plot.highlight_max(data, levels=(0, 1), color='red')[source]

Highlight the maximum in a pandas dataframe.

Use with df.style.apply(highlight_max,axis=<behavior>). Set axis=0 or axis=1 for per column/per row highlighting. Set axis=None with some levels set (e.g., (0, 1)) to highlight on some levels of a multiindex.

Parameters:

data – Dataframe or series to highlight.
levels (tuple of int) – Levels to highlight by. Ignored if data is a series.
color – Color to set


silico.plot.highlight_threshold(s, threshold, column, greater=True, color='red')[source]

Highlight rows such that a value is greater (or less) than a threshold

Typical use:

    df_out.style.apply(
        highlight_threshold, threshold=0.1, greater=False, column=["p-value-less"], axis=1, color="green"
    ).apply(
        highlight_threshold, threshold=0.1, greater=False, column=["p-value-greater"], axis=1, color="red"
    )

Parameters:

s – Series to highlight
threshold – Value used as threshold.
column – Name of the column to check.
greater (bool) – Whether to highlight values greater than the threshold, otherwise highlighting lesser.
color – Color to set


silico.urinal module

Urinal protocol iteration

silico.urinal.distance_matrix(indices, shape, p=2)[source]

Returns an array whose elements are the p-norm distances to the point with the given indices.

Parameters:

indices (tuple of int) – The indices of the point.
shape (tuple of int) – The shape of the array.
p (int) – The p-norm to use (defaults to 2, i.e. Euclidean)

Returns:

An array of the given shape whose elements are the p-norm distances to the point of the given indices.
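
A distance array of this kind can be built with NumPy from per-axis coordinate grids. distance_matrix_sketch is a hypothetical re-creation based only on the description above, so details such as the dtype or the handling of special values of p may differ from the real function.

```python
import numpy as np


def distance_matrix_sketch(indices, shape, p=2):
    """p-norm distance from every cell of an array to the cell at indices."""
    grids = np.indices(shape)  # one coordinate array per axis
    total = np.zeros(shape, dtype=float)
    for axis_coords, center in zip(grids, indices):
        total += np.abs(axis_coords - center) ** p
    return total ** (1.0 / p)


d = distance_matrix_sketch((0, 0), (3, 3), p=2)
print(d)
```

With p=1 the same sketch gives Manhattan distances, for example 3 for the cell (1, 2) measured from (0, 0).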

silico.urinal.urinal_iteration(dims, p=1)[source]: Yield the coordinates of urinal-like iteration
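
A urinal-like ordering visits points so that each new point is as far as possible from all points already visited. The sketch below is one greedy way to produce such an order. It assumes a start at the origin and ties broken by plain iteration order; both are choices made for illustration, and the order yielded by urinal_iteration itself may differ.

```python
import itertools


def urinal_order(dims, p=1):
    """Yield all grid coordinates, each time picking the free point whose
    nearest already-taken point is farthest away (in p-norm).

    Assumptions of this sketch: start at the origin, break ties by
    iteration order.
    """
    def dist(a, b):
        return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

    free = list(itertools.product(*(range(n) for n in dims)))
    taken = [free.pop(0)]
    yield taken[0]
    while free:
        # max() keeps the first maximal element, which gives the tie-break.
        best = max(free, key=lambda c: min(dist(c, t) for t in taken))
        free.remove(best)
        taken.append(best)
        yield best


print(list(urinal_order((5,))))
```

On a row of five positions the sketch visits the two ends first, then the middle, then the remaining gaps. The greedy search is quadratic in the number of points, which is acceptable for small grids.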

Module contents

silico - Python package to handle in silico experiments