Utilities

In addition to dataset fetching, scikit-datasets provides utility functions that make dataset-related tasks easier, such as launching experiments and formatting their scores.

Estimator

The following functions are related to estimators that follow the scikit-learn API.

json2estimator(estimator, **kwargs)

Instantiate a scikit-learn estimator from a JSON file.
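To illustrate the general idea of building an object from a JSON description, here is a minimal sketch. It is not the implementation of json2estimator: the helper name instantiate_from_json and the JSON layout (a "class" key holding a dotted path and a "params" key holding constructor arguments) are assumptions made for this example. A standard-library class stands in for an estimator so the sketch runs without scikit-learn installed.

```python
import importlib
import json


def instantiate_from_json(text, **overrides):
    """Build an object from a JSON spec (hypothetical layout, see lead-in).

    Expected layout: {"class": "package.module.Name", "params": {...}}.
    Keyword arguments passed as overrides replace entries in "params".
    """
    spec = json.loads(text)
    # Split "package.module.Name" into the module path and the class name.
    module_name, _, class_name = spec["class"].rpartition(".")
    cls = getattr(importlib.import_module(module_name), class_name)
    params = {**spec.get("params", {}), **overrides}
    return cls(**params)


# Stand-in target from the standard library. With scikit-learn installed,
# "class" could instead name an estimator such as "sklearn.svm.SVC" and
# "params" its constructor arguments.
spec = '{"class": "fractions.Fraction", "params": {"numerator": 3, "denominator": 4}}'
obj = instantiate_from_json(spec)
overridden = instantiate_from_json(spec, denominator=8)
```

Passing keyword arguments mirrors the **kwargs in the signature above only loosely; how json2estimator actually treats them should be checked against its own documentation.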

Experiment

The following functions can be used to run several experiments, such as classification or regression tasks, on different datasets for later comparison. The experiments are created with the Sacred library, which stores the most common parameters of interest, such as the time required for training and the final scores. Once the experiments have finished, the final scores can be retrieved to build a table or to perform hypothesis testing.

create_experiments(*, datasets, estimators, ...)

Create several Sacred experiments.

run_experiments(experiments)

Run Sacred experiments.

fetch_scores(*, storage[, ids, ...])

Fetch scores from Sacred experiments.

ScoresInfo(dataset_names, estimator_names, ...)

Class containing the scores of several related experiments.
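The workflow described in this section (evaluate every estimator on every dataset, recording scores and elapsed time) can be sketched in plain Python. This is only an illustration of the pattern under stated assumptions: it does not use Sacred or the functions listed above, and the names run_grid, MajorityClassifier and ThresholdClassifier, as well as the toy data, are hypothetical.

```python
import time
from collections import Counter


class MajorityClassifier:
    """Toy estimator: always predicts the most frequent training label."""

    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


class ThresholdClassifier:
    """Toy estimator: predicts 1 when the feature exceeds the training mean."""

    def fit(self, X, y):
        self.threshold_ = sum(x[0] for x in X) / len(X)
        return self

    def predict(self, X):
        return [int(x[0] > self.threshold_) for x in X]


def run_grid(datasets, estimators, n_folds=2):
    """Evaluate every estimator on every dataset with interleaved k-fold splits."""
    results = {}
    for data_name, (X, y) in datasets.items():
        for est_name, make_estimator in estimators.items():
            scores = []
            start = time.perf_counter()
            for fold in range(n_folds):
                # Sample i belongs to the test set of fold (i % n_folds).
                train = [i for i in range(len(X)) if i % n_folds != fold]
                test = [i for i in range(len(X)) if i % n_folds == fold]
                model = make_estimator().fit(
                    [X[i] for i in train], [y[i] for i in train]
                )
                predicted = model.predict([X[i] for i in test])
                hits = sum(p == y[i] for p, i in zip(predicted, test))
                scores.append(hits / len(test))
            results[data_name, est_name] = {
                "scores": scores,  # one accuracy value per fold
                "elapsed": time.perf_counter() - start,  # seconds, all folds
            }
    return results


# Made-up toy datasets: one feature per sample, binary labels.
datasets = {
    "separable": ([[0], [1], [2], [3], [10], [11], [12], [13]],
                  [0, 0, 0, 0, 1, 1, 1, 1]),
    "imbalanced": ([[0], [1], [2], [3], [4], [5]], [0, 0, 0, 0, 0, 1]),
}
estimators = {"majority": MajorityClassifier, "threshold": ThresholdClassifier}
results = run_grid(datasets, estimators)
```

Keying the results by (dataset, estimator) pairs keeps them easy to reshape into a datasets-by-estimators table afterwards.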

Scores

The following functions can be used to format and display the scores of machine learning or hypothesis testing experiments.

scores_table(scores[, stds, nobs, ...])

Scores table.

hypotheses_table(samples, models, *[, ...])

Hypotheses table.
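As a rough illustration of what formatting scores as a table involves, the sketch below renders the mean and standard deviation of per-fold scores with datasets as rows and estimators as columns. It is a standalone example, not the output format of scores_table: the helper format_scores_table, its arguments, and all numbers are made up for the example.

```python
from statistics import mean, stdev


def format_scores_table(scores, datasets, estimators, precision=2):
    """Render "mean +/- std" per (dataset, estimator) as aligned plain text.

    scores[i][j] is the list of per-fold scores of estimator j on dataset i.
    """
    header = ["dataset"] + list(estimators)
    rows = [header]
    for i, dataset in enumerate(datasets):
        cells = [dataset]
        for j in range(len(estimators)):
            folds = scores[i][j]
            cells.append(
                f"{mean(folds):.{precision}f} +/- {stdev(folds):.{precision}f}"
            )
        rows.append(cells)
    # Pad every column to the width of its longest cell.
    widths = [max(len(row[k]) for row in rows) for k in range(len(header))]
    return "\n".join(
        "  ".join(cell.ljust(w) for cell, w in zip(row, widths)).rstrip()
        for row in rows
    )


# Made-up per-fold scores: 2 datasets x 2 estimators x 3 folds.
dataset_names = ["dataset_a", "dataset_b"]
estimator_names = ["estimator_x", "estimator_y"]
fold_scores = [
    [[0.9, 1.0, 0.95], [0.9, 0.9, 0.9]],
    [[0.8, 0.7, 0.75], [0.6, 0.8, 0.7]],
]
table = format_scores_table(fold_scores, dataset_names, estimator_names)
print(table)
```

Reporting the spread next to the mean makes it easier to judge whether a difference between two estimators is large relative to fold-to-fold variation, which is the question a hypothesis test then answers formally.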