Example: Pause and Continue

This notebook shows how to run two different execution functions for a single experiment: the first sets the experiment status to paused, and the experiment is later unpaused with a second execution function. The example is heavily based on the example_general_usage.ipynb notebook.

To execute this notebook you need to install:

pip install py_experimenter
pip install scikit-learn

Experiment Configuration File

First, we define an experiment configuration file. In comparison to the basic example, two resultfields, paused_at_seconds and resumed_at_seconds, were added.

[1]:

import os

content = """
PY_EXPERIMENTER:
  n_jobs: 1

  Database:
    provider: sqlite
    database: py_experimenter
    table:
      name: example_pause_and_continue
      keyfields:
        dataset:
          type: VARCHAR(50)
          values: [iris]
        cross_validation_splits:
          type: INT
          values: [5]
        seed:
          type: INT
          values:
            start: 2
            stop: 7
            step: 2
        kernel:
          type: VARCHAR(50)
          values: [linear, poly, rbf, sigmoid]
      resultfields:
        pipeline: LONGTEXT
        train_f1: DECIMAL
        train_accuracy: DECIMAL
        test_f1: DECIMAL
        test_accuracy: DECIMAL
        paused_at_seconds: DOUBLE
        resumed_at_seconds: DOUBLE

  CUSTOM:
    path: sample_data

  CODECARBON:
    offline_mode: False
    measure_power_secs: 25
    tracking_mode: process
    log_level: error
    save_to_file: True
    output_dir: output/CodeCarbon
"""
# Create config directory if it does not exist
if not os.path.exists('config'):
    os.mkdir('config')

# Create config file
experiment_configuration_file_path = os.path.join('config', 'example_pause_and_continue.yml')
with open(experiment_configuration_file_path, "w") as f:
    f.write(content)
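
The keyfields above span a grid of experiments, one per combination of values. As a rough sketch, assuming the start/stop/step block for seed expands like Python's range (which matches the seeds 2, 4, and 6 that appear in the tables of this notebook), the number of rows created when the table is filled can be estimated as follows. The variable names are illustrative and not part of py_experimenter.

```python
from itertools import product

# Keyfield values as declared in the configuration above.
datasets = ["iris"]
cross_validation_splits = [5]
# Assumption: start/stop/step behaves like range(start, stop, step).
seeds = list(range(2, 7, 2))
kernels = ["linear", "poly", "rbf", "sigmoid"]

# One experiment per combination of keyfield values.
grid = list(product(datasets, cross_validation_splits, seeds, kernels))

print(seeds)
print(len(grid))
```

The resulting count of 12 agrees with the "12 rows successfully added to database" log message in the next cell.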

Defining Pausing Execution Function

Next, we fill the table, define an execution function that pauses the experiment after five seconds, and run it for a single experiment.

[2]:

import datetime

from py_experimenter.experimenter import ExperimentStatus, PyExperimenter
from py_experimenter.result_processor import ResultProcessor

experimenter = PyExperimenter(experiment_configuration_file_path=experiment_configuration_file_path, name='example_notebook')

experimenter.fill_table_from_config()

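def pause_after_5_seconds(parameters: dict, result_processor: ResultProcessor, custom_config: dict):
    import time

    # Simulate the first phase of work.
    time.sleep(5)
    # Record the time at which the experiment is paused.
    result_processor.process_results({
        'paused_at_seconds': datetime.datetime.now().timestamp()
    })
    # Returning ExperimentStatus.PAUSED leaves the experiment in status paused.
    return ExperimentStatus.PAUSED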



experimenter.execute(pause_after_5_seconds, max_experiments=1)

2024-03-11 08:18:43,065  | py-experimenter - INFO     | Found 4 keyfields
2024-03-11 08:18:43,066  | py-experimenter - INFO     | Found 7 resultfields
2024-03-11 08:18:43,067  | py-experimenter - WARNING  | No logtables given
2024-03-11 08:18:43,067  | py-experimenter - WARNING  | No custom section defined in config
2024-03-11 08:18:43,068  | py-experimenter - WARNING  | No codecarbon section defined in config
2024-03-11 08:18:43,069  | py-experimenter - INFO     | Initialized and connected to database
2024-03-11 08:18:43,105  | py-experimenter - INFO     | 12 rows successfully added to database. 0 rows were skipped.
[codecarbon INFO @ 08:18:43] [setup] RAM Tracking...
[codecarbon INFO @ 08:18:43] [setup] GPU Tracking...
[codecarbon INFO @ 08:18:43] No GPU found.
[codecarbon INFO @ 08:18:43] [setup] CPU Tracking...
[codecarbon WARNING @ 08:18:43] No CPU tracking mode found. Falling back on CPU constant mode.
[codecarbon WARNING @ 08:18:44] We saw that you have a 12th Gen Intel(R) Core(TM) i7-1260P but we don't know it. Please contact us.
[codecarbon INFO @ 08:18:44] CPU Model on constant consumption mode: 12th Gen Intel(R) Core(TM) i7-1260P
[codecarbon INFO @ 08:18:44] >>> Tracker's metadata:
[codecarbon INFO @ 08:18:44]   Platform system: Linux-5.15.146.1-microsoft-standard-WSL2-x86_64-with-glibc2.35
[codecarbon INFO @ 08:18:44]   Python version: 3.9.0
[codecarbon INFO @ 08:18:44]   CodeCarbon version: 2.3.4
[codecarbon INFO @ 08:18:44]   Available RAM : 15.475 GB
[codecarbon INFO @ 08:18:44]   CPU count: 16
[codecarbon INFO @ 08:18:44]   CPU model: 12th Gen Intel(R) Core(TM) i7-1260P
[codecarbon INFO @ 08:18:44]   GPU count: None
[codecarbon INFO @ 08:18:44]   GPU model: None
[codecarbon INFO @ 08:18:52] Energy consumed for RAM : 0.000008 kWh. RAM Power : 5.803128719329834 W
[codecarbon INFO @ 08:18:52] Energy consumed for all CPUs : 0.000059 kWh. Total CPU Power : 42.5 W
[codecarbon INFO @ 08:18:52] 0.000068 kWh of electricity used since the beginning.
2024-03-11 08:18:52,537  | py-experimenter - INFO     | All configured executions finished.

Showcase Paused Execution

The table below shows that the experiment with id=1 has been paused, while all other experiments remain in status created.

[3]:

experimenter.get_table()

[3]:

	ID	dataset	cross_validation_splits	seed	kernel	creation_date	status	start_date	name	machine	pipeline	train_f1	train_accuracy	test_f1	test_accuracy	paused_at_seconds	resumed_at_seconds	end_date	error
0	1	iris	5	2	linear	2024-03-11 08:18:43	paused	2024-03-11 08:18:43	example_notebook	Worklaptop	None	None	None	None	None	1.710142e+09	None	2024-03-11 08:18:52	None
1	2	iris	5	4	linear	2024-03-11 08:18:43	created	None	None	None	None	None	None	None	None	NaN	None	None	None
2	3	iris	5	6	linear	2024-03-11 08:18:43	created	None	None	None	None	None	None	None	None	NaN	None	None	None
3	4	iris	5	2	poly	2024-03-11 08:18:43	created	None	None	None	None	None	None	None	None	NaN	None	None	None
4	5	iris	5	4	poly	2024-03-11 08:18:43	created	None	None	None	None	None	None	None	None	NaN	None	None	None
5	6	iris	5	6	poly	2024-03-11 08:18:43	created	None	None	None	None	None	None	None	None	NaN	None	None	None
6	7	iris	5	2	rbf	2024-03-11 08:18:43	created	None	None	None	None	None	None	None	None	NaN	None	None	None
7	8	iris	5	4	rbf	2024-03-11 08:18:43	created	None	None	None	None	None	None	None	None	NaN	None	None	None
8	9	iris	5	6	rbf	2024-03-11 08:18:43	created	None	None	None	None	None	None	None	None	NaN	None	None	None
9	10	iris	5	2	sigmoid	2024-03-11 08:18:43	created	None	None	None	None	None	None	None	None	NaN	None	None	None
10	11	iris	5	4	sigmoid	2024-03-11 08:18:43	created	None	None	None	None	None	None	None	None	NaN	None	None	None
11	12	iris	5	6	sigmoid	2024-03-11 08:18:43	created	None	None	None	None	None	None	None	None	NaN	None	None	None

Define Resuming Execution Function

Lastly, we define a second execution function that resumes the paused experiment. After this function has run, the experiment is finished and the resulting table is shown.

[4]:

import random

import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def resume_after_5_seconds(parameters: dict, result_processor: ResultProcessor, custom_config: dict):
    result_processor.process_results({
        'resumed_at_seconds': datetime.datetime.now().timestamp()
    })
    seed = parameters['seed']
    random.seed(seed)
    np.random.seed(seed)

    data = load_iris()
    # In case you want to load a file from a path
    # path = os.path.join(custom_config['path'], parameters['dataset'])
    # data = pd.read_csv(path)

    X = data.data
    y = data.target

    model = make_pipeline(StandardScaler(), SVC(kernel=parameters['kernel'], gamma='auto'))
    result_processor.process_results({
        'pipeline': str(model)
    })

    if parameters['dataset'] != 'iris':
        raise ValueError("Example error")

    scores = cross_validate(model, X, y,
                            cv=parameters['cross_validation_splits'],
                            scoring=('accuracy', 'f1_micro'),
                            return_train_score=True
                            )

    result_processor.process_results({
        'train_f1': np.mean(scores['train_f1_micro']),
        'train_accuracy': np.mean(scores['train_accuracy'])
    })

    result_processor.process_results({
        'test_f1': np.mean(scores['test_f1_micro']),
        'test_accuracy': np.mean(scores['test_accuracy'])
    })


# Resume the paused experiment with id=1 using the second execution function.
experimenter.unpause_experiment(1, resume_after_5_seconds)

experimenter.get_table()

[codecarbon INFO @ 08:18:52] [setup] RAM Tracking...
[codecarbon INFO @ 08:18:52] [setup] GPU Tracking...
[codecarbon INFO @ 08:18:52] No GPU found.
[codecarbon INFO @ 08:18:52] [setup] CPU Tracking...
[codecarbon WARNING @ 08:18:52] No CPU tracking mode found. Falling back on CPU constant mode.
[codecarbon WARNING @ 08:18:54] We saw that you have a 12th Gen Intel(R) Core(TM) i7-1260P but we don't know it. Please contact us.
[codecarbon INFO @ 08:18:54] CPU Model on constant consumption mode: 12th Gen Intel(R) Core(TM) i7-1260P
[codecarbon INFO @ 08:18:54] >>> Tracker's metadata:
[codecarbon INFO @ 08:18:54]   Platform system: Linux-5.15.146.1-microsoft-standard-WSL2-x86_64-with-glibc2.35
[codecarbon INFO @ 08:18:54]   Python version: 3.9.0
[codecarbon INFO @ 08:18:54]   CodeCarbon version: 2.3.4
[codecarbon INFO @ 08:18:54]   Available RAM : 15.475 GB
[codecarbon INFO @ 08:18:54]   CPU count: 16
[codecarbon INFO @ 08:18:54]   CPU model: 12th Gen Intel(R) Core(TM) i7-1260P
[codecarbon INFO @ 08:18:54]   GPU count: None
[codecarbon INFO @ 08:18:54]   GPU model: None
[codecarbon INFO @ 08:18:57] Energy consumed for RAM : 0.000000 kWh. RAM Power : 5.803128719329834 W
[codecarbon INFO @ 08:18:57] Energy consumed for all CPUs : 0.000001 kWh. Total CPU Power : 42.5 W
[codecarbon INFO @ 08:18:57] 0.000001 kWh of electricity used since the beginning.

[4]:

	ID	dataset	cross_validation_splits	seed	kernel	creation_date	status	start_date	name	machine	pipeline	train_f1	train_accuracy	test_f1	test_accuracy	paused_at_seconds	resumed_at_seconds	end_date	error
0	1	iris	5	2	linear	2024-03-11 08:18:43	done	2024-03-11 08:18:43	example_notebook	Worklaptop	Pipeline(steps=[('standardscaler', StandardSca...	0.971667	0.971667	0.966667	0.966667	1.710142e+09	1.710142e+09	2024-03-11 08:18:57	None
1	2	iris	5	4	linear	2024-03-11 08:18:43	created	None	None	None	None	NaN	NaN	NaN	NaN	NaN	NaN	None	None
2	3	iris	5	6	linear	2024-03-11 08:18:43	created	None	None	None	None	NaN	NaN	NaN	NaN	NaN	NaN	None	None
3	4	iris	5	2	poly	2024-03-11 08:18:43	created	None	None	None	None	NaN	NaN	NaN	NaN	NaN	NaN	None	None
4	5	iris	5	4	poly	2024-03-11 08:18:43	created	None	None	None	None	NaN	NaN	NaN	NaN	NaN	NaN	None	None
5	6	iris	5	6	poly	2024-03-11 08:18:43	created	None	None	None	None	NaN	NaN	NaN	NaN	NaN	NaN	None	None
6	7	iris	5	2	rbf	2024-03-11 08:18:43	created	None	None	None	None	NaN	NaN	NaN	NaN	NaN	NaN	None	None
7	8	iris	5	4	rbf	2024-03-11 08:18:43	created	None	None	None	None	NaN	NaN	NaN	NaN	NaN	NaN	None	None
8	9	iris	5	6	rbf	2024-03-11 08:18:43	created	None	None	None	None	NaN	NaN	NaN	NaN	NaN	NaN	None	None
9	10	iris	5	2	sigmoid	2024-03-11 08:18:43	created	None	None	None	None	NaN	NaN	NaN	NaN	NaN	NaN	None	None
10	11	iris	5	4	sigmoid	2024-03-11 08:18:43	created	None	None	None	None	NaN	NaN	NaN	NaN	NaN	NaN	None	None
11	12	iris	5	6	sigmoid	2024-03-11 08:18:43	created	None	None	None	None	NaN	NaN	NaN	NaN	NaN	NaN	None	None
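
The two timestamp resultfields make it possible to compute how long an experiment stayed paused. The following is a minimal sketch of that calculation; the helper pause_duration is hypothetical and not part of py_experimenter, and the example timestamps are made up because the table above only shows rounded values.

```python
import datetime


def pause_duration(paused_at_seconds: float, resumed_at_seconds: float) -> datetime.timedelta:
    """Return the time between pausing and resuming an experiment."""
    # Both fields hold POSIX timestamps, as written by datetime.now().timestamp().
    paused_at = datetime.datetime.fromtimestamp(paused_at_seconds, tz=datetime.timezone.utc)
    resumed_at = datetime.datetime.fromtimestamp(resumed_at_seconds, tz=datetime.timezone.utc)
    return resumed_at - paused_at


# Hypothetical timestamps for illustration only.
example = pause_duration(1_700_000_000.0, 1_700_000_002.5)
print(example.total_seconds())
```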

CodeCarbon Entries

Note that a separate CodeCarbon entry is created for each execution, so the experiment with id=1 has two entries: one for the paused run and one for the resumed run.

[5]:

experimenter.get_codecarbon_table()

[5]:

	ID	experiment_id	codecarbon_timestamp	project_name	run_id	duration_seconds	emissions_kg	emissions_rate_kg_sec	cpu_power_watt	gpu_power_watt	...	cpu_model	gpu_count	gpu_model	longitude	latitude	ram_total_size	tracking_mode	on_cloud	power_usage_efficiency	offline_mode
0	1	1	2024-03-11T08:18:52	codecarbon	a87e6dcb-aace-44b1-9d83-c9744312ce62	5.096071	2.471057e-05	0.000005	42.5	0.0	...	12th Gen Intel(R) Core(TM) i7-1260P	None	None	9.5312	52.4771	15.47501	machine	N	1.0	0
1	2	1	2024-03-11T08:18:57	codecarbon	5132d06c-9a59-4d0e-8dc5-0afa4984a5f8	0.133938	3.813803e-07	0.000003	42.5	0.0	...	12th Gen Intel(R) Core(TM) i7-1260P	None	None	9.5312	52.4771	15.47501	machine	N	1.0	0

2 rows × 34 columns
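
Because a paused and resumed experiment produces several CodeCarbon entries, its total footprint is the sum over those entries. The sketch below illustrates this in plain Python, with the duration and emission values copied from the two rows above; the entries list is a hand-built stand-in for the returned table, not output of a py_experimenter call.

```python
from collections import defaultdict

# Values copied from the two CodeCarbon rows shown above.
entries = [
    {"experiment_id": 1, "duration_seconds": 5.096071, "emissions_kg": 2.471057e-05},
    {"experiment_id": 1, "duration_seconds": 0.133938, "emissions_kg": 3.813803e-07},
]

# Accumulate duration and emissions per experiment.
totals = defaultdict(lambda: {"duration_seconds": 0.0, "emissions_kg": 0.0})
for entry in entries:
    total = totals[entry["experiment_id"]]
    total["duration_seconds"] += entry["duration_seconds"]
    total["emissions_kg"] += entry["emissions_kg"]

for experiment_id, total in totals.items():
    print(experiment_id, total)
```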