Example: Pause and Continue

This notebook shows how to run two different functions for a single experiment: the first function sets the experiment status to paused, and the experiment is later unpaused with a second execution function. The example builds on the example_general_usage.ipynb notebook.

To execute this notebook, install the following packages:

pip install py_experimenter
pip install scikit-learn

Experiment Configuration File

First, we define an experiment configuration file. Compared to the basic example, two result fields, paused_at_seconds and resumed_at_seconds, have been added.

[1]:
import os

content = """
PY_EXPERIMENTER:
  n_jobs: 1

  Database:
    provider: sqlite
    database: py_experimenter
    table:
      name: example_pause_and_continue
      keyfields:
        dataset:
          type: VARCHAR(50)
          values: [iris]
        cross_validation_splits:
          type: INT
          values: [5]
        seed:
          type: INT
          values:
            start: 2
            stop: 7
            step: 2
        kernel:
          type: VARCHAR(50)
          values: [linear, poly, rbf, sigmoid]
      resultfields:
        pipeline: LONGTEXT
        train_f1: DECIMAL
        train_accuracy: DECIMAL
        test_f1: DECIMAL
        test_accuracy: DECIMAL
        paused_at_seconds: DOUBLE
        resumed_at_seconds: DOUBLE

  CUSTOM:
    path: sample_data

  CODECARBON:
    offline_mode: False
    measure_power_secs: 25
    tracking_mode: process
    log_level: error
    save_to_file: True
    output_dir: output/CodeCarbon
"""
# Create config directory if it does not exist
os.makedirs('config', exist_ok=True)

# Create config file
experiment_configuration_file_path = os.path.join('config', 'example_pause_and_continue.yml')
with open(experiment_configuration_file_path, "w") as f:
    f.write(content)
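
To see how many experiments this configuration describes, the keyfield values can be expanded by hand. The sketch below is not PyExperimenter's own code; it assumes that the start/stop/step block for seed behaves like Python's range (which matches the seeds 2, 4 and 6 that appear in the table later on) and that the table receives one row per combination of keyfield values. The order of the combinations it produces may differ from the row order in the database.

```python
from itertools import product

# Keyfield values as written in the configuration above.
# Assumption: start/stop/step is interpreted like Python's range().
keyfields = {
    "dataset": ["iris"],
    "cross_validation_splits": [5],
    "seed": list(range(2, 7, 2)),
    "kernel": ["linear", "poly", "rbf", "sigmoid"],
}

# One dictionary per combination of keyfield values.
combinations = [
    dict(zip(keyfields, values)) for values in product(*keyfields.values())
]

print(keyfields["seed"])   # [2, 4, 6]
print(len(combinations))   # 12
```

The count of 12 agrees with the "12 rows successfully added to database" message in the log output of the next cell.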

Defining Pausing Execution Function

Next, we fill the table, define an execution function that pauses the experiment after five seconds, and run it for a single experiment.

[2]:
import datetime

from py_experimenter.experimenter import ExperimentStatus, PyExperimenter
from py_experimenter.result_processor import ResultProcessor

experimenter = PyExperimenter(experiment_configuration_file_path=experiment_configuration_file_path, name='example_notebook')

experimenter.fill_table_from_config()

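def pause_after_5_seconds(parameters: dict, result_processor: ResultProcessor, custom_config: dict):
    import time

    # Simulate five seconds of work before pausing.
    time.sleep(5)
    # Record when the experiment was paused.
    result_processor.process_results({
        'paused_at_seconds': datetime.datetime.now().timestamp()
    })
    # Returning PAUSED marks the experiment as paused instead of done.
    return ExperimentStatus.PAUSED


# Run only one experiment from the table.
experimenter.execute(pause_after_5_seconds, max_experiments=1)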
2024-03-11 08:18:43,065  | py-experimenter - INFO     | Found 4 keyfields
2024-03-11 08:18:43,066  | py-experimenter - INFO     | Found 7 resultfields
2024-03-11 08:18:43,067  | py-experimenter - WARNING  | No logtables given
2024-03-11 08:18:43,067  | py-experimenter - WARNING  | No custom section defined in config
2024-03-11 08:18:43,068  | py-experimenter - WARNING  | No codecarbon section defined in config
2024-03-11 08:18:43,069  | py-experimenter - INFO     | Initialized and connected to database
2024-03-11 08:18:43,105  | py-experimenter - INFO     | 12 rows successfully added to database. 0 rows were skipped.
[codecarbon INFO @ 08:18:43] [setup] RAM Tracking...
[codecarbon INFO @ 08:18:43] [setup] GPU Tracking...
[codecarbon INFO @ 08:18:43] No GPU found.
[codecarbon INFO @ 08:18:43] [setup] CPU Tracking...
[codecarbon WARNING @ 08:18:43] No CPU tracking mode found. Falling back on CPU constant mode.
[codecarbon WARNING @ 08:18:44] We saw that you have a 12th Gen Intel(R) Core(TM) i7-1260P but we don't know it. Please contact us.
[codecarbon INFO @ 08:18:44] CPU Model on constant consumption mode: 12th Gen Intel(R) Core(TM) i7-1260P
[codecarbon INFO @ 08:18:44] >>> Tracker's metadata:
[codecarbon INFO @ 08:18:44]   Platform system: Linux-5.15.146.1-microsoft-standard-WSL2-x86_64-with-glibc2.35
[codecarbon INFO @ 08:18:44]   Python version: 3.9.0
[codecarbon INFO @ 08:18:44]   CodeCarbon version: 2.3.4
[codecarbon INFO @ 08:18:44]   Available RAM : 15.475 GB
[codecarbon INFO @ 08:18:44]   CPU count: 16
[codecarbon INFO @ 08:18:44]   CPU model: 12th Gen Intel(R) Core(TM) i7-1260P
[codecarbon INFO @ 08:18:44]   GPU count: None
[codecarbon INFO @ 08:18:44]   GPU model: None
[codecarbon INFO @ 08:18:52] Energy consumed for RAM : 0.000008 kWh. RAM Power : 5.803128719329834 W
[codecarbon INFO @ 08:18:52] Energy consumed for all CPUs : 0.000059 kWh. Total CPU Power : 42.5 W
[codecarbon INFO @ 08:18:52] 0.000068 kWh of electricity used since the beginning.
2024-03-11 08:18:52,537  | py-experimenter - INFO     | All configured executions finished.
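
The effect of the return value can be pictured with a toy model. The Status enum and the run helper below are hypothetical stand-ins written for this illustration and are not part of py_experimenter; they assume only what this notebook shows, namely that an execution function returning a paused status leaves the experiment paused, while one that returns nothing leaves it done.

```python
from enum import Enum


class Status(Enum):
    """Hypothetical stand-in for the experiment status values."""
    CREATED = "created"
    PAUSED = "paused"
    DONE = "done"


def run(row: dict, experiment_function) -> dict:
    """Toy runner: call the function and derive the status from its return value."""
    returned = experiment_function(row)
    row["status"] = Status.PAUSED if returned is Status.PAUSED else Status.DONE
    return row


def first_phase(row: dict):
    row["paused_at_seconds"] = 1.0   # placeholder value
    return Status.PAUSED             # ask the runner to pause


def second_phase(row: dict):
    row["resumed_at_seconds"] = 2.0  # placeholder value
    # No return value: the runner treats the experiment as finished.


row = {"status": Status.CREATED}
run(row, first_phase)
status_after_first_phase = row["status"]
run(row, second_phase)
status_after_second_phase = row["status"]

print(status_after_first_phase.value, status_after_second_phase.value)
```

Both phases write into the same row, which mirrors how the two execution functions in this notebook fill different result fields of one experiment.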

Showcase Paused Execution

The table below shows that only the experiment with id=1 has been executed and is now paused; all other experiments still have the status created.

[3]:
experimenter.get_table()
[3]:
ID dataset cross_validation_splits seed kernel creation_date status start_date name machine pipeline train_f1 train_accuracy test_f1 test_accuracy paused_at_seconds resumed_at_seconds end_date error
0 1 iris 5 2 linear 2024-03-11 08:18:43 paused 2024-03-11 08:18:43 example_notebook Worklaptop None None None None None 1.710142e+09 None 2024-03-11 08:18:52 None
1 2 iris 5 4 linear 2024-03-11 08:18:43 created None None None None None None None None NaN None None None
2 3 iris 5 6 linear 2024-03-11 08:18:43 created None None None None None None None None NaN None None None
3 4 iris 5 2 poly 2024-03-11 08:18:43 created None None None None None None None None NaN None None None
4 5 iris 5 4 poly 2024-03-11 08:18:43 created None None None None None None None None NaN None None None
5 6 iris 5 6 poly 2024-03-11 08:18:43 created None None None None None None None None NaN None None None
6 7 iris 5 2 rbf 2024-03-11 08:18:43 created None None None None None None None None NaN None None None
7 8 iris 5 4 rbf 2024-03-11 08:18:43 created None None None None None None None None NaN None None None
8 9 iris 5 6 rbf 2024-03-11 08:18:43 created None None None None None None None None NaN None None None
9 10 iris 5 2 sigmoid 2024-03-11 08:18:43 created None None None None None None None None NaN None None None
10 11 iris 5 4 sigmoid 2024-03-11 08:18:43 created None None None None None None None None NaN None None None
11 12 iris 5 6 sigmoid 2024-03-11 08:18:43 created None None None None None None None None NaN None None None

Defining Resuming Execution Function

Lastly, we define a second execution function that resumes the paused experiment. After this function has run, the experiment has the status done and the resulting table is shown.

[4]:
import random

import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def resume_after_5_seconds(parameters: dict, result_processor: ResultProcessor, custom_config: dict):
    result_processor.process_results({
        'resumed_at_seconds': datetime.datetime.now().timestamp()
    })
    seed = parameters['seed']
    random.seed(seed)
    np.random.seed(seed)

    data = load_iris()
    # In case you want to load a file from a path
    # path = os.path.join(custom_config['path'], parameters['dataset'])
    # data = pd.read_csv(path)

    X = data.data
    y = data.target

    model = make_pipeline(StandardScaler(), SVC(kernel=parameters['kernel'], gamma='auto'))
    result_processor.process_results({
        'pipeline': str(model)
    })

    if parameters['dataset'] != 'iris':
        raise ValueError("Example error")

    scores = cross_validate(model, X, y,
                            cv=parameters['cross_validation_splits'],
                            scoring=('accuracy', 'f1_micro'),
                            return_train_score=True
                            )

    result_processor.process_results({
        'train_f1': np.mean(scores['train_f1_micro']),
        'train_accuracy': np.mean(scores['train_accuracy'])
    })

    result_processor.process_results({
        'test_f1': np.mean(scores['test_f1_micro']),
        'test_accuracy': np.mean(scores['test_accuracy'])
    })


experimenter.unpause_experiment(1, resume_after_5_seconds)

experimenter.get_table()
[codecarbon INFO @ 08:18:52] [setup] RAM Tracking...
[codecarbon INFO @ 08:18:52] [setup] GPU Tracking...
[codecarbon INFO @ 08:18:52] No GPU found.
[codecarbon INFO @ 08:18:52] [setup] CPU Tracking...
[codecarbon WARNING @ 08:18:52] No CPU tracking mode found. Falling back on CPU constant mode.
[codecarbon WARNING @ 08:18:54] We saw that you have a 12th Gen Intel(R) Core(TM) i7-1260P but we don't know it. Please contact us.
[codecarbon INFO @ 08:18:54] CPU Model on constant consumption mode: 12th Gen Intel(R) Core(TM) i7-1260P
[codecarbon INFO @ 08:18:54] >>> Tracker's metadata:
[codecarbon INFO @ 08:18:54]   Platform system: Linux-5.15.146.1-microsoft-standard-WSL2-x86_64-with-glibc2.35
[codecarbon INFO @ 08:18:54]   Python version: 3.9.0
[codecarbon INFO @ 08:18:54]   CodeCarbon version: 2.3.4
[codecarbon INFO @ 08:18:54]   Available RAM : 15.475 GB
[codecarbon INFO @ 08:18:54]   CPU count: 16
[codecarbon INFO @ 08:18:54]   CPU model: 12th Gen Intel(R) Core(TM) i7-1260P
[codecarbon INFO @ 08:18:54]   GPU count: None
[codecarbon INFO @ 08:18:54]   GPU model: None
[codecarbon INFO @ 08:18:57] Energy consumed for RAM : 0.000000 kWh. RAM Power : 5.803128719329834 W
[codecarbon INFO @ 08:18:57] Energy consumed for all CPUs : 0.000001 kWh. Total CPU Power : 42.5 W
[codecarbon INFO @ 08:18:57] 0.000001 kWh of electricity used since the beginning.
[4]:
ID dataset cross_validation_splits seed kernel creation_date status start_date name machine pipeline train_f1 train_accuracy test_f1 test_accuracy paused_at_seconds resumed_at_seconds end_date error
0 1 iris 5 2 linear 2024-03-11 08:18:43 done 2024-03-11 08:18:43 example_notebook Worklaptop Pipeline(steps=[('standardscaler', StandardSca... 0.971667 0.971667 0.966667 0.966667 1.710142e+09 1.710142e+09 2024-03-11 08:18:57 None
1 2 iris 5 4 linear 2024-03-11 08:18:43 created None None None None NaN NaN NaN NaN NaN NaN None None
2 3 iris 5 6 linear 2024-03-11 08:18:43 created None None None None NaN NaN NaN NaN NaN NaN None None
3 4 iris 5 2 poly 2024-03-11 08:18:43 created None None None None NaN NaN NaN NaN NaN NaN None None
4 5 iris 5 4 poly 2024-03-11 08:18:43 created None None None None NaN NaN NaN NaN NaN NaN None None
5 6 iris 5 6 poly 2024-03-11 08:18:43 created None None None None NaN NaN NaN NaN NaN NaN None None
6 7 iris 5 2 rbf 2024-03-11 08:18:43 created None None None None NaN NaN NaN NaN NaN NaN None None
7 8 iris 5 4 rbf 2024-03-11 08:18:43 created None None None None NaN NaN NaN NaN NaN NaN None None
8 9 iris 5 6 rbf 2024-03-11 08:18:43 created None None None None NaN NaN NaN NaN NaN NaN None None
9 10 iris 5 2 sigmoid 2024-03-11 08:18:43 created None None None None NaN NaN NaN NaN NaN NaN None None
10 11 iris 5 4 sigmoid 2024-03-11 08:18:43 created None None None None NaN NaN NaN NaN NaN NaN None None
11 12 iris 5 6 sigmoid 2024-03-11 08:18:43 created None None None None NaN NaN NaN NaN NaN NaN None None
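
In the table, paused_at_seconds and resumed_at_seconds both display as 1.710142e+09 because they are Unix timestamps shown with limited precision. The following sketch shows one way to turn such values back into readable times and to measure the gap between pausing and resuming. The two numbers used here are made-up placeholders, not the values stored by the run above.

```python
from datetime import datetime, timezone

# Hypothetical placeholder timestamps (seconds since the Unix epoch).
paused_at_seconds = 1710141528.5
resumed_at_seconds = 1710141532.75

# Convert to timezone-aware datetimes so the result does not depend on the local timezone.
paused_at = datetime.fromtimestamp(paused_at_seconds, tz=timezone.utc)
resumed_at = datetime.fromtimestamp(resumed_at_seconds, tz=timezone.utc)

gap = resumed_at - paused_at
print(paused_at.isoformat(), resumed_at.isoformat())
print(f"paused for {gap.total_seconds():.2f} seconds")
```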

CodeCarbon Entries

Note that a separate CodeCarbon entry is created for each execution, so the paused and the resumed run of experiment 1 appear as two rows.

[5]:
experimenter.get_codecarbon_table()
[5]:
ID experiment_id codecarbon_timestamp project_name run_id duration_seconds emissions_kg emissions_rate_kg_sec cpu_power_watt gpu_power_watt ... cpu_model gpu_count gpu_model longitude latitude ram_total_size tracking_mode on_cloud power_usage_efficiency offline_mode
0 1 1 2024-03-11T08:18:52 codecarbon a87e6dcb-aace-44b1-9d83-c9744312ce62 5.096071 2.471057e-05 0.000005 42.5 0.0 ... 12th Gen Intel(R) Core(TM) i7-1260P None None 9.5312 52.4771 15.47501 machine N 1.0 0
1 2 1 2024-03-11T08:18:57 codecarbon 5132d06c-9a59-4d0e-8dc5-0afa4984a5f8 0.133938 3.813803e-07 0.000003 42.5 0.0 ... 12th Gen Intel(R) Core(TM) i7-1260P None None 9.5312 52.4771 15.47501 machine N 1.0 0

2 rows × 34 columns
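
Because one experiment can have several CodeCarbon entries, totals per experiment have to be summed over rows. A minimal sketch using plain Python dictionaries, with the duration_seconds and emissions_kg values copied from the two rows above; the variable names are chosen for this illustration only.

```python
from collections import defaultdict

# Values copied from the two CodeCarbon rows shown above.
rows = [
    {"experiment_id": 1, "duration_seconds": 5.096071, "emissions_kg": 2.471057e-05},
    {"experiment_id": 1, "duration_seconds": 0.133938, "emissions_kg": 3.813803e-07},
]

# Sum tracked duration and emissions per experiment_id.
totals = defaultdict(lambda: {"duration_seconds": 0.0, "emissions_kg": 0.0})
for row in rows:
    entry = totals[row["experiment_id"]]
    entry["duration_seconds"] += row["duration_seconds"]
    entry["emissions_kg"] += row["emissions_kg"]

for experiment_id, entry in totals.items():
    print(experiment_id, entry["duration_seconds"], entry["emissions_kg"])
```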