rl_salamandra_alignment package

Subpackages

Submodules

rl_salamandra_alignment.cli module

Console script for rl_salamandra_alignment.

rl_salamandra_alignment.cli.main(): Reinforcement Learning for Salamandra on MN5

rl_salamandra_alignment.convert_dataset module

rl_salamandra_alignment.generate_scripts module

Tools for generating scripts by filling out templates

rl_salamandra_alignment.generate_scripts.generate_all_job_files(config)

Generate the Slurm scripts for an experiment.

Parameters: config (dict) – execution config dict for experiment
Returns: for each subexperiment, paths to the distributed execution script and the launching script
Return type: list[tuple]

rl_salamandra_alignment.generate_scripts.generate_distributed_run_script(output_dir, config, id)

Write the script for distributed execution of a subexperiment, by filling out the script template.

Parameters:

output_dir (str) – root directory for the outputs of the experiment.
config (dict) – execution config dict for subexperiment
id (str) – Sub-experiment id

Returns:

Text of the script for distributed execution

Return type:

str

rl_salamandra_alignment.generate_scripts.generate_eval_scripts_for_one_training(output_dir, config, id)

Generate the EVALUATION Slurm scripts for a subexperiment.

Parameters:

output_dir (str) – root directory for the outputs of the experiment.
config (dict) – execution config dict for subexperiment
id (str) – Sub-experiment id

Returns:

paths to the evaluation scripts

Return type:

dict

rl_salamandra_alignment.generate_scripts.generate_harness_eval_script(output_dir, config, id)

Generate the HARNESS evaluation Slurm scripts for a subexperiment.

Parameters:

output_dir (str) – root directory for the outputs of the experiment.
config (dict) – execution config dict for subexperiment
id (str) – Sub-experiment id

Returns:

paths to the harness evaluation scripts

Return type:

dict

rl_salamandra_alignment.generate_scripts.generate_launch_script(output_dir, config, id)

Write the script for launching a subexperiment, by filling out the script template.

Parameters:

output_dir (str) – root directory for the outputs of the experiment.
config (dict) – execution config dict for subexperiment
id (str) – Sub-experiment id

Returns:

Text of the script for launching the subexperiment

Return type:

str

rl_salamandra_alignment.generate_scripts.generate_local_eval_script(output_dir, config, id)

Generate the LOCAL evaluation Slurm scripts for a subexperiment (for example, red teaming).

Parameters:

output_dir (str) – root directory for the outputs of the experiment.
config (dict) – execution config dict for subexperiment
id (str) – Sub-experiment id

Returns:

paths to the local evaluation scripts

Return type:

dict

rl_salamandra_alignment.generate_scripts.generate_one_job_set(output_dir, config, id)

Generate the Slurm scripts for a subexperiment (both training and evaluation).

Parameters:

output_dir (str) – root directory for the outputs of the experiment.
config (dict) – execution config dict for subexperiment
id (str) – Sub-experiment id

Returns:

paths to the scripts for training and evaluation

Return type:

dict

rl_salamandra_alignment.generate_scripts.generate_one_training_job(output_dir, config, id)

Generate the TRAINING Slurm scripts for a subexperiment.

Parameters:

output_dir (str) – root directory for the outputs of the experiment.
config (dict) – execution config dict for subexperiment
id (str) – Sub-experiment id

Returns:

paths to the distributed execution script and the launching script

Return type:

dict

rl_salamandra_alignment.generate_scripts.generate_slurm_preamble(sbatch_args)

Generate the preamble for a Slurm job.

Parameters: sbatch_args (dict) – arguments for sbatch
Returns: preamble with all #SBATCH directives filled in
Return type: str
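As a rough illustration of what such a preamble looks like, the sketch below turns a dict into #SBATCH lines. The name sketch_slurm_preamble and the convention that each key is a long sbatch option name are assumptions made for this example; the package's actual implementation may differ.

```python
def sketch_slurm_preamble(sbatch_args: dict) -> str:
    """Hypothetical stand-in: build a Slurm preamble from a dict of sbatch options."""
    lines = ["#!/bin/bash"]
    for key, value in sbatch_args.items():
        # Assumption: each key is a long sbatch option name without leading dashes.
        lines.append(f"#SBATCH --{key}={value}")
    return "\n".join(lines) + "\n"


preamble = sketch_slurm_preamble(
    {"job-name": "rl_align", "nodes": 2, "time": "02:00:00"}
)
print(preamble)
```

Keeping the options in a dict makes it straightforward to override a single value per subexperiment before the preamble is rendered.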

rl_salamandra_alignment.generate_scripts.get_config_ids(config_list)

Assign a unique ID to each config. Each ID corresponds to a subexperiment.

Parameters: config_list (list) – list of config dicts
Returns: list of tuples (id, config)
Return type: list
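One simple way to produce (id, config) pairs is to number the configs in order. The sketch below does that; the name sketch_config_ids and the zero-padded index scheme are assumptions for illustration, and the real function may derive its IDs differently.

```python
def sketch_config_ids(config_list: list) -> list:
    """Hypothetical stand-in: pair each config with a unique string ID."""
    # Assumption: a zero-padded positional index is enough to make IDs unique.
    width = max(3, len(str(len(config_list))))
    return [
        (f"{index:0{width}d}", config)
        for index, config in enumerate(config_list)
    ]


pairs = sketch_config_ids([{"lr": 1e-5}, {"lr": 5e-6}])
for sub_id, sub_config in pairs:
    print(sub_id, sub_config)
```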

rl_salamandra_alignment.generate_scripts.get_output_dir(config)

Extract the ‘output directory’ field from a config dict, making sure it is a string.

Parameters: config (dict) – execution config dict for experiment
Raises: ValueError – if the value is not a string (e.g. a list)
Returns: path to the output directory
Return type: str
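The validation described here can be sketched as follows. The key name "output_dir" and the function name sketch_get_output_dir are assumptions for this example, since the reference does not state how the field is stored in the config.

```python
def sketch_get_output_dir(config: dict) -> str:
    """Hypothetical stand-in: read the output directory and insist on a string."""
    # Assumption: the field is stored under the key "output_dir".
    value = config["output_dir"]
    if not isinstance(value, str):
        raise ValueError(
            f"output_dir must be a string, got {type(value).__name__}"
        )
    return value


good = sketch_get_output_dir({"output_dir": "/tmp/experiment"})

try:
    sketch_get_output_dir({"output_dir": ["/tmp/a", "/tmp/b"]})
    rejected_list = False
except ValueError:
    rejected_list = True
```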

rl_salamandra_alignment.generate_scripts.get_script_args_string(script_args_dict)

Convert a Python dictionary into a Bash dictionary.

Parameters: script_args_dict (dict) – Python dictionary to convert
Returns: Bash dictionary
Return type: str
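The exact text this function emits is not documented here. Reading "Bash dictionary" as a Bash associative array, a conversion could look like the sketch below; the name sketch_bash_assoc_array, the array name script_args, and the declare -A output form are all assumptions for illustration.

```python
import shlex


def sketch_bash_assoc_array(script_args: dict, name: str = "script_args") -> str:
    """Hypothetical stand-in: render a dict as a Bash associative array."""
    entries = " ".join(
        # shlex.quote protects keys and values containing spaces or shell metacharacters.
        f"[{shlex.quote(str(key))}]={shlex.quote(str(value))}"
        for key, value in script_args.items()
    )
    return f"declare -A {name}=({entries})"


bash_text = sketch_bash_assoc_array({"model": "salamandra 7b", "epochs": 3})
print(bash_text)
```

Quoting every key and value is the main design point: a config value with a space would otherwise be split into separate words by the shell.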

rl_salamandra_alignment.generate_scripts.replace_in_template(text_template, variable_name, variable_value)

Inside a template, replace a placeholder of the form “{{MY_VAR}}” with a value.

Parameters:

text_template (str) – Template
variable_name (str) – Name of the variable in all caps.
variable_value (Union[str, int, float]) – Value of the variable

Returns:

template with the filled slot

Return type:

str
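A minimal sketch of this kind of slot filling, assuming plain string replacement of the “{{NAME}}” placeholder. The name sketch_replace_in_template and the example template are hypothetical; the package's own function may handle edge cases differently.

```python
from typing import Union


def sketch_replace_in_template(
    text_template: str,
    variable_name: str,
    variable_value: Union[str, int, float],
) -> str:
    """Hypothetical stand-in: fill every {{VARIABLE_NAME}} slot with a value."""
    placeholder = "{{" + variable_name + "}}"
    # Non-string values are converted with str() before substitution.
    return text_template.replace(placeholder, str(variable_value))


template = "#SBATCH --nodes={{NODES}}\nsrun python train.py --lr {{LR}}"
filled = sketch_replace_in_template(template, "NODES", 4)
filled = sketch_replace_in_template(filled, "LR", 1e-5)
print(filled)
```

Because each call fills one variable and returns a new string, calls can be chained until no placeholders remain.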

rl_salamandra_alignment.generate_scripts.setup_macro_output_dir_tree(output_dir)

Construct the directory tree for running an experiment.

Parameters: output_dir (str) – root directory for the outputs of the experiment.
Return type: None
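Creating such a tree idempotently can be sketched with pathlib. The subdirectory names below are purely illustrative, and sketch_setup_output_dir_tree is a hypothetical name; the layout the package actually creates is not described in this reference.

```python
import tempfile
from pathlib import Path

# Hypothetical layout: these subdirectory names are illustrative only.
SKETCH_SUBDIRS = ("scripts", "logs", "checkpoints")


def sketch_setup_output_dir_tree(output_dir: str) -> None:
    """Hypothetical stand-in: create the experiment root and its subdirectories."""
    root = Path(output_dir)
    for name in SKETCH_SUBDIRS:
        # parents=True creates the root if needed; exist_ok=True makes reruns safe.
        (root / name).mkdir(parents=True, exist_ok=True)


with tempfile.TemporaryDirectory() as tmp:
    sketch_setup_output_dir_tree(tmp)
    sketch_setup_output_dir_tree(tmp)  # a second call leaves the tree unchanged
    created = sorted(p.name for p in Path(tmp).iterdir())
```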

rl_salamandra_alignment.generate_scripts.setup_micro_output_dir_tree(output_dir, config, id)

Construct the directory tree for running a subexperiment.

Parameters:

output_dir (str) – root directory for the outputs of the experiment.
config (dict) – execution config dict for subexperiment
id (str) – Sub-experiment id

Return type:

None

Module contents

Top-level package for RL - Salamandra Alignment.

rl_salamandra_alignment.setup_logging(level=20, format='%(asctime)s - %(name)s - %(levelname)s - %(message)s'): Set up logging for this package. Ensures proper handler setup and allows dynamic level changes.
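The default level=20 is the numeric value of logging.INFO in the standard library. The sketch below shows one way a setup function can avoid duplicate handlers while still letting the level change on later calls. The name sketch_setup_logging and the logger name "sketch_logger" are assumptions for illustration, not the package's implementation.

```python
import logging

DEFAULT_FORMAT = "%(asctime)s - %(name)s - %(levelname)s - %(message)s"


def sketch_setup_logging(
    level: int = logging.INFO, format: str = DEFAULT_FORMAT
) -> logging.Logger:
    """Hypothetical stand-in: configure a logger without stacking handlers."""
    logger = logging.getLogger("sketch_logger")
    logger.setLevel(level)  # repeated calls can raise or lower the level
    if not logger.handlers:
        # Attach a handler only once, so messages are not emitted twice.
        logger.addHandler(logging.StreamHandler())
    for handler in logger.handlers:
        handler.setFormatter(logging.Formatter(format))
    return logger


logger = sketch_setup_logging()
logger.info("default INFO level")
logger = sketch_setup_logging(level=logging.DEBUG)
logger.debug("now visible after switching to DEBUG")
```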