rl_salamandra_alignment package
Subpackages
- rl_salamandra_alignment.distributed_configs package
- rl_salamandra_alignment.templates package
- rl_salamandra_alignment.trl_scripts package
- Submodules
- rl_salamandra_alignment.trl_scripts.alignprop module
- rl_salamandra_alignment.trl_scripts.bco module
- rl_salamandra_alignment.trl_scripts.chat module
- rl_salamandra_alignment.trl_scripts.cpo module
- rl_salamandra_alignment.trl_scripts.ddpo module
- rl_salamandra_alignment.trl_scripts.dpo module
- rl_salamandra_alignment.trl_scripts.dpo_online module
- rl_salamandra_alignment.trl_scripts.dpo_vlm module
- rl_salamandra_alignment.trl_scripts.gkd module
- rl_salamandra_alignment.trl_scripts.grpo module
- rl_salamandra_alignment.trl_scripts.kto module
- rl_salamandra_alignment.trl_scripts.nash_md module
- rl_salamandra_alignment.trl_scripts.orpo module
- rl_salamandra_alignment.trl_scripts.reward_modeling module
- rl_salamandra_alignment.trl_scripts.sft module
- rl_salamandra_alignment.trl_scripts.sft_video_llm module
- rl_salamandra_alignment.trl_scripts.sft_vlm module
- rl_salamandra_alignment.trl_scripts.xpo module
- Module contents
- rl_salamandra_alignment.utils package
Submodules
rl_salamandra_alignment.cli module
Console script for rl_salamandra_alignment.
rl_salamandra_alignment.convert_dataset module
rl_salamandra_alignment.generate_scripts module
Tools for generating scripts by filling out templates
- rl_salamandra_alignment.generate_scripts.generate_all_job_files(config)
Generate the Slurm scripts for an experiment
- Parameters:
config (dict) – execution config dict for experiment
- Returns:
for each subexperiment, paths to the distributed execution script and the launching script
- Return type:
list[tuple]
- rl_salamandra_alignment.generate_scripts.generate_distributed_run_script(output_dir, config, id)
Write the script for distributed execution of a subexperiment, by filling out the script template.
- Parameters:
output_dir (str) – root directory for the outputs of the experiment.
config (dict) – execution config dict for subexperiment
id (str) – Sub-experiment id
- Returns:
Text of the script for distributed execution
- Return type:
str
- rl_salamandra_alignment.generate_scripts.generate_eval_scripts_for_one_training(output_dir, config, id)
Generate the EVALUATION Slurm scripts for a subexperiment
- Parameters:
output_dir (str) – root directory for the outputs of the experiment.
config (dict) – execution config dict for subexperiment
id (str) – Sub-experiment id
- Returns:
paths to the evaluation scripts
- Return type:
dict
- rl_salamandra_alignment.generate_scripts.generate_harness_eval_script(output_dir, config, id)
Generate the HARNESS evaluation Slurm scripts for a subexperiment
- Parameters:
output_dir (str) – root directory for the outputs of the experiment.
config (dict) – execution config dict for subexperiment
id (str) – Sub-experiment id
- Returns:
paths to the harness evaluation scripts
- Return type:
dict
- rl_salamandra_alignment.generate_scripts.generate_launch_script(output_dir, config, id)
Write the script for launching a subexperiment, by filling out the script template.
- Parameters:
output_dir (str) – root directory for the outputs of the experiment.
config (dict) – execution config dict for subexperiment
id (str) – Sub-experiment id
- Returns:
Text of the script for launching the subexperiment
- Return type:
str
- rl_salamandra_alignment.generate_scripts.generate_local_eval_script(output_dir, config, id)
Generate the LOCAL evaluation Slurm scripts for a subexperiment (for example, red teaming)
- Parameters:
output_dir (str) – root directory for the outputs of the experiment.
config (dict) – execution config dict for subexperiment
id (str) – Sub-experiment id
- Returns:
paths to the local evaluation scripts
- Return type:
dict
- rl_salamandra_alignment.generate_scripts.generate_one_job_set(output_dir, config, id)
Generate the Slurm scripts for a subexperiment (both training and evaluation)
- Parameters:
output_dir (str) – root directory for the outputs of the experiment.
config (dict) – execution config dict for subexperiment
id (str) – Sub-experiment id
- Returns:
paths to the scripts for training and evaluation
- Return type:
dict
- rl_salamandra_alignment.generate_scripts.generate_one_training_job(output_dir, config, id)
Generate the TRAINING Slurm scripts for a subexperiment
- Parameters:
output_dir (str) – root directory for the outputs of the experiment.
config (dict) – execution config dict for subexperiment
id (str) – Sub-experiment id
- Returns:
paths to the distributed execution script and the launching script
- Return type:
dict
- rl_salamandra_alignment.generate_scripts.generate_slurm_preamble(sbatch_args)
Generate the preamble for a Slurm job.
- Parameters:
sbatch_args (dict) – Arguments for sbatch
- Returns:
Preamble with all #SBATCH directives filled in
- Return type:
str
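The following is a minimal sketch of how a preamble of this kind could be built from a dict; it is not the package's implementation. The function name `sketch_slurm_preamble`, the `--key=value` rendering, and the leading shebang line are assumptions made for illustration.

```python
def sketch_slurm_preamble(sbatch_args: dict) -> str:
    """Render one '#SBATCH --key=value' line per dict entry (hypothetical helper)."""
    # Assumption: the preamble starts with a bash shebang.
    lines = ["#!/bin/bash"]
    for key, value in sbatch_args.items():
        # Assumption: keys are long-form sbatch option names without the leading dashes.
        lines.append(f"#SBATCH --{key}={value}")
    return "\n".join(lines) + "\n"


preamble = sketch_slurm_preamble(
    {"job-name": "dpo_run", "nodes": 2, "time": "02:00:00"}
)
print(preamble)
```

Keeping the arguments in a plain dict means the same config can be reused for the training and evaluation jobs with only a few keys overridden.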
- rl_salamandra_alignment.generate_scripts.get_config_ids(config_list)
Give a unique ID to each config. Each ID corresponds to a subexperiment.
- Parameters:
config_list (list) – List of config dicts
- Returns:
List of tuples (id, config)
- Return type:
list
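As an illustration of the return shape described above, here is a small sketch that pairs each config with an ID. The ID scheme (a zero-padded position in the list) and the name `sketch_config_ids` are assumptions; the actual function may derive its IDs differently.

```python
def sketch_config_ids(config_list: list) -> list:
    """Pair each config dict with a unique string ID (hypothetical scheme)."""
    # Assumption: the ID is the zero-padded index of the config in the list.
    return [(f"{index:03d}", config) for index, config in enumerate(config_list)]


configs = [{"learning_rate": 1e-6}, {"learning_rate": 5e-6}]
for sub_id, sub_config in sketch_config_ids(configs):
    print(sub_id, sub_config)
```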
- rl_salamandra_alignment.generate_scripts.get_output_dir(config)
Extract the field ‘output directory’ from a config dict, making sure it is a string
- Parameters:
config (dict) – execution config dict for experiment
- Raises:
ValueError – Raised if the value is not a string (e.g. a list).
- Returns:
Path to the output directory
- Return type:
str
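The validation described above can be sketched as follows. This is an illustrative stand-in, not the package's code: the function name `sketch_get_output_dir` and the config key `"output_dir"` are assumptions, since the exact key name is not given here.

```python
def sketch_get_output_dir(config: dict, key: str = "output_dir") -> str:
    """Return the output directory from a config, rejecting non-string values."""
    # Assumption: the field lives under the hypothetical key "output_dir".
    value = config[key]
    if not isinstance(value, str):
        # A list here would typically mean the field was mistaken for a sweep parameter.
        raise ValueError(
            f"Expected '{key}' to be a string, got {type(value).__name__}"
        )
    return value


print(sketch_get_output_dir({"output_dir": "/tmp/experiment"}))
```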
- rl_salamandra_alignment.generate_scripts.get_script_args_string(script_args_dict)
Convert a Python dictionary into a Bash dictionary
- Parameters:
script_args_dict (dict) – Python dictionary to convert
- Returns:
Bash dictionary
- Return type:
str
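The exact text format this function produces is not specified here. One plausible reading of "Bash dictionary" is a Bash associative array declaration, and the sketch below shows that interpretation only. The function name `sketch_script_args_string`, the array name `script_args`, and the `declare -A` form are all assumptions.

```python
import shlex


def sketch_script_args_string(script_args_dict: dict, name: str = "script_args") -> str:
    """Render a dict as a Bash associative array declaration (hypothetical format)."""
    entries = " ".join(
        # shlex.quote protects keys and values containing spaces or shell metacharacters.
        f"[{shlex.quote(str(key))}]={shlex.quote(str(value))}"
        for key, value in script_args_dict.items()
    )
    return f"declare -A {name}=({entries})"


bash_dict = sketch_script_args_string({"beta": 0.1, "dataset_name": "my data"})
print(bash_dict)
```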
- rl_salamandra_alignment.generate_scripts.replace_in_template(text_template, variable_name, variable_value)
Inside a template, replace a placeholder of the form “{{MY_VAR}}” with a value
- Parameters:
text_template (str) – Template
variable_name (str) – Name of the variable in all caps.
variable_value (Union[str, int, float]) – Value of the variable
- Returns:
template with the filled slot
- Return type:
str
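A minimal sketch of this kind of placeholder substitution is shown below. It assumes a plain string replacement of every occurrence of the placeholder; the name `sketch_replace_in_template` is hypothetical and the real function may handle edge cases differently.

```python
def sketch_replace_in_template(text_template, variable_name, variable_value):
    """Replace every '{{VARIABLE_NAME}}' placeholder with the given value."""
    placeholder = "{{" + variable_name + "}}"
    # Values may be int or float, so convert to str before substituting.
    return text_template.replace(placeholder, str(variable_value))


template = "#SBATCH --nodes={{NODES}}\n#SBATCH --job-name={{JOB_NAME}}\n"
filled = sketch_replace_in_template(template, "NODES", 4)
filled = sketch_replace_in_template(filled, "JOB_NAME", "dpo_run")
print(filled)
```

Filling one variable per call keeps each substitution explicit, at the cost of one pass over the template per variable.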
- rl_salamandra_alignment.generate_scripts.setup_macro_output_dir_tree(output_dir)
Construct the directory tree for running an experiment.
- Parameters:
output_dir (str) – root directory for the outputs of the experiment.
- Return type:
None
- rl_salamandra_alignment.generate_scripts.setup_micro_output_dir_tree(output_dir, config, id)
Construct the directory tree for running a subexperiment.
- Parameters:
output_dir (str) – root directory for the outputs of the experiment.
config (dict) – execution config dict for subexperiment
id (str) – Sub-experiment id
- Return type:
None
Module contents
Top-level package for RL - Salamandra Alignment.