flexs.baselines.explorers.environments.dyna_ppo

DyNA-PPO environment module.

class flexs.baselines.explorers.environments.dyna_ppo.DynaPPOEnvironment(alphabet, seq_length, model, landscape, batch_size)[source]

Bases: tf_agents.environments.py_environment.PyEnvironment

DyNA-PPO environment based on TF-Agents.

action_spec()[source]

Define agent actions.

property batch_size[source]

TF-Agents property that returns the environment batch size.

batched()[source]

TF-Agents function indicating that this environment returns batches of time steps.

get_cached_fitness(seq)[source]

Get the cached fitness of a sequence computed in previous episodes.
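A minimal pure-Python sketch of the caching idea behind this method: memoize expensive fitness evaluations so sequences seen in earlier episodes are not re-scored. The `FitnessCache` class and `fitness_fn` callable are illustrative stand-ins, not the actual flexs implementation.

```python
class FitnessCache:
    """Hypothetical sketch of episode-to-episode fitness caching."""

    def __init__(self, fitness_fn):
        self.fitness_fn = fitness_fn  # expensive model or landscape call
        self._cache = {}

    def get_cached_fitness(self, seq):
        # Return the stored fitness if seq was scored in a prior episode;
        # otherwise evaluate once and memoize the result.
        if seq not in self._cache:
            self._cache[seq] = self.fitness_fn(seq)
        return self._cache[seq]


# Toy fitness function: fraction of "A" residues in the sequence.
cache = FitnessCache(lambda s: s.count("A") / len(s))
print(cache.get_cached_fitness("AATA"))  # 0.75 (computed)
print(cache.get_cached_fitness("AATA"))  # 0.75 (served from cache)
```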

observation_spec()[source]

Define environment observations.

sequence_density(seq)[source]

Get the average distance from seq to all observed sequences.
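A hypothetical sketch of a sequence-density-style computation: the mean Hamming distance from a query sequence to the set of previously observed sequences. The flexs implementation may weight or normalize this differently; `hamming` and `sequence_density` here are illustrative helpers.

```python
def hamming(a, b):
    # Number of positions at which two equal-length sequences differ.
    return sum(x != y for x, y in zip(a, b))


def sequence_density(seq, observed):
    # Mean Hamming distance from seq to every observed sequence.
    return sum(hamming(seq, o) for o in observed) / len(observed)


observed = ["AAAA", "AATA", "TTTT"]
# Distances: 0, 1, 4 -> average (0 + 1 + 4) / 3
print(sequence_density("AAAA", observed))
```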

set_fitness_model_to_gt(fitness_model_is_gt)[source]

Set the fitness model to the ground truth landscape or to the model.

Call with True when doing an experiment-based training round and call with False when doing a model-based training round.
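The toggle described above can be sketched as follows. This is a minimal stand-in, not the flexs code: `FitnessSource` and its two callables are hypothetical, showing only how a single flag routes fitness queries to either the ground-truth landscape or the learned surrogate model.

```python
class FitnessSource:
    """Hypothetical sketch of the ground-truth/model fitness toggle."""

    def __init__(self, landscape_fn, model_fn):
        self.landscape_fn = landscape_fn  # ground-truth oracle
        self.model_fn = model_fn          # learned surrogate model
        self.fitness_model_is_gt = True

    def set_fitness_model_to_gt(self, fitness_model_is_gt):
        self.fitness_model_is_gt = fitness_model_is_gt

    def fitness(self, seq):
        # Route the query to whichever source the current round uses.
        fn = self.landscape_fn if self.fitness_model_is_gt else self.model_fn
        return fn(seq)


src = FitnessSource(lambda s: 1.0, lambda s: 0.5)
src.set_fitness_model_to_gt(True)   # experiment-based round
print(src.fitness("ACGT"))  # 1.0
src.set_fitness_model_to_gt(False)  # model-based round
print(src.fitness("ACGT"))  # 0.5
```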

time_step_spec()[source]

Define time steps.

class flexs.baselines.explorers.environments.dyna_ppo.DynaPPOEnvironmentMutative(alphabet, starting_seq, model, landscape, max_num_steps)[source]

Bases: tf_agents.environments.py_environment.PyEnvironment

DyNA-PPO environment based on TF-Agents.

Note that unlike the other DynaPPO environment, this one is mutative rather than constructive.
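The constructive/mutative distinction can be illustrated with a toy sketch (not the flexs code itself): a constructive environment appends one residue per step until the full length is reached, while a mutative one edits a complete starting sequence in place. The `ALPHABET` constant and both step functions are hypothetical.

```python
ALPHABET = "ACGT"


def constructive_step(partial_seq, action):
    # Constructive: the action indexes a character to append,
    # growing the sequence one residue per step.
    return partial_seq + ALPHABET[action]


def mutative_step(seq, action):
    # Mutative: the action encodes (position, character),
    # substituting one residue of an already-complete sequence.
    pos, char_idx = divmod(action, len(ALPHABET))
    return seq[:pos] + ALPHABET[char_idx] + seq[pos + 1:]


print(constructive_step("AC", 2))        # "ACG"
print(mutative_step("AAAA", 1 * 4 + 3))  # "ATAA": position 1 set to "T"
```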

action_spec()[source]

Define agent actions.

get_state_string()[source]

Get sequence representing current state.

observation_spec()[source]

Define environment observations.

sequence_density(seq)[source]

Get the average distance from seq to all observed sequences.

set_fitness_model_to_gt(fitness_model_is_gt)[source]

Set the fitness model to the ground truth landscape or to the model.

Call with True when doing an experiment-based training round and call with False when doing a model-based training round.