flexs.baselines.explorers.environments.dyna_ppo

DyNA-PPO environment module.

class flexs.baselines.explorers.environments.dyna_ppo.DynaPPOEnvironment(alphabet, seq_length, model, landscape, batch_size)[source]

Bases: tf_agents.environments.py_environment.PyEnvironment

DyNA-PPO environment based on TF-Agents.

action_spec()[source]

Define agent actions.

property batch_size[source]

TF-Agents property that returns the environment batch size.

batched()[source]

TF-Agents function indicating that this environment returns batches of time steps.

get_cached_fitness(seq)[source]

Get the cached fitness of a sequence computed in previous episodes.
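A minimal pure-Python sketch of the caching idea behind this method: memoize expensive fitness evaluations so sequences seen in earlier episodes are not re-scored. The `FitnessCache` class and `fitness_fn` callable are illustrative stand-ins, not the actual flexs implementation.

```python
class FitnessCache:
    """Hypothetical sketch of episode-to-episode fitness caching."""

    def __init__(self, fitness_fn):
        self.fitness_fn = fitness_fn  # expensive model or landscape call
        self._cache = {}

    def get_cached_fitness(self, seq):
        # Return the stored fitness if seq was scored in a prior episode;
        # otherwise evaluate once and memoize the result.
        if seq not in self._cache:
            self._cache[seq] = self.fitness_fn(seq)
        return self._cache[seq]


# Toy fitness function: fraction of "A" residues in the sequence.
cache = FitnessCache(lambda s: s.count("A") / len(s))
print(cache.get_cached_fitness("AATA"))  # 0.75 (computed)
print(cache.get_cached_fitness("AATA"))  # 0.75 (served from cache)
```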

observation_spec()[source]

Define environment observations.

sequence_density(seq)[source]

Get the average distance from seq to all observed sequences.
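A hypothetical sketch of a sequence-density-style computation: the mean Hamming distance from a query sequence to the set of previously observed sequences. The flexs implementation may weight or normalize this differently; `hamming` and `sequence_density` here are illustrative helpers.

```python
def hamming(a, b):
    # Number of positions at which two equal-length sequences differ.
    return sum(x != y for x, y in zip(a, b))


def sequence_density(seq, observed):
    # Mean Hamming distance from seq to every observed sequence.
    return sum(hamming(seq, o) for o in observed) / len(observed)


observed = ["AAAA", "AATA", "TTTT"]
# Distances: 0, 1, 4 -> average (0 + 1 + 4) / 3
print(sequence_density("AAAA", observed))
```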

set_fitness_model_to_gt(fitness_model_is_gt)[source]

Set the fitness model to the ground truth landscape or to the model.

Call with True when doing an experiment-based training round and call with False when doing a model-based training round.
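The toggle described above can be sketched as follows. This is a minimal stand-in, not the flexs code: `FitnessSource` and its two callables are hypothetical, showing only how a single flag routes fitness queries to either the ground-truth landscape or the learned surrogate model.

```python
class FitnessSource:
    """Hypothetical sketch of the ground-truth/model fitness toggle."""

    def __init__(self, landscape_fn, model_fn):
        self.landscape_fn = landscape_fn  # ground-truth oracle
        self.model_fn = model_fn          # learned surrogate model
        self.fitness_model_is_gt = True

    def set_fitness_model_to_gt(self, fitness_model_is_gt):
        self.fitness_model_is_gt = fitness_model_is_gt

    def fitness(self, seq):
        # Route the query to whichever source the current round uses.
        fn = self.landscape_fn if self.fitness_model_is_gt else self.model_fn
        return fn(seq)


src = FitnessSource(lambda s: 1.0, lambda s: 0.5)
src.set_fitness_model_to_gt(True)   # experiment-based round
print(src.fitness("ACGT"))  # 1.0
src.set_fitness_model_to_gt(False)  # model-based round
print(src.fitness("ACGT"))  # 0.5
```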

time_step_spec()[source]

Define time steps.

class flexs.baselines.explorers.environments.dyna_ppo.DynaPPOEnvironmentMutative(alphabet, starting_seq, model, landscape, max_num_steps)[source]

Bases: tf_agents.environments.py_environment.PyEnvironment

DyNA-PPO environment based on TF-Agents.

Note that unlike the other DynaPPO environment, this one is mutative rather than constructive.
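The constructive/mutative distinction can be illustrated with a toy sketch (not the flexs code itself): a constructive environment appends one residue per step until the full length is reached, while a mutative one edits a complete starting sequence in place. The `ALPHABET` constant and both step functions are hypothetical.

```python
ALPHABET = "ACGT"


def constructive_step(partial_seq, action):
    # Constructive: the action indexes a character to append,
    # growing the sequence one residue per step.
    return partial_seq + ALPHABET[action]


def mutative_step(seq, action):
    # Mutative: the action encodes (position, character),
    # substituting one residue of an already-complete sequence.
    pos, char_idx = divmod(action, len(ALPHABET))
    return seq[:pos] + ALPHABET[char_idx] + seq[pos + 1:]


print(constructive_step("AC", 2))        # "ACG"
print(mutative_step("AAAA", 1 * 4 + 3))  # "ATAA": position 1 set to "T"
```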

action_spec()[source]

Define agent actions.

get_state_string()[source]

Get sequence representing current state.

observation_spec()[source]

Define environment observations.

sequence_density(seq)[source]

Get the average distance from seq to all observed sequences.

set_fitness_model_to_gt(fitness_model_is_gt)[source]

Set the fitness model to the ground truth landscape or to the model.

Call with True when doing an experiment-based training round and call with False when doing a model-based training round.