stormvogel.extensions.gym_sampling
Functions
- sample_gym: Sample the gym environment and convert it to a Stormvogel MDP.
Module Contents
- stormvogel.extensions.gym_sampling.sample_gym(env: gymnasium.Env, no_samples: int = 10, sample_length: int = 20, gymnasium_scheduler: Callable[[Any], int] | None = None, convert_obs: Callable[[Any], Any] = lambda x: ..., max_size: int = 10000)
Sample the gym environment and convert it to a Stormvogel MDP. Gym environments are really POMDPs, and gymnasium only gives access to the observation, not the underlying state. The result is therefore an MDP in which states with the same observation (and termination status) are lumped together. Probabilities are frequentist estimates, so their accuracy depends on how often each “state” is visited.
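The lumping and the frequentist estimates described above can be illustrated with a minimal, self-contained sketch. It does not use Stormvogel or gymnasium and is not the library's implementation: `TwoStateEnv` and `estimate_transitions` are hypothetical names, and the toy environment only imitates the `reset`/`step` shape of a gymnasium environment. States are keyed by `(observation, terminated)`, and each probability is the observed count divided by the number of visits.

```python
import random
from collections import Counter, defaultdict


class TwoStateEnv:
    """Hypothetical toy environment with a gymnasium-like reset/step interface."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state, {}

    def step(self, action):
        # From state 0, the single action reaches state 1 with probability 0.3.
        if self.state == 0 and self.rng.random() < 0.3:
            self.state = 1
        terminated = self.state == 1
        return self.state, 0.0, terminated, False, {}


def estimate_transitions(env, no_samples, sample_length, scheduler):
    """Count observed transitions and normalise them into relative frequencies."""
    counts = defaultdict(Counter)
    for _ in range(no_samples):
        obs, _info = env.reset()
        key = (obs, False)  # a "state" is an (observation, terminated) pair
        for _ in range(sample_length):
            action = scheduler(obs)
            obs, _reward, terminated, truncated, _info = env.step(action)
            next_key = (obs, terminated)
            counts[(key, action)][next_key] += 1
            key = next_key
            if terminated or truncated:
                break
    return {
        state_action: {nxt: n / sum(c.values()) for nxt, n in c.items()}
        for state_action, c in counts.items()
    }


probs = estimate_transitions(
    TwoStateEnv(seed=0), no_samples=2000, sample_length=20, scheduler=lambda obs: 0
)
# Estimated distribution for taking action 0 in the non-terminal "state" 0.
print(probs[((0, False), 0)])
```

With many samples the estimate for reaching `(1, True)` should settle near the true value of 0.3; with only a handful of visits it can be far off, which is the accuracy caveat mentioned above.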
- Args:
  env (gymnasium.Env): Gymnasium env.
  no_samples (int): Total number of samples (each starting at an initial state). To resolve multiple initial states, a new, single initial state is added if necessary.
  sample_length (int): The maximum length of a single sample.
  gymnasium_scheduler (Callable[[Any], int] | None): A function from states to action numbers.
  convert_obs (Callable[[Any], Any]): Converts the observations to a hashable type. You can also apply rounding here.
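As an illustration of what a `convert_obs` function might look like, here is a minimal sketch that turns nested sequences of numbers into nested tuples and rounds the floats. The name `round_obs` and its `digits` parameter are hypothetical and not part of Stormvogel. The sketch assumes observations are plain Python numbers, strings, or iterables of those.

```python
def round_obs(obs, digits=1):
    """Hypothetical convert_obs helper: return a hashable, rounded copy of obs."""
    if isinstance(obs, (bool, int, str)):
        return obs  # already hashable, nothing to round
    if isinstance(obs, float):
        return round(obs, digits)
    # Assumption: anything else is an iterable of observations (list, tuple, ...).
    return tuple(round_obs(x, digits) for x in obs)


a = round_obs([0.52, -0.31])
b = round_obs([0.48, -0.29])
# Both observations round to the same key, so they would be lumped together.
print(a, b, a == b)
# The result can be used as a dictionary key, unlike the original list.
visits = {a: 1}
visits[b] += 1
```

Since the signature above declares `convert_obs` as a one-argument callable, a helper like this could be passed as `convert_obs=round_obs`. Coarser rounding lumps more observations into one state, so each state is visited more often at the cost of a less precise model.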