MultitaskEnv

class curiosity_gym.envs.multitaskenv.MultitaskEnv(agentPOV: AgentPOV | str = 'global', task: int = 1, random: bool = False, render_mode: str | None = None, window_width: int = 1200)

Defines the structure of the curiosity-gym multitask environment.

The environment consists of three rooms and two distinct tasks. The agent always starts in the middle room. The left room hosts task no. 1, in which the agent must collect a key, open a door, and move to the green target cell. The right room hosts task no. 2, in which the agent must push a ball to the purple target cell. Only one task is active at a time, and the agent is rewarded only for completing the active task. The environment is designed to test the agent’s ability to transition between tasks within the same environment.

Parameters:

agentPOV (AgentPOV | str, optional) – Object or string defining the observation and action spaces of the RL agent. Valid string values are `'global'`, `'local_W'` and `'forward_L_W'`, where W and L are integers giving the width and length of the respective POV. By default `'global'` (GlobalView).
task (int, optional) – Identifier for task. In task no. 1 the agent needs to reach the green target cell in the left room. In task no. 2 the agent needs to push the ball to the purple target cell in the right room. By default 1.
random (bool, optional) – Whether the position of the target for both tasks should be randomly selected within their respective rooms. By default False.
render_mode (str | None, optional) – Render mode in which the environment is run. If the render mode is `'human'`, the environment is rendered with PyGame. By default None.
window_width (int, optional) – Horizontal size of the PyGame window in human render mode. By default 1200.
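
The string forms accepted by `agentPOV` follow a small naming pattern. The sketch below illustrates that pattern only; `POVSpec` and `parse_pov` are hypothetical names introduced here and are not part of curiosity-gym, whose own parsing may differ.

```python
import re
from typing import NamedTuple, Optional


class POVSpec(NamedTuple):
    """Hypothetical container for a parsed agentPOV string."""
    kind: str               # "global", "local" or "forward"
    length: Optional[int]   # L, only present for "forward_L_W"
    width: Optional[int]    # W, present for "local_W" and "forward_L_W"


_LOCAL = re.compile(r"local_(\d+)")
_FORWARD = re.compile(r"forward_(\d+)_(\d+)")


def parse_pov(value: str) -> POVSpec:
    """Split an agentPOV string into its kind and integer dimensions.

    Illustration of the documented string format, not the library's parser.
    """
    if value == "global":
        return POVSpec("global", None, None)
    match = _LOCAL.fullmatch(value)
    if match:
        return POVSpec("local", None, int(match.group(1)))
    match = _FORWARD.fullmatch(value)
    if match:
        # The documented order is forward_L_W: length first, then width.
        return POVSpec("forward", int(match.group(1)), int(match.group(2)))
    raise ValueError(f"Unrecognised agentPOV string: {value!r}")
```

For instance, `parse_pov("forward_5_3")` yields a length of 5 and a width of 3, while a malformed value such as `"local_x"` raises `ValueError`.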

Figure (MultitaskEnv_optimal.gif): Example of a MultitaskEnv episode with an optimal policy for alternating tasks.
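
One way to evaluate task transitions is to alternate the `task` argument in blocks of episodes. The sketch below only produces such a schedule. `alternating_tasks` and `block_size` are hypothetical names, and the sketch assumes the task is chosen through the constructor's `task` parameter, since this page does not say whether it can be switched on an existing instance.

```python
from typing import Iterator, Sequence


def alternating_tasks(
    n_episodes: int,
    block_size: int,
    tasks: Sequence[int] = (1, 2),
) -> Iterator[int]:
    """Yield one task id per episode, switching task every `block_size` episodes."""
    if block_size < 1:
        raise ValueError("block_size must be at least 1")
    for episode in range(n_episodes):
        # Integer division gives the index of the current block;
        # the modulo cycles through the available task ids.
        yield tasks[(episode // block_size) % len(tasks)]
```

Each yielded id could then be passed as `task=` when constructing a `MultitaskEnv` for that episode.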

Methods

`check_task`	Checks whether the agent has completed the task that is currently active.
`close`	Clean up the environment.
`find_object`	Get non-wall grid object at given position.
`get_object_ids`	Get ids for all grid object types.
`get_state`	Get the current state of the environment.
`get_wrapper_attr`	Gets the attribute name from the environment.
`has_wrapper_attr`	Checks if the attribute name exists in the environment.
`heatmap`	Display heatmap of position counts of the agent.
`init_render`	Initialise render objects.
`load_walls`	Convert array of positions to wall objects for environment.
`render`	Compute the render frames as specified by `render_mode`.
`reset`	Reset the environment to an initial internal state.
`set_wrapper_attr`	Sets the attribute name on the environment with value.
`simulate`	Simulate the state of the environment if a given action were taken.
`step`	Run one timestep of the environment’s dynamics using the agent actions.
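
The `reset` and `step` methods are typically combined into an episode loop. The sketch below is a minimal random-policy rollout that assumes the environment follows the Gymnasium conventions: `reset(seed=...)` returns `(observation, info)` and `step(action)` returns `(observation, reward, terminated, truncated, info)`. These signatures are an assumption and are not spelled out on this page. `run_episode` is a hypothetical helper, not part of curiosity-gym.

```python
from typing import Any, Optional, Tuple


def run_episode(
    env: Any,
    max_steps: int = 200,
    seed: Optional[int] = None,
) -> Tuple[float, int]:
    """Roll out one episode with random actions.

    Returns the accumulated reward and the number of steps taken.
    Assumes a Gymnasium-style environment interface.
    """
    _obs, _info = env.reset(seed=seed)
    total_reward = 0.0
    steps = 0
    for _ in range(max_steps):
        action = env.action_space.sample()  # random policy as a placeholder
        _obs, reward, terminated, truncated, _info = env.step(action)
        total_reward += float(reward)
        steps += 1
        if terminated or truncated:
            break
    return total_reward, steps
```

With curiosity-gym installed, this could be called as `run_episode(MultitaskEnv(agentPOV="global", task=1))` after `from curiosity_gym.envs.multitaskenv import MultitaskEnv`, followed by `env.close()` to release any render resources.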

Attributes

`metadata`	Metadata of the environment.
`np_random`	Returns the environment's internal `_np_random`, which is initialised with a random seed if not already set.
`np_random_seed`	Returns the environment's internal `_np_random_seed`, which is first initialised with a random int as seed if not already set.
`render_mode`	Render mode in which the environment is run.
`spec`
`unwrapped`	Returns the base non-wrapped environment.
`reward_range`	Range of rewards that can be obtained within one episode.
`action_space`	Space of possible actions an RL agent can choose from.
`observation_space`	Space of possible observations returned by the environment.