Maze Environment



Integration of the Maze game with the OpenEnv framework.

Architecture



┌────────────────────────────────────┐
│ RL Training Code (Client)          │
│   MazeEnv.step(action)             │
└──────────────┬─────────────────────┘
               │ HTTP
┌──────────────▼─────────────────────┐
│ FastAPI Server (Docker)            │
│   MazeEnvironment                  │
│     ├─ Wraps Maze environment      │
│     └─ Agent controls player       │
└────────────────────────────────────┘


Installation & Usage



Option 1: Local Development (without Docker)



Requirements:
  • Python 3.11+
  • NumPy

    from envs.maze_env import MazeEnv, MazeAction

    # Start the local server manually (in a separate terminal):
    #   python -m envs.maze_env.server.app

    # Connect to the local server
    env = MazeEnv(base_url="http://localhost:8000")

    # Reset the environment
    result = env.reset()
    print(f"Initial position: {result.observation.position}")
    print(f"Legal actions: {result.observation.legal_actions}")

    # Take actions
    for _ in range(10):
        action_id = result.observation.legal_actions[0]  # Choose the first legal action
        result = env.step(MazeAction(action=action_id))
        print(f"Reward: {result.reward}, Done: {result.done}")
        if result.done:
            break

    # Cleanup
    env.close()
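
The reset/step loop above can be wrapped in a small reusable helper. The following is a minimal sketch, not part of maze_env: `run_episode`, `StubEnv`, `StubResult`, and `StubObservation` are hypothetical names, and the stub only mimics the result shape used above (`observation.legal_actions`, `reward`, `done`) so that the example runs without a server. For simplicity the policy returns a plain integer.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class StubObservation:
    legal_actions: List[int] = field(default_factory=list)


@dataclass
class StubResult:
    observation: StubObservation
    reward: float = 0.0
    done: bool = False


class StubEnv:
    """Hypothetical stand-in for MazeEnv: the episode ends after `length` steps."""

    def __init__(self, length: int = 3) -> None:
        self.length = length
        self.steps = 0

    def reset(self) -> StubResult:
        self.steps = 0
        return StubResult(StubObservation(legal_actions=[0, 1]))

    def step(self, action: int) -> StubResult:
        self.steps += 1
        done = self.steps >= self.length
        reward = 1.0 if done else -0.1
        return StubResult(StubObservation(legal_actions=[0, 1]), reward, done)


def run_episode(env, policy: Callable[[List[int]], int], max_steps: int = 100):
    """Run one episode; return (total_reward, steps_taken, finished)."""
    result = env.reset()
    total_reward, steps = 0.0, 0
    while steps < max_steps and not result.done:
        action = policy(result.observation.legal_actions)
        result = env.step(action)
        total_reward += result.reward
        steps += 1
    return total_reward, steps, result.done


total, steps, finished = run_episode(StubEnv(length=3), random.choice)
print(f"steps={steps}, finished={finished}, total_reward={total:.1f}")
```

With the real client, the integer chosen by the policy would presumably need to be wrapped in a MazeAction before being passed to env.step().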


Option 2: Docker (Recommended)



Build the Docker image:

    cd OpenEnv
    docker build -f src/envs/maze_env/server/Dockerfile -t maze-env:latest .

Use it with from_docker_image():

    from envs.maze_env import MazeEnv, MazeAction

    # Automatically starts the container
    env = MazeEnv.from_docker_image("maze-env:latest")

    result = env.reset()
    result = env.step(MazeAction(action=0))

    env.close()  # Stops the container
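
Because close() is what stops the container, it may be worth making sure it runs even when training code raises. One way to do that, sketched here with the standard library's `contextlib.closing`, works for any object that exposes a close() method. `StubEnv` is a hypothetical stand-in used so the example runs without Docker; whether MazeEnv offers its own context-manager support is not assumed.

```python
from contextlib import closing


class StubEnv:
    """Hypothetical stand-in for MazeEnv that records whether close() ran."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


env = StubEnv()
try:
    with closing(env):
        # Any exception raised inside the block still triggers env.close().
        raise RuntimeError("simulated failure during training")
except RuntimeError:
    pass

print(env.closed)  # True
```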


Configuration



Variables

  • maze: the maze layout, stored as a NumPy array in mazearray.py

Example

    docker run -p 8000:8000 maze-env:latest
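
For illustration, a maze array of the kind described under Variables might look like the following. This is a sketch only: the cell encoding (0 = free, 1 = wall), the names `maze`, `start`, `exit_cell`, and `check_maze`, and the chosen start and exit positions are all assumptions, and the actual contents of mazearray.py may differ.

```python
import numpy as np

# Assumed encoding: 0 = free cell, 1 = wall.
maze = np.array([
    [0, 1, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
])

start = (0, 0)      # hypothetical start cell (row, col)
exit_cell = (3, 3)  # hypothetical exit cell (row, col)


def check_maze(maze: np.ndarray, start, exit_cell) -> None:
    """Basic sanity checks before handing the array to an environment."""
    if maze.ndim != 2:
        raise ValueError("maze must be a 2-D array")
    if not np.isin(maze, (0, 1)).all():
        raise ValueError("maze may only contain 0 (free) and 1 (wall)")
    for name, cell in (("start", start), ("exit", exit_cell)):
        if maze[cell] != 0:
            raise ValueError(f"{name} cell {cell} is not free")


check_maze(maze, start, exit_cell)
print(maze.shape)
```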


API Reference



MazeAction

    @dataclass
    class MazeAction(Action):
        action: int  # Action to take


MazeObservation

    @dataclass
    class MazeObservation(Observation):
        position: List[int]  # [row, col]
        total_reward: float  # Total reward
        legal_actions: List[int] = field(default_factory=list)  # Legal actions at the current position
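
How legal_actions might be derived from position can be sketched as follows. The action numbering (0 = left, 1 = right, 2 = up, 3 = down), the 0/1 cell encoding, and the helper name `legal_actions_at` are assumptions made for illustration; the environment's own mapping may differ.

```python
from typing import List

import numpy as np

# Assumed action ids and their (row, col) offsets.
MOVES = {0: (0, -1), 1: (0, 1), 2: (-1, 0), 3: (1, 0)}


def legal_actions_at(maze: np.ndarray, position: List[int]) -> List[int]:
    """Return action ids whose target cell is inside the maze and free (0)."""
    rows, cols = maze.shape
    row, col = position
    legal = []
    for action, (d_row, d_col) in MOVES.items():
        r, c = row + d_row, col + d_col
        if 0 <= r < rows and 0 <= c < cols and maze[r, c] == 0:
            legal.append(action)
    return legal


# Assumed encoding: 0 = free cell, 1 = wall.
maze = np.array([
    [0, 1, 0],
    [0, 0, 0],
    [1, 0, 1],
])
print(legal_actions_at(maze, [0, 0]))  # [3]
print(legal_actions_at(maze, [1, 1]))  # [0, 1, 3]
```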


MazeState

    @dataclass
    class MazeState(State):
        episode_id: str  # Episode ID
        step_count: int  # Number of steps
        done: bool = False  # Solve status
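
The way these fields might evolve over an episode can be sketched as below. The local `MazeState` is a standalone copy without the `State` base class, and `new_episode` and `advance` are hypothetical helpers, not part of the package; the reset and step behaviour shown is an assumption.

```python
import uuid
from dataclasses import dataclass


@dataclass
class MazeState:  # standalone copy for illustration; the real class extends State
    episode_id: str
    step_count: int
    done: bool = False


def new_episode() -> MazeState:
    """Start a fresh episode with a unique id and zero steps (assumed reset behaviour)."""
    return MazeState(episode_id=str(uuid.uuid4()), step_count=0)


def advance(state: MazeState, solved: bool) -> MazeState:
    """Return the state after one step; done stays True once the maze is solved."""
    return MazeState(state.episode_id, state.step_count + 1, state.done or solved)


state = new_episode()
state = advance(state, solved=False)
state = advance(state, solved=True)
print(state.step_count, state.done)  # 2 True
```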


References

  • Maze Environment: https://github.com/erikdelange/Reinforcement-Learning-Maze
