One ego per scene #14
base: main
Conversation
Greptile Overview
Greptile Summary
This PR implements a "one ego per scene" training mode where each world contains exactly one ego agent training alongside co-player agents, enabling multi-agent training scenarios. The implementation adds a new C function (my_shared_one_ego_per_scene, per the sequence diagram below) that assigns agents to worlds in this pattern.
Key Changes:
Critical Issues Found:
Configuration Changes:
Confidence Score: 2/5
Important Files Changed
File Analysis
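The mode is driven by the three flags the diagram below shows being loaded from adaptive.ini (one_ego_per_scene, co_player_enabled, create_expert_overflow). As a minimal sketch, assuming configparser and an illustrative section name not taken from the repo, a training script might read them like this:

```python
import configparser

# Sketch only: the [env] section name and fallback values are assumptions;
# the three flag names come from the PR's adaptive.ini changes.
cfg = configparser.ConfigParser()
cfg.read("adaptive.ini")

env_kwargs = {
    "one_ego_per_scene": cfg.getboolean("env", "one_ego_per_scene", fallback=False),
    "co_player_enabled": cfg.getboolean("env", "co_player_enabled", fallback=False),
    "create_expert_overflow": cfg.getboolean("env", "create_expert_overflow", fallback=False),
}
```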
Sequence Diagram
sequenceDiagram
participant Config as Config File
participant Main as Training Script
participant Vector as vector.py
participant Drive as drive.py
participant Binding as binding.h/c
participant DriveH as drive.h
Main->>Config: Load adaptive.ini
Config-->>Main: one_ego_per_scene=True<br/>co_player_enabled=True<br/>create_expert_overflow=False
Main->>Vector: make() with env_kwargs
alt co_player_enabled == True
Vector->>Vector: Load co-player policy from checkpoint
Vector->>Vector: Wrap policy with LSTM if rnn config exists
Vector->>Vector: Store policy in env_kwargs["co_player_policy"]["co_player_policy_func"]
end
Vector->>Drive: Initialize Drive environments
Drive->>Drive: Parse co_player_policy dict
alt co_player_conditioning exists
Drive->>Drive: Set co_player_condition_type
else co_player_conditioning is None
Note over Drive: BUG: co_player_condition_type<br/>not initialized!
end
Drive->>Binding: my_shared_population_play()
alt one_ego_per_scene == True
Binding->>Binding: my_shared_one_ego_per_scene()
loop For each ego agent
Binding->>Binding: Select random map
Binding->>DriveH: set_active_agents()
DriveH->>DriveH: Iterate through entities
alt create_expert_overflow == False
DriveH->>DriveH: Skip non-controlled agents
DriveH->>DriveH: Skip overflow agents beyond max_controlled_agents
else create_expert_overflow == True
DriveH->>DriveH: Create overflow agents as experts
end
Binding->>Binding: Assign 1 ego + N co-players per world
Binding->>Binding: Calculate placeholder slots
end
Binding-->>Drive: Return agent_offsets, map_ids, ego_ids, coplayer_ids
else one_ego_per_scene == False
Binding->>Binding: my_shared_split_numerically()
Binding-->>Drive: Split agents across worlds
end
Drive->>Drive: Store ego_ids, co_player_ids, place_holder_ids
Drive->>Drive: Initialize C environments with parameters
loop Training Loop
Main->>Drive: Step environments
alt co_player_condition_type != "none"
Note over Drive: BUG: AttributeError if<br/>co_player_condition_type not defined
Drive->>Drive: Add conditioning to co-player obs
end
Drive->>Drive: Forward ego policy
Drive->>Drive: Forward co-player policy
Drive-->>Main: Return observations, rewards, dones
end
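The co-player branch near the top of the diagram loads a checkpointed policy in vector.py, optionally wraps it with an LSTM when an rnn config exists, and stores it under env_kwargs["co_player_policy"]["co_player_policy_func"]. A rough sketch of that setup follows; the policy and LSTM classes, checkpoint format, and function name are assumptions, and only the nested dict key comes from the PR:

```python
import torch

def load_co_player_policy(env_kwargs, checkpoint_path, policy_cls,
                          lstm_cls=None, rnn_config=None):
    # Load the frozen co-player policy from a checkpoint (format assumed)
    policy = policy_cls()
    policy.load_state_dict(torch.load(checkpoint_path, map_location="cpu"))
    if rnn_config is not None and lstm_cls is not None:
        # Wrap the policy with an LSTM when an rnn config is present (per the diagram)
        policy = lstm_cls(policy, **rnn_config)
    policy.eval()
    # Store under the nested key that drive.py reads (nesting from the PR diagram)
    env_kwargs.setdefault("co_player_policy", {})
    env_kwargs["co_player_policy"]["co_player_policy_func"] = policy
    return env_kwargs
```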
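The assignment step that my_shared_one_ego_per_scene performs in C (a random map per world, exactly one ego plus N co-players, placeholder slots for unused indices, and the agent_offsets/map_ids/ego_ids/coplayer_ids outputs) can be illustrated with a Python analogue. This is a sketch of the flow in the diagram, not the C implementation; the function signature, slot layout, and return ordering are assumptions:

```python
import random

def one_ego_per_scene(num_worlds, agents_per_world, active_per_world, num_maps, seed=0):
    """Sketch: one ego per world, co-players in the remaining active slots,
    unused slots recorded as placeholders (layout assumed, not from the C code)."""
    rng = random.Random(seed)
    agent_offsets, map_ids = [], []
    ego_ids, coplayer_ids, placeholder_ids = [], [], []
    for world in range(num_worlds):
        offset = world * agents_per_world
        agent_offsets.append(offset)
        map_ids.append(rng.randrange(num_maps))   # random map per world
        ego_ids.append(offset)                    # exactly one ego per world
        active = min(active_per_world, agents_per_world)
        coplayer_ids.extend(range(offset + 1, offset + active))
        placeholder_ids.extend(range(offset + active, offset + agents_per_world))
    return agent_offsets, map_ids, ego_ids, coplayer_ids, placeholder_ids
```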
Additional Comments (1)
pufferlib/ocean/drive/drive.py, line 447 (link). logic: AttributeError when co_player_conditioning is None. self.co_player_condition_type is only set when self.co_player_conditioning is truthy (lines 141-142); check that the attribute exists before using it.
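A minimal sketch of the fix the comment asks for, guarding the attribute lookup (alternatively, the attribute could be defaulted to "none" in __init__); the helper names and surrounding observation handling are assumptions:

```python
def maybe_add_conditioning(env, co_player_obs):
    # getattr avoids the AttributeError when co_player_conditioning was None
    # and co_player_condition_type was therefore never set (lines 141-142)
    condition_type = getattr(env, "co_player_condition_type", "none")
    if condition_type != "none":
        co_player_obs = env.add_conditioning(co_player_obs)  # hypothetical helper
    return co_player_obs
```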
9 files reviewed, 3 comments
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Weights and Biases Link