Behaviour Architecture & FAQ
For this code release, a skeleton behaviour module has been included. You are welcome to use it as a base, or implement your own behaviour architecture. This page is an overview of how our code is designed and used.
Our behaviours are written in Python. At a high level, our C++ perception thread reads information from the world (through vision / sonars, etc), localises based on that information, and then calls a python function in an embedded interpreter which has to return an action to perform. This action is relayed to the relevant robot outputs (LEDs directly, or to the walk engine for physical motion).
In the past we’ve had essentially a tree of “skills”, where each skill has a series of states and creates lower level skills where needed. This was in the right direction, but a little messy / hard to follow, so the goal for the 2014 behaviour refresh was to make it clearer what the structure is, clean up the older behaviours and make it easier to write new behaviours / see how existing behaviours work. To do this quickly (without going away in a separate branch for too long), we’ve kept things as backwards compatible as possible.
There are three core classes:
- BehaviourTask: This is the superclass for any task.
- World: Every task has a world object that gives it access to shared information about the world. The task also sets a behaviour request in that world, which gets read at the end of the behaviour tick.
- TaskState: This is for Tasks that are complex enough to warrant states / state transitions / hysteresis.
- BehaviourTask is the parent-class for most skills and roles.
- Each task directly performs some action or delegates to a subtask.
- __init__ is the initializer of the class. It defines world, current_state and prev_state, and calls the init method. Should not be overridden.
- transition is where self.current_state should be updated. By default, it calls the transition method of current_state. Should be overridden.
- printStateChange prints the state change, which helps visualise it for debugging purposes. Should not be overridden.
- init is called during initialisation, directly from __init__. You should set the current_state variable to the initial state here. Should be overridden.
- tick gets called once on each frame. All it does is call transition and then _tick. Should not be overridden.
- _tick is where the concrete classes perform some action. You don't return a behaviour request; you set the b_request variable on the world instead. By default, it runs the tick() of the current_state. Should be overridden.
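The method contract above can be sketched roughly as follows. This is a simplified illustration, not the actual rUNSWift source: the World stand-in, the StandStill task and the use of a plain dict in place of robot.BehaviourRequest are all assumptions made so the example runs without the robot module, and printStateChange is left out.

```python
class World(object):
    """Hypothetical stand-in for the shared world object."""

    def __init__(self, blackboard=None):
        self.blackboard = blackboard
        self.b_request = None  # read by the caller at the end of the tick


class BehaviourTask(object):
    """Sketch of the base class contract described above."""

    def __init__(self, world):
        # Not meant to be overridden: sets up shared fields, then calls init().
        self.world = world
        self.current_state = None
        self.prev_state = None
        self.init()

    def init(self):
        # Override: set self.current_state to the initial state here.
        pass

    def transition(self):
        # Override: by default, delegate to the current state (if there is one).
        if self.current_state is not None:
            self.current_state.transition()

    def tick(self):
        # Not meant to be overridden: one frame is transition, then _tick.
        self.transition()
        self._tick()

    def _tick(self):
        # Override: by default, run the current state's tick (if there is one).
        if self.current_state is not None:
            self.current_state.tick()


class StandStill(BehaviourTask):
    """Hypothetical leaf task with no TaskStates."""

    def init(self):
        self.ticks = 0

    def transition(self):
        pass  # no states to switch between

    def _tick(self):
        self.ticks += 1
        # A dict stands in for robot.BehaviourRequest in this sketch.
        self.world.b_request = {"action": "stand", "tick": self.ticks}
```

A caller would construct the task once with the shared world, call tick() on every frame, and read world.b_request afterwards.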
Often while writing behaviours you'll need to perform general tasks that are not unique to your behaviour (e.g calculating geometries, checking information about the team, etc). To be more efficient and have fewer places to change when we perform actions, we have utility modules that contain a lot of this information. These are listed here:
- Constants.py: Wrapper for robot module constants and python constants. Please use these over directly accessing the robot module so that it's easier to change in one place if the C++ blackboard changes. This one will also include any other constants that aren't from C++, so it'll all be in one place.
- Global.py: Information about the world and you (e.g. canSeeBall(), myPose(), amILost(), offNetwork()).
- util
- MathUtil.py: Additional maths on top of what's in the math library
- FieldGeometry.py: Geometry functions for common field calculations (angles, etc)
- TeamStatus.py: Team information (things like player numbers, whether an obstacle is a teammate, etc).
- Timer.py: A simple timer utility so you don't have to keep doing the thing where you subtract starttime from blackboard.vision.timestamp. Also provides useful stopwatch like features so you can time the total amount of something in pieces.
- Vector2D.py: Simple class for vectors.
If you use something that's general but not available in one of these (e.g some special vector calculation), please add it to these to keep our code reusable.
These modules have the latest blackboard on each tick, so you don't need to pass the blackboard around to them.
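To illustrate the kind of helper Timer.py provides, here is a minimal stopwatch sketch. It is not the real Timer.py API: the class name SketchTimer, its method names and the injected now callable are assumptions. On the robot, now would presumably read the current blackboard.vision.timestamp; its units are not stated here, so the sketch treats timestamps as opaque numbers.

```python
class SketchTimer(object):
    """Hypothetical stopwatch; not the actual Timer.py interface."""

    def __init__(self, now, target=0):
        self._now = now        # callable returning the current timestamp
        self.target = target   # duration after which finished() is True
        self.restart()

    def restart(self):
        self._start = self._now()
        self._accumulated = 0
        self._running = True
        return self

    def pause(self):
        # Bank the time measured so far and stop counting.
        if self._running:
            self._accumulated += self._now() - self._start
            self._running = False

    def resume(self):
        # Start counting again from the current timestamp.
        if not self._running:
            self._start = self._now()
            self._running = True

    def elapsed(self):
        # Total time counted across all running pieces.
        if self._running:
            return self._accumulated + (self._now() - self._start)
        return self._accumulated

    def finished(self):
        return self.elapsed() >= self.target
```

Pausing and resuming is what lets a behaviour time the total amount of something in pieces, for example the total time the ball has been visible across several sightings.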
The head skills are controlled separately.
Python code is found here:
image/home/nao/data/behaviours/
If you want to see the C++ code that starts the behaviours, it's in the perception folder:
robot/perception/behaviour/
nao_sync robot-name
# e.g: nao_sync luigi
Have a look at a simple one, like GameController. Create a file with the same name as your skill, so MyNewSkill.py would contain a class like this:
from Task import BehaviourTask
import robot


class MyNewSkill(BehaviourTask):
    def init(self):
        # Any special initialisation code. Note that if you don't override __init__, it will set up the world
        # variables, etc. If you decide to override __init__ (there are valid cases to do so), see what
        # BehaviourTask.__init__ does and make sure to also do what it does.
        pass

    def transition(self):
        # Any transition code if you have TaskStates.
        pass

    def _tick(self):
        # What to run in each timestep.
        # Note: use this if you want to let the system manage transitions for you. Essentially it will call
        # transition and then _tick.
        # If you want to manage that yourself, then you override tick()
        req = robot.BehaviourRequest()
        req.actions = robot.All()
        # Some request stuff.
        self.world.b_request = req  # This will automatically get returned to C++.
See GameController for a simple example. See Ready for an example that uses TaskStates and __init__() instead of just init().
The world object your task has access to will have a blackboard. This matches the blackboard in c++.
self.world.blackboard.vision.timestamp
Sync it to the robot. Test it in isolation by running it with runswift:
runswift -s YourSkillName
GameController
In the TeamSkill, which runs when the GameController is in the Playing state.
The highest level of the architecture, behaviour.py, will create a new World() object and pass it to the first skill it creates (the one you passed it with -s, or GameController by default). This world object stays the same throughout; the only thing that changes is the blackboard variable within it, which we set on each tick. This structure allows us to change the blackboard once and have every skill that's looking at the same world see the new blackboard. Otherwise, if each skill had a "self.blackboard", we would need to update that reference in each tick by calling an update function on each item in the tree (which adds more code complexity).
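The shared-world idea can be sketched as below. The class and function names (Task, Parent, behaviour_tick) are hypothetical and the blackboard is a plain dict; the only point is that a single assignment to world.blackboard is visible to every task in the tree.

```python
class World(object):
    """Hypothetical shared world: one instance for the whole task tree."""

    def __init__(self):
        self.blackboard = None
        self.b_request = None


class Task(object):
    def __init__(self, world):
        self.world = world  # every task keeps a reference to the same World

    def tick(self):
        # Leaf task: record the timestamp it sees as its "request".
        self.world.b_request = self.world.blackboard["timestamp"]


class Parent(Task):
    def __init__(self, world):
        Task.__init__(self, world)
        self.child = Task(world)  # subtasks are handed the same World

    def tick(self):
        self.child.tick()


def behaviour_tick(world, root, blackboard):
    # Swap the blackboard once; no per-task update call is needed.
    world.blackboard = blackboard
    world.b_request = None
    root.tick()
    return world.b_request
```

Calling behaviour_tick(world, root, new_blackboard) on each frame updates what every task sees without walking the tree.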
We use boost python to run our python interpreter and pass data back and forth from it. The best place to get some basic knowledge on how this works is the boost documentation: http://www.boost.org/doc/libs/1_57_0/libs/python/doc/tutorial/doc/html/python/exposing.html
In our case, you're going to need to put the vector into a blackboard, add a field to the wrapper class for that blackboard, and then you can access it through the current blackboard instance in python. As an example, look at robotObstacles in the Localisation blackboard:
- In robot/blackboard/Blackboard.hpp (line 185) it gets defined as part of that blackboard.
- In robot/perception/behaviour/python/wrappers/LocalisationBlackboard_wrap.cpp (line 15) it gets included in the wrapper, so that it's available as an attribute of LocalisationBlackboard types in python.
- The blackboard that's sent into python through behaviour.py has a localisation object, so you can now access the robotObstacles in that instance of the blackboard: blackboard.localisation.robotObstacles (e.g. line 129 of Global.py).
This design is evolving but started from these two messages in the robocup email threads:
The decomposition of behaviour has intrigued me for quite some time. I spent most of my PhD days getting a machine to decompose simple games into task-hierarchies. There is also a considerable literature on this issue. I have noticed a common theme emerging from all these approaches, which I will briefly outline below. At this stage, I think the task of a Nao learning a task hierarchy by itself from experience is still beyond AI's current capability. Our task, while challenging, can be more limited, i.e. to hand-code the behaviour task-hierarchy and policy. To my mind there are two aspects to this challenge:
1. Decide on a formal approach to specifying the behaviour (more on this below), and
2. Code up a winning behaviour for 2014 using 1.
The Task-Hierarchy Formalism
The elements of a task-hierarchy include tasks, states, actions, transitions, and a policy. In each task, the robot/world is in a state and an action transitions it to another state. A policy is a function from states to actions. Given a state, the policy tells us which action to execute. For example, at the highest level, game-controller task states include ready, set and playing. The game controller transitions from ready to set to playing based on keyboard actions of the game controller person. You will notice that in the playing state the policy is ambiguous. Transitions from playing can lead to ready, penalised and finished. This is because this broad game controller state does not describe the whole state of the game controller, which includes a timer, referee calls, robot ids, etc. Given this full state the policy is not ambiguous: Playing+GoalScored+<10min transitions to ready, and Playing+>=10min transitions to finished. The policy of the game controller has been described for us by the rules of the game and the game-controller code.
Actions can be complex and temporally extended. For example, in the playing state the game controller issues a “play” action to all the robots. This action will trigger robots to take roles and initiate complex movement to hopefully steal the ball and kick it into the goal. Such temporally extended actions are sometimes referred to as options (but I’ll just refer to them all generically as actions). Actions invoke another (sub)task with its own states, actions, transitions, and policy. For example, the play action issued to each robot by the game controller could invoke a “DetermineRole” task. This task may have the following broad states: FindTeamBall, Goalie, Striker, Defender, Supporter, Forward. We transition from FindTeamBall to Striker if we are closest (time-wise) to the ball (with some hysteresis to avoid thrashing between roles). Similarly for the other roles. Again the true underlying state is quite complex and includes robot id, ball distance for all robots, current role, etc. The goalie's role is allocated based on the robot id, not the relationship to the ball. The FindTeamBall state will invoke other subtasks, which may invoke others still, right down to the primitive actions that control the individual motors that move the head, arms, legs, (and LEDs, audio, and communication).
Each task (node or module) links to others in a hierarchy. The resultant task-graph is a directed acyclic graph (DAG). The root of the DAG is the game-controller and the leaves are primitive actions that are sent to the motor controller, etc. We are responsible for designing the tasks in the middle. Working on the new walk, I see this job as coding the lower level part of the task-hierarchy policy that walks omni-directionally, stands, kicks, etc.
I would be interested to discuss how this lines up with our current approach to programming behaviours, and to plan and specify the whole task-hierarchy by subtask.
I'm not really sure if the discussion is aimed at the module design or what, but here is my 2c. Well, I'm not sure what the current way of doing behaviours is in the rUNSWift codebase, but here is a quick overview of a possible design I just came up with:
- Behaviour modules are run on a tick.
- On each tick there is an associated world state input (the current robot and ball GPS positions, visible objects, current joint angles, etc).
- On each tick we expect the output of the Behaviour module to be some set of non-conflicting commands to the walk module and any other output module (eg: LED controller or whatever).
- A Behaviour is a class, a Behaviour instance is an object of the associated class.
- A Behaviour instance can have state (member variables), the state persists with the object across ticks.
- When a Behaviour is invoked to "perform", it can read the world state, it can write to the output command, and it can invoke another Behaviour.
- A Behaviour invokes another by having a local member variable instance of that target Behaviour.
- The local member variable Behaviour can be destroyed and re-created, this effectively wipes the state of that Behaviour instance.
- A Behaviour can be invoked with parameters specific to that Behaviour.
- If a Behaviour writes to the output command, and then another one writes the same fields later on, the last one in wins.
Here is an example of how the Behaviour hierarchy would work:
- There is one top level behaviour, let's call it RootBehaviour. This will probably do things like see if the game is actually running, and if so delegate to, say, the RoleSwitcherBehaviour.
- The RoleSwitcherBehaviour will have an internal member variable Behaviour, let's say called "currentRole". When a role switch is required, the currentRole is deleted and assigned to a new Behaviour instance of the appropriate type (e.g. StrikerBehaviour).
- If no role switch is required on a particular tick, the currentRole is invoked to perform, which will have access to its previous state in the form of member variables.
Anyway, I think you get the picture. This is fairly basic and simple. But you get Behaviour delegation, you get encapsulated state between ticks, and it's fairly easy to grasp.
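The delegation design from this second message can be sketched as follows. The class names RootBehaviour, RoleSwitcherBehaviour and StrikerBehaviour come from the message; everything else (DefenderBehaviour, the perform signature, the dict-based world state and output command, and the role-selection rule) is an assumption made for illustration.

```python
class Behaviour(object):
    def perform(self, world, command):
        raise NotImplementedError


class StrikerBehaviour(Behaviour):
    def __init__(self):
        self.ticks_as_striker = 0  # state that persists across ticks

    def perform(self, world, command):
        self.ticks_as_striker += 1
        command["walk"] = "towards_ball"


class DefenderBehaviour(Behaviour):
    def perform(self, world, command):
        command["walk"] = "towards_own_goal"


class RoleSwitcherBehaviour(Behaviour):
    def __init__(self):
        self.currentRole = None

    def perform(self, world, command):
        # Hypothetical rule: the robot closest to the ball becomes striker.
        wanted = StrikerBehaviour if world["closest_to_ball"] else DefenderBehaviour
        if type(self.currentRole) is not wanted:
            # Re-creating the member wipes the old role's state.
            self.currentRole = wanted()
        self.currentRole.perform(world, command)


class RootBehaviour(Behaviour):
    def __init__(self):
        self.roleSwitcher = RoleSwitcherBehaviour()

    def perform(self, world, command):
        command["leds"] = "playing" if world["game_running"] else "idle"
        command["walk"] = "stand"
        if world["game_running"]:
            # The delegate writes "walk" later, so the last writer wins.
            self.roleSwitcher.perform(world, command)


def run_tick(root, world):
    command = {}  # output command, filled in by the behaviour tree
    root.perform(world, command)
    return command
```

Keeping the role as a member variable gives encapsulated state between ticks, and replacing that member is all a role switch needs.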