Agents & State Spaces
- formally, state is a function of history: S_t = f(H_t)
-
- Environment State S^e_t is the environment's private state representation
- i.e. whatever data the environment uses to pick the next observation/reward to emit
- The environment state is not usually visible to the agent
- Even if S^e_t is visible, it may contain irrelevant information
-
- Agent State S^a_t is the agent's internal state representation
- i.e. whatever data the agent uses to pick the next action
- i.e. it is the information used by RL algorithms
- It can be any function of history:
- S^a_t = f(H_t)
-
- the history can be thrown away; the current state contains a compact representation of it (a small sketch follows)
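A minimal sketch of that idea, assuming a toy agent whose compact state is just the last observation plus a running reward total (module, field, and function names are illustrative, not from the notes):

```elixir
defmodule AgentState do
  # f(H_t): compress the whole history (a list of {observation, action, reward})
  # into whatever the agent actually needs.
  def from_history(history) do
    rewards = for {_obs, _action, reward} <- history, do: reward
    {last_obs, _action, _reward} = List.last(history)
    %{last_obs: last_obs, total_reward: Enum.sum(rewards)}
  end

  # Equivalent incremental update: the compact state already summarises H_t,
  # so the history itself can be discarded.
  def update(state, {obs, _action, reward}) do
    %{state | last_obs: obs, total_reward: state.total_reward + reward}
  end
end
```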
Needed to pass Turing test:
- NLP
- Knowledge Representation
- state space
- blackboard (ETS)
- working memory (short-term, ETS)
- long-term memory (mnesia, db) (a memory sketch follows this list)
- Automated Reasoning
- Machine Learning
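As a rough illustration of the blackboard / working-memory idea above, a hedged sketch of a short-term memory backed by ETS (table name and API are my own; long-term memory would live in mnesia or another db instead, so facts survive the owning process):

```elixir
defmodule WorkingMemory do
  # Short-term blackboard: a named, public ETS table any process can post to.
  def start do
    :ets.new(:blackboard, [:named_table, :set, :public])
  end

  # Post a fact under a key (overwrites any previous fact for that key).
  def post(key, fact), do: :ets.insert(:blackboard, {key, fact})

  # Recall a fact by key.
  def recall(key) do
    case :ets.lookup(:blackboard, key) do
      [{^key, fact}] -> {:ok, fact}
      [] -> :not_found
    end
  end
end
```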
- What is an Agent? It perceives the env through sensors and executes actions using actuators.
- What is a Rational Agent? It always selects an action, based on the percept sequence it has received, so as to maximize its (expected) performance measure given those percepts and the knowledge it possesses.
-
An ideal agent always chooses the action which maximizes its expected performance, given its percept sequence so far.
-
An autonomous agent uses its own experience rather than built-in knowledge of environment by designer.
-
The most challenging environments are partially observable, stochastic, sequential, dynamic, continuous, and multi-agent.
- Efficient
- No internal representation for reasoning, inference
- No strategic planning, learning
- percept-based agents are not good for multiple, opposing goals
- information comes from sensors: percepts
- a percept changes the agent's current state of the world
- based on the state of the world and its knowledge (memory), it triggers actions through effectors
- information comes from sensors: percepts
- a percept changes the agent's current state of the world
- based on the state of the world and its knowledge (memory), it chooses actions and carries them out through effectors (a rough loop is sketched below)
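A rough sketch of that sense / update-state / act cycle, with assumed function names and placeholder domain logic:

```elixir
defmodule StateBasedAgent do
  # One step of the cycle: the percept updates the agent's model of the world,
  # then an action is chosen from the new state plus stored knowledge.
  def step(world_state, percept, knowledge) do
    new_state = update_state(world_state, percept)
    action = choose_action(new_state, knowledge)
    {new_state, action}   # the action would then be carried out by effectors
  end

  # Placeholder implementations; the real rules depend on the domain.
  defp update_state(state, percept), do: Map.merge(state, percept)
  defp choose_action(_state, _knowledge), do: :noop
end
```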
- Goal Formulation
- The agent's action will depend on its goal
- Goal formulation based on the current situation is a way of solving many problems, and search is a universal problem-solving mechanism in AI
- The sequence of steps required to solve a problem is not known a priori and must be determined by a systematic exploration of the alternatives.
Utility Based (several goals, some preferred over others; determine the utility of each and maximize it)
- A more general framework
- Different preferences for different goals
- A utility function maps a state or sequence of states to a real-valued utility
- The agent acts so as to maximize expected utility (sketched below)
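A minimal sketch of expected-utility maximization, assuming the caller supplies an outcome model (state, action -> list of {probability, next_state}) and a utility function; all names are illustrative:

```elixir
defmodule UtilityAgent do
  # outcomes.(state, action) -> [{probability, next_state}]
  # utility.(state)          -> real-valued utility
  def best_action(state, actions, outcomes, utility) do
    Enum.max_by(actions, fn action ->
      outcomes.(state, action)
      |> Enum.map(fn {p, next_state} -> p * utility.(next_state) end)
      |> Enum.sum()
    end)
  end
end
```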
- learning allows an agent to operate in an unknown environment
- The learning element modifies the performance element
- Learning is required for true autonomy
- think of the state space as containing only the things that vary; things that don't change don't belong in the state space.
- the state space should contain all requisite info for the goal test and for defining the successor function (nothing outside the state space, e.g. the world state, should be needed)
- the state space may be explicitly represented, where all known nodes in system are represented
- typically it is implicitly represented and generated when required
- the agent knows
- the initial state
- the operators (actions), which compute the successors of a node
- an operator is a function which expands a node, computing its successor node(s) (see the sketch below)
At the beginning of the search, the frontier (fringe) contains only the initial state of the state space.
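A tiny sketch of an implicitly represented state space: nothing is stored up front, and the operators generate successors on demand. The counter domain is made up purely for illustration:

```elixir
defmodule CounterSpace do
  # Initial state: start at 0.
  def initial_state, do: 0

  # Operators: each computes a successor state from the current one.
  def successors(n), do: [{:inc, n + 1}, {:double, n * 2}]

  # Goal test.
  def goal?(n), do: n == 10
end
```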
-
A state space is a graph (V, E)
- V is a set of nodes
- E is a set of Edges
-
Each Edge has a fixed, positive cost
-
Each Node is a data structure (sketched below):
- A state description
- Parent of the node
- Depth of node
- the operator that generated this node
- cost of this path (sum of operator costs)
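A direct translation of those node fields into a struct (field and function names are my own):

```elixir
defmodule SearchNode do
  defstruct state: nil,      # a state description
            parent: nil,     # parent of the node
            depth: 0,        # depth of the node
            operator: nil,   # the operator that generated this node
            path_cost: 0     # cost of this path (sum of operator costs)

  # Build the child reached by applying `operator` with the given step cost.
  def child(parent, operator, state, step_cost) do
    %SearchNode{
      state: state,
      parent: parent,
      depth: parent.depth + 1,
      operator: operator,
      path_cost: parent.path_cost + step_cost
    }
  end
end
```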
-
Implicit state space
Basic Search Key Issues:
- search tree may be unbounded
- return path, node, or ?
- How are merge (combining new nodes into the frontier) and select (frontier prioritization) done? (see the sketch after the strategy list below)
- Is graph weighted or unweighted?
- How much is known about the quality of intermediate states?
- Is the aim to find a minimal cost path or any path as soon as possible?
- Blind Search (without Heuristics)
- Depth First Search
- Breadth First Search
- Iterative Deepening Search
- Iterative Broadening Search
- Informed Search (Heuristic Information)
- Constraint Satisfaction
- Adversary Search (X Person Games)
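A hedged sketch of the generic loop behind these strategies, to make the merge/select question concrete: the frontier is just a list of nodes, and where merge puts the children determines the strategy (append = FIFO = breadth-first, prepend = LIFO = depth-first; informed search would instead select by heuristic value). Helper names are assumptions:

```elixir
defmodule GenericSearch do
  # frontier: list of nodes; successors.(node) -> child nodes; goal?.(node) -> boolean
  def search([], _successors, _goal?), do: :failure

  def search([node | rest], successors, goal?) do
    if goal?.(node) do
      {:ok, node}
    else
      # merge: appending children gives FIFO selection (breadth-first);
      # `successors.(node) ++ rest` instead would give depth-first.
      search(rest ++ successors.(node), successors, goal?)
    end
  end
end
```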
- set of states
- operators, [and costs]
- start state
- goal state [test]
- Path: start -> a state satisfying goal test [May require shortest path]
- Reduction
Many problems can be represented as a set of states and a set of rules for how one state is transformed into another; the agent must choose a sequence of actions to reach the goal
- Each state is an abstract representation of the agent's env. It is an abstraction that denotes a configuration of the agent.
- INITIAL STATE: The description of the starting config of agent
- An ACTION/OPERATOR takes the agent from one state to another State. A state can have a number of successor states.
- PLAN: a plan is a sequence of actions
- GOAL: a description of a set of desirable states of the world. Goal states are often specified by a goal test which any goal state must satisfy.
- PATH COST: path -> positive number, usually path cost = sum of step costs
- PROBLEM FORMULATION: choosing a relevant set of states to consider, and a feasible set of operators for moving from one state to another (a concrete formulation is sketched after this list).
- SEARCH: is the process of imagining sequences of operators applied to the initial state, and checking which sequence reaches a goal state.
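A concrete (and entirely illustrative) problem formulation: the classic two-jug puzzle, written as a module exposing exactly the ingredients above: an initial state, operators, and a goal test, with unit step costs:

```elixir
defmodule JugProblem do
  @cap_a 4
  @cap_b 3

  # INITIAL STATE: both jugs empty.
  def initial_state, do: {0, 0}

  # GOAL test: exactly 2 units in jug A.
  def goal?({a, _b}), do: a == 2

  # ACTIONS/OPERATORS: each maps a state to a successor; every step cost is 1.
  def operators do
    [
      fill_a:   fn {_a, b} -> {@cap_a, b} end,
      fill_b:   fn {a, _b} -> {a, @cap_b} end,
      empty_a:  fn {_a, b} -> {0, b} end,
      empty_b:  fn {a, _b} -> {a, 0} end,
      pour_a_b: fn {a, b} -> t = min(a, @cap_b - b); {a - t, b + t} end,
      pour_b_a: fn {a, b} -> t = min(b, @cap_a - a); {a + t, b - t} end
    ]
  end
end
```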
-
S: The full set of states
-
S_0: the initial state, a member of S
-
A: S -> S: the set of operators
-
G: the set of final states
-
SEARCH PROBLEM: Find a sequence of actions which transforms the agent from the initial state to a goal state.
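And a sketch of the search problem itself: breadth-first search over any problem module shaped like the JugProblem example above, returning the sequence of operator names (the PLAN) that transforms the initial state into a goal state:

```elixir
defmodule Solver do
  # Breadth-first search; returns {:ok, plan} or :failure.
  def solve(problem) do
    bfs(:queue.from_list([{problem.initial_state(), []}]), MapSet.new(), problem)
  end

  defp bfs(frontier, seen, problem) do
    case :queue.out(frontier) do
      {:empty, _} ->
        :failure

      {{:value, {state, plan}}, rest} ->
        cond do
          problem.goal?(state) -> {:ok, Enum.reverse(plan)}
          MapSet.member?(seen, state) -> bfs(rest, seen, problem)
          true ->
            children =
              for {name, op} <- problem.operators(), do: {op.(state), [name | plan]}

            bfs(Enum.reduce(children, rest, &:queue.in(&1, &2)),
                MapSet.put(seen, state), problem)
        end
    end
  end
end

# Solver.solve(JugProblem) returns {:ok, plan} with a shortest plan of operator names.
```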
- formulate a problem as a state space search by showing the legal problem states, the legal operators, and the initial and goal states
- a state is defined by the specification of the values of all attributes of interest in the world
- An operator changes one state into another; it has a precondition, which is the value of certain attributes prior to the application of the operator, and a set of effects, which are the attributes altered by the operator (sketched below)
- The initial state is where you start
- the goal state is the partial description of the solution
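A sketch of an operator as precondition plus effects over attribute/value states (a STRIPS-like encoding; the struct, function, and example are my own illustration):

```elixir
defmodule Operator do
  defstruct [:name, :precondition, :effects]

  # The operator applies only when every precondition attribute has the required
  # value; its effects overwrite the attributes it alters.
  def apply_op(%Operator{precondition: pre, effects: eff}, state) do
    if Enum.all?(pre, fn {attr, value} -> Map.get(state, attr) == value end) do
      {:ok, Map.merge(state, eff)}
    else
      :precondition_failed
    end
  end
end

# Illustrative usage:
#   op = %Operator{name: :open_door, precondition: %{door: :closed}, effects: %{door: :open}}
#   Operator.apply_op(op, %{door: :closed, light: :off}) #=> {:ok, %{door: :open, light: :off}}
```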
- Percept Sequence, Background Knowledge, Feasible Actions
- Deal with expected outcome of actions beforehand
- Fully Observable: everything in the env relevant to the action being considered is observable
- Partially Observable: the relevant features of the environment are only partially observable; the agent must track & reason about the env. Ex. fully observable: Chess; partially observable: Poker
- Deterministic: the next state of the env is completely determined by the current state and the agent's action. Eg. Image Analysis
- Stochastic: if an element of interference or uncertainty occurs, the env is stochastic. Note that a deterministic yet partially observable env will appear stochastic to the agent. Eg. Ludo
- Strategic: environment state wholly determined by the preceding state and the actions of multiple agents. Eg. Chess
- Episodic Environment: subsequent episodes (phases) do not depend on what actions occurred in previous episodes; the current episode does not affect subsequent episodes
- Sequential Environment: the agent engages in a series of connected episodes (phases), implies may need to plan ahead based on previous episodes
- Static Env: does not change from one state to the next while the agent is considering its course of action. The only changes to the environment are those caused by the agent itself. The passage of time is irrelevant to deliberation, and the agent does not need to observe the world during deliberation.
- Dynamic Env: changes over time independent of the actions of the agent; thus, if an agent does not respond in a timely manner, this counts as a choice to do nothing. The agent needs to observe the world while deliberating.
- Discrete: if the # of distinct percepts and/or actions is limited, the env is discrete; otherwise it is continuous.
- If the environment contains other intelligent agents, the agent needs to be concerned about strategic, game-theoretic aspects of the environment (for either cooperative or competitive agents).
- Most engineering environments don’t have multi-agent properties, whereas most social and economic systems get their complexity from the interactions of (more or less) rational agents.
- Knowledge rich: enormous amount of information that the environment contains
- Input rich: enormous amount of input the environment can send to an agent.
- Sensing Strategies
- Attentional Mechanisms
- One or both of the above, so the agent may more readily focus its efforts in such rich environments