-
Notifications
You must be signed in to change notification settings - Fork 6
maxweightofinputtreeedgesdisplayed
This iteb_support_theorem describes an ITEB score that is calculated for any edge in an output supertree given a set of input trees.
Here we describe a score for the entire supertree given a set of inputs which could be optimized by strictly ranking edge inclusion decisions by satisfying the grouping provided by the input trees given in order.
This is not meant to imply that this is the scoring system used in tree-ranked based SBH-TAG synthesis. This is merely one possible scoring system that would justify the procedure.
(same as that used on the iteb_support_theorem page. Although here we'll try to use n
and N
for nodes in input and the supertree respectively. v
and V
look too similar).
-
t
will be a input tree. -
L(t)
is the label set of the treet
-
L(t, n)
for a noden
will be the label set of the subtree oft
that descends fromn
-
L(N)
for a nodeN
in the TAG will be the labels under this node in the TAG - the union of labels sets for all of the child nodes ofn
(each leaf node of the TAG is assigned single label) -
out(t, v)
is defined to beL(t) - L(t, n)
. This the outgroup of noden
with respect to the label set oft
-
E(t)
is the set of edges in treet
-
I
is the set of input trees
We say a rooted supertree T
, to displays and n → n'
from an input tree t
if and only if there exists an edge N → N'
in the supertree for which:
- iteb1:
L(N) ∩ L(t) = L(t, n)
Note this is only the first condition required for the statement that tree t
supports edge N → N'
sensu ITEB support.
It is possible for a supertree to display an edge n → n'
, while that edge does not support any edge in the supertree.
This occurs there is a case of a node, N
, and its parent, N'
in the supertree both having the property that L(N) ∩ L(t) = L(N') ∩ L(t) = L(t, n)
.
From the perspective of the source tree the edge N→ N'
is redundant with the edge one step closer to the root, so t
does not support the existence of this edge.
The supertree would still display n → n'
even if N → N'
were collapsed.
The set of input tree edges displayed by supretree T
is SITED(T)
The cardinality of this set is the cardinality of set of input tree edges displayed:
CSITED(T) = | SITED(T) |
If each edge, e
, in the input trees is assigned a weight w(e)
then a supertree's weight of the set of input tree edges displayed is simply the sum of the weights for every edge in SITED(T)
:
WSITED(T) = \sum_{e \in SITED(t)} w(e)
Consider the case in which input trees are ranked from the least important, imp=1
, to most important, imp=|I|
.
Let e_i
refer to an edge with the highest weigh from a tree with imp
score of i
, and the tree be t_i
Let the weights for every edge be positive numbers and the weights satisfying for every e
in t_i
:
w(e) > |E(t_[i-1])| w(e_[i-1]) + Z[i - 1]
where:
Z[1] = 0
Z[i] = \sum_{j=1}^{i-1}|E(t_[j])| w(e_[j])
equivalently, we could define Z[i]
recursively:
Z[i] = Z[i-1] + |E(t_[i-1])| w(e_[i-1)
A weighting scheme that fits this set of constraints wiil be referred to as using "tree ranking edge weights".
This name comes from the fact that the weights for edges from trees with high importance scores are larger than the sum of the edge weights for all lower ranked trees.
Under such a scheme, a tree has a higher WSITED(T)
score if it displays one edge from a high ranked tree than it would if it displayed all of the edges from tree of lower importance.
The TAG-based synthetic tree operation may not be guaranteed to find the optimal tree under system of maximizing the WSITED(T)
score for a system of tree ranking edge weights.
However, each synthesis decision at a node can be justified as optimizing the score under such a system.
To prove that the synthesis maximizes the score, one would have to show that the construction of the TAG means that all of the relevant edges are incident on a node when a synthesis decision is being made.