Skip to content
This repository has been archived by the owner on Sep 22, 2019. It is now read-only.

maxweightofinputtreeedgesdisplayed

Mark T. Holder edited this page Feb 4, 2015 · 7 revisions

Maximum Weight of Input Tree Edges Displayed

This iteb_support_theorem describes an ITEB score that is calculated for any edge in an output supertree given a set of input trees.

Here we describe a score for the entire supertree given a set of inputs which could be optimized by strictly ranking edge inclusion decisions by satisfying the grouping provided by the input trees given in order.

This is not meant to imply that this is the scoring system used in tree-ranked based SBH-TAG synthesis. This is merely one possible scoring system that would justify the procedure.

Notation

(same as that used on the iteb_support_theorem page. Although here we'll try to use n and N for nodes in input and the supertree respectively. v and V look too similar).

  • t will be a input tree.
  • L(t) is the label set of the tree t
  • L(t, n) for a node n will be the label set of the subtree of t that descends from n
  • L(N) for a node N in the TAG will be the labels under this node in the TAG - the union of labels sets for all of the child nodes of n (each leaf node of the TAG is assigned single label)
  • out(t, v) is defined to be L(t) - L(t, n) . This the outgroup of node n with respect to the label set of t
  • E(t) is the set of edges in tree t
  • I is the set of input trees

CSITED

definition of "display"

We say a rooted supertree T, to displays and n → n' from an input tree t if and only if there exists an edge N → N' in the supertree for which:

  • iteb1: L(N) ∩ L(t) = L(t, n)

Note this is only the first condition required for the statement that tree t supports edge N → N' sensu ITEB support. It is possible for a supertree to display an edge n → n', while that edge does not support any edge in the supertree. This occurs there is a case of a node, N, and its parent, N' in the supertree both having the property that L(N) ∩ L(t) = L(N') ∩ L(t) = L(t, n). From the perspective of the source tree the edge N→ N' is redundant with the edge one step closer to the root, so t does not support the existence of this edge. The supertree would still display n → n' even if N → N' were collapsed.

SITED

The set of input tree edges displayed by supretree T is SITED(T)

The cardinality of this set is the cardinality of set of input tree edges displayed:

CSITED(T) = | SITED(T) |

WSITED

If each edge, e, in the input trees is assigned a weight w(e) then a supertree's weight of the set of input tree edges displayed is simply the sum of the weights for every edge in SITED(T):

WSITED(T) = \sum_{e \in SITED(t)} w(e)

TAG resolution in terms of strict preference for tree ranking

tree ranking edge weights

Consider the case in which input trees are ranked from the least important, imp=1, to most important, imp=|I|. Let e_i refer to an edge with the highest weigh from a tree with imp score of i, and the tree be t_i Let the weights for every edge be positive numbers and the weights satisfying for every e in t_i:

w(e) > |E(t_[i-1])| w(e_[i-1]) + Z[i - 1]

where:

Z[1] = 0
Z[i] = \sum_{j=1}^{i-1}|E(t_[j])| w(e_[j])

equivalently, we could define Z[i] recursively:

Z[i] = Z[i-1] + |E(t_[i-1])| w(e_[i-1)

A weighting scheme that fits this set of constraints wiil be referred to as using "tree ranking edge weights".

This name comes from the fact that the weights for edges from trees with high importance scores are larger than the sum of the edge weights for all lower ranked trees. Under such a scheme, a tree has a higher WSITED(T) score if it displays one edge from a high ranked tree than it would if it displayed all of the edges from tree of lower importance.

The TAG-based synthetic tree operation may not be guaranteed to find the optimal tree under system of maximizing the WSITED(T) score for a system of tree ranking edge weights. However, each synthesis decision at a node can be justified as optimizing the score under such a system.

To prove that the synthesis maximizes the score, one would have to show that the construction of the TAG means that all of the relevant edges are incident on a node when a synthesis decision is being made.

Clone this wiki locally