
Assemble tree -- Assemble phylogenetic trees

A topic related to this page is the supertree; see the Wikipedia article at http://en.wikipedia.org/wiki/Supertree for more information. Here, I only introduce a three-stage approach to constructing a phylogeny on the EIAV dataset. The detailed steps are as follows:

  1. apply phyclust to get $K$ clusters,
  2. apply paml.baseml to the central sequences to obtain the stem tree,
  3. apply paml.baseml to the sequences of each cluster to obtain $K$ leaves trees, then
  4. adjoin the $K$ leaves trees to the stem tree.
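
Step 4 can be illustrated with a small, self-contained sketch. This is not the code in ex_assembletree.r and does not use phyclust or PAML; it is written in Python rather than R, and it assumes the stem tree and the leaves trees are available as Newick strings, with each stem-tree tip labeled by a made-up cluster name (c1 to c4). The function name `graft` and all sequence names and branch lengths are hypothetical. Each leaves tree is attached at its Newick root, which is an arbitrary choice for an unrooted tree.

```python
import re

def graft(stem_newick, leaf_trees):
    """Replace each stem-tree tip named in leaf_trees by that cluster's subtree.

    stem_newick: Newick string whose tips are cluster labels (assumed format).
    leaf_trees:  dict mapping a cluster label to the Newick string of its leaves tree.
    """
    labels = "|".join(re.escape(label) for label in leaf_trees)
    # A tip label follows "(" or "," and is followed by ":", ",", or ")".
    pattern = re.compile(r"(?<=[(,])(" + labels + r")(?=[:,)])")
    seen = []

    def attach(match):
        label = match.group(1)
        seen.append(label)
        # Drop the trailing ";" so the subtree can sit inside the stem tree;
        # the stem branch length after the label is kept as the subtree's stem.
        return leaf_trees[label].strip().rstrip(";")

    # A single pass, so names inside grafted subtrees are never re-matched.
    full = pattern.sub(attach, stem_newick)
    missing = set(leaf_trees) - set(seen)
    if missing:
        raise ValueError("cluster labels not found in stem tree: %s" % sorted(missing))
    return full

# Hypothetical stem tree on K = 4 cluster centers and four small leaves trees.
stem = "((c1:0.10,c2:0.20):0.05,c3:0.10,c4:0.30);"
leaves = {
    "c1": "(s1:0.01,s2:0.02,s3:0.01);",
    "c2": "(s4:0.02,s5:0.01);",
    "c3": "(s6:0.03,(s7:0.01,s8:0.01):0.02);",
    "c4": "(s9:0.02,s10:0.02);",
}
full_tree = graft(stem, leaves)
print(full_tree)
```

The result is a single Newick string in which every cluster label has been replaced by its subtree, so it can be read by any tree viewer that accepts Newick input.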

The R script (ex_assembletree.r) uses the above steps to analyze pony625.fas and to generate two figures below. I arbitrarily choose $K = 4$ clusters. The full tree is assembled from the stem tree by attaching the four leaves trees to the nodes of the stem tree, and the colors indicate clusters.

Warning: If a leaves tree contains a large number of sequences, the function paml.baseml may take an extremely long time to find even one tree for a given model, not to mention finding the best tree/model. Also, this function may write large output files even under settings that restrict messages to a minimum.

Open questions:
  1. Is a phylogenetic tree a reasonable model for data from rapidly evolving viruses or closely related subspecies?
  2. How small should the clusters be for constructing leaves trees?
  3. What does this assembled-tree approach mean or tell us? Is it reasonable?
  4. How do we adjoin the unrooted leaves trees to the stem tree?
  5. Do we have to scale the leaves trees before adjoining them to the stem tree?
  6. Are there other ways to construct/search for a tree?
  7. How about using the assembled star trees as the initial trees and searching for an optimized/restricted tree based on them?
  8. How close is this optimized/restricted tree to the best/true tree or the neighbor-joining tree?
  9. Would PHYLIP provide better tree results?
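
For question 5, one simple notion of scaling is to multiply every branch length of a leaves tree by a common factor. The sketch below does this on a Newick string. It is only an illustration in Python, not the procedure used by ex_assembletree.r; the function name `scale_branch_lengths`, the example tree, and the factor 0.5 are all hypothetical, and how to choose the factor remains part of the open question.

```python
import re

def scale_branch_lengths(newick, factor):
    """Multiply every branch length (the number after ':') by factor."""
    def rescale(match):
        scaled = float(match.group(1)) * factor
        # Format compactly; six significant digits is an arbitrary choice.
        return ":" + format(scaled, ".6g")

    # Branch lengths are assumed to be plain or scientific-notation numbers.
    return re.sub(r":(\d+(?:\.\d*)?(?:[eE][+-]?\d+)?)", rescale, newick)

# Hypothetical leaves tree; halve all of its branch lengths.
leaves_tree = "(a:0.02,(b:0.01,c:0.03):0.04);"
scaled_tree = scale_branch_lengths(leaves_tree, 0.5)
print(scaled_tree)  # (a:0.01,(b:0.005,c:0.015):0.02);
```

Scaling by a common factor changes only the branch lengths; the topology of the leaves tree is untouched.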

The results of the three-stage approach are displayed in the following figure. The top is the stem tree, the middle is the scaled leaves trees, and the bottom is the assembled tree.