ete3 has support for phyloxml, a format I use with the Archaeopteryx tree viewer for a lot of my day-to-day phylogenetics visualisation. My main reason for using phyloxml is convenience: I have a script that adds the proper organism name to the tree, and I think Archaeopteryx is a really good basic tree viewer. I wanted to draw a tree from phyloxml in ete using my own style, with the proper organism name rendered on each leaf. In my phyloxml file this is encoded as the scientific name of each leaf (see the phyloxml snippet below), so all I needed to do was make it the node name when rendering the tree.

    <clade>
      <name>IMG_2526164742</name>
      <branch_length>0.19955</branch_length>
      <taxonomy>
        <scientific_name>Desulfobacterium anilini DSM 4660</scientific_name>
      </taxonomy>
    </clade>

I found that the interface for phyloxml was not the same as for newick-formatted trees, and unfortunately the documentation for phyloxml in ete3 is a bit lacking: there wasn't a complete listing of the methods for each class. After much messing around, reading the ete3 source code and examining Python objects with the builtin dir function, I was able to get what I wanted.
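
That kind of poking around with dir can be sketched roughly as follows. The helper name `public_names` and the `FakeClade` class are made up for illustration and are not part of ete3; on a real tree you would pass in something like `node.phyloxml_clade` instead.

```python
def public_names(obj):
    """Return the attribute and method names of obj that do not start with an underscore."""
    return [name for name in dir(obj) if not name.startswith("_")]


class FakeClade:
    """Hypothetical stand-in for a phyloxml clade object, used only for this demo."""

    def __init__(self):
        self.taxonomy = []
        self.branch_length = 0.19955

    def get_taxonomy(self):
        return self.taxonomy


# List what the object exposes, skipping private and dunder names.
print(public_names(FakeClade()))
```

Filtering out the underscore-prefixed names keeps the listing short enough to scan for promising getters.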
It turns out that for each node/leaf I needed to access the phyloxml_clade attribute, which has a taxonomy attribute. That attribute is iterable (I think it's probably a list), so I could pull the scientific name out of an entry and use it as the leaf name for printing. It's a little convoluted but easy when you know how.
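
    from ete3 import Phyloxml

    project = Phyloxml()
    # load the trees; "my_tree.xml" is a placeholder, use the path to your own phyloxml file
    project.build_from_file("my_tree.xml")

    # iterate through the trees in the phyloxml file
    for tree in project.get_phylogeny():
        # go through the leaves in the tree
        for node in tree:
            # taxonomy is a list, so take its first entry and
            # assign the node name from the data in the phyloxml file
            node.name = node.phyloxml_clade.taxonomy[0].get_scientific_name()
        tree.show()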
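
If some leaves have no taxonomy entry, reading the scientific name from them will fail. Here is a minimal sketch of a more defensive lookup, assuming (as described above) that taxonomy behaves like a list whose entries have a `get_scientific_name()` method. `scientific_name_or` is a hypothetical helper of my own naming, and the `SimpleNamespace` objects only imitate the shape of ete3 nodes so that the snippet runs without a tree file.

```python
from types import SimpleNamespace


def scientific_name_or(node, fallback=None):
    """Return the first non-empty scientific name attached to node, or a fallback.

    Assumes node.phyloxml_clade.taxonomy is list-like and that each entry has
    a get_scientific_name() method. The fallback defaults to node.name.
    """
    if fallback is None:
        fallback = node.name
    clade = getattr(node, "phyloxml_clade", None)
    for taxon in getattr(clade, "taxonomy", None) or []:
        name = taxon.get_scientific_name()
        if name:
            return name
    return fallback


# Stand-in objects that imitate ete3 nodes (not real ete3 classes).
taxon = SimpleNamespace(get_scientific_name=lambda: "Desulfobacterium anilini DSM 4660")
named_leaf = SimpleNamespace(
    name="IMG_2526164742",
    phyloxml_clade=SimpleNamespace(taxonomy=[taxon]),
)
bare_leaf = SimpleNamespace(
    name="leaf_without_taxonomy",  # made-up name for the demo
    phyloxml_clade=SimpleNamespace(taxonomy=[]),
)

print(scientific_name_or(named_leaf))
print(scientific_name_or(bare_leaf))
```

With a helper like this, the assignment inside the loop would become `node.name = scientific_name_or(node)`, and leaves without a scientific name simply keep their original identifier.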