Results obtained by tuning precision and recall in TPR-w ensembles

We can tune the precision and recall of TPR-w ensembles through the global parent-weight parameter $w$:

$\displaystyle p_i(x) = w \cdot \hat{p}_i(x) + \frac{1 - w}{\vert\phi(x)\vert} \sum_{j \in \phi(x)} p_j(x)$
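The combination rule above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used for the experiments: the names `tpr_w`, `local`, `children` and `threshold` are hypothetical, $\phi(x)$ is assumed to be the set of children whose combined score exceeds a decision threshold (0.5 here), and a node with no positive children is assumed to keep its local prediction $\hat{p}_i(x)$ unchanged.

```python
from typing import Dict, List


def tpr_w(local: Dict[str, float],
          children: Dict[str, List[str]],
          w: float,
          threshold: float = 0.5) -> Dict[str, float]:
    """Bottom-up TPR-w combination of local scores for a single example x.

    local     -- local predictor outputs, one score per node (hat{p}_i(x))
    children  -- maps a node to the list of its child nodes
    w         -- parent weight in [0, 1]
    threshold -- ASSUMPTION: a child belongs to phi(x) if its score > threshold
    """
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")

    combined: Dict[str, float] = {}

    def visit(node: str) -> float:
        # Memoize so that nodes shared by several parents are computed once.
        if node in combined:
            return combined[node]
        child_scores = [visit(c) for c in children.get(node, [])]
        positive = [s for s in child_scores if s > threshold]
        if positive:
            # Weighted sum of the local score and the mean positive child score.
            score = w * local[node] + (1.0 - w) * sum(positive) / len(positive)
        else:
            # ASSUMPTION: with an empty phi(x) the local score is kept as is.
            score = local[node]
        combined[node] = score
        return score

    for node in local:
        visit(node)
    return combined
```

For instance, with `local = {"A": 0.4, "B": 0.9, "C": 0.2}` and `children = {"A": ["B", "C"]}`, only `B` is a positive child of `A`, so `w = 0.5` yields a combined score of 0.65 for `A`, while `w = 1` leaves it at the local value 0.4.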

Figs. 7 and 8 show the hierarchical precision, recall and F-measure as functions of the parameter $w$, which ranges from 0 to 1. For small values of $w$ the decision of the parent local predictor carries little weight, and the ensemble decision depends mainly on the positive predictions of the offspring nodes (classifiers): in this case the TPR-w ensemble achieves a higher hierarchical recall. Conversely, higher values of $w$ give more weight to the ``parent'' local predictor, resulting in higher precision. The opposite trends of precision and recall are quite clear in all graphs of Fig. 7. The best F-measure is obtained for ``middle'' values of the parent weight: in most of the analyzed data sets it is achieved for $w$ between $0.5$ and $0.8$. If higher recall is needed (at the expense of precision), lower values of $w$ can be chosen, whereas higher values of $w$ are preferable when precision is the primary goal.
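
The two regimes can be read directly from the limiting cases of the combination rule (assuming, as the formula requires, that $\phi(x)$ is non-empty):

$\displaystyle w = 1: \quad p_i(x) = \hat{p}_i(x), \qquad\qquad w = 0: \quad p_i(x) = \frac{1}{\vert\phi(x)\vert} \sum_{j \in \phi(x)} p_j(x)$

As a purely illustrative example with hypothetical values, consider a node with local prediction $\hat{p}_i(x) = 0.3$ and a single positive child with $p_j(x) = 0.9$:

$\displaystyle p_i(x) = w \cdot 0.3 + (1 - w) \cdot 0.9 = 0.9 - 0.6\, w$

With an assumed decision threshold of $0.5$, the node is predicted positive only when $0.9 - 0.6\, w > 0.5$, that is for $w < 2/3$. Small values of $w$ thus let the positive child pull the parent above the threshold (more positive predictions, favoring recall), while large values leave the negative local decision in place (fewer positive predictions, favoring precision).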

Figure 7: Precision, recall and F-measure as functions of the parent weight in TPR-w ensembles. Horizontal lines refer to top-down ensembles. Protein domain binary data: (a) linear kernel (b) Gaussian kernel; PPI BioGRID data: (c) linear (d) Gaussian kernel; PPI Von Mering data: (e) linear (f) Gaussian kernel; Pairwise sequence similarity data: (g) linear (h) polynomial kernel.


\includegraphics[width = 7.2cm]{eps/YeastDomainNobleBinary.c0.5.hier.2way.F.w.eps}
\includegraphics[width = 7.2cm]{eps/YeastDomainNobleBinary.g0.001c100.hier.2way.F.w.eps}

(a)
(b)

\includegraphics[width = 7.2cm]{eps/YeastBiogrid.c100.hier.2way.F.w.b0.eps}
\includegraphics[width = 7.2cm]{eps/YeastBiogrid.g0.1c100.hier.2way.F.w.b0.eps}

(c)
(d)

\includegraphics[width = 7.2cm]{eps/YeastVM.c1000.hier.2way.F.w.b0.eps}
\includegraphics[width = 7.2cm]{eps/YeastVM.g0.1c100.hier.2way.F.w.b0.eps}

(e)
(f)

\includegraphics[width = 7.2cm]{eps/YeastSW.c0.001.hier.2way.F.w.b0.eps}
\includegraphics[width = 7.2cm]{eps/YeastSW.d3c1.hier.2way.F.w.eps}

(g)
(h)

Figure 8: Precision, recall and F-measure as functions of the parent weight in TPR-w ensembles. Horizontal lines refer to top-down ensembles. Protein domain logE data: (a) linear kernel (b) Gaussian kernel; Phylogenetic data: (c) linear (d) Gaussian kernel; Gene expression data: (e) linear (f) Gaussian kernel.


\includegraphics[width = 7.2cm]{eps/YeastDomainLog.c1000.hier.2way.F.w.b0.eps}
\includegraphics[width = 7.2cm]{eps/YeastDomainLog.g1c100.hier.2way.F.w.b0.eps}

(a)
(b)

\includegraphics[width = 7.2cm]{eps/Yeast.phylo.linear.F.w.b0.eps}
\includegraphics[width = 7.2cm]{eps/Yeast.phylo.gauss.F.w.b0.eps}

(c)
(d)

\includegraphics[width = 7.2cm]{eps/YeastExpr.c0.1.hier.2way.F.w.b0.eps}
\includegraphics[width = 7.2cm]{eps/YeastExpr.g0.001c100.hier.2way.F.w.b0.eps}

(e)
(f)