While looking up something else, I found a paper by Ling-li Sun (2007): “Spanning trees with leaves bounded by independence number”, Applied Mathematics E-Notes 7: 50–52.

Sun proves that any undirected graph \( G \) has a spanning tree with at most \( \alpha(G) \) leaves (where \( \alpha(G) \) is the size of the largest independent set of \( G \)) and that such a tree can be found in \( O(n^2) \) time. Sun's proof is very short, and improves on an earlier paper by Rahman and Kaykobad with a weaker result (a bound on the degree of the tree rather than its number of leaves). Here's an even shorter proof (for a somewhat weaker version of the problem; see comments) and a faster algorithm: use a depth-first search tree. Its leaves form an independent set, therefore, its number of leaves is at most the size of the maximum independent set.

One could also phrase Sun's result as a paired approximation in the style of my upcoming SODA paper. In any graph one can find either an approximate independent set or an approximate minimum-leaf spanning tree with an approximation ratio \( O(\sqrt{n}) \): if the DFS tree has fewer than \( \sqrt{n} \) leaves, it's a good approximation to the minimum leaf spanning tree, and if it has more leaves then its leaves form a good approximation to the maximum independent set. However, the known inapproximability results for the minimum leaf spanning tree are relatively weak (see Salamon and Wiener, IPL 2007) so it's possible that there exists a better direct approximation for that problem by itself.


ext_220158: Off by one?

Isn't your DFS argument off by one if the root of the DFS tree ends up as a leaf? So if there is a graph on a,b,c,d,e,f with edges a-b and a-[c,d,e,f] and b-[c,d,e,f] (only) then alpha is 4 (c,d,e,f) but the dfs tree at a might go a-b then b-c, b-d, b-e, b-f giving 5 leaves.

11011110: Re: Off by one?

I was thinking of rooted trees. But the Sun paper is looking at unrooted trees. So I guess the DFS argument is weaker in that respect — thanks for the correction. I should also have noted that Sun's claim is only for \( \alpha\gt 1 \), since a complete graph still requires two leaves in its (unrooted) spanning trees, although the (rooted) DFS tree has only one leaf in this case.

Using a DFS tree as the initial tree for Sun's algorithm does at least improve its running time to linear, since it only has to go through one iteration to get to a tree with a number of leaves at least as good as the independent set given by the leaves of the DFS tree. Sun writes the algorithm as if α is known, but in the original version of the algorithm one can just keep going until the leaves are all independent, and in the modified version one starts with a DFS tree and then (if the root is a leaf and is adjacent to some other leaves) performs a single improvement step.

None: a fix?

Very nice. If your solution fails, your DFS tree \( T \) must not be a path and one of its leaves must have an edge \( e \) to the root (otherwise it would be ok). Form \( T' \) by adding \( e \) to \( T \) and removing the first edge on the path from the root which goes to a vertex of degree \( \gt 2 \) in \( T \). \( T' \) is ok and it is still linear time.


11011110: Re: a fix?

That's essentially what one gets by applying one round of Sun's algorithm to the DFS tree.


Hi, just to point out that the result on bounding the number of leaves of a spanning tree in terms of the independence number (via DFS) is contained in Proposition 8 of the following paper:
L. Gargano, M. Hammer, P. Hell, L. Stacho, U. Vaccaro
``Spanning Spiders and Light- Splitting Switches'',
Discrete Mathematics, vol. 285, Issue 1-3, pp. 83-95, 2004.