Decomposable (probabilistic) models are log-linear models generated by acyclic hypergraphs, and a number of nice properties enjoyed by them are known. In many applications the following selection problem naturally arises: given a probability distribution p over a finite set V of n discrete variables and a positive integer k, find a decomposable model with tree-width k that best fits p. If H is the generating hypergraph of a decomposable model and pH is the estimate of p under the model, we can measure the closeness of pH to p by the information divergence D(p:pH), so that the problem above reads: given p and k, find an acyclic, connected hypergraph H of tree-width k such that D(p:pH) is minimum. It is well-known that this problem is NP-hard. However, for k=1 it was solved by Chow and Liu in a very efficient way; thus, starting from an optimal Chow-Liu solution, a few forward-selection procedures have been proposed with the aim at finding a `good' solution for an arbitrary k. We propose a backward-selection procedure which starts from the (trivial) optimal solution for k=n−1, and we show that, in a study case taken from literature, our procedure succeeds in finding an optimal solution for every k.
.
Let P be a discrete multidimensional probability distribution over a finite set of variables N which is only partially specified by the requirement that it has prescribed given marginals {PA;A∈\SS}, where \SS is a class of subsets of N with ⋃\SS=N. The paper deals with the problem of approximating P on the basis of those given marginals. The divergence of an approximation P^ from P is measured by the relative entropy H(P|P^). Two methods for approximating P are compared. One of them uses formerly introduced concept of {\em dependence structure simplification\/} (see Perez \cite{Per79}). The other one is based on an {\em explicit expression}, which has to be normalized. We give examples showing that neither of these two methods is universally better than the other. If one of the considered approximations P^ really has the prescribed marginals then it appears to be the distribution P with minimal possible multiinformation. A simple condition on the class \SS implying the existence of an approximation P^ with prescribed marginals is recalled. If the condition holds then both methods for approximating P give the same result.