# Optimization

It is often useful to construct a distribution $$d^\prime$$ that is consistent with some marginal aspects of $$d$$ but otherwise optimizes some information measure. For example, we may want a distribution that matches the pairwise marginals of another distribution but otherwise has maximum entropy:

In : from dit.algorithms.distribution_optimizers import MaxEntOptimizer
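To make the idea of "same pairwise marginals, more entropy" concrete, here is a minimal standard-library sketch that does not use dit at all. The helpers `marginal` and `entropy` are illustrative names, not part of any library, and distributions are plain dictionaries mapping outcome tuples to probabilities. It checks that the uniform distribution over three bits agrees with the XOR distribution on every pairwise marginal while having higher entropy.

```python
from collections import defaultdict
from itertools import product
from math import log2


def marginal(dist, rvs):
    """Marginalize `dist` (outcome tuple -> probability) onto the indices `rvs`."""
    out = defaultdict(float)
    for outcome, p in dist.items():
        out[tuple(outcome[i] for i in rvs)] += p
    return dict(out)


def entropy(dist):
    """Shannon entropy in bits."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)


# XOR: x0 and x1 are independent fair bits, x2 = x0 ^ x1.
xor = {(a, b, a ^ b): 0.25 for a, b in product((0, 1), repeat=2)}

# Candidate d': uniform over all eight 3-bit outcomes.
uniform = {o: 0.125 for o in product((0, 1), repeat=3)}

pairs = [(0, 1), (0, 2), (1, 2)]
same_pairwise = all(marginal(xor, rvs) == marginal(uniform, rvs) for rvs in pairs)

print(same_pairwise)
print(entropy(xor), entropy(uniform))
```

Because the uniform distribution has the largest possible entropy on this alphabet and already satisfies the pairwise constraints, it is the maximum entropy distribution for this particular example.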


## Helper Functions

There are three special functions to handle common optimization problems:

In : from dit.algorithms import maxent_dist, marginal_maxent_dists


The first, maxent_dist, constructs the maximum entropy distribution with a specified set of marginals held fixed. It encapsulates the optimizer steps above in a single call:

In : from dit.example_dists import Xor

In : xor = Xor()

In : print(maxent_dist(xor, [[0,1], [0,2], [1,2]]))
Class:          Distribution
Alphabet:       ('0', '1') for all rvs
Base:           linear
Outcome Class:  str
Outcome Length: 3
RV Names:       None
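For intuition about what "maximum entropy with fixed marginals" computes, the following sketch uses iterative proportional fitting (IPF), a classical technique swapped in here for illustration. No claim is made that dit works this way. The function `ipf_maxent` and the helper `marginal` are hypothetical names, distributions are plain dictionaries, and all variables are assumed to be binary. Starting from the uniform distribution, each step rescales the current estimate so that one of the requested marginals matches the target.

```python
from collections import defaultdict
from itertools import product


def marginal(dist, rvs):
    """Marginalize `dist` (outcome tuple -> probability) onto the indices `rvs`."""
    out = defaultdict(float)
    for outcome, p in dist.items():
        out[tuple(outcome[i] for i in rvs)] += p
    return out


def ipf_maxent(target, constraints, sweeps=100):
    """Hypothetical helper: iterative proportional fitting from a uniform start."""
    n_vars = len(next(iter(target)))
    outcomes = list(product((0, 1), repeat=n_vars))  # assumes binary variables
    q = {o: 1.0 / len(outcomes) for o in outcomes}
    wanted = [marginal(target, rvs) for rvs in constraints]
    for _ in range(sweeps):
        for rvs, want in zip(constraints, wanted):
            have = marginal(q, rvs)
            for o in outcomes:
                key = tuple(o[i] for i in rvs)
                if have[key] > 0:
                    # Rescale so the marginal of q on `rvs` matches the target.
                    q[o] *= want.get(key, 0.0) / have[key]
    return q


# XOR with all pairwise marginals fixed.
xor = {(a, b, a ^ b): 0.25 for a, b in product((0, 1), repeat=2)}
q_xor = ipf_maxent(xor, [(0, 1), (0, 2), (1, 2)])

# AND with only the single-variable marginals fixed.
and_dist = {(a, b, a & b): 0.25 for a, b in product((0, 1), repeat=2)}
q_and = ipf_maxent(and_dist, [(0,), (1,), (2,)])

print(q_xor[(0, 0, 0)])
print(q_and[(1, 1, 1)])
```

For XOR the pairwise marginals are all uniform, so every outcome ends up with probability 1/8. For AND with only single-variable marginals fixed, the result is the product of those marginals, so the outcome (1, 1, 1) gets 1/2 · 1/2 · 1/4 = 1/16.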


The second, marginal_maxent_dists, constructs a sequence of maximum entropy distributions, one for each subset size, where each distribution fixes the marginals of all subsets of variables of that size:

In : k0, k1, k2, k3 = marginal_maxent_dists(xor)


where k0 is the maxent distribution constrained only to have the same alphabets as xor; k1 fixes $$p(x_0)$$, $$p(x_1)$$, and $$p(x_2)$$; k2 fixes $$p(x_0, x_1)$$, $$p(x_0, x_2)$$, and $$p(x_1, x_2)$$ (as in the maxent_dist example above); and finally k3 fixes $$p(x_0, x_1, x_2)$$ (i.e., it is the distribution we started with).
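One way to write this family compactly is sketched below. The notation ($$k_j$$ for the $$j$$-th distribution, $$S$$ for a subset of variable indices, $$H$$ for Shannon entropy) is introduced here for illustration and is not taken from dit. For a distribution $$d$$ over $$n$$ variables, with $$p$$ ranging over distributions on the same alphabets as $$d$$:

$$
k_j = \arg\max_{p} H(p) \quad \text{subject to} \quad p(x_S) = d(x_S) \ \text{ for all } S \subseteq \{0, \dots, n-1\} \text{ with } |S| = j
$$

For the xor example, every single-variable and pairwise marginal is uniform, so $$k_0$$, $$k_1$$, and $$k_2$$ are all the uniform distribution over three bits, with $$H(k_0) = H(k_1) = H(k_2) = 3$$ bits, while $$H(k_3) = 2$$ bits because $$k_3$$ is xor itself.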