# General Information

Documentation:

http://docs.dit.io

Downloads:

https://pypi.org/project/dit/

https://anaconda.org/conda-forge/dit

Dependencies:

## Optional Dependencies

• colorama: colored column heads in PID indicating failure modes

• cython: faster sampling from distributions

• hypothesis: random sampling of distributions

• matplotlib, python-ternary: plotting of various information-theoretic expansions

• numdifftools: numerical evaluation of gradients and hessians during optimization

• pint: add units to informational values

• scikit-learn: faster nearest-neighbor lookups during entropy/mutual information estimation from samples

Mailing list:

None

Code and bug tracker:

https://github.com/dit/dit

License:

BSD 3-Clause, see LICENSE.txt for details.

### Quickstart

Basic usage of dit consists of creating distributions, modifying them if need be, and then computing properties of those distributions. First, we import:

In : import dit


Suppose we have a really thick coin, one so thick that there is a reasonable chance of it landing on its edge. Here is how we might represent the coin in dit.

In : d = dit.Distribution(['H', 'T', 'E'], [.4, .4, .2])

In : print(d)
Class:          Distribution
Alphabet:       ('E', 'H', 'T') for all rvs
Base:           linear
Outcome Class:  str
Outcome Length: 1
RV Names:       None

x   p(x)
E   1/5
H   2/5
T   2/5


Calculate the probability of $$H$$ and also of the combination: $$H~\mathbf{or}~T$$.

In : d['H']
Out: 0.4

In : d.event_probability(['H','T'])
Out: 0.8
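
As a cross-check on the two values above, the event probability is simply the sum of the probabilities of the outcomes in the event. The sketch below is independent of dit: the `coin` dictionary and the `event_probability` helper are illustrative names defined here, not part of the library.

```python
from fractions import Fraction

# Thick-coin distribution from the text: outcome -> probability.
# Exact fractions avoid floating-point rounding in the sum.
coin = {"H": Fraction(2, 5), "T": Fraction(2, 5), "E": Fraction(1, 5)}


def event_probability(pmf, event):
    """Sum the probabilities of the distinct outcomes that make up the event."""
    return sum(pmf[outcome] for outcome in set(event))


p_h = coin["H"]
p_h_or_t = event_probability(coin, ["H", "T"])
print(p_h, p_h_or_t)  # 2/5 4/5
```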


Calculate the Shannon entropy and extropy of the joint distribution.

In : dit.shannon.entropy(d)
Out: 1.5219280948873621

In : dit.other.extropy(d)
Out: 1.1419011889093373
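
For readers who want to see where these two numbers come from, here is a minimal sketch that evaluates both quantities directly from the probabilities, in bits. It assumes the usual definitions, entropy as -Σ p log2 p and extropy as -Σ (1 - p) log2(1 - p), and makes no claim about how dit computes them internally. The function names are local to this example.

```python
from math import log2

# Probabilities of the thick coin, in any order.
pmf = [0.4, 0.4, 0.2]


def entropy(pmf):
    """Shannon entropy in bits: -sum p * log2(p), skipping zero-probability outcomes."""
    return -sum(p * log2(p) for p in pmf if p > 0)


def extropy(pmf):
    """Extropy in bits: -sum (1 - p) * log2(1 - p), skipping outcomes with p == 1."""
    return -sum((1 - p) * log2(1 - p) for p in pmf if p < 1)


h = entropy(pmf)
j = extropy(pmf)
print(round(h, 6), round(j, 6))  # 1.521928 1.141901
```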


Create a distribution representing the $$\mathbf{xor}$$ logic function. Here, we have two inputs, $$X$$ and $$Y$$, and then an output $$Z = \mathbf{xor}(X,Y)$$.

In : import dit.example_dists

In : d = dit.example_dists.Xor()

In : d.set_rv_names(['X', 'Y', 'Z'])


Calculate the Shannon mutual informations $$\operatorname{I}[X:Z]$$, $$\operatorname{I}[Y:Z]$$, and $$\operatorname{I}[X,Y:Z]$$.

In : dit.shannon.mutual_information(d, ['X'], ['Z'])
Out: 0.0

In : dit.shannon.mutual_information(d, ['Y'], ['Z'])
Out: 0.0

In : dit.shannon.mutual_information(d, ['X', 'Y'], ['Z'])
Out: 1.0
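
The three values can be reproduced by hand from the identity I[A:B] = H[A] + H[B] - H[A,B]. The sketch below does this for the xor distribution using only the standard library. The `xor` dictionary, the `INDEX` mapping, and both helper functions are assumptions made for this illustration and are not dit's implementation.

```python
from collections import defaultdict
from math import log2

# Joint distribution of (X, Y, Z) with Z = xor(X, Y) and uniform inputs.
xor = {"000": 0.25, "011": 0.25, "101": 0.25, "110": 0.25}
INDEX = {"X": 0, "Y": 1, "Z": 2}  # position of each variable in an outcome string


def marginal_entropy(pmf, names):
    """Entropy in bits of the marginal over the named variables."""
    positions = sorted(INDEX[name] for name in names)
    marginal = defaultdict(float)
    for outcome, p in pmf.items():
        marginal[tuple(outcome[i] for i in positions)] += p
    return -sum(p * log2(p) for p in marginal.values() if p > 0)


def mutual_information(pmf, xs, ys):
    """I[xs : ys] = H[xs] + H[ys] - H[xs, ys]."""
    joint = set(xs) | set(ys)
    return (marginal_entropy(pmf, xs) + marginal_entropy(pmf, ys)
            - marginal_entropy(pmf, joint))


i_xz = mutual_information(xor, ["X"], ["Z"])
i_yz = mutual_information(xor, ["Y"], ["Z"])
i_xyz = mutual_information(xor, ["X", "Y"], ["Z"])
print(i_xz, i_yz, i_xyz)  # 0.0 0.0 1.0
```

Each input alone tells us nothing about the output, while both inputs together determine it completely, which is why the first two values are zero and the third is one bit.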


Calculate the marginal distribution $$P(X,Z)$$. Then print its probabilities as fractions, showing the mask.

In : d2 = d.marginal(['X', 'Z'])

In : print(d2.to_string(show_mask=True, exact=True))
Class:          Distribution
Alphabet:       ('0', '1') for all rvs
Base:           linear
Outcome Class:  str
Outcome Length: 2 (mask: 3)
RV Names:       ('X', 'Z')
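
Marginalizing means summing the joint probabilities over the variables that are dropped, here $$Y$$. The following sketch shows that operation on the xor distribution with exact fractions. The `marginal` helper and its `keep` parameter are hypothetical names for this example, not dit's API.

```python
from collections import defaultdict
from fractions import Fraction

# Joint distribution of (X, Y, Z) with Z = xor(X, Y), as exact fractions.
xor = {outcome: Fraction(1, 4) for outcome in ("000", "011", "101", "110")}


def marginal(pmf, keep):
    """Sum out every position not listed in `keep` (positions into the outcome string)."""
    result = defaultdict(Fraction)  # Fraction() is zero
    for outcome, p in pmf.items():
        result["".join(outcome[i] for i in keep)] += p
    return dict(result)


# Keep X (position 0) and Z (position 2), summing over Y (position 1).
p_xz = marginal(xor, keep=(0, 2))
for outcome in sorted(p_xz):
    print(outcome, p_xz[outcome])
```

All four (X, Z) pairs end up with probability 1/4, consistent with $$X$$ and $$Z$$ sharing no information.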


Convert the distribution probabilities to log (base 3.5) probabilities, and access its probability mass function.

In : d2.set_base(3.5)

In : d2.pmf
Out: array([-1.10658951, -1.10658951, -1.10658951, -1.10658951])
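
The log probabilities shown above are the base-3.5 logarithms of the linear probabilities, each of which is 1/4 for this marginal. A short standard-library sketch of that conversion and its inverse follows. The variable names are chosen for this example only.

```python
from math import log

base = 3.5
linear = [0.25, 0.25, 0.25, 0.25]  # linear probabilities of the (X, Z) marginal

# Linear -> log: take the logarithm of each probability in the chosen base.
log_pmf = [log(p, base) for p in linear]

# Log -> linear: raise the base to each log probability.
recovered = [base ** lp for lp in log_pmf]

print([round(lp, 8) for lp in log_pmf])
```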


Draw 5 random samples from this distribution.

In : d2.rand(5)
Out: ['01', '10', '00', '01', '00']
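
Sampling draws outcomes with frequency proportional to their probabilities, so the exact sequence differs from run to run. As a rough illustration of the idea, not of dit's sampler, the same kind of draw can be sketched with Python's `random.choices`. The seed below is an arbitrary choice that only makes the example repeatable.

```python
import random

outcomes = ["00", "01", "10", "11"]
weights = [0.25, 0.25, 0.25, 0.25]  # probabilities of the (X, Z) marginal

rng = random.Random(0)  # arbitrary seed, used only for repeatability
samples = rng.choices(outcomes, weights=weights, k=5)
print(samples)
```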


Enjoy!