# Notation

dit is a scientific tool, so much of this documentation contains mathematical expressions. This page describes the notation used.

## Basic Notation

A random variable $$X$$ consists of outcomes $$x$$ from an alphabet $$\mathcal{X}$$. As such, we write the entropy of a distribution as $$\H{X} = -\sum_{x \in \mathcal{X}} p(x) \log_2 p(x)$$, where $$p(x)$$ denotes the probability of the outcome $$x$$ occurring.

Many distributions are joint distributions. In the absence of variable names, we index each random variable with a subscript. For example, a distribution over three variables is written $$X_0X_1X_2$$. As a shorthand, we also denote those random variables as $$X_{0:3}$$, meaning start with $$X_0$$ and go up to, but not including, $$X_3$$, just like Python slice notation.

If a set of variables $$X_{0:n}$$ are independent, we will write $$\ind X_{0:n}$$. If a set of variables $$X_{0:n}$$ are independent conditioned on $$V$$, we write $$\ind X_{0:n} \mid V$$.
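As a rough illustration of what $$\ind X_{0:n}$$ asserts, the sketch below checks whether a joint distribution equals the product of its marginals. It uses plain Python dictionaries rather than dit objects; the helper names `marginal` and `is_independent` are hypothetical and chosen only for this example. The conditional case is not covered.

```python
from itertools import product
from math import isclose


def marginal(joint, index):
    """Marginal pmf of the variable at position `index` of each outcome."""
    pmf = {}
    for outcome, p in joint.items():
        pmf[outcome[index]] = pmf.get(outcome[index], 0.0) + p
    return pmf


def is_independent(joint):
    """True if the joint pmf equals the product of its marginals."""
    n = len(next(iter(joint)))
    marginals = [marginal(joint, i) for i in range(n)]
    # Check every combination of symbols, including those missing from `joint`.
    for outcome in product(*(m.keys() for m in marginals)):
        expected = 1.0
        for i, symbol in enumerate(outcome):
            expected *= marginals[i][symbol]
        if not isclose(joint.get(outcome, 0.0), expected, abs_tol=1e-12):
            return False
    return True


# Two fair coins flipped separately: independent.
two_coins = {(a, b): 0.25 for a in "01" for b in "01"}
# Two perfectly correlated bits: not independent.
copies = {("0", "0"): 0.5, ("1", "1"): 0.5}

print(is_independent(two_coins))  # True
print(is_independent(copies))     # False
```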

If we ever need to describe an infinitely long chain of variables, we drop the index from the side that is infinite. So $$X_{:0} = \ldots X_{-3}X_{-2}X_{-1}$$ and $$X_{0:} = X_0X_1X_2\ldots$$. For an arbitrary set of indices $$A$$, the corresponding collection of random variables is denoted $$X_A$$. For example, if $$A = \{0,2,4\}$$, then $$X_A = X_0 X_2 X_4$$. The complement of $$A$$ (with respect to some universal set) is denoted $$\overline{A}$$.
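The indexing conventions above map directly onto Python list operations. The following sketch uses a hypothetical list of five variable names, with the universal index set assumed to be $$\{0, 1, 2, 3, 4\}$$, purely to illustrate the notation.

```python
# Hypothetical variable names; only the indexing matters here.
variables = ["X0", "X1", "X2", "X3", "X4"]

# X_{0:3}: start at X_0 and stop before X_3, as with a Python slice.
X_0_3 = variables[0:3]

# X_A for an arbitrary index set A.
A = {0, 2, 4}
X_A = [variables[i] for i in sorted(A)]

# The complement of A, taken relative to the assumed universal set.
universe = set(range(len(variables)))
A_bar = universe - A
X_A_bar = [variables[i] for i in sorted(A_bar)]

print(X_0_3)    # ['X0', 'X1', 'X2']
print(X_A)      # ['X0', 'X2', 'X4']
print(X_A_bar)  # ['X1', 'X3']
```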

Furthermore, we define $$0 \log_2 0 = 0$$.
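To make the entropy formula and the $$0 \log_2 0 = 0$$ convention concrete, here is a minimal sketch in plain Python using the standard definition $$-\sum_x p(x) \log_2 p(x)$$. The function name `entropy` and the dictionary representation are assumptions made for this example, not dit's own API.

```python
from math import log2


def entropy(pmf):
    """Shannon entropy in bits of a pmf given as {outcome: probability}.

    Outcomes with zero probability are skipped, which implements the
    convention 0 * log2(0) = 0.
    """
    return -sum(p * log2(p) for p in pmf.values() if p > 0)


fair_coin = {"H": 0.5, "T": 0.5}
certain = {"H": 1.0, "T": 0.0}

print(entropy(fair_coin))  # 1.0
# The deterministic case has zero entropy; the zero-probability outcome
# is skipped instead of raising a math domain error.
print(entropy(certain) == 0)  # True
```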

## Advanced Notation

When there exists a function $$f$$ such that $$Y = f(X)$$, we write $$X \imore Y$$, meaning that $$X$$ is informationally richer than $$Y$$. Similarly, if $$f(Y) = X$$, then we write $$X \iless Y$$ and say that $$X$$ is informationally poorer than $$Y$$. If $$X \iless Y$$ and $$X \imore Y$$, then we write $$X \ieq Y$$ and say that $$X$$ is informationally equivalent to $$Y$$.

Of all the variables that are poorer than both $$X$$ and $$Y$$, there is a richest one. This variable is known as the meet of $$X$$ and $$Y$$ and is denoted $$X \meet Y$$. By definition, for all $$Z$$ such that $$Z \iless X$$ and $$Z \iless Y$$, we have $$Z \iless X \meet Y$$.

Similarly, of all the variables richer than both $$X$$ and $$Y$$, there is a poorest one. This variable is known as the join of $$X$$ and $$Y$$ and is denoted $$X \join Y$$. The joint random variable $$(X,Y)$$ and the join are informationally equivalent: $$(X,Y) \ieq X \join Y$$.
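The informational ordering can be tested directly on a joint distribution: $$X \imore Y$$ holds when every outcome of $$X$$ with positive probability is paired with exactly one outcome of $$Y$$. The sketch below checks this in plain Python; the helper `is_function_of` is a hypothetical name for this example and is not part of dit.

```python
def is_function_of(joint, src, dst):
    """True if the variable at position `dst` is a function of the one at `src`.

    `joint` maps outcome tuples to probabilities; zero-probability
    outcomes are ignored.
    """
    seen = {}
    for outcome, p in joint.items():
        if p <= 0:
            continue
        x, y = outcome[src], outcome[dst]
        # Each x may be paired with only one y.
        if seen.setdefault(x, y) != y:
            return False
    return True


# X is uniform on {0, 1, 2, 3} and Y = X mod 2, so X is richer than Y.
parity = {(x, x % 2): 0.25 for x in range(4)}
print(is_function_of(parity, 0, 1))  # True:  Y = f(X)
print(is_function_of(parity, 1, 0))  # False: X is not a function of Y

# A one-to-one relabeling makes the two variables informationally equivalent.
relabel = {(x, "abcd"[x]): 0.25 for x in range(4)}
print(is_function_of(relabel, 0, 1) and is_function_of(relabel, 1, 0))  # True
```

In the same representation, the joint variable $$(X,Y)$$ is the outcome tuple itself, which trivially determines both of its components.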

Lastly, we use $$X \mss Y$$ to denote the minimal sufficient statistic of $$X$$ about the random variable $$Y$$.