Entropy is a concept from information theory: it measures the expected amount of information conveyed by the occurrence of an event. Following Shannon (1948), given an event \(y\) with probability density function \(f(\cdot)\), the information content of observing \(y\) can be defined as \(g(f(y)) := - \log f(y)\). Therefore, the expected information or, put simply, the entropy is
\[ H(f) := -E \big[ \log f(y) \big] = - \int_{-\infty}^{\infty} f(y) \log f(y) \, dy \text{.} \]
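As a quick numerical illustration (a sketch added here, not drawn from the original text), this integral can be approximated for a distribution whose entropy has a closed form; the standard normal, with differential entropy \( \tfrac{1}{2}\log(2\pi e) \), is used below, and the reliance on scipy for the quadrature is an assumption:

```python
import numpy as np
from scipy import integrate, stats

# Approximate H(f) = -∫ f(y) log f(y) dy for the standard normal and compare
# with the closed form 0.5 * log(2 * pi * e).  Limits of ±35 keep the pdf
# strictly positive in floating point; the tails beyond contribute nothing
# numerically.
f = stats.norm.pdf
H_numeric, _ = integrate.quad(lambda y: -f(y) * np.log(f(y)), -35, 35)
H_closed = 0.5 * np.log(2 * np.pi * np.e)

print(H_numeric)  # ≈ 1.4189
print(H_closed)   # ≈ 1.4189
```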
For a discrete distribution, with \(p_k\) denoting the probability of event \(k \in K\) occurring, the entropy formula takes the form:
\[ H = - \sum_{k \in K} p_k \log p_k \text{.} \]
The main idea is that the information content of an event decreases with its probability of occurrence. In other words, observing a rare event conveys more information than observing a common one.
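A minimal sketch of the discrete formula, assuming a made-up four-event distribution: each event's information content is \(-\log p_k\), and the entropy weights these contents by the probabilities themselves:

```python
import numpy as np

# Discrete entropy H = -sum_k p_k log p_k for a made-up four-event distribution.
p = np.array([0.50, 0.25, 0.15, 0.10])

info = -np.log(p)     # information content of each event: rarer => larger
H = np.sum(p * info)  # entropy: probability-weighted expected information

print(info)  # [0.693 1.386 1.897 2.303]
print(H)     # ≈ 1.208
```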
Using ideas presented in Cowell, Flachaire, and Bandyopadhyay (2009), and substituting the income share of an individual for the density function:
\[ s(q) = F^{-1}(q) \Big/ \int_{0}^{1} F^{-1}(t) \, dt = y/\mu \]
the negative of the entropy function becomes the Theil-T inequality index:
\[ I_{\text{Theil}} = \int_{0}^{\infty} \frac{y}{\mu} \log \bigg( \frac{y}{\mu} \bigg) \, dF(y) = -H(s) \text{.} \]
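As a sketch of the sample analogue (the income vector below is hypothetical), the integral is replaced by the sample mean of \((y/\mu)\log(y/\mu)\); identical incomes yield an index of zero:

```python
import numpy as np

def theil_t(y):
    """Theil-T index: the sample mean of s * log(s), with s = y / mean(y)."""
    s = y / np.mean(y)
    return np.mean(s * np.log(s))

incomes = np.array([10.0, 20.0, 30.0, 40.0, 100.0])  # hypothetical sample
print(theil_t(incomes))           # ≈ 0.276, positive under inequality
print(theil_t(np.full(5, 40.0)))  # 0.0 under perfect equality
```

A more careful implementation would guard against zero or negative incomes, for which the logarithm is undefined.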
Therefore, the entropy-based inequality measure increases as a person’s income \(y\) deviates from the mean \(\mu\), and it reaches zero when every income equals the mean. This is the basic idea behind entropy-based inequality measures.