Sep 6 2023

Information Entropy and Communication

When trying to understand a set of facts about a situation, say during an investigation or while conducting diligence, it is important to drill down as quickly as possible. Bear with me for a second, because this scenario will be a little unusual. Let’s say you were looking to buy a car and you had access to a salesperson who could bring you any car in the country. However, this salesperson would present you with a totally random car plucked from a random sales lot somewhere in the country, and you could only ask them about the car; you can’t see it yourself. What question(s) would you ask the salesperson to rule out an unsuitable car as quickly as possible? What order would you ask the question(s) in? Think about all of the possible cars, models, years, conditions, features, etc. That’s an enormous problem space. However, there are likely swaths of automatically disqualifying characteristics you can use to quickly rule out potential options. If you only want a minivan, then all trucks, sedans, sports cars, etc. are automatically ruled out - so your first question may be “is it a minivan?” and if the answer is “no” then there’s no need to ask a further question - just bring me another option!

With that setup, let’s touch on some of the theory. If you have a “system” or a problem space with a finite set of possible outcomes, represented by the state of a variable, the Entropy is the amount of “information” inherent to the variable’s possible outcomes. Interestingly, and helpfully, Entropy can also be called Surprise.
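To make that concrete, here is a minimal sketch in Python (the function name is just illustrative) of the Shannon entropy formula, H = -sum(p * log2(p)) taken over all possible outcomes:

    import math

    def entropy_bits(probabilities):
        # Shannon entropy in bits: H = -sum(p * log2(p)) over all outcomes
        return -sum(p * math.log2(p) for p in probabilities if p > 0)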

To use a common example: a fair coin has two sides, each equally likely to land face up when flipped in the air. Each outcome has a probability of 50% (1/2) and, when calculated (-0.5*log2(0.5)), contributes 0.5 bits of entropy. Added together, the two possible outcomes give the coin flip 1 bit of entropy.

To take it a step further: a fair d10 die has 10 sides, each equally likely to land face up when rolled across a surface. Each outcome has a probability of 10% (1/10) and, when calculated (-0.1*log2(0.1)), contributes 0.332 bits of entropy. Added together, the ten possible outcomes give the roll roughly 3.32 bits of entropy.
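Using the sketch above, both figures fall out directly:

    entropy_bits([0.5, 0.5])    # fair coin -> 1.0 bit
    entropy_bits([0.1] * 10)    # fair d10  -> ~3.32 bits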

Now, if I flipped the coin and rolled the d10 die behind a screen and told you that you had to guess the outcome, but you could only ask one question about either the coin toss or the die roll, what question would you ask? Most people would intuitively ask what the outcome of the die roll was. Why? Because the odds of guessing a coin flip on your own are better, so you spend your one question on the die roll.

Looking at the situation through the lens of information entropy, the combined entropy is the 1 bit from the coin flip plus the 3.32 bits from the die roll, making 4.32 bits of entropy in the problem space. If the result of the coin flip is known, the entropy of the system falls from 4.32 to 3.32 bits. However, if the result of the die roll is known, the entropy of the system falls from 4.32 to 1 bit. Said a different way, the greatest reduction in information entropy comes from determining the state of the d10 die.
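Continuing the sketch from above (the coin and die are independent, so their entropies simply add):

    coin = entropy_bits([0.5, 0.5])   # 1.0 bit
    die = entropy_bits([0.1] * 10)    # ~3.32 bits
    total = coin + die                # ~4.32 bits in the problem space

    total - coin   # ~3.32 bits remain if the coin flip is revealed
    total - die    # 1.0 bit remains if the die roll is revealed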

Practical Applications

Communication - When communicating with others, especially over a bandwidth-constrained medium like text (email, Slack, etc.), it is important to reduce the entropy for your reader as quickly as possible. Think about how annoying it is when someone messages you “Hello” with no other context, or when you get a random inbound from someone you don’t know. One of the underlying reasons those situations are bothersome is that they are inherently high entropy for the receiver. The person receiving the message doesn’t have any initial context for what the sender is pinging them about, so the potential problem space is enormous and ambiguous, and therefore inherently difficult and uncertain. The sender should reduce entropy as quickly as possible and get to the point.

Diligence - Similar to the communication piece, but with a different twist: if you’re trying to gather information from someone, as in the opening scenario, it is best to ask initial questions that convey the most information, narrow the possible “surprise”, and reduce the entropy. If a co-worker comes to you crying, asking them what they ate for breakfast likely won’t reduce the entropy. Asking instead whether they are happy tears or sad tears, or whether it’s about work or something personal, will cut down the problem space quickly as you zero in on what is wrong (or right!).

Metrics - If you think about the metrics your team or organization uses to determine the health of the business or the performance of team/org functions, what is the most informative metric? Likely the one that conveys the most information and therefore lowers the entropy by the greatest degree. Which metrics tell you the most about something you care about? One way to make a metric more informative - so that it resolves more entropy for the reader - is to normalize it. If someone said “they closed 57 work items” - is that a lot? a little? over what time frame? Representing it instead as “they closed 132% of the monthly average work items for the team” puts the figure in comparison to an average over a time period, thereby conveying a greater amount of information and reducing the surprise/entropy.
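As a rough illustration in the same Python sketch (the monthly average here is a hypothetical baseline, not a figure from the example above), the normalization is a one-line transform:

    closed_this_month = 57
    monthly_average = 43          # hypothetical baseline for the team
    normalized_pct = closed_this_month / monthly_average * 100   # ~132.6% of the monthly average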