Cause and effect. When X happens, Y happens. If you turn it around, Y happens because of X. The first is a correlation; the second is causation.
You’d think that because the first is true, the second must be true as well. Unfortunately, that is not always so.
Let’s illustrate:
There was a speaker on stage with a microphone in his hand. Some blocks away, there was a fire outbreak, but nothing major. Firefighters responded, and then came the police, all blaring their sirens. Whenever one of those cars with a siren passed the building where the speaker was, the microphone gave a screeching sound. It happened three times: a car passes, and the microphone makes that uncomfortable sound. The first time seemed like a coincidence, but then it happened a second time, and then again. The speaker, unaware of what the audience had noticed, asked what was causing the microphone to make this sound. Was something wrong with it? They said no, the cars with the sirens were interfering and causing the sound. Then another car with a siren passed, but the microphone stayed quiet. The speaker said, “So, what’s up? Why did it not make a sound now?”
What do you think causes the sound?
As for what actually happened, you’ll find out at the end.
It is hard to predict the future. So hard that such predictions are always blown out of proportion by both optimists and pessimists alike.
How many headlines about the world ending have we read? Movies about the world ending are a familiar thing.
On the other hand, shouldn’t we have solved hunger and poverty by now? Shouldn’t we be using flying cars? Instead, we got 140 characters (to borrow Peter Thiel’s line).
Let’s forget about the future. How about explaining the past? It should be easier to explain the past. That is not always the case.
Humans find it difficult to explain why things happen. The cause of a thing is hard to pinpoint, because too many variables have varying effects on the subject. That is why studies control for factors such as race, age, etc.
We seek to form narratives about things because narratives are tidier and easier to recall. A narrative is a straight line or an arc. Simple.
Again, this is not our fault. It’s nobody’s fault. It’s just how the brain works.
Many things can contribute to one thing happening, but we might see only one or two major ones and treat them as the cause of X.
A well-known example is lung cancer and smoking. People believed that smoking causes lung cancer well before studies confirmed it. Why? Because lung cancer cases rose as cigarette consumption rose. But until studies demonstrated that cigarettes were responsible, we couldn’t be sure. It might have been something else, or a combination of causes. All we had was a correlation: when X happens, Y happens most of the time.
Correlation
The case for correlation is much easier. When X happens, Y happens. “If you look long enough, you’ll see patterns.”
Correlation means a relationship or association between two variables: a movement in one variable is associated with a movement in the other.
A lot of things seemingly correlate. You can find any pattern you are looking for in a big enough population.
[Chart: Correlation can be found anywhere]
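The idea that patterns turn up in any large enough pile of data can be checked with a rough simulation. The sketch below is only an illustration: the series are pure random noise, and the function names, the number of series, and the seed are all made up for the example. It searches every pair of unrelated series for the strongest correlation that arises by chance alone.

```python
import itertools
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def strongest_chance_correlation(num_series=200, length=10, seed=0):
    """Generate unrelated random series and return the most correlated pair."""
    rng = random.Random(seed)
    # Every series is independent noise; none of them influences any other.
    series = [[rng.gauss(0, 1) for _ in range(length)] for _ in range(num_series)]
    i, j = max(
        itertools.combinations(range(num_series), 2),
        key=lambda pair: abs(pearson(series[pair[0]], series[pair[1]])),
    )
    return i, j, pearson(series[i], series[j])


if __name__ == "__main__":
    i, j, r = strongest_chance_correlation()
    print(f"Series {i} and {j} are unrelated, yet r = {r:+.2f}")
```

With 200 short series there are 19,900 pairs to compare, so a few pairs line up closely by luck. Searching harder makes a striking pattern more likely, not less.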
Formulas
Take two pairs of variables: X is inversely proportional to Y, and W is directly proportional to J. Both pairs correlate. X is negatively correlated with Y, which means they move in opposite directions in the same proportion. If X doubles, then Y halves.
For the directly proportional pair, W increases as J increases. If W doubles, then J also doubles. In both the X, Y and W, J cases, a change in one variable is matched by a proportional change in the other.
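The two relationships can be written out as a small sketch. The function names and the constant k are assumptions made for illustration; the text above does not name them.

```python
def direct(w, k=1.0):
    """J is directly proportional to W: J = k * W."""
    return k * w


def inverse(x, k=1.0):
    """Y is inversely proportional to X: Y = k / X."""
    return k / x


if __name__ == "__main__":
    for value in (1.0, 2.0, 4.0):
        # As the input doubles, the direct output doubles and the inverse output halves.
        print(f"input={value}  direct={direct(value)}  inverse={inverse(value)}")
```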
In Statistics
In statistics, a correlation has a degree of strength. It is measured on a scale from +1 through 0 to -1. When one variable increases as the other increases, the correlation is positive; when one decreases as the other increases, it is negative. A value near 0 means there is little or no linear relationship.
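The most common measure on this +1 to -1 scale is the Pearson correlation coefficient. Here is a minimal sketch of it; the data values are invented purely to show a positive, a negative, and a zero result.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; assumes neither sequence is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data, chosen only to illustrate the three cases.
hours = [1, 2, 3, 4, 5]
rising = [2, 4, 5, 4, 5]               # tends to go up with hours
falling = [10 - 2 * h for h in hours]  # goes down in a perfect straight line
hump = [1, 2, 3, 2, 1]                 # rises then falls: no overall linear trend

if __name__ == "__main__":
    for label, ys in (("rising", rising), ("falling", falling), ("hump", hump)):
        print(f"{label}: r = {pearson(hours, ys):+.2f}")
```

The hump case is worth a second look: the two variables are clearly related, but not in a straight line, so the coefficient misses the relationship.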
In medicine
There’s a commonly used term, risk factor. I like to look at it as a probability. When medics say obesity is a risk factor for diabetes, they mean that obesity increases the likelihood that one gets diabetes. Risk factors are not necessarily the causes of illnesses; instead, they are correlated with diseases.
“For example, being young cannot be said to cause measles, but young people have a higher rate of measles because they are less likely to have developed immunity during a previous epidemic.” So, being young is correlated with measles.
Poor sleep and lack of exercise are risk factors for cardiovascular diseases. This doesn’t mean you will get a stroke because you did not sleep well yesterday. Instead, poor sleep increases the chances of a stroke happening.
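One way to make “increases the chances” concrete is to compare how often an illness shows up in people with and without the risk factor. The sketch below uses invented counts, not real medical data, and the function names are assumptions for illustration.

```python
def risk(cases, group_size):
    """Share of a group that developed the illness."""
    return cases / group_size


def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """How many times more common the illness is in the exposed group."""
    return risk(exposed_cases, exposed_total) / risk(unexposed_cases, unexposed_total)


if __name__ == "__main__":
    # Hypothetical numbers: 30 of 1,000 people with the risk factor fall ill,
    # against 10 of 1,000 people without it.
    rr = relative_risk(30, 1000, 10, 1000)
    print(f"Relative risk: {rr:.1f}")
```

A ratio above 1 says the illness is more common alongside the risk factor. It says nothing about why, which is the gap between correlation and causation.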
Investing
Correlation can be used to measure the relationship between assets. These can be assets in the same class, such as Apple stock against Tesla stock, or different classes, such as bonds against stocks.
It can explain why bonds often go down when the stock market is up (negative correlation), or why gold shows little correlation with real estate. Gold, on the other hand, tends to have a positive correlation with bonds.
Alternative assets, such as gold, crypto, art, etc., are expected to correlate negatively with the stock market, since they are a form of diversification.
“The linear correlation coefficient can help determine the relationship between an investment and the overall market or other securities. It is often used to predict stock market returns.”
Conclusion
So what happened?
Whenever a siren passed, the speaker moved away from the window. In doing that, he moved closer to the loudspeakers, and this caused the screeching sound. The last time a siren passed, the microphone was with a person in the audience, so nothing happened.
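The story can be replayed as a small simulation. This is a sketch under made-up assumptions: the probabilities, the seed, and the function names are invented for illustration. The screech depends only on where the microphone is, yet it still tracks the sirens closely.

```python
import random


def simulate(trials=10_000, seed=1):
    """Return one (siren, mic_near_loudspeakers, screech) tuple per moment."""
    rng = random.Random(seed)
    rows = []
    for _ in range(trials):
        siren = rng.random() < 0.5              # assumed: a siren passes half the time
        speaker_holds_mic = rng.random() < 0.8  # assumed: otherwise the audience has it
        # The speaker steps towards the loudspeakers only when a siren passes.
        mic_near_loudspeakers = siren and speaker_holds_mic
        # Feedback depends on the microphone's position, never on the siren itself.
        screech = mic_near_loudspeakers
        rows.append((siren, mic_near_loudspeakers, screech))
    return rows


def screech_rate(rows, keep):
    """Share of the rows selected by keep(siren, near) that contain a screech."""
    selected = [screech for siren, near, screech in rows if keep(siren, near)]
    return sum(selected) / len(selected)


if __name__ == "__main__":
    rows = simulate()
    print("siren:", screech_rate(rows, lambda siren, near: siren))
    print("no siren:", screech_rate(rows, lambda siren, near: not siren))
    print("siren, mic in audience:", screech_rate(rows, lambda siren, near: siren and not near))
```

Looking only at sirens and screeches, the sirens seem guilty. Once the microphone's position is taken into account, the siren adds nothing.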
Correlation is not causation.
“As we know, there are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns - the ones we don't know we don't know.” - Donald Rumsfeld
Reading
Want to see more of my readings? Check out MtReadings.