Autocorrelation and Causation

Everyone talks about how correlation doesn’t imply causation, but no one says what autocorrelation implies.

Maybe it’s because we don’t talk about autocorrelation at all 🙂

Let’s start talking about autocorrelation by saying what it is:
First of all, autocorrelation is a concept related to time series. A time series is an ordered series of measurements taken at intervals over time. You know how we measure CPU utilization every 10 minutes and then display nice “CPU over time” graphs? That’s a time series.
Autocorrelation is the correlation between points in the same time series at different offsets (lags). So we can compare every point in the time series to the measurement taken 10 minutes later, 20 minutes later, etc. We might find, for example, that every point in our graph is strongly correlated with the measurement taken 10 minutes later and with the measurement taken 60 minutes later.
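
To make the idea concrete, here is a minimal sketch of how the correlation at a given lag could be computed. The `autocorrelation` function and the synthetic `cpu` series are my own illustration (made-up data with a built-in 60-minute cycle), not real measurements, and the formula is the usual sample autocorrelation, which is one of several reasonable estimators.

```python
import math

def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at `lag` (measured in samples)."""
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must be between 0 and len(series) - 1")
    mean = sum(series) / n
    variance_sum = sum((x - mean) ** 2 for x in series)
    if variance_sum == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    # Pair every point with the point `lag` samples later.
    covariance_sum = sum(
        (series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag)
    )
    return covariance_sum / variance_sum

# Hypothetical CPU readings taken every 10 minutes for 24 hours,
# with a cycle that repeats every 6 samples (60 minutes).
cpu = [50 + 30 * math.sin(2 * math.pi * t / 6) for t in range(144)]

for lag in (1, 3, 6):
    print(f"lag {lag * 10:>2} min: {autocorrelation(cpu, lag):+.2f}")
```

On this artificial series the 60-minute lag comes out close to +1 and the 30-minute lag close to -1, which is what a clean hourly cycle should look like. Real CPU data is noisier, so the values would be less extreme.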

But does that imply causation?
We normally don’t assume that the current value of the CPU caused the value that the CPU has in 60 minutes. It makes much more sense to assume that there is a third factor that causes the CPU to peak every 60 minutes. This effect is also called seasonality. The weather today is strongly correlated with the weather on Dec 9th 2008. The third factor in this case is the circles our planet makes around the sun.

However, the autocorrelation with the point immediately following the current value often does imply causation of sorts. If you want to make a good guess about the value of IBM stock tomorrow, your best bet is to guess that it will be the same as the value today. Stock values usually have very strong short-term autocorrelation, and we can say that tomorrow’s value is today’s value plus some error. IBM stock prices are normally stable, so the error is normally small. So you can say that tomorrow’s stock value is caused by today’s value. In a similar way, the CPU in 2 minutes can be predicted to be identical to the CPU right now.
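
Here is a rough sketch of the “tomorrow equals today plus some error” idea. Everything in it is an assumption for illustration: the prices are simulated rather than real IBM data, the error is drawn from a normal distribution, and the names (`prices`, `mean_abs_error`) are mine. It compares the “same as today” guess against always guessing the long-run average.

```python
import random

random.seed(0)  # fixed seed so the simulation is repeatable

# Simulated prices: each value is the previous value plus a small random error.
prices = [100.0]
for _ in range(499):
    prices.append(prices[-1] + random.gauss(0, 1))

def mean_abs_error(forecasts, actuals):
    """Average absolute difference between forecasts and what happened."""
    return sum(abs(f - a) for f, a in zip(forecasts, actuals)) / len(actuals)

actual = prices[1:]
naive = prices[:-1]                      # guess: tomorrow equals today
overall_mean = sum(prices) / len(prices)
baseline = [overall_mean] * len(actual)  # guess: always the long-run average

naive_mae = mean_abs_error(naive, actual)
baseline_mae = mean_abs_error(baseline, actual)
print(f"same-as-today error: {naive_mae:.2f}")
print(f"long-run-mean error: {baseline_mae:.2f}")
```

Note that the baseline is generous, since it uses an average computed over the whole series, including values the forecaster would not have seen yet. Even so, on a series built this way the “same as today” guess should come out well ahead.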

I’m hesitant to call this “causation”, because although the stock price today does cause the stock price of tomorrow (plus an error!), the “real” cause is that stock prices and CPUs behave in a specific way. On the other hand, we know that they behave in a specific way because we measured the autocorrelation, modeled it and made predictions that work. So in two important uses of causation, understanding the behavior of the thing we measured and making predictions, we can say that we have a cause-and-effect relation, albeit one that is a bit less intuitive than usual.

If you dig the idea of explaining and predicting CPU and other important performance measurements by using only the measure itself without looking for other explaining factors, then you should definitely attend my presentation about time series analysis at RMOUG. I’ll show exactly how we find autocorrelations and how to predict future values and we’ll discuss whether or not this is a useful method.


7 Comments on “Autocorrelation and Causation”

  1. Freek says:

    Hmm…
    The stock price is (normally) a representation of a company’s value or the value the speculators think it has. As (again, normally) the value of a company does not change overnight, its stock price does not either. It might appear that today’s stock price is caused by yesterday’s price, while it is in fact caused by the company’s value.
    Same thing for the cpu usage, which is caused by the processes running on the system.

    I have the feeling that this is somewhat the same as with ratios. You need to look at the values making up the ratio and not at the ratio itself.

    I would love to see your presentation. Will it become available for the unlucky who will not be able to attend?

    ps) just looked at the sessions: wauw

  2. moshez says:

    Re: stock price and causation: We have a model for why stock prices don’t change a lot. A stock value represents the future dividends of the company (theoretically). So it is driven by two factors:
    * Interest rates (which allow us to compare future money to current money)
    * Prediction of future dividends
    None of these components is likely to change that much in the course of a day, especially for an established company like IBM where its markets are more-or-less understood. So I would argue for a third causative agent in this case as well — the components that determine a price.

  3. prodlife says:

    Saying “Company value doesn’t change overnight” is exactly the same as saying “Today’s company value is the cause for tomorrow’s”. So even if you look at the underlying series, you still get the same behavior.

    I’ll post the presentation in my blog as soon as I’m done writing it. It’ll take a while, I’m just in the “throw ideas at the blog” stage.

  4. prodlife says:

    Moshez,

    Prediction of future dividends is based on current dividends, so naturally the cause for future value is the current value 🙂

  5. joel garry says:

    The only problem is… stock price modeling doesn’t work. Most of the money is made through parasitical transaction fees. The value error coefficient is best described as herding wildebeests. Sometimes they stampede.

    • prodlife says:

      If by “stock price modeling” you mean the stuff used in technical analysis – you are right and it does not work.

      The “Random Walk” model, also known as “ARIMA(0,1,0)” or “current price + noise”, is pretty well proven. It’s not a very useful model for making money, but it does a good job of describing reality.

