I first came across the term cohort analysis when I read Eric Ries’s The Lean Startup, and the concept struck me as fundamentally powerful, especially in the context of web-based products. Since the world is moving from the desktop to the cloud, it seemed like an important concept to grasp, so I sat down and began researching the topic. This post is my attempt to get the basics right.

Cohort analysis, as I understand it, is a tool that helps measure user engagement with a product over time. It helps the product team understand whether user engagement is actually getting better over time or only appears to improve because of growth.

What is a cohort?

A cohort is exactly what the word means: a group of people who share common characteristics over a period of time.

The following are examples of questions that a cohort analysis can answer:

  1. To what extent have changes to the features improved conversion rates?
  2. Does the introduction of feature X increase the likelihood that new customers will sign up for the service?
  3. Are customers acquired via email marketing more likely to make repeat purchases or be upsold than those acquired via, for example, AdWords marketing?

One reason cohort analysis is valuable is that it helps separate growth metrics from engagement metrics.

This is important because growth can easily mask engagement problems.

If you’re successfully adding lots of users to your service, your overall engagement numbers will look positive because those new users are relatively well engaged, spending lots of time on the site in the beginning. If you looked only at overall engagement numbers, you would think that your service is continuously getting stronger.

The figure below shows a graphical representation of engagement over time:



Taken at face value, the graph above says that engagement has improved steadily, with the October cohort reaching 53% engagement. The natural conclusion is that the improvements made to the product in the previous months are what produced the higher engagement at the end.

In reality, however, it may be that people stop being engaged after a couple of weeks on the service. They might leave for any number of reasons: it’s not useful, the novelty wore off, they added all their friends and now have nothing to do, etc. But the inactivity of these users is hidden by the impressive growth in new users: enough people are being added to the service that the lack of engagement from a small number of folks just doesn’t show up.
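A toy calculation makes the masking effect concrete. The sketch below is only an illustration under assumptions of my own: the signup numbers and retention percentages are made up, and every cohort behaves identically (60% of a cohort is active in its signup month, 20% in every month after that). Nothing about the product improves, yet the total number of active users climbs every month simply because signups keep doubling.

```python
# Hypothetical numbers, chosen only to illustrate the effect.
signups = [100, 200, 400, 800]   # new users per month (doubling growth)
RETAINED_PCT = [60, 20]          # % of a cohort active at age 0, then at age 1+

def retained_pct(age):
    """Percent of a cohort still active `age` months after signing up."""
    return RETAINED_PCT[min(age, len(RETAINED_PCT) - 1)]

def total_active(month):
    """Active users in `month`, summed across every cohort that exists by then."""
    return sum(
        signups[start] * retained_pct(month - start) // 100
        for start in range(month + 1)
    )

totals = [total_active(m) for m in range(len(signups))]
print(totals)  # [60, 140, 300, 620]
```

The headline number grows more than tenfold in four months, even though each cohort loses two thirds of its active users after the first month.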

This is where cohort analysis proves valuable. By bucketing people into the month (or week) they started using the service, you can keep track of their engagement over time. You can now make assessments like “the March cohort is better engaged than the February cohort” and the like. If your numbers are flat month over month, which is often the case even in the face of impressive growth, then you have not improved user engagement over that time.
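The bucketing step can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the user ids, dates, and the function names `cohort_label`, `month_index`, and `cohort_engagement` are all hypothetical, and "engaged" is assumed to mean "had at least one recorded activity in that calendar month".

```python
from collections import defaultdict
from datetime import date

# Hypothetical data: user id -> signup date, plus (user id, activity date) events.
signup_dates = {
    "u1": date(2012, 2, 3),
    "u2": date(2012, 2, 20),
    "u3": date(2012, 3, 5),
    "u4": date(2012, 3, 18),
}
activity = [
    ("u1", date(2012, 2, 4)), ("u2", date(2012, 2, 21)),
    ("u1", date(2012, 3, 10)),
    ("u3", date(2012, 3, 6)), ("u4", date(2012, 3, 19)),
    ("u3", date(2012, 4, 2)), ("u4", date(2012, 4, 9)),
]

def cohort_label(d):
    """Name a cohort after the calendar month of signup, e.g. '2012-02'."""
    return f"{d.year}-{d.month:02d}"

def month_index(d):
    """Map a date to a running month count so month differences are easy."""
    return d.year * 12 + d.month

def cohort_engagement(signup_dates, activity):
    """Return {cohort: {months_since_signup: share of the cohort active}}."""
    # Bucket every user into the month they signed up.
    members = defaultdict(set)
    for user, signed_up in signup_dates.items():
        members[cohort_label(signed_up)].add(user)

    # Record which users of each cohort were active at each cohort age.
    active = defaultdict(lambda: defaultdict(set))
    for user, day in activity:
        signed_up = signup_dates[user]
        age = month_index(day) - month_index(signed_up)
        active[cohort_label(signed_up)][age].add(user)

    return {
        cohort: {age: len(users) / len(members[cohort])
                 for age, users in sorted(ages.items())}
        for cohort, ages in sorted(active.items())
    }

table = cohort_engagement(signup_dates, activity)
for cohort, row in table.items():
    print(cohort, row)
# 2012-02 {0: 1.0, 1: 0.5}
# 2012-03 {0: 1.0, 1: 1.0}
```

Reading across a row shows how one cohort decays over time; reading down a column compares cohorts at the same age. In this made-up data, the March cohort keeps all of its users active in its second month while the February cohort keeps only half, which is exactly the kind of comparison the overall numbers cannot give you.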

An extensive study of cohort analysis is presented on Keplar’s blog: http://www.keplarllp.com/blog/2012/04/cohort-analyses-for-digital-businesses-an-overview

Here is an excellent video by Michael Hermen on how to conduct a cohort analysis using MS Excel.