Cohort Analysis and User Retention

Analysing cohorts will help you to prevent the “leakage” of users, improve marketing campaigns and increase LTV. This is the firm belief of product marketing specialist Bei Lu. She spent 7 years working on data analysis for the eBay marketplace, was Sr. Director of Analytics at Smule Inc. and worked at IMVU, a company known to all readers of the book “The Lean Startup”. Today she is Head of Analytics for AI and Robotics at Anki.

How cohort analysis will help your startup

Cohort analysis is a big part of analytics related to the customer lifecycle. It is a basic form of user segmentation, but an essential one, because it solves 3 problems at once:

1. It will help to calculate the RR (Retention Rate) and to retain users effectively.

2. It will allow you to predict the LTV (Lifetime Value) and to work on improving this indicator.

3. It will show you how to optimise marketing campaigns and product features.

First of all, let us understand what cohort analysis is. I like the definition given by Alistair Croll and Benjamin Yoskovitz in their book “Lean Analytics: Use Data to Build a Better Startup Faster”.

“Cohort analysis is a method of behavioural analytics,... which instead of analysing users as a whole, divides them into groups, or the cohorts. People in cohorts usually share common characteristics or experiences within a defined time-span. Cohort analysis helps a company to see patterns clearly across the lifecycle of a customer rather than slicing across all customers blindly, without accounting for the natural cycle that a customer undergoes.”

For a cohort, it is important that the customers take a given action at a specific point in time. What can this action be, in other words, what is the starting point from which the cohort is formed? There are 2 approaches:

1. Attraction:

  • For mobile apps: installation of the app
  • For online resources: registration on the website, or just the first visit

2. Monetisation:

  • First purchase/payment

For mobile apps, when a customer downloads the app from the store, we don’t have any information about them yet. Once the app is installed (opened), we have a minimum set of data to identify the user later on. That’s why we use the install date to form cohorts.

We form cohorts by day, although for the e-commerce business, which is less dynamic, you could use longer periods (a month or a year). Let’s try to use this method in practice.
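As a minimal sketch of how cohorts could be formed by install date, the Python snippet below groups users by the day, month or year of their first event. The function name form_cohorts, the period option names and the sample install dates are all assumptions made for illustration; they do not come from any particular analytics tool.

```python
from collections import defaultdict
from datetime import date


def form_cohorts(first_events, period="day"):
    """Group user IDs into cohorts keyed by the period of their first event.

    first_events: dict mapping user ID -> date of install (or registration).
    period: "day", "month" or "year" (hypothetical option names).
    """
    cohorts = defaultdict(set)
    for user_id, first_date in first_events.items():
        if period == "day":
            key = first_date.isoformat()        # e.g. "2024-06-01"
        elif period == "month":
            key = first_date.strftime("%Y-%m")  # e.g. "2024-06"
        elif period == "year":
            key = str(first_date.year)
        else:
            raise ValueError(f"unknown period: {period}")
        cohorts[key].add(user_id)
    return dict(cohorts)


# Made-up install dates, for illustration only.
installs = {
    "u1": date(2024, 6, 1),
    "u2": date(2024, 6, 1),
    "u3": date(2024, 6, 2),
    "u4": date(2024, 7, 15),
}

daily = form_cohorts(installs, period="day")
monthly = form_cohorts(installs, period="month")
```

With daily cohorts, u1 and u2 share a cohort while u3 forms its own; with monthly cohorts, all three June installs fall into one group.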

How to calculate Retention Rate for a cohort

For startups, it is important to grow the user base month by month. However, even if you work effectively on attracting users, the base of active users may not grow.

Imagine that in June you attracted 1 million users and in July another 500 thousand. However, when you check the total at the end of the second month, it is 800 thousand and not 1.5 million.

What could have happened? I call this problem the “leaky bucket”.  

[Figure: the leaky bucket in cohort analysis]

While you were working on attracting new customers, the leaky bucket (which contained 1 million users) “lost” 700 thousand of them. They simply became inactive. Thus, after two months you have only 800 thousand people left: 300 thousand from June plus 500 thousand from July.

It’s impossible to repair the leak completely; some users will still leave. However, you can minimise the effect by first measuring the Retention Rate for your cohorts. For that you can use a simple formula:

User Retention Rate = (R - A) / E * 100

  • E: the number of active users at the end of the previous period,
  • A: the number of users attracted during the current period,
  • R: the total number of active users at the end of the current period.

Let’s calculate for our example.

  • E (June’s cohort) = 1 million
  • A (July’s cohort) = 500 thousand
  • R (the remaining part) = 800 thousand

(800 000 - 500 000) / 1 000 000 * 100 = 30%

The result shows that you were able to keep only 30% of the users. This is a very low figure, and the goal is to increase it as much as possible. The higher the percentage is, the better it works for the business.
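The same calculation can be written as a small Python function. This is only a sketch of the formula above; the function and parameter names are assumptions chosen to mirror E, A and R.

```python
def retention_rate(start_active, acquired, end_total):
    """Retention rate in percent, following (R - A) / E * 100.

    start_active: E, active users at the end of the previous period.
    acquired:     A, users attracted during the current period.
    end_total:    R, total active users at the end of the current period.
    """
    if start_active <= 0:
        raise ValueError("start_active (E) must be positive")
    return (end_total - acquired) / start_active * 100


# Figures from the June/July example in the text.
june_july = retention_rate(start_active=1_000_000,
                           acquired=500_000,
                           end_total=800_000)
print(f"{june_july:.0f}%")  # 30%
```

Subtracting the newly attracted users first matters: without that step, new sign-ups would hide the losses from the earlier cohort.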

You can compare cohorts by day, month or year. For mobile app startups, it is important to calculate the Retention Rate on a daily basis: if a user doesn’t come back the next day, it is very unlikely that they will ever return.

Cohort development can be shown on a graph. This makes it easy to track how well you retain users and to spot which cohorts deviate from the norm.

Here is an example of such a graph (the chart and figures are hypothetical):

[Figure: cohort development on a graph]

The Y-axis shows the percentage of active users and the X-axis shows the time periods, in this case months. Note that the months are numbered rather than named. This is intentional: number 1 refers to the first month of each cohort, so for the June cohort it is June and for the July cohort it is July. This way it is easy to compare the curves.
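The alignment on relative month numbers can be sketched in Python as follows. The input format (one record per user per active month), the function names and the sample log are assumptions for illustration; a real pipeline would read these records from its own event store.

```python
from collections import defaultdict


def month_number(year, month):
    """Map a calendar month to a running integer so months can be subtracted."""
    return year * 12 + (month - 1)


def retention_curves(activity):
    """Build per-cohort retention curves aligned on relative month numbers.

    activity: iterable of (user_id, year, month) records, one per active month.
    Returns {cohort_label: {relative_month: percent_active}}, where relative
    month 1 is the cohort's own first month.
    """
    records = list(activity)

    # A user's cohort is the first month in which they were seen.
    first_seen = {}
    for user_id, year, month in records:
        m = month_number(year, month)
        if user_id not in first_seen or m < first_seen[user_id]:
            first_seen[user_id] = m

    # Collect distinct active users per (cohort, relative month).
    active = defaultdict(set)
    for user_id, year, month in records:
        cohort = first_seen[user_id]
        relative = month_number(year, month) - cohort + 1
        active[(cohort, relative)].add(user_id)

    curves = defaultdict(dict)
    for (cohort, relative), users in sorted(active.items()):
        size = len(active[(cohort, 1)])  # every cohort member is active in month 1
        label = f"{cohort // 12}-{cohort % 12 + 1:02d}"
        curves[label][relative] = 100 * len(users) / size
    return dict(curves)


# Made-up activity log, for illustration only.
log = [
    ("a", 2024, 6), ("b", 2024, 6), ("a", 2024, 7),
    ("c", 2024, 7), ("d", 2024, 7), ("c", 2024, 8), ("d", 2024, 8),
]
curves = retention_curves(log)
```

Here the June cohort keeps 50% of its users in its month 2 (July), while the July cohort keeps 100% in its month 2 (August), so the two curves can be compared point by point.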

Cohort analysis and forecasting the LTV

Forecasting revenue growth for a startup is based on a forecast of growth in the active user base. Here LTV (Lifetime Value) becomes the key indicator. It allows you to predict how much money a cohort will bring in during its lifecycle.

Imagine that you have a stable business model and a history from which you can draw data for analysis. You can assume that users will "live" on the site for 12, 36 or 48 months, and calculate the LTV for the selected period.

For startups it’s better to forecast only 2-3 months ahead; otherwise the forecast will not be reliable. After 6 months you may have lost 80% of the cohort. For startups, the LTV forecast therefore serves as a reference point that can be improved as new data comes in.

It is important to note that to measure User Retention Rate you can form cohorts by any interaction step (registration, download, installation). For User Lifetime Value, cohorts should be formed by the date of purchase only. This is a financial measure, so you can only analyse the cohorts that buy from you.

Look at the chart (the chart and figures are hypothetical). The Y-axis shows the percentage of users in the cohort who are active and bring you revenue. The X-axis shows months.

[Figure: for Lifetime Value, use purchase cohorts only]

Suppose that one transaction brings you $1. In the first month a cohort of 100 people (= 100%) was formed:

  • By the end of the first month, 65% of the users had made a purchase (= 65 people, so the revenue is $65, or $0.65 per user in the cohort);
  • In the second month, a purchase was made by 75% of the previous month's buyers (= 49 people, so the revenue is $49, or $0.49 per user in the cohort), etc.

You know how much money the cohort brings in per month, and therefore you can make a forecast for a longer period, from one to three years. I believe that 12 months is too short a period for predictions (if we are talking about a sustainable business). For a mature business, LTV is mostly forecast for 36 months.
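The arithmetic of the two-month example (a cohort of 100 people, 65% buying in month 1, 75% of those buying again in month 2, $1 per transaction) can be reproduced with a short Python sketch. The function names are assumptions, and rounding to whole buyers is a choice made here to match the "49 people" in the example; later months are not included because the text gives no figures for them.

```python
def cohort_purchases(cohort_size, first_month_share, repeat_shares):
    """Number of buyers per month for one cohort.

    first_month_share: share of the cohort that buys in month 1.
    repeat_shares: for each later month, the share of the previous
                   month's buyers who buy again.
    """
    buyers = round(cohort_size * first_month_share)
    per_month = [buyers]
    for share in repeat_shares:
        buyers = round(buyers * share)  # whole people, as in the example
        per_month.append(buyers)
    return per_month


def ltv_per_user(buyers_per_month, cohort_size, revenue_per_purchase=1.0):
    """Cumulative revenue per cohort member over the observed months."""
    total = sum(buyers_per_month) * revenue_per_purchase
    return total / cohort_size


buyers = cohort_purchases(cohort_size=100,
                          first_month_share=0.65,
                          repeat_shares=[0.75])
ltv = ltv_per_user(buyers, cohort_size=100)

print(buyers)         # [65, 49]
print(f"${ltv:.2f}")  # $1.14
```

Extending repeat_shares with observed or assumed values for later months gives the LTV for a longer horizon; the longer the list of assumed values, the less reliable the result.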

How a cohort can help to customise marketing campaigns

We have already seen that cohort analysis helps to track the customer lifecycle and to make forecasts. But that isn’t everything. The analysis needs to be connected to business strategy and operations, for example to optimising the marketing strategy.

[Figure: identifying opportunities to improve marketing, user experience and product]

Cohort behaviour will help you to understand how to plan marketing campaigns and when to release new products. The graph shows in which month customer activity drops abruptly. Using this insight, you can start reactivation at the right moment, for example by sending notifications to users to keep them on the website.

Here is an example of how cohort analysis affects business operations (the chart and figures are hypothetical).

[Figure: case study for a mobile app company]

Meaning of the hypothetical figures:

  • Positive numbers: upward trend
  • Negative numbers: downward trend

Through cohort analysis, we can see that at the end of June there was a sharp decline (-3). Our task is to analyse the situation and understand the reasons. We formulate hypotheses:

  • something is wrong with the product,
  • a change was introduced,
  • there are some problems with the website,
  • the marketing campaign wasn’t successful,
  • there are external reasons (related to users’ geography etc.)

This is a real case from my experience. As it turned out, the decline was caused by software issues. The company was growing quickly and the website was filling up with users, so it struggled to handle the extra traffic. New servers had to be installed closer to the countries with the biggest growth. After that, the indicators got back to normal.
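Spotting such a decline can be automated with a simple threshold check, sketched below in Python. The function name, the threshold of -2 and all sample values except the -3 at the end of June are assumptions for illustration; a suitable threshold depends on how noisy your own figures are.

```python
def flag_declines(trend_by_period, threshold=-2):
    """Return the periods whose trend value is at or below the threshold.

    trend_by_period: list of (period_label, trend_value) pairs, where
    positive values mean an upward trend and negative values a downward one.
    """
    return [period for period, value in trend_by_period if value <= threshold]


# Made-up trend values; only the -3 at the end of June comes from the example.
trend = [
    ("early June", 1),
    ("mid June", 0),
    ("end of June", -3),
    ("early July", 1),
]
flagged = flag_declines(trend)
print(flagged)  # ['end of June']
```

A flagged period is only a starting point: it tells you where to look, and the hypotheses listed above still have to be checked one by one.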

4 tips for effective cohort analysis

1. Identify suitable indicators to monitor based on your business model:

  • who should be considered as active users,
  • whether to analyse the cohort by day or by month,
  • which cohorts to compare.

2. Monitor and collect the data about users:

  • user identification (account or mobile device ID, cookie, digital fingerprint, IDFA, email or phone number),
  • behavioural data (how users interact with the product and react to marketing campaigns).

3. Get insights from your analysis:

  • explore the data and generate hypotheses,
  • identify key behaviour scenarios,
  • work on retention,
  • align the marketing calendar with the cohorts’ behaviour.

4. Test the ways to optimise and get better every day.
