We originally wrote this as a guest post on Seedcamp. We're republishing today, with revised text, updated links, as well as an extra section with some thoughts on our own experience.
Here's the link to the original post: http://www.seedcamp.com/2014/01/guest-post-tips-to-make-the-most-of-your-startups-data-from-andy-young-by-popcorn-metrics.html You can skip to the end if you just want our extra thoughts. Otherwise, read on... :)
*BACKGROUND: Andy Young co-founded GroupSpaces and is now helping build Stripe, splitting his time between San Francisco and London. We were lucky enough to grab him during the Seedcamp Academy for a session on startup metrics, drawn from his own experience in the metrics trenches as an entrepreneur. Given that we love metrics, we were super excited when Andy came to Seedcamp to talk about the subject!*
1. Know about BIG changes ASAP!
Andy is heavily focused on the macro level. He wants to understand what has been going on, and what is happening right now, across his whole user base.
If something significant happens, such as a spike from a press release, a site breakdown or an early warning sign, Andy wants to know about it ASAP. He does this by querying the company's internal database several times a day.
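As a rough illustration of the kind of check such a query could feed, here's a minimal Python sketch that flags any day whose visits are far above the recent average. The function name, the 7-day window, the 2x threshold and the numbers are all our own assumptions for the example, not Andy's actual setup.

```python
from statistics import mean

def find_spikes(daily_visits, window=7, threshold=2.0):
    """Return the indexes of days whose visits exceed `threshold` times
    the average of the preceding `window` days."""
    spikes = []
    for i in range(window, len(daily_visits)):
        baseline = mean(daily_visits[i - window:i])
        if baseline > 0 and daily_visits[i] > threshold * baseline:
            spikes.append(i)
    return spikes

# Made-up visit counts: a steady week, then one big day.
visits = [100, 104, 98, 101, 99, 103, 102, 350, 105]
print(find_spikes(visits))  # [7]
```

The same comparison works for drops (a possible outage) if you flip the inequality.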
To show how valuable this kind of knowledge can be, we'll share one of Andy's stories.
2. Know what's driving your traffic
This happened back when he was working on GroupSpaces – an online company that provides technology to help real world clubs, societies, associations and other groups manage their membership and activities, and promote themselves online.
At the time, GroupSpaces' visitor numbers were growing steadily, until one fine day traffic spiked hugely for a couple of weeks and then fell back to its previous level. What had just happened? Could the company be missing a huge event it could capitalize on? The only thing they knew at the time was that the traffic originated in South Africa.
A few months passed and BAM! Huge spike again! Source: South Africa. Andy couldn't just let this go. What if they were missing something big? He held his breath and dived deep into the data. The investigation concluded when they found that the source was a university course that used GroupSpaces for the duration of its classes, which is why site visits spiked and then died away once the course finished!
The insight here is that if you know who and what is driving your traffic, and why, you can choose to act on it. For example, with this information GroupSpaces could have chosen to focus on a new niche.
To get insights like this, it's not enough to look at real-time data; you also need to be able to access historical data and run comparisons.
3. Different views, Different insights
One of the biggest points Andy raised is getting the right information from the data.
Imagine a scenario for a growing online sales business, where you take a set of data from October to March and sample it in different ways – one sample with the monthly sales and the other the daily sales. Those two samples will give fairly different insights:
With the monthly sales view you get an upward trend, and the insight is that the business is growing steadily. With the daily sales view, instead of the smooth upward trend you see the daily spikes and a huge gap in sales in January (after Christmas). This view shows you the spikes and dips in sales during different periods, which are hidden in the monthly view.
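To make the two views concrete, here's a small Python sketch that rolls the same daily figures up into monthly totals. The function name and the sales numbers are invented for illustration.

```python
from collections import defaultdict
from datetime import date

def monthly_totals(daily_sales):
    """daily_sales: {date: amount}. Returns {'YYYY-MM': total}, sorted by month."""
    totals = defaultdict(float)
    for day, amount in daily_sales.items():
        totals[day.strftime("%Y-%m")] += amount
    return dict(sorted(totals.items()))

# Made-up figures: busy days before Christmas, a quiet start to January,
# then a strong end to the month.
daily = {
    date(2013, 12, 23): 900.0,
    date(2013, 12, 24): 1400.0,
    date(2014, 1, 2): 150.0,
    date(2014, 1, 30): 2400.0,
}
print(monthly_totals(daily))  # {'2013-12': 2300.0, '2014-01': 2550.0}
```

The monthly totals say "growing", while the daily figures show the post-Christmas slump on 2 January that the monthly view hides.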
A way to get the benefits of both approaches is to use a 28-day rolling total: for each day, the value displayed is the sum of that day's sales plus the sales from the previous 27 days, 28 days in all. This reduces the noise but still shows you the trends and the relevant spikes.
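Here's a minimal Python sketch of a rolling total, assuming a plain list of daily sales in date order. We read a 28-day window as the current day plus the 27 days before it; the function name and the sample numbers are our own.

```python
from collections import deque

def rolling_total(daily_sales, window=28):
    """For each day, sum that day's sales plus the previous window-1 days.
    Early days use however many days are available so far."""
    totals = []
    recent = deque()
    running = 0
    for value in daily_sales:
        recent.append(value)
        running += value
        if len(recent) > window:
            # Drop the day that has just fallen out of the window.
            running -= recent.popleft()
        totals.append(running)
    return totals

# A short window keeps the example readable; use window=28 in practice.
sales = [10, 0, 30, 5, 20]
print(rolling_total(sales, window=3))  # [10, 10, 40, 35, 55]
```

A 28-day window always covers exactly four of each weekday, so weekly patterns don't show up as noise in the total.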
The point here is that different views on the data will give different insights on your business. Always try different approaches to get the big picture.
4. On cohort analysis
A cohort is a group of people who share a common characteristic or experience within a defined period. For example, you might take the users who signed up in January as one group and those who signed up in February as another, and then follow each group's behaviour over the following months.
Andy used cohort analysis to find out what percentage of the users who signed up in a given month were still active one or two months later.
The point here is not so much to understand who the customer is or where they come from, but to know how the changes made to the product in a given month affected the behaviour of the subsequent cohorts.
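As a sketch of the calculation, the Python below groups users by sign-up month and works out what percentage of each cohort was active a given number of months later. The data shapes, names and sample users are assumptions for the example; your own database will look different.

```python
from collections import defaultdict

def month_index(ym):
    """Turn 'YYYY-MM' into a month count so months can be compared."""
    year, month = ym.split("-")
    return int(year) * 12 + int(month)

def retention_by_cohort(signups, activity, months_after=1):
    """signups: {user: 'YYYY-MM'}; activity: iterable of (user, 'YYYY-MM').
    Returns {signup_month: % of that cohort active `months_after` later}."""
    active = {(user, month_index(month)) for user, month in activity}
    cohort_size = defaultdict(int)
    retained = defaultdict(int)
    for user, signup_month in signups.items():
        cohort_size[signup_month] += 1
        if (user, month_index(signup_month) + months_after) in active:
            retained[signup_month] += 1
    return {m: 100.0 * retained[m] / cohort_size[m] for m in cohort_size}

# Made-up users: two January sign-ups and two February sign-ups.
signups = {"ann": "2014-01", "bob": "2014-01", "cat": "2014-02", "dan": "2014-02"}
activity = [("ann", "2014-02"), ("bob", "2014-03"), ("cat", "2014-03"), ("dan", "2014-03")]
print(retention_by_cohort(signups, activity))  # {'2014-01': 50.0, '2014-02': 100.0}
```

If a product change shipped in February, comparing the January and February figures is the kind of before-and-after view described above.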
Andy raised some issues with the cohort analysis: "It’s an inflexible, time consuming task and there is a delay to get the latest data (you need to wait the period of the cohort to end before you analyze it)."
If, like Andy, you are working from your own database, he found a good way to solve this:
- Use automation to build and run the queries every hour, which will save you time.
- Store the results in a simple database.
- Create a page to graph the results (a tool like HighCharts will help you build the graphs).
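The steps above could be sketched like this in Python, using SQLite as the "simple database". The table name, column names and metric name are hypothetical, and the scheduling itself (for example an hourly cron job that computes the value and calls `record_metric`) is left out.

```python
import sqlite3
from datetime import datetime, timezone

def init_store(conn):
    """Create the results table if it doesn't exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS metric_results ("
        "metric TEXT, computed_at TEXT, value REAL)"
    )

def record_metric(conn, metric, value, now=None):
    """Store one computed value; call this from the hourly job."""
    now = now or datetime.now(timezone.utc)
    conn.execute(
        "INSERT INTO metric_results VALUES (?, ?, ?)",
        (metric, now.isoformat(), value),
    )
    conn.commit()

def series_for_graph(conn, metric):
    """Return (timestamp, value) pairs in time order, ready for a chart."""
    rows = conn.execute(
        "SELECT computed_at, value FROM metric_results "
        "WHERE metric = ? ORDER BY computed_at",
        (metric,),
    )
    return rows.fetchall()

# In-memory database and made-up values, just to show the flow.
conn = sqlite3.connect(":memory:")
init_store(conn)
record_metric(conn, "active_users", 120, datetime(2014, 1, 6, 9, tzinfo=timezone.utc))
record_metric(conn, "active_users", 135, datetime(2014, 1, 6, 10, tzinfo=timezone.utc))
series = series_for_graph(conn, "active_users")
print(series)
```

The graph page then only reads pre-computed rows, so it stays fast however heavy the underlying queries are.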
If you are using advanced web analytics tools like mixpanel and KISSmetrics, you can use their cohort analysis tools to show how many users with characteristic X did action Y on your website, and then came back and did Z over a period of time.
5. Andy’s don’ts with data:
- Don't use stale data: Don't rely on outdated information once the criteria behind it have changed.
- Don't forget context: Don't use just one view of the data. The sample needs to be seen in the right context. If you evaluate an online business's sales growth for January, the result will almost certainly be negative compared with the previous month, because of Christmas. You are better off doing a cohort-style comparison: measure over multiple windows and compare them.
- Don't calculate by hand: Don't do manual calculations, as this will just drive you mad. Do it by hand only to learn what data you need and how to get it. Then automate it.
6. Andy’s Do’s with data:
- Do Graphs - These give you a quick overview of what is happening and an easy way to compare with other sets of data. Besides HighCharts, you can also use Chart.js.
- Do Automate - This will pay off hugely in the long run: fewer headaches and quicker access to information. If you don't want to code it yourself, use a tool like Ayehu.
- Do Real-time - This lets you know what is happening on your website right now. Its use is quite specific, since you shouldn't base long-term decisions on short-term trends. The benefit is that you can react quickly to an unexpected event, like a server breakdown.
7. Key takeaways:
If we could take only 3 key insights from Andy’s knowledge, they would be the following:
- Use real time analytics to spot critical issues.
- Leverage cohort analysis to get the trends and measure the impact of the changes you make.
- Look at the same data from different angles, as it will provide different insights.
As we said, at Popcorn Metrics we live and breathe Metrics & Analytics, so our opinion may be a little biased. But trying to grow your business without the information you need is as good as driving blindfolded. Knowing what is happening will empower you to make confident decisions and push your business forward.
Since we wrote the original post, we've been wrestling with some of these challenges of our own.
1. RETENTION: Specifically we've found it hard to track some key data, such as how frequently a user returns. Our product tends to be very much set-up and forget (until you want to track something else), and we use mixpanel to understand usage of our Tracking Editor tool. Mixpanel is a great product (we LOVE it!) but without writing code we haven't yet found a great way to automate this. We can see when a user last visited, but that's different from frequency. (We'd like to easily see who are the users that are engaging the most.) So this is something we'll probably build into Popcorn Metrics.
2. COHORTS: We've also found this difficult at the startup stage, because unless you have a high and constant stream of new users it's hard to measure the effect of individual changes. We also tend to release multiple times per day, so our product evolution is constant. The way we're solving this right now is to work in batches. We have a small but steady stream of new sign-ups, but when we do an exposure event like BetaList or erlibird we tend to stagger the events, and push a big bunch of changes before each one (based on everything we've learned since the previous cohort).
3. 28 DAY ROLLING TOTALS: We LOVE this as a concept, and it fits REALLY well with a product that is in constant evolution. At the moment we're in the process of implementing this internally before potentially rolling it out to our core product.
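On the retention point above, here's a hypothetical Python sketch of the difference between "last visited" and "how often". It assumes you can export visits as (user, date) pairs; the function name and sample data are invented, and this isn't something Popcorn Metrics or mixpanel provides as written.

```python
from collections import defaultdict
from datetime import date

def visit_frequency(visits):
    """visits: iterable of (user, date). Returns, per user, a tuple of
    (number of distinct days visited, most recent visit date)."""
    days = defaultdict(set)
    for user, day in visits:
        days[user].add(day)
    return {user: (len(d), max(d)) for user, d in days.items()}

# Made-up visits: bob visited most recently, but ann comes back more often.
visits = [
    ("ann", date(2014, 3, 1)),
    ("ann", date(2014, 3, 1)),  # second visit on the same day counts once
    ("ann", date(2014, 3, 8)),
    ("bob", date(2014, 3, 9)),
]
print(visit_frequency(visits))
```

Sorting users by the first number rather than the second surfaces the ones who are engaging the most.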
Want to get started tracking your own site?
If you want to get started tracking user events on your own website into Google Analytics, mixpanel, KISSmetrics, etc., feel free to check out our visual events tracker tool.
I'd love to know what challenges you're experiencing.