How North Star Metrics Can Lead You South
What is a North Star Metric (NSM)?
Let’s start with a definition:
The North Star Metric is the single metric that best captures the core value that your product delivers to customers.
- What is a North Star Metric
It serves to unify a company or department to produce a single result that everyone can concentrate on.
North Star Metrics for growth
There are several great articles on how to create or choose an NSM and I have no intention of recreating those articles. [1][2]
Generally, NSMs are used to inspire hyperfocus on growth (though they don’t have to be). In some cases, you could say the NSM defines growth. As you may already know, I’m a Y Combinator fanboy, so you’ll have to excuse me for referencing a few of their articles. Startup = Growth by Paul Graham is a great one and explains this concept well.
One of their key concepts is “you make what you measure.” [3] This is vital to startups, but people seem to stop reading there and miss the corollary: “be careful what you measure.”
I want to bring another perspective. The inspiration of this article came when I was reading about companies who intentionally made a choice not to grow in the book: Small Giants: Companies That Choose to Be Great Instead of Big by Bo Burlingham. I then found that Forbes was also inspired by this book and has a section dedicated to them: Forbes — Small Giants.
What these Small Giants concentrated on (their NSM, in effect) stood in stark contrast to what the various growth-oriented startups I had studied concentrated on.
To clarify, I love hypergrowth startups, but a mistake that I see many of these startups making was never made by a Small Giant. Before we get to what they did we need to back up and talk about qualifying growth.
Qualified growth
I’m using a less common definition of “qualify”, so I thought I would define it as I am using it:
“to reduce from a general to a particular or restricted form”
When I say “qualify” growth, I mean to make your NSM less general and more specific to the core value that your product delivers.
I’ll use Doordash as an example. I’m not affiliated with them nor do I have any particular insight into their company beyond what they’ve said publicly.
A reasonable NSM for Doordash could be: “number of orders/week”.
This would probably be great when they were first starting. They go from 0 orders to hundreds within a few months and possibly thousands of orders a few months after that.
Isolated responsibilities
When the company consisted of only the founders, they were probably doing a lot more than just processing orders and inherently knew the importance of the different parts of the company. Doordash had 4 founders, and you can bet pretty heavily that the people who were building the website, contacting restaurants, and delivering the food were at least in close communication with each other, if not fulfilling several of these responsibilities at once.
As they started to grow and people were given isolated responsibilities, pursuing this NSM would have led to serious trouble.
For instance, perhaps they got a new Marketing Manager and he is able to get a promotion in a new, neighboring town for Doordash. He did, fortunately, remember to work with the Tech department to make sure they had restaurants in this new town. The promotion for this town makes the metric look great!
Shortly after, customer support starts receiving complaints, and they realize they didn’t have enough delivery drivers (or any) in that area to meet the demand. Marketing was just told to increase the metric and, in the department’s eyes, it succeeded.
They end up having to refund hundreds of orders because they’re unable to fulfill them. They lose thousands of dollars, several of the restaurants leave the platform, and terrible new Yelp reviews appear. The isolated drive to “get more orders”, unqualified and pursued without awareness or discussion, leads them to disaster.
Gamification
When you don’t properly qualify metrics people will turn them into a game. You tell them you need more Instagram followers and they buy a bunch of followers, possibly truly believing that it will lead to real followers.
Mission accomplished!
Or not. This might sound like an episode of Silicon Valley, but I can tell you from personal experience in the Valley that the show is less of an exaggeration than many believe.
You might be inclined to think, “sure, but that’s only the dishonest employees”. Let me tell you, give someone a metric and apply pressure to them to hit that metric and you get some screwy results. You could propose, alternatively, not to apply that pressure. I would agree with you, but I also think that the likelihood of having a continuously pressure-free environment is very low.
Real-world example: Zenefits. Its CEO stepped down under pressure after it emerged that he had developed a program that essentially allowed people to cheat on their state-required licensing, all in the service of growth. What do you think his NSM was? It obviously didn’t include quality control.
This brings us to the solution.
Qualify the metrics
Please, please, qualify your metrics. I cannot emphasize this point enough.
Iterate over this step until you’re sure that the metric expresses the complete value that your product delivers (see the definition of an NSM above).
Qualifying a metric usually means adding a level of quality control to the raw statistic.
This is where the Small Giants were different. Their NSM was based around quality rather than growth. You see something more along the lines of “number of happy customers who ate at our restaurant/week” than “number of sandwiches made/week”. It’s this hyperfocus on quality of the customer and quality of their employees that makes all the difference.
Going back and altering our Doordash example: “number of orders delivered/week.”
This is better and would have certainly avoided the disaster seen earlier as it forces several departments to work together for it to go up. Because you’re now concentrating on delivered, you must interact with the Order Delivery department and then they also have to interact with the Driver Acquisition department to make sure that they are able to fulfill expected orders.
This will still fall over in a real-world scenario if it’s the only or main statistic you’re paying attention to. It completely misses anything that relates to retention. You could discover that you have 10% retention, or in other words, 9/10 people who order from you never order again.
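To make the retention arithmetic concrete, here is a minimal sketch in Python. It assumes that “retention” simply means the share of customers who ever place a second order; the function name and the sample data are hypothetical and only there for illustration.

```python
from collections import Counter


def repeat_rate(order_customer_ids):
    """Share of customers who placed more than one order.

    `order_customer_ids` holds one customer ID per order placed
    (a hypothetical input format chosen for this sketch).
    """
    orders_per_customer = Counter(order_customer_ids)
    if not orders_per_customer:
        return 0.0
    repeaters = sum(1 for n in orders_per_customer.values() if n > 1)
    return repeaters / len(orders_per_customer)


# Hypothetical data: ten customers, and only "c1" ever orders again.
orders = ["c1", "c1"] + [f"c{i}" for i in range(2, 11)]
print(repeat_rate(orders))  # 0.1, i.e. 9 out of 10 customers never come back
```

A raw “orders delivered/week” count would look the same whether those eleven orders came from loyal customers or from one-time visitors, which is exactly the blind spot described above.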
This means marketing might be doing “well” or using gimmicks, like a plethora of press releases, and getting people excited, but the product is lacking.
Protocol development discussed below
Iterate
You find that the delivery estimate time window is being missed by 50%. Change it again: “number of orders delivered on time/week”.
Retention is up to 30%. Better, but something is still wrong. You check your NPS score and it’s at 20 (very low). NPS is often correlated with customer happiness, so you come up with:
“Number of orders delivered on time to satisfied customers/week”
This might be the end or you might need to iterate again. But now you have so many more departments tied in. Perhaps you were hiring terrible drivers who were throwing the food around or treating your customers disrespectfully, and you’ve now tied hiring into your statistic. You cannot have satisfied customers (a lagging metric) without discussing your needs and expectations of the drivers and how they pick up food. You’ve also tied in delivery time estimates for the restaurants and assurance of delivery.
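As a rough illustration of how each round of qualification narrows the number, here is a small Python sketch. The `Order` record and its three flags are assumptions made for this example (in particular, `satisfied` stands in for whatever survey or rating signal you would really use); this is not how Doordash measures anything.

```python
from dataclasses import dataclass


@dataclass
class Order:
    delivered: bool   # did the order reach the customer at all?
    on_time: bool     # did it arrive within the estimated window?
    satisfied: bool   # hypothetical flag, e.g. from a post-delivery survey


def weekly_metrics(orders):
    """Count one week's orders at each level of qualification."""
    return {
        "orders/week": len(orders),
        "orders delivered/week": sum(o.delivered for o in orders),
        "orders delivered on time/week": sum(
            o.delivered and o.on_time for o in orders
        ),
        "orders delivered on time to satisfied customers/week": sum(
            o.delivered and o.on_time and o.satisfied for o in orders
        ),
    }


# Hypothetical week of ten orders.
week = (
    [Order(True, True, True)] * 3        # the orders you actually want
    + [Order(True, True, False)] * 2     # on time, but the customer was unhappy
    + [Order(True, False, False)] * 3    # delivered late
    + [Order(False, False, False)] * 2   # never delivered, refunded
)

for name, value in weekly_metrics(week).items():
    print(f"{name}: {value}")
# orders/week: 10
# orders delivered/week: 8
# orders delivered on time/week: 5
# orders delivered on time to satisfied customers/week: 3
```

The same week reads as 10 or as 3 depending on how the metric is qualified, and each extra condition pulls in another department that has to do its part for the number to move.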
Isolated departments
This is the second most important part of this article and a little bit of a tangent.
I mentioned isolated responsibilities above, but I didn’t go into it much. Having crazy amounts of isolation or being “modular” in your approach to departments seems like someone trying to apply programming architecture principles to business without fully understanding how it works.
In programming, modular design is often thought of as ideal design. This means that something works completely on its own and can be swapped out for something else.
To give a real-world example, think of an email client. If you use Gmail you can use their online interface, or you can use Outlook or Apple Mail or Thunderbird. They can be swapped at any point without affecting how your mail is sent or received. This works because there are very detailed protocols for how email works that they all have agreed to use.
Most departments are not this modular or well defined.
Using our example, let’s say Doordash has a department that hires its delivery drivers. Could you just completely transfer or let go of everyone in this department and hire an outside firm to do your hiring for you? You would probably find that culture starts changing, a different level of skill comes in (higher or lower), people coming in are less trained in their specific roles and you probably get more or fewer people than you need. Unless you define the protocol.
Protocol has a few different meanings. The definition I am using is:
A protocol is an agreement between departments to define needs and expectations or communication channels (to impart these needs) that can be used by each department.
In the beginning stages of a company, you are often aware of everyone’s needs because the team is so small. It’s hard to define protocols because everything is changing so fast: the needs or expectations of a department can vary drastically from week to week.
So, you focus on communication between departments so that as things change the departments are each kept aware and can adapt as needed.
As parts of your operation or agreements become “stable,” protocol is developed.
Growth, then, needs to be balanced by high inter-department communication and protocol creation.
Returning to North Star Metrics, a department’s NSM must be qualified by the protocols or agreements made between departments. We’ll continue to use Doordash as an example.
- Driver team: “number of drivers hired/month”
Pressure could easily lead to hiring bad drivers, who then affect other departments or customers. Qualify it: “number of drivers hired maintaining a 4.5-star rating after 3 months/month”. Note that this is a lagging metric, so it wouldn’t catch a problem immediately.
- Restaurant team: “restaurants delivering food/month”
This is better because it takes into account restaurants that stop using the platform. That being said, it could still use a quality control that ties departments together: what if drivers have to wait 30 minutes every time for food, and the driver gets a low rating because the customer can’t tell whether a delay came from the driver or the restaurant? Let’s try again: “restaurants delivering food within 5 minutes of their estimated time/month”.
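The lag in the qualified driver metric is easier to see in a sketch. The Python below assumes that “maintaining a 4.5-star rating after 3 months” means the driver’s rating at the 3-month mark is at least 4.5; the `Driver` record, its fields, and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Driver:
    hired_on: date
    # Rating at the 3-month mark. None if the driver has not reached it yet
    # or left before then (an assumption made for this sketch).
    rating_at_3_months: Optional[float]


def qualified_hires(drivers, year, month, min_rating=4.5):
    """Return (raw hires, qualified hires) for one hiring month."""
    cohort = [
        d for d in drivers
        if (d.hired_on.year, d.hired_on.month) == (year, month)
    ]
    qualified = sum(
        1 for d in cohort
        if d.rating_at_3_months is not None
        and d.rating_at_3_months >= min_rating
    )
    return len(cohort), qualified


# Hypothetical January cohort, plus one February hire that is excluded.
drivers = [
    Driver(date(2024, 1, 3), 4.8),
    Driver(date(2024, 1, 10), 4.6),
    Driver(date(2024, 1, 17), 3.9),   # hired, but rated poorly
    Driver(date(2024, 1, 24), None),  # left before the 3-month mark
    Driver(date(2024, 2, 5), 4.9),
]

print(qualified_hires(drivers, 2024, 1))  # (4, 2)
```

The raw metric reports 4 hires in January right away; the qualified metric reports 2, but only once April arrives. That delay is the price of the quality control, so it is worth pairing with faster signals in the meantime.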
Summary
North Star Metrics (NSMs) are vital to a company’s growth but should be qualified with quality assurance. This becomes more important and apparent when a company is growing and departments get their own isolated NSMs.
It is useful to iterate over the process until you find a metric that requires departments to work together, encourages them to develop communication channels and protocols, and captures the core value of your product.
There are many real-world examples of startups that failed or blundered because they pursued growth in an unsustainable/incorrect way. I challenge you to figure out if it came from an unqualified metric and what that metric was.
References
[1] What is a North Star Metric by Sean Ellis
[2] Every Product Needs a North Star Metric: Here’s How to Find Yours by Sandhya Hegde
[3] Startups in 13 Sentences by Paul Graham — Sentence 7
Thanks to Nicole Elaine and Larry Jones for reading the draft and providing editorial feedback.