Of course, the standard answer to the above question is, “It depends.” And that’s true. A story point is an arbitrary unit of measure that represents the amount of complexity, effort, and uncertainty in a bit of work. What a single point means can vary from team to team. However, once a team settles on what one point represents, that remains fixed. So far, so good.

When I was first introduced to agile, my team was defining story points in terms of a loose relationship with time. They did so because they were new to agile themselves, having embraced it in the middle of a thoroughly waterfall culture. They were in uncharted waters, pardon the pun. I don’t fault them for using time as their scale. It’s by far the easiest way to quantify a story point for someone who is new to the concept. The problem, as I see it, is that they were doing so in a way that didn’t scale.

Their metric for one point was one developer for one day. Sounds reasonable. But their metric for a 13-point story, the biggest that they would accept in a sprint, was two developers for the entire sprint. The problem here is that the math doesn’t work out. When I raised this concern with the team, the responses were mixed. Some people thought it made sense; others said something to the effect of, “I’m not sure story points scale like that.” I did some Googling around, but found that sources were just as divided as my team. However, as time went on, I became more and more certain that I was right. Our velocity was all over the place. One sprint we couldn’t finish our commitments, and the next we would finish our work before the sprint was half over.
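To make the mismatch concrete, here is a minimal sketch of the arithmetic. The sprint length is my assumption, since I haven't stated it above: I'm using a two-week sprint with 10 working days. The function and variable names are purely illustrative.

```python
# Assumption: a two-week sprint with 10 working days.
# The sprint length is not stated in the post; change it to match your team.
SPRINT_WORKING_DAYS = 10

# From the post: one point = one developer for one day.
POINTS_PER_DEVELOPER_DAY = 1


def implied_points(developers: int, days: int) -> int:
    """Points a story would get if the one-point definition scaled linearly."""
    return developers * days * POINTS_PER_DEVELOPER_DAY


# From the post: a 13-point story = two developers for the entire sprint.
labeled_points = 13
linear_points = implied_points(developers=2, days=SPRINT_WORKING_DAYS)

print(f"labeled: {labeled_points}, implied by the 1-point definition: {linear_points}")
print(f"effort per labeled point: {linear_points / labeled_points:.2f} developer-days")
```

Under that assumption, a point on a 13-point story stands for roughly one and a half developer-days, while a point on a 1-point story stands for exactly one. The same unit means different things at different sizes, which is what I mean by points that don't scale.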

Initially, as we were dealing with this instability, I was a developer on the team. I eventually took over as scrum master. At that time, the product owner was becoming frustrated by our unstable velocity. He couldn’t predict anything. His cone of uncertainty was enormous. I recognized that the problem was twofold: points that don’t scale and stories that are too large.

Since the team couldn’t settle on a scale for points, I started pushing the product owner to split stories smaller and smaller. We quickly got to where we rarely had a story that took more than a couple of days. This had two benefits. The first was nothing surprising: smaller stories are easier to estimate and, consequently, their estimates are more accurate. The second was, as I see it, more interesting. Eliminating those larger stories had the unexpected side effect of making our points scale better. The scale had become more and more skewed as stories got larger. When we split those stories, a 13-point story invariably turned into 20 points’ worth of smaller stories. Things began to stabilize.

And there was much rejoicing.