Tuesday, March 20, 2012


Of all the agile topics I have listened to, none seems to generate as much passion and emotion as the topic of points. What is a point? How is a point calculated? Can you compare points across teams?

A typical reaction to standardizing the definition of a point is an emotional one. Many teams do practice relative sizing for point estimates: if feature X was 1 point, what is the estimate for feature Y? Usually feature Y is estimated as a multiple of feature X. Typical point values are 1, 2, 3, 5, 8 and so on (it is common to use a Fibonacci or geometric progression when estimating).

The approach I have taken with points goes against conventional thinking. Oftentimes, the initial reaction is to treat it as apostasy. However, once cooler heads prevail, much of the resistance disappears.

To begin with, points are a means to estimate. They are not a measure of productivity. They are a means to size the effort a card requires from a pair on a team. A team measures its ability to do work in points and uses the points it completed in the last sprint as a baseline to plan its capacity for the next sprint. As each card is estimated in points, it is tallied against the total points the team completed in the previous sprint. That total is the team’s current capacity. Typically, a team will plan a few additional cards beyond its capacity in case things go better than they did in the previous sprint.
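
As a rough sketch of that tallying step, the Python below walks a prioritized backlog and commits cards until last sprint's total would be exceeded, then sets aside a stretch card. The function name, the card names and the `stretch_cards` knob are all hypothetical; the only rule taken from the text is that last sprint's completed points act as the capacity.

```python
def plan_sprint(backlog, last_sprint_points, stretch_cards=1):
    """Tally cards (highest priority first) against last sprint's points.

    backlog: list of (card_name, points) tuples.
    stretch_cards: assumed number of extra cards planned beyond capacity;
    the text only says "a few", so this value is illustrative.
    """
    committed, stretch, total = [], [], 0
    for name, points in backlog:
        if not stretch and total + points <= last_sprint_points:
            # Card still fits within last sprint's completed points.
            committed.append(name)
            total += points
        elif len(stretch) < stretch_cards:
            # Capacity reached: keep the next card(s) as stretch work.
            stretch.append(name)
        else:
            break
    return committed, stretch, total


# Made-up backlog; the team completed 8 points last sprint.
backlog = [("login", 3), ("search", 2), ("export", 3), ("audit log", 2)]
committed, stretch, total = plan_sprint(backlog, last_sprint_points=8)
print(committed, stretch, total)
```

Keeping the stretch cards in a separate list makes it clear which cards the team is committing to and which it will only pick up if things go better than last time.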

Some state that points are an estimate of complexity. Most insist that points are not tied to time. If you suggest that a point is tied to a block of time, “them are fightin' words!” Perhaps it is because time was used with waterfall estimates and we hate to tie an agile estimate to a block of time. I say it is because time was often used as a means to “whip teams”. Consequently, we avoid the time-estimate experience by using points.

However, if you ask a team how many points they completed in the last iteration, they can tell you. If you ask how many pairs the team had for the last iteration, they can also tell you that. If you ask the team for the length of the iteration, you will get an answer. From there, figuring out the time it takes a pair to do a point is a simple calculation.

Example: take a team with two pairs of developers (assuming a right-sized team that also includes business analysts and automated testers along with the Scrum Master and Product Owner) and a two-week iteration (9 business days for development, 1 day for show and tell, sprint planning and retrospective). Assume the team completed 18 points. You can then perform the following calculation:
1 point = 9 days / (18 points / 2 pairs) = 1 day
That is, it took one pair one day to complete one point. (Yes, the example is contrived to make the math simple, but it works with any number of points completed by a team.)
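
The same arithmetic can be written as a small Python function so it can be rerun with any team's numbers. The function and parameter names are my own; the formula is the one from the example above.

```python
def pair_days_per_point(dev_days, pairs, points_completed):
    """Days of one pair's effort behind each completed point.

    Equivalent to dev_days / (points_completed / pairs).
    """
    return dev_days * pairs / points_completed


# The contrived example: 9 development days, 2 pairs, 18 points completed.
print(pair_days_per_point(dev_days=9, pairs=2, points_completed=18))
```
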
Once I realized that it was a myth that points are not tied to time (time is a law of nature and the fourth dimension we all live in, even on agile teams), I decided to embrace it, not to use as a whip or to intimidate developers, but to improve the team’s ability to plan. I tried the following experiment.

I asked the team to set its relative size for a point. I asked the team to recall a card that took one day to complete. With that card in mind, I asked the team to set the card's relative size to one point. With a common understanding of a point, the team then played planning poker. The other constraint the team used was that we would only work on cards that were estimated at one, two or three points. If a card was five points or more, we decided the card was not yet defined and would need to be decomposed further before it was ready to be estimated. Thus, a one-point card equated to one day of work for a pair, a two-point card to two days, and a three-point card to three days. By keeping all cards small (one, two or three points), we improved the team's ability to estimate via planning poker.
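
A minimal sketch of that constraint in Python, assuming estimates arrive as a simple mapping from card name to points. The function name and the sample cards are invented for illustration; the three-point ceiling is the rule described above.

```python
def triage_cards(estimates, max_points=3):
    """Split estimated cards into ready cards and cards needing decomposition.

    estimates: dict mapping card name -> planning-poker points.
    max_points: largest size the team will work on (three in the text).
    """
    ready = {}
    needs_decomposition = []
    for card, points in estimates.items():
        if points <= max_points:
            # With one point normalized to one pair-day, the point value
            # also reads as the estimated number of days for a pair.
            ready[card] = points
        else:
            needs_decomposition.append(card)
    return ready, needs_decomposition


# Made-up estimates from a planning poker session.
estimates = {"reset password": 1, "search filters": 3, "billing rewrite": 8}
ready, needs_decomposition = triage_cards(estimates)
print(ready, needs_decomposition)
```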

Note: the team was not asked to estimate based on time, but rather to normalize a point for the team (and across teams) by recalling recent cards that took about a day. There are no rules on setting relative size. A team or teams can set relative size to anything they choose. Coming up with a common definition of the one-point relative size across teams enabled the teams to keep the value of a point from drifting and provided some consistency in what a point meant for each team and across all teams.

Having taken this approach with over a dozen teams for a couple of years, I observed the following. If one point was the work that one pair could do in a day, it could be calculated that a pair could complete 9 points during the two-week iteration. If the team had three pairs, the calculation for the total team would be 27 points (i.e., 3 pairs, times 9 days, times 1 point per pair per day). However, in practice, that is not what I observed.

Using the above assumptions, instead of hitting the points as calculated above, the teams would consistently come in at about 80% of their calculated points. So in the above example, instead of completing 27 points, the team would typically complete about 22 points. The 80% rule appeared to apply to teams of two, three, four or five pairs (we did not have any one-pair teams).

To introduce terms, the calculated points are called “ideal points”. The roughly 80% of ideal points that a team completes are its “actual points”. We will use these terms again.
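
Both terms are easy to put into a short Python sketch. The function names are my own, and the 0.8 default is simply the ratio I observed, not a constant that every team will match.

```python
def ideal_points(pairs, dev_days, points_per_pair_day=1.0):
    """Calculated capacity: pairs x development days x points per pair-day."""
    return pairs * dev_days * points_per_pair_day


def expected_actual_points(ideal, ratio=0.8):
    """Points a team is likely to complete, using the observed ~80% rule."""
    return ideal * ratio


# Three pairs and nine development days, as in the example above.
ideal = ideal_points(pairs=3, dev_days=9)
actual = expected_actual_points(ideal)
print(ideal, round(actual))
```

For three pairs this gives 27 ideal points and, after rounding, the "about 22" actual points mentioned above.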

Now remember, a point is an estimate. It is not a measure of an absolute quantity of work that a team can do. For example, two teams can estimate the same backlog of cards, and their estimates will likely be different. Each team takes into account its skill level, the complexity of the environment and technology, and the difficulty of the features. For a given team, one point is that team's estimate of the work one of its pairs can do in a day. This gives the team complete control over its estimates and enables it to consider the critical factors needed to complete the work. This is essential for a self-directed team. A self-directed team owns its estimates. One team's estimates can never be used by another team for its estimates.

Once I understood the relationship between an established team’s “ideal points” and its “actual points”, I would use the same approach for new teams. What I discovered was that a new team would typically take between 3 and 5 iterations to hit its “actual points”. I was able to apply the three-to-five-iteration ramp-up to full capacity (a team’s “actual points”) across a number of new teams. It worked surprisingly well and proved to be very useful as a planning tool.
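
For planning purposes the ramp-up can be sketched as a per-iteration target. Note the loud assumption here: I only observed that new teams reached their actual points within three to five iterations, so the straight-line shape of the ramp below is an illustrative guess, and the function name and defaults are hypothetical.

```python
def ramp_up_plan(actual_points, ramp_iterations=4):
    """Planned points per iteration for a new team.

    ASSUMPTION: capacity grows linearly until it reaches actual_points in
    the final ramp iteration. Only the 3-to-5 iteration range comes from
    observation; the linear shape is illustrative.
    """
    if not 3 <= ramp_iterations <= 5:
        raise ValueError("observed ramp-ups took 3 to 5 iterations")
    return [
        round(actual_points * i / ramp_iterations, 1)
        for i in range(1, ramp_iterations + 1)
    ]


# A three-pair team expected to settle at about 22 actual points.
plan = ramp_up_plan(actual_points=22, ramp_iterations=4)
print(plan)
```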

Furthermore, under crunch, I noticed that an established team’s “actual points” might exceed 80% of its “ideal points”. However, it came at a cost. Such teams would express weariness and did not believe they could sustain such a pace for an extended period of time. Seeing a team’s “actual points” climb above 80% of its “ideal points” became an early warning to me that the team was running hot and needed to plan fewer points for the next iteration. On the other hand, if a team was not hitting 80% of its ideal points, it would indicate to me that there were blockers in the system. I would then assist the team by identifying and removing those blockers, which enabled the team to hit its "actual points".
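
That early-warning check might be sketched in Python as follows. The 80% target comes from the observations above; the `tolerance` band around it is an assumption added so that ordinary sprint-to-sprint noise does not trigger a warning, and the function name and labels are hypothetical.

```python
def sprint_signal(completed_points, ideal, target=0.8, tolerance=0.05):
    """Classify a sprint by its ratio of completed points to ideal points.

    tolerance is an assumed noise band around the target; tune it per team.
    """
    ratio = completed_points / ideal
    if ratio > target + tolerance:
        return "running hot"        # consider planning fewer points next time
    if ratio < target - tolerance:
        return "possible blockers"  # look for impediments in the system
    return "on track"


# A three-pair team with 27 ideal points, three different sprints.
for completed in (22, 26, 17):
    print(completed, sprint_signal(completed, ideal=27))
```

The signal is only a prompt for a conversation with the team; it says nothing about why the ratio moved.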

If you have a development center composed of a number of teams, standardizing on a relative size in which one point equals one pair's effort for one day makes it easy to understand whether the teams are estimating correctly. It also makes it easier to plan a new team’s ramp-up and to recognize when a team or teams are running too hot or perhaps have blockers that prevent them from hitting their actual capacity.

Try it on your teams. I would like to hear about your experiences. The teams I worked with became very comfortable with their estimates once they standardized on the relative size of a point. They could easily calculate their “ideal capacity” for a sprint and then would commit to their “actual capacity”. The model was very sustainable.

Some might say, why use points at all? Just use time or hours. I say no. The reason is that hours provide false precision. The agile estimating process uses discrete points such as one, two or three for a reason. While it is true that any one card typically does not take exactly one, two or three days, in aggregate the estimates prove to be an accurate measure. By keeping each card's estimate less precise (done in points, not hours), the net result is that the total points estimated for the sprint backlog prove to be a good measure of the team's capacity to do work.

There is one other reason that hours don't work. The hours behind a point can vary for a team depending on the ratio between developer pairs and business analysts and automated testers, not to mention the Scrum Master and Product Owner. Don't fall into the trap of using hours. It does not allow for the variability in team makeup, and it provides a false sense of precision that does not exist on any team or project, agile or not.

Oh, by the way, even if you standardize on a relative size in which one point equals one pair's effort for one day, you still cannot compare teams. There are too many other variables, including skills, technology stacks, feature complexity, environments, etc. However, it does make it easier for a team to know if its points are reasonable. It also keeps the number of points provided by similar teams consistent in size. What it does not do is ensure that the amount of actual work completed across teams is the same. Given the complexity of teams, organizations, businesses and technologies, I do not know how that will ever be possible. Nor is it necessary for teams to be productive and provide real business value.
