The very unhappy path to Terminal 5

I see that the new terminal at London’s Heathrow airport is in the midst of another weekend’s disruption. Problems on the terminal’s opening weekend resulted in over 200 flights cancelled and a backlog of 28,000 bags. The chaos has already cost British Airways, the sole user of the terminal, £16m, and some estimates put the eventual cost around £50m.

Initial problems reported included the failure of either passengers or staff to find the car parks, slow security clearance for staff, consequent delayed opening of check-in desks, and multiple unspecified failures of the baggage handling systems. Once the initial failures occurred, a cascade of problems followed as passengers began to clog up the people-processing mechanisms of the terminal.

This weekend’s disruption has been blamed on “a new glitch” in the baggage handling system. I suspect that means that when they solved one set of problems they unmasked another. A spokeswoman assures us that they’re merely planning how to put an identified solution in place. Her statement doesn’t include any reference to the fact that these problems often nest, like Russian dolls, and that the new solution may uncover—or introduce—new problems.

Of course, my reaction was, “Did they test the terminal before opening it?” The errors shown include both functional errors (people can’t find the car park) and non-functional ones (the baggage system failed under load). No system is implemented bug-free, but the breadth of error types got me wondering.

Fortunately, the Beeb covered some of the testing performed before the terminal opened. Apparently, operation of the terminal was tested over a six-month period, using 15,000 people. The testing started with small groups of 30 to 100 people walking through specific parts of the passenger experience. Later, larger groups simulated more complex situations. The maximum test group used was 2,250. BAA said these people would “try out the facilities as if they were operating live.”

Do 2,250 people count as a live test? Are they numerous enough to cause the sorts of problems you’re looking for in a volume test?

I plucked a few numbers off the web and passed them through a spreadsheet. T5 was designed to handle 30 million passengers per year, which comes out to an average of about 82,000 per day, or 5,000-odd per hour in the 16-hour operating day (Heathrow has nighttime flight restrictions). These flat averages understate the real load, because airports have to handle substantial peaks and troughs. Say that on the busiest day you get 150% of the flat average, or 7,500 people per hour. Assuming 75% of them are either arriving from or heading toward London and spend about an hour in the terminal, and the rest are stopping over for an average of 2 hours, that’s about 9,375 passengers in the terminal at a given time.
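
For anyone who wants to redo the spreadsheet, here is a rough sketch of the same arithmetic in Python. The occupancy step is just arrival rate multiplied by time spent inside (Little’s Law). The one-hour stay for London-bound passengers is an assumption: the text does not state it, but it is the value that reproduces the 9,375 figure. The constant and function names are made up for illustration.

```python
# Back-of-envelope occupancy estimate using the figures quoted above.
ANNUAL_PASSENGERS = 30_000_000
OPERATING_HOURS_PER_DAY = 16
PEAK_FACTOR = 1.5           # busiest day as a multiple of the flat average
LONDON_SHARE = 0.75         # share arriving from or heading toward London
LONDON_DWELL_HOURS = 1.0    # ASSUMPTION: not stated in the text
TRANSFER_DWELL_HOURS = 2.0  # average stopover from the text


def occupancy(hourly_rate: float) -> float:
    """People in the building = arrival rate x average time spent inside."""
    london = hourly_rate * LONDON_SHARE * LONDON_DWELL_HOURS
    transfer = hourly_rate * (1 - LONDON_SHARE) * TRANSFER_DWELL_HOURS
    return london + transfer


per_day = ANNUAL_PASSENGERS / 365
per_hour = per_day / OPERATING_HOURS_PER_DAY
rounded_peak = 5_000 * PEAK_FACTOR  # the post rounds the hourly average to 5,000

print(f"average per day:       {per_day:,.0f}")
print(f"average per hour:      {per_hour:,.0f}")
print(f"peak hour (rounded):   {rounded_peak:,.0f}")
print(f"occupancy (rounded):   {occupancy(rounded_peak):,.0f}")
print(f"occupancy (unrounded): {occupancy(per_hour * PEAK_FACTOR):,.0f}")
```

Without the rounding to 5,000 per hour, the estimate comes out a little higher, at roughly 9,600, which does not change the comparison with the test group.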

9,375 is more than 2,250. You can, however, magnify a small sample to simulate a large one (for instance, by shutting off two-thirds of the terminal to compact the testers into a smaller space). It’s not just a numbers game, but a question of how you use your resources.
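
To put rough numbers on the compaction idea, here is a small sketch. It assumes, simplistically, that crowding depends only on people per unit of open floor space; the function names are hypothetical and nothing here reflects how BAA actually ran its trials.

```python
TEST_GROUP = 2_250        # largest test group reported
TARGET_OCCUPANCY = 9_375  # peak estimate from the paragraph above


def equivalent_crowd(testers: int, open_fraction: float) -> float:
    """Full-terminal crowd with the same density as `testers`
    packed into `open_fraction` of the floor space."""
    if not 0 < open_fraction <= 1:
        raise ValueError("open_fraction must be in (0, 1]")
    return testers / open_fraction


def open_fraction_needed(testers: int, target: int) -> float:
    """Fraction of the terminal to leave open so that `testers`
    match the density of `target` people in the whole building."""
    return min(1.0, testers / target)


print(f"{equivalent_crowd(TEST_GROUP, 1 / 3):,.0f}")
print(f"{open_fraction_needed(TEST_GROUP, TARGET_OCCUPANCY):.0%}")
```

Under that assumption, closing two-thirds of the terminal makes 2,250 testers about as dense as 6,750 people in the whole building, which is still short of the peak estimate. Matching it would mean leaving only about a quarter of the space open.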

Most of the testing documentation will of course be confidential. But I found an account of one of the big tests. I would expect that any such report was authorised by BAA, and would therefore be unrealistically rosy; they want passengers to look forward to using the new terminal. But still, the summary shocked me.

> In fact the whole experience is probably a bit like the heyday of glamorous air travel – no queues, no borders and no hassle.

Any tester can translate that one. It means:

> We didn’t test the queuing mechanisms, border controls, or the way the systems deal with hassled passengers.

In software terms, there is something known as the happy path, which is what happens when all goes well. The happy path is nice to code, nice to test, nice to show to management. It is, however, not the only path through the system, and all the wretched, miserable and thorn-strewn paths must also be checked. This is particularly important in any scenario where problems are prone to snowballing. (Airport problems, of course, snowball beautifully.)
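
For readers who don’t live in test plans, here is a toy illustration of the difference. Everything in it (`check_in`, `Bag`, the weight limit, the belt capacity) is invented for the example and has nothing to do with T5’s real systems. The point is only the ratio: one happy-path check, and several checks for what happens when passengers go off script.

```python
from dataclasses import dataclass


class CheckInError(Exception):
    """Raised when a bag cannot be accepted."""


@dataclass
class Bag:
    tag: str
    weight_kg: float


MAX_WEIGHT_KG = 23.0  # hypothetical limit
BELT_CAPACITY = 3     # hypothetical; tiny so that overflow is easy to trigger


def check_in(bag: Bag, belt: list[Bag]) -> None:
    """Put a bag on the belt, or refuse it with a clear error."""
    if not bag.tag:
        raise CheckInError("bag has no tag")
    if bag.weight_kg > MAX_WEIGHT_KG:
        raise CheckInError(f"bag {bag.tag} is overweight")
    if len(belt) >= BELT_CAPACITY:
        raise CheckInError("belt is full")
    belt.append(bag)


def refuses(bag: Bag, belt: list[Bag]) -> bool:
    """True if check_in rejects the bag and leaves the belt untouched."""
    before = list(belt)
    try:
        check_in(bag, belt)
    except CheckInError:
        return belt == before
    return False


# Happy path: one well-formed bag, empty belt.
belt: list[Bag] = []
check_in(Bag("BA001", 18.0), belt)
assert len(belt) == 1

# Unhappy paths: the cases a cheerful walkthrough never exercises.
assert refuses(Bag("", 18.0), belt)       # missing tag
assert refuses(Bag("BA002", 40.0), belt)  # overweight
belt.extend([Bag("BA003", 10.0), Bag("BA004", 10.0)])
assert refuses(Bag("BA005", 10.0), belt)  # belt already full
```

Note that the last case only fails once the belt has filled up, which is the toy version of a problem that appears under load and not in a quiet walkthrough.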

Based on the account I read, these testers were set up to walk the happy path. They were not paid for their labours, but were instead fed and rewarded with gifts. I’m sure food and goodie bags were cheaper than actual pay, but they dilute the honesty of the exchange. We’re animals at heart, and we don’t bite the hand that feeds us. We like people who give us presents. Getting those people—mostly British people—to act like awkward customers, simulate jet lag or disorientation, or even report problems must have been like getting water to flow uphill.

Furthermore, look at the profile of testers mentioned: an ordinary reporter and a bunch of scouts and guides. I wish I believed that the disabled, the families with cranky children, and the non-English speakers were just at another table at breakfast. But I don’t. I suspect the test population was either self-selecting, or chosen to be easy to deal with. In either case, it didn’t sound very realistic.

It’s possible that there was another test day for people who walked the unhappy path, and that it wasn’t reported. It’s possible that they did clever things, like salt the crowd with paid actors to clog up the works and make trouble, and that our reporter simply missed those incidents.

But I’ve worked on big projects for big companies, and that’s not what I’m betting. I suspect there were very good test plans, but that for reasons of cost and timing they were deemed impractical. So compromises were sought in large meetings with mediocre biscuits. Gantt charts were redrawn late at night using vague estimates that were then taken as hard facts. Tempers were lost, pecking orders maintained. People assured each other that it would be all right on the night.

It wasn’t.

I wish I believed that the next time someone does something like this, they’ll learn the lessons from the T5 disaster. But that’s happy path thinking, and I’m a tester. I know better.

15 thoughts on “The very unhappy path to Terminal 5”

  1. As of this morning (Sunday 6 April), the Beeb says that the baggage system is working better.

    However, it has snowed, and air traffic control has reduced the number of flights allowable because of that.

  2. Excellent post, abi. I wish every project manager in the world was required to read it.

    Ah, the happy path. When it’s the rosy path through the PERT chart, it’s called best-case management, and it works pretty much like inadvertently invoking the name of the Dark One: you get the worst case despite all your planning.

    The aviation industry seems especially prone to this kind of mistake. Aerospace engineers are an especially pessimistic bunch; they’re always looking for low-probability failures that can cascade into catastrophes. That’s one of the reasons there are so few catastrophes involving airplanes. But managers and executives seem to have learned from this that catastrophes just can’t happen to them, so why worry?

    Still, you’d think that the people who planned Heathrow Terminal 5 would have learned from the Denver International clusterf**k of a few years ago. Exactly the same: open a massive new terminal with an automated baggage handling system that’s borked right from the start. Oh, that can’t happen to us!

  3. Bruce:

    Ah, yes, the Denver baggage handling system. It’s a case study in bad implementations. Opened nearly two years late, millions of dollars over budget, and never more than partly functional, it’s the stuff of legend.

    Apparently they gave up on it in 2005.

  4. abi wrote:

    > Denver […] Apparently they gave up on it in 2005.

    I recall hearing that United (I think) ran a parallel manual baggage handling setup even while they were still paying through the nose for the automated one. It simply wasn’t worth using.

  5. The Terminal 4 baggage handling system went down in February 2006 because of a “software glitch”.

  6. Anyone wanna bet it was the damn same contractor that implemented both DIA’s and Heathrow’s systems — because it’s the only company with experience with such large-scale implementations, however frakked?

    Sheesh.

  7. “…these problems often nest, like Russian dolls, and that the new solution may uncover–or introduce–new problems…”

    I usually visualize these situations like the scene in the first Indiana Jones movie: you’re walking down a tunnel, you come across a pit with hungry saurians at the bottom, but your sidekick notices it too late so down he goes; you fix that bug by putting planks over the pit and it’s a cake walk going across so off you go into the dark, strutting on until you put your foot down on something that goes click, then you hear snikt, and you’ve been skewered.

  8. Serge,

    That scenario is a little too simple for such a large project. You need to have armies of minions being slowly killed off by hazard after hazard.

  9. That’d be true, Abi, for the actual testing phase of the Labyrinth of Doom. But if one wants to explain the concept of the Labyrinth of Doom to the non-technical people in charge of approving the project, one must provide them with a simple example. The danger of course is that, during the testing phase, they will remember only the simple example and not the hundreds (if not thousands) of pages of the actual plan.

    “Henchfolks, you’re way over budget and way behind schedule.”
    “What do you mean, O Merciless Ming? We provided a detailed plan.”
    “You told me that testing the Labyrinth of Doom would entail two Perilous Traps.”
    “Butbutbut…”
    “Guards! Take that engineer away!”

  10. I reckon BA could have easily simultaneously saved money and had their 2,250 people behave much more naturally.

    Having promised them food and gifts, the trick would be to screw up the catering and undersupply the gifts. Combine this with some poor communications and they’d have had a perfect crowd.

    Angry, resentful people feeling like they’d been cheated of their entitlements trying to complain to “friendly” staff with no control over the underlying faceless bureaucracy.

    Now that would be real testing.

  11. Good morning abi!

    It looks like Making Light is unreachable today. It responds to pings, but times out via Firefox. (I also tried a proxy service, but no luck there, either.)

  12. That reminds me, you’re welcome to stay here if you have an urge to slum in London for a bit. You get the IndiaTown tour for free.

  13. Jules,

    You are suitably evil. I wish I had thought of that. (I wish they had, too!)

    Avedon,

    Thank you for the invite. I’ll ping you if I’m to be in London any time soon, for (at the very least) a visit to a pub.

  14. The heads of the companies concerned, British Airways and BAA, have been hauled up in front of the House of Commons Transport Select Committee to explain the situation. (http://news.bbc.co.uk/1/hi/business/7388296.stm)

    Colin Matthews, BAA’s chief executive, admitted that “the totality of the testing regime did not adequately replicate the reality of the first days of operations”. He also said that there had not yet been “any inquiry into exactly who was responsible for the woes that beset the airport.”

    Willie Walsh, meanwhile, admitted that “the airline (BA) had compromised on testing the new building and said that their staff were not given enough training.”

    BAA’s non-executive chairman, Sir Nigel Rudd, summed things up nicely. “It was clearly a huge embarrassment to the company, me personally and the board…Nothing can take away that failure. We had all believed genuinely that it would be a great opening, which clearly it wasn’t.”

    Nonsense. I’m sure not “everyone” believed it would be a great opening. Just everyone these gentlemen were listening to.

    I hope that these guys put some time and energy into an honest investigation of what happened in the lead-up to the opening of the terminal. I hope they learn from what they find.

    And look! Is that a pig flying by my window?
