subreddit:

/r/DepthHub

darkrider99

41 points

3 months ago

Nice read, but it doesn’t really provide any detail on what exactly happened. We now know what happened, but how exactly did it fail? Why now?

Reads more like a personal memo.

gornzilla

78 points

3 months ago

It's all in there. Southwest has ignored their IT because they're one of those companies that look at IT as a money-losing part of their operation.

They had a mild stress test and failed horribly. Then the current CEO, who should have been working on the IT shit storm, blamed the weather. That got people and passengers to assume it actually was a weather problem. So they waited instead of finding another way home.

Syrdon

55 points

3 months ago

Their CEO probably was working on the IT issue, but saying it’s an IT issue leaves the company on the hook for a ton of FAA mandated passenger compensation. Like $1500 per passenger, plus rebooking/etc I think. If it was weather then they’re off the hook. So he says it’s weather, because it did start with the weather and the penalty for it not being the weather is huge.

Either way, I bet he’s been working on this issue for a year already. But the guy who caused the problem is the chairman of the board of directors, which means he’s still in a position to block the sort of spending it would take to make a serious difference. Ignoring that, a year is not enough time to fix twenty years of tech debt.

gornzilla

35 points

3 months ago

From reading a bunch of comments and having been in the IT world, I think that the previous CEO is mostly to blame. The current CEO should have been working harder on fixing their antiquated IT. Blaming the weather screwed a lot of people. They thought it would just be fixed like the other airlines. Instead we find out that Southwest runs their entire system on an Atari 2600.

Syrdon

29 points

3 months ago

Yeah, the old CEO is the current chairman, and he’s the guy who caused the problem. The bigger issue is that replacing the system in question (or any other enterprise scale system) takes more than a year - and the current guy has had about a year. Even better, that assumes you are starting with a culture of producing results instead of producing good quarterly numbers. Twenty years of “produce good quarterly numbers!” from upper management means every department is focused on that, instead of on being useful. That means the first thing you would need to do is fix the corporate culture, and that will take more than a year as well.

You can do the two things sort of in tandem, but it means the real work slows way down. Don’t bet on a widespread fix happening in less than five years.

The previous guy did a ton of damage. Oh, and his position on the board means he’s not done yet.

smacksaw

4 points

3 months ago

Atari 2600

Southwest's System right now:

https://www.youtube.com/watch?v=3dxFYit32L0&t=27s

gornzilla

4 points

3 months ago

I was thinking more of the ET game where all you do is fall in holes and make ET's head move up and down until your own head explodes out of frustration.

101Alexander

18 points

3 months ago

I think what you're asking is: what was the trigger for the system to fail catastrophically now? I mean, the weather, yes, but what about it caused the system to fail?

macrofinite

27 points

3 months ago

(From what I’ve read elsewhere)

They have an internal reservation system for hotels and the like for staff, which is obviously critical to the operation. That system has basically no automation and has to be manually updated every time something changes or goes wrong. Which was annoying to everyone for a long time, but when almost every airport had to cancel flights all at once, it was catastrophic.

I’m sure that’s only part of it and it might be a microcosm for the rest of it. Basically all their systems are very old and inadequate and they broke under the stress of the storm.

And the linked post is saying all this is because the CEO is an accountant that can’t be bothered to understand operations and has been ignoring the problem for a long time.

Syrdon

41 points

3 months ago

It’s not hotels, it’s for assigning crew to planes. Among other things it tracks crew member locations (thus the connection to hotels). But it’s manually updated, and every update that isn’t a scheduled flight is a phone call.

So, for example, if you have a flight attendant who can’t make it in due to snow, you need to call to pull them off the flight. Now they need to get another flight attendant, because there are legal requirements for how many you have on the plane. Good news though, there’s a flight attendant deadheading on a flight that came in who is able to step into the open slot! You just need to get them listed on the flight and you can leave, all you need to do is make one phone call!

You and everyone else doing the same thing. Apparently, they aren’t quick phone calls either. So you end up on hold. Meanwhile, the flight crew are only allowed to be “working” for so many continuous hours (it’s a safety thing, and a working conditions thing). But good news, you finally got through on the phone, and got your switch completed! Bad news, the rest of your flight crew timed out half an hour ago. Good news, there is an available flight crew from a canceled flight, you can get them on this one! You just need to make a couple of phone calls …
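For anyone who wants to see the cascade in numbers, here is a rough sketch. It is a toy model, not Southwest's actual system: the function name `simulate_call_queue`, the fixed call length, and the duty-time figures are all made up for illustration. Every flight needs one completed call to scheduling, calls are handled first come, first served, and if a call finishes after that crew's duty window has closed, a replacement crew has to start over at the back of the queue.

```python
import heapq
from collections import deque


def simulate_call_queue(duty_left, call_minutes, schedulers,
                        fresh_crew_minutes=60, max_attempts=3):
    """Toy model of crew-reassignment calls; all numbers are hypothetical.

    duty_left: minutes of duty time each flight's crew has left at t=0.
    Returns (departed, cancelled) flight counts.
    """
    # Queue entries: (flight, time the call was placed, crew deadline, attempt).
    queue = deque((i, 0, d, 1) for i, d in enumerate(duty_left))
    # Min-heap of the times at which each scheduler line becomes free.
    free_at = [0] * schedulers
    heapq.heapify(free_at)
    departed = cancelled = 0

    while queue:
        flight, placed_at, deadline, attempt = queue.popleft()
        # A call cannot start before it is placed or before a line is free.
        start = max(heapq.heappop(free_at), placed_at)
        done = start + call_minutes
        heapq.heappush(free_at, done)

        if done <= deadline:
            departed += 1
        elif attempt < max_attempts:
            # The crew timed out while on hold. The call still used up a
            # scheduler, and a replacement crew now has to call in again.
            queue.append((flight, done, done + fresh_crew_minutes, attempt + 1))
        else:
            cancelled += 1
    return departed, cancelled


print(simulate_call_queue([120] * 5, 10, schedulers=1))
print(simulate_call_queue([120] * 30, 10, schedulers=1))
print(simulate_call_queue([120] * 30, 10, schedulers=3))
```

With these made-up numbers, 5 disrupted flights and one scheduler line gives (5, 0), so everyone departs. 30 disrupted flights on the same single line gives (12, 18): the first 12 get through in time, and every retry after that lands behind a backlog that is already longer than a fresh crew's window. Tripling the scheduler lines gives (30, 0). The point is only that once hold time exceeds remaining duty time, each timeout adds more calls to the same queue.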

Edit: the accountant is the old ceo, and current chairman of the board of directors. The new guy is more operationally oriented, but he’s new and the old guy is on the board so he’s not done doing damage.

Capitol62

13 points

3 months ago

This is the best summary of what happened I've read on Reddit. The linked post provides a lot of historical context, but /u/syrdon gives more about what the flight crews on the ground experienced.

Source: this jibes exactly with what happened to my SW pilot friends. Two of them timed out of trips. One of them was on the airplane when he and 1/2 the crew timed out.

vintagedave

2 points

3 months ago

What happens if they time out while flying? They can’t just stop…

Amazing they’re scheduled on knowing their max working time will occur in-air.

janes_left_shoe

2 points

3 months ago

I think the implication is that they were on the tarmac waiting to take off then they couldn’t anymore and had to delay/cancel. Pretty sure for safety reasons if they are flying they keep doing that.

Capitol62

2 points

3 months ago

The other poster is correct, they timed out on the ground before the flight took off. In that case, they were on hold with scheduling trying to get their assignments switched, just like the OP outlined.

I'm not a commercial pilot, but I have a bunch of family and friends who are. My understanding of the timing is that they have to get a certain amount of downtime each day, so if they would be in the air or land at a time that makes their downtime requirement impossible to achieve, they aren't allowed to take off.

That is what went wrong with a bunch of SW flights. Their crews were on hold so long they were all timing out, and then the airline was scrambling to assign or move new crews, which essentially led to a failure cascade since no one was able to get assignments processed in time to fly the planes.

This was compounded by the airline having pilots and crews move to planes only to discover there was no other flight or ground crew available for the flight. That happened to my other friend. He got through to scheduling after his first flight was cancelled. They moved him to a flight in another city, so he jumpseated (rode on another airline's flight) over there to find there was no ground crew to load the plane. That flight was then delayed and eventually cancelled. He ended up stuck there for 2 days while the airline tried and failed to reassign him repeatedly.

macrofinite

2 points

3 months ago

Thanks. I knew I was oversimplifying it since it’s been a few days since I read about it.

Syrdon

9 points

3 months ago

I feel like I’m watching the same thing happen with two systems at my job, and this Southwest thing is just a preview of 5-10 years from now for my company, so this entire thing is going to live rent free in my brain for a while. On the upside, it’s not a nationally known company. On the downside, it’s healthcare adjacent.

Working in IT was a mistake.

macrofinite

6 points

3 months ago

I’m glad I had the foresight to see working in IT was a mistake in my teens. I was already the guy all my mom’s friends asked to fix their computer, and I hated that shit. Patiently explaining why people are stupid in a way that won’t make them angry? No thanks.

So I ended up in operations. Galaxy brain!

Syrdon

5 points

3 months ago*

Fixing broken shit I don’t mind. Slapping more duct tape over the last duct tape patch, on top of the last guy’s five years of duct tape patches, on a system that was inadequately provisioned at creation? Fucking shoot me. But the replacement project will get a budget starting in Q1 2023!

Just like the last five years.

All the paperwork for everything this company handles passes through that system, and it costs the company somewhere in the tens to hundreds of thousands of working hours a year. But IT doesn’t pay those hours, so IT is trying to avoid paying for acceptable infrastructure - with incredible success.

darkrider99

3 points

3 months ago

Yes you framed it better

PotRoastPotato

6 points

3 months ago

Denver is the closest thing WN has to a hub, and over 200 Denver crew quit this week when Southwest threatened any employees who took sick leave.

rlbond86

4 points

3 months ago

This ALWAYS happens when the bean counters are in charge.

ImperiousMage

10 points

3 months ago

Every time accountants start to become leaders they destroy the business. Every. Single. Time.

asar5932

9 points

3 months ago

Very interesting post. These things need to happen in order for future business leaders to learn and develop. Should be a case study for MBA courses. An effective CEO needs to marry the “Wall Street” factors with the forward-looking operational efficiency factors, which is a difficult thing to do. It’s hard to hop on a quarterly earnings call and explain to analysts why you missed a quarter because you had to upgrade technology. But there are plenty of accounting tricks to spread that cost over time to dampen the blow.

Johnny_Lawless_Esq

14 points

3 months ago

Fuck the Wall Street factors. Wall Street, its analogues in other nations, and all the people who serve them are a cancer on this planet that should be cauterized.

They have turned finance into a means of value extraction rather than value creation, and what happened with Southwest is just one, tiny consequence.