By ALLISON OPSON-CLEMENT News Editor
Western’s network was down from 9:30 a.m. to 2 p.m. Wednesday, Jan. 14, after increased traffic, caused in part by an external hacking attempt, overloaded a router. University Computing Services (UCS) workers restored the campus system, and diagnostics are ongoing.
“There’s a whole bunch of ‘don’t know’ right now,” Bill Kernan, director of University Computing Services, said, adding that he and UCS are taking a forensic look into what happened.
The focus was on getting Western’s computers going again. Kernan said his entire team worked continuously, without stopping for lunch, and stayed until the end; many left only at 9 p.m., after almost 12 hours of nonstop work.
The network interruption was noted at 9:30 a.m., and Kernan and his team were contacted.
They spent the next hour troubleshooting.
“The typical issues weren’t there,” Kernan said.
He started calling in help from off-site backup. By the end, UCS was on the phone, off and on, with up to three engineers at a time, all coordinating and working on the problem.
“I got as many resources thrown at it as I could,” Kernan said.
He called what happened a “perfect storm.” Two things happened nearly simultaneously, but either one alone could have been sufficient to bring down the network, because both resulted in traffic flow beyond what the main router on campus has had to deal with before.
He said it was like two fire hoses of information: each stream was too strong on its own, and together they were tremendous.
Increased usage overwhelmed the router. In addition to the increase in normal traffic, the router was also running net flow logs, which help UCS with diagnostics by recording the types and amounts of usage.
“It’s not like we did something new recently,” he said. “Net flow shouldn’t have done this to us.” The whole network had been stable up until this incident, but in this case, the net flow logs happened to be the tipping point.
The other thing that happened was that the main host server for the campus system experienced an attack from external sources. The hackers’ IP addresses were traced back to computers in China.
“They used the server as a launching pad for an attack against the network,” Kernan said. The attacks took the router down via the compromised host server. He called this a malicious compromise of the system, a directed denial of service attack.
No data was compromised, Kernan said. Only the one server was affected, and it is currently out of commission.
Kernan said they made the choice to get campus back up and running. The system was restored to operation by temporarily taking it out from behind the protective firewall. This was done with fewer than half of the most important of the 22 campus networks, and only between 1 p.m. and 8 p.m. on Wednesday.
Without the firewall, there was less stress on the router, and service resumed. While the firewall was down, UCS decided it was necessary to function temporarily without the net flow logs, and removed them to keep the system operational.
At 8 p.m. the system was returned behind the firewall. There were no ill effects of operating without the firewall, Kernan said, partly because it was such a short time frame.
UCS also attempted to reintegrate the compromised server, but within the two minutes that it was on, it was the target of 430,000 attacks. It is currently off the system.
Western’s system is up and running. A forensic investigation is taking place, according to Kernan, but it is secondary to keeping the campus computer network functioning.
“It was a complicated problem,” Kernan said. He will be posting more details on his blog in the next couple of days as they learn more.
For more information as it becomes available, visit wou.edu/wp/underthehood