Salting wouldn't have changed the outcome of the cracking attempts in the article. The primary focus was on how pre-computed hashes (rainbow tables, etc.) are no longer a tool used by most people attacking password lists. GPU-based hashers are efficient enough, the dictionaries long enough, and the methods of permuting them (combinations, leet substitutions, Markov chains, etc.) rich enough that storing all possible hashes is actually less effective than writing an MD5 kernel for your AMD GPU, letting it run, and monitoring the output for trends.
Every timing figure quoted in the article (e.g. "Retrieved 2700 passwords in 2 minutes 30 seconds") would have been up to 16,000x larger if a salt had been used, since a separate hash would need to be computed for each password rather than one hash being compared against all 16,000 passwords at once. 2.5 minutes x 16,000 comes to around 28 days for the "first pass" alone, which is much less feasible, especially as the runs that took multiple hours would take years to finish. It's true you could still crack the same percentage of passwords eventually, but cracking 90% in 5 years versus in 20 hours is a very different outcome.
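The back-of-envelope arithmetic can be checked directly (the 2.5-minute and 16,000-password figures come from the article; everything else follows from them):

```python
passwords = 16_000              # size of the leaked list
unsalted_first_pass_min = 2.5   # "2700 passwords in 2 minutes 30 seconds"

# With per-user salts, each guess must be hashed once per stored password
# instead of once total, so the first pass costs ~16,000x more time.
salted_first_pass_days = unsalted_first_pass_min * passwords / (60 * 24)
print(f"{salted_first_pass_days:.1f} days")  # roughly 28 days
```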
The point is that a salt would essentially give you a constant speed of cracking versus a speed that scaled with the size of the dataset you have. For example, the guy in the article cracked 10,000-ish passwords in 16 minutes - 600 passwords per minute. If he'd had a dataset of 1.6 million, he would have cracked over a million (assuming the passwords were of the same quality and he has some good way of storing them on his hardware to allow fast lookup), at a rate of 60,000 passwords per minute, all on his single GPU.
However, with a salt he'd be limited to about one password every 27 minutes. And this rate would not increase as the dataset grew larger, which means that even if he had 10x the power as you suggest, he could only crack about 20 passwords per hour (more or less). Thus to crack even 1% of a 1.6 million password dataset (16,000 passwords) would take over a month of non-stop computation on 10 GPUs, versus a single GPU cracking 62% (again, over a million) in 16 minutes.
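Spelling out that rate comparison (the 27-minutes-per-password and 10-GPU figures are from the paragraph above; the rest is simple arithmetic):

```python
minutes_per_password_salted = 27   # one GPU, after the ~16,000x salt slowdown
gpus = 10

per_hour = gpus * 60 / minutes_per_password_salted  # ~22 passwords/hour
target = 16_000                    # 1% of a 1.6M-password dataset
days = target / per_hour / 24
print(round(per_hour), round(days))  # ~22 per hour, ~30 days
```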
So even if a compromised website took a month to realize it had leaked, it could still inform its users in time for 99% of them to change their passwords before their accounts were compromised. With no salt, no such luck: as we can see here, up to 90% of the dataset would be cracked within a day.
But with a per-user salt, they would have to attack each password individually, running each of these techniques with each user's salt and hoping for a match with that user's stored hash, rather than running their techniques once and comparing the resulting hashes against all the stored ones. Something like:
Global salt:

    while (s = guess()) {
        if (md5(s, salt) in list) {
            print s;
            delete(md5(s, salt), list);
        }
    }
Per-user salt (each entry stores its own salt alongside the hash):

    while (s = guess()) {
        foreach ((h, h_salt) in list) {
            if (md5(s, h_salt) == h) {
                print s;
                delete((h, h_salt), list);
            }
        }
    }
The former requires one md5 calculation and one set lookup per guess; the latter needs n md5 calculations per guess. That is quite a difference, assuming md5 is the bottleneck.
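The two loops can be sketched in runnable form. Everything here is illustrative: the toy wordlist, account data, and the choice of salt+password concatenation are assumptions, not anything the article specifies.

```python
import hashlib

def md5_hex(password: str, salt: str) -> str:
    # Illustrative construction: salt prepended to the password.
    return hashlib.md5((salt + password).encode()).hexdigest()

wordlist = ["letmein", "hunter2", "dragon", "password"]  # the guess() stream

# --- Global salt: one md5 and one O(1) set lookup per guess ---
global_salt = "s1te"
stored = {md5_hex(p, global_salt) for p in ["hunter2", "dragon"]}
cracked_global = []
for s in wordlist:                      # while (s = guess())
    h = md5_hex(s, global_salt)         # one md5 per guess
    if h in stored:                     # one lookup per guess
        cracked_global.append(s)
        stored.remove(h)

# --- Per-user salt: n md5 calculations per guess ---
stored_pairs = [("a9", md5_hex("hunter2", "a9")),
                ("7c", md5_hex("dragon", "7c"))]
cracked_unique = []
for s in wordlist:
    for salt, h in list(stored_pairs):  # foreach entry: n md5s per guess
        if md5_hex(s, salt) == h:
            cracked_unique.append(s)
            stored_pairs.remove((salt, h))

# Both approaches recover the same passwords; only the cost differs.
print(sorted(cracked_global) == sorted(cracked_unique) == ["dragon", "hunter2"])
```

The recovered sets are identical either way; the per-user-salt loop just performs len(list) hashes per guess instead of one, which is exactly the constant-versus-scaling rate difference discussed above.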