I can't go into much detail, but failure rates went up while a few jobs were running. In one case, we fixed it by running the job under ionice. In another, the fix was adding a missing index to the DB: the job was doing a full table scan instead of reading only the records from the last week.
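To illustrate the missing-index case in general terms, here is a minimal sketch using SQLite from Python. The table `events`, the column `created_at`, and the index name are hypothetical stand-ins, not the actual schema, and the real database may well have been a different engine. It only shows how a "last week" query goes from a full scan to an index lookup once the index exists.

```python
import sqlite3

# Hypothetical query: fetch only the records from the last seven days.
QUERY = "SELECT id, payload FROM events WHERE created_at >= date('now', '-7 days')"


def plan(conn, sql):
    """Return SQLite's query plan for `sql` as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The fourth column of each row holds the human-readable plan detail.
    return "; ".join(row[3] for row in rows)


def demo():
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema, for illustration only.
    conn.execute(
        "CREATE TABLE events (id INTEGER PRIMARY KEY, created_at TEXT, payload TEXT)"
    )
    before = plan(conn, QUERY)  # no index yet: the whole table is scanned
    conn.execute("CREATE INDEX idx_events_created_at ON events (created_at)")
    after = plan(conn, QUERY)   # the planner can now search via the index
    conn.close()
    return before, after


if __name__ == "__main__":
    before, after = demo()
    print("without index:", before)
    print("with index:   ", after)
```

The exact plan wording varies between SQLite versions, but the first plan should report a scan of `events` and the second a search using `idx_events_created_at`.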
We also moved one periodic job off the production server so that it works from the daily backups instead of the live data.