In this installment I thought I'd discuss a simple device that helps both end users and corporate IT staff capture information more effectively to facilitate troubleshooting: the lowly "system journal."
A system journal is nothing more than a notebook in which you record information. For example, a server room should have an Event Log book, and every time someone performs any action on one of the servers, the date, time, initials of the technician, and a description of what was done should be recorded. The reason is simple: when something goes wrong, it can usually be traced back to a recent change made to the system. But you'd be surprised how hard it is to figure out who made what change to which server/application/service and when.
Another thing that should be recorded in an Event Log is the creation of folder structures for temporarily storing stuff. I can't begin to tell you the number of times I've run into the following situation on network upgrade projects. When we start looking at migrating the data from the old server, we run into huge amounts of stuff that had no business being on the server in the first place. "Ah, the last IT guy must have copied so-and-so's laptop to that drive," "Gee, I guess that looks like a copy of the database from when we were installing that upgrade two years ago," "Gosh, I've no idea what all that stuff is, looks like old Goldmine data." It just goes on and on. All this old stuff accumulates like junk in your closet, only it never gets cleaned out. The older it is, the less likely anyone will take the responsibility to just delete it. And it all gets backed up in the nightly backup, eating up tape and adding hours to the backup job. The Event Log should make temporary copies of files easy to spot, and each such entry should include a note as to when the data in question can safely be removed.
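If you also keep an electronic copy alongside the paper notebook, an Event Log entry with a "safe to remove" note might be sketched like this in Python. This is only an illustration: the field names, the remove_after field, and the due_for_cleanup helper are all assumptions of mine, not part of any particular tool.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import List, Optional


@dataclass
class EventLogEntry:
    """One line in the Event Log book (hypothetical field names)."""
    when: datetime                      # date and time of the action
    initials: str                       # technician's initials
    description: str                    # what was done, and to which server
    remove_after: Optional[date] = None # set only for temporary copies


def due_for_cleanup(entries: List[EventLogEntry], today: date) -> List[EventLogEntry]:
    """Return entries whose temporary data may now be removed."""
    return [
        e for e in entries
        if e.remove_after is not None and e.remove_after <= today
    ]


# Illustrative entries only; the server and folder names are made up.
log = [
    EventLogEntry(datetime(2001, 3, 5, 14, 30), "TJ",
                  "Copied laptop data to D:\\TEMP-LAPTOP on SERVER1",
                  remove_after=date(2001, 4, 5)),
    EventLogEntry(datetime(2001, 3, 6, 9, 0), "TJ",
                  "Applied service pack to SERVER1"),
]

stale = due_for_cleanup(log, today=date(2001, 5, 1))
```

The point is the same as with the paper book: every temporary copy carries its own expiry note, so nobody has to guess later whether it is safe to delete.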
Another notebook that should be in every server room is a problem log where that "something goes wrong" bit is recorded the first time it is observed. Date, time, manifestation, symptoms, side effects, everything that is noticed should be recorded. Just as important is to record each and every step taken to try to resolve the problem. This can take discipline, because it's time-consuming when you're frantically trying to fix a broken server.
But the benefits are real. First, in the heat of trying to fix a computer there is a tendency to start throwing solutions at a problem, hoping one of them will stick and fix something. Second, it is not uncommon for a hurriedly applied fix to leave the original problem in place and introduce a new, unrelated one. Knowing everything that was done (and the order in which the fixes were tried) makes it a lot easier to unwind the things that missed the target once the real problem is found and corrected.
This all applies to end users as well. Every computer should have a system journal in close proximity to it, and every time some piece of software is installed it should be recorded. Again, date, time, who did it, and exactly what they did. Every time there's a glitch, blue screen, GPF, or suddenly flaky behavior, it should be recorded, along with which applications were running when the problem occurred, what you were doing at the time, etc.
The system journal is your best resource when your computer develops a problem, because what was recently done to a computer is often a major factor in troubleshooting it. It also allows patterns to be detected in what would otherwise be written off as random glitches. If you have an IT staff, you can provide them with detailed information about the problem, which should help get it resolved quickly. If you're on your own, at least you'll have information that may help you troubleshoot the problem yourself.
This is not a new idea by any stretch of the imagination, but you'd be amazed at how many server rooms do not maintain written logs, and very, very few end users go to the trouble. If you don't have a system journal for your computer, start one today. Yes, I know, lead graphite and papyrus derivative technology may seem pretty primitive, but it's a cheap solution to troubleshooting high-tech problems.
You can reach T.J. Lee at:
mailto:tj_lee@TheNakedPC.com
