I inherited a number of Perl, shell, and TCL scripts that produce reports from web proxy logs.  The web proxies are scattered around the planet.  Each day at midnight local proxy time the compressed logs are exported to a centralized repository.  A set of CRON jobs executes the main script that takes as argument the region and date of the date of the log to be processed.

Since the logs are exported as compressed file and each region has two or more proxies, the main processing script must first combined the compressed logs for each region.  Once that is done several other scripts are called in turn which produce various reports such as top domain and top user. These scripts call other scripts which call other script….You get the idea.

The scripts had no error reporting so my first task after taking over from the consultant who wrote them was to add echo statements and some simple checks for exit status.  The shell scripts all use similar information on file locations, etc.  Instead of hard coding this information into each script I created another shell script that contains all the environment information.  Making a change to one script requires making a change to several others.  Messy. And time consuming.

I have decided to rewrite the whole bloody mess to do better error reporting, to put all the processing into one location, to make use of date types that do not exits in bash, and to practice my Perl.  Of course in typical hacker fashion I have started coding even before I have an architecture worked out.  I tend to keep all this in my head.  This does not always work out well especially when the problems require thinking in depth.