Thursday, December 23, 2010

Standards for Cleaning up ETL Log and Bad Files

Directing Session and Workflow Logs to the Correct Directory

Anyone who creates workflows in the PowerCenter development environment must verify that the output files produced when the workflow and its session(s) run (i.e., .log, .bad, and .ind files) are directed to the correct directories on the host machine. The destination must be hard-coded in the properties of the workflow and session using the ‘Properties’ tab for each object.
For example, the value for $PMSessionLogFile for the Development Repository is set to /dev/null. As a result, if a team has not entered a specific destination for session log information, the session will fail because it uses an invalid directory.

Cleaning up Files in Log Directory

Below is a sample script that was originally written to clean all files out of the /load/pmserver/tmp directory. You can use it as part of a batch job you set up in UC4 to keep your team's log directory (or any other directory your team uses) free of files older than a given age. Be certain to modify it to fit your circumstances. For example, if I were on the UD team and wanted to purge files older than 7 days from the /usr/local/autopca/ud/log/prospect directory, the command would be:

/usr/bin/find /usr/local/autopca/ud/log/prospect -mtime +7 -exec rm {} \;
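
Before scheduling a purge like the one above, it can help to preview what it would remove. The following is a minimal sketch, not part of the original script: the function name purge_old_files and its --delete flag are hypothetical, and it assumes a POSIX shell with find on the PATH. It restricts the match to regular files (-type f), so subdirectories are left alone, and it only lists matches unless --delete is passed.

```shell
#!/bin/sh
# purge_old_files DIR DAYS [--delete]
# Hypothetical helper (the name and the --delete flag are assumptions).
# Lists regular files under DIR last modified more than DAYS days ago,
# and removes them only when --delete is given.
purge_old_files() {
    dir=$1
    days=$2
    mode=${3:-}

    # Refuse to run against a missing directory instead of letting find fail.
    if [ ! -d "$dir" ]; then
        echo "purge_old_files: not a directory: $dir" >&2
        return 1
    fi

    if [ "$mode" = "--delete" ]; then
        # -type f limits the match to regular files; directories are skipped.
        find "$dir" -type f -mtime +"$days" -exec rm -f {} \;
    else
        # Dry run: print the files that would be removed, delete nothing.
        find "$dir" -type f -mtime +"$days" -print
    fi
}
```

For the UD example, running purge_old_files /usr/local/autopca/ud/log/prospect 7 would list the candidate files, and adding --delete as a third argument would remove them. Like the original command, the search descends into subdirectories, so check the dry-run output before putting the delete form into a batch job.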