On OS X 10.6 I'm running logcheck.sh via launchd using this plist:
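<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>org.logcheck</string>
    <key>Program</key>
    <string>/opt/local/bin/logcheck.sh</string>
    <key>StartInterval</key>
    <integer>600</integer>
</dict>
</plist>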
logcheck runs at the specified interval but it won't send me mail using the command below:
cat $TMPDIR/checkreport.$$ | $MAIL -s "$HOSTNAME $DATE system check" $SYSADMIN
where
$TMPDIR=/opt/local/var/tmp
$MAIL=/usr/bin/mail
$SYSADMIN=myuser
However, if I hack it and change the command to:
cat $TMPDIR/checkreport.$$ > /Users/myuser/report
cat /Users/myuser/report | $MAIL -s "$HOSTNAME $DATE system check" $SYSADMIN
then I receive the mail.
Checking permissions on tmp with ls -l /opt/local/var, I get:
drwx------ 20 root admin 680 Jul 12 13:29 tmp/
If I run sudo /opt/local/bin/logcheck.sh, the first command works.
If I use /opt/local/bin/logcheck.sh in root's crontab, the first command works.
If I add echo "$(whoami)" > /Users/myuser/launchduser to the script, I see that it is indeed being run by root.
Why am I not getting mail with the first command in launchd? Is it a permissions issue with the pipe to mail?
Answer
I've recently been working on this myself, and have found entries in the system log (/var/log/system.log) that show errors related to this issue, such as:
Nov 1 08:52:14 my-computer com.apple.launchd[1] (org.postfix.master[22591]): Stray process with PGID equal to this dead job: PID 22592 PPID 1 pickup
Nov 1 08:52:14 my-computer com.apple.launchd[1] (org.postfix.master[22591]): Stray process with PGID equal to this dead job: PID 22594 PPID 1 cleanup
I found that my logcheck script and the expected email worked perfectly when executed from the command line, and that the logcheck script was performing its functions well when launched using launchd via a LaunchDaemon script.
However, the mail never arrived when using launchd. The errors above, and many others involving postfix and sendmail, indicate that the child sendmail processes were being terminated by launchd (as part of its garbage collection routines?) before they had time to complete.
I added the following key to my plist:
<key>AbandonProcessGroup</key>
<true/>
and the mail started flowing when using launchd. Unfortunately, I still get the stray process/dead job messages in my system.log, which I'm currently working on eliminating. I've added a sleep 120 line to my logcheck.sh script, which reduced, but has not eliminated, these messages. I could lengthen the sleep in logcheck.sh so that the script persists longer, but I don't like this particular 'hack' and want to find a more elegant solution. I believe launchd does not begin its garbage collection until after the logcheck.sh process completes.
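For reference, here is a minimal sketch of what the complete plist might look like with the extra key in place, written out by a small shell function. The Label, Program and StartInterval values are the ones from the question; setting AbandonProcessGroup to true is an assumption based on the key being a boolean. The function name write_logcheck_plist and the output path argument are hypothetical, and nothing here loads the job.

```shell
#!/bin/sh
# Sketch: write a LaunchDaemon plist for logcheck with AbandonProcessGroup set.
# write_logcheck_plist is a hypothetical helper; the caller chooses the output path.

write_logcheck_plist() {
    out=$1
    cat > "$out" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>org.logcheck</string>
    <key>Program</key>
    <string>/opt/local/bin/logcheck.sh</string>
    <key>StartInterval</key>
    <integer>600</integer>
    <!-- Assumed value: ask launchd to leave the job's process group alone on exit -->
    <key>AbandonProcessGroup</key>
    <true/>
</dict>
</plist>
EOF
}
```

On the Mac itself, the resulting file could be checked with plutil -lint before being loaded with launchctl load.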
I'm going to try explicitly lengthening the TimeOut key in the controlling plist, and see if that works better.
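One untested idea for a less hacky alternative to the sleep, assuming the problem really is that the mailer is still running when logcheck.sh exits: hand the report to sendmail directly with -t, so the script waits in the foreground until the message has been submitted. The function name send_report and the SENDMAIL_CMD override are hypothetical, and /usr/sbin/sendmail is assumed to be the postfix sendmail wrapper. I have not verified that this removes the stray process messages.

```shell
#!/bin/sh
# Sketch: submit the report through sendmail in the foreground instead of mail(1).
# send_report and SENDMAIL_CMD are hypothetical names used for illustration.

send_report() {
    report=$1
    subject=$2
    recipient=$3
    sendmail_cmd=${SENDMAIL_CMD:-/usr/sbin/sendmail}
    {
        # Headers, then a blank line, then the report as the message body
        printf 'To: %s\n' "$recipient"
        printf 'Subject: %s\n' "$subject"
        printf '\n'
        cat "$report"
    } | "$sendmail_cmd" -t   # -t takes the recipients from the headers
}
```

In logcheck.sh this would stand in for the mail line, for example: send_report "$TMPDIR/checkreport.$$" "$HOSTNAME $DATE system check" "$SYSADMIN"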