Wednesday, June 1, 2016

permissions - What is preventing me from piping from a '600' file into mail within launchd?



In OS X 10.6 I'm running logcheck.sh via launchd using this plist:





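<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>org.logcheck</string>
    <key>Program</key>
    <string>/opt/local/bin/logcheck.sh</string>
    <key>StartInterval</key>
    <integer>600</integer>
</dict>
</plist>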




logcheck runs at the specified interval, but it won't send me mail using the command below:




cat $TMPDIR/checkreport.$$ | $MAIL -s "$HOSTNAME $DATE system check" $SYSADMIN


where



$TMPDIR=/opt/local/var/tmp
$MAIL=/usr/bin/mail
$SYSADMIN=myuser



However, if I hack it and change the command to:



cat $TMPDIR/checkreport.$$ > /Users/myuser/report
cat /Users/myuser/report | $MAIL -s "$HOSTNAME $DATE system check" $SYSADMIN


then I receive the mail.



Checking the permissions on tmp with ls -l /opt/local/var, I get:




drwx------  20 root  admin  680 Jul 12 13:29 tmp/


If I run sudo /opt/local/bin/logcheck.sh the first command works.



If I use /opt/local/bin/logcheck.sh in root's crontab the first command works.



If I add echo "$(whoami)" > /Users/myuser/launchduser to the script, I see that it is indeed being run by root.



Why am I not getting mail with the first command in launchd? Is it a permissions issue with the pipe to mail?



Answer



I've recently been working on this myself, and have found entries in the system log (/var/log/system.log) that show errors related to this issue, such as:



Nov  1 08:52:14 my-computer com.apple.launchd[1] (org.postfix.master[22591]): Stray process with PGID equal to this dead job: PID 22592 PPID 1 pickup
Nov 1 08:52:14 my-computer com.apple.launchd[1] (org.postfix.master[22591]): Stray process with PGID equal to this dead job: PID 22594 PPID 1 cleanup


I found that my logcheck script and the expected email worked perfectly when executed from the command line, and that the logcheck script was performing its functions well when launched using launchd via a LaunchDaemon script.



However, the mail never arrived when using launchd. The errors above, and many others, involving postfix and sendmail, indicate that the child sendmail processes were being terminated by launchd (as part of its garbage collection routines?) before they had time to complete.
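One way to check whether a machine is hitting the same problem is to count these launchd warnings in the system log. The following is only a sketch: it assumes the log path and message text quoted above, and count_stray is a hypothetical helper name, not part of logcheck or launchd.

```shell
#!/bin/sh
# count_stray: print how many "stray process" warnings a log file contains.
# The default path is the system log mentioned above; pass another file to override it.
count_stray() {
    logfile="${1:-/var/log/system.log}"
    # grep -c prints the number of matching lines; it exits non-zero when
    # there are no matches, so "|| true" keeps the function's status at 0.
    grep -c 'Stray process with PGID equal to this dead job' "$logfile" || true
}

# Example use (reading the system log may require sudo):
#   count_stray
#   count_stray /path/to/copy/of/system.log
```

A count that keeps growing after each launchd run of the job would be consistent with child processes being reaped when the job exits.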




I added the following key to my plist:



<key>AbandonProcessGroup</key>
<true/>



and the mail started flowing when using launchd. Unfortunately, I still get the stray process/dead job messages in my system.log, and I'm currently working on eliminating them. I've added a sleep 120 line to my logcheck.sh script, which has reduced, but not eliminated, these messages. I could lengthen the sleep in logcheck.sh so that the script persists longer, but I don't like this particular 'hack' and want to find a more elegant solution. I believe launchd does not begin its garbage collection until after the logcheck.sh process completes.
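For reference, the plist from the question with this key added might look like the sketch below. It assumes the Label, Program and StartInterval values given in the question and sets AbandonProcessGroup to true, which tells launchd not to kill the remaining processes in the job's process group when the job exits.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>org.logcheck</string>
    <key>Program</key>
    <string>/opt/local/bin/logcheck.sh</string>
    <key>StartInterval</key>
    <integer>600</integer>
    <!-- Leave child processes (e.g. the mail hand-off) alive after the script exits -->
    <key>AbandonProcessGroup</key>
    <true/>
</dict>
</plist>
```

After editing the file, the job has to be unloaded and loaded again with launchctl for the change to take effect.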



I'm going to try explicitly lengthening the TimeOut key in the controlling plist, and see if that works better.


