Saturday, December 31, 2016

sql server - What to store on ephemeral drive? (windows ec2 + sql)

I have an EBS-backed Windows EC2 server with an additional ephemeral drive (aka "instance store"), which is really fast (SSD). The instance runs an IIS website + SQL Server.



What can I move to the instance store to speed things up?




Currently I have moved:




  • "TempDB" database for SQL server

  • Non-crucial (temporary) backups

  • Swap-file



What else can I move to the ephemeral drive to speed things up? The Windows TEMP folder? IIS logs? I would love to hear your ideas.
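For reference, TempDB was relocated with the standard ALTER DATABASE approach (a sketch; the drive letter Z: is a placeholder for the ephemeral volume, and the change only takes effect after the SQL Server service restarts):

-- Point tempdb's data and log files at the ephemeral drive.
-- Caveat: the instance store is wiped on stop/start, so the target
-- folder must exist (e.g. recreated by a startup task) before SQL starts.
ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, FILENAME = 'Z:\TempDB\tempdb.mdf');
ALTER DATABASE tempdb MODIFY FILE (NAME = templog, FILENAME = 'Z:\TempDB\templog.ldf');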

internet - Periods of high packet loss, two ethernet cables plugged into TWC modem

A small business for which I am the system administrator has been experiencing intermittent internet dropouts. There seems to be no pattern to the dropouts — sometimes there is nothing for a week, and other times it will drop out dozens of times in a single day.



I have been running mtr (my traceroute) to try to diagnose the issue. I am including the mtr results below, pinging Google's Public DNS. I have also been running a test within the LAN, which shows exactly zero packets dropped, even over the course of a week or more; this leads me to believe the problem lies beyond our router. Everything is connected by gigabit ethernet. 10.0.0.1 is the LAN address of our router, and the next hop (Time Warner's server) is where the packet loss begins.



                        Packets               Pings
Host Loss% Drop Snt Last Avg Best Wrst StDev
1. 10.0.0.1 0.0% 0 10722 0.4 0.3 0.2 10.6 0.8
2. ##.###.##.## 7.9% 845 10722 22.1 21.7 7.7 292.0 14.2
3. ##.###.##.## 7.9% 845 10722 13.8 20.2 8.4 459.2 13.2
4. ##.###.##.## 7.9% 847 10722 22.3 22.0 8.9 374.5 13.2
5. ##.###.##.## 7.9% 852 10722 29.1 24.7 10.4 290.1 13.4
6. ##.###.##.## 7.9% 848 10722 23.3 25.2 8.3 643.9 15.3
7. 66.109.6.163 7.9% 849 10722 15.1 23.3 9.0 554.9 17.2
8. 66.110.96.53 7.9% 848 10722 26.3 21.3 7.9 467.2 14.4
9. 72.14.195.232 7.9% 846 10721 28.0 21.9 8.9 402.2 14.7
10. ???
11. 108.170.238.201 7.9% 842 10721 19.1 21.8 8.8 498.9 14.5
12. 8.8.8.8 7.9% 844 10721 24.0 21.4 8.7 414.5 13.5



It very recently came to my attention that there is a second device connected to the modem via ethernet. I want to note that the modem is actually a wireless gateway, but it was put into bridge mode and we have confirmed that our router (an Apple AirPort Extreme AC) gets a proper WAN address and DNS servers. So, the gateway is effectively just a modem.



Here is our network topology:



Time Warner Business Modem
├── Mystery Device
└── Apple AirPort Extreme AC
    ├── Netgear Unmanaged Gigabit Switch 1
    │   ├── Apple iMac 1
    │   │   └── Canon Printer
    │   ├── Apple iMac 2
    │   ├── Apple iMac 3
    │   └── HP Ubuntu Laptop
    ├── Netgear Unmanaged Gigabit Switch 2
    │   ├── HP Ubuntu Desktop
    │   ├── Brother Printer 1
    │   └── Brother Printer 2
    └── Apple iMac 4


I work remotely so have not been able to visit the office to inspect the situation myself. We have been in contact with Time Warner and they assert that their modem has had no signal issues for the past few months. Our issues started a few weeks ago, after several years of consistent connection.




The cable between the modem and router was just replaced to rule out a faulty connection between the two, but the issue persists.



The core of my question is this — if this is a typical TW Business Class modem, does it only have one IP address to assign? I don't believe we are paying for additional IPs and I assume we only get one by default. If so, are the two ethernet devices connected (AirPort and mystery device) in competition for that IP address? Or has it been assigned to the router permanently by its MAC address? Would such a competition yield the observed mtr results above, perhaps when that mystery device attempts to connect to the internet?



Any advice would be much appreciated. If you have any ideas about how I can further isolate the variables, I am comfortable with UNIX tools. In the meantime I have instructed someone in the office to disconnect the mystery device, but it is difficult to determine whether the issue will be fixed by that alone, because the problem is not always observable, sometimes for days at a time.
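For reference, the kind of diagnostic I can leave running from inside the LAN is a timestamped mtr report loop (a rough sketch; the log path and five-minute interval are arbitrary):

#!/bin/sh
# Append a timestamped 100-cycle mtr report every 5 minutes, so that
# dropout windows can be matched against hop-level loss after the fact.
while true; do
    date >> /var/log/mtr-watch.log
    mtr --report --report-wide -c 100 8.8.8.8 >> /var/log/mtr-watch.log
    sleep 300
done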



Update: The second ethernet cable coming out of the modem was not connected to anything, and has been unplugged. I assume that rules out a conflict in IP address assignment. Any other diagnostic ideas would be very much welcomed.

Friday, December 30, 2016

linux - Ubuntu 14.04 Software RAID 1 - md0 inactive



I have my root filesystem on /dev/sdc, and a software RAID 1 spanning /dev/sda and /dev/sdb (I think). I physically moved my computer and ran software updates today (either of these could be the culprit), then noticed that my RAID array was no longer available. I see mdadm has marked it inactive, though I'm not sure why. I also am unable to mount it. I see other suggestions out there, but none that look exactly like my situation, and I'm worried about losing data.




I have not edited any configuration files and this configuration was previously working (with the exception that the RAID was not auto-mounted, which didn't bother me much).



edit: I should also mention that I originally tried setting up software RAID when I built the machine, something went wrong and I think I accidentally destroyed the data on the RAID, so I set up another software RAID and have been using that ever since. I believe that's the reason for the two entries. And now that I look at it, it looks like my data may not even be mirrored across the two drives? Just two separate RAID 1s on one drive each somehow?



edit 2: It looks like /dev/sdb is the RAID configuration that I want based on the update time of today, and the RAID consisting of /dev/sda1 and /dev/sdb1 is the old configuration that has an update time of February when I built this.



cat /proc/mdstat



root@waffles:~# cat /proc/mdstat

Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md127 : inactive sda1[0](S)
976630488 blocks super 1.2

md0 : inactive sdb[1](S)
976631512 blocks super 1.2

unused devices:



mdadm --examine --scan --config=/etc/mdadm/mdadm.conf



root@waffles:~# mdadm --examine --scan --config=/etc/mdadm/mdadm.conf
ARRAY /dev/md/0 metadata=1.2 UUID=dd54a7bd:15442724:ffd24430:0c1444b3 name=waffles:0
ARRAY /dev/md/0 metadata=1.2 UUID=047187c2:2a72494b:57327e8e:7ce78e9c name=waffles:0


cat /etc/mdadm/mdadm.conf



root@waffles:~# cat /etc/mdadm/mdadm.conf

# mdadm.conf
#
# Please refer to mdadm.conf(5) for information about this file.
#

# by default (built-in), scan all partitions (/proc/partitions) and all
# containers for MD superblocks. alternatively, specify devices to scan, using
# wildcards if desired.
#DEVICE partitions containers


# auto-create devices with Debian standard permissions
CREATE owner=root group=disk mode=0660 auto=yes

# automatically tag new arrays as belonging to the local system
HOMEHOST

# instruct the monitoring daemon where to send mail alerts
MAILADDR root

# definitions of existing MD arrays

#ARRAY /dev/md/0 metadata=1.2 UUID=047187c2:2a72494b:57327e8e:7ce78e9c name=waffles:0

# This file was auto-generated on Fri, 20 Feb 2015 10:00:12 -0500
# by mkconf $Id$
ARRAY /dev/md0 metadata=1.2 name=waffles:0 UUID=dd54a7bd:15442724:ffd24430:0c1444b3


cat /proc/mounts



root@waffles:~# cat /proc/mounts

rootfs / rootfs rw 0 0
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
udev /dev devtmpfs rw,relatime,size=16379004k,nr_inodes=4094751,mode=755 0 0
devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /run tmpfs rw,nosuid,noexec,relatime,size=3278828k,mode=755 0 0
/dev/disk/by-uuid/28631011-e1c9-4152-85b6-82073656a9ee / ext4 rw,relatime,errors=remount-ro,data=ordered 0 0
none /sys/fs/cgroup tmpfs rw,relatime,size=4k,mode=755 0 0
none /sys/fs/fuse/connections fusectl rw,relatime 0 0
none /sys/kernel/debug debugfs rw,relatime 0 0

none /sys/kernel/security securityfs rw,relatime 0 0
none /run/lock tmpfs rw,nosuid,nodev,noexec,relatime,size=5120k 0 0
none /run/shm tmpfs rw,nosuid,nodev,relatime 0 0
none /run/user tmpfs rw,nosuid,nodev,noexec,relatime,size=102400k,mode=755 0 0
none /sys/fs/pstore pstore rw,relatime 0 0
systemd /sys/fs/cgroup/systemd cgroup rw,nosuid,nodev,noexec,relatime,name=systemd 0 0
/home/todd/.Private /home/todd ecryptfs rw,nosuid,nodev,relatime,ecryptfs_fnek_sig=b12c61ee79f0f7fc,ecryptfs_sig=2b32246c98b2f7ca,ecryptfs_cipher=aes,ecryptfs_key_bytes=16,ecryptfs_unlink_sigs 0 0
gvfsd-fuse /run/user/1000/gvfs fuse.gvfsd-fuse rw,nosuid,nodev,relatime,user_id=1000,group_id=1000 0 0



cat /etc/fstab



root@waffles:~# cat /etc/fstab
# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
#

# / was on /dev/sdc1 during installation
UUID=28631011-e1c9-4152-85b6-82073656a9ee / ext4 errors=remount-ro 0 1
# swap was on /dev/sdc5 during installation
#UUID=d662ea5e-38f3-4a71-8a56-fa658c32b2eb none swap sw 0 0
/dev/mapper/cryptswap1 none swap sw 0 0


mount /dev/md0 /media/raid1/



root@waffles:~# mount /dev/md0 /media/raid1/

mount: /dev/md0: can't read superblock


grep 'md0' /var/log/syslog



root@waffles:~# grep 'md0' /var/log/syslog
Dec 21 13:50:16 waffles kernel: [ 1.043320] md/raid1:md0: active with 2 out of 2 mirrors
Dec 21 13:50:16 waffles kernel: [ 1.043327] md0: detected capacity change from 0 to 1000070512640
Dec 21 13:50:16 waffles kernel: [ 1.050982] md0: unknown partition table
Dec 21 14:20:16 waffles mdadm[1921]: DeviceDisappeared event detected on md device /dev/md0

Dec 21 14:32:26 waffles mdadm[2426]: DeviceDisappeared event detected on md device /dev/md0
Dec 21 14:37:17 waffles kernel: [ 302.004127] EXT4-fs (md0): unable to read superblock
Dec 21 14:37:17 waffles kernel: [ 302.004198] EXT4-fs (md0): unable to read superblock
Dec 21 14:37:17 waffles kernel: [ 302.004244] EXT4-fs (md0): unable to read superblock
Dec 21 14:37:17 waffles kernel: [ 302.004294] FAT-fs (md0): unable to read boot sector
Dec 21 14:45:26 waffles mdadm[1917]: DeviceDisappeared event detected on md device /dev/md0
Dec 21 15:38:31 waffles kernel: [ 3190.749438] EXT4-fs (md0): unable to read superblock
Dec 21 15:38:31 waffles kernel: [ 3190.749609] EXT4-fs (md0): unable to read superblock
Dec 21 15:38:31 waffles kernel: [ 3190.749679] EXT4-fs (md0): unable to read superblock
Dec 21 15:38:31 waffles kernel: [ 3190.749749] FAT-fs (md0): unable to read boot sector



mdadm --examine /dev/sda1



root@waffles:~# mdadm --examine /dev/sda1
/dev/sda1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 047187c2:2a72494b:57327e8e:7ce78e9c

Name : waffles:0 (local to host waffles)
Creation Time : Thu Feb 12 15:43:00 2015
Raid Level : raid1
Raid Devices : 2

Avail Dev Size : 1953260976 (931.39 GiB 1000.07 GB)
Array Size : 976630336 (931.39 GiB 1000.07 GB)
Used Dev Size : 1953260672 (931.39 GiB 1000.07 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors

State : clean
Device UUID : 0b0a69b7:3c3900c0:6e26b3e4:91155d98

Update Time : Fri Feb 20 09:36:16 2015
Checksum : 9bfb3aa - correct
Events : 27


Device Role : Active device 0
Array State : AA ('A' == active, '.' == missing)



mdadm --examine /dev/sdb1



root@waffles:~# mdadm --examine /dev/sdb1
/dev/sdb1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : 047187c2:2a72494b:57327e8e:7ce78e9c

Name : waffles:0 (local to host waffles)
Creation Time : Thu Feb 12 15:43:00 2015
Raid Level : raid1
Raid Devices : 2

Avail Dev Size : 1953260976 (931.39 GiB 1000.07 GB)
Array Size : 976630336 (931.39 GiB 1000.07 GB)
Used Dev Size : 1953260672 (931.39 GiB 1000.07 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors

State : clean
Device UUID : 2fdaaf8c:30d5c44e:893f9a5a:11d8170c

Update Time : Fri Feb 20 09:36:16 2015
Checksum : 576cfb5c - correct
Events : 27


Device Role : Active device 1
Array State : AA ('A' == active, '.' == missing)



mdadm --examine /dev/sdb (given the update times here, I think this is the one I care about)



root@waffles:~# mdadm --examine /dev/sdb
/dev/sdb:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : dd54a7bd:15442724:ffd24430:0c1444b3

Name : waffles:0 (local to host waffles)
Creation Time : Fri Feb 20 10:03:33 2015
Raid Level : raid1
Raid Devices : 2

Avail Dev Size : 1953263024 (931.39 GiB 1000.07 GB)
Array Size : 976631360 (931.39 GiB 1000.07 GB)
Used Dev Size : 1953262720 (931.39 GiB 1000.07 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors

State : clean
Device UUID : f2e16155:49caff6d:d13115a6:379d2fc8

Update Time : Mon Dec 21 13:14:19 2015
Checksum : d5017b27 - correct
Events : 276


Device Role : Active device 1
Array State : AA ('A' == active, '.' == missing)



Any suggestions for getting this mounted again? It could be a bad drive from the move, but I was careful when moving the computer, and I see others have solved similar issues in software.


Answer



You've got a rather perplexing system there. /dev/sdb (the entire volume) and /dev/sdb1 (the first partition on that volume) are both being detected as RAID devices. This is confusing the OS, and it's creating two RAID arrays: /dev/md0 is a degraded RAID 1 array consisting of /dev/sdb, and /dev/md127 is a degraded RAID 1 array consisting of /dev/sda1. Since they're degraded, the OS won't automatically start them.



The first step in recovering from this is to make a volume-level backup (dd if=/dev/sda, dd if=/dev/sdb), so that if things go wrong, you won't be any worse off than you currently are.
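A minimal sketch of such a backup, assuming a destination volume with enough free space mounted at /backup (the paths are placeholders):

# Image each member disk to a file; bs=1M speeds up the copy, and
# conv=noerror,sync keeps going past any read errors.
dd if=/dev/sda of=/backup/sda.img bs=1M conv=noerror,sync
dd if=/dev/sdb of=/backup/sdb.img bs=1M conv=noerror,sync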



Once you've done that, you can activate your arrays in read-only mode (mdadm --run --readonly /dev/md0; mdadm --run --readonly /dev/md127), mount the disks, and see what each contains.




Assuming that you're correct that /dev/sdb is the RAID array you're using, the next step is to figure out what it was using as the second volume of the RAID array: the metadata clearly states that when you shut it down, it was a two-disk RAID 1 array with both disks present.



If you can't figure it out, or don't want to use whatever the missing piece is, and you're correct that /dev/sda1 contains nothing important, the next step is to add it to /dev/md0:




  1. Wipe out the partition table and md metadata as a safety precaution: dd if=/dev/zero of=/dev/sda bs=1M count=1024

  2. Add it to the array: mdadm --manage /dev/md0 --add /dev/sda and let the array rebuild.



The final step is to wipe out the md superblock on /dev/sdb1. According to the mdadm man page, mdadm --zero-superblock /dev/sdb1 will work, but since the superblock is inside an existing array, I'd be very nervous about actually doing this.



domain name system - DNS using CNAMEs breaks MX records?



We are trying to move all the websites we host to CNAMEs, as we are planning on moving servers in the new year and would like the ability to move some clients to one server and other clients somewhere else. We were planning on giving each client a unique CNAME which we can then change at a later date. (We have other reasons for doing this now, but that is the main one.)



We have been testing this theory with a few of our own domains, and it seemed to be fine. However, when checking the MX records on a domain, I got the CNAME value back rather than the MX record.



Sadly, all of these domains are managed via control panels, but I am guessing they are just writing zone files for me.



I want to create 2 CNAMEs for company.com:




company.com. IN CNAME client.dns.ourserver.com
www IN CNAME client.dns.ourserver.com


The MX record is something like the following:



company.com  IN MX 10 mail.company.com



We have an A record for mail.company.com



Doing:



host -t mx company.com


Returns the CNAME value rather than the MX record.



Is this expected behaviour?




I have managed to get the above configuration working with the 123-reg.co.uk control panel, but I am not sure if that is more luck than anything.


Answer



This is a common error. You cannot use a CNAME RR for your root domain (e.g. company.com) and define additional resource records (such as MX) at the same name.



See Why can't I create a CNAME record for the root record? and RFC1034 section 3.6.2 for details:




If a CNAME RR is present at a node, no other data should be present; this ensures that the data for a canonical name and its aliases cannot be different.
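A working layout keeps the apex as an A record and only aliases the www host (a sketch; 203.0.113.10 stands in for the real server address):

company.com.      IN A      203.0.113.10
www.company.com.  IN CNAME  client.dns.ourserver.com.
company.com.      IN MX 10  mail.company.com.

The trade-off is that the apex has to track the server's address directly instead of following the per-client CNAME.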



linux - PHP unable to allocate memory



On my way to the office this morning, every website on our shared VPS started giving the same error (several times; not the typical fatal memory_limit error):



Warning: Unknown: Unable to allocate memory for pool. in Unknown on line 0


The shared server is a 64-bit OpenVZ container running cPanel. There are only ~6 VPSes on the host; this is the largest one at only 4GB. The host itself has 24GB RAM. As the graphs below show, memory usage on both the host and the VPS is rather low. CPU usage, disk, and the host itself all seem to be normal.




Apache 2.2, PHP 5.2 (mod_php)



Restarting Apache has corrected the problem for now. However, I'd like to prevent it from happening again, and I'm not sure what was limiting the memory. RlimitMem was set to 583653034, yet memory usage is about the same as it usually is. There seems to be plenty of memory: what caused this error?





shared vm memory usage, hovering around 50%






host vm memory usage, used hovering at 20%, cached at 65% until 6AM, where it dropped to ~60%, buffered at ~10%





 apc.ttl=0
apc.shm_size=0
apc.mmap_file_mask=(blank)



1 Segment(s) with 32.0 MBytes
(mmap memory, pthread mutex locking)


Answer



That is definitely the error you get when APC runs out of memory. When I (re)build servers, I often forget to increase apc.shm_size to 128M (suitable for my application), and that is the exact error you see.
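A sketch of the relevant setting (php.ini or apc.ini; 128M is the value from the answer, so size it to your own cache):

; Give APC's shared memory segment enough room for the opcode cache.
; Note: older APC builds expect a plain number of megabytes, e.g. 128.
apc.shm_size=128M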


Thursday, December 29, 2016

domain name system - messed up confused dns records!



I'm totally new to this, please help. While trying to redirect the mail from my server to Gmail using MX records, I think I've messed things up, and I'm thoroughly confused.



First: I can set DNS records both in DomainCentral (registrar) and in cPanel (server). I don't know where the changes should be made. Should I remove all records from cPanel?



Second: I changed the CNAME in DomainCentral like this: mail -> ghs.google.com, many hours ago now, and removed the CNAME entries in cPanel. mail.mydomain.com doesn't work.




Third: I removed mx.mydomain.com and added the following to DomainCentral:



1   @       ASPMX.L.GOOGLE.COM.
5 @ ALT1.ASPMX.L.GOOGLE.COM.
5 @ ALT2.ASPMX.L.GOOGLE.COM.
10 @ ASPMX2.GOOGLEMAIL.COM.
10 @ ASPMX3.GOOGLEMAIL.COM.



It doesn't work.



Fourth: the DKIM setup requires two fields to be filled in, and DomainCentral offers only one field for TXT.


Answer




  1. Your DNS should be changed with your DNS provider. This might be the same as your registrar, or it could be in your cPanel. We can't answer that one for you, but my money is on cPanel. If your registrar has "nameserver" records, do not touch those.


  2. Did you put a full stop (.) at the end of ghs.google.com? You've done it correctly in step #3. DNS propagation can take time, though; how long depends on what the TTL (Time To Live) of the previous record was. Hours is often not enough time to wait. Sometimes it takes days.


  3. See above. DNS propagation takes time, especially if the upstream DNS servers are badly behaved (smack, bad DNS). What you've got there looks totally correct, though.


  4. You'll need to contact DomainCentral's support and ask them for additional TXT records. Although to be honest from what I've read in the past, you're lucky to even have one.
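As an aside to #2 and #3: you can watch the old record's TTL counting down on a caching resolver to estimate how long is left (a sketch; mydomain.com is the placeholder from the question):

dig mx mydomain.com
# The first numeric column of each answer line is the remaining TTL,
# in seconds, on the resolver you queried.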





If you're really serious about it, you might want to consider separating your DNS from your hosting provider. This gives you the ability to dump your host without having to reconfigure all your DNS, and also means that you can guarantee that your DNS settings are correct, without having to guess where you're meant to go to get it to work.



-- Update --



I've done an nslookup on your domain, and the MX records look totally correct:



> set q=mx
> christian-linux.com
Server: enetsdc1.enets.local

Address: 192.168.161.2

Non-authoritative answer:
christian-linux.com MX preference = 10, mail exchanger = aspmx3.googlemail.com
christian-linux.com MX preference = 1, mail exchanger = aspmx.l.google.com
christian-linux.com MX preference = 5, mail exchanger = alt1.aspmx.l.google.com
christian-linux.com MX preference = 5, mail exchanger = alt2.aspmx.l.google.com
christian-linux.com MX preference = 10, mail exchanger = aspmx2.googlemail.com

aspmx3.googlemail.com internet address = 72.14.213.27

aspmx2.googlemail.com internet address = 74.125.43.27


That all looks correct, so I'd say it's just a propagation thing.


linux - Fedora, fsck fails at boot



I'm facing a problem with my server during startup. This is my current configuration:



/dev/sda, /dev/sdc: 320GB each. RAID 1 -> /dev/md127. Working.

/dev/sdb, /dev/sdd: 1000GB each. RAID 1 -> used to be /dev/md126, now it is /dev/md1. AFAIK, it works properly.

/dev/sde: 2000GB -> started to show some symptoms of malfunctioning. Now disconnected.


These are the errors I get:



fsck.ext4: No such file or directory while trying to open /dev/md126
/dev/md126:
The superblock could not be read or does not describe a correct ext2 filesystem. If the device is valid and it really contains an ext2 filesystem (and not swap or ufs or something else), then the superblock is corrupt, and you might try running e2fsck with an alternate superblock:
    e2fsck -b 8193

fsck.ext4: No such file or directory while trying to open /dev/sde1
/dev/sde1:
The superblock could not be read or does not describe a correct ext2 filesystem. If the device is valid and it really contains an ext2 filesystem (and not swap or ufs or something else), then the superblock is corrupt, and you might try running e2fsck with an alternate superblock:
    e2fsck -b 8193



Then I can press CTRL+D and the system reboots, or enter the root password and be dropped to a shell with this prompt -> "(Repair filesystem) 1:"



With /dev/md126, AKA /dev/md1, there is no problem, because from this shell I can mount /dev/md1 and access the data.



With /dev/sde1, I don't know why the error occurs, since the disk is disconnected.



Both devices (md126 and sde1) are included in /etc/fstab, but from the shell it seems like all the filesystems are read-only, so I can't modify the fstab file.



Any ideas what I can do? I'm kind of lost now.
Thank you in advance.




EDIT:
From the "Repair filesystem" shell, I can replicate the error messages by running fsck -A -y, which walks through the whole /etc/fstab file checking each entry; but as I said, the filesystem is read-only, so I can't change the file.



About the old /dev/md126 AKA /dev/md1 array that I can mount perfectly from the shell: if I run fdisk on the two disks that compose the array, I get this:



Disk /dev/sdb: 1000.2 GB, xxxxxxx bytes
255 heads, 63 sectors....
Units = sectors of 1 * 512 = 512 bytes
Sector size (min/optimal) = 4096/4096

Disk identifier: 0x00000000

Disk /dev/sdb doesn't contain a valid partition table

[And exactly the same for /dev/sdd]

Answer



You need to fix your /etc/fstab. As long as it contains entries telling the OS that it should find and mount /dev/md126 and /dev/sde1 on boot, the OS won't be happy.



Either boot from rescue media and modify the entry, or boot, fix the root filesystem, do mount -o remount,rw / and modify the entry.
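A sketch of the second option, from the "(Repair filesystem)" shell:

# Make the root filesystem writable in place, then edit fstab.
mount -o remount,rw /
vi /etc/fstab    # change /dev/md126 to /dev/md1 and drop the /dev/sde1 line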




Once that fstab entry says /dev/md1 in place of /dev/md126, and says nothing about /dev/sde1, your boot sequence should be much less painful.


Windows software RAID 0 over hardware RAID 1 - does this equal a hybrid 1+0?

I've got a server running Windows 2008 R2. The hardware controller can only do RAID 1 or 0, and it's hot-swappable. I have 4 drives (hot-swappable hardware); each pair is mirrored (RAID 1). I'd like to do RAID 10, but the hardware can't.



So, I'm wondering if a stripe (RAID 0) in software across the 2 mirrored pairs (in hardware) would give me a disk I/O performance increase. Our big issue is disk I/O; we have extra CPU clock cycles to spare, and we would rather have one striped volume than 2 virtual drives.




Would there be a performance increase using software striping? Can this be configured during install?

Wednesday, December 28, 2016

Find what is causing failed connections in mysql



I have a replicated MySQL environment where the master DB continuously logs "Aborted_connects". How can I find out which IP and which user are causing the aborted connections? There are about 500 connections created by 25 applications, and I only have access to the database servers.


Answer



Changing the log-warnings level to 2 will give more information in the error logs; see the documentation. You can change it on the fly.
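A sketch of the on-the-fly change (variable name as in MySQL 5.x):

-- Raise warning verbosity at runtime; aborted connections are then
-- recorded in the error log with host and user details.
SET GLOBAL log_warnings = 2;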



Moreover, here is a great blog post which I've used many times to track down bad connection attempts using tcpdump.


Ext4 physical volume with rescued data to LVM

I have /dev/sdc, which holds data rescued from the VG vg_data. I need to convert /dev/sdc to LVM while keeping this data. Is that possible?




Edit 1:
- I formatted /dev/VG_DATA/data from XFS to ext4
- Copied from sdc1 to the LVM data volume (I thought)



After I ran rsync from sdc1 to the data LV, I got errors and /dev/sdb1 (the LVM data) went offline.



[   79.497864] sd 1:0:1:0: rejecting I/O to offline device
[   79.497891] Aborting journal on device dm-2-8.
[   79.497895] sd 1:0:1:0: rejecting I/O to offline device
[   79.497899] JBD2: Error -5 detected when updating journal superblock for dm-2-8.
[   79.497916] sd 1:0:1:0: rejecting I/O to offline device
[   79.497953] sd 1:0:1:0: rejecting I/O to offline device
[   79.497970] EXT4-fs error (device dm-2): ext4_journal_check_start:56: Detected aborted journal
[   79.497983] sd 1:0:1:0: rejecting I/O to offline device
[   79.498154] sd 1:0:1:0: rejecting I/O to offline device
[   79.498169] sd 1:0:1:0: rejecting I/O to offline device
[   79.498192] sd 1:0:1:0: rejecting I/O to offline device
[   79.498203] sd 1:0:1:0: rejecting I/O to offline device


What is wrong with the volume group? It is a virtual machine on Hyper-V 2008 R2.




Edit 2 - from the boot log:



Sep 17 17:01:23 localhost lvm: 1 logical volume(s) in volume group "VG_DATA" now active
Sep 17 17:01:23 localhost systemd: Found device /dev/mapper/VG_DATA-data.
Sep 17 17:01:23 localhost systemd: Started LVM2 PV scan on device 8:17.
Sep 17 17:01:23 localhost systemd: Mounting /data...
Sep 17 17:01:23 localhost kernel: XFS (sda1): Ending clean mount
Sep 17 17:01:23 localhost systemd: Mounted /boot.
Sep 17 17:01:23 localhost kernel: EXT4-fs (dm-2): recovery complete
Sep 17 17:01:23 localhost kernel: EXT4-fs (dm-2): mounted filesystem with ordered data mode. Opts: (null)

Sep 17 17:01:23 localhost systemd: Mounted /data.

linux - How to analyse logs after the site was hacked

One of our web projects was hacked. The attacker changed some template files in the project and one core file of the web framework (one of the well-known PHP frameworks).
We found all the corrupted files with git and reverted them. So now I need to find the weak point.



With high probability we can say that it was not FTP or SSH password theft. The hosting provider's support specialist said (after analyzing the logs) that it was a security hole in our code.



My questions:



1) What tools should I use to review the access and error logs of Apache? (Our server distro is Debian.)




2) Can you give tips for detecting suspicious lines in logs? Maybe tutorials or examples of useful regexps or techniques? (A rough sketch of what I mean is below.)



3) How do I separate "normal user behavior" from suspicious behavior in the logs?



4) Is there any way to prevent such attacks in Apache?



Thanks for your help.
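For #2, is something like the following a reasonable first pass? (The patterns are just guesses on my part.)

# Scan all (possibly rotated) access logs for POSTs and common
# injection strings around the time of the compromise.
zgrep -hE 'POST |base64_decode|eval\(|\.\./' /var/log/apache2/access.log* | less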

centos - How to restrict user1 to one directory and user2 to another directory, not only home directories




In CentOS, how can I restrict user1 to the directory /var/www/html and user2 to the directory /home/user2, and by default user3 and all others to their home directories?



My config file /etc/vsftpd/vsftpd.conf contains the following:



# You may specify an explicit list of local users to chroot() to their home
# directory. If chroot_local_user is YES, then this list becomes a list of
# users to NOT chroot().
#chroot_list_enable=YES
chroot_local_user=YES

# (default follows)
#chroot_list_file=/etc/vsftpd/chroot_list



  • This allows all users access to their home directory only, but I need to give user1 access to the full web directory and user2 only his home directory (for personal FTP upload/download).


Answer



You should use a virtual-user setup; look here for an example of what to do. Look at local_root=/var/ftpserver/, or at pam.d and change it for MySQL; there you should be able to customize the directory and so on.
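Alternatively, stock vsftpd can pin individual local users to different roots via per-user config files; a sketch (the user_conf path is arbitrary):

# In /etc/vsftpd/vsftpd.conf:
user_config_dir=/etc/vsftpd/user_conf

# In /etc/vsftpd/user_conf/user1 (settings applied only to user1):
local_root=/var/www/html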



linux - IPTables + Limit module: Why doesn't limit-burst get completely used?

Long time reader, first time poster.. yada yada yada..



Anyways, I am hopeful someone out there has extensive iptables/netfilter LIMIT or HASHLIMIT module experience and can explain the behavior I'm witnessing.



Background:

We have a webserver and want to limit how many connections a customer can have over the course of a month (HTTP keepalives are off, BTW). So, I am trying to use the iptables LIMIT module to limit the number of their new connections to a set number per month (let's say 500). The iptables LIMIT module uses a "token bucket" algorithm, so I should be able to set the limit-burst (bucket size) to 500 and the limit (refill rate) to 500 divided by 28 days, or about 18/day. This will make sure the bucket gets refilled in a month's time (4 weeks) if it is ever completely emptied. (I understand this will actually grant more than exactly 500, but it should be close enough for our needs.)



Here are my iptables rules (We group IPs using ipset. LimBurTest4 contains my source testing machines)



Chain INPUT (policy DROP 2316 packets, 186K bytes)
pkts bytes target prot opt in out source destination
2952K 626M ACCEPT all -- * * 0.0.0.0/0 0.0.0.0/0 state RELATED,ESTABLISHED /* Accept outgoing return traffic */
379 13702 ACCEPT icmp -- * * 0.0.0.0/0 0.0.0.0/0 icmptype 8 state NEW limit: avg 1/sec burst 1
377 30868 DROP icmp -- * * 0.0.0.0/0 0.0.0.0/0
0 0 DROP tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp flags:0x3F/0x00 /* Block NULL packets */
73 14728 DROP tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp flags:!0x17/0x02 state NEW /* Block SYN flood */
0 0 DROP tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp flags:0x3F/0x3F /* Block XMAS packets */
24 120 ACCEPT tcp -- * * 0.0.0.0/0 0.0.0.0/0 match-set LimBurTest4 src tcp dpt:443 limit: avg 18/day burst 500 /* LimitBurst Test */
76 30180 LOG tcp -- * * 0.0.0.0/0 0.0.0.0/0 match-set LimBurTest4 src tcp dpt:443 /* LimitBurst Testing */ LOG flags 0 level 4 prefix "LimBurTest Over Quota "
2522 138K REJECT tcp -- * * 0.0.0.0/0 0.0.0.0/0 tcp dpt:443 /* Reject queries */ reject-with tcp-reset


I add the LIMIT rule like this:



iptables -I INPUT 7 -m set --match-set LimBurTest4 src -p tcp --dport 443 -m limit --limit 18/day --limit-burst 500 -m comment --comment "Limit Burst Test" -j ACCEPT



Testing:
Then I created a simple shell script to do requests to my webserver one after the other using curl. Each successful request takes about 0.20 ms. Output of the script looks like this:



./limBurTest.sh
1 - 200 - 0.257 ms
2 - 200 - 0.193 ms
3 - 200 - 0.155 ms
etc.

etc.


Outcome:
My expectation of this configuration would be to quickly (in a couple seconds) use up all 500 connections before I start seeing rejected connections. However, that is not happening at all. Instead, my test script made 24 successful connections and then the rest were rejected. For example, in the above iptables output, I ran my shell script in a loop 100 times and you can see 24 ACCEPT rule matches and 76 LOG rule matches after the ACCEPT rule. I have tested this on CentOS 6.8 and Ubuntu 16.04 and both behave this way, but this seems contrary to the documentation. Why can I not use up all 500 connections specified by limit-burst?



And, yes, of course, I have done extensive googling and have seen lots of examples of people saying that the LIMIT module is supposed to work exactly as I've described. And I've read the netfilter docs numerous times.



What am I missing here?




Thank you in advance.

Domain, hosting, and email



This may not be the site for this question.




I have a domain name I registered with 1and1. I am using Squarespace to host my website, pointing to that domain, and I just signed up for Office 365 and want to use that for email.



Squarespace provided me with an IP address to set as the A record. I added this in my 1and1 admin console and was able to access my webpage. Office 365 provided me with two name servers for my email and told me this would not change my A record mapping. As soon as I entered the name server information, I was no longer able to bring up my web page at Squarespace.



I do not know much about DNS or hosting, so I am not sure if what I want to do is even possible, though I imagine it would be.
Any help is appreciated.


Answer



OK, so it looks like Microsoft does provide authoritative name servers for Office 365, and it sounds like you already moved your authoritative name servers over to MS. Once you did that, the existing A record for Squarespace ceased to exist, as it was not on MS's name servers. You have to add that A record to your new authoritative name servers with Microsoft.



http://onlinehelp.microsoft.com/en-us/office365-smallbusinesses/hh416759.aspx

http://onlinehelp.microsoft.com/en-us/office365-smallbusinesses/jj655390.aspx



The above links should explain the difference between having 1and1 host your DNS and adding the necessary records to make O365 work for your domain, as opposed to having Microsoft host your authoritative name servers.



If you keep Microsoft as your authoritative name servers, it is not clear to me that they will let you add an A record pointing to a third party (i.e. the Squarespace IP).



I would probably advise reverting to the 1and1 name servers and manually adding the DNS records needed for O365, per the articles above.


Tuesday, December 27, 2016

smtp - Sendmail relay authentication

I'm trying to set up my sendmail to authenticate against a relay (Comcast). I'm not seeing any attempts to authenticate at all. I'm trying to debug how authentication works, and I can't connect all the pieces.




I have, in my .mc file:



define(`RELAY_MAILER_ARGS', `TCP $h 587')dnl
define(`SMART_HOST', `relay:smtp.comcast.net.')dnl
define(`confAUTH_MECHANISMS', `PLAIN')dnl
FEATURE(`authinfo',`hash /etc/mail/client-info')dnl


And in my /etc/mail/client-info:




AuthInfo:*.comcast.net "U:root" "I:comcast_user" "P:comcast_password"


Now, I know everything is fine with the u/p, as I could authenticate directly through SMTP, using telnet.



There are two things I don't understand.




  1. When AuthInfo records are searched for, they are matched by the target hostname. How? Does it use the map key (something I would expect), or the so-called "Domain" (the "R:" parameter, which I don't set in my authinfo line)?


  2. What is "U:", really? Sendmail README (http://www.sendmail.org/m4/smtp_auth.html) says it's "user(authoraztion id)", and "I:" is "authentication ID". That suggests that my username should be in "U:", actually, but http://www.sendmail.org/~ca/email/auth.html says that "I:" is your remote user name.





The session looks like this:



[root@manticore]/etc/mail# sendmail -qf -v
Warning: Option: AuthMechanisms requires SASL support (-DSASL)

Running /var/spool/mqueue/p97CgcWq023273 (sequence 1 of 399)
my@email.com... Connecting to smtp.comcast.net. port 587 via relay...
220 omta19.westchester.pa.mail.comcast.net comcast ESMTP server ready

>>> EHLO my.host.name
250-omta19.westchester.pa.mail.comcast.net hello [my.ip.add.res], pleased to meet you
250-HELP
250-AUTH LOGIN PLAIN
250-SIZE 15728640
250-ENHANCEDSTATUSCODES
250-8BITMIME
250-STARTTLS
250 OK
>>> STARTTLS

220 2.0.0 Ready to start TLS
>>> EHLO my.host.name
250-omta19.westchester.pa.mail.comcast.net hello [my.ip.add.res], pleased to meet you
250-HELP
250-AUTH LOGIN PLAIN
250-SIZE 15728640
250-ENHANCEDSTATUSCODES
250-8BITMIME
250 OK
>>> MAIL From:<> SIZE=2183

550 5.1.0 Authentication required
MAILER-DAEMON... aliased to postmaster
postmaster... aliased to root
root... aliased to my@email.com
postmaster... aliased to root
root... aliased to my@email.com
>>> RSET
250 2.0.0 OK



[root@manticore]/etc/mail# sendmail -d0.1
Version 8.14.3
Compiled with: DNSMAP LOG MAP_REGEX MATCHGECOS MILTER MIME7TO8 MIME8TO7
NAMED_BIND NETINET NETINET6 NETUNIX NEWDB NIS PIPELINING SCANF
SOCKETMAP STARTTLS TCPWRAPPERS USERDB XDEBUG


Thanks,
Pawel.

performance - How can I tell which page is creating a high-CPU-load httpd process?



I have a LAMP server (CentOS-based MediaTemple (DV) Extreme with 2GB RAM) running a customized WordPress + bbPress combination.




At about 30k pageviews per day the server is starting to groan. It stumbled for about 5 minutes earlier today when there was an influx of traffic. Even under normal conditions I can see that the virtual server is sometimes at 90%+ CPU load. Using top I can often see 5-7 httpd processes that are each using 15-30% (and sometimes even 50%) CPU.



Before we do a big optimization pass (our use of MySQL is probably the culprit) I would love to find the pages that are the main offenders and deal with them first. Is there a way that I can find out which specific requests were responsible for the most CPU-hungry httpd processes? I have found a lot of info on optimization in general, but nothing on this specific question.



Secondly, I know there are a million variables, but if you have any insight on whether we should be at the boundaries of performance with a single dedicated virtual server with a site of this size, then I would love to hear your opinion. Should we be thinking about moving to a more powerful server, or should we be focused on optimization on the current server?


Answer



strace is a good way to start debugging this kind of problem. Try to strace the PID of one of the Apache processes consuming the most CPU:



strace -f -t -o strace.output -p PID



This will show you the system calls made within that process. Take a look at strace.output and see what the process was doing. This might light the way and show you where the process is hanging. The "-t" flag is very important here, as it prefixes each line of the strace output with the time of day; so look for a big jump in the timestamps.



On the other hand, as you think MySQL is probably the culprit, I'd enable the slow query log, take a look at it, and try to optimize those queries. More info about the slow query log here.
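A sketch of turning it on at runtime (variable names valid for MySQL 5.1 and later; the one-second threshold is arbitrary):

-- Log statements slower than long_query_time seconds to the slow log.
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;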



Also, don't forget to take a look at the logfiles of your webserver.



Regarding your second question, I think it's hard to tell with only this info. Separating the frontend (webserver) from the backend (database) is always a good practice if you have the budget for it. On the other hand, I think that before adding more hardware, one should focus on trying to optimize the performance using the current hardware. Otherwise, the problem is probably just being postponed.




Hope this helps.


domain name system - How use DNS server to create simple HA (High availability) of my website?

Welcome,



How can I use a DNS server to create simple HA (high availability) for a website?




For example, if my web server (for easier understanding I use internal IPs; in reality they will be at other hosting companies):



192.168.0.120:80 (is offline)
traffic goes to
192.168.0.130:80






You are right, "high availability" was a poor choice of words; of course I was thinking about failover.




Using a few IPs in A records is good for simple load balancing, but not when I want to notify the user about the failure (for example, display a page saying "Oops, something is wrong with our server, we are working on it") instead of a "can't establish connection" error.



I was thinking about setting up something like this:



2 DNS servers, one installed on the www server.

Both have a low TTL.

On my domain, set up 2 NS records:

the first for the DNS server alongside my Apache server,
the second pointing to the other DNS server.



If a user tries to connect, he will get the IP of the www server from the first DNS server. If that DNS server is offline (probably the www server is also down), the resolver will try the second NS record, which points to the other DNS server, and that one will point to a "backup" page.



That's what I would like to do.



If you have another idea, please share.



A reverse proxy is not an option, because the IP of the server can change, and I may use another country for the backup.

Monday, December 26, 2016

Synchronize Active Directory to Database

We are in a situation where we would like to offer our customers the ability to manage their users themselves. There are around 300 customers with up to a total of 10,000 users.



Besides creating, updating and removing users, they will very often read information about users for statistics and other useful purposes. All this functionality should be available from an intranet web page (.NET Framework 4) that the users will access through Citrix or similar.



Now the problem is that we would really prefer the users not to query AD directly for each request, but rather have them hit a database that is synchronized with AD.

It would be sufficient to run this synchronization a few times each day (maybe every 5 hours). When they create a user, it should not be available right away, but reviewed and then created within two days (the next step would be to remove this manual review, but that's out of scope for this question).



What do you think about this synchronization of AD? Does anyone have experience with it, and is it something done in other organizations where you have lots of requests that are better handled by a database than by AD (I presume)?



Are there any techniques out there for writing such a script that synchronizes AD with database tables? My primary concern is the groups/members relations, which can be rather complicated. Or is there software that synchronizes AD with a database?



Any comments will be much appreciated. Thank you.

domain name system - What am I doing wrong with bind9?

I am trying to point a domain name at a VPS, but I am failing.



I get this when I dig:



; <<>> DiG 9.10.3-P4-Ubuntu <<>> ns1.example.com @61.15.2.95
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 49520
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 2
;; WARNING: recursion requested but not available


;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;ns1.example.com. IN A

;; ANSWER SECTION:
ns1.example.com. 604800 IN A 61.15.2.95

;; AUTHORITY SECTION:

example.com. 604800 IN NS ns2.example.com.
example.com. 604800 IN NS ns1.example.com.

;; ADDITIONAL SECTION:
ns2.example.com. 604800 IN A 178.159.2.95

;; Query time: 314 msec
;; SERVER: 178.159.2.95#53(178.159.2.95)
;; WHEN: Sat Apr 15 14:26:22 +04 2017
;; MSG SIZE rcvd: 106



Problem:
;; WARNING: recursion requested but not available
Since it is just a warning, I tried to register the name servers at quickhostuk, but I got this error in the DNS management:



Failed to Modify Domain Nameservers: Nameserver not found at registry


Here is what I did:




Say my VPS IP is: 61.15.2.95
domain name: example.com
name servers:




  • ns1.example.com=>61.15.2.95

  • ns2.example.com =>61.15.2.95



1. I installed bind9.




2. I created a zone in named.conf.local:



zone "example.com" {
type master;
file "/etc/bind/db.example.com";
};


3. I created a db file for db.example.com:




;
; BIND data file for local loopback interface
;
$TTL 604800
@ IN SOA ns1.example.com. root.ns1.example.com. (
3 ; Serial
604800 ; Refresh
86400 ; Retry
2419200 ; Expire
604800 ) ; Negative Cache TTL

;
@ IN NS ns1.example.com.
@ IN NS ns2.example.com.
@ IN A 61.15.2.95
ns1 IN A 61.15.2.95
ns2 IN A 61.15.2.95


4. I modified named.conf.options and added my VPS IP to the forwarders; I also tried Google's 8.8.8.8 and 8.8.4.4:




options {
directory "/var/cache/bind";

// If there is a firewall between you and nameservers you want
// to talk to, you may need to fix the firewall to allow multiple
// ports to talk. See http://www.kb.cert.org/vuls/id/800113

// If your ISP provided one or more IP addresses for stable
// nameservers, you probably want to use them as forwarders.
// Uncomment the following block, and insert the addresses replacing

// the all-0's placeholder.

forwarders {
61.15.2.95;
};


//========================================================================
// If BIND logs error messages about the root key being expired,
// you will need to update your keys. See https://www.isc.org/bind-keys

//========================================================================
dnssec-validation auto;

auth-nxdomain no; # conform to RFC1035
listen-on-v6 { any; };
};


How should I correctly put it all together to make it work?

Saturday, December 24, 2016

domain name system - Whitelabel DNS with Amazon Route53

I've searched the internet for the answer, but I can't find it, and this is because DNS management is not exactly in my skill set. I want to use Amazon AWS Route 53 to host a multi-tenant PHP application with a load balancer and EC2 instances. I currently have my domain example.com delegated to Route 53 and residing in a hosted zone. This part is working fine, and the domain example.com points to the correct instance. However, I want to create subdomains like



ns1.example.com and
ns2.example.com



so that I can use them as name servers for other domains. Let's say I want to forward mydomain2.com to the load balancer by changing mydomain2.com's name servers to ns1.example.com and ns2.example.com. However, I'm not sure how I should add both subdomains ns1 and ns2 and point them to the LB in the Amazon Route 53 GUI console.



I would be thankful for any assistance.

Friday, December 23, 2016

domain name system - When do I need an own DNS server?





Note: I realized belatedly that this is a duplicate of question 23744, which already has good answers. I couldn't close this post for lack of reputation; maybe someone else could step in.






I use hosted servers for my company, and although I might opt for colocation of more customized machines in the future, on the whole I'm not too keen on diving too deep into the "datacenter business". So generally, I would like to leave the handling of my infrastructure to dedicated pros as much as possible.




Recently, I've started to lust for some more flexibility regarding the DNS entries for my domain and have looked into running my own name server(s). It seems to me that running a professional, failsafe name server is a little more effort than I'm willing to commit to just now. Still, I like the idea of having a lot of control over it.



For a more experienced sysadmin, what are the indications for running your own name servers? And when using name servers that are professionally maintained, what self-administration options should I look for?

Nginx - too many redirects

Sorry for the messy conf file; I have been trying everything, so it's become a mess. I want to enable SSL and redirect all traffic to HTTPS. Currently, with the conf file below, I can navigate to HTTP with no issues; HTTPS just redirects back to HTTP. If I uncomment line 5, the browser simply says there are too many redirects. Any help is appreciated.



UPDATE: Based on the comment below, I updated the conf file. Same issue: the minute I add return 301, I get the too many redirects error.



# Redirect all variations to https://www domain
server {
    listen 80;
    server_name name.com www.name.com;
    # return 301 https://www.name.com$request_uri;
}

server {
    listen 443 ssl;
    server_name name.com;

    ssl_certificate /etc/letsencrypt/live/name.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/name.com/privkey.pem;

    # Set up preferred protocols and ciphers. TLS 1.2 is required for HTTP/2.
    ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
    ssl_prefer_server_ciphers on;
    ssl_ciphers ECDH+AESGCM:ECDH+AES256:ECDH+AES128:DH+3DES:!ADH:!AECDH:!MD5;

    # This is a cache for SSL connections.
    ssl_session_cache shared:SSL:2m;
    ssl_session_timeout 60m;

    # return 301 https://www.name.com$request_uri;
}

Solving DHCP Scope exhaustion - multiple /24 subnets or a single /23 subnet?

We have a client with a remote site with a 10.0.9.0/24 network. The DHCP scope on this network is nearly exhausted, with only about 20 addresses available. We could possibly increase this to 30 by re-IPing some static devices and shrinking our exclusion range. This may not be a long-term solution.




I've been tasked with engineering a long-term solution. Caveat: It's been 14 years since I did any serious Cisco router/switch configuration.



It was recommended that I create 3 additional VLANs with 3 additional subnets (10.0.10.0, etc). This has the advantage of allowing me to leave most of my static assigned IPs alone, but requires a ton of switch and router configuration for inter-VLAN routing. With a single DHCP server, this complicates the DHCP setup as well, with multiple DHCP scopes and a superscope. Most of this configuration can be done with minimal or no downtime for clients.



The other option is to resubnet the site to 10.0.10.0/23. I would have to re-IP every static device, but changing IPs and subnet masks is a bit easier (for me) than configuring multiple VLANs and inter-VLAN routing across 4 different subnets.
(Note: The 10.0.8.0/24 network is in use at another site, so we can't use 10.0.8.0/23 to avoid changing the 10.0.9.* IP addresses at this site.) This configuration will require significant downtime as network devices are re-IPd.



My understanding of the two solutions is that a single /23 subnet has a larger broadcast domain (510 clients), whereas 4 /24 subnets have 4 smaller broadcast domains. With ~220 devices, I don't see broadcast traffic as a significant factor. The switches in use are Cisco 2960-S models with limited Layer 3 capability, so most of the inter-VLAN routing would fall back on the site's router, increasing router overhead.




Is there an industry standard preferred method for resolving DHCP scope exhaustion? Are there any pros/cons to either solution that I haven't mentioned here that may make a difference? Is there a third, better solution?



More Info:
Since whatever I do, I'm re-engineering this network, I'm trying to identify what the "right" solution is. If this were a new network that would support more than 254 devices, would it be better to build a single /23 subnet or multiple /24 subnets?



If I remembered all of my CCNA training from 2002, adding new /24 subnets and VLANs might be trivial. But I know from experience that it's not so trivial to maintain. I am looking for advice as to which solution will cause me the least amount of pain now and a year from now.

Wednesday, December 21, 2016

Apache: How to redirect OPTIONS request with .htaccess?



I have Apache 2.2.4 server with a lot of messages like this in the access_log:



::1 - - [15/May/2010:19:55:01 +0200] "OPTIONS * HTTP/1.0" 400 543
::1 - - [15/May/2010:20:22:17 +0200] "OPTIONS * HTTP/1.0" 400 543
::1 - - [15/May/2010:20:24:58 +0200] "OPTIONS * HTTP/1.0" 400 543
::1 - - [15/May/2010:20:25:55 +0200] "OPTIONS * HTTP/1.0" 400 543
::1 - - [15/May/2010:20:27:14 +0200] "OPTIONS * HTTP/1.0" 400 543


These are the "internal dummy connections" as explained on this page:



http://wiki.apache.org/httpd/InternalDummyConnection




The page also hits my main problem: "In 2.2.6 and earlier, in certain configurations, these requests may hit a heavy-weight dynamic web page and cause unnecessary load on the server. You can avoid this by using mod_rewrite to respond with a redirect when accessed with that specific User-Agent or IP address."



Well, obviously I cannot use the User-Agent because I minimized the server signature, but I could use the IP address. However, I don't have a clue what the RewriteCond and RewriteRule should look like for the IPv6 address ::1.



The website where this runs uses CodeIgniter, so the following .htaccess is already in place; I just need to add to it:



RewriteEngine on
RewriteCond %{REQUEST_URI} ^/system.*
RewriteRule ^(.*)$ /index.php?/$1 [G]
RewriteCond %{REQUEST_FILENAME} !-f

RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /index.php?/$1 [L]


Any idea how to write this .htaccess rule?



Solved: Adding another rule makes OPTIONS requests fall through the current rules and be handled the same way Apache handles them by default.



RewriteEngine on
RewriteCond %{REQUEST_URI} ^/system.*

RewriteRule ^(.*)$ /index.php?/$1 [G]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REMOTE_HOST} !^::1$
RewriteRule ^(.*)$ /index.php?/$1 [L]


I never access the website via localhost on IPv6, so this works great.


Answer



RewriteCond %{REMOTE_HOST} ^::1$

RewriteRule ^OPTIONS http://www.google.com/ [L]


That's my best guess; I'm certain of the RewriteCond, but not quite of the RewriteRule.



It will match on REMOTE_HOST being ::1 and then rewrite a request for any URL starting with OPTIONS to www.google.com.


MySQL on Linux out of memory



OS: Redhat Enterprise Linux Server Release 5.3 (Tikanga)




Architecture: Intel Xeon 64Bit



MySQL Server 5.5.20 Enterprise Server advanced edition.



Application: Liferay.



My database size is 200MB. RAM is 64GB.
Memory consumption increases gradually until we run out of memory.
Only rebooting releases all the memory, but then the memory consumption starts again and reaches 63-64GB in less than a day.




Parameter details:

key_buffer_size=16M
innodb_buffer_pool_size=3GB
innodb_buffer_pool_instances=3
max_connections=1000
innodb_flush_method=O_DIRECT
innodb_change_buffering=inserts
read_buffer_size=2M
read_rnd_buffer_size=256K



It's a serious production server issue that I am facing.

What could be the reason behind this, and how do I resolve it?



This is the report from 2pm today, after Linux was rebooted yesterday at around 10pm.



Output of free -m




total used free shared buffers cached
Mem: 64455 22053 42402 0 1544 1164
-/+ buffers/cache: 19343 45112

Swap: 74998 0 74998





Output of vmstat 2 5




procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------

r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 0 43423976 1583700 1086616 0 0 1 173 22 27 1 1 98 0 0
2 0 0 43280200 1583712 1228636 0 0 0 146 1265 491 2 2 96 1 0
0 0 0 43421940 1583724 1087160 0 0 0 138 1469 738 2 1 97 0 0
1 0 0 43422604 1583728 1086736 0 0 0 5816 1615 934 1 1 97 0 0
0 0 0 43422372 1583732 1086752 0 0 0 2784 1323 545 2 1 97 0 0


Output of top -n 3 -b






top - 14:16:22 up 16:32, 5 users, load average: 0.79, 0.77, 0.93
Tasks: 345 total, 1 running, 344 sleeping, 0 stopped, 0 zombie
Cpu(s): 1.0%us, 0.9%sy, 0.0%ni, 98.1%id, 0.1%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 66002772k total, 22656292k used, 43346480k free, 1582152k buffers
Swap: 76798724k total, 0k used, 76798724k free, 1163616k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
6434 mysql 15 0 4095m 841m 5500 S 113.5 1.3 426:53.69 mysqld

1 root 15 0 10344 680 572 S 0.0 0.0 0:03.09 init
2 root RT -5 0 0 0 S 0.0 0.0 0:00.01 migration/0
3 root 34 19 0 0 0 S 0.0 0.0 0:00.00 ksoftirqd/0
4 root RT -5 0 0 0 S 0.0 0.0 0:00.00 watchdog/0
5 root RT -5 0 0 0 S 0.0 0.0 0:00.01 migration/1


Answer



I had a similar issue; basically, I modified the mysqltuner.pl script to make it more verbose, and worked out what had happened.




Basically, if you are using any variation of the my-innodb-heavy-4G.cnf config file, the major part of the memory usage will be roughly:



memory usage = min(tmp_table_size, max_heap_table_size) 
+ key_buffer_size + query_cache_size
+ innodb_buffer_pool_size + innodb_additional_mem_pool_size + innodb_log_buffer_size
+ (max_connections *
(read_buffer_size + read_rnd_buffer_size
+ sort_buffer_size + thread_stack + join_buffer_size
)
)
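As a rough illustration with the values from the question, and MySQL 5.5 defaults assumed for the variables that were not posted (sort_buffer_size=2M, join_buffer_size=128K, thread_stack=192K):

per-connection worst case ~ 2M + 256K + 2M + 192K + 128K ~ 4.6M
1000 connections          ~ 4.6M x 1000 ~ 4.5GB of thread buffers,
                            on top of the 3GB buffer pool and other globals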



This sum does not include all the factors; please refer to the mysqltuner.pl script code (and run it) to see them all.



So, it seems you need to lower read_buffer_size, read_rnd_buffer_size, sort_buffer_size, thread_stack and join_buffer_size considerably, since their sum is multiplied by your max_connections of 1000.



Another solution is to lower the max_connections number a little. Next to this huge amount of memory for thread buffers, innodb_buffer_pool_size and all the InnoDB-related variables are a minor issue.



You can also try to figure out whether your applications really need such large sort_buffer_size and join_buffer_size values. If not, turn those values down.




Hope this helps.


high availability - Cheap, minimal, robust web application hosting for at least three nines (99.9%) uptime?

How can one achieve the cheapest yet very reliable web application configuration?



Let's assume at least two application servers at $80/mo each, plus a DB server, will support most people's applications for a while - and we just want to achieve good reliability (at least three nines).
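For reference, three nines of uptime allows 0.1% downtime: about 8.8 hours per year, or roughly 43 minutes per month.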



One can rent a pretty good VPS for around $80/mo right now from any reasonable provider (Amazon EC2, Slicehost, etc.). However, none of these VPS hosting solutions are perfect, and I've experienced worse than 99.9% uptime with each of them.




I'm not sure how best to configure these three machines. The best approach would be to put the two app servers on different providers (ideally with different network connections) and use HAProxy so they keep an eye on each other. If one fails, the other updates the DNS to remove it from the pool of IP addresses for your application. But what to do about the DB server? That's still a single point of failure.



I have had issues with DNS in the past, but this can be handled by an external dedicated provider like DNS Made Easy very cheaply ($15/year). It supports dynamically modifying DNS entries as well, which is handy if you won't be able to update them manually during a crisis.



Backups should be done to an external source (S3 or an FTP site) at least once per day - again, a minimal cost each month. I think you also need an automated deploy-and-restore script from your backups in order to get past three nines.



I don't feel this is quite there yet, due to the DB availability, but it'll cost you around $80x3 + disk space + DNS = roughly $250/mo.



Can one do better?

Tuesday, December 20, 2016

vsftpd default permissions for website directory and ftp



I'm trying to set up vsftpd and my users correctly. I can connect to the FTP server with my user, but I can't create any directory or file. My websites folder is at /srv/www/domain.




vsftpd.conf



anonymous_enable=NO
local_enable=YES
write_enable=YES
local_umask=022
chroot_local_user=YES



Directory permissions



drwxrwxr-x 4 root     www-data 4096 Oct  5 20:58 www
drwxrwxr-x 2 user_ftp www-data 4096 Oct  5 22:19 domain


User group



user_ftp => www-data





It's strange, because when the domain folder has:




  • 755 permissions: I can't connect to my FTP account (500 OOPS: vsftpd: refusing to run with writable root inside chroot()), but I can add files & folders (if I change the permissions while I'm logged in)

  • 575 permissions: I can connect, but I can't edit/delete/add files & folders



What am I doing wrong? :)


Answer




My solution:



With chroot_local_user set to YES, root should own the home directory; below it you can then create other directories and assign permissions to the FTP user so he can do everything he wants there.



Root user access/permission to /srv/www/domain



FTP user access/permission to /srv/www/domain/public_html
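A minimal sketch of that layout in shell, assuming the FTP user's home is /srv/www/domain and public_html is the writable subdirectory (names taken from the question; adjust to taste):

# the chroot root must be owned by root and not writable by the FTP user
chown root:root /srv/www/domain
chmod 755 /srv/www/domain

# create a subdirectory the FTP user fully controls
mkdir -p /srv/www/domain/public_html
chown user_ftp:www-data /srv/www/domain/public_html
chmod 775 /srv/www/domain/public_html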


Monday, December 19, 2016

ubuntu - RabbitMq Management plugin only on localhost



On RabbitMQ 3.5.7, Ubuntu 16.04.



I want the RabbitMQ management plugin to listen only on localhost; the idea is to use an SSH tunnel to reach the management web GUI from the computer I use to connect to my server.




I found this thread that seems to document everything that needs to be done.



Here is what I have done:
I edited /etc/rabbitmq/rabbitmq-env.conf, it looks like that:



export RABBITMQ_CONFIG_FILE="/etc/rabbitmq/rabbitmq.config"
# Defaults to rabbit. This can be useful if you want to run more than one node
# per machine - RABBITMQ_NODENAME should be unique per erlang-node-and-machine
# combination. See the clustering on a single machine guide for details:

# http://www.rabbitmq.com/clustering.html#single-machine
#NODENAME=rabbit

# By default RabbitMQ will bind to all interfaces, on IPv4 and IPv6 if
# available. Set this if you only want to bind to one network interface or
# address family.
#NODE_IP_ADDRESS=127.0.0.1

# Defaults to 5672.
#NODE_PORT=5672


export RABBITMQ_NODENAME=rabbit@localhost
export RABBITMQ_NODE_IP_ADDRESS=127.0.0.1
export ERL_EPMD_ADDRESS=127.0.0.1


Then I created and edited "/etc/rabbitmq/rabbitmq.config":



[
  {rabbitmq_management, [
    {listener, [{port, 15672}, {ip, "127.0.0.1"}]}
  ]},
  {kernel, [
    {inet_dist_use_interface, {127.0.0.1}}
  ]}
].


I then ran service rabbitmq-server reload, service rabbitmq-server stop, and service rabbitmq-server start.




It did not work.



I rebooted the machine; it is still not working.



When I do a sudo lsof -i -n -P, I see this:




beam    1199 rabbitmq    8u  IPv4  13374  0t0  TCP *:25672 (LISTEN)
beam    1199 rabbitmq    9u  IPv4  13376  0t0  TCP 127.0.0.1:60223->127.0.0.1:4369 (ESTABLISHED)
beam    1199 rabbitmq   18u  IPv4  14714  0t0  TCP 127.0.0.1:5672 (LISTEN)
beam    1199 rabbitmq   19u  IPv4  14716  0t0  TCP *:15672 (LISTEN)




In "/var/log/rabbitmq/rabbit@localhost.log", I can see:




"config file(s) : /etc/rabbitmq/rabbitmq.config (not found)"




Answer



I solved it. My mistake was:

export RABBITMQ_CONFIG_FILE="/etc/rabbitmq/rabbitmq.config" instead of export RABBITMQ_CONFIG_FILE="/etc/rabbitmq/rabbitmq" in "/etc/rabbitmq/rabbitmq-env.conf".



It is not necessary to specify the ".config" extension of the file; RabbitMQ appends it automatically.



And in "/etc/rabbitmq/rabbitmq.config", I just kept:




[
  {rabbitmq_management, [
    {listener, [{port, 15672}, {ip, "127.0.0.1"}]}
  ]}
].


The {kernel, [{inet_dist_use_interface,{127.0.0.1}}]} entry was creating some conflict; I took it out with no further investigation. (For what it's worth, that option expects an integer tuple, {inet_dist_use_interface, {127,0,0,1}}, so the dotted form shown above may be what caused the conflict.)
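To confirm the fix took effect, a quick check (assuming the default management port 15672):

# the management listener should now be bound to loopback only
sudo lsof -iTCP -sTCP:LISTEN -P -n | grep 15672
# expected: beam ... TCP 127.0.0.1:15672 (LISTEN)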


linux - Can't login to remote MariaDB server with phpMyAdmin, but works in shell



I recently configured two servers: the first runs Apache and phpMyAdmin; the other has a correctly configured MariaDB server.



phpMyAdmin is reading the config file, but I can't connect to the MariaDB server, and PMA is throwing:



#2002 Cannot log in to the MySQL server



[screenshot: phpMyAdmin error]



Using the mysql command to connect from the web server to the database server, with the same user/password, I can successfully connect to the database server.



There are no errors in MySQL, all the MySQL ports are open in the firewall, and there are no PHP errors. I haven't had any luck finding the problem.



Edit:



Accessing the server via shell




[root@pw000i rafael]# mysql -h [IP ADDRESS OF THE REMOTE SERVER]  -u rafael -p
Enter password:
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 4
Server version: 5.5.36-MariaDB-log MariaDB Server

Copyright (c) 2000, 2014, Oracle, Monty Program Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.


MariaDB [(none)]>


config.inc.php from PhpMyAdmin



/**
 * phpMyAdmin configuration file, you can use it as base for the manual
 * configuration. For easier setup you can use "setup/".
 *
 * All directives are explained in Documentation.html and on phpMyAdmin
 * wiki.
 */

/*
 * This is needed for cookie based authentication to encrypt password in
 * cookie
 */
$cfg['blowfish_secret'] = 'MY SECRET PASSPHRASE IS HIDDEN'; /* YOU MUST FILL IN THIS FOR COOKIE AUTH! */

/**
 * Server(s) configuration
 */
$i = 0;

// The $cfg['Servers'] array starts with $cfg['Servers'][1]. Do not use
// $cfg['Servers'][0]. You can disable a server config entry by setting host
// to ''. If you want more than one server, just copy following section
// (including $i incrementation) several times. There is no need to define
// full server array, just define values you need to change.
$i++;
$cfg['Servers'][$i]['host'] = '10.XX.X.XXX';       // MySQL hostname or IP address
$cfg['Servers'][$i]['port'] = '';                  // MySQL port - leave blank for default port
$cfg['Servers'][$i]['socket'] = '';                // Path to the socket - leave blank for default socket
$cfg['Servers'][$i]['connect_type'] = 'tcp';       // How to connect to MySQL server ('tcp' or 'socket')
$cfg['Servers'][$i]['extension'] = 'mysqli';       // The php MySQL extension to use ('mysql' or 'mysqli')
$cfg['Servers'][$i]['compress'] = FALSE;           // Use compressed protocol for the MySQL connection
                                                   // (requires PHP >= 4.3.0)
$cfg['Servers'][$i]['controluser'] = '';           // MySQL control user settings
                                                   // (this user must have read-only
$cfg['Servers'][$i]['controlpass'] = '';           // access to the "mysql/user"
                                                   // and "mysql/db" tables).
                                                   // The controluser is also
                                                   // used for all relational
                                                   // features (pmadb)
$cfg['Servers'][$i]['auth_type'] = 'cookie';       // Authentication method (config, http or cookie based)?
$cfg['Servers'][$i]['user'] = 'rafael';            // MySQL user
$cfg['Servers'][$i]['password'] = '';              // MySQL password (only needed
                                                   // with 'config' auth_type)
$cfg['Servers'][$i]['only_db'] = '';               // If set to a db-name, only
                                                   // this db is displayed in left frame
                                                   // It may also be an array of db-names, where sorting order is relevant.
$cfg['Servers'][$i]['hide_db'] = '';               // Database name to be hidden from listings
$cfg['Servers'][$i]['verbose'] = 'new-db-mariaDB'; // Verbose name for this host - leave blank to show the hostname
$cfg['Servers'][$i]['pmadb'] = '';                 // Database used for Relation, Bookmark and PDF Features
                                                   // (see scripts/create_tables.sql)
                                                   //   - leave blank for no support
                                                   //     DEFAULT: 'phpmyadmin'
$cfg['Servers'][$i]['bookmarktable'] = '';         // Bookmark table
                                                   //   - leave blank for no bookmark support
                                                   //     DEFAULT: 'pma_bookmark'
$cfg['Servers'][$i]['relation'] = '';              // table to describe the relation between links (see doc)
                                                   //   - leave blank for no relation-links support
                                                   //     DEFAULT: 'pma_relation'
$cfg['Servers'][$i]['table_info'] = '';            // table to describe the display fields
                                                   //   - leave blank for no display fields support
                                                   //     DEFAULT: 'pma_table_info'
$cfg['Servers'][$i]['table_coords'] = '';          // table to describe the tables position for the PDF schema
                                                   //   - leave blank for no PDF schema support
                                                   //     DEFAULT: 'pma_table_coords'
$cfg['Servers'][$i]['pdf_pages'] = '';             // table to describe pages of relationpdf
                                                   //   - leave blank if you don't want to use this
                                                   //     DEFAULT: 'pma_pdf_pages'
$cfg['Servers'][$i]['column_info'] = '';           // table to store column information
                                                   //   - leave blank for no column comments/mime types
                                                   //     DEFAULT: 'pma_column_info'
$cfg['Servers'][$i]['history'] = '';               // table to store SQL history
                                                   //   - leave blank for no SQL query history
                                                   //     DEFAULT: 'pma_history'
$cfg['Servers'][$i]['verbose_check'] = TRUE;       // set to FALSE if you know that your pma_* tables
                                                   // are up to date. This prevents compatibility
                                                   // checks and thereby increases performance.
$cfg['Servers'][$i]['AllowRoot'] = TRUE;           // whether to allow root login
$cfg['Servers'][$i]['AllowDeny']['order']          // Host authentication order, leave blank to not use
    = '';
$cfg['Servers'][$i]['AllowDeny']['rules']          // Host authentication rules, leave blank for defaults
    = array();
$cfg['Servers'][$i]['AllowNoPassword']             // Allow logins without a password. Do not change the FALSE
    = FALSE;                                       // default unless you're running a passwordless MySQL server
$cfg['Servers'][$i]['designer_coords']             // Leave blank (default) for no Designer support, otherwise
    = '';                                          // set to suggested 'pma_designer_coords' if really needed
$cfg['Servers'][$i]['bs_garbage_threshold']        // Blobstreaming: Recommended default value from upstream
    = 50;                                          //   DEFAULT: '50'
$cfg['Servers'][$i]['bs_repository_threshold']     // Blobstreaming: Recommended default value from upstream
    = '32M';                                       //   DEFAULT: '32M'
$cfg['Servers'][$i]['bs_temp_blob_timeout']        // Blobstreaming: Recommended default value from upstream
    = 600;                                         //   DEFAULT: '600'
$cfg['Servers'][$i]['bs_temp_log_threshold']       // Blobstreaming: Recommended default value from upstream
    = '32M';                                       //   DEFAULT: '32M'
/*
 * End of servers configuration
 */

/*
 * Directories for saving/loading files from server
 */
$cfg['UploadDir'] = '/var/lib/phpMyAdmin/upload';
$cfg['SaveDir']   = '/var/lib/phpMyAdmin/save';

/*
 * Disable the default warning that is displayed on the DB Details Structure
 * page if any of the required Tables for the relation features is not found
 */
$cfg['PmaNoRelation_DisableWarning'] = TRUE;
?>


I'm using Fedora 20 on both servers. Any ideas?



Answer



Solved: the culprit was SELinux.



[root@pw000i rafael]# getsebool -a | grep httpd
httpd_anon_write --> off
httpd_builtin_scripting --> on
httpd_can_check_spam --> off
httpd_can_connect_ftp --> off
httpd_can_connect_ldap --> off
httpd_can_connect_mythtv --> off

httpd_can_connect_zabbix --> off
httpd_can_network_connect --> off
httpd_can_network_connect_cobbler --> off
httpd_can_network_connect_db --> off      <----- THIS SETTING IS THE CULPRIT
httpd_can_network_memcache --> off
httpd_can_network_relay --> off
httpd_can_sendmail --> off
httpd_dbus_avahi --> off
httpd_dontaudit_search_dirs --> off
httpd_enable_cgi --> on

httpd_enable_ftp_server --> off
httpd_enable_homedirs --> off
httpd_execmem --> off
httpd_graceful_shutdown --> on
httpd_manage_ipa --> off
httpd_mod_auth_ntlm_winbind --> off
httpd_mod_auth_pam --> off
httpd_read_user_content --> off
httpd_run_stickshift --> off
httpd_serve_cobbler_files --> off

httpd_setrlimit --> off
httpd_ssi_exec --> off
httpd_sys_script_anon_write --> off
httpd_tmp_exec --> off
httpd_tty_comm --> off
httpd_unified --> off
httpd_use_cifs --> off
httpd_use_fusefs --> off
httpd_use_gpg --> off
httpd_use_nfs --> off

httpd_use_openstack --> off
httpd_use_sasl --> off
httpd_verify_dns --> off


The solution is simple:



[root@pw000i rafael]# setsebool -P httpd_can_network_connect_db on
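To verify the change (the -P flag makes it persistent across reboots):

[root@pw000i rafael]# getsebool httpd_can_network_connect_db
httpd_can_network_connect_db --> on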

proxy - HaProxy for game servers, what's wrong with my configuration?

I host a website with a game client that uses TCP for connections. I am currently looking for a way to proxy connections from one server to my origin server. I have tested iptables forwarding and found that, while it works, HAProxy seems to perform better. The problem I'm facing is that HAProxy randomly disconnects users after their connection has been open for a random amount of time, often less than 2 minutes. I'm quite new to Linux, and of course a novice with HAProxy.




Here is my configuration; the origin IP address has been removed for obvious reasons:



global
    daemon
    maxconn 1000

defaults
    mode tcp
    timeout connect 5000ms
    timeout client 5000ms
    timeout server 5000ms

frontend proxy-in
    mode tcp
    bind *:1233
    default_backend proxy-out

backend proxy-out
    mode tcp
    server s1 127.0.0.1:1232

listen admin
    bind *:7772
    stats enable

Thank you. More details can be provided if needed.

Sunday, December 18, 2016

windows - Constant unexplained disk activity



The screenshot below is from a Windows Server machine that has been acting very strange lately. Every few minutes the HDDs go spinning like crazy for no reason I can explain.





  • RAM Checked, OK

  • HDD surface check, OK

  • HDD SMART Monitor, No Errors

  • Disk defragmented on schedule, no errors



No new hardware or software has been installed except Sophos AV, which I actually suspect is the cause of everything.




I have also checked for hidden Bitcoin-mining processes, which are usually fired up when the PC is idle, but I found none.



What might be the cause of this problem?



[screenshot: Resource Manager]


Answer



The hard drive activity shown in your Resource Monitor screenshot doesn't inherently indicate anything out of the ordinary. Your hard drive is operating at nearly 100% capacity, as indicated by the blue line in the Disk graph. That could be perfectly normal for your computer, especially if you're running a mechanical hard drive, as they are frequently the first of the four primary system resources (CPU, RAM, I/O, network) to become bottlenecked.



But without knowing your system, that's just a guess.




The best way to know whether you're experiencing unusual system activity (and to troubleshoot it) is to have a performance baseline. This is simply a record of system resource use created during known-"normal" system operation. It can be as simple as keeping Resource Monitor open while using your system normally to get a "feel" for the graphs and other data during normal activities, or you can take a more professional approach and use something like Windows Performance Monitor to make a detailed record of exact counter values (the excellent TechNet blog post How to create a “black box” performance counter data collector is a good place to start).
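As a rough sketch of such a collector from an elevated command prompt (the counter list, 15-second interval, and 500 MB cap below are illustrative, not prescriptive):

:: circular binary log of basic CPU/disk/memory counters
logman create counter BlackBox -f bincirc -max 500 -si 00:00:15 ^
  -c "\Processor(_Total)\% Processor Time" "\PhysicalDisk(*)\Disk Bytes/sec" "\Memory\Available MBytes"
logman start BlackBox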



With a good idea in hand of what is normal for your system, you can more effectively troubleshoot suspected "unusual" behavior. In your case you suspect your antivirus, so you might record some system activity with the software installed, then remove it and look at the reported activity again. If the antivirus software is to blame, it should be clear from a comparison of the monitored activity.


linux - Does SendMail Support Outbound TLS Encryption WITHOUT additions to the sendmail.mc file?



CentOS 5.x



Does SendMail support opportunistic TLS "out of the box"?



I'm used to having to explicitly add the following blurb to /etc/mail/sendmail.mc



define(`confAUTH_MECHANISMS', `LOGIN PLAIN')dnl
define(`confCACERT_PATH', `/etc/pki/tls/certs')dnl
define(`confCACERT', `/etc/pki/tls/certs/intermediates.crt')dnl
define(`confSERVER_CERT', `/etc/pki/tls/certs/tls-cert-public.pem')dnl
define(`confSERVER_KEY', `/etc/pki/tls/certs/tls-cert-private.key')dnl
define(`confCLIENT_CERT', `/etc/pki/tls/certs/tls-cert-public.pem')dnl
define(`confCLIENT_KEY', `/etc/pki/tls/certs/tls-cert-private.key')dnl


However, it SEEMS like outbound TLS is working without this being there. I notice the following in the delivery logs:




Mar  4 04:36:08 bob sendmail[23831]: q29Ja84u011122: from=, size=3262, class=0, nrcpts=1, msgid=, proto=ESMTP, daemon=MTA, relay=exch.foo.com [192.168.0.1]
Mar 4 04:36:08 bob sendmail[23834]: STARTTLS=client, relay=mx.remotefoo.com., version=TLSv1/SSLv3, verify=FAIL, cipher=AES128-SHA, bits=128/128
Mar 4 04:36:08 bob sendmail[23834]: q29Ja84u011122: to=, delay=00:00:00, xdelay=00:00:00, mailer=esmtp, pri=123262, relay=mx.remotefoo.com. [12.13.14.15], dsn=4.0.0, stat=Deferred: Connection reset by mx.remotefoo.com.


IP addresses, email addresses, and hostnames have been renamed to protect the innocent. =) The middle line is what confuses me: I would expect to see it only if sendmail actually uses TLS.



Is this possible? If so, where are the public/private keys that are used for this?



UPDATE




I'm revisiting this issue because I'm still curious. Here's the full sendmail.mc (IPs changed to protect the innocent):



divert(-1)
#
# DO NOT EDIT THIS FILE. It is managed by the appliance node manager
# or create_smtp_profile script. Any changes you make may be
# overwritten.
#
divert(0)

dnl #
dnl # This is the sendmail macro config file for m4. If you make changes to
dnl # /etc/mail/sendmail.mc, you will need to regenerate the
dnl # /etc/mail/sendmail.cf file by confirming that the sendmail-cf package is
dnl # installed and then performing a
dnl #
dnl # make -C /etc/mail
dnl #
include(`/usr/share/sendmail-cf/m4/cf.m4')dnl
VERSIONID(`setup for linux-gnu')dnl

OSTYPE(`linux-gnu')dnl
dnl #
dnl # Disable DNS lookups
FEATURE(`nocanonify')dnl
define(`confBIND_OPTS',`-DNSRCH -DEFNAMES')dnl
dnl #
dnl # default logging level is 9, you might want to set it higher to
dnl # debug the configuration
dnl #
dnl define(`confLOG_LEVEL', `9')dnl

define(`confLOG_LEVEL', `9')dnl

dnl #
dnl # Uncomment and edit the following line if your outgoing mail needs to
dnl # be sent out through an external mail server:
dnl #

dnl #
dnl # Uncomment and edit the following line if your incoming mail needs to
dnl # be sent to an internal mail server:

dnl #
dnl define(`MAIL_HUB',`smtp.your.provider')dnl
dnl FEATURE(`stickyhost')dnl
dnl #
define(`confDOMAIN_NAME', `subdomain.support.foo.com')dnl
define(`confDEF_USER_ID',``8:12'')dnl
dnl define(`confAUTO_REBUILD')dnl
define(`confTO_CONNECT', `1m')dnl
define(`confTRY_NULL_MX_LIST',true)dnl
define(`confDONT_PROBE_INTERFACES',true)dnl

define(`PROCMAIL_MAILER_PATH',`/usr/bin/procmail')dnl
define(`ALIAS_FILE', `/etc/aliases')dnl
define(`STATUS_FILE', `/var/log/mail/statistics')dnl
define(`UUCP_MAILER_MAX', `2000000')dnl
define(`confUSERDB_SPEC', `/etc/mail/userdb.db')dnl
define(`confPRIVACY_FLAGS', `authwarnings,novrfy,noexpn,restrictqrun')dnl
define(`confAUTH_OPTIONS', `A')dnl
dnl #
dnl # The following allows relaying if the user authenticates, and disallows
dnl # plaintext authentication (PLAIN/LOGIN) on non-TLS links

dnl #
dnl define(`confAUTH_OPTIONS', `A p')dnl
dnl #
dnl # PLAIN is the preferred plaintext authentication method and used by
dnl # Mozilla Mail and Evolution, though Outlook Express and other MUAs do
dnl # use LOGIN. Other mechanisms should be used if the connection is not
dnl # guaranteed secure.
dnl # Please remember that saslauthd needs to be running for AUTH.
dnl #
dnl TRUST_AUTH_MECH(`EXTERNAL DIGEST-MD5 CRAM-MD5 LOGIN PLAIN')dnl

dnl define(`confAUTH_MECHANISMS', `EXTERNAL GSSAPI DIGEST-MD5 CRAM-MD5 LOGIN PLAIN')dnl
dnl #
dnl # Rudimentary information on creating certificates for sendmail TLS:
dnl # cd /usr/share/ssl/certs; make sendmail.pem
dnl # Complete usage:
dnl # make -C /usr/share/ssl/certs usage
dnl #
dnl define(`confCACERT_PATH',`/usr/share/ssl/certs')
dnl define(`confCACERT',`/usr/share/ssl/certs/ca-bundle.crt')
dnl define(`confSERVER_CERT',`/usr/share/ssl/certs/sendmail.pem')

dnl define(`confSERVER_KEY',`/usr/share/ssl/certs/sendmail.pem')
dnl #
dnl # This allows sendmail to use a keyfile that is shared with OpenLDAP's
dnl # slapd, which requires the file to be readble by group ldap
dnl #
dnl define(`confDONT_BLAME_SENDMAIL',`groupreadablekeyfile')dnl
dnl #
dnl define(`confTO_QUEUEWARN', `4h')dnl
dnl define(`confTO_QUEUERETURN', `5d')dnl
define(`confQUEUE_LA', `50')dnl

define(`confREFUSE_LA', `50')dnl
define(`confTO_IDENT', `0')dnl
dnl FEATURE(delay_checks)dnl
FEATURE(`no_default_msa',`dnl')dnl
FEATURE(`smrsh',`/usr/sbin/smrsh')dnl
FEATURE(`mailertable',`hash -o /etc/mail/mailertable.db')dnl
FEATURE(`virtusertable',`hash -o /etc/mail/virtusertable.db')dnl
FEATURE(redirect)dnl
FEATURE(always_add_domain)dnl
FEATURE(use_cw_file)dnl

FEATURE(use_ct_file)dnl
dnl #
dnl # The following limits the number of processes sendmail can fork to accept
dnl # incoming messages or process its message queues to 12.) sendmail refuses
dnl # to accept connections once it has reached its quota of child processes.
dnl #
dnl define(`confMAX_DAEMON_CHILDREN', 12)dnl
dnl #
dnl # Limits the number of new connections per second. This caps the overhead
dnl # incurred due to forking new sendmail processes. May be useful against

dnl # DoS attacks or barrages of spam. (As mentioned below, a per-IP address
dnl # limit would be useful but is not available as an option at this writing.)
dnl #
dnl define(`confCONNECTION_RATE_THROTTLE', 3)dnl
dnl #
dnl # The -t option will retry delivery if e.g. the user runs over his quota.
dnl #
FEATURE(local_procmail,`',`procmail -t -Y -a $h -d $u')dnl
FEATURE(`access_db',`hash -T -o /etc/mail/access.db')dnl
FEATURE(`blacklist_recipients')dnl

define(`confDOUBLE_BOUNCE_ADDRESS', `')dnl
EXPOSED_USER(`root')dnl
dnl #
dnl # The following causes sendmail to only listen on the IPv4 loopback address
dnl # 127.0.0.1 and not on any other network devices. Remove the loopback
dnl # address restriction to accept email from the internet or intranet.
dnl #
DAEMON_OPTIONS(`Port=smtp,Addr=127.0.0.1, Name=MTA, InputMailFilters=')dnl
DAEMON_OPTIONS(`Port=smtp, Addr=192.168.1.1,Name=MTA,Modifiers=b,InputMailFilters=')dnl
CLIENT_OPTIONS(`Family=inet, Addr=192.168.1.1')dnl


dnl #
dnl # The following causes sendmail to additionally listen to port 587 for
dnl # mail from MUAs that authenticate. Roaming users who can't reach their
dnl # preferred sendmail daemon due to port 25 being blocked or redirected find
dnl # this useful.
dnl #
dnl DAEMON_OPTIONS(`Port=submission, Name=MSA, M=Ea')dnl
dnl #
dnl # The following causes sendmail to additionally listen to port 465, but

dnl # starting immediately in TLS mode upon connecting. Port 25 or 587 followed
dnl # by STARTTLS is preferred, but roaming clients using Outlook Express can't
dnl # do STARTTLS on ports other than 25. Mozilla Mail can ONLY use STARTTLS
dnl # and doesn't support the deprecated smtps; Evolution <1.1.1 uses smtps
dnl # when SSL is enabled-- STARTTLS support is available in version 1.1.1.
dnl #
dnl # For this to work your OpenSSL certificates must be configured.
dnl #
dnl DAEMON_OPTIONS(`Port=smtps, Name=TLSMTA, M=s')dnl
dnl #

dnl # The following causes sendmail to additionally listen on the IPv6 loopback
dnl # device. Remove the loopback address restriction listen to the network.
dnl #
dnl DAEMON_OPTIONS(`port=smtp,Addr=::1, Name=MTA-v6, Family=inet6')dnl
dnl #
dnl # enable both ipv6 and ipv4 in sendmail:
dnl #
dnl DAEMON_OPTIONS(`Name=MTA-v4, Family=inet, Name=MTA-v6, Family=inet6')
dnl #
dnl # We strongly recommend not accepting unresolvable domains if you want to

dnl # protect yourself from spam. However, the laptop and users on computers
dnl # that do not have 24x7 DNS do need this.
dnl #
FEATURE(`accept_unresolvable_domains')dnl
dnl #
dnl FEATURE(`relay_based_on_MX')dnl
dnl #
dnl # Also accept email sent to "localhost.localdomain" as local email.
dnl #
LOCAL_DOMAIN(`localhost.localdomain')dnl

dnl #
dnl # The following example makes mail from this host and any additional
dnl # specified domains appear to be sent from mydomain.com
dnl #
dnl MASQUERADE_AS(`mydomain.com')dnl
dnl #
dnl # masquerade not just the headers, but the envelope as well
dnl #
dnl FEATURE(masquerade_envelope)dnl
dnl #

dnl # masquerade not just @mydomainalias.com, but @*.mydomainalias.com as well
dnl #
dnl FEATURE(masquerade_entire_domain)dnl
dnl #
dnl MASQUERADE_DOMAIN(localhost)dnl
dnl MASQUERADE_DOMAIN(localhost.localdomain)dnl
dnl MASQUERADE_DOMAIN(mydomainalias.com)dnl
dnl MASQUERADE_DOMAIN(mydomain.lan)dnl




MAILER(smtp)dnl
MAILER(procmail)dnl


I also collected a packet capture and confirmed that the server is indeed initiating TLS connections to the external party.



UPDATE #2



I cranked logging all the way up (to level 99) and sent a test message to a Gmail account. I noticed some interesting details:




Jun  6 12:55:10 foobox sendmail[1663]: r56Jsk7N001660: SMTP outgoing connect on foobox.foo.com
Jun 6 12:55:10 foobox sendmail[1663]: STARTTLS: ClientCertFile missing
Jun 6 12:55:10 foobox sendmail[1663]: STARTTLS: ClientKeyFile missing
Jun 6 12:55:10 foobox sendmail[1663]: STARTTLS: CACertPath missing
Jun 6 12:55:10 foobox sendmail[1663]: STARTTLS: CACertFile missing
Jun 6 12:55:10 foobox sendmail[1663]: STARTTLS: CRLFile missing
Jun 6 12:55:10 foobox sendmail[1663]: STARTTLS=client, init=1
Jun 6 12:55:10 foobox sendmail[1663]: STARTTLS=client, start=ok
Jun 6 12:55:10 foobox sendmail[1663]: STARTTLS=client, info: fds=10/9, err=2

Jun 6 12:55:10 foobox sendmail[1663]: STARTTLS=client, info: fds=10/9, err=2
Jun 6 12:55:10 foobox sendmail[1663]: STARTTLS=client, get_verify: 20 get_peer: 0x8907258
Jun 6 12:55:10 foobox sendmail[1663]: STARTTLS=client, relay=gmail-smtp-in.l.google.com., version=TLSv1/SSLv3, verify=FAIL, cipher=RC4-SHA, bits=128/128
Jun 6 12:55:10 foobox sendmail[1663]: STARTTLS=client, cert-subject=/C=US/ST=California/L=Mountain+20View/O=Google+20Inc/CN=mx.google.com, cert-issuer=/C=US/O=Google+20Inc/CN=Google+20Internet+20Authority, verifymsg=unable to get local issuer certificate
Jun 6 12:55:10 foobox sendmail[1663]: STARTTLS=read, info: fds=10/9, err=2
Jun 6 12:55:10 foobox last message repeated 3 times
Jun 6 12:55:10 foobox sendmail[1663]: STARTTLS=write, info: fds=10/9, err=3
Jun 6 12:55:10 foobox last message repeated 3 times
Jun 6 12:55:10 foobox sendmail[1663]: STARTTLS=read, info: fds=10/9, err=2
Jun 6 12:55:10 foobox sendmail[1663]: r56Jsk7N001660: to=, delay=00:00:07, xdelay=00:00:00, mailer=esmtp, pri=120015, relay=gmail-smtp-in.l.google.com. [74.125.129.27], dsn=2.0.0, stat=Sent (OK 198738510 s9si492345031pan.259 - gsmtp)


Answer



I can confirm that I am seeing the same thing independently: a sendmail installation with no certificates configured is still taking advantage of TLS when sending to a server that advertises support for the protocol.



To see what's going on, I ran a packet capture with tcpdump -n -n -w /tmp/pax.dump port 25 and host 178.18.123.145 on the sending server, then fed that packet dump into Wireshark, which told me this:



[screenshot: Wireshark analysis of the SMTP TLS conversation]



Note how the highlighted packet (no. 17) contains certificate information, as did the packet four prior (no. 13). Packet 13 is the server's certificate, with its chain of trust, and has a "Certificates length" of 2327 bytes. This one is the client's certificate, and it has a length of zero bytes (the highlighted line in the packet breakdown window). So I think there's pretty good evidence to suggest that sendmail generates a random keypair for client purposes and presents it with a zero-length certificate.




If you find this behaviour annoying, as I did, you can turn it off for communications to all hosts by putting



Try_TLS:                NO


in /etc/mail/access and regenerating access.db.
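A minimal sketch of that change on a stock CentOS layout (makemap appends the .db extension itself):

# append the rule, rebuild the hashed access map, restart sendmail
echo 'Try_TLS:                NO' >> /etc/mail/access
makemap hash /etc/mail/access < /etc/mail/access
service sendmail restart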


linux - How to SSH to ec2 instance in VPC private subnet via NAT server

I have created a VPC in aws with a public subnet and a private subnet. The private subnet does not have direct access to external network. S...