Thursday, March 31, 2016

domain - What happens when multiple MX records in a DNS name have different TTLs?

I am wondering what happens when a domain name in DNS has multiple MX records with different Times to Live (TTLs)?



For example, what if these are the MX records for example.com?




  • TTL = 3 days, priority = 1, result = mx1-slow.example.com

  • TTL = 60 seconds, priority = 1, result = mx1-fast.example.com

  • TTL = 1 day, priority = 2, result = mx2.example.com

  • TTL = 1 hour, priority = 3, result = mx3-hour.example.com

  • TTL = 60 seconds, priority = 3, result = mx3-fast.example.com

  • TTL = 2 days, priority = 3, result = mx3-slow.example.com



What happens if a mail transfer agent is sending a message to this domain -- where some of the MX servers might be working and some might not -- and has the results cached for 2 minutes? 2 hours? 1.5 days? 2.5 days? Does it need to go by the smallest TTL across all the MX records (60 seconds in this case) and do a re-lookup of all the MX records if that much time has passed, ignoring the longer TTLs on the remaining MX records? Or does the cache actually take into account all of the different TTLs somehow? If all of the TTLs are taken into account, can you please provide some example scenarios of how this might work?
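One way to observe the caching behavior empirically is to ask a caching resolver twice and compare the TTL column; a small sketch with dig (using a public resolver, and the hypothetical records above):

# the second field of each answer line is the remaining TTL in the
# resolver's cache; run it twice and watch each record count down
dig +noall +answer MX example.com @8.8.8.8
sleep 30
dig +noall +answer MX example.com @8.8.8.8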

switch - gigabit ethernet to single mode LC fiber




I need a fiber converter that takes single-mode LC fiber from the patch panel to a gigabit port on the switch. I've been investigating the Canary GFT-1036, but am having a difficult time finding a vendor my university can purchase from. Are there any alternative products that anyone has used, or is using, that could offer some ideas?


Answer



Copper to LC single-mode is a pretty common combination. The easiest approach, particularly if you're limited in vendors, would be to get a switch with SFP ports and a suitable single-mode SFP module. (There are applications where you want a more transparent solution; adding more switches can sometimes be a problem. In that case you'll want a media converter -- I'd recommend one that takes SFPs, as you get more flexibility to swap or replace optics.)


bash - Python script crashes when run via Cron

I have a python script that completes exactly as expected when run manually as the root user. When I put the script into cron.daily it crashes 100% of the time.



The error is a timeout error, but this question is not about troubleshooting the error directly.



Executing this works:



$ /etc/cron.daily/myscript



But it crashes when run automatically via cron.



The Question: What is different about a root bash shell and the environment /etc/cron.daily executes in?
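A minimal way to compare the two environments is to capture both and diff them (file paths here are arbitrary):

# temporary crontab entry that captures the environment cron provides
* * * * * env > /tmp/cron-env.txt 2>&1

# then, from the interactive root shell
env | sort > /tmp/shell-env.txt
sort /tmp/cron-env.txt | diff - /tmp/shell-env.txt

Typically PATH is much shorter under cron (often just /usr/bin:/bin), HOME and SHELL can differ, and no login profiles are sourced.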

HP Proliant DL380 G7 compatible with Kingston SKC1000H/960G PCIe SSD NVME?

Hi to all the experts out there.




Can anyone confirm if they've successfully installed and operated the Kingston SKC1000H/960G PCIe SSD NVME card in an HP Proliant DL380 G7 Server? If so, is it just plug-and-play? Does the extra capacity just become available as a new volume?



We are out of drive bays, but we're looking for a performance upgrade for our SQL service and have had excellent experience with SSD drives in the past; however, we have never used a PCIe solution before. (Suggestions for alternatives to the Kingston device would be welcome too, if anyone has such experience.)



Thanks in advance.

zfs - Upgrade Nexenta to Solaris Express?



I've had a NAS built on Nexenta that has worked great for a while. However, I'm looking to upgrade the NAS to run Solaris 11 Express, as it's better supported by my company's IT department and has direct support from Oracle.



I've currently got two zpools, syspool containing the OS and 'tank' containing all the data. What would be the easiest way to run the upgrade to the latest Solaris 11 Express? Can I just run the installer and have it detect and import the existing 'tank' zpool when it first boots?


Answer



I don't think you can upgrade directly from Nexenta to Solaris 11 Express, since they have different userlands.



Reinstalling with Solaris 11 Express and importing the pool should work. I don't think Nexenta has made any incompatible changes at this point.




Beware that upgrading your pool after Solaris 11 Express is installed will not allow you to go back to Nexenta (since it doesn't support newer ZFS pool versions).
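If it helps, the pool moves would look roughly like this (a sketch, assuming the data pool is named 'tank' as above):

# on Nexenta, before reinstalling the OS
zpool export tank

# after the fresh Solaris 11 Express install
zpool import tank

# only once you are certain you will never go back to Nexenta
# zpool upgrade tank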


smtp email forwards with Gmail and Wordpress

I'm looking for help. I'm trying to add a new email account to Gmail. I've done it before for one of my websites' email domains and it worked fine. Now, however, I get the error message shown below, and my marketing team member could not set it up either.



[Screenshot: error message]

Wednesday, March 30, 2016

Synchronize SQL with information from Active Directory

I would like to pass information from Active Directory to a Microsoft SQL Server database any time a change is made in Active Directory. That way SQL will always have a reliable copy of the AD data.
I have been reading some posts that deal with similar questions, but I can't find a solution.



In this post, for example: https://stackoverflow.com/questions/4782292/synchronization-between-c-app-and-active-directory, a user said:





'...Then let AD synchronize with SQL.'




That sounds good to me, but how can I do that?



@Pablo: So we have an existing C# application that manages users and groups in SQL. We would like this app to manage Active Directory as well. The idea is to query Active Directory (AD) directly with the application but also continue saving the information in SQL as it does now. But I see a problem: when changes are made only in Active Directory, SQL will not have up-to-date information, and AD and SQL will hold different data.



I see you propose to query AD for changes, either per access or by polling on a scheduled interval. I'm not familiar with that; could you explain how to do it?
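For what it's worth, one common polling technique is to track AD's uSNChanged attribute; a rough sketch with ldapsearch (server, credentials and the saved USN are placeholders):

# fetch entries changed since the last USN we processed (here 12345);
# afterwards, persist the highest uSNChanged seen and use it next run
ldapsearch -x -H ldap://dc01.example.com \
  -D "sync-user@example.com" -w 'secret' \
  -b "dc=example,dc=com" \
  "(uSNChanged>=12345)" cn sAMAccountName uSNChanged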

Tuesday, March 29, 2016

mysql - InnoDB: Error: log file ./ib_logfile0 is of different size



I just added the following lines in /etc/mysql/my.cnf after I converted one database to use InnoDB engine.



innodb_buffer_pool_size = 2560M
innodb_log_file_size = 256M
innodb_log_buffer_size = 8M
innodb_flush_log_at_trx_commit = 2
innodb_thread_concurrency = 16

innodb_flush_method = O_DIRECT


But restarting mysqld then raised "ERROR 2013 (HY000) at line 2: Lost connection to MySQL server during query", and the MySQL error log shows the following:



InnoDB: Error: log file ./ib_logfile0 is of different size 0 5242880 bytes
InnoDB: than specified in the .cnf file 0 268435456 bytes!
100118 20:52:52 [ERROR] Plugin 'InnoDB' init function returned error.
100118 20:52:52 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
100118 20:52:52 [ERROR] Unknown/unsupported table type: InnoDB

100118 20:52:52 [ERROR] Aborting


So I commented out this line



# innodb_log_file_size  = 256M


After that, MySQL restarted successfully.




I wonder what the "5242880 bytes" log file mentioned in the MySQL error is. This is the first database on the InnoDB engine on this server, so when and where was that log file created? And in this case, how can I enable the innodb_log_file_size directive in my.cnf?



EDIT



I tried to delete /var/lib/mysql/ib_logfile0 and restart mysqld, but it still failed. It now shows the following in the error log:



100118 21:27:11  InnoDB: Log file ./ib_logfile0 did not exist: new to be created
InnoDB: Setting log file ./ib_logfile0 size to 256 MB
InnoDB: Database physically writes the file full: wait...
InnoDB: Progress in MB: 100 200

InnoDB: Error: log file ./ib_logfile1 is of different size 0 5242880 bytes
InnoDB: than specified in the .cnf file 0 268435456 bytes!


Resolution



It works now, after deleting both ib_logfile0 and ib_logfile1 in /var/lib/mysql.


Answer



InnoDB is insanely picky about its config; if something's not right, it'll just give up and go home. To make a change to the log file size without data loss (a consolidated sketch of the commands follows the list):





  1. Revert any config changes you've made to the log file size and start MySQL again.

  2. In your running MySQL: SET GLOBAL innodb_fast_shutdown=0;

  3. Stop MySQL

  4. Make the configuration change to the log file size.

  5. Delete both log files.

  6. Start MySQL. It will complain about the lack of log files, but it'll create them and all will be well.
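Condensed into commands, the procedure looks roughly like this (Debian-style paths, matching the question):

mysql -e "SET GLOBAL innodb_fast_shutdown = 0;"
service mysql stop
# raise innodb_log_file_size in /etc/mysql/my.cnf, then:
rm /var/lib/mysql/ib_logfile0 /var/lib/mysql/ib_logfile1
service mysql start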


Monday, March 28, 2016

scripting - Manage Multiple IIS servers without shared configuration



We currently have 2 IIS 8.5 web servers in DEV.




We will push the servers to production to replace our current production servers



My question is: is there a way to build scripts on the fly to apply configuration updates to production after they have been tested in DEV?



We cannot use shared configurations because not all sites are the same, but for the ones that are, we would like to be able to update them through a scripted method, unless there is a better way.



I am thinking of something like SQL Server, where at the end of a wizard you have the option to generate a script.



Is there anything like that with IIS 8.5?



Answer



Configuration Editor within IIS Manager does exactly this.




Creates a script of the actions that you have recently performed. Opens the Script Dialog window that displays the script for your action in three programming languages: C#, JavaScript, and AppCmd. The generate script functionality is only enabled after you take an action, such as changing the value of a property. You must generate a script that includes your action before you click Apply.
Note that the script will not include immediate actions, such as locking a section, editing a collection, or reverting to parent.




You open Config Editor then modify the elements and attributes which define your custom configuration. Once you modify something you will see the Generate Script action become enabled.




Click this and you will see your changes as C#, JavaScript, AppCmd or PowerShell.
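As an illustration, a generated AppCmd fragment might look something like this (the section and value here are hypothetical, not taken from the screenshot below):

%windir%\system32\inetsrv\appcmd.exe set config "Default Web Site" ^
    /section:system.webServer/directoryBrowse /enabled:"True" /commit:apphost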



[Screenshot: the Generate Script dialog in IIS Manager]



You can then combine each change into one script that you could automate with Chocolatey, DSC, SaltStack, Puppet, etc.


linux - Tracking down load average



The "load average" on a *nix machine is the "average length of the run queue", or in other words, the average number of processes that are doing something (or waiting to do something). While the concept is simple enough to understand, troubleshooting the problem can be less straight-forward.



Here are the statistics from a server I worked on today that made me wonder about the best way to fix this sort of thing:




  • 1GB RAM free, 0 swap space usage

  • CPU times around 20% user, 30% wait, 50% idle (according to top)


  • About 2 to 3 processes in either "R" or "D" state at a time (tested using ps | grep)

  • Server logs free of any error messages indicating hardware problems

  • Load average around 25.0 (for all 3 averages)

  • Server visibly unresponsive for users



I eventually "fixed" the problem by restarting MySQLd... which doesn't make a lot of sense, because according to mysql's "show processlist" command, the server was theoretically idle.



What other tools/metrics should I have used to help diagnose this issue and possibly determine what was causing the server load to run so high?


Answer




It sounds like your server is IO bound - hence the processes sitting in D state.



Use iostat to see what the load is on your disks.
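For example (extended statistics, refreshed every 5 seconds; high await and %util on one device point to the bottleneck):

iostat -x 5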



If MySQL is causing lots of disk seeks then consider putting your MySQL data on a completely separate physical disk. If it's still slow and it's part of a master-slave setup, put the replication logs onto a separate disk too.



Note that a separate partition or logical disk isn't enough - head seek times are generally the limiting factor, not data transfer rates.


Sunday, March 27, 2016

domain name system - Reverse DNS in a CIDR world



Reverse DNS seems to be strongly tied to class boundaries. Now that CIDR is the standard, what methods exist to delegate authority for a subnet? If multiple methods exist, which one is best? Do you need to handle delegation differently depending on the DNS server (BIND, djbdns, Microsoft DNS, other)? Let's say I have control of a Class B network, 168.192.in-addr.arpa. Please provide examples for:




  • How to delegate authority for a /22?

  • How to delegate authority for a /25?


Answer



Delegating a /22 is easy: it's delegation of the 4 /24s it contains. Likewise a /14 is delegation of 4 /16s, etc.




RFC2317 covers the special cases with a netmask longer than /24. Basically there's no super-clean way to do delegation of in-addr.arpa zones on anything but octet boundaries, but you can work around this. Let's say I want to delegate 172.16.23.16/29, which would be the IP addresses 172.16.23.16 -> 172.16.23.23.



As the owner of the 23.16.172.in-addr.arpa zone, I might put this in my 23.16.172.rev zone file to delegate this range to my customer:



16-29 IN NS    ns1.customer.com.
16-29 IN NS    ns2.customer.com.
16    IN CNAME 16.16-29.23.16.172.in-addr.arpa.
17    IN CNAME 17.16-29.23.16.172.in-addr.arpa.
18    IN CNAME 18.16-29.23.16.172.in-addr.arpa.
19    IN CNAME 19.16-29.23.16.172.in-addr.arpa.
20    IN CNAME 20.16-29.23.16.172.in-addr.arpa.
21    IN CNAME 21.16-29.23.16.172.in-addr.arpa.
22    IN CNAME 22.16-29.23.16.172.in-addr.arpa.
23    IN CNAME 23.16-29.23.16.172.in-addr.arpa.


So, you can see that I'm defining a new zone (16-29.23.16.172.in-addr.arpa.) and delegating it to my customer's name servers. Then I'm creating CNAMEs from the IPs to be delegated to the corresponding number under the newly delegated zone.



As the customer to whom these have been delegated, I would do something like the following in named.conf:




zone "16-29.23.16.172.in-addr.arpa" { 
type master;
file "masters/16-29.23.16.172.rev";
};


And then in the .rev file, I would just make PTRs like any normal in-addr.arpa zone:



17 IN PTR office.customer.com.
18 IN PTR www.customer.com.
(etc)


That's sort of the clean way to do it, and it makes savvy customers happy because they have an in-addr.arpa zone to put the PTRs in, etc. A shorter way to do it for customers who want to control reverse DNS but don't want to set up a whole zone is to just CNAME the individual records to similar names in their main zone.



In this case, we, as the delegators, would have something like this in our 23.16.172.rev file:



16 IN CNAME 16.customer.com.
17 IN CNAME 17.customer.com.
18 IN CNAME 18.customer.com.
19 IN CNAME 19.customer.com.
20 IN CNAME 20.customer.com.
21 IN CNAME 21.customer.com.
22 IN CNAME 22.customer.com.
23 IN CNAME 23.customer.com.


So it's similar in concept to the other idea, but instead of creating a new zone and delegating it to the customer, you're CNAMEing the records to names in the customer's already-existing main zone.




The customer would have something like this in their customer.com zone file:



office IN A   172.16.23.17
17     IN PTR office.customer.com.
www    IN A   172.16.23.18
18     IN PTR www.customer.com.
(etc)


It just depends on the type of customer. A savvy customer will prefer to set up their own in-addr.arpa zone and will think it very odd to have PTRs in a domain-name zone. A non-savvy customer will want it to "just work" without having to do a ton of extra configuration.




There are likely other methods; I'm just detailing the two I'm familiar with.






I was just thinking about my statement that /22 and /14 are easy, and about why that's true while anything between /25 and /32 is hard. I haven't tested this, but I wonder if you could delegate each /32 to the customer like this:



16 IN NS ns1.customer.com.
17 IN NS ns1.customer.com.
(etc)



Then, on the customer side, you catch each /32 with its own zone:



zone "16.23.16.172.in-addr.arpa" { type master; file "masters/16.23.16.172.rev"; };
zone "17.23.16.172.in-addr.arpa" { type master; file "masters/17.23.16.172.rev"; };
(etc)


And then in the individual file you would have something like this:




@            IN PTR office.customer.com.


The obvious downside is that one file per /32 is kind of gross. But I bet it would work.



All the stuff I mentioned is pure DNS; if a DNS server doesn't let you do it, that's because it restricts the full functionality of DNS. My examples obviously use BIND, but we've done the customer side of this using both Windows DNS and BIND. I don't see a reason it wouldn't work with any server.
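If you want to sanity-check either setup, you can follow the chain with dig (using the hypothetical addresses above):

# resolve one address and watch the CNAME hop into the delegated zone
dig +trace -x 172.16.23.17

# or inspect each link directly
dig CNAME 17.23.16.172.in-addr.arpa.
dig PTR 17.16-29.23.16.172.in-addr.arpa. @ns1.customer.com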


sendmail + SASL2 Auth Error

My sendmail config is like so:



/usr/lib/sasl2/Sendmail.conf
pwcheck_method:saslauthd




We are running "saslauthd"



root 32102 1 0 81 0 - 1128 fcntl_ 11:37 ? 00:00:00 /usr/sbin/saslauthd -m /var/run/saslauthd -a pam



When I do SASL auth with account name "cccc", everything works OK. But when I auth with account name "cccc@domain.com", it fails.



/var/log/messages
saslauthd[32103]: do_auth : auth failure: [user=cccc] [service=smtp] [realm=domain.com] [mech=pam] [reason=PAM auth error]



What's the problem?
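A useful way to narrow this down is to exercise saslauthd directly and take sendmail out of the picture; a sketch (the password is a placeholder):

testsaslauthd -s smtp -u cccc -p 'secret'
testsaslauthd -s smtp -u cccc -r domain.com -p 'secret'

If the second form fails here too, the problem is in the PAM stack rather than in sendmail.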

php - FastCGI for DocumentRoot only

I'm trying to set up HHVM for one of my websites. It is running on Apache 2.4, Ubuntu 14.04. I apologize for not being the most advanced system administrator; I am more on the software engineering end of the spectrum.



I've got HHVM installed, but when I use FastCGI for the entire Apache server, it breaks ownCloud. After a bit of reading, I found that the ownCloud developers are still in the process of making it compatible with HHVM.



So I want to run just one directory with FastCGI and hence HHVM.



My website which I do want to run FastCGI for is in:

/var/www/website
A sub-directory in here is the document root as per:



DocumentRoot /var/www/website/www


(Some of the code lives above the document root for the website; classes, etc.)



ownCloud is in:
/var/www/owncloud

And has the configuration:



Alias /owncloud "/var/www/owncloud/"
<Directory "/var/www/owncloud/">
    Options +FollowSymLinks
    AllowOverride All
</Directory>



I've tried quite a few things in apache2.conf without success.




ProxyPass / fcgi://127.0.0.1:9000/var/www
ProxyPass /owncloud/ //127.0.0.1:80/owncloud





ProxyPass /var/www/website/ fcgi://127.0.0.1:9000/var/www/website






ProxyPass /website/ fcgi://127.0.0.1:9000/var/www/website






SetHandler fastcgi-script
SetHandler proxy:fcgi://127.0.0.1:9000




Any ideas?

amazon web services - How to set reverse DNS in AWS for my private nameserver?

I want to set up rDNS in AWS for my mail server. I have created glue records, so my nameservers are ns1.mydomain.com and ns2.mydomain.com.
Note: my domain registrar is AWS and my mail server will take care of DNS.




So I followed this guide, https://aws.amazon.com/premiumsupport/knowledge-center/route-53-reverse-dns/, but I am a little confused.



Suppose my IP is 50.60.70.80.



I created a new hosted zone with the name 70.60.50.in-addr.arpa



I created a record set and added a PTR record for the SMTP server as follows:
Name field: 80.70.60.50.in-addr.arpa
Value field: mail.mydomain.com




Now, in this hosted zone, I have 2 extra records.
In the NS record, I replaced the AWS nameservers with my nameservers ns1.mydomain.com and ns2.mydomain.com.
I don't know what to do with the SOA. I would be thankful for any help.
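Once the delegation is in place, dig can show what the world sees (hypothetical values from above):

# who is authoritative for the reverse zone?
dig +short NS 70.60.50.in-addr.arpa

# does the PTR resolve from your own nameserver?
dig +short -x 50.60.70.80 @ns1.mydomain.com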

osx snow leopard - sniffing vmnet8 on VMware fusion and OSX 10.6

I'm currently using VMware Fusion on Mac OS X and I have a VM running. In order to sniff the network traffic I ran the following as root:



$ /Library/Application\ Support/VMware\ Fusion/vmnet-sniffer vmnet8 -w login.pcap -e



However, a file named login.pcap is not created. Is there something obvious that I'm doing wrong?



P.S. I've tagged this as "vmware-workstation" as I could not find appropriate tags. I would love to have used "vmnet8", "vmware-fusion".

debian - Bad continuous write performance on RAID-5 when not writing huge amounts




I had some problems getting acceptable read/write performance for my RAID5 + crypt + ext4 and was finally able to track it down to the following problem:






  • Hard disks: 4x WD RED 3 TB WDC WD30EFRX-68EUZN0 as /dev/sd[efgh]

  • sde and sdf are connected via controller A using a 3 Gbit/s SATA link (even though 6 Gbit/s would have been available)

  • sdg and sdh are connected via controller B using a 6 Gbit/s SATA link






Write test 4 times for each disk (everything as I expected)



# dd if=/dev/zero of=/dev/sd[efgh] bs=2G count=1 oflag=dsync
sde: 2147479552 bytes (2.1 GB) copied, xxx s, [127, 123, 132, 127] MB/s
sdf: 2147479552 bytes (2.1 GB) copied, xxx s, [131, 130, 118, 137] MB/s
sdg: 2147479552 bytes (2.1 GB) copied, xxx s, [145, 145, 145, 144] MB/s
sdh: 2147479552 bytes (2.1 GB) copied, xxx s, [126, 132, 132, 132] MB/s



Read test using hdparm and dd (everything as I expected)



# hdparm -tT /dev/sd[efgh]
# echo 3 | tee /proc/sys/vm/drop_caches; dd of=/dev/null if=/dev/sd[efgh] bs=2G count=1 iflag=fullblock

(sde)
Timing cached reads: xxx MB in 2.00 seconds = [13983.68, 14136.87] MB/sec
Timing buffered disk reads: xxx MB in 3.00 seconds = [143.16, 143.14] MB/sec
2147483648 bytes (2.1 GB) copied, xxx s, [140, 141] MB/s


(sdf)
Timing cached reads: xxx MB in 2.00 seconds = [14025.80, 13995.14] MB/sec
Timing buffered disk reads: xxx MB in 3.00 seconds = [140.31, 140.61] MB/sec
2147483648 bytes (2.1 GB) copied, xxx s, [145, 141] MB/s

(sdg)
Timing cached reads: xxx MB in 2.00 seconds = [14005.61, 13801.93] MB/sec
Timing buffered disk reads: xxx MB in 3.00 seconds = [153.11, 151.73] MB/sec
2147483648 bytes (2.1 GB) copied, xxx s, [154, 155] MB/s


(sdh)
Timing cached reads: xxx MB in 2.00 seconds = [13816.84, 14335.93] MB/sec
Timing buffered disk reads: xxx MB in 3.00 seconds = [142.50, 142.12] MB/sec
2147483648 bytes (2.1 GB) copied, xxx s, [140, 140] MB/s




4x 32 GiB partitions for testing:




# gdisk -l /dev/sd[efgh]
GPT fdisk (gdisk) version 0.8.10

Partition table scan:
MBR: protective
BSD: not present
APM: not present
GPT: present


Found valid GPT with protective MBR; using GPT.
Disk /dev/sde: 5860533168 sectors, 2.7 TiB
Logical sector size: 512 bytes
Disk identifier (GUID): xxx
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 5860533134
Partitions will be aligned on 2048-sector boundaries
Total free space is 5793424237 sectors (2.7 TiB)

Number Start (sector) End (sector) Size Code Name

1 2048 67110911 32.0 GiB FD00 Linux RAID




# mdadm --create --verbose /dev/md0 --level=5 --raid-devices=4 --chunk=256K /dev/sd[efgh]1
(some tests later ...)
# mdadm --grow --verbose /dev/md0 --layout=right-asymmetric
# mdadm --detail /dev/md0
/dev/md0:

Version : 1.2
Creation Time : Sat Dec 10 03:07:56 2016
Raid Level : raid5
Array Size : 100561920 (95.90 GiB 102.98 GB)
Used Dev Size : 33520640 (31.97 GiB 34.33 GB)
Raid Devices : 4
Total Devices : 4
Persistence : Superblock is persistent

Update Time : Sat Dec 10 23:56:53 2016

State : clean
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0

Layout : right-asymmetric
Chunk Size : 256K

Name : vm:0 (local to host vm)

UUID : 80d0f886:dc380755:5387f78c:1fac60da
Events : 158

Number Major Minor RaidDevice State
0 8 65 0 active sync /dev/sde1
1 8 81 1 active sync /dev/sdf1
2 8 97 2 active sync /dev/sdg1
4 8 113 3 active sync /dev/sdh1





I expected the array to perform roughly between 350 - 400 MB/s for continuous reads and writes. Reading or writing the entire volume actually yields results perfectly within this range:



# echo 3 | tee /proc/sys/vm/drop_caches; dd of=/dev/null if=/dev/md0 bs=256K
102975406080 bytes (103 GB) copied, 261.373 s, 394 MB/s

# dd if=/dev/zero of=/dev/md0 bs=256K conv=fdatasync
102975406080 bytes (103 GB) copied, 275.562 s, 374 MB/s



However, the write performance greatly depends on the amount of data written. As expected the transfer rate increases with the amount of data, but drops dead when reaching 2 GiB and only recovers slowly when further increasing the size:



# dd if=/dev/zero of=/dev/md0 bs=256K conv=fdatasync count=x
count=1: 262144 bytes (262 kB) copied, xxx s, [3.6, 7.6, 8.9, 8.9] MB/s
count=2: 524288 bytes (524 kB) copied, xxx s, [3.1, 17.7, 15.3, 15.7] MB/s
count=4: 1048576 bytes (1.0 MB) copied, xxx s, [13.2, 23.9, 26.9, 25.4] MB/s
count=8: 2097152 bytes (2.1 MB) copied, xxx s, [24.3, 46.7, 45.9, 42.8] MB/s
count=16: 4194304 bytes (4.2 MB) copied, xxx s, [5.1, 77.3, 42.6, 73.2, 79.8] MB/s
count=32: 8388608 bytes (8.4 MB) copied, xxx s, [68.6, 101, 99.7, 101] MB/s

count=64: 16777216 bytes (17 MB) copied, xxx s, [52.5, 136, 159, 159] MB/s
count=128: 33554432 bytes (34 MB) copied, xxx s, [38.5, 175, 185, 189, 176] MB/s
count=256: 67108864 bytes (67 MB) copied, xxx s, [53.5, 244, 229, 238] MB/s
count=512: 134217728 bytes (134 MB) copied, xxx s, [111, 288, 292, 288] MB/s
count=1K: 268435456 bytes (268 MB) copied, xxx s, [171, 328, 319, 322] MB/s
count=2K: 536870912 bytes (537 MB) copied, xxx s, [228, 337, 330, 334] MB/s
count=4K: 1073741824 bytes (1.1 GB) copied, xxx s, [338, 348, 348, 343] MB/s <-- ok!
count=8K: 2147483648 bytes (2.1 GB) copied, xxx s, [168, 147, 138, 139] MB/s <-- bad!
count=16K: 4294967296 bytes (4.3 GB) copied, xxx s, [155, 160, 178, 144] MB/s
count=32K: 8589934592 bytes (8.6 GB) copied, xxx s, [256, 238, 264, 246] MB/s

count=64K: 17179869184 bytes (17 GB) copied, xxx s, [298, 285] MB/s
count=128K: 34359738368 bytes (34 GB) copied, xxx s, [347, 336] MB/s
count=256K: 68719476736 bytes (69 GB) copied, xxx s, [363, 356] MB/s <-- getting better


(below 2 GiB the first measurement seems to indicate the usage of some read-cache)



While transferring 2 GiB or more, I observed something strange in iotop:





  • Phase 1: in the beginning, "Total DISK WRITE" and "Actual DISK WRITE" are both around 400 MB/s. dd has an IO value around 85 % while all others are at 0 %. This phase lasts longer on bigger transfers.

  • Phase 2: a few seconds (~16 s) before the transmission is done, a kworker jumps in and steals 30 - 50 percentage points of IO from dd. The distribution fluctuates between 30:50 % and 50:30 %. At the same time, "Total DISK WRITE" drops down to 0 B/s and "Actual DISK WRITE" jumps between 20 - 70 MB/s. This phase seems to last for a constant time.

  • Phase 3: within the last 3 s, "Actual DISK WRITE" jumps up to > 400 MB/s while "Total DISK WRITE" stays at 0 B/s. dd and kworker are both listed with an IO value of 0 %.

  • Phase 4: the IO value of dd jumps up to 5 % for a single second. At the same time the transfer completes.



Some more tests



# dd if=/dev/zero of=/dev/md0 bs=256K count=32K oflag=direct
8589934592 bytes (8.6 GB) copied, 173.083 s, 49.6 MB/s


# dd if=/dev/zero of=/dev/md0 bs=256M count=64 oflag=direct
17179869184 bytes (17 GB) copied, 47.792 s, 359 MB/s

# dd if=/dev/zero of=/dev/md0 bs=768M count=16K oflag=direct
50734301184 bytes (51 GB) copied, 136.347 s, 372 MB/s <-- peak performance

# dd if=/dev/zero of=/dev/md0 bs=1G count=16K oflag=direct
41875931136 bytes (42 GB) copied, 112.518 s, 372 MB/s <-- peak performance


# dd if=/dev/zero of=/dev/md0 bs=2G count=16 oflag=direct
34359672832 bytes (34 GB) copied, 103.355 s, 332 MB/s

# dd if=/dev/zero of=/dev/md0 bs=256K count=32K oflag=dsync
8589934592 bytes (8.6 GB) copied, 498.77 s, 17.2 MB/s

# dd if=/dev/zero of=/dev/md0 bs=256M count=64 oflag=dsync
17179869184 bytes (17 GB) copied, 58.384 s, 294 MB/s

# dd if=/dev/zero of=/dev/md0 bs=1G count=8 oflag=dsync

8589934592 bytes (8.6 GB) copied, 26.4799 s, 324 MB/s

# dd if=/dev/zero of=/dev/md0 bs=2G count=8 oflag=dsync
17179836416 bytes (17 GB) copied, 192.327 s, 89.3 MB/s

# dd if=/dev/zero of=/dev/md0 bs=256K; echo "sync"; sync
102975406080 bytes (103 GB) copied, 275.378 s, 374 MB/s
sync




  • bs=256K oflag=direct -> 100 % IO, no kworker present, bad performance

  • bs=1G oflag=direct -> < 5 % IO, no kworker present, ok performance

  • bs=2G oflag=direct -> > 80 % IO, kworker jumps in every now and then, ok performance

  • oflag=dsync -> < 5 % IO, kworker jumps in every now and then; needs huge block sizes to obtain acceptable speed, but > 2G results in a massive performance drop.

  • echo "sync"; sync -> same as conv=fdatasync; sync returns immediately






What's that mysterious Phase 2 where both processes seem to fight for IO?



Who's transferring the data to the hardware in phase 3?



And most importantly: how can I minimize the strange effect and obtain the full 400 MB/s which the array seems to be able to provide? (Or am I even asking an XY problem?)





There's a long story of trial and error preceding the current state. I switched the scheduler from cfq to noop and decreased the RAID's chunk size from 512k to 256k, yielding slightly better results. Changing to --layout=right-asymmetric didn't change anything. Temporarily deactivating the hard drives' write cache performed worse.




The crypt layer mentioned in the first sentence is currently completely absent and will be re-introduced later.



# uname -a
Linux vm 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u2 (2016-10-19) x86_64 GNU/Linux

Answer



What you are seeing is an artifact of your dd command line, specifically from the conv=fdatasync option. From the man page:




Each CONV symbol may be:
...
fdatasync: physically write output file data before finishing
...





conv=fdatasync basically instructs dd to execute a single, final fdatasync syscall before returning. However, writes are cached while dd runs. Your I/O phases can be explained as follows (a way to watch this live is sketched after the list):




  1. dd quickly writes into the pagecache, without actually touching the disk

  2. the pagecache is nearly full and a kworker kernel thread begins flushing it to disk. During pagecache flushing, dd is briefly paused (resulting in high iowait); after some pagecache is freed, dd can resume operation

  3. the difference between TOTAL and ACTUAL disk writes in iotop depends on how the pagecache is, respectively, filled and flushed

  4. the cycle continues
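You can watch this cycle live while dd runs; for example:

# Dirty grows while dd fills the pagecache; Writeback rises while the
# kworker flushes it to disk
watch -n1 'grep -E "Dirty|Writeback" /proc/meminfo'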




In short, there is no issue here. If you want to observe uncached behavior, replace conv=fdatasync with oflag=direct: with this flag, you completely bypass the pagecache.



For observing cached but synchronized behavior, replace conv=fdatasync with oflag=sync: with this flag, dd syncs each block to disk as it is written.



Further optimization can be obtained by fine-tuning your I/O stack (i.e. I/O scheduler, merge behavior, stripe cache, etc.), but that is another question entirely.
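For illustration only, a few of the knobs involved (values are examples, not recommendations):

# md RAID5/6 stripe cache (in pages, per array)
echo 8192 > /sys/block/md0/md/stripe_cache_size

# per-disk I/O scheduler
cat /sys/block/sde/queue/scheduler

# pagecache writeback thresholds
sysctl vm.dirty_background_ratio vm.dirty_ratio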


Saturday, March 26, 2016

Windows Remote desktop over ssh by putty

I know this topic may be a duplicate, but I cannot find a solution in old posts.



Here is my situation



Work PC <> Firewall (port 3389 blocked) <> Internet <> Home Router <> Home PC



So I want to build an SSH tunnel (on port 443) so that I can remote from the work PC to the home PC.



I have tried PuTTY, Bitvise SSH and freeSSHd, but with no luck.



What I have tried so far:



Assume:





  • Home and Work OS : Windows 7

  • Public IP of Home Router: 100.100.100.100

  • Private IP of Home PC :192.168.1.199



Home PC :




  • install Bitvise SSH listening on localhost port 22, with port forwarding enabled

  • create a test local account


  • have a Remote Desktop account ready

  • connect to Bitvise SSH on port 22 with PuTTY, using the test account

  • router forwards port 443 to 192.168.1.199:3391

  • PuTTY config: tunnel L3391 127.0.0.1:3389



Testing :




  1. Open Windows RDP,

  2. connect to 127.0.0.1:3391,

  3. the login dialog prompts,

  4. enter the Remote Desktop account -> the RDP welcome screen pops up ->
    an "Access is denied" message is displayed.



Since the localhost test already fails, I have not tried to test it from outside.



Does anyone know how to make this happen? Thanks.
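For reference, the tunnel described above can be expressed as a single plink (PuTTY's command-line tool) command; a sketch of the work-PC side, assuming the router forwards external port 443 to the home PC's SSH port (22):

plink -ssh -P 443 -L 3391:127.0.0.1:3389 test@100.100.100.100

After that, mstsc /v:127.0.0.1:3391 should reach the home desktop.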

linux - RAID iostat showing 0.00 for wait's and %util

I have an EC2 server running Ubuntu 11.10 with 6 RAID devices: 5 with 8 EBS drives each in a RAID 10 setup, and one with 4 EBS drives in a RAID 0 setup.



I am finding that even though each individual EBS device shows correct iostat values, the md devices show 0.00 for avgqu-sz, await, r_await, w_await, svctm and %util. The other stats for the md devices (rrqm/s, wrqm/s, r/s, w/s, rkB/s, wkB/s, avgrq-sz) all seem to be correct.




Any ideas on how I might be able to get stats for the missing columns?

ftp - VSFTPD not behaving in command line, but fine in GUI (Filezilla)

I'm trying to use vsftpd 3.0.3-3ubuntu2 on an Ubuntu 16.04.1 install. I have done this so that an application on another server can run an automated script, triggered by a cron job, to pull files from the FTP server.



I can connect to the FTP server easily using Filezilla and can browse the FTP directory structure and both download and upload files in both active and passive mode, so I know that vsftpd is installed and at least partly behaving as expected. However, when I try to connect to the FTP server from the command line, I can't browse the file structure at all; I can't even run a simple dir or ls command. Naturally the software on my other server isn't able to pull the required files either, since in essence it runs a script that connects to the FTP server in the background.



Can anybody at all suggest what might be the problem here?



Here's what I get when I log in to my FTP server and try to type a few basic commands:




FTP from DOS (Windows 10 command line)



230 Login successful

ftp> ls
550 Permission denied

ftp> dir
550 Permission denied.

425 Use PORT or PASV first

ftp> quote passv
227 Entering Passive Mode (172,31,45,155,173,61)

ftp> ls
550 Permission denied

ftp> dir
425 Failed to establish a connection


mget filename.csv
Permission denied
200 Switching to ASCII mode.
Cannot find list of remote files


FTP from Linux Command Line:



230 Login successful.

Remote system type is UNIX.
Using binary mode to transfer files.
ftp> ls
550 Permission denied.
ftp: bind: Address already in use
ftp> dir
550 Permission denied.
ftp> quote pasv
227 Entering Passive Mode (172,31,45,155,165,40).
ftp> ls

550 Permission denied.
ftp> dir
550 Permission denied.


I have also tried enabling passive mode via the command line, but am getting the same "permission denied" error messages, although for a "dir" command in passive mode I get a slightly different response: 425 Failed to establish connection.



I originally set up the server as per the tutorial at:
https://www.digitalocean.com/community/tutorials/how-to-set-up-vsftpd-for-a-user-s-directory-on-ubuntu-16-04




I have enabled vsftpd logging, but it only shows successful-connection notifications in relation to logging in via the command line. I can't see any clues as to why I cannot run commands from the command line. If I log in via Filezilla and pull a few files, I can see those downloads logged correctly.



My vsftpd config file is as follows. Note the bottom line: I added the cmds_allowed line in an attempt to get vsftpd to listen to me, but clearly it hasn't worked. Also, I originally set up SFTP, but have since commented out some of those lines, as the software on the other server that pulls the files is set to use only FTP.



# Example config file /etc/vsftpd.conf
#
# The default compiled in settings are fairly paranoid. This sample file
# loosens things up a bit, to make the ftp daemon more usable.
# Please see vsftpd.conf.5 for all compiled in defaults.
#

# READ THIS: This example file is NOT an exhaustive list of vsftpd options.
# Please read the vsftpd.conf.5 manual page to get a full idea of vsftpd's
# capabilities.
#
#
# Run standalone? vsftpd can run either from an inetd or as a standalone
# daemon started from an initscript.
listen=NO
#
# This directive enables listening on IPv6 sockets. By default, listening

# on the IPv6 "any" address (::) will accept connections from both IPv6
# and IPv4 clients. It is not necessary to listen on *both* IPv4 and IPv6
# sockets. If you want that (perhaps because you want to listen on specific
# addresses) then you must run two copies of vsftpd with two configuration
# files.
listen_ipv6=YES
#
# Allow anonymous FTP? (Disabled by default).
anonymous_enable=NO
#

# Uncomment this to allow local users to log in.
local_enable=YES
#
# Uncomment this to enable any form of FTP write command.
write_enable=YES
#
# Default umask for local users is 077. You may wish to change this to 022,
# if your users expect that (022 is used by most other ftpd's)
local_umask=022
#

# Uncomment this to allow the anonymous FTP user to upload files. This only
# has an effect if the above global write enable is activated. Also, you will
# obviously need to create a directory writable by the FTP user.
#anon_upload_enable=YES
#
# Uncomment this if you want the anonymous FTP user to be able to create
# new directories.
#anon_mkdir_write_enable=YES
#
# Activate directory messages - messages given to remote users when they

# go into a certain directory.
dirmessage_enable=YES
#
# If enabled, vsftpd will display directory listings with the time
# in your local time zone. The default is to display GMT. The
# times returned by the MDTM FTP command are also affected by this
# option.
use_localtime=YES
#
# Activate logging of uploads/downloads.

xferlog_enable=YES
#
# Make sure PORT transfer connections originate from port 20 (ftp-data).
connect_from_port_20=YES
#
# If you want, you can arrange for uploaded anonymous files to be owned by
# a different user. Note! Using "root" for uploaded files is not
# recommended!
#chown_uploads=YES
#chown_username=whoever

#
# You may override where the log file goes if you like. The default is shown
# below.
#xferlog_file=/var/log/vsftpd.log
#
# If you want, you can have your log file in standard ftpd xferlog format.
# Note that the default log file location is /var/log/xferlog in this case.
#xferlog_std_format=YES
#
# You may change the default value for timing out an idle session.

#idle_session_timeout=600
#
# You may change the default value for timing out a data connection.
#data_connection_timeout=120
#
# It is recommended that you define on your system a unique user which the
# ftp server can use as a totally isolated and unprivileged user.
#nopriv_user=ftpsecure
#
# Enable this and the server will recognise asynchronous ABOR requests. Not

# recommended for security (the code is non-trivial). Not enabling it,
# however, may confuse older FTP clients.
#async_abor_enable=YES
#
# By default the server will pretend to allow ASCII mode but in fact ignore
# the request. Turn on the below options to have the server actually do ASCII
# mangling on files when in ASCII mode.
# Beware that on some FTP servers, ASCII support allows a denial of service
# attack (DoS) via the command "SIZE /big/file" in ASCII mode. vsftpd
# predicted this attack and has always been safe, reporting the size of the

# raw file.
# ASCII mangling is a horrible feature of the protocol.
#ascii_upload_enable=YES
#ascii_download_enable=YES
#
# You may fully customise the login banner string:
# ftpd_banner= FTP for supplier product feeds ONLY
#
# You may specify a file of disallowed anonymous e-mail addresses. Apparently
# useful for combatting certain DoS attacks.

#deny_email_enable=YES
# (default follows)
#banned_email_file=/etc/vsftpd.banned_emails
#
# You may restrict local users to their home directories. See the FAQ for
# the possible risks in this before using chroot_local_user or
# chroot_list_enable below.
# chroot_local_user=YES
#
# You may specify an explicit list of local users to chroot() to their home

# directory. If chroot_local_user is YES, then this list becomes a list of
# users to NOT chroot().
# (Warning! chroot'ing can be very dangerous. If using chroot, make sure that
# the user does not have write access to the top level directory within the
# chroot)

chroot_local_user=YES
allow_writeable_chroot=YES

# chroot_list_enable=YES

# (default follows)
# chroot_list_file=/etc/vsftpd.chroot_list

pasv_enable=Yes
pasv_min_port=40000
pasv_max_port=50000
pasv_address=xxx.xxx.xxx.xxx

# Added as per https://www.digitalocean.com/community/tutorials/how-to-set-up-vsftpd-for-a-user-s-directory-on-ubuntu-16-04
user_sub_token=$USER

local_root=/home/$USER/ftp

userlist_enable=YES
userlist_file=/etc/vsftpd.userlist
userlist_deny=NO

# You may activate the "-R" option to the builtin ls. This is disabled by
# default to avoid remote users being able to cause excessive I/O on large
# sites. However, some broken FTP clients such as "ncftp" and "mirror" assume
# the presence of the "-R" option, so there is a strong case for enabling it.

#ls_recurse_enable=YES
#
# Customization
#
# Some of vsftpd's settings don't fit the filesystem layout by
# default.
#
# This option should be the name of a directory which is empty. Also, the
# directory should not be writable by the ftp user. This directory is used
# as a secure chroot() jail at times vsftpd does not require filesystem

# access.
secure_chroot_dir=/var/run/vsftpd/empty
#
# This string is the name of the PAM service vsftpd will use.
pam_service_name=vsftpd
#
# This option specifies the location of the RSA certificate to use for SSL
# encrypted connections.
# rsa_cert_file=/etc/ssl/certs/ssl-cert-snakeoil.pem
# rsa_private_key_file=/etc/ssl/private/ssl-cert-snakeoil.key

ssl_enable=NO

rsa_cert_file=/etc/ssl/private/vsftpd.pem
rsa_private_key_file=/etc/ssl/private/vsftpd.pem

#allow_anon_ssl=NO
#force_local_data_ssl=YES
#force_local_logins_ssl=YES

#ssl_tlsv1=YES

#ssl_sslv2=NO
#ssl_sslv3=NO

#require_ssl_reuse=NO
#ssl_ciphers=HIGH

# Uncomment this to indicate that vsftpd use a utf8 filesystem.
#utf8_filesystem=YES

cmds_allowed=dir,get,ls,put,cd,mkdir,rm,rmdir,PUT,PWD,CWD,SYST,FEAT,STOR,LIST,MKD,DELE,RMD,GET,EPSV,PASV,RETR,TYPE,NOOP,EXIT,QUIT



Log from Command line login and dir list attempts...



Mon Jan  9 16:41:05 2017 [pid 11271] FTP response: Client "::ffff:80.229.82.100", "220 (vsFTPd 3.0.3)"
Mon Jan 9 16:41:06 2017 [pid 11271] FTP command: Client "::ffff:80.229.82.100", "OPTS UTF8 ON"
Mon Jan 9 16:41:06 2017 [pid 11271] FTP response: Client "::ffff:80.229.82.100", "200 Always in UTF8 mode."
Mon Jan 9 16:41:09 2017 [pid 11271] FTP command: Client "::ffff:80.229.82.100", "USER pmcftp"
Mon Jan 9 16:41:09 2017 [pid 11271] [pmcftp] FTP response: Client "::ffff:80.229.82.100", "331 Please specify the password."
Mon Jan 9 16:41:16 2017 [pid 11271] [pmcftp] FTP command: Client "::ffff:80.229.82.100", "PASS "

Mon Jan 9 16:41:16 2017 [pid 11270] [pmcftp] OK LOGIN: Client "::ffff:80.229.82.100"
Mon Jan 9 16:41:17 2017 [pid 11275] [pmcftp] FTP response: Client "::ffff:80.229.82.100", "230 Login successful."
Mon Jan 9 16:41:27 2017 [pid 11275] [pmcftp] FTP command: Client "::ffff:80.229.82.100", "NLST"
Mon Jan 9 16:41:27 2017 [pid 11275] [pmcftp] FTP response: Client "::ffff:80.229.82.100", "550 Permission denied."
Mon Jan 9 16:41:29 2017 [pid 11275] [pmcftp] FTP command: Client "::ffff:80.229.82.100", "LIST"
Mon Jan 9 16:41:29 2017 [pid 11275] [pmcftp] FTP response: Client "::ffff:80.229.82.100", "425 Use PORT or PASV first."
Mon Jan 9 16:42:02 2017 [pid 11275] [pmcftp] FTP command: Client "::ffff:80.229.82.100", "PASV"
Mon Jan 9 16:42:02 2017 [pid 11275] [pmcftp] FTP response: Client "::ffff:80.229.82.100", "227 Entering Passive Mode (0,0,0,0,194,105)."
Mon Jan 9 16:42:04 2017 [pid 11275] [pmcftp] FTP command: Client "::ffff:80.229.82.100", "NLST"
Mon Jan 9 16:42:04 2017 [pid 11275] [pmcftp] FTP response: Client "::ffff:80.229.82.100", "550 Permission denied."

Mon Jan 9 16:42:05 2017 [pid 11275] [pmcftp] FTP command: Client "::ffff:80.229.82.100", "LIST"
Mon Jan 9 16:43:05 2017 [pid 11275] [pmcftp] FTP response: Client "::ffff:80.229.82.100", "425 Failed to establish connection."


Log from Filezilla login and download
N.B. When trying to put a file to the server I get a critical transfer error. Whilst this is a problem, it's not my priority here.



Mon Jan  9 16:48:05 2017 [pid 11275] [pmcftp] FTP response: Client "::ffff:80.229.82.100", "421 Timeout."
Mon Jan 9 16:48:16 2017 [pid 11460] CONNECT: Client "::ffff:80.229.82.100"
Mon Jan 9 16:48:16 2017 [pid 11460] FTP response: Client "::ffff:80.229.82.100", "220 (vsFTPd 3.0.3)"

Mon Jan 9 16:48:16 2017 [pid 11460] FTP command: Client "::ffff:80.229.82.100", "AUTH TLS"
Mon Jan 9 16:48:16 2017 [pid 11460] FTP response: Client "::ffff:80.229.82.100", "530 Please login with USER and PASS."
Mon Jan 9 16:48:16 2017 [pid 11460] FTP command: Client "::ffff:80.229.82.100", "AUTH SSL"
Mon Jan 9 16:48:16 2017 [pid 11460] FTP response: Client "::ffff:80.229.82.100", "530 Please login with USER and PASS."
Mon Jan 9 16:48:16 2017 [pid 11460] FTP command: Client "::ffff:80.229.82.100", "USER pmcftp"
Mon Jan 9 16:48:16 2017 [pid 11460] [pmcftp] FTP response: Client "::ffff:80.229.82.100", "331 Please specify the password."
Mon Jan 9 16:48:16 2017 [pid 11460] [pmcftp] FTP command: Client "::ffff:80.229.82.100", "PASS "
Mon Jan 9 16:48:16 2017 [pid 11459] [pmcftp] OK LOGIN: Client "::ffff:80.229.82.100"
Mon Jan 9 16:48:16 2017 [pid 11464] [pmcftp] FTP response: Client "::ffff:80.229.82.100", "230 Login successful."
Mon Jan 9 16:48:16 2017 [pid 11464] [pmcftp] FTP command: Client "::ffff:80.229.82.100", "PWD"

Mon Jan 9 16:48:16 2017 [pid 11464] [pmcftp] FTP response: Client "::ffff:80.229.82.100", "257 "/" is the current directory"
Mon Jan 9 16:48:46 2017 [pid 11514] CONNECT: Client "::ffff:80.229.82.100"
Mon Jan 9 16:48:46 2017 [pid 11514] FTP response: Client "::ffff:80.229.82.100", "220 (vsFTPd 3.0.3)"
Mon Jan 9 16:48:46 2017 [pid 11514] FTP command: Client "::ffff:80.229.82.100", "AUTH TLS"
Mon Jan 9 16:48:46 2017 [pid 11514] FTP response: Client "::ffff:80.229.82.100", "530 Please login with USER and PASS."
Mon Jan 9 16:48:46 2017 [pid 11514] FTP command: Client "::ffff:80.229.82.100", "AUTH SSL"
Mon Jan 9 16:48:46 2017 [pid 11514] FTP response: Client "::ffff:80.229.82.100", "530 Please login with USER and PASS."
Mon Jan 9 16:48:46 2017 [pid 11514] FTP command: Client "::ffff:80.229.82.100", "USER pmcftp"
Mon Jan 9 16:48:46 2017 [pid 11514] [pmcftp] FTP response: Client "::ffff:80.229.82.100", "331 Please specify the password."
Mon Jan 9 16:48:46 2017 [pid 11514] [pmcftp] FTP command: Client "::ffff:80.229.82.100", "PASS "

Mon Jan 9 16:48:46 2017 [pid 11513] [pmcftp] OK LOGIN: Client "::ffff:80.229.82.100"
Mon Jan 9 16:48:46 2017 [pid 11518] [pmcftp] FTP response: Client "::ffff:80.229.82.100", "230 Login successful."
Mon Jan 9 16:48:46 2017 [pid 11518] [pmcftp] FTP command: Client "::ffff:80.229.82.100", "CWD /nimans"
Mon Jan 9 16:48:46 2017 [pid 11518] [pmcftp] FTP response: Client "::ffff:80.229.82.100", "250 Directory successfully changed."
Mon Jan 9 16:48:48 2017 [pid 11518] [pmcftp] FTP command: Client "::ffff:80.229.82.100", "TYPE I"
Mon Jan 9 16:48:48 2017 [pid 11518] [pmcftp] FTP response: Client "::ffff:80.229.82.100", "200 Switching to Binary mode."
Mon Jan 9 16:48:48 2017 [pid 11518] [pmcftp] FTP command: Client "::ffff:80.229.82.100", "PASV"
Mon Jan 9 16:48:48 2017 [pid 11518] [pmcftp] FTP response: Client "::ffff:80.229.82.100", "227 Entering Passive Mode (0,0,0,0,167,164)."
Mon Jan 9 16:48:48 2017 [pid 11518] [pmcftp] FTP command: Client "::ffff:80.229.82.100", "RETR F126841.CSV"
Mon Jan 9 16:48:48 2017 [pid 11518] [pmcftp] FTP response: Client "::ffff:80.229.82.100", "150 Opening BINARY mode data connection for F126841.CSV (1434704 bytes)."

Mon Jan 9 16:48:49 2017 [pid 11518] [pmcftp] OK DOWNLOAD: Client "::ffff:80.229.82.100", "/nimans/F126841.CSV", 1434704 bytes, 1153.59Kbyte/sec
Mon Jan 9 16:48:49 2017 [pid 11518] [pmcftp] FTP response: Client "::ffff:80.229.82.100", "226 Transfer complete."
Mon Jan 9 16:49:13 2017 [pid 11518] [pmcftp] FTP command: Client "::ffff:80.229.82.100", "TYPE A"
Mon Jan 9 16:49:13 2017 [pid 11518] [pmcftp] FTP response: Client "::ffff:80.229.82.100", "200 Switching to ASCII mode."
Mon Jan 9 16:49:13 2017 [pid 11518] [pmcftp] FTP command: Client "::ffff:80.229.82.100", "PASV"
Mon Jan 9 16:49:13 2017 [pid 11518] [pmcftp] FTP response: Client "::ffff:80.229.82.100", "227 Entering Passive Mode (0,0,0,0,165,201)."
Mon Jan 9 16:49:13 2017 [pid 11518] [pmcftp] FTP command: Client "::ffff:80.229.82.100", "STOR .htaccess"
Mon Jan 9 16:49:13 2017 [pid 11518] [pmcftp] FTP response: Client "::ffff:80.229.82.100", "553 Could not create file."
Mon Jan 9 16:49:13 2017 [pid 11518] [pmcftp] FAIL UPLOAD: Client "::ffff:80.229.82.100", "/nimans/.htaccess", 0.00Kbyte/sec

Friday, March 25, 2016

apache 2.2 - Passenger: mod_rewrite rules for non-default page cache directory for Rails application



Does anybody have some working Apache mod_rewrite rules that enable Phusion Passenger (mod_rails) to use a non-default location for the page cache within a Rails application? I'd like the cached files to go in /public/cache rather than the default of /public.


Answer



I found the answer in this blog post:



RailsAllowModRewrite On  

RewriteEngine On

RewriteCond %{THE_REQUEST} ^(GET|HEAD)
RewriteCond %{REQUEST_URI} ^/([^.]+)$
RewriteCond %{DOCUMENT_ROOT}/cache/%1.html -f
RewriteRule ^/[^.]+$ /cache/%1.html [QSA,L]

RewriteCond %{THE_REQUEST} ^(GET|HEAD)
RewriteCond %{DOCUMENT_ROOT}/cache/index.html -f
RewriteRule ^/$ /cache/index.html [QSA,L]


ubuntu 14.04 - Remove index.php from URL Apache 2.4

I have been trying to remove index.php from my application's URLs for a day. I have tried everything, but am not able to remove it. The problem is that I am not allowed to have a .htaccess file in the project directory, and the config from the .htaccess file that works on my local dev machine does not work there.




The config I have on my local machine is:




# Rules to serve URLs which point to files directly
# ----------
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php/$1 [L]




I am aware that I cannot use the RewriteBase directive inside the virtual host configuration file (/etc/apache2/sites-available/000-default.conf). However, removing it doesn't work either.
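For reference, the equivalent of the .htaccess rules above can usually live in a <Directory> block inside the virtual host, where no RewriteBase is needed; a sketch assuming a DocumentRoot of /var/www/html:

<Directory /var/www/html>
    RewriteEngine On
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule ^(.*)$ index.php/$1 [L]
</Directory>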



Why is this problem so difficult? Far more complicated redirections are working perfectly, but there seems to be no definitive solution for removing index.php from the URL!



EDIT: It is installed on an Ubuntu 14.04 machine.

user permissions - Why shouldn't you let developers near root passwords?





I just came across Something is burning in the server room; how can I quickly identify what it is?. In the comments I found the following quote:



you don't let a developer anywhere near your root passwords


As a developer, but also someone very interested in sysadmin stuff, I'm wondering if there's anything more to this phrase than the standard rule for everyone - that is, don't give away passwords. The commenter states this as though it's fairly common knowledge in the sysadmin community. Is it?



Answer



I am not saying sysadmins are perfect, but there are a huge number of examples of developers who shouldn't even be developers, let alone be permitted root access to systems with mission-critical data.



But in any case it is mostly about taking responsibility for changes made. For many sysadmins it seems that developers will go do things on servers, and then they do not take responsibility for what they have done. Developers are often only focused on getting the single program/task they are working on going, and don't take the time to think about the big picture or long term state of the system they are touching. They haven't learned habits about how to make changes safely.



These things aren't true of all developers; there are certainly some great developers who can maintain systems just fine. It just often seems to be the case 95% of the time.



Habits like:





  • Don't make changes to a system without testing on dev systems first

  • Don't make changes that you don't understand (blindly Googling, then running things you don't understand)

  • Don't make changes without verifying that the backup systems are working, and that you know how to roll back anything you do.

  • Don't make changes that will make the system more difficult to maintain in the long term (i.e., don't add technical debt)

  • Don't compromise the security of the system

  • Don't change the system in a way that will increase the potential regulatory risk (see PCI-DSS, HIPAA, FERPA, etc)


Thursday, March 24, 2016

apache 2.2 - Linux top command. Memory usage

I am testing my web server with JMeter. I launch a 40-user test, then dump the top command.
What I see is 40 (+1 parent) Apache processes. Each process uses approx. 7 MB of RES memory, and 7 * 40 is 280 MB of memory. But top shows 508 MB total and 345 MB free, so only 163 MB are used...
Why do I see this strange result?
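For reference, the per-process arithmetic can be reproduced like this; note that RES includes pages shared between the workers, so summing it overstates real memory use:

# sum the resident set size (KiB) of all apache2 processes
ps -C apache2 -o rss= | awk '{ s += $1 } END { print s " KiB" }'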




top - 04:49:24 up 1 day, 10:49,  1 user,  load average: 0.28, 0.18, 0.16
Tasks: 107 total, 2 running, 105 sleeping, 0 stopped, 0 zombie
Cpu(s): 1.4%us, 0.4%sy, 0.0%ni, 97.6%id, 0.5%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 508132k total, 162428k used, 345704k free, 28340k buffers
Swap: 916476k total, 21800k used, 894676k free, 63480k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
9816 www-data 20 0 69232 7260 3232 S 1.9 1.4 0:00.69 apache2
9890 www-data 20 0 69232 7260 3232 S 1.9 1.4 0:00.06 apache2
9900 www-data 20 0 69232 7260 3232 S 1.9 1.4 0:00.04 apache2

9906 www-data 20 0 69232 7256 3232 S 1.9 1.4 0:00.04 apache2
9908 www-data 20 0 69232 7256 3232 S 1.9 1.4 0:00.06 apache2
1 root 20 0 2836 760 460 S 0.0 0.1 0:01.50 init
2 root 20 0 0 0 0 S 0.0 0.0 0:00.01 kthreadd
3 root 20 0 0 0 0 S 0.0 0.0 0:00.99 ksoftirqd/0
4 root RT 0 0 0 0 S 0.0 0.0 0:00.00 migration/0
5 root RT 0 0 0 0 S 0.0 0.0 0:00.00 watchdog/0
6 root 20 0 0 0 0 S 0.0 0.0 0:04.20 events/0
7 root 20 0 0 0 0 S 0.0 0.0 0:00.00 cpuset
8 root 20 0 0 0 0 S 0.0 0.0 0:00.00 khelper

9 root 20 0 0 0 0 S 0.0 0.0 0:00.00 netns
10 root 20 0 0 0 0 S 0.0 0.0 0:00.00 async/mgr
11 root 20 0 0 0 0 S 0.0 0.0 0:00.00 pm
12 root 20 0 0 0 0 S 0.0 0.0 0:00.45 sync_supers
13 root 20 0 0 0 0 S 0.0 0.0 0:00.62 bdi-default
14 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kintegrityd/0
15 root 20 0 0 0 0 S 0.0 0.0 0:05.89 kblockd/0
16 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kacpid
17 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kacpi_notify
18 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kacpi_hotplug

19 root 20 0 0 0 0 S 0.0 0.0 0:00.00 ata_aux
20 root 20 0 0 0 0 S 0.0 0.0 0:00.00 ata_sff/0
21 root 20 0 0 0 0 S 0.0 0.0 0:00.00 khubd
22 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kseriod
23 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kmmcd
25 root 20 0 0 0 0 S 0.0 0.0 0:00.08 khungtaskd
26 root 20 0 0 0 0 S 0.0 0.0 0:08.30 kswapd0
27 root 25 5 0 0 0 S 0.0 0.0 0:00.00 ksmd
28 root 20 0 0 0 0 S 0.0 0.0 0:00.00 aio/0
29 root 20 0 0 0 0 S 0.0 0.0 0:00.00 ecryptfs-kthrea

30 root 20 0 0 0 0 S 0.0 0.0 0:00.00 crypto/0
35 root 20 0 0 0 0 S 0.0 0.0 0:00.00 pciehpd
37 root 20 0 0 0 0 S 0.0 0.0 0:00.00 scsi_eh_0
38 root 20 0 0 0 0 S 0.0 0.0 0:00.02 scsi_eh_1
41 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kstriped
42 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kmpathd/0
43 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kmpath_handlerd
44 root 20 0 0 0 0 S 0.0 0.0 0:00.00 ksnapd
45 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kondemand/0
46 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kconservative/0

239 root 20 0 0 0 0 S 0.0 0.0 0:03.95 mpt_poll_0
240 root 20 0 0 0 0 S 0.0 0.0 0:00.00 mpt/0
241 root 20 0 0 0 0 S 0.0 0.0 0:00.00 scsi_eh_2
258 root 20 0 0 0 0 S 0.0 0.0 0:05.60 jbd2/sda1-8
259 root 20 0 0 0 0 S 0.0 0.0 0:00.00 ext4-dio-unwrit
304 root 20 0 2392 68 68 S 0.0 0.0 0:00.04 upstart-udev-br
306 root 16 -4 2440 72 68 S 0.0 0.0 0:00.06 udevd
414 root 18 -2 2328 64 60 S 0.0 0.0 0:00.00 udevd
415 root 18 -2 2328 64 60 S 0.0 0.0 0:00.00 udevd
518 root 20 0 0 0 0 S 0.0 0.0 0:02.87 vmmemctl

526 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kpsmoused
556 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kgameportd
618 syslog 20 0 33976 440 440 S 0.0 0.1 0:00.83 rsyslogd
689 root 20 0 1856 84 80 S 0.0 0.0 0:00.00 getty
693 root 20 0 1856 84 80 S 0.0 0.0 0:00.00 getty
697 root 20 0 1856 84 80 S 0.0 0.0 0:00.00 getty
698 root 20 0 1856 84 80 S 0.0 0.0 0:00.00 getty
701 root 20 0 1856 84 80 S 0.0 0.0 0:00.00 getty
703 memcache 20 0 54192 128 128 S 0.0 0.0 0:07.77 memcached
705 root 20 0 2456 268 204 S 0.0 0.1 0:00.42 cron

706 daemon 20 0 2316 0 0 S 0.0 0.0 0:00.00 atd
722 root 20 0 5640 360 256 S 0.0 0.1 0:00.53 sshd
753 mysql 20 0 153m 456 32 S 0.0 0.1 0:20.57 mysqld
9280 root 20 0 2780 1408 1064 S 0.0 0.3 0:00.05 login
9292 zim32 20 0 8828 6068 1536 S 0.0 1.2 0:00.26 bash
9324 root 20 0 7268 2968 2180 S 0.0 0.6 0:03.50 mc
9326 root 20 0 6252 3544 1588 S 0.0 0.7 0:00.21 bash
9735 root 20 0 0 0 0 S 0.0 0.0 0:00.00 flush-8:0
9808 root 20 0 68892 8624 4828 S 0.0 1.7 0:00.15 apache2
9814 www-data 20 0 69232 7260 3232 S 0.0 1.4 0:01.03 apache2
9827 www-data 20 0 69232 7260 3232 S 0.0 1.4 0:00.97 apache2
9842 www-data 20 0 69232 7264 3236 S 0.0 1.4 0:00.40 apache2
9844 www-data 20 0 69232 7260 3232 S 0.0 1.4 0:00.40 apache2
9870 www-data 20 0 69232 7264 3236 S 0.0 1.4 0:00.22 apache2
9872 www-data 20 0 69232 7260 3232 S 0.0 1.4 0:00.23 apache2
9877 www-data 20 0 69232 7260 3232 S 0.0 1.4 0:00.18 apache2
9878 www-data 20 0 69232 7260 3232 S 0.0 1.4 0:00.18 apache2
9888 www-data 20 0 69232 7260 3232 S 0.0 1.4 0:00.07 apache2
9889 www-data 20 0 69232 7260 3232 S 0.0 1.4 0:00.06 apache2
9891 www-data 20 0 69232 7260 3232 S 0.0 1.4 0:00.08 apache2
9892 www-data 20 0 69232 7256 3232 S 0.0 1.4 0:00.05 apache2
9893 www-data 20 0 69232 7260 3232 S 0.0 1.4 0:00.06 apache2
9894 www-data 20 0 69232 7256 3232 S 0.0 1.4 0:00.08 apache2
9895 www-data 20 0 69232 7256 3232 S 0.0 1.4 0:00.06 apache2
9896 www-data 20 0 69232 7256 3232 S 0.0 1.4 0:00.04 apache2
9897 www-data 20 0 69232 7248 3228 S 0.0 1.4 0:00.06 apache2
9898 www-data 20 0 69232 7260 3232 S 0.0 1.4 0:00.03 apache2
9899 www-data 20 0 69232 7260 3236 S 0.0 1.4 0:00.06 apache2
9901 www-data 20 0 69232 7260 3232 S 0.0 1.4 0:00.04 apache2
9902 www-data 20 0 69232 7256 3232 S 0.0 1.4 0:00.04 apache2
9903 www-data 20 0 69232 7260 3232 S 0.0 1.4 0:00.03 apache2
9904 www-data 20 0 69232 7256 3232 S 0.0 1.4 0:00.08 apache2
9905 www-data 20 0 69232 7256 3232 S 0.0 1.4 0:00.05 apache2
9907 www-data 20 0 69232 7260 3232 S 0.0 1.4 0:00.05 apache2
9909 www-data 20 0 69232 7256 3232 S 0.0 1.4 0:00.08 apache2
9911 www-data 20 0 69232 7256 3232 S 0.0 1.4 0:00.04 apache2
9912 www-data 20 0 69232 7248 3228 S 0.0 1.4 0:00.04 apache2
9913 www-data 20 0 69232 7260 3232 S 0.0 1.4 0:00.06 apache2
9914 www-data 20 0 69232 7260 3232 S 0.0 1.4 0:00.04 apache2
9915 www-data 20 0 69232 7260 3232 R 0.0 1.4 0:00.04 apache2
9916 www-data 20 0 69232 7256 3232 S 0.0 1.4 0:00.04 apache2
9917 www-data 20 0 69232 7256 3232 S 0.0 1.4 0:00.06 apache2
9918 www-data 20 0 69232 7248 3228 S 0.0 1.4 0:00.02 apache2
9919 root 20 0 2632 1068 816 R 0.0 0.2 0:00.02 top

Wednesday, March 23, 2016

log files - Is Apache stalled? /server-status shows over 240 requests like "OPTIONS * HTTP/1.0" 200 - "-" "Apache (internal dummy connection)"



Some details:




  • Webserver: Apache/2.2.13 (FreeBSD) mod_ssl/2.2.13 OpenSSL/0.9.8e

  • OS: FreeBSD 7.2-RELEASE

  • This is a FreeBSD Jail.


  • I believe I use the Apache 'prefork' MPM (I run the default for FreeBSD).

  • I use the default values for MaxClients (256)



I have enabled mod_status, with "ExtendedStatus On". When I view /server-status, I see a handful of regular requests. I also see over 240 requests from localhost, like these:



37-0  -  0/0/1  .  0.00  1510  0  0.0  0.00  0.00  127.0.0.2  www.example.gov  OPTIONS * HTTP/1.0
38-0  -  0/0/1  .  0.00  1509  0  0.0  0.00  0.00  127.0.0.2  www.example.gov  OPTIONS * HTTP/1.0
39-0  -  0/0/3  .  0.00  1482  0  0.0  0.00  0.00  127.0.0.2  www.example.gov  OPTIONS * HTTP/1.0
40-0  -  0/0/6  .  0.00  1445  0  0.0  0.00  0.00  127.0.0.2  www.example.gov  OPTIONS * HTTP/1.0



I also saw about 2,417 requests yesterday from localhost, like this one:



Apr 14 11:16:40 192.168.16.127 httpd[431]: www.example.gov 127.0.0.2 - - [15/Apr/2010:11:16:40 -0700] "OPTIONS * HTTP/1.0" 200 - "-" "Apache (internal dummy connection)"


The page at http://wiki.apache.org/httpd/InternalDummyConnection says "These requests are perfectly normal and you do not, in general, need to worry about them", but I'm not so sure.



Why are there over 230 of these? Are these active connections? If I have "MaxClients 256" and over 230 of these connections, it seems that my webserver is dangerously close to running out of available connections. It also seems like Apache should only need a handful of these "internal dummy connections".




We actually had two unexplained outages last night, and I am wondering if these "internal dummy connections" caused us to run out of available connections.



UPDATE 2010/04/16



It is 8 hours later. The /server-status page still shows 243 lines which say "www.example.gov OPTIONS *". I believe these connections are not active. The server is mostly idle (1 request currently being processed, 9 idle workers). There are only 18 active httpd processes on the Unix host.



If these connections are not active, why do they show up under /server-status? I would have expected them to expire a few minutes after they were initialized.


Answer



Well, this had a surprise answer. This was caused by a filesystem problem when we took UFS filesystem snapshots at midnight.




This seems to be caused by a FreeBSD UFS bug. We use FreeBSD Jails on a FreeBSD Host, with the default UFS filesystem. The UFS filesystem is large -- 1.8TB.



Once per night, we run a backup using 'dump(8)'. dump(8) was creating a snapshot of the filesystem before backing it up, and this froze the filesystem. Dump is supposed to work with filesystems less than 2TB, but it failed in our case. This guy had the same problem.
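
For context, the snapshot is what dump's -L flag takes on FreeBSD when dumping a live filesystem; the nightly job would have looked something like this (a sketch; the dump level, device, and output path are placeholders):

# -L takes a UFS snapshot of the mounted filesystem before dumping it
dump -0 -L -a -f /backup/root.dump /dev/ad0s1a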



(I moved my answer from the question section down here to the answer section. stefan, 20100608)


amazon web services - AWS CLI Elastic IP, IPv6



I have two AWS instances working in high availability. (I'm using keepalived)



I have an Elastic IP associated, and everything was fine.




I used this script to move the Elastic IP to the surviving instance in case of failover:



#!/bin/bash

EIP=52.212.151.17
INSTANCE_ID=i-0bdd8a68eb573fd1a

# Detach the Elastic IP from whichever instance currently holds it,
# then attach it to this instance
/usr/bin/aws ec2 disassociate-address --public-ip $EIP
/usr/bin/aws ec2 associate-address --public-ip $EIP --instance-id $INSTANCE_ID



But now my server has both an IPv4 and an IPv6 address, and I cannot do the same for IPv6, only for IPv4.



How can I do the same for IPv6, since there is no Elastic IPv6?


Answer



IPv6 addressing is managed differently than IPv4 usually is. IPv6 is managed by subnet, not by individual address as with IPv4 today.



So in Amazon AWS, you need to first assign an IPv6 CIDR block to your VPC. Then you can assign individual IPv6 addresses to your instances. See Amazon's guides for getting started with IPv6 and understanding IP addressing.
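
With the AWS CLI, that looks roughly like the following (a sketch; the VPC ID, subnet ID, and the /64 are placeholders):

# Request an Amazon-provided IPv6 block for the VPC, then carve out a /64 for the subnet
aws ec2 associate-vpc-cidr-block --vpc-id vpc-0123456789abcdef0 --amazon-provided-ipv6-cidr-block
aws ec2 associate-subnet-cidr-block --subnet-id subnet-0123456789abcdef0 --ipv6-cidr-block 2001:db8:1234:1a00::/64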



By default your instances will obtain IPv6 addresses automatically. If you don't want this, you can assign a specific IPv6 address to it. But unlike IPv4, with IPv6 you assign addresses to the network interface of the instance, not to the instance.




Use aws ec2 assign-ipv6-addresses to assign IPv6 addresses to your instances' network interfaces.
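
An IPv6 analogue of the failover script above would therefore move the address between network interfaces rather than instances. A minimal sketch, where the ENI IDs and the IPv6 address are placeholders:

#!/bin/bash

IPV6=2001:db8:1234:1a00::123
OLD_ENI=eni-0aaaaaaaaaaaaaaaa
NEW_ENI=eni-0bbbbbbbbbbbbbbbb

# Release the address from the old interface, then claim it on the new one
/usr/bin/aws ec2 unassign-ipv6-addresses --network-interface-id $OLD_ENI --ipv6-addresses $IPV6
/usr/bin/aws ec2 assign-ipv6-addresses --network-interface-id $NEW_ENI --ipv6-addresses $IPV6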


networking - Linux:/sbin/dhclient to bind to a specific interface?

I'm running RHEL 5.5 and have several network interfaces (eth0, eth1, eth2) on the machine. I wish to make /sbin/dhclient bind its UDP port 68 to specific interfaces (eth0 and eth2), but when I do 'netstat -anp | grep 68' I see:




udp        0      0 0.0.0.0:68       0.0.0.0:*                 6109/dhclient


which interferes with another software daemon running its own DHCP client that wishes to run on a specific interface (eth1) not serviced by /sbin/dhclient.



Can I get /sbin/dhclient to not bind to UDP port 0.0.0.0:68?



I have configured /etc/dhclient.conf to only service the interfaces I want (eth0 and eth2 in this case), but it still binds to 0.0.0.0:68, which prevents the custom DHCP client from running on eth1.
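
For reference, the restriction in /etc/dhclient.conf is along these lines (a minimal sketch, not the exact file; the host-name value is just a placeholder option):

# per-interface blocks scope dhclient's options to eth0 and eth2
interface "eth0" {
    send host-name "myhost";
}

interface "eth2" {
    send host-name "myhost";
}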




Any help appreciated, thanks.

Simulate a hard disk fail on Virtual Box




I'm testing some NAS setups using VirtualBox, with several virtual hard drives and software RAID.



I would like to test the behavior under certain failures; in particular, I would like to simulate that one of the hard disks broke and the RAID needs to be rebuilt...



Would it be enough to do a




cat /dev/urandom > /virtualdisk





Or, since the virtual disks are containers, would VirtualBox be unable to use the file, breaking the virtual machine?


Answer



I don't know that you can fail a hard drive this way in VBox (or any VM -- they're typically designed to pretend hardware is perfect). You can try it and see, but the results could be pretty awful...



A better strategy might be to shut down the VM & remove the disk, power on & do stuff, then shut down & re-add the disk. Another option is to use the software RAID administration tools to mark a drive as failed (almost all of them support this AFAIK), scribble on it from within the VM, then re-add it & watch the rebuild.
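
With Linux md RAID, for instance, that procedure looks roughly like this (a sketch; the array and partition names are placeholders):

# Mark a member as failed and remove it from the array
mdadm --manage /dev/md0 --fail /dev/sdb1
mdadm --manage /dev/md0 --remove /dev/sdb1

# Scribble on it if desired, then re-add it and watch the rebuild
mdadm --manage /dev/md0 --add /dev/sdb1
watch cat /proc/mdstat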



All told, however, the only real test of a drive failure is to put the OS on real hardware and yank out one of the disks -- this is the only way to know for sure how your OS will react on a given piece of hardware with its associated controller quirks.


Ping IPv4 addresses from an IPv6 host



So, I've been given to believe that IPv6-only clients can access IPv4 servers by using addresses like ::ffff:0:74.125.226.80 (that would be an address for google.com). I'm not on IPv6 yet, but I may be soon. I have a miredo/teredo tunnel set up and I can ping normal IPv6 addresses just fine, but when I run:



ping6 ::ffff:0:74.125.226.80



it fails (Destination unreachable: Address unreachable)



Am I misunderstanding something? Can I actually get to IPv4 hosts over my IPv6 connection?


Answer



::ffff:0:74.125.226.80 is IPv4-embedded, dotted-decimal notation, not a real, routable IPv6 address.



If you only have a full, world-routable IPv6 address (with a prefix, say a /48), then you cannot communicate with the IPv4 world without a special tunnel.



They are for all intents and purposes two different protocols. You have two choices for communication between the two:





  • Dual-stack. Have both IPv4 (behind a NAT if need be) AND IPv6 (with a world-routable IP and a link-local address) installed and configured. OSes will try to use IPv6 first, and fall back to IPv4 if that fails. Just make sure you're configured correctly (not with a dotted-decimal address like the one above) and it works pretty seamlessly.


  • Tunnelling. If you have an IPv6 device, it needs to be aware of an IPv4 tunnel that it can use to broker your connections to the IPv4 world.



Tuesday, March 22, 2016

CISCO: conditional port-based NAT (policy-based PAT)?

I have a problem NAT-translating inbound HTTP packets to different local IPs based on some condition (like DSCP bits being set). The DSCP bits actually get set on the incoming traffic using NBAR. I hope NBAR happens BEFORE NAT translation on the NAT outside interface; if not, a more sophisticated solution may be needed.




Here is what I need approximately:



nat inside source static tcp <local_ip_1> 80 interface <outside_if> 80
nat inside source static tcp <local_ip_2> 80 interface <outside_if> 80

Get PHP extension APC not running on Ubuntu 14.04.3 - undefined symbol: php_pcre_exec

I installed the PHP extension APC on the server but cannot get it running. It is not listed in phpinfo(). After activating the PHP error log I get the following error:



PHP Startup: Unable to load dynamic library '/var/www/vhosts/chroot/usr/lib64/php/modules/apcu.so: undefined symbol: php_pcre_exec in Unknown on line 0


I am using the following versions:
Ubuntu 14.04.3 / PHP 5.6.13 / Apache 2.4.7




The installation has been made with:



 sudo apt-get install php-apc


In phpinfo() I don't see any extension for APC. I just see that the following file has been added, "/etc/php5/apache2/conf.d/20-apcu.ini", which seems to get parsed in addition to php.ini; in the ini file is



 extension=apcu.so



The apcu.so file has been installed through apt-get and is in



/usr/lib/php5/20121212/apcu.so



/var/www/vhosts/chroot/usr/lib64/php/modules/apcu.so



Any hints on how this error "undefined symbol: php_pcre_exec" can be solved? Which additional libraries are missing?

domain name system - DNS Server replication created duplicate A-Records

I have several Windows Server 2008 R2 DC/DNS servers locally, RODCs at the remote office, and a Windows Server 2012 DC/DNS server on Azure with a VPN tunnel established.



Earlier today I moved a webserver, changed the DNS records on one of the local DNS servers, and updated at the registrar. Everything worked as expected.



Then weird issues started popping up, some people being directed to the wrong server, others to the correct one.



After troubleshooting, I checked the local DNS server again and the records were still correct, until I hit refresh, and the old A-Records popped up in conjunction with the new ones.



The way these records are set up is a forward lookup zone with one static A record using the parent domain for each zone.




So there ended up being two A records with different IPs for the URLs that I had changed earlier in the day, and the old records showed back up in DNS Manager when I refreshed the screen. (I had checked several times previously without refreshing.)



Fortunately this only affected internal users, and not all of them at that; all external users were unaffected because the public DNS records are published through a registrar (GoDaddy, independent and so unrelated).



What happened? And how can I prevent this from happening again?

Monday, March 21, 2016

MikroTik IPsec client Fortigate 'Received ESP packet with unknown SPI.'

We have a client with 6 sites using IPsec. Every now and again, possibly once a week, sometimes once a month, data just stops flowing from the remote Fortigate VPN server to the local MikroTik IPsec VPN client.



In order to demonstrate the symptoms of the problem I have attached a diagram. On the diagram's Installed SAs tab you will notice source IP address x.x.186.50 trying to communicate with x.x.7.3, but with 0 current bytes. x.x.186.50 is the client's remote Fortigate IPsec server, and x.x.7.73 is a MikroTik-based IPsec endpoint. It appears data from the remote side to us is not always flowing.



Phases 1 and 2 are always established, but traffic refuses to flow from the remote side to us.



We tried various things over time, such as rebooting, setting clocks, dabbling with the configuration, and rechecking the configuration again and again, but the problem appears entirely random -- and sometimes random things fix it. At one stage I had a theory that if the tunnel is initiated from their side it works, but fiddling with "Send Initial Contact" has not made any difference.



We've had many chats with the client about this, but they have many more international IPsec VPNs and only our MikroTik configuration is failing.




Fortigate log:



[screenshot: Fortigate log]
http://kb.fortinet.com/kb/microsites/microsite.do?cmd=displayKC&externalId=11654



Looking at Fortigate's knowledge base, it appears the SPIs don't agree and DPD would make a difference. But I have tried every single combination of DPD on this side to no avail. I would like to enable DPD on the other side, but I cannot due to change control, and also because the client says it's working on all the other sites with exactly the same configuration. EDIT: DPD was enabled.



Local VPN client diagram showing no traffic flow:




[screenshot: local VPN client diagram]



I have included a MikroTik log file showing continuous loops of "received a valid R-U-THERE, ACK sent":




echo: ipsec,debug,packet 84 bytes from x.x.7.183[500] to x.x.186.50[500]



echo: ipsec,debug,packet sockname x.x.7.183[500]



echo: ipsec,debug,packet send packet from x.x.7.183[500]




echo: ipsec,debug,packet send packet to x.x.186.50[500]



echo: ipsec,debug,packet src4 x.x.7.183[500]



echo: ipsec,debug,packet dst4 x.x.186.50[500]



echo: ipsec,debug,packet 1 times of 84 bytes message will be sent to x.x.186.50[500]



echo: ipsec,debug,packet 62dcfc38 78ca950b 119e7a34 83711b25 08100501 bc29fe11 00000054 fa115faf




echo: ipsec,debug,packet cd5023fe f8e261f5 ef8c0231 038144a1 b859c80b 456c8e1a 075f6be3 53ec3979



echo: ipsec,debug,packet 6526e5a0 7bdb1c58 e5714988 471da760 2e644cf8



echo: ipsec,debug,packet sendto Information notify.



echo: ipsec,debug,packet received a valid R-U-THERE, ACK sent





I've received various suggestions from IPsec experts and from MikroTik themselves implying that the problem is at the remote side. However, the situation is greatly compounded by the fact that 5 other sites are working and that the client's firewall is under change control. The setup also worked for many years, so they claim it cannot be a configuration error on their side. The following suggestion seems plausible, but I cannot implement it due to change control; I may only change the client side:




Make sure the IPSec responder has both passive=yes and
send-initial-contact=no set.




This did not work.
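
For the record, on RouterOS those two settings live on the IPsec peer entry. A minimal sketch, where the peer selector is a placeholder:

# make the peer wait for the remote side and suppress the initial-contact message
/ip ipsec peer set [find address="x.x.186.50/32"] passive=yes send-initial-contact=no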



EDIT 9 Dec 2013




I am pasting additional screenshots with the Fortigate configuration and what we believe are the Quick Mode selectors on the MikroTik side.



Phase 1 Fortigate screenshot



Phase 2 Fortigate screenshot



Quick Mode Selectors?



Let me reiterate that I don't think it's a configuration problem. I speculate it's a timing problem whereby side A or side B tries to send information too aggressively, putting the negotiated information (e.g. the SPI) out of sync.




EDIT 11 Dec 2013



Sadly I have to give up on this issue. Happily everything is working. Why it's working is still a mystery, but to further illustrate what we did I post another image inline.



We fixed it by:




  1. Turning off PPPoE at client.

  2. Installing completely new router (Router B) and tested at Border. It worked at Border.


  3. Switching off the new Router B at the border.
    AND THEN, WITHOUT MAKING A SINGLE CHANGE, the client's endpoint Router A started working. So just adding a duplicate router at the border and taking it offline again made the original router work.



So add this fix to the list of things we've done:




  1. Reboot. That worked once.

  2. Create a new tunnel with a new IP. That worked once, but only once. After changing the IP back, the client endpoint came alive again.

  3. Change time servers.


  4. Fiddle with every possible setting.

  5. Wait. Once, after a day, it just came right. This time, even after days, nothing came right.



So I postulate that there is an incompatibility on either the Fortigate or the MikroTik side which only manifests in very random situations. The only thing we haven't been able to try is upgrading the firmware on the Fortigate. Maybe there is a hidden corrupt configuration value or a timing issue invisible to the configurer.



I further speculate that the issue is a timing problem causing an SPI mismatch. My guess is the Fortigate doesn't want to "forget" the old SPI, as if DPD were not working. It happens randomly and, from what I can tell, only when endpoint A is a Fortigate and endpoint B is a MikroTik. The constant aggressive attempts to re-establish the connection "hold" on to old SPI values.



I'll add to this post when it happens again.




[diagram: the duplicate-router fix described above]



EDIT 12 Dec 2013



As expected, it happened again. As you may recall, we have 6 MikroTik client IPsec endpoint routers configured exactly the same, connecting to one Fortigate server. The latest incident again hit a random router, not the one I originally posted about here. Considering the last fix, where we installed the duplicate router, I took this shortcut:




  1. Disable Router A, the router that does not want to receive packets from Fortigate any more.

  2. Copy Router A's IPsec configuration to a temporary router closer to the border of our network.

  3. Immediately disable the newly created configuration.


  4. Re-enable Router A.

  5. Automagically it just starts working.



Looking at @mbrownnyc's comment, I believe we are having an issue with the Fortigate not forgetting stale SPIs even though DPD is on. I will investigate our client's firmware and post it.



Here is a new diagram, much like the last, but just showing my "fix":



[diagram: the shortcut fix]

virtualhost - Apache redirect seems to ignore ServerName

I have a couple of virtual host files. The first one redirects all traffic for http://www.mysite.com to the https version:




<VirtualHost *:80>
    ServerName www.mysite.com
    Redirect permanent / https://www.mysite.com/
</VirtualHost>

<VirtualHost *:443>
    ServerName www.mysite.com
    # additional configuration
</VirtualHost>



The second defines a VirtualDocumentRoot so that sites in directories such as /var/www/www.othersite.com will be served.




<VirtualHost *:80>
    ServerAdmin webmaster@localhost
    VirtualDocumentRoot /var/www/%0
    VirtualScriptAlias /var/www/%0/cgi-bin/
</VirtualHost>



I also have NameVirtualHost definitions in the ports.conf file (in case that is helpful):



NameVirtualHost *:443
NameVirtualHost *:80



The problem is that all traffic seems to be redirected to https://www.mysite.com. I would expect that the ServerName www.mysite.com line in the first virtual host file would only match www.mysite.com so that visiting http://www.othersite.com would serve content from /var/www/www.othersite.com. When I comment out the entire port 80 VirtualHost definition from the first file, http://www.othersite.com loads as expected. Am I missing something here? How do I only redirect http to https for www.mysite.com?



Edit:



The reason I can't put the ServerName as www.othersite.com in the second VirtualHost definition is that there are multiple sites in /var/www that also need to be served.

Project Server 2013 Active Directory Enterprise Resource Pool Synchronization failing

We're trying to configure Microsoft Project Server 2013 to synchronize our Active Directory users through the Active Directory Enterprise Resource Pool Synchronization tool.




When we try to add a distribution or security group to the "Active Directory group" field, the group can be selected via autocompletion without any trouble, but when we then use the "Save and Synchronize now" button, nothing happens.



There is also a message saying "The synchronization failed because the Active Directory group was empty or not found." in the "Synchronization status".



We've verified that the services that are used by the project server have read access to the Active Directory.



Is there something else we should check or any other clue to this error?

spf - Sending email from a Google Apps address



I have a website on a Linode VPS.



The email address that receives email is hosted in Google Apps, but I send email from Postfix.




For that to work and not cause problems with spam folders, I have created this SPF record:



v=spf1 include:_spf.google.com ~all


Unfortunately all email gets into the spam folder, especially on Hotmail.



Reading guides and answers here on Server Fault, I concluded that I need to add the ip4 mechanism to my SPF record, like this:



v=spf1 ip4:xxx.xxx.xxx.xxx include:_spf.google.com ~all



My emails contain a header image that is loaded remotely from the same domain. Obviously ISPs automatically hide this image, but is it a factor in classifying the email as spam? Do I need anything else, taking my setup into account? Is DKIM absolutely necessary?


Answer



You definitely need an ip4 mechanism in your record including the IP address of your server, since that is where you send email from.



An SPF record basically says: "The servers whose IPs are listed here can send email for this domain."



Since you listed only Google's servers but send via Postfix instead of through them, everyone is understandably judging your email as spam. (You said your email would come from Google, but sent it from elsewhere.)




To clarify, those are the servers your record says can send email for your domain:



sh-3.2$ dig _spf.google.com TXT +short
"v=spf1 include:_netblocks.google.com ?all"
sh-3.2$ dig _netblocks.google.com TXT +short
"v=spf1 ip4:216.239.32.0/19 ip4:64.233.160.0/19 ip4:66.249.80.0/20 ip4:72.14.192.0/18 ip4:209.85.128.0/17 ip4:66.102.0.0/20 ip4:74.125.0.0/16 ip4:64.18.0.0/20 ip4:207.126.144.0/20 ip4:173.194.0.0/16 ?all"


It would in fact be better to have no SPF record at all than to have the record above.




I suggest you change your record to something like:



v=spf1 ip4:xxx.xxx.xxx.xxx ~all
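
Once the new record is published, you can verify it the same way the Google records were queried above (replace example.com with your domain):

dig example.com TXT +short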


DKIM is helpful, especially with Hotmail, but you will already see an improvement if you fix your SPF (provided your email volumes are not too high and your email content is not spammy).



The Wikipedia article on SPF may also be helpful.


Sunday, March 20, 2016

linux - Red Hat Server Becomes Unresponsive After Primary Domain Controller is Shut Down

We have a number of servers (application servers, web servers & FTP servers) running Red Hat 5 that are all virtual. We also have a similar setup that is Windows-based. Yesterday, our infrastructure team needed to shut down the primary domain controller so they could move the physical server to a new rack. Their assumption was that as soon as the primary domain controller went down, the secondary domain controller would pick up. As soon as the primary domain controller powered down, the Linux-based app servers all slowed to a crawl, to the point that simply trying to log in via SSH took approximately 3 minutes.



Before we could finish troubleshooting the issue, the infrastructure team was able to bring the primary domain controller back on-line.



During the downtime of the primary domain controller, all Windows-based servers appeared to be functioning normally.




Our first thought was that the Linux servers didn't have the secondary domain controller listed as a DNS server, but this is not the case. The Red Hat servers don't tie into any AD functionality other than using it as a DNS server.
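
For reference, a resolv.conf listing both controllers would look something like the sketch below (placeholder addresses); the options line controls how long the resolver waits before giving up on a dead first server, which is the kind of delay that can make SSH logins crawl:

# /etc/resolv.conf -- placeholder addresses
# primary DC
nameserver 192.168.0.10
# secondary DC
nameserver 192.168.0.11
# fail over to the second server after 2s instead of the default 5s
options timeout:2 attempts:2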



Any thoughts on what else we could check? We're not really Linux sys admins, so I'm not sure if we're missing something pretty basic.

linux - How to SSH to ec2 instance in VPC private subnet via NAT server

I have created a VPC in aws with a public subnet and a private subnet. The private subnet does not have direct access to external network. S...