Saturday, September 28, 2019

scdpm - How to repeat failed tape backup attempt in DPM 2007?

How do you repeat a scheduled backup of a protection group if the first backup attempt failed (e.g. if there were no tapes in the library, or if there was an intermittent failure in the library or drive)?



I know that you can force a backup of a single protected element in the console by selecting "Create Recovery Point - Tape" in the context menu, but how do you force a tape backup of a whole protection group?

Friday, September 27, 2019

Connection refused in ssh tunnel to apache forward proxy setup



I am trying to set up a private forward proxy on a small server. I plan to use it during a conference to tunnel my internet access through an ssh tunnel to the proxy server.



So I created a virtual host inside Apache 2.2 with the proxy, proxy_http, and proxy_connect modules loaded.
I use this configuration:





<VirtualHost *:8080>
    ServerAdmin xxxxxxxxxxxxxxxxxxxx
    ServerName yyyyyyyyyyyyyyyyyyyy

    ErrorLog /var/log/apache2/proxy-error_log
    CustomLog /var/log/apache2/proxy-access_log combined

    ProxyRequests On

    <Proxy *>
        # deny access to all IP addresses except localhost
        Order deny,allow
        Deny from all
        Allow from 127.0.0.1
    </Proxy>

    # The following is my preference. Your mileage may vary.
    ProxyVia Block
    ## allow SSL proxy
    AllowCONNECT 443
</VirtualHost>





After restarting Apache, I create a tunnel from the client to the server:



#> ssh -L8080:localhost:8080 


and try to access the internet through that tunnel:




#> links -http-proxy localhost:8080 http://www.linux.org


I would expect to see the requested page. Instead I get a "connection refused" error. In the shell holding the ssh tunnel open I get this:




channel 3: open failed: connect failed: Connection refused




Anyone got an idea why this connection is refused?



Answer



I agree with CanOfSpam3 that using -D8080 is a better option than setting up a proxy with Apache. However, to answer your question, I would guess you have missed the Listen line that tells Apache to listen on port 8080 in addition to the usual ports. A <VirtualHost> section alone does not make Apache listen on the IP:port mentioned; you also need to ask Apache to listen on it with Listen. Here's the reference from Apache.
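
A minimal sketch of the likely fix, assuming the virtual host above is bound to port 8080 (on Debian-style layouts this goes in ports.conf or the main configuration, followed by an Apache reload):

# make Apache accept connections on port 8080 in addition to the defaults
Listen 8080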


permissions - Windows server share read-only from network and write locally



Question:



How do I Deny Change without Denying Read on Windows File Share Permissions?



Details:



I have a Windows 2012 R2 server. I would like to create a file share that is read-only to a group when accessed from the network (via the file share) but writable when the members of the group are connected via Remote Desktop.




I thought I would be able to accomplish this by making the folder writable from file permissions but deny write using the share permissions.



I am unable to set 'Deny Change' without also being forced to 'Deny Read' in the share permissions dialog. (I don't want to deny read...)


Answer



Local access is governed only by the NTFS permissions. Effective permissions over a share are the most restrictive combination of the NTFS permissions and the share permissions. Just make the share permissions read-only.
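
As a hedged sketch of that combination (the share name, path, and group are hypothetical):

:: NTFS: grant the group Modify, inherited by files and subfolders
icacls D:\Data /grant "MYDOMAIN\LabUsers":(OI)(CI)M

:: Share: expose the same folder read-only over the network
net share Data=D:\Data /GRANT:"MYDOMAIN\LabUsers",READ

Accessed locally (e.g. in a Remote Desktop session) only the NTFS Modify right applies; accessed via \\server\Data, the more restrictive READ share permission wins.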





Thursday, September 26, 2019

bind - Caching DNS server returns invalid ip address for external lookups

I'm trying to resolve a DNS issue and am running short of ideas. Google doesn't seem to be helping, either.



When I use my local caching name server to resolve external host names, it always returns 192.168.1.251. There are some examples below.



Where is this invalid address coming from, and more importantly, how can I correct the issue?



My setup:
Local Domain name Solwiz.net 192.168.0.*




Broadband router - internal address is 192.168.0.1
- DHCP: Disabled



Caching Nameserver: Bind 9
192.168.0.32
Debian Squeeze



Digging:



dig - host on local network works




$ dig @ns2 mail2.solwiz.net

; <<>> DiG 9.7.3 <<>> @ns2 mail2.solwiz.net
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 17568
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 1, ADDITIONAL: 1


;; QUESTION SECTION:
;mail2.solwiz.net. IN A

;; ANSWER SECTION:
mail2.solwiz.net. 259200 IN A 192.168.0.34

;; AUTHORITY SECTION:
solwiz.net. 259200 IN NS ns2.solwiz.net.

;; ADDITIONAL SECTION:

ns2.solwiz.net. 259200 IN A 192.168.0.32

;; Query time: 0 msec
;; SERVER: 192.168.0.32#53(192.168.0.32)
;; WHEN: Fri Aug 1 21:09:36 2014
;; MSG SIZE rcvd: 84


dig - host on external network returns incorrect IP




$ dig @ns2 www.google.ch

; <<>> DiG 9.7.3 <<>> @ns2 www.google.ch
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 16611
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 4, ADDITIONAL: 0

;; QUESTION SECTION:

;www.google.ch. IN A

;; ANSWER SECTION:
www.google.ch. 0 IN A 192.168.1.251

;; AUTHORITY SECTION:
google.ch. 333349 IN NS ns4.google.com.
google.ch. 333349 IN NS ns3.google.com.
google.ch. 333349 IN NS ns2.google.com.
google.ch. 333349 IN NS ns1.google.com.


;; Query time: 1 msec
;; SERVER: 192.168.0.32#53(192.168.0.32)
;; WHEN: Fri Aug 1 21:11:44 2014
;; MSG SIZE rcvd: 129


dig - host on external network returns incorrect IP



$ dig @ns2 www.microsoft.com


; <<>> DiG 9.7.3 <<>> @ns2 www.microsoft.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 5476
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 13, ADDITIONAL: 0

;; QUESTION SECTION:
;www.microsoft.com. IN A


;; ANSWER SECTION:
www.microsoft.com. 0 IN A 192.168.1.251

;; AUTHORITY SECTION:
com. 160501 IN NS j.gtld-servers.net.
com. 160501 IN NS k.gtld-servers.net.
com. 160501 IN NS h.gtld-servers.net.
com. 160501 IN NS e.gtld-servers.net.
com. 160501 IN NS f.gtld-servers.net.

com. 160501 IN NS d.gtld-servers.net.
com. 160501 IN NS m.gtld-servers.net.
com. 160501 IN NS l.gtld-servers.net.
com. 160501 IN NS a.gtld-servers.net.
com. 160501 IN NS i.gtld-servers.net.
com. 160501 IN NS c.gtld-servers.net.
com. 160501 IN NS b.gtld-servers.net.
com. 160501 IN NS g.gtld-servers.net.

;; Query time: 2 msec

;; SERVER: 192.168.0.32#53(192.168.0.32)
;; WHEN: Fri Aug 1 21:12:20 2014
;; MSG SIZE rcvd: 275


I dumped the cache with rndc dumpdb -all; there is a 192.168.1.* address mentioned.



I flushed the _default view's cache, and the entry for 192.168.1.* was gone.
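
For reference, the commands involved here are presumably along these lines (the dump file location varies by distribution):

rndc dumpdb -all    # write the cache contents to named_dump.db
rndc flush          # clear the cache of the _default view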



I repeated the dig for www.microsoft.com, and the entry is back:




;
; Start view _default
;
;
; Cache dump of view '_default' (cache _default)
;
$DATE 20140801194948

(several lines removed)


; Unassociated entries

(several lines removed)

; 192.168.1.251 [srtt 722240] [flags 00002000] [ttl 1780]

(lines to end of file removed)



From /etc/bind/named.conf.options



forwarders {
    8.8.8.8;
    62.2.24.162;
    62.2.17.60;
};


8.8.8.8 is, of course, Google's DNS.
The two 62.2.* addresses are my provider's DNS servers.




Querying the forwarders directly (from my Nameserver)



dig @8.8.8.8 www.google.ch



; <<>> DiG 9.7.3 <<>> @8.8.8.8 www.google.ch
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 17711

;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0


;; QUESTION SECTION:
;www.google.ch. IN A

;; ANSWER SECTION:
www.google.ch. 0 IN A 192.168.1.251

;; Query time: 0 msec

;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Sat Aug 2 15:36:51 2014
;; MSG SIZE rcvd: 47


The other forwarders give the same response.



A little background:
I have my main network wlan-bridged to the wlan router; some clients access the wlan router directly.
I've been experiencing connectivity and performance problems for some time. The internal network died completely yesterday; not even the direct wlan clients were getting service, although there was excellent service on the broadband router's ethernet ports.

My network technician disabled the WLAN functionality of the broadband adapter and connected an access point to one of the adapter's ethernet ports. The access point's default IP is 192.168.1.2, but he changed that to 192.168.0.2. As far as I know, he disabled any DHCP functionality in the access point. Since the change I've been having the IP resolution issue.

networking - netperf + iptables masquerade -> network unreachable



Why do iptables rules pass the netperf TCP_STREAM test through, but break UDP_STREAM?



I have a network:



   +----------------+
   |                |
[client]--[NAT]--[server]



On the NAT, I have added the following iptables rules:



$ iptables -t nat -L:



Chain PREROUTING (policy ACCEPT)
target prot opt source destination

Chain INPUT (policy ACCEPT)
target prot opt source destination


Chain OUTPUT (policy ACCEPT)
target prot opt source destination

Chain POSTROUTING (policy ACCEPT)
target prot opt source destination
MASQUERADE all -- anywhere anywhere


$ iptables -L:

Chain INPUT (policy ACCEPT)
target prot opt source destination



Chain FORWARD (policy ACCEPT)
target prot opt source destination
ACCEPT all -- anywhere anywhere state NEW,RELATED,ESTABLISHED
ACCEPT all -- anywhere anywhere

Chain OUTPUT (policy ACCEPT)
target prot opt source destination



It works well when I run ping server on the client, when I run netperf over TCP, and when I run netperf over UDP via the direct connection. But it does not work if I run:



$ netperf -H 192.168.2.10 -t UDP_STREAM -l 1
MIGRATED UDP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.2.10 () port 0 AF_INET : demo
send_data: data send error: errno 101
netperf: send_omni: send_data failed: Network is unreachable



It seems the iptables firewall blocks the UDP packets.



Is it so, and how do I configure it not to?


Answer



The answer is here: https://stackoverflow.com/a/24211455/1234026



In short: netperf disables IP routing (SO_DONTROUTE) by default for the UDP_STREAM test, so if the target address is in a different subnet, it fails to find a route. To make it behave normally, I need to supply the -R 1 flag as a test-specific option.
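
Putting that together, the working invocation should look like this (note the -- separating global options from test-specific ones):

# -R 1 re-enables routing for the UDP_STREAM test
netperf -H 192.168.2.10 -t UDP_STREAM -l 1 -- -R 1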


Wednesday, September 25, 2019

smtp - How to Stop Exchange Server 2003 sending out Mass Spam / Virus Email from Unknown Account?



I'm young and new to the scene, and I have been working on a family member's business server. It is an old server with a few problems, but it has one big main problem that I'm trying to fix now.
I'll try to be as clear and concise as I can in explaining what I need.



The Server Details:



The server is an old box with limited hardware:

  • 50GB C: drive with only 1.3GB free

  • 270GB total combined hard drive space, with 31GB free

  • Only 4GB of RAM

  • Intel Xeon processor @ 2.00GHz



The server also runs old software:

  • Small Business Server 2003 Service Pack 2

  • Exchange Server 2003



The Problem:



The server was recently blacklisted by our ISP. Upon calling them, they provided no support and said that it should be running fine. All configuration pointed to their connection being the problem, so we told the SMTP Virtual Server to create its own connectors using DNS. This fixed our outgoing email problems. However, now I see that we have an infected computer that may be sending out mass email.




This is where I start losing track of the situation. What happens is that the Exchange server queue contains many connectors to domains that don't exist, and message tracking shows mass emails being sent out from our Exchange. Presumably a virus or compromised computer is trying to send spam or viruses to other people through our Exchange server.



At first it was using a Gmail account. I blocked that account's privilege to send through the server, but then it came back using a different address. Now our queue is clogged even more: not just with failed email attempts, but also with NDRs that can't be delivered to a real domain, so they sit in the queue too. This has led, and will continue to lead, to blacklisting of our IP, as well as our server being clogged up.



I ran a test on MxToolbox for our mail server, and it said there wasn't an open relay, which I had initially suspected.



Older posts haven't provided instructions that I can follow, understand, or adapt to this server and my problem.



Solution?:





  • A solution to stop our Exchange sending out these mass emails / stop the virus, without manually blocking all accounts from sending email except ones I specify; otherwise there will be problems opening new accounts later.

  • A small explanation as to what's happened, so I understand it, can stop it, and have knowledge for the future.

  • And a quick way to clear the whole queue when I know no legitimate mail is being sent, as there are too many messages to delete using "find messages and delete with no NDR" inside the queues.

  • Hopefully these things will help me and others in the future to understand and fix what has happened.



Final Notes:



Any help at all will be greatly appreciated, as will any suggestions on where to look or things to try. I understand I'm not the best admin for mail servers; however, I'm doing this for free, as we don't have money to spend on IT support but really need these services for the business, and I'm doing the best I can.

If Any more detail or information is needed please don't hesitate to ask.



Thanks,
Jesse Hayward.


Answer



After further research and some common sense, I managed to find a solution.



If this is happening to you, go to the system's Event Viewer, then filter your events by MSExchangeTransport, and then by authentication. (This needs to have been previously turned on through the Exchange settings.) Once done, look to see which user account is being used to authenticate the emails being sent while the spam is going through.



Chances are that this account has an easily guessed password, or has let through a virus that has stolen its password. Therefore, disabling the account if necessary, or changing the password, should hopefully stop the problem from continuing.



Tuesday, September 24, 2019

apache 2.2 - Host Ruby on Rails Tracks on Apache2



I'm having problems getting Tracks to be hosted via Apache2 on Ubuntu 10.04. I've followed several tutorials, but none work. I've got the Tracks git repo in /var/lib/tracks and a symbolic link to /var/lib/tracks/public in /var/www. I've installed Passenger and enabled libapache2-mod-passenger. I've configured a VirtualHost, but I get a broken link page (not sure if it is a 404) when I go to http://localhost/tracks. I've temporarily disabled the default VirtualHost for troubleshooting, but when I go to http://localhost in Firefox I get the list of files in /var/lib/tracks/public, while in Chromium I get the default "It works!" index.html page.



/etc/apache2/mods-enabled/passenger.load




LoadModule passenger_module /usr/lib/apache2/modules/mod_passenger.so


/etc/apache2/mods-enabled/passenger.conf




PassengerRoot /home/erich/.gem/ruby/1.8/gems/passenger-3.0.2
PassengerRuby /usr/bin/ruby1.8




/etc/apache2/sites-enabled/tracks




<VirtualHost *:80>
    ServerName tracks.localhost
    ServerAdmin webmaster@localhost

    DocumentRoot /var/lib/tracks/public

    <Directory /var/lib/tracks/public>
        AllowOverride all
        Options -MultiViews
    </Directory>

    RailsBaseURI /tracks
    #<Directory ...>
    #    Options -MultiViews
    #</Directory>
</VirtualHost>





In the error.log I find the following messages:




[Sun Feb 06 08:33:51 2011] [error] [client 192.168.1.100] File does not exist: /var/lib/tracks/public/tracks
[Sun Feb 06 08:48:30 2011] [notice] caught SIGTERM, shutting down
[Sun Feb 06 08:48:31 2011] [error] *** Passenger could not be initialized because of this error: The Passenger spawn server script, '/home/erich/.gem/ruby/1.8/gems/passenger-3.0.2/lib/phusion_passenger/passenger-spawn-server', does not exist. Please check whether the 'PassengerRoot' option is specified correctly.
[Sun Feb 06 08:48:31 2011] [error] *** Passenger could not be initialized because of this error: The Passenger spawn server script, '/home/erich/.gem/ruby/1.8/gems/passenger-3.0.2/lib/phusion_passenger/passenger-spawn-server', does not exist. Please check whether the 'PassengerRoot' option is specified correctly.
[Sun Feb 06 08:48:31 2011] [notice] Apache/2.2.14 (Ubuntu) Phusion_Passenger/2.2.7 configured -- resuming normal operations



Am I missing something? I am entirely new to Ruby on Rails and just trying to get this app to be hosted by Apache so I don't have to be logged in on my server for it to run. I don't understand the dependencies Ruby on Rails has in order to work with Apache. If anyone knows of some beginner references for me to learn more about how Ruby on Rails works with Apache, I'd be grateful.


Answer



Specify the correct PassengerRoot in your Apache configuration file. It seems that the passenger gem is not installed in your home directory, or that it is a different version than 3.0.2.
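
A quick way to find the right value, assuming Passenger's tooling is on the PATH, is:

# prints the directory that PassengerRoot should point to
passenger-config --root

Then point PassengerRoot in /etc/apache2/mods-enabled/passenger.conf at whatever path that prints and restart Apache.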


Monday, September 23, 2019

software raid - mdadm on ubuntu 10.04 - raid5 of 4 disks, one disk missing after reboot



I'm having a problem with the raid array in a server (Ubuntu 10.04).



I've got a raid5 array of 4 disks - sd[cdef], created like this:




# partition disks
parted /dev/sdc mklabel gpt
parted /dev/sdc mkpart primary ext2 1 2000GB
parted /dev/sdc set 1 raid on
# create array
mdadm --create -v --level=raid5 --raid-devices=4 /dev/md2 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1


This has been running fine for a couple of months.




I just applied system updates and rebooted, and the raid5 - /dev/md2 - didn't come back on boot. When I re-assembled it with mdadm --assemble --scan, it seems to have come up with only 3 of the member drives - sdf1 is missing. Here's what I can find:



(Side-note: md0 & md1 are raid-1 built on a couple of drives, for / and swap respectively.)



root@dwight:~# mdadm --query --detail /dev/md2
/dev/md2:
Version : 00.90
Creation Time : Sun Feb 20 23:52:28 2011
Raid Level : raid5
Array Size : 5860540224 (5589.05 GiB 6001.19 GB)

Used Dev Size : 1953513408 (1863.02 GiB 2000.40 GB)
Raid Devices : 4
Total Devices : 3
Preferred Minor : 2
Persistence : Superblock is persistent

Update Time : Fri Apr 8 22:10:38 2011
State : clean, degraded
Active Devices : 3
Working Devices : 3

Failed Devices : 0
Spare Devices : 0

Layout : left-symmetric
Chunk Size : 64K

UUID : 1bb282b6:fe549071:3bf6c10c:6278edbc (local to host dwight)
Events : 0.140

Number Major Minor RaidDevice State

0 8 33 0 active sync /dev/sdc1
1 8 49 1 active sync /dev/sdd1
2 8 65 2 active sync /dev/sde1
3 0 0 3 removed


(Yes, the server's called Dwight; I'm a The Office fan :) )



So it thinks one drive (partition really) is missing, /dev/sdf1.




root@dwight:~# mdadm --detail --scan
ARRAY /dev/md0 level=raid1 num-devices=2 metadata=00.90 UUID=c7dbadaa:7762dbf7:beb6b904:6d3aed07
ARRAY /dev/md1 level=raid1 num-devices=2 metadata=00.90 UUID=1784e912:d84242db:3bf6c10c:6278edbc
mdadm: md device /dev/md/d2 does not appear to be active.
ARRAY /dev/md2 level=raid5 num-devices=4 metadata=00.90 UUID=1bb282b6:fe549071:3bf6c10c:6278edbc


What, what, /dev/md/d2? What's /dev/md/d2? I didn't create that.



root@dwight:~# cat /proc/mdstat

Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md2 : active raid5 sdc1[0] sde1[2] sdd1[1]
5860540224 blocks level 5, 64k chunk, algorithm 2 [4/3] [UUU_]

md_d2 : inactive sdf1[3](S)
1953513408 blocks

md1 : active raid1 sdb2[1] sda2[0]
18657728 blocks [2/2] [UU]


md0 : active raid1 sdb1[1] sda1[0]
469725120 blocks [2/2] [UU]

unused devices:


Ditto: md_d2? sd[cde]1 are in md2 properly, but sdf1 is missing (and seems to think it should be an array of its own?)



root@dwight:~# mdadm -v --examine /dev/sdf1
/dev/sdf1:

Magic : a92b4efc
Version : 00.90.00
UUID : 1bb282b6:fe549071:3bf6c10c:6278edbc (local to host dwight)
Creation Time : Sun Feb 20 23:52:28 2011
Raid Level : raid5
Used Dev Size : 1953513408 (1863.02 GiB 2000.40 GB)
Array Size : 5860540224 (5589.05 GiB 6001.19 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 2


Update Time : Fri Apr 8 21:40:42 2011
State : clean
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0
Checksum : 71136469 - correct
Events : 114


Layout : left-symmetric
Chunk Size : 64K

Number Major Minor RaidDevice State
this 3 8 81 3 active sync /dev/sdf1

0 0 8 33 0 active sync /dev/sdc1
1 1 8 49 1 active sync /dev/sdd1
2 2 8 65 2 active sync /dev/sde1
3 3 8 81 3 active sync /dev/sdf1



...so sdf1 thinks it's part of the md2 device, is that right?



When I run that on /dev/sdc1, I get:



root@dwight:~# mdadm -v --examine /dev/sdc1
/dev/sdc1:
Magic : a92b4efc
Version : 00.90.00

UUID : 1bb282b6:fe549071:3bf6c10c:6278edbc (local to host dwight)
Creation Time : Sun Feb 20 23:52:28 2011
Raid Level : raid5
Used Dev Size : 1953513408 (1863.02 GiB 2000.40 GB)
Array Size : 5860540224 (5589.05 GiB 6001.19 GB)
Raid Devices : 4
Total Devices : 3
Preferred Minor : 2

Update Time : Fri Apr 8 22:50:03 2011

State : clean
Active Devices : 3
Working Devices : 3
Failed Devices : 1
Spare Devices : 0
Checksum : 71137458 - correct
Events : 144

Layout : left-symmetric
Chunk Size : 64K


Number Major Minor RaidDevice State
this 0 8 33 0 active sync /dev/sdc1

0 0 8 33 0 active sync /dev/sdc1
1 1 8 49 1 active sync /dev/sdd1
2 2 8 65 2 active sync /dev/sde1
3 3 0 0 3 faulty removed



And when I try to add sdf1 back into the /dev/md2 array, I get a busy error:



root@dwight:~# mdadm --add /dev/md2 /dev/sdf1
mdadm: Cannot open /dev/sdf1: Device or resource busy


Help! How can I add sdf1 back into the md2 array?



Thanks,





Answer



mdadm -S /dev/md_d2, then try adding sdf1.
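
In other words, stop the stray inactive array that is holding sdf1 busy, then re-add the partition:

mdadm -S /dev/md_d2             # stop the bogus md_d2 array
mdadm --add /dev/md2 /dev/sdf1  # re-add sdf1; the array will then resync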


linux - post_max_size will not change, stuck at 128M, other settings work



I am trying to increase post_max_size to 1024M. I am using ini_get('post_max_size') and phpinfo() to check the setting, and
post_max_size shows as 128M in both.



some phpinfo() on the server:




PHP Version 5.5.9-1ubuntu4.7
System Linux Dalaran 3.13.0-49-generic #83-Ubuntu SMP Fri Apr 10 20:11:33 UTC 2015 x86_64
Build Date Mar 16 2015 20:43:56
Server API Apache 2.0 Handler
Configuration File (php.ini) Path /etc/php5/apache2
Loaded Configuration File /etc/php5/apache2/php.ini
Scan this dir for additional .ini files /etc/php5/apache2/conf.d
Apache Version Apache/2.4.7 (Ubuntu)
user_dir no value



/etc/php5/apache2/php.ini



669 ; Maximum size of POST data that PHP will accept.
670 ; Its value may be 0 to disable the limit. It is ignored if POST data reading
671 ; is disabled through enable_post_data_reading.
672 ; http://php.net/post-max-size
673 post_max_size = 1024M



I have restarted Apache after each change. I also ran a grep on all directories in /etc/php5 and found no other reference to post_max_size.



Changing upload_max_filesize and memory_limit works fine.



No user_dir is set, so it should not be loading any .user.ini files.



Nothing useful in apache error log.



If you have any ideas for what I can check next please tell me.


Answer




YES!
Based on what Zoredache said, I did a



:/etc/apache2$ rgrep post *


which found



sites-available/000-default.conf:       php_value post_max_size 128M



so a little



/etc/apache2/sites-available$ sudo vi 000-default.conf


I changed php_value post_max_size 128M to 1024M and I am done!


Sunday, September 22, 2019

ubuntu - New RAID hdparm slow



I just got a HP DL180 G6 that has 25X 146GB 15K SAS drives, 36GB RAM, 2X 2.0GHz Xeon 1333Mhz FSB. For fun I configured them all in a single RAID 0 and installed Ubuntu on it to see how fast it could get with 25 drives on a HP Smart Array P410 Controller w/ 512MB RAM.



When I ran hdparm -tT /dev/mapper/concorde--vg-root I get



Timing cached reads:   5658MB in  1.99 seconds = 2834.13 MB/sec
Timing buffered disk reads: 1192 MB in 3.00 seconds = 397.13 MB/sec



When I run the same command on my other server (HP DL360 G5 - 32GB RAM - 2X 2.66GHz 667Mhz FSB) that only has 4X 15K drives I get:



Timing cached reads:   13268 MB in  1.99 seconds = 6665.18 MB/sec
Timing buffered disk reads: 712 MB in 3.00 seconds = 237.17 MB/sec


I would have expected this to run 5 times faster than the old one, not slower. The server is intended to deliver streaming media, so I need super fast access and transfer to keep up with two 1Gb network ports I hope to max out at times, along with performing its other tasks.



I put together a bunch of copies of a 400MB MP4 file to get 45GB to copy from one directory to another, and it took 96 seconds, which just seems wrong given everything I have ever heard about the performance boost of RAID 0.




It is set up as a hardware RAID. Is there anything I need to do in Linux to take advantage of the extra speed that should be there? Does it matter which flavor of Linux I use? I am comfortable with CentOS and Ubuntu but could do others if needed.



Is there a different command I should use to measure performance? I tried using iotop and iostat yesterday to monitor the RAID usage and couldn't get either to report any usage while copying 2GB files over FTP, so I'm kind of stuck trying to set a benchmark, compare its performance across servers, and monitor it so I know when the hard drives are maxing out and need to be replaced with SSDs.


Answer



Wow... there's a lot to address here.




  • Disk performance isn't just about throughput. There's the notion of IOPS and latency and service times to contend with. Most workloads are a bit random in nature, so 25 enterprise disks in an array will always trump 4 disks from an IOPS perspective.


  • hdparm is not the right tool to benchmark enterprise storage. Look into purpose-built programs like iozone and fio.





An example iozone command that could be helpful for you is (run from a large directory on the disk array you wish to test): iozone -t1 -i0 -i1 -i2 -r1m -s72g
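
For comparison, a roughly equivalent fio run might look like this (a sketch only; the job parameters are illustrative, not tuned):

# 4 parallel jobs doing 1MB sequential reads with direct I/O, 8GB each
fio --name=seqread --rw=read --bs=1m --size=8g --numjobs=4 --direct=1 --group_reporting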




  • The design of this server means that your disk backplane is oversubscribed. There's an expander chip on the server and those 25 disks are sharing a 4-lane 6Gbps connection to the RAID controller. That means that you have a theoretical maximum throughput of 24Gbps (or 3,000 Megabyte/second) to the array. That's a ceiling, and you won't see performance beyond that point.


  • Ubuntu is almost never the best choice where hardware drivers and support are concerned. It's not officially supported on this server. CentOS or RHEL would be a better fit for this hardware.


  • HP Smart Array controllers have the ability to carve a group of disks (an array) into multiple logical drives of varying capacity and RAID levels. The example below shows a 4-disk array carved into three logical drives. One of the logical drives is configured with a different RAID level than the others.






Smart Array P420i in Slot 0 (Embedded) (sn: 0014380296200A0)



  logicaldrive 1 (72.0 GB, RAID 1+0, OK)
logicaldrive 2 (1024.0 GB, RAID 1+0, OK)
logicaldrive 3 (869.1 GB, RAID 5, OK)

physicaldrive 1I:2:1 (port 1I:box 2:bay 1, SAS, 900.1 GB, OK)
physicaldrive 1I:2:2 (port 1I:box 2:bay 2, SAS, 900.1 GB, OK)
physicaldrive 1I:2:3 (port 1I:box 2:bay 3, SAS, 900.1 GB, OK)
physicaldrive 1I:2:4 (port 1I:box 2:bay 4, SAS, 900.1 GB, OK)





  • At no point should you use RAID 0 for a logical drive here. If you can spare the space, RAID 1+0 will perform very well with this hardware combination.


  • You have LVM in place. That's not the best approach when working with these HP Smart Array RAID controllers. It's an additional layer of abstraction, and you won't see the best performance (although it can be tuned to work well).


  • Firmware. You'll want to update the firmware of the server and related components. Functionality improves with each HP RAID controller firmware revision.


  • RAID cache configuration. Make sure the RAID battery is healthy and that the cache read/write balance is right for your workload.


  • Filesystem choice. XFS is a good option for streaming media. But the rates you're asking for are relatively low. Any modern Linux filesystem should be fine.




domain name system - SMTP / HELO and RBL black listing



We've been having many issues with our external IP address being put on RBLs and blacklisted.



We are using 3rd-party hosted SMTP e-mail, and we have two locations using it. Our public domain is ABC.com. Our internal domain name is XYZ.com (chosen before my time), which is an actual registered domain that resolves to a legitimate company.



Our sister location seems not to have any issues. When using WatchGuard's ReputationAuthority service (http://www.reputationauthority.org) and entering ABC.com, there are two IP addresses that show up on the listing; they belong to our sister company, and the service also shows the external domain's reputation.




When we put our external IP address in, it shows as a "bad" IP and is on a few RBLs. When we put XYZ.com in, it has a good reputation and no IP addresses are listed.



I have asked to have an SPF record added to the hosting company's DNS, but the admin who manages it shot me down and will not add the SPF record.



Here is what our header information looks like. Does our internal domain, being a separately registered domain, have any influence on us coming up as a spammer in the HELO, or is it the fact that our external IP address does not resolve?



I am afraid if I continue to ask to be unblacklisted that we will be put on a permanent RBL.




Return-path:

Received: from [10.5.2.31] (helo=xmail09.myhosting.com)
by xsmtp02.mail2web.com with esmtps (TLS-1.0:DHE_RSA_AES_256_CBC_SHA1:32)
(Exim 4.63)
(envelope-from )
id 1UMgdB-0005B4-Nv
for XXXXX@xxxx.com; Mon, 01 Apr 2013 11:26:46 -0400
Received: (qmail 12365 invoked from network); 1 Apr 2013 15:26:45 -0000
Received: from unknown (HELO LOCALCOMPUTER.XYZ.com) (Authenticated-user:_someuser@ABC.com@[66.xxx.xxx.xxx])
(envelope-sender )
by xmail09.myhosting.com (qmail-ldap-1.03) with ESMTPA

for ; 1 Apr 2013 15:26:44 -0000
Date: Mon, 1 Apr 2013 11:26:43 -0400
From: Our User
To: Their User
Message-ID:
Mime-Version: 1.0
Content-Type: text/html
Content-Transfer-Encoding: 8bit
X-SA-RemoteMail: Yes
X-SA-Exim-Connect-IP: 10.5.2.31

X-SA-Exim-Mail-From: SOMEUSER@ABC.COM
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on xsa10.softcom.biz
X-Spam-Level:
X-Spam-Status: No, score=0.1 required=5.0 tests=AWL,BAYES_00,
HTML_IMAGE_ONLY_12,HTML_MESSAGE,MIME_HTML_ONLY,T_REMOTE_IMAGE,URIBL_BLOCKED
autolearn=no version=3.3.1
X-Spam-DCC: : xsa10 1324; Body=1 Fuz1=1 Fuz2=1
X-Spam-Pyzor:
Subject: lead
X-SA-Exim-Version: 4.2.1 (built Mon, 13 Oct 2008 12:27:24 -0400)

X-SA-Exim-Scanned: Yes (on xsmtp02.mail2web.com)




Answer



Solution: We changed our outgoing SMTP from the 3rd party to our ISP's server. Have not had any more black listing issues since. I'm assuming going through the 3rd party as well as a host of other people probably got the 3rd party's IP address blacklisted, and had nothing to do with us particularly.



Since the 3rd-party host has multiple accounts and multiple e-mail addresses, including spammy ones, this was creating the issue of being placed on RBLs. Real spam was originating from the 3rd party's servers and causing the entire server to be blacklisted. By using our local SMTP provider we were able to avoid this.


Saturday, September 21, 2019

domain name system - Check public DNS health and RFC compliance




Every now and again I like to run checks on my DNS servers to make sure they are running right and to RFC spec. I used to use the DNSTools website to do this, as it gave me a pretty good picture of what was going on: are all my servers responding to the outside world, and are the important records (NS and MX especially) still up and replicated right? Also to see if my MX records have managed to make it onto any blacklists.



Blacklists have always been kind of a pain as I haven't been able to find a reliable "one stop shop" that lets you check against most of the major blacklists out there.



I haven't used DNSTools in a while, and now they require you to pay (which I have nothing against; it's just hard to justify to the superiors when you have invested in a large internal monitoring solution and I'm just doing a "feel good" check).



What do my fellow sysadmins use to check on their DNS records?


Answer



It's not like the old DNSStuff, but http://www.iptools.com/ and http://www.mxtoolbox.com are good replacements.


Friday, September 20, 2019

hp proliant - HP DL360 G6 SAS-4xSATA splitter




I have an HP ProLiant DL360 G6 server with 1 CPU; it runs Windows Server 2012.


I wanted to connect extra SATA disks, so I bought a SAS-to-4xSATA splitter cable like this:
SAS - 4xSATA cable. The problem is that after connecting the cable to SATA disks (with external power), Windows Server / the machine does not see any of the disks attached to the splitter cable.
(I have already connected other SATA disks with PCIe cards.)



I don't know if I'm going about this the right way...

Do I need to configure something elsewhere (to see the extra disks on the splitter cable), or do something else?


Answer



The HP Smart Array P410i controller doesn't support JBOD mode for disks. Create new logical drives from the physical HDDs in the BIOS RAID setup when the server boots; Windows will only see logical drives on this controller. You can see the same answer at https://serverfault.com/questions/267751/hadoop-jbod-disk-configuration-on-hp-smart-array-410-i-disk-controller
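
If HP's hpacucli utility is installed, the same workaround can also be done from the running OS instead of the boot-time BIOS setup; a sketch (the slot and drive IDs are illustrative):

# create a single-disk RAID 0 logical drive so the OS can see the disk
hpacucli ctrl slot=0 create type=ld drives=1I:1:5 raid=0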


Thursday, September 19, 2019

How to Mount remote disc as local in Windows 7 and Windows 2003?

Is it possible to mount a remote file system (on a Windows 2003 server) as a local disk in Windows 7?
I have a Windows 2003 server with software RAID and some shares. On another computer, with Windows 7, I have software which can only access data from local or USB disks.



This solution doesn't work, because the program doesn't see files in the folder:
Mount Remote CIFS/SMB Share as a Folder not a Drive Letter.
Everything is on the same LAN.

Wednesday, September 18, 2019

ssms - Database not visible in SQL Server Management Studio

A colleague was overwriting a test database with a production database by doing a restore in SSMS (SQL Server 2005). He realized he had set the restore path incorrectly and canceled the operation. At this point the database disappeared from SSMS. The test database's .mdf and .ldf files are still in their expected locations. I thought the database had become detached and tried to reattach it, but received the error "cannot attach a database with the same name as an existing database". I tried connecting to the database using various clients, using both sa and Windows logins, without success. We tried restarting SQL Server and rebooting the server, but the test database did not reappear.



The only interesting things in the log were three entries saying SQL Server had encountered an occurrence of cachestore flush.




My SQL Server knowledge is limited. Any ideas on how to either get our test database working again or remove all signs of it so I can recreate it with the same name?

Tuesday, September 17, 2019

networking - squid specify outgoing network interface

I have a Debian Linux machine with many network interfaces (venet0:1 to venet0:5) running Squid. If I connect to interface venet0:2, Squid uses venet0:0 for outgoing traffic, but I want Squid to use the same network interface for outgoing connections. So if I connect to the IP address of venet0:1, the proxy should also use that interface for outgoing traffic.



Currently I use the following configuration:




http_port 200
forwarded_for off
uri_whitespace encode
visible_hostname localhost
via off
collapsed_forwarding on
auth_param basic program /usr/lib/squid/ncsa_auth /etc/squid/users
auth_param basic children 5
auth_param basic realm Proxy

auth_param basic credentialsttl 2 hours
auth_param basic casesensitive off
acl ncsa_users proxy_auth REQUIRED
access_log none
cache_store_log none
cache_log /dev/null
acl all src all
http_access allow ncsa_users
header_access From deny all
header_access Referer deny all

header_access Server deny all
header_access User-Agent allow all
header_access WWW-Authenticate deny all
header_access Link deny all
header_access Accept-Charset deny all
header_access Accept-Encoding deny all
header_access Accept-Language deny all
header_access Content-Language deny all
header_access Mime-Version deny all



I've tried the tutorial from http://www.tastyplacement.com/squid-proxy-multiple-outgoing-ip-addresses, but I don't think I can use it, because I authenticate users with NCSA auth and not by source IP address.



Is there any way to make Squid use the correct network interface? It would be nice to avoid ACL rules, because those would require config changes with every change of an IP address.

reverse proxy - Transparently forward SSH connections to NATed Servers



I've been trying this for a long time and I have not yet found a good solution. I have several servers behind a NAT that all run an SSH daemon. One of the machines is my main server, which gets the SSH port forwarded to it. What I want is basically to open a connection to the other NATed servers by going through the main server, similar to what I can achieve by opening a connection to the main server and then SSHing in to the destination. Since some applications run on top of SSH, I'd like to automate this in order to run rsync or git on top of the connection itself.



Is there a reverse proxy for SSH?


Answer



You can do this using ProxyCommand and netcat in .ssh/config:




# Your 'gateway' server.
Host gateway

# Any other server.
Host server1
ProxyCommand ssh gateway /bin/netcat %h %p


If you do ssh server1, you will open an SSH connection from your current location to your 'gateway' server, which will open a TCP connection to server1. This TCP connection will serve as the connection for SSH between your current location and server1.




Edit: This technique is commonly called 'ssh jumphost'.
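
For what it's worth, on newer OpenSSH releases (7.3 and later) the same effect can be achieved without netcat by using ProxyJump:

Host server1
ProxyJump gateway

or, one-off on the command line, ssh -J gateway server1. Older clients can instead use ProxyCommand ssh gateway -W %h:%p, which also avoids the netcat dependency.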


Monday, September 16, 2019

amazon ec2 - How do you create a zone apex alias that points to an Elastic Load Balancer in the Route 53 GUI?



I created aliases for my domain name's zone apex using the ELB CLI as described in the Elastic Load Balancing Developer Guide. I also added an AAAA record using the --rr-type AAAA flag, which is not described in the guide.



The Route 53 GUI is populated after I execute the elb-associate-route53-hosted-zone commands for A and AAAA records. I recorded how the records look in the GUI, deleted the records, and tried to re-create them using the GUI only. I receive the following error:




RRSet with DNS name example.com., type A contains an alias target that contains a hosted zone that is an invalid alias target.



I would like to use the Route 53 GUI to perform this operation. Does the Route 53 GUI support the creation of a zone apex alias that points to an Elastic Load Balancer?


Answer



An engineer on the Route 53 team informed me that the proprietary alias can be created in the Route 53 Console (the GUI).




Here are the steps.




  1. click create record set

  2. for zone apex record just leave the name field blank

  3. select the type of alias you want to make A or AAAA (all steps after this are the same for both types)

  4. Select the yes radio button.

  5. Open the EC2 console in another tab and navigate to the list of your load balancers.

  6. Click on the load balancer and look at the description tab in the pane below the list. Sample output below




DNS Name:
new-balancer-751654286.us-east-1.elb.amazonaws.com (A Record)
ipv6.new-balancer-751654286.us-east-1.elb.amazonaws.com (AAAA Record)
dualstack.new-balancer-751654286.us-east-1.elb.amazonaws.com (A or AAAA Record)



Note: Because the set of IP addresses associated with a LoadBalancer can change over time,
you should never create an “A” record with any specific IP address. If you want to use a friendly
DNS name for your LoadBalancer instead of the name generated by the Elastic Load Balancing
service, you should create a CNAME record for the LoadBalancer DNS name, or use Amazon Route 53
to create a hosted zone. For more information, see the Using Domain Names With Elastic Load Balancing at http://docs.amazonwebservices.com/ElasticLoadBalancing/latest/DeveloperGuide/using-domain-names-with-elb.html.




Status: 0 of 0 instances in service



Port Configuration: 80 (HTTP) forwarding to 80 (HTTP)



Stickiness: Disabled(edit)



Availability Zones:
us-east-1b



Source Security Group:

amazon-elb-sg



Owner Alias: amazon-elb



Hosted Zone ID:
Z3DZXD0Q79N41H




  7. Now copy the Hosted Zone ID, in the above case 'Z3DZXD0Q79N41H', and paste it into the field labeled 'Alias Hosted Zone ID:'

  8. Now copy the DNS Name, in the above case 'new-balancer-751654286.us-east-1.elb.amazonaws.com', and paste it into the field 'Alias DNS Name:'

    - Just an FYI, this DNS name is the same for both A and AAAA alias records. (Do not use 'ipv6.new-balancer-751654286.us-east-1.elb.amazonaws.com'.)

  9. Click create record set, or at this point you can select yes to weight the record and provide a weight between 0-255 and a set ID such as 'my load balancer'.


deployment - SCCM recurring OSD task sequence



Update:




The question below was solved with the help of the accepted answer below. However, the actual cause of the problem was due to a bug. I have added another answer to this question below that contains the details of this bug as well as details on a hotfix solution that has been released.



Question:



At my organization we have a lab of computers that must be reimaged every week. We are currently doing this via SCCM 2007. At the moment this is done by creating a new mandatory advertisement each week for a working OSD task sequence (TS). However, I would like to do this by setting one advertisement on a recurring schedule.



In order for a TS to run repeatedly on a machine, you must enable the advertisement option "Always rerun program", or the TS will only run once.



The problem I am running into is that when reimaging the machine, a new client gets installed and thus a new GUID is created. This means I must provide some automatic way to re-add that new client GUID to the collection where the recurring TS is advertised. Of course, since the client has a new GUID, SCCM thinks the TS has yet to run on this machine and begins the reimage as soon as it is re-added to the collection, effectively putting the machine into an infinite rebuild loop.




I have considered simply building the client into the image so that it maintains the same GUID through the reimage but there are other issues with that approach.



Any suggestions on how to setup a recurring TS that will reimage a machine once a week?



Edit:



To clarify a few things I will explain the situation a little better:




  • The OSD Task Sequence I am trying to run will take about an hour and

    a half to complete and this will occur around 3am. After the OS
    deployment is done another TS will need to run in order to install
    one last program that must be done through a separate TS due to certain
    program constraints.


  • Secondly, when I refer to the GUID above I am in fact referring to
    the SMS GUID that gets assigned to newly installed ConfigMgr clients.
    Of course there are other reasons a new SMS GUID would be created but
    those aren't of any concern in this situation.








Solution Details:



With the suggestion from newmanth below I did the following to resolve this issue:




  1. For the OSD Task Sequence and associated advertisement I set the
    following settings:





    • Maximum allowed run time (minutes) : 90 (TS Properties -> Advanced)

    • Program rerun behavior : Always rerun program (Advertisement Properties -> Schedule)

    • Advertisement Schedule : 3am, recurs once per week
       


  2. For the collection containing the computers in question I used the
    following settings:




    • Maintenance Window Duration : 3am - 4:35am, recurs once per week.

      I also check the option, "This schedule applies only

      to operating system deployment task sequences". This allows me to
      run my second TS mentioned above outside the maintenance window but
      prevents the rebuild recurring immediately after re-adding the
      client to the collection.

      A maintenance window must be greater than or equal to the max run time of the TS or program plus the Advertised Programs Client Agent countdown duration (mine was set to 5 minutes).
      Since my TS will have a max run time of 90 mins, I will have to set
      my window to 95 mins.


    • Collection Membership Update Schedule : 4:45am, recurs daily.

      Rebuild is complete, maintenance window closed at 4:35am. I
      now wait 10 mins for good measure and schedule a collection
      membership update in order to re-add the newly installed client. I
      could do this weekly on the same day as the rebuild but I do it

      daily for other reasons.

      Depending on how your collection
      adds new client members, you may also need to schedule your
      discovery methods to run before this update happens. For instance if
      your collection adds new client members based on an Active Directory
      group then you will need to run the respective Active Directory
      discovery methods first so that the newly created client record has
      its corresponding Active Directory information populated. Otherwise
      the new client record will not have any AD group info and it will not
      get added to the collection.






With the settings above the rebuild process should go something like this:




  1. Maintenance Window opens at 3am.

  2. OSD Task Sequence starts at 3am.

  3. OSD Task Sequence ends roughly 1 hour and a half later (4:30am).

  4. Maintenance Window closes at 4:35am preventing an immediate repeat of the TS.

  5. Collection Membership updates at 4:45am re-adding the newly installed client.


  6. After the client policy retrieval the second TS mentioned above runs.

  7. Steps 1-6 should automatically repeat themselves the following week.


Answer



I think you might be able to get this to work by setting a once-per-week maintenance window on the collection in question, in conjunction with always re-running the advertisement. Make sure the window is just long enough to allow the advertisement to run once. This will prevent a subsequent run until the maintenance window hits again. Technet: http://technet.microsoft.com/en-us/library/bb632801.aspx


Sunday, September 15, 2019

vpn - OpenVPN routing problem

I'm running an OpenVPN server on a VPS running Debian 5, and an OpenVPN client on Ubuntu 11.04 Desktop. I want all outgoing traffic from the client to be tunnelled through the VPN server.



I'm able to initiate a connection from the client to the server, and ping successfully between them, but when I try to access outside IP addresses from the client, I am not successful.



For instance, when pinging google by IP:



ping -n 74.125.91.106

PING 74.125.91.106 (74.125.91.106) 56(84) bytes of data.
^C
--- 74.125.91.106 ping statistics ---
7 packets transmitted, 0 received, 100% packet loss, time 6048ms


On the server side, I can see the ping requests coming through the tunnel, but no replies going back down:



tcpdump -i tun0
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode

listening on tun0, link-type RAW (Raw IP), capture size 65535 bytes
21:24:59.384736 IP 10.8.0.6 > qy-in-f106.1e100.net: ICMP echo request, id 7537, seq 1, length 64
21:25:00.391970 IP 10.8.0.6 > qy-in-f106.1e100.net: ICMP echo request, id 7537, seq 2, length 64
21:25:01.400394 IP 10.8.0.6 > qy-in-f106.1e100.net: ICMP echo request, id 7537, seq 3, length 64
21:25:02.408914 IP 10.8.0.6 > qy-in-f106.1e100.net: ICMP echo request, id 7537, seq 4, length 64
21:25:03.416378 IP 10.8.0.6 > qy-in-f106.1e100.net: ICMP echo request, id 7537, seq 5, length 64
21:25:04.424289 IP 10.8.0.6 > qy-in-f106.1e100.net: ICMP echo request, id 7537, seq 6, length 64
21:25:05.431804 IP 10.8.0.6 > qy-in-f106.1e100.net: ICMP echo request, id 7537, seq 7, length 64



I can also see these on the venet0:0 interface on the server:



tcpdump -i venet0:0 'icmp[icmptype] == icmp-echo or icmp[icmptype] == icmp-echoreply' 
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on venet0:0, link-type LINUX_SLL (Linux cooked), capture size 65535 bytes
21:39:11.397967 IP 10.8.0.6 > qy-in-f106.1e100.net: ICMP echo request, id 7588, seq 1, length 64
21:39:12.407609 IP 10.8.0.6 > qy-in-f106.1e100.net: ICMP echo request, id 7588, seq 2, length 64
21:39:13.415194 IP 10.8.0.6 > qy-in-f106.1e100.net: ICMP echo request, id 7588, seq 3, length 64
21:39:14.423050 IP 10.8.0.6 > qy-in-f106.1e100.net: ICMP echo request, id 7588, seq 4, length 64
21:39:15.431005 IP 10.8.0.6 > qy-in-f106.1e100.net: ICMP echo request, id 7588, seq 5, length 64

21:39:16.439687 IP 10.8.0.6 > qy-in-f106.1e100.net: ICMP echo request, id 7588, seq 6, length 64
21:39:17.446949 IP 10.8.0.6 > qy-in-f106.1e100.net: ICMP echo request, id 7588, seq 7, length 64


I can also ping google successfully from the server.



Any idea what might be causing this?



Server config file:





dev tun
server 10.8.0.0 255.255.255.0
ifconfig-pool-persist ipp.txt
ca ca.crt
cert server.crt
key server.key
dh dh1024.pem
push "route 10.8.0.0 255.255.255.0"
push "redirect-gateway"

comp-lzo
keepalive 10 60
ping-timer-rem
persist-tun
persist-key
group daemon
daemon


Client config file:





remote <<server IP>>
dev tun
comp-lzo
ca ca.crt
cert client1.crt
key client1.key
route-delay 2
route-method exe

redirect-gateway def1
dhcp-option DNS 10.8.0.1
verb 3

Friday, September 13, 2019

virtualhost - Web server showing IP address instead of domain name

I'm running a web server (Apache2) on CentOS 7. I have the following virtual host file. The site is being served as it should. However, the URL is changed in the browser (any browser) from the domain name verizondecom.com to the server's IP address.



What needs to be changed so that the URL is the domain name and not the server's IP?



NameVirtualHost *:80

<VirtualHost *:80>
    ServerName www.verizondecom.com
    ServerAlias verizondecom.com
    ErrorLog /var/log/httpd/verizondecom.err
    CustomLog /var/log/httpd/verizondecom.log combined
    DocumentRoot /var/www/www.verizondecom.com/public
    SetEnv ENVIRONMENT "production"

    <Directory /var/www/www.verizondecom.com/public>
        AllowOverride ALL
        Order allow,deny
        Allow from all
        Require all granted
    </Directory>
</VirtualHost>


linux - How can I prevent a DDOS attack on Amazon EC2?



One of the servers I use is hosted on the Amazon EC2 cloud. Every few months we appear to have a DDoS attack on this server. It slows the server down incredibly. After around 30 minutes, and sometimes a reboot later, everything is back to normal.



Amazon has security groups and a firewall, but what else should I have in place on an EC2 server to mitigate or prevent an attack?




From similar questions I've learned:




  • Limit the rate of requests/minute (or second) from a particular IP address via something like iptables (or maybe UFW?); a sketch follows this list

  • Have enough resources to survive such an attack - or -

  • Possibly build the web application so it is elastic / has an elastic load balancer and can quickly scale up to meet such high demand

  • If using MySQL, set up MySQL connections so that they run sequentially, so that slow queries won't bog down the system
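
For the first item, a minimal iptables sketch using the recent module (the port and thresholds are illustrative):

# record each new connection to port 80 per source IP
iptables -A INPUT -p tcp --dport 80 -m state --state NEW -m recent --set --name HTTP
# drop sources that open more than 25 new connections within 60 seconds
iptables -A INPUT -p tcp --dport 80 -m state --state NEW -m recent --update --seconds 60 --hitcount 25 --name HTTP -j DROP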




What else am I missing? I would love information about specific tools and configuration options (again, using Linux here), and/or anything that is specific to Amazon EC2.



ps: Notes about monitoring for DDOS would also be welcomed - perhaps with nagios? ;)


Answer



A DDoS (or even a DoS) is, in its essence, resource exhaustion. You will never be able to eliminate bottlenecks; you can only push them farther away.



On AWS you are lucky, because the network component is very strong: it would be very surprising to learn that the upstream link was saturated. However, the CPU, as well as disk I/O, is far easier to flood.



The best course of action is to start some monitoring (local, such as sar; remote, with Nagios and/or ScoutApp) and some remote logging facilities (syslog-ng). With such a setup, you will be able to identify which resources get saturated (network sockets due to a SYN flood; CPU due to bad SQL queries or crawlers; RAM due to …). Don't forget to put your log partition (if you don't have remote logging enabled) on an EBS volume (to study the logs later).




If the attack comes through the web pages, the access log (or the equivalent) can be very useful.


Thursday, September 12, 2019

ubuntu - Virtual server freeze (apache ?)



I rent a small virtual server at 1und1.de (link, but German only) (2GB RAM dynamic, at least 512MB, 20GB HDD). I chose to run Ubuntu 8.04 LTS as the operating system (64-bit). I installed apache2 + php5 + mysql via the Ubuntu repositories, and later eAccelerator.



I'm running some development stuff and a production site. The site is a kind of directory; it has a few visitors (250 per day) and a lot of pages (about 7.5k). Every few days the server freezes. This means it's up and can be pinged, but any other action results in "server refused connection". The admin panel says that my kmemsize is too high, and there are also a lot of spawned Apache processes.




It seems that Apache consumes all my resources (and it also seems that these freezes start when Google or another crawler begins to crawl the site).



Then I tried to avoid this freezes:




  • I lowered MaxKeepAliveRequests and KeepAliveTimeout in the Apache config

  • I set MaxRequestsPerChild in the prefork settings, to keep Apache workers recycled more often (an illustrative snippet follows this list)
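
Concretely, those settings look something like this (the values are illustrative, not the poster's actual numbers):

# apache2.conf - keep idle connections short and recycle workers regularly
KeepAliveTimeout 5
MaxKeepAliveRequests 100

<IfModule mpm_prefork_module>
    MaxRequestsPerChild 500
</IfModule>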



This seems to improve the situation, but freezes are still happening.




Anyone have any idea what could / should be changed?





Answer



Figure out what the problem actually is:
Don't start tuning until you have used something like top or the ps command to see what is actually using the memory. It could be anywhere in the stack (MySQL, PHP, Apache). If it is Apache...



Switch to a lighter HTTP daemon:
Have you considered a lighter HTTP daemon, such as lighttpd or nginx?


Consider a different MPM with Apache:
I would test this a lot before pushing it into production, but you might consider switching to the worker (instead of prefork) Multi-Processing Module (MPM). This article says this was used at dealnews.com and it helped with memory usage. I haven't done this with Ubuntu, but I think it is just:



sudo apt-get remove apache2-mpm-prefork
sudo apt-get install apache2-mpm-worker


But you might want to consider building Apache from source so you only have the modules you need; that can be kind of a big project, though. Also, from the article, keep in mind:





This is an important part. You can't
use radical extensions in PHP when you
are using worker.




Tune PHP as Well:
This IBM developerWorks article has some PHP tuning options that might help reduce memory usage as well.



Tune MySQL as Well:
The third article in the above IBM LAMP tuning series talks about MySQL tuning. MySQL can end up using quite a bit of memory.


Wednesday, September 11, 2019

proxy - NGINX for TCP DDOS Protection

I require a TCP reverse proxy to protect my server's IP. I need something like these, which work fine: https://xhosts.uk/ddosprotection or https://www.hostsavor.com/proxies.
I was wondering if I could use NGINX to achieve this, as NGINX is what I currently use for my DDoS protection.

Saturday, September 7, 2019

windows server 2008 - Mapping Drive Error - System Error 1808

A vendor is attempting to map and preserve a network drive using NT AUTHORITY\SYSTEM, so it stays persistent when the interactive session on the server is lost. They were able to do this on one server (Windows 2008 R2) but not on a second computer (also Windows 2008 R2).



D:\PsExec.exe -s cmd.exe


PsExec v1.98 - Execute processes remotely
Copyright (C) 2001-2010 Mark Russinovich
Sysinternals - www.sysinternals.com

Microsoft Windows [Version 6.1.7600]
Copyright (c) 2009 Microsoft Corporation. all rights reserved.

C:\Windows\system32>whoami
nt authority\system


C:\Windows\system32>net use
New connections will be remembered.

Status Local Remote Network
--------------------------------------------------------------------
OK X: \\netapp1\share1 Microsoft Windows Network
The command completed successfully.

C:\Windows\system32>net use q: \\netapp1\share1

System error 1808 has occurred.

The account used is a computer account. Use your global user account
or local user account to access this server.

C:\Windows\system32>


I am unsure how to set up a "machine account mapping" which will preserve the drive letter of the NetApp path being mapped, so that the service account running a Windows service can continue to access the share after the interactive logon has expired on the server. Since they were able to do this on one server but not another, I'm not sure how to troubleshoot the problem. Any suggestions?

Friday, September 6, 2019

domain name system - Regarding gmail SPF record and A record



I have a domain with the following SPF record,



"v=spf1 +a +mx +ip4:123.45.67.89 ~all"



Two questions,




  1. Is the IP necessary there? The A record on the domain resolves to the same IP, i.e. 123.45.67.89.

  2. I've created an email on the domain and added it to Gmail to send and receive emails. The emails are working fine: I am able to send emails and they don't get the warning "Google cannot verify if the domain actually sent the email or not". Do I need to add any Gmail SPF record to it? I'm asking about the v=spf1 include:_spf.google.com record.


Answer





  1. If you have exactly the same IP (or the same A record) in your a mechanism (or mx mechanism), the ip4 mechanism is unnecessary and CAN (rather than must) be removed.



    As no domain is specified in your +a & +mx, the current domain is used, while ip4 & ip6 must always have an address or a network/prefix-length specified.


  2. With the current SPF record, Google falls within ~all, causing SoftFail, i.e. "The SPF record has designated the host as NOT being allowed to send but is in transition". Therefore the receiving MTA shouldn't REJECT the mail, but it can mark it as spam. With -all it would have been rejected.



    Therefore, include:_spf.google.com is necessary if you need to send email from Gmail. However, you should not add another TXT record, but combine the two into one, e.g.



    @ IN TXT "v=spf1 +a +mx include:_spf.google.com ~all"



    It's also possible (and even suggested in the documentation of the include: mechanism) to make the included domain Neutral rather than Pass. If Gmail is only used occasionally and you want to prevent other Gmail users from getting Pass results on SPF tests, this can be achieved with:



    @ IN TXT "v=spf1 +a +mx ?include:_spf.google.com ~all"
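    After publishing either record, a quick way to verify what resolvers actually see (illustrative; substitute the real domain for example.com):

    dig +short TXT example.com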


Thursday, September 5, 2019

nginx - Recurrent 502, 504 errors despite low cpu and memory usage

I've been struggling with this problem for quite a while. I have a t2 instance on AWS hosting a very low-traffic personal website built with WordPress, nothing extraordinary. I first used Apache2, and after a few months running perfectly I suddenly started to get some 502 and 504 errors. At that time I decided to move to Nginx/php5-fpm; things went well, with no more 502 or 504 errors for a while. Since yesterday it has all started happening again. I noticed that if I restart php5-fpm or mysql the site is accessible again, but only for 5-10 minutes before giving 502 or 504 errors again.



I have followed multiple threads with similar problems and nothing really worked, so I give up and ask for help here. I changed the nginx and php5-fpm configuration multiple times without any lasting improvement. I can see in htop that there are a lot of MySQL processes, and I have started suspecting that something (WordPress or ...?) is creating a bunch of unnecessary connections to MySQL, but I don't know how to investigate further.



Here is my htop output when things go wrong:



[htop screenshot]




My nginx config:



server {
    listen 80;
    root /var/www/html/mysite;
    index index.php index.html index.htm;
    server_name my-site.com;

    location / {
        try_files $uri $uri/ /index.php?q=$uri&$args;
    }

    location ~ \.php$ {
        try_files $uri =404;
        #fastcgi_pass unix:/var/run/php5-fpm.sock;
        fastcgi_read_timeout 150;
        fastcgi_pass 127.0.0.1:9000;
        fastcgi_buffer_size 16k;
        fastcgi_buffers 4 16k;
        fastcgi_index index.php;
        include fastcgi_params;
    }

    location = /favicon.ico {
        log_not_found off;
        access_log off;
    }

    location = /robots.txt {
        allow all;
        log_not_found off;
        access_log off;
    }

    location ~ /\. {
        deny all;
    }

    location ~* /(?:uploads|files)/.*\.php$ {
        deny all;
    }
}


My /etc/php5/fpm/pool.d/www.conf file for the php5-fpm configuration:



; Start a new pool named 'www'.
; the variable $pool can we used in any directive and will be replaced by the
; pool name ('www' here)

[www]

; Per pool prefix
; It only applies on the following directives:
; - 'slowlog'
; - 'listen' (unixsocket)
; - 'chroot'
; - 'chdir'
; - 'php_values'
; - 'php_admin_values'

; When not set, the global prefix (or /usr) applies instead.
; Note: This directive can also be relative to the global prefix.
; Default Value: none
;prefix = /path/to/pools/$pool

; Unix user/group of processes
; Note: The user is mandatory. If the group is not set, the default user's group
; will be used.
user = www-data
group = www-data


; The address on which to accept FastCGI requests.
; Valid syntaxes are:
; 'ip.add.re.ss:port' - to listen on a TCP socket to a specific address on
; a specific port;
; 'port' - to listen on a TCP socket to all addresses on a
; specific port;
; '/path/to/unix/socket' - to listen on a unix socket.
; Note: This value is mandatory.
listen = 127.0.0.1:9000


; Set listen(2) backlog.
; Default Value: 65535 (-1 on FreeBSD and OpenBSD)
;listen.backlog = 65535

; Set permissions for unix socket, if one is used. In Linux, read/write
; permissions must be set in order to allow connections from a web server. Many
; BSD-derived systems allow connections regardless of permissions.
; Default Values: user and group are set as the running user
; mode is set to 0660

listen.owner = www-data
listen.group = www-data
;listen.mode = 0660

; List of ipv4 addresses of FastCGI clients which are allowed to connect.
; Equivalent to the FCGI_WEB_SERVER_ADDRS environment variable in the original
; PHP FCGI (5.2.2+). Makes sense only with a tcp listening socket. Each address
; must be separated by a comma. If this value is left blank, connections will be
; accepted from any ip address.
; Default Value: any

;listen.allowed_clients = 127.0.0.1

; Specify the nice(2) priority to apply to the pool processes (only if set)
; The value can vary from -19 (highest priority) to 20 (lower priority)
; Note: - It will only work if the FPM master process is launched as root
; - The pool processes will inherit the master process priority
; unless it specified otherwise
; Default Value: no set
; priority = -19


; Choose how the process manager will control the number of child processes.
; Possible Values:
; static - a fixed number (pm.max_children) of child processes;
; dynamic - the number of child processes are set dynamically based on the
; following directives. With this process management, there will be
; always at least 1 children.
; pm.max_children - the maximum number of children that can
; be alive at the same time.
; pm.start_servers - the number of children created on startup.
; pm.min_spare_servers - the minimum number of children in 'idle'

; state (waiting to process). If the number
; of 'idle' processes is less than this
; number then some children will be created.
; pm.max_spare_servers - the maximum number of children in 'idle'
; state (waiting to process). If the number
; of 'idle' processes is greater than this
; number then some children will be killed.
; ondemand - no children are created at startup. Children will be forked when
; new requests will connect. The following parameter are used:
; pm.max_children - the maximum number of children that

; can be alive at the same time.
; pm.process_idle_timeout - The number of seconds after which
; an idle process will be killed.
; Note: This value is mandatory.
pm = dynamic

; The number of child processes to be created when pm is set to 'static' and the
; maximum number of child processes when pm is set to 'dynamic' or 'ondemand'.
; This value sets the limit on the number of simultaneous requests that will be
; served. Equivalent to the ApacheMaxClients directive with mpm_prefork.

; Equivalent to the PHP_FCGI_CHILDREN environment variable in the original PHP
; CGI. The below defaults are based on a server without much resources. Don't
; forget to tweak pm.* to fit your needs.
; Note: Used when pm is set to 'static', 'dynamic' or 'ondemand'
; Note: This value is mandatory.
pm.max_children = 5

; The number of child processes created on startup.
; Note: Used only when pm is set to 'dynamic'
; Default Value: min_spare_servers + (max_spare_servers - min_spare_servers) / 2

pm.start_servers = 2

; The desired minimum number of idle server processes.
; Note: Used only when pm is set to 'dynamic'
; Note: Mandatory when pm is set to 'dynamic'
pm.min_spare_servers = 1

; The desired maximum number of idle server processes.
; Note: Used only when pm is set to 'dynamic'
; Note: Mandatory when pm is set to 'dynamic'

pm.max_spare_servers = 3

; The number of seconds after which an idle process will be killed.
; Note: Used only when pm is set to 'ondemand'
; Default Value: 10s
;pm.process_idle_timeout = 10s;

; The number of requests each child process should execute before respawning.
; This can be useful to work around memory leaks in 3rd party libraries. For
; endless request processing specify '0'. Equivalent to PHP_FCGI_MAX_REQUESTS.

; Default Value: 0
pm.max_requests = 500

; The URI to view the FPM status page. If this value is not set, no URI will be
; recognized as a status page. It shows the following informations:
; pool - the name of the pool;
; process manager - static, dynamic or ondemand;
; start time - the date and time FPM has started;
; start since - number of seconds since FPM has started;
; accepted conn - the number of request accepted by the pool;

; listen queue - the number of request in the queue of pending
; connections (see backlog in listen(2));
; max listen queue - the maximum number of requests in the queue
; of pending connections since FPM has started;
; listen queue len - the size of the socket queue of pending connections;
; idle processes - the number of idle processes;
; active processes - the number of active processes;
; total processes - the number of idle + active processes;
; max active processes - the maximum number of active processes since FPM
; has started;

; max children reached - number of times, the process limit has been reached,
; when pm tries to start more children (works only for
; pm 'dynamic' and 'ondemand');
; Value are updated in real time.
; Example output:
; pool: www
; process manager: static
; start time: 01/Jul/2011:17:53:49 +0200
; start since: 62636
; accepted conn: 190460

; listen queue: 0
; max listen queue: 1
; listen queue len: 42
; idle processes: 4
; active processes: 11
; total processes: 15
; max active processes: 12
; max children reached: 0
;
; By default the status page output is formatted as text/plain. Passing either

; 'html', 'xml' or 'json' in the query string will return the corresponding
; output syntax. Example:
; http://www.foo.bar/status
; http://www.foo.bar/status?json
; http://www.foo.bar/status?html
; http://www.foo.bar/status?xml
;
; By default the status page only outputs short status. Passing 'full' in the
; query string will also return status for each pool process.
; Example:

; http://www.foo.bar/status?full
; http://www.foo.bar/status?json&full
; http://www.foo.bar/status?html&full
; http://www.foo.bar/status?xml&full
; The Full status returns for each process:
; pid - the PID of the process;
; state - the state of the process (Idle, Running, ...);
; start time - the date and time the process has started;
; start since - the number of seconds since the process has started;
; requests - the number of requests the process has served;

; request duration - the duration in µs of the requests;
; request method - the request method (GET, POST, ...);
; request URI - the request URI with the query string;
; content length - the content length of the request (only with POST);
; user - the user (PHP_AUTH_USER) (or '-' if not set);
; script - the main script called (or '-' if not set);
; last request cpu - the %cpu the last request consumed
; it's always 0 if the process is not in Idle state
; because CPU calculation is done when the request
; processing has terminated;

; last request memory - the max amount of memory the last request consumed
; it's always 0 if the process is not in Idle state
; because memory calculation is done when the request
; processing has terminated;
; If the process is in Idle state, then informations are related to the
; last request the process has served. Otherwise informations are related to
; the current request being served.
; Example output:
; ************************
; pid: 31330

; state: Running
; start time: 01/Jul/2011:17:53:49 +0200
; start since: 63087
; requests: 12808
; request duration: 1250261
; request method: GET
; request URI: /test_mem.php?N=10000
; content length: 0
; user: -
; script: /home/fat/web/docs/php/test_mem.php

; last request cpu: 0.00
; last request memory: 0
;
; Note: There is a real-time FPM status monitoring sample web page available
; It's available in: ${prefix}/share/fpm/status.html
;
; Note: The value must start with a leading slash (/). The value can be
; anything, but it may not be a good idea to use the .php extension or it
; may conflict with a real PHP file.
; Default Value: not set

;pm.status_path = /status

; The ping URI to call the monitoring page of FPM. If this value is not set, no
; URI will be recognized as a ping page. This could be used to test from outside
; that FPM is alive and responding, or to
; - create a graph of FPM availability (rrd or such);
; - remove a server from a group if it is not responding (load balancing);
; - trigger alerts for the operating team (24/7).
; Note: The value must start with a leading slash (/). The value can be
; anything, but it may not be a good idea to use the .php extension or it

; may conflict with a real PHP file.
; Default Value: not set
;ping.path = /ping

; This directive may be used to customize the response of a ping request. The
; response is formatted as text/plain with a 200 response code.
; Default Value: pong
;ping.response = pong

; The access log file

; Default: not set
;access.log = log/$pool.access.log

; The access log format.
; The following syntax is allowed
; %%: the '%' character
; %C: %CPU used by the request
; it can accept the following format:
; - %{user}C for user CPU only
; - %{system}C for system CPU only

; - %{total}C for user + system CPU (default)
; %d: time taken to serve the request
; it can accept the following format:
; - %{seconds}d (default)
; - %{miliseconds}d
; - %{mili}d
; - %{microseconds}d
; - %{micro}d
; %e: an environment variable (same as $_ENV or $_SERVER)
; it must be associated with embraces to specify the name of the env

; variable. Some exemples:
; - server specifics like: %{REQUEST_METHOD}e or %{SERVER_PROTOCOL}e
; - HTTP headers like: %{HTTP_HOST}e or %{HTTP_USER_AGENT}e
; %f: script filename
; %l: content-length of the request (for POST request only)
; %m: request method
; %M: peak of memory allocated by PHP
; it can accept the following format:
; - %{bytes}M (default)
; - %{kilobytes}M

; - %{kilo}M
; - %{megabytes}M
; - %{mega}M
; %n: pool name
; %o: output header
; it must be associated with embraces to specify the name of the header:
; - %{Content-Type}o
; - %{X-Powered-By}o
; - %{Transfert-Encoding}o
; - ....

; %p: PID of the child that serviced the request
; %P: PID of the parent of the child that serviced the request
; %q: the query string
; %Q: the '?' character if query string exists
; %r: the request URI (without the query string, see %q and %Q)
; %R: remote IP address
; %s: status (response code)
; %t: server time the request was received
; it can accept a strftime(3) format:
; %d/%b/%Y:%H:%M:%S %z (default)

; %T: time the log has been written (the request has finished)
; it can accept a strftime(3) format:
; %d/%b/%Y:%H:%M:%S %z (default)
; %u: remote user
;
; Default: "%R - %u %t \"%m %r\" %s"
;access.format = "%R - %u %t \"%m %r%Q%q\" %s %f %{mili}d %{kilo}M %C%%"

; The log file for slow requests
; Default Value: not set

; Note: slowlog is mandatory if request_slowlog_timeout is set
;slowlog = /var/log/php5-fpm.log

; The timeout for serving a single request after which a PHP backtrace will be
; dumped to the 'slowlog' file. A value of '0s' means 'off'.
; Available units: s(econds)(default), m(inutes), h(ours), or d(ays)
; Default Value: 0
;request_slowlog_timeout = 10s

; The timeout for serving a single request after which the worker process will

; be killed. This option should be used when the 'max_execution_time' ini option
; does not stop script execution for some reason. A value of '0' means 'off'.
; Available units: s(econds)(default), m(inutes), h(ours), or d(ays)
; Default Value: 0
request_terminate_timeout = 30s

; Set open file descriptor rlimit.
; Default Value: system defined value
;rlimit_files = 1024
; Set max core size rlimit.

; Possible Values: 'unlimited' or an integer greater or equal to 0
; Default Value: system defined value
;rlimit_core = 0

; Chroot to this directory at the start. This value must be defined as an
; absolute path. When this value is not set, chroot is not used.
; Note: you can prefix with '$prefix' to chroot to the pool prefix or one
; of its subdirectories. If the pool prefix is not set, the global prefix
; will be used instead.
; Note: chrooting is a great security feature and should be used whenever

; possible. However, all PHP paths will be relative to the chroot
; (error_log, sessions.save_path, ...).
; Default Value: not set
;chroot =

; Chdir to this directory at the start.
; Note: relative path can be used.
; Default Value: current directory or / when chroot
chdir = /


; Redirect worker stdout and stderr into main error log. If not set, stdout and
; stderr will be redirected to /dev/null according to FastCGI specs.
; Note: on highloaded environement, this can cause some delay in the page
; process time (several ms).
; Default Value: no
;catch_workers_output = yes

; Limits the extensions of the main script FPM will allow to parse. This can
; prevent configuration mistakes on the web server side. You should only limit
; FPM to .php extensions to prevent malicious users to use other extensions to

; exectute php code.
; Note: set an empty value to allow all extensions.
; Default Value: .php
;security.limit_extensions = .php .php3 .php4 .php5

; Pass environment variables like LD_LIBRARY_PATH. All $VARIABLEs are taken from
; the current environment.
; Default Value: clean env
;env[HOSTNAME] = $HOSTNAME
;env[PATH] = /usr/local/bin:/usr/bin:/bin

;env[TMP] = /tmp
;env[TMPDIR] = /tmp
;env[TEMP] = /tmp

; Additional php.ini defines, specific to this pool of workers. These settings
; overwrite the values previously defined in the php.ini. The directives are the
; same as the PHP SAPI:
; php_value/php_flag - you can set classic ini defines which can
; be overwritten from PHP call 'ini_set'.
; php_admin_value/php_admin_flag - these directives won't be overwritten by

; PHP call 'ini_set'
; For php_*flag, valid values are on, off, 1, 0, true, false, yes or no.


; Defining 'extension' will load the corresponding shared extension from
; extension_dir. Defining 'disable_functions' or 'disable_classes' will not
; overwrite previously defined php.ini values, but will append the new value
; instead.

; Note: path INI options can be relative and will be expanded with the prefix
; (pool, global or /usr)

; Default Value: nothing is defined by default except the values in php.ini and
; specified at startup with the -d argument

;php_admin_value[sendmail_path] = /usr/sbin/sendmail -t -i -f www@my.domain.com
;php_flag[display_errors] = off
;php_admin_value[error_log] = /var/log/fpm-php.www.log
;php_admin_flag[log_errors] = on
;php_admin_value[memory_limit] = 32M


My /var/log/nginx/error.log



2016/09/15 16:48:11 [error] 29608#0: *62681 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 180.76.15.19, server: my-site.com, request: "GET / HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "www.my-site.com"

2016/09/15 16:49:29 [error] 29608#0: *62801 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 180.76.15.141, server: my-site.com, request: "GET / HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "www.my-site.com"
2016/09/15 22:46:55 [error] 29607#0: *84028 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 180.76.15.139, server: my-site.com, request: "GET / HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "www.my-site.com"
2016/09/15 22:47:23 [error] 29607#0: *84244 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 180.76.15.145, server: my-site.com, request: "GET / HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "www.my-site.com"
2016/09/15 23:26:12 [error] 29607#0: *90756 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 66.249.71.17, server: my-site.com, request: "GET /category/distribution/ HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "my-site.com"
2016/09/15 23:49:28 [error] 29608#0: *94579 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 180.76.15.33, server: my-site.com, request: "GET / HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "www.my-site.com"
2016/09/15 23:50:50 [error] 29608#0: *94786 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 180.76.15.163, server: my-site.com, request: "GET / HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "www.my-site.com"


EDIT:
After checking the php5-fpm log I found this:




My /var/log/upstart/php5-fpm.log



[16-Sep-2016 00:05:02] NOTICE: fpm is running, pid 32006
[16-Sep-2016 00:05:02] NOTICE: ready to handle connections
[16-Sep-2016 00:05:02] NOTICE: systemd monitor interval set to 10000ms
[16-Sep-2016 00:05:07] WARNING: [pool www] server reached pm.max_children setting (5), consider raising it
[16-Sep-2016 01:03:39] NOTICE: Terminating ...
[16-Sep-2016 01:03:39] NOTICE: exiting, bye-bye!
[16-Sep-2016 01:03:39] NOTICE: fpm is running, pid 32699

[16-Sep-2016 01:03:39] NOTICE: ready to handle connections
[16-Sep-2016 01:03:39] NOTICE: systemd monitor interval set to 10000ms
[16-Sep-2016 01:03:43] WARNING: [pool www] server reached pm.max_children setting (5), consider raising it


So I updated the php5-fpm configuration, especially increasing the pm.max_children value:



pm = dynamic
pm.max_children = 50

pm.start_servers = 5
pm.min_spare_servers = 5
pm.max_spare_servers = 35


The site has been running better for a little while now, but I doubt this is a long-term solution to the problem.
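A common rule of thumb for sizing pm.max_children (an assumption-laden sanity check, not part of the original fix) is available RAM divided by the average resident size of one worker, which can be measured like this:

ps -C php5-fpm -o rss= | awk '{s+=$1; n++} END {if (n) print s/n/1024 " MB avg over " n " workers"}'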



EDIT 2:



After a long investigation of mysql.log (all queries) and of the access log, I found that I was under a brute-force attack.

You can find information about this attack here: https://www.digitalocean.com/community/tutorials/how-to-protect-wordpress-from-xml-rpc-attacks-on-ubuntu-14-04



The access log repeatedly showed this:



191.96.249.75 - - [16/Sep/2016:05:28:52 +0000] "POST /xmlrpc.php HTTP/1.0" 499 0 "-" "Mozilla/4.0 (compatible: MSIE 7.0; Windows NT 6.0)"
191.96.249.75 - - [16/Sep/2016:05:28:54 +0000] "POST /xmlrpc.php HTTP/1.0" 499 0 "-" "Mozilla/4.0 (compatible: MSIE 7.0; Windows NT 6.0)"
191.96.249.75 - - [16/Sep/2016:05:28:57 +0000] "POST /xmlrpc.php HTTP/1.0" 499 0 "-" "Mozilla/4.0 (compatible: MSIE 7.0; Windows NT 6.0)"
191.96.249.75 - - [16/Sep/2016:05:28:58 +0000] "POST /xmlrpc.php HTTP/1.0" 499 0 "-" "Mozilla/4.0 (compatible: MSIE 7.0; Windows NT 6.0)"
191.96.249.75 - - [16/Sep/2016:05:29:01 +0000] "POST /xmlrpc.php HTTP/1.0" 499 0 "-" "Mozilla/4.0 (compatible: MSIE 7.0; Windows NT 6.0)"
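One of the mitigations the linked tutorial describes is simply refusing these requests at the web server; a minimal nginx sketch (only appropriate if nothing legitimate, e.g. Jetpack, needs XML-RPC):

# inside the server block
location = /xmlrpc.php {
    deny all;
}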

linux - How to SSH to ec2 instance in VPC private subnet via NAT server

I have created a VPC in AWS with a public subnet and a private subnet. The private subnet does not have direct access to the external network. S...