Wednesday, July 30, 2014

networking - Network routing issues on Linux



I was hoping someone out there would be able to look at this and let me know what I have missed. I have 4 machines and for some reason, only 1 of them can talk to the other 3 via their private IP address (on eth1).



The 4 machines are:




mach01 10.176.193.17
mach02 10.176.193.92

mach03 10.176.193.27
mach04 10.176.195.9


All of the machines are Debian lenny. From mach02, I can ping the other 3 machines no problem, and from the other machines, I can ping mach02. However, from mach01, mach03 and mach04 I can only ping mach02.



The output of "iptables --list" is the same on all machines:




Chain INPUT (policy ACCEPT)

target prot opt source destination

Chain FORWARD (policy ACCEPT)
target prot opt source destination

Chain OUTPUT (policy ACCEPT)
target prot opt source destination


So I do not believe there is a firewall issue. The routing table entries for eth1 are the same on all machines:





10.176.192.0 * 255.255.224.0 U 0 0 0 eth1
10.191.192.0 10.176.192.1 255.255.192.0 UG 0 0 0 eth1
10.176.0.0 10.176.192.1 255.248.0.0 UG 0 0 0 eth1


So that looks fine as well. For some reason, ARP requests from mach03 to any machine other than mach02 are failing, and the same is true from mach01 and mach04.





mach03$ arping -c 1 -I eth1 10.176.193.17
ARPING 10.176.193.17

--- 10.176.193.17 statistics ---
1 packets transmitted, 0 packets received, 100% unanswered
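
As a sanity check on the routing table above, here is a minimal Python sketch of longest-prefix route selection using the three eth1 routes and the four private addresses from this question. The names ROUTES, HOSTS and pick_route are my own, chosen for illustration; this approximates how the kernel chooses a route and is not its actual implementation. If the routes are as listed, every machine should resolve to the directly connected 10.176.192.0/19 network, which means the kernel would ARP for the peer on eth1 instead of sending the packet to the 10.176.192.1 gateway.

```python
import ipaddress

# eth1 routes as listed in the question:
# (destination, netmask, gateway or None when directly connected)
ROUTES = [
    ("10.176.192.0", "255.255.224.0", None),
    ("10.191.192.0", "255.255.192.0", "10.176.192.1"),
    ("10.176.0.0", "255.248.0.0", "10.176.192.1"),
]

# Private addresses of the four machines
HOSTS = {
    "mach01": "10.176.193.17",
    "mach02": "10.176.193.92",
    "mach03": "10.176.193.27",
    "mach04": "10.176.195.9",
}


def pick_route(dest):
    """Return (network, gateway) for the most specific matching route, or None."""
    addr = ipaddress.ip_address(dest)
    matches = []
    for net, mask, gateway in ROUTES:
        network = ipaddress.ip_network(f"{net}/{mask}")
        if addr in network:
            matches.append((network, gateway))
    if not matches:
        return None
    # Longest prefix wins, as in normal IP routing
    return max(matches, key=lambda m: m[0].prefixlen)


for name, ip in HOSTS.items():
    network, gateway = pick_route(ip)
    how = "direct (ARP on eth1)" if gateway is None else f"via {gateway}"
    print(f"{name} {ip}: {network} {how}")
```

Since all four addresses land on the directly connected route, the unanswered ARP request is consistent with a problem below the routing layer rather than with a missing or wrong route.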


I do not see any reason why ARP would fail like this, and have run out of ideas and places to look. Does anyone else with more experience in troubleshooting networking have any ideas?



Thanks




EDIT



After trying to ping mach01 from mach03, the following is in the ARP cache:




$ arp -a
? (10.176.193.17) at <incomplete> on eth1
? (67.23.45.1) at 00:00:0C:07:AC:01 [ether] on eth0



And the other way around (so from mach01 to mach03):




? (10.176.193.92) at 40:40:FA:77:D7:94 [ether] on eth1
? (10.176.193.27) at <incomplete> on eth1
? (67.23.45.1) at 00:00:0C:07:AC:01 [ether] on eth0
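
To pick out unresolved neighbours from a longer cache listing, a small Python sketch like the following could scan "arp -a" style lines and report every entry that has no hardware address. The sample lines mirror the mach01 output above; the regular expressions and the function name unresolved are my own assumptions about the format, not part of any tool.

```python
import re

# A MAC address such as 40:40:FA:77:D7:94
MAC_RE = re.compile(r"\b(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}\b")
# The IPv4 address that arp -a prints in parentheses
IP_RE = re.compile(r"\((\d{1,3}(?:\.\d{1,3}){3})\)")


def unresolved(lines):
    """Return the IPs of cache entries that carry no hardware address."""
    missing = []
    for line in lines:
        ip = IP_RE.search(line)
        if ip and not MAC_RE.search(line):
            missing.append(ip.group(1))
    return missing


sample = [
    "? (10.176.193.92) at 40:40:FA:77:D7:94 [ether] on eth1",
    "? (10.176.193.27) at <incomplete> on eth1",
    "? (67.23.45.1) at 00:00:0C:07:AC:01 [ether] on eth0",
]

print(unresolved(sample))
```

On the sample, only mach03's address is reported, which matches what the cache shows: mach02 resolves, mach03 does not.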


And more details on eth1:





$ ip addr show dev eth1
3: eth1: mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000
link/ether 40:40:16:e0:f3:dd brd ff:ff:ff:ff:ff:ff
inet 10.176.193.17/19 brd 10.176.223.255 scope global eth1
inet6 fe80::4240:16ff:fee0:f3dd/64 scope link
valid_lft forever preferred_lft forever

Answer




It turned out that I had run into an issue with Rackspace Cloud Servers' networking. The issue was escalated to Rackspace and has been resolved.



I would like to thank everyone who responded.

