linux - Weird server behavior, fear security breach but nothing on logs

Recently my hosting provider (a2hosting) had an issue with the node hosting my VPS. During this period my users noticed weird behavior beyond the several restarts that happened while the node had issues.

It all started when some of my website's users contacted me saying they saw different images than usual on the website (totally random images, not related to my website). Since I wasn't home, I shut down the server from my hosting control panel, fearing a hacker attack.

After a couple of hours my users told me that the server was up again (even though I didn't bring it up), and that after clearing the cache they no longer saw anything wrong with the images. I shut it down again as soon as I read this, even more fearful of a hacker attack (but puzzled, since no password had been changed).

Finally, when I came back home I booted the server again and checked the logs. There was no suspicious activity in the logs, nobody had logged on to the server besides me, and no file (of those I was told were different) had been changed. The only strange things were:

I have 2 Apache log files called 20150518-access.log and 20150517-access.log (and I have some log entries dated May 18 in other logs), which means my system temporarily had May 18 as its date (now it's back to April, and it changed back by itself).

Some of my logs have this (http://mjzone.net/Files/lognull.png) as a log entry after one of the restarts.

I have done some checks and my system seems fine: no access other than mine and no suspicious files found (not even the images that I was told were different). I spent all my Sunday checking for suspicious activity but couldn't find any.
So I opened a ticket with a2hosting. They said they couldn't give me any information, but that a security breach was more likely than an issue linked to the node problem (except for the restarts, which they confirmed were a problem on their side). Honestly, I'm not convinced. While I can think of a link between the automated restarts and the temporarily changed date, I can't find one for the images (which are the only thing that made me think of a security issue, although I can't picture a hacker who changes a bunch of random images and the server date but then reverts everything).
My questions are:

Is there any way that people seeing random images instead of the actual images might be linked to a server/node issue rather than a security breach?

What else could I check, other than system logs and recently changed files, to make sure nothing weird happened?

Answer

Seems like a BGP update combined with default host choices¹. I had trouble gathering data (due to a network I'd pay to upgrade), but here's what I got:

The Route

A2 Hosting's routes are under AS55293. You're covered by the /22 and /23 there. The AS paths were updated on April 17th, the same day as your NUL attachment. From LookingGlass (enter your IP, choose Route, then Probe):

core1.fmt2.he.net> show ip bgp routes detail 199.195.117.35

Number of BGP Routes matching display condition : 2
S:SUPPRESSED F:FILTERED s:STALE 

1 Prefix: 199.195.116.0/23,  Status: BI,  Age: 19d20h14m55s
NEXT_HOP: 206.223.119.132, Metric: 593, 
Learned from Peer: 216.218.252.168 (6939)
LOCAL_PREF: 100,  MED: 20,  ORIGIN: igp,  Weight: 0
AS_PATH: 12129 55293

2 Prefix: 199.195.116.0/23,  Status: I,  Age: 82d0h54m10s
NEXT_HOP: 206.126.236.70, Metric: 685, 
Learned from Peer: 216.218.252.169 (6939)
LOCAL_PREF: 100,  MED: 20,  ORIGIN: igp,  Weight: 0
AS_PATH: 12129 55293

Last update to IP routing table: 2d19h12m13s, <--------- Right here
1 path(s) installed:  (no data was here, maybe a removal)
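
The coverage claim can be sanity-checked offline. This is a minimal sketch, assuming Python's standard ipaddress module; the address and the /23 prefix are copied from the LookingGlass output above, and the variable names are only illustrative:

```python
import ipaddress

# Address and prefix copied from the LookingGlass output above.
server = ipaddress.ip_address("199.195.117.35")
prefix = ipaddress.ip_network("199.195.116.0/23")

# Is the address inside the announced prefix?
covered = server in prefix

# How many addresses does the /23 span?
size = prefix.num_addresses

print(covered, size)
```

The /23 alone spans 512 addresses, which is the order of magnitude to keep in mind when thinking about how many neighbouring servers could answer for a misrouted request.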

The Service

A2 Hosting's service is primarily WordPress. When I visit your server by address it serves another WP site. We're accustomed to thinking in names, but that detail is why this matters.

Because I omitted your host name, the server chose another site (either the first one found or the one configured as default). Browsers don't care much about names: they approach servers by address and only provide the name so the server can choose a site from its configuration. DNS also has PTR records to cross-check addresses in reverse back to names, but yours don't point back to your site's name:

$ dig PTR 35.116.195.199.in-addr.arpa.

;; ANSWER SECTION:
... 43200 IN    PTR 199.195.116.35.static.a2webhosting.com.

For shared hosts that just means the webserver has to rely on the client-provided name rather than double-checking with DNS. What it should not do is send the wrong site's content when it isn't sure (though lots of people get away with this, because the worst that usually happens is a 404). Unfortunately, your host carries primarily WordPress, which gives wrong requests to the wrong servers a higher chance of succeeding. SSL could issue warnings, but I wouldn't count on that.
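
To see why that PTR record can't be used for cross-checking, here is a minimal sketch using Python's standard ipaddress module. The helper names ptr_query_name and is_generic_ptr are hypothetical, and the "generic" test is only a heuristic I'm assuming, not a DNS rule:

```python
import ipaddress


def ptr_query_name(address: str) -> str:
    """Return the in-addr.arpa name that a PTR lookup for `address` queries."""
    return ipaddress.ip_address(address).reverse_pointer + "."


def is_generic_ptr(address: str, ptr_target: str) -> bool:
    """Heuristic (an assumption): a PTR target that merely embeds the dotted
    address is a provider placeholder rather than a real site name."""
    return address in ptr_target


print(ptr_query_name("199.195.116.35"))
print(is_generic_ptr("199.195.116.35", "199.195.116.35.static.a2webhosting.com."))
```

A placeholder target like this says which provider owns the address, but nothing about which site should be served from it.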

Here's a request for google at your server's address:

$ nc youraddress 80
GET / HTTP/1.1
HOST: www.google.com

.....

HTTP/1.1 200 OK
...headers, html, nothing Google yet...
......
   
  FillGood...

which is the same (wrong) response. The magic happens if I request a WordPress image at your site but it routes wrong, ends up at default, and matches files from another site.
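
The same probe can be scripted so it is easy to repeat against several addresses. This is a rough sketch using only Python's standard socket module; build_request and probe are hypothetical helper names, and the address you pass in is up to you:

```python
import socket


def build_request(host_header: str, path: str = "/") -> bytes:
    """Build a raw HTTP/1.1 GET carrying an arbitrary Host header."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host_header}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")


def probe(address: str, host_header: str, port: int = 80, timeout: float = 5.0) -> bytes:
    """Connect by address, send the request, and return the raw response."""
    chunks = []
    with socket.create_connection((address, port), timeout=timeout) as conn:
        conn.sendall(build_request(host_header))
        while True:
            data = conn.recv(4096)
            if not data:  # server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


# The request that would be sent for the example above:
print(build_request("www.google.com").decode("ascii"))
```

Calling probe with your server's address and a host name it doesn't host shows whether it answers with a default site or refuses. A 200 with someone else's content is the behavior described above.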

The NULs

The SIGTERM (15) in your NUL image implies a graceful shutdown of sshd rather than an exploit sled, and the NULs appear to be keepalives. At one per second there are almost exactly 5.5 minutes of them (you mention shutting down for hours), and though iTerm and routers send keepalives (NUL or ^@), it appears the service returned with no intervening logs. I'm tempted to dismiss these as just happening too fast, because the default timeout for Cisco routers on route changes is 270 seconds (here and here), or 4.5 minutes. That is almost exactly the difference between the times that sshd gracefully shuts down (21:56:59) and returns (22:01:30): 271 seconds, within about one second of the timeout.
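
The timing comparison is simple arithmetic and can be checked with Python's standard datetime module. In this sketch the two timestamps come from the log image, and the 270-second figure is the timeout cited above, not something I verified independently:

```python
from datetime import datetime

FMT = "%H:%M:%S"

# Timestamps read from the sshd entries in the log image.
stopped = datetime.strptime("21:56:59", FMT)
returned = datetime.strptime("22:01:30", FMT)

# Route-change timeout as cited in the answer (an assumption here).
ROUTE_TIMEOUT_SECONDS = 270

gap_seconds = int((returned - stopped).total_seconds())
difference = gap_seconds - ROUTE_TIMEOUT_SECONDS

print(gap_seconds, difference)
```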

Conclusion

These events are aligned in time², and their durations look like what people and scripts produce, not random crashes. This all assumes that the host stays up, that DNS doesn't provide a cross-check, that default configurations are in place, that one architecture is prevalent, and that SSH is controlled separately, as might be done with host-based jails.

You mentioned Varoufakis and a motorbike, and I saw in the news that he visited Eurogroup Finance recently. EGF's site is in the same subnet and they use /wp, but their server responds properly to unknown hostnames. Since you have two subnets covering you, there are probably around 500 default sites' addresses to check, but the primary issue seems to be that not every server sets a default site, or wall.

It's really up to them, but a generic "WordPress-incompatible" page should probably appear, so that when I play their hosted chess I'm not being offered another server's shoe repository. With broken routing the potential to share scripts and cookies exists, and anyone familiar with BGP may see additional problems here. These sites should all be made to break the same way: by refusing to cross-share content.

¹ Timely information with assumptions is usually more important, but my original text was too speculative. Sorry about the delay.

² For your timestamp change, try looking for logs referencing ntp or ntpd, to see if your system synchronized with an unexpected time source.
