Thursday, November 1, 2018

domain name system - Bouncing email due to bad initial MX TTL?



Last Friday night we changed mail servers. We moved off Office 365 to another hosted Exchange solution (Intermedia.net).



We use GoDaddy for DNS, and once the server migration was ready and clients were good to go, I edited our MX records. GoDaddy is fast, so within an hour or so I saw on whatsmydns.com that the new, proper MX records were propagating nicely.



Cue Monday morning. Email is coming through, but I'm starting to hear about bouncebacks. More of the same Tuesday. I'd been pulling my hair out when, on Tuesday night, I eyeballed our DNS entries and saw that the TTL on the new MX record was 1 week. Yikes. I changed it to half an hour. Today, one bottleneck (MessageLabs/Symantec) has updated to point to the right server, but we're still getting some unfortunately large external senders, such as Postini, bouncing their messages off the old server.



Is that initial 1 week TTL to blame? Would Postini respect that initial TTL despite my having shortened it yesterday? I'm going nuts because it seems pretty hopeless. I had IT folks at one of our clients contact Postini to open a ticket, since their users' emails to us are bouncing, but it could take time to move on that front. My only hope is that it's Thanksgiving, so work is mostly over until the weekend. Friday night will be the official 1 week from that initial bad TTL. Should I have hope that things will 'just work' come Monday? I don't know what else to check. The domain is removed from O365, and the new MX records seem nicely propagated. I'm about to jump off a pier.


Answer




Never shut down the old server until you know that the TTL of the MX record pointing to it has expired. If the TTL is/was 1 week, then leave the old server running for 1 week to catch any email from clients that may still have that MX record cached.



When implementing an email cutover, always check the TTL on the MX record during your planning phase and adjust it accordingly. Personally, I don't see any good reason to set the TTL on the MX record to anything more or less than 1 hour.
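
As a rough planning aid, the arithmetic behind that advice can be sketched in a few lines of Python. The function name `retire_old_server_after`, its parameters, and the dates are all hypothetical and chosen only for illustration. The sketch assumes resolvers honor TTLs exactly, which not every resolver does.

```python
from datetime import datetime, timedelta


def retire_old_server_after(ttl_lowered_at, original_ttl, lowered_ttl, cutover_at):
    """Worst-case time at which no well-behaved cache still holds the old MX.

    Hypothetical helper: assumes caches honor the TTL they were given.
    """
    # A cache that fetched the record just before the TTL was lowered
    # keeps it for the full original TTL.
    last_long_lived_copy = ttl_lowered_at + original_ttl
    # A cache that fetched the record just before the cutover keeps it
    # for the lowered TTL.
    last_short_lived_copy = cutover_at + lowered_ttl
    return max(last_long_lived_copy, last_short_lived_copy)


# Illustrative dates only. Case 1: the TTL was never lowered ahead of time,
# so the old MX may stay cached for a full week after the change.
change = datetime(2024, 1, 5, 20, 0)
print(retire_old_server_after(change, timedelta(weeks=1), timedelta(hours=1), change))

# Case 2: the TTL was lowered to 1 hour more than a week before the cutover,
# so the old server only needs to stay up about an hour afterwards.
print(retire_old_server_after(
    datetime(2024, 1, 1), timedelta(weeks=1), timedelta(hours=1),
    datetime(2024, 1, 10, 20, 0),
))
```

The second case is the reason to lower the TTL during planning: once the original TTL has fully elapsed, every cache holds the short-lived copy and the cutover window shrinks to the lowered TTL.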



The new TTL on the MX record doesn't have any bearing on the old TTL of the MX record. So if the old MX record TTL was 1 week (or whatever), then any client that has it cached is going to hold it for whatever TTL remains in its cache. The fact that you changed it has no bearing, because those clients aren't going to look it up again until it expires in their cache.
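
To make that caching behavior concrete, here is a minimal toy model of a caching resolver. It is a sketch, not how any particular resolver is implemented. The class `CachingResolver`, the host names, and the timestamps are hypothetical, and times are plain seconds for simplicity.

```python
WEEK = 7 * 24 * 3600
HALF_HOUR = 30 * 60


class CachingResolver:
    """Toy resolver that caches an answer until the TTL it was given expires."""

    def __init__(self):
        self._cache = {}  # name -> (value, expires_at)

    def lookup(self, name, now, authoritative):
        entry = self._cache.get(name)
        if entry is not None and now < entry[1]:
            # Still cached: the authoritative zone is not consulted at all,
            # so any change made there (including a new TTL) is invisible.
            return entry[0]
        value, ttl = authoritative[name]
        self._cache[name] = (value, now + ttl)
        return value


# Hypothetical zone data: (MX target, TTL in seconds).
zone = {"example.com": ("old-mx.example.net", WEEK)}
resolver = CachingResolver()

first = resolver.lookup("example.com", now=0, authoritative=zone)

# The admin switches the MX and shortens the TTL right after that lookup.
zone["example.com"] = ("new-mx.example.net", HALF_HOUR)

one_day_later = resolver.lookup("example.com", now=24 * 3600, authoritative=zone)
one_week_later = resolver.lookup("example.com", now=WEEK, authoritative=zone)

print(first)           # old-mx.example.net
print(one_day_later)   # old-mx.example.net (cached copy, old TTL still running)
print(one_week_later)  # new-mx.example.net (cache expired, fresh lookup)
```

The shortened TTL only takes effect for this resolver on the lookup after its cached copy expires, which matches the behavior described above.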

