Email Security.cloud

  • 1.  cluster{x}a.eu.messagelabs.com deferred mail from users at our domain

    Posted Jun 06, 2018 11:35 AM

    Hello.

    I am hoping someone from Symantec support could look into this problem.

    When sending mail from our domain (bfpe.com) to customers whose MX records point to both

    cluster3a.eu.messagelabs.com

    and

    cluster8a.eu.messagelabs.com

    mail is being deferred with the following message:

    dsn=4.0.0, stat=Deferred: 421 esmtp: protocol deviation

    The source IP for bfpe.com's mail relay (smtp.bfpe.com) is 208.79.82.226.

    I have checked the IP's reputation on mxtoolbox.com and directly at http://ipremoval.sms.symantec.com/lookup/ and there's nothing to see there.

    Mail to MX records at us.messagelabs.com (cluster{1,4}.us.messagelabs.com) is delivered normally, but I see instances where mail to cluster1a.us.messagelabs.com is also deferred with the same "protocol deviation" error.

    It does seem that the cluster{N}a records (where N is an integer like 1, 2, 3...) are always the backup MX records for the domain. Could it be that when we initially queued the messages, the primary MX was unreachable, which forced the switch to the backup? And now, because the messages are queued for the "backup" while the "primary" is available, we are being rejected? (Using a lower-priority MX when the higher-priority MX is available is a spammer trick, hence the "protocol deviation"?)

    Can anyone shed a little more light on why in particular cluster{N}a.eu.messagelabs.com and cluster{N}a.us.messagelabs.com MX destinations are deferring (for days now) mail from our users?

    Thanks for your insights.



  • 2.  RE: cluster{x}a.eu.messagelabs.com deferred mail from users at our domain

    Broadcom Employee
    Posted Jun 07, 2018 09:35 AM

    Hi

    Anything that connects to our A clusters will see a 421 "service temporarily unavailable" error, as you have seen.

    So if for example a domain had the MX records of

    10 cluster1.us.messagelabs.com

    20 cluster1a.us.messagelabs.com

    All mail should be routed to the 10 record, cluster1.us.messagelabs.com, and not the 20 record. As you rightly mention, a spammer trick is to target the lower priority MX record, as this is likely to have weaker security than the primary MX record.
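
    As a minimal sketch of the ordering described above (the function name `order_by_preference` is hypothetical and not part of any Symantec tooling), a sender is expected to try the exchanger with the lowest preference number first:

```python
def order_by_preference(mx_records):
    """Return host names sorted so the lowest preference number comes first.

    mx_records is a list of (preference, host) tuples, as read from DNS.
    """
    return [host for _, host in sorted(mx_records)]


# The example MX set from this post, deliberately listed out of order.
records = [
    (20, "cluster1a.us.messagelabs.com"),
    (10, "cluster1.us.messagelabs.com"),
]
print(order_by_preference(records))
```

    The 20 record only comes into play when the 10 record cannot be used.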

    I've checked the source IP you mention and see no issues on our side which would cause this with that IP. Occasionally we see an IP throttled to a point where we will reject its connections and force it to try the secondary MX record but in this case that is not happening with the IP you've quoted.

    Have you tried flushing your DNS cache?

    Regards

    Ian Tiller

    Tier 2 Senior Technical Support Engineer



  • 3.  RE: cluster{x}a.eu.messagelabs.com deferred mail from users at our domain

    Posted Jun 07, 2018 12:38 PM

    Dear Ian,

    Thanks for the tip! Unfortunately, there isn't a DNS problem, but there might have been a transient connectivity problem.

    Regardless, and with all due respect, deferring mail indefinitely because a sender contacts the published backup/secondary MX records for the domain rather than the primary is really poor practice, and certainly not in the spirit or best practices of the SMTP RFCs.

    A customer with this setup gains little benefit from the published record, except in the rare instance that the primary is taken offline and the secondary is made aware of that change.

    Consider this from off network (well outside of Symantec's cloistered, hallowed halls). I am in a distant locale, many hops from either of two routes into Symantec's network infrastructure.

    $ dig +short -t MX schneider-electric.com
    10 cluster3.eu.messagelabs.com.
    20 cluster3a.eu.messagelabs.com.

    $ dig +short cluster3.eu.messagelabs.com
    46.226.52.195

    $ dig +short cluster3a.eu.messagelabs.com
    85.158.139.103

    Sender network/mail relay cannot reach 46.226.52.195 due to provider network not receiving prefix or perhaps blackhole mitigation. Sender mail relay has a route to 85.158.139.103 and makes an attempt to connect but upon deferral, queues that mail for redelivery at the next interval. For whatever reason, even over multiple attempts during the queue TTL, connectivity to the primary remains unavailable. So the queue TTL timer expires, and the mail is bounced to the sender, when by all reasoning, it should have been delivered.
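
    The failure mode described above can be simulated in a few lines. This is only a sketch under stated assumptions: the outcome labels ('unreachable', 'deferred', 'accepted') and both function names are made up for illustration, and a real relay's queue logic is more involved.

```python
def attempt_delivery(hosts_in_order, outcome_for):
    """Try each exchanger once, in MX order.

    outcome_for maps host -> 'unreachable', 'deferred' (a 4xx reply)
    or 'accepted'. Returns True only if some host accepted the mail.
    """
    for host in hosts_in_order:
        if outcome_for[host] == "accepted":
            return True
    return False


def run_queue(hosts_in_order, outcome_for, max_attempts):
    """Retry until delivered or until the queue lifetime is used up."""
    for attempt in range(1, max_attempts + 1):
        if attempt_delivery(hosts_in_order, outcome_for):
            return ("delivered", attempt)
    return ("bounced", max_attempts)


hosts = ["cluster3.eu.messagelabs.com", "cluster3a.eu.messagelabs.com"]
outcomes = {
    "cluster3.eu.messagelabs.com": "unreachable",  # no route from sender
    "cluster3a.eu.messagelabs.com": "deferred",    # backup answers 421
}
print(run_queue(hosts, outcomes, max_attempts=3))
```

    With the primary unreachable and the backup always deferring, no number of retries changes the result, so the message is bounced once the attempts run out.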

    What is the point of publishing a backup MX record, if not to accept mail for delivery?

    I am well aware of the following document:

    https://support.symantec.com/en_US/article.TECH247121.html

    It does not excuse the bad behavior of backup mail exchangers which presume they have the foreknowledge that from anywhere on the Internet, the primary is reachable. It cannot be determined a priori by Symantec whether every other network attempting to access endpoints via published DNS records can actually access ALL of those endpoints!

    The fact that the primary exchangers at Symantec are clusters running behind load balancers doesn't make the situation any better. At any given moment, there can exist an outage over a given path to an endpoint. Having a backup endpoint DNS record published is supposed to be another way to mitigate the event! Instead, Symantec completely misses the boat and in effect publishes superfluous MX records which do as much harm as good.

    Something for your engineering team and customers to consider. I would be hard pressed to ever decide on partnering with Symantec for email services based on this learning experience.

    Best regards.

     



  • 4.  RE: cluster{x}a.eu.messagelabs.com deferred mail from users at our domain

    Posted Jun 21, 2018 11:21 AM

    Just a little over two weeks of this, and still getting the same treatment and still no idea why.

    As hinted in my previous post, schneider-electric.com is one of the customers of Symantec's with which we experience this problem.

    For privacy reasons I am attaching only redacted log and diagnostics outputs from our mail relay.

    On this example message (which is typical of all the deferred messages from all the SMG subscribed Symantec customers), first we see in the queue messages that the sendmail process correctly tries to connect to "cluster3.eu.messagelabs.com" but our relay times out (?) or is disconnected early: "read error". This seems to rule out a DNS lookup error for the correct MX records.

    Because of the "read error" it seems our relay then reverts to attempting to deliver to the backup exchangers listed for schneider-electric.com (see post above.) These exchangers defer mail because "we shouldn't connect to them, since by Symantec's logic: the primary is 'reachable'", hence the "esmtp: protocol deviation"?

    Ultimately, this deferral causes a retry every thirty minutes for 5 days, after which our relay gives up and sends our sender a notice of failed delivery.
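
    To put a number on that, here is a quick calculation. It assumes a fixed 30 minute retry interval over the whole 5 day queue lifetime, which may not match how a particular relay actually schedules its retries.

```python
from datetime import timedelta

retry_interval = timedelta(minutes=30)
queue_lifetime = timedelta(days=5)

# Dividing one timedelta by another with // yields a plain integer.
attempts = queue_lifetime // retry_interval
print(attempts)  # 240
```

    That is roughly 240 deferred attempts per message before the bounce goes out.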

    If someone at Symantec could shed a little light, we'd be most appreciative.

     

    Attachment(s)

    txt
    deferred_mail_extract.txt   1 KB 1 version


  • 5.  RE: cluster{x}a.eu.messagelabs.com deferred mail from users at our domain

    Posted Jun 29, 2018 06:18 AM

    We have also been having a similar problem recently.  Mail to relay=cluster8.eu.messagelabs.com falls back to cluster8a.eu.messagelabs.com and gets deferred.  Today I started to get reports of a similar symptom from users whose mail goes to relay=cluster6.us.messagelabs.com.  Mail to cluster9.us.messagelabs.com usually succeeds after some retries.  I checked the IP spam scores for our mail server IP and it's clean.  Does anyone know how to fix this?



  • 6.  RE: cluster{x}a.eu.messagelabs.com deferred mail from users at our domain

    Posted Jun 29, 2018 09:04 AM

    Lucky you.

    I'll tell you what finally did the trick for us, since I don't think you'll have any luck getting any useful information from Symantec.

    Note, I do truly appreciate Ian Tiller taking the time to post here, confirming that we weren't "blacklisted" (well, I suppose we sort of were -- read on!)

    The short answer for us was to "go deep" (Wireshark) and note that the disconnection from the primary messagelabs MX was during the TLS key exchange.

    There was no status from the SMTP daemon on their end, which of course made it nearly impossible to troubleshoot using the logs. What happened to us was that once we sent our certificate, the connection was immediately terminated.
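
    For anyone without Wireshark handy, a rough way to see which stage the session dies at is a small probe using Python's standard smtplib and ssl modules. This is a sketch, not what we actually ran: the function name `probe_starttls`, its stage labels and the `smtp_factory` parameter are made up for illustration, and it assumes the server certificate is not verified, as is common for opportunistic TLS between relays.

```python
import smtplib
import ssl


def probe_starttls(host, port=25, certfile=None, keyfile=None,
                   smtp_factory=smtplib.SMTP, timeout=30):
    """Return the stage where the session stopped.

    One of: 'connect', 'ehlo', 'no-starttls', 'tls-handshake', 'ok'.
    """
    context = ssl.create_default_context()
    # Assumption: we only care whether the handshake completes, so the
    # server certificate is not verified here.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    if certfile:
        # Present the relay's own certificate, as the MTA would.
        context.load_cert_chain(certfile, keyfile)

    try:
        server = smtp_factory(host, port, timeout=timeout)
    except OSError:
        return "connect"
    try:
        try:
            code, _ = server.ehlo()
        except OSError:
            return "ehlo"
        if not 200 <= code < 300:
            return "ehlo"
        if not server.has_extn("starttls"):
            return "no-starttls"
        try:
            server.starttls(context=context)
        except OSError:
            # ssl.SSLError, connection resets and smtplib errors are
            # all OSError subclasses.
            return "tls-handshake"
        return "ok"
    finally:
        try:
            server.quit()
        except OSError:
            pass
```

    Calling it once with the relay's certificate and key files and once without them should show whether the certificate is what triggers the disconnect: a result of 'tls-handshake' only when the certificate is presented would point that way.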

    I of course had already checked the expiration date of our certificate (self signed for the record then -- and now), and was of course well aware that sending mail using TLS to every other destination NOT on messagelabs resulted in a properly negotiated STARTTLS smtp session. So at a glance I was unwilling to believe we actually had a certificate problem. And I will reiterate: for the "Internet at large" we did not have a certificate problem.

    However... with no other leads and basically at my wits' end, I checked the properties of our certificate with an eye toward key length and hashing algorithms and, admittedly, they were weaker than it seemed prudent to continue using. If you want to know how your SSL certificate "looks", this is a great tool:

    https://ssl-tools.net/mailservers

    I regenerated a new "stronger" self signed certificate, and restarted the mail relay daemon, and within the queue processing period, all the stuck messages destined for messagelabs customers were delivered.

    What's really hard to understand is why the backup exchangers at messagelabs don't behave identically to the primary when negotiating the connection. I suppose it's just down to the order of operations, and the fact that messagelabs has their "special" way of handling connections to the backups when they insist that their primary exchangers are reachable. This is a sticking point, and I don't agree with their logic as noted above.

    In short: have a look at your TLS certificate, and if possible, get one with better parameters issued and install it in your mail relay. Apparently, messagelabs has taken a stand on their minimums for an acceptable certificate from a remote sender. Nothing wrong with that really -- TLS is a total mess right now, but trying to figure that out is non-trivial, when the only thing you get back from the remote mail gateway is a TCP-RST to the connection! If there's a public note anywhere on this policy, it would be useful to see that link posted here. I never stumbled across anything mentioning minimum acceptable certificate parameters when exchanging mail with messagelabs...

    Hope this helps and best of luck!

     



  • 7.  RE: cluster{x}a.eu.messagelabs.com deferred mail from users at our domain

    Posted Jun 29, 2018 11:13 AM

    I was woken up around 2 a.m. by a client to sort out the problem.  I got a clue after performing a tcpdump trace, and then verified it by connecting to messagelabs with telnet and sending a test mail without TLS.  For now, my workaround was to put a Try_TLS: NO in the access file and go straight back to sleep.  I'm sure I cannot handle the new cert thing at this late hour.  I think I'm going to need to put a real letsencrypt cert on this system instead of a self-signed cert as the long-term solution. Thank you for your detailed and informative reply!



  • 8.  RE: cluster{x}a.eu.messagelabs.com deferred mail from users at our domain

    Posted Jul 18, 2018 09:27 AM

    Hello,

    Regarding your recent question - "Will Symantec be able to share with us what are the properties and pre-requisites of SSL certificate, in order to negotiate the TLS connection?"

    In order to negotiate a TLS connection to our customer(s), there would need to be an enforcement in place in our customer's configuration, or we can simply use opportunistic TLS.

    From your standpoint, all that would be required is that you get a certificate from an authorized certificate provider. Another thing to confirm is that your mail server will accept and send emails via TLS.

    I have also looked into your sending IP "208.79.82.226" and it is not being throttled or listed on any blacklists as Ian Tiller advised.

    Hope this information helps and answers your concerns.
     

    Kind Regards,
     
    Akash Patel
    Sr. Technical Support Engineer