[App_rpt-users] Reg server root cause

Bryan Fields Bryan at bryanfields.net
Wed Sep 19 16:01:20 UTC 2018


Looks like we had a failure in Tampa this morning.  As expected, the
registration server did kick over to Chicago (ORD), but there are still
several hundred nodes not running dnsmgr, and these kept trying to use the
IP of the Tampa server.

Please check your registrations as below:

> > iax2 show registry
> Host                  dnsmgr  Username    Perceived             Refresh  State
> 44.72.21.13:4569      Y       44233       44.98.249.6:4569           60  Registered
> 44.72.21.13:4569      Y       42032       44.98.249.6:4569           60  Registered

If dnsmgr does not show 'Y', please save the following as
/etc/asterisk/dnsmgr.conf
--
[general]
enable=yes              ; enable creation of managed DNS lookups
                        ; default is 'no'
refreshinterval=300     ; refresh managed DNS lookups every <n> seconds
                        ; default is 300 (5 minutes)

--

Then restart Asterisk and verify that dnsmgr now shows 'Y' in the
"iax2 show registry" output.
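
If you look after several nodes, the check can be scripted.  The following
is only a rough sketch, assuming the column layout shown in the registry
output above (Host, dnsmgr, Username, ...).  The function name and the
sample text are made up for illustration, and the 'N' in the second sample
row is hypothetical, not real output.

```python
def nodes_without_dnsmgr(output):
    """Return the usernames (node numbers) whose dnsmgr column is not 'Y'."""
    flagged = []
    for line in output.splitlines():
        # Drop any leading mail-quote markers ("> ") before splitting.
        fields = line.lstrip("> ").split()
        # Data rows look like: host:port, Y/N, username, perceived, ...
        # Anything else (header, blank, summary line) is skipped.
        if len(fields) < 3 or fields[1] not in ("Y", "N"):
            continue
        if fields[1] != "Y":
            flagged.append(fields[2])
    return flagged


# Hypothetical sample: the second row is altered to show dnsmgr as 'N'.
sample = """\
Host                  dnsmgr  Username    Perceived             Refresh  State
44.72.21.13:4569      Y       44233       44.98.249.6:4569           60  Registered
44.72.21.13:4569      N       42032       44.98.249.6:4569           60  Registered
"""

print(nodes_without_dnsmgr(sample))
```

Paste the text from "iax2 show registry" into the function; any node number
it returns still needs dnsmgr.conf in place and a restart.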

Also, inexplicably, about 300 nodes are still hitting allstarlink.org
directly for registrations.  This server may go offline at any time.  We've
previously messaged node owners personally at the address on file.  Please
ensure you're using register.allstarlink.org as the registration server and
that you have dnsmgr enabled.
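
As a rough way to check this, the sketch below scans the text of an
iax.conf for register lines that point somewhere other than
register.allstarlink.org.  It assumes the usual Asterisk form
"register => user:secret@host" (or with a plain "="); the function name,
node numbers and secrets in the sample are made up for illustration.

```python
import re

EXPECTED_HOST = "register.allstarlink.org"

# Assumed form: "register => user[:secret]@host[:port]" or "register = ...".
# Lines commented out with ';' do not match because of the leading anchor.
REGISTER_RE = re.compile(
    r"^\s*register\s*=>?\s*(?P<user>[^:@\s]+)[^@\s]*@(?P<host>[^:/\s;]+)"
)


def stale_registrations(conf_text):
    """Return (user, host) pairs that do not register to EXPECTED_HOST."""
    stale = []
    for line in conf_text.splitlines():
        match = REGISTER_RE.match(line)
        if match and match.group("host").lower() != EXPECTED_HOST:
            stale.append((match.group("user"), match.group("host")))
    return stale


# Hypothetical iax.conf fragment; node numbers and secrets are placeholders.
sample_conf = """\
[general]
register => 1999:secret@allstarlink.org
register=1998:secret@register.allstarlink.org ; already correct
;register => 1997:secret@allstarlink.org
"""

print(stale_registrations(sample_conf))
```

Any pair it reports is a register line that should be changed to
register.allstarlink.org.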

The root cause today looks to be that the IPSEC tunnels went down and
didn't reestablish automatically as they should have.  As all traffic
between the allstarlink.org servers runs inside these tunnels, this caused
the TPA nodes to drop out of the cluster.  The DNS did fail over to the
register-ord server, so for the majority of people DNS refreshed and you
moved over.  We're going to let it run on ORD for a bit.

Further complicating this, with the tunnels down our monitoring system was
unable to contact us, so we learned about the outage 15 minutes later than
we should have.  We'll be correcting this.

We're investigating why IPSEC didn't refresh and reestablish these tunnels
automatically, since it did so in earlier testing.

Please ensure you have dnsmgr running on your node(s).

73's
-- 
Bryan Fields

727-409-1194 - Voice
http://bryanfields.net


