DNS record lookups can fail for a number of reasons, the most common of which is what’s called a “ServFail” error.
ServFail errors occur when there’s an error communicating with a DNS server. This could have a number of causes, including an error on the DNS server itself, or a temporary networking issue.
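At the protocol level, a server failure is reported as response code (RCODE) 2 in the header of a DNS response, as defined in RFC 1035. As a rough illustration, here is a minimal Python sketch that reads the RCODE from a raw response. The function name `response_rcode` is made up for this example, and how DNS Check classifies failures internally isn't covered here.

```python
import struct

RCODE_SERVFAIL = 2  # "Server failure" response code from RFC 1035


def response_rcode(message: bytes) -> int:
    """Return the RCODE from a raw DNS message (hypothetical helper)."""
    if len(message) < 12:
        raise ValueError("a DNS message header is 12 bytes long")
    # The header starts with a 16-bit ID followed by a 16-bit flags field.
    _msg_id, flags = struct.unpack("!HH", message[:4])
    # The RCODE is the lowest 4 bits of the flags field.
    return flags & 0x000F
```

A response whose flags field is `0x8182` (a response with recursion desired and available, and RCODE 2) would be reported as a ServFail by this check.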
Fortunately, most domains use multiple authoritative DNS servers, so if there is a short-lived ServFail issue on one name server which doesn’t impact the others, DNS lookups should still work. That said, if a name server has chronic ServFail issues, we recommend investigating why. ServFail errors happen, but should be rare.
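The reason a short-lived failure on one name server is usually harmless can be sketched as a simple failover loop. This is an illustration only: the `ServFail` exception, the `lookup_with_failover` function, and the `query` callable are hypothetical names, and real resolvers also deal with timeouts, retries, and server selection.

```python
from typing import Callable, Sequence


class ServFail(Exception):
    """Hypothetical exception raised when a name server reports a failure."""


def lookup_with_failover(name_servers: Sequence[str],
                         query: Callable[[str], str]) -> str:
    """Try each authoritative name server in turn until one answers."""
    last_error = None
    for server in name_servers:
        try:
            return query(server)  # first successful answer wins
        except ServFail as exc:
            last_error = exc      # remember the failure, try the next server
    # Only fail the lookup if every name server failed.
    raise ServFail("all name servers failed") from last_error
```

With two name servers, a lookup only fails when both of them return an error during the same attempt.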
ServFail Errors and DNS Record Monitoring
Many of our customers use DNS Check to notify them via an email, page or chat bot when a monitored DNS record starts failing. Some of these customers want to be notified if there’s any kind of issue, but others would rather not be notified about ServFail issues unless they persist.
Last year we introduced a feature for suppressing isolated ServFail notifications. This took the form of an account-wide setting which, when toggled on, suppressed notifications for ServFail errors unless two or more occurred in a row.
This feature was introduced to cut down on false positives. At the time, 57% of errors being reported were of the “ServFail” variety. The majority of these ServFail errors were resolved 5 minutes later, when the DNS record in question was next checked.
This feature had the desired impact. The number of ServFail-related notifications plummeted, and for most users, the issue of ServFail-related false positives disappeared.
Unfortunately, this didn’t completely resolve the situation for some customers who have DNS providers with… less than stellar uptime. I won’t call out specific DNS providers in this blog post, but there is a definite pattern in terms of which DNS providers have frequent ServFail errors.
Our Updated ServFail Notification Suppression Feature
To address this issue (on the DNS monitoring side, at least), we’ve replaced our old “Suppress first ServFail notification” setting with a new setting which allows you to suppress notifications for anywhere from 0 to 10 consecutive ServFail errors.
This setting has a default value of “1”, and can be adjusted from your Notification Settings page. I recommend keeping the default value in most situations, and adjusting it upward only as needed.
This setting does not have any impact on notifications for other types of lookup failures, such as the wrong value being returned for a DNS record. As long as you have notifications enabled, you’ll receive a notification the first time a non-ServFail error occurs.
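The suppression rule described above can be sketched in a few lines of Python. This is only an illustration, not DNS Check's actual implementation: the `RecordMonitor` class, its field names, and the result strings are hypothetical, and sending at most one notification per failure episode is an assumption made for the example.

```python
from dataclasses import dataclass

MAX_SUPPRESSED = 10  # upper bound of the setting, as stated in the post


@dataclass
class RecordMonitor:
    """Tracks check results for one monitored record (hypothetical)."""
    suppress_count: int = 1   # consecutive ServFail errors to suppress (0-10)
    servfail_streak: int = 0  # ServFail errors seen in a row so far
    notified: bool = False    # assumption: one notification per failure episode

    def __post_init__(self) -> None:
        if not 0 <= self.suppress_count <= MAX_SUPPRESSED:
            raise ValueError("suppress_count must be between 0 and 10")

    def observe(self, result: str) -> bool:
        """Record a check result; return True if a notification should be sent.

        `result` is "ok", "servfail", or any other string for a
        non-ServFail failure such as a wrong record value.
        """
        if result == "ok":
            self.servfail_streak = 0
            self.notified = False
            return False
        if result == "servfail":
            self.servfail_streak += 1
            if self.servfail_streak <= self.suppress_count:
                return False  # still within the suppressed run
        else:
            self.servfail_streak = 0  # other errors are never suppressed
        if self.notified:
            return False
        self.notified = True
        return True
```

With the default value of 1, a single ServFail followed by a successful check produces no notification, a second ServFail in a row does, and a wrong-value error notifies on its first occurrence. A value of 0 turns suppression off entirely.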