Saturday, January 12, 2013

Troubleshooting 101: Resolution (Part 1 of 5)

Resolution is the first stop when it comes to troubleshooting.  If a user, application, or protocol can't determine the address of its destination by name, then we'll never take the first step on a journey of hopefully fewer than 64 hops.

This troubleshooting step typically revolves around DNS, but it could also involve NetBIOS name resolution, or even name lookups handled at the application layer.  Since you're already on the Internet, let's discuss DNS issues in particular.

Tools of the Trade

The most common tools used for testing DNS are dig and nslookup.  Hailing from a Microsoft background, I usually find myself using nslookup:
C:\>nslookup
Default Server:  google-public-dns-b.google.com
Address:  8.8.4.4
> www.eveningread.com
Server:  google-public-dns-b.google.com
Address:  8.8.4.4

Non-authoritative answer:
Name:    ghs.l.google.com
Addresses:  2607:f8b0:400e:c01::79
          173.194.79.121
Aliases: 
www.eveningread.com
          ghs.google.com
From this short example, you can see several things.  First, my laptop is configured to query Google's Public DNS servers.  I selected their secondary server as my primary name server based on performance testing with namebench.  This tool identified that server as the fastest for my geographic location.

Second, you'll see I queried this site's fully qualified domain name; by default, nslookup searches for address records (A, plus AAAA as the IPv6 address in the output shows).  The name turns out to be an alias for ghs.l.google.com, which, in a second, automated lookup, returns the actual public IP address for this site.
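
To make that alias chase concrete, here is a minimal Python sketch that walks a toy record table until it reaches an address.  The table is copied from the nslookup output above; the order of the alias chain and the resolve helper are my own assumptions for illustration, not how any particular resolver is implemented.

```python
# Toy record table mirroring the nslookup output above. The alias order
# (www -> ghs.google.com -> ghs.l.google.com) is assumed from that output.
RECORDS = {
    "www.eveningread.com": ("CNAME", "ghs.google.com"),
    "ghs.google.com": ("CNAME", "ghs.l.google.com"),
    "ghs.l.google.com": ("A", "173.194.79.121"),
}


def resolve(name, records, max_hops=8):
    """Follow CNAME records until an A record is found.

    Returns (canonical_name, address, aliases_seen).
    """
    aliases = []
    for _ in range(max_hops):
        rtype, value = records[name.lower()]
        if rtype == "A":
            return name, value, aliases
        aliases.append(name)  # this name was only an alias
        name = value          # chase the CNAME target
    raise RuntimeError("CNAME chain too long (possible loop)")


canonical, address, aliases = resolve("www.eveningread.com", RECORDS)
print("Name:     ", canonical)
print("Address:  ", address)
print("Aliases:  ", ", ".join(aliases))
```

The hop limit is there because a misconfigured zone can point two aliases at each other, and a lookup that never ends is its own kind of resolution failure.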

Overriding the name server is a fairly simple matter of appending it to a single-line nslookup query:
C:\> nslookup www.eveningread.com 8.8.8.8
Or interactively setting it in the session:
C:\>nslookup
Default Server:  google-public-dns-b.google.com
Address:  8.8.4.4


> server 8.8.8.8
Default Server:  google-public-dns-a.google.com
Address:  8.8.8.8
 
Searching for different record types is similar:
C:\> nslookup -querytype=mx eveningread.com
Or interactively setting it in the session:
C:\>nslookup
Default Server: google-public-dns-b.google.com
Address: 8.8.4.4


> set q=mx
> eveningread.com
Server:  google-public-dns-b.google.com
Address:  8.8.4.4

Non-authoritative answer:
eveningread.com MX preference = 10, mail exchanger = 91b72d3b13b94f9d8f11a10f3e76c6.pamx1.hotmail.com
Here we see the results of an MX record lookup.  (Don't try to email me here; this is configured for demonstration purposes only.  Email is a whole different topic.)

To perform these queries, DNS clients typically send their requests to UDP port 53 of a name server.  TCP port 53 is used as well, most commonly for zone transfers and for responses too large to fit in a UDP reply.
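
If you want to see what actually goes over the wire, the sketch below hand-builds a single-question DNS query in Python and, optionally, sends it to UDP port 53.  The function names, the fixed query ID, and the 3-second timeout are my own choices for illustration; the byte layout follows the standard DNS message format.

```python
import socket
import struct

# Numeric record type codes for the record types used in this post.
QTYPES = {"A": 1, "MX": 15, "SRV": 33}


def build_query(name, qtype="A", query_id=0x1234):
    """Encode a single-question DNS query as raw bytes."""
    # Header: ID, flags (0x0100 = recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE, then QCLASS = 1 (IN, the Internet class).
    return header + qname + struct.pack("!HH", QTYPES[qtype], 1)


def send_query(packet, server="8.8.8.8", timeout=3.0):
    """Send the query to UDP port 53 and return the raw response bytes."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(packet, (server, 53))
        response, _ = sock.recvfrom(512)  # classic size limit for DNS over UDP
        return response


packet = build_query("www.eveningread.com", "A")
print(len(packet), "bytes:", packet.hex())
```

Calling send_query(packet) requires network access and returns the undecoded reply; parsing the answer section is beyond this sketch, and a packet capture tool will decode it for you anyway.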

It's left as an exercise to the reader to leverage packet capture utilities to watch DNS traffic on their client workstation or DNS server.

Example Use Case: Active Directory Specific DNS Records

One of the most common issues I run into is workstations failing to authenticate with Active Directory (unless they're unplugged from the network, in which case they use cached credentials), or computers that outright cannot join a domain.

Active Directory Domain Services uses some unique records to locate a Domain Controller via the Netlogon process.  Details on these records are outlined here, in this ancient Windows 2000 Server documentation.

When we can't query Active Directory, I typically drop to a command prompt and run the following query (for the corp.contoso.com Active Directory forest, in this example):
C:\>nslookup -querytype=srv _ldap._tcp.dc._msdcs.corp.contoso.com
Default Server: dc1.corp.contoso.com
Address: 10.0.0.14
_ldap._tcp.dc._msdcs.corp.contoso.com SRV service location:
 priority = 0
 weight = 0
 port = 389
 svr hostname = dc1.corp.contoso.com
_ldap._tcp.dc._msdcs.corp.contoso.com SRV service location:
 priority = 0
 weight = 0
 port = 389
 svr hostname = dc2.corp.contoso.com
dc1.corp.contoso.com internet address = 10.0.0.14
dc2.corp.contoso.com internet address = 10.0.0.15
 
If I get anything other than the above (provided those are the correct IP addresses), I've got a resolution problem that's keeping Netlogon from finding a Domain Controller.  Usually, updating the client's DNS settings to query the appropriate name server does the trick at this point.
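
As a rough illustration of what a client does with that answer, here is a small Python sketch that builds the locator name and orders the returned records.  The record values are copied from the output above; the SrvRecord type and both helper functions are hypothetical names of mine, and the ordering is a simplification (real clients pick among equal-priority records randomly, in proportion to weight).

```python
from typing import NamedTuple


class SrvRecord(NamedTuple):
    priority: int
    weight: int
    port: int
    target: str


def dc_locator_name(dns_domain):
    """Build the SRV name queried above to find domain controllers."""
    return "_ldap._tcp.dc._msdcs." + dns_domain.strip(".").lower()


def preferred_order(records):
    """Lowest priority first; within a priority, higher weight first.

    Simplified: the target name is only used as a stable tie-breaker here.
    """
    return sorted(records, key=lambda r: (r.priority, -r.weight, r.target))


# Values copied from the nslookup output above.
answers = [
    SrvRecord(0, 0, 389, "dc2.corp.contoso.com"),
    SrvRecord(0, 0, 389, "dc1.corp.contoso.com"),
]

print(dc_locator_name("corp.contoso.com"))
for record in preferred_order(answers):
    print(record.target, record.port)
```

An empty answer list at this stage is the scripted equivalent of the failure described above: the client asked a name server that knows nothing about the domain's _msdcs records.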

Caveat Emptor

Another issue I commonly see pertains to client-side DNS caching.  Let's say you connect to a highly available database server, which fails over by updating a CNAME for its endpoint.  One moment your clients are connecting to 10.0.0.20; the next, database.corp.contoso.com points to 10.0.0.21.  But if your client application doesn't query database.corp.contoso.com a second time, it'll keep using the cached 10.0.0.20 address until the TTL expires, or worse, indefinitely!
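
The behavior is easy to reproduce with a toy cache.  The Python sketch below simulates the failover with a fake resolver and a fake clock; the TtlCache class, the 30-second TTL, and the helper names are all assumptions of mine for illustration, not the behavior of any specific client library.

```python
import time


class TtlCache:
    """Cache name -> address and re-resolve once the TTL has elapsed."""

    def __init__(self, resolver, clock=time.monotonic):
        self._resolver = resolver  # callable: name -> (address, ttl_seconds)
        self._clock = clock
        self._entries = {}         # name -> (address, expires_at)

    def lookup(self, name):
        now = self._clock()
        cached = self._entries.get(name)
        if cached and now < cached[1]:
            return cached[0]       # still fresh: no new query is made
        address, ttl = self._resolver(name)
        self._entries[name] = (address, now + ttl)
        return address


# Hypothetical resolver and clock simulating the failover described above.
current = {"database.corp.contoso.com": "10.0.0.20"}
fake_now = [0.0]


def fake_resolver(name):
    return current[name], 30       # assumed 30-second TTL


cache = TtlCache(fake_resolver, clock=lambda: fake_now[0])
print(cache.lookup("database.corp.contoso.com"))    # first query
current["database.corp.contoso.com"] = "10.0.0.21"  # failover happens
fake_now[0] = 10.0
print(cache.lookup("database.corp.contoso.com"))    # still inside the TTL
fake_now[0] = 31.0
print(cache.lookup("database.corp.contoso.com"))    # TTL expired, re-queried
```

An application that resolves the name once at startup and holds on to the address behaves like this cache with an infinite TTL, which is the "worse" case: only a restart or reconnect logic that re-resolves the name will pick up the new address.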

Wrap Up

So what's so important about resolution from a troubleshooting perspective?  Well, if it's not done right, you're talking to no one, talking to a server that won't answer for your application, or sending all your queries to a malicious website!  Even if you don't know when or why resolution is a part of your problem, knowing how to test it and remediate it is key to narrowing down your list of issues.