Monthly Archives: April 2017

Why Did Your Availability Group Creation Fail?

Availability Groups are a fantastic way to provide high availability and disaster recovery for your databases, but it isn’t exactly the easiest thing in the world to pull off correctly. To do it right there’s a lot of planning and effort that goes into your Availability Group topology. The funny thing about AGs is as hard as they are to plan…they’re pretty easy to implement…but sometimes things can go wrong. In this post I’m going to show you how to look into things when creating your AGs fails.

When working at a customer site today I encountered and error that I haven’t seen before when creating an Availability Group. So I’m going to walk you through what happened and how I fixed it. So if your AGs fail at creation, you can follow this process to dig into why.

First, let’s try to create our Availability Group

But, that fails and we get this error…it tells me what happened and to go look in the SQL Server error log for more details.

OK, so let’s look in the SQL Server error Log and see what we find.

Clearly something is up, the AG tried to come online but couldn’t.

The error here say check out the Windows Server Failover Clustering log…so let’s go ahead and do that. But that’s not as straightforward as you think. WSFC does write to the event log, but the errors are pretty generic for this issue. Here’s what you’ll see in the System Event Log and the Cluster Events section in the Failover Cluster Manager

Wow, that’s informative, right? Luckily we still have more information to look into.

Let’s dig deeper with using the WSFC cluster logs

The cluster logs need to be queried, they’re not readily available as text for us. We can write them out to file with this PowerShell cmdlet Get-ClusterLogs. Let’s make a directory and dump the logs into there.

Now we have some data to look through!

When we look at the contents of the cluster logs files generates by Get-ClusterLogs, we’re totally on the other side of the spectrum when it comes to information verbosity. The logs so far have been pretty terse and haven’t really told us about what’s causing the failure…well dig through this log and you’ll likely find your reason and a lot more information. Good stuff to look at to get an understanding of the internals of WSFCs. Now for the the reason my Availability Group creation failed was permissions. Check out the log entries.

Well that’s pretty clear about what’s going on…the process creating the AG couldn’t connect to SQL Server to run the very important sp_server_diagnostics stored procedure. A quick internet search to find a fix yielded this article from Mike Fal (b | t) which points to this Microsoft article detailing the issue and fix.

For those that don’t want to click the links here’s the code to adjust the permissions and allow your Availability Group to create.

So to review…here’s how we found our issue.

  1. Read the error the create script gives you
  2. Read the SQL Server error log
  3. Look at your System Event log
  4. Dump your Cluster Logs and review

Use this technique if you find yourself in a situation where your AG won’t come online or worse…fails over unexpectedly or won’t come back online.

New Pluralsight Course – LFCE: Network and Host Security

My new course “LFCE: Network and Host Security” in now available on Pluralsight here! If you want to learn about the course, check out the trailer here or if you want to dive right in check it out here!

This course targets IT professionals that design and maintain RHEL/CentOS based enterprises. It aligns with the Linux Foundation Certified System Administrator (LFCS) and Linux Foundation Certified Engineer (LFCE) and also Redhat’s RHCSA and RHCE certifications. The course can be used by both the IT pro learning new skills and the senior system administrator preparing for the certification exam

Let’s take your LINUX sysadmin skills to the next level and get you started on your LFCS/LFCE learning path.

If you’re in the SQL Server community and want to learn how Linux secure your Linux systems…this course is for you too! You have heard that Microsoft has SQL Server for Linux now, right, if not…read this!

The modules of the course are:

  • Linux Security Concept and Architectures – Introduction you into the fundamental concepts needed for securing your environment
  • Securing Hosts and Services – iptables and TCP Wrappers – Host based firewall concepts and techniques with iptables and TCP Wrappers
  • Securing Hosts and Services – firewalld – Learn leverage firewalld to develop more complex firewalls systems…simply. Including concepts such zones, service, ports and NAT
  • Remote Access – OpenSSH – We’ll look at encryption, authentication and how to configure SSH for public authentication
  • Remote Access – Tools and Techniques  – SSH is more than just remote access, we’ll look at secure copy, tunneling and how to use windowing systems such as X11 and VNC…securely.

Pluralsight Redhat Linux

Check out the course at Pluralsight!