A Novel Idea for High Availability in SQL Server on Linux

Over the past year we’ve learned how SQL Server on Linux is implemented using SQLPAL, and the team is pretty confident in its architectural decisions, as indicated in this post here.

This wrapper around SQL Server really opens up some interesting opportunities… perhaps we can leverage SQLPAL to facilitate some new high availability techniques.

When I was in graduate school, I worked on a research project that became my master’s thesis. In this work, I developed a technique that synchronized the process address space of a virtual machine across two separate physical hypervisors. The technique involved an initial copy of all pages between the two systems and then selectively copying the virtual machine’s pages as they became dirty. This significantly reduced the amount of information that had to be replicated between the hypervisors, but more importantly… it kept the virtual machine’s memory in sync, which meant that if the hypervisor hosting the virtual machine crashed, we could theoretically start the virtual machine on the second hypervisor.
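To make the idea concrete, here is a minimal sketch in Python of the "copy everything once, then copy only dirty pages" approach. It is a toy model, not the code from the thesis: pages are plain byte arrays, the dirty set is tracked by a write wrapper rather than by the hypervisor, and every name in it (PrimaryMemory, Replica, initial_copy, sync_dirty) is made up for illustration.

```python
class PrimaryMemory:
    """Toy stand-in for a VM's address space: a list of fixed-size pages."""

    def __init__(self, num_pages, page_size=4096):
        self.page_size = page_size
        self.pages = [bytearray(page_size) for _ in range(num_pages)]
        self.dirty = set()  # page numbers written since the last sync

    def write(self, page_no, offset, data):
        if offset < 0 or offset + len(data) > self.page_size:
            raise ValueError("write crosses the page boundary")
        self.pages[page_no][offset:offset + len(data)] = data
        self.dirty.add(page_no)  # a real hypervisor would use write protection


class Replica:
    """Toy stand-in for the second hypervisor's copy of the pages."""

    def __init__(self):
        self.pages = {}

    def receive(self, page_no, content):
        self.pages[page_no] = bytes(content)


def initial_copy(primary, replica):
    """Phase 1: send every page once. Returns the number of pages sent."""
    for page_no, page in enumerate(primary.pages):
        replica.receive(page_no, page)
    primary.dirty.clear()
    return len(primary.pages)


def sync_dirty(primary, replica):
    """Phase 2: send only the pages written since the last sync."""
    sent = 0
    for page_no in sorted(primary.dirty):
        replica.receive(page_no, primary.pages[page_no])
        sent += 1
    primary.dirty.clear()
    return sent


def in_sync(primary, replica):
    return all(replica.pages.get(i) == bytes(p)
               for i, p in enumerate(primary.pages))
```

In this sketch, two writes to the same page between syncs cost only one page transfer, which is where the reduction in replicated data comes from. A real implementation also has to cope with pages being written while the copy is in flight, which this single-threaded toy ignores.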

Now, during my PASS Summit talk this year, I presented to the audience my theory that SQLPAL is virtualization. But it’s not machine virtualization, it’s process virtualization, which means there’s a purpose-built environment hosting the SQL Server process. This environment, SQLPAL, is the main allocator of resources from the physical system. It’s the thing that asks the underlying operating system for memory, disk, network, and anything else that’s needed.

Now, what if we took these two ideas and brought them together? What if SQLPAL was able to synchronize the program state and resources between two separate systems? Could we provide highly available SQL Server services with a technique like this? I think we can. Perhaps we don’t even synchronize the pages between the systems. Perhaps an even lighter technique could be used, such as duplicating the system calls between the two copies of SQL Server and thus implicitly synchronizing the program state.
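One way to read "duplicating the system calls" is that the primary records the result of every call it makes to the host, and the second copy replays those results instead of asking its own operating system. The Python sketch below illustrates that reading only. It says nothing about how SQLPAL actually works, and RecordingHost, ReplayingHost, and Engine are hypothetical names invented for the example.

```python
import random
import time


class RecordingHost:
    """Primary side: performs the real calls and logs each result."""

    def __init__(self):
        self.log = []

    def call(self, name):
        if name == "clock":
            result = time.time()
        elif name == "random":
            result = random.random()
        else:
            raise ValueError(f"unknown call: {name}")
        self.log.append((name, result))
        return result


class ReplayingHost:
    """Replica side: returns the logged results instead of asking the OS."""

    def __init__(self, log):
        self._entries = iter(log)

    def call(self, name):
        logged_name, result = next(self._entries)
        if logged_name != name:
            raise RuntimeError(f"diverged: expected {logged_name}, got {name}")
        return result


class Engine:
    """Stand-in for the hosted process: its state depends only on call results."""

    def __init__(self, host):
        self.host = host
        self.state = []

    def step(self):
        stamp = self.host.call("clock")
        token = self.host.call("random")
        self.state.append((stamp, token))


# Usage: run the primary, then feed its log to a second copy.
host = RecordingHost()
primary = Engine(host)
for _ in range(3):
    primary.step()

replica = Engine(ReplayingHost(host.log))
for _ in range(3):
    replica.step()
```

After the replay, the two engines hold identical state even though the replica never read its own clock or random source. This only holds if every source of nondeterminism goes through the host layer, which is exactly the property a process virtualization layer would need to guarantee.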

Think about the possibilities… we could have a system that fails over with all the context of the currently active system: active connections could stay active, the buffer pool would stay populated, and the plan cache could still exist and not have to be rebuilt. Yes, we’ll likely need some sort of low latency, high bandwidth interconnect… but we have those. And there are certainly more implementation details that need to be thought through… but I think there’s something here.

A couple questions I thought of while writing this…

1. Does this provide more value than Availability Groups? I think so…program state remains in sync between the two systems. So things like user connections could be maintained during failover (with the appropriate relocation of the IP of course). I also think the quorum model would be simpler, as there is only one pair in the synchronization.

2. Does this provide more value than virtual machine migration? Perhaps. This technique could be hypervisor independent.

I’d love to hear your thoughts on this! Most of all I want you to start thinking about new ways we can leverage SQLPAL and its abstraction from hardware.

2 thoughts on “A Novel Idea for High Availability in SQL Server on Linux”

  1. Lonny Niederstadt

    Hello!
    I’ll sidestep your questions about comparison to other reliability/HA measures for now :-)

    But I wanted to mention that the main idea described here is the virtual equivalent of what has been available in physical fault tolerant servers from Stratus and NEC for years. These fault tolerant servers ran Linux distros and certain versions of Windows Server. Some special Windows update packages were even prepared once upon a time to be compatible with splitting the fault tolerant server into two independent images then re-syncing after one of the images was updated (!). That’s really tough – one of the reasons it wasn’t done for many updates and probably also one of the reasons it’s not widely known.

    Now… Stratus even did some work running virtual servers on top of these fault tolerant servers. Here’s a write-up.
    https://www.stratus.com/assets/PrincipledTechnologies_FullTestReport.pdf

    Going one step forward, Stratus also has everRun, which includes the option of fault tolerant VMs.
    https://www.stratus.com/solutions/platforms/everrun/

    Now step backward a pace. VMware has supported fault tolerant VMs for quite a while.

    But in all of these cases, as you mentioned, the network/interface to maintain lockstep has to be wide enough and fast enough, otherwise primary service from the system *has* to slow down to accommodate lockstep.

    This is a huge reason that even the Stratus & NEC fault tolerant physical servers had a limited number of sockets/NUMA nodes (iirc they never went above 2 sockets/NUMA nodes) and also the reason VMware fault tolerant VMs had a pretty restrictive limit on the number of vCPUs in an FT VM (iirc that still hasn’t gotten farther than 4).

    1. Anthony Nocentino

      Awesome, thanks for the feedback Lonny! When I was working on that research project in 2008-2009 VMware’s technique didn’t exist yet. I think it came out in 2009-2010. There’s a reason I changed my research direction that year, you might guess what that is :) I’m going to look more closely into the links you shared, thank you!
