Tag Archives: Containers

Persisting SQL Server Data in Docker Containers – Part 1

What’s the number one thing a data professional wants to do with their data? Keep it around. Let’s talk about running SQL Server in containers using Docker Volumes on a Mac.

This is the first post in a three-part series on Persisting SQL Server Data in Docker Containers. The second post on where Docker actually stores your data is here. And the third post on mapping base OS directories directly into containers is here.

The Need for Data Persistency in Containers

A container image is read-only. When an application changes data inside a running container, the writes go to a writable layer. The container runtime brings the writable layer and the read-only container image together and presents them to the processes running inside the container as a single file system.

The primary issue with this is that the writeable layer has the lifecycle of the container. If you delete the container, you delete the writeable layer and any data that was in there. Luckily Docker containers give us a way to decouple the container and its data.

In Figure 1, you can see a container image and its writeable layer. The application inside the container sees this as a single file system. If we delete this container, any data written to the writeable layer will be deleted too.


Figure 1: A container and its writable layer

Docker Volumes

A Docker Volume is a Docker managed resource that is mapped into a defined point in the filesystem inside the container. The primary benefit of using Docker Volumes is that they have a lifecycle that’s independent of a container. This enables you to decouple your application from its state to the point where you can simply throw away the container, replace it with a new container image, start up your application and point it to your data.

In Figure 2, we have a container, a writeable layer and a volume. A container will always have a writeable layer even when a Volume is defined. A Volume will be mounted at a specific location in the file system inside the container and writes to that location will be written to the Volume. Writes to other parts of the file system will be written to the writable layer. 


Figure 2: A container, its writable layer and a Volume
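
Because a Volume has its own lifecycle, it can also be created and inspected on its own, before any container exists. Here is a minimal sketch of that idea using the standard docker volume subcommands. The helper name ensure_volume is hypothetical and only there for illustration.

```shell
# ensure_volume NAME: create the named Docker volume only if it does not exist yet.
# (The helper name is illustrative; docker volume inspect/create are standard CLI.)
ensure_volume() {
    # docker volume inspect exits non-zero when the volume is missing.
    if docker volume inspect "$1" >/dev/null 2>&1; then
        echo "volume $1 already exists"
    else
        docker volume create "$1" >/dev/null && echo "volume $1 created"
    fi
}

# Usage: ensure_volume sqldata1
```

Creating the Volume ahead of time is optional. Naming a Volume that doesn't exist yet in docker run -v will create it for you.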

SQL Server using Docker Volumes

Let’s talk about how we can use Docker Volumes and SQL Server to persist data. If we want to run SQL Server in a container we will want to decouple our data from the container itself. Doing so will enable us to delete the container, replace it and start up a new one pointing at our existing data. When running in a container, SQL Server stores its data in /var/opt/mssql by default. When the container starts up for the first time it will put the system databases in that location and any user databases created will also be placed at this location by default.

Now, if we don’t use a Volume that data will be written into the writeable layer of the container and if we delete the container…we delete our data. We don’t want that so let’s start up a container with a Volume. To do so we use the -v option when we use the docker run command. 

docker run \
    --name 'sql17' \
    -e 'ACCEPT_EULA=Y' -e "SA_PASSWORD=$PASSWORD" \
    -p 1433:1433 \
    -v sqldata1:/var/opt/mssql \
    -d mcr.microsoft.com/mssql/server:2017-latest

In the code above you can see -v sqldata1:/var/opt/mssql specified as part of the docker run command. This creates a Docker Volume named sqldata1 and maps it inside the container to /var/opt/mssql. Now, during this container’s start up, SQL Server will write its data to /var/opt/mssql, which is actually written to the Volume. If we delete this container and replace it…when SQL Server starts up it will see the master database and proceed initializing the system as defined in master. If there are any user databases defined in master and they’re accessible they will be brought online too. Let’s try it out…first up let’s create a user database and query the file information about the databases in this container.

sqlcmd -S localhost,1433 -U sa -Q 'CREATE DATABASE TestDB1' -P $PASSWORD

sqlcmd -S localhost,1433 -U sa -Q 'SELECT name, physical_name from sys.master_files' -P $PASSWORD -W

name physical_name
---- -------------
master /var/opt/mssql/data/master.mdf
mastlog /var/opt/mssql/data/mastlog.ldf
tempdev /var/opt/mssql/data/tempdb.mdf
templog /var/opt/mssql/data/templog.ldf
modeldev /var/opt/mssql/data/model.mdf
modellog /var/opt/mssql/data/modellog.ldf
MSDBData /var/opt/mssql/data/MSDBData.mdf
MSDBLog /var/opt/mssql/data/MSDBLog.ldf
TestDB1 /var/opt/mssql/data/TestDB1.mdf
TestDB1_log /var/opt/mssql/data/TestDB1_log.ldf

In the code above we create a database, then query master for the information about the databases running inside this container. You can see all of the paths are under /var/opt/mssql, which is our Volume.
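
To double-check that /var/opt/mssql really is backed by the Volume rather than the writable layer, one option is to ask Docker for the container’s mounts. This is a small sketch using docker inspect with a Go template. The show_mounts name is just for illustration, and it assumes the container is named sql17 as above.

```shell
# show_mounts CONTAINER: print "<type> <volume name> -> <destination>" for each mount.
# (Function name is illustrative; .Mounts and its fields come from docker inspect.)
show_mounts() {
    docker inspect \
        --format '{{range .Mounts}}{{.Type}} {{.Name}} -> {{.Destination}}{{"\n"}}{{end}}' \
        "$1"
}

# Usage: show_mounts sql17
```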

Container and Data Independence

The -v option created a Docker managed Volume that is independent of the container, so our data will live in there and we can service the container independently of the Volume. So let’s do that…let’s delete the container and start up a new container and let’s go so far as to use a 2019 container to upgrade SQL Server…that’s cool!

docker stop sql17
docker rm sql17

The code above will stop and then delete the container. When the container is deleted, so is its writeable layer. But we are storing our data in the Volume and that still exists.

docker volume ls
local               sqldata1

Above we can see there is a Volume using the local driver and its name is sqldata1…this still exists and can be mounted by new containers. The local driver is used to map directories from the base OS inside the container. There are other types of drivers that expose other types of storage into the container. More on this later.

docker run \
    --name 'sql19' \
    -e 'ACCEPT_EULA=Y' -e "SA_PASSWORD=$PASSWORD" \
    -p 1433:1433 \
    -v sqldata1:/var/opt/mssql \
    -d mcr.microsoft.com/mssql/server:2019-latest

With this code, we start up a new container and tell it to use the same Volume and mount it into /var/opt/mssql. So when SQL Server starts it finds the master database, master has the metadata about any configuration and user databases and we get back into the state we were previously in. Let’s ask SQL Server for a list of databases.

sqlcmd -S localhost,1433 -U sa -Q 'SELECT name, physical_name from sys.master_files' -P $PASSWORD -W

name physical_name
---- -------------
master /var/opt/mssql/data/master.mdf
mastlog /var/opt/mssql/data/mastlog.ldf
tempdev /var/opt/mssql/data/tempdb.mdf
templog /var/opt/mssql/data/templog.ldf
modeldev /var/opt/mssql/data/model.mdf
modellog /var/opt/mssql/data/modellog.ldf
MSDBData /var/opt/mssql/data/MSDBData.mdf
MSDBLog /var/opt/mssql/data/MSDBLog.ldf
TestDB1 /var/opt/mssql/data/TestDB1.mdf
TestDB1_log /var/opt/mssql/data/TestDB1_log.ldf

…and there you can see in the output above, SQL Server is in the state it was in the initial running of the container on the 2017 image. Now we’re on the 2019 image and have access to all of our persisted data independent of the container image. 

sqlcmd -S localhost,1433 -U sa -Q 'SELECT @@VERSION' -P $PASSWORD
Microsoft SQL Server 2019 (RC1) - 15.0.1900.25 (X64)
        Aug 16 2019 14:20:53
        Copyright (C) 2019 Microsoft Corporation
        Developer Edition (64-bit) on Linux (Ubuntu 16.04.6 LTS)                                                                                                                       
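
One practical wrinkle when swapping containers like this: SQL Server needs a little time after docker run before it accepts logins, and likely longer when it is upgrading databases from an older version. If you script these steps, a simple retry loop helps. This is only a sketch. It assumes sqlcmd is installed and $PASSWORD holds the sa password as in the commands above, and the function name and retry counts are made up for illustration.

```shell
# wait_for_sql [ATTEMPTS]: poll until SQL Server accepts a login, or give up.
# Assumes sqlcmd is on the PATH and $PASSWORD holds the sa password.
# The function name and the default of 30 attempts are illustrative.
wait_for_sql() {
    attempts="${1:-30}"
    i=1
    while [ "$i" -le "$attempts" ]; do
        # -l 2 sets a 2-second login timeout; -b makes sqlcmd exit non-zero on error.
        if sqlcmd -S localhost,1433 -U sa -P "$PASSWORD" -l 2 -b -Q 'SELECT 1' >/dev/null 2>&1; then
            echo "SQL Server is ready"
            return 0
        fi
        sleep 2
        i=$((i + 1))
    done
    echo "SQL Server did not become ready after $attempts attempts" >&2
    return 1
}

# Usage: wait_for_sql && sqlcmd -S localhost,1433 -U sa -P "$PASSWORD" -Q 'SELECT @@VERSION'
```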

Containers have replaced virtual machines for me and the decoupling of data and computation will have a significant impact on how we design data platforms and systems going forward. In this post, I wanted to highlight how you can use a container with persistent state systems like SQL Server. In the next post, I’m going to show you where that data actually lives on the underlying Operating System.

Speaking at SQL Saturday Dallas

Speaking at SQLSaturday Dallas!

I’m proud to announce that I will be speaking at SQL Saturday Dallas on May 17th 2018! This one won’t let you down! Check out the amazing schedule!

If you don’t know what SQLSaturday is, it’s a whole day of free SQL Server training available to you at no cost!

If you haven’t been to a SQLSaturday, what are you waiting for! Sign up now!

My presentation is “Practical Container Scenarios in Azure”


Here’s the abstract for the talk

You’ve heard the buzz about containers and Kubernetes, now let’s start your journey towards rapidly deploying and scaling your container-based applications in Azure. In this session, we will introduce containers and the container orchestrator Kubernetes. Then we’ll dive into how to build a container image, push it into our Azure Container Registry and deploy it to our Azure Kubernetes Services cluster. Once deployed, we’ll learn how to keep our applications available and how to scale them using Kubernetes.

Key topics introduced
Building Container based applications
Publishing containers to Azure Container Registry
Deploying Azure Kubernetes Services Clusters
Scaling our container-based applications in Azure Kubernetes Services

Speaking at SQLSaturday Atlanta – 845

Speaking at SQLSaturday Atlanta!

I’m proud to announce that I will be speaking at SQL Saturday Atlanta on May 17th 2018! This one won’t let you down! Check out the amazing schedule!

If you don’t know what SQLSaturday is, it’s a whole day of free SQL Server training available to you at no cost!

If you haven’t been to a SQLSaturday, what are you waiting for! Sign up now!

My presentation is “Containers – You Better Get on Board!”

SQLSaturday #845 - Atlanta 2019

Here’s the abstract for the talk

Containers are taking over, changing the way systems are developed and deployed…and that’s NOT hyperbole. Just imagine if you could deploy SQL Server or even your whole application stack in just minutes. You can do that, leveraging containers! In this session, we’ll get you started on your container journey learning container fundamentals in Docker, then look at some common container scenarios and introduce deployment automation with Kubernetes.

In this session we’ll look at
Container Fundamentals with Docker
Common Container Scenarios
Automation with Kubernetes

Prerequisites: Operating system concepts such as command line use and basic networking skills.

Data Persistency and Advanced SQL Server Disk Topologies in Kubernetes

When working with SQL Server in containers and Kubernetes, storage is a key concept. In this post, we’re going to walk through how to deploy SQL Server in Kubernetes with Persistent Volumes for the system and user databases.

One of the key principles of Kubernetes is the ephemerality of Pods. No Pod is ever redeployed; a completely new Pod is created. If a Pod dies, for whatever reason, a new Pod is created in its place and there is no continuity in the state of that Pod. The newly created Pod will go back to the initial state of the container image defined in the Pod’s spec. This is very valuable for stateless workloads, not so much for stateful workloads like SQL Server.

This means that for a stateful workload like SQL Server we need to store both configuration and data externally from the Pod to maintain state through the recreation of a Pod. Kubernetes gives us two constructs to do that: environment variables and Persistent Volumes.

Using Environment Variables for Container Configuration

Container-based applications use environment variables for configuration at startup. The SQL Server container has a collection of environment variables that can be used to configure it at container startup. We will leverage two of those in this configuration, MSSQL_DATA_DIR and MSSQL_LOG_DIR. These allow us to define file system locations for user database data and log files. When the SQL Server container is started inside the Pod, it reads the environment variables at runtime and sets its configuration based on those values. We define these variables as part of the Pod Spec. We will cover that configuration below.

Using Persistent Volumes to Maintain Database State

To persist the state of our SQL Server container, we will configure SQL Server to store its data and log files for both user and system databases on Persistent Volumes.

First, let’s review how SQL Server in a container starts up. During the initial startup, the SQL Server process checks to see if there are any system databases in the default system file location, which is /var/opt/mssql/data. If there are none, the system databases are copied there; if they are there, no action is taken.

To add persistence to the system databases, and really all of the other components of SQL Server such as the Error Log and other system files, we will configure /var/opt/mssql so that it is backed by a Persistent Volume.

By placing the system databases on a Persistent Volume, when a Pod is recreated the Persistent Volumes are attached and mounted in the same location. When the SQL Server process starts up it sees the system databases and has what it needs to maintain state between Pod creations.

If there are records for user databases in the system databases, SQL Server will start the process of bringing those databases online as well. The default location for user databases is /var/opt/mssql/data, but we are going to override that with an environment variable for each of the data and log directories, placing each on a dedicated Persistent Volume.

Let’s walk through that configuration together. 

Persistent Volume Claims

In this configuration, we will use dynamic storage provisioning. In dynamic provisioning, a Persistent Volume Claim (PVC) is used to request a Persistent Volume (PV) from a Storage Class. In this case, we’ll be using AKS’s managed-premium Storage Class. 
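
Since the PVCs below depend on that Storage Class existing in your cluster, it can be worth checking for it first. This is a minimal sketch, assuming kubectl is already pointed at the cluster. The require_storage_class name is hypothetical.

```shell
# require_storage_class NAME: fail early if the cluster has no such StorageClass.
# (Function name is illustrative; kubectl get storageclass is standard.)
require_storage_class() {
    if kubectl get storageclass "$1" >/dev/null 2>&1; then
        echo "storage class $1 is available"
    else
        echo "storage class $1 not found; list the options with: kubectl get storageclass" >&2
        return 1
    fi
}

# Usage: require_storage_class managed-premium
```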

Here we define three PVCs, one for each place we want a Persistent Volume: the system files and databases, the user database data files and the user database log files.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: "pvc-sql-data"
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: managed-premium
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: "pvc-sql-system"
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: managed-premium
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: "pvc-sql-log"
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: managed-premium
  resources:
    requests:
      storage: 10Gi


In the Pod spec for our Deployment, we want to define several elements to support this configuration. 

  • volumes – define volumes that can be mounted by this Pod. In this case, we’re creating and naming three volumes, backed by the PVCs defined above.
  • volumeMounts – volumes mounted into the container and their mountPath. This maps the named Volumes to locations in the filesystem in the container.
  • env – due to the ephemerality of the container in the Pod, we need to tell SQL Server at start up that the data and log files will be stored in a specified directory. We are leaving the system databases and files in the default location, which is /var/opt/mssql.

The net effect of this storage configuration is that we are mapping the Persistent Volumes into a particular location in the filesystem inside the container.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: mssql-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mssql
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: mssql
    spec:
      containers:
      - name: mssql
        image: 'mcr.microsoft.com/mssql/server:2017-latest'
        ports:
        - containerPort: 1433
        env:
        - name: ACCEPT_EULA
          value: 'Y'
        - name: MSSQL_DATA_DIR
          value: '/data'
        - name: MSSQL_LOG_DIR
          value: '/log'
        - name: SA_PASSWORD
          value: 'S0methingS@Str0ng!'
        volumeMounts:
        - name: mssql-system
          mountPath: /var/opt/mssql
        - name: mssql-data
          mountPath: /data
        - name: mssql-log
          mountPath: /log
      volumes:
      - name: mssql-system
        persistentVolumeClaim:
          claimName: pvc-sql-system
      - name: mssql-data
        persistentVolumeClaim:
          claimName: pvc-sql-data
      - name: mssql-log
        persistentVolumeClaim:
          claimName: pvc-sql-log


We’ll front end our SQL Server with a public IP address and a load balancer. 

apiVersion: v1
kind: Service
metadata:
  name: mssql-deployment
spec:
  selector:
    app: mssql
  ports:
    - protocol: TCP
      port: 31433
      targetPort: 1433
  type: LoadBalancer

Apply the Configuration

Save the code above into a YAML file and deploy it to your Kubernetes cluster.

kubectl apply -f deployment-advanced-disk.yaml

You’ll get this output

persistentvolumeclaim/pvc-sql-data created
persistentvolumeclaim/pvc-sql-system created
persistentvolumeclaim/pvc-sql-log created
deployment.apps/mssql-deployment created
service/mssql-deployment created
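
The "created" messages only mean the objects were accepted by the API server, not that SQL Server is up yet. If you want your script to block until the Deployment has rolled out, something like the following should do it. This is a sketch. The wait_for_deployment wrapper and the 300s default are my own choices for illustration.

```shell
# wait_for_deployment NAME [TIMEOUT]: block until the Deployment's rollout completes.
# (Wrapper name and default timeout are illustrative; kubectl rollout status is standard.)
wait_for_deployment() {
    kubectl rollout status "deployment/$1" --timeout="${2:-300s}"
}

# Usage: wait_for_deployment mssql-deployment
```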

Confirm the configuration

We can use kubectl get pv to list the Persistent Volumes (PV) dynamically allocated by our cluster. Here there are three Persistent Volumes. The key here is the status is Bound, which means they are bound to a PVC. I also want to point out the Reclaim Policy is Delete. This means if the PVC is deleted, the PV will be deleted at a cleanup interval sometime in the future. 

kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                    STORAGECLASS      REASON   AGE
pvc-e0b418ef-6e69-11e9-a433-f659caf6a6f5   10Gi       RWO            Delete           Bound    default/pvc-sql-data     managed-premium            11m
pvc-e0cf2345-6e69-11e9-a433-f659caf6a6f5   10Gi       RWO            Delete           Bound    default/pvc-sql-system   managed-premium            11m
pvc-e0ea01a8-6e69-11e9-a433-f659caf6a6f5   10Gi       RWO            Delete           Bound    default/pvc-sql-log      managed-premium            11m

With kubectl get pvc we get a list of the PVCs in our configuration, one for each we defined above. The key here is that the status is Bound, meaning each is bound to a PV.

kubectl get pvc
NAME             STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
pvc-sql-data     Bound    pvc-e0b418ef-6e69-11e9-a433-f659caf6a6f5   10Gi       RWO            managed-premium   12m
pvc-sql-log      Bound    pvc-e0ea01a8-6e69-11e9-a433-f659caf6a6f5   10Gi       RWO            managed-premium   12m
pvc-sql-system   Bound    pvc-e0cf2345-6e69-11e9-a433-f659caf6a6f5   10Gi       RWO            managed-premium   12m 
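
Eyeballing the STATUS column works fine for three claims. If you’d rather check it in a script, here is one possible sketch that uses kubectl’s jsonpath output and a little awk. Both function names are hypothetical.

```shell
# list_pvc_phases: print "<name> <phase>" for every PVC in the current namespace.
list_pvc_phases() {
    kubectl get pvc \
        -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.phase}{"\n"}{end}'
}

# unbound_pvcs: read "<name> <phase>" lines on stdin, print names that are not Bound.
unbound_pvcs() {
    awk '$2 != "Bound" { print $1 }'
}

# Usage: list_pvc_phases | unbound_pvcs    (empty output means every claim is Bound)
```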

Now let’s use kubectl describe pods to get the deep dive info about our storage configuration and how it’s mapped into the Pod. 

There are four key places in the output below I want to point you to:

  • Containers: mssql: Environment: you’ll find the two environment variables set for the data and log directories. Configured as /data and /log
  • Mounts: we see the file system location inside the container and the name of the Volumes defined in the Pod Spec
  • Volumes: we see the name of the Volumes, their type, claim name and the read/write status.
  • Events: this is a log of the events for the creation of this Pod. Key here is that sometimes the container will come up prior to the storage being available to the Pod. That’s what the error below is, but it clears itself up and the container is able to start.

kubectl describe pods
Name:               mssql-deployment-df4cf5c4c-nf8lf
Namespace:          default
Priority:           0
Node:               aks-nodepool1-89481420-2/
Start Time:         Sat, 04 May 2019 07:41:59 -0500
Labels:             app=mssql
Status:             Running
Controlled By:      ReplicaSet/mssql-deployment-df4cf5c4c
Containers:
  mssql:
    Container ID:   docker://f2320ae8f94c24fbb04214b903b4a218b82e9548f8d88a95daa7e207eeaa42b4
    Image:          mcr.microsoft.com/mssql/server:2017-latest
    Image ID:       docker-pullable://mcr.microsoft.com/mssql/server@sha256:39554141d307f2d40d2abfc54e3a0eea3aa527e58f616496c6f3ed3245a2e2b1
    Port:           1433/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Sat, 04 May 2019 07:44:21 -0500
    Ready:          True
    Restart Count:  0
    Environment:
      ACCEPT_EULA:                   Y
      MSSQL_DATA_DIR:                /data
      MSSQL_LOG_DIR:                 /log
      SA_PASSWORD:                   S0methingS@Str0ng!
      KUBERNETES_PORT_443_TCP_ADDR:  cscluster-kubernetes-cloud-fd0c5e-8bca8b54.hcp.centralus.azmk8s.io
      KUBERNETES_PORT:               tcp://cscluster-kubernetes-cloud-fd0c5e-8bca8b54.hcp.centralus.azmk8s.io:443
      KUBERNETES_PORT_443_TCP:       tcp://cscluster-kubernetes-cloud-fd0c5e-8bca8b54.hcp.centralus.azmk8s.io:443
      KUBERNETES_SERVICE_HOST:       cscluster-kubernetes-cloud-fd0c5e-8bca8b54.hcp.centralus.azmk8s.io
    Mounts:
      /data from mssql-data (rw)
      /log from mssql-log (rw)
      /var/opt/mssql from mssql-system (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-z9sbf (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  mssql-system:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  pvc-sql-system
    ReadOnly:   false
  mssql-data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  pvc-sql-data
    ReadOnly:   false
  mssql-log:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  pvc-sql-log
    ReadOnly:   false
  default-token-z9sbf:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-z9sbf
    Optional:    false
QoS Class:       BestEffort
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason                  Age   From                               Message
  ----     ------                  ----  ----                               -------
  Normal   Scheduled               13m   default-scheduler                  Successfully assigned default/mssql-deployment-df4cf5c4c-nf8lf to aks-nodepool1-89481420-2
  Normal   SuccessfulAttachVolume  13m   attachdetach-controller            AttachVolume.Attach succeeded for volume "pvc-e0ea01a8-6e69-11e9-a433-f659caf6a6f5"
  Normal   SuccessfulAttachVolume  12m   attachdetach-controller            AttachVolume.Attach succeeded for volume "pvc-e0cf2345-6e69-11e9-a433-f659caf6a6f5"
  Normal   SuccessfulAttachVolume  12m   attachdetach-controller            AttachVolume.Attach succeeded for volume "pvc-e0b418ef-6e69-11e9-a433-f659caf6a6f5"
  Warning  FailedMount             11m   kubelet, aks-nodepool1-89481420-2  Unable to mount volumes for pod "mssql-deployment-df4cf5c4c-nf8lf_default(027c46f7-6e6a-11e9-a433-f659caf6a6f5)": timeout expired waiting for volumes to attach or mount for pod "default"/"mssql-deployment-df4cf5c4c-nf8lf". list of unmounted volumes=[mssql-system mssql-data]. list of unattached volumes=[mssql-system mssql-data mssql-log default-token-z9sbf]
  Normal   Pulled                  11m   kubelet, aks-nodepool1-89481420-2  Container image "mcr.microsoft.com/mssql/server:2017-latest" already present on machine
  Normal   Created                 11m   kubelet, aks-nodepool1-89481420-2  Created container
  Normal   Started                 11m   kubelet, aks-nodepool1-89481420-2  Started container

Creating a Database and Verifying File Location

With this code, we’ll get our IP address for our SQL Server service then we’ll create a database and query master_files for a list of data files. Notice I’m using service port 31433, which is what we defined when creating our service in the earlier step.

SVCIP=$(kubectl get svc mssql-deployment | grep mssql-deployment |  awk '{print $4}')
sqlcmd -S $SVCIP,31433 -U sa -Q 'CREATE DATABASE TestDB1' -P $PASSWORD
sqlcmd -S $SVCIP,31433 -U sa -Q 'SELECT name,physical_name from sys.master_files' -P $PASSWORD

And we’ll get this output. You can see all of the system databases are backed by /var/opt/mssql, our user database is on /data and its log is on /log. All are backed by Persistent Volumes.

master        /var/opt/mssql/data/master.mdf
mastlog       /var/opt/mssql/data/mastlog.ldf
tempdev       /var/opt/mssql/data/tempdb.mdf
templog       /var/opt/mssql/data/templog.ldf
modeldev      /var/opt/mssql/data/model.mdf
modellog      /var/opt/mssql/data/modellog.ldf
MSDBData      /var/opt/mssql/data/MSDBData.mdf
MSDBLog       /var/opt/mssql/data/MSDBLog.ldf
TestDB1       /data/TestDB1.mdf
TestDB1_log   /log/TestDB1_log.ldf
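
A side note on the SVCIP line used above: the grep and awk pipeline depends on the column layout of kubectl get svc. A possible alternative is to ask for the field directly with jsonpath. This is a sketch, and the service_ip function name is just for illustration.

```shell
# service_ip NAME: print the external IP of a LoadBalancer Service.
# While the cloud load balancer is still being provisioned there is no IP to print yet.
service_ip() {
    kubectl get svc "$1" -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
}

# Usage: SVCIP=$(service_ip mssql-deployment)
```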

Confirming Persistency

Let’s go ahead and delete our Pod to confirm that when it’s recreated by our Deployment our data is still there. 

kubectl get pods
NAME                               READY   STATUS    RESTARTS   AGE
mssql-deployment-df4cf5c4c-nf8lf   1/1     Running   0          4d2h

kubectl delete pod mssql-deployment-df4cf5c4c-nf8lf 
pod "mssql-deployment-df4cf5c4c-nf8lf" deleted

Once the Pod is recreated, let’s query master_files to see where our databases are located. You’ll find that the database created in the previous step persisted between Pod creations.

sqlcmd -S $SVCIP,31433 -U sa -Q 'SELECT name,physical_name from sys.master_files' -P $PASSWORD

master        /var/opt/mssql/data/master.mdf
mastlog       /var/opt/mssql/data/mastlog.ldf
tempdev       /var/opt/mssql/data/tempdb.mdf
templog       /var/opt/mssql/data/templog.ldf
modeldev      /var/opt/mssql/data/model.mdf
modellog      /var/opt/mssql/data/modellog.ldf
MSDBData      /var/opt/mssql/data/MSDBData.mdf
MSDBLog       /var/opt/mssql/data/MSDBLog.ldf
TestDB1       /data/TestDB1.mdf
TestDB1_log   /log/TestDB1_log.ldf
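
The "once the Pod is recreated" step above can be made explicit if you’re scripting this test. Here is a minimal sketch using kubectl wait with the app=mssql label from our Deployment. The wrapper name and default timeout are assumptions on my part, and kubectl wait will report an error if no matching Pod exists yet at the moment you run it.

```shell
# wait_for_sql_pod [TIMEOUT]: block until the Pod(s) labelled app=mssql report Ready.
# (Wrapper name and default timeout are illustrative; kubectl wait is standard.)
wait_for_sql_pod() {
    kubectl wait --for=condition=Ready pod -l app=mssql --timeout="${1:-300s}"
}

# Usage: wait_for_sql_pod 120s
```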

Using PowerShell in Containers

The vision for PowerShell Core is to be able to run PowerShell anywhere. In this article, I’m going to discuss how you can use Docker Containers to enable just that. We’ll look at running PowerShell in a container, running cmdlets, running different versions of PowerShell at the same time, and also how to build our own “serverless” computing platform.

Let’s address a few reasons why you would want to run PowerShell in a container.

  • Speed and agility – this for me is probably the number one reason to run PowerShell in a container. The PowerShell container images are coming in at around 375MB, this means with a modern Internet connection you’ll be able to pull a PowerShell container image and be up and running in a very small amount of time.
  • Version – there are container images available for every release of PowerShell Core, including preview/release candidate code. With containers, you can run multiple versions of PowerShell Core in a way where they will not conflict with each other.
  • Platform independence – there are container images for Ubuntu, Fedora, Windows Server Core, Nano Server and more. This allows you to be able to consume PowerShell Core regardless of your underlying platform. You can select whichever image you want, pull the container and go. 
  • Testing – if you need to test your scripts across various versions of PowerShell Core you can pull the container, run the script on the exact version you need. You can have multiple containers on your system running multiple versions of PowerShell and be able to run them all at the same time.  
  • Isolation – containers will allow you to have self-contained environments for execution, security, environment, and configuration settings. You can also use this idea to isolate conflicting modules from each other. This is particularly valuable when developing modules and/or cmdlets.
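
To make the Testing point a little more concrete, here is a rough sketch of a loop that runs the same PowerShell expression against several image tags. The pwsh_versions function name is hypothetical, and I’m assuming the tags you pass exist in the mcr.microsoft.com/powershell repository.

```shell
# pwsh_versions TAG...: print the PowerShell version reported by each image tag.
# --rm removes each container as soon as it exits.
pwsh_versions() {
    for tag in "$@"; do
        printf '%s: ' "$tag"
        # Single quotes keep the shell from expanding $PSVersionTable itself.
        docker run --rm "mcr.microsoft.com/powershell:$tag" \
            pwsh -c '$PSVersionTable.PSVersion.ToString()'
    done
}

# Usage: pwsh_versions latest preview
```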

Getting Up and Running

Let’s get started with using PowerShell Core in a container. First up, we will want to pull the Docker Container Image to our local machine. This will pull the image with the latest tag, which at the time of this post is 6.2.0-ubuntu-18.04.

docker pull mcr.microsoft.com/powershell:latest

With the container image local, let’s go ahead and start up the container. In this first go, I’m going to start up the container with the docker run command and with the --interactive and --tty flags. When the container starts, these flags attach my terminal to the container so I can use PowerShell Core interactively at the command line.

docker run                    \
        --name "pwsh-latest"  \
        --interactive --tty   \
        mcr.microsoft.com/powershell:latest

This will get you a PowerShell prompt. I told you this was going to be fast.

PowerShell 6.2.0
Copyright (c) Microsoft Corporation. All rights reserved.
Type 'help' to get help.
PS /> 

From that prompt, we can do the normal PowerShell things we need to do. Let’s start our journey like all good PowerShell demos do and run Get-Process. You’ll notice that there is only one process running in the container, and that’s your pwsh session. This is due to the isolation concepts of Containers. With this isolation, problems like conflicting modules and settings go away. The container gives your script an isolated execution environment. If you need to have two conflicting versions of a module, DLL or library to run your workload or script…you can use a container to isolate their execution giving them the ability to co-exist on the same system.

PS /> Get-Process
 NPM(K)    PM(M)      WS(M)     CPU(s)      Id  SI ProcessName
 ------    -----      -----     ------      --  -- -----------
      0     0.00     110.03       2.01       1   1 pwsh

We can use exit to get out of PowerShell. When you exit PowerShell the container will stop. You can see the status of your stopped container with docker ps -a.

docker ps -a
CONTAINER ID        IMAGE                                 COMMAND             CREATED             STATUS                     PORTS               NAMES
8c9160fea43f        mcr.microsoft.com/powershell:latest   "pwsh"              6 minutes ago       Exited (0) 8 seconds ago                       pwsh-latest

If you’d like to get back into your container you can use docker start pwsh-latest -i where pwsh-latest is the container name we just created and -i is for interactive (we used --interactive earlier). Run that and you’ll land right back at a PowerShell prompt again.

Running a cmdlet When Starting a Container

Now, let’s say we wanted to start our container up and non-interactively run a cmdlet right away. We can do that. With the docker run command, we can tell the container to start pwsh and pass in a cmdlet with the -c parameter, and that cmdlet will be executed. Let’s check out how.

docker run mcr.microsoft.com/powershell:latest pwsh -c "&{Get-Process}"
 NPM(K)    PM(M)      WS(M)     CPU(s)      Id  SI ProcessName
 ------    -----      -----     ------      --  -- -----------
      0     0.00      81.35       0.54       1   1 pwsh

From a performance standpoint, I want to point out the time it takes to do this work. We can use the time command to help us with that. It takes less than two seconds to start the container, start pwsh, execute our cmdlet and shut down the container.

time docker run mcr.microsoft.com/powershell:latest pwsh -c "&{Get-Process}"
 NPM(K)    PM(M)      WS(M)     CPU(s)      Id  SI ProcessName
 ------    -----      -----     ------      --  -- -----------
      0     0.00      81.61       0.54       1   1 pwsh
real 0m1.901s
user 0m0.038s
sys 0m0.086s

Now let’s say I wanted to test a cmdlet execution against a specific version of PowerShell Core, perhaps even a Release Candidate. Let’s change the tag from latest to preview and docker will pull that container, start it up and we immediately have an environment for testing. This could be leveraged for script testing, cmdlet testing, module testing and so on. In the output below, you can see the preview tag points to the 6.2.0-rc.1 version of PowerShell Core.

docker run mcr.microsoft.com/powershell:preview pwsh -c "&{Get-Host}"
Name             : ConsoleHost
Version          : 6.2.0-rc.1
…output omitted...

Now, each time we started a container so far in this post and then exited pwsh, the container shut down but was still on our system. We can see the containers with docker ps -a. We can restart any of these containers and get them back by using the docker start command mentioned previously.

docker ps -a
CONTAINER ID        IMAGE                                 COMMAND                  CREATED             STATUS                     PORTS               NAMES
d8d8d27ec7be        mcr.microsoft.com/powershell:preview  "pwsh -c &{Get-Host}"    4 seconds ago       Exited (0) 2 seconds ago                       pensive_poincare
5eace290b47c        mcr.microsoft.com/powershell:latest   "pwsh -c &{Get-Proce…"   4 minutes ago       Exited (0) 4 minutes ago                       dreamy_haibt
c8361b9e0a76        mcr.microsoft.com/powershell:latest   "pwsh -c &{Get-Proce…"   6 minutes ago       Exited (0) 6 minutes ago                       boring_shirley
8c9160fea43f        mcr.microsoft.com/powershell:latest   "pwsh"                   15 minutes ago      Exited (0) 8 minutes ago                       pwsh-latest
We can delete each container by name, using docker rm then specifying the name as a parameter. For example, docker rm pwsh-latest would delete that container.
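When several stopped containers have piled up, picking out their names can be scripted. The following is a minimal sketch, not part of the original walkthrough: the function name exited_container_names is hypothetical, and it assumes the input is produced with a docker ps format string that separates name and status with a pipe character.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the names of containers whose status starts with "Exited".
# Expects "name|status" lines on standard input, for example from:
#   docker ps -a --format '{{.Names}}|{{.Status}}'
exited_container_names() {
  awk -F '|' '$2 ~ /^Exited/ { print $1 }'
}
```

Each printed name can then be passed to docker rm. If you simply want to remove every stopped container, docker container prune is the built-in alternative.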

Running a Script When Starting a Container

When a container is deleted, the data “inside” the container is deleted too. So if we create a script inside a container and then delete the container, the script goes away too. In Docker, we can use a volume to help us with this. A volume allows us to store our data externally to the container. We can mount the volume inside the container, where it looks like part of the container’s file system.
With volumes, when we delete the container, the data stays inside the volume. We can then create a new container and attach the volume to that new container and the data will be there for us to work with.
Let’s start a container and attach a volume at the /scripts location of the container’s file system. Let’s also add the --detach parameter. This is going to start the container, start pwsh and then stop the container. Then I’m going to copy a script from my local file system into the container. The container does not need to be running for the copy operation to succeed.
docker run                       \
     --name "pwsh-script"        \
     --detach                    \
     --volume PSScripts:/scripts \
     mcr.microsoft.com/powershell:latest pwsh
Here’s the code to copy the script from my local file system into the container where pwsh-script is the container name and /scripts is the location we want to copy the script to inside the container. This is the volume we attached to the container. The script is a simple hello-world script.
docker cp Get-Containers.ps1 pwsh-script:/scripts
With that, let’s go ahead and remove the container. We used it just to copy the script into the volume. I kind of feel bad, but we’ll keep moving on.
docker rm pwsh-script
With that, let’s create a new container in interactive mode, with the volume attached. This will put us at a pwsh prompt.
docker run                       \
     --name "pwsh-script"        \
     --interactive --tty         \
     --volume PSScripts:/scripts \
     mcr.microsoft.com/powershell:latest pwsh
Now, since our script is in the volume and we attached that volume when we created this new container, it’s available for us inside the container. Let’s go ahead and run that script inside the container and then delete the container with docker rm when it’s finished. 
PS /> ls -la /scripts/
total 12
drwxr-xr-x 2 root root    4096 May  2 18:30 .
drwxr-xr-x 1 root root    4096 May  2 18:33 ..
-rw-r--r-- 1  502 dialout   73 Apr 28 21:43 Get-Containers.ps1
PS /> /scripts/Get-Containers.ps1
Hello, world!
PS /> exit
docker rm pwsh-script

Sounds Like…Serverless?

Now let’s take that technique we just stepped through, where we started the container, ran a script and deleted the container, and combine all of that into one step. To do so, we’ll use the following command options for docker run. We specify the --rm option, which will delete the container when it exits, attach the volume at /scripts and tell pwsh to run the script that’s in our volume by specifying its location with the parameter -F /scripts/Get-Containers.ps1.
docker run                       \
     --rm                        \
     --volume PSScripts:/scripts \
       mcr.microsoft.com/powershell:latest pwsh -F /scripts/Get-Containers.ps1
Hello, world!
Now, with that last technique, we’ve encapsulated the entire lifecycle of the execution of that script into one line of code. It’s like this script execution never happened…or did it ;) All kidding aside, we effectively have a serverless computing platform now. Using this technique in our data centers, we can spin up a container, on any version of PowerShell on any platform, run some workload/script and when the workload finishes, the container just goes away. For this to work well, we will need something to drive that process. In an upcoming blog post, we’ll talk more about how we can automate the running of PowerShell containers in Kubernetes.
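If you run scripts this way often, the one-liner can be wrapped in a small shell function. This is only a sketch under stated assumptions: the function name run_ps_script and the DOCKER_CMD variable are hypothetical, and the volume name PSScripts and the /scripts mount point are the ones used earlier in this post.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the "run a script, then discard the container" pattern.
# DOCKER_CMD defaults to "docker"; set it to "echo docker" to print the command
# instead of running it.
run_ps_script() {
  local script_name="$1"
  local image_tag="${2:-latest}"   # image tag to test against, defaults to latest
  ${DOCKER_CMD:-docker} run \
    --rm \
    --volume PSScripts:/scripts \
    "mcr.microsoft.com/powershell:${image_tag}" \
    pwsh -F "/scripts/${script_name}"
}
```

For example, run_ps_script Get-Containers.ps1 preview would run the same script against the preview tag, and prefixing the call with DOCKER_CMD='echo docker' shows the command without starting a container.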
In this post, we covered a lot. We looked at how you can interactively run PowerShell Core in a container, how you can pass cmdlets into a container at runtime, how to run different versions of PowerShell Core, and how you can persistently store scripts outside of containers in volumes and run those scripts in your containers. We also looked at how you can encapsulate the whole execution of a script and the container’s life cycle into one line of code, really giving you the ability to run PowerShell Core anywhere on any platform.
I hope you enjoyed this and are as excited as I am about how we can leverage this technology to solve new and unique problems in your data center and IT operations.

Speaking at SQLSaturday Nashville – 815!

I’m proud to announce that I will be speaking at SQL Saturday Nashville on January 12th 2019! And wow, 815 SQL Saturdays! This one won’t let you down. Check out the amazing schedule!

If you don’t know what SQLSaturday is, it’s a whole day of SQL Server training available to you at no cost!

If you haven’t been to a SQLSaturday, what are you waiting for! Sign up now!


This year I have TWO sessions!

1. Inside Kubernetes – An Architectural Deep Dive

In this session we will introduce Kubernetes and deep dive into each component and its responsibility in a cluster. We will also look at and demonstrate higher-level abstractions such as Services, Controllers, and Deployments and how they can be used to ensure the desired state of an application and data platform deployed in Kubernetes. Next, we’ll look at Kubernetes networking and intercluster communication patterns. With that foundation, we will then introduce various cluster scenarios such as single node, single head, and high availability designs. By the end of this session, you will understand what’s needed to put your applications and data platform in production in a Kubernetes cluster.

Session Objectives:
Understand Kubernetes cluster architecture
Understand Services, Controllers, and Deployments
Designing Production Ready Kubernetes Clusters


2. Containers – You Better Get on Board

Containers are taking over, changing the way systems are developed and deployed…and that’s NOT hyperbole. Just imagine if you could deploy SQL Server or even your whole application stack in just minutes. You can do that, leveraging containers! In this session, we’ll get you started on your container journey by learning container fundamentals in Docker, then look at some common container scenarios and introduce deployment automation with Kubernetes. In this session we’ll look at Container Fundamentals with Docker, Common Container Scenarios, and Orchestration with Kubernetes.

Getting Started with Installing Kubernetes On-Prem

Let’s get you started on your Kubernetes journey with installing Kubernetes on premises in virtual machines. 

Kubernetes is a distributed system. You will be creating a cluster with a master node that is in charge of all operations in your cluster. In this walkthrough we’ll also create three workers which will run our applications. This cluster topology is, by no means, production ready. If you’re looking for production cluster builds, check out the Kubernetes documentation here and here. The primary components that need high availability in a Kubernetes cluster are the API Server, which controls the state of the cluster, and the etcd database, which stores the persistent state of the cluster. You can learn more about Kubernetes cluster components here.

In our demonstration here, the master is where the API Server, etcd, and the other control plane functions will live. The workers will be joined to the cluster and run our application workloads.

Get your infrastructure sorted

I’m using 4 Ubuntu virtual machines in VMware Fusion on my Mac, each with 2 vCPUs and 2GB of RAM, running Ubuntu 16.04.5. Ubuntu 18 requires a slightly different install, documented here: you add the Docker repository and then install Docker from there. The instructions below get Docker from Ubuntu’s repository.

  • k8s-master –
  • k8s-node1 – DHCP
  • k8s-node2 – DHCP
  • k8s-node3 – DHCP

Ensure that each host has a unique name and that all nodes can reach each other over the network. Take note of the IPs, because you will need to log into each node with SSH. If you need assistance getting your environment ready, check out my training on Pluralsight to get you started here! I have courses on installation and command line basics all the way up through advanced topics on networking and performance.

Another requirement, which Klaus Aschenbrenner reminded me of, is that you need to disable swap on any system on which you will run the kubelet, which in our case is all systems. To do so, turn swap off with sudo swapoff -a and edit /etc/fstab, removing or commenting out the swap volume entry.
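The /etc/fstab edit can be scripted if you are preparing several nodes. The following is a minimal sketch, assuming swap entries use swap as a whitespace-delimited field; the function name comment_out_swap is hypothetical, and it prints the edited content to standard output rather than changing the file in place, so you can review the result first.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print an fstab-style file with active swap entries commented out.
# The file itself is not modified; redirect the output once you have reviewed it.
comment_out_swap() {
  local fstab_file="$1"
  # Prefix "#" to lines that are not already comments and have "swap" as its own field.
  sed -E '/^[^#].*[[:space:]]swap[[:space:]]/ s/^/#/' "$fstab_file"
}

# Example against a throwaway file (the entries below are made up for illustration).
sample="$(mktemp)"
printf '%s\n' \
  'UUID=1234 / ext4 errors=remount-ro 0 1' \
  '/swapfile none swap sw 0 0' > "$sample"
comment_out_swap "$sample"
rm -f "$sample"
```

On a real node you would point the function at /etc/fstab, compare the output with the original, and only then replace the file, alongside running sudo swapoff -a as described above.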

Overview of the cluster creation process

  • Install Kubernetes packages on all nodes
    • Add Kubernetes’ apt repositories
    • Install the required software for Kubernetes
  • Download deployment files for your pod network
  • Create a Kubernetes cluster on the master
    • We’re going to use a utility called kubeadm to create our cluster with a basic configuration
  • Install a Pod Network
  • Join our three worker nodes to our cluster

Install Kubernetes

Let’s start off with installing Kubernetes on to all of the nodes in our system. This is going to require logging into each server via SSH, adding the Kubernetes apt repositories and installing the correct packages. Perform the following tasks on ALL nodes in your cluster, the master and the three workers. If you add more nodes, you will need to install these packages on those nodes. 

Add the gpg key for the Kubernetes Apt repository to your local system

demo@k8s-master1:~$ curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -

Add the Kubernetes Apt repository to your local repository locations

demo@k8s-master1:~$ sudo bash -c 'cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
> deb https://apt.kubernetes.io/ kubernetes-xenial main
> EOF'

Next, we’ll update our apt package lists

demo@k8s-master1:~$ sudo apt-get update

Install the required packages

demo@k8s-master1:~$ sudo apt-get install -y kubelet kubeadm kubectl docker.io

Then we need to tell apt to not update these packages. In Kubernetes, cluster upgrades will be managed by…you guessed it…Kubernetes

demo@k8s-master1:~$ sudo apt-mark hold kubelet kubeadm kubectl docker.io

Here’s what you just installed
  • Kubelet – On each node in the cluster, this is in charge of starting and stopping pods in response to the state defined on the API Server on the master 
  • Kubeadm – Primary command line utility for creating your cluster
  • Kubectl – Primary command line utility for working with your cluster
  • Docker – Remember that Kubernetes is a container orchestrator, so we’ll need a container runtime to run your containers. We’re using Docker. You can use other container runtimes if required

Download the YAML files for your Pod Network

Now, only on the master, let’s download the YAML deployment files for your Pod network and get our cluster created. Networking in Kubernetes is different from what you might expect. For Pods on different nodes to communicate with each other on the same IP network, you’ll want to create a Pod network, which essentially is an overlay network that gives you a uniform address space for Pods to operate in. The decision of which Pod network to use, or even whether you need one, is very dependent on your local or cloud infrastructure. For this demo, I’m going to use the Calico Pod network overlay. The code below will download the Pod definition manifests in YAML, and later we’ll deploy those into our cluster. This starts up a container on our system in what’s called a DaemonSet. A DaemonSet is a Kubernetes object that will start the specified container on all or some of the nodes in the cluster. In this case, the Calico network Pod will be deployed on all nodes in our cluster. So as we join nodes, you might see some delay in nodes becoming ready; this is because the container is being pulled and started on the node.
Download the YAML for the Pod network

demo@k8s-master1:~$ wget https://docs.projectcalico.org/v3.3/getting-started/kubernetes/installation/hosted/rbac-kdd.yaml

demo@k8s-master1:~$ wget https://docs.projectcalico.org/v3.3/getting-started/kubernetes/installation/hosted/kubernetes-datastore/calico-networking/1.7/calico.yaml

If you need to change the address of your pod network edit calico.yaml, look for the name: CALICO_IPV4POOL_CIDR and set the value: to your specified CIDR range. It’s by default. 

Creating a Kubernetes Cluster

Now we’re ready to create our Kubernetes cluster, and we’re going to use kubeadm to help us get this done. It’s a community-based tool that does a lot of the heavy lifting for you.
To create a cluster, run the command below. Here we’re specifying a CIDR range to match the one in our calico.yaml file.

demo@k8s-master1:~$ sudo kubeadm init --pod-network-cidr=

What’s happening behind the scenes with kubeadm init:
  • Creates a certificate authority – Kubernetes uses certificates to secure communication between components and also to verify the identity of hosts in the cluster
  • Creates configuration files – On the master, this will create configuration files for various Kubernetes cluster components
  • Pulls control plane images – the services implementing the cluster components are deployed into the cluster as containers. Very cool! You can, of course, run these as local system daemons on the hosts, but Kubernetes suggests keeping them inside containers
  • Bootstraps the control plane pods – starts up the pods and creates static manifests on the master so that they start automatically when the master node starts up
  • Taints the master to just system pods – this means the master will run (schedule) only system Pods, not user Pods. This is ideal for production. In testing, you may want to untaint the master, and you’ll really want to do this if you’re running a single node cluster. See this link for details on that.
  • Generates a bootstrap token – used to join worker nodes to the cluster
  • Starts any add-ons – the most common add-ons are the DNS pod and the master’s kube-proxy
If you see this, you’re good to go! Keep that join command handy. We’ll need it in a second.

Your Kubernetes master has initialized successfully!

…output omitted

You can now join any number of machines by running the following on each node

as root:

  kubeadm join --token 2a71vm.aat5o5vd0eip9yrx --discovery-token-ca-cert-hash sha256:57b64257181341928e60548314f28aa0d2b15f4d81bf9ae9afdae0cee6baf247

The output from your cluster creation is very important. It’s going to give you the code needed to access your cluster as a non-root user, the code needed to create your Pod network and also the code needed to join worker nodes to your cluster (just go ahead and copy this into a text file right now). Let’s go through each of those together.

Configuring your cluster for access from the master node as a non-privileged user

This will allow you to log into your system with a regular account and administer your cluster.

mkdir -p $HOME/.kube

sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config

sudo chown $(id -u):$(id -g) $HOME/.kube/config

Create your Pod network

Now that your cluster is created, you can deploy the YAML files for your Pod network. You must do this prior to adding more nodes to your cluster and certainly before starting any Pods on those nodes. We are going to use kubectl apply -f to deploy the Pod network from the YAML files we downloaded earlier.

demo@k8s-master1:~$ kubectl apply -f rbac-kdd.yaml

clusterrole.rbac.authorization.k8s.io/calico-node created

clusterrolebinding.rbac.authorization.k8s.io/calico-node created

demo@k8s-master1:~$ kubectl apply -f calico.yaml

configmap/calico-config created

service/calico-typha created

deployment.apps/calico-typha created

poddisruptionbudget.policy/calico-typha created

daemonset.extensions/calico-node created

serviceaccount/calico-node created

customresourcedefinition.apiextensions.k8s.io/felixconfigurations.crd.projectcalico.org created

customresourcedefinition.apiextensions.k8s.io/bgppeers.crd.projectcalico.org created

customresourcedefinition.apiextensions.k8s.io/bgpconfigurations.crd.projectcalico.org created

customresourcedefinition.apiextensions.k8s.io/ippools.crd.projectcalico.org created

customresourcedefinition.apiextensions.k8s.io/hostendpoints.crd.projectcalico.org created

customresourcedefinition.apiextensions.k8s.io/clusterinformations.crd.projectcalico.org created

customresourcedefinition.apiextensions.k8s.io/globalnetworkpolicies.crd.projectcalico.org created

customresourcedefinition.apiextensions.k8s.io/globalnetworksets.crd.projectcalico.org created

customresourcedefinition.apiextensions.k8s.io/networkpolicies.crd.projectcalico.org created

Before moving forward, check for the creation of the Calico pods and also the DNS pods. Once these are created and their STATUS is Running, you can proceed. In the output here you can also see the other components of your Kubernetes cluster: the containers running etcd, the API Server, the Controller Manager, kube-proxy and the Scheduler.

demo@k8s-master1:~$ kubectl get pods --all-namespaces

NAMESPACE     NAME                                  READY   STATUS    RESTARTS   AGE

kube-system   calico-node-6ll9j                     2/2     Running   0          2m5s

kube-system   coredns-576cbf47c7-8dgzl              1/1     Running   0          9m59s

kube-system   coredns-576cbf47c7-cc9x2              1/1     Running   0          9m59s

kube-system   etcd-k8s-master1                      1/1     Running   0          8m58s

kube-system   kube-apiserver-k8s-master1            1/1     Running   0          9m16s

kube-system   kube-controller-manager-k8s-master1   1/1     Running   0          9m16s

kube-system   kube-proxy-8z9t7                      1/1     Running   0          9m59s

kube-system   kube-scheduler-k8s-master1            1/1     Running   0          8m55s
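Scanning that listing by eye works for a handful of pods, but the check can also be scripted. The following is a rough sketch, not something kubeadm or Calico provides: the function name pods_not_ready is hypothetical, and it assumes the column layout shown above (NAMESPACE, NAME, READY, STATUS) with the header row removed via kubectl's --no-headers flag.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print namespace/name for pods that are not fully up.
# Reads `kubectl get pods --all-namespaces --no-headers` style output on standard input.
pods_not_ready() {
  awk 'NF >= 4 {
    split($3, ready, "/")   # READY looks like "2/2": ready containers / total containers
    if ($4 != "Running" || ready[1] != ready[2]) print $1 "/" $2
  }'
}
```

An empty result means every listed pod is Running with all of its containers ready. Note that this simple check would also flag pods that have legitimately finished, such as those with a Completed status.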


Joining worker nodes to your cluster

Now on each of the worker nodes, let’s use kubeadm join to join the worker nodes to the cluster. Go back to the output of kubeadm init and copy the join command from that output. Be sure to put sudo on the front before you run it on each node. The process below is called a TLS bootstrap. This securely joins the node to the cluster over TLS and authenticates the host with server certificates.

demo@k8s-node1:~$ sudo kubeadm join --token 2a71vm.aat5o5vd0eip9yrx --discovery-token-ca-cert-hash sha256:57b64257181341928e60548314f28aa0d2b15f4d81bf9ae9afdae0cee6baf247

[preflight] running pre-flight checks

[discovery] Trying to connect to API Server “”

[discovery] Created cluster-info discovery client, requesting info from “”

[discovery] Requesting info from “” again to validate TLS against the pinned public key

[discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server “”

[discovery] Successfully established connection with API Server “”

[kubelet] Downloading configuration for the kubelet from the “kubelet-config-1.12” ConfigMap in the kube-system namespace

[kubelet] Writing kubelet configuration to file “/var/lib/kubelet/config.yaml”

[kubelet] Writing kubelet environment file with flags to file “/var/lib/kubelet/kubeadm-flags.env”

[preflight] Activating the kubelet service

[tlsbootstrap] Waiting for the kubelet to perform the TLS Bootstrap…

[patchnode] Uploading the CRI Socket information “/var/run/dockershim.sock” to the Node API object “k8s-node1” as an annotation


This node has joined the cluster:

* Certificate signing request was sent to apiserver and a response was received.

* The Kubelet was informed of the new secure connection details.

Run ‘kubectl get nodes’ on the master to see this node join the cluster.

If you didn’t keep the token or the CA Cert Hash in the earlier steps, go back to the master and run these commands. Also note that the join token is only valid for 24 hours.
To get the current join token

demo@k8s-master1:~$ kubeadm token list

To get the CA Cert Hash

demo@k8s-master1:~$ openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
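Once you have both values, they can be put back together into a join command. The following is a small sketch rather than anything kubeadm ships: the function name build_join_command is hypothetical, and the API server endpoint is passed in as a parameter because it depends on your environment (the k8s-master1:6443 value in the usage note is an assumed example using the default API server port).

```shell
#!/usr/bin/env bash
# Hypothetical helper: assemble a kubeadm join command from its three parts.
# It only prints the command; nothing is executed against a cluster.
build_join_command() {
  local endpoint="$1" token="$2" ca_hash="$3"
  # Bootstrap tokens have the form <6 chars>.<16 chars>, lowercase letters and digits.
  if ! printf '%s' "$token" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$'; then
    echo "token does not look like a kubeadm bootstrap token" >&2
    return 1
  fi
  # A SHA-256 digest in hex is 64 characters long.
  if ! printf '%s' "$ca_hash" | grep -Eq '^[0-9a-f]{64}$'; then
    echo "CA cert hash should be 64 hex characters" >&2
    return 1
  fi
  printf 'sudo kubeadm join %s --token %s --discovery-token-ca-cert-hash sha256:%s\n' \
    "$endpoint" "$token" "$ca_hash"
}
```

For example, build_join_command k8s-master1:6443 followed by the token and the hash prints a command you can paste on each worker. If the original token has expired, kubeadm token create --print-join-command on the master generates a new token and prints the full join command for you.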

Back on the master, check on the status of your nodes joining the cluster. These nodes are currently NotReady because, behind the scenes, they’re pulling the Calico container and setting up the Pod network.

demo@k8s-master1:~$ kubectl get nodes


k8s-master1   Ready      master   14m    v1.12.2

k8s-node1     NotReady   <none>   100s   v1.12.2

k8s-node2     NotReady   <none>   96s    v1.12.2

k8s-node3     NotReady   <none>   94s    v1.12.2

And here we are with a fully functional Kubernetes cluster! All nodes joined and Ready.

demo@k8s-master1:~$ kubectl get nodes


k8s-master1   Ready    master   15m     v1.12.2

k8s-node1     Ready    <none>   2m34s   v1.12.2

k8s-node2     Ready    <none>   2m30s   v1.12.2

k8s-node3     Ready    <none>   2m28s   v1.12.2
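Rather than re-running kubectl get nodes by hand until everything reports Ready, the wait can be scripted. This is a minimal sketch under stated assumptions: the function name count_not_ready is hypothetical, and it expects the node listing without its header row, which kubectl produces with the --no-headers flag.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count nodes whose STATUS column is not exactly "Ready".
# Reads `kubectl get nodes --no-headers` style output on standard input.
# A status such as "Ready,SchedulingDisabled" is counted as not ready.
count_not_ready() {
  awk 'NF >= 2 && $2 != "Ready" { n++ } END { print n + 0 }'
}

# Typical use on the master (assumes kubectl is configured as shown earlier):
#   until [ "$(kubectl get nodes --no-headers | count_not_ready)" -eq 0 ]; do sleep 5; done
```

The loop in the comment polls every five seconds and exits once no node is reported as anything other than Ready.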

In our next post, we’ll deploy a SQL Server Pod into our freshly created Kubernetes cluster.
Please feel free to contact me with any questions regarding Linux or other SQL Server related issues at: aen@centinosystems.com

I’m Speaking at SQLSaturday Cambridge!

I’m proud to announce that I will be speaking at SQL Saturday Cambridge on September 8th 2018! And wow, 748 SQL Saturdays! This one won’t let you down. Check out the amazing schedule of International Experts and Microsoft MVPs!

If you don’t know what SQLSaturday is, it’s a whole day of SQL Server training available to you at no cost!

If you haven’t been to a SQLSaturday, what are you waiting for! Sign up now!

SQLSaturday #748 - Cambridge 2018

This year I have TWO sessions!

1. Monitoring Linux Performance for the SQL Server Admin

So you’re a SQL Server administrator and you just installed SQL Server on Linux. It’s a whole new world. Don’t fear, it’s just an operating system. It has all the same components Windows has, and in this session we’ll show you that. We will look at the Linux operating system architecture and show you where to look for the performance data you’re used to! Further, we’ll dive into SQLPAL and how its architecture and internals enable high performance for your SQL Server. By the end of this session you’ll be ready to go back to the office with a solid understanding of performance monitoring for Linux systems and SQL on Linux. We’ll look at the core system components of CPU, Disk, and Memory and monitoring techniques for each.

2. Containers – You Better Get on Board

Containers are taking over, changing the way systems are developed and deployed…and that’s NOT hyperbole. Just imagine if you could deploy SQL Server or even your whole application stack in just minutes. You can do that, leveraging containers! In this session, we’ll get you started on your container journey by learning container fundamentals in Docker, then look at some common container scenarios and introduce deployment automation with Kubernetes. In this session we’ll look at Container Fundamentals with Docker, Common Container Scenarios, and Automation with Kubernetes.

Questions from PASS Marathon Containers

Thanks to everyone who attended the PASS Marathon Containers edition and to PASS for the opportunity to present. I received the Questions from the session and wanted to provide answers to the attendees and the community.
If you want to see the session again, check it out on YouTube. The decks are available online at http://www.centinosystems.com/blog/talks/
Here’s the list of questions from the session and my answers.
  • What do you mean it is not for production environment in Windows?
    • It’s my understanding that only Linux based SQL Server containers are supported and that Windows based containers are not. I’m looking for an official statement (like a web site link) from Microsoft on this, but I am having trouble finding one. Here is the official statement on running SQL Server on Linux in a Container – https://bit.ly/2LYPeKh

  • When you say App1 on a container, is it just 1 executable/service or can be multiple of those on the same container?
    • Generally speaking you’ll want only one process in a container. A primary reason for using containers is agility, and a core way of achieving that is breaking dependencies by reducing what’s included inside the container. Technically speaking, you can have more than one process inside a container. In fact, SQL Server on Linux does. There’s the Watchdog process, then the actual SQL Server process. The output below is a process listing from inside a running SQL Server on Linux container. You can see PIDs 1 and 7 are processes inside the container.

      root 1   /opt/mssql/bin/sqlservr

      root 7   /opt/mssql/bin/sqlservr

      For the internals geeks out there, let’s look at a process listing on the host OS that’s running our container. From there we can see that the sqlservr process is a child process of containerd, which is managed by dockerd. This is the same SQL Server process that’s inside the container. In the first example you can see the impact of namespaces…the process IDs are rebased and start at 1, and the second SQL Server PID is 7. In the output below you can see the PIDs are 2172 and 2213.

      root 1034 /usr/bin/dockerd

      root 1245 \_ docker-containerd 

      root 2154     \_ docker-containerd-shim -namespace moby -workdir 

      root 2172         \_ /opt/mssql/bin/sqlservr

      root 2213             \_ /opt/mssql/bin/sqlservr

  • Maybe I missed this part, how do I know what kind of image I could pull down?
    • In the demos I show how to use docker search to find images that are available from the Docker Hub. If you prefer a web browser experience, check out the Docker Hub to see what containers are available to you. Here’s the code to find the mssql-server images available in Docker Hub.
      • docker search mssql-server | sort
  • Does SQL Container fit into production environment?
    • Here is a link to the official word from Microsoft on running containers in production – https://bit.ly/2LYPeKh
    • What I want you to leave this session with is an introduction to containers, starting your journey on what’s next when using containers. To that end here are some of the things you’ll need to consider before using containers in production
      • Is your organization ready – Do the operational skills and technologies exist to support using containers in production?
      • Backup and recovery – Does the organization have a strong backup and recovery environment? How are you going to protect the data running in a SQL Server container? Luckily, it’s just SQL Server on Linux, so you can use the traditional technologies and techniques to back up your data.
      • Data persistency – Understanding the underlying physical infrastructure and how to persist data in ways that keep it protected and well performing.
      • Orchestration – Are there technologies in place to manage the state of your containers, things like workload placement, starting, stopping and also data persistency?
  • How do SQL Containers work with High Availability and Disaster Recovery?
    • Backups and data persistency are primary concerns here. You still need to care for and feed your SQL Server databases just as if they were platformed on a full operating system. For HA, Microsoft has some guidance on how to use Kubernetes to provide HA services to your SQL Server containers here. What I want you to think about when using containers for SQL Server is that deploying a new container is VERY fast. We want to be able to persist the data, stand up a new container and mount our data inside that container. Using this technique we can restore SQL Services very quickly with low RTO. That itself is an interesting way to provide HA services without any additional technologies.
  • Is there a way to have persistent storage for the system databases (e.g. master database for logins and what not)?
    • In the demos during the session I defined a Docker Data Volume when we started the container and mounted it as /var/opt/mssql/ inside the container. When SQL Server on Linux starts for the first time, it copies the system databases from its package directories into /var/opt/mssql/data. Since this data is stored in the persistent data volume, if we stop and delete this container and start a new container pointing at that same Docker data volume, SQL Server will use those system databases when it starts up.

      Starting a SQL Server Container with a Docker Data Volume. The -v parameter names the volume sqldata1 and /var/opt/mssql is where it will be mounted inside the container.

      docker run \
          --name 'sql1' \
          -p 1433:1433 \
          -v sqldata1:/var/opt/mssql \
          -d microsoft/mssql-server-linux:2017-latest
  • How about the backup of a container? can it be like VM’s snapshot? 
    • You can snapshot the state of a container with docker commit. This will create a new image from the container, and that image can be used to create additional containers. But recall, containers are intended to be ephemeral; we really want to define the state of the container OUTSIDE of the container in code. The things inside the container that require data persistency, like databases, should be taken care of using techniques like Docker Data Volumes, backups and other high availability scenarios.