Tanzu Kubernetes NotAuthenticated is set on the volume on virtualmachine - sql-server

Thanks for any help on this.
I'm running a Tanzu kubernetes cluster, brand new in a dev environment. I'm trying to install MS SQL Server 2019 and am hitting a wall with this error once I apply the manifest.
The SQLserver pod fails with this:
ltkc-workers-mpqdb-556696d6f6-rhpsw
Warning FailedMount 50s kubelet, sqltkc-workers-mpqdb-556696d6f6-rhpsw Unable to attach or mount volumes: unmounted volumes=[mssql-persistent-storage], unattached volumes=[default-token-qzt5k mssql-persistent-storage]: timed out waiting for the condition
Warning FailedAttachVolume 45s (x9 over 2m53s) attachdetach-controller AttachVolume.Attach failed for volume "pvc-697e8f96-a23b-4255-9b19-fa04aeed98ee" : rpc error: code = Internal desc = observed Error: "ServerFaultCode: NotAuthenticated" is set on the volume "fbc91ad5-b62e-4bec-8132-4f2d1c5160f0-697e8f96-a23b-4255-9b19-fa04aeed98ee" on virtualmachine "sqltkc-workers-mpqdb-556696d6f6-rhpsw"
The PV and PVC are both bound:
NAME                                                        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                      STORAGECLASS   REASON   AGE
persistentvolume/pvc-697e8f96-a23b-4255-9b19-fa04aeed98ee   10Gi       RWO            Delete           Bound    default/mssql-data-claim   pstore-high             67m

NAME                                     STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
persistentvolumeclaim/mssql-data-claim   Bound    pvc-697e8f96-a23b-4255-9b19-fa04aeed98ee   10Gi       RWO            pstore-high    67m
The deployment manifest is just what I downloaded from the web from various other tutorials:
apiVersion: v1
kind: Service
metadata:
  name: mssql-deployment
spec:
  selector:
    app: mssql
  ports:
    - protocol: TCP
      port: 1433
      targetPort: 1433
  type: LoadBalancer
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mssql-deployment
spec:
  replicas: 1
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 0
  selector:
    matchLabels:
      app: mssql
  template:
    metadata:
      labels:
        app: mssql
    spec:
      terminationGracePeriodSeconds: 10
      securityContext:
        fsGroup: 1000
      restartPolicy: Always
      containers:
        - name: mssql
          resources:
            requests:
              memory: 8000Mi
          image: mcr.microsoft.com/mssql/server:2019-latest
          ports:
            - containerPort: 1433
          env:
            - name: MSSQL_PID
              value: "Developer"
            - name: ACCEPT_EULA
              value: "Y"
            - name: SA_PASSWORD
              value: VMware123!
          volumeMounts:
            - name: mssql-persistent-storage
              mountPath: /var/opt/mssql
      volumes:
        - name: mssql-persistent-storage
          persistentVolumeClaim:
            claimName: mssql-data-claim
Here is the pvc yaml:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mssql-data-claim
spec:
  accessModes:
    - ReadWriteOnce
  # storageClassName: vsan-default-storage-policy
  storageClassName: pstore-high
  resources:
    requests:
      storage: 10Gi
The storage class exists. I have tried this with both the default vSAN and other storage classes and always hit the same volume authentication issue.
I've searched high and low, can't find any related docs. Was hoping to see if someone knew more.
Thanks so much!!

Thanks again for the help; our team was able to fix this. It turned out that our vCenter root password had expired. Once we reset the password, our persistent volumes mounted in the containers without any errors. If you are running Tanzu, I highly suggest making sure your vCenter is fully up to date.

Related

Deploying SQL Server in Kubernetes: context deadline exceeded while pulling mssql image attempt

When I try to run SQL Server in Kubernetes with the mcr.microsoft.com/mssql/server image in a minikube cluster, after several seconds I get the following in the events:
Events:
  Type     Reason   Age                   From     Message
  ----     ------   ----                  ----     -------
  Normal   Pulling  38h (x77 over 47h)    kubelet  Pulling image "mcr.microsoft.com/mssql/server"
  Normal   BackOff  38h (x1658 over 47h)  kubelet  Back-off pulling image "mcr.microsoft.com/mssql/server"
  Warning  Failed   38h (x79 over 47h)    kubelet  Failed to pull image "mcr.microsoft.com/mssql/server": rpc error: code = Unknown desc = context deadline exceeded
Pulling and running the image in Docker Desktop works fine.
What I've already tried:
Specifying a tag like :2019-latest;
Specifying an imagePullPolicy like IfNotPresent or Never. It seems that even after pulling the image directly via PowerShell, Kubernetes doesn't see it locally (but Docker does).
I suspect the reason is that the image is too large and the default Kubernetes timeout is too short. But I'm a newbie with Kubernetes and haven't checked this yet. At least, I don't see anything about it in the SQL Server examples.
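For context on the tag and imagePullPolicy attempts above: when imagePullPolicy is omitted, Kubernetes picks a default from the image reference (Always for an untagged or :latest image, IfNotPresent for any other tag or a digest). The following is a minimal sketch of that rule only; the function name is hypothetical and is not part of any Kubernetes client library. Note that IfNotPresent and Never look at the image store of the node's container runtime, which in minikube is usually not the same store that Docker Desktop pulls into.

```python
def default_image_pull_policy(image: str) -> str:
    """Return the imagePullPolicy Kubernetes applies when the field is omitted.

    Hypothetical helper for illustration; it mirrors the documented
    defaulting rule, not any real client API.
    """
    # A digest pins the image exactly, so a cached copy can be reused.
    if "@" in image:
        return "IfNotPresent"
    # The tag, if any, follows ':' in the final path component; a ':' earlier
    # in the reference is a registry port (e.g. localhost:32000).
    last_component = image.rsplit("/", 1)[-1]
    tag = last_component.split(":", 1)[1] if ":" in last_component else ""
    if tag in ("", "latest"):
        return "Always"
    return "IfNotPresent"


if __name__ == "__main__":
    for ref in ("mcr.microsoft.com/mssql/server",
                "mcr.microsoft.com/mssql/server:2019-latest"):
        print(ref, "->", default_image_pull_policy(ref))
```

With the untagged image from the manifest below, the default is Always, so the kubelet attempts a registry pull on every pod start unless imagePullPolicy is set explicitly.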
Here's the deployment.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mssql-deployment
  namespace: mynamespace
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mssql
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: mssql
    spec:
      terminationGracePeriodSeconds: 10
      containers:
        - image: mcr.microsoft.com/mssql/server
          name: mssql
          env:
            - name: ACCEPT_EULA
              value: "Y"
            - name: SA_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: mssql
                  key: SA_PASSWORD
          ports:
            - containerPort: 1433
              name: mssql
          securityContext:
            privileged: true
          volumeMounts:
            - name: mssqldb
              mountPath: /var/opt/mssql
      volumes:
        - name: mssqldb
          persistentVolumeClaim:
            claimName: mysql-pv-claim
service.yaml:
apiVersion: v1
kind: Service
metadata:
  name: mssql-deployment
  namespace: mynamespace
spec:
  ports:
    - protocol: TCP
      port: 1433
      targetPort: 1433
  selector:
    app: mssql
  type: LoadBalancer
pvc.yaml:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-pv-claim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
Could you, please, help me figure out what I'm doing wrong? Let me know if you need more details.
Thank you!

ImagePullBack pod status in Kubernetes when pulling public image (MS SQL Server Express)

I'm following Les Jackson's tutorial on microservices and got stuck at 05:30:00 while creating a deployment for MS SQL Server. I've written the deployment file just as shown in the YouTube video:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mssql-depl
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mssql
  template:
    metadata:
      labels:
        app: mssql
    spec:
      containers:
        - name: mssql
          image: mcr.microsoft.com/mssql/server:2017-latest
          ports:
            - containerPort: 1433
          env:
            - name: MSSQL_PID
              value: "Express"
            - name: ACCEPT_EULA
              value: "Y"
            - name: SA_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: mssql
                  key: SA_PASSWORD
          volumeMounts:
            - mountPath: /var/opt/mssql/data
              name: mssqldb
      volumes:
        - name: mssqldb
          persistentVolumeClaim:
            claimName: mssql-claim
---
apiVersion: v1
kind: Service
metadata:
  name: mssql-clusterip-srv
spec:
  type: ClusterIP
  selector:
    app: mssql
  ports:
    - name: mssql
      protocol: TCP
      port: 1433 # this is default port for mssql
      targetPort: 1433
---
apiVersion: v1
kind: Service
metadata:
  name: mssql-loadbalancer
spec:
  type: LoadBalancer
  selector:
    app: mssql
  ports:
    - protocol: TCP
      port: 1433 # this is default port for mssql
      targetPort: 1433
The persistent volume claim:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mssql-claim
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 200Mi
But when I apply this deployment, the pod ends up with ImagePullBackOff status:
commands-depl-688f77b9c6-vln5v    1/1   Running            0   2d21h
mssql-depl-5cd6d7d486-m8nw6       0/1   ImagePullBackOff   0   4m54s
platforms-depl-6b6cf9b478-ktlhf   1/1   Running            0   2d21h
kubectl describe pod
Name:         mssql-depl-5cd6d7d486-nrrkn
Namespace:    default
Priority:     0
Node:         docker-desktop/192.168.65.4
Start Time:   Thu, 28 Jul 2022 12:09:34 +0200
Labels:       app=mssql
              pod-template-hash=5cd6d7d486
Annotations:  <none>
Status:       Pending
IP:           10.1.0.27
IPs:
  IP:           10.1.0.27
Controlled By:  ReplicaSet/mssql-depl-5cd6d7d486
Containers:
  mssql:
    Container ID:
    Image:          mcr.microsoft.com/mssql/server:2017-latest
    Image ID:
    Port:           1433/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       ImagePullBackOff
    Ready:          False
    Restart Count:  0
    Environment:
      MSSQL_PID:    Express
      ACCEPT_EULA:  Y
      SA_PASSWORD:  <set to the key 'SA_PASSWORD' in secret 'mssql'>  Optional: false
    Mounts:
      /var/opt/mssql/data from mssqldb (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-xqzks (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  mssqldb:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  mssql-claim
    ReadOnly:   false
  kube-api-access-xqzks:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                  From               Message
  ----     ------     ----                 ----               -------
  Normal   Scheduled  3m42s                default-scheduler  Successfully assigned default/mssql-depl-5cd6d7d486-nrrkn to docker-desktop
  Warning  Failed     102s                 kubelet            Failed to pull image "mcr.microsoft.com/mssql/server:2017-latest": rpc error: code = Unknown desc = context deadline exceeded
  Warning  Failed     102s                 kubelet            Error: ErrImagePull
  Normal   BackOff    102s                 kubelet            Back-off pulling image "mcr.microsoft.com/mssql/server:2017-latest"
  Warning  Failed     102s                 kubelet            Error: ImagePullBackOff
  Normal   Pulling    87s (x2 over 3m41s)  kubelet            Pulling image "mcr.microsoft.com/mssql/server:2017-latest"
The events show:
"rpc error: code = Unknown desc = context deadline exceeded"
But that doesn't tell me much, and the troubleshooting resources I found don't cover this error.
I'm using Kubernetes on Docker locally.
From what I've read, this issue can happen when pulling an image from a private registry, but this is a public one. I copy-pasted the image path to be sure and tried a different MS SQL version, but to no avail.
Can someone be so kind and point me in the right direction, or suggest what I should try to get this to work? It worked just fine in the video :(
I fixed it by manually pulling the image via docker pull mcr.microsoft.com/mssql/server:2017-latest and then deleting and re-applying the deployment.
In my case, I needed to pull the image "to minikube" using minikube ssh docker pull <the_image>
Then I could apply my deployment without errors.
Source: https://github.com/kubernetes/minikube/issues/14806

Every other SQL Server deployment fails due to access denied on Kubernetes

I am deploying a SQL Server 2019 to Kubernetes with the following manifest:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sql
spec:
  selector:
    matchLabels:
      app: 'sql'
  template:
    metadata:
      labels:
        app: sql
    spec:
      hostname: sql-dev
      securityContext:
        fsGroup: 10001
      initContainers:
        - name: volume-permissions
          image: busybox
          command: ["sh", "-c", "chown -R 10001:0 /var/opt/mssql"]
          volumeMounts:
            - mountPath: "/var/opt/mssql"
              name: mssqldb
      containers:
        - name: sql
          image: localhost:32000/sql:dev-latest
          env:
            - name: MSSQL_SA_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: mssql
                  key: SA_PASSWORD
            - name: ACCEPT_EULA
              value: "Y"
          ports:
            - containerPort: 1433
          resources:
            limits:
              memory: 2Gi
              cpu: 1
          volumeMounts:
            - name: mssqldb
              mountPath: /var/opt/mssql
      volumes:
        - name: mssqldb
          persistentVolumeClaim:
            claimName: sqldev-pvc
---
apiVersion: v1
kind: Service
metadata:
  name: sql-svc
spec:
  type: LoadBalancer
  ports:
    - protocol: TCP
      port: 1433
      targetPort: 1433
      nodePort: 31113
  selector:
    app: sql
And this is the pv/pvc manifest:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: sqldev-pv
spec:
  capacity:
    storage: 1Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  storageClassName: sql
  hostPath:
    path: /usr/sql
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: sqldev-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: sql
  resources:
    requests:
      storage: 1Gi
If the deployment is not yet present on the cluster, the deployment itself works and the server is available.
The next deployment fails with the following message:
2021-01-20 12:02:34.98 Server Error: 17113, Severity: 16, State:
2021-01-20 12:02:34.98 Server Error 5(Access is denied.) occurred while opening file '/var/opt/mssql/data/master.mdf' to obtain configuration information at startup. An invalid startup option might have caused the error. Verify your startup options, and correct or remove them if necessary.
Doing another deployment, or simply restarting it with kubectl rollout restart deployment/sql, comes up fine, while the one after that fails again.
The pattern is a consistent good - bad - good - bad - ...
Please explain why this is happening and how I can resolve it.
Update: Apparently one instance of mssql locks the database files exclusively, which makes total sense. You don't want 2 instances of brain going haywire on your sole instance of childhood memories.
So what I think is happening is:
Instance A exists and is up and running.
Instance B's deployment starts and wants to access the same volume as A.
Only once B has been created is A terminated, with a grace period of 30 seconds.
B tries to access the mdf while it is still exclusively locked by A, which is still terminating.
I have a crude solution involving a sleep 30 bash script before initializing mssql inside the pod, but right now I want to investigate whether there is a more elegant solution.
My first approach to solving this was to delay the startup of mssql until the previous pod was terminated, using a bash loop:
echo "Waiting 35 seconds grace period."
# Count from 1 so each message matches the seconds actually slept.
for i in {1..35}
do
  sleep 1
  echo "$i seconds waited"
done
While this technically solved the problem, it isn't very elegant. If the grace period of the pod is changed to something greater than 35 seconds, the loop will need to be changed too.
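One way to loosen the coupling to a fixed number of seconds is to poll a readiness check until a deadline instead of always sleeping for the full period. The sketch below shows only the polling pattern; the `ready` callable is a placeholder, and what it should actually test (for example, whether the previous pod is gone) is an assumption left open here.

```python
import time
from typing import Callable


def wait_until(ready: Callable[[], bool], timeout: float, interval: float = 1.0) -> bool:
    """Poll `ready` until it returns True or `timeout` seconds have elapsed.

    Returns True as soon as the check passes, False if the deadline is hit.
    `ready` is a hypothetical check supplied by the caller.
    """
    deadline = time.monotonic() + timeout
    while True:
        if ready():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    # Stand-in check that succeeds on the third poll.
    attempts = []

    def fake_check() -> bool:
        attempts.append(1)
        return len(attempts) >= 3

    print("ready:", wait_until(fake_check, timeout=5.0, interval=0.1))
```

The timeout still has to be at least as long as the grace period, but the startup no longer waits longer than necessary when the old instance exits early.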
Changing the deployment strategy from its implicit default, RollingUpdate, to Recreate did the trick too. The effect is that the previous pod is terminated before the new one is spun up.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sql
spec:
  selector:
    matchLabels:
      app: 'sql'
  strategy:
    type: Recreate
Documentation: https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#recreate-deployment
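As a rough illustration of why the default strategy lets the new pod start while the old one is still running: with RollingUpdate, maxSurge and maxUnavailable both default to 25%, and Kubernetes resolves the percentage by rounding up for maxSurge and rounding down for maxUnavailable. The sketch below reproduces only that arithmetic; the function name is hypothetical and not part of any Kubernetes client.

```python
import math
from typing import Tuple


def resolve_rolling_update(replicas: int,
                           max_surge: str = "25%",
                           max_unavailable: str = "25%") -> Tuple[int, int]:
    """Return (extra pods allowed, pods allowed to be unavailable).

    Hypothetical helper: percentages are resolved against `replicas`,
    rounding up for maxSurge and down for maxUnavailable.
    """
    def resolve(value: str, round_up: bool) -> int:
        if value.endswith("%"):
            exact = replicas * int(value[:-1]) / 100
            return math.ceil(exact) if round_up else math.floor(exact)
        return int(value)  # already an absolute number

    return resolve(max_surge, True), resolve(max_unavailable, False)


if __name__ == "__main__":
    surge, unavailable = resolve_rolling_update(replicas=1)
    print(f"replicas=1 -> maxSurge={surge}, maxUnavailable={unavailable}")
```

For a single replica this resolves to one surge pod and zero unavailable pods, so the replacement pod is created before the old one is removed, which matches the A/B overlap described above. Recreate avoids the overlap by scaling the old pods down first.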

Connect to local SQL Server Express database from inside minikube cluster

I'm trying to access a SQL Server Express database hosted on my local machine from inside a minikube pod. I tried to follow the idea described in the official Kubernetes docs. While inspecting my container, I found that my application crashes every time I create the pod, because it is unable to connect to the local database.
This is my config:
apiVersion: v1
kind: Service
metadata:
  name: sqlserver-svc
spec:
  ports:
    - protocol: TCP
      port: 1443
      targetPort: 1433
===========================
apiVersion: v1
kind: Endpoints
metadata:
  name: sqlserver-svc
subsets:
  - addresses:
      - ip: 192.168.0.101
    ports:
      - port: 1433
======== Application container ==========
apiVersion: v1
kind: Pod # object type -> that will reside kubernetes cluster
metadata:
  name: api-pod
  labels:
    component: web
spec:
  containers:
    - name: api
      image: nayan2/simptekapi
      ports:
        - containerPort: 5000
      env:
        - name: DATABASE_HOST
          value: "sqlserver-svc:1443\\SQLEXPRESS"
        - name: DATABASE_PORT
          value: '1433'
        - name: DATABASE_USER
          value: sa
        - name: DATABASE_PW
          value: 1234
        - name: JWT_SECRET
          value: sec
        - name: NODE_ENV
          value: production
================================
apiVersion: v1
kind: Service
metadata:
  name: api-node-port
spec:
  type: NodePort
  ports:
    - port: 4200
      targetPort: 5000
      nodePort: 31515
  selector:
    component: web
It is obvious that I am doing something wrong. I am relatively new with docker/container and Kubernetes technology. Still learning. Can anybody help me with this??
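One thing that may help narrow this down: in the config above the Service listens on port 1443 and forwards to 1433, while the application is given 1443 inside DATABASE_HOST and 1433 in DATABASE_PORT. A plain TCP check run from inside the pod can show which host and port combination is reachable at all, before any SQL Server login is involved. This is a minimal diagnostic sketch, assuming Python is available in the container; `can_connect` is a hypothetical helper name.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS resolution failures.
        return False


if __name__ == "__main__":
    # Names and ports are taken from the manifests above.
    for host, port in (("sqlserver-svc", 1443), ("192.168.0.101", 1433)):
        print(f"{host}:{port} reachable: {can_connect(host, port)}")
```

If the direct IP is reachable but the Service name is not, the Service and Endpoints wiring is the likely culprit; if neither is reachable, the host firewall or the SQL Server network configuration is worth checking first.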

How to provision persistent volume claim for software install in kubernetes

I am trying to provision a PVC for a Solr deployment in k8s and mount it as /opt/solr, which is the default Solr installation directory. This way I plan to keep both the Solr installation and the data under it on the PVC. However, while the storage gets provisioned just fine and the StatefulSet gets created, my deployment doesn't work because /opt/solr ends up empty. What is the proper way to do this? Here is my deployment.yaml:
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: solr
  labels:
    app: solr
spec:
  volumeClaimTemplates:
    - metadata:
        name: datadir
        annotations:
          volume.alpha.kubernetes.io/storage-class: slow
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 2Gi
  serviceName: solr-svc
  replicas: 1
  template:
    metadata:
      labels:
        app: solr
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: "app"
                    operator: In
                    values:
                      - solr-pool
              topologyKey: "kubernetes.io/hostname"
      terminationGracePeriodSeconds: 300
      containers:
        - name: solr
          image: solr:6.5.1
          imagePullPolicy: IfNotPresent
          resources:
            requests:
              memory: 512M
              cpu: 500m
          ports:
            - containerPort: 8983
              name: solr-port
              protocol: TCP
          env:
            - name: VERBOSE
              value: "yes"
          command:
            - bash
            - -c
            - "exec /opt/solr/bin/solr start"
          volumeMounts:
            - name: solr-script
              mountPath: /docker-entrypoint-initdb.d/
            - name: datadir
              mountPath: /opt/solr/
      volumes:
        - name: solr-script
          configMap:
            name: solr-configs
      nodeSelector:
        pool: solr-pool
Provisioned storage is empty by default, and there might be a Delete reclaim policy on the provisioned storage, so be sure to check those settings. You can also exec into your pod and examine the mounted volume to see whether it's working properly (permission issues, read-only file system).
In my case there was a conflict between the Docker image, which uses /opt/solr as the Solr install location, and my attempt to mount a separate PV at the same location. Once the PV is mounted there, I obviously lose the Solr install. The fixes for this are:
create another Docker image that uses a separate location
change the Solr config to use a different location for data
change the PV mount location
