How to provision a persistent volume claim for a software install in Kubernetes - Solr

I am trying to provision a PVC for a Solr deployment in k8s and mount it as /opt/solr, which is the default Solr installation directory. This way I plan to keep both the Solr installation and its data on the PVC. However, while the storage gets provisioned just fine and the StatefulSet gets created, my deployment doesn't work because /opt/solr ends up empty. What is the proper way to do this? Here is my deployment.yaml:
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: solr
  labels:
    app: solr
spec:
  volumeClaimTemplates:
  - metadata:
      name: datadir
      annotations:
        volume.alpha.kubernetes.io/storage-class: slow
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 2Gi
  serviceName: solr-svc
  replicas: 1
  template:
    metadata:
      labels:
        app: solr
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: "app"
                operator: In
                values:
                - solr-pool
            topologyKey: "kubernetes.io/hostname"
      terminationGracePeriodSeconds: 300
      containers:
      - name: solr
        image: solr:6.5.1
        imagePullPolicy: IfNotPresent
        resources:
          requests:
            memory: 512M
            cpu: 500m
        ports:
        - containerPort: 8983
          name: solr-port
          protocol: TCP
        env:
        - name: VERBOSE
          value: "yes"
        command:
        - bash
        - -c
        - "exec /opt/solr/bin/solr start"
        volumeMounts:
        - name: solr-script
          mountPath: /docker-entrypoint-initdb.d/
        - name: datadir
          mountPath: /opt/solr/
      volumes:
      - name: solr-script
        configMap:
          name: solr-configs
      nodeSelector:
        pool: solr-pool

Provisioned storage is empty by default, and the provisioned volume may have a Delete reclaim policy, so be sure to check those settings. You can also exec into your pod and examine the mounted volume to see whether it is working properly (permission issues, read-only file system).

In my case there was a conflict between the Docker image configuration, which uses /opt/solr as the Solr install location, and my attempt to mount a separate PV at the same location. Once the PV is mounted there, it hides the image's contents, so I lose the Solr install. The fixes for this are:
create another Docker image that uses a separate location
change the Solr config to use a different location for data
change the PV mount location
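
The third fix could look roughly like the fragment below. It is a sketch rather than a tested manifest: it mounts the datadir claim at /var/solr-data, a hypothetical path chosen only for illustration, so that the install under /opt/solr stays visible. Solr would still have to be configured to keep its data under that path, which is the second fix.

```yaml
# Sketch: pod-template fragment of the StatefulSet from the question.
# /var/solr-data is a hypothetical path, not something the solr image defines.
spec:
  template:
    spec:
      containers:
      - name: solr
        image: solr:6.5.1
        volumeMounts:
        - name: datadir
          # Mount the claim beside the install instead of on top of /opt/solr.
          mountPath: /var/solr-data
```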

Related

Deploying SQL Server in Kubernetes: context deadline exceeded while pulling mssql image attempt

When I try to run SQL Server in Kubernetes with the mcr.microsoft.com/mssql/server image in a minikube cluster, within several seconds I get the following in the logs:
Events:
  Type     Reason   Age                   From     Message
  ----     ------   ----                  ----     -------
  Normal   Pulling  38h (x77 over 47h)    kubelet  Pulling image "mcr.microsoft.com/mssql/server"
  Normal   BackOff  38h (x1658 over 47h)  kubelet  Back-off pulling image "mcr.microsoft.com/mssql/server"
  Warning  Failed   38h (x79 over 47h)    kubelet  Failed to pull image "mcr.microsoft.com/mssql/server": rpc error: code = Unknown desc = context deadline exceeded
Pulling and running the image in docker desktop works fine.
What I've already tried:
Specifying a tag like :2019-latest;
Specifying an imagePullPolicy like IfNotPresent or Never. It seems that even after pulling the image directly via PowerShell, Kubernetes doesn't see it locally (but Docker does).
I suspect the reason is that the image is too large and the default Kubernetes timeout settings are too short. But I'm a newbie with Kubernetes and haven't checked this yet. At least, I don't see anything about it in the SQL Server examples.
Here's the deployment.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mssql-deployment
  namespace: mynamespace
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mssql
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: mssql
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - image: mcr.microsoft.com/mssql/server
        name: mssql
        env:
        - name: ACCEPT_EULA
          value: "Y"
        - name: SA_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mssql
              key: SA_PASSWORD
        ports:
        - containerPort: 1433
          name: mssql
        securityContext:
          privileged: true
        volumeMounts:
        - name: mssqldb
          mountPath: /var/opt/mssql
      volumes:
      - name: mssqldb
        persistentVolumeClaim:
          claimName: mysql-pv-claim
service.yaml:
apiVersion: v1
kind: Service
metadata:
  name: mssql-deployment
  namespace: mynamespace
spec:
  ports:
  - protocol: TCP
    port: 1433
    targetPort: 1433
  selector:
    app: mssql
  type: LoadBalancer
pvc.yaml:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-pv-claim
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
Could you, please, help me figure out what I'm doing wrong? Let me know if you need more details.
Thank you!

What are the correct /etc/exports settings for Kubernetes NFS Storage?

I have a simple NFS server (followed instructions here) connected to a Kubernetes (v1.24.2) cluster as a storage class. When a new PVC is created, it creates a PV as expected with a new directory on the NFS server.
The NFS provider was deployed as instructed here.
My issue is that containers don't seem to be able to perform all the functions they expect to when interacting with the NFS server. For example:
A PVC and PV are created with the following yml:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mssql-data
spec:
  storageClassName: nfs-client
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi
This creates a directory on the NFS server as expected.
Then this deployment is created to use the PVC:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mssql-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mssql
  template:
    metadata:
      labels:
        app: mssql
    spec:
      terminationGracePeriodSeconds: 30
      hostname: mssqlinst
      securityContext:
        fsGroup: 10001
      containers:
      - name: mssql
        image: mcr.microsoft.com/mssql/server:2019-latest
        ports:
        - containerPort: 1433
        env:
        - name: MSSQL_PID
          value: "Developer"
        - name: ACCEPT_EULA
          value: "Y"
        - name: SA_PASSWORD
          value: "Password123"
        volumeMounts:
        - name: mssqldb
          mountPath: /var/opt/mssql
      volumes:
      - name: mssqldb
        persistentVolumeClaim:
          claimName: mssql-data
The server comes up and responds to requests but does so with the error:
[S0002][823] com.microsoft.sqlserver.jdbc.SQLServerException: The operating system returned error 1117(The request could not be performed because of an I/O device error.) to SQL Server during a read at offset 0x0000000009a000 in file '/var/opt/mssql/data/master.mdf'. Additional messages in the SQL Server error log and operating system error log may provide more detail. This is a severe system-level error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.
My /etc/exports file has the following contents:
/srv *(rw,no_subtree_check,no_root_squash)
When the SQL container starts, it doesn't undergo any container restarts but the SQL service within the container appears to get into some sort of restart loop until a connection is attempted and then it throws the error and appears to stop.
Is there something I'm missing in the /etc/exports file? I tried variations with sync, async, and insecure but can't seem to get past the SQL error.
I gather from the error that this has something to do with the container's ability to read/write from/to the disk. Am I in the right ballpark?
The config that ended up working was:
/srv *(rw,no_root_squash,insecure,sync,no_subtree_check)
This was after a reinstall of the cluster. There were no significant changes elsewhere, but it still seems like there may have been more to the issue than this one config.

Tanzu Kubernetes NotAuthenticated is set on the volume on virtualmachine

Thanks for any help on this.
I'm running a Tanzu kubernetes cluster, brand new in a dev environment. I'm trying to install MS SQL Server 2019 and am hitting a wall with this error once I apply the manifest.
The SQLserver pod fails with this:
ltkc-workers-mpqdb-556696d6f6-rhpsw
Warning FailedMount 50s kubelet, sqltkc-workers-mpqdb-556696d6f6-rhpsw Unable to attach or mount volumes: unmounted volumes=[mssql-persistent-storage], unattached volumes=[default-token-qzt5k mssql-persistent-storage]: timed out waiting for the condition
Warning FailedAttachVolume 45s (x9 over 2m53s) attachdetach-controller AttachVolume.Attach failed for volume "pvc-697e8f96-a23b-4255-9b19-fa04aeed98ee" : rpc error: code = Internal desc = observed Error: "ServerFaultCode: NotAuthenticated" is set on the volume "fbc91ad5-b62e-4bec-8132-4f2d1c5160f0-697e8f96-a23b-4255-9b19-fa04aeed98ee" on virtualmachine "sqltkc-workers-mpqdb-556696d6f6-rhpsw"
The pv and pvc all are bound:
NAME                                                        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                      STORAGECLASS   REASON   AGE
persistentvolume/pvc-697e8f96-a23b-4255-9b19-fa04aeed98ee   10Gi       RWO            Delete           Bound    default/mssql-data-claim   pstore-high             67m
NAME                                     STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
persistentvolumeclaim/mssql-data-claim   Bound    pvc-697e8f96-a23b-4255-9b19-fa04aeed98ee   10Gi       RWO            pstore-high    67m
The deployment manifest is just what I downloaded from the web from various other tutorials:
apiVersion: v1
kind: Service
metadata:
  name: mssql-deployment
spec:
  selector:
    app: mssql
  ports:
  - protocol: TCP
    port: 1433
    targetPort: 1433
  type: LoadBalancer
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mssql-deployment
spec:
  replicas: 1
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 0
  selector:
    matchLabels:
      app: mssql
  template:
    metadata:
      labels:
        app: mssql
    spec:
      terminationGracePeriodSeconds: 10
      securityContext:
        fsGroup: 1000
      restartPolicy: Always
      containers:
      - name: mssql
        resources:
          requests:
            memory: 8000Mi
        image: mcr.microsoft.com/mssql/server:2019-latest
        ports:
        - containerPort: 1433
        env:
        - name: MSSQL_PID
          value: "Developer"
        - name: ACCEPT_EULA
          value: "Y"
        - name: SA_PASSWORD
          value: VMware123!
        volumeMounts:
        - name: mssql-persistent-storage
          mountPath: /var/opt/mssql
      volumes:
      - name: mssql-persistent-storage
        persistentVolumeClaim:
          claimName: mssql-data-claim
Here is the pvc yaml:
kind: PersistentVolumeClaim
metadata:
  name: mssql-data-claim
spec:
  accessModes:
  - ReadWriteOnce
  # storageClassName: vsan-default-storage-policy
  storageClassName: pstore-high
  resources:
    requests:
      storage: 10Gi
The storage class exists. I have tried this with both the default vSAN and other storage classes and always hit the same volume authentication issue.
I've searched high and low, can't find any related docs. Was hoping to see if someone knew more.
Thanks so much!!
Thanks again for the help; our team was able to fix this. We found out that our vCenter root password had expired. Once we reset the password, our persistent volumes were able to mount to the containers without any errors. If you are running Tanzu, I highly suggest making sure your vCenter is fully updated.

Every other SQL Server deployment fails due to access denied on Kubernetes

I am deploying a SQL Server 2019 to Kubernetes with the following manifest:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sql
spec:
  selector:
    matchLabels:
      app: 'sql'
  template:
    metadata:
      labels:
        app: sql
    spec:
      hostname: sql-dev
      securityContext:
        fsGroup: 10001
      initContainers:
      - name: volume-permissions
        image: busybox
        command: ["sh", "-c", "chown -R 10001:0 /var/opt/mssql"]
        volumeMounts:
        - mountPath: "/var/opt/mssql"
          name: mssqldb
      containers:
      - name: sql
        image: localhost:32000/sql:dev-latest
        env:
        - name: MSSQL_SA_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mssql
              key: SA_PASSWORD
        - name: ACCEPT_EULA
          value: "Y"
        ports:
        - containerPort: 1433
        resources:
          limits:
            memory: 2Gi
            cpu: 1
        volumeMounts:
        - name: mssqldb
          mountPath: /var/opt/mssql
      volumes:
      - name: mssqldb
        persistentVolumeClaim:
          claimName: sqldev-pvc
---
apiVersion: v1
kind: Service
metadata:
  name: sql-svc
spec:
  type: LoadBalancer
  ports:
  - protocol: TCP
    port: 1433
    targetPort: 1433
    nodePort: 31113
  selector:
    app: sql
And this is the pv/pvc manifest:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: sqldev-pv
spec:
  capacity:
    storage: 1Gi
  volumeMode: Filesystem
  accessModes:
  - ReadWriteOnce
  storageClassName: sql
  hostPath:
    path: /usr/sql
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: sqldev-pvc
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: sql
  resources:
    requests:
      storage: 1Gi
If the deployment is not present on the Cluster yet, the deployment itself works and the server is available.
The next deployment fails with the following message:
2021-01-20 12:02:34.98 Server Error: 17113, Severity: 16, State:
2021-01-20 12:02:34.98 Server Error 5(Access is denied.) occurred while opening file '/var/opt/mssql/data/master.mdf' to obtain
configuration information at startup. An invalid startup option might
have caused the error. Verify your startup options, and correct or
remove them if necessary.
Doing another deployment or simply restarting it with kubectl rollout restart deployment/sql comes up fine, while the next one fails again.
The pattern is a consistent good - bad - good - bad - ...
Please explain why this is happening and how I can resolve this.
Update: Apparently one instance of mssql exclusively locks the database files - which makes total sense. You don't want 2 instances of brain going haywire on your sole instance of childhood memories.
So what I think is happening is:
Instance A exists and is up and running
Instance B deployment starts and wants to access the same volume as A
Only when B is created is A terminated, with a grace period of 30 seconds
B tries to access the mdf while it is still exclusively locked by A, which is currently being terminated.
I have a crude solution involving a sleep 30 bash script before initializing mssql inside the pod, but right now I want to investigate, if there is a more elegant solution.
My first approach to solve this was to delay the boot up time of mssql until the previous pod was terminated using a bash loop:
echo "Waiting 35 seconds grace period."
for i in {0..35}
do
  sleep 1
  echo "$i seconds waited"
done
While this technically solved the problem, it isn't very elegant. If the grace period of the pod is changed to something > 35 seconds, this will need to be changed too.
Changing the deployment strategy from its implicit default RollingUpdate to Recreate also did the trick. The effect is that the previous pod is terminated before the new one is spun up.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sql
spec:
  selector:
    matchLabels:
      app: 'sql'
  strategy:
    type: Recreate
Documentation: https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#recreate-deployment

Accessing container mounted volumes in Kubernetes from docker container

I am currently trying to access a file mounted in my Kubernetes container from a docker image. I need to pass the file in with a flag when my docker image is run.
The docker image is usually run (outside a container) using the command:
docker run -p 6688:6688 -v ~/.chainlink-ropsten:/chainlink -it --env-file=.env smartcontract/chainlink local n -p /chainlink/.password -a /chainlink/.api
Now I have successfully used the following config to mount my env, password and api files at /chainlink, but when attempting to access the files during the docker run I get the error:
flag provided but not defined: -password /chainlink/.password
The following is my current Kubernetes Deployment file
kind: Deployment
metadata:
  name: chainlink-deployment
  labels:
    app: chainlink-node
spec:
  replicas: 1
  selector:
    matchLabels:
      app: chainlink-node
  template:
    metadata:
      labels:
        app: chainlink-node
    spec:
      containers:
      - name: chainlink
        image: smartcontract/chainlink:latest
        args: [ "local", "n", "--password /chainlink/.password", "--api /chainlink/.api"]
        ports:
        - containerPort: 6689
        volumeMounts:
        - name: config-volume
          mountPath: /chainlink/.env
          subPath: .env
        - name: api-volume
          mountPath: /chainlink/.api
          subPath: .api
        - name: password-volume
          mountPath: /chainlink/.password
          subPath: .password
      volumes:
      - name: config-volume
        configMap:
          name: node-env
      - name: api-volume
        configMap:
          name: api-env
      - name: password-volume
        configMap:
          name: password-env
Is there some definition I am missing in my file that allows me to access the mounted volumes when running my docker image?
Change your args to:
args: [ "local", "n", "--password", "/chainlink/.password", "--api", "/chainlink/.api"]
The way you currently have it, the whole string --password /chainlink/.password, including the space, is treated as a single flag. That's what the error:
flag provided but not defined: -password /chainlink/.password
is telling you.
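
For illustration, the same args can also be written as a block-style YAML list, which makes it easier to see that each list item becomes exactly one argument passed to the container. This is only an equivalent spelling of the corrected args line, not a different fix; the quotes are a precaution, since some YAML parsers may read a bare n as a boolean.

```yaml
# Fragment of the container spec: one list item per argument.
args:
- "local"
- "n"
- "--password"
- "/chainlink/.password"
- "--api"
- "/chainlink/.api"
```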

Resources