I'm having an issue with some of my Elasticsearch indices in the cluster:
I have 5 regular shards for an example index, logs-2021.08, so when I run the _cat/shards API I get good results (example):
logs-2021.08 2 r STARTED 25008173 11.9gb 0.0.0.0 instance-0000000128
logs-2021.08 2 p STARTED 25008173 11.8gb 0.0.0.0 instance-0000000119
logs-2021.08 4 p STARTED 25012332 11.8gb 0.0.0.0 instance-0000000129
logs-2021.08 4 r STARTED 25012332 11.9gb 0.0.0.0 instance-0000000119
logs-2021.08 1 p STARTED 25003649 11.8gb 0.0.0.0 instance-0000000121
logs-2021.08 1 r STARTED 25003649 11.8gb 0.0.0.0 instance-0000000115
logs-2021.08 3 p STARTED 25006085 11.8gb 0.0.0.0 instance-0000000121
logs-2021.08 3 r STARTED 25006085 11.8gb 0.0.0.0 instance-0000000135
logs-2021.08 0 p STARTED 25007160 11.9gb 0.0.0.0 instance-0000000128
logs-2021.08 0 r STARTED 25007160 11.9gb 0.0.0.0 instance-0000000118
The issue is that I'm also getting these in the results of the cat API:
partial-logs-2021.08 2 p UNASSIGNED
partial-logs-2021.08 4 p UNASSIGNED
partial-logs-2021.08 1 p UNASSIGNED
partial-logs-2021.08 3 p UNASSIGNED
partial-logs-2021.08 0 p UNASSIGNED
I could not find what the problem is or why I have these partial indices, but the cluster seems to be unhealthy with these unassigned shards.
Is there any way to solve this at the root (and not the obvious option of just deleting them)?
The easy part: retry. Elasticsearch shard allocation was blocked due to too many subsequent allocation failures:
curl -X POST http://127.0.0.1:9200/_cluster/reroute?retry_failed=true
But you should understand the reason behind it and the allocation API.
The cluster will attempt to allocate a shard a maximum of index.allocation.max_retries times in a row (defaults to 5) before giving up and leaving the shard unallocated. You can increase this limit so allocation is attempted again, but if the root cause is not fixed the issue may repeat:
curl --silent --request PUT --header 'Content-Type: application/json' 127.0.0.1:9200/my_index_name/_settings?pretty=true --data-ascii '{
  "index": {
    "allocation": {
      "max_retries": 15
    }
  }
}'
But this may fail again for a different reason, so identify the cause with the cluster allocation explain API. Possible issues include:
Disk watermark problems caused by low free disk space (see the check right after this list)
Indexing errors, for example after an index has been moved from one folder to another or from one server to another
Structural problems, such as an analyzer which refers to a stopwords file that doesn't exist on all nodes
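For the watermark case in particular, a quick check is to compare each node's disk usage with the configured watermark settings (host/port assumed to match the examples below):
curl -s "http://127.0.0.1:9200/_cat/allocation?v"
curl -s "http://127.0.0.1:9200/_cluster/settings?include_defaults=true&filter_path=*.cluster.routing.allocation.disk*&pretty"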
Get Unassigned Shards
curl -s "http://127.0.0.1:9200/_cat/shards?v" | awk 'NR==1 {print}; $4 == "UNASSIGNED" {print}'
To understand the reason, run one of the following commands:
GET /_cluster/allocation/explain
# OR
curl -XGET "localhost:9200/_cluster/allocation/explain"
# OR
curl http://127.0.0.1:9200/_cluster/state | jq '.routing_table.indices | .[].shards[][] | select(.state=="UNASSIGNED") | {index: .index, shard: .shard, primary: .primary, unassigned_info: .unassigned_info}'
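If you want to explain one specific shard instead of the first unassigned shard Elasticsearch happens to pick, you can pass a request body (index and shard number taken from the _cat/shards output above):
curl -s -XGET "http://127.0.0.1:9200/_cluster/allocation/explain?pretty" -H 'Content-Type: application/json' -d '{"index": "partial-logs-2021.08", "shard": 0, "primary": true}'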
Once the problem has been corrected, allocation can be manually retried by calling the reroute API with the ?retry_failed query parameter, which will attempt a single retry round for these shards:
curl -X POST http://127.0.0.1:9200/_cluster/reroute?retry_failed=true
I tried to run my first cluster, as I'm currently trying to learn so that I can hopefully work in Cloud Engineering.
What I did:
I have 3 cloud servers (Ubuntu 20.04), all in one network.
I've successfully set up my etcd cluster (cluster-health shows me all 3 network IPs of the servers: 1 leader, 2 followers).
Now I've installed k3s on my first server:
curl -sfL https://get.k3s.io | sh -s - server --datastore-endpoint="https://10.0.0.2:2380,https://10.0.0.4:2380,https://10.0.0.3:2380"
I've done the same on the 2 other servers; the only difference is that I added the token value, which I checked beforehand with:
cat /var/lib/rancher/k3s/server/token
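For reference, the install command on the other two servers looked roughly like this (token shown as a placeholder, taken from the file above):
curl -sfL https://get.k3s.io | sh -s - server --token=<token> --datastore-endpoint="https://10.0.0.2:2380,https://10.0.0.4:2380,https://10.0.0.3:2380"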
Now everything seems to have worked, but when I run kubectl get nodes, it just shows me one node...
Does anyone have any tips or answers for me?
k3s service file:
[Unit]
Description=Lightweight Kubernetes
Documentation=https://k3s.io
Wants=network-online.target
After=network-online.target
[Install]
WantedBy=multi-user.target
[Service]
Type=notify
EnvironmentFile=-/etc/default/%N
EnvironmentFile=-/etc/sysconfig/%N
EnvironmentFile=-/etc/systemd/system/k3s.service.env
KillMode=process
Delegate=yes
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=1048576
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
TimeoutStartSec=0
Restart=always
RestartSec=5s
ExecStartPre=/bin/sh -xc '! /usr/bin/systemctl is-enabled --quiet nm-cloud-setup.service'
ExecStartPre=-/sbin/modprobe br_netfilter
ExecStartPre=-/sbin/modprobe overlay
ExecStart=/usr/local/bin/k3s \
    server \
        '--node-external-ip=78.46.241.153' \
        '--node-name=node-1' \
        '--flannel-iface=ens10' \
        '--advertise-address=10.0.0.2' \
        '--node-ip=10.0.0.2' \
        '--datastore-endpoint=https://10.0.0.2:2380,https://10.0.0.4:2380,https://10.0.0.3:2380'
I want to submit my Flink job on YARN with this command:
./bin/flink run -m yarn-cluster -p 4 -yjm 1024m -ytm 4096m ./task.jar
but I faced this error:
is running beyond virtual memory limits. Current usage: 390.3 MB of 1 GB physical memory used; 2.3 GB of 2.1 GB virtual memory used. Killing container.
This is caused by a setting named yarn.nodemanager.vmem-pmem-ratio, which defaults to 2.1; in this command the requested ratio is 4096/1024 = 4.
You have 3 ways:
1 - If you have access to the YARN configuration, you can set yarn.nodemanager.vmem-check-enabled in yarn-site.xml to false.
2 - Another option, if you have access to the configuration, is to increase the ratio value from 2.1 to e.g. 5 (see the snippet after this list).
3 - If you can't change the YARN configuration, you can change the -ytm and -yjm values in order to satisfy the ratio condition, for example: -yjm 4096 -ytm 4096.
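For options 1 and 2, the corresponding yarn-site.xml entries would look something like this (the ratio value of 5 is just an example; the NodeManagers need to be restarted afterwards):
<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
</property>
<property>
  <name>yarn.nodemanager.vmem-pmem-ratio</name>
  <value>5</value>
</property>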
I am going to keep this simple and ask: is there a way to see which pods have an active connection with an endpoint, like a database endpoint?
My cluster contains a few hundred namespaces, and my database provider just told me that the maximum number of connections is almost reached. I want to pinpoint the pod(s) that use multiple connections to our database endpoint at the same time.
I can see from my database cluster that the connections come from my cluster nodes' IPs... but that won't say which pods... and I have quite a lot of pods...
Thanks for the help
Each pod uses its own network namespace, so to check the network connections inside a container's namespace you need to run a command inside that namespace.
Luckily, all containers in a Pod share the same network namespace, so you can add a small sidecar container to the pod that prints the open connections to its log; a minimal sketch follows below.
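A minimal sketch of such a sidecar (image, command and interval are just assumptions), added under spec.containers of the pod:
- name: netstat-sidecar
  image: busybox
  # periodically dump established TCP connections to the container log
  command: ["sh", "-c", "while true; do netstat -tn; sleep 60; done"]
You can then read the output with kubectl logs <pod-name> -c netstat-sidecar.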
Alternatively, you can run the netstat command inside the pod (if the pod has it on its filesystem):
kubectl get pods | grep Running | awk '{ print $1 }' | xargs -I % sh -c 'echo == Pod %; kubectl exec -ti % -- netstat -tunaple' >netstat.txt
# or
kubectl get pods | grep Running | awk '{ print $1 }' | xargs -I % sh -c 'echo == Pod %; kubectl exec -ti % -- netstat -tunaple | grep ESTABLISHED' >netstat.txt
After that you'll have a file on your disk (netstat.txt) with all information about connections in the pods.
The third way is the most complex. You need to find the container ID using docker ps and run the following command to get its PID:
$ pid="$(docker inspect -f '{{.State.Pid}}' "<container_name_or_uuid>")"
Then you need to create a named namespace
(you can use any name you want, or the container name/UUID/pod name as a replacement for namespace_name):
sudo mkdir -p /var/run/netns
sudo ln -sf /proc/$pid/ns/net "/var/run/netns/namespace_name"
Now you can run commands in that namespace:
sudo ip netns exec "namespace_name" netstat -tunaple | grep ESTABLISHED
You need to do that for each pod on each node, so it might be useful for troubleshooting particular containers, but it needs some more automation for your task.
It might also be helpful for you to install Istio in your cluster. It has several interesting features, as mentioned in this answer.
The easiest way is to run netstat on all your Kubernetes nodes:
$ netstat -tunaple | grep ESTABLISHED | grep <ip address of db provider>
The last column is the PID/Program name column, and that's a program running in a container (with a different internal container PID) in a pod on that specific node. There are all kinds of ways to find out which container/pod it is. For example:
# Loop through all containers on the node with
$ docker top <container-id>
Then, after you find the container ID, if you look through all your pods (or use the jq one-liner after the status example below):
$ kubectl get pod <pod-id> -o=yaml
And you can find the status, for example:
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: 2018-11-09T23:01:36Z
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: 2018-11-09T23:01:38Z
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: 2018-11-09T23:01:38Z
    status: "True"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: 2018-11-09T23:01:36Z
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: docker://f64425b3cd0da74a323440bcb03d8f2cd95d3d9b834f8ca5c43220eb5306005d
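If you'd rather search across all pods at once, a one-liner like this (assuming jq is available; <container-id> is the ID you found on the node) maps a container ID back to its namespace/pod:
kubectl get pods --all-namespaces -o json | jq -r '.items[] | select(.status.containerStatuses[]?.containerID // "" | contains("<container-id>")) | .metadata.namespace + "/" + .metadata.name'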
I have a router with NAT port forwarding configured. I launched an HTTP copy of a big file through the NAT. The HTTP server is hosted on the LAN PC which contains the big file to download. I launched the file download from a WAN PC.
I disabled the NAT rule while the file copy was running, but the copy keeps going. I want to stop the file copy when I disable the NAT forward rule, using the conntrack tool.
My conntrack list contains the following conntrack session:
# conntrack -L | grep "33.13"
tcp 6 431988 ESTABLISHED src=192.168.33.13 dst=192.168.33.215 sport=52722 dport=80 src=192.168.3.17 dst=192.168.33.13 sport=80 dport=52722 [ASSURED] use=1
I tried to remove it with the following command:
# conntrack -D --orig-src 192.168.33.13
tcp 6 431982 ESTABLISHED src=192.168.33.13 dst=192.168.33.215 sport=52722 dport=80 src=192.168.3.17 dst=192.168.33.13 sport=80 dport=52722 [ASSURED] use=1
conntrack v1.4.3 (conntrack-tools): 1 flow entries have been deleted.
The conntrack session is removed, as I can see with the following command, but another conntrack session was created whose source IP address is the LAN address of the removed entry:
# conntrack -L | grep "33.13"
tcp 6 431993 ESTABLISHED src=192.168.3.17 dst=192.168.33.13 sport=80 dport=52722 src=192.168.33.13 dst=192.168.33.215 sport=52722 dport=80 [ASSURED] use=1
conntrack v1.4.3 (conntrack-tools): 57 flow entries have been shown.
I tried to remove the new conntrack entry, but it keeps remaining:
# conntrack -D --orig-src 192.168.3.17
# conntrack -L | grep "33.13"
conntrack v1.4.3 (conntrack-tools): 11 flow entries have been shown.
tcp 6 431981 ESTABLISHED src=192.168.3.17 dst=192.168.33.13 sport=80 dport=52722 src=192.168.33.13 dst=192.168.33.215 sport=52722 dport=80 [ASSURED] use=1
What am I missing?
First, if the conntrack -D command succeeds, you should see the message below:
conntrack v1.4.4 (conntrack-tools): 1 flow entries have been deleted.
So we can guess that the track deletion failed.
Why doesn't conntrack delete the track?
Perhaps the session you want to delete is still being referenced by a specific skb or track.
If you want more detailed information, try following the ctnetlink_del_conntrack call stack in the Linux kernel.
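One thing worth trying is matching the entry with the full original tuple from your conntrack -L output instead of only --orig-src, for example:
conntrack -D -p tcp --orig-src 192.168.3.17 --orig-dst 192.168.33.13 --sport 80 --dport 52722
Also note that as long as packets of this flow are still being forwarded and accepted, the kernel will simply recreate the entry, so you may additionally need a firewall rule that drops the flow.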
I have the following ZFS pool on machine A:
root@machineA:/ # zfs list -t all
NAME USED AVAIL REFER MOUNTPOINT
tank 7.44M 28.8G 20K /tank
tank/test 92K 28.8G 19K /tank/test
tank/test@SNAP_2017-June-30_10:00:00 9K - 19K -
tank/test@SNAP_2017-July-01_10:00:00 9K - 19K -
tank/test@SNAP_2017-July-02_10:00:00 9K - 19K -
tank/test@SNAP_2017-July-03_10:00:00 9K - 19K -
tank/test@SNAP_2017-July-04_10:00:00 0 - 19K -
tank/test@BACKUP_from_2017-June-30 0 - 19K -
tank/test/exe 37K 28.8G 19K /tank/test/exe
tank/test/exe@EXE_2017-June-29_13:58:49 9K - 19K -
tank/test/exe@EXE_2017-July-03_10:00:00 9K - 19K -
tank/test/exe@EXE_2017-July-04_10:00:00 0 - 19K -
tank/test/exe@BACKUP_from_2017-June-29
And I want to send a snapshot to machine B:
root@machineB:/ # zfs list -t all
NAME USED AVAIL REFER MOUNTPOINT
tank 6.04M 28.8G 23K /tank
With netcat I can send the snapshots, but the system returns a very unusual error...
If I do:
B: nc -w 5 -l 7766 | zfs recv tank/test/exe
A: zfs send -R tank/test/exe@EXE_2017-July-04_10:00:00 | nc -w 5 192.168.99.2 7766
All is OK, but if I do:
B: nc -w 5 -l 7766 | zfs recv tank/test
A: zfs send -R tank/test@SNAP_2017-July-04_10:00:00 | nc -w 5 192.168.99.2 7766
The stream of snapshots is sent, but on the source side I see:
root@machineA:/ # zfs send -R tank/test@SNAP_2017-July-04_10:00:00 | nc -w 5 192.168.99.2 7766
WARNING: could not send tank/test/exe@SNAP_2017-July-04_10:00:00: does not exist
WARNING: could not send tank/test/exe@SNAP_2017-July-04_10:00:00: does not exist
Why does ZFS pick up the dataset tank/test/exe? Any suggestions?
Actually, the snapshots it's complaining about don't exist on the source system -- tank/test/exe only has @EXE_<date> snapshots, while you're trying to send tank/test/exe@SNAP_<date>. This warning appears because you're sending with -R (recursive) from the top-level tank/test filesystem, which sends the specified snapshot on the parent filesystem first and then searches the children for the same snapshot name to try to send those as well. Usually, this only does what you expect when you've taken the snapshot recursively on the parent filesystem (zfs snapshot -r) -- on your system, you only snapshotted the parent without snapshotting the child at the same time.
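You can double-check which snapshot names each child dataset actually has with something like:
zfs list -t snapshot -r -o name tank/test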
On the sending system, you probably want to take your snapshots recursively (zfs snapshot -r tank/test@SNAP_...) so that the children get a snapshot with the same name, and then send with:
zfs send -R tank/test@SNAP_2017-July-04_10:00:00 | nc -w 5 192.168.99.2 7766