Post Keepalived configuration, Server with VIP unable to communicate to other servers - keepalived

I have configured keepalived on two servers, 10.90.11.194 (Server1) and 10.90.11.196 (Server2). Server1 is configured as the MASTER and Server2 as the BACKUP. The VIP 10.90.11.219 successfully moves from Server1 to Server2 when keepalived is stopped on Server1.
Both servers run syslog-ng to receive syslogs from firewalls, proxies, etc. They also have the Splunk Heavy Forwarder installed to forward these incoming syslogs to the Splunk indexers 10.90.11.226 (IDX1), 10.90.11.227 (IDX2) and 10.90.11.228 (IDX3).
Server1, Server2, IDX1, IDX2 and IDX3 are all in the same security group, and any-any connections are allowed between them. The VIP is also allowed inbound and outbound for this security group.
Problem: whichever server currently holds the VIP cannot connect to the indexers (IDX1, IDX2 and IDX3) on port 9997. The same connection works fine from the other server, the one without the VIP.
Keepalived config on the MASTER:
vrrp_instance VI_1 {
    state MASTER
    interface ens3
    virtual_router_id 51
    priority 100
    advert_int 1
    unicast_src_ip 10.90.11.194
    unicast_peer {
        10.90.11.196
    }
    authentication {
        auth_type PASS
        auth_pass 12345
    }
    virtual_ipaddress {
        10.90.11.219/24
    }
}
On the BACKUP, the unicast IPs are reversed, the priority is 50 and the state is BACKUP.
Pinging the IDX devices from the server without the VIP works fine. The problem occurs only on the server holding the VIP, where I get a "Destination Host Unreachable" response from the VIP:
[root@SERVER1 ~]# ping 10.90.11.226
PING 10.90.11.226 (10.90.11.226) 56(84) bytes of data.
From 10.90.11.219 icmp_seq=1 Destination Host Unreachable
From 10.90.11.219 icmp_seq=2 Destination Host Unreachable
From 10.90.11.219 icmp_seq=3 Destination Host Unreachable
[root@SERVER1 ~]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.90.11.193    0.0.0.0         UG    100    0        0 ens3
10.90.11.0      0.0.0.0         255.255.255.0   U     0      0        0 ens3
10.90.11.192    0.0.0.0         255.255.255.224 U     100    0        0 ens3
169.254.169.254 10.90.11.222    255.255.255.255 UGH   100    0        0 ens3
Could any of you please help fix the issue here?
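One way to read the routing table above is to check which route the kernel would pick for an indexer address. The sketch below is a simplified model (longest prefix wins, lowest metric breaks ties), not a confirmed diagnosis. The routes are copied from the `route -n` output; the helper name `pick_route` is made up for illustration. It suggests that 10.90.11.226 falls outside the /27 and is matched by the on-link 10.90.11.0/24 route, which may come from the VIP being configured as 10.90.11.219/24, so the gateway 10.90.11.193 is bypassed.

```python
import ipaddress

# Routes copied from the `route -n` output above: (network, gateway, metric).
# A gateway of None means "directly connected" (0.0.0.0 in the table).
ROUTES = [
    ("0.0.0.0/0", "10.90.11.193", 100),
    ("10.90.11.0/24", None, 0),
    ("10.90.11.192/27", None, 100),
]


def pick_route(dst):
    """Simplified route lookup: longest prefix first, then lowest metric."""
    addr = ipaddress.ip_address(dst)
    candidates = [
        (ipaddress.ip_network(net), gw, metric)
        for net, gw, metric in ROUTES
        if addr in ipaddress.ip_network(net)
    ]
    return max(candidates, key=lambda r: (r[0].prefixlen, -r[2]))


net, gw, metric = pick_route("10.90.11.226")
print(net, "on-link" if gw is None else gw)  # prints "10.90.11.0/24 on-link"
```

If this reading is right, the host with the VIP would try to reach the indexers directly on the wire instead of through the gateway, which would be consistent with the "Destination Host Unreachable" replies sourced from 10.90.11.219.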

Related

Wireguard can not connect to Qnap NAS

Port 51820 is already forwarded on my router and points to my QNAP NAS IP.
My Linux client configuration, set in /etc/wireguard/wg0.conf:
[Interface]
Address = 198.18.7.2/32
SaveConfig = true
ListenPort = 37636
FwMark = 0xca6c
PrivateKey = <client key>
[Peer]
PublicKey = <qnap key>
AllowedIPs = 0.0.0.0/0
Endpoint = <mydyndns>:51820
PersistentKeepalive = 10
When I try to connect:
╭─ender@ender-PC ~
╰─$ sudo wg-quick up wg0
[#] ip link add wg0 type wireguard
[#] wg setconf wg0 /dev/fd/63
[#] ip -4 address add 198.18.7.2/32 dev wg0
[#] ip link set mtu 1420 up dev wg0
[#] ip -4 route add 0.0.0.0/0 dev wg0 table 51820
[#] ip -4 rule add not fwmark 51820 table 51820
[#] ip -4 rule add table main suppress_prefixlength 0
[#] sysctl -q net.ipv4.conf.all.src_valid_mark=1
[#] iptables-restore -n
╭─ender@ender-PC ~
╰─$ ping 1.1.1.1
PING 1.1.1.1 (1.1.1.1) 56(84) bytes of data.
^C
--- 1.1.1.1 ping statistics ---
8 packets transmitted, 0 received, 100% packet loss, time 7168ms
Qnap "server" configuration
The public key of the client has been added.
I've also tried to connect from the Android app, and it does not work either.
I checked the logs on the Linux client: "wg0: Handshake for peer 3 (<nasIP>:51820) did not complete after 5 seconds, retrying (try 2)". The Android app logs the same message. The issue seems to be on the NAS side.
PS: I already have another VPN working (QBelt, QNAP's proprietary protocol), and it is reachable from outside.
QTS version 5.0.0.1837
I hope I come with good, if partial, news: I managed to fix OpenVPN through TeamViewer on a Pi4 connected to the NAS's LAN. I also set up L2TP/IPSec as a second connection option.
Diagnosis of what went wrong during the QTS 4 -> 5 update (1828 20211020): NAT was still properly set on the router, but somehow the OpenVPN server had been assigned to the secondary LAN. So yes, NAT was pointing to the wrong IP address… There is no telling whether the glitch came from QVPN or from Network & Virtual Switch.
Please note the list of issues and changes is impressive. Not surprising for a major release, but still… Next step: I'll set up a WireGuard connection. More later…
I may have missed QTS version on your configuration. Do you mind giving us a few details? Wireguard means a fairly recent version, but which one is important.
I recently updated a new NAS to QTS 5, and a perfectly working OpenVPN server stopped working altogether. It worked like a charm in version 4; being overzealous in updating the server was (sadly, predictably) a mistake. The concrete result is that I now get an infinite timeout on port 1194, which ends with "TLS handshake failed".
Same situation after dropping the OpenVPN configuration on the NAS and recreating it from scratch. So the answer to your problem may be as simple as having to wait for QNAP to fix either QVPN or LAN management on QTS 5.
Unfortunately, I cannot access the NAS remotely anymore, so I can't corroborate your feedback that QBelt is not affected. I will have TeamViewer set up on the LAN tomorrow to get access to the NAS again, and I'll give it a try.
This looks like an error on the keys.
Can you please try to recreate the public and private keys for both server and client? Then ensure that the server's public key and the client's private key are in the client configuration file, and that the client's public key is in the server's peer table.
On a side note, you shouldn't use RFC 2544 addresses (198.18.0.0/15, reserved for benchmarking), even if the QNAP tutorial uses them.
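As a small illustration of the side note, the sketch below classifies a tunnel address as falling in the RFC 2544 benchmarking range or in an RFC 1918 private range. The function name `describe` and the 10.8.0.2 sample address are made up for the example; 198.18.7.2 is the client address from the question.

```python
import ipaddress

# 198.18.0.0/15 is reserved for network benchmarking (RFC 2544).
BENCHMARK_NET = ipaddress.ip_network("198.18.0.0/15")

# The three private ranges defined by RFC 1918.
RFC1918_NETS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def describe(addr):
    """Return a short label for the range an IPv4 address belongs to."""
    ip = ipaddress.ip_address(addr)
    if ip in BENCHMARK_NET:
        return "RFC 2544 benchmarking range"
    if any(ip in net for net in RFC1918_NETS):
        return "RFC 1918 private range"
    return "other"


print(describe("198.18.7.2"))  # address from the client config
print(describe("10.8.0.2"))    # hypothetical alternative tunnel address
```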
The IP of the NAS was not correctly set in the port forwarding mapping...

Multi-Site Active Directory Sync

I have created four Active Directory domain controllers across two locations: Delhi and Mumbai.
Delhi has 2 domain controllers Primary(DDC01) and Secondary(DDC02).
Mumbai has 2 domain controllers Primary(MDC01) and Secondary(MDC02).
The two locations are on different networks, and I can RDP into the domain controllers at either location from the other.
Now I want to connect all 4 Domain Controllers so they can replicate the data and policies.
I saw this can be done through Active Directory Site and Services.
I added the subnets of both sites on the Mumbai DC, i.e. MDC01.
I created the sites Mumbai-HO and Delhi-BO on MDC01, and they were replicated to MDC02.
I can see MDC01 and MDC02 there, but neither DDC01 nor DDC02 shows up.
Am I missing something?
Just FYI... DDC01 and DDC02 have different gateways for some reason.
• Please check that the Active Directory replication ports are open between the Mumbai and Delhi sites, for example by running telnet from a command prompt against each port (telnet can only test the TCP ports). Both inbound and outbound communication on these ports should succeed between the sites. The ports are listed below:
TCP and UDP Port 88 for Kerberos authentication
UDP and TCP Port 135 for domain controller-to-domain controller and client-to-domain controller operations.
TCP Port 139 and UDP 138 for File Replication Service between domain controllers.
TCP and UDP Port 389 for LDAP to handle normal queries from client computers to the domain controllers.
TCP and UDP Port 445 for File Replication Service
TCP and UDP Port 464 for Kerberos Password Change
TCP Port 3268 and 3269 for Global Catalog from client to domain controller.
TCP and UDP Port 53 for DNS from client to domain controller and domain controller to domain controller.
• Check the replication status of the AD sites with the repadmin utility by running one of the commands below in PowerShell on the replicating DCs:
repadmin /syncall /force
repadmin /syncall /APeD
repadmin /replsum
If the output states "Syncall terminated with no errors", replication between the sites is working. You can also check the replication topology in Active Directory Sites and Services, where all sites are listed, whether they were created automatically or manually.
These commands report the replication status and any issues with AD site replication. For more detail on replication issues at site level, run the command below, which writes per-site information in CSV format:
repadmin /showrepl * /csv > showrepl.csv
• Also check whether the Delhi site has been created automatically by the KCC. If not, wait at least 24 hours after the steps above report a successful replication status. Then check the "Cost" parameter of the replication link in the site details. It defines the priority of the network connection between the two sites. Set it according to the actual cost of your network connection.
For more information on AD site replication issues, please refer to the links below:
https://learn.microsoft.com/en-GB/troubleshoot/windows-server/identity/common-active-directory-replication-errors
https://learn.microsoft.com/en-us/troubleshoot/windows-server/identity/diagnose-replication-failures
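The port check suggested in the first bullet can also be scripted rather than done by hand with telnet. The sketch below is a minimal illustration and not an official tool: it tries a TCP connection to each port and reports the result. The names `check_tcp_port` and `report` are made up here, and the port list only covers the TCP side of the ports named above (88 and 389 are included because Kerberos and LDAP also run over TCP); UDP ports cannot be verified this way.

```python
import socket

# TCP ports taken from the list above (UDP-only entries are left out).
AD_TCP_PORTS = [53, 88, 135, 139, 389, 445, 464, 3268, 3269]


def check_tcp_port(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def report(hosts, ports=AD_TCP_PORTS, timeout=3.0):
    """Print one line per host/port pair and return the failed pairs."""
    failed = []
    for host in hosts:
        for port in ports:
            ok = check_tcp_port(host, port, timeout)
            print(f"{host}:{port} {'open' if ok else 'unreachable'}")
            if not ok:
                failed.append((host, port))
    return failed
```

For example, running `report(["DDC01", "DDC02"])` on MDC01, and then the reverse from a Delhi DC, would show which TCP ports are blocked in each direction, assuming those host names resolve from the machine running the script.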

ESXi 6.5 Refusing Connections on 2nd Management Interface

I need to remotely migrate the management interface on an ESXi 6.5 host. Ideally, I would create the new interface, confirm it works and then delete the old.
I have successfully created the new interface using these commands:
esxcli network ip netstack add -N VMManagement
esxcli network ip interface add -i vmk0 -M 00:50:56:67:89:10 -N VMManagement -p mgmt-vm
esxcli network ip interface ipv4 set -i vmk0 -P 1 -t dhcp
esxcli network ip interface tag add -i vmk0 -t Management
Here is the output of esxcli network ip interface ipv4 address list -i vmk0 -N VMManagement:
Name  IPv4 Address  IPv4 Netmask   IPv4 Broadcast  Address Type  Gateway   DHCP DNS
----  ------------  -------------  --------------  ------------  --------  --------
vmk0  10.0.4.27     255.255.255.0  10.0.4.255      DHCP          10.0.4.1      true
I can ping vmk0 but it refuses ports 22 and 443. I am able to access ssh/https on the default management interface. I am testing from a host in 10.0.4.0/24 to eliminate routing/firewall variables.
I have tried completely disabling the ESXi firewall as well as running services.sh restart.
Any ideas?
I was able to make this work by putting both management interfaces in the default netstack. Initially, I didn't think a single netstack could handle multiple gateways, but that assumption turned out to be wrong.
That said, if anyone does know how to get multiple management interfaces running on separate netstacks, please chime in, as that will be a better answer to this question.

Close database connections after inactivity

I have a Mule application that connects to an Oracle database. The application is a SOAP API that allows executing SQL stored procedures. My connector is set up to use connection pooling, and I've been monitoring the connections themselves. I have a maximum pool size of 20, and when making calls to the database, I can see the connections opening (netstat -ntl | grep PORTNUMBER).
tcp4 0 0 IP HERE OTHER IP HERE SYN_SENT
tcp4 0 0 IP HERE OTHER IP HERE ESTABLISHED
tcp4 0 0 IP HERE OTHER IP HERE ESTABLISHED
tcp4 0 0 IP HERE OTHER IP HERE ESTABLISHED
tcp4 0 0 IP HERE OTHER IP HERE ESTABLISHED
tcp4 0 0 IP HERE OTHER IP HERE ESTABLISHED
tcp4 0 0 IP HERE OTHER IP HERE ESTABLISHED
tcp4 0 0 IP HERE OTHER IP HERE ESTABLISHED
tcp4 0 0 IP HERE OTHER IP HERE ESTABLISHED
tcp4 0 0 IP HERE OTHER IP HERE ESTABLISHED
tcp4 0 0 IP HERE OTHER IP HERE ESTABLISHED
tcp4 0 0 IP HERE OTHER IP HERE ESTABLISHED
tcp4 0 0 IP HERE OTHER IP HERE ESTABLISHED
tcp4 0 0 IP HERE OTHER IP HERE ESTABLISHED
tcp4 0 0 IP HERE OTHER IP HERE ESTABLISHED
tcp4 0 0 IP HERE OTHER IP HERE ESTABLISHED
tcp4 0 0 IP HERE OTHER IP HERE ESTABLISHED
tcp4 0 0 IP HERE OTHER IP HERE ESTABLISHED
tcp4 0 0 IP HERE OTHER IP HERE ESTABLISHED
tcp4 0 0 IP HERE OTHER IP HERE ESTABLISHED
tcp4 0 0 IP HERE OTHER IP HERE ESTABLISHED
tcp4 10 0 IP HERE OTHER IP HERE ESTABLISHED
When the calls are done, I expect the connections to be closed after a certain period of time. This does not happen. I've noticed that when the application was running on a server, connections were still open from July (that's a couple of months back).
The only way I found so far that actually closes the connections after a couple of seconds is by enabling XA transactions and setting the Connection Timeout. However, this completely messes up the performance of the application and it's unnecessary overhead.
How would I go about adding such a timeout without using XA connections? I'd like for my database connections to be closed after 20 seconds of inactivity.
Thank you
Edit:
Generic database connector is used - Mule version 3.8.0
The database allows a maximum number of connections, and we have multiple instances of this flow running. Connections stay reserved by one instance, which leaves the other instances unable to get new connections.
The specific issue we had was that one instance still held 120 connections, even though it had last run weeks before. When the second instance requested more connections, it could only get 30, since the maximum on the database side is 150.
You should use a connection pool implementation that gives you control over the time to live of a connection. Ideally the pool should also provide validation queries to detect stale connections.
For example the c3p0 pool has a configuration called maxConnectionAge that seems to match your needs. maxIdleTime also could be of interest.
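To make the idea of an idle timeout concrete, here is a toy pool that closes connections that have sat unused for longer than a limit. It is only a conceptual sketch, not Mule or c3p0 code: the class `IdleEvictingPool` and its methods are invented for illustration, and a real pool would normally run the eviction on a background timer rather than only when asked.

```python
import time


class IdleEvictingPool:
    """Toy pool that closes connections idle for max_idle_seconds or more."""

    def __init__(self, factory, max_idle_seconds, clock=time.monotonic):
        self._factory = factory          # callable that opens a new connection
        self._max_idle = max_idle_seconds
        self._clock = clock              # injectable so it can be tested
        self._idle = []                  # (connection, time it was returned)

    def acquire(self):
        """Reuse a still-fresh idle connection, or open a new one."""
        self.evict_idle()
        if self._idle:
            conn, _ = self._idle.pop()
            return conn
        return self._factory()

    def release(self, conn):
        """Return a connection to the pool and record when it became idle."""
        self._idle.append((conn, self._clock()))

    def evict_idle(self):
        """Close and drop every connection that has been idle too long."""
        now = self._clock()
        keep = []
        for conn, returned_at in self._idle:
            if now - returned_at >= self._max_idle:
                conn.close()
            else:
                keep.append((conn, returned_at))
        self._idle = keep

    def idle_count(self):
        return len(self._idle)
```

With `max_idle_seconds=20`, a connection released and then left alone would be closed on the first eviction pass that runs 20 seconds or more later, which is the behaviour the question asks for.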
You can try using Oracle Transparent Connection Caching if you have no luck with Mule.
A few questions to understand the case better:
Which type of connector are you using (jdbc/database), and which version of Mule is that?
Why do you care about the connections staying open afterwards? Are you observing other symptoms you're not happy with?
Database connections via JDBC are designed to stay open with the intention of being reused. In general, most database technologies including the next generation NoSQL databases, have expensive startup and shutdown costs. Database connections should be established at application startup and closed gracefully at application shutdown. You should not be closing connections after each usage.
Oracle offers a connection pool called UCP. UCP offers options to control stale connections which includes setting a max reuse time and inactive connection timeout among other options.
This can be useful for returning resources to the application as well as checking for broken connections. Regardless, connections should be reused multiple times before closing.

dbus-daemon listening on server port

I have written a simple server program which listens on port 4849. When I start it the first time everything works fine. If I stop and restart it, it fails:
Cannot bind!! ([Errno 98] Address already in use)
Netstat tells me this...
root@node2:/home/pi/woof# netstat -pl
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 *:4849 *:* LISTEN 2426/dbus-daemon
tcp 0 *:* LISTEN 2195/sshd
....
What is dbus-daemon? Do I need it? Why is it listening on the port my server was listening on?
Thanks for any help.
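One possible explanation, which the post does not confirm, is that the server started a child process that inherited the listening socket, so the port stays held after the server itself exits. The sketch below shows a listener that marks its socket as non-inheritable and sets SO_REUSEADDR. The function name `make_listener` is made up, and the code assumes Python 3.4 or later, where `set_inheritable` exists and sockets are already non-inheritable by default.

```python
import socket


def make_listener(port, host="127.0.0.1", backlog=5):
    """Create a listening TCP socket that child processes will not inherit."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Allow rebinding while old connections linger in TIME_WAIT.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # Keep the descriptor out of exec'd children (the default since
    # Python 3.4; set explicitly here to make the intent visible).
    sock.set_inheritable(False)
    sock.bind((host, port))
    sock.listen(backlog)
    return sock


listener = make_listener(0)  # port 0 lets the OS pick a free port for the demo
print("listening on port", listener.getsockname()[1])
```

Note that SO_REUSEADDR alone would not help while another process still holds the port in the LISTEN state, as in the netstat output above; in that case the process holding it has to exit or close the socket first.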
