How to avoid Undertow Connection RESET in apache benchmark test?

Using Apache Bench (ab) with 100K requests and 20K concurrent users:
$ ab -n 100000 -c 20000 http://localhost:8080/mrs/ping
Completed 10000 requests
Completed 20000 requests
Completed 30000 requests
Completed 40000 requests
Completed 50000 requests
Completed 60000 requests
Completed 70000 requests
Completed 80000 requests
Completed 90000 requests
apr_socket_recv: Connection reset by peer (104) <<< HOW to overcome??
Below is the Undertow (version 1.2.6 + xnio-api 3.3.1) PingServer:
import java.util.Date;

import org.apache.log4j.Logger; // assumed: log4j, whose Logger.getLogger(Class) matches the call below
import org.xnio.Options;

import io.undertow.Handlers;
import io.undertow.Undertow;
import io.undertow.UndertowOptions;
import io.undertow.server.HttpHandler;
import io.undertow.server.HttpServerExchange;
import io.undertow.server.handlers.PathHandler;
import io.undertow.util.Headers;

public class UndertowPingServer {

    private static Logger log = Logger.getLogger(UndertowPingServer.class);

    public static void main(String[] args) {
        PathHandler path = Handlers.path()
            .addPrefixPath("/mrs/ping", new HttpHandler() {
                @Override
                public void handleRequest(HttpServerExchange exchange) throws Exception {
                    exchange.getResponseHeaders().put(Headers.CONTENT_TYPE, "text/plain");
                    exchange.getResponseSender().send("Server Time:" + new Date().toString() + "\n\n");
                }
            });

        Undertow.Builder builder = Undertow.builder()
            .setHandler(path)
            .addHttpListener(8080, "0.0.0.0")
            .setBufferSize(1024 * 16)
            // this seems slightly faster in some configurations
            .setIoThreads(Runtime.getRuntime().availableProcessors() * 2)
            .setSocketOption(Options.BACKLOG, 500000)
            .setWorkerThreads(2000)
            // don't send a keep-alive header for HTTP/1.1 requests, as it is not required
            .setServerOption(UndertowOptions.ALWAYS_SET_KEEP_ALIVE, false);

        Undertow server = builder.build();
        server.start();
        log.info("micro-service running!");
    }
}
All of the required Linux kernel socket and thread settings have already been applied via sysctl; that is why the first 90K requests with 20K concurrent users complete without issue.

You can add the -r parameter to ab to prevent it from exiting on such errors.
From https://httpd.apache.org/docs/2.4/programs/ab.html:
-r
Don't exit on socket receive errors.
Connections will then start to time out, which can prolong the testing time by the timeout period.
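For this benchmark, that would be the same command with only -r added:
$ ab -r -n 100000 -c 20000 http://localhost:8080/mrs/ping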

I just found a similar issue described at:
blog.scene.ro/posts/apache-benchmark-apr_socket_recv
sudo sysctl -w net.ipv4.tcp_syncookies=0
This did the job: no more Connection reset by peer (104). (With syncookies enabled, the kernel starts answering SYNs with cookies once the SYN backlog overflows, which appears to be what produced the resets under this load.)
So this is probably not an Undertow or xnio-api issue.

Another observation:
Undertow.Builder builder = Undertow.builder()
    .setSocketOption(Options.BACKLOG, 100000) // <<< does impact the resets
This may also be related to:
$ sysctl -a | grep -i sync
...
net.ipv4.tcp_max_syn_backlog = 100000

Related

Embedded ActiveMQ connection taking a long time to shutdown

I'm using Camel with embedded ActiveMQ for some functional tests. But after the tests have run, ActiveMQ is having trouble shutting down, and I see logs like this:
2022-01-31 19:12:34,873 [MQ ShutdownHook] INFO BrokerService - Apache ActiveMQ 5.15.9 (localhost, ID:8dc9dbba2775-37769-1643655318556-0:2) is shutting down
...
2022-01-31 19:12:34,875 [MQ ShutdownHook] INFO TransportConnector - Connector tcp://localhost:0 stopped
2022-01-31 19:12:35,052 [m://localhost#4] INFO PooledConnectionFactory - Expiring connection ActiveMQConnection {id=ID:8dc9dbba2775-37769-1643655318556-4:3,clientId=ID:8dc9dbba2775-37769-1643655318556-3:2,started=false} on IOException: peer (vm://localhost#5) stopped.
...
2022-01-31 19:12:35,053 [MQ ShutdownHook] INFO BrokerService - Apache ActiveMQ 5.15.9 (localhost, ID:8dc9dbba2775-37769-1643655318556-0:19) is shutdown
2022-01-31 19:12:40,052 [MQ ShutdownHook] INFO TransportConnection - The connection to 'vm://localhost#12' is taking a long time to shutdown.
2022-01-31 19:12:45,052 [MQ ShutdownHook] INFO TransportConnection - The connection to 'vm://localhost#12' is taking a long time to shutdown.
2022-01-31 19:12:50,053 [MQ ShutdownHook] INFO TransportConnection - The connection to 'vm://localhost#12' is taking a long time to shutdown.
A connection shutdown is failing, and that last message keeps repeating.
The config is done programmatically for the tests:
private static void addActiveMqComponent()
{
    if (FunctionalTestFramework.framework().context().getComponent("activemq2") == null)
    {
        JmsConfiguration jmsConfig = new JmsConfiguration();

        ActiveMQConnectionFactory connectionFactory =
                new ActiveMQConnectionFactory("vm://localhost:61616?broker.persistent=false");
        connectionFactory.setTrustAllPackages(true);

        RedeliveryPolicy redeliveryPolicy = new RedeliveryPolicy();
        redeliveryPolicy.setInitialRedeliveryDelay(1000);
        redeliveryPolicy.setRedeliveryDelay(1000);
        redeliveryPolicy.setBackOffMultiplier(3.5);
        redeliveryPolicy.setUseExponentialBackOff(true);
        redeliveryPolicy.setMaximumRedeliveries(-1);
        connectionFactory.setRedeliveryPolicy(redeliveryPolicy);

        jmsConfig.setConnectionFactory(connectionFactory);

        transactionManager = new JmsTransactionManager(); // field declared elsewhere in the test class
        transactionManager.setConnectionFactory(connectionFactory);
        jmsConfig.setTransactionManager(transactionManager);

        ActiveMQComponent activeMqComponent = new ActiveMQComponent();
        activeMqComponent.setConfiguration(jmsConfig);
        activeMqComponent.setTransacted(true);
        activeMqComponent.setCacheLevelName("CACHE_CONSUMER");

        FunctionalTestFramework.framework().context().addComponent("activemq2", activeMqComponent);
    }
}
I've seen some other threads suggesting setting useShutdownHook="false", but that only applies when shutdown is triggered explicitly (broker.stop()), which I'm not doing. And I'm not finding much else on this specific problem.
Any ideas on what the issue could be?
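One idea worth trying (a hedged sketch, not from the original thread): shut down the consumers and the connection pool explicitly before the broker's shutdown hook runs, so no vm:// peers are left for the broker to wait on. Your logs show a PooledConnectionFactory even though the snippet builds a plain ActiveMQConnectionFactory, so presumably the framework wraps it in a pool; how you obtain the three handles below depends on your test framework, and the class and method names are illustrative.
import org.apache.activemq.broker.BrokerService;
import org.apache.activemq.pool.PooledConnectionFactory;
import org.apache.camel.CamelContext;

public final class MessagingTeardown {
    private MessagingTeardown() {}

    // Explicit teardown order: consumers first, pool second, broker last.
    public static void shutdownMessaging(CamelContext camelContext,
                                         PooledConnectionFactory pooledFactory,
                                         BrokerService broker) throws Exception {
        camelContext.stop();    // stop routes/consumers so their vm:// connections close first
        pooledFactory.stop();   // release pooled connections instead of letting them expire
        if (broker != null) {
            broker.stop();      // the broker now has no live vm:// peers to wait on
            broker.waitUntilStopped();
        }
    }
}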

PAHO MQTT 5 throwing exception when using same clientId in routes

When using paho-mqtt5:test more than once with the same clientId, it throws the exception Client is not connected, but if I use a different clientId for each to and from, then it works fine.
2021-10-05 19:25:28,650 ERROR [org.apa.cam.pro.err.DefaultErrorHandler] (Camel (camel-1) thread #0 - timer://test) Failed delivery for (MessageId: 871E4623819E4FB-000000000000001B on ExchangeId: 871E4623819E4FB-000000000000001B). Exhausted after delivery attempt: 1 caught: Client is not connected (32104)
Message History (complete message history is disabled)
---------------------------------------------------------------------------------------------------------------------------------------
RouteId ProcessorId Processor Elapsed (ms)
[route1 ] [route1 ] [from[timer://test?period=1000] ] [ 0]
...
[route1 ] [to1 ] [paho:test ] [ 0]
Stacktrace
---------------------------------------------------------------------------------------------------------------------------------------
: Client is not connected (32104)
at org.eclipse.paho.mqttv5.client.internal.ExceptionHelper.createMqttException(ExceptionHelper.java:32)
at org.eclipse.paho.mqttv5.client.internal.ClientComms.sendNoWait(ClientComms.java:231)
at org.eclipse.paho.mqttv5.client.MqttAsyncClient.publish(MqttAsyncClient.java:1530)
at org.eclipse.paho.mqttv5.client.MqttClient.publish(MqttClient.java:564)
at org.apache.camel.component.paho.mqtt5.PahoMqtt5Producer.process(PahoMqtt5Producer.java:55)
at org.apache.camel.support.AsyncProcessorConverterHelper$ProcessorToAsyncProcessorBridge.process(AsyncProcessorConverterHelper.java:66)
at org.apache.camel.processor.SendProcessor.process(SendProcessor.java:172)
at org.apache.camel.processor.errorhandler.RedeliveryErrorHandler$SimpleTask.run(RedeliveryErrorHandler.java:463)
at org.apache.camel.impl.engine.DefaultReactiveExecutor$Worker.schedule(DefaultReactiveExecutor.java:179)
at org.apache.camel.impl.engine.DefaultReactiveExecutor.scheduleMain(DefaultReactiveExecutor.java:64)
at org.apache.camel.processor.Pipeline.process(Pipeline.java:184)
at org.apache.camel.impl.engine.CamelInternalProcessor.process(CamelInternalProcessor.java:398)
at org.apache.camel.component.timer.TimerConsumer.sendTimerExchange(TimerConsumer.java:210)
at org.apache.camel.component.timer.TimerConsumer$1.run(TimerConsumer.java:76)
at java.base/java.util.TimerThread.mainLoop(Timer.java:556)
at java.base/java.util.TimerThread.run(Timer.java:506)
Here is my code, which throws the exception:
@ApplicationScoped
class TestRouter : RouteBuilder() {
    override fun configure() {
        val mqtt5Component = PahoMqtt5Component()
        mqtt5Component.configuration = PahoMqtt5Configuration().apply {
            brokerUrl = "tcp://192.168.99.101:1883"
            clientId = "paho123"
            isCleanStart = true
        }
        context.addComponent("paho-mqtt5", mqtt5Component)

        from("timer:test?period=1000").setBody(constant("Testing timer2")).to("paho-mqtt5:test")

        from("paho-mqtt5:test").process { e ->
            val body = (e.`in`?.body as? ByteArray)?.let { String(it) }
            println("test body 1 => $body")
        }
    }
}
@William, this is expected behavior.
The message broker uses the client id to differentiate between clients so it can perform housekeeping for client connections that are no longer used.
In addition, a client may have a "Last Will and Testament" that the broker keeps track of.
It is acceptable to append a random number to the end of your current clientId (see the snippet below), since it is likely no one but you will care about it.
If you have access to the individual's login, you could use that as well, but you would still want to make each session unique in case they run multiple sessions.
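For example, a minimal sketch of the random-suffix idea (Java for illustration; "paho123" is the client id from the question, everything else is made up):
import java.util.UUID;

public class ClientIds {
    // Append a random suffix so each client instance gets a unique MQTT client id.
    static String uniqueClientId(String base) {
        return base + "-" + UUID.randomUUID();
    }

    public static void main(String[] args) {
        System.out.println(uniqueClientId("paho123")); // e.g. paho123-9f0c4a...
    }
}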
Maybe I don't understand what your problem is.
Each client must have a unique id.
What are you observing that makes you think it is creating multiple connections for a single client?
Is there a chance you are opening multiple windows and each is generating a different clientId?
A good way to diagnose such issues is to monitor what the server is seeing.
My paho-mqtt client (JavaScript) connects as "webclient", and I append a random number (webclient173) to identify this client.
To troubleshoot, I would suggest you close all connections on the client and monitor the log of the MQTT process.
Once the monitor is in place, open a connection from a client that currently has no connections.
Here is an example connection in my Mosquitto log file:
$ tail -f /var/log/mosquitto/mosquitto.log
1635169943: No will message specified.
1635169943: Sending CONNACK to webclient173 (0, 0)
1635169943: Received SUBSCRIBE from webclient173
1635169943: testtopic (QoS 0)
1635169943: Sending SUBACK to webclient173
1635170003: Received PINGREQ from webclient173
1635170003: Sending PINGRESP to webclient173
1635170003: Received PINGREQ from webclient173
1635170003: Sending PINGRESP to webclient173
What does your log show?

Google Cloud Run pubsub pull listener app fails to start

I'm testing a Pub/Sub "pull" subscriber on Cloud Run, using just the listener part of this sample Java code (SubscribeAsyncExample... reworked slightly to fit in my SpringBoot app):
https://cloud.google.com/pubsub/docs/quickstart-client-libraries#java_1
It fails to start up during deploy... but while it's trying to start, it does pull items from the Pub/Sub queue. Originally, I had an HTTP "push" receiver (a @RestController) on a different Pub/Sub topic and that worked fine. Any suggestions? I'm new to Cloud Run. Thanks.
Deploying...
Creating Revision... Cloud Run error: Container failed to start. Failed to start and then listen on the port defined
by the PORT environment variable. Logs for this revision might contain more information....failed
Deployment failed
In the logs:
2020-08-11 18:43:22.688 INFO 1 --- [ main] o.s.web.context.ContextLoader : Root WebApplicationContext: initialization completed in 4606 ms
2020-08-11T18:43:25.287759Z Listening for messages on projects/ce-cxmo-dev/subscriptions/AndySubscriptionPull:
2020-08-11T18:43:25.351650801Z Container Sandbox: Unsupported syscall setsockopt(0x18,0x29,0x31,0x3eca02dfd974,0x4,0x28). It is very likely that you can safely ignore this message and that this is not the cause of any error you might be troubleshooting. Please, refer to https://gvisor.dev/c/linux/amd64/setsockopt for more information.
2020-08-11T18:43:25.351770555Z Container Sandbox: Unsupported syscall setsockopt(0x18,0x29,0x12,0x3eca02dfd97c,0x4,0x28). It is very likely that you can safely ignore this message and that this is not the cause of any error you might be troubleshooting. Please, refer to https://gvisor.dev/c/linux/amd64/setsockopt for more information.
2020-08-11 18:43:25.680 WARN 1 --- [ault-executor-0] i.g.n.s.i.n.u.internal.MacAddressUtil : Failed to find a usable hardware address from the network interfaces; using random bytes: ae:2c:fb:e7:92:9c:2b:24
2020-08-11T18:45:36.282714Z Id: 1421389098497572
2020-08-11T18:45:36.282763Z Data: We be pub-sub'n in pull mode2!!
Nothing else appears after this, and the app stops running.
import org.springframework.stereotype.Component;

import com.google.cloud.pubsub.v1.AckReplyConsumer;
import com.google.cloud.pubsub.v1.MessageReceiver;
import com.google.cloud.pubsub.v1.Subscriber;
import com.google.pubsub.v1.ProjectSubscriptionName;
import com.google.pubsub.v1.PubsubMessage;

@Component
public class AndyTopicPullRecv {

    public AndyTopicPullRecv()
    {
        subscribeAsyncExample("ce-cxmo-dev", "AndySubscriptionPull");
    }

    public static void subscribeAsyncExample(String projectId, String subscriptionId) {
        ProjectSubscriptionName subscriptionName =
                ProjectSubscriptionName.of(projectId, subscriptionId);

        // Instantiate an asynchronous message receiver.
        MessageReceiver receiver =
                (PubsubMessage message, AckReplyConsumer consumer) -> {
                    // Handle incoming message, then ack the received message.
                    System.out.println("Id: " + message.getMessageId());
                    System.out.println("Data: " + message.getData().toStringUtf8());
                    consumer.ack();
                };

        Subscriber subscriber = null;
        try {
            subscriber = Subscriber.newBuilder(subscriptionName, receiver).build();
            // Start the subscriber.
            subscriber.startAsync().awaitRunning();
            System.out.printf("Listening for messages on %s:\n", subscriptionName.toString());
            // Block indefinitely unless an unrecoverable error occurs
            // (the bounded 30s variant from the sample is left commented out).
            // subscriber.awaitTerminated(30, TimeUnit.SECONDS);
            subscriber.awaitTerminated();
            System.out.printf("Async subscribe terminated on %s:\n", subscriptionName.toString());
        // } catch (TimeoutException timeoutException) {
        } catch (Exception e) {
            // Stop receiving messages on failure (guarding against a null subscriber).
            if (subscriber != null) {
                subscriber.stopAsync();
            }
            System.out.printf("Async subscriber exception: " + e);
        }
    }
}
Kolban's question is very important! With the shared code, I would say "no". The Cloud Run contract is clear:
Your service must answer HTTP requests. Outside of a request, you pay nothing and no CPU is dedicated to your instance (the instance is like a daemon when no request is being processed).
Your service must be stateless (not your case here; I won't spend time on this).
If you want to pull from your Pub/Sub subscription, create an endpoint in your code with a REST controller. While you are processing this request, run your pull mechanism and process messages (see the sketch below).
This endpoint can be called by Cloud Scheduler regularly to keep the process up.
Be careful: there is a maximum request processing timeout of 15 minutes (today; subject to change in the near future). So you can't run your process for more than 15 minutes. Make it resilient to failure and set your scheduler to call your service every 15 minutes.
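A minimal sketch of that pattern (my illustration, not from the original answer; it assumes Spring Web plus the Pub/Sub client already used in the question, and the /pull path and 9-minute window are arbitrary choices):
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RestController;

import com.google.cloud.pubsub.v1.Subscriber;
import com.google.pubsub.v1.ProjectSubscriptionName;

@RestController
public class PullController {

    @PostMapping("/pull")
    public String pull() throws Exception {
        ProjectSubscriptionName subscriptionName =
                ProjectSubscriptionName.of("ce-cxmo-dev", "AndySubscriptionPull");
        Subscriber subscriber = Subscriber.newBuilder(subscriptionName,
                (message, consumer) -> {
                    System.out.println("Data: " + message.getData().toStringUtf8());
                    consumer.ack();
                }).build();
        subscriber.startAsync().awaitRunning();
        try {
            // Pull for a bounded window, kept well under the Cloud Run request timeout.
            subscriber.awaitTerminated(9, TimeUnit.MINUTES);
        } catch (TimeoutException expected) {
            // Normal exit: the window elapsed without an unrecoverable error.
        } finally {
            subscriber.stopAsync().awaitTerminated();
        }
        return "pull window finished";
    }
}
Because the constructor no longer blocks, the container can bind to $PORT at startup, and messages are only pulled while a request (e.g. from Cloud Scheduler) is in flight.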

read string datastream in Flink from socket without using netcat server

I have a scenario in which a stream-generator client generates multiple streams, merges them, and sends the result to a socket, and I want the Flink program to listen to it as the server. As we know, the server has to be up first so that it can listen to client requests. I tried to do this with the code given below:
public static void main(String[] args) throws Exception {
    // set up the StreamExecutionEnvironment
    StreamExecutionEnvironment environment = StreamExecutionEnvironment.getExecutionEnvironment();
    environment.setParallelism(1);

    DataStream<String> stream1 = environment.socketTextStream("localhost", 9000);
    stream1.print();

    // start the execution
    environment.execute(" Started the execution ");
} // main
The code for the stream generator acting as the client is given below:
DataStream<Event> stream1 = environment
        .addSource(new EventGenerator(2, 60, 1, 1, 100, 200))
        .name("stream 1")
        .setParallelism(parallelism_for_stream_rr);

DataStream<Event> stream2 = environment
        .addSource(new EventGenerator(3, 60, 1, 2, 10, 20))
        .name("stream 2")
        .setParallelism(parallelism_for_stream_rr);

DataStream<Event> stream3 = environment
        .addSource(new EventGenerator(5, 60, 1, 3, 30, 40))
        .name("stream 3")
        .setParallelism(parallelism_for_stream_rr);

DataStream<Event> merged = stream1.union(stream2, stream3);
merged.print();

// sending data to Mobile Cep via socket
merged.map(new MapFunction<Event, String>() {
    @Override
    public String map(Event event) throws Exception {
        String tuple = event.toString();
        return tuple + "\n";
    }
}).writeToSocket("localhost", 9000, new SimpleStringSchema());
Issue #1: The client code works only when I start a netcat server first, but then the netcat server doesn't forward the data streams. If the netcat server is not up, the client code says it can't make a connection.
Issue #2: The Flink program doesn't execute if the netcat server is not up:
Caused by: java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
I know that one possible solution for this is to generate the streams within the Flink program, but I want to receive the streams via socket.
Thanks in Advance ~
Neither Flink's socket source nor its sink starts a TCP server and waits for incoming connections. They are both clients which connect to an already started TCP server. That's also why you have to start netcat before launching the jobs. If you want to write to and read from a socket, then you have to write a TCP server which can buffer the incoming data and forward it once a client connects, along the lines of the sketch below.
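A minimal single-client sketch of such a relay (assumptions: one producer, one consumer, line-delimited text, no reconnection handling; the port split is illustrative, e.g. the generator writes to 9000 and Flink's socketTextStream reads from 9001):
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class SocketRelay {
    public static void main(String[] args) throws IOException {
        BlockingQueue<String> buffer = new LinkedBlockingQueue<>();

        // Producer side: the stream generator's writeToSocket(...) connects here.
        new Thread(() -> {
            try (ServerSocket in = new ServerSocket(9000);
                 BufferedReader reader = new BufferedReader(
                         new InputStreamReader(in.accept().getInputStream()))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    buffer.put(line); // buffer until the consumer drains it
                }
            } catch (IOException | InterruptedException e) {
                e.printStackTrace();
            }
        }).start();

        // Consumer side: Flink's socketTextStream("localhost", 9001) connects here.
        try (ServerSocket out = new ServerSocket(9001);
             PrintWriter writer = new PrintWriter(out.accept().getOutputStream(), true)) {
            while (true) {
                writer.println(buffer.take()); // forward buffered lines as they arrive
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
Since each side runs its own ServerSocket, the generator job and the Flink job can then be started in either order.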

Mongoose opening multiple unwanted TCP sockets on reconnect

Wanting to test a MongoDB server up/down procedure connected to Node/Mongoose, we found out that Mongoose can sometimes open hundreds of TCP sockets (which is unnecessary and potentially blocking for a user who is limited to a certain number of sockets). This occurs in the following case and environment:
Node supervised with PM2 and MongoDB supervised with daemontools
At a normal, clean startup:
$ netstat -alpet | grep mongo
tcp 0 0 *:27017 *:* LISTEN mongo 65910844 22930/mongod
tcp 0 0 localhost.localdomain:27017 localhost.localdomain:54595 ESTABLISHED mongo 65911104 22930/mongod
The last "ESTABLISHED" line repeated 5 times since the option (poolSize: 5) is specified in Mongoose ("mongo" is the user running mongod under daemontools)
When we have the Node procedure :
mongoose.connection.on('disconnected', function () {
    var options = {
        server: {
            auto_reconnect: true,
            poolSize: 5,
            socketOptions: { connectTimeoutMS: 5000 }
        }
    };
    console.log('Mongoose default connection disconnected ' + mongoose.connection.readyState);
    mongoose.connect(dbURI, options);
});
and we bring MongoDB down via daemontools (mongodbdaemon is a simple mongod command):
svc -d /service/mongodbdaemon
there is of course no mongod running on the system (verified with the netstat command), and the web server pages that use Mongoose report what is expected:
{"name":"MongoError","message":"topology was destroyed"}
The problem occurs at this stage. From the moment we bring down MongoDB, Mongoose accumulates connect() calls in the 'disconnected' event handler. This means that the longer we wait before bringing MongoDB back up, the more TCP connections will be opened.
So bringing MongoDB back up with
svc -u /service/mongodbdaemon
gives the following:
$ netstat -alpet | grep mongo | wc -l
850
That is 850 'ESTABLISHED' TCP connections to mongod! If we bring mongod down again, the hundreds of connections remain in the TIME_WAIT state until Linux cleans up the socket pool.
Questions
Can we check whether a MongoDB instance is available before connecting to it?
Can we configure Mongoose not to accumulate reconnect attempts every millisecond or so?
Is there a buffer for pending connection operations (as there is for mongoose.insert[...]) that we can access or clean manually?
The problem is reproducible on CentOS 6.7 / MongoDB 3.0.6 / Mongoose 4.1.8 / Node 4.0.0.
Edit:
On the official Mongoose site, where I posted this question after posting it here, I received an answer: with auto_reconnect: true on the initial connect() operation (which is the default), there is no reason to call connect() in a 'disconnected' event callback.
This is true and it works just fine: keep auto_reconnect in the initial connect() options and only log in the 'disconnected' handler, without calling connect() again. But the question now is why this happens and how to avoid it (it is serious enough at the Linux system level to be an issue in Mongoose).
Thanks!
