pyflink.fn_execution.beam.beam_boot doesn't close after its job is cancelled


root 64994 14.0 0.0 8099944 85472 ? Sl 16:30 0:03 /bin/python3 -m pyflink.fn_execution.beam.beam_boot --id=5-1 --provision_endpoint=localhost:10514
root 64998 0.0 0.0 108060 684 ? S 16:30 0:00 tee /tmp/python-dist-6b89369d-ba23-4e5d-83d8-7a54dd9a3497/flink-python-udf-boot.log
This Python process keeps running after I cancel its Flink job. What can I do other than killing it manually?

Related

Is there a way to retrieve the list of processes and threads that are in the runnable state (not the running state) in Ubuntu?

My requirement is to do dynamic CPU shielding in a C program based on the queue length of runnable threads (threads that are waiting for a CPU to become available, not threads that are currently running) on a real-time operating system, say Ubuntu with the RT Linux patch. For example, assume the system is configured for the SCHED_FIFO policy.
I am not able to find any command that reports the number of processes in the waiting, running, and runnable states.
Any help is much appreciated.
The command 'ps -T au' shows the state of all runnable as well as running threads as 'R'.
ps -T au
Below is the output I get from the above command. Thread IDs 16841, 16842 and 16843 belong to threads created by the main process 16840. All of these threads show state R, which denotes runnable or running.
Instead, I would like a Linux command or C API that returns the number of processes that are runnable but not running.
USER PID SPID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 914 914 0.1 1.3 428324 105804 tty7 Rsl+ Oct23 1:27 /usr/lib/xorg/Xorg -core :0 -seat seat0 -auth /var/run/lightdm/root/:0 -nolisten
root 914 925 0.0 1.3 428324 105804 tty7 Ssl+ Oct23 0:04 /usr/lib/xorg/Xorg -core :0 -seat seat0 -auth /var/run/lightdm/root/:0 -nolisten
root 1170 1170 0.0 0.0 23004 1772 tty1 Ss+ Oct23 0:00 /sbin/agetty --noclear tty1 linux
senthil 1979 1979 0.0 0.0 29532 5056 pts/11 Ss Oct23 0:00 bash
senthil 2032 2032 0.0 0.0 29552 5212 pts/2 Ss Oct23 0:00 bash
root 16837 16837 0.0 0.0 62092 4132 pts/2 S+ 09:37 0:00 sudo ./sigmain
root 16840 16840 0.0 0.0 31108 796 pts/2 Sl+ 09:37 0:00 ./sigmain
root 16840 16841 95.9 0.0 31108 796 pts/2 Rl+ 09:37 9:01 ./sigmain
root 16840 16842 95.9 0.0 31108 796 pts/2 Rl+ 09:37 9:01 ./sigmain
root 16840 16843 95.9 0.0 31108 796 pts/2 Rl+ 09:37 9:01 ./sigmain
senthil 17225 17225 0.0 0.0 44432 3364 pts/11 R+ 09:46 0:00 ps -T au
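One rough way to approximate the number asked for is sketched below in Python for brevity (the same files can be read from C). It relies on two fields of /proc/<pid>/task/<tid>/stat documented in proc(5): field 3 (state) and field 39 (the CPU the thread last ran on). The heuristic, which is an assumption of this sketch and not a kernel-provided counter, is that at most one thread in state R per CPU is actually executing, so the remaining R threads on that CPU are treated as queued. The function names are hypothetical and the result is only a racy snapshot.

```python
import os
from collections import Counter


def parse_stat(line):
    """Return (state, last_cpu) from one /proc/<pid>/task/<tid>/stat line."""
    # Field 2 (the command name) is parenthesised and may contain spaces or
    # parentheses, so split on the LAST ')' before tokenising the rest.
    rest = line.rsplit(")", 1)[1].split()
    state = rest[0]            # field 3: scheduling state (R, S, D, ...)
    last_cpu = int(rest[36])   # field 39: CPU number last executed on
    return state, last_cpu


def estimate_queued(stat_lines):
    """Estimate threads that are runnable but not running (heuristic)."""
    runnable_per_cpu = Counter()
    for line in stat_lines:
        state, cpu = parse_stat(line)
        if state == "R":
            runnable_per_cpu[cpu] += 1
    # Assumption: one R thread per CPU is on the CPU, the others wait.
    return sum(n - 1 for n in runnable_per_cpu.values())


def read_all_thread_stats(proc_root="/proc"):
    """Collect the stat line of every thread currently visible in /proc."""
    lines = []
    for pid in filter(str.isdigit, os.listdir(proc_root)):
        task_dir = os.path.join(proc_root, pid, "task")
        try:
            tids = os.listdir(task_dir)
        except OSError:
            continue  # process exited while we were scanning
        for tid in tids:
            try:
                with open(os.path.join(task_dir, tid, "stat")) as f:
                    lines.append(f.read())
            except OSError:
                continue  # thread exited while we were scanning
    return lines
```

Calling estimate_queued(read_all_thread_stats()) gives one sample. The thread doing the scan is itself in state R, so it is counted as well.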

LD_PRELOAD in every log in to server

I need to log all terminal commands in Linux.
I have found a C library that works correctly, but it works only when I run LD_PRELOAD=/usr/local/bin/bashpreload.so /bin/bash:
# ldd /bin/bash
linux-vdso.so.1 => (0x00007ffef59f8000)
/usr/local/bin/bashpreload.so (0x00007fe691323000)
libtinfo.so.5 => /lib64/libtinfo.so.5 (0x00007fe691102000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007fe690efe000)
libc.so.6 => /lib64/libc.so.6 (0x00007fe690b6a000)
/lib64/ld-linux-x86-64.so.2 (0x00007fe691524000)
If I log in to the system again after this, ldd no longer shows the library:
[root@XXX ~]# LD_PRELOAD=/usr/local/bin/bashpreload.so /bin/bash
[root@XXX ~]# ldd /bin/bash
linux-vdso.so.1 => (0x00007ffe481f6000)
/usr/local/bin/bashpreload.so (0x00007f3f1b808000)
libtinfo.so.5 => /lib64/libtinfo.so.5 (0x00007f3f1b5e7000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f3f1b3e3000)
libc.so.6 => /lib64/libc.so.6 (0x00007f3f1b04f000)
/lib64/ld-linux-x86-64.so.2 (0x00007f3f1ba09000)
[root@XXX ~]# exit
[root@XXX ~]# logout
Connection to XXX closed.
[sahaquiel@sahaquiel-PC ~]$ ssh root@XXX
root@XXX's password:
Last login: Tue Dec 19 11:28:22 2017 from YYY
[root@XXX ~]# ldd /bin/bash
linux-vdso.so.1 => (0x00007ffca2f98000)
libtinfo.so.5 => /lib64/libtinfo.so.5 (0x00007f19a13ff000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f19a11fb000)
libc.so.6 => /lib64/libc.so.6 (0x00007f19a0e67000)
/lib64/ld-linux-x86-64.so.2 (0x00007f19a1620000)
And one more problem: when I use this library, my current PID changes:
Last login: Tue Dec 19 11:28:54 2017 from YYY
[root@XXX ~]# echo "Library is not uploaded"
Library is not uploaded
[root@XXX ~]# echo $$
4639
[root@XXX ~]# LD_PRELOAD=/usr/local/bin/bashpreload.so /bin/bash
[root@XXX ~]# echo $$
4654
[root@212-24-57-104 ~]# ps awwufx | grep -B5 [4]654
root 1706 0.0 0.0 66256 1192 ? Ss 10:54 0:00 /usr/sbin/sshd
root 4517 0.0 0.0 104636 4644 ? Ss 11:27 0:00 \_ sshd: root@pts/1
root 4519 0.0 0.0 108320 1872 pts/1 Ss+ 11:27 0:00 | \_ -bash
root 4637 0.0 0.0 104636 4624 ? Ss 11:30 0:00 \_ sshd: root@pts/0
root 4639 0.0 0.0 108320 1872 pts/0 Ss 11:30 0:00 | \_ -bash
root 4654 0.0 0.0 110376 1956 pts/0 S 11:31 0:00 | \_ /bin/bash
So, I need two things:
Find the way to do LD_PRELOAD quietly for each logging in user;
Know why after this I'm working in the child /bin/bash process.
Thanks!

This is a classic XY problem. You need to log user actions, have decided on a solution, and are asking questions about that solution, even though the solution won't work.
Using an LD_PRELOAD library is not a reliable way to log user commands, for these reasons:
The user can just unset the LD_PRELOAD environment variable. And no, marking it readonly doesn't work, because it's just a variable in the memory of a process the user controls.
You're setting LD_PRELOAD to a 64-bit shared object. Every 32-bit program will now fail to run.
However your preloaded library logs data, it does so with the user's permissions/access rights. Thus the user can spoof the data recorded.
If you need to log user's actions, use a system designed to do that securely: auditing.
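The first point above, that the user can simply drop LD_PRELOAD, can be illustrated with a minimal Python sketch. The helper names are hypothetical; the sketch only shows that a child process started with a filtered environment never sees the variable, which is all an unprivileged user needs to do to escape the preload.

```python
import os
import subprocess
import sys


def scrubbed_env(env):
    """Return a copy of env with LD_PRELOAD removed."""
    return {k: v for k, v in env.items() if k != "LD_PRELOAD"}


def child_sees_preload(env):
    """Start a child interpreter with env and return its LD_PRELOAD value."""
    code = "import os; print(os.environ.get('LD_PRELOAD', ''))"
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Pretend the login environment had the logging library configured.
login_env = dict(os.environ, LD_PRELOAD="/usr/local/bin/bashpreload.so")
# Any program the user starts with the filtered environment is not preloaded.
print(repr(child_sees_preload(scrubbed_env(login_env))))
```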

Find the way to do LD_PRELOAD quietly for each logging in user
You need to set it somewhere common to all users, such as /etc/profile or /etc/environment.
See How to set environment variable for everyone under my linux system? for more options/details.
Know why after this I'm working in the child /bin/bash process.
That's straightforward: whenever you create a process, its PID is different from its parent's :) When you run /bin/bash, you create another shell, and that's why $$ is different. This has nothing to do with LD_PRELOAD. If you run /bin/bash without LD_PRELOAD, you'll observe exactly the same behaviour.
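The same parent/child relationship can be seen from any language. Here is a minimal Python sketch (the function name is hypothetical) that starts a child process and compares PIDs, mirroring what happens when /bin/bash is started from the login shell:

```python
import os
import subprocess
import sys


def spawn_and_report():
    """Start a child process and return (parent_pid, child_pid, child_ppid)."""
    code = "import os; print(os.getpid(), os.getppid())"
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    child_pid, child_ppid = map(int, result.stdout.split())
    return os.getpid(), child_pid, child_ppid


parent_pid, child_pid, child_ppid = spawn_and_report()
# The child always gets a new PID; its parent PID is the process that started it.
print(parent_pid, child_pid, child_ppid)
```

If keeping the original PID matters, the shell builtin exec replaces the current shell with the new program instead of starting a child.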

If anyone needs the same thing I did:
You can set the environment variable before the user logs in via SSH in /etc/security/pam_env.conf, with syntax like:
LD_PRELOAD DEFAULT= OVERRIDE="/usr/local/bin/bashpreload.so"

Cannot benchmark DynamoDB using YCSB

I need to benchmark DynamoDB with YCSB, and I am using YCSB for the first time.
The DynamoDB provisioned throughput is 100 RCUs and 50 WCUs. This is the command I am executing:
./bin/ycsb load dynamodb -P dynamodb-binding/conf/dynamodb.properties -P workloads/workloada -threads 1 -target 40
The properties file defines the endpoint (us-east-1), AWS credentials, etc. I can run the YCSB shell and perform inserts:
./bin/ycsb shell dynamo
The table schema has only one field, named partition_key. Since DynamoDB is schemaless, YCSB can add any other column, so that should not be a problem.
But when I try to perform a load, I get the following errors:
./bin/ycsb load dynamodb -P dynamodb-binding/conf/dynamodb.properties -P workloads/workloada -threads 1 -target 40
java -cp /opt/ycsb-0.12.0/dynamodb-binding/conf:/opt/ycsb-0.12.0/conf:/opt/ycsb-0.12.0/lib/core-0.12.0.jar:/opt/ycsb-0.12.0/lib/HdrHistogram-2.1.4.jar:/opt/ycsb-0.12.0/lib/htrace-core4-4.1.0-incubating.jar:/opt/ycsb-0.12.0/lib/jackson-core-asl-1.9.4.jar:/opt/ycsb-0.12.0/lib/jackson-mapper-asl-1.9.4.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-api-gateway-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-autoscaling-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-cloudformation-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-cloudfront-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-cloudhsm-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-cloudsearch-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-cloudtrail-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-cloudwatch-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-cloudwatchmetrics-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-codecommit-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-codedeploy-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-codepipeline-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-cognitoidentity-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-cognitosync-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-config-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-core-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-datapipeline-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-devicefarm-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-directconnect-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-directory-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-dynamodb-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-ec2-1.10.48.jar:
/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-ecr-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-ecs-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-efs-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-elasticache-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-elasticbeanstalk-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-elasticloadbalancing-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-elasticsearch-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-elastictranscoder-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-emr-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-events-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-glacier-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-iam-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-importexport-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-inspector-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-iot-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-kinesis-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-kms-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-lambda-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-logs-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-machinelearning-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-marketplacecommerceanalytics-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-opsworks-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-rds-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-redshift-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-route53-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-s3-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-ses-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sd
k-simpledb-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-simpleworkflow-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-sns-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-sqs-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-ssm-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-storagegateway-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-sts-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-support-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-swf-libraries-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-waf-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/aws-java-sdk-workspaces-1.10.48.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/commons-codec-1.6.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/commons-logging-1.1.3.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/dynamodb-binding-0.12.0.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/httpclient-4.3.6.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/httpcore-4.3.3.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/jackson-annotations-2.5.0.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/jackson-core-2.5.3.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/jackson-databind-2.5.3.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/joda-time-2.8.1.jar:/opt/ycsb-0.12.0/dynamodb-binding/lib/log4j-1.2.17.jar com.yahoo.ycsb.Client -db com.yahoo.ycsb.db.DynamoDBClient -P dynamodb-binding/conf/dynamodb.properties -P workloads/workloada -threads 1 -target 40 -load
YCSB Client 0.12.0
Command line: -db com.yahoo.ycsb.db.DynamoDBClient -P dynamodb-binding/conf/dynamodb.properties -P workloads/workloada -threads 1 -target 40 -load
Loading workload...
Starting test.
0 [Thread-1] INFO com.yahoo.ycsb.db.DynamoDBClient -dynamodb connection created with http://dynamodb.us-east-1.amazonaws.com
DBWrapper: report latency for each error is false and specific error codes to track for latency are: []
435 [Thread-1] ERROR com.yahoo.ycsb.db.DynamoDBClient -com.amazonaws.AmazonServiceException: One or more parameter values were invalid: Missing the key partition_key in the item (Service: AmazonDynamoDBv2; Status Code: 400; Error Code: ValidationException; Request ID: BOJ5PRDH5N2H40TDH04ERN47BBVV4KQNSO5AEMVJF66Q9ASUAAJG)
Error inserting, not retrying any more. number of attempts: 1Insertion Retry Limit: 0
[OVERALL], RunTime(ms), 934.0
[OVERALL], Throughput(ops/sec), 0.0
[TOTAL_GCS_PS_Scavenge], Count, 1.0
[TOTAL_GC_TIME_PS_Scavenge], Time(ms), 8.0
[TOTAL_GC_TIME_%PS_Scavenge], Time(%), 0.8565310492505354
[TOTAL_GCS_PS_MarkSweep], Count, 0.0
[TOTAL_GC_TIME_PS_MarkSweep], Time(ms), 0.0
[TOTAL_GC_TIME%PS_MarkSweep], Time(%), 0.0
[TOTAL_GCs], Count, 1.0
[TOTAL_GC_TIME], Time(ms), 8.0
[TOTAL_GC_TIME%], Time(%), 0.8565310492505354
[CLEANUP], Operations, 1.0
[CLEANUP], AverageLatency(us), 1.0
[CLEANUP], MinLatency(us), 1.0
[CLEANUP], MaxLatency(us), 1.0
[CLEANUP], 95thPercentileLatency(us), 1.0
[CLEANUP], 99thPercentileLatency(us), 1.0
[INSERT], Operations, 0.0
[INSERT], AverageLatency(us), NaN
[INSERT], MinLatency(us), 9.223372036854776E18
[INSERT], MaxLatency(us), 0.0
[INSERT], 95thPercentileLatency(us), 0.0
[INSERT], 99thPercentileLatency(us), 0.0
[INSERT], Return=ERROR, 1
[INSERT-FAILED], Operations, 1.0
[INSERT-FAILED], AverageLatency(us), 428928.0
[INSERT-FAILED], MinLatency(us), 428800.0
[INSERT-FAILED], MaxLatency(us), 429055.0
[INSERT-FAILED], 95thPercentileLatency(us), 429055.0
[INSERT-FAILED], 99thPercentileLatency(us), 429055.0
When YCSB loads data for workloada, what kind of data is loaded into the DB (basically, what is the source of that data)? Could anyone please help me understand what I am missing?
~Thanks

I solved this. The problem was that I had named the DynamoDB partition key "partition_key". I changed it to something else and it worked fine.
Thanks.
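One plausible reading of the "Missing the key partition_key in the item" error is that the key name YCSB was configured to write did not match the table's hash key. The sketch below is a quick pre-flight check under that assumption: it reads the dynamodb.primaryKey entry from the binding's properties file (the property the YCSB DynamoDB binding uses for the key name, as far as I know) and compares it with the table's key name. The function names are hypothetical and the parser handles only simple key=value lines.

```python
def parse_properties(text):
    """Parse simple 'key = value' lines, ignoring blanks and # or ! comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#!":
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key=value line
        props[key.strip()] = value.strip()
    return props


def primary_key_matches(props_text, table_hash_key):
    """True if the configured YCSB key name equals the table's hash key name."""
    configured = parse_properties(props_text).get("dynamodb.primaryKey")
    return configured == table_hash_key


example = """
# hypothetical contents of dynamodb-binding/conf/dynamodb.properties
dynamodb.endpoint = http://dynamodb.us-east-1.amazonaws.com
dynamodb.primaryKey = firstname
"""
print(primary_key_matches(example, "partition_key"))  # mismatch in this example
```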

WAL Archive hangs in postgres when gzip is used

I have enabled WAL archiving with the following settings:
wal_keep_segments = 32
archive_mode = on
archive_command = 'gzip < %p > /mnt/nfs/archive/%f'
On the slave I have the following restore settings:
restore_command = 'gunzip < /mnt/nfs/archive/%f > %p'
archive_cleanup_command = '/opt/PostgreSQL/9.4/bin/pg_archivecleanup -d /mnt/nfs/archive %r'
On the master I can see that many files are stuck: around 327 files are yet to be archived. Ideally there should be only 32.
The ps x command shows:
-bash-4.1$ ps x
PID TTY STAT TIME COMMAND
3302 ? S 0:00 /opt/PostgreSQL/9.4/bin/postgres -D /opt/PostgreSQL/9.4/data
3304 ? Ss 0:00 postgres: logger process
3306 ? Ss 0:09 postgres: checkpointer process
3307 ? Ss 0:00 postgres: writer process
3308 ? Ss 0:06 postgres: wal writer process
3309 ? Ss 0:00 postgres: autovacuum launcher process
3311 ? Ss 0:00 postgres: stats collector process
3582 ? S 0:00 sshd: postgres@pts/1
3583 pts/1 Ss 0:00 -bash
3628 ? Ss 0:00 postgres: archiver process archiving 000000010000002D000000CB
3673 ? S 0:00 sh -c gzip < pg_xlog/000000010000002D000000CB > /mnt/nfs/archive/000000010000002D000000CB
3674 ? D 0:00 gzip
3682 ? S 0:00 sshd: postgres@pts/0
3683 pts/0 Ss 0:00 -bash
4070 ? Ss 0:00 postgres: postgres postgres ::1(34561) idle
4074 ? Ss 0:00 postgres: postgres sorriso ::1(34562) idle
4172 pts/0 S+ 0:00 vi postgresql.conf
4192 pts/1 R+ 0:00 ps x
-bash-4.1$ ls | wc -l
327
-bash-4.1$

gzip and gunzip without flags expect to work with files, compressing or uncompressing them in-place. You're trying to use them as stream processors. That's not going to work.
You want to use gzip -c and zcat (or gunzip -c) to tell them to use stdio.
Additionally, though, you should probably use a simple script as the archive_command that:
Writes with gzip -c to a temp file
Moves the temp file to the final location with mv
This ensures that the file is not read by the replica until it's fully written by the master.
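The two steps above (compress to a temp file, then move it into place) could be sketched as follows. This is a minimal illustration, not a production archiver: the function name is hypothetical, the temp file is created in the archive directory so the final rename stays on one file system, and the refusal to overwrite an existing archive file is an added safety assumption.

```python
import gzip
import os
import shutil
import tempfile


def archive_wal(src_path, dest_dir):
    """Gzip src_path into dest_dir under the same file name, atomically."""
    final_path = os.path.join(dest_dir, os.path.basename(src_path))
    if os.path.exists(final_path):
        # Assumption: never overwrite a segment that is already archived.
        raise FileExistsError(final_path)

    # Temp file in the same directory, so os.replace is a same-filesystem rename.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as raw:
            with gzip.GzipFile(fileobj=raw, mode="wb") as gz, \
                    open(src_path, "rb") as src:
                shutil.copyfileobj(src, gz)
            # The gzip trailer is written by now; push the data to disk.
            raw.flush()
            os.fsync(raw.fileno())
        # Readers see either no file or the complete file, never a partial one.
        os.replace(tmp_path, final_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return final_path
```

A small wrapper script would call archive_wal with the values PostgreSQL substitutes for %p and the archive directory, and exit non-zero on any exception so the archiver retries the segment.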
Also, unless the master and replica are sharing the same network file system (or are both on the same host), you might actually need to use scp or similar to transfer the archive files. The restore_command uses paths on the replica, not on the master, so unless the replica server can access the WAL archive via NFS/CIFS/etc, you're going to need to copy the files.

Multiple apache root processes

I noticed today that when making requests from our web server, things were rather slow.
I started looking into it and I've found a load of root owned apache processes.
I don't know for sure that this is actually what's causing things to be slow, but nonetheless, it doesn't look good.
The problem is, I don't know what to do from here.
How do I find out why there are so many root processes?
Could someone recommend a set of tests? I've tried stracing a few of them, and they appear to be doing something, but the output of strace is beyond me.
root 30918 1.8 1.3 84284 52296 ? Ss 14:11 0:01 /usr/sbin/apache2 -k restart
root 30919 0.0 1.1 84420 45612 ? S 14:11 0:00 /usr/sbin/apache2 -k restart
root 30920 0.0 1.1 84420 45604 ? S 14:11 0:00 /usr/sbin/apache2 -k restart
root 30921 0.0 1.1 84420 45612 ? S 14:11 0:00 /usr/sbin/apache2 -k restart
root 30922 0.1 1.1 84420 45612 ? S 14:11 0:00 /usr/sbin/apache2 -k restart
root 30923 0.0 1.1 84420 45612 ? S 14:11 0:00 /usr/sbin/apache2 -k restart
www-data 30926 6.6 1.5 104964 61336 ? S 14:12 0:03 /usr/sbin/apache2 -k restart
root 30930 0.1 1.1 84420 45616 ? S 14:12 0:00 /usr/sbin/apache2 -k restart
root 30933 0.0 1.1 84420 45616 ? S 14:12 0:00 /usr/sbin/apache2 -k restart
root 30935 0.0 1.1 84420 45616 ? S 14:12 0:00 /usr/sbin/apache2 -k restart
root 30936 0.0 1.1 84420 45616 ? S 14:12 0:00 /usr/sbin/apache2 -k restart
root 30937 0.0 1.1 84420 45616 ? S 14:12 0:00 /usr/sbin/apache2 -k restart
root 30938 0.0 1.1 84420 45616 ? S 14:12 0:00 /usr/sbin/apache2 -k restart
root 30961 0.0 1.1 84420 45612 ? S 14:12 0:00 /usr/sbin/apache2 -k restart
root 30989 0.0 1.1 84420 45612 ? S 14:12 0:00 /usr/sbin/apache2 -k restart
root 30990 0.0 1.1 84420 45612 ? S 14:12 0:00 /usr/sbin/apache2 -k restart
root 31011 0.1 1.1 84420 45612 ? S 14:12 0:00 /usr/sbin/apache2 -k restart
root 31013 0.1 1.1 84420 45612 ? S 14:12 0:00 /usr/sbin/apache2 -k restart
root 31014 0.0 1.1 84420 45612 ? S 14:12 0:00 /usr/sbin/apache2 -k restart
www-data 31175 2.5 1.5 104168 60524 ? S 14:12 0:00 /usr/sbin/apache2 -k restart
www-data 31189 2.3 1.4 102360 58920 ? S 14:12 0:00 /usr/sbin/apache2 -k restart
www-data 31190 1.5 1.4 101904 58356 ? S 14:12 0:00 /usr/sbin/apache2 -k restart
www-data 31191 0.3 1.1 84556 46760 ? S 14:12 0:00 /usr/sbin/apache2 -k restart
www-data 31192 1.4 1.4 101916 58384 ? S 14:12 0:00 /usr/sbin/apache2 -k restart
www-data 31193 1.5 1.4 101916 58376 ? S 14:12 0:00 /usr/sbin/apache2 -k restart
root 31240 0.1 1.1 84420 45612 ? S 14:12 0:00 /usr/sbin/apache2 -k restart
This is an example of the output from strace from one of the processes.
--- SIGCHLD (Child exited) @ 0 (0) ---
read(6, 0xff87f6ef, 1) = -1 EAGAIN (Resource temporarily unavailable)
getuid32() = 0
close(17) = 0
gettimeofday({1354109303, 670988}, NULL) = 0
semop(5668864, {{0, -1, SEM_UNDO}}, 1) = 0
accept(4, {sa_family=AF_INET, sin_port=htons(48107), sin_addr=inet_addr("192.168.16.12")}, [16]) = 17
fcntl64(17, F_GETFD) = 0
fcntl64(17, F_SETFD, FD_CLOEXEC) = 0
semop(5668864, {{0, 1, SEM_UNDO}}, 1) = 0
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xf74a2768) = 1949
waitpid(1949, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0) = 1949
--- SIGCHLD (Child exited) @ 0 (0) ---
read(6, 0xff87f6ef, 1) = -1 EAGAIN (Resource temporarily unavailable)
getuid32() = 0
close(17) = 0
gettimeofday({1354109305, 724358}, NULL) = 0
semop(5668864, {{0, -1, SEM_UNDO}}, 1) = 0
accept(4, {sa_family=AF_INET, sin_port=htons(48132), sin_addr=inet_addr("192.168.16.12")}, [16]) = 17
fcntl64(17, F_GETFD) = 0
fcntl64(17, F_SETFD, FD_CLOEXEC) = 0
semop(5668864, {{0, 1, SEM_UNDO}}, 1) = 0
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xf74a2768) = 1974
waitpid(1974, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0) = 1974
--- SIGCHLD (Child exited) @ 0 (0) ---
I've disabled all of the modules in mods-enabled except for essential ones like auth, env, siteenv and alias and started the server. In this case I still get 6 root apache processes and 1 www-data owned apache process.
I've made sure all the modules are up to date.
There are no obvious errors in the logs.
The config follows:
ServerRoot "/etc/apache2"
LockFile /var/lock/apache2/accept.lock
PidFile ${APACHE_PID_FILE}
Timeout 300
KeepAlive On
MaxKeepAliveRequests 100
KeepAliveTimeout 15
<IfModule mpm_worker_module>
StartServers 2
MaxClients 150
MinSpareThreads 25
MaxSpareThreads 75
ThreadsPerChild 25
MaxRequestsPerChild 0
</IfModule>
User ${APACHE_RUN_USER}
Group ${APACHE_RUN_GROUP}
AccessFileName .htaccess
<Files ~ "^\.ht">
Order allow,deny
Deny from all
</Files>
DefaultType text/plain
HostnameLookups Off
ErrorLog /var/log/apache2/error.log
LogLevel warn
Include /etc/apache2/mods-enabled/*.load
Include /etc/apache2/mods-enabled/*.conf
Include /etc/apache2/httpd.conf
Include /etc/apache2/ports.conf
LogFormat "%v:%p %h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" vhost_combined
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
LogFormat "%h %l %u %t \"%r\" %>s %b" common
LogFormat "%{Referer}i -> %U" referer
LogFormat "%{User-agent}i" agent
CustomLog /var/log/apache2/other_vhosts_access.log vhost_combined
Include /etc/apache2/conf.d/
Include /etc/apache2/sites-enabled/
The compiled in modules are:
Compiled in modules:
core.c
mod_log_config.c
mod_logio.c
itk.c
http_core.c
mod_so.c
So I'm only running the mpm_worker config now.
DEBUG UPDATER
When I restart Apache and run ps, I get something like this:
root 26921 0.5 1.3 80008 52452 ? Ss 21:27 0:02 /usr/sbin/apache2 -k start
root 27114 0.0 1.1 80144 44804 ? S 21:34 0:00 /usr/sbin/apache2 -k start
root 27115 0.0 1.1 80144 44820 ? S 21:34 0:00 /usr/sbin/apache2 -k start
root 27116 0.0 1.1 80144 44804 ? S 21:34 0:00 /usr/sbin/apache2 -k start
root 27117 0.0 1.1 80144 44804 ? S 21:34 0:00 /usr/sbin/apache2 -k start
root 27119 0.0 1.1 80144 44804 ? S 21:34 0:00 /usr/sbin/apache2 -k start
If I put LogLevel to debug and restart, then I see these messages from mod_proxy
[Thu Nov 29 21:34:01 2012] [info] Server built: Sep 9 2012 21:17:36
[Thu Nov 29 21:34:01 2012] [debug] itk.c(1100): AcceptMutex: sysvsem (default: sysvsem)
[Thu Nov 29 21:34:01 2012] [debug] proxy_util.c(1818): proxy: grabbed scoreboard slot 0 in child 27115 for worker proxy:reverse
[Thu Nov 29 21:34:01 2012] [debug] proxy_util.c(1818): proxy: grabbed scoreboard slot 0 in child 27114 for worker proxy:reverse
[Thu Nov 29 21:34:01 2012] [debug] proxy_util.c(1934): proxy: initialized single connection worker 0 in child 27115 for (*)
[Thu Nov 29 21:34:01 2012] [debug] proxy_util.c(1837): proxy: worker proxy:reverse already initialized
[Thu Nov 29 21:34:01 2012] [debug] proxy_util.c(1934): proxy: initialized single connection worker 0 in child 27114 for (*)
[Thu Nov 29 21:34:01 2012] [debug] proxy_util.c(1818): proxy: grabbed scoreboard slot 0 in child 27117 for worker proxy:reverse
[Thu Nov 29 21:34:01 2012] [debug] proxy_util.c(1837): proxy: worker proxy:reverse already initialized
[Thu Nov 29 21:34:01 2012] [debug] proxy_util.c(1934): proxy: initialized single connection worker 0 in child 27117 for (*)
[Thu Nov 29 21:34:01 2012] [debug] proxy_util.c(1818): proxy: grabbed scoreboard slot 0 in child 27119 for worker proxy:reverse
[Thu Nov 29 21:34:01 2012] [debug] proxy_util.c(1837): proxy: worker proxy:reverse already initialized
[Thu Nov 29 21:34:01 2012] [debug] proxy_util.c(1934): proxy: initialized single connection worker 0 in child 27119 for (*)
[Thu Nov 29 21:34:01 2012] [debug] proxy_util.c(1818): proxy: grabbed scoreboard slot 0 in child 27116 for worker proxy:reverse
[Thu Nov 29 21:34:01 2012] [debug] proxy_util.c(1837): proxy: worker proxy:reverse already initialized
[Thu Nov 29 21:34:01 2012] [debug] proxy_util.c(1934): proxy: initialized single connection worker 0 in child 27116 for (*)
[Thu Nov 29 21:36:20 2012] [notice] SIGHUP received. Attempting to restart
Notice that the PIDs match. However, if I disable mod_proxy, these messages disappear, but I still get the same number of root processes starting, so I believe this is a symptom, not a cause.

This is absolutely normal for Apache. Each process handles one request at a time, so if there were only one process (called a worker), the server would be really slow when there are lots of users.
The issue I see is that these should not be root-owned processes. Depending on your platform, Apache should have its own user; on Debian, for example, that user is www-data. Then only one process would be owned by root and the rest would be owned by that user.
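To check that expectation quickly, the ps listing can be tallied by owner. The following is a small Python sketch with a hypothetical function name; it assumes the 11-column "ps aux" layout shown in the question (USER first, COMMAND last).

```python
from collections import Counter


def count_by_owner(ps_output, command_substring="apache2"):
    """Count processes per user in 'ps aux' output whose command matches."""
    counts = Counter()
    for line in ps_output.splitlines():
        # ps aux has 11 columns; the last one (COMMAND) may contain spaces.
        fields = line.split(None, 10)
        if len(fields) == 11 and command_substring in fields[10]:
            counts[fields[0]] += 1
    return counts


sample = """\
root 30918 1.8 1.3 84284 52296 ? Ss 14:11 0:01 /usr/sbin/apache2 -k restart
root 30919 0.0 1.1 84420 45612 ? S 14:11 0:00 /usr/sbin/apache2 -k restart
www-data 30926 6.6 1.5 104964 61336 ? S 14:12 0:03 /usr/sbin/apache2 -k restart
"""
print(dict(count_by_owner(sample)))
```

On a live system the input could come from subprocess.run(["ps", "aux"], capture_output=True, text=True).stdout.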
However, speed is determined by several factors: hardware, web server, and web application.
Make sure that the hardware you are running on meets the requirements (enough RAM and CPU).
Lower the number of workers if the hardware is weak, or increase it if the hardware is powerful.
Make sure that the web application (if there is one; often it is a PHP app) is not a performance bottleneck.
PS: sorry for the poor formatting, typed clumsily from a phone.

I know I'm a bit late to the game, but I ran into the same issue and was going nuts trying to figure out what was going on. I'm on Apache 2.4.7, so a bit newer than you, but the general premise is the same.
I had to look in /etc/apache2/mods-enabled/mpm_prefork.conf to find my mpm configuration but you have it right here:
<IfModule mpm_worker_module>
StartServers 2
MaxClients 150
MinSpareThreads 25
MaxSpareThreads 75
ThreadsPerChild 25
MaxRequestsPerChild 0
</IfModule>
Looks like a valid config, which it is. However, your MaxRequestsPerChild, like mine was, is set to 0. I've adjusted it to approximately 10 (can probably go higher but am just testing now) and I think that's solved my problem. Hope this helps!
