openGauss memory errors during sysbench mass data writing

Operation steps & problem description
1. sysbench prepare: 100 tables, 100 million rows per table, 50 concurrent threads.
I also tried lowering the concurrency to 25 and the data volume to 50 million rows per table; either way, various memory errors eventually occur during the secondary-index creation step.
Parameters:
Physical server memory: 128GB
gs_guc reload -N all -I all -c "shared_buffers='30GB'"
gs_guc reload -N all -I all -c "max_process_memory='90GB'"
gs_guc reload -N all -I all -c "maintenance_work_mem='10GB'"
Error 1:
FATAL: `sysbench.cmdline.call_command' function failed: ./oltp_common.lua:245: db_bulk_insert_next() failed
FATAL: PQexec() failed: 7 memory is temporarily unavailable
FATAL: failed query was: CREATE INDEX k_56 ON sbtest56(k)
FATAL: `sysbench.cmdline.call_command' function failed: ./oltp_common.lua:253: SQL error, errno = 0, state = 'YY006': memory is temporarily unavailable
Creating table 'sbtest76'...
Inserting 100000000 records into
Error 2:
Message from syslogd@testserver at Feb 23 10:19:45 ...
systemd:Caught , cannot fork for core dump: Cannot allocate memory
Error 3:
openGauss crashes.
Creating a secondary index on 'sbtest9'...
Segmentation fault (core dumped)
Log for error 3:
could not fork new process for connection: Cannot allocate memory
could not fork new process for connection: Cannot allocate memory

Related

Modify kernel parameters (mmap operation not permitted, EPERM)

I am trying to run this code on a server (a Red Pitaya) as well as a client (an Ubuntu virtual machine).
The program returns the following error messages when it is run on the client with root privileges:
root@VirtualBox:/.../rp_remote_acquire# ./rp_remote_acquire -m 1 -a 192.169.1.100 -p 5000 -k 0 -c 0 -d 64
mmap scope io failed (non-fatal), 1
mmap scope ddr a failed (non-fatal), 1
Segmentation fault (core dumped)
I am not sure if the segmentation fault is related to the first two errors because I only get a segmentation fault when the server is running...
The error seems to be coming from here:
if (param->mapped_io == MAP_FAILED) {
    fprintf(stderr, "mmap scope io failed (non-fatal), %d\n", errno);
    param->mapped_io = NULL;
}
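For context, the failing call is roughly like the following minimal, self-contained sketch (my own illustration, not the actual rp_remote_acquire source; the physical address is a placeholder). With CONFIG_STRICT_DEVMEM in effect and no iomem=relaxed on the kernel command line, the mmap() of /dev/mem returns MAP_FAILED with errno 1 (EPERM), which matches the ", 1" in the output above:

/* sketch: map one page of physical memory through /dev/mem */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void) {
    int fd = open("/dev/mem", O_RDWR | O_SYNC);
    if (fd < 0) {
        perror("open /dev/mem");
        return 1;
    }
    /* 0x40100000 is a placeholder, not the real Red Pitaya scope base */
    void *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED,
                   fd, 0x40100000);
    if (p == MAP_FAILED)
        fprintf(stderr, "mmap failed (non-fatal), %d: %s\n",
                errno, strerror(errno));
    else
        munmap(p, 4096);
    close(fd);
    return 0;
}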
I am aware that a similar problem has already been resolved on Stack Overflow.
I tried
sysctl dev.mem.restricted
and I tried adding
linux /boot/vmlinuz-linux iomem=relaxed
to the end of
/boot/grub/grub.cfg
and rebooting, but the problem still persists...
I would like to allow this program to access the computer's physical memory and thereby hopefully resolve all the errors. It could well be that I didn't manage to set the kernel parameters correctly.
Could someone please point me in the right direction?

How to limit the maximum memory a process can use in CentOS?

I want to limit the maximum memory a process can use in CentOS. There can be scenarios where a process ends up using all, or most, of the available memory, affecting other processes on the system, so I want to know how this can be limited.
Also, it would be helpful if you could give a sample program that limits the memory usage of a process and demonstrates the following scenarios:
Memory allocation succeeds when the requested memory is within the set limit.
Memory allocation fails when the requested memory is above the set limit.
-Thanks
ulimit can be used to limit memory utilization (among other things)
Here is an example of setting memory usage so low that /bin/ls (which is larger than /bin/cat) no longer works, but /bin/cat still works.
$ ls -lh /bin/ls /bin/cat
-rwxr-xr-x 1 root root 25K May 24 2008 /bin/cat
-rwxr-xr-x 1 root root 88K May 24 2008 /bin/ls
$ date > date.txt
$ ulimit -d 10000 -m 10000 -v 10000
$ /bin/ls date.txt
/bin/ls: error while loading shared libraries: libc.so.6: failed to map segment from shared object: Cannot allocate memory
$ /bin/cat date.txt
Thu Mar 26 11:51:16 PDT 2009
$
Note: if I set the limits to 1000 kilobytes, neither program works, because the shared libraries they load push their total size above 1000 KB.
-d data segment size
-m max memory size
-v virtual memory size
Run ulimit -a to see all the resource limits ulimit can set.
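Since the question also asks for a sample program, here is a minimal C sketch (my own addition, not part of the original answer) that uses setrlimit(2) with RLIMIT_AS, the per-process address-space cap that ulimit -v sets; the 64 MB limit and the allocation sizes are arbitrary choices:

/* sketch: cap the address space, then allocate inside and above the cap */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/resource.h>

int main(void) {
    struct rlimit rl = { 64UL * 1024 * 1024, 64UL * 1024 * 1024 };
    if (setrlimit(RLIMIT_AS, &rl) != 0) {    /* limit to ~64 MB */
        perror("setrlimit");
        return 1;
    }
    char *small = malloc(1024 * 1024);       /* within the limit */
    printf("1 MB allocation:   %s\n", small ? "ok" : "failed");
    char *big = malloc(128UL * 1024 * 1024); /* above the limit */
    printf("128 MB allocation: %s\n", big ? "ok" : "failed, as expected");
    if (small)
        memset(small, 0, 1024 * 1024);       /* actually touch the memory */
    free(small);
    free(big);
    return 0;
}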

OpenMPI bind() failed on error Address already in use (48) Mac OS X

I have installed OpenMPI and tried to compile/execute one of the examples delivered with the newest version.
When I try to run it with mpiexec, it says that the address is already in use.
Does anyone have a hint as to why this keeps happening?
Kristians-MacBook-Pro:examples kristian$ mpicc -o hello hello_c.c
Kristians-MacBook-Pro:examples kristian$ mpiexec -n 4 ./hello
[Kristians-MacBook-Pro.local:02747] [[56076,0],0] bind() failed on error Address already in use (48)
[Kristians-MacBook-Pro.local:02747] [[56076,0],0] ORTE_ERROR_LOG: Error in file oob_usock_component.c at line 228
[Kristians-MacBook-Pro.local:02748] [[56076,1],0] usock_peer_send_blocking: send() to socket 19 failed: Socket is not connected (57)
[Kristians-MacBook-Pro.local:02748] [[56076,1],0] ORTE_ERROR_LOG: Unreachable in file oob_usock_connection.c at line 315
[Kristians-MacBook-Pro.local:02748] [[56076,1],0] orte_usock_peer_try_connect: usock_peer_send_connect_ack to proc [[56076,0],0] failed: Unreachable (-12)
[Kristians-MacBook-Pro.local:02749] [[56076,1],1] usock_peer_send_blocking: send() to socket 20 failed: Socket is not connected (57)
[Kristians-MacBook-Pro.local:02749] [[56076,1],1] ORTE_ERROR_LOG: Unreachable in file oob_usock_connection.c at line 315
[Kristians-MacBook-Pro.local:02749] [[56076,1],1] orte_usock_peer_try_connect: usock_peer_send_connect_ack to proc [[56076,0],0] failed: Unreachable (-12)
-------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code.. Per user-direction, the job has been aborted.
-------------------------------------------------------
--------------------------------------------------------------------------
mpiexec detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:
Process name: [[56076,1],0]
Exit code: 1
--------------------------------------------------------------------------
Thanks in advance.
Okay, I have now changed the $TMPDIR environment variable with export TMPDIR=/tmp, and it works.
It now seems to me that the Open MPI session folder was blocking my communication. But why would it?
Am I missing something here?
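One plausible explanation (my assumption; the thread does not confirm it): Open MPI puts its session directory, including the Unix domain sockets used by the usock OOB component, under $TMPDIR, and on OS X the default $TMPDIR is a long per-user path under /var/folders. The sun_path field of sockaddr_un holds only about 104 bytes on macOS, so overly long socket paths get rejected or truncated, and truncated paths can collide, which would surface as "Address already in use". Pointing TMPDIR at /tmp keeps the paths short. A small C sketch of the limit (the path below is hypothetical):

/* sketch: compare a long session-socket path against sun_path's capacity */
#include <stdio.h>
#include <string.h>
#include <sys/un.h>

int main(void) {
    struct sockaddr_un addr;
    const char *path =
        "/var/folders/zz/zyxvpxvq6csfxvn_n0000000000000/T/"
        "openmpi-sessions-kristian@Kristians-MacBook-Pro_0/56076/0/usock";
    printf("sun_path capacity:  %zu bytes\n", sizeof(addr.sun_path));
    printf("socket path length: %zu bytes\n", strlen(path));
    if (strlen(path) >= sizeof(addr.sun_path))
        puts("path too long: bind()/connect() cannot use it as-is");
    return 0;
}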

PostgreSQL crashed with error: 'server process (PID XXXX) was terminated by exception 0xC0000142'

I have PostgreSQL 9.2 running on a Windows POSReady embedded system (XP-like) with 4 GB of memory and an Atom N2800 CPU. It has basically been running fine for years in a production environment, but it crashed (the service stopped) frequently during recent performance (not stress) testing.
I don't think the testing put it under too much stress. With log_min_duration_statement = 0 enabled, the simplified overall statistics for what the testing did are listed below.
Taking 20 minutes as a unit of measure, during one unit there were:
5000 UPDATEs, each query containing 20 KB of data (including a text field).
35000 SELECTs, each query returning 20 KB of data (to fetch that text field).
The logs showed nothing abnormal until the crash, which left this:
2015-07-29 16:41:53.500 SGT,,,5512,,55b87f74.1588,2,,2015-07-29 15:23:32 SGT,,0,LOG,00000,"server process (PID 4416) was terminated by exception 0xC0000142",,"See C include file ""ntstatus.h"" for a description of the hexadecimal value.",,,,,,,""
2015-07-29 16:41:53.500 SGT,,,5512,,55b87f74.1588,3,,2015-07-29 15:23:32 SGT,,0,LOG,00000,"terminating any other active server processes",,,,,,,,,""
2015-07-29 16:41:53.500 SGT,"eps","transactiondatabase",6960,"127.0.0.1:9162",55b891cf.1b30,9,"idle",2015-07-29 16:41:51 SGT,146/0,0,WARNING,57P00,"terminating connection because of crash of another server process","The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.","In a moment you should be able to reconnect to the database and repeat your command.",,,,,,,""
2015-07-29 16:41:53.515 SGT,"eps","transactiondatabase",5828,"127.0.0.1:9150",55b891c2.16c4,155,"idle",2015-07-29 16:41:38 SGT,145/0,0,WARNING,57P00,"terminating connection because of crash of another server process","The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.","In a moment you should be able to reconnect to the database and repeat your command.",,,,,,,""
2015-07-29 16:41:53.515 SGT,"eps","transactiondatabase",6448,"127.0.0.1:9148",55b891c2.1930,5,"idle",2015-07-29 16:41:38 SGT,93/0,0,WARNING,57P00,"terminating connection because of crash of another server process","The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.","In a moment you should be able to reconnect to the database and repeat your command.",,,,,,,""
....
....
2015-07-29 16:41:54.500 SGT,,,8004,,55b87f76.1f44,2,,2015-07-29 15:23:34 SGT,1/0,0,WARNING,57P00,"terminating connection because of crash of another server process","The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.","In a moment you should be able to reconnect to the database and repeat your command.",,,,,,,""
2015-07-29 16:41:54.515 SGT,,,5512,,55b87f74.1588,4,,2015-07-29 15:23:32 SGT,,0,LOG,00000,"all server processes terminated; reinitializing",,,,,,,,,""
2015-07-29 16:42:04.515 SGT,,,5512,,55b87f74.1588,5,,2015-07-29 15:23:32 SGT,,0,FATAL,XX000,"pre-existing shared memory block is still in use",,"Check if there are any old server processes still running, and terminate them.",,,,,,,""
2015-07-29 16:51:02.078 SGT,,,5828,,55b893f6.16c4,1,,2015-07-29 16:51:02 SGT,,0,LOG,00000,"database system was interrupted; last known up at 2015-07-29 16:40:36 SGT",,,,,,,,,""
2015-07-29 16:51:02.093 SGT,,,5828,,55b893f6.16c4,2,,2015-07-29 16:51:02 SGT,,0,LOG,00000,"database system was not properly shut down; automatic recovery in progress",,,,,,,,,""
2015-07-29 16:51:02.109 SGT,,,5828,,55b893f6.16c4,3,,2015-07-29 16:51:02 SGT,,0,LOG,00000,"redo starts at 0/12C79578",,,,,,,,,""
2015-07-29 16:51:02.421 SGT,,,5828,,55b893f6.16c4,4,,2015-07-29 16:51:02 SGT,,0,LOG,00000,"unexpected pageaddr 0/1046A000 in log file 0, segment 19, offset 4628480",,,,,,,,,""
2015-07-29 16:51:02.421 SGT,,,5828,,55b893f6.16c4,5,,2015-07-29 16:51:02 SGT,,0,LOG,00000,"redo done at 0/13469FC8",,,,,,,,,""
One thing I can point to is the shared_buffers setting, currently 256MB; that value is there for no particular reason. Would increasing it help?
Other major settings: max_connections = 200, temp_buffers = 16MB, work_mem = 8MB.
Could anyone help work out how the crash happened, or at least how to narrow down the cause?
MSDN says:
0xC0000142
STATUS_DLL_INIT_FAILED
{DLL Initialization Failed} Initialization of the dynamic link library %hs failed. The process is terminating abnormally.
So it was a DLL-loading issue and/or an issue starting a new process. If I had to guess, I'd say you might have hit limits on the number of open files, running processes, etc. on your XP Embedded system. You might want to lower max_connections.

Sybase initializes but does not run

I am using Red Hat 5.5, and I am trying to run Sybase ASE 12.5.4.
Yesterday I tried the command "service sybase start", and the console showed Sybase repeatedly trying, and failing, to initialize the main database server.
UPDATE:
I initialized a database at /ims_systemdb/master using the following commands:
dataserver -d /ims_systemdb/master -z 2k -b 51204 -c $SYBASE/ims.cfg -e db_error.log
chmod a=rwx /ims_systemdb/master
ls -al /ims_systemdb/master
And it gives me a nice database at /ims_systemdb/master with a size of 104865792 bytes (2048 × 51204).
But when I run
service sybase start
The error log at /logs/sybase_error.log goes like this:
00:00000:00000:2013/04/26 16:11:45.18 kernel Using config area from primary master device.
00:00000:00000:2013/04/26 16:11:45.19 kernel Detected 1 physical CPU
00:00000:00000:2013/04/26 16:11:45.19 kernel os_create_region: can't allocate 11534336000 bytes
00:00000:00000:2013/04/26 16:11:45.19 kernel kbcreate: couldn't create kernel region.
00:00000:00000:2013/04/26 16:11:45.19 kernel kistartup: could not create shared memory
I read "os_create_region" is normal if you don't set shmmax in sysctl high enough, so I set it to 16000000000000, but I still get this error. And sometimes, when I'm playing around with the .cfg file, I get this error message instead:
00:00000:00000:2013/04/25 14:04:08.28 kernel Using config area from primary master device.
00:00000:00000:2013/04/25 14:04:08.29 kernel Detected 1 physical CPU
00:00000:00000:2013/04/25 14:04:08.85 server The size of each partitioned pool must have atleast 512K. With the '16' partitions we cannot configure this value f
Why do these two errors appear and what can I do about them?
UPDATE:
Currently, I'm seeing the first error message (os_create_region can't allocate bytes). The contents of /etc/sysctl.conf are as follows:
kernel.shmmax = 4294967295
kernel.shmall = 1048576
kernel.shmmni = 4096
But the log statements earlier state that
os_create_region: can't allocate 11534336000 bytes
So why is the region it is trying to allocate so big, and where did that get set?
The Solution:
When you get a message like "os_create_region: can't allocate 11534336000 bytes", it means Sybase's configuration file is asking the kernel for a shared memory region larger than the kernel.shmmax setting in /etc/sysctl.conf allows. Sybase ASE memory sizes are given in 2 KB units, so 11534336000 bytes corresponds to max memory = 5632000 (5632000 × 2048 bytes); a value like that in the configuration file is presumably where the huge region size came from.
The main fix is to edit ims.cfg (or whatever configuration file you are using) and lower the max memory variable in the [Physical Memory] section:
[Physical Memory]
max memory = 64000
additional network memory = 10485760
shared memory starting address = DEFAULT
allocate max shared memory = 1
For your information, my /etc/sysctl.conf file ended with these three lines:
kernel.shmmax = 16000000000
kernel.shmall = 16000000000
kernel.shmmni = 8192
And once this is done, type "showserver" to reveal what processes are running.
For more information, consult the Sybase System Administrator's Guide, volume 2 as well as Michael Gardner's link to Red Hat memory management in the comments earlier.
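As a sanity check of the kernel side, the failure can be reproduced outside of Sybase. The following minimal C sketch (my own illustration, not from the original answer; it assumes a 64-bit build) requests a System V shared memory segment of the same 11534336000-byte size; on a machine whose kernel.shmmax is lower than that, shmget() fails, just as os_create_region did:

/* sketch: ask for one shared memory segment larger than kernel.shmmax */
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>

int main(void) {
    size_t size = 11534336000ULL; /* ~11.5 GB, the size Sybase requested */
    int id = shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
    if (id == -1) {
        /* EINVAL here means the request exceeds kernel.shmmax */
        fprintf(stderr, "shmget(%zu) failed: %s\n", size, strerror(errno));
        return 1;
    }
    shmctl(id, IPC_RMID, NULL); /* clean up if it somehow succeeded */
    return 0;
}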
