I am having a problem where the OOM killer is sometimes triggered. I have researched on the internet and found many related threads, but a few things still puzzle me. I hope someone can help.
Environment: i.MX6 (32-bit)
User/kernel space split: 2G/2G
Total RAM: 4 GB
Some important logs:
top invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
I see that it is trying to allocate 1 page of contiguous memory (order=0) in the HIGHMEM zone (from the gfp_mask). Please correct me if I am wrong.
DMA free:1322780kB min:4492kB low:5612kB high:6736kB active_anon:0kB inactive_anon:0kB active_file:84kB
DMA: 941*4kB (UEMC) 1211*8kB (UEMC) 1185*16kB (UEMC) 836*32kB (UEMC) 554*64kB
(UEMC) 295*128kB (UEMC) 106*256kB
HighMem free:480kB min:512kB low:2384kB high:4256kB active_anon:2021148kB inactive_anon:70364kB active_file:0kB
HighMem: 0*4kB 1*8kB (R) 0*16kB 7*32kB (R) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB 0*8192kB
I believe the OOM killer is triggered because the free HighMem (480 kB) is below the min watermark (512 kB). Again, please correct me if I am wrong.
My questions:
1. I thought the DMA_ZONE is only about 16 MB, the NORMAL_ZONE runs from 16 MB
up to about 896 MB, and the rest is the HIGHMEM_ZONE. But the log shows more than 1 GB
of free pages (1322780 kB) in the DMA_ZONE.
2. Why doesn't the kernel use this zone for further allocations?
More logs (excerpted from the complete log):
DMA per-cpu:
CPU 0: hi: 186, btch: 31 usd: 0
CPU 1: hi: 186, btch: 31 usd: 0
CPU 2: hi: 186, btch: 31 usd: 0
CPU 3: hi: 186, btch: 31 usd: 0
HighMem per-cpu:
CPU 0: hi: 186, btch: 31 usd: 51
CPU 1: hi: 186, btch: 31 usd: 20
CPU 2: hi: 186, btch: 31 usd: 4
CPU 3: hi: 186, btch: 31 usd: 14
active_anon:505287 inactive_anon:17591 isolated_anon:0
active_file:21 inactive_file:0 isolated_file:0
unevictable:0 dirty:0 writeback:0 unstable:0
free:330815 slab_reclaimable:1134 slab_unreclaimable:3487
mapped:15956 shmem:25014 pagetables:1982 bounce:0
25046 total pagecache pages
983039 pages of RAM
331349 free pages
9947 reserved pages
2772 slab pages
543663 pages shared
0 pages swap cached
cat /proc/pagetypeinfo
Page block order: 13
Pages per block: 8192
Free pages count per migrate type at order 0 1 2 3 4 5 6 7 8 9 10 11 12 13
Node 0, zone DMA, type Unmovable 1 0 9 8 3 1 0 1 1 1 1 0 1 0
Node 0, zone DMA, type Reclaimable 4 5 5 1 2 0 1 1 1 0 1 0 1 0
Node 0, zone DMA, type Movable 1 6 4 0 0 0 1 1 2 4 3 3 4 28
Node 0, zone DMA, type Reserve 0 0 0 0 0 0 0 0 0 0 0 0 0 1
Node 0, zone DMA, type CMA 1 1 2 0 0 0 0 0 1 1 0 0 1 3
Node 0, zone DMA, type Isolate 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Node 0, zone HighMem, type Unmovable 11 7 2 2 9 6 5 3 3 1 0 1 1 0
Node 0, zone HighMem, type Reclaimable 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Node 0, zone HighMem, type Movable 23 201 4771 4084 1803 403 105 69 57 38 23 21 8 23
Node 0, zone HighMem, type Reserve 0 0 0 0 0 0 0 0 0 0 0 0 0 1
Node 0, zone HighMem, type CMA 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Node 0, zone HighMem, type Isolate 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Number of blocks type Unmovable Reclaimable Movable Reserve CMA Isolate
Node 0, zone DMA 5 1 33 1 16 0
Node 0, zone HighMem 2 0 62 1 0 0
I would be glad to post further logs if necessary.
Thank you,
Srik
Probably a long shot, but have you tried adding
vm.overcommit_memory = 2
vm.overcommit_ratio = 80
to /etc/sysctl.conf?
The chances are high that you ran out of virtual memory: a 32-bit kernel can only directly address 4 GB of virtual address space, and there are heavy limitations on the address ranges usable for hardware access. For example, a network adapter's hardware acceleration could require memory in some specific address range, and if you run out of RAM in that specific range, the system either has to run the OOM killer or kill your network adapter. And that is true even if your system has free memory available in some unrelated zone.
For details, try reviewing these links:
https://serverfault.com/questions/564068/linux-oom-situation-32-bit-kernel
https://serverfault.com/questions/548736/how-to-read-oom-killer-syslog-messages
and maybe this, too:
https://unix.stackexchange.com/questions/373312/oom-killer-doesnt-work-properly-leads-to-a-frozen-os
TL;DR: if you need more than 2 GB of RAM, install a 64-bit OS.
Related
I tested running bare-bones code using ESP-IDF on an ESP32 chip (a duinotech XC-3800), and obtained the following results in terms of image size.
Binary size analysis for ESP32
Folder Structure
temp/
main/
CMakeLists.txt
main.c
CMakeLists.txt
File contents
CMakeLists.txt
# The following lines of boilerplate have to be in your project's
# CMakeLists in this exact order for cmake to work correctly
cmake_minimum_required(VERSION 3.5)
include($ENV{IDF_PATH}/tools/cmake/project.cmake)
project(temp)
main/CMakeLists.txt
idf_component_register(SRCS "main.c"
                       INCLUDE_DIRS "")
Test 1: main/main.c
#include <stdio.h>

void app_main(void) {
    printf("Hello world!\n");
    for (int i = 10; i >= 0; i--) {
        printf("Restarting in %d seconds...\n", i);
    }
    printf("Restarting now.\n");
    fflush(stdout);
}
Test 2: main/main.c
#include <stdio.h>
void app_main(void) { printf("Hello world!\n"); }
Test 3: main/main.c
void app_main(void) {}
Size comparison
Obtained by running idf_size.py build/temp.map
Test 1
Total sizes:
DRAM .data size: 8320 bytes
DRAM .bss size: 4072 bytes
Used static DRAM: 12392 bytes ( 168344 available, 6.9% used)
Used static IRAM: 38804 bytes ( 92268 available, 29.6% used)
Flash code: 75408 bytes
Flash rodata: 23844 bytes
Total image size:~ 146376 bytes (.bin may be padded larger)
Test 2
Total sizes:
DRAM .data size: 8320 bytes
DRAM .bss size: 4072 bytes
Used static DRAM: 12392 bytes ( 168344 available, 6.9% used)
Used static IRAM: 38804 bytes ( 92268 available, 29.6% used)
Flash code: 75240 bytes
Flash rodata: 23796 bytes
Total image size:~ 146160 bytes (.bin may be padded larger)
Test 3
Total sizes:
DRAM .data size: 8320 bytes
DRAM .bss size: 4072 bytes
Used static DRAM: 12392 bytes ( 168344 available, 6.9% used)
Used static IRAM: 38804 bytes ( 92268 available, 29.6% used)
Flash code: 75004 bytes
Flash rodata: 23780 bytes
Total image size:~ 145908 bytes (.bin may be padded larger)
Analysis
Code sizes obtained by running stat --format="%s" main/main.c. All sizes are in bytes.
Test No. | Code | Image | Flash Code | Flash rodata
-------- | -----| ------ | ---------- | ------------
1 | 207 | 146376 | 75408 | 23844
2 | 70 | 146160 | 75240 | 23796
3 | 43 | 145908 | 75004 | 23780
At least 145 KB of boilerplate just to get an empty main running.
Speculation
I suspect that the 145 KB is made up of a number of libraries that are always loaded onto the chip whether you use them or not. Some of them must be FreeRTOS, Wi-Fi, HTTP, etc.
Can we bring down this size somehow and load only the bare minimum required for operation?
You can get more detailed size information by running idf.py size-components and idf.py size-files. Here are the results for your "Test 3" (i.e. the empty function) on my dev environment (note that my total image size is already slightly larger than the one you posted).
Total sizes:
DRAM .data size: 7860 bytes
DRAM .bss size: 4128 bytes
Used static DRAM: 11988 bytes ( 168748 available, 6.6% used)
Used static IRAM: 32706 bytes ( 98366 available, 25.0% used)
Flash code: 76002 bytes
Flash rodata: 32164 bytes
Total image size:~ 148732 bytes (.bin may be padded larger)
Per-archive contributions to ELF file:
Archive File DRAM .data & .bss IRAM Flash code & rodata Total
libc.a 0 0 0 55348 3829 59177
libesp32.a 2132 2351 6767 5976 8076 25302
libfreertos.a 4140 776 12432 0 1653 19001
libdriver.a 76 20 0 4003 7854 11953
libsoc.a 161 4 4953 612 3852 9582
libvfs.a 240 103 0 5464 950 6757
libheap.a 877 4 3095 830 986 5792
libspi_flash.a 24 290 1855 779 890 3838
libefuse.a 16 4 0 1142 2213 3375
libnewlib.a 152 272 766 830 96 2116
libapp_update.a 0 4 109 159 1221 1493
libesp_ringbuf.a 0 0 848 0 209 1057
liblog.a 8 268 429 82 0 787
libhal.a 0 0 515 0 32 547
libpthread.a 8 12 0 256 0 276
libgcc.a 0 0 0 0 160 160
libbootloader_support.a 0 0 0 126 0 126
libm.a 0 0 0 88 0 88
libcxx.a 0 0 0 11 0 11
libxtensa-debug-module.a 0 0 8 0 0 8
libmain.a 0 0 0 5 0 5
(exe) 0 0 0 0 0 0
libmbedcrypto.a 0 0 0 0 0 0
libmbedtls.a 0 0 0 0 0 0
libwpa_supplicant.a 0 0 0 0 0 0
Per-file contributions to ELF file:
Object File DRAM .data & .bss IRAM Flash code & rodata Total
lib_a-vfprintf.o 0 0 0 14193 756 14949
lib_a-svfprintf.o 0 0 0 13838 756 14594
lib_a-svfiprintf.o 0 0 0 9642 1210 10852
lib_a-vfiprintf.o 0 0 0 9945 738 10683
uart.c.obj 44 12 0 2985 7293 10334
tasks.c.obj 12 700 5546 0 531 6789
vfs_uart.c.obj 48 63 0 3680 808 4599
panic.c.obj 2023 5 1989 0 0 4017
esp_err_to_name.c.obj 0 0 0 50 3947 3997
portasm.S.obj 3084 0 484 0 0 3568
lib_a-dtoa.o 0 0 0 3522 13 3535
multi_heap.c.obj 873 0 2277 0 0 3150
intr_alloc.c.obj 8 22 618 1703 722 3073
esp_efuse_utility.c.obj 0 0 0 867 2205 3072
cpu_start.c.obj 0 1 1068 331 1352 2752
queue.c.obj 0 0 2310 0 325 2635
lib_a-mprec.o 0 0 0 2134 296 2430
rtc_clk.c.obj 161 4 2098 0 0 2263
dbg_stubs.c.obj 0 2072 32 108 0 2212
vfs.c.obj 192 40 0 1784 142 2158
rtc_periph.c.obj 0 0 0 0 2080 2080
esp_timer_esp32.c.obj 8 26 1068 262 538 1902
xtensa_vectors.S.obj 8 0 1776 0 36 1820
flash_mmap.c.obj 0 264 1092 114 339 1809
task_wdt.c.obj 53 4 0 1178 548 1783
heap_caps.c.obj 4 0 818 52 617 1491
timers.c.obj 8 56 1006 0 233 1303
cache_utils.c.obj 4 14 749 81 402 1250
soc_memory_layout.c.obj 0 0 0 0 1181 1181
heap_caps_init.c.obj 0 4 0 778 369 1151
port.c.obj 0 16 625 0 493 1134
esp_ota_ops.c.obj 0 4 0 147 965 1116
xtensa_intr_asm.S.obj 1024 0 51 0 0 1075
ringbuf.c.obj 0 0 848 0 209 1057
rtc_time.c.obj 0 0 815 0 198 1013
memory_layout_utils.c.ob 0 0 0 612 393 1005
rtc_init.c.obj 0 0 992 0 0 992
periph_ctrl.c.obj 8 0 0 615 280 903
lib_a-fseeko.o 0 0 0 866 0 866
time.c.obj 0 32 122 703 0 857
clk.c.obj 0 0 64 559 221 844
esp_timer.c.obj 8 16 261 406 134 825
dport_access.c.obj 8 40 444 189 137 818
log.c.obj 8 268 429 82 0 787
rtc_wdt.c.obj 0 0 743 0 0 743
partition.c.obj 0 8 0 522 149 679
ipc.c.obj 0 28 159 283 120 590
locks.c.obj 8 0 485 0 94 587
crosscore_int.c.obj 8 8 193 130 156 495
syscall_table.c.obj 144 240 0 82 0 466
system_api.c.obj 0 0 439 0 0 439
timer.c.obj 16 0 0 112 281 409
int_wdt.c.obj 0 1 91 306 0 398
freertos_hooks.c.obj 8 128 43 216 0 395
esp_app_desc.c.obj 0 0 109 12 256 377
brownout.c.obj 0 0 0 149 201 350
windowspill_asm.o 0 0 311 0 0 311
rtc_module.c.obj 8 8 0 291 0 307
xtensa_context.S.obj 0 0 306 0 0 306
cpu_util.c.obj 0 0 305 0 0 305
dport_panic_highint_hdl. 8 0 242 0 0 250
lib_a-reent.o 0 0 0 232 0 232
lib_a-fopen.o 0 0 0 224 0 224
lib_a-snprintf.o 0 0 0 214 0 214
esp_efuse_api.c.obj 0 4 0 193 0 197
pthread_local_storage.c. 8 4 0 180 0 192
cache_err_int.c.obj 0 0 56 98 0 154
xtensa_intr.c.obj 0 0 108 0 35 143
list.c.obj 0 0 142 0 0 142
syscalls.c.obj 0 0 91 45 0 136
lib_a-assert.o 0 0 0 68 60 128
lib_a-flags.o 0 0 0 127 0 127
bootloader_common.c.obj 0 0 0 126 0 126
lib_a-s_frexp.o 0 0 0 110 0 110
flash_ops.c.obj 20 4 14 62 0 100
lib_a-vprintf.o 0 0 0 94 0 94
lib_a-s_fpclassify.o 0 0 0 88 0 88
lib_a-fiprintf.o 0 0 0 84 0 84
pthread.c.obj 0 8 0 76 0 84
esp_efuse_fields.c.obj 0 0 0 82 0 82
clock.o 0 0 72 0 0 72
reent_init.c.obj 0 0 68 0 2 70
state_asm--restore_extra 0 0 62 0 0 62
state_asm--save_extra_nw 0 0 62 0 0 62
xtensa_vector_defaults.S 0 0 46 0 0 46
lib_a-fseek.o 0 0 0 45 0 45
_divdi3.o 0 0 0 0 40 40
_moddi3.o 0 0 0 0 40 40
_udivdi3.o 0 0 0 0 40 40
_umoddi3.o 0 0 0 0 40 40
xtensa_init.c.obj 0 4 32 0 0 36
interrupts--intlevel.o 0 0 0 0 32 32
esp_efuse_table.c.obj 16 0 0 0 8 24
lib_a-errno.o 0 0 0 10 0 10
pm_esp32.c.obj 0 0 0 8 0 8
int_asm--set_intclear.o 0 0 8 0 0 8
eri.c.obj 0 0 8 0 0 8
cxx_exception_stubs.cpp. 0 0 0 6 0 6
cxx_guards.cpp.obj 0 0 0 5 0 5
hello_world_main.c.obj 0 0 0 5 0 5
FreeRTOS-openocd.c.obj 4 0 0 0 0 4
dummy_main_src.c.obj 0 0 0 0 0 0
bootloader_flash.c.obj 0 0 0 0 0 0
bootloader_random.c.obj 0 0 0 0 0 0
bootloader_sha.c.obj 0 0 0 0 0 0
esp_image_format.c.obj 0 0 0 0 0 0
flash_partitions.c.obj 0 0 0 0 0 0
lib_a-fputs.o 0 0 0 0 0 0
lib_a-printf.o 0 0 0 0 0 0
lib_a-putc.o 0 0 0 0 0 0
lib_a-putchar.o 0 0 0 0 0 0
lib_a-puts.o 0 0 0 0 0 0
lib_a-sprintf.o 0 0 0 0 0 0
lib_a-strerror.o 0 0 0 0 0 0
lib_a-u_strerr.o 0 0 0 0 0 0
lib_a-xpg_strerror_r.o 0 0 0 0 0 0
gpio.c.obj 0 0 0 0 0 0
hw_random.c.obj 0 0 0 0 0 0
pm_locks.c.obj 0 0 0 0 0 0
_addsubdf3.o 0 0 0 0 0 0
_cmpdf2.o 0 0 0 0 0 0
_divdf3.o 0 0 0 0 0 0
_fixdfsi.o 0 0 0 0 0 0
_floatdidf.o 0 0 0 0 0 0
_floatsidf.o 0 0 0 0 0 0
_muldf3.o 0 0 0 0 0 0
_popcountsi2.o 0 0 0 0 0 0
platform.c.obj 0 0 0 0 0 0
platform_util.c.obj 0 0 0 0 0 0
sha256.c.obj 0 0 0 0 0 0
esp_mem.c.obj 0 0 0 0 0 0
gpio_periph.c.obj 0 0 0 0 0 0
spi_flash_rom_patch.c.ob 0 0 0 0 0 0
md5-internal.c.obj 0 0 0 0 0 0
From a library point of view, the major contributor is libc, which is the C standard library. While you could probably cherry-pick some functions and drop others from there, I don't think anyone would recommend that.
Next up is libesp32, which provides critical functions such as start_cpu0(). Again, you may be able to cherry-pick only the functions you need, if you really want to.
You can figure out what a library provides by looking up the .a file (e.g. find build -name libesp32.a) and then running nm build/esp-idf/esp32/libesp32.a on the found path.
The second table lists the same size data, but split per source file instead of per library.
I am running Microsoft SQL Server on Ubuntu 16.04.2 LTS in a QEMU VM.
SQL Agent is installed as well.
16 GB RAM assigned, and 6 processors.
SQL Upper memory limit set to 10 GB
I have a single 1.2 GB database. Simple Recovery mode.
Single SQL Agent job, that backs up the DB.
Problem: the sqlservr process is killed by the OOM killer shortly after the job finishes.
What settings should I be looking at to fix this?
I do not see anything in the SQL logs, only the messages in dmesg.
BACKUP JOB:
--Script 1: Backup specific database
-- 1. Variable declaration
-- 1. Variable declaration
DECLARE @path VARCHAR(500)
DECLARE @name VARCHAR(500)
DECLARE @pathwithname VARCHAR(500)
DECLARE @time DATETIME
DECLARE @year VARCHAR(4)
DECLARE @month VARCHAR(2)
DECLARE @day VARCHAR(2)
DECLARE @hour VARCHAR(2)
DECLARE @minute VARCHAR(2)
DECLARE @second VARCHAR(2)
-- 2. Setting the backup path
SET @path = 'C:\sqldata\SQLBACKUPS\'
-- 3. Getting the time values
SELECT @time = GETDATE()
SELECT @year = (SELECT CONVERT(VARCHAR(4), DATEPART(yy, @time)))
SELECT @month = (SELECT CONVERT(VARCHAR(2), FORMAT(DATEPART(mm,@time),'00')))
SELECT @day = (SELECT CONVERT(VARCHAR(2), FORMAT(DATEPART(dd,@time),'00')))
SELECT @hour = (SELECT CONVERT(VARCHAR(2), FORMAT(DATEPART(hh,@time),'00')))
SELECT @minute = (SELECT CONVERT(VARCHAR(2), FORMAT(DATEPART(mi,@time),'00')))
SELECT @second = (SELECT CONVERT(VARCHAR(2), FORMAT(DATEPART(ss,@time),'00')))
-- 4. Defining the filename format
SELECT @name = 'DBNAME' + '_' + @year + @month + @day + @hour + @minute + @second
SET @pathwithname = @path + @name + '.bak'
-- 5. Executing the backup command
BACKUP DATABASE [DBNAME] TO DISK = @pathwithname
ERROR MESSAGE in dmesg:
[617521.605059] kthreadd invoked oom-killer: gfp_mask=0x27000c0(GFP_KERNEL_ACCOUNT|__GFP_NOTRACK), order=2, oom_score_adj=0
[617521.605060] kthreadd cpuset=/ mems_allowed=0
[617521.605076] CPU: 1 PID: 2 Comm: kthreadd Not tainted 4.8.0-46-generic #49~16.04.1-Ubuntu
[617521.605077] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[617521.605082] 0000000000000286 00000000ac5a0d51 ffff8806ed5dbb00 ffffffffa0e2e073
[617521.605086] ffff8806ed5dbc90 ffff8806ea450ec0 ffff8806ed5dbb68 ffffffffa0c2e97b
[617521.605088] 0000000000000000 ffff8802fb7b8a80 ffff8806ea450ec0 ffff8806ed5dbb58
[617521.605090] Call Trace:
[617521.605117] [<ffffffffa0e2e073>] dump_stack+0x63/0x90
[617521.605130] [<ffffffffa0c2e97b>] dump_header+0x5c/0x1dc
[617521.605143] [<ffffffffa0dbd629>] ? apparmor_capable+0xe9/0x1a0
[617521.605152] [<ffffffffa0ba58d6>] oom_kill_process+0x226/0x3f0
[617521.605154] [<ffffffffa0ba5e4a>] out_of_memory+0x35a/0x3f0
[617521.605156] [<ffffffffa0bab079>] __alloc_pages_slowpath+0x959/0x980
[617521.605157] [<ffffffffa0bab35a>] __alloc_pages_nodemask+0x2ba/0x300
[617521.605166] [<ffffffffa0a80726>] copy_process.part.30+0x146/0x1b50
[617521.605176] [<ffffffffa0a63eee>] ? kvm_sched_clock_read+0x1e/0x30
[617521.605183] [<ffffffffa0aa3ed0>] ? kthread_create_on_node+0x1e0/0x1e0
[617521.605194] [<ffffffffa0a2c78c>] ? __switch_to+0x2dc/0x700
[617521.605196] [<ffffffffa0a82327>] _do_fork+0xe7/0x3f0
[617521.605213] [<ffffffffa1295b17>] ? __schedule+0x307/0x790
[617521.605215] [<ffffffffa0a82659>] kernel_thread+0x29/0x30
[617521.605219] [<ffffffffa0aa48e0>] kthreadd+0x160/0x1b0
[617521.605222] [<ffffffffa129aa1f>] ret_from_fork+0x1f/0x40
[617521.605224] [<ffffffffa0aa4780>] ? kthread_create_on_cpu+0x60/0x60
[617521.605225] Mem-Info:
[617521.605231] active_anon:1075398 inactive_anon:4083 isolated_anon:0
active_file:2616493 inactive_file:328306 isolated_file:160
unevictable:1 dirty:327621 writeback:785 unstable:0
slab_reclaimable:21286 slab_unreclaimable:7420
mapped:10714 shmem:5451 pagetables:6225 bounce:0
free:33879 free_pcp:498 free_cma:0
[617521.605234] Node 0 active_anon:4301592kB inactive_anon:16332kB active_file:10465972kB inactive_file:1313224kB unevictable:4kB isolated(anon):0kB isolated(file):640kB mapped:42856kB dirty:1310484kB writeback:3140kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 3321856kB anon_thp: 21804kB writeback_tmp:0kB unstable:0kB pages_scanned:17790528 all_unreclaimable? yes
[617521.605235] Node 0 DMA free:15900kB min:64kB low:80kB high:96kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15992kB managed:15908kB mlocked:0kB slab_reclaimable:0kB slab_unreclaimable:8kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[617521.605238] lowmem_reserve[]: 0 2952 15988 15988 15988
[617521.605240] Node 0 DMA32 free:64576kB min:12464kB low:15580kB high:18696kB active_anon:733012kB inactive_anon:0kB active_file:2107244kB inactive_file:145520kB unevictable:0kB writepending:145520kB present:3129192kB managed:3063624kB mlocked:0kB slab_reclaimable:6992kB slab_unreclaimable:1272kB kernel_stack:1280kB pagetables:2844kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[617521.605243] lowmem_reserve[]: 0 0 13036 13036 13036
[617521.605244] Node 0 Normal free:55040kB min:55048kB low:68808kB high:82568kB active_anon:3568580kB inactive_anon:16332kB active_file:8358728kB inactive_file:1167704kB unevictable:4kB writepending:1168104kB present:13631488kB managed:13352220kB mlocked:4kB slab_reclaimable:78152kB slab_unreclaimable:28400kB kernel_stack:5168kB pagetables:22056kB bounce:0kB free_pcp:1992kB local_pcp:100kB free_cma:0kB
[617521.605264] lowmem_reserve[]: 0 0 0 0 0
[617521.605266] Node 0 DMA: 1*4kB (U) 1*8kB (U) 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15900kB
[617521.605277] Node 0 DMA32: 208*4kB (UE) 148*8kB (UE) 260*16kB (UE) 115*32kB (UME) 121*64kB (UME) 73*128kB (UME) 67*256kB (UME) 22*512kB (UME) 9*1024kB (UME) 0*2048kB 0*4096kB = 64576kB
[617521.605284] Node 0 Normal: 856*4kB (UMEH) 604*8kB (UEH) 278*16kB (UMEH) 373*32kB (UMEH) 185*64kB (UMEH) 53*128kB (UMEH) 14*256kB (UMEH) 6*512kB (UME) 5*1024kB (MH) 0*2048kB 0*4096kB = 55040kB
[617521.605293] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[617521.605294] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[617521.605294] 2950382 total pagecache pages
[617521.605295] 0 pages in swap cache
[617521.605296] Swap cache stats: add 0, delete 0, find 0/0
[617521.605296] Free swap = 0kB
[617521.605297] Total swap = 0kB
[617521.605297] 4194168 pages RAM
[617521.605297] 0 pages HighMem/MovableOnly
[617521.605298] 86230 pages reserved
[617521.605298] 0 pages cma reserved
[617521.605298] 0 pages hwpoisoned
[617521.605299] [ pid ] uid tgid total_vm rss nr_ptes nr_pmds swapents oom_score_adj name
[617521.605304] [ 337] 0 337 10867 3412 25 3 0 0 systemd-journal
[617521.605306] [ 382] 0 382 25742 291 17 3 0 0 lvmetad
[617521.605307] [ 384] 0 384 11276 897 22 3 0 -1000 systemd-udevd
[617521.605308] [ 780] 108 780 90615 2349 78 3 0 0 whoopsie
[617521.605309] [ 789] 106 789 11833 986 27 3 0 -900 dbus-daemon
[617521.605311] [ 803] 0 803 1100 312 7 3 0 0 acpid
[617521.605312] [ 823] 104 823 65138 701 29 3 0 0 rsyslogd
[617521.605313] [ 835] 0 835 129671 2914 40 6 0 0 snapd
[617521.605314] [ 836] 0 836 7137 729 18 3 0 0 systemd-logind
[617521.605315] [ 838] 0 838 7252 644 20 3 0 0 cron
[617521.605316] [ 857] 0 857 84342 1436 65 3 0 0 ModemManager
[617521.605317] [ 965] 0 965 16380 1344 35 3 0 -1000 sshd
[617521.605318] [ 967] 0 967 4884 65 14 3 0 0 irqbalance
[617521.605320] [ 992] 0 992 17496 788 40 3 0 0 login
[617521.605321] [ 1098] 0 1098 74129 1986 47 3 0 0 polkitd
[617521.605322] [ 1116] 120 1116 11105 983 23 3 0 0 ntpd
[617521.605323] [ 1152] 0 1152 71840 2120 136 4 0 0 winbindd
[617521.605324] [ 1153] 0 1153 105122 3484 203 4 0 0 winbindd
[617521.605325] [ 1159] 0 1159 73413 2856 140 4 0 0 winbindd
[617521.605326] [ 1161] 0 1161 71832 1924 135 4 0 0 winbindd
[617521.605327] [ 1163] 0 1163 71832 1295 136 4 0 0 winbindd
[617521.605328] [ 1721] 1000 1721 11312 932 26 3 0 0 systemd
[617521.605329] [ 1722] 1000 1722 16318 466 34 3 0 0 (sd-pam)
[617521.605337] [ 1725] 1000 1725 5613 1066 16 3 0 0 bash
[617521.605338] [ 1789] 0 1789 14274 787 33 3 0 0 sudo
[617521.605339] [ 1790] 0 1790 14109 719 33 3 0 0 su
[617521.605340] [ 1791] 0 1791 5619 1120 17 3 0 0 bash
[617521.605342] [ 1935] 0 1935 60002 1421 114 4 0 0 nmbd
[617521.605343] [ 1948] 0 1948 86040 3924 165 3 0 0 smbd
[617521.605345] [ 1949] 0 1949 82452 1067 155 3 0 0 smbd
[617521.605347] [ 1951] 0 1951 86171 1589 160 3 0 0 smbd
[617521.605349] [19081] 0 19081 87063 4262 167 3 0 0 smbd
[617521.605351] [19253] 0 19253 24889 1458 52 3 0 0 sshd
[617521.605352] [19275] 1000 19275 24889 891 51 3 0 0 sshd
[617521.605354] [19276] 1000 19276 5605 1104 16 3 0 0 bash
[617521.605356] [19307] 0 19307 14274 778 33 3 0 0 sudo
[617521.605357] [19308] 0 19308 14109 737 32 3 0 0 su
[617521.605359] [19309] 0 19309 5618 1184 16 3 0 0 bash
[617521.605360] [16347] 999 16347 18952 4419 40 4 0 0 sqlservr
[617521.605361] [16349] 999 16349 3028846 1043058 2562 26 0 0 sqlservr
[617521.605362] [20193] 0 20193 88057 4618 168 3 0 0 smbd
[617521.605363] [30023] 0 30023 87931 4038 167 3 0 0 smbd
[617521.605364] [ 4801] 0 4801 87627 4088 167 3 0 0 smbd
[617521.605365] [ 5266] 0 5266 68705 2451 66 4 0 0 cups-browsed
[617521.605366] [ 7563] 0 7563 88008 4183 167 3 0 0 smbd
[617521.605368] [10495] 0 10495 88072 4621 168 3 0 0 smbd
[617521.605369] [12342] 0 12342 88008 4292 167 3 0 0 smbd
[617521.605371] [12797] 0 12797 12555 719 30 3 0 0 cron
[617521.605373] [12798] 0 12798 12555 719 30 3 0 0 cron
[617521.605375] [12799] 0 12799 1127 213 8 3 0 0 sh
[617521.605376] [12800] 0 12800 1127 187 7 3 0 0 sh
[617521.605377] [12801] 0 12801 4902 785 15 3 0 0 rsync
[617521.605378] [12802] 0 12802 4732 483 14 3 0 0 rsync
[617521.605379] [12803] 0 12803 3911 690 12 3 0 0 rsync
[617521.605380] [12804] 0 12804 3741 452 11 3 0 0 rsync
[617521.605381] [12805] 0 12805 4878 477 15 3 0 0 rsync
[617521.605382] [12806] 0 12806 3911 515 11 3 0 0 rsync
[617521.605383] Out of memory: Kill process 16349 (sqlservr) score 254 or sacrifice child
[617521.608484] Killed process 16349 (sqlservr) total-vm:12115384kB, anon-rss:4164616kB, file-rss:7616kB, shmem-rss:0kB
[617521.832626] oom_reaper: reaped process 16349 (sqlservr), now anon-rss:0kB, file-rss:236kB, shmem-rss:0kB
You can use SQL Server's sp_configure setting max server memory to limit memory consumption if other processes on the machine are consuming memory and causing it to run out; or increase swap (though you don't want SQL Server to be swapped out); or add memory.
We can also tune the way the OOM killer handles OOM conditions. If we want to make the SQL Server process (in this case PID 3452) less likely to be killed by the OOM killer:
echo -15 > /proc/3452/oom_adj
I suspect this is a straightforward memory-leak issue (the python27 process has a memory leak with the App Engine libraries running on the Managed VMs GCE containers), but a few things about the data I collected during the OOM issues confuse me.
After running fine for most of a day, my "vmstat 1" output suddenly changed drastically:
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
0 0 0 70116 7612 41240 0 0 0 64 231 645 3 2 95 0
0 0 0 70148 7612 41240 0 0 0 0 164 459 2 2 96 0
1 0 0 70200 7612 41240 0 0 0 0 209 712 2 1 97 0
1 0 0 65432 7612 41344 0 0 100 0 602 820 48 5 47 1
1 3 0 69840 5644 29620 0 0 1284 0 812 797 33 6 34 27
0 1 0 69068 5896 30216 0 0 852 68 362 1052 6 1 0 93
0 1 0 68340 6160 30536 0 0 556 0 547 1355 4 2 0 94
0 2 0 67928 6564 30972 0 0 872 0 793 2173 9 5 0 86
0 1 0 63988 6888 34416 0 0 3776 0 716 1940 3 3 0 94
3 0 0 63696 7104 34608 0 0 376 0 353 1006 4 4 34 58
0 0 0 63548 7112 34948 0 0 332 48 379 916 13 1 84 2
0 0 0 63636 7116 34948 0 0 4 0 184 637 0 1 99 0
0 0 0 63660 7116 34948 0 0 0 0 203 556 0 3 97 0
0 1 0 76100 3648 26128 0 0 460 0 409 1142 7 4 85 4
0 3 0 73452 948 15940 0 0 4144 80 1041 1126 53 6 10 31
0 6 0 73828 84 11424 0 0 32924 80 1135 1732 11 4 0 85
0 6 0 72684 64 12324 0 0 52168 4 1519 2397 6 3 0 91
0 11 0 67340 52 12328 0 0 78072 16 1388 2974 2 9 0 89
1 10 0 65992 336 13412 0 0 79796 0 1297 2973 0 9 0 91
0 15 0 69000 48 10396 0 0 78344 0 1203 2739 2 7 0 91
0 15 0 67168 52 11460 0 0 86864 0 1244 3003 0 6 0 94
1 15 0 71268 52 7836 0 0 82552 4 1497 3269 0 7 0 93
In particular, my cache and buff dropped and the io bytes-in surged, and it stayed like this for ~10 minutes before the machine died and was rebooted by Google Compute Engine. I assume "bi" represents bytes in from disk, but I'm curious why swpd showed 0 the whole time if there was swapping. And why is the "free" memory stat still unaffected if things are reaching a swapping point?
Second, at the time of the final crash, my top showed:
top - 15:06:20 up 1 day, 13:23, 2 users, load average: 13.88, 11.22, 9.30
Tasks: 92 total, 3 running, 89 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.8 us, 8.0 sy, 0.0 ni, 0.0 id, 90.9 wa, 0.0 hi, 0.4 si, 0.0 st
KiB Mem: 1745136 total, 1684032 used, 61104 free, 648 buffers
KiB Swap: 0 total, 0 used, 0 free, 12236 cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
23 root 20 0 0 0 0 R 12.6 0.0 2:11.61 kswapd0
10 root rt 0 0 0 0 S 2.5 0.0 0:52.92 watchdog/0
2315 root 20 0 192m 17m 0 S 2.1 1.0 47:58.74 kubelet
2993 root 20 0 6116m 1.2g 0 S 0.9 70.2 318:41.51 python2.7
6644 root 20 0 55452 12m 0 S 0.9 0.7 0:00.81 python
2011 root 20 0 761m 9924 0 S 0.7 0.6 12:23.44 docker
6624 root 20 0 4176 132 0 D 0.5 0.0 0:00.24 du
140 root 0 -20 0 0 0 S 0.4 0.0 0:08.64 kworker/0:1H
2472 root 20 0 39680 5616 296 D 0.4 0.3 0:27.43 python
1 root 20 0 10656 132 0 S 0.2 0.0 0:02.61 init
3 root 20 0 0 0 0 S 0.2 0.0 2:02.17 ksoftirqd/0
22 root 20 0 0 0 0 R 0.2 0.0 0:24.61 kworker/0:1
1834 root 20 0 53116 756 0 S 0.2 0.0 0:01.79 rsyslogd
1859 root 20 0 52468 9624 0 D 0.2 0.6 0:29.36 supervisord
2559 root 20 0 349m 172m 0 S 0.2 10.1 25:56.31 ruby
Again, I see the python27 process has climbed to 70% (which, combined with the 10% from Ruby, puts me into dangerous territory). But why is kswapd going crazy, eating 10% of my CPU, when the vmstat above shows 0 swap?
Should I just not trust vmstat's swpd?
I'm learning a spread of programming languages in a class, and we're working on an APLX project at the moment. A restriction we have to work around is that we cannot use If, For, While, etc.: no loops or conditionals. I have to be able to take a plane of numbers, ranging 0-7, and replace each number 2 or greater with the depth of that number and, ideally, change the 1s to 0s. For example:
0100230 => 0000560
I have no idea how I'm supposed to do the replacement-with-depth aspect, though the change from ones to zeros is quite simple. I'm able to produce the set of integers in a table, and I understand how to replace specific values, but only with other specific values, not with values that have to be determined during the function. The depth should be the row depth, rather than the multi-dimensional depth.
For the record, this is not the whole of the program; the program itself is a poker dealing and scoring program. This is a specific aspect of the scoring methodology that my professor recommended I use.
TOTALS←SCORE PHAND;TYPECOUNT;DEPTH;ISCOUNT;TEMPS;REPLACE
:If (⍴⍴PHAND) = 0
PHAND←DEAL PHAND
:EndIf
TYPECOUNT←CHARS∘.¹PHAND
DEPTH←2Þ(⍴TYPECOUNT)
REPLACE ← 2 3 4 5 6 7
ISCOUNT ← +/ TYPECOUNT
ISCOUNT ← ³ISCOUNT
((1=,ISCOUNT)/,ISCOUNT)←0
⍝((2=,ISCOUNT)/,ISCOUNT)←1
⍝TEMPS ← ISCOUNT
⎕←ISCOUNT
⎕←PHAND
You may have missed the first lessons of your prof, and it might help to look at them again to learn about vectors and how easily you can work with them - once you unlearn the ideas of other programming languages ;-)
Assume you have a vector A with numbers from 1 to 7:
A←⍳7
A
1 2 3 4 5 6 7
Now, if you wanted to search for values > 3, you'd do:
A>3
0 0 0 1 1 1 1
The result is a vector, too, and you can easily combine the two in lots of operations:
multiplication to keep only the values > 3 and replace the others with 0:
A×A>3
0 0 0 4 5 6 7
or add 500 to values > 3:
A+500×A>3
1 2 3 504 505 506 507
or, find the indices of values > 3:
(A>3)×⍳⍴A
0 0 0 4 5 6 7
Now, looking at your question again: the word 'depth' has a specific meaning in APL, and I guess you meant something different. Do I understand correctly that you want to replace values > 2 with the 'indices' of these values?
Well, with what I've shown before, this is easy:
A←0 1 0 0 2 3 0
(A≥2)×⍳⍴A
0 0 0 0 5 6 0
edit: looking at multi-dimensional arrays:
Let's look at this example:
A←(⍳5)∘.×⍳10
A
1 2 3 4 5 6 7 8 9 10
2 4 6 8 10 12 14 16 18 20
3 6 9 12 15 18 21 24 27 30
4 8 12 16 20 24 28 32 36 40
5 10 15 20 25 30 35 40 45 50
Now, which numbers are > 20 and < 30?
z←(A>20)∧A<30
z
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 1 1 1 0
0 0 0 0 0 1 1 0 0 0
0 0 0 0 1 0 0 0 0 0
Then, you can multiply the values with that boolean result to filter out only the ones satisfying the condition:
A×z
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 21 24 27 0
0 0 0 0 0 24 28 0 0 0
0 0 0 0 25 0 0 0 0 0
Or, perhaps you're interested in the column-index of the values?
z×[2]⍳¯1↑⍴z
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 7 8 9 0
0 0 0 0 0 6 7 0 0 0
0 0 0 0 5 0 0 0 0 0
NB: this statement might not work in all APL dialects. Here's another way to formulate it:
z×((1↑⍴z)⍴0)∘.+⍳¯1↑⍴z
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 7 8 9 0
0 0 0 0 0 6 7 0 0 0
0 0 0 0 5 0 0 0 0 0
I hope this gives you some ideas to play with. In general, using booleans to manipulate arrays in mathematical operations is an extremely powerful idea in APL which will take you a loooooong way ;-)
Also, if you'd like to see more of the same, have a look at the FinnAPL Idioms - some useful shorties grown over the years ;-)
edit re. "maintaining untouched values":
going back to example array A:
A←(⍳5)∘.×⍳10
A
1 2 3 4 5 6 7 8 9 10
2 4 6 8 10 12 14 16 18 20
3 6 9 12 15 18 21 24 27 30
4 8 12 16 20 24 28 32 36 40
5 10 15 20 25 30 35 40 45 50
Replacing values between 20 and 30 with the square of these values, keeping all others unchanged:
touch←(A>20)∧A<30
(touch×A*2)+A×~touch
1 2 3 4 5 6 7 8 9 10
2 4 6 8 10 12 14 16 18 20
3 6 9 12 15 18 441 576 729 30
4 8 12 16 20 576 784 32 36 40
5 10 15 20 625 30 35 40 45 50
I hope you get the idea...
Or better: ask a new question, as otherwise this would truly take on epic dimensions, whereas the idea of Stack Overflow is more like "one issue - one question"...
I'm trying to transpose a matrix using MPI in C. Each process has a square submatrix, and I want to send that to the right process (the 'opposite' one on the grid), transposing it as part of the communication.
I'm using MPI_Type_create_subarray which has an argument for the order, either MPI_ORDER_C or MPI_ORDER_FORTRAN for row-major and column-major respectively. I thought that if I sent as one of these, and received as the other, then my matrix would be transposed as part of the communication. However, this doesn't seem to happen - it just stays non-transposed.
The important part of the code is below, and the whole code file is available at this gist. Does anyone have any ideas why this isn't working? Should this approach to doing the transpose work? I'd have thought it would, having read the descriptions of MPI_ORDER_C and MPI_ORDER_FORTRAN, but maybe not.
/* ----------- DO TRANSPOSE ----------- */
/* Find the opposite co-ordinates (as we know it's a square) */
coords2[0] = coords[1];
coords2[1] = coords[0];
/* Get the rank for this process */
MPI_Cart_rank(cart_comm, coords2, &rank2);
/* Send to these new coordinates */
tag = (coords[0] + 1) * (coords[1] + 1);
/* Create new derived type to receive as */
/* MPI_Type_vector(rows_in_core, cols_in_core, cols_in_core, MPI_DOUBLE, &vector_type); */
sizes[0] = rows_in_core;
sizes[1] = cols_in_core;
subsizes[0] = rows_in_core;
subsizes[1] = cols_in_core;
starts[0] = 0;
starts[1] = 0;
MPI_Type_create_subarray(2, sizes, subsizes, starts, MPI_ORDER_FORTRAN, MPI_DOUBLE, &send_type);
MPI_Type_commit(&send_type);
MPI_Type_create_subarray(2, sizes, subsizes, starts, MPI_ORDER_C, MPI_DOUBLE, &recv_type);
MPI_Type_commit(&recv_type);
/* We're sending in row-major form, so it's just rows_in_core * cols_in_core lots of MPI_DOUBLE */
MPI_Send(&array[0][0], 1, send_type, rank2, tag, cart_comm);
/* Receive from these new coordinates */
MPI_Recv(&new_array[0][0], 1, recv_type, rank2, tag, cart_comm, &status);
I would have thought this would work, too, but apparently not.
If you slog through the relevant bit of the MPI standard where it actually defines the resulting typemap, the reason becomes clear: MPI_Type_create_subarray maps out the region that the subarray occupies in the full array, but marches through the memory in linear order, so the data layout doesn't change. In other words, when the sizes equal the subsizes, the subarray type is just a contiguous block of memory; and for a subarray strictly smaller than the whole array, you're only changing which subregion is sent or received into, not the ordering of the data. You can see the effect when choosing just a subregion:
int sizes[]={cols,rows};
int subsizes[]={2,4};
int starts[]={1,1};
MPI_Type_create_subarray(2, sizes, subsizes, starts, MPI_ORDER_FORTRAN, MPI_INT, &ftype);
MPI_Type_commit(&ftype);
MPI_Type_create_subarray(2, sizes, subsizes, starts, MPI_ORDER_C, MPI_INT, &ctype);
MPI_Type_commit(&ctype);
MPI_Isend(&(send[0][0]), 1, ctype, 0, 1, MPI_COMM_WORLD, &reqc);
MPI_Recv(&(recvc[0][0]), 1, ctype, 0, 1, MPI_COMM_WORLD, &statusc);
MPI_Wait(&reqc, MPI_STATUS_IGNORE);
MPI_Isend(&(send[0][0]), 1, ctype, 0, 1, MPI_COMM_WORLD, &reqf);
MPI_Recv(&(recvf[0][0]), 1, ftype, 0, 1, MPI_COMM_WORLD, &statusf);
MPI_Wait(&reqf, MPI_STATUS_IGNORE);
/*...*/
printf("Original:\n");
printarr(send,rows,cols);
printf("\nReceived -- C order:\n");
printarr(recvc,rows,cols);
printf("\nReceived -- Fortran order:\n");
printarr(recvf,rows,cols);
gives you this:
Original:
0 1 2 3 4 5 6
10 11 12 13 14 15 16
20 21 22 23 24 25 26
30 31 32 33 34 35 36
40 41 42 43 44 45 46
50 51 52 53 54 55 56
60 61 62 63 64 65 66
Received -- C order:
0 0 0 0 0 0 0
0 11 12 13 14 0 0
0 21 22 23 24 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
Received -- Fortran order:
0 0 0 0 0 0 0
0 11 12 0 0 0 0
0 13 14 0 0 0 0
0 21 22 0 0 0 0
0 23 24 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
So the same data is getting sent and received; all that really changes is how the sizes, subsizes and starts arrays are interpreted -- their dimension order is effectively reversed.
You can transpose with MPI datatypes -- the standard even gives a couple of examples, one of which I've transliterated into C here -- but you have to create the types yourself. The good news is that it's really no longer than the subarray stuff:
/* One column of the rows x cols array: rows blocks of 1 int, stride cols */
MPI_Type_vector(rows, 1, cols, MPI_INT, &col);
/* cols such columns, each starting one int further along: the transpose.
   (MPI_Type_create_hvector is the current name; the old MPI_Type_hvector
   was removed in MPI-3.) */
MPI_Type_create_hvector(cols, 1, sizeof(int), col, &transpose);
MPI_Type_commit(&transpose);
MPI_Isend(&(send[0][0]), rows*cols, MPI_INT, 0, 1, MPI_COMM_WORLD, &req);
MPI_Recv(&(recv[0][0]), 1, transpose, 0, 1, MPI_COMM_WORLD, &status);
MPI_Wait(&req, MPI_STATUS_IGNORE);
MPI_Type_free(&col);
MPI_Type_free(&transpose);
printf("Original:\n");
printarr(send,rows,cols);
printf("Received\n");
printarr(recv,rows,cols);
$ mpirun -np 1 ./transpose2
Original:
0 1 2 3 4 5 6
10 11 12 13 14 15 16
20 21 22 23 24 25 26
30 31 32 33 34 35 36
40 41 42 43 44 45 46
50 51 52 53 54 55 56
60 61 62 63 64 65 66
Received
0 10 20 30 40 50 60
1 11 21 31 41 51 61
2 12 22 32 42 52 62
3 13 23 33 43 53 63
4 14 24 34 44 54 64
5 15 25 35 45 55 65
6 16 26 36 46 56 66