Intel SIMD instructions causing segmentation faults [duplicate] - c

How to deal with SIGSEGV, Segmentation fault. while using Avx2 (_mm256_load_pd)(_mm256_store_pd)
(solved)
_mm256_load_pd
I've received segmentation fault wile called
_mm256_load_pd
usage are as blew
double * Val = malloc(sizeof(double)*4);
__m256d vecv = _mm256_load_pd(&Val[0]);
gdb shows
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7fc5017 in _mm256_load_pd (__P=0x555555559370)
at /usr/lib/gcc/x86_64-linux-gnu/9/include/avxintrin.h:862
862 return *(__m256d *)__P;
(gdb) frame 1
#1 gemv_d_lineProduct_4_avx2 (Val=0x555555559370, indx=0x5555555592f0,
Vector_X=0x5555555592c0, Vector_Y=0x555555559340)
at someThing.c:114
114 __m256d vecv = _mm256_load_pd(&Val[0]);
(gdb)
_mm256_store_pd
while I make Val bigger
double * Val = malloc(sizeof(double)*4);
I found _mm256_load_pd works rightly but result in
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7fc50e3 in _mm256_store_pd (__A=..., __P=0x555555559390)
at /usr/lib/gcc/x86_64-linux-gnu/9/include/avxintrin.h:868
868 *(__m256d *)__P = __A;
(gdb) frame 1
#1 gemv_d_lineProduct_4_avx2 (Val=0x5555555593e0, indx=0x555555559310,
Vector_X=0x5555555592c0, Vector_Y=0x555555559390)
at something.c:122
122 _mm256_store_pd(Vector_Y,vecY);
full project
https://github.com/DevilInChina/gemv
mkdir build;cd build
cmake ..
make
cd ../bin
./line
#then might get some seg fault
Method of solving
change memory allocate function to
void *aligned_alloc (size_t __alignment, size_t __size);
first parameter should be 1024 or something else.
Thanks to igor-r

According to the Intel reference, _mm256_load_pd() requires 32-byte aligned pointer.
Please, use aligned_alloc() to allocate a memory chunk having the proper alignment.

Related

I was expecting segmentation fault or some kind of out of bound exception but did not get it when using command line arguments in a C program

I am learning C programming from "Learn c the hard way by Zed Shaw". He asks the learner to try and break their own code.
So I tried the following C code and thought printing more values that I gave argv will break it but it did not until later.
#include<stdio.h>
int main(int argc, char *argv[])
{
int i = 0;
printf("This is argc: %d\n",argc);
printf("This is argv[argc]: %s\n",argv[argc]);
printf("This is argv[0]: %s\n",argv[0]);
for(i=argc;i<100;i++)
printf("This is argv[%d]: %s\n",i,argv[i]);
for(i=1;i<argc;i++)
{
printf("arg %d: %s\n",i,argv[i]);
}
return 0;
}
When I try to print argv upto 100:
I see the following when I was expecting some kind of out of bound or segmentation fault.
./exp10_so These are cmd args
This is argc: 5
This is argv[argc]: (null)
This is argv[0]: ./exp10_so
This is argv[5]: (null)
This is argv[6]: TERMINATOR_DBUS_NAME=net.tenshu.Terminator21a9d5db22c73a993ff0b42f64b396873
This is argv[7]: GTK_RC_FILES=/etc/gtk/gtkrc:/home/ab/.gtkrc:/home/ab/.config/gtkrc
This is argv[8]: _=/home/ab/Projects/learn_c_the_hard_way/./exp10_so
This is argv[9]: LANG=en_IN
This is argv[10]: GTK3_MODULES=xapp-gtk3-module
This is argv[11]: XDG_CURRENT_DESKTOP=KDE
This is argv[12]: QT_LINUX_ACCESSIBILITY_ALWAYS_ON=1
This is argv[13]: LC_IDENTIFICATION=en_IN
This is argv[14]: XCURSOR_THEME=breeze_cursors
This is argv[15]: XDG_SESSION_CLASS=user
This is argv[16]: XDG_SESSION_TYPE=x11
This is argv[17]: SHLVL=1
This is argv[18]: TERMINATOR_UUID=urn:uuid:4496f24b-8a64-43af-ab5a-03fc7e722242
This is argv[19]: DESKTOP_SESSION=plasma
This is argv[20]: LC_MEASUREMENT=en_IN
This is argv[21]: OLDPWD=/home/ab/Projects
This is argv[22]: HOME=/home/ab
This is argv[23]: KDE_SESSION_VERSION=5
This is argv[24]: USER=ab
This is argv[25]: TERMINATOR_DBUS_PATH=/net/tenshu/Terminator2
This is argv[26]: SESSION_MANAGER=local/tgh:#/tmp/.ICE-unix/2372,unix/tgh:/tmp/.ICE-unix/2372
This is argv[27]: XDG_SESSION_PATH=/org/freedesktop/DisplayManager/Session1
This is argv[28]: DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1000/bus
This is argv[29]: XDG_VTNR=1
This is argv[30]: XDG_SEAT=seat0
This is argv[31]: LC_NUMERIC=en_IN
This is argv[32]: BROWSER=/usr/bin/firefox
This is argv[33]: GTK_MODULES=canberra-gtk-module
This is argv[34]: XDG_SEAT_PATH=/org/freedesktop/DisplayManager/Seat0
This is argv[35]: XDG_DATA_DIRS=/home/ab/.local/share/flatpak/exports/share:/var/lib/flatpak/exports/share:/usr/local/share:/usr/share:/var/lib/snapd/desktop
This is argv[36]: XDG_SESSION_DESKTOP=KDE
This is argv[37]: VTE_VERSION=6401
This is argv[38]: KDE_SESSION_UID=1000
This is argv[39]: LC_TIME=en_IN
This is argv[40]: MAIL=/var/spool/mail/ab
This is argv[41]: LOGNAME=ab
This is argv[42]: QT_AUTO_SCREEN_SCALE_FACTOR=0
This is argv[43]: LC_PAPER=en_IN
This is argv[44]: PATH=/usr/local/nginx/sbin:/home/ab/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/lib/jvm/default/bin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl:/var/lib/snapd/snap/bin
This is argv[45]: QT_SCREEN_SCALE_FACTORS=LVDS1=1;DP1=1;HDMI1=1;VGA1=1;VIRTUAL1=1;
This is argv[46]: XDG_RUNTIME_DIR=/run/user/1000
This is argv[47]: SHELL=/bin/zsh
This is argv[48]: XDG_SESSION_ID=2
This is argv[49]: LC_MONETARY=en_IN
This is argv[50]: GTK2_RC_FILES=/etc/gtk-2.0/gtkrc:/home/ab/.gtkrc-2.0:/home/ab/.config/gtkrc-2.0
This is argv[51]: LC_TELEPHONE=en_IN
This is argv[52]: EDITOR=/usr/bin/nano
This is argv[53]: COLORTERM=truecolor
This is argv[54]: MOTD_SHOWN=pam
This is argv[55]: KDE_APPLICATIONS_AS_SCOPE=1
This is argv[56]: PAM_KWALLET5_LOGIN=/run/user/1000/kwallet5.socket
This is argv[57]: KDE_FULL_SESSION=true
This is argv[58]: XAUTHORITY=/home/ab/.Xauthority
This is argv[59]: LC_NAME=en_IN
This is argv[60]: DISPLAY=:0
This is argv[61]: LC_ADDRESS=en_IN
This is argv[62]: PWD=/home/ab/Projects/learn_c_the_hard_way
This is argv[63]: XCURSOR_SIZE=24
This is argv[64]: TERM=xterm-256color
This is argv[65]: ZSH=/home/ab/.oh-my-zsh
This is argv[66]: PAGER=less
This is argv[67]: LESS=-R
This is argv[68]: LSCOLORS=Gxfxcxdxbxegedabagacad
This is argv[69]: LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=00:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arc=01;31:*.arj=01;31:*.taz=01;31:*.lha=01;31:*.lz4=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.tzo=01;31:*.t7z=01;31:*.zip=01;31:*.z=01;31:*.dz=01;31:*.gz=01;31:*.lrz=01;31:*.lz=01;31:*.lzo=01;31:*.xz=01;31:*.zst=01;31:*.tzst=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.alz=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.cab=01;31:*.wim=01;31:*.swm=01;31:*.dwm=01;31:*.esd=01;31:*.jpg=01;35:*.jpeg=01;35:*.mjpg=01;35:*.mjpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.webp=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=00;36:*.au=00;36:*.flac=00;36:*.m4a=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36:*.oga=00;36:*.opus=00;36:*.spx=00;36:*.xspf=00;36:
This is argv[70]: LD_LIBRARY_PATH=/usr/local/lib
This is argv[71]: (null)
AddressSanitizer:DEADLYSIGNAL
=================================================================
==69851==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000021 (pc 0x7f3c30d7b4c6 bp 0x7ffe273b2ba0 sp 0x7ffe273b22e8 T0)
==69851==The signal is caused by a READ memory access.
==69851==Hint: address points to the zero page.
#0 0x7f3c30d7b4c6 in __sanitizer::internal_strlen(char const*) /build/gcc/src/gcc/libsanitizer/sanitizer_common/sanitizer_libc.cpp:167
#1 0x7f3c30d0d057 in printf_common /build/gcc/src/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors_format.inc:545
#2 0x7f3c30d0d41c in __interceptor_vprintf /build/gcc/src/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:1639
#3 0x7f3c30d0d517 in __interceptor_printf /build/gcc/src/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:1697
#4 0x562c5e03f290 in main /home/ab/Projects/learn_c_the_hard_way/exp10_so.c:13
#5 0x7f3c30b0ab24 in __libc_start_main (/usr/lib/libc.so.6+0x27b24)
#6 0x562c5e03f0bd in _start (/home/ab/Projects/learn_c_the_hard_way/exp10_so+0x10bd)
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV /build/gcc/src/gcc/libsanitizer/sanitizer_common/sanitizer_libc.cpp:167 in __sanitizer::internal_strlen(char const*)
==69851==ABORTING
A segmentation fault happens when the code try to access a memory region that is not available.
Accessing an array out of bounds doesn't means that the memory before or after the area occupied by the array is not available: The compiler or the runtime usually put all varibales or data in general in a given block of memory. If your array is the last item of such a memory block, the accessing it with a to big index will produce a Segmentaion Fault but is the array is in the middle of the memory block, you will just access memory used for other data, giving unexpected result and undefined behavior.
If the array (In may example, but valid for anything) is written, accessing available memory will not produce a segmentation fault but will overwrite something else. It may produce unexpected results or crash or segmentation fault later! This kind of bug is frequently very difficult to find because the unexpected result/behavior looks completely independent of the root cause.

Segmentation fault in gst_mini_object_init

I'm trying to use Gstreamer in a C program.
I use udpsrc so I have to put caps :
GstCaps *caps = gst_caps_new_empty_simple("application/x-rtp");
With this, I get an Segmentation fault.
So, I've tried with G_DEBUG="fatal_warnings" gdb --args ./test_gst.
Here's the output :
Program received signal SIGSEGV, Segmentation fault.
0x76f010e4 in gst_mini_object_init (mini_object=0x28600, flags=0, type=0, copy_func=0x76ed6174 <_gst_caps_copy>, dispose_func=0x0, free_func=0x76ed5128 <_gst_caps_free>)
at gstminiobject.c:133
133 gstminiobject.c: No such file or directory.
(gdb) bt
#0 0x76f010e4 in gst_mini_object_init (mini_object=0x28600, flags=0, type=0, copy_func=0x76ed6174 <_gst_caps_copy>, dispose_func=0x0, free_func=0x76ed5128 <_gst_caps_free>)
at gstminiobject.c:133
#1 0x76ed57b4 in gst_caps_init (caps=0x28600) at gstcaps.c:209
#2 gst_caps_new_empty () at gstcaps.c:239
#3 0x76ed58f8 in gst_caps_new_empty_simple (media_type=0x110b4 "application/x-rtp") at gstcaps.c:282
#4 0x00010bbc in main ()
I don't know if this can help, but I'm working on a Raspberry PI 3 (raspbian).
I found a similar bug report with Segmentation fault in gst_mini_object_init(). According to this comment you should call gst_init() before using Gstreamer.
Did you call gst_init() before using Gstreamer API ?

Segmentation Fault: C-Program migrating from HPUX to Linux

I'm trying to migrate a small c program from hpux to linux. The project compiles fine but crashes at runtime showing me a segmentation fault. I've already tried to see behind the mirror using strace and gdb but still don't understand. The relevant (truncated) parts:
tts_send_2.c
Contains a method
int sequenznummernabgleich(int sockfd, char *snd_id, char *rec_id, int timeout_quit) {
TS_TEL_TAB tel_tab_S01;
int n;
# truncated
}
which is called from within that file like this:
. . .
. . .
switch(sequenznummernabgleich(sockfd,c_snd_id,c_rec_id,c_timeout_quit)) {
/* kritischer Fehler */
case -1:
. . .
. . .
when calling that method I'm presented a segmentation fault (gdb output):
Program received signal SIGSEGV, Segmentation fault.
0x0000000000403226 in sequenznummernabgleich (sockfd=<error reading variable: Cannot access memory at address 0x7fffff62f94c>,
snd_id=<error reading variable: Cannot access memory at address 0x7fffff62f940>, rec_id=<error reading variable: Cannot access memory at address 0x7fffff62f938>,
timeout_quit=<error reading variable: Cannot access memory at address 0x7fffff62f934>) at tts_snd_2.c:498
498 int sequenznummernabgleich(int sockfd, char *snd_id, char *rec_id, int timeout_quit) {
which I just don't understand. When I'm stepping to the line where the method is called using gdb, all the variables are looking fine:
1008 switch(sequenznummernabgleich(sockfd,c_snd_id,c_rec_id,c_timeout_quit)) {
(gdb) p sockfd
$9 = 8
(gdb) p &sockfd
$10 = (int *) 0x611024 <sockfd>
(gdb) p c_snd_id
$11 = "KR", '\000' <repeats 253 times>
(gdb) p &c_snd_id
$12 = (char (*)[256]) 0xfde220 <c_snd_id>
(gdb) p c_rec_id
$13 = "CO", '\000' <repeats 253 times>
(gdb) p &c_rec_id
$14 = (char (*)[256]) 0xfde560 <c_rec_id>
(gdb) p c_timeout_quit
$15 = 20
(gdb) p &c_timeout_quit
$16 = (int *) 0xfde660 <c_timeout_quit>
I've also created an strace output. Here's the last part concerning the code shown above:
strace output
Any ideas ? I've searched the web and of course stackoverflow for hours without finding a really similar case.
Thanks
Kriz
I haven't used an HP/UX in eons but do hazily remember enough for the following suggestions:
Make sure you're initializing variables / struts correctly. Use calloc instead of malloc.
Also don't assume a specific bit pattern order: eg low byte then high byte. Ska endian-ness of the machine. There are usually macros in the compiler that will handle the appropriate ordering for you.
Update 15.10.16
After debugging for even more hours I found the real Problem. On the first line of the Method "sequenznummernabgleich" is a declaration of a struct
TS_TEL_TAB tel_tab_S01;
This is defined as following:
typedef struct {
TS_BOF_REC bof;
TS_REM_REC rem;
TS_EOF_REC eof;
int bof_len;
int rem_len;
int eof_len;
int cnt;
char teltyp[LEN_TELTYP+1];
TS_TEL_ENTRY entries[MAX_TEL];
} TS_TEL_TAB;
and it's embedded struct TS_TEL_ENTRY
typedef struct {
int len;
char tel[MAX_TEL_LEN];
} TS_TEL_ENTRY;
The problem is that the value for MAX_TEL_LEN had been changed from 512 to 1024 and thus the struct almost doubled in size what lead to that the STACK SIZE was not big enough anymore.
SOLUTION
Simply set the stack size from 8Mb to 64Mb. This can be achieved using ulimit command (under linux).
List current stack size: ulimit -s
Set stack size to 64Mb: ulimit -s 65535
Note: Values for stack size are in kB.
For a good short ref on ulimit command have a look # ss64

Debugging segfault with no apparent cause in gdb?

gdb was reporting that my C code was crashing somewhere in malloc(), so I linked my code with Electric Fence to pinpoint the actual source of the memory error. Now my code is segfaulting much earlier, but gdb's output is even more confusing:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x30026b00 (LWP 4003)]
0x10007c30 in simulated_status (axis=1, F=0x300e7fa8, B=0x1003a520, A=0x3013b000, p=0x1003b258, XS=0x3013b000)
at ccp_gch.c:799
EDIT: The full backtrace:
(gdb) bt
#0 0x10007c30 in simulated_status (axis=1, F=0x300e7fa8, B=0x1003a520, A=0x3013b000, p=0x1003b258, XS=0x3013b000)
at ccp_gch.c:799
#1 0x10007df8 in execute_QUERY (F=0x300e7fa8, B=0x1003a520, iData=0x7fb615c0) at ccp_gch.c:836
#2 0x10009680 in execute_DATA_cmd (P=0x300e7fa8, B=0x7fb615cc, R_type=0x7fb615d0, iData=0x7fb615c0)
at ccp_gch.c:1581
#3 0x10015bd8 in do_volley (client=13) at session.c:76
#4 0x10015ef4 in do_dialogue (v=12, port=2007) at session.c:149
#5 0x10016350 in do_session (starting_port=2007, ports=1) at session.c:245
#6 0x100056e4 in main (argc=2, argv=0x7fb618f4) at main.c:271
The relevant code (slightly modified due to reasons):
796 static uint32_t simulated_status(
797 unsigned axis, struct foo *F, struct bar *B, struct Axis *A, BAZ *p, uint64_t *XS)
798 {
799 uint32_t result = A->status;
800 *XS = get_status(axis);
801 if (!some_function(p)) {
802 ...
The obvious thing to check would be whether A->status is valid memory, but it is. Removing the assignment pushes the segfault to line 800, and removing that assignment causes some other assignment in the if-block to segfault. It looks as though either accessing an argument passed to the function or writing to a local variable is what's causing the segfault, but everything points to valid memory according to gdb.
How am I to interpret this? I've never seen anything like this before, so any suggestions / pointers in the right direction would be appreciated. I'm using GNU gdb 6.8-debian, Electric Fence 2.1, and running on a PowerPC 405 (uname reports Linux powerpmac 2.6.30.3 #24 [...] ppc GNU/Linux).
I'm guessing, but your symptoms are similar to what could happen in a stack overflow situation. The -fstack-protector suggestion in the comments is on the right track here. I'd recommend adding the -fstack-check option as well.
If the SEGV is occurring because of writes to the guard page protecting the stack then an info registers and info frame in gdb would help confirm if this is the case.

Strange stack overflow?

I'm running into a strange situation with passing a pointer to a structure with a very large array defined in the struct{} definition, a float array around 34MB in size. In a nutshell, the psuedo-code looks like this:
typedef config_t{
...
float values[64000][64];
} CONFIG;
int32_t Create_Structures(CONFIG **the_config)
{
CONFIG *local_config;
int32_t number_nodes;
number_nodes = Find_Nodes();
local_config = (CONFIG *)calloc(number_nodes,sizeof(CONFIG));
*the_config = local_config;
return(number_nodes);
}
int32_t Read_Config_File(CONFIG *the_config)
{
/* do init work here */
return(SUCCESS);
}
main()
{
CONFIG *the_config;
int32_t number_nodes,rc;
number_nodes = Create_Structures(&the_config);
rc = Read_Config_File(the_config);
...
exit(0);
}
The code compiles fine, but when I try to run it, I'll get a SIGSEGV at the { beneath Read_Config_File().
(gdb) run
...
Program received signal SIGSEGV, Segmentation fault.
0x0000000000407d0a in Read_Config_File (the_config=Cannot access memory at address 0x7ffffdf45428
) at ../src/config_parsing.c:763
763 {
(gdb) bt
#0 0x0000000000407d0a in Read_Config_File (the_config=Cannot access memory at address 0x7ffffdf45428
) at ../src/config_parsing.c:763
#1 0x00000000004068d2 in main (argc=1, argv=0x7fffffffe448) at ../src/main.c:148
I've done this sort of thing all the time, with smaller arrays. And strangely, 0x7fffffffe448 - 0x7ffffdf45428 = 0x20B8EF8, or about the 34MB of my float array.
Valgrind will give me similar output:
==10894== Warning: client switching stacks? SP change: 0x7ff000290 --> 0x7fcf47398
==10894== to suppress, use: --max-stackframe=34311928 or greater
==10894== Invalid write of size 8
==10894== at 0x407D0A: Read_Config_File (config_parsing.c:763)
==10894== by 0x4068D1: main (main.c:148)
==10894== Address 0x7fcf47398 is on thread 1's stack
The error messages all point to me clobbering the stack pointer, but a) I've never run across one that crashes on entry of the function and b) I'm passing pointers around, not the actual array.
Can someone help me out with this? I'm on a 64-bit CentOS box running kernel 2.6.18 and gcc 4.1.2
Thanks!
Matt
You've blown up the stack by allocating one of these huge config_t structs onto it. The two stack pointers on evidence in the gdb output, 0x7fffffffe448 and 0x7ffffdf45428, are very suggestive of this.
$ gdb
GNU gdb 6.3.50-20050815 ...blahblahblah...
(gdb) p 0x7fffffffe448 - 0x7ffffdf45428
$1 = 34312224
There's your ~34MB constant that matches the size of the config_t struct. Systems don't give you that much stack space by default, so either move the object off the stack or increase your stack space.
The short answer is that there must be a config_t declared as a local variable somewhere, which would put it on the stack. Probably a typo: missing * after a CONFIG declaration somewhere.

Resources