This is the code:
void i_log_ (int error, const char * file, int line, const char * fmt, ...)
{
/* Get error description */
char * str_err = get_str_error (errno);
remove_trailing_newl (str_err);
/* Format string and parameters */
char message [1024];
va_list ap;
va_start (ap, fmt);
vsprintf (message, fmt, ap);
va_end (ap);
/* Get time */
time_t t = time (NULL);
char stime [64];
char * temp = ctime (&t);
strncpy (stime, temp, sizeof stime - 1);
remove_trailing_newl (stime);
FILE * log;
#ifdef __WIN32__
#else
# ifdef P_LISTENER
log = fopen (I_LOG_FILE, "a+b");
flock (fileno (log), LOCK_EX);
# else /* shared file descriptor of log, lock before opening */
pthread_mutex_lock (& mutex);
log = fopen (I_LOG_FILE, "a+b");
# endif
#endif
if (log) {
if (error)
fprintf (log, ERR_FORMAT, stime, file, line, str_err, message);
else
fprintf (log, ERR_FORMAT_NO_ERRNO, stime, file, line, message);
}
#ifdef __WIN32__
free (str_err);
#else
# ifdef P_LISTENER
flock (fileno (log), LOCK_UN);
fclose (log);
# else
fclose (log);
pthread_mutex_unlock (& mutex);
# endif
#endif
return;
}
Although there is a locking mechanism, the function is not called concurrently in this case, so I don't think that is the problem. However, the program receives a SIGABRT:
[...]
(gdb) c
Continuing.
Program received signal SIGHUP, Hangup. // It's OK, I sent this.
0x00dee416 in __kernel_vsyscall ()
(gdb) c
Continuing.
Program received signal SIGABRT, Aborted.
0x00dee416 in __kernel_vsyscall ()
(gdb) up
#1 0x0013ae71 in raise () from /lib/i386-linux-gnu/libc.so.6
(gdb) up
#2 0x0013e34e in abort () from /lib/i386-linux-gnu/libc.so.6
(gdb) up
#3 0x00171577 in ?? () from /lib/i386-linux-gnu/libc.so.6
(gdb) up
#4 0x0017b961 in ?? () from /lib/i386-linux-gnu/libc.so.6
(gdb) up
#5 0x0017d28b in ?? () from /lib/i386-linux-gnu/libc.so.6
(gdb) up
#6 0x0018041d in free () from /lib/i386-linux-gnu/libc.so.6
(gdb) up
#7 0x0019b0d2 in ?? () from /lib/i386-linux-gnu/libc.so.6
(gdb) up
#8 0x0019b3c5 in ?? () from /lib/i386-linux-gnu/libc.so.6
(gdb) up
#9 0x00199a9f in localtime () from /lib/i386-linux-gnu/libc.so.6
(gdb) up
#10 0x00199951 in ctime () from /lib/i386-linux-gnu/libc.so.6
(gdb) up
#11 0x08049634 in i_log_ (error=0, file=0x804b17d "src/group.c", line=53, fmt=0x804b128 "Setting up new configuration: listener type: %s, number: %d, http-log: %s, port: %d.") at src/error.c:42
42 char * temp = ctime (&t);
(gdb) print temp
$1 = 0x260000 ""
(gdb) print t
$2 = 1329935482
(gdb) print &t
$3 = (time_t *) 0xbff8a5b8
(gdb)
I haven't a clue. ctime appears to be returning an empty string, and the man page doesn't mention this case. Come to think of it, I don't understand why it would return an empty string, or what is wrong with that code.
Any help is appreciated.
ctime isn't returning an empty string. It hasn't returned at all yet, because it crashed while trying to do its thing.
The crash is inside free(), so you're probably corrupting memory at some point before you called ctime(). If you're running on a supported platform, try using a tool like Valgrind to check your memory accesses.
Since the crash is happening down inside ctime() and the pointer you pass is valid, the problem is likely that you've already trampled out of bounds with memory (there's free() in the stack trace) somewhere else and the problem is only manifesting itself here.
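Neither answer pinpoints where the corruption happens, and the source does not confirm it. One candidate visible in the posted function is the unbounded vsprintf into message[1024]. The strncpy into stime happens to be safe because ctime's result is short, but it does not terminate the destination by itself in general. The following is only a sketch of bounded replacements for those two calls; the helper names format_message and copy_terminated are hypothetical and not part of the original code.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Bounded replacement for the vsprintf call: never writes more than `size`
   bytes. Returns 1 if the text fit, 0 if it was truncated or an error occurred. */
static int format_message(char *out, size_t size, const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    int needed = vsnprintf(out, size, fmt, ap);
    va_end(ap);
    return needed >= 0 && (size_t)needed < size;
}

/* Bounded copy that always terminates the destination, unlike a bare strncpy. */
static void copy_terminated(char *dst, size_t size, const char *src)
{
    if (size == 0)
        return;
    strncpy(dst, src, size - 1);
    dst[size - 1] = '\0';
}
```

If the corruption comes from elsewhere in the program, these changes will not remove it, so a run under Valgrind is still the more direct way to locate the faulty write.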
I am trying to debug a segmentation fault in a menu program written in C. The main function is shown below.
int main( int ac, char **av ) {
/* TDT,II - 02 May 2006 - Added this check to see if there is a Debug level passed in */
if ( ac > 0 ) {
iDebug = atoi( av[1] );
sprintf( cLogText, "Setting Debug Level to ~%d~", iDebug );
WriteTrace( cLogText );
};
initscr();
clear();
t1=time(NULL);
local =localtime(&t1);
Svc_Login();
for ( ; ; ) {
cases_on_pc=FALSE;
if ( !Process_security() ) break;
menu1();
};
wrap_up(0);
endwin( );
exit(0);
}
When I run it under gdb without any arguments, it stops at 0x00007ffff34c323a in ____strtoll_l_internal () from /lib64/libc.so.6, as shown below. I expected if (ac > 0) to be true only when I pass arguments, but I passed none and the if block was still executed: atoi(av[1]) was called and caused a segmentation fault. I can't figure out why. How should I proceed to identify and correct the issue so that the menu program runs successfully? Any suggestions?
-rw-rw-r--. 1 MaheshRedhat MaheshRedhat 2275270 Jan 10 03:09 caomenu.c
-rw-rw-r--. 1 MaheshRedhat MaheshRedhat 0 Jan 10 03:09 caomenu.lis
-rwxr-xr-x. 1 root root 796104 Jan 10 03:10 scrmenu
[MaheshRedhat@azureRHEL MenuPrograms]$
[MaheshRedhat@azureRHEL MenuPrograms]$ gdb ./scrmenu
(gdb) run
Starting program: /home/MaheshRedhat/MenuPrograms/scrmenu
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff34c323a in ____strtoll_l_internal () from /lib64/libc.so.6
Missing separate debuginfos, use: yum debuginfo-install glibc-2.28-189.5.0.1.el8_6.x86_64 libnsl-2.28-189.5.0.1.el8_6.x86_64 ncurses-libs-6.1-9.20180224.el8.x86_64
(gdb)
(gdb) bt
#0 0x00007ffff34c323a in ____strtoll_l_internal () from /lib64/libc.so.6
#1 0x00007ffff34bfce4 in atoi () from /lib64/libc.so.6
#2 0x0000000000401674 in main (ac=1, av=0x7fffffffe228) at caomenu.pc:541
(gdb)
Update
The issue above has been resolved. Here is another segmentation fault.
According to this backtrace, the call to WriteTrace (cEntryText=0x6f4d20 <cLogText>) from my main function ends up in fputs() in /lib64/libc.so.6, where it crashes:
Starting program: /home/MaheshRedhat/MenuPrograms/scrmenu 1
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff34f7f5c in fputs () from /lib64/libc.so.6
(gdb)
(gdb) bt
#0 0x00007ffff34f7f5c in fputs () from /lib64/libc.so.6
#1 0x0000000000472efc in WriteTrace (cEntryText=0x6f4d20 <cLogText> "Setting Debug Level to ~1~") at caomenu.pc:18394
#2 0x00000000004016a0 in main (ac=2, av=0x7fffffffe208) at caomenu.pc:543
(gdb)
Here is the declaration of cLogText:
char cLogText[250];
And here is the code for WriteTrace:
/**************************************************************************
routine to write an entry in the trace file
**************************************************************************/
void WriteTrace( char *cEntryText ) {
char cTimeStamp[40]; /* time stamp variable */
char cTimeFormat[]="%H:%M:%S: "; /* time stamp format */
GetTimeStamp( &cTimeStamp[0], sizeof(cTimeStamp), &cTimeFormat[0] );
TrcFile = fopen( cTraceFile, cTrcOpenFlag ); /* open the file */
cTrcOpenFlag[0] = 'a'; /* after first, always append */
fprintf(TrcFile, "%s", cTimeStamp); /* write the time stamp */
fprintf(TrcFile, "%s\n", cEntryText); /* write the entry */
fclose(TrcFile); /* close the trace file */
return; /* return to caller */
}
From C11:
The value of argc shall be nonnegative. argv[argc] shall be a null
pointer. If the value of argc is greater than zero, the array members
argv[0] through argv[argc-1] inclusive shall contain pointers to
strings, which are given implementation-defined values by the host
environment prior to program startup. The intent is to supply to the
program information determined prior to program startup from elsewhere
in the hosted environment. If the host environment is not capable of
supplying strings with letters in both uppercase and lowercase, the
implementation shall ensure that the strings are received in
lowercase.
If the value of argc is greater than zero, the string
pointed to by argv[0] represents the program name; argv[0][0] shall be
the null character if the program name is not available from the host
environment. If the value of argc is greater than one, the strings
pointed to by argv[1] through argv[argc-1] represent the program
parameters.
The condition
( ac > 0 )
would be true even if you provided 0 program arguments, with argv[0] pointing to the program name (if it was available).
This statement:
atoi( av[1] );
tries to access av[1] which the Standard defines to be NULL when no program arguments were provided. Hence the segmentation violation signal.
fopen returns NULL to indicate failure.
TrcFile = fopen( cTraceFile, cTrcOpenFlag );
You do not check its return value here, before passing it to fprintf.
Perhaps:
TrcFile = fopen( cTraceFile, cTrcOpenFlag );
if (!TrcFile) {
/* deal with the error here, e.g. report it and return */
}
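The check suggested above can be sketched as a small standalone helper. This is only an illustration under assumptions: the name write_trace_line and its parameters are hypothetical, and the original WriteTrace also writes a timestamp, which is omitted here.

```c
#include <stdio.h>

/* Hypothetical helper: opens `path` with `mode`, writes one line, closes the
   file. Returns 0 on success and -1 if the file could not be opened or written. */
static int write_trace_line(const char *path, const char *mode, const char *text)
{
    FILE *fp = fopen(path, mode);
    if (fp == NULL) {
        perror(path);            /* report why fopen failed instead of crashing */
        return -1;
    }
    int ok = fprintf(fp, "%s\n", text) >= 0;
    if (fclose(fp) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

With this shape, a missing directory or a permission problem shows up as an error message rather than a crash inside fputs.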
if (ac > 0) {
iDebug = atoi(av[1]); // av[1] is NULL here when you run `./scrmenu`
}
When you run ./scrmenu, you get ac=1 and av[0]="./scrmenu", so av[1] is the terminating null pointer and atoi dereferences it.
You may have misunderstood what ac and av mean in the main function.
How to fix:
if (ac == 2) {
iDebug = atoi(av[1]); // parse the second argument
}
Then you can run either ./scrmenu or ./scrmenu 1.
You can also check this post about command-line arguments.
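A slightly more defensive variant of the fix is sketched below. It assumes the debug level is an optional first argument. The helper name debug_level_from_args and the fallback parameter are hypothetical. It uses strtol instead of atoi so that non-numeric input can be detected.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: returns the debug level given in argv[1], or `fallback`
   when no argument was passed or the argument is not a valid int. */
static int debug_level_from_args(int argc, char **argv, int fallback)
{
    if (argc < 2 || argv[1] == NULL)
        return fallback;                 /* no program arguments were given */

    char *end = NULL;
    errno = 0;
    long value = strtol(argv[1], &end, 10);
    if (errno != 0 || end == argv[1] || *end != '\0' ||
        value < INT_MIN || value > INT_MAX)
        return fallback;                 /* not a number, or out of range */
    return (int)value;
}
```

In main this could be used as iDebug = debug_level_from_args(ac, av, 0);, which behaves the same whether or not an argument is supplied.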
I have this simple program written in C:
#include <stdio.h>
void usage(char *program_name) {
printf("Usage: %s <message> <# of times to repeat>\n", program_name);
exit(1);
}
int main(int argc, char *argv[]) {
int i, count;
// if(argc < 3) // If less than 3 arguments are used,
// usage(argv[0]); // display usage message and exit.
count = atoi(argv[2]); // convert the 2nd arg into an integer
printf("Repeating %d times..\n", count);
for(i=0; i < count; i++)
printf("%3d - %s\n", i, argv[1]); // print the 1st arg
}
I'm running some tests on it with GDB.
I did this:
(gdb) run test
Starting program: /home/user/Desktop/booksrc/convert2 test
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7a56e56 in ____strtoll_l_internal () from /usr/lib/libc.so.6
Obviously it segfaults, because the program needs three arguments (argc == 3) to work and I commented out the lines that check for that.
(gdb) where
#0 0x00007ffff7a56e56 in ____strtoll_l_internal () from /usr/lib/libc.so.6
#1 0x00007ffff7a53a80 in atoi () from /usr/lib/libc.so.6
#2 0x00005555555546ea in main (argc=2, argv=0x7fffffffe958) at convert2.c:14
(gdb) break main
Breakpoint 1 at 0x5555555546d2: file convert2.c, line 14.
(gdb) run test
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/user/Desktop/booksrc/convert2 test
Breakpoint 1, main (argc=2, argv=0x7fffffffe958) at convert2.c:14
14 count = atoi(argv[2]); // convert the 2nd arg into an integer
(gdb) cont
Continuing.
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7a56e56 in ____strtoll_l_internal () from /usr/lib/libc.so.6
(gdb) x/3xw 0x7fffffffe958 // this is memory of the "argv" some line before
0x7fffffffe958: 0xffffebfe 0x00007fff 0xffffec22
(gdb) x/s 0xffffebfe
0xffffebfe: <error: Cannot access memory at address 0xffffebfe>
(gdb) x/s 0x00007fff
0x7fff: <error: Cannot access memory at address 0x7fff>
(gdb) x/s 0xffffec22
0xffffec22: <error: Cannot access memory at address 0xffffec22>
In theory, "x/s" should have shown the command line at the first address, "test" at the second, and the null at the third. But I get nothing. If I paste those addresses into a hex-to-ASCII converter, I get meaningless data. What am I doing wrong?
Your platform uses 64-bit pointers, so try:
(gdb) x/3xg 0x7fffffffe958
to display the 64-bit pointers in the argv array, and then:
(gdb) x/s 0x00007fffffffebfe
or just:
(gdb) p argv[0]
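To see why the x/3xw output looked like three bogus addresses, it may help to join the first two 32-bit words by hand. The sketch below assumes a little-endian 64-bit platform, which the addresses in the question suggest but the source does not state. The function name join_words is hypothetical.

```c
#include <stdint.h>

/* Joins two consecutive 32-bit words, as printed by `x/xw` on a little-endian
   machine, into the single 64-bit value they encode. The word at the lower
   address holds the low half. */
static uint64_t join_words(uint32_t low, uint32_t high)
{
    return ((uint64_t)high << 32) | low;
}
```

Applied to the first two words from the question, join_words(0xffffebfe, 0x00007fff) gives 0x00007fffffffebfe, which is argv[0]. The third word, 0xffffec22, is only the low half of argv[1].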
First of all, always check that the command line is correct:
uncomment the check in your code.
Then, in gdb, set the arguments before running the program:
(gdb) set args "hello world" 12
I am running a simple C application that continuously gets the PID of a process.
It runs on a custom ARM board.
pid_t GetStreamerPID()
{
pid_t pid = 0;
int ret = 0;
char line[100];
char command[50] = "pidof -s gst-launch-0.10";
memset(line, '\0', 100);
FILE *cmd = popen(command, "r");
if ( cmd == NULL )
{
perror("Popen\n");
exit(0);
}
ret = fread(line, sizeof(char), 20, cmd);
pclose(cmd);
pid = atoi(line);
return pid;
}
Randomly, the code throws a segmentation fault at pclose. I have been debugging this for the past week and have been unable to find the cause.
Here is the gdb backtrace:
Program received signal SIGSEGV, Segmentation fault.
0x76e3a588 in free () from /lib/libc.so.6
(gdb) bt full
#0 0x76e3a588 in free () from /lib/libc.so.6
No symbol table info available.
#1 0x76e25c20 in fclose@@GLIBC_2.4 () from /lib/libc.so.6
No symbol table info available.
#2 0x000109b4 in GetStreamerPID () at getPid.c:111
pid = 0
ret = 0
line = '\000' <repeats 99 times>
command = "pidof -s gst-launch-0.10", '\000' <repeats 25 times>
cmd = 0x136c008
#3 0x00010a50 in startStreamer () at getPid.c:147
command = '\000' <repeats 255 times>
pid = 0
#4 0x0001087c in CheckVideoState () at getPid.c:81
iVideoOn = 1
#5 0x00010a20 in MainLoop () at getPid.c:137
No locals.
#6 0x00010b74 in main () at getPid.c:183
No locals.
One more weird observation: after I close gdb and run the "reboot" command, it throws a segmentation fault too. Again, this is random, and it can happen with any command.
I can provide as much information as you need. Please help me debug this weird issue.
Your crash is happening inside free.
A crash like this is almost certainly (99.99%) a sign of heap corruption elsewhere. It very likely has nothing to do with the code you've shown, or with pclose.
Tools such as Valgrind, Address Sanitizer, or GLIBC mcheck are your best bet.
So here's the code containing the printf (with line numbers, this is from think.c):
30: char *think = getRandomMemory();
31: printf("\33[2K\r");
32: if(think == NULL)
33: think = "NULL";
34: printf("I have an idea: %s\n", think);
35: parse(think);
36: freeMemory(think);
37: printf("> ");
And the code from getRandomMemory() which makes sure the returned pointer points to heap allocated space:
char *getRandomMemory()
{
char *ret;
// --SNIP--
size_t l = strlen(ret) + 1;
char *rret = getMemory(sizeof(char) * l);
for(int i = 0; i < l; i++)
rret[i] = ret[i];
printf("--- %s ---\n", rret);
return rret;
}
And finally this is what gdb gives me when running this. Please note that the "--- test ---" comes from " printf("--- %s ---\n", rret)" above:
(gdb) run
Starting program: /home/v10lator/Private/projekte/KI/Lizzy
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Loading Lizzy 0.1... Done!
> --- test ---
[New Thread 0x7ffff781f700 (LWP 32359)]
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff781f700 (LWP 32359)]
0x00007ffff7869490 in _IO_vfprintf_internal (s=<optimized out>, format=<optimized out>,
ap=ap@entry=0x7ffff781ee68) at vfprintf.c:1642
1642 vfprintf.c: No such file or directory.
(gdb) bt
#0 0x00007ffff7869490 in _IO_vfprintf_internal (s=<optimized out>, format=<optimized out>,
ap=ap@entry=0x7ffff781ee68) at vfprintf.c:1642
#1 0x00007ffff7919235 in ___printf_chk (flag=1, format=<optimized out>) at printf_chk.c:35
#2 0x00000000004016bc in printf () at /usr/include/bits/stdio2.h:104
#3 run (first=<optimized out>) at think.c:34
#4 0x00007ffff7bc64c6 in start_thread (arg=0x7ffff781f700) at pthread_create.c:333
#5 0x00007ffff790a86d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
I really don't know what's wrong here, so hopefully someone sees the mistake.
//EDIT: Forgot the getMemory/freeMemory functions:
/*
* This allocates memory.
* The difference between using malloc directly is that this function will
* print an error and exit the program in case something bad happens.
*/
char *getMemory(size_t size)
{
char *mem = malloc(size);
if(mem == NULL)
crashWithMsg("Internal error (malloc failed)!");
return mem;
}
/*
* This deallocates memory.
* The difference between using free directly is that this function will
* set the pointer to NULL afterwards.
*/
void freeMemory(void **ptr)
{
free(*ptr);
*ptr = NULL;
}
The problem is your call to freeMemory:
freeMemory(think);
You've coded that function to take the address of the pointer to the memory to be freed, rather than the pointer itself. So you need to call it with:
freeMemory(&think);
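Two details are worth keeping in mind with that fix. First, &think has type char **, which C does not implicitly convert to void **, so a compiler will typically warn about the call. Second, the posted code assigns the string literal "NULL" to think at line 33, and a literal must never reach free. The following is a minimal sketch of a variant typed for strings. The name free_string is hypothetical and not part of the original program.

```c
#include <stdlib.h>

/* Hypothetical variant of freeMemory typed for char pointers: frees *ptr and
   resets it to NULL so that a second call is harmless. */
static void free_string(char **ptr)
{
    if (ptr == NULL)
        return;
    free(*ptr);      /* free(NULL) is a no-op, so *ptr may already be NULL */
    *ptr = NULL;
}
```

The caller would use free_string(&think); and would print a placeholder such as "NULL" without storing the literal in think itself.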
Why not use strcpy()?
I'm trying to understand what's going wrong with a program run in HP-UX 11.11 that results in a SIGSEGV (11, segmentation fault):
(gdb) bt
#0 0x737390e8 in _sigfillset+0x618 () from /usr/lib/libc.2
#1 0x73736a8c in _sscanf+0x55c () from /usr/lib/libc.2
#2 0x7373c23c in malloc+0x18c () from /usr/lib/libc.2
#3 0x7379e3f8 in _findbuf+0x138 () from /usr/lib/libc.2
#4 0x7379c9f4 in _filbuf+0x34 () from /usr/lib/libc.2
#5 0x7379c604 in __fgets_unlocked+0x84 () from /usr/lib/libc.2
#6 0x7379c7fc in fgets+0xbc () from /usr/lib/libc.2
#7 0x7378ecec in __nsw_getoneconfig+0xf4 () from /usr/lib/libc.2
#8 0x7378f8b8 in __nsw_getconfig+0x150 () from /usr/lib/libc.2
#9 0x737903a8 in __thread_cond_init_default+0x100 () from /usr/lib/libc.2
#10 0x737909a0 in nss_search+0x80 () from /usr/lib/libc.2
#11 0x736e7320 in __gethostbyname_r+0x140 () from /usr/lib/libc.2
#12 0x736e74bc in gethostbyname+0x94 () from /usr/lib/libc.2
#13 0x11780 in dnetResolveName (name=0x400080d8 "smtp.org.com", hent=0x737f3334) at src/dnet.c:64
..
The problem seems to be occurring somewhere inside libc! A system call trace ends with:
Connecting to server smtp.org.com on port 25
write(1, "C o n n e c t i n g t o s e ".., 51) .......................... = 51
open("/etc/nsswitch.conf", O_RDONLY, 0666) ............................... [entry]
open("/etc/nsswitch.conf", O_RDONLY, 0666) ................................... = 5
Received signal 11, SIGSEGV, in user mode, [SIG_DFL], partial siginfo
Siginfo: si_code: I_NONEXIST, faulting address: 0x400118fc, si_errno: 0
PC: 0xc01980eb, instruction: 0x0d3f1280
exit(11) [implicit] ............................ WIFSIGNALED(SIGSEGV)|WCOREDUMP
Last instruction by the program:
struct hostent *him;
him = gethostbyname(name); // name == "smtp.org.com" as shown by gdb
Is this a problem with the system, or am I missing something?
Any guidance for digging deeper would be appreciated.
Thx.
Long story short: vsnprintf corrupted my heap under HP-UX 11.11.
vsnprintf was introduced in C99 (ISO/IEC 9899:1999) and "is equivalent to snprintf, with the variable argument list" (§7.19.6.12.2), snprintf (§7.19.6.5.2): "If n is zero, nothing is written".
Well, HP UX 11.11 doesn't comply with this specification. When 2nd arg == 0, arguments are written at the end of the 1st arg.. which, of course, corrupts the heap (I allocate no space when maxsize==0, given that nothing should be written).
HP manual pages are unclear ("It is the user's responsibility to ensure that enough storage is available."), nothing is said regarding the case of maxsize==0. Nice trap.. at the very least, the WARNINGS section of the man page should warn std-compliant users..
It's a chicken-and-egg problem: vsnprintf is variadic, so to meet the "user's responsibility to ensure that enough storage is available", the user must first know how much space is needed. And the best way to find out is to call vsnprintf with the 2nd arg == 0: it should then return the amount of space required and write nothing... well, except on HP-UX!
One way to determine the needed space with vsnprintf despite this standard violation: malloc 1 byte more for your buffer (1st arg) and call vsnprintf(buf+buf.length,1,..). This only puts a \0 in the extra byte you allocated. Silly, but effective. If you're under wchar conditions, malloc(sizeof..).
Anyway, the workaround is trivial: never call v/snprintf under HP-UX with maxsize == 0!
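The workaround can be sketched as a small allocate-and-format helper. This is an illustration under assumptions: the name format_alloc is hypothetical, it relies on vsnprintf returning the required length when the output is truncated (the C99 behavior), and it has not been tested on HP-UX. The length probe uses a 1-byte scratch buffer, so vsnprintf is never called with a size of 0.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns a malloc'd formatted string, or NULL on failure.
   The caller owns the result and must free it. */
static char *format_alloc(const char *fmt, ...)
{
    va_list ap, probe;
    char scratch[1];                     /* receives only the terminating '\0' */

    va_start(ap, fmt);
    va_copy(probe, ap);                  /* the probe consumes its own copy */
    int needed = vsnprintf(scratch, sizeof scratch, fmt, probe);
    va_end(probe);

    char *out = NULL;
    if (needed >= 0 && (out = malloc((size_t)needed + 1)) != NULL)
        vsnprintf(out, (size_t)needed + 1, fmt, ap);
    va_end(ap);
    return out;
}
```

The va_copy matters here: a va_list that has been passed to vsnprintf cannot be reused for the second call without being reinitialized.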
I now have a happy stable program!
Thanks to all contributors.
Heap corruption through vsnprintf under HP-UX B11.11
This program prints "##" under Linux/Cygwin/..
It prints "#fooo#" under HP-UX B11.11:
#include <stdarg.h>
#include <stdio.h>
#include <strings.h> /* bzero */
const int S=2;
void f (const char *fmt, ...) {
va_list ap;
int actualLen=0;
char buf[S];
bzero(buf, S);
va_start(ap, fmt);
actualLen = vsnprintf(buf, 0, fmt, ap);
va_end(ap);
printf("#%s#\n", buf);
}
int main () {
f("%s", "fooo");
return 0;
}
Whenever this situation happens to me (unexpected segfault in a system lib), it is usually because I did something foolish somewhere else, i.e. buffer overrun, double delete on a pointer, etc.
In those instances where my mistake is not obvious, I use valgrind. Something like the following is usually sufficient:
valgrind -v --leak-check=yes --show-reachable=yes ./myprog
I assume valgrind may be used in HP-UX...
Your stack trace is in malloc which almost certainly means that somewhere you corrupted one of malloc's data structures. As a previous answer said, you likely have a buffer overrun or underrun and corrupted one of the items allocated off the heap.
Another explanation is that you tried to do a free on something that didn't come from the heap, but that's less likely--that would probably have crashed right in free.
Reading the (OS X) manpage says that gethostbyname() returns a pointer, but as far as I can tell may not be allocating memory for that pointer. Do you need to malloc() first? Try this:
struct hostent *him = malloc(sizeof(struct hostent));
him = gethostbyname(name);
...
free(him);
Does that work any better?
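EDIT: I tested this and it's probably wrong. Granted, I used the bare string "stmp.org.com" instead of a variable, but both versions (with and without the malloc()) worked on OS X. Maybe HP-UX is different. In any case, gethostbyname() returns a pointer to storage it manages itself, so the malloc() result above is simply overwritten (and leaked), and the returned pointer should not be passed to free().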