Segmentation fault traversing a text file - c

I have this code that is supposed to read a text file character by character and then do something with it, but it keeps segfaulting at line 6.
#include <stdio.h>
int main(void)
{
printf("a\n");
FILE* fp = fopen("~/pset5/dictionaries/small", "r");
for (int a = fgetc(fp); a != EOF; a = fgetc(fp))
{
printf("b\n");
}
return 0;
}
Something weird is definitely happening, because it doesn't even print "a\n" to the terminal, even though the call to printf is before the error. I've run the program with gdb, and this is where it fails.
6 for (int a = fgetc(fp); a != EOF; a = fgetc(fp))
(gdb) n
Program received signal SIGSEGV, Segmentation fault.
_IO_getc (fp=0x0) at getc.c:38
38 getc.c: No such file or directory.
I've also run it with valgrind as in valgrind --leak-check=full ./test, with test being the name of the executable, and this is the relevant error message:
==7568== Invalid read of size 4
==7568== at 0x4EA8A21: getc (getc.c:38)
==7568== by 0x4005ED: main (test.c:6)
==7568== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==7568==
==7568==
==7568== Process terminating with default action of signal 11 (SIGSEGV)
==7568== Access not within mapped region at address 0x0
I'm really at a loss here, can someone explain what's going on with this segmentation fault, and why the hell isn't the first call to printf printing anything?

As the debugger says (fp=0x0), you're calling fgetc with a null pointer. This is what causes the crash.
fp is null because the fopen call failed. You need to check for errors.
Opening the file fails because most likely you do not have a directory called ~. Recall that expanding ~ to your home directory is done by the shell when you type a command. fopen only takes real filenames.
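A minimal sketch of both points, assuming the home directory is available in the HOME environment variable. The helper names expand_home and count_chars are hypothetical, not part of any library: the first replaces a leading "~/" by hand, the second refuses to call fgetc when fopen returned a null pointer.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: expands a leading "~/" using $HOME (an assumption;
   the shell normally does this). Returns 0 on success, -1 if HOME is
   unset or the result does not fit in out. */
int expand_home(const char *path, char *out, size_t outsize)
{
    int n;
    if (strncmp(path, "~/", 2) != 0) {
        /* No tilde prefix: copy the path unchanged. */
        n = snprintf(out, outsize, "%s", path);
    } else {
        const char *home = getenv("HOME");
        if (home == NULL)
            return -1;
        n = snprintf(out, outsize, "%s/%s", home, path + 2);
    }
    return (n < 0 || (size_t)n >= outsize) ? -1 : 0;
}

/* Hypothetical helper: counts the characters in a file, or returns -1
   if the path cannot be expanded or the file cannot be opened. */
long count_chars(const char *path)
{
    char full[4096];
    if (expand_home(path, full, sizeof full) != 0)
        return -1;

    FILE *fp = fopen(full, "r");
    if (fp == NULL)              /* never hand a null FILE* to fgetc */
        return -1;

    long count = 0;
    for (int c = fgetc(fp); c != EOF; c = fgetc(fp))
        count++;
    fclose(fp);
    return count;
}
```

Calling count_chars("~/pset5/dictionaries/small") would then return -1 instead of crashing when the file is missing.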

You forgot to check the return value of fopen() against NULL, which indicates an error while attempting to open the file.
Your for loop busily uses a NULL pointer, hence you get segfault.
Check the global variable errno to find out more about what exactly went wrong in your case.
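As a rough illustration of the errno advice, the wrapper below (open_or_report is a made-up name) prints the reason a file could not be opened. Note the assumption: POSIX systems set errno when fopen fails, but ISO C alone does not promise it.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical wrapper around fopen that reports why opening failed. */
FILE *open_or_report(const char *path, const char *mode)
{
    errno = 0;
    FILE *fp = fopen(path, mode);
    if (fp == NULL) {
        /* strerror converts the errno code into a readable message. */
        fprintf(stderr, "cannot open %s: %s\n", path, strerror(errno));
    }
    return fp;
}
```

The caller still has to test the returned pointer before using it.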

Related

How to solve exit code 139 error when reading from file on unix

I believe this is a problem only on Unix and, if the CLion debugger is right, it occurs at the first fscanf. I don't know why I get the error "Process finished with exit code 139 (interrupted by signal 11: SIGSEGV)". Why does this happen?
struct loginInformation
{
char username[USERNAME_LENGTH];
char password[PASSWORD_LENGTH];
int type;
}accounts[NUM_OF_ACCOUNTS];
void createAccountsFromFile()
{
FILE *input = fopen("accounts.txt", "r");
int counter;
for(counter = 0; counter < NUM_OF_ACCOUNTS; counter++)
{
fscanf(input, "%s", accounts[counter].username);
fscanf(input, "%s", accounts[counter].password);
fscanf(input, "%d", &accounts[counter].type);
}
}
int main()
{
createAccountsFromFile();
}
accounts.txt
user1
pass1
0
user2
pass2
1
user3
pass3
2
user4
pass4
3
It means the program crashed before it exited. You need to debug the program. For example, you need to check whether the file is successfully opened after fopen.
SIGSEGV are not always thrown due to a root cause of memory access problems...
Perl throws a 139 on Unix usually because of file I/O. You might have accidentally deleted your input files.
TL;DR: Your program tried to access a memory location it had no permissions to access, so the operating system killed it.
First: The code "139" doesn't matter, forget about the number. Your program was terminated after "getting a SIGSEGV", or a signal regarding a segmentation violation. Read about what that means here:
What causes a SIGSEGV
(never mind that question is about C++, same idea.)
Now, why would this happen? You must be making some assumptions you shouldn't be. Looking at your code, it might be:
Reading a very long string from the file which exceeds the bounds of the loginInformation array - and perhaps even the bounds of the memory region allocated to your program overall.
Scanning from an invalid, uninitialized, or null FILE pointer, as in @xuhdev's answer
(Unlikely/impossible) Ignoring some error generated by one of the fscanf() calls (you need to check errno if a scan failed).
I think that covers it although maybe I missed something. Instead of speculating you can actually check what happened using a debugger on the core dump:
How do I analyze a program's core dump file with GDB when it has command-line parameters?
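The first two possibilities in that list can be guarded against directly in the reading code. The sketch below is one way to do it, under stated assumptions: the question never gives USERNAME_LENGTH, PASSWORD_LENGTH, or NUM_OF_ACCOUNTS, so the values here are invented, and load_accounts is a hypothetical replacement for createAccountsFromFile.

```c
#include <stdio.h>

#define USERNAME_LENGTH 32   /* assumed value; not given in the question */
#define PASSWORD_LENGTH 32   /* assumed value */
#define NUM_OF_ACCOUNTS 4    /* assumed value */

struct loginInformation {
    char username[USERNAME_LENGTH];
    char password[PASSWORD_LENGTH];
    int type;
};

/* Reads up to max records from path. Returns the number of complete
   records read, or -1 if the file could not be opened. */
int load_accounts(const char *path, struct loginInformation *out, int max)
{
    FILE *input = fopen(path, "r");
    if (input == NULL)
        return -1;

    int count = 0;
    /* %31s caps each string at 31 characters plus the terminator, so it
       matches the assumed 32-byte buffers. Stop as soon as fscanf fails
       to convert all three fields. */
    while (count < max &&
           fscanf(input, "%31s %31s %d",
                  out[count].username,
                  out[count].password,
                  &out[count].type) == 3)
        count++;

    fclose(input);
    return count;
}
```

If the buffer sizes change, the field widths in the format string have to change with them.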
In Perl programming, return code 139 was caused by an out-of-memory condition for me,
because a variable held too much data (millions of entries).
I worked around it by manually releasing (undef) this variable at regular intervals.
This solved the problem.

fgetc() Creating Segmentation Fault

I created the file "wor.txt" in the same program and closed its write stream. But when I try to access it on the first run (the one that creates the file), I get a segmentation fault; when I re-run the program, it runs successfully.
When I delete the automatically generated file and run the program again, it gives a segmentation fault, and on the second run (without deleting the file) it runs successfully again.
NOTE: There is data in the text file, so it is not empty (I checked it in the file manager after the first run).
FILE *fp1= fopen("wor.txt","r");
FILE *f1= fopen("wordsa.txt","ab+");
if((f1==NULL)||(f2==NULL)){
printf("f1 or f2 is null");
}
char c='0';
while((c)!=EOF){
printf("Here is one marker\n");
c=fgetc(fp1); //This Line gives error
printf("Here is another marker\n");
fputc(c,f1);
}
A char is not sufficient to hold EOF; change the type to int.
Check the man page of fgetc(), it returns an int and you should use the same datatype for storing the return value and further use.
That said, when either f1 or fp1 is NULL, you continue anyway and access those file pointers, which is undefined behavior. The NULL check should actually do something: either return or exit, so that the code accessing those pointers is never reached.
Incorrect check.
To properly detect opening of the stream, check fp1, not f2. Then code will fail gracefully when the files do not open properly rather than seg fault.
FILE *fp1= fopen("wor.txt","r");
FILE *f1= fopen("wordsa.txt","ab+");
// if((f1==NULL)||(f2==NULL)){
if((f1==NULL) || (fp1==NULL)){
printf("f1 or fp1 is null");
}
Also use int c as fgetc() typically returns 256 + 1 different values (unsigned char values and EOF) and a char is insufficient to uniquely distinguish them.
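A small sketch combining both answers: the loop variable is an int, and the function bails out before touching a null stream. The name copy_stream is hypothetical.

```c
#include <stdio.h>

/* Copies every byte from src to dst. Returns the number of bytes
   copied, or -1 if either stream is a null pointer. */
long copy_stream(FILE *src, FILE *dst)
{
    if (src == NULL || dst == NULL)
        return -1;

    long copied = 0;
    int c;                       /* int, not char: must also hold EOF */
    while ((c = fgetc(src)) != EOF) {
        if (fputc(c, dst) == EOF)
            break;               /* write error: stop copying */
        copied++;
    }
    return copied;
}
```

With a char variable, a 0xFF byte in the input can compare equal to EOF on platforms where char is signed and end the loop early, or EOF can never match where char is unsigned.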

Compiled with no Segmentation faults

I was working with example from K&R, its a cat utility to view files
#include <stdio.h>
main(int argc,char **argv){
FILE *fp;
void filecopy(FILE *,FILE *);
if(argc==1)
filecopy(stdin,stdout);
else // accidentally mistyped
while(--argv > 0) // should have been --argc > 0
if((fp=fopen(*++argv,"r"))==NULL){
printf("cat: can't open %s\n",*argv);
return 1;
}else{
filecopy(fp,stdout);
fclose(fp);
}
return 0;
}
void filecopy(FILE *ifp,FILE *ofp)
{
int c;
while((c=getc(ifp))!=EOF)
putc(c,ofp);
}
When compiled with gcc cat.c,
and when I ran ./a.out cat.c from the terminal, all I got was some Chinese symbols and some readable text (names like _fini_array_, _GLOBAL_OFFSET_TABLE_, etc.), and the garbage just went on until I pressed Ctrl+C. I wanted to ask why I didn't get a segmentation fault: wasn't the program reading every memory location from argv's start address, which I shouldn't have the rights to do?
Let's look at these two consecutive lines:
while(--argv > 0)
if((fp=fopen(*++argv,"r"))==NULL){
Every time you decrement argv, you end up incrementing it on the next line. So overall, you are just decrementing and incrementing argv a lot but you are never actually reading past the bounds of the argv memory area.
Even if you were reading past the bounds of the argv memory area, that would be undefined behavior and you are not guaranteed to get a segmentation fault. The result you get depends on your compiler, your operating system, and the other things in your program.
I suspect that executing --argv also gives you undefined behavior, because after that line is executed, the pointer would probably point outside of the array allocated for argv data. But, since you didn't dereference argv while it was pointing there, it turned out to be OK.
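For comparison, here is a sketch of the loop as it was presumably intended, counting down argc while walking argv forward. The function cat_files and its out parameter are illustrative additions, not part of the K&R listing.

```c
#include <stdio.h>

/* Copies each file named in argv[1..argc-1] to out.
   Returns 0 on success, 1 if a file cannot be opened. */
int cat_files(int argc, char **argv, FILE *out)
{
    while (--argc > 0) {             /* argc, not argv, is the counter */
        FILE *fp = fopen(*++argv, "r");
        if (fp == NULL) {
            fprintf(stderr, "cat: can't open %s\n", *argv);
            return 1;
        }
        int c;
        while ((c = getc(fp)) != EOF)
            putc(c, out);
        fclose(fp);
    }
    return 0;
}
```

Here argv advances once per remaining argument, so it never moves outside the argument array.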

Valgrind error with atoi

I am working on this project and when I run valgrind on this line of code
int numPointers;
numPointers = atoi(argv[NUM_POINTERS_VALUE]);
I get a valgrind error of
Invalid read of size 1 [PID: 8979]
Address 0x0 is not stack'd, malloc'd or (recently) freed
I was wondering what is going on here and if there is a way to fix it
When you are using command line arguments it is always a good practice to use
int main(int argc, char *argv[])
{
if(argc != <required number of argument>)
{
printf("Fewer arguments in the input\n");
return 1;
}
// Do your stuff
}
Later
if(argv[1] != NULL)
numPointers = atoi(argv[1]);
Because atoi(NULL) results in undefined behavior leading to crash.
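One possible way to fold the argument-count check and the conversion into a single step is sketched below. It swaps atoi for strtol so that malformed input can be detected; parse_int_arg is a hypothetical helper name.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Converts argv[index] to an int. Returns 0 and stores the result in
   *value on success; returns -1 if the argument is missing, is not a
   whole decimal number, or does not fit in an int. */
int parse_int_arg(int argc, char **argv, int index, int *value)
{
    if (index < 0 || index >= argc || argv[index] == NULL)
        return -1;

    char *end;
    errno = 0;
    long n = strtol(argv[index], &end, 10);

    /* end == argv[index]: no digits at all; *end != '\0': trailing junk. */
    if (end == argv[index] || *end != '\0' ||
        errno == ERANGE || n < INT_MIN || n > INT_MAX)
        return -1;

    *value = (int)n;
    return 0;
}
```

A caller could use it as parse_int_arg(argc, argv, NUM_POINTERS_VALUE, &numPointers) and print a usage message when it returns -1.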

New segmentation fault with previously working C code

this code is meant to take a list of names in a text file, and convert to email form
so Kate Jones becomes kate.jones@yahoo.com
this code worked fine on linux mint 12, but now the exact same code is giving a segfault on arch linux.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
FILE *fp;
fp = fopen("original.txt", "r+");
if (fp == NULL )
{
printf("error opening file 1");
return (1);
}
char line[100];
char mod[30] = "@yahoo.com\n";
while (fgets(line, 100, fp) != NULL )
{
int i;
for (i = 0; i < 100; ++i)
{
if (line[i] == ' ')
{
line[i] = '.';
}
if (line[i] == '\n')
{
line[i] = '\0';
}
}
strcat(line, mod);
FILE *fp2;
fp2 = fopen("final.txt", "a");
if (fp == NULL )
{
printf("error opening file 2");
return (1);
}
if (fp2 != NULL )
{
fputs(line, fp2);
fclose(fp2);
}
}
fclose(fp);
return 0;
}
Arch Linux is a fairly fresh install, could it be that there is something else I didn't install that C will need?
I think the problem would be when your original string plus mod exceeds 100 characters.
When you call strcat, it simply appends the second string to the end of the first, assuming there is enough room in the first, which doesn't seem to be the case here.
Just increase the size of line i.e. it could be
char line[130]; // 130 might be more than what is required since mod is shorter
Also it is much better to use strncat
where you can limit maximum number of elements copied to dst, otherwise, strcat can still go beyond size without complaining if given large enough strings.
A word of caution with strncat: its size argument is the maximum number of characters to append, not the total size of the destination, so dst must have room for strlen(dst) + n + 1 bytes. Unlike strncpy, strncat always writes a terminating \0. Its documentation should be read thoroughly before actual use.
Update: Platform specific note
It is sheer coincidence that the code didn't segfault on Mint and crashed on Arch. It invokes undefined behavior, so it can crash sooner or later on any system. There is nothing platform specific here.
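As a sketch of a bounded append, the hypothetical helper below checks the remaining capacity first and passes strncat only the space that is actually left. It assumes dst already holds a valid, terminated string.

```c
#include <string.h>

/* Appends src to dst, where dstsize is the total capacity of dst.
   Returns 0 on success, or -1 (leaving dst untouched) if the result
   would not fit including the terminating '\0'. */
int append_checked(char *dst, size_t dstsize, const char *src)
{
    size_t used = strlen(dst);
    size_t need = strlen(src);

    if (used + need + 1 > dstsize)
        return -1;

    /* The third argument is the room left for characters, excluding
       the terminator, not the total buffer size. */
    strncat(dst, src, dstsize - used - 1);
    return 0;
}
```

In the question's loop this would be append_checked(line, sizeof line, mod), with the line skipped or reported when it returns -1.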
First, your code isn't producing a segmentation fault. Instead it triggers "stack smashing" detection and prints the libc message below to the output console.
*** stack smashing detected ***: _executable-name-with-path_ terminated.
Stack buffer overflow bugs are caused when a program writes more data to a buffer located on the stack than there was actually allocated for that buffer.
Stack Smashing Protector (SSP) is a GCC extension for protecting applications from such stack-smashing attacks.
And, as said in other answers, your problem is resolved by increasing the size of strcat()'s first argument, from
char line[100]
to
char line[130]; // size of line must be at least `strlen(line) + strlen(mod) + 1`. Though 130 is not perfect, it is safer
Lets see where the issue exactly hits in your code:
For that I am bringing up disassembly code of your main.
(gdb) disas main
Dump of assembler code for function main:
0x0804857c <+0>: push %ebp
0x0804857d <+1>: mov %esp,%ebp
0x0804857f <+3>: and $0xfffffff0,%esp
0x08048582 <+6>: sub $0xb0,%esp
0x08048588 <+12>: mov %gs:0x14,%eax
0x0804858e <+18>: mov %eax,0xac(%esp)
..... //Leaving out Code after 0x0804858e till 0x08048671
0x08048671 <+245>: call 0x8048430 <strcat@plt>
0x08048676 <+250>: movl $0x80487d5,0x4(%esp)
.... //Leaving out Code after 0x08048676 till 0x08048704
0x08048704 <+392>: mov 0xac(%esp),%edx
0x0804870b <+399>: xor %gs:0x14,%edx
0x08048712 <+406>: je 0x8048719 <main+413>
0x08048714 <+408>: call 0x8048420 <__stack_chk_fail@plt>
0x08048719 <+413>: leave
0x0804871a <+414>: ret
Following the usual assembly language prologue,
Instruction at 0x08048582 : stack grows by b0(176 in decimal) bytes for allowing storage stack contents for the main function.
%gs:0x14 provides the random canary value used for stack protection.
Instruction at 0x08048588 : Stores above mentioned value into the eax register.
Instruction at 0x0804858e : eax content(canary value) is pushed to stack at $esp with offset 172
Keep a breakpoint(1) at 0x0804858e.
(gdb) break *0x0804858e
Breakpoint 1 at 0x804858e: file program_name.c, line 6.
Run the program:
(gdb) run
Starting program: /path-to-executable/executable-name
Breakpoint 1, 0x0804858e in main () at program_name.c:6
6 {
Once the program pauses at breakpoint (1), retrieve the random canary value by printing the contents of register 'eax'
(gdb) i r eax
eax 0xa3d24300 -1546501376
Keep a breakpoint(2) at 0x08048671 : Exactly before call strcat().
(gdb) break *0x08048671
Breakpoint 2 at 0x8048671: file program_name.c, line 33.
Continue the program execution to reach the breakpoint (2)
(gdb) continue
Continuing.
Breakpoint 2, 0x08048671 in main () at program_name.c:33
print out the second top stack content where we stored the random canary value by executing following command in gdb, to ensure it is the same before strcat() is called.
(gdb) p *(int*)($esp + 172)
$1 = -1546501376
Keep a breakpoint (3) at 0x08048676 : Immediately after returning from call strcat()
(gdb) break *0x08048676
Breakpoint 3 at 0x8048676: file program_name.c, line 36.
Continue the program execution to reach the breakpoint (3)
(gdb) continue
Continuing.
Breakpoint 3, main () at program_name.c:36
print out the second top stack content where we stored the random canary value by executing following command in gdb, to ensure it is not corrupted by calling strcat()
(gdb) p *(int*)($esp + 172)
$2 = 1869111673
But it is corrupted by calling strcat(). You can see $1 and $2 are not same.
Lets see what happens because of corrupting the random canary value.
Instruction at 0x08048704 : Pulls the corrupted random canary value and stores it in the 'edx' register
Instruction at 0x0804870b : xor the actual random canary value and the contents of 'edx' register
Instruction at 0x08048712 : If they are same, jumps directly to end of main and returns safely. In our case random canary value is corrupted and 'edx' register contents is not the same as the actual random canary value. Hence Jump condition fails and __stack_chk_fail is called which throws libc_message mentioned in the top of the answer and aborts the application.
Useful Links:
IBM SSP Page
Interesting Read on SSP - caution pdf.
Since you didn't tell us where it faults I'll just point out some suspect lines:
for(i=0; i<100; ++i)
What if a line is less than 100 chars? This will read uninitialized memory, which is not a good idea.
strcat(line, mod);
What if a line is 90 in length and then you're adding 30 more chars? That's 20 out of bounds.
You need to calculate the length and dynamically allocate your strings with malloc, ensure you don't read or write out of bounds, and make sure your strings are always null-terminated. Or you could use C++/std::string to make things easier if it doesn't have to be C.
Instead of checking for \n only, for the end of line, add the check for \r character also.
if(line[i] == '\n' || line[i] == '\r')
Also, before using strcat, ensure that line has enough room for mod. You can do this by checking if (i < /* Some value far less than 100 */). If i == 100, the loop never encountered a \n character, so \0 was never added to line; an invalid memory access then occurs inside strcat(), and therefore a segfault.
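The per-character loop can be limited to the part of the buffer that fgets actually filled. A rough sketch, with normalize_name as a made-up helper name:

```c
#include <string.h>

/* Cuts line at the first '\r' or '\n' and replaces spaces with dots.
   Only bytes before the terminator are touched. Returns the new length. */
size_t normalize_name(char *line)
{
    /* strcspn returns the index of the first '\r' or '\n', or the
       string length if neither is present. */
    line[strcspn(line, "\r\n")] = '\0';

    size_t len = strlen(line);
    for (size_t i = 0; i < len; i++) {
        if (line[i] == ' ')
            line[i] = '.';
    }
    return len;
}
```

The returned length can then be compared against the buffer size before anything is appended.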
Fixed it. I simply increased the size of my line string.

Resources