fgets() seems to overflow input to other variables - c

I'm doing a read from a file, but the input seems to "overflow" into other variables.
I have these 2 variables:
char str[250]; //used to store input from stream
char *getmsg; //already points to some other string
The problem is, when I use fgets() to read the input
printf("1TOKEN:%s\n",getmsg);
fp=fopen("m.txt","r");
fp1=fopen("m1.txt","w");
if(fp!=NULL && fp1!=NULL)
printf("2TOKEN:%s\n",getmsg);
while(fgets(str,250,fp)!=NULL){
printf("3TOKEN:%s\n",getmsg);
printf("read:%s",str);
printf("4TOKEN:%s\n",getmsg);
I get something like this:
1TOKEN:c
2TOKEN:c
3TOKEN:b atob atobbody
read:a b atob atobbody
4TOKEN:b atob atobbody
You see how str kind of flows into getmsg. What happened there? How can I avoid this from happening?
Thanks in advance :)
in the code, "getmsg" is called "token", I thought it might have something to do with identical names or something so I changed it to getmsg, same error, so I changed it back...
if(buf[0]=='C'){
int login_error=1;
fp=fopen("r.txt","r");
if(fp!=NULL){
memcpy(&count,&buf[1],2);
pack.boxid=ntohs(count);
memcpy(pack.pword,&buf[3],10);
printf("boxid:%u pword:%s\n",pack.boxid,pack.pword);
while(fgets(str,250,fp)!=NULL){
/*"getmsg"===>*/ token=strtok(str," ");
token=strtok(NULL," ");//receiver uname
token1=strtok(NULL," ");//pword
token2=strtok(NULL," ");//boxid
sscanf(token2,"%hu",&count);//convert char[] to unsigned short
if(pack.boxid==count && strcmp(token1,pack.pword)==0){//uname & pword found
login_error=0;
printf("found:token:%s\n",token);
break;
}
}
if(login_error==1){
count=65535;
pack.boxid=htons(count);
}
if(login_error==0){
count=0;
pack.boxid=htons(count);
}
fclose(fp);
}
printf("1TOKEN:%s\n",token);
if(login_error==0){
int msg_error=1;
fp=fopen("m.txt","r");
fp1=fopen("m1.txt","w");
if(fp!=NULL && fp1!=NULL){
printf("2TOKEN:%s\n",token);
while(fgets(str,250,fp)!=NULL){
printf("3TOKEN:%s\n",token);
printf("read:%s",str);
token1=strtok(str," ");//sender
token2=strtok(NULL," ");//receiver
token3=strtok(NULL," ");//subject
token4=strtok(NULL," ");//body
printf("m.txt:token1:%s token2:%s token3:%s token4:%s\n",token1,token2,token3,token4);
if(msg_error==1 && strcmp(token,token2)==0){//message found
msg_error=0;
count=0;
pack.boxid=htons(count);
strcpy(pack.uname,token1);
strcpy(pack.subject,token3);
strcpy(pack.body,token4);
printf("pack:uname:%s subject:%s body:%s token:%s token2:%s strcmp:%d\n",pack.uname,pack.subject,pack.body,token,token2,strcmp(token,token2));
continue;
}
fprintf(fp1,"%s %s %s %s\n",token1,token2,token3,token4);
}
if(msg_error==1){
count=65534;
pack.boxid=htons(count);
}
printf("count:%u -> boxid:%u\n",count,pack.boxid);
fclose(fp);
fclose(fp1);
}
str[0]='c';
memcpy(&str[1],&pack.boxid,2);
memcpy(&str[3],pack.uname,8);
memcpy(&str[11],pack.subject,20);
memcpy(&str[31],pack.body,200);
str[231]='\0';
bytes=232;
}
}
below is m.txt, it is used to store senders, receivers, subjects and msgbodies:
the naming patter is quite obvious >.^
a b atob atobbody
a c atoc atoccc
b c btoc btoccccc
b a btoa btoaaaaa
So I'm trying to get a msg stored in m.txt for the recipient "c", but it flows over, and by much coincidence, it returns the msg for "b"...

It looks like getmsg is pointing to the third character of your str buffer:
`str` is "a b atob atobbody"
^
|
\__ `getmsg` is pointing there.
Therefore, every time you change str by calling fgets(), the string pointed to by getmsg also changes, since it uses the same memory.

Related

File I/O Extraction with structures in C

The task is to read in a .txt file with a command line argument, within the file there is a list unstructured information listing every airport in the state of Florida note this is only a snippet of the total file. There is some data that must be ignored such as ASO ORL PR A 0 18400 - anything that does not pertain to the structured variables within AirPdata.
The assignment is asking for the site number, locID, fieldname, city, state, latitude, longitude, and if there is a control tower or not.
INPUT
03406.20*H 2FD7 AIR ORLANDO ORLANDO FL ASO ORL PR 28-26-08.0210N 081-28-23.2590W PR NON-NPIAS N A 0 18400
03406.18*H 32FL MEYER- INC ORLANDO FL ASO ORL PR 28-30-05.0120N 081-22-06.2490W PR NON-NPAS N 0 0
OUTPUT
Site# LocID Airport Name City ST Latitude Longitude Control Tower
------------------------------------------------------------------------
03406.20*H 2FD7 AIR ORLANDO ORLANDO FL 28-26-08.0210N 081-28-23.2590W N
03406.18*H 32FL MEYER ORLANDO FL 28-30.05.0120N 081-26-39.2560W N
etc.. etc. etc.. etc.. .. etc.. etc.. ..
etc.. etc. etc.. etc.. .. etc.. etc.. ..
my code so far looks like
#include <stdio.h>
#include <stdlib.h>
#include <strings.h>
typedef struct airPdata{
char *siteNumber;
char *locID;
char *fieldName;
char *city;
char *state;
char *latitude;
char *longitude;
char controlTower;
} airPdata;
int main (int argc, char* argv[])
{
char text[1000];
FILE *fp;
char firstwords[200];
if (strcmp(argv[1], "orlando5.txt") == 0)
{
fp = fopen(argv[1], "r");
if (fp == NULL)
{
perror("Error opening the file");
return(-1);
}
while (fgets(text, sizeof(text), fp) != NULL)
{
printf("%s", text);
}
}
else
printf("File name is incorrect");
fflush(stdout);
fclose(fp);
}
So far i'm able to read the whole file, then output the unstructured input onto the command line.
The next thing I tried to figure out is to extract piece by piece the strings and store them into the variables within the structure. Currently i'm stuck at this phase. I've looked up information on strcpy, and other string library functions, data extraction methods, ETL, I'm just not sure what function to use properly within my code.
I've done something very similar to this in java using substrings, and if there is a way to take a substring of the massive string of text, and set parameters on what substrings are held in what variable, that would potentially work. such as... LocID is never more than 4 characters long, so anything with a numerical/letter combination that is four letters long can be stored into airPdata.LocID for example.
After the variables are stored within the structures, I know I have to use strtok to organize them within the list under site#, locID...etc.. however, that's my best guess to approach this problem, i'm pretty lost.
I don't know what the format is. It can't be space-separated, some of the fields have spaces in them. It doesn't look fixed-width. Because you mentioned strtok I'm going to assume its tab-separated.
You can use strsep use that. strtok has a lot of problems that strsep solves, but strsep isn't standard C. I'm going to assume this is some assignment requiring standard C, so I'll begrudgingly use strtok.
The basic thing to do is to read each line, and then split it into columns with strtok or strsep.
char line[1024];
while (fgets(line, sizeof(line), fp) != NULL) {
char *column;
int col_num = 0;
for( column = strtok(line, "\t");
column;
column = strtok(NULL, "\t") )
{
col_num++;
printf("%d: %s\n", col_num, column);
}
}
fclose(fp);
strtok is funny. It keeps its own internal state of where it is in the string. The first time you call it, you pass it the string you're looking at. To get the rest of the fields, you call it with NULL and it will keep reading through that string. So that's why there's that funny for loop that looks like its repeating itself.
Global state is dangerous and very error prone. strsep and strtok_r fix this. If you're being told to use strtok, find a better resource to learn from.
Now that we have each column and its position, we can do what we like with it. I'm going to use a switch to choose only the columns we want.
for( column = strtok(line, "\t");
column;
column = strtok(NULL, "\t") )
{
col_num++;
switch( col_num ) {
case 1:
case 2:
case 3:
case 4:
case 5:
case 9:
case 10:
case 13:
printf("%s\t", column);
break;
default:
break;
}
}
puts("");
You can do whatever you like with the columns at this point. You can print them immediately, or put them in a list, or a structure.
Just remember that column is pointing to memory in line and line will be overwritten. If you want to store column, you'll have to copy it first. You can do that with strdup but *sigh* that isn't standard C. strcpy is really easy to use wrong. If you're stuck with standard C, write your own strdup.
char *mystrdup( const char *src ) {
char *dst = malloc( (sizeof(src) * sizeof(char)) + 1 );
strcpy( dst, src );
return dst;
}

Inserting word from a text file into a tree in C

I have been encountering a weird problem for the past 2 days and I can't get to solve it yet. I am trying to get words from 2 texts files and add those words to a tree. The methods I choose to get the words are refereed here:
Splitting a text file into words in C.
The function that I use to insert words into a tree is the following:
void InsertWord(typosWords Words, char * w)
{
int error ;
DataType x ;
x.word = w ;
printf(" Trying to insert word : %s \n",x.word );
Tree_Insert(&(Words->WordsRoot),x, &error) ;
if (error)
{
printf("Error Occured \n");
}
}
As mentioned in the link posted , when I am trying to import the words from a text file into the tree , I am getting "Error Occured". For once again the function:
the text file :
a
aaah
aaahh
char this_word[15];
while (fscanf(wordlist, "%14s", this_word) == 1)
{
printf("Latest word that was read: '%s'\n", this_word);
InsertWord(W,this_word);
}
But when I am inserting the exact same words with the following way , it works just fine.
for (i = 0 ; i <=2 ; i++)
{
if (i==0)
InsertWord(W,"a");
if (i==1)
InsertWord(W,"aaah");
if (i==2)
InsertWord(W,"aaahh");
}
That proves the tree's functions works fine , but I can't understand what's happening then.I am debugging for straight 2 days and still can't figure it. Any ideas ?
When you read the words using
char this_word[15];
while (fscanf(wordlist, "%14s", this_word) == 1)
{
printf("Latest word that was read: '%s'\n", this_word);
InsertWord(W,this_word);
}
you are always reusing the same memory buffer for the strings. This means when you do
x.word = w ;
you are ALWAYS storing the SAME address. And every read redefine ALL already stored words, basically corrupting the data structure.
Try changing the char this_word[15]; to char *this_word; and placing a this_word = malloc(15);in the beggining of thewhile` loop instead, making it allocate a new buffer for each iteration. So looking like
char *this_word;
while (fscanf(wordlist, "%14s", this_word) == 1)
{
this_word = malloc(15);
printf("Latest word that was read: '%s'\n", this_word);
InsertWord(W,this_word);
}
As suggested by Michael Walz a strdup(3) also solves the immediate problem.
Of course you will also have do free up the .word elements when finished with the tree.
Seems like the problem was in the assignment of the strings.Strdup seemed to solve the problem !

Bus Error on void function return

I'm learning to use libcurl in C. To start, I'm using a randomized list of accession names to search for protein sequence files that may be found hosted here. These follow a set format where the first line is a variable length (but which contains no information I'm trying to query) then a series of capitalized letters with a new line every sixty (60) characters (what I want to pull down, but reformat to eighty (80) characters per line).
I have the call itself in a single function:
//finds and saves the fastas for each protein (assuming on exists)
void pullFasta (proteinEntry *entry, char matchType, FILE *outFile) {
//Local variables
URL_FILE *handle;
char buffer[2] = "", url[32] = "http://www.uniprot.org/uniprot/", sequence[2] = "";
//Build full URL
/*printf ("u:%s\nt:%s\n", url, entry->title); /*This line was used for debugging.*/
strcat (url, entry->title);
strcat (url, ".fasta");
//Open URL
/*printf ("u:%s\n", url); /*This line was used for debugging.*/
handle = url_fopen (url, "r");
//If there is data there
if (handle != NULL) {
//Skip the first line as it's got useless info
do {
url_fread(buffer, 1, 1, handle);
} while (buffer[0] != '\n');
//Grab the fasta data, skipping newline characters
while (!url_feof (handle)) {
url_fread(buffer, 1, 1, handle);
if (buffer[0] != '\n') {
strcat (sequence, buffer);
}
}
//Print it
printFastaEntry (entry->title, sequence, matchType, outFile);
}
url_fclose (handle);
return;
}
With proteinEntry being defined as:
//Entry for fasta formatable data
typedef struct proteinEntry {
char title[7];
struct proteinEntry *next;
} proteinEntry;
And the url_fopen, url_fclose, url_feof, url_read, and URL_FILE code found here, they mimic the file functions for which they are named.
As you can see I've been doing some debugging with the URL generator (uniprot URLs follow the same format for different proteins), I got it working properly and can pull down the data from the site and save it to file in the proper format that I want. I set the read buffer to 1 because I wanted to get a program that was very simplistic but functional (if inelegant) before I start playing with things, so I would have a base to return to as I learned.
I've tested the url_<function> calls and they are giving no errors. So I added incremental printf calls after each line to identify exactly where the bus error is occurring and it is happening at return;.
My understanding of bus errors is that it's a memory access issue wherein I'm trying to get at memory that my program doesn't have control over. My confusion comes from the fact that this is happening at the return of a void function. There's nothing being read, written, or passed to trigger the memory error (as far as I understand it, at least).
Can anyone point me in the right direction to fix my mistake please?
EDIT: As #BLUEPIXY pointed out I had a potential url_fclose (NULL). As #deltheil pointed out I had sequence as a static array. This also made me notice I'm repeating my bad memory allocation for url, so I updated it and it now works. Thanks for your help!
If we look at e.g http://www.uniprot.org/uniprot/Q6GZX1.fasta and skip the first line (as you do) we have:
MNAKYDTDQGVGRMLFLGTIGLAVVVGGLMAYGYYYDGKTPSSGTSFHTASPSFSSRYRY
Which is a 60 characters string.
When you try to read this sequence with:
//Grab the fasta data, skipping newline characters
while (!url_feof (handle)) {
url_fread(buffer, 1, 1, handle);
if (buffer[0] != '\n') {
strcat (sequence, buffer);
}
}
The problem is sequence is not expandable and not large enough (it is a fixed length array of size 2).
So make sure to choose a large enough size to hold any sequence, or implement the ability to expand it on-the-fly.

Segmentation fault on certain inputs and not others

Heres a function I wrote that has some debugging elements in it already. When i enter either a "y" or a "Y" as the input I get a segmentation fault during runtime. When I enter any other value the code runs. The seg fault kicks out after it scans and gives me the response but before the "scan worked" line is output. DOn't know why it would act like this only on these values. If anyone needs the function call I have that as well.
query_user(char *response [10])
{
printf("response after query call before clear=%s\n",response);
strcpy(response,"");
printf("response after clearing before scan=%s\n",response);
printf("Enter another person into the line? y or n\n");
scanf("%s", response);
printf("response after scan=%s\n",response);
printf("scan worked");
}
main()
{
char response [10];
strcpy(response,"y");
printf("response=%s\n",response);
printf("When finished with program type \"done\" to exit\n");
while (strcmp(response,"done") != 0)
{
printf("response after while loop and before query call=%s\n",response);
query_user(&response);
}
}
output on error:
response after query call before clear=y
response after clearing before scan=
Enter another person into the line? y or n
y
response after scan=y
Segmentation Fault (core dumped)
output on non-error:
response after query call before clear=y
response after clearing before scan=
Enter another person into the line? y or n
n
response after scan=n
scan worked
Cycle number 0
(program continues to run outside this function)
Your declaration of the parameter for query_user is wrong. You've declared an array of pointers to char. You need a simple char buffer. Like this:
query_user(char response[])
or
query_user(char* response)
Use whichever you prefer.
When you call the function, you can do it like this:
query_user(response);
Also I would point out that your declaration for main is incorrect. You should use
int main(void)
What you are doing is not totally secure.
What will happen if the user tape more than size of your tab ? (Segfault probably).
You really should use something like scanf("%9s", response) (You will have to empty the buffer after it !) :
(Example for tab size = 10) :
query_user(char *response)
{
printf("response after query call before clear=%s\n",response);
strcpy(response,"");
printf("response after clearing before scan=%s\n",response);
printf("Enter another person into the line? y or n\n");
scanf("%9s", response);
clean_buffer();
printf("response after scan=%s\n",response);
printf("scan worked");
}
void clean_buffer()
{
int c;
while ((c = getchar ()) != '\n' && c != EOF);
}
Hope it helps !
First, the code you posted doesn't reproduce the problem. I tested on Linux with Gcc 4.6.2 and on Windows with Visual Studio 2010. In both cases with your exact code my output is:
response=y
When finished with program type "done" to exit
response after while loop and before query call=y
response after query call before clear=y
response after clearing before scan=
Enter another person into the line? y or n
y
response after scan=y
scan workedresponse after while loop and before query call=y
response after query call before clear=y
response after clearing before scan=
Enter another person into the line? y or n
So to better diagnose we need a full working set of code which shows this problem. There are a lot of warnings with the posted code (not sure if those are in your base as well) and sometimes ignoring warnings can get you in a bad place:
main should return int: int main()
Put a return in your main at the end return 0;
query_user needs a void return type: void query_user()
strcpy takes a char * you're giving it a char **
For this last point you can fix it by passing the array to your function such as:
query_user(response); // remove the & operator
And your header for query_user() needs to change such as:
query_user(char response [10]) // remove the * operator
Also note that scanf will leave a '\n' character on stdin, to make sure you don't consume that you can put a single space before the %s:
scanf(" %s", response);
I don't think that any of this will make a big difference, but make sure you make the changes. Please post an update to your code that actually shows the error.

Why does my program read an extra structure?

I'm making a small console-based rpg, to brush up on my programming skills.
I am using structures to store character data. Things like their HP, Strength, perhaps Inventory down the road. One of the key things I need to be able to do is load and save characters. Which means reading and saving structures.
Right now I'm just saving and loading a structure with first name and last name, and attempting to read it properly.
Here is my code for creating a character:
void createCharacter()
{
char namebuf[20];
printf("First Name:");
if (NULL != fgets(namebuf, 20, stdin))
{
char *nlptr = strchr(namebuf, '\n');
if (nlptr) *nlptr = '\0';
}
strcpy(party[nMember].fname,namebuf);
printf("Last Name:");
if (NULL != fgets(namebuf, 20, stdin))
{
char *nlptr = strchr(namebuf, '\n');
if (nlptr) *nlptr = '\0';
}
strcpy(party[nMember].lname,namebuf);
/*Character created, now save */
saveCharacter(party[nMember]);
printf("\n\n");
loadCharacter();
}
And here is the saveCharacter function:
void saveCharacter(character party)
{
FILE *fp;
fp = fopen("data","a");
fwrite(&party,sizeof(party),1,fp);
fclose(fp);
}
and the loadCharacter function
void loadCharacter()
{
FILE *fp;
character tempParty[50];
int loop = 0;
int count = 1;
int read = 2;
fp= fopen("data","r");
while(read != 0)
{
read=fread(&tempParty[loop],sizeof(tempParty[loop]),1,fp);
printf("%d. %s %s\n",count,tempParty[loop].fname,tempParty[loop].lname);
loop++;
count++;
}
fclose(fp);
}
So the expected result of the program is that I input a name and last name such as 'John Doe', and it gets appended to the data file. Then it is read in, maybe something like
1. Jane Doe
2. John Doe
and the program ends.
However, my output seems to add one more blank structure to the end.
1. Jane Doe
2. John Doe
3.
I'd like to know why this is. Keep in mind I'm reading the file until fread returns a 0 to signify it's hit the EOF.
Thanks :)
Change your loop:
while( fread(&tempParty[loop],sizeof(tempParty[loop]),1,fp) )
{
// other stuff
}
Whenever you write file reading code ask yourself this question - "what happens if I read an empty file?"
You have an algorithmic problem in your loop, change it to:
read=fread(&tempParty[loop],sizeof(tempParty[loop]),1,fp);
while(read != 0)
{
//read=fread(&tempParty[loop],sizeof(tempParty[loop]),1,fp);
printf("%d. %s %s\n",count,tempParty[loop].fname,tempParty[loop].lname);
loop++;
count++;
read=fread(&tempParty[loop],sizeof(tempParty[loop]),1,fp);
}
There are ways to ged rid of the double fread but first get it working and make sure you understand the flow.
Here:
read=fread(&tempParty[loop],sizeof(tempParty[loop]),1,fp);
printf("%d. %s %s\n",count,tempParty[loop].fname,tempParty[loop].lname);
You are not checking whether the read was successful (the return value of fread()).
while( 1==fread(&tempParty[loop],sizeof*tempParty,1,fp) )
{
/* do anything */
}
is the correct way.
use fopen("data","rb")
instead of fopen("data","r") which is equivalent to fopen("data","rt")
You've got the answer to your immediate question but it's worth pointing out that blindly writing and reading whole structures is not a good plan.
Structure layouts can and do change depending on the compiler you use, the version of that compiler and even with the exact compiler flags used. Any change here will break your ability to read files saved with a different version.
If you have ambitions of supporting multiple platforms issues like endianness also come into play.
And then there's what happens if you add elements to your structure in later versions ...
For robustness you need to think about defining your file format independently of your code and having your save and load functions handle serialising and de-serialising to and from this format.

Resources