The last character is not printed to a file - c

I am trying to figure out why the C function strtok is not working properly for me. Here's the problem:
I have a file which contains two types of information: headers and text descriptions. Each line in the file is either a header or part of a text description. A header starts with '>'. The description text follows the header and can span multiple lines. At the end of the text there is an empty line which separates the description from the next header. My aim is to write two separate files: one contains each header on its own line and the other contains each corresponding description on a line by itself. To implement this in C, I used fgets to read the file one line at a time into dynamically allocated memory. In order to write the description text on a single line, I used strtok to get rid of any newline characters that exist in the text.
My code is working properly for the headers file. However, for the descriptions file, I noticed that the last character of the text is not printed to the file even though it is printed to stdout.
FILE *headerFile = fopen("Headers", "w"); //to write headers
FILE *desFile = fopen("Descriptions", "w"); //to write descriptions
FILE *pfile = fopen("Data","r");
if ( pfile != NULL )
{
int numOfHeaders =0;
char **data1 = NULL; //an array to hold a header line
char **data2 = NULL; //an array to hold a description line
char line[700] ; //maximum size for the line
while (fgets(line, sizeof line, pfile ))
{
if(line[0] =='>') //It is a header
{
data1 = realloc(data1,(numOfHeaders +1)* sizeof(*data1));
data1[numOfHeaders]= malloc(strlen(line)+1);
strcpy(data1[numOfHeaders],line);
fprintf(headerFile, "%s",line);//writes the header
if(numOfHeaders >0)
fprintf(desFile, "\n");//writes a new line in the desc file
numOfHeaders++;
}
//it is not a header and not an empty line
if(line[0] != '>' && strlen(line)>2)
{
data2 = realloc(data2,(numOfHeaders +1)* sizeof(*data2));
data2[numOfHeaders]= malloc(strlen(line)+1);
char *s = strtok(line, "\n ");
strcpy(data2[numOfHeaders],s);
fprintf(desFile, "%s",data2[numOfHeaders]);
printf(desFile, "%s",data2[numOfHeaders]);
}
} //end-while
fclose(desFile);
fclose(headerFile);
fclose(pfile );
printf("There are %d headers in the file.\n",numOfHeaders);
}

As mentioned in the comments:
fprintf(desFile, "%s",data2[numOfHeaders]); //okay
printf(desFile, "%s",data2[numOfHeaders]); //wrong
Second line should be:
printf("%s",data2[numOfHeaders]); //okay
Or, you could do this:
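char buffer[700]; //same size as line
snprintf(buffer, sizeof buffer, "%s", data2[numOfHeaders]);
fprintf(desFile, "%s", buffer); //never pass data as the format string
printf("%s", buffer);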
Other possible issues:
Without an input file it is not possible to know for certain what strtok() is doing, but here is a guess based on what you have described:
In these two lines:
data2[numOfHeaders]= malloc(strlen(line)+1);
char *s = strtok(line, "\n ");
if the string contained in line has any embedded spaces, s will only contain the segment occurring before the first space. And because you are only calling it once before line gets refreshed:
while (fgets(line, sizeof line, pfile ))
only one token (the very first segment) will be read.
Not always, but normally, strtok() is called in a loop:
char *s = NULL;
s= strtok(stringToParse, "\n ");//make initial call before entering loop
while(s)//ALWAYS test to see if s contains new content, else NULL
{
//do something with s
strcpy(data2[numOfHeaders],s);
//get next token from string
s = strtok(NULL, "\n ");//continue to tokenize string until s is null
}
But, as I said above, you are calling it only once on that string before the content of the string is changed. It is possible then, that the segment not printing has simply not yet been tokenized by strtok().
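To make the loop idea concrete, here is a minimal sketch that walks every token of a line and joins them back together with single spaces, so nothing after the first space is lost. The function name join_tokens and the output-buffer parameters are assumptions made for illustration; they are not part of the code in the question.

```c
#include <stdio.h>
#include <string.h>

/* Copies every token of `line` into `out`, separated by single spaces.
   `line` is modified by strtok(). Returns the number of tokens copied. */
int join_tokens(char *line, char *out, size_t out_size)
{
    int count = 0;
    size_t used = 0;

    if (out_size == 0)
        return 0;
    out[0] = '\0';

    for (char *tok = strtok(line, "\n "); tok != NULL; tok = strtok(NULL, "\n ")) {
        size_t len = strlen(tok);
        size_t sep = (count > 0) ? 1 : 0;

        if (used + sep + len + 1 > out_size)
            break;                          /* not enough room for this token */
        if (sep)
            out[used++] = ' ';
        memcpy(out + used, tok, len + 1);   /* copies the terminating '\0' too */
        used += len;
        count++;
    }
    return count;
}
```

The joined string could then be written with fprintf(desFile, "%s", out) in place of the single-token copy.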

Related

C - Reading CSV file in char array

I have very little experience in C programming, particularly file handling. I am developing a project in which I'm supposed to create a Sign Up/Log In system. I have a .csv file in which the data are separated by ,
What I am trying to do is read the first and second columns into two char arrays respectively.
char userLogin[100];
char userPassword[100];
FILE *file3 = fopen("C:\\Users\\Kshitiz\\Desktop\\BAAS\\signup_db.csv","r");
if(file3 != NULL){
while(!feof(file3)){
fscanf(file3,"%[^,],%s",userLogin,userPassword);
puts(userLogin);
puts(userPassword);
}
}
fclose(file3);
Content of signup_db.csv:
Username,Password
SBI063DDN,Qazwsx1234
ICICIDDN456,WSXEDC1234r
Expected Output:
Username
Password
SBI063DDN
Qazwsx1234
ICICIDDN456
WSXEDC1234r
Output which I'm getting:
Username
Password
SBI063DDN
Qazwsx1234
ICICIDDN456
WSXEDC1234r
WSXEDC1234r
Can anyone please help me resolve this issue? Thank you!
The 'fscanf()' function returns the number of items of the argument list successfully filled. So instead try this:
while(fscanf(file3," %99[^,],%99s",userLogin,userPassword) == 2) /* leading space skips the previous newline; widths match the 100-byte arrays */
{
puts(userLogin);
puts(userPassword);
}
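The problem you mentioned is probably caused by a newline character at the end of your file. After the last line has been read, the end-of-file indicator is not yet set, so while(!feof(file3)) runs one more time; that fscanf() call fails to fill userPassword, and the stale value is printed again. Checking the return value of fscanf(), as in the code above, solves this issue.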
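The same return-value check works on a line that has already been read, for example with fgets(). The sketch below is only an illustration: parse_record is a hypothetical helper, and the 100-byte buffers are assumed to match the arrays in the question.

```c
#include <stdio.h>

/* Parses "login,password" from one line of text.
   Returns 1 when both fields were filled, 0 otherwise. */
int parse_record(const char *line, char login[100], char password[100])
{
    /* The leading space skips any whitespace before the login;
       the widths keep each write within its 100-byte buffer. */
    return sscanf(line, " %99[^,],%99s", login, password) == 2;
}
```

A blank trailing line makes sscanf() report fewer than two conversions, so the caller can skip it instead of printing stale data.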
In my case I have the expected results, but I don't know if there is a difference with the compiler or if my csv file is different (I've tried to recreate it). Here is another way to parse the file, check if you have the expected results:
#include <stdio.h>
#include <string.h>
#define LINE_LENGTH 1000
int main(void) {
    char line[LINE_LENGTH];
    const char *delimiter = ",\n"; /* '\n' also strips the trailing newline */
    char *token;
    FILE *file3 = fopen("signup_db.csv", "r");
    if (file3 == NULL) { /* nothing to read or close if fopen failed */
        perror("signup_db.csv");
        return 1;
    }
    while(fgets(line, LINE_LENGTH, file3) != NULL) {
        token = strtok(line, delimiter); /* first column */
        if (token != NULL)
            printf("%s\n", token);
        token = strtok(NULL, delimiter); /* second column */
        if (token != NULL)
            printf("%s\n", token);
    }
    fclose(file3);
    return 0;
}

Regarding FOPEN in C

I am having a problem regarding FOPEN in C.
I have this code which reads a particular file from a directory
FILE *ifp ;
char directoryname[50];
char result[100];
char *rpath = "/home/kamal/samples/pipe26/divpipe0.f00001";
char *mode = "r";
ifp = fopen("director.in",mode); /* director file contains path of directory */
while (fscanf(ifp, "%49s", directoryname) == 1)
{
strcpy(result,directoryname); /* Path of directory /home/kamal/samples/pipe26 */
strcat(result,"/"); /* forward slash for path */
strcat(result,name); /* name of the file divpipe0.f00001 */
}
Up to this point my code works perfectly, creating a string which looks like " /home/kamal/samples/pipe26/divpipe0.f00001 ".
The problem arises when I try to use 'result' to open a file: it gives me an error. If I use 'rpath' instead, it works fine, even though both strings contain the same information.
if (!(fp=fopen(rpath,"rb"))) /* This one works fine */
{
printf("fopen failure2!\n");
return;
}
if (!(fp=fopen(result,"rb"))) /* This does not work */
{
printf("fopen failure2!\n");
return;
}
Could some one please tell why I am getting this error ?
I think you mean char result[100];; i.e. without the asterisk. (Ditto for directoryname.)
You're currently stack-allocating an array of 100 pointers. This will not end well.
Note that rpath and mode point to read-only memory. Really you should use const char* for those two literals.
The error is the array 'char* result[100]', here you are allocating an array of 100 pointers to strings, not 100 bytes / characters, which was your intent.
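As a rough illustration of the fix, the path can be assembled into a plain char array with snprintf(), which also reports when the result would not fit. The helper name build_path is hypothetical and not taken from the question.

```c
#include <stdio.h>

/* Writes "dir/name" into out (a char array, not an array of pointers).
   Returns 0 on success, -1 if the joined path would not fit. */
int build_path(char *out, size_t out_size, const char *dir, const char *name)
{
    int n = snprintf(out, out_size, "%s/%s", dir, name);

    /* snprintf returns the length it wanted to write, excluding the '\0' */
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

With char result[100], a call such as build_path(result, sizeof result, directoryname, name) yields a string that can be passed straight to fopen().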

How can I split an input using multiple delimiters?

I want to split the lines of a lessons.txt file into tokens. The file lists some people and each person's lessons. How can I do it?
Here is my lessons.txt file:
George Adam :Math,Science,Germany
Elizabeth McCurry :Music,Math,History
Tom Hans :Science,Music
First, I want to split each line at ":" and store the names in an array. Second, I want to split the rest at "," and store the lessons in a different array. How can I do this?
Here is my code:
char names[100] , *token, *lecture;
file=fopen("C:\\lessons.txt","r");
while(!feof(file))
{
fgets(names,sizeof(names),file);
printf("%s",names);
token=strtok(names,":");
while(token!=NULL)
{
token=strtok(NULL,":");
printf(" \n %s",token);
lecture=strtok(token,",");
while(lecture!=NULL)
{
lecture=strtok(NULL,",");
printf(" \n\n %s",lecture);
}
}
}
fclose(file);
So you want names to be stored in a separate array, and lessons to be stored in another?
You will need two separate tokens, you are using the same token for names and lessons.
Try this :
FILE *file;
file = fopen("C:\\lessons.txt", "r");
char names[100], *token, *difftok;
while (fgets(names, sizeof(names), file) != NULL) {
    token = strtok(names, ":");
    //puts(token); ---> George Adam
    difftok = strtok(NULL, ",");
    //puts(difftok); ---> Math
    difftok = strtok(NULL, ",");
    //puts(difftok); ---> Science
    difftok = strtok(NULL, "\n");
    //puts(difftok); ---> Germany
}
fclose(file);
In my excerpt, token will always represent names, and difftok will always be lectures, from here I think you can figure out how to store the tokens into an array. Token goes into one, difftok into another.
Also, your loop condition is a problem. feof returns non-zero only after a read has already run into the end of the file, so:
while(!feof(file))
executes one extra iteration after the last line has been read. Writing it as:
while(feof(file) == 0)
is equivalent and has the same problem.
In this case I used fgets(...) != NULL instead, because fgets returns NULL when it reaches end of file (or on a read error). You should probably use my condition, since testing feof before reading processes stale data on the final pass and messes up the way the tokens parse the string.
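For lines with a varying number of lessons, one possible approach is to cut the line at the colon and then loop over the comma-separated part. This is only a sketch: split_record and MAX_LESSONS are made-up names, and the pointers it returns refer into the caller's line buffer, so they must be copied before the buffer is reused.

```c
#include <string.h>

#define MAX_LESSONS 8

/* Splits "Name :A,B,C\n" in place.
   Stores the name in *name and each lesson in lessons[].
   Returns the number of lessons found, or -1 if the line has no ':'. */
int split_record(char *line, char **name, char *lessons[], int max_lessons)
{
    char *colon = strchr(line, ':');
    if (colon == NULL)
        return -1;
    *colon = '\0';

    /* drop the spaces between the name and the colon */
    size_t len = strlen(line);
    while (len > 0 && line[len - 1] == ' ')
        line[--len] = '\0';
    *name = line;

    int count = 0;
    for (char *tok = strtok(colon + 1, ",\r\n");
         tok != NULL && count < max_lessons;
         tok = strtok(NULL, ",\r\n"))
        lessons[count++] = tok;
    return count;
}
```

Inside the fgets loop, each name and lesson could then be copied into its own array.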

Bus Error on void function return

I'm learning to use libcurl in C. To start, I'm using a randomized list of accession names to search for protein sequence files that may be found hosted here. These follow a set format where the first line is a variable length (but which contains no information I'm trying to query) then a series of capitalized letters with a new line every sixty (60) characters (what I want to pull down, but reformat to eighty (80) characters per line).
I have the call itself in a single function:
//finds and saves the fastas for each protein (assuming one exists)
void pullFasta (proteinEntry *entry, char matchType, FILE *outFile) {
//Local variables
URL_FILE *handle;
char buffer[2] = "", url[32] = "http://www.uniprot.org/uniprot/", sequence[2] = "";
//Build full URL
/*printf ("u:%s\nt:%s\n", url, entry->title); /*This line was used for debugging.*/
strcat (url, entry->title);
strcat (url, ".fasta");
//Open URL
/*printf ("u:%s\n", url); /*This line was used for debugging.*/
handle = url_fopen (url, "r");
//If there is data there
if (handle != NULL) {
//Skip the first line as it's got useless info
do {
url_fread(buffer, 1, 1, handle);
} while (buffer[0] != '\n');
//Grab the fasta data, skipping newline characters
while (!url_feof (handle)) {
url_fread(buffer, 1, 1, handle);
if (buffer[0] != '\n') {
strcat (sequence, buffer);
}
}
//Print it
printFastaEntry (entry->title, sequence, matchType, outFile);
}
url_fclose (handle);
return;
}
With proteinEntry being defined as:
//Entry for fasta formatable data
typedef struct proteinEntry {
char title[7];
struct proteinEntry *next;
} proteinEntry;
The url_fopen, url_fclose, url_feof, url_fread, and URL_FILE code is found here; these mimic the file functions for which they are named.
As you can see I've been doing some debugging with the URL generator (uniprot URLs follow the same format for different proteins), I got it working properly and can pull down the data from the site and save it to file in the proper format that I want. I set the read buffer to 1 because I wanted to get a program that was very simplistic but functional (if inelegant) before I start playing with things, so I would have a base to return to as I learned.
I've tested the url_<function> calls and they are giving no errors. So I added incremental printf calls after each line to identify exactly where the bus error is occurring and it is happening at return;.
My understanding of bus errors is that it's a memory access issue wherein I'm trying to get at memory that my program doesn't have control over. My confusion comes from the fact that this is happening at the return of a void function. There's nothing being read, written, or passed to trigger the memory error (as far as I understand it, at least).
Can anyone point me in the right direction to fix my mistake please?
EDIT: As @BLUEPIXY pointed out, I had a potential url_fclose (NULL). As @deltheil pointed out, I had sequence as a static array. This also made me notice I was repeating my bad memory allocation for url, so I updated it and it now works. Thanks for your help!
If we look at e.g http://www.uniprot.org/uniprot/Q6GZX1.fasta and skip the first line (as you do) we have:
MNAKYDTDQGVGRMLFLGTIGLAVVVGGLMAYGYYYDGKTPSSGTSFHTASPSFSSRYRY
which is a 60-character string.
When you try to read this sequence with:
//Grab the fasta data, skipping newline characters
while (!url_feof (handle)) {
url_fread(buffer, 1, 1, handle);
if (buffer[0] != '\n') {
strcat (sequence, buffer);
}
}
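The problem is that sequence is not expandable and not large enough: it is a fixed-length array of size 2, so it can hold only one character plus the terminating '\0'. Every strcat after the first writes past the end of the array and overwrites neighbouring stack memory, which is probably why the crash only shows up when the function returns.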
So make sure to choose a large enough size to hold any sequence, or implement the ability to expand it on-the-fly.
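One way to expand the buffer on the fly is a small structure that doubles its allocation whenever it fills up. The names seq_buf and seq_append are invented for this sketch, and the starting capacity of 64 is an arbitrary choice.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    char  *data;   /* NUL-terminated contents, or NULL while empty */
    size_t len;    /* characters stored, not counting the terminator */
    size_t cap;    /* bytes currently allocated */
} seq_buf;

/* Appends one character, growing the allocation when it is full.
   Returns 0 on success, -1 if realloc fails (the buffer is left intact). */
int seq_append(seq_buf *b, char c)
{
    if (b->len + 2 > b->cap) {              /* room for c and the '\0' */
        size_t new_cap = b->cap ? b->cap * 2 : 64;
        char *p = realloc(b->data, new_cap);
        if (p == NULL)
            return -1;
        b->data = p;
        b->cap = new_cap;
    }
    b->data[b->len++] = c;
    b->data[b->len] = '\0';
    return 0;
}
```

In the read loop, seq_append(&seq, buffer[0]) would replace strcat (sequence, buffer), starting from seq_buf seq = {0} and calling free(seq.data) once the sequence has been printed.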

Why does c script only rename in last execution of loop

I am using C Script within Siemens WinCC 7.0 to read a text file containing source and destination comma separated e.g.
C:\Users\Administrator\Desktop\C File Transfer Test\Source\Cat.txt,P:\Cat.txt
C:\Users\Administrator\Desktop\C File Transfer Test\Source\Cat1.txt,P:\Cat1.txt
C:\Users\Administrator\Desktop\C File Transfer Test\Source\Cat2.txt,P:\Cat2.txt
C:\Users\Administrator\Desktop\C File Transfer Test\Source\Cat3.txt,P:\Cat3.txt
C:\Users\Administrator\Desktop\C File Transfer Test\Source\Cat4.txt,P:\Cat4.txt
I am using the following code to open this file and loop through moving the files from source to destination
#include "apdefap.h"
void File_Transfer()
{
#define MODUL "CopyProjekt "
char pathIn[100];
char pathOut[100];
char szProjektname[255];
FILE * fpInFile ;
FILE * fpOutFile ;
FILE *TempSource;
FILE *TempDestination;
#pragma code ("kernel32.dll")
BOOL CopyFileA(LPCTSTR,LPCTSTR,BOOL);
#pragma code ()
DM_DIRECTORY_INFO dmDirInfo;
DM_PROJECT_INFO dmProjectInfo;
CMN_ERROR dmError;
char *source;
char *destination;
char line[1000];
char * tokens;
char *tempTokens;
int i;
char tempString[1000];
if (DMGetProjectDirectory("PDLRT", szProjektname, &dmDirInfo, &dmError )!= NULL)
{
strcat(pathIn , dmDirInfo.szProjectDir) ;
strcat(pathOut, dmDirInfo.szProjectDir) ;
strcat(pathIn ,"FilesForTransfer\\FileData.txt");
strcat(pathOut ,"FilesForTransfer\\FileDataTemp.txt");
//(NULL,pathIn ,"2", MB_YESNO|MB_ICONQUESTION|MB_SYSTEMMODAL);
}
//Open the file containing the folder names and paths
fpInFile = fopen(pathIn,"r" );
fpOutFile = fopen(pathOut,"w" );
while (fgets(line,sizeof line,fpInFile) != NULL)
{
MessageBox(NULL,line,"Read Line",MB_YESNO|MB_ICONQUESTION|MB_SYSTEMMODAL);
tempTokens = line;
tokens = strtok(tempTokens ,",");
while (tokens != NULL)
{
if (i == 0)
{
source = tokens ;
}
else
{
destination = tokens ;
}
i = i + 1;
//read the tokens again
tokens = strtok(NULL ,",");
}
//MessageBox(NULL,source ,destination, MB_YESNO|MB_ICONQUESTION|MB_SYSTEMMODAL);
//Move the file from source to destination
//if (CopyFileA(source,destination,FALSE) != 0)
if (rename(source ,destination )!=0)
{
MessageBox(NULL,"FAILED" ,"Transfer", MB_YESNO|MB_ICONQUESTION|MB_SYSTEMMODAL);
}
else
{
MessageBox(NULL,"PASSED","Transfer",MB_YESNO|MB_ICONQUESTION|MB_SYSTEMMODAL);
}
}
//rename(source,destination)
//fputs(tempstring,fpOutFile );
fclose(fpInFile );
fclose(fpOutFile );
remove(pathIn);
rename(pathOut ,pathIn );
//MessageBox(NULL,"done" ,"Done" , MB_YESNO|MB_ICONQUESTION|MB_SYSTEMMODAL);
}
The message boxes are executing properly every time and the source and destination will be displayed correctly. However only the last file rename will work. If there is only one line in the reference file of source,destination then it will work fine. If there are more than one it will only work on the last.
From what I can work out the code is running through the loop properly and getting the right data from the lookup file but the rename is just not working properly.
Any ideas would be appreciated.
Thanks
fgets() stores the new-line character in the buffer it is populating if found:
Reads at most count - 1 characters from the given file stream and stores them in str. The produced character string is always NULL-terminated. Parsing stops if end-of-file occurs or a newline character is found, in which case str will contain that newline character.
So the destination file name will contain the new-line character, which is illegal. Remove it before attempting the rename():
char* nl_ptr = strrchr(destination, '\n');
if (nl_ptr) *nl_ptr = 0;
The last line works because there is no new-line character.
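An alternative is to strip the line ending once, right after fgets(), and split the line at the comma in the same step. The helper below is a sketch with an invented name, split_pair; it also removes a '\r' in case the file has Windows line endings, which is an assumption about the input.

```c
#include <string.h>

/* Splits "source,destination\n" in place.
   Returns 0 on success, -1 if the line contains no comma. */
int split_pair(char *line, char **source, char **destination)
{
    line[strcspn(line, "\r\n")] = '\0';   /* cut at the first CR or LF, if any */

    char *comma = strchr(line, ',');
    if (comma == NULL)
        return -1;
    *comma = '\0';
    *source = line;
    *destination = comma + 1;
    return 0;
}
```

Because strcspn() returns the string length when no newline is present, the same call is safe for the last line of the file.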

Resources