File I/O function for C - c

char *loadTextFile(const char *filename)
{
FILE *fileh;
char *text = 0;
long filelength;
if((fileh=fopen(filename,"rb"))== 0)
printf("loadTextFile() - could not open file");
else
{
fseek(fileh, 0, SEEK_END);
filelength = ftell(fileh);
rewind(fileh);
text=(char *) smartmalloc((int)filelength + 1);
fread(text,(int)filelength, 1, fileh);
fclose(fileh);
text[filelength]=0;
}
printf(text);
return(text);
}
This function returns only part of a text file's data, and it is inconsistent: sometimes it gives me 100 characters of the file, sometimes 20. I don't see anything wrong with it, so I thought I'd get another pair of eyes on it. Thanks.

Obvious things to check:
What did ftell(fileh) give you?
Can there be embedded NUL characters in the file? That would cause printf(text) to stop prematurely.
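To check the second point, it can help to compare what strlen sees with the real byte count. A minimal sketch follows; count_embedded_nuls and print_bytes are hypothetical helper names, not part of the original code, and the length is assumed to come from ftell or from fread's return value:

```c
#include <stdio.h>
#include <string.h>

/* Count the NUL bytes among the first len bytes of buf. */
size_t count_embedded_nuls(const char *buf, size_t len)
{
    size_t count = 0;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] == '\0')
            count++;
    }
    return count;
}

/* Write exactly len bytes to stdout, NUL bytes included.
 * Returns 0 on success, -1 if fewer bytes were written. */
int print_bytes(const char *buf, size_t len)
{
    return fwrite(buf, 1, len, stdout) == len ? 0 : -1;
}
```

If count_embedded_nuls reports anything above zero, printf and strlen will stop early even though the whole file was read.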

Here is a slightly better version of your code. You need more error checking around the I/O function calls. Also, there are the annoying implicit long-to-size_t conversions, which I would recommend dealing with properly in production code.
char* loadTextFile(const char *filename) {
char *text;
long length;
FILE *fileh = fopen(filename, "rb");
if ( !fileh ) {
return NULL;
}
fseek(fileh, 0, SEEK_END);
length = ftell(fileh);
rewind(fileh);
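if ( length < 0 ) {
/* ftell failed; close the file before giving up */
fclose(fileh);
return NULL;
}
text = malloc(length + 1);
if ( !text ) {
/* do not leak the open file on allocation failure */
fclose(fileh);
return NULL;
}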
fread(text, 1, length, fileh);
text[length] = 0;
fclose(fileh);
return text;
}
Note that John R. Strohm is right: if you judge what has been read by what printf prints, you are likely being misled by embedded NULs.

fread is not guaranteed to return as many characters as you ask for. You need to check its return value and use it in a loop.
Example loop (not tested):
char *p = text;
do {
size_t n = fread(p,1,(size_t)filelength, fileh);
if (n == 0) {
*p = '\0';
break;
}
filelength -= n;
p += n;
} while (filelength > 0);
The test for n==0 catches the case where some other process truncates the file as you are trying to read it.
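Putting the loop together with the error checks, a whole-file reader might look roughly like this. It is a sketch, not tested against the asker's files; read_whole_file and its out_len parameter are names invented here, and it assumes the file is a regular file whose size ftell can report:

```c
#include <stdio.h>
#include <stdlib.h>

/* Read the whole file at path into a malloc'd, NUL-terminated buffer.
 * On success returns the buffer and stores the byte count in *out_len;
 * on failure returns NULL. The caller frees the result. */
char *read_whole_file(const char *path, size_t *out_len)
{
    char *text = NULL;
    long length;
    size_t total = 0;
    FILE *fp = fopen(path, "rb");

    if (!fp)
        return NULL;

    if (fseek(fp, 0, SEEK_END) != 0 || (length = ftell(fp)) < 0 ||
        fseek(fp, 0, SEEK_SET) != 0)
        goto fail;

    text = malloc((size_t)length + 1);
    if (!text)
        goto fail;

    /* fread may return fewer bytes than requested, so keep asking. */
    while (total < (size_t)length) {
        size_t n = fread(text + total, 1, (size_t)length - total, fp);
        if (n == 0)   /* end of file (it shrank) or a read error */
            break;
        total += n;
    }
    if (ferror(fp))
        goto fail;

    text[total] = '\0';
    *out_len = total;
    fclose(fp);
    return text;

fail:
    free(text);
    fclose(fp);
    return NULL;
}
```

Returning the length separately means the caller never has to rely on the terminating NUL to know how much was read.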

Related

how do i make operations on a specific line in a text file in c?

void main(void)
{
FILE* textfile;
char line[1000];
textfile = fopen("omar.txt", "r");
if (textfile == NULL)
return 1;
while (fgets(line, 1000, textfile)) {
printf(line);
}
fclose(textfile);
}
So this code prints the whole content of a text file. What should I do to read only the third line of the file, for example?
To read the nth line in a file you can do something like this
int i = 0;
while (fgets(line, 1000, textfile)) {
i++;
if (i == n) {
// do stuff with nth line
break;
}
}
This approach uses a counter to count until the nth iteration is reached. Once it is, you can do what you need to do with the nth line.
Also, this may be unrelated, but you should never pass data as the format string of printf, as you have in printf(line);. If the line happens to contain % sequences the behavior is undefined, and an attacker could use that to exploit the program. In your case fputs(line, stdout); is a better alternative (puts(line); also works, but it appends a second newline because fgets keeps the one it read).
For example:
int readNthLine(FILE *fi, char *buff, size_t buffsize, size_t line)
{
fseek(fi, 0, SEEK_SET);
{
for(size_t cline = 0; cline < line; cline++)
{
if(!fgets(buff, buffsize, fi)) return -1;
}
}
return 0;
}
This very simple function will work only if the size of the buffer is larger than the length of the longest line in the file.
Of course, you should check the result of any I/O operation.
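If lines can be longer than the buffer, one way around that limitation is to keep reading chunks until a newline shows up and only then advance the line counter. A rough sketch under that assumption; read_nth_line and the 256-byte scratch buffer are choices made here for illustration:

```c
#include <stdio.h>
#include <string.h>

/* Copy line number target (1-based) of fp into buff, without the newline.
 * An over-long target line is truncated to fit buff; over-long lines that
 * are skipped still count as one line each.
 * Returns 0 on success, -1 if the file has fewer lines or on error. */
int read_nth_line(FILE *fp, char *buff, size_t buffsize, size_t target)
{
    size_t current = 1;
    int at_line_start = 1;   /* does the next chunk begin a new line? */
    char chunk[256];         /* scratch space for lines being skipped */

    if (target == 0 || buffsize < 2 || fseek(fp, 0, SEEK_SET) != 0)
        return -1;

    for (;;) {
        int want = (current == target && at_line_start);
        char *dst = want ? buff : chunk;
        size_t cap = want ? buffsize : sizeof chunk;
        size_t len;
        int complete;

        if (!fgets(dst, (int)cap, fp))
            return -1;

        len = strlen(dst);
        complete = (len > 0 && dst[len - 1] == '\n');

        if (want) {
            if (complete)
                dst[len - 1] = '\0';   /* strip the newline */
            return 0;
        }
        if (complete) {
            current++;
            at_line_start = 1;
        } else {
            at_line_start = 0;         /* still inside a long line */
        }
    }
}
```

Calling read_nth_line(textfile, line, sizeof line, 3) would then fill line with the third line, whatever the lengths of the first two.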

Read bytes (chars) from buffer

I'm working on a steganography program in Java, but I was advised that this task would be easier to solve in C. I would like to try, although I'm not very good at C programming. For now I would like to read a GIF file and find the byte that is used as the image separator (0x2c in the GIF format).
I tried to write this program:
int main(int argc, char *argv[])
{
FILE *fileptr;
char *buffer;
long filelen = 0;
fileptr = fopen("D:/test.gif", "rb"); // Open the file in binary mode
fseek(fileptr, 0, SEEK_END); // Jump to the end of the file
filelen = ftell(fileptr); // Get the current byte offset in the file
rewind(fileptr); // Jump back to the beginning of the file
buffer = (char *)malloc((filelen+1)*sizeof(char)); // Enough memory for file + \0
fread(buffer, filelen, 1, fileptr); // Read in the entire file
fclose(fileptr); // Close the file
int i = 0;
for(i = 0; buffer[ i ]; i++)
{
if(buffer[i] == 0x2c)
{
printf("Next image");
}
}
return 0;
}
Could someone give me advice how to repair my loop?
Option 1: Don't depend on the terminating null character.
for(i = 0; i < filelen; i++)
{
if(buffer[i] == 0x2c)
{
printf("Next image");
}
}
Option 2: Add the terminating null character before relying on it. This is potentially unreliable since you are reading a binary file that could have embedded null characters in it.
buffer[filelen] = '\0';
for(i = 0; buffer[ i ]; i++)
{
if(buffer[i] == 0x2c)
{
printf("Next image");
}
}
Similar to the 'for()' based answer, if you only need to check for a specific byte (0x2c), you can simply do something like the following (and not worry about null in the byte stream), using while().
i = 0;
while(i < filelen)
{
if(buffer[i++] == 0x2c)
{
printf("Next image");
}
}
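The same length-based scan can also be written with memchr, treating the buffer as unsigned bytes. This is only a sketch; count_byte is a hypothetical helper. Keep in mind that 0x2c can also occur inside color tables and compressed image data, so a plain byte count is an upper bound on the number of images rather than an exact answer:

```c
#include <stddef.h>
#include <string.h>

/* Count how many of the first len bytes of buf are equal to value. */
size_t count_byte(const unsigned char *buf, size_t len, unsigned char value)
{
    size_t count = 0;
    const unsigned char *p = buf;
    const unsigned char *end = buf + len;

    /* memchr takes an explicit length, so NUL bytes do not stop it. */
    while (p < end && (p = memchr(p, value, (size_t)(end - p))) != NULL) {
        count++;
        p++;   /* continue searching after this match */
    }
    return count;
}
```

With the question's variables, the call would be count_byte((const unsigned char *)buffer, (size_t)filelen, 0x2c).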

fgets() not reading from a text file?

I have a function loadsets() (short for load settings) which is supposed to load settings from a text file named Progsets.txt. loadsets() returns 0 on success, and -1 when a fatal error is detected. However, the part of the code which actually reads from Progsets.txt, (the three fgets()), seem to all fail and return the null pointer, hence not loading anything at all but a bunch of nulls. Is there something wrong with my code? fp is a valid pointer when I ran the code, and I was able to open it for reading. So what's wrong?
This code is for loading the default text color of my very basic text editor program using cmd.
headers:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <Windows.h>
#define ARR_SIZE 100
struct FINSETS
{
char color[ARR_SIZE + 1];
char title[ARR_SIZE + 1];
char maxchars[ARR_SIZE + 1];
} SETTINGS;
loadsets():
int loadsets(int* pMAXCHARS) // load settings from a text file
{
FILE *fp = fopen("C:\\Typify\\Settings (do not modify)\\Progsets.txt", "r");
char *color = (char*) malloc(sizeof(char*) * ARR_SIZE);
char *title = (char*) malloc(sizeof(char*) * ARR_SIZE);
char *maxchars = (char*) malloc(sizeof(char*) * ARR_SIZE);
char com1[ARR_SIZE + 1] = "color ";
char com2[ARR_SIZE + 1] = "title ";
int i = 0;
int j = 0;
int k = 0;
int found = 0;
while (k < ARR_SIZE + 1) // fill strings with '\0'
{
color[k] = title[k] = maxchars[k] = '\0';
SETTINGS.color[k] = SETTINGS.maxchars[k] = SETTINGS.title[k] = '\0';
k++;
}
if (!fp) // check for reading errors
{
fprintf(stderr, "Error: Unable to load settings. Make sure that Progsets.txt exists and has not been modified.\a\n\n");
return -1; // fatal error
}
if (!size(fp)) // see if Progsets.txt is not a zero-byte file (it shouldn't be)
{
fprintf(stderr, "Error: Progsets.txt has been modified. Please copy the contents of Defsets.txt to Progsets.txt to manually reset to default settings.\a\n\n");
free(color);
free(title);
free(maxchars);
return -1; // fatal error
}
// PROBLEMATIC CODE:
fgets(color, ARR_SIZE, fp); // RETURNS NULL (INSTEAD OF READING FROM THE FILE)
fgets(title, ARR_SIZE, fp); // RETURNS NULL (INSTEAD OF READING FROM THE FILE)
fgets(maxchars, ARR_SIZE, fp); // RETURNS NULL (INSTEAD OF READING FROM THE FILE)
// END OF PROBLEMATIC CODE:
system(strcat(com1, SETTINGS.color)); // set color of cmd
system(strcat(com2, SETTINGS.title)); // set title of cmd
*pMAXCHARS = atoi(SETTINGS.maxchars);
// cleanup
fclose(fp);
free(color);
free(title);
free(maxchars);
return 0; // success
}
Progsets.txt:
COLOR=$0a;
TITLE=$Typify!;
MAXCHARS=$10000;
EDIT: Here is the definition of the size() function. Since I'm just working with ASCII text files, I assume that every character is one byte and the file size in bytes can be worked out by counting the number of characters. Anything suspicious?
size():
int size(FILE* fp)
{
int size = 0;
int c;
while ((c = fgetc(fp)) != EOF)
{
size++;
}
return size;
}
The problem lies in your use of the size() function. It repeatedly calls fgetc() on the file handle until it gets to the end of the file, incrementing a value to track the number of bytes in the file.
That's not a bad approach (though I'm sure there are better ones that don't involve inefficient character-based I/O) but it does have one fatal flaw that you seem to have overlooked.
After you've called it, you've read the file all the way to the end so that any further reads, such as:
fgets(color, ARR_SIZE, fp);
will simply fail since you're already at the end of the file. You may want to consider something like rewind() before returning from size() - that will put the file pointer back to the start of the file so that you can read it again.
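As an alternative to counting characters, the size can be obtained by seeking, with the read position restored afterwards. A minimal sketch; file_size is a name chosen here to avoid clashing with the asker's size(). One caveat: on a stream opened in text mode ("r"), the C standard only guarantees that the ftell value is usable with fseek, so open the file with "rb" if the exact byte count matters. For a simple "is it empty?" check that distinction rarely hurts.

```c
#include <stdio.h>

/* Return the size of fp in bytes, or -1 on error.
 * The read position is put back where it was before returning. */
long file_size(FILE *fp)
{
    long saved = ftell(fp);
    long size;

    if (saved < 0)
        return -1;
    if (fseek(fp, 0, SEEK_END) != 0)
        return -1;
    size = ftell(fp);
    if (fseek(fp, saved, SEEK_SET) != 0)
        return -1;
    return size;
}
```

Because the position is restored, the three fgets calls that follow would start reading at the first line again.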

C, Segmentation fault parsing large csv file

I wrote a simple program that would open a csv file, read it, make a new csv file, and only write some of the columns (I don't want all of the columns and am hoping removing some will make the file more manageable). The file is 1.15GB, but fopen() doesn't have a problem with it. The segmentation fault happens in my while loop shortly after the first progress printf().
I tested on just the first few lines of the csv and the logic below does what I want. The strange section for when index == 0 is due to the last column being in the form (xxx, yyy)\n (the , in a comma separated value file is just ridiculous).
Here is the code, the while loop is the problem:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char** argv) {
long size;
FILE* inF = fopen("allCrimes.csv", "rb");
if (!inF) {
puts("fopen() error");
return 0;
}
fseek(inF, 0, SEEK_END);
size = ftell(inF);
rewind(inF);
printf("In file size = %ld bytes.\n", size);
char* buf = malloc((size+1)*sizeof(char));
if (fread(buf, 1, size, inF) != size) {
puts("fread() error");
return 0;
}
fclose(inF);
buf[size] = '\0';
FILE *outF = fopen("lessColumns.csv", "w");
if (!outF) {
puts("fopen() error");
return 0;
}
int index = 0;
char* currComma = strchr(buf, ',');
fwrite(buf, 1, (int)(currComma-buf), outF);
int progress = 0;
while (currComma != NULL) {
index++;
index = (index%14 == 0) ? 0 : index;
progress++;
if (progress%1000 == 0) printf("%d\n", progress/1000);
int start = (int)(currComma-buf);
currComma = strchr(currComma+1, ',');
if (!currComma) break;
if ((index >= 3 && index <= 10) || index == 13) continue;
int end = (int)(currComma-buf);
int endMinusStart = end-start;
char* newEntry = malloc((endMinusStart+1)*sizeof(char));
strncpy(newEntry, buf+start, endMinusStart);
newEntry[end+1] = '\0';
if (index == 0) {
char* findNewLine = strchr(newEntry, '\n');
int newLinePos = (int)(findNewLine-newEntry);
char* modifiedNewEntry = malloc((strlen(newEntry)-newLinePos+1)*sizeof(char));
strcpy(modifiedNewEntry, newEntry+newLinePos);
fwrite(modifiedNewEntry, 1, strlen(modifiedNewEntry), outF);
}
else fwrite(newEntry, 1, end-start, outF);
}
fclose(outF);
return 0;
}
Edit: It turned out the problem was that the csv file had , in places I was not expecting which caused the logic to fail. I ended up writing a new parser that removes lines with the incorrect number of commas. It removed 243,875 lines (about 4% of the file). I'll post that code instead as it at least reflects some of the comments about free():
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char** argv) {
long size;
FILE* inF = fopen("allCrimes.csv", "rb");
if (!inF) {
puts("fopen() error");
return 0;
}
fseek(inF, 0, SEEK_END);
size = ftell(inF);
rewind(inF);
printf("In file size = %ld bytes.\n", size);
char* buf = malloc((size+1)*sizeof(char));
if (fread(buf, 1, size, inF) != size) {
puts("fread() error");
return 0;
}
fclose(inF);
buf[size] = '\0';
FILE *outF = fopen("uniformCommaCount.csv", "w");
if (!outF) {
puts("fopen() error");
return 0;
}
int numOmitted = 0;
int start = 0;
while (1) {
char* currNewLine = strchr(buf+start, '\n');
if (!currNewLine) {
puts("Done");
break;
}
int end = (int)(currNewLine-buf);
char* entry = malloc((end-start+2)*sizeof(char));
strncpy(entry, buf+start, end-start+1);
entry[end-start+1] = '\0';
int commaCount = 0;
char* commaPointer = entry;
for (; *commaPointer; commaPointer++) if (*commaPointer == ',') commaCount++;
if (commaCount == 14) fwrite(entry, 1, end-start+1, outF);
else numOmitted++;
free(entry);
start = end+1;
}
fclose(outF);
printf("Omitted %d lines\n", numOmitted);
return 0;
}
You're calling malloc but never free. Possibly you run out of memory, one of your mallocs returns NULL, and the subsequent call to str(n)cpy segfaults.
Adding free(newEntry); and free(modifiedNewEntry); immediately after the respective fwrite calls should solve your memory shortage.
Also note that inside your loop you compute offsets into the buffer buf, which contains the whole file. These offsets are held in variables of type int, whose maximum value on your system may be too small for the numbers you are handling. Overflowing an int is undefined behavior and in practice often yields a negative value, which is another possible cause of the segfault (negative offsets into buf take you to some address outside the buffer, possibly not even readable).
The malloc(3) function can (and sometimes does) fail.
At least code something like
/* needs <errno.h> for errno; size is a long, hence %ld */
char* buf = malloc(size+1);
if (!buf) {
fprintf(stderr, "failed to malloc %ld bytes - %s\n",
size+1, strerror(errno));
exit (EXIT_FAILURE);
}
And I strongly suggest clearing the successful result of a malloc with memset(buf, 0, size+1) (or otherwise using calloc), not only because the following fread could fail (which you are testing) but also to ease debugging and reproducibility.
The same goes for every other call to malloc or calloc: you should always test them for failure.
Notice that by definition sizeof(char) is always 1. Hence I removed it.
As others pointed out, you have a memory leak because you don't call free appropriately. A tool like valgrind could help.
You need to learn how to use the debugger (e.g. gdb). Don't forget to compile with all warnings and debugging information (e.g. gcc -Wall -g). And improve your code till you get no warnings.
Knowing how to use a debugger is an essential required skill when programming (particularly in C or C++). That debugging skill (and ability to use the debugger) will be useful in every C or C++ program you contribute to.
BTW, you could read your file line by line with getline(3) (which can also fail and you should test that).
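If getline is not available (it is POSIX, not standard C), a similar line-by-line filter can be sketched with fgetc and a growable buffer, so the 1.15GB file never has to sit in memory at once. filter_lines is a hypothetical name, and like the edited code in the question it simply counts every comma, including any inside quoted fields:

```c
#include <stdio.h>
#include <stdlib.h>

/* Copy to out every line of in that contains exactly wanted commas.
 * Returns the number of lines omitted, or -1 on allocation or I/O error. */
long filter_lines(FILE *in, FILE *out, int wanted)
{
    size_t cap = 256, len = 0;
    char *line = malloc(cap);
    long omitted = 0;
    int commas = 0;

    if (!line)
        return -1;

    for (;;) {
        int c = fgetc(in);
        if (c != EOF) {
            if (len == cap) {             /* grow the line buffer */
                char *bigger = realloc(line, cap * 2);
                if (!bigger) {
                    free(line);
                    return -1;
                }
                line = bigger;
                cap *= 2;
            }
            line[len++] = (char)c;
            if (c == ',')
                commas++;
        }
        /* A newline, or EOF after a partial line, ends the current line. */
        if (c == '\n' || (c == EOF && len > 0)) {
            if (commas != wanted) {
                omitted++;
            } else if (fwrite(line, 1, len, out) != len) {
                free(line);
                return -1;
            }
            len = 0;
            commas = 0;
        }
        if (c == EOF)
            break;
    }
    free(line);
    return ferror(in) ? -1 : omitted;
}
```

With the file names from the question, this would be called as filter_lines(inF, outF, 14) after both files are opened; only one line is held in memory at a time, and nothing is left to leak inside the loop.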

Read comma-separated "quoted" strings from a file

I am new to C programming but have a bit of Java knowledge, so I want to write a program that reads strings stored in a file, possibly several names separated by comma, such as "boy","girl","car" etc. In Java I would use something like, string str[]=str1.split(" ");.
So I came up with several codes each time but none seems to work, here is my most recent code:
fscanf(fp,"%[^\n]",c);
But this essentially prints the whole line till a new line is found. I have also tried using
fscanf(fp,"%[^,]",c);
And if I use gets() it only gets the first string and ignores all others from the first comma.
This didn't give any reasonable output, it rather gave some minute(encoded) characters.
Please can anyone help me with how to pick out string values separated by comma and in quotes
You can use the strtok() function (string.h) for this task. Store the file data in a sufficiently large string and apply
str = strtok(full_file_string, ",");
while (NULL != str)
{
/* print the current token here, or save it, e.g. in a two-dimensional array of characters */
str = strtok(NULL, ",");
}
For further reference see the man page of strtok.
Hope this might help you :)
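For the quoted names in the question, the strtok approach might be fleshed out like this. It is a sketch only; split_quoted is a made-up helper. Note that strtok modifies the string it is given, silently skips empty fields, and would split on a comma that appears inside a quoted name:

```c
#include <stdio.h>
#include <string.h>

/* Split line in place on commas and strip one pair of surrounding double
 * quotes from each token. Stores up to max pointers in out and returns
 * how many were stored. The pointers refer into line itself. */
size_t split_quoted(char *line, char **out, size_t max)
{
    size_t count = 0;
    char *tok;

    for (tok = strtok(line, ",\n"); tok != NULL && count < max;
         tok = strtok(NULL, ",\n")) {
        size_t len = strlen(tok);
        if (len >= 2 && tok[0] == '"' && tok[len - 1] == '"') {
            tok[len - 1] = '\0';   /* drop the closing quote */
            tok++;                 /* skip the opening quote */
        }
        out[count++] = tok;
    }
    return count;
}
```

Given a line read with fgets, split_quoted(line, words, 8) fills words with up to eight names, quotes removed.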
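fscanf doesn't work with regular expressions, but rather with conversion specifications. You describe the pattern you expect, and fscanf consumes the next piece of input that matches it. Note that %s would not work here: it reads up to the next whitespace, so it would swallow the closing quote and the commas as well. A scanset does what you want (the size 64 is arbitrary, chosen for illustration):
char word[64];
.
.
.
while (fscanf(fp, " \"%63[^\"]\"%*[,]", word) == 1)
{
// Do something with your word.
}
Here each call reads one string between two quotes. The literal \" parts match the quotes without storing them, %63[^\"] stores everything up to the closing quote (at most 63 characters, so that it fits in word), and %*[,] skips the separating comma. On successive calls fscanf moves on to the next match. When the input is exhausted or no longer matches the pattern, fscanf returns something other than 1 and the loop ends.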
The example below extracts the substrings. The format of your file should be something like:
"boy","girl","car",
Notice that the line in the file should end with ','
int read_file_with_string_tokens() {
char * tocken;
char astring[127];
int current = 0;
int limit;
char *filebuffer = NULL;
FILE *file = fopen("your/file/path/and/name", "r");
if (file != NULL) {
fseek(file, 0L, SEEK_END);
int f_size = ftell(file);
fseek(file, 0L, SEEK_SET);
filebuffer = (char*) malloc(f_size + 2);
if (filebuffer == NULL) {
fclose(file);
return -1;
}
memset(filebuffer, 0, f_size + 2);
if (fgets(filebuffer, f_size + 1, file) == NULL) {
fclose(file);
free(filebuffer);
return -1;
}
fclose(file);
memset(astring, 0, 127);
char *result = NULL;
tocken = strchr(filebuffer, ',');
while (tocken != NULL) {
limit = tocken - filebuffer + 1;
strncpy(astring, &filebuffer[current], limit - current - 1);
printf("%s" , astring);
current = limit;
tocken = strchr(&filebuffer[limit], ',');
memset(astring, 0, 127);
}
free(filebuffer);
}
return 0;
}
#include <stdio.h>
int main(void){
char line[128];
char word[32];
FILE *in, *out;
int line_length;
in = fopen("in.txt", "r");
out = fopen("out.txt", "w");
while(1==fscanf(in, "%[^\n]%n\n", line, &line_length)){//read one line
int pos, len;
for(pos=0;pos < line_length-1 && 1==sscanf(line + pos, "%[^,]%*[,]%n", word, &len);pos+=len){
fprintf(out, "%s\n", word);
}
}
fclose(out);
fclose(in);
return 0;
}
/* output result out.txt
"boy"
"girl"
"car"
...
*/

Resources