C - fscanf to read two ints and then strings - c

I need to store in a text file two integers and then the lines of a text. i've successfully done it by writing each int in a line and each line of the text in a new line too. In order to read it, however, I've found some troubles. I'm doing this:
FILE *f = fopen(arquivo, "r");
char *lna = NULL;
fscanf(f, "%d\n%d\n", &maxCol, &maxLin);
//↑This reads the two ints, works fine in step-by-step
for (;;) {
fscanf(f, "%s\n", &lna);
//↑This sets lna to NULL always, even if there are more lines
if (lna != NULL)
lna[strlen(lna) - 1] = '\0';
if (feof(f))
break;
inserirApos(lista, lna, &atual);
}
fclose(f);
I've tryied a few different ways, but they never worked. I understand I can read everthing like strings, with gets or something, but I think that has a problem if the string contains spaces. I wanted to know if the way I'm doing is the best, and what's wrong with it. I've found one of these methods (that didn't work either) that you have to pass the maximum length of each line. I know this information if necessary, it's the maxCol I read before.

fscanf(f, "%s\n", &lna);
Is the wrong argument type. The %s format expects a char* as argument, but you gave it a char**. And you have not allocated memory to that pointer. fscanf expects a char* pointing to a large enough memory area.
char *lna = malloc(whatever_you_need);
...
fscanf("%s ", lna);
(no difference between the '\n' and ' ' in the fscanf format. both consume the entire whitespace following the string of non-whitespace characters scanned int lna.)

You seem to be expecting fscanf() to dynamically allocate strings for you; that's not at all how it works. This is undefined behavior.

You need to allocate the space for lna first.
char *lna = malloc(MAX_SIZE);//MAX_SIZE is the maximum size the string can be + 1
The additional arguments should point to already allocated objects of the type specified by their corresponding format specifier within the format string.

Related

How to read and print hexadecimal numbers from a file in C

I'm trying to read 14 digit long hexadecimal numbers from a file and then print them. My idea is to use a long long int and read the lines from the files with fscanf as if they were strings and then turn the string into a hex number using atoll. The problem is I am getting a seg value on my fscanf line according to valgrind and I have absolutely no idea why. Here is the code:
#include<stdio.h>
int main(int argc, char **argv){
if(argc != 2){
printf("error argc!= 2\n");
return 0;
}
char *fileName = argv[1];
FILE *fp = fopen( fileName, "r");
if(fp == NULL){
return 0;
}
long long int num;
char *line;
while( fscanf(fp, "%s", line) == 1 ){
num = atoll(line);
printf("%x\n", num);
}
return 0;
}
Are you sure you want to read your numbers as character strings? Why not allow the scanf do the work for you?
long long int num;
while( fscanf(fp, "%llx", &num) == 1 ){ // read a long long int in hex
printf("%llx\n", num); // print a long long int in hex
}
BTW, note the ll size specifier to %x conversion in printf - it defines the integer value will be of long long type.
Edit
Here is a simple example of two loops reading a 3-line input (with two, no and three numbers in consecutive lines) with a 'hex int' format and with a 'string' format:
http://ideone.com/ntzKEi
A call to rewind allows the second loop read the same input data.
That line variable is not initialized, so when fscanf() dereferences it you get undefined behavior.
You should use:
char line[1024];
while(fgets(line, sizeof line, fp) != NULL)
To do the loading.
If you're on C99, you might want to use uint64_t to hold the number, since that makes it clear that 14-digit hexadecimal numbers (4 * 14 = 56) will fit.
The other answers are good, but I want to clarify the actual reason for the crash you are seeing. The problem is that:
fscanf(fp, "%s", line)
... essentially means "read a string from a file, and store it in the buffer pointed at by line". In this case, your line variable hasn't been initialised, so it doesn't point anywhere. Technically, this is undefined behavior; in practice, the result will often be that you write over some arbitrary location in your process's address space; furthermore, since it will often point at an illegal address, the operating system can detect and report it as a segment violation or similar, as you are indeed seeing.
Note that fscanf with a %s conversion will not necessarily read a whole line - it reads a string delimited by whitespace. It might skip lines if they are empty and it might read multiple strings from a single line. This might not matter if you know the precise format of the input file (and it always has one value per line, for instance).
Although it appears in that case that you can probably just use an appropriate modifier to read a hexadecimal number (fscanf(fp, "%llx", &num)), rather than read a string and try to do a conversion, there are various situations where you do need to read strings and especially whole lines. There are various solutions to that problem, depending on what platform you are on. If it's a GNU system (generally including Linux) and you don't care about portability, you could use the m modifier, and change line to &line:
fscanf(fp, "%ms", &line);
This passes a pointer to line to fscanf, rather than its value (which is uninitialised), and the m causes fscanf to allocate a buffer and store its address in line. You then should free the buffer when you are done with it. Check the Glibc manual for details. The nice thing about this approach is that you do not need to know the line length beforehand.
If you are not using a GNU system or you do care about portability, use fgets instead of fscanf - this is more direct and allows you to limit the length of the line read, meaning that you won't overflow a fixed buffer - just be aware that it will read a whole line at a time, unlike fscanf, as discussed above. You should declare line as a char-array rather than a char * and choose a suitable size for it. (Note that you can also specify a "maximum field width" for fscanf, eg fscanf(fp, "%1000s", line), but you really might as well use fgets).

Why in C, when reading files, do we have to use a character array?

I've literally just started programming in C. Coming from a little understanding of Python.
Just had a lecture on C, the lecture was about this:
#include <stdio.h>
int main() {
FILE *file;
char name[10], degree[5];
int mark;
file = fopen("file.txt", "r");
while (fscan(file("%s %s %d", name, degree, &mark) != EOF);
printf("%s %s %d", name, degree, mark);
fclose(file);
}
I'm specifically asking why we the lecturer would have used an array rather than just declaring two string variables. Searching for a deeper answer than just, "that's just C for you".
There are multiple typos on this line:
while (fscan(file("%s %s %d", name, degree, &mark) != EOF);
printf("%s %s %d", name, degree, mark);
It should read:
while (fscanf(file, "%s %s %d", name, degree, &mark) == 3)
printf("%s %s %d", name, degree, mark);
Can you spot the 4 mistakes?
The function is called fscanf
file is an argument followed by ,, not a function name followed by (
you should keep looping for as long as fscanf converts 3 values. If conversion fails, for example because the third field is not a number, it will return a short count, not necessarily EOF.
you typed an extra ; after the condition. This is parsed as an empty statement: the loop keeps reading until end of file, doing nothing, and finally executes printf just once, with potentially invalid arguments.
The programmer uses char arrays and passes their address to fscanf. If he had used pointers (char *), he would have needed to allocate memory to make them point to something and pass their values to fscanf, a different approach that is not needed for fixed array sizes.
Note that the code should prevent potential buffer overflows by specifying the maximum number of characters to store into the arrays:
while (fscanf(file, "%9s %4s %d", name, degree, &mark) == 3) {
printf("%s %s %d", name, degree, mark);
}
Note also that these hard-coded numbers must match the array sizes minus 1, an there is no direct way to pass the array sizes to fscanf(), a common source of bugs when the code is modified. This function has many quirks and shortcomings, use with extreme care.
Take name[] for example. It's an array of chars, a collection of chars if you like.
There is no string type in c, so we use an array an array of chars when we want to use a string.
The code is written as such, so that we can read the actual string in a line of the file, in our array.
As a side note, this program will produce syntax errors.
What you have in every non-empty file is a series of bytes. Therefor, what your C program has to do is read bytes. Since the variable type char is used to represent a byte, and since you want to read multiple bytes at once for efficiency purposes, you read an array of chars. That's for the general understanding of what reading from a file means.
Going back to your example, there is no string type in C. A string is an array of bytes (char[]) ended by a nul character.
What the lecturer is doing is:
define an array of chars (which will contain a string) => char name[10] is an array that may contain a string of at most 9 characters (the last byte would be used for '\0').
ask fscanf to read a string, which means it will read multiple bytes (multiple chars) until it finds a nul character ('\0') and put all of that in the given array.
To understand what's happening, forget about string as an opaque data type (which might be true in other programming languages) and see them as what they really are: arrays of chars.

What happens with extra memory using fscanf?

I'm new to C and I have a couple questions about fscanf. I wrote a simple program that reads the contents of a file and spits it back out on the command line:
#include <stdio.h>
#include <stdlib.h>
int main (int argc, char* argv[1])
{
if (argc != 2)
{
printf("Usage: fscanf txt\n");
return 1;
}
char* txt = argv[1];
FILE* fp = fopen(txt, "r");
if (fp == NULL)
{
printf("Could not open %s.\n", txt);
return 2;
}
char s[50];
while (fscanf(fp, "%49s", s) == 1)
printf("%s\n", s);
return 0;
}
Let's say the contents of my text file is just "C is cool.", which will output:
C
is
cool.
So I have two questions here:
1) Does fscanf assume that the placeholder "%s" will be a single word (an array of chars only)? According to this program's output, spaces and line breaks seem to prompt the function to return. But what if I wanted to read a whole paragraph? Would I use fread() instead?
2) More importantly I'm wondering what happens with all of the unused space in the array. On the first iteration, I think s[0] = "C" and s[1] = "\0", so are s[2] - s[49] just wasted?
EDIT: while (fscanf(fp, "%**49**s", s) == 1) - thanks to #M Oehm for pointing this out - enforcing strong limit here to prevent dangerous buffer overflows
1) Does fscanf assume that the placeholder "%s" will be a single word
(an array of chars only)? According to this program's output, spaces
and line breaks seem to prompt the function to return. But what if I
wanted to read a whole paragraph? Would I use fread() instead?
The %s specifier reads single words that are delimited by white space. The scanf family of functions are very cerude; they do not normally distinguish between line breaks and spaces, for example.
A line is anything up to the next newline. There is no concept of paragraph, but you might consider anything between blank lines a paragraph. The function to read lines of text is fgets, so you could read lines until you find an empty one. (fgets retains the newline at the end, mind.)
fread is a function for reading binary data. It is not useful for reading structured texts. (But it can be used to read the contents of a whole text file at once.)
2) More importantly I'm wondering what happens with all of the unused
space in the array. On the first iteration, I think c[0] = 'C' and
c[1] = '\0', so are c[2] - c[49] just wasted?
You are right, the data after the null ternimator isn't used. "Wasted" is too negative – with user input you don't know whether you encounter a longer word eventually. Because dynamic allocation requires some care in C, allocating "enogh for most cases" is a goopd practice in C. You should enforce the hard limit when reading, though, to prevent buffer overruns:
fscanf(fp, "%49s", s)
The issue of "wasted" memory becomes more serious if you have an array of arrays of 50 chars. Most of the words will be much shorter than 50 chars. Here, the extra memory might eventually hurt you. 48 extra characters for reading a line are okay, though.
(A strategy to save "compact" arrays of chars is to have a running array of chars that is a concatenation of all strings, including their terminators. The word array is then an array of piointers into that master string.)
You use specifier %s which will read and store data in array s until it encounters a space or newline . As soon as it encounters space fscanf returns.
I think c[0] = "C" and c[1] = "\0", so are c[2] - c[49] just wasted?
Yes , s[0]='C' and s[1]='\0' and you probably can't do anything about the size of array being much more.
If you want complete string "C is cool" stored in array use fgets.
#define len 1000
char s[len];
while(fgets(s,len,fp)!=NULL) {
//your code
}

fscanf not scanning file

I am trying to scan a file using fscanf and put the string into an char array of size 20 as follows:
char buf[20];
fscanf(fp, "%s", buf);
The file fp currently contains: 1 + 23.
I am setting a pointer to the first element in buf as follows:
char *p;
p = buf;
Printing buf, printf("%s", buf) yields only 1. Trying to increment p and printing prints out rubbish as well (p++; printf("%c", *p)).
What am I doing wrong with fscanf here? Why isn't it reading the whole string from the file?
fscanf (and related functions) with the format-string "%s" will try to read as many characters as it can without including a whitespace, in this case it will find the first character (1) and store it, then it will hit a space () and therefore stop searching.
If you'd like to read the whole line at once consider using fgets, it is also safer to use since you need to specify the size of your destination buffer as one of it's arguments.
fgets will try to read at maximum length-of-buffer minus 1 characters (last byte is saved for the trailing null-byte), it will stop at either reading that many characters, hitting a new-line or the end of the file.
fgets (buf, 20, fp);
Links to documentation
codecogs.com - scanf, fscanf and related functions - <stdio.h>
codecogs.com - fgets - <stdio.h>

Looping until a specific string is found

I have a simple question. I want to write a program in C that scans the lines of a specific file, and if the only phrase on the line is "Atoms", I want it to stop scanning and report which line it was on. This is what I have and is not compiling because apparently I'm comparing an integer to a pointer: (of course "string.h" is included.
char dm;
int test;
test = fscanf(inp,"%s", &dm);
while (test != EOF) {
if (dm=="Amit") {
printf("Found \"Atoms\" on line %d", j);
break;
}
j++;
}
the file was already opened with:
inp = fopen( .. )
And checked to make sure it opens correctly...
I would like to use a different approach though, and was wondering if it could work. Instead of scanning individual strings, could I scan entire lines as such:
// char tt[200];
//
// fgets(tt, 200, inp);
and do something like:
if (tt[] == "Atoms") break;
Thanks!
Amit
Without paying too much attention to your actual code here, the most important mistake your making is that the == operator will NOT compare two strings.
In C, a string is an array of characters, which is simply a pointer. So doing if("abcde" == some_string) will never be true unless they point to the same string!
You want to use a method like "strcmp(char *a, char *b)" which will return 0 if two strings are equal and something else if they're not. "strncmp(char *a, char *b, size_t n)" will compare the first "n" characters in a and b, and return 0 if they're equal, which is good for looking at the beginning of strings (to see if a string starts with a certain set of characters)
You also should NOT be passing a character as the pointer for %s in your fscanf! This will cause it to completely destroy your stack it tries to put many characters into ch, which only has space for a single character! As James says, you want to do something like char ch[BUFSIZE] where BUFSIZE is 1 larger than you ever expect a single line to be, then do "fscanf(inp, "%s", ch);"
Hope that helps!
please be aware that dm is a single char, while you need a char *
more: if (dm=="Amit") is wrong, change it in
if (strcmp(dm, "Amit") == 0)
In the line using fscanf, you are casting a string to the address of a char. Using the %s in fscanf should set the string to a pointer, not an address:
char *dm;
test = fscanf(inp,"%s", dm);
The * symbol declares an indirection, namely, the variable pointed to by dm. The fscanf line will declare dm as a reference to the string captured with the %s delimiter. It will point to the address of the first char in the string.
What kit said is correct too, the strcmp command should be used, not the == compare, as == will just compare the addresses of the strings.
Edit: What kit says below is correct. All pointers should be allocated memory before they are used, or else should be cast to a pre-allocated memory space. You can allocate memory like this:
dm = (char*)malloc(sizeof(char) * STRING_LENGTH);
where STRING_LENGTH is a maximum length of a possible string. This memory allocation only has to be done once.
The problem is you've declared 'dm' as a char, not a malloc'd char* or char[BUFSIZE]
http://www.cplusplus.com/reference/clibrary/cstdio/fscanf/
You'll also probably report incorrect line numbers, you'll need to scan the read-in buffer for '\n' occurences, and handle the case where your desired string lies across buffer boundaries.

Resources