sscanf() skipping delimiter & width - c

I am writing an assembler for 6502 and trying to read the instructions and opcode data from a file I prepared. Using sscanf to store data and it only worked partially...
File:
ADC,Im,69,2
ADC,ZP,65,2
ADC,ZPx,75,2
ADC,Ab,6D,3
ADC,Abx,7D,3
ADC,Aby,79,3
...
Here is only part of the code related to the problem. fgets works fine. Problem line commented below. Will upload more if needed.
Code:
FILE *fp = ...
char bf[15];
char name[3];
char mode[3];
char op[2];
int bytes;
while (fgets(bf,15,fp)) {
//below is the problem line
sscanf(bf, "%3[^,],%3[^,],%2[^,],%d", name, mode, op, &bytes);
}
printf("%s,%s,%s,%d\n", name, mode, op, bytes);
Output:
ADCIm,Im,69,2
ADCZP,ZP,65,2
ADCZPx75,ZPx75,75,2
ADCAb,Ab,6D,3
ADCAbx7D,Abx7D,7D,3
...
Expected to be (just like the file format):
ADC,Im,69,2
ADC,ZP,65,2
ADC,ZPx75,75,2
ADC,Ab,6D,3
ADC,Abx7D,7D,3
...
It seems that op and bytes all work alright, but there are something wrong with the name and mode variables, even though I contained the width and delimiter in the argument.

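Actually, it's not sscanf that is at fault. You have undefined behavior due to buffer overruns: %3[^,] stores up to three characters plus a terminating '\0', so a char[3] is one byte too small (likewise char op[2] for %2[^,]). The terminator lands just past the array and can be overwritten by the next conversion. printf then knows nothing about the fact that your strings are not null-terminated, and it will continue printing characters until it finds a '\0'.
The real fix is to declare the arrays one byte larger (char name[4], mode[4], op[3]). To merely stop printf from running past the end, you can supply a maximum field width (a precision) in the format specifier: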
printf("%.3s,%.3s,%.2s,%d\n", name, mode, op, bytes);
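A minimal sketch of the buffer-size fix, assuming the record layout shown in the question. The helper name parse_record is hypothetical and not from the original post; the format string is the asker's own.

```c
#include <stdio.h>

/* Hypothetical helper (not from the question): parse one
   "NAME,MODE,OP,BYTES" record. Returns 1 on success, 0 otherwise.
   Each buffer is one byte larger than its field width because
   sscanf appends a '\0' after the characters it stores. */
int parse_record(const char *line, char name[4], char mode[4],
                 char op[3], int *bytes)
{
    return sscanf(line, "%3[^,],%3[^,],%2[^,],%d",
                  name, mode, op, bytes) == 4;
}
```

Inside the fgets loop this would be called as parse_record(bf, name, mode, op, &bytes) with char name[4], mode[4], op[3], after which a plain "%s,%s,%s,%d\n" format is safe to print.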

Related

puts and printf do not give out full text (text containing CJK characters), when the text is read from a local file, on Windows, MSVC

The text contains:
..... (some characters can't be posted on SO)
xxxxxxxx=xxx xxxxxxx=xxxxx://xxx..xxx/xxxxx/xx9528994
(for full text & data please see https://github.com/ggaarder/snippets/raw/master/x.txt)
which ends in xxxxx://xxx..xxx/xxxxx/xx9528994. However, when I read it and then call puts, it only gives out
..... (some characters can't be posted on SO)
xxxxxxxx=xxx xxxxxxx=xxxxx:/
which only prints up to xxxxx:/, and /xxx..xxx/xxxxx/xx9528994 is missing.
Code to test:
#include <stdio.h>
int main(void)
{
char s[30000];
FILE *f = fopen("x.txt", "r");
fread(s, sizeof(s), 1, f);
puts(s);
return 0;
}
The buffer size 30000 is adequate. x.txt is 1049 bytes.
You can download x.txt at https://github.com/ggaarder/snippets/raw/master/x.txt, for convenience I have packed everything to https://github.com/ggaarder/snippets/raw/master/foo.zip.
It would be very kind of you to download and take a look at x.txt, since most of it can't be posted on SO because of the special characters, including some CJK.
Attempts:
The whole file is read properly. @pmg notices that fread returns zero, while @Someprogrammerdude points out that if fread's size and count arguments are swapped fread returns 1049, and this supports the guess.
If the CJK letters are removed, the output will be totally OK. So I think there is no '\0' in the middle.
By adding
ret = puts(s);
printf("\nret: %d, %s", ret, strerror(errno));
We will get ret: 0, No error. puts returns zero and there's nothing in errno.
You may notice that there's a leading \n in the printf format above. Yes, puts doesn't give out the newline as usual. Does this suggest that puts failed?
But why does it return zero, with nothing in errno?
Could it be related to Windows NT cmd? Maybe some special terminal control characters are unintentionally output.
Reading with "rb" gives the same result. x.txt is an XML text; for convenience I removed the parts of it that are irrelevant, so it looks like spam.
I guess this is just yet another encoding issue, plus some magical secret Windows commandline control sequence .... I'm not taking it. I will just erase all non-ASCII characters.
The order of the "size" and "count" arguments to fread is crucial.
The first argument is the "element" size, and the second argument is the number of elements to attempt to read.
In the case of a text file, the element size is a single character, usually a single byte. The number of elements to attempt to read is the size of the destination array.
So your call should be
fread(s, 1, sizeof s, f);
instead.
What happens now when you have the opposite is that you say that the "element" size is 30000 bytes, and that fread should read one such element. Since the size of the file is less than 30000 bytes, it just can't read even a single element, and returns 0 to indicate it.
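The difference in return values can be checked directly. The following is a small sketch; try_fread is a hypothetical helper introduced only for illustration, and the caller is assumed to pass a buffer of at least size * count bytes.

```c
#include <stdio.h>

/* Hypothetical helper: open path, issue one fread with the given
   element size and element count, and return what fread reports.
   buf must have room for size * count bytes. Returns 0 if the file
   cannot be opened. */
size_t try_fread(const char *path, size_t size, size_t count, char *buf)
{
    FILE *f = fopen(path, "rb");
    size_t got;

    if (f == NULL)
        return 0;
    got = fread(buf, size, count, f);  /* number of COMPLETE elements read */
    fclose(f);
    return got;
}
```

For a 10-byte file and a 100-byte buffer, try_fread(path, 1, 100, buf) reports 10 elements of one byte each, while try_fread(path, 100, 1, buf) reports 0 because not even one 100-byte element is available. The bytes are still copied into the buffer in both cases; only the reported count differs.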
Open the file in binary mode, switch the arguments, and check the return value of fread():
#include <stdio.h>
#include <stdlib.h>
int main(void) {
    char s[30000];
    FILE *f = fopen("x.txt", "rb"); // binary mode
    if (f == NULL) {
        perror("fopen");
        exit(EXIT_FAILURE);
    }
    // switch args, check value; leave one byte for the terminator
    size_t len = fread(s, 1, sizeof(s) - 1, f);
    if (len < 1) {
        perror("bad fread");
        exit(EXIT_FAILURE);
    }
    s[len] = 0; // properly terminate s
    puts(s);
    fclose(f);
    return 0;
}
It's just yet another everyday encoding issue. Call SetConsoleOutputCP(65001), compile with /utf-8, or set the execution character set with a #pragma, and everything will be fine.

Changing char* string in a C loop

I am trying to change a string within a loop to be able to save my images with a changing variable. Code snippet is as follows:
for (frames=1; frames<=10; frames++)
{
char* Filename = "NEWIMAGE";
int Save_Img = is_SaveImageMemEx (hCam, Filename, pMem, memID,
IS_IMG_PNG, 100);
printf("Status Save %d\n",Save_Img);
}
What I want to do is put a variable that changes with the loop counter inside Filename so my saved file changes name with every iteration.
Any help would be great.
Create a file name string with sprintf and use the %d format conversion specifier for an int:
char filename[32];
sprintf(filename, "NEWIMAGE-%d", frames);
sprintf works just like printf, but "prints" to a string instead of stdout.
If you declared frames as an unsigned int, use %u. If it is a size_t use %zu. For details see your friendly printf manual page, which will tell you how you can for example zero pad the number.
Be sure that the character array you write to is large enough to hold the longest output plus an extra '\0' character. In your particular case NEWIMAGE-10 + 1 means 11 + 1 = 12 characters is enough, but 32 is future-proof for some time.
If you want to program like a pro, look at the snprintf and asprintf functions, which can limit or allocate the memory written to, respectively. (asprintf is a GNU/BSD extension rather than standard C.)
You can use sprintf to create a formatted string:
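As a sketch of the bounded variant, snprintf takes the buffer size and returns the length the full output would have had, which makes truncation detectable. The helper name make_filename and the three-digit zero padding (%03d) are assumptions chosen for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: build "NEWIMAGE-<frame>" with the frame number
   zero-padded to three digits. Returns 1 if the whole name fit into
   out (including the '\0'), 0 if it was truncated or an error occurred. */
int make_filename(char *out, size_t cap, int frame)
{
    int n = snprintf(out, cap, "NEWIMAGE-%03d", frame);
    return n >= 0 && (size_t)n < cap;
}
```

In the loop from the question this would be used as make_filename(filename, sizeof filename, frames) before passing filename to is_SaveImageMemEx.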
char Filename[50];
sprintf(Filename, "NEWIMAGE%d", frames);

How to read and print hexadecimal numbers from a file in C

I'm trying to read 14-digit hexadecimal numbers from a file and then print them. My idea is to use a long long int, read the lines from the file with fscanf as if they were strings, and then turn each string into a hex number using atoll. The problem is that I am getting a segfault on my fscanf line according to valgrind, and I have absolutely no idea why. Here is the code:
#include<stdio.h>
int main(int argc, char **argv){
if(argc != 2){
printf("error argc!= 2\n");
return 0;
}
char *fileName = argv[1];
FILE *fp = fopen( fileName, "r");
if(fp == NULL){
return 0;
}
long long int num;
char *line;
while( fscanf(fp, "%s", line) == 1 ){
num = atoll(line);
printf("%x\n", num);
}
return 0;
}
Are you sure you want to read your numbers as character strings? Why not let scanf do the work for you?
unsigned long long num;
while( fscanf(fp, "%llx", &num) == 1 ){ // read an unsigned long long in hex
    printf("%llx\n", num); // print an unsigned long long in hex
}
BTW, note the ll length modifier on the %x conversion in printf: it tells printf that the argument is of long long size. Strictly, %llx expects an unsigned long long, which is why num is declared that way here.
Edit
Here is a simple example of two loops reading a 3-line input (with two, no and three numbers in consecutive lines) with a 'hex int' format and with a 'string' format:
http://ideone.com/ntzKEi
A call to rewind lets the second loop read the same input data.
That line variable is not initialized, so when fscanf() dereferences it you get undefined behavior.
You should use:
char line[1024];
while(fgets(line, sizeof line, fp) != NULL)
To do the loading.
If you're on C99, you might want to use uint64_t to hold the number, since that makes it clear that 14-digit hexadecimal numbers (4 * 14 = 56) will fit.
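If the lines are read as strings with fgets, the conversion still has to handle hexadecimal; atoll only parses decimal digits. A minimal sketch using strtoull with base 16 follows. The helper name parse_hex_line is hypothetical.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: convert the hexadecimal number at the start of
   line into *out. Returns 1 on success, 0 if no hex digits were found
   or the value does not fit into an unsigned long long. */
int parse_hex_line(const char *line, uint64_t *out)
{
    char *end;
    unsigned long long v;

    errno = 0;
    v = strtoull(line, &end, 16);      /* base 16, leading whitespace skipped */
    if (end == line || errno == ERANGE)
        return 0;
    *out = (uint64_t)v;
    return 1;
}
```

Inside the fgets loop, each line would be passed to parse_hex_line(line, &num), and the result can be printed with the PRIx64 macro from <inttypes.h>.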
The other answers are good, but I want to clarify the actual reason for the crash you are seeing. The problem is that:
fscanf(fp, "%s", line)
... essentially means "read a string from a file, and store it in the buffer pointed at by line". In this case, your line variable hasn't been initialised, so it doesn't point anywhere. Technically, this is undefined behavior; in practice, the result will often be that you write over some arbitrary location in your process's address space; furthermore, since it will often point at an illegal address, the operating system can detect and report it as a segment violation or similar, as you are indeed seeing.
Note that fscanf with a %s conversion will not necessarily read a whole line - it reads a string delimited by whitespace. It might skip lines if they are empty and it might read multiple strings from a single line. This might not matter if you know the precise format of the input file (and it always has one value per line, for instance).
Although it appears in that case that you can probably just use an appropriate modifier to read a hexadecimal number (fscanf(fp, "%llx", &num)), rather than read a string and try to do a conversion, there are various situations where you do need to read strings and especially whole lines. There are various solutions to that problem, depending on what platform you are on. If it's a GNU system (generally including Linux) and you don't care about portability, you could use the m modifier, and change line to &line:
fscanf(fp, "%ms", &line);
This passes a pointer to line to fscanf, rather than its value (which is uninitialised), and the m causes fscanf to allocate a buffer and store its address in line. You then should free the buffer when you are done with it. Check the Glibc manual for details. The nice thing about this approach is that you do not need to know the line length beforehand.
If you are not using a GNU system or you do care about portability, use fgets instead of fscanf - this is more direct and allows you to limit the length of the line read, meaning that you won't overflow a fixed buffer - just be aware that it will read a whole line at a time, unlike fscanf, as discussed above. You should declare line as a char-array rather than a char * and choose a suitable size for it. (Note that you can also specify a "maximum field width" for fscanf, eg fscanf(fp, "%1000s", line), but you really might as well use fgets).
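The line-at-a-time behavior of fgets can be sketched as follows. The helper name read_line is hypothetical; stripping the trailing newline with strcspn is one common idiom, not the only option.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line (at most cap - 1 characters) from
   fp into buf and strip the trailing newline, if any. Returns 1 if a
   line was read, 0 at end of file or on a read error. Unlike
   fscanf("%s"), an empty line is returned as an empty string rather
   than skipped. */
int read_line(FILE *fp, char *buf, int cap)
{
    if (fgets(buf, cap, fp) == NULL)
        return 0;
    buf[strcspn(buf, "\n")] = '\0';
    return 1;
}
```

A line longer than cap - 1 characters is delivered in several pieces by successive calls, so the buffer should be sized generously for the expected input.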

Get how many characters sscanf has read? (%n)

I have a string structured such as "[first something]=[second something]"
I think sscanf would be a way to separate them!
However, sscanf never reports the offset properly with %n.
The line of code is something very much like:
char data[100];
char source[] = "username=katy"
int offset=-1;
sscanf([source],"%[^=],%s%n",data,&offset)
printf("sscanf is reporting %s with an offset of %i\n"
)
but the output always looks like:
sscanf is reporting username with an offset of -1
Could someone be so kind as to clear this up for me?
(Yes, I know this leaves us prone to a buffer overflow error - that is guaranteed against a little earlier in the code...)
Your comma in the scanf format string makes no sense. After %[^=] has consumed "username", the next input character is '=', which does not match the literal comma, so sscanf stops there and never reaches %n; that is why offset keeps its initial value of -1. Instead of "%[^=],%s%n", try "%[^=]=%s%n". You should also put field width limits on both of the strings or else you could overflow the destination buffers, and you've passed too few arguments to sscanf (only one of the strings, not the other one).
A corrected version of the code might look like:
char key[100], data[100];
char source[] = "username=katy";
int offset = -1;
sscanf(source, "%99[^=]=%99s%n", key, data, &offset);
printf("sscanf is reporting %s with an offset of %i\n", data, offset);
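Since %n does not count toward sscanf's return value, success is best judged by the number of converted fields. A small sketch, with split_pair as a hypothetical helper name:

```c
#include <stdio.h>

/* Hypothetical helper: split "key=value" into two buffers of 100 bytes
   each. Returns the number of characters consumed from src (as stored
   by %n), or -1 if the input does not have the form key=value. */
int split_pair(const char *src, char key[100], char val[100])
{
    int offset = -1;

    /* Two conversions must succeed; %n itself is not counted. */
    if (sscanf(src, "%99[^=]=%99s%n", key, val, &offset) != 2)
        return -1;
    return offset;
}
```

For "username=katy" this yields key "username", value "katy", and an offset of 13 (8 characters, the '=', and 4 more characters).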

Write and Display a stream with format on C

I'm in my first year of Computer Sciences and I have to design a procedure that writes to a file with format (fprintf) and displays it with format (fscanf). But I can't get it to run properly; it compiles but when it gets to the fscanf part, it crashes. I've been looking around reference sites, YouTube videos and so on, but I can't get it to work.
Except for the last 2 lines of code, it does everything fine. It's capable of writing the records I enter to the .txt file. The problem is with the use of fscanf itself.
void write_with_format()
{
char name_of_file[100] = "grades.txt";
FILE *arch;
arch = fopen (name_of_file, "a");
char name[50];
char career[50];
char grades[100];
char total;
printf("Give me the name");
gets(name);
printf("Give me the career");
gets(career);
printf("Give me the grade");
gets(grades);
getchar();
fprintf (arch, "%s,%s,%s\n",name,career,grades);
fscanf(arch,"%s %s %f",&name,&career,&grades);
printf("%s %s %f",name,career,grades);
}
I'd appreciate any help regarding my code or the proper use of fscanf, thank you all.
This line is all wrong:
fscanf(arch,"%s %s %f",&name,&career,&grades);
grades is declared as char grades[100];, ie. a string, however you're trying to read a float into it. Same goes for the printf line below it, you're using %f and telling printf that you're passing a float, however you're passing an array. You also don't need to use the address-of operator (&) when passing arrays to functions, as you have with fscanf.
You should use fclose to flush the buffer and close the file stream once you're done with reading/writing a file.
Back to the fscanf line, what exactly do you expect it to do? The file position indicator is at the end of the file, just after where you've appended the line produced by fprintf. Check the return value of fscanf and you'll see that it's returning EOF to report an error. The specific error value is stored in errno.
You can use rewind or fseek to set the position to the start of the file or back a certain amount, or you could always reopen the file. Note that mode "a" opens the stream for writing only, so to read from the same stream you would need "a+". I know that I at least wouldn't have my read code in a write_with_format function.
gets is unsafe and should not be used as it has the potential to cause buffer overflows; use fgets(name, sizeof name, stdin) instead.
Turn up your compiler warnings. If by chance you're using gcc, the flag is -Wall. Just because your code compiles, doesn't mean that it's going to work properly (or at all).
You declare grades as an array of chars but are trying to read a float into it.
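Putting these points together, here is a hedged sketch of writing a record and reading it back on the same stream. The helper name write_then_read, the float grade, and the comma-separated scanset format are assumptions made for illustration; the original assignment may require a different layout.

```c
#include <stdio.h>

/* Hypothetical helper: append one "name,career,grade" record to path,
   then read the FIRST record of the file back. Returns the number of
   fields fscanf converted (3 on success) or -1 if the file cannot be
   opened. The output buffers must hold at least 50 bytes each. */
int write_then_read(const char *path, const char *name, const char *career,
                    float grade, char name_out[50], char career_out[50],
                    float *grade_out)
{
    FILE *fp = fopen(path, "a+");   /* "a" alone would be write-only */
    int fields;

    if (fp == NULL)
        return -1;
    fprintf(fp, "%s,%s,%.2f\n", name, career, grade);
    rewind(fp);                     /* flush the write, move to the start */
    /* %49[^,] reads up to the comma, so names may contain spaces;
       %f matches the float variable it is stored into. */
    fields = fscanf(fp, " %49[^,],%49[^,],%f",
                    name_out, career_out, grade_out);
    fclose(fp);
    return fields;
}
```

Arrays are passed without the address-of operator, while the float is passed as a pointer. Keeping the read in a separate function that opens the file with "r" would be the cleaner design, as suggested above.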
