Why does sscanf not work in this situation? - c

Currently I am reading a file and printing (to stdout) all the words/strings that it contains.
Here is the code:
int scan_strings(FILE *in, FILE *out)
{
char buffer[64];
int i = 0, n = 0;
for(;;)
{
if (fscanf(in, "%*[^" charset "]") != EOF)
{
i = 0;
while (fscanf(in, "%63[" charset "]%n", buffer, &n) == 1)
{
if (n < 4 && i == 0)
{
break;
}
else
{
i = 1;
}
fputs(buffer, out);
}
if (i != 0)
{
putc('\n', out);
}
}
if (feof(in))
{
return 0;
}
if (ferror(in) || ferror(out))
{
return -1;
}
}
}
What I am actually trying to do is search for the strings in a buffer that has already been read into memory.
I changed the in and out variables to unsigned char* and changed fscanf to sscanf. That, however, doesn't work. Am I misunderstanding the sscanf function, or is there something else wrong in my code?
How can I print all strings from an already-read buffer? The data is binary data.
I am working on Windows and Linux portability isn't needed.

sscanf(data, "%*[^" charset "]") works differently from fscanf(in, "%*[^" charset "]") when data is binary.
Assume charset is some string like "123".
fscanf(in, "%*[^123]") will scan in as long as the char read is not '1', '2', or '3'.
This includes '\0'.
sscanf(data, "%*[^123]") will scan data as long as the char read is not '1', '2', or '3'.
This does not include '\0' as sscanf quits offering char to scan once '\0' is encountered.
Using sscanf() to scan over '\0' is not possible.
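To make the difference concrete, here is a minimal sketch (the helper name skipped_by_sscanf is made up for illustration). It reports how far sscanf() gets with the "%*[^123]" directive, and the count stops at the first '\0' even if more data follows it in memory:

```c
#include <stdio.h>

/* Hypothetical helper: returns how many leading bytes of s sscanf() skips
   before reaching '1', '2', '3' or the terminating '\0'. */
int skipped_by_sscanf(const char *s)
{
    int n = 0;                       /* stays 0 if the scanset matches nothing */
    sscanf(s, "%*[^123]%n", &n);     /* %n records the bytes consumed so far */
    return n;
}
```

Calling it on the literal "ab\0cd1" reports 2 bytes, not 5, because sscanf() treats the embedded '\0' as the end of its input.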
[Edit]
OP: How should I go about doing it - for binary data(from buffer/variable)?
A: Additional code around sscanf() can be used to cope with its stopping a scan when '\0' is encountered. Something like this, just for the first sscanf():
size_t j=0;
for (;;) {
// if (fscanf(in, "%*[^" charset "]") != EOF)
while (j < datasize) {
int n = 0;
sscanf(&data[j], "%*[^123]%n", &n);
if (n > 0) j += n;
else if (data[j] == '\0') j++;
else break;
}
if (j < datasize) {
i = 0;
...
As you can see things are getting ugly.
Let's try using strchr() with untested code:
size_t j=0;
for (;;) {
while (j < datasize) {
int ch = data[j];
if (ch && strchr(charset, ch) != NULL) break;
j++;
}
if (j < datasize) {
i = 0;
...
Getting better and this is only for the first sscanf().
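Carrying the strchr() idea through to the end, the whole scan can be done without sscanf() at all. The following is only a sketch under assumptions: the question's charset macro is not shown, so CHARSET below is a stand-in, and scan_strings_mem, in_charset and min_len are names invented here. It treats the buffer as raw bytes with an explicit size, so embedded '\0' bytes are skipped like any other non-member:

```c
#include <stdio.h>
#include <string.h>

/* Stand-in character set; the question's `charset` macro is not shown. */
#define CHARSET "abcdefghijklmnopqrstuvwxyz"

/* Returns nonzero if byte b is a member of CHARSET ('\0' never is). */
static int in_charset(unsigned char b)
{
    return b != '\0' && strchr(CHARSET, b) != NULL;
}

/* Writes every run of at least min_len CHARSET bytes to out, one per line.
   min_len is assumed to be at least 1.
   Returns the number of runs written, or -1 on an output error. */
int scan_strings_mem(const unsigned char *data, size_t size,
                     size_t min_len, FILE *out)
{
    int count = 0;
    size_t j = 0;
    while (j < size) {
        while (j < size && !in_charset(data[j])) j++;   /* skip non-members */
        size_t start = j;
        while (j < size && in_charset(data[j])) j++;    /* measure the run */
        if (j - start >= min_len) {
            if (fwrite(data + start, 1, j - start, out) != j - start) return -1;
            if (putc('\n', out) == EOF) return -1;
            count++;
        }
    }
    return count;
}
```

With min_len set to 4 this mirrors the n < 4 check in the question: shorter runs are silently dropped.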

The problem is that your code never modifies in. When in is a file fscanf will move through it sequentially. But sscanf doesn't work that way.
You need to find out how many characters sscanf read, and then increment in accordingly.
You're already getting the number of bytes read in n, so just add that to in.
in += n;
... after the sscanf.
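A small sketch of that pattern, assuming the input is an ordinary NUL-terminated string (it will still stop at the first '\0' in binary data). The function name count_digit_runs and the digit scanset are chosen only for illustration:

```c
#include <stdio.h>

/* Counts runs of digits in a NUL-terminated string, advancing `in`
   by the %n count after every sscanf() call. */
int count_digit_runs(const char *in)
{
    char buffer[64];
    int count = 0;
    for (;;) {
        int n = 0;
        sscanf(in, "%*[^0123456789]%n", &n);   /* skip non-digits; n stays 0 if none */
        in += n;
        n = 0;
        if (sscanf(in, "%63[0123456789]%n", buffer, &n) != 1)
            break;                              /* no digits left */
        in += n;                                /* move past what was just read */
        count++;
    }
    return count;
}
```

Note that n is reset before each call, because sscanf() leaves it untouched when the directive before %n fails to match. A run longer than 63 digits would be counted in pieces.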

Related

C Program doesn't end after giving the correct output

So I'm trying to write a program that reads a sequence of numbers separated by spaces and new lines. The output should be the same sequence, but with unnecessary zeros erased (the sequence of characters 'EOF' ends the program). For example
01492 102934 should come out as 1492 102934
9312 0 01923 should come out as 9312 0 1923
0001249 0000 should come out as 1249 0
Well I've achieved that purpose but have come across a roadblock. The program doesn't exit unless I type the EOF sequence. Maybe it's because I have a while(1) running that gives an infinite loop. But when I try to delete it the program doesn't even print at all. I'm still learning this is for a school project.
Any help would be appreciated!
Here's the code:
#include <stdio.h>
int main(){
char c;
int i=0;
while(1){
c=getchar();
if (i==0){
if(c=='0'){
while (c=='0'){
c=getchar();
}
}
printf("%c",c);
i=i+1;
}
else if (c==' '){
printf("%c",c);
c=getchar();
if(c=='0'){
while (c=='0'){
c=getchar();
}
}
printf("%c",c);
}
else if (c=='E'){
c=getchar();
if (c=='O'){
c=getchar();
if(c=='F'){
printf("\n");
return 0;
}
}
}
else{
printf("%c",c);
}
}
}
The important stuff:
int c; // IMPORTANT, cannot be char
while (1) {
c = getchar();
if (c == EOF) break; // exit loop
// ...
}
There has to be some way to tell the program to exit.
With this, the program will exit on the letter x or two consecutive newlines or entering END.
getchar will return EOF when there is nothing left to read from a file. That can be simulated from stdin ( the keyboard) with ctrl + z on Windows or ctrl + d on Linux.
#include <stdio.h>
#include <string.h>
int main ( void) {
char done[4] = "";
int c = 0;
int prior = 0;
int reading = 0;
int zero = 1;
while ( EOF != ( c = getchar ( )) && 'x' != c) {
if ( '\n' == c && '\n' == prior) {
break;
}
if ( c >= '0' && c <= '9') {
reading = 1;
if ( '0' != c) {
zero = 0;
}
if ( ! zero) {
putchar ( c);
}
}
else {
if ( reading) {
if ( zero) {
putchar ( '0');
}
if ( ' ' == c || '\n' == c) {
putchar ( c);
}
else {
putchar ( ' ');
}
}
reading = 0;
zero = 1;
}
prior = c;
done[0] = done[1];
done[1] = done[2];
done[2] = c;
done[3] = 0;
if ( 0 == strcmp ( done, "END")) {
break;
}
}
putchar ( '\n');
return 0;
}
getchar() returns an int, not a char. If it only returned a char, there would be no way for it to return a value that indicates end of file, since all char values are valid and can’t be used for another purpose.
A motivating example in the decimal system: a function that checks the temperature returns a two-digit number. Any temperature between 0 and 99 is valid. How do you report an error when the thermometer is disconnected? You have to return a number with more digits, and use a special value like UNPLUGGED = 100.
But int is a wider type: it has many more values than char, and the “extra” values can be used to indicate some special condition that means “hey, this is not a valid character, but something else I had to tell you”.
getchar() returns the EOF constant upon failure (any failure), for example if no more input is available. There’s nothing sensible you can do even if the reason for the failure is something other than end of input. You should end processing at the first EOF.
Thus, change the type of c to int, and every time you call getchar(), you must check that its value is not EOF, and return when you encounter it.
The nested structure of your loops means that EOF checking has to be repeated all over the place. There are other ways to structure the code to keep this check in one place, but, admittedly, the nested loops have at least the potential to exploit the branch predictor, whereas a single getchar followed by a state-machine style switch statement will make it perform potentially worse. None of this matters in a simple homework problem, but it’s something to keep in mind. In any case, performance has to be benchmarked - no other way around it.
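As an illustration of keeping the EOF check in one place, here is a sketch of the state-machine style. The function name strip_leading_zeros and the FILE* parameters are my own choices so that it can be used with stdin/stdout or with any other stream; it is not the asker's code:

```c
#include <stdio.h>

/* Copies in to out, dropping leading zeros from each run of digits.
   Returns how many numbers were written. */
int strip_leading_zeros(FILE *in, FILE *out)
{
    int c;                 /* int, so EOF is distinguishable from any char */
    int in_number = 0;     /* currently inside a run of digits */
    int printed = 0;       /* a digit of this run has already been written */
    int count = 0;
    while ((c = getc(in)) != EOF) {           /* the only EOF check */
        if (c >= '0' && c <= '9') {
            in_number = 1;
            if (c != '0' || printed) {        /* drop zeros until a digit is out */
                putc(c, out);
                printed = 1;
            }
        } else {
            if (in_number) {
                if (!printed) putc('0', out); /* the run was all zeros */
                count++;
            }
            in_number = 0;
            printed = 0;
            putc(c, out);                     /* separators pass through unchanged */
        }
    }
    if (in_number) {                          /* input ended in the middle of a number */
        if (!printed) putc('0', out);
        count++;
    }
    return count;
}
```

Called as strip_leading_zeros(stdin, stdout), it ends when the input is exhausted (Ctrl+Z or Ctrl+D at the keyboard), with no need for a magic "EOF" word in the data.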
Try this code, I think it does what you requested:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
static int getLine(char *prmpt, char *buff, size_t sz) {
int ch, extra;
// Get line with buffer overrun protection.
if (prmpt != NULL) {
printf("%s", prmpt);
fflush(stdout);
}
if (fgets(buff, sz, stdin) == NULL)
return -2;
// If it was too long, there'll be no newline. In that case, we flush
// to end of line so that excess doesn't affect the next call.
if (buff[strlen(buff) - 1] != '\n') {
extra = 0;
while (((ch = getchar()) != '\n') && (ch != EOF))
extra = 1;
return (extra == 1) ? -1 : 0;
}
// Otherwise remove newline and give string back to caller.
buff[strlen(buff) - 1] = '\0';
return 0;
}
int* convert2numbers(char* arr, int size) {
int i;
int j;
int k;
char token[100];
int* numbers;
int last_space = 0;
int index = 1;
int amount = 1;
// Count the amount of tokens.
for (i = 0; i < size; ++i) {
if (arr[i] == ' ') {
++amount;
}
}
numbers = (int *)malloc((amount + 1) * sizeof(int)); // slot 0 holds the count, so one extra element is needed
numbers[0] = amount;
for (j = 0; j <= size; ++j) {
if (arr[j] == ' ' || arr[j] == '\0') {
// Copy token from input string.
for (k = 0; k < j; ++k) {
token[k] = arr[k + last_space];
}
token[j] = '\0';
numbers[index] = atoi(token);
// Clear the token and continue.
memset(token, '\0', sizeof(token));
last_space = j;
++index;
}
}
return numbers;
}
int main(void) {
int i;
int size;
int* numbers;
int amount;
char input[100];
char help[] = "Numbers> ";
printf("Input numbers below or press enter to exit!\n");
while (1) {
getLine(help, input, sizeof(input));
// If input is empty exit.
if (input[0] == '\0') {
break;
}
size = strlen(input);
numbers = convert2numbers(input, size);
amount = numbers[0];
for (i = 1; i < amount + 1; ++i) {
printf("%d ", numbers[i]);
}
printf("\n");
}
return 0;
}
When run with these inputs this code outputs:
Input numbers below or press enter to exit!
Numbers> 01492 102934
1492 102934
Numbers> 9312 0 01923
9312 0 1923
Numbers> 0001249 0000
1249 0
Also, pressing Enter on an empty line exits the while(1) loop and ends the program.

Input of varying format in C

I am currently trying to figure out how to process an input of such format: [int_1,...,int_N] where N is any number from interval <1, MAX_N> (for example #define MAX_N 1000). What I have right now is fgets to get it as string which I then, using some loops and sscanf, save into an int array.
My solution is, IMO, not the most elegant and functional, but that's because of how I implement it. So what I'm asking, I guess, is how you guys would solve this problem, because I've run out of ideas.
Edit: adding the code for string -> int array
int digit_arr[MAX_N];
char input[MAX_N];
//MAX_N is a constant set at 1000
//Brackets and spaces have been removed at this point
for (i = 0; i < strlen(input); i++) {
if(sscanf(&input[i+index_count],"%d,", &digit_arr[i]) == 1){
while (current_char != ',') {
current_char = input[i+index_count+j];
index_count++;
j++;
if ((index_count+j+i) == strlen(input)-1){
break;
}
}
}
My personal variant:
char const* data = input; // if input is NOT a pointer or you still need it unchanged
for(;;)
{
int offset = 0;
if(sscanf(data, "%d,%n", digit_arr + i, &offset) == 1)
{
++i;
if(offset != 0)
{
data += offset;
continue;
}
}
break;
}
Finally, you might check whether all characters in the text have been consumed:
if(*data)
{
// not all characters consumed, input most likely invalid
}
else
{
// we reached terminating null character -> fine
}
Note that my code as is does not cover trailing whitespace; you could do so by changing the format string to "%d, %n" (note the added space character).
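For comparison, here is a sketch that parses the bracketed form [int_1,...,int_N] directly with strtol(), so the brackets and spaces do not have to be stripped first. The function name parse_int_list and its -1 error convention are assumptions of this sketch, and it does not check that each value fits in an int:

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/* Parses a string such as "[1, 2,3]" into out (at most max values).
   Returns the number of values stored, or -1 if the input is malformed
   or holds more than max values. */
int parse_int_list(const char *s, int *out, size_t max)
{
    size_t n = 0;
    while (isspace((unsigned char)*s)) s++;
    if (*s++ != '[') return -1;                /* must start with '[' */
    for (;;) {
        char *end;
        long v = strtol(s, &end, 10);          /* skips leading whitespace itself */
        if (end == s || n == max) return -1;   /* no number here, or too many */
        out[n++] = (int)v;
        s = end;
        while (isspace((unsigned char)*s)) s++;
        if (*s == ',') { s++; continue; }      /* another value follows */
        if (*s == ']') return (int)n;          /* well-formed end of list */
        return -1;                             /* anything else is an error */
    }
}
```

The end pointer from strtol() plays the same role as %n in the sscanf() variant: it says where the next token starts.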

Why isn't the delimiter \n being detected in C?

The issue is detecting '\n' when I loop through my array. It works once as
shown in the comments, but it does not work after. The goal of this program is to take input from the terminal and put it into an array. The array should not contain any '\n'. Any help is appreciated, Thanks
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
// 1. Function must take input and place in array whilst making sure it does not overflow
// 2. Must return null if end of stdin is reached
// 3. Must ensure that it does not contain delimiter \n
// Tests:
// a) empty string
// b) string longer than buffer
// c) what happens when you press ctrl-d
char read_line(char *buf, size_t sz) {
while(fgets(buf + strlen(buf), sz, stdin)){
if (strlen(buf) < sz) {
if(buf[strlen(buf)-1] == '\n' ){
// IT GET'S DETECTED HERE WHEN THE ENTER
// BUTTON
// IS PRESSED BUT ...
break;
}
}
}
// WHEN I LOOP THROUGH THE ARRAY IT GETS DETECTED AS SINGLE CHARS; '\'
// AND 'n' DISTINCTLY
for(int i = 0; i < strlen(buf)-1; ++i){
if(buf[i] == '\n'){
printf("present");
} else {
printf("x");
}
}
return NULL;
}
int main(int argc, char *argv[]){
char arra[20];
size_t sz = sizeof(arra);
memset(arra, 0, sz);
printf("Enter command: \n");
read_line(arra, sz);
// Print elements in array
printf("Printing out array: \n");
for(int i = 0; i < strlen(arra); ++i){
char c = arra[i];
printf("%c", c);
}
}
You appear to be entering something like the keystrokes hello\nENTER.
The entry of the two distinct characters \ and n are exactly that, two distinct characters. That is vastly different to the single newline character which is represented in the source as \n.
In terms of what the buffer will hold, it'll be the string "hello\\n\n", where \\ is the \ character, n is an n, and \n is the newline.
If your intent is to detect the newline in the string, you'll need to process every character in the string. The loop:
for (int i = 0; i < strlen(buf) - 1; ++i) ...
will basically skip the last character, which is fine for ignoring trailing newline should it exist but, if you want to detect it, you'll need:
for (int i = 0; i < strlen(buf); ++i) ...
I suggest replacing:
for(int i = 0; i < strlen(buf)-1; ++i){
    if(buf[i] == '\n'){
        printf("present");
    } else {
        printf("x");
    }
}
with:
if( strchr( buf, '\n' ) )
{
puts( "present" );
}
else
{
puts( "x" );
}
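Going back to the stated requirements (no overflow, NULL at end of input, no '\n' left in the array), one possible shape for read_line is sketched below. The extra FILE* parameter and the decision to discard the rest of an over-long line are my assumptions, not something the question specifies:

```c
#include <stdio.h>
#include <string.h>

/* Reads one line into buf (capacity sz) without the trailing '\n'.
   Returns buf, or NULL at end of input. An over-long line is truncated
   and its remainder is discarded. */
char *read_line(char *buf, size_t sz, FILE *in)
{
    if (fgets(buf, (int)sz, in) == NULL)
        return NULL;                          /* end of input or read error */
    size_t len = strcspn(buf, "\n");          /* index of '\n', or of '\0' if none */
    if (buf[len] == '\n') {
        buf[len] = '\0';                      /* strip the newline */
    } else {
        int ch;                               /* line did not fit: drop the rest */
        while ((ch = getc(in)) != '\n' && ch != EOF) { }
    }
    return buf;
}
```

strcspn() avoids the buf[strlen(buf) - 1] idiom, which misbehaves when the string is empty.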

how to get an unknown number of integers into array in c

I need to write a program in c that gets integers as input from the user.
input example:
10 20 50 70
The user presses Enter and then the input is over.
I can't think of a condition to make it happen. I tried to write:
int grades[1000];
int i=0;
while(scanf("%d", &grades[i])!=EOF)
{
i++;
}
It is not working.
Reading a line of user input and then parsing it is really the best approach, as suggested by @The Paramagnetic Croissant.
If code cannot pre-define an input buffer size or must parse the line as it comes in, then using scanf("%d",... is OK. The inelegant part is finding the '\n'.
#include <ctype.h>
#include <stdio.h>
#define N 1000
int grades[N];
int i = 0;
while (i < N) {
    // Consume leading white-space, but not \n
    int ch;
    while ((ch = fgetc(stdin)) != '\n' && isspace(ch));
    // normal exit
    if (ch == '\n' || ch == EOF) break;
    ungetc(ch, stdin);
    if (1 != scanf("%d", &grades[i])) {
        // Non-numeric data
        break;
    }
    i++; // the only increment, so i ends up as the number of values read
}
If you need to read an entire line, then read an entire line, simple as that. If you google "C read line", you will most probably end up reading the documentation of fgets(). Then you google "C convert string to integer", and you find that there is a function called strtol() in the C standard library. Armed with these two weapons, and applying some logic, you can deduce something like this:
const size_t max_numbers = 1000; // however many
int numbers[max_numbers];
size_t index = 0;
char buf[LINE_MAX]; // LINE_MAX comes from <limits.h> on POSIX systems
while (index < max_numbers && fgets(buf, sizeof buf, stdin)) {
    char *p = buf;
    char *end;
    while (index < max_numbers && *p && *p != '\n') {
        long value = strtol(p, &end, 10);
        if (end == p) break; // nothing converted: stop instead of looping forever
        numbers[index++] = (int)value;
        p = end;
    }
}
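Since the question wants input to end as soon as the user presses Enter, here is a sketch of the same fgets() plus strtol() idea wrapped in a function that consumes exactly one line. The name read_int_line and the 4096-byte line limit are assumptions made for the example:

```c
#include <stdio.h>
#include <stdlib.h>

/* Reads one line from in and stores up to max integers in out.
   Returns how many integers were stored (0 at end of input). */
size_t read_int_line(FILE *in, int *out, size_t max)
{
    char line[4096];                 /* assumed upper bound on the line length */
    size_t count = 0;
    if (fgets(line, sizeof line, in) == NULL)
        return 0;
    char *p = line;
    while (count < max) {
        char *end;
        long v = strtol(p, &end, 10);
        if (end == p) break;         /* nothing converted: end of line or junk */
        out[count++] = (int)v;
        p = end;                     /* continue after the number just read */
    }
    return count;
}
```

For the input line "10 20 50 70", a call such as read_int_line(stdin, grades, 1000) stores four values and returns as soon as that line has been parsed.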

Using fscanf to receive an array

I'm trying to take the input of floating numbers from a file and arrange it into an array.
The only trouble is that I don't know exactly how many floating numbers there will be each time though I do know that the max amount of floating numbers is 1000.
What I need to do is have fscanf take all the floating numbers but then stop at the next line, which is full of integers.
Here is what I have so far for this section:
for (repetition = 0; repetition <= 1000; repetition++)
{
fscanf(userFile, "%f", &itemPrice[itemNumber]);
itemNumber++;
}
But unfortunately, this continues on to assign array values to all of the other values in the next several lines.
I found another user input, auctionItems and used that to control the array length using while(itemNumber < auctionItems)
The return value of fscanf is the number of items successfully read. Use that to decide when to stop.
#include <stdio.h>
int main(int argc,char * argv[])
{
int i;
float ff[1000];
char next_text[17];
for (i=0; i < 1000; i++) {
int n_read;
n_read = fscanf(stdin, " %f", &( ff[i] ));
if (n_read < 1) {
fscanf(stdin, "%16s", next_text);
next_text[16] = (char) 0;
printf("Next text: '%s'\n", next_text);
break;
}
}
printf("Read %d items, %f .. %f\n",i,ff[0],ff[i-1]);
return 0;
}
Maybe you could try this method using fgetc and ungetc:
for (repetition = 0; repetition < 1000; repetition++)
{
    int c = fgetc(userFile);
    if (c == '\n' || c == EOF) break; // end of the line of floats
    ungetc(c, userFile);              // put the character back for fscanf
    fscanf(userFile, "%f", &itemPrice[itemNumber]);
    itemNumber++;
}
char input[MAX];
while(fgets(input, MAX, userFile) != NULL){
sscanf(input, "%f", &itemPrice[itemNumber++]); // %f, since itemPrice holds floats in the question
}
OP is fairly close. Just stop when scanning fails.
fscanf() returns the number of fields scanned or EOF. In this case, checking for 1 is sufficient.
// for (repetition = 0; itemPrice <= 1000; repetition++) Use <, not <= @Arpit
for (i = 0; i < 1000; )
{
int retval = fscanf(userFile, "%f", &itemPrice[i]);
if (retval != 1) break;
i++;
}
A robust solution would detect unexpected data. This solution simply goes on until the end-of-file or non-float text is encountered.
[Edit]
It appears OP has only 1 line of floats. In that case code could read the whole line and then parse the buffer.
#define CHAR_PER_FLOAT_MAX 20
char buf[1000*(CHAR_PER_FLOAT_MAX + 1)]; // Some large buffer
if (fgets(buf, sizeof buf, input)== NULL) return; // EOF or IO error
char *p = buf;
for (i = 0; i < 1000; ) {
int n = 0;
if (sscanf(p, "%f %n", &itemPrice[i], &n) != 1) break;
p += n;
i++;
}
if (*p) Handle_ExtraTextOnLine();
foo(itemPrice, i); // Use data
Another approach is to read directly from the file 1 float at a time and then look at the following white-space for an end-of-line. Less elegant.
for (i = 0; i < 1000; ) {
    if (fscanf(input, "%f", &itemPrice[i]) != 1) {
        // need to add code to consume the rest of the line if processing is to continue.
        break;
    }
    i++;
    // look for standard white-space
    char buf[2];
    int eol = 0;
    while (fscanf(input, "%1[ \f\n\r\t\v]", buf) == 1) {
        if (buf[0] == '\n') { eol = 1; break; }
    }
    if (eol) break; // end of line reached: stop reading floats
}
foo(itemPrice, i); // Use data
