Scanning string from byte array - c

Is there a way to use scanf() to scan a string from an array of bytes?
i.e: scan any number of bytes before a specific value is found, and after that scan the subsequent string?
The main problem I'm having is dealing with the '\0' value. Is there a way to make scanf() bypass the NUL terminator in a controlled way?

Why don't you just iterate over the string?
char str[30];
char str1[30];
char str2[30];
//Initialize str to some string
int i=0;
while(str[i]!='x') //say scan till you find x
{
i++;
}
i++;
memcpy(str1, str, i); //extract this substring till x
str1[i]='\0'; //i bytes were copied, so the terminator goes at index i
int j=0;
while(str[i]!='\0') //now copy the rest
{
j++; //track the point where x appeared
i++;
}
memcpy(str2, &str[i-j], j); //extract the rest
str2[j]='\0'; //j bytes were copied, so the terminator goes at index j

You can use sscanf() instead:
char string[100] = "720 11 43";
int x, y, z;
sscanf(string, "%d %d %d", &x, &y, &z);

You cannot do this from an array of bytes using sscanf(), because sscanf() stops when it reaches a '\0'. @Joseph Quinsey
If code is reading from a file or stdin, there is a solution.
Both fscanf("%[^something]") and fscanf("%c") will scan a '\0'. Even fgets() will scan a '\0'.
OP: "scan any number of bytes before a specific value is found, and after that scan the subsequent string?"
The following will 1) scan over any number of bytes until an x is found, 2) scan the x, 3) skip white-space and 4) scan and save non-white-space. Unfortunately this last step treats an embedded '\0' as non-white-space. Note that %*[^x] must match at least one byte, so the call fails if x is the very first byte.
char buf[100];
if (1 == fscanf(inf, "%*[^x]x%99s", buf)) string_after_x_is_found(buf);
To use fscanf("%c") in a general sense:
FILE *inf;
inf = fopen("something", "rb");
if (inf != NULL) { // fopen() returns NULL on failure
char ch;
while (fscanf(inf,"%c", &ch) == 1) {
foo(ch);
}
fclose(inf);
}
Embedded '\0' really messes up the scanf() family. Careful use of format can work, but I recommend simply using fread() or fgetc() instead.
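For the case the question actually asks about, a byte array already in memory, the scanf() family can be avoided altogether. The following is only a minimal sketch: string_after_marker() and its parameters are hypothetical names, not something from the answers above. It assumes the buffer length is known, finds the marker byte with memchr(), and copies what follows up to the next '\0' or the end of the buffer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: find the first `marker` byte in buf[0..len) and copy
 * the bytes that follow it, up to the next '\0' or the end of the buffer,
 * into `out` as a NUL-terminated string (truncated to fit out_size).
 * Returns the number of bytes copied, or -1 if the marker is absent or
 * out_size is 0. */
long string_after_marker(const unsigned char *buf, size_t len,
                         unsigned char marker, char *out, size_t out_size)
{
    const unsigned char *p;
    const unsigned char *end;
    size_t avail, n;

    if (out_size == 0)
        return -1;

    /* memchr() takes an explicit length, so embedded '\0' bytes before the
     * marker are stepped over like any other value. */
    p = memchr(buf, marker, len);
    if (p == NULL)
        return -1;

    p++;                                /* step past the marker itself */
    avail = len - (size_t)(p - buf);    /* bytes left after the marker */

    /* The string ends at the next '\0', or at the end of the buffer. */
    end = memchr(p, '\0', avail);
    n = end ? (size_t)(end - p) : avail;
    if (n > out_size - 1)
        n = out_size - 1;

    memcpy(out, p, n);
    out[n] = '\0';
    return (long)n;
}
```

Because the length is passed explicitly, nothing here depends on where the NUL bytes happen to be, which is the property sscanf() cannot offer.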

Related

Validating integer of length 11 and starts with 0

I'm trying to make a function to validate a mobile number entry. The mobile number must start with 0 and be 11 digits long (01281220427, for example).
I want to make sure that the program gets the right entry.
This is my attempt:
#include <stdio.h>
#include <string.h>
void integerValidation(char x[15]);
int main(int argc, char **argv)
{
char mobile[15];
integerValidation(mobile);
printf("%s\n\n\n", mobile);
return 0;
}
void integerValidation(char x[15]){
char input[15];
long int num = -1;
char *cp, ch;
int n;
printf("Please enter a valid mobile number:");
while(num<0){
cp = fgets(input, sizeof(input), stdin);
if (cp == input) {
n = sscanf(input, "%ld %c", &num, &ch);
if (n!=1) {printf("ERROR! Please enter a valid mobile number:");
num = -1;
}
else if (num<0)
printf("ERROR! Please enter a valid mobile number:");
else if ((strlen(input)-1)>11 || (strlen(input)-1)<11 || strncmp(&input[0], "0", 1) != 0){
printf("ERROR! Please enter a valid mobile number:");
num = -1;
}
}
}
long int i;
i = strlen(input);
//Because when I try to print it out it prints a line after number.
strcpy(&input[i-1], "");
strcpy(x, input);
}
Now, if I don't use
strcpy(&input[i-1], "");
the array prints a new line after the number. What would be a good fix other than mine? And how can I make this function optimized and shorter?
Thanks in advance!
Edit:
My question is: 1. Why does the input array print a new line at the end?
2. How can I make this code shorter?
End of edit.
If you insist on using sscanf(), you should change the format this way:
int integerValidation(char x[15]) {
char input[15], c;
printf("Please enter a valid mobile number:");
while (fgets(input, sizeof(input), stdin)) {
if (sscanf(input, "%11[0123456789]%c", x, &c) == 2
&& x[0] == '0' && strlen(x) == 11 && c == '\n') {
// number stored in `x` is correct
return 1;
}
printf("ERROR! Please enter a valid mobile number:");
}
x[0] = '\0'; // no number was input, end of file reached
return 0;
}
%11[0123456789] parses at most 11 characters that must be digits.
%c reads the following character, which should be the trailing '\n'.
I verify that both formats have been matched, and the number starts with 0 (x[0] == '0') and it has exactly 11 digits.
You're seeing the newline, since fgets() reads until an EOF or a newline is received. The newline is stored in the buffer, and after that the string is terminated with '\0'.
An alternative would be to directly overwrite the newline with another null-byte: input[i-1] = '\0' (which basically does the same thing as your solution, but saves a function call).
The same goes for the check with strncmp with length 1, you can directly check input[0] == '0'. Note that you have to compare against '0' (char) here, not "0" (string).
A few other things I'm seeing:
You can also drop the %c from the format string for sscanf (you're never evaluating it anyway, since you're checking for 1 as return value), which also eliminates the need for char ch.
Also, you're passing char x[15] as argument to your function. This is a bit misleading, because what actually gets passed is a pointer to the first char of the array (try using sizeof(x), your compiler will most likely issue a warning about the size of char * being returned by sizeof()).
What you could do is to ditch the char array input, which you're using as temporary buffer, and use the buffer which was handed over as argument. For this to be safe, you should use a second function parameter to specify the size of the buffer which was handed to the function, which would result in a function header as follows:
void integerValidation(char *input, size_t len);
With this, you'd have to use len instead of sizeof(input). The following question provides more detail why: C: differences between char pointer and array
Since you're not using a temporary buffer anymore, you can remove the final call to strcpy().
There are also a lot of checks for the number length/format. You can save a few:
If you scan into an unsigned long with %lu instead of %ld, the check for num < 0 becomes unnecessary. Be aware, though, that %lu still accepts a leading minus sign (the value simply wraps around), so it does not reject such input by itself.
You're checking whether the length of the read number is <11 or >11 - why not just check for !=11?
You're calling strlen() three times on the input-buffer (or still twice with the reworked check for length 11) - it makes sense to call it once, save the length in a variable and use that variable from then on, since you're not altering the string between the calls.
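The suggestions above can be combined into a small check that works on the line as text. This is just a sketch under assumptions: is_valid_mobile() and MOBILE_DIGITS are hypothetical names, and the newline is stripped with strcspn() rather than by indexing, which is one possible choice among several.

```c
#include <string.h>

#define MOBILE_DIGITS 11  /* required length, taken from the question */

/* Hypothetical helper: strips a trailing newline from `line` in place and
 * returns 1 if what remains is exactly MOBILE_DIGITS digits starting with
 * '0', otherwise 0. */
int is_valid_mobile(char *line)
{
    size_t len;

    line[strcspn(line, "\r\n")] = '\0';  /* drop the newline kept by fgets() */
    len = strlen(line);                  /* computed once, reused below */

    return len == MOBILE_DIGITS
        && line[0] == '0'                           /* char, not the string "0" */
        && strspn(line, "0123456789") == len;       /* digits only */
}
```

A caller would read a line with fgets(buf, sizeof buf, stdin) and pass buf to is_valid_mobile(); since the number is never converted to an integer, the leading zero is preserved.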
There is already an accepted answer, but for what it's worth, here is another.
I made several changes to your code, firstly avoiding "magic numbers" by defining the phone number length and an arbitrarily greater string length. Then there is no point passing an array x[15] to a function since it pays no regard to its length, might as well use the simpler *x pointer. Next, I return all reasons for failure back to the caller, that's simpler. And instead of trying to treat the phone number as a numeric entry (note: letters, spaces, hyphens, commas and # can sometimes be a part of phone number too) I stick to a character string. Another reason is that the required leading zero will vanish if you convert the entry to an int of some size. I remove the trailing newline that fgets() reads with the input line, and the result is this.
#include <stdio.h>
#include <string.h>
#include <ctype.h>
#define MAXLEN 11
#define STRLEN (MAXLEN+10)
int integerValidation(char *x);
int main(int argc, char **argv)
{
char mobile[STRLEN];
while (!integerValidation(mobile)) // keep trying
printf("Invalid phone number\n");
printf("%s\n\n\n", mobile); // result
return 0;
}
int integerValidation(char *x)
{
int i, len;
printf("Please enter a valid mobile number:");
if(fgets(x, STRLEN, stdin) == NULL) // check bad entry
return 0;
x [ strcspn(x, "\r\n") ] = 0; // remove trailing newline etc
if((len = strlen(x)) != MAXLEN) // check length
return 0;
if(x[0] != '0') // check leading 0
return 0;
for(i=1; i<len; i++) // check all other chars are numbers
if(!isdigit((unsigned char)x[i])) // cast: isdigit() needs a non-negative value
return 0;
return 1; // success
}

Scanf skipped in loop (Hangman)

This program essentially asks for a secret string, then asks a user to repeatedly guess single chars of that string until he guesses it all. It works; however, every second time the while loop runs, it skips the user input for the guessed char. How do I fix this?
int main(){
char guess;
char test2 [50];
char * s = test2;
char output [50];
char * t = output;
printf("Enter the secret string:\n");
fgets(test2, 50, stdin);
for (int i=0;i<49;i++){ //fills output with _ spaces
*(output +i)='_';
while(strcmp(s,t) != 0){
printf("Enter a guess:");
scanf("%c",&guess);
printf("You entered: %c\n", guess);
showGuess(guess,s, t ); // makes a string "output" with guesses in it
printf("%s\n",t);
}
printf("Well Done!");
}
For a quick and dirty solution try
// the space in the format string consumes optional spaces, tabs, enters
if (scanf(" %c", &guess) != 1) /* error */;
For a better solution redo your code to use fgets() and then parse the input.
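One way the fgets() approach could look is sketched below. read_guess() is a hypothetical helper name and the 64-byte line buffer is an arbitrary assumption; the function takes a FILE * so that it can be used with stdin or with any other stream.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads whole lines from `in` and returns the first
 * non-whitespace character of the first non-blank line, or EOF if the
 * input ends before such a character is found. */
int read_guess(FILE *in)
{
    char line[64];

    while (fgets(line, sizeof line, in) != NULL) {
        const char *p = line;
        int c;

        /* If the line did not fit in the buffer, discard the rest of it
         * so it cannot be mistaken for the next guess. */
        if (strchr(line, '\n') == NULL)
            while ((c = getc(in)) != '\n' && c != EOF)
                ;

        while (isspace((unsigned char)*p))  /* skip leading whitespace */
            p++;
        if (*p != '\0')
            return (unsigned char)*p;
        /* blank line: read the next one */
    }
    return EOF;
}
```

Since every call consumes a complete line, including its newline, no leftover '\n' remains for the next read.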
As pointed out in some other answers and comments, you need to "consume" the "newline character" in the input.
The reason is that keyboard input is line-buffered by the terminal, so the program sees nothing until you press Enter. At that point the program can read the whole line: your input, followed by the character that ended it, the newline. If you don't consume that newline before the next scanf, that second scanf reads the newline itself, which is the "skipped scanf" you observed. The simplest way to consume the leftover character(s) is to read them and discard them, which is what the line
while(getc(stdin) != '\n');
placed after your scanf in the code below does: "while the character read from stdin is not '\n', do nothing and loop". Note that this loop never ends if stdin reaches end-of-file before a newline.
As an alternative, you could switch the terminal out of line-buffered (canonical) mode via the termios(3) functions, or you could use either of the curses/ncurses libraries for the I/O.
So here is what you want:
int main(){
char guess;
char test2 [50];
char * s = test2; // 3. Useless
char output [50];
char * t = output; // 3. Useless
int i; // 8. i shall be declared here.
printf("Enter the secret string:\n");
fgets(test2, 50, stdin);
for (i=0;i<50;i++) if (test2[i] == '\n') test2[i] = '\0'; // 4. Remove the newline char and terminate the string where the newline char is.
for (int i=0;i<49;i++){ // 5. You should use memset here; 8. You should not declare 'i' here.
*(output +i)='_';
} // 1. Either you close the block here, or you don't open one for just one line.
output[49] = '\0'; // 6. You need to terminate your output string.
while(strcmp(s,t) != 0){ // 7. That will never work in the current state.
printf("Enter a guess:");
scanf("%c",&guess);
while(getc(stdin) != '\n');
printf("You entered: %c\n", guess);
showGuess(guess,s, t );
printf("%s\n",t);
}
printf("Well Done!");
return 0; // 2. int main requires that.
}
Other comments on your code:
1. You opened a block after your for loop and never closed it, so the braces do not balance.
2. You declared your main as a function returning an integer, so you should at least return 0; at the end.
3. You seem to believe that char * t = output; copies output's contents and makes t a name for the new copy. It does not: only the address of output is copied into t. As a result, output and t refer to the same data, so a modification made through one is visible through the other. In other words, the t and s variables are redundant in the current state.
4. You also need to remove the newline character from your input in the test2 buffer. I have added a line after the fgets for that.
5. Instead of setting all the bytes of an array "by hand", please consider using the memset function instead.
6. You need to actually terminate the output string after you "fill" it, so you should store a '\0' in the last position.
7. You will never be able to compare the test2 string with the output one as things stand: output is filled with underscores up to its end, whereas test2 is NUL-terminated right after its meaningful content.
8. Declaring a variable in the for statement is valid in C99 and C11 but not in C89 (ANSI C); and it is usually better to not declare any variable in a loop.
Also, "_ spaces" are called "underscores" ;)
Here is a code that does what you want:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define LEN 50
int main()
{
char phrase[LEN];
char guessed[LEN];
char guess;
int i, tries = 0;
puts("Please enter the secret string:");
if(fgets(phrase, LEN, stdin) == NULL)
return 1;
for(i = 0; i < LEN && phrase[i] != '\n' && phrase[i] != '\0'; i++); // Detect the end of input data.
for(; i < LEN; i++) // For the rest of the buffer,
phrase[i] = '_'; // fill with underscores (so it can be compared with 'guessed' in the while loop).
phrase[LEN - 1] = '\0'; // NUL-terminate 'phrase'
memset(guessed, '_', LEN); // Fill 'guessed' with underscores.
guessed[LEN - 1] = '\0'; // NUL-terminate 'guessed'
while(strcmp(phrase, guessed) != 0) // While 'phrase' and 'guessed' differ
{
puts("Enter a guess (one character only):");
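if(scanf("%c", &guess) != 1)
{
puts("Error while parsing stdin.");
return 1; // End of input or read error: retrying would loop forever.
}
if(guess == '\n')
{
puts("Invalid input.");
continue;
}
while((i = getc(stdin)) != '\n' && i != EOF); // "Eat" the extra remaining characters in the input ('i' is reused as scratch here).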
printf("You entered: %c\n", guess);
for(i = 0; i < LEN; i++) // For the total size,
if(phrase[i] == guess) // if guess is found in 'phrase'
guessed[i] = guess; // set the same letters in 'guessed'
printf("Guessed so far: %s\n", guessed);
tries++;
}
printf("Well played! (%d tries)\n", tries);
return 0;
}
Feel free to ask questions in the comments, if you are not getting something. :)
The newline character entered in the previous iteration is being read by scanf. You can consume the '\n' by calling getc() as follows:
scanf("%c",&guess);
getc(stdin);
..
This change worked for me, though the right explanation and cleaner code are the ones given by @7heo.tk.
Replace
scanf("%c",&guess);
with
scanf(" %c",&guess);
The leading space makes scanf skip the leftover '\n'.
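The effect of that space can be observed without a terminal by running both formats over the same text with sscanf(). This is only an illustrative sketch; second_char() is a hypothetical helper, not part of any answer above.

```c
#include <stdio.h>

/* Hypothetical helper: parses two characters from `input` and returns the
 * second one, or -1 if fewer than two could be read.
 * skip_ws == 0 uses "%c%c"; any other value uses " %c %c". */
int second_char(const char *input, int skip_ws)
{
    char first, second;
    int n;

    if (skip_ws)
        n = sscanf(input, " %c %c", &first, &second);  /* spaces skip whitespace */
    else
        n = sscanf(input, "%c%c", &first, &second);    /* whitespace is read as data */

    if (n != 2)
        return -1;
    return (unsigned char)second;
}
```

With the input "a\nb\n", the plain format hands back the newline as the second character, while the format with spaces hands back 'b'. That is the same thing that happens across two loop iterations in the hangman program.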

Comparing String Arrays in C

This is the code:
#include <stdio.h>
int main(void)
{
char words[256];
char filename[64];
int count = 0;
printf("Enter the file name: ");
scanf("%s", filename);
FILE *fileptr;
fileptr = fopen(filename, "r");
if(fileptr == NULL)
printf("File not found!\n");
while ((fscanf(fileptr, " %s ", words))> 0)
{
if (words==' ' || words == '\n')
count++;
}
printf("%s contains %d words.\n", filename, count);
return 0;
}
I keep getting this error:
warning: comparison between pointer and integer [enabled by default]
if (words==' ' || words == '\n')
^
I don't get the error once I change, words to *words but that does not give me the correct results. I am trying count the number of words in a file.
The comparison is not necessary, because a word read with %s never contains whitespace (e.g. ' ' or '\n').
Try this:
while (fscanf(fileptr, "%s", words)> 0) {
count++;
}
words is a char array, which decays to a char pointer in the comparison, while ' ' is a char; *words is the same as words[0].
Usually we would define a new pointer, as below:
char *p = words;
while(*p != '\0' )
{
// using *p something you need to do
p++;
}
There is no string type in C. Every string (or string literal) is an array of chars. Use strcmp.
Take care: using the array name words by itself yields a pointer to the first element of the array. If what you need is to compare two strings in C, then strcmp is what you are looking for.
You cannot compare strings with == in C. They have to be compared character by character, which is what the standard library function strcmp does. Here's its prototype, declared in the string.h header.
int strcmp(const char *s1, const char *s2);
The strcmp function compares the two strings s1 and s2. It returns an integer less than, equal to, or greater than zero if s1 is found, respectively, to be less than, to match, or be greater than s2.
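As a small illustration of that return value, the sketch below counts how many strings in an array are equal to a target. count_matches() is a hypothetical helper written for this example only.

```c
#include <string.h>

/* Hypothetical helper: counts how many of words[0..n) are equal to `target`. */
size_t count_matches(const char *const words[], size_t n, const char *target)
{
    size_t i, count = 0;

    for (i = 0; i < n; i++)
        if (strcmp(words[i], target) == 0)  /* 0 means the strings match */
            count++;
    return count;
}
```

Writing words[i] == target instead would compare the two addresses, not the characters they point to.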
The fscanf format string " %s " (note the leading and trailing spaces) reads and discards any amount of whitespace before the word, which the format string "%s" does anyway, and after it. This means no whitespace will be written into the buffer words by fscanf. fscanf writes only non-whitespace characters into words and stops when it encounters whitespace. So, to count the number of words, just increase the counter for each successful fscanf call.
Also, your program should guard against buffer overflow in the scanf and fscanf calls. If the input string is too big for the buffer, this causes undefined behaviour and may crash the program with a segfault. You can guard against it by changing the format string. scanf("%63s", filename); means scanf will read from stdin until it encounters whitespace, write at most 63 non-whitespace characters into the buffer filename, and then add a terminating null byte at the end.
#include <stdio.h>
#include <string.h>
int main(void) {
// assuming max word length is 256
// +1 for the terminating null byte added by scanf
char words[256 + 1];
// assuming max file name length is 64
// +1 for the terminating null byte
char filename[64 + 1];
int count = 0; // counter for number of words
printf("Enter the file name: ");
scanf("%64s", filename);
FILE *fileptr;
fileptr = fopen(filename, "r");
if(fileptr == NULL) {
printf("File not found!\n");
return 1; // calling fscanf() on a NULL stream is undefined behaviour
}
while((fscanf(fileptr, "%256s", words)) == 1)
count++;
printf("%s contains %d words.\n", filename, count);
return 0;
}

Varying number of elements in scanf()

How do I tell my program how to vary how many elements are to be read by scanf? I want it to read each character in a string, and the length of the string may vary from one character to a hundred characters. I know I can do scanf("%c%c%c%c...") a hundred times but is there an easier way to do this?
Sure, use fgets() with an appropriately sized buffer:
char buf[LINE_MAX];
if (fgets(buf, sizeof buf, stdin) != NULL) {
// input is now in `buf'
}
If you really can't use arrays, then call getchar() until it finds a newline:
int sum = 0;
int ch;
while ((ch = getchar()) != EOF && ch != '\n') {
sum += ch;
}
(This already does what you want, i.e. it sums the character codes of the string the user enters.)
You can do it this way:
char A[105];
scanf("%104s", A);
printf("%s\n", A);
This way you can read strings of various lengths (up to 104 characters, ending at the first whitespace). If you input "abc" then the result is "abc".

How would using scanf like gets work?

What would be the best way to imitate the functionality of gets with scanf?
Here is my current attempt
int main()
{
char cvalue[20]; //char array to store input string
int iloop=0; //integer variable for loop
for(iloop=0;iloop<20;iloop++) // for loop to get the string char by char
{
scanf("%c",&cvalue[iloop]); //getting input
if(cvalue[iloop]=='\n') //if input is newline skip further looping
break;
} // end of loop
cvalue[iloop]='\0'; //set end of the character for given input
printf("%s",cvalue); //printing the given string
return 0;
}
You could use scanf this way to work like gets
scanf("%[^\n]", a); // a is the destination char array; no & is needed
You need to observe the usual dangers of gets().
The challenges in using scanf() are
1) Ensuring that \n is consumed. scanf("%[^\n]",... does not do this.
2) Ensuring that str gets a \0 if only a \n is read.
3) Dealing with EOF and I/O errors, returning 0 in those cases.
4) Ensuring that leading whitespace is read into str, since scanf("%s" skips it.
#include <stdio.h>
// On success, the gets() returns str.
// If EOF encountered, the eof indicator is set (feof).
// If this happens before any characters could be read,
// pointer returned is a null pointer.
// If a read error occurs, the error (ferror) is set
// and a null pointer is also returned.
char *gets_via_scanf( char * str ) {
// Reads characters from stdin & saves them into str until \n or the end-of-file.
// \n, if found, is not copied into str.
int retval = scanf("%[^\n]",str); // %[ does not skip leading whitespace
if (retval == EOF) return 0;
if (retval == 0) {
*str = '\0'; // Happens when users only types in \n
}
char ch;
scanf("%c",&ch); // Consume leftover \n, could be done with getc()
return str;
}
Your attempt doesn't really imitate gets(), since gets() just keeps putting bytes into the supplied buffer until the end of line is reached. You should realize then that gets() is dangerous and should be avoided. It does not offer any protection from buffer overflow. So, it is also questionable to imitate it.
Given that, your attempt has a couple of flaws that I see. First, it loops to the complete size of the input buffer. This doesn't leave you any room to store the NUL terminator if the input line is 20 bytes or longer. This means that you may attempt to store the \0 at cvalue[20], which is outside the array boundary. You can fix this by shortening your for loop by one:
for(iloop=0;iloop<19;iloop++) // for loop to get the string char by char
The second flaw is that you do not check to see if the scanf() call succeeds. If you detect failure, you should also leave the loop:
if (scanf("%c",&cvalue[iloop]) != 1) { //getting input
break;
}
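Putting the two fixes together might look like the sketch below. read_line_c() is a hypothetical name, and reading from a FILE * parameter with fscanf() instead of calling scanf() directly is an assumption made so that the function is not tied to stdin.

```c
#include <stdio.h>

/* Hypothetical helper: reads characters one at a time with fscanf("%c")
 * until a newline, the end of input, a read failure, or until size - 1
 * characters have been stored. A newline that is read is consumed but not
 * stored. Returns the number of characters stored. */
size_t read_line_c(FILE *in, char *buf, size_t size)
{
    size_t i = 0;
    char c;

    if (size == 0)
        return 0;   /* no room even for the terminator */

    /* Fix 1: stop at size - 1 so the terminator always fits.
     * Fix 2: leave the loop as soon as fscanf() fails. */
    while (i < size - 1 && fscanf(in, "%c", &c) == 1 && c != '\n')
        buf[i++] = c;

    buf[i] = '\0';  /* i <= size - 1, so this is always in bounds */
    return i;
}
```

If a line is longer than the buffer, the unread remainder stays in the stream and is returned by the next call, which is a design choice of this sketch rather than something gets() does.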
Below was my attempt at creating a safer version of gets() implemented with scanf().
#include <limits.h> // CHAR_BIT
#include <stdio.h>
char *getsn (char *s, size_t sz) {
char c;
char fmt[sizeof(sz) * CHAR_BIT + sizeof("[^\n]")];
if (sz == 0) return 0;
if (sz == 1) {
s[0] = '\0';
return s;
}
s[sz-2] = '\0';
snprintf(fmt, sizeof(fmt), "%%%lu%s", (unsigned long)sz-1, "[^\n]");
switch (scanf(fmt, s)) {
case 0: s[0] = '\0';
scanf("%c", &c);
return s;
case 1: scanf("%c", &c);
if (s[sz-2] != '\0' && c != '\n') {
ungetc(c, stdin);
}
return s;
default: break;
}
return 0;
}
The safer version uses snprintf() to create a format string that limits how many characters should be stored by the scanf(). So if the provided sz parameter was 100, the resulting format string would be "%99[^\n]". Then, it makes sure to only strip out the \n from the input stream if it was actually encountered.

Resources