C sscanf to validate input format - c

I want to write a program which reads a line of input from the user, in the following format: <Name>,<Age>,<City>
The Name can contain English letters, spaces and - only.
The Age must be an integer between 18 and 120.
The City must contain English letters and - only.
Each of them can be of size 49.
I want to store the information, and an informative error should be printed for bad input.
My code is the following:
char str[150];
char input[3][50] = { 0 };
int num = 0;
if (fgets(str, 150, stdin) != NULL) {
num = sscanf(str, "%[a-zA-Z -],%[0-9],%[a-zA-Z-]", input[0], input[1], input[2]);
}
if (num < 3) {
printf("ERROR\n");
}
The problem is that no error is printed for an input such as Name1$#,20,NY, Best,19,Rome123, or Best,100,Paris1$, where the city is in the wrong format (with trailing characters). Is there any way to solve it using sscanf?

You can use sscanf() and character classes for your purpose but there are small problems with your format string:
A-Z assumes ASCII encoding or at least an encoding where letters are contiguous.
a - between two other characters in the set has an implementation-defined meaning (usually a range); put the dash in first position to match a dash explicitly.
there is no length prefix, so a name or city longer than 49 characters will overflow the destination array.
Rather than using fgets(), you should read the line manually to detect overlong lines.
You can add an extra %c to check for extra characters at the end of the line. Storing the converted value is not required if you don't intend to use the field values, but you must convert the number to check if its value is in the requested range:
char str[150];
char name[50];
char city[50];
char agestr[4];
size_t i;
int c, age, pos, n;
for (i = 0; (c = getchar()) != EOF && c != '\n'; i++) {
if (i < sizeof(str) - 1)
str[i] = (char)c;
}
if (c == EOF && i == 0) {
printf("end of file\n");
return -1;
}
if (i >= sizeof(str)) {
printf("line too long\n");
return 0;
}
str[i] = '\0';
pos = 0;
/* validate the name */
if (sscanf(str + pos, "%49[-a-zA-Z ]%n", name, &n) != 1 || str[pos + n] != ',') {
printf("invalid name\n");
return 0;
}
pos += n + 1;
/* validate the age */
if (str[pos] == '0' || sscanf(str + pos, "%3[0-9]%n", agestr, &n) != 1 || str[pos + n] != ',') {
printf("invalid age\n");
return 0;
}
age = atoi(agestr);
if (age < 18 || age > 120) {
printf("age out of range: %d\n", age);
return 0;
}
pos += n + 1;
/* validate the city */
if (sscanf(str + pos, "%49[-a-zA-Z]%n", city, &n) != 1 || str[pos + n] != '\0') {
printf("invalid city\n");
return 0;
}
/* Input was validated... proceed */
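The same checks can also be wrapped in one function that works on a line already held in memory. This is only a sketch: the name validate_record and its parameters are hypothetical, and unlike the code above it also accepts one trailing newline so that it can be fed a line read with fgets().

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 if `line` is "<Name>,<Age>,<City>" with
   nothing else on the line, 0 otherwise. `name` and `city` need room for
   50 bytes each. */
int validate_record(const char *line, char name[50], int *age, char city[50])
{
    char agestr[4];
    int n1 = 0, n2 = 0, n3 = 0;

    /* %n records how many characters were consumed so far; it does not
       count towards sscanf's return value. */
    if (sscanf(line, "%49[-a-zA-Z ]%n", name, &n1) != 1 || line[n1] != ',')
        return 0;
    line += n1 + 1;

    /* Reject a leading zero, then require 1 to 3 digits followed by ','. */
    if (line[0] == '0' ||
        sscanf(line, "%3[0-9]%n", agestr, &n2) != 1 || line[n2] != ',')
        return 0;
    *age = atoi(agestr);
    if (*age < 18 || *age > 120)
        return 0;
    line += n2 + 1;

    if (sscanf(line, "%49[-a-zA-Z]%n", city, &n3) != 1)
        return 0;
    /* Accept an optional trailing newline (as left by fgets), nothing else. */
    if (line[n3] == '\n')
        n3++;
    return line[n3] == '\0';
}
```

Because every field is checked against the character that follows it, inputs such as Best,19,Rome123 are rejected at the city step instead of being silently truncated.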

Related

Read an input that is separated by spaces, parenthesis, and commas with scanf() and fgets()

I have the following input:
1 (2 ,3 ,4) lantern
The number of int inputs between the parenthesis is unknown, and could extend for a while.
My original thought was to scanf() the first int, then create a while loop to determine when the closing parenthesis is scanned. Then finally use fgets() to get the string at the end, something similar to this.
scanf("%d", &address); //first input
scanf("%c", &paren); //scan the '(' or ',' or ')'
int current_room = 0; //index for array inside parenthsis
while(paren == '(' || paren == ','){
scanf("%d,", adjoined_room[current_room]); //scan am int
scanf("%c", &paren); //scan a ',' or ')'
current_room++; //increase the index
}
This however prints the following output when I print my address, array, and string:
Address: 1
Item: (2 ,3 ,4) lantern
The inputted ints between the parenthesis were never set to the array. Is there a better way to determine when ')' is inputted?
The problem is that scanf("%c", will read the very next character in the input, without skipping any whitespace. If you want to skip whitespace, you need a space in the format, e.g. scanf(" %c",. You should also check the scanf return value to make sure that you got an integer.
Adding that to your code gives you something like:
if (scanf("%d", &address) != 1) { //first input
fprintf(stderr, "syntax error\n");
return; // not an integer -- do something else
}
scanf(" %c", &paren); //scan the '(' or ',' or ')'
int current_room = 0; //index for array inside parenthsis
while(paren == '(' || paren == ','){
if (scanf("%d", &adjoined_room[current_room]) == 1) { //scan an int (note the &, assuming an int array)
current_room++; //increase the index
}
scanf(" %c", &paren); //scan a ',' or ')'
if (paren != ',' && paren != ')') {
fprintf(stderr, "syntax error\n");
return;
}
}
If you want to do this with interactive input, you should probably use fgets or getline to read entire lines and sscanf to parse each line independently so you don't confuse your user when there's an error in the middle of a line. The "read line + sscanf" is also very useful if you have a number of different patterns that you want to try (sscanf on the same line with different formats to find the first one that matches).
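A minimal sketch of the "read line + sscanf" approach for this input format, assuming the line has already been read with fgets. The function name parse_room_line, its parameters and the 64-byte item buffer are made up for illustration, and the sketch requires at least one number between the parentheses.

```c
#include <stdio.h>

/* Hypothetical helper. Parses "ADDRESS (N ,N ,...) ITEM" from one line.
   Returns the number of integers found between the parentheses,
   or -1 on a syntax error. `item` must have room for 64 bytes. */
int parse_room_line(const char *line, int *address,
                    int *rooms, int max_rooms, char item[64])
{
    int n = 0, count = 0;
    char sep;

    /* A space in the format skips any amount of whitespace. If the '('
       is missing, %n is never reached and n stays 0. */
    if (sscanf(line, "%d ( %n", address, &n) != 1 || n == 0)
        return -1;
    line += n;

    for (;;) {
        if (count >= max_rooms)
            return -1;                      /* more numbers than fit */
        n = 0;
        /* One integer, optional whitespace, then ',' or ')'. */
        if (sscanf(line, "%d %c%n", &rooms[count], &sep, &n) != 2)
            return -1;
        count++;
        line += n;
        if (sep == ')')
            break;
        if (sep != ',')
            return -1;
    }

    /* The rest of the line is the item name (one word in this sketch). */
    if (sscanf(line, "%63s", item) != 1)
        return -1;
    return count;
}
```

Since the whole line is in memory, a syntax error never leaves half-consumed input behind; the caller can simply report it and read the next line.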
scanf should never be used. Ever. But....you might try something like:
#include <err.h>
#include <stdio.h>
#include <stdlib.h>
void * xrealloc(void *buf, size_t num, size_t siz);
int
main(void)
{
size_t cap = 4;
char buf[1024];
int *x = xrealloc(NULL, cap, sizeof *x);
if( scanf("%d ( %d", x, x + 1) != 2 ){
errx(EXIT_FAILURE, "Invalid input");
}
size_t n = 2; /* number of values stored so far */
while( scanf(" ,%d", x + n) == 1 ){ /* leading space skips the blank before each comma */
if( ++n == cap ){
cap += 4;
x = xrealloc(x, cap, sizeof *x); /* may move the array, so keep an index rather than a pointer */
}
}
if( scanf(" )%1023s", buf) != 1 ){
errx(EXIT_FAILURE, "Invalid input");
}
for( size_t i = 0; i < n; i += 1 ){
printf("x[%zu] = %d\n", i, x[i]);
}
printf("%s\n", buf);
return 0;
}
void *
xrealloc(void *buf, size_t num, size_t siz)
{
char *b = buf;
b = realloc(b, num * siz);
if( b == NULL ){
perror("realloc");
exit(EXIT_FAILURE);
}
return b;
}
This does not correctly handle input with a trailing comma like: 1 (2 ,3 ,4, ) lantern, and I'm sure there are many other inputs that it does not like. Exercise left for the reader.
You probably don't want to use an initial capacity as small as 4, but it's convenient for simple testing.
This may not be the most popular answer, and it may or may not help your immediate goals, but I am of the philosophy to read input as a stream of bytes and parse via (crude or sophisticated) state machine:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#define process_word(x) (printf("Got string \'%s\'\n", x))
#define process_number(x) (printf("Got number %lu\n", strtoul(x, NULL, 10)))
int main(void) {
int c;
int depth = 0;
size_t i;
char digitbuffer[256];
char alphabuffer[256];
while ((c = fgetc(stdin)) != EOF) {
switch (c) {
case ' ':
case ',':
break;
case '(':
depth++;
break;
case ')':
if (depth == 0) fprintf(stderr, "Mismatched parenthesis, skipping\n");
else depth--;
break;
default:
if (isalpha(c)) {
memset(alphabuffer, 0, 256);
alphabuffer[0] = c;
i = 1;
while (i < 255 && /* check the limit first so no character is read and then dropped */
(c = fgetc(stdin)) != EOF &&
isalpha(c)) {
alphabuffer[i++] = c;
}
if (!isalpha(c) && c != EOF) ungetc(c, stdin);
process_word(alphabuffer);
}
else if (isdigit(c)) {
memset(digitbuffer, 0, 256);
digitbuffer[0] = c;
i = 1;
while (i < 255 && /* check the limit first so no character is read and then dropped */
(c = fgetc(stdin)) != EOF &&
isdigit(c)) {
digitbuffer[i++] = c;
}
if (!isdigit(c) && c != EOF) ungetc(c, stdin);
process_number(digitbuffer);
}
break;
}
}
return 0;
}
This gives you the most control over handling your specific data format, in my opinion.
You can define your own process_word() and process_number() functions, of course. process_number() might assign the number to the address field of a record if depth == 0, for example, or add it to adjoined_room[] if depth == 1. process_word() might add the string to the item field of the same record. Completely up to you. ¯\_(ツ)_/¯

palindrome c program is not working for some reason

This program checks whether the entered string is a palindrome or not. It should still report the string as a palindrome if it contains spaces or special characters,
like messi is a palindrome of iss em
and ronald!o is a palindrome of odlanor.
This is the program, and for some odd reason it gets stuck and does not work:
#include <stdio.h>
#include <string.h>
int main() {
char palstr[100], ans[100];
printf("enter the string for checking weather the string is a palindrome or not");
scanf("%[^/n]", &palstr);
int ispalin = 1, i = 0, n = 0;
int num = strlen(palstr);
printf("the total length of the string is %d", num);
while (i <= num) {
if (palstr[i] == ' ' || palstr[i] == ',' || palstr[i] == '.' ||
palstr[i] == '!' || palstr[i] == '?') {
i++;
}
palstr[n++] == palstr[i++];
}
int j = num;
i = 0;
while (i <= num) {
ans[j--] = palstr[i];
}
printf("the reverse of the string %s is %s", palstr, ans);
if (ans == palstr)
printf("the string is a palindrome");
else
printf("the string is not a palindrome");
return 0;
}
A few points to consider. First, regarding the code:
if (ans == palstr)
This is not how you compare strings in C, it compares the addresses of the strings, which are always different in this case.
The correct way to compare strings is:
if (strcmp(ans, palstr) == 0)
Second, you should work out the length of the string after you have removed all unwanted characters since that's the length you'll be working with. By that I mean something like:
char *src = palstr, *dst = palstr;
while (*src != '\0') {
if (*src != ' ' && *src != ',' && *src != '.' && *src != '!' && *src != '?') {
*dst++ = *src;
}
src++;
}
*dst = '\0'; // terminate the compacted string
Third, you have a bug in your while loop anyway in that, if you get two consecutive bad characters, you will only remove the first (since your if does that then blindly copies the next character regardless).
Fourth, you may want to consider just stripping out all non-alpha characters rather than that small selection:
#include <ctype.h>
if (isalpha((unsigned char)*src)) { // keep letters only
*dst++ = *src;
}
Fifth and finally, you don't really need to create a new string to check for a palindrome (though you may still need to if you want to print the string in reverse), you can just start at both ends and move inward, something like:
char *left = palstr, *right = palstr + strlen(palstr) - 1; // assumes a non-empty string
int ispalin = 1;
while (left < right) {
if (*left++ != *right--) {
ispalin = 0;
break;
}
}
There may be other things I've missed but that should be enough to start on.
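Putting the last two points together, a self-contained check could look like the sketch below. The function name is_palindrome is hypothetical, and ignoring letter case is an added assumption that the question does not state.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: returns 1 if `s` reads the same in both directions
   once everything except letters is ignored, 0 otherwise.
   Case-insensitive comparison is an assumption of this sketch. */
int is_palindrome(const char *s)
{
    size_t left = 0, right = strlen(s);   /* `right` is one past the end */

    while (left < right) {
        /* The <ctype.h> functions need a value representable as unsigned char. */
        if (!isalpha((unsigned char)s[left])) {
            left++;                        /* skip non-letter on the left */
        } else if (!isalpha((unsigned char)s[right - 1])) {
            right--;                       /* skip non-letter on the right */
        } else {
            if (tolower((unsigned char)s[left]) !=
                tolower((unsigned char)s[right - 1]))
                return 0;
            left++;
            right--;
        }
    }
    return 1;
}
```

Skipping unwanted characters while walking inward means the input string is never modified and no second buffer is needed.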
Well, there are many bugs in this code. I will point them out with comments.
#include <stdio.h>
#include <string.h>
int main() {
char palstr[100], ans[100];
printf("enter the string for checking weather the string is a palindrome or not\n");
scanf("%99[^\n]", palstr); // your "%[^/n]" stops only at '/' or 'n', not at the newline; 99 keeps palstr from overflowing
int ispalin = 1, i = 0, n = 0;
int num = strlen(palstr);
printf("the total length of the string is %d\n", num);
while (i < num) { // < instead of <=
if (palstr[i] == ' ' || palstr[i] == ',' || palstr[i] == '.' ||
palstr[i] == '!' || palstr[i] == '?') {
i++;
continue; // without this, a mark that directly follows another mark is still copied
}
palstr[n++] = palstr[i++]; //should be =
}
palstr[n] = '\0'; //
num = n; // the length might be changed
i = 0;
int j = num-1; // reverse
while (i < num) { //
ans[i++] = palstr[j--]; //
}
ans[i] = '\0'; //
printf("the reverse of the string %s is %s\n", palstr, ans);
//if (ans == palstr) they can never be equal
if (strcmp(ans, palstr)==0)
printf("the string is a palindrome\n");
else
printf("the string is not a palindrome\n");
return 0;
}

ensure that scanf only reads dd.mm.yyyy

I am looking for a solution to my problem.
I want to scanf a date (dd.mm.yyyy). I need to make sure the input is in this format, with only 0 < day < 31 ; 0 < month < 13 ; 2018 < year .
For the length of the task, I do it like this:
printf("Please typ in the Task: \t");
scanf("%s", &what);
while (strlen(what) >= MAX) {
clearScanf();
printf("The task must contain a maximum of %d :\t", MAX - 1);
scanf("%s", &what);
}
But I don't know how to ensure that my
printf("Pls put in the Deadline (dd.mm.yyyy): \t");
scanf("%s", when);
won't take letters, but still uses the '.' in between.
After the scanf, I want to pass everything back to my structure with:
strcpy(temp->name, what);
strcpy(temp->deadline, when);
temp->next = pointer;
But I don't know how to give back month, year and day separately.
Using scanf + sscanf:
int day, month, year;
for(;;) /* Infinite loop */
{
scanf("%s", when);
char extra;
/* A clean date converts exactly 3 items; a 4th item means extra characters were typed after the date */
if(sscanf(when, "%2d.%2d.%4d%c", &day, &month, &year, &extra) != 3)
{
fputs("Invalid format!\n", stderr);
clearScanf(); /* Assuming this function of yours clears the stdin */
}
else if(!(0 < day && day <= 31) || /* Valid range checks */
!(0 < month && month <= 12) ||
!(2018 < year))
{
fputs("Invalid date!\n", stderr);
}
else
{
break;
}
}
What this does is, it tells scanf to first scan a string and then extracts data from it using sscanf.
The sscanf first extracts up to 2 digits, then a dot, again up to two digits, a dot and then up to 4 digits, and finally tries to read one more character. That character only exists if the user typed more characters after the date.
sscanf returns the number of items successfully scanned and assigned. Since %s stops at whitespace, the string never contains the newline, so a correctly formatted date makes sscanf return 3; a return value of 4 means something followed the date.
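A variant of the same idea, sketched under a few assumptions: the line is read with fgets (so it may end in a newline), the function name parse_date is hypothetical, and the year limit follows the question (2018 < year). The %n directive is used to find out where the date ends.

```c
#include <stdio.h>

/* Hypothetical helper: parses "dd.mm.yyyy" from a line read with fgets().
   Returns 1 and fills day/month/year on success, 0 otherwise.
   Single-digit days and months are accepted, and the day check is
   deliberately simple: it ignores the different month lengths. */
int parse_date(const char *line, int *day, int *month, int *year)
{
    int d, m, y, end = 0;

    /* %n stores the number of characters consumed; it stays 0 if the
       conversions before it fail. */
    if (sscanf(line, "%2d.%2d.%4d%n", &d, &m, &y, &end) != 3)
        return 0;
    /* Allow the newline kept by fgets, but nothing else after the date. */
    if (line[end] == '\n')
        end++;
    if (line[end] != '\0')
        return 0;
    if (d < 1 || d > 31 || m < 1 || m > 12 || y <= 2018)
        return 0;
    *day = d;
    *month = m;
    *year = y;
    return 1;
}
```

The three integers can then be stored in separate fields of the structure instead of keeping the deadline only as a string.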
Write your own format checking function:
#include <ctype.h>
#include <stdbool.h>
#include <stdio.h>
/* Returns true only for exactly "dd.mm.yyyy": digits and dots, nothing else */
bool is_correctly_formatted(const char* s) {
if(!isdigit((unsigned char)s[0])) return false;
if(!isdigit((unsigned char)s[1])) return false;
if('.' != s[2]) return false;
if(!isdigit((unsigned char)s[3])) return false;
if(!isdigit((unsigned char)s[4])) return false;
if('.' != s[5]) return false;
if(!isdigit((unsigned char)s[6])) return false;
if(!isdigit((unsigned char)s[7])) return false;
if(!isdigit((unsigned char)s[8])) return false;
if(!isdigit((unsigned char)s[9])) return false;
if(0 != s[10]) return false;
return true;
}
You can then use it like:
#define MAX 10
int main(void) {
/* Space for MAX chars + one extra char (to detect overlong input) + terminal zero */
char when[MAX + 1 + 1] = {0};
/* Ensure that scanf reads in at most as many chars as can be safely written to `when` */
char format[MAX + 2 + 1] = {0};
snprintf(format, sizeof(format), "%%%ds", MAX + 1);
do {
printf("Please type in the deadline (dd.mm.yyyy): ");
if(scanf(format, &when[0]) != 1) return 1; /* end of input */
printf("Entered %s\n", when);
} while (! is_correctly_formatted(when));
}
This enables you to be as flexible in what you expect as possible. You could even resort to using some kind of regex library.
Note: scanf's %s stops at whitespace, so a string read this way does not contain the trailing line feed. If you read the line with fgets instead, you have to remove the line feed before checking.

C program that stops running due to a function be called?

This might be difficult to explain. I am working on a program that takes in a file with numbers in it. The first two numbers are the dimensions of a matrix, rows and then columns. The rest of the numbers are the elements of the matrix. What I am having trouble with is that after I created a function to read a number from a given C-style string, the program stops doing anything. It compiles and runs, but nothing is ever done, not even printing the first line after main.
proj2.c
#include <stdio.h>
#include <math.h>
#include <stdlib.h>
float readNum(char* buffer, int *pos);
int main(){
char buffer[512];
printf("Enter File Name: ");
//char* fileName = fgets(buffer, sizeof(buffer), stdin);
FILE* file = fopen("matrix.txt", "r");
if(file == NULL){
printf("ERROR COULD NOT OPEN FILE\n");
exit(1);
}
int row = 0;
int col = 0;
int rowcheck = 0;
int colcheck = 0;
int matrixcheck = 0;
while(!feof(file)){
printf("HELLO");
if(fgets(buffer,sizeof(buffer),file) != NULL){
//position of current character
int pos = 0;
//current character
char current;
//loop to determine the dimensions of the matrix
if(colcheck == 0 && rowcheck == 0){
while(colcheck == 0 || rowcheck == 0){
//set current character
current = buffer[pos];
//determine if current character is a number and that the nex character is a space
//for single digit row dimensions
if(current >= '0' && current <= '9' && buffer[pos+1] == ' ' && rowcheck == 0){
row += current - '0';
rowcheck = 1;
}
//if not single digit row dimension add the current character times 10
//and repeat loop to obtain the second digit
else if (buffer[pos+1] >= '0' && buffer[pos+1] <= '9' && rowcheck == 0){
row += (current - '0') * 10;
}
//for columns check if current character is a number and if the next character is space or newline
//and that row has already been checked
else if(current >= '0' && current <= '9' && (buffer[pos+1] == ' ' || buffer[pos+1] == 10) && rowcheck == 1){
col += current - '0';
colcheck = 1;
}
//final check for if columns is double digit so check if next char is a number and that current char is
//not a space
else if(buffer[pos] != ' ' && buffer[pos+1] >= '0' && buffer[pos+1] <= '9' && rowcheck == 1){
col += (current - '0' ) * 10;
}
pos++;
printf("rows: %d cols: %d", row,col);
}
}
//condition to ensure columns and rows have been determined
else if(colcheck == 1 && rowcheck == 1){
//loop to find the elements of the matrix
while(matrixcheck == 0){
current = buffer[pos];
if(buffer[pos + 1] != 10){
if((current >= '0' && current <= '9') || current == '-' || current == '.'){
float num = readNum(buffer, &pos);
printf("number: %f", num);
}
}
}
}
}
}
fclose(file);
}
and readNum.c
#include <stdio.h>
#include <math.h>
float readNum(char* buffer,int *pos){
int negative = 1;
int y = 0;
float number = 0;
if(buffer[*pos] == '-'){
negative = -1;
(*pos)++;
}
while(buffer[*pos + y] >= '0' && buffer[*pos + y] <= '9'){
y++;
}
for(int z = 0; z < y; z++){
number += (buffer[*pos + z] - 48) * pow(10, y - z - 1);
}
*pos += y;
if(buffer[*pos] == '.'){
(*pos)++;
int d = 0;
while(buffer[*pos + d] >= '0' && buffer[*pos + d] <= '9'){
if(buffer[d + *pos] == '.'){
printf("ERROR: multiple decimals in an element");
}
d++;
}
for(int z = 0; z < d; z++){
number += (buffer[z + *pos] - '0') * pow(10, -z - 1);
}
pos += d;
}
return number * negative;
}
Commenting out the lines
float num = readNum(buffer, &pos);
printf("number: %f", num);
allows the program to run normally, but with them uncommented it just stops doing anything. In Eclipse the console stays blank while the program keeps running, and I terminate it after a bit because nothing is happening; not even the first line is printed.
This is a sample file that is being read:
3 2
56 12 98 25
34.5
45
Thank you in advance
SOLUTION has been found. I'm not sure if everyone understood what exactly is happening in the program: main appeared not to run at all, and the first line would not print anything. The solution to this was to call fflush(stdout) after the first print statement.
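The symptom fits how stdout is buffered: when output goes to a pipe or an IDE console rather than a terminal, stdout is usually fully buffered, so text printed without a newline or flush can stay in the buffer while the program blocks or loops. A minimal sketch, with a hypothetical prompt helper:

```c
#include <stdio.h>

/* Hypothetical helper: prints a prompt and flushes it so that it becomes
   visible immediately, even when stdout is fully buffered.
   Returns 0 on success, EOF on a write error. */
int prompt(const char *text)
{
    if (fputs(text, stdout) == EOF)
        return EOF;
    return fflush(stdout);
}
```

Calling prompt("Enter File Name: ") in place of the first printf makes the text appear before any later loop starts. It does not fix a loop that never terminates; it only makes the output that was already produced visible.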
Parsing the file character by character is way too complicated when you are
trying to read floats. Use the function provided by the standard library.
Your code can yield undefined behaviour, because you don't check the boundaries
of buffer, for example:
if(current >= '0' && current <= '9' && buffer[pos+1] == ' ' && rowcheck == 0){
row += current - '0';
rowcheck = 1;
}
You never check if you read the '\0'-terminating byte and keep incrementing
pos, buffer[pos+1] might access beyond the limit. Also I don't understand
how you are really parsing the dimensions. That's why I tell you, don't reinvent
the wheel, use the tools at your disposal.
You say that the dimensions are in the first line, then you can get the
dimension by doing this:
char buffer[512];
if(fgets(buffer, sizeof buffer, file) == NULL)
{
fprintf(stderr, "File is empty\n");
fclose(file);
return 1;
}
size_t cols,rows;
if(sscanf(buffer, "%zu %zu", &rows, &cols) != 2)
{
fprintf(stderr, "Invalid file format, cannot get columns and rows\n");
fclose(file);
return 1;
}
if(rows == 0 || cols == 0)
{
fprintf(stderr, "Invalid dimension %zux%zu\n", rows, cols);
fclose(file);
return 1;
}
Now, you can parse the file like this:
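float matrix[rows][cols]; /* a variable-length array cannot have an initializer */
memset(matrix, 0, sizeof matrix); /* needs #include <string.h> */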
for(size_t i = 0; i < rows; ++i)
{
if(fgets(buffer, sizeof buffer, file) == NULL)
{
fprintf(stderr, "End of file reached before filling matrix\n");
fclose(file);
return 1;
}
int pos;
char *scan = buffer;
for(size_t j = 0; j < cols; ++j)
{
if(sscanf(scan, "%f%n", matrix[i] + j, &pos) != 1)
{
fprintf(stderr, "Invalid format at line %zu\n", i+2);
break; // continue parsing with the next line
}
scan += pos;
}
}
fclose(file);
printf("matrix[%zu][%zu] = %f\n", rows/2, cols/2, matrix[rows/2][cols/2]);
This code is more robust, because it checks if the functions are working as
intended. If no more lines can be read before the matrix is filled, then you
can return an error message and end the program. If the lines don't have the
proper format, I ignore that line and the row is filled with 0 while also
printing an error message. If there are more lines than rows, they are ignored
and you would not overflow the buffers. The intentions are also more clear and
it's easier to understand what I'm doing.
Like I said at the beginning, using the function provided by the standard C
library is better than trying to invent the wheel again. Your code is
complicated and hard to read.
Also see "Why is while(!feof(file)) always wrong?". It's easier to manage the end
of file when using fgets, because fgets returns NULL when no more data can
be read, either because of an I/O error or because the file reached EOF. That's
why my example above always checks the return value of fgets. Note how I use
%n in the scanf format: %n returns the number of characters consumed thus
far from the input, which is a great info when using sscanf in a loop. I also
check if scanf doesn't return the number of matched elements (note that %n
does not increase the number of matched elements). For more information about
this see the documentation of scanf.
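The %n technique can be isolated in a small function. The sketch below uses a hypothetical name, parse_floats, and converts however many numbers a line contains, which also copes with files whose lines do not line up with the matrix rows.

```c
#include <stdio.h>

/* Hypothetical helper: stores up to `max` numbers from `line` in `out`
   and returns how many were converted. */
size_t parse_floats(const char *line, float *out, size_t max)
{
    size_t count = 0;
    int used = 0;

    /* Each call converts one number; %n reports how far sscanf got,
       so the next call can continue from that position. */
    while (count < max && sscanf(line, "%f%n", &out[count], &used) == 1) {
        line += used;
        count++;
    }
    return count;
}
```

The loop stops at the first token that is not a number or at the end of the string, so the caller can compare the returned count with the number of elements it still expects.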
This loop can run forever:
while(buffer[*pos] >= '0' && buffer[*pos] <= '9'){
y++;
}
How can we get out of this loop?:
while(matrixcheck == 0){
current = buffer[pos];
if(buffer[pos + 1] != 10){
if((current >= '0' && current <= '9') || current == '-' || current == '.'){
float num = readNum(buffer, &pos);
printf("number: %f", num);
}
}
}

Only string between 5 and 10 characters inputted? C Programming

My program needs to force the user to input a string of 5-10 characters, with validation to ensure that the length is correct. I have this so far but am completely stuck. Any advice on how to validate the data type so that only strings are allowed to be inputted?
char x[10];
int length, i;
for(i=0;i=10;i=i+1){
printf("Please enter 5-10 Characters\n");
scanf("%s", &x);
length = strlen(x);
if (length < 5){
printf("Not Long Enough!\n");
}
if (length > 10){
printf("Too Long!\n");
}
while('x' == 'char'){
if (scanf("%s", &x) == 1){
return 0;
}else{
printf("Not a string, Try again");
gets(x);
}
}
printf("You inputted: %s\n", x);
}
Various problems in code, so will center on the title topic.
Only string between 5 and 10 characters inputted?
any advice on how to validate the data type to only strings allowed to be inputted?
Use fgets() to read a line of user input. I would use a helper function.
scanf() does not recover well from errant input. Do not use it.
Be sure to use a large enough buffer to hold the 10 characters read and the appended null character. @Weather Vane
// Return 1 on success
// Return -1 on EOF
// Return 0 on failure (too long or too short)
// `x` must be at least size `max + 1`
int read_line_min_max(char *x, size_t min, size_t max) {
if (fgets(x, max + 1, stdin) == NULL) return -1;
size_t len = strlen(x);
if (len > 0 && x[len - 1] == '\n') {
x[--len] = '\0'; // lop off potential \n
} else {
// more data to read, but saving not needed, simply consume it
int ch;
while ((ch = fgetc(stdin)) != '\n' && ch != EOF) {
len++;
}
}
return (len >= min) && (len <= max);
}
Example
#define READ_MIN 5
#define READ_MAX 10
char buffer[READ_MAX + 1];
int result;
do {
printf("Please enter 5-10 Characters\n");
result = read_line_min_max(buffer, READ_MIN, READ_MAX);
} while (result == 0);
while('x' == 'char') is unclear.
Perhaps should be while(strcmp(x, "char") != 0)
