When validating user input through functions in C, do you end up with a lot of if statements that check the 1 or 0 returned from the validating functions?
If you don't understand what I mean, then below is code I typed strictly as an example. It's definitely not being used anywhere else.
#include <stdio.h>
int checkIfZero(int x){
int result = 1;
if (x ==0){
printf ("You typed in zero for your age. Try again.\n\n");
result = 0;
}
return result;
}
int checkUpper(char x){
int result = 1;
if (x > 96){
printf("Initial is not uppercase. Try again.\n\n");
result =0;
}
return result;
}
int main(int argc, const char * argv[])
{
int age;
char initial;
int correct = 0;
do {
int counter; // holds returned result of first function
int counter2; // holds returned result of the second function
printf("Please type your age and the initial of your first name in Uppercase\n");
scanf("%d %c", &age, &initial);
counter = checkIfZero(age);
if(!counter){
continue;
}
counter2 = checkUpper(initial);
if (!counter2){
continue;
}
correct = 1;
printf("Correct\n");
} while (correct==0);
return 0;
}
As you can see, I have 2 functions that validate the inputs. Later, I have to create different variables that will hold the 1 or 0 these functions return, and check them using if statements.
Now let's say I create more, say 10, validating functions.
Does that mean I have to create 10 different variables to catch the returned result of the functions and then type 10 if statements?
I'm okay with that if that's how people usually do it, but is that the case?
You don't have to place the result in a variable:
counter = checkIfZero(numSiblings);
if(!counter){
continue;
}
Can become:
if (!checkIfZero(numSiblings))
continue;
Define a standard of how it works, and stick to it.
But don't mix up validation ("Is the value OK?") with error message output ("fprintf(...)"). If you sometime later want to, say, check a set of three variables and give a single error message ("Illegal name" after checking given, middle, and last name) this will get in the way. Make sure you can redefine messages (i.e., for your future Greek version of the program, or to reuse the "is this a valid UTF-8 string" validation for different input data; think about combining tests, like UTF-8 and only letters) and that the error output stream can be redirected.
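To make the "10 validators" case concrete, here is a minimal sketch of one way to keep validation and messages apart. Everything in it (the enum, struct user_input, the validators[] table, input_error_message()) is made up for illustration and is not from the question: each validator only returns an error code, the messages sit in a single replaceable table, and one loop runs all the checks, so you need neither ten variables nor ten if statements.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical error codes: one per rule, 0 means "valid". */
enum input_error { INPUT_OK = 0, ERR_AGE_ZERO, ERR_INITIAL_NOT_UPPER };

/* Hypothetical bundle of the values read from the user. */
struct user_input { int age; char initial; };

/* Each validator only answers "is the value OK?" and prints nothing. */
static enum input_error check_age(const struct user_input *in) {
    return in->age == 0 ? ERR_AGE_ZERO : INPUT_OK;
}

static enum input_error check_initial(const struct user_input *in) {
    /* Cast to unsigned char: isupper() is undefined for negative values. */
    return isupper((unsigned char)in->initial) ? INPUT_OK : ERR_INITIAL_NOT_UPPER;
}

/* Messages live in one table, so they can be replaced or translated. */
static const char *const messages[] = {
    [INPUT_OK]              = "Correct",
    [ERR_AGE_ZERO]          = "You typed in zero for your age. Try again.",
    [ERR_INITIAL_NOT_UPPER] = "Initial is not uppercase. Try again.",
};

typedef enum input_error (*validator)(const struct user_input *);
static const validator validators[] = { check_age, check_initial };

/* Run every validator in order; return the first failure, or INPUT_OK. */
enum input_error validate(const struct user_input *in) {
    for (size_t i = 0; i < sizeof validators / sizeof validators[0]; i++) {
        enum input_error e = validators[i](in);
        if (e != INPUT_OK) return e;
    }
    return INPUT_OK;
}

/* Look up the text for an error code; the caller decides where to print it. */
const char *input_error_message(enum input_error e) {
    return messages[e];
}
```

The caller would then loop until validate() returns INPUT_OK and print input_error_message(e) with fprintf() to whatever stream it likes. Adding an eleventh check means adding one function and one table entry.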
Perhaps you should look around at largeish open source projects, particularly ones that have to validate varied user input. You should be able to find a framework that you can, er, steal.
I wrote a program that counts and prints the number of occurrences of each character in a string, but it prints a garbage value when I use fgets(); with gets() it doesn't.
Here is my code:
#include<stdio.h>
#include<string.h>
#include<ctype.h>
#include<stdlib.h>
int main() {
char c[1005];
fgets(c, 1005, stdin);
int cnt[26] = {0};
for (int i = 0; i < strlen(c); i++) {
cnt[c[i] - 'a']++;
}
for (int i = 0; i < strlen(c); i++) {
if(cnt[c[i]-'a'] != 0) {
printf("%c %d\n", c[i], cnt[c[i] - 'a']);
cnt[c[i] - 'a'] = 0;
}
}
return 0;
}
This is what I get when I use fgets():
baaaabca
b 2
a 5
c 1
32767
--------------------------------
Process exited after 8.61 seconds with return value 0
Press any key to continue . . . _
I fixed it by using gets() and got the correct result, but I still don't understand why fgets() gives the wrong result.
Hurray! So, the most important reason your code is failing is that your code does not observe the following inviolable advice:
Always sanitize your inputs
What this means is that if you let the user input anything then he/she/it can break your code. This is a major, common source of problems in all areas of computer science. It is so well known that a NASA engineer has given us the tale of Little Bobby Tables:
Exploits of a Mom #xkcd.com
It is always worth reading the explanation even if you get it already #explainxkcd.com
medium.com wrote an article about “How Little Bobby Tables Ruined the Internet”
Heck, Bobby’s even got his own website — bobby-tables.com
Okay, so, all that stuff is about SQL injection, but the point is, validate your input before blithely using it. There are many, many examples of C programs that fail because they do not carefully manage input. One of the most recent and widely known is the Heartbleed Bug.
For more fun side reading, here is a superlatively-titled list of “The 10 Worst Programming Mistakes In History” #makeuseof.com — a good number of which were caused by failure to process bad input!
Academia, methinks, often fails students by not having an entire course on just input processing. Instead we tend to pretend that the issue will be later understood and handled — code in academia, science, online competition forums, etc, often assumes valid input!
Where your code went wrong
Using gets() is dangerous because it does not stop reading and storing input as long as the user is supplying it. It has created so many software vulnerabilities that the C Standard has (at long last) officially removed it from C. SO actually has an excellent post on it: Why is the gets function so dangerous that it should not be used?
But it does remove the Enter key from the end of the user’s input!
fgets(), in contrast, stops reading input at some point! However, it also lets you know whether you actually got an entire line of text by not removing that Enter key.
Hence, assuming the user types: b a n a n a Enter
gets() returns the string "banana"
fgets() returns the string "banana\n"
That newline character '\n' (what you get when the user presses the Enter key) messes up your code because your code only accepts (or works correctly given) minuscule alphabet letters!
The Fix
The fix is to reject anything that your algorithm does not like. The easiest way to recognize “good” input is to have a list of it:
// Here is a complete list of VALID INPUTS that we can histogram
//
const char letters[] = "abcdefghijklmnopqrstuvwxyz";
Now we want to create a mapping from each letter in letters[] to an array of integers (its name doesn’t matter, but we’re calling it count[]). Let’s wrap that up in a little function:
// Here is our mapping of letters[] ←→ integers[]
// • supply a valid input → get an integer unique to that specific input
// • supply an invalid input → get an integer shared with ALL invalid input
//
int * histogram(char c) {
static int fooey; // number of invalid inputs
static int count[sizeof(letters)] = {0}; // numbers of each valid input 'a'..'z'
const char * p = strchr(letters, c); // find the valid input, else NULL
if (p) {
int index = p - letters; // 'a'=0, 'b'=1, ... (same order as in letters[])
return &count[index]; // VALID INPUT → the corresponding integer in count[]
}
else return &fooey; // INVALID INPUT → returns a dummy integer
}
For the more astute among you, this is rather verbose: we can totally get rid of those fooey and index variables.
“Okay, okay, that’s some pretty fancy stuff there, mister. I’m a bloomin’ beginner. What about me, huh?”
Easy. Just check that your character is in range:
int * histogram(char c) {
static int fooey = 0;
static int count[26] = {0};
if (('a' <= c) && (c <= 'z')) return &count[c - 'a'];
return &fooey;
}
“But EBCDIC...!”
Fine. The following will work with both EBCDIC and ASCII:
int * histogram(char c) {
static int fooey = 0;
static int count[26] = {0};
if (('a' <= c) && (c <= 'i')) return &count[ 0 + c - 'a'];
if (('j' <= c) && (c <= 'r')) return &count[ 9 + c - 'j'];
if (('s' <= c) && (c <= 'z')) return &count[18 + c - 's'];
return &fooey;
}
You will honestly never have to worry about any other character encoding for the Latin minuscules 'a'..'z'. Prove me wrong.
Back to main()
Before we forget, stick the required magic at the top of your program:
#include <stdio.h>
#include <string.h>
Now we can put our fancy-pants histogram mapping to use, without the possibility of undefined behavior due to bad input.
int main() {
// Ask for and get user input
char s[1005];
printf("s? ");
fgets(s, 1005, stdin);
// Histogram the input
for (int i = 0; i < strlen(s); i++) {
*histogram(s[i]) += 1;
}
// Print out the histogram, not printing zeros
for (int i = 0; i < strlen(letters); i++) {
if (*histogram(letters[i])) {
printf("%c %d\n", letters[i], *histogram(letters[i]));
}
}
return 0;
}
We make sure to read and store no more than 1004 characters (plus the terminating nul), and we prevent unwanted input from indexing outside of our histogram’s count[] array! Win-win!
s? a - ba na na !
a 4
b 1
n 2
But wait, there’s more!
We can totally reuse our histogram. Check out this little function:
// Reset the histogram to all zeros
//
void clear_histogram(void) {
for (const char * p = letters; *p; p++)
*histogram(*p) = 0;
}
All this stuff is not obvious. User input is hard. But you will find that it doesn’t have to be impossibly difficult genius-level stuff. It should be entertaining!
Another way to handle input is to transform it into acceptable values. For example, you can use tolower() to convert any majuscule letters to your histogram’s input set.
s? ba na NA!
a 3
b 1
n 2
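A minimal sketch of that tolower() idea, assuming the simple range-checked histogram() from earlier (repeated here so the snippet compiles on its own). The name histogram_string_ignore_case() is made up for this example.

```c
#include <ctype.h>

/* Same range-checked mapping as before, repeated so the sketch stands alone. */
int *histogram(char c) {
    static int fooey = 0;        /* shared counter for all invalid input */
    static int count[26] = {0};  /* one counter per letter 'a'..'z' */
    if (('a' <= c) && (c <= 'z')) return &count[c - 'a'];
    return &fooey;
}

/* Count every character of s, folding majuscules onto minuscules first. */
void histogram_string_ignore_case(const char *s) {
    for (; *s; s++) {
        /* Cast to unsigned char: tolower() is undefined for negative values. */
        *histogram((char)tolower((unsigned char)*s)) += 1;
    }
}
```

Spaces and punctuation still land in the dummy counter, so only the letters show up when you print the histogram.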
But I digress again...
Hang in there!
I am looking for a way to take a string and check 3 possibilities.
Digit and thus converts it to a signed int (not a long)
Is a symbolic representation previously defined at runtime, and converts it to a signed int
Neither
The "symbolic representation" will basically be like an associative array that starts with 0 elements and expands as more symbols are added. For example, let's say that C had associative arrays (I wish), with this pseudocode:
symbol_array['q'] = 3;
symbol_array['five'] = 5;
symbol_array['negfive'] = -5;
symbol_array['random294'] = 28;
signed int i;
string = get_from_input();
if(!(i = convert_to_int(string))) {
if(!(i = translate_from_symbol(string))) {
printf("Invalid symbol or integer\n");
exit(1);
}
}
printf("Your number: %d\n", i);
The idea being if they entered "5" it would convert it to 5 via convert_to_int, and if they entered "five" it would convert it to 5 via translate_from_symbol. As what I feel may be hardest is if they entered "random294" it wouldn't convert it to 294, but to 28. If they entered "foo" then it would exit(1).
My general questions are these: (Instead of making multiple posts)
When making convert_to_int I know I shouldn't use atoi because it doesn't fail right. Some people say to use strtol but it seems tedious to convert it back to a non-long int. The simplistic (read: shortest) way I've found is using sscanf:
int i;
if ((sscanf(string, "%d", &i)) == 1){
return i;
}
However, some people look down on that even. What is a better method if not sscanf or converting strtol?
Secondly, how can I not only return an integer but also know if it found one. For example if the user entered "0" then it would return 0, thus setting off my FALSE in my if statement. I had considered using -1 if not found but since I am returning signed int's then this also suffers from the same problem. In PHP I know for example with strpos they use === FALSE
Finally, is there any short code that emulates associative arrays and/or lets me push elements onto the array at runtime?
First, you might want to revise your syntax and set the keyword apart from the operand, i.e. "neg five" instead of "negfive". Otherwise your symbol lookup for the keywords has to consider every prefix. ("random294" might be okay if your keywords aren't allowed to have digits in them.)
Sure, sscanf tells you in its return value whether it found a decimal number and writes that number to a separate int, which is nice, but you'll have to watch out for trailing characters by checking, with the %n format, that the number of characters read equals the length of your string. Otherwise, sscanf will consider 5x a legal decimal number. strtol also returns a pointer to the location after the parsed decimal number, but it relies too much on checking errno for my taste.
The fact that strtol uses long integers shouldn't be an issue. If the input doesn't fit into an int, return INT_MAX or INT_MIN or issue an error.
You can also easily write a wrapper function around sscanf or strtol that suits your needs better. (I know I'd like a function that returns true on success and stores the integer via a pointer argument, sscanf style, where success means: no trailing non-digit characters.)
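Such a wrapper might look roughly like this. It is only a sketch: parse_int() is a made-up name, and it assumes that "success" means the whole string is a single decimal number that fits in an int.

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: returns true only if the whole string is one decimal
   int, and stores it in *out. On failure *out is left untouched.
   Note: like strtol itself, it accepts leading whitespace (" 5"). */
bool parse_int(const char *s, int *out) {
    char *end;
    errno = 0;                                     /* clear any stale error */
    long v = strtol(s, &end, 10);
    if (end == s) return false;                    /* no digits at all, e.g. "five" */
    if (*end != '\0') return false;                /* trailing characters, e.g. "5x" */
    if (errno == ERANGE) return false;             /* does not fit in a long */
    if (v < INT_MIN || v > INT_MAX) return false;  /* does not fit in an int */
    *out = (int)v;
    return true;
}
```

Because success travels in the return value and the number travels through the pointer, an input of "0" is no longer confused with failure, which also covers the second question.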
Finally, about the associative arrays: There is no short code, at least not in C. You'll have to implement your own hash map or use a library. As a first draft, I'd use a linear list of strings and check them one by one. This is a very naive approach, but easy to implement. I assume that you don't start out with a lot of symbols, and you're not doing a lot of checks, so speed shouldn't be an issue. (You can sort the array and use binary search to speed it up, but you'd have to re-sort after every insertion.) Once you have the logic of your program working, you can start thinking about hash maps.
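As a rough sketch of that naive first draft: a growable array of name/value pairs with a linear search. All the names here (struct symbol_table, symbol_add(), symbol_lookup(), symbol_free()) are invented for the example, and it assumes you never add the same name twice.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

struct symbol { char *name; int value; };

/* Start with a zero-initialized table: struct symbol_table t = {0}; */
struct symbol_table { struct symbol *items; size_t count, capacity; };

/* Append a symbol, growing the array as needed. Returns false if out of memory. */
bool symbol_add(struct symbol_table *t, const char *name, int value) {
    if (t->count == t->capacity) {
        size_t cap = t->capacity ? t->capacity * 2 : 8;
        struct symbol *p = realloc(t->items, cap * sizeof *p);
        if (!p) return false;
        t->items = p;
        t->capacity = cap;
    }
    char *copy = malloc(strlen(name) + 1);   /* keep our own copy of the key */
    if (!copy) return false;
    strcpy(copy, name);
    t->items[t->count].name = copy;
    t->items[t->count].value = value;
    t->count++;
    return true;
}

/* Linear search. Returns true and stores the value in *out if name is known. */
bool symbol_lookup(const struct symbol_table *t, const char *name, int *out) {
    for (size_t i = 0; i < t->count; i++) {
        if (strcmp(t->items[i].name, name) == 0) {
            *out = t->items[i].value;
            return true;
        }
    }
    return false;
}

/* Release all keys and the array itself, leaving an empty table. */
void symbol_free(struct symbol_table *t) {
    for (size_t i = 0; i < t->count; i++) free(t->items[i].name);
    free(t->items);
    t->items = NULL;
    t->count = t->capacity = 0;
}
```

The lookup compares whole strings, so "random294" maps to 28 and is never mistaken for 294. If the table grows large, you can swap the body of symbol_lookup() for a hash map without touching the callers.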
Something like this should do your job:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
struct StringToLongLookUp {
char *str;
char *num;
};
struct StringToLongLookUp table[] =
{
{ "q" , "3" },
{ "five" , "5" },
{ "negfive" , "-5" },
{ "random294", "28" }
};
int translate_from_symbol(char **str)
{
int i;
for(i = 0; i < (sizeof(table) / sizeof(struct StringToLongLookUp)); i++)
{
if(strcmp(*str, table[i].str) == 0)
{
*str = table[i].num;
return 1; // TRUE
}
}
return 0; // FALSE
}
int main()
{
char buf[100];
char *in = buf;
char *out;
int val;
scanf("%99s", in); // limit the read to the size of buf
translate_from_symbol(&in);
val = strtol(in, &out, 10);
if (in != out)
{
printf("\nValue = %d\n", val);
}
else
{
printf("\nValue Invalid\n");
}
}
Of course, you get a long, but converting that to int shouldn't be an issue as mentioned above.
I'm writing a function for my homework which is supposed to tell if a given string is a palindrome or not.
Although I even tried it on paper with the word "otto", my program always returns 1.
Although this is a quite common question, I'd really like to know what I'm doing wrong instead of just copying a solution from here.
int is_palindrom(const char* palin)
{
int size = strlen(palin), i=0;
for (i=0;i<=(size/2); ++i)
{
if(palin[i] != palin[(size - i -1)])
{
return 1;
}
}
return 0;
}
Your code is correct, however please note that you may have an inverted logical expression. You are returning 1 in case of not equal, and 0 when it is. This means your function is working the opposite of "standard" C functions, where 1 evaluates to true.
Obviously, you are free to use whichever value you like to represent whatever you want. However, this can easily lead to confusion if someone else is reading your code. If bool is available, you should be using that; otherwise, you should always assume 1 is true and 0 is false.
Also, note that is_palindrom takes a string and not an integer, i.e. you must call it as is_palindrom("767") and not is_palindrom(767).
Your code does return 0 when it should. I am guessing that when you read the string you pass as the argument to your function, there are extra characters appended to it, most probably a newline character. Try debugging the application or adding debug output in the function. For instance, print the length of the string and the ASCII codes of the characters in it.
Here is the code I used to verify it:
#include <stdio.h>
#include <string.h>
int is_palindrom(const char* palin)
{
int size = strlen(palin), i=0;
for (i=0;i<=(size/2); ++i)
{
if(palin[i] != palin[(size - i -1)])
{
return 1;
}
}
return 0;
}
int main(void) {
printf("%d", is_palindrom("otto"));
return 0;
}
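If the guess about a trailing newline is right, a sketch like the following would show the difference. strip_newline() is a hypothetical helper, and is_palindrom() is copied from the question so the snippet compiles on its own.

```c
#include <string.h>

/* Copied from the question: returns 0 for a palindrome, 1 otherwise. */
int is_palindrom(const char *palin) {
    int size = (int)strlen(palin), i = 0;
    for (i = 0; i <= (size / 2); ++i) {
        if (palin[i] != palin[size - i - 1]) {
            return 1;
        }
    }
    return 0;
}

/* Hypothetical helper: cut the string at the first '\r' or '\n', in place.
   This drops the line ending that fgets() leaves in the buffer. */
void strip_newline(char *s) {
    s[strcspn(s, "\r\n")] = '\0';
}
```

With a buffer holding "otto\n", as fgets() would deliver it, is_palindrom() returns 1 because 'o' is compared with '\n'. After strip_newline() the same buffer gives 0.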
Make sure your (const char *) has a "\0" at the end when you call this function.
#include<stdio.h>
#include<string.h>
#include<conio.h> // non-standard (DOS/Windows), only needed for getch()
int is_palindrom(const char* jj);
int main(void){
int rr = is_palindrom("otto");
printf("result is %d", rr);
getch();
return 0;
}
int is_palindrom(const char* palin)
{
int size = strlen(palin), i=0;
for (i=0;i<=(size/2); ++i)
{
if(palin[i] != palin[(size - i -1)])
{
return 1;
}
}
return 0;
}
I ran your code using the snippet above and it works fine for me: it returns 0 if a palindrome is entered and 1 if the entered value is not a palindrome. The main part of the function is the loop for (i=0;i<=(size/2); ++i) and the comparison if(palin[i] != palin[(size - i -1)]).
The loop starts at 0, so for "otto" the condition first compares palin[0] with palin[4-0-1], i.e. palin[3] (the first and the last 'o'). Then the increment ++i takes place and the next pair, the second and second-to-last elements, is compared. You can use either ++i or i++ here.
I purchased "A Book on C" for my procedural programming class and I was going through some of the exercises. Chapter 2 Exercise 9 is about designing a unit converter that can work with ounces, pounds, grams, and kilograms.
The code I've written works, but I really think it could be done much cleaner. Using nested if statements seems like a messy way to go about this.
Also, one problem I noticed was that if a char or string is given to scanf() on line 27, it will persist and then be passed to the scanf() on line 95. For example, if you enter "y" as the value to convert, the program will goto beginning without allowing the user to answer "Would you like to perform additional conversions?" How can I go about fixing this so that if a NaN is input it is discarded?
My code can be located at:
http://pastebin.com/4tST0i7T
One way to clean up the if structure would be to convert the value from the "fromUnit" to a common value and then convert it to the "toUnit". It simplifies the structure by leaving only two if structures around. (It also scales better.) So, it would be something more like:
if (!strcmp(fromUnit, "pound")) {
tempval = input / 16;
} else if (!strcmp(fromUnit, "gram")) {
tempval = input * OUNCESTOGRAMS;
}
if (!strcmp(toUnit, "pound")) {
output = tempval * 16;
} else if (!strcmp(toUnit, "gram")) {
output = tempval / OUNCESTOGRAMS;
}
Granted, that math isn't correct, it's just there for the example. You would just have to (1) pick the temporary unit that you wanted to use (2) convert from the input unit to that unit and (3) convert from the temporary unit to the output unit.
And as someone else mentioned, fgets() is definitely the way to go.
I would do it something like this:
#include <stdio.h>
#include <string.h>
typedef struct _unit {
char * name;
float grams;
} unit;
unit units[] = {
{"gram", 1.0},
{"kilogram", 1000.0},
{"pound", 453.59237}, /* avoirdupois pound */
{"ounce", 28.3495231}
};
unit * search_unit(char * name)
{
int i;
for (i = 0; i < (sizeof(units) / sizeof(unit)); i++)
{
printf("%d %s\n", i, units[i].name);
if (0 == strcmp(units[i].name, name))
{
return & units[i];
}
}
return NULL;
}
int main() {
char line[10];
char unitname[10];
int number;
unit * found_unit;
while (1)
{
fgets(line, sizeof(line), stdin);
if (1 == sscanf(line, "%d", &number))
{
break;
}
printf("not a number\n");
}
while (1)
{
fgets(line, sizeof(line), stdin);
sscanf(line, "%s\n", unitname);
found_unit = search_unit(unitname);
if (found_unit)
{
printf("%d %s is %f grams\n", number, unitname, found_unit->grams * number);
break;
}
printf("unknown unit\n");
}
}
Store your data in some data structure, instead of in the code.
First read a line of text, then check whether it is a number.
When reading from stdin, take the size of the buffer into account.
Use loops instead of goto's.
Use some common unit, grams for example, to calculate anything to anything.
The most reliable way is to read input string using fgets() function, check if it contains digit using isdigit() (all characters in string) and then convert it to numeric value using atoi().
BTW, the last two operations can be replaced by strtol().
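A sketch of that fgets() plus strtol() combination, wrapped in a made-up helper called read_int_line(). It assumes one value per line and treats anything else on the line as invalid. Because the whole line is consumed either way, a stray "y" can no longer leak into the next prompt.

```c
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper. Reads one line from 'in' and tries to parse it as an int.
   Returns 1 on success (*out is set), 0 if the line is not a valid int,
   and -1 on end of file or read error. */
int read_int_line(FILE *in, int *out) {
    char line[64];
    if (!fgets(line, sizeof line, in)) return -1;

    size_t len = strcspn(line, "\n");
    if (line[len] != '\n' && !feof(in)) {
        /* The line did not fit in the buffer: discard the rest of it. */
        int c;
        while ((c = getc(in)) != EOF && c != '\n') { }
        return 0;
    }
    line[len] = '\0';                 /* drop the newline that fgets() keeps */

    char *end;
    errno = 0;
    long v = strtol(line, &end, 10);
    if (end == line || *end != '\0') return 0;   /* no digits, or trailing junk */
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX) return 0;
    *out = (int)v;
    return 1;
}
```

A prompt loop would call read_int_line(stdin, &number) until it returns 1, printing "not a number" on 0 and giving up on -1.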
I wrote a function in C that converts a string to an integer and returns the integer. When I call the function I also want it to let me know if the string is not a valid number. In the past I returned -1 when this error occurred, because I didn't need to convert strings to negative numbers. But now I want it to convert strings to negative numbers, so what is the best way to report the error?
In case I wasn't clear about this: I don't want this function to report the error to the user, I want it to report the error to the code that called the function. ("Report" might be the wrong word to use...)
Here's the code:
s32 intval(const char *string) {
bool negative = false;
u32 current_char = 0;
if (string[0] == '-') {
negative = true;
current_char = 1;
}
s32 num = 0;
while (string[current_char]) {
if (string[current_char] < '0' || string[current_char] > '9') {
// Return an error here.. but how?
}
num *= 10;
num += string[current_char] - '0';
current_char++;
}
if (negative) {
num = -num;
}
return num;
}
There are several ways. All have their pluses and minuses.
Have the function return an error code and pass in a pointer to a location to return the result. The nice thing about this there's no overloading of the result. The bad thing is that you can't use the real result of the function directly in an expression.
Evan Teran suggested a variation of this that has the caller pass a pointer to a success variable (which can be optionally NULL if the caller doesn't care) and returns the actual value from the function. This has the advantage of allowing the function to be used directly in expressions when the caller is OK with a default value in an error result or knows that the function cannot fail.
Use a special 'sentinel' return value to indicate an error, such as a negative number (if normal return values cannot be negative) or INT_MAX or INT_MIN if good values cannot be that extreme. Sometimes to get more detailed error information a call to another function (such as GetLastError()) or a global variable needs to be consulted (such as errno). This doesn't work well when your return value has no invalid values, and is considered bad form in general by many people.
An example function that uses this technique is getc(), which returns EOF if end of file is reached or an error is encountered.
Have the function never return an error indication directly, but require the caller to query another function or global. This is similar to how VB's "On Error Resume Next" mode works - and it's pretty much universally considered a bad way to go.
Yet another way to go is to have a 'default' value. For example, the atoi() function, which has pretty much the same functionality as your intval() function, will return 0 when it is unable to convert any characters (it's different from your function in that it consumes characters to convert until it reaches the end of string or a character that is not a digit).
The obvious drawback here is that it can be tricky to tell if an actual value has been converted or if junk has been passed to atoi().
I'm not a huge fan of this way to handle errors.
I'll update as other options cross my mind...
Well, the way that .NET handles this in Int32.TryParse is to return the success/failure, and pass the parsed value back with a pass-by-reference parameter. The same could be applied in C:
int intval(const char *string, s32 *parsed)
{
*parsed = 0; // So that if we return an error, the value is well-defined
// Normal code, returning error codes if necessary
// ...
*parsed = num;
return SUCCESS; // Or whatever
}
A common way is to pass a pointer to a success flag like this:
int my_function(int *ok) {
/* whatever */
if(ok) {
*ok = success;
}
return ret_val;
}
call it like this:
int ok;
int ret = my_function(&ok);
if(ok) {
/* use ret safely here */
}
EDIT: example implementation here:
s32 intval(const char *string, int *ok) {
bool negative = false;
u32 current_char = 0;
if (string[0] == '-') {
negative = true;
current_char = 1;
}
s32 num = 0;
while (string[current_char]) {
if (string[current_char] < '0' || string[current_char] > '9') {
// Invalid character: report failure to the caller and stop here,
// so the flag is not overwritten with 1 at the end.
if(ok) { *ok = 0; }
return 0;
}
num *= 10;
num += string[current_char] - '0';
current_char++;
}
if (negative) {
num = -num;
}
if(ok) { *ok = 1; }
return num;
}
int ok;
s32 val = intval("123a", &ok);
if(ok) {
printf("conversion successful\n");
}
The OS-style global errno variable is also popular. Use errno.h.
If errno is non-zero, something went wrong.
Here's a man page reference for errno.
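A small sketch of the errno style, built on strtol(). The wrapper name intval_errno() is made up. It assumes EINVAL is available, which is true on POSIX systems and common elsewhere but is not required by ISO C. The important habit is to set errno to 0 before the call, because successful calls do not reset it.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical wrapper: returns the parsed value and reports failure
   through errno instead of through the return value. */
long intval_errno(const char *string) {
    char *end;
    errno = 0;                            /* clear any stale error first */
    long v = strtol(string, &end, 10);    /* sets errno to ERANGE on overflow */
    if (errno == 0 && (end == string || *end != '\0'))
        errno = EINVAL;                   /* not a complete number */
    return v;
}
```

The caller checks errno right after the call: 0 means the returned value is good, ERANGE means it was out of range, and EINVAL means the string was not a number.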
Take a look at how the standard library deals with this problem:
long strtol(const char * restrict str, char **restrict endptr, int base);
Here, after the call the endptr points at the first character that could not be parsed. If endptr == str, then no characters were converted, and this is a problem.
In general I prefer the way Jon Skeet proposed, ie. returning a bool (int or uint) about success and storing the result in a passed address. But your function is very similar to strtol, so I think it is a good idea to use the same (or similar) API for your function. If you give it a similar name like my_strtos32, this makes it easy to understand what the function does without any reading of the documentation.
EDIT: Since your function is explicitly base 10, my_strtos32_base10 is a better name. As long as your function is not a bottleneck, you can skip your own implementation and simply wrap strtol:
s32
my_strtos32_base10(const char *nptr, char **endptr)
{
long ret;
ret = strtol(nptr, endptr, 10);
return ret;
}
If you later find it to be a bottleneck, you can still optimize it for your needs.
You can either return an instance of a class where a property would be the value interested in, another property would be a status flag of some sort. Or, pass in an instance of the result class..
Pseudo code
MyErrStatEnum = (myUndefined, myOK, myNegativeVal, myWhatever)
ResultClass
Value:Integer;
ErrorStatus:MyErrStatEnum
Example 1:
result := yourMethod(inputString)
if Result.ErrorStatus = myOK then
use Result.Value
else
do something with Result.ErrorStatus
free result
Example 2
create result
yourMethod(inputString, result)
if Result.ErrorStatus = myOK then
use Result.Value
else
do something with Result.ErrorStatus
free result
The benefit of this approach is you can expand the info coming back at any time by adding additional properties to the Result class.
To expand this concept further, it also applies to method calls with multiple input parameters. For example, instead of CallYourMethod(val1, val2, val3, bool1, bool2, string1) instead, have a class with properties matching val1,val2,val3,bool1,bool2,string1 and use that as a single input parameter. It cleans up the method calls and makes the code more easily modified in the future. I'm sure you've seen that method calls with more than a few parameters is much more difficult to use/debug. (7 is the absolute most I would say.)
What is the best way to return an error from a function when I'm already returning a value?
Some additional thoughts to the various answers.
Return a structure
Code can return a value and an error code. A concern is the proliferation of types.
typedef struct {
int value;
int error;
} int_error;
int_error intval(const char *string);
...
int_error result = intval(some_string);
if (result.error) {
Process_Error();
}
int only_care_about_value = intval(some_string).value;
int only_care_about_error = intval(some_string).error;
Not-a-number and NULL
Use a special value when the function return type provides it.
Not-a-number's are not required by C, but ubiquitous.
#include <math.h>
#include <stddef.h>
double y = foo(x);
if (isnan(y)) {
Process_Error();
}
void *ptr = bar(x);
if (ptr == NULL) {
Process_Error();
}
_Generic/Function Overloading
Considering the pros & cons of error_t foo(&dest, x) vs. dest_t foo(x, &error),
With a cascaded use of _Generic or function overloading as a compiler extension, selecting on 2 or more types, it makes sense to differentiate the underlying function called to be based on the parameters of the call, not the return value. Return the common type, the error status.
Example: a function error_t narrow(destination_t *, source_t) that converted the value of one type to a narrower type, like long long to short and tested if the source value was in range of the target type.
long long ll = ...;
int i;
char ch;
error = narrow(&i, ll);
...
error = narrow(&ch, i);
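For illustration, here is a simplified C11 sketch of such a narrow(). It is an assumption-laden reduction of the idea: it selects only on the destination pointer type (the source is always widened to long long), and the helper names narrow_to_int() and narrow_to_char() are invented. The return value is the common type, the error status, with true meaning the value did not fit.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helpers: return true (error) if src is out of range for the
   destination type; otherwise store the converted value and return false. */
static bool narrow_to_int(int *dest, long long src) {
    if (src < INT_MIN || src > INT_MAX) return true;
    *dest = (int)src;
    return false;
}

static bool narrow_to_char(char *dest, long long src) {
    if (src < CHAR_MIN || src > CHAR_MAX) return true;
    *dest = (char)src;
    return false;
}

/* Pick the helper from the type of the destination pointer (C11 _Generic). */
#define narrow(dest, src) _Generic((dest), \
    int *:  narrow_to_int,                 \
    char *: narrow_to_char)((dest), (src))
```

Because the selection is driven by the parameters, the same spelling narrow(&i, ll) or narrow(&ch, i) works for every supported destination, and passing an unsupported pointer type fails at compile time.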