I have a function that should be passed a string with two numbers (as a regex: /-?[0-9]+ -?[0-9]+/) and return the second.
I've decided that the program should do error checking. First, it should test if the string is actually of the desired form; second, it should ensure that the first numbers (the ones that are not returned) are sequential.
Now I've been programming for a long time and this is not a difficult task. (It's made slightly more difficult by the fact that the numbers need not fit into a machine word.) But my question is about how I should do this rather than how I can. All of the solutions I've come up with are somewhat ugly.
I could use a global variable to keep the values in and compare them (or just leave the value there if it's NULL); this seems like The Wrong Thing.
I could pass one or both of the return value and the last/current line's first number by reference and modify them.
I could use the return value as a bool indicating whether there was an error.
etc.
So any thoughts relating to the proper way to deal with error-checking of this sort in C would be welcome.
This is related to a much more theoretical question I asked on cstheory. For reference, here is the function:
char*
scanInput(char* line)
{
int start = 0;
while (line[start] == ' ' || line[start] == '\t')
start++;
if (line[start] == '#')
return NULL; // Comment
if (line[start] == '-')
start++;
while (line[start] >= '0' && line[start] <= '9')
start++;
while (line[start] == ' ' || line[start] == '\t')
start++;
int end = start;
if (line[end] == '-')
end++;
while (line[end] >= '0' && line[end] <= '9')
end++;
if (start == end)
return NULL; // Blank line, or no numbers found
line[end] = '\0';
return line + start;
}
and it is called like so:
while(fgets(line, MAX_LINELEN, f) != NULL) {
if (strlen(line) > MAX_LINELEN - 5)
throw_error(talker, "Maximum line length exceeded; file probably not valid");
char* kept = scanInput(line);
if (kept == NULL)
continue;
BIGNUM value = strtobignum(kept);
if (++i > MAX_VECLEN) {
warning("only %d terms used; file has unread terms", MAX_VECLEN);
break;
}
// values are used here
}
The traditional solution in C is to use pass by reference (pointers) to return the values your function computes and use the return value for error handling, just like how scanf does this.
int scanInput(char **line_p, int *number){
    char *line = *line_p;
    ...
    if(something bad happens){
        return 1;
    }
    ...
    *line_p = line + start;
    *number = ...;
    return 0; //success
}
int main(){
    char word[100]; strcpy(word, "10 17");
    char *line = word;
    int number;
    switch(scanInput(&line, &number)){
    case 0:
        /* success: line and number are valid here */
        break;
    case 1:
    default:
        /* handle the error */
        break;
    }
}
Extra points:
It might be a good idea to use some enum to give a meaning to the error codes.
If you can use C++ (or a similar language), exceptions are often the best solution for error handling, since you no longer have to fill your code with ifs.
Global variables are generally evil. If you are tempted to use them, consider instead encapsulating the state you need in a struct and passing a pointer to it around. Treat it as a "this" pointer, in the OO sense.
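To make the enum and "this"-pointer suggestions concrete, here is a minimal sketch. The names scan_status, scan_state and scan_line are made up for illustration, and it uses long via strtol for brevity, even though the question's numbers may not fit in a machine word; a real version would swap in the bignum type.

```c
#include <stdlib.h>

/* Hypothetical error codes; the names are illustrative only. */
typedef enum {
    SCAN_OK = 0,
    SCAN_SKIP,          /* comment or blank line */
    SCAN_BAD_FORMAT,    /* line is not "<int> <int>" */
    SCAN_OUT_OF_ORDER   /* first number does not follow the previous one */
} scan_status;

/* State that would otherwise live in a global variable. */
typedef struct {
    int  have_prev;     /* 0 until the first valid line has been seen */
    long prev_index;    /* first number of the previous valid line */
} scan_state;

/* Parses one line; on SCAN_OK stores the second number in *second. */
scan_status scan_line(scan_state *st, const char *line, long *second)
{
    char *end;
    long first, value;

    while (*line == ' ' || *line == '\t')
        line++;
    if (*line == '#' || *line == '\n' || *line == '\0')
        return SCAN_SKIP;

    first = strtol(line, &end, 10);
    if (end == line)
        return SCAN_BAD_FORMAT;          /* no first number */
    line = end;
    value = strtol(line, &end, 10);
    if (end == line)
        return SCAN_BAD_FORMAT;          /* no second number */

    /* Overflow of prev_index + 1 is ignored in this sketch. */
    if (st->have_prev && first != st->prev_index + 1)
        return SCAN_OUT_OF_ORDER;

    st->have_prev = 1;
    st->prev_index = first;
    *second = value;
    return SCAN_OK;
}
```

The caller owns a scan_state initialised to {0, 0} and passes its address on every call, so no global is needed and two files can be scanned independently.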
Ultimately, you are going to need to isolate and convert both big numbers in each line. To check that the first number on the line is the one that follows the previous, you will have to keep a record of the last such number found. So, you will probably need a structure such as:
BIGNUM old_value = 0; // See notes below
while (fgets(line, sizeof(line), f) != 0)
{
BIGNUM value1;
BIGNUM value2;
if (ScanDoubleBigNum(line, &value1, &value2) != 0)
...handle line format error...
if (old_value == 0 || are_consecutive(old_value, value1))
{
// OK - valid information found
// Release old_value
old_value = value1;
process(value2);
// Release value2
}
else
...handle non-consecutive error...
}
The are_consecutive() function determines whether its second argument is one greater than its first. The process() function does whatever you need to do with the second value. The ScanDoubleBigNum() function is related to your ScanInput() but it reads two values. The actual code will call another function (call it ScanBigNum()) containing about half of ScanInput() (since that contains essentially the same code twice), plus the conversion that currently occurs in your loop. The code in ScanDoubleBigNum() will call ScanBigNum() twice. Note that ScanBigNum() will need to identify where the scan finishes so that the second call can continue where the first stopped.
I'm taking the liberty of assuming that a BIGNUM is an allocated structure identified by a pointer, so the initialization BIGNUM old_value = 0; is a way of indicating there is no value yet. There is presumably a function to release a BIGNUM. If this is incorrect, then you need to adapt the proposed code to accommodate the actual behaviour of the BIGNUM type. (Is this based on OpenSSL or SSLeay code?)
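As a rough sketch of the ScanBigNum()/ScanDoubleBigNum() split described above: the version below only locates the two number tokens and leaves the conversion to BIGNUM out, since that type's API is not shown. The names scan_number and scan_two_numbers and their signatures are my own assumptions.

```c
#include <stddef.h>

/* Hypothetical helper: matches -?[0-9]+ after optional blanks.
 * On success stores where the token starts and how long it is,
 * advances *pos past it and returns 0. Returns -1 if no number is found. */
int scan_number(const char **pos, const char **start, size_t *len)
{
    const char *p = *pos;
    const char *begin;
    const char *digits;

    while (*p == ' ' || *p == '\t')
        p++;
    begin = p;
    if (*p == '-')
        p++;
    digits = p;
    while (*p >= '0' && *p <= '9')
        p++;
    if (p == digits)
        return -1;                 /* a lone sign, or nothing at all */

    *start = begin;
    *len = (size_t)(p - begin);
    *pos = p;                      /* the next scan continues here */
    return 0;
}

/* Scans "<int> <int>" and rejects anything else left on the line. */
int scan_two_numbers(const char *line,
                     const char **first, size_t *first_len,
                     const char **second, size_t *second_len)
{
    const char *pos = line;

    if (scan_number(&pos, first, first_len) != 0)
        return -1;
    if (*pos != ' ' && *pos != '\t')
        return -1;                 /* the numbers must be separated */
    if (scan_number(&pos, second, second_len) != 0)
        return -1;
    while (*pos == ' ' || *pos == '\t')
        pos++;
    return (*pos == '\0' || *pos == '\n') ? 0 : -1;
}
```

Because scan_number() reports where it stopped through *pos, the second call picks up exactly where the first one finished, and the input line is never modified.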
Related
I have a quiz game in C where I am using a struct to save when a user enters a wrong answer and the corresponding correct answer to that question. First I used malloc to allocate memory for a single struct.
Struct:
typedef struct
{
char* wrongAnswers;
char* corrections;
} Corrections;
Malloc:
Corrections* corrections = (Corrections*)malloc(sizeof(Corrections));
Later on in my program, I have functionality where a wrong answer increments an 'incorrectAnswers' variable, which is used to reallocate the memory to allow for the new wrong answer to be stored, along with its corresponding correct answer.
Code:
// Extract characters from file and store in character c
for (c = getc(fPointerOpen); c != EOF; c = getc(fPointerOpen)) {
if (c == '\n') // Increment count if this character is newline
numberOfLines++;
}
for (int i = 0; i < numberOfLines; i++) {
int lengthOfQuestion = 150;
if (v == 0) {
printf("Correct\n");
score++;
}
else {
printf("Incorrect\n");
incorrectAnswers++;
corrections = (Corrections*)realloc(corrections, incorrectAnswers * sizeof(Corrections));
corrections[i].wrongAnswers = malloc(sizeof(char) * lengthofanswer);
corrections[i].wrongAnswers = lines[i].userAnswers;
corrections[i].corrections = malloc(sizeof(char) * lengthofanswer);
corrections[i].corrections = lines[i].answers;
}
printf("Your score is %d/%d\n", score, (i + 1));
}
I am receiving a bug depending on the order in which right and wrong answers are input. I have tried using free() in different parts of the program and I have noticed that the bug will always appear when I enter a wrong answer as the last entry in my program/if I enter a right then wrong answer. Why is this the case? My understanding is I am implementing realloc incorrectly.
You should use strcpy to copy a string. In C, assigning a char* with = only copies the pointer, so the buffer you just got from malloc is leaked and never filled.
strcpy(corrections[i].wrongAnswers,lines[i].userAnswers);
strcpy(corrections[i].corrections,lines[i].answers);
Each time you use malloc or realloc, you should check the return value of these functions.
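One more thing worth checking in the question's code: the array is resized to incorrectAnswers elements but indexed with i, so a wrong answer late in the quiz can write past the end of the array. A minimal sketch of an append helper that addresses both points follows; add_correction and its parameters are hypothetical names, not part of the original program.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    char *wrongAnswers;
    char *corrections;
} Corrections;

/* Hypothetical helper: appends one entry, growing the array by one.
 * Returns 0 on success, -1 if an allocation fails (the list stays valid). */
int add_correction(Corrections **list, size_t *count,
                   const char *wrong, const char *right)
{
    /* Keep the old pointer until realloc is known to have succeeded. */
    Corrections *grown = realloc(*list, (*count + 1) * sizeof **list);
    if (grown == NULL)
        return -1;
    *list = grown;

    char *w = malloc(strlen(wrong) + 1);
    char *r = malloc(strlen(right) + 1);
    if (w == NULL || r == NULL) {
        free(w);
        free(r);
        return -1;
    }
    strcpy(w, wrong);              /* copy the characters, not the pointer */
    strcpy(r, right);

    /* Index by the number of stored entries, not by the question number. */
    grown[*count].wrongAnswers = w;
    grown[*count].corrections = r;
    (*count)++;
    return 0;
}
```

Starting from a NULL list and a count of 0 works, because realloc(NULL, n) behaves like malloc(n).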
I'm looking for a C function like the following that parses a length-terminated char array that expresses a floating point value and returns that value as a float.
float convert_carray_to_float( char const * inchars, int incharslen ) {
...
}
Constraints:
The character at inchars[incharslen] might be a digit or other character that might confuse the commonly used standard conversion routines.
The routine is not allowed to invoke inchars[incharslen] = 0 to create a z terminated string in place and then use the typical library routines. Even patching up the z-overwritten character before returning is not allowed.
Obviously one could copy the char array in to a new writable char array and append a null at the end, but I am hoping to avoid copying. My concern here is performance.
This will be called often so I'd like this to be as efficient as possible. I'd be happy to write my own routine that parses and builds up the float, but if that's the best solution, I'd be interested in the most efficient way to do this in C.
If you think removing constraint 3 really is the way to go to achieve high performance, please explain why and provide a sample that you think will perform better than solutions that maintain constraint 3.
David Gay's implementation, used in the *BSD libcs, can be found here: https://svnweb.freebsd.org/base/head/contrib/gdtoa/ The most important file is strtod.c, but it requires some of the headers and utilities. Modifying that to check the termination every time the string pointer is updated would be a bit of work but not awful.
However, you might afterwards think that the cost of the extra checks is comparable to the cost of copying the string to a temporary buffer of known length, particularly if the strings are short and of a known length, as in your example of a buffer packed with 3-byte undelimited numbers. On most architectures, if the numbers are no more than 8 bytes long and you were careful to ensure that the buffer had a bit of tail room, you could do the copy with a single 8-byte unaligned memory access at very little cost.
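For comparison, here is what the copy-to-a-temporary-buffer variant might look like. This is only a sketch: it uses a plain memcpy of the exact length rather than the single 8-byte unaligned load mentioned above, MAX_FLOAT_CHARS is an assumed limit, and returning 0 for bad lengths is a placeholder policy.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_FLOAT_CHARS 63   /* assumed upper bound on the token length */

float convert_carray_to_float(char const *inchars, int incharslen)
{
    char buf[MAX_FLOAT_CHARS + 1];

    if (incharslen <= 0 || incharslen > MAX_FLOAT_CHARS)
        return 0.0f;                     /* placeholder error policy */

    /* Copy only the bytes that belong to the number, then terminate
     * the copy; the caller's buffer is never written to. */
    memcpy(buf, inchars, (size_t)incharslen);
    buf[incharslen] = '\0';
    return strtof(buf, NULL);
}
```

For short tokens the copy is a handful of bytes on the stack, so it is worth measuring before assuming that a hand-written parser is faster.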
Here's a pretty good outline.
Not sure it covers all cases, but it shows most of the flow:
float convert_carray_to_float(char const * inchars, int incharslen)
{
int Sign = +1;
int IntegerPart = 0;
int DecimalPart = 0;
int Denominator = 1;
bool beforeDecimal = true;
if (incharslen == 0)
{
return 0.0f;
}
int i=0;
if (inchars[0] == '-')
{
Sign = -1;
i++;
}
if (inchars[0] == '+')
{
Sign = +1;
i++;
}
for( ; i<incharslen; ++i)
{
if (inchars[i] == '.')
{
beforeDecimal = false;
continue;
}
if (!isdigit(inchars[i]))
{
return 0.0f;
}
if (beforeDecimal)
{
IntegerPart = 10 * IntegerPart + (inchars[i] - '0');
}
else
{
DecimalPart = 10 * DecimalPart + (inchars[i] - '0');
Denominator *= 10;
}
}
return Sign * (IntegerPart + ((float)DecimalPart / Denominator));
}
I have two bits of code, which I think work exactly the same:
if (ntohs(tcp_hdr->tcp_dport)==80) {
char * parser = strtok(string,";;");
while (parser != NULL){
char parvar[100];
strcpy(parvar, parser);
if(parvar[0] == 'H' && parvar[1] == 'o' && parvar[2] == 's' && parvar[3] == 't') {
char * substr = extract(parvar, 6, strlen(parvar));
visited_hosts[hosts_counter] = substr;
hosts_counter++;
}
parser = strtok(NULL, ";;");
}
bytes_sent += ((ip_hdr->ip_ttl)-40);
}
and
if (ntohs(tcp_hdr->tcp_sport)==80) {
char * parser = strtok(string,";;");
while (parser != NULL){
char parvar[100];
strcpy(parvar, parser);
if(parvar[0] == 'L' && parvar[1] == 'o' && parvar[2] == 'c' && parvar[3] == 'a') {
char * substr = extract(parvar, 10, strlen(parvar));
visited_pages[pages_counter] = substr;
pages_counter++;
}
parser = strtok(NULL, ";;");
}
bytes_received += ((ip_hdr->ip_ttl)-40);
}
I have a while(1) listener. The first piece of code works fine, but the second one exits the loop with a segmentation fault after completing its task. I cannot use gdb as I am using QEMU to test my solutions. Do you know what the problem might be, or what else I can use to debug C code in QEMU?
Congratulations, you just fell for the fixed limits trap:
You allocate an array of 100 bytes on the stack, then you copy a string of unknown size into it (using strcpy()). When parser points to a string of 100 characters or more, strcpy() continues writing past the end of the array, overwriting vital data on your stack, including your function's return address. This is why your program crashes when your function tries to return: it tries to jump to an address that does not exist.
My advice is: avoid fixed-size buffers at all costs. Avoid any fixed limits at all costs. The only exception is when you can prove that no future use will ever be able to exceed the limit. Whenever you use a fixed limit, I can guarantee you that someday it will be exceeded and bite you. And finding such a bug in order to fix it is much, much more expensive than doing it right the first time.
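A possible way to drop the 100-byte buffer entirely is to test the prefix on the token itself and allocate exactly as much as the value needs. The helper below is a sketch; value_after_prefix is a made-up name standing in for the parvar/extract() pair, whose real behaviour is not shown in the question.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: if token starts with prefix, returns a malloc'd
 * copy of the remainder (the caller frees it); otherwise returns NULL.
 * No fixed-size intermediate buffer is involved. */
char *value_after_prefix(const char *token, const char *prefix)
{
    size_t plen = strlen(prefix);

    if (strncmp(token, prefix, plen) != 0)
        return NULL;

    const char *value = token + plen;
    size_t vlen = strlen(value);
    char *copy = malloc(vlen + 1);
    if (copy == NULL)
        return NULL;
    memcpy(copy, value, vlen + 1);   /* includes the terminating NUL */
    return copy;
}
```

The same caution applies to visited_hosts and visited_pages: their sizes are not shown, so hosts_counter and pages_counter should be checked against the array length before each store.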
I'm actually writing about the same program as before, but I feel like I've made significant progress since the last time. I have a new question however; I have a function designed to store the frequencies of letters contained within the message inside an array so I can do some comparison checks later. When I ran a test segment through the function by outputting all of my array entries to see what their values are, it seems to be storing some absurd numbers. Here's the function of issue:
void calcFreq ( float found[] )
{
char infname[15], alpha[27];
char ch;
float count = 0;
FILE *fin;
int i = 0;
while (i < 26) {
alpha[i] = 'A' + i++;
}
printf("Please input the name of the file you wish to scan:\n");
scanf("%s", infname);
fin = fopen ( infname, "r");
while ( !feof(fin) ) {
fscanf(fin, "%c", &ch);
if ( isalpha(ch) ) {
count += 1;
i = 0;
if ( islower(ch) ) { ch = toupper(ch); }
while ( i < 26 ) {
if ( ch == alpha[i] ) {
found[i]++;
i = 30;
}
i++;
}
}
}
fclose(fin);
i = 0;
while ( i < 26 ) {
found[i] = found[i] / count;
printf("%f\n", found[i]);
i++;
}
}
At like... found[5], I get this hugely absurd number stored in there. Is there anything you can see that I'm just overlooking? Also, some array values are 0 and I'm pretty certain that every character of the alphabet is being used at least once in the text files I'm using.
I feel like a moron - this program should be easy, but I keep overlooking simple mistakes that cost me a lot of time >.> Thank you so much for your help.
EDIT So... I set the entries to 0 of the frequency array and it seems to turn out okay - in a Linux environment. When I try to use an IDE from a Windows environment, the program does nothing and Windows crashes. What the heck?
Here are a few pointers besides the most important one of initializing found[], which was mentioned in other comments.
The alpha[] array complicates things, and you don't need it. See below for a modified file-read loop that doesn't need the alpha[] array to count the letters in the file.
And strictly speaking, the expression you're using to initialize the alpha[] array:
alpha[i] = 'A' + i++;
has undefined behavior because you modify i as well as use it as an index in two different parts of the expression. The good news is that since you don't need alpha[] you can get rid of its initialization entirely.
The way you're checking for EOF is incorrect - it'll result in you acting on the last character in the file twice (since the fscanf() call that results in an EOF will not change the value of ch). feof() won't return true until after the read that occurs at the end of the file. Change your ch variable to an int type, and modify the loop that reads the file to something like:
// assumes that `ch` is declared as `int`
while ( (ch = fgetc(fin)) != EOF ) {
if ( isalpha(ch) ) {
count += 1;
ch = toupper(ch);
// the following line is technically non-portable,
// but works for ASCII targets.
// I assume this will work for you because the way you
// initialized the `alpha[]` array assumed that `A`..`Z`
// were consecutive.
int index = ch - 'A';
found[index] += 1;
}
}
alpha[i] = 'A' + i++;
This is undefined behavior in C, because i is both modified and read in the same expression with no sequence point in between. Anything can happen when you do this, including crashes.
Generally I would advise you to replace your while loops with for loops, when the maximum number of iterations is already known. This makes the code easier to read and possibly faster as well.
Is there a reason you are using float for counter variables? That doesn't make sense.
'i = 30;' What is this supposed to mean? If your intention was to end the loop, use a break statement instead of some mysterious magic number. If your intention was something else, then your code isn't doing what you think it does.
You should include some error handling if the file was not found. fin = fopen(..) and then if(fin == NULL) handle errors. I would say this is the most likely cause of the crash.
Check the definition of found[] in the caller function. You're probably running out of bounds.
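Putting several of these points together (integer counters, an fopen() check, zeroed counts, no magic loop exit), the reading part might look roughly like this. The function names are invented for the sketch, and it assumes 'A'..'Z' are consecutive, as in ASCII.

```c
#include <ctype.h>
#include <stdio.h>

/* Adds one character to the tally; returns 1 if it was a letter A-Z/a-z.
 * Assumes 'A'..'Z' are consecutive, as in ASCII. */
int tally_letter(int ch, long counts[26])
{
    int index = toupper(ch) - 'A';

    if (index < 0 || index >= 26)
        return 0;                  /* not a letter: never index out of bounds */
    counts[index]++;
    return 1;
}

/* Hypothetical replacement for the reading part of calcFreq().
 * Returns the number of letters seen, or -1 if the file cannot be opened. */
long count_letters(const char *path, long counts[26])
{
    FILE *fin = fopen(path, "r");
    long total = 0;
    int ch;                        /* int, so that EOF is distinguishable */

    if (fin == NULL)
        return -1;
    for (int i = 0; i < 26; i++)
        counts[i] = 0;             /* do not rely on the caller to zero it */
    while ((ch = fgetc(fin)) != EOF)
        total += tally_letter(ch, counts);
    fclose(fin);
    return total;
}
```

The frequencies are then (double)counts[i] / total, computed only when total is greater than zero. Separately, scanf("%s", infname) into a 15-byte array has no length limit, which is another candidate for the crash on Windows.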
I wrote a function in C that converts a string to an integer and returns the integer. When I call the function I also want it to let me know if the string is not a valid number. In the past I returned -1 when this error occurred, because I didn't need to convert strings to negative numbers. But now I want it to convert strings to negative numbers, so what is the best way to report the error?
In case I wasn't clear about this: I don't want this function to report the error to the user, I want it to report the error to the code that called the function. ("Report" might be the wrong word to use...)
Here's the code:
s32 intval(const char *string) {
bool negative = false;
u32 current_char = 0;
if (string[0] == '-') {
negative = true;
current_char = 1;
}
s32 num = 0;
while (string[current_char]) {
if (string[current_char] < '0' || string[current_char] > '9') {
// Return an error here.. but how?
}
num *= 10;
num += string[current_char] - '0';
current_char++;
}
if (negative) {
num = -num;
}
return num;
}
There are several ways. All have their pluses and minuses.
Have the function return an error code and pass in a pointer to a location to return the result. The nice thing about this is that there's no overloading of the result. The bad thing is that you can't use the real result of the function directly in an expression.
Evan Teran suggested a variation of this that has the caller pass a pointer to a success variable (which can be optionally NULL if the caller doesn't care) and returns the actual value from the function. This has the advantage of allowing the function to be used directly in expressions when the caller is OK with a default value in an error result or knows that the function cannot fail.
Use a special 'sentinel' return value to indicate an error, such as a negative number (if normal return values cannot be negative) or INT_MAX or INT_MIN if good values cannot be that extreme. Sometimes to get more detailed error information a call to another function (such as GetLastError()) or a global variable needs to be consulted (such as errno). This doesn't work well when your return value has no invalid values, and is considered bad form in general by many people.
An example function that uses this technique is getc(), which returns EOF if end of file is reached or an error is encountered.
Have the function never return an error indication directly, but require the caller to query another function or global. This is similar to how VB's "On Error Resume Next" mode works, and it's pretty much universally considered a bad way to go.
Yet another way to go is to have a 'default' value. For example, the atoi() function, which has pretty much the same functionality as your intval() function, will return 0 when it is unable to convert any characters (it's different from your function in that it consumes characters to convert until it reaches the end of string or a character that is not a digit).
The obvious drawback here is that it can be tricky to tell if an actual value has been converted or if junk has been passed to atoi().
I'm not a huge fan of this way to handle errors.
I'll update as other options cross my mind...
Well, the way that .NET handles this in Int32.TryParse is to return the success/failure, and pass the parsed value back with a pass-by-reference parameter. The same could be applied in C:
int intval(const char *string, s32 *parsed)
{
*parsed = 0; // So that if we return an error, the value is well-defined
// Normal code, returning error codes if necessary
// ...
*parsed = num;
return SUCCESS; // Or whatever
}
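Filled in, the TryParse-style version could look something like the sketch below. The INTVAL_* codes are invented for the example, and int32_t from stdint.h stands in for the question's s32 typedef, which is not shown.

```c
#include <stdint.h>

/* Hypothetical status codes for the sketch. */
enum { INTVAL_OK = 0, INTVAL_INVALID = 1, INTVAL_RANGE = 2 };

int intval(const char *string, int32_t *parsed)
{
    const char *p = string;
    int negative = 0;
    int64_t num = 0;               /* wider type, so overflow can be detected */

    *parsed = 0;                   /* well-defined even when an error is returned */
    if (*p == '-') {
        negative = 1;
        p++;
    }
    if (*p == '\0')
        return INTVAL_INVALID;     /* "" or "-" */

    for (; *p != '\0'; p++) {
        if (*p < '0' || *p > '9')
            return INTVAL_INVALID;
        num = num * 10 + (*p - '0');
        if (num > (int64_t)INT32_MAX + 1)
            return INTVAL_RANGE;   /* too large even for INT32_MIN */
    }
    if (negative)
        num = -num;
    if (num > INT32_MAX)
        return INTVAL_RANGE;       /* +2147483648 does not fit */

    *parsed = (int32_t)num;
    return INTVAL_OK;
}
```

A call then reads naturally as: if (intval(text, &n) == INTVAL_OK) { use n }.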
A common way is to pass a pointer to a success flag, like this:
int my_function(int *ok) {
/* whatever */
if(ok) {
*ok = success;
}
return ret_val;
}
call it like this:
int ok;
int ret = my_function(&ok);
if(ok) {
/* use ret safely here */
}
EDIT: example implementation here:
s32 intval(const char *string, int *ok) {
    bool negative = false;
    u32 current_char = 0;
    if (string[0] == '-') {
        negative = true;
        current_char = 1;
    }
    s32 num = 0;
    while (string[current_char]) {
        if (string[current_char] < '0' || string[current_char] > '9') {
            // Invalid character: flag the failure and stop,
            // otherwise *ok would be overwritten with 1 below
            if(ok) { *ok = 0; }
            return 0;
        }
        num *= 10;
        num += string[current_char] - '0';
        current_char++;
    }
    if (negative) {
        num = -num;
    }
    if(ok) { *ok = 1; }
    return num;
}
int ok;
s32 val = intval("123a", &ok);
if(ok) {
printf("conversion successful\n");
}
The OS-style global errno variable is also popular. It is declared in errno.h.
If errno is non-zero, something went wrong.
Here's a man page reference for errno.
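As an illustration of the errno protocol, here is how a caller typically uses it with strtol, which sets errno to ERANGE on overflow. The wrapper name overflows_long is made up for the example.

```c
#include <errno.h>
#include <stdlib.h>

/* Returns 1 if strtol reports that text is out of range for long, else 0. */
int overflows_long(const char *text)
{
    errno = 0;                     /* library functions never reset errno to 0 */
    (void)strtol(text, NULL, 10);  /* only the error indication matters here */
    return errno == ERANGE;
}
```

The important detail is clearing errno before the call: a stale value from some earlier failure would otherwise look like a fresh error.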
Take a look at how the standard library deals with this problem:
long strtol(const char * restrict str, char **restrict endptr, int base);
Here, after the call the endptr points at the first character that could not be parsed. If endptr == str, then no characters were converted, and this is a problem.
In general I prefer the way Jon Skeet proposed, ie. returning a bool (int or uint) about success and storing the result in a passed address. But your function is very similar to strtol, so I think it is a good idea to use the same (or similar) API for your function. If you give it a similar name like my_strtos32, this makes it easy to understand what the function does without any reading of the documentation.
EDIT: Since your function is explicitly base 10, my_strtos32_base10 is a better name. As long as your function is not a bottleneck, you can skip your own implementation and simply wrap strtol:
s32
my_strtos32_base10(const char *nptr, char **endptr)
{
long ret;
ret = strtol(nptr, endptr, 10);
return ret;
}
If you later find that it is a bottleneck, you can still optimize it for your needs.
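Note that the plain wrapper silently truncates when long is wider than 32 bits, and it leaves all checking to the caller. A stricter variant might look like this sketch; parse_s32_base10 is a hypothetical name, and int32_t stands in for s32.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical strict variant: returns 0 on success, -1 on any failure. */
int parse_s32_base10(const char *nptr, int32_t *out)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(nptr, &end, 10);
    if (end == nptr)
        return -1;                 /* no digits were converted */
    if (*end != '\0')
        return -1;                 /* trailing characters such as "123a" */
    if (errno == ERANGE)
        return -1;                 /* does not even fit in long */
    if (v < INT32_MIN || v > INT32_MAX)
        return -1;                 /* fits in long but not in 32 bits */

    *out = (int32_t)v;
    return 0;
}
```

Unlike the original intval(), strtol also accepts leading whitespace and a leading '+', so inputs like " +7" are treated as valid here.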
You can either return an instance of a class in which one property is the value you are interested in and another property is a status flag of some sort, or pass in an instance of the result class.
Pseudo code
MyErrStatEnum = (myUndefined, myOK, myNegativeVal, myWhatever)
ResultClass
Value:Integer;
ErrorStatus:MyErrStatEnum
Example 1:
result := yourMethod(inputString)
if Result.ErrorStatus = myOK then
use Result.Value
else
do something with Result.ErrorStatus
free result
Example 2
create result
yourMethod(inputString, result)
if Result.ErrorStatus = myOK then
use Result.Value
else
do something with Result.ErrorStatus
free result
The benefit of this approach is you can expand the info coming back at any time by adding additional properties to the Result class.
To expand this concept further, it also applies to method calls with multiple input parameters. For example, instead of CallYourMethod(val1, val2, val3, bool1, bool2, string1), have a class with properties matching val1, val2, val3, bool1, bool2, string1 and use that as a single input parameter. It cleans up the method calls and makes the code easier to modify in the future. I'm sure you've seen that method calls with more than a few parameters are much more difficult to use and debug. (7 is the absolute most I would say.)
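Since the question is about C, the pseudo code translates to plain structs returned and passed by pointer. The following is a sketch only: every name in it is invented, and range overflow in strtol is ignored for brevity.

```c
#include <stdbool.h>
#include <stdlib.h>

typedef enum { PARSE_OK, PARSE_INVALID, PARSE_NEGATIVE } parse_status;

/* Input parameters bundled into one struct (hypothetical fields). */
typedef struct {
    int  base;                 /* numeric base, e.g. 10 */
    bool allow_negative;       /* reject negative values when false */
} parse_options;

/* Value and status travel back together. */
typedef struct {
    long         value;
    parse_status status;
} parse_result;

parse_result parse_number(const char *text, const parse_options *opt)
{
    parse_result r = { 0, PARSE_INVALID };
    char *end;
    long v = strtol(text, &end, opt->base);

    if (end == text || *end != '\0')
        return r;                       /* nothing parsed, or trailing junk */
    if (v < 0 && !opt->allow_negative) {
        r.status = PARSE_NEGATIVE;
        return r;
    }
    r.value = v;
    r.status = PARSE_OK;
    return r;
}
```

Adding a field to either struct later does not change the function's signature, which is the extensibility benefit described above.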
What is the best way to return an error from a function when I'm already returning a value?
Some additional thoughts to the various answers.
Return a structure
Code can return a value and an error code. A concern is the proliferation of types.
typedef struct {
int value;
int error;
} int_error;
int_error intval(const char *string);
...
int_error result = intval(some_string);
if (result.error) {
    Process_Error();
}
int only_care_about_value = intval(some_string).value;
int only_care_about_error = intval(some_string).error;
Not-a-number and NULL
Use a special value when the function return type provides it.
Not-a-number values (NaNs) are not required by C, but they are ubiquitous.
#include <math.h>
#include <stddef.h>
double y = foo(x);
if (isnan(y)) {
Process_Error();
}
void *ptr = bar(x);
if (ptr == NULL) {
Process_Error();
}
_Generic/Function Overloading
Consider the pros and cons of error_t foo(&dest, x) vs. dest_t foo(x, &error).
With a cascaded use of _Generic, or function overloading as a compiler extension, selecting on two or more types, it makes sense to choose the underlying function based on the parameters of the call, not the return value. The return value is then the common type: the error status.
Example: a function error_t narrow(destination_t *, source_t) that converted the value of one type to a narrower type, like long long to short and tested if the source value was in range of the target type.
long long ll = ...;
int i;
char ch;
error = narrow(&i, ll);
...
error = narrow(&ch, i);
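One way narrow() might be built with C11 _Generic is sketched below. It is a simplification of the cascaded idea: it selects only on the destination pointer type and widens every source to long long, so it handles signed sources only. The narrow_to_* helper names are invented, and int is used for the error status.

```c
#include <limits.h>

/* Each helper returns 0 on success, 1 if src is out of range for *dst. */
int narrow_to_int(int *dst, long long src)
{
    if (src < INT_MIN || src > INT_MAX)
        return 1;
    *dst = (int)src;
    return 0;
}

int narrow_to_short(short *dst, long long src)
{
    if (src < SHRT_MIN || src > SHRT_MAX)
        return 1;
    *dst = (short)src;
    return 0;
}

int narrow_to_char(char *dst, long long src)
{
    if (src < CHAR_MIN || src > CHAR_MAX)
        return 1;
    *dst = (char)src;
    return 0;
}

/* The destination pointer type picks the function; the status comes back
 * as the return value regardless of which helper was chosen. */
#define narrow(dst, src) _Generic((dst), \
    int *:   narrow_to_int,              \
    short *: narrow_to_short,            \
    char *:  narrow_to_char)((dst), (src))
```

The call sites keep one spelling, as in the example above, and *dst is left untouched when the value does not fit.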