atoi() not converting whole string because of special character - c

I have data in my .txt file containing the time-in and time-out values of employees, for example 10:20. I initially designed the structure so that these fields are char arrays (strings). Since I'll be using the time values in another function, I have to use the atoi() function to convert them into integer values. The problem is that there is a colon : in each of the time values. Would it be possible to convert the string 10:20 to an integer using atoi() so that I can use it in my future functions? Does atoi() allow some kind of splitting so that I can convert my time value from a string to an int?
I tried
char time[10] = "10:20";
int val;
printf("string val = %s, int value = %d", time, atoi(time));
But my output is only
string val = 10:20, int value = 10
So only the part of the string before the : is read and converted to an integer. After converting, I would still like to have 10:20 as the result, but as an integer, because I am going to use relational operators with it.

It's not clear what you actually want, but maybe something like:
#include <stdio.h>
#include <stdlib.h>
int
main(int argc, char **argv)
{
char *time = argc > 1 ? argv[1] : "10:20";
int d;
char *e;
d = strtol(time, &e, 10);
if( *e == ':' ){
d *= 100;
d += strtol(e + 1, &e, 10);
}
if( *e != '\0' ){
fprintf(stderr, "invalid input\n");
return 1;
}
printf("string val = %s, int value = %d\n", time, d);
return 0;
}
This will produce d = 1020 for the string "10:20". It's not at all clear to me what integer you want to produce, but that seems to be what you're looking for.

You can also use sscanf:
#include <stdio.h>
int main() {
char const* time = "10:20";
int h, m;
if (sscanf(time, "%d:%d", &h, &m) != 2)
return 1;
printf("string val = %s, int value = %d\n", time, h * 100 + m);
}
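Since the stated goal is to compare times with relational operators, another encoding worth considering is minutes since midnight. The following is only a sketch under that assumption; the helper name time_to_minutes is hypothetical and does not come from the question or the answers.

```c
#include <stdio.h>

/* Hypothetical helper: returns minutes since midnight for a "HH:MM" string,
   or -1 if the text is not a valid 24-hour time. */
int time_to_minutes(const char *text) {
    int h, m, used = 0;
    /* %n records how many characters were consumed, so trailing junk is detected. */
    if (sscanf(text, "%d:%d%n", &h, &m, &used) != 2 || text[used] != '\0')
        return -1;
    if (h < 0 || h > 23 || m < 0 || m > 59)
        return -1;
    return h * 60 + m;
}
```

With this encoding, time_to_minutes("10:20") < time_to_minutes("17:05") compares two times directly, and the difference between two values is a duration in minutes.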

Related

sscanf and scanset stops reading of hex numbers

I am trying to verify a UUID v4 with sscanf. If the UUID can be read completely with sscanf (= total number of characters read is 36), I assume it is a correct UUID. My code up to now:
#include <stdio.h>
int main()
{
char uuid[ 37 ] = "da4dd6a0-5d4c-4dc6-a5e3-559a89aff639";
int a = 0, b = 0, c = 0, d = 0, e = 0, g = 0;
long long int f = 0;
printf( "uuid >%s<, variables read: %d \n", uuid, sscanf( uuid, "%8x-%4x-4%3x-%1x%3x-%12llx%n", &a, &b, &c, &d, &e, &f, &g ) );
printf( " a - %x, b - %x, c - %x, d - %x, e - %x, f - %llx, total number of characters read - %d \n", a, b, c, d, e, f, g );
return 0;
}
which return the following output
uuid >da4dd6a0-5d4c-4dc6-a5e3-559a89aff639<, variables read: 6
a - da4dd6a0, b - 5d4c, c - dc6, d - a, e - 5e3, f - 559a89aff639, total number of characters read - 36
So far, everything okay.
Now I want to include, that the first character after the third hyphen needs to be one of [89ab]. So I changed %1x%3x to %1x[89ab]%3x. But now, the first character is read and the rest not anymore.
The output:
uuid >da4dd6a0-5d4c-4dc6-a5e3-559a89aff639<, variables read: 4
a - da4dd6a0, b - 5d4c, c - dc6, d - a, e - 0, f - 0, total number of characters read - 0
What am I missing? What is wrong with the syntax? Is possible to read it like this? I tried several combinations of the scanset and the specifier, but nothing works.
Instead of using sscanf() for this task, you might just write a simple dedicated function:
#include <ctype.h>
#include <string.h>
int check_UUID(const char *s) {
    int i;
    for (i = 0; s[i]; i++) {
        if (i == 8 || i == 13 || i == 18 || i == 23) {
            if (s[i] != '-')
                return 0;
        } else {
            if (!isxdigit((unsigned char)s[i]))
                return 0;
        }
    }
    if (i != 36)
        return 0;
    // you can add further tests for specific characters:
    if (!strchr("89abAB", s[19]))
        return 0;
    return 1;
}
If you insist on using sscanf(), here is a concise implementation:
#include <stdio.h>
int check_UUID(const char *s) {
int n = 0;
sscanf(s, "%*8[0-9a-fA-F]-%*4[0-9a-fA-F]-%*4[0-9a-fA-F]-%*4[0-9a-fA-F]-%*12[0-9a-fA-F]%n", &n);
return n == 36 && s[n] == '\0';
}
If you want to refine the test for the first character after the third hyphen, add another character class:
#include <stdio.h>
int check_UUID(const char *s) {
int n = 0;
sscanf(s, "%*8[0-9a-fA-F]-%*4[0-9a-fA-F]-%*4[0-9a-fA-F]-%*1[89ab]%*3[0-9a-fA-F]-%*12[0-9a-fA-F]%n", &n);
return n == 36 && s[n] == '\0';
}
Notes:
The * after the % means do not store the conversion, just skip the characters and the 1 means consume at most 1 character.
For the number of characters parsed by sscanf to reach 36, all hex digit sequences must have exactly the specified width.
%n causes scanf to store the number of characters read so far into the int pointed to by the next argument.
Your conversion specification is useful to get the actual UUID numbers, but the %x format accepts leading white space, an optional sign and an optional 0x or 0X prefix, all of which are invalid inside a UUID. You can first validate the UUID, then convert it to its individual parts if required.
Now I want to include, that the first character after the third hyphen needs to be one of [89ab]. So I changed %1x%3x to %1x[89ab]%3x
Should have been "%1[89ab]%3x" and then saved into a 2 character string. Then convert that small string into a hex value with strtol(..., ..., 16).
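A minimal sketch of that suggestion, applied only to the four-character group after the third hyphen. The helper name parse_variant_group and its parameters are hypothetical, and the usual %x caveats (it tolerates white space and signs) still apply to the %3x part.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: parses a group such as "a5e3".
   Returns 1 on success and stores the first digit and the remaining three. */
int parse_variant_group(const char *group, unsigned *variant, unsigned *rest) {
    char first[2];   /* one scanset character plus the terminator */
    int used = 0;
    /* %1[89ab] stores a 1-character string; %3x reads up to three hex digits. */
    if (sscanf(group, "%1[89ab]%3x%n", first, rest, &used) != 2 || used != 4)
        return 0;
    *variant = (unsigned)strtol(first, NULL, 16);
    return 1;
}
```

For the group "a5e3" this stores 0xa and 0x5e3; a group starting with any character outside 89ab makes the first conversion fail, so sscanf returns 0.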
Instead, I suggest a 2-step validation for a universally unique identifier (UUID): check the syntax, then read the value.
I'd avoid "%x" as it allows leading spaces, leading '+','-' and optional leading 0x and narrow inputs.
For validation, perhaps a simple test in code:
#include <ctype.h>
#include <stdio.h>
// byte lengths: 4-2-2-2-6
typedef struct {
unsigned long time_low;
unsigned time_mid;
unsigned time_hi_and_version;
unsigned clock_seq_hi_and_res_clock_seq_low;
unsigned long long node;
} uuid_T;
uuid_T* validate_uuid(uuid_T *dest, const char *uuid_source) {
    static const char *uuid_pat = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx";
    const char *pat = uuid_pat;
    const unsigned char *u = (const unsigned char*) uuid_source;
    while (*u) {
        // 'x' in the pattern demands a hex digit; any other pattern character
        // (a '-', or the terminator when the input is too long) must match exactly.
        if (*pat == 'x' ? !isxdigit(*u) : *u != (unsigned char) *pat) {
            return NULL;
        }
        pat++;
        u++;
    }
    if (*pat) { // Too short
        return NULL;
    }
    if (sscanf(uuid_source, "%lx-%x-%x-%x-%llx", &dest->time_low,
               &dest->time_mid, &dest->time_hi_and_version,
               &dest->clock_seq_hi_and_res_clock_seq_low, &dest->node) != 5) {
        return NULL;
    }
    return dest;
}
u is declared as const unsigned char *, so isxdigit(*u) is only called with non-negative values, which avoids UB.

C program calling a string to int function, I am unable to convert the input

I would like to convert a string to an int by calling a function from main. The first character of the string is a letter declaring the base of the number, and the rest of the characters are the number. I am able to get the conversion to work on its own, but when I call it from main it does not print the correct values.
Example of user input using binary:
b1000
b1010
result should be:
b
b
1000
1010
Here is the code:
#include <stdio.h>
#include <string.h>
#include <math.h>
int str_to_int(inputbase) {
char num1[50];
num1[50] = inputbase;
char numcpy1[sizeof(num1) - 1];
int i, len1;
int result1 = 0;
//printf("String: ");
//gets(num1);
//Access first character for base
printf("%c \n", num1[0]);
//Remove first character for number1 and number 2
if (strlen(num1) > 0) {
strcpy(numcpy1, &(num1[1]));
} else {
strcpy(numcpy1, num1);
}
len1 = strlen(numcpy1);
//Turn remaining string characters into an int
for (i = 0; i < len1; i++) {
result1 = result1 * 10 + ( numcpy1[i] - '0' );
}
printf("%d \n", result1);
return result1;
}
int main() {
char *number1[50], *number2[50];
int one, two;
printf("\nAsk numbers: \n");
gets(number1);
gets(number2);
one = str_to_int(number1);
two = str_to_int(number2);
printf("\nVerifying...\n");
printf("%d\n", one);
printf("%d\n", two);
return 0;
}
I suppose your code does not compile because of several errors.
The first one is in the line
int str_to_int(inputbase)
where inputbase is declared without a type.
If this is changed to
int str_to_int(char * inputbase)
the next point for improvement is in line
num1[50] = inputbase;
An assignment like that has several errors:
num1[50] accesses the 51st item, but there are only 50 items, indexed from 0 to 49
the statement num1[0] = inputbase; (or with any other valid index) is wrong because the types differ: num1[0] is a char, but inputbase is a pointer
num1 = inputbase; would also be wrong (= cannot be used to copy a string in C, so consider writing a loop or using a standard library function such as strncpy)
And since this is only the beginning of the problems, I suggest starting from decimal input and using a standard function to convert the char * string to an int (e.g. atoi or sscanf). Once you have checked that the program works, you can, if required, replace the standard conversion with your own str_to_int.
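To illustrate the point about copying strings, here is a small sketch of a bounded copy built on strncpy. The helper name copy_string is hypothetical; the explicit terminator is needed because strncpy does not write one when the source fills the buffer.

```c
#include <string.h>

/* Hypothetical helper: copies src into dst (capacity cap bytes), always
   null-terminating. Returns 1 if the whole string fit, 0 if it was truncated. */
int copy_string(char *dst, size_t cap, const char *src) {
    if (cap == 0)
        return 0;
    strncpy(dst, src, cap - 1);  /* copy at most cap - 1 characters */
    dst[cap - 1] = '\0';         /* strncpy may leave dst unterminated */
    return strlen(src) < cap;
}
```

A call such as copy_string(num1, sizeof num1, inputbase) would replace the invalid num1[50] = inputbase; assignment.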
The prototype for your function str_to_int() should specify the type of inputbase. You are passing a string and there is no reason for str_to_int to modify this string, so the type should be const char *inputbase.
Furthermore, you do not need a local copy for the string, just access the first character to determine the base and parse the remaining digits accordingly:
#include <stdlib.h>
int str_to_int(const char *inputbase) {
const char *p = inputbase;
int base = 10; // default to decimal
if (*p == 'b') { // binary
p++;
base = 2;
} else
if (*p == 'o') { // octal
p++;
base = 8;
} else
if (*p == 'h') { // hexadecimal
p++;
base = 16;
}
return strtol(p, NULL, base);
}
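The function above does not report invalid input, and the question reads its input with gets(), which was removed from the language in C11. The sketch below shows one way the caller could read a line with fgets() and detect conversion errors through the end pointer of strtol(). The names parse_prefixed and read_number are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: converts "b1010", "o17", "hff" or plain decimal text.
   Returns 1 on success, 0 if there are no digits or trailing junk. */
int parse_prefixed(const char *text, long *out) {
    int base = 10;
    char *end;
    if (*text == 'b') { base = 2; text++; }
    else if (*text == 'o') { base = 8; text++; }
    else if (*text == 'h') { base = 16; text++; }
    *out = strtol(text, &end, base);
    /* end == text means no digits were converted; *end != '\0' means junk. */
    return end != text && *end == '\0';
}

/* Hypothetical helper: reads one line from `in` and converts it. */
int read_number(FILE *in, long *out) {
    char line[50];
    if (fgets(line, sizeof line, in) == NULL)
        return 0;
    line[strcspn(line, "\n")] = '\0';  /* strip the newline kept by fgets */
    return parse_prefixed(line, out);
}
```

In main, read_number(stdin, &one) would take the place of the gets() call followed by str_to_int().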

Issues with creating a copy of an argument in C

I am converting string arguments into ints. I am having an issue where the first argument is not copied and converted correctly, while the second argument works just fine. I remove the first character of each string and print the rest of the string as an int.
#include <stdio.h>
#include <string.h>
#include <math.h>
int main(int argc, char *argv[])
{
char numcpy1[sizeof(argv[1])-1];
char numcpy2[sizeof(argv[2])-1];
int i, len1, len2;
int result1=0;
int result2=0;
//Access first character for base
printf("%c \n", argv[2][0]);
printf("%c \n", argv[3][0]);
//Remove first character for number1 and number 2
if(strlen(argv[2]) > 0)
{
strcpy(numcpy1, &(argv[2][1]));
}
else
{
strcpy(numcpy1, argv[2]);
}
len1 = strlen(numcpy1);
if(strlen(argv[3]) > 0)
{
strcpy(numcpy2, &(argv[3][1]));
}
else
{
strcpy(numcpy2, argv[3]);
}
len2 = strlen(numcpy2);
//Turn remaining string characters into an int
for(i=0; i<len1; i++)
{
result1 = result1 * 10 + ( numcpy1[i] - '0' );
}
for(i=0; i<len2; i++)
{
result2 = result2 * 10 + ( numcpy2[i] - '0' );
}
printf("%d \n", result1);
printf("%d \n", result2);
return 0;
}
Output:
b
b
-4844
1010
What I want:
b
b
1000
1010
The sizeof operator does not give you the length of a string, as you seem to think it does. It tells you the size of the datatype. Since argv[2] is a char *, this evaluates to the size of this pointer, most likely 4 or 8 depending on the system.
If the string in question is longer than this value, you end up writing past the end of the array. This invokes undefined behavior, which in your case manifests as an unexpected result.
If you want the length of the string, use the strlen function instead. In general you also need one more byte, because strings in C are null terminated and the array needs space for the null byte that marks the end of the string.
char numcpy1[strlen(argv[2])];
char numcpy2[strlen(argv[3])];
Note that we don't need to add 1 to each of these here, since the first character in each string isn't copied, which frees up the byte needed for the terminator.
You need to change
char numcpy1[sizeof(argv[1])-1];
to
char numcpy1[strlen(argv[1]) + 1];
Ditto for numcpy2
I think instead of copying the numbers you could work more easily with pointers. Thereby, you don't have to think about memory allocation and copying the values just for the sake of converting them into an integer value:
const char *numcpy1 = argv[2][0]=='\0' ? argv[2] : argv[2]+1;
...
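One way the pointer approach might continue, sketched under the assumption that the remaining characters are decimal digits as in the question's expected output. The helper name tail_to_long is hypothetical.

```c
#include <stdlib.h>

/* Hypothetical helper: skips the leading base letter (if the string is not
   empty) and converts the remaining decimal digits without copying them. */
long tail_to_long(const char *arg) {
    const char *digits = arg[0] == '\0' ? arg : arg + 1;
    return strtol(digits, NULL, 10);
}
```

With the arguments from the question, tail_to_long("b1000") gives 1000 and tail_to_long("b1010") gives 1010, and no buffer size has to be computed at all.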

C convert section of char array to double

I want to convert a section of a char array to a double. For example I have:
char in_string[] = "4014.84954";
Say I want to convert the first 40 to a double with value 40.0. My code so far:
#include <stdio.h>
#include <stdlib.h>
int main(int arg) {
char in_string[] = "4014.84954";
int i = 0;
for(i = 0; i <= sizeof(in_string); i++) {
printf("%c\n", in_string[i]);
printf("%f\n", atof(&in_string[i]));
}
}
In each iteration, atof converts the char array from the starting pointer I supply all the way to the end of the array. The output is:
4
4014.849540
0
14.849540
1
14.849540
4
4.849540
.
0.849540
8
84954.000000 etc...
How can I convert just a portion of a char array to a double? This must be modular because my real input_string is much more complicated, but I will ensure that the char is a number 0-9.
The following should work assuming:
I will ensure that the char is a number 0-9.
double toDouble(const char* s, int start, int stop) {
unsigned long long int m = 1;
double ret = 0;
for (int i = stop; i >= start; i--) {
ret += (s[i] - '0') * m;
m *= 10;
}
return ret;
}
For example, for the string 23487 the function will do these calculations:
ret = 0
ret += 7 * 1
ret += 8 * 10
ret += 4 * 100
ret += 3 * 1000
ret += 2 * 10000
ret = 23487
You can copy the desired amount of the string you want to another char array, null terminate it, and then convert it to a double. EG, if you want 2 digits, copy the 2 digits you want into a char array of length 3, ensuring the 3rd character is the null terminator.
Or if you don't want to make another char array, you can back up the (n+1)th char of the char array, replace it with a null terminator (ie 0x00), call atof, and then replace the null terminator with the backed up value. This will make atof stop parsing where you placed your null terminator.
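A sketch of the in-place variant described above. The helper name atof_span_inplace and its start/len parameters are hypothetical; the buffer must be writable, so this cannot be used on a string literal.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: converts len characters of s beginning at index start
   by temporarily terminating the string there. s must be writable. */
double atof_span_inplace(char *s, size_t start, size_t len) {
    size_t total = strlen(s);
    if (start > total)
        return 0.0;
    if (len > total - start)
        len = total - start;        /* clamp to the end of the string */
    char saved = s[start + len];    /* back up the character after the span */
    s[start + len] = '\0';          /* make atof stop here */
    double value = atof(s + start);
    s[start + len] = saved;         /* restore the original character */
    return value;
}
```

For char in_string[] = "4014.84954", the call atof_span_inplace(in_string, 0, 2) yields 40.0 and leaves the array unchanged afterwards.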
Just use sscanf. Use a format with a field width, such as "%2lf", and check that the return value is one.
What about inserting a null character ('\0') at the right position and then restoring the original character afterwards? This means you temporarily modify the char array, but you restore it at the end.
You can create a function that will make the work in a temporary string (on the stack) and return the resulting double:
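#include <stdlib.h>
#include <string.h>
double atofn (const char *src, size_t n) {
    char tmp[50]; // holds up to 49 characters plus the terminator
    if (n >= sizeof tmp) // clamp so the copy cannot overflow tmp
        n = sizeof tmp - 1;
    strncpy (tmp, src, n);
    tmp[n] = 0;
    return atof(tmp);
}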
How much simpler could it get than sscanf?
#include <stdio.h>
int main(void) {
    const char *in = "4014.84954";
    double foo;
    // The field width limits how many characters each conversion may consume.
    // The calls are not wrapped in assert(), which would remove them under NDEBUG.
    if (sscanf(in, "%2lf", &foo) != 1)
        return 1;
    printf("Processed the first two bytes of input and got: %lf\n", foo);
    if (sscanf(in + 2, "%5lf", &foo) != 1)
        return 1;
    printf("Processed the next five bytes of input and got: %lf\n", foo);
    if (sscanf(in + 7, "%lf", &foo) != 1)
        return 1;
    printf("Processed the rest of the input and got: %lf\n", foo);
    return 0;
}

Grab all integers from irregular strings in C

I am looking for a (relatively) simple way to parse a random string and extract all of the integers from it and put them into an Array - this differs from some of the other questions which are similar because my strings have no standard format.
Example:
pt112parah salin10n m5:isstupid::42$%&%^*%7first3
I would need to eventually get an array with these contents:
112 10 5 42 7 3
And I would like a method more efficient than going character by character through a string.
Thanks for your help
A quick solution. I'm assuming that there are no numbers that exceed the range of long, and that there are no minus signs to worry about. If those are problems, then you need to do a lot more work analyzing the results of strtol() and you need to detect '-' followed by a digit.
The code does loop over all characters; I don't think you can avoid that. But it does use strtol() to process each sequence of digits (once the first digit is found), and resumes where strtol() left off (and strtol() is kind enough to tell us exactly where it stopped its conversion).
#include <stdlib.h>
#include <stdio.h>
#include <ctype.h>
int main(void)
{
const char data[] = "pt112parah salin10n m5:isstupid::42$%&%^*%7first3";
long results[100];
int nresult = 0;
const char *s = data;
char c;
while ((c = *s++) != '\0')
{
if (isdigit((unsigned char)c))
{
char *end;
results[nresult++] = strtol(s-1, &end, 10);
s = end;
}
}
for (int i = 0; i < nresult; i++)
printf("%d: %ld\n", i, results[i]);
return 0;
}
Output:
0: 112
1: 10
2: 5
3: 42
4: 7
5: 3
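The answer above assumes no minus signs and no values outside the range of long. A sketch of how those two cases could be handled follows; the function name extract_longs is hypothetical, a '-' is treated as a sign only when a digit follows it directly, and out-of-range values are simply skipped.

```c
#include <ctype.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: stores up to cap integers found in s into out and
   returns how many were stored. */
size_t extract_longs(const char *s, long *out, size_t cap) {
    size_t count = 0;
    while (*s != '\0' && count < cap) {
        int negative = s[0] == '-' && isdigit((unsigned char)s[1]);
        if (negative || isdigit((unsigned char)*s)) {
            char *end;
            errno = 0;
            long value = strtol(s, &end, 10);
            if (errno != ERANGE)   /* drop values that overflow long */
                out[count++] = value;
            s = end;               /* resume where strtol stopped */
        } else {
            s++;
        }
    }
    return count;
}
```

Unlike the fixed results[100] array in the answer, the cap parameter stops the scan before the output array can overflow.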
More efficient than going through character by character?
Not possible, because you must look at every character to know that it is not an integer.
Now, given that you have to go through the string character by character, I would recommend simply casting each character as an int and checking that:
//string tmp = ""; declared outside of loop.
//pseudocode for inner loop:
int intVal = (int)c;
if(intVal >=48 && intVal <= 57){ //0-9 are 48-57 when char casted to int.
tmp += c;
}
else if(tmp.length > 0){
array[?] = (int)tmp; // ? is where to add the int to the array.
tmp = "";
}
array will contain your solution.
Just because I've been writing Python all day and I want a break. Declaring an array will be tricky. Either you have to run it twice to work out how many numbers you have (and then allocate the array) or just use the numbers one by one as in this example.
NB the ASCII characters for '0' to '9' are 48 to 57 (i.e. consecutive).
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <stdbool.h>
int main(int argc, char **argv)
{
char *input = "pt112par0ah salin10n m5:isstupid::42$%&%^*%7first3";
int length = strlen(input);
int value = 0;
int i;
bool gotnumber = false;
for (i = 0; i <= length; i++) // <= so the terminating '\0' flushes a number at the very end
{
if (input[i] >= '0' && input[i] <= '9')
{
gotnumber = true;
value = value * 10; // shift up a column
value += input[i] - '0'; // convert the digit character to its numeric value
}
else if (gotnumber) // we hit this the first time we encounter a non-number after we've had numbers
{
printf("Value: %d \n", value);
value = 0;
gotnumber = false;
}
}
return 0;
}
EDIT: the previous version didn't deal with 0
Another solution is to use the strtok function
/* strtok example */
#include <stdio.h>
#include <string.h>
int main ()
{
char str[] = "pt112parah salin10n m5:isstupid::42$%&%^*%7first3";
char * pch;
printf ("Splitting string \"%s\" into tokens:\n",str);
pch = strtok (str," abcdefghijklmnopqrstuvwxyz:$%&^*");
while (pch != NULL)
{
printf ("%s\n",pch);
pch = strtok (NULL, " abcdefghijklmnopqrstuvwxyz:$%&^*");
}
return 0;
}
Gives:
112
10
5
42
7
3
Perhaps not the best solution for this task, since you need to specify all characters that will be treated as a token. But it is an alternative to the other solutions.
And if you don't mind using C++ instead of C (usually there isn't a good reason why not), then you can reduce your solution to just two lines of code (using AXE parser generator):
vector<int> numbers;
auto number_rule = *(*(axe::r_any() - axe::r_num())
& *axe::r_num() >> axe::e_push_back(numbers));
now test it:
std::string str = "pt112parah salin10n m5:isstupid::42$%&%^*%7first3";
number_rule(str.begin(), str.end());
std::for_each(numbers.begin(), numbers.end(), [](int i) { std::cout << "\ni=" << i; });
and sure enough, you got your numbers back.
And as a bonus, you don't need to change anything when parsing unicode wide strings:
std::wstring str = L"pt112parah salin10n m5:isstupid::42$%&%^*%7first3";
number_rule(str.begin(), str.end());
std::for_each(numbers.begin(), numbers.end(), [](int i) { std::cout << "\ni=" << i; });
and sure enough, you got the same numbers back.
#include <stdio.h>
#include <string.h>
#include <math.h>
int main(void)
{
    const char *input = "pt112par0ah salin10n m5:isstupid::42$%&%^*%7first3";
    const char *pos = input;
    // At most (length + 1) / 2 integers fit: each needs at least one digit, and two
    // different integers need at least one other character between them.
    int integers[(strlen(input) + 1) / 2];
    unsigned int numInts = 0;
    while ((pos = strpbrk(pos, "0123456789")) != NULL) // strpbrk() prototype in string.h
    {
        sscanf(pos, "%d", &integers[numInts]); // pos is at a digit, so the value is non-negative
        if (integers[numInts] == 0)
            pos++;
        else
            pos += (int) log10(integers[numInts]) + 1; // requires math.h; assumes no leading zeros
        numInts++;
    }
    for (unsigned int i = 0; i < numInts; i++)
        printf("%d ", integers[i]);
    return 0;
}
Finding the integers is accomplished via repeated calls to strpbrk() on the offset pointer, with the pointer being offset again by an amount equaling the number of digits in the integer, calculated by finding the base-10 logarithm of the integer and adding 1 (with a special case for when the integer is 0). No need to use abs() on the integer when calculating the logarithm, as you stated the integers will be non-negative. If you wanted to be more space-efficient, you could use unsigned char integers[] rather than int integers[], as you stated the integers will all be <256, but that isn't a necessity.
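Counting digits with a logarithm miscounts numbers written with leading zeros, such as 007. As an alternative sketch, the %n directive of sscanf reports how many characters a conversion consumed, which removes the need for log10() and math.h. The function name scan_unsigned is hypothetical, and the sketch assumes every number fits in an unsigned int.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: stores up to cap numbers found in s into out and
   returns how many were stored. */
size_t scan_unsigned(const char *s, unsigned *out, size_t cap) {
    size_t count = 0;
    const char *pos = s;
    while (count < cap && (pos = strpbrk(pos, "0123456789")) != NULL) {
        int used = 0;
        /* %n stores the number of characters consumed by %u into used. */
        if (sscanf(pos, "%u%n", &out[count], &used) != 1)
            break;
        count++;
        pos += used;   /* skip exactly the digits that were read */
    }
    return count;
}
```

For the input "x007y" this stores the single value 7 and advances past all three digits in one step.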
