I've been searching the internet for some time, but didn't find a simple solution for what is, in my eyes, a simple problem. I guess it has been asked already:
I'm reading a value like 20.1 or XYZ via sscanf from a file and saving it in char *width_as_string.
All functions should be valid in -std=c99.
Now I want to check if the value in width_as_string is an integer. If true, it should be saved in int width. If false, width should remain with the value 0.
My approaches:
int width = 0;
if (isdigit(width_as_string)) {
width = atoi(width_as_string);
}
Alternatively, convert width_as_string to int width and convert it back to a string. Then compare if it is the same. But I'm not sure how to achieve that. I already tried itoa.
Functions like isdigit and itoa are not valid in std=c99, therefore I can't use them.
Thanks.
Read the documentation of sscanf carefully. It returns a count of converted items, and it accepts the %n conversion specifier, which gives the number of characters (bytes) scanned so far. Perhaps you want:
int endpos = 0;
int width = 0;
if (sscanf(width_as_string, "%d %n", &width, &endpos)>=1 && endpos>0) {
behappywith(width);
}
Perhaps you also want to add && width_as_string[endpos]==(char)0 after endpos>0, to check that the number, possibly followed by spaces, reaches the end of the string.
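Putting the sscanf pieces together, a minimal sketch could look like the following. The helper name parse_width_sscanf is made up for illustration, and the sketch assumes that trailing white space is acceptable and that the number fits in an int (sscanf does not report overflow).

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 and stores the value in *out if s holds
   exactly one decimal integer (optionally followed by white space),
   otherwise returns 0 and leaves *out untouched. */
int parse_width_sscanf(const char *s, int *out)
{
    int value = 0;
    int endpos = 0;

    /* "%d" converts the integer, the space skips trailing white space,
       and "%n" records how many characters were consumed so far. */
    if (sscanf(s, "%d %n", &value, &endpos) >= 1
        && endpos > 0
        && s[endpos] == '\0') {
        *out = value;
        return 1;
    }
    return 0;
}
```

With this, "20" and "20 " are accepted, while "20.1" and "XYZ" are rejected and the caller's width keeps its initial 0.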
You could also consider the standard strtol which sets an end pointer:
char*endp = NULL;
width = (int) strtol(width_as_string, &endp, 0);
if (endp>width_as_string && *endp==(char)0 && width>=0) {
behappywith(width);
}
The *endp == (char)0 is testing that the end of number pointer -filled by strtol- is the end of string pointer (since a string is terminated with a zero byte). You could make that more fancy if you want to accept trailing spaces.
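As a hedged sketch of the "more fancy" variant, here is one way to accept trailing white space and to reject values that do not fit in an int. The name parse_width_strtol is hypothetical, and the sketch uses base 10 rather than base 0, which is an assumption about what input you want to accept.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 and stores the value in *out if s is a
   decimal integer that fits in an int, optionally followed by white
   space; returns 0 otherwise and leaves *out untouched. */
int parse_width_strtol(const char *s, int *out)
{
    char *endp = NULL;
    long v;

    errno = 0;
    v = strtol(s, &endp, 10);

    if (endp == s)                          /* no digits were converted */
        return 0;
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return 0;                           /* does not fit in an int */

    while (isspace((unsigned char)*endp))   /* allow trailing white space */
        endp++;
    if (*endp != '\0')                      /* junk after the number */
        return 0;

    *out = (int)v;
    return 1;
}
```

Note that strtol itself skips leading white space, so " 12" is accepted too; if that is not wanted, check the first character before calling it.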
PS. Actually, you need to specify precisely what is an acceptable input (perhaps by some EBNF syntax). We don't know if "1 " or "2!" or "3+4" are (as C strings) acceptable to you.
How about strtol?
It reports where parsing stopped through its end pointer and sets errno on overflow, so you can tell when something goes wrong. I think this is what you're looking for.
http://www.cplusplus.com/reference/cstdlib/strtol/
Actually, you could use sscanf at the very beginning to check whether the number is integer or not. Something like this
#include <stdio.h>
#include <string.h>
int
main (int argc, char *argv[])
{
int wc; // width to check
int w; // width
char *string = "20.1";
printf("string = %s\n", string);
if (strchr(string, '.') != NULL)
{
wc = 0;
printf("wc = %d\n", wc);
}
else if ((sscanf(string, "%d", &w)) > 0)
{
wc = w;
printf("wc = %d\n", wc);
} else wc = 0;
return 0;
}
This is a sample program, of course. It first searches the string for a "." to check whether the number could be a float and discards it in that case; otherwise it tries to read an integer.
Changed thanks to ameyCU's suggestion
Reference page for sscanf
I'm trying to get an integer from the command line without scanf(), using just fgets(). How can I filter the fgets() contents and report an error if I enter a character or a string? The problem is that when I enter something other than a number, the atoi() function (essential for some operations in my algorithm) converts that string to 0, whereas I'd prefer to exit if the value entered is not an integer.
Here's a code part:
.....
char pos[30];
printf("\n Insert a number: ");
fgets (pos, sizeof(pos), stdin);
if (atoi(pos) < 0) //missing check for string character
exit(1);
else{
printf ("%d\n", atoi(pos)); //a string or character converted through atoi() gives 0
}
int number = atoi(pos);
......
As commenters have said, use strtol() not atoi().
The catch with strtol() is that it only gives an ERANGE error (as per the specification) when the converted number does not fit in a long. So if you ask it to convert " 1" it gives 1. If you ask it to convert "apple", it returns 0 and sets endptr to the start of the string, which indicates that no conversion was performed.
Obviously you need to decide if " 12" is going to be acceptable input or not — strtol() will happily skip the leading white space.
EDIT: Function updated to better handle errors via the endptr.
#include <errno.h>  // errno, ERANGE
#include <stdlib.h> // strtol
// Convert the given <text> string to a decimal long, in <value>
// Allow a string of digits, or white space then digits
// returns 1 for OK, or 0 otherwise
int parseLong( const char *text, long *value )
{
int rc = 0; // fail
char *endptr; // used to determine failure
if ( text && value )
{
errno = 0; // Clear any errors
*value = strtol( text, &endptr, 10 ); // Do the conversion
// Check that conversion was performed, and
// that the value fits in a long
if ( endptr != text && errno != ERANGE )
{
rc = 1; // success
}
}
return rc;
}
First, keep in mind that characters are not necessarily alphabetic characters; be precise.
I think what you're looking for is an "is integer" function.
The standard C header ctype.h declares functions called isalpha and isdigit.
https://www.programiz.com/c-programming/library-function/ctype.h/isalpha
So you could make a function that verifies if a char * contains only numeric characters.
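#include <ctype.h>
// Returns 0 if str contains only decimal digits, -1 otherwise
int str_is_only_numeric(const char *str) {
int i = 0;
while (str[i] != '\0') {
// Cast to unsigned char: passing a negative char value to isdigit is undefined
if (isdigit((unsigned char)str[i++]) == 0) {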
return -1;
}
}
return 0;
}
Here's a working example of the function: https://onlinegdb.com/SJBdLdy78
I solved it on my own by using strcspn() before checking for an integer with isdigit(); without strcspn() the check would always have returned -1, because of the trailing newline left by fgets().
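For readers wondering what that combination might look like, here is a minimal sketch. The helper name line_is_unsigned_integer is hypothetical; unlike the function above it returns 1 for success and 0 for failure, and it assumes that only unsigned decimal digits are acceptable.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: strips the line ending left by fgets(), then
   returns 1 if what remains is a non-empty run of decimal digits,
   0 otherwise. The buffer is modified in place. */
int line_is_unsigned_integer(char *line)
{
    size_t i;

    line[strcspn(line, "\r\n")] = '\0';   /* cut at the first CR or LF */

    if (line[0] == '\0')
        return 0;                         /* empty input is not a number */

    for (i = 0; line[i] != '\0'; i++) {
        if (!isdigit((unsigned char)line[i]))
            return 0;
    }
    return 1;
}
```

After a successful check the buffer can be handed to strtol() (or atoi()) without the newline getting in the way.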
I'm using atoi to convert a string integer value into integer.
But first I wanted to test different cases of the function so I have used the following code
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
char *a ="01e";
char *b = "0e1";
char *c= "e01";
int e=0,f=0,g=0;
e=atoi(a);
f=atoi(b);
g=atoi(c);
printf("e= %d f= %d g=%d ",e,f,g);
return 0;
}
This code prints e= 1 f= 0 g=0
I don't get why it returns 1 for "01e".
That's because atoi is an unsafe function for parsing integers: it has no way to report errors.
It parses and stops when a non-digit is encountered, even if the text as a whole is not a number.
If the first character encountered is not a space, a digit, or a plus/minus sign, it just returns 0.
Good luck figuring out whether user input is valid with that (at least the scanf-type functions return 0 or 1 depending on whether the string can be parsed as an integer at all, even though they behave the same way for strings that merely start with an integer).
It's safer to use a function such as strtol, which tells you at which character parsing stopped, so you can check that the whole string is a number.
Example of usage:
const char *string_as_number = "01e";
char *temp;
long value = strtol(string_as_number,&temp,10); // using base 10
if (temp != string_as_number && *temp == '\0')
{
// okay, string is not empty (or not only spaces) & properly parsed till the end as an integer number: we can trust "value"
}
else
{
printf("Cannot parse string: junk chars found at %s\n",temp);
}
You are missing an opportunity: Write your own atoi. Call it Input2Integer or something other than atoi.
int Input2Integer( Str )
Note, you have a pointer to a string and you will need to establish when to start, how to calculate the result and when to end.
First: Set return value to zero.
Second: Loop over string while it is not null '\0'.
Third: return when the input character is not a valid digit.
Fourth: modify the return value based on the valid input character.
Then come back and explain why atoi works the way it does. You will learn. We will smile.
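The four steps above can be sketched roughly as follows. This is only one possible reading of them: it assumes no white-space skipping, no sign handling, and no overflow check, all of which the real atoi treats differently or leaves undefined.

```c
/* A sketch of the four steps; Input2Integer is the name suggested above.
   Assumptions: no white-space skipping, no sign handling, and no
   overflow check (overflowing an int is undefined behaviour). */
int Input2Integer(const char *str)
{
    int result = 0;                          /* first: start from zero */

    while (*str != '\0') {                   /* second: loop to the terminator */
        if (*str < '0' || *str > '9')
            return result;                   /* third: stop at a non-digit */
        result = result * 10 + (*str - '0'); /* fourth: accumulate the digit */
        str++;
    }
    return result;
}
```

Tracing "01e" through it gives 0, then 1, then a return at 'e', which matches the 1 that puzzled the asker; "0e1" and "e01" both stop before any non-zero digit is accumulated.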
First of all, forgive me if this is too trivial; I am not a C developer, I usually program in Fortran.
I need to read some column-formatted text files. The problem is that some columns can be blank (value not filled in) or only partially filled.
Let me use a short example of the problem. Let's say I have a generator program like:
#include <stdio.h>
#include <stdlib.h>
int main(){
printf("xxxx%4d%4.2f\n",99,3.14);
}
When I execute this program I get:
$ ./t1
xxxx  993.14
If I get it into a text file and try to read using (e.g.) sscanf with the code:
#include <stdio.h>
#include <stdlib.h>
int main() {
char *fmt = "%*4c%4d%4f";
char *line = "xxxx  993.14";
int ival;
float fval;
sscanf(line,fmt,&ival,&fval);
printf(">>>>%d|%f\n",ival,fval);
}
The result is:
$ ./t2
>>>>993|0.140000
What is the problem here? sscanf seems to think that all white space is meaningless and should be discarded. The "%*4c" does what it is meant to do: it counts 4 characters without skipping any blanks, and discards them because of the "*". Next, the %4d skips all leading blanks and only starts counting the 4 characters of the field when it finds the first valid character for the conversion. So the value meant to be 99 becomes 993, and the 3.14 becomes 0.14.
In Fortran the reading code would be:
program t3
implicit none
integer :: ival
real :: fval
character(len=30) :: fmt="(4x,i4,f4.0)"
character(len=30) :: line="xxxx  993.14"
read(line,fmt) ival, fval
write(*,"('>>>>',i4,'|',f4.2)") ival,fval
end program t3
and the result would be:
$ ./t3
>>>> 99|3.14
That is, the format specification states the field width and nothing is discarded during conversion, unless instructed to by the "nX" specification.
Some final remarks to help the helpers:
The format to be read is an international standard and there is no
way to change it.
The number of existing files is too big to think of intervention or
format change.
It is not a CSV or similar format.
The code has to be in C for integration in a free software package.
Sorry to be too long, trying to state the problem as completely as possible.
The question is: Is there a way to tell sscanf not to skip the blank spaces? If not, is there a simple way to do it in C, or will it be necessary to write a specialized parser for each record type?
Thank you in advance.
When reading fixed-length fields with sscanf, it is best to parse the values as character strings (which you could do a number of ways), and then perform independent conversion of each of the fields. This allows you to handle conversion/error detection on a per-field basis. For example, you could use a format string of:
char *fmt = "%*4s%2[^0-9]%s";
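which would read and discard the 4 leading characters, then read up to 2 characters that are not digits (note that [^0-9] is a negated scanset, so with this line it captures the blank padding rather than the digits of the integer), followed by the remainder of the line (or up until the next whitespace) as a string.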
To handle the storage and parsing of line as fixed length fields, you could use temporary character arrays to hold each of the strings and then use sscanf to fill them much as you have attempted to do with the integer and float directly. e.g.:
char istr[8] = {0};
char fstr[16] = {0};
...
sscanf (line,fmt,istr,fstr);
(note: you could use minimum storage of istr[3] and fstr[7] in this given case, adjust the storage length as required, but providing space for the nul-terminating character)
You can then use strtol and strtof to provide conversion with error checking on each value. For example:
errno = 0;
if ((ival = (int)strtol (istr, NULL, 10)) == 0 && errno)
fprintf (stderr, "error: integer conversion failed.\n");
/* underflow/overflow checks omitted */
and
errno = 0;
if ((fval = strtof (fstr, NULL)) == 0 && errno)
fprintf (stderr, "error: float conversion failed.\n");
/* nan and inf checks omitted */
Putting all the pieces together in your example, you could use something like:
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
int main() {
char *fmt = "%*4s%2[^0-9]%s";
char *line = "xxxx  993.14";
char istr[8] = {0};
char fstr[16] = {0};
int ival;
float fval;
sscanf (line,fmt,istr,fstr);
errno = 0;
if ((ival = (int)strtol (istr, NULL, 10)) == 0 && errno)
fprintf (stderr, "error: integer conversion failed.\n");
/* underflow/overflow checks omitted */
errno = 0;
if ((fval = strtof (fstr, NULL)) == 0 && errno)
fprintf (stderr, "error: float conversion failed.\n");
/* nan and inf checks omitted */
printf(">>>>%d|%6.2f\n",ival,fval);
return 0;
}
Example/Output
$ >>>>0|993.14
*scanf() is not designed to handle fixed-width columns that are not separated by white space.
With sscanf(), to not skip spaces, code must use "%c", "%n", or "%[]", as all other specifiers skip leading white space, and those skipped characters do not count towards a width limit.
To scan the printed line, which is now in buffer, take advantage of the fact that the only '\n' is at the end of the line.
char str_int[5];
char str_float[5];
int n = 0;
sscanf(buffer, "%*4c%4[^\n]%4[^\n]%n", str_int, str_float, &n);
if (n != 12 || buffer[n] != '\n') Fail();
// Now convert str_int, str_float as needed.
Another way to use sscanf() would be to parse buffer as
int ival;
float fval;
if (strlen(buffer) != 13) Fail();
if (sscanf(&buffer[8], "%f", &fval) != 1) Fail();
buffer[8] = '\0';
if (sscanf(&buffer[4], "%d", &ival) != 1) Fail();
Note: The 4s below do not specify the output width as exactly 4 characters; 4 is the minimum width to print.
printf("xxxx%4d%4.2f\n",ival, fval);
Code could use the following to detect problems.
if (13 != printf("xxxx%4d%4.2f\n",ival, fval)) Fail();
Watch out for
printf("xxxx%4d%4.2f\n",123, 9.995000001f); // "xxxx 12310.00\n"
First off, I dunno. There might be some way to wrangle sscanf into counting the whitespace towards your integer's width. But I just don't think scanf was made with this sort of format in mind. The tool's trying to be smart or helpful and it's biting you in the ass.
But if it's columnar data and you know the position of the various fields, there's a really easy workaround. Just extract the field you want.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char** argv)
{
char line[] = "xxxx 893.14";
char tmp[100];
int thatDamnNumber;
float myfloatykins;
//Get that field
memcpy(tmp, line+4, 4);
tmp[4] = '\0'; //memcpy does not terminate the copy, and sscanf needs a string
sscanf(tmp, "%d", &thatDamnNumber);
//Kill that field so it doesn't goober-up the float
memset(line+4, ' ', 4);
sscanf(line, "%*4c%f", &myfloatykins);
printf("%d %f\n", thatDamnNumber, myfloatykins);
return 0;
}
If there is a lot of this, you could make some generalized functions: integerExtract(int positionStart, int sizeInCharacters), floatExtract(), etc.
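A rough sketch of such generalized helpers follows. The names integerExtract and floatExtract come from the suggestion above, but the signatures (taking the line and an output pointer), the FIELD_MAX limit, and the rule that a blank field counts as a failure are all assumptions made for illustration.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define FIELD_MAX 63  /* assumed upper bound on a field width */

/* Copy `width` characters starting at `start` into buf and terminate it.
   Returns 0 if the field is too wide or the line is too short. */
static int copy_field(const char *line, size_t start, size_t width,
                      char *buf)
{
    if (width > FIELD_MAX || strlen(line) < start + width)
        return 0;
    memcpy(buf, line + start, width);
    buf[width] = '\0';
    return 1;
}

/* Returns 1 and stores the value if the field holds an integer
   (blank padding allowed), 0 if it is blank, malformed or out of range. */
int integerExtract(const char *line, size_t start, size_t width, long *out)
{
    char buf[FIELD_MAX + 1];
    char *end;
    long v;

    if (!copy_field(line, start, width, buf))
        return 0;
    errno = 0;
    v = strtol(buf, &end, 10);
    if (end == buf || errno == ERANGE)   /* blank field or overflow */
        return 0;
    end += strspn(end, " ");             /* allow trailing blanks */
    if (*end != '\0')
        return 0;                        /* junk inside the field */
    *out = v;
    return 1;
}

/* Same idea for a floating-point field, using strtod. */
int floatExtract(const char *line, size_t start, size_t width, double *out)
{
    char buf[FIELD_MAX + 1];
    char *end;
    double v;

    if (!copy_field(line, start, width, buf))
        return 0;
    v = strtod(buf, &end);
    if (end == buf)                      /* blank field */
        return 0;
    end += strspn(end, " ");
    if (*end != '\0')
        return 0;
    *out = v;
    return 1;
}
```

Because each field is copied out before conversion, a neighbouring column can never leak into the value, and a field that is entirely blank is reported as a failure instead of silently becoming 0.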
If each element is of fixed width you don't really need scanf(), try this
char copy[5];
const char *line = "xxxx  993.14";
int ival;
float fval;
copy[0] = line[4];
copy[1] = line[5];
copy[2] = line[6];
copy[3] = line[7];
copy[4] = '\0'; // nul terminate for `atoi' to work
ival = atoi(copy);
fval = atof(&line[8]);
fprintf(stdout, "%d -- %f\n", ival, fval);
If you want (and you probably should), you can use strtol() instead of atoi() and strtof() instead of atof() to check for malformed data.
Both of these functions take a pointer parameter that receives the address of the first unconverted character; you can inspect it afterwards to verify whether there was a problem with the conversion.
Or, if you really want scanf() to do the same, capture the integer plus white space into a char array and convert it to int afterwards, like this:
char integer[5];
const char *line = "xxxx  993.14";
int ival;
float fval;
if (sscanf(line, "%*4c%4[0-9 ]%f", integer, &fval) != 2)
return -1;
ival = atoi(integer);
fprintf(stdout, "%d -- %f\n", ival, fval);
The format "%*4c%4[0-9 ]%f" will
Skip the first four characters including white spaces.
Scan the next four characters if they consist only of digits or white spaces.
Scan the rest of the input string searching for a matching float value.
I am posting what I think is a final conclusion from the answers I have got so far and from other sources.
What is a very trivial task in Fortran is not so trivial in other languages. I guess, though I am not sure, that the same task could be as easy as in Fortran in some other languages; Cobol, Pascal, PL/I and others from the punched-card era probably make it trivial.
I think that most languages nowadays are more comfortable with different data structures and inherited their I/O model from C. Java, Python, Perl(?) and others could serve as examples.
From what I saw in this thread, there are two main problems when reading and converting fixed-column text data with C.
The first problem is that, as Philip said in his answer: “The tool’s trying to be smart or helpful and it’s biting you in the ass.” Quite right! The point is that C text I/O seems to treat “white space” as something like a NULL character that should be thrown away, completely disregarding where the field starts. The only exception seems to be %nc, which gets exactly n chars, even blanks.
The second problem is that a conversion specifier such as %nf skips leading blanks before it starts counting, so it keeps converting valid characters past the column where the field was meant to end.
If we combine those two problems with a field completely filled with white space, then depending on the conversion tool used, it either throws an error or keeps going madly, looking for something meaningful.
At the end of the day, it seems that the only way is to copy the field to another memory area, dynamically allocated or not (we can have an area for each column length), and parse this separate area, taking into account the possibility of an all-blank field in order to catch the error.
I have a string that has ints and I'm trying to get all the ints into another array. When sscanf fails to find an int I want the loop to stop. So, I did the following:
int i;
int getout = 0;
for (i = 0; i < bsize && !getout; i++) {
if (!sscanf(startbuffer, "%d", &startarray[i])) {
getout = 1;
}
}
//startbuffer is a string, startarray is an int array.
This results in having all the elements of startarray to be the first char in startbuffer.
sscanf works fine but it doesn't move onto the next int it just stays at the first position.
Any idea what's wrong? Thanks.
The same string pointer is passed each time you call sscanf. If it were to "move" the input, it would have to move all the bytes of the string each time which would be slow for long strings. Furthermore, it would be moving the bytes that weren't scanned.
Instead, you need to implement this yourself by querying it for the number of bytes consumed and the number of values read. Use that information to adjust the pointers yourself.
int nums_now, bytes_now;
int bytes_consumed = 0, nums_read = 0;
while ( ( nums_now =
sscanf( string + bytes_consumed, "%d%n", arr + nums_read, & bytes_now )
) > 0 ) {
bytes_consumed += bytes_now;
nums_read += nums_now;
}
Convert the string to a stream, then you can use fscanf to get the integers.
Try this.
http://www.gnu.org/software/libc/manual/html_node/String-Streams.html
You are correct: sscanf indeed does not "move", because there is nothing to move. If you need to scan a bunch of ints, you can use strtol - it tells you how much it read, so you can feed the next pointer back to the function on the next iteration.
char str[] = "10 21 32 43 54";
char *p = str;
int i;
for (i = 0 ; i != 5 ; i++) {
int n = strtol(p, &p, 10);
printf("%d\n", n);
}
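If the number of integers is not known in advance, the same idea can be written so that the loop stops when strtol makes no progress. A minimal sketch, with the hypothetical name parse_ints, and assuming every value fits in an int (overflow is not checked here):

```c
#include <stdlib.h>

/* Hypothetical helper: parses up to `max` white-space-separated decimal
   integers from `s` into `out` and returns how many were stored.
   It stops at the first token that is not a number. Overflow and the
   narrowing from long to int are not checked in this sketch. */
size_t parse_ints(const char *s, int *out, size_t max)
{
    size_t count = 0;
    const char *p = s;

    while (count < max) {
        char *end;
        long v = strtol(p, &end, 10);
        if (end == p)            /* no progress: nothing left to convert */
            break;
        out[count++] = (int)v;
        p = end;                 /* continue right after the parsed number */
    }
    return count;
}
```

For the question's case this would be called as parse_ints(startbuffer, startarray, bsize), and the return value says how many elements of startarray were filled.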
This is the correct behavior of sscanf. sscanf operates on a const char*, not an input stream from a file, so it will not store any information about what it has consumed.
As for the solution, you can use %n in the format string to obtain the number of characters consumed so far (this is defined in the C89 standard).
For example, after sscanf("This is a string", "%10s%10s%n", tok1, tok2, &numChar); numChar will contain the number of characters consumed so far. You can use this as an offset to continue scanning the string.
If the string only contains integers that don't exceed the maximum value of the long type (or long long type), use strtol or strtoll. Beware that the long type can be 32-bit or 64-bit, depending on the system.
Here's my issue:
I have written a function to detect if a string is hex based off of the "0x####" format:
int lc3_hashex(char *str)
{
int val = 0;
char *to;
to = strndup(str+2, 10);
val = sscanf(to, "%x", &val);
if (val)
{
return val;
}
return 0;
}
Assuming the parameter is of the form "0x####", it returns the decimal version of the post "0x" numbers. But is there any built-in way (or a way I am just overlooking) to get the integer value of the hexadecimal number "0x4000" as opposed to the integer value of "4000"?
Thanks.
You can reduce that function to:
int cnt = sscanf(str, "%x", &val);
if (cnt == 1) {
// read a valid `0xNNNN` string
}
scanf with the %x format already does the hex conversion, and deals with the 0x prefix just fine. Also, its return value is the number of items matched, so you can use that to determine whether it found a hex value in str or not.
With this you have both pieces of information you need: whether or not the string was formatted as you expected it, and what value it was (properly converted). It also avoids a string allocation (which you're not freeing), and the bug your code has if strlen(str) is less than two.
If you change your function signature to:
int check_and_get_hex(const char *str, unsigned int *val); // "%x" stores into an unsigned int
(or something like that), update the sscanf call accordingly (passing val rather than &val), and return (cnt == 1), you can get both the "it's a valid hex string" and the value to the caller in a single shot.
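Spelled out, that suggestion might look like the sketch below. It uses unsigned int for the output, which is my assumption rather than part of the answer, because "%x" expects a pointer to unsigned int.

```c
#include <stdio.h>

/* Sketch of the suggested function, with unsigned int because "%x"
   stores into an unsigned int. Returns 1 if a hexadecimal value was
   read from the start of str, 0 otherwise. */
int check_and_get_hex(const char *str, unsigned int *val)
{
    int cnt = sscanf(str, "%x", val);   /* val is already a pointer */
    return cnt == 1;
}
```

Keep in mind that this only checks that the string starts with a hexadecimal number: trailing characters are ignored, and a string without the 0x prefix, such as "4000", is accepted as well.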
Use strtol from stdlib.h and specify the base as 16.
The only downside is that this function returns 0 upon failure, so a return value of 0 is ambiguous; pass an end pointer and check that it moved past the start of the string to tell a failed conversion from a genuine 0.
I fail to understand why the string is being cut short before calling sscanf. If you want the hexadecimal string converted to an integer, you can pass it directly.
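A minimal sketch of the strtol route, with the hypothetical name parse_hex, that uses the end pointer so a failed conversion can be told apart from a real 0. It assumes the whole string must be the number, with nothing after it.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 and stores the value if the whole of
   str is a hexadecimal number (strtol accepts an optional "0x"/"0X"
   prefix in base 16), 0 otherwise. */
int parse_hex(const char *str, long *out)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(str, &end, 16);
    if (end == str || *end != '\0' || errno == ERANGE)
        return 0;       /* nothing parsed, trailing junk, or overflow */
    *out = v;
    return 1;
}
```

With this, "0x4000" yields 16384, "0" yields a valid 0, and "0x40g" or an empty string is rejected.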
#include<stdio.h>
int main()
{
char sHex[7] = "0x2002";
unsigned int nVal = 0; // "%x" expects a pointer to unsigned int
sscanf( sHex, "%x", &nVal );
printf( "%u", nVal );
return 0;
}
This will print 8194, the decimal value of 0x2002. By giving "%x" to sscanf, you specify that the input string is hexadecimal, so the leading "0x" is fine.