Here is my code:
char a[18], b[18];
char oper, clear;
char *test;
init_8051();
test="0x1234567890123456 + 0x1234567890123456\0";
printf("Please enter an equation: %s \n",test );
sscanf(test,"0x%s %c 0x%s",a,&oper,b);
printf(" a= %s \n" ,a);
printf(" oper= %s \n" ,oper);
printf(" b= %s \n" ,b);
I want to accept two hex numbers with an operation as a string and separate those two numbers into two char arrays, but it doesn't work. Here is the output of the code above:
Please enter an equation: 0x1234567890123456 + 0x1234567890123456
a= 1234567890123456
oper= Ò
b= 1234567890123456
As you can see, the operation is not recognized. I also have to use spaces, which I wish I didn't have to; I would like the input to be in the format 0x1234567890123456+0x1234567890123456
with no spaces between the plus and the numbers.
Thanks
From the sscanf manual
s Matches a sequence of non-white-space characters; the next pointer must be a pointer to character array that is long enough to hold the input sequence and the
terminating null byte ('\0'), which is added automatically. The input string stops at white space or at the maximum field width, whichever occurs first.
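It means that, when there is no space after the first operand, %s consumes the + and the rest of the characters, leaving b and oper uninitialized and overflowing a, since a only has room for 17 characters plus the terminating null byte. (In the code as posted, the garbage printed for oper comes from printing a char with %s; the conversion for a single char is %c.)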
So when the input string is lacking the space after the first operand, sscanf will continue reading until it finds a whitespace character. Hence when the string does not contain the separating space between the operands and the operator, sscanf consumes all the input.
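If you want to stay with sscanf, one option is to replace %s with a scanset that accepts only hex digits, so the conversion stops at the operator on its own. This is only a minimal sketch: the name parse_hex_equation, the HEX_DIGITS macro and the 16-digit limit are assumptions made for illustration, not part of the original code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define HEX_DIGITS "0123456789abcdefABCDEF"

/* Hypothetical helper: splits "0x<hex> <op> 0x<hex>" (spaces optional).
 * a and b must each hold at least 17 bytes (16 digits + '\0').
 * Returns 1 on success, 0 if the input does not match. */
int parse_hex_equation(const char *text, char *a, char *oper, char *b)
{
    assert(text != NULL && a != NULL && oper != NULL && b != NULL);

    /* %16[...] reads at most 16 characters from the set and stops at the
     * first character outside it, so it cannot swallow the operator.
     * A space in the format matches zero or more whitespace characters. */
    int n = sscanf(text,
                   "0x%16[" HEX_DIGITS "] %c 0x%16[" HEX_DIGITS "]",
                   a, oper, b);
    if (n != 3)
        return 0;

    /* Accept only the four arithmetic operators. */
    return strchr("+-*/", *oper) != NULL;
}
```

Because the scanset is bounded by a field width, the same call handles both "0x12+0x34" and "0x12 + 0x34" without overflowing the buffers.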
Here is a different approach to solving your problem.
First we copy the string. strtok requires this because it modifies its argument, so you can't pass it an immutable string. There are plenty of ways to copy the string; pick the one appropriate for your case.
input = strdup("0x1234567890123456 + 0x1234567890123456\0");
Now, we use strpbrk to find the operator
pointer = strpbrk(input, "+-*/" /* here go the operators */);
if (pointer != NULL)
oper = *pointer; /* this will contain the operator ascii value */
Create a string containing the operator as a delimiter
operstr[0] = oper;
operstr[1] = '\0'; /* strings must be null terminated */
Now, we use strtok to tokenize the string, and find the operands
pointer = strtok(input, operstr);
if (pointer != NULL)
fprintf(stderr, "first operand: %s\n", pointer); /* you can copy this string if you need to */
printf("Operator: %s \n", operstr);
Second call to strtok needs NULL first argument
pointer = strtok(NULL, operstr);
if (pointer != NULL)
fprintf(stderr, "second operand: %s\n", pointer); /* you can copy this string if you need to */
And finally free our copy of the input string.
free(input);
It is better to use strtok_r, the reentrant version. But for now you could test my suggestions; maybe this is what you need.
Even though this will work for this particular situation it is not the preferred way of doing this kind of thing, you can try writing a parser and use Reverse Polish Notation, or you can try with a lexical analyzer and a parser generator like flex and bison.
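If you would rather not copy and modify the input at all, the same idea (locate the operator with strpbrk, then take what is on each side) can be done with strspn and strcspn. A rough sketch under assumptions: split_at_operator, OPS and BLANKS are names I made up for illustration, and the operands are simply "whatever is not blank or an operator".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define OPS    "+-*/"
#define BLANKS " \t"

/* Hypothetical helper: copies the operands on both sides of the first
 * operator into lhs and rhs without modifying the input.
 * Returns 0 on success, -1 on malformed input or a too-small buffer. */
int split_at_operator(const char *s, char *lhs, size_t lcap,
                      char *op, char *rhs, size_t rcap)
{
    assert(s != NULL && lhs != NULL && op != NULL && rhs != NULL);

    const char *p = strpbrk(s, OPS);          /* first operator, if any */
    if (p == NULL)
        return -1;

    /* Left operand: skip leading blanks, stop at a blank or the operator. */
    s += strspn(s, BLANKS);
    size_t n = strcspn(s, BLANKS OPS);
    if (n == 0 || n >= lcap)
        return -1;
    /* Only blanks may sit between the left operand and the operator. */
    if (s + n + strspn(s + n, BLANKS) != p)
        return -1;
    memcpy(lhs, s, n);
    lhs[n] = '\0';

    *op = *p;

    /* Right operand: skip blanks after the operator, stop at whitespace. */
    const char *r = p + 1;
    r += strspn(r, BLANKS);
    n = strcspn(r, BLANKS "\n");
    if (n == 0 || n >= rcap)
        return -1;
    memcpy(rhs, r, n);
    rhs[n] = '\0';
    return 0;
}
```

Since nothing is written to the input, this version can be called directly on a string literal, which strtok cannot.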
My previous answer was downvoted, and didn't address all of OP's requirements, so I have rewritten this answer.
OP wants flexible input, either spaces or no spaces. I suggest not using sscanf() but the methods below. First the program finds a valid operator by using strcspn(), then breaks the string using strtok() on operators and whitespace. But using strtok() on a string literal is UB so I copy the "equation" to another string first.
I also corrected the printf() field spec for the operator, and made a and b different - it's always a bad idea using the same values for different variables in an example.
#include <stdio.h>
#include <string.h>
#define OPERATORS "+-*/"
#define DELIMS " \t\n" OPERATORS
int parse (char *test)
// return 1 if parsed successfully
{
char a[50], b[50];
char oper;
char *ptr;
size_t opind;
opind = strcspn (test, OPERATORS); // find operator
if (opind < 1 || test[opind] == '\0') return 0; // fail: no operator, or nothing before it
oper = test[opind]; // collect operator
ptr = strtok (test, DELIMS); // find a
if (ptr == NULL) return 0; // fail
strcpy (a, ptr); // collect 1st arg
ptr = strtok (NULL, DELIMS); // find b
if (ptr == NULL) return 0; // fail
strcpy (b, ptr); // collect 2nd arg
printf(" a %s \n" ,a);
printf(" oper %c \n" ,oper); // corrected format
printf(" b %s \n" ,b);
return 1;
}
int main (void)
{
char test[100];
strcpy (test, "0x123456789ABCDEF0+0xFEDCBA9876543210");
if (!parse (test))
printf("Failed\n");
printf("\n");
strcpy (test, "0x123456789ABCDEF0 + 0xFEDCBA9876543210");
if (!parse (test))
printf("Failed\n");
return 0;
}
Program output
a 0x123456789ABCDEF0
oper +
b 0xFEDCBA9876543210
a 0x123456789ABCDEF0
oper +
b 0xFEDCBA9876543210
Related
Let's say I have a string "file1.h: file2.c,file3.cpp" and I want to split it into "file1.h" and "file2.c,file3.cpp", that is, using ": " (a colon followed by whitespace) as the delimiter. How can I do it?
I tried this code, without success:
int main(int argc, char *argv[]) {
char str[] = "file1.h: file2.c,file3.cpp";
char name[100];
char depends[100];
sscanf(str, "%s: %s", name, depends);
printf("Name: %s\n", name);
printf("Deps: %s\n", depends);
}
And the output I get is:
Name: file1.h:
Deps:
What you seem to need is strtok(). Read about it in the man page. Related quote from C11, chapter §7.24.5.8
A sequence of calls to the strtok function breaks the string pointed to by s1 into a
sequence of tokens, each of which is delimited by a character from the string pointed to
by s2. [...]
In your case, you can use a delimiter like
char * delim = ": "; // either a colon or a space acts as a delimiter
to get the job done.
Things to mention additionally:
the input needs to be modifiable for strtok() (which it is, in your case),
and strtok() actually destroys the input fed to it, so keep a copy around if you need the original later.
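To make the strtok() suggestion concrete, here is a minimal sketch. The helper name split_rule and its parameters are hypothetical; it goes one step further than the question asks and also splits the dependency list on commas, which shows that the delimiter set may change between calls.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits "name: dep1,dep2,..." in place.
 * line is modified; *name and deps[] point into it.
 * Returns the number of dependencies stored, or -1 if there is no name. */
int split_rule(char *line, char **name, char *deps[], int max_deps)
{
    assert(line != NULL && name != NULL && deps != NULL);

    /* Every character of the second argument is a delimiter on its own:
     * ": " means "colon or space", not "colon followed by space". */
    *name = strtok(line, ": ");
    if (*name == NULL)
        return -1;

    int count = 0;
    char *tok;
    /* The delimiter set may change between calls on the same string. */
    while (count < max_deps && (tok = strtok(NULL, ", ")) != NULL)
        deps[count++] = tok;
    return count;
}
```

The caller must pass a writable array (for example char str[] = "..."), never a string literal, because strtok() overwrites each delimiter it finds with '\0'.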
This is an alternative way to do it, it uses strchr(), but this assumes that the input string always has the format
name: item1,item2,item3,...,itemN
Here is the program
#include <ctype.h>
#include <string.h>
#include <stdio.h>
int
main(void)
{
const char *const string = "file1.h: file2.c,file3.cpp ";
const char *head;
const char *tail;
const char *next;
// Start at the beginning of the string
head = string;
// If there is no `:' this string does not follow
// the assumption that the format is
//
// name: item1,item2,item3,...,itemN
//
if ((tail = strchr(head, ':')) == NULL)
return -1;
// Save a pointer to the next character after the `:'
next = tail + 1;
// Strip leading spaces
while (isspace((unsigned char) *head) != 0)
++head;
// Strip trailing spaces (never move `tail' before `head')
while (tail > head && isspace((unsigned char) *(tail - 1)) != 0)
--tail;
fputc('*', stdout);
// Simply print the characters between `head' and `tail'
// you could as well copy them, or whatever
fwrite(head, 1, tail - head, stdout);
fputc('*', stdout);
fputc('\n', stdout);
head = next;
while (head != NULL) {
tail = strchr(head, ',');
if (tail == NULL) {
// This means there are no more `,'
// so we now try to point to the end
// of the string
tail = strchr(head, '\0');
}
// This is basically the same algorithm
// just with a different delimiter which
// will presumably be the same from
// here
next = tail + 1;
// Strip leading spaces
while (isspace((unsigned char) *head) != 0)
++head;
// Strip trailing spaces (never move `tail' before `head')
while (tail > head && isspace((unsigned char) *(tail - 1)) != 0)
--tail;
// Here is where you can extract the string
// I print it surrounded by `*' to show that
// it's stripping white spaces
fputc('*', stdout);
fwrite(head, 1, tail - head, stdout);
fputc('*', stdout);
fputc('\n', stdout);
// Try to point to the next one
// or make head `NULL' if this is
// the end of the string
//
// Note that the original `tail' pointer
// that was pointing to the next `,' or
// the end of the string, has changed but
// we have saved its original value
// plus one, we now inspect what was
// there
if (*(next - 1) == '\0') {
head = NULL;
} else {
head = next;
}
}
fputc('\n', stderr);
return 0;
}
It's excessively commented to guide the reader.
As Sourav says, you really need to use strtok for tokenizing strings. But this doesn't explain why your existing code is not working.
The answer lies in the specification for sscanf and how it handles a '%s' in the format string.
From the man page:
s Matches a sequence of non-white-space characters;
So, the presence of a colon-space in your format string is largely irrelevant for matching the first '%s'. When sscanf sees the first %s it simply consumes the input string until a whitespace character is encountered, giving you your value for name of "file1.h:" (note the inclusion of the colon).
Next it tries to deal with the colon-space sequence in your format string.
Again, from the man page
The format string consists of a sequence of directives which describe how to process the sequence of input characters.
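The colon in your format string is an ordinary character, which must match the next input character exactly. But the colon in the input has already been consumed by the first %s, so the next input character is a space; the match fails and sscanf stops before it ever reaches the second %s.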
If, instead, your format string was simply "%s%s", then sscanf will get you almost exactly what you want.
int main(int argc, char *argv[]) {
char str[] = "file1.h: file2.c,file3.cpp";
char name[100];
char depends[100];
sscanf(str, "%s%s", name, depends);
printf("str: '%s'\n", str);
printf("Name: %s\n", name);
printf("Deps: %s\n", depends);
return 0;
}
Which gives this output:
str: 'file1.h: file2.c,file3.cpp'
Name: file1.h:
Deps: file2.c,file3.cpp
At this point, you can simply check that sscanf gave a return value of 2 (i.e. it found two values), and that the last character of name is a colon. Then just truncate name and you have your answer.
Of course, by this logic, you aren't going to be able to use sscanf to parse your depends variable into multiple strings ... which is why others are recommending using strtok, strpbrk etc because you are both parsing and tokenizing your input.
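The check described above (return value of 2, trailing colon, then truncate) could look roughly like this. It is only a sketch: parse_rule is a made-up name, and the 100-byte buffers are taken from the question's code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: name and depends must each hold 100 bytes.
 * Returns 1 if str looks like "name: deps", 0 otherwise. */
int parse_rule(const char *str, char *name, char *depends)
{
    assert(str != NULL && name != NULL && depends != NULL);

    /* Two whitespace-separated tokens; the widths guard the buffers. */
    if (sscanf(str, "%99s %99s", name, depends) != 2)
        return 0;

    /* The first token must be a non-empty name followed by a colon. */
    size_t len = strlen(name);
    if (len < 2 || name[len - 1] != ':')
        return 0;

    name[len - 1] = '\0';   /* truncate: drop the trailing colon */
    return 1;
}
```

Note that this still relies on whitespace after the colon; an input such as "file1.h:file2.c" is read as a single token and rejected.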
Well, I am pretty late. I do not have much knowledge on inbuilt functions in C. So I started writing a solution for you. I don't think you need this now. But, anyway here it is and modify it as per your need. If you find any bug feel free to tell.
I am trying to save one character and 2 strings into variables.
I use sscanf to read strings with the following form :
N "OldName" "NewName"
What I want : char character = 'N' , char* old_name = "OldName" , char* new_name = "NewName" .
This is how I am trying to do it :
sscanf(mystring,"%c %s %s",&character,old_name,new_name);
printf("%c %s %s",character,old_name,new_name);
The problem is, my program stops working without any output.
(I want to ignore the quotation marks too and save only its content)
When you do
char* new_name = "NewName";
you make the pointer new_name point to the read-only string array containing the constant string literal. The array contains exactly 8 characters (the letters of the string plus the terminator).
First of all, using that pointer as a destination for scanf will cause scanf to write to the read-only array, which leads to undefined behavior. And if you give a string longer than 7 characters then scanf will also attempt to write out of bounds, again leading to undefined behavior.
The simple solution is to use actual arrays, and not pointers, and to also tell scanf to not read more than can fit in the arrays. Like this:
char old_name[64]; // Space for 63 characters plus string terminator
char new_name[64];
sscanf(mystring,"%c %63s %63s",&character,old_name,new_name);
To skip the quotation marks you have a couple of choices: Either use pointers and pointer arithmetic to skip the leading quote, and then set the string terminator at the place of the last quote to "remove" it. Another solution is to move the string to overwrite the leading quote, and then do as the previous solution to remove the last quote.
Or you could rely on the limited pattern-matching capabilities of scanf (and family):
sscanf(mystring,"%c \"%63s\" \"%63s\"",&character,old_name,new_name);
Note that the above sscanf call will work iff the string actually includes the quotes.
Second note: As said in the comment by Cool Guy, the above won't actually work since scanf is greedy. It will read until the end of the file/string or a white-space, so it won't actually stop reading at the closing double quote. The only working solution using scanf and family is the one below.
Also note that scanf and family, when reading a string using "%s", stop reading at white-space, so if the string is "New Name" then it won't work either. If this is the case, then you either need to manually parse the string, or use the odd "%[" format, something like
sscanf(mystring,"%c \"%63[^\"]\" \"%63[^\"]\"",&character,old_name,new_name);
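Wrapped in a function with a return-value check, the "%[" approach might look like the sketch below. The name parse_rename is hypothetical, the 64-byte buffers follow the arrays shown earlier in this answer, and the leading space in the format (to tolerate leading whitespace) is my addition.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: parses  N "Old Name" "New Name".
 * old_name and new_name must each hold 64 bytes.
 * Returns 1 if all three fields were read, 0 otherwise. */
int parse_rename(const char *s, char *letter, char *old_name, char *new_name)
{
    assert(s != NULL && letter != NULL && old_name != NULL && new_name != NULL);

    /* %63[^"] reads up to 63 characters that are not a double quote,
     * so spaces inside the quotes are kept. The scanset needs at least
     * one character, so an empty "" makes the conversion fail. */
    int n = sscanf(s, " %c \"%63[^\"]\" \"%63[^\"]\"",
                   letter, old_name, new_name);
    return n == 3;
}
```

The closing quote after the last field is not verified by the return value, since no conversion follows it.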
You must allocate space for your strings, e.g:
char* old_name = malloc(128);
char* new_name = malloc(128);
Or using arrays
char old_name[128] = {0};
char new_name[128] = {0};
In case of malloc you also have to free the space before the end of your program.
free(old_name);
free(new_name);
Updated:...
The other answers provide good methods of creating memory as well as how to read the example input into buffers. There are two additional items that may help:
1) You expressed that you want to ignore the quotation marks too.
2) Reading first & last names when separated with space. (example input is not)
As @Joachim points out, because scanf and family stop scanning on a space with the %s format specifier, a name that includes a space such as "firstname lastname" will not be read in completely. There are several ways to address this. Here are two:
Method 1: tokenizing your input.
Tokenizing a string breaks it into sections separated by delimiters. Your string input examples for instance are separated by at least 3 usable delimiters: space: " ", double quote: ", and newline: \n characters. fgets() and strtok() can be used to read in the desired content while at the same time strip off any undesired characters. If done correctly, this method can preserve the content (even spaces) while removing delimiters such as ". A very simple example of the concept below includes the following steps:
1) reading stdin into a line buffer with fgets(...)
2) parse the input using strtok(...).
Note: This is an illustrative, bare-bones implementation, sequentially coded to match your input examples (with spaces) and includes none of the error checking/handling that would normally be included.
#include <stdio.h>
#include <string.h>
int main(void)
{
char line[128];
char delim[] = {"\n\""};//parse using only newline and double quote
char *tok;
char letter;
char old_name[64]; // Space for 63 characters plus string terminator
char new_name[64];
fgets(line, 128, stdin);
tok = strtok(line, delim); //consume 1st " and get token 1
if(tok) letter = tok[0]; //assign letter
tok = strtok(NULL, delim); //consume 2nd " and get token 2
if(tok) strcpy(old_name, tok); //copy tok to old name
tok = strtok(NULL, delim); //consume 3rd " throw away token 3
tok = strtok(NULL, delim); //consume 4th " and get token 4
if(tok) strcpy(new_name, tok); //copy tok to new name
printf("%c %s %s\n", letter, old_name, new_name);
return 0;
}
Note: as written, this example (like most strtok(...) implementations) requires very narrowly defined input. In this case the input must be no longer than 127 characters, consisting of a single character followed by space(s), then a double-quoted string followed by more space(s), then another double-quoted string, as defined by your example:
N "OldName" "NewName"
The following input will also work in the above example:
N "old name" "new name"
N "old name" "new name"
Note also about this example, some consider strtok() broken, while others suggest avoiding its use. I suggest using it sparingly, and only in single threaded applications.
Method 2: walking the string.
A C string is just an array of char terminated with a null character ('\0'). By selectively copying some characters into another string, while bypassing the ones you do not want (such as the "), you can effectively strip unwanted characters from your input. Here is an example function that will do this:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char * strip_ch(char *str, char ch)
{
char *from, *to;
char *dup = strdup(str);//make a copy of input
if(dup)
{
from = to = dup;//set working pointers equal to pointer to input
for (; *from != '\0'; from++)//walk through input string
{
*to = *from;//copy the current character to the destination
if (*to != ch) to++;//test - increment only if not char to strip
//otherwise, leave it so next char will replace
}
*to = '\0';//write the null terminator
strcpy(str, dup);
free(dup);
}
return str;
}
Example use case:
int main(void)
{
char line[128] = {"start"};
while(strstr(line, "quit") == NULL)
{
printf("Enter string (\"quit\" to leave) and hit <ENTER>:");
fgets(line, 128, stdin);
strip_ch(line, '"');//strips in place; sprintf(line, ..., line) would overlap, which is undefined
printf("%s", line);//fgets() already kept the newline
}
return 0;
}
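Since the destination never gets ahead of the source, the same walk can also be done directly in the caller's buffer, without the strdup() copy. A minimal sketch, with strip_ch_inplace being a name I chose for illustration:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical variant of strip_ch(): removes every occurrence of ch
 * from str in place and returns str. */
char *strip_ch_inplace(char *str, char ch)
{
    assert(str != NULL);

    char *from = str;   /* next character to examine */
    char *to = str;     /* next position to write    */

    for (; *from != '\0'; from++) {
        if (*from != ch)
            *to++ = *from;   /* keep it; skipped characters are overwritten */
    }
    *to = '\0';              /* terminate the shortened string */
    return str;
}
```

This avoids the allocation and the possible strdup() failure path, at the cost of always modifying the input.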
I'm trying to separate the following string into three separate variables, i.e., a, b and c.:
" mov/1/1/1,0 STR{7}, r7"
each need to hold a different segment of the string, e.g:
a = "mov/1/1/1,0"
b = "STR{7}"
c = "r7"
There may be a space or a tab between each command; this is what makes this part of the code trickier.
I tried to use strtok, for the string manipulation, but it didn't work out.
char command[50] = " mov/1/1/1,0 STR{7}, r7";
char a[10], b[10], c[10];
char * ptr = strtok(command, "\t");
strcpy(a, ptr);
ptr = strtok(NULL, "\t");
strcpy(b, ptr);
ptr = strtok(NULL, ", ");
strcpy(c, ptr);
but this gets things really messy as the variables a, b and c get to hold more values than they should, which leads the program to crash.
Input may vary from:
" mov/1/1/1,0 STR{7}, r7"
"jsr /0,0 PRTSTR"
"mov/1/1/0,0 STRADD{5}, LASTCHAR {r3} "
in which the values of a,b and c change to different part of the given string.
I was told it is safer to use sscanf than strtok for this kind of thing, but I'm not sure why or how it could help me.
I would be more than glad to hear your opinion!
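This should do the trick, provided a, b and c are large enough for the longest token (the 10-byte arrays in the question are too small for "mov/1/1/1,0"); note that a trailing comma, as in "STR{7},", stays attached to its token:
sscanf(command, "%s %s %s", a, b, c)
From the scanf manpage, %s skips leading whitespace, be it spaces or tabs, and then stops at the next whitespace: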
s : Matches a sequence of non-white-space characters; the next pointer
must be a pointer to character array that is long enough to hold the
input sequence and the terminating null byte ('\0'), which is added
automatically. The input string stops at white space or at the
maximum field width, whichever occurs first.
As you may know, sscanf() is used the same way as scanf(); the difference is that sscanf() scans from a string, while scanf() reads from standard input.
In this problem you can specify scanf, with a set of characters to "always skip", as done in this link.
Since you have different set of constraints for scanning all the three strings, you can specify, using %*[^...], these constraints, before every %s inside sscanf().
I have reservations about using strtok(), but this code using it seems to do what you need. As I noted in a comment, the sample string "jsr /0,0 PRTSTR" throws a spanner in the works; it has a significant comma in the second field, whereas in the other two example strings, the comma in the second field is not significant. If you need to remove trailing commas, you can do that after the space-based splitting — as shown in this code. The second loop tests the zap_trailing_commas() function to ensure that it behaves under degenerate cases, zapping trailing commas but not underflowing the start of the buffer or anything horrid.
#include <stdio.h>
#include <string.h>
static void zap_trailing_commas(char *str)
{
size_t len = strlen(str);
while (len-- > 0 && str[len] == ',')
str[len] = '\0';
}
static void splitter(char *command)
{
char a[20], b[20], c[20];
char *ptr = strtok(command, " \t");
strcpy(a, ptr);
zap_trailing_commas(a);
ptr = strtok(NULL, " \t");
strcpy(b, ptr);
zap_trailing_commas(b);
ptr = strtok(NULL, " \t");
strcpy(c, ptr);
zap_trailing_commas(c);
printf("<<%s>> <<%s>> <<%s>>\n", a, b, c);
}
int main(void)
{
char data[][50] =
{
" mov/1/1/1,0 STR{7}, r7",
"jsr /0,0 PRTSTR",
"mov/1/1/0,0 STRADD{5}, LASTCHAR {r3} ",
};
for (size_t i = 0; i < sizeof(data)/sizeof(data[0]); i++)
splitter(data[i]);
char commas[][10] = { "X,,,", "X,,", "X,", "X" };
for (size_t i = 0; i < sizeof(commas)/sizeof(commas[0]); i++)
{
printf("<<%s>> ", commas[i]);
zap_trailing_commas(&commas[i][1]);
printf("<<%s>>\n", commas[i]);
}
return 0;
}
Sample output:
<<mov/1/1/1,0>> <<STR{7}>> <<r7>>
<<jsr>> <</0,0>> <<PRTSTR>>
<<mov/1/1/0,0>> <<STRADD{5}>> <<LASTCHAR>>
<<X,,,>> <<X>>
<<X,,>> <<X>>
<<X,>> <<X>>
<<X>> <<X>>
I also tested a variant with commas in place of the X's and that left the single comma alone.
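For comparison, here is how the same splitting might look with sscanf() instead of strtok(), since the question asks about it. This is a sketch under assumptions: split_command and strip_trailing_commas are hypothetical names, and the 20-byte buffers match the arrays used in splitter() above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Removes any run of commas at the end of s. */
static void strip_trailing_commas(char *s)
{
    size_t len = strlen(s);
    while (len > 0 && s[len - 1] == ',')
        s[--len] = '\0';
}

/* Hypothetical helper: a, b and c must each hold 20 bytes.
 * The input is not modified. Returns 1 if three fields were found. */
int split_command(const char *command, char *a, char *b, char *c)
{
    assert(command != NULL && a != NULL && b != NULL && c != NULL);

    /* %19s skips leading spaces and tabs, then reads at most 19
     * non-whitespace characters, so the buffers cannot overflow. */
    if (sscanf(command, "%19s %19s %19s", a, b, c) != 3)
        return 0;

    strip_trailing_commas(a);
    strip_trailing_commas(b);
    strip_trailing_commas(c);
    return 1;
}
```

Two things make this arguably safer than the strtok() version: the field widths bound every copy, and the return value tells you when fewer than three fields are present instead of leaving a NULL pointer to be passed to strcpy(). One caveat: a token longer than 19 characters is silently split across fields.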
The problem:
I am attempting to use scanf to read a sentence with fields separated by |, so naturally I use scanf's built-in ability to ignore this symbol, but it then also ignores everything that has a | in it.
The code, simplified:
int main(){
char* a=malloc(8);
char* b=malloc(8);
scanf("%s | %s",a,b);
printf("%s %s",a,b);
}
when i attempt the input:
TEST | ME
it works as intended, but when i have the following case:
TEST ME|
it naturally reads TEST, but ignores the ME|. Is there any way around this?
scanf("%[^ \t|]%*[ \t|]%[^ \t\n|]", a,b);
printf("%s %s",a,b);
Annotation:
%* : ignore this element.
E.g. %*s //skip the reading of the text of this one
%[character set (allowed)] : Read only characters from the set that you specify.
E.g. %[0123456789] or %[0-9] //Read as a string only numeric characters
%[^character set (denied)] : When ^ appears at the beginning of the set, read only characters that are not in the set.
Yes, you can scan for a character set. The problem you're seeing is not related to the vertical bar, it's the fact that a string stops at the first whitespace character, i.e. the space between "TEST" and "ME|".
So, do something like:
if(scanf("%7[^|] | %7[^|]", a, b) == 2)
{
a[7] = b[7] = '\0';
printf("got '%s' and '%s'\n", a, b);
}
See the manual page for scanf() for details on the [ conversion specifier.
This one should work.
char a[200], b[200];
scanf ("%[^|]| %[^\n]", a, b); // Use it exactly
printf ("a = %s\nb = %s\n", a, b);
Meaning of this format: I separate the format string into 3 parts and explain each.
"%[^|]" - Scan everything into the 1st string, until the bar character ('|') appears.
"| " - Read the '|' and ignore it. Read all white space characters and ignore them.
"%[^\n]" - Read the remainder of the line into the 2nd string.
Test case
first string is this | 2nd is this
a = first string is this
b = 2nd is this
no space|between bar
a = no space
b = between bar
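The same scanset idea can be applied to a line that is already in memory, with field widths and a return-value check added. This is a sketch only: split_bar and rtrim are hypothetical names, the 64-byte buffers are an arbitrary choice, and trimming trailing whitespace is my addition so that "TEST | ME" yields "TEST" rather than "TEST ".

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Removes trailing whitespace from s. */
static void rtrim(char *s)
{
    size_t len = strlen(s);
    while (len > 0 && isspace((unsigned char)s[len - 1]))
        s[--len] = '\0';
}

/* Hypothetical helper: a and b must each hold 64 bytes.
 * Returns 1 if a first field was found, 0 otherwise;
 * b is left empty when nothing follows the bar. */
int split_bar(const char *line, char *a, char *b)
{
    assert(line != NULL && a != NULL && b != NULL);

    a[0] = b[0] = '\0';
    /* " %63[^|\n]" skips leading whitespace, then reads up to the bar;
     * "| " matches the bar and skips whitespace after it. */
    int n = sscanf(line, " %63[^|\n]| %63[^|\n]", a, b);
    if (n < 1)
        return 0;

    rtrim(a);
    rtrim(b);
    return 1;
}
```

With this format a space inside a field (as in "TEST ME|") no longer ends the conversion, because only '|' and the newline are excluded from the scanset.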
Leading spaces can be stripped by using an extra local variable to receive them.
%[ ] must appear in the scanf format to capture the leading spaces.
With the format "%[ ]%[^\n]" and the arguments first_string, second_string, scanf reads two strings:
first_string contains leading spaces from given input string
second_string contains actual data without leading spaces.
Following is the sample code
#include <stdio.h>
#include <string.h>
int main()
{
char lVar[30];
char lPlaceHolder[30];
printf("\n Enter any string with leading spaces : ");
memset(lVar,'\0',30);
memset(lPlaceHolder,'\0',30);
scanf("%[ ]%[^\n]",lPlaceHolder,lVar);
printf("\n lPlaceHolder is :%s:\n",lPlaceHolder);
printf("\n lVar is :%s:\n",lVar);
return(0);
}
Input:
" hello world"
Output:
lPlaceHolder is : :
lVar is :hello world:
Note: the spaces in lPlaceHolder are not displayed properly after uploading to the Stack Overflow website.
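If the leading spaces themselves are not needed, a single space at the start of the format is enough, because a whitespace directive matches any amount of whitespace, including none. A minimal sketch, using sscanf on a string so it can be tried without typing input; read_trimmed is a made-up name and the 30-byte buffer mirrors lVar above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: out must hold 30 bytes.
 * Returns 1 if some non-blank text was read, 0 otherwise. */
int read_trimmed(const char *input, char *out)
{
    assert(input != NULL && out != NULL);

    /* The leading space skips any leading whitespace (or none);
     * %29[^\n] then reads the rest of the line, at most 29 characters. */
    return sscanf(input, " %29[^\n]", out) == 1;
}
```

Unlike "%[ ]%[^\n]", this also works when the input has no leading spaces at all, since %[ ] fails (and stops the scan) if it cannot match at least one space.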
I'd say instead of messing with scanf(), try using saner functions - those that work as per the (intuitive) expectations:
char s1[] = "FOO | BAR";
char s2[] = "FOO BAR |";
void print_sep(char *in)
{
char *endp;
char *sep = strtok_r(in, "|", &endp);
if (sep != NULL)
printf("%s\n", sep);
if ((sep = strtok_r(NULL, "|", &endp)) != NULL)
printf("%s\n", sep);
}
print_sep(s1);
print_sep(s2);
I've been working on a program where I need to use C to scan lines from a file and store them in a struct.
My .txt file is in the form:
NAME 0.2 0.3
NAME2 0.8 0.1
Or, in general, it's a string followed by 2 doubles.
My struct is:
struct device {
char* name;
double interruptProbability, interruptTime, startTime, endTime;
} myDevice;
Now, I'm able to scan the lines in fine, but when it comes time to put them into my struct, something gets messed up. Here's how I'm doing the scanning:
char line[BUFSIZ];
while(fgets (line, BUFSIZ, devicesFile) != NULL){
struct device *d = &myDevice;
if(!isspace(*line)){
printf("String: %s \n", &line);
d->name = "success"; // for testing purposes
printf("device name before: %s \n", d[0]);
sscanf(line, "%s %f %f",&d->name, &d->interruptProbability, &d->interruptTime);
printf("device name after: %s \n", d[0]);
}
}
When I run this, it'll print off:
String: Disk 0.2 0.00005
device name before: success
before giving me a seg fault.
I ran GDB to test what's going on with the scan, and for whatever reason it puts in d->name a huge hex number that has (Address out of bounds) next to it.
Any ideas?
It's because you're overwriting a literal string in the sscanf call. d->name points to a literal string, and those are read-only and of a fixed length (so if the string you try to get is longer than 7 characters you also try to write beyond the end).
You need to either use an array for d->name or allocate memory on the heap for it.
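You are not allocating space for each of your char *name. You need to add d->name = (char *)malloc(<length of the token>*sizeof(char)+1) before your sscanf call, and then pass d->name (not &d->name) to sscanf, because %s expects a char * pointing at the buffer.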
You can't scan a string into the pointer d->name.
Not even after you assign it a constant value:
d->name = "success"; // for testing purposes
You need to allocate memory for it, or make it an array. You should be very careful using sscanf to read strings. It might be better to use strtok or just strchr to find the first space and then copy the bytes out with strdup.
char *next = strchr(line, ' ');
if( next != NULL ) {
*next++ = 0; // Terminate string and move to next token.
d->name = strdup(line); // Make a copy of tokenised string
// Read the doubles (%lf, not %f, and pass their addresses) - note you should check that the result is equal to 2.
int count = sscanf(next, "%lf %lf", &d->interruptProbability, &d->interruptTime);
}
You're sscanf-ing into a string literal. String literals are read-only in C and C++ (modifying one is undefined behavior), so sscanf may crash whilst trying to write to the string literal "success".
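Putting the advice from these answers together (a real buffer for the name, a bounded %s, %lf for doubles, and a return-value check), a line could be parsed roughly like this. It is a sketch under assumptions: struct device_record, its 32-byte name array and parse_device are hypothetical stand-ins for the question's struct device.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record: the name is an array, so it owns its storage. */
struct device_record {
    char name[32];
    double interruptProbability;
    double interruptTime;
};

/* Returns 1 if the line holds a name followed by two numbers, else 0. */
int parse_device(const char *line, struct device_record *d)
{
    assert(line != NULL && d != NULL);

    /* %31s leaves room for the terminator; an array name already
     * decays to char *, so no & is used for it. The doubles need
     * %lf and their addresses. */
    int n = sscanf(line, "%31s %lf %lf",
                   d->name, &d->interruptProbability, &d->interruptTime);
    return n == 3;
}
```

If names can be arbitrarily long, allocating d->name with malloc (or strdup, as in the answer above) is the alternative; the fixed array just keeps the sketch short.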