Input : [1,3,2,4]
I want to make arr[4] = {1, 3, 2, 4} from this input using scanf(). How can I do this in C language?
It is possible to parse input such as you describe with scanf, but each scanf call will parse up to a maximum number of fields determined by the given format. Thus, to parse an arbitrary number of fields requires an arbitrary number of scanf calls.
In comments, you wrote that
I want to find a method to ignore '[', ']', ',' and only accept integer units.
Taking that as the focus of the question, and therefore ignoring the issues of how you allocate space for the integers to be read when you do not know in advance how many there will be, and assuming that you may not use input functions other than scanf, it seems like you are looking for something along these lines:
int value;
char delim[2] = { 0 };
// Scan and confirm the opening '['
value = 0;
if (scanf("[%n", &value) == EOF) {
// handle end of file or I/O error ...
} else if (value == 0) {
// handle input not starting with a '[' ...
// Note: value == zero because we set it so, and the %n directive went unprocessed
} else {
// if value != 0 then it's because a '[' was scanned and the %n was processed
assert(value == 1);
}
// scan the list items
do {
// One integer plus trailing delimiter, either ',' or ']'
switch(scanf("%d%1[],]", &value, delim)) {
case EOF:
// handle end of file or I/O error (before an integer is read) ...
break;
case 0:
// handle input not starting with an integer ...
// The input may be malformed, but this point will also be reached for an empty list
break;
case 1:
// handle malformed input starting with an integer (which has been scanned) ...
break;
case 2:
// handle valid (to this point) input. The scanned value needs to be stored somewhere ...
break;
default:
// cannot happen
assert(0);
}
// *delim contains the trailing delimiter that was scanned
} while (*delim == ',');
// assuming normal termination of the loop:
assert(*delim == ']');
Points to note:
it is essential to pay attention to the return value of scanf. Failure to do so and to respond appropriately will cause all manner of problems when unexpected input is presented.
the above will accept slightly more general input than you describe, with whitespace (including line terminators) permitted before each integer.
The directive %1[],] attempts to scan a 1-character string whose element is either ] or ,. This is a bit arcane. Also, because the input is scanned as a string, you must be sure to provide space for a string terminator to be written, too.
it would be easier to write a character-by-character parser for your specific format that does not rely on scanf. You could also use scanf to read one character at a time to feed such a parser, but that seems to violate the spirit of the exercise.
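To make the last point concrete, here is a minimal sketch of such a hand-written parser. It swaps scanf for strtol and works on a string that has already been read, so it only illustrates the parsing logic. The name parse_int_list is made up for this example, and the caller is assumed to supply the array and its capacity.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: parses text such as "[1,3,2,4]" from s into out[].
 * Returns the number of integers stored, or -1 if the text is malformed
 * or holds more than max values. */
int parse_int_list(const char *s, int *out, size_t max)
{
    size_t n = 0;

    while (isspace((unsigned char)*s)) s++;
    if (*s++ != '[') return -1;              /* must open with '[' */

    while (isspace((unsigned char)*s)) s++;
    if (*s == ']') return 0;                 /* empty list "[]" */

    for (;;) {
        char *end;
        errno = 0;
        long v = strtol(s, &end, 10);        /* skips leading whitespace itself */
        if (end == s || errno == ERANGE || v < INT_MIN || v > INT_MAX)
            return -1;                       /* no digits, or out of int range */
        if (n == max) return -1;             /* caller's array is full */
        out[n++] = (int)v;

        s = end;
        while (isspace((unsigned char)*s)) s++;
        if (*s == ',') { s++; continue; }    /* another item follows */
        if (*s == ']') return (int)n;        /* proper end of list */
        return -1;                           /* unexpected character */
    }
}
```

With int arr[4], the call parse_int_list("[1,3,2,4]", arr, 4) returns 4 and fills arr with 1, 3, 2, 4. Anything after the closing ']' is ignored.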
While I think that John Bollinger's answer is pretty good and complete (even without considering the wonderful %1[],]), I would go for a more compact and tolerant version like this:
#include <stdio.h>
size_t arr_input(int *arr, size_t max_size)
{
size_t n;
for (n = 0; n < max_size; ++n) {
char c;
int res = scanf("%c%d", &c, arr + n);
if (res != 2
|| (n == 0 && c != '[')
|| (n > 0 && c != ',')
|| (n > 0 && c == ']')) {
break;
}
}
return n;
}
int main(void)
{
char *test_strings[] = { "[1,2,3,4]", "[42]", "[1,1,2,3,5,8]", "[]",
"[10,20,30,40,50,60,70,80,90,100]", "[1,2,3]4" };
size_t test_strings_n = sizeof test_strings / sizeof *test_strings;
char filename[L_tmpnam];
tmpnam(filename);
for (size_t i = 0; i < test_strings_n; ++i) {
freopen(filename, "w+", stdin);
fputs(test_strings[i], stdin);
rewind(stdin);
int arr[9];
size_t num_elem = arr_input(arr, 9);
printf("%zu: %s -> ", i, test_strings[i]);
for (size_t j = 0; j < num_elem; ++j) {
printf("%d ", arr[j]);
}
printf("\n");
fclose(stdin);
}
remove(filename);
return 0;
}
The idea is that you allocate space for the maximum number of integers you accept, then ask the arr_input() function to fill it up to max_size elements.
The check after scanf() tries to cope with incorrect input, but is not very complete. If you trust your input to be correct (don't) you can even make it shorter, by dropping the three || cases.
The most complex thing was to write the test driver with tmp files, strings, reopening and such. Here I'd have loved to have std::istream to just drop a std::stringstream. The fact that the FILE interface doesn't support strings really bugs me.
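One way around the temporary file, at least for testing, is to run the same logic against a string with sscanf. The following is only a sketch: arr_input_str is a hypothetical string-based twin of arr_input(), and it relies on the %n directive to learn how many characters each call consumed.

```c
#include <stdio.h>

/* Hypothetical string-based variant of arr_input(): same checks, but it
 * reads from the string s with sscanf() instead of from stdin. */
size_t arr_input_str(const char *s, int *arr, size_t max_size)
{
    size_t n;
    for (n = 0; n < max_size; ++n) {
        char c;
        int used = 0;
        /* %n stores how many characters this call has consumed */
        int res = sscanf(s, "%c%d%n", &c, arr + n, &used);
        if (res != 2
            || (n == 0 && c != '[')
            || (n > 0 && c != ',')) {
            break;
        }
        s += used;   /* continue right after the integer just read */
    }
    return n;
}
```

For the test strings above this gives the same counts as the stdin version, for example 4 for "[1,2,3,4]", 0 for "[]" and 3 for "[1,2,3]4".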
int arr[4];
for(int i=0;i<4;i++) scanf("%d",&arr[i]);
Are you asking for this? I was a little bit confused by your question; if this doesn't solve your problem, then don't hesitate to ask again...
Use scanf to read the input from the user as a string, then parse that string into an integer array.
To parse it, you can use the string function "find" to locate the "," and the "[" and "]", and then use "atoi" to convert each piece of the string into an integer to fill the destination array.
Edit: find is a C++ function; the C function is strchr.
I want to check to make sure that a given string contained in an array called secretWord has no symbols in it (e.g. $ % & #). If it does have a symbol in it, I make the user re-enter the string. It takes advantage of recursion to keep asking until they enter a string that does not contain a symbol.
The only symbol I do accept is the NULL symbol (the symbol represented by the ASCII value of zero). This is because I fill all the empty space in the array with NULL symbols.
My function is as follows:
void checkForSymbols(char *array, int arraysize){ //Checks for symbols in the array and if there are any it recursively calls this function until it gets input without them.
for (int i = 0; i < arraysize; i++){
if (!isdigit(array[i]) && !isalpha(array[i]) && array[i] != (char) 0){
flushArray(array, arraysize);
printf("No symbols are allowed in the word. Please try again: ");
fgets(secretWord, sizeof(secretWord) - 1, stdin);
checkForSymbols(secretWord, sizeof(secretWord));
}//end if (!isdigit(array[i]) && !isalpha(array[i]) && array[i] != 0)
else
continue;
}//end for(i = 0; i < sizeof(string[]); i++){
}//end checkForSymbols
The problem: When I enter any input (see example below), the if statement runs (it prints No symbols are allowed in the word. Please try again: and asks for new input).
I assume the problem obviously stems from the statement if (!isdigit(array[i]) && !isalpha(array[i]) && array[i] != (char) 0). But I have tried changing the (char) 0 part to '\0' and 0 as well and neither change had any effect.
How do I compare if what is in the index is a symbol, then? Why are strings without symbols setting this if statement off?
And if any of you are wondering what the "flushArray" method I used was, here it is:
void flushArray(char *array, int arraysize){ //Fills in the entire passed array with NULL characters
for (int i = 0; i < arraysize; i++){
array[i] = 0;
}
}//end flushArray
This function is called on the third line of my main() method, right after a print statement on the first line that asks users to input a word, and an fgets() statement on the second line that gets the input that this checkForSymbols function is used on.
As per request, an example would be if I input "Hello" as the secretWord string. The program then runs the function on it, and the if statement is for some reason triggered, causing it to
1. Replace all values stored in the secretWord array with the ASCII value of 0. (AKA NULL)
2. Prints No symbols are allowed in the word. Please try again: to the console.
3. Waits for new input that it will store in the secretWord array.
4. Calls the checkForSymbols() method on these new values stored in secretWord.
And no matter what you input as new secretWord, the checkForSymbols() method's if statement fires and it repeats steps 1 - 4 all over again.
Thank you for being patient and understanding with your help!
You can do something like this to find symbols in your input; put the code at the proper location:
#include <stdio.h>
#include <string.h>
int main () {
char invalids[] = "#.<#>";
char * temp;
temp=strchr(invalids,'s');//is s an invalid character?
if (temp!=NULL) {
printf ("Invalid character");
} else {
printf("Valid character");
}
return 0;
}
This checks whether 's' is a valid entry or not. Similarly, if your array of invalid characters is not null terminated, you can do something like this:
#include <string.h>
char invalid[] = { '#', '#', '&', '$', '<' }; // note last element isn't '\0'
if (memchr(invalid, 'a', sizeof(invalid))) { // 'false' is a reserved name, so the array is called 'invalid'
// do stuff
}
memchr is used if your array is not null terminated.
As suggested by @David C. Rankin, you can also use strpbrk, like this:
#include <stdio.h>
#include <string.h>
int main () {
const char str1[] = ",*##_$&+.!";
const char str2[] = "##"; //input string
char *ret;
ret = strpbrk(str1, str2);
if(ret) {
printf("First matching character: %c\n", *ret);
} else {
printf("Continue");
}
return(0);
}
The only symbol I do accept is the NULL symbol (the symbol represented by the ASCII value of zero). This is because I fill all the empty space in the array with NULL symbols.
NULL is a pointer; if you want a character value 0, you should use 0 or '\0'. I assume you're using memset or strncpy to ensure the trailing bytes are zero? Nope... What a shame, your MCVE could be so much shorter (and complete). :(
void checkForSymbols(char *array, int arraysize){
/* ... */
if (!isdigit(array[i]) && !isalpha(array[i]) /* ... */
As per section 7.4p1 of the C standard, ...
In all cases the argument is an int, the value of which shall be representable as an unsigned char or shall equal the value of the macro EOF. If the argument has any other value, the behavior is undefined.
Not all char values are representable as an unsigned char or equal to EOF, and so it's possible (and highly likely given the nature of this question) that the code above invokes undefined behaviour.
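A common way to stay within that requirement is to convert each char to unsigned char before handing it to a <ctype.h> function. The sketch below assumes the goal is the one from the question (letters, digits and zero bytes are fine, anything else is a symbol); has_symbol is a name invented for this example.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if the first len bytes of array hold a
 * character that is neither alphanumeric nor the zero byte, else 0. */
int has_symbol(const char *array, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        /* convert to unsigned char first, so that a negative char value
         * is never passed to isalnum() */
        unsigned char ch = (unsigned char)array[i];
        if (ch != '\0' && !isalnum(ch))
            return 1;
    }
    return 0;
}
```

Note that a buffer filled by fgets usually still contains the '\n', which this function reports as a symbol, just like the original loop does.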
As you haven't completed your question (by providing an MCVE, and describing what errors are occurring) I'm assuming that the question you're trying to ask might be a duplicate of this question, this question, this question, this question and probably a whole lot of others... If so, did you try Googling the error message? That's probably the first thing you should've done. Should that fail in the future, ask a question about the error message!
As per request, an example would be if I input "Hello" as the secretWord string.
I assume secretWord is declared as char secretWord[] = "Hello"; in your example, and not char *secretWord = "Hello";. The two types are distinct, and your book should clarify that. If not, which book are you reading? I can probably recommend a better book, if you'd like.
Any attempt to modify a string literal (i.e. char *array = "Hello"; flushArray(array, ...)) is undefined behaviour, as explained by answers to this question (among many others, I'm sure).
It seems a solution to this problem might be available by using something like this...
In response to your comment, you are probably making it a bit tougher on yourself than it needs to be. You have two issues to deal with (one you are not seeing). The first being to check the input to validate only a-zA-Z0-9 are entered. (you know that). The second being you need to identify and remove the trailing '\n' read and included in your input by fgets. (that one may be tripping you up)
You don't show how the initial array is filled, but given your use of fgets on secretWord (see footnote 1), I suspect you are also using fgets for array, which is exactly what you should be using. However, you need to remove the '\n' included at the end of the buffer filled by fgets before you call checkforsymbols. Otherwise you have the character 0xa (the '\n') at the end, which, of course, is not in a-zA-Z0-9 and will cause your check to fail.
To remove the trailing '\n', all you need to do is check the last character in your buffer. If it is a '\n', then simply overwrite it with the nul-terminating character (either 0 or the equivalent character representation '\0' -- your choice). You simply need the length of the string (which you get with strlen from string.h) and then check if (len && string[len - 1] == '\n'); testing len first avoids indexing before the start of an empty string. For example:
size_t len = strlen (str); /* get length of str */
if (len && str[len - 1] == '\n') /* check for trailing '\n' */
str[--len] = 0; /* overwrite with nul-byte */
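If you do not need the length afterwards, a widely used alternative is to let strcspn find the newline. This is just a sketch; trim_newline is a name chosen for the example.

```c
#include <string.h>

/* Hypothetical helper: cuts str at its first '\n', if there is one.
 * strcspn returns the index of the first '\n', or the length of the
 * string when there is none, so the assignment is safe either way. */
void trim_newline(char *str)
{
    str[strcspn(str, "\n")] = '\0';
}
```

This also copes with an empty string, because the index is then 0 and the terminator is simply rewritten.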
A third issue, important, but not directly related to the comparison, is to always choose a type for your function that will return an indication of Success/Failure as needed. In your case the choice of void gives you nothing to check to determine whether there were any symbols found or not. You can choose any type you like int, char, char *, etc.. All will allow the return of a value to gauge success or failure. For testing strings, the normal choice is char *, returning a valid pointer on success or NULL on failure.
A fourth issue when taking input is you always need to handle the case where the user chooses to cancel input by generating a manual EOF with either ctrl+d on Linux or ctrl+z on Windows. The return of NULL by fgets gives you that ability. But with it (and every other input function), you have to check the return and make use of the return information in order to validate the user input. Simply check whether fgets returns NULL on your request for input, e.g.
if (!fgets (str, MAXS, stdin)) { /* read/validate input */
fprintf (stderr, "EOF received -> user canceled input.\n");
return 1; /* change as needed */
}
For your specific case where you only want a-zA-Z0-9, all you need to do is iterate down the string the user entered, checking each character to make sure it is a-zA-Z0-9 and return failure if anything else is encountered. This is made easy given that every string in C is nul-terminated. So you simply assign a pointer to the start of your string (e.g. char *p = str;) and then use either a for or while loop to check each character, e.g.
for (; *p != 0; p++) { do stuff }
that can be written in shorthand:
for (; *p; p++) { do stuff }
or use while:
while (*p) { do stuff; p++; }
Putting all of those pieces together, you could write your function to take a string as its only parameter and return NULL if a symbol is encountered, or return a pointer to your original string on success, e.g.
char *checkforsymbols (char *s)
{
if (!s || !*s) return NULL; /* validate string and not empty */
char *p = s; /* pointer to iterate over string */
for (; *p; p++) /* for each char in s */
if ((*p < 'a' || *p > 'z') && /* char is not a-z */
(*p < 'A' || *p > 'Z') && /* char is not A-Z */
(*p < '0' || *p > '9')) { /* char is not 0-9 */
fprintf (stderr, "error: '%c' not allowed in input.\n", *p);
return NULL; /* indicate failure */
}
return s; /* indicate success */
}
A short complete test routine could be:
#include <stdio.h>
#include <string.h>
#define MAXS 256
char *checkforsymbols (char *s);
int main (void) {
char str[MAXS] = "";
size_t len = 0;
for (;;) { /* loop until str w/o symbols */
printf (" enter string: "); /* prompt for user input */
if (!fgets (str, MAXS, stdin)) { /* read/validate input */
fprintf (stderr, "EOF received -> user canceled input.\n");
return 1;
}
len = strlen (str); /* get length of str */
if (len && str[len - 1] == '\n') /* check for trailing '\n' */
str[--len] = 0; /* overwrite with nul-byte */
if (checkforsymbols (str)) /* check for symbols */
break;
}
printf (" valid str: '%s'\n", str);
return 0;
}
char *checkforsymbols (char *s)
{
if (!s || !*s) return NULL; /* validate string and not empty */
char *p = s; /* pointer to iterate over string */
for (; *p; p++) /* for each char in s */
if ((*p < 'a' || *p > 'z') && /* char is not a-z */
(*p < 'A' || *p > 'Z') && /* char is not A-Z */
(*p < '0' || *p > '9')) { /* char is not 0-9 */
fprintf (stderr, "error: '%c' not allowed in input.\n", *p);
return NULL; /* indicate failure */
}
return s; /* indicate success */
}
Example Use/Output
$ ./bin/str_chksym
enter string: mydoghas$20worthoffleas
error: '$' not allowed in input.
enter string: Baddog!
error: '!' not allowed in input.
enter string: Okheisagood10yearolddog
valid str: 'Okheisagood10yearolddog'
or if the user cancels user input:
$ ./bin/str_chksym
enter string: EOF received -> user canceled input.
footnote 1.
C generally prefers the use of all lower-case variable names, while reserving all upper-case for macros and defines. Leave MixedCase or camelCase variable names for C++ and Java. However, since this is a matter of style, this is completely up to you.
I am trying to write a function that does the following things:
Start an input loop, printing '> ' each iteration.
Take whatever the user enters (unknown length) and read it into a character array, dynamically allocating the size of the array if necessary. The user-entered line will end at a newline character.
Add a null byte, '\0', to the end of the character array.
Loop terminates when the user enters a blank line: '\n'
This is what I've currently written:
void input_loop(){
char *str = NULL;
printf("> ");
while(printf("> ") && scanf("%a[^\n]%*c",&input) == 1){
/*Add null byte to the end of str*/
/*Do stuff to input, including traversing until the null byte is reached*/
free(str);
str = NULL;
}
free(str);
str = NULL;
}
Now, I'm not too sure how to go about adding the null byte to the end of the string. I was thinking something like this:
last_index = strlen(str);
str[last_index] = '\0';
But I'm not too sure if that would work though. I can't test if it would work because I'm encountering this error when I try to compile my code:
warning: ISO C does not support the 'a' scanf flag [-Wformat=]
So what can I do to make my code work?
EDIT: changing scanf("%a[^\n]%*c",&input) == 1 to scanf("%as[^\n]%*c",&input) == 1 gives me the same error.
First of all, scanf format strings do not use regular expressions, so I don't think something close to what you want will work. As for the warning you get, according to my trusty manual, %a is a conversion specifier for floating-point numbers, and only since C99. Its use as a flag that makes scanf allocate the string is a GNU extension, which is why the compiler says that ISO C does not support it.
But then you have a bigger problem. Standard scanf expects you to pass it a previously allocated buffer for it to fill with the input it reads. It does not malloc the string for you, so your attempts at initializing str to NULL and the corresponding frees will not work with standard scanf.
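There is one non-ISO exception worth knowing about: POSIX.1-2008 defines an m modifier (the successor of the old GNU %a flag) that does make the scanf family allocate the buffer. The sketch below assumes a C library that implements it, such as glibc. It uses sscanf on a string so that it is easy to try out, and first_line is a made-up name.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns a malloc'd copy of the text of input up to
 * (not including) the first newline, or NULL if there is nothing to read.
 * Relies on the POSIX.1-2008 'm' modifier; this is not ISO C. */
char *first_line(const char *input)
{
    char *line = NULL;
    if (sscanf(input, "%m[^\n]", &line) != 1)
        return NULL;   /* blank line, empty input, or allocation failure */
    return line;       /* the caller must free() this */
}
```

The same format works with scanf on stdin, but a compiler in pedantic ISO mode may still warn about the m flag, and the code will not work on C libraries that lack the extension.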
The simplest thing you can do is to give up on arbitrary-length strings. Create a large buffer and forbid inputs that are longer than that.
You can then use the fgets function to populate your buffer. To check if it managed to read the full line, check if your string ends with a "\n".
char str[256+1];
for(;;){
printf("> ");
if(!fgets(str, sizeof str, stdin)){
//error or end of file
break;
}
size_t len = strlen(str);
if(len + 1 == sizeof str && str[len - 1] != '\n'){
//buffer is full and holds no newline: user typed something too long
exit(1);
}
printf("user typed %s", str);
}
Another alternative is you can use a nonstandard library function. For example, in Linux there is the getline function that reads a full line of input using malloc behind the scenes.
No error checking, don't forget to free the pointer when you're done with it. If you use this code to read enormous lines, you deserve all the pain it will bring you.
#include <stdio.h>
#include <stdlib.h>
char *readInfiniteString() {
int l = 256;
char *buf = malloc(l);
int p = 0;
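int ch; /* int, not char, so that EOF can be told apart from real characters */
ch = getchar();
while(ch != '\n' && ch != EOF) { /* without the EOF test this loops forever at end of input */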
buf[p++] = ch;
if (p == l) {
l += 256;
buf = realloc(buf, l);
}
ch = getchar();
}
buf[p] = '\0';
return buf;
}
int main(int argc, char *argv[]) {
printf("> ");
char *buf = readInfiniteString();
printf("%s\n", buf);
free(buf);
}
If you are on a POSIX system such as Linux, you should have access to getline. It can be made to behave like fgets, but if you start with a null pointer and a zero length, it will take care of memory allocation for you.
You can use it in a loop like this:
#include <stdlib.h>
#include <stdio.h>
#include <string.h> // for strcmp
int main(void)
{
char *line = NULL;
size_t nline = 0;
for (;;) {
ssize_t n; // getline returns ssize_t, -1 on failure or end of file
printf("> ");
// read line, allocating as necessary
n = getline(&line, &nline, stdin);
if (n < 0) break;
// remove trailing newline
if (n && line[n - 1] == '\n') line[n - 1] = '\0';
// do stuff
printf("'%s'\n", line);
if (strcmp("quit", line) == 0) break;
}
free(line);
printf("\nBye\n");
return 0;
}
The passed pointer and the length value must be consistent, so that getline can reallocate memory as required. (That means that you shouldn't change nline or the pointer line in the loop.) If the line fits, the same buffer is used in each pass through the loop, so that you have to free the line string only once, when you're done reading.
Some have mentioned that scanf is probably unsuitable for this purpose. I wouldn't suggest using fgets, either. Though it is slightly more suitable, there are problems that seem difficult to avoid, at least at first. Few C programmers manage to use fgets right the first time without reading the fgets manual in full. The parts most people manage to neglect entirely are:
what happens when the line is too large, and
what happens when EOF or an error is encountered.
The fgets() function shall read bytes from stream into the array pointed to by s, until n-1 bytes are read, or a <newline> is read and transferred to s, or an end-of-file condition is encountered. The string is then terminated with a null byte.
Upon successful completion, fgets() shall return s. If the stream is at end-of-file, the end-of-file indicator for the stream shall be set and fgets() shall return a null pointer. If a read error occurs, the error indicator for the stream shall be set, fgets() shall return a null pointer...
I don't feel I need to stress the importance of checking the return value too much, so I won't mention it again. Suffice to say, if your program doesn't check the return value your program won't know when EOF or an error occurs; your program will probably be caught in an infinite loop.
When no '\n' is present, the remaining bytes of the line are yet to have been read. Thus, fgets will always parse the line at least once, internally. When you introduce extra logic, to check for a '\n', to that, you're parsing the data a second time.
This allows you to realloc the storage and call fgets again if you want to dynamically resize the storage, or discard the remainder of the line (warning the user of the truncation is a good idea), perhaps using something like fscanf(file, "%*[^\n]");.
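The "discard the remainder" option could look roughly like the sketch below. It uses a plain fgetc loop in place of the fscanf call mentioned above, so that it can also report whether anything was thrown away; read_line_truncating and its return codes are assumptions made for this example.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from f into buf (size bytes, size > 0).
 * Returns -1 at end-of-file or on error, 1 if the line was too long and
 * its tail was discarded, 0 otherwise. The '\n' is never stored. */
int read_line_truncating(char *buf, size_t size, FILE *f)
{
    if (fgets(buf, (int)size, f) == NULL)
        return -1;

    size_t len = strcspn(buf, "\n");
    if (buf[len] == '\n') {          /* the whole line fitted */
        buf[len] = '\0';
        return 0;
    }

    /* No '\n' was stored: skip to the end of the line, consuming the
     * '\n' itself, and remember whether any real data was dropped. */
    int c, skipped = 0;
    while ((c = fgetc(f)) != EOF && c != '\n')
        skipped = 1;
    return skipped;
}
```

A caller can print a truncation warning whenever the function returns 1.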
hugomg mentioned using multiplication in the dynamic resize code to avoid quadratic runtime problems. Along this line, it would be a good idea to avoid parsing the same data over and over each iteration (thus introducing further quadratic runtime problems). This can be achieved by storing the number of bytes you've read (and parsed) somewhere. For example:
char *get_dynamic_line(FILE *f) {
size_t bytes_read = 0;
char *bytes = NULL, *temp;
do {
size_t alloc_size = bytes_read * 2 + 2; /* + 2 so that fgets always has room for at least one new byte */
temp = realloc(bytes, alloc_size);
if (temp == NULL) {
free(bytes);
return NULL;
}
bytes = temp;
temp = fgets(bytes + bytes_read, alloc_size - bytes_read, f); /* Parsing data the first time */
if (temp) bytes_read += strcspn(bytes + bytes_read, "\n"); /* Parsing data the second time; skipped when fgets stored nothing */
} while (temp && bytes[bytes_read] != '\n');
bytes[bytes_read] = '\0';
return bytes;
}
Those who do manage to read the manual and come up with something correct (like this) may soon realise the complexity of an fgets solution is at least twice as poor as the same solution using fgetc. We can avoid parsing data the second time by using fgetc, so using fgetc might seem most appropriate. Alas most C programmers also manage to use fgetc incorrectly when neglecting the fgetc manual.
The most important detail is to realise that fgetc returns an int, not a char. It may return typically one of 256 distinct values, between 0 and UCHAR_MAX (inclusive). It may otherwise return EOF, meaning there are typically 257 distinct values that fgetc (or consequently, getchar) may return. Trying to store those values into a char or unsigned char results in loss of information, specifically the error modes. (Of course, this typical value of 257 will change if CHAR_BIT is greater than 8, and consequently UCHAR_MAX is greater than 255)
char *get_dynamic_line(FILE *f) {
size_t bytes_read = 0;
char *bytes = NULL;
do {
if ((bytes_read & (bytes_read + 1)) == 0) {
void *temp = realloc(bytes, bytes_read * 2 + 1);
if (temp == NULL) {
free(bytes);
return NULL;
}
bytes = temp;
}
int c = fgetc(f);
bytes[bytes_read] = c >= 0 && c != '\n'
? c
: '\0';
} while (bytes[bytes_read++]);
return bytes;
}
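The point about fgetc returning an int is easy to see in a tiny loop. The sketch below simply counts bytes up to end-of-file; count_bytes is a name made up for the example.

```c
#include <stdio.h>

/* Hypothetical helper: counts the bytes left in f up to end-of-file. */
long count_bytes(FILE *f)
{
    long count = 0;
    int c;                       /* int, so EOF stays distinct from byte 0xFF */
    while ((c = fgetc(f)) != EOF)
        count++;
    return count;
}
```

If c were declared as char, a byte with the value 0xFF could compare equal to EOF where char is signed, and the loop would stop early; where char is unsigned, the comparison could never succeed and the loop would never end.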
I am a beginner learning C; so, please go easy on me. :)
I am trying to write a very simple program that takes each word of a string into a "Hi (input)!" sentence (it assumes you type in names). Also, I am using arrays because I need to practice them.
My problem is that some garbage gets put into the arrays somewhere, and it messes up the program. I tried to figure out the problem but to no avail; so, it is time to ask for expert help. Where have I made mistakes?
p.s.: It also has an infinite loop somewhere, but it is probably the result of the garbage that is put into the array.
#include <stdio.h>
#define MAX 500 //Maximum Array size.
int main(int argc, const char * argv[])
{
int stringArray [MAX];
int wordArray [MAX];
int counter = 0;
int wordCounter = 0;
printf("Please type in a list of names then hit ENTER:\n");
// Fill up the stringArray with user input.
stringArray[counter] = getchar();
while (stringArray[counter] != '\n') {
stringArray[++counter] = getchar();
}
// Main function.
counter = 0;
while (stringArray[wordCounter] != '\n') {
// Puts first word into temporary wordArray.
while ((stringArray[wordCounter] != ' ') && (stringArray[wordCounter] != '\n')) {
wordArray[counter++] = stringArray[wordCounter++];
}
wordArray[counter] = '\0';
//Prints out the content of wordArray.
counter = 0;
printf("Hi ");
while (wordArray[counter] != '\0') {
putchar(wordArray[counter]);
counter++;
}
printf("!\n");
//Clears temporary wordArray for new use.
for (counter = 0; counter == MAX; counter++) {
wordArray[counter] = '\0';
}
wordCounter++;
counter = 0;
}
return 0;
}
Solved it! I needed to add the following if statement at the end, where I incremented the wordCounter. :)
if (stringArray[wordCounter] != '\n') {
wordCounter++;
}
You are using int arrays to represent strings, probably because getchar() returns an int. However, strings are better represented as char arrays, since that's what they are in C. The fact that getchar() returns an int is certainly confusing; it does so because it needs to be able to return the special value EOF, which doesn't fit in a char. Therefore it uses int, which is a "larger" type (able to represent more different values). So, it can fit all the char values, and EOF.
With char arrays, you can use C's string functions directly:
char stringArray[MAX];
if(fgets(stringArray, sizeof stringArray, stdin) != NULL)
printf("You entered %s", stringArray);
Note that fgets() will leave the end-of-line character in the string, so you might want to strip it out. I suggest implementing an in-place function that trims off leading and trailing whitespace; it's a good exercise as well.
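In case you want something to compare your own attempt against, here is one possible sketch of such a trimming function. The name trim and the choice of returning the same pointer are just assumptions of this example.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: removes leading and trailing whitespace from s,
 * in place, and returns s. */
char *trim(char *s)
{
    char *start = s;
    while (isspace((unsigned char)*start))
        start++;                              /* skip leading whitespace */

    size_t len = strlen(start);
    while (len > 0 && isspace((unsigned char)start[len - 1]))
        len--;                                /* drop trailing whitespace */

    memmove(s, start, len);   /* the regions may overlap, so memmove, not memcpy */
    s[len] = '\0';
    return s;
}
```

Since '\n' counts as whitespace, calling trim(stringArray) right after fgets also removes the newline.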
for (counter = 0; counter == MAX; counter++) {
wordArray[counter] = '\0';
}
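You never enter this loop: counter starts at 0, so the condition counter == MAX is false on the very first check and the body never runs. The condition should be counter < MAX.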
user1799795,
For what it's worth (now that you've solved your problem) I took the liberty of showing you how I'd do this given the restriction "use arrays", and explaining a bit about why I'd do it that way... Just beware that while I am an experienced programmer I'm no C guru... I've worked with guys who absolutely blew me into the C-weeds (pun intended).
#include <stdio.h>
#include <string.h>
#define LINE_SIZE 500
#define MAX_WORDS 50
#define WORD_SIZE 20
// Main function.
int main(int argc, const char * argv[])
{
int counter = 0;
// ----------------------------------
// Read a line of input from the user (ie stdin)
// ----------------------------------
char line[LINE_SIZE];
printf("Please type in a list of names then hit ENTER:\n");
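if ( fgets(line, LINE_SIZE, stdin) == NULL ) {
// fgets returns NULL at end-of-file or on a read error; retrying in a
// loop would spin forever once stdin is exhausted, so give up instead.
fprintf(stderr, "You must enter something. Pretty please!\n");
return 1;
}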
// A note on that LINE_SIZE parameter to the fgets function:
// wherever possible it's a good idea to use the version of the standard
// library function that allows you to specify the maximum length of the
// string (or indeed any array) because that dramatically reduces the
// incidence of "string overruns", which are a major source of bugs in c
// programmes.
// Also note that fgets includes the end-of-line character/sequence in
// the returned string, so you have to ensure there's room for it in the
// destination string, and remember to handle it in your string processing.
// -------------------------
// split the line into words
// -------------------------
// the current word
char word[WORD_SIZE];
int wordLength = 0;
// the list of words
char words[MAX_WORDS][WORD_SIZE]; // an array of up to 50 words of
// up to 19 characters each (plus the null terminator)
int wordCount = 0; // the number of words in the array.
// The below loop syntax is a bit cryptic.
// The "char *c=line;" initialises the char-pointer "c" to the start of "line".
// The " *c;" is ultra-shorthand for: "is the-char-at-c not equal to zero".
// All strings in c end with a "null terminator" character, which has the
// integer value of zero, and is commonly expressed as '\0' or 0 (but not
// NULL, which is a null pointer constant). In the C language any integer may be evaluated as a
// boolean (true|false) expression, where 0 is false, and (pretty obviously)
// everything-else is true. So: If the character at the address-c is not
// zero (the null terminator) then go-round the loop again. Capiche?
// The "++c" moves the char-pointer to the next character in the line. I use
// the pre-increment "++c" in preference to the more common post-increment
// "c++" out of habit; for a plain pointer a compiler normally generates the same code for both.
//
// Note that this syntax is commonly used by "low level programmers" to loop
// through strings. There is an alternative which is less cryptic and is
// therefore preferred by most programmers, even though it's not quite as
// efficient. In this case the loop would be:
// int lineLength = strlen(line);
// for ( int i=0; i<lineLength; ++i)
// and then to get the current character
// char ch = line[i];
// We get the length of the line once, because the strlen function has to
// loop through the characters in the array looking for the null-terminator
// character at its end (guess what its implementation looks like ;-)...
// which is inherently an "expensive" operation (totally dependent on the
// length of the string) so we at least avoid repeating this operation.
//
// I know I might sound like I'm banging on about not-very-much but once you
// start dealing with "real world" magnitude datasets then such habits,
// formed early on, pay huge dividends in the ability to write performant
// code the first time round. Premature optimisation is evil, but my code
// doesn't hardly ever NEED optimising, because it was "fairly efficient"
// to start with. Yeah?
for ( char *c=line; *c; ++c ) { // foreach char in line.
char ch = *c; // "ch" is the character value-at the-char-pointer "c".
if ( ch==' ' // if this char is a space,
|| ch=='\n' // or we've reached the EOL char
) {
// 1. add the word to the end of the words list.
// note that we copy only wordLength characters, because "word" has
// no null-terminator, and then terminate the copy ourselves:
// strncpy does not add one when it copies exactly wordLength characters.
if (wordLength > 0 && wordCount < MAX_WORDS) { // skip empty words, stay inside the array
strncpy(words[wordCount], word, wordLength);
words[wordCount++][wordLength] = '\0';
}
// 2. and "clear" the word buffer.
wordLength=0;
} else if (wordLength==WORD_SIZE-1) { // this word is too long
// so split this word into two words.
if (wordCount < MAX_WORDS) {
strncpy(words[wordCount], word, wordLength);
words[wordCount++][wordLength] = '\0';
}
wordLength=0;
word[wordLength++] = ch;
} else {
// otherwise: append this character to the end of the word.
word[wordLength++] = ch;
}
}
// -------------------------
// print out the words
// -------------------------
for ( int w=0; w<wordCount; ++w ) {
printf("Hi %s!\n", words[w]);
}
return 0;
}
In the real world one can't make such restrictive assumptions about the maximum-length of words, or how many there will be, and if such restrictions are given they're almost always arbitrary and therefore proven wrong all too soon... so straight-off-the-bat for this problem, I'd be inclined to use a linked-list instead of the "words" array... wait till you get to "dynamic data structures"... You'll love em ;-)
Cheers. Keith.
PS: You're going pretty well... My advise is "just keep on truckin"... this gets a LOT easier with practice.
I've been doing a fairly easy program of converting a string of Characters (assuming numbers are entered) to an Integer.
After I was done, I noticed some very peculiar "bugs" that I can't answer, mostly because of my limited knowledge of how the scanf(), gets() and fgets() functions work. (I did read a lot of literature though.)
So without writing too much text, here's the code of the program:
#include <stdio.h>
#define MAX 100
int CharToInt(const char *);
int main()
{
char str[MAX];
printf(" Enter some numbers (no spaces): ");
gets(str);
// fgets(str, sizeof(str), stdin);
// scanf("%s", str);
printf(" Entered number is: %d\n", CharToInt(str));
return 0;
}
int CharToInt(const char *s)
{
int i, result, temp;
result = 0;
i = 0;
while(*(s+i) != '\0')
{
temp = *(s+i) & 15;
result = (temp + result) * 10;
i++;
}
return result / 10;
}
So here's the problem I've been having. First, when using gets() function, the program works perfectly.
Second, when using fgets(), the result is slightly wrong because apparently fgets() function reads newline (ASCII value 10) character last which screws up the result.
Third, when using scanf() function, the result is completely wrong because first character apparently has a -52 ASCII value. For this, I have no explanation.
Now I know that use of gets() is discouraged, so I would like to know if I can use fgets() here so it doesn't read (or ignores) the newline character.
Also, what's the deal with the scanf() function in this program?
Never use gets. It offers no protections against a buffer overflow vulnerability (that is, you cannot tell it how big the buffer you pass to it is, so it cannot prevent a user from entering a line larger than the buffer and clobbering memory).
Avoid using scanf. If not used carefully, it can have the same buffer overflow problems as gets. Even ignoring that, it has other problems that make it hard to use correctly.
Generally you should use fgets instead, although it's sometimes inconvenient (you have to strip the newline, you must determine a buffer size ahead of time, and then you must figure out what to do with lines that are too long–do you keep the part you read and discard the excess, discard the whole thing, dynamically grow the buffer and try again, etc.). There are some non-standard functions available that do this dynamic allocation for you (e.g. getline on POSIX systems, Chuck Falconer's public domain ggets function). Note that ggets has gets-like semantics in that it strips a trailing newline for you.
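One of the choices listed above, "keep the part you read and discard the excess", can be sketched as follows. This is only an illustration under stated assumptions: read_line is a hypothetical helper name, and its return convention (1, 0, -1) is made up for this example.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from fp into buf (capacity size,
 * assumed to be at least 2). The trailing newline is stripped. If the line
 * did not fit, the rest of it is read and thrown away so that the next call
 * starts on a fresh line.
 * Returns 1 if the whole line fit, 0 if it was truncated, -1 on EOF/error. */
int read_line(FILE *fp, char *buf, size_t size)
{
    if (fgets(buf, (int)size, fp) == NULL)
        return -1;

    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';          /* complete line: drop the newline */
        return 1;
    }

    /* No newline was stored: the buffer filled up or the input ended. */
    int ch = fgetc(fp);
    if (ch == EOF || ch == '\n')
        return 1;                     /* the text itself fit exactly */

    /* Discard the remainder of the over-long line. */
    while ((ch = fgetc(fp)) != '\n' && ch != EOF)
        ;
    return 0;
}
```

Called as read_line(stdin, str, sizeof str), a return value of 0 tells the caller that str holds only the beginning of what was typed.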
Yes, you want to avoid gets. fgets will always read the new-line if the buffer was big enough to hold it (which lets you know when the buffer was too small and there's more of the line waiting to be read). If you want something like fgets that won't read the new-line (losing that indication of a too-small buffer) you can use fscanf with a scan-set conversion like: "%N[^\n]", where the 'N' is replaced by the buffer size - 1.
One easy (if strange) way to remove the trailing new-line from a buffer after reading with fgets is: strtok(buffer, "\n"); This isn't how strtok is intended to be used, but I've used it this way more often than in the intended fashion (which I generally avoid).
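An alternative to the strtok trick, shown here only as a sketch, is strcspn, which returns the index of the first newline (or of the terminator if there is none). The helper name strip_newline is made up for this example. One difference worth knowing: when the buffer holds nothing but "\n", strtok finds no token and leaves the buffer unchanged, whereas this version still removes the newline.

```c
#include <string.h>

/* Hypothetical helper: overwrites the first '\n' in buf, if any, with '\0'.
 * If buf contains no newline, strcspn returns strlen(buf), so the existing
 * terminator is simply rewritten and nothing changes. */
void strip_newline(char *buf)
{
    buf[strcspn(buf, "\n")] = '\0';
}
```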
There are numerous problems with this code. We'll fix the badly named variables and functions and investigate the problems:
First, CharToInt() should be renamed to the proper StringToInt() since it operates on a string, not a single character.
The function CharToInt() [sic.] is unsafe. It doesn't check if the user accidentally passes in a NULL pointer.
It doesn't validate input, or more correctly, skip invalid input. If the user enters in a non-digit the result will contain a bogus value. i.e. If you enter in N the code *(s+i) & 15 will produce 14 !?
Next, the nondescript temp in CharToInt() [sic.] should be called digit since that is what it really is.
Also, the kludge return result / 10; is just that -- a bad hack to work around a buggy implementation.
Likewise MAX is badly named since it may appear to conflict with the common usage as a macro, i.e. #define MAX(x,y) (((x)>(y))?(x):(y))
The verbose *(s+i) is not as readable as simply *s. There is no need to use and clutter up the code with yet another temporary index i.
gets()
This is bad because it can overflow the input string buffer. For example, if the buffer size is 2, and you enter in 16 characters, you will overflow str.
scanf()
This is equally bad because it can overflow the input string buffer.
You mention "when using scanf() function, the result is completely wrong because first character apparently has a -52 ASCII value."
That is due to an incorrect usage of scanf(). I was not able to duplicate this bug.
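For what it's worth, the overflow risk of %s can be removed by giving the conversion a maximum field width. The sketch below uses sscanf on a string so it is easy to try out; scanf accepts the same format. BUF_SIZE and read_token are names invented for this illustration.

```c
#include <stdio.h>

#define BUF_SIZE 8   /* hypothetical small buffer, for illustration only */

/* Copies at most BUF_SIZE-1 non-whitespace characters from input into out.
 * The width in "%7s" must be one less than the buffer size, which leaves
 * room for the terminating '\0' that the conversion always appends.
 * Returns 1 if a token was stored, 0 if the input held nothing to read. */
int read_token(const char *input, char out[BUF_SIZE])
{
    return sscanf(input, "%7s", out) == 1;
}
```

Input longer than the width is not lost: the unread characters stay in the stream (or string) for the next conversion.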
fgets()
This is safe because you can guarantee you never overflow the input string buffer by passing in the buffer size (which includes room for the terminating null character).
getline()
A few people have suggested the POSIX getline() as a replacement. Unfortunately this is not a practical portable solution, as Microsoft does not implement a C version, only the standard C++ string template function, as the answer to SO question #27755191 explains. Microsoft's C++ getline() was available at least as far back as Visual Studio 6, but since the OP is strictly asking about C and not C++ this isn't an option.
Misc.
Lastly, this implementation is buggy in that it doesn't detect integer overflow. If the user enters too large a number the number may become negative! i.e. 9876543210 will become -18815698?! Let's fix that too.
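This is trivial to fix for an unsigned int. If the current partial number is less than the previous partial number then we have overflowed and we return the previous partial number. (Be aware that this comparison does not catch every wrap-around: with a 32-bit unsigned int, 1000000000 * 10 wraps to 1410065408, which is still larger than the previous value.)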
For a signed int this is a little more work. In assembly we could inspect the carry-flag, but in C there is no standard built-in way to detect overflow with signed int math. Fortunately, since we are multiplying by a constant, * 10, we can easily detect this if we use an equivalent equation:
n = x*10 = x*8 + x*2
If x*8 overflows then logically x*10 will as well. For a 32-bit int overflow will happen when x*8 = 0x100000000 thus all we need to do is detect when x >= 0x20000000. Since we don't want to assume how many bits an int has we only need to test if the top 3 msb's (Most Significant Bits) are set.
Additionally, a second overflow test is needed. If the msb is set (sign bit) after the digit concatenation then we also know the number overflowed.
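A different way to catch signed overflow, not the bit test described above, is to compare against INT_MAX before multiplying, so that no overflowing operation is ever performed. The following is a minimal sketch of that idea; digits_to_int and its overflowed out-parameter are hypothetical names, and returning INT_MAX on overflow is just one possible convention.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: converts a run of decimal digits to a non-negative
 * int, stopping at the first non-digit. On overflow it stores 1 in
 * *overflowed and returns INT_MAX; otherwise *overflowed is set to 0. */
int digits_to_int(const char *s, int *overflowed)
{
    int result = 0;

    *overflowed = 0;
    if (s == NULL)
        return 0;

    for (; *s >= '0' && *s <= '9'; ++s) {
        int digit = *s - '0';

        /* "result * 10 + digit > INT_MAX", rearranged so that the test
         * itself cannot overflow. */
        if (result > (INT_MAX - digit) / 10) {
            *overflowed = 1;
            return INT_MAX;
        }
        result = result * 10 + digit;
    }
    return result;
}
```

Because the check does not depend on how many bits an int has, it makes no assumption about the platform.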
Code
Here is a fixed, safe version along with code that you can play with to detect overflow in the unsafe versions. I've also included both signed and unsigned versions, selected via #define SIGNED 1
#include <stdio.h>
#include <ctype.h> // isdigit()
// 1 fgets
// 2 gets
// 3 scanf
#define INPUT 1
#define SIGNED 1
// re-implementation of atoi()
// Test Case: 2147483647 -- valid 32-bit
// Test Case: 2147483648 -- overflow 32-bit
int StringToInt( const char * s )
{
int result = 0, prev, msb = (sizeof(int)*8)-1, overflow;
if( !s )
return result;
while( *s )
{
if( isdigit( (unsigned char)*s ) ) // Alt.: if ((*s >= '0') && (*s <= '9'))
{
prev = result;
overflow = result >> (msb-2); // test if top 3 MSBs will overflow on x*8
result *= 10;
result += *s++ & 0xF;// OPTIMIZATION: *s - '0'
if( (result < prev) || overflow ) // check if would overflow
return prev;
}
else
break; // you decide SKIP or BREAK on invalid digits
}
return result;
}
// Test case: 4294967295 -- valid 32-bit
// Test case: 4294967296 -- overflow 32-bit
unsigned int StringToUnsignedInt( const char * s )
{
unsigned int result = 0, prev;
if( !s )
return result;
while( *s )
{
if( isdigit( (unsigned char)*s ) ) // Alt.: if (*s >= '0' && *s <= '9')
{
prev = result;
result *= 10;
result += *s++ & 0xF; // OPTIMIZATION: += (*s - '0')
if( result < prev ) // check if would overflow
return prev;
}
else
break; // you decide SKIP or BREAK on invalid digits
}
return result;
}
int main()
{
int detect_buffer_overrun = 0;
#define BUFFER_SIZE 2 // set to small size to easily test overflow
char str[ BUFFER_SIZE+1 ]; // C idiom is to reserve space for the NULL terminator
printf(" Enter some numbers (no spaces): ");
#if INPUT == 1
fgets(str, sizeof(str), stdin);
#elif INPUT == 2
gets(str); // can overflow
#elif INPUT == 3
scanf("%s", str); // can also overflow
#endif
#if SIGNED
printf(" Entered number is: %d\n", StringToInt(str));
#else
printf(" Entered number is: %u\n", StringToUnsignedInt(str) );
#endif
if( detect_buffer_overrun )
printf( "Input buffer overflow!\n" );
return 0;
}
You're correct that you should never use gets. If you want to use fgets, you can simply overwrite the newline.
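char *result = fgets(str, sizeof(str), stdin);
// only measure the string if fgets actually stored something
size_t len = (result != NULL) ? strlen(str) : 0;
if(len > 0 && str[len - 1] == '\n')
{
    str[len - 1] = '\0';
}
else
{
    // handle error (read failure, or the line did not fit in the buffer)
}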
This does assume there are no embedded null characters. Another option is POSIX getline:
char *line = NULL;
size_t len = 0;
ssize_t count = getline(&line, &len, stdin);
if(count >= 1 && line[count - 1] == '\n')
{
line[count - 1] = '\0';
}
else
{
// Handle error
}
The advantage of getline is that it does allocation and reallocation for you, it handles possible embedded null characters, and it returns the count so you don't have to waste time with strlen. Note that you can't use an array with getline. The pointer must be NULL or free-able.
I'm not sure what issue you're having with scanf.
Never use gets(); it can lead to unpredictable overflows. If your string array is of size 1000 and I enter 1001 characters, I can overflow the buffer in your program.
Try using fgets() with this modified version of your CharToInt():
int CharToInt(const char *s)
{
int i, result, temp;
result = 0;
i = 0;
while(*(s+i) != '\0')
{
if (isdigit(*(s+i)))
{
temp = *(s+i) & 15;
result = (temp + result) * 10;
}
i++;
}
return result / 10;
}
It essentially validates the input digits and ignores anything else. This is very crude so modify it and salt to taste.
So I am not much of a programmer, but let me try to answer your question about scanf(). I think scanf is pretty fine and I use it for almost everything without issues, but the call needs a little care. It should be:
char str[MAX];
printf("Enter some text: ");
scanf("%99s", str);
The "&" in front of a variable tells scanf where to store the scanned value, and it is required for scalar variables such as an int. An array name like str already converts to a pointer to its first element, so it is passed without the "&".
The width in "%99s" (one less than MAX, which is 100 here) stops scanf from writing past the end of the buffer. Don't rely on fflush(stdin); to discard leftover input: the C standard leaves fflush on an input stream undefined, although some compilers support it as an extension.
And the difference is that scanf(); with %s only scans until the first whitespace, while gets(); and fgets(); read the whole line (fgets also stores the newline).
I'm wanting to read hex numbers from a text file into an unsigned integer so that I can execute Machine instructions. It's just a simulation type thing that looks inside the text file and according to the values and its corresponding instruction outputs the new values in the registers.
For example, the instructions would be:
1RXY -> Save register R with value in memory address XY
2RXY -> Save register R with value XY
BRXY -> Jump to register R if xy is this and that etc..
ARXY -> AND register R with value at memory address XY
The text file contains something like this, each on a new line (in hexadecimal):
120F
B007
290B
My problem is copying each individual instruction into an unsigned integer...how do I do this?
#include <stdio.h>
int main(){
FILE *f;
unsigned int num[80];
f=fopen("values.txt","r");
if (f==NULL){
printf("file doesnt exist?!");
}
int i=0;
while (fscanf(f,"%x",num[i]) != EOF){
fscanf(f,"%x",num[i]);
i++;
}
fclose(f);
printf("%x",num[0]);
}
You're on the right track. Here are the problems I saw:
You need to exit if fopen() returns NULL - you're printing an error message but then continuing.
Your loop should terminate if i >= 80, so you don't read more integers than you have space for.
You need to pass the address of num[i], not the value, to fscanf.
You're calling fscanf() twice in the loop, which means you're throwing away half of your values without storing them.
Here's what it looks like with those issues fixed:
#include <stdio.h>
int main() {
FILE *f;
unsigned int num[80];
int i=0;
int rv;
int num_values;
f=fopen("values.txt","r");
if (f==NULL){
printf("file doesnt exist?!\n");
return 1;
}
while (i < 80) {
rv = fscanf(f, "%x", &num[i]);
if (rv != 1)
break;
i++;
}
fclose(f);
num_values = i;
if (i >= 80)
{
printf("Warning: Stopped reading input due to input too long.\n");
}
else if (rv != EOF)
{
printf("Warning: Stopped reading input due to bad value.\n");
}
else
{
printf("Reached end of input.\n");
}
printf("Successfully read %d values:\n", num_values);
for (i = 0; i < num_values; i++)
{
printf("\t%x\n", num[i]);
}
return 0;
}
You can also use the function strtol(). If you use a base of 16 it will convert your hex string value to an int/long.
errno = 0;
my_int = strtol(my_str, NULL, 16);
/* check errno */
Edit: One other note: various static analysis tools may flag things like atoi() and scanf() as unsafe. atoi() is best avoided because, unlike strtol(), it cannot report errors. scanf(), on the other hand, can cause a buffer overflow of sorts since it does not check the types of the pointers passed to it. For instance, you could give scanf a pointer to a short where the conversion actually stores a long... and boom.
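The strtol() approach above, with the errno check filled in, might look like the sketch below. parse_hex is a hypothetical helper name, and the decision to accept a single trailing newline (as left behind by fgets) is an assumption made for this example.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parses text as a hexadecimal number such as "120F".
 * Returns 1 and stores the value in *out on success. Returns 0 if text has
 * no hex digits, has trailing junk, is negative, or is out of range. */
int parse_hex(const char *text, unsigned int *out)
{
    char *end;
    long value;

    errno = 0;
    value = strtol(text, &end, 16);

    if (end == text)                      /* nothing was converted */
        return 0;
    if (errno == ERANGE)                  /* does not fit in a long */
        return 0;
    if (*end != '\0' && *end != '\n')     /* junk after the number */
        return 0;
    if (value < 0 || (unsigned long)value > UINT_MAX)
        return 0;

    *out = (unsigned int)value;
    return 1;
}
```

Passing the address of an end pointer instead of NULL is what makes it possible to tell "12G4" apart from a valid instruction.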
You're reading two numbers into each element of your array, so you lose half of them as you overwrite them. Try using just
while (i < 80 && fscanf(f,"%x",&num[i]) == 1)
i++;
for your loop
edit
you're also missing the '&' to get the address of the array element, so you're passing a random garbage pointer to scanf and probably crashing. The -Wall option is your friend.
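In this case, all of your input is upper case hex while %x is usually described as lower case hex.
For the scanf family this makes no difference: %x and %X are equivalent and both accept upper and lower case digits, so the format does not need to change on that account.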
Do you want each of the lines (each 4 characters long) separated in 4 different array elements? If you do, I'd try this:
/* read the line */
/* fgets(buf, ...) */
/* check it's correct, mind the '\n' */
/* ... strlen ... isxdigit ... */
/* put each separate input digit in a separate array member */
num[i++] = convert_xdigit_to_int(buf[j++]);
Where the function convert_xdigit_to_int() simply converts '0' (the character) to 0 (an int), '1' to 1, '2' to 2, ... '9' to 9, 'a' or 'A' to 10, ...
Of course that pseudo-code is inside a loop that executes until the file runs out or the array gets filled. Maybe putting the fgets() as the condition for a while(...)
while(/*there is space in the array && */ fgets(...)) {
}
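A possible body for the convert_xdigit_to_int() function described above is sketched here. The lookup-table approach and the -1 return for characters that are not hex digits are choices made for this illustration, not something the pseudo-code prescribes.

```c
#include <ctype.h>
#include <string.h>

/* Converts one hex digit character to its value: '0' -> 0 ... '9' -> 9,
 * 'a' or 'A' -> 10 ... 'f' or 'F' -> 15.
 * Returns -1 (an assumption of this sketch) for any other character. */
int convert_xdigit_to_int(char ch)
{
    static const char digits[] = "0123456789abcdef";
    const char *p;

    if (ch == '\0')          /* strchr would otherwise match the terminator */
        return -1;

    p = strchr(digits, tolower((unsigned char)ch));
    return p ? (int)(p - digits) : -1;
}
```

Looking the character up in a table avoids assuming that the letters 'a' to 'f' have consecutive character codes.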