My program is supposed to read from stdin and hand over the input to the system - in case the input does not equal "exit".
That works perfectly unless the second input is longer than the first.
For example if the first input is "hello" and the second is "hellohello", the input gets split up into "hello" and "ello".
I guess the problem is that the buffer s is not properly cleared while looping. Therefore I used memset() but unfortunately I did not get the results I was looking for.
Can anyone see the mistake?
Thanks a lot!
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX 1024
int main(){
char *s = (char *)malloc(MAX);
char *temp = NULL;
while(fgets(s, (sizeof(s)-1), stdin)!= NULL){
temp = s+(strlen(s)-1);
*temp = '\0';
if (strcmp(s,"exit")==0){
break;
} else {
system(s);
}
memset(s, 0, MAX);
}
free(s);
return 0;
}
The mistake here is (sizeof(s)-1). This does not give the size of the allocated buffer; it gives the size of a char * (typically 4 or 8 bytes). The size of your buffer is MAX. memset() has nothing to do with the problem, so remove it. You also do not need the -1: fgets() always appends a terminating null byte, even if the buffer fills up.
Also these two lines
temp = s+(strlen(s)-1);
*temp = '\0';
are not needed for terminating the string (they only strip the trailing newline, which the strcmp() with "exit" relies on), because
"fgets() reads in at most one less than size characters from stream and stores them into the buffer pointed to by s. Reading stops after an EOF or a newline. If a newline is read, it is stored into the buffer. A terminating null byte ('\0') is stored after the last character in the buffer."
(from "man fgets", google for it)
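A minimal sketch of the read step with the buffer size fixed, assuming the trailing newline should still be stripped so that strcmp(s, "exit") can match. The helper name read_command is hypothetical, and it takes a FILE * only so that it can be tried on any stream:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line from `in` into buf (capacity `size`)
 * and strip the trailing newline, if any.
 * Returns 1 on success, 0 on EOF or read error. */
int read_command(char *buf, size_t size, FILE *in)
{
    if (fgets(buf, (int)size, in) == NULL)   /* pass the real capacity */
        return 0;
    buf[strcspn(buf, "\n")] = '\0';          /* safe even if no '\n' was read */
    return 1;
}
```

In the original main() this would be called as read_command(s, MAX, stdin) in the loop condition, with the strcmp()/system() logic unchanged and the memset() dropped.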
Related
I am trying to get a sample (shell script) program on how to write to a file:
#include <unistd.h>
#include <stdio.h>
#include <string.h>
int main(int argc, char **argv){
char buff[1024];
size_t len, idx;
ssize_t wcnt;
for (;;){
if (fgets(buff,sizeof(buff),stdin) == NULL)
return 0;
idx = 0;
len = strlen(buff);
do {
wcnt = write(1,buff + idx, len - idx);
if (wcnt == -1){ /* error */
perror("write");
return 1;
}
idx += wcnt;
} while (idx < len);
}
}
So my problem is this: Let's say I want to write a file of 20000 bytes so every time I can only write (at most) 1024 (buffer size).
Let's say that in my first attempt everything is going perfect and fgets() reads 1024 bytes and in my first do while I write 1024 bytes.
Then, since we wrote "len" bytes we exit the do-while loop.
So now what?? The buffer is full from our previous reading. It seems to me that for some reason it is implied that fgets() will now continue reading from the point it reached in in-file the last time. (buf[1024] here).
How come, fgets() knows where it stopped reading in the in-file?
I checked the man page :
fgets() reads in at most one less than size characters from stream and stores them into the buffer pointed to by s. Reading stops after an EOF or a newline. If a newline is read, it is stored in the buffer. A terminating null byte ('\0') is stored after the last character in the buffer.
fgets() returns s on success, and NULL on error or when the end of file occurs while no characters have been read.
So from that, I get that it returns a pointer to the first element of buf, which is always buf[0],
that's why I am confused.
When using a FILE stream, the stream object contains information about the position in the file (among other things). fgets and other functions like fread or fwrite merely use this information and update it when an operation is performed.
So, whenever fgets reads from the stream, the stream will be updated to maintain the position, so that the next operation starts off where the previous ended.
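One way to see the position moving is to ask the stream for it with ftell() after each read. This is only a sketch; the helper name position_after_reads and the 8-byte buffer are made up for illustration:

```c
#include <stdio.h>

/* Hypothetical helper: call fgets() `reads` times with a small buffer and
 * return the stream position reported by ftell() afterwards, or -1L if a
 * read fails (for example at end of file). */
long position_after_reads(FILE *fp, int reads)
{
    char buf[8];                       /* fgets stores at most 7 chars + '\0' */

    for (int i = 0; i < reads; i++) {
        if (fgets(buf, sizeof buf, fp) == NULL)
            return -1L;
    }
    return ftell(fp);                  /* the position lives in the FILE object */
}
```

On a binary stream holding 20 bytes and no newline, three successive calls with reads set to 1 report 7, 14 and 20: the buffer always starts at buf[0], but each fgets() call resumes where the previous one stopped in the file.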
I want to get a string input from a user in C.
I know how to use scanf to get string inputs. I know how to use malloc and realloc to allocate space for a variable.
The problem is when the user enters a string, I don't necessarily know what size that will be that I will need to reserve space for.
For instance if they write James I'd need to malloc((sizeof(char) * 5) but they might have written Bond in which case I would have only had to malloc(sizeof(char) * 4).
Is it just the case that I should be sure to allocate enough space beforehand (e.g. malloc(sizeof(char) * 100)).
And then does scanf do any realloc trimming under the hood or is that a memory leak for me to fix?
You have two misunderstandings you are struggling with. First scanf() does not modify the storage in any way (omitting for purposes of discussion the non-standard "%a", later renamed "%m" specifiers). Second, you are forgetting to provide length + 1 characters of storage to ensure room for the null-terminating character.
In your statement "For instance if they write "James" I'd need to malloc((sizeof(char)*5)" - no, no you would need malloc (6) to provide room for James\0. Note also that sizeof (char) is defined as 1 and should be omitted.
As to how to read a string, you generally want to avoid scanf() and even when using scanf() unless you are reading whitespace separated words, you don't want to use the "%s" conversion specifier which stops reading as soon as it encounters whitespace making it impossible to read "James Bond". Further, you have the issue of what is left unread in stdin after your call to scanf().
When reading using "%s" the '\n' character is left in stdin unread. This is a pitfall that will bite you on your next attempted read if using an input function that does not ignore leading whitespace (that is any character-oriented or line-oriented input function). These pitfalls, along with a host of others associated with scanf() use, are why new C programmers are encouraged to use fgets() to read user input.
With a sufficiently sized buffer (and if not, with a simple loop) fgets() will consume an entire line of input each time it is called, ensuring there is nothing left unread in that line. The only caveat is that fgets() reads and includes the trailing '\n' in the buffer it fills. You simply trim the trailing newline with a call to strcspn() (which can also provide you with the length of the string at the same time)
As mentioned above, one approach to solve the "I don't know how many characters I have?" problem is to use a fixed-size buffer (character array) and then repeatedly call fgets() until the '\n' is found in the array. That way you can allocate final storage for the line by determining the number of the character read into the fixed-size buffer. It doesn't matter if your fixed-size buffer is 10 and you have 100 characters to read, you simply call fgets() in a loop until the number of characters you read is less than a full fixed-size buffer's worth.
Now ideally, you would size your temporary fixed-size buffer so that your input fits the first time eliminating the need to loop and reallocate, but if the cat steps on the keyboard -- you are covered.
Let's look at an example, similar in function to the CS50 get_string() function. It allows the caller to provide a prompt for the user, then reads the input and allocates storage for the result, returning a pointer to the allocated block containing the string. The caller is then responsible for calling free() on it when done.
#define MAXC 1024 /* if you need a constant, #define one (or more) */
char *getstr (const char *prompt)
{
char tmp[MAXC], *s = NULL; /* fixed size buf, ptr to allocate */
size_t n = 0, used = 0; /* length and total length */
if (prompt) /* prompt if not NULL */
fputs (prompt, stdout);
while (1) { /* loop continually */
if (!fgets (tmp, sizeof tmp, stdin)) /* read into tmp */
return s;
tmp[(n = strcspn (tmp, "\n"))] = 0; /* trim \n, save length */
if (!n) /* if empty-str, break */
break;
void *tmpptr = realloc (s, used + n + 1); /* always realloc to temp pointer */
if (!tmpptr) { /* validate every allocation */
perror ("realloc-getstr()");
return s;
}
s = tmpptr; /* assign new block to s */
memcpy (s + used, tmp, n + 1); /* copy tmp to s with \0 */
used += n; /* update total length */
if (n + 1 < sizeof tmp) /* if tmp not full, break */
break;
}
return s; /* return allocated string, caller responsible for calling free */
}
Above, a fixed size buffer of MAXC characters is used to read input from the user. A continual loop calls fgets() to read the input into the buffer tmp. strcspn() is called as the index to tmp to find the number of characters that does not include the '\n' character (the length of the input without the '\n') and nul-terminates the string at that length overwriting the '\n' character with the nul-terminating character '\0' (which is just plain old ASCII 0). The length is saved in n. If the line is empty after the removal of the '\n' there is nothing more to do and the function returns whatever is in s at that time.
If characters are present, a temporary pointer is used to realloc() storage for the new characters (+1). After validating that realloc() succeeded, the new characters are copied to the end of the storage and the total length of characters in the buffer is saved in used, which is used as an offset from the beginning of the string. That repeats until you run out of characters to read, and the allocated block containing the string is returned (if no characters were input, NULL is returned).
(note: you may also want to pass a pointer to size_t as a parameter that can be updated to the final length before return to avoid having to calculate the length of the returned string again -- that is left to you)
Before looking at an example, let's add debug output to the function so it tells us how many characters were allocated in total. Just add the printf() below before the return, e.g.
}
printf (" allocated: %zu\n", used?used+1:used); /* (debug output of alloc size) */
return s; /* return allocated string, caller responsible for calling free */
}
A short example that loops reading input until Enter is pressed on an empty line causing the program to exit after freeing all memory:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/* insert getstr() function HERE */
int main (void) {
for (;;) {
char *s = getstr ("enter str: ");
if (!s)
break;
puts (s);
putchar ('\n');
free (s);
}
}
Example Use/Output
With MAXC at 1024 there isn't a chance of needing to loop unless the cat steps on the keyboard, so all input is read into tmp and then storage is allocated to exactly hold each input:
$ ./bin/fgetsstr
enter str: a
allocated: 2
a
enter str: ab
allocated: 3
ab
enter str: abc
allocated: 4
abc
enter str: 123456789
allocated: 10
123456789
enter str:
allocated: 0
Setting MAXC at 2 or 10 is fine as well. The only thing that changes is the number of times you loop reallocating storage and copying the contents of the temporary buffer to your final storage. E.g. with MAXC at 10, the user wouldn't know the difference in:
$ ./bin/fgetsstr
enter str: 12345678
allocated: 9
12345678
enter str: 123456789
allocated: 10
123456789
enter str: 1234567890
allocated: 11
1234567890
enter str: 12345678901234567890
allocated: 21
12345678901234567890
enter str:
allocated: 0
Above you have forced the while (1) loop to execute at least twice for each string of 9 characters or more (fgets() stores at most MAXC - 1 characters per call). So you want to set MAXC to some reasonable size to avoid looping, and a 1K buffer is fine considering you will have at minimum a 1M function stack on most x86 or x86_64 computers. You may want to reduce the size if you are programming for a micro-controller with limited storage.
While you could allocate for tmp as well, there really is no need, and using a fixed-size buffer is about as simple as it gets for sticking with standard C. If you have POSIX available, then getline() already provides auto-allocation for any size input you have. That is another good alternative to fgets() -- but POSIX is not standard C (though it is widely available).
Another good alternative is simply looping with getchar() reading a character at a time until the '\n' or EOF is reached. Here you just allocate some initial size for s say 2 or 8 and keep track of the number of characters used and then double the size of the allocation when used == allocated and keep going. You would want to allocate blocks of storage as you would not want to realloc() for every character added (we will omit the discussion of why that is less true today with a mmaped malloc() than it was in the past)
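A rough sketch of that character-at-a-time approach, assuming an initial allocation of 8 bytes. The name read_line_doubling is hypothetical, and fgetc() on a FILE * stands in for getchar() so the function works on any stream. Unlike getstr() above, it returns an empty string (not NULL) for an empty line:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read one line from fp a character at a time,
 * doubling the allocation whenever it fills up. Returns a malloc'd string
 * without the '\n' (caller frees), or NULL on EOF with nothing read or on
 * allocation failure. */
char *read_line_doubling(FILE *fp)
{
    size_t allocated = 8, used = 0;    /* assumed initial size */
    char *s = malloc(allocated);
    int c;

    if (!s)
        return NULL;

    while ((c = fgetc(fp)) != EOF && c != '\n') {
        if (used + 1 == allocated) {   /* keep one byte free for '\0' */
            char *tmp = realloc(s, allocated * 2);
            if (!tmp) {
                free(s);
                return NULL;
            }
            s = tmp;
            allocated *= 2;
        }
        s[used++] = (char)c;
    }

    if (c == EOF && used == 0) {       /* end of input, nothing to return */
        free(s);
        return NULL;
    }
    s[used] = '\0';
    return s;
}
```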
Look things over and let me know if you have further questions.
I personally use the malloc approach, but you need to mind one more thing: you can also limit the number of characters accepted by %s in scanf to match your buffer. The field width does not count the terminating null byte, so a 100-byte buffer takes a width of 99.
char *string = (char*) malloc (sizeof (char) * 100);
scanf ("%99s", string);
You can then reallocate the memory after getting the string size by using the string function strlen and then adding 1 for the terminator.
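Put together, the allocate, read, then shrink idea might look like the sketch below. It assumes a 100-byte initial buffer, so the field width is 99 to leave room for the terminator. The name read_word is hypothetical, and fscanf() on a FILE * is used in place of scanf() only to make it easy to try on any stream:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: read one whitespace-delimited word of at most 99
 * characters and shrink the allocation to fit. Returns a malloc'd string
 * (caller frees), or NULL on EOF or allocation failure. */
char *read_word(FILE *in)
{
    char *s = malloc(100);
    if (!s)
        return NULL;

    if (fscanf(in, "%99s", s) != 1) {  /* width 99 + 1 byte for '\0' */
        free(s);
        return NULL;
    }

    char *fit = realloc(s, strlen(s) + 1);
    return fit ? fit : s;              /* if shrinking fails, s is still valid */
}
```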
There are multiple approaches to this problem:
use an arbitrary maximum length, read the input into a local array and allocate memory based on actual input:
#include <stdio.h>
#include <string.h>
char *readstr(void) {
char buf[100];
if (scanf("%99s", buf) == 1)
return strdup(buf);
else
return NULL;
}
use non-standard library extensions, if supported and if allowed. For example the GNU libc has an m modifier for exactly this purpose:
#include <stdio.h>
char *readstr(void) {
char *p;
if (scanf("%ms", &p) == 1)
return p;
else
return NULL;
}
read input one byte at a time and reallocate the destination array on demand. Here is a simplistic approach:
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
char *readstr(void) {
char *p = NULL;
size_t i = 0;
int c;
while ((c = getchar()) != EOF) {
if (isspace(c)) {
if (i > 0) {
ungetc(c, stdin);
break;
}
} else {
char *newp = realloc(p, i + 2);
if (newp == NULL) {
free(p);
return NULL;
}
p = newp;
p[i++] = c;
p[i] = '\0';
}
}
return p;
}
So far I have been using if statements to check the size of the user-inputted strings. However, they don't seem to be very useful: no matter the size of the input, the while loop ends and it returns the input to the main function, which then just outputs it.
I don't want the user to enter anything greater than 10, but when they do, the additional characters just overflow and are outputted on a newline. The whole point of these if statements is to stop that from happening, but I haven't been having much luck.
#include <stdio.h>
#include <string.h>
#define SIZE 10
char *readLine(char *buf, size_t sz) {
int true = 1;
while(true == 1) {
printf("> ");
fgets(buf, sz, stdin);
buf[strcspn(buf, "\n")] = 0;
if(strlen(buf) < 2 || strlen(buf) > sz) {
printf("Invalid string size\n");
continue;
}
if(strlen(buf) > 2 && strlen(buf) < sz) {
true = 0;
}
}
return buf;
}
int main(int argc, char **argv) {
char buffer[SIZE];
while(1) {
char *input = readLine(buffer, SIZE);
printf("%s\n", input);
}
}
Any help towards preventing buffer overflow would be much appreciated.
When the user enters in a string longer than sz, your program processes the first sz characters, but then when it gets back to the fgets call again, stdin already has input (the rest of the characters from the user's first input). Your program then grabs another up to sz characters to process and so on.
The call to strcspn is also deceiving, because if the "\n" is not among the characters you grab, then it just returns the length of the string (sz-1 for a full buffer), even though there's no newline.
After you've taken input from stdin, you can do a check to see if the last character is a '\n' character. If it's not, it means that the input goes past your allowed size and the rest of stdin needs to be flushed. One way to do that is below. To be clear, you'd do this only when there's been more characters than allowed entered in, or it could cause an infinite loop.
while((c = getchar()) != '\n' && c != EOF)
{}
However, trying not to restructure your code too much how it is, we'll need to know if your buffer contains the newline before you set it to 0. It will be at the end if it exists, so you can use the following to check.
int containsNewline = buf[strlen(buf)-1] == '\n';
Also be careful with your size checks, you currently don't handle the case for a strlen of 2 or sz. I would also never use identifier names like "true", which would be a possible value for a bool variable. It makes things very confusing.
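Here is one way those pieces could fit together, as a sketch. The name read_line_checked and its return codes are assumptions for illustration, and a line that fills the buffer exactly (followed directly by the newline) is treated as fitting:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line into buf (capacity sz).
 * Returns  1 if the whole line fit (newline stripped),
 *          0 if the line was too long (the rest of it is discarded),
 *         -1 on EOF or read error. */
int read_line_checked(char *buf, size_t sz, FILE *in)
{
    if (fgets(buf, (int)sz, in) == NULL)
        return -1;

    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';           /* newline present: the line fit */
        return 1;
    }

    /* No newline stored. Look at the next character to find out why. */
    int c = fgetc(in);
    if (c == EOF || c == '\n')
        return 1;                      /* input ended, or line filled buf exactly */

    while (c != '\n' && c != EOF)
        c = fgetc(in);                 /* flush the rest of the long line */
    return 0;
}
```

The caller can then print "Invalid string size" and prompt again whenever 0 comes back, without the leftover characters showing up as a second line.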
If the input line is longer than 9 characters, your fgets() reads only the first 9 characters into buf (with a size of 10 it reserves one byte for the terminating null). Because these characters do not contain the trailing \n, strcspn(buf, "\n") returns 9, so setting buf[9] to 0 stays inside buf[] (max index is 9) and merely overwrites the terminator. The rest of the line stays in the input and is picked up by the next fgets() call.
Additionally, never use true or false as the name of a variable; it makes the code confusing. Use something like 'ok' instead.
Finally: please clarify what output is expected when the input contains a string longer than 10 characters. Should it be truncated?
I am writing a program to write my html files rapidly. And when I came to write the content of my page I got a problem.
#include<stdio.h>
int main()
{
int track;
int question_no;
printf("\nHow many questions?\t");
scanf("%d",&question_no);
char question[question_no][100];
for(track=1;track<=question_no;track++)
{
printf("\n<div class=\"question\">%d. ",track);
printf("\nQuestion number %d.\t",track);
fgets(question[track-1],sizeof(question[track-1]),stdin);
printf("\n\n\tQ%d. %s </div>",track,question[track-1]);
}
}
In this program I am writing some questions and their answers (in html file). When I test run this program I input the value of question_no to 3. But when I enter my first question it doesn't go in question[0] and consequently the first question doesn't output. The rest of the questions input without issue.
I searched some questions on stackoverflow and found that fgets() looks for last \0 character and that \0 stops it.
I also found that I should use buffer to input well through fgets() so I used: setvbuf and setbuf but that also didn't work (I may have coded that wrong). I also used fflush(stdin) after my first and last (as well) scanf statement to remove any \0 character from stdin but that also didn't work.
Is there any way to accept the first input by fgets()?
I am using stdin and stdout for now. I am not accessing, reading or writing any file.
Use fgets for the first prompt too. You should also malloc your array as you don't know how long it is going to be at compile time.
#include <stdlib.h>
#include <stdio.h>
#define BUFSIZE 8
int main()
{
int track, i;
int question_no;
char buffer[BUFSIZE], **question;
printf("\nHow many questions?\t");
fgets(buffer, BUFSIZE, stdin);
question_no = strtol(buffer, NULL, 10);
question = malloc(question_no * sizeof (char*));
if (question == NULL) {
return EXIT_FAILURE;
}
for (i = 0; i < question_no; ++i) {
question[i] = malloc(100 * sizeof (char));
if (question[i] == NULL) {
return EXIT_FAILURE;
}
}
for(track=1;track<=question_no;track++)
{
printf("\n<div class=\"question\">%d. ",track);
printf("\nQuestion number %d.\t",track);
fgets(question[track-1],100,stdin);
printf("\n\n\tQ%d. %s </div>",track,question[track-1]);
}
for (i = 0; i < question_no; ++i) free(question[i]);
free(question);
return EXIT_SUCCESS;
}
2D arrays in C
A 2D array of type can be represented by an array of pointers to type, or equivalently type** (pointer to pointer to type). This requires two steps.
Using char **question as an exemplar:
The first step is to allocate an array of char*. malloc returns a pointer to the start of the memory it has allocated, or NULL if it has failed. So check whether question is NULL.
Second is to make each of these char* point to their own array of char. So the for loop allocates an array the size of 100 chars to each element of question. Again, each of these mallocs could return NULL so you should check for that.
Every malloc deserves a free so you should perform the process in reverse when you have finished using the memory you have allocated.
malloc reference
strtol
long int strtol(const char *str, char **endptr, int base);
strtol returns a long int (which in the code above is implicitly converted to an int). It splits str into three parts:
Any white-space preceding the numerical content of the string
The part it recognises as numerical, which it will try to convert
The rest of the string
If endptr is not NULL, it will point to the 3rd part, so you know where strtol finished. You could use it like this:
#include <stdio.h>
#include <stdlib.h>
int main()
{
char * endptr = NULL, *str = " 123some more stuff";
int number = strtol(str, &endptr, 10);
printf("number interpreted as %d\n"
"rest of string: %s\n", number, endptr);
return EXIT_SUCCESS;
}
output:
number interpreted as 123
rest of string: some more stuff
strtol reference
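Since strtol returns 0 when nothing could be converted, endptr is also the usual way to tell "0" apart from invalid input. A small sketch of a checked conversion follows; the name parse_int and its return convention are made up for illustration:

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: convert the leading number in str to an int.
 * Returns 1 and stores the value in *out on success; returns 0 if no
 * digits were found or the value does not fit in an int. */
int parse_int(const char *str, int *out)
{
    char *endptr = NULL;

    errno = 0;
    long value = strtol(str, &endptr, 10);

    if (endptr == str)                 /* part 2 was empty: no digits */
        return 0;
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return 0;                      /* out of range for long or int */

    *out = (int)value;
    return 1;
}
```

In the question program this would let you reject a non-numeric answer to "How many questions?" before calling malloc.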
This is because of the newline character left in the input stream by the previous scanf(). Note that fgets() stops when it encounters a newline too.
fgets() reads in at most one less than size characters from stream and
stores them into the buffer pointed to by s. Reading stops after an
EOF or a newline. If a newline is read, it is stored into the
buffer
Don't mix fgets() and scanf(). A trivial solution is to use getchar() right after scanf() in order to consume the newline left in the input stream by scanf().
As per the documentation,
The fgets() function shall read bytes from stream into the array
pointed to by s, until n-1 bytes are read, or a < newline > is read and
transferred to s, or an end-of-file condition is encountered
In case of scanf("%d",&question_no); a newline is left in the buffer and that is read by
fgets(question[track-1],sizeof(question[track-1]),stdin);
and it exits.
In order to flush the buffer you should do,
while((c = getchar()) != '\n' && c != EOF)
/* discard */ ;
to clear the extra characters in the buffer
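As a sketch of how the discard loop sits between the two reads (the helper name read_count_then_line is hypothetical, and fscanf() on a FILE * is used instead of scanf() so it can be tried on any stream):

```c
#include <stdio.h>

/* Hypothetical helper: read an int with fscanf(), discard the rest of that
 * line (including the '\n'), then read the following line with fgets().
 * Returns 1 on success, 0 otherwise. */
int read_count_then_line(FILE *in, int *count, char *line, size_t sz)
{
    int c;                             /* int, not char, so EOF is representable */

    if (fscanf(in, "%d", count) != 1)
        return 0;

    while ((c = fgetc(in)) != '\n' && c != EOF)
        ;                              /* discard up to and including '\n' */

    return fgets(line, (int)sz, in) != NULL;
}
```

Without the loop, the fgets() call would return immediately with just the leftover "\n", which is the empty first question seen in the program above.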
I'm writing a program that encrypts a file by adding 10 to each character. Somehow a portion of the program's working directory is being printed to the file, and I have no idea why.
#include <stdio.h>
int main(void){
FILE *fp;
fp=fopen("tester.csv","r+");
Encrypt(fp);
fclose(fp);
}
int Encrypt(FILE *fp){
int offset=10;
Shift(fp, offset);
}
int Decrypt(FILE *fp){
int offset= -10;
Shift(fp, offset);
}
int Shift(FILE *fp, int offset){
char line[50],tmp[50], character;
long position;
int i;
position = ftell(fp);
while(fgets(line,50,fp) != NULL){
for(i=0;i<50;i++){
character = line[i];
character = (offset+character)%256;
tmp[i] = character;
if(character=='\n' || character == 0){break;}
}
fseek(fp,position,SEEK_SET);
fputs(tmp,fp);
position = ftell(fp);
fseek(stdin,0,SEEK_END);
}
}
the file originally reads
this, is, a, test
i, hope, it, works!
after the program is run:
~rs}6*s}6*k6*~o}~
/alexio/D~6*y|u}+
k6*~o}~
/alexio/D
where users/alexio/Desktop is part of the path. How does this happen???
Because you "encode" the string, it won't be null terminated (that's your case), or it will contain a null even before the end of the string (character+offset % 256 == 0). Later you try to write it as a string, which overruns your buffer, and outputs part of your program arguments.
Use fread and fwrite.
The line
fputs(tmp,fp);
writes out a probably non-null terminated string. So it continues to copy memory to the file until it finds a null.
You need to add a null to the end of 'tmp' in the case where the loop breaks on a newline.
A number of things:
You're encoding all 50 chars from your read buffer, regardless of how many were actually read with fgets(). Recall that fgets() reads a line, not an entire buffer (unless the line is longer than a buffer, and yours is not). Anything past the string length from your line file input is stack garbage.
You're then dumping all that extra garbage data, and beyond, by not terminating your tmp[] string before writing with fputs(), which you should not be using anyway. Yet more stack garbage.
Solution. Use fread() and fwrite() for this encoding. There is no reason to be using string functions whatsoever. When you write your decoder you'll thank yourself for using fread() and fwrite()
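A sketch of what Shift() could look like with fread() and fwrite(), assuming the file is opened for update in binary mode ("rb+"). The name shift_file is hypothetical. Because the byte count comes from fread(), no terminator is involved and a 0 byte in the data is harmless:

```c
#include <stdio.h>

/* Hypothetical helper: add `offset` (mod 256) to every byte of a file
 * opened for update in binary mode. Returns 0 on success, -1 on an I/O
 * error. */
int shift_file(FILE *fp, int offset)
{
    unsigned char buf[50];
    size_t n;

    for (;;) {
        long pos = ftell(fp);                  /* start of this chunk */
        if (pos < 0)
            return -1;

        n = fread(buf, 1, sizeof buf, fp);
        if (n == 0)
            break;                             /* end of file or error */

        for (size_t i = 0; i < n; i++)         /* only the n bytes actually read */
            buf[i] = (unsigned char)(buf[i] + offset);

        if (fseek(fp, pos, SEEK_SET) != 0)     /* go back over the chunk */
            return -1;
        if (fwrite(buf, 1, n, fp) != n)
            return -1;
        if (fseek(fp, pos + (long)n, SEEK_SET) != 0)
            return -1;                         /* reposition before reading again */
    }
    return ferror(fp) ? -1 : 0;
}
```

Calling shift_file(fp, 10) and later shift_file(fp, -10) on the same file restores the original bytes, which covers both Encrypt() and Decrypt().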