reading an unbounded line from the console with scanf - c

I need to read a finite yet unbounded-in-length string.
We learned only about scanf so I guess I cannot use fgets.
Anyway, I ran this code on an input longer than 5 characters.
char arr[5];
scanf("%s", arr);
char *s = arr;
while (*s != '\0')
printf("%c", *s++);
scanf keeps scanning and writes the overflowing part past the end of the array, but it seems like a hack. Is that good practice? If not, how should I read the input?
Note: We have learned about the alloc functions family.

Buffer overflows are a plague, one of the most famous and yet most elusive kinds of bug. So you should definitely not rely on them.
Since you've learned about malloc() and friends, I suppose you're expected to make use of them.
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
// Array growing step size
#define CHUNK_SIZE 8
int main(void) {
size_t arrSize = CHUNK_SIZE;
char *arr = malloc(arrSize);
if(!arr) {
fprintf(stderr, "Initial allocation failed.\n");
goto failure;
}
// One past the end of the array
// (next insertion position)
size_t arrEnd = 0u;
for(char c = '\0'; c != '\n';) {
if(scanf("%c", &c) != 1) {
fprintf(stderr, "Reading character %zu failed.\n", arrEnd);
goto failure;
}
// No more room, grow the array
// (-1) takes into account the
// nul terminator.
if(arrEnd == arrSize - 1) {
arrSize += CHUNK_SIZE;
char *newArr = realloc(arr, arrSize);
if(!newArr) {
fprintf(stderr, "Reallocation failed.\n");
goto failure;
}
arr = newArr;
// Debug output
arr[arrEnd] = '\0';
printf("> %s\n", arr);
// Debug output
}
// Append the character and
// advance the end index
arr[arrEnd++] = c;
}
// Nul-terminate the array
arr[arrEnd++] = '\0';
// Done !
printf("%s", arr);
free(arr);
return 0;
failure:
free(arr);
return 1;
}

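If you are using gcc with glibc, %ms (POSIX) or the older glibc-specific %as can be used for this purpose; neither is part of the C standard. With these conversions scanf allocates the buffer itself and you free() it when you are done. Prefer %ms where it is available, because %a is also the C99 floating-point conversion.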
#include <stdio.h>
#include <stdlib.h>
int main(void){
char *s = NULL;
if (scanf("%ms", &s) != 1) return 1; // %ms: scanf allocates the buffer for us
printf("%s\n", s);
free(s);
return 0;
}

scanf is the wrong tool for this job (as for most jobs). If you are required to use this function, read one char at a time with scanf("%c", &c).
Your code misuses scanf(): you are passing arr, the address of an array of pointers to char instead of an array of char.
You should allocate an array of char with malloc, read characters into it and use realloc to extend it when it is too small, until you get a '\n' or EOF.
If you can rewind stdin, you can first compute the number of chars to read with scanf("%*s%n", &n);, then allocate the destination array to n+1 bytes, rewind(stdin); and re-read the string into the buffer with scanf("%s", buf);.
It is risky business, as some streams, such as console input, cannot be rewound.
For example:
fpos_t pos;
int n = 0;
char *buf;
fgetpos(stdin, &pos);
scanf("%*[^\n]%n", &n);
fsetpos(stdin, &pos);
buf = calloc(n+1, 1);
scanf("%[^\n]", buf);
Since you are supposed to know just some basic C, I doubt this solution is what is expected from you, but I cannot think of any other way to read an unbounded string in one step using standard C.
If you are using the glibc and may use extensions, you can do this:
scanf("%a[^\n]", &buf);
PS: all error checking and handling is purposely omitted here, but should be handled in your actual assignment.

Try limiting the number of characters accepted:
scanf("%4s", arr);
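A width limit on its own truncates long input, but it can be combined with realloc to read a line of any length in bounded pieces. The following is only a sketch under assumptions: read_line_bounded is a hypothetical helper name, it takes a FILE * (pass stdin for console input), and the 15-character chunk size is arbitrary.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: reads one line from `in` using only width-limited
   fscanf calls and grows the result with realloc.
   Returns NULL on allocation failure or when end of input is reached
   before anything could be read. The caller frees the result. */
char *read_line_bounded(FILE *in)
{
    char chunk[16];            /* "%15[^\n]" stores at most 15 chars + '\0' */
    char *line = malloc(1);
    size_t len = 0;
    int rc;

    if (line == NULL)
        return NULL;
    line[0] = '\0';

    /* Each call reads up to 15 non-newline characters. It returns 0 when
       the next character is a newline and EOF at end of input. */
    while ((rc = fscanf(in, "%15[^\n]", chunk)) == 1) {
        size_t n = strlen(chunk);
        char *tmp = realloc(line, len + n + 1);
        if (tmp == NULL) {
            free(line);
            return NULL;
        }
        line = tmp;
        memcpy(line + len, chunk, n + 1);   /* copies the '\0' as well */
        len += n;
    }

    if (rc == EOF && len == 0) {            /* nothing left to read */
        free(line);
        return NULL;
    }

    rc = fscanf(in, "%*c");                 /* discard the newline, if any */
    (void)rc;
    return line;
}
```

Calling read_line_bounded(stdin) in a loop until it returns NULL would then read the input line by line without ever writing past a buffer.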

The problem is simply that you're writing beyond the end of arr (past arr[4]). If you are "lucky", you are still writing to memory that belongs to the process, but if you go far enough you'll end up with a segmentation fault.

Consider
1) On many systems malloc() only reserves memory without using it. The underlying physical memory is not consumed until the memory is actually written to. See Why is malloc not "using up" the memory on my computer?
2) Unbounded user input is not realistic. Given that some upper bound should be employed to guard against hackers and nefarious users, simply use a large buffer.
If your system can work with these two ideas:
char *buf = malloc(1000000);
if (buf == NULL) return NULL; // Out_of_memory
if (scanf("%999999s", buf) != 1) { free(buf); return NULL; } //EOF
// Now right-size buffer
size_t size = strlen(buf) + 1;
char *tmp = realloc(buf, size);
if (tmp == NULL) { free(buf); return NULL; } // Out_of_memory
return tmp;
Fixed up per @chqrlie's comments.

Related

How can I read a paragraph as a single string in C? [duplicate]

I'm writing a C program that should read in an essay from a user. The essay is divided into multiple paragraphs.
I don't know how many lines or characters the essay will be, but I do know that it ends with a hash symbol (#). I want to use only as much memory as is necessary to hold the essay.
Here is what I have tried so far:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
main(){
int size;
char *essay;
printf("\n how many characters?\n");
scanf("%d", &size);
essay =(char *) malloc(size+1);
printf("Type the string\n");
scanf("%s",essay);
printf("%s",essay );
}
As I said before, I don't know (and don't want to ask) about the number of characters beforehand. How do I dynamically allocate memory to save space? (What is dynamic memory allocation?) Is there another way to save memory that doesn't rely on dynamic allocation?
Additionally, my code only reads one line at a time right now. How can I read multiple lines and store them as a single string?
this is another code
#include <stdio.h>
#include <stdlib.h>
int main ()
{
char input;
int count = 0;
int n;
char* characters= NULL;
char* more_characters = NULL;
do {
printf ("type the essay:\n");
scanf ("%d", &input);
count++;
more_characters = (char*) realloc (characters, count * sizeof(char));
if (more_characters!=NULL) {
characters=more_characters;
characters[count-1]=input; }
else {
free (characters);
printf ("Error (re)allocating memory");
exit (1);
}
} while (input!='#');
printf ("the essay: ");
for (n=0;n<count;n++) printf ("%c",characters[n]);
free (characters);
}
it is not working
You can read one character at a time and copy it into your essay buffer. When the buffer runs out of space, call realloc to get another chunk of memory. When the character you read is a '#', you're done.
Hmmm to "not waste space in memory",
then how about excessive calls of realloc()?
char *Read_Paragraph_i(void) {
size_t size = 0;
size_t i = 0;
char *dest = NULL;
int ch;
while ((ch = fgetc(stdin)) != EOF) {
if (ch == '#') break;
size++;
char *new_ptr = realloc(dest, size);
assert(new_ptr);
dest = new_ptr;
dest[i++] = ch;
}
size++;
char *new_ptr = realloc(dest, size);
assert(new_ptr);
dest = new_ptr;
dest[i++] = '\0';
return dest;
}
A saner approach would double the allocation size every time more memory is needed, temporarily wasting some memory, and then do a final "right-size" reallocation.
If you can use C++, you can use a string (std::string), which grows as needed as characters are added. If you can't, you will have to use malloc to create an array to hold the characters. When it is full, you will have to create a new one, copy the current data from the old one to the new one, and then add the new character. You can do that on each character read to use the minimal amount of memory, but that is WAY too inefficient. A better way is to allocate the character array in chunks, keeping track of the current size and the number of characters currently in it. When you want to add another character and the array is full, you allocate a new one that is some number of characters larger than the current one, update the current size to the size of the new one, and then add the new character.
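The doubling strategy with a final right-size step, as described in the answers above, might look roughly like the sketch below. The function name read_until_hash, the FILE * parameter (pass stdin to read the essay from the user), and the initial capacity of 64 are assumptions made for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads characters from `in` until '#' or end of
   input and returns them as one NUL-terminated string (newlines included).
   Returns NULL if memory runs out. The caller frees the result. */
char *read_until_hash(FILE *in)
{
    size_t cap = 64;                 /* current allocation size */
    size_t len = 0;                  /* characters stored so far */
    char *buf = malloc(cap);
    int ch;                          /* int, so EOF can be detected */

    if (buf == NULL)
        return NULL;

    while ((ch = fgetc(in)) != EOF && ch != '#') {
        if (len + 1 == cap) {        /* keep one byte free for '\0' */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)ch;
    }
    buf[len] = '\0';

    /* Final "right-size" step: give back the unused tail. If shrinking
       fails, the larger block is still valid, so keep it. */
    char *fit = realloc(buf, len + 1);
    return fit != NULL ? fit : buf;
}
```

Doubling keeps the number of realloc calls logarithmic in the essay length, while the final call trims the allocation to exactly the size needed.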

reading a file of strings to a multidimensional array to access later

I am really having a problem understanding dynamically allocated arrays.
I am attempting to read a text file of strings into a 2D array so I can sort them later. Right now, as my code stands, it throws segfaults every once in a while, which means I'm doing something wrong. I've been surfing around trying to get a better understanding of what malloc actually does, but I want to test and check whether my array is being filled.
My program reads from a text file containing nothing but strings, and I am attempting to put that data into a 2D array.
for(index = 0; index < lines_allocated; index++){
//for loop to fill array 128 lines at a time(arbitrary number)
words[index] = malloc(sizeof(char));
if(words[index] == NULL){
perror("too many characters");
exit(2);
}
//check for end of file
while(!feof(txt_file)) {
words = fgets(words, 64, txt_file);
puts(words);
//realloc if nessesary
if (lines_allocated == (index - 1)){
realloc(words, lines_allocated + lines_allocated);
}
}
}
//get 3rd value placed
printf("%s", words[3]);
Since this is just a gist, I have left out the part below this where I close the file and free the memory. The output is displayed by puts, but not by the printf at the bottom. An ELI5 version of reading files into an array would be amazing.
Thank you in advance
void *malloc(size_t n) will allocate a region of n bytes and return a pointer to the first byte of that region, or NULL if it could not allocate enough space. So when you do malloc(sizeof(char)), you're only allocating enough space for one byte (sizeof(char) is always 1 by definition).
Here's an annotated example that shows the correct use of malloc, realloc, and free. It reads in between 0 and 8 lines from a file, each of which contains a string of unknown length. It then prints each line and frees all the memory.
#include <stdio.h>
#include <stdlib.h>
/* An issue with reading strings from a file is that we don't know how long
they're going to be. fgets lets us set a maximum length and discard the
rest if we choose, but since malloc is what you're interested in, I'm
going to do the more complicated version in which we grow the string as
needed to store the whole thing. */
char *read_line(void) {
size_t maxlen = 16, i = 0;
int c;
/* sizeof(char) is defined to be 1, so we don't need to include it.
the + 1 is for the null terminator */
char *s = malloc(maxlen + 1);
if (!s) {
fprintf(stderr, "ERROR: Failed to allocate %zu bytes\n", maxlen + 1);
exit(EXIT_FAILURE);
}
/* feof only returns 1 after a read has *failed*. It's generally
easier to just use the return value of the read function directly.
Here we'll keep reading until we hit end of file or a newline. */
while ('\n' != (c = getchar())) {
if (EOF == c) {
/* We return NULL to indicate that we hit the end of file
before reading any characters, but if we've read anything,
we still want to return the string */
if (0 == i) { free(s); return NULL; } /* free the unused buffer to avoid a leak */
break;
}
if (i == maxlen) {
/* Allocations are expensive, so we don't want to do one each
iteration. As such, we're always going to allocate more than
we need. Exactly how much extra we allocate depends on the
program's needs. Here, we just add a constant amount. */
maxlen += 16;
/* realloc will attempt to resize the memory pointed to by s,
or copy it to a newly allocated region of size maxlen. If it
makes a copy, it will free the old version. */
char *p = realloc(s, maxlen + 1);
if (!p) {
/* If the realloc fails, it does not free the old version, so we do it here. */
free(s);
fprintf(stderr, "ERROR: Failed to allocate %zu bytes\n", maxlen + 1);
exit(EXIT_FAILURE);
}
s = p;//set the pointer to the newly allocated memory
}
s[i++] = c;
}
s[i] = '\0';
return s;
}
int main(void) {
/* If we wanted to, we could grow the array of strings just like we do the strings
themselves, but for brevity's sake, we're just going to stop reading once we've
read 8 of them. */
size_t i, nstrings = 0, max_strings = 8;
/* Each string is an array of characters, so we allocate an array of char*;
each char* will point to the first element of a null-terminated character array */
char **strings = malloc(sizeof(char*) * max_strings);
if (!strings) {
fprintf(stderr, "ERROR: Failed to allocate %zu bytes\n", sizeof(char*) * max_strings);
return 1;
}
for (nstrings = 0; nstrings < max_strings; nstrings++) {
strings[nstrings] = read_line();
if (!strings[nstrings]) {//no more strings in file
break;
}
}
for (i = 0; i < nstrings; i++) {
printf("%s\n", strings[i]);
}
/* Free each individual string, then the array of strings */
for (i = 0; i < nstrings; i++) {
free(strings[i]);
}
free(strings);
return 0;
}
I haven't looked too closely so I could be offering an incomplete solution.
That being said, the error is probably here:
realloc(words, lines_allocated + lines_allocated);
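On success realloc returns the new pointer, and the block may have moved, so the return value must not be discarded. If you're lucky it can extend the block in place (which is why this doesn't always cause a segfault).
words = realloc(words, lines_allocated + lines_allocated);
would solve that, although you should also check for errors, and note that the size argument is a number of bytes, not a number of elements.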
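A safer way to grow the array of line pointers is to go through a temporary pointer and to size the request in bytes. This is a minimal sketch, assuming words is a char ** and lines_allocated counts elements; grow_lines is a hypothetical helper, not something from the question.

```c
#include <stdlib.h>

/* Hypothetical helper: doubles the capacity of an array of line pointers
   (or gives it room for 8 entries if it is still empty).
   Returns 0 on success. On failure it returns -1 and leaves *words and
   *lines_allocated untouched, so the old array is still valid. */
int grow_lines(char ***words, size_t *lines_allocated)
{
    size_t new_count = *lines_allocated ? *lines_allocated * 2 : 8;

    /* realloc takes a size in bytes: element count times element size. */
    char **tmp = realloc(*words, new_count * sizeof **words);
    if (tmp == NULL)
        return -1;

    *words = tmp;
    *lines_allocated = new_count;
    return 0;
}
```

Assigning the result straight back to words would lose the only pointer to the old block if realloc failed, which is why the temporary is used.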

How do I use scanf when I don't know how many values it will assign in C?

These are the instructions:
"Read characters from standard input until EOF (the end-of-file mark) is read. Do not prompt the user to enter text - just read data as soon as the program starts."
So the user will be entering characters, but I don't know how many. I will later need to use them to build a table that displays the ASCII code of each value entered.
How should I go about this?
This is my idea
int main(void){
int inputlist[], i = -1;
do {++i;scanf("%f",&inputlist[i]);}
while(inputlist[i] != EOF)
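You said characters, so something like this might be used:
char arr[10000];
int ch;      /* int rather than char, so that EOF can be detected reliably */
int i = 0;
ch = getchar();
while (ch != EOF && i < 9999)   /* leave room for a terminator */
{
arr[i++] = ch;
ch = getchar();
}
//arr[i]=0; To make it a string, if necessary.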
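And to print the ASCII codes:
for (int j = 0; j < i; j++)
printf("%d\n", arr[j]);
If you specifically want an integer array, use
int arr[1000];
int i = 0;
while (i < 1000 && scanf("%d", &arr[i]) == 1)
i++;
PPS: This works only if the input consists of integers separated by whitespace (for example, one number per line).
scanf returns EOF at end of input and 0 if the next token is not a number; comparing against 1 stops the loop in both cases.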
You have a reasonable start on a solution, with a few errors. You can't define an array without specifying a size, so int inputlist[] shouldn't even compile. Your scanf() specifier is %f for float, which is wrong twice over: once because you declared inputlist with an integer type, and again because you said your input is characters, so you should be telling scanf() to use %c or %s. And really, if you're reading input unconditionally until EOF, you should use an unconditional input function such as fgets() or fread() (or read(), if you prefer).
You'll need two things: A place to store the current chunk of input, and a place to store the input that you've already read in. Since the input functions I mentioned above expect you to specify the input buffer, you can allocate that with a simple declaration.
char input[1024];
However, for the place to store all input, you'll want something dynamically allocated. The simplest solution is to simply malloc() a chunk of storage, keep track of how large it is, and realloc() it if and when necessary.
char *all_input;
size_t pool_size = 16384;
all_input = malloc(pool_size);
Then, just loop on your input function until the return value indicates that you've hit EOF, and on each iteration of the loop, append the input data to the end of your storage area, increment a counter by the size of the input data, and check whether you're getting too close to the size of your input storage area. (And if you are, then use realloc() to grow your storage.)
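The loop described above could be sketched as follows. It reuses the names input, all_input and pool_size from the answer; the function name read_all, the used counter and the doubling growth policy are assumptions added for illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: reads everything from `in` (for example stdin)
   in 1024-byte chunks and returns it as one NUL-terminated block.
   *total receives the number of bytes read. Returns NULL if memory
   runs out. The caller frees the result. */
char *read_all(FILE *in, size_t *total)
{
    char input[1024];            /* current chunk of input */
    size_t pool_size = 16384;    /* capacity of all_input */
    size_t used = 0;             /* bytes stored so far */
    size_t n;
    char *all_input = malloc(pool_size);

    if (all_input == NULL)
        return NULL;

    /* fread returns 0 at end of input (or on error), ending the loop. */
    while ((n = fread(input, 1, sizeof input, in)) > 0) {
        /* Grow when this chunk plus a terminator would not fit.
           One doubling is always enough because n <= 1024. */
        if (used + n + 1 > pool_size) {
            char *tmp = realloc(all_input, pool_size * 2);
            if (tmp == NULL) {
                free(all_input);
                return NULL;
            }
            all_input = tmp;
            pool_size *= 2;
        }
        memcpy(all_input + used, input, n);
        used += n;
    }

    all_input[used] = '\0';
    *total = used;
    return all_input;
}
```

The ASCII table can then be produced by walking all_input[0] through all_input[*total - 1].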
You could read the input with getchar() until you reach EOF. Since you don't know the size of the input, you should use a dynamically sized buffer on the heap.
char *buf = NULL;
long size = 1024;
long count = 0;
int r; // int rather than char, so the comparison with EOF is reliable
buf = (char *)malloc(size);
if (buf == NULL) {
fprintf(stderr, "malloc failed\n");
exit(1);
}
while( (r = getchar()) != EOF) {
buf[count++] = r;
// leave one space for '\0' to terminate the string
if (count == size - 1) {
buf = realloc(buf,size*2);
if (buf == NULL) {
fprintf(stderr, "realloc failed\n");
exit(1);
}
size = size * 2;
}
}
buf[count] = '\0';
printf("%s \n", buf);
return 0;
Here is a full solution for your needs, with comments.
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
// Number of elements
#define CHARNUM 3
int main(int argc, char **argv) {
// Allocate memory for storing input data
// We calculate requested amount of bytes by the formula:
// NumElement * SizeOfOneElement
size_t size = CHARNUM * sizeof(int);
// Call function to allocate memory
int *buffer = (int *) calloc(1, size);
// Check that calloc() returned a valid pointer.
// It can: 1. Return a pointer on success or NULL on failure
// 2. Return a pointer or NULL if size is 0
// (implementation-defined).
// In the zero-size case we can't use the pointer.
if (!buffer || !size)
{
exit(EXIT_FAILURE);
}
int curr_char;
int count = 0;
while ((curr_char = getchar()) != EOF)
{
if (count >= size/sizeof(int))
{
// If we put more characters than now our buffer
// can hold, we allocate more memory
fprintf(stderr, "Reallocate memory buffer\n");
size_t tmp_size = size + (CHARNUM * sizeof(int));
int *tmp_buffer = (int *) realloc(buffer, tmp_size);
if (!tmp_buffer)
{
fprintf(stderr, "Can't allocate enough memory\n");
exit(EXIT_FAILURE);
}
size = tmp_size;
buffer = tmp_buffer;
}
buffer[count] = curr_char;
++count;
}
// Here you get buffer with the characters from
// the standard input
fprintf(stderr, "\nNow buffer contains characters:\n");
for (int k = 0; k < count; ++k)
{
fprintf(stderr, "%c", buffer[k]);
}
fprintf(stderr, "\n");
// TODO: do something with the data
// Free all resources before exit
free(buffer);
exit(EXIT_SUCCESS); }
Compile with -std=c99 option if you use gcc.
You can also use the getline() function (POSIX), which reads from standard input line by line and allocates enough memory to store each line. Just call it until end-of-file.
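errno = 0;
ssize_t read;          /* getline() returns ssize_t, -1 on EOF or error */
char *buffer = NULL;   /* getline() allocates and grows this buffer */
size_t len = 0;
while ((read = getline(&buffer, &len, stdin)) != -1)
{
    /* Process line */
}
if (errno) {
    /* Get error */
}
// Process later, then free(buffer) once it is no longer needed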
Note that even with getline() you still need dynamically allocated memory if you want to keep every line: not for the characters themselves, but for the array of pointers to the lines.
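Keeping every line that getline() returns might look like the sketch below. It assumes a POSIX system; read_all_lines is a hypothetical helper, and the starting capacity of 8 pointers is arbitrary.

```c
#define _POSIX_C_SOURCE 200809L   /* makes getline() visible */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>            /* ssize_t */

/* Hypothetical helper: reads every line from `in` and returns an array
   of pointers to them; *count receives the number of lines. Each line
   keeps its trailing newline, if it had one. The caller frees every
   lines[i] and then the array itself. */
char **read_all_lines(FILE *in, size_t *count)
{
    char **lines = NULL;
    size_t n = 0, cap = 0;
    char *line = NULL;            /* NULL + 0 tells getline to allocate */
    size_t len = 0;

    while (getline(&line, &len, in) != -1) {
        if (n == cap) {
            size_t new_cap = cap ? cap * 2 : 8;
            char **tmp = realloc(lines, new_cap * sizeof *lines);
            if (tmp == NULL)
                break;            /* keep what we have so far */
            lines = tmp;
            cap = new_cap;
        }
        lines[n++] = line;        /* the array now owns this buffer */
        line = NULL;              /* force a fresh buffer next time */
        len = 0;
    }

    free(line);                   /* getline may allocate even on failure */
    *count = n;
    return lines;
}
```

Resetting line to NULL after each call matters: otherwise getline() would reuse the same buffer and every stored pointer would refer to the last line read.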

Reading from stdin (file of variable length)

So I've been trying to get this assignment to work in various different ways, but each time I get different errors. Basically what we have is a program that needs to read, byte by byte, the contents of a file that will be piped in (the file length could be humongous so we can't just call malloc and allocate a large chunk of space). We are required to use realloc to expand the amount of allocated memory until we reach the end of the file. The final result should be one long C string (array) containing each byte (and we can't disregard null bytes either if they are part of the file). What I have at the moment is:
char *buff;
int n = 0;
char c;
int count;
if (ferror (stdin))
{
fprintf(stderr, "error reading file\n");
exit (1);
}
else
{
do {
buff = (char*) realloc (buff, n+1);
c = fgetc (stdin);
buff[n] = c;
if (c != EOF)
n++;
}
while (c != EOF);
}
printf("characters entered: ");
for (count = 0; count < n; count++)
printf("%s ", buff[count]);
free (buff);
It should keep reading until the end of the file, expanding the memory each time but when I try to run it by piping in a simple text file, it tells me I have a segmentation fault. I'm not quite sure what I'm doing wrong.
Note that we're allowed to use malloc and whatnot, but I couldn't see how to make that work since we have no idea how much memory is needed.
You are using an unassigned pointer buff in your first call to realloc. Change to
char *buff = malloc(100);
to avoid this problem.
Once you get it working, you'll notice that your program is rather inefficient, with a realloc per character. Consider realloc-ing in larger chunks to reduce the number of reallocations.
char* buff;
...
buff = (char*) realloc (buff, n+1);
You're trying to reallocate an uninitialized pointer, which leads to undefined behaviour. Change to
char* buff = 0;
...
buff = (char*) realloc (buff, n+1);
But as has been pointed out, this is very inefficient.
It seems the answers by @dasblinkenlight and @smocking explain the current crash, but to avoid the next ones:
Change char c; to int c;, because EOF is an int value that lies outside the range of valid characters and cannot be stored reliably in a char.
It is a bad idea to call realloc for one char at a time; instead, increase the size by X bytes (let's say 100) each time, which will be MUCH more efficient.
You need to add the null terminator ('\0') at the end of the buffer if you print it as a string; otherwise printf() has undefined behavior.
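Putting the first two points together, a byte-by-byte reader that also copes with null bytes in the file could be sketched like this. The name read_bytes, the GROW_BY constant and the out-parameters are assumptions for illustration; because the data may contain '\0', the length is returned separately instead of relying on a terminator.

```c
#include <stdio.h>
#include <stdlib.h>

#define GROW_BY 100   /* illustrative step size, as suggested above */

/* Hypothetical helper: reads every byte from `in` (for example stdin).
   On success returns 0, stores the buffer in *out (NULL if the input
   was empty) and the byte count in *n. Returns -1 if memory runs out.
   The caller frees *out. */
int read_bytes(FILE *in, char **out, size_t *n)
{
    char *buff = NULL;     /* NULL makes the first realloc act like malloc */
    size_t cap = 0, len = 0;
    int c;                 /* int, so EOF differs from every byte value */

    while ((c = fgetc(in)) != EOF) {
        if (len == cap) {  /* full: grow by a whole chunk, not one byte */
            char *tmp = realloc(buff, cap + GROW_BY);
            if (tmp == NULL) {
                free(buff);
                return -1;
            }
            buff = tmp;
            cap += GROW_BY;
        }
        buff[len++] = (char)c;
    }

    *out = buff;
    *n = len;
    return 0;
}
```

To print the result, write the bytes with fwrite(data, 1, n, stdout) or one at a time with %c, since %s would stop at the first null byte.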
Here's what I came up with for reading stdin into a char[] or char* (when stdin may contain embedded NUL bytes):
char* content = NULL;
int c; /* int, not char, so that EOF is distinguishable from a data byte */
int contentSize = 0;
while ((c = fgetc(stdin)) != EOF){
content = (char*)(realloc(content, contentSize+1));
if (content == NULL) {
perror("Realloc failed.");
exit(2);
}
content[contentSize++] = c; /* store at the old size, then count the byte */
}
for (int i = 0; i < contentSize; ++i) {
printf("%c",content[i]);
}

Why am I getting a segmentation fault?

I'm trying to write a program that takes a plaintext file as its argument and parses through it, adding all the numbers together and then printing out the sum. The following is my code:
#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
static int sumNumbers(char filename[])
{
int sum = 0;
FILE *file = fopen(filename, "r");
char *str;
while (fgets(str, sizeof BUFSIZ, file))
{
while (*str != '\0')
{
if (isdigit(*str))
{
sum += atoi(str);
str++;
while (isdigit(*str))
str++;
continue;
}
str++;
}
}
fclose(file);
return sum;
}
int main(int argc, char *argv[])
{
if (argc != 2)
{
fprintf(stderr, "Please enter the filename as the argument.\n");
exit(EXIT_FAILURE);
}
else
{
printf("The sum of all the numbers in the file is : %d\n", sumNumbers(argv[1]));
exit(EXIT_SUCCESS);
}
return 0;
}
And the text file I'm using is:
This a rather boring text file with
some random numbers scattered
throughout it.
Here is one: 87 and here is another: 3
and finally two last numbers: 12
19381. Done. Phew.
When I compile and try to run it, I get a segmentation fault.
You've not allocated space for the buffer. The pointer str is uninitialized, so it points nowhere in particular. Your program effectively dumps the data read from the file into a memory location which you don't own, leading to the segmentation fault.
You need:
char *str;
str = malloc(BUFSIZ); // this is missing..also free() the mem once done using it.
or just:
char str[BUFSIZ]; // but then you can't do str++, you'll have to use another
// pointer say char *ptr = str; and use it in place of str.
EDIT:
There is another bug in:
while (fgets(str, sizeof BUFSIZ, file))
The 2nd argument should be BUFSIZ not sizeof BUFSIZ.
Why?
Because the 2nd argument is the maximum number of characters to be read into the buffer, including the null character. BUFSIZ expands to an integer constant, so sizeof BUFSIZ is the size of an int (typically 4), which means each call can read at most 3 characters into the buffer. That is the reason why 19381 was being read as 193 and then 81<space>.
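The effect of the limit can be checked with a small experiment. This is just a sketch: count_fgets_calls is a hypothetical helper that counts how many fgets calls it takes to consume a stream for a given limit.

```c
#include <stdio.h>

/* Hypothetical demo helper: counts how many fgets calls are needed to
   consume `in` when each call may store at most `limit` bytes,
   including the terminating '\0'. `limit` must be between 2 and BUFSIZ. */
int count_fgets_calls(FILE *in, int limit)
{
    char buf[BUFSIZ];
    int calls = 0;

    /* Each successful call reads at most limit - 1 characters and stops
       early after a newline. */
    while (fgets(buf, limit, in) != NULL)
        calls++;

    return calls;
}
```

For a stream containing only the line 19381 followed by a newline, a limit of 4 needs two calls (193, then 81 and the newline), whereas a limit of BUFSIZ reads the whole line in one call.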
You haven't allocated any memory to populate str. fgets takes as its first argument a buffer, not an unassigned pointer.
Instead of char *str; you need to define a reasonably sized buffer, say, char str[BUFSIZ];
Because you've not allocated space for your buffer.
A number of people have already addressed the problem you asked about, but I've got a question in return. What exactly do you think this accomplishes:
if (isdigit(*str))
{
if (isdigit(*str))
{
sum += atoi(str);
str++;
while (isdigit(*str))
str++;
continue;
}
}
What's supposed to be the point of two successive if statements with the exact same condition? (Note for the record: neither one has an else clause).
You have declared char* str, but you have not set aside memory for it just yet. You will need to malloc memory for it.
Many memory related errors such as this one can be easily found with valgrind. I'd highly recommend using it as a debugging tool.
char *str;
str has no memory allocated for it. Either use malloc() to allocate some memory for it, or declare it with a predefined size.
char str[MAX_SIZE];
Your program has several bugs:
It does not handle long lines correctly. When you read a buffer of some size it may happen that some number starts at the end of the buffer and continues at the beginning of the next buffer. For example, if you have a buffer of size 4, there might be the input The |numb|er 1|2345| is |larg|e., where the vertical lines indicate the buffer's contents. You would then count the 1 and the 2345 separately.
It calls isdigit with a char as argument. As soon as you read any "large" character (greater than SCHAR_MAX) the behavior is undefined. Your program might crash or produce incorrect results or do whatever it wants to do. To fix this, you must first cast the value to an unsigned char, for example isdigit((unsigned char) *str). Or, as in my code, you can feed it the value from the fgetc function, which is guaranteed to be a valid argument for isdigit.
You use a function that requires a buffer (fgets) but you fail to allocate the buffer. As others noted, the easiest way to get a buffer is to declare a local variable char buffer[BUFSIZ].
You use the str variable for two purposes: To hold the address of the buffer (which should remain constant over the whole execution time) and the pointer for analyzing the text (which changes during the execution). Make these two variables. I would call them buffer and p (short for pointer).
Here is my code:
#include <ctype.h>
#include <stdio.h>
static int sumNumbers(const char *filename)
{
int sum, num, c;
FILE *f;
if ((f = fopen(filename, "r")) == NULL) {
/* TODO: insert error handling here. */
}
sum = 0;
num = 0;
while ((c = fgetc(f)) != EOF) {
if (isdigit(c)) {
num = 10 * num + (c - '0');
} else if (num != 0) {
sum += num;
num = 0;
}
}
sum += num; /* count a number that runs right up to the end of the file */
if (fclose(f) != 0) {
/* TODO: insert error handling here. */
}
return sum;
}
int main(int argc, char **argv) {
int i;
for (i = 1; i < argc; i++)
printf("%d\t%s\n", sumNumbers(argv[i]), argv[i]);
return 0;
}
Here is a function, that does your job:
static int sumNumbers(char* filename) {
int sum = 0;
FILE *file = fopen(filename, "r");
char buf[BUFSIZ], *str;
while (fgets(buf, BUFSIZ, file))
{
str=buf;
while (*str)
{
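if (isdigit((unsigned char)*str))
{
sum += strtol(str, &str, 10); /* strtol leaves str just past the number */
}
else
{
str++; /* only skip a character when no number was consumed */
}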
}
}
fclose(file);
return sum;
}
This doesn't include error handling, but works quite well. For your file, the output will be
The sum of all the numbers in the file is : 19483
