What is the point of buffer in getline? - c

http://man7.org/linux/man-pages/man3/getline.3.html
I don't understand the point of the second parameter size_t *n.
Why would you need a buffer between the input (stdin for example) and the output (some character array)?
Also, in the example they provide, size_t len = 0;. What is the significance of setting a buffer of size 0?

The point of getline() is that it can reallocate the buffer it receives.
Given a caller doing
size_t n = some_value();
char *buffer = malloc(n);
getline(&buffer, &n, stdin);
The caller supplies an initial buffer of length n. If getline() reallocates, it changes buffer so it points at the new memory, and changes n to record the new length.
Obviously, this assumes that it is valid to do a realloc() on buffer i.e. that buffer is either NULL or is the value returned by malloc(), calloc(), or realloc().
The significance of setting n to zero AND buffer to NULL is telling getline() that it has been given no buffer. getline() will therefore reallocate if it reads anything.
All of this is actually described in the link you referred to.

getline() needs to know if the array is big enough to hold the line that the user has entered. It gets the current size of the array from the n parameter. If the array isn't big enough, it reallocates it to the required size. It then updates *lineptr and *n to the new array and size. Updating *n allows the caller to know how big the array is for its future use (such as calling getline() in a loop, as in the example).
Remember, C pointers don't include the size of the array they point to. If a function needs to know this, it has to be passed as a parameter.
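As a rough illustration of the loop case, here is a minimal sketch that reuses one buffer across calls. The function name longest_line_length is made up for this example, and it assumes a POSIX system where getline() is available.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper (not a library function): returns the length in
 * bytes of the longest line in `in`, counting its trailing newline if
 * present, or 0 if there is no input. */
size_t longest_line_length(FILE *in)
{
    char *line = NULL;   /* no buffer yet: getline() allocates one */
    size_t cap = 0;      /* capacity of `line`, maintained by getline() */
    size_t longest = 0;
    ssize_t len;

    /* The same buffer is passed on every call; getline() only
     * reallocates when the next line does not fit in `cap` bytes. */
    while ((len = getline(&line, &cap, in)) != -1) {
        if ((size_t)len > longest)
            longest = (size_t)len;
    }

    free(line);          /* the caller owns the buffer and must free it */
    return longest;
}
```

Because buffer and size travel together as &line and &cap, getline() can tell on each call whether the existing allocation is already big enough.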

Related

Strcpy a static char array into a dynamically allocated char array to save memory

Say, in main(), you read a string from a file and scan it into a statically declared char array. You then create a dynamically allocated char array with length strlen(string).
Ex:
FILE *ifp;
char array_static[buffersize];
char *array;
fscanf(ifp, "%s", array_static);
array = malloc(sizeof(char) * strlen(array_static) + 1);
strcpy(array, array_static);
Is there anything we can do with the statically allocated array after copying it into the dynamically allocated array, or will it just be left to rot away in memory? If this is the case, should you even go through the trouble of creating an array with malloc?
This is just a hypothetical question, but what is the best solution here with memory optimization in mind?
Here's how to make your life easier:
/* Returns a word (delimited with whitespace) into a dynamically
 * allocated string, which is returned. Caller is responsible
 * for freeing the returned string when it is no longer needed.
 * On EOF or a read error, returns NULL.
 */
char* read_a_word(FILE* ifp) {
    char* word;
    /* Note the m. It's explained below. */
    if (fscanf(ifp, "%ms", &word) != 1)
        return NULL;
    return word;
}
The m qualifier in the scanf format means:
An optional 'm' character. This is used with string conversions (%s, %c, %[), and relieves the caller of the need to allocate a corresponding buffer to hold the input: instead, scanf() allocates a buffer of sufficient size, and assigns the address of this buffer to the corresponding pointer argument, which should be a pointer to a char * variable (this variable does not need to be initialized before the call). The caller should subsequently free(3) this buffer when it is no longer required.
It's a POSIX extension to the standard C library and is therefore required by any implementation which hopes to be POSIX compatible, such as Linux, FreeBSD, or macOS (but, unfortunately, not Windows). So as long as you're using one of those platforms, it's good.
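Where the m qualifier is not available, something along these lines might do instead. This is only a sketch: read_a_word_portable is a made-up name, and the 255-byte limit on a word is an assumption of the example, so longer words would be split across calls.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical portable variant of read_a_word(). Assumes words are at
 * most 255 bytes long; longer words are split across calls.
 * Returns a malloc()ed copy that the caller must free, or NULL on EOF,
 * read error, or allocation failure. */
char *read_a_word_portable(FILE *ifp)
{
    char tmp[256];                       /* temporary fixed-size buffer */
    char *word;

    if (fscanf(ifp, "%255s", tmp) != 1)  /* width leaves room for '\0' */
        return NULL;

    word = malloc(strlen(tmp) + 1);      /* exact size, plus terminator */
    if (word != NULL)
        strcpy(word, tmp);               /* destination first, then source */
    return word;
}
```

The temporary array has automatic storage, so it is released when the function returns; only the exact-size heap copy stays around.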

What happens when I pass 0 as the second parameter of getline?

cplusplus.com states that the second parameter of the getline function is the
Maximum number of characters to write to s
However, I've seen code like this:
size_t linecap = 0;
ssize_t linelen;
linelen = getline(&line, &linecap, fp);
Won't this be reading 0 bytes from source? Or is there something else going on?
No, it's not correct. From the man page, (emphasis mine)
If *lineptr is NULL, then getline() will allocate a buffer for storing the line, which should be freed by the user program. (In this case, the value in *n is ignored.)
Alternatively, before calling getline(), *lineptr can contain a pointer to a malloc(3)-allocated buffer *n bytes in size. If the buffer is not large enough to hold the line, getline() resizes it with realloc(3), updating *lineptr and *n as necessary.
In either case, on a successful call, *lineptr and *n will be updated to reflect the buffer address and allocated size respectively.
So, the initial value stored in the memory pointed to by the second argument has no effect on the actual scanning. After the value is scanned and filled into the buffer,
the function return value will tell you the size of the scanned input in bytes.
the value of *n will tell you the size of the buffer which was allocated to store the input (which is usually bigger than the size of the scanned input).
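To see the two values side by side, a small sketch like the following may help. The name first_line_stats is invented for this example, and it assumes a POSIX getline().

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: reads one line from `in`, stores the capacity
 * that getline() reported in *cap_out, and returns the line length
 * (or -1 on EOF or error). */
ssize_t first_line_stats(FILE *in, size_t *cap_out)
{
    char *line = NULL;
    size_t cap = 0;              /* 0 is fine because line is NULL */
    ssize_t len = getline(&line, &cap, in);

    *cap_out = cap;              /* buffer size chosen by getline() */
    free(line);                  /* free the buffer even on failure */
    return len;                  /* bytes read, excluding the '\0' */
}
```

For an input line such as "hello\n" the return value is 6, while the reported capacity is at least 7 and is often considerably larger, depending on the implementation.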
The idea of getline is that there are as few reallocations as possible, as calls to malloc tend to be expensive. Hence if you use getline repeatedly to read the lines in a file, reusing the same buffer and length, the buffer will eventually grow to the size of the longest line in the file, and no reallocations will be needed for the lines succeeding the longest line.
But for that to work certain contracts must be followed - namely if *lineptr is non-NULL then it
must be a pointer returned by malloc
must have allocation size of at least *n bytes
Corollary: passing 0 in *n is fine under these 2 circumstances:
if *lineptr is NULL
*lineptr is any live pointer returned by malloc (as any pointer returned by malloc refers to at least 0 bytes of space).
In both cases *n will be updated to the size of the allocated buffer (at least the length of the line plus its terminator), and the return value of realloc(*lineptr, new_line_length_with_terminator) (if successful) will have been assigned to *lineptr.
Of course

How to use strnlen safely?

I am trying to understand what is the correct way to use strnlen so that it will be used safely even considering edge cases.
Like for example having a non null-terminated string as input.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main()
{
    void* data = malloc(5);
    size_t len = strnlen((const char*)data, 10);
    printf("len = %zu\n", len);
    return 0;
}
If I expect a string of max size 10, but the string does not contain the null character within those 10 characters strnlen will read out of bounds bytes (the input pointer may point to heap allocated data). Is this behavior undefined? If yes, is there a way to safely use strnlen to compute the length of a string which takes into account this type of scenario and does not lead to undefined behavior?
In order to use strnlen safely you need to
Keep track of the size of the input buffer yourself (5 in your case) and pass that as the second parameter, not a number greater than that.
Make sure the input pointer is not NULL.
Make sure another thread is not writing to the buffer.
Formally, you don't need to initialise the contents of the buffer, as conceptually the function reads the buffer as if its elements had type char.
This code will most likely invoke undefined behavior.
The bytes returned by malloc have indeterminate values. If there are no null bytes in the 5 bytes that are returned, then strnlen will read past those bytes since it was passed a max of 10, and reading past the end of allocated memory invokes undefined behavior.
Simply reading the bytes that were returned however should not be undefined. While indeterminate values could hold a trap representation, strnlen reads the bytes using a char *, and character types do not have trap representation, so the values are merely unspecified and reading them is safe.
If the value passed to strnlen is no larger than the size of allocated memory, then its usage is safe.
Since the actual length of data is 5 and you most likely don't have a '\0' in there, it will start reading unallocated memory (starting at data[5]), which might be a little unpleasant.
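Putting the advice above together, a wrapper could look roughly like this. The name bounded_length is hypothetical. The sketch assumes that size is the number of bytes actually allocated for buf, and that strnlen() is available (it is POSIX, not ISO C).

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns true and stores the string length in
 * *len_out if buf[0..size-1] contains a '\0'. Returns false (leaving
 * *len_out untouched) if buf is NULL, size is 0, or no terminator
 * exists within the first `size` bytes. */
bool bounded_length(const char *buf, size_t size, size_t *len_out)
{
    size_t len;

    if (buf == NULL || size == 0)
        return false;

    len = strnlen(buf, size);   /* never reads past buf[size - 1] */
    if (len == size)
        return false;           /* no terminator within the buffer */

    *len_out = len;
    return true;
}
```

For the code in the question this would mean calling bounded_length(data, 5, &len), since 5 is the size that was passed to malloc(). A return value equal to the bound is how strnlen() signals that it found no terminator.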

Buffer overflow with random size buffer? (gets)

I'm trying to learn more about ways to exploit/prevent buffer overflow in my programs. I know that the following code is vulnerable if the size is constant, but what if the size is random every time? Is there still a way to grab it from the stack and somehow alter the amount of overflow characters dynamically?
void vulnFunc(int size){
    char buffer[size];
    gets(buffer);
    // Arbitrary code
}
Consider
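fgets(buf, sizeof(buf), stdin);
with stdin and a size that matches your buffer. It will be safe: fgets() stores at most size - 1 characters plus the terminating null byte, so it cannot write past the end of the buffer. There are other possibilities, such as a loop with getc(stdin): when the data becomes larger than your buffer you can realloc(), provided the buffer came from malloc().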
It depends on the variable that is used to represent the array: whether it is of type char[] or char *. Let me explain why:
For char[], the variable name represents an array, and the sizeof operator returns the size of the array in memory (number of cells * sizeof(type)), so basically you can get the number of cells using the following expression:
sizeof(array)/sizeof(array[0])
For char *, the variable is a pointer that holds the address of the first cell of the array. sizeof(array) in this case returns the size of the pointer itself, which is 8 bytes on a typical 64-bit architecture; it has nothing to do with the array, so you can't get the information from this kind of variable. Maybe you could store the size of the allocated buffer in memory, but I don't know if it suits your needs.
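The difference shows up as soon as the array is handed to a function. Below is a small sketch of passing the capacity explicitly; ARRAY_LEN and read_bounded are names invented for this example.

```c
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Number of elements in a true array (gives a wrong answer for pointers). */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical helper: inside this function `buf` is only a pointer,
 * so the capacity has to arrive as a separate argument.
 * Returns the number of characters stored, or 0 on EOF or error. */
size_t read_bounded(char *buf, size_t capacity, FILE *in)
{
    if (capacity == 0 || capacity > INT_MAX)
        return 0;                            /* fgets() takes an int size */
    if (fgets(buf, (int)capacity, in) == NULL)
        return 0;
    return strlen(buf);                      /* at most capacity - 1 */
}
```

A caller that owns a real array, for example char buffer[16], can write read_bounded(buffer, ARRAY_LEN(buffer), stdin). A caller that only has a char * must remember the allocated size itself and pass that instead.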

scanf() not working properly?

char *str;
printf("Enter string:\n");
scanf("%s",str);
OUTPUT:
runtime-check failure#3
str is being used without being initialized
Allocate an array and read into that:
char str[100];
if (scanf("%99s", str) != 1)
...error...
Or, if you need a pointer, then:
char data[100];
char *str = data;
if (scanf("%99s", str) != 1)
...error...
Note the use of a length to prevent buffer overflow. Note that the length specified to scanf() et al is one less than the total length (an oddity based on ancient precedent; most code includes the null byte in the specified length — see fgets(), for example).
Remember that %s will skip past leading white space and then stop on the first white space after some non-white space character. In particular, it will leave the newline in the input stream, ready for the next input operation to read. If you want the whole line of input, then you should probably use fgets() and sscanf() rather than raw scanf(). In fact, very often you should use fgets() and sscanf() rather than scanf() or fscanf(), if only because it makes sensible error reporting a lot easier.
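A minimal sketch of the fgets() plus sscanf() approach could look like this. The function read_record and the "number followed by word" line format are made up for illustration, and the sketch assumes input lines shorter than 127 characters.

```c
#include <stdio.h>

/* Hypothetical helper: reads one whole line from `in` and extracts an
 * integer and a word from it. `word` must have room for 32 bytes.
 * Returns 1 on success, 0 on a malformed line, -1 on EOF or read error. */
int read_record(FILE *in, int *number, char word[32])
{
    char line[128];                      /* assumed maximum line length */

    if (fgets(line, sizeof line, in) == NULL)
        return -1;                       /* nothing was read */

    /* %31s: the width excludes the '\0', so it is one less than 32. */
    if (sscanf(line, "%d %31s", number, word) != 2)
        return 0;                        /* line consumed, but malformed */

    return 1;
}
```

Since the whole line has already been consumed by fgets(), a malformed line does not leave stray characters behind for the next read, and the line is still available if you want to print it in an error message.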
It's undefined behavior if you don't initialize it. You have an uninitialized pointer, and scanf() writes the input to whatever memory location it happens to hold, which may eventually cause trouble for you. You've declared str as a pointer, but you haven't given it a valid location to point to; it initially contains some random value that may or may not be a writable memory address. Try allocating memory for the char *str:
char *str = malloc(sizeof(char)*100);
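char *str declares str as a pointer to char; it does not reserve any storage for the characters. scanf("%s",str) needs str to point to a buffer large enough for the entered word plus the terminating null byte.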
You might NOT want to use exactly scanf("%s") (or get in the habit of using it) as it is vulnerable to buffer overflow (i.e. might accept more characters than your buffer can hold). For a discussion on this see Disadvantages of scanf
You have to allocate memory before storing a value through a pointer. Always remember that a pointer points to a memory location, and you have to tell it which memory location to use. To do that, write:
str = (char*)malloc(sizeof(char));
This gives you a memory block of 1 byte, so you can store only one character in it. For a string you have to allocate as many bytes as the string has characters, plus one. Say, "Stack" requires 5 characters and 1 extra byte for '\0'. So, you have to allocate at least 6 bytes for it.
str = (char*)malloc(6*sizeof(char));
I hope this helps you understand these concepts in C.
