What happens when I pass 0 as the second parameter of getline? - c

cplusplus.com states that the second parameter of the getline function is the
Maximum number of characters to write to s
However, I've seen code like this:
size_t linecap = 0;
ssize_t linelen;
linelen = getline(&line, &linecap, fp);
Won't this be reading 0 bytes from source? Or is there something else going on?

No, it's not correct. From the man page, (emphasis mine)
If *lineptr is NULL, then getline() will allocate a buffer for storing the line, which should be freed by the user program. (In this case, the value in *n is ignored.)
Alternatively, before calling getline(), *lineptr can contain a pointer to a malloc(3)-allocated buffer *n bytes in size. If the buffer is not large enough to hold the line, getline() resizes it with realloc(3), updating *lineptr and *n as necessary.
In either case, on a successful call, *lineptr and *n will be updated to reflect the buffer address and allocated size respectively.
So, the initial value stored in the memory pointed to by the second argument has no effect on the actual scanning. After the value is scanned and filled into the buffer,
the function return value will tell you the size of the scanned input in bytes.
the value of *n will tell you the size of the buffer which was allocated to store the input (which is usually bigger than the size of the scanned input).
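The difference between the return value and *n can be seen in a minimal sketch, assuming a POSIX system where getline() is available. The helper name read_first_line is hypothetical and exists only for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: reads one line from fp starting with an empty
 * buffer (NULL pointer, capacity 0) and reports both numbers that
 * getline() produces. Returns the line length, or -1 on EOF or error.
 * The caller must free(*line) afterwards, even when -1 is returned. */
ssize_t read_first_line(FILE *fp, char **line, size_t *capacity)
{
    assert(fp != NULL && line != NULL && capacity != NULL);

    *line = NULL;   /* tells getline() to allocate the buffer itself */
    *capacity = 0;  /* ignored while *line is NULL */

    /* On success: return value = characters read (newline included),
     * *capacity = size of the buffer that was allocated. */
    return getline(line, capacity, fp);
}
```

For the input line "hello\n" the return value is 6, while *capacity is whatever the implementation chose to allocate (at least 7, to leave room for the terminator).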

The idea of getline is to perform as few reallocations as possible, since calls to malloc tend to be expensive. Hence, if you use getline repeatedly to read the lines of a file, reusing the same buffer and length, the buffer will eventually grow to the size of the longest line in the file, and no reallocations will be needed for the lines that follow the longest one.
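A rough sketch of that reuse pattern, assuming POSIX getline(). The function name longest_line is made up for this example; the point is that one buffer and one capacity variable serve every call in the loop.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: returns the length of the longest line in fp
 * (newline included), reusing one buffer for every getline() call. */
size_t longest_line(FILE *fp)
{
    assert(fp != NULL);

    char *line = NULL;    /* grown by getline() only when a line does not fit */
    size_t capacity = 0;  /* kept in sync with line by getline() */
    size_t longest = 0;
    ssize_t len;

    while ((len = getline(&line, &capacity, fp)) != -1) {
        if ((size_t)len > longest)
            longest = (size_t)len;
    }

    free(line);           /* a single free covers the whole loop */
    return longest;
}
```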
But for that to work certain contracts must be followed - namely if *lineptr is non-NULL then it
must be a pointer returned by malloc
must have allocation size of at least *n bytes
Corollary: passing 0 in *n is fine under these 2 circumstances:
if *lineptr is NULL
if *lineptr is any live pointer returned by malloc (as any pointer returned by malloc has at least 0 bytes of space).
In both cases *n will be updated to the size of the resulting buffer, which is at least the length of the line plus its terminator, and the return value of realloc(*lineptr, new_size) (if successful) will have been assigned to *lineptr.
Of course

Related

Getting core dump freeing memory allocated by getline()

I am getting the core dump and I have been looking and it seems I am doing everything right.
int len = 0;
char *buff = NULL;
size_t sz;
if(getline(&buff,&sz, stdin) > 0){
while(isalpha(*buff++))
++len;
printf(" 1st word %d characters long ",len);
}
free(buff);
In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.
You fail in (1) so you can't do (2). Specifically, what is happening is getline() allocates storage and assigns the beginning address for the allocated block to buff. You then use buff to iterate over the string calling *buff++. When you are done iterating, buff no longer points to (holds the beginning address of) the block of memory allocated by getline().
When you attempt to pass buff to free(), an error occurs because you are attempting to free an address that was not previously allocated by malloc, calloc or realloc.
Use a separate pointer to iterate with, e.g. char *p = buff; within your read loop and iterate with p. (you can also use an index for iterating, e.g. buf[i] without changing the original address buff holds) Then you can pass buff to free().
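A minimal sketch of the separate-pointer approach. The helper name first_word_len is hypothetical; the cast to unsigned char is there because passing a negative char value to isalpha() is undefined.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: counts the leading alphabetic characters of s
 * using a separate cursor, so the caller's pointer is left untouched. */
size_t first_word_len(const char *s)
{
    assert(s != NULL);

    const char *p = s;  /* iterate with p, never with the original pointer */
    while (isalpha((unsigned char)*p))
        ++p;
    return (size_t)(p - s);
}
```

With this, the caller can write first_word_len(buff) after a successful getline() and still pass the unchanged buff to free().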

How to use strnlen safely?

I am trying to understand what is the correct way to use strnlen so that it will be used safely even considering edge cases.
Like for example having a non null-terminated string as input.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main()
{
void* data = malloc(5);
size_t len = strnlen((const char*)data, 10);
printf("len = %zu\n", len);
return 0;
}
If I expect a string of max size 10, but the string does not contain the null character within those 10 characters strnlen will read out of bounds bytes (the input pointer may point to heap allocated data). Is this behavior undefined? If yes, is there a way to safely use strnlen to compute the length of a string which takes into account this type of scenario and does not lead to undefined behavior?
In order to use strnlen safely you need to
Keep track of the size of the input buffer yourself (5 in your case) and pass that as the second parameter, not a number greater than that.
Make sure the input pointer is not NULL.
Make sure another thread is not writing to the buffer.
Formally, you don't need to initialise the contents of the buffer, as conceptually the function reads the buffer as if they are char types.
This code will most likely invoke undefined behavior.
The bytes returned by malloc have indeterminate values. If there are no null bytes in the 5 bytes that are returned, then strnlen will read past those bytes since it was passed a max of 10, and reading past the end of allocated memory invokes undefined behavior.
Simply reading the bytes that were returned however should not be undefined. While indeterminate values could hold a trap representation, strnlen reads the bytes using a char *, and character types do not have trap representation, so the values are merely unspecified and reading them is safe.
If the value passed to strnlen is no larger than the size of allocated memory, then its usage is safe.
Since the actual length of data is 5 and you most likely don't have a '\0' in there, it will start reading unallocated memory (starting at data[5]), which might be a little unpleasant.
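As a small sketch of the safe pattern, assuming strnlen() from POSIX.1-2008 is available: pass the real size of the buffer as the limit, and treat a result equal to that size as "no terminator found". The helper name is_terminated is invented for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if buf holds a '\0' within its first
 * bufsize bytes. bufsize must be the real size of the object. */
bool is_terminated(const char *buf, size_t bufsize)
{
    assert(buf != NULL);

    /* strnlen() never examines more than bufsize bytes; it returns
     * bufsize when no terminator was found in that range. */
    return strnlen(buf, bufsize) < bufsize;
}
```

For the 5-byte allocation in the question, the call would be is_terminated(data, 5), never a limit of 10.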

What is the point of buffer in getline?

http://man7.org/linux/man-pages/man3/getline.3.html
I don't understand the point of the second parameter size_t *n.
Why would you need a buffer between the input (stdin for example) and the output (some character array)?
Also, in the example they provide, size_t len = 0;. What is the significance of setting a buffer of size 0?
The point of getline() is that it can reallocate the buffer it receives.
Given a caller doing
size_t n = some_value();
char *buffer = malloc(n);
getline(&buffer, &n, stdin);
The caller supplies an initial buffer of length n. If getline() reallocates, it changes buffer so it points at the memory, and changes n to record the new length.
Obviously, this assumes that it is valid to do a realloc() on buffer i.e. that buffer is either NULL or is the value returned by malloc(), calloc(), or realloc().
The significance of setting n to zero AND buffer to NULL is telling getline() that it has been given no buffer. getline() will therefore reallocate if it reads anything.
All of this is actually described in the link you referred to.
getline() needs to know if the array is big enough to hold the line that the user has entered. It gets the current size of the array from the n parameter. If the array isn't big enough, it reallocates it to the required size. It then updates *lineptr and *n to the new array and size. Updating *n allows the caller to know how big the array is for its future use (such as calling getline() in a loop, as in the example).
Remember, C pointers don't include the size of the array they point to. If a function needs to know this, it has to be passed as a parameter.
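To make the role of n concrete, here is a sketch (assuming POSIX getline()) in which the caller supplies a deliberately small malloc'd buffer and then inspects the capacity afterwards. The name read_with_capacity and its parameters are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: reads one line into a malloc'd buffer of
 * `initial` bytes and stores the capacity after the call in *final_cap.
 * Returns the line length, or -1 on EOF, error, or failed allocation. */
ssize_t read_with_capacity(FILE *fp, size_t initial, size_t *final_cap)
{
    assert(fp != NULL && final_cap != NULL && initial > 0);

    size_t n = initial;
    char *buffer = malloc(n);
    if (buffer == NULL)
        return -1;

    /* getline() may realloc buffer; if it does, it updates both
     * buffer and n so that they stay consistent. */
    ssize_t len = getline(&buffer, &n, fp);
    *final_cap = n;

    free(buffer);  /* free whatever pointer getline() left in buffer */
    return len;
}
```

If the line is longer than the initial buffer, *final_cap comes back larger than initial, which is how the caller learns that the buffer was resized.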

Determine the size of buffer allocated in heap

I want to know the size in bytes of a buffer allocated using calloc. Testing the following on my machine:
double *buf = (double *) calloc(5, sizeof(double));
printf("%zu \n", sizeof(buf));
The result was 8; even when I change to only one element, I still get 8. My questions are:
Does it mean that I can only multiply 8*5 to get the buffer size in bytes? (I thought sizeof would return 40.)
How can I make a macro that returns the size of a buffer in bytes (the buffer to be checked could be char, int, double, or float)?
Any ideas are appreciated.
Quoting C11, chapter §6.5.3.4 (emphasis mine):
The sizeof operator yields the size (in bytes) of its operand, which may be an
expression or the parenthesized name of a type. The size is determined from the type of
the operand. [...]
So, using sizeof you cannot get the size of the memory location pointed to by a pointer. You need to keep track of that yourself.
To elaborate, your case is equivalent to sizeof (double *) which basically gives you the size of a pointer (to double) as per your environment.
There is no generic or direct way to get the size of the allocated memory from a memory allocator function. You can however, use a sentinel value to mark the ending of the allocated buffer and using a loop, you can check the value, but this means
the allocation of an extra element to hold the sentinel value itself
the sentinel value has to be excluded from the permissible values in the memory.
Choose according to your needs.
sizeof(buf) is the size of the buf variable, which is a pointer, not the size of the buffer it points to.
Due to memory alignment requirements (imposed by the hardware), the size of the block allocated with calloc() is at least the product of the values you pass to calloc() as arguments.
In your case, the size of the buffer is at least 5 * sizeof(double).
As far as I know, there is no way to find the size of a dynamically allocated block of memory, but since you allocate it, you already know its size: you have to pass it as an argument to the memory allocation function (be it malloc(), calloc(), realloc() or any other).
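One common way to keep track of the size yourself is to store it next to the pointer. The following is only a sketch; the struct double_buf and both function names are invented for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical wrapper: keeps the element count next to the pointer,
 * because the pointer alone cannot report it. */
struct double_buf {
    double *data;
    size_t  count;
};

/* Allocates count zeroed doubles; data is NULL if allocation fails. */
struct double_buf double_buf_alloc(size_t count)
{
    struct double_buf b = { calloc(count, sizeof(double)), count };
    if (b.data == NULL)
        b.count = 0;
    return b;
}

/* Size of the requested block in bytes. */
size_t double_buf_bytes(const struct double_buf *b)
{
    assert(b != NULL);
    return b->count * sizeof *b->data;
}
```

Note that sizeof *b->data is the size of one double, taken from the pointer's type, so the byte count is computed from the stored count rather than from the pointer.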

C buffer memory allocation

I'm quite new to C so please bear with my incompetence. I want to read a whole executable file into a buffer:
#include <stdlib.h>
FILE *file = fopen(argv[1], "rb");
long lSize;
fseek(file, 0, SEEK_END);
lSize = ftell(file);
fseek(file, 0, SEEK_SET);
char *buffer = (char*) malloc(sizeof(char)*lSize);
fread(buffer, 1, lSize, file);
The file has 6144 bytes (stored correctly in lSize) but the size of my buffer is only 4 bytes, therefore only the MZ signature is stored in the buffer.
Why does malloc only allocate 4 bytes in this case?
Edit:
Probably the char buffer is terminated by the first 0 in the MZ header of the PE file. If I set the buffer to a certain value however, the whole file will be stored. If I set the buffer to int (= 4 bytes), the buffer won't be terminated but will be of course larger (vs. char = 1 byte). I just want to copy the file byte for byte with the null bytes as well.
Edit 2:
The buffer of course contains everything it should but if I try to write it to a new file with fwrite, it only wrote up to the first \0 (which is 4 bytes). I just got fwrite wrong. Fixed this. Sorry, the problem wasn't well defined enough.
If lSize really does equal 6144 then your code will indeed allocate 6144 bytes and then read the entire contents of the file. If you believe that only 4 bytes are being read it is probably because the 5th byte is a zero. Thus when buffer is interpreted as a zero terminated string, it terminates at that point.
You can inspect the rest of your buffer by looking at buffer[4], buffer[5], etc.
As an aside, you don't need to cast the return from malloc, and sizeof(char) == 1 by definition. Best practice is to write the malloc like this:
char *buffer = malloc(lSize);
But that will not change your results.
Why does malloc only allocate 4 bytes in this case?
Because you failed to #include <stdlib.h> (and cast the return value of malloc()).
Do not forget to #include <stdlib.h> so that the compiler knows malloc returns a value of type void* (rather than assuming it returns an int) and takes an argument of size_t type (rather than assuming it is an int).
Also do not cast the return value of malloc. A value of type void* can be assigned to an object of pointer (to any type) type. Casting the return value makes the compiler silently convert int (assumed when <stdlib.h> was not included) to the type in the cast. Note the compiler would complain without the cast letting you know you had forgotten the include.
The real error is not malloc allocating the wrong amount (I believe it will allocate the correct amount anyway). The real error is assuming malloc returns an int when it returns a void*. int and void* can be passed differently (one in a register, the other on the stack for instance) or they have different representations (two's complement for int and segmented address for void*) or any other thing (most probably sizeof (int) != sizeof (void*)).
How are you checking the size of the buffer? Are you doing a sizeof(buffer)? In that case you are only seeing the size of a pointer to char, which is 4 bytes on a 32-bit system. You cannot get the size of a buffer from its pointer. You must store it separately, as you have done (in lSize).
If malloc() did not return NULL then your buffer is fine and the size is correct.
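Putting the answers together, a sketch of reading a binary file along with its length might look like the following. The function name read_whole_file is hypothetical, and the stream is assumed to be seekable and opened in binary mode ("rb").

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads the whole stream into a malloc'd buffer.
 * Stores the byte count in *out_len; returns NULL on failure. */
unsigned char *read_whole_file(FILE *fp, size_t *out_len)
{
    assert(fp != NULL && out_len != NULL);

    if (fseek(fp, 0, SEEK_END) != 0)
        return NULL;
    long size = ftell(fp);
    if (size < 0 || fseek(fp, 0, SEEK_SET) != 0)
        return NULL;

    /* Allocate at least 1 byte so an empty file does not look like failure. */
    unsigned char *buffer = malloc(size > 0 ? (size_t)size : 1);
    if (buffer == NULL)
        return NULL;

    if (fread(buffer, 1, (size_t)size, fp) != (size_t)size) {
        free(buffer);
        return NULL;
    }

    *out_len = (size_t)size;  /* the length travels with the buffer */
    return buffer;
}
```

Because the data may contain zero bytes, the stored length (not strlen or any other string function) should be used when writing it back out, for example fwrite(buffer, 1, len, out).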
