Can you help explain how this buffer logic works - c

I'm trying to do some work with inotify and better understand C in general. I'm very novice. I was looking over the inotify man page and I saw an example of using inotify. I have some questions around how exactly they use buffers. The code is here:
http://man7.org/linux/man-pages/man7/inotify.7.html
The block I'm most interested in is:
char buf[4096]
    __attribute__ ((aligned(__alignof__(struct inotify_event))));
const struct inotify_event *event;
int i;
ssize_t len;
char *ptr;
/* Loop while events can be read from inotify file descriptor. */
for (;;) {
    /* Read some events. */
    len = read(fd, buf, sizeof buf);
    if (len == -1 && errno != EAGAIN) {
        perror("read");
        exit(EXIT_FAILURE);
    }
    /* If the nonblocking read() found no events to read, then
       it returns -1 with errno set to EAGAIN. In that case,
       we exit the loop. */
    if (len <= 0)
        break;
    /* Loop over all events in the buffer */
    for (ptr = buf; ptr < buf + len;
            ptr += sizeof(struct inotify_event) + event->len) {
        event = (const struct inotify_event *) ptr;
What I'm trying to understand is how exactly they are processing the bits in this buffer. This is what I know:
We define a char buf of 4096, which means we have a buffer of about 4 KB. When we call read(fd, buf, sizeof buf), len will be anywhere from 0 to 4096 (partial reads can occur).
We do some async checking, that's obvious.
Now we get to the for loop; here is where I'm a little confused. We set ptr equal to buf and then compare ptr's size to buf + len.
At this point, does ptr equal the value '4096'? If so, we are asking: is ptr:4096 < buf:4096 + len:[0-4096]? I'm using a colon here to signify what I think the variable's value is, and [] to mean a range.
Then, as the iterator expression, we increase ptr by the size of an inotify event.
I'm used to higher-level OOP languages, in which I'd declare a buffer of 'inotify_event' objects. However, I'm assuming that since we are just getting back a byte array from 'read', we need to pull off the bytes at the 'inotify_event' boundary and typecast those bits into an event object. Does this sound correct?
Also, I'm not exactly sure how comparison works with buf[4096] values. We don't have a concept of checking an array's current size (allocated indexes), so I'm assuming that when it is used in a comparison, we are comparing the size of its allocated memory space, '4096' in this case?
Thanks for the help, this is my first time really working with processing bits off a buffer. Trying to wrap my head around all this. Any further reading would be helpful! I've been finding a good amount of reading on C as a language, and a good amount of reading on Linux systems programming, but I can't seem to find topics such as 'working with buffers' or the grey area between the two.

When you do the assignment ptr = buf in C, you are assigning the address of the first element of buf to ptr. So ptr holds an address, not the value 4096, and the comparison ptr < buf + len checks whether ptr has gone beyond the last byte that read() returned.
The loop advances by the number of bytes needed to skip over one struct inotify_event (defined in the man page linked in the question) plus the length of the event's name, which is stored in event->len.

ptr = buf
You are assigning the address of the first element of buf (i.e. &buf[0]) to the pointer ptr, so the loop starts at the first byte of buf.
ptr < buf + len;
This checks that ptr has not yet reached the end of the data. It uses pointer arithmetic: the loop compares the address held in ptr with the address buf + len, where len is the number of bytes returned by read().
ptr += sizeof(struct inotify_event) + event->len
Lastly, the pointer is moved forward by the size of struct inotify_event plus event->len, the length of the name field that follows the struct, which varies from event to event.
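As a rough sketch of the same loop outside of inotify, the example below packs a few variable-length records into a char buffer and then walks them the way the man page code walks events. The names struct record_header, put_record() and walk_records() are hypothetical and exist only for this illustration; the only thing borrowed from inotify is the shape (a fixed-size header followed by len bytes of name). The header is copied out with memcpy() so the sketch does not rely on the alignment attribute used in the original code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-size header; `len` name bytes follow it in the buffer. */
struct record_header {
    uint32_t id;
    uint32_t len;
};

/* Write one record at dst and return how many bytes it occupies. */
static size_t put_record(char *dst, uint32_t id, const char *name)
{
    struct record_header hdr = { id, (uint32_t)strlen(name) + 1 };
    memcpy(dst, &hdr, sizeof hdr);
    memcpy(dst + sizeof hdr, name, hdr.len);
    return sizeof hdr + hdr.len;
}

/* Walk every complete record in buf[0..len) and return how many were seen.
   The sum of their ids is stored in *id_sum. */
static size_t walk_records(const char *buf, size_t len, uint32_t *id_sum)
{
    assert(buf != NULL && id_sum != NULL);
    const char *end = buf + len;   /* one past the last valid byte */
    size_t count = 0;
    *id_sum = 0;

    for (const char *ptr = buf; ptr < end; ) {
        struct record_header hdr;
        if ((size_t)(end - ptr) < sizeof hdr)
            break;                 /* truncated header */
        memcpy(&hdr, ptr, sizeof hdr);
        if ((size_t)(end - ptr) - sizeof hdr < hdr.len)
            break;                 /* truncated name */
        *id_sum += hdr.id;
        count++;
        ptr += sizeof hdr + hdr.len;   /* same step as the inotify loop */
    }
    return count;
}
```

Note that ptr and end are addresses inside buf, so the comparison ptr < end has nothing to do with the number 4096; it only asks whether ptr is still in front of the end of the data.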

Related

Memory, pointers, and pointers to pointers

I am working on a short program that reads a .txt file. Initially, I was playing around in the main function, and I had gotten my code to work just fine. Later, I decided to abstract it into a function. Now I cannot seem to get my code to work, and I have been hung up on this problem for quite some time.
I think my biggest issue is that I don't really understand what is going on at a memory/hardware level. I understand that a pointer simply holds a memory address, and a pointer to a pointer simply holds the memory address of another memory address, a short breadcrumb trail to what we really want.
Yet, now that I am introducing malloc() to expand the amount of memory allocated, I seem to lose sight of what's going on. In fact, I am not really sure how to think of memory at all anymore.
So, a char takes up a single byte, correct?
If I understand correctly, then by default a char* takes up a single byte of memory?
If we were to have a:
char* str = "hello"
Would it be safe to assume that it takes up 6 bytes of memory (including the null character)?
And, if we wanted to allocate memory for some "size" unknown at compile time, then we would need to dynamically allocate memory.
int size = determine_size();
char* str = NULL;
str = (char*)malloc(size * sizeof(char));
Is this syntactically correct so far?
Now, if you would judge my interpretation. We are telling the compiler that we need "size" number of contiguous memory reserved for chars. If size was equal to 10, then str* would point to the first address of 10 memory addresses, correct?
Now, if we could go one step further.
int size = determine_size();
char* str = NULL;
file_read("filename.txt", size, &str);
This is where my feet start to leave the ground. My interpretation is that file_read() looks something like this:
int file_read(char* filename, int size, char** buffer) {
// Set up FILE stream
// Allocate memory to buffer
buffer = malloc(size * sizeof(char));
// Add characters to buffer
int i = 0;
char c;
while((c=fgetc(file))!=EOF){
*(buffer + i) = (char)c;
i++;
}
Adding the characters to the buffer and allocating the memory is what I cannot seem to wrap my head around.
If **buffer is pointing to *str which is equal to null, then how do I allocate memory to *str and add characters to it?
I understand that this is lengthy, but I appreciate the time you all are taking to read this! Let me know if I can clarify anything.
EDIT:
Whoa, my code is working now, thanks so much!
Although, I don't know why this works:
*((*buffer) + i) = (char)c;
So, a char takes up a single byte, correct?
Yes.
If I understand correctly, by default a char* takes up a single byte of memory.
Your wording is somewhat ambiguous. A char takes up a single byte of memory. A char * can point to one char, i.e. one byte of memory, or a char array, i.e. multiple bytes of memory.
The pointer itself takes up more than a single byte. The exact value is implementation-defined, usually 4 bytes (32-bit) or 8 bytes (64-bit). You can check the exact value with printf( "%zu\n", sizeof(char *) ).
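If we were to have a char* str = "hello", would it be safe to assume that it takes up 6 bytes of memory (including the null character)?
Yes, the string literal itself takes up 6 bytes. The pointer str that refers to it takes up sizeof(char *) bytes of its own.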
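To make the sizes discussed so far concrete, here is a small sketch. The helper names literal_bytes() and pointer_bytes() are made up for this illustration. It relies on the fact that sizeof applied to a string literal gives the size of the array, including the terminating null character, while sizeof applied to a pointer gives the size of the pointer.

```c
#include <assert.h>
#include <stddef.h>

/* sizeof(char) is 1 by definition; this is checked at compile time. */
static_assert(sizeof(char) == 1, "a char is always one byte");

/* Size of the array behind the literal: 5 letters plus '\0'. */
static size_t literal_bytes(void)
{
    return sizeof "hello";
}

/* Size of a pointer to that literal: implementation-defined. */
static size_t pointer_bytes(void)
{
    const char *str = "hello";
    return sizeof str;
}
```

On a typical 64-bit system pointer_bytes() would be expected to return 8, but that value is an assumption about the platform, not something the language guarantees.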
And, if we wanted to allocate memory for some "size" unknown at compile time, then we would need to dynamically allocate memory.
int size = determine_size();
char* str = NULL;
str = (char*)malloc(size * sizeof(char));
Is this syntactically correct so far?
Do not cast the result of malloc. And sizeof(char) is by definition always 1.
If size was equal to 10, then str* would point to the first address of 10 memory addresses, correct?
Yes. Well, almost. str* makes no sense, and it's 10 chars, not 10 memory addresses. But str would point to the first of the 10 chars, yes.
Now, if we could go one step further.
int size = determine_size();
char* str = NULL;
file_read("filename.txt", size, &str);
This is where my feet start to leave the ground. My interpretation is that file_read() looks something like this:
int file_read(char* filename, int size, char** buffer) {
// Set up FILE stream
// Allocate memory to buffer
buffer = malloc(size * sizeof(char));
No. You would write *buffer = malloc( size );. The idea is that the memory you are allocating inside the function can be addressed by the caller of the function. So the pointer provided by the caller -- str, which is NULL at the point of the call -- needs to be changed. That is why the caller passes the address of str, so you can write the pointer returned by malloc() to that address. After your function returns, the caller's str will no longer be NULL, but contain the address returned by malloc().
buffer is the address of str, passed to the function by value. Allocating to buffer would only change that (local) pointer value.
Allocating to *buffer, on the other hand, is the same as allocating to str. The caller will "see" the change to str after your file_read() returns.
Although, I don't know why this works: *((*buffer) + i) = (char)c;
buffer is the address of str.
*buffer is, basically, the same as str -- a pointer to char (array).
(*buffer) + i is pointer arithmetic -- the pointer *buffer plus i means a pointer to the i-th element of the array.
*((*buffer) + i) is dereferencing that pointer to the ith element -- a single char.
to which you are then assigning (char)c.
A simpler expression doing the same thing would be:
(*buffer)[i] = (char)c;
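Putting the pieces together, a complete file_read() could look roughly like the sketch below. This is one possible shape under stated assumptions, not the asker's actual function: it assumes that size is the maximum number of bytes to read, that the result should be null-terminated, and that returning the number of bytes read (or -1 on failure) is an acceptable convention. Note that c is declared as int rather than char so that EOF can be distinguished from ordinary data bytes.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Read at most `size` bytes of `filename` into a newly allocated,
   null-terminated block and store its address in *buffer.
   Returns the number of bytes read, or -1 on failure
   (in which case *buffer is left untouched). */
static int file_read(const char *filename, int size, char **buffer)
{
    assert(filename != NULL && buffer != NULL);
    if (size < 0)
        return -1;

    FILE *file = fopen(filename, "r");
    if (file == NULL)
        return -1;

    char *data = malloc((size_t)size + 1);   /* +1 for the '\0' */
    if (data == NULL) {
        fclose(file);
        return -1;
    }

    int i = 0;
    int c;   /* int, not char, so EOF is representable */
    while (i < size && (c = fgetc(file)) != EOF)
        data[i++] = (char)c;
    data[i] = '\0';

    fclose(file);
    *buffer = data;   /* this is what changes the caller's str */
    return i;
}
```

The caller would use it as in the question: declare char *str = NULL, call file_read("filename.txt", size, &str), and call free(str) when the contents are no longer needed.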
With char **buffer, buffer is the pointer to the pointer to the char, *buffer accesses the pointer to a char, and **buffer accesses the char value itself.
To pass back a pointer to a new array of chars, write *buffer = malloc(size).
To write values into the char array, write *((*buffer) + i) = c, or (probably simpler) (*buffer)[i] = c
See the following snippet demonstrating what's going on:
#include <stdio.h>
#include <stdlib.h>
void generate0to9(char** buffer) {
    *buffer = malloc(11); // *buffer dereferences the pointer to the pointer buffer one time, i.e. it writes a (new) pointer value into the address passed in by `buffer`
    if (*buffer == NULL)
        return; // allocation failed; the caller sees NULL
    for (int i=0;i<=9;i++) {
        //*((*buffer)+i) = '0' + i;
        (*buffer)[i] = '0' + i;
    }
    (*buffer)[10]='\0';
}
int main(void) {
    char *b = NULL;
    generate0to9(&b); // pass a pointer to the pointer b, such that the pointer's value can be changed in the function
    if (b == NULL)
        return 1;
    printf("b: %s\n", b);
    free(b);
    return 0;
}
Output:
b: 0123456789

Which pointer value is the max for a malloc call

Fairly simple question regarding malloc. What is the max that I can set within the allocated area. For instance:
char *buffer;
buffer = malloc(20);
buffer[19] = 'a'; //Is this the highest spot I can set?
buffer[20] = 'a'; //Or is this the highest spot I can set?
free(buffer);
The phrasing of your question is a bit off. You mean "what is the maximum index I can use for an allocated block of memory". The answer is the same as for arrays.
If you are reading or writing the memory, you may safely use indices between (and including) 0 and one less than the size of the block (in your case, that means index 19). All up, that means you can access the 20 values that you asked for.
If you are simply obtaining the pointer for comparison with other pointers inside the same block (and you are not going to read or write to it), you may additionally obtain the pointer one-past-the-end (in your case that means index 20).
To clarify these things with examples:
Yes, buffer[19] = 'a'; is the very last value you may access in a read or write capacity. Don't forget that if you want to store a string in this memory, and hand it to functions that expect a null-terminated string, this slot is your last chance to put the terminating '\0'.
You are allowed to take the address of buffer[20], without reading or writing through it, in the following manner:
char *p;
for( p = &buffer[0]; p != &buffer[20]; ++p )
{
putc( *p, stdout );
}
This is useful because of the way we tend to iterate over memory and store sizes. It would make our code much less readable if we had to subtract 1 all over the place.
Oh, and it gives you the neat trick:
size_t buf_size = 20;
char *buffer = malloc(buf_size);
char *start = buffer;
char *end = buffer + buf_size;
size_t oops_i_forgot_the_size = end - start;
malloc(x) will allocate x bytes.
So by accessing buffer[0] you access the first byte, by accessing buffer[1] you access the second.
For example:
char * buffer = (char *) malloc(1);
buffer[0] = 0; // legal
buffer[1] = 0; // illegal: out of bounds, undefined behavior

store characters in character pointer

I have a thread which parses incoming characters/bytes one by one.
I would like to store the sequence of bytes in a byte pointer, and in the end when the sequence of "\r\n" is found it should print the full message out.
unsigned char byte;
unsigned char *bytes = NULL;
while (true){ // thread which is running on the side
byte = get(); // gets 1 byte from I/O
bytes = byte; //
*bytes++;
if (byte == 'x'){ // for now instead of "\r\n" i use the char 'x'
printf( "Your message: %s", bytes);
bytes = NULL; // or {0}?
}
}
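You should define bytes as an array whose size is the maximum message length (plus one byte for the terminator), not as a pointer.
unsigned char byte;
unsigned char arr[10]; // 10 for example: up to 9 message bytes plus '\0'
size_t i = 0;
while (true){
    byte = get();
    if (byte != 'x'){
        arr[i] = byte;
        i++;
    }
    if (byte == 'x' || i == sizeof arr - 1){ // end marker seen, or buffer full
        arr[i] = '\0'; // terminate the string so that %s knows where to stop
        printf( "Your message: %s", arr);
        i = 0; // start the next message at the beginning of arr
    }
}
When you define bytes as a pointer and set it to NULL, it points to nothing, and writing through it is undefined behavior: it may crash or corrupt other data in your program. You can make it an array, or allocate space for it at run time using malloc.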
Your Code
unsigned char byte;
unsigned char *bytes = NULL;
while (true){
Nothing wrong here, but some things must be clarified:
Did you allocate memory for your bytes buffer? That is, using the malloc() family of functions?
If so, did you check the return value of malloc() and make sure the pointer is OK?
Did you include stdbool.h to use true and false?
Moving on...
byte = get();
bytes = byte;
*bytes++;
I'm assuming get() returns an unsigned char, since you didn't give the code.
Problem: bytes = byte. You're assigning an unsigned char to an unsigned char *. That's bad because unsigned char * expects a memory address (aka pointer) and you're giving it a character (which translates into a really bad memory address, because you're giving it an address between 0 and 255, which your program isn't allowed to access), and your compiler certainly complained about that assignment...
*bytes++ has two "problems" (not really problems): one, you don't need the * (dereference) operator just to increment the pointer; you could have written bytes++. Two, it would be shorter and easier to understand if you merged this line and the previous one (bytes = byte) into *bytes++ = byte. If you don't know what this statement does, I suggest reading up on operator precedence and assignment operators.
Then we have...
if (byte == 'x'){
printf( "Your message: %s", bytes);
bytes = NULL;
}
if's alright.
printf() is messed up because you've been incrementing your bytes pointer the whole time while you were get()ting those characters. This means that the current location pointed by bytes is the end of your string (or message). To correct this, you can do one of two things: one, have a counter on the number of bytes read and then use that to decrement the bytes pointer and get the correct address; or two, use a secondary auxiliary pointer (which I prefer, cause it's easier to understand).
bytes = NULL. If you did malloc() for your bytes buffer, here you're destroying that reference, because you're making an assignment that changes the address the pointer holds to NULL. Anyway, what you need to clear that buffer is memset(). Read more about it in the manual.
Another subtle (but serious) problem is the end-of-string character, which you forgot to put in that string altogether. Without it, printf() will start printing really weird things past your message until a segmentation fault or the like happens. To fix that, you can use your already incremented bytes pointer and do *bytes = 0x0 or *bytes = '\0'. The null terminating byte is used in a string so that functions know where the string ends. Without it, it would be really hard to manipulate strings.
Code
unsigned char byte;
unsigned char *bytes = NULL;
unsigned char *bytes_aux;
bytes = malloc(500);
if (!bytes) return;
bytes_aux = bytes;
while (true) { /* could use while(1)... */
    byte = get();
    *bytes++ = byte;
    if (byte == 'x') {
        *(bytes - 1) = 0x0;
        bytes = bytes_aux;
        printf("Your message: %s\n", bytes);
        memset(bytes, 0, 500);
    }
}
if ((*bytes++ = get()) == 'x') is a compound version of the three statements byte = get(); *bytes++ = byte; if (byte == 'x'). Refer to that assignment link I told you about! This is a neat way of writing it and will make you look super cool at parties!
*(bytes - 1) = 0x0; The -1 bit is to exclude the x character which was saved in the string. With one step we exclude the x and set the null terminating byte.
bytes = bytes_aux; This restores bytes to its default state - now it correctly points to the beginning of the message.
memset(bytes, 0, 500) The function I told you about to reset your string.
Using memset is not necessary in this particular case. On every loop repetition we're saving characters from the beginning of the bytes buffer forward. Then we set a null terminating byte and restore the pointer to its original position, effectively overwriting all other data. The null byte will take care of preventing printf() from printing whatever lies after the end of the current message. So the memset() part can be skipped and precious CPU time saved!
Somewhere when you get out of that loop (if you do), remember to free() the bytes pointer! You don't want that memory leaking...
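Since the question ultimately wants to stop at "\r\n" rather than at 'x', here is a minimal sketch of how the same idea might be packaged with a bounds check. The struct line_buf type, the line_buf_push() function and the 64-byte capacity are all hypothetical choices made for this illustration. Dropping a message that is too long is just one possible policy.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fixed-capacity accumulator for one message. */
struct line_buf {
    char data[64];
    size_t used;     /* number of bytes currently stored in data */
};

/* Add one byte. Returns 1 when a full message (terminated by "\r\n")
   has been collected; lb->data then holds it as a C string without
   the "\r\n". Returns 0 otherwise. */
static int line_buf_push(struct line_buf *lb, unsigned char byte)
{
    assert(lb != NULL);

    if (lb->used == sizeof lb->data - 1)
        lb->used = 0;   /* message too long: drop it (assumed policy) */

    lb->data[lb->used++] = (char)byte;

    if (lb->used >= 2 &&
        lb->data[lb->used - 2] == '\r' &&
        lb->data[lb->used - 1] == '\n') {
        lb->data[lb->used - 2] = '\0';   /* overwrite '\r' with terminator */
        lb->used = 0;                    /* next message starts at the front */
        return 1;
    }
    return 0;
}
```

In the thread loop this would be used as if (line_buf_push(&lb, get())) printf("Your message: %s\n", lb.data); where get() is the asker's own I/O function. Using an index instead of incrementing the pointer avoids the need for a second auxiliary pointer.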

Segment fault in realloc() on loop

I'm trying to reallocate 256 more bytes for buffer on each loop iteration. In this buffer, I will store the data obtained from read().
Here is my code:
#define MAX_BUFFER_SIZE 256
//....
int sockfd = socket( ... );
char *buffer;
buffer = malloc( MAX_BUFFER_SIZE );
assert(NULL != buffer);
char *tbuf = malloc(MAX_BUFFER_SIZE);
char *p = buffer;
int size = MAX_BUFFER_SIZE;
while( read(sockfd, tbuf, MAX_BUFFER_SIZE) > 0 ) {
while(*tbuf) *p++ = *tbuf++;
size = size + MAX_BUFFER_SIZE; // it is the right size for it?
buffer = realloc(buffer, size);
assert(NULL != buffer);
}
printf("%s", buffer);
free(tbuf);
free(p);
free(buffer);
close(sockfd);
But the above code causes a segmentation fault. Where am I going wrong?
These are the problems that are apparent to me:
Your realloc can modify the location to which buffer points. But you fail to modify p accordingly and it is left pointing into the previous buffer. That's clearly an error.
I see potential for another error in that the inner while loop need not terminate inside tbuf and could run off the end of the buffer. This is the most likely cause of your segmentation fault.
The way you use realloc is wrong. If the call to realloc fails then you can no longer free the original buffer. You should assign the return value of realloc to a temporary variable and check for errors before overwriting the buffer variable.
You should not call free on the pointer p. Since that is meant to point into the block owned by buffer, you call free on buffer alone.
The thing is, read() doesn't add a 0-terminator, so your inner while loop is very likely stepping outside the allocated memory:
while(*tbuf) *p++ = *tbuf++;
Another problem is that you are freeing pointers you didn't receive from malloc. By the time you call free, you will have incremented both p and tbuf, which you then try to free.
The whole buffer allocation looks useless, as you're not actually using it anywhere.
When you use realloc on buffer, it is possible that the address stored in buffer changes as a result of changing the size. Once that happens, p is no longer holding the correct address.
Also towards the end, you are freeing both p and buffer while they point to the same location. You should only free one of them.
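The points raised in these answers can be combined into a sketch like the one below. It is not the asker's code: the function name read_all() and the CHUNK_SIZE constant are invented for this example. It tracks the write position as an offset (so nothing dangles after realloc), uses the byte count returned by read() instead of looking for a 0 byte, assigns the result of realloc to a temporary first, and frees only the block's base pointer. Retrying after EINTR is deliberately left out to keep the sketch short.

```c
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

#define CHUNK_SIZE 256   /* hypothetical growth step */

/* Read from fd until end of file. Returns a malloc'ed, null-terminated
   block and stores the number of data bytes in *out_len, or returns
   NULL on failure. The caller frees the result. */
static char *read_all(int fd, size_t *out_len)
{
    assert(out_len != NULL);

    size_t cap = CHUNK_SIZE;          /* data capacity, excluding '\0' */
    size_t used = 0;                  /* offset of the next free byte */
    char *buffer = malloc(cap + 1);
    if (buffer == NULL)
        return NULL;

    for (;;) {
        if (used == cap) {            /* full: grow by one chunk */
            char *tmp = realloc(buffer, cap + CHUNK_SIZE + 1);
            if (tmp == NULL) {
                free(buffer);         /* original block is still valid */
                return NULL;
            }
            buffer = tmp;
            cap += CHUNK_SIZE;
        }
        ssize_t n = read(fd, buffer + used, cap - used);
        if (n < 0) {
            free(buffer);
            return NULL;
        }
        if (n == 0)
            break;                    /* end of file */
        used += (size_t)n;
    }

    buffer[used] = '\0';              /* read() never adds this itself */
    *out_len = used;
    return buffer;
}
```

Reading directly into buffer + used also removes the need for the separate tbuf and the byte-by-byte copy.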

append two void* pointers

Is there a way to append 2 void* ptr? Each is an array of chars:
For example:
void * ptr;
ptr = malloc(3);
read(0, ptr, 3);
void * rtr;
rtr = malloc(3);
read (0, rtr, 3);
/*how to add ptr and rtr??*/
Thank you!
*EDIT: YES, I would like to add the contents together.
In actuality this is more of how my code works:
void *ptr;
ptr = malloc(3);
read(0, ptr, 3);
void *rtr;
rtr = malloc(1);
int reader;
reader=read(0, rtr, 1);
int i=1;
while(reader!=0){
/* append contents of rtr to ptr somehow?? */
i++;
rtr = realloc(rtr, i);
reader=read(0, rtr, 1);
}
I'm reading from a file. And the file might change, I have to append byte-by-byte if the file changes.
Your question doesn't really have an answer for the way you worded it, but I'll try...
You must allocate a block of memory first, using malloc(). Then, your void pointer would point to that. That block would have a definite size. The second block conforms to the same concepts, and has a definite size.
In order to append the second to the first, the first block should have been allocated with enough extra space to append the second block's contents. You would then use memcpy() to copy the bytes from the second block to the first block. You would need to use a cast to a byte pointer to specify the offset into the first block.
((unsigned char *)(ptr) + ptr_alloced_bytes) would be the offset into the first block to the end of the first copied data, where ptr_alloced_bytes is the number of bytes read by the first operation.
Otherwise you would need to allocate a new block that is large enough to hold both blocks, then copy them both using memcpy().
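A minimal sketch of the approach described above, using realloc() to make room and memcpy() to copy, might look like this. The helper name append_bytes() and its parameters are hypothetical. The cast to unsigned char * is what makes the offset arithmetic possible, since arithmetic on void * is not defined by the C standard.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Append src_len bytes from src to the block *dst, which currently
   holds *dst_len bytes. On success *dst and *dst_len are updated and
   0 is returned; on failure -1 is returned and *dst is unchanged. */
static int append_bytes(void **dst, size_t *dst_len,
                        const void *src, size_t src_len)
{
    assert(dst != NULL && dst_len != NULL);
    if (src_len == 0)
        return 0;                      /* nothing to do */

    void *tmp = realloc(*dst, *dst_len + src_len);
    if (tmp == NULL)
        return -1;

    /* Copy to the end of the existing data. */
    memcpy((unsigned char *)tmp + *dst_len, src, src_len);

    *dst = tmp;
    *dst_len += src_len;
    return 0;
}
```

In the loop from the question, each byte read into rtr could then be appended with append_bytes(&ptr, &ptr_len, rtr, 1), where ptr_len is a size_t that the caller initializes to the number of bytes already stored in ptr.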

Resources