How can I read the contents of a file if I have to use the following parameters:
I have to read the file in parts by using "start-value" of the part and length of the part
The start-value and length of the parts will be read from another file
Overall, I am trying to compute the MD5 value of these parts (you can also call them chunks).
The start-value and length of the chunks have been computed and stored in a file.
I tried to use fread() as follows, but it does not give me logical results
char *chunk_buffer;
//chunk_buffer is a pointer to a memory block
while(cur_poly != NULL) {
//cur_poly is a structure which is used to store the start and length of chunks
chunk_buffer = (char*) malloc ((cur_poly->length)*8);
//here I am trying to allocate memory based on the size of each chunk
int x=fread (chunk_buffer,1, cur_poly->length, c_file);
//c_file is the file to be read according to the offsets
char hash[32];
hash=md5(chunk_buffer);
//md5() is a function which can generate the md5 hash values for the chunks
}
I see two potential issues.
What units does cur_poly->length represent? You are mallocing memory as if it is a count of 64-bit words, yet reading the file as if it is bytes. If the field represents length in bytes, then you are reading correctly, but allocating too much memory. However, if the field is length in 64-bit words, then you are allocating the right amount of memory, but only reading 1/8th the data.
The code seems to be ignoring offsets (or assuming all chunks must be contiguous). If you want to read from an arbitrary offset, do an fseek(fp, offset, SEEK_SET); before the fread.
If the chunks are supposed to be contiguous, there still may be padding at the ends to force them all to start on an even boundary. You would have to seek over the padding whenever the byte count was odd (.WAV does this, as an example)
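A minimal sketch of the seek-then-read approach, assuming the length is a byte count and the offset is measured from the start of the file. The helper name read_chunk is hypothetical and does not come from the question:

```c
#include <stdio.h>

/* Read `length` bytes starting at byte `offset` of `fp` into `buf`.
 * Returns the number of bytes actually read, which can be smaller than
 * `length` near the end of the file. `read_chunk` is a hypothetical helper. */
size_t read_chunk(FILE *fp, long offset, size_t length, unsigned char *buf)
{
    if (fseek(fp, offset, SEEK_SET) != 0)
        return 0;               /* seek failed: nothing was read */
    return fread(buf, 1, length, fp);
}
```

Comparing the return value with the requested length is the simplest way to notice a short read.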
I want to note some more issues with that code. You might need to add some more details on these points.
If you want to read consecutive chunks from your file, you usually don't need to modify the get pointer of your file. Just read a chunk, and then read the next one. If you need to read the chunks in random order, you need to use fseek. This way you adjust the start position of the next file operation by an offset (from beginning, or end of the file, or relative to the current position).
You have a char pointer chunk_buffer, that you obviously use to store the data from your file temporarily. That is, it's only valid for the current loop iteration.
If this is the case I would suggest to do the malloc once before you enter the loop:
char * chunk_buffer = malloc (MAXIMUM_CHUNK_SIZE);
In the loop you may clear this buffer using memset or just overwrite the data. Also note that malloc()ed memory is not initialized with '\0' values (I don't know if this is one assumption you rely on ...).
I am not sure, why you actually allocate a buffer of size length*8 and just read length bytes to it. Probably
int x = fread (chunk_buffer, SIZE_OF_ITEM, THIS_CHUNK_SIZE, c_file);
would fit your needs closer, if your items are indeed larger than a byte.
It is unclear, what the md5() function actually does. What value does it return? A pointer to a buffer that is allocated dynamically? A pointer to a local array? Anyway, you assign the return value to a pointer to a local array of chars. You might not need to allocate 32 bytes for this, but just
char * hash = md5 (chunk_buffer);
Make sure that you keep the pointer to that array somewhere you find it when the loop takes the next iteration. An array that is created statically in local scope of that function can of course not be passed this way.
Your md5() function. How does it know, what the size of a chunk is? It is passed a pointer, but not the size of the valid data (as far as I see it). You might need to adapt this function to take the length of the input array as additional parameter.
What does the md5() function produce, a C-style string (alphanumeric digits, null-terminated) or an array of byte sized unsigned integers (uint8_t) ?
Make sure that you free() the memory you allocate dynamically. If you want to keep the malloc() inside the loop, make sure the loop always ends with
free (chunk_buffer);
For us to help you any further, you need to define
a) what are logical results for you and
b) what results do you get
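Several of the points above (one buffer allocated once, a seek before each read, a digest routine that is told the length) can be sketched together. This is only an illustration under assumptions: struct chunk is a hypothetical stand-in for cur_poly, lengths are byte counts, and sum_bytes is a placeholder for a real MD5 routine:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical chunk list, modelled on cur_poly from the question. */
struct chunk {
    long offset;            /* start of the chunk, in bytes */
    size_t length;          /* size of the chunk, in bytes */
    struct chunk *next;
};

/* Hypothetical callback type: it receives the data AND its length, since
 * the data is binary and cannot be measured with strlen(). */
typedef void (*digest_fn)(const unsigned char *data, size_t len, void *ctx);

/* Stand-in digest for demonstration only: adds up the bytes. */
void sum_bytes(const unsigned char *data, size_t len, void *ctx)
{
    unsigned long *total = ctx;
    for (size_t i = 0; i < len; i++)
        *total += data[i];
}

/* Visit every chunk with one buffer sized for the largest chunk.
 * Returns 0 on success, -1 on allocation, seek, or read failure. */
int process_chunks(FILE *fp, const struct chunk *list, size_t max_len,
                   digest_fn digest, void *ctx)
{
    unsigned char *buf = malloc(max_len ? max_len : 1);
    if (buf == NULL)
        return -1;

    int status = 0;
    for (const struct chunk *c = list; c != NULL; c = c->next) {
        if (c->length > max_len ||
            fseek(fp, c->offset, SEEK_SET) != 0 ||
            fread(buf, 1, c->length, fp) != c->length) {
            status = -1;
            break;
        }
        digest(buf, c->length, ctx);
    }
    free(buf);                  /* one malloc, one free */
    return status;
}
```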
Related
I have a problem, and I cannot figure out the solution for it. I have to program some code for a µC, but I am not familiar with it.
I have to create an analysis and show its results on the screen of the machine. The analysis is already done and functional. But getting the results from the analysis to the screen is my problem.
I have to store all results in a global array. Since the stack is really limited on the machine, I have to bring it to the larger heap. The linker is made that way, that every dynamic allocation ends up on the heap. But this is done in C so I cannot use "new". But everything allocated with malloc ends up on the heap automatically and that is why I need to use malloc, but I haven't used that before, so I have real trouble with it. The problem with the screen is, it accepts only char arrays.
In summary: I have to create a global 2D char array holding the results of up to 100 positions and I have to allocate the memory for it using malloc.
To make it even more complicated I have to declare the variable with "extern" in the buffer.h file and have to implement it in the buffer.c file.
So my buffer.h line looks like this:
extern char * g_results[100][10];
In the buffer.c I am using:
g_results[0][0] = malloc ( 100 * 10 )
Each char is 1 byte, so the array should have a size of 1000 bytes to hold 100 results with a length of 9 and 1 terminating \0. Right?
Now I try to store the results into this array with the help of strcpy.
I am doing this in a for loop at the end of the analysis.
for (int i = 0; i < 100; i++)
{
// Got to convert it to text 1st, since the display does not accept anything but text.
snprintf(buffer, 9, "%.2f", results[i]);
strcpy(g_results[i][0], buffer);
}
And then I iterate through the g_results_buffer on the screen and display the content. The problem is: it works perfect for the FIRST result only. Everything is as I wanted it.
But all other lines are empty. I checked the results-array, and all values are stored in them, so that is not the cause for the problem. Also, the values are not overwritten, it is really the 1st value.
I cannot see what it is the problem here.
My guesses are:
a) allocation with malloc isn't done correctly. Only allocating space for the 1st element? When I remove the [0][0] I get a compiler error: "assignment to expression with array type". But I do not know what that should mean.
b) (totally) wrong usage of the pointers. Is there a way I can declare that array as a non-pointer, but still on the heap?
I really need your help.
How do I store the results from the results-array after the 1st element into the g_results-array?
I have to store all results in a global array. Since the stack is really limited on the machine, I have to bring it to the larger heap.
A “global array“ and “the larger heap” are different things. C does not have a true global name space. It does have objects with static storage duration, for which memory is reserved for the entire execution of the program. People use the “heap” to refer to dynamically allocated memory, which is reserved from the time a program requests it (as with malloc) until the time the program releases it (as with free).
Variables declared outside of functions have file scope for their names, external or internal linkage, and static storage duration. These are different from dynamic memory. So it is not clear what memory you want: static storage duration or dynamic memory?
“Heap” is a misnomer. Properly, that word refers to a type of data structure. You can simply call it “allocated memory.” A “heap” may be used to organize pieces of memory available for allocation, but it can be used for other purposes, and the memory management routines may use other data structures.
The linker is made that way, that every dynamic allocation ends up on the heap.
The linker links object modules together. It has nothing to do with the heap.
But everything allocated with malloc ends up on the heap automatically and that is why I need to use malloc,…
When you allocate memory, it does not end up on the heap. The heap (if it is used for memory management) is where memory that has been freed is kept until it is allocated again. When you allocate memory, it is taken off of the heap.
The problem with the screen is, it accepts only char arrays.
This is unclear. Perhaps you mean there is some display device that you must communicate with by providing strings of characters.
In summary: I have to create a global 2D char array holding the results of up to 100 positions and I have to allocate the memory for it using malloc.
That would have been useful at the beginning of your post.
So my buffer.h line looks like this:
extern char * g_results[100][10];
That declares an array of 100 arrays of 10 pointers to char. So you will have 1,000 pointers to strings (technically 1,000 pointers to the first character of strings, but we generally speak of a pointer to the first character of a string as a pointer to the string). That is not likely what you want. If you want 100 strings of up to 10 characters each (including the terminating null byte in that 10), then a pointer to an array of 100 arrays of 10 characters would suffice. That can be declared with:
extern char (*g_results)[100][10];
However, when working with arrays, we generally just use a pointer to the first element of the array rather than a pointer to the whole array:
extern char (*g_results)[10];
In the buffer.c I am using:
g_results[0][0] = malloc ( 100 * 10 )
Each char is 1 byte, so the array should have a size of 1000 bytes to hold 100 results with a length of 9 and 1 terminating \0. Right?
That space does suffice for 100 instances of 10-byte strings. It would not have worked with your original declaration of extern char * g_results[100][10];, which would need space for 1,000 pointers.
However, having changed g_results to extern char (*g_results)[10];, we must now assign the address returned by malloc to g_results, not to g_results[0][0]. We can allocate the required space with:
g_results = malloc(100 * sizeof *g_results);
Alternately, instead of allocating memory, just use static storage:
char g_results[100][10];
Now I try to store the results into this array with the help of strcpy. I am doing this in a for loop at the end of the analysis.
for (int i = 0; i < 100; i++)
{
// Got to convert it to text 1st, since the display does not accept anything but text.
snprintf(buffer, 9, "%.2f", results[i]);
strcpy(g_results[i][0], buffer);
}
There is no need to use buffer; you can send the snprintf results directly to the final memory.
Since g_results is an array of 100 arrays of 10 char, g_results[i] is an array of 10 char. When an array is used as an expression, it is automatically converted to a pointer to its first element, except when it is the operand of sizeof, the operand of unary &, or is a string literal used to initialize an array (in a definition). So you can use g_results[i] to get the address where string i should be written:
snprintf(g_results[i], sizeof g_results[i], "%.2f", results[i]);
Some notes about this:
We see use of the array both with automatic conversion and without. The argument g_results[i] is converted to &g_results[i][0]. In sizeof g_results[i], sizeof gives the size of the array, not a pointer.
The buffer length passed to snprintf does not need to be reduced by 1 to allow for the terminating null character. snprintf handles that by itself. So we pass the full size, sizeof g_results[i].
But all other lines are empty.
That is because your declaration of g_results was wrong. It declared 1,000 pointers, and you stored an address only in g_results[0][0], so all the other pointers were uninitialized.
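Putting the pieces of this answer together, a minimal sketch of the allocated variant could look like the following. The macro names and the helpers results_init and results_store are hypothetical, and results is assumed to hold double values:

```c
#include <stdio.h>
#include <stdlib.h>

#define RESULT_COUNT 100   /* number of results, from the question */
#define RESULT_WIDTH 10    /* 9 characters plus the terminating null */

/* Pointer to the first of RESULT_COUNT rows of RESULT_WIDTH chars.
 * In the question's layout, buffer.h would carry the matching
 * `extern char (*g_results)[RESULT_WIDTH];` declaration. */
char (*g_results)[RESULT_WIDTH];

/* Allocate all rows in one block. Returns 0 on success, -1 on failure. */
int results_init(void)
{
    g_results = malloc(RESULT_COUNT * sizeof *g_results);
    return g_results != NULL ? 0 : -1;
}

/* Format up to RESULT_COUNT values as text, one per row. */
void results_store(const double *values, size_t count)
{
    for (size_t i = 0; i < count && i < RESULT_COUNT; i++)
        snprintf(g_results[i], sizeof g_results[i], "%.2f", values[i]);
}
```

A value that needs more than 9 characters is truncated by snprintf rather than overflowing its row.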
This is all odd, you seem to just want:
// array of 100 arrays of 10 chars
char g_results[100][10];
for (int i = 0; i < 100; i++) {
    // why snprintf+strcpy? Just write where you want to write.
    // results[i] has to be float or double to match %.2f.
    // Why 9? The buffer has 10 chars, so pass 10.
    snprintf(g_results[i], 10, "%.2f", results[i]);
}
Only allocating space for the 1st element?
Yes, you are: you only assigned the result of malloc ( 100 * 10 ) to the first element, g_results[0][0].
wrong usage of the pointers. Is there a way I can declare that array as a non-pointer, but still on the heap?
No. To allocate something on the heap you have to call malloc.
But there is no reason to use the heap, especially since you are on a microcontroller and you know how many elements you are going to allocate. The heap is for unknowns; if you know that you want exactly 100 x 10 chars, just take them.
Overall, consider reading some C books.
I do not know what that should mean.
You cannot assign to an array as a whole. You can assign to array elements, one by one.
I'm trying to code a buffer for an input file. The Buffer should always contain a defined amount of data. If a few bytes of the data were used, the buffer should read data from the file until it has the defined size again.
const int bufsize = 10;
int *field = malloc(bufsize*sizeof(int)); //allocate the amount of memory the buffer should contain
for(i=0;i<bufsize;++i) //initialize memory with something
*(field+i) = i*2;
field += 4; //Move pointer 4 units because the first 4 units were used and are no longer needed
field= realloc(field,bufsize*sizeof(int)); //resize the now smaller buffer to its original size
//...some more code where the new memory (field[6]-field[9]) is filled again...
Here is a short example of how I'm trying to do it at the moment (without files, because this is the part that's not working), but the realloc() always returns NULL. In this example, the first 4 units were used, so the pointer should move forward and the missing data at the end of the memory (so that it will again contain 10 elements) should be allocated. What am I doing wrong?
I would be very thankful if someone could help me
You need memmove() instead
memmove(field, field + 4, (bufsize - 4) * sizeof(*field));
you don't need to realloc() because you are not changing the size of the buffer, just think about it.
If you do this
field += 4;
now you have lost the reference to the beginning of field, so you can't even call free() on it, nor realloc() of course. Read WhozCraig's comment for instance.
Doing realloc() for the same size doesn't make that much sense.
Using realloc() the way you do causes some other problems: for instance, when it fails you run into the same problem, since you lose the reference to the original pointer.
So the recommended method is
void *pointer;
pointer = realloc(oldPointer, oldSize + nonZeroNewSize);
if (pointer == NULL)
    handleFailure_PerhapsFree_oldPointer();
else
    oldPointer = pointer;
So the title of your question contains the answer to it: what you need is to move the data from an offset of 4 * sizeof(int) bytes to the beginning of the buffer, for which memmove() is the perfect tool. You could also think of using memcpy(), but memcpy() cannot handle overlapping data, which is your case.
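As a small sketch of that idea, assuming the caller keeps track of how many elements in the buffer are valid, the compaction step could be wrapped like this (discard_front is a hypothetical helper name):

```c
#include <string.h>

/* Drop the first `used` elements of `buf`, which currently holds `count`
 * valid elements, and slide the remainder to the front. Returns how many
 * valid elements are left; the caller refills from that index onward.
 * `discard_front` is a hypothetical helper name. */
size_t discard_front(int *buf, size_t count, size_t used)
{
    if (used > count)
        used = count;
    memmove(buf, buf + used, (count - used) * sizeof *buf);
    return count - used;
}
```

With the question's numbers, discard_front(field, 10, 4) leaves 6 valid elements, and the new data from the file goes into field[6] through field[9]. The pointer returned by malloc() is never changed, so free(field) remains valid.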
Your problem should be named as Cyclic Buffer.
You should call malloc() just once when opening the file and free() once when you close it.
You don't need to call realloc() at all. All that is necessary is advancing the pointer by the amount of data read, wrapping its value around the size of the buffer, and replacing the old data with new data from the file.
Your problem with realloc(): you must pass it the same pointer that was previously returned from malloc() or realloc(), without offsetting it!
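A minimal sketch of such a cyclic buffer follows. For brevity it uses a fixed-size array inside a struct rather than a malloc()ed block, and tracks an index instead of moving a pointer; the names struct ring, ring_put and ring_get are hypothetical:

```c
#include <stddef.h>

#define RING_CAPACITY 10   /* matches bufsize in the question */

/* Hypothetical fixed-size ring buffer of ints. */
struct ring {
    int data[RING_CAPACITY];
    size_t head;    /* index of the oldest element */
    size_t count;   /* number of stored elements */
};

/* Append one value. Returns 0 on success, -1 if the buffer is full. */
int ring_put(struct ring *r, int value)
{
    if (r->count == RING_CAPACITY)
        return -1;
    r->data[(r->head + r->count) % RING_CAPACITY] = value;
    r->count++;
    return 0;
}

/* Remove the oldest value into *out. Returns 0 on success, -1 if empty. */
int ring_get(struct ring *r, int *out)
{
    if (r->count == 0)
        return -1;
    *out = r->data[r->head];
    r->head = (r->head + 1) % RING_CAPACITY;
    r->count--;
    return 0;
}
```

Nothing is ever moved or reallocated here: consuming data only advances head, and refilling writes into the slots that were freed up.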
I want to store 5 names without wasting 1 byte, so how can I allocate memory using malloc?
That's for all practical purposes impossible, malloc will more often than not return blocks of memory bigger than requested.
#include <stdio.h>
#include<stdlib.h>
int main()
{
int n,i,c;
char *p[5];/*declare a pointer to 5 strings for the 5 names*/
for(i=0;i<5;i++)
{
n=0;
printf("please enter the name\n" );/*input name from the user*/
while((c=getchar())!='\n')
n++;/*count the total number of characters in the name*/
p[i]= (char *)malloc(sizeof(char)*n);/*allocate the required amount of memory for a name*/
scanf("%s",p[i]);
}
return 0;
}
If you know the cumulative length of the five names, let's call it length_names, you could do a
void *pNameBlock = malloc(length_names + 5);
Then you could store the names, null terminated (the +5 is for the null termination), one right after the other in the memory pointed to by pNameBlock.
char *pName1 = (char *) pNameBlock;
Store the name data at *pName1. Maybe via
char *p = pName1; You can then write byte by byte (the following is pseudo-code).
*p++ = byte1;
*p++ = byte2;
etc.
End with a null termination:
*p++ = '\0';
Now set
char *pName2 = p;
and write the second name using p, as above.
Doing things this way will still waste some memory. Malloc will internally get itself more memory than you are asking for, but it will waste that memory only once, on this one operation, getting this one block, with no overhead beyond this once.
Be very careful, though, because under this way of doing things, you can't free() the char *s, such as pName1, for the names. You can only free that one pointer you got that one time, pNameBlock.
If you are asking this question out of interest, ok. But if you are this memory constrained, you're going to have a very very hard time. malloc does waste some memory, but not a lot. You're going to have a hard time working with C this constrained. You'd almost have to write your own super light weight memory manager (do you really want to do that?). Otherwise, you'd be better off working in assembly, if you can't afford to waste even a byte.
I have a hard time imagining what kind of super-cramped embedded system imposes this kind of limit on memory usage.
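The single-block approach from this answer can be sketched as follows, assuming the names are already available as C strings so their lengths can be measured first. pack_names is a hypothetical helper name:

```c
#include <stdlib.h>
#include <string.h>

/* Copy `n` strings back to back into a single malloc'ed block.
 * On success, ptrs[i] points at the i-th copy inside the block and the
 * block itself is returned; free() only that returned pointer.
 * Returns NULL if allocation fails. `pack_names` is a hypothetical name. */
char *pack_names(const char *const names[], size_t n, char *ptrs[])
{
    size_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += strlen(names[i]) + 1;      /* +1 for each terminator */

    char *block = malloc(total ? total : 1);
    if (block == NULL)
        return NULL;

    char *p = block;
    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(names[i]) + 1;
        memcpy(p, names[i], len);           /* copies the terminator too */
        ptrs[i] = p;
        p += len;
    }
    return block;
}
```

The pointers stored in ptrs must not be passed to free(); only the returned block may be freed.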
If you don't want to waste any byte to store names, you should dynamically allocate a double array (char) in C.
A double array in C can be implemented as a pointer to a list of pointers.
char **name; // Allocate space for a pointer, pointing to a pointer (the beginning of an array in C)
name = (char **) malloc (sizeof(char *) * 5); // Allocate space for the pointer array, for 5 names
name[0] = (char *) malloc (sizeof(char) * lengthOfName1); // Allocate space for the first name (the length must count the terminating '\0'), same for other names
name[1] = (char *) malloc (sizeof(char) * lengthOfName2);
....
Now you can save the name to its corresponding position in the array without allocating more space, even though names might have different lengths.
You can use a pointer to a pointer (char **) and copy each name in character by character while advancing the pointer, which lets you store all 5 names without unused space.
But as a programmer you should not need to do something this tedious. Use an array of pointers to store the names and allocate the memory for each one as you go.
This applies only to storing a few names. If you are dealing with a large amount of data, use a linked list to store it.
When you malloc a block, it actually allocates a bit more memory than you asked for. This extra memory is used to store information such as the size of the allocated block.
Encode the names in binary and store them in a byte array.
What is "memory waste"? If you can define it clearly, then a solution can be found.
For example, the null in a null terminated string might be considered "wasted memory" because the null isn't printed; however, another person might not consider it memory waste because without it, you need to store a second item (string length).
When I use a byte, the byte is fully used. Only if you can show me how it might be done without that byte will I consider your claims of memory waste valid. I use the nulls at the ends of my strings. If I declare an array of strings, I use the array too. Make what you need, and then if you find that you can rearrange those items to use less memory, decide that the other way wasted some memory. Until then, you're chasing a dream which you haven't finished.
If these five "names" are assembly jump points, you don't need a full string's worth of memory to hold them. If the five "names" are block scoped variables, perhaps they won't need any more memory than the registers already provide. If they are strings, then perhaps you can combine and overlay strings; but, until you come up with a solution, and a second solution to compare the first against, you don't have a case for wasted / saved memory.
I'm new to c. Just have a question about the character arrays (or string) in c: When I want to create a character array in C, do I have to give the size at the same time?
Because we may not know the size that we actually need. For example of client-server program, if we want to declare a character array for the server program to receive a message from the client program, but we don't know the size of the message, we could do it like this:
char buffer[1000];
recv(fd,buffer, 1000, 0);
But what if the actual message is only of length 10. Will that cause a lot of wasted memory?
Yes, you have to decide the dimension in advance, even if you use malloc.
When you read from sockets, as in the example, you usually use a buffer of a reasonable size and move the data into another structure as soon as you consume it. In any case, 1000 bytes is not much wasted memory, and it is certainly faster than asking a memory manager for one byte at a time :)
Yes, you have to give the size if you are not initializing the char array at the time of declaration. Better approach for your problem is to identify the optimum size of the buffer at run time and dynamically allocate the memory.
What you're asking about is how to dynamically size a buffer. This is done with a dynamic allocation such as using malloc() -- a memory allocator. Using it gives you an important responsibility though: when you're done using the buffer you must return it to the system yourself. If using malloc() [or calloc()], you return it with free().
For example:
char *buffer; // pointer to a buffer -- essentially an unsized array
buffer = (char *)malloc(size);
// use the buffer ...
free(buffer); // return the buffer -- do NOT use it any more!
The only problem left to solve is how to determine the size you'll need. If you're recv()'ing data that hints at the size, you'll need to break the communication into two recv() calls: first getting the minimum size all packets will have, then allocating the full buffer, then recv'ing the rest.
When you don't know the exact amount of input data, do as follows:
1. Create a small buffer
2. Allocate some memory for a "storage" (e.g. twice the buffer size)
3. Fill the buffer with the data from the input stream (e.g. socket, file etc.)
4. Copy the data from the buffer to the storage
4.1 If there is not enough room in the storage, re-allocate the memory (e.g. to twice its current size)
5. Repeat steps 3 and 4 until the "END OF STREAM"
6. Your storage contains the data now.
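The procedure above can be sketched for a stdio stream as follows. The buffer size of 256 bytes and the helper name read_all are assumptions made for illustration; a socket version would call recv() where this calls fread():

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read everything from `in` into a malloc'ed block that doubles in size
 * whenever it fills up. Stores the byte count in *out_len and returns the
 * block (caller frees), or NULL on allocation failure or read error.
 * `read_all` is a hypothetical helper name. */
char *read_all(FILE *in, size_t *out_len)
{
    char chunk[256];                    /* the small fixed buffer */
    size_t cap = 2 * sizeof chunk;      /* initial storage: twice the buffer */
    size_t len = 0;
    char *storage = malloc(cap);
    if (storage == NULL)
        return NULL;

    size_t got;
    while ((got = fread(chunk, 1, sizeof chunk, in)) > 0) {
        if (len + got > cap) {          /* not enough room: double it */
            char *bigger = realloc(storage, cap * 2);
            if (bigger == NULL) {
                free(storage);
                return NULL;
            }
            storage = bigger;
            cap *= 2;
        }
        memcpy(storage + len, chunk, got);   /* copy buffer into storage */
        len += got;
    }
    if (ferror(in)) {
        free(storage);
        return NULL;
    }
    *out_len = len;
    return storage;
}
```

The result of realloc() goes into a temporary first, so the original block can still be freed if the call fails.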
If you don't know the size a-priori, then you have no choice but to create it dynamically using malloc (or whatever equivalent mechanism in your language of choice.)
size_t buffer_size = ...; /* read from a DEFINE or from a config file */
char * buffer = malloc( sizeof( char ) * (buffer_size + 1) );
Creating a buffer of size m, but only receiving an input string of size n with n < m is not a waste of memory, but an engineering compromise.
If you create your buffer with a size close to the intended input, you risk having to refill the buffer many, many times for those cases where m >> n. Typically, iterations over the buffer are tied up with I/O operations, so now you might be saving some bytes (which is really nothing in today's hardware) at the expense of potentially increasing the problems in some other end. Specially for client-server apps. If we were talking about resource-constrained embedded systems, that'd be another thing.
You should be worrying about getting your algorithms right and solid. Then you worry, if you can, about shaving off a few bytes here and there.
For me, I'd rather create a buffer that is 2 to 10 times greater than the average input (not the smallest input as in your case, but the average), assuming my input tends to have a low standard deviation in size. Otherwise, I'd go 20 times the size or more (especially if memory is cheap and doing this minimizes hitting the disk or the NIC.)
At the most basic setup, one typically gets the size of the buffer as a configuration item read off a file (or passed as an argument), and defaulting to a default compile time value if none is provided. Then you can adjust the size of your buffers according to the observed input sizes.
More elaborate algorithms (say TCP) adjust the size of their buffers at run-time to better accommodate input whose size might/will change over time.
Even if you use malloc you also must define the size first! So instead you give a large number that is capable of accepting the message like:
int buffer[2000];
If the memory was allocated dynamically and the message turns out smaller or larger, you can reallocate it to release the unused space or to obtain more space.
example:
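#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    char *str;
    char *tmp;
    /* Initial memory allocation */
    str = malloc(15);
    if (str == NULL)
        return 1;
    strcpy(str, "tutorialspoint");
    printf("String = %s, Address = %p\n", str, (void *)str);
    /* Reallocating memory; keep the old pointer until realloc succeeds */
    tmp = realloc(str, 25);
    if (tmp == NULL) {
        free(str);
        return 1;
    }
    str = tmp;
    strcat(str, ".com");
    printf("String = %s, Address = %p\n", str, (void *)str);
    free(str);
    return 0;
}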
Note: make sure to include stdlib.h library
I have a structure that has an array of pointers. I would like to insert into the array digits in string format, i.e. "1", "2", etc..
However, is there any difference in using either sprintf or strncpy?
Any big mistakes with my code? I know I have to call free, I will do that in another part of my code.
Many thanks for any advice!
struct port_t
{
char *collect_digits[100];
}ports[20];
/** store all the string digits in the array for the port number specified */
static void g_store_digit(char *digit, unsigned int port)
{
static int marker = 0;
/* allocate memory */
ports[port].collect_digits[marker] = (char*) malloc(sizeof(digit)); /* sizeof includes 0 terminator */
// sprintf(ports[port].collect_digits[marker++], "%s", digit);
strncpy(ports[port].collect_digits[marker++], digit, sizeof(ports[port].collect_digits[marker]));
}
Yes, your code has a few issues.
In C, don't cast the return value of malloc(). It's not needed, and can hide errors.
You're allocating space based on the size of a pointer, not the size of what you want to store.
The same for the copying.
It is unclear what the static marker does, and if the logic around it really is correct. Is port the slot that is going to be changed, or is it controlled by a static variable?
Do you want to store only single digits per slot in the array, or multiple-digit numbers?
Here's how that function could look, given the declaration:
/* Initialize the given port position to hold the given number, as a decimal string. */
static void g_store_digit(struct port_t *ports, unsigned int port, unsigned int number)
{
char tmp[32];
snprintf(tmp, sizeof tmp, "%u", number);
ports[port].collect_digits = strdup(tmp);
}
strncpy(ports[port].collect_digits[marker++], digit, sizeof(ports[port].collect_digits[marker]));
This is incorrect.
You have allocated onto collect_digits a certain amount of memory.
You copy char *digits into that memory.
The length you should copy is strlen(digits) + 1, so that the terminating null is included. What you're actually copying is sizeof(ports[port].collect_digits[marker]), which will give you the length of a single char *.
You cannot use sizeof() to find the length of allocated memory. Furthermore, unless you know a priori that digits is the same length as the memory you've allocated, even if sizeof() did tell you the length of allocated memory, you would be copying the wrong number of bytes (too many; you only need to copy the length of digits).
Also, even if the two lengths are always the same, obtaining the length in this way is not expressive; it misleads the reader.
Note also that strncpy() pads with trailing null bytes only if the specified copy length is greater than the length of the source string. As such, if digits is at least as long as the specified length, you will have a non-terminated string.
The sprintf() line is functionally correct, but for what you're doing, strcpy() (as opposed to strncpy()) is, from what I can see and know of the code, the correct choice.
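A minimal sketch of the allocate-and-copy step with the sizes corrected, assuming digit is a null-terminated string. store_string is a hypothetical helper name, not part of the original code:

```c
#include <stdlib.h>
#include <string.h>

/* Allocate exactly enough room for `digit` plus its terminator, copy it,
 * and store the new pointer in *slot. Returns 0 on success, -1 if the
 * allocation fails. `store_string` is a hypothetical helper name. */
int store_string(char **slot, const char *digit)
{
    size_t size = strlen(digit) + 1;    /* strlen, not sizeof: sizeof digit
                                           is the size of a pointer */
    char *copy = malloc(size);
    if (copy == NULL)
        return -1;
    memcpy(copy, digit, size);          /* includes the '\0' */
    *slot = copy;
    return 0;
}
```

In the question's function this could be called as store_string(&ports[port].collect_digits[marker], digit), with marker incremented in a separate statement afterwards.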
I have to say, I don't know what you're trying to do, but the code feels very awkward.
The first thing: why have an array of pointers? Do you expect multiple strings for a port object? You probably only need a plain array or a pointer (since you are malloc-ing later on).
struct port_t
{
char *collect_digits;
}ports[20];
You need to pass the address of the string, otherwise, the malloc acts on a local copy and you never get back what you paid for.
static void g_store_digit(char **digit, unsigned int port);
Finally, the sizeof applies in a pointer context and doesn't give you the correct size.
Instead of using malloc() and strncpy(), just use strdup() - it allocates a buffer big enough to hold the content and copies the content to the new string, all in one shot.
So you don't need g_store_digit() at all - just use strdup(), and maintain marker on the caller's level.
Another problem with the original code: The statement
strncpy(ports[port].collect_digits[marker++], digit, sizeof(ports[port].collect_digits[marker]));
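references marker and marker++ in the same expression. In general, the order in which function arguments are evaluated is unspecified, so a second reference to marker may be evaluated either before or after the increment is performed. In this particular statement the second reference is the operand of sizeof, which is not evaluated, so the result is well defined, but the pattern is fragile and easy to break.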