I have this code
#define BUFFER_LEN (2048)
static float buffer[BUFFER_LEN];
int readcount;
while ((readcount = sf_read_float(handle, buffer, BUFFER_LEN))) {
// alsa play
}
which reads BUFFER_LEN floats from buffer, and returns the number of floats it actually read. "handle" tells sf_read_float how big buffer is.
E.g. if buffer contains 5 floats, and BUFFER_LEN is 3, readcount would first return 3, and next time 2, and the while-loop would exit.
I would like to have a function that does the same.
Update
After a lot of coding, I think this is the solution.
#include <stdio.h>
int copy_buffer(double* src, int src_length, int* src_pos,
float* dest, int dest_length) {
int copy_length = 0;
if (src_length - *src_pos > dest_length) {
copy_length = dest_length;
printf("copy_length1 %i\n", copy_length);
} else {
copy_length = src_length - *src_pos;
printf("copy_length2 %i\n", copy_length);
}
for (int i = 0; i < copy_length; i++) {
dest[i] = (float) src[*src_pos + i];
}
// remember where to continue next time the copy_buffer() is called
*src_pos += copy_length;
return copy_length;
}
int main() {
double src[] = {1,2,3,4,5};
int src_length = 5;
float dest[] = {0,0};
int dest_length = 2;
int read;
int src_pos = 0;
read = copy_buffer(src, src_length, &src_pos, dest, dest_length);
printf("read %i\n", read);
printf("src_pos %i\n", src_pos);
for (int i = 0; i < src_length; i++) {
printf("src %f\n", src[i]);
}
for (int i = 0; i < dest_length; i++) {
printf("dest %f\n", dest[i]);
}
return 0;
}
Next time copy_buffer() is called, dest contains 3,4. Running copy_buffer() again only copies the value "5". So I think it works now.
It is not very pretty, though, that I have int src_pos = 0; outside of copy_buffer().
It would be a lot better if I could give copy_buffer() a unique handle instead of &src_pos, just like sndfile does.
Does anyone know how that could be done?
If you would like to create unique handles, you can do so with malloc() and a struct:
#include <stdint.h> /* intptr_t */
#include <stdlib.h> /* malloc(), free() */
typedef intptr_t HANDLE_TYPE;
HANDLE_TYPE init_buffer_traverse(double * src, size_t src_len);
int copy_buffer(HANDLE_TYPE h_traverse, double * dest, size_t dest_len);
void close_handle_buffer_traverse(HANDLE_TYPE h);
typedef struct
{
double * source;
size_t source_length;
size_t position;
} TRAVERSAL;
#define INVALID_HANDLE 0
/*
* Returns a new traversal handle, or 0 (INVALID_HANDLE) on failure.
*
* Allocates memory to contain the traversal state.
* Resets traversal state to beginning of source buffer.
*/
HANDLE_TYPE init_buffer_traverse(double *src, size_t src_len)
{
TRAVERSAL * trav = malloc(sizeof(TRAVERSAL));
if (NULL == trav)
return INVALID_HANDLE;
trav->source = src;
trav->source_length = src_len;
trav->position = 0;
return (HANDLE_TYPE)trav;
}
/*
* Returns the system resources (memory) associated with the traversal handle.
*/
void close_handle_buffer_traverse(HANDLE_TYPE h)
{
if (INVALID_HANDLE != h)
free((TRAVERSAL *)h);
}
int copy_buffer(HANDLE_TYPE h, double *dest, size_t dest_len)
{
    TRAVERSAL *trav = NULL;
    size_t copy_length;
    if (INVALID_HANDLE == h)
        return -1;
    trav = (TRAVERSAL *)h;
    /* copy whatever is left in the source, but no more than dest can hold */
    copy_length = trav->source_length - trav->position;
    if (dest_len < copy_length)
        copy_length = dest_len;
    for (size_t i = 0; i < copy_length; i++)
        dest[i] = trav->source[trav->position + i];
    // remember where to continue next time the copy_buffer() is called
    trav->position += copy_length;
    return (int)copy_length;
}
This sort of style is what some C coders used before C++ came into being. The style involves a data structure that contains all the data elements of our 'class'. Most of the API for the class takes, as its first argument, a pointer to one of these structs. This pointer is similar to the this pointer in C++. In our example, the handle h plays that role once it has been cast back to the trav pointer.
The exception would be those functions which allocate the handle type; these are similar to constructors and have the handle type as a return value. In our case it is named init_buffer_traverse, which might as well have been called construct_traversal_handle.
There are many other methods than this method for implementing an "opaque handle" value. In fact, some coders would manipulate the bits (via an XOR, for example) in order to obscure the true nature of the handles. (This obscurity does not provide security where such is needed.)
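One of those other methods, sketched here for comparison, is to expose only an incomplete struct type, so that the compiler itself stops callers from touching the fields. This is not what the code above does, and the names traversal_open, traversal_read and traversal_close are made up for the example; in a real library the struct definition would live in the .c file only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Public part: callers only ever see an incomplete type (hypothetical names). */
typedef struct traversal traversal;

/* Private part: would normally be hidden in the implementation file. */
struct traversal {
    const double *source;
    size_t source_length;
    size_t position;
};

/* Returns a new handle, or NULL if memory could not be allocated. */
traversal *traversal_open(const double *src, size_t src_len)
{
    traversal *t = malloc(sizeof *t);
    if (t == NULL)
        return NULL;
    t->source = src;
    t->source_length = src_len;
    t->position = 0;
    return t;
}

/* Copies up to dest_len values; returns how many were copied (0 at the end). */
size_t traversal_read(traversal *t, double *dest, size_t dest_len)
{
    assert(t != NULL && dest != NULL);
    size_t n = t->source_length - t->position;
    if (n > dest_len)
        n = dest_len;
    for (size_t i = 0; i < n; i++)
        dest[i] = t->source[t->position + i];
    t->position += n; /* remember where to continue next time */
    return n;
}

void traversal_close(traversal *t)
{
    free(t); /* free(NULL) is a no-op */
}
```

A caller would loop until traversal_read() returns 0, much like the sf_read_float() loop in the question.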
In the example given, I'm not sure (didn't look at sndlib) whether it would make most sense for the destination buffer pointer and length to be held in the handle structure or not. If so, that would make it a "copy buffer" handle rather than a "traversal" handle and you would want to change all the terminology from this answer.
These handles are only valid for the lifetime of the current process, so they are not appropriate for handles which must survive restarts of the handle server. For that, use an ISAM database and the column ID as handle. The database approach is much slower than the in-memory/pointer approach but for persistent handles, you can't use in-memory values, anyway.
On the other hand, it sounds like you are implementing a library which will be running within a single process lifetime. In which case, the answer I've written should be usable, after modifying to your requirements.
Addendum
You asked for some clarification of the similarity with C++ that I mention above. To be specific, some equivalent (to the above C code) C++ code might be:
class TRAVERSAL
{
    double * source;
    size_t source_length;
    size_t position;
public:
    TRAVERSAL(double *src, size_t src_len)
    {
        source = src;
        source_length = src_len;
        position = 0;
    }
    int copy_buffer(double * dest, size_t dest_len)
    {
        size_t copy_length = source_length - position;
        if (dest_len < copy_length)
            copy_length = dest_len;
        for (size_t i = 0; i < copy_length; i++)
            dest[i] = source[position + i];
        // remember where to continue next time the copy_buffer() is called
        position += copy_length;
        return (int)copy_length;
    }
};
There are some apparent differences. The C++ version is a little bit less verbose-seeming. Some of this is illusory; the equivalent of close_handle_buffer_traverse is now to delete the C++ object. Of course delete is not part of the class implementation of TRAVERSAL, delete comes with the language.
In the C++ version, there is no "opaque" handle.
The C version is more explicit and perhaps makes more apparent what operations are being performed by the hardware in response to the program execution.
The C version is more amenable to using the cast to HANDLE_TYPE in order to create an "opaque ID" rather than a pointer type. The C++ version could be "wrapped" in an API which accomplished the same thing while adding another layer. In the current example, users of this class will maintain a copy of a TRAVERSAL *, which is not quite "opaque."
In the function copy_buffer(), the C++ version need not mention the trav pointer because instead it implicitly dereferences the compiler-supplied this pointer.
sizeof(TRAVERSAL) should be the same for both the C and C++ examples -- with no vtable, also assuming run-time-type-identification for C++ is turned off, the C++ class contains only the same memory layout as the C struct in our first example.
It is less common to use the "opaque ID" style in C++, because the penalty for "transparency" is lower in C++. The data members of class TRAVERSAL are private and so the TRAVERSAL * cannot be accidentally used to break our API contract with the API user.
Please note that both the opaque ID and the class pointer are vulnerable to abuse from a malicious API user -- either the opaque ID or class pointer could be cast directly to, e.g., double **, allowing the holder of the ID to change the source member directly via memory. Of course, you must trust the API caller already, because in this case the API calling code is in the same address space. In an example of a network file server, there could be security implications if "opaque ID" based on a memory address is exposed to the outside.
I would not normally make the digression into trustedness of the API user, but I want to clarify that the C++ keyword private has no "enforcement powers," it only specifies an agreement between programmers, which the compiler respects also unless told otherwise by the human.
Finally, the C++ class pointer can be converted to an opaque ID as follows:
typedef intptr_t HANDLE_TYPE;
HANDLE_TYPE init_buffer_traverse(double *src, size_t src_len)
{
return (HANDLE_TYPE)(new TRAVERSAL(src, src_len));
}
int copy_buffer(HANDLE_TYPE h_traverse, double * dest, size_t dest_len)
{
return ((TRAVERSAL *)h_traverse)->copy_buffer(dest, dest_len);
}
void close_handle_buffer_traverse(HANDLE_TYPE h)
{
delete ((TRAVERSAL *)h);
}
And now our brevity of "equivalent" C++ may be further questioned.
What I wrote about the old style of C programming which relates to C++ was not meant to say that C++ is better for this task. I only mean that encapsulation of data and hiding of implementation details could be done in C via a style that is almost isomorphic to a C++ style. This can be good to know if you find yourself programming in C but unfortunately having learned C++ first.
PS
I just noticed that our implementation to date had used:
dest[i] = (float)source[position + i];
when copying the bytes. Because both dest and source are double * (that is, they both point to double values), there is no need for a cast here. Also, casting from double to float may lose digits of precision in the floating-point representation. So this is best removed and restated as:
dest[i] = source[position + i];
I started to look at it, but you could probably do it just as well: libsndfile is open source, so one could look at how sf_read_float() works and create a function that does the same thing from a buffer. http://www.mega-nerd.com/libsndfile/ has a download link.
Related
My current concat function:
char* concat(char* a, int a_size,
char* b, int b_size) {
char* c = malloc(a_size + b_size);
memcpy(c, a, a_size);
memcpy(c + a_size, b, b_size);
free(a);
free(b);
return c;
}
But this uses extra memory. Is it possible to append two byte arrays using realloc without allocating extra memory?
Like:
void append(char* a, int a_size, char* b, int b_size)
...
char* a = malloc(2);
char* b = malloc(2);
void append(a, 2, b, 2);
//The size of a will be 4.
While Jean-François Fabre answered the stated question, I'd like to point out that you can manage such byte arrays better by using a structure:
typedef struct {
size_t max; /* Number of chars allocated for */
size_t len; /* Number of chars in use */
unsigned char *data;
} bytearray;
#define BYTEARRAY_INIT { 0, 0, NULL }
void bytearray_init(bytearray *barray)
{
barray->max = 0;
barray->len = 0;
barray->data = NULL;
}
void bytearray_free(bytearray *barray)
{
free(barray->data);
barray->max = 0;
barray->len = 0;
barray->data = NULL;
}
To declare an empty byte array, you can use either bytearray myba = BYTEARRAY_INIT; or bytearray myba; bytearray_init(&myba);. The two are equivalent.
When you no longer need the array, call bytearray_free(&myba);. Note that free(NULL) is safe and does nothing, so it is perfectly safe to free a bytearray that you have initialized, but not used.
To append to a bytearray:
int bytearray_append(bytearray *barray, const void *from, const size_t size)
{
if (barray->len + size > barray->max) {
const size_t len = barray->len + size;
size_t max;
void *data;
/* Example policy: */
if (len < 8)
max = 8; /* At least 8 chars, */
else
if (len < 4194304)
max = (3*len) / 2; /* grow by 50% up to 4,194,304 bytes, */
else
max = (len | 2097151) + 2097153 - 24; /* then pad to next multiple of 2,097,152 sans 24 bytes. */
data = realloc(barray->data, max);
if (!data) {
/* Not enough memory available. Old data is still valid. */
return -1;
}
barray->max = max;
barray->data = data;
}
/* Copy appended data; we know there is room now. */
memmove(barray->data + barray->len, from, size);
barray->len += size;
return 0;
}
Since this function can at least theoretically fail to reallocate memory, it will return 0 if successful, and nonzero if it cannot reallocate enough memory.
There is no need for a malloc() call, because realloc(NULL, size) is exactly equivalent to malloc(size).
The "growth policy" is a very debatable issue. You can just make max = barray->len + size, and be done with it. However, dynamic memory management functions are relatively slow, so in practice, we don't want to call realloc() for every small little addition.
The above policy tries to do something better, but not too aggressive: it always allocates at least 8 characters, even if less is needed. Up to 4,194,304 characters, it allocates 50% extra. Above that, it rounds the allocation size to the next multiple of 2,097,152 and subtracts 24. The reasoning behind this is complex, but it is more for illustration and understanding than anything else; it is definitely NOT "this is best, and this is what you should do too". This policy ensures that each byte array allocates at most 4,194,304 = 2^22 unused characters. However, 2,097,152 = 2^21 is the size of a huge page on AMD64 (x86-64), and is a power-of-two multiple of a native page size on basically all architectures. It is also large enough to switch from so-called sbrk() allocation to memory mapping on basically all architectures that do that. It means that such huge allocations use a separate part of the heap for each, and the unused part is usually just virtual memory, not necessarily backed by any RAM, until accessed. As a result, this policy tends to work quite well for both very short byte arrays, and very long byte arrays, on most architectures.
Of course, if you know (or measure!) the typical size of the byte arrays in typical workloads, you can optimize the growth policy for that, and get even better results.
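To get a feel for why the growth policy matters, here is a small sketch that only simulates the bookkeeping: it counts how often the capacity would have to grow (that is, how often realloc() would be called) for a run of equal-sized appends. The function name count_grows and its parameters are made up for this illustration, and it models just the "exact size" and "50% extra, at least 8" rules, not the large-allocation rounding.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns how many times the capacity has to grow
   when `appends` chunks of `chunk` bytes are appended one after another.
   No memory is allocated; only len and max are tracked. */
size_t count_grows(size_t appends, size_t chunk, int grow_by_half)
{
    assert(chunk > 0);
    size_t len = 0, max = 0, grows = 0;
    for (size_t i = 0; i < appends; i++) {
        size_t need = len + chunk;
        if (need > max) {
            if (!grow_by_half)
                max = need;             /* exact size: max = len + size */
            else if (need < 8)
                max = 8;                /* at least 8 chars */
            else
                max = need + need / 2;  /* 50% extra */
            grows++;
        }
        len = need;
    }
    return grows;
}
```

With the exact policy every single append grows the array, so count_grows(1000, 1, 0) is 1000; with 50% extra the count only rises roughly with the logarithm of the final size.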
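Finally, it uses memmove() instead of memcpy(), just in case someone wishes to repeat a part of the same byte array: memcpy() only works if the source and target areas do not overlap; memmove() works even in that case. (Note, however, that if realloc() has to move the data, a from pointer into the old buffer is no longer valid, so this only helps when no reallocation takes place.)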
When using more advanced data structures, like hash tables, a variant of the above structure is often useful. (That is, this is much better in cases where you have lots of empty byte arrays.)
Instead of having a pointer to the data, the data is part of the structure itself, as a C99 flexible array member:
typedef struct {
size_t max;
size_t len;
unsigned char data[];
} bytearray;
You cannot declare a byte array itself (i.e. bytearray myba; will not work); you always declare a pointer to such a byte array: bytearray *myba = NULL;. The pointer being NULL is just treated the same as an empty byte array.
In particular, to see how many data items such an array has, you use an accessor function (also defined in the same header file as the data structure), rather than myba.len:
static inline size_t bytearray_len(bytearray *const barray)
{
return (barray) ? barray->len : 0;
}
static inline size_t bytearray_max(bytearray *const barray)
{
return (barray) ? barray->max : 0;
}
The (expression) ? (if-true) : (if-false) is a ternary operator. In this case, the first function is exactly equivalent to
static inline size_t bytearray_len(bytearray *const barray)
{
if (barray)
return barray->len;
else
return 0;
}
If you wonder about the bytearray *const barray, remember that pointer declarations are read from right to left, with * as "a pointer to". So, it just means that barray is constant, a pointer to a byte array. That is, we may change the data it points to, but we won't change the pointer itself. Compilers can usually detect such stuff themselves, but it may help; the main point is however to remind us human programmers that the pointer itself is not to be changed. (Such changes would only be visible within the function itself.)
Since such arrays often need to be resized, the resizing is often put into a separate helper function:
bytearray *bytearray_resize(bytearray *const barray, const size_t len)
{
bytearray *temp;
if (!len) {
free(barray);
errno = 0;
return NULL;
}
if (!barray) {
temp = malloc(sizeof (bytearray) + len * sizeof barray->data[0]);
if (!temp) {
errno = ENOMEM;
return NULL;
}
temp->max = len;
temp->len = 0;
return temp;
}
if (barray->len > len)
barray->len = len;
if (barray->max == len)
return barray;
temp = realloc(barray, sizeof (bytearray) + len * sizeof barray->data[0]);
if (!temp) {
free(barray);
errno = ENOMEM;
return NULL;
}
temp->max = len;
return temp;
}
What does that errno = 0 do in there? The idea is that because resizing/reallocating a byte array may change the pointer, we return the new one. If the allocation fails, we return NULL with errno == ENOMEM, just like malloc()/realloc() do. However, if the desired new length is zero, the function saves memory by freeing the old byte array, if any, and also returns NULL. Since that is not an error, we set errno to zero, so that it is easier for callers to check whether an error occurred. (If the function returns NULL, check errno. If errno is nonzero, an error occurred; you can use strerror(errno) to get a descriptive error message.)
You probably also noted the sizeof barray->data[0], used even when barray is NULL. This is okay, because sizeof is not a function, but an operator: it does not access the right side at all, it only evaluates to the size of the thing the right side refers to. (You only need to use parentheses when the right side is a type.) This form is nice, because it lets a programmer change the type of the data member, without changing any other code.
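A tiny sketch of that point, using a made-up struct sample rather than the bytearray type above: the pointer is NULL, yet taking sizeof of an element through it is fine, because the operand is never evaluated.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type, for illustration only. */
struct sample {
    size_t len;
    double data[]; /* C99 flexible array member */
};

size_t sample_element_size(void)
{
    struct sample *p = NULL;
    /* sizeof only looks at the type of its operand, so p is never dereferenced. */
    size_t size = sizeof p->data[0];
    assert(size == sizeof (double));
    return size;
}
```

If data were later changed to, say, float, this function would follow along without any other edit.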
To append data to such a byte array, we probably want to be able to specify whether we anticipate further appends to the same array, or whether this is probably the final append, so that only the exact amount of memory needed is allocated. The exact parameter below selects between the two. Note that this function returns a pointer to the (modified) byte array:
bytearray *bytearray_append(bytearray *barray,
const void *from, const size_t size,
int exact)
{
size_t len = bytearray_len(barray) + size;
if (exact) {
barray = bytearray_resize(barray, len);
if (!barray)
return NULL; /* errno already set by bytearray_resize(). */
} else
if (bytearray_max(barray) < len) {
if (!exact) {
/* Apply growth policy */
if (len < 8)
len = 8;
else
if (len < 4194304)
len = (3 * len) / 2;
else
len = (len | 2097151) + 2097153 - 24;
}
barray = bytearray_resize(barray, len);
if (!barray)
return NULL; /* errno already set by the bytearray_resize() call */
}
if (size) {
memmove(barray->data + barray->len, from, size);
barray->len += size;
}
return barray;
}
This time, we declared bytearray *barray, because we change where barray points to in the function. If the fourth parameter, exact, is nonzero, then the resulting byte array is exactly the size needed; otherwise the growth policy is applied.
Yes, since realloc will preserve the start of your buffer if the new size is bigger:
char* concat(char* a, size_t a_size,
             char* b, size_t b_size) {
    char* c = realloc(a, a_size + b_size);
    if (c == NULL)
        return NULL; // allocation failed: a and b are untouched and still owned by the caller
    memcpy(c + a_size, b, b_size); // dest is after "a" data, source is b with b_size
    free(b);
    return c;
}
c may be different from a (if the original memory block cannot be resized in-place contiguously to the new size by the system) but if that's the case, the location pointed by a will be freed (you must not free it), and the original data will be "moved".
My advice is to warn the users of your function that the input buffers must be allocated using malloc, else it will crash badly.
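If you prefer something closer to the append() shape from the question, a possible sketch is below. The name append_bytes and the choice to leave b alone (the caller keeps ownership of it) are my assumptions, not part of the original code. The result of realloc() goes into a temporary first, so the old pointer is not lost when the allocation fails.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: grows *a (in place if the allocator can) and
   appends b_size bytes from b. Returns 0 on success, -1 on failure;
   on failure *a is unchanged and still valid. b is never freed here. */
int append_bytes(char **a, size_t a_size, const char *b, size_t b_size)
{
    assert(a != NULL);
    if (b_size == 0)
        return 0; /* nothing to append, and avoids realloc(ptr, 0) */
    char *tmp = realloc(*a, a_size + b_size);
    if (tmp == NULL)
        return -1;
    memcpy(tmp + a_size, b, b_size);
    *a = tmp; /* may or may not equal the old pointer */
    return 0;
}
```

As with concat() above, *a must be NULL or come from malloc()/realloc().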
I am writing my own OS and had to implement my own malloc and realloc functions. However, I think that what I have written may not be safe and may also cause a memory leak, because the variable isn't really destroyed: its memory is set to zero, but the variable name still exists. Could someone tell me if there are any vulnerabilities in this code? The project will be added to GitHub as soon as it's finished, under user subado512.
Code:
void * malloc(int nbytes)
{
char variable[nbytes];
return &variable;
}
void * free(string s) {
s= (string)malloc(0);
return &s;
}
void memory_copy(char *source, char *dest, int nbytes) {
int i;
for (i = 0; i < nbytes; i++) {
*(dest + i) = *(source + i); // dest[i] = source[i]
}
}
void *realloc(string s,uint8_t i) {
string ret;
ret=(string)malloc(i);
memory_copy(s,ret,i);
free(s);
return &ret;
}
Context in which code is used : Bit of pseudo code to increase readability
string buffstr = (string) malloc(200);
uint8_t i = 0;
while(reading)
{
buffstr=(string)realloc(buffstr,i+128);
buffstr[i]=readinput();
}
The behaviour on your using the pointer returned by your malloc is undefined: you are returning the address of an array with automatic storage duration.
As a rough start, consider using a static char array to model your memory pool, and return segments of it to the caller, building up a table that records which parts of that array are currently in use. Note that you'll have to do clever things with alignment here to guarantee that the returned void* meets the alignment requirements of any type. free will then be little more than releasing a record in that table.
Do note that the memory management systems that a typical C runtime library uses are very sophisticated. With that in mind, do appreciate that your undertaking may be little more than a good programming exercise.
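To make the static-pool idea a bit more concrete, here is a deliberately simplified sketch: a "bump" allocator that hands out consecutive, aligned slices of a static array and never frees anything. It is not the table-based scheme described above (that is what a real free() would need), and the names pool_alloc, POOL_SIZE and widest are invented for the example. The assert() is only there for a hosted test build.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 4096u /* hypothetical pool size */

/* Union of the widest basic types: a portable stand-in for max_align_t. */
typedef union {
    long long ll;
    long double ld;
    void *p;
    void (*fp)(void);
} widest;

/* The union forces the byte array to be suitably aligned for any of them. */
static union {
    widest align;
    unsigned char bytes[POOL_SIZE];
} pool;

static size_t pool_used; /* bytes handed out so far */

/* Returns an aligned block of at least nbytes, or NULL if the pool is full. */
void *pool_alloc(size_t nbytes)
{
    const size_t a = sizeof(widest);
    assert(pool_used % a == 0);
    if (nbytes == 0 || nbytes > POOL_SIZE - pool_used)
        return NULL;
    /* Round the request up so that the next block stays aligned. */
    size_t rounded = (nbytes + a - 1) / a * a;
    if (rounded > POOL_SIZE - pool_used)
        return NULL;
    void *p = &pool.bytes[pool_used];
    pool_used += rounded;
    return p;
}
```

Unlike the malloc in the question, the returned memory has static storage duration, so it is still valid after the function returns.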
I'm trying to come up with a program that reads in numbers from the command line, turns the argv array into integers, and then finds the smallest integer in the array of those integers.
Below is my code for this program, can anyone help me out?
#include <stdio.h>
#include <stdlib.h>
int *integerizeArgs(int, char **);
int *findMin(int, int *);
int *integerizeArgs(int argc, char **argv)
{
int i = 0;
int *a = malloc(sizeof(int) * (argc-1));
for(i= 1; i < argc; ++i){
a[i-1] = atoi(argv[1]);
return a;
}
return 0;
}
int *findMin(int itemCount, int *a) {
int i, smallest = a[0];
for (i=0; i < itemCount; i++) {
if(a[i] < smallest) {
smallest = a[i];
return smallest;
}
return 0;
}
return 0;
}
int main(int argc, char **argv){
int *a = integerizeArgs(argc, argv);
int b = findMin(argc, a[0]);
printf("%d", b);
return 0;
}
Write proper code.
You should check whether malloc() was successful.
Watch out for typos: argv[1] should be argv[i] here.
Do not use return when you don't want to return from the function yet.
Use proper types. Distinguish between "normal" integers and pointers.
Watch out for off-by-one errors.
In this case, argc-1 elements are allocated, not argc elements.
You should free whatever you allocated.
Corrected code:
#include <stdio.h>
#include <stdlib.h>
int *integerizeArgs(int, char **);
int findMin(int, int *);
int *integerizeArgs(int argc, char **argv)
{
int i = 0;
int *a = malloc(sizeof(int) * (argc-1));
if (a == NULL){ /* add error check */
perror("malloc");
exit(1);
}
for(i= 1; i < argc; ++i){
a[i-1] = atoi(argv[i]); /* convert each arguments instead of only the first one */
/* don't return when the process is not done */
}
return a; /* return the result */
}
/* use proper return type */
int findMin(int itemCount, int *a) {
int i, smallest = a[0];
for (i=0; i < itemCount; i++) {
if(a[i] < smallest) {
smallest = a[i];
/* don't return when the process is not done */
}
/* don't return when the process is not done */
}
return smallest; /* return the result */
}
int main(int argc, char **argv){
int *a = integerizeArgs(argc, argv);
/* pass the (pointer to) the array instead of the first element of the array (&a[0] is also OK) */
/* pass correct itemCount (there are argc-1 items because the first argument typically is the command) */
int b = findMin(argc - 1, a);
printf("%d", b);
free(a); /* free whatever you allocated */
return 0;
}
There are multiple problems with this code. Such significant misunderstandings tend to suggest that your learning resources aren't working for you. Have you considered trying other resources?
If you don't yet understand the basics of procedural programming, I recommend learning a different language first. Unfortunately I haven't come across a decent C programming book that teaches both procedural programming in general and C programming. The only books I know of seem to require that you already understand procedural programming. I'll try to help a bit with that with this post.
I can highly recommend K&R 2E, providing you've understood the procedural programming basics first; remember to do the exercises as you encounter them as they're a valuable part of the learning experience.
You seem to be quite confused about the effect upon the flow of execution caused by return, for and if. C is a procedural language, meaning it has a structure similar to a recipe in a cookbook (a procedure), for example.
Okay, that's grossly over-simplified, but if you think of steps like "preheat the oven to 180C (described in page 42)" and "caramelise the onion" as though they are separate procedures then we can establish a use for keywords like return, for and if, so bear with me.
As you turn to page 42 you might notice the procedure is simple but disjoint from the recipe, something like this:
Ensure the oven is empty, and if necessary install an oven thermometer onto the front of one of the racks.
if the oven is a gas oven:
Strike a match.
Kneel in front of the oven, and turn the knob to the appropriate temperature.
Bring the lit match into contact with the gas stream, keeping your fingers well clear of the gas stream at all times.
else turn the knob to the appropriate temperature.
Close the oven.
for the duration beginning when you close the oven and ending when the thermometer reaches the appropriate value, periodically check the thermometer
return to the previous procedure.
Here, the words if and else are clearly meant to ensure that you choose the appropriate path for your oven. The same goes for your computer. You're telling it to choose which one of the paths based on the condition in your code.
The return keyword is used to tell you that the procedure is over, and you should resume the procedure you were using before.
There is a loop embedded into the procedure, too. That could be expressed using for language, i.e. "for the duration beginning when you close the oven and ending when the thermometer reaches the appropriate value, periodically check the thermometer" would loosely translate to something like:
close(oven);
for (actual_temp = check_temp(oven); actual_temp < desired_temp; actual_temp = check_temp(oven)) {
sleep(15 minutes);
}
Recipes are easy to understand because they're written in natural language. Computer programs, however, aren't. Programming languages have idioms that aren't so commonly used in natural language, such as variable scope and memory location, so it's important to use those idioms consistently.
I recommend designing software as though it's meant to fit in with the environment. I can see you've given that some thought by forwarding argc and argv (or its translated equivalent) to each of your functions, but a deeper analysis of the environment is required.
It goes against the grain for a function to perform allocation and expect the caller to perform cleanup for that allocation. If you analyse the example set by the C standard library, the only functions that allocate memory are allocation functions, thread creation and file creation. Everything else lets the caller choose how memory is allocated. Thus, your integerizeArgs function should probably be refactored:
int *integerize_arguments(int *destination, char **source, size_t length) {
for (size_t x = 0; x < length; x++) {
if (sscanf(source[x], "%d", &destination[x]) != 1) {
return NULL;
}
}
return destination;
}
Did you notice how closely this resembles memcpy, for example? Using this pattern the caller can choose what kind of allocation the array should have, which is very nice for maintenance should you decide you don't need malloc.
Also notice how I used size_t for index variables. This is important because it doesn't usually make sense to allow negative numbers for arrays.
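As an illustration of the caller choosing the storage, here is one way it might look with a plain automatic array instead of malloc. The sum_arguments function and the MAX_ARGS limit are hypothetical; integerize_arguments is repeated only so that the sketch compiles on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Same shape as the function above: the caller supplies the storage. */
int *integerize_arguments(int *destination, char **source, size_t length)
{
    for (size_t x = 0; x < length; x++) {
        if (sscanf(source[x], "%d", &destination[x]) != 1)
            return NULL;
    }
    return destination;
}

#define MAX_ARGS 16 /* hypothetical limit chosen by this particular caller */

/* Hypothetical caller: converts into a fixed-size automatic array.
   Stores the sum in *sum and returns 0, or returns -1 on bad input. */
int sum_arguments(char **args, size_t count, long *sum)
{
    int values[MAX_ARGS];
    assert(args != NULL && sum != NULL);
    if (count > MAX_ARGS)
        return -1;
    if (integerize_arguments(values, args, count) == NULL)
        return -1;
    *sum = 0;
    for (size_t i = 0; i < count; i++)
        *sum += values[i];
    return 0;
}
```

From main you would call it as sum_arguments(argv + 1, (size_t)(argc - 1), &total), and there is nothing to free afterwards.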
As for your findMin function, the best advice I can give there is to think about the procedural language we discussed earlier. I don't think the instructions you're giving your computer are the same instructions you have in your head. If they are, they don't make sense and you might need to run through that procedure a few times by hand to see what's going wrong.
Unless the caller (main, in this case) needs to know the address of the minimum value, findMin doesn't need to return int *; it should return int instead. If, on the other hand, main needs to know where the minimum value is located, then your algorithm needs to be adapted because you're currently not storing that position in your logic.
As covered by MikeCAT, you shouldn't be returning to the previous procedure until you've inspected the entire array. Hence your return smallest; should be below the for loop.
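For the case where the caller does need to know where the minimum is, a sketch of the adapted algorithm could look like the following. The name find_min_index is made up; the point is that the position is what gets remembered, and the return only happens once the loop has finished.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical variant of findMin: returns the index of the smallest
   element (the first one, if there are ties). count must be at least 1. */
size_t find_min_index(const int *a, size_t count)
{
    assert(a != NULL && count > 0);
    size_t min_at = 0;
    for (size_t i = 1; i < count; i++) {
        if (a[i] < a[min_at])
            min_at = i; /* remember the position, keep looking */
    }
    return min_at; /* only after the whole array has been inspected */
}
```

The value itself is then simply a[find_min_index(a, count)].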
This post is getting quite lengthy, kind of turning into a book of its own... so I'll wrap up here, and recommend in summary that you purchase a book about programming.
This is a learning exercise. I'm attempting to augment memcpy by notifying the user if the copy operation will pass or fail before it begins. My biggest question is the following. If I allocate two char arrays of 100 bytes each, and have two pointers that reference each array, how do I know which direction I am copying? If I copy everything from the first array to the second how do I ensure that the user will not be overwriting the original array?
My current solution compares the distance between the pointers with the size of the destination array. If the distance is smaller, then I say an overwrite will occur. But what if it's copying in the other direction? I'm just kind of confused.
int memcpy2(void *target, void *source, size_t nbytes) {
char * ptr1 = (char *)target;
char * ptr2 = (char *)source;
int i, val;
val = abs(ptr1 - ptr2);
printf("%d, %d\n", val, nbytes + 0);
if (val > nbytes) {
for (i = 0; i < val; i++){
ptr1[i] = ptr2[i];
}
return 0; /*success */
}
return -1; /* error */
}
int main(int argc, char **argv){
char src [100] = "Copy this string to dst1";
char dst [20];
int p;
p = memcpy2(dst, src, sizeof(dst));
if (p == 0)
printf("The element\n'%s'\nwas copied to \n'%s'\nSuccesfully\n", src, dst);
else
printf("There was an error!!\n\nWhile attempting to copy the elements:\n '%s'\nto\n'%s', \n Memory was overlapping", src, dst);
return 0;
}
The only portable way to determine if two memory ranges overlap is:
int overlap_p(void *a, void *b, size_t n)
{
char *x = a, *y = b;
for (size_t i=0; i<n; i++) if (x+i==y || y+i==x) return 1;
return 0;
}
This is because comparison of pointers with the relational operators is undefined unless they point into the same array. In reality, the comparison does work on most real-world implementations, so you could do something like:
int overlap_p(void *a, void *b, size_t n)
{
char *x = a, *y = b;
return (x<=y && x+n>y) || (y<=x && y+n>x);
}
I hope I got that logic right; you should check it. You can simplify it even more if you want to assume you can take differences of arbitrary pointers.
What you want to check is the position in memory of the source relatively to the destination:
If the source is ahead of the destination (ie. source < destination), then you should start from the end. If the source is after, you start from the beginning. If they are equal, you don't have to do anything (trivial case).
Here are some crude ASCII drawings to visualize the problem.
|_;_;_;_;_;_|            (source)
      |_;_;_;_;_;_|      (destination)
      >-----^            start from the end to shift the values to the right
      |_;_;_;_;_;_|      (source)
|_;_;_;_;_;_|            (destination)
^-----<                  start from the beginning to shift the values to the left
Following a very accurate comment below, I should add that you can use the difference of the pointers (destination - source), but to be on the safe side cast those pointers to char * beforehand.
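Put into code, the rule could be sketched like this. The name move_bytes is made up, and the function compares two pointers directly, which strictly speaking is only defined when both point into the same array; on common flat-memory platforms it behaves as expected, and that is the assumption here. In real code you would simply call memmove().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical memmove-like function: picks the copy direction from the
   relative position of source and destination. */
void *move_bytes(void *destination, const void *source, size_t nbytes)
{
    unsigned char *d = destination;
    const unsigned char *s = source;
    assert(nbytes == 0 || (d != NULL && s != NULL));
    if (s < d) {
        /* Source starts before destination: copy from the end. */
        for (size_t i = nbytes; i > 0; i--)
            d[i - 1] = s[i - 1];
    } else if (s > d) {
        /* Source starts after destination: copy from the beginning. */
        for (size_t i = 0; i < nbytes; i++)
            d[i] = s[i];
    }
    /* s == d: nothing to do. */
    return destination;
}
```

There is no failure path: as long as both ranges are valid, the copy always succeeds.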
In your current setting, I don't think that you can check if the operation will fail. Your memcpy prototype prevents you from doing any form of checking for that, and with the rule given above for deciding how to copy, the operation will succeed (outside of any other considerations, like prior memory corruption or invalid pointers).
I don't believe that "attempting to augment memcpy by notifying the user if the copy operation will pass or fail before it begins." is a well-formed notion.
First, memcpy() doesn't succeed or fail in the normal sense. It just copies the data, which might cause a fault/exception if it reads outside the source array or writes outside the destination array, and it might also read or write outside one of those arrays without causing any fault/exception and just silently corrupting data. When I say "memcpy does this" I'm not talking just about the implementation of the C stdlib memcpy, but about any function with the same signature -- it doesn't have enough information to do otherwise.
Second, if your definition of "succeed" is "assuming the buffers are big enough but may be overlapping, copy the data from source to dst without tripping over yourself while copying" -- that is indeed what memmove() does, and it's always possible. Again, there's no "return failure" case. If the buffers don't overlap it's easy, if the source is overlapping the end of the destination then you just copy byte by byte from the beginning; if the source is overlapping the beginning of the destination then you just copy byte by byte from the end. Which is what memmove() does.
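Third, when writing this kind of code, you have to be very careful about overflow cases for your pointer arithmetic (including addition, subtraction, and array indexing). In val = abs(ptr1 - ptr2), the difference of two pointers has type ptrdiff_t, which can be wider than int, so abs(), which takes an int, may truncate it, and int is the wrong type to store the result in. If the pointers were first converted to an unsigned integer type, the difference is unsigned and wraps around to a very large number instead of going negative, so abs() won't do anything useful to it. Just so you know.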
I've been looking into using C over C++ as I find it cleaner, and the main thing I find it lacks is a vector-like array.
What is the best implementation of this?
I want to just be able to call something like vector_create, vector_at, vector_add, etc.
EDIT
This answer is from a million years ago, but at some point, I actually implemented a macro-based, efficient, type-safe vector work-alike in C that covers all the typical features and needs. You can find it here:
https://github.com/eteran/c-vector
Original answer below.
What about a vector are you looking to replicate? I mean in the end, it all boils down to something like this:
int *create_vector(size_t n) {
return malloc(n * sizeof(int));
}
void delete_vector(int *v) {
free(v);
}
int *resize_vector(int *v, size_t n) {
/* realloc() returns NULL on failure; the old block v is still valid then */
return realloc(v, n * sizeof(int));
}
You could wrap this all up in a struct so it "knows its size" too, but you'd have to do it for every type (macros here?), and that seems a little unnecessary... Perhaps something like this:
typedef struct {
size_t size;
int *data;
} int_vector;
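int_vector *create_vector(size_t n) {
int_vector *p = malloc(sizeof(int_vector));
if(p) {
p->data = malloc(n * sizeof(int));
if(!p->data && n > 0) {
/* the element array could not be allocated: report failure as NULL */
free(p);
return NULL;
}
p->size = n;
}
return p;
}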
void delete_vector(int_vector *v) {
if(v) {
free(v->data);
free(v);
}
}
size_t resize_vector(int_vector *v, size_t n) {
if(v) {
int *p = realloc(v->data, n * sizeof(int));
if(p) {
v->data = p;
v->size = n;
}
return v->size;
}
return 0;
}
int get_vector(int_vector *v, size_t n) {
if(v && n < v->size) {
return v->data[n];
}
/* return some error value, i'm doing -1 here,
* std::vector would throw an exception if using at()
* or have UB if using [] */
return -1;
}
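void set_vector(int_vector *v, size_t n, int x) {
if(v) {
if(n >= v->size) {
/* grow to n + 1 elements so that index n is valid;
 * give up if the reallocation failed */
if(resize_vector(v, n + 1) <= n) {
return;
}
}
v->data[n] = x;
}
}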
After which, you could do:
int_vector *v = create_vector(10);
set_vector(v, 0, 123);
I dunno, it just doesn't seem worth the effort.
The most complete effort I know of to create a comprehensive set of utility types in C is GLib. For your specific needs it provides g_array_new, g_array_append_val and so on. See GLib Array Documentation.
Rather than going off on a tangent in the comments to @EvanTeran's answer, I figured I'd submit a longer reply here.
As various comments allude to, there's really not much point in trying to replicate the exact behavior of std::vector, since C lacks templates and RAII.
What can however be useful is a dynamic array implementation that just works with bytes. This can obviously be used directly for char* strings, but can also easily be adapted for usage with any other types as long as you're careful to multiply the size parameter by sizeof(the_type).
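A minimal sketch of such a byte-oriented dynamic array might look like this. The byte_array type, the byte_array_append function, the starting capacity of 16 and the doubling policy are all choices made up for this illustration, not taken from any particular library.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical byte buffer; start from a zero-initialized struct. */
typedef struct {
    unsigned char *data;
    size_t len;   /* bytes in use */
    size_t cap;   /* bytes allocated */
} byte_array;

/* Append n bytes from src. Returns 0 on success, -1 if memory could
 * not be obtained (the array is left unchanged in that case). */
int byte_array_append(byte_array *a, const void *src, size_t n)
{
    if (n > (size_t)-1 - a->len)
        return -1;                          /* len + n would overflow */
    if (a->len + n > a->cap) {
        size_t cap = a->cap ? a->cap : 16;
        while (cap < a->len + n) {
            if (cap > (size_t)-1 / 2) {     /* doubling would overflow */
                cap = a->len + n;
                break;
            }
            cap *= 2;
        }
        unsigned char *p = realloc(a->data, cap);
        if (!p)
            return -1;                      /* old buffer is still valid */
        a->data = p;
        a->cap = cap;
    }
    if (n)
        memcpy(a->data + a->len, src, n);
    a->len += n;
    return 0;
}
```

To store another type you would call something like byte_array_append(&a, &value, sizeof value) and divide a.len by sizeof(the_type) to get the element count. Release the storage with free(a.data).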
Apache Portable Runtime has a decent set of array functions and is all C.
See the tutorial for a quick intro.
If you can multiply, there's really no need for a vector_create() function when you have malloc() or even calloc(). You just have to keep track of two values, the pointer and the allocated size, and send two values instead of one to whatever function you pass the "vector" to (if the function actually needs both the pointer and the size, that is). malloc() guarantees that the memory chunk is addressable as any type, so assign its void * return value to e.g. a struct car * and index it with []. Most processors access array[index] almost as fast as a plain variable, while a vector_at() function can be many times slower. If you store the pointer and size together in a struct, only do it in non-time-critical code, or you'll have to index with vector.ptr[index]. Release the space with free().
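As a small illustration of the pointer-plus-size approach, assuming a made-up struct car with two fields (the functions make_cars and total_weight are likewise hypothetical):

```c
#include <stdlib.h>

/* Hypothetical element type for the example. */
struct car {
    int wheels;
    double weight;
};

/* Allocate a zero-filled array of count cars; returns NULL on failure. */
struct car *make_cars(size_t count)
{
    return calloc(count, sizeof(struct car));
}

/* The "vector" is just a pointer and a count passed side by side. */
double total_weight(const struct car *cars, size_t count)
{
    double sum = 0.0;
    for (size_t i = 0; i < count; i++)
        sum += cars[i].weight;   /* plain [] indexing, no accessor call */
    return sum;
}
```

The caller owns the block returned by make_cars() and hands it to free() when done.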
Focus on writing a good wrapper around realloc() instead, one that only reallocates when the size crosses the next power of, e.g., 2 or 1.5. See user786653's Wikipedia link.
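One possible shape for such a wrapper is sketched below. The name grow, its parameters, the starting capacity of 8 and the roughly 1.5x step are all assumptions for illustration. It expects needed and elem_size to be greater than zero.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical wrapper: make room for at least `needed` elements of
 * `elem_size` bytes each, growing the capacity by about 1.5x per step.
 * Returns the (possibly moved) block, or NULL on failure, in which case
 * the old block and *cap are left untouched.
 * ASSUMPTION: needed > 0 and elem_size > 0. */
void *grow(void *ptr, size_t *cap, size_t needed, size_t elem_size)
{
    size_t new_cap = *cap;

    if (needed <= *cap)
        return ptr;                       /* enough room already */
    if (new_cap == 0)
        new_cap = 8;
    while (new_cap < needed) {
        size_t step = new_cap / 2 + 1;
        if (new_cap > SIZE_MAX - step) {  /* step would overflow */
            new_cap = needed;
            break;
        }
        new_cap += step;
    }
    if (new_cap > SIZE_MAX / elem_size)
        return NULL;                      /* byte count would overflow */

    void *p = realloc(ptr, new_cap * elem_size);
    if (p == NULL)
        return NULL;
    *cap = new_cap;
    return p;
}
```

Because the capacity grows geometrically, appending n elements one at a time triggers only on the order of log n reallocations. Store the result in a temporary before overwriting your pointer, so the old block is not leaked when NULL comes back.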
Of course, calloc(), malloc() and realloc() can fail if you run out of memory, and that's another possible reason for wanting a vector type. C++ has exceptions that automatically terminate the program if you don't catch them; C doesn't. But that's another discussion.
Lack of template functionality in C makes it impossible to support a truly generic vector-like structure. The best you can do is to define a 'generic' structure with some help from the preprocessor, and then 'instantiate' it for each of the types you want to support.
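A rough sketch of that preprocessor approach follows. DEFINE_VECTOR and the names it generates are invented for this example; a real implementation would also want functions for reserving, popping and freeing.

```c
#include <stdlib.h>

/* Hypothetical macro: expands to a struct plus a push function for one
 * element type. NAME is used as the type name and the function prefix. */
#define DEFINE_VECTOR(NAME, TYPE)                                   \
    typedef struct { TYPE *data; size_t size, cap; } NAME;          \
    static int NAME##_push(NAME *v, TYPE x)                         \
    {                                                               \
        if (v->size == v->cap) {                                    \
            size_t cap = v->cap ? v->cap * 2 : 8;                   \
            TYPE *p = realloc(v->data, cap * sizeof *p);            \
            if (!p)                                                 \
                return -1; /* v is unchanged on failure */          \
            v->data = p;                                            \
            v->cap = cap;                                           \
        }                                                           \
        v->data[v->size++] = x;                                     \
        return 0;                                                   \
    }

/* 'Instantiate' the vector once per supported element type. */
DEFINE_VECTOR(int_vec, int)
DEFINE_VECTOR(double_vec, double)
```

Usage would then be along the lines of int_vec v = {0}; int_vec_push(&v, 42); and finally free(v.data). Each instantiation is fully typed, at the cost of one macro expansion per element type.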