Copying unsigned char in C

I want to use memcpy but it seems to me that it's copying the array from the start?
I wish to copy from A[a] to A[b]. So, instead I found an alternative way,
void copy_file(char* from, int offset, int bytes, char* to) {
int i;
int j = 0;
for (i = offset; i <= (offset+bytes); i++) to[i] = from[j++];
}
I'm getting seg faults but I don't know where I am getting this seg fault from?
each entry holds 8 bytes so my second attempt was
void copy_file(char* from, int offset, int bytes, char* to) {
int i;
int j = 0;
for (i = 8*offset; i <= 8*(offset+bytes); i++) to[i] = from[j++];
}
but still seg fault. If you need more information please don't hesitate to ask!

I'm getting seg faults but I don't know where I am getting this seg fault from?
Primary Suggestion: Learn to use a debugger. It provides helpful information about erroneous instruction(s).
To answer your query about the code snippet shown in the question above:
Check the incoming pointers (to and from) against NULL before dereferencing them.
Put a check on the boundary limits for indexes used. Currently they can overrun the allocated memory.
To use memcpy() properly:
as per the man page, the signature of memcpy() indicates
void *memcpy(void *dest, const void *src, size_t n);
It copies n bytes from the address pointed to by src to the address pointed to by dest.
Also, a very very important point to note:
The memory areas must not overlap.
So, to copy A[a] to A[b], you may write something like
memcpy(destbuf, &A[a], (b-a) );
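The NULL and boundary checks suggested above can be sketched roughly as follows. This is only an illustration: the name copy_bytes_checked and the dst_size / src_size parameters are assumptions that are not in the question, and the caller has to know both buffer sizes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies `count` bytes starting at src[offset] into dst.
 * Returns 0 on success, -1 if a pointer is NULL or a range does not fit. */
int copy_bytes_checked(char *dst, size_t dst_size,
                       const char *src, size_t src_size,
                       size_t offset, size_t count)
{
    if (dst == NULL || src == NULL)
        return -1;
    /* Phrased this way so that offset + count cannot overflow. */
    if (offset > src_size || count > src_size - offset)
        return -1;
    if (count > dst_size)
        return -1;
    memcpy(dst, src + offset, count);  /* the buffers must not overlap */
    return 0;
}
```

With checks like these, an out-of-range request returns an error code instead of overrunning the allocated memory.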

it seems to me that memcpy copying the array from the start
No, it does not. In fact, memcpy does not have a slightest idea that it is copying from or to an array. It treats its arguments as pointers to unstructured memory blocks.
If you wish to copy from A[a] to A[b], pass an address of A[a] and the number of bytes between A[b] and A[a] to memcpy, like this:
memcpy(Dest, &A[a], (b-a) * sizeof(A[0]));
This would copy the content of A from index a, inclusive, to index b, exclusive, into a memory block pointed to by Dest. If you wish to apply an offset to Dest as well, use &Dest[d] for the first parameter. Multiplication by sizeof is necessary for arrays of types other than char, signed or unsigned.
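As a minimal sketch of that call for an int array (the wrapper name copy_int_range is made up for illustration, and it assumes Dest has room for b - a elements and does not overlap A):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies elements A[a] .. A[b-1] into dest.
 * Assumes a <= b, both within A, and that dest has room for b - a ints
 * and does not overlap the source range. */
void copy_int_range(int *dest, const int *A, size_t a, size_t b)
{
    assert(a <= b);
    /* memcpy counts bytes, so scale the element count by the element size. */
    memcpy(dest, &A[a], (b - a) * sizeof A[0]);
}
```

Without the multiplication by sizeof A[0], only the first b - a bytes would be copied, which for int is less than the requested elements.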

Change the last line from
for (i = offset; i <= (offset+bytes); i++)
to[i] = from[j++];
to
for (i = offset; i <= bytes; i++,j++)
to[j] = from[i];
This works fine for me. I have considered offset as the start of the array and byte as the end of the array. ie to copy from[offset] to from[bytes] to to[].

Related

How do you iterate through a pointer?

For example:
int *start;
start = (int*)malloc(40);
If I wanted to iterate through all 40 bytes, how would I do so?
I tried doing something like:
while(start != NULL){
start++;
}
but that iterates through a massive number of values, which is much greater than 40. Thus, how do you ensure that you iterate through all 40 bytes.
Thanks for all the help.
There are two issues here.
A single ptr++ skips as many bytes as the type of element it points to.
Here the type is int, so it would skip 4 bytes each time (assuming a platform where int is 4 bytes, i.e. 32 bits).
If you want to iterate through all 40 bytes (one byte at a time), iterate using say a char data type (or type cast your int* to char* and then increment)
The other problem is your loop termination.
Nothing places a NULL at the end of the block, so your loop keeps running (and the pointer keeps advancing) until the pointer walks out of your allotted memory area and crashes, or happens to compare equal to NULL. The behavior is undefined.
If you allocated 40 bytes, you have to terminate at 40 bytes yourself.
It is worth mentioning that casting the result of malloc is not a good idea in C. The primary reason is that the cast can hide a missing #include <stdlib.h>, which would otherwise produce a diagnostic. The cast is required in C++, though. The details can be found in the question on SO about exactly this; search for "casting return value of malloc".
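Both points (stepping one byte at a time and stopping at an explicit bound) can be sketched like this. The function name alloc_filled is hypothetical; the idea is only that the pointer type is unsigned char and that the end of the block is computed from the size you asked for.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: allocates `nbytes` bytes and sets each one to `value`
 * by stepping through the block one byte at a time. Returns NULL on failure. */
unsigned char *alloc_filled(size_t nbytes, unsigned char value)
{
    unsigned char *block = malloc(nbytes);   /* no cast needed in C */
    if (block == NULL)
        return NULL;

    unsigned char *end = block + nbytes;     /* one past the last byte */
    for (unsigned char *p = block; p != end; p++)
        *p = value;                          /* p++ advances exactly one byte */

    return block;
}
```

Calling alloc_filled(40, 0) would visit exactly 40 bytes, regardless of sizeof(int).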
First of all, you should be allocating ints correctly:
int* start = malloc( sizeof( int )*40 ) ;
Then you can use array subscripting:
for( size_t i = 0 ; i < 40 ; i++ )
{
start[i] = 0 ;
}
or a pointer to the end of the allocated memory:
int* end = start+40 ;
int* iter = start ;
while( iter < end )
{
*iter= 0 ;
iter++ ;
}
Arrays represent contiguous blocks of memory. Since the name of the array is basically a pointer to the first element, you can use array notation to access the rest of the block. Remember though, there is no error checking by C on the bounds of the array, so if you walk off the end of the memory block, you can do all kinds of things that you didn't intend and more than likely will end up with some sort of memory fault or segmentation error. Since your int can be variable size, I would use this code instead:
int *start;
int i;
start = malloc(40 * sizeof(int));
for (i = 0; i < 40; i++)
{
start[i] = 0;
}
Something like that will work nicely. The way that you are doing it, at least from the code that you posted, there is no way to stop the loop because once it exceeds the memory block, it will keep going until it runs into a NULL or you get a memory fault. In other words, the loop will only exit if it runs into a null. That null may be within the block of memory that you allocated, or it may be way beyond the block.
EDIT: One thing I noticed about my code. It will allocate space for 40 ints which can be either 4 bytes, 8 bytes, or something else depending on the architecture of the machine you are working on. If you REALLY only want 40 bytes of integers, then do something like this:
int *start;
int i;
int size;
size = 40/sizeof(int);
start = malloc(size * sizeof(int)); /* size counts ints, malloc counts bytes */
for (i = 0; i < size; i++)
{
start[i] = 0;
}
Or you can use a char data type or an unsigned char if you need to. One other thing that I noticed. The malloc function returns a void pointer type which is compatible with all pointers, so there is no need to do a typecast on a malloc.
Well arrays in C aren't bounded so, a few options, the most common:
int *start;
int cnt = 0;
start = (int*)malloc(sizeof(int)*40);
while(cnt<40)
{
start++;
cnt++;
}
Another option:
int *start;
int *ref;
start = ref = (int*)malloc(sizeof(int)*40);
while(start != ref+40)
start++;
And this last one is the closest to what you seem to mean to do:
int *start;
int *ref;
start = ref = (int*)malloc(sizeof(int)*41);
start[40] = -1;
while((*start) != -1)
start++;
I would suggest reading more on how pointers in C work. You don't appear to have a very good grasp of it. Also, remember that C takes off the training wheels. Arrays aren't bounded or terminated in a standard way, and a pointer (address in memory) will never be NULL after iterating through an array, and the contents a pointer is pointing to could be anything.

how to implement overlap-checking memcpy in C

This is a learning exercise. I'm attempting to augment memcpy by notifying the user if the copy operation will pass or fail before it begins. My biggest question is the following. If I allocate two char arrays of 100 bytes each, and have two pointers that reference each array, how do I know which direction I am copying? If I copy everything from the first array to the second how do I ensure that the user will not be overwriting the original array?
My current solution compares the distance of the pointers from the size of the destination array. If the size between is smaller than I say an overwrite will occur. But what if its copying in the other direction? I'm just kind of confused.
int memcpy2(void *target, void *source, size_t nbytes) {
char * ptr1 = (char *)target;
char * ptr2 = (char *)source;
int i, val;
val = abs(ptr1 - ptr2);
printf("%d, %d\n", val, nbytes + 0);
if (val > nbytes) {
for (i = 0; i < val; i++){
ptr1[i] = ptr2[i];
}
return 0; /*success */
}
return -1; /* error */
}
int main(int argc, char **argv){
char src [100] = "Copy this string to dst1";
char dst [20];
int p;
p = memcpy2(dst, src, sizeof(dst));
if (p == 0)
printf("The element\n'%s'\nwas copied to \n'%s'\nSuccesfully\n", src, dst);
else
printf("There was an error!!\n\nWhile attempting to copy the elements:\n '%s'\nto\n'%s', \n Memory was overlapping", src, dst);
return 0;
}
The only portable way to determine if two memory ranges overlap is:
int overlap_p(void *a, void *b, size_t n)
{
char *x = a, *y = b;
size_t i;
for (i=0; i<n; i++) if (x+i==y || y+i==x) return 1;
return 0;
}
This is because comparison of pointers with the relational operators is undefined unless they point into the same array. In reality, the comparison does work on most real-world implementations, so you could do something like:
int overlap_p(void *a, void *b, size_t n)
{
char *x = a, *y = b;
return (x<=y && x+n>y) || (y<=x && y+n>x);
}
I hope I got that logic right; you should check it. You can simplify it even more if you want to assume you can take differences of arbitrary pointers.
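One way to write the simplified version is to compare the addresses as integers. This is a sketch under an explicit assumption: that converting a pointer to uintptr_t gives a flat address that can be compared numerically, which holds on common platforms but is not guaranteed by the C standard. The name ranges_overlap is made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if [a, a+n) and [b, b+n) share at least one byte, else 0.
 * Assumption: converting a pointer to uintptr_t yields a flat address
 * that can be compared numerically (true on common platforms, not
 * guaranteed by the C standard). */
int ranges_overlap(const void *a, const void *b, size_t n)
{
    uintptr_t x = (uintptr_t)a;
    uintptr_t y = (uintptr_t)b;
    uintptr_t distance = (x > y) ? x - y : y - x;
    /* Two ranges of equal length overlap when their starts are closer than n. */
    return n > 0 && distance < n;
}
```

Taking the absolute distance means the check does not depend on which of the two buffers comes first in memory.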
What you want to check is the position in memory of the source relatively to the destination:
If the source is ahead of the destination (ie. source < destination), then you should start from the end. If the source is after, you start from the beginning. If they are equal, you don't have to do anything (trivial case).
Here are some crude ASCII drawings to visualize the problem.
|_;_;_;_;_;_| (source)
|_;_;_;_;_;_| (destination)
>-----^ start from the end to shift the values to the right
|_;_;_;_;_;_| (source)
|_;_;_;_;_;_| (destination)
^-----< start from the beginning to shift the values to the left
Following a very accurate comment below, I should add that you can use the difference of the pointers (destination - source), but to be on the safe side cast those pointers to char * beforehand.
In your current setting, I don't think that you can check if the operation will fail. Your memcpy prototype prevents you from doing any form of checking for that, and with the rule given above for deciding how to copy, the operation will succeed (outside of any other considerations, like prior memory corruption or invalid pointers).
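The direction rule described above could look roughly like this. It is a sketch, not how any particular library implements memmove; the name copy_overlap_safe is hypothetical, and comparing the addresses through uintptr_t is an assumption about the platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical memmove-like copy that picks its direction from the
 * relative position of the two buffers.
 * Assumption: comparing the addresses as uintptr_t is meaningful on the
 * target platform. */
void *copy_overlap_safe(void *destination, const void *source, size_t nbytes)
{
    unsigned char *d = destination;
    const unsigned char *s = source;

    if ((uintptr_t)s < (uintptr_t)d) {
        /* Source is before destination: copy from the end backwards. */
        for (size_t i = nbytes; i > 0; i--)
            d[i - 1] = s[i - 1];
    } else if ((uintptr_t)s > (uintptr_t)d) {
        /* Source is after destination: copy from the beginning. */
        for (size_t i = 0; i < nbytes; i++)
            d[i] = s[i];
    }
    /* Equal pointers: nothing to do. */
    return destination;
}
```

In both branches every source byte is read before the copy overwrites it, which is why no failure case is needed.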
I don't believe that "attempting to augment memcpy by notifying the user if the copy operation will pass or fail before it begins." is a well-formed notion.
First, memcpy() doesn't succeed or fail in the normal sense. It just copies the data, which might cause a fault/exception if it reads outside the source array or writes outside the destination array, and it might also read or write outside one of those arrays without causing any fault/exception and just silently corrupting data. When I say "memcpy does this" I'm not talking just about the implementation of the C stdlib memcpy, but about any function with the same signature -- it doesn't have enough information to do otherwise.
Second, if your definition of "succeed" is "assuming the buffers are big enough but may be overlapping, copy the data from source to dst without tripping over yourself while copying" -- that is indeed what memmove() does, and it's always possible. Again, there's no "return failure" case. If the buffers don't overlap it's easy, if the source is overlapping the end of the destination then you just copy byte by byte from the beginning; if the source is overlapping the beginning of the destination then you just copy byte by byte from the end. Which is what memmove() does.
Third, when writing this kind of code, you have to be very careful about overflow cases for your pointer arithmetic (including addition, subtraction, and array indexing). In val = abs(ptr1 - ptr2), the type of ptr1 - ptr2 is ptrdiff_t, which may be wider than int. abs() takes an int, so a large difference can be truncated before abs() ever sees it, and int is the wrong type to store the result in. Just so you know.

How can I copy a repeating pattern into a memory buffer?

I want to write a repeating pattern of bytes into a block of memory. My idea is to write the first example of the pattern, and then copy it into the rest of the buffer. For example, if I start with this:
ptr: 123400000000
Afterward, I want it to look like this:
ptr: 123412341234
I thought I could use memcpy to write to intersecting regions, like this:
memcpy(ptr + 4, ptr, 8);
The standard does not specify what order the copy will happen in, so if some implementation makes it copy in reverse order, it can give different results:
ptr: 123412340000
or even combined results.
Is there any workaround that lets me still use memcpy, or do I have to implement my own for loop? Note that I cannot use memmove because it does exactly what I'm trying to avoid; it makes ptr be 123412340000, while I want 123412341234.
I program for Mac/iPhone(clang compiler) but a general answer will be good too.
There is no standard function to repeat a pattern of bytes upon a memory range. You can use the memset_pattern* function family to get fixed-size patterns; if you need the size to vary, you'll have to roll your own.
// fills the 12 first bytes at `ptr` with the 4 first bytes of `ptr`
memset_pattern4(ptr, ptr, 12);
Be aware that memset_pattern4, memset_pattern8 and memset_pattern16 exist only on Mac OS/iOS, so don't use them for cross-platform development.
Otherwise, rolling a (cross-platform) function that does a byte-per-byte copy is pretty easy.
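void byte_copy(void* into, const void* from, size_t size)
{
/* void* cannot be indexed, so go through unsigned char pointers */
unsigned char* dst = into;
const unsigned char* src = from;
for (size_t i = 0; i < size; i++)
dst[i] = src[i];
}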
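To see why a strictly front-to-back byte copy produces the repeated pattern, here is a small sketch. The function name repeat_pattern and its parameters are made up for this example; it assumes the pattern already sits at the start of the buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: treats the first `pattern_len` bytes of `buffer` as
 * the pattern and repeats it until `total_len` bytes are filled. */
void repeat_pattern(void *buffer, size_t pattern_len, size_t total_len)
{
    unsigned char *bytes = buffer;
    assert(pattern_len > 0);
    /* Strictly front to back: bytes[i - pattern_len] has already been
     * written by the time it is read, so the pattern propagates. */
    for (size_t i = pattern_len; i < total_len; i++)
        bytes[i] = bytes[i - pattern_len];
}
```

For the buffer in the question, repeat_pattern(ptr, 4, 12) turns 123400000000 into 123412341234. Unlike memcpy, the copy order here is fixed by the loop, so the result does not depend on the implementation.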
Here is what kernel.org says:
The memcpy() function copies n bytes
from memory area src to memory area
dest. The memory areas must not
overlap. Use memmove(3) if the
memory areas do overlap.
And here is what MSDN says:
If the source and destination overlap,
the behavior of memcpy is undefined.
Use memmove to handle overlapping
regions.
The C++ answer for all platforms is std::fill_n(destination, elementRepeats, elementValue).
For what you've asked for:
short val = 0x1234;
std::fill_n(ptr, 3, val);
This will work for val of any type; chars, shorts, ints, int64_t, etc.
Old answer
You want memmove(). Full description:
The memmove() function shall copy n bytes from the object pointed to by s2 into the object pointed to by s1. Copying takes place as if the n bytes from the object pointed to by s2 are first copied into a temporary array of n bytes that does not overlap the objects pointed to by s1 and s2, and then the n bytes from the temporary array are copied into the object pointed to by s1.
From the memcpy() page:
If copying takes place between objects that overlap, the behaviour is undefined.
You have to use memmove() anyway. This is because the result of using memcpy() is not reliable in any way.
Relevant bits to the actual question
You're asking for memcpy(ptr + 4, ptr, 8); which says copy 8 bytes from ptr and put them at ptr+4. ptr is 123400000000, the first 8 bytes are 12340000, so it is doing this:
Original : 123400000000
Writes : 12340000
Result : 123412340000
You'd need to call:
memcpy(ptr+4, ptr, 4);
memcpy(ptr+8, ptr, 4);
To achieve what you're after. Or implement an equivalent. This ought to do it, but it is untested, and is equivalent to memcpy; you'll need to either add the extra temporary buffer or use two non-overlapping areas of memory.
void memexpand(void* result, const void* start,
const uint64_t cycle, const uint64_t limit)
{
uint64_t count = 0;
const uint8_t* source = start;
uint8_t* dest = result;
while ( count < limit )
{
*dest = *source;
dest++;
count++;
if ( count % cycle == 0 )
{
source = start;
}
else
{
source++;
}
}
}
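You can do that by copying the pattern once, then using memcpy to copy everything written so far onto the bytes that follow, doubling the filled region each time. It's better understood in code:
void block_memset(void *destination, const void *source, size_t source_size, size_t repeats) {
unsigned char *dest = destination;
memcpy(dest, source, source_size);
for (size_t i = 1; i < repeats; i += i) {
/* i repeats are filled so far; copy at most that many, without running past the end */
size_t count = (i < repeats - i) ? i : repeats - i;
memcpy(dest + i * source_size, dest, source_size * count);
}
}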
I benchmarked; it's as fast as regular memset for large number of repeats, and the source_size is quite dynamic without much performance penalty too.
Why not just allocate an 8 byte buffer, move it there, then move it back to where you want it? (As @cnicutar says, you shouldn't have overlapping address spaces for memcpy.)

Segmentation Fault with strcat

I'm having a bit of a problem with strcat and segmentation faults. The error is as follows:
Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x0000000000000000
0x00007fff82049f1f in __strcat_chk ()
(gdb) where
#0 0x00007fff82049f1f in __strcat_chk ()
#1 0x0000000100000adf in bloom_operation (bloom=0x100100080, item=0x100000e11 "hello world", operation=1) at bloom_filter.c:81
#2 0x0000000100000c0e in bloom_insert (bloom=0x100100080, to_insert=0x100000e11 "hello world") at bloom_filter.c:99
#3 0x0000000100000ce5 in main () at test.c:6
bloom_operation is as follows:
int bloom_operation(bloom_filter_t *bloom, const char *item, int operation)
{
int i;
for(i = 0; i < bloom->number_of_hash_salts; i++)
{
char temp[sizeof(item) + sizeof(bloom->hash_salts[i]) + 2];
strcat(temp, item);
strcat(temp, *bloom->hash_salts[i]);
switch(operation)
{
case BLOOM_INSERT:
bloom->data[hash(temp) % bloom->buckets] = 1;
break;
case BLOOM_EXISTS:
if(!bloom->data[hash(temp) % bloom->buckets]) return 0;
break;
}
}
return 1;
}
The line with trouble is the second strcat. The bloom->hash_salts are part of a struct defined as follows:
typedef unsigned const char *hash_function_salt[33];
typedef struct {
size_t buckets;
size_t number_of_hash_salts;
int bytes_per_bucket;
unsigned char *data;
hash_function_salt *hash_salts;
} bloom_filter_t;
And they are initialized here:
bloom_filter_t* bloom_filter_create(size_t buckets, size_t number_of_hash_salts, ...)
{
bloom_filter_t *bloom;
va_list args;
int i;
bloom = malloc(sizeof(bloom_filter_t));
if(bloom == NULL) return NULL;
// left out stuff here for brevity...
bloom->hash_salts = calloc(bloom->number_of_hash_salts, sizeof(hash_function_salt));
va_start(args, number_of_hash_salts);
for(i = 0; i < number_of_hash_salts; ++i)
bloom->hash_salts[i] = va_arg(args, hash_function_salt);
va_end(args);
// and here...
}
And bloom_filter_create is called as follows:
bloom_filter_create(100, 4, "3301cd0e145c34280951594b05a7f899", "0e7b1b108b3290906660cbcd0a3b3880", "8ad8664f1bb5d88711fd53471839d041", "7af95d27363c1b3bc8c4ccc5fcd20f32");
I'm doing something wrong but I'm really lost as to what. Thanks in advance,
Ben.
I see a couple of problems:
char temp[sizeof(item) + sizeof(bloom->hash_salts[i]) + 2];
sizeof(item) will only return 4 (or 8 on a 64-bit platform) because item is a pointer. You probably need to use strlen() for the actual length. Declaring the array on the stack with a size computed by strlen() relies on variable-length arrays, which C99 added and gcc supports.
The other problem is that the temp array is not initialized. So the first strcat may not write to the beginning of the array. It needs to have a NUL (0) put in the first element before calling strcat.
It may already be in the code that was snipped out, but I didn't see that you initialized the number_of_hash_salts member in the structure.
You need to use strlen, not sizeof. item is passed in as a pointer, not an array.
The line:
char temp[sizeof(item) + sizeof(bloom->hash_salts[i]) + 2];
will make temp 34 times the size of a pointer, plus 2. The size of item is the size of a pointer, and sizeof(bloom->hash_salts[i]) is currently 33 times the size of a pointer.
You need to use strlen for item, so you know the actual number of characters.
Second, bloom->hash_salts[i] is a hash_function_salt, which is an array of 33 pointers to char. It seems like hash_function_salt should be defined as an array of 33 characters, since you want it to hold 33 characters, not 33 pointers. You should also remember that when you're passing a string literal to bloom_filter_create, you're passing a pointer. That means that to initialize the hash_function_salt array we use memcpy or strcpy. memcpy is faster when we know the exact length (like here).
So we get:
typedef unsigned char hash_function_salt[33];
and in bloom_filter_create:
memcpy(bloom->hash_salts[i], va_arg(args, char*), sizeof(bloom->hash_salts[i]));
Going back to bloom_operation, we get:
char temp[strlen(item) + sizeof(bloom->hash_salts[i])];
strcpy(temp, item);
strcat(temp, bloom->hash_salts[i]);
We use strlen for item since it's a pointer, but sizeof for the hash_function_salt, which is a fixed size array of char. We don't need to add anything, because hash_function_salt already includes room for a NUL. We use strcpy first. strcat is for when you already have a NUL-terminated string (which we don't here). Note that we drop the *. That was a mistake following from your incorrect typedef.
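If you would rather not depend on a variable-length array, the same sizing logic can be sketched with a heap buffer. This is an illustration only: concat_item_and_salt is a hypothetical helper that is not part of the question's code, and it assumes both arguments are NUL-terminated strings.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated string holding `item`
 * followed by `salt`, or NULL on allocation failure. The caller frees it. */
char *concat_item_and_salt(const char *item, const char *salt)
{
    size_t item_len = strlen(item);   /* strlen, not sizeof: item is a pointer */
    size_t salt_len = strlen(salt);
    char *key = malloc(item_len + salt_len + 1);  /* +1 for the NUL */
    if (key == NULL)
        return NULL;
    memcpy(key, item, item_len);
    memcpy(key + item_len, salt, salt_len + 1);   /* copies the NUL too */
    return key;
}
```

Because the first write starts at key[0], there is no dependence on the buffer's initial contents, which is the problem strcat had with the uninitialized temp.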
Your array size calculation for temp uses sizeof(bloom->hash_salts[i]) (which is
just the size of the pointer), but then you dereference the pointer and try
to copy the entire string into temp.
First, as everyone has said, you've sized temp based on the sizes of two pointers, not the lengths of the strings. You've now fixed that, and report that the symptom has moved to the call to strlen().
This is showing a more subtle bug.
You've initialized the array bloom->hash_salts[] from pointers returned by va_arg(). Those pointers will have a limited lifetime. They may not even outlast the call to va_end(), but they almost certainly do not outlive the call to bloom_filter_create().
Later, in bloom_filter_operation(), they point to arbitrary places and you are doomed to some kind of interesting failure.
Edit: Resolving this requires that the pointers stored in the hash_salts array have sufficient lifetime. One way to deal with that is to allocate storage for them, copying them out of the varargs array, for example:
// fragment from bloom_filter_create()
bloom->hash_salts = calloc(bloom->number_of_hash_salts, sizeof(hash_function_salt));
va_start(args, number_of_hash_salts);
for(i = 0; i < number_of_hash_salts; ++i)
bloom->hash_salts[i] = strdup(va_arg(args, hash_function_salt));
va_end(args);
Later, you would need to loop over hash_salts and call free() on each element before freeing the array of pointers itself.
Another approach that would require more overhead to initialize, but less effort to free, would be to allocate the array of pointers along with enough space for all of the strings in a single allocation. Then copy the strings and fill in the pointers. It's a lot of code to get right for a very small advantage.
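For reference, the single-allocation layout might be sketched like this. The function name copy_strings_single_block is hypothetical, and it takes an ordinary array of strings rather than a va_list to keep the example short.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies `count` strings into one malloc'd block laid
 * out as [count pointers][string bytes...]. A single free() releases it. */
char **copy_strings_single_block(const char *const *strings, size_t count)
{
    size_t total_chars = 0;
    for (size_t i = 0; i < count; i++)
        total_chars += strlen(strings[i]) + 1;   /* +1 for each NUL */

    char **table = malloc(count * sizeof *table + total_chars);
    if (table == NULL)
        return NULL;

    char *storage = (char *)(table + count);  /* bytes after the pointer table */
    for (size_t i = 0; i < count; i++) {
        size_t len = strlen(strings[i]) + 1;
        memcpy(storage, strings[i], len);
        table[i] = storage;
        storage += len;
    }
    return table;
}
```

The pointer table comes first so that it stays correctly aligned; the character data that follows has no alignment requirement.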
Are you sure that the hash_function_salt type is defined correctly? You may have too many *'s:
(gdb) ptype bloom
type = struct {
size_t buckets;
size_t number_of_hash_salts;
int bytes_per_bucket;
unsigned char *data;
hash_function_salt *hash_salts;
} *
(gdb) ptype bloom->hash_salts
type = const unsigned char *(*)[33]
(gdb) ptype bloom->hash_salts[0]
type = const unsigned char *[33]
(gdb) ptype *bloom->hash_salts[0]
type = const unsigned char *
(gdb)

How to make the bytes of the block be initialized so that they contain all 0s

I am writing the calloc function for a memory management assignment (I am using C). I already wrote the malloc function and am thinking about using it for calloc: calloc takes num and size and returns a block of memory of (num * size) bytes, which I can create with malloc. However, the assignment says that I need to initialize all bytes to 0, and I am confused about how to do that in general.
If you need more info please ask me :)
So malloc will return a pointer (a void pointer) to the start of the usable memory, and I have to go through the bytes, initialize them to zero, and return the pointer to the front of the usable memory.
I am assuming you can't use memset because it's a homework assignment and deals with memory management. So, I would just go in a loop and set all bytes to 0. Pseudocode:
for i = 0 to n - 1:
data[i] = 0
Oh, if you're having trouble understanding how to dereference void *, remember you can do:
void *b;
/* now make b point to somewhere useful */
unsigned char *a = b;
When you need to set a block of memory to the same value, use the memset function.
It looks like this: void * memset ( void * ptr, int value, size_t num );
You can find more information about the function at: http://www.cplusplus.com/reference/clibrary/cstring/memset/
If you can't use memset, then you'll need to resort to setting each byte individually.
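If memset is allowed, a calloc-like wrapper could be sketched as below. The name zeroed_alloc is made up, and the sketch calls the standard malloc; in the assignment you would call your own malloc instead.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical calloc-like helper built on malloc and memset.
 * Returns NULL if count * size would overflow or the allocation fails. */
void *zeroed_alloc(size_t count, size_t size)
{
    if (size != 0 && count > SIZE_MAX / size)
        return NULL;                 /* multiplication would overflow */

    size_t total = count * size;
    void *block = malloc(total);
    if (block != NULL)
        memset(block, 0, total);     /* set every byte to 0 */
    return block;
}
```

The overflow check matters because a wrapped count * size would allocate a block smaller than the caller expects.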
Since you're calling malloc from your calloc function, I'm going to assume it looks something like this:
void *calloc (size_t count, size_t sz) {
size_t realsz = count * sz;
void *block = malloc (realsz);
if (block != NULL) {
// Zero memory here.
}
return block;
}
and you just need the code for "// Zero memory here.".
Here's what you need to know.
In order to process the block one byte at a time, you need to cast the pointer to a type that references bytes (char would be good). To cast your pointer to (for example) an int pointer, you would use int *block2 = (int*)block;.
Once you have the right type of pointer, you can use that to store the correct data value based on the type. You would do this by storing the desired value in a loop which increments the pointer and decrements the count until the count reaches zero.
Hopefully that's enough to start with without giving away every detail of the solution. If you still have problems, leave a comment and I'll flesh out the answer until you have it correct (since it's homework, I'll be trying to get you to do most of the thinking).
Update: Since an answer's already been accepted, I'll post my full solution. To write a basic calloc in terms of just malloc:
void *calloc (size_t count, size_t sz) {
size_t realsz, i;
char *cblock;
// Get size to allocate (detect size_t overflow as well).
realsz = count * sz;
if (count != 0)
if (realsz / count != sz)
return NULL;
// Allocate the block.
cblock = malloc (realsz);
// Initialize all elements to zero (if allocation worked).
if (cblock != NULL) {
for (i = 0; i < realsz; i++)
cblock[i] = 0;
}
// Return allocated, cleared block.
return cblock;
}
Note that you can work directly with char pointers within the function since they freely convert to and from void pointers.
Hints:
there is already a posix library function for zeroing a block of memory
consider casting the void * to some pointer type that you can dereference / assign to.
