implementation of memcpy function - c

I looked at
http://www.opensource.apple.com/source/xnu/xnu-2050.24.15/libsyscall/wrappers/memcpy.c
and didn't understand the following :
1-
inside
void * memcpy(void *dst0, const void *src0, size_t length) {
char *dst = dst0;
const char *src = src0;
line:
if ((unsigned long)dst < (unsigned long)src) {
How can we cast dst to an unsigned long? It's a pointer!
2- Why do they sometimes prefer forward copying and sometimes backward copying?

You are right, this implementation is non-portable, because it is assuming that a pointer is going to fit in unsigned long. This is not guaranteed by the standard.
A more suitable type for this implementation would have been uintptr_t from <stdint.h>. It is an optional type, but where it exists, any valid void * can be converted to it and back unchanged.

When comparing a void* and char* pointer, the compiler will give a warning (gcc -Wall):
warning: comparison of distinct pointer types lacks a cast
I imagine that the developer decided to "make the warning go away" - but the correct way to do this is with a void* cast (which is portable):
if((void*) dst < (void*) src) {
As for the second point - as was pointed out, you have to take care of overlapping memory locations. Imagine the following 8 characters in successive memory locations:
abcdefgh
Now we want to copy this "3 to the right". Starting with a, we would get (with left-to-right copy):
abcdefgh
abcaefgh
^
abcabfgh
^^
abcabcgh
^^^
etc until you end up with
abcabcabcab
^^^^^^^^
When we wanted to get
abcabcdefgh
Starting from the other end, we don't overwrite things we still have to copy. When the destination is to the left of the source, you have to do it in the opposite direction. And when there is no overlap between source and destination, it doesn't matter what you do.
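The direction rule described above can be sketched as follows. This is only an illustration, not the XNU code: copy_either_direction is a hypothetical name, and the sketch assumes that converting pointers to uintptr_t preserves their order, which is true on flat-address platforms but is not promised by the C standard.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copies n bytes and picks the copy direction so
 * that overlapping regions are handled correctly.
 * Assumption: pointer order survives the conversion to uintptr_t. */
void *copy_either_direction(void *dst0, const void *src0, size_t n)
{
    unsigned char *dst = dst0;
    const unsigned char *src = src0;

    assert((dst0 != NULL && src0 != NULL) || n == 0);

    if ((uintptr_t)dst < (uintptr_t)src) {
        /* Destination is to the left of the source: copy front to back. */
        for (size_t i = 0; i < n; i++)
            dst[i] = src[i];
    } else if ((uintptr_t)dst > (uintptr_t)src) {
        /* Destination is to the right of the source: copy back to front. */
        for (size_t i = n; i > 0; i--)
            dst[i - 1] = src[i - 1];
    }
    /* dst == src: nothing to do. */
    return dst0;
}
```

With an 11-character buffer that starts as abcdefgh, calling copy_either_direction(buf + 3, buf, 8) takes the back-to-front branch and should leave abcabcdefgh, the result the walkthrough above was aiming for.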

Explicit cast to Implicit cast in C

I was wondering if it was possible to return the same result with only 1 explicit cast ?
void *begin(void *pt, size_t size)
{
return (void*)((size_t)pt & -size);
}
Every time I tried, I got a BAD_ACCESS (code 1).
Example:
void *begin(void *pt, size_t size)
{
size_t *tmp = pt;
size_t res = *tmp & -size;
return (void*)(res);
}
It can be done with zero casts, with implementation-dependent code. However, this is akin to writing code without using the letter “e”: It may be a challenge, but it serves no purpose in production code. If it is posed as an academic exercise, it can be useful because artificial constraints can induce a student to think about things they might not otherwise think about so much, like alternative ways to do things or the technical rules of the language. However, in practice, this is generally pointless.
Your sample code uses size_t, but the preferred type for working with addresses as integers is uintptr_t, if it is defined. If it is defined, it is defined in <stdint.h>, and any normal C implementation of even modest quality should define it.
Your sample code assumes that converting an address to size_t yields a plain integer address in memory. (The address & -size operation is a common way of finding an address aligned to a multiple of size, which must be a power of two, by clearing the low bits, and so we recognize that your (size_t) pt must be a plain address, at least in its low bits.) Instead, let us assume that a pointer is represented in memory using a plain integer for the address and is the same size as uintptr_t. In any C implementation in which either of these is true, the other is likely true too. Before using the following code, you should confirm this for your target C implementations.
Given that assumption, we can implement your begin routine with no casts:
#include <stdint.h>
#include <string.h>
void *begin(void *pt, size_t size)
{
uintptr_t us = size; // Convert size to uintptr_t to ensure it is at least as wide.
uintptr_t x; // Make space to copy pointer.
memcpy(&x, &pt, sizeof x); // Copy bytes.
x &= -us; // Zero low bits.
memcpy(&pt, &x, sizeof pt); // Copy bytes back.
return pt;
}
If the assumption is not true, it is nonetheless possible to implement begin for any chosen C implementation by setting an unsigned char pointer with unsigned char *p = (unsigned char *) &pt; and then using p to examine and manipulate the bytes of pt. The C standard requires each implementation to document its representation of types, so the meanings of the bytes in the void * pt must be documented, which enables writing code to compute with them as desired.
That uses one cast. It could be reduced to zero with void *v = &pt; unsigned char *p = v;.
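For comparison, the conventional way to write this in ordinary code keeps the casts but uses uintptr_t. The sketch below uses a hypothetical name, begin_uintptr, and assumes that uintptr_t exists, that size is a power of two, and that the integer value of a pointer is a plain byte address (which is implementation-defined). Note also that the second snippet in the question reads *tmp, so it masks the data that pt points to rather than the address held in pt.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical variant of begin(): rounds pt down to a multiple of size.
 * Assumptions: uintptr_t is available, size is a power of two, and a
 * pointer converts to a plain byte address. */
void *begin_uintptr(void *pt, size_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0); /* power of two */

    uintptr_t mask = ~((uintptr_t)size - 1);       /* same bits as -size */
    return (void *)((uintptr_t)pt & mask);
}
```

The mask is built in uintptr_t rather than size_t so that it is at least as wide as the converted pointer, which is the same concern the no-cast version addresses with its us variable.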

Safely checking for overlapping memory regions

I'm trying to further my knowledge and experience in C, so I'm writing some small utilities.
I'm copying memory, and according to the man page for memcpy(3):
NOTES
Failure to observe the requirement that the memory areas do not overlap has been
the source of real bugs. (POSIX and the C standards are explicit that employing
memcpy() with overlapping areas produces undefined behavior.) Most notably, in
glibc 2.13 a performance optimization of memcpy() on some platforms (including
x86-64) included changing the order in which bytes were copied from src to dest.
Clearly, overlapping memory regions passed to memcpy(3) can cause a lot of problems.
I'm trying to write a safe wrapper as part of learning C to make sure that these memory regions don't overlap:
int safe_memcpy(void *dest, void *src, size_t length);
The logic I'm trying to implement is:
Check both the source and destination pointers for NULL.
Establish the pointer "range" for both source and dest with the length parameter.
Determine if the source range intersects with the destination range, and vice versa.
My implementation so far:
#define SAFE_MEMCPY_ERR_NULL 1
#define SAFE_MEMCPY_ERR_SRC_OVERLAP 2
#define SAFE_MEMCPY_ERR_DEST_OVERLAP 3
int safe_memcpy(void *dest, void *src, size_t length) {
if (src == NULL || dest == NULL) {
return SAFE_MEMCPY_ERR_NULL;
}
void *dest_end = &dest[length - 1];
void *src_end = &src[length - 1];
if ((&src >= &dest && &src <= &dest_end) ||
(&src_end >= &dest && &src_end <= &dest_end)) {
// the start of src falls within dest..dest_end OR
// the end of src falls within dest..dest_end
return SAFE_MEMCPY_ERR_SRC_OVERLAP;
}
if ((&dest >= &src && &dest <= &src_end) ||
(&dest_end >= &src && &dest_end <= &src_end)) {
// the start of dest falls within src..src_end
// the end of dest falls within src..src_end
return SAFE_MEMCPY_ERR_DEST_OVERLAP;
}
// do the thing
memcpy(dest, src, length);
return 0;
}
There's probably a better way to do errors, but this is what I've got for now.
I'm pretty sure I'm triggering some undefined behavior in this code, as I'm hitting SAFE_MEMCPY_ERR_DEST_OVERLAP on memory regions that do not overlap. When I examine the state using a debugger, I see (for instance) the following values:
src: 0x7ffc0b75c5fb
src_end: 0x7ffc0b75c617
dest: 0x1d05420
dest_end: 0x1d0543c
Clearly, these addresses do not even remotely overlap, hence why I'm thinking I'm triggering UB, and compiler warnings indicate as such:
piper.c:68:27: warning: dereferencing ‘void *’ pointer
void *dest_end = &dest[length - 1];
It seems that I need to cast the pointers as a different type, but I'm not sure which type to use: the memory is untyped so should I use a char * to "look at" the memory as bytes? If so, should I cast everything as a char *? Should I instead use intptr_t or uintptr_t?
Given two pointers and a length for each of them, how can I safely check if these regions overlap one another?
In the first place, a conforming program cannot perform pointer arithmetic on a pointer of type void *, nor (relatedly) apply the indexing operator to it, not even with index 0. void is an incomplete type, and unique among those in that it cannot be completed. The most relevant implication of that is that that type does not convey any information about the size of the thing to which it points, and pointer arithmetic is defined in terms of the pointed-to object.
So yes, expressions such as your &dest[length - 1] have undefined behavior with respect to the C standard. Some implementations provide extensions affecting that, and others reject such code at compile time. In principle, an implementation could accept the code and do something bizarre with it, but that's relatively unlikely.
In the second place, you propose to "write a safe wrapper as part of learning C to make sure that these memory regions don't overlap", but there is no conforming way to do that for general pointers. Relational pointer comparisons (<, <=, >, >=) and pointer differences are defined only for pointers into the same array (or to one element past the end of the array), where a pointer to a scalar is considered in that regard as a pointer to the first element of a dimension-1 array.
Converting to a different pointer type, perhaps char *, would resolve the pointer arithmetic issue, but not, in the general case, the pointer comparability issue. It might get exactly the behavior you want out of some implementations, reliably even, but it is not a conforming approach to the problem, and the ensuing undefined behavior might produce genuine bugs in other implementations.
Relatively often, you can know statically that pointers do not point to overlapping regions. In particular, if one pointer in question is a pointer to an in-scope local variable or to a block of memory allocated by the current function, then you can usually be sure whether there is an overlap. For cases where you do not know, or where you know that there definitely is overlap, the correct approach is to use memmove() instead of memcpy().
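As a small illustration of the memmove() recommendation, here is a sketch of a case where the overlap is known in advance. The helper name slide_right and its parameters are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: opens a gap of `gap` bytes at the front of buf by
 * sliding the first `used` bytes to the right. Source and destination
 * overlap whenever gap < used, so memmove() is required here; memcpy()
 * would have undefined behaviour. */
void slide_right(char *buf, size_t cap, size_t used, size_t gap)
{
    assert(buf != NULL);
    assert(gap <= cap && used <= cap - gap); /* result must fit in buf */

    memmove(buf + gap, buf, used);
}
```

Because memmove() is specified to behave as if the bytes went through a temporary buffer, the caller does not need to reason about copy direction at all.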
This "safe" memcpy is not safe either, because it silently copies nothing in cases where the programmer expects a copy. Use memmove to be safe.
You should not use &src and &dest: they are the addresses of the parameters src and dest themselves, not the beginning of the data or buffer.
The same goes for &src_end and &dest_end.
Given two pointers and a length for each of them, how can I safely check if these regions overlap one another?
<, <=, >=, > are not defined when 2 pointers are not related to the same object.
A tedious approach checks the endpoints of one against all the other's elements and takes advantage that the length of the source and destination are the same.
int safe_memcpy(void *dest, const void *src, size_t length) {
if (length > 0) {
unsigned char *d = dest;
const unsigned char *s = src;
const unsigned char *s_last = s + length - 1;
for (size_t i = 0; i < length; i++) {
if (s == &d[i]) return 1; // not safe
if (s_last == &d[i]) return 1; // not safe
}
memcpy(dest, src, length);
}
return 0;
}
If the buffer lengths differ, check the shorter one's endpoints against the addresses of the longer one's elements.
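That generalisation to different lengths might look like the sketch below. The name regions_overlap is hypothetical; the point is that it only ever uses == on pointers, which is defined for any two valid pointers, and never forms a one-past-the-end pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: reports whether [a, a+alen) and [b, b+blen) share
 * a byte, using only == on pointers, never < or >.
 * Cost: one pass over the longer region. */
bool regions_overlap(const void *a, size_t alen, const void *b, size_t blen)
{
    if (alen == 0 || blen == 0)
        return false;
    assert(a != NULL && b != NULL);

    const unsigned char *shorter = a, *longer = b;
    size_t short_len = alen, long_len = blen;
    if (alen > blen) {
        shorter = b;
        longer = a;
        short_len = blen;
        long_len = alen;
    }

    /* If the regions intersect, at least one endpoint of the shorter
     * region must coincide with some element of the longer one. */
    const unsigned char *first = shorter;
    const unsigned char *last = shorter + short_len - 1;
    for (size_t i = 0; i < long_len; i++) {
        if (&longer[i] == first || &longer[i] == last)
            return true;
    }
    return false;
}
```

The linear scan is the price of staying within what the standard defines; for large buffers it is usually more practical to call memmove() and skip the check.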
should I cast everything as a char *
Use unsigned char *.
mem...(), str...() behave as if each array element was unsigned char.
For all functions in this subclause, each character shall be interpreted as if it had the type unsigned char (and therefore every possible object representation is valid and has a different value). C17dr § 7.24.1 3
With rare non-2's complement, unsigned char is important to avoid signed char traps and maintain -0, +0 distinctiveness. Strings only stop on +0.
For functions such as strcmp() and memcmp(), which return an int derived from comparing elements, unsigned char matters when the elements lie outside the range [0...CHAR_MAX], so that the result has the correct sign.
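A minimal sketch of why the element type matters for the sign of the result; compare_bytes is a hypothetical memcmp-like helper, not a library function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical memcmp-like helper: compares bytes as unsigned char, as
 * the standard requires of the mem...() and str...() functions. */
int compare_bytes(const void *a, const void *b, size_t n)
{
    const unsigned char *pa = a;
    const unsigned char *pb = b;

    assert((pa != NULL && pb != NULL) || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (pa[i] != pb[i])
            return pa[i] < pb[i] ? -1 : 1;
    }
    return 0;
}
```

A byte with value 0x80 compares greater than 0x01 here. If the loop read the bytes through a plain char that happens to be signed, 0x80 would be negative and the ordering would flip.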
Even if void * indexing was allowed, void *dest_end = &dest[length - 1]; is very bad when length == 0 as that is like &dest[SIZE_MAX];
&src >= &dest should be src >= dest for even a chance at working.
The addresses of src, dest are irrelevant to the copy, only their values are important.
I suspect this errant code leads to UB in OP's other code.
Should I instead use intptr_t or uintptr_t?
Note that (u)intptr_t are optional types - they might not exist in a conforming compiler.
Even when the types exist, math on the pointers is not defined to be related to math on the integer values.
Clearly, these addresses do not even remotely overlap, hence why I'm thinking I'm triggering UB,
"Clearly" only if one assumes a linear mapping of addresses to integers, something not specified in C.
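On platforms where that linear mapping does hold, an interval test on the integer values is the usual non-portable shortcut. The sketch below is exactly that: regions_overlap_flat is a hypothetical name, and the code assumes uintptr_t exists and that converted pointers behave like byte addresses in a flat address space, neither of which ISO C guarantees.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, NOT portable C: assumes a flat address space in
 * which uintptr_t values order the same way as the addresses do. */
bool regions_overlap_flat(const void *a, size_t alen,
                          const void *b, size_t blen)
{
    uintptr_t ua = (uintptr_t)a;
    uintptr_t ub = (uintptr_t)b;

    /* Neither region may wrap around the end of the address space. */
    assert(alen <= UINTPTR_MAX - ua && blen <= UINTPTR_MAX - ub);

    if (alen == 0 || blen == 0)
        return false;

    /* Half-open intervals [ua, ua+alen) and [ub, ub+blen) intersect. */
    return ua < ub + blen && ub < ua + alen;
}
```

Applied to the addresses in the question (0x7ffc0b75c5fb and 0x1d05420 with a length of 29), this test reports no overlap, which matches the intuition there, but only under the stated assumption.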
The memory is untyped so should I use a char * to "look at" the memory as bytes? If so, should I cast everything as a char *?
Use unsigned char* if you need to dereference the data, or just char* when you want to increment/decrement the pointer value by count of bytes.
It's common to do:
void a_function_that_takes_void(void *x, void *y) {
char *a = x;
char *b = y;
/* uses a and b throughout here */
}
If so, should I cast everything as a char *?
Yes. It's also common to do:
void_pointer = (char*)void_pointer + 1;
Should I instead use intptr_t or uintptr_t?
You could, but that would be the same as using char*, except for a char* to intptr_t conversion.
how can I safely check if these regions overlap one another?
It's good to do some research; see "How to implement overlap-checking memcpy in C".

strcpy()/strncpy() crashes on structure member with extra space when optimization is turned on on Unix?

When writing a project, I ran into a strange issue.
This is the minimal code I managed to write to recreate the issue. I am intentionally storing an actual string in the place of something else, with enough space allocated.
// #include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <stddef.h> // For offsetof()
typedef struct _pack{
// The type of `c` doesn't matter as long as it's inside of a struct.
int64_t c;
} pack;
int main(){
pack *p;
char str[9] = "aaaaaaaa"; // Input
size_t len = offsetof(pack, c) + (strlen(str) + 1);
p = malloc(len);
// Version 1: crash
strcpy((char*)&(p->c), str);
// Version 2: crash
strncpy((char*)&(p->c), str, strlen(str)+1);
// Version 3: works!
memcpy((char*)&(p->c), str, strlen(str)+1);
// puts((char*)&(p->c));
free(p);
return 0;
}
The above code is confusing me:
With gcc/clang -O0, both strcpy() and memcpy() work on Linux/WSL, and the puts() below prints whatever I entered.
With clang -O0 on OSX, the code crashes with strcpy().
With gcc/clang -O2 or -O3 on Ubuntu/Fedora/WSL, the code crashes (!!) at strcpy(), while memcpy() works well.
With gcc.exe on Windows, the code works well whatever the optimization level is.
Also I found some other traits of the code:
(It looks like) the minimum input to reproduce the crash is 9 bytes (including zero terminator), or 1+sizeof(p->c). With that length (or longer) a crash is guaranteed (Dear me ...).
Even if I allocate extra space (up to 1MB) in malloc(), it doesn't help. The above behaviors don't change at all.
strncpy() behaves exactly the same, even with the correct length supplied to its 3rd argument.
The pointer does not seem to matter. If structure member char *c is changed into long long c (or int64_t), the behavior remains the same. (Update: changed already).
The crash message doesn't look regular. A lot of extra info is given along.
I tried all these compilers and they made no difference:
GCC 5.4.0 (Ubuntu/Fedora/OS X/WSL, all are 64-bit)
GCC 6.3.0 (Ubuntu only)
GCC 7.2.0 (Android, norepro???) (This is the GCC from C4droid)
Clang 5.0.0 (Ubuntu/OS X)
MinGW GCC 6.3.0 (Windows 7/10, both x64)
Additionally, this custom string copy function, which looks exactly like the standard one, works well with any compiler configuration mentioned above:
char* my_strcpy(char *d, const char* s){
char *r = d;
while (*s){
*(d++) = *(s++);
}
*d = '\0';
return r;
}
Questions:
Why does strcpy() fail? How can it?
Why does it fail only if optimization is on?
Why doesn't memcpy() fail regardless of -O level??
*If you want to discuss struct member access violations, please head over here.
Part of objdump -d's output of a crashing executable (on WSL):
P.S. Initially I want to write a structure, the last item of which is a pointer to a dynamically allocated space (for a string). When I write the struct to file, I can't write the pointer. I must write the actual string. So I came up with this solution: force store a string in the place of a pointer.
Also please don't complain about gets(). I don't use it in my project, but the example code above only.
What you are doing is undefined behavior.
The compiler is allowed to assume that you will never use more than sizeof(int64_t) bytes for the member int64_t c. So if you try to write more than sizeof(int64_t) bytes (that is, sizeof p->c) into c, you have an out-of-bounds write in your code. That is the case here because sizeof "aaaaaaaa" > sizeof(int64_t).
The point is, even if you allocate the correct memory size using malloc(), the compiler is allowed to assume you will never use more than sizeof(int64_t) bytes in your strcpy() or memcpy() call, because you pass the address of c (an int64_t).
TL;DR: You are trying to copy 9 bytes to a type consisting of 8 bytes (we suppose that a byte is an octet). (From #Kcvin)
If you want something similar use flexible array members from C99:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct {
size_t size;
char str[];
} string;
int main(void) {
char str[] = "aaaaaaaa";
size_t len_str = strlen(str);
string *p = malloc(sizeof *p + len_str + 1);
if (!p) {
return 1;
}
p->size = len_str;
strcpy(p->str, str);
puts(p->str);
strncpy(p->str, str, len_str + 1);
puts(p->str);
memcpy(p->str, str, len_str + 1);
puts(p->str);
free(p);
}
Note: For standard quote please refer to this answer.
I reproduced this issue on my Ubuntu 16.10 and I found something interesting.
When compiled with gcc -O3 -o ./test ./test.c, the program will crash if the input is longer than 8 bytes.
After some reverse engineering I found that GCC replaced strcpy with __memcpy_chk, see this.
// decompile from IDA
int __cdecl main(int argc, const char **argv, const char **envp)
{
int *v3; // rbx
int v4; // edx
unsigned int v5; // eax
signed __int64 v6; // rbx
char *v7; // rax
void *v8; // r12
const char *v9; // rax
__int64 _0; // [rsp+0h] [rbp+0h]
unsigned __int64 vars408; // [rsp+408h] [rbp+408h]
vars408 = __readfsqword(0x28u);
v3 = (int *)&_0;
gets(&_0, argv, envp);
do
{
v4 = *v3;
++v3;
v5 = ~v4 & (v4 - 16843009) & 0x80808080;
}
while ( !v5 );
if ( !((unsigned __int16)~(_WORD)v4 & (unsigned __int16)(v4 - 257) & 0x8080) )
v5 >>= 16;
if ( !((unsigned __int16)~(_WORD)v4 & (unsigned __int16)(v4 - 257) & 0x8080) )
v3 = (int *)((char *)v3 + 2);
v6 = (char *)v3 - __CFADD__((_BYTE)v5, (_BYTE)v5) - 3 - (char *)&_0; // strlen
v7 = (char *)malloc(v6 + 9);
v8 = v7;
v9 = (const char *)_memcpy_chk(v7 + 8, &_0, v6 + 1, 8LL); // Fourth argument is 8!!
puts(v9);
free(v8);
return 0;
}
Your struct pack makes GCC believe that the member c is exactly 8 bytes long.
And __memcpy_chk will fail if the copy length is larger than its fourth argument!
So there are 2 solutions:
Modify your structure.
Use the compile option -D_FORTIFY_SOURCE=0 (for example gcc test.c -O3 -D_FORTIFY_SOURCE=0 -o ./test) to turn off the fortified functions.
Caution: This will fully disable buffer overflow checking in the whole program!!
No answer has yet talked in detail about why this code may or may not be undefined behaviour.
The standard is underspecified in this area, and there is a proposal active to fix it. Under that proposal, this code would NOT be undefined behaviour, and the compilers generating code that crashes would fail to comply with the updated standard. (I revisit this in my concluding paragraph below).
But note that based on the discussion of -D_FORTIFY_SOURCE=2 in other answers, it seems this behaviour is intentional on the part of the developers involved.
I'll talk based on the following snippet:
char *x = malloc(9);
pack *y = (pack *)x;
char *z = (char *)&y->c;
char *w = (char *)y;
Now, all three of x z w refer to the same memory location, and would have the same value and the same representation. But the compiler treats z differently to x. (The compiler also treats w differently to one of those two, although we don't know which as OP didn't explore that case).
This topic is called pointer provenance. It means the restriction on which object a pointer value may range over. The compiler is taking z as having a provenance only over y->c, whereas x has provenance over the entire 9-byte allocation.
The current C Standard does not specify provenance very well. The rule that pointer subtraction may only occur between two pointers into the same array object is one example of a provenance rule. Another provenance rule is the one that applies to the code we are discussing, C 6.5.6/8:
When an expression that has integer type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integer expression. In other words, if the expression P points to the i-th element of an array object, the expressions (P)+N (equivalently, N+(P)) and (P)-N (where N has the value n) point to, respectively, the i+n-th and i−n-th elements of the array object, provided they exist. Moreover, if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object, and if the expression Q points one past the last element of an array object, the expression (Q)-1 points to the last element of the array object. If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined. If the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated.
The justification for bounds-checking of strcpy, memcpy always comes back to this rule - those functions are defined to behave as if they were a series of character assignments from a base pointer that's incremented to get to the next character, and the increment of a pointer is covered by (P)+1 as discussed in this rule.
Note that the term "the array object" may apply to an object that wasn't declared as an array. This is spelled out in 6.5.6/7:
For the purposes of these operators, a pointer to an object that is not an element of an array behaves the same as a pointer to the first element of an array of length one with the type of the object as its element type.
The big question here is: what is "the array object"? In this code, is it y->c, *y, or the actual 9-byte object returned by malloc?
Crucially, the standard sheds no light whatsoever on this matter. Whenever we have objects with subobjects, the standard does not say whether 6.5.6/8 is referring to the object or the subobject.
A further complicating factor is that the standard does not provide a definition for "array", nor for "array object". But to cut a long story short, the object allocated by malloc is described as "an array" in various places in the standard, so it does seem that the 9-byte object here is a valid candidate for "the array object". (In fact this is the only such candidate for the case of using x to iterate over the 9-byte allocation, which I think everyone would agree is legal).
Note: this section is very speculative and I attempt to provide an argument as to why the solution chosen by the compilers here is not self-consistent
An argument could be made that &y->c means the provenance is the int64_t subobject. But this immediately leads to difficulty. For example, does y have the provenance of *y? If so, (char *)y should still have the provenance of *y, but then this contradicts the rule of 6.3.2.3/7 that casting a pointer to another type and back should return the original pointer (as long as alignment is not violated).
Another thing it doesn't cover is overlapping provenance. Can a pointer compare unequal to a pointer of the same value but a smaller provenance (which is a subset of the larger provenance) ?
Further, if we apply that same principle to the case where the subobject is an array:
char arr[2][2];
char *r = (char *)arr;
++r; ++r; ++r; // undefined behavior - exceeds bounds of arr[0]
arr is defined as meaning &arr[0] in this context, so if the provenance of &X is X, then r is actually bounded to just the first row of the array -- perhaps a surprising result.
It would be possible to say that char *r = (char *)arr; leads to UB here, but char *r = (char *)&arr; does not. In fact I used to promote this view in my posts many years ago. But I no longer do: in my experience of trying to defend this position, it just can't be made self-consistent, there are too many problem scenarios. And even if it could be made self-consistent, the fact remains that the standard doesn't specify it. At best, this view should have the status of a proposal.
To finish up, I would recommend reading N2090: Clarifying Pointer Provenance (Draft Defect Report or Proposal for C2x).
Their proposal is that provenance always applies to an allocation. This renders moot all the intricacies of objects and subobjects. There are no sub-allocations. In this proposal, all of x z w are identical and may be used to range over the whole 9-byte allocation. IMHO the simplicity of this is appealing, compared to what was discussed in my previous section.
This is all because of -D_FORTIFY_SOURCE=2 intentionally crashing on what it decides is unsafe.
Some distros build gcc with -D_FORTIFY_SOURCE=2 enabled by default. Some don't. This explains all the differences between different compilers. Probably the ones that don't crash normally will if you build your code with -O3 -D_FORTIFY_SOURCE=2.
Why does it fail only if optimization is on?
_FORTIFY_SOURCE requires compiling with optimization (-O) to keep track of object sizes through pointer casts / assignments. See the slides from this talk for more about _FORTIFY_SOURCE.
Why does strcpy() fail? How can it?
gcc calls __memcpy_chk for strcpy only with -D_FORTIFY_SOURCE=2. It passes 8 as the size of the target object, because that's what it thinks you mean / what it can figure out from the source code you gave it. Same deal for strncpy calling __strncpy_chk.
__memcpy_chk aborts on purpose. _FORTIFY_SOURCE may be going beyond things that are UB in C and disallowing things that look potentially dangerous. This gives it license to decide that your code is unsafe. (As others have pointed out, a flexible array member as the last member of your struct, and/or a union with a flexible-array member, is how you should express what you're doing in C.)
gcc even warns that the check will always fail:
In function 'strcpy',
inlined from 'main' at <source>:18:9:
/usr/include/x86_64-linux-gnu/bits/string3.h:110:10: warning: call to __builtin___memcpy_chk will always overflow destination buffer
return __builtin___strcpy_chk (__dest, __src, __bos (__dest));
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
(from gcc7.2 -O3 -Wall on the Godbolt compiler explorer).
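The check that the fortified call performs can be pictured with a small stand-in. This is not glibc's implementation: checked_copy is a hypothetical function that returns an error where the real __memcpy_chk aborts the program, and dst_size plays the role of the object size the compiler worked out (8 in this question).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the idea behind __memcpy_chk: the caller
 * passes the size the compiler believes the destination object has, and
 * the copy is refused if n exceeds it. */
int checked_copy(void *dst, const void *src, size_t n, size_t dst_size)
{
    assert(dst != NULL && src != NULL);

    if (n > dst_size)
        return -1;      /* would overflow the believed object size */
    memcpy(dst, src, n);
    return 0;
}
```

With the question's values, checked_copy(&p->c, str, 9, 8) is rejected even though malloc() returned enough memory, because the check looks only at the size of the member, not at the size of the allocation.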
Why doesn't memcpy() fail regardless of -O level?
IDK.
gcc fully inlines it as just an 8-byte load/store plus a 1-byte load/store. (This seems like a missed optimization: it should know that malloc didn't modify the data on the stack, so it could just store it from immediates again instead of reloading, or better, keep the 8-byte value in a register.)
Why make things complicated? Overcomplicating things the way you are doing just leaves more room for undefined behaviour, in this part:
memcpy((char*)&p->c, str, strlen(str)+1);
puts((char*)&p->c);
warning: passing argument 1 of 'puts' from incompatible pointer type [-Wincompatible-pointer-types]
puts(&p->c);
you're clearly ending up in an unallocated memory area or somewhere writable if you're lucky...
Optimizing or not may change the values of the addresses, and it may work (since the addresses match), or not. You just cannot do what you want to do (basically lying to the compiler)
I would:
allocate just what's needed for the struct, don't take the length of the string inside into account, it's useless
don't use gets as it's unsafe and obsolescent
use strdup instead of the bug-prone memcpy code you're using since you're handling strings. strdup won't forget to allocate the nul-terminator, and will set it in the target for you.
don't forget to free the duplicated string
read the warnings; puts(&p->c) is undefined behaviour
test.c:19:10: warning: passing argument 1 of 'puts' from incompatible pointer type [-Wincompatible-pointer-types]
puts(&p->c);
My proposal
int main(){
pack *p = malloc(sizeof(pack));
char str[1024];
fgets(str,sizeof(str),stdin);
p->c = strdup(str);
puts(p->c);
free(p->c);
free(p);
return 0;
}
Your pointer p->c is the cause of the crash.
First, allocate the struct with the size of "unsigned long long" plus the size of "*p".
Second, allocate the required amount of space for the pointer p->c.
Then do the copy: strcpy(p->c, str);
Finally, free in order: first free(p->c), then free(p).
I think that was it.
[EDIT]
I'll insist.
The cause of the error is that your structure only reserves space for the pointer; nothing allocates the memory that the pointer should refer to and that will receive the copied data. Take a look:
int main()
{
pack *p;
char str[1024];
gets(str);
size_t len_struc = sizeof(*p) + sizeof(unsigned long long);
p = malloc(len_struc);
p->c = malloc(strlen(str) + 1); // +1 for the terminating '\0'
strcpy(p->c, str); // This does not crash!
puts(p->c);
free(p->c);
free(p);
return 0;
}
[EDIT2]
This is not a traditional way to store data but this works:
pack2 *p;
char str[9] = "aaaaaaaa"; // Input
size_t len = sizeof(pack) + (strlen(str) + 1);
p = malloc(len);
// Version 1: crash
strcpy((char*)p + sizeof(pack), str);
free(p);

How do I decide where to put the Asterisk?

will these lines do the same? If not, what's the difference? How do I decide where to put the asterisk?
#define OUT1 *((portptr ) SIGDATA_ADR)
#define OUT1 ((portptr *) SIGDATA_ADR)
Ok, sorry for the vague problem description.
What I'm trying to do is a function that continuously reads the value of two switches, makes a XOR, and puts it on a LED ramp.
My program looks like this; and it should work.
typedef unsigned char *port8ptr;
#define OUT *((port8ptr) 0x400)
#define IN1 *((port8ptr) 0x600)
#define IN2 *((port8ptr) 0x601)
void DipSwitchEor( void )
{
while( 1 )
{
OUT = IN1 ^ IN2;
}
}
So I'm just curious if I could have written #define OUT ((port8ptr *) 0x400) instead. I'm getting mixed answers.
There is a huge difference between these two macro definitions
#define OUT *((port8ptr) 0x400)
#define OUT ((port8ptr*) 0x400)
The first one first makes a port8ptr (cast) that points at memory address 0x400, then it dereferences this pointer (* operator outside of parentheses) to yield the unsigned char that is located at memory address 0x400.
The second one makes a pointer to a port8ptr that points at memory address 0x400. The result has the same type as a variable foo declared with port8ptr* foo;. That is, the result is a double pointer to an unsigned char, the char being located in memory wherever the pointer stored at memory address 0x400 is pointing at.
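To see the difference between the two casts concretely, here is a sketch that can run on a hosted system. Everything in it beyond port8ptr is made up for illustration: fake_regs stands in for the device registers at 0x400, 0x600 and 0x601, and the macros REG_VALUE and REG_AS_PTR are hypothetical names for the two forms.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char *port8ptr;

/* Hypothetical stand-in for the device registers, so the sketch can run
 * without real hardware: [0] plays OUT, [1] plays IN1, [2] plays IN2. */
unsigned char fake_regs[3];

#define REG_VALUE(addr)  (*((port8ptr)(addr)))  /* the byte stored at addr */
#define REG_AS_PTR(addr) ((port8ptr *)(addr))   /* addr seen as pointer-to-pointer */

void xor_switches(void)
{
    REG_VALUE(&fake_regs[0]) = REG_VALUE(&fake_regs[1]) ^ REG_VALUE(&fake_regs[2]);
}

void show_difference(void)
{
    /* The first form designates one unsigned char ... */
    assert(sizeof REG_VALUE(fake_regs) == sizeof(unsigned char));
    /* ... while dereferencing the second form would yield a whole pointer.
     * sizeof does not evaluate its operand, so nothing is read here. */
    assert(sizeof *REG_AS_PTR(fake_regs) == sizeof(port8ptr));
}
```

On real hardware the typedef would normally be volatile unsigned char * so that the compiler does not cache or drop the register accesses.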
If all you need is to tell what various C declaration constructs mean, I think that you should try http://cdecl.org. It is also a utility that you can install on your computer. Probably what you want with your example is the code below, assuming that 0x600, 0x601 and 0x400 are the addresses of your registers:
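void dip_switch(void) {
    /* Integer-to-pointer conversions need an explicit cast; volatile keeps
       the compiler from caching or dropping the register accesses. */
    volatile unsigned char *in1 = (volatile unsigned char *)0x600;
    volatile unsigned char *in2 = (volatile unsigned char *)0x601;
    volatile unsigned char *out = (volatile unsigned char *)0x400;
    *out = (*in1) ^ (*in2);
}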
Using #defines will only bite you in a situation like this. You should probably steer away from it.
The two definitions of OUT1 are not semantically equivalent.
In the first case SIGDATA_ADR is cast to the pointer type portptr and then dereferenced, so OUT1 expands to the data pointed to by SIGDATA_ADR. This makes sense and is likely the correct semantics; given your subsequent example of the use case, it is certainly correct.
In the second case, (portptr *) is a pointer-to-pointer type, so OUT1 expands to SIGDATA_ADR cast to a pointer-to-pointer, which seems unlikely to be what you want.
Note that in the first case the * is the dereference operator, and the second it is not an operator but a data type modifier. They are not the same thing - they just happen to be represented by the same character - context is everything; understanding that probably makes determining which is correct far simpler.

C pointers void * buffer problem

Sorry for bothering you all with the C stuff.
write() takes a void *buff, and I need to call this function from main() with the required data.
But when I print the data, the compiler complains. Help me out, friends.
Code is as follows.
void write(int fd, void *buff,int no_of_pages)
{
// some code that writes buff into a file using system calls
}
Now I need to pass buff with the data I need.
#include "stdio.h"
#include "malloc.h"
int main()
{
int *x=(int*)malloc(1024);
*(x+2)=3192;
*(x+3)="sindhu";
printf("\n%d %s",*(x+2),*(x+3));
write(2,x,10); //(10=4bytes for int + 6 bytes for char "sindhu");
}
It warns me
warning: format ‘%s’ expects type ‘char *’, but argument 3 has type ‘int’
How can I remove this warning?
By casting to a valid type:
printf("\n%d %s",*(x+2),(char*)(x+3));
Note: What you're doing looks evil. I'd reconsider this design!
Quite simply: do as the error says. Do not pass an integer to a string formatting sequence.
printf("\n%d %d", *(x+2), *(x+3));
^--- note the change
You need to use a char * to reference a string:
char * cp = "sindhu";
printf("\n%d %s", *(x+2), cp);
would be better.
There are actually a couple of interesting points in your question. Firstly, I am surprised that printf is generating a warning at all. That is rather helpful of your compiler, since printf is inherently not type safe and the compiler is not required to diagnose this. Secondly, I am actually amazed that your compiler is allowing this:
*(x+3) = "sindhu";
I am pretty sure that should be an error, or at the very least a warning, without an explicit cast. Note that "sindhu" is a char array that decays to a char * here, while your buffer is an array of int. So essentially what you are doing is putting the memory address of the string into the 4th integer in your array. The important thing is that this makes the very dangerous assumption that:
sizeof(int) == sizeof(char*)
This can easily fail to be the case; most notably, on many 64-bit systems an int is 4 bytes while a pointer is 8 bytes.
Bitmask's answer will eliminate the warning you are receiving. However, as he suggests, I strongly advise that you change the design of your program so that this is not necessary.
Also, as one final stylistic point, remember that for the most part arrays and pointers in C behave alike. This is not entirely true, but suffice it to say that *(x+2) is equivalent to x[2], which is rather easier on the eyes when reading the code.
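If the address of the string really has to be kept in an integer, a hedged sketch of the safer route is to use intptr_t from <stdint.h> rather than int. The function names below are hypothetical, and intptr_t is an optional type in the standard, although common hosted platforms provide it.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 if a char * is no wider than an int on this platform,
   i.e. if the question's assumption happens to hold, 0 otherwise. */
int pointer_fits_in_int(void)
{
    return sizeof(char *) <= sizeof(int);
}

/* Stores a pointer in an integer wide enough to hold it and converts
   it back. The round trip through intptr_t is guaranteed for void *. */
const char *round_trip_via_intptr(const char *s)
{
    intptr_t as_integer = (intptr_t)(const void *)s;
    return (const char *)(const void *)as_integer;
}
```

On a typical 64-bit desktop pointer_fits_in_int() is expected to return 0, which is exactly the case where *(x+3) = "sindhu" would lose part of the address.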
int *x=(int*)malloc(1024);
Lose the cast; it's not necessary, and it will suppress a useful diagnostic if you forget to #include <stdlib.h> or otherwise don't have a declaration for malloc in scope. Secondly, it's generally better from a readability standpoint to specify the number of elements you need of a specific type, rather than a number of bytes. You'd do that like so:
int *x = malloc(N * sizeof *x);
which says "allocate enough memory to store N int values".
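A slightly fuller sketch of that allocation pattern, with the error checks you would normally want. make_int_array is a made-up helper name, not something from your code.

```c
#include <stdint.h>
#include <stdlib.h>

/* Allocates room for n ints and zeroes them.
   Returns NULL if n is too large or the allocation fails. */
int *make_int_array(size_t n)
{
    int *x;

    if (n > SIZE_MAX / sizeof *x)      /* n * sizeof *x would overflow */
        return NULL;

    x = malloc(n * sizeof *x);         /* no cast; size follows the type of x */
    if (x == NULL)
        return NULL;

    for (size_t i = 0; i < n; i++)
        x[i] = 0;
    return x;
}
```

The caller owns the result and has to free() it. Because the size is written as sizeof *x, changing the element type of x later does not require touching the malloc call.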
*(x+2)=3192;
Okay. You're assigning the integer value 3192 to x[2].
*(x+3)="sindhu";
Bad juju; I'm surprised the compiler didn't yak on this line. You're attempting to store a value of type char * to an int (since the type of x is int *, the type of *(x + 3) is int). I'm not sure what you're trying to accomplish here; if you're trying to store the value of the pointer at x[3], note that pointer values may not necessarily be representable as an int (for example, suppose a char * is 4 bytes wide but an int is 2 bytes wide). In either case the types are not compatible, and a cast is required:
*(x + 3) = (int) "sindhu"; // equivalent to writing x[3] = (int) "sindhu"
If you're trying to copy the contents of the string to the buffer starting at x[3], this is definitely the wrong way to go about it; to make this "work" (for suitably loose definitions of "work"), you would need to use either the strcpy or memcpy library functions:
strcpy((char *) (x + 3), "sindhu"); // note the cast, and the fact that
// x + 3 is *not* dereferenced.
As for the problem in the printf statement, the type of *(x + 3) is int, not char *, which is not compatible with the %s conversion specifier. Again, to make this "work", you'd do something like
printf("%d %s\n", *(x + 2), (char *) (x + 3));
You really don't want to store different types of data in the same memory buffer in such an unstructured way; unless you really know what you're doing, it leads to massive heartburn.
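One structured alternative, as a rough sketch: put the int and the string into a struct so each field has its own type. The struct layout, the 16-byte name field and the function names are all assumptions made for this example.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record: one integer plus a short, fixed-size name. */
struct record {
    int  id;
    char name[16];
};

/* Fills in a record. Returns 0 on success, -1 if name does not fit. */
int record_init(struct record *r, int id, const char *name)
{
    if (strlen(name) >= sizeof r->name)
        return -1;
    r->id = id;
    strcpy(r->name, name);   /* safe: the length was checked above */
    return 0;
}

/* Formats the record like the question's printf: "<id> <name>". */
int record_format(const struct record *r, char *out, size_t out_size)
{
    return snprintf(out, out_size, "%d %s", r->id, r->name);
}
```

A pointer to such a struct converts to void * without a cast, so it can still be handed to a function that takes a void * buffer, along with sizeof(struct record) as the length. No casts are needed when printing either, since r->id is an int and r->name is a char array.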

Resources