Is memcpy of array in C Vaxocentrist? [closed] - c

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
Is a memcpy of a chunk of one array to another in C guilty of Vaxocentrism?
Example:
double A[10];
double B[10];
// ... do stuff ...
// copy elements 3 to 7 in A to elements 2 to 6 in B
memcpy(B+2, A+3, 5*sizeof(double));
As a related question, is casting from an array to a pointer Vaxocentrist?
char A[10];
char* B = (char*)A;
B[0]=2;
A[1]=3;
B[2]=5;
I certainly appreciate the idea of writing code that works under different machine architectures and different compilers, but if I applied type safety to the extreme it would cripple many of C's useful features! How much / little can I assume about how the compiler implements arrays/pointers/etc.?

No. The model on which memcpy works is defined in the abstract machine specified by the C language standard and has nothing to do with any particular physical machine it might be running on. In particular, all objects in C have a representation which is defined as an overlaid array of type unsigned char[sizeof object], and memcpy works on this representation.
Likewise, the 'decay' of arrays to pointers via cast or implicit conversion is completely defined on the abstract machine and has nothing to do with physical machines.
Further, none of the points 1-14 in the linked article have anything to do with the code you're asking about.
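To make the abstract-machine view concrete, here is a minimal sketch that copies a slice of a double array twice: once with memcpy, and once by walking the object representation one unsigned char at a time. The helper names copy_bytes and copy_doubles are made up for illustration. Under the model described above, both copies should leave the destination in the same state on any conforming implementation.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy n bytes one unsigned char at a time,
 * which is the object-representation model memcpy is defined on.
 * The regions must not overlap. */
void copy_bytes(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    for (size_t i = 0; i < n; i++)
        d[i] = s[i];
}

/* Hypothetical helper: copy count doubles from src to dst with memcpy.
 * The regions must not overlap. */
void copy_doubles(double *dst, const double *src, size_t count)
{
    memcpy(dst, src, count * sizeof *src);
}
```

Calling copy_doubles(B + 2, A + 3, 5) reproduces the copy from the question: elements 3 to 7 of A land in elements 2 to 6 of B.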

In C code, memcpy() can be a useful optimization in a couple of cases. First, if the block of memory is very small, the compiler can often inline the copy directly instead of calling a function. This can be a big win in a tight loop that runs a lot. Second, if the array is very large and the hardware supports a faster mode of memory access for certain aligned cases, that faster code can be used for the vast majority of the memory. You honestly do not want to know the scary details of alignment and copy operations for different hardware; it is better to put that logic in memcpy() and let everyone use it.

For your first example, the + operator is used correctly: B+2 and &B[2] are the same pointer, so the version below is a stylistic alternative rather than a fix. The copy is safe because both arrays have 10 elements, array elements are stored contiguously starting at element 0, and the copy stays within the bounds of both the source A and the destination B.
memcpy(&B[2], &A[3], 5*sizeof(double));
On your related point, the cast is likewise harmless but unnecessary, since an array converts to a pointer to its first element; you could also write the following:
char A[10];
char* B = &A[0];
B[0]=2;
A[1]=3;
B[2]=5;
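As a quick check on the pointer forms used in this thread, the sketch below compares A + 3 with &A[3], and an implicitly converted array with an explicit cast and with &S[0]. The function name same_address_forms is hypothetical. The C standard defines A[i] as *(A + i), so these comparisons should hold on every conforming implementation.

```c
#include <stdbool.h>

/* Hypothetical check: returns true when the pointer spellings agree. */
bool same_address_forms(void)
{
    double A[10] = {0};
    char   S[10] = {0};

    char *p = S;            /* implicit array-to-pointer conversion */
    char *q = (char *)S;    /* explicit cast, redundant here        */
    char *r = &S[0];        /* address of the first element         */

    return (A + 3 == &A[3]) && (p == q) && (q == r);
}
```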

Related

Can pointers manipulate memory at will? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 1 year ago.
int a=10;
char *b ;
b=(char*)&a;
strcpy(b,"xxxxx");
printf("%s",b);
The code compiles, but the program exits with an error. Why doesn't this work? What is the underlying mechanism?
It is likely that, in your C implementation, int is four bytes. The C standard defines a char to use one byte. So b = (char *) &a; sets b to point to the first byte of the four that make up a. We do not know what lies after those four bytes.
strcpy(b, "xxxxx"); asks strcpy to copy six bytes (the five “x” characters and a terminating null character) to the memory pointed to by b. In the simplest case, this will overwrite two bytes beyond a. This can disrupt your program in a variety of ways—it can corrupt some other data the compiler stored there, it can make your stack frame unusable, it can corrupt a return address, and other things can go wrong.
Additionally, when the compiler translates and optimizes your program, it relies on guarantees made to it by the C standard, such as that the operation strcpy(b, …) will not write outside of the properly defined object pointed to by b, which is a. When you violate those guarantees, the C standard does not define the resulting behavior, and the translations and optimizations made by the compiler may cause your program to go awry in unexpected ways.
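One way to avoid this kind of overrun is to check the destination size before copying. The sketch below uses a hypothetical helper, copy_if_fits, that refuses to copy when the string plus its terminating null character would not fit. It is an illustration only, not a standard library function.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into dst only if the string and its
 * terminating null character fit in dst_size bytes.
 * Returns 0 on success, -1 if the string is too long. */
int copy_if_fits(char *dst, size_t dst_size, const char *src)
{
    size_t needed = strlen(src) + 1;   /* +1 for the terminator */
    if (needed > dst_size)
        return -1;
    memcpy(dst, src, needed);
    return 0;
}
```

With a 4-byte int a, the call copy_if_fits((char *)&a, sizeof a, "xxxxx") would return -1 instead of writing past a, because six bytes are needed.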
int a=10;
char *b ;
b=(char*)&a;
strcpy(b,"xxxxx");
printf("%s",b);
Why doesn't this work?
This doesn't work because strcpy() copies 6 characters (5 times 'x' and one null terminator) to the address pointed to by b, and there is not enough room for that, at least if the compiler you used stores int in 32 bits (4 bytes).
You didn't show the full code, but assuming a is a local variable, it is allocated on the stack. You overflow the space allocated for variable a, which means you overwrite something else on the stack. That data is essential for the program to continue, and overwriting it crashes the program.
"at will"? No they are actually sentient!
If code does not behave as you expect, it is because you have a semantic error: code that is syntactically valid (i.e. it compiles) but does not mean what you think it does when executed according to the rules of the language.
Moreover, as a systems-level language, C does not protect you from doing invalid things to the execution environment (i.e. runtime errors), and such errors generally have undefined behaviour.
In this case:
b=(char*)&a;
strcpy(b,"xxxxx");
b points to an object of int size. On Windows or any 32-bit system, that will normally be 4 bytes. You then copy 6 bytes to it, overrunning its space. The effect of this is undefined, but it is likely that it will corrupt some adjacent variable in memory or the function return address.
If b were corrupted by the strcpy() error, trying to print the string at b would cause a run-time error if b were no longer a valid address.
If the return address were corrupted, the program would fail when you return from the calling function.
In either case the precise behaviour is not defined, and may not be trapped; it depends on what gets corrupted, what value the corrupted data takes, and how and when that corrupted data is used.
You will be able to observe the effects on the variables and/or call stack by running and stepping this code in a debugger.

Why does C not require a garbage collector? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 6 years ago.
My understanding of this has come down to C's origins as a "portable assembler" and the option of less overhead. Is that all there is to it?
First of all, let's be clear about what garbage is.
The Java definition of garbage is objects that are no longer reachable. The precise meaning of reachable is a bit abstruse, but a practical definition is that if you can get to an object by following references (pointers) from well known places like thread stacks or static variables, then it may be reachable. (In practice, some imprecision is OK, so long as objects that are reachable don't get deleted.)
You could try to apply the same definition to C and C++. An object is garbage if it cannot be reached.
However, the practical problem with this definition ... and garbage collection ... in C or C++ is whether a "pointer like" value is actually a valid pointer. For instance:
An uninitialized C variable can contain a random value that looks like a pointer to an object.
When a C union type overlays a pointer with a long, a garbage collector cannot be sure whether the union contains one or the other ... or both.
When C application code "compresses" pointers to word aligned heap nodes by dividing them by 4 or 8, a garbage collector won't detect them as "pointer like". Or if it does, it will misinterpret them.
A similar issue arises when C application code represents pointers as offsets relative to something else.
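The offset case can be sketched as follows. The names to_offset and from_offset are hypothetical; the point is only that the stored value is a small integer rather than an address, so a collector scanning memory for pointer-like values has nothing to recognize.

```c
#include <stddef.h>

/* Hypothetical encoding: store a position inside a pool as a byte
 * offset from the pool's base instead of as a real pointer.
 * p must point into the same array as base. */
size_t to_offset(const char *base, const char *p)
{
    return (size_t)(p - base);
}

/* Recover the real pointer from the base and the stored offset. */
char *from_offset(char *base, size_t offset)
{
    return base + offset;
}
```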
However, it is clear that a C program can call malloc, forget to call free, and then forget the address of the heap node. That node is garbage.
There are two reasons why C / C++ don't have garbage collection.
It is "culturally inappropriate". The culture of these languages is to leave storage management to the programmer.
It would be technically difficult (and expensive) to implement a precise garbage collector for C / C++. Indeed, doing this would involve things that would make the language implementation slow.
Imprecise (i.e. conservative) garbage collectors are practical, but they have performance and (I have heard) reliability issues. (For instance, a conservative collector cannot move non-garbage objects.)
It would be simpler if the implementer (of a C / C++ garbage collector) could assume that the programmer only wrote code that strictly conformed to the C / C++ specs. But programmers don't.
But your question seems to be, why did they design C like that?
Questions like that can only be answered authoritatively by the designers (in this case, the late Dennis Ritchie) or their writings.
As you point out in the question, C was designed to be simple and "close to the hardware".
However, C was designed in the early 1970s. In those days programming languages which required a garbage collector were rare, and GC techniques were not as advanced as they are now.
And even now, it is still a fact that garbage collected languages (like Java) are not suitable for applications that require predictable "real-time" performance.
In short, I suspect that the designers were of the view that garbage collection would make the language impractical for its intended purpose.
There are some garbage collectors built for C or C++:
Please check http://www.hboehm.info/gc/.
As you stated, garbage collection defeats the purpose of the performance claimed by C and C++, as it requires tracking allocations and/or reference counting.

Iterate through memory using pointers [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 6 years ago.
I am new to C; I have more background knowledge with Java.
I want to try searching for a value in memory using a pointer:
#include <stdio.h>
void main(){
int value = 10;
find(value);
}
void find(int value){ // will change this to return array of long or int depending on which is better
int* intPointer;
for(int i = 40000; i < 40400 ; i+= 32){ /* not sure if i do it like this or another way,
* the starting value, ending value and condition
* is just for testing purpose */
intPointer = (int*)i;
if(*intPointer == value){
printf("found value"); //will actually be storing it to an array instead
}
}
}
The find method gives me a segmentation fault, most likely because the address does not store an int value. Is there a way I can find out what type of data is stored at a memory address?
Is there a better way of achieving a similar task? Please explain and show. The actual program will not have int value = ??; instead, only the find method will be used to get an array of addresses which contain this value.
Also, what is the best type to store addresses: int or long?
A couple of things to know right off the bat:
Memory is not always guaranteed to be physically contiguous (even if the memory model appears to be as such).
Knowing what a segfault is would be helpful, as well as what segmented memory and memory protection are in general.
The find method gives me a segmentation fault most likely because the address does not store int value
From what I can tell you are trying to loop through memory addresses and find a specific value. Technically it's possible, but practically there's no reason to do it except a few rare cases.
Unfortunately (for your purposes) your operating system of choice handles memory rather intelligently; instead of just placing everything it is given in a linear order, it divides (or segments) memory into separate address spaces to keep the memory of process A from interfering with the memory of process B, and so forth. To keep track of which processes are where in memory, a mapping mechanism is used.
You tried to access memory outside of your program's assigned segment. To be able to do what it seems you want to do, you're going to need to find a way to get the memory map from the OS.
is there a way I can find out what type data is stored in the memory address
No, at least not in the way you want. A couple SO answers address this.
There is no way to do what you are trying in a generic, portable way. The code has undefined behavior because you are accessing memory not allocated to you.
On simple embedded systems it may be possible. To start with, you will need full insight into the system's memory layout and how the compiler works. For instance, you must know whether int is always 4-byte aligned in memory and the start and end addresses of RAM. If an MMU is present and active, you also need to know how it works.
is there a way I can find out what type data is stored in the memory address.
No
You are faking a pointer with a temporary var inside the loop.
The correct way to search for a value would be to have two arguments: the value to search for, and a collection (array, for example) containing some values to be searched.
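A minimal sketch of that approach, assuming the data to be searched lives in an array the program owns. The function name find_value and its parameters are illustrative; it records the indices of matching elements rather than raw addresses.

```c
#include <stddef.h>

/* Hypothetical helper: search values[0..count-1] for target.
 * Up to out_cap matching indices are written to out.
 * Returns the total number of matches found. */
size_t find_value(const int *values, size_t count, int target,
                  size_t *out, size_t out_cap)
{
    size_t matches = 0;
    for (size_t i = 0; i < count; i++) {
        if (values[i] == target) {
            if (matches < out_cap)
                out[matches] = i;
            matches++;
        }
    }
    return matches;
}
```

Because the loop only touches elements of an array passed in by the caller, it never reads memory outside the program's own objects, which is what caused the segmentation fault in the question.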

CPUs with addressable GPR files, address of register variables, and aliasing between memory and registers [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 8 years ago.
Background
Some CPUs, such as the Atmel AVR, have a general purpose register file that is also addressable as part of main memory -- see Figure 7-2 in section 7.4 and the paragraph after the figure.
What was WG14 thinking?
Given this, why did the C committee choose to make
register int ri;
int* pi = &ri;
universally ill-formed, as per footnote 101 to N1124 section 6.7.1? Wouldn't undefined or implementation-defined behavior make more sense, considering that the code above is meaningful on at least one processor, and C bends over backwards to accommodate far stranger (and scarcer!) targets than the AVR?
101) The implementation may treat any register declaration simply as an auto
declaration. However, whether or not addressable storage is actually used, the address
of any part of an object declared with storage-class specifier register cannot be
computed, either explicitly (by use of the unary & operator as discussed in 6.5.3.2)
or implicitly (by converting an array name to a pointer as discussed in 6.3.2.1). Thus,
the only operator that can be applied to an array declared with storage-class specifier
register is sizeof.
I just changed a CPU register through a pointer. Wat?!
Furthermore, using the GCC explicit register variables extension, it is possible to direct the compiler to place a variable into a specific register. In this case, you can get a pointer that aliases with a register variable, as below:
register int ri asm("r15") = 0;
int* pi = (int*)0x15;
/* pi now aliases ri */
*pi = 42;
/* ri is 42 now */
assert(ri == 42);
How does GCC deal with such a case? It strikes me as truly bizarre that something like this has not been considered...or has it?
C is an abstract language defined without knowledge of the machine that will eventually implement it. The definition of C does not assume that the underlying machine will even have registers in the conventional form (or a stack, or contiguous memory, or many other things irrelevant to this question that are present on real machines).
The point being that register does not mean that the variable should be assigned a machine register. The meaning of the keyword is that the variable cannot have its address taken; the compiler is then theoretically able to perform better optimisations on it because it reduces the number of paths through which the variable can potentially be modified. Taking the address of a register variable isn't meaningful in C, regardless of what processor it runs on, because register is an incredibly badly-named keyword (named for the most obvious optimisation it enables) that specifically means the address should not be taken. That is all it means.
An intelligent compiler for the AVR should be able to make that optimisation without needing you to hint at it (in practice the keyword is useless precisely because any half-decent compiler can detect when it would be applicable, since there's basically no well-defined way to reference an auto object without taking its address explicitly).
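A small sketch of what the keyword does and does not allow; the function sum_to is made up for illustration. The commented-out line is the kind of code a conforming compiler must diagnose, while everything else behaves as it would with plain auto variables.

```c
/* Hypothetical example: sum the integers 1..n using register variables. */
int sum_to(int n)
{
    register int total = 0;
    for (register int i = 1; i <= n; i++)
        total += i;

    /* int *p = &total;   constraint violation: the address of a
     *                    register object cannot be taken */
    return total;
}
```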

accessing AVR registers with C? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
I've been trying to learn everything I can about micro-controllers lately. Since this is self-study, it's taken me a while to learn how the things work at the bare metal. Long story short, I don't want to use the AVR libraries in my C code; I want to access the registers specifically through their addresses using pointers in C. I've searched everywhere online, looked inside the AVR header files, and read a book. If someone could help me out that would be wonderful.
You can cast from an integer to a pointer. It's just a normal cast expression.
volatile char * const port_a = (volatile char *) 0x1B;
Many compilers provide extensions to instruct the linker to place an object at a specific address:
volatile char port_a @ 0x1B; // Or something like this
The advantage is that you don't introduce a global variable to represent the pointer, but it might not do the right thing for a hardware register. You need to read your compiler's manual carefully for your specific platform.
The official AVR headers probably contain something more like this:
#define PORTA (* (volatile char *) 0x1B)
This avoids the global variable and the linker hack, but many also consider using the preprocessor to be a hack.
The only viable solution for production code is to use the official headers. Anything else is only instructional.
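The pointer-cast pattern can be tried on a desktop machine by pointing it at an ordinary variable instead of a hardware address. In the sketch below, REG8, reg_set_bits, and reg_clear_bits are hypothetical names, and a stand-in variable replaces the 0x1B address used above, which is only meaningful on the target chip.

```c
#include <stdint.h>

/* Treat an integer address as a volatile 8-bit register (hypothetical
 * macro, in the style of the PORTA definition above). On real
 * hardware this might be used as: REG8(0x1B) = 0xFF; */
#define REG8(addr) (*(volatile uint8_t *)(uintptr_t)(addr))

/* Set the bits in mask, leaving the other bits unchanged. */
void reg_set_bits(volatile uint8_t *reg, uint8_t mask)
{
    *reg |= mask;
}

/* Clear the bits in mask, leaving the other bits unchanged. */
void reg_clear_bits(volatile uint8_t *reg, uint8_t mask)
{
    *reg &= (uint8_t)~mask;
}
```

The volatile qualifier matters here: it tells the compiler that every read and write must actually happen, which is what a hardware register needs.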
It pretty much depends on the compiler; some use a stricter interpretation of the register keyword.
For example,
unsigned char a @0x0001; will put the variable into the specific register.
Otherwise, you could just assign numeric values to your pointers. That is a big no-no if your program runs under an OS, but if you have guaranteed physical memory whose boundaries you know, it might be acceptable. However, care must be taken that the compiler does not use that register automatically, which is hard to ensure unless you write most of your code in assembly.
So, the variable declaration method (if your compiler supports it) is the better choice, as it guarantees that no other variable will take its place.

Resources