__eds__ WORD __ramspace[0x100] __attribute__((eds,address(0x8000ul),noload));
I want to understand the syntax above (the program is for PIC24, in C), especially __ramspace[0x100]. Can anybody help me?
It's a bit late, but maybe this can help someone else:
__eds__ means you want to put whatever follows into the extended data space. You do this when you want to use the data space beyond a certain address. You can find from which address the extended space begins for your MCU in the datasheet.
WORD means you will reserve whole words (and not, for instance, bytes). For a PIC24 this means 16-bit chunks.
__ramspace[0x100] is a 1D array of 256 elements. When you look at what is written in front of it, you can see you are declaring an array named __ramspace, of size 256 words (so 256 16-bit values), in extended data space (eds).
Now you must declare the offset, i.e. the start address of the array (the physical address where __ramspace[0], the first array element, will be). This is what address(0x8000ul) does.
Finally, you tell the compiler whether the array should be initialised at boot-up (for instance, filled with zeros). In your case there is noload, meaning the array will contain random data at boot-up until you write your own values into it.
Hope this helps.
The __eds__ qualifier is described in the "MPLAB® C Compiler for PIC24 MCUs and dsPIC® DSCs User’s Guide" as:
In the attribute context the eds, for extended data space, attribute
indicates to the compiler that the variable may be allocated
anywhere within data memory. Variables with this attribute will likely
also need the eds type qualifier (see Chapter
6. “Additional C Pointer Types”) in order for the compiler to properly generate the correct access sequence. Note that the eds qualifier
and the eds attribute are closely related, but not identical. On some
devices, eds may need to be specified when allocating variables into
certain memory spaces such as space(ymemory) or space(dma) as this
memory may only exist in the extended data space.
__ramspace is not a special designator, it's just the identifier that was chosen.
__ramspace[0x100] is the only part of that line which is just pure C. :) It declares an array of 0x100 (256, in decimal) elements of type WORD. The name of the array is __ramspace.
See @Brian Cain's answer for details about eds.
The address(0x8000ul) argument to __attribute__() presumably makes the linker put the variable in question at location 0x8000.
I was trying to wrap my head around the concept of a variable.
Obviously it is implicitly clear how a variable works. However, I was trying to explicate my implicit knowledge and ran into some difficulties. Here is what I came up with:
A variable is a container of a certain size.
The size is dependent both on the data type in the declaration of the variable and on the hardware (on what specifically? the word size?).
A variable has an address in memory that is stored within the preallocated size of that container (how is the name of the variable connected to its memory address?).
Within the reserved spot in memory for that variable, a value that corresponds to the data type of the declaration can be stored.
What of that is wrong or not precise (I'm sure much)? How can it be explained better?
In C, a variable consists of two things: an identifier and an object.
An identifier is a string of text that is used in source code to denote the object. (Identifiers may also denote functions, structure members, and other things.)
An object is “a region of data storage in the execution environment, the contents of which can represent values” (C 2018 3.1.15 1).
We generally think of an object as having a certain type. The type determines the meaning of the value stored in an object—the same bits may mean 3.75 when interpreted as a float but 1,081,081,856 when interpreted as an int. The C standard defines some properties of how types are represented (such as that some form of binary is used for integers) and requires C implementations to define the rest (except for certain aspects of bit-fields).
Therefore the “final say” on how any object is represented is up to each C implementation. Most C implementations are influenced by the hardware, as they are designed to work efficiently on their target systems, but a C implementation may provide 37-bit int objects on hardware that uses 32-bit words.
Earlier, I said we generally think of an object as having a certain type. When an identifier for an object is defined, storage is reserved for it. The amount of that storage is determined by the type. However, the actual interpretation of the value of an object depends on the expression used to access it. Almost all the time, we access an object using its declared type: after declaring float x;, we use x = 3.75; printf("%g\n", x);, and so on, and the type used to access the object in these expressions is float, the declared type of x. But C is flexible and allows us to set a char pointer to the memory using char *p = (char *) &x;, and then we can access the bytes of x using p[0], p[1], and so on. In this case, the type of the expression used to access the object, or its parts, is char, so we get char values instead of float values when using these expressions to access the object.
The compiler knows and arranges the connection between an identifier and its storage (memory). When an identifier for an object is defined, the compiler will plan storage for it (subject to program optimization by the compiler). That storage may be in a data section of the program or in the stack section or somewhere else. The compiler knows of ways to refer to the storage. Locations in the stack may be referred to by offsets relative to a stack pointer or a frame pointer. Locations in data sections may be referred to by offsets relative to a base address stored in a particular register by the program loader. Locations may be referred to by offsets relative to section starts or by absolute memory addresses. Whatever the case may be, when the compiler needs to generate instructions that access an object, it generates suitable instructions. This may be an instruction that includes in the instruction itself an offset relative to the stack pointer. Or it could be two or more instructions that add the offset to a base register and then use the result to access memory. Or it could be a partially generated instruction that is later completed when the program loader adjusts it to have the final address.
I was playing with C, and I just discovered that a and &a yield to the same result that is the address to the first element of the array. By surfing here over the topics, I discovered they are only formatted in a different way. So my question is: where is this address stored?
This is an interesting question! The answer will depend on the specifics of the hardware you're working with and what C compiler you have.
From the perspective of the C language, each object has an address, but there's no specific prescribed mechanism that accounts for how that address would actually be stored or accessed. That's left up to the compiler to decide.
Let's imagine that you've declared your array as a local variable, and then write something like array[137], which accesses the 137th element of the array. How does the generated program know how to find your array? On most systems, the CPU has a dedicated register called the stack pointer that keeps track of the position of the memory used for all the local variables of the current function. As the compiler translates your C code into an actual executable file, it maintains an internal table mapping each local variable to some offset away from where the stack pointer points. For example, it might say something like "because 64 bytes are already used up for other local variables in this function, I'm going to place array 64 bytes past where the stack pointer points." Then, whenever you reference array, the compiler generates machine instructions of the form "look 64 bytes past the stack pointer to find the array."
Now, imagine you write code like this:
printf("%p\n", array); // Print address of array
How does the compiler generate code for this? Well, internally, it knows that array is 64 bytes past the stack pointer, so it might generate code of the form "add 64 to the stack pointer, then pass that as an argument to printf."
So in that sense, the answer to your question could be something like "the hardware stores a single pointer called the stack pointer, and the generated code is written in a way that takes that stack pointer and then adds some value to it to get to the point in memory where the array lives."
Of course, there are a bunch of caveats here. For example, some systems have both a stack pointer and a frame pointer. Interpreters use a totally different strategy and maintain internal data structures tracking where everything is. And if the array is stored at global scope, there's a different mechanism used altogether.
Hope this helps!
It isn't stored anywhere - it's computed as necessary.
Unless it is the operand of the sizeof, _Alignof, or unary & operators, or is a string literal used to initialize a character array in a declaration, an expression of type "N-element array of T" is converted ("decays") to an expression of type "pointer to T", and the value of the expression is the address of the first element of the array.
When you declare an array like
T a[N]; // for any non-function type T
what you get in memory is
+---+
| | a[0]
+---+
| | a[1]
+---+
...
+---+
| | a[N-1]
+---+
That's it. No storage is materialized for any pointer. Instead, whenever you use a in any expression, the compiler will compute the address of a[0] and use that instead.
Consider this C code:
int x;
void foo(void)
{
int y;
...
}
When implementing this program, a C compiler will need to generate instructions that access the int objects named x and y (and any objects the function allocates dynamically, e.g. with malloc). How does it tell those instructions where the objects are?
Each processor architecture has some way of referring to data in memory. This includes:
The machine instruction includes some bits that identify a processor register. The address in memory is in that processor register.
The machine instruction includes some bits that specify an address.
The machine instruction includes some bits that specify a processor register and some bits that specify an offset or displacement.
So, the compiler has a way of giving an address to the processor. It still needs to know that address. How does it do that?
One way is the compiler could decide exactly where everything in memory is going to go. It could decide it is going to put all the program’s instructions at addresses 0 to 10,000, and it is going to put data at 10,000 and on, and that x will go at address 12300. Then it could write an instruction to fetch x from address 12300. This is called absolute addressing, and it is rarely used anymore because it is inflexible.
Another option is that the compiler can let the program loader decide where to put the data. When the software that loads the program into memory is running, it will read the executable, see how much space is needed for instructions, how much is needed for data that is initialized to zero, how much space is needed for data with initial values listed in the executable file, how much space is needed for data that does not need to be initialized, how much space is requested for the stack, and so on. Then the loader will decide where to put all of these things. As it does so, it will set some processor registers, or some tables in memory, to contain the addresses where things go.
In this case, the compiler may know that x goes at displacement 2300 from the start of the “zero-initialized data” section, and that the loader sets register r12 to contain the base address of that section. Then, when the compiler wants to access x, it will generate an instruction that says “Use register r12 plus the displacement 2300.” This is largely the method used today, although there are many embellishments involving linking multiple object modules together, leaving a placeholder in the object module for the name x that the linker or loader fills in with the actual displacement as they do their work, and other features.
In the case of y, we have another problem. There can be two or more instances of y existing at once. The function foo might call itself, which causes there to be a y for the first call and a different y for the second call. Or foo might call another function that calls foo. To deal with this, most C implementations use a stack. One register in the processor is chosen to be a stack pointer. The loader allocates a large amount of space and sets the stack pointer register to point to the “top” of the space (usually the high-address end, but this is arbitrary). When a function is called, the stack pointer is adjusted according to how much space the new function needs for its local data. When the function executes, it puts all of its local data in memory locations determined by the value of the stack pointer when the function started executing.
In this model, the compiler knows that the y for the current function call is at a particular offset relative to the current stack pointer, so it can access y using instructions with addresses such as “the contents of the stack pointer plus 84 bytes.” (This can be done with a stack pointer alone, but often we also have a frame pointer, which is a copy of the stack pointer at the moment the function was called. This provides a firmer base address for working with local data, one that might not change as much as the stack pointer does.)
In either of these models, the compiler deals with the address of an array the same way it deals with the address of a single int: It knows where the object is stored, relative to some base address for its data segment or stack frame, and it generates the same sorts of instruction addressing forms.
Beyond that, when you access an array, such as a[i], or possibly a multidimensional array, a[i][j][k], the compiler has to do more calculations. To do this, the compiler takes the starting address of the array and does the arithmetic necessary to add the offsets for each of the subscripts. Many processors have instructions that help with these calculations—a processor may have an addressing form that says “Take a base address from one register, add a fixed offset, and add the contents of another register multiplied by a fixed size.” This will help access arrays of one dimension. For multiple dimensions, the compiler has to write extra instructions to do some of the calculations.
If, instead of using an array element, like a[i], you take its address, as with &a[i], the compiler handles it similarly. It will get a base address from some register (the base address for the data segment or the current stack pointer or frame pointer), add the offset to where a is in that segment, and then add the offset required for i elements. All of the knowledge of where a[i] is is built into the instructions the compiler writes, plus the registers that help manage the program’s memory layout.
Yet one more point of view, a TL;DR answer if you will: When the compiler produces the binary, it stores the address everywhere where it is needed in the generated machine code.
The address may be just plain number in the machine code, or it may be a calculation of some sort, such as "stack frame base address register + a fixed offset number", but in either case it is duplicated everywhere in the machine code where it is needed.
In other words, it is not stored in any one location. Talking more technically, &some_array is not an lvalue, and trying to take the address of it, &(&some_array), will produce a compiler error.
This actually applies to all variables, array is not special in any way here. The address of a variable can be used in the machine code directly (and if compiler actually generates code which does store the address somewhere, you have no way to know that from C code, you have to look at the assembly code).
The one thing special about arrays, which seems to be the source of your confusion, is that some_array is basically a more convenient syntax for &(some_array[0]), while &some_array means something else entirely.
Another way to look at it:
The address of the first element doesn't have to be stored anywhere.
An array is a chunk of memory. It has an address simply because it exists somewhere in memory. That address may or may not have to be stored somewhere depending on a lot of things that others have already mentioned.
Asking where the address of the array has to be stored is like asking where reality stores the location of your car. The location doesn't have to be stored - your car is located where your car happens to be - it's a property of existing. Sure, you can make a note that you parked your car in row 97, spot 114 of some huge lot, but you don't have to. And your car will be wherever it is regardless of your note-taking.
I read an article about Dynamically Sized Arrays on ITJungle and was wondering if this is not an "making easy thing much more complex" thing.
So as I understand it, if I define a static variable, including arrays, the runtime reserves the needed space at RUNTIME. So when defining an array of CHAR(10) DIM(10), the whole space would be reserved when the program starts.
So, as the article says, if I want a dynamically growing array that resizes itself to fit the data, like a List<String> in C#, I have to create a CHAR(10) DIM(10) and then re-allocate new space only if needed?
Why? The space is already reserved. What reason would someone have to base an array of (let's say) 100 bytes on a pointer when only needing, e.g., 80 bytes?
Am I just missing something? Is the "init-value" for sizing the array just to calm down the compiler so I don't get an error that the "compiler doesn't know the size at compile time"?
For normal arrays, you are correct that the space gets allocated at runtime as soon as the particular array's scope is entered (start of the program for globals, start of the subprocedure for subprocedures).
However, you will notice that the data structure is declared with based(pInfo). based is the keyword that causes the memory NOT to be allocated. The compiler instead assumes that all the memory for the data structure (including the array member) is already allocated at the location specified by the pointer passed to the based keyword (pInfo in this case).
Effectively, once you use the based keyword you are simply telling the compiler how you would like the memory at the specified pointer to be used but it is up to you to actually manage that memory.
In summary, if I understand your question properly, the statement you made about "knowing the size at compile time" is correct. RPG does not support pointer/array duality or array-like objects like some languages so you essentially just have to declare to RPG that you will NEVER go beyond "init-value" bounds.
My brain goes numb just imagining this, so bear with me if my question is a little wordy. I've sliced it into parts.
1) What do we have in the bits/bytes starting at the address of a function? At an integer variable's address, we visualize 4 bytes (for 32-bit systems) of 1s and 0s that represent the number in binary form. For a character variable we visualize a single byte with the ASCII value of the character. For a double we visualize 8 bytes accordingly. But what on earth should I visualize in the bytes starting at the address of a function? I know that a stack frame is created when a function is invoked, but what about the function itself? At its address, do we have the function's expressions, ifs, loops, etc. in binary form? Are the bits/bytes representing a function too complicated for a human to visualize, unlike, say, integers?
2) Can we use sizeof on a function? Why or why not? If we have no idea how to determine the size allocated to a function, then how do functions have addresses? If they have addresses, they must have a size, and since we have pointers to functions, how do those pointers determine how many bytes to interpret starting at the pointed-to address? After all, we can use those pointers to invoke the functions.
Please be generous with the details. Books and Google haven't been helpful in this regard.
It can be anything at all. It is not required to be anything specific.
No. A function's address is just the entry point. There's no requirement that it, for example, even occupy consecutive memory locations.
Usually, the function address is where the actual machine code for that function begins. There's no reliable way to tell where the function ends. Some platforms might lay out functions as they appear in the source code, one after the other. But other platforms, particularly ones with interprocedural optimization (IPO), won't be nearly as simple.
In most C implementations, a pointer to a function is implemented as an address of the start of the function’s machine code. The bytes at that address are the bytes of the instructions that are executed when the function is called.
In some implementations, a pointer to a function is implemented as an address of data about the function, such as data that contains the address of the machine code and a description of the function’s parameters or register use.
This answer is just for educational purposes, because these details are not part of the C standard and vary between implementations.
1.
I usually visualize the memory pointed to by a function pointer as the assembler mnemonics themselves instead of a stream of bytes. If you're on an architecture with fixed-width instructions, you can visualize it as an array of integers, each encoding a different instruction.
2.
No, you can't. There are some great answers on SO that explain why you can't use sizeof on a function, but it basically boils down to the fact that the code for a function isn't guaranteed to be contiguous, so its size can't be determined. A compiler could emit instructions that jump into another function if it wanted to (ironically, this is roughly what happens when you call a function or invoke a function pointer ;) ).
It is perfectly possible and valid to have an address of something and not know its size - just look at a void pointer for example. Just as we don't know the size of the data a void pointer points to, we don't know the size of code that a function pointer points to.
I am not getting the whole purpose of working with the byte size of a variable when I know its address. For example, let's say I know where an int variable is stored, say at address 0x8C729A09; if I want the int stored at that address, I can just dereference the address and get the number stored there.
So, what exactly is the purpose of knowing the byte size of the variable? Why does it matter whether the variable has 4 bytes (being an int) or 8 bytes, if I can get its value just by dereferencing the address? I am asking because I am working on dereferencing some addresses, and I thought I needed to go through a for loop to read a variable (knowing the start address, which is the address of the variable, and its size in bytes), but whenever I do this I just get other variables that are also declared.
A little bit of context: I am working on a tool called Pin and getting the addresses of the global variables declared in another program.
The for case looks something like this:
for (address1 = (char *) 0x804A03C, limit = address1 + bytesize; address1 < limit; address1++)
    cout << *address1 << "\n";
Michael Krelin gave a very compact answer but I think I can expand on it a bit more.
In any language, not just C, you need to know the size for a variety of reasons:
This determines the maximum value that can be stored
The memory space an array of those values will take (1000 bytes will get you 250 ints or 125 longs).
When you want to copy one array of values into another, you need to know how many bytes are used to allocate enough space.
While you may dereference a pointer and get the whole value, you can also access just a portion of that value, but only if you know how many bytes it is composed of. You could get the high half of an int by grabbing just the first two bytes, and the low half by grabbing the last two bytes.
Different architectures may have different sizes for different variables, which would impact all the above points.
Edit:
Also, there are certainly cases where you may need to know the number of bits that a given variable is made of. If you want 32 booleans, what better variable to use than a single int, which is made of 32 bits? Then you can use some constant masks to address each bit, and now you have an "array" of booleans. These are usually called bit flags or bit-fields (correct me if I am wrong). In programming, every detail can matter, just not all the time for every application. Just figured that might be an interesting thought exercise.
The answer is simple: the internal representation of most types needs more than one byte. In order to dereference a pointer you (either you or the compiler) need to know how much bytes should be read.
Also consider working with strings: you cannot always rely on the terminating \0, hence you need to know how many bytes you have to read. Examples of this are functions like memcpy or strncmp.
Suppose you have an array of variables. How do you find the variable at a non-zero index without knowing the element size? And how many bytes do you allocate for an array of non-zero length?