I'm trying to write a simple soft CPU in C that will work on an imaginary machine for an embedded application. I'm new to this, so bear with me.
I've been trying to do this in an IDE, but run into an issue where I need to malloc the memory and am not getting a consistent memory address for allocating my registers, so I'm unable to run tests and debug. On an actual piece of hardware, I understand that the documentation would give me the addresses of specific registers, main memory, and hard disk memory, correct? I'd like to be able to define macros for my registers that I can then pass around to read/write, but this seems impossible without static memory addresses.
So it seems like I need a good way to allocate a static chunk of memory with static addresses, either in an IDE or on my own machine with a text editor. What would be the best way to do this? For reference, I'm using Cloud9 IDE but can't figure out how to do it in this platform.
Thanks!
You should do something like uint8_t* const address_space = calloc( memory_size, sizeof(uint8_t) );, check the return value of course, and then make all your machine addresses indices into the array, like address_space[dest] = register[src];. If your emulated CPU can handle data of different sizes or has less strict alignment restrictions than your host CPU, you would need to use memcpy() or pointer casts to transfer data.
Your debugger will understand expressions like address_space[i] whether address_space is statically or dynamically allocated, but you can statically allocate it if you know the exact size in advance, such as to emulate a machine with 16-bit addresses that always has exactly 65,536 bytes of RAM.
Related
I was wondering what is the equivalent of the command Peek and Poke (Basic and other variants) in C. Something like PeekInt, PokeInt (integer).
Something that deals with memory banks, I know in C there are lots of ways to do this and I am trying to port a Basic program to C.
Is this just using pointers? Do I have to use malloc?
Yes, you will have to use pointers. No, it does not necessarily involve malloc.
If you have a memory address ADDRESS and you want to poke a byte value into it, in C that's
char *bytepointer = ADDRESS;
*bytepointer = bytevalue;
As a concrete example, to poke the byte value 0h12 to memory address 0h3456, that would look like
char *bytepointer = 0x3456;
*bytepointer = 0x12;
If on the other hand you want to poke in an int value, that looks like
int *intpointer = ADDRESS;
*intpointer = intvalue;
"Peeking" is similar:
char fetched_byte_value = *bytepointer;
int fetched_int_value = *intpointer;
What's happening here is basically that these pointer variables like bytepointer and intpointer contain memory addresses, and the * operator accesses the value that the pointer variable points to. And this "access" can be either for the purpose of fetching (aka "peeking"), or storing ("poking").
See also question 19.25 in the C FAQ list.
Now, with that said, I have two cautions for you:
Pointers are an important but rather deep concept in C, and they can be tricky to understand at first. If you've never used a pointer, if you haven't gotten to the Pointers chapter in a good C textbook, the superficial introduction I've given you here isn't going to be enough, and there are all sorts of pitfalls waiting for you if you don't learn about them first.
Setting a pointer to point to an absolute, numeric memory address, and then accessing that location, is a low-level, high-powered technique that ends up being rather rare these days. It's useful (and still commonly used) if you're doing embedded programming, if you're using a microcontroller to access some hardware, or perhaps if you're writing an operating system. In more ordinary, "applications" level programming, on the other hand, there just aren't any memory addresses which you're allowed to access in this way.
What values do you want to peek or poke, to accomplish what? What is that Basic program supposed to do? If you're trying to, for example, poke values into display memory so they show up on the screen, beware that on a modern operating system (that is, anything newer than MS-DOS), it doesn't work that way any more.
Yes, its exactly pointers.
to read from the location pointed by some pointer code value = *pointer. (peek)
to write to that location, code *pointer = value. (poke)
notice the asterisk sign before pointer.
malloc is used when you need to allocate runtime memory in heap, but its not the only way to get a memory pointer. you can get a pointer simply by using & before name of a variable.
int value = 2;
int *myPointer = &value;
remember to free any memory allocated with malloc after you are done with it. See also the valgrind tool.
also I recommend to stick with non malloc solution because of its simplicity and easy maintenance.
you can call a function many time to allocate new memory for its local variables and work with them.
The concept of peek/poke (to directly access memory) in C is irrelevant - you probably don't need them and if you do you really should ask a specific question about the code you are trying to write. If you are building 32 or 64 bit code for a modern platform, peek() / poke() are largely irrelevant in any event and unlikely to do what you expect - unless you expect a memory protection fault! You will be digging out some antique compiler if you are building 16-bit x86 "real-mode" code.
C is a systems-level language and as such operating directly on memory fundamental to the language. It is also a compiled rather than interpreted language and you would normally access a memory location through a variable/symbol and let the linker resolve/locate its address. However in some cases (in embedded systems code or kernel level device drivers for example) you might need to access a specific physical address which you might do as follows:
#include <stdint.h>
const uint16_t* const KEYBOARD_STATE_FLAGS_ADDR = ((uint16_t*)0x0417) ;
...
// Peek keyboard state
keyboard_flags = *KEYBOARD_STATE_FLAGS_ADDR ;
If you really want functions to observe/modify arbitrary address locations then for example:
uint16_t peek16( uint32_t addr )
{
return *((uint16_t*)addr) ;
}
void poke16( uint32_t addr, uint16_t value )
{
*((uint16_t*)addr) = value ;
}
... and perhaps corresponding implementations for 8, 32 and even 64 bit access. However good luck avoiding a SEG-FAULT exception on a modern system where memory is virtualised and protected - as I said it largely makes no sense.
Something that deals with memory banks,
I assume given the [qbasic] tag that you are referring to 16-bit x86 segment:offset addressing rather than "memory banks" - that is an architecture specific thing, irrelevant on modern systems. If you really need to deal with that (i.e. are targeting 16-bit x86 C code), you need to mention that - it is a very different issue, specific to a particular obsolete architecture. You would also need to specify what compiler you are using, because it involves non-standard compiler extensions (and to be frank, techniques I have long forgotten; it would be software archaeology).
Is this just using pointers?
Essentially yes, in C pointers are addresses, but in 16-Bit x86 code there is the added complication of the segmented memory architecture and the concept of near and far pointers. If you are porting this code to a modern system, it is unlikely to be a simple matter of porting peek/poke commands verbatim - they are unlikely to work. If you are are just translating 16-bit QBasic code to 16-bit C - why?!
Do I have to use malloc?
No - or at least that depends on the semantics of the code you are porting, but unlikely, and it is not relevant to peek/poke specifically. malloc allocates memory from a system heap and provides its address - which is non-deterministic. peek/poke are a means of accessing specific memory addresses.
In the end it is probably not a matter of directly implementing the peek/poke commands in the code you are porting, and whether that would work or not would depend on the platform and architecture you are porting the code to. Most likely you would be better off looking at the higher level semantics of the code you are porting and implementing that in a manner that suits the platform you are targeting or better be platform independent and therefore portable.
Basically when porting you should ask your self what does this bit of code do, and port that rather than a line-by-line translation. And that consideration should be done at as high a level as possible to make the best use of the target language an/or its library. peek/poke were typically used for either accessing machine features no exposed through some higher lever interface, or for directly accessing hardware or video-buffers. On a modern platform much of that is either unnecessary or won't work - and probably both.
Is it possible to share an array of pointers between multiple kernels in OpenCL. If so, how would I go about implementing it? If I am not completely mistaken - which may though be the case - the only way of sharing things between kernels would be a shared cl_mem, however I also think these cannot contain pointers.
This is not possible in OpenCL 1.x because host and device have completely separate memory spaces, so a buffer containing host pointers makes no sense on the device side.
However, OpenCL 2.0 supports Shared Virtual Memory (SVM) and so memory containing pointers is legal because the host and device share an address space. There are three different levels of granularity though, which will limit what you can have those pointers point to. In the coarsest case they can only refer to locations within the same buffer or other SVM buffers currently owned by the device. Yes, cl_mem is still the way to pass in a buffer to a kernel, but in OpenCL 2.0 with SVM that buffer may contain pointers.
Edit/Addition: OP points out they just want to share pointers between kernels. If these are just device pointers, then you can store them in the buffer in one kernel and read them from the buffer in another kernel. They can only refer to __global, not __local memory. And without SVM they can't be used on the host. The host will of course need to allocate the buffer and pass it to both kernels for their use. As far as the host is concerned, it's just opaque memory. Only the kernels know they are __global pointers.
I ran into a similar problem, but I managed to get around it by using a simple pointer structure.I have doubts about the fact that someone says that buffers change their position in memory,perhaps this is true for some special cases.But this definitely cannot happen while the kernel is working with it. I have not tested it on different video cards, but on nvidia(cl 1.2) it works perfectly, so I can access data from an array that was not even passed as an argument into the kernel.
typedef struct
{
__global volatile point_dataT* point;//pointer to another struct in different buffer
} pointerBufT;
__kernel void tester(__global pointerBufT * pointer_buf){
printf("Test id: %u\n",pointer_buf[coord.x+coord.y*img_width].point->id);//Retrieving information from an array not passed to the kernel
}
I know that this is a late reply, but for some reason I have only come across negative answers to similar questions, or a suggestion to use indexes instead of pointers. While a structure with a pointer inside works great.
I'm playing with a simple cache simulator I wrote, and I want to know if it's possible to allocate a virtual page manually through Linux so I can test way conflicts.
I understand this is doubtful and probably not something even considered in Linux's design, and it is clearly easier to test this in a different manner (just passing a value for the address), but I just thought I'd throw this question out for my own curiosity.
I would have something like:
char *p1 = (char *)SomeLiteral;
*p1 = value1;
dcache.writeback(p1);
char *p2 = (char *)ADifferentLiteral;
*p2 = value2;
//may map to same set index and be brought to second way
dcache.writeback(p2);
This would probably work on some embedded systems, but it's obviously going to page fault under Linux. So, is there a way to allocate a virtual page for p1 and p2? Or even set the virtual address for a program's heap?
I apologize if this sounds obtuse, and thanks!
If your system supports it, you can allocate single 1GB huge page then mlock() it so it won't be swapped out.
This page should be large enough that your experiments can all fit inside of it easily, but you'll need to know the cache hashing/placement algorithm to be sure.
Also, you won't know the most significant bits of the physical address of the page, and that leads me to another point - you may need to consider what inputs the cache placement algorithm takes on your system - indexing and tagging can each be done by physical or virtual addresses, you should review your system's architecture to see how or if this impacts your research.
If you want it to allocate physical pages then just go through and touch each of the pages by writing a single value to it. Say you need 1024 4k pages preallocated, then call malloc, then walk the addresses in 4k steps and write a single value at each of those addresses.
I haven't coded in a while, so excuse me upfront. I have this odd problem. I am trying to malloc 8GB in one go and I plan to manage that heap with TLSF later on. That is, I want to avoid mallocing throughout my application at all, just get one big glob at the beginning and freeing it in the end. Here is the peculiarity though. I was always using dlmalloc until now in my programs. Linking it in and everything went well. However, now when I try to malloc 8GB at once and link in dlmalloc to use it I get segmentation fault 11 on OSX when I run it, without dlmalloc everything goes well. Doesn't matter if I use either gcc or clang. System doesn't have 8GB of RAM though, but it has 4GB. Interestingly enough same thing happens on Windows machine which has 32GB of RAM and Ubuntu one that has 16GB of RAM. With system malloc it all works, allocation goes through and simple iteration through allocated memory works as expected on all three systems. But, when I link in dlmalloc it fails. Tried it both with malloc and dlmalloc function calls.
Allocation itself is nothing extraordinary, plain c99.
[...]
size_t bytes = 1024LL*1024LL*1024LL*8LL;
unsigned long *m = (unsigned long*)malloc(bytes);
[...]
I'm confused by several things here. How come system malloc gives me 8GB malloc even without system having 4GB or RAM, are those virtual pages? Why dlmalloc doesn't do the same? I am aware there might not be a continuos block of 8GB of RAM to allocate, but why segmentation fault then, why not a null ptr?
Is there a viable robust (hopefully platform neutral) solution to get that amount of RAM in one go from malloc even if I'm not sure system will have that much RAM?
edit: program is 64-bit as are OS' which I'm running on.
edit2:
So I played with it some more. Turns out if I break down allocation into 1GB chunks, that is 8 separate mallocs, then it works with dlmalloc. So it seems to be an issue with contiguous range allocation where dlmalloc probably tries to allocate only if there is a contiguous block. This makes my question then even harder to formulate. Is there a somewhat sure way to get that size of a memory chunk with or without dlmalloc across platforms, and not have it fail if there is no physical memory left (can be in swap, as long as it doesn't fail). Also would it be possible in a cross platform manner to tell if malloc is in ram or swap.
I will give you just a bit of perspective, if not an outright answer. When I see you attempting to allocate 8GB of contiguous RAM, I cringe. Yes, with 64-bit computing and all, that is probably "legal", but on a normal machine, you are probably going to run into a lot of edge cases, 32-bit legacy code choking on a 64-bit size, and just plain usability issues getting a chunk of memory big enough to make this work. If you want to try this sort of thing, perhaps attempt to malloc the single chunk, then if that fails, use smaller chunks. This somewhat defeats the purpose of a 1 chunk system though. Perhaps there is some sort of "page size" in the OS that you could link your malloc size to - in order to help performance and just plain ability to get memory in the amount you wish.
On game consoles, this approach to memory management is somewhat common - allocate 1 buffer from the OS at bootup as big as possible, then place your own memory manager on there to avoid OS overhead and possible inferior allocation code. It also allows one to better control memory fragmentation on such systems where virtual memory doesn't exist. But on these systems, you also know up front exactly how much RAM you have.
Is there a way to see if memory is physical or virtual in a platform independent way? I don't think so, but perhaps someone else can give a good answer to that and I'll edit this part away.
So not a 100% answer, but some random thoughts to help out and my internally wondering what you are doing that wants 8GB of RAM in one chunk when it sounds like multiple chunks will work fine. :)
I've been thinking for day about the following question:
In a common pc when you allocate some memory, you ask for it to the OS that keeps track of which memory segments are occupied and which ones are not, and don't let you mess around with other programs memory etc.
But what about a microcontroller, I mean a microcontroller doesn't have an operating system running so when you ask for a bunch of memory what is going on? you cannot simply acess the memory chip and acess a random place cause it may be occupied... who keeps track of which parts of memory are already occupied, and gives you a free place to store something?
EDIT:
I've programmed microcontrollers in C... and I was thinking that the answer could be "language independent". But let me be more clear: supose i have this program running on a microcontroller:
int i=0;
int d=3;
what makes sure that my i and d variables are not stored at the same place in memory?
I think the comments have already covered this...
To ask for memory means you have some operating system managing memory that you are mallocing from (using a loose sense of the term operating system). First you shouldnt be mallocing memory in a microcontroller as a general rule (I may get flamed for that statement). Can be done in cases but you are in control of your memory, you own the system with your application, asking for memory means asking yourself for it.
Unless you have reasons why you cannot statically allocate your structures or arrays or use a union if there are mutually exclusive code paths that might both want much or all of the spare memory, you can try to allocate dynamically and free but it is a harder system engineering problem to solve.
There is a difference between runtime allocation of memory and compile time. your example has nothing to do with the rest of the question
int i=0;
int d=3;
the compiler at compile time allocates two locations in .data one for each of those items. the linker and/or script manages where .data lives and what its limitations are on size, if .data is bigger than what is available you should get a linker warning, if not then you need to fix your linker commands or script to match your system.
runtime allocation is managed at runtime and where and how it manages the memory is determined by that library, even if you have plenty of memory a bad or improperly written library could overlap .text, .data, .bss and/or the stack and cause a lot of problems.
excessive use of the stack is also a pretty serious system engineering problem which coming from non-embedded systems is these days often overlooked because there is so much memory. It is a very real problem when dealing with embedded code on a microcontroller. You need to know your worst case stack usage, leave room for at least that much memory if you are going to have a heap to dynamically allocate, or even if you statically allocate.