C code, why is address 0xFF00 being cast to a struct? - c

I am trying to understand some Linux kernel driver code written in C for a USB Wi-Fi adapter. Line 1456 in file /drivers/net/wireless/rtl818x/rtl8187/dev.c (just in case anyone wanted to refer to the kernel code for context) reads:
priv->map = (struct rtl818x_csr *)0xFF00;
I am curious about what exactly the right operand is doing here - (struct rtl818x_csr *)0xFF00;. I have been interpreting this as saying "cast memory address 0xFF00 to be of type rtl818x_csr and then assign it to priv->map". If my interpretation is correct, what is so special about memory address 0xFF00 that the driver can reliably tell that what it is after will always be at this address? The other thing I am curious about is that 0xFF00 is only 16-bits. I would be expecting 32/64-bits if it were casting a memory address.
Can anyone clarify exactly what is going on in this line of code? I imagine there's a flaw in my understanding of the C syntax.

0xFF00 is an address in the IO address space of the system. If you look in the code, the address is never directly dereferenced but accessed through IO functions.
For example, in the call
rtl818x_iowrite8(priv, &priv->map->EEPROM_CMD,
RTL818X_EEPROM_CMD_CONFIG);
which then calls Linux kernel low level IO functions.
The address is cast to a pointer to a struct to give access to offsets from the address, for example:
0xFF00 + offsetof(struct rtl818x_csr, EEPROM_CMD)
Note that in the rtl818x_iowrite8 call above, no dereference occurs when passing the &priv->map->EEPROM_CMD argument: because of the & operator, only the address + offset is computed. The actual access is performed later, within the low-level functions called inside rtl818x_iowrite8.
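A minimal sketch of the address-plus-offset computation described above. The struct demo_csr layout is hypothetical (it is NOT the real struct rtl818x_csr), and the function names are made up for illustration. Taking a member's address through a pointer that does not point at a real object is not strictly blessed by the C standard, but it is the idiom drivers rely on, and nothing is read or written here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register layout, NOT the real struct rtl818x_csr. */
struct demo_csr {
    uint8_t  MAC[6];
    uint8_t  reserved[2];
    uint32_t TX_CONF;
    uint8_t  EEPROM_CMD;
};

/* The base "address" is only a number with a type attached to it. */
#define DEMO_BASE ((struct demo_csr *)(uintptr_t)0xFF00)

/* Address of EEPROM_CMD computed from the typed pointer; nothing is read. */
uintptr_t eeprom_cmd_addr_via_pointer(void)
{
    return (uintptr_t)&DEMO_BASE->EEPROM_CMD;
}

/* The same address computed by hand: base plus member offset. */
uintptr_t eeprom_cmd_addr_via_offsetof(void)
{
    return (uintptr_t)0xFF00 + offsetof(struct demo_csr, EEPROM_CMD);
}

/* Both computations must agree; aborts otherwise. */
void check_addresses_agree(void)
{
    assert(eeprom_cmd_addr_via_pointer() == eeprom_cmd_addr_via_offsetof());
}
```

With this made-up layout, EEPROM_CMD sits at offset 12 on common ABIs, so both functions yield 0xFF0C. The real driver then hands such an address to its I/O helper instead of dereferencing it.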

Casting an absolute address to a pointer to a structure is a common way in drivers to access the (memory mapped) registers of a device as a normal C structure.
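Using 0xFF00 works because the constant is a non-negative value (65280), so it is not sign-extended when it is converted to the wider pointer type: the resulting address is 0xFF00, not 0xFFFFFF00.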
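The remark about sign extension can be checked on a hosted compiler with a small sketch. The function names are hypothetical, and converting 0xFF00 to int16_t is implementation-defined; the comment describes what GCC and Clang do on two's-complement machines.

```c
#include <assert.h>
#include <stdint.h>

/* 0xFF00 as written is a non-negative constant; widening keeps the value. */
uintptr_t widen_constant(void)
{
    return (uintptr_t)0xFF00;
}

/* Forcing the same bit pattern through a signed 16-bit type first
   gives -256 (on GCC/Clang), which widens with the high bits set. */
uintptr_t widen_signed16(void)
{
    int16_t s = (int16_t)0xFF00;
    return (uintptr_t)s;
}

/* The plain constant must come out unchanged; aborts otherwise. */
void check_no_sign_extension(void)
{
    assert(widen_constant() == 0xFF00u);
}
```

Only the second form produces an address full of leading 1 bits, which is why writing the constant directly in the cast is safe.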

You have to consider this from the device point of view.
Starting at address 0xFF00 inside the address space mapped for the rtl8187 device is a memory range that holds information structured the same way as the rtl818x_csr struct defined here.
So after you logically map that region you can start doing bus reads and writes on it to control the device. The driver source has plenty of examples (I had to cut the hyperlinks because I don't have the reputation necessary to post more than 3, but you get the point). If you read the entire file you'll see that reads and writes are sprinkled everywhere.
In order to understand why that structure looks that way and why 0xFF00 is used instead of 0xBEEF or 0xDEAD you'll have to consult the datasheet for that device.
So if you want to start looking at kernel code, and especially device drivers, you'll have to have more than just the code. You'll need the datasheet or specifications as well. These can be rather difficult to find (see the gazillions of email threads and articles soliciting open documentation from the vendors).
Anyway, I hope I answered your question.
Happy hacking!

Related

How are addresses resolved by a compiler in a medium memory model?

I'm new to programming small/medium memory models CPUs. I am working with an embedded processor that has 256KB of flash code space contained in addresses 0x00000 to 0x3FFFF, and with 20KB of RAM contained in addresses 0xF0000 to 0xFFFFF. There are compiler options to choose between small, medium, or large memory models. I have medium selected. My question is, how does the compiler differentiate between a code/flash address and a RAM address?
Take for example I have a 1 byte variable at RAM address 10, and I have a const variable at the real address 10. I did something like:
value = *((unsigned char *)10);
How would the compiler choose between the real address 10 or the (virtual?) address 10. I suppose if I wanted to specify the value at real address 10 I would use:
value = *((const unsigned char *)10);
?
Also, can you explain the following code which I believe is related to the answer:
uint32_t var32; // 32 bit unsigned integer.
unsigned char *ptr; // 2 byte pointer.
ptr = (unsigned char *)5;
var32 = (uint32_t)ptr;
printf("%lu", (unsigned long)var32);
The code prints 983045 (0xf0005 hex). It seems unrealistic, how can a 16 bit variable return a value greater than what 16 bits can store?
Read your compiler's documentation to find out details about each memory model.
It may have various sorts of pointer, e.g. char near * being 2-byte, and char far * being 4-byte. Alternatively (or as well as), it might have instructions for changing code pages which you'd have to manually invoke.
how can a 16 bit variable return a value greater than what 16 bits can store?
It can't. Your code converts the pointer to a 32-bit int, and 0xF0005 can fit in a 32-bit int. Based on your description, I'd guess that char * is only pointing to the data area, and you would use a different sort of pointer to point to the code area.
I tried to comment on Matt's answer but my comment was too long, and I think it might be an answer, so here's my comment:
I think this is an answer, I'm really looking for more details though. I've read the manual but it doesn't have much information on the topic. You are right, the compiler has near/far keywords you can use to manually specify the address (type?). I guess the C compiler knows if a variable is a near or far pointer, and if it's a near pointer it generates instructions that map the 2 byte near pointer to a real address; and these generated mapping instructions are opaque to the C programmer. That would be my only guess. This is why the pointer returns a value greater than its 16 bit value; the compiler is mapping the address to an absolute address before it stores the value in var32. This is possible because 1) the RAM addresses begin at 0xF0000 and end at 0xFFFFF, so you can always map a near address to its absolute address by or'ing the address with 0xF0000, and 2) there is no overlap between a code (far) pointer and a near pointer or'd with 0xF0000. Can anyone confirm?
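The mapping guessed at above can be modelled in portable C. This is only a sketch of the arithmetic, assuming the memory map from the question (RAM at 0xF0000 to 0xFFFFF); the function names are hypothetical, and the real compiler may implement the widening differently.

```c
#include <assert.h>
#include <stdint.h>

#define RAM_BASE 0xF0000UL  /* start of RAM in the question's memory map */
#define RAM_END  0xFFFFFUL  /* end of RAM in the question's memory map */

/* Widen a 16-bit near (data) pointer value to a 20-bit absolute address. */
uint32_t near_to_abs(uint16_t near_ptr)
{
    return (uint32_t)(RAM_BASE | near_ptr);
}

/* Narrow an absolute RAM address back to its 16-bit near form. */
uint16_t abs_to_near(uint32_t abs_addr)
{
    assert(abs_addr >= RAM_BASE && abs_addr <= RAM_END);
    return (uint16_t)(abs_addr & 0xFFFFUL);
}
```

near_to_abs(5) gives 0xF0005, which is the 983045 printed in the question. Generating assembly for the cast, as suggested in the next answer, is the way to confirm whether the compiler really does this.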
My first suggestion would be to read the documentation; however, as I can see, you have already done that.
So my assumption is that you have ended up working on, for example, a large existing codebase that was developed with a not very widely supported compiler on a not very well known architecture.
In such a case (after all my attempts at acquiring proper documentation had failed), my approach would be to generate assembler output for test programs and analyse it. I did this a while ago, so this is not out of thin air (it was an 8051 PL/M compiler running on an MDS-70, which was emulated by a DOS-based emulator from the late 80s, for which DOS was emulated by DOSBox; yes, and for the huge codebase we needed to maintain we couldn't get around this mess).
So build simple programs that do something with some pointers, compile them without optimizations to assembly (or request an assembly dump, whatever the compiler can do for you), and understand the output. Try to cover all the pointer types and memory models you know of in your compiler. It will clarify what is happening, and hopefully the existing documentation will also help once you understand its gaps this way. Finally, don't stop at understanding just enough for the immediate problem; try to document the gaps properly, so that later you won't need to redo the experiments to figure out things you had once almost worked out.

In what machine can (* (void (*)()) 0)() be used?

C traps and pitfalls 2.1
I thought 0 is always an invalid address. How could he put a function at that position?
It's architecture dependent.
From the book:
I once talked to someone who was writing a C program that was going to run stand-alone in a small microprocessor (answer right here). When this machine was switched on, the hardware would call the subroutine whose address was stored in location 0. In order to simulate turning power on, we had to devise a C statement that would call this subroutine explicitly. After some thought, we came up with the following:
(*(void(*)())0)();
For microprocessors/microcontrollers, you have raw access to any RAM/Flash address unless prohibited in hardware. Therefore accessing address 0 on a microprocessor is completely valid.
I think that (* (void (*)()) 0) means that it is trying to invoke a function that is located in memory at address 0x00000000 (which probably is an invalid address).
A very similar question on Stack Overflow, "What does this C statement mean?", may help.
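To unpack the expression (*(void(*)())0)(), here is a sketch that splits it into a typedef, a cast and a call. It uses the address of an ordinary function rather than 0 so it can run on a hosted system; handler_t, fake_reset, reset_call_count and call_at are hypothetical names, and converting between integers and function pointers is implementation-defined, although it works on mainstream platforms.

```c
#include <assert.h>
#include <stdint.h>

/* Pointer to a function taking no arguments and returning nothing. */
typedef void (*handler_t)(void);

/* Hypothetical stand-in for the reset routine; it only counts its calls. */
static int reset_calls = 0;

void fake_reset(void)
{
    reset_calls++;
}

int reset_call_count(void)
{
    return reset_calls;
}

/* Treat addr as the address of a function and call it: the same
   cast-then-call shape as (*(void(*)())0)(), with 0 replaced by addr. */
void call_at(uintptr_t addr)
{
    assert(addr != 0);  /* on a hosted OS, address 0 is not callable */
    (*(handler_t)addr)();
}
```

Calling call_at((uintptr_t)&fake_reset) runs fake_reset once. On the bare microprocessor from the book, the same shape with address 0 jumps to whatever code lives there.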

What memory addresses are available for use?

How do I find out what memory addresses are suitable for use?
More specifically, an example of how to use a specific address is here: Pointer to a specific fixed address, but it gives no information on why this is a valid address for reading/writing.
I would like a way of finding out that addresses x to y are usable.
This is so I can do something similar to memory mapped IO without a specific simulator. (My linked question is relevant so I can use one set of addresses for testing on Ubuntu, and another for the actual software-on-chip.)
Ubuntu specific answers please.
You can use whatever memory address malloc() returns. Moreover, you can specify how much memory you need. And with realloc() you even can change your mind afterwards.
You're mixing two independent topics here. The question that you're linking to is about a microcontroller's memory mapped IO. It refers to the ATM128, a microcontroller from Atmel. The OP of that question is trying to write to one of its registers, and these registers are given specific addresses.
If you're trying to write to the address of a register, you need to understand how memory mapped IO works, and you need to read the spec for the chipset/IC you're working on. Asking for "Ubuntu specific answers" in this context is meaningless.
Your program running on the Ubuntu OS is running in its own virtual address space. So asking if addresses x to y are available for use is pretty pointless... unless you're accessing hardware, there's no point in looking for a specific address; just use what the OS gives you and you'll know you're good.
Based on your edit, the fact that you're trying to do a simulation of memory mapped IO, you could do something like:
#ifdef SIMULATION
unsigned int TX_BUF_REG; // The "simulated" 32-bit register
#else
#define TX_BUF_REG (*(volatile unsigned int *)0x123456) // The actual reg you're simulating, dereferenced at its address
#endif
Then use accessor macros to read or write specific bits via a mask (as is typically done):
#define WRITE_REG_BITS(reg, bits) do { (reg) |= (bits); } while (0)
...
WRITE_REG_BITS(TX_BUF_REG, SOME_MASK);
Static variables can be used in simulations this way so you don't have to worry about what addresses are "safe" to write to.
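A fuller sketch of the simulation approach, assuming SIMULATION is defined by the host-side build. The address 0x123456 is the placeholder from the answer, and tx_buf_reg_sim, CLEAR_REG_BITS, tx_buf_set and tx_buf_clear are hypothetical names added for illustration.

```c
#include <assert.h>

#define SIMULATION 1  /* assumption: set by the build for host-side tests */

#ifdef SIMULATION
static volatile unsigned int tx_buf_reg_sim;  /* stands in for the register */
#define TX_BUF_REG tx_buf_reg_sim
#else
/* Placeholder address from the answer; take the real one from the datasheet. */
#define TX_BUF_REG (*(volatile unsigned int *)0x123456)
#endif

#define WRITE_REG_BITS(reg, bits) do { (reg) |= (bits); } while (0)
#define CLEAR_REG_BITS(reg, bits) do { (reg) &= ~(bits); } while (0)

/* Set the given bits and return the resulting register value. */
unsigned int tx_buf_set(unsigned int bits)
{
    assert(bits != 0);  /* setting no bits is almost certainly a mistake */
    WRITE_REG_BITS(TX_BUF_REG, bits);
    return TX_BUF_REG;
}

/* Clear the given bits and return the resulting register value. */
unsigned int tx_buf_clear(unsigned int bits)
{
    assert(bits != 0);
    CLEAR_REG_BITS(TX_BUF_REG, bits);
    return TX_BUF_REG;
}
```

Because TX_BUF_REG expands to an lvalue in both configurations, the same accessor code compiles for the simulation and for the target without further #ifdef blocks.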
For the referenced ATMega128 microcontroller, you look in the datasheet to see which addresses are mapped to registers. On a PC with an OS installed you won't have a chance to access hardware registers directly this way, at least not from userspace. Normally only device drivers (ring 0) are allowed to access hardware.
As already mentioned by others, you have to use e.g. malloc() to tell the OS that you need a pointer to a memory chunk that you are allowed to write to. This is because the OS manages the memory for the whole system.

Read struct from physical memory address in C

This is probably more of a problem with my lack of C knowledge, but I'm hoping someone might be able to offer a possible solution. In a nutshell, I'm trying to read a struct that is stored in memory, and I have its physical memory address. Also this is being done on a 64-bit Linux system (Debian (Wheezy) Kernel 3.6.6), and I'd like to use C as the language.
For example the current address of the struct in question is at physical address: 0x3f5e16000
Now I did initially try to access this address by using a pointer to /dev/mem. However, I've since learned that access to any address > 1024MB is not allowed, and I get a nice error message in /var/log/messages telling me all about it. At present access is being attempted from a userspace app, but I'm more than happy to look into writing a kernel module, if that is what is required.
Interestingly, I've also discovered something known as 'kprobe', which supposedly allows the > 1024MB /dev/mem restriction to be bypassed. However, I don't really want to introduce any potential security issues into my system, and I'm sure there must be an easier way to accomplish this. The info on kprobe can be found here: http://www.libcrack.so/2012/09/02/bypassing-devmem_is_allowed-with-kprobes/
I've done some reading and I've found references to using mmap to map the physical address into userspace so that it can be read, but I must confess that I don't understand the implementation of this in C.
If anyone could provide some information on accessing physical memory, or either mapping data from a physical address to a userspace virtual address, I would be extremely grateful.
You'll have to forgive me if I'm a little bit vague as to exactly what I'm doing, but it's part of a project and I don't want to give too much information away, so please bear with me :) I'm not being obtuse or anything.
The structure in memory is a block of four ints and ten longs that is loaded into memory by a running kernel module.
The address that I'm using is definitely a physical address and it's set to non-paged, the kernel module performs the translations to physical and I'm not using the address-of operator.
I'm wondering if I should just rephrase the question as how to read an int from a physical location, as that is the first element of the struct. I hope that helps to clarify things!
EDIT - After doing some more reading, it appears that one possible solution to this problem is to construct a kernel module, and then use the mmap function to map the physical address to a virtual address the kernel module can then access. Can anyone offer any advice on achieving this using mmap?
I'm only going to answer this question:
I'm wondering if I should just rephrase the question as how to read an int from a physical location, as that is the first element of the struct.
No. The problem is not int vs. struct, the problem is that C in and of itself has no notion of physical memory. The OS in conjunction with the MMU makes sure that every process, including every running C program, runs in a virtual memory sandbox. The OS might offer an escape hatch into physical memory.
If you're writing a kernel module that manages some object at physical address 0x3f5e16000, then you should offer some API to get to that memory, preferably one that uses a file descriptor or some other abstraction to hide the nitty-gritty of kernel memory management from the user program it communicates with.
If you're trying to communicate with a poorly designed kernel module that expects you to access a fixed physical memory address, then ugly hacks involving /dev/mem are your share.
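For reference, a sketch of what the /dev/mem route usually looks like from userspace. It assumes root privileges, a kernel that permits the access (many kernels restrict /dev/mem, which matches the error described in the question), and an int that does not straddle a page boundary. The helper names page_base, page_offset and read_int_at_phys are hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Round a physical address down to the start of its page. */
uint64_t page_base(uint64_t phys, uint64_t page_size)
{
    /* The mask trick below only works for power-of-two page sizes. */
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return phys & ~(page_size - 1);
}

/* Offset of the address inside its page. */
uint64_t page_offset(uint64_t phys, uint64_t page_size)
{
    return phys - page_base(phys, page_size);
}

/* Map the page holding phys read-only and copy one int out of it.
   Returns 0 on success, -1 if /dev/mem cannot be opened or mapped. */
int read_int_at_phys(uint64_t phys, int *out)
{
    uint64_t page = (uint64_t)sysconf(_SC_PAGESIZE);
    int fd = open("/dev/mem", O_RDONLY | O_SYNC);
    if (fd < 0)
        return -1;

    /* mmap offsets must be page-aligned, hence the base/offset split. */
    void *map = mmap(NULL, (size_t)page, PROT_READ, MAP_SHARED, fd,
                     (off_t)page_base(phys, page));
    close(fd);  /* the mapping stays valid after the descriptor is closed */
    if (map == MAP_FAILED)
        return -1;

    *out = *(volatile int *)((char *)map + page_offset(phys, page));
    munmap(map, (size_t)page);
    return 0;
}
```

If the kernel refuses the mapping, the function just returns -1; in that case the cleaner design is the one suggested above, where the kernel module exposes the data through its own interface.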

Declare a pointer to an integer at address 0x200 in memory

I have a couple of doubts. I remember reading somewhere that it is not possible for me to manually put a variable in a particular location in memory, but then I came across this code:
#include <stdio.h>
int main(void)
{
    int *x;
    x = (int *)0x200;
    printf("Number is %lu", (unsigned long)x); // Checkpoint1
    scanf("%d", x);
    printf("%d", *x);
    return 0;
}
Is it that we can not put it in a particular location, or we should not put it in a particular location since we will not know if it's a valid location or not?
Also, in this code, up to the first checkpoint, I get the output 512.
And then after that Seg Fault.
Can someone explain why? Is 0x200 not a valid memory location?
In the general case, the behavior you will get is undefined: anything can happen.
In Linux, for example, part of the virtual address space is reserved for the kernel (traditionally the top 1GB on 32-bit x86), and the lowest pages are normally not mapped into a process at all, so if you try to access such an address from user mode you will get a seg fault.
No idea how it works in Windows.
Reference for linux claim:
Currently the 32 bit x86 architecture is the most popular type of
computer. In this architecture, traditionally the Linux kernel has
split the 4GB of virtual memory address space into 3GB for user
programs and 1GB for the kernel.
Adding to what @amit wrote:
In Windows it is the same. In general it is the same for all protected-mode operating systems. Since DOS etc. are no longer around, it is the same on all systems except in kernel-mode code (kernel-mode drivers) and on embedded systems.
The operating system manages which memory-pages you are allowed to write to and places markers that will make the cpu automatically raise access-violations if some other page is written to.
Up until the "checkpoint", you haven't accessed memory location 0x200, so everything works fine.
There is a local variable x in the function main. It is of type "pointer to int". x is assigned the value 0x200, and then that value is printed. But the target of x hasn't been accessed, so up to this point it doesn't matter whether x holds a valid memory address or not.
Then scanf tries to write to the memory address you passed in, which is the 0x200 stored in x. Then you get a seg fault, which is certainly a possible result of trying to write to an arbitrary memory address.
So what are your doubts? What makes you think that this might work, when you come across this code that clearly doesn't?
Writing to a particular memory address might work under certain conditions, but is extremely unlikely to in general. Under all modern OSes, normal programs do not have control over their memory layout. The OS decides where initial things like the program's code, stack, and globals go. The OS will probably also be using some memory space, and it is not required to tell you what it's using. Instead you ask for memory (either by making variables or by calling memory allocation routines), and you use that.
So writing to particular addresses is very likely to hit either memory that hasn't been allocated, or memory that is being used for some other purpose. Neither of those is good, even if you do manage to hit an address that is actually writable. What if you clobber some piece of data used by one of your program's other variables? Or some other part of your program clobbers the value you just wrote?
You should never be choosing a particular hard-coded memory address, you should be using an address of something you know is a variable, or an address you got from something like malloc.
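As a sketch of that advice, here is the read-an-int step done through memory obtained from malloc() instead of a hard-coded address. The function name parse_int_on_heap is hypothetical, and it reads from a string with sscanf rather than from stdin so that it is easy to try out.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse an int from text into heap memory obtained from malloc().
   Returns the allocated int, or NULL on allocation or parse failure.
   The caller owns the result and must free() it. */
int *parse_int_on_heap(const char *text)
{
    assert(text != NULL);

    int *x = malloc(sizeof *x);  /* the OS/allocator picks the address */
    if (x == NULL)
        return NULL;

    if (sscanf(text, "%d", x) != 1) {
        free(x);
        return NULL;
    }
    return x;
}
```

The pointer returned here is valid to write through because the allocator handed it out, which is exactly what the hard-coded 0x200 lacks.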
