Data transfer between host and endpoint using PCIe

For the last few days I have been trying to implement data transfer between a host and a PCIe endpoint, but I have not managed to get it working. I can read the configuration space with calls such as pci_read_long, and fields like vendor_id and device_id come back correctly.
In the configuration space, a BAR (base address register) holds either a memory address or an I/O address, depending on bit 0. Coming to my problem: I am reading the register at offset 10h. Taking the value 0xFE000000 as an example, I clear the lowest four bits, complement the result, and finally add 1; the result indicates the size of the region.
My problems are:
Whenever I write to a particular address (FE000000) using pci_write_long, I get a segmentation fault.
Why do I get a segmentation fault while writing? Can anyone please
help me resolve this issue, and is the size calculation (the steps above) correct?
About the BAR: does it represent the memory base address?
Here is my code:
int c = pci_write_long(dev, 0x10, 0xFFFFFFFF); // write all 1's to that location
c = pci_read_long(dev, 0x10);                  // read the value back
printf("c = %x\n", c);
for (i = 0; i < 4; i++)                        // clear the last four bits
    c = c & ~(1 << i);
printf("c = %x\n", c);
c = ~c;                                        // 1's complement
c = c + 1;                                     // add one
printf("c = %x\n", c);                         // size of the region
int ch1 = pci_write_byte(dev, c, 0xf);         // the segmentation fault occurs here
printf("ch1 = %x\n", ch1);
ch1 = pci_read_byte(dev, c);                   // read the data back from the same location
printf("final read = %x\n", ch1);
Is this the correct way to implement it? If not, can you provide any related information or a link?

Whenever I write to a particular address (FE000000) using
pci_write_long, I get a segmentation fault.
pci_write_long() is used to access the endpoint's config space. The address you are trying to access belongs to the endpoint's memory-mapped internal registers, and you can simply use pointers to reach it. Just mmap() the address and access it like a normal buffer, provided you know the device's internal register layout.
You can see that the first write of 0xFFFFFFFF succeeded because it went to offset 0x10, which lies within the config space. The next write, which caused the segmentation fault, went to offset 0xFE000000 from the config space base. That is invalid: the config space is only 256 bytes (4 KB with the PCIe extended configuration space).
One more point: if you are on ARM, does this memory (0xFE000000) lie within the PCIe controller's memory range? If not, you are accessing invalid memory.
Why do I get a segmentation fault while writing? Can anyone please
help me resolve this issue, and is the size calculation (the steps
above) correct?
The reason for the crash is given above. The size calculation algorithm looks correct. To be sure, you can simply compare your calculated size with the BAR size given in the EP's datasheet.
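As a rough sketch of the size calculation discussed above, assuming a 32-bit memory BAR: the helper below takes the value read back after writing all 1s and returns the region size. The name bar_mem_size is made up for illustration and is not part of libpci. A real program would also save the original BAR value before the probe and restore it afterwards.

```c
#include <stdint.h>

/* Hypothetical helper (not part of libpci): decode the size of a 32-bit
 * memory BAR from the value read back after writing 0xFFFFFFFF to it. */
static uint32_t bar_mem_size(uint32_t readback)
{
    uint32_t mask = readback & ~UINT32_C(0xF); /* drop the 4 type/flag bits */

    if (mask == 0)
        return 0;                              /* BAR is not implemented */
    return (uint32_t)(~mask + 1u);             /* invert and add 1: the size */
}
```

If the read-back value were the 0xFE000000 from the question, this would give 0x02000000, i.e. a 32 MB region.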
About the BAR: does it represent the memory base address?
It represents where your Endpoint's or Device's internal registers/memory is mapped in your host address space region. You can use the same address from your host machine to access the device's internal registers as if you are accessing a local memory buffer. Just mmap() the address before accessing if your architecture has virtual memory in place.
EDIT 1 - I can see you are writing all 1's and then trying to read the address. You are not storing any address first, and that address should come from your host address space. I believe there is some misunderstanding about BAR usage that you need to clear up.
1 - Get the BAR size.
2 - Write a host address to the BAR where you want the EP's internal registers to be mapped. (For ARM, the address should come from the PCIe controller's address range. Check the manual of your board.)
- Enable memory transaction access by writing 0x3 to the command register.
3 - From your C code, mmap() that address.
4 - Access that address.
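A minimal sketch of steps 3 and 4, assuming a POSIX system and some file that exposes the BAR region to mmap() (for example a PCI resource file under sysfs on Linux, or /dev/mem; which one applies depends on the platform). The helper names map_region and unmap_region are illustrative only, and the caller is assumed to know the correct path, offset and length.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Illustrative helper: map `len` bytes of `path`, starting at the
 * page-aligned `offset`, for reading and writing. Returns NULL on failure. */
static volatile uint32_t *map_region(const char *path, off_t offset, size_t len)
{
    int fd = open(path, O_RDWR | O_SYNC);
    if (fd < 0)
        return NULL;

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, offset);
    close(fd); /* the mapping stays valid after the descriptor is closed */

    return p == MAP_FAILED ? NULL : (volatile uint32_t *)p;
}

/* Release a mapping obtained from map_region(). */
static void unmap_region(volatile uint32_t *base, size_t len)
{
    munmap((void *)base, len);
}
```

The returned pointer is declared volatile so that the compiler does not cache or reorder register accesses; after mapping, regs[n] reads or writes the 32-bit word at byte offset 4*n.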

Related

How to modify vdso variable (vvar)?

Recently I have been studying the vDSO in Linux. I tried to modify the data in the vvar section, but it failed. Here is what I have tried.
According to an LWN article, there are two addresses for vvar:
The first one is a regular kernel-space address whose value is greater than PAGE_OFFSET. If you look at the System.map file, you'll find that this symbol has an address like ffffffff81c76080.
The second address is in a region called the "vvar page". The base address (VVAR_ADDRESS) of this page is defined in the kernel to be at 0xffffffffff5ff000, close to the end of the 64-bit address space. This page is made available read-only to user-space code.
Modifying via the first address: The first address is easy to find by looking up _vdso_data in /boot/System.map-kernel-version. After getting the address, I modify its member (__unused) in a kernel module (so there is no permission issue). Then I check __unused from a user-space program; it is still 0, which means I failed to modify it from kernel space.
Modifying via the second address: The second address can be found through the auxiliary vector. After getting the address of the vDSO, we can find the vvar section. I pass this address into the kernel module and modify its member __unused. Then an error occurs, reporting a permission problem. (The reason should be that vvar is read-only memory, as shown by cat /proc/pid/maps.)
I think the first way is close to a solution, but it seems that vvar at that address is not mapped into every process's vvar section. Any ideas? Thanks in advance.
[EDIT]: The first way gets closer to a solution. I modified the cr0 bit to allow writing. Even though the write succeeds, the value still can't be read from the Linux kernel (by getting vdso_data through __arch_get_k_vdso_data and accessing the member that I previously modified from user space + kernel module).
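For reference, the auxiliary-vector lookup mentioned above can be sketched as follows, assuming a Linux C library that provides getauxval() in <sys/auxv.h>. The helper name vdso_base is illustrative. Where vvar sits relative to this base depends on the kernel version and architecture, so the sketch does not try to locate it.

```c
#include <stdint.h>
#include <sys/auxv.h>

/* Base address of the vDSO image in this process, or 0 if the kernel
 * did not provide one. AT_SYSINFO_EHDR is the auxiliary-vector entry
 * that carries the address of the vDSO's ELF header. */
static uintptr_t vdso_base(void)
{
    return (uintptr_t)getauxval(AT_SYSINFO_EHDR);
}
```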

Confusion of virtual memory

Consider a sample below.
char* p = (char*)malloc(4096);
p[0] = 'a';
p[1] = 'b';
The 4KB memory is allocated by calling malloc(). OS handles the memory request by the user program in user-space. First, OS requests memory allocation to RAM, then RAM gives physical memory address to OS. Once OS receives physical address, OS maps the physical address to virtual address then OS returns the virtual address which is the address of p to user program.
I wrote some value(a and b) in virtual address and they are really written into main memory(RAM). I'm confused that I wrote some value in virtual address, not physical address, but it is really written to main memory (RAM) even though I didn't take care of that myself.
What happens behind the scenes? What does the OS do for me? I couldn't find relevant material in the books I checked (OS, system programming).
Could you give some explanation? (Please leave out caches for easier understanding.)

A detailed answer to your question will be very long - and too long to fit here at StackOverflow.
Here is a very simplified answer to a little part of your question.
You write:
I'm confused that I wrote some value in virtual address, not physical address, but it is really written to main memory
Seems you have a very fundamental misunderstanding here.
There is no memory directly "behind" a virtual address. Whenever you access a virtual address in your program, it is automatically translated to a physical address and the physical address is then used for access in main memory.
The translation happens in HW, i.e. inside the processor in a block called "MMU - Memory management unit" (see https://en.wikipedia.org/wiki/Memory_management_unit).
The MMU holds a small but very fast look-up table that tells how a virtual address is to be translated into a physical address. The OS configures this table but after that, the translation happens without any SW being involved and - just to repeat - it happens whenever you access a virtual memory address.
The MMU also takes some kind of process ID as input in order to do the translation. This is needed because two different processes may use the same virtual address but need it translated to two different physical addresses.
As mentioned above, the MMU look-up table (TLB) is small, so the MMU can't hold all translations for a complete system. When the MMU can't do a translation, it can raise an exception of some kind so that some OS software is triggered. The OS will then re-program the MMU so that the missing translation gets into the MMU and the process execution can continue. Note: Some processors can do this in HW, i.e. without involving the OS.
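To make the role of the process ID concrete, here is a toy software model of such a look-up table. It is only a sketch: real TLBs are hardware structures, and the entry layout, the table size of 4, the slot selection and the names tlb_entry, tlb_insert and tlb_lookup are all invented for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define TLB_ENTRIES 4u
#define PAGE_SHIFT  12u /* 4 KB pages */

struct tlb_entry {
    bool     valid;
    uint16_t asid; /* address-space (process) identifier */
    uint32_t vpn;  /* virtual page number */
    uint32_t pfn;  /* physical frame number */
};

static struct tlb_entry tlb[TLB_ENTRIES];

/* Install a translation, as the OS (or a hardware walker) would do
 * after a miss. The slot is picked by a simple index hash. */
static void tlb_insert(uint16_t asid, uint32_t vpn, uint32_t pfn)
{
    struct tlb_entry *e = &tlb[(vpn ^ asid) % TLB_ENTRIES];

    e->valid = true;
    e->asid  = asid;
    e->vpn   = vpn;
    e->pfn   = pfn;
}

/* Translate vaddr for the given process. Returns false on a miss,
 * which is the point where real hardware would raise an exception
 * or walk the page tables itself. */
static bool tlb_lookup(uint16_t asid, uint32_t vaddr, uint32_t *paddr)
{
    uint32_t vpn = vaddr >> PAGE_SHIFT;
    const struct tlb_entry *e = &tlb[(vpn ^ asid) % TLB_ENTRIES];

    if (!e->valid || e->asid != asid || e->vpn != vpn)
        return false;
    *paddr = (e->pfn << PAGE_SHIFT) | (vaddr & ((1u << PAGE_SHIFT) - 1u));
    return true;
}
```

Two processes that use the same virtual address get different physical addresses simply because their entries carry different asid values.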

You have to understand that virtual memory is virtual: it can be larger than the physical RAM, so it is mapped differently, even though the data ends up in the same physical storage.
Your programs use virtual memory addresses, and it is your OS that decides what to keep in RAM. If RAM fills up, the OS will use some space on the hard drive to continue working.
But the hard drive is slower than RAM, which is why your OS uses a page replacement algorithm (for example a clock or LRU-style policy) to exchange pages of memory between the hard drive and RAM, depending on the work being done, so that the data most likely to be used is in fast memory. To swap pages back and forth, the OS does not need to modify virtual memory addresses.
This is a summary that overlooks a lot of things.

You want to understand how virtual memory works. There's lots of online resources about this, here's one I found that seems to do a fair job of trying to explain it without getting too crazy in technical details, but also doesn't gloss over important terms.
https://searchstorage.techtarget.com/definition/virtual-memory

For Linux on x86 platforms, the assembly equivalent of asking for memory is basically a call into the kernel using int 0x80 with some parameters for the call set into some registers. The interrupt is set at boot by the OS to be able to answer for the request. It is set in the IDT.
An IDT descriptor for 32 bits systems looks like:
struct IDTDescr {
uint16_t offset_1; // offset bits 0..15
uint16_t selector; // a code segment selector in GDT or LDT
uint8_t zero; // unused, set to 0
uint8_t type_attr; // type and attributes, see below
uint16_t offset_2; // offset bits 16..31
};
The offset is the address of the entry point of the handler for that interrupt. So interrupt 0x80 has an entry in the IDT, and this entry points to the address of the handler (also called an ISR). malloc() itself is a C library function; when it needs more memory from the kernel, it makes a system call, and the system call returns the address of the new memory in some register. I'm pretty sure as well that this system call will actually use the sysenter x86 instruction to switch into kernel mode. This instruction is used alongside an MSR (Model Specific Register) to securely jump from user mode into kernel mode at the address specified in the MSR.
Once in kernel mode, all instructions can be executed and access to all hardware is unlocked. To satisfy the request, the OS doesn't "ask RAM for memory". RAM isn't aware of what memory the OS uses. RAM just blindly answers to asserted pins on its DIMM and stores information. At boot, the OS checks the ACPI tables built by the BIOS to determine how much RAM there is and which devices are connected to the computer, so that it avoids writing to some MMIO (Memory Mapped IO) region. Once the OS knows how much RAM is available (and what parts are usable), it will use algorithms to determine what parts of available RAM every process should get.
When you compile C code, the compiler (and linker) will determine the address of everything right at compilation time. When you launch that executable the OS is aware of all memory the process will use. So it will set up the page tables for that process accordingly. When you ask for memory dynamically using malloc(), the OS determines what part of physical memory your process should get and changes (during runtime) the page tables accordingly.
As to paging itself, you can always read some articles. A short version is 32-bit paging. In 32-bit paging you have a CR3 register for each CPU core. This register contains the physical address of the base of the Page Global Directory. The PGD contains the physical addresses of the bases of several Page Tables, which themselves contain the physical addresses of the bases of several physical pages (https://wiki.osdev.org/Paging). A virtual address is split into 3 parts. The 12 least significant bits are the offset within the physical page. The 10 bits in the middle are the index into the page table, and the 10 most significant bits are the index into the PGD.
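The 10/10/12 split described above can be written out as a small sketch. The struct and function names (va_parts, split_va) are made up for illustration.

```c
#include <stdint.h>

/* Pieces of a 32-bit virtual address under classic two-level paging. */
struct va_parts {
    uint32_t pgd_index; /* top 10 bits: entry in the page directory */
    uint32_t pt_index;  /* middle 10 bits: entry in the page table */
    uint32_t offset;    /* low 12 bits: byte within the 4 KB page */
};

static struct va_parts split_va(uint32_t va)
{
    struct va_parts p;

    p.pgd_index = va >> 22;
    p.pt_index  = (va >> 12) & 0x3FFu;
    p.offset    = va & 0xFFFu;
    return p;
}
```

For the address 0xC0801ABC this gives PGD index 0x302, page table index 0x001 and offset 0xABC.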
So when you write
char* p = (char*)malloc(4096);
p[0] = 'a';
p[1] = 'b';
you create a pointer of type char* and ask for 4096 bytes of memory, which may involve a system call. The OS puts the first address of that chunk of memory into a certain conventional register (which depends on the system and OS). You should not forget that the C language is just a convention. It is up to the operating system to implement that convention by providing a compatible compiler. It means that the compiler knows what register and what interrupt number to use (for the system call) because it was specifically written for that OS. The compiler will thus emit code that takes the address stored in this register and stores it into the pointer of type char* at runtime. On the second line you are telling the compiler that you want to take the char at the first address and make it an 'a'. On the third line you make the second char a 'b'. In the end, you could write an equivalent:
char* p = (char*)malloc(4096);
*p = 'a';
*(p + 1) = 'b';
p is a variable containing an address. The + operation on a pointer advances this address by the size of the type it points to. In this case, the pointer points to a char, so the + operation advances the pointer by one char (one byte). If it pointed to an int, it would advance by 4 bytes (32 bits) on most systems. The size of the pointer itself depends on the system. On a 32-bit system the pointer is 32 bits wide (because it contains an address). On a 64-bit system the pointer is 64 bits wide. A static memory equivalent of what you did is
char p[4096];
p[0] = 'a';
p[1] = 'b';
Now the compiler will know at compile time what memory this array will get. It is static memory. Even then, p behaves like a pointer to the first char of that array in these expressions. It means you could write
char p[4096];
*p = 'a';
*(p + 1) = 'b';
It would have the same result.
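The pointer arithmetic described above can be checked with a small sketch. The function names are illustrative; the step for an int pointer is compared against sizeof(int) rather than a fixed 4, since the width of int depends on the platform.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of bytes that `p + 1` advances for a char pointer. */
static size_t char_step(void)
{
    char buf[2];
    char *p = buf;

    return (size_t)((uintptr_t)(p + 1) - (uintptr_t)p);
}

/* Number of bytes that `p + 1` advances for an int pointer. */
static size_t int_step(void)
{
    int buf[2];
    int *p = buf;

    return (size_t)((uintptr_t)(p + 1) - (uintptr_t)p);
}

/* p[i] and *(p + i) name the same object, so their addresses match. */
static int same_element(char *p, size_t i)
{
    return &p[i] == p + i;
}
```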

First, OS requests memory allocation to RAM,…
The OS does not have to request memory. It has access to all of memory the moment it boots. It keeps its own database of which parts of that memory are in use for what purposes. When it wants to provide memory for a user process, it uses its own database to find some memory that is available (or does things to stop using memory for other purposes and then make it available). Once it chooses the memory to use, it updates its database to record that it is in use.
… then RAM gives physical memory address to OS.
RAM does not give addresses to the OS except that, when starting, the OS may have to interrogate the hardware to see what physical memory is available in the system.
Once OS receives physical address, OS maps the physical address to virtual address…
Virtual memory mapping is usually described as mapping virtual addresses to physical addresses. The OS has a database of the virtual memory addresses in the user process, and it has a database of physical memory. When it is fulfilling a request from the process to provide virtual memory and it decides to back that virtual memory with physical memory, the OS will inform the hardware of what mapping it chose. This depends on the hardware, but a typical method is that the OS updates some page table entries that describe what virtual addresses get translated to what physical addresses.
I wrote some value(a and b) in virtual address and they are really written into main memory(RAM).
When your process writes to virtual memory that is mapped to physical memory, the processor will take the virtual memory address, look up the mapping information in the page table entries or other database, and replace the virtual memory address with a physical memory address. Then it will write the data to that physical memory.

In MIPS TLB, Confusion between virtual and physical addresses

I have some C code running in RTL mode on an I6400 CPU. The C code is simple: it just reads from and writes to some subsystem. For example,
I tried to write to the address 0x001e400000 (a physical address). When the CPU accessed this address I got a TLB exception, because the address lies in a mapped area. After a lot of research I found that I needed to translate between virtual and physical addresses, so I replaced the address with 0xffffffffbe400000 (kseg1). Now I am able to write a value to this address. But when I try to read from this address (W/R) or from another address, I get an exception on kseg2.
Do you have any idea why the write succeeds but the read generates an exception?
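For reference, the address conversion used above can be sketched as follows. It assumes the standard MIPS64 layout, in which kseg1 is the unmapped, uncached window onto the low 512 MB of physical memory; the macro and function names are illustrative. It only shows where 0xffffffffbe400000 comes from and does not explain the read exception.

```c
#include <stdint.h>

/* MIPS64 kseg1 (unmapped, uncached) starts at this sign-extended
 * address and covers the low 512 MB of physical memory. */
#define KSEG1_BASE UINT64_C(0xFFFFFFFFA0000000)
#define KSEG1_SIZE UINT64_C(0x20000000)

/* Returns the kseg1 virtual address for a physical address below 512 MB,
 * or 0 if the address cannot be reached through kseg1. */
static uint64_t phys_to_kseg1(uint64_t phys)
{
    if (phys >= KSEG1_SIZE)
        return 0;
    return KSEG1_BASE | phys;
}
```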

mmap a device and access it device's memory map

I'm working on a very simple application that will test a device. There is no need for a driver and I have admin permissions.
I was going to use mmap, and this is where I got confused.
The idea is to do the following
int devD = open("/path/to/my/device", O_RDWR);
void *myDevPtr = mmap(start, length, prot, flags, devD, offset);
Here is where I found the documentation for it. I'm confused about every parameter except the file descriptor and the protection.
void *start. What exactly this is the start of? Is it the start of memory map for my device?
size_t length. My device has its own memory map. Is it the length of my devices memory map or this is something else?
int flags. This one puzzles me. If my file descriptor is a device, what do I set my flags to?
off_t offset. This one is also confusing. This is an offset from the start pointer, but what exactly is this offset into?
Another question I have is about communicating with the mmapped device. Say I need to write data to a specific register in the device. How would I do it?
I realize that these questions might look too simplistic, but I've been at it for some time now and couldn't find a concrete example that would address my situation.
Any help with this is really appreciated.

void *start. What exactly this is the start of? Is it the start of memory map for my device?
It's the logical address within your program where you want the mapping to occur. If you give it NULL, it will assign one [recommended]. This is a "hint", and the address of the mapped area is the mmap return value.
size_t length. My device has its own memory map. Is it the length of my devices memory map or this is something else?
[Not knowing your device], I would assume it's the same. But, say your device was 6GB long. You may want to access this in sections, so you might specify (e.g.) 1MB instead. And, then, remap later [see the offset section]
int flags. This one puzzles me. If my file descriptor is a device, what do I set my flags to?
Use MAP_SHARED so that what you write to the area is flushed to the "backing store" (which is your device).
off_t offset. This one is also confusing. This is an offset from the start pointer, but what exactly is this offset into?
No, it is not an offset from start. It is the offset within your device that the mapping should be done to (i.e. just like the offset for lseek).
UPDATE:
When you say: [Not knowing your device], I would assume it's the same. Do you mean that length is the length of the devices memory map?
From that standpoint, yes. If you want to map the entire area, which as I mentioned, can be large.
Normally, you map the entire device/file, starting at offset 0 for the length of the device/file.
I'm also a little confused about the offset. Say my device has a register at offset 0x100. In order to read/write this register I would need to set offset to 0x100. Am I correct?
Yes and no. You can do it two ways. Herein, let's call the mapped address [the return value from mmap] by the name mapbase.
(1) Give mmap an offset parameter of 0x100. Then, do (e.g.) val = *mapbase. Effectively, this is saying to the OS: "I only care about this one register and you handle the mapping to it"
(2) Give mmap an offset parameter of 0. Then, do val = mapbase[0x100] Effectively, this is saying: "I want a mapping to all the registers and I will handle the indexing/offsetting manually"
Method (2) is more usual (i.e. you want to create a single mapping that can access just about any register). If you use method (1), what about a register that is located at 0x80? It's inaccessible [unless you do a remap, which is time consuming].
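A small sketch of method (2), assuming 32-bit registers addressed by byte offset. The helper names reg_read32 and reg_write32 are illustrative, and mapbase stands for whatever mmap() returned for offset 0.

```c
#include <stdint.h>

/* Read a 32-bit register located `byte_off` bytes past the mapped base. */
static uint32_t reg_read32(volatile void *mapbase, uint32_t byte_off)
{
    return *(volatile uint32_t *)((volatile uint8_t *)mapbase + byte_off);
}

/* Write a 32-bit register located `byte_off` bytes past the mapped base. */
static void reg_write32(volatile void *mapbase, uint32_t byte_off, uint32_t val)
{
    *(volatile uint32_t *)((volatile uint8_t *)mapbase + byte_off) = val;
}
```

Taking byte offsets explicitly avoids a common slip: with a byte-typed mapbase, mapbase[0x100] is the byte at offset 0x100, but with a uint32_t pointer the same index lands at byte offset 0x400.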
UPDATE #2:
As arsv pointed out, you may need to open /dev/mem to map to a device's registers.
This depends upon your device's driver. Suppose we have /dev/mydevice. Now, suppose we do fdd = open("/dev/mydevice",O_RDWR)
It is up to the driver to provide a mapping between I/O done to the open file descriptor (fdd) and the device's registers.
Some drivers support this, but most don't. If the driver does support it, then we do the mmap with fdd.
If it doesn't, we have to do fdm = open("/dev/mem",O_RDWR) and pass fdm to mmap. Of course, now the mmap offset parameter will be radically different.

Check Mapping a physical device to a pointer in User space.
start is virtual memory address to map the device to. Leave NULL there.
flags should be MAP_SHARED.
offset is into the file being mmaped; for /dev/mem, that would be page-aligned physical address of the device.
Then just write to the mmaped area.
char* ptr = mmap(..., [/dev/mem], BASE);
*(ptr + OFFSET) = value;
Keep in mind that the physical address in this case will be (BASE + OFFSET).
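Because the mmap offset has to be page-aligned, a register's physical address usually needs to be split into a page base and a remainder first. A minimal sketch of that arithmetic follows; the names phys_split and split_phys are made up, and the page size is passed in as a parameter here rather than queried from the system.

```c
#include <stdint.h>

/* How a physical register address splits into the two values needed for
 * mmap() on /dev/mem: a page-aligned file offset and the remainder. */
struct phys_split {
    uint64_t page_base; /* pass this as the mmap() offset */
    uint64_t in_page;   /* add this to the pointer mmap() returns */
};

static struct phys_split split_phys(uint64_t phys, uint64_t page_size)
{
    struct phys_split s;

    s.page_base = phys & ~(page_size - 1); /* page_size must be a power of two */
    s.in_page   = phys - s.page_base;
    return s;
}
```

In a real program page_size would come from sysconf(_SC_PAGESIZE), and the length given to mmap() must cover in_page plus the width of the register.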

Pass a hex address to a Pointer Variable

I know how to work with pointers, but I don't know how to do this:
I have a hex address that holds some value belonging to some app.
I know how to find the address that I want.
I want to write C code that assigns this address to a pointer variable, so that I can then read the value at that address, and so on.
For example:
hex 0x00010010
int *pointer;
How can I pass this address to the pointer? What's the syntax for that?

By using int* you are assuming the data you are pointing to is an int (4 bytes on most OSes).
Shouldn't we use a void* instead? Later, if you want to access its data using a particular type, all you need to do is cast.
void *addr = (void*)0x0023FF74; // valid memory address within my app
printf("Address: %p\n", addr);
getchar();
printf("Data (4bytes): %d\n", *(int*)addr); // print the first 4 bytes of data as int
getchar();
EDIT:
Help me understand what you are trying to do, because this statement confused me:
There is app using that address, I know the value in it, but the (int)pointer don't access the value.
Are you trying to write an application that will modify the memory of another application that is running on your system? Your current approach won't work, and this is why: When the operating system loads your application into a process, it reserves a memory region to be used by the process and assigns a range of virtual addresses to this region. These virtual addresses do not map directly to memory addresses in RAM, so the OS has to keep an internal table for that.
On Windows, each process loaded receives the same range of virtual addresses, but this area is only visible to the process that is running inside it. For instance, (on Windows) processes are loaded at memory address 0x00400000, which means each process has its own memory address 0x00400000. Therefore you can't assign memory address X to a pointer in your application and expect Windows to magically know that you are referring to address X inside another application.
What you are trying to accomplish is called Code Injection, and there's a lot of information on the web about it.

In typical modern operating systems, you cannot access another process' (app's) address space. The addresses are virtual, so the same numerical value means different things to different processes.
If you are on a system with a true "flat" address space (where processes directly work with actual physical memory addresses) such as Amiga OS on the 680x0, you can do something like:
int *pointer = (int *) 0x00010010;
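On hosted systems, going through uintptr_t makes the integer-to-pointer conversion explicit. A small sketch; the helper name int_ptr_from_addr is illustrative, and whether the resulting pointer may be dereferenced depends entirely on the platform and on what lives at that address.

```c
#include <stdint.h>

/* Turn a numeric address into an int pointer. The conversion itself is
 * implementation-defined; dereferencing the result is only valid if an
 * int object really lives at that address in this address space. */
static int *int_ptr_from_addr(uintptr_t addr)
{
    return (int *)addr;
}
```

A call such as int_ptr_from_addr(0x00010010) corresponds to the cast shown above.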
