OVERLAPPED STRUCTURES and LARGE_INTEGER - c

I am working through exercises from Windows System Programming and I don't fully understand the LARGE_INTEGER and OVERLAPPED structures. For example, I have the following structures defined in main. The first structure keeps track of the number of records. The second holds the record data. The author defines and uses two OVERLAPPED structures to keep track of the record's file offset.
typedef struct _HEADER {
DWORD numRecords;
DWORD numNonEmptyRecords;
} HEADER; /* 8bytes */
typedef struct _RECORD {
DWORD referenceCount;
SYSTEMTIME recordCreationTime;
SYSTEMTIME recordLastRefernceTime;
SYSTEMTIME recordUpdateTime;
TCHAR dataString[STRING_SIZE];
} RECORD; /* 308bytes */
LARGE_INTEGER currentPtr;
OVERLAPPED ov = {0, 0, 0, 0, NULL}, ovZero = {0, 0, 0, 0, NULL};
After the records are created, the user is prompted to read, write, or delete a record. The record number entered by the user is stored in recNo.
currentPtr.QuadPart = (LONGLONG)recNo * sizeof(RECORD) + sizeof(HEADER);
ov.Offset = currentPtr.LowPart;
ov.OffsetHigh = currentPtr.HighPart;
Can someone please explain how the values for the LARGE_INTEGER currentPtr are calculated? What is a union? I have looked at the example in WinDbg and I don't understand how currentPtr.LowPart and currentPtr.HighPart are calculated. Below is an example of a file read operation being called with the OVERLAPPED structure.
ReadFile (hFile, &record, sizeof (RECORD), &nXfer, &ov)

A union gives different names and types to the same location in memory. So if a LARGE_INTEGER union is stored at location 0x1000, then, since x86 is little-endian:
LARGE_INTEGER.QuadPart is the 64-bit integer at 0x1000.
LARGE_INTEGER.LowPart is the lower 32 bits of that 64-bit integer, at 0x1000.
LARGE_INTEGER.HighPart is the upper 32 bits of that 64-bit integer, at 0x1004.
Assigning to QuadPart therefore sets LowPart and HighPart at the same time; nothing is calculated separately for them.
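The aliasing described above can be sketched without windows.h. This is only an illustration: LargeInteger and record_offset are hypothetical stand-ins (not the Windows definitions), the target is assumed to be little-endian, and the sizes 8 and 308 are the ones quoted in the question's comments.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for LARGE_INTEGER (assumes a little-endian target).
 * All three members alias the same 8 bytes. */
typedef union {
    struct {
        uint32_t LowPart;   /* bytes 0..3: low 32 bits of QuadPart  */
        int32_t  HighPart;  /* bytes 4..7: high 32 bits of QuadPart */
    };
    int64_t QuadPart;       /* bytes 0..7: the whole 64-bit value   */
} LargeInteger;

/* Byte offset of record recNo in the file, as in the question. */
static LargeInteger record_offset(uint32_t recNo)
{
    const int64_t headerSize = 8;    /* sizeof(HEADER) per the question */
    const int64_t recordSize = 308;  /* sizeof(RECORD) per the question */
    LargeInteger pos;

    assert(sizeof(LargeInteger) == 8);
    pos.QuadPart = (int64_t)recNo * recordSize + headerSize;
    return pos;  /* LowPart and HighPart are now set as well */
}
```

For recNo = 3 the offset is 3 * 308 + 8 = 932, so LowPart is 932 and HighPart is 0. HighPart only becomes non-zero once the offset no longer fits in 32 bits, which is why both halves are copied into ov.Offset and ov.OffsetHigh.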
OVERLAPPED is used for asynchronous I/O. A read or write call in overlapped mode will return immediately, and the event specified in the OVERLAPPED structure will be signaled when the I/O completes.
MSDN article for OVERLAPPED structure:
http://msdn.microsoft.com/en-us/library/windows/desktop/ms684342(v=vs.85).aspx
In 32-bit mode, Offset shares memory with Pointer; in 64-bit mode, Offset and OffsetHigh together share memory with Pointer. Offset and OffsetHigh are inputs, while Pointer is used internally. InternalHigh is poorly named, since it now reports the number of bytes transferred and may change yet again. Internal now holds a status.

Related

STM32H7, weird behavior of HAL_FLASH_Program function

For context, I'm writing a bootloader for my STM32H743XI because I want to erase and upload code over USB without using a pin.
My bootloader starts at 0x08000000 and its size is 21 kB (17% of the first 128 kB sector). I want to read and write data at the end of that sector, which will be shared with my app. By the end of the sector I mean its last 10 kB, so I start reading and writing at 0x0801D800.
The structure that I want to read and write is 8 x 32 bits because, if I understand correctly, this is the size of a flash word on STM32H74x/5X devices.
This is my struct:
typedef struct
{
int32_t BootLoaderMode;
int32_t StartingPartition;
int32_t AppStartingError;
int32_t temp4;
int32_t temp5;
int32_t temp6;
int32_t temp7;
int32_t temp8;
} ExchangeWord_1;
I've got a pointer to an allocated struct:
ExchangeWord_1* m_ExchangeWord_1 = (ExchangeWord_1*)malloc(sizeof(ExchangeWord_1));
Before writing, I unlock the memory with:
HAL_FLASH_Unlock();
HAL_FLASH_OB_Unlock();
The write operation looks like (id=0 and the second parameter is my allocated struct):
void writeExchangeWord(uint16_t id, ExchangeWord_1* exchangeWord )
{
//unlock function
uint32_t flash_address = (0x0801D800+id*32);
uint32_t data_address = (uint32_t)exchangeWord;
HAL_FLASH_Program(FLASH_TYPEPROGRAM_FLASHWORD, flash_address, data_address);
//lock function
}
Then I lock the memory:
HAL_FLASH_Lock();
HAL_FLASH_OB_Lock();
So the first call of this works well and the debugger confirms it when I look at the memory:
Flash memory on first call: https://i.stack.imgur.com/cH9fI.png
But on the next call the memory is filled with 0. Stranger still, on the third call multiple words starting at 0x0801D800 are filled with 0.
The address of my struct is well aligned (m_ExchangeWord_1 = 0x20001D60).
What am I missing? Do I need to clear some flags before or after writing?
OK, it seems that it is impossible to write twice in a row to the same address. I've read somewhere that, without erasing, we are only allowed to switch a bit from 1 to 0. I moved my "shared area" to a dedicated sector that I erase each time I want to write to it.
My problem is solved.
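The "only 1 to 0 without erasing" rule mentioned above can be checked in plain C before calling the programming routine. This is a minimal sketch, not part of the HAL: flash_word_is_erased and only_clears_bits are hypothetical helper names, and the 32-byte flash word size is the one assumed in the question. Parts that keep ECC per flash word may impose stricter rules than this bit test, so the reference manual should have the final word.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FLASH_WORD_BYTES 32u  /* 8 x 32 bits, as assumed in the question */

/* True if every byte still holds the erased value 0xFF. */
static bool flash_word_is_erased(const uint8_t *word)
{
    assert(word != NULL);
    for (size_t i = 0; i < FLASH_WORD_BYTES; i++) {
        if (word[i] != 0xFFu) {
            return false;
        }
    }
    return true;
}

/* True if writing `next` over `current` would only clear bits (1 -> 0),
 * i.e. `next` has no 1 bit where `current` already holds a 0. */
static bool only_clears_bits(const uint8_t *current, const uint8_t *next)
{
    assert(current != NULL && next != NULL);
    for (size_t i = 0; i < FLASH_WORD_BYTES; i++) {
        if ((next[i] & (uint8_t)~current[i]) != 0u) {
            return false;
        }
    }
    return true;
}
```

A caller could read the target flash word through a const uint8_t pointer, and erase the sector first whenever flash_word_is_erased returns false.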

Initializing, constructing and converting struct to byte array causes misalignment

I am trying to design a data structure (I have made it much shorter to save space here but I think you get the idea) to be used for byte level communication:
/* PACKET.H */
#define CM_HEADER_SIZE 3
#define CM_DATA_SIZE 16
#define CM_FOOTER_SIZE 3
#define CM_PACKET_SIZE (CM_HEADER_SIZE + CM_DATA_SIZE + CM_FOOTER_SIZE)
// + some other definitions
typedef struct cm_header{
uint8_t PacketStart; //Start Indicator 0x5B [
uint8_t DeviceId; //ID Of the device which is sending
uint8_t PacketType;
} CM_Header;
typedef struct cm_footer {
uint16_t DataCrc; //CRC of the 'Data' part of CM_Packet
uint8_t PacketEnd; //should be 0X5D or ]
} CM_Footer;
//Here I am trying to convert a few u8[4] to u32 (4*u32 = 16 bytes, hence the data size)
typedef struct cm_data {
union {
struct{
uint8_t Value_0_0:2;
uint8_t Value_0_1:2;
uint8_t Value_0_2:2;
uint8_t Value_0_3:2;
};
uint32_t Value_0;
};
//same thing for Value_1, 2 and 3
} CM_Data;
typedef struct cm_packet {
CM_Header Header;
CM_Data Data;
CM_Footer Footer;
} CM_Packet;
typedef struct cm_inittypedef{
uint8_t DeviceId;
CM_Packet Packet;
} CM_InitTypeDef;
typedef struct cm_appendresult{
uint8_t Result;
uint8_t Reason;
} CM_AppendResult;
extern CM_InitTypeDef cmHandler;
The goal here is to make a reliable structure for transmitting data over a USB interface. In the end, the CM_Packet should be converted to a uint8_t array and be given to the data transmit register of an MCU (STM32).
In the main.c file I try to init the structure as well as some other stuff related to this packet:
/* MAIN.C */
uint8_t packet[CM_PACKET_SIZE];
int main(void) {
//use the extern defined in packet.h to init the struct
cmHandler.DeviceId = 0x01; //assign device id
CM_Init(&cmHandler); //construct the handler
//rest of stuff
while(1) {
CM_GetPacket(&cmHandler, (uint8_t*)packet);
CDC_Transmit_FS(&packet, CM_PACKET_SIZE);
}
}
And here is the implementation of packet.h, which screws everything up so badly. I added packet[CM_PACKET_SIZE] to the watch window, but it looks like it is just being generated randomly. Sometimes, by pure luck, I can see some of the values I am interested in inside this array, but only about 1% of the time!
/* PACKET.C */
CM_InitTypeDef cmHandler;
void CM_Init(CM_InitTypeDef *cm_initer) {
cmHandler.DeviceId = cm_initer->DeviceId;
static CM_Packet cmPacket;
cmPacket.Header.DeviceId = cm_initer->DeviceId;
cmPacket.Header.PacketStart = CM_START;
cmPacket.Footer.PacketEnd = CM_END;
cm_initer->Packet = cmPacket;
}
CM_AppendResult CM_AppendData(CM_InitTypeDef *handler, uint8_t identifier,
uint8_t *data){
CM_AppendResult result;
switch(identifier){
case CM_VALUE_0:
handler->Packet.Data.Value_0_0 = data[0];
handler->Packet.Data.Value_0_1 = data[1];
handler->Packet.Data.Value_0_2 = data[2];
handler->Packet.Data.Value_0_3 = data[3];
break;
//Also cases for CM_VALUE_1, 2, 3
//to build up the CM_Data struct of CM_Packet
default:
result.Result = CM_APPEND_FAILURE;
result.Reason = CM_APPEND_CASE_ERROR;
return result;
break;
}
result.Result = CM_APPEND_SUCCESS;
result.Reason = 0x00;
return result;
}
void CM_GetPacket(CM_InitTypeDef *handler, uint8_t *packet){
//copy the whole struct in the given buffer and later send it to USB host
memcpy(packet, &handler->Packet, sizeof(CM_PACKET_SIZE));
}
So, the problem is that this code gives me random stuff 99% of the time. The packet never has CM_START, the start-of-packet indicator, set to the value I want. But most of the time it has the CM_END byte correct! I am really confused and can't find the reason. Working on an embedded platform that is hard to debug, I am kind of lost here...
If you transfer data to another (different) architecture, do not just pass a structure as a blob. That is the way to hell: endianness, alignment, padding bytes, etc. can all (and likely will) cause trouble.
Better to serialize the struct in a conforming way, possibly using some interpreted control stream so you do not have to write every field out manually. (But still use standard functions to generate that stream.)
Some areas of potential or likely trouble:
CM_Footer: The second field might very well start at a 32- or 64-bit boundary, so the preceding field will be followed by padding. Also, the end of that struct is very likely to be padded by at least 1 byte on a 32-bit architecture to allow for proper alignment if used in an array (the compiler does not care whether you actually need this). It might even be 8-byte aligned.
CM_Data: Here you likely (not guaranteed) get one uint8_t holding 4*2 bits, with the bit ordering not standardized. The field may be followed by 3 unused bytes, which are required for the uint32_t interpretation of the union.
How do you guarantee the same endianness (for types wider than uint8_t: high byte first or low byte first?) on host and target?
In general, the structs/unions need not have the same layout on host and target. Even if the same compiler is used, the ABIs (application binary interfaces) may differ. Even on the same CPU there might be other system constraints, and for some CPUs several different ABIs exist.
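The serialization advice above could look roughly like the following. It is a sketch under stated assumptions: WirePacket, put_u16_le, put_u32_le and wire_serialize are hypothetical names, the wire format is assumed to be 3 header bytes, four 32-bit values and 3 footer bytes, and little-endian byte order on the wire is an arbitrary choice that both sides would have to agree on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WIRE_PACKET_SIZE 22u  /* 3 header + 16 data + 3 footer bytes */

/* Hypothetical in-memory packet; its padding in RAM does not matter,
 * because no byte of it is copied as a blob. */
typedef struct {
    uint8_t  start, device_id, type;
    uint32_t value[4];
    uint16_t crc;
    uint8_t  end;
} WirePacket;

/* Write v low byte first; returns the number of bytes written. */
static size_t put_u16_le(uint8_t *out, uint16_t v)
{
    out[0] = (uint8_t)(v & 0xFFu);
    out[1] = (uint8_t)(v >> 8);
    return 2;
}

static size_t put_u32_le(uint8_t *out, uint32_t v)
{
    for (size_t i = 0; i < 4; i++) {
        out[i] = (uint8_t)((v >> (8u * i)) & 0xFFu);
    }
    return 4;
}

/* Emit every field explicitly, in a fixed order and byte order. */
static size_t wire_serialize(const WirePacket *p, uint8_t out[WIRE_PACKET_SIZE])
{
    size_t n = 0;
    out[n++] = p->start;
    out[n++] = p->device_id;
    out[n++] = p->type;
    for (size_t i = 0; i < 4; i++) {
        n += put_u32_le(out + n, p->value[i]);
    }
    n += put_u16_le(out + n, p->crc);
    out[n++] = p->end;
    assert(n == WIRE_PACKET_SIZE);
    return n;
}
```

Because the output length is produced by the serializer itself, the transmit call can use the returned count instead of a sizeof expression on the struct.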

Hooking mmap system to provide real-time type conversion?

I'm working on some stuff where I want to memory map some large files containing numeric data. The problem is that the data can be a number of formats, including real byte/short/int/long/float/double and complex byte/short/int/long/float/double. Naturally handling all those types all the time quickly gets unwieldy, so I was thinking of implementing a memory mapping interface that can do real-time type conversion for the user.
I really like the idea of mapping a file so you get a pointer in memory back, doing whatever you need and then unmapping it. No bufferology or anything else needed. So a function that reads the data and does the type conversion for me would take a lot away from that.
I was thinking I could memory map the file being operated on, and then simultaneously mapping an anonymous file, and somehow catching page fetches/stores and doing the type conversion on demand. I'll be working on 64-bit so this would give you a 63-bit address space in these cases, but oh well.
Does anyone know if this sort of mmap hooking would be possible, and if so, how might it be accomplished?
Yes(-ish). You can create inaccessible mmap regions. Whenever anybody tries to touch one, handle the SIGSEGV raised by fixing its permissions, filling it, and resuming.
long *long_view =
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
double *double_view =
mmap(NULL, 4096, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
static void on_segv(int signum, siginfo_t *info, void *data) {
void *addr = info->si_addr;
if ((uintptr_t)addr - (uintptr_t)long_view < 4096) {
mprotect(long_view, 4096, PROT_READ|PROT_WRITE);
/* translate from double_view to long_view */
mprotect(double_view, 4096, PROT_NONE);
} else if ((uintptr_t)addr - (uintptr_t)double_view < 4096) {
mprotect(double_view, 4096, PROT_READ|PROT_WRITE);
/* translate from long_view to double_view */
mprotect(long_view, 4096, PROT_NONE);
} else {
abort();
}
}
struct sigaction segv_action = {
.sa_sigaction = on_segv,
.sa_flags = SA_RESTART | SA_SIGINFO,
};
sigaction(SIGSEGV, &segv_action, NULL);
long_view[0] = 42;
/* hopefully, this will trigger the code to fixup double_view and resume */
printf("%g\n", double_view[0]);
(Untested, but something along these lines ought to work...)
If you don't want to fill a whole page at once, that's still doable, I think. The third argument can be cast to a ucontext_t *, with which you can decode the instruction being executed and fix it up as if it had performed the expected operation, while leaving the memory PROT_NONE to catch further accesses. But it'll be a lot slower, since you're trapping every access rather than just the first.
The reading part sounds somewhat doable to me. I have no experience with that, but in principle it should be possible to have a signal handler fetch your data and translate it as soon as you access a page that is not yet present in your user-presented buffer. It could be quite inefficient, though: you'd have a context switch on every page.
The other way round would be much harder, I guess. By default, writes are asynchronous, so it will be difficult to capture them.
So "half" of what you want could be possible: always write the data to a new file in the format that the user wants, but translate it automatically on the fly when reading such a file.
But what would be much more important for you, I think, is that you have clear semantics for your different storage representations, and that you encapsulate each read or write of a data item properly. If you have such an interface (something like "store element E at position i with type T"), you could easily trigger the conversion with respect to the target format.
Is there a reason why you're not using accessor functions?
There are two basic cases: structured data, and plain data. Structured data has mixed data types; plain data has only one of the formats you listed. You could also support transparent endianness correction, if you can store a prototype value (with distinct byte components) for each used type (28 or 30 bytes total for all types you listed -- complex types are just pairs, and have the same byte order as the base components). I have used this approach to store and access atom data from molecular dynamics simulations, and it is quite efficient in practice -- the fastest portable one I've tested.
I would use a structure to describe the "file" (the backing file, memory map, endianness, and data format if plain data):
struct file {
int descriptor;
size_t size;
unsigned char *data;
unsigned int endian; /* Relative to current architecture */
unsigned int format; /* endian | format, for unstructured files */
};
#define ENDIAN_I16_MASK 0x0001
#define ENDIAN_I16_12 0x0000
#define ENDIAN_I16_21 0x0001
#define ENDIAN_I32_MASK 0x0006
#define ENDIAN_I32_1234 0x0000
#define ENDIAN_I32_4321 0x0002
#define ENDIAN_I32_2143 0x0004
#define ENDIAN_I32_3412 0x0006
#define ENDIAN_I64_MASK 0x0018
#define ENDIAN_I64_1234 0x0000
#define ENDIAN_I64_4321 0x0008
#define ENDIAN_I64_2143 0x0010
#define ENDIAN_I64_3412 0x0018
#define ENDIAN_F16_MASK 0x0020
#define ENDIAN_F16_12 0x0000
#define ENDIAN_F16_21 0x0020
#define ENDIAN_F32_MASK 0x00C0
#define ENDIAN_F32_1234 0x0000
#define ENDIAN_F32_4321 0x0040
#define ENDIAN_F32_2143 0x0080
#define ENDIAN_F32_3412 0x00C0
#define ENDIAN_F64_MASK 0x0300
#define ENDIAN_F64_1234 0x0000
#define ENDIAN_F64_4321 0x0100
#define ENDIAN_F64_2143 0x0200
#define ENDIAN_F64_3412 0x0300
#define FORMAT_MASK 0xF000
#define FORMAT_I8 0x1000
#define FORMAT_I16 0x2000
#define FORMAT_I32 0x3000
#define FORMAT_I64 0x4000
#define FORMAT_P8 0x5000 /* I8 pair ("complex I8") */
#define FORMAT_P16 0x6000 /* I16 pair ("complex I16") */
#define FORMAT_P32 0x7000 /* I32 pair ("complex I32") */
#define FORMAT_P64 0x8000 /* I64 pair ("complex I64") */
#define FORMAT_R16 0x9000 /* BINARY16 IEEE-754 floating-point */
#define FORMAT_R32 0xA000 /* BINARY32 IEEE-754 floating-point */
#define FORMAT_R64 0xB000 /* BINARY64 IEEE-754 floating-point */
#define FORMAT_C16 0xC000 /* BINARY16 IEEE-754 complex */
#define FORMAT_C32 0xD000 /* BINARY32 IEEE-754 complex */
#define FORMAT_C64 0xE000 /* BINARY64 IEEE-754 complex */
The accessor functions can be implemented in various ways. With GCC, functions marked static inline are typically as fast as macros, too.
Since the double type does not fully cover 64-bit integer types (its significand has only 53 bits of precision), I'd define a number structure,
#include <stdint.h>
struct number {
int64_t ireal;
int64_t iimag;
double freal;
double fimag;
};
and have the accessor functions always fill in the four fields. Using GCC, you can also create a macro to define a struct number using automatic type detection:
#define Number(x) \
( __builtin_types_compatible_p(__typeof__ (x), double) ? number_double(x) : \
__builtin_types_compatible_p(__typeof__ (x), _Complex double) ? number_complex_double(x) : \
__builtin_types_compatible_p(__typeof__ (x), _Complex long) ? number_complex_long(x) : \
number_int64(x) )
static inline struct number number_int64(const int64_t x)
{
return (struct number){ .ireal = (int64_t)x,
.iimag = 0,
.freal = (double)x,
.fimag = 0.0 };
}
static inline struct number number_double(const double x)
{
return (struct number){ .ireal = (int64_t)x,
.iimag = 0,
.freal = x,
.fimag = 0.0 };
}
static inline struct number number_complex_long(const _Complex long x)
{
return (struct number){ .ireal = (int64_t)(__real__ (x)),
.iimag = (int64_t)(__imag__ (x)),
.freal = (double)(__real__ (x)),
.fimag = (double)(__imag__ (x)) };
}
static inline struct number number_complex_double(const _Complex double x)
{
return (struct number){ .ireal = (int64_t)(__real__ (x)),
.iimag = (int64_t)(__imag__ (x)),
.freal = __real__ (x),
.fimag = __imag__ (x) };
}
This means that Number(value) constructs a correct struct number as long as value is an integer or floating-point real or complex type.
Note how the integer and floating-point components are set to the same values, as far as type conversions allow. (For integers that are very large in magnitude, the floating-point value will be an approximation.) You could also use (int64_t)round(...) to round rather than truncate the floating-point parameter when setting the integer component(s).
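The approximation for large integers can be made concrete with a small round-trip test. This is a sketch only: fits_in_double and check_double_precision are hypothetical helpers, and the code assumes double is IEEE-754 binary64.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if x survives a conversion to double and back unchanged. */
static bool fits_in_double(int64_t x)
{
    const double d = (double)x;

    /* Values near INT64_MAX round up to 2^63, which cannot be cast
     * back to int64_t, so treat them as not representable. */
    if (d >= 9223372036854775808.0) {
        return false;
    }
    return (int64_t)d == x;
}

/* Quick self-check of the boundary at 2^53. */
static void check_double_precision(void)
{
    assert(fits_in_double(INT64_C(1) << 53));         /* 2^53 is exact */
    assert(!fits_in_double((INT64_C(1) << 53) + 1));  /* 2^53 + 1 is not */
}
```

This is the reason for carrying both ireal and freal in struct number: the integer field keeps exact values that the floating-point field can only approximate.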
You'll need four accessor functions: Two for structured data, and two for unstructured data. For unstructured (plain) data:
static inline struct number file_get_number(const struct file *const file,
const size_t offset)
{
...
}
static inline void file_set_number(const struct file *const file,
const size_t offset,
const struct number number)
{
...
}
Note that offset above is not the byte offset, but the index of the number. For a structured file, you'll need to use the byte offset, and add a parameter specifying the number format used in the file:
static inline struct number file_get(const struct file *const file,
const size_t byteoffset,
const unsigned int format)
{
...
}
static inline void file_set(const struct file *const file,
const size_t byteoffset,
const unsigned int format,
const struct number number)
{
...
}
The conversions needed in the function bodies I omitted (...) are quite straightforward. There are a few tricks you can do for optimization, too. For example, I like to adjust the endianness constants so that the low bit is always a byte swap (ab -> ba, abcd -> badc, abcdefgh -> badcfehg), and the high bit is a short swap (abcd -> cdab, abcdefgh ->cdabghef). You might need a third bit for 64-bit values (abcdefgh -> efghabcd), if you want to be completely certain.
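The two-bit trick described above might be sketched like this for 32-bit values. SWAP_BYTES, SWAP_SHORTS and reorder32 are hypothetical names chosen for illustration; they are not the ENDIAN_ constants defined earlier, which use a different encoding.

```c
#include <assert.h>
#include <stdint.h>

#define SWAP_BYTES  0x1u  /* abcd -> badc: swap bytes inside each 16-bit half */
#define SWAP_SHORTS 0x2u  /* abcd -> cdab: swap the two 16-bit halves        */

/* Apply the swaps selected by the two low bits of `swaps`. */
static uint32_t reorder32(uint32_t v, unsigned int swaps)
{
    assert((swaps & ~(SWAP_BYTES | SWAP_SHORTS)) == 0u);

    if (swaps & SWAP_BYTES) {
        v = ((v & 0x00FF00FFu) << 8) | ((v >> 8) & 0x00FF00FFu);
    }
    if (swaps & SWAP_SHORTS) {
        v = (v << 16) | (v >> 16);
    }
    return v;
}
```

With this encoding the four possible bit patterns cover all four byte orders, and setting both bits yields a full byte reversal, so no switch statement is needed in the accessor.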
The if or case statements within the function body do cause a small access overhead, but it should be small enough to ignore in practice. All ways to avoid it will lead to much more complex code. (For maximum throughput, you'd need to open-code all access variants, and use __builtin_types_compatible_p() in a function or macro to determine the correct one to use. If you consider the endianness conversion, that means quite a few functions.) I believe the very small access overhead -- a few clocks at most per access -- is much preferable. (All my tests have been I/O bound anyway, even at 200 Mb/s, so to me the overhead is completely irrelevant.)
In general, for automatic endianness conversion using prototype values you simply test each possible conversion for the type. As long as each byte component of the prototype value is unique, only one conversion will produce the expected value. On some architectures integers and floating-point values have different endianness; this is why the ENDIAN_ constants are for each type and size separately.
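Prototype-based detection for 32-bit integers might be sketched as follows. The ENDIAN_I32_ values are copied from the definitions earlier in this answer; everything else is an assumption made for illustration: the prototype value 0x01020304, the helper names convert_i32 and detect_i32_order, and the reading of "4321", "2143" and "3412" as full reversal, byte swap within halves, and half swap respectively.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ENDIAN_I32_1234 0x0000
#define ENDIAN_I32_4321 0x0002
#define ENDIAN_I32_2143 0x0004
#define ENDIAN_I32_3412 0x0006

#define I32_PROTOTYPE UINT32_C(0x01020304)  /* hypothetical prototype value */

/* Apply one byte-order conversion; each is its own inverse. */
static uint32_t convert_i32(uint32_t v, unsigned int order)
{
    switch (order) {
    case ENDIAN_I32_4321:  /* reverse all four bytes */
        return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
               ((v << 8) & 0x00FF0000u) | (v << 24);
    case ENDIAN_I32_2143:  /* swap bytes within each 16-bit half */
        return ((v & 0x00FF00FFu) << 8) | ((v >> 8) & 0x00FF00FFu);
    case ENDIAN_I32_3412:  /* swap the two 16-bit halves */
        return (v << 16) | (v >> 16);
    default:
        assert(order == ENDIAN_I32_1234);
        return v;
    }
}

/* Returns the ENDIAN_I32_* constant that maps the stored prototype back
 * to I32_PROTOTYPE, or -1 if no conversion matches. */
static int detect_i32_order(uint32_t stored)
{
    static const unsigned int orders[] = {
        ENDIAN_I32_1234, ENDIAN_I32_4321, ENDIAN_I32_2143, ENDIAN_I32_3412
    };
    for (size_t i = 0; i < sizeof orders / sizeof orders[0]; i++) {
        if (convert_i32(stored, orders[i]) == I32_PROTOTYPE) {
            return (int)orders[i];
        }
    }
    return -1;
}
```

The detected constant would be stored once in the endian field of struct file when the file is opened, and then applied by the accessors on every read and write.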
Assuming you have implemented all of the above, in your application code the data access would look something like
struct file f;
/* Set first number to zero. */
file_set_number(&f, 0, Number(0));
/* Set second number according to variable v,
* which can be just about any numeric type. */
file_set_number(&f, 1, Number(v));
I hope you find this useful.

How to use WAVEHDR

Where can I find information about what data should be in the lpData buffer for the WAVEHDR structure?
MSDN simply says:
lpData
Pointer to the waveform buffer.
typedef struct wavehdr_tag {
LPSTR lpData;
DWORD dwBufferLength;
DWORD dwBytesRecorded;
DWORD_PTR dwUser;
DWORD dwFlags;
DWORD dwLoops;
struct wavehdr_tag *lpNext;
DWORD_PTR reserved;
} WAVEHDR, *LPWAVEHDR;
Thanks
I found this tutorial by David Overton very helpful.
Basically, when you call waveOutOpen, you pass in a format structure. Here's from his code:
WAVEFORMATEX wfx; /* look this up in your documentation */
wfx.nSamplesPerSec = 44100; /* sample rate */
wfx.wBitsPerSample = 16; /* sample size */
wfx.nChannels = 2; /* channels*/
Then your data in lpData is just 2 bytes per sample (signed short int), left, right, left, etc.
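The interleaved layout described above can be illustrated without any Windows headers. This is a sketch under assumptions: fill_stereo_square and stereo16_bytes are hypothetical helpers, the format is 16-bit stereo as in the wfx fields shown, and a square wave is used only because it needs no math library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill `frames` stereo frames with a square wave.
 * Buffer layout: left, right, left, right, ... (one int16_t each). */
static void fill_stereo_square(int16_t *samples, size_t frames,
                               size_t half_period, int16_t amplitude)
{
    assert(samples != NULL && half_period > 0 && amplitude >= 0);
    for (size_t i = 0; i < frames; i++) {
        const int16_t s = ((i / half_period) % 2 == 0)
                              ? amplitude
                              : (int16_t)-amplitude;
        samples[2 * i]     = s;  /* left channel  */
        samples[2 * i + 1] = s;  /* right channel */
    }
}

/* Buffer size in bytes: frames * 2 channels * 2 bytes per sample. */
static size_t stereo16_bytes(size_t frames)
{
    return frames * 2u * sizeof(int16_t);
}
```

With a WAVEHDR, lpData would point at the sample array (cast to LPSTR) and dwBufferLength would be the value from stereo16_bytes. At 44100 samples per second, a half_period of 50 frames gives a 441 Hz tone.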
lpData is like an old DOS DMA buffer: you can write a piece of the track into it as a single block and loop it.
In C, declare a suitably sized array, for example char myarray[propersize], and then point lpData at it: myhdrstruc.lpData = &myarray[0].
CCRMA has a decent overview of the wave file format.
It's vague because the data can be in a variety of formats. The format is typically specified by a WAVEFORMATEX.

Utilizing the LDT (Local Descriptor Table)

I am trying to do some experiments using different segments besides the default code and data user and kernel segments. I hope to achieve this through use of the local descriptor table and the modify_ldt system call. Through the system call I have created a new entry in LDT which is a segment descriptor with a base address of a global variable I want to "isolate" and a limit of 4 bytes.
I try to load the data segment register with the segment selector of my custom LDT entry through inline assembly in a C program, but when I try to access the variable I receive a segmentation fault.
My suspicion is that there is an issue with the offset of my global variable, and when the address is calculated, it exceeds the limit of my custom segment therefore causing a seg fault.
Does anyone know of a workaround for this situation?
Oh, by the way, this is on an x86 architecture in Linux. This is my first time asking a question like this on a forum, so if there is any other information that could prove to be useful, please let me know.
Thank you in advance.
Edit: I realized that I probably should include the source code :)
struct user_desc* table_entry_ptr = NULL;
/* Allocates memory for a user_desc struct */
table_entry_ptr = (struct user_desc*)malloc(sizeof(struct user_desc));
/* Fills the user_desc struct which represents the segment for mx */
table_entry_ptr->entry_number = 0;
table_entry_ptr->base_addr = ((unsigned long)&mx);
table_entry_ptr->limit = 0x4;
table_entry_ptr->seg_32bit = 0x1;
table_entry_ptr->contents = 0x0;
table_entry_ptr->read_exec_only = 0x0;
table_entry_ptr->limit_in_pages = 0x0;
table_entry_ptr->seg_not_present = 0x0;
table_entry_ptr->useable = 0x1;
/* Writes a user_desc struct to the ldt */
num_bytes = syscall( __NR_modify_ldt,
LDT_WRITE, // 1
table_entry_ptr,
sizeof(struct user_desc)
);
asm("pushl %eax");
asm("movl $0x7, %eax"); /* 0111: 0-Index 1-Using the LDT table 11-RPL of 3 */
asm("movl %eax, %ds");
asm("popl %eax");
mx = 0x407CAFE;
The seg fault occurs at that last instruction.
I can only guess, since I don't have the assembly available to me.
I'm guessing that the line at which you get a segfault is compiled to something like:
mov ds:[offset mx], 0x407cafe
Where offset mx is the offset to mx in the program's data segment (if it's a static variable) or in the stack (if it's an automatic variable). Either way, this offset is calculated at compile time, and that's what will be used regardless of what DS points to.
Now what you've done here is create a new segment whose base is at the address of mx and whose limit is either 0x4 or 0x4fff (depending on the G-bit, which corresponds to limit_in_pages in struct user_desc).
If the G-bit is 0, then the limit is 0x4, and since it's highly unlikely that mx is located between addresses 0x0 and 0x4 of the original DS, when you access the offset to mx inside the new segment you're crossing the limit.
If the G-bit is 1, then the limit is 0x4fff. Now you'll get a segfault only if the original mx was located above 0x4fff.
Considering that the new segment's base is at mx, you can access mx by doing:
mov ds:[0], 0x407cafe
I don't know how I'd go about writing that in C, though.
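The selector value and the limit check discussed above can be modeled in ordinary C. This does not touch the LDT at all; it is only a sketch of the arithmetic, with make_selector, effective_limit and access_in_segment as hypothetical helper names and an expand-up data segment assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Selector layout: bits 15..3 index, bit 2 table (1 = LDT), bits 1..0 RPL. */
static uint16_t make_selector(unsigned int index, bool use_ldt, unsigned int rpl)
{
    assert(index < 8192u && rpl < 4u);
    return (uint16_t)((index << 3) | (use_ldt ? 4u : 0u) | rpl);
}

/* Last valid offset in the segment: with the G-bit set, the 20-bit limit
 * counts 4 KiB pages and the low 12 bits are all ones. */
static uint32_t effective_limit(uint32_t limit, bool limit_in_pages)
{
    return limit_in_pages ? ((limit << 12) | 0xFFFu) : limit;
}

/* True if `size` bytes starting at `offset` lie inside the segment. */
static bool access_in_segment(uint32_t offset, uint32_t size,
                              uint32_t limit, bool limit_in_pages)
{
    const uint32_t last = effective_limit(limit, limit_in_pages);
    return size != 0u && offset <= last && size - 1u <= last - offset;
}
```

make_selector(0, true, 3) gives 0x7, the value loaded into DS in the question. A 4-byte store at offset 0 passes the check with limit 0x4, while a store at the variable's original flat offset (for example a hypothetical 0x0804A020) fails it, which matches the fault described above.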

Resources