How to chain BCryptEncrypt and BCryptDecrypt calls using AES in GCM mode? - c

Using the Windows CNG API, I am able to encrypt and decrypt individual blocks of data with authentication, using AES in GCM mode. I now want to encrypt and decrypt multiple buffers in a row.
According to documentation for CNG, the following scenario is supported:
If the input to encryption or decryption is scattered across multiple
buffers, then you must chain calls to the BCryptEncrypt and
BCryptDecrypt functions. Chaining is indicated by setting the
BCRYPT_AUTH_MODE_IN_PROGRESS_FLAG flag in the dwFlags member.
If I understand it correctly, this means that I can invoke BCryptEncrypt sequentially on multiple buffers and obtain the authentication tag for the combined buffers at the end. Similarly, I can invoke BCryptDecrypt sequentially on multiple buffers while deferring the actual authentication check until the end. I cannot get that to work, though; it looks like the value of dwFlags is ignored. Whenever I use BCRYPT_AUTH_MODE_IN_PROGRESS_FLAG, I get a return value of 0xc000a002, which is equal to STATUS_AUTH_TAG_MISMATCH as defined in ntstatus.h.
Even though the parameter pbIV is marked as in/out, the elements pointed to by the parameter pbIV do not get modified by BCryptEncrypt(). Is that expected? I also looked at the field pbNonce in the BCRYPT_AUTHENTICATED_CIPHER_MODE_INFO structure, pointed to by the pPaddingInfo pointer, but that one does not get modified either. I also tried "manually" advancing the IV, modifying the contents myself according to the counter scheme, but that did not help either.
What is the right procedure to chain the BCryptEncrypt and/or BCryptDecrypt functions successfully?

I managed to get it to work. The problem seems to be in the MSDN documentation: it should mention setting BCRYPT_AUTH_MODE_CHAIN_CALLS_FLAG instead of BCRYPT_AUTH_MODE_IN_PROGRESS_FLAG.
#include <windows.h>
#include <tchar.h>  // _tmain, _TCHAR
#include <stdlib.h> // rand
#include <assert.h>
#include <vector>
#include <Bcrypt.h>
#pragma comment(lib, "bcrypt.lib")
std::vector<BYTE> MakePatternBytes(size_t a_Length)
{
std::vector<BYTE> result(a_Length);
for (size_t i = 0; i < result.size(); i++)
{
result[i] = (BYTE)i;
}
return result;
}
std::vector<BYTE> MakeRandomBytes(size_t a_Length)
{
std::vector<BYTE> result(a_Length);
for (size_t i = 0; i < result.size(); i++)
{
result[i] = (BYTE)rand();
}
return result;
}
int _tmain(int argc, _TCHAR* argv[])
{
NTSTATUS bcryptResult = 0;
DWORD bytesDone = 0;
BCRYPT_ALG_HANDLE algHandle = 0;
bcryptResult = BCryptOpenAlgorithmProvider(&algHandle, BCRYPT_AES_ALGORITHM, 0, 0);
assert(BCRYPT_SUCCESS(bcryptResult) || !"BCryptOpenAlgorithmProvider");
bcryptResult = BCryptSetProperty(algHandle, BCRYPT_CHAINING_MODE, (BYTE*)BCRYPT_CHAIN_MODE_GCM, sizeof(BCRYPT_CHAIN_MODE_GCM), 0);
assert(BCRYPT_SUCCESS(bcryptResult) || !"BCryptSetProperty(BCRYPT_CHAINING_MODE)");
BCRYPT_AUTH_TAG_LENGTHS_STRUCT authTagLengths;
bcryptResult = BCryptGetProperty(algHandle, BCRYPT_AUTH_TAG_LENGTH, (BYTE*)&authTagLengths, sizeof(authTagLengths), &bytesDone, 0);
assert(BCRYPT_SUCCESS(bcryptResult) || !"BCryptGetProperty(BCRYPT_AUTH_TAG_LENGTH)");
DWORD blockLength = 0;
bcryptResult = BCryptGetProperty(algHandle, BCRYPT_BLOCK_LENGTH, (BYTE*)&blockLength, sizeof(blockLength), &bytesDone, 0);
assert(BCRYPT_SUCCESS(bcryptResult) || !"BCryptGetProperty(BCRYPT_BLOCK_LENGTH)");
BCRYPT_KEY_HANDLE keyHandle = 0;
{
const std::vector<BYTE> key = MakeRandomBytes(blockLength);
bcryptResult = BCryptGenerateSymmetricKey(algHandle, &keyHandle, 0, 0, (PUCHAR)&key[0], key.size(), 0);
assert(BCRYPT_SUCCESS(bcryptResult) || !"BCryptGenerateSymmetricKey");
}
const size_t GCM_NONCE_SIZE = 12;
const std::vector<BYTE> origNonce = MakeRandomBytes(GCM_NONCE_SIZE);
const std::vector<BYTE> origData = MakePatternBytes(256);
// Encrypt data as a whole
std::vector<BYTE> encrypted = origData;
std::vector<BYTE> authTag(authTagLengths.dwMinLength);
{
BCRYPT_AUTHENTICATED_CIPHER_MODE_INFO authInfo;
BCRYPT_INIT_AUTH_MODE_INFO(authInfo);
authInfo.pbNonce = (PUCHAR)&origNonce[0];
authInfo.cbNonce = origNonce.size();
authInfo.pbTag = &authTag[0];
authInfo.cbTag = authTag.size();
bcryptResult = BCryptEncrypt
(
keyHandle,
&encrypted[0], encrypted.size(),
&authInfo,
0, 0,
&encrypted[0], encrypted.size(),
&bytesDone, 0
);
assert(BCRYPT_SUCCESS(bcryptResult) || !"BCryptEncrypt");
assert(bytesDone == encrypted.size());
}
// Decrypt data in two parts
std::vector<BYTE> decrypted = encrypted;
{
DWORD partSize = decrypted.size() / 2;
std::vector<BYTE> macContext(authTagLengths.dwMaxLength);
BCRYPT_AUTHENTICATED_CIPHER_MODE_INFO authInfo;
BCRYPT_INIT_AUTH_MODE_INFO(authInfo);
authInfo.pbNonce = (PUCHAR)&origNonce[0];
authInfo.cbNonce = origNonce.size();
authInfo.pbTag = &authTag[0];
authInfo.cbTag = authTag.size();
authInfo.pbMacContext = &macContext[0];
authInfo.cbMacContext = macContext.size();
// IV value is ignored on first call to BCryptDecrypt.
// This buffer will be used to keep internal IV used for chaining.
std::vector<BYTE> contextIV(blockLength);
// First part
authInfo.dwFlags = BCRYPT_AUTH_MODE_CHAIN_CALLS_FLAG;
bcryptResult = BCryptDecrypt
(
keyHandle,
&decrypted[0*partSize], partSize,
&authInfo,
&contextIV[0], contextIV.size(),
&decrypted[0*partSize], partSize,
&bytesDone, 0
);
assert(BCRYPT_SUCCESS(bcryptResult) || !"BCryptDecrypt");
assert(bytesDone == partSize);
// Second part
authInfo.dwFlags &= ~BCRYPT_AUTH_MODE_CHAIN_CALLS_FLAG;
bcryptResult = BCryptDecrypt
(
keyHandle,
&decrypted[1*partSize], partSize,
&authInfo,
&contextIV[0], contextIV.size(),
&decrypted[1*partSize], partSize,
&bytesDone, 0
);
assert(BCRYPT_SUCCESS(bcryptResult) || !"BCryptDecrypt");
assert(bytesDone == partSize);
}
// Check decryption
assert(decrypted == origData);
// Cleanup
BCryptDestroyKey(keyHandle);
BCryptCloseAlgorithmProvider(algHandle, 0);
return 0;
}

@Codeguard's answer got me through the project I was working on, which led me to find this question/answer in the first place; however, there were still a number of gotchas I struggled with. Below is the process I followed, with the tricky parts called out. You can view the actual code at the link above:
Use BCryptOpenAlgorithmProvider to open the algorithm provider using BCRYPT_AES_ALGORITHM.
Use BCryptSetProperty to set the BCRYPT_CHAINING_MODE to BCRYPT_CHAIN_MODE_GCM.
Use BCryptGetProperty to get the BCRYPT_OBJECT_LENGTH to allocate for use by the BCrypt library for the encrypt/decrypt operation. Depending on your implementation, you may also want to:
Use BCryptGetProperty to determine BCRYPT_BLOCK_LENGTH and allocate scratch space for the IV. The Windows API updates the IV with each call, and the caller is responsible for providing the memory for that usage.
Use BCryptGetProperty to determine BCRYPT_AUTH_TAG_LENGTH and allocate scratch space for the largest possible tag. Like the IV, the caller is responsible for providing this space, which the API updates each time.
Initialize the BCRYPT_AUTHENTICATED_CIPHER_MODE_INFO struct:
Initialize the structure with BCRYPT_INIT_AUTH_MODE_INFO()
Initialize the pbNonce and cbNonce field. Note that for the first call to BCryptEncrypt/BCryptDecrypt, the IV is ignored as an input and this field is used as the "IV". However, the IV parameter will be updated by that first call and used by subsequent calls, so space for it must still be provided. In addition, the pbNonce and cbNonce fields must remain set (even though they are unused after the first call) for all calls to BCryptEncrypt/BCryptDecrypt or those calls will complain.
Initialize pbAuthData and cbAuthData. In my project, I set these fields just before the first call to BCryptEncrypt/BCryptDecrypt and reset them to NULL/0 immediately afterward. You can pass NULL/0 as the input and output parameters during these calls.
Initialize pbTag and cbTag. pbTag can be NULL until the final call to BCryptEncrypt/BCryptDecrypt when the tag is retrieved or checked, but cbTag must be set or else BCryptEncrypt/BCryptDecrypt will complain.
Initialize pbMacContext and cbMacContext. These point to a scratch space for the BCryptEncrypt/BCryptDecrypt to use to keep track of the current state of the tag/mac.
Initialize cbAAD and cbData to 0. The APIs use these fields, so you can read them at any time, but you should not update them after initially setting them to 0.
Initialize dwFlags to BCRYPT_AUTH_MODE_CHAIN_CALLS_FLAG. After initialization, changes to this field should be made by using |= or &=. Windows also sets flags within this field that the caller needs to take care not to alter.
Use BCryptGenerateSymmetricKey to import the key to use for encryption/decryption. Note that you will need to supply the memory associated with BCRYPT_OBJECT_LENGTH to this call for use by BCryptEncrypt/BCryptDecrypt during operation.
Call BCryptEncrypt/BCryptDecrypt with your AAD, if any; no input nor space for output need be supplied for this call. (If the call succeeds, you can see the size of your AAD reflected in the cbAAD field of the BCRYPT_AUTHENTICATED_CIPHER_MODE_INFO structure.)
Set pbAuthData and cbAuthData to reflect the AAD.
Call BCryptEncrypt or BCryptDecrypt.
Set pbAuthData and cbAuthData back to NULL and 0.
Call BCryptEncrypt/BCryptDecrypt "N - 1" times
The amount of data passed to each call must be a multiple of the algorithm's block size.
Do not set the dwFlags parameter of the call to anything other than 0.
The output space must be equal to or greater than the size of the input
Call BCryptEncrypt/BCryptDecrypt one final time (with or without plain/cipher text input/output). The size of the input need not be a multiple of the algorithm's block size for this call. dwFlags is still set to 0.
Set the pbTag field of the BCRYPT_AUTHENTICATED_CIPHER_MODE_INFO structure either to the location at which to store the generated tag or to the location of the tag to verify against, depending on whether the operation is an encryption or decryption.
Remove the BCRYPT_AUTH_MODE_CHAIN_CALLS_FLAG from the dwFlags field of the BCRYPT_AUTHENTICATED_CIPHER_MODE_INFO structure using the &= syntax.
Call BCryptDestroyKey
Call BCryptCloseAlgorithmProvider
It would be wise, at this point, to wipe out the space associated with BCRYPT_OBJECT_LENGTH.
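The chunk-size rule from the steps above (every chained call except the last must receive a multiple of the block size) can be sketched as a small portable helper. This is only an illustration: next_chunk_len and its parameters are hypothetical names, not part of CNG, and in real code the block length would come from BCRYPT_BLOCK_LENGTH at run time.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of CNG: decides how many bytes to pass to the
 * next chained BCryptEncrypt/BCryptDecrypt call.
 *   remaining - bytes of input not yet processed
 *   max_chunk - most bytes the caller can pass in one call
 *   block_len - cipher block length (16 for AES)
 * Sets *is_final to 1 when the returned chunk consumes all remaining input,
 * i.e. this is the call before which BCRYPT_AUTH_MODE_CHAIN_CALLS_FLAG is
 * cleared. Returns 0 with *is_final == 0 if max_chunk cannot hold one block.
 */
static size_t next_chunk_len(size_t remaining, size_t max_chunk,
                             size_t block_len, int *is_final)
{
    assert(block_len > 0 && is_final != NULL);

    if (remaining <= max_chunk) {
        *is_final = 1;                       /* last call: any length is allowed */
        return remaining;
    }
    *is_final = 0;                           /* intermediate call */
    return (max_chunk / block_len) * block_len; /* round down to whole blocks */
}
```

With 256 bytes of input, a 100-byte working buffer and a 16-byte block, this yields chunks of 96, 96 and 64 bytes, and only the last one is marked final.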

Related

IMFTransform::ProcessOutput returns E_INVALIDARG

The problem
I am trying to call ProcessOutput to get decoded data from my decoder, and I get the following error:
E_INVALIDARG One or more arguments are invalid.
What I have tried
As ProcessOutput has many arguments I have tried to pinpoint what the error might be. Documentation for ProcessOutput does not mention E_INVALIDARG. However, the documentation for MFT_OUTPUT_DATA_BUFFER, the datatype for one of the arguments, mentions in its Remarks section that:
Any other combinations are invalid and cause ProcessOutput to return E_INVALIDARG
It refers to how the MFT_OUTPUT_DATA_BUFFER struct is set up, so an incorrectly set up MFT_OUTPUT_DATA_BUFFER might cause that error. I have, however, tried to set it up correctly.
By calling GetOutputStreamInfo I find that I need to allocate the sample sent to ProcessOutput, which is what I do. I'm using pretty much the same method that worked for ProcessInput, so I don't know what I am doing wrong here.
I have also tried to check the other arguments, which logically should also be able to cause an E_INVALIDARG. They look good to me, and I have not been able to find any other leads as to which of my arguments to ProcessOutput might be invalid.
The code
I have tried to post only the relevant parts of the code below. I have removed or shortened many of the error checks for brevity. Note that I am using plain C.
"Prelude"
...
hr = pDecoder->lpVtbl->SetOutputType(pDecoder, dwOutputStreamID, pMediaOut, dwFlags);
...
// Send input to decoder
hr = pDecoder->lpVtbl->ProcessInput(pDecoder, dwInputStreamID, pSample, dwFlags);
if (FAILED(hr)) { /* did not fail */ }
So before the interesting code below, I have successfully set things up (I hope) and sent them to ProcessInput, which did not fail. I have 1 input stream and 1 output stream, AAC in, PCM out.
Code directly leading to the error
// Input has now been sent to the decoder
// To extract a sample from the decoder we need to create a structure to hold the output
// First we ask the OutputStream for what type of output sample it will produce and who should allocate it
// Then we create both the sample in question (if we should allocate it that is) and the MFT_OUTPUT_DATA_BUFFER
// which holds the sample and some other information that the decoder will fill in.
#define SAMPLES_PER_BUFFER 1 // hardcoded here, should depend on GetStreamIDs results, which right now is 1
MFT_OUTPUT_DATA_BUFFER pOutputSamples[SAMPLES_PER_BUFFER];
DWORD *pdwStatus = NULL;
// There are different allocation models, find out which one is required here.
MFT_OUTPUT_STREAM_INFO streamInfo = { 0,0,0 };
MFT_OUTPUT_STREAM_INFO *pStreamInfo = &streamInfo;
hr = pDecoder->lpVtbl->GetOutputStreamInfo(pDecoder, dwOutputStreamID, pStreamInfo);
if (FAILED(hr)) { ... }
if (pStreamInfo->dwFlags == MFT_OUTPUT_STREAM_PROVIDES_SAMPLES) { ... }
else if (pStreamInfo->dwFlags == MFT_OUTPUT_STREAM_CAN_PROVIDE_SAMPLES) { ... }
else {
// default, the client must allocate the output samples for the stream
IMFSample *pOutSample = NULL;
DWORD minimumSizeOfBuffer = pStreamInfo->cbSize;
IMFMediaBuffer *pBuffer = NULL;
// CreateMediaSample is explained further down.
hr = CreateMediaSample(minimumSizeOfBuffer, sampleDuration, &pBuffer, &pOutSample);
if (FAILED(hr)) {
BGLOG_ERROR("error");
}
pOutputSamples[0].pSample = pOutSample;
}
// since GetStreamIDs returns E_NOTIMPL, dwStreamID does not matter,
// but it is recommended that it is set to the array index, 0 in this case.
// dwOutputStreamID will be 0 when E_NOTIMPL is returned by GetStreamIDs
pOutputSamples[0].dwStreamID = dwOutputStreamID; // = 0
pOutputSamples[0].dwStatus = 0;
pOutputSamples[0].pEvents = NULL; // have tried init this myself, but MFT_OUTPUT_DATA_BUFFER documentation says not to.
hr = pDecoder->lpVtbl->ProcessOutput(pDecoder, dwFlags, outputStreamCount, pOutputSamples, pdwStatus);
if (FAILED(hr)) {
// here E_INVALIDARG is found.
}
CreateMediaSample that is used in the code is derived from an example from the official documentation but modified to call SetSampleDuration and SetSampleTime. I get the same error by not setting those two though so it should be something else causing the problem.
Some of the actual data that was sent to ProcessOutput
In case I might have missed something which is easy to see from the actual data:
hr = pDecoder->lpVtbl->ProcessOutput(
pDecoder, // my decoder
dwFlags, // 0
outputStreamCount, // 1 (from GetStreamCount)
pOutputSamples, // see comment below
pdwStatus // NULL
);
// pOutputSamples[0] holds this struct:
// dwStreamID = 0,
// pSample = SampleDefinedBelow
// dwStatus = 0,
// pEvents = NULL
// SampleDefinedBelow:
// time = 0
// duration = 0.9523..
// buffer = with max length set correctly
// attributes[] = NULL
Question
Does anyone have any ideas on what I am doing wrong or how I could debug this further?
ProcessOutput needs a valid pointer as the last argument, so this does not work:
DWORD *pdwStatus = NULL;
pDecoder->lpVtbl->ProcessOutput(..., pdwStatus);
This is okay:
DWORD dwStatus;
pDecoder->lpVtbl->ProcessOutput(..., &dwStatus);
Regarding the further E_FAIL: your findings above look good in general. I do not see anything obviously wrong, and the error code does not suggest that the problem is with the MFT data flow. Perhaps the data is bad, or it does not match the media types that were set.
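One more detail in the question's code that may be worth checking: the dwFlags member of MFT_OUTPUT_STREAM_INFO is a bitwise OR of flags, so comparing it with == only matches when exactly one flag is set. A minimal portable sketch of testing individual bits instead is shown here. The SKETCH_ constants and classify_output_allocation are stand-ins invented for this example; real code should include mftransform.h and test MFT_OUTPUT_STREAM_PROVIDES_SAMPLES and MFT_OUTPUT_STREAM_CAN_PROVIDE_SAMPLES directly.

```c
#include <assert.h>

/* Stand-in flag bits for this sketch only (assumption: distinct single bits,
 * as flag enumerations normally are). Use the SDK constants in real code. */
enum {
    SKETCH_WHOLE_SAMPLES       = 0x001,
    SKETCH_PROVIDES_SAMPLES    = 0x100,
    SKETCH_CAN_PROVIDE_SAMPLES = 0x200
};

typedef enum { ALLOC_BY_MFT, ALLOC_EITHER, ALLOC_BY_CLIENT } alloc_model;

/* Hypothetical helper: picks the allocation model from a flags word by
 * testing bits, so unrelated flags in the same word do not matter. */
static alloc_model classify_output_allocation(unsigned long flags)
{
    if (flags & SKETCH_PROVIDES_SAMPLES)
        return ALLOC_BY_MFT;      /* the MFT allocates the output samples */
    if (flags & SKETCH_CAN_PROVIDE_SAMPLES)
        return ALLOC_EITHER;      /* either side may allocate */
    return ALLOC_BY_CLIENT;       /* default: the client allocates */
}

static void demo_classify(void)
{
    /* An extra unrelated flag must not change the result. */
    assert(classify_output_allocation(SKETCH_PROVIDES_SAMPLES | SKETCH_WHOLE_SAMPLES)
           == ALLOC_BY_MFT);
    assert(classify_output_allocation(0) == ALLOC_BY_CLIENT);
}
```

With an == comparison, the first case in demo_classify would fall through to the client-allocates branch.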

SetupDiGetDeviceRegistryProperty: "The data area passed to a system call is too small" error

I have code that enumerates USB devices on Windows XP using SetupAPI:
HDEVINFO hDevInfo = SetupDiGetClassDevs( &_DEVINTERFACE_USB_DEVICE, 0, 0, DIGCF_DEVICEINTERFACE | DIGCF_PRESENT);
for (DWORD i = 0; ; ++i)
{
SP_DEVINFO_DATA devInfo;
devInfo.cbSize = sizeof(SP_DEVINFO_DATA);
BOOL succ = SetupDiEnumDeviceInfo(hDevInfo, i, &devInfo);
if (GetLastError() == ERROR_NO_MORE_ITEMS)
break;
if (!succ) continue;
DWORD devClassPropRequiredSize = 0;
succ = SetupDiGetDeviceRegistryProperty(hDevInfo, &devInfo, SPDRP_COMPATIBLEIDS, NULL, NULL, 0, &devClassPropRequiredSize);
if (!succ)
{
// This shouldn't happen!
continue;
}
}
It used to work for years, but now I get FALSE from SetupDiGetDeviceRegistryProperty, last error is "The data area passed to a system call is too small".
It seems that my call parameters correspond to the documentation for this function: http://msdn.microsoft.com/en-us/library/windows/hardware/ff551967(v=vs.85).aspx
Any ideas what's wrong?
The problem was in your original code: SetupDiGetDeviceRegistryProperty may return FALSE (and set the last error to ERROR_INSUFFICIENT_BUFFER) when the required property doesn't exist, or when its data is not valid (yes, they were too lazy to pick a proper error code), so you should always check for ERROR_INSUFFICIENT_BUFFER as a (not so) special case:
DWORD devClassPropRequiredSize = 0;
succ = SetupDiGetDeviceRegistryProperty(
    hDevInfo,
    &devInfo,
    SPDRP_COMPATIBLEIDS,
    NULL,
    NULL,
    0,
    &devClassPropRequiredSize);
if (!succ) {
    if (ERROR_INSUFFICIENT_BUFFER == GetLastError()) {
        // I may ignore this property or I may simply
        // go on, required size has been set in devClassPropRequiredSize
        // so next call should work as expected (or fail in a managed way).
    } else {
        continue; // Cannot read property size
    }
}
Usually you may simply ignore this error when you're reading the property size (if devClassPropRequiredSize is still zero, you can default it to a suitable constant for the maximum allowed length). If the property can't be read, the next call to SetupDiGetDeviceRegistryProperty will fail (and you'll handle the error there), but often you will be able to read the value and your code will work smoothly.
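The decision described above can be condensed into one small function. This is a portable sketch rather than SetupAPI code: buffer_size_after_query and its parameters are hypothetical, and SKETCH_ERROR_INSUFFICIENT_BUFFER is a stand-in carrying the value that winerror.h gives ERROR_INSUFFICIENT_BUFFER (122).

```c
#include <assert.h>

/* Stand-in for ERROR_INSUFFICIENT_BUFFER so the sketch builds anywhere. */
#define SKETCH_ERROR_INSUFFICIENT_BUFFER 122UL

/*
 * Hypothetical helper: given the outcome of the size query (the call made
 * with a NULL buffer), returns the buffer size to use for the second call,
 * or 0 if the device should be skipped.
 *   succeeded     - nonzero if the size query returned TRUE
 *   last_error    - GetLastError() value captured right after a FALSE return
 *   required_size - size reported by the query (may still be 0)
 *   fallback_size - size to use when no size was reported
 */
static unsigned long buffer_size_after_query(int succeeded,
                                             unsigned long last_error,
                                             unsigned long required_size,
                                             unsigned long fallback_size)
{
    assert(fallback_size > 0);

    if (!succeeded && last_error != SKETCH_ERROR_INSUFFICIENT_BUFFER)
        return 0;                 /* genuine failure: cannot read the size */

    /* "Insufficient buffer" is the expected outcome of a size query. */
    return required_size != 0 ? required_size : fallback_size;
}
```

The caller would allocate that many bytes and call SetupDiGetDeviceRegistryProperty again, handling any failure of the second call there.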

Using VirtualQueryEx to enumerate modules at remote process doesn't return all modules

I am trying to get a list of the DLLs that a given process is using, and I am trying to achieve that through VirtualQueryEx. My problem is that it returns just a partial list of DLLs and not all of them (I can see the full list using Process Explorer or by using VirtualQuery on the given process).
Here's the code:
char szBuf[MAX_PATH * 100] = { 0 };
PBYTE pb = NULL;
MEMORY_BASIC_INFORMATION mbi;
HANDLE h_process = OpenProcess(PROCESS_QUERY_INFORMATION, FALSE, iPID);
while (VirtualQueryEx(h_process, pb, &mbi, sizeof(mbi)) == sizeof(mbi)) {
int nLen;
char szModName[MAX_PATH];
if (mbi.State == MEM_FREE)
mbi.AllocationBase = mbi.BaseAddress;
if ((mbi.AllocationBase == hInstDll) ||
(mbi.AllocationBase != mbi.BaseAddress) ||
(mbi.AllocationBase == NULL)) {
// Do not add the module name to the list
// if any of the following is true:
// 1. If this region contains this DLL
// 2. If this block is NOT the beginning of a region
// 3. If the address is NULL
nLen = 0;
} else {
nLen = GetModuleFileNameA((HINSTANCE) mbi.AllocationBase,
szModName, _countof(szModName));
}
if (nLen > 0) {
wsprintfA(strchr(szBuf, 0), "\n%p-%s",
mbi.AllocationBase, szModName);
}
pb += mbi.RegionSize;
}
I am getting the result in szBuf.
This function is part of a DLL file, which makes it harder for me to debug.
Right now the DLL is compiled as an x64 binary and I am using it against x64 processes.
P.S. I know about EnumProcessModules and I am avoiding it for a reason (too long to explain).
GetModuleFileName() only gives you the names of modules loaded in your own process, not in the other process. It will give you a few hits by accident: the Windows operating system DLLs get loaded at the same address in both processes and thus have the same module handle value.
You will need to use GetModuleFileNameEx() so you can pass the process handle.
Do note the fundamental flaw in your code as posted: you are not doing anything to ensure that you can safely use VirtualQueryEx() on another process. That requires suspending all of its threads so that it cannot allocate memory while you are iterating over it, the kind of thing a debugger does. The same is required for EnumProcessModules. The failure mode is nasty: it is random, and it can easily get your loop stuck, iterating over the same addresses again and again. This is why the CreateToolhelp32Snapshot() function exists, emphasis on "snapshot".

Need to write algorithm in state-machine style, but it becomes very hard to read

I work on firmware for an embedded device (written in C), and I need to take a screenshot from the display and save it as a bmp file. I am currently working on the module that generates the bmp file data. The easiest way to do that is to write a function that takes the following arguments:
(for simplicity, only images with indexed colors are supported in my example)
color_depth
image size (width, height)
pointer to function to get palette color for color_index (i)
pointer to function to get color_index of the pixel with given coords (x, y)
pointer to function to write image data
The user of this function would then call it like this:
/*
* Assume we have the following functions:
* int_least32_t palette_color_get (int color_index);
* int pix_color_idx_get (int x, int y);
* void data_write (const char *p_data, size_t len);
*/
bmp_file_generate(
1, //-- color_depth
x, y, //-- size
palette_color_get,
pix_color_idx_get,
data_write
);
And that's it: this function does the whole job and returns only when the job is done (i.e. the bmp file has been generated and "written" by the user-supplied callback data_write()).
BUT I need the bmp_writer module to be usable in a cooperative RTOS, and data_write() might be a function that actually transmits data via some protocol (say, UART) to another device, so it must be called only from Task context. The approach above doesn't work then; I need to make it OO-style, and its usage should look like this:
/*
* create instance of bmp_writer with needed params
* (we don't need "data_write" pointer anymore)
*/
T_BmpWriter *p_bmp_writer = new_bmp_writer(
1, //-- color_depth
x, y, //-- size
palette_color_get,
pix_color_idx_get
);
/*
* Now, byte-by-byte get all the data!
*/
while (bmp_writer__data_available(p_bmp_writer) > 0){
char cur_char = bmp_writer__get_next_char(p_bmp_writer);
//-- do something useful with current byte (i.e. cur_char).
// maybe transmit to another device, or save to flash, or anything.
}
/*
* Done! Free memory now.
*/
delete_bmp_writer(p_bmp_writer);
As you can see, the user can call bmp_writer__get_next_char(p_bmp_writer) whenever needed and handle the received data however they want.
Actually, I have already implemented this, but with that approach the whole algorithm gets turned inside out, and the code is extremely unreadable.
I'll show you the part of the old code that generates palette data (from the function that does the whole job and returns only when the job is done), and the corresponding part of the new code (in state-machine style).
Old code:
void bmp_file_generate(/*....args....*/)
{
//-- ... write headers
//-- write palette (if needed)
if (palette_colors_cnt > 0){
size_t i;
int_least32_t cur_color;
for (i = 0; i < palette_colors_cnt; i++){
cur_color = callback_palette_color_get(i);
callback_data_write((const char *)&cur_color, sizeof(cur_color));
}
}
//-- ...... write image data ..........
}
As you can see, the code is very short and easy to read.
Now, the new code.
It looks like a state machine because it is actually split into stages (HEADER_WRITE, PALETTE_WRITE, IMG_DATA_WRITE), and each stage has its own context. In the old code, the context was kept in local variables, but now we need to put it in a structure and allocate it on the heap.
So:
/*
* Palette stage context
*/
typedef struct {
size_t i;
size_t cur_color_idx;
int_least32_t cur_color;
} T_StageContext_Palette;
/*
* Function that switches stage.
* T_BmpWriter is an object context, and pointer *me is analogue of "this" in OO-languages.
* bool_start is 1 if stage is just started, and 0 if it is finished.
*/
static void _stage_start_end(T_BmpWriter *me, U08 bool_start)
{
switch (me->stage){
//-- ...........other stages.........
case BMP_WR_STAGE__PALETTE:
if (bool_start){
//-- palette stage is just started. Allocate stage context and initialize it.
me->p_stage_context = malloc(sizeof(T_StageContext_Palette));
memset(me->p_stage_context, 0x00, sizeof(T_StageContext_Palette));
//-- we need to get first color, so, set index of byte in cur_color to maximum
((T_StageContext_Palette *)me->p_stage_context)->i = sizeof(int_least32_t);
} else {
free(me->p_stage_context);
me->p_stage_context = NULL;
}
break;
//-- ...........other stages.........
}
}
/*
* Function that turns to the next stage
*/
static void _next_stage(T_BmpWriter *me)
{
_stage_start_end(me, 0);
me->stage++;
_stage_start_end(me, 1);
}
/*
* Function that actually does the job and returns next byte
*/
U08 bmp_writer__get_next_char(T_BmpWriter *me)
{
U08 ret = 0; //-- resulting byte to return
U08 bool_ready = 0; //-- flag if byte is ready
while (!bool_ready){
switch (me->stage){
//-- ...........other stages.........
case BMP_WR_STAGE__PALETTE:
{
T_StageContext_Palette *p_stage_context =
(T_StageContext_Palette *)me->p_stage_context;
if (p_stage_context->i < sizeof(int_least32_t)){
//-- return byte of cur_color
ret = *( (U08 *)&p_stage_context->cur_color + p_stage_context->i );
p_stage_context->i++;
bool_ready = 1;
} else {
//-- need to get next color (or even go to next stage)
if (p_stage_context->cur_color_idx < me->bmp_details.palette_colors_cnt){
//-- next color
p_stage_context->cur_color = me->callback.p_palette_color_get(
me->callback.user_data,
p_stage_context->cur_color_idx
);
p_stage_context->cur_color_idx++;
p_stage_context->i = 0;
} else {
//-- next stage!
_next_stage(me);
}
}
}
break;
//-- ...........other stages.........
}
}
return ret;
}
That is a huge amount of code, and it is very hard to understand!
But I really have no idea how to do it differently and still be able to get the data byte by byte.
Does anyone know how to achieve this and keep the code readable?
Any help is appreciated.
You can try protothread, which is useful for transforming a state-machine-based program into a thread-style program. I'm not 100% sure that it can solve your problem elegantly, but you can give it a try. The paper is a good starting point: Protothreads: simplifying event-driven programming of memory-constrained embedded systems
Here is its source code: http://code.google.com/p/protothread/
By the way, protothread is also used in the Contiki embedded OS to implement its processes.
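To give a feel for the idea, here is a minimal sketch of the palette stage written with switch-based "local continuations", the trick that protothreads build on. It does not use the protothread library itself: the GEN_ macros, palette_gen, palette_next_byte and demo_palette are all names made up for this example. It also assumes that bytes are emitted least significant first, whereas the question's code copies them in the memory order of int_least32_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the question's palette_color_get() callback. */
typedef uint32_t (*palette_fn)(size_t color_index);

/* Everything that must survive between calls lives here, not in locals. */
typedef struct {
    int        resume_line;   /* 0 = not started, -1 = finished */
    size_t     color_idx;
    size_t     byte_idx;
    uint32_t   cur_color;
    size_t     color_cnt;
    palette_fn get_color;
} palette_gen;

/* Local-continuation macros: GEN_YIELD records the current source line and
 * returns; the next call jumps straight back to that line via the switch. */
#define GEN_BEGIN(g)    switch ((g)->resume_line) { case 0:
#define GEN_YIELD(g, v) do { (g)->resume_line = __LINE__; return (v); \
                             case __LINE__:; } while (0)
#define GEN_END(g)      } (g)->resume_line = -1; return -1

/* Returns the next palette byte (0..255), or -1 once the palette is done. */
static int palette_next_byte(palette_gen *g)
{
    assert(g != NULL && (g->color_cnt == 0 || g->get_color != NULL));

    GEN_BEGIN(g);
    for (g->color_idx = 0; g->color_idx < g->color_cnt; g->color_idx++) {
        g->cur_color = g->get_color(g->color_idx);
        for (g->byte_idx = 0; g->byte_idx < sizeof g->cur_color; g->byte_idx++) {
            GEN_YIELD(g, (int)((g->cur_color >> (8 * g->byte_idx)) & 0xFF));
        }
    }
    GEN_END(g);
}

/* Demo callback: color 0 is black, every other color is white. */
static uint32_t demo_palette(size_t color_index)
{
    return color_index == 0 ? 0x00000000u : 0x00FFFFFFu;
}
```

A caller zero-initializes a palette_gen, sets color_cnt and get_color, and calls palette_next_byte() until it returns -1. The body reads almost like the original nested loop; the price is that all loop state has to live in the struct, and a switch statement cannot be used around a GEN_YIELD.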

SetProp problem

Can anybody tell me why the following code doesn't work? I don't get any compiler errors.
short value = 10;
SetProp(hCtl, "value", (short*) value);
The third parameter is typed as a HANDLE, so IMO to meet the explicit contract of the function you should save the property as a HANDLE by allocating a HGLOBAL memory block. However, as noted in the comments below, MSDN states that any value can be specified, and indeed when I try it on Windows 7 using...
SetProp(hWnd, _T("TestProp"), (HANDLE)(10)); // or (HANDLE)(short*)(10)
...
(short)GetProp(hWnd, _T("TestProp"));
... I get back 10 from GetProp. I suspect somewhere between your SetProp and GetProp one of two things happens: (1) the value of hWnd is different -- you're checking a different window or (2) a timing issue -- the property hasn't been set yet or had been removed.
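A portable sketch of why the direct cast round-trips: the value itself, not an address, is packed into the pointer-sized argument and unpacked again. sketch_handle is a stand-in for HANDLE, and the helper names are hypothetical. Converting integers to pointers and back is implementation-defined in C, although it behaves as shown on mainstream platforms.

```c
#include <assert.h>
#include <stdint.h>

typedef void *sketch_handle;   /* stand-in for the Windows HANDLE type */

/* Pack a small integer into a pointer-sized value (what the cast in
 * SetProp(hWnd, name, (HANDLE)10) amounts to). */
static sketch_handle handle_from_short(short value)
{
    return (sketch_handle)(intptr_t)value;
}

/* Unpack it again (what (short)GetProp(...) amounts to). */
static short short_from_handle(sketch_handle handle)
{
    return (short)(intptr_t)handle;
}

static void demo_roundtrip(void)
{
    assert(short_from_handle(handle_from_short(10)) == 10);
    assert(short_from_handle(handle_from_short(-3)) == -3);
}
```

One caveat with this approach: GetProp returns NULL when the property does not exist, so a stored value of 0 cannot be told apart from a missing property.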
If you wanted to use an HGLOBAL instead to follow the specific types of the function signature, you can follow this example in MSDN.
Even though a HANDLE is just a pointer, it's a specific data type that is allocated by calls into the Windows API. Lots of things have handles: icons, cursors, files, ... Unless the documentation explicitly states otherwise, to use a blob of data such as a short when the function calls for a HANDLE, you need a memory handle (an HGLOBAL).
The sample code linked above copies data as a string, but you can instead set it as another data type:
// TODO: Add error handling
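HGLOBAL hMem = GlobalAlloc(GPTR, sizeof(short));
LPVOID lpMem = GlobalLock(hMem);
if (lpMem != NULL)
{
    *((short*)lpMem) = 10;
    GlobalUnlock(hMem);
    // Attach the block to the window; release it with GlobalFree
    // once the property has been removed with RemoveProp.
    SetProp(hwnd, ..., hMem);
}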
To read it back, when you GetProp to get the HANDLE you must lock it to read the memory:
// TODO: Add error handling
short val = 0;
HGLOBAL hMem = (HGLOBAL)GetProp(hwnd, ...);
if (hMem)
{
    LPVOID lpMem = GlobalLock(hMem);
    if (lpMem)
    {
        val = *((short*)lpMem);
        GlobalUnlock(hMem); // balance the GlobalLock above
    }
}
I would create the short on the heap, so that it continues to exist, or perhaps make it global, which is perhaps what you did. Also the cast for the short address needs to be void *, or HANDLE.

Resources