I had a quick OpenCV question. Is it possible to take a vector of keypoints and convert it to a CvSeq?
Thanks in advance.
I don't know why you would want that, but it should be possible. With these functions you can do whatever you want:
I should add that the following is a mix of C and C++ (in OpenCV's API as well).
CreateSeq
CvSeq* cvCreateSeq(int seqFlags, int headerSize, int elemSize, CvMemStorage* storage)
SeqPush
char* cvSeqPush(CvSeq* seq, void* element=NULL)
Here is the code. I have not tried it yet, so please let me know whether it works or if there are errors.
vector<KeyPoint> myKeypointVector; // Your KeyPoint vector
// Do whatever you want with your vector
CvMemStorage* storage = cvCreateMemStorage(0);
// A block size of 0 selects the default (about 64K),
// but myKeypointVector.size()*(sizeof(KeyPoint)+sizeof(CvSeq)) should work.
// It may be more efficient, but be careful: there may be a seg fault
// (I am not sure about the size).
CvSeq* myKeypointSeq = cvCreateSeq(0,sizeof(CvSeq),sizeof(KeyPoint),storage);
// Create the seq in the given storage
for (size_t i=0; i<myKeypointVector.size(); i++) {
cvSeqPush(myKeypointSeq,&(myKeypointVector[i]));
// Should add a copy of the KeyPoint to the seq
}
// Use myKeypointSeq before this point: releasing the storage also frees the seq
cvClearMemStorage( storage );
cvReleaseMemStorage(&storage);
Julien,
Suppose I have a structure used for describing values stored inside a virtual memory map:
typedef struct
{
uint16_t u16ID;
uint16_t u16Offset;
uint8_t u8Size;
} MemMap_t;
const MemMap_t memoryMap[3] =
{
{
.u16ID = 0,
.u16Offset = 0,
.u8Size = 3
},
{
.u16ID = 1,
.u16Offset = 3,
.u8Size = 2
},
{
.u16ID = 2,
.u16Offset = 5,
.u8Size = 3
}
};
Each entry contains an offset for addressing the memory location and the size of the value stored there.
The offset of each following value depends on the offsets and sizes of the values before it.
In this example I set all offsets manually.
The reason why I implemented it that way is that it allows me to change the layout of the entire memory map later on,
the structure still making it possible to look up the offset and size of an entry with a certain ID.
The problem with this is that setting the offsets manually is going to get unwieldy quite quickly once the map becomes bigger
and changing the size of an entry at the beginning would require manually changing all offsets of the entries after that one.
I came up with some ways to just calculate the offsets at runtime, but as the target system this will run on is a very RAM constrained embedded system, I really want to keep the entire map as a constant.
Is there an elegant way to calculate the offsets of the map entries at compile time?
After some experiments, I found something that may work for a large number of attributes. I am posting it as a new answer, since my previous answer took a very different approach.
Consider creating a proxy structure that mirrors the layout described by the MemMap_t entries, using a series of char[] members.
static struct MemMap_v {
char t0[3] ;
char t1[2] ;
char t2[3] ;
char t3[10] ;
} vv ;
const MemMap_t memoryMap[3] =
{
{
.u16ID = 0,
.u16Offset = vv.t0 - vv.t0,
.u8Size = sizeof(vv.t0)
},
{
.u16ID = 1,
.u16Offset = vv.t1 - vv.t0,
.u8Size = sizeof(vv.t1)
},
{
.u16ID = 2,
.u16Offset = vv.t2 - vv.t0,
.u8Size = sizeof(vv.t2)
}
};
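A variation on the same proxy idea, as a minimal sketch: the difference of two addresses is not guaranteed by the C standard to be accepted in a static initializer (some compilers allow it, some do not), whereas offsetof from <stddef.h> always yields an integer constant expression. The MAP_ENTRY macro is a hypothetical helper introduced here for illustration. The sketch assumes the proxy holds only char arrays, so no padding is inserted between members.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof */
#include <stdint.h>

typedef struct
{
    uint16_t u16ID;
    uint16_t u16Offset;
    uint8_t  u8Size;
} MemMap_t;

/* Proxy layout: one char array per map entry. It is never instantiated,
 * so it costs no RAM; it only lets the compiler compute the offsets. */
struct MemMap_v
{
    char t0[3];
    char t1[2];
    char t2[3];
};

/* Hypothetical helper: builds one entry from an ID and a proxy member. */
#define MAP_ENTRY(id, member)                                        \
    { .u16ID     = (id),                                             \
      .u16Offset = (uint16_t)offsetof(struct MemMap_v, member),      \
      .u8Size    = (uint8_t)sizeof(((struct MemMap_v *)0)->member) }

const MemMap_t memoryMap[3] =
{
    MAP_ENTRY(0, t0),
    MAP_ENTRY(1, t1),
    MAP_ENTRY(2, t2)
};
```

Changing the size of t0 in the proxy then shifts every later offset automatically, and the table itself stays const.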
Is there an elegant way to calculate the offsets of the map entries at compile time?
Yes: write yourself a code generator that accepts input data describing the memory map and outputs C source for the initializer or for the whole declaration. Have the appropriate source file #include that. Structure this program so that the form of its input data is convenient for you to maintain.
If the number of map entries were bounded by a (very) small number, and if their IDs were certain to be consecutive and to correspond to their indices in the memoryMap array, then I feel pretty confident that it would be possible to write a set of preprocessor macros that did the job without a separate program. Such a preprocessor-based solution would be messy, and difficult to debug and maintain. I do not recommend this alternative.
Short answer: it is not possible to calculate the values at compile time, given this data structure.
Alternative:
Consider using symbolic constants for the sizes: E_0, E_1, E_2, ... Then you can calculate the offsets at compile time (0, E_0, E_0+E_1, ...). This is not very elegant and does not scale well to a large number of items, but it will meet the requirements.
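A minimal sketch of the symbolic-constant approach, using the sizes from the question. The OFF_n names are hypothetical and introduced only for this illustration; each offset is defined in terms of the previous offset and size, so the sums do not have to be written out by hand.

```c
#include <assert.h>
#include <stdint.h>

typedef struct
{
    uint16_t u16ID;
    uint16_t u16Offset;
    uint8_t  u8Size;
} MemMap_t;

enum
{
    /* Entry sizes: the only values maintained by hand. */
    E_0 = 3,
    E_1 = 2,
    E_2 = 3,

    /* Offsets (hypothetical names), each derived from the previous entry. */
    OFF_0 = 0,
    OFF_1 = OFF_0 + E_0,
    OFF_2 = OFF_1 + E_1
};

const MemMap_t memoryMap[3] =
{
    { .u16ID = 0, .u16Offset = OFF_0, .u8Size = E_0 },
    { .u16ID = 1, .u16Offset = OFF_1, .u8Size = E_1 },
    { .u16ID = 2, .u16Offset = OFF_2, .u8Size = E_2 }
};
```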
A second alternative is to create a function that returns a pointer to memoryMap. The function initializes the offsets on the first call, and the program calls getMemoryMap() instead of using memoryMap directly. Note that the map is then no longer const, so it lives in RAM.
static MemMap_t memoryMap[3] =
{
...
};
const MemMap_t *getMemoryMap(void) {
MemMap_t *p = memoryMap ;
static bool offsetDone ; // bool requires <stdbool.h>
if ( !offsetDone ) {
offsetDone = true ;
// Each offset is the previous entry's offset plus its size
for (size_t i=1; i<sizeof(memoryMap)/sizeof(memoryMap[0]) ; i++ ) {
p[i].u16Offset = p[i-1].u16Offset + p[i-1].u8Size ;
}
}
return p;
}
I'll try to make that question as concise as possible, but don't hesitate to ask for clarification.
I'm dealing with legacy code, and I'm trying to load thousands of 8-bit images from disk to create a texture for each.
I've tried multiple things, and I'm at the point where I load my 8-bit images into a 32-bit surface and then create a texture from that surface.
The problem: while loading an 8-bit image onto a 32-bit surface works, when I call SDL_CreateTextureFromSurface I end up with a lot of textures that are completely blank (full of transparent pixels, 0x00000000).
Not all textures are wrong, though. Each time I run the program, I get different "bad" textures. Sometimes there are more, sometimes fewer. And when I step through the program, I always end up with a correct texture (is that a timing problem?).
I know that the loading to the SDL_Surface is working, because I'm saving all the surfaces to the disk, and they're all correct. But I inspected the textures using NVidia NSight Graphics, and more than half of them are blank.
Here's the offending code :
int __cdecl IMG_SavePNG(SDL_Surface*, const char*);
SDL_Texture* Resource8bitToTexture32(SDL_Renderer* renderer, SDL_Color* palette, int paletteSize, void* dataAddress, int Width, int Height)
{
u32 uiCurrentOffset;
u32 uiSourceLinearSize = (Width * Height);
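SDL_Color *currentColor;
// Source pixels: one palette index per byte
const unsigned char* pSourceData = (const unsigned char*)dataAddress;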
char strSurfacePath[500];
// The texture we're creating
SDL_Texture* newTexture = NULL;
// Load image at specified address
SDL_Surface* tempSurface = SDL_CreateRGBSurface(0x00, Width, Height, 32, 0x00FF0000, 0x0000FF00, 0x000000FF, 0xFF000000);
SDL_SetSurfaceBlendMode(tempSurface, SDL_BLENDMODE_NONE);
if(SDL_MUSTLOCK(tempSurface))
SDL_LockSurface(tempSurface);
for(uiCurrentOffset = 0; uiCurrentOffset < uiSourceLinearSize; uiCurrentOffset++)
{
currentColor = &palette[pSourceData[uiCurrentOffset]];
if(pSourceData[uiCurrentOffset] != PC_COLOR_TRANSPARENT)
{
((u32*)tempSurface->pixels)[uiCurrentOffset] = (u32)((currentColor->a << 24) + (currentColor->r << 16) + (currentColor->g << 8) + (currentColor->b << 0));
}
}
if(SDL_MUSTLOCK(tempSurface))
SDL_UnlockSurface(tempSurface);
// Create texture from surface pixels
newTexture = SDL_CreateTextureFromSurface(renderer, tempSurface);
// Save the surface to disk for verification only
sprintf(strSurfacePath, "c:\\tmp\\surfaces\\%s.png", GenerateUniqueName());
IMG_SavePNG(tempSurface, strSurfacePath);
// Get rid of old loaded surface
SDL_FreeSurface(tempSurface);
return newTexture;
}
Note that in the original code, I'm checking for boundaries, and for NULL after the SDL_Create*. I'm also aware that it would be better to have a spritesheet for the textures instead of loading each texture individually.
EDIT :
Here's a sample of what I'm observing in NSight if I capture a frame and use the Resources View.
The first 3186 textures are correct. Then I get 43 empty textures. Then I get 228 correct textures. Then 100 bad ones. Then 539 correct ones. Then 665 bad ones. It goes on randomly like that, and it changes each time I run my program.
Again, each time the surfaces saved by IMG_SavePNG are correct. This seems to indicate that something happens when I call SDL_CreateTextureFromSurface but at that point, I don't want to rule anything out, because it's a very weird problem, and it smells undefined behaviour all over the place. But I just can't find the problem.
With the help of #mark-benningfield, I was able to find the problem.
TL;DR
There's a bug (or at least an undocumented feature) in SDL with the DX11 renderer. There's a workaround; see the end.
CONTEXT
I'm trying to load around 12,000 textures when my program starts. I know it's not a good idea, but I was planning on using that as a stepping stone to another, saner system.
DETAILS
What I realized while debugging the problem is that the SDL renderer for DirectX 11 does this when it creates a texture:
result = ID3D11Device_CreateTexture2D(rendererData->d3dDevice,
&textureDesc,
NULL,
&textureData->mainTexture
);
Microsoft's ID3D11Device::CreateTexture2D method page indicates that:
If you don't pass anything to pInitialData, the initial content of the memory for the resource is undefined. In this case, you need to write the resource content some other way before the resource is read.
If we're to believe that article :
Default Usage
The most common type of usage is default usage. To fill a default texture (one created with D3D11_USAGE_DEFAULT) you can :
[...]
After calling ID3D11Device::CreateTexture2D, use ID3D11DeviceContext::UpdateSubresource to fill the default texture with data from a pointer provided by the application.
So it looks like D3D11_CreateTexture is using the second method of the default usage to initialize a texture and its content.
But right after that, SDL calls SDL_UpdateTexture (without checking the return value; I'll get to that later). If we dig down to the D3D11 renderer, we find this:
static int
D3D11_UpdateTextureInternal(D3D11_RenderData *rendererData, ID3D11Texture2D *texture, int bpp, int x, int y, int w, int h, const void *pixels, int pitch)
{
ID3D11Texture2D *stagingTexture;
[...]
/* Create a 'staging' texture, which will be used to write to a portion of the main texture. */
ID3D11Texture2D_GetDesc(texture, &stagingTextureDesc);
[...]
result = ID3D11Device_CreateTexture2D(rendererData->d3dDevice, &stagingTextureDesc, NULL, &stagingTexture);
[...]
/* Get a write-only pointer to data in the staging texture: */
result = ID3D11DeviceContext_Map(rendererData->d3dContext, (ID3D11Resource *)stagingTexture, 0, D3D11_MAP_WRITE, 0, &textureMemory);
[...]
/* Commit the pixel buffer's changes back to the staging texture: */
ID3D11DeviceContext_Unmap(rendererData->d3dContext, (ID3D11Resource *)stagingTexture, 0);
/* Copy the staging texture's contents back to the texture: */
ID3D11DeviceContext_CopySubresourceRegion(rendererData->d3dContext, (ID3D11Resource *)texture, 0, x, y, 0, (ID3D11Resource *)stagingTexture, 0, NULL);
SAFE_RELEASE(stagingTexture);
return 0;
}
Note : code snipped for conciseness.
This seems to indicate, based on that article I mentioned, that SDL is using the second method of the Default Usage to allocate the texture memory on the GPU, but uses the Staging Usage to upload the actual pixels.
I don't know that much about DX11 programming, but that mixing up of techniques got my programmer's sense tingling.
I contacted a game programmer I know and explained the problem to him. He told me the following interesting bits :
The driver gets to decide where it's storing staging textures. It usually lies in CPU RAM.
It's much better to specify a pInitialData pointer, as the driver can decide to upload the textures asynchronously.
If you load too many staging textures without committing them to the GPU, you can fill up the RAM.
I then wondered why SDL didn't return me a "out of memory" error at the time I called SDL_CreateTextureFromSurface, and I found out why (again, snipped for concision) :
SDL_Texture *
SDL_CreateTextureFromSurface(SDL_Renderer * renderer, SDL_Surface * surface)
{
[...]
SDL_Texture *texture;
[...]
texture = SDL_CreateTexture(renderer, format, SDL_TEXTUREACCESS_STATIC,
surface->w, surface->h);
if (!texture) {
return NULL;
}
[...]
if (SDL_MUSTLOCK(surface)) {
SDL_LockSurface(surface);
SDL_UpdateTexture(texture, NULL, surface->pixels, surface->pitch);
SDL_UnlockSurface(surface);
} else {
SDL_UpdateTexture(texture, NULL, surface->pixels, surface->pitch);
}
[...]
return texture;
}
If the creation of the texture is successful, it doesn't care whether or not it succeeded in updating the textures (no check on SDL_UpdateTexture's return value).
WORKAROUND
The poor man's workaround for this problem is to call SDL_RenderPresent each time you call SDL_CreateTextureFromSurface.
It's probably fine to do it once every hundred textures depending on your texture size. But just be aware that calling SDL_CreateTextureFromSurface repeatedly without updating the renderer will actually fill up the system RAM, and the SDL won't return you any error condition to check for this.
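The "flush every N textures" pattern can be sketched independently of SDL. In the sketch below, create_in_batches, create_fn and flush_fn are hypothetical names introduced for illustration; in the real program the create callback would wrap SDL_CreateTextureFromSurface (and check its result), and the flush callback would call SDL_RenderPresent on the renderer. The demo_* stubs only count calls so that the batching logic can be exercised on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback types: create one item (return 0 on success),
 * and flush pending work (e.g. present the renderer). */
typedef int  (*create_fn)(size_t index, void *user);
typedef void (*flush_fn)(void *user);

/* Creates `count` items, flushing after every `batch` creations and once
 * more at the end if anything is still pending. Stops at the first
 * failure. Returns the number of items created successfully. */
size_t create_in_batches(size_t count, size_t batch,
                         create_fn create_one, flush_fn flush, void *user)
{
    size_t created = 0;
    size_t pending = 0;

    if (batch == 0)
        batch = 1;                      /* flush after every item */

    for (size_t i = 0; i < count; i++) {
        if (create_one(i, user) != 0)
            break;
        created++;
        if (++pending == batch) {
            flush(user);
            pending = 0;
        }
    }
    if (pending > 0)
        flush(user);
    return created;
}

/* Demo stubs that only count calls; not part of any real API. */
struct demo_counters { size_t creates; size_t flushes; };

static int demo_create(size_t index, void *user)
{
    (void)index;
    ((struct demo_counters *)user)->creates++;
    return 0;
}

static void demo_flush(void *user)
{
    ((struct demo_counters *)user)->flushes++;
}
```

The right batch size depends on texture size and available memory, so it is best treated as a tunable rather than a fixed constant.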
The irony of this is that had I implemented a "correct" loading loop with the percentage of completion on screen, I would never have had that problem. But fate had me implement this the quick-and-dirty way, as a proof of concept for a bigger system, and I got sucked into that problem.
Recently, I tried to write mexfunctions using structure variables.
I watched the tutorial but got confused because of how the variable values are passed.
The following example (mexfunction_using_ex_wrong.m & mexfunction_using_ex_wrong.cpp) demonstrates how to fetch the variables passed from matlab in mexfunction.
However, in this case, the result is:
address i_c1=2067094464 i_c2=2067094464
i_c1=10 i_c2=10
address i_c1=1327990656 i_c2=2067100736
i_c1=2 i_c2=20
address i_c1=2067101056 i_c2=2067063424
i_c1=3 i_c2=30
As can be seen, the 1st element of the c1 & c2 array of a structure variable is accidentally the same.
But, in another example (mexfunction_using_ex_correct.m & mexfunction_using_ex_correct.cpp), the elements of array 1 (b1) and array 2(b2) of a structure variable are unrelated as I expect.
The result is:
address i_b1=1978456576 i_b2=1326968576
i_b1=1 i_b2=10
address i_b1=1978456584 i_b2=1326968584
i_b1=2 i_b2=20
address i_b1=1978456592 i_b2=1326968592
i_b1=3 i_b2=30
However, it's more common to use the structure of the 1st example in programming, so could anybody explain why, in the 1st example, the addresses of i_c1 and i_c2 are the same?
The following code is mexfunction_using_ex_wrong.m
clc
clear all
close all
mex mexfunction_using_ex_c_wrong.cpp;
a.b(1).c1=double(1);
a.b(2).c1=double(2);
a.b(3).c1=double(3);
a.b(1).c2=double(1);
a.b(2).c2=double(2);
a.b(3).c2=double(3);
mexfunction_using_ex_c_wrong(a);
The following code is mexfunction_using_ex_c_wrong.cpp
#include "mex.h"
void mexFunction(int nlhs,mxArray *plhs[],int nrhs,const mxArray *prhs[])
{
int i, j, k;
double *i_c1;
double *i_c2;
// for struct variables(pointers) inside fcwcontext
mxArray *mx_b, *mx_c1, *mx_c2;
mx_b=mxGetField(prhs[0], 0, "b");
for(i = 0;i < 3;i=i+1)
{
mx_c1=mxGetField(mx_b, i, "c1");
mx_c2=mxGetField(mx_b, i, "c2");
i_c1=mxGetPr(mx_c1);
i_c2=mxGetPr(mx_c2);
*i_c2=(*i_c2)*10;
printf("address i_c1=%d i_c2=%d\n", i_c1, i_c2);
printf(" i_c1=%g i_c2=%g\n", *i_c1, *i_c2);
}
}
The following code is mexfunction_using_ex_c_correct.m
clc
clear all
close all
mex mexfunction_using_ex_correct.cpp;
a.b1(1)=double(1);
a.b1(2)=double(2);
a.b1(3)=double(3);
a.b2(1)=double(1);
a.b2(2)=double(2);
a.b2(3)=double(3);
mexfunction_using_ex_correct(a);
The following code is mexfunction_using_ex_c_correct.cpp
#include "mex.h"
void mexFunction(int nlhs,mxArray *plhs[],int nrhs,const mxArray *prhs[])
{
int i, j, k;
double *i_b1;
double *i_b2;
mxArray *mx_b1, *mx_b2;
mx_b1=mxGetField(prhs[0], 0, "b1");
mx_b2=mxGetField(prhs[0], 0, "b2");
for(i = 0;i < 3;i=i+1)
{
i_b1=mxGetPr(mx_b1);
i_b2=mxGetPr(mx_b2);
i_b2[i]=i_b2[i]*10;
printf("address i_b1=%d i_b2=%d\n", &i_b1[i], &i_b2[i]);
printf(" i_b1=%g i_b2=%g\n", i_b1[i], i_b2[i]);
}
}
The addresses are not "accidentally the same" - they're intentionally the same, due to MATLAB's internal copy-on-write optimisations. If you look at the MEX documentation, you'll see warnings scattered around...
Do not modify any prhs values in your MEX-file. Changing the data in these read-only mxArrays can produce undesired side effects.
...in various forms...
Note Inputs to a MEX-file are constant read-only mxArrays. Do not modify the inputs. Using mxSetCell* or mxSetField* functions to modify the cells or fields of a MATLAB® argument causes unpredictable results.
...trying to make it very clear that you should absolutely not modify anything you receive as an input. By calling mxGetPr() on input data and writing back to that pointer as you do with i_b2 and i_c2, you're getting right into that "unpredictable results" territory - if you look at a.b(1).c1 in the MATLAB workspace after the call, it'll really be 10 even though you "only" changed c2.
From MEX, you're looking at the raw data storage without any knowledge of, or access to, MATLAB's internal housekeeping, so the only safe way to modify anything is to use the mxCreate* or mxDuplicate* functions to get your own safe arrays you can then do whatever you want with, and pass back to MATLAB via plhs.
That said, I will admit to having abused in-place modification for a significant performance gain in one instance where I could guarantee my data was unique and unshared, but it's at best unsupported and at worst downright perilous.
I would like to be able to use my own memory allocation function for certain data structures (real valued vectors and arrays) in R. The reason for this is that I need my data to be 64bit aligned and I would like to use the numa library for having control over which memory node is used (I'm working on compute nodes with four 12-core AMD Opteron 6174 CPUs).
Now I have two functions for allocating and freeing memory: numa_alloc_onnode and numa_free (courtesy of this thread). I'm using R version 3.1.1, so I have access to the function allocVector3 (src/main/memory.c), which seems to me to be the intended way of adding a custom memory allocator. I also found the struct R_allocator in src/include/R_ext.
However it is not clear to me how to put these pieces together. Let's say, in R, I want the result res of an evaluation such as
res <- Y - mean(Y)
to be saved in a memory area allocated with my own function, how would I do this? Can I integrate allocVector3 directly at the R level? I assume I have to go through the R-C interface. As far as I know, I cannot just return a pointer to the allocated area, but have to pass the result as an argument. So in R I call something like
n <- length(Y)
res <- numeric(length=1)
.Call("R_allocate_using_myalloc", n, res)
res <- Y - mean(Y)
and in C
#include <R.h>
#include <Rinternals.h>
#include <numa.h>
SEXP R_allocate_using_myalloc(SEXP R_n, SEXP R_res){
PROTECT(R_n = coerceVector(R_n, INTSXP));
PROTECT(R_res = coerceVector(R_res, REALSXP));
int *restrict n = INTEGER(R_n);
R_allocator_t myAllocator;
myAllocator.mem_alloc = numa_alloc_onnode;
myAllocator.mem_free = numa_free;
myAllocator.res = NULL;
myAllocator.data = ???;
R_res = allocVector3(REALSXP, n, myAllocator);
UNPROTECT(2);
}
Unfortunately I cannot get beyond a "variable has incomplete type 'R_allocator_t'" compilation error (I had to remove the .data line since I have no clue as to what I should put there). Does any of the above code make sense? Is there an easier way of achieving what I want? It seems a bit odd to have to allocate a small vector in R and then change its location in C just to be able to both control the memory allocation and have the vector available in R...
I'm trying to avoid using Rcpp, as I'm modifying a fairly large package and do not want to convert all C calls and thought that mixing different C interfaces could perform sub-optimally.
Any help is greatly appreciated.
I made some progress in solving my problem and I would like to share in case anyone else encounters a similar situation. Thanks to Kevin for his comment. I was missing the include statement he mentions. Unfortunately this was only one among many problems.
dyn.load("myAlloc.so")
size <- 3e9
myBigmat <- .Call("myAllocC", size)
print(object.size(myBigmat), units = "auto")
rm(myBigmat)
#include <R.h>
#include <Rinternals.h>
#include <R_ext/Rallocators.h>
#include <numa.h>
typedef struct allocator_data {
size_t size;
} allocator_data;
void* my_alloc(R_allocator_t *allocator, size_t size) {
((allocator_data*)allocator->data)->size = size;
return (void*) numa_alloc_local(size);
}
void my_free(R_allocator_t *allocator, void * addr) {
size_t size = ((allocator_data*)allocator->data)->size;
numa_free(addr, size);
}
SEXP myAllocC(SEXP a) {
allocator_data* my_allocator_data = malloc(sizeof(allocator_data));
my_allocator_data->size = 0;
R_allocator_t* my_allocator = malloc(sizeof(R_allocator_t));
my_allocator->mem_alloc = &my_alloc;
my_allocator->mem_free = &my_free;
my_allocator->res = NULL;
my_allocator->data = my_allocator_data;
R_xlen_t n = asReal(a);
SEXP result = PROTECT(allocVector3(REALSXP, n, my_allocator));
UNPROTECT(1);
return result;
}
To compile the C code, I use R CMD SHLIB -std=c99 -L/usr/lib64 -lnuma myAlloc.c. As far as I can tell, this works fine. If anyone has improvements/corrections to offer, I'd be happy to include them.
One requirement from the original question that remains unresolved is the alignment issue. The block of memory returned by numa_alloc_local is correctly aligned, but other fields of the new VECTOR_SEXPREC (eg. the sxpinfo_struct header) push back the start of the data array. Is it somehow possible to align this starting point (the address returned by REAL())?
R has, in memory.c:
main/memory.c
84:#include <R_ext/Rallocators.h> /* for R_allocator_t structure */
so I think you need to include that header as well to get the custom allocator (Rinternals.h merely declares it, without defining the struct or including that header).
I work on firmware for an embedded device (written in C), and I need to take a screenshot of the display and save it as a bmp file. Currently I'm working on the module that generates the bmp file data. The easiest way to do that is to write a function that takes the following arguments:
(for simplicity, only images with indexed colors are supported in my example)
color_depth
image size (width, height)
pointer to function to get palette color for color_index (i)
pointer to function to get color_index of the pixel with given coords (x, y)
pointer to function to write image data
And then user of this function should call it like that:
/*
* Assume we have the following functions:
* int_least32_t palette_color_get (int color_index);
* int pix_color_idx_get (int x, int y);
* void data_write (const char *p_data, size_t len);
*/
bmp_file_generate(
1, //-- color_depth
x, y, //-- size
palette_color_get,
pix_color_idx_get,
data_write
);
And that's it: this function does the whole job, and returns only when the job is done (i.e. the bmp file is generated and "written" by the given user callback function data_write()).
BUT I need the bmp_writer module to be usable in a cooperative RTOS, and data_write() might be a function that actually transmits data via some protocol (say, UART) to another device, so it must be called only from Task context. The callback approach doesn't work then; I need an OO-style design, and its usage should look like this:
/*
* create instance of bmp_writer with needed params
* (we don't need "data_write" pointer anymore)
*/
T_BmpWriter *p_bmp_writer = new_bmp_writer(
1, //-- color_depth
x, y, //-- size
palette_color_get,
pix_color_idx_get
);
/*
* Now, byte-by-byte get all the data!
*/
while (bmp_writer__data_available(p_bmp_writer) > 0){
char cur_char = bmp_writer__get_next_char(p_bmp_writer);
//-- do something useful with current byte (i.e. cur_char).
// maybe transmit to another device, or save to flash, or anything.
}
/*
* Done! Free memory now.
*/
delete_bmp_writer(p_bmp_writer);
As you can see, the user can call bmp_writer__get_next_char(p_bmp_writer) whenever needed, and handle the received data as desired.
Actually I have already implemented this, but with that approach the whole algorithm gets turned inside out, and the code is extremely hard to read.
I'll show you a part of old code that generates palette data (from the function that does all the job, and returns only when job is done), and appropriate part of new code (in state-machine style).
Old code:
void bmp_file_generate(/*....args....*/)
{
//-- ... write headers
//-- write palette (if needed)
if (palette_colors_cnt > 0){
size_t i;
int_least32_t cur_color;
for (i = 0; i < palette_colors_cnt; i++){
cur_color = callback_palette_color_get(i);
callback_data_write((const char *)&cur_color, sizeof(cur_color));
}
}
//-- ...... write image data ..........
}
As you can see, this is very short and easily readable code.
Now, new code.
It looks like a state machine, because it is actually split into stages (HEADER_WRITE, PALETTE_WRITE, IMG_DATA_WRITE), and each stage has its own context. In the old code, the context was kept in local variables, but now we need to put it in a structure and allocate it from the heap.
So:
/*
* Palette stage context
*/
typedef struct {
size_t i;
size_t cur_color_idx;
int_least32_t cur_color;
} T_StageContext_Palette;
/*
* Function that switches stage.
* T_BmpWriter is an object context, and pointer *me is analogue of "this" in OO-languages.
* bool_start is 1 if stage is just started, and 0 if it is finished.
*/
static void _stage_start_end(T_BmpWriter *me, U08 bool_start)
{
switch (me->stage){
//-- ...........other stages.........
case BMP_WR_STAGE__PALETTE:
if (bool_start){
//-- palette stage is just started. Allocate stage context and initialize it.
me->p_stage_context = malloc(sizeof(T_StageContext_Palette));
memset(me->p_stage_context, 0x00, sizeof(T_StageContext_Palette));
//-- we need to get first color, so, set index of byte in cur_color to maximum
((T_StageContext_Palette *)me->p_stage_context)->i = sizeof(int_least32_t);
} else {
free(me->p_stage_context);
me->p_stage_context = NULL;
}
break;
//-- ...........other stages.........
}
}
/*
* Function that turns to the next stage
*/
static void _next_stage(T_BmpWriter *me)
{
_stage_start_end(me, 0);
me->stage++;
_stage_start_end(me, 1);
}
/*
* Function that actually does the job and returns next byte
*/
U08 bmp_writer__get_next_char(T_BmpWriter *me)
{
U08 ret = 0; //-- resulting byte to return
U08 bool_ready = 0; //-- flag if byte is ready
while (!bool_ready){
switch (me->stage){
//-- ...........other stages.........
case BMP_WR_STAGE__PALETTE:
{
T_StageContext_Palette *p_stage_context =
(T_StageContext_Palette *)me->p_stage_context;
if (p_stage_context->i < sizeof(int_least32_t)){
//-- return byte of cur_color
ret = *( (U08 *)&p_stage_context->cur_color + p_stage_context->i );
p_stage_context->i++;
bool_ready = 1;
} else {
//-- need to get next color (or even go to next stage)
if (p_stage_context->cur_color_idx < me->bmp_details.palette_colors_cnt){
//-- next color
p_stage_context->cur_color = me->callback.p_palette_color_get(
me->callback.user_data,
p_stage_context->cur_color_idx
);
p_stage_context->cur_color_idx++;
p_stage_context->i = 0;
} else {
//-- next stage!
_next_stage(me);
}
}
}
break;
//-- ...........other stages.........
}
}
return ret;
}
That is a lot of code, and it's very hard to understand!
But I really have no idea how to do it differently and still be able to get the data byte by byte.
Does anyone know how to achieve this, and keep code readability?
Any help is appreciated.
You can try protothreads, which are useful for turning a state-machine-based program into a thread-style program. I'm not 100% sure they can solve your problem elegantly, but you can give them a try. The paper is a good starting point: Protothreads: simplifying event-driven programming of memory-constrained embedded systems
Here is its source code: http://code.google.com/p/protothread/
By the way, protothread is also used in the Contiki embedded OS, for implementing process in Contiki.
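To give an idea of what the thread-style version of the palette stage might look like, here is a minimal sketch. It does not use the protothread library itself: the CO_BEGIN / CO_YIELD / CO_END macros are hand-rolled, hypothetical stand-ins built on the same switch-based trick, and PaletteWriter, palette_writer_next and demo_palette_color_get are illustrative names, not part of the question's code. Because local variables do not survive a yield, the loop counters live in the context struct. Unlike the question's code, the bytes are emitted explicitly in little-endian order rather than in memory order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hand-rolled switch-based coroutine macros (not the protothread API).
 * CO_YIELD must be written on a single source line. */
#define CO_BEGIN(s)    switch ((s)->line) { case 0:
#define CO_YIELD(s, v) do { (s)->line = __LINE__; return (v); case __LINE__:; } while (0)
#define CO_END(s)      } (s)->line = -1; return -1

typedef struct {
    int      line;          /* resume point: 0 = not started, -1 = finished */
    size_t   color_idx;     /* loop state kept here, not in locals */
    size_t   byte_idx;
    uint32_t cur_color;
    size_t   palette_cnt;
    uint32_t (*palette_color_get)(size_t idx);
} PaletteWriter;

/* Returns the next palette byte (0..255), or -1 once all bytes are out.
 * The body reads like the original "write everything" loop. */
int palette_writer_next(PaletteWriter *w)
{
    CO_BEGIN(w);
    for (w->color_idx = 0; w->color_idx < w->palette_cnt; w->color_idx++) {
        w->cur_color = w->palette_color_get(w->color_idx);
        for (w->byte_idx = 0; w->byte_idx < 4; w->byte_idx++) {
            CO_YIELD(w, (int)((w->cur_color >> (8 * w->byte_idx)) & 0xFFu));
        }
    }
    CO_END(w);
}

/* Demo palette with two colors, used only to exercise the sketch. */
static uint32_t demo_palette_color_get(size_t idx)
{
    return (idx == 0) ? 0x04030201u : 0x08070605u;
}
```

A caller zero-initializes the line field, sets palette_cnt and the callback, and then calls palette_writer_next() from Task context until it returns -1. The other stages could be written as further loops between CO_BEGIN and CO_END, so the stage switching disappears into ordinary control flow.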