Flipping Pebble Screen Issue (C)

I'm writing a Pebble Time watch app using Pebble SDK 3.0 on the basalt platform that requires text to be displayed upside down.
The logic is:
Write to the screen.
Capture the screen buffer.
Flip the screen buffer (using the flipHV routine below).
Release the buffer.
After a fair amount of experimentation I've got it working after a fashion, but the (black) text has what seem to be random vertical white lines through it (see image below), which I suspect is something to do with shifting bits.
The subroutine I'm using is:
void flipHV(GBitmap *bitMap) {
    GRect fbb = gbitmap_get_bounds(bitMap);
    int Width = 72;  // fbb.size.w;
    int Height = 84; // fbb.size.h;
    uint32_t *pBase = (uint32_t *)gbitmap_get_data(bitMap);
    uint32_t *pTopRemainingPixel = pBase;
    uint32_t *pBottomRemainingPixel = pBase + (Height * Width);
    while (pTopRemainingPixel < pBottomRemainingPixel) {
        uint32_t TopPixel = *pTopRemainingPixel;
        uint32_t BottomPixel = *pBottomRemainingPixel;
        TopPixel = (TopPixel << 16) | (TopPixel >> 16);
        *pBottomRemainingPixel = TopPixel;
        BottomPixel = (BottomPixel << 16) | (BottomPixel >> 16);
        *pTopRemainingPixel = BottomPixel;
        pTopRemainingPixel++;
        pBottomRemainingPixel--;
    }
}
and its purpose is to work through the screen buffer, swapping the first pixel with the last one, the second with the second-to-last, and so on.
Because each 32-bit word holds 2 pixels, I also need to rotate it through 16 bits.
I suspect that is where the problem lies.
Can someone have a look at my code and see what is going wrong and put me right? I should say that I'm both a C and Pebble SDK newbie, so please explain everything as if to a child!

Your assignments like
TopPixel = (TopPixel << 16) | (TopPixel >> 16);
swap the pixels pair-wise:
+--+--+ +--+--+
|ab|cd| => |cd|ab|
+--+--+ +--+--+
What you want instead is a full swap:
+--+--+ +--+--+
|ab|cd| => |dc|ba|
+--+--+ +--+--+
That can be done with even more bit-fiddling, e.g.:
TopPixel = (TopPixel << 24) |               // move d from 0..7 to 24..31
           ((TopPixel << 8) & 0x00ff0000) | // move c from 8..15 to 16..23
           ((TopPixel >> 8) & 0x0000ff00) | // move b from 16..23 to 8..15
           (TopPixel >> 24);                // move a from 24..31 to 0..7
or - way more readable(!) - by using GColor8 instead of uint32_t and a loop on a per-pixel basis:
uint8_t *data = gbitmap_get_data(bmp);
uint16_t row_size = gbitmap_get_bytes_per_row(bmp);
// only loop over half of the rows to avoid swapping twice; with an even
// number of rows (as on Pebble screens) this visits every pixel exactly once
for (int16_t y = 0; y <= max_y / 2; y++) {
    for (int16_t x = 0; x <= max_x; x++) {
        GColor8 *value_1 = (GColor8 *)(data + row_size * y + x);
        GColor8 *value_2 = (GColor8 *)(data + row_size * (max_y - y) + (max_x - x));
        // swapping the two pixel values, could be simplified with a SWAP(a,b) macro
        GColor8 tmp = *value_1;
        *value_1 = *value_2;
        *value_2 = tmp;
    }
}
Disclaimer: I haven't compiled this code. The pointer arithmetic can also be tuned if you find it to be a performance bottleneck.

It turns out that I needed to replace all of the uint32_t with uint8_t and do away with the shifting: on basalt the framebuffer holds one byte per pixel, so each 32-bit word contains four pixels, not two.
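For anyone finding this later, a minimal sketch of that fix (assuming, as on basalt, one byte per pixel and that gbitmap_get_bytes_per_row equals the width, so the buffer can be treated as one flat byte array):
void flipHV(GBitmap *bitMap) {
    GRect fbb = gbitmap_get_bounds(bitMap);
    int width = fbb.size.w;
    int height = fbb.size.h;
    uint8_t *pTop = gbitmap_get_data(bitMap);
    uint8_t *pBottom = pTop + (height * width) - 1;
    // swap single bytes (whole pixels), so no bit shifting is needed
    while (pTop < pBottom) {
        uint8_t tmp = *pTop;
        *pTop = *pBottom;
        *pBottom = tmp;
        pTop++;
        pBottom--;
    }
}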


Difference between two buffers in C/C++

This development is being done on Windows in usermode.
I have two (potentially quite large) buffers, and I would like to know the number of bytes different between the two of them.
I wrote this myself just checking byte by byte, but this resulted in a quite slow implementation. As I'm comparing on the order of hundreds of megabytes, this is undesirable. I'm aware that I could optimize this through many different means, but this seems like a common problem that probably has optimized solutions already out there, and there's no way I'm going to optimize this as effectively as optimization experts would.
Perhaps my Googling is inadequate, but I'm unable to find any C or C++ functions that count the number of differing bytes between two buffers. Is there such a built-in function in the C standard library, the WinAPI, or the C++ standard library that I just don't know of? Or do I need to optimize this manually?
I ended up writing this (perhaps somewhat poorly) optimized code to do the job for me. I was hoping the compiler would vectorize it under the hood, but that doesn't appear to be happening, unfortunately, and I didn't feel like digging around in the SIMD intrinsics to do it manually. As a result, my bit-fiddling tricks may end up making it slower, but it's still fast enough that it accounts for no more than about 4% of my code's runtime (and almost all of that is the memcmp). Whether or not it could be better, it's good enough for me.
I'll note that this is designed to be fast for my use case, where I'm expecting only rare differences.
inline size_t ComputeDifferenceSmall(
    _In_reads_bytes_(size) char* buf1,
    _In_reads_bytes_(size) char* buf2,
    size_t size) {
    /* size should be <= 0x1000 bytes */
    /* In my case, I expect frequent differences if any at all are present. */
    size_t res = 0;
    for (size_t i = 0; i < (size & ~0x7ULL); i += 8) {
        uint64_t diff1 = *reinterpret_cast<uint64_t*>(buf1 + i) ^
                         *reinterpret_cast<uint64_t*>(buf2 + i);
        if (!diff1) continue;
        /* Bit fiddle to make each byte 1 if they're different and 0 if the same */
        diff1 = ((diff1 & 0xF0F0F0F0F0F0F0F0ULL) >> 4) | (diff1 & 0x0F0F0F0F0F0F0F0FULL);
        diff1 = ((diff1 & 0x0C0C0C0C0C0C0C0CULL) >> 2) | (diff1 & 0x0303030303030303ULL);
        diff1 = ((diff1 & 0x0202020202020202ULL) >> 1) | (diff1 & 0x0101010101010101ULL);
        /* Sum the bytes */
        diff1 = (diff1 >> 32) + (diff1 & 0xFFFFFFFFULL);
        diff1 = (diff1 >> 16) + (diff1 & 0xFFFFULL);
        diff1 = (diff1 >> 8) + (diff1 & 0xFFULL);
        diff1 = (diff1 >> 4) + (diff1 & 0xFULL);
        res += diff1;
    }
    for (size_t i = (size & ~0x7ULL); i < size; i++) {
        res += (buf1[i] != buf2[i]);
    }
    return res;
}
size_t ComputeDifference(
    _In_reads_bytes_(size) char* buf1,
    _In_reads_bytes_(size) char* buf2,
    size_t size) {
    size_t res = 0;
    /* I expect most pages to be identical, and both buffers should be page aligned if
     * larger than a page. memcmp has more optimizations than I'll ever come up with,
     * so I can just use that to determine if I need to check for differences
     * in the page. */
    for (size_t pn = 0; pn < (size & ~0xFFF); pn += 0x1000) {
        if (memcmp(&buf1[pn], &buf2[pn], 0x1000)) {
            res += ComputeDifferenceSmall(&buf1[pn], &buf2[pn], 0x1000);
        }
    }
    return res + ComputeDifferenceSmall(
        &buf1[size & ~0xFFF], &buf2[size & ~0xFFF], size & 0xFFF);
}

Rijndael S-box in C

I am trying to write a function which computes the Rijndael S-box according to the Wikipedia article on the Rijndael S-box.
#define ROTL8(x,shift) ((uint8_t) (((x) << (shift)) | ((x) >> (8 - (shift)))))
uint8_t sbox(uint8_t b)
{
    uint8_t s = b ^ ROTL8(b,1) ^ ROTL8(b,2) ^ ROTL8(b,3) ^ ROTL8(b,4) ^ 0x63;
    return s;
}
Now this works when I try sbox(0x00) == 0x63 and sbox(0x01) == 0x7c, but it goes astray from sbox(0x02) onwards, which should be 0x77 but gives me 0x5d instead. I suspected the issue might be the rotation not working correctly, but that does not seem to be the problem...
What is wrong here?
This is the wrong way to implement AES's S-box: the affine transformation must be applied to the multiplicative inverse of b in GF(2^8), not to b itself. That is why your first two values happen to match: 0 is special-cased to 0x63, and 1 is its own inverse. Most implementations are either hardcoded (they explicitly write the entire S-box as a 256-byte array), or they iteratively build the entries of the S-box, as in the Wikipedia article you linked:
void initialize_aes_sbox(uint8_t sbox[256]) {
    uint8_t p = 1, q = 1;
    /* loop invariant: p * q == 1 in the Galois field */
    do {
        /* multiply p by 3 */
        p = p ^ (p << 1) ^ (p & 0x80 ? 0x1B : 0);
        /* divide q by 3 (equals multiplication by 0xf6) */
        q ^= q << 1;
        q ^= q << 2;
        q ^= q << 4;
        q ^= q & 0x80 ? 0x09 : 0;
        /* compute the affine transformation */
        uint8_t xformed = q ^ ROTL8(q, 1) ^ ROTL8(q, 2) ^ ROTL8(q, 3) ^ ROTL8(q, 4);
        sbox[p] = xformed ^ 0x63;
    } while (p != 1);
    /* 0 is a special case since it has no inverse */
    sbox[0] = 0x63;
}
Notice that xformed's value, which is what you calculated in your own implementation, changes iteratively over the iterations (and, unlike in your implementation, the affine transformation is applied to q, the running inverse, not to the input byte itself). In practice, every manual S-box construction involves some similar iterative process; look over at Code Golf for some creative implementations. A sketch of a correct single-entry version follows.
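Here is what a correct single-entry version of the question's sbox(b) would have to do: invert b in GF(2^8) first, then apply the affine transformation. gmul and gf_inverse are hypothetical helper names (brute-force inversion, nothing clever):
#include <stdint.h>
#define ROTL8(x,shift) ((uint8_t) (((x) << (shift)) | ((x) >> (8 - (shift)))))
/* GF(2^8) multiplication modulo the AES polynomial x^8 + x^4 + x^3 + x + 1 */
static uint8_t gmul(uint8_t a, uint8_t b) {
    uint8_t p = 0;
    while (b) {
        if (b & 1) p ^= a;
        a = (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1B : 0));
        b >>= 1;
    }
    return p;
}
/* brute-force multiplicative inverse; 0 has none and maps to 0 */
static uint8_t gf_inverse(uint8_t b) {
    if (b == 0) return 0;
    for (int c = 1; c < 256; c++)
        if (gmul(b, (uint8_t)c) == 1) return (uint8_t)c;
    return 0; /* unreachable for nonzero b */
}
uint8_t sbox(uint8_t b) {
    uint8_t q = gf_inverse(b); /* the step missing from the question's version */
    return q ^ ROTL8(q,1) ^ ROTL8(q,2) ^ ROTL8(q,3) ^ ROTL8(q,4) ^ 0x63;
}
This reproduces sbox(0x02) == 0x77, since the inverse of {02} in GF(2^8) is {8d}.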

How can I implement paging, and find the physical memory address knowing the virtual address?

I want to implement the initialisation of paging.
Referring to the OSDev wiki pages https://wiki.osdev.org/Paging and https://wiki.osdev.org/Setting_Up_Paging, my own version is very different.
Looking at the page directory, they say that the low 12 bits are for the flags and the rest is for the address of the page table, so I tried something like this:
void init_paging() {
    unsigned int i = 0;
    unsigned int __FIRST_PAGE_TABLE__[0x400] __attribute__((aligned(0x1000)));
    for (i = 0; i < 0x400; i++) __PAGE_DIRECTORY__[i] = PAGE_PRESENT(0) | PAGE_READ_WRITE;
    for (i = 0; i < 0x400; i++) __FIRST_PAGE_TABLE__[i] = ((i * 0x1000) << 12) | PAGE_PRESENT(1) | PAGE_READ_WRITE;
    __PAGE_DIRECTORY__[0] = ((unsigned int)__FIRST_PAGE_TABLE__ << 12) | PAGE_PRESENT(1) | PAGE_READ_WRITE;
    _EnablingPaging_();
}
This function is supposed to give me the physical address for a known virtual address:
void *get_phyaddr(void *virtualaddr) {
    unsigned long pdindex = (unsigned long)virtualaddr >> 22;
    unsigned long ptindex = ((unsigned long)virtualaddr >> 12) & 0x03FF;
    unsigned long *pd = (unsigned long *)__PAGE_DIRECTORY__[pdindex];
    unsigned long *pt = (unsigned long *)pd[ptindex];
    return (void *)(pt + ((unsigned int)virtualaddr & 0xFFF));
}
Am I going in the wrong direction, or is it still the same?
Assuming you're trying to identity map the first 4 MiB of the physical address space:
a) unsigned int __FIRST_PAGE_TABLE__[0x400] __attribute__((aligned(0x1000))); is a local variable (i.e. likely put on the stack), and it will not survive after the function returns (the stack space it was using will be overwritten by other functions later), causing the page table to become corrupted. That isn't likely to end well.
b) For __FIRST_PAGE_TABLE__[i] = ((i * 0x1000) << 12) | PAGE_PRESENT(1) | PAGE_READ_WRITE;, you're shifting i twice, once with * 0x1000 (which is the same as << 12) and again with the << 12. This is too much, and it needs to be more like __FIRST_PAGE_TABLE__[i] = (i << 12) | PAGE_PRESENT(1) | PAGE_READ_WRITE;.
c) For __PAGE_DIRECTORY__[0] = ((unsigned int)__FIRST_PAGE_TABLE__ << 12) | PAGE_PRESENT(1) | PAGE_READ_WRITE;, the address is already an address (and not a "page number" that needs to be shifted), so it needs to be more like __PAGE_DIRECTORY__[0] = ((unsigned int)__FIRST_PAGE_TABLE__) | PAGE_PRESENT(1) | PAGE_READ_WRITE;.
Beyond that, I'd very much prefer better use of types. Specifically, you should get in the habit of using uint32_t (or uint64_t, or a typedef of your own) for physical addresses, to make sure you don't accidentally confuse a virtual address with a physical address (and so the compiler complains about the wrong type when you make a mistake); even though it's not very important now, because you're identity mapping, it will become important "soon". I'd also recommend using uint32_t for page table entries and page directory entries, because they must be 32 bits and not "whatever size the compiler felt like int should be" (note that this is a difference in how you think about the code, which matters more than what the compiler actually does or whether int happens to be 32 bits anyway). A sketch putting these points together follows.
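Putting (a), (b) and (c) together, the initialisation might look something like this sketch (PAGE_PRESENT, PAGE_READ_WRITE, __PAGE_DIRECTORY__ and _EnablingPaging_ are the asker's own symbols, assumed to be defined elsewhere):
#include <stdint.h>
/* (a) static storage: the table must survive after init_paging() returns,
 * so it cannot live on the stack */
static uint32_t __FIRST_PAGE_TABLE__[0x400] __attribute__((aligned(0x1000)));
void init_paging(void) {
    unsigned int i;
    for (i = 0; i < 0x400; i++)
        __PAGE_DIRECTORY__[i] = PAGE_PRESENT(0) | PAGE_READ_WRITE;
    /* (b) identity map the first 4 MiB: entry i points at physical i * 0x1000 */
    for (i = 0; i < 0x400; i++)
        __FIRST_PAGE_TABLE__[i] = (i << 12) | PAGE_PRESENT(1) | PAGE_READ_WRITE;
    /* (c) the table's address is already an address, so no shift */
    __PAGE_DIRECTORY__[0] = ((uint32_t)__FIRST_PAGE_TABLE__) | PAGE_PRESENT(1) | PAGE_READ_WRITE;
    _EnablingPaging_();
}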
When we access a page that is not present, we get a page-fault interrupt.
To avoid that, we can check whether the page is present; if it is not, I chose to return 0x0:
physaddr_t *get_phyaddr(void *virtualaddr) {
    uint32_t pdindex = (uint32_t)virtualaddr >> 22;
    uint32_t ptindex = ((uint32_t)virtualaddr >> 12) & 0x03FF;
    uint32_t *pd, ptable;
    if ((page_directory[pdindex] & 0x3) == 0x3) {
        pd = (uint32_t *)(page_directory[pdindex] & 0xFFFFF000);
        if ((pd[ptindex] & 0x3) == 0x3) {
            ptable = pd[ptindex] & 0xFFFFF000;
            /* add the byte offset to the frame address itself,
             * not to a uint32_t pointer (which would scale it by 4) */
            return (physaddr_t *)(ptable + ((uint32_t)virtualaddr & 0xFFF));
        } else
            return 0x0;
    } else
        return 0x0;
}
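As a quick sanity check, under the identity mapping set up above, any mapped address in the first 4 MiB should translate to itself (a hypothetical spot check, not part of the original answer):
/* expect phys == (void *)0x00100000 under identity mapping;
 * not-present addresses return 0x0 */
physaddr_t *phys = get_phyaddr((void *)0x00100000);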

Bilinear Interpolation image resizing marks & 'splotches'

I'm making an image resizing program in C, and at the moment I'm having trouble with the bilinear interpolation function (it's one of many I'm using). The problem only arises for 16-bit bitmaps; if I use 24-bit versions, it resizes them perfectly.
Here's my code for the bilinear interpolation, where n_w and n_h are the new width and height of the image:
#define getelm(x) ((((pix+index)->x)*(1-xdiff)*(1-ydiff)) + \
                   (((pix+index+1)->x)*(xdiff)*(1-ydiff)) + \
                   (((pix+index+o_w)->x)*(1-xdiff)*(ydiff)) + \
                   (((pix+index+o_w+1)->x)*(xdiff)*(ydiff)))
int pad = (2*n_w) & 3;
if (pad)
    pad = 4-pad;
uint16_t *buffer;
if (buf)
    buffer = malloc(2*n_w);
for (i = 0; i < n_h; i++) {
    for (j = 0; j < n_w; j++) {
        x = (int)(j*xrat);
        y = (int)(i*yrat);
        xdiff = (xrat*j)-x;
        ydiff = (yrat*i)-y;
        index = y*o_w+x;
        uint16_t container = 0;
        container |= (int)(round(getelm(b))) << 11;
        container |= (int)(round(getelm(g))) << 5;
        container |= (int)round(getelm(r));
        if (buf)
            *(buffer+j) = container;
        else
            fwrite(&container, 1, 2, dest);
    }
    if (buf)
        fwrite(buffer, 1, 2*n_w, dest);
    fwrite(&pad, 1, pad, dest);
}
My 24-bit version of this code (where the only difference is that no container is used and instead three 8-bit integers hold the RGB values) works beautifully.
This code, however, gives weird results. Look at the image below:
When I resize this, it gives me this back:
I can't see why this would be happening, especially when it works for 24-bit bitmaps, and when other resizing algorithms (nearest neighbour, for example) handle 16-bit in the same way this one should.
EDIT:
I don't think it's an overflow problem, because adding the following code gives no output when run:
if (MAX((int)(getelm(b)), 31) > 31)
    printf("blue overflow: %.10f\n", (getelm(b)));
if (MAX((int)(getelm(g)), 63) > 63)
    printf("green overflow: %.10f\n", (getelm(g)));
if (MAX((int)(getelm(r)), 31) > 31)
    printf("red overflow: %.10f\n", (getelm(r)));
EDIT 2:
I don't think it's an underflow problem either, this does nothing:
if ((getelm(b)) < 0 || (getelm(g)) < 0 || (getelm(r)) < 0)
    printf("Underflow\n");
Assuming the data in pix has the type
struct
{
    uint16_t r : 5;
    uint16_t g : 6;
    uint16_t b : 5;
};
there's a bug in the calculation of container. Using round won't always prevent overflow; an overflowing channel then spills into the bits of its neighbour, which produces exactly this kind of colour corruption. The following code will:
uint16_t container = 0;
container |= ((int)round(getelm(b)) & 31) << 11;
container |= ((int)round(getelm(g)) & 63) << 5;
container |= ((int)round(getelm(r)) & 31);
or, to preserve as much of the information as possible (min being a suitable MIN(a,b) macro):
uint16_t container = 0;
container |= min((int)round(getelm(b)), 31) << 11;
container |= min((int)round(getelm(g)), 63) << 5;
container |= min((int)round(getelm(r)), 31);
EDIT
Since pix->r, pix->g and pix->b come from 8-bit values, the same reasoning applies to them, and their ranges need to be checked.
Since a white region turns purple, the green channel is being suppressed, either due to overflow or because it is read as zero in the first place. In this case, inspecting the colour values as they are read can help.
Similarly, black turning into green suggests that a bit representing a small value is shifted and the colour is somehow inverted.
To find the bug I recommend splitting the code into small functions and asserting the input of each one of them.
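As one concrete example of that approach, here is a sketch of a small packing helper with asserted ranges (pack565 is a made-up name; the 5-6-5 layout mirrors the struct above):
#include <assert.h>
#include <stdint.h>
/* Packs already-rounded channel values into 5-6-5, asserting the ranges
 * instead of silently letting one channel spill into the next. */
static uint16_t pack565(int b, int g, int r) {
    assert(b >= 0 && b <= 31);
    assert(g >= 0 && g <= 63);
    assert(r >= 0 && r <= 31);
    return (uint16_t)(((unsigned)b << 11) | ((unsigned)g << 5) | (unsigned)r);
}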

Explanation of Header Pixel in GIMP created C Header File of an XPM image

In GIMP, you're able to save an image as a C header file. I did so with an XPM file, which looks like the image below:
When I save the XPM image as a C header file, GIMP outputs this C header file.
To process each pixel of the given image data, the HEADER_PIXEL macro is applied repeatedly. What I don't understand is how HEADER_PIXEL decodes the data in the first place.
#define HEADER_PIXEL(data,pixel) {\
    pixel[0] = (((data[0] - 33) << 2) | ((data[1] - 33) >> 4)); \
    pixel[1] = ((((data[1] - 33) & 0xF) << 4) | ((data[2] - 33) >> 2)); \
    pixel[2] = ((((data[2] - 33) & 0x3) << 6) | ((data[3] - 33))); \
    data += 4; \
}
When I saw it in use in another person's code, they stated the bytes were in the wrong order and rearranged them themselves. They used it like this:
char *pixel, *data = header_data;
int i = width * height;
*processed_data = pixel = malloc(i * 4 + 1);
while(i-- > 0) {
    pixel[0] = ((((data[2] - 33) & 0x3) << 6) | ((data[3] - 33)));
    pixel[1] = ((((data[1] - 33) & 0xF) << 4) | ((data[2] - 33) >> 2));
    pixel[2] = (((data[0] - 33) << 2) | ((data[1] - 33) >> 4));
    pixel[3] = 0;
    data += 4;
    pixel += 4;
}
But that didn't really help me understand what is going on with all the bit shifting and bitwise ORs, or why 33 is subtracted, and so forth. If anyone can explain how the image data in the header is decoded, that would be much appreciated.
Thanks in advance!
Each pixel is represented by 3 bytes. These pixels are defined as a character array named header_data.
The problem is that not every byte value is a printable character that could appear in that header file.
This is solved by using only the printable characters 33 through 96. Each character therefore carries 6 bits of information (its value minus 33, in the range 0..63), so every four characters give 24 bits, which is exactly enough to represent all permutations of 3 bytes.
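To see the arithmetic laid out, here is the same decoding written long-hand (decode_pixel is a made-up name; it is equivalent to HEADER_PIXEL above):
#include <stdint.h>
/* Each of the four chars carries 6 bits (its value minus 33). Concatenate
 * them into a 24-bit value, then split that into three 8-bit channels. */
static void decode_pixel(const char *data, uint8_t pixel[3]) {
    uint32_t v = ((uint32_t)(data[0] - 33) << 18) |
                 ((uint32_t)(data[1] - 33) << 12) |
                 ((uint32_t)(data[2] - 33) << 6)  |
                  (uint32_t)(data[3] - 33);
    pixel[0] = (uint8_t)(v >> 16); /* top 8 of the 24 bits */
    pixel[1] = (uint8_t)(v >> 8);  /* middle 8 bits */
    pixel[2] = (uint8_t)v;         /* bottom 8 bits */
}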
