Algorithms for downscaling bitmapped fonts - c

This is a follow-up to this question.
I am working on a low-level C app where I have to draw text. I have decided to store the font I want to use as an array (black and white, each char perhaps 128x256), and then downscale it to the sizes I need with some algorithm (producing grayscale, so I can have some crude font smoothing).
Note: this is a toy project, please disregard stuff like doing calculations at runtime or not.
Question is, which algorithm?
I looked up 2xSaI, but it's rather complicated. I'd like something I can read the description for and work out the code myself (I am a beginner and have been coding in C/C++ for just under a year).
Suggestions, anyone?
Thanks for your time!
Edit: Please note, the input is B&W, the output should be smoothed grayscale

Figure out the rectangle in the source image that will correspond to a destination pixel. For example, if your source image is 44x88 and your destination is 20x40, the upper-left pixel in the destination corresponds to the rectangle from (0,0) to (2.2,2.2) in the source image. Now, do an area-average over those pixels:
Area is 2.2 * 2.2 = 4.84. You'll scale the result by 1/4.84.
Pixels at (0,0), (0,1), (1,0), and (1,1) each weigh in at 1 unit.
Pixels at (0,2), (1,2), (2,0), and (2,1) each weigh in at 0.2 unit (because the rectangle only covers 20% of them).
The pixel at (2,2) weighs in at 0.04 (because the rectangle only covers 4% of it).
The total weight is of course 4*1 + 4*0.2 + 0.04 = 4.84.
This one was easy because you started with source and destination pixels lined up evenly at the edge of the image. In general, you'll have partial coverage at all 4 sides/4 corners of the sliding rectangle.
Don't bother with algorithms other than area-averaging for downscaling. Most of them are plain wrong (they result in horrible aliasing, at least with a factor smaller than 1/2) and the ones that aren't plain wrong are a good bit more painful to implement and probably won't give you better results.
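A minimal C sketch of that area average, handling the fractional coverage at the edges of each rectangle (the function and parameter names are just illustrative, not from the question; it assumes an 8-bit source where 0 is black and 255 is white):

/* Downscale a W*H grayscale source (0..255) to a w*h destination by
   averaging, for each destination pixel, the source rectangle it covers,
   weighting partially covered source pixels by their covered fraction. */
void area_average_downscale(const unsigned char *src, int W, int H,
                            unsigned char *dst, int w, int h)
{
    double sx = (double)W / w;   /* source rectangle width  per destination pixel */
    double sy = (double)H / h;   /* source rectangle height per destination pixel */

    for (int dy = 0; dy < h; dy++)
        for (int dx = 0; dx < w; dx++)
        {
            double x0 = dx * sx, x1 = (dx + 1) * sx;
            double y0 = dy * sy, y1 = (dy + 1) * sy;
            double sum = 0.0;

            for (int y = (int)y0; y < y1 && y < H; y++)
            {
                double cy = 1.0;                  /* vertical coverage of this source row */
                if (y < y0)     cy -= y0 - y;
                if (y + 1 > y1) cy -= (y + 1) - y1;

                for (int x = (int)x0; x < x1 && x < W; x++)
                {
                    double cx = 1.0;              /* horizontal coverage of this source column */
                    if (x < x0)     cx -= x0 - x;
                    if (x + 1 > x1) cx -= (x + 1) - x1;

                    sum += src[y * W + x] * cx * cy;
                }
            }
            dst[dy * w + dx] = (unsigned char)(sum / (sx * sy) + 0.5);
        }
}

For a 1-bit source like the font described in the question, use 0 and 255 as the two input values (or multiply the 0/1 sum by 255 before dividing by the area).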

Consider that your image is an N*M B&W bitmap. For simplicity we'll treat it as char Letter[N][M], where the allowed values are 0 and 1. Now suppose you want to downscale it to unsigned char letter[n][m]. Each grayscale pixel of letter is then computed from the number of white pixels in the corresponding rectangle of the big bitmap:
char Letter[N][M];
unsigned char letter[n][m];
int rect_sz_X = N / n; // the size of the rectangle that maps to a single pixel
int rect_sz_Y = M / m; // in the downscaled image
int i, j, x, y;
for (i = 0; i < n; i++)
    for (j = 0; j < m; j++) {
        int sum = 0;
        for (x = 0; x < rect_sz_X; x++)
            for (y = 0; y < rect_sz_Y; y++)
                sum += Letter[i*rect_sz_X + x][j*rect_sz_Y + y];
        letter[i][j] = (sum * 255) / (rect_sz_X * rect_sz_Y);
    }
Note that when the sizes aren't evenly divisible, the rectangles won't cover the source exactly; with the integer division above, a few trailing rows and columns are simply ignored. The larger your original bitmap, the better.

Scaling a bitmapped font is the same problem as scaling any other bitmap. The general class of algorithm you're after is interpolation. There are quite a few ways to do this; in general, the more visually accurate the result, the more complicated the algorithm. You could start by looking at the following, in increasing order of complexity (a small bilinear sketch follows the list):
Nearest-neighbour
Bilinear interpolation
Bicubic interpolation
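As a rough sketch of the middle option, bilinear interpolation blends the four source pixels surrounding a fractional sample position by their distances (names here are illustrative, assuming an 8-bit grayscale source):

/* Sample a W*H 8-bit grayscale image at the fractional position (fx, fy)
   by blending the four surrounding pixels. */
unsigned char bilinear_sample(const unsigned char *src, int W, int H,
                              float fx, float fy)
{
    int x = (int)fx, y = (int)fy;
    int xn = (x + 1 < W) ? x + 1 : x;   /* clamp at the right edge  */
    int yn = (y + 1 < H) ? y + 1 : y;   /* clamp at the bottom edge */
    float ax = fx - x, ay = fy - y;     /* fractional parts         */

    float top = src[y  * W + x] * (1 - ax) + src[y  * W + xn] * ax;
    float bot = src[yn * W + x] * (1 - ax) + src[yn * W + xn] * ax;
    return (unsigned char)(top * (1 - ay) + bot * ay + 0.5f);
}

Note that, as the area-averaging answer above warns, sampling like this aliases badly once the scale factor drops much below 1/2; for heavy downscaling, area averaging is the safer choice.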

It's pretty simple. If all you've got is a bitmapped font rather than an outline font, you have very limited choices when picking an anti-aliased pixel color. For example, if the bitmapped font's point size is exactly four times the desired display point size, you can only ever get 17 distinct values: the number of 'lit' pixels in the 4x4 mapping rectangle, from 0 to 16.
Having to deal with fractional mapping is a programming exercise but not one that improves the quality.

If it is acceptable to constrain the downscaling to successive halvings (50%, 25%, 12.5%, etc.), then a very simple and fairly good algorithm is to create each downscaled pixel as the majority vote of its source pixels. For example, at 50%, a square of four pixels forms one downscaled pixel: if zero or one of them is on, the output is off; if three or four are on, the output is on. For the remaining case (exactly two pixels on), either always choose on, or always off, or look at other surrounding pixels to break the tie. A minimal 2x2 sketch follows.
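A minimal sketch of the 50% case (illustrative names; ties of exactly two 'on' pixels are resolved toward 'on' here, one of the simple choices mentioned above):

/* Halve a W*H bitmap of 0/1 values into a (W/2)*(H/2) bitmap by majority
   vote over each 2x2 block; W and H are assumed to be even. */
void halve_by_majority(const unsigned char *src, int W, int H,
                       unsigned char *dst)
{
    for (int y = 0; y < H / 2; y++)
        for (int x = 0; x < W / 2; x++)
        {
            int on = src[(2 * y)     * W + 2 * x] + src[(2 * y)     * W + 2 * x + 1]
                   + src[(2 * y + 1) * W + 2 * x] + src[(2 * y + 1) * W + 2 * x + 1];
            dst[y * (W / 2) + x] = (on >= 2) ? 1 : 0;
        }
}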

Related

Resizing single 1 pixel wide bitmap strip - faster than this example? (for Raycaster algorithm)

I am attaching the picture example and my current code.
My question is: can I make resizing/stretching/interpolating a single vertical bitmap strip faster than using another for-loop?
The current code looks fairly optimal: for the current strip size on screen, iterate from the start height to the end height, get the corresponding pixel from the texture and add it to the output buffer, then add the step to get the next pixel.
Here is the essential part of my code:
inline void RC_Raycast_Walls()
{
    // cast a ray for every screen column
    for (u_int16 rx = 0; rx < RC_render_width_i; ++rx)
    {
        // ..
        // traverse the map grid
        // find the intersection point
        // calculate the height of the strip on screen
        // ..

        // step size for the next pixel in the texture
        float32 tex_step_y = RC_texture_size_f / (float32)pp_wall_height;
        // starting texture coordinate
        float32 tex_y = (float32)(pp_wall_start - RC_player_pitch - player_z_div_wall_distance - RC_render_height_d2_i + pp_wall_height_d2) * tex_step_y;

        // drawing walls into the buffer <- ENTERING ANOTHER LOOP, only for a SINGLE STRIP
        for (int16 ry = pp_wall_start; ry < pp_wall_end; ++ry)
        {
            // cast the texture coordinate to integer, and mask with (texHeight - 1) in case of overflow
            u_int16 tex_y_safe = (u_int16)tex_y & RC_texture_size_m1_i;
            tex_y += tex_step_y;

            u_int32 texture_current_pixel = texture_pixels[RC_texture_size_i * tex_y_safe + tex_x];
            u_int32 output_pixel_index = rx + ry * RC_render_width_i;

            output_buffer[output_pixel_index] =
                (((texture_current_pixel >> 16 & 0x0ff) * intensity_value) >> 8) << 16 |
                (((texture_current_pixel >> 8 & 0x0ff) * intensity_value) >> 8) << 8 |
                (((texture_current_pixel & 0x0ff) * intensity_value) >> 8);
        }
    }
}
Maybe with a bigger step, like 2 instead of 1, every second line would be left empty, but adding another line of code to fill that empty space gives the same performance. I would not like doubled pixels, and interpolating between two of them would, I think, take even longer.
Thank You in Advance!
PS: it's based on the Lodev raycasting tutorial:
https://lodev.org/cgtutor/raycasting.html
You do not need floats at all
You can use a DDA on integers, without multiplication or division. These days floating point is not as slow as it used to be, but your conversions between float and int might be ... See these QAs (both use this kind of DDA); a small integer-stepping sketch follows them:
DDA line with subpixel
DDA based rendering routines
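As a rough illustration of the idea (the names strip_h and tex_h are made up here, not from the linked answers), an integer DDA keeps an error accumulator and needs only additions inside the loop to map strip_h screen rows onto tex_h texture rows:

#include <stdio.h>

/* Print which texture row each screen row of the strip samples,
   replacing the float tex_y / tex_step_y pair with integer math. */
void walk_strip(int strip_h, int tex_h)
{
    int acc = 0, ty = 0;
    for (int ry = 0; ry < strip_h; ry++)
    {
        printf("screen row %d -> texture row %d\n", ry, ty);
        acc += tex_h;                                      /* advance by tex_h/strip_h rows */
        while (acc >= strip_h) { acc -= strip_h; ty++; }   /* ...without any division       */
    }
}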
use LUT for applying Intensity
It looks like each color channel c is 8-bit and the intensity i is fixed point in the range <0,1>, so you can precompute every combination into something like this:
u_int8 LUT[256][256];
for (int c = 0; c < 256; c++)
    for (int i = 0; i < 256; i++)
        LUT[c][i] = (c * i) >> 8;
use pointers or union to access RGB channels instead of bit operations
My favorite is a union:
union color
{
    u_int32 dd;     // 1x 32-bit RGBA
    u_int16 dw[2];  // 2x 16-bit
    u_int8  db[4];  // 4x 8-bit (individual channels)
};
texture coordinates
Again, it looks like you are doing too many operations. For example, in [RC_texture_size_i * tex_y_safe + tex_x], if your texture size is 128 you can shift left by 7 bits instead of multiplying. Yes, on modern CPUs this is not really an issue, but the whole thing can be replaced by a simple LUT: remember a pointer to each horizontal scanline of the texture and rewrite the access as [tex_y_safe][tex_x], as in the sketch below.
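A small sketch of that scanline LUT, reusing the identifiers from the question's code (the compile-time size RC_TEXTURE_SIZE and the setup function are assumptions):

/* Precompute a pointer to the start of each texture row once, so the inner
   loop can index the texture as [row][column] with no multiplication. */
static u_int32 *texture_row[RC_TEXTURE_SIZE];   /* RC_TEXTURE_SIZE = texture height */

void RC_Build_Row_LUT(void)
{
    for (int y = 0; y < RC_texture_size_i; ++y)
        texture_row[y] = texture_pixels + (size_t)y * RC_texture_size_i;
}

/* ...and in the strip loop:
   u_int32 texture_current_pixel = texture_row[tex_y_safe][tex_x];  */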
So, based on the LUT and union points above, rewrite your color computation to this:
color c;
c.dd=texture_current_pixel;
c.db[0]=LUT[c.db[0]][intensity_value];
c.db[1]=LUT[c.db[1]][intensity_value];
c.db[2]=LUT[c.db[2]][intensity_value];
output_buffer[output_pixel_index]=c.dd;
As you can see, it's just a bunch of memory transfers instead of multiple bit-shifts, bit-masks and bit-ors. You can also use a pointer to color instead of texture_current_pixel and output_buffer[output_pixel_index] to speed things up a little more.
And finally see this:
Ray Casting with different height size
which is my version of the raycaster, using VCL.
Before changing anything, measure the performance you have now by timing how long rendering takes. Then, after each change in the code, measure whether it actually improves performance or not. If it didn't, go back to the old version of the code, as predicting what is fast on today's platforms is sometimes hard.
Also, for resizing, much better visual results are obtained by using mipmaps ... that usually eliminates the weird noise while moving.

Fill Grid With Random Pixels

I have a grid of 64x8 pixels. The aim is to activate the pixels on this grid in a random manner until the whole grid is activated.
Logically, I can generate random numbers in the 0-63 and 0-7 ranges and then activate that pixel. Assuming I run this for long enough, the grid should eventually be completely activated.
However, I am wondering if there is an algorithm that can minimize, or avoid altogether, collisions (returning an already-activated pixel coordinate) and guarantee complete grid activation in a finite amount of time?
Fill an array of length 512 with the numbers from 0 to 511 (64x8 = 512), so the array will contain {0, 1, 2, 3, ..., 511}.
Then shuffle that array, for example like explained here: Shuffle array in C.
Then define a function that maps a number to a coordinate, that would be:
y = n / 8
x = n % 8
n being one of the numbers of the array.
If the array is well shuffled, this guarantees that every pixel will be activated exactly once, in a random order. A self-contained sketch follows.
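A self-contained sketch of that approach (the grid array and the rand()-based shuffle are just for illustration; plug in whatever RNG and "activate" routine you already have):

#include <stdlib.h>
#include <time.h>

#define GRID_W 8
#define GRID_H 64
#define GRID_SIZE (GRID_W * GRID_H)

int main(void)
{
    int order[GRID_SIZE];
    unsigned char grid[GRID_H][GRID_W] = {0};

    for (int i = 0; i < GRID_SIZE; i++)          /* fill with 0..511 */
        order[i] = i;

    srand((unsigned)time(NULL));
    for (int i = GRID_SIZE - 1; i > 0; i--)      /* Fisher-Yates shuffle */
    {
        int j = rand() % (i + 1);
        int tmp = order[i]; order[i] = order[j]; order[j] = tmp;
    }

    for (int i = 0; i < GRID_SIZE; i++)          /* visit every pixel exactly once */
    {
        int y = order[i] / GRID_W;
        int x = order[i] % GRID_W;
        grid[y][x] = 1;                          /* "activate" the pixel here */
    }
    return 0;
}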
You could implement a pseudo-random generator (PRG; see Wikipedia) with a period of 64 * 8 = 512. Use 3 bits for the axis of size 8, and the remaining 6 bits for the axis of size 64.

Sprite Rotation Math Formula for Screen Width and Height

I am programming an asteroids-type game in C, and I have a sprite sheet of 36 sprites showing the ship rotating. I would like to know a math formula for figuring out how to move the ship in the direction of the sprite I have chosen from the sprite sheet. Note that I am incrementing by 10 degrees (hence 36 sprites for 360 degrees).
For example, my screen is 320 pixels wide by 256 pixels high.
If I select sprite image 10 (which is 90 degrees, the ship facing right), how can I calculate (using some sort of formula) the X and Y coordinates to move the ship in? I know 90 degrees is an easy one, but imagine if it were 30 degrees. There is a certain value for X and a certain value for Y. Since the screen is wider than it is tall, the X speed would be higher than the Y speed.
Hope that makes sense.
Many thanks.
There are two easy approaches: you can build a table of [x,y] distances for each of the 36 angles, or you can do the math "on the fly".
The advantage of calculating the distances on the fly is that you can easily increase the accuracy later on, if you decide you want more than 36 angles (and don't mind that the sprite is off by a couple of degrees). Also, since you will be working with floats anyway, you can do all of your calculations with far greater accuracy. Your speed could be as low as 0.01 pixel per second, and if you store your position as floats as well, you'd see your sprite move a tiny bit every few minutes.
Pre-calculating a table is easy and fast, though. Run this program to create the arrays xmove and ymove. Then, for an angle a, you can set xpos += ((speed*xmove[a])>>8) and ypos += ((speed*ymove[a])>>8).
The table stores sin and cos times 256, as integers. The values need to be multiplied by some large factor because they always fall inside the floating point range -1..1; storing them as their original floating point value is possible but unnecessary (it would only re-introduce floating point calculations in what can be reasonably approximated with pure integers, in your case). Now since the values are "premultiplied" by 256, you need to divide the speed*move calculation again by that number -- shifting right by 8 bits is all it takes. (There is a small rounding issue here; if it bothers you, add 128 before the right-shift.)
You can use a larger accuracy by using a multiplier of 1024 or higher, but again, more accuracy is probably entirely invisible for your purposes. ('1024' instead of '1000' because you can still efficiently use bit-shifting with that number.)
I believe that nowadays any modern screen has nigh-on square pixels, so unless you want it as some sort of special effect, speed in the y direction should be the same as x-speed. However, it's simple to add. Instead of dividing by 256, you'd use something like ypos += ((speed*ymove[angle])/341); -- this is (4*256/3), so the vertical speed is 75% of the horizontal speed.
A final possible refinement: you can also store your xpos, ypos premultiplied by 256! Then you would not shift the new coordinates right, but add the computed value directly. Only when displaying the actual sprite would you divide the coordinates by 256. That way your ship will not move by "entire pixels" only, but much more smoothly. If your speed is variable, you can store it with higher accuracy the same way (remember to scale down correctly, because otherwise your 'virtual' speed would be 256*256 times your 'screen' speed). A short usage sketch is shown after the program below.
The table created below assumes #0 is "straight up", #9 (not 10!) is "right", #18 is down and #27 is "left", where positive y points downwards.
By the way: the size of your ship doesn't really matter ... You probably don't want it to "jump" distances equal to its own size.
#include <stdio.h>
#include <math.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

int main (void)
{
    int i, angle;

    printf ("int xmove[36] = {\n");
    for (i = 0; i < 36; i++)
    {
        angle = 10 * i;
        // x distance: sin
        printf ("\t%d,", (int)(round(256 * sin(angle * M_PI / 180))));
        printf ("\t\t// angle: %d\n", angle);
    }
    printf ("};\n");
    printf ("\n");

    printf ("int ymove[36] = {\n");
    for (i = 0; i < 36; i++)
    {
        angle = 10 * i;
        // y distance: cos
        printf ("\t%d,", (int)(round(-256 * cos(angle * M_PI / 180))));
        printf ("\t\t// angle: %d\n", angle);
    }
    printf ("};\n");

    return 0;
}
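A short usage sketch of the generated tables, with the position stored premultiplied by 256 as suggested above (variable names and starting values are illustrative):

extern int xmove[36], ymove[36];   /* the pasted output of the generator above */

int xpos = 160 << 8;               /* positions premultiplied by 256           */
int ypos = 128 << 8;               /* roughly the middle of a 320x256 screen   */
int speed = 2;                     /* pixels per frame                         */

void move_ship(int sprite_index)   /* 0..35: 0 = up, 9 = right, 18 = down      */
{
    xpos += speed * xmove[sprite_index];
    ypos += speed * ymove[sprite_index];
}

/* When drawing, convert back to screen pixels: draw at (xpos >> 8, ypos >> 8). */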

How would you convert X,Y points to Rho,Theta for hough transform in C?

So I am trying to code the Hough transform in C. I have a binary image and have extracted the binary values from the image. Now, to do the Hough transform, I have to convert the [X,Y] values from the image into [rho,theta] to do a parametric transform of the form
rho = x*cos(theta) + y*sin(theta)
I don't quite understand how it's actually transformed, looking at other online code. Any help explaining the algorithm and how the accumulator for [rho,theta] values should be built based on [X,Y] would be appreciated. Thanks in advance. :)
Your question hints at the fact that you think that you need to map each (X,Y) point of interest in the image to ONE (rho, theta) vector in the Hough space.
The fact of the matter is that each point in the image is mapped to a curve, i.e. SEVERAL vectors in the Hough space. The number of vectors for each input point depends on some "arbitrary" resolution that you decide upon. For example, for 1 degree resolution, you'd get 360 vectors in Hough space.
There are two possible conventions for the (rho, theta) vectors: either you use the [0, 359] degree range for theta, in which case rho is always positive, or you use [0, 179] degrees for theta and allow rho to be either positive or negative. The latter is used in many implementations.
Once you understand this, the accumulator is little more than a two-dimensional array which covers the range of the (rho, theta) space, and where each cell is initialized to 0. It is used to count the number of vectors that are common to the various curves produced by different points in the input.
The algorithm therefore computes all 360 vectors (assuming 1 degree resolution for theta) for each point of interest in the input image. For each of these vectors, after rounding rho to the nearest integral value (which depends on the precision chosen for the rho dimension, e.g. 0.5 if we have 2 points per unit), it finds the corresponding cell in the accumulator and increments the value in that cell.
When this has been done for all points of interest, the algorithm searches for all cells in the accumulator which have a value above a chosen threshold. The (rho, theta) "addresses" of these cells are the polar coordinates of the lines (in the input image) that the Hough algorithm has identified.
Now, note that this gives you line equations; one is typically left to figure out the segments of those lines that actually belong in the input image.
A very rough pseudo-code "implementation" of the above:
Accumulator_rho_size = Sqrt(2) * max(width_of_image, height_of_image)
                       * precision_factor   // e.g. 2 if we want 0.5 precision
Accumulator_theta_size = 180                // going with the rho positive-or-negative convention
Accumulator = newly allocated array of integers
              with dimensions [Accumulator_rho_size, Accumulator_theta_size]
Fill all cells of Accumulator with the value 0.

For each (x, y) point of interest in the input image
    For theta = 0 to 179
        rho = round(x * cos(theta) + y * sin(theta),
                    value_based_on_precision_factor)
        Accumulator[rho, theta]++

Search Accumulator for the cells with the biggest counter values
(or with a value above a given threshold)   // picking the threshold can be tricky

The corresponding (rho, theta) "addresses" of these high-valued cells are the
polar coordinates of the lines discovered in the original image, defined by
their angle relative to the x axis and their distance to the origin. Simple
math can be used to compute various points on each line, in particular the
axis intercepts, to produce a y = ax + b equation if so desired.
Overall this is a rather simple algorithm. The complexity lies mostly in being consistent with the units, e.g. in the conversion between degrees and radians (most math libraries' trig functions are radian-based), and with the coordinate system used for the input image. A rough C sketch of the accumulator fill follows.
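A rough C sketch of the accumulator fill (the fixed 1-degree theta resolution, the 1-pixel rho resolution and all the names are assumptions made for illustration):

#include <math.h>
#include <stdlib.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

#define THETA_STEPS 180   /* [0,179] degrees; rho may be negative */

/* xs, ys: the n points of interest; w, h: image size.
   Returns a (2*rho_max) x THETA_STEPS accumulator (caller frees it);
   the row index is rho + rho_max so that negative rho values fit. */
int *hough_accumulate(const int *xs, const int *ys, int n,
                      int w, int h, int *rho_max_out)
{
    int rho_max = (int)ceil(sqrt((double)w * w + (double)h * h));
    int *acc = calloc((size_t)(2 * rho_max) * THETA_STEPS, sizeof *acc);

    for (int i = 0; i < n; i++)
        for (int t = 0; t < THETA_STEPS; t++)
        {
            double theta = t * M_PI / 180.0;   /* degrees -> radians */
            int rho = (int)lround(xs[i] * cos(theta) + ys[i] * sin(theta));
            acc[(rho + rho_max) * THETA_STEPS + t]++;
        }

    *rho_max_out = rho_max;
    return acc;
}

After the fill, scan the returned array for cells above your threshold; cell (r, t) corresponds to rho = r - rho_max and theta = t degrees.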

Getting RGB values for each pixel from a raw image in C

I want to read the RGB values of each pixel from a raw image. Can someone tell me how to achieve this? Thanks for the help!
The format of my raw image is .CR2, which comes from a camera.
Assuming the image is w * h pixels, and stored in true "packed" RGB format with no alpha component, each pixel will require three bytes.
In memory, the first line of the image might be represented in awesome ASCII graphics like this:
R0 G0 B0 R1 G1 B1 R2 G2 B2 ... R(w-1) G(w-1) B(w-1)
Here, each Rn Gn and Bn represents a single byte, giving the red, green or blue component of pixel n of that scanline. Note that the order of the bytes might be different for different "raw" formats; there's no agreed-upon world standard. Different environments (graphics cards, cameras, ...) do it differently for whatever reason, you simply have to know the layout.
Reading out a pixel can then be done by this function:
typedef unsigned char byte;

void get_pixel(const byte *image, unsigned int w,
               unsigned int x,
               unsigned int y,
               byte *red, byte *green, byte *blue)
{
    /* Compute pointer to the first (red) byte of the desired pixel. */
    const byte *pixel = image + w * y * 3 + 3 * x;

    /* Copy R, G and B to the outputs. */
    *red   = pixel[0];
    *green = pixel[1];
    *blue  = pixel[2];
}
Notice how the height of the image is not needed for this to work, and how the function is free from bounds-checking. A production-quality function might be more armor-plated.
Update: if you're worried this approach will be too slow, you can of course just loop over the pixels instead:
unsigned int x, y;
const byte *pixel = /* ... assumed to be pointing at the data as per above */

for (y = 0; y < h; ++y)
{
    for (x = 0; x < w; ++x, pixel += 3)
    {
        const byte red = pixel[0], green = pixel[1], blue = pixel[2];
        /* Do something with the current pixel. */
    }
}
None of the methods posted so far are likely to work with a camera "raw" file. The file formats for raw files are proprietary to each manufacturer, and may contain exposure data, calibration constants, and white balance information, in addition to the pixel data, which will likely be in a packed format where each pixel can take up more than one byte, but less than two.
I'm sure there are open-source raw file converter programs out there that you could consult to find the algorithms to use, but I don't know of any off the top of my head.
Just thought of an additional complication: the raw file does not store RGB values for each pixel. Each pixel records only one color; the other two have to be interpolated from neighboring pixels. You'll definitely be better off finding a program or library that works with your camera.
A RAW image is an uncompressed format, so you just have to point to where your pixel is: skip any possible header, then add the pixel size times (the number of columns times the row number, plus the column number), and read the binary data there, giving a meaningful interpretation to the layout of the data (with masks and shifts, you know).
That's the general procedure; for your specific format you'll have to check the details. A small sketch of that offset arithmetic follows.
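For what it's worth, that offset arithmetic looks something like this (names are illustrative; bytes_per_pixel must match the actual packed layout of your format):

#include <stddef.h>

/* Byte offset of the pixel at (row, column) in a packed image that starts
   after header_size bytes and has 'columns' pixels per row. */
size_t pixel_offset(size_t header_size, size_t bytes_per_pixel,
                    size_t columns, size_t row, size_t column)
{
    return header_size + bytes_per_pixel * (columns * row + column);
}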

Resources