cvTranspose Gives a Different Image Size? - c

I am new to OpenCV, and I want to transpose a grayscale image but I am getting the wrong output size.
// img is an unsigned char image
IplImage *img = cvLoadImage("image.jpg"); // where image is of width=668 height=493
int width = img->width;
int height = img->height;
I want to transpose it:
IplImage *imgT = cvCreateImage(cvSize(height,width),img->depth,img->nChannels);
cvTranspose(img,imgT);
When I check the images I see that the original image img has a size of 329324, which is correct: 493*668*1 byte, since it is an unsigned char image. However, imgT has a size of 331328.
I am not really sure why this happens.
EDIT: 1- I am using Windows XP and OpenCV 2.2.
2- By "when I check the image", I meant that I inspect the values of the imgT variable, such as imgT->width, imgT->height, imgT->imageSize, etc.

This is due to the fact that OpenCV aligns the rows of its images on 4-byte boundaries. In the first image a row is 668 bytes wide, which is divisible by 4, so your image elements are contiguous.
The second image has a width of 493 (due to the transposing), which is not divisible by 4. The next higher number divisible by 4 is 496, so your rows are actually 496 bytes wide, with 3 unused bytes at the end of each row to align the rows on 4-byte boundaries. And in fact 496*668 is indeed 331328. So you should always be aware that your image elements need not be contiguous (at least they are contiguous inside a single row).
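If you later need to walk over the transposed image's pixels yourself, step between rows with widthStep (the padded row size in bytes) rather than width. A minimal sketch for an 8-bit single-channel image, using the imgT from the question:

// Walk an 8-bit, single-channel IplImage row by row. widthStep is the
// padded row size in bytes, so it already accounts for the 3 alignment
// bytes at the end of each 493-byte row.
for (int y = 0; y < imgT->height; y++)
{
    unsigned char *row = (unsigned char *)imgT->imageData + y * imgT->widthStep;
    for (int x = 0; x < imgT->width; x++)
    {
        unsigned char pixel = row[x]; // bytes past imgT->width are padding
        // ... process pixel ...
    }
}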

You can store your image in a cv::Mat and use Mat::t() to transpose it. Rows and columns will be allocated/deallocated automatically, so you don't have to worry about the size.

Related

C (OSDev) - How could I shift the contents of a 32-bit framebuffer upwards efficiently?

I'm working on writing a hobby operating system. Currently a large struggle that I'm having is attempting to scroll the framebuffer upwards.
It's simply a 32-bit linear framebuffer.
I have access to a few tools that could be helpful:
some of the mem* functions from libc: memset, memcpy, memmove, and memcmp
direct access to the framebuffer
the width, height, and size in bytes, of said framebuffer
a previous attempt that managed to scroll it up a few lines, albeit EXTREMELY slowly (it took roughly 25 seconds to scroll the framebuffer up by 5 pixels)
speaking of which, my previous attempt:
for (uint64_t i = 0; i != atoi(numLines); i++) {
    for (uint64_t j = 0; j != bootboot.fb_width; j++) {
        for (uint64_t k = 1; k != bootboot.fb_size; k++) {
            ((uint32_t *)&fb)[k - 1] = ((uint32_t *)&fb)[k];
        }
    }
}
A few things to note about the above:
numLines is a variable passed into the function; it's a char * that contains, as a string, the number of lines to scroll up by. I eventually want this to be the number of actual text lines to scroll up by, but for now treating it as the number of pixels to scroll up by is sufficient.
the bootboot struct is provided by the bootloader I use; it contains a few variables that could be of use: fb_width (the width of the framebuffer), fb_height (the height of the framebuffer), and fb_size (the size, in bytes, of the framebuffer)
the fb variable whose address I'm taking is also provided by the bootloader; it is a single byte placed at the first byte of the framebuffer, hence the need to cast &fb to a uint32_t * before using it.
Any and all help would be appreciated.
If I read the code correctly, what's happening with the triple nested loops is:
For every line to scroll,
For every pixel that the framebuffer is wide,
For every pixel in the entire framebuffer,
Move that pixel backwards by one.
Essentially you're moving each pixel one pixel distance at a time, so it's no wonder it takes so long to scroll the framebuffer. The total number of pixel moves is (numLines * fb_width * fb_size), so if your framebuffer is 1024x768, that's 5*1024*1024*768 moves, which is 4,026,531,840 moves. That's basically 5000 times the amount of work required.
Instead, you'll want to loop over the framebuffer only once, calculating each pixel's source and destination pointers and doing each move only once. Or you can calculate the source, destination, and size of the move once and then use memmove. Here's my attempt at this (with excessive comments):
// Convert string to integer
uint32_t numLinesInt = atoi(numLines);

// The destination of the move is just the top of the framebuffer
uint32_t* destination = (uint32_t*)&fb;

// Start the move from the top of the framebuffer plus however
// many lines we want to scroll.
uint32_t* source = (uint32_t*)&fb + (numLinesInt * bootboot.fb_width);

// The total number of pixels to move is the size of the
// framebuffer minus the amount of lines we want to scroll.
uint32_t pixelSize = (bootboot.fb_height - numLinesInt) * bootboot.fb_width;

// The total number of bytes is that times the size of one pixel.
uint32_t byteSize = pixelSize * sizeof(uint32_t);

// Do the move
memmove(destination, source, byteSize);
I haven't tested this, and I'm making a number of assumptions about how your framebuffer is laid out, so please make sure it works before using it. :)
(P.S. Also, if you put atoi(numLines) inside the end condition of the for loop, atoi will be called every time through the loop, instead of once at the beginning like you intended.)
Currently a large struggle that I'm having is attempting to scroll the framebuffer upwards.
The first problem is that the framebuffer is typically much slower to access than RAM (especially reads); so you want to do all the drawing in a buffer in RAM and then blit it efficiently (with a smaller number of much larger writes).
Once you have a buffer in RAM, you can make the buffer bigger than the screen. E.g. for a 1024 x 768 video mode you might have a 1024 x 1024 buffer. In that case small amounts of scrolling can often be done using the same "blit it efficiently" function; but sometimes you'll have to scroll the buffer in RAM.
To scroll the buffer in RAM you can cheat - treat it as a circular buffer and map a second copy into virtual memory immediately after the first. This allows you to (e.g.) copy 768 lines starting from the middle of the first copy without caring about hitting the end of the first buffer. The end result is that you can scroll the buffer in RAM without moving any data or changing the "blit it efficiently" function.
As a bonus, this also minimizes "tearing" artifacts. E.g. often you want to scroll the pixel data up and add more pixel data to the bottom, then blit it (without the user seeing an intermediate "half finished" frame).
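To make the back-buffer idea concrete (without the circular-buffer mapping trick), here is a rough sketch. It assumes a tightly packed 32-bit buffer whose pitch is exactly fb_width * 4, and backbuf/fb_ptr are hypothetical names for the RAM buffer and the mapped framebuffer:

#include <stdint.h>
#include <string.h>

// Scroll a RAM back buffer up by 'lines' pixel rows, clear the freed
// rows, then blit the whole buffer to the framebuffer in one pass.
static void scroll_and_blit(uint32_t *backbuf, uint32_t *fb_ptr,
                            uint32_t fb_width, uint32_t fb_height,
                            uint32_t lines)
{
    if (lines > fb_height)
        lines = fb_height;

    size_t row_px  = fb_width;
    size_t keep_px = (size_t)(fb_height - lines) * row_px;

    // Move the kept rows to the top of the back buffer (the regions
    // overlap, so memmove rather than memcpy).
    memmove(backbuf, backbuf + lines * row_px, keep_px * sizeof(uint32_t));

    // Clear the rows that scrolled into view at the bottom.
    memset(backbuf + keep_px, 0, (size_t)lines * row_px * sizeof(uint32_t));

    // One large sequential write to the (slow) framebuffer.
    memcpy(fb_ptr, backbuf, (size_t)fb_height * row_px * sizeof(uint32_t));
}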

how to extract the number of columns and the number of rows from the size of an image

I have to read a PGM image. All I have is the size of the whole PGM file, which is 505.
I tried to extract the number of rows and the number of columns from that image.
int size = 505;
At the beginning, I think the number of columns should be
int col = 505 / 8; // size of 1 byte
int row = col * 8;
I don't know if this is correct. Please advise.
PGM images use at least 1 byte per pixel, plus a header consisting of 3 ASCII numbers and 6 other bytes, separated by (possibly arbitrary) whitespace. So given a size of 505, it'd be at most 495 pixels in some rectangular arrangement; more than that can't be said from the size alone. If you can read the first three lines of the image, all will be revealed.
The file size is useful - it gives a buffer size that will be sufficient to store the pixels, but it does not tell you how they are to be arranged.
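For example, reading those "first three lines" of a binary (P5) PGM header to recover the real column and row counts could look roughly like this; comment lines starting with # are not handled, and "image.pgm" is just a placeholder name:

#include <stdio.h>

int main(void)
{
    FILE *fp = fopen("image.pgm", "rb");
    if (!fp)
        return 1;

    char magic[3] = {0};
    int cols = 0, rows = 0, maxval = 0;

    // Header: magic ("P5"), width, height, maxval, separated by whitespace.
    if (fscanf(fp, "%2s %d %d %d", magic, &cols, &rows, &maxval) != 4) {
        fclose(fp);
        return 1;
    }

    printf("magic=%s cols=%d rows=%d maxval=%d\n", magic, cols, rows, maxval);
    fclose(fp);
    return 0;
}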

Saving grayscale in CMYK using libjpeg in c

If this function does what I think it does, it seems that, on my machine at least, C=0, M=0, Y=0 and K=0 in CMYK does not correspond to white! What is the problem?
float *arr is a float array with size elements. I want to save this array as a JPEG with IJG's libjpeg in two color spaces upon demand: g: Gray scale and c: CMYK. I follow their example and make the input JSAMPLE *jsr array with the number of JSAMPLE elements depending on the color space: size elements for gray scale and 4*size elements for CMYK. JSAMPLE is just another name for unsigned char on my machine at least. The full program can be seen on Github. This is how I fill jsr:
void
floatfilljsarr(JSAMPLE *jsr, float *arr, size_t size, char color)
{
    size_t i;
    double m;
    float min, max;

    /* Find the minimum and maximum of the array:*/
    fminmax(arr, size, &min, &max);
    m=(double)UCHAR_MAX/((double)max-(double)min);

    if(color=='g')
    {
        for(i=0;i<size;i++)
            jsr[i]=(arr[i]-min)*m;
    }
    else
        for(i=0;i<size;i++)
        {
            jsr[i*4+3]=(arr[i]-min)*m;
            jsr[i*4]=jsr[i*4+1]=jsr[i*4+2]=0;
        }
}
I should note that color has been checked before this function to be either c or g.
I then write the JPEG image exactly as the example.c program in libjpeg's source.
Here is the output after printing both images in a TeX document. Grayscale is on the left and CMYK is on the right. Both images are made from the same ordered array, so the bottom left element (the first in the array as I have defined it and displayed it here) has JSAMPLE value 0 and the top right element has JSAMPLE value 255.
Why aren't the two images similar? Because of the different nature of the two color spaces, I would expect the CMYK image to be reversed, with its bottom being bright and its top being black. The JSAMPLE values that drive the display (the only value in grayscale, and the K channel in CMYK) are identical, but this output is not what I expected! The CMYK image does get brighter towards the top, but only very faintly!
It seems that C=0, M=0, Y=0 and K=0 does not correspond to white, at least with this algorithm and on my machine! How can I get white when I only want to change the K channel and keep the rest zero?
Try inverting your K channel:
jsr[i*4+3] = UCHAR_MAX - (arr[i]-min)*m;
I think I found the answer myself. First I tried setting all four colors to the same value. It did produce a reasonable result, but the output was not inverted as I expected: the pixel with the largest value in all four colors was white, not black!
It was then that it suddenly occurred to me that somewhere in the process, either in IJG's libjpeg or in the JPEG standard in general (I have no idea which), CMYK colors are reversed, such that, for example, a Cyan value of 0 is actually interpreted as UCHAR_MAX by the display or printing device, and vice versa. If this is the explanation, it easily accounts for the fact that the image in the question was so dark and that its grey shade was the same as the greyscale image (since I set the three other colors to zero, which was actually interpreted as maximum intensity!).
So I set the first three CMYK colors to the full range (=UCHAR_MAX):
jsr[i*4]=jsr[i*4+1]=jsr[i*4+2]=UCHAR_MAX /* Was 0 before! */;
Then, to my surprise, the image worked. The greyscale (left) shades of grey are darker, but at least generally everything can be explained and is reasonably similar. I have checked separately: the absolute black color is identical in both, but the shades of grey in greyscale are darker for the same pixel value.
After I checked them in print (below), the results seemed to differ less, although the shades of grey in greyscale are still darker! (Image taken with my smartphone.)
Also, until I made this change, in a normal image viewer (I am using Scientific Linux) the image would be completely black; that is why I thought I couldn't see a CMYK image! But after this correction, I could see the CMYK image just like an ordinary image. In fact, using Eye of GNOME (the default image viewer in GNOME), the two appear nearly identical.
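To spell out the convention that emerged here: under the assumption that the JPEG stores CMYK inverted (0 = full ink, UCHAR_MAX = no ink), a gray level maps into a CMYK pixel like this small self-contained helper sketches (JSAMPLE normally comes from jpeglib.h; the typedef here just mirrors what it is on the author's machine):

#include <limits.h>

typedef unsigned char JSAMPLE;   /* as on the author's machine */

/* Fill one CMYK pixel so that 'level' (0 = black, UCHAR_MAX = white)
   renders as the equivalent shade of gray. With the inverted
   convention, "no ink" in C, M and Y is UCHAR_MAX rather than 0,
   and K = level gives full black ink for level 0. */
static void gray_to_inverted_cmyk(JSAMPLE *px, JSAMPLE level)
{
    px[0] = px[1] = px[2] = UCHAR_MAX;  /* C, M, Y: no ink          */
    px[3] = level;                      /* K: the scaled gray value */
}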

Bitmap point processing

I would appreciate some brainstorming help for one of my assignments. I am to write a program that does basic point processing of a .bmp image. The program will open a .bmp file for reading and writing and will not change any part of the header, only the pixel values in the file, according to command-line arguments:
-fromrow x, where x specifies the bottommost row to process
-torow x, where x specifies the topmost row to process
-fromcol x, where x specifies the leftmost column to process
-tocol x, where x specifies the rightmost column to process
-op x, where x is one of the following:
- 1 = threshold the image (any pixel value in the specified range over 127 is changed to 255, and pixel values of 127 or less are changed to 0)
- 2 = negative (any pixel value p in the specified range is changed to 255-p)
To process image data, you will need to make use of the following:
- each pixel value is an unsigned char
- the number of rows in the image is stored as an int at position (byte address) 22 in the file
- the number of columns in the image is stored as an int at position (byte address) 18 in the file
- the position at which the pixel data starts is an int stored at position (byte address) 10 in the file
- pixel information is stored row by row, starting from the bottommost row in the image (row 0) and progressing upwards; within a row, pixel information is stored left to right. Padding is added to the end of each row to make the row length a multiple of 4 bytes (if the row has 479 columns, there is one extra padding byte at the end of the row before the next row starts)
I'm a bit lost as to how to begin, but I figure I should make a struct bitmap first like so?
struct bitmap {
    unsigned int startrow;
    unsigned int endrow;
    unsigned int startcol;
    unsigned int endcol;
};
Can anyone help walk me through what I would need to do for the byte addresses that the assignment references? Any other brainstorming advice would be greatly appreciated as well. Thanks!
You can read raw bytes by opening a file in binary mode:
FILE *fid = fopen("blah.bmp", "rb");
You can then read some amount of data thus:
int num_actually_read = fread(p, sizeof(*p), num_to_read, fid);
where p is a pointer to some buffer. In this case, you probably want p to be of type uint8_t *, because you're dealing with raw bytes mostly.
Alternatively, you can jump around in a file thus:
fseek(fid, pos, SEEK_SET);
I hope this is enough to get you going.
You will need a pointer into the file data at byte addresses 22 and 18. Once you point at those addresses, you will need to dereference the pointer to get the row and column values. Then point your pointer at address 10 to find where the pixel data starts, and traverse the pixels one by one.
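As a rough sketch of how those byte addresses come together (my own assumptions: a little-endian machine where the header fields can be read as 4-byte ints, 8-bit grayscale pixels as the assignment describes, no error checking, and "blah.bmp" borrowed from the answer above as a placeholder):

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    FILE *fid = fopen("blah.bmp", "r+b");   /* read and write, binary */
    if (!fid)
        return 1;

    int32_t width, height, data_offset;

    /* Number of columns: int at byte address 18. */
    fseek(fid, 18, SEEK_SET);
    fread(&width, sizeof width, 1, fid);

    /* Number of rows: int at byte address 22. */
    fseek(fid, 22, SEEK_SET);
    fread(&height, sizeof height, 1, fid);

    /* Start of pixel data: int at byte address 10. */
    fseek(fid, 10, SEEK_SET);
    fread(&data_offset, sizeof data_offset, 1, fid);

    /* Each row is padded to a multiple of 4 bytes. */
    int row_bytes = (width + 3) & ~3;

    printf("%d x %d, pixel data at offset %d, %d bytes per row\n",
           (int)width, (int)height, (int)data_offset, row_bytes);

    /* From here: fseek to data_offset + row * row_bytes + col, fread the
       unsigned char pixel, apply threshold/negative, fseek back, fwrite. */

    fclose(fid);
    return 0;
}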

image scaling with C

I'm trying to read an image file and scale its pixel levels by multiplying each byte by some absolute factor. I'm not sure I'm doing it right, though -
void scale_file(char *infile, char *outfile, float scale)
{
    // open files for reading
    FILE *infile_p = fopen(infile, 'r');
    FILE *outfile_p = fopen(outfile, 'w');

    // init data holders
    char *data;
    char *scaled_data;

    // read each byte, scale and write back
    while ( fread(&data, 1, 1, infile_p) != EOF )
    {
        *scaled_data = (*data) * scale;
        fwrite(&scaled_data, 1, 1, outfile);
    }

    // close files
    fclose(infile_p);
    fclose(outfile_p);
}
What gets me is how to do each byte multiplication (scale is 0-1.0 float) - I'm pretty sure I'm either reading it wrong or missing something big. Also, data is assumed to be unsigned (0-255). Please don't judge my poor code :)
thanks
char *data;
char *scaled_data;
No memory was allocated for these pointers - why do you need them as pointers? unsigned char variables will be just fine (unsigned because it makes more sense for byte data).
Also, what happens when the scale shoots the value out of the 256-range? Do you want saturation, wrapping, or what?
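For what it's worth, here is a minimal corrected sketch of the same loop, under the caveat from the other answers that this only makes sense for a raw, headerless 8-bit format. It uses plain unsigned char variables, string file modes, checks fread's return count rather than comparing to EOF, and saturates at 255 (my own choice for the overflow behaviour asked about above):

#include <stdio.h>

void scale_file(const char *infile, const char *outfile, float scale)
{
    FILE *infile_p  = fopen(infile, "rb");   // modes are strings, not chars
    FILE *outfile_p = fopen(outfile, "wb");
    if (!infile_p || !outfile_p) {
        if (infile_p)  fclose(infile_p);
        if (outfile_p) fclose(outfile_p);
        return;                              // real code should report this
    }

    unsigned char data;                      // a plain byte, not a pointer

    // fread returns the number of items read (0 at end of file), not EOF.
    while (fread(&data, 1, 1, infile_p) == 1)
    {
        float scaled = data * scale;
        if (scaled > 255.0f)                 // saturate instead of wrapping
            scaled = 255.0f;
        unsigned char out = (unsigned char)scaled;
        fwrite(&out, 1, 1, outfile_p);
    }

    fclose(infile_p);
    fclose(outfile_p);
}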
change char *scaled_data; to char scaled_data;
change *scaled_data = (*data) * scale; to scaled_data = (*data) * scale;
That would get you code that would do what you are trying to do, but ....
This could only possibly work on an image file of your own custom format. There is no standard image format that just dumps pixels in bytes in a file in sequential order. Image files need to know more information, like
The height and width of the image
How pixels are represented (1 byte gray, 3 bytes color, etc)
If pixels are represented as an index into a palette, the file also has to store the palette
All kinds of other information (GPS coordinates, the software that created it, the date it was created, etc)
The method of compression used for the pixels
All of this is called metadata
In addition (as alluded to by #5), pixel data is usually compressed.
Your code is equivalent to saying "I want to scale down my image by dividing the bits in half"; it doesn't make any sense.
Images files are complex formats with headers and fields and all sorts of fun stuff that needs to be interpreted. Take nobugz's advice and check out ImageMagick. It's a library for doing exactly the kind of thing you want.
Why do you think you are wrong? I see nothing wrong in your algorithm except for not being efficient, and that char *data; and char *scaled_data; should be unsigned char data; and unsigned char scaled_data;.
My understanding of a bitmap (just the raw data) is that each pixel is represented by three numbers, one each for R, G and B; multiplying each by a number <= 1 would just make the image darker. If you're trying to make the image wider, you could maybe just output each pixel twice (to double the size), or output every other pixel (to halve the size), but that depends on how it's rasterized.
