How does the UYVY (YUV 4:2:2, Y422, UYNV, HDYC) colour system handle odd pixel counts?

According to a reference that I've been reading, some packed YUV formats (e.g. UYVY) use macropixels which contain data for multiple pixels - specifically, in the case of UYVY, one luma (Y) value per pixel and shared U and V samples for every other horizontal pixel.
What I don't see described is what should happen when the frame dimensions are not divisible by 2. For example, if a frame's width in pixels is odd, should the last macropixel on each line wrap onto the next line, or should its second Y value simply be ignored during decoding? Is there a standard for what that Y value should be set to (e.g. zero)?
If the macropixels do wrap, then what happens with the final macropixel in frames whose dimensions are both odd, such as 51x51?

I asked about this on #ffmpeg on Freenode IRC, and a kind person named iive gave me some answers.
Each line is treated separately, so there's no wrapping of a macropixel's values from one line to the next. In the case of an odd frame width, the Y value from the last pixel is duplicated. So, if you've got a pixel with YUV values of [123, 45, 67] at the end of a line, the final UYVY macropixel (byte order U, Y0, V, Y1) would have values of [45, 123, 67, 123].
There may also be padding at the end of each line's data, in order to align each frame line to a boundary so that SIMD instructions only need to operate on aligned data. This depends on the exact format you're using.
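As an illustration, here is a minimal C sketch of that behaviour; the helper name pack_uyvy_row and the full-resolution (4:4:4) input layout are my own assumptions, not part of any UYVY specification:

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: pack one row of full-resolution YUV samples
 * into UYVY, duplicating the last Y when the width is odd and
 * zero-filling any padding up to the row stride. */
static void pack_uyvy_row(const uint8_t *y, const uint8_t *u, const uint8_t *v,
                          int width, uint8_t *out, size_t stride)
{
    size_t o = 0;
    for (int x = 0; x < width; x += 2) {
        int x1 = (x + 1 < width) ? x + 1 : x;  /* duplicate final Y on odd width */
        out[o++] = u[x];   /* U, shared by the pixel pair */
        out[o++] = y[x];   /* Y0 */
        out[o++] = v[x];   /* V, shared by the pixel pair */
        out[o++] = y[x1];  /* Y1 (equals Y0 for the odd tail pixel) */
    }
    memset(out + o, 0, stride - o);  /* alignment padding, if the format has any */
}

int main(void)
{
    const uint8_t y[3] = {10, 20, 123}, u[3] = {1, 2, 45}, v[3] = {3, 4, 67};
    uint8_t row[8];
    pack_uyvy_row(y, u, v, 3, row, sizeof row);
    for (size_t i = 0; i < sizeof row; i++)
        printf("%d ", row[i]);   /* 1 10 3 20 45 123 67 123 */
    printf("\n");
    return 0;
}

For a 51-pixel-wide line this produces 26 macropixels (104 bytes); a consumer requiring, say, 16-byte row alignment would then pad each line to 112 bytes.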

Related

Convert 4 uint8 (byte) to float32 in GLSL

I have a vertex buffer object containing vertex data for a model. However, the layout is a bit weird. The vertex uses 4 x uint_8 for the position and 4 x int_8 for the normal data. The texture position data is appended at the end, with 4 x uint_8 representing a float value, which I can access with an offset. Using 8 bytes gives me 2 float values that I can use in a vec2 for texture coordinates.
The layout is basically [ [4 x uint_8 (vertex pos)] | [ 4 x int_8 (vertex_normal) ] | ... (alternating pos and norm) | [ 4x uint_8 ] (byte data for float value)].
In my hit shader I read the buffer as an array of int_8 and I am able to read the vertex data without problems. However I can't seem to find a way to construct a float value out of the 4 bytes used to represent it.
I can of course change the structure of the data, but I have legacy code that relies on this structure, and changing it would break the rest of the program. I could also create a new vertex buffer, but since I already have the data and can read it without problems, that would only take up more space and be redundant in my opinion.
There is probably a way to define the structure beforehand, so that the buffer information has the right format in the shader. I know that you can set a format for the vertex input in a pipeline, but since this is a ray-tracing pipeline, I possibly cannot use this feature. But maybe I am wrong.
So the final question is: Is it possible to construct a float value out of 4 uint_8 values in a glsl shader, or should I consider changing the vertex buffer? Or is there maybe another way to define the data?
I have found a solution that works for me.
Basically I use two layouts with the same set and binding: for the positions and normals the buffer is read as an array of int8 values, while in the second layout the same buffer is read as an array of vec2s. Since the second layout reads the original byte data, it packs the bytes into a vec2 correctly.
So for example the byte data
[31, 133, 27, 63, 84, 224, 75, 63]
would give me a vec2 of
(0.6075, 0.7964)
which is what I wanted.
Of course this solution is not perfect, but for now it is enough. If you know any prettier solutions, feel free to share them!
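For reference, the aliasing trick performs the same bit reinterpretation as this little C sketch (assuming little-endian byte order, which is what the example bytes imply):

#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    /* The eight bytes from the example, in buffer order. */
    const uint8_t bytes[8] = {31, 133, 27, 63, 84, 224, 75, 63};
    float uv[2];

    /* Reinterpret the raw bytes as two IEEE-754 floats; memcpy
     * avoids strict-aliasing violations. Assumes the host is
     * little-endian like the GPU buffer. */
    memcpy(uv, bytes, sizeof uv);
    printf("(%.4f, %.4f)\n", uv[0], uv[1]);   /* (0.6075, 0.7964) */
    return 0;
}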

Saving grayscale in CMYK using libjpeg in C

If this function does what I think it does, it seems that, on my machine at least, C=0, M=0, Y=0 and K=0 in CMYK does not correspond to white! What is the problem?
float *arr is a float array with size elements. I want to save this array as a JPEG with IJG's libjpeg in one of two color spaces on demand: g (grayscale) or c (CMYK). I follow their example and build the input JSAMPLE *jsr array with a number of JSAMPLE elements that depends on the color space: size elements for grayscale and 4*size elements for CMYK. JSAMPLE is just another name for unsigned char, on my machine at least. The full program can be seen on GitHub. This is how I fill jsr:
void
floatfilljsarr(JSAMPLE *jsr, float *arr, size_t size, char color)
{
    size_t i;
    double m;
    float min, max;

    /* Find the minimum and maximum of the array: */
    fminmax(arr, size, &min, &max);

    /* Scale factor mapping [min, max] onto [0, UCHAR_MAX]: */
    m = (double)UCHAR_MAX / ((double)max - (double)min);

    if (color == 'g')
    {
        /* Grayscale: one JSAMPLE per pixel. */
        for (i = 0; i < size; i++)
            jsr[i] = (arr[i] - min) * m;
    }
    else
    {
        /* CMYK: four JSAMPLEs per pixel; the data goes in K only. */
        for (i = 0; i < size; i++)
        {
            jsr[i*4+3] = (arr[i] - min) * m;
            jsr[i*4] = jsr[i*4+1] = jsr[i*4+2] = 0;
        }
    }
}
I should note that color has been checked before this function to be either c or g.
I then write the JPEG image exactly as the example.c program in libjpeg's source.
Here is the output after printing both images in a TeX document. Grayscale is on the left and CMYK is on the right. Both images are made from the same ordered array, so the bottom left element (the first in the array as I have defined it and displayed it here) has JSAMPLE value 0 and the top right element has JSAMPLE value 255.
Why aren't the two images similar? Since CMYK is a subtractive color model, I would expect the CMYK image to be reversed, with its bottom bright and its top black. Their displayed JSAMPLE values (the only value in grayscale and the K channel in CMYK) are identical, but this output is not what I expected! The CMYK image does get brighter towards the top, but only very faintly!
It seems that C=0, M=0, Y=0 and K=0 does not correspond to white, at least with this algorithm and on my machine! How can I get white when I only want to change the K channel and keep the rest zero?
Try inverting your K channel:
jsr[i*4+3] = UCHAR_MAX - (arr[i]-min)*m;
I think I found the answer myself. First I tried setting all four colors to the same value. It produced a reasonable result, but the output was not inverted as I expected: the pixel with the largest value in all four colors came out white, not black!
It then suddenly occurred to me that somewhere in the process, either in IJG's libjpeg or in the JPEG standard in general (I have no idea which), CMYK colors are stored inverted, so that, for example, a Cyan value of 0 is actually interpreted as UCHAR_MAX on the display or printing device, and vice versa. If this was the explanation, then the fact that the image in the question was so dark, and that its grey shade was the same as the greyscale image, could easily be explained (since I set all three other colors to zero, which was actually interpreted as maximum intensity!).
So I set the first three CMYK colors to the full range (=UCHAR_MAX):
jsr[i*4]=jsr[i*4+1]=jsr[i*4+2]=UCHAR_MAX /* Was 0 before! */;
Then, to my surprise, the image worked. The shades of grey in the greyscale image (left) are darker, but at least everything can generally be explained and the results are reasonably similar. I have checked separately: absolute black is identical in both, but the shades of grey in greyscale are darker for the same pixel value.
After I checked them in print (below), the results seemed to differ less, although the shades of grey in greyscale are darker! (Image taken with my smartphone.)
Also, until I made this change, in a normal image viewer (I am using Scientific Linux) the image would appear completely black; that is why I thought I couldn't see a CMYK image! But after this correction I could see the CMYK image just like an ordinary image. In fact, using Eye of GNOME (the default image viewer in GNOME), the two appear nearly identical.
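To summarize, the corrected CMYK branch of floatfilljsarr becomes this sketch (assuming, as deduced above, that the CMYK values are stored inverted, so UCHAR_MAX means "no ink"):

/* Corrected CMYK branch: under the inverted-CMYK convention,
 * UCHAR_MAX means "no ink", so C, M and Y are set to full range
 * and the K channel carries the (likewise inverted) image data. */
for (i = 0; i < size; i++)
{
    jsr[i*4] = jsr[i*4+1] = jsr[i*4+2] = UCHAR_MAX;  /* no C, M, Y ink */
    jsr[i*4+3] = (arr[i] - min) * m;                 /* K channel      */
}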

Determining if rectangles overlap; if so, return the area of the overlapping rectangle compared with a given set

I am a beginner programmer working through elementary exercises to grasp the core concepts of C. I have a test case and don't really know where to begin structuring it for compiling with GCC. I have a basic theory and a pseudocode summary, but I mainly need a little help stepping forward.
I have googled related questions and permutations of this question, and have been unable to make heads or tails of the logic in C.
Given the following logic:
Using the C language, have the function OverlappingRectangles(strArr)
read the strArr parameter being passed which will represent two
rectangles on a Cartesian coordinate plane and will contain 8
coordinates with the first 4 making up rectangle 1 and the last 4
making up rectangle 2. It will be in the following format:
"(0,0),(2,2),(2,0),(0,2),(1,0),(1,2),(6,0),(6,2)." Your program should
determine the area of the space where the two rectangles overlap, and
then output the number of times this overlapping region can fit into
the first rectangle. For the above example, the overlapping region
makes up a rectangle of area 2, and the first rectangle (the first 4
coordinates) makes up a rectangle of area 4, so your program should
output 2. The coordinates will all be integers. If there's no overlap
between the two rectangles return 0.
I'm lost.
Should have added this at first:
Given a string(n1,n2,n3,n4,m1,m2,m3,m4)
Split string into string1(n1,n2,n3,n4) string2(m1,m2,m3,m4)
If n1+n4 < m1 or n2+n3 < m2 or m1+m4 < n1 or m2+m3 < m1
Calculate area of intersecting rectangle and divide into area of first rectangle.
Else
Print 0
You have a string of the form:
(x1,y1)(x2,y2)(x2,y1)(x1,y2)(x3,y3)(x4,y4)(x4,y3)(x3,y4)
defining 2 rectangles:
r1 = (x1,y1) to (x2,y2)
r2 = (x3,y3) to (x4,y4)
You need to first:
define a representation (structure) for the rectangles
parse (read) the string to extract the numbers for x1-x4 and y1-y4 -- look at e.g. sscanf and its return value for doing this
You can create a helper function, e.g.:
const char *parse_rectangle(const char *str, rectangle *r);
That will read a rectangle r from str in the form (x1,y1)(x2,y2)(x2,y1)(x1,y2) (including any validation) and return a pointer to the next character.
Now, you will have two rectangles.
You can then compute the intersection of these rectangles as a third rectangle, e.g.:
int intersection(const rectangle *r1, const rectangle *r2, rectangle *result);
which will return 1 and fill result with the intersection if the rectangles intersect, or 0 if they don't. If you are using C99, you can use _Bool instead.
Now, you need a function to compute the area, e.g.:
int area(const rectangle *r);
You can call it on both the intersection rectangle and the first rectangle to get the areas of both.
Now, you simply divide the first rectangle area by the intersected rectangle area and print the result.
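Putting those pieces together, a minimal sketch might look like this (the normalization inside parse_rectangle and the exact sscanf format string are my assumptions about the input, since the corners are not guaranteed to arrive in a fixed order):

#include <stdio.h>

/* Axis-aligned rectangle, normalized so x1 <= x2 and y1 <= y2. */
typedef struct { int x1, y1, x2, y2; } rectangle;

static int imin(int a, int b) { return a < b ? a : b; }
static int imax(int a, int b) { return a > b ? a : b; }

/* Read four "(x,y)" pairs from str into a normalized rectangle.
 * Returns a pointer past the parsed text, or NULL on bad input. */
static const char *parse_rectangle(const char *str, rectangle *r)
{
    int x[4], y[4], n;
    if (sscanf(str, " (%d,%d),(%d,%d),(%d,%d),(%d,%d)%n",
               &x[0], &y[0], &x[1], &y[1],
               &x[2], &y[2], &x[3], &y[3], &n) != 8)
        return NULL;
    r->x1 = imin(imin(x[0], x[1]), imin(x[2], x[3]));
    r->x2 = imax(imax(x[0], x[1]), imax(x[2], x[3]));
    r->y1 = imin(imin(y[0], y[1]), imin(y[2], y[3]));
    r->y2 = imax(imax(y[0], y[1]), imax(y[2], y[3]));
    return str + n;
}

/* Returns 1 and fills *result if r1 and r2 overlap, 0 otherwise. */
static int intersection(const rectangle *r1, const rectangle *r2,
                        rectangle *result)
{
    result->x1 = imax(r1->x1, r2->x1);
    result->y1 = imax(r1->y1, r2->y1);
    result->x2 = imin(r1->x2, r2->x2);
    result->y2 = imin(r1->y2, r2->y2);
    return result->x1 < result->x2 && result->y1 < result->y2;
}

static int area(const rectangle *r)
{
    return (r->x2 - r->x1) * (r->y2 - r->y1);
}

int main(void)
{
    const char *input = "(0,0),(2,2),(2,0),(0,2),(1,0),(1,2),(6,0),(6,2)";
    rectangle r1, r2, overlap;
    const char *p = parse_rectangle(input, &r1);
    if (p == NULL)
        return 1;
    if (*p == ',')                   /* comma between the two rectangles */
        p++;
    if (parse_rectangle(p, &r2) == NULL)
        return 1;
    if (intersection(&r1, &r2, &overlap))
        printf("%d\n", area(&r1) / area(&overlap));   /* prints 2 */
    else
        printf("0\n");
    return 0;
}

For the example input, the overlap is the rectangle (1,0) to (2,2) with area 2, rectangle 1 has area 4, and the program prints 2, as the problem statement requires.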

HSV color space in OpenCV

I want to detect a yellow object through my system's camera using OpenCV. I got some help from the tutorial Object Recognition in OpenCV, but I am not clear about what this line of code does. Please elaborate on the line below, which I am using.
cvInRangeS(imgHSV, cvScalar(20, 100, 100), cvScalar(30, 255, 255), imgThreshed);
other part of program:
CvMoments *moments = (CvMoments*)malloc(sizeof(CvMoments));
cvMoments(imgYellowThresh, moments, 1);
// The actual moment values
double moment10 = cvGetSpatialMoment(moments, 1, 0);
double moment01 = cvGetSpatialMoment(moments, 0, 1);
double area = cvGetCentralMoment(moments, 0, 0);
What about reading the documentation?
inRange:
Checks if array elements lie between the elements of two other arrays.
And actually that article contains a clear explanation:
And the two cvScalars represent the lower and upper bound of values
that are yellowish in colour.
About the second piece of code: from those calculations the author finds the center of the object and its area. Quote from the article:
You first allocate memory to the moments structure, and then you
calculate the various moments. And then using the moments structure,
you calculate the two first order moments (moment10 and moment01) and
the zeroth order moment (area).
Dividing moment10 by area gives the X coordinate of the yellow ball,
and similarly, dividing moment01 by area gives the Y coordinate.
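Concretely, continuing the question's snippet, the centroid falls out of those three values like this (a sketch; the zero-area guard is my addition for frames with no yellow pixels):

// Sketch: centroid of the thresholded blob, continuing the code above.
if (area > 0)
{
    double posX = moment10 / area;  // X coordinate of the yellow object
    double posY = moment01 / area;  // Y coordinate of the yellow object
    printf("Object at (%.1f, %.1f)\n", posX, posY);
}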

cvTranspose Gives a Different Image Size?

I am new to OpenCV, and I want to transpose a grayscale image but I am getting the wrong output size.
// img is an unsigned char image
IplImage *img = cvLoadImage("image.jpg"); // where image is of width=668 height=493
int width = img->width;
int height = img->height;
I want to transpose it:
IplImage *imgT = cvCreateImage(cvSize(height,width),img->depth,img->nChannels);
cvTranspose(img,imgT);
When I check the images, I see that the original image img has a size of 329324, which is correct: 493*668*1 byte, as it is unsigned char. However, imgT has a size of 331328.
I am not really sure where this difference comes from.
EDIT: 1- I am using Windows XP and OpenCV 2.2.
2- By "when I check the image", I meant when I inspect the values of the variable imgT, such as imgT->width, imgT->height, imgT->size, etc.
This is due to the fact that OpenCV aligns the rows of images at 4-byte boundaries. In the first image a row is 668 bytes wide, which is divisible by 4, so the image elements are contiguous.
The second image has a width of 493 (due to the transposing), which is not divisible by 4. The next higher number divisible by 4 is 496, so the rows are actually 496 bytes wide, with 3 unused bytes at the end of each row to align the rows at 4-byte boundaries. And in fact 496*668 is indeed 331328. So you should always be aware that your image elements need not be contiguous across the whole buffer (they are only guaranteed contiguous within a single row).
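You can verify the arithmetic with a small sketch; the (x + 3) & ~3 expression is the usual round-up-to-a-multiple-of-4 idiom, not anything OpenCV-specific:

#include <stdio.h>

int main(void)
{
    /* Dimensions of the transposed image. */
    int width = 493, height = 668, nChannels = 1;
    int row_bytes = width * nChannels;     /* 493 bytes of pixel data */
    int stride = (row_bytes + 3) & ~3;     /* round up to 4 -> 496    */
    printf("stride = %d, imageSize = %d\n", stride, stride * height);
    /* prints: stride = 496, imageSize = 331328 */
    return 0;
}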
You can store your image in a cv::Mat and use Mat.t() to transpose it. Rows and columns will be allocated/deallocated automatically, so you don't have to worry about the size.
