What is the final step in Sobel Edge Detection?

I've been reading about sobel edge detection, and I'm a little confused as to where the gradient angle comes in.
Assuming a one-channel grayscale image f, after we convolve the Sobel operators g_x and g_y with our original image f, we have two images G_x and G_y. From these we can calculate the gradient magnitude |G| = sqrt(G_x^2 + G_y^2) and the gradient angle theta = arctan(G_y / G_x).
I'm not sure about how to finally calculate the output image here for edge detection. What do we do with |G| and theta?
See Sobel Edge Detection described here
https://www.cs.auckland.ac.nz/compsci373s1c/PatricesLectures/Edge%20detection-Sobel_2up.pdf
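For illustration, here is a rough sketch of those quantities in Python with OpenCV and NumPy (the file name and the 0.25 threshold are arbitrary placeholders). A common final step for plain Sobel edge detection is simply to threshold |G| to get the edge map; theta is mostly used by later stages such as non-maximum suppression in Canny:

import cv2
import numpy as np

# "input.png" stands in for any one-channel grayscale image f.
f = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)

# Convolve with the Sobel operators g_x and g_y to get G_x and G_y.
G_x = cv2.Sobel(f, cv2.CV_32F, 1, 0, ksize=3)
G_y = cv2.Sobel(f, cv2.CV_32F, 0, 1, ksize=3)

magnitude = np.sqrt(G_x**2 + G_y**2)   # |G|
theta = np.arctan2(G_y, G_x)           # gradient angle; arctan2 handles G_x = 0

# One common "final step": threshold the magnitude into a binary edge map.
edges = (magnitude > 0.25 * magnitude.max()).astype(np.uint8) * 255
cv2.imwrite("edges.png", edges)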

Related

Uniform random sampling of CIELUV for RGB colors

Selecting a random color on a computer is a touch harder than I thought it would be.
The naive way of uniform random sampling of 0..255 for R,G,B will tend to draw lots of similar greens. It would make sense to sample from a perceptually uniform space like CIELUV.
A simple way to do this is to sample L,u,v on a regular mesh and ensure the color solid is contained in the bounds (I've seen different bounds for this). If the sample falls outside the embedded RGB solid (tested by mapping it to XYZ and then to RGB), reject it and sample again. You can settle for a kludgy-but-guaranteed-to-terminate "bailout" selection (like the naive procedure) if you reject more than some arbitrary threshold number of times.
The test for whether the sample lies within RGB needs to handle the special case of black (some implementations are silent on the divide by zero), I believe. If L=0 and either u!=0 or v!=0, then the sample needs to be rejected, or else you would end up oversampling the L=0 plane in Luv space.
Does this procedure have an obvious flaw? It seems to work but I did notice that I was rolling black more often than I thought made sense until I thought about what was happening in that case. Can anyone point me to the right bounds on the CIELUV grid to ensure that I am enclosing the RGB solid?
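A rough sketch of this rejection-sampling procedure in Python is below. The Luv-to-XYZ and XYZ-to-linear-sRGB math assumes a D65 white point, the sampling bounds are loose guesses rather than the tight bounds asked about, and gamma encoding of the result is omitted:

import random

# D65 white point in u'v' chromaticity.
UN, VN = 0.19784, 0.46832

def luv_to_linear_srgb(L, u, v):
    # Returns linear sRGB, or None for degenerate/rejectable samples.
    if L <= 0:
        # L = 0 is black only when u = v = 0; anything else gets rejected.
        return (0.0, 0.0, 0.0) if (u == 0 and v == 0) else None
    up = u / (13 * L) + UN
    vp = v / (13 * L) + VN
    if vp <= 0:
        return None                      # avoid dividing by zero below
    Y = ((L + 16) / 116) ** 3 if L > 8 else L / 903.3
    X = Y * 9 * up / (4 * vp)
    Z = Y * (12 - 3 * up - 20 * vp) / (4 * vp)
    # XYZ -> linear sRGB (D65)
    R =  3.2406 * X - 1.5372 * Y - 0.4986 * Z
    G = -0.9689 * X + 1.8758 * Y + 0.0415 * Z
    B =  0.0557 * X - 0.2040 * Y + 1.0570 * Z
    return (R, G, B)

def random_rgb(max_tries=100):
    for _ in range(max_tries):
        # Illustrative bounds; they only need to enclose the sRGB solid.
        L = random.uniform(0, 100)
        u = random.uniform(-90, 180)
        v = random.uniform(-140, 110)
        rgb = luv_to_linear_srgb(L, u, v)
        if rgb is not None and all(0.0 <= c <= 1.0 for c in rgb):
            return rgb
    # Kludgy-but-guaranteed-to-terminate bailout, as described above.
    return (random.random(), random.random(), random.random())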
A useful reference for those who don't know it:
https://www.easyrgb.com/en/math.php
The key problem with this is that you need bounds to reject samples that fall outside of RGB. I was able to find it worked out here (nice demo on page, API provides convenient functions):
https://www.hsluv.org/
A few things I noticed with uniform sampling of CIELUV in RGB:
- most colors are green and purple (this is true independent of RGB bounds)
- you have a hard time sampling what we think of as yellow (very small volume of high lightness, high chroma space)
I implemented various strategies that focus on sampling hues (which is really what we want when we think of "sampling colors") by weighting according to the maximum chromas at that lightness. This makes colors like chromatic light yellows easier to catch and avoids oversampling greens and purples. You can see these methods in action here (select "randomize colors"):
https://www.mysticsymbolic.art/
Source for color randomizers here:
https://github.com/mittimithai/mystic-symbolic/blob/chromacorners/lib/random-colors.ts
Okay, since you don't show the code you use to generate the random numbers and apply them to the CIELUV color space, I'm going to guess that you are creating a random number from 0.0 to 100.0 with a random number generator and just assigning it to L*.
That will most likely give you a lot of black or very dark results.
Let Me Explain
L* of L*u*v* is not linear with respect to light. Y of CIEXYZ is linear with respect to light. L* is perceptual lightness, so a power curve is applied to Y to make it roughly linear to perception, which makes it non-linear with respect to light.
TRY THIS
To get L* with a random value 0—100:
Generate a random number between 0.0 and 1.0
Then apply an exponent of 0.42
Then multiply by 100 to get L*
Lstar = Math.pow(Math.random(), 0.42) * 100;
This takes your random number, which represents light, and applies a power curve that emulates human lightness perception.
UV Color
As for the u and v values, you can probably just leave them as linear random numbers. Constrain u to about -84 and +176, and v to about -132.5 and +107.5
Urnd = (Math.random() - 0.3231) * 260;
Vrnd = (Math.random() - 0.5521) * 240;
Polar Color
It might be interesting to convert u,v to LChLUV or LshLUV.
For hue, it's probably as simple as H = Math.random() * 360
For chroma constrained 0 to 178: C = Math.random() * 178
The next question is, should you find chroma? Or saturation? CIELUV can provide either Hue or Sat — but for directly generating random colors, it seems that chroma is a bit better.
And of course these simple examples do not prevent over-runs, so the color values need to be tested to see whether they are legal sRGB or not. There are a few things that can be done to constrain the generated values to legal colors, but the object here was to get you to a better distribution without excess black/dark results.
Please let me know of any questions.
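As a rough illustration of how the suggestions above might fit together in code (shown in Python; the 0.42 exponent and the 178 chroma cap come from the answer, the rest is an assumption), the polar form generates L*, u, v like this, and the result still has to go through a Luv-to-XYZ-to-sRGB gamut test, such as the one sketched under the question, before being accepted:

import math, random

def random_luv_polar():
    # Perceptually weighted lightness, per the power-curve suggestion above.
    L = (random.random() ** 0.42) * 100
    # Random hue and chroma in polar form, converted back to rectangular u, v.
    C = random.random() * 178
    h = math.radians(random.random() * 360)
    return L, C * math.cos(h), C * math.sin(h)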

Physics in 3D world in OPENGL with C language only

I've been trying to code a 3D game where the player shoots an arrow, and I wanted to work out the equations for the 3D motion. I know the equations for the 2D world, where:
x = v0 * cosθ * t
y = v0 * sinθ * t - 0.5 * g * t^2
But how do I use these equations in my 3D world where I have the Z axis?
Instead of making the arrows follow an explicit curve, I suggest simulating the arrow step by step.
What you need to store is a position (with x,y,z coordinates, starting off at the archer's location) and a velocity (also with x,y,z coordinates, starting off as some constant times the direction the player is looking), and some scene gravity (also with x,y,z coordinates, but it'll probably point straight down).
When the simulation progresses by a timestep of t, add t times the velocity to the position, then add t times gravity to the velocity.
This way, you're free to do more interesting things to the arrow later, like having wind act on it (add t times wind to the velocity), having air resistance act on it (each step, multiply the velocity by a damping factor a bit smaller than 1, with the reduction scaled by t), or redirecting it (change the velocity to something else entirely), without having to recalculate the path of the arrow.
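A minimal sketch of that update loop, shown in Python for brevity (the numbers are placeholders; the same three-component update maps directly onto a C struct with x, y, z floats):

dt = 1.0 / 60.0                     # assumed fixed timestep in seconds
position = [0.0, 1.5, 0.0]          # arrow starts at the archer (x, y, z)
velocity = [24.0, 12.0, 8.0]        # launch speed times the player's look direction
gravity  = [0.0, -9.81, 0.0]        # points straight down

def step():
    for i in range(3):
        position[i] += velocity[i] * dt   # t times velocity moves the arrow
        velocity[i] += gravity[i] * dt    # t times gravity bends the path downward
        # optional extras described above (wind and drag would be extra parameters):
        # velocity[i] += wind[i] * dt
        # velocity[i] *= 1.0 - drag * dt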

Do depth values in AVDepthData (from TrueDepth camera) indicate distance from camera or camera plane?

Do depth values in AVDepthData (from TrueDepth camera) indicate distance in meters from the camera, or perpendicular distance from the plane of the camera (i.e. z-value in camera space)?
My goal is to get an accurate 3D point from the depth data, and this distinction is important for accuracy. I've found lots online regarding OpenGL or Kinect, but not for TrueDepth camera.
FWIW, this is the algorithm I use. I find the value of the depth buffer at a pixel located using some OpenCV feature detection. Below is the code I use to find the real-world 3D point at a given pixel at let cgPt: CGPoint. This algorithm seems to work quite well, but I'm not sure whether a small error is introduced by the assumption that depth is the distance to the camera plane.
let depth = 1/disparity
let vScreen = sceneView.projectPoint(SCNVector3Make(0, 0, -depth))
// cgPt is the 2D coordinates at which I sample the depth
let worldPoint = sceneView.unprojectPoint(SCNVector3Make(cgPt.x, cgPt.y, vScreen.z))
I'm not sure of authoritative info either way, but it's worth noticing that capture in a disparity (not depth) format uses distances based on a pinhole camera model, as explained in the WWDC17 session on depth photography. That session is primarily about disparity-based depth capture with back-facing dual cameras, but a lot of the lessons in it are also valid for the TrueDepth camera.
That is, disparity is 1/depth, where depth is distance from subject to imaging plane along the focal axis (perpendicular to imaging plane). Not, say, distance from subject to the focal point, or straight-line distance to the subject's image on the imaging plane.
IIRC the default formats for TrueDepth camera capture are depth, not disparity (that is, depth map "pixel" values are meters, not 1/meters), but lacking a statement from Apple it's probably safe to assume the model is otherwise the same.
It looks like it measures distance from the camera's plane rather than a straight line from the pinhole. You can test this out by downloading the Streaming Depth Data from the TrueDepth Camera sample code.
Place the phone vertically 10 feet away from the wall, and you should expect to see one of the following:
If it measures from the focal point to the wall as a straight line, you should expect to see a radial pattern (e.g. the point closest to the camera is straight in front of it; the points farthest from the camera are those closer to the floor and ceiling).
If it measures distance from the camera's plane, then you should expect the wall color to be nearly uniform (as long as you're holding the phone parallel to the wall).
After downloading the sample code and trying it out, you will notice that it behaves like #2, meaning it's distance from the camera's plane, not from the camera itself.
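Under that plane-distance (pinhole) interpretation, turning a depth-map sample into a camera-space 3D point is a standard back-projection; here is a rough sketch, where fx, fy, cx, cy are placeholder intrinsics (in a real capture they would come from the calibration data delivered alongside the depth data):

def unproject(u, v, z, fx, fy, cx, cy):
    # u, v: pixel coordinates in the depth map; z: depth in meters to the camera plane.
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return (x, y, z)   # z is used directly because it is already the plane distance

If the values were instead straight-line distances from the camera, you would first have to divide by the length of the unit-depth ray, sqrt(((u-cx)/fx)^2 + ((v-cy)/fy)^2 + 1), to recover z before back-projecting; that is the small error the question is worried about.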

Source engine styled rope rendering

I am creating a 3D graphics engine and one of the requirements is ropes that behave like in Valve's source engine.
So in the Source engine, a section of rope is a quad that rotates along its direction axis to face the camera; if the section of rope is in the +Z direction, it will rotate along the Z axis so its face is facing the camera's centre position.
At the moment, I have the sections of rope defined, so I can have a nice curved rope, but now I'm trying to construct the matrix that will rotate it along its direction vector.
I already have a matrix for rendering billboard sprites based on this billboarding technique:
Constructing a Billboard Matrix
And at the moment I've been trying to retool it so that the Right, Up, and Forward vectors match the rope segment's direction vector.
My rope is made up of multiple sections, each of which is a rectangle made up of two triangles. As I said above, I can get the positions and sections perfect; it's the rotation to face the camera that's causing me a lot of problems.
This is in OpenGL ES2 and written in C.
I have studied Doom 3's beam rendering code in Model_beam.cpp; the method used there is to calculate the offset based on normals rather than using matrices, so I have created a similar technique in my C code and it sort of works, at least as much as I need it to right now.
So for those who are also trying to figure this one out: take the cross product of the rope segment's direction with the vector from the segment's mid-point to the camera position, normalise that, and then multiply it by how wide you want the rope to be. Then, when constructing the vertices, offset each vertex in either the + or - direction of the resulting vector.
Further help would be great though as this is not perfect!
Thank you
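For reference, here is a rough sketch of that offset calculation in Python with NumPy for the vector math (billboard_offset and half_width are made-up names; p0 and p1 are the segment end points as arrays):

import numpy as np

def billboard_offset(p0, p1, camera_pos, half_width):
    # Vector along the rope segment and from its mid-point to the camera.
    direction = p1 - p0
    to_camera = camera_pos - (p0 + p1) * 0.5
    # Perpendicular to both, i.e. the sideways direction of the camera-facing quad.
    side = np.cross(direction, to_camera)
    side = side / np.linalg.norm(side)   # degenerate if the segment points straight at the camera
    return side * half_width

# The quad's four vertices are then p0 + offset, p0 - offset, p1 + offset, p1 - offset.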
Check out this related Stack Overflow post on billboards in OpenGL. It cites a lighthouse3d tutorial that is a pretty good read. Here are the salient points of the technique:
void billboardCylindricalBegin(
        float camX, float camY, float camZ,
        float objPosX, float objPosY, float objPosZ) {

    float lookAt[3], objToCamProj[3], upAux[3];
    float modelview[16], angleCosine;

    glPushMatrix();

    // objToCamProj is the vector in world coordinates from the
    // local origin to the camera projected in the XZ plane
    objToCamProj[0] = camX - objPosX;
    objToCamProj[1] = 0;
    objToCamProj[2] = camZ - objPosZ;

    // This is the original lookAt vector for the object
    // in world coordinates
    lookAt[0] = 0;
    lookAt[1] = 0;
    lookAt[2] = 1;

    // normalize both vectors to get the cosine directly afterwards
    mathsNormalize(objToCamProj);

    // easy fix to determine whether the angle is negative or positive:
    // for positive angles upAux will be a vector pointing in the
    // positive y direction, otherwise upAux will point downwards,
    // effectively reversing the rotation.
    mathsCrossProduct(upAux, lookAt, objToCamProj);

    // compute the angle
    angleCosine = mathsInnerProduct(lookAt, objToCamProj);

    // perform the rotation. The if statement is used for stability reasons:
    // if the lookAt and objToCamProj vectors are too close together then
    // |angleCosine| could be bigger than 1 due to lack of precision
    if ((angleCosine < 0.99990) && (angleCosine > -0.9999))
        glRotatef(acos(angleCosine) * 180 / 3.14, upAux[0], upAux[1], upAux[2]);
}

How to recognize money bills in images?

I have some images of euro money bills. The bills are completely within the image
and are mostly flat (e.g. little deformation), and the perspective skew is small (e.g. the image is taken roughly from directly above the bill).
Now I'm no expert in image recognition. I'd like to achieve the following:
Find the bounding box for the money bill (so I can "cut out" the bill from the noise in the rest of the image).
Figure out the orientation.
I think of these two steps as pre-processing, but maybe one can do the following steps without the above two. So with that I want to read:
The bill's serial number.
The bill's face value.
I assume this should be quite possible to do with OpenCV. I'm just not sure how to approach it right. Would I pick a FaceDetector-like approach, a Hough transform, or a contour detector on top of an edge detector?
I'd be thankful for any further hints for reading material as well.
Hough is great, but it can be a little expensive.
This may work:
-Use Threshold or Canny to find the edges of the image.
-Then cvFindContours to identify the contours, then try to detect rectangles.
Check the squares.c example in the OpenCV distribution. It basically checks that the polygon approximation of a contour has 4 points and that the angles between those points are close to 90 degrees.
Here is a code snippet from the squares.py example
(it's the same, but in Python :P ).
# ...some pre-processing
cvThreshold( tgray, gray, (l+1)*255/N, 255, CV_THRESH_BINARY )

# find contours and store them all as a list
count, contours = cvFindContours(gray, storage)
if not contours:
    continue

# test each contour
for contour in contours.hrange():
    # approximate contour with accuracy proportional
    # to the contour perimeter
    result = cvApproxPoly( contour, sizeof(CvContour), storage,
                           CV_POLY_APPROX_DP, cvContourPerimeter(contour)*0.02, 0 )
    res_arr = result.asarray(CvPoint)
    # square contours should have 4 vertices after approximation,
    # relatively large area (to filter out noisy contours)
    # and be convex.
    # Note: absolute value of an area is used because
    # area may be positive or negative - in accordance with the
    # contour orientation
    if( result.total == 4 and
            abs(cvContourArea(result)) > 1000 and
            cvCheckContourConvexity(result) ):
        s = 0
        for i in range(4):
            # find minimum angle between joint
            # edges (maximum of cosine)
            t = abs(angle( res_arr[i], res_arr[i-2], res_arr[i-1]))
            if s < t:
                s = t
        # if cosines of all angles are small
        # (all angles are ~90 degrees) then write quadrangle
        # vertices to resultant sequence
        if( s < 0.3 ):
            for i in range(4):
                squares.append( res_arr[i] )
-Using MinAreaRect2 (finds the circumscribed rectangle of minimal area for a given 2D point set), get the bounding box of the rectangles. Using the bounding box points you can easily calculate the angle.
You can also find the C version, squares.c, under samples/c/ in your OpenCV dir.
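For comparison, roughly the same idea with the modern cv2 bindings (OpenCV 4.x return values; the file name, Canny thresholds, and 1000-pixel area cutoff are arbitrary, and this assumes the bill yields the largest four-sided contour in the image):

import cv2

img = cv2.imread("bill.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 50, 150)

contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
best = None
for c in contours:
    approx = cv2.approxPolyDP(c, 0.02 * cv2.arcLength(c, True), True)
    if len(approx) == 4 and cv2.isContourConvex(approx) and cv2.contourArea(approx) > 1000:
        if best is None or cv2.contourArea(approx) > cv2.contourArea(best):
            best = approx

if best is not None:
    (cx, cy), (w, h), angle = cv2.minAreaRect(best)   # rotation angle of the bill
    x, y, bw, bh = cv2.boundingRect(best)             # axis-aligned box to "cut out" the bill
    crop = img[y:y + bh, x:x + bw]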
There is a good book on OpenCV.
Using a Hough transform to find the rectangular bill shape (and angle) and then finding rectangles/circles within it should be quick and easy.
For more complex searching, something like a Haar classifier could be used, e.g. if you needed to find odd corners of bills in an image.
You can also take a look at the Template Matching methods in OpenCV; another option would be to use SURF features, which let you search for symbols and numbers invariantly to size, angle, etc.
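As a quick illustration of the template-matching option for the face value or serial-number digits (the file names, the 0.8 score threshold, and TM_CCOEFF_NORMED are all assumptions; if size and rotation vary much, feature-based matching such as SURF is the better fit):

import cv2
import numpy as np

scene = cv2.imread("bill_crop.png", cv2.IMREAD_GRAYSCALE)          # cropped, de-rotated bill
template = cv2.imread("digit_template.png", cv2.IMREAD_GRAYSCALE)  # hypothetical digit/symbol template

scores = cv2.matchTemplate(scene, template, cv2.TM_CCOEFF_NORMED)
h, w = template.shape
for y, x in zip(*np.where(scores >= 0.8)):   # every location scoring above the threshold
    cv2.rectangle(scene, (int(x), int(y)), (int(x) + w, int(y) + h), 255, 1)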
