Looking to see if anyone can recommend a computationally efficient method for translating/shifting an image by (x,y) pixels.
Reason being, I have been partly successful in implementing the Fourier-Mellin transform to determine the rotation and translation between image frames. Once the image is unrotated, I would like to untranslate it by the calculated pixel offset (x, y), allowing me to test the image correlation after rotation and translation.
I would think that an efficient method would be to:
Make a border with cv::copyMakeBorder().
Use an ROI, i.e. make a new matrix header without copying data (a sketch follows below).
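A minimal sketch of that border-plus-ROI idea, assuming C++/OpenCV and zero padding for the newly exposed pixels (the function name shiftImage is my own):

#include <algorithm>
#include <opencv2/opencv.hpp>

// Shift src by (dx, dy): positive dx moves content right, positive dy down.
// Only copyMakeBorder() copies pixels; the ROI is just a new Mat header.
cv::Mat shiftImage(const cv::Mat& src, int dx, int dy)
{
    cv::Mat padded;
    cv::copyMakeBorder(src, padded,
                       std::max(dy, 0), std::max(-dy, 0),  // top, bottom
                       std::max(dx, 0), std::max(-dx, 0),  // left, right
                       cv::BORDER_CONSTANT, cv::Scalar::all(0));
    // Same-sized ROI at the offset position -- no pixel data is copied here.
    return padded(cv::Rect(std::max(-dx, 0), std::max(-dy, 0),
                           src.cols, src.rows));
}

The returned Mat is a view into the padded buffer, which stays alive through OpenCV's reference counting.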
Good luck
In Python code I see that images given to MobileNet are 224x224, while the TensorFlow.js version seems to work with any size or aspect ratio. For non-square images, does it stretch them, or add white or transparent pixels to produce a square input with the image's aspect ratio maintained? If it does stretch them to become square, should one do some image manipulation before using model.classify?
https://github.com/tensorflow/tfjs-models/tree/master/mobilenet#making-a-classification doesn't say anything about this.
There is no requirement for images to be square; using non-square images will achieve the same result. The reason some neural networks such as MobileNet use square inputs may be operations such as convolution, where the kernel is most often chosen to be square.
To use MobileNet for classification, the image needs to be resized to a shape of [224, 224, 3], which is the input size of the network. Methods such as resizeBilinear, resizeNearestNeighbor, ... will serve that very purpose. Obviously, transforming a non-square image into a square one will distort it, but those algorithms use anti-aliasing to make up for the distortion.
But the distortion of the input image is the least thing one needs to be concerned with. A good model's predictions should be invariant to such distortion, because the training data were heavily distorted and augmented with noise so that the model generalizes well.
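For illustration only, here is that resize step written in C++/OpenCV (the language used elsewhere on this page) rather than TensorFlow.js; cv::INTER_LINEAR plays the role of resizeBilinear, and the file name is a placeholder:

#include <opencv2/opencv.hpp>

int main()
{
    cv::Mat img = cv::imread("photo.jpg");  // any size, e.g. 640x480
    cv::Mat input;
    // Bilinear resize straight to the 224x224 network input;
    // the aspect ratio is not preserved, so the image is stretched.
    cv::resize(img, input, cv::Size(224, 224), 0, 0, cv::INTER_LINEAR);
    return 0;
}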
Similar to calibrating a single camera 2D image with a chessboard, I wish to determine the width/height of the chessboard (or of a single square) in pixels.
I have a camera aimed vertically at the ground, ensured to be perfectly level with the surface below. I am using the camera to determine the translation between consecutive frames (successfully achieved using Fourier phase correlation). At the moment my result returns the translation in pixels; however, I would like to use techniques similar to calibration, where I move the camera over the chessboard (which is flat on the ground) to automatically determine the size of the chessboard in pixels, relative to my image height and width.
Knowing the size of the chessboard in millimetres, I can then convert a pixel unit to a real-world unit in millimetres, i.e. 1 pixel will represent a distance proportional to the height of the camera above the ground. This will allow me to convert a translation in pixels to a translation in millimetres, recalibrating every time I change the height of the camera.
What would be the recommended way of achieving this? Surely it must be simpler than single camera 2D calibration.
OpenCV can give you the position of the chessboard's corners with cv::findChessboardCorners().
I'm not sure if the perspective distortion will affect your calculations, but if the chessboard is perfectly aligned beneath the camera, it should work.
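A sketch of that approach, using the average spacing between horizontally adjacent inner corners as the square size in pixels (the 9x6 pattern size and the file name are placeholders):

#include <cmath>
#include <iostream>
#include <vector>
#include <opencv2/opencv.hpp>

int main()
{
    cv::Mat img = cv::imread("frame.png", cv::IMREAD_GRAYSCALE);
    cv::Size pattern(9, 6);  // inner corners per row, per column
    std::vector<cv::Point2f> corners;
    if (!cv::findChessboardCorners(img, pattern, corners))
        return 1;  // board not found

    // Average the distance between horizontally adjacent inner corners.
    double sum = 0.0;
    int n = 0;
    for (int r = 0; r < pattern.height; ++r)
        for (int c = 1; c < pattern.width; ++c) {
            cv::Point2f d = corners[r * pattern.width + c]
                          - corners[r * pattern.width + c - 1];
            sum += std::sqrt(d.x * d.x + d.y * d.y);
            ++n;
        }
    double pxPerSquare = sum / n;
    std::cout << "one square ~ " << pxPerSquare << " px\n";
    // With a known square size, mmPerPixel = squareSizeMM / pxPerSquare.
    return 0;
}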
This is just an idea, so don't hit me... but maybe use the natural contrast of the chessboard?
"At some point it will switch from bright to dark pixels, and that should happen (can't remember number of columns on chessboard) times" should be a doable algorithm; a rough sketch follows.
I have a general question: is it possible to do matrix calculations on a rectangle? I have a CvRect that stores coordinates and a cvMat that holds transformation data. What I would like to know is whether there is a way to apply the matrix to the Rect to generate a rotated, skewed, and repositioned rectangle from it. I've searched online, but was only able to find information on image transforms.
Thanks in advance for the help.
No, this is not possible with cv::Rect, as it only describes axis-aligned rectangles (a "Manhattan world"). There is cv::RotatedRect, but that does not handle skewing either.
You can, however, feed the corner points of your rectangle to cv::transform:
http://opencv.itseez.com/modules/core/doc/operations_on_arrays.html?highlight=transform#cv2.transform
You will then obtain the four transformed points. Note that there are also related functions for transforming whole images rather than points, e.g. warpPerspective() and warpAffine().
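A sketch of that, using the C++ cv::Rect and an example affine matrix (a 30-degree rotation, chosen arbitrarily):

#include <iostream>
#include <vector>
#include <opencv2/opencv.hpp>

int main()
{
    cv::Rect rect(10, 20, 100, 50);
    std::vector<cv::Point2f> corners = {
        {(float)rect.x,                (float)rect.y},
        {(float)(rect.x + rect.width), (float)rect.y},
        {(float)(rect.x + rect.width), (float)(rect.y + rect.height)},
        {(float)rect.x,                (float)(rect.y + rect.height)}
    };

    // Any 2x3 affine matrix works here; this one rotates about the origin.
    cv::Mat M = cv::getRotationMatrix2D(cv::Point2f(0.f, 0.f), 30.0, 1.0);

    std::vector<cv::Point2f> out;
    cv::transform(corners, out, M);  // applies M to every corner point

    for (const auto& p : out)
        std::cout << p << std::endl;
    return 0;
}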
Using libjpeg, if possible, I would like to read a row from the middle of a JPEG image without reading all the preceding rows. Can this be done?
The answer is almost certainly "yes you can, but it will take more effort than you want".
A JPEG image is a stream of markers that contain either information global to the whole compressed image, or information related to specific portions of the image. The compression works by breaking the image into color planes, possibly changing color spaces to one where the color information can be down-sampled, and within each plane operating on 8x8 pixel blocks.
For instance, a compressed image whose dimensions are a whole number of blocks can be rotated by 90 degrees purely by transposing the blocks and the coefficients inside each block, i.e. without uncompressing, rotating the raw image, and recompressing.
Given that, your approach would be to parse the marker stream on the way into the library, passing through all the markers that are global to the image, modifying any that relate to image size, and dropping the markers containing coefficients that lie outside your cropping rectangle.
You will likely need to further crop the result if the restriction of cropping to complete basic blocks is too coarse.
What isn't clear to me is whether there is any real win over the alternative, which is to crop the result as it comes out of the library. The library is highly configurable, so you can provide an uncompressed-data consumer function that discards all pixels outside your cropping rectangle and only saves the pixels you want to keep.
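A sketch of that consumer-side cropping for a single row, using the standard libjpeg sequential API (the file name and row index are placeholders; error handling is omitted). The preceding rows are still entropy-decoded; they just never get stored:

#include <cstdio>
#include <vector>
#include <jpeglib.h>

std::vector<JSAMPLE> readRow(const char* path, JDIMENSION targetRow)
{
    jpeg_decompress_struct cinfo;
    jpeg_error_mgr jerr;
    cinfo.err = jpeg_std_error(&jerr);
    jpeg_create_decompress(&cinfo);

    FILE* f = std::fopen(path, "rb");
    jpeg_stdio_src(&cinfo, f);
    jpeg_read_header(&cinfo, TRUE);
    jpeg_start_decompress(&cinfo);

    std::vector<JSAMPLE> row(cinfo.output_width * cinfo.output_components);
    JSAMPROW ptr = row.data();

    // Decode one scanline at a time into the same buffer; rows before
    // targetRow are simply overwritten, so only the target row survives.
    while (cinfo.output_scanline <= targetRow &&
           cinfo.output_scanline < cinfo.output_height)
        jpeg_read_scanlines(&cinfo, &ptr, 1);

    jpeg_abort_decompress(&cinfo);  // stop without decoding the remainder
    jpeg_destroy_decompress(&cinfo);
    std::fclose(f);
    return row;
}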
I'm building a KML file to use as a map layer in Google Earth and whatever else handles KML/KMZ files.
What I want to do is this: Display a number of bitmap images such that each is stretched to fit into a specified quadrilateral, where the first vertex of the quadrilateral specified would, for example, be the top-left corner of the bitmap, the next vertex would be where the top-right corner fits, and so on. Is there a (relatively) simple way to do this? If distorting/stretching the image isn't possible in any simple way, just displaying it at a specified location, scaling and rotation would be acceptable.
Update: To clarify: Given a set of four geospatial coordinates that form a quadrilateral, I'd like to take a rectangular bitmap (either via a specified URL or included in a KMZ file) and lay it onto the map such that its four corners line up with the four corners of the aforementioned quadrilateral. If it's not possible to distort an image to fit any quadrilateral, it would be sufficient to just specify position, rotation and size. Hopefully that's a little clearer.
Any help would be much appreciated.
Thanks!
Figured it out; you use a gx:LatLonQuad. The gx: prefix is Google's KML extension namespace, so the root <kml> element must declare xmlns:gx="http://www.google.com/kml/ext/2.2". The four coordinate tuples are the quadrilateral's corners in counter-clockwise order, starting with the one corresponding to the lower-left corner of the image:
<GroundOverlay>
  <name>Example Image Overlay</name>
  <color>87ffffff</color>
  <Icon>
    <href>mypicture.jpg</href>
    <viewBoundScale>0.75</viewBoundScale>
  </Icon>
  <gx:LatLonQuad>
    <coordinates>
      -115.8993079806076,36.72147153334678,0
      -115.8990441694222,36.72500067085463,0
      -115.9002128356738,36.72511090523616,0
      -115.9005214644026,36.72164386079184,0
    </coordinates>
  </gx:LatLonQuad>
</GroundOverlay>