I am using OpenCV for image manipulation in C. Please forgive me if this question is answered in the documentation, but I have found the OpenCV docs to be pretty badly formed and difficult to read.
I have a CvMat* that I extracted from an image file as below:
CvMat* mat = cvLoadImageM((char*) filename, CV_LOAD_IMAGE_COLOR);
What I need to do is get a subimage of that by cropping out a certain bounded region. A logical command for this might be:
CvMat* subMat = cvGetSubImage(mat, minx, maxx, miny, maxy);
where minx, maxx, miny, and maxy define the boundaries of the cropped region. Is there a built-in way to do this easily?
Take a look at http://nashruddin.com/OpenCV_Region_of_Interest_(ROI)/
The tutorial there does the following with a region of interest:
cvSetImageROI(img1, cvRect(10, 15, 150, 250));
IplImage *img2 = cvCreateImage(cvGetSize(img1),
img1->depth,
img1->nChannels);
cvCopy(img1, img2, NULL);
cvResetImageROI(img1);
OpenCV has built-in capabilities for setting the region you care about and copying that region out of an image, which is exactly what you want to achieve.
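For intuition, here is a minimal library-free sketch of what copying a region of interest amounts to on a row-major, interleaved pixel buffer. It does not call OpenCV; the Image struct and the cropImage function are hypothetical names used only for illustration, and maxx/maxy are assumed to be exclusive bounds.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an image: row-major, channels interleaved.
struct Image {
    int width;
    int height;
    int channels;
    std::vector<std::uint8_t> data;
};

// Copy the region [minx, maxx) x [miny, maxy) into a new image.
// The bounds are assumed to be exclusive on the max side.
Image cropImage(const Image& src, int minx, int maxx, int miny, int maxy) {
    // The rectangle has to lie inside the source image.
    assert(0 <= minx && minx < maxx && maxx <= src.width);
    assert(0 <= miny && miny < maxy && maxy <= src.height);

    Image dst;
    dst.width = maxx - minx;
    dst.height = maxy - miny;
    dst.channels = src.channels;
    dst.data.resize(static_cast<std::size_t>(dst.width) * dst.height * dst.channels);

    const std::size_t srcRowBytes = static_cast<std::size_t>(src.width) * src.channels;
    const std::size_t dstRowBytes = static_cast<std::size_t>(dst.width) * dst.channels;
    for (int y = 0; y < dst.height; ++y) {
        // Start of the wanted pixels in source row (miny + y).
        const std::uint8_t* from = src.data.data()
            + (miny + y) * srcRowBytes
            + static_cast<std::size_t>(minx) * src.channels;
        std::copy(from, from + dstRowBytes, dst.data.data() + y * dstRowBytes);
    }
    return dst;
}
```

With bounds expressed this way, the matching OpenCV rectangle would be cvRect(minx, miny, maxx - minx, maxy - miny), since cvRect takes x, y, width and height.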
If you want a sub-pixel accurate rectangular section of a src image, use cvGetRectSubPix or cv::getRectSubPix (this creates an independent copy of all the data; it is not an ROI!)
Example:
cv::Size size(dst_width,dst_height);
cv::Point2f center(src_centerx,src_center_y);
cv::Mat dst;
cv::getRectSubPix(src,size, center,dst,CV_8U);
Generally this is done by cropping an ROI (region of interest). This blog post goes into some detail on cropping:
/* load image */
IplImage *img1 = cvLoadImage("elvita.jpg", 1);
/* sets the Region of Interest
Note that the rectangle area has to be __INSIDE__ the image */
cvSetImageROI(img1, cvRect(10, 15, 150, 250));
/* create destination image
Note that cvGetSize will return the width and the height of ROI */
IplImage *img2 = cvCreateImage(cvGetSize(img1),
img1->depth,
img1->nChannels);
/* copy subimage */
cvCopy(img1, img2, NULL);
/* always reset the Region of Interest */
cvResetImageROI(img1);
To convert between IplImage (legacy OpenCV) and cv::Mat (OpenCV 2.x), simply use the cv::Mat constructor or look at this question for more methods.
I'm experiencing a frustrating issue trying to draw text using Pango and Cairo libraries in C in a Gtk application running on Ubuntu Linux.
I'm creating a Pango layout and then drawing it at a given location which is determined by the size of the text as reported by pango_layout_get_pixel_size, but the size returned by that function is wrong in both width and height, especially in height. Here is my full code:
// Create a cairo context with which to draw
// Note that we already have a GtkDrawingArea widget m_pGtkDrawingArea
cairo_t *cr = gdk_cairo_create(m_pGtkDrawingArea->window);
// Text to draw
std::string szText("NO DATA AVAILABLE");
// Create the layout
PangoLayout *pLayout = gtk_widget_create_pango_layout(m_pGtkDrawingArea, szText.c_str());
// Set layout properties
pango_layout_set_alignment(pLayout, PANGO_ALIGN_LEFT);
pango_layout_set_width(pLayout, -1);
// The family to use
std::string szFontFamily("FreeSans");
// The font size to use
double dFontSize = 36.0;
// Format the font description string
char szFontDescription[32];
memset(&(szFontDescription[0]), 0, sizeof(szFontDescription));
snprintf(szFontDescription, sizeof(szFontDescription) - 1, "%s %.1f", szFontFamily.c_str(), dFontSize);
// Get a new pango font description
PangoFontDescription *pFontDescription = pango_font_description_from_string(szFontDescription);
// Set up the pango font description
pango_font_description_set_weight(pFontDescription, PANGO_WEIGHT_NORMAL);
pango_font_description_set_style(pFontDescription, PANGO_STYLE_NORMAL);
pango_font_description_set_variant(pFontDescription, PANGO_VARIANT_NORMAL);
pango_font_description_set_stretch(pFontDescription, PANGO_STRETCH_NORMAL);
// Set this as the pango font description on the layout
pango_layout_set_font_description(pLayout, pFontDescription);
// Use auto direction
pango_layout_set_auto_dir(pLayout, TRUE);
// Get the pixel size of this text - this reports a size of 481x54 pixels
int iPixelWidth = 0, iPixelHeight = 0;
pango_layout_get_pixel_size(pLayout, &iPixelWidth, &iPixelHeight);
// Calculate the text location based on iPixelWidth and iPixelHeight
double dTextLocX = ...;
double dTextLocY = ...;
// Set up the cairo context for drawing the text
cairo_set_source_rgba(cr, 1.0, 1.0, 1.0, 1.0);
cairo_set_antialias(cr, CAIRO_ANTIALIAS_BEST);
// Move into place
cairo_move_to(cr, dTextLocX, dTextLocY);
// Draw the layout
pango_cairo_show_layout(cr, pLayout);
//
// pango_layout_get_pixel_size() reported a size of 481x54 pixels,
// but the actual size when drawn is 478x37 pixels!
//
//
// Clean up...
//
So, as described at the bottom of the above code, the pango_layout_get_pixel_size function reports a size of 481x54 pixels, but the size of the text on the screen is actually 478x37 pixels.
What am I doing wrong here? How can I get the actual correct pixel size?
Thanks in advance for your help!
The text you are displaying ("NO DATA AVAILABLE") is all-caps, and consequently has no descenders (letters which are partly below the baseline, like j, p and q.) Normally, when you measure the extent of a text box, you include room for the descenders whether or not they are present; otherwise, you will see odd artifacts such as inconsistent line separation depending on whether or not a given line has a descender.
Pango provides APIs which return both the logical extent (which includes the full height of the font) and the ink extent (which is the bounding box of the inked part of the image); for example, pango_layout_get_pixel_extents fills in both rectangles in pixels. I suspect you are looking for the ink extent.
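To make the difference between the two extents concrete, here is a small library-free sketch. It does not call Pango: the GlyphInk struct, the measureHeights function and every metric value used with it are hypothetical and chosen only to illustrate why all-caps text has a smaller ink height than logical height.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical per-glyph ink bounds, measured from the baseline in pixels.
struct GlyphInk {
    int above;  // inked pixels above the baseline
    int below;  // inked pixels below the baseline (0 when there is no descender)
};

struct Extents {
    int logicalHeight;  // full line height reserved by the font
    int inkHeight;      // height of the pixels that are actually drawn
};

// The logical height depends only on the font metrics; the ink height
// depends on which glyphs are present in the text.
Extents measureHeights(int ascent, int descent, const std::vector<GlyphInk>& glyphs) {
    assert(ascent >= 0 && descent >= 0);
    int maxAbove = 0;
    int maxBelow = 0;
    for (const GlyphInk& g : glyphs) {
        maxAbove = std::max(maxAbove, g.above);
        maxBelow = std::max(maxBelow, g.below);
    }
    return Extents{ascent + descent, maxAbove + maxBelow};
}
```

With a made-up font of ascent 40 and descent 10, two capital letters that reach 30 pixels above the baseline give a logical height of 50 but an ink height of only 30; adding a glyph with an 8-pixel descender raises the ink height to 38 while the logical height stays at 50.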
I am using CUDA to generate this ABGR output image. The image in question is stored in a uchar4 array. Each element of the array represents the color of each pixel in the image. Obviously, this output array is a 2D image but it is allocated in CUDA as a linear memory of interleaved bytes.
I know that CUDA can easily map this array to an OpenGL Vertex Buffer Object. My question is, assuming that I have the RGB value of every pixel in an image, along with the width and height of the image, how can I draw this image to screen using OpenGL?
I know that some kind of shader must be involved but since my knowledge is very little, I have no idea how a shader can use the color of each pixel, but map it to correct screen pixels.
I know I should increase my knowledge in OpenGL, but this seems like a trivial task.
If there is an easy way for me to draw this image, I'd rather not spend much time learning OpenGL.
I finally figured out an easy way to do what I wanted. Unfortunately, I did not know about the existence of the sample that Robert was talking about on NVIDIA's website.
Long story short, the easiest way to draw the image was to define a Pixel Buffer Object in OpenGL, register the buffer with CUDA and pass it as an output array of uchar4 to the CUDA kernel. Here is some quick pseudo-code based on JOGL and JCUDA that shows the steps involved. Most of the code was obtained from the sample on NVIDIA's website:
1) Creating the OpenGL buffers
GL2 gl = drawable.getGL().getGL2();
int[] buffer = new int[1];
// Generate buffer
gl.glGenBuffers(1, IntBuffer.wrap(buffer));
glBuffer = buffer[0];
// Bind the generated buffer
gl.glBindBuffer(GL2.GL_ARRAY_BUFFER, glBuffer);
// Specify the size of the buffer (no data is pre-loaded in this buffer)
gl.glBufferData(GL2.GL_ARRAY_BUFFER, imageWidth * imageHeight * 4, (Buffer)null, GL2.GL_DYNAMIC_DRAW);
gl.glBindBuffer(GL2.GL_ARRAY_BUFFER, 0);
// The bufferResource is of type CUgraphicsResource and is defined as a class field
this.bufferResource = new CUgraphicsResource();
// Register buffer in CUDA
cuGraphicsGLRegisterBuffer(bufferResource, glBuffer, CUgraphicsMapResourceFlags.CU_GRAPHICS_MAP_RESOURCE_FLAGS_NONE);
2) Initialize the texture and set texture parameters
GL2 gl = drawable.getGL().getGL2();
int[] texture = new int[1];
gl.glGenTextures(1, IntBuffer.wrap(texture));
this.glTexture = texture[0];
gl.glBindTexture(GL2.GL_TEXTURE_2D, glTexture);
gl.glTexParameteri(GL2.GL_TEXTURE_2D, GL2.GL_TEXTURE_MIN_FILTER, GL2.GL_LINEAR);
gl.glTexParameteri(GL2.GL_TEXTURE_2D, GL2.GL_TEXTURE_MAG_FILTER, GL2.GL_LINEAR);
gl.glTexImage2D(GL2.GL_TEXTURE_2D, 0, GL2.GL_RGBA8, imageWidth, imageHeight, 0, GL2.GL_BGRA, GL2.GL_UNSIGNED_BYTE, (Buffer)null);
gl.glBindTexture(GL2.GL_TEXTURE_2D, 0);
3) Run the CUDA kernel and display the results in OpenGL's display loop.
this.runCuda(drawable);
GL2 gl = drawable.getGL().getGL2();
gl.glBindBuffer(GL2.GL_PIXEL_UNPACK_BUFFER, glBuffer);
gl.glBindTexture(GL2.GL_TEXTURE_2D, glTexture);
gl.glTexSubImage2D(GL2.GL_TEXTURE_2D, 0, 0, 0,
imageWidth, imageHeight,
GL2.GL_RGBA, GL2.GL_UNSIGNED_BYTE, 0); //The last argument must be ZERO! NOT NULL! :-)
gl.glBindBuffer(GL2.GL_PIXEL_PACK_BUFFER, 0);
gl.glBindBuffer(GL2.GL_PIXEL_UNPACK_BUFFER, 0);
gl.glBindTexture(GL2.GL_TEXTURE_2D, glTexture);
gl.glEnable(GL2.GL_TEXTURE_2D);
gl.glDisable(GL2.GL_DEPTH_TEST);
gl.glDisable(GL2.GL_LIGHTING);
gl.glTexEnvf(GL2.GL_TEXTURE_ENV, GL2.GL_TEXTURE_ENV_MODE, GL2.GL_REPLACE);
gl.glMatrixMode(GL2.GL_PROJECTION);
gl.glPushMatrix();
gl.glLoadIdentity();
gl.glOrtho(-1.0, 1.0, -1.0, 1.0, -1.0, 1.0);
gl.glMatrixMode(GL2.GL_MODELVIEW);
gl.glLoadIdentity();
gl.glViewport(0, 0, imageWidth, imageHeight);
gl.glBegin(GL2.GL_QUADS);
gl.glTexCoord2f(0.0f, 1.0f);
gl.glVertex2f(-1.0f, -1.0f);
gl.glTexCoord2f(1.0f, 1.0f);
gl.glVertex2f(1.0f, -1.0f);
gl.glTexCoord2f(1.0f, 0.0f);
gl.glVertex2f(1.0f, 1.0f);
gl.glTexCoord2f(0.0f, 0.0f);
gl.glVertex2f(-1.0f, 1.0f);
gl.glEnd();
gl.glMatrixMode(GL2.GL_PROJECTION);
gl.glPopMatrix();
gl.glDisable(GL2.GL_TEXTURE_2D);
3.5) The CUDA call:
public void runCuda(GLAutoDrawable drawable) {
devOutput = new CUdeviceptr();
// Map the OpenGL buffer to a resource and then obtain a CUDA pointer to that resource
cuGraphicsMapResources(1, new CUgraphicsResource[]{bufferResource}, null);
cuGraphicsResourceGetMappedPointer(devOutput, new long[1], bufferResource);
// Setup the kernel parameters making sure that the devOutput pointer is passed to the kernel
Pointer kernelParams =
.
.
.
.
int gridSize = (int) Math.ceil(imageWidth * imageHeight / (double)DESC_BLOCK_SIZE);
cuLaunchKernel(function,
gridSize, 1, 1,
DESC_BLOCK_SIZE, 1, 1,
0, null,
kernelParams, null);
cuCtxSynchronize();
// Unmap the buffer so that it can be used in OpenGL
cuGraphicsUnmapResources(1, new CUgraphicsResource[]{bufferResource}, null);
}
PS: I thank Robert for providing the link to the sample. I also thank the people who downvoted my question without any useful feedback!
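The grid-size computation in runCuda can be sketched without floating point. The snippet below is a plain C++ illustration, not part of the original answer: gridSizeFor and pixelFromIndex are hypothetical helper names, and the row-major mapping from a linear thread index to a pixel is an assumption about how the (unshown) kernel addresses the uchar4 array.

```cpp
#include <cassert>

// Number of blocks needed so that gridSize * blockSize >= pixelCount.
// Integer equivalent of ceil(pixelCount / (double)blockSize).
int gridSizeFor(int pixelCount, int blockSize) {
    assert(pixelCount >= 0 && blockSize > 0);
    return (pixelCount + blockSize - 1) / blockSize;
}

struct PixelCoord {
    int x;
    int y;
};

// Assumed row-major layout: linear index -> (column, row).
PixelCoord pixelFromIndex(int index, int imageWidth) {
    assert(index >= 0 && imageWidth > 0);
    return PixelCoord{index % imageWidth, index / imageWidth};
}
```

Because the grid is rounded up, the last block may contain threads whose index is at or beyond imageWidth * imageHeight; a kernel written this way would typically return early for those indices.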
I am trying to do the examples in the OpenCV book and I got to the part regarding cvCanny. I am trying to use it, but I keep getting a memory exception error of
Unhandled exception at 0x75d8b760 in Image_Transform.exe: Microsoft C++ exception: cv::Exception at memory location 0x0011e7a4..
I have also looked at another post that was similar to this question, but it did not help for me as I got the same error each time. Any help is greatly appreciated and the source code for the function is located below.
void example2_4(IplImage* img)
{
// Create windows to show input and output images
cvNamedWindow("Example 2-4 IN", CV_WINDOW_AUTOSIZE);
cvNamedWindow("Example 2-4 OUT", CV_WINDOW_AUTOSIZE);
// Display our input image
cvShowImage("Example 2-4 IN", img);
// Create an image to hold our modified input image
IplImage* out = cvCreateImage(cvGetSize(img), IPL_DEPTH_8U, 3);
// Do some smoothing
//cvSmooth(img, out, CV_GAUSSIAN, 3, 3);
// Do some Edge detection
cvCanny(img, out, 10, 20, 3);
// Show the results
cvShowImage("Example 2-4 OUT", out);
// Release the memory used by the transformed image
cvReleaseImage(&out);
// Wait for user to hit a key then clean up the windows
cvWaitKey(0);
cvDestroyWindow("Example 2-4 IN");
cvDestroyWindow("Example 2-4 OUT");
}
int main()
{
// Load in an image
IplImage* img = cvLoadImage("images/00000038.jpg");
// Run the transform
example2_4(img);
// clean the image from memory
cvReleaseImage(&img);
return 0;
}
You forgot to say if you are able to see the original image being displayed on the screen.
I never get tired of telling people that checking the return of functions is a must!
Consider IplImage* img = cvLoadImage("images/00000038.jpg"); , how can you tell if this function succeeded or not? As far as I can tell, the error you are having might be from a function failing prior to cvCanny() being called.
Anyway, I recently posted a code that uses cvCanny to improve circle detection. You can check that code and see what you are doing differently.
EDIT:
Your problem in this case is that you are passing 3-channel images to cvCanny as both input and output, when it only accepts single-channel images. Check the docs:
void cvCanny(const CvArr* image, CvArr* edges, double threshold1, double threshold2, int aperture_size=3)
Implements the Canny algorithm for edge detection.
Parameters:
* image – Single-channel input image
* edges – Single-channel image to store the edges found by the function
* threshold1 – The first threshold
* threshold2 – The second threshold
* aperture_size – Aperture parameter for the Sobel operator (see Sobel)
So, change your code to:
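// Create a single-channel image to hold the detected edges
IplImage* out = cvCreateImage(cvGetSize(img), IPL_DEPTH_8U, 1);
// Do some smoothing
//cvSmooth(img, out, CV_GAUSSIAN, 3, 3);
// cvCanny needs a single-channel input, so convert the image to grayscale first
IplImage* gray = cvCreateImage(cvGetSize(img), IPL_DEPTH_8U, 1);
cvCvtColor(img, gray, CV_BGR2GRAY);
// Do some Edge detection
cvCanny(gray, out, 10, 20, 3);
// The grayscale copy is no longer needed once the edges are computed
cvReleaseImage(&gray);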
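For readers wondering what the grayscale conversion step does to the data, here is a minimal library-free sketch. It does not call OpenCV; bgrToGray is a hypothetical name, and the code uses the standard luma weights (0.299 R + 0.587 G + 0.114 B) in floating point, so results may differ slightly from OpenCV's own implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Convert interleaved 8-bit BGR pixels into one 8-bit gray value per pixel.
std::vector<std::uint8_t> bgrToGray(const std::vector<std::uint8_t>& bgr) {
    assert(bgr.size() % 3 == 0);
    std::vector<std::uint8_t> gray(bgr.size() / 3);
    for (std::size_t i = 0; i < gray.size(); ++i) {
        const double b = bgr[3 * i];
        const double g = bgr[3 * i + 1];
        const double r = bgr[3 * i + 2];
        // Weighted sum of the channels, rounded to the nearest integer.
        gray[i] = static_cast<std::uint8_t>(0.114 * b + 0.587 * g + 0.299 * r + 0.5);
    }
    return gray;
}
```

The output has one byte per pixel instead of three, which is the single-channel layout that the edge detector expects.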
I'm new to OpenCV and image processing and I'm not sure how to solve my problem.
I have a photo of a document taken with an iPhone and I want to convert that document to black and white. I tried to use a threshold, but the text did not come out well (a little blurry and unreadable). I'd like the text to look the same as in the original image, only black, with a white background. What can I do?
P.S. When I take a photo of a part of the document where the text is quite big, the result is OK.
I will be grateful for any help.
Here are the example image I used and the result:
My attempt, maybe a little more readable than yours:
IplImage * pRGBImg = 0;
pRGBImg = cvLoadImage(input_file.c_str(), CV_LOAD_IMAGE_UNCHANGED);
if(!pRGBImg)
{
std::cout << "ERROR: Failed to load input image" << std::endl;
return -1;
}
// Allocate the grayscale image
IplImage * pGrayImg = 0;
pGrayImg = cvCreateImage(cvSize(pRGBImg->width, pRGBImg->height), pRGBImg->depth, 1);
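// Convert it to grayscale (cvLoadImage returns color channels in BGR order)
cvCvtColor(pRGBImg, pGrayImg, CV_BGR2GRAY);
// Dilate (note: the iterations argument is an int, so 0.2 is truncated to 0)
cvDilate(pGrayImg, pGrayImg, 0, 0.2);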
cvThreshold(pGrayImg, pGrayImg, 30, 255, CV_THRESH_BINARY | CV_THRESH_OTSU);
cvSmooth(pGrayImg, pGrayImg, CV_BLUR, 2, 2);
cvSaveImage("out.png", pGrayImg);
Thresholding is used for different purposes.
If you just want to convert the image to a grayscale ("b/w") image, just do this, using OpenCV 2.2:
cv::Mat image_name = cv::imread("fileName", 0);
The second parameter, 0, tells imread to load a color image as a grayscale image.
And if you want to save it as a b/w image file, write this:
cv::Mat image_name = cv::imread("fileName", 0);
cv::imwrite("bw_filename.jpg", image_name);
Using Adaptive Gaussian Thresholding is a good idea here. This will also enhance the quality of text written in the image.
You can do that by simply calling:
cv::adaptiveThreshold(src_Mat, dst_Mat, Max_value, Adaptive_Thresholding_Method, Thresholding_type, blocksize, C);
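To show roughly what adaptive thresholding computes, here is a minimal library-free sketch. Note the swap: it uses a plain mean over the neighborhood rather than the Gaussian weighting suggested above, because that keeps the code short. The function name adaptiveMeanThreshold is hypothetical, and clamping coordinates at the border is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// For each pixel, compare it against the mean of its blockSize x blockSize
// neighborhood minus a constant c. Pixels above that local threshold become
// maxValue, all others become 0.
std::vector<std::uint8_t> adaptiveMeanThreshold(const std::vector<std::uint8_t>& src,
                                                int width, int height,
                                                int blockSize, double c,
                                                std::uint8_t maxValue = 255) {
    assert(width > 0 && height > 0);
    assert(blockSize >= 3 && blockSize % 2 == 1);
    assert(src.size() == static_cast<std::size_t>(width) * height);

    std::vector<std::uint8_t> dst(src.size());
    const int radius = blockSize / 2;
    const double count = static_cast<double>(blockSize) * blockSize;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            double sum = 0.0;
            for (int dy = -radius; dy <= radius; ++dy) {
                for (int dx = -radius; dx <= radius; ++dx) {
                    // Clamp to the image so border pixels reuse the nearest edge value.
                    const int sx = std::min(std::max(x + dx, 0), width - 1);
                    const int sy = std::min(std::max(y + dy, 0), height - 1);
                    sum += src[static_cast<std::size_t>(sy) * width + sx];
                }
            }
            const double threshold = sum / count - c;
            const std::size_t i = static_cast<std::size_t>(y) * width + x;
            dst[i] = src[i] > threshold ? maxValue : 0;
        }
    }
    return dst;
}
```

Because the threshold follows the local brightness, dark text stays black even where the page is unevenly lit, which is the usual reason to prefer this over a single global threshold for photographed documents.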
I am making a game for my CS class and the sprites I have found online are too small. How do you 'stretch' a bitmap, i.e. make it bigger, using SDL? (I need to increase their size by 50%; they're all the same size.) A snippet of example code would be appreciated.
This question does not specify an SDL version, and even though SDL2 was not available when the question was written, I believe an SDL2 answer adds completeness here.
Unlike SDL1.2, scaling is possible in SDL2 using the API method SDL_RenderCopyEx. No additional libs besides the basic SDL2 lib are needed.
int SDL_RenderCopyEx(SDL_Renderer* renderer,
SDL_Texture* texture,
const SDL_Rect* srcrect,
const SDL_Rect* dstrect,
const double angle,
const SDL_Point* center,
const SDL_RendererFlip flip)
By setting the size of dstrect one can scale the texture to an integer number of pixels. It is also possible to rotate and flip the texture at the same time.
Reference: https://wiki.libsdl.org/SDL_RenderCopyEx
Create your textures as usual:
surface = IMG_Load(filePath);
texture = SDL_CreateTextureFromSurface(renderer, surface);
And when it's time to render it, call SDL_RenderCopyEx instead of SDL_RenderCopy
You're going to get a better looking result using software that is designed for this task. A good option is ImageMagick. It can be used from the command line or programmatically.
For example, from the command line you just enter:
convert sprite.bmp -resize 150% bigsprite.bmp
If for some strange reason you want to write your own bilinear resize, this guy looks like he knows what he is doing.
Have you tried this?
SDL_Rect src, dest;
src.x = 0;
src.y = 0;
src.w = image->w;
src.h = image->h;
dest.x = 100;
dest.y = 100;
dest.w = image->w*1.5;
dest.h = image->h*1.5;
SDL_BlitSurface(image, &src, screen, &dest);
To stretch a surface, use this undocumented SDL function:
extern DECLSPEC int SDLCALL SDL_SoftStretch(SDL_Surface *src, SDL_Rect *srcrect,
SDL_Surface *dst, SDL_Rect *dstrect);
Like:
SDL_SoftStretch(image, &src, screen, &dest);
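For intuition about what a software stretch does, here is a minimal nearest-neighbor sketch on a plain pixel array. It is not SDL's implementation and does not call SDL; scaleNearest is a hypothetical name, and pixels are assumed to be packed 32-bit values stored row by row.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Scale a row-major image of packed 32-bit pixels to dstW x dstH by picking,
// for every destination pixel, the source pixel at the proportional position.
std::vector<std::uint32_t> scaleNearest(const std::vector<std::uint32_t>& src,
                                        int srcW, int srcH, int dstW, int dstH) {
    assert(srcW > 0 && srcH > 0 && dstW > 0 && dstH > 0);
    assert(src.size() == static_cast<std::size_t>(srcW) * srcH);

    std::vector<std::uint32_t> dst(static_cast<std::size_t>(dstW) * dstH);
    for (int y = 0; y < dstH; ++y) {
        const int sy = y * srcH / dstH;  // source row for this destination row
        for (int x = 0; x < dstW; ++x) {
            const int sx = x * srcW / dstW;  // source column for this destination column
            dst[static_cast<std::size_t>(y) * dstW + x] =
                src[static_cast<std::size_t>(sy) * srcW + sx];
        }
    }
    return dst;
}
```

For a 50% increase, the destination size would be srcW * 3 / 2 by srcH * 3 / 2. Nearest-neighbor keeps pixel art crisp but blocky; smoother results need interpolation such as bilinear filtering.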