How to identify that an image comes from a dot matrix printer - c

I am using Tesseract for OCR through the Charles Weld C# wrapper, and I pre-process the images with OpenCV.
My issue is that I need to pre-process the image differently if it came from a dot matrix printout. Is there a way, using OpenCV, to tell that the image was scanned from a dot matrix printout?
I have tried blurring the image once and counting the differences using AbsDiff, which is the technique I use to detect whether an image needs to be despeckled, but there is no consistent result that indicates dot matrix.

I had a few thoughts and decided to put them down in ImageMagick but you can equally do this sort of thing with OpenCV and its findContours().
I used this as an input image:
If you erode the black areas a little using morphology (or alternatively dilate the white, it comes to the same thing) each of the dots will become separated from adjacent ones. If you then do a "Connected Component Analysis" you will see that the image has an abnormally large number of very small dots which are roughly the same height as width - characteristic of circles or dots.
Here is the code I used in Terminal to run ImageMagick:
magick dotmatrix.png -threshold 50% -morphology dilate disk:1 \
-define connected-components:verbose=true \
-connected-components 8 -auto-level result.png
The output is this image wherein each detected blob gets a successively brighter shade of white:
More interesting though is the verbose output, which has one line of output per blob detected in the image. It shows lots of small 2x2, 3x2 and similarly sized dots with an area around 7 pixels, circled in red. I would use this as a base for exploring some more...
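To make the idea concrete without depending on ImageMagick or OpenCV, here is a minimal pure-Python sketch of the same connected-component test on a tiny binary grid (1 = ink). The function names `find_blobs` and `dot_fraction`, the `max_side=4` limit and the "width and height differ by at most 1" rule are my own assumptions for illustration, not values taken from the output above; you would tune them against real scans.

```python
from collections import deque

def find_blobs(grid):
    """Label 8-connected blobs of 1-pixels; return a list of (width, height, area)."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill over the 8 neighbours of each pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            r0 = r1 = r
            c0 = c1 = c
            area = 0
            while queue:
                y, x = queue.popleft()
                area += 1
                r0, r1 = min(r0, y), max(r1, y)
                c0, c1 = min(c0, x), max(c1, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and grid[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            blobs.append((c1 - c0 + 1, r1 - r0 + 1, area))
    return blobs

def dot_fraction(blobs, max_side=4):
    """Fraction of blobs that are small and roughly as wide as they are tall."""
    if not blobs:
        return 0.0
    dots = [b for b in blobs
            if b[0] <= max_side and b[1] <= max_side and abs(b[0] - b[1]) <= 1]
    return len(dots) / len(blobs)

# Toy example: three separated 2x2 dots and one long solid stroke.
image = [
    [1, 1, 0, 1, 1, 0, 1, 1, 0, 0],
    [1, 1, 0, 1, 1, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
]
blobs = find_blobs(image)
print(len(blobs), dot_fraction(blobs))  # 4 blobs, 0.75 of them dot-like
```

A page where this fraction is unusually high after the erode/dilate step would be a candidate for the dot matrix pipeline; where to put the cut-off is something you would have to measure on your own samples.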


How to plot several datafiles in consecutive graphs using gnuplot

I have a large number of ASCII files named in order of the form data.xxxx.tab, where "xxxx" is a number between 0000 and 9999. Each file contains 5 columns, where the first is for X-coordinate, second is for Y-coordinate and the remaining three are for variables which I wish to plot against X-coordinate. I need to know how to write a loop in gnuplot 4.6, that could plot consecutive graphs of one of the variables against X-coordinate.
I already tried the instructions given in the following posts:
Plotting with gnuplot from several files
and
gnuplot : plotting data from multiple input files in a single graph
but these created a single graph containing all the curves from all the data files together, whereas what I need are consecutive graphs that are plotted one after another, thus showing the evolution in time of the variable graph.
The following should work:
# fix axes for proper comparison between graphs
set xrange [0:10]
set yrange [0:10]
# if you want an animated gif
set term gif animate
set output 'output.gif'
# then plot your data; columns 3-5 hold the variables, so use 1:3, 1:4 or 1:5
do for [n=0:9999] {
    plot sprintf("data.%04d.tab", n) using 1:3 title 'case '.n
}
The %04d format inside the sprintf command pads the number n with leading zeros to a minimum field width of four digits, i.e. n=2 is printed as 0002, and n=9999 is printed as 9999.
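If you want to check the padding outside gnuplot, Python's % operator follows the same printf conventions. This is only a small sketch; the helper name `data_filename` is made up for the example.

```python
def data_filename(n):
    """Build the name of the n-th data file using printf-style zero padding."""
    return "data.%04d.tab" % n

for n in (2, 47, 9999):
    print(data_filename(n))
# data.0002.tab
# data.0047.tab
# data.9999.tab
```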
I would suggest using a shell script that calls a gnuplot file
file plot.gp:
set term png
set out fname.".png"
set title fname
plot fname w l
and then in the shell:
for fname in data.????.tab; do gnuplot -e "fname='$fname'" plot.gp; done
You'll get a file named data.xxxx.tab.png for each data.xxxx.tab.

Winmerge - Way to make identical lines not to be shown when compare 2 files?

When I compare two large files, I use WinMerge. It is a great tool for finding the delta between two files.
In my case, each of the two files contains nearly 3000 lines, and only some of the lines contain changes.
Is there any way to hide the lines that are identical in both files and show ONLY the lines that have deltas?
I want to inspect only the delta lines and minimize scrolling through a very long comparison result.
In version 2.16 you can go to the menu View - Diff Context and select 0 lines. This shows only the lines that differ between the 2 files and hides the rest.
EDIT: My answer is not correct anymore. See the other answer.
OLD ANSWER:
From the WinMerge FAQ:
4.6. Can I hide similar lines in a file comparison, so that only different lines are visible?
No, you can't. Many users have requested this feature but we don't
have any plans to implement it. We don't believe it would really
improve usability.

DOSBox autoexec menu design

I'm trying to make a (somewhat) stylish DOS menu as a present for my father.
I was able to get the whole menu system to work, but I wanted to gussy it up with some box drawing characters and, possibly, colored text.
In this YouTube video, the user shows an example of what I'm trying to do (example at the 5:00 mark), but doesn't explain how those characters are being rendered. In the Notepad document, it is displayed as goofy characters.
Do I need to save the file with a special type of encoding? Can it only be done in Notepad (I'm using TextEdit on Mac)? Can someone provide an example menu that can be added to DOSBox's [autoexec] config?
Also, I'm not sure if it is possible, but how can the text color/background color be changed? When running DOSBox initially, it shows their welcome screen with a blue background and box drawing characters, so I would think all of that is possible.
I tried using escaped Unicode characters and I tried using a capital-E acute (as shown in the linked video), but they just render funky stuff when run in DOSBox.
The discrepancy in characters is a result of different code pages being used in character rendering. English-speaking Windows uses ANSI code page 1252 (often loosely called Latin-1), while DOS uses OEM code page 437, or IBM-PC.
The code page that Windows uses will vary based on your system language, so you may need to experiment to find the correct characters, but basically, find the character you want to print in 437 (say ╔, which is 201) and then in your code use the 1252 version (where 201 is É, the capital-E acute from the video). Then save the file in ANSI encoding.
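If hunting for the matching characters by hand gets tedious (or your editor, like TextEdit, does not offer an ANSI option), one alternative is to let a script do the translation. The sketch below uses Python's built-in cp437 and cp1252 codecs; the helper name `to_dos_bytes` and the sample menu text are my own and only illustrative.

```python
# The same byte value is drawn differently under each code page.
top_left = "╔"
raw = top_left.encode("cp437")   # the single byte DOS will read
print(raw[0])                    # 201
print(raw.decode("cp1252"))      # É, what a Windows-1252 editor displays

def to_dos_bytes(text):
    """Encode text as CP437 bytes with DOS-style CRLF line endings."""
    return text.replace("\n", "\r\n").encode("cp437")

# Hypothetical menu frame; any character that exists in CP437 will work.
menu = "╔══════════╗\n║ 1. Games ║\n╚══════════╝\n"
dos_bytes = to_dos_bytes(menu)
print(len(dos_bytes))            # 3 lines of 12 characters plus CRLF each = 42
```

Writing `dos_bytes` to a file opened in binary mode (`open(path, "wb")`) gives you a file whose bytes DOSBox should draw as box characters, regardless of what it looks like in a modern editor.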

LSCOLORS actual color values

I find that the default colors I get with apps like gnome-terminal are often nearly unreadable. In particular, the bright green for executables is unreadable with a white background.
I can use LSCOLORS to remap executable to be red instead of bright green (or whatever), but what I'd rather do is to make the bright green a shade that's not quite so close to white.
Is there a file somewhere that maps the color numbers in LSCOLORS to RGB values?

processing .raw file image with ffmpeg api or C code

I am trying to process a .raw image file captured using v4l2. It is an H.264-encoded image with a yuv422 color space from a Logitech C920 webcam. dcraw is not working for me. However, from my previous question, the following command works, but with poor results (I get a 32 KB JPEG, whereas an OpenCV capture gives me a 900 KB image at the same 640x480 resolution):
ffmpeg -f rawvideo -s 640x480 -pix_fmt yuyv422 -i frame-1.raw frame-1.jpg
I need code written in C, or using the ffmpeg API, OpenCV, etc., that does the same as this command. I don't want to use QProcess in Qt (I am working on a server using Qt, and I am trying to send the raw file from a Raspberry Pi to the server and process it there). The dcraw output is a corrupted image.
http://ffmpeg.org/doxygen/trunk/examples.html
There should be some api samples in there that show how to get the image out with that specific encoding.
When interacting with a RAW file, I have also used IrfanView. If you know the header size of the file, the width and height, and the bits per pixel per color, you can quickly see what it looks like that way.
EDIT: I tried using IrfanView with your RAW, and I got something close, but not quite. The coloring was always off. I don't think it can handle that particular encoding of a RAW file right now.
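For what it's worth, if the file really is a headerless packed YUYV 4:2:2 frame (which is what the `-f rawvideo -pix_fmt yuyv422` flags in the question tell ffmpeg to assume), the unpacking itself is simple arithmetic. Below is a rough Python sketch of it; the same loop translates almost line for line to C. The function names are mine, and the full-range BT.601 coefficients are an assumption about the camera output, so colors may need adjusting. None of this applies if the data is actually an H.264 bitstream.

```python
def clamp(v):
    """Round and limit a value to the 0..255 byte range."""
    return max(0, min(255, int(round(v))))

def yuv_to_rgb(y, u, v):
    """Full-range BT.601 (JFIF-style) conversion; assumed, not confirmed for this camera."""
    r = y + 1.402 * (v - 128)
    g = y - 0.344136 * (u - 128) - 0.714136 * (v - 128)
    b = y + 1.772 * (u - 128)
    return clamp(r), clamp(g), clamp(b)

def yuyv_to_rgb(data, width, height):
    """Unpack packed YUYV bytes (Y0 U Y1 V per pixel pair) into RGB bytes.

    Assumes an even width and no file header.
    """
    expected = width * height * 2
    if len(data) != expected:
        raise ValueError("expected %d bytes, got %d" % (expected, len(data)))
    out = bytearray()
    for i in range(0, expected, 4):
        y0, u, y1, v = data[i:i + 4]
        # Both pixels of a pair share the same U and V samples.
        out += bytes(yuv_to_rgb(y0, u, v))
        out += bytes(yuv_to_rgb(y1, u, v))
    return bytes(out)

def write_ppm(path, rgb, width, height):
    """Save RGB bytes as a binary PPM, which most image tools can open."""
    with open(path, "wb") as fh:
        fh.write(b"P6\n%d %d\n255\n" % (width, height))
        fh.write(rgb)

# Tiny 2x1 frame: both pixels mid-grey (Y=128, neutral chroma).
frame = bytes([128, 128, 128, 128])
print(list(yuyv_to_rgb(frame, 2, 1)))  # [128, 128, 128, 128, 128, 128]
```

A 640x480 frame in this layout is exactly 640 * 480 * 2 = 614400 bytes, so checking the size of frame-1.raw is a quick way to see whether the assumption holds before going further.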
