I started looking at card.io as part of an Android application that should be able to scan a card and recognise the card number, expiration date and card holder.
After digging for a while, I got to the card.io-dmz/models/generated folder, where I see files that, according to a comment at the top of each, were "Autogenerated from models/conv/...".
However, I was not able to find details about the files used to generate these "models". After checking the code, I assume that these generated files are directly responsible for the OCR of the numbers on cards.
To provide an example, the following card is scanned and recognised just fine (numbers only)
but the following card fails
I tried adjusting the ROI before the vertical segmentation is done, but I think the difference between the fonts used on the two cards makes it impossible to scan the second card.
My question really is: given the current open source projects on GitHub, is there any chance for someone to add the capability of scanning cards similar to the black one above, or would this require access to other resources used to perform the actual OCR?
Dave from card.io here.
@Adrian, your conclusions are all correct.
While we'd love to extend our deep-learning character-recognition models to cover newer style cards, such as your second card above, it's a big task.
Quite a few new-style cards (~100) would be required both to update the code that locates the card number in the first place, and then to train new character-recognition models.
At the moment, this isn't something that lends itself well to open sourcing. People tend to not want to share images of their credit cards, for some reason.
We have given some thought toward creating an open-source app that could be used to collect some portions of card images (e.g., all digit positions, plus actual images of just a few of the digits, plus an image of the expiration date). Then perhaps we could crowd-source a usefully large collection of information. And while that collection is being built, we could work on open-sourcing the many in-house tools we have created for working with computer vision and deep learning.
Would such a project be something you might participate in?
This is my first time getting into drones.
I am currently looking at DJI drones, as they seem the most promising from a documentation and reviews point of view.
Basically, I would like to program one or more drones to fly a certain pattern and take pictures when certain criteria are met. For example, I would like the drone to take off and fly around a small park, stopping to take a picture of each tree it encounters, automatically (auto-piloted / driven by some "AI").
I have glanced through the DJI SDK documentation, and so far it SEEMS this is possible (via the FlightControl class), but I'm not sure.
Question:
Can my requirements be met with current drone SDK technologies?
Yes, the correct SDK, 4.11.1, will do everything you mentioned. You will need to do some location calculations, but that's about it.
The sample will do almost everything you want as-is, with minor changes.
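As a rough illustration of the "location calculations" mentioned above, here is a minimal Python sketch that turns a home position and a rectangle size into a back-and-forth list of waypoint coordinates. It is not DJI SDK code: `offset_position` and `grid_waypoints` are hypothetical helper names, the flat-earth approximation is only reasonable for small areas such as a park, and your app would still have to hand the resulting latitude/longitude pairs to the SDK's mission classes.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, good enough for short distances


def offset_position(lat, lon, north_m, east_m):
    """Shift a lat/lon position by a number of metres north and east.

    Uses a local flat-earth approximation (assumption: distances are small).
    """
    dlat = math.degrees(north_m / EARTH_RADIUS_M)
    dlon = math.degrees(east_m / (EARTH_RADIUS_M * math.cos(math.radians(lat))))
    return lat + dlat, lon + dlon


def grid_waypoints(home_lat, home_lon, width_m, depth_m, spacing_m):
    """Build a back-and-forth (lawnmower) pattern over a rectangle.

    The rectangle starts at the home position and extends width_m to the
    east and depth_m to the north; waypoints are spacing_m apart.
    """
    rows = int(depth_m // spacing_m) + 1
    cols = int(width_m // spacing_m) + 1
    waypoints = []
    for r in range(rows):
        # Reverse every second row so the drone does not fly back empty.
        col_order = range(cols) if r % 2 == 0 else reversed(range(cols))
        for c in col_order:
            waypoints.append(
                offset_position(home_lat, home_lon, r * spacing_m, c * spacing_m)
            )
    return waypoints
```

For example, `grid_waypoints(48.0, 11.0, 20, 10, 10)` yields six waypoints in two rows, the second row running east to west.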
With the DJI Mobile SDK you can use the Mission classes to automatically fly a given set of coordinates (waypoints) and do some actions once you arrive at a waypoint, e.g. take a picture.
However the SDK has limitations:
The SDK is unable to detect objects in the video stream, so you need to write your own code to detect objects.
The way the drone flies to the waypoint is quite limited, e.g. the drone will always face the camera in the direction of flight.
When using the DJI Mission classes, a change of the route during execution is only possible with the use of timeline Missions by adding timeline elements to the list.
As you already assumed in the comment: yes, the Mobile SDK is more advanced than the Windows SDK.
I've been asked to create a stand-alone site/app that's not connected to the web (all on a local server).
One part of it is to have a map of a natural reserve with a bunch of links that will show footpaths, different animals habitat areas, visitor centres and such.
So there's a map (static picture) and when you click on it some overlay goes on top of it.
At least that's the way I see it now.
I've looked here: http://www.carto.net/williams/yosemite/ but it just looks mucho ugly.
Getting Maps Premium is not an option as it's not that cheap. And the reason they don't want to use the free Maps/Earth API is that the internet connection is still very slow there (satellite internet only, and nobody knows when the optic cable will be hooked up).
Looking for some recommendations as to how to proceed there. Drawing paths/areas on the picture of the map seems extremely inefficient and time consuming.
I'd need some way to use coordinates to automatically draw areas and lines over the map, and then somehow export that as a graphics file (or SVG) that'll be layered on top of the original map simply using ajax.
Will ARCGIS pro edition be the way to go, or should I start learning SVG? Do you know some good SVG books/tutorials (as related to mapping)? Maybe there's some other way around altogether...
They do have detailed maps of the area in ARCGIS (whatever format they are in I don't know yet).
Just looking for some ideas, any help will be appreciated. Thanks in advance.
Do you know GeoServer? More or less all-in-one, compatible with different types of datasets, widely customisable.
Starting from "raw" SVG and writing the whole thing yourself will probably be prohibitively time consuming.
If you have very little data (say less than 50 geometries) that is fixed, you could also use OpenLayers without any backend server.
For the data you could use an OpenLayers.Layer.Image if your (overlay) map consists of a small raster image. For vector data, you can use OpenLayers.Layer.Text or an OpenLayers.Layer.Vector together with protocols OpenLayers.Layer.KML or .JSON.
You can click through the current release examples.
I admit that this is not an easy task for a beginner, but it's fun hacking the maps together.
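For a small, fixed set of geometries, the "use coordinates to draw lines over the map" step from the question can also be sketched without any mapping library. The Python below is only a minimal sketch under stated assumptions: the static map image is north-up, the area is small enough that a linear mapping between latitude/longitude and pixels is acceptable, and the bounding box of the image is known. `to_pixel` and `svg_overlay` are hypothetical names, not part of OpenLayers or GeoServer.

```python
def to_pixel(lat, lon, bounds, size):
    """Map a lat/lon pair to pixel coordinates on a north-up map image.

    bounds = (south, west, north, east) of the image in degrees,
    size   = (width, height) of the image in pixels.
    """
    south, west, north, east = bounds
    width, height = size
    x = (lon - west) / (east - west) * width
    y = (north - lat) / (north - south) * height  # pixel y grows downwards
    return x, y


def svg_overlay(paths, bounds, size):
    """Return an SVG document with one polyline per named path.

    paths maps a name (used as the element id) to a list of (lat, lon) pairs.
    """
    width, height = size
    parts = [
        '<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{width}" height="{height}" viewBox="0 0 {width} {height}">'
    ]
    for name, coords in paths.items():
        points = " ".join(
            "{:.1f},{:.1f}".format(*to_pixel(lat, lon, bounds, size))
            for lat, lon in coords
        )
        parts.append(
            f'<polyline id="{name}" points="{points}" '
            'fill="none" stroke="red" stroke-width="3"/>'
        )
    parts.append("</svg>")
    return "\n".join(parts)
```

The resulting SVG has the same pixel dimensions as the map picture, so it can be positioned directly on top of it and shown or hidden per layer.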
I want to create a wayfinder/pathfinder mobile application. It will route visitors in our buildings. We have 20 buildings and each has at least 4 floors.
We want to develop our own wayfinder, e.g. http://www.wayfinderkiosk.com/
It should use lat/long coordinates to locate people and help them find their route.
So where should I start? Does anyone have any idea for that? It is going to be mobile.
I can develop an app/site based on these platforms (Mobile Web / iPhone / Android / Symbian / Windows).
But I need a starting point, and I need your help.
Thanks
You want to use lat/long inside of a building? Assuming these visitors are going to be using their own unmodified devices, you may have trouble with GPS. Unless you somehow get reliable GPS signal despite being under a four-story building, that's probably not going to work.
An RFID-tagged badge and sensors placed throughout the building seems more likely to work. Put a unique QR-code on each badge that directs the phone's browser to a tracking page for that specific badge.
edit: and now that I re-read your question and see that multiple buildings are involved, the GPS bit could certainly work for routing them from one building to another.
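For the outdoor, building-to-building part, a first step could be working out which building entrance is closest to the phone's GPS fix. Below is a minimal Python sketch using the haversine great-circle distance; the function names and the entrance table are hypothetical, and it assumes you have surveyed a lat/long pair for each entrance.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearest_entrance(lat, lon, entrances):
    """Return the name of the entrance closest to the given position.

    entrances maps a name to a (lat, lon) pair (hypothetical survey data).
    """
    return min(entrances, key=lambda name: haversine_m(lat, lon, *entrances[name]))
```

Inside a building this does not help for the reasons given above; there you would feed the position from whatever indoor method you end up with into the same kind of lookup.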
I have recently attempted to generate reports in Silverlight 4. In my problem domain, these reports either need to go directly to the printer and/or the client-side SL application creates a PDF and allows the user to store it somewhere.
As for the report, it's roughly composed of 50% flow text (incl. enumerations), 30% tables and 20% charts. The flow text part makes it slightly more challenging, as proper line breaking would have to take place.
So far, I have tried the following approaches, each with its own shortcomings that make it not very feasible:
Silverlight's own PrintDocument: technically, there are two major concerns. For one, getting page breaks to work and printing UIElements on it with proper layout is a bit of a dirty hackjob and full of compromises; thankfully that's the part I've managed to get working so far. However, the PrintDocument class always renders all visuals as bitmaps before sending them off; this is not so much fun if one uses a PDF printer and hopes to still be able to search in / select text. David Poll's approach in "Silverlight and Beyond" [1] wasn't that helpful either, as it inherently follows the same approach and thus suffers from very similar issues.
silverPDF [2]: a barely documented library that requires doing most of the layout manually (the former approach at least allowed me to re-use Silverlight's layout engine). So far, I see no way to (for instance) measure paragraphs, and the only sample with long flow text uses hardcoded absolute values for layout rectangles. Also, the developing party seems to be inactive.
Personally, I'm now thinking of following an entirely different strategy: simply generate HTML documents. But I was hoping that the community here might have hints for the two approaches above or know other good approaches.
Thanks in advance,
~Manny
Do you need to generate the report on the client, or can you get the server to generate it? Your options are better if you can generate it on the server. Personally, I think the way Silverlight printing works at the moment is pretty poor for report usage (sending each page to the printer as raster rather than vector, resulting in potentially huge amounts of data travelling through the network, and lower printing quality output). I've found the best strategy is to generate the PDF on the server (enabling you to take advantage of a reporting engine), and display it in your application. There are also a few commercial products (such as Telerik's Silverlight Report Viewer, Report Sharp Shooter, or even First Floor Software's Document Toolkit). If a client side solution is really required, perhaps one of these might be the best option (although the printing quality will still be poor). Note that Silverlight 5 is supposed to have support for vector printing, but it's another 6 months or more away from release. Yet another option is Pete Brown and David Poll's open source reporting framework here: http://silverlightreporting.codeplex.com/.
If you want to take the option of generating the report on the server as a PDF and displaying it in your application, I've written an article on doing so here: http://www.silverlightshow.net/items/Building-a-Silverlight-Line-Of-Business-Application-Part-6.aspx. This doesn't work for OOB applications, but the source code accompanying my book (Pro Business Applications with Silverlight 4) does: apress.com/book/view/9781430272076.
Hope this helps...
Chris Anderson
I'm communicating with a logic analyzer (HP 1660A) over RS232. I issue a command which tells the analyzer to print-screen its display and send it over to the controller (my PC) through serial communication. I'm saving the result (which is usually about 25 kB) to my computer and I would like to view it as a TIFF or other format. The problem is that the response from the analyzer comes in PCL format, which is suitable to be sent to a printer and printed directly, but not to be opened as an image. I have tried a few PCL-to-image converters to do the job and found one which does it properly; however, I've used the trial version and I am reluctant to purchase it. That is the background of my labour. I would appreciate any kind of help: a reference to the commands in PCL 1, and what I should do in order to extract the data from the PCL file and format it properly. I have no experience with PCL and image processing whatsoever, so please, give me a hand here. Thank you.
P.S. I've obtained the PCL file from the analyzer, both in C# and MATLAB... I have one slight problem in C# with the serial port control: some images contain uninterpreted characters when using the above converters. I mention all this because I need an algorithm or some indications, no matter the programming language, so please feel free to post.
PCL is complex to read. There are only a handful of tools out there that do a good job of this. We have lots of PCL expertise and still often look to others to supply conversion to PDF and other formats. If the PCL is quite simple, that is, just text, a few fonts, and a graphic or two, a couple of RegEx commands could deal with the extraction of the text, and then you could mock up a new document using whatever tools you wish.
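If the capture turns out to be mostly raster graphics, which seems plausible for a screen dump, the rows can in principle be pulled out directly. The sketch below is a minimal Python example under several assumptions: the file uses the standard PCL "transfer raster data" command (ESC * b # W), the rows are uncompressed (compression mode 0), and no other binary payload in the stream happens to contain the same byte pattern. `extract_raster_rows` and `rows_to_pbm` are made-up names, and the output is a plain PBM bitmap, which tools such as ImageMagick can convert to TIFF.

```python
import re

ESC = 0x1B
# One parameter of a PCL escape sequence: a number followed by a letter.
PARAM = re.compile(rb"([+-]?\d+)([A-Za-z])")


def extract_raster_rows(data: bytes) -> list:
    """Collect the payload of every uncompressed ESC*b#W raster row."""
    rows = []
    i = 0
    while i < len(data):
        # Skip everything that is not the start of an ESC * b sequence.
        if data[i] != ESC or data[i + 1:i + 3] != b"*b":
            i += 1
            continue
        i += 3
        # A sequence may combine several parameters, e.g. ESC*b0m100W.
        while True:
            match = PARAM.match(data, i)
            if match is None:
                break
            value, letter = int(match.group(1)), match.group(2)
            i = match.end()
            if letter in (b"m", b"M") and value != 0:
                raise ValueError("compressed raster rows are not handled here")
            if letter in (b"w", b"W"):
                rows.append(data[i:i + value])  # the next `value` bytes are pixels
                i += value
                break
            if letter.isupper():
                break  # an uppercase letter ends the escape sequence
    return rows


def rows_to_pbm(rows) -> bytes:
    """Pack raster rows into a binary PBM (P4) image; a 1 bit is a black dot."""
    if not rows:
        raise ValueError("no raster rows found")
    width_bytes = max(len(row) for row in rows)
    header = b"P4\n%d %d\n" % (width_bytes * 8, len(rows))
    return header + b"".join(row.ljust(width_bytes, b"\x00") for row in rows)
```

Reading the captured file in binary mode, passing its bytes through both functions and writing the result to a `.pbm` file would then give something an image viewer can open, provided the assumptions above hold for the analyzer's output.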
Looking at these files on Stack Overflow might be tough. If you can get them on an FTP server and post a link, I can take a quick look and post my findings/thoughts here. The other option is to look to an outside tool. There are a few we've had success with. Our needs are broad, so I've settled on one that works the best with many different PCL streams (some PCL coding is better than others). As you are dealing with a known quantity of PCL, you may have a few options. Here are a few we've used and had some success with (in order of usefulness to us):
PCLWorks by PageTech (they have a GUI viewer and complete SDK)
VeryPDF PCL Converter (command line tool)
SwiftView
There are others, and even an open-source variant of Ghostscript that handles PCL (we've never had much luck with it, as the PCL we use often contains very custom fonts, symbol sets, and tons of macros which seem to choke it):
GhostPCL
EDIT: Most recently we've been working with LincPDF (http://www.lincolnco.com/). This is also an excellent product which has one big benefit: deployment is simple. Some of the other tools have complex software installations. This solution is very easy for us to deploy as a feature in an application. It's also faster than any tools we've tested to date (at least with the PCL that we generate from our apps, which is quite complex as it includes specialized fonts and macros).
According to the spec sheet for the HP 1660 series (pdf), it can output TIFF, PCX and PostScript.
Wouldn't it be easier to use TIFF?
The project was put on hold for a while, but I would like to offer a complete and usable solution.
@Adrian
You can save the image to a floppy disk; I've done that, saved it as TIFF, and everything worked fine. Unfortunately, it sends only PCL through RS232. The idea of saving the print screen over serial communication was to avoid using the floppy disk too much, since the device uses it in order to boot.
@Douglas
Thank you for your elaborate answer. I'll take a look at the indicated tools; however, my desire is to offer a complete front-end solution which yields the graphic directly. I've put some files from my tests here so you can see the complexity of the PCL constructions. Do you know of an API that I could integrate into my application, which can parse the file and interpret the PCL?
Regards,
Cosmin
We capture the serial input via a serial spooler that watches COM1:. It's called SSpool.exe. It redirects the PCL as input to PCLXForm. PCLXForm converts it into any raster format (TIFF, JPG, PDF, BMP, etc.). However, we can also extract the text during the conversion, and we can extract individual raster objects from the PCL for re-arrangement in the downstream application. Our pricing model is positioned for licensees that need to convert up to 50,000 pages of invoices into indexed PDFs per month. However, this type of application normally requires a custom license in order to get our pricing down to the level required. In order to do so, we often have to restrict our product to convert unlimited files, but only up to the 20th page within any one PCL print file. That provides enough page volume and gives us the ability to reduce the pricing per unit. To demo, you would need the PCLTool SDK.