I want to be able to read the content of pdf files. I need to do that with C on Linux.
The closest I could get was here, but I think Haru can only create PDFs and cannot read them (not 100% sure).
PS: I only need the plain text from pdf
Check out libpoppler. I've never used it for extracting text, just for querying PDF attributes. It's pretty easy to use.
How well do you need to parse them?
Just extracting strings should be relatively easy, fully accurate rendering is harder.
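To illustrate the "just extracting strings" case, here is a minimal sketch in plain C. It assumes the page content stream is already uncompressed and that the text is stored as literal strings in parentheses, e.g. (Hello) Tj. Many real PDFs compress their streams and use custom font encodings, in which case a library such as libpoppler is the practical route. The function name extract_pdf_literals is made up for this example.

```c
#include <stddef.h>

/*
 * Copy the contents of PDF literal strings (text between balanced
 * parentheses) from `src` into `out`, separating strings with one space.
 * Escapes are handled only minimally: the character after a backslash is
 * copied as-is, so "\(" yields "(", and octal escapes are not decoded.
 * Output is truncated if `out` is too small.
 * Returns the number of bytes written, excluding the terminating NUL.
 */
size_t extract_pdf_literals(const char *src, size_t len, char *out, size_t cap)
{
    size_t n = 0;
    int depth = 0;              /* nesting depth of parentheses */

    if (cap == 0)
        return 0;

    for (size_t i = 0; i < len; i++) {
        char c = src[i];

        if (depth == 0) {       /* outside any string: look for '(' */
            if (c == '(')
                depth = 1;
            continue;
        }

        if (c == '\\' && i + 1 < len) {
            c = src[++i];       /* take the escaped character literally */
        } else if (c == '(') {
            depth++;            /* nested, balanced parenthesis */
        } else if (c == ')') {
            if (--depth == 0) { /* end of this string */
                if (n + 1 < cap)
                    out[n++] = ' ';
                continue;
            }
        }

        if (n + 1 < cap)
            out[n++] = c;
    }

    if (n > 0 && out[n - 1] == ' ')
        n--;                    /* drop the trailing separator */
    out[n] = '\0';
    return n;
}
```

Called on the bytes of an uncompressed content stream, this returns the text with one space between strings. It ignores positioning operators, so word and line breaks are only approximate.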
Take a look at the source for evince or ghostscript?
This is for C++ but might be a good starting point for understanding PDF structure http://www.codeproject.com/KB/cpp/ExtractPDFText.aspx (sorry wrong link before)
Another possibility, though I've never used it, is VersyPDF. It claims to allow you to edit PDFs ... http://versypdf.sybrex-systems-ltd.qarchive.org/
I'm trying to find a way to annotate a video in C with polygons and bounding boxes, but I'm stuck at a very elementary step.
Assuming I know how to break a .MPEG movie up into multiple JPEG images, how do I manipulate that file in C? The things I'll eventually need to draw on are text, points, and lines, but I am having a hard time figuring out how to get started with this.
If I declare:
FILE *img = fopen("foo.jpeg", "rb");
then what could I do with img? Is there a way to access certain pixels in the drawing?
What you did in your code sample is just opening a file. You didn't even read any data from it yet.
The simplest way to load an image file is to use a dedicated library, such as SOIL.
If you weren't able to do it by yourself, however, I really don't think you will be able to accomplish your project goals - it is really advanced stuff you want to create, and you failed, as you already noticed, on the most basic of steps.
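As a rough sketch of what comes after loading: once a library has decoded the JPEG into a raw pixel buffer, points and lines are just writes into that buffer. The example below assumes a tightly packed 8-bit RGB layout (three bytes per pixel, rows top to bottom). The Image struct and the set_pixel and draw_line names are made up for this illustration, and decoding and re-encoding the JPEG are left to the library.

```c
#include <stdint.h>
#include <stdlib.h>

/* A decoded image: 3 bytes (R, G, B) per pixel, rows stored top to bottom. */
typedef struct {
    int width, height;
    uint8_t *data;              /* width * height * 3 bytes */
} Image;

/* Write one pixel; coordinates outside the image are silently ignored. */
void set_pixel(Image *im, int x, int y, uint8_t r, uint8_t g, uint8_t b)
{
    if (x < 0 || y < 0 || x >= im->width || y >= im->height)
        return;
    uint8_t *p = im->data + 3 * ((size_t)y * im->width + x);
    p[0] = r;
    p[1] = g;
    p[2] = b;
}

/* Draw a line from (x0, y0) to (x1, y1) using Bresenham's algorithm. */
void draw_line(Image *im, int x0, int y0, int x1, int y1,
               uint8_t r, uint8_t g, uint8_t b)
{
    int dx = abs(x1 - x0), sx = x0 < x1 ? 1 : -1;
    int dy = -abs(y1 - y0), sy = y0 < y1 ? 1 : -1;
    int err = dx + dy;

    for (;;) {
        set_pixel(im, x0, y0, r, g, b);
        if (x0 == x1 && y0 == y1)
            break;
        int e2 = 2 * err;
        if (e2 >= dy) { err += dy; x0 += sx; }   /* step in x */
        if (e2 <= dx) { err += dx; y0 += sy; }   /* step in y */
    }
}
```

A bounding box is four draw_line calls, and a polygon is one call per edge. Text needs a font, so it is usually easier to leave that to a drawing library.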
I'm going to put a background on my embedded console application.
The question is simple: I just don't know how to do that :)
Suppose I have a picture in any format and want to convert it to a character array to print it.
As I said, I'm clueless on this one, so if this is a bad approach, please let me know. If there is a better solution, please suggest it. Thank you all!
You need to decide if you want an actual image as is, or if you want an image converted to text:
If an image converted to text is what you want, have a look at AAlib as @IlmariKaronen suggested, or maybe jp2a.
If you do want a "real" image, then you need to use a terminal emulator which supports changing the background image via a proprietary escape sequence. I think Eterm, for instance, supports this.
If you are using the Linux console without X11, there are other options related to FBDEV and bootsplash, but I don't know those as well.
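For the image-converted-to-text route, the core idea behind tools like AAlib and jp2a can be sketched in a few lines of C: map each pixel's brightness to a character of matching visual density. This sketch assumes the picture has already been decoded to 8-bit grayscale and scaled to the console size. The gray_to_ascii name and the particular character ramp are arbitrary choices for illustration, not what those tools actually use.

```c
#include <stddef.h>

/* Characters ordered from visually lightest to densest. */
static const char RAMP[] = " .:-=+*#%@";

/*
 * Convert an 8-bit grayscale image (one byte per pixel, row-major) into
 * newline-separated text. `out` must hold height * (width + 1) + 1 bytes.
 * Brighter pixels map to denser characters, which suits a dark console
 * with light text; reverse RAMP for the opposite colour scheme.
 */
void gray_to_ascii(const unsigned char *gray, int width, int height, char *out)
{
    const int levels = (int)sizeof RAMP - 1;    /* exclude the NUL */
    size_t n = 0;

    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            int v = gray[(size_t)y * width + x];
            out[n++] = RAMP[v * levels / 256];  /* 0..255 -> 0..levels-1 */
        }
        out[n++] = '\n';
    }
    out[n] = '\0';
}
```

The resulting buffer can be printed as-is or compiled into the application as a string constant. Because console characters are usually taller than they are wide, the source image typically needs to be squashed vertically first so the result does not look stretched.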
The current XML Helper in CakePHP doesn't give you the ability to specify whether whitespace should be significant or non-significant. Normally it wouldn't matter, but I'm working with a strict API that requires certain values to have no excess characters surrounding the value (no \n's or \t's). I'd like to modify the Cake source to support this, and if anyone has done this before and has any tips or advice on how to start, I'd appreciate it. Actually, I believe the most helpful thing would be a flowchart of how Cake comes together (i.e. it starts in index.php and flows through router.php or what-not). I'd like to get a better understanding of how Cake is constructed (even from a high level).
Thanks!
If you want to change one of the built-in helpers, just copy it into your /app/views/helpers/ directory and edit it from there. The version in your app will be used instead of the original.
This is in the C language.
I want to know how I can write a program to look up all the input fields of a website (any website) and then fill them in. I can write a simple web browser in VBS, but how can I analyse the input fields? Even better would be if I could click a field and have its name put in a box; that would be ideal.
Can anyone help? Thanks :)
Are you sure you want to do this in C?
I ask because it is not easy. First of all, you need to be able to run the HTTP GET request against the webpage you wish to view. For this, you probably need libcurl; you definitely don't want to be writing from scratch at any rate.
Next, you need to process the HTML you get, finding all input fields. You do NOT want to do this using regular expressions, if only for the sake of bobince's blood pressure. The bit you need to take away is that HTML is not a regular language, so you need a real parser. Enter libxml2, which also ships an HTML parser that tolerates markup that is not valid XML. I'm sure there are other XML libraries out there, and other libraries for parsing HTML.
Finally, having done that (got the fields etc) you need to be able to populate them and submit the correct request as per the ACTION and METHOD parameters of the FORM.
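To make the last step a bit more concrete: for an ordinary form (the default enctype, application/x-www-form-urlencoded), the submitted data is a string of name=value pairs joined with '&', with each name and value percent-encoded. Below is a minimal sketch of the encoding part in plain C. The form_encode name is made up for this example, and it encodes byte by byte, so it assumes the text is already in the character encoding the server expects.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Encode `s` for use as a name or value in an
 * application/x-www-form-urlencoded body:
 *   - ASCII letters, digits, '-', '_', '.', '*' are copied unchanged
 *   - a space becomes '+'
 *   - every other byte becomes %XX (uppercase hex)
 * Output is truncated at a character boundary if `out` is too small.
 * Returns the number of bytes written, excluding the terminating NUL.
 */
size_t form_encode(const char *s, char *out, size_t cap)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t n = 0;

    if (cap == 0)
        return 0;

    for (; *s != '\0'; s++) {
        unsigned char c = (unsigned char)*s;

        if (isalnum(c) || c == '-' || c == '_' || c == '.' || c == '*') {
            if (n + 1 >= cap)
                break;
            out[n++] = (char)c;
        } else if (c == ' ') {
            if (n + 1 >= cap)
                break;
            out[n++] = '+';
        } else {
            if (n + 3 >= cap)
                break;
            out[n++] = '%';
            out[n++] = hex[c >> 4];
            out[n++] = hex[c & 15];
        }
    }
    out[n] = '\0';
    return n;
}
```

Each field name and value would be encoded separately and then joined as name=value&name2=value2. For METHOD=POST that string is the request body (with libcurl, it can be passed via CURLOPT_POSTFIELDS), and for METHOD=GET it is appended to the ACTION URL after a '?'.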
This is of course assuming you know what the fields should be formatted with. And it also assumes nothing else is going on. If you have a javascript validated web form (I sincerely hope they're validating on the request too, but they might provide feedback via JS) you won't benefit from that (unless you're going to integrate JS, in which case you might as well write a browser).
This is not a trivial task and it is the reason there are accessibility standards for HTML, because otherwise it becomes tricky to interpret the form without human interaction.
Of course, this all assumes said html is well formed, which isn't always the case...
I might suggest another approach. BeautifulSoup is a well known Python web scraping library that works very well. Python as a language allows easier string manipulation too, which will dramatically cut down your development time. I'd suggest giving the need to use C some serious thought given the size and complexity of the task you want to undertake vs your need to get a result quickly. If you have a lot of time, by all means go for C.
I wonder if it is possible to embed dynamic text into Keynote'09? I want to create a new presentation and run this presentation with different text messages (depending on the time of the day and day of the month).
You can insert formulas in tables. I don't have the English version of Keynote open, so I can't tell you the exact names of the functions (I'm guessing). You can do something like
=IF(MINUTE(NOW()) > 30; "> 30" ; "<= 30")
See the formula help. If you tell me what you want to achieve, I can give you further details.
I'm not aware of any direct or easy method to achieve what you are asking for.
However, with AppleScript you can access and change at least the title and the body boxes of the slides. This should be done prior to the presentation.
If the 'dynamic' text is to appear in a text box, you could use some scripting to modify the presentation's XML directly. An older Keynote's XML schema should be reasonably well (but not wholly) documented in the iWork Programming Guide, but as the '09 file format is not backwards compatible I don't know how much that would help.
You could try using an Encapsulated PostScript (EPS) image file. PostScript is a real programming language. I don't know if Keynote will accept it (or if it will cache a bitmap), but Cocoa loads EPS, and Keynote is a Cocoa application.
On Mac OS X, an EPS file gets evaluated when it is opened and converted to a PDF in memory. This process can take a really long time, like 30 seconds, if this is the first time you've tried to open an EPS file since logging in.
Ah! Someone pointed out to me that you can embed Quartz Composer compositions into Keynote. This is a good way to do it.