Fetching data from websites and making it into .txt files - C

I have an assignment in which I have to fetch data from a site (somewhat like a news site), turn it into text files, and then list them using tags.
Could someone please provide me with some information/knowledge/keywords/instructions that can help me finish this (using C only)? Thank you.

The basic concept is this: you'll need to connect to the HTTP server on port 80, fetch the HTML, parse it, and store the information you want in files. To accomplish the fetching part, if you're using Windows, you can use the WinInet API. Otherwise go with cURL, which is a very popular, efficient, cross-platform solution.
The other tasks require a relatively good command of C, but it's nothing special - you'll need to work with strings to parse your results, then use C's file I/O (fopen/fprintf/fwrite) to save your stuff to disk.
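A minimal sketch of that first part, assuming libcurl is available (the URL and the output file name are placeholders): it fetches one page and writes the raw HTML to a text file with standard C file I/O. Extracting the bits you actually want would then be string work on that file or buffer.

#include <stdio.h>
#include <curl/curl.h>

/* libcurl hands us the response body in chunks; write each one to the file */
static size_t write_to_file(char *ptr, size_t size, size_t nmemb, void *userdata)
{
    return fwrite(ptr, size, nmemb, (FILE *)userdata);
}

int main(void)
{
    FILE *out = fopen("article.txt", "wb");      /* output .txt, name is just an example */
    if (!out) return 1;

    curl_global_init(CURL_GLOBAL_DEFAULT);
    CURL *curl = curl_easy_init();
    curl_easy_setopt(curl, CURLOPT_URL, "http://example.com/");   /* replace with the news site */
    curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_to_file);
    curl_easy_setopt(curl, CURLOPT_WRITEDATA, out);
    curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 1L);

    if (curl_easy_perform(curl) != CURLE_OK)
        fprintf(stderr, "fetch failed\n");

    curl_easy_cleanup(curl);
    curl_global_cleanup();
    fclose(out);
    return 0;
}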

Related

Google translate API phrase into every language using classic ASP

Good morning everyone,
I have something [maybe not so] unique that I want to do: translate a simple phrase like "Hello" into every language available under the Google Translate API. I basically want to capture the results and store them in a SQL Server database.
In the past, I have written bulk geocoding processes using ASP and that worked well, and I am thinking that I can do the same with the Translate API using a query string. However, there are really no great examples of it.
I am about to drop the languages and their codes into a table so that I can just loop through things, and then use a JSON parser since I am not using the latest version of SQL Server. I have done a crazy ASP SOAP implementation in the past that required authenticating twice, but I am thinking that this can be done differently.
I am just figuring that someone else out here might have had to slay this dragon before and any and all tips would be greatly appreciated.
Thanks
The Translation API only allows for one target language per request. Unfortunately, for your use case you would have to loop through the desired languages and send a separate request for each.
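If it helps, the loop itself is trivial in any language; here is a rough sketch in C with libcurl (the endpoint, parameter names and YOUR_API_KEY are assumptions to verify against the current Translation API docs, and in classic ASP the same loop would simply wrap whatever HTTP call you use). One request goes out per target language.

#include <stdio.h>
#include <curl/curl.h>

int main(void)
{
    /* in practice these codes would come from your languages table */
    const char *targets[] = { "es", "fr", "de", "ja" };
    const char *phrase = "Hello";
    char url[512];

    curl_global_init(CURL_GLOBAL_DEFAULT);
    CURL *curl = curl_easy_init();
    if (!curl) return 1;

    for (size_t i = 0; i < sizeof targets / sizeof targets[0]; i++) {
        /* one target language per request -- the API won't take several at once */
        snprintf(url, sizeof url,
                 "https://translation.googleapis.com/language/translate/v2"
                 "?key=YOUR_API_KEY&source=en&target=%s&q=%s",
                 targets[i], phrase);
        curl_easy_setopt(curl, CURLOPT_URL, url);
        if (curl_easy_perform(curl) != CURLE_OK)     /* JSON response prints to stdout by default */
            fprintf(stderr, "request for %s failed\n", targets[i]);
    }

    curl_easy_cleanup(curl);
    curl_global_cleanup();
    return 0;
}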

Usage of ngcsv in web apps

I really can't think of a scenario that would make me use ngcsv; it converts an array to a CSV file.
What I don't understand is why the server couldn't just return a proper file instead of passing the data through client code first. If you use it, I'd be happy to understand why this is useful.
Presumably it would work without access to the server since the code is all completely client side.
I'm planning on using it in a 100% offline app that relies on LocalStorage... so I guess there are many scenarios where it could be useful.

FileMaker - Asterisk Communication

Has anyone accomplished this?
The big picture would be to develop an entire Asterisk GUI from FileMaker, but right now I'm asking for your help connecting the two.
Asterisk controls our entire call center. I would like the info from incoming calls and queues to be written to a FileMaker database.
Disclaimer: I don't know the first thing about FileMaker. But if it's like any other programming language (which, from what I know, I'm not sure is true), let's look at the options for how we'd accomplish this generically with other programming languages...
If you just want the results of your calls, the CDRs (call detail records), you can configure Asterisk to output custom CDRs in cdr_custom.conf (check it out if you've generated the sample configurations).
Here's an example cdr_custom.conf:
[mappings]
Simple.csv => ${CSV_QUOTE(${EPOCH})},${CSV_QUOTE(${CDR(src)})},${CSV_QUOTE(${CDR(dst)})}
It will drop a file, typically in /var/log/asterisk/, unless you've changed the location in your configuration.
Then either restart Asterisk or, more gracefully, just reload the CDR module:
asterisk*CLI> cdr show status
asterisk*CLI> module reload cdr_custom.so
Using the resulting file, parse the CSV and format it in a friendly fashion for FileMaker / your language of choice.
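As a rough illustration of that step (in C, but any language will do), this sketch reads the Simple.csv defined above and splits each record back into epoch, source and destination. Real CDR fields can contain commas, so a production parser should be more careful than strtok.

#include <stdio.h>
#include <string.h>

/* remove the surrounding double quotes that CSV_QUOTE added */
static char *strip_quotes(char *s)
{
    size_t len = strlen(s);
    if (len >= 2 && s[0] == '"' && s[len - 1] == '"') {
        s[len - 1] = '\0';
        return s + 1;
    }
    return s;
}

int main(void)
{
    FILE *f = fopen("/var/log/asterisk/Simple.csv", "r");
    char line[1024];
    if (!f) return 1;

    while (fgets(line, sizeof line, f)) {
        line[strcspn(line, "\r\n")] = '\0';          /* drop the newline */
        char *epoch = strtok(line, ",");
        char *src   = strtok(NULL, ",");
        char *dst   = strtok(NULL, ",");
        if (epoch && src && dst)
            printf("time=%s from=%s to=%s\n",
                   strip_quotes(epoch), strip_quotes(src), strip_quotes(dst));
    }
    fclose(f);
    return 0;
}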
If you're looking for real-time information about calls, it does get more complicated. Probably for just reporting purposes, you can use the Asterisk AMI (Asterisk Manager Interface). (Canonical wiki page linked)
This is a TCP/IP service: open a socket to it and you're good to go. There's also the AJAM interface (Asynchronous Javascript Asterisk Manager), which you can make HTTP calls to.
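A bare-bones sketch of that socket conversation in C (POSIX sockets); 5038 is the AMI's default port, and "admin"/"secret" are placeholders for whatever your manager.conf defines. After a successful login you can write further actions (for example "Action: Status") and keep reading the event stream that comes back.

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <sys/socket.h>

int main(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(5038);                      /* AMI default port */
    inet_pton(AF_INET, "127.0.0.1", &addr.sin_addr);  /* the Asterisk host */

    if (fd < 0 || connect(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        perror("connect");
        return 1;
    }

    const char *login =
        "Action: Login\r\n"
        "Username: admin\r\n"      /* placeholder, see manager.conf */
        "Secret: secret\r\n\r\n";
    write(fd, login, strlen(login));

    char buf[4096];
    ssize_t n = read(fd, buf, sizeof buf - 1);        /* banner + login response */
    if (n > 0) {
        buf[n] = '\0';
        fputs(buf, stdout);
    }
    close(fd);
    return 0;
}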
Lastly, if you want to do further processing during the routing of the call via the dialplan, you'd want to use AGI (Asterisk Gateway Interface) which is called from the dialplan, and is all over STDIO.
Actually, you can create an ODBC connection to the Asterisk databases and use FileMaker to access the tables directly. It will give you a 'live' connection and save you all the import/export fuss. If you Google "filemaker odbc" you'll find guides on setting this up; it's quite easy (not always fast, depending on your query, but certainly a lot quicker than the manual method).

How to collect data from a website

Preface: I have broad, college-level knowledge of a handful of languages (C++, VB, C#, Java, many web languages), so go with whichever you like.
I want to make an Android app that compares numbers, but in order to do that I need a database. I'm a one-man team, and the numbers get updated biweekly, so I want to grab those numbers off of a wiki that gets updated as well.
So my question is: how can I access information from a website using one of the languages above?
What I understand the problem to be: Some entity generates a data set (i.e. numbers) every other week and you have a need to download that data set for treatment (e.g. sorting).
Ideally, the web site maintaining the wiki would provide a service, such as a RESTful interface, to easily gather the data. If that were the case, I'd go with any language that makes it easy to work with HTTP requests and responses and to manipulate the data. As a previous poster said, Java would work well.
If you are stuck with the wiki page, you have a couple of options. You can parse the HTML your browser receives (Perl comes to mind as a decent language for that). Or you can use tools built for that purpose such as the aforementioned Jsoup.
Your question also mentions some implementation details such as needing a database. Evidently, there isn't enough contextual information for me to know whether that's optimal, so I won't address this aspect of the problem.
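If you do end up hand-parsing, the core of it is just string searching. Here is a toy illustration in C that pulls the first table cell out of an HTML snippet (the snippet is made up, and for anything real the tools mentioned in the other answers, such as jsoup or BeautifulSoup, are a much better fit).

#include <stdio.h>
#include <string.h>

int main(void)
{
    /* pretend this came back from the wiki page */
    const char *html = "<tr><td>42.7</td><td>13.1</td></tr>";

    const char *open = strstr(html, "<td>");
    if (open) {
        open += strlen("<td>");
        const char *close = strstr(open, "</td>");
        if (close)
            printf("first cell: %.*s\n", (int)(close - open), open);
    }
    return 0;
}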
http://jsoup.org/ is a great Java tool for accessing content on HTML pages.
Consider https://scraperwiki.com/ - it's a site where users can contribute scrapers. It's free as long as you let your scraper be public. The results of your scraper are exposed as csv and JSON.
If you don't know what a "scraper" is, google "screen scraping" - it's a long and frustrating tradition for coders, who have dealt with the same problem you have since the beginning of networked computing.
You could check out http://web-harvest.sourceforge.net/
For Python, BeautifulSoup is one of the most tolerant HTML parsers out there. The documentation also lists similar libraries in Ruby and Java, so you'll probably find something relevant there.

Apache module FORM handling in C

I'm implementing an Apache 2.0.x module in C, to interface with an existing product we have. I need to handle FORM data, most likely using POST but I want to handle the GET case as well.
Nick Kew's Apache Modules book has a section on handling form data. It provides code examples for POST and GET, which return an apr_hash_t of the key+value pairs in the form. parse_form_from_POST marshalls the bucket brigade and flattens it into a buffer, while parse_form_from_GET can simply reference the URL. Both routines rely on a parse_form_from_string routine to walk through each delimited field and extract the information into the hash table.
That would be fine, but it seems like there should be an easier way to do this than adding a couple hundred lines of code to my module. Is there an existing module or routines within apache, apr, or apr-util to extract the field names and associated data from a GET or POST FORM into a structure which C code can more easily access? I cannot find anything relevant, but this seems like a common need for which there should be a solution.
I switched to G-WAN, which offers a transparent ANSI C script interface for GET and POST forms (and many other goodies like charts, GIF I/O, etc.).
A couple of AJAX examples are available on the G-WAN developer page.
Hope it helps!
While, on its surface, this may seem common, CGI-style content handlers in C on Apache are pretty rare. Most people just use CGI, FastCGI, or the myriad of frameworks such as mod_perl.
Most of the C apache modules that I've written are targeted at modifying the particular behavior of the web server in specific, targeted ways that are applicable to every request.
If it's at all possible to write your handler outside of an apache module, I would encourage you to pursue that strategy.
I have not yet tried any solution, since I found this SO question as a result of my own frustration with the example in the "Apache Modules" book as well. But here's what I've found, so far. I will update this answer when I have researched more.
Luckily it looks like this is now a solved problem in Apache 2.4, using the ap_parse_form_data function.
No idea how well this works compared to your example, but here is a much more concise read_post function.
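For reference, here is a sketch of what such a function can look like (an approximation built on Apache 2.4's ap_parse_form_data, not the code behind that link): it collects the submitted fields into a pool-allocated key/value array, flattening each value's bucket brigade into a plain string.

#include "httpd.h"
#include "http_protocol.h"
#include "apr_strings.h"
#include "apr_buckets.h"

typedef struct {
    const char *key;
    const char *value;
} form_pair;

/* Returns a NULL-terminated array of pairs, or NULL if parsing failed */
static form_pair *read_post(request_rec *r)
{
    apr_array_header_t *pairs = NULL;
    if (ap_parse_form_data(r, NULL, &pairs, -1, HUGE_STRING_LEN) != OK || !pairs)
        return NULL;

    form_pair *kvp = apr_pcalloc(r->pool, sizeof(form_pair) * (pairs->nelts + 1));
    int i = 0;
    while (!apr_is_empty_array(pairs)) {
        ap_form_pair_t *pair = (ap_form_pair_t *) apr_array_pop(pairs);
        apr_off_t len;
        apr_brigade_length(pair->value, 1, &len);     /* each value arrives as a bucket brigade */
        apr_size_t size = (apr_size_t) len;
        char *buffer = apr_palloc(r->pool, size + 1);
        apr_brigade_flatten(pair->value, buffer, &size);
        buffer[size] = '\0';
        kvp[i].key = apr_pstrdup(r->pool, pair->name);
        kvp[i].value = buffer;
        i++;
    }
    return kvp;
}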
It is also possible that mod_form could be of value.
