How to download web content using C?

I have to write a C parser for online blogs, with various word-manipulation features.
I know how to parse and tokenise strings in C, but how would I, at run time, download a page's content to a local /tmp directory as an HTML file, so that I can read the information (the blogs) into a string using file I/O?
Or, alternatively, just grab the block of text directly from the page I am viewing?
My system could be either Ubuntu or Windows 7, so I don't think wget will cut it. Please help.

Take a look at libcurl:
libcurl is a free and easy-to-use client-side URL transfer library, supporting [...] HTTP, HTTPS, [...]
libcurl is highly portable, it builds and works identically on numerous platforms, including [...] Linux, [...] Windows, [...]

Alternatively, you can use system() to execute wget.

And there is libsoup too.

MSDN: URLDownloadToFile

Related

Retrieve network camera stream URL in Linux

I am trying to retrieve the RTSP URLs of cameras on my network. I can do this using Onvif Device Manager on Windows, but how can I do it on Linux using C/C++ or a command-line tool? I have tried various libraries, e.g. onvifc (OpenCVR) and onvifcpplib, but none of them compiled on Linux, and none of them has API documentation. Any suggestions, please?

I was able to find a gsoap-onvif solution at https://github.com/tonyhu/gsoap-onvif. This program successfully retrieves parameters from most ONVIF-compliant cameras.

You can try python onvif; it covers some of the features you need, and possibly others such as PTZ.
You can also try another OpenCVR project, https://github.com/veyesys/h5stream; if you can't compile it, you can download it from SourceForge.
Good luck.

How to use Sphinx3 in an application

I used Sphinx4 for some time which really fits my needs. I load a recognizer, pass the audio data to it and use the recognized String in my application.
Right now I'm working on a C application (C++ is unfortunately not an option) where I need something similar and thought that I could use Sphinx3 which is written in C.
The problem is that I don't really know how it is used inside an application, and there is no "Hello World" example like the one Sphinx4 provides.
I already compiled and installed sphinxbase and sphinx3 and now I can include the sphinx header files in my application.
Now to my questions:
Is there a "simple" and well documented example application that uses sphinx3 from a C environment?
How can I load up the sphinx3 engine and call a recognizer with my binary audio data?
OR: Do I need to start an application like "sphinx3_decode" and call it from my own application? If so, is there an example application for that?
Thank you in advance!
Best regards,
Robert

It's not recommended to use Sphinx3. From the website:
Sphinx-3 is CMU’s large vocabulary speech recognition system. It’s
older C based decoder that we continue to maintain. It’s planned to
make it obsolete in the future, it’s still most accurate decoder for
large vocabulary tasks. We are using it as a baseline to check the
recognizer accuracy. This decoder is only intended for researchers who
want to evaluate bleeding edge methods in ASR like tree search method.
If you need to use a decoder you should use pocketsphinx. You can find the tutorial and the API documentation on the website
http://cmusphinx.sourceforge.net/wiki/tutorialpocketsphinx
http://cmusphinx.sourceforge.net/api/pocketsphinx/pocketsphinx_8h.html

I recently worked on an integrated project on the Punjabi language.
Here are the steps that we used:
First, we recorded the Punjabi audio data in a vacuumed room at a 16000 Hz sample rate.
Then we segmented the recorded data with the Praat software into small wav and raw files of 2 to 30 seconds each and saved them in a folder named train.
Then we took a Linux system (Ubuntu), installed the required tools such as autoconf and automake, and untarred Sphinx 3 along with four packages: cmuclmtk, pocketsphinx, sphinxbase, and sphinxtrain.
Then, based on the small wav files, we created a number of files: transcription, dic, phone, filler, file id, ccs, etc.
Then we opened a terminal and typed "sphinx_fe" to check whether Sphinx was functional.
Then we created a folder named "man" and changed to its path in the terminal.
Then we ran the command "sphinxtrain -t man setup". Running this command creates a folder named "etc" inside the "man" folder, containing the files "feat.params" and "config".
We changed the config file according to our data.
Then we moved all the files that we had created earlier (transcription, dic, and so on) into the etc folder inside the man folder.
Then we placed the "lang1.sh" script in the etc folder and the remaining four scripts in the man folder.
Then we opened the etc folder's path in the terminal and ran the command "lang1.sh".
Then we ran a series of commands in the terminal: "mfcgen2.sh", then "verify3.sh", then "hmm4.sh", and finally "end-test.sh" to get the final result.
As for the rest, if you have worked with Sphinx 4, you probably know the files mentioned in the steps above. I hope this helps.

Opening a file with the default viewer on Linux

I am working on an OS-independent file manager (mostly Windows and Linux), and I am wondering if there is a Linux way to open a file in its default viewer. For example, open an HTML file with Firefox, an .avi with VLC, and so on.
On Windows, there is a function ShellExecute() that does that, but my understanding is that on Linux it is not that simple, and each desktop environment has a specific way.
I would appreciate any help with this.

You could use xdg-open(1). It works on all freedesktop compliant desktops.
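A minimal sketch of calling it from C, assuming a POSIX system with xdg-open on the PATH. The function names run_opener and open_in_default_viewer are hypothetical; fork, execlp and waitpid are the standard POSIX calls. Using fork/exec instead of system() avoids passing the file name through a shell.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run `opener path` as a child process.
   Returns the child's exit status, or -1 if it could not be started
   or did not exit normally. */
int run_opener(const char *opener, const char *path)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: replace ourselves with the opener program. */
        execlp(opener, opener, path, (char *)NULL);
        _exit(127); /* only reached if exec failed, e.g. opener not installed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Open `path` with whatever the desktop considers the default application. */
int open_in_default_viewer(const char *path)
{
    return run_opener("xdg-open", path);
}
```

Keeping the opener as a parameter makes it easy to fall back to another command when xdg-open returns 127 (not found).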

The default programs for different mime-types are defined in /etc/mailcap and $HOME/.mailcap, indexed by file type and action (display, edit, print). The command line interface is run-mailcap. See also the manpages run-mailcap(1) and mailcap(5).

It depends what desktop environment you're using in Linux. Gnome for example has a MIME database you can use to find out what to launch for a given file.

Sahil Muthoo has given you good advice. I will just give further examples.
If xdg-open is not available you can also use "gnome-open" for GNOME and "kfmclient" for KDE.

How to make a binary that downloads a newer copy of itself? (limited conditions)

I would like to ask for advice.
A binary needs a self-update mechanism. Let's say the binary runs on host A and the update server is server B.
The obvious approach is to fork a bash script that uses a getter such as wget, ftp, or ncftp to download and replace the binary. But there are no such tools on A, and they will not be installed.
In short, I can't use any software tools external to the running binary. I can only hardcode the mechanism into the running binary.
While the binary is running, it can simply fetch the new binary (and an md5 file) over TCP sockets into a temporary file, compare the md5, and, if everything is OK, replace the binary and restart itself. It's easy to do, but I have a strange feeling about it.
Maybe someone can share some advice? Thank you in advance.
Conditions: the binary is written in pure C. The binary runs on FreeBSD and the update server runs CentOS. So Java, Python, C++, or anything else is available on the server side, but not on the FreeBSD side. To be honest, it is possible to install some tools on the client side and open the firewall for FTP, but I want to avoid that and hardcode the mechanism.
ADDED: It should be noted that the environment between A and B is secured (or so we think); in any case, security, access, spoofing, and sniffing problems are out of scope here. This is just a local update mechanism for a binary that we currently update centrally with expect scripts via ssh.

You will have to reimplement a whole host of functionality if you want to do so. My easiest suggestion would be to link to libcurl, hardcode the download path into your executable and write the image of your executable back to $ARGV[0]. However, you should definitely rethink your distribution concept: most distributions do some form of package management, and using it is the easiest alternative for all parties involved.
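For the replace-and-restart step, here is a small sketch, assuming the new image has already been downloaded to a temporary file on the same filesystem as the installed binary and its checksum has been verified. The names install_update and restart_self are hypothetical; chmod, rename and execv are standard POSIX calls. Renaming over the old path, rather than writing into the running file, swaps the directory entry in one step, so the running process keeps its old image until it restarts.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Put the verified file at `tmp_path` in place of `target_path`.
   Both paths must be on the same filesystem for rename() to work.
   Returns 0 on success, -1 on failure. */
int install_update(const char *tmp_path, const char *target_path)
{
    if (chmod(tmp_path, 0755) != 0)          /* make the new image executable */
        return -1;
    if (rename(tmp_path, target_path) != 0)  /* atomically replace the old one */
        return -1;
    return 0;
}

/* Re-execute the (now updated) binary with the original arguments.
   Only returns if execv() failed. */
int restart_self(const char *target_path, char *const argv[])
{
    execv(target_path, argv);
    return -1;
}
```

In main, this would be used roughly as: if install_update(tmp, argv[0]) succeeds, call restart_self(argv[0], argv). Note that argv[0] is only usable here if the program was started with a full path, which is an assumption about how it is launched.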

First of all, check whether you can modify a binary while a process is executing it; some systems do not allow it.
You say you cannot use external tools, so you probably cannot create a separate "updater program" that makes the change on behalf of your binary.
But you can probably download such a program (from wherever you want to download your update) and execute it (exec replaces the current process with the new one).
That executed process will download and update your main one, and then exec it.

Getting proxy information on Linux programmatically

I am currently using libproxy to get the proxy information (if any) on RedHat and Debian Linux. It doesn't work all that well, but it's the only way I know I can use to get the proxy information from my code.
I need to stop using the lib since in most cases it doesn't recognize the proxy.
Is there any other way to acquire the proxy information? What I mean is, is there a file (or group of files) I can read, or an environment variable, an API, or a system call that I can use to get the information?
GNOME-based code is OK, and KDE might help as well, but I am looking for something more generic.
The code is C.
Now, before anyone asks, I don't want to use libproxy anymore. Period. I don't want to start investigating why it doesn't work. I don't really want to know whether there is a new version of that lib. I know it might work, I just don't want to use it. I can't use it (just because). So please don't point me that way.
Code is appreciated.
thanks.

In Linux, the "global proxy setting" is typically just environment variables that are usually set in /etc/profile. You can examine those variables to see what proxy is set.
The variables are:
http_proxy - the proxy for HTTP connections
ftp_proxy - the proxy for FTP connections
Using the Network Proxy Preferences tool under GNOME saves information in the GConf database. The paths to the keys are /system/http_proxy and /system/proxy. You can read about the detail in those trees at this page.
You can access the GConf database using the library API. Note that GConf is based on GObject. To examine the contents of this tree using the command line, try the following:
gconftool-2 -R /system/http_proxy
This will provide a "name = value" listing of the tree, which may be usable in your application. Note that this requires a system() call, so it's not recommended for a deployed application, but it might help you get started.
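If you do go the command-line route while getting started, each line of that listing has to be split into its two parts. A minimal sketch, assuming every line has the form "name = value" with optional leading whitespace; the function name parse_name_value and the sample key names in the note below are hypothetical.

```c
#include <ctype.h>
#include <string.h>

/* Split one "name = value" line into `name` and `value`.
   Returns 0 on success, -1 if there is no " = " separator
   or one of the parts does not fit into its buffer. */
int parse_name_value(const char *line,
                     char *name, size_t name_size,
                     char *value, size_t value_size)
{
    /* Skip the indentation in front of the name. */
    while (isspace((unsigned char)*line))
        line++;

    const char *sep = strstr(line, " = ");
    if (sep == NULL)
        return -1;

    size_t name_len = (size_t)(sep - line);
    const char *val = sep + 3;               /* text after " = " */
    size_t value_len = strcspn(val, "\r\n"); /* stop at the line ending */

    if (name_len == 0 || name_len >= name_size || value_len >= value_size)
        return -1;

    memcpy(name, line, name_len);
    name[name_len] = '\0';
    memcpy(value, val, value_len);
    value[value_len] = '\0';
    return 0;
}
```

Fed a line such as " host = proxy.example.com" (a made-up sample), this yields the name "host" and the value "proxy.example.com". The lines themselves could be read with fgets() from a pipe or a file.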

GNOME has its own place to store the proxy settings, and I am sure KDE or any other DE has its own place too. Maybe you can look for any mention of where proxy settings should be stored in the Linux Standard Base. That could point you to a standard way of doing it, irrespective of distro or DE.
DE -> Desktop Environment

char* proxy = getenv("all_proxy");
This statement puts the value of the environment variable called all_proxy, which is used by the system as a global proxy, in your C variable.
To print it in bash, try env | grep 'all_proxy' | cut -d= -f 2.
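A slightly fuller sketch of the same idea: look at a protocol-specific variable first and fall back to all_proxy. The function name proxy_from_env is hypothetical, and the lookup order is an assumption rather than anything the system mandates; only getenv is standard C.

```c
#include <stddef.h>
#include <stdlib.h>

/* Return the value of `specific_var` (e.g. "http_proxy" or "ftp_proxy")
   if it is set and non-empty, otherwise the value of "all_proxy",
   otherwise NULL. The returned pointer belongs to the environment
   and must not be freed. */
const char *proxy_from_env(const char *specific_var)
{
    const char *names[2] = { specific_var, "all_proxy" };

    for (size_t i = 0; i < 2; i++) {
        const char *value = getenv(names[i]);
        if (value != NULL && value[0] != '\0')
            return value;
    }
    return NULL;
}
```

For example, proxy_from_env("http_proxy") gives the proxy for HTTP connections, or NULL when neither variable is set, in which case a direct connection can be assumed.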
