How to use TCP-based HTTP to download image? - c

I have some images to download over HTTP. I have the images' URLs. How do I build the TCP-based HTTP buffer to download an image?
There are no libraries on my current platform, and the only supported language is C, so I have to build the HTTP requests for these resources myself.
So far I have built the normal API requests. They are all HTTP requests, each with zero or more parameters. But an image request has only a URL, such as http://some-image.jpg. It seems to be a plain download job with no API parameters and no authorization. It looks simple, but how do I construct the request?

You would have to implement the HTTP protocol or a subset of it. There are open source implementations. For example:
https://github.com/bagder/curl/tree/master/lib
https://github.com/joshthecoder/libhttp

how to build the TCP-based HTTP buffer to download the image?
Stop thinking TCP. It has its own buffers, which have nothing to do with what's happening at the HTTP level.
You really don't want to implement your own HTTP stack - it's not trivial. There are several well-written ones already available - I'd recommend using libcurl.

Following http://www.jmarshall.com/easy/http/#sample, I built my request like this:
sprintf(tcp_send_buf, "GET %s HTTP/1.1\r\nHost: %s\r\n\r\n", img_path, img_host);
/* I wrapped the TCP APIs for convenience; I hope the calls are self-explanatory. */
set_host_and_port(img_host, 80);
tcp_send(tcp_send_buf, strlen(tcp_send_buf), recv_callback);
In my recv_callback, I got this server response:
HTTP/1.1 200 OK
Content-Length: 42299
Content-Type: image/jpeg
Last-Modified: Mon, 02 Jul 2007 07:58:47 GMT
Accept-Ranges: bytes
ETag: "e2c8b5d17ebcc71:15d5"
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
Date: Fri, 25 May 2012 01:33:57 GMT
<binary image data>
I also downloaded the image with Chrome, and its size matches Content-Length: 42299, so I think I got the complete image data.
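The request and the response handling described above can be sketched as follows. This is a minimal sketch, not the asker's code: the helper names build_get_request, find_body_offset and parse_content_length are hypothetical, and it assumes the complete header block has already been collected into one buffer (TCP may deliver the response in several pieces, so recv_callback can fire more than once). The Connection: close header is an addition so that the server closes the connection after the image.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: format a minimal HTTP/1.1 GET request.
 * Returns the request length, or -1 if it does not fit in buf. */
int build_get_request(char *buf, size_t cap, const char *host, const char *path)
{
    int n = snprintf(buf, cap,
                     "GET %s HTTP/1.1\r\n"
                     "Host: %s\r\n"
                     "Connection: close\r\n"
                     "\r\n",
                     path, host);
    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}

/* Hypothetical helper: return the offset of the first body byte (just past
 * the blank line that ends the headers), or -1 if the headers are incomplete.
 * Uses memcmp rather than strstr because image data may contain zero bytes. */
long find_body_offset(const char *resp, size_t len)
{
    for (size_t i = 0; i + 4 <= len; i++) {
        if (memcmp(resp + i, "\r\n\r\n", 4) == 0)
            return (long)(i + 4);
    }
    return -1;
}

/* Hypothetical helper: read Content-Length from a complete header block
 * (the first hdr_len bytes of resp). Returns -1 if the header is absent.
 * Assumes the exact spelling "Content-Length:"; servers may vary the case. */
long parse_content_length(const char *resp, size_t hdr_len)
{
    static const char key[] = "\r\nContent-Length:";
    size_t klen = sizeof key - 1;
    for (size_t i = 0; i + klen <= hdr_len; i++) {
        if (memcmp(resp + i, key, klen) == 0)
            return strtol(resp + i + klen, NULL, 10);
    }
    return -1;
}
```

With the response shown above, the image is complete once 42299 bytes have arrived after the body offset.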

Related

Cannot see the Http-Settings Header in the nghttp2 example

I am currently trying to learn nghttp2 and was trying to execute the client code which is provided at the bottom of this page:
https://nghttp2.org/documentation/tutorial-client.html
I executed the above C code doing the following:
./libevent-client URL
My server is Windows IIS 10.0, and I want to see the http2-settings frame in the header output. As of now, it shows the following output:
Connected
Request headers:
:method: GET
:scheme: https
:authority: MY URL
:path: /
Response headers from stream ID=1:
:status: 200
content-type: text/html
last-modified: Mon, 01 Jul 2019 17:57:17 GMT
accept-ranges: bytes
etag: "c7c5406c3630d51:0"
server: Microsoft-IIS/10.0
date: Mon, 08 Jul 2019 16:02:27 GMT
content-length:51
All headers received
<html><head>Hello</head><html>
I need to know what I need in the code to see whether the HTTP/2 settings are being passed along with the request. I know that the following function does the work of sending the SETTINGS frame with the request:
static void send_client_connection_header(http2_session_data *session_data) {
nghttp2_settings_entry iv[1] = {
{NGHTTP2_SETTINGS_MAX_CONCURRENT_STREAMS, 100}};
int rv;
/* client 24 bytes magic string will be sent by nghttp2 library */
rv = nghttp2_submit_settings(session_data->session, NGHTTP2_FLAG_NONE, iv,
ARRLEN(iv));
if (rv != 0) {
errx(1, "Could not submit SETTINGS: %s", nghttp2_strerror(rv));
}
}
I also don't know what tag is used for HTTP-Settings in the HTTP/2 protocol, the way ":method" is used for the method, ":scheme" for the scheme, etc. I couldn't find it even in the RFC.
The HTTP/2 SETTINGS frame is not an HTTP header. It is a separate message sent at the beginning of a connection, so it cannot be displayed as if it were an HTTP header.
HTTP/2 contains many such non-visible control frames:
SETTINGS frame to define how the connection is used
WINDOW_UPDATE frame to implement flow control
PRIORITY frame to reprioritise streams (and therefore responses).
...etc.
Typically browsers and servers do not show these Control messages to the user or even in developer tools. Chrome allows you to see them using the chrome://net-export URL, or you can use a network sniffing tool like Wireshark to see them.
One of the easiest ways however, and a very good way to learn HTTP/2 by examining the raw frames, is to use the nghttpd tool (which is part of the nghttp2 suite) to create an HTTP/2 server that can log any messages sent to it when run in verbose mode, like this:
nghttpd -v 443 server.key server.crt
I discuss how to do this in more depth in my book, which you can preview online for free for 5 mins a day at: https://livebook.manning.com/#!/book/http2-in-action/chapter-4/176
One thing I should add: when connecting over unencrypted HTTP/1.1 (as opposed to HTTPS) and then upgrading to HTTP/2, the settings are sent in an HTTP header (called HTTP2-Settings). This is a special case, though, and the message carrying it is an HTTP/1.1 message. Additionally, browsers only support HTTP/2 over HTTPS (for good reasons), and I see you are using HTTPS too. So I would ignore this; I only mention it for completeness' sake.
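To make the difference between a frame and a header concrete, here is a minimal sketch of decoding raw HTTP/2 frame bytes in C. It is not part of nghttp2; the struct and function names are hypothetical. The layout (a 9-byte frame header, SETTINGS as frame type 0x4, settings entries of a 16-bit identifier plus a 32-bit value) comes from RFC 7540. Over HTTPS these bytes are only visible after TLS decryption, which is why a tool such as Wireshark or nghttpd -v is the practical way to look at them.

```c
#include <stddef.h>
#include <stdint.h>

/* Frame type codes from RFC 7540, section 6. */
enum {
    H2_FRAME_PRIORITY      = 0x2,
    H2_FRAME_SETTINGS      = 0x4,
    H2_FRAME_WINDOW_UPDATE = 0x8
};

/* Hypothetical struct for the fixed 9-byte header that precedes every frame. */
struct h2_frame_header {
    uint32_t length;    /* 24-bit payload length */
    uint8_t  type;
    uint8_t  flags;
    uint32_t stream_id; /* 31 bits; 0 means the connection itself */
};

/* Decode the 9-byte frame header. Returns 0 on success, -1 if n < 9. */
int h2_parse_frame_header(const uint8_t *p, size_t n, struct h2_frame_header *out)
{
    if (n < 9)
        return -1;
    out->length = ((uint32_t)p[0] << 16) | ((uint32_t)p[1] << 8) | p[2];
    out->type = p[3];
    out->flags = p[4];
    out->stream_id = (((uint32_t)p[5] << 24) | ((uint32_t)p[6] << 16) |
                      ((uint32_t)p[7] << 8) | p[8]) & 0x7fffffffu;
    return 0;
}

/* Decode entry i of a SETTINGS payload (6 bytes per entry).
 * Returns 0 on success, -1 if the payload is malformed or i is out of range. */
int h2_parse_setting(const uint8_t *payload, size_t len, size_t i,
                     uint16_t *id, uint32_t *value)
{
    if (len % 6 != 0 || i >= len / 6)
        return -1;
    const uint8_t *e = payload + i * 6;
    *id = (uint16_t)((e[0] << 8) | e[1]);
    *value = ((uint32_t)e[2] << 24) | ((uint32_t)e[3] << 16) |
             ((uint32_t)e[4] << 8) | e[5];
    return 0;
}
```

The SETTINGS frame submitted by send_client_connection_header in the question would decode as type 0x4 on stream 0 with one entry: identifier 0x3 (SETTINGS_MAX_CONCURRENT_STREAMS) and value 100.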

GET request google

I'm trying to implement a simple web browser in C.
Whenever I send a GET request to google.com using
GET / HTTP/1.1\r\n\r\n
I receive
HTTP/1.1 302 Found
Cache-Control: private
Content-Type: text/html; charset=UTF-8
Location: http://www.google.co.in/?gfe_rd=cr&ei=1wIjWPqZA6DmugSY4I-IDw
Content-Length: 261
Date: Wed, 09 Nov 2016 11:04:55 GMT
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>302 Moved</TITLE></HEAD><BODY>
<H1>302 Moved</H1>
The document has moved
<A HREF="http://www.google.co.in/?gfe_rd=cr&amp;ei=1wIjWPqZA6DmugSY4I-IDw">here</A>.
</BODY></HTML>
Subsequently I send another GET request
GET /?gfe_rd=cr&ei=1wIjWPqZA6DmugSY4I-IDw HTTP/1.1\r\n\r\n
And I receive error code 404 not found.
If not this, what should the GET request be so that I get redirected to the site? I find the IP address of Google using
char *hostname = "www.google.com";
struct hostent *he;
he = gethostbyname( hostname );
You're requesting the wrong URL.
Take a closer look at the URL given in the Location header:
http://www.google.co.in/?gfe_rd=cr&ei=1wIjWPqZA6DmugSY4I-IDw
and the URL in the HTML source:
http://www.google.co.in/?gfe_rd=cr&amp;ei=1wIjWPqZA6DmugSY4I-IDw
You'll notice that the second of these is slightly different, because ampersands have to be encoded as &amp; in HTML documents.
If you use the URL in the Location header, you stand a better chance of success. However, you might still have problems if the server's behaviour depends on other factors. For example, a lot of websites will reject requests without a recognisable User-Agent request header.
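Reading the redirect target from the Location header rather than from the HTML body can be sketched like this. The helper names ieq_n and get_header_value are hypothetical, and the sketch assumes the header block is NUL-terminated and uses CRLF line endings.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Case-insensitive comparison of the first n bytes of a and b. */
static int ieq_n(const char *a, const char *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (tolower((unsigned char)a[i]) != tolower((unsigned char)b[i]))
            return 0;
    }
    return 1;
}

/* Hypothetical helper: copy the value of header `name` from a NUL-terminated
 * response into out. Header names are matched case-insensitively.
 * Returns 0 on success, -1 if the header is missing or does not fit. */
int get_header_value(const char *headers, const char *name, char *out, size_t cap)
{
    size_t nlen = strlen(name);
    const char *line = headers;
    while (line && *line) {
        const char *eol = strstr(line, "\r\n");
        size_t llen = eol ? (size_t)(eol - line) : strlen(line);
        if (llen == 0)
            break; /* blank line: end of the header block */
        if (llen > nlen && line[nlen] == ':' && ieq_n(line, name, nlen)) {
            const char *v = line + nlen + 1;
            const char *end = line + llen;
            while (v < end && (*v == ' ' || *v == '\t'))
                v++; /* skip whitespace after the colon */
            size_t vlen = (size_t)(end - v);
            if (vlen + 1 > cap)
                return -1;
            memcpy(out, v, vlen);
            out[vlen] = '\0';
            return 0;
        }
        line = eol ? eol + 2 : NULL;
    }
    return -1;
}
```

Note that the Location value in the question names www.google.co.in, so the follow-up request would presumably need to go to that host (with a matching Host header) rather than to the address resolved for www.google.com.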

POST Method got converted to OPTIONS automatically

Basically, I am sending a POST request, but it automatically gets converted to an OPTIONS request. I know the browser does this, and I have read that this is fine and that I should get a 201 response, but in my case it is not behaving as expected. I have also tried setting Access-Control-Allow-Methods in the request headers, but that made no difference.
This is what my Request looks like:
OPTIONS http://xyz/abc
Accept: application/json
Content-Type: application/json
Response:
405, Method Not Allowed
Access-Control-Allow-Origin: *
Date: Tue, 05 May 2015 06:15:19 GMT
Connection: close
Accept-Ranges: bytes
Access-Control-Allow-Headers: authorization, content-type
Content-Length: 0
Access-Control-Allow-Methods: GET, PUT, POST, DELETE, HEAD
Can anyone tell me the cause of this issue? After a fair amount of research, everything looks fine at my end.
Thanks in advance.
You are probably seeing the pre-flight check that a browser makes before a cross-origin POST request (cross-origin resource sharing, CORS). I don't know how your web server needs to be set up to support this, but this Wikipedia article might be a first help: http://en.wikipedia.org/wiki/Cross-origin_resource_sharing
The easiest solution is to send the POST request to the same origin that the web page is loaded from. A reverse proxy might be a reasonable way to do that.
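If you do control the server, the pre-flight has to be answered with a success status instead of 405. The following is only a sketch of what such a response could look like, written in C to match the rest of this page; the function names are hypothetical, the allowed methods and headers are copied from the response in the question, and how your actual server is configured to do this is not known here.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write a response to a CORS pre-flight (OPTIONS)
 * request. The key difference from the response in the question is the
 * 200 status instead of 405. Returns the number of bytes written, or -1
 * if buf is too small. */
int build_preflight_response(char *buf, size_t cap, const char *allow_origin)
{
    int n = snprintf(buf, cap,
                     "HTTP/1.1 200 OK\r\n"
                     "Access-Control-Allow-Origin: %s\r\n"
                     "Access-Control-Allow-Methods: GET, PUT, POST, DELETE, HEAD\r\n"
                     "Access-Control-Allow-Headers: authorization, content-type\r\n"
                     "Content-Length: 0\r\n"
                     "\r\n",
                     allow_origin);
    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}

/* Hypothetical helper: nonzero if the request line uses the OPTIONS method. */
int is_options_request(const char *request)
{
    return strncmp(request, "OPTIONS ", 8) == 0;
}
```

A server loop would call is_options_request on each incoming request and, for a pre-flight, send the output of build_preflight_response before the browser issues the real POST.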

UNIX C HTTP request returning 301 Moved Permanently

I am familiar with the 301 status code but new to HTTP requests and formatting them correctly.
In my program I need to retrieve my school's homepage, but I get a 301 Moved Permanently header. The header's Location says where the page moved to, but even that new location won't work for me, probably because I didn't format it correctly.
Initially I send this request:
GET / HTTP/1.1\r\nHost: www.cs.uregina.ca\r\nConnection: close\r\n\r\n
And receive this header:
Received: HTTP/1.1 301 Moved Permanently
Date: Tue, 04 Nov 2014 05:38:42 GMT
Server: Apache
Location: http://www.cs.uregina.ca/
Connection: close
Transfer-Encoding: chunked
Content-Type: text/html; charset=utf-8
What should my new HTTP request look like to get the above moved webpage?
If I try the location of the moved page as it suggests, then I get the following 400 Bad Request response:
GET / HTTP/1.1\r\nHost: http://www.cs.uregina.ca\r\nConnection: close\r\n\r\n
Received: HTTP/1.1 400 Bad Request
Date: Tue, 04 Nov 2014 05:52:36 GMT
Server: Apache
Content-Length: 334
Connection: close
Content-Type: text/html; charset=iso-8859-1
Initially I send this request:
GET / HTTP/1.1\r\nHost: www.cs.uregina.ca\r\nConnection: close\r\n\r\n
And receive this header:
Received: HTTP/1.1 301 Moved Permanently
...
Location: http://www.cs.uregina.ca/
...
This is exactly what I get when I request cs.uregina.ca. You have probably connected to cs.uregina.ca (or some subdomain other than www), or to an IP address that does not correspond to www.cs.uregina.ca.
If I try the location of the moved page as it suggests, then I get the following 400 Bad Request response:
GET / HTTP/1.1\r\nHost: http://www.cs.uregina.ca\r\nConnection: close\r\n\r\n
Received: HTTP/1.1 400 Bad Request
...
This is not surprising. You must remove the http:// protocol prefix from the Host: header. E.g.:
GET / HTTP/1.1\r\nHost: www.cs.uregina.ca\r\nConnection: close\r\n\r\n
In general, when requesting a URL such as the following:
http://domain.example:80/path/to/resource/?query#fragment
----   -------------- ==------------------------
protocol    host      |           path
                     port
you would:
Resolve the host name to an IP address, and connect to that IP address on port (if present in the URL) or on the default port associated with the protocol.
Communicate with the server using a mechanism specific to protocol. In this case, an HTTP request.
Request path from the server with an appropriate Host: header (in case there are multiple hosts on the same IP).
The fragment identifier is used with (X)HTML and is not actually sent to the server.
The request should (at a minimum) look like this:
GET /path/to/resource/?query HTTP/1.1
Host: domain.example
Connection: close
The full details can be found in:
RFC 7230: Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing.
RFC 7231: Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content.
RFC 7232: Hypertext Transfer Protocol (HTTP/1.1): Conditional Requests.
RFC 7233: Hypertext Transfer Protocol (HTTP/1.1): Range Requests.
RFC 7234: Hypertext Transfer Protocol (HTTP/1.1): Caching.
RFC 7235: Hypertext Transfer Protocol (HTTP/1.1): Authentication.
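The URL breakdown and the minimal request described in this answer can be sketched in C as follows. The struct and function names are hypothetical, and the sketch makes simplifying assumptions: only http:// URLs, no user info, and no IPv6 address literals in the host.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the pieces named in the URL diagram. */
struct url_parts {
    char host[256];
    int  port;
    char path[1024]; /* path plus ?query, without #fragment */
};

/* Split an http:// URL. Returns 0 on success, -1 if it cannot be handled. */
int split_http_url(const char *url, struct url_parts *out)
{
    static const char scheme[] = "http://";
    if (strncmp(url, scheme, sizeof scheme - 1) != 0)
        return -1;
    const char *host = url + sizeof scheme - 1;

    size_t authority_len = strcspn(host, "/?#"); /* host[:port] */
    size_t host_len = authority_len;
    out->port = 80; /* default port for http */
    const char *colon = memchr(host, ':', authority_len);
    if (colon) {
        host_len = (size_t)(colon - host);
        out->port = atoi(colon + 1);
        if (out->port <= 0 || out->port > 65535)
            return -1;
    }
    if (host_len == 0 || host_len >= sizeof out->host)
        return -1;
    memcpy(out->host, host, host_len);
    out->host[host_len] = '\0';

    const char *rest = host + authority_len;
    size_t rest_len = strcspn(rest, "#"); /* the fragment is never sent */
    size_t lead = (*rest == '/') ? 0 : 1; /* no path given: request "/" */
    if (lead + rest_len + 1 > sizeof out->path)
        return -1;
    out->path[0] = '/';
    memcpy(out->path + lead, rest, rest_len);
    out->path[lead + rest_len] = '\0';
    return 0;
}

/* Format the minimal request. The Host header carries the port only when
 * it is not the default. Returns the length, or -1 if buf is too small. */
int format_request(char *buf, size_t cap, const struct url_parts *u)
{
    int n;
    if (u->port == 80)
        n = snprintf(buf, cap,
                     "GET %s HTTP/1.1\r\nHost: %s\r\nConnection: close\r\n\r\n",
                     u->path, u->host);
    else
        n = snprintf(buf, cap,
                     "GET %s HTTP/1.1\r\nHost: %s:%d\r\nConnection: close\r\n\r\n",
                     u->path, u->host, u->port);
    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}
```

The host field is what gets resolved and connected to (on the port field); only the path field and the Host header appear in the request itself.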
If you just want the homepage, download nc and type "nc www.cs.uregina.ca 80"
When nc starts type the following and then hit return twice:
GET http://www.cs.uregina.ca HTTP/1.0

How to live stream video using C program. What should be the HTTP reply ? How can I use chunked encoding if possible?

(the actual question has been edited because I was successful doing live streaming, BUT NOW I DO NOT UNDERSTAND THE COMMUNICATION between client and my C code.)
Okay I finally did live streaming using my C code. BUT I COULD NOT UNDERSTAND HOW "HTTP" IS WORKING HERE.
I studied the communication b/w my browser and the server at the link http://www.flumotion.com/demosite/webm/ using wireshark.
I found that the client first sends this GET request
GET /ahiasfhsasfsafsgfg.webm HTTP/1.1
Host: localhost
Connection: keep-alive
Referer: file:///home/anirudh/Desktop/anitom.html
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US) AppleWebKit/534.13 (KHTML, like Gecko) Chrome/9.0.597.98 Safari/534.13
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3
Range: bytes=0-1024
to this get request the server responds by sending this reply
HTTP/1.0 200 OK
Date: Tue, 01 Mar 2011 06:14:58 GMT
Connection: close
Cache-control: private
Content-type: video/webm
Server: FlumotionHTTPServer/0.7.0.1
and then the server sends the data until the client disconnects. The client disconnects when it has received a certain amount of data. The client then connects to the server again from a new port and sends the same GET request. The server again gives the same reply, but this time the client does not disconnect and instead keeps reading packets until the server disconnects. I wrote C code with a server socket that replicates the above behavior. (thanks to wireshark, flumotion and stackoverflow)
BUT BUT BUT, I could not understand why the client needs to send two requests, why it resets the connection on the first request, and why it then sends the same request from a new port and this time reads the data as if it were being live streamed.
Also I do not know how I can live stream using chunked encoding.
The same thing in detail is available here : http://systemsdaemon.blogspot.com/2011/03/live-streaming-video-tutorial-for.html
and here http://systemsdaemon.blogspot.com/2011/03/http-streaming-video-using-program-in-c.html
Please help me out. Thanks in advance.
The first request is limited to about 1 KB (the Range: bytes=0-1024 header in the GET request above) in order to test that the stream is actually a valid video source and not, say, a 600MB Windows executable.
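On the chunked-encoding part of the question, here is a minimal sketch of how a single chunk is framed in HTTP/1.1: the payload length in hexadecimal, CRLF, the payload, CRLF. The names encode_chunk and CHUNKED_HEADERS are hypothetical. Note that chunked encoding is an HTTP/1.1 feature; the Flumotion reply shown in the question is HTTP/1.0 with Connection: close, where the end of the body is signalled by closing the connection instead.

```c
#include <stdio.h>
#include <string.h>

/* Response headers a chunked stream could start with (an assumption,
 * modelled on the reply in the question but using HTTP/1.1). */
static const char CHUNKED_HEADERS[] =
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: video/webm\r\n"
    "Transfer-Encoding: chunked\r\n"
    "\r\n";

/* Hypothetical helper: frame len bytes of data as one chunk.
 * Calling it with len == 0 produces "0\r\n\r\n", which ends the stream.
 * data must be a valid pointer even when len is 0.
 * Returns the number of bytes written, or -1 if out is too small. */
long encode_chunk(char *out, size_t cap, const void *data, size_t len)
{
    char head[32];
    int hlen = snprintf(head, sizeof head, "%zx\r\n", len);
    if (hlen < 0 || (size_t)hlen >= sizeof head)
        return -1;
    size_t total = (size_t)hlen + len + 2;
    if (total > cap)
        return -1;
    memcpy(out, head, (size_t)hlen);        /* chunk size line */
    memcpy(out + hlen, data, len);          /* chunk payload   */
    memcpy(out + hlen + len, "\r\n", 2);    /* chunk trailer   */
    return (long)total;
}
```

A live-streaming server would send CHUNKED_HEADERS once and then call encode_chunk for each block of video data as it becomes available, sending the result over the socket.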

Resources