Receive HTTP response messages using OpenSSL in C

I have a problem receiving the HTTP response message from a website.
This is my function:
void Reveive_response(char *resp, SSL *ssl) {
const int BUFFER_SIZE = 1024;
char response[1048576];
char *buffer = NULL; // to read from ssl
char *check = (char *) malloc(BUFFER_SIZE*sizeof(char));
int bytes; // number of bytes actually read
int received = 0; // number of bytes received
buffer = (char *) malloc(BUFFER_SIZE*sizeof(char)); // malloc
memset(response, '\0', sizeof(response)); // response assign = '\0'
do{
memset(buffer, '\0', BUFFER_SIZE); // empty buffer
bytes = SSL_read(ssl, buffer, BUFFER_SIZE);
if (bytes < 0) {
printf("Error: Receive response\n");
exit(0);
}
if (bytes == 0) break;
received += bytes;
printf("Received...%d bytes\n", received);
strncat(response, buffer, bytes); // concat buffer to response
} while (SSL_pending(ssl)); // while pending
response[received] = '\0';
printf("Receive DONE\n");
printf("Response: \n%s\n", response);
free(buffer);
strcpy(resp, response); // return via resp
}
When I call the function, the response message appears to be incomplete, like this:
Received...1014 bytes
Received...1071 bytes
Receive DONE
Response:
HTTP/1.1 200 OK
<... something else....>
Vary: Accept-Encoding
Content-Type: text/html
Conne
Then if I call the function again, it returns:
Received...39 bytes
Receive DONE
Response:
ction: keep-alive
Content-Length: 0
The Connection field was split. Why didn't my function receive the whole response message? I used a do-while loop inside. Where did I go wrong? Thank you.

There is nothing wrong. This is simply how TCP works. It is a streaming transport and has no concept of message boundaries, so there is no 1-to-1 relationship between the number of bytes sent and the number of bytes read. Each read returns an arbitrary number of bytes, which you are then responsible for processing as needed. Keep reading, buffering and parsing the HTTP data as you go, until you discover the end of the response (see RFC 2616 Section 4.4 Message Length for details). Looping on SSL_pending() is not sufficient (or correct): it only reports bytes that OpenSSL has already buffered and can return immediately, not data that is still on its way from the server.
In this case, you have to read CRLF-delimited lines one at a time until you reach a CRLF/CRLF pair indicating the end of the response headers, then you need to analyze the headers you have received to know whether a response body is present and how to read it, as it may be in one of several different encoded formats. If present, you can then read the body (decoding it as you go along) until you reach the end of the body as specified by the headers.
See the pseudo-code I posted in my answer to the following question:
Receiving Chunked HTTP Data With Winsock
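As a rough illustration of the buffer inspection described above, here is a minimal sketch that is independent of the transport. It assumes you keep appending whatever SSL_read() returns to one growing buffer and call these helpers after each read. The names find_header_end and parse_content_length are hypothetical, and the sketch only covers the Content-Length case; a chunked body (Transfer-Encoding: chunked) or a body terminated by connection close needs separate handling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>  /* strncasecmp (POSIX) */

/* Returns the offset just past the "\r\n\r\n" that ends the headers,
 * or 0 if the terminator has not arrived yet (keep reading). */
size_t find_header_end(const char *buf, size_t len)
{
    assert(buf != NULL);
    for (size_t i = 0; i + 4 <= len; i++) {
        if (memcmp(buf + i, "\r\n\r\n", 4) == 0)
            return i + 4;
    }
    return 0;
}

/* Scans the header block line by line for a Content-Length header.
 * Returns its value, or -1 if the header is absent or malformed. */
long parse_content_length(const char *headers, size_t header_len)
{
    static const char name[] = "content-length:";
    const size_t name_len = sizeof name - 1;
    size_t pos = 0;

    assert(headers != NULL);
    while (pos < header_len) {
        /* Find the end of the current line. */
        size_t eol = pos;
        while (eol < header_len && headers[eol] != '\n')
            eol++;

        /* Header names are case-insensitive. */
        if (eol - pos > name_len &&
            strncasecmp(headers + pos, name, name_len) == 0) {
            char value[32];
            size_t n = eol - (pos + name_len);
            char *end;
            long v;

            if (n >= sizeof value)
                return -1;
            /* Copy the value so strtol sees a NUL-terminated string. */
            memcpy(value, headers + pos + name_len, n);
            value[n] = '\0';
            v = strtol(value, &end, 10);
            if (end == value || v < 0)
                return -1;
            return v;
        }
        pos = eol + 1;
    }
    return -1;
}
```

With these two values the read loop knows when to stop: the response is complete once the buffer holds the header length plus the Content-Length bytes, no matter how many SSL_read() calls it took to get there.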
That said, you really should not be implementing HTTP (let alone HTTPS) manually to begin with. HTTP is not trivial to implement from scratch, and neither is SSL/TLS for that matter. You have dived head-first into a deep well without understanding some important basics of network programming and OpenSSL programming. You should use an existing HTTP/S library instead, such as libcurl, and let it handle the details for you so you can focus on your code's business logic and not its communications logic.

Related

C Sockets: recv() blocks when all data is downloaded [duplicate]

This question already has an answer here:
Differ between header and content of http server response (sockets)
(1 answer)
Closed 4 years ago.
I'm writing a wrapper for Berkeley sockets on Windows and Linux. The test program has a problem here:
char buf[BUFSIZE];
int res = 0;
while((res = NetRecv(sock, buf, BUFSIZE, 0)) > 0) // 'NetRecv' is pointing to 'recv'
{
buf[res-1] = '\0';
printf("%s", buf);
}
The response is to an HTTP GET request for a web page. The socket is a stream socket.
'NetRecv' is initialized correctly; I've checked that there is no type mismatch between the function pointers.
The Windows version works flawlessly, but the Linux one gets stuck after reading the whole page. Namely, the second-to-last 'NetRecv' call receives the last chunk of the response and prints it, and the next (last) call just blocks. Closing the terminal raises a 'SIGHUP' signal.
It looks like the Linux version doesn't realize that it has received the last chunk of data and waits for more.
Is this how it should be? If so, I don't understand why a blocking call is even possible here.
Now, I could certainly make the call non-blocking and use 'select', but do I really have to?
Thanks in advance)
EDIT: Minimal working example (all checks are omitted and net functions are the standard ones, which also were tested):
int sock = socket(AF_INET, SOCK_STREAM, 0);
// Here getting good IP address of google.com - no problem here
char serv_ip[IPADDR_BUFSIZE];
GetHostAddrByName(AF_INET, "www.google.com", serv_ip, IPADDR_BUFSIZE);
// arguments above: IP version, site, out buf, out buf size
// The routine above is made with 'getaddrinfo', to be precise
printf("Current IP of '%s' is '%s'.\n", SERV_URL, serv_ip);
// Copying IP string to address struct
struct sockaddr_in addr;
NetIpFromStr(AF_INET, serv_ip, &addr.sin_addr);
addr.sin_family = AF_INET;
addr.sin_port = NetHtons(80);
connect(sock, (const struct sockaddr*)&addr, sizeof(addr));
const char* msg = "GET / HTTP/1.1\r\n\r\n";
send(sock, msg, strlen(msg), 0);
char buf[BUFSIZE];
int res = 0;
while((res = recv(sock, buf, BUFSIZE-1, 0)) > 0)
{
buf[res] = '\0';
printf("%s", buf);
}
EDIT 2: Important notice: the Windows version also blocks on the call once all the data has been read. Closing the terminal just doesn't crash the program there, as it does on Linux. Therefore, the real question is: how do I know when all the data has been read?
The problem is that you are blindly reading from the socket in a loop until an error occurs. Once you have received the entire response, you go back to the socket and keep reading, which then blocks because there is nothing left to read. The only error that can occur at this point is when the connection is closed (or lost), which the server is likely not doing since you are sending an HTTP 1.1 request, where keep-alive is the default behavior (see RFC 2616 Section 8.1 Persistent Connections).
The correct solution is to parse the HTTP response and stop reading from the socket when you reach the end of the response, NOT simply relying on the server to close the socket. Read RFC 2616 Section 4.4 Message Length for how to detect when you have reached the end of the response. DO NOT read more than the response indicates! Once you stop reading, then you can decide whether to close your end of the socket, or reuse it for a new request.
Have a look at this pseudo code for the type of parsing and reading logic you need to use.
Also, your HTTP request is malformed, as you are not sending a required Host header, so no matter what, you will always receive a 400 Bad Request response from any HTTP 1.1 compliant server:
const char* msg = "GET / HTTP/1.1\r\n"
"Host: www.google.com\r\n" // <-- add this!
"\r\n";
The solution was to shut down the sending side of the socket after the request, both in Windows and Linux:
// after sending a request:
shutdown(sock, SD_SEND); // or 'SHUT_WR' in Linux
// now read loop
Curiously, 'shutdown' was called in Winsock tutorials too, but I thought that was unnecessary.
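For illustration, the half-close approach might look like the following POSIX sketch (on Winsock the constant would be SD_SEND instead of SHUT_WR). The function names are hypothetical. Note the assumption it rests on: the read loop only ends if the peer reacts to the half-close by finishing its reply and closing its own side, which is server-dependent. Parsing the response length, as the other answer describes, does not depend on that.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Sends the whole request, then half-closes the socket for writing.
 * The peer sees end-of-stream on its read side, but we can still receive.
 * Returns 0 on success, -1 on error. */
int send_and_half_close(int fd, const char *req)
{
    size_t len, off = 0;

    assert(req != NULL);
    len = strlen(req);
    while (off < len) {
        ssize_t n = send(fd, req + off, len - off, 0);
        if (n < 0)
            return -1;
        off += (size_t)n;
    }
    return shutdown(fd, SHUT_WR);
}

/* Reads until the peer closes its sending side (recv returns 0) or the
 * buffer is full. Returns the number of bytes stored, or -1 on error.
 * A return value equal to cap may mean the data was truncated. */
long read_until_eof(int fd, char *out, size_t cap)
{
    size_t total = 0;

    assert(out != NULL);
    while (total < cap) {
        ssize_t n = recv(fd, out + total, cap - total, 0);
        if (n < 0)
            return -1;
        if (n == 0)
            break;  /* orderly shutdown by the peer */
        total += (size_t)n;
    }
    return (long)total;
}
```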

HTTP proxy implementation: content encoding error

I am implementing a proxy. I can receive the response from the server, but fail to send it on to the client.
In more detail, I can only send the response headers, but fail to send the message body, and the web page shows 'content encoding error'.
// I can send the request to the server successfully.
send(connfd_to_server, request, strlen(request), 0);
//receive response from server
char res_buf[1024];
while(1){
bzero(res_buf, 1024);
if(recv(connfd_to_server, res_buf, sizeof(res_buf),0) <=0){
break; //if recv failed, then message body is finished.
} //receive response using recv
send(connfd_to_client, res_buf, strlen(res_buf));
}
I also tried:
char* response = (char*)malloc(strlen(res_buf));
char* res_line;
res_line = strtok(res_buf, "\r\n");
for(int i = 0; i<=11; i++){
strcat(response, res_line);
strcat(response, "\r\n");
res_line = strtok(NULL, "\r\n");
} //copy header content using strcat
while(res_line!= NULL){
memcpy(response, res_line, sizeof(res_line));
res_line = strtok(NULL, "\r\n");
} //copy message body as bytes using memcpy
I then send the response to the client using the send function.
However, no matter which approach I use, the message body is not sent successfully.
like shown in a weird symbol above
Any hints?
Many thanks in advance
The function recv returns the number of bytes read, which you should use as the length when calling send.
You used strlen, which finds the end of the buffer by looking for a null character; that is not valid here.
The content encoding says gzip, so the HTTP response body coming from the server is binary data that may contain null characters. Because you pass strlen() as the length to send(), you may end up sending only part of the data, which your client will not be able to decode.
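A minimal sketch of a forwarding loop that uses the byte count from recv() rather than strlen() could look like this. The helper names send_all and relay are hypothetical. The sketch relays until the source side closes, so with a keep-alive server it would still need the response length from the headers to know when to stop.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* send() may accept fewer bytes than requested, so loop until all of
 * them are written. Returns 0 on success, -1 on error. */
int send_all(int fd, const char *data, size_t len)
{
    assert(data != NULL);
    while (len > 0) {
        ssize_t n = send(fd, data, len, 0);
        if (n < 0)
            return -1;
        data += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Copies bytes from from_fd to to_fd until from_fd reports end-of-stream.
 * The data is treated as opaque bytes, so embedded NULs (for example in a
 * gzip body) are forwarded intact.
 * Returns the number of bytes forwarded, or -1 on error. */
long relay(int from_fd, int to_fd)
{
    char buf[1024];
    long total = 0;

    for (;;) {
        ssize_t n = recv(from_fd, buf, sizeof buf, 0);
        if (n < 0)
            return -1;
        if (n == 0)
            break;  /* source closed its sending side */
        if (send_all(to_fd, buf, (size_t)n) < 0)
            return -1;
        total += n;
    }
    return total;
}
```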

(C Socket Programming) Separate send() calls from server ending up in same client recv() buffer

I was wondering if anyone could shed any light on why two separate send() calls would end up in the same recv() buffer when using the loopback address for testing, yet once switched to two remote machines they would require two recv() calls instead? I have been looking at the Wireshark captures yet can't seem to make any sense of why this would be occurring. Perhaps someone could critique my code and tell me where I'm going wrong. The two incoming messages from the server are of a length unknown to the client. By the way, I'm using BSD sockets in C on Ubuntu.
In the example shown below I'm parsing the entire buffer to extract the two separate messages from it, which I'll admit isn't an ideal approach.
-------SERVER SIDE--------
// Send greeting string and receive again until end of stream
ssize_t numBytesSent = send(clntSocket, greeting, greetingStringLen, 0);
if (numBytesSent < 0)
DieWithSystemMessage("send() failed");
//-----------------------------Generate "RANDOM" Message -----------------------
srand(time(NULL)); //seed random number from system clock
size_t randomStringLen = rand() % (RANDOMMSGSIZE-3); //generates random num
// between 0 and 296
char randomMsg [RANDOMMSGSIZE] = "";
// declare and initialize allowable character set for the
const char charSet[] = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
if (randomStringLen) {
--randomStringLen;
for (size_t i = 0; i < randomStringLen; i++) {
int p = rand() % (int) (sizeof charSet - 1);
randomMsg[i] = charSet[p];
}
randomStringLen = strlen(randomMsg);
printf("Random String Size Before newline: %d\n", (int)randomStringLen);
strcat(randomMsg,"\r\n");
}
randomStringLen = strlen(randomMsg);
printf("Random String: %s\n", randomMsg);
//-----------------------------Send "RANDOM" Message ---------------------------
// Send greeting string and receive again until end of stream
numBytesSent = send(clntSocket, randomMsg, randomStringLen, 0);
if (numBytesSent < 0)
DieWithSystemMessage("send() failed");
//------------------------------------------------------------------------------
------CLIENT SIDE-------
//----------------------------- Receive Server Greeting ---------------------------
char buffer[BUFSIZE] = ""; // I/O buffer
// Receive up to the buffer size (minus 1 to leave space for
// a null terminator) bytes from the sender
ssize_t numBytesRcvd = recv(sock, buffer, BUFSIZE - 1, 0);
if (numBytesRcvd < 0)
DieWithSystemMessage("recv() failed");
buffer[numBytesRcvd] = '\0'; //terminate the string after calling recv()
printf("Buffer contains: %s\n",buffer); // Print the buffer
//printf("numBytesRecv: %d\n",(int)numBytesRcvd); // Print the buffer
//------------------------ Extracts the random message from buffer ---------------------------
char *randomMsg = strstr(buffer, "\r\n"); // searches for first occurrence of substring
char randomMessage [BUFSIZE] = "";
strcat(randomMessage, randomMsg+2);
int randomStringLen = strlen(randomMessage)-2;
printf("Random Message: %s\n",randomMessage); // Print the buffer
char byteSize [10];
sprintf(byteSize,"%d", randomStringLen);
printf("ByteSize = %s\n",byteSize);
//----------------------- Send the number of random bytes received -------------------------
size_t byteStringLen = strlen(byteSize); // Determine input length
numBytes = send(sock, byteSize, byteStringLen, 0);
if (numBytes < 0)
DieWithSystemMessage("send() failed");
else if (numBytes != byteStringLen)
DieWithUserMessage("send()", "sent unexpected number of bytes");
shutdown(sock,SHUT_WR); // further sends are disallowed yet receives are still possible
//----------------------------------- Receive Cookie ----------------------------------------
On Unix systems recv and send are just special cases of read and write that accept additional flags. (Windows also emulates this with Winsock).
You shouldn't assume that one recv corresponds to one send, because that generally isn't true (just like you can read a file in multiple parts, even if it was written in a single write). Instead, if it's important to know where the separate messages begin and end, you should start each "message" with a header that tells you how long the message is; if it's not important, just read the stream like a normal file.
TCP is a byte-stream protocol, not a message protocol. There is no guarantee that what you write with a single send() will be received via a single recv(). If you need message boundaries you must implement them yourself, e.g. with a length-word prefix, a type-length-value protocol, or a self-describing protocol like XML.
You're experiencing a TCP congestion avoidance optimization commonly referred to as Nagle's algorithm (named after John Nagle, its inventor).
The purpose of this optimization is to reduce the number of small TCP segments circulating over a socket by combining them into larger ones. When you write()/send() on a TCP socket, the kernel may not transmit your data immediately; instead it may hold a small write back briefly, until previously sent data has been acknowledged or enough data has accumulated to fill a segment, in case more data follows.
You may disable Nagle's algorithm on a per-socket basis, by setting the TCP_NODELAY option.
It is customary to disable Nagle in latency-sensitive applications (remote control applications, online games, etc.).
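For reference, setting the option is a single setsockopt() call. The wrapper names below are hypothetical. Keep in mind that disabling Nagle only affects when the sender transmits; it does not restore message boundaries, so the receiver still has to split the stream into messages itself.

```c
#include <assert.h>
#include <netinet/in.h>   /* IPPROTO_TCP */
#include <netinet/tcp.h>  /* TCP_NODELAY */
#include <sys/socket.h>
#include <unistd.h>

/* Disables Nagle's algorithm on a TCP socket.
 * Returns 0 on success, -1 on error (errno is set by setsockopt). */
int disable_nagle(int fd)
{
    int one = 1;

    assert(fd >= 0);
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof one);
}

/* Returns 1 if TCP_NODELAY is set on the socket, 0 if not, -1 on error. */
int nagle_disabled(int fd)
{
    int value = 0;
    socklen_t len = sizeof value;

    assert(fd >= 0);
    if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &value, &len) < 0)
        return -1;
    return value != 0;
}
```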

C Sockets recv: connection reset by peer

I'm trying to use sockets to get a small JSON test file, which is hosted on my website (http://a-cstudios.com/text.json). When I do this
long numbytes;
char *request = malloc(sizeof(char) * 300);
sprintf(request, "GET %s \r\nHOST:%s \r\n\r\n", restOfURL, baseServer);
// restOfURL = "/text.json" baseServer = "www.a-cstudios.com"
send(sockfd, request, strlen(request) + 1, 0);
char buf[1024];
if ((numbytes = recv(sockfd, buf, 1024-1, 0)) == -1) {
perror("recv");
}
I get recv: connection reset by peer. But if I use the same code, where restOfURL is /index.html and baseServer is www.google.com, this works fine, and buf will contain the text of index.html. Why won't this work for the file on my website?
Since you didn't post the full code, I am going to take a stab at it and make an assumption:
You populate the format string "GET %s \r\nHOST:%s \r\n\r\n" with restOfURL and baseServer.
However, at the time of the sprintf call restOfURL isn't initialized, so you're pushing garbage data into the first %s.
Either post more of your code or make sure you initialize restOfURL.
As @Kninnug pointed out, you need the HTTP version field (e.g., HTTP/1.1) at the end of the first line of the request. I just want to point out that you should not include the null terminator when you send the request. That is, change the send statement to
send(sockfd, request, strlen(request), 0);
Also, it is a good practice to always use snprintf instead of sprintf to prevent buffer overflow, although to be really safe you still need to check for truncation.
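Putting these points together, a request builder with a truncation check might look like the sketch below. The function name build_get_request is hypothetical. It adds the HTTP version to the request line and returns the exact length to pass to send(), so the null terminator is never transmitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Formats a minimal HTTP/1.1 GET request into out.
 * Returns the request length (excluding the terminating NUL), or -1 if
 * formatting failed or the request did not fit in cap bytes. */
int build_get_request(char *out, size_t cap, const char *path, const char *host)
{
    int n;

    assert(out != NULL && path != NULL && host != NULL);
    n = snprintf(out, cap, "GET %s HTTP/1.1\r\nHost: %s\r\n\r\n", path, host);
    /* snprintf returns the length it wanted to write, so a value of
     * cap or more means the output was truncated. */
    if (n < 0 || (size_t)n >= cap)
        return -1;
    return n;
}
```

The return value can then be used directly as the length, e.g. send(sockfd, request, (size_t)n, 0), after checking that n is not -1.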

Socket programming issue with recv() receiving partial messages

I have a socket that is receiving streaming stock tick data. However, I seem to get a lot of truncated messages, or what appear to be truncated messages. Here is how I am receiving data:
if((numbytes = recv(sockfd, buf, MAXDATASIZE-1, 0)) == -1) {
perror("recv()");
exit(1);
}
else {
buf[numbytes] = '\0';
// Process data
}
Can recv() receive just a partial message of what was sent?
My feeling is I might need another loop around the recv() call that keeps receiving until a complete message has arrived. I know that a libcurl implementation I have (not possible to use libcurl here I would think) has an outer loop:
// Read the response (sum total bytes read in tot_bytes)
for(tot_bytes=0; ; tot_bytes += iolen)
{
wait_on_socket(sockfd, 1, 60000L);
res = curl_easy_recv(curl, buf + tot_bytes, sizeof_buf - tot_bytes, &iolen);
if(CURLE_OK != res) {
// printf( "## %d", res );
break;
}
}
Do I need an recv() loop similar to the libcurl example (that verifiably works)?
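We can also pass the MSG_WAITALL flag to recv() so that it waits until the requested number of bytes has arrived. This only works when you know the number of bytes to receive, and the call may still return fewer bytes if a signal, an error, or a disconnect occurs. You can pass the flag like this: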
numbytes = recv(sockfd, buf, MAXDATASIZE-1, MSG_WAITALL);
You're right, you need a loop. recv only retrieves the data that's currently available; once any data has been read, it doesn't wait for more to appear before it returns.
The manual page says "The receive calls normally return any data available, up to the requested amount, rather than waiting for receipt of the full amount requested."
TCP does not respect message boundaries. That means that recv() is not guaranteed to get the entire message, exactly as you hypothesize. And that is indeed why you need a loop around your recv(). (That's also why upper-layer protocols like HTTP either close the socket, or prepend a length indicator, so the recipient knows exactly when to stop reading from the socket.)
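To make the loop concrete, here is a minimal sketch of the buffering it needs. It assumes, purely for illustration, that each tick message ends with a newline; the question does not say how the feed delimits messages, so the delimiter, the stream_buf type, and the function names are all hypothetical. Bytes from every recv() are appended to a buffer, and complete messages are taken off the front while any partial tail stays buffered for the next read.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define STREAM_CAP 4096

/* Holds bytes received so far that have not yet formed a full message. */
struct stream_buf {
    char data[STREAM_CAP];
    size_t used;
};

/* Appends n bytes as returned by recv().
 * Returns 0 on success, -1 if the buffer would overflow. */
int stream_append(struct stream_buf *sb, const char *bytes, size_t n)
{
    assert(sb != NULL && bytes != NULL);
    if (n > sizeof sb->data - sb->used)
        return -1;
    memcpy(sb->data + sb->used, bytes, n);
    sb->used += n;
    return 0;
}

/* If a complete newline-terminated message is buffered, copies it to out
 * as a C string (without the newline), removes it from the buffer, and
 * returns 1. Returns 0 if only a partial message is buffered so far, or
 * -1 if out is too small for the message. */
int stream_next_message(struct stream_buf *sb, char *out, size_t cap)
{
    char *nl;
    size_t len;

    assert(sb != NULL && out != NULL);
    nl = memchr(sb->data, '\n', sb->used);
    if (nl == NULL)
        return 0;
    len = (size_t)(nl - sb->data);
    if (len >= cap)
        return -1;
    memcpy(out, sb->data, len);
    out[len] = '\0';
    /* Shift the remaining bytes (the start of the next message) down. */
    memmove(sb->data, nl + 1, sb->used - len - 1);
    sb->used -= len + 1;
    return 1;
}
```

The receive loop would then call recv(), pass the returned bytes to stream_append(), and call stream_next_message() repeatedly until it returns 0 before going back to recv().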
can recv() receive just a partial message of what was sent?
Yes, indeed, if you use TCP. I think this can help you.
Handling partial return from recv() TCP in C
