Basic http proxy in c, problems

I am building an HTTP proxy in C.
The proxy is supposed to filter some keywords in the URL and in the HTML content.
The first problem I have is with the send() function. When I load the page for the first time, everything works. If I let the page finish loading, the next request is also fine. But if I open www.google.com and start to type, the "instant" feature makes a new request before the last one is complete, and I get the following error:
Program received signal SIGPIPE, Broken pipe.
0x00007ffff7b2efc2 in send () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) up
#1 0x0000000000401f1a in main () at net-ninny2.c:232
232 bytes_sent += send(i, buffer+bytes_sent, buffer_size-bytes_sent, 0);
The code-block that generates the error looks like this:
while(bytes_sent < buffer_size) {
bytes_sent += send(i, buffer+bytes_sent, buffer_size-bytes_sent, 0);
printf("* Bytes sent to Client: %d/%d\n", bytes_sent, buffer_size);
}
If you think it's relevant, I'll be happy to provide more code.
My second problem is related to HTTP headers. Since I want to filter keywords in the HTML content, I don't want the content to be encoded. Google doesn't seem to agree, and no matter what I put in the Accept-Encoding header, I always get the content back gzip-encoded. Any ideas how to get rid of that?
EDIT:
I am also trying to use fork() to create child processes for the new connections, but that just throws a nasty error:
select: Interrupted system call
I have put it where I create a new file descriptor from an incoming connection:
if (i == listener) {
// New connection
remote_addr_len = sizeof remote_addr;
newfd = accept(listener, (struct sockaddr *)&remote_addr, &remote_addr_len);
if (newfd == -1) {
perror("accept");
}
else {
FD_SET(newfd, &master); // Add new connection to master set
if (newfd > fdmax) {
fdmax = newfd;
}
printf("* New connection from %s on "
"socket %d\n",
inet_ntop(remote_addr.ss_family,
get_in_addr((struct sockaddr*)&remote_addr),
remoteIP, INET6_ADDRSTRLEN), newfd);
if(!fork()) {
fprintf(stderr, "!fork()\n");
close(newfd);
exit(5);
}
}
}
But I'm guessing I am doing it all wrong.
Cheers!

For your first question, you will want to ignore the SIGPIPE signal:
signal(SIGPIPE, SIG_IGN);
See How to prevent SIGPIPEs (or handle them properly) for more detail. If you ignore the signal and the socket connection is reset, you will also want to handle the -1 error return value from send() appropriately.
For your second question, you may not be able to force Google to send data uncompressed, since Google may assume that all browsers can handle compressed data. You will probably need to embed a gzip decompressor in your proxy. It's certainly not fair to increase the bandwidth requirements of both ends just because you want to filter some keywords.
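For the first point, a minimal sketch of a send loop that copes with a vanished client might look like the following. The helper names send_all and ignore_sigpipe are made up for illustration and do not come from the question. The key difference from the original loop is that a -1 return is never added to the byte counter.

```c
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: call once at startup so that writing to a closed
 * connection makes send() fail with EPIPE instead of killing the process. */
void ignore_sigpipe(void)
{
    signal(SIGPIPE, SIG_IGN);
}

/* Hypothetical helper: send the whole buffer.
 * Returns 0 on success, -1 (with errno set) if the connection failed. */
int send_all(int fd, const char *buf, size_t len)
{
    size_t sent = 0;

    while (sent < len) {
        ssize_t n = send(fd, buf + sent, len - sent, 0);
        if (n == -1) {
            if (errno == EINTR)
                continue;   /* interrupted by a signal: retry */
            return -1;      /* EPIPE, ECONNRESET, ...: give up on this client */
        }
        sent += (size_t)n;  /* only successful sends advance the counter */
    }
    return 0;
}
```

In a select()-based proxy, a -1 result would typically be followed by close() on that descriptor and FD_CLR() on the master set.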

Related

Confused about the behavior of function shutdown(fd, options)

I'm testing my socket code, which is used to transfer a text-based file. I wrote it by referring to the book Unix Network Programming (Chinese version). I will paste some of the code below:
My serve_client function:
void serve_client(int connfd, const char *filename, size_t filesize)
{
char header[1024];
int fd = open(filename, O_RDONLY, 0);
char *file_mapped;
if (fd == -1)
{
// An HTTP header block must end with an empty line
const char *not_found = "HTTP/1.1 404 Not Found\r\nContent-Length: 0\r\n\r\n";
send(connfd, not_found, strlen(not_found), 0);
}
else
{
// Build the header in one call: passing header as both the destination
// and a source argument of sprintf() is undefined behavior.
snprintf(header, sizeof header,
"HTTP/1.1 200 OK\r\n"
"Content-Length: %zu\r\n"
"Content-Type: text/plain; charset=utf-8\r\n\r\n",
filesize);
// send http response header
send(connfd, header, strlen(header), 0);
printf("Response headers:\n");
printf("%s", header);
file_mapped = (char *)mmap(0, filesize, PROT_READ, MAP_PRIVATE, fd, 0);
close(fd);
// send http response body
send(connfd, file_mapped, filesize, 0);
int unmapped = munmap(file_mapped, filesize);
if (unmapped == -1)
{
perror("memory unmapped failed!");
_exit(1);
}
}
}
There are several questions I would like to ask:
After serve_client() returns successfully, all the data I need should at least have been copied into the kernel buffer, to be sent shortly afterwards. Am I right about this?
shutdown() function is called as below:
serve_client(connfd, path, st.st_size);
shutdown(connfd, SHUT_WR);
// thread or process ends
I checked the notes in the book, which say that calling this function with SHUT_WR causes the data remaining in the kernel buffer to be sent first, followed by the final FIN. Is that right?
I captured the data sent and received with Wireshark, as shown in the screenshot below:
https://i.imgur.com/Xu8gAgh.jpg
I saw that the RST arrived before all the data showed up, which made the client (e.g. wget, or a plain browser request) fail. Any advice would be great.
For now I have worked around this issue by letting the client close the connection while the server waits for the FIN to arrive. It works, but it is still not what I want. :(
while (1)
{
ssize_t bytes_read = recv(connfd, buf, 1024, 0);
if (bytes_read > 0)
{
continue;
}
else if (bytes_read == 0)
{
close(connfd);
break;
}
else
{
// < 0
// handle error
close(connfd);
break;
}
}
EDIT: Sorry for the misunderstanding this question caused. The dump showed that the RST was sent from the server, which matches what I've been told: the process exited prematurely. That's the reason the previous code didn't work. Thank you for all your explanations, they really helped me understand what happens under the hood.

Ending a process implicitly close()s all file/socket descriptors. And this is the problem. Closing after sending may cause data loss on the receiver side (depending on the TCP stack's implementation).
You need to implement an application level protocol having the client acknowledge reception of all data before the server may close the socket.
To summarise: Using closure of a socket as part of the application level protocol is not reliable. Do not do this.
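For a simple HTTP-style exchange, one common approximation is to keep the process alive until the peer has closed its side, rather than exiting right after the last send(). The sketch below packages that idea. The name close_after_peer_eof is invented here, and the helper assumes a blocking socket and has no timeout (a real server would probably add one, for example with SO_RCVTIMEO).

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: half-close the connection, wait for the peer's EOF,
 * then release the descriptor.
 * Returns 0 if the peer closed cleanly, -1 if recv() reported an error. */
int close_after_peer_eof(int fd)
{
    char scratch[1024];
    ssize_t n;
    int result = 0;

    shutdown(fd, SHUT_WR);  /* FIN is queued behind any data still pending */

    /* Discard whatever the peer still sends until it closes (recv() == 0). */
    while ((n = recv(fd, scratch, sizeof scratch, 0)) > 0)
        ;

    if (n == -1)
        result = -1;        /* for example ECONNRESET */

    close(fd);
    return result;
}
```

Called in place of the bare shutdown() before the thread or process ends, this keeps the socket open until the client has had the chance to read everything.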

Send returns 29200 on ARM

I'm using a C program to respond to API calls. I want to reply using JSON.
I created a stream socket listening on my port and issue a GET request from a browser (Firefox in my case). I then reply using send(), based on the request received.
The problem occurs when my reply is bigger than 29200 bytes: send() returns 29200, only the first 29200 bytes are sent, and then it just stops. I cannot find out why it stops at this number.
I searched Google and found:
C++ socket programming Max size of TCP/IP socket Buffer?
My socket is blocking, so the send() function should block until all data is sent.
I also tried to find out whether Linux imposes a limit, but when I checked (I'm not sure how I checked, and cannot find the Stack Overflow question describing it) the limit was set to something far bigger than 29200.
I would like to know why my socket stops at 29200 and, if possible, how I can change the socket to make it send more data.
Edit:
Did some testing with the following results:
Created a test program to just send back 29999 bytes of data: https://pastebin.ca/4010317
I'm using curl to receive the data using
curl -X GET -i 'http://:12345'
When running on my computer the response is:
received: -1 bytes
received:
sent 29999 bytes
On my computer (x64) the receive does not work, but the send does (curl does receive the data).
When running on the ARM device, however, the response is:
received: 83 bytes
received: GET / HTTP/1.1
Host: 192.168.1.118:12345
User-Agent: curl/7.47.0
Accept: */*
J
sent 29200 bytes
Here, curl receives 29200 bytes of data.
When trying to loop the send (https://pastebin.ca/4010318), the result is:
received: -1 bytes
received:
sent 29199 bytes
Here, curl receives 29200 bytes, the second send returns -1. So looping is not possible.
I will keep trying, but the help is appreciated.

Your code works.
Well, to be more exact, it works despite several small misuses of functions.
First, I asked you for an MCVE, and you gave code with some useless lines that just add complexity: if you don't need the information about the client connection, you can pass NULL to accept.
accept4(socketId, NULL, NULL, SOCK_NONBLOCK);
This way, we get rid of 2 useless variables.
Second: checking for errors is good, but displaying (or logging) something is better, because it gives you a hint about why something doesn't work.
For example, the "received: -1 bytes" that you interpret as a failure is in fact expected: the error (errno) is EAGAIN, meaning that, since your socket is non-blocking, the data is not available yet, so you have to loop on recv() to read the incoming data. Looping on recv() will "solve" your false problem.
And finally: no, you do not loop your send either. You merely detect that not all the data was sent and try one more time. Do a real LOOP!
Edit:
Here is how I would write your "initServerSocket" function, for the "check errors and log" part:
int InitServerSocket(int portNum, int nbClientMax)
{
int socketId = -1;
struct sockaddr_in addressInfo;
int returnFunction = -1;
if ((socketId = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP)) < 0) {
// Log !
goto END_FUNCTION;
}
addressInfo.sin_family = AF_INET;
addressInfo.sin_addr.s_addr = htonl(INADDR_ANY);
addressInfo.sin_port = htons(portNum);
if (bind(socketId, (struct sockaddr *)&addressInfo, sizeof(addressInfo)) == -1) {
// Log !
goto END_FUNCTION;
}
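// Set O_NONBLOCK explicitly; XOR would toggle the flag instead of setting it.
const int flags = fcntl(socketId, F_GETFL, 0);
if (flags == -1 || fcntl(socketId, F_SETFL, flags | O_NONBLOCK) == -1) {
// Log !
goto END_FUNCTION;
}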
if (listen(socketId, nbClientMax) == -1) {
// Log !
goto END_FUNCTION;
}
returnFunction = socketId;
socketId = -1;
/* GOTO */END_FUNCTION:
if (socketId != -1) {
close(socketId);
}
return (returnFunction);
}

I finally found the issue. The problem was that
accept4(socket, (struct sockaddr *) &addrInfoFromClient, &sizeAddrInfo, SOCK_NONBLOCK);
was setting the accepted socket to non-blocking, while in the init function the socket was set to blocking. On the ARM device, this caused the "write" function to stop writing after 29200 bytes (not sure why).
When I changed the accept4 call to:
accept(socket, (struct sockaddr *) &addrInfoFromClient, &sizeAddrInfo);
It worked.

How do you keep a socket connection open indefinitely in C?

I'm trying to implement a C socket server in Linux using the code from Beej's sockets guide, which is here:
http://beej.us/guide/bgnet/examples/server.c
This works, and I've written a Windows client in C# to communicate with it. Once the client connects, I have it send a byte array to the server, the server reads it, then sends back a byte array. This works.
However, after this, if I have the client try to send another byte array, I get a Windows popup saying "An established connection was aborted by the software in your host machine." Then I have to re-connect with the client again. I want to keep the connection open indefinitely, until the client sends a disconnect command, but despite reading through Beej's guide, I just don't seem to get it. I'm not even trying to implement the disconnect command at present, I'm just trying to keep the connection open until I close the server.
I've tried removing the close() calls in Beej's code:
while(1) { // main accept() loop
sin_size = sizeof their_addr;
new_fd = accept(sockfd, (struct sockaddr *)&their_addr, &sin_size);
if (new_fd == -1) {
perror("accept");
continue;
}
inet_ntop(their_addr.ss_family,
get_in_addr((struct sockaddr *)&their_addr),
s, sizeof s);
printf("server: got connection from %s\n", s);
if (!fork()) { // this is the child process
close(sockfd); // child doesn't need the listener
ProcessRequest(new_fd); // this is not Beej's code, I've replaced his code here (which was a simple string send()) with a function call that does a read() call, processes some data, then sends back a byte array to the client using send().
close(new_fd);
exit(0);
}
close(new_fd); // parent doesn't need this
}
But that just gets me an infinite loop of "socket accept: bad file descriptor" (I tried removing both of the close(new_fd) lines, together and separately, and the close(sockfd) as well).
Can anyone more versed with C socket programming give me a hint where I should be looking? Thank you.

The reason for the accept() problem is that sockfd isn't valid. You must have closed it somewhere. NB if you get such an error you shouldn't just keep retrying as though it hadn't happened.
The reason for the client problem is that you're only processing one request in ProcessRequest(), as its name suggests, and as you describe in your comment. Use a loop, reading requests until recv() returns zero or an error occurs.

Cause
The client gets the error because of the close(new_fd) call, made either by the server parent or by the server child.
Solution
At any point in time, a server may get two kinds of events:
Connection request from a new client
Data from an existing client
The server has to honor both of them. There are two (major) ways to handle this.
Solution Approach 1
Design the server as a concurrent server. In Beej's guide it is
7.2. select()—Synchronous I/O Multiplexing
http://beej.us/guide/bgnet/output/html/singlepage/bgnet.html#select
Since OP's approach is not this one, we do not explore it further.
Solution Approach 2
On the server, fork() a process per client. This is the approach OP has taken, and the one we explore here. Essentially, it means fine-tuning the ProcessRequest() function in OP's code. Here is a sketch.
void ProcessRequest( int new_fd ) {
char buffer[ N ];
for( ; ; ) { // infinite loop until client disconnects or some error
int const recvLen = recv( new_fd, buffer, sizeof buffer, 0 );
if( recvLen == 0 ) { break; } // client disconnected
else if( recvLen == -1 ) { perror( "recv" ); break; }
int const sendLen = send( new_fd, buffer, recvLen, 0 );
if( sendLen == -1 ) { perror( "send" ); break; }
// TODO if( sendLen < recvLen ) then send() in loop
}
}
Note
I am sorry for having left a half-baked solution up for a few hours. While I was editing the answer, I lost connectivity to stackoverflow.com for a couple of hours.

How to find the socket connection state in C?

I have a TCP connection. Server just reads data from the client. Now, if the connection is lost, the client will get an error while writing the data to the pipe (broken pipe), but the server still listens on that pipe. Is there any way I can find if the connection is UP or NOT?

You could call getsockopt just like the following:
int error = 0;
socklen_t len = sizeof (error);
int retval = getsockopt (socket_fd, SOL_SOCKET, SO_ERROR, &error, &len);
To test if the socket is up:
if (retval != 0) {
/* getsockopt() itself failed: it returns -1 and sets errno */
fprintf(stderr, "error getting socket error code: %s\n", strerror(errno));
return;
}
if (error != 0) {
/* socket has a non zero error status */
fprintf(stderr, "socket error: %s\n", strerror(error));
}

The only way to reliably detect whether a socket is still connected is to periodically try to send data. It's usually more convenient to define an application-level 'ping' packet that the clients ignore, but if the protocol is already specced out without such a capability, you should be able to configure TCP sockets to do this by setting the SO_KEEPALIVE socket option. I've linked to the Winsock documentation, but the same functionality should be available on all BSD-like socket stacks.

The TCP keepalive socket option (SO_KEEPALIVE) would help in this scenario: the kernel drops the connection when the peer stops responding, so the server's next read fails instead of waiting forever.
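Turning keepalive on is a matter of a few setsockopt() calls. The sketch below assumes a TCP socket. The helper name enable_keepalive and the idea of passing the timings as parameters are illustrative choices, and the TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT options are Linux-specific (hence the #ifdef).

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: enable keepalive probes on a TCP socket.
 * idle_s:     seconds of silence before the first probe
 * interval_s: seconds between probes
 * probes:     unanswered probes before the connection is dropped
 * Returns 0 on success, -1 on failure (errno set by setsockopt). */
int enable_keepalive(int fd, int idle_s, int interval_s, int probes)
{
    int on = 1;

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof on) == -1)
        return -1;

#ifdef TCP_KEEPIDLE
    /* Linux-specific tuning; without it the system-wide defaults apply. */
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_s, sizeof idle_s) == -1)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &interval_s, sizeof interval_s) == -1)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &probes, sizeof probes) == -1)
        return -1;
#else
    (void)idle_s; (void)interval_s; (void)probes;
#endif
    return 0;
}
```

A call such as enable_keepalive(fd, 60, 10, 5) right after accept() would, under these assumptions, cause a dead peer to be noticed after roughly 60 + 5 * 10 seconds of silence.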

There is an easy way to check the socket connection state with the poll() call. First, poll the socket for the POLLIN event.
If the socket is not closed and there is data to read, read() will return more than zero.
If there is no new data on the socket, POLLIN will not be set in revents.
If the socket is closed, POLLIN will be set and read() will return 0.
Here is a small code snippet:
int client_socket_1, client_socket_2;
if ((client_socket_1 = accept(listen_socket, NULL, NULL)) < 0)
{
perror("Unable to accept s1");
abort();
}
if ((client_socket_2 = accept(listen_socket, NULL, NULL)) < 0)
{
perror("Unable to accept s2");
abort();
}
struct pollfd pfd[]={{client_socket_1,POLLIN,0},{client_socket_2,POLLIN,0}};
char sock_buf[1024];
while (1)
{
poll(pfd,2,5);
if (pfd[0].revents & POLLIN)
{
int sock_readden = read(client_socket_1, sock_buf, sizeof(sock_buf));
if (sock_readden == 0)
break;
if (sock_readden > 0)
write(client_socket_2, sock_buf, sock_readden);
}
if (pfd[1].revents & POLLIN)
{
int sock_readden = read(client_socket_2, sock_buf, sizeof(sock_buf));
if (sock_readden == 0)
break;
if (sock_readden > 0)
write(client_socket_1, sock_buf, sock_readden);
}
}

This is very simple with recv().
To check, read 1 byte from the socket with MSG_PEEK and MSG_DONTWAIT. This does not dequeue any data (PEEK), and the operation is non-blocking (DONTWAIT).
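char probe; // recv() needs a real buffer, even when only peeking
// recv() returns 0 only once the peer has closed; -1 with EAGAIN just means "no data yet"
while (recv(client->socket, &probe, 1, MSG_PEEK | MSG_DONTWAIT) != 0) {
sleep(rand() % 2); // Sleep for a bit to avoid spam
printf("I am alive: %d\n", client->socket);
}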
// When the client has disconnected, this line will execute
printf("Client %d went away :(\n", client->socket);
Found the example here.
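The same peek can be wrapped in a small predicate that also tells "no data yet" apart from a real error. This is a sketch only: the name peer_closed is invented here, and note that it keeps reporting the connection as open while unread data is still queued ahead of the peer's FIN.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: returns 1 if the peer has closed the connection
 * (or the socket is in an error state), 0 if it still looks open.
 * It never blocks and never consumes data. */
int peer_closed(int fd)
{
    char probe;
    ssize_t n = recv(fd, &probe, 1, MSG_PEEK | MSG_DONTWAIT);

    if (n > 0)
        return 0;   /* data is waiting: still connected */
    if (n == 0)
        return 1;   /* orderly shutdown by the peer */
    if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)
        return 0;   /* nothing to read right now */
    return 1;       /* ECONNRESET and similar errors */
}
```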

I had a similar problem. I wanted to know whether the server is still connected to the client, or the client to the server. In such circumstances the return value of the recv() function can come in handy: if the peer has closed the connection, recv() returns 0. Using this, I broke out of the loop and did not need any extra threads or functions. You might use the same approach if experts feel this is the correct method.

getsockopt() may be somewhat useful. Another way would be to install a signal handler for SIGPIPE: whenever you write to a socket whose connection has broken, the kernel sends a SIGPIPE signal to the process, and you can then do what is needed. But this still does not provide a solution for knowing the status of the connection. Hope this helps.

You should try the getpeername() function.
When the connection is down, it fails and sets errno to:
ENOTCONN - The socket is not connected.
which for you means DOWN.
Otherwise (if there are no other failures), the return code will be 0, which means UP.
resources:
man page: http://man7.org/linux/man-pages/man2/getpeername.2.html

On Windows you can query the precise state of any port on any network-adapter using:
GetExtendedTcpTable
You can filter it to only those related to your process, etc and do as you wish periodically monitoring as needed. This is "an alternative" approach.
You could also duplicate the socket handle and set up an IOCP/Overlapped i/o wait on the socket and monitor it that way as well.

#include <sys/socket.h>
#include <poll.h>
...
int client = accept(sock_fd, (struct sockaddr*)&address, (socklen_t*)&addrlen);
pollfd pfd = {client, POLLERR, 0}; // monitor errors occurring on client fd
...
while(true)
{
...
if(not check_connection(pfd, 5))
{
close(client);
close(sock[1]);
if(reconnect(HOST, PORT, reconnect_function))
printf("Reconnected.\n");
pfd = {client, POLLERR, 0};
}
...
}
...
bool check_connection(pollfd &pfd, int poll_timeout)
{
poll(&pfd, 1, poll_timeout);
return not (pfd.revents & POLLERR);
}

You can use the SS_ISCONNECTED macro with the getsockopt() function.
SS_ISCONNECTED is defined in socketvar.h.

For BSD sockets I'd check out Beej's guide. When recv returns 0 you know the other side disconnected.
Now you might actually be asking, what is the easiest way to detect the other side disconnecting? One way of doing it is to have a thread always doing a recv. That thread will be able to instantly tell when the client disconnects.

Sending while receiving in C

I've written a piece of code that runs on my server in multiple threads.
The problem is that it doesn't send data while I'm receiving on the other socket.
So if I send something from client 1 to client 2, client 2 only receives it once he sends something himself (which makes the thread return from recv()). How can I solve this?
/* Thread*/
while (! stop_received) {
nr_bytes_recv = recv(s, buffer, BUFFSIZE, 0);
if(strncmp(buffer, "SEND", 4) == 0) {
char *message = "Text asads \n";
rv = send(users[0].s, message, strlen(message), 0);
rv = send(users[1].s, message, strlen(message), 0);
if (rv < 0) {
perror("Error sending");
exit(EXIT_FAILURE);
}
}else{
char *message = "Unknown command \n";
rv = send(s, message, strlen(message), 0);
if (rv < 0) {
perror("Error sending");
exit(EXIT_FAILURE);
}
}
}

To be a little more specific, there are a few types of I/O. What you're doing currently is called blocking i/o. In general that means that when you call send or recv the operation will "block" until it has completed.
In contrast to that there is what is known as non-blocking i/o. In this i/o model an operation will return immediately if it's unable to complete. Typically the select function is used with this i/o model.
You can see an example program here at the Select Tutorial. The full source code is at the bottom of the page.
As others have noted, your other option is to use threads.

Your code will block on the recv() call. Either write a multi-threaded application, or investigate the use of the select() function.

Put send and receive in separate threads.

I notice that you are using perror() (the POSIX error function), which leads me to believe you are using a POSIX operating system, which makes me suspect it's GNU/Linux.
select() is portable, poll() is POSIX-centric and epoll() is Linux-centric. If using GNU/Linux, I strongly suggest avoiding select() and using:
poll() if you are polling only a few dozen file descriptors
epoll() if you need to scale to thousands of connections, and it's available.
If your application need not be portable, and no requirement prohibits using extensions, use poll() or epoll(). Once you learn how select() works, you'll be very happy to get rid of it, especially for something that has to scale to serve many clients.
If portability is a requirement, check during your build configuration whether poll() or epoll() exists and use either in preference to select().
Note that epoll() did not appear until Linux 2.5.44, so it's best to get used to using both.

You should separate the code into two threads, one transmitter and one receiver.
Something like this:
/* 1st Thread*/
while (! stop_received) {
nr_bytes_recv = recv(s, buffer, BUFFSIZE, 0);
}
/* 2nd Thread*/
while (! stop_received) {
if(strncmp(buffer, "SEND", 4) == 0) {
char *message = "Text asads \n";
rv = send(users[0].s, message, strlen(message), 0);
rv = send(users[1].s, message, strlen(message), 0);
if (rv < 0) {
perror("Error sending");
exit(EXIT_FAILURE);
}
}else{
char *message = "Unknown command \n";
rv = send(s, message, strlen(message), 0);
if (rv < 0) {
perror("Error sending");
exit(EXIT_FAILURE);
}
}
}
The concurrency will bring some issues, like access to the buffer variable.

There are two ways of achieving what you want:
1.) Implement the sending and receiving code in different threads. This has some drawbacks: as the number of clients grows, the code can become hard to manage, and there will be concurrency problems (as mentioned by pcent).
You could use non-blocking sockets, but I suggest not doing so, since busy-polling them would hog the CPU.
2.) The other way is to use the select() function, which lets you monitor multiple sockets of different types at the same time. For more details on select(), you can google it. :)
