Using C FILE stream in C sockets programming

I'm back with a question on using C FILE streams in sockets programming. I was reading about it and saw mixed reviews - some people say it's not reliable (i.e. a leaky abstraction?).
Has anyone got a view on using C FILE streams in sockets programming?

Yes. Don't.
The TCP and UDP protocols have too many semantics to map easily onto your usual file stream APIs. That's not to say it's impossible or even difficult, but there are likely to be lots and lots of gotchas and edge cases that will give you wildly unpredictable behaviour. I also cannot think, off the top of my head, of any application where you might want to treat a socket as an ordinary file.
At the end of the day, once you've dealt with binding and listening and accepting, none of which you can do with C FILE streams, and wrapped the resultant file descriptor in a FILE stream, all you are going to do is use fread() and fwrite(), maybe fgetc(), so you may as well leave it as an ordinary file descriptor and use recv() and send(), saving yourself the hassle of wrapping. You may save yourself the hassle of dealing with buffering, but having control of the buffering allows you to tune your buffers to the application's requirements, reducing network overhead and improving speed.
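For illustration, a minimal sketch of the direct approach - writing to the socket descriptor with send() and looping over the short writes it is allowed to make. send_all() is an illustrative helper name, not a standard call:

    #include <sys/types.h>
    #include <sys/socket.h>
    #include <errno.h>

    /* Keep calling send() until the whole buffer has gone out,
     * since send() may transmit fewer bytes than requested. */
    int send_all(int sockfd, const char *buf, size_t len)
    {
        while (len > 0) {
            ssize_t n = send(sockfd, buf, len, 0);
            if (n < 0) {
                if (errno == EINTR)
                    continue;        /* interrupted by a signal, retry */
                return -1;           /* real error */
            }
            buf += n;
            len -= n;
        }
        return 0;
    }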

It depends on the kind of application you're writing. FILE streams are not suitable for nonblocking, asynchronous, or select/poll-based IO. This may be no problem for a command line program that performs a sequential task of connecting to a server, making some request, and getting the results. It also works alright for a run-from-inetd-only server process. But if your application will be doing anything event-based, you're in trouble. If you really want to use FILE streams with sockets in an event-based application, you can make it possible using threads, but I doubt it's a good idea...
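For the sequential command-line case described above, here's a hedged sketch of what the wrapping can look like, assuming POSIX fdopen(). Using one FILE stream per direction sidesteps the stdio rule that a write must be followed by a flush or seek before a read on the same stream - one of the gotchas alluded to in the other answer. Error checks are omitted:

    #include <stdio.h>
    #include <unistd.h>

    /* Wrap an already-connected socket in two FILE streams,
     * one per direction, and do one request/response. */
    void talk(int sockfd)
    {
        FILE *in  = fdopen(sockfd, "r");
        FILE *out = fdopen(dup(sockfd), "w");   /* separate fd per stream */
        char line[512];

        fprintf(out, "HELLO\r\n");
        fflush(out);                  /* push the buffered request out */
        if (fgets(line, sizeof line, in))
            printf("server said: %s", line);

        fclose(in);
        fclose(out);
    }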

Related

Message Passing between Processes without using dedicated functions

I'm writing a C program in which I need to pass messages between child processes and the main process, but I need to do it without using functions like msgget() and msgsnd().
How can I implement this? What kinds of techniques can I use?
There are multiple ways to communicate with child processes; it depends on your application.
It very much depends on the level of abstraction of your application.
-- If the level of abstraction is low:
If you need very fast communication, you could use shared memory (e.g. shm_open()). But that would be complicated to synchronize correctly.
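As a rough idea of what that looks like (a sketch only: the region name is made up, and the synchronisation that makes shared memory hard is deliberately left out):

    #include <fcntl.h>
    #include <sys/mman.h>
    #include <unistd.h>

    /* Create and map a shared region both processes can open by name.
     * Link with -lrt on some systems. Error paths abbreviated. */
    void *make_shared(size_t size)
    {
        int fd = shm_open("/my_region", O_CREAT | O_RDWR, 0600);
        if (fd < 0)
            return NULL;
        if (ftruncate(fd, size) < 0) { close(fd); return NULL; }
        void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        close(fd);                    /* the mapping survives the close */
        return p == MAP_FAILED ? NULL : p;
    }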
The most used method, and the method I'd use if I were in your shoes, is: pipes.
They're simple and fast, and since pipe file descriptors are supported by epoll() and those kinds of asynchronous I/O APIs, you can take advantage of that.
Another plus is that, if your application grows and you need to communicate with remote processes (processes that are not on your local machine), adapting pipes to sockets is very easy - basically it's still the same reading/writing from/to a file descriptor.
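A minimal pipe sketch - one pipe gives you one direction; a second pipe would be needed for replies:

    #include <stdio.h>
    #include <unistd.h>
    #include <sys/wait.h>

    int main(void)
    {
        int fds[2];                 /* fds[0] = read end, fds[1] = write end */
        char buf[64];

        if (pipe(fds) < 0) { perror("pipe"); return 1; }

        if (fork() == 0) {          /* child: reads from the pipe */
            close(fds[1]);
            ssize_t n = read(fds[0], buf, sizeof buf - 1);
            if (n > 0) { buf[n] = '\0'; printf("child got: %s\n", buf); }
            _exit(0);
        }

        close(fds[0]);              /* parent: writes to the pipe */
        write(fds[1], "hello", 5);
        close(fds[1]);
        wait(NULL);
        return 0;
    }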
Also, Unix-domain sockets (which on other platforms are called "named pipes") let you have a server process that creates a listening socket with a well-known name (e.g. an entry in the filesystem, such as /tmp/my_socket), and all clients on the local machine can connect to it.
Pipes, networking sockets, or unix-domain sockets are very interchangeable solutions, because - as said before - all involve reading/writing data from/to a file descriptor, so you can reuse the code.
The disadvantage with a file descriptor is that you're writing data to a stream of bytes, so you need to implement the "message streaming protocol" of your messages yourself, to "unstream" your messages (marshalling/unmarshalling). That's not so complicated in most cases, and it also depends on the kind of messages you're sending.
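One common way to do that framing is a length prefix on every message. Here's a sketch of the receiving side, assuming the sender writes a 4-byte length in network byte order before each message; read_all() is an illustrative helper, not a standard function:

    #include <stdint.h>
    #include <unistd.h>
    #include <arpa/inet.h>    /* ntohl() */

    /* Illustrative helper: loop until exactly len bytes have arrived. */
    static int read_all(int fd, void *p, size_t len)
    {
        char *b = p;
        while (len > 0) {
            ssize_t n = read(fd, b, len);
            if (n <= 0)
                return -1;    /* error or peer closed the connection */
            b += n;
            len -= n;
        }
        return 0;
    }

    /* Receive one length-prefixed message into buf. */
    int recv_msg(int fd, void *buf, uint32_t maxlen, uint32_t *len)
    {
        uint32_t hdr;
        if (read_all(fd, &hdr, sizeof hdr) < 0)
            return -1;
        *len = ntohl(hdr);            /* length travels in network order */
        if (*len > maxlen)
            return -1;                /* message too big for the buffer */
        return read_all(fd, buf, *len);
    }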
I'd pass on other solutions such as memory mapped files and so on.
-- If the level of abstraction is higher:
You could use a 3rd party message passing system, such as RabbitMQ, ZMQ, and so on.

select() equivalence in I/O Completion Ports

I am developing a proxy server using WinSock 2.0 on Windows. If I were developing it with the blocking model, select() would be the way to wait for the client or the remote server to send data. Is there an equivalent way to do this using I/O completion ports?
I used two contexts for the two directions of data with I/O completion ports, but even with a WSARecv pending I couldn't receive any data from the remote server, and I couldn't find the problem.
Thanks in advance.
EDIT: Here's the worker thread code for the I/O completion ports version currently being developed. But I am asking about how to implement a select() equivalent.
I/O completion ports provide an indication of when an I/O operation completes; they do not indicate when it is possible to initiate an operation. In many situations this doesn't actually matter. Most of the time the overlapped I/O model will work perfectly well if you assume it is always possible to initiate an operation. The underlying operating system will, in most cases, simply do the right thing and queue the data for you until it is possible to complete the operation.
However, there are some situations when this is less than ideal. For example, you can always send to a socket using overlapped I/O. You can do this even when the remote peer is not reading and the TCP stack has started to use flow control and has filled the TCP window... This simply uses resources on your local machine in an essentially uncontrolled manner (well, controlled by the peer rather than by you, which is not ideal). I write about this here, and in many situations you DO need to actively manage this kind of thing by tracking how many outstanding I/O write requests you have and using that as an indication of 'readiness to send'.
Likewise if you want a 'readiness to recv' indication you could issue a 'zero byte' read on the socket. This is a read which is issued with a zero length buffer. The read returns when there is data to read but no data is returned. This would give you the indication that there is data to be read on the connection but is, IMHO, pointless unless you are suffering from the very unlikely situation of hitting the I/O page lock limit, as you may as well read the data when it becomes available rather than forcing multiple kernel to user mode transitions.
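For what it's worth, here's a hedged sketch of posting such a zero-byte read with Winsock 2; the conn_ctx struct is a made-up per-connection context, and real code would carry more state:

    #include <winsock2.h>

    /* Hypothetical per-connection context holding the OVERLAPPED
     * that identifies this operation on the completion port. */
    struct conn_ctx {
        OVERLAPPED ov;
        SOCKET     sock;
    };

    int post_zero_byte_read(struct conn_ctx *ctx)
    {
        WSABUF wbuf = { 0, NULL };    /* zero length: nothing is copied */
        DWORD  flags = 0;

        if (WSARecv(ctx->sock, &wbuf, 1, NULL, &flags,
                    &ctx->ov, NULL) == SOCKET_ERROR
            && WSAGetLastError() != WSA_IO_PENDING)
            return -1;                /* WSA_IO_PENDING just means queued */
        return 0;    /* the IOCP completion signals "data is readable" */
    }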
In summary, you don't really need an answer to your question. You need to look at how the API works and write your code to work with it rather than trying to force the API to work in a way that other APIs that you are familiar with work.

Implementing FTP Server/Client in C

I have been given an assignment that requires implementing the FTP protocol. I have gone through the documentation given in RFC 959.
I am confused about a couple of implementation details:
1) If a file needs to be transferred, what function can be used? Can a simple send() call be used for a non-text file?
2) Is it possible to get a good tutorial that talks about implementing modes and file structures, and that specifies which are essential?
Hope to get a reply soon.
FTP transfers files through a plain TCP connection, and you can transfer any kind of file with it. There is no difference between text files and binary files; they are all just sequences of bytes.
For the file transmission it is sufficient to open the data connection and call write() repeatedly until the entire file is transmitted (check the return value of write() to know how many bytes it actually sent).
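A sketch of that transfer loop, with the short-write handling just mentioned (error handling kept minimal):

    #include <stdio.h>
    #include <unistd.h>

    /* Read the file in chunks and push each chunk down the data
     * socket; write() may send fewer bytes than asked, so loop. */
    int send_file(FILE *fp, int datafd)
    {
        char buf[4096];
        size_t n;

        while ((n = fread(buf, 1, sizeof buf, fp)) > 0) {
            char *p = buf;
            while (n > 0) {
                ssize_t w = write(datafd, p, n);
                if (w < 0)
                    return -1;
                p += w;
                n -= w;
            }
        }
        return ferror(fp) ? -1 : 0;   /* distinguish EOF from read error */
    }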
The rest of the FTP protocol is text based and is sent over a separate connection, on a different port.
There is a good tutorial on using FTP directly through netcat that can be useful for understanding how things work. Understanding active and passive modes can also be useful, since you are going to implement at least one of them.
Also, use Wireshark to follow a TCP stream and see the data you are sending/receiving; it can be very useful for debugging.
The protocol implementation won't give you a file structure. The protocol is there to define rules and states.
The dev/programming part is up to you. You just need to respect the FTP protocol in order to stay standard and compatible with other clients and servers.
Best regards

Whats the advantages and disadvantages of using Socket in IPC

I have been asked this question in some recent interviews: what are the advantages and disadvantages of using sockets for IPC when there are other ways to perform IPC? I have not found an exact answer.
Any help would be much appreciated.
Compared to pipes, IPC sockets differ by being bidirectional, that is, reads and writes can be done on the same descriptor. Pipes, unlike sockets, are unidirectional. You have to keep a pair of descriptors if you want to do both reads and writes.
Pipes, on the other hand, guarantee atomicity when reading or writing under a certain number of bytes. Writing less than PIPE_BUF bytes at once is guaranteed to be delivered in one chunk and never observed partially. Sockets require more care from the programmer in that respect.
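To illustrate the bidirectionality point, here's a minimal sketch using socketpair(), where both processes read and write on a single descriptor (a pipe would need a pair per direction):

    #include <stdio.h>
    #include <unistd.h>
    #include <sys/socket.h>
    #include <sys/wait.h>

    int main(void)
    {
        int sv[2];
        char buf[32];

        if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) return 1;

        if (fork() == 0) {                 /* child talks on sv[1] */
            close(sv[0]);
            read(sv[1], buf, sizeof buf);  /* receive "ping"... */
            write(sv[1], "pong", 4);       /* ...and reply on the same fd */
            _exit(0);
        }

        close(sv[1]);                      /* parent talks on sv[0] */
        write(sv[0], "ping", 4);
        ssize_t n = read(sv[0], buf, sizeof buf - 1);
        if (n > 0) { buf[n] = '\0'; printf("parent got: %s\n", buf); }
        wait(NULL);
        return 0;
    }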
Shared memory, when used for IPC, requires explicit synchronisation from the programmer. It may be the most efficient and most flexible mechanism, but that comes at an increased complexity cost.
Another point in favour of sockets: an app using sockets can be easily distributed - i.e. it can be run on one host or spread across several hosts with little effort. This depends of course on the nature of the app.
Perhaps this is too simplified an answer, yet it is an important detail: sockets are not supported on all OSes. Recently, I became aware of a project that used sockets for IPC all over the place, only to find that it was forced to move from Linux to a proprietary OS which was POSIX-compliant but did not support sockets the same way Linux does.
Sockets allow you a few benefits...
You can connect a simple client to them for testing (manually enter data, see the response).
This is very useful for debugging, simulating and blackbox testing.
You can run the processes on different machines. This can be useful for scalability and is very helpful in debugging / testing if you work in embedded software.
It becomes very easy to expose your process as a service.
But there are drawbacks as well
Overhead is greater than with IPC optimized for a single machine. Shared memory in particular is better if you need the performance, and you know your processes are all on the same machine.
Security - if your client apps can connect so can anyone else, if you're not careful about authentication. Data can also be sniffed if you're not encrypting, and modified if you're not at least signing data sent over the wire.
Using a true message queue tends to leave you with fixed-size messages. If you have a large number of messages of wildly varying sizes this can become a performance problem. Using a socket can be a way around this, though you're then left trying to wrap this functionality to make it behave like a queue, and it's tricky to get the details right, particularly aspects like blocking/non-blocking behaviour and atomicity.
Shared memory is quick but requires management (you end up writing a version of malloc to manage the SHM), plus you have to synchronise and lock it in some way. Though you can use libraries to help with this, their availability depends on your environment and language.
Queues are easy, but their downsides are the mirror image of the socket advantages listed above.
Pipes have been covered by Blagovest's answer to this question.
As is ever the case with this kind of stuff, I would suggest reading the W. Richard Stevens books on IPC and sockets. There is no better explanation than his! :-)

Sending system calls to file server

I was designing a file server using socket programming in C. I send calls like open(), write(), etc. as plain strings using stream sockets and decipher them at the server end, i.e. if it is an open call then we extract the path, mode, and flags. Is this OK, or should I be using some kind of struct to store the file system calls and send that to the server, where the server simply accesses the fields?
Is there some standard way I don't know about?
Thanks
You're basically starting to define your own protocol. It would be a lot easier if you sent numbers describing operations instead of strings.
If you're serious about this, you might want to look into RPC - RFC 707 (you did ask for a standard way, right?).
Yes, there is a standard way. Look into NFS, AFP, CIFS, and WebDAV, then pick one.
You already have answers for the standard way, so I'll give you a few caveats you should look out for.
If you intend to deploy your file server in an untrusted environment (e.g. on the Internet), think about securing it right away. Securing it is not just a question of slapping encryption on - you need to know how you intend to authenticate your users, how you want to authorize different types of access to different parts of the server, how you will ensure the authenticity and the integrity of the data, and how you intend to keep the data confidential.
You'll also need to think about your server's availability. That means it should be fault-tolerant - i.e. connections can (and will) break, regardless of whether they're broken on purpose or not, so you need to detect that, either with some kind of keep-alive (which will fail if the client left) or with some kind of activity time-out (which will expire if the client left). You also need to think about how many clients you are willing to support simultaneously - which can radically change the architecture of your server.
As for the open, close, read, write, etc. commands, most file transfer protocols don't go into so much detail, but it may be interesting to be able to do so depending on your situation. If your files are huge and you only need some chunks of them, or if you want to be able to lock files to work on them exclusively, etc., you may want to go into such detail. If you don't have those requirements, simpler, transactional commands such as get & put (rather than open, read, read, read, close and open, write, write some more, close) may be both easier to implement and easier to work with.
If you want a human being to interact with your server and give it commands, text is a good approach: it's easy to debug when sniffing, and humans understand text and can type it easily. If there are no humans involved, using integers as commands is probably a better approach: you can structure your commands to start with an integer followed by a number of parameters, and always simply expect the same thing on the server's end (and do a switch on the command you receive). Even in that case, though, it may be a good idea to have human-readable values in your integers. For example, putting 'READ' in an integer as the read command uses as many bytes as 0x00000001, but is easier to read when sniffed with Wireshark.
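A sketch of that idea, with made-up names and layout: a fixed binary header whose 4-byte opcode is compact on the wire but still legible in a packet capture:

    #include <stdint.h>
    #include <string.h>

    /* Hypothetical fixed header for the file-server protocol. */
    struct cmd_hdr {
        char     op[4];        /* e.g. "OPEN", "READ", "WRIT", "CLOS" */
        uint32_t payload_len;  /* network byte order on the wire */
    };

    /* The server can dispatch by comparing the opcode bytes. */
    int is_read_cmd(const struct cmd_hdr *h)
    {
        return memcmp(h->op, "READ", 4) == 0;
    }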
Finally, you should really take a look at the standard approaches and try to understand the trade-offs made in each case. Ask yourself, for example, why HTTP has such verbose headers and why WebDAV uses it. Why does FTP use two connections (one for commands and one for data) while many other protocols use only one? How did NFS evolve to where it is now, and why? Understanding the answers to these questions will help you develop your own protocol - if, after you understand those answers, you still feel you need your own protocol.
