I'm working with POSIX sockets in C.
Given X, I need to verify that at least X bytes are available on socketfd before performing an operation on it.
However, I don't want to recv X bytes into a buffer, because X can be very large.
My first idea was to use MSG_PEEK...
int X = 9999999;
char buffer[1];
int num_bytes = recv(socketfd, buffer, X, MSG_PEEK);
(num_bytes == X) ? good : bad;
...
...
...
// Do some operation
But I'm concerned that passing X > 1 corrupts memory, since buffer holds only one byte. The MSG_TRUNC flag seems to resolve the memory concern, but it removes X bytes from socketfd.
There's a big difference between e.g. TCP and UDP in this regard.
UDP is packet-based: you send and receive discrete datagrams, and each recv call returns at most one datagram.
TCP is a streaming protocol, where data begins to stream on connection and stops at disconnection. There are no message boundaries or delimiters in TCP, other than what you add at the application layer. It's simply a stream of bytes without any meaning (in TCP's point of view).
That means there's no way to tell how much will be received with a single recv call.
You need to come up with an application-level protocol (on top of TCP) that tells the receiver where each message ends. For example, each message could start with a fixed-size header that contains the size of the data that follows. Alternatively, you could put a specific delimiter between messages, something that can't occur in the stream of bytes.
Then you receive in a loop until you have either received all the data or received the delimiter. But note that with a delimiter you may also receive the beginning of the next message, so you need to be able to handle a partial next message after the current message has been fully received.
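The delimiter variant described above could be sketched roughly as follows. This is only an illustration, not a definitive implementation: the names msg_reader, read_message and READER_CAP are made up for this example, it assumes a blocking socket, and it does not retry on EINTR.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

#define READER_CAP 4096

/* Hypothetical helper type: bytes received but not yet handed to the caller. */
struct msg_reader {
    char   buf[READER_CAP];
    size_t len;                 /* number of valid bytes in buf */
};

/*
 * Copy the next delimiter-terminated message from fd into out
 * (NUL-terminated, delimiter stripped). Returns the message length, or -1 on
 * error, on EOF before a complete message, or if the message does not fit.
 */
ssize_t read_message(struct msg_reader *r, int fd, char delim,
                     char *out, size_t outcap)
{
    for (;;) {
        char *end = memchr(r->buf, delim, r->len);
        if (end != NULL) {
            size_t msg_len = (size_t)(end - r->buf);
            if (msg_len + 1 > outcap)
                return -1;
            memcpy(out, r->buf, msg_len);
            out[msg_len] = '\0';
            /* Whatever follows the delimiter is the start of the next
               message, so keep it for the next call. */
            r->len -= msg_len + 1;
            memmove(r->buf, end + 1, r->len);
            return (ssize_t)msg_len;
        }
        if (r->len == sizeof r->buf)
            return -1;          /* no delimiter within READER_CAP bytes */
        ssize_t n = recv(fd, r->buf + r->len, sizeof r->buf - r->len, 0);
        if (n <= 0)
            return -1;          /* error, or the peer closed the connection */
        r->len += (size_t)n;
    }
}
```

The reader struct has to outlive a single call (start it with len set to 0), because the bytes left over after one message are exactly the "partial beginning" of the next one.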
int num_bytes = recv(socketfd, buffer, X, MSG_PEEK);
This will copy up to X bytes into buffer and return the number of bytes copied, without removing them from the socket. But your buffer is only 1 byte large, so asking for X bytes overflows it. Increase your buffer to at least X bytes.
Have you tried this?
ssize_t available = recv(socketfd, NULL, 0, MSG_PEEK | MSG_TRUNC);
Or this?
int available = 0;  /* FIONREAD stores its result in an int, not a size_t */
ioctl(socketfd, FIONREAD, &available);
I am very confused with read() calls to get inotify events.
Here is the code:
#define EVENT_SIZE sizeof(inotify_event)
int fd = inotify_init();
int wd = inotify_add_watch(fd, dir, IN_MODIFY);
void* p = malloc(sizeof(EVENT_SIZE));
read(wd, p, (EVENT_SIZE + 10));
My test file is a.txt.
The output after debug in gdb is:
{wd = 0, mask = 0, cookie = 0, len = 0, name = 0x558f05d002d0 ""}
Now, when I change the last line to read(fd, p, (EVENT_SIZE + 16));, the output that I get in gdb is:
{wd = 1, mask = 2, cookie = 0, len = 16, name = 0x5625cdd422d0 ".a.txt.swp"}
Q1. Why don't I get an overflow error, given that in both cases I am writing more than the allocated buffer p can hold?
Q2. If there is no error, then my first program should also work, because my filename is shorter than 10 characters. But it doesn't work; it only works with 16. What am I missing here?
compiler - g++ 9.3.0
os - ubuntu 20.04
Thank you.
The behaviour of writing outside the boundaries of an array is undefined. Additionally, read reads up to the given number of bytes.
Q1
1. You get a segfault if your process attempts to access a memory address that does not belong to it.
2. malloc - "Normally, malloc() allocates memory from the heap, and adjusts the size of the heap as required, using sbrk(2)."
3. sbrk - "sbrk() increments the program's data space by increment bytes."
From 1., 2. and 3.: There is (probably) room left where you are able to write to, but the data (if any) after your allocated memory is definitely corrupted!
Q2
Read the documentation for inotify.
The meaning of the fields for inotify_event structure:
mask contains bits that describe the event that occurred (see below).
and
The len field counts all of the bytes in name, including the null bytes; the length of each inotify_event structure is thus sizeof(struct inotify_event)+len.
and
The name field is present only when an event is returned for a file inside a watched directory; it identifies the filename within the watched directory. This filename is null-terminated, and may include further null bytes ('\0') to align subsequent reads to a suitable address boundary.
and
The behavior when the buffer given to read(2) is too small to return information about the next event depends on the kernel version: in kernels before 2.6.21, read(2) returns 0; since kernel 2.6.21, read(2) fails with the error EINVAL. Specifying a buffer of size sizeof(struct inotify_event) + NAME_MAX + 1 will be sufficient to read at least one event.
I'm using a modification of the C code in https://www.tcpdump.org/sniffex.c to print information on TCP packets passing through an interface, using libpcap.
This is a sample of the callback code I'm trying to use to check if the received packet has the source field equal to the IP address of the current interface (so to only analyse outgoing packets). It is a rather extensive program, so I decided to just include the problematic part:
// retrieve IP address of interface
char * dev_name = "eth0";
struct ifreq ifr;
int fd;
char *dev_ip;
fd = socket(AF_INET, SOCK_DGRAM, 0);
// type of address to retrieve (IPv4)
ifr.ifr_addr.sa_family = AF_INET;
// copy the interface name in the ifreq structure
strncpy(ifr.ifr_name , dev_name , IFNAMSIZ-1);
ioctl(fd, SIOCGIFADDR, &ifr);
close(fd);
dev_ip = inet_ntoa(( (struct sockaddr_in *)&ifr.ifr_addr )->sin_addr);
printf("IPv4 address: %s\n", dev_ip);
printf("inet_ntoa: %s\n",inet_ntoa(ip->ip_src));
if (strcmp(dev_ip,inet_ntoa(ip->ip_src)) == 0)
printf("EQUAL!\n");
However, as you can see in the following screenshot, even if the source IP (inet_ntoa) and the IP address of the interface (IPv4 address) are different, their values are always equal according to the program.
What could the problem be?
inet_ntoa returns a pointer to a string that has been constructed in static memory that is internal to inet_ntoa. It re-uses that same static memory every time it is called. When you do this:
dev_ip = inet_ntoa(...);
dev_ip is set to point to that internal static buffer. At that point the static buffer contains a string that represents the interface address, so your:
printf("IPv4 address: %s\n", dev_ip);
shows the expected result. But then you do this:
printf("inet_ntoa: %s\n",inet_ntoa(ip->ip_src));
and that overwrites inet_ntoa's internal buffer with a string representing the packet's address.
But remember that dev_ip is still pointing to that internal buffer. So when you do this:
if (strcmp(dev_ip,inet_ntoa(ip->ip_src)) == 0)
(which BTW unnecessarily overwrites the internal buffer again with the packet's address, which was already there) the two arguments that are passed to strcmp are both pointers to inet_ntoa's internal buffer, and therefore strcmp will always find that the target strings match, because both arguments are pointing to the same string.
To fix, either make a local copy of the string generated by inet_ntoa immediately after you call it, by copying into a local buffer or by doing something like:
dev_ip = strdup(inet_ntoa(...));
(and remember to free(dev_ip) when you no longer need that string) or, even better, use the thread-safe variant of inet_ntoa called inet_ntoa_r if your platform has that function. inet_ntoa_r does not use (and re-use) an internal buffer. It requires its caller to provide the buffer where the string will be placed.
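Where inet_ntoa_r is not available, the standard inet_ntop works in the same caller-supplied-buffer style. The sketch below swaps it in; same_ipv4_text and same_ipv4 are names made up for this example. It also shows that the text form is not needed at all for the comparison itself.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>

/*
 * Compare two IPv4 addresses by their dotted-quad text. Each address is
 * formatted into its own caller-owned buffer, so neither result can be
 * overwritten by the other call. Returns 1 if equal, 0 otherwise.
 */
int same_ipv4_text(struct in_addr a, struct in_addr b)
{
    char a_str[INET_ADDRSTRLEN];
    char b_str[INET_ADDRSTRLEN];

    if (inet_ntop(AF_INET, &a, a_str, sizeof a_str) == NULL ||
        inet_ntop(AF_INET, &b, b_str, sizeof b_str) == NULL)
        return 0;
    return strcmp(a_str, b_str) == 0;
}

/* The strings are only needed for printing; the raw values compare directly. */
int same_ipv4(struct in_addr a, struct in_addr b)
{
    return a.s_addr == b.s_addr;
}
```

In the callback from the question, the second form would amount to comparing the interface's sin_addr with ip->ip_src, with no string conversion involved.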
This is for a Linux system, in C. It involves network programming. It is for a file transfer program.
I've been having this problem where this piece of code works unpredictably. It either is completely successful, or the while loop in the client never ends. I discovered that this is because the fileLength variable would sometimes be a huge (negative or positive) value, which I thought was attributed to making some mistake with ntohl. When I put in a print statement, it seemed to work perfectly, without error.
Here is the client code:
//...here includes relevant header files
int main (int argc, char *argv[]) {
//socket file descriptor
int sockfd;
if (argc != 2) {
fprintf (stderr, "usage: client hostname\n");
exit(1);
}
//...creates socket file descriptor, connects to server
//create buffer for filename
char name[256];
//receive filename into name buffer, bytes received stored in numbytes
if((numbytes = recv (sockfd, name, 255 * sizeof (char), 0)) == -1) {
perror ("recv");
exit(1);
}
//Null terminator after the filename
name[numbytes] = '\0';
//length of the file to receive from server
long fl;
memset(&fl, 0, sizeof fl);
//receive filelength from server
if((numbytes = recv (sockfd, &fl, sizeof(long), 0)) == -1) {
perror ("recv");
exit(1);
}
//convert filelength to host format
long fileLength = ntohl(fl);
//check to make sure file does not exist, so that the application will not overwrite existing files
if (fopen (name, "r") != NULL) {
fprintf (stderr, "file already present in client directory\n");
exit(1);
}
//open file called name in write mode
FILE *filefd = fopen (name, "wb");
//variable stating amount of data received
long bytesTransferred = 0;
//Until the file is received, keep receiving
while (bytesTransferred < fileLength) {
printf("transferred: %ld\ntotal: %ld\n", bytesTransferred, fileLength);
//set counter at beginning of unwritten segment
fseek(filefd, bytesTransferred, SEEK_SET);
//buffer of 256 bytes; 1 byte for byte-length of segment, 255 bytes of data
char buf[256];
//receive segment from server
if ((numbytes = recv (sockfd, buf, sizeof buf, 0)) == -1) {
perror ("recv");
exit(1);
}
//first byte of buffer, stating number of bytes of data in received segment
//converting from char to short requires adding 128, since the char ranges from -128 to 127
short bufLength = buf[0] + 128;
//write buffer into file, starting after the first byte of the buffer
fwrite (buf + 1, 1, bufLength * sizeof (char), filefd);
//add number of bytes of data received to bytesTransferred
bytesTransferred += bufLength;
}
fclose (filefd);
close (sockfd);
return 0;
}
This is the server code:
//...here includes relevant header files
int main (int argc, char *argv[]) {
if (argc != 2) {
fprintf (stderr, "usage: server filename\n");
exit(1);
}
//socket file descriptor, file descriptor for specific client connections
int sockfd, new_fd;
//...get socket file descriptor for sockfd, bind sockfd to predetermined port, listen for incoming connections
//...reaps zombie processes
printf("awaiting connections...\n");
while(1) {
//...accepts any incoming connections, gets file descriptor and assigns to new_fd
if (!fork()) {
//close socket file descriptor, only need file descriptor for specific client connection
close (sockfd);
//open a file for reading
FILE *filefd = fopen (argv[1], "rb");
//send filename to client
if (send (new_fd, argv[1], strlen (argv[1]) * sizeof(char), 0) == -1)
{ perror ("send"); }
//put counter at end of selected file, and find length
fseek (filefd, 0, SEEK_END);
long fileLength = ftell (filefd);
//convert length to network form and send it to client
long fl = htonl(fileLength);
//Are we sure this is sending all the bytes??? TEST
if (send (new_fd, &fl, sizeof fl, 0) == -1)
{ perror ("send"); }
//variable stating amount of data unsent
long len = fileLength;
//Until file is sent, keep sending
while(len > 0) {
printf("remaining: %ld\ntotal: %ld\n", len, fileLength);
//set counter at beginning of unread segment
fseek (filefd, fileLength - len, SEEK_SET);
//length of the segment; 255 unless last segment
short bufLength;
if (len > 255) {
len -= 255;
bufLength = 255;
} else {
bufLength = len;
len = 0;
}
//buffer of 256 bytes; 1 byte for byte-length of segment, 255 bytes of data
char buf[256];
//Set first byte of buffer as the length of the segment
//converting short to char requires subtracting 128
buf[0] = bufLength - 128;
//read file into the buffer starting after the first byte of the buffer
fread(buf + 1, 1, bufLength * sizeof(char), filefd);
//Send data to client
if (send (new_fd, buf, sizeof buf, 0) == -1)
{ perror ("send"); }
}
fclose (filefd);
close (new_fd);
exit (0);
}
close (new_fd);
}
return 0;
}
Note: I've simplified the code a bit, to make it clearer I hope.
Anything beginning with //... represents a bunch of code
You seem to be assuming that each send() will either transfer the full number of bytes specified or will error out, and that each one will pair perfectly with a recv() on the other side, such that the recv() receives exactly the number of bytes sent by the send() (or errors out), no more and no less. Those are not safe assumptions.
You don't show the code by which you set up the network connection. If you're using a datagram-based protocol (i.e. UDP) then you're more likely to get the send/receive boundary matching you expect, but you need to account for the possibility that packets will be lost or corrupted. If you're using a stream-based protocol (i.e. TCP) then you don't have to be too concerned with data loss or corruption, but you have no reason at all to expect boundary-matching behavior.
You need at least three things:
1. An application-level protocol on top of the network layer. You've got parts of that already, such as in how you transfer the file length first to advise the client about how much content to expect, but you need to do similar for all data transferred that are not of pre-determined, fixed length. Alternatively, invent another means to communicate data boundaries.
2. Every send() / write() that aims to transfer more than one byte must be performed in a loop to accommodate transfers being broken into multiple pieces. The return value tells you how many of the requested bytes were transferred (or at least how many were handed off to the network stack), and if that's fewer than requested you must loop back to try to transfer the rest.
3. Every recv() / read() that aims to transfer more than one byte must be performed in a loop to accommodate transfers being broken into multiple pieces. I recommend structuring that along the same lines as described for send(), but you also have the option of receiving data until you see a pre-arranged delimiter. The delimiter-based approach is more complicated, however, because it requires additional buffering on the receiving side.
Without those measures, your server and client can easily get out of sync. Among the possible results of that are that the client interprets part of the file name or part of the file content as the file length.
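The looping described above for send() and recv() could look roughly like this. It is a minimal sketch, assuming blocking sockets; send_all and recv_all are names made up for this example, not functions from any library.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Send exactly len bytes. Returns 0 on success, -1 on error. */
int send_all(int fd, const void *data, size_t len)
{
    const char *p = data;
    while (len > 0) {
        ssize_t n = send(fd, p, len, 0);
        if (n == -1) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            return -1;
        }
        p += n;                     /* skip what was accepted ... */
        len -= (size_t)n;           /* ... and loop for the rest */
    }
    return 0;
}

/* Receive exactly len bytes. Returns 0 on success, -1 on error or early EOF. */
int recv_all(int fd, void *data, size_t len)
{
    char *p = data;
    while (len > 0) {
        ssize_t n = recv(fd, p, len, 0);
        if (n == 0)
            return -1;              /* peer closed before len bytes arrived */
        if (n == -1) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```

With helpers like these, the client could read the file length with a single recv_all call for the agreed number of bytes, regardless of how the bytes happen to be split up in transit.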
Even though you removed it from that code, I'll make an educated guess and assume that you're using TCP or some other stream protocol here. This means that the data that the server sends is a stream of bytes, and the recv calls will not correspond to the send calls in the amount of data they get.
It is equally legal for your first recv call to just get one byte of data, as it is to get the file name, file size and half of the file.
You say
When I put in a print statement,
but you don't say where. I'll make another educated guess here and guess that you did it on the server before sending the file length. And that happened to shake things enough that the data amounts that were sent on the connection just accidentally happened to match what you were expecting on the client.
You need to define a protocol. Maybe start with a length of the filename, then the filename, then the length of the file. Or always send 256 bytes for the filename regardless of how long it is. Or send the file name as a 0-terminated string and try to figure out the data from that. But you can never assume that just because you called send with X bytes that the recv call will get X bytes.
I believe the issue is actually a compound of everything you and others have said. In the server code you send the name of the file like this:
send (new_fd, argv[1], strlen (argv[1]) * sizeof(char), 0);
and receive it in the client like this:
recv (sockfd, name, 255 * sizeof (char), 0);
This will cause an issue when the filename length is anything less than 255. Since TCP is a stream protocol (as mentioned by @Art), there are no real boundaries between the sends and recvs, which can cause you to receive data in odd places where you are not expecting them.
My recommendation would be to first send the length of the filename, e.g.:
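// server: send a fixed-width, 4-byte length in network byte order
uint32_t namelen = htonl(strlen(argv[1]));   // uint32_t (from stdint.h) is always 4 bytes, unlike long
send (new_fd, &namelen, 4, 0);
send (new_fd, argv[1], strlen (argv[1]) * sizeof(char), 0);
// client: read the 4-byte length, then that many bytes of filename
uint32_t namelen;
recv (sockfd, &namelen, 4, 0);
namelen = ntohl(namelen);
recv (sockfd, name, namelen * sizeof (char), 0);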
This will ensure that you are always aware of exactly how long your filename is and makes sure that you aren't accidentally reading your file length from somewhere in the middle of your file (which is what I expect is happening currently).
edit.
Also, be cautious when you are sending sized numbers. If you use the sizeof call on them, you may be sending and receiving different sizes. This is why I hard-coded the sizes in the send and recv for the name length so that there is no confusion on either side.
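To make the length-prefix layout concrete, here is a small sketch that builds and parses such a frame in memory, leaving the actual socket calls aside. The function names encode_name and decode_name are invented for this example. It uses uint32_t so the length field is 4 bytes on every platform.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/*
 * Write a 4-byte big-endian length followed by the name into out.
 * Returns the total number of bytes written, or 0 if it does not fit.
 */
size_t encode_name(const char *name, unsigned char *out, size_t outcap)
{
    uint32_t wire_len;
    size_t name_len = strlen(name);

    if (name_len > UINT32_MAX || outcap < sizeof wire_len ||
        name_len > outcap - sizeof wire_len)
        return 0;
    wire_len = htonl((uint32_t)name_len);
    memcpy(out, &wire_len, sizeof wire_len);
    memcpy(out + sizeof wire_len, name, name_len);
    return sizeof wire_len + name_len;
}

/*
 * Parse a frame produced by encode_name. On success, copies the
 * NUL-terminated name into name and returns the number of bytes consumed.
 * Returns 0 if the frame is incomplete or the name does not fit.
 */
size_t decode_name(const unsigned char *in, size_t inlen,
                   char *name, size_t namecap)
{
    uint32_t wire_len, name_len;

    if (inlen < sizeof wire_len)
        return 0;
    memcpy(&wire_len, in, sizeof wire_len);
    name_len = ntohl(wire_len);
    if (name_len > inlen - sizeof wire_len || name_len >= namecap)
        return 0;
    memcpy(name, in + sizeof wire_len, name_len);
    name[name_len] = '\0';
    return sizeof wire_len + name_len;
}
```

A return value of 0 from decode_name on a short buffer is the cue to receive more bytes before trying again, which is how the framing copes with a recv that returns only part of the frame.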
Well, after some testing, I discovered that the issue causing the problem did have something to do with htonl(), though I had still read the data incorrectly in the beginning. It wasn't that htonl() wasn't working at all, but that I didn't realize a 'long' has different lengths depending on system architecture (thanks @tofro). That is to say, the length of a 'long' integer on 32-bit and 64-bit operating systems is 4 bytes and 8 bytes, respectively, and the htonl() function (from arpa/inet.h) is for 4-byte integers. I was using a 64-bit OS, which explains why the value was being fudged. I fixed the issue by using the int32_t type (from stdint.h) to store the file length. So the main issue in this case was not that it was becoming out of sync (I think). But as for everyone's advice towards developing an actual protocol, I think I know exactly what you mean, I definitely understand why it's important, and I'm currently working towards it. Thank you all for all your help.
EDIT: Well now that it has been several years, and I know a little more, I know that this explanation doesn't make sense. All that would result from long being larger than I expected (8 bytes rather than 4) is that there's some implicit casting going on. I used sizeof(long) in the original code rather than hardcoding it to assume 4 bytes, so that particular (faulty) assumption of mine shouldn't have produced the bug I saw.
The problem is almost certainly what everyone else said: one call to recv was not getting all of the bytes representing the file length. At the time I doubted this was the real cause of the behaviour I saw, because the file name (of arbitrary length) I was sending through was never partially received (i.e. the client always created a file with the correct filename). Only the file length was messed up. My hypothesis at the time was that recv mostly respected message boundaries, and that while recv could possibly return only part of the data, it was more likely that it was returning all of it and there was another bug in my code. I now know this isn't true at all, and TCP doesn't care.
I'm a little curious as to why I didn't see other unexpected behaviour as well (e.g. the file name being wrong on the receiving end), and I wanted to investigate further, but despite managing to find the files, I can't seem to reproduce the problem now. I suppose I'll never know, but at least I understand the main issue here.
I'm using the pcap library but I don't know why I get always this output:
new packet with size: udata= 8 hdr=8 pkt=8
This is the code:
void handle_pcap(u_char *udata, const struct pcap_pkthdr *hdr, const u_char *pkt)
{
DEBUG("DANY new packet with size: udata= %d hdr=%d pkt=%d", (int) sizeof(udata),(int) sizeof(hdr),(int) sizeof(pkt) );
...
stuff
}
and in another file I use:
status = pcap_loop (pcap_obj,
-1 /* How many packets it should sniff for before returning (a negative value
means it should sniff until an error occurs (loop forever) ) */,
handle_pcap /* Callback that will be called*/,
NULL /* Arguments to send to the callback (NULL is nothing) */);
Is that output normal?
I think not, because sometimes my program works and sometimes it doesn't.
You are printing the size of the pointers instead of looking into the pcap_pkthdr* hdr to see the size of the packet.
You can find the size of the captured data and the size of the entire packet by looking at hdr->caplen and hdr->len.
Um. You are getting the size of (the various) pointers.
e.g. sizeof(udata) gets the size of a u_char *. That's why the numbers look suspect.
If you want the sizes of the packets, they are in hdr->caplen and hdr->len (the former is the captured length, the latter is the packet length).