I'm writing code for an application using sockets where I have to send integers and strings over the network, but I'm having trouble packing the data into a buffer for transmission. I tried doing this:
sendline[0] = htons(3);
sendline[1] = htons(strlen(argv[3]));
for(int i = 0; i < strlen(argv[3]); i++)
{
    sendline[i + 2] = argv[3][i];
}
sendline[2 + strlen(argv[3])] = htons(atoi(argv[4]));
sendline[3 + strlen(argv[3])] = '\0';
but it doesn't work.
Where am I going wrong here? Also, what would be the best way to serialize this kind of data?
This is my deserialization code :
case '3': // switching on the value of buf[0]
{
    int i;
    int len = ntohs(buf[1]);
    char* ch = (char*)malloc(len * sizeof(char));
    for(i = 0; i < len; i++)
    {
        ch[i] = buf[i + 2];
    }
}
You should use htonl() and ntohl() if you are serializing 32-bit (4-byte) integers.
send:
int my_integer = INT32_MAX;
uint32_t data = htonl(my_integer);
write(s, &data, sizeof(data));
recv:
uint32_t data;
read(s, &data, sizeof(data));
int my_integer = ntohl(data);
There are many questions on SO already describing serialization and the htonl()/ntohl() functions, but beware of those omitting endianness. Remember to check the return values of read() and write().
As for the "strings", they need no byte-order conversion if they are char data holding ASCII text. UTF-8 also works without conversion, but beware of UTF-16. The important thing to grasp about endianness early on is that it refers to the order of bytes, not the order of bits within a byte. IIRC, C guarantees the representation of a single byte, so it appears the same on all C platforms.
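To make the byte-order point concrete, here is a minimal sketch that exposes the in-memory bytes of a value after htonl(). The helper name network_bytes is hypothetical, and the code assumes a POSIX system where htonl() is declared in <arpa/inet.h>.

```c
#include <arpa/inet.h>  /* htonl, ntohl (POSIX) */
#include <stdint.h>
#include <string.h>

/* Copy the in-memory bytes of htonl(value) into out[0..3].
 * Hypothetical helper, for illustration only. */
static void network_bytes(uint32_t value, unsigned char out[4])
{
    uint32_t wire = htonl(value);    /* host order -> network (big-endian) order */
    memcpy(out, &wire, sizeof wire); /* look at the bytes exactly as they would be sent */
}
```

Whatever the host byte order is, the most significant byte of the value ends up first in out, which is what a peer calling ntohl() expects to receive.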
A buffer sent or received via sockets consists of chars, each 1 byte long. An int or uint16_t is longer.
So what really happens when you assign?
sendline[0] = htons(3);
sendline[0] is 1 byte long, but htons(3) is 2 bytes long, so the value is truncated to its low byte (on a little-endian host htons(3) is 0x0300, so sendline[0] ends up as 0). The same happens with sendline[1]:
sendline[1] = htons(strlen(argv[3]));
Obviously your buffer gets corrupted.
Either send integers over one socket and strings over another, convert the integers to strings, or write each integer across as many bytes as it occupies so that neighbouring fields do not overwrite each other (which means using sizeof(int) in some places).
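As a rough sketch of the last option, the functions below copy every multi-byte field into the buffer with memcpy() instead of assigning it to a single char. The wire layout, the function names pack_message and unpack_message, and their parameters are assumptions made for illustration; they are not taken from the question.

```c
#include <arpa/inet.h>  /* htons, ntohs (POSIX) */
#include <stdint.h>
#include <string.h>

/* Assumed wire layout (not from the original post):
 *   1 byte  message type
 *   2 bytes string length, network byte order
 *   N bytes string data (no terminator)
 *   2 bytes integer value, network byte order
 * Returns the number of bytes written, or 0 if the buffer is too small. */
size_t pack_message(unsigned char *out, size_t cap, uint8_t type,
                    const char *str, uint16_t value)
{
    size_t len = strlen(str);
    if (len > UINT16_MAX || cap < 1 + 2 + len + 2)
        return 0;

    uint16_t net_len = htons((uint16_t)len);
    uint16_t net_val = htons(value);
    size_t pos = 0;

    out[pos++] = type;
    memcpy(out + pos, &net_len, 2); pos += 2;   /* copies both bytes, no truncation */
    memcpy(out + pos, str, len);    pos += len;
    memcpy(out + pos, &net_val, 2); pos += 2;
    return pos;
}

/* Reverse of pack_message. The string is NUL-terminated on success.
 * Returns 0 on success, -1 if the input is short or str_out is too small. */
int unpack_message(const unsigned char *in, size_t in_len, uint8_t *type,
                   char *str_out, size_t str_cap, uint16_t *value)
{
    uint16_t net_len, net_val;
    if (in_len < 3)
        return -1;
    *type = in[0];
    memcpy(&net_len, in + 1, 2);
    size_t len = ntohs(net_len);
    if (in_len < 3 + len + 2 || str_cap < len + 1)
        return -1;
    memcpy(str_out, in + 3, len);
    str_out[len] = '\0';
    memcpy(&net_val, in + 3 + len, 2);
    *value = ntohs(net_val);
    return 0;
}
```

The return value of pack_message is the byte count to hand to write() or send(); strlen() on the packed buffer would stop at the first zero byte.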
I have built a Winsock2 server. Part of that program has a function that receives data from clients. Originally the receive function I built would peek at the incoming data and determine if there was any additional data to be read before allowing recv() to pull the data from the buffer. This worked fine in the beginning of the project but I am now working on improving performance.
Here is a portion of the code I've written to eliminate the use of peek:
unsigned char recv_buffer[4096];
unsigned char *pComplete_buffer = malloc(sizeof(recv_buffer) * sizeof(unsigned char*));
int offset = 0;
int i = 0;
...
for (i; i <= sizeof(recv_buffer); i++) {
    if (recv_buffer[i] == NULL) {
        break;
    }
    pComplete_buffer[offset] = recv_buffer[i];
    offset++;
}
...
This code would work great but the problem is that NULL == 0. If the client happens to send a 0 this loop will break prematurely. I thought I would be clever and leave the data uninitialized to 0xcc and use that to determine the end of recv_buffer but it seems that clients sometimes send that as part of their data as well.
Question:
Is there a character I can initialize recv_buffer to and reliably break on?
If not, is there another way I can eliminate the use of peek?
The correct solution is to keep track of how many bytes you store in recv_buffer to begin with. sizeof() gives you the TOTAL POSSIBLE size of the buffer, but it does not tell you HOW MANY bytes actually contain valid data.
recv() tells you how many bytes it returns to you. When you recv() data into recv_buffer, use that return value to increment a variable you define to indicate the number of valid bytes in recv_buffer.
For example:
unsigned char recv_buffer[4096];
int num_read, recv_buffer_size = 0;
const int max_cbuffer_size = sizeof(recv_buffer) * sizeof(unsigned char*);
unsigned char *pComplete_buffer = malloc(max_cbuffer_size);
...
num_read = recv(..., recv_buffer, sizeof(recv_buffer), ...);
if (num_read <= 0) {
    // error handling...
    return;
}
recv_buffer_size = num_read;
...
int available = max_cbuffer_size - offset;
int num_to_copy = min(recv_buffer_size, available);
memcpy(pComplete_buffer + offset, recv_buffer, num_to_copy);
offset += num_to_copy;
memmove(recv_buffer, recv_buffer + num_to_copy, recv_buffer_size - num_to_copy);
recv_buffer_size -= num_to_copy;
...
Is there a character I can initialize recv_buffer to and reliably break on?
Nope. If the other side can send any character at any time, you'll have to examine them.
If you know the sender will never send two NULs in a row (\0\0), you could check for that. But then some day the sender will decide to do that.
If you can change the message structure, I'd send the message length first (as a byte, network-ordered short or int depending on your protocol). Then, after parsing that length, the receiver will know exactly how long to keep reading.
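A minimal sketch of such length-prefixed framing, working purely on memory buffers so it can sit on top of whatever receive loop you already have. The 4-byte big-endian prefix and the names frame_message and try_extract_frame are assumptions for illustration.

```c
#include <arpa/inet.h>  /* htonl, ntohl (POSIX; Winsock declares them in <winsock2.h>) */
#include <stdint.h>
#include <string.h>

/* Assumed framing: a 4-byte big-endian length followed by that many payload bytes. */

/* Write one frame into out. Returns the frame size, or 0 if out is too small. */
size_t frame_message(unsigned char *out, size_t cap,
                     const unsigned char *payload, uint32_t payload_len)
{
    if (cap < 4 || cap - 4 < payload_len)
        return 0;
    uint32_t net_len = htonl(payload_len);
    memcpy(out, &net_len, 4);
    memcpy(out + 4, payload, payload_len);
    return 4 + (size_t)payload_len;
}

/* Inspect the bytes accumulated so far. If a complete frame is present, set
 * *payload and *payload_len and return the number of bytes consumed; return 0
 * if more data has to be received first. */
size_t try_extract_frame(const unsigned char *buf, size_t buf_len,
                         const unsigned char **payload, uint32_t *payload_len)
{
    uint32_t net_len;
    if (buf_len < 4)
        return 0;                       /* length prefix not complete yet */
    memcpy(&net_len, buf, 4);
    uint32_t len = ntohl(net_len);
    if (buf_len - 4 < len)
        return 0;                       /* payload not complete yet */
    *payload = buf + 4;
    *payload_len = len;
    return 4 + (size_t)len;
}
```

Because the receiver relies on the length rather than on a sentinel value, payload bytes such as 0x00 or 0xcc need no special treatment.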
Also if you're using select, that will block until there's something to read or the socket closes (mostly -- read the docs).
I'm (unsuccessfully) trying to pass a .wav file through a socket in C.
The following code reads the .wav file and assigns it to the short samples variable (num_samples being its size).
char* filename = "./test.wav";
FILE* f;
short* samples; // stored signal
int num_samples, curr_samples; // count of signal samples
if ((f = fopen(filename, "rb")) == NULL) {
    fprintf(stderr, "cannot open %s\n", filename);
    return;
}
/* reads the .wav file into memory (samples) */
if (read_wav(f, &samples, &num_samples) < 0) {
    return;
}
fclose(f);
Then, the samples are loaded iteratively into buffers and passed through the socket
int buffer_size = 320;
unsigned char buffer[buffer_size];
short bufferSamples[buffer_size/2];
int curr_samples = 0;
while(curr_samples < num_samples) {
    bzero(buffer,buffer_size);
    bzero(bufferSamples,buffer_size/2);
    // Store samples in short array
    for (i = curr_samples; i < buffer_size/2 + curr_samples; i++) {
        bufferSamples[i-curr_samples] = *(samples + i);
    }
    // assign to buffer
    for (i = 0; i < buffer_size; i+=2) {
        unsigned char upper = bufferSamples[i/2] & 0xFF;
        unsigned char lower = bufferSamples[i/2] >> 8;
        buffer[i] = lower;
        buffer[i+1] = upper;
    }
    n = write(sockfd,buffer,strlen(buffer));
    if (n < 0) error("ERROR writing to socket");
    // sleep and increment
    usleep(10000);
    curr_samples += buffer_size/2;
}
For the sake of simplicity, I have not posted the entire code (socket definitions etc)
I believe I have confirmed that bufferSamples correctly stores the signal in shorts for each iteration (by comparing its prints with the output of the command "od -s test.wav"), so I suspect that the problem occurs when I assign the short array to the char buffer. I have tried altering the endianness to no avail.
The server also reads input from another client I have no access to, but it handles its inputs correctly, so the problem lies in this client.
I have little experience with sockets and byte conversions, so I would be grateful if you provided me with some insight. Hopefully the solution will be quite obvious for the experienced.
EDIT: As it turns out, the problem lies in the interface used to pass the signal to this code. So, apart from the strlen() function, it seems ok after all. Nevertheless, thank you for your educating tips!
Mostly this looks OK. The code could be tidier, with more functions to convert between short and 2x char etc. to improve readability, but what you have written looks correct at a glance.
There is one problem however.
n = write(sockfd,buffer,strlen(buffer));
strlen() is for counting the length of a null-terminated string.
If your buffer contains any byte which is 0, strlen() will stop counting there. You probably want to send the size of the buffer each time: sizeof(buffer).
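A small sketch of the kind of conversion helper mentioned above, assuming 16-bit samples sent most significant byte first (the order the question's loop produces). The names put_be16 and pack_samples_be are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Store one 16-bit sample as two bytes, most significant byte first. */
static void put_be16(unsigned char *dst, int16_t sample)
{
    uint16_t u = (uint16_t)sample;      /* avoid right-shifting a negative value */
    dst[0] = (unsigned char)(u >> 8);
    dst[1] = (unsigned char)(u & 0xFF);
}

/* Pack `count` samples into out (which must hold 2 * count bytes).
 * Returns the number of bytes produced, which is the length to pass
 * to write() instead of strlen(). */
size_t pack_samples_be(unsigned char *out, const int16_t *samples, size_t count)
{
    for (size_t i = 0; i < count; i++)
        put_be16(out + 2 * i, samples[i]);
    return 2 * count;
}
```

Returning the byte count keeps the length next to the data, so a sample value of 0 in the middle of the buffer no longer shortens the write.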
I agree with the other answer that the use of strlen is definitely going to cause you a problem. I'm answering in addition to point out that you are doing a lot of unnecessary work that may also be bug prone. Copying the samples into another short buffer and then copying that short buffer into a buffer of chars can both be skipped. Another issue you're going to have is when you reach the end of your buffer. If its size is not a multiple of 320 bytes then you'll just read right past the end of the buffer.
int numBytesRemaining = num_samples * 2;
int numBytesThisTime = 0;
int numBytesWritten = 0;
unsigned char* pBuf = (unsigned char*)samples;
while (numBytesRemaining > 0)
{
    numBytesThisTime = 320;
    // don't read past the end of the buffer.
    if (numBytesThisTime > numBytesRemaining)
        numBytesThisTime = numBytesRemaining;
    // write() may accept fewer bytes than requested,
    // so keep its return value.
    numBytesWritten = write(sockfd,pBuf,numBytesThisTime);
    if (numBytesWritten < 0) error("ERROR writing to socket");
    // sleep and increment
    usleep(10000);
    // only advance by numBytesWritten
    numBytesRemaining -= numBytesWritten;
    pBuf += numBytesWritten; // advance the pointer.
}
I'm trying to interpret WebSocket frames that I get over a TCP connection. I want to do this in pure C (so no reinterpret_cast). The format is specified in IETF RFC 6455. I want to fill the following struct:
typedef struct {
    uint8_t flags;
    uint8_t opcode;
    uint8_t isMasked;
    uint64_t payloadLength;
    uint32_t maskingKey;
    char* payloadData;
} WSFrame;
with the following Function:
static void parseWsFrame(char *data, WSFrame *frame) {
    frame->flags = (*data) & FLAGS_MASK;
    frame->opcode = (*data) & OPCODE_MASK;
    //next byte
    data += 1;
    frame->isMasked = (*data) & IS_MASKED;
    frame->payloadLength = (*data) & PAYLOAD_MASK;
    //next byte
    data += 1;
    if (frame->payloadLength == 126) {
        frame->payloadLength = *((uint16_t *)data);
        data += 2;
    } else if (frame->payloadLength == 127) {
        frame->payloadLength = *((uint64_t *)data);
        data += 8;
    }
    if (frame->isMasked) {
        frame->maskingKey = *((uint32_t *)data);
        data += 4;
    }else{
        //still need to initialize it to shut up the compiler
        frame->maskingKey = 0;
    }
    frame->payloadData = data;
}
The code is for the ESP8266, so debugging is only possible with printfs to the serial console. Using this method, I discovered that the code crashes right after frame->maskingKey = *((uint32_t *)data); and that the first two ifs get skipped, so this is the first pointer cast that actually executes.
The data is not \0 terminated, but I get the size in the data received callback. In my test, I'm trying to send the message 'test' over the already established WebSocket, and the received data length is 10, so:
1 byte flags and opcode
1 byte masked and payload length
4 bytes masking key
4 bytes payload
At the point the code crashes, I expect data to be offset by 2 bytes from the initial position, so there is enough data left to read the following 4 bytes.
I did not code any C for a long time, so I expect only a small error in my code.
PS: I've seen a lot of code where they interpret the values byte-by-byte and shift the values, but I see no reason why this method should not work as well.
The problem with casting a char* to a pointer to a larger type is that some architectures do not allow unaligned reads.
That is, for example, if you try to read a uint32_t through a pointer, then the value of the pointer itself has to be a multiple of 4. Otherwise, on some architectures, you will get a bus fault (e.g. - signal, trap, exception, etc.) of some sort.
Because this data is coming in over TCP and the format of the stream / protocol is laid out without any padding, then you will likely need to read it out from the buffer into local variables byte by byte (e.g. - using memcpy) as appropriate. For example:
if (frame->isMasked) {
    memcpy(&frame->maskingKey, data, 4);
    data += 4;
    // TODO: handle endianness: e.g.: frame->maskingKey = ntohl(frame->maskingKey);
}else{
    //still need to initialize it to shut up the compiler
    frame->maskingKey = 0;
}
There are two problems:
data might not be correctly aligned for uint32_t
The bytes in data might not be in the same order as your hardware uses for the value representation of integers (sometimes called an "endianness issue").
To write reliable code, look at the message specification to see which order the bytes are coming in. If they are most-significant-byte first then the portable version of your code would be:
unsigned char *udata = (unsigned char *)data;
frame->maskingKey = udata[0] * 0x1000000ul
+ udata[1] * 0x10000ul
+ udata[2] * 0x100ul
+ udata[3];
This might look like a handful at first, but you could make an inline function that takes a pointer as argument, and returns the uint32_t, which will keep your code readable.
A similar problem applies to your reads of uint16_t and uint64_t.
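The inline-function idea could look roughly like the sketch below. The names read_be16, read_be32 and read_be64 are hypothetical, and the code assumes the fields are stored most-significant-byte first.

```c
#include <stdint.h>

/* Read big-endian (most-significant-byte-first) integers from a byte buffer.
 * There is no alignment requirement: every access is a single-byte read. */
static inline uint16_t read_be16(const unsigned char *p)
{
    return (uint16_t)(((uint16_t)p[0] << 8) | p[1]);
}

static inline uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

static inline uint64_t read_be64(const unsigned char *p)
{
    return ((uint64_t)read_be32(p) << 32) | read_be32(p + 4);
}
```

In parseWsFrame the casts would then become calls such as frame->payloadLength = read_be16((unsigned char *)data);. RFC 6455 specifies the extended payload length fields in network byte order, so the big-endian readers fit those fields.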
When receiving an integer array, you need to check the number of bytes actually received.
For example, when receiving an integer array with length 100:
int count = 0;
int msg[100];
while(count < 100 * sizeof(int)){
    count += read(fd, msg + count / sizeof(int), 100 * sizeof(int) - count);
}
Is this the right way? Can read() return a value which is not a multiple of sizeof(int)?
If this is not correct, what is the right way to receive an integer array?
On Linux you can use the MSG_WAITALL option for recv(), which makes the function wait for the full given length of incoming data.
Alternatively (working on all platforms), you can write a generic receive function that receives a given number of bytes, like this one (which assumes the socket is not set as non-blocking; requires including <stdint.h>):
/// \brief Receives a block of data of the specified size
/// \param sk Socket for incoming data
/// \param data Pointer to input buffer
/// \param len Number of bytes to read
/// \return Number of bytes received (same as len) on success, -1 on failure
int block_recv(const int sk, uint8_t* data, unsigned int len)
{
    int i, j = 0;
    while (len > 0) {
        i = recv(sk, (char*) data, len, 0);
        if (i <= 0) {
            return -1;
        }
        data += i;
        len -= i;
        j += i;
    }
    return j;
}
Then you can just call it to receive your integer buffer:
if (block_recv(fd, (uint8_t*) msg, sizeof(msg)) != sizeof(msg)) {
    fprintf(stderr, "Error receiving integer buffer...\n");
    // whatever error handling you need...
}
You are correct that read may not return all the data you requested, esp. if it is connected to a network socket. read will not necessarily return a value with multiple of sizeof(int). If you want to use this (manual) method of receiving data, I would probably recommend you count bytes instead of sizeof(int)s (which can be 4 or 8 depending on your system). Even easier than doing this is to use something like Protocol Buffers, which lets you define a data format for your packets and serialize/deserialize them quickly and easily. (Define a message that simply includes your integer array and let protobuf take care of everything else.)
You're right - there's no guarantee that read will return data whose size is a multiple of sizeof(int).
The minimum size you may receive is a char.
There are other issues such as endianness when receiving integers from across a network (which obviously also apply to sending integers across a network) that you should be aware of.
For these reasons, an easier solution is to use a char[] instead of int[] to store the message, and then copy it to an int[]. If you are concerned about efficiency, prove to yourself that this is a bottleneck (profile your code) before you worry about optimizing code.
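A sketch of that copy step, assuming the sender wrote each element as a 32-bit value converted with htonl(); the function name decode_int32_array is hypothetical.

```c
#include <arpa/inet.h>  /* ntohl (POSIX) */
#include <stdint.h>
#include <string.h>

/* Convert `count` 32-bit big-endian integers stored in raw bytes into
 * host-order values. Assumes the sender used htonl() on each element. */
void decode_int32_array(const unsigned char *raw, int32_t *out, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        uint32_t net;
        memcpy(&net, raw + 4 * i, sizeof net);  /* safe for any alignment */
        out[i] = (int32_t)ntohl(net);
    }
}
```

Using a fixed-width type such as int32_t on both ends also sidesteps the question of how large int is on each machine.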
Also, if you are sending and recving across a network, be aware that protocols like TCP are stream-based, i.e. they simply send streams of characters and you need to implement some way of detecting the end of a message and formatting it to your needs. Two common ways are to either send the length of the message as a header or to use a special character like '\n' to signal the end of the message. Also, since you are sending an array, you could use something like '|' to separate elements.
So a sample message can be: "1|100|239|23|\n"
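Parsing a message in that text format could be sketched as follows. The function name parse_pipe_message is hypothetical, and the code assumes every element is followed by '|' and the message ends with '\n', as in the sample.

```c
#include <errno.h>
#include <stdlib.h>

/* Parse a message such as "1|100|239|23|\n" into out[].
 * Returns the number of integers parsed, or -1 on malformed input
 * or if out has room for fewer than the number of elements. */
int parse_pipe_message(const char *msg, long *out, int max_out)
{
    int n = 0;
    while (*msg != '\n' && *msg != '\0') {
        char *end;
        errno = 0;
        long v = strtol(msg, &end, 10);
        if (end == msg || errno != 0 || *end != '|' || n >= max_out)
            return -1;
        out[n++] = v;
        msg = end + 1;                  /* skip the '|' separator */
    }
    return (*msg == '\n') ? n : -1;     /* the terminator must be present */
}
```

The caller still has to collect bytes from the socket until a '\n' arrives before handing the line to the parser.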
Following my previous question (Why do I get weird results when reading an array of integers from a TCP socket?), I have come up with the following code, which seems to work, sort of. The code sample works well with a small number of array elements, but once it becomes large, the data is corrupt toward the end.
This is the code to send the array of int over TCP:
#define ARRAY_LEN 262144
long *sourceArrayPointer = getSourceArray();
long sourceArray[ARRAY_LEN];
for (int i = 0; i < ARRAY_LEN; i++)
{
    sourceArray[i] = sourceArrayPointer[i];
}
int result = send(clientSocketFD, sourceArray, sizeof(long) * ARRAY_LEN);
And this is the code to receive the array of int:
#define ARRAY_LEN 262144
long targetArray[ARRAY_LEN];
int result = read(socketFD, targetArray, sizeof(long) * ARRAY_LEN);
The first few numbers are fine, but further down the array the numbers start going completely different. At the end, when the numbers should look like this:
0
0
0
0
0
0
0
0
0
0
But they actually come out as this?
4310701
0
-12288
32767
-1
-1
10
0
-12288
32767
Is this because I'm using the wrong send/receive size?
The call to read(..., len) doesn't read len bytes from the socket, it reads a maximum of len bytes. Your array is rather big and it will be split over many TCP/IP packets, so your call to read probably returns just a part of the array while the rest is still "in transit". read() returns how many bytes it received, so you should call it again until you received everything you want. You could do something like this:
long targetArray[ARRAY_LEN];
char *buffer = (char*)targetArray;
size_t remaining = sizeof(long) * ARRAY_LEN;
while (remaining) {
    ssize_t recvd = read(socketFD, buffer, remaining);
    // TODO: check for read errors etc here...
    remaining -= recvd;
    buffer += recvd;
}
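One way the error checks left as a TODO might be filled in is sketched below. The name read_full is hypothetical, and the code assumes a blocking POSIX descriptor.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>  /* ssize_t */
#include <unistd.h>     /* read */

/* Read exactly `len` bytes from fd into buf (blocking descriptor assumed).
 * Returns 0 on success, -1 on error or if the peer closed the connection
 * before `len` bytes arrived. */
int read_full(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;          /* real error */
        }
        if (n == 0)
            return -1;          /* end of stream before all data arrived */
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```

With this helper the receiving side reduces to a single call: read_full(socketFD, targetArray, sizeof(long) * ARRAY_LEN).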
Is the following ok?
for (int i = 0; sourceArrayPointer < i; i++)
You are comparing apples and oranges (that is, pointers and integers). This loop does not get executed, since the pointer to the array of longs is almost always greater than 0. So the data you read on the receiving end comes from an uninitialized array, which results in those incorrect numbers.
It'd rather be:
for (int i = 0; i < ARRAY_LEN; i++)
Use the byte-order functions (htonl(), ntohl() and friends) from <arpa/inet.h>
http://en.wikipedia.org/wiki/Endianness#Endianness_in_networking
Not related to this question, but you also need to take care of endianness of platforms if you want to use TCP over different platforms.
It is much simpler to use some networking library like curl or ACE, if that is an option (additionally you learn a lot more at higher level like design patterns).
There is nothing to guarantee how TCP will packet up the data you send to a stream - it only guarantees that it will end up in the correct order at the application level. So you need to check the value of result, and keep on reading until you have read the right number of bytes. Otherwise you won't have read the whole of the data. You're making this more difficult for yourself using a long array rather than a byte array - the data may be sent in any number of chunks, which may not be aligned to long boundaries.
I see a number of problems here. First, this is how I would rewrite your send code as I understand it. I assume getSourceArray always returns a valid pointer to a static or malloced buffer of size ARRAY_LEN. I'm also assuming you don't need sourceArrayPointer later in the code.
#define ARRAY_LEN 262144
long *sourceArrayPointer = getSourceArray();
long sourceArray[ARRAY_LEN];
long *sourceArrayIdx = sourceArray;
for (; sourceArrayIdx < sourceArray+ARRAY_LEN ; )
    *sourceArrayIdx++ = *sourceArrayPointer++; // copy the values, not the pointers
int result = send(clientSocketFD, sourceArray, sizeof(long) * ARRAY_LEN, 0);
if (result < 0 || (size_t)result < sizeof(long) * ARRAY_LEN)
    printf("send returned %d\n", result);
Looking at your original code, I'm guessing that your for loop was messed up and never executed, so you ended up sending whatever random junk happened to be in sourceArray. Basically your condition
sourceArrayPointer < i;
is pretty much guaranteed to fail the first time through.