Send an integer in two bytes - C

I want to transfer an integer number in two bytes. As you know, using the sprintf function splits your number into ASCII characters and sends them (in my case over an Ethernet connection). For example:
sprintf(Buffer,"A%02ldB",Virtual);
My Virtual number is between 0 and 3600. Using sprintf sends ASCII codes (3600 is converted to 4 ASCII bytes). However, looking at 3600 in binary form, we can see that it squeezes into 12 bits (or two bytes, with bits 13-16 unused). But I can't send the binary form with sprintf either, because sprintf would send an ASCII character for each 1 and 0. If I could convert my Virtual variable into two bytes, I could cut down the number of bytes transferred. So how can I convert a variable to two bytes and send them, given that sprintf won't do it?

sprintf() doesn't do this, partly because it's specifically meant to work with strings (not binary data), and partly because there's nothing to convert – an integer variable in C already has bytes representing the number.
For example, if you have the variable declared as uint16_t, it will always hold your number in exactly two bytes, with &Virtual giving the memory address where those bytes are kept, and you can directly memcpy() the data into your network buffer. (Note: uint16_t has a fixed size; other types such as short int vary between architectures, although short is always at least 16 bits, so be careful with those.)
The only thing you really should do before sending it is ensure that the bytes are in the correct order. That's another thing that varies between CPU architectures, so use htons()/htonl() to get an integer suitable for sending over the network, and upon receiving use ntohs()/ntohl() to convert it back to native CPU format.
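As a rough sketch of both directions (assuming the receiver is updated to expect raw bytes instead of ASCII, and keeping the 'A'/'B' framing characters from the original format string):
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>   /* htons(), ntohs() */

/* Sender: pack a 0..3600 value as 'A', two raw bytes in network order, 'B'. */
static void pack_value(unsigned char buf[4], uint16_t value)
{
    uint16_t wire = htons(value);            /* network byte order */
    buf[0] = 'A';
    memcpy(&buf[1], &wire, sizeof wire);     /* the two raw bytes */
    buf[3] = 'B';
}

/* Receiver: the reverse. */
static uint16_t unpack_value(const unsigned char buf[4])
{
    uint16_t wire;
    memcpy(&wire, &buf[1], sizeof wire);
    return ntohs(wire);                      /* back to host byte order */
}
The four bytes of buf can then be sent as-is; note that the receiving code can no longer use sscanf()/atoi(), since the payload is binary rather than text.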

Related

Meaning of character in C's streams

I seem to have a blind spot in my understanding of the meaning of character in C's stream abstraction; I just can't seem to stitch the picture together.
What is the meaning of character with respect to binary streams?
From 7.19.7.1p2 ...
If the end-of-file indicator for the input stream pointed to by stream is not set and a next character is present, the fgetc function obtains that character as an unsigned char converted to an int and advances the associated file position indicator for the stream (if defined).
...
Suppose I wrote a file on a machine where characters require 16 bits and I start reading it on a machine on which characters fit in 7 bits. Then what am I actually reading with each call to fgetc? Is it part of the 16-bit character (i.e., am I reading 7 bits at a time) or is the 16-bit character "squeezed" into a 7-bit representation with information loss?
From the spec:
3.7.1 character
single-byte character 〈C〉 bit representation that fits in a byte
and:
3.6 byte
addressable unit of data storage large enough to hold any member of the basic character set of the execution environment
NOTE 1 It is possible to express the address of each individual byte of an object uniquely.
NOTE 2 A byte is composed of a contiguous sequence of bits, the number of which is implementation-defined. The least significant bit is called the low-order bit; the most significant bit is called the high-order bit.
So on your writing machine, char is likely a 16-bit type. On your reading machine, char is likely an 8-bit type. C requires that char be at least an 8-bit type:
5.2.4.2.1 Sizes of integer types
...
— number of bits for smallest object that is not a bit-field (byte)
CHAR_BIT 8
So on your reading machine, you'll need to make two fgetc calls to read each half of the 16-bit characters you wrote on the original machine.
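A rough sketch of that, assuming the writing machine stored each 16-bit character low byte first (you have to know or define the byte order of the file yourself):
#include <stdio.h>

/* Read one 16-bit unit as two consecutive 8-bit bytes.
   Returns -1 on end of file, otherwise the combined value. */
static long read_u16(FILE *fp)
{
    int lo = fgetc(fp);
    int hi = fgetc(fp);
    if (lo == EOF || hi == EOF)
        return -1;
    return (long)lo | ((long)hi << 8);   /* reassemble the two halves */
}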
Technically, char is a one byte type that can hold values from -128 to 127; depending on the architecture it can also be unsigned, holding values from 0 to 255. But although it is, strictly speaking, an integer type, it is not used to hold integers generally. You will almost always use type int or one of its many variations for that.
Type char, in practice, has a couple of dedicated uses:
It can hold an ASCII value. As there are 128 ASCII codes, or 256 codes in the various extended versions, char is an ideal type for this purpose. But when it is used this way, it nearly always appears in a program as part of a string, which (in C, although not always in C++) is a simple array of char.
If you are designing a structure to be compact, and you want to create a field (that is, a data member) that will never hold more than 256 different values, you might want to use char type for that purpose as well.
Note that there is a subtle point here not always obvious to new C programmers. You can assign ASCII codes to char variables, but that is not really a property of char in C. For example, I could assign ASCII code numbers to any integer field. The C language itself does not prevent this. But remember that C string library functions are designed to be used with arrays of char, not arrays of int.
char* is how you declare a pointer to a char variable. It’s useful when you want a string with unknown length.
1st example:
char name[10];
strcpy (name, "Jack"); //copies the second argument over the first; whatever you store must fit in 9 characters plus the terminating '\0'
Here you’re reserving 10 pieces of memory. You might use them all or your name might just be “Jack” which, if we account for the '\0' special character that comes at the end of every string, takes only 5 memory chunks. That means you have 5 remaining pieces that you’re not using.
Maybe your name is longer than 10 characters; where will you store the extra ones then? You won't be able to, because you gave a fixed size to your array of chars.
2nd example:
char *name;
This means that you just declared the pointer variable where you'll store the address of the first character of your string. It gives more freedom and flexibility to your usage: whether your name is long or short, you allocate exactly as much memory as it needs (for example with malloc) and then let string functions like strcpy and strcat work on it. Note that declaring the pointer by itself does not reserve any memory for the characters.
In short:
My understanding is that in the first example you fixed both the starting point and the size of your string, which limits what you can fit in there and can also waste memory. In the second example, you only specified the starting point, which grants more freedom and better memory economy. I don't know of any drawbacks to the second example; it's just my first year learning this as well, so maybe the experts can shed a brighter light on this matter than I can.
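To make the difference concrete, here is a small sketch of both variants; note that with the pointer variant you, not strcpy, are responsible for allocating the memory, for example with malloc:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    /* 1st variant: fixed-size array, room for 9 characters plus '\0'. */
    char fixed[10];
    strcpy(fixed, "Jack");                        /* 5 bytes used, 5 unused */

    /* 2nd variant: allocate exactly as much as the string needs. */
    const char *input = "a_name_longer_than_ten_chars";
    char *dynamic = malloc(strlen(input) + 1);    /* +1 for the '\0' */
    if (dynamic == NULL)
        return 1;
    strcpy(dynamic, input);

    printf("%s / %s\n", fixed, dynamic);
    free(dynamic);
    return 0;
}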

About the wav data sub-chunk

I am working on a project in which I have to merge two 8-bit .wav files using C, and I still have no clue how to do it.
I have read about wav files and I want to start by reading one of the files.
There's one thing I didn't understand:
Let's say I have an 8-bit WAV audio file, and I was able to read (even though I am still trying to) the data that starts after the 44th byte. Logically, I will get numbers between 0 and 255.
My question is:
What do those numbers mean?
If I get 255 or 0 what do they mean?
Are they samples from the wave?
Can anyone please explain?
Thanks in advance
Assuming we're not dealing with file format issues, getting values between 0 and 255 means that the audio samples are of unsigned eight-bit format, as you have put it.
One way of merging the data would be to read the data from the files into two buffers, arrays a and b, and sum them value by value: c[i] = a[i] + b[i]. In doing so, you'd have to take care of the following:
the lengths of the files may not be equal
summing unsigned 8-bit samples such as yours will almost certainly overflow
This is usually done with a for loop. You first get the sizes of the data chunks. Your for loop has to be written in such a way that it neither reads past the array boundaries nor ignores data that can be read. To prevent overflows (a sketch follows these options) you can either:
divide values by two on reading
or
read (convert) into a format which wouldn't overflow, then normalize and convert the merged data back into the original format or whichever format desired.
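A minimal sketch of the first option combined with simple averaging (assuming both buffers hold unsigned 8-bit samples, where 128 represents silence; padding the shorter file with silence is just one possible policy):
#include <stddef.h>

/* Mix two unsigned 8-bit sample buffers into out.
   Averaging keeps each result within 0..255, so it cannot overflow. */
static void mix_u8(const unsigned char *a, size_t na,
                   const unsigned char *b, size_t nb,
                   unsigned char *out)
{
    size_t n = (na > nb) ? na : nb;      /* out must have room for n samples */
    for (size_t i = 0; i < n; i++) {
        unsigned int va = (i < na) ? a[i] : 128;   /* 128 = silence */
        unsigned int vb = (i < nb) ? b[i] : 128;
        out[i] = (unsigned char)((va + vb) / 2);
    }
}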
For all particulars of reading from and writing to a .wav format file you may use some of the existing audio file libraries, or write your own routine. Dealing with audio file format is not a trivial thing, though. Here's a reference on .wav format.
Here are a few audio file APIs worth looking at:
libsndfile
sndlib
Hope this can help.
See any good guide to WAVE for information on the format of samples in the data chunk, such as this one I found: http://www.neurophys.wisc.edu/auditory/riff-format.txt
Relevant excerpts:
In a single-channel WAVE file, samples are stored consecutively. For
stereo WAVE files, channel 0 represents the left channel, and channel
1 represents the right channel. The speaker position mapping for more
than two channels is currently undefined. In multiple-channel WAVE
files, samples are interleaved.
Data Format of the Samples
Each sample is contained in an integer i. The size of i is the
smallest number of bytes required to contain the specified sample
size. The least significant byte is stored first. The bits that
represent the sample amplitude are stored in the most significant bits
of i, and the remaining bits are set to zero.
For example, if the sample size (recorded in nBitsPerSample) is 12
bits, then each sample is stored in a two-byte integer. The least
significant four bits of the first (least significant) byte is set to
zero.
The data format and maximum and minimums values for PCM waveform
samples of various sizes are as follows:
Sample Size          Data Format        Maximum Value                  Minimum Value
One to eight bits    Unsigned integer   255 (0xFF)                     0
Nine or more bits    Signed integer i   Largest positive value of i    Most negative value of i
N.B.: Even if the file has >8 bits of audio resolution, you should read the file as an array of unsigned char and reconstitute the larger samples manually as per the above spec. Don't try to do anything like reading the samples directly over an array of native C ints, as their layout and size are platform-dependent and therefore should not be relied upon in any code.
Note also that the header is not guaranteed to be 44 bytes long: How can I detect whether a WAV file has a 44 or 46-byte header? You need to read the length and process the header based on that, not any assumption.
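For instance, if the file turns out to contain 16-bit samples, a minimal sketch of reconstituting them from the raw bytes (little-endian per the excerpt above, signed because the sample size is more than eight bits):
#include <stdint.h>
#include <stddef.h>

/* bytes[] holds the raw data chunk read as unsigned char;
   every two bytes form one signed little-endian sample. */
static void decode_s16le(const unsigned char *bytes, size_t n_bytes,
                         int16_t *samples)
{
    for (size_t i = 0; i + 1 < n_bytes; i += 2) {
        uint16_t raw = (uint16_t)(bytes[i] | ((unsigned)bytes[i + 1] << 8));
        samples[i / 2] = (int16_t)raw;   /* reinterpret as signed */
    }
}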

Best way to receive integer array on c socket

I need to receive a nested integer array on a socket, e.g.
[[1,2,3],[4,5,6],...]
The subarrays are always 3 values long; the length of the main array varies, but is known in advance.
Searching Google has given me a lot of options, from sending each integer separately to just casting the buffer to what I think it should be (seems kind of unsafe to me), so I am looking for a safe and fast way to do this.
The "subarrays" don't matter, in the end you're going to be transmitting 3 n numbers and have the receiver interpret them as n rows of 3 numbers each.
For any external representation, you're going to have to pick a precision, i.e. how many bits to use for each integer. The size of type int is not fixed by the standard, so perhaps pick 32 bits and treat each number as an int32_t.
As soon as an external integer representation has multiple bytes, you're going to have to worry about the order of those bytes. Traditionally network byte ordering ("big endian") is used, but many systems today observe that most hardware is little-endian so they use that. In that case you can write the entire source array into the socket in one go (assuming of course you use a TCP/IP socket), perhaps prepended by either the number of rows or the total number of integers.
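A rough sketch of the traditional (network byte order) variant, assuming a connected TCP socket in fd and int32_t values; a real implementation would also have to handle short writes:
#include <stdint.h>
#include <sys/types.h>
#include <sys/socket.h>  /* send()  */
#include <arpa/inet.h>   /* htonl() */

/* Send n_rows rows of 3 values each, preceded by the row count.
   Returns 0 on success, -1 on a send error. */
static int send_rows(int fd, const int32_t rows[][3], uint32_t n_rows)
{
    uint32_t count = htonl(n_rows);
    if (send(fd, &count, sizeof count, 0) != (ssize_t)sizeof count)
        return -1;

    for (uint32_t r = 0; r < n_rows; r++) {
        uint32_t wire[3];
        for (int c = 0; c < 3; c++)
            wire[c] = htonl((uint32_t)rows[r][c]);   /* network byte order */
        if (send(fd, wire, sizeof wire, 0) != (ssize_t)sizeof wire)
            return -1;
    }
    return 0;
}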
Assuming that bandwidth and data size aren't very critical, I would propose that (de-)serializing the array to a string is a safe and platform/architecture-independent way to transfer such an array. This has the following advantages:
No issues with different sizes of the binary representations of integers between the communicating hosts
No issues with differing endiannesses
More flexible if the parameters change (length of the subarrays, etc)
It is easier to debug a text-protocol in contrast to a binary protocol
The drawback is that more bytes have to be transmitted over the channel than would be minimally necessary with a good binary encoding.
If you want to go with a ready-to-use library for serializing/deserializing your array, you could take a look at one of the many JSON-libraries available.
http://www.json.org/ provides a list with several implementations.
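If you'd rather roll a small text format by hand than pull in a JSON library, here is a minimal sketch (the "1,2,3;4,5,6;" encoding is made up purely for illustration; the receiver can parse it back with sscanf or strtol):
#include <stdio.h>
#include <stddef.h>

/* Encode rows of 3 ints as text, e.g. "1,2,3;4,5,6;".
   Returns 0 on success, -1 if the output buffer is too small. */
static int encode_rows(char *out, size_t out_size,
                       const int rows[][3], size_t n_rows)
{
    size_t used = 0;
    for (size_t r = 0; r < n_rows; r++) {
        int n = snprintf(out + used, out_size - used, "%d,%d,%d;",
                         rows[r][0], rows[r][1], rows[r][2]);
        if (n < 0 || (size_t)n >= out_size - used)
            return -1;
        used += (size_t)n;
    }
    return 0;
}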
Serialize it the way you want; two main possibilities:
encode as strings, with fixed separators, etc.
encode in network byte order (NBO), and send the data with a few fixed parameters: first the size of your ints, then the length of the array, and then the data, everything properly encoded.
In C, you can use the XDR routines to encode your data properly.
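A rough sketch of the XDR approach (the <rpc/xdr.h> routines come from Sun RPC; on modern Linux you may need the libtirpc package for them):
#include <rpc/xdr.h>

/* Encode a length-prefixed int array into buf with XDR.
   Returns the number of bytes used, or 0 on error. */
static unsigned int encode_ints(char *buf, unsigned int buf_size,
                                const int *values, unsigned int count)
{
    XDR xdrs;
    xdrmem_create(&xdrs, buf, buf_size, XDR_ENCODE);

    if (!xdr_u_int(&xdrs, &count))           /* array length first */
        return 0;
    for (unsigned int i = 0; i < count; i++) {
        int v = values[i];
        if (!xdr_int(&xdrs, &v))             /* each int as 4 big-endian bytes */
            return 0;
    }
    return xdr_getpos(&xdrs);
}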

sprintf or itoa or memcpy for IPC

A process, say PA, wants to send the values of 2 integers to PB by putting them in a char buf and sending that. Assume PA and PB are on the same machine. PB knows that the buffer it reads contains the values of 2 integers.
uint x=1;
uint y=65534;
Case 1
PA writes into char buf as shown
sprintf(buff,"%d%d",x,y);
Q1 - In this case, how will PB be able to extract the values as 1 and 65534, since it just has an array containing 1,6,5,5,3,4? Is using sprintf the problem?
Case 2
PA uses the itoa function to write the values of the integers into the buffer.
PB uses atoi to extract the values from the buffer.
Since itoa puts a null terminator after each value this should be possible.
Q2 - Now consider PA is running on a 32 bit machine with 4 byte int size and PB is running on a 16 bit machine with 2 byte int size. Will only checking for out of range make my code portable?
Q3 - Is memcpy another way of doing this?
Q4 - How is this USUALLY done ?
1) The receiver will read the string values from the network and do its own conversion; in this case it would get the string representation of 165,534. You need some way of delimiting the values for the receiver.
2) Checking for out of range is a good start, but portability depends on more factors, such as defining a format for the transfer, be it binary or textual.
3) Wha?
4) It's usually done by deciding on a standard for binary representation of the number, i.e., is it a signed/unsigned 16/32/64 bit value, and then converting it into what's commonly referred to as network byte order[1] on the sending side, and converting it to host byte order on the receiving side.
[1] http://en.wikipedia.org/wiki/Network_byte_order#Endianness_in_networking
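With the values from the question, a minimal sketch of that approach (fixed-width 16-bit fields in network byte order; both values fit because they are below 65536):
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

/* PA: pack both values into a 4-byte buffer. */
static void pack_pair(unsigned char buf[4], uint16_t x, uint16_t y)
{
    uint16_t nx = htons(x), ny = htons(y);
    memcpy(buf,     &nx, sizeof nx);
    memcpy(buf + 2, &ny, sizeof ny);
}

/* PB: unpack them again. */
static void unpack_pair(const unsigned char buf[4], uint16_t *x, uint16_t *y)
{
    uint16_t nx, ny;
    memcpy(&nx, buf,     sizeof nx);
    memcpy(&ny, buf + 2, sizeof ny);
    *x = ntohs(nx);
    *y = ntohs(ny);
}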
I would suggest that you have a look into
As you noticed in Case 1, there is no way to extract the values from the buffer if you don't have additional information, so you need some delimiter character.
In Q2 you mention a 16-bit machine. Not only can the number of bytes for an int be a problem, but also the endianness and the sign.
What I would do:
- Define your own protocol for the different number sizes (you can't send a 4-byte int to the 16-bit machine and use the same type without losing information)
Or
- Check that the int fits in 2 bytes before writing.
I hope this helps.
Q1: sprintf itself is not the problem, but the way of using it is. How about (a full round trip is sketched after Q4 below):
sprintf(buff,"%d:%d",x,y);
(Note: A comma as separator could cause problems with international number formats)
Q2: No. Other problems, e.g. regarding endianness, could arise
Q3: Not if you use different machines. On a single machine, you can (mis)use your buffer as an array of bytes.
Q4: Different ways, e.g. XDR (http://en.wikipedia.org/wiki/External_Data_Representation)
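Here is the round trip for Q1 as a small sketch; %u is used because the values are unsigned:
#include <stdio.h>

int main(void)
{
    unsigned int x = 1, y = 65534;
    char buff[32];

    sprintf(buff, "%u:%u", x, y);               /* PA writes "1:65534" */

    unsigned int rx, ry;
    if (sscanf(buff, "%u:%u", &rx, &ry) == 2)   /* PB parses it back   */
        printf("got %u and %u\n", rx, ry);
    return 0;
}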
You need a protocol and a transport mechanism.
Transport mechanisms include sockets, named pipes, shared memory, SSL etc.
The protocol could be as simple as space separated strings, as you suggested. It could also be something more "complicated" like an XML-based format. Or binary format.
All these protocol types are in use in various applications. Which protocol to choose depends on your requirements.

What is meant by Octet String? What's the difference between Octet and Char?

What is the difference between octet string and char? How can an octet string be used? Can anybody write a small C program on Octet string? How are octet strings stored in memory?
Standards (and such) use "octet" to explicitly state that they're talking about 8-bit groups. While most current computers work with bytes that are also 8 bits in size, that's not necessarily the case. In fact, "byte" is rather poorly defined, with considerable disagreement over what it means for sure -- so it's generally avoided when precision is needed.
Nonetheless, on a typical computer, an octet is going to be the same thing as a byte, and an octet stream will be stored in a series of bytes.
An octet is another word for an 8-bit byte.
A char is usually 8 bits, but may be another size on some architectures.
An octet is 8 bits meant to be handled together (hence the "oct" in "octet"). It's what we think of when we say "byte" these days.
A char is basically a byte -- it's defined as the smallest addressable unit of memory, which on almost all modern computers is the same as an octet. But there have been computers with 9-bit, 16-bit, even 36-bit "words" that qualify as chars by that definition. You only need to care about those computers (and thus, about the difference between a char and an octet) if you have one -- let the people who have the weird hardware worry about how to make their programs run on it.
An octet string is simply a sequence of bits grouped into chunks of 8. Those 8-sized groups often represent characters. Octet string is a basic data type used for SNMP.
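A small illustration (a minimal sketch; the key point is that an octet string is just bytes plus a length, so unlike a C string it is not null-terminated and may contain embedded zero bytes):
#include <stdio.h>
#include <stddef.h>

/* An octet string: a counted sequence of 8-bit bytes. */
struct octet_string {
    unsigned char *data;
    size_t length;
};

static void print_hex(const struct octet_string *s)
{
    for (size_t i = 0; i < s->length; i++)
        printf("%02X ", s->data[i]);
    printf("\n");
}

int main(void)
{
    unsigned char raw[] = { 0x00, 0x7F, 0xFF, 0x41 };  /* note the embedded zero byte */
    struct octet_string s = { raw, sizeof raw };
    print_hex(&s);   /* prints: 00 7F FF 41 */
    return 0;
}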
A string used to be a set of octets, each of which in turn is a group of 8 bits.
A string in C is always a null-terminated, memory-contiguous set of bytes.
Back in the day, each byte, an octet, represented a character. That's why they named the type used to make strings, char.
The ASCII table, which goes from 0 to 127 (with the graphics/accents versions going from 0 to 255), was no longer enough for displaying characters in a string, so someone thought of adding bits to the character representation. Some people from CS suggested 9 bits and so forth, to which the hardware guys replied "are you nuts??? keep it a multiple of the memory addressing unit", which was the byte back then.
Enter wide-character strings, i.e. 16 bits per character.
In a wide-character string, each character is represented by 2 bytes... there goes your one-character-equals-one-byte rule down the drain.
To keep an exact description of a string, if it's a set of characters represented by 8 bits (on Earth, following the ASCII table, but I've been to Mars), it's an "octet string".
If it's not an "octet string" it may or may not be a wide-character string... Joel wrote a nice post on this.
