Storing in unsigned char array - C

I have a question about unsigned char arrays.
How can I store integers in the array one after another, bit by bit?
For example, I need to store 01011 in the array first. Then I need to store 101. How can I end up with 01011101 in the array?
Thanks for your help!

Store 01011 first. You'll get the value 00001011. Then, when you want to store three more bits, shift left by three positions (you'll get 01011000) and OR the result with 00000101; you'll get 01011101. However, to do it this way you have to know for certain that only five bits were filled after the first assignment.
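The shift-and-OR step described above can be sketched roughly as follows. The helper name append_bits and the separate counter of used bits are assumptions made for illustration, and the sketch only handles a single byte:

```c
#include <assert.h>

/* Hypothetical helper: appends the low `nbits` bits of `value` to the right
 * of the bits already held in `acc`. The caller tracks how many bits are in
 * use through `*used`. */
unsigned char append_bits(unsigned char acc, unsigned *used,
                          unsigned value, unsigned nbits)
{
    assert(nbits >= 1 && *used + nbits <= 8); /* sketch: one byte only */
    unsigned mask = (1u << nbits) - 1u;       /* nbits = 3 gives 00000111 */
    *used += nbits;
    return (unsigned char)((acc << nbits) | (value & mask));
}
```

Starting from an empty byte with a counter of zero, appending 01011 with nbits = 5 and then 101 with nbits = 3 leaves 01011101 in the byte. Spilling over into the next array element is not covered here.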

Obviously, you'll need to resize the array as it grows. Dynamic memory allocation/reallocation is probably the way to go. Pay attention to choosing the right reallocation strategy.
Apart from that, you may want to look at C++ STL containers if you are not limited only to C.

You should write an abstract data type called bitstream for this purpose. It could have the following interface:
This is file bitstream.h:
#ifndef BITSTREAM_H
#define BITSTREAM_H
typedef struct bitstream_impl bitstream;
struct bitstream_impl;
/**
* Creates a bit stream and allocates memory for it. Later, that memory
* must be freed by calling bitstream_free().
*/
extern bitstream *bitstream_new(void);
/**
* Writes the lower 'bits' bits from the 'value' into the stream, starting with
* the least significant bit ("little endian").
*
* Returns nonzero if the writing was successful, and zero if the writing failed.
*/
extern int bitstream_writebits_le(bitstream *bs, unsigned int value, unsigned int bits);
/**
* Writes the lower 'bits' bits from the 'value' into the stream, starting with
* the most significant bit ("big endian").
*
* Returns nonzero if the writing was successful, and zero if the writing failed.
*/
extern int bitstream_writebits_be(bitstream *bs, unsigned int value, unsigned int bits);
/**
* Returns a pointer to the buffer of the bitstream.
*
* The returned pointer remains valid until the next time that one of the
* bitstream_write* functions is called. The returned buffer must not be
* modified. All bits in the buffer that have not yet been written are zero.
* (This applies only to the last byte of the buffer.) Each byte of the buffer
* contains at most 8 bits of data, even if CHAR_BIT is larger.
*/
extern unsigned char *bitstream_getbuffer(const bitstream *bs);
/**
* Returns the number of bits that have been written to the stream so far.
*/
extern unsigned int bitstream_getsize(const bitstream *bs);
/**
* Frees all the memory that is associated with this bitstream.
*/
extern void bitstream_free(bitstream *bs);
#endif
Then you need to write an implementation for this interface. It should be in a file called bitstream.c. This is left as an exercise.
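For readers who want a starting point, here is one possible sketch of such an implementation. It is not the answer's own code: the struct fields, the put_bit helper, the starting capacity and the doubling strategy are all assumptions, and bitstream_writebits_le is omitted. The typedef is repeated only so that the sketch stands alone.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* In a real bitstream.c this typedef would come from #include "bitstream.h". */
typedef struct bitstream_impl bitstream;

struct bitstream_impl {
    unsigned char *buf;  /* zero-initialised storage, 8 data bits per byte */
    size_t cap;          /* bytes allocated */
    unsigned int nbits;  /* bits written so far */
};

bitstream *bitstream_new(void)
{
    bitstream *bs = malloc(sizeof *bs);
    if (bs == NULL)
        return NULL;
    bs->cap = 16;                      /* arbitrary starting capacity */
    bs->buf = calloc(bs->cap, 1);
    bs->nbits = 0;
    if (bs->buf == NULL) {
        free(bs);
        return NULL;
    }
    return bs;
}

/* Hypothetical helper: appends one bit, doubling the buffer when it is full. */
static int put_bit(bitstream *bs, unsigned int bit)
{
    size_t byte = bs->nbits / 8;
    if (byte >= bs->cap) {
        unsigned char *p = realloc(bs->buf, bs->cap * 2);
        if (p == NULL)
            return 0;
        for (size_t i = bs->cap; i < bs->cap * 2; i++)
            p[i] = 0;                  /* keep unwritten bits zero */
        bs->buf = p;
        bs->cap *= 2;
    }
    if (bit & 1u)
        bs->buf[byte] |= (unsigned char)(0x80u >> (bs->nbits % 8));
    bs->nbits++;
    return 1;
}

int bitstream_writebits_be(bitstream *bs, unsigned int value, unsigned int bits)
{
    assert(bits <= sizeof value * CHAR_BIT);
    while (bits-- > 0)                 /* most significant requested bit first */
        if (!put_bit(bs, (value >> bits) & 1u))
            return 0;                  /* earlier bits stay written on failure */
    return 1;
}

unsigned char *bitstream_getbuffer(const bitstream *bs) { return bs->buf; }
unsigned int bitstream_getsize(const bitstream *bs) { return bs->nbits; }

void bitstream_free(bitstream *bs)
{
    if (bs != NULL) {
        free(bs->buf);
        free(bs);
    }
}
```

Writing one bit at a time keeps the sketch short at the cost of speed; a production version would more likely merge whole groups of bits per byte.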
Using the bitstream should then be pretty simple:
#include <stdio.h>  /* perror */
#include <stdlib.h> /* exit, EXIT_FAILURE */
#include "bitstream.h"
static void
die(const char *msg) {
perror(msg);
exit(EXIT_FAILURE);
}
int main(void)
{
bitstream *bs;
unsigned char *buf;
bs = bitstream_new();
if (bs == NULL)
die("bitstream_new");
if (!bitstream_writebits_be(bs, 0x000b, 5))
die("write 5 bits");
if (!bitstream_writebits_be(bs, 0x0005, 3))
die("write 3 bits");
if (bitstream_getsize(bs) != 8)
die("FAIL: didn't write exactly 8 bits.");
buf = bitstream_getbuffer(bs);
if (buf[0] != 0x005dU)
die("FAIL: didn't write the expected bits.");
bitstream_free(bs);
return 0;
}

Related

Passing a struct pointer to a function and struct padding in C programming

In build_uart_frame() , I call calcFCS() which calculates an XOR of all the bytes in the struct members(len, cmd0, cmd1 and data).
I do not think the struct is padded, so will calling calcFCS() be an issue? Could somebody explain what the issue is in relation to struct padding, as I don't understand its role here, and secondly how I can do this operation correctly?
Thank you
typedef struct uart_frame {
uint8_t sof; /* 1 byte */
uint8_t len; /* 1 byte */
uint8_t cmd0; /* 1 byte */
uint8_t cmd1;
char data[11]; /* 0 -250 byte */
unsigned char fcs; /* 1 byte */
} uart_frame_t;
//-------------------------------------------------------------------------
// Global uart frame
uart_frame_t rdata;
//-------------------------------------------------------------------------
unsigned char calcFCS(unsigned char *pMsg, unsigned char len) {
unsigned char result = 0;
while(len--) {
result ^= *pMsg++;
}
return(result);
}
//-------------------------------------------------------------------------
// Worker code to populate the frame
int build_uart_frame() {
uart_frame_t *rd = &rdata; //pointer variable 'rd' of type uart_frame
// common header codes
rd->sof = 0xFE;
rd->len = 11;
rd->cmd0 = 0x22;
rd->cmd1 = 0x05;
snprintf(rd->data, sizeof(rd->data), "%s", "Hello World");
rd->fcs = calcFCS((unsigned char *)rd, sizeof(uart_frame_t) - 1); //issue with struct padding
return 0;
}
Given your very specific example, it is unlikely that padding will be an issue, since all data types are bytes. Padding is mostly an issue when you use larger data types, because those should typically not be allocated at misaligned addresses.
Yet that is no guarantee: the compiler could in theory decide to pad a char out to the size of an int if it thinks that will produce faster code. It is free to insert any amount of padding anywhere in a struct, except before the first member.
This is why structs are unsuitable to describe memory maps or data protocols. You will have to ensure that no padding is present and preferably do so portably. The best way to ensure this is a standard C compile-time assert:
_Static_assert(sizeof(uart_frame_t) == offsetof(uart_frame_t, fcs)+sizeof(unsigned char),
"Padding detected");
Here the size of the whole struct is checked against the byte position of the last struct member + the size of that member. If they are the same, there was no padding.
Of course, this only stops the code from compiling where it would otherwise misbehave; it doesn't solve the actual problem. Unfortunately there is no portable way to block padding. #pragma pack(1) is common but non-standard. __attribute__((packed)) is another compiler-specific way to do this.
Ensuring that no padding is present on the given system where the code is compiled is usually enough.
Also, some of the more exotic systems (MIPS, SPARC etc) don't even support misaligned reads, meaning that misaligned access will not just mean slower code, but a run-time bus error crash.
The only way to safely ensure maximum portability of code using structs is to write serialize/de-serialize routines that manually copy every member to/from a raw byte array:
void uart_serialize (const uart_frame_t* frame, uint8_t* raw)
{
raw[0] = frame->sof;
raw[1] = frame->len;
...
memcpy(&raw[4], frame->data, 11);
...
}
The downside of such methods is that they obviously add some execution time, so I would only use them for code that I know needs to be ported to all kinds of different systems.
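A fuller version of that serialize routine might look like the sketch below. The constant names, the calc_fcs helper and the choice to append the FCS to the raw buffer are assumptions; the FCS covers len, cmd0, cmd1 and data, as the question describes it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { UART_DATA_LEN = 11, UART_RAW_LEN = 4 + UART_DATA_LEN + 1 };

typedef struct uart_frame {
    uint8_t sof;
    uint8_t len;
    uint8_t cmd0;
    uint8_t cmd1;
    char    data[UART_DATA_LEN];
    uint8_t fcs;
} uart_frame_t;

/* XOR of `len` bytes starting at `msg`, like calcFCS() in the question. */
uint8_t calc_fcs(const uint8_t *msg, size_t len)
{
    uint8_t result = 0;
    while (len--)
        result ^= *msg++;
    return result;
}

/* Copies every member to a fixed offset in `raw`, then appends the FCS
 * computed over raw[1] .. raw[14] (len, cmd0, cmd1 and data). Because the
 * checksum is taken over the raw bytes, struct padding cannot affect it. */
void uart_serialize(const uart_frame_t *frame, uint8_t raw[UART_RAW_LEN])
{
    assert(frame != NULL && raw != NULL);
    raw[0] = frame->sof;
    raw[1] = frame->len;
    raw[2] = frame->cmd0;
    raw[3] = frame->cmd1;
    memcpy(&raw[4], frame->data, UART_DATA_LEN);
    raw[4 + UART_DATA_LEN] = calc_fcs(&raw[1], 3 + UART_DATA_LEN);
}
```

The raw buffer, not the struct, is then what gets handed to the UART driver.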

Continuous memory allocation with different data type in C?

I'm trying to compose a string (a char array, to be exact) containing 14 fixed starting characters and ending with varying content. The varying part contains 2 floats and one 32-bit integer, each of which is to be treated as 4 one-byte characters in the array, separated by commas. It can be illustrated by the following piece of code, which doesn't compile for some obvious reasons (a char * can't be assigned to a float *). So, what can I do to get around it?
char *const comStr = "AT+UCAST:0000=0760,0020,0001\r"; // command string
float *pressure;
float *temperature;
uint32_t *timeStamp;
pressure = comStr + 14; // pressure in the address following the '=' in command string
temperature = comStr + 18; // temperature in the address following the 1st ',' in command string
timeStamp = comStr + 22; // time stamp in the address following the 2nd ',' in command string
I have an unclear memory about something like struct and union in the C language which reserves strictly the memory allocation order in which the variables are defined within the "structure". Maybe something like this:
typedef struct
{
char[14] command;
float *pressure;
char comma1;
float *temperature;
char comma2;
uint32_t *time_stamp;
char CR;
}comStr;
Does this structure guarantee that comStr-> command[15] gives me the first/last byte (depending on the endianness) of *pressure? Or is there some other special construct that does the trick and that I'm not aware of?
(Note: comStr-> command[15] isn't going to be evaluated in future code, so exceeding index boundary is not a concern here. The only important thing here is just whether the memory is allocated continuously so that a hardware fetch lasting for 29 bytes starting from the memory address (comStr-> command) gives me exactly the string I want).
p.s. As I am writing this, I came up with an idea. Can I possibly just use memcpy() for the purpose ;) memcpy has parameters of void* type, hopefully it works! I am going to try it now! All hail stackOverflow anyway!
EDIT: I should have made myself clearer, sorry for any confusion! The character array I want to construct is to be sent through UART byte by byte. To do this, a DMA system is used to transfer the array to the transmit buffer byte by byte automatically, once it is given the character array's starting memory address and length. So the character array must be stored contiguously in memory. I hope this makes the question clearer.
This proposed structure:
typedef struct
{
char[14] command;
float *pressure;
char comma;
float *temperature;
char comma;
uint32_t *time_stamp;
char CR;
}comStr;
Is not going to help you with your requirement:
The only important thing here is just whether the memory is allocated continuously so that a hardware fetch lasting for 29 bytes starting from the memory address (comStr->command) gives me exactly the string I want.
Note you can't have two members with the same name; you'd need to use comma1 and comma2 for example. Also, the array dimension is in the wrong place.
One problem is that there will be padding bytes within the structure.
Another problem is that the pointers will be holding addresses of something outside the structure (since there is nothing valid inside the structure for them to point at).
It is not clear what you're after. Only a very limited range of floating point values can be represented by 4 bytes in a string. If you're after binary data I/O, then you can drop the pointers and the commas:
typedef struct
{
char command[14];
float pressure;
float temperature;
uint32_t time_stamp;
}comStr;
If you want the commas present, then you're going to have to work harder:
typedef struct
{
char command[14];
char pressure[4];
char comma1;
char temperature[4];
char comma2;
char time_stamp[4];
char CR;
} comStr;
You will have to load the data carefully:
comStr com; /* comStr is a typedef name, not a struct tag */
float pressure = ...;
float temperature = ...;
uint32_t time_stamp = ...;
assert(sizeof(float) == 4);
...
memmove(&com.pressure, &pressure, sizeof(pressure));
memmove(&com.temperature, &temperature, sizeof(temperature));
memmove(&com.time_stamp, &time_stamp, sizeof(time_stamp));
You have to unpack with a similar set of memory copies. Note that you won't be able to use simple string manipulation on the structure; there could be zero bytes in any or all of the pressure, temperature and time_stamp sections of the structure.
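As a rough sketch of the same loading step done directly on a byte buffer, which sidesteps struct padding altogether: the function name pack_frame and the FRAME_LEN constant are made up for this example, and the numeric fields end up in the host's byte order and float representation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { FRAME_LEN = 29 }; /* 14 header + 4 + ',' + 4 + ',' + 4 + '\r' */

/* Hypothetical helper: fills `out` with the 14-character header followed by
 * the raw bytes of the three values, separated by commas and ended by CR. */
void pack_frame(unsigned char out[FRAME_LEN], const char *header,
                float pressure, float temperature, uint32_t time_stamp)
{
    assert(sizeof(float) == 4 && strlen(header) == 14);
    memcpy(out, header, 14);
    memcpy(out + 14, &pressure, 4);
    out[18] = ',';
    memcpy(out + 19, &temperature, 4);
    out[23] = ',';
    memcpy(out + 24, &time_stamp, 4);
    out[28] = '\r';
}
```

The resulting 29 bytes are contiguous, so the buffer's address and FRAME_LEN could be handed to a DMA transfer. The receiver has to agree on byte order and float format for the values to survive.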
Structure padding
#include <stddef.h>
#include <stdio.h>
#include <stdint.h>
typedef struct
{
char command[14];
float *pressure;
char comma1;
float *temperature;
char comma2;
uint32_t *time_stamp;
char CR;
} comStr;
int main(void)
{
static const struct
{
char *name;
size_t offset;
} offsets[] =
{
{ "command", offsetof(comStr, command) },
{ "pressure", offsetof(comStr, pressure) },
{ "comma1", offsetof(comStr, comma1) },
{ "temperature", offsetof(comStr, temperature) },
{ "comma2", offsetof(comStr, comma2) },
{ "time_stamp", offsetof(comStr, time_stamp) },
{ "CR", offsetof(comStr, CR) },
};
enum { NUM_OFFSETS = sizeof(offsets)/sizeof(offsets[0]) };
printf("Size of comStr = %zu\n", sizeof(comStr));
for (int i = 0; i < NUM_OFFSETS; i++)
printf("%-12s %2zu\n", offsets[i].name, offsets[i].offset);
return 0;
}
Output on Mac OS X:
Size of comStr = 64
command 0
pressure 16
comma1 24
temperature 32
comma2 40
time_stamp 48
CR 56
Note how large the structure is on a 64-bit machine. Pointers are 8-bytes each and are 8-byte aligned.
There are various issues to cover in your question. I'll take a shot at some of them.
The order of members in a structure is guaranteed to be the same as order you have declared them. But there is a different issue here - padding.
Check this - http://c-faq.com/struct/padding.html - and follow the other links/questions there.
Next thing is that you are mistaken in thinking that something like "125" is an integer or something like "1.25" is a float - it's not - it's a string. i.e.
char * p = "125";
p[0] will not contain 1. It will contain '1' - if the encoding is ASCII, then this will be 49. i.e. p[0] will contain 49 & not 1. p[1] will contain 50 & p[2] will contain 53. It will be something similar for float.
The opposite will also happen.
i.e. if you have a float at an address and you treat it as a char array - the char array will not contain the text of the float you think it will.
Try this program to see this
#include <stdio.h>
struct A
{
char c[4];
float * p;
int i;
};
int main()
{
float x = 1.25;
struct A a;
a.p = &x;
a.i = 0; // to make sure the 'presumed' string starting at p gets null terminated after the float
printf("%s\n", &a.c[4]);
}
For me, it prints "╪·↓". And this has nothing to do with endianness.
Another thing to remember when assigning values to your structure object: comStr.pressure & comStr.temperature are pointers. You cannot assign values to them directly. You need to either give them the address of an existing float or allocate memory dynamically for them to point to.
Also are you trying to create the char array or to parse the char array which already exists. If you are trying to create it, a better way to do this will be to use snprintf to do what you want. snprintf uses format specifiers similar to printf but prints to a char array. You can create your char array that way. A bigger question remains - what do you plan to do with this char array you create - that will determine if endianness is relevant for you.
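If the goal is the text form shown in the question ("AT+UCAST:0000=0760,0020,0001\r"), the snprintf approach could look roughly like this. The function name format_command and the use of unsigned values with four-digit zero padding are assumptions read off that example string:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes the command as text into `out`. Returns the
 * number of characters written (not counting the terminating NUL), or -1
 * if the output was truncated or an encoding error occurred. */
int format_command(char *out, size_t out_size,
                   unsigned pressure, unsigned temperature, unsigned time_stamp)
{
    assert(out != NULL);
    int n = snprintf(out, out_size, "AT+UCAST:0000=%04u,%04u,%04u\r",
                     pressure, temperature, time_stamp);
    return (n < 0 || (size_t)n >= out_size) ? -1 : n;
}
```

With this representation every byte is a printable character (plus the final CR), so endianness does not come into it.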
If you are trying to read from the char array you have been given and trying to split into floats and commas and whatever, then one way to do this will be sscanf but may be difficult for your particular string format.
At last, I found an easy way round but I don't know if there is any drawback for this method. I did:
char commandStr[27];
char *commandHeader = "AT+UCAST:0000=";
float pressure = 760.0;
float temperature = 20.0;
uint32_t timeStamp = 0;
memcpy(commandStr, commandHeader, 14);
commandStr[26] = '\r';
memcpy((void*)(commandStr+14), (void*)(&pressure), 4);
memcpy((void*)(commandStr+18), (void*)(&temperature), 4);
memcpy((void*)(commandStr+22), (void*)(&timeStamp), 4);
Does this code have any security issues or performance issues or whatever?

Odd behaviour using flexible array member

I tried to replace a void* member of a struct with a flexible array member using the more accepted idiom:
typedef struct Entry {
int counter;
//void* block2; // This used to be what I had
unsigned char block[1];
} Entry;
I then add entries into a continuous memory block:
void *memPtr = mmap(NULL, someSize*1024, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
as such:
int number = 0;
int AddEntry(void *data) {
Entry *entry;
entry = malloc(sizeof(Entry) + (SECTOR_SIZE-1) * sizeof(unsigned char));
entry->counter = 1;
memcpy(entry->block, data, SECTOR_SIZE);
// make sure number doesn't overflow space, etc...
memcpy(&memPtr[number], entry, sizeof(Entry) + (SECTOR_SIZE-1) * sizeof(unsigned char));
number++;
return 0;
}
The problem is unpacking this data once I need it. For example, if I do:
void * returnBlock(int i) {
Entry * entry = &memPtr[i];
printf("Entry counter is %d\n", entry->counter); // returns 1, reliably
return entry->block; // Gives me gibberish but not if I uncomment void* block2.
}
Is there a reason this could be? I don't necessarily think I'm stomping on stuff anywhere, and it used to work with the void* approach. The weird thing is that if I put a dummy void* back into the struct, it works. It doesn't work if I put in a dummy int.
Edit: actually, it also fails if number in AddEntry is not 0. What am I stepping on, if anything?
Your problem is here:
&memPtr[number]
Since memPtr is a void * pointer, this isn't actually allowed in C at all. Some compilers do allow arithmetic on void * pointers as a language extension - however they treat it as if it were a char * pointer.
That means that &memPtr[number] is likely indexing only number bytes into your memory block - so the second Entry structure copied in will overlap the first one, and so on.
Your allocation line appears to be assuming 1024 bytes per Entry (if someSize is a number of Entry structures), so you probably want something like:
((char *)memPtr + number * 1024)
(and similar in the returnBlock() function).
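A small sketch of that address calculation, wrapped in a helper so that AddEntry() and returnBlock() would compute it the same way. The name slot_at and its parameters are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the address of slot `index` in a block that
 * holds fixed-size slots of `slot_size` bytes each. Casting to char * makes
 * the byte arithmetic valid standard C. */
void *slot_at(void *base, size_t index, size_t slot_size)
{
    assert(base != NULL && slot_size > 0);
    return (char *)base + index * slot_size;
}
```

With this, slot_at(memPtr, number, 1024) would take the place of &memPtr[number], assuming 1024 bytes per entry as above.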
However, if you do this you will notice that there is no point in using the flexible array member - because you're creating a contiguous array of these structures, and don't have a separate index, you have to assume each one is a fixed size. This means that you might as well make each one a fixed size:
typedef struct Entry {
int counter;
unsigned char block[1024 - sizeof(int)]; /* a member name cannot be used with sizeof here */
} Entry;

Read binary data (from file) into a struct

I'm reading binary data from a file, specifically from a zip file. (To know more about the zip format structure see http://en.wikipedia.org/wiki/ZIP_%28file_format%29)
I've created a struct that stores the data:
typedef struct {
/*Start Size Description */
int signature; /* 0 4 Local file header signature = 0x04034b50 */
short int version; /*  4 2 Version needed to extract (minimum) */
short int bit_flag; /*  6 2 General purpose bit flag */
short int compression_method; /*  8 2 Compression method */
short int time; /* 10 2 File last modification time */
short int date; /* 12 2 File last modification date */
int crc; /* 14 4 CRC-32 */
int compressed_size; /* 18 4 Compressed size */
int uncompressed_size; /* 22 4 Uncompressed size */
short int name_length; /* 26 2 File name length (n) */
short int extra_field_length; /* 28 2 Extra field length (m) */
char *name; /* 30 n File name */
char *extra_field; /*30+n m Extra field */
} ZIP_local_file_header;
The size returned by sizeof(ZIP_local_file_header) is 40, but if the sum of each field is calculated with sizeof operator the total size is 38.
If we have the next struct:
typedef struct {
short int x;
int y;
} FOO;
sizeof(FOO) returns 8 because each member is aligned to a 4-byte boundary here. So 4 bytes are reserved for x, although its real size is 2 bytes. If another short int followed, it would fill the remaining 2 bytes of that allocation. But since an int follows, it takes 4 more bytes and the 2 empty bytes are wasted.
To read data from file, we can use the function fread:
ZIP_local_file_header p;
fread(&p,sizeof(ZIP_local_file_header),1,file);
But as there're empty bytes in the middle, it isn't read correctly.
What can I do to sequentially and efficiently store data with ZIP_local_file_header wasting no bytes?
In order to meet the alignment requirements of the underlying platform, structs may have "padding" bytes between members so that each member starts at a properly aligned address.
There are several ways around this: one is to read each element of the header separately using the appropriately-sized member:
fread(&p.signature, sizeof p.signature, 1, file);
fread(&p.version, sizeof p.version, 1, file);
...
Another is to use bit fields in your struct definition; adjacent bit fields can be packed together without alignment padding, although their exact layout is implementation-defined, so check the result with sizeof on your compiler. The downside is that bit fields must be unsigned int or int or, as of C99, _Bool; you may have to cast the raw data to the target type to interpret it correctly:
typedef struct {
unsigned int signature : 32;
unsigned int version : 16;
unsigned int bit_flag : 16;
unsigned int compression_method : 16;
unsigned int time : 16;
unsigned int date : 16;
unsigned int crc : 32;
unsigned int compressed_size : 32;
unsigned int uncompressed_size : 32;
unsigned int name_length : 16;
unsigned int extra_field_length : 16;
} ZIP_local_file_header;
You may also have to do some byte-swapping in each member if the file's byte order differs from your system's; ZIP stores multi-byte values in little-endian order, so this applies on big-endian systems.
Note that name and extra field aren't part of the struct definition; when you read from the file, you're not going to be reading pointer values for the name and extra field, you're going to be reading the actual contents of the name and extra field. Since you don't know the sizes of those fields until you read the rest of the header, you should defer reading them until after you've read the structure above. Something like
ZIP_local_file_header p;
char *name = NULL;
char *extra = NULL;
...
fread(&p, sizeof p, 1, file);
if ((name = malloc(p.name_length + 1)) != NULL)
{
fread(name, p.name_length, 1, file);
name[p.name_length] = 0;
}
if ((extra = malloc(p.extra_field_length + 1)) != NULL)
{
fread(extra, p.extra_field_length, 1, file);
extra[p.extra_field_length] = 0;
}
C structs are just about grouping related pieces of data together, they do not specify a particular layout in memory. (Just as the width of an int isn't defined either.) Little-endian/Big-endian is also not defined, and depends on the processor.
Different compilers, the same compiler on different architectures or operating systems, etc., will all lay out structs differently.
As the file format you want to read is defined in terms of which bytes go where, a struct, although it looks very convenient and tempting, isn't the right solution. You need to treat the file as a char[] and pull out the bytes you need and shift them in order to make numbers composed of multiple bytes, etc.
The solution is compiler-specific, but for instance in GCC, you can force it to pack the structure more tightly by appending __attribute__((packed)) to the definition. See http://gcc.gnu.org/onlinedocs/gcc-3.2.3/gcc/Type-Attributes.html.
It's been a while since I worked with zip-compressed files, but I do remember the practice of adding my own padding to hit the 4-byte alignment rules of PowerPC arch.
At best you simply need to define each element of your struct to the size of the piece of data you want to read in. Don't just use 'int' as that may be platform/compiler defined to various sizes.
Do something like this in a header:
#include <stdint.h> /* exact-width types; unsigned long is 8 bytes on many 64-bit systems */
typedef uint32_t unsigned32;
typedef uint16_t unsigned16;
typedef uint8_t unsigned8;
typedef uint8_t byte;
Then instead of just int, use an unsigned32 where you have a known 4-byte value, and unsigned16 for any known 2-byte values.
This will help you see where you can add padding bytes to hit 4-byte alignment, or where you can group 2, 2-byte elements to make up a 4-byte alignment.
Ideally you can use a minimum of padding bytes (which can be used to add additional data later as your expand the program) or none at all if you can align everything to 4-byte boundaries with variable-length data at the end.
Also, the name and extra_field will not contain any meaningful data, most likely. At least not between runs of the program, since these are pointers.

How to store an integer value of 4 bytes in a memory of chunk which is malloced as type char

I have allocated a chunk of memory of type char and size is say 10 MB (i.e mem_size = 10 ):
int mem_size = 10;
char *start_ptr;
if((start_ptr= malloc(mem_size*1024*1024*sizeof(char)))==NULL) {return -1;}
Now I want to store the size information in the header of the memory chunk. To make myself more clear, let's say: start_ptr = 0xaf868004 (This is the value I got from my execution, it changes every time).
Now I want to put the size information in the start of this pointer, i.e *start_ptr = mem_size*1024*1024;.
But I am not able to put this information in the start_ptr. I think the reason is that my pointer is of type char, which only takes one byte, while I am trying to store an int, which takes 4 bytes.
I am not sure how to fix this problem..
You'll need to cast your char pointer to an int pointer. In two steps:
int *start_ptr_int = (int*)start_ptr;
*start_ptr_int = mem_size * 1024 * 1024;
In one step:
*((int*)start_ptr) = mem_size * 1024 * 1024;
The (int*) in front of your pointer name tells the compiler: "Yeah, I know this is not actually a pointer to int, but just pretend for the time being, okay?"
*((int*)start_ptr) = mem_size*1024*1024;
You could also just memcpy the value in ...
ie
int toCopy = mem_size * 1024 * 1024;
memcpy( start_ptr, &toCopy, sizeof toCopy );
You might be surprised: most compilers won't even emit the memcpy call and will just store the value.
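The memcpy approach, together with the matching read, could be wrapped up as in the sketch below. The helper names are hypothetical, and size_t is used for the stored value instead of int as an assumption, so that large sizes fit:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helpers: keep a size_t header in the first bytes of a char
 * buffer. memcpy works at any alignment and needs no pointer casts. */
void store_size(char *buf, size_t size)
{
    assert(buf != NULL);
    memcpy(buf, &size, sizeof size);
}

size_t load_size(const char *buf)
{
    size_t size;
    assert(buf != NULL);
    memcpy(&size, buf, sizeof size);
    return size;
}
```

The usable data then starts at start_ptr + sizeof(size_t).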
One way to do it without casts:
#include <stdlib.h>
struct Block {
size_t size;
char data[];
};
#define SIZE (1024*1024)
int main()
{
struct Block* block = malloc(sizeof(struct Block) + SIZE);
block->size = SIZE;
char* start_ptr = block->data;
// ...
}
Or, to get the effect you want, change one line:
char* start_ptr = (char*)block;
A comment on style: Don't do this:
if ((ptr=malloc()) == NULL)
There is nothing wrong with
ptr = malloc();
if (ptr == NULL) ...
Good programmers know what they could do with the language. Excellent programmers know why they shouldn't do it. ;)
And -1 to all posters who assume an int in C to always be 32 bits, including the OP in the thread title. An int is guaranteed to have at least 16 bits, and on 32 bit machines it is usually a safe assumption to have 32 bits, but your code may fail as soon as you move to a 64 bit machine.
