Related: flexible array member in a nested struct
I am trying to parse some data into a struct. The data contains information organized as follows:
struct unit {
struct unit_A {
// 28 bytes each
// dependency r6scA 1
char dr6scA1_tagclass[4];
uint32_t dr6scA1_tagnamepointer;
uint32_t dr6scA1_tagnamestringlength;
uint32_t dr6scA1_tagid;
// 12 bytes of 0x00
}A;
// A strings
struct unit_B {
// 48 bytes each
// dependency r6scB 1
char dr6scB1_tagclass[4];
uint32_t dr6scB1_tagnamepointer;
uint32_t dr6scB1_tagnamestringlength;
uint32_t dr6scB1_tagid;
// 32 bytes of 0x00
}B;
// B strings
// unit strings
}unit_container;
You can ignore the weird nomenclature.
My line comments // A strings, // B strings and // unit strings each mark a run of null-terminated C strings, the number of which matches however many unit_A, unit_B, and unit struct entries there are in the data. So if there are 5 entries of A in unit_container, then there would be 5 C strings in the location where it says // A strings.
Since I cannot use flexible array members at these locations, how should I interpret what are essentially an unknown number of variable-length C strings at these locations in the data?
For example, the data at these locations could be:
"The first entry is here.\0Second entry\0Another!\0Fourth.\0This 5th entry is the bestest entry evah by any reasonable standards.\0"
...which I expect I should interpret as:
char unit_A_strings[]
...but this is not possible. What are my options?
Thank you for your consideration.
EDIT:
I think the most attractive option so far is:
char** unit_A_strings; to point to an array of char strings.
If I do:
char unit_A_strings[1]; to define a char array of fixed size of 1 char, then I must abandon sizeof(unit) and such, or hassle with memory allocation sizes, even though it is most accurate to the kind of data present. The same situation occurs if I do char * unit_A_strings[1];.
Another question: What would be the difference between using char *unit_A_strings; and char** unit_A_strings;?
Conclusion:
The main problem is that structs are intended for fixed-size information and what I need is a variable-sized memory region. So I can't legitimately store the data in the struct, at least not as part of the struct itself. This means that any other interpretation would be alright, and it seems to me that char** is the best available option for this struct situation.
I think you can use char** instead (or write a small structure to wrap it).
For example, you can write a helper function to decode your stream:
/* decodeNumberOfCString() and calculateIthStringLength() are placeholders
   for however your format supplies the count and each string's length. */
char** decodeMyStream(const uint8_t* stream, unsigned int* numberOfCString)
{
    *numberOfCString = decodeNumberOfCString(stream);
    char** cstrings = malloc((*numberOfCString) * sizeof(char*));
    unsigned int start = 0;
    for (unsigned int i = 0; i < *numberOfCString; ++i)
    {
        /* len must count the terminating '\0' so that it is copied too */
        unsigned int len = calculateIthStringLength(stream, start);
        cstrings[i] = malloc(len);
        memcpy(cstrings[i], stream + start, len);
        start += len;
    }
    return cstrings;
}
This is just quick example code written without much thought; you can probably come up with a better algorithm.
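If the raw data stays in memory, the strings do not have to be copied at all. The following is a minimal sketch, assuming the caller already knows the start and length of the string region and the number of entries; the name index_packed_strings and its parameters are made up for illustration and are not part of the question's format.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Build an array of pointers into a region holding `count` back-to-back
 * NUL-terminated strings. Nothing is copied: each pointer refers into
 * `region`. Returns NULL if allocation fails or the region ends before
 * `count` strings were found. The caller frees the array, not the strings. */
const char **index_packed_strings(const char *region, size_t region_len,
                                  size_t count)
{
    const char **out = malloc((count ? count : 1) * sizeof *out);
    if (out == NULL)
        return NULL;

    size_t pos = 0;
    for (size_t i = 0; i < count; ++i) {
        /* Search for the terminator without reading past the region. */
        const char *nul = (pos < region_len)
                              ? memchr(region + pos, '\0', region_len - pos)
                              : NULL;
        if (nul == NULL) {              /* truncated or malformed input */
            free(out);
            return NULL;
        }
        out[i] = region + pos;
        pos = (size_t)(nul - region) + 1;   /* step past the NUL */
    }
    return out;
}
```

The returned pointers are only valid while the underlying buffer is, which is the trade-off against the copying version above.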
I think the closest you're going to get is by providing an array of strings:
char *AStrings[] = { "The first entry is here.",
"Second entry",
"Another!",
"Fourth.",
"This 5th entry is the bestest entry evah by any reasonable standards.",
NULL
};
Note two things:
1. AStrings is an array of pointers-to-strings - it will be 6 (see 2. below) consecutive pointers that point to the actual strings, NOT the 'compound' string you used in your example.
2. I ended AStrings with a NULL pointer, to resolve the "when do I finish?" question.
So you can "fall off the end" of A and start looking at locations as pointers - but be careful! The compiler may put in all sorts of padding between one variable and the next, mucking up any assumptions about where they are relative to each other in memory - including reordering them!
Edit
Oh! I just had a thought. Another data representation that may help is essentially what you did. I've 'prettied' it up a bit:
char AString[] = "The first entry is here.\0"
"Second entry\0"
"Another!\0"
"Fourth.\0"
"This 5th entry is the bestest entry evah by any reasonable standards.\0";
The C compiler will automatically concatenate two 'adjacent' strings as though they were one string - with no NUL character between them. I put them in specifically above.
The C compiler will automatically put a '\0' at the end of any string - at the semicolon (;) in the above example. That means that the string actually ends with two NUL characters, not one.
You can use that fact to keep track of where you are while parsing the string 'array' - assuming that every desired value has a (sub)string of more than zero length! As soon as you encounter a zero-length (sub)string, you know you've reached the end of the string 'array'.
I call this kind of string an ASCIIZZ string (a run of ASCIIZ strings with a second NUL at the very end).
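Walking such a list could look roughly like this. It is only a sketch under the assumption stated above (no entry is empty), and the function names count_asciizz and asciizz_entry are invented for the example.

```c
#include <stddef.h>
#include <string.h>

/* Count the entries in a double-NUL-terminated string list.
 * Assumes every entry is non-empty, so an empty string marks the end. */
size_t count_asciizz(const char *list)
{
    size_t n = 0;
    for (const char *p = list; *p != '\0'; p += strlen(p) + 1)
        ++n;
    return n;
}

/* Return entry number `index` (0-based), or NULL if the list is shorter. */
const char *asciizz_entry(const char *list, size_t index)
{
    const char *p = list;
    while (*p != '\0') {
        if (index == 0)
            return p;
        --index;
        p += strlen(p) + 1;     /* jump over this entry and its NUL */
    }
    return NULL;
}
```

Each step costs a strlen, so random access is linear in the number of entries; for frequent lookups an array of pointers is the better fit.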
Related
Initialize a global 1D array "StudentData" of type char with your ID (5 digits).
Using pointers syntax is mandatory in this part.
Can you give me some tips about doing that?
I tried doing this,
char *StudentData;
void loadData(){
StudentData=(char*)"60897";
}
Is it right or should I try doing something else?
You probably want something along the lines of:
#include <string.h> //This header has strcpy()
#define ENOUGH_DIGITS 5 //This is so you can easily modify ID length in the future
char StudentData[ENOUGH_DIGITS+1]; //Global array 1 bigger than the longest string
void loadData(){
//Ask the compiler to put a read-only string somewhere in the data memory
const char *myID = "60897";
//Copy the read-only string into the global array
strcpy(StudentData,myID);
}
While you are indeed using an array (the string literal actually is one), it is an anonymous one – and your studentData is only a pointer to that one.
a global 1D array "StudentData"
I would rather interpret this as you are intended to have a true array with that name, so that would look like:
char studentData[N];
where N is a constant expression representing an appropriate size for your array: at least 5, as you need to be able to store 5 digits, or 6 if your ID should be represented as a C string (which needs one additional character for the mandatory null terminator). You could also go with a power of two right away (8 minimally).
Using pointers syntax is mandatory in this part.
So you'll need a pointer to that array:
char* ptr = studentData;
You could now just use the pointer to assign values to:
*ptr++ = '0'; // dereferences the pointer, assigns a value to (in this
// case a character representing the digit zero) and
// increments it afterwards to point to the next character
// repeat this for the next four digits!
*ptr++ = 0; // terminating null character (note: no single quotes)
// or:
*ptr = 0;
// it's up to you to decide if incrementing yet another time actually
// is meaningful (first variant) or unnecessary (second variant)...
If there are no further requirements given you might have this code directly in main function or as a little bonus place it in another function being called from main like:
void loadData(size_t size, char data[size])
{
// ideally size check with appropriate error handling
// assignment as above
}
// in main:
loadData(sizeof(studentData), studentData);
Note: For function parameters, char* data, char data[] and char data[someArbitrarySize] are all equivalent; if a size is given, it is simply ignored. We can still add it for documentation purposes, as in the signature above, to tell that an array with a size of (at least) size is expected. Note, too, that if more than one dimension is given, this only applies to the outermost dimension!
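Putting the pieces together, a filled-in version might look like the sketch below. The third parameter id, the int return value and the ID_DIGITS macro are my own additions for the example; the assignment itself does not prescribe them.

```c
#include <stddef.h>

#define ID_DIGITS 5                 /* assumed ID length */

char studentData[ID_DIGITS + 1];    /* global array: 5 digits + terminator */

/* Copy the digit string `id` into `data` using only pointer syntax.
 * Returns 0 on success, or -1 (leaving `data` empty) if the ID plus its
 * terminator does not fit into `size` bytes. */
int loadData(size_t size, char *data, const char *id)
{
    const char *src = id;
    char *dst = data;

    if (size == 0)
        return -1;
    while (*src != '\0') {
        if ((size_t)(dst - data) + 1 >= size) {  /* keep room for the '\0' */
            *data = '\0';
            return -1;
        }
        *dst++ = *src++;    /* copy one character, advance both pointers */
    }
    *dst = '\0';
    return 0;
}
```

From main this would be called as loadData(sizeof studentData, studentData, "60897").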
1D array "StudentData"
No no no. StudentData is not an array, it's a pointer. Arrays are blocks of memory while pointers are addresses of memory, which may or may not belong to an array. In most expressions an array is converted to a pointer to its first element; that's called decay.
"60897" is already compatible with char *, so you can assign it directly to StudentData. Like this:
StudentData = "60897";
If you want to use an array, do this:
#include <string.h>
char StudentData[WHATEVER_IS_ENOUGH_TO_HOLD_THE_DATA]; /* allocate array.
make sure that the size of the array accounts for the null
terminating character. */
void loadData(){
strcpy(StudentData, "60897"); //you can't directly assign to an array.
}
Hello I am new to this site, and I require some help with understanding what would be considered the "norm" while coding structures in C that require a string. Basically I am wondering which of the following ways would be considered the "industry standard" while using structures in C to keep track of ALL of the memory the structure requires:
1) Fixed Size String:
typedef struct
{
int damage;
char name[40];
} Item;
I can now get the size using sizeof(Item)
2) Character Array Pointer
typedef struct
{
int damage;
char *name;
} Item;
I know I can store the size of name using a second variable, but is there another way?
i) is there any other advantage to using the fixed size (1)
char name[40];
versus doing the following and using a pointer to a char array (2)?
char *name;
and if so, what is the advantage?
ii) Also, is the string using a pointer to a char array (2) going to be stored sequentially and immediately after the structure (immediately after the pointer to the string) or will it be stored somewhere else in memory?
iii) I wish to know how one can find the length of a char * string variable (without using a size_t, or integer value to store the length)
There are basically 3 common conventions for strings. All three are found in the wild, both for in-memory representation and storage/transmission.
Fixed size. Access is very efficient, but if the actual length varies you both waste space and need one of the below methods to determine the end of the "real" content.
Length prefixed. Extra space is included in the dynamic allocation to hold the length. From the pointer you can find both the character content and the length immediately preceding it. Example: BSTR. Sometimes the length is encoded to be more space efficient for short strings. Example: ASN.1.
Terminated. The string extends until the first occurrence of the termination character (typically NUL), and the content cannot contain that character. Variations made the termination two NUL in sequence, to allow individual NUL characters to exist in the string, which is then often treated as a packed list of strings. Other variations use an encoding such as byte stuffing (UTF-8 would also work) to guarantee that there exists some code reserved for termination that can't ever appear in the encoded version of the content.
In the third case, there's a function such as strlen to search for the terminator and find the length.
Both cases which use pointers can point to data immediately following the fixed portion of the structure, if you carefully allocate it that way. If you want to force this, then use a flexible array on the end of your structure (no pointer needed). Like this:
typedef struct
{
int damage;
char name[]; // terminated
} Item;
or
typedef struct
{
int damage;
int length_of_name;
char name[];
} Item;
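Allocating the first (terminated) variant could be sketched as follows. The helper name item_create is hypothetical; the key point is that the space for the name is requested in the same malloc call as the struct itself.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    int damage;
    char name[];            /* flexible array member, NUL-terminated */
} Item;

/* Allocate an Item with room for a copy of `name` directly behind it.
 * Returns NULL on allocation failure; release the result with free(). */
Item *item_create(int damage, const char *name)
{
    size_t len = strlen(name) + 1;          /* include the terminator */
    Item *it = malloc(sizeof *it + len);    /* struct + trailing string */
    if (it == NULL)
        return NULL;
    it->damage = damage;
    memcpy(it->name, name, len);
    return it;
}
```

Note that sizeof(Item) no longer tells you how much memory an individual item occupies, and such structs cannot be placed in a plain array.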
1) is there any other advantage to using the fixed size (1)
char name[40];
versus doing the following and using a pointer to a char array (2)?
char *name;
and if so, what is the advantage?
With your array declared as char name[40]; space for name is already allocated and you are free to copy information into name from name[0] through name[39]. However, in the case of char *name;, it is simply a character pointer and can be used to point to an existing string in memory, but, on its own, cannot be used to copy information to until you allocate memory to hold that information. So say you have a 30 character string you want to copy to name declared as char *name;, you must first allocate with malloc 30 characters plus an additional character to hold the null-terminating character:
char *name;
name = malloc (sizeof (char) * (30 + 1));
Then you are free to copy information to/from name. An advantage of dynamically allocating is that you can realloc memory for name if the information you are storing in name grows beyond 30 characters. An additional requirement after allocating memory for name is that you are responsible for freeing the memory you have allocated when it is no longer needed. That's a rough outline of the pros/cons/requirements for using one as opposed to the other.
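As a rough illustration of the allocate/grow/free cycle, here is a sketch with a hypothetical helper item_set_name. It assumes name starts out as NULL or as a pointer previously obtained from malloc or realloc.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    int damage;
    char *name;     /* heap-allocated, owned by the Item; NULL if unset */
} Item;

/* Replace it->name with a copy of new_name, resizing the buffer with
 * realloc. Returns 0 on success, or -1 on allocation failure, in which
 * case the old name is left untouched. */
int item_set_name(Item *it, const char *new_name)
{
    size_t len = strlen(new_name) + 1;      /* include the terminator */
    char *p = realloc(it->name, len);       /* realloc(NULL, n) acts like malloc(n) */
    if (p == NULL)
        return -1;
    memcpy(p, new_name, len);
    it->name = p;
    return 0;
}
```

When the Item is no longer needed, free(it.name) releases the string.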
If you know the maximum length of the string you need, then you can use a character array. It does mean though that you will be using more memory than you'd typically use with dynamically allocated character arrays. Also, take a look at CString if you are using C++. You can find the length of the character array using strlen. In case of static allocation I believe it will be a part of the variable. Dynamic can be anywhere on the heap.
Hi i am having 2 questions here
How do i store a hex value in a buffer, say for example 0x0a and 0x1F;
char buffer[2] = "0x0a 0x1F";
But this is not right method, It is giving size 10 instead of 2. Can any one suggest how can i proceed.
I have seen the array like this
char buffer[] = " static array";
In the structure also,
struct Point {
char x[];
char y[];
};
what does it mean? how much size it will take for compilation
For the first, assuming you really want a two-byte array rather than a three-byte string (including the NUL terminator), you can use:
char buffer[] = {0x0a, 0x1f};
For the second, the easiest way to find out is to simply check:
sizeof(buffer)
or:
sizeof(struct Point)
although I'm pretty certain your structure definition will fail because char x[] is not a complete type. Current versions of the standard allow a flexible array member at the end of a structure, but not the way you have it there.
Likely sizes of the two (once you declare struct Point with char x[5]) will be 14 (the number of characters in " static array" including the NUL terminator) and 5 (the size of x itself; flexible array members tend not to take up space, they're more for allowing arbitrary extra space if the memory block is obtained by malloc, for example).
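To check this on your own compiler, something along these lines should do. The variable names two_bytes and text and the function print_sizes are made up for the example, and the struct uses the corrected declaration with char x[5].

```c
#include <stddef.h>
#include <stdio.h>

char two_bytes[] = {0x0a, 0x1f};    /* exactly 2 bytes, no terminator */
char text[] = " static array";      /* 13 characters + '\0' */

struct Point {
    char x[5];
    char y[];       /* flexible array member: must be the last member */
};

/* Print the three sizes discussed above, separated by spaces. */
void print_sizes(void)
{
    printf("%zu %zu %zu\n",
           sizeof two_bytes, sizeof text, sizeof(struct Point));
}
```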
I would like to know how the result of the memcpy() with respect to the memory allocation.
#include<stdio.h>
#include<string.h>
typedef struct {
char myname[7] ;
} transrules;
typedef struct {
char ip ;
int udp;
transrules rules[256];
} __attribute__ ((__packed__)) myudp;
myudp __attribute__ ((__packed__)) udpdata ;
char arr[400] ;
int main() {
memset (&udpdata , 0 ,sizeof(udpdata));
udpdata.ip = 'a' ;
udpdata.udp = 13 ;
udpdata.rules[0].myname = "lalla\0" ;
memcpy (&arr , &udpdata, sizeof(udpdata));
printf("%c",arr[0]);
return 0;
}
With respect to the code , how do we print out the character array in the structure transrules?
PS : Yes this code throws an error, sheesh char's !
As the array defined is of type char why does arr [1] still accept an integer value with memcpy() ?
memcpy does not allocate any memory. In your memcpy call, the memory for the destination arr was allocated when the variable arr was defined (char arr[400]).
There's a problem there, which is that you haven't allocated enough room. You copy sizeof(udpdata) bytes into arr, which is probably 1+4+7*256=1797 (this may vary depending on sizeof(int) and on whether __packed__ actually leaves out all unused bytes on your platform), far more than the 400 bytes arr can hold. If you really need arr (you probably don't), make it at least sizeof(udpdata) large. Defining it with char arr[sizeof(udpdata)] is fine.
If the layout of the structure is defined by some external format, you should use a fixed-size type instead of int (which is 2 or 4 bytes depending on the platform, and could be other sizes but you're unlikely to encounter them).
If the layout of the structure is defined by some external binary format and you want to print out the 1797 bytes in this format, use fwrite.
fwrite(&udpdata, sizeof(udpdata), 1, stdout);
If you want to have a human representation of the data, use printf with appropriate format specifications.
printf("ip='%c' udp=%d\n", udpdata.ip, udpdata.udp);
for (size_t i = 0; i < sizeof(udpdata.rules)/sizeof(udpdata.rules[0]); i++) {
    puts(udpdata.rules[i].myname);
}
Despite the name, char is in fact the type of bytes. There is no separate type for characters in C. A character constant like 'a' is in fact an integer value (97 on almost all systems, as per ASCII). It's things like writing it with putchar or printf("%c", …) that make the byte interpreted as a character.
If your compiler is not signaling an error when you mix up a pointer (such as char*) with an integer, on the other hand, turn up the warning level. With GCC, use at least gcc -O -Wall.
After actually compiling your code, I see the main error (you should have copy-pasted the error message from the compiler in your question):
udpdata.rules[0].myname = "lalla\0" ;
udpdata.rules[0].myname is an array of bytes. You can't assign to an array in C. You need to copy the elements one by one. Since this is an array of char, and you want to copy a string into it, you can use strcpy to copy all the bytes of the string. For a bunch of bytes in general, you would use memcpy.
strcpy(udpdata.rules[0].myname, "lalla");
(Note that "lalla\0" is equivalent to "lalla"; all string literals are zero-terminated in C.¹) Since strcpy does not perform any size verification, you must make sure that the string (including its final null character) fits in the memory that you've allocated for the target. You can use other functions such as strncat or strlcpy if you want to specify a maximum size.
¹ There's one exception (and only this exception) where "lalla" won't be zero-terminated: when initializing an array of 5 bytes, e.g. char bytes[5] = "lalla". If the array size is at least 6 or unspecified, there will be a terminating zero byte.
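If you would rather not depend on strlcpy (which is not part of standard C), a bounded copy can also be built on snprintf. This is just one possible sketch; copy_field is a name I made up.

```c
#include <stdio.h>

/* Copy `src` into a fixed-size char field, always NUL-terminating it.
 * Returns 0 if the whole string fit, -1 if it had to be truncated. */
int copy_field(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0)
        return -1;
    /* snprintf writes at most dst_size - 1 characters plus the '\0' and
     * returns the length the full string would have needed. */
    int n = snprintf(dst, dst_size, "%s", src);
    return (n >= 0 && (size_t)n < dst_size) ? 0 : -1;
}
```

For the struct in the question the call would be copy_field(udpdata.rules[0].myname, sizeof udpdata.rules[0].myname, "lalla").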
// this is really bad (it does not even compile: arrays are not assignable)
udpdata.rules[0].myname = "lalla\0" ;
// do this instead. You want the literal string in your field.
memcpy(udpdata.rules[0].myname, "lalla\0", 6);
....
// This is unidiomatic. arr already decays to a pointer, so & is not needed.
memcpy (&arr , &udpdata, sizeof(udpdata));
// do this instead
memcpy (arr, &udpdata, sizeof(udpdata));
Concerning printing, I don't know how big ints are on your machine but if they are 4 bytes then
printf("%.7s", &arr[1+4]);
I'm not sure why you want to convert everything to a char array if you wanted to print out the content. Just use the struct and a for loop. Anyway I think you may want to read up on C arrays.
With respect to the code , how do we print out the character array in the structure transrules ?
/* incorrect -> udpdata.rules[0].myname = "lalla\0" ; */
strcpy(udpdata.rules[0].myname,"lalla") ;
printf("%s\n",udpdata.rules[0].myname);
As the array defined is of type char why does arr [1] still accept an integer value with memcpy ?
memcpy doesn't know or care about what the underlying datatypes might be where it is copying to. It takes void pointers and copies the value in one or more byte to one or more other bytes:
void * memcpy ( void * destination, const void * source, size_t num );
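To make the "just bytes" point concrete, here is a small sketch that copies a struct into a byte buffer and reads one field back out. Record is a hypothetical, much smaller stand-in for myudp, and it uses int32_t instead of int so the field width is fixed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    char    ip;
    int32_t udp;        /* fixed-width instead of plain int */
    char    name[7];
} Record;

/* Copy the object representation of *r into buf (padding included).
 * Returns the number of bytes written, or 0 if buf is too small. */
size_t record_to_bytes(const Record *r, unsigned char *buf, size_t buf_size)
{
    if (buf_size < sizeof *r)
        return 0;
    memcpy(buf, r, sizeof *r);
    return sizeof *r;
}

/* Read the udp field back out of the raw bytes. offsetof gives the
 * field's position whether or not the struct contains padding. */
int32_t udp_from_bytes(const unsigned char *buf)
{
    int32_t v;
    memcpy(&v, buf + offsetof(Record, udp), sizeof v);
    return v;
}
```

The integer value ends up spread over several elements of the byte buffer, which is why a single arr[1] only ever shows part of it.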
I have a structure that has an array of pointers. I would like to insert into the array digits in string format, i.e. "1", "2", etc..
However, is there any difference in using either sprintf or strncpy?
Any big mistakes with my code? I know I have to call free, I will do that in another part of my code.
Many thanks for any advice!
struct port_t
{
char *collect_digits[100];
}ports[20];
/** store all the string digits in the array for the port number specified */
static void g_store_digit(char *digit, unsigned int port)
{
static int marker = 0;
/* allocate memory */
ports[port].collect_digits[marker] = (char*) malloc(sizeof(digit)); /* sizeof includes 0 terminator */
// sprintf(ports[port].collect_digits[marker++], "%s", digit);
strncpy(ports[port].collect_digits[marker++], digit, sizeof(ports[port].collect_digits[marker]));
}
Yes, your code has a few issues.
In C, don't cast the return value of malloc(). It's not needed, and can hide errors.
You're allocating space based on the size of a pointer, not the size of what you want to store.
The same for the copying.
It is unclear what the static marker does, and if the logic around it really is correct. Is port the slot that is going to be changed, or is it controlled by a static variable?
Do you want to store only single digits per slot in the array, or multiple-digit numbers?
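Here's how that function could look, assuming the declaration is changed so that each port holds a single string (char *collect_digits; instead of an array of 100 pointers):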
/* Initialize the given port position to hold the given number, as a decimal string. */
static void g_store_digit(struct port_t *ports, unsigned int port, unsigned int number)
{
char tmp[32];
snprintf(tmp, sizeof tmp, "%u", number);
ports[port].collect_digits = strdup(tmp);
}
strncpy(ports[port].collect_digits[marker++], digit, sizeof(ports[port].collect_digits[marker]));
This is incorrect.
You have allocated a certain amount of memory for a slot in collect_digits.
You copy the string digit into that memory.
The length you should copy is strlen(digit), plus one for the terminator. What you're actually copying is sizeof(ports[port].collect_digits[marker]), which will give you the size of a single char *.
You cannot use sizeof() to find the length of allocated memory. Furthermore, unless you know a priori that digit is the same length as the memory you've allocated, even if sizeof() did tell you the length of allocated memory, you would be copying the wrong number of bytes (too many; you only need to copy the length of digit).
Also, even if the two lengths are always the same, obtaining the length this way is not expressive; it misleads the reader.
Note also that strncpy() will pad with trailing NUL bytes if the specified copy length is greater than the length of the source string, and will write no terminator at all if the source is at least as long as the copy length. As such, if digit is the length of the memory allocated, you will have a non-terminated string.
The sprintf() line is functionally correct, but for what you're doing, strcpy() (as opposed to strncpy()) is, from what I can see and know of the code, the correct choice.
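A corrected version along those lines might look like this. It is a sketch, not the asker's final code: the per-port count field (replacing the single static marker), the int return value and the two size macros are assumptions of mine.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_DIGITS 100
#define MAX_PORTS  20

struct port_t {
    char *collect_digits[MAX_DIGITS];
    unsigned int count;     /* assumed: next free slot, tracked per port */
} ports[MAX_PORTS];

/* Store a heap copy of `digit` in the next free slot of the given port.
 * Returns 0 on success, -1 if the port is invalid, full, or out of memory. */
int g_store_digit(const char *digit, unsigned int port)
{
    if (port >= MAX_PORTS || ports[port].count >= MAX_DIGITS)
        return -1;

    size_t len = strlen(digit) + 1;     /* string length plus terminator */
    char *copy = malloc(len);
    if (copy == NULL)
        return -1;
    memcpy(copy, digit, len);           /* same effect as strcpy here */

    ports[port].collect_digits[ports[port].count++] = copy;
    return 0;
}
```

Because the size comes from strlen rather than sizeof, multi-character strings such as "42" are stored correctly as well.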
I have to say, I don't know what you're trying to do, but the code feels very awkward.
The first thing: why have an array of pointers? Do you expect multiple strings for a port object? You probably only need a plain array or a pointer (since you are malloc-ing later on).
struct port_t
{
char *collect_digits;
}ports[20];
You need to pass the address of the string, otherwise, the malloc acts on a local copy and you never get back what you paid for.
static void g_store_digit(char **digit, unsigned int port);
Finally, the sizeof applies in a pointer context and doesn't give you the correct size.
Instead of using malloc() and strncpy(), just use strdup() - it allocates a buffer big enough to hold the content and copies the content to the new string, all in one shot.
So you don't need g_store_digit() at all - just use strdup(), and maintain marker on the caller's level.
Another problem with the original code: The statement
strncpy(ports[port].collect_digits[marker++], digit, sizeof(ports[port].collect_digits[marker]));
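references marker and marker++ in the same expression. In general that is undefined behavior: the two accesses are unsequenced, so the second reference to marker could be evaluated either before or after the increment is performed. Here the second reference happens to sit inside sizeof, whose operand is not evaluated, so this particular statement is well defined, but the construct is fragile and breaks as soon as sizeof is replaced by something that is evaluated, such as strlen.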