Building up a string in C. Am I doing it right? - c

A while back I asked this question.
I eventually hacked together a sort of solution:
int convertWindowsSIDToString(void *sidToConvert, int size, char* result) {
const char *sidStringPrefix = "S-";
int i;
int concatLength = 0;
/* For Linux I have SID defined in a separate header */
SID *sid;
char revision[2], identifierAuthority[2];
if(sidToConvert == NULL) {
return 1;
}
sid = (SID *)sidToConvert;
snprintf(revision, 2, "%d", sid -> Revision);
snprintf(identifierAuthority, 2, "%d", sid -> IdentifierAuthority.Value[5]);
/* Push prefix in to result buffer */
strcpy (result,sidStringPrefix);
/* Add revision so now should be S-{revision} */
strcat(result, revision);
/* Append another - symbol */
strcat(result, "-");
/* Add the identifier authority */
strcat(result, identifierAuthority);
/* Sub Authorities are all stored as unsigned long so a little conversion is required */
for (i = 0; i < sid -> SubAuthorityCount; i++) {
if(concatLength > 0){
concatLength += snprintf(result + concatLength, size, "-%lu", sid -> SubAuthority[i]);
} else {
concatLength = snprintf(result, size, "%s-%lu", result, sid -> SubAuthority[i]);
}
}
return 0;
}
I'm a complete amateur at C.
In the few test cases I have run, this works fine but I am worried about how I'm handling strings here.
Is there any better way to handle string concatenation in this type of scenario? Please note, I am kind of tied to C89 compatibility as I am trying to keep all code compiling on all platforms and am stuck with Visual Studio on Windows for now.
Also, my apologies if this question is not in the best format for Stack Overflow. I guess I'm asking more for a code review than a very specific question, but I'm not sure where else to go.
EDIT
Just wanted to add what I think is almost the final solution, based on suggestions here, before accepting an answer.
int convertWindowsSIDToString(SID *sidToConvert, int size, char* result) {
int i;
char* t;
if(sidToConvert == NULL) {
printf("Error: SID to convert is null.\n");
return 1;
}
if(size < 32) {
printf("Error: Buffer size must be at least 32.\n");
return 1;
}
t = result;
t+= snprintf(t, size, "S-%d-%d", sidToConvert->Revision, sidToConvert->IdentifierAuthority.Value[5]);
for (i = 0; i < sidToConvert -> SubAuthorityCount; i++) {
t += snprintf(t, size - strlen(result), "-%lu", sidToConvert -> SubAuthority[i]);
}
return 0;
}
I've got a lot of reading to do yet by the look of things. Got to admit, C is pretty fun though.

If you know that the result buffer will be big enough (which you can usually ensure by allocating the maximum space necessary for any of the formats and validating your inputs before formatting them), you can do something like the following:
char* buffer = malloc(BIG_ENOUGH);
char* t = buffer;
t+=sprintf(t, "S-%d", sid->Revision);
t+=sprintf(t, "-%d", sid->IdentifierAuthority.Value[5]);
for (i = 0; i < sid -> SubAuthorityCount; i++) {
t += sprintf(t, "-%lu", sid -> SubAuthority[i]);
}
printf("Result: %s\n", buffer);

It all looks too complicated to me. I'd replace most of it with a single snprintf().
So this...
snprintf(revision, 2, "%d", sid -> Revision);
snprintf(identifierAuthority, 2, "%d", sid -> IdentifierAuthority.Value[5]);
/* Push prefix in to result buffer */
strcpy (result,sidStringPrefix);
/* Add revision so now should be S-{revision} */
strcat(result, revision);
/* Append another - symbol */
strcat(result, "-");
/* Add the identifier authority */
strcat(result, identifierAuthority);
...would become
snprintf(result, size, "S-%d-%d", sid->Revision, sid->IdentifierAuthority.Value[5]);
Also, in your loop, you're not using size correctly.
concatLength += snprintf(result + concatLength, size, "-%lu", sid -> SubAuthority[i]);
You should use size-concatLength so that the size reflects how much has already been written.
Oh, and this:-
concatLength = snprintf(result, size, "%s-%lu", result, sid -> SubAuthority[i]);
... is probably unsafe, as your destination is also one of your source parameters. Generally, for the loop I'd use a solution like @MagnusReftel's.
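To make the size-concatLength point concrete, here is a minimal sketch of the append loop with the remaining space passed to every call and the return value checked for truncation. It assumes a C99-conforming snprintf (older Visual Studio versions only offer _snprintf, whose return value and termination behave differently). struct demo_sid and demo_sid_to_string are hypothetical stand-ins for the SID type and the function in the question.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-in for the SID fields used in the question. */
struct demo_sid {
    int revision;
    int authority;
    int sub_count;
    unsigned long sub[15];
};

/* Formats sid as "S-<revision>-<authority>-<sub>..." into result.
 * Returns 0 on success, 1 if an argument is invalid or result is too small. */
int demo_sid_to_string(const struct demo_sid *sid, char *result, size_t size)
{
    size_t used;
    int n, i;

    if (sid == NULL || result == NULL || size == 0)
        return 1;

    n = snprintf(result, size, "S-%d-%d", sid->revision, sid->authority);
    if (n < 0 || (size_t)n >= size)
        return 1;                      /* error or truncated */
    used = (size_t)n;

    for (i = 0; i < sid->sub_count; i++) {
        /* Write at the current end and pass only the space that is left. */
        n = snprintf(result + used, size - used, "-%lu", sid->sub[i]);
        if (n < 0 || (size_t)n >= size - used)
            return 1;                  /* error or truncated */
        used += (size_t)n;
    }
    return 0;
}
```

Tracking the number of bytes used also avoids calling strlen(result) on every iteration.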

Related

Why does VS2019 still report at the end of function execution: Run-Time Check Failure # - Stack around the variable 'binR' was corrupted

My program uses socket communication for oblivious transfer (OT). I want to receive 128 random strings via oblivious transfer at once. Whether message 0 or message 1 is chosen in each transfer is determined by the corresponding bit of the input string. Based on this idea, I wrote the program below.
The function get_reuse_rule() takes a 16-character string (17 bytes including the '\0') and should return the XOR of the 128 OT strings mentioned above.
Unfortunately, there is a run-time error, although I still get the final XOR result.
Here I show the code
int main(){
socket_conn();
char *rule1="networkingjjjjjj";// fewer than 16 (10)
char re_rule[17];
memset(re_rule, 0, sizeof(re_rule));
get_reuse_rule(rule1, re_rule);
socket_clean();
system("pause");
return 0;
}
The function where the error occurs:
void get_reuse_rule(char *rule,char *re_rule) {
int binR[128];
memset(binR, 0, sizeof(binR));
get_arraized_rule(rule, binR);
// run OT according to the rule to determine key_array
char ki[128][17];
memset(ki, 0, sizeof(ki));
for (int i = 0; i < 128; i++) printf("%d", binR[i]);
printf("\n");
ot_128_get(ki, binR);
for (int i = 0; i < 128; i++) {
printf("%s\n", ki[i]);
}
printf("%d\n", strlen(ki[127]));
// we now have the corresponding set of ki; XOR them to encrypt the rule
char Irule[17];// the reusable rule
memset(Irule, 0, sizeof(Irule));
for (int i = 0; i < 128; i++) {
for (int j = 0; j < 16; j++) {
Irule[j] = Irule[j] ^ ki[i][j];
}
}
printf("\n%s\n", Irule);
memcpy(re_rule, Irule, sizeof(Irule));
//memset(ki, 0, sizeof(ki));
}
Another function, which fills ki and which I tested from main with no problem:
void ot_128_get(char (*choose_ki)[17],int *bin_rule) {
for (int i = 0; i < 128; i++) {
ot_get_msg(choose_ki[i], bin_rule[i]);
printf("%d\n", i);
}
printf("\not all is ok!\n");
}
ot_get_msg(choose_ki[i], bin_rule[i]):
Receives one oblivious transfer, choosing message 0 or 1 according to the second parameter; the first parameter receives the result.
I searched online and found that this error may be caused by an out-of-bounds array access, but I don't think there is one in my program. If you know how to solve this problem, please leave me a message. Thank you for reading.
-------------------------new update---------------------------------
I wrote a nearly identical function and moved the local variables that might be at fault into main. Nothing else changed. No error is reported while the program is running, but closing it triggers the same error, this time reported at the end of main.
----------------ot_get_msg------------------------
void ot_get_msg(char *get_msg,int choose) {
//int choose = 1;// hard-code choice 1 for now
int r;
char buf[1024];
memset(buf, 0, sizeof(buf));
r= recv(clientSocket1, buf, 1023, NULL);
//printf("%s", buf);//n/e/rmsg1/rmsg2
char* p[4];
p[0] = strtok(buf, "/");
p[1] = strtok(NULL, "/");
p[2] = strtok(NULL, "/");
p[3] = strtok(NULL, "/");
//strtok(NULL, "/");
char v[100];
char k[100];
memset(v, 0, sizeof(v));
memset(k, 0, sizeof(k));
ot_recv_compute_v(v, k, p[0], p[1], p[2], p[3],choose);
//printf("V:%s\n", v);
//printf("K:%s\n", k);
// send v to the client
send(clientSocket1, v, strlen(v), NULL);
char buf2[1024];
memset(buf2, 0, sizeof(buf2));
recv(clientSocket1, buf2, 1023, NULL);
char* enmsg[2];
enmsg[0] = strtok(buf2, "/");
enmsg[1] = strtok(NULL, "/");
// decrypt
char demsg1[100];
char demsg2[100];
memset(demsg1, 0, sizeof(demsg1));
memset(demsg2, 0, sizeof(demsg2));
ot_decode_msg(demsg1, demsg2, enmsg[0], enmsg[1], k);
//printf("%s\n", demsg1);
//printf("%s\n", demsg2);
if (choose == 0) {
memcpy(get_msg, demsg1,sizeof(demsg1));
}
else {
memcpy(get_msg, demsg2,sizeof(demsg2));
}
//closesocket(servSOCKET);
}
Error code: Run-Time Check Failure #2 - Stack around the variable 'ki' was corrupted.
I think I've found a way to make the program run without errors (the source I found was in Chinese):
Set Project > Configuration Properties > C/C++ > Code Generation > Basic Runtime Checks to Default.
The original guess was: "When a project reaches a certain size, the program uses a large amount of stack. I have seen this myself: a class I had written ran without errors, but after I added member variables this kind of error appeared easily. I therefore guess that VS internally limits the stack size, and a large enough project overflows it."
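One thing in the posted code would produce exactly this report regardless of project size: ot_get_msg copies sizeof(demsg1), that is 100 bytes, into get_msg, while each row of ki has only 17 bytes. Changing the runtime-check setting only stops the overrun from being reported. Below is a minimal sketch of a bounded copy, assuming the decoded message is a NUL-terminated string; copy_bounded and KI_LEN are hypothetical names, not part of the original program.

```c
#include <string.h>

#define KI_LEN 17  /* row size of ki[128][17] in the question */

/* Copies at most dst_size - 1 bytes of src into dst and always
 * NUL-terminates. Returns the number of bytes copied, not counting
 * the terminator. */
size_t copy_bounded(char *dst, size_t dst_size, const char *src)
{
    size_t n;

    if (dst_size == 0)
        return 0;
    n = strlen(src);
    if (n > dst_size - 1)
        n = dst_size - 1;   /* never write past the destination */
    memcpy(dst, src, n);
    dst[n] = '\0';
    return n;
}
```

In ot_get_msg the call would then look like copy_bounded(get_msg, KI_LEN, demsg1) in place of the memcpy with sizeof(demsg1).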

How to Hash CPU ID in C

I'm trying to shorten the CPU ID of my microcontroller (STM32F1).
The CPU ID is composed of 3 words (3 x 4 bytes). This is the ID string built from the 3 words: 980416578761680031125348904
I found a very useful library that does this.
The library is Hashids, and there is a C implementation.
I built a test program on a PC with the Code::Blocks IDE, and the code works.
But when I move the code to the embedded side (Keil v5 IDE), I get an error on the strdup() function: "strdup implicit declaration of function".
The problem is that strdup isn't a standard library function and isn't declared in string.h.
I would like to avoid replacing strdup with a custom function that mimics its behaviour, to avoid memory leaks, because strdup copies strings using malloc.
Is there a different approach to compress long numbers?
Thanks for the help!
<---Appendix--->
This is the function that uses the strdup.
/* common init */
struct hashids_t *
hashids_init3(const char *salt, size_t min_hash_length, const char *alphabet)
{
struct hashids_t *result;
unsigned int i, j;
size_t len;
char ch, *p;
hashids_errno = HASHIDS_ERROR_OK;
/* allocate the structure */
result = _hashids_alloc(sizeof(struct hashids_t));
if (HASHIDS_UNLIKELY(!result)) {
hashids_errno = HASHIDS_ERROR_ALLOC;
return NULL;
}
/* allocate enough space for the alphabet and its copies */
len = strlen(alphabet) + 1;
result->alphabet = _hashids_alloc(len);
result->alphabet_copy_1 = _hashids_alloc(len);
result->alphabet_copy_2 = _hashids_alloc(len);
if (HASHIDS_UNLIKELY(!result->alphabet || !result->alphabet_copy_1
|| !result->alphabet_copy_2)) {
hashids_free(result);
hashids_errno = HASHIDS_ERROR_ALLOC;
return NULL;
}
/* extract only the unique characters */
result->alphabet[0] = '\0';
for (i = 0, j = 0; i < len; ++i) {
ch = alphabet[i];
if (!strchr(result->alphabet, ch)) {
result->alphabet[j++] = ch;
}
}
result->alphabet[j] = '\0';
/* store alphabet length */
result->alphabet_length = j;
/* check length and whitespace */
if (result->alphabet_length < HASHIDS_MIN_ALPHABET_LENGTH) {
hashids_free(result);
hashids_errno = HASHIDS_ERROR_ALPHABET_LENGTH;
return NULL;
}
if (strchr(result->alphabet, ' ')) {
hashids_free(result);
hashids_errno = HASHIDS_ERROR_ALPHABET_SPACE;
return NULL;
}
/* copy salt */
result->salt = strdup(salt ? salt : HASHIDS_DEFAULT_SALT);
result->salt_length = (unsigned int) strlen(result->salt);
/* allocate enough space for separators */
result->separators = _hashids_alloc((size_t)
(ceil((float)result->alphabet_length / HASHIDS_SEPARATOR_DIVISOR) + 1));
if (HASHIDS_UNLIKELY(!result->separators)) {
hashids_free(result);
hashids_errno = HASHIDS_ERROR_ALLOC;
return NULL;
}
/* non-alphabet characters cannot be separators */
for (i = 0, j = 0; i < strlen(HASHIDS_DEFAULT_SEPARATORS); ++i) {
ch = HASHIDS_DEFAULT_SEPARATORS[i];
if ((p = strchr(result->alphabet, ch))) {
result->separators[j++] = ch;
/* also remove separators from alphabet */
memmove(p, p + 1,
strlen(result->alphabet) - (p - result->alphabet));
}
}
/* store separators length */
result->separators_count = j;
/* subtract separators count from alphabet length */
result->alphabet_length -= result->separators_count;
/* shuffle the separators */
hashids_shuffle(result->separators, result->separators_count,
result->salt, result->salt_length);
/* check if we have any/enough separators */
if (!result->separators_count
|| (((float)result->alphabet_length / (float)result->separators_count)
> HASHIDS_SEPARATOR_DIVISOR)) {
unsigned int separators_count = (unsigned int)ceil(
(float)result->alphabet_length / HASHIDS_SEPARATOR_DIVISOR);
if (separators_count == 1) {
separators_count = 2;
}
if (separators_count > result->separators_count) {
/* we need more separators - get some from alphabet */
int diff = separators_count - result->separators_count;
strncat(result->separators, result->alphabet, diff);
memmove(result->alphabet, result->alphabet + diff,
result->alphabet_length - diff + 1);
result->separators_count += diff;
result->alphabet_length -= diff;
} else {
/* we have more than enough - truncate */
result->separators[separators_count] = '\0';
result->separators_count = separators_count;
}
}
/* shuffle alphabet */
hashids_shuffle(result->alphabet, result->alphabet_length,
result->salt, result->salt_length);
/* allocate guards */
result->guards_count = (unsigned int) ceil((float)result->alphabet_length
/ HASHIDS_GUARD_DIVISOR);
result->guards = _hashids_alloc(result->guards_count + 1);
if (HASHIDS_UNLIKELY(!result->guards)) {
hashids_free(result);
hashids_errno = HASHIDS_ERROR_ALLOC;
return NULL;
}
if (HASHIDS_UNLIKELY(result->alphabet_length < 3)) {
/* take some from separators */
strncpy(result->guards, result->separators, result->guards_count);
memmove(result->separators, result->separators + result->guards_count,
result->separators_count - result->guards_count + 1);
result->separators_count -= result->guards_count;
} else {
/* take them from alphabet */
strncpy(result->guards, result->alphabet, result->guards_count);
memmove(result->alphabet, result->alphabet + result->guards_count,
result->alphabet_length - result->guards_count + 1);
result->alphabet_length -= result->guards_count;
}
/* set min hash length */
result->min_hash_length = min_hash_length;
/* return result happily */
return result;
}
The true question seems to be
Is there a different approach to compress long numbers?
There are many. They differ in several respects, including which bits of the input contribute to the output, how many inputs map to the same output, and what manner of transformations of the input leave the output unchanged.
As trivial examples, you can compress the input to a single bit by any of these approaches:
Choose the lowest-order bit of the input
Choose the highest-order bit of the input
The output is always 1
etc
Or you can compress to 7 bits by using the number of 1 bits in the input as the output.
None of those particular options is likely to be of interest to you, of course.
Perhaps you would be more interested in producing 32-bit outputs for your 96-bit inputs. Do note that in that case on average there will be at least 2^64 possible inputs that map to each possible output. That depends only on the sizes of input and output, not on any details of the conversion.
For example, suppose that you have
uint32_t *cpuid = ...;
pointing to the hardware CPU ID. You can produce a 32-bit value from it that depends on all the bits of the input simply by doing this:
uint32_t cpuid32 = cpuid[0] ^ cpuid[1] ^ cpuid[2];
Whether that would suit your purpose depends on how you intend to use it.
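If the plain XOR fold is too weak for the intended use (two equal words cancel each other out, for instance), a byte-wise hash is a small step up. The following is a minimal sketch using 32-bit FNV-1a, which needs neither malloc nor floating point; fnv1a32 and hash_cpuid are hypothetical names, and the three ID words are assumed to have been read into an array already.

```c
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a over an arbitrary byte buffer. */
uint32_t fnv1a32(const void *data, size_t len)
{
    const unsigned char *p = (const unsigned char *)data;
    uint32_t h = 2166136261u;          /* FNV offset basis */
    size_t i;

    for (i = 0; i < len; i++) {
        h ^= p[i];
        h *= 16777619u;                /* FNV prime */
    }
    return h;
}

/* Hashes the three 32-bit words of the device ID into one 32-bit value. */
uint32_t hash_cpuid(const uint32_t id[3])
{
    return fnv1a32(id, 3 * sizeof id[0]);
}
```

The counting argument above still applies: any 32-bit result has collisions among 96-bit inputs, so this only changes which inputs collide.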
You can easily implement strdup yourself like this:
char* strdup (const char* str)
{
size_t size = strlen(str) + 1; /* include the terminating '\0' */
char* result = malloc(size);
if(result != NULL)
{
memcpy(result, str, size);
}
return result;
}
That being said, using malloc or strdup on an embedded system is most likely poor practice. Nor would you use floating-point numbers. Overall, that library seems to have been written by a desktop-minded person.
If you are implementing something like for example a chained hash table on an embedded system, you would use a statically allocated memory pool and not malloc. I'd probably go with a non-chained one for that reason (upon duplicates, pick next free spot in the buffer).
The unique device ID register (96 bits) is located at address 0x1FFFF7E8. It is factory programmed and read-only. You can read it directly without any external library. For example:
unsigned int b = *(volatile unsigned int *)0x1FFFF7E8;
should give you the first 32 bits (31:0) of the unique device ID. If you want to build a string as in the case of the library mentioned, the following should work:
sprintf(id, "%08X%08X%08X", *(volatile unsigned int *)0x1FFFF7E8, *(volatile unsigned int *)(0x1FFFF7E8 + 4), *(volatile unsigned int *)(0x1FFFF7E8 + 8));
The pointer casts are required to dereference the integer address, and id must be a char buffer of at least 25 bytes, but generally that's what the library did. Please refer to the STM32F1xx Reference Manual (RM0008), section 30.2 for more details. The exact memory location to read from is different for the Cortex-M4 family of MCUs.

C language - turning input into code

Most of the time, the questions I ask have to do with a specific part of the code that I wrote incorrectly, or some bug that I overlooked, but this time I don't know where to start. I don't even know if what I am trying to do is possible.
I was given an assignment to write a code that gets a string that resembles a variable declaration, for example int x,y; is a valid input. char c,*cptr,carray[80]; is another example of valid input.
The code will create what the user inputs, and will print how much memory it took.
For instance, in the first example (int x,y;) the code will create 2 integers, and print "x requires 4 bytes, y requires 4 bytes".
In the second example, the code will create a character, a pointer to a character, and a string with 80 characters, and will print "c requires 1 byte, cptr requires 4 bytes, carray requires 80 bytes"
Is this even possible? It is not valid code to declare variables after the beginning of the code. They must be declared before anything else in C. So I don't see a way to do this...
This is a parsing problem -- you need to parse the input string and figure out what it means. You don't need to actually "create" anything, you just need to figure out the sizes of the variables that the compiler would create for that code.
Parsing is actually a very large subject, with lots of books written about it and tools written to make it easier. While you could use a tool like antlr or bison to complete this task, they're probably overkill; a simple hand-written recursive descent parser is probably the best approach.
Something like:
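#include <ctype.h>
#include <stdio.h>
#include <string.h>
/* prototypes, since parse_declaration calls these before they are defined */
const char *parse_declspecs(const char *p, int *size);
const char *parse_declarator(const char *p, int typesize, const char **name, int *namelen, int *declsize);
const char *parse_declaration(const char *p) {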
/* parse a declaration, printing out the names and sizes of the variables
* 'p' points at the beginning of the string containing the declaration, and the
* function returns the pointer immediately after the end or NULL on failure */
int size;
if (!(p = parse_declspecs(p, &size))) return 0;
do {
const char *name;
int namelen, declsize;
if (!(p = parse_declarator(p, size, &name, &namelen, &declsize))) return 0;
printf("%.*s requires %d bytes\n", namelen, name, declsize);
p += strspn(p, " \t\r\n"); /* skip whitespace */
} while (*p++ == ',');
if (p[-1] != ';') return 0;
return p;
}
const char *parse_declspecs(const char *p, int *size) {
/* parse declaration specifiers (a type), and output the size of that type
* p points at the string to be parsed, and we return the point after the declspec */
int len = 0;
p += strspn(p, " \t\r\n");
if (!isalpha(*p)) return 0;
while (isalnum(p[len])) len++;
/* compare the whole word, so that a prefix such as "c" or "in" is not accepted */
if (len == 4 && !strncmp(p, "char", 4)) {
*size = sizeof(char);
return p+len; }
if (len == 3 && !strncmp(p, "int", 3)) {
*size = sizeof(int);
return p+len; }
/* ... more type tests here ... */
if (len == 8 && !strncmp(p, "unsigned", 8)) {
p += len;
p += strspn(p, " \t\r\n");
if (!isalpha(*p)) {
*size = sizeof(unsigned);
return p; }
len = 0; /* restart the word length for the next keyword */
while (isalnum(p[len])) len++;
if (len == 3 && !strncmp(p, "int", 3)) {
*size = sizeof(unsigned int);
return p+len; }
/* ... more type tests here ... */
}
return 0;
}
const char *parse_declarator(const char *p, int typesize, const char **name, int *namelen, int *declsize) {
/* parse a declarator */
p += strspn(p, " \t\r\n");
while (*p == '*') {
typesize = sizeof(void *); /* assuming all pointers are the same size...*/
p++;
p += strspn(p, " \t\r\n"); }
*declsize = typesize;
if (isalpha(*p)) {
*name = p;
while (isalnum(*p) || *p == '_') p++;
*namelen = p - *name;
} else if (*p == '(') {
if (!(p = parse_declarator(p+1, typesize, name, namelen, declsize))) return 0;
p += strspn(p, " \t\r\n");
if (*p++ != ')') return 0;
} else
return 0;
p += strspn(p, " \t\r\n");
while (*p == '[') {
int arraysize, len;
if (sscanf(++p, "%d %n", &arraysize, &len) < 1) return 0;
p += len;
*declsize *= arraysize;
if (*p++ != ']') return 0;
p += strspn(p, " \t\r\n"); }
return p;
}
should get you started...
If you are trying to execute input code dynamically, to my knowledge that is not possible without storing the code and compiling it again, which seems like a very nasty and lengthy approach. If all you want is to calculate the size of the declarations in the input, I would take the received string and pass it to a function that analyzes/decomposes it. For example, if the string contains "int", "char", etc., I know what kind of declaration I am dealing with; after that I can count the number of variables declared and keep a counter (in your example, x and y). I would then loop on the counter and compute the total from the sizeof of the declared type and how many variables were declared.
Sure, it's possible; it's just a bit of work. You're going to have to study C declaration syntax, and then write the code to recognize it (basically a small compiler front end).
For example, in the declaration
char c, *cptr, carray[80];
you have a sequence of tokens:
char c , * cptr , carray [ 80 ] ;
which will be recognized as a type specifier (char) followed by three declarators; a direct declarator, a pointer declarator, and an array declarator.
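The token sequence above can be produced by a very small scanner. This is a minimal sketch, assuming identifiers and numbers are the only multi-character tokens and every other non-space character is a token on its own; next_token and count_tokens are hypothetical names.

```c
#include <ctype.h>
#include <stddef.h>

/* Finds the next token at or after *p. Stores its start in *tok, advances
 * *p past it, and returns its length (0 at the end of the string). */
size_t next_token(const char **p, const char **tok)
{
    const char *s = *p;
    size_t len = 0;

    while (isspace((unsigned char)*s))
        s++;
    *tok = s;
    if (isalnum((unsigned char)*s) || *s == '_') {
        /* identifier, keyword or number: take the whole word */
        while (isalnum((unsigned char)s[len]) || s[len] == '_')
            len++;
    } else if (*s != '\0') {
        len = 1;    /* punctuation such as , * [ ] ; */
    }
    *p = s + len;
    return len;
}

/* Counts the tokens in a declaration string. */
size_t count_tokens(const char *s)
{
    const char *tok;
    size_t n = 0;

    while (next_token(&s, &tok) > 0)
        n++;
    return n;
}
```

A recognizer for type specifiers and declarators can then work on tokens instead of raw characters.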
You can create the space for the objects dynamically using malloc or calloc. Then you'll need to create some kind of table to map the identifier (the variable name) to the dynamically-created object. You won't be able to treat these things as regular variables in regular C code; you're going to be doing a lot of table lookups and dereferencing.
Sure, you could do this with a type of parser. Assuming that you do not want to actually execute the code that you are given, you could read the string, count how many times a variable of each specific type is declared, and calculate the amount of memory from that. But, depending on the requirements of the professor, you may run into a few different issues.
In particular, the sizes of different types will likely be different on each processor. With the exception of char, you need to account for this. This is easy if you are analyzing the memory requirements for the computer that your program is executing on, as you could just have const variables whose values are assigned via sizeof to get the sizes, but if not, your program is more difficult, especially since you cannot presume to know the size of any variable.
Secondly, structs will be a problem due to some of the more interesting rules of C. Do you need to account for them?
So, this is entirely possible, because contrary to what you stated in your question, your code doesn't have to "create" a variable at all - it can just create an in-memory total for each type and print them out when done.
Figured I would post my solution, just in case anyone is interested:
void* q5(char* str_in)
{
char runner;
int i=0,memory,counter=0,arr_size;
runner=str_in[i];
while(1)
{
if(runner=='i') //the input is integer
{
memory=sizeof(int);
break;
}
if(runner=='c') //input is char
{
memory=sizeof(char);
break;
}
if(runner=='d') //input is double
{
memory=sizeof(double);
break;
}
if(runner=='s') //input is short
{
memory=sizeof(short);
break;
}
if(runner=='l') //input is long
{
memory=sizeof(long);
break;
}
if(runner=='f') //input is float
{
memory=sizeof(float);
break;
}
} //we know the type of data, skip in the string until first variable
while(runner!=' ') //advance until you see empty space, signaling next variable
{
i++;
runner=str_in[i];
}
while(runner==' ') //advance until you encounter first letter of new variable
{
i++;
runner=str_in[i];
} //runner is now first letter of first variable
while(runner!=';') //run on the string until its over
{
if(runner==',') //if its ',', then spaces will occur, skip all of them to first char that isnt space
{
i++;
runner=str_in[i];
while(runner==' ')
{
i++;
runner=str_in[i];
} //runner now points to first letter of variable
continue;
}
if(runner=='*') //current variable is a pointer
{
counter=counter+4; //pointers are always 4 bytes regardless of type!
i++;
runner=str_in[i];
while((runner!=',')&&(runner!=';')) //while runner is still on this variable
{
printf("%c",runner);
i++;
runner=str_in[i];
}
printf(" requires 4 bytes\n"); //now runner is the first character after the variable we just finished
continue;
}
while((runner!=',')&&(runner!=';')) //now is the case that runner is the first letter of a non pointer variable
{
printf("%c",runner);
i++;
runner=str_in[i];
if((runner==',')||(runner==';')) //we are done
{
printf(" requires %d bytes\n",memory);
counter+=memory;
continue;
}
if(runner=='[') //this variable is an array
{
printf("[");
i++;
runner=str_in[i]; //runner is now MSB of size of array
arr_size=0;
while(runner!=']')
{
printf("%c",runner);
arr_size*=10;
arr_size=arr_size+runner-48; //48 is ascii of 0
i++;
runner=str_in[i];
} //arr_size is now whats written in the [ ]
printf("] requires %d bytes\n",arr_size*memory);
counter+=arr_size*memory;
i++;
runner=str_in[i]; // should be ',' since we just finished a variable
continue;
}
}
}
printf("Overall %d bytes needed to allocate\n",counter);
return (malloc(counter));
}

Initializing an infinite number of char **

I'm making a raytracing engine in C using the minilibX library.
I want to be able to read in a .conf file the configuration for the scene to display:
For example:
(Az#Az 117)cat universe.conf
#randomcomment
obj:eye:x:y:z
light:sun:100
light:moon:test
The number of objects can vary from 1 to infinity.
For now, I'm reading the file, copying each line one by one into a char **tab, and mallocing by the number of objects found, like this:
void open_file(int fd, struct s_img *m)
{
int i;
char *s;
int curs_obj;
int curs_light;
i = 0;
curs_light = 0;
curs_obj = 0;
while (s = get_next_line(fd))
{
i = i + 1;
if (s[0] == 'l')
{
m->lights[curs_light] = s;
curs_light = curs_light + 1;
}
else if (s[0] == 'o')
{
m->objs[curs_obj] = s;
curs_obj = curs_obj + 1;
}
else if (s[0] != '#')
{
show_error(i, s);
stop_parsing(m);
}
}
}
Now I want to store the fields of each tab[i] in a new char **tab, one per object, using ':' as the separator.
So I need to initialize and malloc an undetermined number of char **tab. How can I do that?
(Ps: I hope my code and my english are good enough for you to understand. And I'm using only the very basic function, like read, write, open, malloc... and I'm re-building everything else, like printf, get_line, and so on)
You can't allocate an indeterminate amount of memory; malloc doesn't support it. What you can do is to allocate enough memory for now and revise that later:
size_t buffer = 10; /* capacity, counted in elements */
char **tab = malloc(buffer * sizeof *tab);
//...
if (indexOfObjectToCreate >= buffer) {
buffer *= 2;
tab = realloc(tab, buffer * sizeof *tab);
}
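Since the question restricts itself to basic functions such as malloc, here is a minimal sketch of the same grow-by-doubling idea without realloc: allocate a larger array, copy the pointers over, and free the old one. struct str_list and str_list_push are hypothetical names introduced for illustration.

```c
#include <stdlib.h>

/* A growable array of string pointers. Start with all fields zeroed. */
struct str_list {
    char **items;
    size_t count;
    size_t capacity;
};

/* Appends s to the list, doubling the capacity when it is full.
 * Returns 0 on success, -1 if memory could not be allocated. */
int str_list_push(struct str_list *l, char *s)
{
    if (l->count == l->capacity) {
        size_t new_cap = l->capacity ? l->capacity * 2 : 8;
        char **bigger = malloc(new_cap * sizeof *bigger);
        size_t i;

        if (bigger == NULL)
            return -1;              /* the old array is still valid */
        for (i = 0; i < l->count; i++)
            bigger[i] = l->items[i];
        free(l->items);             /* free(NULL) is a no-op */
        l->items = bigger;
        l->capacity = new_cap;
    }
    l->items[l->count++] = s;
    return 0;
}
```

One such list for lights and one for objects would replace the fixed-size m->lights and m->objs arrays; doubling keeps the total copying work proportional to the number of lines read.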
I'd use an alternative approach (as this is c, not c++) and allocate simply large buffers as we go by:
char *my_malloc(size_t n) {
static size_t space_left = 0;
static char *base = NULL;
if (base==NULL || space_left < n) base=malloc(space_left=BIG_N);
space_left -= n; /* account for the bytes handed out */
base +=n; return base-n;
}
Disclaimer: I've omitted the garbage collection stuff and testing return values and all safety measures to keep the routine short.
Another way to think about this is to read the file into a large enough malloc'ed array (you can check the size with ftell), scan the buffer, replace delimiters, line feeds, etc. with NUL ('\0') characters, and remember the starting locations of the keywords.
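The in-place splitting described above can be sketched for a single line as follows. It assumes the line is writable and NUL-terminated; split_fields is a hypothetical helper that uses no library functions, in keeping with the question's constraints.

```c
#include <stddef.h>

/* Splits line in place at every sep, overwriting each separator with '\0'
 * and storing the start of each field in fields[]. At most max_fields
 * entries are stored; once that limit is reached, remaining separators
 * are left untouched in the last field. Returns the number of fields. */
size_t split_fields(char *line, char sep, char **fields, size_t max_fields)
{
    size_t n = 0;
    char *p = line;

    if (max_fields == 0)
        return 0;
    fields[n++] = p;
    while (*p != '\0') {
        if (*p == sep && n < max_fields) {
            *p = '\0';              /* terminate the previous field */
            fields[n++] = p + 1;    /* the next field starts after sep */
        }
        p++;
    }
    return n;
}
```

For the line "obj:eye:x:y:z" and ':' as the separator this yields five fields, each a NUL-terminated string inside the original buffer, so no per-field allocation is needed.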

C interview question---run-length coding of strings [closed]

A friend of mine was asked the following question at a Yahoo interview:
Given a string of the form "abbccc" print "a1b2c3". Write a function that takes a string and return a string. Take care of all special cases.
How would you experts code it?
Thanks a lot
if (0==strcmp(s, "abbccc"))
return "a1b2c3";
else
tip_the_interviewer(50);
Taken care of.
There's more than one way to do it, but I'd probably run over the input string twice: once to count how many bytes are required for the output, then allocate the output buffer and go again to actually generate the output.
Another possibility is to allocate up front twice the number of bytes in the input string (plus one), and write the output into that. This keeps the code simpler, but is potentially very wasteful of memory. Since the operation looks like a rudimentary compression (RLE), perhaps it's best that the first implementation doesn't have the output occupy double the memory of the input.
Another possibility is to take a single pass, and reallocate the output string as necessary, perhaps increasing the size exponentially to ensure O(N) overall performance. This is quite fiddly in C, so probably not the initial implementation of the function, especially in interview conditions. It's also not necessarily any faster than my first version.
However it's done, the obvious "special case" is an empty input string, because the obvious (to me) implementation will start by storing the first character, then enter a loop. It's also easy to write something where the output may be ambiguous: "1122" is the output for the input "122", but perhaps it is also the output for the input consisting of 122 '1' characters. So you might want to limit run lengths to at most 9 characters (assuming base 10 representation) to prevent ambiguity. It depends what the function is for; conjuring a complete function specification from a single example input and output is not possible.
There's also more than one way to design the interface: the question says "returns a string", so presumably that's a NUL-terminated string in a buffer newly-allocated with malloc. In the long run, though, that's not always a great way to write all your string APIs. In a real project I would prefer to design a function that takes as input the string to process, together with a pointer to an output buffer and the length of that buffer. It returns either the number of bytes written, or if the output buffer isn't big enough it returns the number which would have been written. Implementing the stated function using this new function is easy:
char *stated_function(const char *in) {
size_t sz = new_function(in, NULL, 0);
char *buf = malloc(sz);
if (buf) new_function(in, buf, sz);
return buf;
}
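For illustration, here is a minimal sketch of what new_function could look like. It is called rle_encode here, which is a hypothetical name, and it assumes the returned count includes the terminating NUL, so that the value can be passed straight to malloc as in stated_function. Run lengths are written in full decimal, with no limit of 9.

```c
#include <stdio.h>
#include <stddef.h>

/* Stores c at out[*pos] if there is room, and always counts it. */
static void put(char *out, size_t out_size, size_t *pos, char c)
{
    if (*pos < out_size)
        out[*pos] = c;
    (*pos)++;
}

/* Run-length encodes in ("abbccc" becomes "a1b2c3") into out, writing at
 * most out_size bytes. Returns the number of bytes the complete output
 * needs, including the terminating NUL; out may be NULL if out_size is 0. */
size_t rle_encode(const char *in, char *out, size_t out_size)
{
    size_t pos = 0;

    while (*in != '\0') {
        char digits[24];            /* enough for any unsigned long */
        size_t run = 1;
        int n, i;

        while (in[run] == in[0])    /* stops at the NUL at the latest */
            run++;
        put(out, out_size, &pos, in[0]);
        n = sprintf(digits, "%lu", (unsigned long)run);
        for (i = 0; i < n; i++)
            put(out, out_size, &pos, digits[i]);
        in += run;
    }
    put(out, out_size, &pos, '\0');
    if (out_size > 0 && pos > out_size)
        out[out_size - 1] = '\0';   /* truncated: keep it a valid string */
    return pos;
}
```

Calling it first with NULL and 0 gives the size to allocate, and the second call fills the buffer. The empty input needs no special branch and simply yields an empty string.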
I'm also confused what "print" means in the question - other answerers have taken it to mean "write to stdout", meaning that no allocation is necessary. Does the interviewer want a function that prints the encoded string and returns it? Prints and returns something else? Just returns a string, and is using "print" when they don't really mean it?
Follow this algorithm and implement it:
Run a loop over all the letters in the string.
Store the first character in a temp char variable.
For each change in character, initialize a counter with 1 and print the count of the previous character and then the new letter.
This smells like a homework question, but the code was just too much fun to write.
The key ideas:
A string is a (possibly empty) sequence of nonempty runs of identical characters.
Pointer first always points to the first in a run of identical characters.
After the inner while loop, pointer beyond points one past the end of a run of identical characters.
If the first character of a run is a zero, we've reached the end of the string. The empty string falls out as an instance of the more general problem.
The space required for a decimal numeral is always at most the length of a run, so the result needs at most double the memory. The code works fine with a run length of 53: valgrind reports no memory errors.
Pointer arithmetic is beautiful.
The code:
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *runcode(const char *s) {
char *t = malloc(2 * strlen(s) + 1); // eventual answer
assert(t);
char *w = t; // writes into t;
const char *first, *beyond; // mark limits of a run in s
for (first = s; *first; first = beyond) { // for each run do...
beyond = first+1;
while (*beyond == *first) beyond++; // move to end of run
*w++ = *first; // write char
w += sprintf(w, "%d", (int)(beyond - first)); // and length of run (cast: the difference is a ptrdiff_t, not an int)
}
*w = '\0';
return t;
}
Things I like:
No auxiliary variable for the character whose run we're currently scanning.
No auxiliary variable for the count.
Reasonably sparing use of other local variables.
As others have pointed out, the spec is ambiguous. I think that's fine for an interview question: the point may well be to see what the job applicant does in an ambiguous situation.
Here's my take on the code. I've made some assumptions (since I can't very well ask the interviewer in this case):
This is a simple form of run-length encoding.
Output is of the form {character}{count}.
To avoid ambiguity, the count is 1..9.
Runs of the same character longer than 9 are split into multiple counts.
No dynamic allocation is done. In C, it's usually better to let the caller take care of that. We return true/false to indicate whether there was enough space.
I hope the code is clear enough to stand on its own. I've included a test harness and some test cases.
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
static void append(char **output, size_t *max, int c)
{
if (*max > 0) {
**output = c;
*output += 1;
*max -= 1;
}
}
static void encode(char **output, size_t *max, int c, int count)
{
while (count > 9) {
append(output, max, c);
append(output, max, '0' + 9);
count -= 9;
}
append(output, max, c);
append(output, max, '0' + count);
}
static bool rle(const char *input, char *output, size_t max)
{
char prev;
int count;
prev = '\0';
count = 0;
while (*input != '\0') {
if (*input == prev) {
count++;
} else {
if (count > 0)
encode(&output, &max, prev, count);
prev = *input;
count = 1;
}
++input;
}
if (count > 0)
encode(&output, &max, prev, count);
if (max == 0)
return false;
*output = '\0';
return true;
}
int main(void)
{
struct {
const char *input;
const char *facit;
} tests[] = {
{ "", "" },
{ "a", "a1" },
{ "aa", "a2" },
{ "ab", "a1b1" },
{ "abaabbaaabbb", "a1b1a2b2a3b3" },
{ "abbccc", "a1b2c3" },
{ "1", "11" },
{ "12", "1121" },
{ "1111111111", "1911" },
{ "aaaaaaaaaa", "a9a1" },
};
bool errors;
errors = false;
for (size_t i = 0; i < sizeof(tests) / sizeof(tests[0]); ++i) {
char buf[1024];
bool ok;
ok = rle(tests[i].input, buf, sizeof buf);
if (!ok || strcmp(tests[i].facit, buf) != 0) {
printf("FAIL: i=%zu input=<%s> facit=<%s> buf=<%s>\n",
i, tests[i].input, tests[i].facit, buf);
errors = true;
}
}
if (errors)
return EXIT_FAILURE;
return 0;
}
int priya_homework(char *input_str, char *output_str, int out_len)
{
char pc,c;
int count=0,used=0;
/* Check for NULL and empty inputs here and return*/
*output_str='\0';
pc=*input_str;
do
{
c=*input_str++;
if (c==pc)
{
pc=c;
count++;
}
else
{
used=snprintf(output_str,out_len,"%c%d",pc,count);
if (used>=out_len)
{
/* Output string too short */
return -1;
}
output_str+=used;
out_len-=used;
pc=c;
count=1;
}
} while (c!='\0' && (out_len>0));
return 0;
}
Damn, thought you said C#, not C. Here is my C# implementation for interest's sake.
private string Question(string input)
{
var output = new StringBuilder();
while (!string.IsNullOrEmpty(input))
{
var first = input[0];
var count = 1;
while (count < input.Length && input[count] == first)
{
count++;
}
// count never exceeds input.Length, so Substring is always safe here;
// it yields "" once the whole input is consumed, which ends the loop.
input = input.Substring(count);
output.AppendFormat("{0}{1}", first, count);
}
return output.ToString();
}
Something like this:
void so(char s[])
{
int i,count;
char cur,prev;
i = count = prev = 0;
while(cur=s[i++])
{
if(!prev)
{
prev = cur;
count++;
}
else
{
if(cur != prev)
{
printf("%c%d",prev,count);
prev = cur;
count = 1;
}
else
count++;
}
}
if(count)
printf("%c%d",prev,count);
printf("\n");
}
