C language - turning input into code - c

Most of the time, my questions concern a specific piece of code that I wrote incorrectly or a bug that I overlooked, but this time I don't know where to start. I don't even know if what I am trying to do is possible.
I was given an assignment to write a program that takes a string resembling a variable declaration; for example, int x,y; is a valid input. char c,*cptr,carray[80]; is another example of valid input.
The program will create what the user inputs and print how much memory it took.
For instance, in the first example (int x,y;) the code will create 2 integers, and print "x requires 4 bytes, y requires 4 bytes".
In the second example, the code will create a character, a pointer to a character, and a string with 80 characters, and will print "c requires 1 byte, cptr requires 4 bytes, carray requires 80 bytes"
Is this even possible? It is not valid code to declare variables after the beginning of the code. They must be declared before anything else in C. So I don't see a way to do this...

This is a parsing problem -- you need to parse the input string and figure out what it means. You don't need to actually "create" anything, you just need to figure out the sizes of the variables that the compiler would create for that code.
Parsing is actually a very large subject, with lots of books written about it and tools written to make it easier. While you could use a tool like antlr or bison to complete this task, they're probably overkill -- a simple hand-written recursive descent parser is probably the best approach.
Something like:
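#include <ctype.h>
#include <stdio.h>
#include <string.h>
/* forward declarations, since parse_declaration calls these before they are defined */
const char *parse_declspecs(const char *p, int *size);
const char *parse_declarator(const char *p, int typesize, const char **name, int *namelen, int *declsize);
const char *parse_declaration(const char *p) {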
/* parse a declaration, printing out the names and sizes of the variables
* 'p' points at the beginning of the string containing the declaration, and the
* function returns the pointer immediately after the end or NULL on failure */
int size;
if (!(p = parse_declspecs(p, &size))) return 0;
do {
const char *name;
int namelen, declsize;
if (!(p = parse_declarator(p, size, &name, &namelen, &declsize))) return 0;
printf("%.*s requires %d bytes\n", namelen, name, declsize);
p += strspn(p, " \t\r\n"); /* skip whitespace */
} while (*p++ == ',');
if (p[-1] != ';') return 0;
return p;
}
const char *parse_declspecs(const char *p, int *size) {
/* parse declaration specifiers (a type), and output the size of that type
* p points at the string to be parsed, and we return the point after the declspec */
p += strspn(p, " \t\r\n");
if (!isalpha((unsigned char)*p)) return 0;
int len = 0;
while (isalnum((unsigned char)p[len])) len++;
/* check len as well, so that a prefix such as "c" does not match "char" */
if (len == 4 && !strncmp(p, "char", len)) {
*size = sizeof(char);
return p+len; }
if (len == 3 && !strncmp(p, "int", len)) {
*size = sizeof(int);
return p+len; }
... more type tests here ...
if (len == 8 && !strncmp(p, "unsigned", len)) {
p += len;
p += strspn(p, " \t\r\n");
if (!isalpha((unsigned char)*p)) {
*size = sizeof(unsigned);
return p; }
len = 0; /* restart the length count for the word after "unsigned" */
while (isalnum((unsigned char)p[len])) len++;
if (len == 3 && !strncmp(p, "int", len)) {
*size = sizeof(unsigned int);
return p+len; }
... more type tests here ...
}
return 0;
}
const char *parse_declarator(const char *p, int typesize, const char **name, int *namelen, int *declsize) {
/* parse a declarator, reporting its name (pointer and length) and its size in bytes */
p += strspn(p, " \t\r\n");
while (*p == '*') {
typesize = sizeof(void *); /* assuming all pointers are the same size...*/
p++;
p += strspn(p, " \t\r\n"); }
*declsize = typesize; /* store through the pointer, not into the pointer itself */
if (isalpha((unsigned char)*p)) {
*name = p;
while (isalnum((unsigned char)*p) || *p == '_') p++;
*namelen = p - *name;
} else if (*p == '(') {
if (!(p = parse_declarator(p+1, typesize, name, namelen, declsize))) return 0;
p += strspn(p, " \t\r\n");
if (*p++ != ')') return 0;
} else
return 0;
p += strspn(p, " \t\r\n");
while (*p == '[') {
int arraysize, len;
if (sscanf(++p, "%d %n", &arraysize, &len) < 1) return 0;
p += len;
*declsize *= arraysize; /* each array dimension multiplies the size */
if (*p++ != ']') return 0;
p += strspn(p, " \t\r\n"); }
return p;
}
should get you started...

If you are trying to execute the input as code dynamically, to my knowledge that is not possible without saving the code and compiling it again, which seems like a nasty and lengthy approach. If all you need is to calculate the size of the declarations in the input, I would pass the received string to a function that analyzes it. If the string starts with "int", "char", etc., I know what kind of declaration I am dealing with. I could then count the number of variables declared (x and y in your example) and keep a counter. Finally, I would loop over that count, using sizeof on the declared type to report the size of each variable.

Sure, it's possible; it's just a bit of work. You're going to have to study C declaration syntax, and then write the code to recognize it (basically a small compiler front end).
For example, in the declaration
char c, *cptr, carray[80];
you have a sequence of tokens:
char c , * cptr , carray [ 80 ] ;
which will be recognized as a type specifier (char) followed by three declarators: a direct declarator, a pointer declarator, and an array declarator.
You can create the space for the objects dynamically using malloc or calloc. Then you'll need to create some kind of table to map the identifier (the variable name) to the dynamically-created object. You won't be able to treat these things as regular variables in regular C code; you're going to be doing a lot of table lookups and dereferencing.
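The table idea can be sketched roughly as follows. This is only an illustration: the names var_entry, var_table, var_add, var_find and var_free_all, the 31-character name limit and the fixed capacity of 16 entries are all assumptions made for the example, not part of the assignment.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_VARS 16                 /* arbitrary capacity chosen for this sketch */

/* Hypothetical table entry: maps a declared name to heap storage. */
struct var_entry {
    char name[32];                  /* identifier, truncated to 31 characters */
    size_t size;                    /* bytes reserved for the object */
    void *storage;                  /* zero-initialised block from calloc */
};

struct var_table {
    struct var_entry vars[MAX_VARS];
    size_t count;
};

/* Reserve `size` bytes for `name`; returns the new entry or NULL on failure. */
struct var_entry *var_add(struct var_table *t, const char *name, size_t size)
{
    if (t->count == MAX_VARS || size == 0)
        return NULL;
    struct var_entry *e = &t->vars[t->count];
    e->storage = calloc(1, size);
    if (e->storage == NULL)
        return NULL;
    strncpy(e->name, name, sizeof e->name - 1);
    e->name[sizeof e->name - 1] = '\0';
    e->size = size;
    t->count++;
    return e;
}

/* Linear lookup by name; returns NULL if the name was never declared. */
struct var_entry *var_find(struct var_table *t, const char *name)
{
    for (size_t i = 0; i < t->count; i++)
        if (strcmp(t->vars[i].name, name) == 0)
            return &t->vars[i];
    return NULL;
}

/* Release every object created through var_add. */
void var_free_all(struct var_table *t)
{
    for (size_t i = 0; i < t->count; i++)
        free(t->vars[i].storage);
    t->count = 0;
}
```

Every access to a "variable" then goes through var_find and a cast of the storage pointer, which is the table lookup and dereferencing mentioned above.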

Sure, you could do this with a simple parser. Assuming that you do not want to actually execute the code you are given, you could read the string, count how many variables of each type are declared, and calculate the amount of memory from that. But, depending on your professor's requirements, you may run into a few different issues.
In particular, the sizes of most types differ between platforms; only char is guaranteed to be 1 byte, so you need to account for this. This is easy if you are analyzing the memory requirements for the computer your program is running on, since you can just have const variables initialized via sizeof. If not, the problem is harder, because you cannot presume to know the size of any type.
Secondly, structs will be a problem due to some of the more interesting rules of C, for example padding between members. Do you need to account for them?
So, this is entirely possible, because contrary to what you stated in your question, your code doesn't have to "create" a variable at all - it can just create an in-memory total for each type and print them out when done.
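As a rough illustration of getting the sizes via sizeof on the machine the program runs on, a lookup table might look like the sketch below. The names type_size, type_sizes and size_of_type are made up for this example, and the list of types is deliberately incomplete.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical lookup table: type keyword -> size on the machine running the program. */
struct type_size {
    const char *name;
    size_t size;
};

static const struct type_size type_sizes[] = {
    { "char",   sizeof(char)   },
    { "short",  sizeof(short)  },
    { "int",    sizeof(int)    },
    { "long",   sizeof(long)   },
    { "float",  sizeof(float)  },
    { "double", sizeof(double) },
};

/* Returns the size of the named type, or 0 if the name is not in the table. */
size_t size_of_type(const char *name)
{
    for (size_t i = 0; i < sizeof type_sizes / sizeof type_sizes[0]; i++)
        if (strcmp(type_sizes[i].name, name) == 0)
            return type_sizes[i].size;
    return 0;
}
```

Because the sizes are filled in by the compiler, the same source reports the right numbers on whatever platform it is built for, but it cannot describe a different target machine.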

Figured I would post my solution just in case anyone is interested
void* q5(char* str_in)
{
char runner;
int i=0,memory,counter=0,arr_size;
runner=str_in[i];
while(1)
{
if(runner=='i') //the input is integer
{
memory=sizeof(int);
break;
}
if(runner=='c') //input is char
{
memory=sizeof(char);
break;
}
if(runner=='d') //input is double
{
memory=sizeof(double);
break;
}
if(runner=='s') //input is short
{
memory=sizeof(short);
break;
}
if(runner=='l') //input is long
{
memory=sizeof(long);
break;
}
if(runner=='f') //input is float
{
memory=sizeof(float);
break;
}
return NULL; //unknown type: give up instead of looping forever
} //we know the type of data, skip in the string until first variable
while(runner!=' ') //advance until you see empty space, signaling next variable
{
i++;
runner=str_in[i];
}
while(runner==' ') //advance until you encounter first letter of new variable
{
i++;
runner=str_in[i];
} //runner is now first letter of first variable
while(runner!=';') //run on the string until its over
{
if(runner==',') //if its ',', then spaces will occur, skip all of them to first char that isnt space
{
i++;
runner=str_in[i];
while(runner==' ')
{
i++;
runner=str_in[i];
} //runner now points to first letter of variable
continue;
}
if(runner=='*') //current variable is a pointer
{
counter+=sizeof(void*); //pointer size depends on the platform (commonly 4 or 8 bytes); this assumes all data pointers share one size
i++;
runner=str_in[i];
while((runner!=',')&&(runner!=';')) //while runner is still on this variable
{
printf("%c",runner);
i++;
runner=str_in[i];
}
printf(" requires %d bytes\n",(int)sizeof(void*)); //now runner is the first character after the variable we just finished
continue;
}
while((runner!=',')&&(runner!=';')) //now is the case that runner is the first letter of a non pointer variable
{
printf("%c",runner);
i++;
runner=str_in[i];
if((runner==',')||(runner==';')) //we are done
{
printf(" requires %d bytes\n",memory);
counter+=memory;
continue;
}
if(runner=='[') //this variable is an array
{
printf("[");
i++;
runner=str_in[i]; //runner is now MSB of size of array
arr_size=0;
while(runner!=']')
{
printf("%c",runner);
arr_size*=10;
arr_size=arr_size+runner-48; //48 is ascii of 0
i++;
runner=str_in[i];
} //arr_size is now whats written in the [ ]
printf("] requires %d bytes\n",arr_size*memory);
counter+=arr_size*memory;
i++;
runner=str_in[i]; // should be ',' since we just finished a variable
continue;
}
}
}
printf("Overall %d bytes needed to allocate\n",counter);
return (malloc(counter));
}

Related

What is the purpose of this clear_mem function?

I've been trying to work out exactly what the purpose of a function I've come across is.
The code intentionally has bad code practices, so I am trying to figure out if this is one of them.
Here is the function:
void clear_mem(char *memblock, int siz) {
register int i;
for (i=0; i<=siz;i++)
*(memblock+i) = 0;
}
The function is called within the following function:
char *get_argument(char line[], int argno){
char *argument = malloc(512);
char clone[512];
strncpy(clone, line, strlen(line)+1);
int current_arg = 0;
char *splitted = strtok(clone, " ");
while (splitted != NULL){
if (splitted[0] != ':'){
current_arg++;
}
if (current_arg == argno+1){
clear_mem(argument, 512); //Here
strncpy(argument, splitted, strlen(splitted)+1);
return argument;
free(argument);
}
splitted = strtok(NULL, " ");
}
if (current_arg != argno){
argument[0] = '\0';
}
free(argument);
return argument;
}
Thanks in advance!
In this code:
for (i=0; i<=siz;i++)
*(memblock+i) = 0;
memblock+i adds the integer i to the pointer memblock. The result points i elements beyond where memblock points. Since memblock is a pointer to char, the result points i characters beyond where memblock points.
Then *(memblock+i) refers to the character at that address. *(memblock+i) is equivalent to memblock[i]. *(memblock+i) = 0 sets the character to zero.
So the effect of this code is to set all characters indexed by i during the loop to zero. It clears a block of memory.
The for (i=0; i<=siz;i++) causes the loop to iterate with i taking all values from zero up to and including siz. Thus, siz+1 characters will be set to zero.
We can see this is an error because get_argument allocates 512 bytes for argument and then later calls clear_mem(argument, 512), which clears 513 bytes and so writes one byte past the end of the allocation. The resulting behavior is not defined by the C standard.
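For comparison, a version that clears exactly siz bytes could look like this. The name clear_mem_fixed is made up here to keep it apart from the original function; the standard memset(argument, 0, 512) would do the same job.

```c
#include <stddef.h>

/* Zero exactly `siz` bytes starting at memblock.
 * Note the `<` in the loop condition: indices 0 .. siz-1 are written. */
void clear_mem_fixed(char *memblock, size_t siz)
{
    for (size_t i = 0; i < siz; i++)
        memblock[i] = 0;
}
```

With this condition, a call on a 512-byte allocation with siz equal to 512 stays inside the block.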

Inserting and deleting spaces in a text array using C

I've managed to create the following output from a program:
7.86 ( 8Hm,), 6.82 ( 4Hm,),12.1 ( 1Hs,).
I want to either add a space or delete a space depending on what character I am pointing to. Would sscanf allow you to do this?
I was just wondering what would be the easiest way to insert and delete spaces in this text. For example, I want to delete a space between ( and 'num' eg 8, but provide a space between H and 'letter' eg m. So far I have code that looks something like this:
typedef struct node {
int i;
char array[SIZE];
struct node* link;
} node;
node* format(void)
{
int count = 0;
int size = list_size(data, sizeof(data)) / 4;
for (node* ptr = head; ptr != NULL; ptr = ptr->link)
{
int i;
int j;
for (i = 0, j = 0; i < sizeof(ptr->array); i++, j++)
{
if (ptr->array[0] == ' ' && count == 0)
{
count++;
j--;
}
else if (ptr->array[i] == ' ' && ptr->array[i-1] == '.')
{
j--;
ptr->array[i] = ',';
}
else if (ptr->array[i] == ',' && ptr->array[i-1] == ')')
{
if (count == size)
{
ptr->array[i] = '.';
}
}
ptr->array[j] = ptr->array[i];
}
count++;
}
return head;
}
Note that the above is currently a linked list of size 3, whereby I created the latter to enable swapping of "groups" of characters and then reverse the order in a specific way. Would it be easier to format if I passed everything back to a single array first?
Any tips/advice would be gratefully received!
The easiest way is to use sscanf to parse the string correctly and then print the values to another string. Manually parsing the string the way you are doing it gives you no observable speed or memory gain and just adds a ton of code.
Example
#include<stdio.h>
int main(){
char str[]= "7.86 ( 8Hm,), 6.82 ( 4Hm,),12.1 ( 1Hs,).";
float f1, f2, f3;
int i1, i2, i3;
sscanf(str, "%f ( %dHm,), %f ( %dHm,),%f ( %dHs,).", &f1, &i1, &f2, &i2, &f3, &i3);
printf("%f ( %dHm,), %f ( %dHm,),%f ( %dHs,).\n", f1, i1, f2, i2, f3, i3);
return 0;
}
The scanf functions are powerful and, as you can see, you can use them to parse your formatted string. When you run the code in this example, it prints this:
7.860000 ( 8Hm,), 6.820000 ( 4Hm,),12.100000 ( 1Hs,).
The way it works is that scanf can match and discard literal characters in the format string. So it reads the first float, then skips blank spaces, matches a parenthesis, reads an int, matches Hm, and so on.
My printf formats the string the same way it came from the input. You can tweak it to meet your requirements; for instance, a precision such as %.2f limits the decimal digits. The basic idea is that you extract the data you need from the string and then print that data the way you want.
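As one possible tweak, the sketch below parses one group at a time and rewrites it without the space after '(' and with a space between H and the unit letter. The helper name reformat_groups, the output layout and the two-decimal precision are assumptions for illustration; %n is used to find out how many characters each group consumed.

```c
#include <stdio.h>

/* Hypothetical helper: rewrite groups of the form "7.86 ( 8Hm,)" as
 * "7.86 (8H m,)", separated by ", ". Returns the number of groups
 * written, or -1 if `out` is too small. */
int reformat_groups(const char *in, char *out, size_t outsz)
{
    size_t used = 0;
    int groups = 0;

    if (outsz == 0)
        return -1;
    out[0] = '\0';

    for (;;) {
        float value;
        int count, n = 0;
        char unit;

        /* %n stores how many input characters were consumed so far;
         * n stays 0 if the trailing ",)" did not match. */
        if (sscanf(in, "%f ( %dH%c,)%n", &value, &count, &unit, &n) != 3 || n == 0)
            break;

        int w = snprintf(out + used, outsz - used, "%s%.2f (%dH %c,)",
                         groups ? ", " : "", value, count, unit);
        if (w < 0 || (size_t)w >= outsz - used)
            return -1;              /* output would have been truncated */
        used += (size_t)w;
        groups++;

        in += n;
        if (*in == ',' || *in == '.')   /* skip the separator after a group */
            in++;
    }
    return groups;
}
```

Called on the sample line, this produces 7.86 (8H m,), 6.82 (4H m,), 12.10 (1H s,) and returns 3.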
As best I understand what you currently have in ptr->array, you want to reformat it in place, updating ptr->array to hold the newly formatted string. To do that, you can build a correctly formatted buffer and use it to replace the current contents of ptr->array.
Note: a better solution would be to separate the contents of ptr->array into actual values and hold those in your linked list, either through a pointer to structs or through a statically declared array of structs holding your values and units (which is my best guess at what 7.86 (8Hm,) represents). That would allow you to format the values in any way needed rather than being stuck with what is in array. For example:
#define USIZE 8
typedef struct values {
double value;
char units[USIZE]; /* the member name was missing; 'units' is a guess */
} values;
typedef struct node {
int i;
char array[SIZE];
values *vals;
struct node* link;
} node;
or, e.g.:
typedef struct node {
int i;
char array[SIZE];
values vals[3];
struct node* link;
} node;
Regardless, your original question was how to reformat your existing ptr->array removing spaces following '(' and separating '8Hm' to '8H m' in your sample data of:
7.86 ( 8Hm,), 6.82 ( 4Hm,),12.1 ( 1Hs,).
Given that you are iterating over each node in your list, this reformat would presumably only be run once, after you have collected and stored all your nodes. One approach is to construct a new buffer holding the contents of array in the new format and then replace the old array contents with the properly formatted buffer.
Yes, there are many ways to do this, but if your reformat criteria is limited, simply walking down array with a pointer using the tried and true inch-worm method filling your new buffer with the characters in the desired format is about as easy and flexible as anything else.
Below, the example creates a new buffer (buf[SIZE]) copies/replaces/removes characters from ptr->array and then replaces ptr->array with the contents of buf.
NOTE: I have not run this on any data, so you may need to tweak the logic a bit. This is merely meant to show you an approach that can be used, and to make it as close as possible to a solution to your question (to the best I understood it).
NOTE2: you can replace the upper/lower case conditionals below with isupper or islower from ctype.h if you are not limited to specific headers. Additionally, memcpy below will require string.h, but can be re-written to simply do the copy without it if that is an issue.
node* format(void)
{
/* for each node in list */
for (node* ptr = head; ptr != NULL; ptr = ptr->link)
{
/* declare a pointer to `ptr->array` and `buf` */
char *p = ptr->array;
char *pend = p + SIZE;
char buf[SIZE] = {0};
char *bp = buf;
/* for each char in ptr->array (assumed to be nul-terminated), leaving
 * room in buf for an inserted space and the final '\0' */
while (p < pend && *p && bp < buf + SIZE - 2)
{
/* if current char is ' ' and it is not the first char */
if (p > ptr->array && *p == ' ')
{
/* if previous char is one of the following */
if (*(p - 1) == '.') *bp++ = ',';
if (*(p - 1) == ')') *bp++ = '.';
if (*(p - 1) == '(') { p++; continue; } /* remove ' ' after '(' */
}
/* if current is lower case & prior is upper, separate */
else if (p > ptr->array && ('A' <= *(p - 1) && *(p - 1) <= 'Z') && ('a' <= *p && *p <= 'z'))
{
*bp++ = ' ';
*bp++ = *p;
}
/* default - copy current to buf (this now includes the first char) */
else
*bp++ = *p;
p++; /* advance to next char */
}
/* copy formatted string to ptr->array */
memcpy (ptr->array, buf, SIZE);
}
return head;
}
Note3: if SIZE is a global #define, then replace sizeof (ptr->array) with SIZE.

Overwriting memory? Using malloc in C when calling structure

I'm having a problem allocating memory for the following structure:
typedef struct level {
char* raw_map; // original string representing the level map
char* name; // level name
char* description; // level description
char* password; // level password
struct level *next; // pointer to the next level
} LEVEL;
My function looks like this:
LEVEL* parse_level(char* line){
LEVEL *level =(LEVEL*)malloc(sizeof(LEVEL));
int i=0;
level->name = (char*)malloc(sizeof(level->name));
while(line[i] != ';'){
level->name[i]= line[i];
i++;
}
level->password =(char*)malloc(sizeof(level->password));
i++;
while(line[i] != ';'){
level->password[i]= line[i];
i++;
}
return level;
So far, I have only set the first two members of the structure.
I call the function like this:
int main(){
LEVEL *first = (LEVEL*)malloc(sizeof(LEVEL));
first = parse_level("nameoffirstlevelwillbehere;mandragoramo5000");
printf("name: %s pass: %s\n",first->name, first->password);
}
I'm not sure whether the issue is in how I malloc the structure, because this is my first time using pointers in functions and I'm not sure I've placed them correctly. With the code above, first->name ends up with only 12 characters instead of the full string up to the first ';'. The same happens with level->password. Thanks for your help.
level->name
is a pointer. So
sizeof(level->name)
returns the size of a pointer on your platform. That's known at compile time. In fact that's something to bear in mind. The sizeof operator is evaluated at compile time. So a priori it cannot be used to size the allocation of dynamically sized memory.
Instead you need to allocate enough memory to hold the length of string you require, including the null terminator.
You'll need to walk over line first to find the length of the string. Just as you presently do, looking for semi-colon separators. Then you can allocate the string's memory and finally copy it.
Do beware that your current code assumes that the string in line is well formed. If it does not have sufficient number of semi-colons you will read off the end of the buffer. It would be prudent to check for that eventuality.
The malloc function returns void* which is assignable to any pointer type. So casting is not necessary and it is generally best that you avoid casting.
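The measure, allocate, copy sequence described above might be sketched like this. The helper name copy_field is invented for the example; it treats the end of the string as the end of the last field, so a missing ';' does not cause a read past the buffer.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: duplicate the text from *cursor up to the next ';'
 * (or the end of the string) into a freshly malloc'd, nul-terminated buffer.
 * On success *cursor is advanced past the field and its ';'.
 * Returns NULL if the allocation fails. */
char *copy_field(const char **cursor)
{
    const char *start = *cursor;
    size_t len = strcspn(start, ";");   /* length up to ';' or '\0' */
    char *field = malloc(len + 1);      /* +1 for the null terminator */

    if (field == NULL)
        return NULL;
    memcpy(field, start, len);
    field[len] = '\0';

    /* step over the ';' only if one was actually found */
    *cursor = start + len + (start[len] == ';');
    return field;
}
```

parse_level could then call it once per member, for example level->name = copy_field(&p); followed by level->password = copy_field(&p);, where p starts out pointing at line.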
This is your problem
level->name = (char*)malloc(sizeof(level->name));
Here level->name is a char pointer, so sizeof returns just the size of a pointer (typically 4 or 8 bytes, depending on the platform). You need to allocate enough memory for the string itself, for example:
level->name = malloc(sizeof(char)*100);
Do the same for level->password.
And remember that casting the result of malloc is unnecessary in C and best avoided.
Shouldn't level->name = (char*)malloc(sizeof(level->name)); be
level->name = (char*)malloc(name_length+1);
where name_length is the number of characters in line before the first ';'? We need to calculate the length of name and password before allocating memory for them, or you may choose a default MAX_SIZE instead.
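LEVEL* parse_level(char* line){
LEVEL *level = malloc(sizeof(LEVEL));
if (level == NULL) return NULL;
int i=0, name_length=0, password_length=0, start=0;
//calculate name length first, if MAX_SIZE is known then it can be dropped.
//also stop at the end of the string in case a ';' is missing
while(line[i] != ';' && line[i] != '\0') {
i++;
}
name_length = i;
level->name = malloc(name_length+1);
if (level->name == NULL) {
free(level);
return NULL;
}
for (i = 0; i < name_length; i++) {
level->name[i]= line[i];
}
level->name[name_length] = '\0';
// Calculating password length; the password starts after the ';' if there is one
start = name_length + (line[name_length] == ';');
i = start;
while(line[i] != ';' && line[i] != '\0'){
i++;
}
password_length = i - start;
level->password = malloc(password_length+1);
if (level->password == NULL) {
free(level->name);
free(level);
return NULL;
}
// index the password from 0, not from its position in line
for (i = 0; i < password_length; i++) {
level->password[i] = line[start + i];
}
level->password[password_length] = '\0';
// the remaining members are not parsed yet; make them safe to test and free
level->raw_map = NULL;
level->description = NULL;
level->next = NULL;
return level;
}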

Appending a char to a char* in C?

I'm trying to make a quick function that gets a word/argument in a string by its number:
char* arg(char* S, int Num) {
char* Return = "";
int Spaces = 0;
int i = 0;
for (i; i<strlen(S); i++) {
if (S[i] == ' ') {
Spaces++;
}
else if (Spaces == Num) {
//Want to append S[i] to Return here.
}
else if (Spaces > Num) {
return Return;
}
}
printf("%s-\n", Return);
return Return;
}
I can't find a way to put the characters into Return. I have found lots of posts that suggest strcat() or tricks with pointers, but every one segfaults. I've also seen people saying that malloc() should be used, but I'm not sure how I'd use it in a loop like this.
I will not claim to understand what it is that you're trying to do, but your code has two problems:
You're assigning a string literal to Return; that string typically lives in a read-only part of your binary and must not be modified. Trying to modify it is undefined behavior, and in practice you will usually get a segfault.
Your for loop is O(n^2), because strlen() is O(n) and is called on every iteration.
There are several different ways of solving the "how to return a string" problem. You can, for example:
Use malloc() / calloc() to allocate a new string, as has been suggested
Use asprintf() (a GNU/BSD extension, not standard C), which is similar but gives you formatting if you need it
Pass an output string (and its maximum size) as a parameter to the function
The first two require the calling function to free() the returned value. The third allows the caller to decide how to allocate the string (stack or heap), but requires some sort of contract about the minimum size needed for the output string.
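In your code, Return points to the string literal "", which has no room for additional characters and must not be modified, so appending to it is undefined behavior. It might appear to work, but you should never rely on it. (Returning the pointer itself is fine, since a string literal outlives the function; returning the address of a local array would not be.)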
Typically in C, you'd want to pass the "return" string as an argument instead, so that you don't have to free it all the time. Both require a local variable on the caller's side, but malloc'ing it will require an additional call to free the allocated memory and is also more expensive than simply passing a pointer to a local variable.
As for appending to the string, just use array notation (keep track of the current char/index) and don't forget to add a null character at the end.
Example:
int arg(char* ptr, char* S, int Num) {
int i, Spaces = 0, cur = 0;
for (i=0; i<strlen(S); i++) {
if (S[i] == ' ') {
Spaces++;
}
else if (Spaces == Num) {
ptr[cur++] = S[i]; // append char
}
else if (Spaces > Num) {
ptr[cur] = '\0'; // insert null char
return 0; // returns 0 on success
}
}
ptr[cur] = '\0'; // insert null char
return (cur > 0 ? 0 : -1); // returns 0 on success, -1 on error
}
Then invoke it like so:
char myArg[50];
if (arg(myArg, "this is an example", 3) == 0) {
printf("arg is %s\n", myArg);
} else {
// arg not found
}
Just make sure you don't overflow ptr (e.g.: by passing its size and adding a check in the function).
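A variant with that size check might look like the sketch below. The name arg_bounded and the parameter order are assumptions for this example; the word index is 0-based, as in the code above.

```c
#include <stddef.h>

/* Copy word number `num` (0-based, separated by single spaces) of `s`
 * into `out`, which has room for `outsz` bytes including the '\0'.
 * Returns 0 on success, -1 if the word is missing or does not fit. */
int arg_bounded(char *out, size_t outsz, const char *s, int num)
{
    int spaces = 0;
    size_t cur = 0;

    if (outsz == 0)
        return -1;

    /* stop at the end of the string or once we are past the wanted word */
    for (; *s != '\0' && spaces <= num; s++) {
        if (*s == ' ') {
            spaces++;
        } else if (spaces == num) {
            if (cur + 1 >= outsz) {     /* no room for this char plus '\0' */
                out[0] = '\0';
                return -1;
            }
            out[cur++] = *s;            /* append char */
        }
    }
    out[cur] = '\0';
    return cur > 0 ? 0 : -1;
}
```

It is called like the version above, with the buffer size added: arg_bounded(myArg, sizeof myArg, "this is an example", 3). Walking the string with a pointer also avoids calling strlen() on every iteration.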
There are numbers of ways you could improve your code, but let's just start by making it meet the standard. ;-)
P.S.: Don't malloc unless you need to. And in that case you don't.
char * Return; //by the way horrible name for a variable.
Return = malloc(<some size>);
......
......
*(Return + index) = *(S+i);
You can't assign anything to a string literal such as "".
You may want to use your loop to determine the offset of the start of the word you're looking for in your string. Then find its length by continuing through the string until you encounter the end or another space. Then you can malloc an array of chars whose size is that length plus 1 (for the null terminator). Finally, copy the substring into this new buffer and return it.
Also, as mentioned above, you may want to remove the strlen call from the loop condition. Some compilers can hoist it out when they can prove the string is not modified, but otherwise it is a linear operation repeated for every character in the array, making the loop O(n**2).
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *arg(const char *S, unsigned int Num) {
char *Return = "";
const char *top, *p;
unsigned int Spaces = 0;
int i = 0;
Return=(char*)malloc(sizeof(char));
*Return = '\0';
if(S == NULL || *S=='\0') return Return;
p=top=S;
while(Spaces != Num){
if(NULL!=(p=strchr(top, ' '))){
++Spaces;
top=++p;
} else {
break;
}
}
if(Spaces < Num) return Return;
if(NULL!=(p=strchr(top, ' '))){
int len = p - top;
Return=(char*)realloc(Return, sizeof(char)*(len+1));
strncpy(Return, top, len);
Return[len]='\0';
} else {
free(Return);
Return=strdup(top);
}
//printf("%s-\n", Return);
return Return;
}
int main(){
char *word;
word=arg("make a quick function", 2);//quick
printf("\"%s\"\n", word);
free(word);
return 0;
}

C interview question---run-length coding of strings [closed]

Closed 11 years ago.
A friend of mine was asked the following question at a Yahoo interview:
Given a string of the form "abbccc" print "a1b2c3". Write a function that takes a string and return a string. Take care of all special cases.
How would you experts code it?
Thanks a lot
if (0==strcmp(s, "abbccc"))
return "a1b2c3";
else
tip_the_interviewer(50);
Taken care of.
There's more than one way to do it, but I'd probably run over the input string twice: once to count how many bytes are required for the output, then allocate the output buffer and go again to actually generate the output.
Another possibility is to allocate up front twice the number of bytes in the input string (plus one), and write the output into that. This keeps the code simpler, but is potentially very wasteful of memory. Since the operation looks like a rudimentary compression (RLE), perhaps it's best that the first implementation doesn't have the output occupy double the memory of the input.
Another possibility is to take a single pass, and reallocate the output string as necessary, perhaps increasing the size exponentially to ensure O(N) overall performance. This is quite fiddly in C, so probably not the initial implementation of the function, especially in interview conditions. It's also not necessarily any faster than my first version.
However it's done, the obvious "special case" is an empty input string, because the obvious (to me) implementation will start by storing the first character, then enter a loop. It's also easy to write something where the output may be ambiguous: "1122" is the output for the input "122", but perhaps it is also the output for the input consisting of 122 1 characters. So you might want to limit run lengths to at most 9 characters (assuming base 10 representation) to prevent ambiguity. It depends what the function is for - conjuring a complete function specification from a single example input and output is not possible.
There's also more than one way to design the interface: the question says "returns a string", so presumably that's a NUL-terminated string in a buffer newly-allocated with malloc. In the long run, though, that's not always a great way to write all your string APIs. In a real project I would prefer to design a function that takes as input the string to process, together with a pointer to an output buffer and the length of that buffer. It returns either the number of bytes written, or if the output buffer isn't big enough it returns the number which would have been written. Implementing the stated function using this new function is easy:
char *stated_function(const char *in) {
size_t sz = new_function(in, NULL, 0);
char *buf = malloc(sz);
if (buf) new_function(in, buf, sz);
return buf;
}
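For illustration, the buffer-based function could be sketched as below. The name rle_encode stands in for new_function, and one detail is an assumption: the returned count includes the terminating NUL, which is what the malloc(sz) call in the wrapper above needs in order to be large enough.

```c
#include <stdio.h>
#include <string.h>

/* Sketch of the buffer-based interface. Returns the number of bytes the
 * encoded string needs, including the terminating NUL. The output is
 * written only when `out` is non-NULL and `outsz` is large enough. */
size_t rle_encode(const char *in, char *out, size_t outsz)
{
    size_t needed = 1;                  /* room for the terminating NUL */
    const char *p;

    /* First pass: measure. snprintf(NULL, 0, ...) reports the length
     * the formatted run count would take without writing anything. */
    for (p = in; *p != '\0'; ) {
        size_t run = 1;
        while (p[run] == p[0])
            run++;
        needed += 1 + (size_t)snprintf(NULL, 0, "%zu", run);
        p += run;
    }
    if (out == NULL || outsz < needed)
        return needed;

    /* Second pass: emit each character followed by its run length. */
    for (p = in; *p != '\0'; ) {
        size_t run = 1;
        while (p[run] == p[0])
            run++;
        *out++ = *p;
        out += sprintf(out, "%zu", run);
        p += run;
    }
    *out = '\0';
    return needed;
}
```

This version does not cap run lengths at 9, so it keeps the ambiguity discussed earlier for inputs that contain digits.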
I'm also confused what "print" means in the question - other answerers have taken it to mean "write to stdout", meaning that no allocation is necessary. Does the interviewer want a function that prints the encoded string and returns it? Prints and returns something else? Just returns a string, and is using "print" when they don't really mean it?
Follow this algorithm and implement it:
Run a loop over all the letters in the string.
Store the first character in a temp char variable.
For each change in character, initialize a counter with 1 and print the count of the previous character and then the new letter.
This smells like a homework question, but the code was just too much fun to write.
The key ideas:
A string is a (possibly empty) sequence of nonempty runs of identical characters.
Pointer first always points to the first in a run of identical characters.
After the inner while loop, pointer beyond points one past the end of a run of identical characters.
If the first character of a run is a zero, we've reached the end of the string. The empty string falls out as an instance of the more general problem.
The space required for a decimal numeral is always at most the length of a run, so the result needs at most double the memory. The code works fine with a run length of 53: valgrind reports no memory errors.
Pointer arithmetic is beautiful.
The code:
char *runcode(const char *s) {
char *t = malloc(2 * strlen(s) + 1); // eventual answer
assert(t);
char *w = t; // writes into t;
const char *first, *beyond; // mark limits of a run in s
for (first = s; *first; first = beyond) { // for each run do...
beyond = first+1;
while (*beyond == *first) beyond++; // move to end of run
*w++ = *first; // write char
w += sprintf(w, "%d", (int)(beyond-first)); // and length of run (cast: the pointer difference is a ptrdiff_t, not an int)
}
*w = '\0';
return t;
}
Things I like:
No auxiliary variable for the character whose run we're currently scanning.
No auxiliary variable for the count.
Reasonably sparing use of other local variables.
As others have pointed out, the spec is ambiguous. I think that's fine for an interview question: the point may well be to see what the job applicant does in an ambiguous situation.
Here's my take on the code. I've made some assumptions (since I can't very well ask the interviewer in this case):
This is a simple form of run-length encoding.
Output is of the form {character}{count}.
To avoid ambiguity, the count is 1..9.
Runs of the same character longer than 9 are split into multiple counts.
No dynamic allocation is done. In C, it's usually better to let caller take care of that. We return true/false to indicate if there was enough space.
I hope the code is clear enough to stand on its own. I've included a test harness and some test cases.
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
static void append(char **output, size_t *max, int c)
{
if (*max > 0) {
**output = c;
*output += 1;
*max -= 1;
}
}
static void encode(char **output, size_t *max, int c, int count)
{
while (count > 9) {
append(output, max, c);
append(output, max, '0' + 9);
count -= 9;
}
append(output, max, c);
append(output, max, '0' + count);
}
static bool rle(const char *input, char *output, size_t max)
{
char prev;
int count;
prev = '\0';
count = 0;
while (*input != '\0') {
if (*input == prev) {
count++;
} else {
if (count > 0)
encode(&output, &max, prev, count);
prev = *input;
count = 1;
}
++input;
}
if (count > 0)
encode(&output, &max, prev, count);
if (max == 0)
return false;
*output = '\0';
return true;
}
int main(void)
{
struct {
const char *input;
const char *facit;
} tests[] = {
{ "", "" },
{ "a", "a1" },
{ "aa", "a2" },
{ "ab", "a1b1" },
{ "abaabbaaabbb", "a1b1a2b2a3b3" },
{ "abbccc", "a1b2c3" },
{ "1", "11" },
{ "12", "1121" },
{ "1111111111", "1911" },
{ "aaaaaaaaaa", "a9a1" },
};
bool errors;
errors = false;
for (int i = 0; i < sizeof(tests) / sizeof(tests[0]); ++i) {
char buf[1024];
bool ok;
ok = rle(tests[i].input, buf, sizeof buf);
if (!ok || strcmp(tests[i].facit, buf) != 0) {
printf("FAIL: i=%d input=<%s> facit=<%s> buf=<%s>\n",
i, tests[i].input, tests[i].facit, buf);
errors = true;
}
}
if (errors)
return EXIT_FAILURE;
return 0;
}
int priya_homework(char *input_str, char *output_str, int out_len)
{
char pc,c;
int count=0,used=0;
/* Check for NULL and empty inputs here and return*/
*output_str='\0';
pc=*input_str;
do
{
c=*input_str++;
if (c==pc)
{
pc=c;
count++;
}
else
{
used=snprintf(output_str,out_len,"%c%d",pc,count);
if (used>=out_len)
{
/* Output string too short */
return -1;
}
output_str+=used;
out_len-=used;
pc=c;
count=1;
}
} while (c!='\0' && (out_len>0));
return 0;
}
Damn, thought you said C#, not C. Here is my C# implementation for interest's sake.
private string Question(string input)
{
var output = new StringBuilder();
while (!string.IsNullOrEmpty(input))
{
var first = input[0];
var count = 1;
while (count < input.Length && input[count] == first)
{
count++;
}
if (count > input.Length)
{
input = null;
}
else
{
input = input.Substring(count);
}
output.AppendFormat("{0}{1}", first, count);
}
return output.ToString();
}
Something like this:
void so(char s[])
{
int i,count;
char cur,prev;
i = count = prev = 0;
while(cur=s[i++])
{
if(!prev)
{
prev = cur;
count++;
}
else
{
if(cur != prev)
{
printf("%c%d",prev,count);
prev = cur;
count = 1;
}
else
count++;
}
}
if(count)
printf("%c%d",prev,count);
printf("\n");
}
