Using pointers in 2D arrays - c

I'm attempting to store arrays of integers that I read from a file (with a separate function) in a 2D array but I keep having issues with Segmentation fault. I know it's an issue with my pointers but I can't figure out exactly what I'm doing wrong.
Here is my function (takes an integer and compares it with an integer read from a file before storing it in my 2D array).
int **getStopTimes(int stop_id) {
int **result = malloc(sizeof(*result));
char const* const fileName = "stop_times_test.txt";
FILE* txt = fopen(fileName, "r");
char line[256];
int count = 0;
while (fgets(line, sizeof(line), txt) != NULL) {
int *formattedLine = getStopTimeData(line); //getStopTimeData returns a pointer to an array of ints, memory is allocated in the function
if (formattedLine[1] == stop_id) {
result[count] = formattedLine;
count++;
}
}
fclose(txt);
return result;
}
And my main:
int main(int argc, char *argv[]) {
int **niceRow = getStopTimes(21249);
for (int i=0; i<2; i++) { //Only looping 3 iterations for test purposes
printf("%d,%d,%d,%d\n",niceRow[i][0], niceRow[i][1], niceRow[i][2], niceRow[i][3]);
}
free(niceRow);
return 0;
}
getStopTimeData function thats being called (Pulls certain information from an array of chars and stores/returns them in an int array):
int *getStopTimeData(char line[]) {
int commas = 0;
int len = strlen(line);
int *stopTime = malloc(4 * sizeof(*stopTime)); //Block of memory for each integer
char trip_id[256]; //Temp array to build trip_id string
char stop_id[256]; //Temp array to build stop_id string
int arrival_time; //Temp array to build arrival_time string
int departure_time; //Temp array to build departure_time string
int counter;
for(int i = 0; i <len; i++) {
if(line[i] == ',') {
commas++;
counter = 0;
continue;
}
switch(commas) { //Build strings here and store them
case 0 :
trip_id[counter++] = line[i];
if(line[i+1] == ',') trip_id[counter] = '\0';
break;
case 1: //Convert to hours past midnight from 24hr time notation so it can be stored as int
if(line[i] == ':' && line[i+3] == ':') {
arrival_time = (line[i-2]-'0')*600 + (line[i-1]-'0')*60 + (line[i+1]-'0')*10 + (line[i+2]-'0');
}
break;
case 2 :
if(line[i] == ':' && line[i+3] == ':') {
departure_time = (line[i-2]-'0')*600 + (line[i-1]-'0')*60 + (line[i+1]-'0')*10 + (line[i+2]-'0');
}
break;
case 3 :
stop_id[counter++] = line[i];
if(line[i+1] == ',') stop_id[counter] = '\0';
break;
}
}
//Assign and convert to ints
stopTime[0] = atoi(trip_id);
stopTime[1] = atoi(stop_id);
stopTime[2] = arrival_time;
stopTime[3] = departure_time;
return stopTime;
}

This line:
int **result = malloc(sizeof(*result));
allocates just memory for one single pointer. (*result is of type int *, so it's a pointer to data -- the sizeof operator will tell you the size of a pointer to data ... e.g. 4 on a 32bit architecture)
What you want to do is not entirely clear to me without seeing the code for getStopTimeData() ... but you definitely need more memory. If this function indeed returns a pointer to some ints, and it handles allocation correctly, you probably want something along the lines of this:
int result_elements = 32;
int **result = malloc(sizeof(int *) * result_elements);
int count = 0;
[...]
if (formattedLine[1] == stop_id) {
if (count == result_elements)
{
result_elements *= 2;
result = realloc(result, result_elements);
}
result[count] = formattedLine;
count++;
}
Add proper error checking, malloc and realloc could return (void *)0 (aka null) on out of memory condition.
Also, the 32 for the initial allocation size is just a wild guess ... adapt it to your needs (so it doesn't waste a lot of memory, but will be enough for most use cases)

The upper answer is good,
just to give you an advice try to avoid using 2D array but use a simple array where you can store all your data, this ensures you to have coalescent memory.
After that, you can access your 1D array with an easy trick to see it like a 2D array
Consider that your 2D array has a line_size
To access it like a matrix or a 2d array you need to find out the corresponding index of your 1d array for given x,y values
index = x + y * line size;
In the opposite way:
you know the index, you want to find x and y corresponding to this index.
y = index / line_size;
x = index mod(line_size);
Of course, this "trick" can be used if you already know your line size

Related

Try to split string but got messy substrings

I try to split one string to 3-gram strings. But turns out that the resulting substrings were always messy. The length and char ** input... are needed, since I will use them as args later for python calling the funxtion.
This is the function I wrote.
struct strArrIntArr getSearchArr(char* input, int length) {
struct strArrIntArr nameIndArr;
// flag of same bit
int same;
// flag/index of identical strings
int flag = 0;
// how many identical strings
int num = 0;
// array of split strings
char** nameArr = (char **)malloc(sizeof(char *) * (length - 2));
if ( nameArr == NULL ) exit(0);
// numbers of every split string
int* valueArr = (int* )malloc(sizeof(int) * (length-2));
if ( valueArr == NULL ) exit(0);
// loop length of search string -2 times (3-gram)
for(int i = 0; i<length-2; i++){
if(flag==0){
nameArr[i - num] = (char *)malloc(sizeof(char) * 3);
if ( nameArr[i - num] == NULL ) exit(0);
printf("----i------------%d------\n", i);
printf("----i-num--------%d------\n", i-num);
}
flag = 0;
// compare splitting string with existing split strings,
// if a string exists, it would not be stored
for(int k=0; k<i-num; k++){
same = 0;
for(int j=0; j<3; j++){
if(input[i + j] == nameArr[k][j]){
same ++;
}
}
// identical strings found, if all the three bits are the same
if(same == 3){
flag = k;
num++;
break;
}
}
// if the current split string doesn't exist yet
// put current split string to array
if(flag == 0){
for(int j=0; j<3; j++){
nameArr[i-num][j] = input[i + j];
valueArr[i-num] = 1;
}
}else{
valueArr[flag]++;
}
printf("-----string----%s\n", nameArr[i-num]);
}
// number of N-gram strings
nameIndArr.length = length- 2- num;
// array of N-gram strings
nameIndArr.charArr = nameArr;
nameIndArr.intArr = valueArr;
return nameIndArr;
}
To call the function:
int main(int argc, const char * argv[]) {
int length = 30;
char* input = (char *)malloc(sizeof(char) * length);
input = "googleapis.com.wncln.wncln.org";
// split the search string into N-gram strings
// and count the numbers of every split string
struct strArrIntArr nameIndArr = getSearchArr(input, length);
}
Below is the result. The strings from 17 are messy.
----i------------0------
----i-num--------0------
-----string----goo
----i------------1------
----i-num--------1------
-----string----oog
----i------------2------
----i-num--------2------
-----string----ogl
----i------------3------
----i-num--------3------
-----string----gle
----i------------4------
----i-num--------4------
-----string----lea
----i------------5------
----i-num--------5------
-----string----eap
----i------------6------
----i-num--------6------
-----string----api
----i------------7------
----i-num--------7------
-----string----pis
----i------------8------
----i-num--------8------
-----string----is.
----i------------9------
----i-num--------9------
-----string----s.c
----i------------10------
----i-num--------10------
-----string----.co
----i------------11------
----i-num--------11------
-----string----com
----i------------12------
----i-num--------12------
-----string----om.
----i------------13------
----i-num--------13------
-----string----m.w
----i------------14------
----i-num--------14------
-----string----.wn
----i------------15------
----i-num--------15------
-----string----wnc
---i------------16------
----i-num--------16------
-----string----ncl
----i------------17------
----i-num--------17------
-----string----clnsole
----i------------18------
----i-num--------18------
-----string----ln.=C:
----i------------19------
----i-num--------19------
-----string----n.wgram 馻绚s
----i------------20------
----i-num--------20------
-----string----n.wgram 馻绚s
-----string----n.wgram 馻绚s
-----string----n.wgram 馻绚s
-----string----n.wgram 馻绚s
-----string----n.wgram 馻绚s
-----string----n.oiles(騛窑=
----i------------26------
----i-num--------21------
-----string----.orSModu鯽蓼t
----i------------27------
----i-num--------22------
-----string----org
under win10, codeblocks 17.12, gcc 8.1.0
You are making life complicated for you in several places:
Don't count backwards: Instead of making num the count of duplicates, make it the count of unique trigraphs.
Scope variable definitions in functions as closely as possible. You have several uninitialized variables. You have declared them at the start of the function, but you need them only in local blocks.
Initialize as soon as you allocate. In your code, you use a flag to determine whather to create a new string. The code to allocate he string and to initialize it are in different blocks. Those blocks have the same flag as condition, but the flag is updated in between. This could lead to asynchronities, even to bugs when you try to initialize memory that wasn't allocated.
It's probably better to keep the strings and their counts together in a struct. If anything, this will help you with sorting later. This also offers some simplification: Instead of allocating chunks of 3 bytes, keep a char array of four bytes in the struct, so that all entries can be properly null-terminated. Those don't need to be allocated separately.
Here's an alternative implementation:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
struct tri {
char str[4]; // trigraph: 3 chars and NUL
int count; // count of occurrences
};
struct stat {
struct tri *tri; // list of trigraphs with counts
int size; // number of trigraphs
};
/*
* Find string 'key' in list of trigraphs. Return the index
* or in the array or -1 if it isn't found.
*/
int find_trigraph(const struct tri *tri, int n, const char *key)
{
for (int i = 0; i < n; i++) {
int j = 0;
while (j < 3 && tri[i].str[j] == key[j]) j++;
if (j == 3) return i;
}
return -1;
}
/*
* Create an array of trigraphs from the input string.
*/
struct stat getSearchArr(char* input, int length)
{
int num = 0;
struct tri *tri = malloc(sizeof(*tri) * (length - 2));
for(int i = 0; i < length - 2; i++) {
int index = find_trigraph(tri, num, input + i);
if (index < 0) {
snprintf(tri[num].str, 4, "%.3s", input + i); // see [1]
tri[num].count = 1;
num++;
} else {
tri[index].count++;
}
}
for(int i = 0; i < num; i++) {
printf("#%d %s: %d\n", i, tri[i].str, tri[i].count);
}
struct stat stat = { tri, num };
return stat;
}
/*
* Driver code
*/
int main(void)
{
char *input = "googleapis.com.wncln.wncln.org";
int length = strlen(input);
struct stat stat = getSearchArr(input, length);
// ... do stuff with stat ...
free(stat.tri);
return 0;
}
Footnote 1: I find that snprintf(str, n, "%.*s", len, str + offset) is useful for copying substrings: The result will not overflow the buffer and it will be null-terminated. There really ought to be a stanard function for this, but strcpy may overflow and strncpy may leave the buffer unterminated.
This answer tries to fix the existing code instead of proposing alternative/better solutions.
After fixing the output
printf("-----string----%s\n", nameArr[i-num]);
in the question, there is still another important problem.
You want to store 3 characters in nameArr[i-num] and allocate space for 3 characters. Later you print is as a string in the code shown above. This requires a trailing '\0' after the 3 characters, so you have to allocate memory for 4 characters and either append a '\0' or initialize the allocated memory with 0. Using calloc instead of malloc would automatically initialize the memory to 0.
Here is a modified version of the source code
I also changed the initialization of the string value and its length in main() to avoid the memory leak.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
struct strArrIntArr {
int length;
char **charArr;
int *intArr;
};
struct strArrIntArr getSearchArr(char* input, int length) {
struct strArrIntArr nameIndArr;
// flag of same bit
int same;
// flag/index of identical strings
int flag = 0;
// how many identical strings
int num = 0;
// array of split strings
char** nameArr = (char **)malloc(sizeof(char *) * (length - 2));
if ( nameArr == NULL ) exit(0);
// numbers of every split string
int* valueArr = (int* )malloc(sizeof(int) * (length-2));
if ( valueArr == NULL ) exit(0);
// loop length of search string -2 times (3-gram)
for(int i = 0; i<length-2; i++){
if(flag==0){
nameArr[i - num] = (char *)malloc(sizeof(char) * 4);
if ( nameArr[i - num] == NULL ) exit(0);
printf("----i------------%d------\n", i);
printf("----i-num--------%d------\n", i-num);
}
flag = 0;
// compare splitting string with existing split strings,
// if a string exists, it would not be stored
for(int k=0; k<i-num; k++){
same = 0;
for(int j=0; j<3; j++){
if(input[i + j] == nameArr[k][j]){
same ++;
}
}
// identical strings found, if all the three bits are the same
if(same == 3){
flag = 1;
num++;
break;
}
}
// if the current split string doesn't exist yet
// put current split string to array
if(flag == 0){
for(int j=0; j<3; j++){
nameArr[i-num][j] = input[i + j];
valueArr[i-num] = 1;
}
nameArr[i-num][3] = '\0';
}else{
valueArr[flag]++;
}
printf("-----string----%s\n", nameArr[i-num]);
}
// number of N-gram strings
nameIndArr.length = length- 2- num;
// array of N-gram strings
nameIndArr.charArr = nameArr;
nameIndArr.intArr = valueArr;
return nameIndArr;
}
int main(int argc, const char * argv[]) {
int length;
char* input = strdup("googleapis.com.wncln.wncln.org");
length = strlen(input);
// split the search string into N-gram strings
// and count the numbers of every split string
struct strArrIntArr nameIndArr = getSearchArr(input, length);
}
This other answer contains more improvements which I personally would prefer over the modified original solution.

Attempting to split and store arrays similar to strtok

For an assignment in class, we have been instructed to write a program which takes a string and a delimiter and then takes "words" and stores them in a new array of strings. i.e., the input ("my name is", " ") would return an array with elements "my" "name" "is".
Roughly, what I've attempted is to:
Use a separate helper called number_of_delimeters() to determine the size of the array of strings
Iterate through the initial array to find the number of elements in a given string which would be placed in the array
Allocate storage within my array for each string
Store the elements within the allocated memory
Include directives:
#include <stdlib.h>
#include <stdio.h>
This is the separate helper:
int number_of_delimiters (char* s, int d)
{
int numdelim = 0;
for (int i = 0; s[i] != '\0'; i++)
{
if (s[i] == d)
{
numdelim++;
}
}
return numdelim;
}
`This is the function itself:
char** split_at (char* s, char d)
{
int numdelim = number_of_delimiters(s, d);
int a = 0;
int b = 0;
char** final = (char**)malloc((numdelim+1) * sizeof(char*));
for (int i = 0; i <= numdelim; i++)
{
int sizeofj = 0;
while (s[a] != d)
{
sizeofj++;
a++;
}
final[i] = (char*)malloc(sizeofj);
a++;
int j = 0;
while (j < sizeofj)
{
final[i][j] = s[b];
j++;
b++;
}
b++;
final[i][j+1] = '\0';
}
return final;
}
To print:
void print_string_array(char* a[], unsigned int alen)
{
printf("{");
for (int i = 0; i < alen; i++)
{
if (i == alen - 1)
{
printf("%s", a[i]);
}
else
{
printf("%s ", a[i]);
}
}
printf("}");
}
int main(int argc, char *argv[])
{
print_string_array(split_at("Hi, my name is none.", ' '), 5);
return 0;
}
This currently returns {Hi, my name is none.}
After doing some research, I realized that the purpose of this function is either similar or identical to strtok. However, looking at the source code for this proved to be little help because it included concepts we have not yet used in class.
I know the question is vague, and the code rough to read, but what can you point to as immediately problematic with this approach to the problem?
The program has several problems.
while (s[a] != d) is wrong, there is no delimiter after the last word in the string.
final[i][j+1] = '\0'; is wrong, j+1 is one position too much.
The returned array is unusable, unless you know beforehand how many elements are there.
Just for explanation:
strtok will modify the array you pass in! After
char test[] = "a b c ";
for(char* t = test; strtok(t, " "); t = NULL);
test content will be:
{ 'a', 0, 'b', 0, 'c', 0, 0 }
You get subsequently these pointers to your test array: test + 0, test + 2, test + 4, NULL.
strtok remembers the pointer you pass to it internally (most likely, you saw a static variable in your source code...) so you can (and must) pass NULL the next time you call it (as long as you want to operate on the same source string).
You, in contrast, apparently want to copy the data. Fine, one can do so. But here we get a problem:
char** final = //...
return final;
void print_string_array(char* a[], unsigned int alen)
You just return the array, but you are losing length information!
How do you want to pass the length to your print function then?
char** tokens = split_at(...);
print_string_array(tokens, sizeof(tokens));
will fail, because sizeof(tokens) will always return the size of a pointer on your local system (most likely 8, possibly 4 on older hardware)!
My personal recommendation: create a null terminated array of c strings:
char** final = (char**)malloc((numdelim + 2) * sizeof(char*));
// ^ (!)
// ...
final[numdelim + 1] = NULL;
Then your print function could look like this:
void print_string_array(char* a[]) // no len parameter any more!
{
printf("{");
if(*a)
{
printf("%s", *a); // printing first element without space
for (++a; *a; ++a) // *a: checking, if current pointer is not NULL
{
printf(" %s", *a); // next elements with spaces
}
}
printf("}");
}
No problems with length any more. Actually, this is exactly the same principle C strings use themselves (the terminating null character, remember?).
Additionally, here is a problem in your own code:
while (j < sizeofj)
{
final[i][j] = s[b];
j++; // j will always point behind your string!
b++;
}
b++;
// thus, you need:
final[i][j] = '\0'; // no +1 !
For completeness (this was discovered by n.m. already, see the other answer): If there is no trailing delimiter in your source string,
while (s[a] != d)
will read beyond your input string (which is undefined behaviour and could result in your program crashing). You need to check for the terminating null character, too:
while(s[a] && s[a] != d)
Finally: how do you want to handle subsequent delimiters? Currently, you will insert empty strings into your array? Print out your strings as follows (with two delimiting symbols - I used * and + like birth and death...):
printf("*%s+", *a);
and you will see. Is this intended?
Edit 2: The variant with pointer arithmetic (only):
char** split_at (char* s, char d)
{
int numdelim = 0;
char* t = s; // need a copy
while(*t)
{
numdelim += *t == d;
++t;
}
char** final = (char**)malloc((numdelim + 2) * sizeof(char*));
char** f = final; // pointer to current position within final
t = s; // re-assign t, using s as start pointer for new strings
while(*t) // see above
{
if(*t == d) // delimiter found!
{
// can subtract pointers --
// as long as they point to the same array!!!
char* n = (char*)malloc(t - s + 1); // +1: terminating null
*f++ = n; // store in position pointer and increment it
while(s != t) // copy the string from start to current t
*n++ = *s++;
*n = 0; // terminate the new string
}
++t; // next character...
}
*f = NULL; // and finally terminate the string array
return final;
}
While I've now been shown a more elegant solution, I've found and rectified the issues in my code:
char** split_at (char* s, char d)
{
int numdelim = 0;
int x;
for (x = 0; s[x] != '\0'; x++)
{
if (s[x] == d)
{
numdelim++;
}
}
int a = 0;
int b = 0;
char** final = (char**)malloc((numdelim+1) * sizeof(char*));
for (int i = 0; i <= numdelim; i++)
{
int sizeofj = 0;
while ((s[a] != d) && (a < x))
{
sizeofj++;
a++;
}
final[i] = (char*)malloc(sizeofj);
a++;
int j = 0;
while (j < sizeofj)
{
final[i][j] = s[b];
j++;
b++;
}
final[i][j] = '\0';
b++;
}
return final;
}
I consolidated what I previously had as a helper function, and modified some points where I incorrectly incremented .

Array copying C programming

I have the code as:
mid = 3;
for(i=0;i<9;i++)
{
if(i == 9-mid)
num[i] = mid;
else
num[i] = 0;
printf("%d", num[i]);
}
which gives the result of "000000300".
What I try to do is to store "000000300" as an element of another array, i.e.
unsigned int array[0] = 000000300;
Any ideas of how to do this in C? Thanks~
If you want to copy the calculated string "000000300" you will need to allocate some memory and store it in a char * array:
// num is a char array containing "000000300".
char *stored = (char *)malloc(strlen(num) + 1);
if (stored == NULL) {
// This means that there is no memory available.
// Unlikely to happen on modern machines.
}
strcpy(stored, num);

Remove char from char* and return char*

I have one char array and I want to delete chars that satisfy my condition. Example I have a char array A={1,-1,0 ,1 ,-1} and I want to delete elements that equal -1. That mean output is {1,0,1}, and I want to check how many element in the array. In my example is 3. Can you help me please?
char* delete_char(char* sourceArray, char inputChar)
{
char* out=NULL;
//Need to malloc memory. But I don't know how many size will allocate because it depends on how many element that don't equal -1
return out;
}
int sizeofArray(char* sourceArray)
{
return size;
}
You have some strategies:
Calculate the exact size before allocating.
Estimate a size.
Allocate a big size memory (you probably know maximum size in worse case).
In two later cases, you can use realloc to shrink the memory. In your code, the estimated size is strlen(sourceArray)+1 if the array is a zero-terminated string, because by deleting the result is less and equal to it.
First of all, when you are using char* as an array, you also need to provide information about the length of that array (how many items it contains). Usualy this is done by a second parameter.
In your case: delete_char(char *inputArray, char inputArraySize, char inputChar).
Then you should allocate your result array to be the size of your input array (because it can contain at most as much items as the input array).
After that, you should iterate through every item in the input array and if the item fulfills your condition, add it into the result array. Of course, you have to provide also the size of the output array (for instance in an another output parameter), so you'll be able to work with it.
And lastly, when your done with working with the resulting array, don't forget do deallocate all of its memory (that means deallocate it as it was the size of the original input array, because it actially is).
You can write own function like that.
int size(char *ptr)
{
int offset = 0;
int count = 0;
while (*(ptr + offset) != '\0')
{
++count;
++offset;
}
return count;
}
Here it is:
#include <string.h>
char* delete_char(char* sourceArray, char inputChar)
{
int iNr = 0,iSize,j = 0;
iSize = strlen( sourceArray );
for( int i = 0; i < iSize; i++ )
if( sourceArray[i] == inputChar )
iNr ++;
char *newarray = new char[iNr +1];
for( int i = 0; i < iSize; i++ )
{
if( sourceArray[i] != inputChar )
newarray[j++] = sourceArray[i];
}
return newarray;
}
int sizeofArray(char* sourceArray)
{
return strlen( sourceArray );
}
PS TESTED. It works but may not be very efficient cause you check it 2 times.
You can scan the array to determine the number of elements you are removing, then allocate and appropriate memory, then copy. For example:
int sizeNeeded = 0;
for (int i = 0; i < sizeofArray(sourceArray; i++) {
if (sourceArray[I] != inputChar) sizeNeeded++;
}
char *rv = (char *) malloc(sizeNeeded * sizeOf(char));
int j = 0;
for (int i = 0; i < sizeofArray(sourceArray; i++) {
if (sourceArray[I] != inputChar) rv[j++] = sourceArray[i];
}
I did not try to compile this code but it should convey the idea

In-place run length decoding?

Given a run length encoded string, say "A3B1C2D1E1", decode the string in-place.
The answer for the encoded string is "AAABCCDE". Assume that the encoded array is large enough to accommodate the decoded string, i.e. you may assume that the array size = MAX[length(encodedstirng),length(decodedstring)].
This does not seem trivial, since merely decoding A3 as 'AAA' will lead to over-writing 'B' of the original string.
Also, one cannot assume that the decoded string is always larger than the encoded string.
Eg: Encoded string - 'A1B1', Decoded string is 'AB'. Any thoughts?
And it will always be a letter-digit pair, i.e. you will not be asked to converted 0515 to 0000055555
If we don't already know, we should scan through first, adding up the digits, in order to calculate the length of the decoded string.
It will always be a letter-digit pair, hence you can delete the 1s from the string without any confusion.
A3B1C2D1E1
becomes
A3BC2DE
Here is some code, in C++, to remove the 1s from the string (O(n) complexity).
// remove 1s
int i = 0; // read from here
int j = 0; // write to here
while(i < str.length) {
assert(j <= i); // optional check
if(str[i] != '1') {
str[j] = str[i];
++ j;
}
++ i;
}
str.resize(j); // to discard the extra space now that we've got our shorter string
Now, this string is guaranteed to be shorter than, or the same length as, the final decoded string. We can't make that claim about the original string, but we can make it about this modified string.
(An optional, trivial, step now is to replace every 2 with the previous letter. A3BCCDE, but we don't need to do that).
Now we can start working from the end. We have already calculated the length of the decoded string, and hence we know exactly where the final character will be. We can simply copy the characters from the end of our short string to their final location.
During this copy process from right-to-left, if we come across a digit, we must make multiple copies of the letter that is just to the left of the digit. You might be worried that this might risk overwriting too much data. But we proved earlier that our encoded string, or any substring thereof, will never be longer than its corresponding decoded string; this means that there will always be enough space.
The following solution is O(n) and in-place. The algorithm should not access memory it shouldn't, both read and write. I did some debugging, and it appears correct to the sample tests I fed it.
High level overview:
Determine the encoded length.
Determine the decoded length by reading all the numbers and summing them up.
End of buffer is MAX(decoded length, encoded length).
Decode the string by starting from the end of the string. Write from the end of the buffer.
Since the decoded length might be greater than the encoded length, the decoded string might not start at the start of the buffer. If needed, correct for this by shifting the string over to the start.
int isDigit (char c) {
return '0' <= c && c <= '9';
}
unsigned int toDigit (char c) {
return c - '0';
}
unsigned int intLen (char * str) {
unsigned int n = 0;
while (isDigit(*str++)) {
++n;
}
return n;
}
unsigned int forwardParseInt (char ** pStr) {
unsigned int n = 0;
char * pChar = *pStr;
while (isDigit(*pChar)) {
n = 10 * n + toDigit(*pChar);
++pChar;
}
*pStr = pChar;
return n;
}
unsigned int backwardParseInt (char ** pStr, char * beginStr) {
unsigned int len, n;
char * pChar = *pStr;
while (pChar != beginStr && isDigit(*pChar)) {
--pChar;
}
++pChar;
len = intLen(pChar);
n = forwardParseInt(&pChar);
*pStr = pChar - 1 - len;
return n;
}
unsigned int encodedSize (char * encoded) {
int encodedLen = 0;
while (*encoded++ != '\0') {
++encodedLen;
}
return encodedLen;
}
unsigned int decodedSize (char * encoded) {
int decodedLen = 0;
while (*encoded++ != '\0') {
decodedLen += forwardParseInt(&encoded);
}
return decodedLen;
}
void shift (char * str, int n) {
do {
str[n] = *str;
} while (*str++ != '\0');
}
unsigned int max (unsigned int x, unsigned int y) {
return x > y ? x : y;
}
void decode (char * encodedBegin) {
int shiftAmount;
unsigned int eSize = encodedSize(encodedBegin);
unsigned int dSize = decodedSize(encodedBegin);
int writeOverflowed = 0;
char * read = encodedBegin + eSize - 1;
char * write = encodedBegin + max(eSize, dSize);
*write-- = '\0';
while (read != encodedBegin) {
unsigned int i;
unsigned int n = backwardParseInt(&read, encodedBegin);
char c = *read;
for (i = 0; i < n; ++i) {
*write = c;
if (write != encodedBegin) {
write--;
}
else {
writeOverflowed = 1;
}
}
if (read != encodedBegin) {
read--;
}
}
if (!writeOverflowed) {
write++;
}
shiftAmount = encodedBegin - write;
if (write != encodedBegin) {
shift(write, shiftAmount);
}
return;
}
int main (int argc, char ** argv) {
//char buff[256] = { "!!!A33B1C2D1E1\0!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!" };
char buff[256] = { "!!!A2B12C1\0!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!" };
//char buff[256] = { "!!!A1B1C1\0!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!" };
char * str = buff + 3;
//char buff[256] = { "A1B1" };
//char * str = buff;
decode(str);
return 0;
}
This is a very vague question, though it's not particularly difficult if you think about it. As you say, decoding A3 as AAA and just writing it in place will overwrite the chars B and 1, so why not just move those farther along the array first?
For instance, once you've read A3, you know that you need to make space for one extra character, if it was A4 you'd need two, and so on. To achieve this you'd find the end of the string in the array (do this upfront and store it's index).
Then loop though, moving the characters to their new slots:
To start: A|3|B|1|C|2|||||||
Have a variable called end storing the index 5, i.e. the last, non-blank, entry.
You'd read in the first pair, using a variable called cursor to store your current position - so after reading in the A and the 3 it would be set to 1 (the slot with the 3).
Pseudocode for the move:
var n = array[cursor] - 2; // n = 1, the 3 from A3, and then minus 2 to allow for the pair.
for(i = end; i > cursor; i++)
{
array[i + n] = array[i];
}
This would leave you with:
A|3|A|3|B|1|C|2|||||
Now the A is there once already, so now you want to write n + 1 A's starting at the index stored in cursor:
for(i = cursor; i < cursor + n + 1; i++)
{
array[i] = array[cursor - 1];
}
// increment the cursor afterwards!
cursor += n + 1;
Giving:
A|A|A|A|B|1|C|2|||||
Then you're pointing at the start of the next pair of values, ready to go again. I realise there are some holes in this answer, though that is intentional as it's an interview question! For instance, in the edge cases you specified A1B1, you'll need a different loop to move subsequent characters backwards rather than forwards.
Another O(n^2) solution follows.
Given that there is no limit on the complexity of the answer, this simple solution seems to work perfectly.
while ( there is an expandable element ):
expand that element
adjust (shift) all of the elements on the right side of the expanded element
Where:
Free space size is the number of empty elements left in the array.
An expandable element is an element that:
expanded size - encoded size <= free space size
The point is that in the process of reaching from the run-length code to the expanded string, at each step, there is at least
one element that can be expanded (easy to prove).

Resources