In-place run length decoding? - c

Given a run length encoded string, say "A3B1C2D1E1", decode the string in-place.
The answer for the encoded string is "AAABCCDE". Assume that the encoded array is large enough to accommodate the decoded string, i.e. you may assume that the array size = MAX[length(encodedstring),length(decodedstring)].
This does not seem trivial, since merely decoding A3 as 'AAA' will lead to over-writing 'B' of the original string.
Also, one cannot assume that the decoded string is always larger than the encoded string.
Eg: Encoded string - 'A1B1', Decoded string is 'AB'. Any thoughts?
And it will always be a letter-digit pair, i.e. you will not be asked to convert 0515 to 0000055555.

If we don't already know, we should scan through first, adding up the digits, in order to calculate the length of the decoded string.
It will always be a letter-digit pair, hence you can delete the 1s from the string without any confusion.
A3B1C2D1E1
becomes
A3BC2DE
Here is some code, in C++, to remove the 1s from the string (O(n) complexity).
// remove 1s
std::size_t i = 0; // read from here
std::size_t j = 0; // write to here
while (i < str.length()) {
    assert(j <= i); // optional check
    if (str[i] != '1') {
        str[j] = str[i];
        ++j;
    }
    ++i;
}
str.resize(j); // to discard the extra space now that we've got our shorter string
Now, this string is guaranteed to be shorter than, or the same length as, the final decoded string. We can't make that claim about the original string, but we can make it about this modified string.
(An optional, trivial, step now is to replace every 2 with the previous letter. A3BCCDE, but we don't need to do that).
Now we can start working from the end. We have already calculated the length of the decoded string, and hence we know exactly where the final character will be. We can simply copy the characters from the end of our short string to their final location.
During this copy process from right-to-left, if we come across a digit, we must make multiple copies of the letter that is just to the left of the digit. You might be worried that this might risk overwriting too much data. But we proved earlier that our encoded string, or any substring thereof, will never be longer than its corresponding decoded string; this means that there will always be enough space.
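The steps above (drop the 1s, then expand from the right) can be sketched in C roughly as follows. This is only a sketch under stated assumptions: the name rle_decode_in_place is made up for illustration, every count is a single digit from 1 to 9, letters are never digits, and the caller's buffer can hold the longer of the two strings plus the terminator.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: decodes buf in place and returns the decoded length.
   Assumes letter + single-digit (1..9) pairs and a large enough buffer. */
size_t rle_decode_in_place(char *buf)
{
    assert(buf != NULL);
    size_t enc_len = strlen(buf);
    assert(enc_len % 2 == 0);           /* must be whole letter-digit pairs */

    /* Pass 1: sum the counts and compact the string by dropping '1' counts. */
    size_t dec_len = 0;
    size_t m = 0;                       /* length of the compacted string */
    for (size_t i = 0; i < enc_len; i += 2) {
        int count = buf[i + 1] - '0';
        assert(count >= 1 && count <= 9);
        dec_len += (size_t)count;
        buf[m++] = buf[i];
        if (count != 1)
            buf[m++] = buf[i + 1];
    }

    /* Pass 2: expand from right to left. The compacted string is never
       longer than the decoded one, so writes never overtake unread data. */
    size_t r = m;                       /* one past the next char to read */
    size_t w = dec_len;                 /* one past the next slot to write */
    buf[dec_len] = '\0';
    while (r > 0) {
        char c = buf[--r];
        size_t count = 1;
        if (isdigit((unsigned char)c)) {
            count = (size_t)(c - '0');
            c = buf[--r];               /* the letter sits left of its count */
        }
        while (count-- > 0)
            buf[--w] = c;
    }
    return dec_len;
}
```

Called on a buffer holding "A3B1C2D1E1", this should leave "AAABCCDE" in the buffer; on "A1B1" it should leave "AB".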

The following solution is O(n) and in-place. The algorithm should never read or write memory outside the buffer. I did some debugging, and it appears to be correct on the sample tests I fed it.
High level overview:
Determine the encoded length.
Determine the decoded length by reading all the numbers and summing them up.
End of buffer is MAX(decoded length, encoded length).
Decode the string by starting from the end of the string. Write from the end of the buffer.
Since the decoded length might be greater than the encoded length, the decoded string might not start at the start of the buffer. If needed, correct for this by shifting the string over to the start.
int isDigit (char c) {
return '0' <= c && c <= '9';
}
unsigned int toDigit (char c) {
return c - '0';
}
unsigned int intLen (char * str) {
unsigned int n = 0;
while (isDigit(*str++)) {
++n;
}
return n;
}
unsigned int forwardParseInt (char ** pStr) {
unsigned int n = 0;
char * pChar = *pStr;
while (isDigit(*pChar)) {
n = 10 * n + toDigit(*pChar);
++pChar;
}
*pStr = pChar;
return n;
}
unsigned int backwardParseInt (char ** pStr, char * beginStr) {
unsigned int len, n;
char * pChar = *pStr;
while (pChar != beginStr && isDigit(*pChar)) {
--pChar;
}
++pChar;
len = intLen(pChar);
n = forwardParseInt(&pChar);
*pStr = pChar - 1 - len;
return n;
}
unsigned int encodedSize (char * encoded) {
int encodedLen = 0;
while (*encoded++ != '\0') {
++encodedLen;
}
return encodedLen;
}
unsigned int decodedSize (char * encoded) {
int decodedLen = 0;
while (*encoded++ != '\0') {
decodedLen += forwardParseInt(&encoded);
}
return decodedLen;
}
void shift (char * str, int n) {
do {
str[n] = *str;
} while (*str++ != '\0');
}
unsigned int max (unsigned int x, unsigned int y) {
return x > y ? x : y;
}
void decode (char * encodedBegin) {
int shiftAmount;
unsigned int eSize = encodedSize(encodedBegin);
unsigned int dSize = decodedSize(encodedBegin);
int writeOverflowed = 0;
char * read = encodedBegin + eSize - 1;
char * write = encodedBegin + max(eSize, dSize);
*write-- = '\0';
while (read != encodedBegin) {
unsigned int i;
unsigned int n = backwardParseInt(&read, encodedBegin);
char c = *read;
for (i = 0; i < n; ++i) {
*write = c;
if (write != encodedBegin) {
write--;
}
else {
writeOverflowed = 1;
}
}
if (read != encodedBegin) {
read--;
}
}
if (!writeOverflowed) {
write++;
}
shiftAmount = encodedBegin - write;
if (write != encodedBegin) {
shift(write, shiftAmount);
}
return;
}
int main (int argc, char ** argv) {
//char buff[256] = { "!!!A33B1C2D1E1\0!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!" };
char buff[256] = { "!!!A2B12C1\0!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!" };
//char buff[256] = { "!!!A1B1C1\0!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!" };
char * str = buff + 3;
//char buff[256] = { "A1B1" };
//char * str = buff;
decode(str);
return 0;
}

This is a very vague question, though it's not particularly difficult if you think about it. As you say, decoding A3 as AAA and just writing it in place will overwrite the chars B and 1, so why not just move those farther along the array first?
For instance, once you've read A3, you know that you need to make space for one extra character, if it was A4 you'd need two, and so on. To achieve this you'd find the end of the string in the array (do this upfront and store its index).
Then loop through, moving the characters to their new slots:
To start: A|3|B|1|C|2|||||||
Have a variable called end storing the index 5, i.e. the last, non-blank, entry.
You'd read in the first pair, using a variable called cursor to store your current position - so after reading in the A and the 3 it would be set to 1 (the slot with the 3).
Pseudocode for the move:
var n = array[cursor] - 2; // n = 1, the 3 from A3, and then minus 2 to allow for the pair.
for(i = end; i > cursor; i--) // walk backwards so nothing is overwritten before it is moved
{
array[i + n] = array[i];
}
This would leave you with (slot 2 still holds a stale copy of B, which is about to be overwritten):
A|3|B|B|1|C|2||||||
Now the A is there once already, so now you want to write n + 1 A's starting at the index stored in cursor:
for(i = cursor; i < cursor + n + 1; i++)
{
array[i] = array[cursor - 1];
}
// increment the cursor afterwards!
cursor += n + 1;
Giving:
A|A|A|B|1|C|2||||||
Then you're pointing at the start of the next pair of values, ready to go again. I realise there are some holes in this answer, though that is intentional as it's an interview question! For instance, in the edge cases you specified A1B1, you'll need a different loop to move subsequent characters backwards rather than forwards.
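A rough C sketch of this shift-then-fill idea is given below. It is an illustration only: the name rle_decode_by_shifting and the cap parameter are assumptions, counts are taken to be single digits from 1 to 9, and memmove is used so that the same call handles both growing pairs (A3) and shrinking ones (A1). Note one extra assumption that goes beyond the question: the intermediate string can be longer than both the encoded and the decoded string, so the buffer must have room for it, and the sketch asserts that.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: expands one pair at a time, left to right, shifting
   the rest of the string each time. O(n^2) in the worst case.
   cap is the total size of the buffer in bytes. */
size_t rle_decode_by_shifting(char *buf, size_t cap)
{
    assert(buf != NULL);
    size_t len = strlen(buf);
    size_t cursor = 0;                  /* start of the next undecoded pair */

    while (buf[cursor] != '\0') {
        char letter = buf[cursor];
        size_t count = (size_t)(buf[cursor + 1] - '0');
        assert(count >= 1 && count <= 9);

        size_t tail = cursor + 2;       /* first char after this pair */
        size_t new_len = len - 2 + count;
        assert(new_len < cap);          /* room for the intermediate string */

        /* Move the tail (including the terminator) to its new position.
           memmove copes with overlap in either direction. */
        memmove(buf + cursor + count, buf + tail, len - tail + 1);
        memset(buf + cursor, letter, count);

        len = new_len;
        cursor += count;
    }
    return len;
}
```

With a 16-byte buffer holding "A3B1C2", the result should be "AAABCC".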

Another O(n^2) solution follows.
Given that there is no limit on the complexity of the answer, this simple solution seems to work perfectly.
while ( there is an expandable element ):
expand that element
adjust (shift) all of the elements on the right side of the expanded element
Where:
Free space size is the number of empty elements left in the array.
An expandable element is an element that:
expanded size - encoded size <= free space size
The point is that in the process of reaching from the run-length code to the expanded string, at each step, there is at least
one element that can be expanded (easy to prove).

Related

Optimizing a do while loop/for loop in C

I was doing an exercise from LeetCode which consists of deleting adjacent duplicate characters from a string until no two adjacent characters are equal. With some help I wrote code that solves most test cases, but the string length can be up to 10^5, and on one test case it exceeds the time limit, so I need some tips on how to optimize it.
My code:
char res[100000]; //up to 10^5
char * removeDuplicates(char * s){
//int that verifies if any char from the string can be deleted
int ver = 0;
//do while loop that reiterates to eliminate the duplicates
do {
int lenght = strlen(s);
int j = 0;
ver = 0;
//for loop that if there are duplicates adds one to ver and deletes the duplicate
for (int i = 0; i < lenght ; i++){
if (s[i] == s[i + 1]){
i++;
j--;
ver++;
}
else {
res[j] = s[i];
}
j++;
}
//copying the res string into the s to redo the loop if necessary
strcpy(s,res);
//clear the res string
memset(res, '\0', sizeof res);
} while (ver > 0);
return s;
}
The code can't pass a speed test that has a string that has around the limit (10^5) length, I won't put it here because it's a really big text, but if you want to check it, it is the 104 testcase from the LeetCode Daily Problem
If it was me doing something like that, I would basically do it like a simple naive string copy, but keep track of the last character copied and if the next character to copy is the same as the last then skip it.
Perhaps something like this:
char result[1000]; // Assumes no input string will be longer than this
unsigned source_index; // Index into the source string
unsigned dest_index; // Index into the destination (result) string
// Always copy the first character
result[0] = source_string[0];
// Start with 1 for source index, since we already copied the first character
for (source_index = 1, dest_index = 0; source_string[source_index] != '\0'; ++source_index)
{
if (source_string[source_index] != result[dest_index])
{
// Next character is not equal to last character copied
// That means we can copy this character
result[++dest_index] = source_string[source_index];
}
// Else: Current source character was equal to last copied character
}
// Terminate the destination string
result[dest_index + 1] = '\0';
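The loop above collapses runs of equal characters, which is not quite the same thing as the repeated pair deletion in the question's code. If the task really is to delete pairs of equal adjacent characters until none remain (my reading of the question, not something the answer states), a different technique fits: treat the already-written part of the string as a stack. The sketch below does this in one pass; the name remove_adjacent_pairs is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: removes pairs of equal adjacent characters in place,
   repeatedly, in a single O(n) pass. s[0..top) acts as a stack. */
size_t remove_adjacent_pairs(char *s)
{
    assert(s != NULL);
    size_t top = 0;
    for (size_t i = 0; s[i] != '\0'; i++) {
        if (top > 0 && s[top - 1] == s[i])
            top--;                      /* matches the stack top: drop both */
        else
            s[top++] = s[i];            /* otherwise push the character */
    }
    s[top] = '\0';
    return top;                         /* new length */
}
```

Because top never runs ahead of i, the write position cannot overwrite characters that have not been read yet, so no second buffer is needed.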

Making two arrays the same length. C(89)

The two arrays passed in are constants so I made two new arrays.
The first array stores a group of chars and the second array stores a second group of chars. So far I assume that the first group is bigger than the second ex. (a,b,c,d > x,y).
What the program hopes to accomplish is to make two new arrays that contain the same letters, but with the shorter array (in this case arr2, copied to newarr2) having its last char repeated until it matches the length of the first array.
examples of correct solutions.
(a,b,c,d < x,y) --> equate_arr --> (a,b,c,d = x,y,y,y)
void equate_arr(char arg2[], char arg1[]){
size_t i = 0;
size_t len1 = strlen(arg1);
size_t len2 = strlen(arg2);
char newarr2[512];
char newarr1[512];
while(i < (strlen2 - 1))
{
newarr2[i] = arg2[i];
i++;
}
i = 0;
while(i < (strlen1 - 1))
{
newarr1[i] = arg1[i];
i++;
}
i = 0;
while(strlen(newarr2) < strlen(newarr1))
{
newarr2[strlen(newarr2)] = newarr2[strlen(newarr2)-1]
}
}
Currently I have no idea what is happening because once I fiddle with this function in my code the program does not seem to run anymore. Sorry about asking about this project I'm working on so much but I really do need some assistance.
I can put the whole program in here if needed.
Revised
void tr_non_eq(char arg1[], char arg2[], int len1, int len2)
{
int i = 0;
char* arr2;
arr2 = (char*)calloc(len1+1,sizeof(char));
while(i < len2)
{
arr2[i] = arg2[i];
i++;
}
while(len2 < len1)
{
arr2[len2] = arg2[len2-1];
len2++;
}
tr_str(arg1, arr2);
}
Right now with inputs (a,b,c,d,e,f) and (x,y) and a string "cabbage" to translate the program prints out "yxyyx" and with string "abcdef" it prints out "xyy" which shows promise. I am not too sure why the arr2 array does not get filled with "y" chars as intended.
As de-duplicator says, as your code stands it effectively achieves nothing. More importantly, what it tries to do is fraught with peril.
The fact that you use strlen to determine the length of your arguments is a clear indicator that equate_arr does not expect to receive two arrays of char. Instead, it wants two NUL-terminated C-style strings. So the declaration should be more like:
void equate_arr(const char *arg2, const char *arg1)
This makes the contract a little clearer.
But note the return type: void. This says your function will not return any values to the caller. So, how did you plan to return the modified arrays?
The next big peril lies in these lines:
char newarr2[512];
char newarr1[512];
What happens if this function is called with a string which is larger than 511 characters (plus the NUL)? The phrase "buffer overrun" should be jumping out at you here.
What you need is to malloc buffers large enough to hold a duplicate of the longest string passed in. But that raises the question of how you will hand the new arrays back to the caller (remember that void return type?).
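One possible shape for that, sketched under assumptions: a function that returns a freshly allocated, padded copy and leaves ownership with the caller. The name pad_copy and its signature are invented for illustration and are not the asker's API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc'd copy of src, padded with its last
   character up to target_len. The caller must free() the result.
   Returns NULL if allocation fails. An empty src stays empty, since there
   is no last character to repeat. */
char *pad_copy(const char *src, size_t target_len)
{
    assert(src != NULL);
    size_t len = strlen(src);
    size_t out_len = (len > 0 && target_len > len) ? target_len : len;

    char *out = malloc(out_len + 1);    /* +1 for the terminating NUL */
    if (out == NULL)
        return NULL;

    memcpy(out, src, len);
    if (out_len > len)
        memset(out + len, src[len - 1], out_len - len);
    out[out_len] = '\0';
    return out;
}
```

Calling pad_copy("xy", strlen("abcd")) should yield "xyyy". Returning the pointer answers the "how do I hand it back" question; the cost is that the caller now has to remember to free it.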
There are numerous other problems here, largely down to not having a clear definition of the contract this function is meant to meet.
One more for now while I look more closely
while(strlen(newarr2) < strlen(newarr1))
{
newarr2[strlen(newarr2)] = newarr2[strlen(newarr2)-1]
}
The very first pass through this loop overwrites the terminating NUL in newarr2, which means the next call to strlen is off into undefined behavior as it is completely at the mercy of whatever junk is sitting in your stack.
If you are unclear on C-style strings, take a look at my answer to this question which goes into great detail about them.
The following is whiteboard-code (i.e. not compiled, not tested) which would sort of do what you are wanting to achieve. It's purely for reference
// Pad a string so that it is the same length as another. Padding is done
// by replicating the final character.
//
// @param padThis: A C-style string in a non-constant buffer.
// @param bufLength: The size of the buffer containing padThis
// @param toMatchThis: A (possibly) const C-style string to act
// as a template for length
//
// Pre-conditions:
// - Both padThis and toMatchThis reference NUL-terminated sequences
// of chars
// - strlen(padThis) < bufLength. Violating this will exit the program.
// - strlen(toMatchThis) < bufLength. If not, padThis will be padded
// to bufLength characters.
//
// Post-conditions:
// - The string referenced by toMatchThis is unchanged
// - The original string at padThis has been padded if necessary to
// min(bufLength, strlen(toMatchThis))
void padString(char * padThis, size_t bufLength, const char * toMatchThis)
{
size_t targetLength = strlen(toMatchThis);
size_t originalLength = strlen(padThis);
if (originalLength >= bufLength)
{
fprintf(stderr, "padString called with an original which is longer than the buffer!\n");
exit(EXIT_FAILURE);
}
if (targetLength >= bufLength)
targetLength = bufLength -1; // Just pad until buffer full
if (targetLength <= strlen(padThis))
return; // Nothing to do
// At this point, we know that some padding needs to occur, and
// that the buffer is large enough (assuming the caller is not
// lying to us).
char padChar = padThis[originalLength-1];
size_t index = originalLength;
while (index < targetLength)
padThis[index++] = padChar;
padThis[index] = '\0';
}
Since you declared
char newarr2[512];
char newarr1[512];
with size 512 and never assigned any data to them, strlen will return garbage for newarr1 and newarr2, because neither array holds a NUL-terminated string.
while(strlen(newarr2) < strlen(newarr1))
{
newarr2[strlen(newarr2)] = newarr2[strlen(newarr2)-1]
}
this while loop will not work properly.
for ( i = len2; i < len1; ++i )
newarr2[i] = newarr2[len2-1];
If len2 is always less than len1, you can use the loop above.
If you do not know which array will be longer, allocate both at the larger size:
size_t len1 = strlen(arg1);
size_t len2 = strlen(arg2);
char* newarr1;
char* newarr2;
int i;
if ( len1 >= len2 )
{
newarr1 = (char*)calloc(len1+1,sizeof(char));
newarr2 = (char*)calloc(len1+1,sizeof(char));
}
else
{
newarr1 = (char*)calloc(len2+1,sizeof(char));
newarr2 = (char*)calloc(len2+1,sizeof(char));
}
for ( i = 0; i < len1; ++i)
newarr1[i] = arg1[i];
for ( i = 0; i < len2; ++i)
newarr2[i] = arg2[i];
if( len1 >= len2 )
{
for ( i = len2; i < len1; ++i )
newarr2[i] = newarr2[len2-1];
}
else
{
for ( i = len1; i < len2; ++i )
newarr1[i] = newarr1[len1-1];
}
free the memory later

Returning the length of a char array in C

I am new to programming in C and am trying to write a simple function that will normalize a char array. At the end I want to return the length of the new char array. I am coming from Java, so I apologize if I'm making mistakes that seem simple. I have the following code:
/* The normalize procedure normalizes a character array of size len
according to the following rules:
1) turn all upper case letters into lower case ones
2) turn any white-space character into a space character and,
shrink any n>1 consecutive whitespace characters to exactly 1 whitespace
When the procedure returns, the character array buf contains the newly
normalized string and the return value is the new length of the normalized string.
*/
int
normalize(unsigned char *buf, /* The character array contains the string to be normalized*/
int len /* the size of the original character array */)
{
/* use a for loop to cycle through each character and the built in c functions to analyze it */
int i;
if(isspace(buf[0])){
buf[0] = "";
}
if(isspace(buf[len-1])){
buf[len-1] = "";
}
for(i = 0;i < len;i++){
if(isupper(buf[i])) {
buf[i]=tolower(buf[i]);
}
if(isspace(buf[i])) {
buf[i]=" ";
}
if(isspace(buf[i]) && isspace(buf[i+1])){
buf[i]="";
}
}
return strlen(*buf);
}
How can I return the length of the char array at the end? Also does my procedure properly do what I want it to?
EDIT: I have made some corrections to my program based on the comments. Is it correct now?
/* The normalize procedure normalizes a character array of size len
according to the following rules:
1) turn all upper case letters into lower case ones
2) turn any white-space character into a space character and,
shrink any n>1 consecutive whitespace characters to exactly 1 whitespace
When the procedure returns, the character array buf contains the newly
normalized string and the return value is the new length of the normalized string.
*/
int
normalize(unsigned char *buf, /* The character array contains the string to be normalized*/
int len /* the size of the original character array */)
{
/* use a for loop to cycle through each character and the built in c funstions to analyze it */
int i = 0;
int j = 0;
if(isspace(buf[0])){
//buf[0] = "";
i++;
}
if(isspace(buf[len-1])){
//buf[len-1] = "";
i++;
}
for(i;i < len;i++){
if(isupper(buf[i])) {
buf[j]=tolower(buf[i]);
j++;
}
if(isspace(buf[i])) {
buf[j]=' ';
j++;
}
if(isspace(buf[i]) && isspace(buf[i+1])){
//buf[i]="";
i++;
}
}
return strlen(buf);
}
The canonical way of doing something like this is to use two indices, one for reading, and one for writing. Like this:
int normalizeString(char* buf, int len) {
int readPosition, writePosition;
bool hadWhitespace = false;
for(readPosition = writePosition = 0; readPosition < len; readPosition++) {
if(isspace(buf[readPosition])) {
if(!hadWhitespace) buf[writePosition++] = ' ';
hadWhitespace = true;
} else if(...) {
...
}
}
return writePosition;
}
Warning: This handles the string according to the given length only. While using a buffer + length has the advantage of being able to handle any data, this is not the way C strings work. C-strings are terminated by a null byte at their end, and it is your job to ensure that the null byte is at the right position. The code you gave does not handle the null byte, nor does the buffer + length version I gave above. A correct C implementation of such a normalization function would look like this:
int normalizeString(char* string) { //No length is passed, it is implicit in the null byte.
char* in = string, *out = string;
bool hadWhitespace = false;
for(; *in; in++) { //loop until the zero byte is encountered
if(isspace(*in)) {
if(!hadWhitespace) *out++ = ' ';
hadWhitespace = true;
} else if(...) {
...
}
}
*out = 0; //add a new zero byte
return out - string; //use pointer arithmetic to retrieve the new length
}
In this code I replaced the indices by pointers simply because it was convenient to do so. This is simply a matter of style preference, I could have written the same thing with explicit indices. (And my style preference is not for pointer iterations, but for concise code.)
if(isspace(buf[i])) {
buf[i]=" ";
}
This should be buf[i] = ' ', not buf[i] = " ". You can't assign a string to a character.
if(isspace(buf[i]) && isspace(buf[i+1])){
buf[i]="";
}
This has two problems. One is that you're not checking whether i < len - 1, so buf[i + 1] could be off the end of the string. The other is that buf[i] = "" won't do what you want at all. To remove a character from a string, you need to use memmove to move the remaining contents of the string to the left.
return strlen(*buf);
This would be return strlen(buf). *buf is a character, not a string.
The notations like:
buf[i]=" ";
buf[i]="";
do not do what you think/expect. You will probably need to create two indexes to step through the array — one for the current read position and one for the current write position, initially both zero. When you want to delete a character, you don't increment the write position.
Warning: untested code.
int i, j;
for (i = 0, j = 0; i < len; i++)
{
if (isupper(buf[i]))
buf[j++] = tolower(buf[i]);
else if (isspace(buf[i]))
{
buf[j++] = ' ';
while (i+1 < len && isspace(buf[i+1]))
i++;
}
else
buf[j++] = buf[i];
}
buf[j] = '\0'; // Null terminate
You replace the arbitrary white space with a plain space using:
buf[i] = ' ';
You return:
return strlen(buf);
or, with the code above:
return j;
Several mistakes in your code:
You cannot assign buf[i] with a string, such as "" or " ", because the type of buf[i] is char and the type of a string is char*.
You are reading from buf and writing into buf using index i. This poses a problem, as you want to eliminate consecutive white-spaces. So you should use one index for reading and another index for writing.
In C/C++, a native string is an array of characters that ends with 0. So in essence, you can simply iterate buf until you read 0 (you don't need to use the len variable at all). In addition, since you are "truncating" the input string, you should set the new last character to 0.
Here is one optional solution for the problem at hand:
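int normalize(char* buf)
{
    int i = 0; // read index
    int j = 0; // write index
    while (buf[i] != 0)
    {
        unsigned char c = (unsigned char)buf[i++];
        if (isspace(c))
        {
            // write one space for the whole run of white-space
            buf[j++] = ' ';
            while (isspace((unsigned char)buf[i]))
                i++;
        }
        else
        {
            // tolower leaves non-upper-case characters unchanged
            buf[j++] = (char)tolower(c);
        }
    }
    buf[j] = 0;
    return j;
}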
you should write:
return strlen(buf)
instead of:
return strlen(*buf)
The reason:
buf is of type char* - it's an address of a char somewhere in the memory (the one in the beginning of the string). The string is null terminated (or at least should be), and therefore the function strlen knows when to stop counting chars.
*buf will de-reference the pointer, resulting on a char - not what strlen expects.
Not much different than the others, but this assumes an array of unsigned char and not a C string.
tolower() does not itself need the isupper() test.
int normalize(unsigned char *buf, int len) {
int i = 0;
int j = 0;
int previous_is_space = 0;
while (i < len) {
if (isspace(buf[i])) {
if (!previous_is_space) {
buf[j++] = ' ';
}
previous_is_space = 1;
} else {
buf[j++] = tolower(buf[i]);
previous_is_space = 0;
}
i++;
}
return j;
}
@OP:
The posted code implies that leading and trailing spaces should either be shrunk to 1 char or eliminated entirely.
The above answer simply shrinks leading and trailing spaces to 1 ' '.
To eliminate trailing and leading spaces:
int i = 0;
int j = 0;
while (len > 0 && isspace(buf[len-1])) len--;
while (i < len && isspace(buf[i])) i++;
int previous_is_space = 0;
while (i < len) { ...
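Putting the trimming lines together with the earlier loop gives something like the sketch below. It is only one way to combine them; the name normalize_trimmed and the use of size_t for lengths are my assumptions, and like the answer above it treats the input as a length-delimited array rather than a NUL-terminated string.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: lower-cases, collapses white-space runs to a single
   space, and drops leading and trailing white-space. Works in place on a
   length-delimited array and returns the new length. No NUL is written. */
size_t normalize_trimmed(unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    size_t i = 0;                       /* read index */
    size_t j = 0;                       /* write index */
    int previous_is_space = 0;

    while (len > 0 && isspace(buf[len - 1])) len--;   /* trim the end */
    while (i < len && isspace(buf[i])) i++;           /* trim the start */

    for (; i < len; i++) {
        if (isspace(buf[i])) {
            if (!previous_is_space)
                buf[j++] = ' ';
            previous_is_space = 1;
        } else {
            buf[j++] = (unsigned char)tolower(buf[i]);
            previous_is_space = 0;
        }
    }
    return j;
}
```

If the caller wants a C string afterwards, it can write a '\0' at index j, provided the array has room for it.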

How to use strncpy with a for-loop in C?

I am writing a program which will take every 3 numbers in a file and convert them to their ASCII symbol. So I thought I could read the numbers into a character array, and then make every 3 elements 1 element in a second array, convert them to int and then print these as char.
I am stuck on taking every 3 elements, however. This is my code snippet for this part:
char arry[] = "073102109109112"; <--example string read from a file
char arryNew[16] = {0};
for(int i = 0; i <= sizeof(arryNew); i++){
strncpy(arryNew, arry, 3);
arryNew[i+3]='\0';
puts(arryNew);
}
What this code gives me is the first 3 numbers, fifteen times. I've tried incrementing i by 3, which gives me the first 3 numbers 5 times. How do I write a for-loop with strncpy so that after copying n chars, it moves to the next n chars?
You pass always the pointer to the beginning of the array, so you will always have the same result of course. You must include the loop counter to get at the next block:
strncpy(arryNew, &arry[i*3], 3);
Here you have a problem:
arryNew[i+3]='\0';
First of all, you don't need to set the null byte every time, because this will not change anyway. Additionally you will corrupt memory, because you use i+3 as the index, so once i reaches 13 it will write beyond the array boundary.
Your arryNew must be longer, because your original array is 16 characters, and your target array is also 16. If you intend to have several 3-char strings in there, then you must have 5*4 characters for your target, because each string also has the 0-byte.
And of course, you must also use the index here as well. The way it is written now, it will write beyond the array boundary for the last few values of i.
So what you seem to want to do (not sure from your description) is:
char arry[] = "073102109109112"; // example string read from a file
char arryNew[20] = {0};
// sizeof(arry) - 1 is the number of digits; each group is 3 digits long
for(size_t i = 0; i < (sizeof(arry) - 1) / 3; i++)
{
strncpy(&arryNew[i*4], &arry[i*3], 3);
puts(&arryNew[i*4]);
}
Or if you just want to have the individual strings printed then you can just do:
char arry[] = "073102109109112"; // example string read from a file
char arryNew[4] = {0};
// sizeof(arry) - 1 is the number of digits; each group is 3 digits long
for(size_t i = 0; i < (sizeof(arry) - 1) / 3; i++)
{
strncpy(arryNew, &arry[i*3], 3);
puts(arryNew);
}
Making things a bit simpler: your target string doesn't change.
char arry[] = "073102109109112"; // example string read from a file
char target[4] = {0};
// i + 3 <= length, so the last group of three is included
for(size_t i = 0; i + 3 <= strlen(arry); i+=3)
{
strncpy(target, arry + i, 3);
puts(target);
}
Decoding:
start at the beginning of arry
copy 3 characters to target
(note the fourth element of target is \0)
print out the contents of target
increment i by 3
repeat until you fall off the end of the string.
Some problems.
// Need to change a 3 chars, as text, into an integer.
arryNew[i] = (char) strtol(buf, &endptr, 10);
// char arryNew[16] = {0};
// Overly large.
arryNew[6]
// for(int i = 0; i <= sizeof(arryNew); i++){
// Indexing too far. Should be `i <= (sizeof(arryNew) - 2)` or ...
for (i=0; i<arryNewLen; i++) {
// strncpy(arryNew, arry, 3);
// strncpy() can be used, but we know the length of source and destination,
// simpler to use memcpy()
// strncpy(buf, a, sizeof buf - 1);
memcpy(buf, arry, N);
// arryNew[i+3]='\0';
// Toward the loop's end, code is writing outside arryNew.
// Lets append the `\0` after the for() loop.
// int i
size_t i; // Better to use size_t (or ssize_t) for array index.
Suggestion:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main() {
char Source[] = "073102109109112"; // example string read from a file
const int TIW = 3; // textual integer width
// Avoid sprinkling bare constants about code. Define in 1 place instead.
const char *arry = Source;
size_t arryLen = strlen(arry);
if (arryLen%TIW != 0) return -1; // is it a strange sized arry?
size_t arryNewLen = arryLen/TIW;
char arryNew[arryNewLen + 1];
size_t i;
for (i=0; i<arryNewLen; i++) {
char buf[TIW + 1];
// strncpy(buf, a, sizeof buf - 1);
memcpy(buf, arry, TIW);
buf[TIW] = '\0';
char *endptr; // Useful should OP want to do error checking
// TBD: test if result is 0 to 255
arryNew[i] = (char) strtol(buf, &endptr, 10);
arry += TIW;
}
arryNew[i] = '\0';
puts(arryNew); // prints Ifmmp
return 0;
}
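The "TBD" range test and the unused endptr in the suggestion above could be filled in along these lines. This is a sketch, not the answerer's code: the function name decode_triplets, its return convention (-1 on any error), and the decision to reject non-digit characters up front are all assumptions made for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: converts groups of three decimal digits into chars.
   Returns the number of chars written (excluding the NUL), or -1 if the
   input length is not a multiple of 3, a group contains a non-digit,
   a value is above 255, or out is too small. */
int decode_triplets(const char *digits, char *out, size_t out_cap)
{
    assert(digits != NULL && out != NULL);
    size_t len = strlen(digits);
    if (len % 3 != 0 || len / 3 >= out_cap)
        return -1;

    size_t n = len / 3;
    for (size_t i = 0; i < n; i++) {
        char buf[4];
        char *end;
        memcpy(buf, digits + 3 * i, 3);
        buf[3] = '\0';

        /* strtol alone would accept a sign or leading space; reject those. */
        for (int k = 0; k < 3; k++)
            if (!isdigit((unsigned char)buf[k]))
                return -1;

        long value = strtol(buf, &end, 10);
        if (end != buf + 3 || value > 255)
            return -1;
        out[i] = (char)value;
    }
    out[n] = '\0';
    return (int)n;
}
```

For the sample input "073102109109112" this should write "Ifmmp" and return 5. A group of "000" is accepted and produces an embedded NUL, which may or may not be what a caller wants.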
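You could use this code to complete your task, i.e. to convert each group of three digits in the char array to the character with that ASCII value.
char arry[] = "073102109109112";
char arryNew[16] = {0};
size_t i, j = 0;
for(i = 0; i + 2 < strlen(arry); i += 3)
{
    // subtract '0' to turn each digit character into its numeric value
    arryNew[j] = (char)((arry[i] - '0') * 100 + (arry[i+1] - '0') * 10 + (arry[i+2] - '0'));
    j++;
}
arryNew[j] = '\0';
puts(arryNew);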

pointers and string parsing in c

I was wondering if somebody could explain me how pointers and string parsing works. I know that I can do something like the following in a loop but I still don't follow very well how it works.
for (a = str; * a; a++) ...
For instance, I'm trying to get the last integer from the string. if I have a string as const char *str = "some string here 100 2000";
Using the method above, how could I parse it and get the last integer of the string (2000), knowing that the last integer (2000) may vary.
Thanks
for (a = str; * a; a++) ...
This works by starting a pointer a at the beginning of the string, until dereferencing a is implicitly converted to false, incrementing a at each step.
Basically, you'll walk the array until you get to the NUL terminator that's at the end of your string (\0) because the NUL terminator implicitly converts to false - other characters do not.
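As a small illustration of that loop shape, here is a sketch that uses it to count characters, which is essentially what strlen does. The function name count_chars is made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts characters by walking a pointer until it
   reaches the NUL terminator, using the loop form from the question. */
size_t count_chars(const char *str)
{
    assert(str != NULL);
    const char *a;
    size_t n = 0;
    for (a = str; *a; a++)   /* *a is 0 (false) only at the terminator */
        n++;
    return n;
}
```

After the loop, a points at the terminator itself, so a - str gives the same count without a separate counter.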
Using the method above, how could I parse it and get the last integer of the string (2000), knowing that the last integer (2000) may vary.
You're going to want to look for the last space before the \0, then you're going to want to call a function to convert the remaining characters to an integer. See strtol.
Consider this approach:
find the end of the string (using that loop)
search backwards for a space.
use that to call strtol.
for (a = str; *a; a++); // Find the end.
while (*a != ' ') a--; // Move back to the space.
a++; // Move one past the space.
int result = strtol(a, NULL, 10);
Or alternatively, just keep track of the start of the last token:
const char* start = str;
for (a = str; *a; a++) { // Until you hit the end of the string.
if (*a == ' ') start = a; // New token, reassign start.
}
int result = strtol(start, NULL, 10);
This version has the benefit of not requiring a space in the string.
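A variation on the backwards search, sketched under assumptions: instead of looking for a space, walk back from the end over any non-digits, then over the digits themselves, and hand that position to strtol. The name last_integer and the found/not-found return value are hypothetical, and a leading minus sign is deliberately ignored here.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: stores the last run of decimal digits in str into
   *out and returns 1, or returns 0 if str contains no digits at all.
   Signs are not handled. */
int last_integer(const char *str, long *out)
{
    assert(str != NULL && out != NULL);
    const char *end = str + strlen(str);

    /* Skip anything after the last digit (trailing spaces, text, ...). */
    while (end > str && !isdigit((unsigned char)end[-1]))
        end--;
    if (end == str)
        return 0;                       /* no digits anywhere */

    /* Walk back to the first digit of that run. */
    const char *start = end;
    while (start > str && isdigit((unsigned char)start[-1]))
        start--;

    *out = strtol(start, NULL, 10);     /* stops at the first non-digit */
    return 1;
}
```

Both pointer checks compare against str before dereferencing, so the scan cannot run off the front of the string the way an unguarded a-- loop can.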
You just need to implement a simple state machine with two states, e.g
#include <ctype.h>
int num = 0; // the final int value will be contained here
int state = 0; // state == 0 == not parsing int, state == 1 == parsing int
for (i = 0; i < strlen(s); ++i)
{
if (state == 0) // if currently in state 0, i.e. not parsing int
{
if (isdigit(s[i])) // if we just found the first digit character of an int
{
num = s[i] - '0'; // discard any old int value and start accumulating new value
state = 1; // we are now in state 1
}
// otherwise do nothing and remain in state 0
}
else // currently in state 1, i.e. parsing int
{
if (isdigit(s[i])) // if this is another digit character
{
num = num * 10 + s[i] - '0'; // continue accumulating int
// remain in state 1...
}
else // no longer parsing int
{
state = 0; // return to state 0
}
}
}
I know this has been answered already but all the answers thus far are recreating code that is available in the Standard C Library. Here is what I would use by taking advantage of strrchr()
#include <string.h>
#include <stdio.h>
int main(void)
{
const char* input = "some string here 100 2000";
char* p;
long l = 0;
if(p = strrchr(input, ' '))
l = strtol(p+1, NULL, 10);
printf("%ld\n", l);
return 0;
}
Output
2000
for (a = str; * a; a++)...
is equivalent to
a=str;
while(*a!='\0') //'\0' is NUL, don't confuse it with NULL which is a macro
{
....
a++;
}
The loop you've presented just goes through all characters (string is a pointer to the array of 1-byte chars that ends with 0). For parsing you should use sscanf or better C++'s string and string stream.

Resources