Character alignment weird output - c

My code is at the bottom. I'm trying to take two strings and change them as per the edit transcript that I receive. I wrote my code but I don't understand why I'm getting such a weird output. My goal is to first store the values of the two strings then make them into a 2D array afterwards, but I'm failing on part one of that goal. Here's the problem:
Create a function that meets the following:
Input: an edit transcript, and 2 original strings (3 strings)
Output: a 2 d array containing the two alignments post edit
Example:
s1 = "vintner"
s2 = "writers"
trans = "RIMDMDMMI"
R stands for "replace"
I stands for "insert"
M stands for "match"
D stands for "delete"
Answer:
alignment={"v_intner_",
"wri_t_ers"}; //return a 2d array
Function prototype:
char** getAlignment(char* s1, char* s2, char* s3);
Here's my code below:
char TestS1[] = "vintner";
char TestS2[] = "writers";
char TestS3[] = "RIMDMDMMI";
char twoDarray[2][10];
char** getAlignment(char* s1, char* s2, char* s3){
char transTemp[n];
char s1Temp[n];
char s2Temp[n];
char sOne[n];
char sTwo[n];
strcpy(sOne, s1);
strcpy(sTwo, s2);
int jj;
strcpy(transTemp, s3);
int kk;
for(jj=0, kk=0; jj<n, kk<n; jj++, kk++){
if(transTemp[jj]=='R')
{
s1Temp[kk] = sOne[jj];
s2Temp[kk] = sTwo[jj];
}
if(transTemp[jj]=='I'){
s1Temp[kk] = '_';
s1Temp[kk+1] = sOne[jj];
s2Temp[kk] = sTwo[jj];
kk++;
}
if(transTemp[jj] == 'M'){
s1Temp[kk] = sOne[jj];
s2Temp[kk] = sTwo[jj];
}
if(transTemp[jj] == 'D'){
s2Temp[kk] = '_';
s2Temp[kk+1] = sTwo[jj];
s1Temp[kk] = sOne[jj];
kk++;
}
}
printf("\ns1Temp = %s\n", s1Temp);
printf("\ns2Temp = %s\n", s2Temp);
return 0;
}
main()
{
printf("The new method returns: ", getAlignment(TestS1,TestS2,TestS3));
return 0;
}

Your question really has two parts: How can I return two strings? And why don't I get the desired output?
Strings in C are character arrays. You seldom return strings. It is more common to pass a character array to a function, together with its maximum length, and fill that array. Several of the functions in <string.h>, such as strncpy and strncat, do that. A good design model is, in my opinion, snprintf: It fills the buffer, takes care not to overflow it, ensures that the result is properly null-terminated and returns the number of characters that would have been written had the buffer been big enough. That last property allows you to pass a length of zero (and, as a special case, a NULL pointer) to find out how many chars you need and allocate memory as appropriate.
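The measuring idiom described above can be sketched roughly like this. The helper name format_pair and the "%s/%s" format are made up for illustration; only snprintf, malloc and free are standard.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: formats "<a>/<b>" into freshly allocated memory.
   Returns NULL on failure; the caller must free() the result. */
char *format_pair(const char *a, const char *b)
{
    /* First call: size 0 and a NULL buffer only measure the output. */
    int needed = snprintf(NULL, 0, "%s/%s", a, b);
    if (needed < 0) return NULL;

    char *buf = malloc((size_t)needed + 1);   /* +1 for the terminator */
    if (buf == NULL) return NULL;

    /* Second call: the buffer is now known to be large enough. */
    snprintf(buf, (size_t)needed + 1, "%s/%s", a, b);
    return buf;
}
```

Calling format_pair("vintner", "writers") should yield a heap string that the caller later passes to free().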
So the prototype for your function could look like this:
int getAlignment(char *res1, char *res2, size_t n,
const char* s1, const char* s2, const char* trans);
Except that the resulting strings could be of different length in your case.
You can also return strings, but you'll either have to return new memory allocated with malloc on the heap, which means the client code must explicitly free it, or pointers into already existing memory. You can, of course, only return one string.
You can return multiple values from a function as a struct. Structs do not decay into pointers when passing them to or returning them from functions. I'll use that approach in my example below.
As for the second question: Your main problem is that you have three strings - two source strings and one translation string - but keep only two indices. All strings are traversed independently; there is no synchronisation between the strings' indices.
You append to the result strings as you go. The "driving" string is the translation string, so you should traverse only that with the main loop.
Another thing to note is that you don't need to make copies of the source strings. This is not only unnecessary, it is also dangerous, because strcpy could overflow the buffers. Guarding against overflow with strncpy could truncate the input strings.
I've updated your implementation:
#include <stdio.h>
#include <string.h>
#define N 10
struct Result {
char res1[N];
char res2[N];
};
struct Result getAlignment(const char* s1, const char* s2, const char* trans)
{
struct Result res;
int j1 = 0; // index into s1
int j2 = 0; // index into s2
int n = N - 1; // leave 1 char for null terminator
while (*trans) {
if (*trans == 'R') {
if (j1 < n) res.res1[j1++] = *s1++;
if (j2 < n) res.res2[j2++] = *s2++;
}
if (*trans == 'I'){ // insert: gap in s1, next character of s2
if (j1 < n) res.res1[j1++] = '_';
if (j2 < n) res.res2[j2++] = *s2++;
}
if (*trans == 'M') {
if (j1 < n) res.res1[j1++] = *s1++;
if (j2 < n) res.res2[j2++] = *s2++;
}
if (*trans == 'D') { // delete: next character of s1, gap in s2
if (j1 < n) res.res1[j1++] = *s1++;
if (j2 < n) res.res2[j2++] = '_';
}
trans++;
}
// null-terminate strings
res.res1[j1] = '\0';
res.res2[j2] = '\0';
return res;
}
int main()
{
char *str1 = "vintner";
char *str2 = "writers";
char *trans = "RIMDMDMMI";
struct Result res = getAlignment(str1, str2, trans);
printf("%s\n%s\n\n", res.res1, res.res2);
return 0;
}
Things to note:
The translation string is traversed via pointer, which saves an index.
The result strings are appended to only if there is enough space. You can change N to 5 and see how the result strings are truncated after 4 characters, thus losing information, but preventing buffer overflows.
Both result-string indices and source string pointers are incremented as you go.
The source strings are only read from. (That's why copying doesn't make sense.) So they should be const char * in the function signature.
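For comparison, here is a rough sketch of the heap-allocating alternative mentioned earlier, shaped like the char** prototype from the question. The name getAlignmentAlloc is hypothetical, the parameters are const-qualified (unlike the original prototype), and padding with '_' when a source string runs out early is an assumption, as is treating any unknown transcript character like 'R' or 'M'.

```c
#include <stdlib.h>
#include <string.h>

/* Sketch: returns a malloc'ed array of two malloc'ed strings, or NULL.
   The caller frees result[0], result[1] and then result itself. */
char **getAlignmentAlloc(const char *s1, const char *s2, const char *trans)
{
    size_t len = strlen(trans);          /* one output column per operation */
    char **res = malloc(2 * sizeof *res);
    if (res == NULL) return NULL;
    res[0] = malloc(len + 1);
    res[1] = malloc(len + 1);
    if (res[0] == NULL || res[1] == NULL) {
        free(res[0]);
        free(res[1]);
        free(res);
        return NULL;
    }
    for (size_t k = 0; k < len; k++) {
        /* 'I' consumes only s2, 'D' consumes only s1, anything else both.
           Assumption: a source that runs out early is padded with '_'. */
        int use1 = (trans[k] != 'I') && *s1 != '\0';
        int use2 = (trans[k] != 'D') && *s2 != '\0';
        res[0][k] = use1 ? *s1++ : '_';
        res[1][k] = use2 ? *s2++ : '_';
    }
    res[0][len] = '\0';
    res[1][len] = '\0';
    return res;
}
```

Because every transcript operation produces exactly one column in each row, both result strings have the length of the transcript, so the sizes can be computed before anything is written.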

Related

Manipulating a string and rewriting it by the function output

For some functions for string manipulation, I try to rewrite the function output onto the original string. I came up with the general scheme of
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
char *char_repeater(char *str, char ch)
{
int tmp_len = strlen(str) + 1; // initial size of tmp
char *tmp = (char *)malloc(tmp_len); // initial size of tmp
// the process is normally too complicated to calculate the final length here
int j = 0;
for (int i = 0; i < strlen(str); i++)
{
tmp[j] = str[i];
j++;
if (str[i] == ch)
{
tmp[j] = str[i];
j++;
}
if (j > tmp_len)
{
tmp_len *= 2; // growth factor
tmp = realloc(tmp, tmp_len);
}
}
tmp[j] = 0;
char *output = (char *)malloc(strlen(tmp) + 1);
// output matching the final string length
strncpy(output, tmp, strlen(tmp));
output[strlen(tmp)] = 0;
free(tmp); // Is it necessary?
return output;
}
int main()
{
char *str = "This is a test";
str = char_repeater(str, 'i');
puts(str);
free(str);
return 0;
}
Although it works on simple tests, I am not sure if I am on the right track.
Is this approach safe overall?
Of course, we do not re-write the string. We simply write new data (array of the characters) at the same pointer. If output is longer than str, it will rewrite the data previously written at str, but if output is shorter, the old data remains, and we would have a memory leak. How can we free(str) within the function before outputting to its pointer?
A pair of pointers can be used to iterate through the string.
When a matching character is found, increment the length.
Allocate output as needed.
Iterate through the string again and assign the characters.
This could be done in place if str was malloced in main.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
char *char_repeater(char *str, char ch)
{
int tmp_len = strlen(str) + 1; // initial size of tmp
char *find = str;
while ( *find) // not at terminating zero
{
if ( *find == ch) // match
{
tmp_len++; // add one
}
++find; // advance pointer
}
char *output = NULL;
if ( NULL == ( output = malloc(tmp_len)))
{
fprintf ( stderr, "malloc problem\n");
exit ( 1);
}
// output matching the final string length
char *store = output; // to advance through output
find = str; // reset pointer
while ( *find) // not at terminating zero
{
*store = *find; // assign
if ( *find == ch) // match
{
++store; // advance pointer
*store = ch; // assign
}
++store; // advance pointer
++find;
}
*store = 0; // terminate
return output;
}
int main()
{
char *str = "This is a test";
str = char_repeater(str, 'i');
puts(str);
free(str);
return 0;
}
For starters the function should be declared like
char * char_repeater( const char *s, char c );
because the function does not change the passed string.
Your function is unsafe and inefficient, not least because it performs several dynamic memory allocations. You need to check that each dynamic memory allocation was successful. The function strlen is also called too often.
Also this code snippet
tmp[j] = str[i];
j++;
if (str[i] == ch)
{
tmp[j] = str[i];
j++;
}
if (j > tmp_len)
//...
can invoke undefined behavior. Imagine that the source string contains only one letter 'i'. In this case the variable tmp_len is equal to 2. So tmp[0] will be equal to 'i' and tmp[1] also will be equal to 'i'. In this case j, which is equal to 2, will not be greater than tmp_len. As a result this statement
tmp[j] = 0;
will write outside the allocated memory.
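If the growing-buffer approach is kept, the capacity check has to happen before the writes and has to leave room for the terminator. A minimal sketch under those assumptions (the name char_repeater_grow and the growth formula are illustrative choices, not taken from the question):

```c
#include <stdlib.h>
#include <string.h>

/* Sketch of the growing-buffer approach with the capacity check done
   before writing. Returns NULL on allocation failure; caller frees. */
char *char_repeater_grow(const char *str, char ch)
{
    size_t cap = strlen(str) + 1;
    char *tmp = malloc(cap);
    if (tmp == NULL) return NULL;

    size_t j = 0;
    for (size_t i = 0; str[i] != '\0'; i++) {
        /* Up to two characters now plus the terminator later. */
        if (j + 3 > cap) {
            size_t new_cap = cap * 2 + 2;
            char *grown = realloc(tmp, new_cap);
            if (grown == NULL) {
                free(tmp);
                return NULL;
            }
            tmp = grown;
            cap = new_cap;
        }
        tmp[j++] = str[i];
        if (str[i] == ch) tmp[j++] = ch;
    }
    tmp[j] = '\0';   /* always in bounds: j + 1 <= cap after each iteration */
    return tmp;
}
```

The realloc result goes into a temporary pointer first, so the old block can still be freed if the call fails.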
And it is a bad idea to reassign the pointer str
char *str = "This is a test";
str = char_repeater(str, 'i');
As for your question whether you need to free the dynamically allocated array tmp
free(tmp); // Is it necessary?
then of course you need to free it because you allocated a new array for the result string
char *output = (char *)malloc(strlen(tmp) + 1);
As for your other question
but if output is shorter, the old data remains, and we would have a
memory leak. How can we free(str) within the function before
outputting to its pointer?
then it does not make sense. The function dynamically creates a new character array that you need to free, and the address of the allocated array is assigned to the pointer str in main, which, as I already mentioned, is not a good idea.
You should first compute the length of the result array that will contain the duplicated characters and then allocate memory only once.
Here is a demonstration program.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char * char_repeater( const char *s, char c )
{
size_t n = 0;
for ( const char *p = s; ( p = strchr( p, c ) ) != NULL; ++p )
{
++n;
}
char *result = malloc( strlen( s ) + 1 + n );
if ( result != NULL )
{
if ( n == 0 )
{
strcpy( result, s );
}
else
{
char *p = result;
do
{
*p++ = *s;
if (*s == c ) *p++ = c;
} while ( *s++ );
}
}
return result;
}
int main( void )
{
const char *s = "This is a test";
puts( s );
char *result = char_repeater( s, 'i' );
if ( result != NULL ) puts( result );
free( result );
}
The program output is
This is a test
Thiis iis a test
My kneejerk reaction is to dislike the design. But I have reasons.
First, realloc() is actually quite efficient. If you are just allocating a few extra bytes every loop, then chances are that the standard library implementation simply increases the internal bytecount value associated with your memory. Caveats are:
Interleaving memory management. Your function here doesn’t have any, but should you start calling other routines then keeping track of all that becomes an issue. Anything that calls other memory management routines can lead to the next problem:
Fragmented memory. If at any time the available block is too small for your new request, then a much more expensive operation to obtain more memory and copy everything over becomes an issue.
Algorithmic issues are:
Mixing memory management in increases the complexity of your code.
Every occurrence of c invokes a function call with potential to be expensive. You cannot control when it is expensive and when it is not.
Worst-case options (char_repeater( "aaaaaaaaaa", 'a' )) trigger worst-case potentialities.
My recommendation is to simply make two passes.
This passes several smell tests:
Algorithmic complexity is broken down into two simpler parts:
counting space required, and
allocating and copying.
Worst-case scenarios for allocation/reallocation are reduced to a single call to malloc().
Issues with very large strings are reduced:
You need at most space for 2 large strings (not 3, possibly repeated)
Page fault / cache boundary issues are similar (or the same) for both methods
Considering there are no real downsides to using a two-pass approach, I think that using a simpler algorithm is reasonable. Here’s code:
#include <stdio.h>
#include <stdlib.h>
char * char_repeater( const char * s, char c )
{
// FIRST PASS
// (1) count occurrences of c in s
size_t number_of_c = 0;
const char * p = s;
while (*p) number_of_c += (*p++ == c);
// (2) get strlen s
size_t length_of_s = p - s;
// SECOND PASS
// (3) allocate space for the resulting string
char * dest = malloc( length_of_s + number_of_c + 1 );
// (4) copy s -> dest, duplicating every occurrence of c
if (dest)
{
char * d = dest;
while (*s)
if ((*d++ = *s++) == c)
*d++ = c;
*d = '\0';
}
return dest;
}
int main(void)
{
char * s = char_repeater( "Hello world!", 'o' );
puts( s );
free( s );
return 0;
}
As always, know your data
Whether or not a two-pass approach actually is better than a realloc() approach depends on more factors than what is evident in a posting on the internet.
Nevertheless, I would wager that for general purpose strings that this is a better choice.
But, even if it isn’t, I would argue that a simpler algorithm, splitting tasks into trivial sub-tasks, is far easier to read and maintain. You should start writing tricky algorithms only if you have use-case profiling saying you need to spend more attention on it.
Without that, readability and maintainability trump all other concerns.

Appending char to C array

I have a string declared as such:
char *mode_s = (char *)calloc(MODE_S_LEN, sizeof(char));
How can I add a char to the end of the array?
Let's assume "first available position" means index 0.
char *mode_s = (char *)calloc(MODE_S_LEN, sizeof(char));
*mode_s='a';
To store a character at an arbitrary index n
*(mode_s+n)='b';
Use pointer arithmetic, as demonstrated above, which is equivalent to
mode_s[n]='b';
One sees that the first case simply means that n=0.
If you wish to eliminate incrementing the counter, as specified in the comment below, you can write a data structure and a supporting function that fits your needs. A simple one would be
typedef struct modeA{
int size;
int index;
char *mode_s;
}modeA;
The supporting function could be
int add(modeA* a, char toAdd){
if(a->size==a->index) return -1;
a->mode_s[a->index]=toAdd;
a->index++;
return 0;
}
It returns 0 when the add was successful, and -1 when one runs out of space.
Other functions you might need can be coded in a similar manner. Note that as C is not object oriented, the data structure has to be passed to the function as a parameter.
Finally, you could code a function creating an instance
modeA genModeA(int size){
modeA tmp;
tmp.mode_s=(char *)calloc(size, sizeof(char));
tmp.size=size;
tmp.index=0;
return tmp;
}
Thus using it with no need to manually increment the counter
modeA tmp=genModeA(MODE_S_LEN);
add(&tmp,'c');
There is no standard function to concatenate a character to a string in C. You can easily define such a function:
#include <string.h>
char *strcatc(char *str, char c) {
size_t len = strlen(str);
str[len++] = c;
str[len] = '\0';
return str;
}
This function only works if str is allocated or defined with a larger size than its length + 1, ie if there is available space at its end. In your example, mode_s is allocated with a size of MODE_S_LEN, so you can put MODE_S_LEN-1 chars into it:
char *mode_s = calloc(MODE_S_LEN, sizeof(*mode_s));
for (int i = 0; i < MODE_S_LEN - 1; i++) {
strcatc(mode_s, 'X');
}
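Since strcatc itself cannot know how large the buffer is, a bounded variant that takes the capacity as a parameter is one way to make the "enough space" condition explicit. This is only a sketch; the name strcatc_bounded and its return convention are assumptions.

```c
#include <string.h>

/* Hypothetical bounded variant: cap is the total size of the buffer
   that str points to. Returns 0 on success, -1 if there is no room. */
int strcatc_bounded(char *str, size_t cap, char c)
{
    size_t len = strlen(str);
    if (len + 2 > cap) return -1;   /* need room for c and '\0' */
    str[len] = c;
    str[len + 1] = '\0';
    return 0;
}
```

With mode_s from the question, the call would look like strcatc_bounded(mode_s, MODE_S_LEN, 'X'), and a return value of -1 signals that the buffer is full.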
char newchar = 'a'; //or getch() from keyboard
size_t len = strlen(mode_s);
//allocate a larger zeroed buffer: old text + new char + terminator
char *mode_sNew = calloc(len + 2, sizeof(char));
//copy the old string (check mode_sNew != NULL in real code):
memcpy(mode_sNew, mode_s, len);
//put your char at the end; calloc already zeroed the terminator:
mode_sNew[len] = newchar;
//free old memory:
free(mode_s);
//reassign to the old string:
mode_s = mode_sNew;
//in a loop you can add as many characters as you want. To add several characters at once, allocate room for all of them and assign each one to its own position

loop to reverse string in C

So I've looked around on SO and can't find code that answers my question. I have written a function that is supposed to reverse a string as input in cmd-line. Here is the function:
void reverse (char string[]) {
int x;
int i = 0;
char line[strlen(string)];
for (x = strlen(string) - 1; x > 0; x--) {
char tmp = string[x];
line[i] = tmp;
i++;
}
string = line;
}
When I call my reverse() function, the string stays the same. i.e., 'abc' remains 'abc'
If more info is needed or question is inappropriate, let me know.
Thanks!!
You're declaring your line array one char too short; remember the null at the end.
Another point, it should be for (x = strlen(string) - 1; x >= 0; x--) since you need to copy the character at 0.
void reverse (char string[]) {
int x;
int i = 0;
char line[strlen(string) + 1];
for (x = strlen(string) - 1; x >= 0; x--) {
char tmp = string[x];
line[i] = tmp;
i++;
}
for(x = 0; x < strlen(string); x++)
{
string[x] = line[x];
}
}
Note that this function will cause an apocalypse when passed an empty string or a string literal (as Bobby Sacamano said).
A suggestion: use a signature such as void reverse(char source[], char dest[]) and check whether the source string is empty.
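A minimal sketch of that two-buffer signature could look like the following. The name reverse_into is made up here, and the function assumes the caller provides a dest buffer with room for strlen(source) + 1 characters that does not overlap source.

```c
#include <string.h>

/* Sketch: writes the reverse of source into dest.
   dest must hold strlen(source) + 1 chars and must not overlap source. */
void reverse_into(const char source[], char dest[])
{
    size_t len = strlen(source);
    for (size_t i = 0; i < len; i++)
        dest[i] = source[len - 1 - i];
    dest[len] = '\0';   /* also covers the empty string: dest becomes "" */
}
```

Because source is only read, string literals can be passed safely as the first argument.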
I think that your answer is almost correct. You don't actually need an extra slot for the null character in line. You just need two minor changes:
Change the assignment statement at the bottom of the procedure to a memcpy.
Change the loop condition to x >= 0.
So, your correct code is this:
void reverse (char string[]) {
int x;
int i = 0;
char line[strlen(string)];
for (x = strlen(string) - 1; x >= 0; x--) {
char tmp = string[x];
line[i] = tmp;
i++;
}
memcpy(string, line, sizeof(char) * strlen(string)); /* line has no terminator, so measure string instead */
}
Since you want to reverse a string, you first must decide whether you want to reverse a copy of the string, or reverse the string in-situ (in place). Since you asked about this in 'C' context, assume you mean to change the existing string (reverse the existing string) and make a copy of the string in the calling function if you want to preserve the original.
You will need the string library
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
Array indexing works, and this version takes that approach,
/* this first version uses array indexing */
char*
streverse_a(char string[])
{
int len; /*how big is your string*/
int ndx; /*because 'i' is hard to search for*/
char tmp; /*hold character to swap*/
if(!string) return(string); /*avoid NULL*/
if( (len=strlen(string)) < 2 ) return(string); /*one and done*/
for( ndx=0; ndx<len/2; ndx++ ) {
tmp=string[ndx];
string[ndx]=string[len-1-ndx];
string[len-1-ndx]=tmp;
}
return(string);
}
But you can do the same with pointers,
/* this is how K&R would write the function with pointers */
char*
streverse(char* sp)
{
int len, ndx; /*how big is your string */
char tmp, *bp, *ep; /*pointers to begin/end, swap temporary*/
if(!sp) return(sp); /*avoid NULL*/
if( (len=strlen(bp=sp)) < 2 ) return(sp); /*one and done*/
for( ep=bp+len-1; bp<ep; bp++, ep-- ) {
tmp=*bp; *bp=*ep; *ep=tmp; /*swap*/
}
return(sp);
}
(No, really, the compiler does not charge less for returning void.)
And because you always test your code,
char s[][100] = {
"", "A", "AB", "ABC", "ABCD", "ABCDE",
"hello, world", "goodbye, cruel world", "pwnz0r3d", "enough"
};
int
main()
{
/* suppose your string is declared as 'a' */
char a[100];
strcpy(a,"reverse string");
/*make a copy of 'a', declared the same as a[]*/
char b[100];
strcpy(b,a);
streverse_a(b);
printf("a:%s, r:%s\n",a,b);
/*duplicate 'a'*/
char *rp = strdup(a);
streverse(rp);
printf("a:%s, r:%s\n",a,rp);
free(rp);
int ndx;
for( ndx=0; ndx<10; ++ndx ) {
/*make a copy of 's', declared the same as s[]*/
char b[100];
strcpy(b,s[ndx]);
streverse_a(b);
printf("s:%s, r:%s\n",s[ndx],b);
/*duplicate 's'*/
char *rp = strdup(s[ndx]);
streverse(rp);
printf("s:%s, r:%s\n",s[ndx],rp);
free(rp);
}
}
The last line in your code does nothing
string = line;
Parameters are passed by value, so if you change their value, that is only local to the function. Pointers are the value of the address of memory they are pointing to. If you want to modify the pointer that the function was passed, you need to take a pointer to that pointer.
Here is a short example of how you could do that.
void reverse (char **string) {
char *line = malloc(strlen(*string) + 1);
//automatic arrays are deallocated once the function ends
//so line needs to be dynamically or statically allocated
// do something to line
*string = line;
}
The obvious issue with this is that you can initialize the string with static memory, then this method will replace the static memory with dynamic memory, and then you'll have to free the dynamic memory. There's nothing functionally wrong with that, it's just a bit dangerous, since accidentally freeing the string literal is illegal.
char *test = "hello";
reverse(&test);
free(test); //this is pretty scary
Also, if test was allocated as dynamic memory, the pointer to it would be lost and then it would become a memory leak.

How can I make a function to remove double letters in C?

I am trying to make a function that removes double letters from a string. The function is only supposed to remove double letters next to each other, not in the whole string, e.g. 'aabbaa' would become 'aba' (not 'ab'). I'm fairly new to C programming and don't fully understand pointers etc., so I need some help. Below is what I have so far. It does not work at all, and I have no idea what to return, since I get an error when I try to return string[]:
char doubleletter( char *string[] ) {
char surname[25];
int i;
for((i = 1) ; string[i] != '\0' ; i++) {
if (string[i] == string[(i-1)]) { //Supposed to compare the ith letter in array with one before
string[i] = '\0' ; //Supposed to swap duplicate chars with null
}
}
surname[25] = string;
return surname ;
Try the following. It is clear, simple and professional-looking code. :)
#include <stdio.h>
char * unique( char *s )
{
for ( char *p = s, *q = s; *q++; )
{
if ( *p != *q ) *++p = *q;
}
return s;
}
int main(void)
{
char s[] = "aabbaa";
puts( unique( s ) );
return 0;
}
The output is
aba
The function can also be rewritten in the following way to avoid unnecessary copying.
char * unique( char *s )
{
for ( char *p = s, *q = s; *q++; )
{
if ( *p != *q )
{
( void )( ( ++p != q ) && ( *p = *q ) );
}
}
return s;
}
Or
char * unique( char *s )
{
for ( char *p = s, *q = s; *q++; )
{
if ( *p != *q && ++p != q ) *p = *q;
}
return s;
}
It seems that the last implementation is the best. :)
First of all, delete those parentheses around i = 1 in the for loop (why did you put them there in the first place?).
Secondly if you put \0 in the middle of the string, the string will just get shorter.
\0 terminates array (string) in C so if you have:
ababaabababa
and you replace second 'a' in pair with \0:
ababa\0bababa
effectively for compiler it will be like you just cut this string to:
ababa
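The truncation effect can be checked with a tiny sketch; the helper name length_after_cut is hypothetical and exists only to make the behavior observable.

```c
#include <string.h>

/* Overwrites position pos with '\0' and returns what strlen() reports.
   The caller guarantees that pos lies inside the string. */
size_t length_after_cut(char *s, size_t pos)
{
    s[pos] = '\0';        /* the bytes after pos stay in memory ... */
    return strlen(s);     /* ... but strlen stops at the first '\0' */
}
```

For a writable array holding "ababaabababa", cutting at index 5 leaves the later characters in the array, yet every string function now sees only "ababa".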
The third error is probably that you are passing an array of pointers to the function here:
char *string[]
This is equivalent to passing char **string, so essentially you are passing an array of strings while you only want to pass a single string (which means a pointer to its first character: char *string or, equivalently, char string[]).
Next thing: you are making internal assumption that passed string will have less than 24 chars (+ \0) but you don't check it anywhere.
I guess easiest way (though maybe not the most clever) to remove duplicated chars is to copy in this for loop passed string to another one, omitting repeated characters.
One example, It does not modify input string and returns a new dynamically allocated string. Pretty self explanatory I think:
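char *new_string_without_dups(const char *input_str, size_t len)
{
size_t i;
size_t j = 0;
char tmpstr[len+1]; /* variable-length array: cannot take an initializer */
for (i = 0; i < len; i++) {
/* skip a character that repeats its predecessor; index 0 is always kept */
if (i > 0 && input_str[i] == input_str[i-1]) {
continue;
}
tmpstr[j] = input_str[i];
j++;
}
tmpstr[j] = '\0';
return strdup(tmpstr);
}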
Don't forget to free the returned string after usage.
Note that there are several ways to adapt/improve this. One thing now is that it requires C99 std due to array size not being known at compile time. Another is that you can get rid of the len argument if you guarantee a \0 terminated string as input. I'll leave those as exercises.
Your idea behind the code is right, but you are making two fundamental mistakes:
You return a char [] from a function that has char as return type. char [], char * and char are three different types, even though in this case char [] and char * would behave identically. However you would have to return char * from your function to be able to return a string.
You return automatically allocated memory. In other languages where memory is reference counted this is OK. In C this causes undefined behavior. You cannot use automatic memory from within a function outside this very function. The memory is considered free after the function exits and will be reused, i.e. your value will be overwritten. You have to either pass a buffer in to hold the result, or do a dynamic allocation within the function with malloc(). Which one you do is a matter of style. You could also reuse the input buffer, but writing the function like that is undesirable in any case where you need to preserve the input, and it will make it impossible for you to pass const char* into the function, i.e. you would not be able to do something like this:
const char *str = "abbc";
... doubleletter(str,...);
If I had to write the function I would probably call it something like this:
int doubleletter (const char *in, size_t inlen, char *out, size_t outlen){
int i;
int j = 0;
if (!inlen) return 0;
if (outlen < 2) return -1; // need room for one character and the terminator
out [j++] = in[0];
for (i = 1; i < inlen; ++i){
if (in[i - 1] != in[i]){
if ((size_t)j + 1 >= outlen) return -1; // keep room for the terminator
out[j++] = in[i];
}
}
out[j] = '\0';
return j - 1;
}
int main(void) {
const char *str1 = "aabbaa";
char out[25];
int ret = doubleletter(str1, strlen(str1), out, sizeof(out)/sizeof(out[0]));
printf("Result: %s", out);
return 0;
}
I would recommend using 2 indices to modify the string in-place:
void remove_doubles(char *str)
{
// if string is 1 or 0 length do nothing.
if(strlen(str)<=1)return;
int i=0; //index (new string)
int j=1; //index (original string)
// loop until end of string
while(str[j]!=0)
{
// as soon as we find a different letter,
// copy it to our new string and increase the index.
if(str[i]!=str[j])
{
i++;
str[i]=str[j];
}
// increase index on original/old string
j++;
}
// mark new end of string
str[i+1]='\0';
}

String (array) capacity via pointer

I am trying to create a sub-routine that inserts a string into another string. I want to check that the host string is going to have enough capacity to hold all the characters and if not return an error integer. This requires using something like sizeof but that can be called using a pointer. My code is below and I would be very grateful for any help.
#include<stdio.h>
#include<conio.h>
//#include "string.h"
int string_into_string(char* host_string, char* guest_string, int insertion_point);
int main(void) {
char string_one[21] = "Hello mother"; //12 characters
char string_two[21] = "dearest "; //8 characters
int c;
c = string_into_string(string_one, string_two, 6);
printf("Sub-routine string_into_string returned %d and creates the string: %s\n", c, string_one);
getch();
return 0;
}
int string_into_string(char* host_string, char* guest_string, int insertion_point) {
int i, starting_length_of_host_string;
//check host_string is long enough
if(strlen(host_string) + strlen(guest_string) >= sizeof(host_string) + 1) {
//host_string is too short
sprintf(host_string, "String too short(%d)!", sizeof(host_string));
return -1;
}
starting_length_of_host_string = strlen(host_string);
for(i = starting_length_of_host_string; i >= insertion_point; i--) { //make room
host_string[i + strlen(guest_string)] = host_string[i];
}
//i++;
//host_string[i] = '\0';
for(i = 1; i <= strlen(guest_string); i++) { //insert
host_string[i + insertion_point - 1] = guest_string[i - 1];
}
i = strlen(guest_string) + starting_length_of_host_string;
host_string[i] = '\0';
return strlen(host_string);
}
C does not allow you to pass arrays by value as function arguments, so all array arguments of type T[N] decay to pointers of type T*. You must pass the size information manually. However, you can use sizeof at the call site to determine the size of an array:
int string_into_string(char * dst, size_t dstlen, char const * src, size_t srclen, size_t offset, size_t len);
char string_one[21] = "Hello mother";
char string_two[21] = "dearest ";
string_into_string(string_one, sizeof string_one, // gives 21
string_two, strlen(string_two), // gives 8
6, strlen(string_two));
If you are creating dynamic arrays with malloc, you have to store the size information somewhere separately anyway, so this idiom will still fit.
(Beware that sizeof(T[N]) == N * sizeof(T), and I've used the fact that sizeof(char) == 1 to simplify the code.)
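To make the idea concrete, here is a rough sketch of a capacity-checked insert. It uses a simplified, hypothetical signature (insert_string with four parameters) rather than the six-parameter prototype above, and relies only on the standard strlen, memmove and memcpy.

```c
#include <string.h>

/* Hypothetical simplified signature: dst_cap is the total size of dst.
   Returns the new length, or -1 if the result would not fit or the
   offset lies beyond the end of dst. */
int insert_string(char *dst, size_t dst_cap, const char *src, size_t offset)
{
    size_t dst_len = strlen(dst);
    size_t src_len = strlen(src);
    if (offset > dst_len) return -1;
    if (dst_len + src_len + 1 > dst_cap) return -1;

    /* Shift the tail (including '\0') to the right; memmove allows overlap. */
    memmove(dst + offset + src_len, dst + offset, dst_len - offset + 1);
    /* Drop the guest string into the gap; no terminator is copied. */
    memcpy(dst + offset, src, src_len);
    return (int)(dst_len + src_len);
}
```

At the call site the capacity comes from sizeof, for example insert_string(string_one, sizeof string_one, string_two, 6), which only works while string_one is still an array and not a pointer.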
This code needs a whole lot more error handling but should do what you need without needing any obscure loops. To speed it up, you could also pass the size of the source string as a parameter, so the function does not need to calculate it at run time.
#include <string.h>
#include <stdlib.h>
#include <stdio.h>
signed int string_into_string (char* dest_buf,
int dest_size,
const char* source_str,
int insert_index)
{
int source_str_size;
char* dest_buf_backup;
if (insert_index >= dest_size) // sanity check of parameters
{
return -1;
}
// save data from the original buffer into temporary backup buffer
dest_buf_backup = malloc (dest_size - insert_index);
memcpy (dest_buf_backup,
&dest_buf[insert_index],
dest_size - insert_index);
source_str_size = strlen(source_str);
// copy new data into the destination buffer
strncpy (&dest_buf[insert_index],
source_str,
source_str_size);
// restore old data at the end
strcpy(&dest_buf[insert_index + source_str_size],
dest_buf_backup);
// delete temporary buffer
free(dest_buf_backup);
return 0; // success
}
int main()
{
char string_one[21] = "Hello mother"; //12 characters
char string_two[21] = "dearest "; //8 characters
(void) string_into_string (string_one,
sizeof(string_one),
string_two,
6);
puts(string_one);
return 0;
}
I tried using a macro and changing string_into_string to include the requirement for a size argument, but I still strike out when I call the function from within another function. I tried using the following Macro:
#define STRING_INTO_STRING( a, b, c) (string_into_string2(a, sizeof(a), b, c))
The other function which causes failure is below. This fails because string has already become the pointer and therefore has size 4:
int string_replace(char* string, char* string_remove, char* string_add) {
int start_point;
int c;
start_point = string_find_and_remove(string, string_remove);
if(start_point < 0) {
printf("string not found: %s\n ABORTING!\n", string_remove);
while(1);
}
c = STRING_INTO_STRING(string, string_add, start_point);
return c;
}
It looks like this function will have to proceed at risk. Looking at strcat, it also proceeds at risk, in that it doesn't check that the string you are appending to is large enough to hold its intended contents (perhaps for the very same reason).
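The reason the macro stops working inside string_replace can be seen in a small sketch: once an array has been passed to a function, sizeof only sees the pointer. The function name size_seen_by_callee is made up for this illustration.

```c
#include <stddef.h>

/* Inside a function the parameter is a pointer, so sizeof yields the
   size of the pointer itself, not the size of the caller's array. */
size_t size_seen_by_callee(char *string)
{
    return sizeof string;   /* sizeof(char *), typically 4 or 8 */
}
```

In a caller that declares char buf[21], sizeof buf is 21 while size_seen_by_callee(buf) reports the pointer size, so the capacity has to be passed along as an extra parameter through every intermediate function.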
Thanks for everyone's help.
