C - expanding the contents of a string

I am writing a function which expands the string, str1 and stores it as str2. By expansion, I mean if str1 has "a-d", it should be stored in str2 as "abcd". I have written the following code. I get a debug error that stack around the variable str1 is corrupted.
Can someone please point out what's going wrong?
Thanks.
#include <stdio.h>
void expand(char s1[], char s2[]);
int main() {
char s1[] = "Talha-z";
char s2[] = "";
expand(s1, s2);
printf(s2);
}
void expand(char s1[], char s2[]) {
int i = 0;
int j= 0;
int k, c_next;
while ( s1[i] != '\0') {
switch (s1[i]) {
case ('-') :
c_next = s1[i+1];
for ( k = 1; k < c_next; k++) {
s2[j] = s1[i] + k;
j++;
}
break;
}
i++;
j++;
}
s2[j] = '\0';
}

You are not allocating sufficient memory for your target string (s2). But you are attempting to write to it, which means you will be writing into memory that you don't own, causing the corruption.
You will need to use dynamic allocation for s2 (i.e. by using malloc), but you will first need to calculate how much memory you need.

char s2[] = "";
This is equivalent to writing
char s2[1] = { '\0' };
It cannot hold more than a single character (or none at all, if the NUL terminator is required).

The problem is that when you initialize s2, you give it enough room for 1 character (i.e. the null terminating '\0'). Thus when you write into s2:
s2[j] = ...
there are no guarantees about what memory you're writing into.
To allocate memory for s2 dynamically, you need to use malloc. In other words, you need to figure out how much memory is required (i.e. by finding the length of the expanded string) and then give s2 that much memory, and finally fill it in via the procedure you have written.

The string s2 at present is on the stack for local variables for main() and is allocated only one byte for one character. When you call the function, it gets passed stack addresses for s1 and s2. The code is over-writing whatever is next to s2 on the stack of function main(). Hence, the error. Please use dynamic memory allocation as already suggested by Mr. Oli above.
Hope my explanation helps you.

Related

Writing a string-concat: How to convert character array to pointer

I am learning C and I have written the following strcat function:
char * stringcat(const char* s1, const char* s2) {
int length_of_strings = strlen(s1) + strlen(s2);
char s3[length_of_strings + 1]; // add one for \0 at the end
int idx = 0;
for(int i=0; (s3[idx]=s1[i]) != 0; idx++, i++);
for(int i=0; (s3[idx]=s2[i]) != 0; idx++, i++);
s3[idx+1] = '\0';
// s3 is a character array;
// how to get a pointer to a character array?
char * s = s3;
return s;
}
That part that looks odd to me is where I have to "re-assign" the character array to a pointer, otherwise C complains that my return is a memory address. I also tried "casting" the return value to (char *) s3, but that didn't work either.
What is the most common way to do this "conversion"? Is this a common pattern in C programs?
There are many ways to handle this situation, but returning a pointer to stack-allocated memory inside the function isn't one of them (the behavior is undefined; consider this memory untouchable once the function returns).
One approach is to allocate heap memory using malloc inside the function, build the result string, then return the pointer to the newly allocated memory with the understanding that the caller is responsible for freeing the memory.
Here's an example of this:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *stringcat(const char* s1, const char* s2) {
    size_t i = 0;
    size_t s1_len = strlen(s1);
    size_t s2_len = strlen(s2);
    char *result = malloc(s1_len + s2_len + 1); // add one for \0 at the end
    if (result == NULL) {
        return NULL; // allocation failed; the caller must check for this
    }
    for (size_t j = 0; j < s1_len; j++) {
        result[i++] = s1[j];
    }
    for (size_t j = 0; j < s2_len; j++) {
        result[i++] = s2[j];
    }
    result[i] = '\0';
    return result;
}
int main(void) {
    char *cat = stringcat("hello ", "world");
    if (cat == NULL) {
        return 1;
    }
    printf("%s\n", cat); // => hello world
    free(cat);
    return 0;
}
Another approach is for the caller to handle all of the memory management, which is similar to how strcat behaves:
/* Append SRC on the end of DEST. */
char *
STRCAT (char *dest, const char *src)
{
strcpy (dest + strlen (dest), src);
return dest;
}
man says:
The strcat() function appends the src string to the dest string, overwriting the terminating null byte ('\0') at the end of dest, and then adds a terminating null byte. The strings may not overlap, and the dest string must have enough space for the result. If dest is not large enough, program behavior is unpredictable; buffer overruns are a favorite avenue for attacking secure programs.
The problem isn't converting from array to pointer; that happens all the time implicitly, and it's no big deal. Your problem is you've just returned a pointer to invalid memory. The array you allocated in the function disappears when the function returns, and dereferencing a pointer to that array is undefined behavior (returning the pointer isn't technically illegal, but any good compiler warns you, because a pointer that is never dereferenced is usually pretty useless).
If you want to return a new array with the concatenated string, you must use dynamically allocated memory, e.g. from malloc/calloc; making the array static would also work (it would now be persistent global memory), but it would make your function both non-reentrant and non-threadsafe, so it's usually frowned on.
Your little trick of assigning to a pointer and returning the pointer may have fooled the compiler into thinking you weren't doing anything illegal, but it did nothing to make your code safer.
You might be used to languages with more dynamic memory handling, but your function here won't work because s3 is a local array, and its storage disappears when the function returns. That means that whatever you write to char s3[] is gone after the return (the details vary and the memory can sometimes stick around long enough for you to think it worked even when it didn't).
Normally you'd want to allocate the memory before calling the function, and pass it in as a parameter, as in:
void stringcat(const char * first, const char * second, char * dest, const size_t dest_len);
Called like this:
char title[] = "Mr. ";
char last[] = "Jones";
char addressname[sizeof(title) + sizeof(last)];
stringcat(title, last, addressname, sizeof(addressname));
The other way to do it is to allocate the memory in the function using malloc(), and return that, but you have to remember to free it in the code when you're done with it.

random chars in dynamic char array C

I need help with a char array. I want to create an n-length array and initialize its values, but after the malloc() call the array is longer than n*sizeof(char), and its contents aren't only the chars I assign... There are a few random chars in the array and I don't know how to solve that... I need this part of the code for a school exam project, and I have to finish by Sunday... Please help :P
#include<stdlib.h>
#include<stdio.h>
int main(){
char *text;
int n = 10;
int i;
if((text = (char*) malloc((n)*sizeof(char))) == NULL){
fprintf(stderr, "allocation error");
}
for(i = 0; i < n; i++){
//text[i] = 'A';
strcat(text,"A");
}
int test = strlen(text);
printf("\n%d\n", test);
puts(text);
free(text);
return 0;
}
Before using strcat, set
text[0]=0;
strcat expects a null-terminated char array for its first argument as well.
From standard 7.24.3.1
#include <string.h>
char *strcat(char * restrict s1,
const char * restrict s2);
The strcat function appends a copy of the string pointed to by s2
(including the terminating null character) to the end of the string
pointed to by s1. The initial character of s2 overwrites the null
character at the end of s1.
How would strcat know where the first string ends if you don't put a \0 in s1?
Also don't forget to allocate an extra byte for the \0 character. Otherwise you are writing past what you have allocated. This is again undefined behavior, on top of the undefined behavior you already had from calling strcat on an uninitialized buffer.
Note:
You should check the return value of malloc to know whether the malloc invocation was successful or not.
Casting the return value of malloc is not needed. Conversion from void* to relevant pointer is done implicitly in this case.
strlen returns size_t not int. printf("%zu",strlen(text))
To start with, your way of using malloc in
text = (char*) malloc((n)*sizeof(char));
is not ideal. You can change that to
text = malloc(n * sizeof *text); // Don't cast; using *text is straightforward and keeps the size tied to the pointer's type.
So the statement could be
if (NULL == (text = malloc(n * sizeof *text))) {
    fprintf(stderr, "allocation error");
}
But the actual problem lies in
for(i = 0; i < n; i++){
//text[i] = 'A';
strcat(text,"A");
}
The strcat documentation says
dest − This is pointer to the destination array, which should contain
a C string, and should be large enough to contain the concatenated
resulting string.
To see that the above method is flawed, consider that the C string "A" actually contains two characters: A and the terminating \0 (the null character). Each strcat call therefore writes two bytes. Even if text started out as an empty string, on the last iteration, when i is n-1, the terminator would land one byte past the end of the n-byte buffer, which is an out-of-bounds access (a buffer overrun). If you wanted to fill the entire text array with A, you could have done
for(i = 0; i < n; i++){
    // Note: a buffer of n bytes can store n-1 chars plus the terminating null
    text[i] = (i < n-1) ? 'A' : '\0'; // indices 0..n-2 get 'A'; index n-1 gets the terminator
}
//Then print the null terminated string
printf("Filled string : %s\n",text); // You're all good :-)
Note: Use a tool like valgrind to find memory leaks & out of bound memory accesses.

What's wrong with printf in my strcat code?

I have made this program to emulate strcat functionality but there is an error with printf which I don't understand...
Here is the code:
#include <stdio.h>
char *mystrcat(char *s1, char *s2);
int main(void)
{
char *s1,*s2;
s1="asdad";
s2="asdad";
s1=mystrcat(s1,s2);
printf(s1);
return 0;
}
char *mystrcat(char *s1,char *s2)
{
int i,j;
for(i=0;s1[i]<'\0';i++) ;
for(j=0;s2[j]!='\0';j++) s1[i+j]=s2[j];
s1[i+j]='\0';
return s1;
}
The first problem is that s1 doesn't have enough space to append s2 to it. You need the size of the buffer pointed to by s1 to be at least strlen(s1) + strlen(s2) + 1 (the + 1 being the NUL terminator).
The second problem is that string literals are read-only. You assign s1 from "asdad", which creates a pointer to (potentially) read-only memory. Of course the first problem means that you wouldn't have enough space to append to the end even if it were writeable, but this is one of the common pitfalls in C and worth mentioning.
Third problem (already mentioned in another answer) is that the comparison s1[i] < '\0' is wrong and you will not correctly find the length of s1 since the loop will not run even a single iteration. The correct condition is the same as in your second loop, != '\0'. (This masks problem 1 since then you are inadvertently overwriting s1 from the beginning.)
In any case, s1[i] < '\0' is the same as s1[i] < 0, which is false for every character in "asdad" (and for any character with a non-negative value).

concatenation program in C

I have written a program for concatenating two strings, and it throws a segmentation fault at run time at the line s1[i+j] = s2[j], in the for loop. I am not able to figure out why this is happening. Kindly correct me: where am I going wrong?
char* concatenate(char *s1, char *s2)
{
int i,j=0;
for(i=0; s1[i] != '\0'; i++);
for(j=0; s2[j] != '\0'; j++)
{
s1[i+j] = s2[j];
}
s1[i+j] = s2[j];
return s1;
}
char *s1 = (char *) malloc(15);;
char *s2 ;
s1 = "defds";
s2 = "abcd";
s1 = concatenate(s1,s2);
// printf("\n\n%s\n\n",s1);
s1 = "rahul";
This line does not copy the string "rahul" into the buffer pointed to by s1; it reassigns the pointer s1 to point to the (not modifiable) string literal "rahul".
You can get the desired functionality by using your concatenate function twice:
char *s1 = (char *) malloc(15);
s1[0] = '\0'; // make sure the buffer is a null terminated string of length zero
concatenate(s1, "rahul");
concatenate(s1, "bagai");
Note that the concatenate function is still somewhat unsafe since it blindly copies bytes, much like strcat does. You'll want either to be very sure that the buffer you pass it is large enough, or to modify it to take a buffer length like strncat takes.
When you do s1 = "rahul"; you are overwriting the memory you just allocated. This line doesn't copy "rahul" to the malloc'ed area, it changes s1 to point to the string constant "rahul" and throws away the pointer to the malloc'ed memory.
Instead, you should use strcpy to copy the string to the malloc'ed area:
// s1 = "rahul";
strcpy(s1, "rahul");
This will fix your call to concatenate since s1 will now point to the correct 15-byte area of memory.
Alternatively, you could eschew the dynamic allocation and allocate+assign the initial string all at once:
char s1[15] = "rahul";
That will allocate 15 bytes on the stack and copy "rahul" into that space. Note a subtlety here. In this case it is in fact correct to use =, whereas it is incorrect when s1 is declared as char *s1.
One important debugging lesson you can learn from this is that when your program crashes on a particular line of code that doesn't mean that's where the bug is. Often you make a mistake in one part of your program and that mistake doesn't manifest itself in a crash until later on. This is part of what makes debugging such a delightfully frustrating process!
You don't have to write your own string concatenation; there are already functions in the standard library to do this task for you!
#include <string.h>
char *strcat(char *dest, const char *src);
char *strncat(char *dest, const char *src, size_t n);
In the second piece of code, you allocate a buffer of size 15 and then assign it to s1. Then you assign "rahul" to s1, which leaks the memory you just allocated, and assigns s1 to a 6-byte piece of memory that you likely cannot write to. Change s1 = "rahul"; to strcpy( s1, "rahul" ); and you might have better luck.
I agree with the other answers though, the concatenate function is dangerous.

Programs executes correctly and then segfaults

I'm trying to learn C programming and spent some time practicing with pointers this morning, by writing a little function to replace the lowercase characters in a string to their uppercase counterparts. This is what I got:
#include <stdio.h>
#include <string.h>
char *to_upper(char *src);
int main(void) {
char *a = "hello world";
printf("String at %p is \"%s\"\n", a, a);
printf("Uppercase becomes \"%s\"\n", to_upper(a));
printf("Uppercase becomes \"%s\"\n", to_upper(a));
return 0;
}
char *to_upper(char *src) {
char *dest;
int i;
for (i=0;i<strlen(src);i++) {
if ( 71 < *(src + i) && 123 > *(src + i)){
*(dest+i) = *(src + i) ^ 32;
} else {
*(dest+i) = *(src + i);
}
}
return dest;
}
This runs fine and prints exactly what it should (including the repetition of the "HELLO WORLD" line), but afterwards ends in a Segmentation fault. What I can't understand is that the function is clearly compiling, executing and returning successfully, and the flow in main continues. So is the Segmentation fault happening at return 0?
dest is uninitialised in your to_upper() function. So, you're overwriting some random part of memory when you do that, and evidently that causes your program to crash as you try to return from main().
If you want to modify the value in place, initialise dest:
char *dest = src;
If you want to make a copy of the value, try:
char *dest = strdup(src);
If you do this, you will need to make sure somebody calls free() on the pointer returned by to_upper() (unless you don't care about memory leaks).
Like everyone else has pointed out, the problem is that dest hasn't been initialized and is pointing to a random location that contains something important. You have several choices of how to deal with this:
Allocate the dest buffer dynamically and return that pointer value, which the caller is responsible for freeing;
Assign dest to point to src and modify the value in place (in which case you'll have to change the declaration of a in main() from char *a = "hello world"; to char a[] = "hello world";, otherwise you'll be trying to modify the contents of a string literal, which is undefined);
Pass the destination buffer as a separate argument.
Option 1 -- allocate the target buffer dynamically:
char *to_upper(char *src)
{
char *dest = malloc(strlen(src) + 1);
...
}
Option 2 -- have dest point to src and modify the string in place:
int main(void)
{
char a[] = "hello world";
...
}
char *to_upper(char *src)
{
char *dest = src;
...
}
Option 3 -- have main() pass the target buffer as an argument:
int main(void)
{
char *a = "hello world";
char *b = malloc(strlen(a) + 1); // or char b[12];
...
printf("Uppercase becomes %s\n", to_upper(a,b));
...
free(b); // omit if b is statically allocated
return 0;
}
char *to_upper(char *src, char *dest)
{
...
return dest;
}
Of the three, I prefer the third option; you're not modifying the input (so it doesn't matter whether a is an array of char or a pointer to a string literal) and you're not splitting memory management responsibilities between functions (i.e., main() is solely responsible for allocating and freeing the destination buffer).
I realize you're trying to familiarize yourself with how pointers work and some other low-level details, but bear in mind that a[i] is easier to read and follow than *(a+i). Also, there are number of functions in the standard library such as islower() and toupper() that don't rely on specific encodings (such as ASCII):
#include <ctype.h>
...
if (islower((unsigned char)src[i]))
    dest[i] = toupper((unsigned char)src[i]);
As others have said, your problem is not allocating enough space for dest. There is another, more subtle problem with your code.
To convert to uppercase, you are testing a given char to see if it lies between 71 and 123, and if it does, you xor the value with 32. This assumes ASCII encoding of characters. ASCII is the most widely used encoding, but it is not the only one.
It is better to write code that works for every type of encoding. If we were sure that 'a', 'b', ..., 'z', and 'A', 'B', ..., 'Z', are contiguous, then we could calculate the offset from the lowercase letters to the uppercase ones and use that to change case:
/* WARNING: WRONG CODE */
if (c >= 'a' && c <= 'z') c = c + 'A' - 'a';
But unfortunately, there is no such guarantee given by the C standard. EBCDIC is an example of an encoding in which the letters are not contiguous.
So, to convert to uppercase, you can either do it the easy way:
#include <ctype.h>
int d = toupper(c);
or, roll your own:
/* Untested, modifies it in-place */
char *to_upper(char *src)
{
    static const char *lower = "abcdefghijklmnopqrstuvwxyz";
    static const char *upper = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    size_t i;
    size_t m = strlen(src);
    for (i=0; i < m; ++i) {
        /* strchr returns NULL when src[i] is not a lowercase letter */
        const char *tmp = strchr(lower, src[i]);
        if (tmp != NULL) {
            src[i] = upper[tmp-lower];
        }
    }
    return src;
}
The advantage of toupper() is that it checks the current locale to convert characters to upper case. This may make æ to Æ for example, which is usually the correct thing to do. Note: I use only English and Hindi characters myself, so I could be wrong about my particular example!
As noted by others, your problem is that char *dest is uninitialized. You can modify src's memory in place, as Greg Hewgill suggests, or you can use malloc to reserve some:
char *dest = (char *)malloc(strlen(src) + 1);
Note that the use of strdup suggested by Greg performs this call to malloc under the covers. The '+ 1' is to reserve space for the null terminator, '\0', which you should also be copying from src to dest. (Your current example only goes up to strlen, which does not include the null terminator.) Can I suggest that you add a line like this after your loop?
*(dest + i) = 0;
This will correctly terminate the string. Note that this only applies if you choose to go the malloc route. Modifying the memory in place or using strdup will take care of this problem for you. I'm just pointing it out because you mentioned you were trying to learn.
Hope this helps.
