Can I emulate strcpy using malloc? - c

I'm trying to make a strcpy function from scratch for a class. While I could do it with a for loop and copy the individual characters, I think I could just make a swap using malloc and pointers to make it more efficient. Here's my code, but I've been getting a lot of confusing errors.
void notStrcpy(char s1[], char s2[]) { //copies string s1 into s2
char *s3 = (char *) malloc(strlen(s1)); //s3 is now an alias of s1
s2 = *s3;} //dereference s3 to dump s1 into s2
Why is this happening, and is there any way to make this code work the way I intended it?

You cannot do that: strcpy expects both chunks of memory to be ready - one for reading the string, and the other one for writing the string. Both addresses are expected to have enough memory for the actual content of a null-terminated C string.
On the other hand, malloc gives you a third chunk of memory (you need to allocate strlen(s)+1, but that's a completely different story). A string copy algorithm has no use for that chunk of memory. On top of that, assigning to a function's parameters has no effect on the values passed into your function, so s2 = *s3 is not doing what you think it should be doing.
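Long story short, while ((*s1++ = *s2++)); is your simplest strcpy implementation. Note that, as written, it copies s2 into s1, following strcpy's (destination, source) argument order; swap the names to copy s1 into s2 as in your function.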
Note: malloc could come in handy in an implementation of string duplication function, e.g. strdup. If you decide to give it a try, don't forget to allocate space for null terminator.
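To illustrate the note about strdup, here is a minimal sketch of a duplication function. The name my_strdup is made up for this example (POSIX already provides strdup), and it assumes the caller frees the returned buffer.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated copy of src,
 * or NULL if the allocation fails. The caller must free() the result. */
char *my_strdup(const char *src)
{
    size_t len = strlen(src);
    char *copy = malloc(len + 1);   /* +1 for the null terminator */
    if (copy == NULL)
        return NULL;
    memcpy(copy, src, len + 1);     /* copies the terminator as well */
    return copy;
}
```

Here malloc is useful because the function has to produce new storage for the copy; plain strcpy never needs to, since the caller supplies the destination.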

@dasblinkenlight Thank you. My new code is as follows:
void totallyNotstrcpy(char s1[], char s2[]) {
int x = 0;
while (x < strlen(s1)+1) {
s2[x] = s1[x];
x++;
}
}
As a quick side question, how does your code snippet work? Doesn't the while loop need a condition?
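On the side question: the loop does have a condition, it is the assignment itself. In C an assignment expression yields the value that was assigned, so the loop keeps running while the copied character is non-zero and stops right after the '\0' has been copied. A longhand sketch of the same logic, where copy_expanded is a made-up name and the parameters follow strcpy's (destination, source) order:

```c
/* Longhand equivalent of: while ((*dest++ = *src++)); */
void copy_expanded(char *dest, const char *src)
{
    for (;;) {
        char c = *src;   /* read the current source character */
        *dest = c;       /* store it in the destination */
        src++;
        dest++;
        if (c == '\0')   /* the terminator has just been copied: stop */
            break;
    }
}
```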

Related

How do I modify the contents of a string literal without using brackets in C?

Disclaimer: this is for a homework assignment.
Say I have a string that was declared like this:
char *string1;
For part of my program, I need to set string1 equal to another string, string2. I can't use strcpy or use brackets.
This is my code so far:
int i;
for(i = 0; *(string2 + i) != '\0'; i++){
*(string1 + i) = *(string2 + i);
}
This causes a segmentation fault.
According to https://www.geeksforgeeks.org/storage-for-strings-in-c/ , this is because string1 was declared like this: char *string1 and a workaround to avoid segfaults is to use brackets. I can't use brackets, so is there any workaround that I can do?
EDIT: I am also prohibited from allocating more memory or declaring arrays. I can't use malloc(), falloc() etc.
The issue you are having is that string1 does not have memory allocated to it.
Your code is missing some details, but I'll assume it looks something like this:
#include <stdio.h>
int main()
{
char *originalStr = "Hello NewArsenic";
char *newStr;
// newStr is uninitialized here, so passing it to printf is undefined
// behavior: it might print (null), print garbage, or crash.
printf("Original: %s\nNew: %s\n", originalStr, newStr);
int i;
for (i = 0; *(originalStr + i) != '\0'; i++)
{
*(newStr + i) = *(originalStr + i);
}
printf("Original: %s\nNew: %s\n", originalStr, newStr);
return 0;
}
TL;DR Your Issue
Your issue here is that you are attempting to store some values into newStr without having the memory to do so.
Solution
Use malloc.
#include <stdio.h>
#include <stdlib.h> // malloc(size_t) is in stdlib.h
#include <string.h> // strlen(const char *) is in string.h
int main()
{
char *originalStr = "Hello NewArsenic";
// Note here that size_t is preferable to int for length.
// Generally you want to be using size_t if you are working with size/length.
// More info at https://stackoverflow.com/questions/19732319/difference-between-size-t-and-unsigned-int
size_t originalLength = strlen(originalStr);
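// This is malloc's typical usage, where we are asking the system to
// give us originalLength + 1 many chars.
// The `sizeof(char)` here is redundant, actually, since sizeof(char) is
// defined to be one by the C spec, but you might find it useful to see the
// typical usage of `malloc`.
// malloc returns a void *, which C converts to char * implicitly, so the
// cast is optional in C (it is required in C++).
char *newStr = (char *)malloc((originalLength + 1) * sizeof(char));
// malloc returns NULL on failure, so check before using the memory.
if (newStr == NULL)
{
return 1;
}
// The rest of your code stays the same, except that newStr is not printed
// before the copy: its contents are uninitialized until then.
printf("Original: %s\n", originalStr);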
size_t i;
for (i = 0; *(originalStr + i) != '\0'; i++)
{
*(newStr + i) = *(originalStr + i);
}
// Don't forget to append a null character like I did before editing!
*(newStr + originalLength) = 0;
printf("Original: %s\nNew: %s\n", originalStr, newStr);
// Because `malloc` gives us memory on the heap, we need to tell the system
// that we want to free it once we are done with it.
free(newStr);
return 0;
}
The long answer
What is a C String?
In C, a string is merely an array of characters terminated by a null character ('\0'). What this means is that for each character you want to have, plus the terminator, you need memory to hold it.
Memory
In C, there are two types of memory allocation - stack- and heap-based.
Stack Memory
You're probably more familiar with stack-based memory than you think. Whenever you declare a local variable inside a function, you're defining it on the stack. Arrays declared locally with bracket notation, type array[size], are stack-based too. What's specific about stack-based memory allocation is that the memory only lasts as long as the function in which it was declared is running. This means that you don't have to worry about your memory sticking around for longer than it should.
Heap Memory
Now heap-based memory allocation is different in the sense that it will persist until it is cleared. This is advantageous in one way:
You can keep values of which you don't know the size at compile time.
But, that comes at a cost:
The heap is slower
You have to manually clear your memory once you're done with it.
For more info, check out this thread.
We typically use the function (void *) malloc(size_t) and its sister (void *) calloc(size_t, size_t) for allocating heap memory. To free the memory that we asked for from the system, use free(void *).
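A minimal sketch of why the heap matters: the buffer below is created inside a function but is still valid after that function returns, which a local array would not be. The function name make_greeting is invented for this illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical example: the returned buffer lives on the heap, so it is
 * still valid after make_greeting returns. The caller must free() it. */
char *make_greeting(const char *name)
{
    const char *prefix = "Hello, ";
    size_t total = strlen(prefix) + strlen(name) + 1; /* +1 for '\0' */
    char *buf = malloc(total);
    if (buf == NULL)
        return NULL;          /* allocation can fail, so always check */
    strcpy(buf, prefix);
    strcat(buf, name);
    return buf;
}
```

The size is only known at run time (it depends on name), which is exactly the case where heap allocation is the usual choice.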
Alternatives
You could've also used newStr = originalStr, but that would not actually copy the string, but only make newStr point to originalStr, which I'm sure you're aware of.
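The difference between assigning the pointer and copying the string can be checked directly by comparing addresses. This is only a sketch; alias_vs_copy is a made-up name and it returns 1 when the observations described in its comments hold.

```c
#include <stdlib.h>
#include <string.h>

/* Sketch: an alias shares storage with the original, a heap copy does not.
 * Returns 1 when both observations hold, 0 otherwise (or if malloc fails). */
int alias_vs_copy(void)
{
    const char *original = "Hello";
    const char *alias = original;               /* pointer copy: no new string */
    char *copy = malloc(strlen(original) + 1);  /* separate storage */
    if (copy == NULL)
        return 0;
    strcpy(copy, original);

    int ok = (alias == original)                /* same address */
          && (copy != original)                 /* different address */
          && (strcmp(copy, original) == 0);     /* same contents */
    free(copy);
    return ok;
}
```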
Other remarks
Generally, it's an anti-pattern to do:
char* string = "literal";
This is an anti-pattern because literals cannot be edited and shouldn't be. Do:
char const* string = "literal";
See this thread for more info.
Avoid using int in your loop. Use size_t. See this thread.
For part of my program, I need to set string1 equal to another string, string2. I can't use strcpy or use brackets.
Perhaps the solution is just as simple as
string2 = string1
Note that this assigns the string2 pointer to point directly to the same memory as string1. This is sometimes very helpful because you need to maintain the beginning of the string with string1 but also need another pointer to move inside the string with things like string2++.
One way or another, you have to point string2 at an address in memory that you have access to. There are two ways to do this:
Point at memory that you already have access to through another variable either with another pointer variable or with the address-of & operator.
Allocate memory with malloc() or related functions.
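The first option can be sketched as follows, assuming the destination pointer has already been pointed at writable storage that is large enough. The helper name copy_no_brackets is hypothetical; it uses only pointer dereferencing and increments, no brackets.

```c
/* Hypothetical helper: copies src into dest using only pointer arithmetic.
 * dest must point to writable storage of at least strlen(src) + 1 bytes. */
void copy_no_brackets(char *dest, const char *src)
{
    while (*src != '\0') {
        *dest = *src;
        dest++;
        src++;
    }
    *dest = '\0';   /* terminate the copy */
}
```

Whether this fits the assignment depends on where the storage comes from; the copy itself needs no allocation, but the destination must already refer to memory the program owns.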

why does this pointer manipulation fail?

I'm working my way in understanding pointers. I wrote this string copy functionality in C.
#include<stdio.h>
char *my_strcpy(char *dest, char *source)
{
while (*source != '\0')
{
*dest++ = *source++;
}
*dest = '\0';
return dest;
}
int main(void)
{
char* temp="temp";
char* temp1=NULL;
my_strcpy(temp1,temp);
puts(temp1);
return 0;
}
This program gives a segfault. If I change char* temp1=NULL to char* temp1, it still fails. If I change char* temp1 to char temp1[80], the code works. The code also works with char temp1[1] and gives the output temp. I was thinking the output should be t. Why is it like this, and why do I get an error with char* temp1?
Because you're not allocating space for the destination string. You're trying to write to memory at position NULL (almost certainly 0x00).
Try char* temp1= malloc(strlen(temp)+1); or something like it. That will allocate some memory and then you can copy the characters into it. The +1 is for the trailing null character.
If you were writing in Java and friends, the language would prevent you from accessing memory off the end of the array. But at a language level, C lets you write to memory anywhere you want. And then crash (hopefully immediately but maybe next week). Arrays aren't strictly enforced data types, they are just conventions for allocating and referencing memory.
If you create it as char temp1[1] then you are allocating some memory on the stack. Memory near that may be accessible (you can read and write to it) but you will be scribbling over other memory intended for something else. This is a classic memory bug.
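A sketch of the allocation fix, under the assumption that a heap copy is acceptable here. The wrapper copy_to_heap is a made-up name. This variant of my_strcpy also returns the start of the destination, as the standard strcpy does, rather than the position after the last character written.

```c
#include <stdlib.h>
#include <string.h>

/* Copies source into dest (which must be large enough) and returns the
 * start of dest, as the standard strcpy does. */
char *my_strcpy(char *dest, const char *source)
{
    char *start = dest;
    while (*source != '\0') {
        *dest = *source;
        dest++;
        source++;
    }
    *dest = '\0';
    return start;
}

/* Hypothetical wrapper: allocates exactly enough room, then copies.
 * Returns NULL if the allocation fails; the caller must free() the result. */
char *copy_to_heap(const char *source)
{
    char *dest = malloc(strlen(source) + 1); /* +1 for the trailing '\0' */
    if (dest == NULL)
        return NULL;
    return my_strcpy(dest, source);
}
```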
Also style: I personally advise against using the return values from ++s. It's harder to read and makes you think twice.
*dest = *source;
dest++;
source++;
Is clearer. But that's just my opinion.
You must allocate space for the destination parameter.
When you use char temp1[80], you allocate 80 bytes of memory.
You can allocate memory with a fixed size, as an array, or dynamically with the malloc function.

Problem with pointer copy in C

I radically re-edited the question to better explain my application, as the example I made up wasn't correct in many ways, as you pointed out:
I have one pointer to char and I want to copy it to another pointer and then add a null character at the end (in my real application, the first string is const, so I cannot just modify it; that's why I need to copy it).
I have this function, "MLSLSerialWriteBurst", which I have to fill with some code adapted to my microcontroller.
tMLError MLSLSerialWriteBurst( unsigned char slaveAddr,
unsigned char registerAddr,
unsigned short length,
const unsigned char *data )
{
unsigned char *tmp_data;
tmp_data = data;
*(tmp_data+length) = NULL;
// this function takes a tmp_data which is a char* terminated with a NULL character ('\0')
if(EEPageWrite2(slaveAddr,registerAddr,tmp_data)==0)
return ML_SUCCESS;
else
return ML_ERROR;
}
I see there's a problem here: the fact that I do not initialize tmp_data, but I cannot know its length.
For starters, you are missing a bunch of declarations in your code. For example, what is lungh? Also, I'm assuming you initialized your two pointers so they point to memory you can use. However, maybe that's not a safe assumption.
Beyond that, you failed to terminate your from string. So getting the length of the string will not work.
There seems to be numerous errors here. It's hard to know where to start. Is this really what your actual code looks like? I don't think it would even compile.
Finally, there seems to be a bit of confusion in your terminology. Copying a pointer is different from copying the memory being pointed to. A pointer is a memory address. If you simply copy the pointer, then both pointers will refer to the same address.
I would create a copy of a string using code similar to this:
char *from_string = "ciao";
char *to_string;
int len;
len = strlen(from_string);
to_string = (char *)malloc(len + 1);
if (to_string != NULL)
strcpy(to_string, from_string);
Be fully aware that you do not want to copy a pointer. You want to copy the memory that is pointed to by the pointer. It does sound like you should learn more about pointers and the memory environment of your system before proceeding too much farther.
When you say tmp_data = data, you are pointing tmp_data to the same memory pointed to by data. Instead, you need to allocate a new block of memory and copy the memory from data into it.
The standard way to do this is with malloc. If you do not have malloc, your libraries may have some other way of acquiring a pointer to usable memory.
unsigned char * tmp_data = malloc(length + 1);
if(tmp_data != 0) {
memcpy(tmp_data, data, length);
tmp_data[length] = 0;
// ...
free(tmp_data);
}
You could also use a fixed-size array on the stack:
unsigned char tmp_data[256];
if(length >= sizeof(tmp_data)) length = sizeof(tmp_data) - 1;
memcpy(tmp_data, data, length); // or equivalent routine
tmp_data[length] = 0;
C99 introduced variable-length arrays, which may be what you seek here, if your compiler supports them:
unsigned char tmp_data[length + 1]; // one extra byte for the terminator
memcpy(tmp_data, data, length); // or equivalent routine
tmp_data[length] = 0;

concatenation program in C

I have written a program for concatenating two strings, and it throws a segmentation fault at run time at the line s1[i+j] = s2[j] in the for loop. I am not able to figure out why this is happening. Kindly correct me: where am I going wrong?
char* concatenate(char *s1, char *s2)
{
int i,j=0;
for(i=0; s1[i] != '\0'; i++);
for(j=0; s2[j] != '\0'; j++)
{
s1[i+j] = s2[j];
}
s1[i+j] = s2[j];
return s1;
}
char *s1 = (char *) malloc(15);;
char *s2 ;
s1 = "defds";
s2 = "abcd";
s1 = concatenate(s1,s2);
// printf("\n\n%s\n\n",s1);
s1 = "rahul";
This line does not copy the string "rahul" into the buffer pointed to by s1; it reassigns the pointer s1 to point to the (not modifiable) string literal "rahul".
You can get the desired functionality by using your concatenate function twice:
char *s1 = (char *) malloc(15);
s1[0] = '\0'; // make sure the buffer is a null terminated string of length zero
concatenate(s1, "rahul");
concatenate(s1, "bagai");
Note that the concatenate function is still somewhat unsafe since it blindly copies bytes, much like strcat does. You'll want either to be very sure that the buffer you pass it is large enough, or to modify it to take a buffer length like strncat takes.
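One way the length-checked variant mentioned above might look. This is a sketch, not a drop-in replacement: concatenate_bounded is a made-up name, and unlike strncat it takes the total size of the destination buffer and refuses to write anything if the result would not fit.

```c
#include <stddef.h>

/* Hypothetical bounded variant: appends src to dest without writing past
 * dest_size bytes. Returns 0 on success, -1 if the result would not fit
 * (dest is left unchanged in that case). */
int concatenate_bounded(char *dest, size_t dest_size, const char *src)
{
    size_t i = 0, j = 0;

    while (i < dest_size && dest[i] != '\0')   /* find the end of dest */
        i++;
    if (i == dest_size)
        return -1;                              /* dest is not terminated */

    while (src[j] != '\0')                      /* measure src */
        j++;
    if (j >= dest_size - i)
        return -1;                              /* no room for src plus '\0' */

    for (size_t k = 0; k <= j; k++)             /* copy src including '\0' */
        dest[i + k] = src[k];
    return 0;
}
```

With a 15-byte buffer, appending "rahul" and then "bagai" succeeds, while a further append that would exceed the 15 bytes returns -1 and leaves the buffer as it was.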
When you do s1 = "rahul"; you are overwriting the memory you just allocated. This line doesn't copy "rahul" to the malloc'ed area, it changes s1 to point to the string constant "rahul" and throws away the pointer to the malloc'ed memory.
Instead, you should use strcpy to copy the string to the malloc'ed area:
// s1 = "rahul";
strcpy(s1, "rahul");
This will fix your call to concatenate since s1 will now point to the correct 15-byte area of memory.
Alternatively, you could eschew the dynamic allocation and allocate+assign the initial string all at once:
char s1[15] = "rahul";
That will allocate 15 bytes on the stack and copy "rahul" into that space. Note a subtlety here. In this case it is in fact correct to use =, whereas it is incorrect when s1 is declared as char *s1.
One important debugging lesson you can learn from this is that when your program crashes on a particular line of code that doesn't mean that's where the bug is. Often you make a mistake in one part of your program and that mistake doesn't manifest itself in a crash until later on. This is part of what makes debugging such a delightfully frustrating process!
You don't have to write your own string concatenation; there are already functions in the standard library to do this task for you!
#include <string.h>
char *strcat(char *dest, const char *src);
char *strncat(char *dest, const char *src, size_t n);
In the second piece of code, you allocate a buffer of size 15 and then assign it to s1. Then you assign "rahul" to s1, which leaks the memory you just allocated, and assigns s1 to a 6-byte piece of memory that you likely cannot write to. Change s1 = "rahul"; to strcpy( s1, "rahul" ); and you might have better luck.
I agree with the other answers though, the concatenate function is dangerous.

Why am I getting a double free or corruption error with realloc()?

I've tried to write a string replace function in C, which works on a char *, which has been allocated using malloc(). It's a little different in that it will find and replace strings, rather than characters in the starting string.
It's trivial to do if the search and replace strings are the same length (or the replace string is shorter than the search string), since I have enough space allocated. If I try to use realloc(), I get an error that tells me I am doing a double free - which I don't see how I am, since I am only using realloc().
Perhaps a little code will help:
void strrep(char *input, char *search, char *replace) {
int searchLen = strlen(search);
int replaceLen = strlen(replace);
int delta = replaceLen - searchLen;
char *find = input;
while (find = strstr(find, search)) {
if (delta > 0) {
realloc(input, strlen(input) + delta);
find = strstr(input, search);
}
memmove(find + replaceLen, find + searchLen, strlen(input) - (find - input));
memmove(find, replace, replaceLen);
}
}
The program works, until I try to realloc() in an instance where the replaced string will be longer than the initial string. (It still kind of works, it just spits out errors as well as the result).
If it helps, the calling code looks like:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
void strrep(char *input, char *search, char *replace);
int main(void) {
char *input = malloc(81);
while ((fgets(input, 81, stdin)) != NULL) {
strrep(input, "Noel", "Christmas");
}
}
As a general rule, you should never do a free or realloc on a user provided buffer. You don't know where the user allocated the space (in your module, in another DLL) so you cannot use any of the allocation functions on a user buffer.
Provided that you now cannot do any reallocation within your function, you should change its behavior a little, like doing only one replacement, so the user will be able to compute the resulting string max length and provide you with a buffer long enough for this one replacement to occur.
Then you could create another function to do the multiple replacements, but you will have to allocate the whole space for the resulting string and copy the user input string. Then you must provide a way to delete the string you allocated.
Resulting in:
void strrep(char *input, char *search, char *replace);
char* strrepm(char *input, char *search, char *replace);
void strrepmfree(char *input);
First off, sorry I'm late to the party. This is my first stackoverflow answer. :)
As has been pointed out, when realloc() is called, you can potentially change the pointer to the memory being reallocated. When this happens, the argument "string" becomes invalid. Even if you reassign it, the change goes out of scope once the function ends.
To answer the OP, realloc() returns a pointer to the newly-reallocated memory. The return value needs to be stored somewhere. Generally, you would do this:
data *foo = malloc(SIZE * sizeof(data));
data *bar = realloc(foo, NEWSIZE * sizeof(data));
/* Test bar for safety before blowing away foo */
if (bar != NULL)
{
foo = bar;
bar = NULL;
}
else
{
fprintf(stderr, "Crap. Memory error.\n");
free(foo);
exit(-1);
}
As TyBoer points out, you guys can't change the value of the pointer being passed in as the input to this function. You can assign whatever you want, but the change will go out of scope at the end of the function. In the following block, "input" may or may not be an invalid pointer once the function completes:
void foobar(char *input, int newlength)
{
/* Here, I ignore my own advice to save space. Check your return values! */
input = realloc(input, newlength * sizeof(char));
}
Mark tries to work around this by returning the new pointer as the output of the function. If you do that, the onus is on the caller to never again use the pointer he used for input. If it matches the return value, then you have two pointers to the same spot and only need to call free() on one of them. If they don't match, the input pointer now points to memory that may or may not be owned by the process. Dereferencing it could cause a segmentation fault.
You could use a double pointer for the input, like this:
void foobar(char **input, int newlength)
{
*input = realloc(*input, newlength * sizeof(char));
}
If the caller has a duplicate of the input pointer somewhere, that duplicate still might be invalid now.
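Combining the double pointer with the "test before overwriting" pattern shown earlier gives something like the sketch below. The name resize_buffer is hypothetical, and new_size is assumed to be non-zero, since realloc with a size of zero behaves differently across implementations.

```c
#include <stdlib.h>

/* Hypothetical helper: resizes *buf to new_size bytes (new_size > 0).
 * Returns 0 on success; on failure returns -1 and leaves *buf valid. */
int resize_buffer(char **buf, size_t new_size)
{
    char *tmp = realloc(*buf, new_size);
    if (tmp == NULL)
        return -1;     /* the old block is untouched and still owned by the caller */
    *buf = tmp;        /* the block may have moved, so publish the new address */
    return 0;
}
```

Because the result goes into a temporary first, a failed realloc does not overwrite the only pointer to the original block.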
I think the cleanest solution here is to avoid using realloc() when trying to modify the function caller's input. Just malloc() a new buffer, return that, and let the caller decide whether or not to free the old text. This has the added benefit of letting the caller keep the original string!
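A sketch of that approach, with the caveat that replace_all is an invented name and not the asker's strrep. It sizes the result in a first pass, fills it in a second pass, and never touches the caller's buffer.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated string in which every
 * occurrence of search in input is replaced by replace. The input is not
 * modified. Returns NULL on allocation failure or if search is empty. */
char *replace_all(const char *input, const char *search, const char *replace)
{
    size_t search_len = strlen(search);
    size_t replace_len = strlen(replace);
    if (search_len == 0)
        return NULL;

    /* First pass: count matches so the result can be sized exactly. */
    size_t count = 0;
    for (const char *p = input; (p = strstr(p, search)) != NULL; p += search_len)
        count++;

    size_t result_len = strlen(input) - count * search_len + count * replace_len;
    char *result = malloc(result_len + 1);
    if (result == NULL)
        return NULL;

    /* Second pass: copy the text between matches, then the replacement. */
    char *out = result;
    const char *p = input;
    const char *match;
    while ((match = strstr(p, search)) != NULL) {
        size_t gap = (size_t)(match - p);
        memcpy(out, p, gap);
        out += gap;
        memcpy(out, replace, replace_len);
        out += replace_len;
        p = match + search_len;   /* resume after the match in the input */
    }
    strcpy(out, p);               /* copy the tail and the terminator */
    return result;
}
```

Since the search always continues in the original input, the replacement text is never rescanned, so a replacement that contains the search string cannot cause unbounded growth.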
Just a shot in the dark because I haven't tried it yet but when you realloc it returns the pointer much like malloc. Because realloc can move the pointer if needed you are most likely operating on an invalid pointer if you don't do the following:
input = realloc(input, strlen(input) + delta);
Someone else apologized for being late to the party - two and a half months ago. Oh well, I spend quite a lot of time doing software archaeology.
I'm interested that no-one has commented explicitly on the memory leak in the original design, or the off-by-one error. And it was observing the memory leak that tells me exactly why you are getting the double-free error (because, to be precise, you are freeing the same memory multiple times - and you are doing so after trampling over the already freed memory).
Before conducting the analysis, I'll agree with those who say your interface is less than stellar; however, if you dealt with the memory leak/trampling issues and documented the 'must be allocated memory' requirement, it could be 'OK'.
What are the problems? Well, you pass a buffer to realloc(), and realloc() returns you a new pointer to the area you should use - and you ignore that return value. Consequently, realloc() has probably freed the original memory, and then you pass it the same pointer again, and it complains that you're freeing the same memory twice because you pass the original value to it again. This not only leaks memory, but means that you are continuing to use the original space -- and John Downey's shot in the dark points out that you are misusing realloc(), but doesn't emphasize how severely you are doing so. There's also an off-by-one error because you do not allocate enough space for the NUL '\0' that terminates the string.
The memory leak occurs because you do not provide a mechanism to tell the caller about the last value of the string. Because you kept trampling over the original string plus the space after it, it looks like the code worked, but if your calling code freed the space, it too would get a double-free error, or it might get a core dump or equivalent because the memory control information is completely scrambled.
Your code also doesn't protect against indefinite growth -- consider replacing 'Noel' with 'Joyeux Noel'. Every time, you would add 7 characters, but you'd find another Noel in the replaced text, and expand it, and so on and so forth. My fixup (below) does not address this issue - the simple solution is probably to check whether the search string appears in the replace string; an alternative is to skip over the replace string and continue the search after it. The second has some non-trivial coding issues to address.
So, my suggested revision of your called function is:
char *strrep(char *input, char *search, char *replace) {
int searchLen = strlen(search);
int replaceLen = strlen(replace);
int delta = replaceLen - searchLen;
char *find = input;
while ((find = strstr(find, search)) != 0) {
if (delta > 0) {
input = realloc(input, strlen(input) + delta + 1);
find = strstr(input, search);
}
memmove(find + replaceLen, find + searchLen, strlen(find + searchLen) + 1);
memmove(find, replace, replaceLen);
}
return(input);
}
This code does not detect memory allocation errors - and probably crashes (but if not, leaks memory) if realloc() fails. See Steve Maguire's 'Writing Solid Code' book for an extensive discussion of memory management issues.
Note, try to edit your code to get rid of the html escape codes.
Well, though it has been a while since I used C/C++, realloc that grows only reuses the memory pointer value if there is room in memory after your original block.
For instance, consider this:
(xxxxxxxxxx..........)
If your pointer points to the first x, and . means free memory location, and you grow the memory size pointed to by your variable by 5 bytes, it'll succeed. This is of course a simplified example as blocks are rounded up to a certain size for alignment, but anyway.
However, if you subsequently try to grow it by another 10 bytes, and there is only 5 available, it will need to move the block in memory and update your pointer.
However, in your example you are passing the function a pointer to the character, not a pointer to your variable, and thus while the strrep function internally might be able to adjust the variable in use, it is a local variable to the strrep function and your calling code will be left with the original pointer variable value.
This pointer value, however, has been freed.
In your case, input is the culprit.
However, I would make another suggestion. In your case it looks like the input variable is indeed input, and if it is, it shouldn't be modified, at all.
I would thus try to find another way to do what you want to do, without changing input, as side-effects like this can be hard to track down.
This seems to work;
char *strrep(char *string, const char *search, const char *replace) {
char *p = strstr(string, search);
if (p) {
int occurrence = p - string;
int stringlength = strlen(string);
int searchlength = strlen(search);
int replacelength = strlen(replace);
if (replacelength > searchlength) {
string = (char *) realloc(string, strlen(string)
+ replacelength - searchlength + 1);
}
if (replacelength != searchlength) {
memmove(string + occurrence + replacelength,
string + occurrence + searchlength,
stringlength - occurrence - searchlength + 1);
}
strncpy(string + occurrence, replace, replacelength);
}
return string;
}
Sigh, is there anyway to post code without it sucking?
realloc is strange, complicated and should only be used when dealing with lots of memory lots of times per second. i.e. - where it actually makes your code faster.
I have seen code where
realloc(bytes, smallerSize);
was used and worked to resize the buffer, making it smaller. Worked about a million times, then for some reason realloc decided that even if you were shortening the buffer, it would give you a nice new copy. So you crash in a random place 1/2 a second after the bad stuff happened.
Always use the return value of realloc.
My quick hints.
Instead of:
void strrep(char *input, char *search, char *replace)
try (note that this reference syntax is C++ only; in C you would pass a char ** instead):
void strrep(char *&input, char *search, char *replace)
and then in the body:
input = realloc(input, strlen(input) + delta);
Generally read about passing function arguments as values/reference and realloc() description :).
