How to insert a copy of a string into another string? - c

So, here is the function:
int strinsert(char *dst, int len, const char *src, int offset)
I need to insert a copy of my src string into the string called dst from the position offset.
The argument len specifies the number of characters reserved for the array dst.
The important part of the code:
int strinsert(char *dst, int len, const char *src, int offset)
{
strncpy(dst, src+offset, len);
char buf[100];
strcpy(buf+len, src);
len += strlen(src) ;
strcpy(buf+len, dst+offset);
strcpy(dst, buf);
return 1;
}
Still feels kind of off...
Edit: Before anyone misunderstands, I am just teaching myself how to program in C and I found this exercise. Btw, I haven't really found any good learning material on one- and two-dimensional arrays; could someone be so kind as to post some?

It's a bit painful, but you really have to construct a new string, as you can't shuffle bits of memory around so easily and I don't think there is a library function to do this (is there??). Something like this:
int strinsert(char *dst, int len, const char *src, int offset)
{
    char *new_string;
    int remaining = len - 1; // space left, keeping one byte for the null
    if (len <= 0 || (new_string = malloc(len)) == NULL)
        return 0;
    // Check offset is not too long (or negative)
    if (offset < 0)
        offset = 0;
    if (offset > (int)strlen(dst))
        offset = (int)strlen(dst);
    if (offset > remaining)
        offset = remaining;
    // copy the pre-string from dst and terminate it so strncat can append
    strncpy(new_string, dst, offset);
    new_string[offset] = '\0';
    // Calculate the remaining space
    remaining -= offset;
    // Add the insert string (with max chars remaining)
    strncat(new_string, src, remaining);
    // calc remaining space, never letting it go negative
    remaining -= (int)strlen(src);
    if (remaining < 0)
        remaining = 0;
    // Add the post-string from dst (with max chars remaining)
    strncat(new_string, dst + offset, remaining);
    // Finally copy the new_string into dst
    strcpy(dst, new_string);
    // free the memory
    free(new_string);
    return 1;
}
Note: The remaining space is clamped so it never goes negative; if the combined string does not fit in len bytes, the tail is silently truncated.
Edit: replaced variable length array (illegal ... oops) with mem allocation

Does this solve the problem?
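int strinsert(char *dst, int len, const char *src, int offset)
{
    char temp[MAX_SIZE]; // MAX_SIZE must be at least strlen(dst + offset) + 1
    strcpy(temp, dst + offset);               // save the tail, terminator included
    memcpy(dst + offset, src, strlen(src));   // overwrite with the inserted text
    strcpy(dst + offset + strlen(src), temp); // put the tail back after it
    return 0;
}
Obviously, the above code doesn't have any error checking (len is not even used), and temp could be allocated with malloc instead.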

You can use strncpy(), which is a function of string.h library. It copies first num characters of source string to destination string.
char * strncpy ( char * destination, const char * source, size_t num );
But in your case, you need a third string because you can not expand your destination array. You can copy substring (from beginning to offset) of destination array into this third array and concatenate the rest. So you need to do something like this:
const char *dst = "ThisIsMyHomework";
const char *src = "Not";
char finalString[50];
int offset = 6; // Insert "Not" between "ThisIs" and "MyHomework", i.e. at index 6 of dst.
strncpy(finalString, dst, offset); // finalString holds "ThisIs", not yet terminated
finalString[offset] = '\0'; // strncpy copied no terminator here, and strcat needs one to find the end of the string
strcat(finalString, src); // finalString="ThisIsNot"
strcat(finalString, dst + offset); // finalString="ThisIsNotMyHomework"
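The same three-step idea (head of dst, then src, then the rest of dst) can be wrapped in a function that sizes the third string itself. This is only a sketch: the name insert_copy and the choice to clamp an oversized offset to the end of dst are assumptions, not part of the original exercise.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated string made of
   dst[0..offset) + src + dst[offset..], or NULL if malloc fails.
   The caller is responsible for calling free() on the result. */
char *insert_copy(const char *dst, const char *src, size_t offset)
{
    assert(dst != NULL && src != NULL);

    size_t dst_len = strlen(dst);
    size_t src_len = strlen(src);

    if (offset > dst_len)           /* assumption: clamp to the end of dst */
        offset = dst_len;

    char *out = malloc(dst_len + src_len + 1);
    if (out == NULL)
        return NULL;

    memcpy(out, dst, offset);                       /* head of dst */
    memcpy(out + offset, src, src_len);             /* inserted text */
    memcpy(out + offset + src_len, dst + offset,
           dst_len - offset + 1);                   /* tail plus '\0' */
    return out;
}
```

Calling insert_copy("ThisIsMyHomework", "Not", 6) yields "ThisIsNotMyHomework". Because the result lives in its own allocation, the fixed size of 50 is no longer needed.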

void strinsert(char *dst, size_t len, const char *src, size_t offset)
{
size_t iLenDst = strlen(dst),
iLenSrc = strlen(src);
// Some error handling
if (iLenDst+iLenSrc+1>len)
{
ASSERT(FALSE); // or assert(0) from <assert.h> in plain C
return;
}
// restrict to max length
if (offset>iLenDst)
offset = iLenDst;
// Make room incl. trailing \0
memmove(dst+offset+iLenSrc,dst+offset,iLenDst-offset+1);
// Insert new
memcpy(dst+offset,src,iLenSrc);
}

Related

How to find and replace multiple or all occurrences in C strings

The goal is to replace multiple (or all) occurrences of a given text in another string using only C strings.
(self answered question)
This uses fixed size buffers, you must make sure they are big enough to hold the string after replacement is done.
Define the size before use:
#define LINE_LEN 256
This code was tested with MSVC 2019.
void replaceN(char* line,const char* orig,const char* new, int times){
char* buf;
if(times==0) return; //nothing left to do
if((times==-1||--times>0) && (buf = strstr(line,orig))!=NULL){ //find orig
for(const char *c=orig;*c;c++) buf++; //advance buf
replaceN(buf,orig,new,times); //repeat until the last occurrence
}
//this will run first for the last match
if((buf = strstr(line,orig))!=NULL){
char tmp[LINE_LEN];
int i = buf-line; //pointer difference
strncpy(tmp,line,i); //copy everything before the match
for(const char *k=orig;*k;k++) buf++; //buf++; //skip find string
for(const char *k=new;*k;k++) tmp[i++]=*k; //copy replace chars
for(;*buf;buf++) tmp[i++]=*buf; //copy the rest of the string
tmp[i]='\0';
strcpy(line,tmp);
}
}
static inline void replace(char* line,const char* orig,const char* new){replaceN(line, orig, new, 1);}
static inline void replaceAll(char* line,const char* orig,const char* new){replaceN(line,orig,new,-1);}
Turns out I had too much self esteem. The code was not tested, and I should not have posted it without proper testing. I add this comment to remind others not to make the same mistake. If you find any other errors, please let me know.
In order to keep it simple, I don't do it in place. Instead it requires a preallocated output buffer. Doing in place is risky if the size of the new string is longer than the original. And there's also an edge case that can be tricky to handle, and that's when the original substring to replace is a substring of the new string.
The headers needed to run all this:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <stddef.h>
#include <stdint.h>
The main replace function. It replaces at most n occurrences and returns the number of replacements. dest is a buffer big enough to hold the result. All pointers need to be non-NULL and valid. You may notice that I'm using goto, which may be frowned upon, but using it to exit cleanly is very convenient.
size_t replace(char *dest, const char *src, const char *orig,
const char *new, size_t n) {
size_t ret = 0;
// Maybe an unnecessary optimization to avoid multiple calls in
// loop, but it also adds clarity
const size_t newlen = strlen(new);
const size_t origlen = strlen(orig);
if(origlen == 0 || n == 0) goto END; // Edge cases
do {
const char *match = strstr(src, orig);
if(!match) goto END;
// Length of the part of src before first match
const ptrdiff_t offset = match - src;
memcpy(dest, src, offset); // Copy before match
memcpy(dest + offset, new, newlen); // Replace
src += offset + origlen; // Move src past what we have already copied.
dest += offset + newlen; // Advance pointer to dest to the end
ret++;
} while(n > ret);
END:
strcpy(dest, src); // Copy whatever is remaining
return ret;
}
It's easy to write a wrapper for the allocation. We borrow and modify some code from find the count of substring in string
size_t countOccurrences(const char *str, const char *substr) {
if(strlen(substr) == 0) return 0;
size_t count = 0;
const size_t len = strlen(substr);
while((str = strstr(str, substr))) {
count++;
str+=len; // We're standing at the match, so we need to advance
}
return count;
}
Then some code to calculate buffer size
size_t calculateBufferLength(const char *src, const char *orig,
                             const char *new, size_t n) {
    const size_t origlen = strlen(orig);
    const size_t newlen = strlen(new);
    const size_t baselen = strlen(src) + 1;
    if(origlen > newlen) return baselen; // Result can only shrink
    const size_t count = countOccurrences(src, orig);
    n = n < count ? n : count; // Min of n and count
    return baselen +
        n * (newlen - origlen);
}
And the final function. It combines allocation and replacement. It returns a pointer to the buffer, and NULL if allocation fails.
char *replaceAndAllocate(const char *src, const char *orig,
                         const char *new, size_t n) {
    const size_t size = calculateBufferLength(src, orig, new, n);
    char *buf = malloc(size);
    if(buf) replace(buf, src, orig, new, n);
    return buf;
}
And finally, a simple main with a few test cases
int main(void) {
puts(replaceAndAllocate("hoho", "ha", "he", SIZE_MAX ));
puts(replaceAndAllocate("", "", "", 5));
puts(replaceAndAllocate("", "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa", "", 5));
puts(replaceAndAllocate("", "", "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa", 5));
puts(replaceAndAllocate("hihihi!!!", "hi", "of", 2));
puts(replaceAndAllocate("!!!hihihi", "hi", "x", 3));
puts(replaceAndAllocate("asdfasdfasdf", "asdf", "x", 2));
puts(replaceAndAllocate("xxxxxxxxxxxx", "x", "y", SIZE_MAX ));
puts(replaceAndAllocate("xxxxxxxxxxxx", "x", "y", 0));
puts(replaceAndAllocate("xxxxxxxxxxxx", "x", "y", 1));
puts(replaceAndAllocate("xxxxxxxxxxxx", "x", "", SIZE_MAX ));
puts(replaceAndAllocate("xxxxxxxxxxxx", "x", "", 3 ));
puts(replaceAndAllocate("!asdf!asdf!asdf!", "asdf", "asdf#asdf", SIZE_MAX));
// Yes, I skipped freeing the buffers to save some space
}
No warnings with -Wall -Wextra -pedantic and the output is:
$ ./a.out
hoho
ofofhi!!!
!!!xxx
xxasdf
yyyyyyyyyyyy
xxxxxxxxxxxx
yxxxxxxxxxxx
xxxxxxxxx
!asdf#asdf!asdf#asdf!asdf#asdf!
Note that I don't have any special functions for replacing one and replacing all. If you really want those, just write wrappers with n=1 or n=SIZE_MAX. Using SIZE_MAX is safe, because a string cannot be bigger than that.
Another reason that I got rid of a special function for one replacement is that it was very inefficient. Also, it was easier to write it that way and it is much cleaner.
I changed the code a lot from last time, and that's very much thanks to the awesome help I got at Codereview. You can see how the code was before on the question I posted there: https://codereview.stackexchange.com/q/263785/133688
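For readers who do want dedicated functions, the wrappers mentioned above might look like the sketch below. The names replaceOne and replaceAll are hypothetical, and replace() is repeated here in compact form (with the parameter renamed to new_text) only so that the block compiles on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Compact restatement of the replace() routine from this answer.
   dest must be large enough to hold the result. */
size_t replace(char *dest, const char *src, const char *orig,
               const char *new_text, size_t n)
{
    size_t ret = 0;
    const size_t newlen = strlen(new_text);
    const size_t origlen = strlen(orig);

    while (origlen != 0 && ret < n) {
        const char *match = strstr(src, orig);
        if (!match)
            break;
        const size_t offset = (size_t)(match - src);
        memcpy(dest, src, offset);                /* text before the match */
        memcpy(dest + offset, new_text, newlen);  /* the replacement */
        src += offset + origlen;
        dest += offset + newlen;
        ret++;
    }
    strcpy(dest, src);                            /* whatever remains */
    return ret;
}

/* Hypothetical wrapper: replace only the first occurrence. */
size_t replaceOne(char *dest, const char *src, const char *orig,
                  const char *new_text)
{
    assert(dest != NULL && src != NULL && orig != NULL && new_text != NULL);
    return replace(dest, src, orig, new_text, 1);
}

/* Hypothetical wrapper: replace every occurrence. */
size_t replaceAll(char *dest, const char *src, const char *orig,
                  const char *new_text)
{
    assert(dest != NULL && src != NULL && orig != NULL && new_text != NULL);
    return replace(dest, src, orig, new_text, SIZE_MAX);
}
```

Both wrappers return the number of replacements made, so a caller can tell "nothing matched" (0) apart from a successful replacement.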

How do I copy a string that has been dynamically allocated to another string that has been dynamically allocated?

I am having trouble trying to implement a custom strcpy function which is supposed to handle cases where the src string is larger than the destination string. Here I have provided some code so that you guys can see the entire function. My issue is that every time I increment *dest, it goes into a null address despite the fact that I have allocated enough memory to fit all of src in it. This causes a segmentation fault at **dest = *src. dest is stored as a char** because in reality, the argument that has to be passed is another string that is possibly of a smaller size than src, and I wish to overwrite *dest as safely as I can.
int customStrCpy(char** dest, char* src){
int strlen1 = strlen(*dest), strlen2 = strlen(src);
if(strlen1 < strlen2){
//Creates a dynamically allocated array that is big enough to store the contents of line2.
*dest = calloc(strlen2, sizeof(char));
char* backup_str = *dest;
int copy_arrs;
for(copy_arrs = 0; copy_arrs < strlen2; copy_arrs++){
**dest = *src;
*dest++; src++;
}
*dest = backup_str;
}
else strcpy(*dest, src);
}
In the end, (char**)dest is supposed to be pointing to the correct string.
Usually strcpy returns char * for "direct" use in other operations.
char *mysStrCpy(char **dest, const char *src)
{
size_t len = strlen(src);
char *tmpptr;
*dest = malloc(len + 1);
// or *dest = realloc(*dest, len + 1);
if(*dest)
{
tmpptr = *dest;
while(*tmpptr++ = *src++);
}
return *dest;
}
You need to add 1 to the string length, to allow for the null terminator, and you should free the old contents of dest if you're allocating a new string. After you do this, you can do the same strcpy() as you do when you don't need to reallocate.
There's also no need for the int return type (unless you want to add error checking to malloc(), and return a status result). This function modifies an argument, it should be void.
void customStrCpy(char** dest, char* src){
int strlen1 = strlen(*dest), strlen2 = strlen(src);
if(strlen1 < strlen2){
free(*dest); // Free the old string
//Creates a dynamically allocated array that is big enough to store the contents of line2.
*dest = malloc(strlen2+1);
}
strcpy(*dest, src); // or memcpy(*dest, src, strlen2+1);
}
*dest++;
increments dest, not the pointer dest points to. You want:
(*dest)++;
ps: there are better ways to accomplish what you are after....
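To make the precedence point concrete, here is a small sketch of the question's copy loop written with (*dest)++. The function name copy_through is hypothetical, and the sketch assumes the buffer behind *dest already has room for strlen(src) + 1 bytes.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: copies src into the buffer that *dest points at.
   (*dest)++ advances the char* that dest points to; *dest++ would instead
   advance dest itself and walk off into unrelated memory. */
void copy_through(char **dest, const char *src)
{
    assert(dest != NULL && *dest != NULL && src != NULL);

    char *start = *dest;        /* remember where the string begins */
    while (*src != '\0') {
        **dest = *src;
        (*dest)++;              /* move the inner pointer one char forward */
        src++;
    }
    **dest = '\0';              /* terminate the copy */
    *dest = start;              /* hand the caller the start of the string */
}
```

After the call, the caller's pointer is back at the first character, which is what the backup_str variable in the question was trying to achieve.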

Convert *char or char to bits

How do I convert * char or char to bits ?
For example:
Here are my declarations:
uint64_t blocks[64];
char * word = "hello";
How do I store the word hello in bytes inside blocks[0] ?
I tried this
int e;
int a = strlen(word);
for (e = 0; e < a; e++) {
blocks[0] |= !!word[e] >> 8;
}
Also, how will I reverse the process?
"I want to copy the bits in a char into a uint64_t."
Try using memcpy:
void * memcpy(void * dst, const void * src, size_t n)
e.g.
memcpy(blocks, word, strlen(word));
More than one string
Regarding your comment which I interpret to be about copying more than one string:
memcpy copies n bytes from src to dst, so if we want to copy several strings in succession, we need to make sure each call to memcpy has dst set to the end of the last string we copied. Assume we want to copy "hello" and then "world" into blocks and end up with the bytes that represent "helloworld".
// if you have a char** words and uint64_t blocks[64]; or similar
uint64_t blocks[64];
const char *words[2] = { "hello", "world" };
size_t offset = 0, len;
int num_words = sizeof words / sizeof words[0], n;
for (n = 0; n < num_words && offset < sizeof blocks; ++n) {
len = strlen(words[n]);
memcpy((char *)blocks + offset, words[n], len); // note the char * cast: arithmetic on void * is not standard C
offset += len;
}
This should be easily adaptable to a situation where you are reading in the strings rather than having an array of array of chars.
Getting a string back again
To take blocks and get a char * with all the bytes in it, we need to remember that strings in C are null terminated, so if we want to treat the result as a string, it needs a null on the end. The last offset you have once you are done copying (from above) could be used to add this.
char new_word[100];
memcpy(new_word, blocks, sizeof new_word);
new_word[offset] = 0;
We don't have to copy the data to treat this as a char *, by the way; We could just cast...
char * new_word = (char *)blocks;
...but remember that if you do this, modifying new_word will also modify blocks.
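The round trip described above (bytes into blocks, then back out with a terminator added) can be sketched as a pair of helpers. The names pack_string and unpack_string and their status-code return values are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copies the bytes of s (without the terminator) into
   blocks and stores the byte count in *nbytes.
   Returns 1 on success, 0 if s does not fit into nblocks elements. */
int pack_string(uint64_t *blocks, size_t nblocks, const char *s,
                size_t *nbytes)
{
    assert(blocks != NULL && s != NULL && nbytes != NULL);

    size_t len = strlen(s);
    if (len > nblocks * sizeof blocks[0])
        return 0;
    memcpy(blocks, s, len);
    *nbytes = len;
    return 1;
}

/* Hypothetical helper: copies nbytes bytes back out of blocks and appends
   the terminator. Returns 1 on success, 0 if out is too small. */
int unpack_string(char *out, size_t out_size, const uint64_t *blocks,
                  size_t nbytes)
{
    assert(out != NULL && blocks != NULL);

    if (nbytes + 1 > out_size)
        return 0;
    memcpy(out, blocks, nbytes);
    out[nbytes] = '\0';
    return 1;
}
```

Note that memcpy preserves the byte sequence in memory, so the string always round-trips, but the numeric value of blocks[0] depends on the machine's byte order.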

What's wrong with this character buffer code?

For reasons that I promise exist, I'm reading input character by character, and if a character meets certain criteria, I'm writing it into a dynamically allocated buffer. This function adds the specified character to the "end" of the specified string. When reading out of the buffer, I read the first 'size' characters.
void append(char c, char *str, int size)
{
if(size + 1 > strlen(str))
str = (char*)realloc(str,sizeof(char)*(size + 1));
str[size] = c;
}
This function, through various iterations of development has produced such errors as "corrupted double-linked list", "double free or corruption". Below is a sample of how append is supposed to be used:
// buffer is a string
// bufSize is the number of non-garbage characters at the beginning of buffer
char *buft = buffer;
int bufLoc=0;
while((buft-buffer)/sizeof(char) < bufSize)
append(*(buft++),destination,bufLoc++);
It generally works for some seemingly arbitrary number of characters, and then aborts with error. If it's not clear what the second code snippet is doing, it's just copying from the buffer into some destination string. I know there's library methods for this, but I need a bit finer control of what exactly gets copied sometimes.
Thanks in advance for any insight. I'm stumped.
This function does not append a character to a buffer.
void append(char c, char *str, int size)
{
if(size + 1 > strlen(str))
str = realloc(str, size + 1);
str[size] = c;
}
First, what is strlen(str)? You can say "it's the length of str", but that's omitting some very important details. How does it compute the length? Easy -- str must be NUL-terminated, and strlen finds the offset of the first NUL byte in it. If your buffer doesn't have a NUL byte at the end, then you can't use strlen to find its length.
Typically, you will want to keep track of the buffer's length. In order to reduce the number of reallocations, keep track of the buffer size and the amount of data in it separately.
struct buf {
char *buf;
size_t buflen;
size_t bufalloc;
};
void buf_init(struct buf *b)
{
    b->buf = NULL;
    b->buflen = 0;
    b->bufalloc = 0;
}
void buf_append(struct buf *b, int c)
{
    // Grow geometrically so that appends are cheap on average
    if (b->buflen >= b->bufalloc) {
        size_t newalloc = b->bufalloc ? b->bufalloc * 2 : 16;
        char *newbuf = realloc(b->buf, newalloc);
        if (!newbuf)
            abort();
        b->buf = newbuf;
        b->bufalloc = newalloc;
    }
    b->buf[b->buflen++] = c;
}
Another problem
This code:
str = realloc(str, size + 1);
It only changes the value of str in append -- it doesn't change the value of str in the calling function. Function arguments are local to the function, and changing them doesn't affect anything outside of the function.
Minor quibbles
This is a bit strange:
// Weird
x = (char*)realloc(str,sizeof(char)*(size + 1));
The (char *) cast is not only unnecessary, but it can actually mask an error -- if you forget to include <stdlib.h>, the cast will allow the code to compile anyway. Bummer.
And sizeof(char) is 1, by definition. So don't bother.
// Fixed
x = realloc(str, size + 1);
When you do a:
str = (char*)realloc(str,sizeof(char)*(size + 1));
the changes in str will not be reflected in the calling function, in other words the changes are local to the function as the pointer is passed by value. To fix this you can either return the value of str:
char * append(char c, char *str, int size)
{
if(size + 1 > strlen(str))
str = (char*)realloc(str,sizeof(char)*(size + 1));
str[size] = c;
return str;
}
or you can pass the pointer by address:
void append(char c, char **str, int size)
{
    if(size + 1 > strlen(*str))
        *str = (char*)realloc(*str,sizeof(char)*(size + 1));
    (*str)[size] = c;
}
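Putting the two answers together, one possible shape for a safe append passes the pointer by address and tracks the length and the capacity explicitly instead of calling strlen on unterminated data. This is a sketch only: the name append_char, the starting capacity of 16, and the choice to keep the buffer NUL-terminated are assumptions.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: appends c to the heap buffer *str, growing it when
   needed. *len is the number of characters stored, *cap the allocated size.
   The buffer is kept NUL-terminated after every call.
   Returns 1 on success, 0 if realloc fails (*str is then left untouched). */
int append_char(char **str, size_t *len, size_t *cap, char c)
{
    assert(str != NULL && len != NULL && cap != NULL);

    if (*len + 2 > *cap) {                  /* need room for c and '\0' */
        size_t new_cap = *cap ? *cap * 2 : 16;
        char *tmp = realloc(*str, new_cap);
        if (tmp == NULL)
            return 0;
        *str = tmp;                         /* change is visible to the caller */
        *cap = new_cap;
    }
    (*str)[(*len)++] = c;
    (*str)[*len] = '\0';
    return 1;
}
```

A caller would start with a NULL pointer and both counters at zero, call append_char(&s, &len, &cap, ch) for each character it wants to keep, and free(s) at the end.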

Why is substring not part of the C standard library?

I know C is purposefully bare-bones, but I'm curious as to why something as commonplace as a substring function is not included in <string.h>.
Is it that there is not one "right enough" way to do it? Too many domain specific requirements? Can anyone shed any light?
BTW, this is the substring function I came up with after a bit of research.
Edit: I made a few updates based on comments.
void substr (char *outStr, const char *inpStr, int startPos, size_t strLen) {
/* Cannot do anything with NULL. */
if (inpStr == NULL || outStr == NULL) return;
size_t len = strlen (inpStr);
/* Allow negative positions to count from the end, and cannot
start before start of string, force to start. */
if (startPos < 0) {
startPos = len + startPos;
}
if (startPos < 0) {
startPos = 0;
}
/* Force negative lengths to zero and cannot
start after end of string, force to end. */
if ((size_t)startPos > len) {
startPos = len;
}
len = strlen (&inpStr[startPos]);
/* Adjust length if source string too short. */
if (strLen > len) {
strLen = len;
}
/* Copy string section */
memcpy(outStr, inpStr+startPos, strLen);
outStr[strLen] = '\0';
}
Edit: Based on a comment from r I also came up with this one liner. You're on your own for checks though!
#define substr(dest, src, startPos, strLen) snprintf(dest, BUFF_SIZE, "%.*s", strLen, src+startPos)
Basic standard library functions don't burden themselves with excessive expensive safety checks, leaving them to the user. Most of the safety checks you carry out in your implementation are of expensive kind: totally unacceptable in such a basic library function. This is C, not Java.
Once you get some checks out of the picture, the "substring" function boils down to an ordinary strlcpy. I.e., ignoring the safety check on startPos, all you need to do is
char *substr(const char *inpStr, char *outStr, size_t startPos, size_t strLen) {
    strlcpy(outStr, inpStr + startPos, strLen + 1); // the size argument counts the terminator
    return outStr;
}
strlcpy is not part of the standard library, but it can be crudely replaced by a [misused] strncpy. Again, ignoring the safety check on startPos, all you need to do is
char *substr(const char *inpStr, char *outStr, size_t startPos, size_t strLen) {
strncpy(outStr, inpStr + startPos, strLen);
outStr[strLen] = '\0';
return outStr;
}
Ironically, in your code strncpy is misused in the very same way. On top of that, many of your safety checks are the direct consequence of your choosing a signed type (int) to represent indices, while proper type would be an unsigned one (size_t).
Perhaps because it's a one-liner:
snprintf(dest, dest_size, "%.*s", sub_len, src+sub_start);
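As a hedged illustration of that one-liner, the sketch below wraps it in a function with the argument checks the bare call leaves out. The name substr_snprintf and the -1 error return are assumptions; the "%.*s" precision trick itself is standard printf behaviour.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical wrapper: writes at most sub_len characters of src, starting
   at index sub_start, into dest (always NUL-terminated).
   Returns the length of the requested substring, or -1 on bad arguments.
   A return value >= dest_size means the output was truncated. */
int substr_snprintf(char *dest, size_t dest_size, const char *src,
                    size_t sub_start, size_t sub_len)
{
    assert(dest != NULL && src != NULL);

    if (dest_size == 0 || sub_start > strlen(src) || sub_len > INT_MAX)
        return -1;
    /* The precision consumed by %.*s must be an int, hence the cast. */
    return snprintf(dest, dest_size, "%.*s", (int)sub_len, src + sub_start);
}
```

With src set to "This is a long string", a start of 10 and a length of 4, dest receives "long". Asking for more characters than remain simply stops at the end of src.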
You DO have strcpy and strncpy. Aren't they enough for you? With strcpy you can simulate the substring from a character to the end; with strncpy you can simulate the substring from a character for a number of characters (you only need to remember to add the \0 at the end of the string). strncpy is even better than the C# equivalent, because you can overshoot the length of the substring and it won't throw an error (if you have allocated enough space in dest, you can do strncpy(dest, src, 1000) even if src has length 1. In C# you can't.)
As written in the comment, you can even use memcpy, but remember to always add a \0 at the end of the string, and you must know how many characters you are copying (so you must know the exact length of the src substring). It is also a little more complex to use if one day you want to refactor your code to use wchar_t, and it is not type-safe (because it accepts void* instead of char*). All this in exchange for a little more speed over strncpy.
In C you have a function that returns a subset of symbols from a string via pointers: strstr.
char *ptr;
char string1[] = "Hello World";
char string2[] = "World";
ptr = strstr(string1, string2);
ptr will point to the first character of the first occurrence of string2 in string1 (or be NULL if there is no occurrence).
BTW you did not write a function but a procedure, ANSI string functions: string.h
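Note that strstr locates a substring rather than extracting one: it returns a pointer into the original string. A minimal sketch of turning that pointer into an index follows; the helper name find_offset is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: index of the first occurrence of needle in haystack,
   or -1 if needle does not occur. */
ptrdiff_t find_offset(const char *haystack, const char *needle)
{
    assert(haystack != NULL && needle != NULL);

    const char *hit = strstr(haystack, needle);
    return hit ? hit - haystack : -1;   /* pointer difference gives the index */
}
```

For "Hello World" and "World" the result is 6. Combining that index with one of the copying approaches from the other answers gives an actual substring.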
Here's a lighter weight version of what you want. It avoids the redundant strlen calls and guarantees null termination of the destination buffer (something strncpy won't do).
void substr(const char* pszSrc, int start, int N, char* pszDst, int lenDest)
{
    const char* psz = pszSrc + start;
    int x = 0;
    if (lenDest <= 0)
    {
        return;
    }
    // copy at most N characters, leaving room for the terminator
    while ((x < N) && (x < lenDest - 1) && (psz[x] != '\0'))
    {
        pszDst[x] = psz[x];
        x++;
    }
    // guarantee null termination
    pszDst[x] = '\0';
}
Example:
const char *pszLongString = "This is a long string";
char szSub[10];
substr(pszLongString, 10, 4, szSub, 10); // copies "long" into szSub and includes the null char
So while there isn't a formal substring function in C, C++ string classes usually have such a method:
#include <string>
...
std::string str;
std::string strSub;
str = "This is a long string";
strSub = str.substr(10, 4); // "long"
printf("%s\n", strSub.c_str());
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *substr(const char *string, size_t from, size_t to);
int main(int argc, char *argv[])
{
    if (argc < 2)
        return 1;
    char *string = argv[1];
    char *substring = substr(string, 6, 80);
    printf("string is [%s] substring is [%s]\n", string, substring ? substring : "(null)");
    free(substring);
    return 0;
}
/* Returns a malloc'd copy of string[from..to), or NULL on error. */
char *substr(const char *string, size_t from, size_t to)
{
    if (string == NULL)
        return NULL;
    size_t len = strlen(string);
    if (to > len)
        to = len;
    if (from >= to) /* also covers an empty string and from past the end */
        return NULL;
    char *substring = malloc((to - from) + 1);
    if (substring == NULL)
        return NULL;
    size_t index;
    for (index = 0; from < to; from++, index++)
        substring[index] = string[from];
    substring[index] = '\0';
    return substring;
}
