Copying some strings from pointer array in C++ - c

I have a string pointer like below,
char *str = "This is cool stuff";
Now, I've references to this string pointer like below,
char* start = str + 1;
char* end = str + 6;
So, start and end point to different locations within str. How can I copy the characters that fall between start and end into a new string? An existing C++/C function is preferable.

Just create a new buffer called dest and use strncpy
char dest[end-start+1];
strncpy(dest,start,end-start);
dest[end-start] = '\0';

Use STL std::string:
#include <string>
const char *str = "This is cool stuff";
std::string part( str + 1, str + 6 );
This uses iterator range constructor, so the part of the C-string does not have to be zero-terminated.

It's best to do this with memcpy(), and terminate the result yourself. The standard strncpy() function has very strange semantics.
If you really want a "new string pointer", and be a bit safe with regard to lengths and static buffers, you need to dynamically allocate the new string:
char * ranged_copy(const char *start, const char *end)
{
char *s;
s = malloc(end - start + 1);
if (s == NULL) return NULL; /* allocation failed */
memcpy(s, start, end - start);
s[end - start] = 0;
return s;
}

If you want to do this with C++ STL:
#include <string>
...
std::string cppStr (str, 1, 6); // copies 6 characters starting at index 1 of str; pass 5 to match end - start
const char *newStr = cppStr.c_str(); // make new char* from substring

char *newChar = new char[end - start + 1];
char *p = newChar;
while (start < end)
*p++ = *start++;
*p = '\0'; // terminate the copy

This is one of the rare cases when function strncpy can be used. Just calculate the number of characters you need to copy and specify that exact amount in the strncpy. Remember that strncpy will not zero-terminate the result in this case, so you'll have to do it yourself (which, BTW, means that it makes more sense to use memcpy instead of the virtually useless strncpy).
And please, do yourself a favor, start using const char * pointers with string literals.

Assuming that end follows the idiomatic semantics of pointing just past the last item you want copied (STL semantics are a useful idiom even if we're dealing with straight C) and that your destination buffer is known to have enough space:
memcpy( buf, start, end-start);
buf[end-start] = '\0';
I'd wrap this in a sub-string function that also took the destination buffer size as a parameter so it could perform a check and truncate the result or return an error to prevent overruns.
I'd avoid using strncpy() because too many programmers forget about the fact that it might not terminate the destination string, so the second line might be mistakenly dropped at some point by someone believing it unnecessary. That's less likely if memcpy() were used. (In general, just say no to using strncpy())
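A minimal sketch of the wrapper described above, assuming end points just past the last character to copy. The name copy_range and the choice to report an error rather than truncate are illustrative assumptions, not something taken from the answers:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the characters in [start, end) into buf,
 * which has room for bufsize bytes, and always null-terminates.
 * Returns the number of characters copied, or (size_t)-1 if the range
 * does not fit, in which case nothing is written. */
size_t copy_range(char *buf, size_t bufsize, const char *start, const char *end)
{
    assert(buf != NULL && start != NULL && end != NULL && start <= end);

    size_t len = (size_t)(end - start);
    if (len >= bufsize) {
        return (size_t)-1;      /* no room for len characters plus '\0' */
    }
    memcpy(buf, start, len);    /* copy exactly len bytes */
    buf[len] = '\0';            /* terminate explicitly */
    return len;
}
```

Returning an error keeps the caller from silently working with a shortened string; a truncating variant would copy bufsize - 1 bytes instead.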

Related

Why is this use of strcpy considered bad?

I've spotted the following piece of C code, marked as BAD (aka buffer overflow bad).
The problem is I don't quite get why? The input string length is captured before the allocation etc.
char *my_strdup(const char *s)
{
size_t len = strlen(s) + 1;
char *c = malloc(len);
if (c) {
strcpy(c, s); // BAD
}
return c;
}
Update from comments:
The 'BAD' marker is imprecise: the code is not wrong, although it is less efficient than it could be and it is fragile (see below).
Why fragile? The +1 after the strlen() call is required so that the heap allocation also has room for the string terminator ('\0').
There is no bug in your sample function.
However, to make it obvious to future readers (both human and mechanical) that there is no bug, you should replace the strcpy call with a memcpy:
char *my_strdup(const char *s)
{
size_t len = strlen(s) + 1;
char *c = malloc(len);
if (c) {
memcpy(c, s, len);
}
return c;
}
Either way, len bytes are allocated and len bytes are copied, but with memcpy that fact stands out much more clearly to the reader.
There's no problem with this code.
While it's possible that strcpy can cause undefined behavior if the destination buffer isn't large enough to hold the string in question, the buffer is allocated to be the correct size. This means there is no risk of overrunning the buffer.
You may see some guides recommend using strncpy instead, which allows you to specify the maximum number of characters to copy, but this has its own problems. If the source string is too long, only the specified number of characters will be copied, however this also means that the string isn't null terminated which requires the user to do so manually. For example:
char src[] = "test data";
char dest[5];
strncpy(dest, src, sizeof dest); // dest holds "test " with no null terminator
dest[sizeof(dest) - 1] = 0; // manually null terminate, dest holds "test"
I tend towards the use of strcpy if I know the source string will fit, otherwise I'll use strncpy and manually null-terminate.
I cannot see any problem with the code when it comes to the use of strcpy
But you should be aware that it requires s to be a valid C string. That is a reasonable requirement, but it should be specified.
If you want, you could put in a simple check for NULL, but I would say that it's ok to do without it. If you're about to make a copy of a "string" pointed to by a null pointer, then you probably should check either the argument or the result. But if you want, just add this as the first line:
if(!s) return NULL;
But as I said, it does not add much. It just makes it possible to change
if(!str) {
// Handle error
} else {
new_str = my_strdup(str);
}
to:
new_str = my_strdup(str);
if(!new_str) {
// Handle error
}
Not really a huge gain

Inserting strings into another string in C

I'm implementing a function which, given a string, a character and another string (from now on, the "substring"), puts the substring everywhere the character appears in the string.
To explain it better, given these parameters this is what the function should return (pseudocode):
func ("aeiou", 'i', "hello") -> aehelloou
I'm using some functions from string.h lib. I have tested it with pretty good result:
char *somestring= "this$ is a tes$t wawawa$wa";
printf("%s", strcinsert(somestring, '$', "WHAT?!") );
Outputs: thisWHAT?! is a tesWHAT?!t wawawaWHAT?!wa
So far everything is all right. The problem appears when I try to do the same with, for example, this string:
char *somestring= "this \"is a test\" wawawawa";
printf("%s", strcinsert(somestring, '"', "\\\"") );
since I want to change every " into a \" . When I do this, the PC collapses. I don't know why, but it stops working and then shuts down. I've heard something about the bad behavior of some functions of the string.h lib but I couldn't find any information about this. I'd really appreciate any help.
My code:
#define salloc(size) (str)malloc(size+1) //i'm lazy
typedef char* str;
str strcinsert (str string, char flag, str substring)
{
int nflag= 0; //this is the number of times the character appears
for (int i= 0; i<strlen(string); i++)
if (string[i]==flag)
nflag++;
str new=string;
int pos;
while (strchr(string, flag)) //since when its not found returns NULL
{
new= salloc(strlen(string)+nflag*strlen(substring)-nflag);
pos= strlen(string)-strlen(strchr(string, flag));
strncpy(new, string, pos);
strcat(new, substring);
strcat(new, string+pos+1);
string= new;
}
return new;
}
Thanks for any help!
Some advice:
refrain from typedef char* str;. The char * type is common in C and masking it will just make your code harder to be reviewed
refrain from #define salloc(size) (str)malloc(size+1) for the exact same reason. In addition don't cast malloc in C
each time you write a malloc (or calloc or realloc) there should be a corresponding free: C has no garbage collection
dynamic allocation is expensive, use it only when needed. Said differently a malloc inside a loop should be looked at twice (especially if there is no corresponding free)
always test the result of allocation functions (and, unrelatedly, of I/O functions): malloc simply returns NULL when you exhaust memory. A nice error message is then easier to understand than a crash
learn to use a debugger: if you had executed your code under a debugger the error would have been evident
Next, the cause: if the replacement string contains the flag character, the next search finds it again and you end up in an endless loop
A possible workaround: allocate the result string before the loop and advance both in the original one and the result. It will save you from unnecessary allocations and de-allocations, and will be immune to the original char being present in the replacement string.
Possible code:
// the result is an allocated string that must be freed by caller
str strcinsert(str string, char flag, str substring)
{
int nflag = 0; //this is the number of times the character appears
for (int i = 0; i<strlen(string); i++)
if (string[i] == flag)
nflag++;
int pos;
str new_ = salloc(strlen(string) + nflag*strlen(substring) - nflag);
if (new_ == NULL) return NULL; // allocation failed
if (nflag == 0) strcpy(new_, string); // no flag found: the loop below never runs
char * cur = new_;
char *old = string;
while (NULL != (string = strchr(string, flag))) //since when its not found returns NULL
{
pos = string - old;
strncpy(cur, old, pos);
cur[pos] = '\0'; // strncpy does not null terminate the dest. string
strcat(cur, substring);
strcat(cur, string + 1);
cur += strlen(substring) + pos; // advance the result
old = ++string; // and the input string
}
return new_;
}
Note: I have not reverted the str typedef and the salloc macro, but you really should.
In your second loop, you always look for the first flag character in the string. In this case, that’ll be the one you just inserted from substring. The strchr function will always find that quote and never return NULL, so your loop will never terminate and just keep allocating memory (and not enough of it, since your string grows arbitrarily large).
Speaking of allocating memory, you need to be more careful with that. Unlike in Python, C doesn’t automatically notice when you’re no longer using memory; anything you malloc must be freed. You also allocate far more memory than you need: even in your working "this$ is a tes$t wawawa$wa" example, you allocate enough space for the full string on each iteration of the loop, and never free any of it. You should just run the allocation once, before the second loop.
This isn’t as important as the above stuff, but you should also pay attention to performance. Each call to strcat and strlen iterates over the entire string, meaning you look at it far more often than you need. You should instead save the result of strlen, and copy the new string directly to where you know the NUL terminator is. The same goes for strchr; you already replaced the beginning of the string and don’t want to waste time looking at it again, apart from the part where that’s causing your current bug.
In comparison to these issues, the style issues mentioned in the comments with your typedef and macro are relatively minor, but they are still worth mentioning. A char* in C is different from a str in Python; trying to typedef it to the same name just makes it more likely you’ll try to treat them as the same and run into these issues.
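As a rough sketch of the points above (one allocation, lengths measured once, and each search resuming after the previous match), something like the following could work. The function name replace_char_alloc is made up for illustration, and it assumes flag is not '\0':

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical name: returns a newly allocated copy of src in which every
 * occurrence of flag is replaced by sub. The caller must free() the result.
 * Returns NULL if the allocation fails. */
char *replace_char_alloc(const char *src, char flag, const char *sub)
{
    assert(src != NULL && sub != NULL && flag != '\0');

    size_t src_len = strlen(src);   /* measured once */
    size_t sub_len = strlen(sub);

    /* Count the flags, always searching from just after the last match. */
    size_t count = 0;
    for (const char *p = strchr(src, flag); p != NULL; p = strchr(p + 1, flag)) {
        count++;
    }

    /* Each 1-byte flag becomes sub_len bytes; +1 for the terminator. */
    char *out = malloc(src_len - count + count * sub_len + 1);
    if (out == NULL) {
        return NULL;
    }

    char *dst = out;
    const char *cur = src;
    const char *hit;
    while ((hit = strchr(cur, flag)) != NULL) {
        size_t chunk = (size_t)(hit - cur);
        memcpy(dst, cur, chunk);    /* text before the flag */
        dst += chunk;
        memcpy(dst, sub, sub_len);  /* the replacement itself */
        dst += sub_len;
        cur = hit + 1;              /* resume after the flag in the input */
    }
    strcpy(dst, cur);               /* remaining tail, including '\0' */
    return out;
}
```

Because the search only ever runs over the untouched input, a replacement that contains the flag character (such as \" for ") cannot be matched again.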
I don't know why but it stops working
strchr(string, flag) is looking over the whole string for flag. The search needs to be limited to the portion of the string not yet examined/updated. By re-searching the partially replaced string, the code finds the flag over and over.
The whole string management approach needs re-work. As OP reported a Python background, I've posted a very C approach as mimicking Python is not a good approach here. C is different especially in the management of memory.
Untested code
// Look for needles in a haystack and replace them
// Note that replacement may be "" and result in a shorter string than haystack
char *strcinsert_alloc(const char *haystack, char needle, const char *replacment) {
size_t n = 0;
const char *s = haystack;
while (*s) {
if (*s == needle) n++; // Find needle count
s++;
}
size_t replacemnet_len = strlen(replacment);
// string length - needles + replacements + \0
size_t new_size = (size_t)(s - haystack) - n*1 + n*replacemnet_len + 1;
char *dest = malloc(new_size);
if (dest) {
char *d = dest;
s = haystack;
while (*s) {
if (*s == needle) {
memcpy(d, replacment, replacemnet_len); // copy the replacement text, not the haystack
d += replacemnet_len;
} else {
*d = *s;
d++;
}
s++;
}
*d = '\0';
}
return dest;
}
In your program, you are facing problem for input -
char *somestring= "this \"is a test\" wawawawa";
as you want to replace " for a \".
The first problem is that whenever you replace " with \" in the string, the next call to strchr(string, flag) finds the " of the \" you just inserted. So, in subsequent iterations your string evolves like this -
this \"is a test" wawawawa
this \\"is a test" wawawawa
this \\\"is a test" wawawawa
So, for the input string "this \"is a test\" wawawawa" your while loop runs forever, because every time strchr(string, flag) finds the " of the last inserted \".
The second problem is the memory allocation performed in every iteration of your while loop. There is no free() for the memory allocated to new. So when the while loop runs forever, it eats up all the memory, which leads to the PC collapsing.
To resolve this, in every iteration, you should search for flag only in the string starting from a character after the last inserted substring to the end of the string. Also, make sure to free() the dynamically allocated memory.

C setting string equal to substring

In C, If I have:
char *reg = "[R5]";
and I want
char *reg_alt = "R5" (equal to the same thing, but without the brackets), how do I do this?
I tried
*char reg_alt = reg[1:2];
but this doesn't work.
There is no built-in syntax for dealing with substrings like that, so you need to copy the content manually:
char res[3];
memcpy(res, &reg[1], 2);
res[2] = '\0';
I suggest you need to read a basic text on C, rather than assuming techniques from other languages will just work.
First, char *reg = "[R5]"; is not a string. It is a pointer, that is initialised to point to (i.e. its value is the address of) the first character of a string literal ("[R5]").
Second, reg_alt is also a pointer, not a string. Assigning to it stores the address of something. Strings are not first class citizens in C, so the assignment operator doesn't work with them.
Third, 1:2 does not specify a range in C - it is simply invalid syntax. Yes, I know other languages do. But not C. Hence my comment that you cannot assume C will allow things the way that other languages do.
If you want to obtain a substring from another string, there are various ways. For example;
char substring[3];
const char *reg = "[R5]"; /* const since the string literal should not be modified */
strncpy(substring, &reg[1], 2); /* copy 2 characters, starting at reg[1], to substring */
substring[2] = '\0'; /* terminate substring */
printf("%s\n", substring);
strncpy() is declared in standard header <string.h>. The termination of the substring is needed, since printf() %s format looks for a zero character to mark the end.
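If the position of the brackets is not fixed, the same idea can be combined with strchr to locate them first. The sketch below is only an illustration; the helper name extract_bracketed and its return convention are assumptions:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the text between the first '[' and the next
 * ']' of s into dest (capacity size). Returns 0 on success, -1 if the
 * brackets are missing or the text does not fit. */
int extract_bracketed(const char *s, char *dest, size_t size)
{
    assert(s != NULL && dest != NULL);

    const char *left = strchr(s, '[');
    if (left == NULL) {
        return -1;
    }
    const char *right = strchr(left + 1, ']');
    if (right == NULL) {
        return -1;
    }

    size_t len = (size_t)(right - (left + 1));
    if (len >= size) {
        return -1;              /* need room for len characters plus '\0' */
    }
    memcpy(dest, left + 1, len);
    dest[len] = '\0';
    return 0;
}
```

With the string from the question, extract_bracketed("[R5]", buf, sizeof buf) leaves "R5" in buf as long as buf holds at least 3 bytes.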
When using null-terminated strings (the default in C), you can indeed cheaply create a substring of another string by simply changing the starting character pointer, but you cannot make the new substring have a different null-terminator.
An option is to use a Pascal-string library. Pascal-strings are length-prefixed instead of C-strings which are null-terminated, which means Pascal-strings can share contents of a larger string buffer and substring generation is cheap (O(1)-cheap). A Pascal string looks like this:
typedef struct PString {
size_t length;
char* start;
} PString;
PString substring(const PString* source, size_t offset, size_t length) {
// Using a C99 compound literal with designated initializers:
return (PString){ .length = length, .start = source->start + offset };
}
The downside is that most of the C library and platform libraries use null-terminated strings and unless your Pascal-string ends in a null character you'll need to copy the substring to a new buffer (in O(n) time).
Of course, if you're feeling dangerous (and using mutable character buffers) then you can hack it to temporarily insert a null-terminator, like so:
typedef struct CStr {
char* start;
char* end;
char temp;
} CStr;
CStr getCStr(PString* source) {
char* terminator = (source->start + source->length);
char previous = *terminator; // remember the character we are about to overwrite
*terminator = '\0';
return (CStr){ .start = source->start, .end = terminator, .temp = previous };
}
void undoGetCStr(CStr cstr) {
*cstr.end = cstr.temp; // restore the overwritten character
}
Used like so:
PString somePascalString = doSomethingWithPascalStrings();
CStr temp = getCStr( &somePascalString );
printf("My Pascal string: %s", temp.start ); // using a function that expects a C-string
undoGetCStr( temp );
...which then gives you O(1) PString-to-CString performance, provided you don't care about thread-safety.
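For the common case of printing or formatting such a slice, a precision argument ("%.*s") avoids writing into the source buffer at all, so it also works on string literals. This is a sketch under that assumption; the Slice type and slice_to_cstr name are invented for the example:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* A pointer-plus-length view into an existing buffer (hypothetical type). */
typedef struct {
    const char *start;
    size_t length;
} Slice;

/* Formats the slice into dest as a null-terminated string without touching
 * the source buffer. Returns 0 on success, -1 if it does not fit. */
int slice_to_cstr(Slice s, char *dest, size_t size)
{
    assert(s.start != NULL && dest != NULL);
    if (s.length > (size_t)INT_MAX) {
        return -1;              /* the precision argument must be an int */
    }

    /* "%.*s" writes at most s.length bytes of the source, so the slice
     * itself does not need to end in a null character. */
    int n = snprintf(dest, size, "%.*s", (int)s.length, s.start);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```

The same format works directly with printf, for example printf("%.*s\n", (int)s.length, s.start), which sidesteps the thread-safety concern of temporarily modifying the buffer.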
Does it need to be a char *?
Because that only works when it is a "string".
So maybe you need this:
char reg[] = "[R5]";
Then you can do the other thing
or just split the string like this question

Copying a part of a string (substring) in C

I have a string:
char * someString;
If I want the first five letters of this string and want to set it to otherString, how would I do it?
#include <string.h>
...
char otherString[6]; // note 6, not 5, there's one there for the null terminator
...
strncpy(otherString, someString, 5);
otherString[5] = '\0'; // place the null terminator
Generalized:
char* subString (const char* input, int offset, int len, char* dest)
{
int input_len = strlen (input);
if (offset + len > input_len)
{
return NULL;
}
strncpy (dest, input + offset, len);
dest[len] = '\0'; // strncpy does not terminate dest when it copies exactly len characters
return dest;
}
char dest[80];
const char* source = "hello world";
if (subString (source, 0, 5, dest))
{
printf ("%s\n", dest);
}
char* someString = "abcdedgh";
char* otherString = 0;
otherString = (char*)malloc(5+1);
memcpy(otherString,someString,5);
otherString[5] = 0;
UPDATE:
Tip: A good way to understand definitions is called the right-left rule (some links at the end):
Start reading from identifier and say aloud => "someString is..."
Now go to right of someString (statement has ended with a semicolon, nothing to say).
Now go left of identifier (* is encountered) => so say "...a pointer to...".
Now go to left of "*" (the keyword char is found) => say "..char".
Done!
So char* someString; => "someString is a pointer to char".
Since a pointer simply points to a certain memory address, it can also be used as the "starting point" for an "array" of characters.
That works with anything .. give it a go:
char* s[2]; //=> s is an array of two pointers to char
char** someThing; //=> someThing is a pointer to a pointer to char.
//Note: We look in the brackets first, and then move outward
char (* s)[2]; //=> s is a pointer to an array of two char
Some links:
How to interpret complex C/C++ declarations and
How To Read C Declarations
You'll need to allocate memory for the new string otherString. In general for a substring of length n, something like this may work for you (don't forget to do bounds checking...)
char *subString(char *someString, int n)
{
char *new = malloc(sizeof(char)*n+1);
strncpy(new, someString, n);
new[n] = '\0';
return new;
}
This will return a substring of the first n characters of someString. Make sure you free the memory when you are done with it using free().
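A possible version with the bounds checking mentioned above, sketched under the assumption that clamping to the available characters is the desired behaviour (returning an error would be an equally valid choice). The name substring_alloc and the offset parameter are additions for illustration:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical variant with bounds checking: returns a newly allocated copy
 * of at most n characters of src starting at offset. If offset is past the
 * end of src the result is an empty string. Returns NULL on allocation
 * failure. The caller must free() the result. */
char *substring_alloc(const char *src, size_t offset, size_t n)
{
    assert(src != NULL);

    size_t src_len = strlen(src);
    if (offset > src_len) {
        offset = src_len;           /* clamp: nothing left to copy */
    }
    if (n > src_len - offset) {
        n = src_len - offset;       /* clamp to the characters available */
    }

    char *out = malloc(n + 1);
    if (out == NULL) {
        return NULL;
    }
    memcpy(out, src + offset, n);
    out[n] = '\0';
    return out;
}
```

Since the length is known after clamping, memcpy is enough here and the question of whether strncpy terminates never comes up.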
You can use snprintf to get a substring of a char array with precision:
#include <stdio.h>
int main()
{
const char source[] = "This is a string array";
char dest[17];
// get first 16 characters using precision
snprintf(dest, sizeof(dest), "%.16s", source);
// print substring
puts(dest);
} // end main
Output:
This is a string
Note:
For further information see printf man page.
You can treat C strings like pointers. So when you declare:
char str[10];
str can be used as a pointer. So if you want to copy just a portion of the string you can use:
char str1[] = "This is a simple string.";
char str2[7]; // 6 characters plus the null terminator
strncpy(str2, str1 + 10, 6); // destination first, then source
str2[6] = '\0';
This will copy 6 characters from the str1 array into str2, starting at the 11th element.
I had not seen this post until now. The present collection of answers forms an orgy of bad advice and compiler errors; only the few recommending memcpy are correct. Basically the answer to the question is:
otherString = allocated_memory; // statically or dynamically
memcpy(otherString, someString, 5);
otherString[5] = '\0';
This assumes we know that someString is at least 5 characters long; given that, this is the correct answer, period. memcpy is faster and safer than strncpy and there is no confusion about whether memcpy null terminates the string or not - it doesn't, so we definitely have to append the null termination manually.
The main problem here is that strncpy is a very dangerous function that should not be used for any purpose. The function was never intended to be used for null terminated strings and its presence in the C standard is a mistake. See Is strcpy dangerous and what should be used instead?, I will quote some relevant parts from that post for convenience:
Somewhere at the time when Microsoft flagged strcpy as obsolete and dangerous, some other misguided rumour started. This nasty rumour said that strncpy should be used as a safer version of strcpy. Since it takes the size as parameter and it's already part of the C standard lib, so it's portable. This seemed very convenient - spread the word, forget about non-standard strcpy_s, lets use strncpy! No, this is not a good idea...
Looking at the history of strncpy, it goes back to the very earliest days of Unix, where several string formats co-existed. Something called "fixed width strings" existed - they were not null terminated but came with a fixed size stored together with the string. One of the things Dennis Ritchie (the inventor of the C language) wished to avoid when creating C, was to store the size together with arrays [The Development of the C Language, Dennis M. Ritchie]. Likely in the same spirit as this, the "fixed width strings" were getting phased out over time, in favour for null terminated ones.
The function used to copy these old fixed width strings was named strncpy. This is the sole purpose that it was created for. It has no relation to strcpy. In particular it was never intended to be some more secure version - computer program security wasn't even invented when these functions were made.
Somehow strncpy still made it into the first C standard in 1989. A whole lot of highly questionable functions did - the reason was always backwards compatibility. We can also read the story about strncpy in the C99 rationale 7.21.2.4:
The strncpy function
strncpy was initially introduced into the C library to deal with fixed-length name fields in
structures such as directory entries. Such fields are not used in the same way as strings: the
trailing null is unnecessary for a maximum-length field, and setting trailing bytes for shorter
names to null assures efficient field-wise comparisons. strncpy is not by origin a “bounded
strcpy,” and the Committee preferred to recognize existing practice rather than alter the function
to better suit it to such use.
The Codidact link also contains some examples showing how strncpy will fail to terminate a copied string.
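To make the fixed-width behaviour described in the rationale concrete, here is a small sketch. The field width and function names are invented for the example; only strncpy and memcmp are standard:

```c
#include <assert.h>
#include <string.h>

#define NAME_WIDTH 8   /* hypothetical fixed field width */

/* Fills a fixed-width name field that is not necessarily null-terminated.
 * strncpy pads shorter names with '\0' bytes; names of NAME_WIDTH
 * characters or more fill the whole field and leave no terminator. */
void set_name_field(char field[NAME_WIDTH], const char *name)
{
    assert(field != NULL && name != NULL);
    strncpy(field, name, NAME_WIDTH);
}

/* Field-wise comparison over all NAME_WIDTH bytes. This is only meaningful
 * because strncpy zero-fills the unused tail. Returns 1 if equal. */
int name_fields_equal(const char a[NAME_WIDTH], const char b[NAME_WIDTH])
{
    return memcmp(a, b, NAME_WIDTH) == 0;
}
```

Used this way the missing terminator is a feature of the field format, which is exactly why the same function is a poor fit for ordinary C strings.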
I think this is an easy way. I did not know how to return the result directly, so the caller passes in a char array as the destination and the function returns it.
char* substr(char *buff, uint8_t start,uint8_t len, char* substr)
{
strncpy(substr, buff+start, len);
substr[len] = 0;
return substr;
}
strncpy(otherString, someString, 5);
Don't forget to allocate memory for otherString.
#include <stdio.h>
#include <string.h>
int main ()
{
char someString[]="abcdedgh";
char otherString[]="00000";
memcpy (otherString, someString, 5);
printf ("someString: %s\notherString: %s\n", someString, otherString);
return 0;
}
You will not need stdio.h if you don't use the printf statement. Putting literal constants in anything but the smallest programs is bad form and should be avoided.
Doing it all in two fell swoops:
char *otherString = strncpy((char*)malloc(6), someString, 5); // note: the malloc result is not checked here
otherString[5] = 0;
char largeSrt[] = "123456789-123"; // original string
char * substr;
substr = strchr(largeSrt, '-'); // points at "-123" inside the original string
int substringLength = strlen(largeSrt) - strlen(substr); // 13 - 4 = 9 (full length minus the length of the tail)
char *newStr = malloc(sizeof(char) * substringLength + 1); // allocate memory for the new string
strncpy(newStr, largeSrt, substringLength); // copy only the first 9 characters
newStr[substringLength] = '\0'; // terminate the new string
printf("newStr=%s\n", newStr);
free(newStr); // you free the memory
Try this code:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
char* substr(const char *src, unsigned int start, unsigned int end);
int main(void)
{
char *text = "The test string is here";
char *subtext = substr(text,9,14);
printf("The original string is: %s\n",text);
printf("Substring is: %s\n",subtext);
free(subtext);
return 0;
}
char* substr(const char *src, unsigned int start, unsigned int end)
{
unsigned int subtext_len = end-start+2;
char *subtext = malloc(sizeof(char)*subtext_len);
strncpy(subtext,&src[start],subtext_len-1);
subtext[subtext_len-1] = '\0';
return subtext;
}

Typecast:LPCTSTR to Char * for string concatenate operation

Can you give a solution for typecasting an LPCTSTR (here lpSubKey) to char*
in the code snippet below?
char* s="HKEY_CURRENT_USER\\";
strcat(s,(char*)lpSubKey);
printf("%S",s);
Here it causes an access violation error, so what is the solution?
...thanks in advance
There are several issues with your code that might well lead to the access violation. I don't think any have anything to do with the cast you mentioned.
You are assigning a pointer to the first element of a fixed size char array to a char * and then attempt to append to this using strcat. This is wrong as there is no additional space left in the implicitly allocated string array. You will need to allocate a buffer big enough to hold the resulting string and then copy the string constant in there before calling strcat. For example, like so:
char *s = (char*)malloc(1024 * sizeof(char));
strcpy(s, "HKEY_CURRENT_USER\\");
strcat(s, T2A(lpSubKey));
printf("%s", s);
free(s);
Please note that the fixed size array I'm allocating above is bad practise. In production code you should always determine the correct size of the array on the go to prevent buffer overflows or use functions like strncat and strncpy to ensure that you are not copying more data into the buffer than the buffer can hold.
These are not the same thing. What are you trying to do?
The problem is you are trying to append to a string that you have not reserved memory for.
Try:
char s[1024] = "HKEY_CURRENT_USER\\";
strcat(s,(char*)lpSubKey );
printf("%s",s);
Do be careful with the arbitrary size of 1024. If you expect your keys to be much longer your program will crash.
Also, look at strcat_s.
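Since strcat_s is not available everywhere, a portable way to get a similar check is snprintf, which reports how long the full result would have been. The sketch below assumes both inputs are already narrow (char) strings, so it does not address the LPCTSTR conversion itself; join_key is a made-up name:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes prefix followed by subkey into dest
 * (capacity size). Returns 0 on success, -1 if the result did not fit
 * or a formatting error occurred. */
int join_key(char *dest, size_t size, const char *prefix, const char *subkey)
{
    assert(dest != NULL && prefix != NULL && subkey != NULL);

    int n = snprintf(dest, size, "%s%s", prefix, subkey);
    if (n < 0 || (size_t)n >= size) {
        return -1;      /* did not fit; dest holds at most a truncated copy */
    }
    return 0;
}
```

Checking the return value turns an overlong key into a reportable error instead of a crash or silent overrun.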
ATL and MFC have a set of macros for such conversions, whose names use the following letters:
W - wide unicode string
T - generic character string
A - ANSI character string
OLE - BSTR string,
so in your case you need the T2A macro
strcat does not attempt to make room for the combination. You are overwriting memory that isn't part of the string. Off the top of my head:
char *strcat_with_alloc(char *s1, char *s2)
{
if (!s1 || !s2) return NULL;
size_t len1 = strlen(s1);
size_t len2 = strlen(s2);
char *dest = (char *)malloc(len1 + len2 + 1);
if (!dest) return NULL;
strcpy(dest, s1);
strcat(dest, s2);
return dest;
}
now try:
char* s="HKEY_CURRENT_USER\\";
char *fullKey = strcat_with_alloc(s,(char*)lpSubKey);
if (!fullKey)
printf("error no memory");
else {
printf("%s",fullKey);
free(fullKey);
}

Resources