I am trying to figure out how to write a function that takes a char* and returns a pointer to the same string after a few constant characters have been appended to its end.
char* addExtension(char* FileName)
{
}
That's just what the standard library function strcat() (for "string concatenate") does, I think. You should look into using it.
Also beware of the dangers of buffer overrun: a function such as this (and strcat(), for that matter) is inherently unsafe, since it receives no information about the space available in the destination.
You really can't do that, not with arbitrary strings anyway.
Strings can be:
Constants (like the literal "hello world"). Those can't be modified
Char arrays (like char thingie[10]). They have a fixed amount of space - if you
run out of it, you get the very friendly Segmentation fault or worse.
malloc'd pointers - they have fixed amount of space, too, and need to be freed.
You can copy the string and return a new one, but that can result in memory leaks
if you don't take care of the old one. Example
char *copycat(char *first, char *second) {
char *result = malloc(strlen(first)+strlen(second)+1);
strcpy(result, first);
strcpy(result+strlen(first), second);
return result;
}
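For the original question, a minimal sketch along the same lines might look like this. The name add_extension and the ext parameter are my own choices, not something from the question, and the caller is assumed to free() the result.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated string holding file_name followed by ext,
 * or NULL if the allocation fails. The caller owns the result and must
 * free() it. The original file_name is left untouched. */
char *add_extension(const char *file_name, const char *ext)
{
    assert(file_name != NULL && ext != NULL);

    size_t name_len = strlen(file_name);
    size_t ext_len = strlen(ext);

    char *result = malloc(name_len + ext_len + 1); /* +1 for the '\0' */
    if (result == NULL)
        return NULL;

    memcpy(result, file_name, name_len);
    memcpy(result + name_len, ext, ext_len + 1);   /* copies the '\0' too */
    return result;
}
```

Because the input is never modified, this works for literals, arrays and malloc'd strings alike; the price is that every call allocates.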
I am trying to use C's strtok function to process a char* and print it on a display, but for some reason the character '\n' is not replaced by '\0' as I believe strtok should do. The code is as follows:
-Declaration of char* and pass to the function where it will be processed:
char *string_to_write = "Some text\nSome other text\nNewtext";
malloc(sizeof string_to_write);
screen_write(string_to_write,ALIGN_LEFT_TOP,I2C0);
-Processing of char* in function:
void screen_write(char *string_to_write,short alignment,short I2C)
{
char *stw;
stw = string_to_write;
char* text_to_send;
text_to_send=strtok(stw,"\n");
while(text_to_send != NULL)
{
write_text(text_to_send,I2C);
text_to_send=strtok(NULL, "\n");
}
}
When I run the code, the result can be seen in imgur (sorry, I am having problems adding the image to the post): the \n is not replaced, since it shows up as the strange character in the image, and the debugger still shows the character as well. Any hints as to where the problem might be?
Thanks for your help,
Javier
strtok expects to be able to mutate the string you pass it: instead of allocating new memory for each token, it puts \0 characters into the string at token boundaries, then returns a series of pointers into that string.
But in this case, your string is immutable: it's a string literal stored in your program, and trying to modify it is undefined behavior. On your platform the writes evidently have no effect, so strtok is doing its best: it's returning pointers into the string for each token's starting point, but it can't insert the \0s to mark the ends. Your device can't handle \ns in the way you'd expect, so it displays them with that error character instead. (Which is presumably why you're using this code in the first place.)
The key is to pass in only mutable strings. To define a mutable string with a literal value, you need char my_string[] = "..."; rather than char* my_string = "...". In the latter case, it just gives you a pointer to some constant memory; in the former case, it actually makes an array for you to use. Alternately, you can use strlen to find out how long the string is, malloc some memory for it, then strcpy it over.
P.S. I'm concerned by your malloc: you're not saving the memory it gives you anywhere, and you're not doing anything with it. Be sure you know what you're doing before working with dynamic memory allocation! C is not friendly about that, and it's easy to start leaking without realizing it.
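As a rough sketch of the "copy it first" approach: the helper below (count_tokens is a made-up name) tokenises a private heap copy, so it is safe to call with a literal. It only counts the tokens; in your case the loop body would call write_text() instead.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Counts the tokens in text without modifying it: strtok() works on a
 * private heap copy. Returns 0 if the copy cannot be allocated. */
size_t count_tokens(const char *text, const char *delims)
{
    assert(text != NULL && delims != NULL);

    size_t size = strlen(text) + 1;          /* +1 for the '\0' */
    char *copy = malloc(size);
    if (copy == NULL)
        return 0;
    memcpy(copy, text, size);

    size_t count = 0;
    for (char *tok = strtok(copy, delims); tok != NULL;
         tok = strtok(NULL, delims))
        count++;       /* a real caller would pass tok to write_text() here */

    free(copy);        /* tokens point into copy, so use them before this */
    return count;
}
```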
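1. malloc(sizeof string_to_write); allocates sizeof(char *) bytes, not as many bytes as your string needs. You also do not assign the allocated block to anything, so it is leaked.
2. Allocate strlen(string_to_write) + 1 bytes instead, copy the string into that block, and pass the copy: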
char *string_to_write = "Some text\nSome other text\nNewtext";
char *ptr;
ptr = malloc(strlen(string_to_write) + 1);
strcpy(ptr, string_to_write);
screen_write(ptr,ALIGN_LEFT_TOP,I2C0);
I have already written a couple of C programs and consider this awkward to ask. But why do I receive a segmentation fault for the following code being supposed to replace "test" by "aaaa"?
#include <stdio.h>
int main(int argc, char* argv[])
{
char* s = "test\0";
printf("old: %s \n", s);
int x = 0;
while(s[x] != 0)
{
s[x++] = 'a'; // segmentation fault here
}
printf("new: %s \n", s); // expecting aaaa
return 0;
}
This assignment is writing to a string literal, which is stored in a read-only section of your executable when it is loaded in memory.
Also, note that the \0 in the literal is redundant.
One way to fix this (as suggested in comments) without copying the string: declare your variable as an array:
char s[] = "test";
This will cause the function to allocate at least 5 bytes of space for the string on the stack, which is normally writeable memory.
Also, you should generally declare a pointer to a string literal as const char*. This will cause the compiler to complain if you try to write to it, which is good, since the system loader will often mark the memory it points to as read-only.
Answering the OP's question in the comment posted to @antron.
What you need is to allocate a character array, then use strcpy() to initialize it with your text, then overwrite with a-s.
Allocation can be done statically (e.g., char s[10];) but make sure the space is enough to store your init string (including the terminating \0).
Alternatively, you can dynamically allocate memory using malloc() and free it using free(). This enables you to allocate exactly enough space to hold your init string (figure it out in run-time using strlen()).
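A minimal sketch of the dynamic variant, assuming you are happy to get a fresh string back (filled_copy is a hypothetical name of mine):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a heap copy of init in which every character has been replaced
 * by fill, or NULL if allocation fails. The caller must free() it. */
char *filled_copy(const char *init, char fill)
{
    assert(init != NULL);

    char *s = malloc(strlen(init) + 1);   /* exactly enough, incl. '\0' */
    if (s == NULL)
        return NULL;

    strcpy(s, init);                      /* s is now a writable copy */
    for (size_t i = 0; s[i] != '\0'; i++)
        s[i] = fill;                      /* legal: s is not a literal */
    return s;
}
```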
In C I have a path in one of my strings
/home/frankv/
I now want to add the name of files that are contained in this folder - e.g. file1.txt file123.txt etc.
Having declared my variable either like this
char pathToFile[strlen("/home/frankv/")+1]
or
char *pathToFile = malloc(strlen("/home/frankv/")+1)
My problem is that I cannot simply add more characters because it would cause a buffer overflow. Also, what do I do in case I do not know how long the filenames will be?
I've really gotten used to PHP lazy $string1.$string2 .. What is the easiest way to do this in C?
If you've allocated a buffer with malloc(), you can use realloc() to expand it:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void) {
char *buf;
const char s1[] = "hello";
const char s2[] = ", world";
buf = malloc(sizeof s1);
strcpy(buf, s1);
buf = realloc(buf, sizeof s1 + sizeof s2 - 1);
strcat(buf, s2);
puts(buf);
return 0;
}
NOTE: I have omitted error checking. You shouldn't. Always check whether malloc() returns a null pointer; if it does, take some corrective action, even if it's just terminating the program. Likewise for realloc(). And if you want to be able to recover from a realloc() failure, store the result in a temporary so you don't clobber your original pointer.
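To make that last point concrete, here is one way the "store the result in a temporary" advice could look. append_str is a name I made up, and the sketch assumes suffix does not point into *buf (realloc() may move the block).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Appends suffix to the heap string *buf, growing it with realloc().
 * Returns 0 on success. On failure returns -1 and leaves *buf untouched,
 * so the caller can still use or free the original string. */
int append_str(char **buf, const char *suffix)
{
    assert(buf != NULL && *buf != NULL && suffix != NULL);

    size_t old_len = strlen(*buf);
    size_t add_len = strlen(suffix);

    char *tmp = realloc(*buf, old_len + add_len + 1);
    if (tmp == NULL)
        return -1;                      /* *buf is still valid here */

    memcpy(tmp + old_len, suffix, add_len + 1);  /* includes the '\0' */
    *buf = tmp;
    return 0;
}
```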
Use std::string, if possible. Else, reallocate another block of memory and use strcpy and strcat.
You have a couple options, but, if you want to do this dynamically using no additional libraries, realloc() is the stdlib function you're looking for:
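char *pathToFile = malloc(strlen("/home/frankv/")+1);
if (!pathToFile) abort();
strcpy(pathToFile, "/home/frankv/"); /* fill the buffer before calling strlen() on it */
const char *string_to_add = "filename.txt";
char *p = realloc(pathToFile, strlen(pathToFile) + strlen(string_to_add) + 1);
if (!p) abort();
pathToFile = p;
strcat(p, string_to_add);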
Note: you should always assign the result of realloc to a new pointer first, as realloc() returns NULL on failure. If you assign to the original pointer, you are begging for a memory leak.
If you're going to be doing much string manipulation, though, you may want to consider using a string library. Two I've found useful are bstring and ustr.
In case you can use C++, use std::string. In case you must use pure C, use what's called doubling, i.e. when you run out of space in the string, double the memory and copy the string into the new memory. And you'll have to use the second syntax:
char *pathToFile = malloc(strlen("/home/frankv/")+1);
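A minimal sketch of the doubling idea, under my own assumptions: the struct strbuf type, the strbuf_append name and the starting capacity of 16 are all invented for illustration, and overflow of the capacity computation is not checked.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct strbuf {
    char *data;   /* NUL-terminated once non-NULL */
    size_t len;   /* characters stored, excluding the '\0' */
    size_t cap;   /* bytes currently allocated */
};

/* Appends s, doubling the capacity until it fits.
 * Returns 0 on success, -1 if realloc() fails (buffer left unchanged). */
int strbuf_append(struct strbuf *sb, const char *s)
{
    assert(sb != NULL && s != NULL);

    size_t add = strlen(s);
    size_t need = sb->len + add + 1;

    if (need > sb->cap) {
        size_t new_cap = sb->cap ? sb->cap : 16;
        while (new_cap < need)
            new_cap *= 2;
        char *tmp = realloc(sb->data, new_cap); /* realloc(NULL, n) acts like malloc */
        if (tmp == NULL)
            return -1;
        sb->data = tmp;
        sb->cap = new_cap;
    }
    memcpy(sb->data + sb->len, s, add + 1);     /* includes the '\0' */
    sb->len += add;
    return 0;
}
```

Start from struct strbuf sb = { NULL, 0, 0 }; and free(sb.data) when done. Doubling means repeated appends trigger only a logarithmic number of reallocations.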
You have chosen the wrong language for manipulating strings!
The easy and conventional way out is to do something like:
#define MAX_PATH 260
char pathToFile[MAX_PATH+1] = "/home/frankv/";
strcat(pathToFile, "wibble/");
Of course, this is error prone - if the resulting string exceeds MAX_PATH characters, anything can happen, and it is this sort of programming which is the route many trojans and worms use to penetrate security (by corrupting memory in a carefully defined way). Hence my deliberate choice of 260 for MAX_PATH, which is what it used to be in Windows - you can still make Windows Explorer do strange things to your files with paths over 260 characters, possibly because of code like this!
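strncat may be a small help - you can at least limit how many characters it appends. Note, though, that the limit is the number of characters taken from the source, not the size of the destination, so you still have to work out the remaining space yourself.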
To do it robustly you need a string library which does variable length strings correctly. But I don't know if there is such a thing for C (C++ is a different matter, of course).
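For the fixed-buffer route, something like the following sketch shows how the strncat limit could be computed from the space actually left. bounded_append is a hypothetical helper, not a library function.

```c
#include <assert.h>
#include <string.h>

/* Appends as much of src as fits in dst, whose total size is dst_size
 * bytes. Returns 1 if all of src fit, 0 if it had to be truncated. */
int bounded_append(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL);

    size_t used = strlen(dst);
    assert(used < dst_size);            /* dst must already be terminated */

    size_t room = dst_size - used - 1;  /* keep one byte for the '\0' */
    strncat(dst, src, room);            /* strncat always writes a '\0' */
    return strlen(src) <= room;
}
```

With char path[16] = "/home/frankv/"; a call such as bounded_append(path, sizeof path, "wibble/") truncates instead of overflowing and reports it through the return value.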
I will be coaching an ACM Team next month (go figure), and the time has come to talk about strings in C. Besides a discussion on the standard lib, strcpy, strcmp, etc., I would like to give them some hints (something like str[0] is equivalent to *str, and things like that).
Do you know of any lists (like cheat sheets) or your own experience in the matter?
I'm already aware of the books for the ACM competition (which are good, see particularly this), but I'm after tricks of the trade.
Thank you.
Edit: Thank you very much everybody. I will accept the most voted answer, and have duly upvoted others which I think are relevant. I expect to do a summary here (like I did here, asap). I have enough material now and I'm certain this has improved the session on strings immensely. Once again, thanks.
It's obvious but I think it's important to know that strings are nothing more than an array of bytes, delimited by a zero byte.
C strings aren't all that user-friendly as you probably know.
Writing a zero byte somewhere in the string will truncate it.
Going out of bounds generally ends badly.
Never, ever use strcpy, strcmp, strcat, etc.; instead use their bounded variants: strncmp, strncat, strndup, ...
Avoid strncpy. strncpy will not always zero-terminate your string! If the source string doesn't fit in the destination buffer, it truncates the string but doesn't write a nul byte at the end of the buffer. Also, even if the source string is a lot shorter than the destination, strncpy will still fill the rest of the buffer with zeroes. I personally use strlcpy (a BSD function, not part of standard C).
Don't use printf(string), instead use printf("%s", string). Try thinking of the consequences if the user puts a %d in the string.
You can't compare strings with if( s1 == s2 )
doStuff(s1);
You have to compare every character in the string. Use strcmp or better strncmp.
if( strncmp( s1, s2, BUFFER_SIZE ) == 0 )
doStuff(s1);
Abusing strlen() will dramatically worsen the performance.
for( int i = 0; i < strlen( string ); i++ ) {
processChar( string[i] );
}
will have at least O(n^2) time complexity, because strlen() rescans the whole string on every iteration, whereas
int length = strlen( string );
for( int i = 0; i < length; i++ ) {
processChar( string[i] );
}
will have at least O(n) time complexity. This is not so obvious for people who haven't taken time to think of it.
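Another option, sketched here with a made-up count_char helper, is to skip strlen() altogether and walk the string until the terminating zero byte:

```c
#include <assert.h>
#include <stddef.h>

/* Counts occurrences of c in s in a single pass, using the '\0'
 * terminator as the loop condition instead of a precomputed length. */
size_t count_char(const char *s, char c)
{
    assert(s != NULL);

    size_t n = 0;
    for (; *s != '\0'; s++) {
        if (*s == c)
            n++;
    }
    return n;
}
```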
The following functions can be used to implement a non-mutating strtok:
strcspn(string, delimiters)
strspn(string, delimiters)
The first one returns the index of the first character that is in the set of delimiters you pass in. The second one returns the index of the first character that is not in that set.
I prefer these to strpbrk as they return the length of the string if they can't match.
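A sketch of how the two could be combined into a tokenizer that never writes to the string. next_token and its cursor-style interface are my own invention; tokens come back as a pointer plus a length, since no '\0' is inserted.

```c
#include <assert.h>
#include <string.h>

/* Finds the next token at or after *cursor. On success returns a pointer
 * to the token's first character, stores its length in *len and advances
 * *cursor past the token. Returns NULL when no tokens remain. */
const char *next_token(const char **cursor, const char *delims, size_t *len)
{
    assert(cursor != NULL && *cursor != NULL);
    assert(delims != NULL && len != NULL);

    /* strspn: length of the leading run of delimiter characters */
    const char *start = *cursor + strspn(*cursor, delims);
    if (*start == '\0') {
        *cursor = start;
        return NULL;
    }
    /* strcspn: length of the leading run of non-delimiter characters */
    *len = strcspn(start, delims);
    *cursor = start + *len;
    return start;
}
```

Printing a token then needs the length, e.g. printf("%.*s\n", (int)len, tok).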
str[0] is equivalent to 0[str], or more generally str[i] is i[str] and i[str] is *(str + i).
NB: this is not specific to strings; it works for any C array.
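Not every strn* function in the standard library null-terminates the destination string: strncpy in particular may leave it unterminated (strncat, by contrast, always appends a terminator).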
As an example: from MSDN's documentation on strncpy:
The strncpy function copies the
initial count characters of strSource
to strDest and returns strDest. If
count is less than or equal to the
length of strSource, a null character
is not appended automatically to the
copied string. If count is greater
than the length of strSource, the
destination string is padded with null
characters up to length count.
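If you do end up with strncpy, a small wrapper along these lines (copy_terminated is a hypothetical name) is one way to get a guaranteed terminator:

```c
#include <assert.h>
#include <string.h>

/* Copies src into dst (dst_size bytes in total), truncating if needed,
 * and always NUL-terminates the result.
 * Returns 1 if src fit completely, 0 if it was truncated. */
int copy_terminated(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);

    strncpy(dst, src, dst_size - 1);  /* may leave dst unterminated... */
    dst[dst_size - 1] = '\0';         /* ...so terminate it explicitly */
    return strlen(src) < dst_size;
}
```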
Don't confuse strlen() with sizeof() when using a string:
char *p = "hello!!";
strlen(p) != sizeof(p)
sizeof(p) yields, at compile time, the size of the pointer (typically 4 or 8 bytes), whereas strlen(p) counts, at runtime, the length of the null-terminated char array (7 in this example).
strtok is not thread safe, since it keeps hidden static state between calls; for the same reason you cannot interleave or nest strtok calls.
A more useful alternative is strtok_r (POSIX); use it whenever you can.
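A small sketch of the nesting case, assuming a POSIX system where strtok_r is available (it is not part of ISO C). count_words is a made-up example function; each loop keeps its own saved position, which plain strtok cannot do.

```c
#define _POSIX_C_SOURCE 200809L  /* for strtok_r; must precede the includes */
#include <assert.h>
#include <string.h>

/* Counts the words in text, where lines are separated by '\n' and words
 * by spaces. text is modified in place, exactly as strtok() would. */
size_t count_words(char *text)
{
    assert(text != NULL);

    size_t words = 0;
    char *line_state, *word_state;   /* one saved position per tokenizer */

    for (char *line = strtok_r(text, "\n", &line_state); line != NULL;
         line = strtok_r(NULL, "\n", &line_state)) {
        for (char *word = strtok_r(line, " ", &word_state); word != NULL;
             word = strtok_r(NULL, " ", &word_state))
            words++;
    }
    return words;
}
```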
kmm has already a good list. Here are the things I had problems with when I started to code C.
String literals have static storage duration, so they remain valid for the whole run of the program. Hence they can, for example, be returned from a function.
Memory management of strings, in particular with a high-level library (not libc). Who is responsible for freeing the string if it is returned by a function or passed to a function?
When should "const char *" be used and when "char *"? And what does it tell me if a function returns a "const char *"?
All these questions are not too difficult to learn, but hard to figure out if you don't get taught them.
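I have found that the char buff[0] technique (a zero-length array as the last member of a struct; a compiler extension, spelled char buff[] as a "flexible array member" in standard C99) has been incredibly useful.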
Consider:
struct foo {
int x;
char * payload;
};
vs
struct foo {
int x;
char payload[0];
};
see https://stackoverflow.com/questions/295027
See the link for implications and variations
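For illustration, here is a sketch using the C99 flexible array member spelling. The struct message type and message_new function are invented for this example; the point is that header and payload live in one allocation and are released with one free().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct message {
    size_t len;       /* length of payload, excluding the '\0' */
    char payload[];   /* C99 flexible array member; must be last */
};

/* Allocates a message with room for a copy of text in the same block.
 * Returns NULL if allocation fails; release with a single free(). */
struct message *message_new(const char *text)
{
    assert(text != NULL);

    size_t len = strlen(text);
    struct message *m = malloc(sizeof *m + len + 1);
    if (m == NULL)
        return NULL;

    m->len = len;
    memcpy(m->payload, text, len + 1);  /* includes the '\0' */
    return m;
}
```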
I'd point out the performance pitfalls of over-reliance on the built-in string functions.
char* triple(char* source)
{
int n=strlen(source);
char* dest=malloc(n*3+1);
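strcpy(dest,source);
strcat(dest,source); /* rescans dest from the start to find its end */
strcat(dest,source); /* and again, over an even longer dest */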
return dest;
}
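One possible way around the repeated scanning, sketched under the assumption that the length is measured once and the copies are placed by offset (triple_fast is my own name for it):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a new string holding source three times in a row, or NULL if
 * allocation fails. The length is computed once, so nothing is rescanned. */
char *triple_fast(const char *source)
{
    assert(source != NULL);

    size_t n = strlen(source);
    char *dest = malloc(n * 3 + 1);
    if (dest == NULL)
        return NULL;

    memcpy(dest, source, n);
    memcpy(dest + n, source, n);
    memcpy(dest + 2 * n, source, n + 1);  /* last copy includes the '\0' */
    return dest;
}
```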
I would discuss when and when not to use strcpy and strncpy and what can go wrong:
char *strncpy(char* destination, const char* source, size_t n);
char *strcpy(char* destination, const char* source );
I would also mention return values of the ansi C stdlib string functions. For example ask "does this if statement pass or fail?"
if (stricmp("StrInG 1", "string 1")==0)
{
.
.
.
}
perhaps you could illustrate the value of sentinel '\0' with following example
char* a = "hello \0 world";
char b[100];
strcpy(b,a);
printf(b);
I once had my fingers burnt when in my zeal I used strcpy() to copy binary data. It worked most of the time but failed mysteriously sometimes. Mystery was revealed when I realized that binary input sometimes contained a zero byte and strcpy() would terminate there.
You could mention indexed addressing.
An element's address is the base address + index * sizeof(element)
A common error is:
char *p;
snprintf(p, 3, "%d", 42);
This may appear to work, but p is uninitialized, so snprintf writes through a garbage pointer: that is undefined behavior, and funny things happen (welcome to the jungle).
Explanation
With char *p you only reserve space for the pointer itself (sizeof(char*) bytes) on the stack, not for any characters it should point to. The right thing here is to allocate a buffer, point p at it, and pass the buffer's size:
char buf[12];
char *p = buf;
snprintf(p, sizeof(buf), "%d", 42);
Pointers and arrays, while having similar syntax, are not at all the same. Given:
char a[100];
char *p = a;
For the array, a, there is no pointer stored anywhere. sizeof(a) != sizeof(p): for the array it is the size of the block of memory, for the pointer it is the size of the pointer. This becomes important if you use something like: sizeof(a)/sizeof(a[0]). Also, you can't ++a, and you can make the pointer a 'const' pointer to 'const' chars, but the array can only be 'const' chars, in which case you have to initialize it at its definition. etc etc etc
If possible, use strlcpy (instead of strncpy) and strlcat.
Even better, to make life a bit safer, you can use a macro such as the following (dst must be a true array, not a pointer, for sizeof to give the buffer size):
#define strlcpy_sz(dst, src) (strlcpy(dst, src, sizeof(dst)))
I have an old program that uses some library functions, and I don't have that library.
So I am rewriting the program using C++ libraries.
In the old code there is a function which is called like this:
*string = newstrdup("Some string goes here");
The string variable is declared as char **string;
What might the function named "newstrdup" be doing?
I have tried many things but I can't work out what it does. Can anyone help?
The function is used to make a copy of C strings. That's often needed to get a writable version of a string literal. String literals are themselves not writable, so such a function copies them into an allocated writable buffer. You can then pass the copies to functions that modify their argument, like strtok, which writes into the string it has to tokenize.
I think you can come up with something like this, since it is called newstrdup:
char * newstrdup(char const* str) {
char *c = new char[std::strlen(str) + 1];
std::strcpy(c, str);
return c;
}
You are then supposed to free it, once you are done with the string, using
delete[] *string;
An alternative way of writing it is using malloc. If the library is old, it may have used that, which C++ inherited from C:
char * newstrdup(char const* str) {
char *c = (char*) malloc(std::strlen(str) + 1);
if(c != NULL) {
std::strcpy(c, str);
}
return c;
}
Now, you are supposed to free the string using free when done:
free(*string);
Prefer the first version if you are writing with C++. But if the existing code uses free to deallocate the memory again, use the second version. Beware that the second version returns NULL if no memory is available for dup'ing the string, while the first throws an exception in that case. Another note should be taken about behavior when you pass a NULL argument to your newstrdup. Depending on your library that may be allowed or may be not allowed. So insert appropriate checks into the above functions if necessary. There is a function called strdup available in POSIX systems, but that one allows neither NULL arguments nor does it use the C++ operator new to allocate memory.
Anyway, i've looked with google codesearch for newstrdup functions and found quite a few. Maybe your library is among the results:
Google CodeSearch, newstrdup
There has to be a reason that they wrote a "new" version of strdup, so there must be a corner case that it handles differently. Perhaps a null string returns an empty string.
litb's answer is a replacement for strdup, but I would think there is a reason they did what they did.
If you want to use strdup directly, use a define to rename it, rather than write new code.
The line *string = newstrdup("Some string goes here"); is not showing any weirdness to newstrdup. If string has type char ** then newstrdup is just returning char * as expected. Presumably string was already set to point to a variable of type char * in which the result is to be placed. Otherwise the code is writing through an uninitialized pointer..
newstrdup is probably making a new string that is a duplicate of the passed string; it returns a pointer to the string (which is itself a pointer to the characters).
It looks like he's written a strdup() function to operate on an existing pointer, probably to re-allocate it to a new size and then fill its contents. Likely, he's doing this to re-use the same pointer in a loop where *string is going to change frequently while preventing a leak on every subsequent call to strdup().
I'd probably implement that like string = redup(&string, "new contents") .. but that's just me.
Edit:
Here's a snip of my 'redup' function which might be doing something similar to what you posted, just in a different way:
int redup(char **s1, const char *s2)
{
size_t len, size;
if (s2 == NULL)
return -1;
len = strlen(s2);
size = len + 1;
*s1 = realloc(*s1, size);
if (*s1 == NULL)
return -1;
memset(*s1, 0, size);
memcpy(*s1, s2, len);
return len;
}
Of course, I should probably save a copy of *s1 and restore it if realloc() fails, but I didn't need to get that paranoid.
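For completeness, the "paranoid" variant could be sketched roughly as follows. redup_safe is a hypothetical name; unlike the snippet above it returns 0 on success rather than the length, and it assumes s2 does not point into *s1.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Replaces the heap string *s1 with a copy of s2. *s1 may be NULL on the
 * first call. Returns 0 on success; on failure returns -1 and leaves *s1
 * pointing at its old, still valid contents. */
int redup_safe(char **s1, const char *s2)
{
    assert(s1 != NULL);
    if (s2 == NULL)
        return -1;

    size_t size = strlen(s2) + 1;
    char *tmp = realloc(*s1, size);
    if (tmp == NULL)
        return -1;              /* old pointer in *s1 is not lost */

    memcpy(tmp, s2, size);      /* includes the '\0' */
    *s1 = tmp;
    return 0;
}
```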
I think you need to look at what is happening with the "string" variable within the code as the prototype for the newstrdup() function would appear to be identical to the library strdup() version.
Are there any free(*string) calls in the code?
It would appear to be a strange thing to do, unless it's internally keeping a copy of the duplicated string and returning a pointer back to the same string again.
Again, I would ask why?