I am trying to use C's strtok function to process a char* and print it on a display, but for some reason the '\n' characters are not replaced by '\0', as I believe strtok should do. The code is as follows:
-Declaration of char* and pass to the function where it will be processed:
char *string_to_write = "Some text\nSome other text\nNewtext";
malloc(sizeof string_to_write);
screen_write(string_to_write,ALIGN_LEFT_TOP,I2C0);
-Processing of char* in function:
void screen_write(char *string_to_write,short alignment,short I2C)
{
char *stw;
stw = string_to_write;
char* text_to_send;
text_to_send=strtok(stw,"\n");
while(text_to_send != NULL)
{
write_text(text_to_send,I2C);
text_to_send=strtok(NULL, "\n");
}
}
When I run the code, the result can be seen on imgur (sorry, I am having problems adding the image to the post): the \n is not replaced, and shows up as the strange character in the image. The debugger still shows the character as well. Any hints as to where the problem could be?
Thanks for your help,
Javier
strtok expects to be able to mutate the string you pass it: instead of allocating new memory for each token, it puts \0 characters into the string at token boundaries, then returns a series of pointers into that string.
But in this case, your string is immutable: it's a string literal stored in your program, and modifying it is undefined behavior. On your platform the writes apparently just have no effect, so strtok is doing its best: it's returning a pointer into the string for each token's starting point, but it can't insert the \0s to mark the ends. Your device can't handle \ns in the way you'd expect, so it displays them with that error character instead. (Which is presumably why you're using this code in the first place.)
The key is to pass in only mutable strings. To define a mutable string with a literal value, you need char my_string[] = "..."; rather than char* my_string = "...". In the latter case, it just gives you a pointer to some constant memory; in the former case, it actually makes an array for you to use. Alternately, you can use strlen to find out how long the string is, malloc some memory for it, then strcpy it over.
P.S. I'm concerned by your malloc: you're not saving the memory it gives you anywhere, and you're not doing anything with it. Be sure you know what you're doing before working with dynamic memory allocation! C is not friendly about that, and it's easy to start leaking without realizing it.
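To make the array-versus-pointer point concrete, here is a minimal sketch. The helper name split_lines and its out/max parameters are hypothetical (they are not part of the original code); the only assumption is that the buffer passed in is writable.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits buf at '\n' in place and stores up to max
 * token pointers in out. buf must be writable, because strtok overwrites
 * each delimiter it finds with '\0'. Returns the number of tokens stored. */
size_t split_lines(char *buf, char **out, size_t max)
{
    size_t count = 0;
    char *token = strtok(buf, "\n");

    while (token != NULL && count < max) {
        out[count++] = token;
        token = strtok(NULL, "\n");
    }
    return count;
}
```

Called with a buffer declared as char text[] = "Some text\nSome other text\nNewtext"; (an array, so the characters are copied into writable storage), it stores three pointers. Passing the string literal itself would be undefined behavior.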
1. malloc(sizeof string_to_write); allocates sizeof(char *) bytes, not as many bytes as your string needs. You also do not assign the allocated block to anything.
2. Allocate enough memory for the string, copy it there, and pass the copy:
char *string_to_write = "Some text\nSome other text\nNewtext";
char *ptr;
ptr = malloc(strlen(string_to_write) + 1);
strcpy(ptr, string_to_write);
screen_write(ptr,ALIGN_LEFT_TOP,I2C0);
Related
I am trying to test something and I made a small test file to do so. The code is:
void main(){
int i = 0;
char array1 [3];
array1[0] = 'a';
array1[1] = 'b';
array1[2] = 'c';
printf("%s", array1[i+1]);
printf("%d", i);
}
I receive a segmentation error when I compile and try to run. Please let me know what my issue is.
Firstly, char array1[3]; is not null-terminated, as there is not enough space to put '\0' at the end of array1. To avoid this undefined behavior, increase the size of array1 and add the terminator.
Secondly, array1[i+1] is a single char, not a string, so use %c instead of %s:
printf("%c", array1[i+1]);
I suggest you get yourself a good book/video series on C. It's not a language that's fun to pick up out of the blue.
Regardless, your problem here is that you haven't formed a correct string. In C, a string is a pointer to the start of a contiguous region of memory that happens to be filled with characters. There is no data whatsoever stored about its size or any other characteristics. Only where it starts and what it is. Therefore you must provide information as to when the string ends explicitly. This is done by having the very last character in a string be set to the so-called null character (in C represented by the escape sequence '\0').
This implies that any string must be one character longer than the content you want it to hold. You should also never set up a string manually like this. Use a library function like strlcpy to do it (a BSD extension, so not available on every platform). It will automatically add in a null character, even if your array is too small (by truncating the string). Alternatively you can statically create a literal string like this:
char array[] = "abc";
It will automatically be null terminated and be of size 4.
Strings need to have a NUL terminator, and you don't have one, nor is there room for one.
The solution is to add one more character:
char array1[4];
// ...
array1[3] = 0;
Also you're asking to print a string but supplying a character instead. You need to supply the whole buffer:
printf("%s", array1);
Then you're fine.
Spend the time to learn about how C strings work, in particular about the requirement for the terminator, as buffer overflow bugs are no joke.
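As a small illustration of the terminator requirement, here is a sketch of building the string by hand with room for the NUL. The function name make_abc and its return convention are made up for this example.

```c
#include <stddef.h>

/* Hypothetical helper: writes "abc" plus the terminator into buf.
 * Returns 0 on success, -1 if buf has no room for the 4 bytes needed. */
int make_abc(char *buf, size_t size)
{
    if (size < 4)
        return -1;          /* 3 characters + 1 terminator */
    buf[0] = 'a';
    buf[1] = 'b';
    buf[2] = 'c';
    buf[3] = '\0';          /* without this, "%s" would read past the array */
    return 0;
}
```

With a char[3] the helper refuses to write; with a char[4] the result can safely be passed to printf("%s", ...).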
When printf sees a "%s" specifier in the format string, it expects a char * as the corresponding argument, but you passed the char value of the expression array1[i+1]. That char gets promoted to int, which is still incompatible with char *, and even if it were compatible, it has no chance of being a valid pointer to any meaningful character string.
I am working on a university assignment and I've been wracking my head around a weird problem where my program calls strtok and never returns.
My code looks like:
int loadMenuDataIn(GJCType* menu, char *data)
{
char *lineTokenPtr;
int i;
lineTokenPtr = strtok(data, "\n");
while (lineTokenPtr != NULL) {
/* ... */
}
}
I've looked at a bunch of sites on the web, but I can't see anything wrong with the way I am using strtok, and I can't determine why my code would get stuck on the line lineTokenPtr = strtok(data, "\n");
Can anyone help me shed some light on this?
(Using OSX and Xcode if it makes any difference)
Have you checked the contents of the argument? Is it \0-terminated?
Is the argument that you pass writable memory? strtok writes to the buffer that it gets as its first argument when it tokenizes the string.
In other words, if you write
char* mystring = "hello\n";
strtok(mystring,"\n"); // you get problems
The function strtok() replaces the actual token delimiting symbols in the character string with null (i.e., \0) chars, and returns a pointer to the start of the token in the string. So after repeated calls to strtok() with a newline delimiting symbol, a string buffer that looked like
"The fox\nran over\nthe hill\n"
in memory will be literally modified in-place and turned into
"The fox\0ran over\0the hill\0"
with char pointers returned from strtok() that point to the strings The fox\0, ran over\0, and the hill\0. No new memory is allocated: the original string memory is modified in place, which means it's important not to pass a string literal, which must never be modified (treat it as if it were const char*).
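The in-place modification described above can be checked directly. In this sketch the helper name count_tokens_in_place is hypothetical; it tokenizes a writable buffer so that the caller can then inspect the bytes.

```c
#include <string.h>

/* Hypothetical helper: tokenizes buf on '\n' and returns the token count.
 * After it returns, every '\n' that ended a token has become '\0'. */
int count_tokens_in_place(char *buf)
{
    int count = 0;

    for (char *tok = strtok(buf, "\n"); tok != NULL; tok = strtok(NULL, "\n"))
        count++;
    return count;
}
```

Comparing the buffer with memcmp afterwards (strcmp would stop at the first \0) shows the three embedded terminators where the newlines used to be.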
I'm wondering if there is another way of getting a sub string without allocating memory. To be more specific, I have a string as:
const char *str = "9|0\" 940 Hello";
Currently I'm getting the 940, which is the sub-string I want as,
char *a = strstr(str,"9|0\" ");
char *b = substr(a+5, 0, 3); // gives me the 940
Where substr is my sub string procedure. The thing is that I don't want to allocate memory for this by calling the sub string procedure.
Is there a much easier way?, perhaps by doing some string manipulation and not alloc mem.
I'll appreciate any feedback.
No, it can't be done. At least, not without modifying the original string and not without departing from the usual C concept of what a string is.
In C, a string is a sequence of characters terminated by a NUL (a \0 character). In order to obtain from "9|0\" 940 Hello" the substring "940", there would have to be a sequence of characters 9, 4, 0, \0 somewhere in memory. Since that sequence of characters does not exist anywhere in your original string, you would have to modify the original string.
The other option would just be to use a pointer into the original string at the place where your desired substring starts, and then also remember how long your substring is supposed to be in lieu of having the terminating \0 character. However, all C standard library functions that work on strings (and pretty much all third party C libraries that work with strings) expect strings to be NUL-terminated, and so won't accept this pointer-and-count format.
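The pointer-and-count option could look roughly like this. The struct str_view type and the view_after function are invented for illustration; nothing here is copied or allocated.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical pointer-and-count "view": refers to bytes inside another
 * string and is NOT NUL-terminated. */
struct str_view {
    const char *ptr;   /* start of the substring, or NULL if not found */
    size_t len;        /* number of characters in the view */
};

/* Returns a view of up to len characters that follow the first occurrence
 * of marker in str. The original string is left untouched. */
struct str_view view_after(const char *str, const char *marker, size_t len)
{
    struct str_view v = { NULL, 0 };
    const char *hit = strstr(str, marker);

    if (hit != NULL) {
        const char *start = hit + strlen(marker);
        size_t avail = strlen(start);

        v.ptr = start;
        v.len = len < avail ? len : avail;
    }
    return v;
}
```

A few standard functions do take an explicit length and can consume such a view, for example printf("%.*s", (int)v.len, v.ptr) or memcmp; anything that expects a NUL-terminated string cannot.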
Try this:
char *mysubstr(char *dst, const char *src, const char *substr, size_t maxdst) {
... do substr logic, but stick result in dst respecting maxdst ...
}
Basically, punt and let the caller allocate space on the stack via:
char s[100];
Or something.
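One way to flesh out the caller-supplied-buffer idea is sketched below. The name substr_into and its offset/length parameters are assumptions that differ from the mysubstr signature above; the point is only that the caller owns dst.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical variant of the suggestion above: copies up to len characters
 * of src, starting at offset start, into the caller's buffer dst of size
 * maxdst, always NUL-terminating. Returns dst, or NULL if maxdst is 0. */
char *substr_into(char *dst, size_t maxdst, const char *src,
                  size_t start, size_t len)
{
    size_t srclen = strlen(src);

    if (maxdst == 0)
        return NULL;
    if (start > srclen)
        start = srclen;               /* nothing to copy past the end */
    if (len > srclen - start)
        len = srclen - start;         /* clamp to what src holds */
    if (len > maxdst - 1)
        len = maxdst - 1;             /* leave room for the terminator */

    memcpy(dst, src + start, len);
    dst[len] = '\0';
    return dst;
}
```

With char s[100]; on the stack, substr_into(s, sizeof s, a, 5, 3) fills s without any heap allocation; the result is truncated if the buffer is too small.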
A C string is simply an array of chars in memory. If you want to access the substring without allocating a copy of the characters, you can simply access it directly:
char *b = &a[5];
The problem with this approach is that b will not be null-terminated at the appropriate length. It would essentially be a pointer to the string "940 Hello".
If that doesn't matter to the code that uses b, then you are good to go. Keep in mind, however, that this would probably surprise other programmers later on in the product lifetime (including yourself)!
As xyld, suggested, you could let the caller allocate the memory and pass your substr function a buffer to fill; though, strictly speaking, that still involves "allocating memory".
Without allocating any memory at all, the only way you'd be able to do this would be by modifying the original string by changing the character after the substring to a '\0', but of course then your function couldn't take a const char * anymore, and you're modifying the original string, which may not be desirable.
If you don't require a \0 terminated string you can make a substring finding function that just tells you where in the full string (haystack) your partial string (needle) is. This would be considered a hot-copy or alias as the data could be changed by changes to the full string (haystack).
I was writing up a long thing on how to allocate memory using alloca and implement a macro (because it wouldn't work as a function) that would do what you want, but just happened to run across strndupa which is like strndup except allocates the memory on the stack rather than from the heap. It's a GNU extension, so it might not be available for you.
Writing your own macro is awkward, because it has to look like a function that returns a value while also working on the caller's stack memory, but it is possible.
I will be coaching an ACM Team next month (go figure), and the time has come to talk about strings in C. Besides a discussion on the standard lib, strcpy, strcmp, etc., I would like to give them some hints (something like str[0] is equivalent to *str, and things like that).
Do you know of any lists (like cheat sheets) or your own experience in the matter?
I'm already aware of the books for the ACM competition (which are good, see particularly this), but I'm after tricks of the trade.
Thank you.
Edit: Thank you very much everybody. I will accept the most voted answer, and have duly upvoted others which I think are relevant. I expect to do a summary here (like I did here, asap). I have enough material now and I'm certain this has improved the session on strings immensely. Once again, thanks.
It's obvious but I think it's important to know that strings are nothing more than an array of bytes, delimited by a zero byte.
C strings aren't all that user-friendly as you probably know.
Writing a zero byte somewhere in the string will truncate it.
Going out of bounds generally ends badly.
Never, ever use strcpy, strcmp, strcat, etc.; instead use their bounded variants: strncmp, strncat, strndup, ...
Avoid strncpy. strncpy will not always zero-terminate your string! If the source string doesn't fit in the destination buffer, it truncates the string but doesn't write a nul byte at the end of the buffer. Also, even if the source string is a lot shorter than the destination, strncpy will still fill the whole rest of the buffer with zeroes. I personally use strlcpy.
Don't use printf(string), instead use printf("%s", string). Try thinking of the consequences if the user puts a %d in the string.
You can't compare strings with if( s1 == s2 )
doStuff(s1);
You have to compare every character in the string. Use strcmp or better strncmp.
if( strncmp( s1, s2, BUFFER_SIZE ) == 0 )
doStuff(s1);
Abusing strlen() will dramatically worsen the performance.
for( int i = 0; i < strlen( string ); i++ ) {
processChar( string[i] );
}
will have at least O(n^2) time complexity whereas
int length = strlen( string );
for( int i = 0; i < length; i++ ) {
processChar( string[i] );
}
will have at least O(n) time complexity. This is not so obvious for people who haven't taken time to think of it.
The following functions can be used to implement a non-mutating strtok:
strcspn(string, delimiters)
strspn(string, delimiters)
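strcspn returns the length of the initial segment that contains no delimiters, i.e. the index of the first character that is in the set of delimiters you pass in. strspn returns the length of the initial segment made up only of delimiters, i.e. the index of the first character not in the set.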
I prefer these to strpbrk as they return the length of the string if they can't match.
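As a rough sketch of how these two functions combine into a non-mutating strtok: the function name next_token and its cursor/len parameters are hypothetical, and tokens are reported as a start pointer plus a length because no terminator is written.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical non-mutating tokenizer. *cursor points at the unread part of
 * the string; on success the token's start is returned, its length is stored
 * in *len, and *cursor is advanced past it. Returns NULL when no tokens are
 * left. The input string is never written to. */
const char *next_token(const char **cursor, const char *delims, size_t *len)
{
    const char *start = *cursor + strspn(*cursor, delims); /* skip delimiters */

    if (*start == '\0') {
        *cursor = start;
        return NULL;
    }
    *len = strcspn(start, delims);   /* token runs up to the next delimiter */
    *cursor = start + *len;
    return start;
}
```

Because nothing is modified, this works on string literals and const data, and the cursor lives in the caller, so several tokenizations can run side by side.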
str[0] is equivalent to 0[str], or more generally str[i] is i[str] and i[str] is *(str + i).
NB: this is not specific to strings; it works for any C array.
The strn* variants in stdlib do not necessarily null terminate the destination string.
As an example: from MSDN's documentation on strncpy:
The strncpy function copies the initial count characters of strSource to strDest and returns strDest. If count is less than or equal to the length of strSource, a null character is not appended automatically to the copied string. If count is greater than the length of strSource, the destination string is padded with null characters up to length count.
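Given the behavior quoted above, one common workaround is to terminate the destination by hand. The wrapper below is only a sketch; the name copy_terminated and its return convention are assumptions.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: copies src into dst (capacity size) with strncpy and
 * then terminates explicitly, so dst is always a valid string.
 * Returns 1 if src was truncated, 0 otherwise. */
int copy_terminated(char *dst, size_t size, const char *src)
{
    if (size == 0)
        return 1;                     /* no room even for the terminator */
    strncpy(dst, src, size - 1);      /* may leave dst unterminated... */
    dst[size - 1] = '\0';             /* ...so terminate by hand */
    return strlen(src) >= size;
}
```

Copying "hello" into a 4-byte buffer yields "hel" and reports truncation; copying "hi" fits and the remaining bytes are zero-filled by strncpy's padding.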
Don't confuse strlen() with sizeof() when using a string:
char *p = "hello!!";
strlen(p) != sizeof(p)
sizeof(p) yields, at compile time, the size of the pointer (typically 4 or 8 bytes), whereas strlen(p) counts, at runtime, the length of the null-terminated char array (7 in this example).
strtok is not thread-safe, since it keeps hidden static state between calls; for the same reason, you cannot interleave or nest strtok calls.
A more useful alternative is strtok_r (POSIX); use it whenever you can.
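Here is a sketch of the nesting that strtok_r makes possible. It assumes a POSIX system; the helper name count_words is invented for the example.

```c
#define _POSIX_C_SOURCE 200809L  /* strtok_r is POSIX, not ISO C */
#include <string.h>

/* Hypothetical helper: counts the words on every line of buf. The outer loop
 * splits on '\n', the inner loop on ' '; each loop keeps its own save
 * pointer, which plain strtok cannot do. buf is modified in place. */
int count_words(char *buf)
{
    int words = 0;
    char *line_state, *word_state;

    for (char *line = strtok_r(buf, "\n", &line_state); line != NULL;
         line = strtok_r(NULL, "\n", &line_state)) {
        for (char *word = strtok_r(line, " ", &word_state); word != NULL;
             word = strtok_r(NULL, " ", &word_state))
            words++;
    }
    return words;
}
```

With plain strtok, the inner loop would overwrite the state of the outer one and the outer loop would stop after the first line.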
kmm has already a good list. Here are the things I had problems with when I started to code C.
String literals live in their own memory section and are always accessible. Hence they can, for example, be the return value of a function.
Memory management of strings, in particular with a high-level library (not libc): who is responsible for freeing a string that is returned by a function or passed to a function?
When should "const char *" be used and when "char *"? And what does it tell me if a function returns a "const char *"?
All these questions are not too difficult to learn, but hard to figure out if you don't get taught them.
I have found that the char buff[0] technique has been incredibly useful.
Consider:
struct foo {
int x;
char * payload;
};
vs
struct foo {
int x;
char payload[0];
};
see https://stackoverflow.com/questions/295027
See the link for implications and variations
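For reference, char payload[0] is a GNU extension; C99 standardized the same idea as a flexible array member, written char payload[]. A minimal sketch follows, in which the constructor name foo_new is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* C99 flexible array member: the standard spelling of the payload[0] idiom. */
struct foo {
    int x;
    char payload[];    /* storage follows the struct in the same allocation */
};

/* Hypothetical constructor: one malloc covers the header and a copy of text.
 * Returns NULL on allocation failure; release the result with free(). */
struct foo *foo_new(int x, const char *text)
{
    size_t n = strlen(text) + 1;              /* include the terminator */
    struct foo *f = malloc(sizeof *f + n);

    if (f != NULL) {
        f->x = x;
        memcpy(f->payload, text, n);
    }
    return f;
}
```

Compared with the char *payload version, there is a single allocation and a single free, and the string sits right next to the header in memory.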
I'd point out the performance pitfalls of over-reliance on the built-in string functions.
char* triple(char* source)
{
int n=strlen(source);
char* dest=malloc(n*3+1);
strcpy(dest,source);
strcat(dest,source);
strcat(dest,source);
return dest;
}
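The pitfall in triple() is that each strcat has to rescan dest to find its end. A possible rewrite, sketched here under the name triple_fast (not from the original answer), measures the length once and places each copy directly:

```c
#include <stdlib.h>
#include <string.h>

/* Sketch: same result as triple(), but the length is measured once and each
 * copy is placed with memcpy, so no call rescans dest for its terminator.
 * Returns NULL on allocation failure; the caller frees the result. */
char *triple_fast(const char *source)
{
    size_t n = strlen(source);
    char *dest = malloc(n * 3 + 1);

    if (dest == NULL)
        return NULL;
    memcpy(dest, source, n);
    memcpy(dest + n, source, n);
    memcpy(dest + 2 * n, source, n);
    dest[3 * n] = '\0';
    return dest;
}
```

For three copies the difference is small, but the same pattern matters when a string is built up by appending in a loop.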
I would discuss when and when not to use strcpy and strncpy and what can go wrong:
char *strncpy(char* destination, const char* source, size_t n);
char *strcpy(char* destination, const char* source );
I would also mention return values of the ansi C stdlib string functions. For example ask "does this if statement pass or fail?"
if (stricmp("StrInG 1", "string 1")==0)
{
.
.
.
}
Perhaps you could illustrate the value of the sentinel '\0' with the following example:
char* a = "hello \0 world";
char b[100];
strcpy(b,a);
printf("%s", b);
I once had my fingers burnt when in my zeal I used strcpy() to copy binary data. It worked most of the time but failed mysteriously sometimes. Mystery was revealed when I realized that binary input sometimes contained a zero byte and strcpy() would terminate there.
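For binary data, the usual alternative is a length-driven copy such as memcpy. The sketch below wraps it in a hypothetical helper, copy_bytes, that also checks the destination size.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies exactly len bytes, zero bytes included.
 * Returns the number of bytes copied, or 0 if dst is too small. */
size_t copy_bytes(void *dst, size_t dstsize, const void *src, size_t len)
{
    if (len > dstsize)
        return 0;
    memcpy(dst, src, len);   /* length-driven: does not stop at '\0' */
    return len;
}
```

The length has to travel alongside the data, since a zero byte no longer marks the end.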
You could mention indexed addressing.
An element's address is the base address + index * sizeof(element).
A common error is:
char *p;
snprintf(p, 3, "%d", 42);
It may appear to work, but p is uninitialized, so snprintf writes through a garbage pointer, and then funny things happen (welcome to the jungle).
Explanation
With char *p you are only allocating space for holding a pointer (sizeof(char *) bytes) on the stack, not for the characters. The right thing here is to allocate a buffer, for example an array whose size is known at compile time:
char buf[12];
char *p = buf;
snprintf(p, sizeof(buf), "%d", 42);
Pointers and arrays, while having the similar syntax, are not at all the same. Given:
char a[100];
char *p = a;
For the array a, there is no pointer stored anywhere. sizeof(a) != sizeof(p): for the array it is the size of the block of memory, for the pointer it is the size of the pointer. This becomes important if you use something like sizeof(a)/sizeof(a[0]). Also, you can't ++a, and you can make the pointer a 'const' pointer to 'const' chars, but the array can only be an array of 'const' chars, in which case you have to initialize it at its definition. Etc., etc.
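The sizeof difference can be demonstrated with a short sketch. Both the ARRAY_LEN macro and the function size_seen_by_callee are illustrative names, not standard ones.

```c
#include <stddef.h>

/* Element count of a real array. Only valid where a is still an array:
 * applied to a pointer it silently computes sizeof(pointer)/sizeof(element). */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Once an array is passed to a function it has decayed to a pointer, so
 * sizeof reports the pointer size, whatever the caller's array size was. */
size_t size_seen_by_callee(char *p)
{
    return sizeof p;
}
```

This is why functions that fill a buffer take the buffer size as a separate argument: the callee cannot recover it from the pointer.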
If possible, use strlcpy (instead of strncpy) and strlcat.
Even better, to make life a bit safer, you can use a macro such as:
#define strlcpy_sz(dst, src) (strlcpy(dst, src, sizeof(dst)))
char *strtok(char *s1, const char *s2)
repeated calls to this function break string s1 into "tokens"--that is
the string is broken into substrings,
each terminating with a '\0', where
the '\0' replaces any characters
contained in string s2. The first call
uses the string to be tokenized as s1;
subsequent calls use NULL as the first
argument. A pointer to the beginning
of the current token is returned; NULL
is returned if there are no more
tokens.
Hi,
I have been trying to use strtok just now and found out that if I pass in a char* into s1, I get a segmentation fault. If I pass in a char[], strtok works fine.
Why is this?
I googled around and the reason seems to be something about how char* is read only and char[] is writeable. A more thorough explanation would be much appreciated.
What did you initialize the char * to?
If something like
char *text = "foobar";
then you have a pointer to some read-only characters
For
char text[7] = "foobar";
then you have a seven element array of characters that you can do what you like with.
strtok writes into the string you give it - overwriting the separator character with null and keeping a pointer to the rest of the string.
Hence, if you pass it a read-only string, it will attempt to write to it, and you get a segfault.
Also, because strtok keeps a reference to the rest of the string, it's not reentrant: you can use it on only one string at a time. It's best avoided, really; consider strsep(3) instead (see, for example, http://www.rt.com/man/strsep.3.html), although that still writes into the string and so has the same read-only/segfault issue.
An important point that's inferred but not stated explicitly:
Based on your question, I'm guessing that you're fairly new to programming in C, so I'd like to explain a little more about your situation. Forgive me if I'm mistaken; C can be hard to learn mostly because of subtle misunderstanding in underlying mechanisms so I like to make things as plain as possible.
As you know, when you write out your C program the compiler pre-creates everything for you based on the syntax. When you declare a variable anywhere in your code, e.g.:
int x = 0;
The compiler reads this line of text and says to itself: OK, I need to replace all occurrences in the current code scope of x with a constant reference to a region of memory I've allocated to hold an integer.
When your program is run, this line leads to a new action: I need to set the region of memory that x references to int value 0.
Note the subtle difference here: the memory location that reference point x holds is constant (and cannot be changed). However, the value that x points can be changed. You do it in your code through assignment, e.g. x = 15;. Also note that the single line of code actually amounts to two separate commands to the compiler.
When you have a statement like:
char *name = "Tom";
The compiler's process is like this: OK, I need to replace all occurrences in the current code scope of name with a constant reference to a region of memory I've allocated to hold a char pointer value. And it does so.
But there's that second step, which amounts to this: I need to create a constant array of characters which holds the values 'T', 'o', 'm', and NUL ('\0'). Then I need to replace the part of the code which says "Tom" with the memory address of that constant string.
When your program is run, the final step occurs: setting the pointer to char's value (which isn't constant) to the memory address of that automatically created string (which is constant).
So a char * is not read-only. Only a const char * is read-only. But your problem in this case isn't that char *s are read-only, it's that your pointer references a read-only region of memory.
I bring all this up because understanding this issue is the barrier between you looking at the definition of that function from the library and understanding the issue yourself versus having to ask us. And I've somewhat simplified some of the details in the hopes of making the issue more understandable.
I hope this was helpful. ;)
I blame the C standard.
char *s = "abc";
could have been defined to give the same error as
const char *cs = "abc";
char *s = cs;
on grounds that string literals are unmodifiable. But it wasn't, it was defined to compile. Go figure. [Edit: Mike B has gone figured - "const" didn't exist at all in K&R C. ISO C, plus every version of C and C++ since, has wanted to be backward-compatible. So it has to be valid.]
If it had been defined to give an error, then you couldn't have got as far as the segfault, because strtok's first parameter is char*, so the compiler would have prevented you passing in the pointer generated from the literal.
It may be of interest that there was at one time a plan in C++ for this to be deprecated (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/1996/N0896.asc). But 12 years later I can't persuade either gcc or g++ to give me any kind of warning for assigning a literal to non-const char*, so it isn't all that loudly deprecated.
[Edit: aha: -Wwrite-strings, which isn't included in -Wall or -Wextra]
In brief:
char *s = "HAPPY DAY";
printf("\n %s ", s);
s = "NEW YEAR"; /* Valid */
printf("\n %s ", s);
s[0] = 'c'; /* Invalid */
If you look at your compiler documentation, odds are there is an option you can set to make those strings writable.