I'm wondering if there is another way of getting a sub string without allocating memory. To be more specific, I have a string as:
const char *str = "9|0\" 940 Hello";
Currently I'm getting the 940, which is the sub-string I want as,
char *a = strstr(str,"9|0\" ");
char *b = substr(a+5, 0, 3); // gives me the 940
Where substr is my sub string procedure. The thing is that I don't want to allocate memory for this by calling the sub string procedure.
Is there an easier way, perhaps by doing some string manipulation without allocating memory?
I'll appreciate any feedback.
No, it can't be done. At least, not without modifying the original string and not without departing from the usual C concept of what a string is.
In C, a string is a sequence of characters terminated by a NUL (a \0 character). In order to obtain from "9|0\" 940 Hello" the substring "940", there would have to be a sequence of characters 9, 4, 0, \0 somewhere in memory. Since that sequence of characters does not exist anywhere in your original string, you would have to modify the original string.
The other option would just be to use a pointer into the original string at the place where your desired substring starts, and then also remember how long your substring is supposed to be in lieu of having the terminating \0 character. However, most C standard library functions that work on strings (and pretty much all third party C libraries that work with strings) expect strings to be NUL-terminated, and so won't accept this pointer-and-count format.
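A minimal sketch of the pointer-and-count idea described above. The struct strview type and the view_after and print_view helpers are hypothetical names made up for illustration. The sketch relies on the fact that printf's "%.*s" conversion takes an explicit length, so printing at least does not need a terminating NUL.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical "view": a pointer into someone else's buffer plus a length. */
struct strview {
    const char *ptr;
    size_t len;
};

/* Return a view of `len` characters that start right after the first
 * occurrence of `marker` in `s`. Returns {NULL, 0} if the marker is not
 * found or fewer than `len` characters follow it. Nothing is copied. */
struct strview view_after(const char *s, const char *marker, size_t len)
{
    struct strview v = { NULL, 0 };
    const char *hit = strstr(s, marker);
    if (hit != NULL) {
        const char *start = hit + strlen(marker);
        if (strlen(start) >= len) {
            v.ptr = start;
            v.len = len;
        }
    }
    return v;
}

/* Print a view; "%.*s" prints at most (int)v.len characters. */
void print_view(struct strview v)
{
    if (v.ptr != NULL)
        printf("%.*s\n", (int)v.len, v.ptr);
}
```

Calling view_after(str, "9|0\" ", 3) on the string from the question yields a view whose three characters are 940, without touching or copying the original.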
Try this:
char *mysubstr(char *dst, const char *src, const char *substr, size_t maxdst) {
... do substr logic, but stick result in dst respecting maxdst ...
}
Basically, punt and let the caller allocate space on the stack via:
char s[100];
Or something.
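One way the caller-supplied buffer approach might look, as a sketch only. The name substr_into and its parameter order are assumptions and differ from the mysubstr outline above; the elided substr logic there is not reproduced.

```c
#include <stddef.h>
#include <string.h>

/* Copy up to `len` characters of `src`, starting at offset `start`, into
 * the caller's buffer `dst`. `dstsize` is the full size of `dst`,
 * including room for the terminator. The result is always NUL-terminated
 * and silently truncated if `dst` is too small.
 * Returns dst, or NULL if dst cannot hold even the terminator. */
char *substr_into(char *dst, size_t dstsize,
                  const char *src, size_t start, size_t len)
{
    size_t srclen = strlen(src);

    if (dst == NULL || dstsize == 0)
        return NULL;
    if (start > srclen)
        start = srclen;          /* nothing to copy past the end of src */
    if (len > srclen - start)
        len = srclen - start;    /* clamp to what src actually holds */
    if (len > dstsize - 1)
        len = dstsize - 1;       /* clamp to what dst can hold */

    memcpy(dst, src + start, len);
    dst[len] = '\0';
    return dst;
}
```

With char s[100]; on the stack, substr_into(s, sizeof s, a + 5, 0, 3) fills s with 940 and no heap allocation takes place.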
A C string is simply an array of chars in memory. If you want to access the substring without allocating a copy of the characters, you can simply access it directly:
char *b = a + 5;
The problem with this approach is that b will not be null-terminated to the appropriate length. It would essentially be a pointer to the string: "940 Hello".
If that doesn't matter to the code that uses b, then you are good to go. Keep in mind, however, that this would probably surprise other programmers later on in the product lifetime (including yourself)!
As xyld suggested, you could let the caller allocate the memory and pass your substr function a buffer to fill; though, strictly speaking, that still involves "allocating memory".
Without allocating any memory at all, the only way you'd be able to do this would be by modifying the original string by changing the character after the substring to a '\0', but of course then your function couldn't take a const char * anymore, and you're modifying the original string, which may not be desirable.
If you don't require a \0 terminated string you can make a substring finding function that just tells you where in the full string (haystack) your partial string (needle) is. This would be considered a hot-copy or alias as the data could be changed by changes to the full string (haystack).
I was writing up a long thing on how to allocate memory using alloca and implement a macro (because it wouldn't work as a function) that would do what you want, but just happened to run across strndupa which is like strndup except allocates the memory on the stack rather than from the heap. It's a GNU extension, so it might not be available for you.
Writing your own macro is awkward, because it has to look like a function (it needs to return a value) while still working on memory in the caller's stack frame, but it is possible.
Related
The code is as follows:
char seg1[] = "abcdefgh";
char seg2[] = "ijklmnop";
char seg3[] = "qrstuvwx";
strcat(seg2, seg3);
Then the value stored in seg1 will become:
"rstuvwx\0\0"
I have learned that strings declared next to each other are also adjacent in the stack area, but I forgot the details.
I guess the memory of seg1 was overwritten when strcat() was executed, but I'm not sure about the specific process. Can someone tell me exactly what happens here? Thanks
C does not have a string class, it has character arrays which may be used as strings by appending a null terminator. And since there is no string class, all memory management of strings/arrays must be done manually.
char seg1[] = "abcdefgh"; Allocates space for exactly 8 characters and 1 null terminator. There is no room to append anything else at the end. If you try anyway, that's the realm of undefined behavior, where anything can happen. Crashes, overwriting other variables, program ceasing to function as expected and so on.
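Solve this by allocating enough space in the destination array to append something at the end (seg2 is the destination of the strcat call, so that is the array that needs the room), for example
char seg2[50] = "ijklmnop";
Alternatively, allocate a new, third array and copy the strings into that one.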
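A small sketch of how the size check could be made explicit instead of trusting the caller. The helper name append_checked is hypothetical; it refuses to append when the destination array is too small rather than writing past its end.

```c
#include <stddef.h>
#include <string.h>

/* Append src to dst only if the result, including the terminator, fits
 * in dstsize bytes. Returns 1 on success, or 0 if dst is too small, in
 * which case dst is left unchanged. */
int append_checked(char *dst, size_t dstsize, const char *src)
{
    size_t used = strlen(dst);
    size_t need = strlen(src);

    if (used >= dstsize || need > dstsize - used - 1)
        return 0;
    memcpy(dst + used, src, need + 1);   /* +1 copies the terminator too */
    return 1;
}
```

With char seg2[50] = "ijklmnop"; the call append_checked(seg2, sizeof seg2, seg3) succeeds, whereas with char seg2[] = "ijklmnop"; (9 bytes, already full) it returns 0 and no neighbouring variable is overwritten.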
I am trying to use C's strtok function to process a char* and print it on a display, but for some reason the character '\n' is not replaced by '\0', as I believe strtok should do. The code is as follows:
-Declaration of char* and pass to the function where it will be processed:
char *string_to_write = "Some text\nSome other text\nNewtext";
malloc(sizeof string_to_write);
screen_write(string_to_write,ALIGN_LEFT_TOP,I2C0);
-Processing of char* in function:
void screen_write(char *string_to_write,short alignment,short I2C)
{
    char *stw;
    stw = string_to_write;
    char* text_to_send;
    text_to_send=strtok(stw,"\n");
    while(text_to_send != NULL)
    {
        write_text(text_to_send,I2C);
        text_to_send=strtok(NULL, "\n");
    }
}
When running the code, the result can be seen in imgur (sorry, I am having problems adding the image to the post): the \n is not substituted, as it is the strange character appearing in the image, and the debugger still showed the character as well. Any hints as to where the problem might be?
Thanks for your help,
Javier
strtok expects to be able to mutate the string you pass it: instead of allocating new memory for each token, it puts \0 characters into the string at token boundaries, then returns a series of pointers into that string.
But in this case, your string is immutable: it's a string literal stored as a constant in your program, and it can't be changed. Modifying a string literal is undefined behavior in C; on your platform the writes apparently just have no effect. So strtok is doing its best: it's returning pointers into the string for each token's starting point, but it can't insert the \0s to mark the ends. Your device can't handle \ns in the way you'd expect, so it displays them with that error character instead. (Which is presumably why you're using this code in the first place.)
The key is to pass in only mutable strings. To define a mutable string with a literal value, you need char my_string[] = "..."; rather than char* my_string = "...". In the latter case, it just gives you a pointer to some constant memory; in the former case, it actually makes an array for you to use. Alternately, you can use strlen to find out how long the string is, malloc some memory for it, then strcpy it over.
P.S. I'm concerned by your malloc: you're not saving the memory it gives you anywhere, and you're not doing anything with it. Be sure you know what you're doing before working with dynamic memory allocation! C is not friendly about that, and it's easy to start leaking without realizing it.
1. malloc(sizeof string_to_write); allocates sizeof(char *) bytes, not as many bytes as your string needs. You also do not assign the allocated block to anything.
2. Allocate enough space for the whole string, copy it, and pass the copy:
char *string_to_write = "Some text\nSome other text\nNewtext";
char *ptr;
ptr = malloc(strlen(string_to_write) + 1);
strcpy(ptr, string_to_write);
screen_write(ptr,ALIGN_LEFT_TOP,I2C0);
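To illustrate that strtok works as expected once it is given writable memory, here is a self-contained sketch. The helper split_lines is a hypothetical stand-in for the loop in screen_write; it collects the token pointers instead of sending them to a display.

```c
#include <stddef.h>
#include <string.h>

/* Split a writable buffer on '\n' in place and store up to `max` token
 * pointers in `out`. strtok overwrites each delimiter it consumes with
 * '\0', so `buf` must not be a string literal.
 * Returns the number of tokens stored. */
size_t split_lines(char *buf, char *out[], size_t max)
{
    size_t n = 0;
    char *tok = strtok(buf, "\n");

    while (tok != NULL && n < max) {
        out[n++] = tok;
        tok = strtok(NULL, "\n");
    }
    return n;
}
```

Declaring the text as an array, char text[] = "Some text\nSome other text\nNewtext";, gives split_lines a private, modifiable copy of the literal, and it then produces three separate NUL-terminated lines.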
Without using the string.h functions (want to use only the std libs), I wanted to create a new string by concatenating the string provided as an argument to the program. For that, I decided to copy the argument to a new char array of larger size and then replace the end of the string by the characters I want to append.
unsigned int argsize=sizeof(argv[1]);
unsigned char *newstr=calloc(argsize+5,1);
newstr=argv[1]; //copied arg string to new string of larger size
newstr[argsize+4]=oname[ns]; //copied the end-of-string null character
newstr[argsize]='.'; //this line gives seg fault
newstr[argsize+1]='X'; //this executes without any error
I believe there must be another more secure way of concatenating string without using string functions or by copying and appending char by char into a new char array. I would really want to know such methods. Also, I'm curious to know what is the reason of this segfault.
Read here: https://stackoverflow.com/a/164258/1176315 and I guess, the compiler is making my null character memory block read only but that's only a guess. I want to know the real reason behind this.
I will appreciate all your efforts to answer the question. Thanks.
Edit: By using std libs only, I mean to say I don't want to use the strcpy(), strlen(), strcat() etc. functions.
Without using the string.h functions (want to use only the std libs)
string.h is part of the standard library.
unsigned int argsize=sizeof(argv[1]);
This is wrong. sizeof does not tell you the length of a C string, it just tells you how big the type of its argument is. argv[1] is a pointer, and sizeof will just tell you how big a pointer is on your platform (typically 4 or 8), regardless of the actual content of the string.
If you want to know how long a C string is, you have to examine its characters and count until you find a 0 character (which incidentally is what strlen does).
newstr=argv[1]; //copied arg string to new string of larger size
Nope. You just copied the pointer stored in argv[1] to the variable newstr, incidentally losing the pointer that calloc returned to you previously, so you have also a memory leak.
To copy a string from a buffer to another you have to copy its characters one by one until you find a 0 character (which incidentally is what strcpy does).
All the following lines are thus operating on argv[1], so if you are going out of its original bounds anything can happen.
I believe there must be another more secure way of concatenating string without using string functions or by copying and appending char by char into a new char array.
C strings are just arrays of characters, everything boils down to copying/reading them one at a time. If you don't want to use the provided string functions you'll end up essentially reimplementing them yourself. Mind you, it's a useful exercise, but you have to understand a bit better what C strings are and how pointers work.
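As a sketch of what "reimplementing them yourself" could look like without string.h: the helpers my_len and concat_new below are hypothetical names, and the code only shows the counting and copying loops described above, not a definitive implementation.

```c
#include <stdlib.h>

/* Count characters up to, but not including, the terminating 0
 * (this is what strlen does). */
size_t my_len(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* Return a newly allocated string holding a followed by b, or NULL if
 * the allocation fails. The caller must free the result. */
char *concat_new(const char *a, const char *b)
{
    size_t la = my_len(a);
    size_t lb = my_len(b);
    size_t i;
    char *out = malloc(la + lb + 1);     /* +1 for the terminator */

    if (out == NULL)
        return NULL;
    for (i = 0; i < la; i++)             /* copy a, character by character */
        out[i] = a[i];
    for (i = 0; i < lb; i++)             /* then b, right after it */
        out[la + i] = b[i];
    out[la + lb] = '\0';
    return out;
}
```

In the question's setting this would be called as concat_new(argv[1], ".X") or similar, leaving argv[1] itself untouched.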
First of all, sizeof(argv[1]) will not return the length of the string. You need to count the number of characters in the string, either with a loop or with the standard library function strlen(). Second, if you want to copy the string you need to use the strcpy() function.
You are supposed to do it like this:
size_t argsize=strlen(argv[1]); //you can also count the characters yourself
char *newstr=calloc(argsize+5,1); //plain char, so it can be passed to strcpy
strcpy(newstr,argv[1]);
newstr[argsize+4]=oname[ns];
newstr[argsize]='.';
newstr[argsize+1]='X';
As far as I understand, strncat enlarges the size of the array you want to append to.
for example:
char str1[] = "This is str1";
char str2[] = "This is str2";
and here the length of str1 is 12 and str2 is also 12, but when I strncat them, str1 changes from 12 to 24.
I was asked to write strncat on my own, but I can't figure out how to enlarge the size of an array, taking into account that we haven't learned pointers yet.
I tried just putting every char in the end of the array while moving the distance by 1 each iteration, but as you would have thought, it doesn't put the data in the array because there is no such position like this in the array (str[20] when str's length is 10 for example).
Thanks in advance,
Any help would be appreciated.
strlen returns the length of the string, that is, counts until the first null character. It does NOT return the size of the memory allocated for str1!
When you concatenate str2 to str1, you write beyond the memory allocated for str1. That will cause undefined behavior. In your particular case, it seems nothing happens and it even seems that str1 has become larger. That is not so. However (in your particular case), if str2 follows str1 in memory, you just overwrote str2. Try printing str2. It will probably print his is str2.
Since strcat() et al. do not enlarge a buffer, your implementation does not have to do it. (And it is simply not possible with the parameter list of strcat().) It is the caller's responsibility to pass a destination buffer big enough.
On the caller's side you can simply create an array big enough and pass its address. For example, you can use a variable length array (VLA):
char str1[] = "This is str1";
char str2[] = "This is str2";
char str1str2[strlen(str1)+strlen(str2)+1];
strcpy( str1str2, str1 );
yourstrcat( str1str2, str2 );
str1str2 is big enough to store both contents plus 1 for the string terminator \0.
Thanks, everyone, I solved the problem. As some of you said, I don't need to enlarge the string, I just need to make sure it's big enough to contain all the data.
what I did eventually is this:
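void strnCat(char dest[], char src[], int length)
{
    int i = 0;
    int len = strlen(dest);
    /* copy at most length characters, stopping early at the end of src */
    for(i=0; i < length && src[i] != 0; i++)
    {
        dest[len+i] = src[i];
    }
    dest[len+i] = 0; /* terminate the result once, after the loop */
}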
So my main problem was that I forgot to add the null at the end of the array to make it a string, and that I called strlen(str) inside the loop instead of saving the length in a variable. That went wrong because once the null has been overwritten, the string no longer has an end for strlen to find.
It is a really strange task to let students implement strncat, since this is one of the C functions that is very difficult to use correctly.
So to implement it yourself, you should read its specification in the C standard or in the POSIX standard. There you will find that strncat doesn't enlarge any array. By the way, arrays cannot be enlarged in C at all, it's impossible by definition. Note the careful distinction between the words array (can contain arbitrary bytes) and string (must contain one null byte) in the standard wording.
A saner alternative to implement is strlcat, which is not in the C standard but is also widely known.
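For comparison, a rough sketch of a strlcat-style function, written from the commonly documented BSD behaviour as I understand it: the size argument is the full size of the destination buffer, the result is always terminated when there is room, and the return value is the length the combined string would have had. The name my_strlcat is hypothetical and this is not the library's own source.

```c
#include <stddef.h>

/* Append src to dst, where `size` is the full size of the dst buffer.
 * Returns the length the combined string would need; a return value
 * >= size means the result was truncated. */
size_t my_strlcat(char *dst, const char *src, size_t size)
{
    size_t dlen = 0;
    size_t slen = 0;
    size_t i;

    /* Length of dst, but never look past the end of the buffer. */
    while (dlen < size && dst[dlen] != '\0')
        dlen++;
    while (src[slen] != '\0')
        slen++;
    if (dlen == size)
        return size + slen;     /* no terminator within size: append nothing */

    /* Copy as much of src as fits, keeping one byte for the terminator. */
    for (i = 0; i < slen && dlen + i + 1 < size; i++)
        dst[dlen + i] = src[i];
    dst[dlen + i] = '\0';
    return dlen + slen;
}
```

Unlike strncat, the caller passes sizeof of the destination directly and can detect truncation by comparing the return value with that size.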
I want to be clear about all the advantages/disadvantages of the following code:
{
char *str1 = strdup("some string");
char *str2 = "some string";
free(str1);
}
str1:
You can modify the contents of the string
str2:
You don't have to use free()
Faster
Any other differences?
Use neither if you can avoid it; prefer one of the following instead:
static char const str3[] = { "some string" };
char str4[] = { "some string" };
str3 if you never plan to modify it and str4 if you do.
str3 ensures that no other function in your program can modify your string (string literals may be shared, and modifying one is undefined behavior). str4 allocates a constant sized array on the stack, so allocation and deallocation come with no overhead. The system just has to copy your data.
Using the original string - whether it's a literal in the source, part of a memory-mapped file, or even an allocated string "owned" by another part of your program - has the advantage of saving memory, and possibly eliminating ugly error conditions you'd otherwise have to handle if you performed an allocation (which could fail). The disadvantage, of course, is that you have to keep track of the fact that this string is not "owned" by the code currently using it, and thus that it cannot be modified/freed. Sometimes this means you need a flag in a structure to indicate whether a string it uses was allocated for the structure or not. With smaller programs, it might just mean you have to manually follow the logic of string ownership through several functions and make sure it's correct.
By the way, if the string is going to be used by a structure, one nice way to get around having to keep a flag marking whether it was allocated for the structure or not is to allocate space for the structure and the string (if needed) with a single call to malloc. Then, freeing the structure always just works, regardless of whether the string was allocated for the structure or assigned from a string literal or other source.
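A sketch of the single-allocation idea, assuming a C99 compiler (it uses a flexible array member). The struct record type and the record_borrow and record_copy helpers are hypothetical names chosen for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* `name` points either at borrowed storage (for example a string
 * literal) or at the `storage` bytes allocated together with the struct. */
struct record {
    const char *name;
    char storage[];     /* C99 flexible array member */
};

/* Borrow an existing string; no copy is made, so the string must
 * outlive the record. Returns NULL if the allocation fails. */
struct record *record_borrow(const char *name)
{
    struct record *r = malloc(sizeof *r);
    if (r != NULL)
        r->name = name;
    return r;
}

/* Copy the string into space allocated in the same malloc call as the
 * struct itself. Returns NULL if the allocation fails. */
struct record *record_copy(const char *name)
{
    size_t n = strlen(name) + 1;
    struct record *r = malloc(sizeof *r + n);
    if (r != NULL) {
        memcpy(r->storage, name, n);
        r->name = r->storage;
    }
    return r;
}
```

Either kind of record is released with a single free(r), so no ownership flag is needed to decide whether the string has to be freed separately.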
strdup is not C89 and not C99 -> not ANSI C -> not portable
The string literal is portable, and str2 is implicitly const