Proper way to empty a C-String - c

I've been working on a project in C that requires me to mess around with strings a lot. Normally, I do program in C++, so this is a bit different than just saying string.empty().
I'm wondering what would be the proper way to empty a string in C. Would this be it?
buffer[80] = "Hello World!\n";
// ...
strcpy(buffer, "");

It depends on what you mean by "empty". If you just want a zero-length string, then your example will work.
This will also work:
buffer[0] = '\0';
If you want to zero the entire contents of the string, you can do it this way:
memset(buffer,0,strlen(buffer));
but this will only work for zeroing up to the first NULL character.
If the string is a static array, you can use:
memset(buffer,0,sizeof(buffer));

Two other ways are strcpy(str, ""); and string[0] = 0
To really delete the Variable contents (in case you have dirty code which is not working properly with the snippets above :P ) use a loop like in the example below.
#include <string.h>
...
int i=0;
for(i=0;i<strlen(string);i++)
{
string[i] = 0;
}
In case you want to clear a dynamic allocated array of chars from the beginning,
you may either use a combination of malloc() and memset() or - and this is way faster - calloc() which does the same thing as malloc but initializing the whole array with Null.
At last i want you to have your runtime in mind.
All the way more, if you're handling huge arrays (6 digits and above) you should try to set the first value to Null instead of running memset() through the whole String.
It may look dirtier at first, but is way faster. You just need to pay more attention on your code ;)
I hope this was useful for anybody ;)

Depends on what you mean by emptying. If you just want an empty string, you could do
buffer[0] = 0;
If you want to set every element to zero, do
memset(buffer, 0, 80);

If you are trying to clear out a receive buffer for something that receives strings I have found the best way is to use memset as described above. The reason is that no matter how big the next received string is (limited to sizeof buffer of course), it will automatically be an asciiz string if written into a buffer that has been pre-zeroed.

I'm a beginner but...Up to my knowledge,the best way is
strncpy(dest_string,"",strlen(dest_string));

For those who clicked into this question to see some information about bzero and explicit_bzero:
bzero: this function is deprecated
explicit_bzero: this is not a standard c function and is absent on some of the BSDs
Further explanation:
why-use-bzero-over-memset
bzero(3)

needs name of string and its length
will zero all characters
other methods might stop at the first zero they encounter
void strClear(char p[],u8 len){u8 i=0;
if(len){while(i<len){p[i]=0;i++;}}
}

Related

Trying to figure out how to solve char and strlen related problem [duplicate]

I am currently working on a program which involves creating a template for an exam.
In the function where I allow the user to add a question to the exam, I am required to ensure that I use only as much memory as is required to store it's data. I've managed to do so after a great deal of research into the differences between various input functions (getc, scanf, etc), and my program seems to be working but I am concerned about one thing. Here is the code for my function, I've placed a comment on the line in question:
int AddQuestion(){
Question* newQ = NULL;
char tempQuestion[500];
char* newQuestion;
if(exam.phead == NULL){
exam.phead = (Question*)malloc(sizeof(Question));
}
else{
newQ = (Question*)malloc(sizeof(Question));
newQ->pNext = exam.phead;
exam.phead = newQ;
}
while(getchar() != '\n');
puts("Add a new question.\n"
"Please enter the question text below:");
fgets(tempQuestion, 500, stdin);
newQuestion = (char*)malloc(strlen(tempQuestion) + 1); /*Here is where I get confused*/
strcpy(newQuestion, tempQuestion);
fputs(newQuestion, stdout);
puts("Done!");
return 0;
}
What's confusing me is that I've tried running the same code but with small changes to test exactly what is going on behind the scenes. I tried removing the + 1 from my malloc, which I put there because strlen only counts up to but not including the terminating character and I assume that I want the terminating character included. That still ran without a hitch. So I tried running it but with - 1 instead under the impression that doing so would remove whatever is before the terminating character (newline character, correct?). Still, it displayed everything on separate lines.
So now I'm somewhat baffled and doubting my knowledge of how character arrays work. Could anybody help clear up what's going on here, or perhaps provide me with a resource which explains this all in further detail?
In C, strings are conventionally null-terminated. Strlen, however, only counts the characters before the null. So, you always must add one to the value of strlen to get enough space. Or call strdup.
A C string contains the characters you can see "abc" plus one you can't which marks the end of the string. You represent this as '\0'. The strlen function uses the '\0' to find the end of the string, but doesn't count it.
So
myvar = malloc(strlen(str) + 1);
is correct. However, what you tried:
myvar = malloc(strlen(str));
and
myvar = malloc(strlen(str) - 1);
while INCORRECT, MAY seem to work some of the time. This is because malloc typically allocates memory in chunks, (say maybe in units of 16 bytes) rather than the exact size you ask for. So sometimes, you may 'luck out' and end up using the 'slop' at the end of the chunk.

C strings initialization. Possible trouble

Good day everyone. I have the following c-string initialization:
*(str) = 0; where str is declared as char str[255];. The questions are:
1) is it the best way to initialize a string this way?
2) should I expect trouble with this code on 64-bit platform?
Thanks in advance.
The instruction *(str) = 0 is completely equivalent to str[0] = 0;.
The initiatlization simply place the integer 0 (which is equivalent to '\0') in the first position, making the string empty.
1) There's no "best way", the key is being consistent so if you choose the "*(str) = 0;" path, always do the same everywhere.
2) No trouble could ever come from that statement. Whatever the architecture, the compiler will take care of everything.
What this does is change the uninitialized char array str, of which each element has some undefined value, to have its first element set to the null terminator character.
As far as most string related functions go, they will traverse this string from left to right, and immediately see the null terminator.
There are surely functions that don't care about null terminators, and are given a maximum size. In these cases, you are reading undefined values, which is Bad™.
Use this instead:
char str[255] = { 0 }; // zero every element
or initialize the string to something useful on first use.

Pointer mystery/noobish issue

I am originally a Java programmer who is now struggling with C and specifically C's pointers.
The idea on my mind is to receive a string, from the user, on a command line, into a character pointer. I then want to access its individual elements. The idea is later to devise a function that will reverse the elements' order. (I want to work with anagrams in texts.)
My code is
#include <stdio.h>
char *string;
int main(void)
{
printf("Enter a string: ");
scanf("%s\n",string);
putchar(*string);
int i;
for (i=0; i<3;i++)
{
string--;
}
putchar(*string);
}
(Sorry, Code marking doesn't work).
What I am trying to do is to have a first shot at accessing individual elements. If the string is "Santillana" and the pointer is set at the very beginning (after scanf()), the content *string ought to be an S. If unbeknownst to me the pointer should happen to be set at the '\0' after scanf(), backing up a few steps (string-- repeated) ought to produce something in the way of a character with *string. Both these putchar()'s, though, produce a Segmentation fault.
I am doing something fundamentally wrong and something fundamental has escaped me. I would be eternally grateful for any advice about my shortcomings, most of all of any tips of books/resources where these particular problems are illuminated. Two thick C books and the reference manual have proved useless as far as this.
You haven't allocated space for the string. You'll need something like:
char string[1024];
You also should not be decrementing the variable string. If it is an array, you can't do that.
You could simply do:
putchar(string[i]);
Or you can use a pointer (to the proposed array):
char *str = string;
for (i = 0; i < 3; i++)
str++;
putchar(*str);
But you could shorten that loop to:
str += 3;
or simply write:
putchar(*(str+3));
Etc.
You should check that scanf() is successful. You should limit the size of the input string to avoid buffer (stack) overflows:
if (scanf("%1023s", string) != 1)
...something went wrong — probably EOF without any data...
Note that %s skips leading white space, and then reads characters up to the next white space (a simple definition of 'word'). Adding the newline to the format string makes little difference. You could consider "%1023[^\n]\n" instead; that looks for up to 1023 non-newlines followed by a newline.
You should start off avoiding global variables. Sometimes, they're necessary, but not in this example.
On a side note, using scanf(3) is bad practice. You may want to look into fgets(3) or similar functions that avoid common pitfalls that are associated with scanf(3).

Different ways to calculate string length

A comment on one of my answers has left me a little puzzled. When trying to compute how much memory is needed to concat two strings to a new block of memory, it was said that using snprintf was preferred over strlen, as shown below:
size_t length = snprintf(0, 0, "%s%s", str1, str2);
// preferred over:
size_t length = strlen(str1) + strlen(str2);
Can I get some reasoning behind this? What is the advantage, if any, and would one ever see one result differ from the other?
I was the one who said it, and I left out the +1 in my comment which was written quickly and carelessly, so let me explain. My point was merely that you should use the pattern of using the same method to compute the length that will eventually be used to fill the string, rather than using two different methods that could potentially differ in subtle ways.
For example, if you had three strings rather than two, and two or more of them overlapped, it would be possible that strlen(str1)+strlen(str2)+strlen(str3)+1 exceeds SIZE_MAX and wraps past zero, resulting in under-allocation and truncation of the output (if snprintf is used) or extremely dangerous memory corruption (if strcpy and strcat are used).
snprintf will return -1 with errno=EOVERFLOW when the resulting string would be longer than INT_MAX, so you're protected. You do need to check the return value before using it though, and add one for the null terminator.
If you only need to determine how big would be the concatenation of the two strings, I don't see any particular reason to prefer snprintf, since the minimum operations to determine the total length of the two strings is what the two strlen calls do. snprintf will almost surely be slower, because it has to check the parameters and parse the format string besides just walking the two strings counting the characters.
... but... it may be an intelligent move to use snprintf if you are in a scenario where you want to concatenate two strings, and have a static, not too big buffer to handle normal cases, but you can fallback to a dynamically allocated buffer in case of big strings, e.g.:
/* static buffer "big enough" for most cases */
char buffer[256];
/* pointer used in the part where work on the string is actually done */
char * outputStr=buffer;
/* try to concatenate, get the length of the resulting string */
int length = snprintf(buffer, sizeof(buffer), "%s%s", str1, str2);
if(length<0)
{
/* error, panic and death */
}
else if(length>sizeof(buffer)-1)
{
/* buffer wasn't enough, allocate dynamically */
outputStr=malloc(length+1);
if(outputStr==NULL)
{
/* allocation error, death and panic */
}
if(snprintf(outputStr, length, "%s%s", str1, str2)<0)
{
/* error, the world is doomed */
}
}
/* here do whatever you want with outputStr */
if(outputStr!=buffer)
free(outputStr);
One advantage would be that the input strings are only scanned once (inside the snprintf()) instead of twice for the strlen/strcpy solution.
Actually, on rereading this question and the comment on your previous answer, I don't see what the point is in using sprintf() just to calculate the concatenated string length. If you're actually doing the concatenation, my above paragraph applies.
You need to add 1 to the strlen() example. Remember you need to allocate space for nul terminating byte.
So snprintf( ) gives me the size a string would have been. That means I can malloc( ) space for that guy. Hugely useful.
I wanted (but did not find until now) this function of snprintf( ) because I format tons of strings for output later; but I wanted not to have to assign static bufs for the outputs because it's hard to predict how long the outputs will be. So I ended up with a lot of 4096-long char arrays :-(
But now -- using this newly-discovered (to me) snprintf( ) char-counting function, I can malloc( ) output bufs AND sleep at night, both.
Thanks again and apologies to the OP and to Matteo.
EDIT: random, mistaken nonsense removed. Did I say that?
EDIT: Matteo in his comment below is absolutely right and I was absolutely wrong.
From C99:
2 The snprintf function is equivalent to fprintf, except that the output is written into
an array (specified by argument s) rather than to a stream. If n is zero, nothing is written,
and s may be a null pointer. Otherwise, output characters beyond the n-1st are
discarded rather than being written to the array, and a null character is written at the end
of the characters actually written into the array. If copying takes place between objects
that overlap, the behavior is undefined.
Returns
3 The snprintf function returns the number of characters that would have been written
had n been sufficiently large, not counting the terminating null character, or a neg ative
value if an encoding error occurred. Thus, the null-terminated output has been
completely written if and only if the returned value is nonnegative and less than n.
Thank you, Matteo, and I apologize to the OP.
This is great news because it gives a positive answer to a question I'd asked here only a three weeks ago. I can't explain why I didn't read all of the answers, which gave me what I wanted. Awesome!
The "advantage" that I can see here is that strlen(NULL) might cause a segmentation fault, while (at least glibc's) snprintf() handles NULL parameters without failing.
Hence, with glibc-snprintf() you don't need to check whether one of the strings is NULL, although length might be slightly larger than needed, because (at least on my system) printf("%s", NULL); prints "(null)" instead of nothing.
I wouldn't recommend using snprintf() instead of strlen() though. It's just not obvious. A much better solution is a wrapper for strlen() which returns 0 when the argument is NULL:
size_t my_strlen(const char *str)
{
return str ? strlen(str) : 0;
}

Remove redundant whitespace from string (in-place)

Ok so i posted earlier about trying to (without any prebuilt functions) remove additional spaces so
"this is <insert many spaces!> a test" would return
"this is a test"
Remove spaces from a string, but not at the beginning or end
As it was homework i asked for no full solutions and some kind people provided me with the following
"If you need to do it in place, then create two pointers. One pointing to the character being read and one to the character being copied. When you meet an extra space, then adapt the 'write' pointer to point to the next non space character. Copy to the read position the character pointed by the write character. Then advance the read and write pointers to the character after the character being copied."
The problem is i now want to fully smash my computer in to pieces as i am sooooo irritated by it. I didnt realise at the time i couldnt utilise the char array so was using array indexes to do it, i thought i could suss how to get it to work but now i am just using pointers i am finding it very hard. I really need some help, not full solutions. So far this is what i am doing;
1)create a pointer called write, so i no where to write to
2)create a pointer called read so i no where to read from
(both of these pointers will now point to the first element of the char array)
while (read != '\0')
if read == a space
add one to read
if read equals a space now
add one to write
while read != a space {
set write to = read
}
add one to both read and write
else
add one to read and write
write = read
Try just doing this yourself character by character on a piece of paper and work out what it is you are doing first, then translate that into code.
If you are still having trouble, try doing something simpler first, for example just copying a string character for character without worrying about the "remove duplicate spaces" part of it - just to make sure that you haven't made a silly mistake elsewhere.
The advice of trying to do it with pen and paper first is good. And it really doesn't matter if you do it with pointers or array indexing; you can either use a reader and a writer pointer, or a reader and a writer index.
Think about when you want to move the indices forward. You always move the write index forward after you write a character. You move the read index forward when you've read a character.
Perhaps you could start with some code that just moves over the string, but actually doesn't change it. And then you add the logic that skips additional spaces.
char p[] = "this is a test";
char *readptr = &p[0];
char *writeptr = &p[0];
int inspaces = 0;
while(*readptr) {
if(isspace(*readptr)) {
inspaces ++;
} else {
inspaces = 0;
}
if(inspaces <= 1) {
*writeptr = *readptr;
writeptr++;
}
readptr++;
}
*writeptr = 0;
Here's a solution that only has a single loop (no inner loop to skip spaces) and no state data:
dm9pZCBTdHJpcFNwYWNlcyAoY2hhciAqdGV4dCkNCnsNCiAgY2hhciAqc3JjID0gdGV4dCwgKmRl
c3QgPSB0ZXh0Ow0KICB3aGlsZSAoKnNyYykNCiAgew0KICAgICpkZXN0ID0gKihzcmMrKyk7DQog
ICAgaWYgKCpzcmMgIT0gJyAnIHx8ICpkZXN0ICE9ICcgJykNCiAgICB7DQogICAgICArK2Rlc3Q7
DQogICAgfQ0KICB9DQogICpkZXN0ID0gMDsNCn0NCg==
The above is Base64 encoded as this is a homework question. To decode, copy the above block and paste into this website to decode it (you'll need to check the "Decode" radio button).
assuming you're trying to write a function like:
void removeSpaces(char * str){
/* ... stuff that changes the contents of str[] */
}
You want to scan the string for consecutive spaces, so that your write pointer is always trailing your read pointer. Advance your read pointer to a place where you are pointing at a space, but the next character is not a space. If your read and write pointers are not the same, then your write pointer ought to be pointing at the beginning of a sequence of spaces. The difference between your write and read pointer (i.e. read_pointer - write_pointer) will tell you the number of consecutive spaces that need to be overwritten to close the gap. When there is a difference of greater than zero, (prefix) advance both pointers along by that many positions, copying characters as you go. When you're read pointer is at the end of the string ('\0'), you should be done.
Can you use regular expressions? If so, it might be really easy. Use a regex to replace \s{2,}? with a single space. The \s means any white space (tabs, spaces, carriage feeds...); {2,} means 2 or more; ? means non-greedy. (Disclaimer: my regex might not be the best one you could write, since I'm no regex pro. Also, it's .net syntax, so the regex library for C might have slightly different syntax.)
Why don't you do something like this:
make a second char** that is the same length as the first. As you run through the first array with your pointers keep an extra pointer on the last space you've seen. If the last space you saw was the previous element then you don't copy that char to the second array.
But I would start with something that runs through the char** by each character and prints out each char. If you can make that happen then you can work on actually copying them into a second char**.

Resources