I was working on writing my own function to concatenate strings in C. One solution that was given to me was:
char *_strcat(char *dest, char *src)
{
int c, c2;
c = 0;
while (dest[c])
c++;
for (c2 = 0; src[c2] ; c2++)
dest[c++] = src[c2];
return (dest);
}
The part that confuses me is while (dest[c]), and other similar parts. I've already gone through pointers through various resources but I can't seem to understand this part. A good explanation will be much appreciated.
For starters, the function is incorrect. It does not build a concatenated string, because it never appends the terminating zero character '\0' to the result string (dest) in this for loop
for (c2 = 0; src[c2] ; c2++)
dest[c++] = src[c2];
Also the function should be declared like
char * _strcat( char *dest, const char *src );
because the appended string (src) is not changed.
This while loop
while (dest[c])
c++;
is equivalent to
while (dest[c] != '\0' )
c++;
and this for loop
for (c2 = 0; src[c2] ; c2++)
dest[c++] = src[c2];
is equivalent to
for (c2 = 0; src[c2] != '\0' ; c2++)
dest[c++] = src[c2];
That is, each loop iterates until it meets the terminating zero character '\0': the while loop scans dest to find its end, and the for loop scans src to find its end.
In a condition, any non-zero scalar expression is treated as logically true.
The variables c and c2 should also have the unsigned type size_t instead of int, because an object of type int may not be large enough to store every possible string length.
You should also avoid defining names that start with an underscore, because at file scope such identifiers are reserved for the implementation.
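Putting those corrections together, a subscript-based version might look like the sketch below. The name concat_indexed is made up for illustration, and the code assumes the caller has made dest large enough to hold both strings plus the terminator.

```c
#include <stddef.h>

/* Hypothetical name; assumes dest has room for both strings plus '\0'. */
char *concat_indexed(char *dest, const char *src)
{
    size_t i = 0;
    size_t j;

    /* Find the end of dest: stop at its terminating '\0'. */
    while (dest[i] != '\0')
        i++;

    /* Copy the characters of src after it, without the terminator. */
    for (j = 0; src[j] != '\0'; j++)
        dest[i++] = src[j];

    /* Append the terminator that the original loop left out. */
    dest[i] = '\0';
    return dest;
}
```

With char buf[16] = "foo", calling concat_indexed(buf, "bar") should leave "foobar" in buf.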
As for your question
How exactly does pointer value incrementing work?
then the pointers themselves are not incremented. Instead, the code uses expressions with the subscript operator to access elements of the strings, for example dest[c], dest[c++], or src[c2].
The function can be defined the following way
char * my_strcat( char *dest, const char *src )
{
char *p = dest;
while ( *p != '\0' ) ++p;
while ( ( *p++ = *src++ ) != '\0' );
return dest;
}
In this function the pointers p and src really are incremented, and no expression with the subscript operator is used.
char *dest is a pointer to char, and on entry it points to the first character of the string. The following loop then advances the offset c to the end of the string; dest itself is not changed.
c = 0;
while (dest[c])
c++;
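As a rough illustration of what dest[c] means, the sketch below counts characters twice: once with a subscript (where s[c] is the same as *(s + c)) and once by moving a copy of the pointer. Both helper names are made up for this example.

```c
#include <stddef.h>

/* Hypothetical helpers for illustration only. */

/* Counts characters using a subscript; s itself never changes. */
size_t length_by_index(const char *s)
{
    size_t c = 0;
    while (s[c])          /* same test as *(s + c) != '\0' */
        c++;
    return c;
}

/* Counts characters by advancing a copy of the pointer instead. */
size_t length_by_pointer(const char *s)
{
    const char *p = s;
    while (*p)
        p++;
    return (size_t)(p - s);
}
```

Both functions should return 5 for "Hello"; only the second one actually increments a pointer.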
I am trying to replicate the strcmp() function from the string.h library and here is my code
/**
* string_compare - this function compares two strings pointed
* by s1 and s2. Is a replica of the strcmp from the string.h library
* @s1: The first string to be compared
* @s2: The second string to be compared
*
* Return: On success, it returns:
* 0 if s1 is equal to s2
* negative value if s1 is less than s2
* positive value if s1 is greater than s2
*/
int string_compare(char *s1, char *s2)
{
int sum = 0, i;
for (i = 0; s1[i] != '\0' && s2[i] != '\0'; i++)
sum += (s1[i] - s2[i]);
for ( ; s1[i] != '\0'; i++)
sum += (s1[i] - 0);
for ( ; s2[i] != '\0'; i++)
sum += (0 - s2[i]);
return (sum);
}
I tried my function using this sample code:
#include <stdio.h>
int main(void)
{
char s1[] = "Hello";
char s2[] = "World!";
printf("%d\n", string_compare(s1, s2));
printf("%d\n", string_compare(s2, s1));
printf("%d\n", string_compare(s1, s1));
return (0);
}
And I get the following output,
-53
-500
0
But I should be getting:
-15
15
0
Why am I getting such a result??
This approach is incorrect.
Let's assume that the first string is "B" and the second string is "AB".
It is evident that the first string is greater than the second string in the lexicographical order.
But the result will be negative due to this for loop
for ( ; s2[i] != '\0'; i++)
sum += (0 - s2[i]);
though the function shall return a positive value.
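Moreover, the variable sum of type int can overflow. And when s1 is longer than s2, the second loop advances i past the end of s2, so the third loop reads s2[i] out of bounds, which is undefined behavior; that is how a call such as string_compare("World!", "Hello") can produce a value like -500.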
Also the function should be declared at least like
int string_compare( const char *s1, const char *s2);
because passed strings are not changed within the function.
The function can be defined the following way
int string_compare( const char *s1, const char *s2 )
{
while ( *s1 && *s1 == *s2 )
{
++s1;
++s2;
}
return ( unsigned char )*s1 - ( unsigned char )*s2;
}
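To check the sign behavior, here is a small self-contained sketch of the same idea under a made-up name, first_difference. It returns the difference of the first pair of characters that differ, which is why the result depends only on that position and not on the rest of either string. The concrete values mentioned afterwards assume an ASCII character set.

```c
/* Hypothetical name; returns the difference at the first mismatch. */
int first_difference(const char *s1, const char *s2)
{
    /* Advance while the characters match and s1 has not ended. */
    while (*s1 != '\0' && *s1 == *s2) {
        ++s1;
        ++s2;
    }
    /* Compare as unsigned char so characters above 127 order correctly. */
    return (unsigned char)*s1 - (unsigned char)*s2;
}
```

On an ASCII system, first_difference("Hello", "World!") should give -15 ('H' minus 'W'), the reversed call 15, and first_difference("B", "AB") a positive value.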
You are overcomplicating a very simple function.
#define UC unsigned char
int mystrcmp(const char *s1, const char *s2)
{
int result;
while(!(result = (UC)*s1 - (UC)*s2++) && *s1++);
return result;
}
Strings in C are arrays of characters terminated with a null character (\0).
When you pass a string to a function, you are passing a pointer to its first element. That pointer is passed by value. You can modify that pointer within the function without any side-effects on the string it points to, as long as you don't dereference and assign to the address it points to.
That's why the pointer math from 0___________'s answer works.
int mystrcmp1(const char *s1, const char *s2) {
int result = 0;
while(!(result = *s1 - *s2++) && *s1++);
return result;
}
*s1++ could be rewritten as *(s1++) to disambiguate. s1++ yields the pointer's current value (at first, the beginning of the first string) and then increments the pointer so it points to the next character. The value it yielded is then dereferenced to give us the character. The same happens with the s2 pointer.
Then we're comparing them by subtraction. If they're the same, we get 0, which in C is false in a boolean context. This result is assigned to result.
We can now see that the loop continues while corresponding characters in the two strings are equal and while dereferencing s1 does not give us the null terminator.
When the loop ends, it means there was either a difference or we reached the end of the first string.
The difference will be stored in result, which the function returns.
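The walk-through above can be sketched as an explicit loop, one step per statement. The name compare_expanded is made up, and the unsigned char casts follow the earlier one-line version rather than mystrcmp1.

```c
/* Hypothetical name; an expanded form of the one-line comparison loop. */
int compare_expanded(const char *s1, const char *s2)
{
    for (;;) {
        /* Difference of the current pair, compared as unsigned char. */
        int result = (unsigned char)*s1 - (unsigned char)*s2;

        if (result != 0)      /* characters differ: report the difference */
            return result;
        if (*s1 == '\0')      /* equal and both ended: strings are equal */
            return 0;

        ++s1;                 /* equal and not at the end: advance both */
        ++s2;
    }
}
```

The one-line version does the same work, but folds the subtraction, both tests, and both increments into the loop condition.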
While writing my own string functions, I tried building a function somewhat similar to strlwr(), which I named lowercase():
#include <stdio.h>
#include <ctype.h>
char *lowercase(char *text);
int main() {
char *hello = "Hello, world!";
printf("%s\n", lowercase(hello));
}
char *lowercase(char *text) {
for (int i = 0; ; i++) {
if (isalpha(text[i])) {
(int) text[i] += ('a' - 'A');
continue;
} else if (text[i] == '\0') {
break;
}
}
return text;
}
I learned that the gap between an uppercase letter and its lowercase letter is 32, which is what I used. But then I got this error:
lowercase.c:14:13: error: assignment to cast is illegal, lvalue casts are not supported
(int) text[i] += 32;
^~~~~~~~~~~~~ ~~
I want to increment the value of the char if it is considered a letter from A-Z. Turns out I can't, since the char is in an array, and the way I'm doing it doesn't seem to make sense for the computer.
Q: What alternate ways can I use to complete this function? Can you explain further why this error is like this?
Although string literals in C have non-const character array types, you may not modify them.
char *hello = "Hello, world!";
From the C Standard (6.4.5 String literals)
7 It is unspecified whether these arrays are distinct provided their
elements have the appropriate values. If the program attempts to
modify such an array, the behavior is undefined.
So you should declare the identifier hello as a character array
char hello[] = "Hello, world!";
Within the function you should not use magic numbers like 32. For example, if the compiler uses the EBCDIC encoding, your function will produce a wrong result.
And for the loop index you should use the type size_t instead of int, because an object of type int may be unable to store all values of size_t, which is the type yielded by the sizeof operator and returned by strlen.
This statement
(int) text[i] += 32;
does not make sense, because the cast yields an rvalue, and an rvalue cannot appear on the left side of an assignment.
The function can be implemented the following way
char * lowercase( char *text )
{
for ( char *p = text; *p; ++p )
{
if ( isalpha( ( unsigned char )*p ) )
{
*p = tolower( ( unsigned char )*p );
}
}
return text;
}
The cast is unnecessary. chars are integral types and can be incremented without fanfare:
text[i] += 32;
As several commenters have noted, you should also change your string to a modifiable string. char *hello = "..." declares a pointer to a string literal, which must not be modified. Use array syntax to get a writable copy.
char hello[] = "Hello, world!";
You'll also want to switch isalpha() to isupper() so you only modify uppercase letters.
By the way, if you move the else if check into the for loop's test condition you can get rid of both the break and the continue.
for (int i = 0; text[i] != '\0'; i++) {
if (isupper((unsigned char) text[i])) {
text[i] += 32;
}
}
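If hard-coding 32 is a concern (it assumes an ASCII-style layout), a variant of the same indexed loop could lean on tolower instead. This is only a sketch; lowercase_indexed is a made-up name, and text must point to a modifiable array rather than a string literal.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical name; text must be writable, e.g. char hello[] = "...". */
char *lowercase_indexed(char *text)
{
    for (size_t i = 0; text[i] != '\0'; i++) {
        /* tolower returns non-uppercase characters unchanged; the cast to
           unsigned char keeps the argument in the range tolower accepts. */
        text[i] = (char)tolower((unsigned char)text[i]);
    }
    return text;
}
```

With char hello[] = "Hello, world!", lowercase_indexed(hello) should yield "hello, world!".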
Why can't I put *a++ in this while loop and get what I want? (I saw in a C book that this form can be used.) I get something else in the output.
void strcat(char *a, char *b)
{
while( *a != '\0'){
a++;
}
for ( ;*b != '\0' ; *a++ = *b++);
}
When I checked the current value of *a after the while loop, both versions (the one above and the one below) print the same value, 0. But the final result is correct only with the version above.
Why can't I do something like this?
while( *a++ != '\0');
while( *a != '\0'){
a++;
}
and
while( *a++ != '\0');
are not identical.
The first one increments a as long as it does not point to the terminator,
the second increments a and repeats that as long as a did not point to the terminator before the increment.
The difference is exactly one increment of a, making the second code an off-by-one-error.
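One way to see the extra increment is to measure how far the pointer has moved after each form of the loop. The two helper names below are made up for this sketch.

```c
#include <stddef.h>

/* Hypothetical helpers: each returns how far the pointer has moved. */

/* Test first, then step: stops with a pointing AT the terminator. */
size_t offset_after_check_then_step(const char *a)
{
    const char *start = a;
    while (*a != '\0') {
        a++;
    }
    return (size_t)(a - start);
}

/* Step inside the condition: the increment also happens on the final,
   failing test, so a ends up one PAST the terminator. */
size_t offset_after_step_in_condition(const char *a)
{
    const char *start = a;
    while (*a++ != '\0') {
    }
    return (size_t)(a - start);
}
```

For "abc" the first function should return 3 and the second 4, so anything appended after the second loop lands behind the old terminator and stays invisible to string functions.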
You have a similar problem with the second loop:
for ( ;*b != '\0' ; *a++ = *b++);
It checks whether it reached the terminator, and otherwise copies one element from b to a.
Thus, it does not copy the terminator!
Change to:
while((*a++ = *b++)) {}
(Double-parentheses to suppress compiler-warning about possibly erroneous assignment in conditional expression.)
Additional tip:
Make intentional empty statements more obvious, use {}.
Also, when you re-implement the standard-library, consider following its definition, return the result-string.
Final code:
char* strcat(char *a, const char *b)
{
char* ret = a;
while(*a)
a++;
while((*a++ = *b++))
{}
return ret;
}
When terminating a string, it seems to me that logically char c=0 is equivalent to char c='\0', since the "null" (ASCII 0) byte is 0, but usually people tend to do '\0' instead. Is this purely out of preference or should it be a better "practice"?
What is the preferred choice?
EDIT: K&R says: "The character constant '\0' represents the character with value zero, the null character. '\0' is often written instead of 0 to emphasize the character nature of some expression, but the numeric value is just 0."
http://en.wikipedia.org/wiki/Ascii#ASCII_control_code_chart
Binary  | Oct | Dec | Hex | Abbr | Unicode | Control char | C escape code | Name
0000000 | 000 | 0   | 00  | NUL  | ␀       | ^@           | \0            | Null character
There's no difference, but the more idiomatic one is '\0'.
Putting it down as char c = 0; could mean that you intend to use it as a number (e.g. a counter). '\0' is unambiguous.
'\0' is just an ASCII character. The same as 'A', or '0' or '\n'
If you write char c = '\0', it's the same as char c = 0;
If you write char c = 'A', it's the same as char c = 65
It's just a character representation, and it's good practice to write it when you really mean the null byte that terminates a string. Since char in C is a one-byte integral type, '\0' has no special meaning beyond the value 0.
The preferred choice is whichever lets people reading your code understand how you use the variable: as a number or as a character.
Best practice is to use 0 when you mean your variable as a number and '\0' when you mean it as a character.
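A quick way to convince yourself that the two spellings behave identically is to terminate the same buffer with each one and compare the resulting lengths. The function name below is made up for this sketch.

```c
#include <string.h>

/* Hypothetical helper: cuts "hello" short at index 2 using either spelling. */
size_t length_after_terminating(int use_escape)
{
    char buf[] = "hello";

    if (use_escape)
        buf[2] = '\0';    /* character spelling */
    else
        buf[2] = 0;       /* numeric spelling, same stored value */

    return strlen(buf);   /* strlen stops at the first zero byte */
}
```

Both calls, with use_escape set to 1 or to 0, should return 2; the choice between the spellings only affects how the code reads.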
The above answers are already quite clear. I just share what I learned about this issue with a demo.
#include <stdlib.h>
#include <stdio.h>
char*
mystrcat(char *dest, char *src) {
size_t i,j;
for(i = 0; dest[i] != '\0'; i++)
;
for(j = 0; src[j] != '\0'; j++)
dest[i+j] = src[j];
dest[i+j] = '\0';
return dest;
}
int main() {
char *str = malloc(20); // malloc allocate memory, but doesn't initialize the memory
// str[0] = '\0';
str[0] = 0;
for (int k = 0; k <10; k++) {
char s[2];
sprintf(s, "%d", k);
mystrcat(str, s);
}
printf("debug:%s\n", str);
return 0;
}
In the above program I used malloc to allocate the buffer, but malloc doesn't initialize the memory. Without initialization, after the mystrcat operation (which is nearly the same as the strcat function in glibc) the string may contain garbage, since the memory content is indeterminate.
So I need to initialize the memory. In this case both str[0] = 0 and str[0] = '\0' make it work.
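One way to avoid forgetting that step is to wrap the allocation and the initialization together. The helper below is a made-up sketch, not part of any library; it returns NULL when capacity is 0 or the allocation fails.

```c
#include <stdlib.h>

/* Hypothetical helper: allocate a buffer that already holds an empty string. */
char *new_empty_string(size_t capacity)
{
    char *s;

    if (capacity == 0)        /* no room even for the terminator */
        return NULL;

    s = malloc(capacity);
    if (s != NULL)
        s[0] = '\0';          /* make it a valid, empty C string */
    return s;
}
```

A buffer obtained this way can be passed straight to a concatenation function, and the caller still has to free it.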
My head hurts from trying to find a solution for this assignment, because none of my ideas worked...
I have to interlace two char strings using pointers. See the following example (this example is not code):
char s1 = "My House Black"
char s2 = "Are very near"
Result: "MAyr eH ovuesrey Bnleaacrk"
How can I do this?
Try:
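/* Assumes result is a char array with room for total + 1 characters. */
size_t total = strlen(char1) + strlen(char2);
size_t i1 = 0, i2 = 0;
for (size_t i = 0; i < total; i++)
{
    /* Take from char1 on even steps while it has characters left,
       and on any step once char2 is used up. */
    if ((i % 2 == 0 && char1[i1] != '\0') || char2[i2] == '\0')
    {
        result[i] = char1[i1];
        i1++;
    }
    else
    {
        result[i] = char2[i2];
        i2++;
    }
}
result[total] = '\0'; /* terminate the interlaced string */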
Here's a hint (pseudocode):
result = ""
for i = 0 to longest string's length:
result += some character (whose?)
result += another character (also, whose?)
Be careful: you need a little check somewhere, otherwise bad things might happen.
Since this is homework, I will just give you an outline.
First of all declare your two strings:
const char *s1 = "My House Black";
const char *s2 = "Are very near";
Next declare two pointers to const char:
const char *p1 = s1;
const char *p2 = s2;
Now enter a while loop. The condition should be that *p1 or *p2 are not equal to zero.
Inside the loop output *p1 if it is not zero and then output *p2 if it is not zero. Increment each pointer if it refers to a non-zero character.
That's it, you are done!
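The outline above can be sketched as follows, writing into a caller-supplied buffer instead of printing so the result can be reused. The name interlace_into is made up, and out is assumed to have room for strlen(s1) + strlen(s2) + 1 characters.

```c
/* Hypothetical name; out must have room for both strings plus '\0'. */
char *interlace_into(char *out, const char *s1, const char *s2)
{
    const char *p1 = s1;
    const char *p2 = s2;
    char *q = out;

    /* Keep going while either input still has characters left. */
    while (*p1 != '\0' || *p2 != '\0') {
        if (*p1 != '\0')
            *q++ = *p1++;     /* next character of the first string */
        if (*p2 != '\0')
            *q++ = *p2++;     /* next character of the second string */
    }

    *q = '\0';                /* terminate the result */
    return out;
}
```

For the inputs "abc" and "12345" this should produce "a1b2c345": once the shorter string runs out, the rest of the longer one is copied as is.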
Since this is tagged homework, I don't want to directly post code. But create three character arrays, one for each input, one long enough to contain the output, and traverse the input character arrays one character at a time (use pointer arithmetic). Store the character into your output string. Continue until you reach the end of each string. Don't forget the null terminations!
You need a target string that is large enough to hold both input strings and the string terminator.
Then you should probably use a loop (while or for) where you copy one character from each input string in each iteration.
For extra credits:
Consider the case where the input strings are of unequal length.
Is this what you want?
const char * s1 = "My House Black";
const char * s2 = "Are very near";
char * s = (char *)malloc(strlen(s1) + strlen(s2) + 1);
const char * p1 = s1;
const char * p2 = s2;
char * p = s;
while (*p1 && *p2)
{
*p++ = *p1++;
*p++ = *p2++;
}
while (*p1)
{
*p++ = *p1++;
}
while (*p2)
{
*p++ = *p2++;
}
*p = '\0';