Bug with strlen? - c

I'm just getting started with C and I just started trying to figure out
call by reference in functions. I have noticed an odd result in my output
when using strlen() to iterate over a string and modify its contents. In this
example the result of strlen() is 3, not including the null character,
but if I do not explicitly check for the null character (or use less than the
result of strlen() instead of less than or equals) during the for loop then
it gives a bizarre bit character in the output which I ASSUME is because of the null character?
Please help this noob to understand what is happening here.
Code:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
void f_test_s(char s[]);
void f_test_s2(char s[]);
int main(){
char s_test[] = "abc";
f_test_s(s_test);
f_test_s2(s_test);
puts("\nTest complete!");
return 0;
}
void f_test_s(char s[]){
puts("Test #1: ");
printf("string before: %s\n", s);
int len = strlen(s);
printf("strlen() = %d\n", len);
int i=0;
for(i=0;i<=len;i++){
if(s[i] != '\0'){
s[i]++;
}
}
printf("string after: %s\n", s);
}
void f_test_s2(char s[]){
puts("\nTest #2: ");
printf("string before: %s\n", s);
int len = strlen(s);
printf("strlen() = %d\n", len);
int i=0;
for(i=0;i<=len;i++){
s[i]++;
}
printf("string after: %s\n", s);
}
output:
Test #1:
string before: abc
strlen() = 3
string after: bcd
Test #2:
string before: bcd
strlen() = 3
string after: cde
Test complete!
If it matters I am using gcc version 7.3.0 on Ubuntu. I am definitely
not an expert with either C, gcc, or Ubuntu.

This is the problem:
for (i = 0; i <= len; i++) {
s[i]++;
}
It should be:
for (i = 0; i < len; i++) {
s[i]++;
}
s[len] is the null char (0). When you increment it, the terminator is replaced with the value 1, so an array that started as {'a', 'b', 'c', 0} no longer ends in a null char (after one pass of the loop it would be {'b', 'c', 'd', 0x1}). When printf then prints s, it keeps reading characters past the end of the array until it happens to encounter a null char. This is undefined behavior.

Change this:
for (i = 0; i <= len; i++) {
to this:
for (i = 0; i < len; i++) {
since strlen() returns the length of the string. A C string is as long as the number of characters between the beginning of the string and the terminating null character (without including the terminating null character itself).
Your code invokes Undefined Behavior (UB): the loop overwrites the terminator at s[len], so printf() reads past the end of the array. Standard string functions (like printf() with %s) depend on the terminating null character to mark the end of the string. Without it, they do not know when to stop.
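A minimal sketch of the corrected loop, assuming a hypothetical helper name shift_chars (it is not from the question). It walks indices 0 to len - 1 only, so the terminator at s[len] is never touched:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: increments every character of s in place.
 * The loop stops before s[len], so the terminating '\0' survives. */
void shift_chars(char s[])
{
    size_t len = strlen(s);          /* counts characters, not the '\0' */
    for (size_t i = 0; i < len; i++) {
        s[i]++;
    }
}
```

Called on char s[] = "abc", this leaves "bcd" in the array with s[3] still equal to '\0', so printf("%s", s) stops where it should.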

Related

I tried reversing a string in C without using <string.h> functions, it didn't work

I was trying to reverse a string in C, which seemed fairly easy at first, but I keep running into a weird problem and I don't understand where it comes from.
The string c3 keeps showing more characters than it should.
#include <stdio.h>
#include <stdlib.h>
int main()
{
char c1[10];
char c3[10];
int i,j,l;
printf("donner la chaine a inverser\n");
fflush(stdin);
gets(c1);
for(i = 0; c1[i] != '\0'; i++)
{
}
l = i;
j = 0;
for(i = l-1; i >= 0; i--)
{
printf("%d%d\n", i, j);
c3[j] = c1[i];
j++;
}
printf("%s", c3);
return 0;
}
I'm not really sure but c3 should only have the number of characters that c1 does but it shows that it contains more in printf("%s", c3);.
I am still new to strings in c so I probably missed something really obvious.
The answer is quite simple. Let's say your string is abcdef. In c3, you will put fedcba, where a is at index 5.
What will be at index 6? The answer is "no one knows": c3 was never initialized, so its contents are indeterminate. That's why you have garbage after your string.
In C, a string is a char array that is null terminated, meaning the character '\0' (or simply the value 0, not '0') follows the last character.
The simple way of solving your problem is to initialize c3 to 0.
char c3[10] = {0};
This way, your array will be filled with null characters.
You did not set a null terminator at c3[j] after the loop. printf() will keep reading from c3 until it finds a null byte. Since c3 is a local object that is not initialized, printf may read and output extra characters, as you experience, and potentially read beyond the end of the array, which has undefined behavior.
Note also that you should not use gets() as you cannot tell this obsolete C library function the size of the destination array.
Here is a modified version:
#include <stdio.h>
int main() {
char c1[80];
char c3[80];
int i, j, len;
printf("donner la chaine a inverser:\n");
if (!fgets(c1, sizeof c1, stdin))
return 1;
for (len = 0; c1[len] != '\0' && c1[len] != '\n'; len++)
continue;
for (j = 0, i = len; i-- > 0; j++) {
c3[j] = c1[i];
}
c3[j] = '\0'; // set the null terminator
printf("%s\n", c3);
return 0;
}
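The same idea can be packaged as a function. This is only a sketch under assumptions: the name reverse_copy and its return convention are made up for illustration, and it counts the length by hand because the question avoids <string.h>. The key points are the size check and the explicit terminator.

```c
#include <stddef.h>

/* Hypothetical helper: writes the reverse of src into dst.
 * Returns 0 on success, -1 if dst (dst_size bytes) is too small. */
int reverse_copy(char *dst, size_t dst_size, const char *src)
{
    size_t len = 0;
    while (src[len] != '\0') {       /* manual length count, no strlen */
        len++;
    }
    if (len + 1 > dst_size) {        /* need room for len chars plus '\0' */
        return -1;
    }
    for (size_t i = 0; i < len; i++) {
        dst[i] = src[len - 1 - i];
    }
    dst[len] = '\0';                 /* terminate explicitly */
    return 0;
}
```

With char out[10], reverse_copy(out, sizeof out, "abc") stores "cba" and returns 0; a destination that cannot hold the result is rejected instead of overflowed.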

I mixed up two programs in the cs50 sandbox in c?

I mixed up two programs in the cs50 sandbox: one was to find the number of characters in an array and the other was to print these characters. I know the program is garbage, but could anyone explain to me what the compiler is doing here?
When I ran this, the output starts printing alphanumeric text and never stops. Thanks
#include <cs50.h>
#include <stdio.h>
#include <string.h>
int main(void)
{
string s = get_string("Name: ");
int n = 0;
while (strlen(s) != '\0')
{
n++;
printf("%c", n);
}
}
You have multiple problems with the code you show, here's a couple of them:
strlen(s) will never be zero as you never modify or remove characters from the string, which means you have an infinite loop
n is an integer and not a character so should be printed with the %d format specifier
'\0' is (semantically) a character, representing the string terminator, it's not (semantically) the value 0
To fix the first problem I suspect you want to iterate over every character in the string? Then that could be done with e.g.
for (int i = 0; i < strlen(s); ++i)
{
printf("Current character is '%c'\n", s[i]);
}
But if all you want is to count the number of characters in the string, then that's what strlen already gives you:
printf("The number of characters in the string is %zu\n", strlen(s));
If you want to count the length of the string without using strlen then you need to modify the loop to loop until you hit the terminator:
for (n = 0; s[n] != '\0'; ++n)
{
// Empty
}
// Here the value of n is the number of characters in the string s
All of this should be easy to figure out by reading any decent beginners book.
while (strlen(s) != '\0') is wrong. '\0' equals 0, and the string length never changes inside the loop, so for any non-empty input the loop keeps going forever, printing integers interpreted as characters.
You can either use the indexes to go through the string characters by using the variable "n" or you can increment the pointer of the string that you have received from the standard input to go through all of its characters.
#include <cs50.h>
#include <stdio.h>
#include <string.h>
int main(void)
{
string s = get_string("Name: ");
/* First way using n to iterate */
int n = 0;
for (n = 0; n < strlen(s); ++n)
{
printf("%c", s[n]);
}
printf("\n");
/* Second way: increment the string pointer */
while (strlen(s) != '\0')
{
printf("%c", *s); // print the character s currently points to
s++; // advance s to the next character
}
printf("\n");
return 0;
}
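The second loop above calls strlen on every pass just to learn whether the current character is the terminator. A sketch that tests the character directly, using plain C rather than the cs50 library; the name print_chars is hypothetical:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: prints s one character at a time and
 * returns how many characters were printed. */
size_t print_chars(const char *s)
{
    size_t n = 0;
    for (const char *p = s; *p != '\0'; p++) {   /* stop at the terminator */
        putchar(*p);
        n++;
    }
    putchar('\n');
    return n;
}
```

Walking a separate pointer p also keeps s itself unchanged, so the caller can still use the original string afterwards.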

How to change individual characters in string on C?

Trying to make some basic hangman code to practice learning C but I can't seem to change individual characters in the program
int main(int argc, char *argv[]) {
int length, a, x;
x = 0;
char gWord[100];
char word[100] = "horse";
length = strlen(word) - 1;
for(a = 0; a<= length; a = a + 1){
gWord[a] = "_";
printf("%s", gWord[a]);
}
printf("%s", gWord);
}
When I try to run this it just prints (null) every time it goes through the loop. It's probably a basic fix, but I'm new to C and can't find anything about it online.
To print character instead of string change:
printf("%s", gWord[a]);
to:
printf("%c", gWord[a]);
but before that change also:
gWord[a] = "_";
to:
gWord[a] = '_';
The last problem is that you were assigning a string literal to a single character.
Edit:
Also as @4386427 pointed out, you never zero-terminate gWord before printing it later on with printf("%s", gWord). You should change the last line from:
printf("%s", gWord);
to:
gWord[a] = '\0';
printf("%s", gWord);
because otherwise printf would keep reading past the characters you wrote, into uninitialized memory, which is undefined behavior.
This line
printf("%s", gWord[a]);
Must be
printf("%c", gWord[a]);
To print a char, c is the right specifier. s is only for whole strings and takes char pointers.
Are you getting any warning messages when compiling your code?
Since you are learning C, one suggestion: never ignore a warning message given by the compiler; they are there for a reason.
Three problems in your code:
First:
Assigning string to a character:
gWord[a] = "_";
gWord is an array of 100 characters and gWord[a] is a character at location a of gWord array. Instead, you should do
gWord[a] = '_';
Second:
Using wrong format specifier for printing a character:
printf("%s", gWord[a]);
^^
For printing a character you should use %c format specifier:
printf("%c", gWord[a]);
Third:
Missed adding null terminating character at the end in gWord and printing it:
printf("%s", gWord);
In C language, strings are actually one-dimensional array of characters terminated by a null character '\0'. The %s format specifier is used for character string and by default characters are printed until the ending null character is encountered. So, you should make sure to add '\0' at the end of gWord after the for loop finishes:
gWord[a] = '\0';
Apart from these, there are couple of more things -
This statement:
length = strlen(word) - 1;
I do not see any reason for subtracting 1 from the word length. strlen returns the length of the string without including the terminating null character, so strlen(word) will give 5. Subtracting 1 from this and then running the loop while a <= length may confuse the reader of the code. You should simply do:
length = strlen(word);
for(a = 0; a < length; a = a + 1){
....
....
Also, the return type of strlen() is size_t, which is an unsigned type. So, you should use a variable of type size_t to receive strlen()'s return value.
Last but not least, make sure not to have any unused variables/parameters in your program. You are not using argc and argv anywhere in your program. If you are using the gcc compiler, compile with the -Wall -Wextra options and it will report unused variables and parameters. So, if you are not using argc and argv, simply give void in the parameter list of main().
Putting these all together, you can do:
#include <stdio.h>
#include <string.h>
int main(void) {
size_t length, a;
char gWord[100];
char word[100] = "horse";
length = strlen(word);
for(a = 0; a < length; a = a + 1) {
gWord[a] = '_';
printf("%c", gWord[a]);
}
gWord[a] = '\0';
printf ("\n");
printf("%s\n", gWord);
return 0;
}
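Since the goal is hangman, the next step is usually to overwrite individual underscores with a guessed letter. A minimal sketch, assuming two hypothetical helpers (mask_word and reveal_letter are names invented here, not part of the question):

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: fills mask with one '_' per character of word
 * and terminates it. mask must have room for strlen(word) + 1 bytes. */
void mask_word(char *mask, const char *word)
{
    size_t len = strlen(word);
    for (size_t i = 0; i < len; i++) {
        mask[i] = '_';               /* single quotes: a char, not a string */
    }
    mask[len] = '\0';
}

/* Hypothetical helper: copies every occurrence of guess from word into
 * mask and returns the number of positions revealed. */
size_t reveal_letter(char *mask, const char *word, char guess)
{
    size_t hits = 0;
    for (size_t i = 0; word[i] != '\0'; i++) {
        if (word[i] == guess) {
            mask[i] = guess;         /* change one character in place */
            hits++;
        }
    }
    return hits;
}
```

For the word "horse", mask_word produces "_____", and reveal_letter with 'o' turns it into "_o___" and returns 1. Because only characters inside the existing string are replaced, the terminator written by mask_word stays where it is.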
Syntactically, your code is alright @Blookey. But logically, not quite.
Let me point out 3 places which are causing the undesired behavior in your code:
gWord[a] = "_"; Observe this line. You have put the _ in double quotes. Each individual element of a string is a character, and a character literal is written in single quotes (' '), not double quotes.
printf("%s", gWord[a]); A similar error again. gWord[a] is a character, not a string. Hence you need to print it using the format specifier %c instead of %s, which is for strings.
A string, any string, is supposed to end with a null character, which is '\0' (backslash zero). That is what differentiates a string from a plain array of characters. So just add the following line once you finish loading characters into gWord[].
gWord[a] = '\0';
Here is the complete code, just with the 3 changes:
#include<stdio.h>
#include<string.h>
int main(int argc, char *argv[]) {
int length, a, x;
x = 0;
char gWord[100];
char word[100] = "horse";
length = strlen(word) - 1;
for(a = 0; a<= length; a = a + 1){
gWord[a] = '_';
printf("%c ", gWord[a]);
}
gWord[a] = '\0';
printf("\n%s", gWord);
}
Here is the OUTPUT:
_ _ _ _ _
_____
#include <stdio.h>
#include <stdlib.h>
#include<string.h>
int main(int argc, char *argv[]) {
int length, a, x;
x = 0;
char gWord[100] = {0};
char word[100] = "horse"; /* the initializer zero-fills the rest of the array, so word is null terminated and strlen is safe to call */
length = strlen(word) - 1; /* strlen returns 5: the number of characters before the first terminator '\0' */
for(a = 0; a<= length; a = a + 1){
gWord[a] = '_'; /* "_" is a string literal (the chars '_' and '\0'), which does not fit in a single char slot */
printf("%c", gWord[a]); /* %s expects a string, while here there are single chars to print */
}
printf("\n"); /* you can print the whole string like this */
printf("%s", gWord);
return 0; /* and remember that main should return a value */
}

Find the length of string in c with just one line of code without using strlen() function?

I want to find if there is any way to find the length of any string in C.
Here's how I did:
#include <stdio.h>
int main()
{
char s[10] = "hello";
int i , len = 0;
for(i = 0; s[i] != '\0'; i++)
{
len++;
}
printf("length of string is: %d" , len);
return 0;
}
I want to find, if there is any way to get the length of string in just one line of code.
You can just simply do this:
for(len = 0; s[len] != '\0'; len++);
So in just one line of code you will get the length of string stored in len.
You can remove s[len] != '\0'; comparison to make it shorter:
for(len=0;s[len];len++);
You can call the strlen() function to get the length of the string in one line.
It returns a size_t value: size_t strlen(const char *s)
just do this:
for(len=0;s[len];len++);
this will store the length in len
If you want only one line, then something like this is also possible:
while (s[len] != '\0') len++;
Which is just another way of doing it, but not most pleasing to look at.
The most minimal version:
#include <stdio.h>
int main(void)
{
char s[10] = "hello", *p = s;
while(*p++); /* "counting" here; p ends one past the terminator */
printf("The length of string '%s' is: %td" , s, p - s - 1);
}
It prints:
The length of string 'hello' is: 5
I believe, this should be the shortest version:
for(l=-1;s[++l];);
It's 18 bytes, so it's quite a good code-golf answer. However, I would prefer the more canonical
for(len = 0; s[len]; len++) ;
in real code.
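For use in more than one place, the canonical loop can be wrapped in a function. A minimal sketch; the name my_strlen is hypothetical, and the counter is a size_t to match what strlen itself returns:

```c
#include <stddef.h>

/* Hypothetical helper: counts characters up to, not including, '\0'. */
size_t my_strlen(const char *s)
{
    size_t len;
    for (len = 0; s[len] != '\0'; len++)
        ;                            /* empty body: the loop header does the counting */
    return len;
}
```

For char s[10] = "hello" this returns 5, regardless of the array's declared size of 10.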

Iterating over string/strlen with umlauted characters

This is a follow-up to my previous question. I succeeded in implementing the algorithm for checking umlauted characters. The next problem comes from iterating over all characters in a string. I do this like so:
int main()
{
char* str = "Hej du kalleåäö";
printf("length of str: %d", strlen(str));
for (int i = 0; i < strlen(str); i++)
{
printf("%s ", to_morse(str[i]));
}
putchar('\n');
return 0;
}
The problem is that, because of the umlauted characters, it prints 18 instead of 15, and it also makes the to_morse function fail (ignoring these characters). The to_morse function accepts an unsigned char as a parameter. What would be the best way to solve this? I know I can check for the umlaut character here instead of in the letterNr function, but I don't know if that would be a pretty/logical solution.
Normally, you'd store the string as an array of wchar_t and use something like wcslen to get the length of it - that would give you the number of characters as opposed to the number of bytes you stored.
You really shouldn't be implementing UTF or Unicode or whatever multibyte character handling yourself - there are libraries for that sort of thing.
On OS X, Cocoa is a solution - note the use of "%C" in NSLog - that's an unichar (16-bit Unicode character):
#import <Cocoa/Cocoa.h>
int main()
{
NSAutoreleasePool * pool = [NSAutoreleasePool new];
NSString * input = @"Hej du kalleåäö";
printf("length of str: %d", [input length]);
int i=0;
for (i = 0; i < [input length]; i++)
{
NSLog(@"%C", [input characterAtIndex:i]);
}
[pool release];
}
You could do something like
for (int i = 0; str[i]!='\0'; ++i){
//do something with str[i]
}
Strings in C are terminated with '\0'. So it is possible to check for the end of the string like that.
EDIT: What locale are you using?
If you are going to iterate over a string, don't bother with getting its length with strlen. Just iterate until you see a NUL character:
char *p = str;
while(*p != '\0') {
printf("%c\n", *p);
++p;
}
As for the umlauted characters and such, are they UTF-8? If the string is multi-byte, you could do something like this:
size_t n = strlen(str);
char *p = str;
char *e = p + n;
while(*p != '\0') {
wchar_t wc;
int l = mbtowc(&wc, p, e - p);
if(l <= 0) break;
p += l;
/* do whatever with wc which is now in wchar_t form */
}
I honestly don't know if mbtowc will simply return -1 if it encounters a NUL in the middle of a MB character. If it does, you could just pass MB_CUR_MAX instead of e - p and do away with the strlen call. But I have a feeling this is not the case.
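If the string is known to be UTF-8, the character count can also be obtained without any locale setup. This is a different technique from mbtowc: a sketch that simply skips UTF-8 continuation bytes (those of the form 10xxxxxx). It assumes valid UTF-8 input and counts code points, so a letter written as a base character plus a combining mark would count as two. The name utf8_count is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: counts code points in a UTF-8 string by skipping
 * continuation bytes, which always look like 10xxxxxx. Assumes valid UTF-8. */
size_t utf8_count(const char *s)
{
    size_t count = 0;
    for (const unsigned char *p = (const unsigned char *)s; *p != '\0'; p++) {
        if ((*p & 0xC0) != 0x80) {   /* not a continuation byte: a new character starts */
            count++;
        }
    }
    return count;
}
```

For "Hej du kalleåäö" encoded as UTF-8, strlen reports 18 bytes while utf8_count reports 15, because å, ä and ö each occupy two bytes.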
