Why is the last character "?" in this string function - c

I have the following string function:
char * to_upper(const char * str) {
char * upper = malloc(strlen(str)+1);
int i;
for (i=0; str[i] != 0; i++)
upper[i] = toupper(str[i]);
upper[i+1] = '\0';
return upper;
}
However, when I call it, it adds a "?" to the end (probably an invalid character). If I change the last line, from upper[i+1] = '\0' to upper[i] = '\0', it works as expected. What is wrong then with code above?
Additionally, is this the right way to allocate for the string?
char * upper = malloc(strlen(str)+1);
Or should I instead do:
char upper[strlen(str)+1];
Update: my error above is because length starts at 1, index starts at 0. How should I initialize the string though?

Your code is fine, you just need to remove the +1 in upper[i+1] as you found out. The for loop ends when str[i] is equal to '\0', so it makes sense that upper[i] should then be set to '\0' as well.
Your string initialization is fine.
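On the allocation question: a local array such as char upper[strlen(str)+1] stops existing when the function returns, so a pointer to it cannot be returned safely, whereas a malloc'd buffer can. Below is a minimal sketch of the corrected function under that assumption. The name to_upper_copy, the NULL check, and the unsigned char cast are additions for illustration and not part of the original code; the caller is assumed to free the result.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical name: returns a newly allocated uppercase copy of str,
 * or NULL if the allocation fails. The caller frees the result. */
char *to_upper_copy(const char *str)
{
    size_t len = strlen(str);
    char *upper = malloc(len + 1);      /* +1 for the terminating '\0' */
    if (upper == NULL)
        return NULL;
    for (size_t i = 0; i < len; i++)
        upper[i] = (char)toupper((unsigned char)str[i]);
    upper[len] = '\0';                  /* index len, not len + 1 */
    return upper;
}
```

A caller would use it as char *u = to_upper_copy("abc"); and call free(u) once the string is no longer needed.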

I fixed the answer, according to the comments:
char * to_upper(const char * str) {
char* upper = malloc(strlen(str) + 1); /* +1 leaves room for the terminating '\0' */
int i;
for (i=0; str[i] != '\0'; i++)
upper[i] = toupper(str[i]);
upper[i] = '\0';
return upper;
}
With the comments I saw the error in my logic and yours. The null terminator is already in str, hence we fall out of the for loop at str[i] == '\0'. So we know i is the index we need to set to '\0' in upper.

Related

A function that changes all lowercase letters of a string to uppercase

I am trying to write a function that changes all lowercase letters of a string to uppercase. Here is my code:
/**
* string_toupper - This function will replace all lowercase letters in
* the string pointed by str to uppercase
* #str: The string that will be checked for lowercase letters
*
* Return: The resulting string str, where all the letters are uppercase
*/
char *string_toupper(char *str)
{
int i;
for (i = 0; *str != '\0'; i++)
{
if (str[i] >= 'a' && str[i] <= 'z')
str[i] -= 32;
}
return (str);
}
And I tried it using:
#include <stdio.h>
int main(void)
{
char str[] = "Hello World!\n";
char *ptr;
ptr = string_toupper(str);
printf("%s\n", ptr);
printf("%s\n", str);
return (0);
}
But I get the following output:
Segmentation fault(core dumped)
My approach --> I will check the string if it has a lowercase letter. Then I will subtract 32 from the character if it matches to a lowercase character. I did this to make the character to uppercase, by subtracting 32 I am able to get the uppercase letter of the corresponding lowercase character I have found in the string.
But I am getting a Segmentation fault error, why is it happening?
Change the for loop condition to for (i = 0; str[i] != '\0'; i++), since it should check the character at every index rather than only the first one.
char *string_toupper(char *str)
{
int i;
for (i = 0; str[i] != '\0'; i++)
{
if (str[i] >= 'a' && str[i] <= 'z')
str[i] =(int)str[i] - 32;
}
return (str);
}
By request, this is offered for education and debate.
Some workplaces or institutes insist on a particular style wrt curly braces, etc. I freelance...
Notice that the function name is not reproduced in a comment block. Bad habit that leads to satisfying supervisors with copy/paste of comment blocks that are WRONG and certainly misleading. Better to let the code explain itself by using conventional idioms and standard libraries.
#include <stdio.h>
#include <ctype.h>
#include <assert.h>
char *string_toupper( char *str ) {
// Uppercase all lowercase letters found in 'str'.
// Return str after processing.
assert( str != NULL ); // Trust no-one, especially yourself
// Alternative for():: for( int i = 0; str[ i ]; i++ )
for( int i = 0; str[ i ] != '\0'; i++ )
str[ i ] = (char)toupper( str[ i ] ); // note casting.
return str;
}
int main( void ) {
char str[] = "Hello World!";
// No further use? Don't store return value; just use it.
printf( "%s\n", string_toupper( str ) );
printf( "%s\n", str );
return 0;
}
OP's key problem is well explained by Prithvish: wrong loop test.
// for (i = 0; *str != '\0'; i++)
for (i = 0; str[i] != '\0'; i++)
To help OP with How can I make my code work on every environment?, some thoughts for later consideration.
Future names
"Function names that begin with str, mem, or wcs and a lowercase letter may be added to the declarations in the <string.h> header." C17dr § 7.31.13
So do not code function names that begin str<lowercase> to avoid future collisions.
Indexing type
int i; is too narrow a type for long lines. Use size_t for array indexing.
Alternatively simply increment the pointer.
Test case with classification is...() functions
str[i] >= 'a' && str[i] <= 'z' is incorrect on systems where [a...z] are not continuous. (Uncommon these days - example EBCDIC).
Simplify with toupper()
To convert any character to its uppercase equivalent:
str[i] = toupper(str[i]);
Use unsigned access
is..(x) and toupper(x) functions need unsigned char character values (or EOF) for x.
On soon to be obsolete rare non-2's complement systems, character string should be accessed as unsigned char to avoid stopping on -0.
Putting this together:
#include <ctype.h>
char *str_toupper(char *str) {
unsigned char *ustr = (unsigned char *) str;
while (*ustr) {
*ustr = toupper(*ustr);
ustr++;
}
return str;
}
There is a major mistake in your code:
the test in for (i = 0; *str != '\0'; i++) in function string_toupper is incorrect: it only tests the first character of str instead of testing for the end of the string. As coded, you keep modifying memory well beyond the end of the string until you reach an area of memory that cannot be read or written, causing a segmentation fault. The code has undefined behavior. You should instead write:
for (i = 0; str[i] != '\0'; i++)
Also note that if (str[i] >= 'a' && str[i] <= 'z') assumes that the lowercase letters form a contiguous block in the execution character set. While it is the case for ASCII, you should not make this assumption in portable code.
Similarly, str[i] -= 32; is specific to the ASCII and related character sets. You should either use str[i] = str[i] - 'a' + 'A'; which is more readable or use the functions from <ctype.h>.
Here is a modified version:
#include <stdio.h>
/**
* string_toupper - This function will replace all lowercase letters in
* the string pointed by str with their uppercase equivalent
* #str: The string that will be checked for lowercase letters
*
* Return: The resulting string str, where all the letters are uppercase
*/
char *string_toupper(char *str) {
for (size_t i = 0; str[i] != '\0'; i++) {
if (str[i] >= 'a' && str[i] <= 'z')
str[i] = str[i] - 'a' + 'A';
}
return str;
}
int main() {
char str[] = "Hello World!\n";
char *ptr;
printf("before: %s\n", str);
ptr = string_toupper(str);
printf("result: %s\n", ptr);
printf(" after: %s\n", str);
return 0;
}
And here is a portable version of string_toupper():
#include <ctype.h>
#include <stddef.h>
char *string_toupper(char *str) {
for (size_t i = 0; str[i] != '\0'; i++) {
if (islower((unsigned char)str[i]))
str[i] = (char)toupper((unsigned char)str[i]);
}
return str;
}

Why would a character array be unchanged after a for-loop?

I have built a function with the goal of taking text that is fed from elsewhere in the program and removing all whitespace and punctuation from it. I'm able to remove whitespace and punctuation, but the changes don't stay after they are made. For instance, I put the character array/string into a for-loop to remove whitespace and verify that the whitespace is removed by printing the current string to the screen. When I send the string through a loop to remove punctuation, though, it acts as though I did not remove whitespace from earlier. This is an example of what I'm talking about:
Example of output to screen
The function that I'm using is here.
//eliminates all punctuation, capital letters, and whitespace in plaintext
char *formatPlainText(char *plainText) {
int length = strlen(plainText);
//turn capital letters into lower case letters
for (int i = 0; i < length; i++)
plainText[i] = tolower(plainText[i]);
//remove whitespace
for (int i = 0; i < length; i++) {
if (plainText[i] == ' ')
plainText[i] = plainText[i++];
printf("%c", plainText[i]);
}
printf("\n\n");
//remove punctuation from text
for (int i = 0; i < length; i++) {
if (ispunct(plainText[i]))
plainText[i] = plainText[i++];
printf("%c", plainText[i]);
}
}
Any help as to why the text is unchanged after it exits the loop would be appreciated.
Those for loops are not necessary. Your function can be modified as follows and I commented where I made those changes:
#include <ctype.h>
char* formatPlainText(char *plainText)
{
char *start = plainText; // remember the start of the string so it can be returned
char *dest = plainText; // dest is the write position for the filtered text
while ( *plainText ) // as long as *plainText is not '\0'
{
int k = tolower((unsigned char)*plainText);
if( !ispunct(k) && k != ' ') // skip ' ' and any punctuation mark
*dest++ = (char)k; // store the lower case character in *dest and increment dest
plainText++;
}
*dest = '\0'; // This is important because the while loop stops before copying the terminator
return start; // dest points at the terminator here, so return the saved start instead
}
From main:
int main( void ){
char str[] = "Practice ????? &&!!! makes ??progress!!!!!";
char * res = formatPlainText(str);
printf("%s \n", str);
}
The code does convert the string to lower case, but the space and punctuation removal phases are broken: plainText[i] = plainText[i++]; has undefined behavior because you use i and modify it elsewhere in the same expression.
Furthermore, you do not return plainText from the function. Depending on how you use the function, this leads to undefined behavior if you store the return value to a pointer and later dereference it.
You can fix the problems by using 2 different index variables for reading and writing to the string when removing characters.
Note too that you should not use a length variable as the string length changes in the second and third phase. Testing for the null terminator is simpler.
Also note that tolower() and ispunct() and other functions from <ctype.h> are only defined for argument values in the range 0..UCHAR_MAX and the special negative value EOF. char arguments must be cast as (unsigned char) to avoid undefined behavior on negative char values on platforms where char is signed by default.
Here is a modified version:
#include <ctype.h>
#include <stdio.h>
//eliminate all punctuation, capital letters, and whitespace in plaintext
char *formatPlainText(char *plainText) {
size_t i, j;
//turn capital letters into lower case letters
for (i = 0; plainText[i] != '\0'; i++) {
plainText[i] = tolower((unsigned char)plainText[i]);
}
printf("lowercase: %s\n", plainText);
//remove whitespace
for (i = j = 0; plainText[i] != '\0'; i++) {
if (plainText[i] != ' ')
plainText[j++] = plainText[i];
}
plainText[j] = '\0';
printf("no white space: %s\n", plainText);
//remove punctuation from text
for (i = j = 0; plainText[i] != '\0'; i++) {
if (!ispunct((unsigned char)plainText[i]))
plainText[j++] = plainText[i];
}
plainText[j] = '\0';
printf("no punctuation: %s\n", plainText);
return plainText;
}
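Since each phase only keeps or drops characters independently of the others, the three passes could probably be folded into one. This is a sketch under that assumption: the name formatPlainTextOnePass is hypothetical, the diagnostic printf calls are left out, and like the code above it treats only ' ' as whitespace.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical one-pass variant: lowercases the text and drops spaces
 * and punctuation in place, using a read index i and a write index j. */
char *formatPlainTextOnePass(char *plainText)
{
    size_t i, j;    /* j never overtakes i, so in-place writing is safe */
    for (i = j = 0; plainText[i] != '\0'; i++) {
        unsigned char c = (unsigned char)plainText[i];
        if (c != ' ' && !ispunct(c))
            plainText[j++] = (char)tolower(c);
    }
    plainText[j] = '\0';    /* terminate the shortened string */
    return plainText;
}
```

The two-index idea is the same as in the three-pass version; only the number of traversals changes.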

printf("%s") returning weird values

I've got a function which does some stuff with strings, however it has to save the original string by copying it into a char array, making it all upper-case and substitute any w/W for V.
char* function(const char* text){
int textLength = strlen(text);
char text_copy[textLength];
for(int i = 0; i < textLength; i++){
if(text[i] == 'W' || text[i] == 'w')
text_copy[i] = 'V';
else
text_copy[i] = toupper(text[i]);
}
return 'a';
}
It doesn't really matter what the function returns, however whenever I try to printf("%s\n", text_copy);, with some strings, it returns this:
belfast: BELFAST
please: PLEASE
aardvark: AARDVARK??
hello world: HELLO VORLD
taxxxiii: TAXXXIII???
swag: SVAG?
Why is it that some strings turn out fine and some don't? Thanks.
You need to null-terminate the copy.
char text_copy[textLength+1];
...
text_copy[textLength]='\0';
Though if you are returning it from your function (that isn't clear) you should be mallocing it instead.
Why is it that some strings turn out fine and some don't?
Pure chance.
You only allocate enough space for the visible characters in the string and not the terminating \0. You are just lucky that for some of the strings a null byte is on the stack just after the character array.
Change your code like so...
int textLength = strlen(text);
char text_copy[textLength + 1]; // << enough space for the strings and \0
for(int i = 0; i < textLength; i++){
if(text[i] == 'W' || text[i] == 'w')
text_copy[i] = 'V';
else
text_copy[i] = toupper(text[i]);
}
text_copy[textLength] = '\0'; // Make sure it is terminated properly.
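If the copy has to be handed back to the caller, as the first answer hints, a local array will not do because it no longer exists after the function returns. Here is a minimal sketch using malloc instead. The name upper_w_to_v is made up for illustration, and the caller is assumed to free the result.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated uppercase copy of text
 * with every w/W replaced by V, or NULL if the allocation fails. */
char *upper_w_to_v(const char *text)
{
    size_t len = strlen(text);
    char *copy = malloc(len + 1);       /* room for the characters and '\0' */
    if (copy == NULL)
        return NULL;
    for (size_t i = 0; i < len; i++) {
        if (text[i] == 'W' || text[i] == 'w')
            copy[i] = 'V';
        else
            copy[i] = (char)toupper((unsigned char)text[i]);
    }
    copy[len] = '\0';                   /* terminate the copy */
    return copy;
}
```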

remove a character from the string which does not come simultaneously in c

for example, given the string str1 = "120jdvj00ncdnv000ndnv0nvd0nvd0" and the character ch = '0', the output should be 12jdvj00ncdnv000ndnvnvdnvd. That is, the 0 is removed only wherever it occurs singly.
this code is not working
#include<stdio.h>
char remove1(char *,char);
int main()
{
char str[100]="1o00trsg50nf0bx0n0nso0000";
char ch='0';
remove1(str,ch);
printf("%s",str);
return 0;
}
char remove1(char* str,char ch)
{
int j,i;
for(i=0,j=0;i<=strlen(str)-1;i++)
{
if(str[i]!=ch)
{
if(str[i+1]==ch)
continue;
else
str[j++]=str[i];
}
}
str[j]='\0';
}
Your code looks for an occurrence of something other than the character to be removed with "if(str[i]!=ch)", then if the next character is the one to be removed it skips (i.e. does not keep the characters it has just seen), otherwise it copies the current character. So if it sees 'a0' and is looking for '0' it will ignore the 'a'.
What you could do is copy all characters other than the one of interest and set a counter to 0 each time you see one of them (for the number of contiguous character of interest you've seen at this point). When you find the one of interest increment that count. Now whenever you find one that is not of interest, you do nothing if the count is 1 (as this is the single character you want to remove), or put that many instances of the interesting character into str if count > 1.
Ensure you deal with the case of the string ending with a contiguous run of the character to be removed, and you should be fine.
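The counting approach described above might look roughly like this. It is a sketch of one possible reading of that description, not the answerer's own code, and the name remove_single is hypothetical. The run counter is flushed both when a different character is seen and when the end of the string is reached, which covers the trailing-run case.

```c
#include <stddef.h>

/* Hypothetical helper: removes ch from str wherever it occurs singly,
 * keeping runs of two or more. Works in place and returns str. */
char *remove_single(char *str, char ch)
{
    char *d = str;      /* write position */
    size_t run = 0;     /* length of the current run of ch */
    for (const char *s = str; ; s++) {
        if (*s == ch && *s != '\0') {
            run++;      /* still inside a run: count it, copy nothing yet */
            continue;
        }
        /* A run (possibly empty) just ended: keep it only if longer than one. */
        if (run > 1) {
            for (size_t k = 0; k < run; k++)
                *d++ = ch;
        }
        run = 0;
        if (*s == '\0')
            break;
        *d++ = *s;      /* ordinary character: copy it */
    }
    *d = '\0';
    return str;
}
```

Because the write pointer never gets ahead of the read pointer, the string can be rewritten in place without a second buffer.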
char *remove1(char* str, char ch){
char *d, *s;
for(d = s = str;*s;++s){
if(*s == ch){
if(s[1] == ch)
while(*s == ch)
*d++=*s++;
else
++s;//skip a ch
if(!*s)break;
}
*d++ = *s;
}
*d = '\0';
return str;
}
The basic copy loop:
for(d = s = str;*s;++s){
*d++ = *s;
}
*d = '\0';
Then add the special handling for ch:
for(d = s = str;*s;++s){
if(*s == ch){
/* if ch occurs more than once in a row, copy the whole run */
/* if it occurs only once, skip it */
}
*d++ = *s;
}
*d = '\0';
Here is the working code
output is : "1o00trsg5nfbxnnso0000"
#include<stdio.h>
#include<string.h>
void remove1(char *,char);
int main()
{
char str[100]="1o00trsg50nf0bx0n0nso0000";
char ch='0';
remove1(str,ch);
printf("%s",str);
return 0;
}
void remove1(char* str,char ch)
{
int j,i;
int len = strlen(str);
/* i < len so that a single ch at the very end is checked too */
for(i = 0;i < len;i++){
if(str[i] == ch){
/* if either the previous or the next character is also ch, continue without removal */
if((str[i+1] == ch) || (i > 0 && str[i-1] == ch))
continue;
/* remove the char by shifting the following chars, including '\0', left */
for(j = i;j < len;j++) {
str[j] = str[j + 1];
}
/* string length is decrementing due to removal of one char */
len--;
}
}
str[len] = '\0';
}

Need help removing empty character in C

This should be relatively simple.
I've got a string/character pointer that looks like this
" 1001"
Notice the space before the 1. How do I remove this space while still retaining the integer after it (not converting to characters or something)?
The simplest answer is:
char *str = " 1001";
char *p = str+1; // this is what you need
If the space is at the beginning of the string, you can also do this:
char *str = " 1001";
char c[5];
sscanf(str,"%s",c);
printf("%s\n",c);
%s skips any whitespace at the beginning of the buffer.
One solution to this is found here: How to remove all occurrences of a given character from string in C?
I recommend removing the empty space character or any character by creating a new string with the characters you want.
You don't seem to be allocating memory so you don't have to worry about letting the old string die.
If it is a character pointer, I believe
char *new = old + 1;
Should do the trick
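When the number of leading spaces is not known in advance, the same pointer idea can be generalized. This is a small sketch and both helper names are hypothetical: the first advances past any leading whitespace without modifying the string, and the second relies on the fact that strtol skips leading whitespace on its own when the goal is the integer value.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical helper: returns a pointer to the first non-whitespace
 * character of s. The string itself is not modified. */
const char *skip_leading_space(const char *s)
{
    while (isspace((unsigned char)*s))
        s++;
    return s;
}

/* Hypothetical helper: strtol ignores leading whitespace by itself,
 * so " 1001" converts directly to the value 1001. */
long parse_padded_long(const char *s)
{
    return strtol(s, NULL, 10);
}
```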
I'm guessing you're reading in a string representation of an integer from stdin and want to get rid of the white space? If you can't use the other tricks above with pointers and actually need to modify the memory, use the following functions.
You can also use sprintf to get the job done.
I'm sure there are more efficient ways to trim the string. Here is just an example.
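#include <string.h>
void trim_front(unsigned char * str)
{
int i = 0;
int index = 0;
int length = (int)strlen((char *)str);
// find the first character that is not whitespace
while(index < length && (str[index] == ' ' || str[index] == '\t' || str[index] == '\n'))
{
index++;
}
// shift the rest of the string to the front
while(index < length)
{
str[i] = str[index];
i++;
index++;
}
str[i] = '\0'; // terminate the shifted string
}
void trim_back(unsigned char * str)
{
int i = (int)strlen((char *)str);
// walk back over trailing whitespace only, so inner spaces are kept
while(i > 0 && (str[i-1] == ' ' || str[i-1] == '\n' || str[i-1] == '\t'))
i--;
str[i] = '\0';
}
// defined last so that trim_front and trim_back are already declared
void trim(unsigned char * str)
{
trim_front(str);
trim_back(str);
}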
