Converting an unknown length string to lower case issues - c

So I'm not very good with C but I'm designing a GLUT application that reads in a file that is not case sensitive. To make it easier I want to convert my strings to all lower case. I made a function makeLower that is modifying a variable that was passed in by reference.
I have a While Loop in the method makeLower that seems to get through part of the first iteration of the while loop and then the EXE crashes. Any tips would be great, thanks!
Output:
C:\Users\Mark\Documents\Visual Studio 2010\Projects\Project 1\Debug>"Project 1.e
xe" ez.txt
Line is #draw a diamond ring
Character is #
Then error "project 1.exe has stopped working"
Code:
void makeLower(char *input[]){
int i = 0;
printf("Line is %s\n", *input);
while(input[i] != "\0"){
printf("Character is %c\n", *input[i]);
if(*input[i] >= 'A' && *input[i] <= 'Z'){
*input[i] = tolower(*input[i]);
}
i++;
}
}
int main(int argc, char *argv[]) {
FILE *file = fopen(argv[1], "r");
char linebyline [50], *lineStr = linebyline;
char test;
glutInit(&argc, argv);
while(!feof(file) && file != NULL){
fgets(lineStr , 100, file);
makeLower(&lineStr);
printf("%s",lineStr);
//directFile();
}
fclose(file);
glutMainLoop();
}

I see more problems now, so I extend my comments to an answer:
You allocate an array of 50 characters, but tell fgets to get up to 100 characters, which might be fatal as fgets will overwrite memory not in the string.
When passing a C string to a function, you don't have to pass the address of the pointer to the string (&lineStr); the actual pointer or array is okay. This means you can change the makeLower function to void makeLower(char *input) or void makeLower(char input[]). Right now the argument to makeLower is declared as an array of char pointers, not a pointer to an array of char.
In the new makeLower I proposed above, you can access single characters either as an array (input[i]) or as a pointer plus offset (*(input + i)). Like I said in my comment, the last version is what the compiler will probably create if you use the first. But the first is more readable so I suggest that.
Also in makeLower you make a comparison with "\0", which is a string and not a character. This is almost right actually: You should use input[i] != '\0'.
And finally this is how I would implement it:
void makeLower(char *input)
{
while (*input != '\0') /* "while (*input)" would also work */
{
*input = tolower((unsigned char)*input); /* cast avoids undefined behavior for negative char values */
input++;
}
}
A few explanations about the function:
All char arrays can be converted to a char pointer, but not the other way around. Passing char pointer is the most common way to pass a string actually, as you can see from all standard functions that accepts strings (like strlen or strcpy.)
The expression *input dereferences the pointer (i.e. takes the value of what the pointer points to). It is the same as *(input + 0) and so gets the value of the first character in the string.
While the first character in the string is not '\0' (which technically is a normal zero) we will loop.
Get the first character of the string and pass it to the tolower function. This will work no matter what the character is: tolower will only turn upper case characters to lower case, and all other characters will be returned as they already were.
The result of tolower is copied over the first character. This works because the right hand side of an assignment must be evaluated before the assignment, so there will not be any error or problem.
Last we increase the pointer by one. This will make input point to the next character in the string. This works because input is a local variable, so operations on the pointer will not affect anything in the calling function.
This function can now be called like this:
char input[100];
fgets(input, sizeof(input), stdin);
printf("before: \"%s\"\n", input);
makeLower(input);
printf("after : \"%s\"\n", input);
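To tie the points above together, here is a rough sketch of how the read loop from the question could look once the buffer size and the argument type are fixed. The names make_lower and lower_lines are made up for this example, and the 100-byte buffer is just an assumption; lines longer than that are simply handled in several fgets() reads.

```c
#include <ctype.h>
#include <stdio.h>

/* Lower-case a NUL-terminated string in place. */
void make_lower(char *s)
{
    for (; *s != '\0'; s++)
        *s = (char)tolower((unsigned char)*s);
}

/* Hypothetical helper: copy `in` to `out`, lower-casing as it goes.
   Returns the number of fgets() reads that succeeded. */
size_t lower_lines(FILE *in, FILE *out)
{
    char line[100];
    size_t count = 0;

    /* fgets() returns NULL at end of file or on error,
       so no separate feof() check is needed. */
    while (fgets(line, sizeof line, in) != NULL) {
        make_lower(line);
        fputs(line, out);
        count++;
    }
    return count;
}
```

In main you would check that fopen() did not return NULL before calling lower_lines(file, stdout). Passing sizeof line to fgets() keeps the size in one place, so the buffer and the limit cannot drift apart the way 50 and 100 did.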

Did you try while(*input[i] != "\0") instead of what you have ? For some reason you seem to pass to your function a pointer to pointer to char (*input[]) and &lineStr so it would make sense to dereference twice when you check for string terminator character "\0"....
Just a thought, hope it helps

I think the problem is that you can't be sure the string contains '\0' where you expect it. So you may be going out of bounds, which is quite likely since you don't know the length of the string.

As far as I understand things, it's fine to pass '\0' to tolower(). It's a valid unsigned char value, and tolower() simply returns the input character if it is not able to do any conversion.
Thus, the loop can be succinctly put as:
while(input[i] = tolower(input[i]))
++i;
This does one more call to tolower(), but it's shorter and (imo) quite clear. Just wanted to mention it as an alternative.

Related

Reverse order in array of char in C

The pointers confuse me a lot. I have a function that takes as arguments argc (the number of argument that are strings + 1 that is the name of the code) and char* argv[], that, if I understood well, is an array of pointers. Now as result I need to print on each line the argument (string) and the reversed string. This means that if I pass as arguments hello world, I need to have:
hello olleh
world dlrow
I tried to implement a part of the code:
int main(int argc, char *argv[])
{
int i = 1;
int j;
while (i < argc)
{
while (argv[i][j] != 0) {
printf("%c", argv[i][j]);
j++;
}
i++;
}
return 0;
}
}
And now I'm literally lost. The inner loop doesn't work. I know that argv[i] pass through all my arguments strings, but I need obviously to enter in the strings (array of chars), to swap the pointers. Also I don't understand why the difference between argv[0] and *argv, because in theory argv[0] print the first element of the array that is a pointer, so an address, but instead of this it prints the first argument.
char* argv[] is an "array of pointers to a character". It's important to learn how to read types in C, because those types will enable or thwart your ability to do stuff.
Types are read right to left. [] is a type of array with an unspecified number of elements, * is a pointer type, and char is a base type. Combine these, and you now have "array of pointers to a character".
So to get something out of argv, you would first specify which element it is in the array. argv[0] would specify the first element. Let's look at what is returned. Since the array is not part of the result, the answer is "a pointer to a character"
char* line = argv[0];
would capture the pointer to a character, which is stored in argv[0].
In C a char* or a "pointer to a character" is the convention used for a string. C doesn't have a new type for "string"; rather it uses pointers to characters, where advancing the pointer eventually runs into the \0 character that signals the string's end.
int main(int argc, char* argv[]) {
int index = 0;
while (index < argc) { /* argv[argc] is NULL, so stop before it */
char* line = argv[index];
printf("%d, %s\n", index, line);
index++;
}
}
should dump the arguments passed to your program. From there, I imagine you can handle the rest.
Notice that I never converted the pointer to an array. While arrays can be used as pointers if you never specify the index of the array, in general pointers can't be used as arrays unless you rely on information that isn't in the type system (like you clearly grabbed the pointer from an array elsewhere).
Good luck!
---- Updated to address the question "How do I reverse them?" ----
Now that you have a simple char* (pointer to a character) how does one reverse a string?
Remember a string is a pointer to a character, where the next characters eventually end with a \0 character. First we will need to find the end of the string.
char* some_string = ...
char* position = some_string;
while (*position != 0) {
position++;
}
// position now points at the terminating \0; walk backwards
while (position != some_string) {
position--;
printf("%c", *position);
}
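If you would rather have the reversed string in memory than print it character by character, a small in-place reversal is one option. This is only a sketch; reverse_in_place is a name I made up, and it assumes the string is writable (argv strings are, string literals are not).

```c
#include <string.h>

/* Reverse a NUL-terminated string in place by swapping
   characters from both ends until the pointers meet. */
void reverse_in_place(char *s)
{
    char *end;

    if (*s == '\0')
        return;                 /* nothing to do for an empty string */

    end = s + strlen(s) - 1;    /* last character before the terminator */
    while (s < end) {
        char tmp = *s;
        *s++ = *end;
        *end-- = tmp;
    }
}
```

For the original task you could loop i from 1 to argc - 1, print argv[i] followed by a space, call reverse_in_place(argv[i]), and then print argv[i] again followed by a newline.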

Usage of pointers as parameters in the strcpy function. Trying to understand code from book

From my book:
void strcpy (char *s, char *t)
{
int i=0;
while ((s[i] = t[i]) != '\0')
++i;
}
I'm trying to understand this snippet of code from my textbook. They give no main function so I'm trying to wrap my head around how the parameters would be used in a call to the function. As I understand it, the "i-number" of characters of string t[ ] are being copied to the string s[ ] until there are no longer characters to read, from the \0 escape sequence. I don't really understand how the parameters would be defined outside of the function. Any help is greatly appreciated. Thank you.
Two things to remember here:
Strings in C are arrays of chars
Arrays are passed to functions as pointers
So you would call this like so:
char destination[16];
char source[] = "Hello world!";
strcpy(destination, source);
printf("%s", destination);
i is just an internal variable, it has no meaning outside the strcpy function (it's not a parameter or anything). This function copies the entire string t to s, and stops when it sees a \0 character (which marks the end of a string by C convention).
EDIT: Also, strcpy is a standard library function, so weird things might happen if you try to redefine it. Give your copy a new name and all will be well.
Here's a main for you:
int main()
{
char buf[30];
strcpy(buf, "Hi!");
puts(buf);
strcpy(buf, "Hello there.");
puts(buf);
}
The point of s and t is to accept character arrays that exist elsewhere in the program. They are defined elsewhere, at this level usually by the immediate caller or one caller above. Inside the function, s and t simply stand for whatever pointers the caller passed in at runtime.
You get compile problems because your book is wrong. It should read
char *strcpy (char *s, const char *t)
{
...
return s;
}
Here const means the function will not modify the string that t points to. Because strcpy is a standard function, you really do need the declaration to be correct.
Here is how you might use the function (note you should change the function name as it will conflict with the standard library)
void my_strcpy (char *s, char *t)
{
int i=0;
while ((s[i] = t[i]) != '\0')
++i;
}
int main()
{
char *dataToCopy = "This is the data to copy";
char buffer[81]; // This buffer should be at least big enough to hold the data from the
// source string (dataToCopy) plus 1 for the null terminator
// call your strcpy function
my_strcpy(buffer, dataToCopy);
printf("%s", buffer);
}
In the code, the i variable indexes into the character arrays. So when i is 0 you are referring to the first character of s and t. s[i] = t[i] copies the i'th character from t to the i'th character of s. This assignment in C is itself an expression and yields the character that was copied, which allows you to compare it to the null terminator, i.e. (s[i] = t[i]) != '\0'. The null terminator indicates the end of the string: if the copied character is not a null terminator the loop continues, otherwise it ends.
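The same "assignment is an expression" idea also works without an index variable. Below is a sketch of a pointer-based variant; copy_string is a made-up name (chosen so it does not clash with the standard strcpy), and returning the destination pointer just mirrors what the standard function does.

```c
/* Copy the string src (including its terminating '\0') into dst.
   dst must be large enough; returns the start of dst. */
char *copy_string(char *dst, const char *src)
{
    char *start = dst;

    /* Copy one character, advance both pointers, and stop
       right after the '\0' itself has been copied. */
    while ((*dst++ = *src++) != '\0')
        ;

    return start;
}
```

Because dst and src are local copies of the caller's pointers, advancing them inside the function does not change anything in the caller.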

char* or char ?. I don't understand about the declaration here

I wrote a function that cuts all the left space of an inputted string. These two functions give the same output "haha" when input is " haha".
My question are:
1) Why the 1st one need return but the 2nd one doesn't. I added "return s" and it made a syntax error.
2) Are there any different in these if I use it in another situation?
3) Many said that 2nd one return a character not a string, how about my output ?
char *LTrim(char s[])
{
int i=0;
while(s[i]==' ')i++;
if (i>0) strcpy(&s[0],&s[i]);
return s;
}
and
char LTrim(char s[])
{
int i=0;
while(s[i]==' ')i++;
if (i>0) strcpy(&s[0],&s[i]);
}
This is my main():
int main()
{
char s[100];
printf("input string ");
gets(s);
LTrim(s);
puts(s);
return 0;
}
Your second code segment doesn't seem to have a return statement, please correct that for getting the correct answer.
The first function is returning a character pointer, which will be memory pointing to the starting location of your character array s, whereas the second function is returning a single character.
What you do with the values returned is what will make the difference, both the codes seem to be performing the same operation on the character array(string) passed to them, so if you are only looking at the initial and final string, it will be same.
On the other hand, if you actually use the returned value for some purpose, then you will get a different result for both functions.
char *LTrim(char s[]){} is a function that takes a character array / string and returns a character pointer, i.e. a reference / memory address.
char LTrim(char s[]), on the other hand, is a function that takes a character array / string and returns a single character only.
char is a single character.
char * is a pointer to a char.
A char * is mostly used to point to the first character of a string (like s in your example).
In the first example you return your modified s variable, and in the second you return nothing, so it's best to change the return type to void instead of char.
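One more thing worth knowing about both versions: strcpy is called with a source and destination that overlap, and the C standard leaves that undefined. A sketch that keeps the first signature but uses memmove (which is allowed to overlap) could look like this; the lower-case name ltrim is only there to tell it apart from the original.

```c
#include <string.h>

/* Remove leading spaces from s in place and return s. */
char *ltrim(char *s)
{
    size_t i = 0;

    while (s[i] == ' ')
        i++;

    /* memmove handles overlapping regions; the + 1 copies the '\0' too */
    if (i > 0)
        memmove(s, s + i, strlen(s + i) + 1);

    return s;
}
```

Returning s is optional here since the array is modified in place, but it lets you write things like puts(ltrim(s)).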

C String parsing errors with strtok(),strcasecmp()

So I'm new to C and the whole string manipulation thing, but I can't seem to get strtok() to work. It seems everywhere everyone has the same template for strtok being:
char* tok = strtok(source,delim);
do
{
{code}
tok=strtok(NULL,delim);
}while(tok!=NULL);
So I try to do this with the delimiter being the space key, and it seems that strtok() not only returns NULL after the first run (the first entry into the while/do-while) no matter how big the string, but it also seems to wreck the source, turning the source string into the same thing as tok.
Here is a snippet of my code:
char* str;
scanf("%ms",&str);
char* copy = malloc(sizeof(str));
strcpy(copy,str);
char* tok = strtok(copy," ");
if(strcasecmp(tok,"insert"))
{
printf(str);
printf(copy);
printf(tok);
}
Then, here is some output for the input "insert a b c d e f g"
aaabbbcccdddeeefffggg
"Insert" seems to disappear completely, which I think is the fault of strcasecmp(). Also, I would like to note that I realize strcasecmp() seems to all-lower-case my source string, and I do not mind. Anyhoo, input "insert insert insert" yields absolutely nothing in output. It's as if those functions just eat up the word "insert" no matter how many times it is present. I may* end up just using some of the C functions that read the string char by char but I would like to avoid this if possible. Thanks a million guys, i appreciate the help.
With the second snippet of code you have five problems: The first is that your format for the scanf function is not standard C. The 'm' modifier is a POSIX.1-2008 extension (supported by glibc, for example) that makes scanf allocate the buffer itself, so the code will not work with every C library. Note also that %s, with or without the modifier, stops at the first whitespace, so each call reads only one word.
The second problem is the address-of operator. With %ms, passing &str (a char **) is exactly what the extension expects, but with a plain, portable %s the scanf function wants a pointer to an existing character buffer. Since strings (either in pointer-to-character form or array form) already are pointers, you don't use the address-of operator for them in that case.
The third problem, once you switch to a plain %s, is that the pointer str is uninitialized. You have to remember that uninitialized local variables are truly uninitialized, and their values are indeterminate. In reality, it means that their values will be seemingly random. So str will point to some "random" memory.
The fourth problem is with the malloc call, where you use the sizeof operator on a pointer. This will return the size of the pointer and not the size of what it points to. For a string you want strlen(str) + 1 bytes instead.
The fifth problem is that strcpy then copies the whole string into that allocation (typically 4 or 8 bytes, depending on whether you're on a 32- or 64-bit platform; see the fourth problem), so any string that does not fit overflows the buffer before strtok even runs.
So, five problems in only four lines of code. That's pretty good! ;)
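For the malloc(sizeof(str)) issue in particular, the usual pattern is to size the allocation from the string itself. Here is a minimal sketch; dup_string is a hypothetical helper name, and it does roughly what the POSIX strdup function does.

```c
#include <stdlib.h>
#include <string.h>

/* Return a freshly allocated copy of s, or NULL if malloc fails.
   The caller is responsible for calling free() on the result. */
char *dup_string(const char *s)
{
    size_t size = strlen(s) + 1;   /* + 1 for the terminating '\0' */
    char *copy = malloc(size);

    if (copy != NULL)
        memcpy(copy, s, size);

    return copy;
}
```

The copy can then be handed to strtok, which modifies its argument, while the original string stays intact.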
It looks like you're trying to print space delimited tokens following the word "insert" 3 times. Does this do what you want?
#include <stdio.h>
#include <string.h>
#include <strings.h> /* strcasecmp (POSIX) */
#include <stdlib.h>
int main(int argc, char **argv)
{
char str[BUFSIZ] = {0};
char *copy;
char *tok;
int i;
// safely read a string and chop off any trailing newline
if(fgets(str, sizeof(str), stdin)) {
int n = strlen(str);
if(n && str[n-1] == '\n')
str[n-1] = '\0';
}
// copy the string so we can trash it with strtok
copy = strdup(str);
// look for the first space-delimited token
tok = strtok(copy, " ");
// check that we found a token and that it is equal to "insert"
if(tok && strcasecmp(tok, "insert") == 0) {
// iterate over all remaining space-delimited tokens
while((tok = strtok(NULL, " "))) {
// print the token 3 times
for(i = 0; i < 3; i++) {
fputs(tok, stdout);
}
}
putchar('\n');
}
free(copy);
return 0;
}
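If you want the tokens in an array rather than printed right away, the same strtok loop can be wrapped in a small function. This is just a sketch under my own assumptions: split_words and max_words are names invented for the example, and the only delimiter is a space, as in the question.

```c
#include <stddef.h>
#include <string.h>

/* Split text on spaces, storing pointers to the tokens in words[].
   text is modified by strtok. Returns the number of tokens stored,
   which is at most max_words. */
size_t split_words(char *text, char *words[], size_t max_words)
{
    size_t n = 0;
    char *tok;

    /* Testing tok before using it also covers input with no tokens,
       which the do/while template would pass on as NULL. */
    for (tok = strtok(text, " ");
         tok != NULL && n < max_words;
         tok = strtok(NULL, " "))
        words[n++] = tok;

    return n;
}
```

The pointers in words[] point into text, so they stay valid only as long as that buffer does.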

Custom getLine() function for c

I need a function/method that will take in a char array and set it to a string read from stdin. It needs to return the last character read as its return type, so I can determine if it reached the end of a line or the end of file marker.
here is what I have so far, and I kind of based it off of code from here
UPDATE: I changed it, but now it just crashes upon hitting enter after text. I know this way is inefficient, and char is not the best for EOF check, but for now I am just trying to get it to return the string. I need it to do it in this fashion and no other fashion. I need the string to be the exact length of the line, and to return a value that is either the newline or EOF int which I believe can still be used in a char value.
This program is in C not C++
char getLine(char **line);
int main(int argc, char *argv[])
{
char *line;
char returnVal = 0;
returnVal = getLine(&line);
printf("%s", line);
free(line);
system("pause");
return 0;
}
char getLine(char **line) {
unsigned int lengthAdder = 1, counter = 0, size = 0;
char charRead = 0;
*line = malloc(lengthAdder);
while((charRead = getc(stdin)) != EOF && charRead != '\n')
{
*line[counter++] = charRead;
*line = realloc(*line, counter);
}
*line[counter] = '\0';
return charRead;
}
Thank you for any help in advance!
You're assigning the result of malloc() to a local copy of line, so after the getLine() function returns it's not modified (albeit you think it is). What you have to do is either return it (as opposed to use an output parameter) or pass its address (pass it 'by reference'):
void getLine(char **line)
{
*line = malloc(length);
// etc.
}
and call it like this:
char *line;
getLine(&line);
Your key problem is that line pointer value does not propagate out of the getLine() function. The solution is to pass pointer to the line pointer to the function as a parameter instead - calling it like getLine(&line); while the function would be defined as taking parameter char **line. In the function, on all places where you now work with line, you would work with *line instead, i.e. dereferencing the pointer to a pointer and working with the value of the variable in main() where the pointer leads. Hope this is not too confusing. :-) Try to draw it on a piece of paper.
(A tricky part - you must change line[counter] to (*line)[counter] because you first need to dereference the pointer to the string, and only then to access a specific character in the string.)
There are a couple of other problems with your code:
You use char as the type for charRead. However, the EOF constant cannot be represented using char, you need to use int - both as the type of charRead and return value of getLine(), so that you can actually distinguish between a newline and end of file.
You forgot to return the last char read from your getLine() function. :-)
You are reallocating the buffer after each character addition. This is not terribly efficient and therefore is a rather ugly programming practice. It is not too difficult to use another variable to track the amount of space allocated and then (i) start with allocating a reasonable chunk of memory, e.g. 64 bytes, so that ideally you will never reallocate (ii) enlarge the allocation only if you need to based on comparing the counter and your allocation size tracker. Two reallocation strategies are common - either doubling the size of the allocation or increasing the allocation by a fixed step.
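Putting the points above together, a version with an int return value and a doubling buffer might look roughly like this. Treat it as a sketch: get_line, the FILE * parameter (pass stdin for the original use case), and the 64-byte starting size are my own choices, not something from the question.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read one line from in into a malloc'd buffer stored in *line.
   Returns '\n' or EOF, whichever ended the line. On allocation
   failure *line is set to NULL and EOF is returned. */
int get_line(FILE *in, char **line)
{
    size_t capacity = 64;          /* bytes currently allocated */
    size_t length = 0;             /* characters stored so far */
    char *buf = malloc(capacity);
    int c;

    *line = NULL;
    if (buf == NULL)
        return EOF;

    while ((c = getc(in)) != EOF && c != '\n') {
        if (length + 1 >= capacity) {      /* keep room for the '\0' */
            char *tmp = realloc(buf, capacity * 2);
            if (tmp == NULL) {
                free(buf);                 /* old block is still valid */
                return EOF;
            }
            buf = tmp;
            capacity *= 2;
        }
        buf[length++] = (char)c;
    }

    buf[length] = '\0';
    *line = buf;
    return c;
}
```

If the buffer really must be exactly as long as the line, one final realloc(buf, length + 1) before returning would shrink it. The caller frees *line when done.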
The way you use realloc is not correct. If it returns NULL, you overwrite your only pointer to the memory block, so the block is lost (leaked).
It is better to use realloc in this way:
char *tmp;
...
tmp = realloc(line, counter);
if (tmp == NULL) {
/* handle the error; line still points to the old block */
} else {
line = tmp;
}

Resources