Removing spaces in a string - c

I was trying to remove spaces of a string after scanning it. The program has compiled perfectly but it is not showing any output and the output screen just keeps getting shut down after scanning the string. Is there a logical error or is there some problem with my compiler(I am using a devc++ compiler btw).
Any kind of help would be appreciated
int main()
{
char str1[100];
scanf("%s",&str1);
int len = strlen(str1);
int m;
for (m=0;m<=len;){
if (&str1[m]==" "){
m++;
}
else {
printf("%c",&str1[m]);
}
m++;
}
return 0;
}
Edit : sorry for the error of m=1, I was just checking in my compiler whether that works or not and just happened to paste that code

Your code contains a lot of issues, and the behaviour you describe is very likely not because of a bug in the compiler :-)
Some of the issues:
Use scanf("%s",str1) instead of scanf("%s",&str1). Since str1 is defined as a character array, it automatically decays to a pointer to char as required.
Note that scanf("%s",str1) will never read in any white space because "%s" is defined as skipping leading white spaces and stops reading in when detecting the first white space.
In for (m=1;m<=len;) together with str1[m], note that an array in C uses zero-based indizes, i.e. str1[0]..str1[len-1], such that m <= len exceeds array bounds. Use for (m=0;m<len;) instead.
Expression &str1[m]==" " is correct from a type perspective, but semantically a nonsense. You are comparing the memory address of the mth character with the memory address of a string literal " ". Use str1[m]==' ' instead and note the single quotes denoting a character value rather than double quotes denoting a string literal.
Statement printf("%c",&str1[m]) passes the memory address of a character to printf rather than the expected character value itself. Use printf("%c",str1[m]) instead.
Hope I found everything. Correct these things, turn on compiler warnings, and try to get ahead. In case you face further troubles, don't hesitate to ask again.
Hope it helps a bit and good luck in experiencing C language :-)

There are many issues:
You cannot read a string with spaces using scanf("%s") , use fgets instead (see comments).
scanf("%s", &str1) is wrong anyway, it should be scanf("%s", str1);, str1 being already the address of of the string
for (m = 0; m <= len;) is wrong, it should be for (m = 0; m < len;), because otherwise the last character you will check is the NUL string terminator.
if (&str1[m]==" ") is wrong, you should write if (str1[m]==' '), " " does not denote a space character but a string literal, you need ' ' instead..
printf("%c", &str1[m]); is wrong, you want to print a char so you need str1[m] (without the &).
You should remove both m++ and put that into the for statement: for (m = 1; m < len; m++), that makes the code clearer.
And possibly a few more problems.
And BTW your attempt doesn't remove the spaces from the string, it merely displays the string skipping spaces.

There are a number of smaller errors here that are adding up.
First, check the bounds on your for loop. You're iterating from index 1 to index strlen(str1), inclusive. That's a reasonable thing to try, but remember that in C, string indices start from 0 and go up to strlen(str1), inclusive. How might you adjust the loop to handle this?
Second, take a look at this line:
if (&str1[m] == " ") {
Here, you're attempting to check whether the given character is a space character. However, this doesn't do what you think it does. The left-hand side of this expression, &str1[m], is the memory address of the mth character of the string str1. That should make you pause for a second, since if you want to compare the contents of memory at a given location, you shouldn't be looking at the address of that memory location. In fact, the true meaning of this line of code is "if the address of character m in the array is equal to the address of a string literal containing the empty string, then ...," which isn't what you want.
I suspect you may have started off by writing out this line first:
if (str1[m] == " ") {
This is closer to what you want, but still not quite right. Here, the left-hand side of the expression correctly says "the mth character of str1," and its type is char. The right-hand side, however, is a string literal. In C, there's a difference between a character (a single glyph) and a string (a sequence of characters), so this comparison isn't allowed. To fix this, change the line to read
if (str1[m] == ' ') {
Here, by using single quotes rather than double quotes, the right-hand side is treated as "the space character" rather than "a string containing a space." The types now match.
There are some other details about this code that need some cleanup. For instance, look at how you're printing out each character. Is that the right way to use printf with a character? Think about the if statement we discussed above and see if you can tinker with your code. Similarly, look at how you're reading a string. And there may be even more little issues here and there, but most of them are variations on these existing themes.
But I hope this helps get you in the right direction!

For loop should start from 0 and less than length (not less or equal)
String compare is wrong. Should be char compare to ' ' , no &
Finding apace should not do anything, non space outputs. You ++m twice.
& on %c output is address not value
From memory, scanf stops on whitespace anyway so needs fgets
int main()
{
char str1[100];
scanf("%s",str1);
int len = strlen(str1);
int m;
for (m=0;m<len;++m){
if (str1[m]!=' '){
printf("%c",str1[m]);
}
}
return 0;
}

There are few mistakes in your logic.
scanf terminates a string when it encounter any space or new line.
Read more about it here. So use fgets as said by others.
& in C represents address. Since array is implemented as pointers in C, its not advised to use & while getting string from stdin. But scanf can be used like scanf("%s",str1) or scanf("%s",&str[1])
While incrementing your index put m++ inside else condition.
Array indexing in C starts from 0 not 1.
So after these changes code will becames something like
int main()
{
char str1[100];
fgets(str1, sizeof str1 , stdin);
int len = strlen(str1);
int m=0;
while(m < len){
if (str1[m] == ' '){
m++;
}
else {
printf("%c",str1[m]);
m++;
}
}
return 0;
}

Related

Integer to pointer type conversion

int count_words(string word)
{
string spaces = "";
int total_words = 1;
int i, j = 0 ;
for (i = 0; i < strlen(word); i++)
{
strcpy(spaces, word[i]);
if (strcmp(spaces, " ") == 0)
{
total_words = total_words + 1;
}
}
return total_words;
}
I am trying to make a function in c that gets the total number of words, and my strategy is to find the number of spaces in the string input. However i get an error at strcpy about integer to ptrtype conversion,. I cant seem to compare the 2 strings without getting the error. Can someone explain to me whats how the error is happening and how I would go about fixing it. The IDE is also suggesting me to add an ampersand beside word[i] but it then makes a segmentation fault output
You need to learn a little more about the distinction between characters and strings in C.
When you say
strcpy(spaces, word[i]);
it looks like you're trying to copy one character to a string, so that in the next line you can do
if (strcmp(spaces, " ") == 0)
to compare the string against a string consisting of one space character.
Now, it's true, if you're trying to compare two strings, you do have to call strcmp. Something like
if (spaces == " ") /* WRONG */
definitely won't cut it.
In this case, though, you don't need to compare strings. You're inspecting your input a character at a time, so you can get away with a simple character comparison instead. Get rid of the spaces string and the call to strcpy, and just do
if (word[i] == ' ')
Notice that I'm comparing against the character ' ', not the string " ". Using == to compare single characters like this is perfectly fine.
Sometimes, you do have to construct a string out of individual characters, "by hand", but when you do, it's a little more elaborate. It would look like this:
char spaces[2];
spaces[0] = word[i];
spaces[1] = '\0';
if (strcmp(spaces, " ") == 0)
...
This would work, and you might want to try it to be sure, but it's overkill, and there's no reason to write it that way, except perhaps as a learning exercise.
Why didn't your code
strcpy(spaces, word[i]);
work? What did the error about "integer to pointer conversion" mean? Actually there are several things wrong here.
It's not clear what the actual type of the string spaces is (more on this later), but it has space for at most 0 characters, so you're not going to be able to copy a 1-character string into it.
It's also not clear that spaces is even writable. It might be a constant, meaning that you can't legally copy any characters into it.
Finally, strcpy copies one string to another. In C, although strings are arrays, they're usually referred to as pointers. So strcpy accepts two pointers, one to the source and one to the destination string. But you passed it word[i], which is a single character, not a string. Let's say the character was A. If you hadn't gotten the error, and if strcpy had tried to do its job, it would have treated A as a pointer, and it would have tried to copy a string from address 65 in memory (because the ASCII value of the character A is 65).
This example shows that working with strings is a little bit tricky in C. Strings are represented as arrays, but arrays are second-class citizens in C. You can't pass arrays around, but arrays are usually referred to by simple pointers to their first element, which you can pass around. This turns out to be very convenient and efficient once you understand it, but it's confusing at first, and takes some getting used to.
It might be nice if C did have a built-in, first-class string type, but it does not. Since it does not, C programmers myst always keep in mind the distinction between arrays and characters when working with strings. That "convenient" string typedef they give you in CS50 turns out to be a bad idea, because it's not actually convenient at all -- it merely hides an important distinction, and ends up making things even more confusing.

realloc:invalid next size error, can anyone point out the mistake i did in memory allocation

text which i passed to get_document function is a normal string data.
1." " denotes separation of words.
2."." denotes separation of sentences.
3."\n" denotes separation of paragraphs.
get_document is a function which allocates each words, sentences, paragraphs for separate memory blocks making it easily accessible.
Here's the code snippet.
char**** get_document(char* text) {
//get_document
int l=0,k=0,j=0,i=0;
char**** document = (char****)malloc(sizeof(char***));//para
document[l] = (char***)malloc(sizeof(char**));//sen
document[l][k] = (char**)malloc(sizeof(char*));//word
document[l][k][j] = (char*)malloc(sizeof(char));//letter
for(int z = 0; z < strlen(text); z++) {
if(strcmp(&text[z]," ")==0) {
document[l][k][j][i] = '\0';
j++;
document[l][k] = realloc(document[l][k],(sizeof(char*)) * j+1);
i=0;
document[l][k][j] = (char*)malloc(sizeof(char));
}
else if(strcmp(&text[z],".")==0) {
k++;
document[l] = realloc(document[l],(sizeof(char**)) * k+1);
j=0;
i=0;
document[l][k] =(char**)malloc(sizeof(char*));
document[l][k][j] = (char*)malloc(sizeof(char));
}
else if(strcmp(&text[z],"\n")==0) {
l++;
document = realloc(document,(sizeof(char***)) * l+1);
k=0;
j=0;
i=0;
document[l] = (char***)malloc(sizeof(char**));
document[l][k] =(char**)malloc(sizeof(char*));
document[l][k][j] = (char*)malloc(sizeof(char));
}
else {
strcpy(&document[l][k][j][i],&text[z]);
i++;
document[l][k][j] = realloc(document[l][k][j],(sizeof(char)) * i+1);
}
}
return document;
}
but when I run the program , I get the error
realloc:invalid next size
Can anyone help me with this. Thanks in advance.
when I run the program , I get the error
realloc:invalid next size
It appears that one of your realloc calls is failing because the allocator's tracking data has been corrupted. This is one of the more common things that can go wrong when you overwrite the bounds of an object, especially an allocated one. Which you do, a lot:
strcpy(&document[l][k][j][i],&text[z]);
If you want to make any progress in your study of C, it is essential that you learn the difference between a char and a string. The C string functions, such as strcmp() and strcpy(), apply only to the latter. You may use them on empty strings (containing only a nul) or on single-character strings (containing one character plus a nul), among other kinds, but they are neither safe nor useful for individual chars. For individual chars you would use standard C operators instead, such as == and =.
In the case of the line quoted above, each strcpy call will attempt to copy the entire tail of the input string, including the terminator, into into the one-char-big space pointed to by &document[l][k][j][i]. This will always write past the end of the allocated space, often by a lot, thus producing undefined behavior. You appear to instead want:
document[l][k][j][i] = text[z];
(well-deserved criticism of the choice of a quadruple pointer left aside). I see that you leave appending a string terminator for later, which is ok in principle, but I also see that you fail to terminate the last word of each sentence if the period ('.') immediately follows the word without any space.
Along the same lines, your several uses of strcmp() each compare the entire tail of the input string to one of several length-one string literals. Such comparisons are allowed, but they will not yield the results you appear to want. It appears you want simple equality tests against character constants, instead:
if (text[z] == ' ')
// ...
else if (text[z] == '.')
// ...
else if (text[z] == '\n')
And of course, even with those corrections, your approach is highly inefficient. Memory [re]allocation is comparatively expensive, and you are performing an allocation or reallocation for every. single. character. in the document. At least scan ahead to the end of each word so as to allocate a word at a time, though it is possible to do better even than that.
Also, do not neglect the fact that malloc() and realloc() can fail, in which case they return a null pointer. Robust code is meticulous about checking for and handling error results from its function calls, including allocation errors.
You mess up characters with strings.
You conditions to detect your elements are wrong:
if(strcmp(&text[z]," ")==0)
else if(strcmp(&text[z],".")==0)
...
Unless strlen(text) == 1 you will never enter any of your branches.
strcmp compares strings, not single characters. This means it compares the whole remaining buffer with a string of length 1 which can never be true except for the last character.
If you want to compare single characters, use if(text[z] == ' ') instead.
In your final else branch you completely smash your heap:
strcpy(&document[l][k][j][i],&text[z]);
You copy a string (again: the complete remaining buffer) into a single character.
The memory for document[l][k][j] was allocated using size=1. This cannot even hold a string of length 1 because there is no room for terminating '\0' byte.
Copying the string into memory large enough to hold exactly 1 character, causes heap corruption and in any call to memory allocation function, this will finally explode as you can see with your error message.
What you need is:
document[l][k][j][i] = text[z];
document[l][k][j][i+1] = 0;
Finally your memory size for allocation is wrong:
document = realloc(document,(sizeof(char***)) * l+1);
You want to add 1 extra element to the array but you only add 1 byte. Use this instead:
document = realloc(document,(sizeof(char***)) * (l+1));
The same applies for all other levels of your construction.
In addition your naming of counters is poor. One character variable names should only be used for loops etc. where there is no risk of confusion.
If you use them for different levels of array indexing, you should use names like wordcount, paracount etc. This would make the code much more readable.
Also I suggest you follow the hints in comments. Rethink your complete design.

Sscanf not returning what I want

I have the following problem:
sscanf is not returning the way I want it to.
This is the sscanf:
sscanf(naru,
"%s[^;]%s[^;]%s[^;]%s[^;]%f[^';']%f[^';']%[^;]%[^;]%[^;]%[^;]"
"%[^;]%[^;]%[^;]%[^;]%[^;]%[^;]%[^;]%[^;]%[^;]%[^;]%[^;]%[^;]"
"%[^;]%[^;]%[^;]%[^;]%[^;]%[^;]",
&jokeri, &paiva1, &keskilampo1, &minlampo1, &maxlampo1,
&paiva2, &keskilampo2, &minlampo2, &maxlampo2, &paiva3,
&keskilampo3, &minlampo3, &maxlampo3, &paiva4, &keskilampo4,
&minlampo4, &maxlampo4, &paiva5, &keskilampo5, &minlampo5,
&maxlampo5, &paiva6, &keskilampo6, &minlampo6, &maxlampo6,
&paiva7, &keskilampo7, &minlampo7, &maxlampo7);
The string it's scanning:
const char *str = "city;"
"2014-04-14;7.61;4.76;7.61;"
"2014-04-15;5.7;5.26;6.63;"
"2014-04-16;4.84;2.49;5.26;"
"2014-04-17;2.13;1.22;3.45;"
"2014-04-18;3;2.15;3.01;"
"2014-04-19;7.28;3.82;7.28;"
"2014-04-20;10.62;5.5;10.62;";
All of the variables are stored as char paiva1[22] etc; however, the sscanf isn't storing anything except the city correctly. I've been trying to stop each variable at ;.
Any help how to get it to store the dates etc correctly would be appreciated.
Or if there's a smarter way to do this, I'm open to suggestions.
There are multiple problems, but BLUEPIXY hit the first one — the scan-set notation doesn't follow %s.
Your first line of the format is:
"%s[^;]%s[^;]%s[^;]%s[^;]%f[^';']%f[^';']%[^;]%[^;]%[^;]%[^;]"
As it stands, it looks for a space separated word, followed by a [, a ^, a ;, and a ] (which is self-contradictory; the character after the string is a space or end of string).
The first fixup would be to use scan-sets properly:
"%[^;]%[^;]%[^;]%[^;]%f[^';']%f[^';']%[^;]%[^;]%[^;]%[^;]"
Now you have a problem that the first %[^;] scans everything up to the end of string or first semicolon, leaving nothing for the second %[;] to match.
"%[^;]; %[^;]; %[^;]; %[^;]; %f[^';']%f[^';']%[^;]%[^;]%[^;]%[^;]"
This looks for a string up to a semicolon, then for the semicolon, then optional white space, then repeats for three items. Apart from adding a length to limit the size of string, preventing overflow, these are fine. The %f is OK. The following material looks for an odd sequence of characters again.
However, when the data is looked at, it seems to consist of a city, and then seven sets of 'a date plus three numbers'.
You'd do better with an array of structures (if you've worked with those yet), or a set of 4 parallel arrays, and a loop:
char jokeri[30];
char paiva[7][30];
float keskilampo[7];
float minlampo[7];
float maxlampo[7];
int eoc; // End of conversion
int offset = 0;
char sep;
if (fscanf(str + offset, "%29[^;]%c%n", jokeri, &sep, &eoc) != 2 || sep != ';')
...report error...
offset += eoc;
for (int i = 0; i < 7; i++)
{
if (fscanf(str + offset, "%29[^;];%f;%f;%f%c%n", paiva[i],
&keskilampo[i], &minlampo[i], &maxlampo[i], &sep, &eoc) != 5 ||
sep != ';')
...report error...
offset += eoc;
}
See also How to use sscanf() in loops.
Now you have data that can be managed. The set of 29 separately named variables is a ghastly thought; the code using them will be horrid.
Note that the scan-set conversion specifications limit the string to a maximum length one shorter than the size of jokeri and the paiva array elements.
You might legitimately be wondering about why the code uses %c%n and &sep before &eoc. There is a reason, but it is subtle. Suppose that the sscanf() format string is:
"%29[^;];%f;%f;%f;%n"
Further, suppose there's a problem in the data that the semicolon after the third number is missing. The call to sscanf() will report that it made 4 successful conversions, but it doesn't count the %n as an assignment, so you can't tell that sscanf() didn't find a semicolon and therefore did not set &eoc at all; the value is left over from a previous call to sscanf(), or simply uninitialized. By using the %c to scan a value into sep, we get 5 returned on success, and we can be sure the %n was successful too. The code checks that the value in sep is in fact a semicolon and not something else.
You might want to consider a space before the semi-colons, and before the %c. They'll allow some other data strings to be converted that would not be matched otherwise. Spaces in a format string (outside a scan-set) indicate where optional white space may appear.
I would use strtok function to break your string into pieces using ; as a delimiter. Such a long format string may be a source of problems in future.

Pointer mystery/noobish issue

I am originally a Java programmer who is now struggling with C and specifically C's pointers.
The idea on my mind is to receive a string, from the user, on a command line, into a character pointer. I then want to access its individual elements. The idea is later to devise a function that will reverse the elements' order. (I want to work with anagrams in texts.)
My code is
#include <stdio.h>
char *string;
int main(void)
{
printf("Enter a string: ");
scanf("%s\n",string);
putchar(*string);
int i;
for (i=0; i<3;i++)
{
string--;
}
putchar(*string);
}
(Sorry, Code marking doesn't work).
What I am trying to do is to have a first shot at accessing individual elements. If the string is "Santillana" and the pointer is set at the very beginning (after scanf()), the content *string ought to be an S. If unbeknownst to me the pointer should happen to be set at the '\0' after scanf(), backing up a few steps (string-- repeated) ought to produce something in the way of a character with *string. Both these putchar()'s, though, produce a Segmentation fault.
I am doing something fundamentally wrong and something fundamental has escaped me. I would be eternally grateful for any advice about my shortcomings, most of all of any tips of books/resources where these particular problems are illuminated. Two thick C books and the reference manual have proved useless as far as this.
You haven't allocated space for the string. You'll need something like:
char string[1024];
You also should not be decrementing the variable string. If it is an array, you can't do that.
You could simply do:
putchar(string[i]);
Or you can use a pointer (to the proposed array):
char *str = string;
for (i = 0; i < 3; i++)
str++;
putchar(*str);
But you could shorten that loop to:
str += 3;
or simply write:
putchar(*(str+3));
Etc.
You should check that scanf() is successful. You should limit the size of the input string to avoid buffer (stack) overflows:
if (scanf("%1023s", string) != 1)
...something went wrong — probably EOF without any data...
Note that %s skips leading white space, and then reads characters up to the next white space (a simple definition of 'word'). Adding the newline to the format string makes little difference. You could consider "%1023[^\n]\n" instead; that looks for up to 1023 non-newlines followed by a newline.
You should start off avoiding global variables. Sometimes, they're necessary, but not in this example.
On a side note, using scanf(3) is bad practice. You may want to look into fgets(3) or similar functions that avoid common pitfalls that are associated with scanf(3).

Looping until a specific string is found

I have a simple question. I want to write a program in C that scans the lines of a specific file, and if the only phrase on the line is "Atoms", I want it to stop scanning and report which line it was on. This is what I have and is not compiling because apparently I'm comparing an integer to a pointer: (of course "string.h" is included.
char dm;
int test;
test = fscanf(inp,"%s", &dm);
while (test != EOF) {
if (dm=="Amit") {
printf("Found \"Atoms\" on line %d", j);
break;
}
j++;
}
the file was already opened with:
inp = fopen( .. )
And checked to make sure it opens correctly...
I would like to use a different approach though, and was wondering if it could work. Instead of scanning individual strings, could I scan entire lines as such:
// char tt[200];
//
// fgets(tt, 200, inp);
and do something like:
if (tt[] == "Atoms") break;
Thanks!
Amit
Without paying too much attention to your actual code here, the most important mistake your making is that the == operator will NOT compare two strings.
In C, a string is an array of characters, which is simply a pointer. So doing if("abcde" == some_string) will never be true unless they point to the same string!
You want to use a method like "strcmp(char *a, char *b)" which will return 0 if two strings are equal and something else if they're not. "strncmp(char *a, char *b, size_t n)" will compare the first "n" characters in a and b, and return 0 if they're equal, which is good for looking at the beginning of strings (to see if a string starts with a certain set of characters)
You also should NOT be passing a character as the pointer for %s in your fscanf! This will cause it to completely destroy your stack it tries to put many characters into ch, which only has space for a single character! As James says, you want to do something like char ch[BUFSIZE] where BUFSIZE is 1 larger than you ever expect a single line to be, then do "fscanf(inp, "%s", ch);"
Hope that helps!
please be aware that dm is a single char, while you need a char *
more: if (dm=="Amit") is wrong, change it in
if (strcmp(dm, "Amit") == 0)
In the line using fscanf, you are casting a string to the address of a char. Using the %s in fscanf should set the string to a pointer, not an address:
char *dm;
test = fscanf(inp,"%s", dm);
The * symbol declares an indirection, namely, the variable pointed to by dm. The fscanf line will declare dm as a reference to the string captured with the %s delimiter. It will point to the address of the first char in the string.
What kit said is correct too, the strcmp command should be used, not the == compare, as == will just compare the addresses of the strings.
Edit: What kit says below is correct. All pointers should be allocated memory before they are used, or else should be cast to a pre-allocated memory space. You can allocate memory like this:
dm = (char*)malloc(sizeof(char) * STRING_LENGTH);
where STRING_LENGTH is a maximum length of a possible string. This memory allocation only has to be done once.
The problem is you've declared 'dm' as a char, not a malloc'd char* or char[BUFSIZE]
http://www.cplusplus.com/reference/clibrary/cstdio/fscanf/
You'll also probably report incorrect line numbers, you'll need to scan the read-in buffer for '\n' occurences, and handle the case where your desired string lies across buffer boundaries.

Resources