Segmentation Fault in pset2 Readability - c

I am working on the readability problem in pset 2. I am getting a segmentation fault, and I believe I have narrowed it down to two functions I wrote that count the number of words and letters in the user-supplied text.
//function that counts letters
int count_letters(string t)
{
int letters = 0;
for (int i = 0, len = strlen(t); i < len; i++)
{
if (isalpha(t))
{
letters++;
}
}
return letters;
}
//function that counts words
int count_words(string t)
{
int words = 1;
for (int x = 0, len = strlen(t); x < len; x++)
{
if (isspace(t))
{
words++;
}
}
return words;
}
I am unsure how to fix the issue and would be open to any advice.
P.S. Sorry for any formatting issues; this is my first time posting to Stack Overflow.

First, please describe what you're talking about. It's not at all clear what the "readability problem in pset 2" is. Googling gave me some idea, but we shouldn't have to do that.
Second, isalpha() and isspace() take a char, not a string. You probably want
if (isalpha(t[i]))
Plus, you currently don't even use the variable 'i'. You should also check whether 't' is NULL before dereferencing it; dereferencing a NULL pointer is one of the most common causes of segfaults.
Third, the count_words() routine will always return at least 1. What if the passed in string is empty or only whitespace?
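Putting the three points together, a corrected pair of functions might look like the sketch below. It uses plain const char * in place of CS50's string typedef so that it compiles on its own. Returning 0 for NULL, empty, or all-whitespace input is an assumption, not something the problem set is known to require.

```c
#include <ctype.h>
#include <stddef.h>

// Count alphabetic characters; returns 0 for a NULL or empty string.
int count_letters(const char *t)
{
    int letters = 0;
    if (t == NULL)
    {
        return 0;
    }
    for (size_t i = 0; t[i] != '\0'; i++)
    {
        // Index the string (t[i]) and cast to unsigned char:
        // passing a negative char value to isalpha() is undefined.
        if (isalpha((unsigned char) t[i]))
        {
            letters++;
        }
    }
    return letters;
}

// Count words as runs of non-whitespace characters, so empty or
// all-whitespace input yields 0 and repeated spaces are not double-counted.
int count_words(const char *t)
{
    int words = 0;
    int in_word = 0;
    if (t == NULL)
    {
        return 0;
    }
    for (size_t i = 0; t[i] != '\0'; i++)
    {
        if (isspace((unsigned char) t[i]))
        {
            in_word = 0;
        }
        else if (!in_word)
        {
            in_word = 1; // first character of a new word
            words++;
        }
    }
    return words;
}
```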

Related

Shuffle words from a 1D array

I've been given this sentence and I need to shuffle the words of it:
char array[] = "today it is going to be a beautiful day.";
A correct output would be: "going it beautiful day is a be to today"
I've tried many things like turning it into a 2D array and shuffling the rows, but I can't get it to work.
Your instinct of creating a 2D array is solid. However, in C that's more involved than you might expect:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <time.h>
int main()
{
char array[] = "today it is going to be a beautiful day.";
char out_array[sizeof(array)];
char words[sizeof(array)][46];
int word_count = 0;
int letter_count = 0;
int on_word = 0;
int count = 0;
int i = 0;
int j = 0;
srand(time(NULL));
// parse words into 2D array
for (i = 0; i < sizeof(array); i++) {
if (array[i] == ' ') {
if (on_word) {
words[word_count++][letter_count] = '\0';
letter_count = 0;
on_word = 0;
}
} else if (array[i] == '\0' || array[i] == '.') {
break;
} else {
on_word = 1;
words[word_count][letter_count++] = array[i];
}
}
words[word_count++][letter_count] = '\0';
// randomly swap around words
for (i = 0; i < word_count; i++) {
char temp[46];
int idx = rand() % word_count;
if (idx != i) {
strcpy(temp, words[idx]);
strcpy(words[idx], words[i]);
strcpy(words[i], temp);
}
}
// output words into out_array
for (i = 0; i < word_count; i++) {
for (j = 0; words[i][j] != '\0'; j++) {
out_array[count++] = words[i][j];
}
out_array[count++] = ' ';
}
out_array[count - 1] = '\0';
printf("%s", out_array);
return 0;
}
You need two basic algorithms to solve this problem.
Split the input string into a list of words.
Randomly sample your list of words until there are no more.
1. Split the input string into a list of words.
This is much simpler than you may think. You don’t need to actually copy any words, just find where each one begins in your input string.
today it is going to be a beautiful day.
^---- ^- ^- ^---- ^- ^- ^ ^-------- ^--
There are all kinds of ways you can store that information, but the two most useful would be either an array of integer indices or an array of pointers.
For your example sentence, the following would be a list of indices:
0, 6, 9, 12, 18, 21, 24, 26, 36
To do this, just create an array with a reasonable upper limit on words:
int words[100]; // I wanna use a list of index values
int nwords = 0;
 
char * words[100]; // I wanna use a list of pointers
int nwords = 0;
If you do it yourself either structure is just as easy.
If you use strtok life is much easier with a list of pointers.
All you need at this point is a loop over your input to find the words and populate your list. Remember, a word is any run of alphabetic or numeric characters (and maybe hyphens, if you want to go that far). Everything else is not part of a word. If you #include <ctype.h> you get a very handy function for classifying a character as “word” or “not-word”:
if (isalnum( input[n] )) its_a_word_character;
else its_not_a_word_character_meaning_we_have_found_the_end_of_the_word;
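As a rough sketch of that loop, the function below records where each word begins, using the list-of-indices option. The name find_word_starts and its parameters are made up for illustration; they are not part of the original answer.

```c
#include <ctype.h>

// Record the starting index of each word (run of alphanumeric characters)
// in input[]. Stores at most max_words indices in starts[] and returns
// how many were stored.
int find_word_starts(const char *input, int starts[], int max_words)
{
    int nwords = 0;
    int in_word = 0;
    for (int n = 0; input[n] != '\0' && nwords < max_words; n++)
    {
        if (isalnum((unsigned char) input[n]))
        {
            if (!in_word)
            {
                starts[nwords++] = n; // first character of a new word
                in_word = 1;
            }
        }
        else
        {
            in_word = 0; // separator: the current word (if any) has ended
        }
    }
    return nwords;
}
```

For the example sentence this yields the nine indices listed earlier: 0, 6, 9, 12, 18, 21, 24, 26, 36.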
Now that you have a list of words, you can:
2. Randomly sample your list of words until there are no more.
There are, again, a number of ways you could do this. Already suggested above is to randomly shuffle the list of words (array of indices or array of pointers), and then simply rebuild the sentence by taking the words in order.
→ Beware, Etian’s example is not a correct shuffle, though it would probably go unnoticed or ignored by everyone at your level of instruction as it will appear to work just fine. Google around “coding horror fisher yates” for more.
The other way would be to just select and remove a random word from your array until there are no words left.
The random sampling is not difficult, but it does require some precise thinking, which actually makes it the most difficult part of your project.
To start you first need to get a proper random number. There is a trick to this that people are generally not taught. Here you go:
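// NOTE: POSIX <stdlib.h> already declares a function named random(); if your
// compiler reports a conflict, rename this one (and every call to it).
int random( int N ) // Return an UNBIASED pseudorandom value in [0, N-1].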
{
int max_value = (RAND_MAX / N) * N;
int result;
do result = rand(); while (result >= max_value);
return result % N;
}
And in main() the very first thing you should do is initialize the random number generator:
#include <stdlib.h>
#include <time.h>
int main()
{
srand( (unsigned)time( NULL ) );
Now you can sample / shuffle your array properly. You can google "Fisher-Yates Shuffle" (or follow the link in the comment below your question). Or you can just select the next word:
while (nwords)
{
int index = random( nwords );
// do something with words[index] here //
// Remove the word we just printed from our list of words
// • Do you see what trick we use to remove the word?
// • Do you also know why this does not affect our random selection?
words[index] = words[--nwords];
}
Hopefully you can see that both of these methods are essentially the same thing. Whichever you choose is up to you. I personally would use the latter because of the following consideration:
Output
You can create a new string and then print it, or you can just print each word directly. As the homework (as you presented it) does not require generation of a new string, I would just print the output directly. This makes life simpler in the sense that you do not have to mess with another string array.
As you print each word (or append it to a new string), remember how you separated them to begin with. If you use strtok you can just use something like:
printf( "%s", words[index] ); // print word directly to stdout
 
strcat( output, words[index] ); // append word to output string
If you found the beginnings of each word yourself, you will have to again loop until you find the end of the word:
// (Here words[] holds starting indices into the input[] string.)
// Print word, character by character, directly to stdout
for (int n = words[index]; isalnum( input[n] ); n++)
{
putchar( input[n] );
}
 
// Append word, character by character, to output string
for (int n = words[index]; isalnum( input[n] ); n++)
{
char * p = strchr( output, '\0' ); // (Find end of output[])
*p++ = input[n]; // (Add char)
*p = '\0'; // (Add null terminator)
}
All that’s left is to pay attention to spaces and periods in your output.
Hopefully this should be enough to get you started.
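For reference, here is a minimal sketch of the Fisher-Yates shuffle mentioned above, applied to an array of word indices. The helper names random_below and shuffle_ints are invented for this example, and the code assumes the array has no more than RAND_MAX elements.

```c
#include <stdlib.h>

// Hypothetical helper: unbiased pseudorandom value in [0, n-1], for n > 0.
static int random_below(int n)
{
    int limit = (RAND_MAX / n) * n; // largest multiple of n not above RAND_MAX
    int r;
    do
    {
        r = rand();
    } while (r >= limit); // reject values that would bias the result
    return r % n;
}

// Fisher-Yates shuffle of an array of ints (for example, word indices).
void shuffle_ints(int a[], int count)
{
    for (int i = count - 1; i > 0; i--)
    {
        int j = random_below(i + 1); // pick from the not-yet-fixed range [0, i]
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }
}
```

Call srand() once at the start of main(), then shuffle the list and print the words in their new order.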

Simulate strrev() function, crazy output

I am supposed to write my own version of the strrev() function. However, I don't understand why the output is a stream of meaningless special characters that continues until I stop the program. I also checked whether the problem was in the index "i" using the commented-out printf line, but the indices look fine. What could be the problem? Thanks!
void strrev_new(char *s_to_rev) {
int i = 0;
int length = 0;
length = strlen(s_to_rev);
for (i = 0; i < length; i++) {
s_to_rev[length - i] = s_to_rev[i];
// printf("%d ----- %d\n", (length-i), i);
}
}
You have an off-by-one error, since strlen() returns the length of the string (e.g. 5 for hello), but the last index in the string is 4 (counting from 0).
Try
s_to_rev[length - 1 - i] = s_to_rev[i];
Your code has two problems. The first, brilliantly spotted by @AKX, is that you start writing at s_to_rev[length] instead of s_to_rev[length - 1] (in C, array indexes start from 0), which also overwrites the string's terminating '\0'.
The second problem is a consequence of trying to reverse the string in place, that is, without using an auxiliary array.
With the loop
for (i = 0; i < length; i++) {
s_to_rev[length - i] = s_to_rev[i];
}
you correctly start by updating the last elements of the array. But as soon as you pass the middle of the string, the characters at s_to_rev[i] are no longer the original ones, because you overwrote them earlier!
Try instead traversing half the string and swapping characters (just use a temporary char variable):
for (i = 0; i < length/2; i++) {
char tmp = s_to_rev[length - i - 1];
s_to_rev[length - i -1] = s_to_rev[i];
s_to_rev[i] = tmp;
}
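Dropped into the original function, the swap loop gives something like the following sketch. It keeps the question's function name and assumes s_to_rev points to a modifiable, null-terminated string (not a string literal).

```c
#include <stddef.h>
#include <string.h>

// Reverse s_to_rev in place by swapping characters from both ends
// toward the middle. The terminating '\0' is never touched.
void strrev_new(char *s_to_rev)
{
    size_t length = strlen(s_to_rev);
    for (size_t i = 0; i < length / 2; i++)
    {
        char tmp = s_to_rev[length - 1 - i];
        s_to_rev[length - 1 - i] = s_to_rev[i];
        s_to_rev[i] = tmp;
    }
}
```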

Runtime error: reading uninitialized value, how can I fix it?

This function is supposed to compare two strings and, if they differ, return the ASCII difference between the first pair of characters that do not match. It works perfectly fine when I compile it with GCC, but when I run it through the online compiler used for submitting our class's homework, I get this error message:
Error near line 98: Reading an uninitialized value from address 10290
Line 98 is marked in the below code. I am not quite sure what the problem is and how I'm supposed to fix it. Does anyone have an idea?
int stringCompare(char * pStr1, char * pStr2) {
int n = 100;
int difference;
for (int i = 0; i < n; i++) {
difference = pStr1[i] - pStr2[i]; // line 98
if (difference != 0) {
return difference;
}
}
return difference;
}
If the strings are equal, your loop runs past the terminating '\0' and goes on to compare whatever memory follows the strings. To fix this, return immediately when you reach '\0' at position i while the strings are still equal. Try my fix:
int stringCompare(char * pStr1, char * pStr2) {
int n = 100;
int difference;
for (int i = 0; i < n; i++) {
difference = pStr1[i] - pStr2[i];
if (difference != 0 || pStr1[i] == '\0') {
return difference;
}
}
return difference;
}
The problem in your code is that you fail to check the real length of the strings before indexing them. You iterate with i from 0 to 99 but never check for the NUL terminator (\0) that marks the end of a string. Your for loop therefore goes beyond the end of the string and accesses memory it is not supposed to, which is undefined behavior (and is what the error is telling you).
The correct way to iterate over a string, is not to loop a fixed amount of cycles: you should start from index 0 and check each character of the string in the loop condition. When you find \0, you stop. See also How to iterate over a string in C?.
Here's a correct version of your code:
int stringCompare(char *pStr1, char *pStr2) {
size_t i;
for (i = 0; pStr1[i] != '\0' && pStr2[i] != '\0'; i++) {
if (pStr1[i] != pStr2[i])
break;
}
return pStr1[i] - pStr2[i];
}
You could even write this more concisely with a simple while loop:
int stringCompare(char *pStr1, char *pStr2) {
while (*pStr1 && *pStr1 == *pStr2) {
pStr1++;
pStr2++;
}
return *pStr1 - *pStr2;
}
Of course, both the above functions expect two valid pointers to be passed as arguments. If you also want to allow invalid pointers you should check them before starting the loop (though it does not seem like you want to do that from your code).
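If you did want to tolerate NULL arguments, one possible sketch is shown below. The name stringCompareSafe and the ordering rule for NULL (NULL sorts before any string, and two NULLs compare equal) are assumptions made for this example; nothing in the question specifies them. It also compares characters as unsigned char, as strcmp() does.

```c
#include <stddef.h>

// Like stringCompare above, but tolerates NULL arguments.
// ASSUMPTION (not from the question): NULL sorts before any string,
// and two NULLs compare equal.
int stringCompareSafe(const char *pStr1, const char *pStr2)
{
    if (pStr1 == NULL || pStr2 == NULL)
    {
        return (pStr1 != NULL) - (pStr2 != NULL);
    }
    while (*pStr1 != '\0' && *pStr1 == *pStr2)
    {
        pStr1++;
        pStr2++;
    }
    // Compare as unsigned char so characters with the high bit set
    // do not produce a sign-flipped difference.
    return (unsigned char) *pStr1 - (unsigned char) *pStr2;
}
```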

C String -- Sort by first-word length [closed]

I'm self-studying C and doing an exercise that, among other things, asks me to sort a list of user-entered strings by length of the first word in the string. The other functions in the exercise (including sorting the string by entire length) were easy to write. I've been working on this one for over three hours and can't get it to work. I'm sorting an array of pointers-to-char, and then printing them with a for loop in the main() function.
There's probably a much easier way to do this, but even if so, I cannot understand why this function doesn't work. I've made about thirty changes to it and the sort still comes out pretty random.
void srtlengthw(char * strings[], int n)
{
int top, seek, ct, ct_temp, i;
int ar_ct[n];
char * temp;
bool inWord;
for (top = 0, ct = 0, i = 0, inWord = false; top < n - 1; top++)
{
while (strings[top][i])
{
if (!isblank(strings[top][i]))
{
i++;
ct++;
inWord = true;
}
else if (!inWord)
i++;
else
break;
}
ar_ct[top] = ct;
for (seek = top + 1, ct = 0, i = 0, inWord = false; seek < n; seek++)
{
while(strings[seek][i])
{
if (!isblank(strings[seek][i]))
{
i++;
ct++;
inWord = true;
}
else if (!inWord)
i++;
else
break;
}
ar_ct[seek] = ct;
if (ar_ct[top] > ar_ct[seek])
{
ct_temp = ar_ct[top];
ar_ct[top] = ar_ct[seek];
ar_ct[seek] = ct_temp;
temp = strings[top];
strings[top] = strings[seek];
strings[seek] = temp;
}
}
}}
Example of wrong output, as requested:
Input:
Mary
had
a
little
lamb
that
was
sacrificed
to
Satan
=========
Output:
had
a
little
lamb
that
was
sacrificed
to
Mary
Satan
And here's an example of a much simpler function that worked properly. It's meant to sort the pointers by length of the entire string rather than just the first word. I tried to model the word-length sort function on this one, but I'm apparently having trouble handling my counter variables and maybe my bool flag.
void srtlength(char * strings[], int n)
{
int top, seek;
char * temp;
for (top = 0; top < n - 1; top++)
for (seek = top + 1; seek < n; seek++)
if (strlen(strings[top]) > strlen(strings[seek]))
{
temp = strings[top];
strings[top] = strings[seek];
strings[seek] = temp;
}
}
For Craig, hopefully this helps?
Input:
They say it's lonely at the top, and whatever you do
You always gotta watch m*********s around you
Nobody's invincible
No plan is foolproof
We all must meet our moment of truth
The same sheisty cats that you hang with and do your thang with
Could set you up and wet you up, n***a, peep the language
It's universal
You play with fire, it may hurt you, or burn you
Lessons are blessins you should learn through
Output for me:
You always gotta watch m********s around you
Nobody's invincible
No plan is foolproof
We all must meet our moment of truth
The same sheisty cats that you hang with and do your thang with
Could set you up and wet you up, n***a, peep the language
It's universal
You play with fire, it may hurt you, or burn you
Lessons are blessins you should learn through
They say it's lonely at the top, and whatever you do
If you're looking for output similar to that of the example code that you posted, then I suggest using it as a template for a version with your expected behavior. The key that I'm looking to point out is that it sorts by the return value of the strlen function.
strlen is a function declared in C's <string.h> header that returns the length of a C-style string. In C, as you're probably aware, the end of a string is identified by a null terminator, which is represented as a '\0'.
While the precise implementation of strlen varies from one library to another, here is a simplified version (the real function takes a const char * and returns a size_t):
int strlen(char * str){
char * l;
for(l = str; *l != '\0'; l++);
return l - str;
}
People will likely argue that there are problems with this and it isn't perfect, but it does hopefully show how the length of a string is determined.
Now that we understand that the last example sorts by the total string length, and we know how string length is determined, we can probably make our own version of strlen that stops after the first word, instead of stopping at the null terminator:
int blank_strlen(char * str){
char * l;
for(l = str; *l != '\0' && !isblank(*l); l++);
return l - str;
}
Now, using the example code given:
void blank_srtlength(char * strings[], int n)
{
int top, seek;
char * temp;
for (top = 0; top < n - 1; top++)
for (seek = top + 1; seek < n; seek++)
if (blank_strlen(strings[top]) > blank_strlen(strings[seek]))
{
temp = strings[top];
strings[top] = strings[seek];
strings[seek] = temp;
}
}
millinon's answer is a much better way to do it, as it is simpler. However, if you are looking for the reason why your code isn't working, it is due to your variables only being reset outside of each loop.
This code:
for (seek = top + 1, ct = 0, i = 0, inWord = false; seek < n; seek++)
{
while(strings[seek][i])
only sets ct, i and inWord once, before the loop is first started. When the program loops around, the values of ct, i and inWord will be kept from the last iteration.
Moving the assignments inside the loop like this:
for (seek = top + 1; seek < n; seek++)
{
ct = 0;
i = 0;
inWord = false;
while(strings[seek][i])
will fix your problem (you have to do it in both places).
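One way to make the reset impossible to forget is to move the counting into its own function, so the counters are local and start fresh on every call. The sketch below does that; first_word_length is a hypothetical helper name, and unlike blank_strlen it also skips leading blanks, which the loops in the question appear to be trying to do.

```c
#include <ctype.h>
#include <stddef.h>

// Length of the first word in str, skipping any leading blanks
// (spaces or tabs).
size_t first_word_length(const char *str)
{
    size_t i = 0;
    size_t ct = 0;
    while (isblank((unsigned char) str[i]))
    {
        i++; // skip leading blanks
    }
    while (str[i] != '\0' && !isblank((unsigned char) str[i]))
    {
        i++;
        ct++; // count the characters of the first word
    }
    return ct;
}

// Same exchange-style sort as srtlength, keyed on first-word length.
void srtlengthw(char *strings[], int n)
{
    for (int top = 0; top < n - 1; top++)
    {
        for (int seek = top + 1; seek < n; seek++)
        {
            if (first_word_length(strings[top]) > first_word_length(strings[seek]))
            {
                char *temp = strings[top];
                strings[top] = strings[seek];
                strings[seek] = temp;
            }
        }
    }
}
```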

Pointers to string C

I'm trying to write a function that returns 1 if every letter in “word” appears in “s”.
for example:

containsLetters1("this_is_a_long_string","gas") returns 1
containsLetters1("this_is_a_longstring","gaz") returns 0
containsLetters1("hello","p") returns 0
I can't understand why it's not right:
#include <stdio.h>
#include <string.h>
#define MAX_STRING 100
int containsLetters1(char *s, char *word)
{
int j,i, flag;
long len;
len=strlen(word);
for (i=0; i<=len; i++) {
flag=0;
for (j=0; j<MAX_STRING; j++) {
if (word==s) {
flag=1;
word++;
s++;
break;
}
s++;
}
if (flag==0) {
break;
}
}
return flag;
}
int main() {
char string1[MAX_STRING] , string2[MAX_STRING] ;
printf("Enter 2 strings for containsLetters1\n");
scanf ("%s %s", string1, string2);
printf("Return value from containsLetters1 is: %d\n",containsLetters1(string1,string2));
return 0;
}
Try these:
for (i=0; i < len; i++)... (use < instead of <=, since otherwise you would take one additional character);
if (word==s) should be if (*word==*s) (you compare characters stored at the pointed locations, not pointers);
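Pointer s advances, but it must go back to the start of the string s before you search for the next letter. Note that s -= len is not enough, because s has moved by the number of characters scanned rather than by len; save the original pointer and restore it after the for (j=...) loop;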
s++ after word++ is not needed, you advance the pointer by the same amount, whether or not you found a match;
flag should be initialized with 1 when declared.
Ah, that should be if (*word == *s); you need to use the indirection operator. Also, as hackss said, the flag = 0; must be outside the first for() loop.
Unrelated, but consider replacing scanf with fgets, or use scanf with a length specifier. For example:
scanf("%99s", string1);
Things I can see wrong at first glance:
Your loop goes over MAX_STRING, it only needs to go over the length of s.
Your iteration should cover only the length of the string, but indexes start at 0 and not 1. for (i=0; i<=len; i++) is not correct.
You should also compare the contents of the pointer and not the pointers themselves. if(*word == *s)
The pointer advance logic is incorrect. Maybe treating the pointer as an array could simplify your logic.
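A sketch of the array-style version, applying the points above, could look like this. It keeps the question's function name but takes const pointers, which is a small change from the original signature.

```c
#include <stddef.h>

// Returns 1 if every character of word also appears somewhere in s.
int containsLetters1(const char *s, const char *word)
{
    for (size_t i = 0; word[i] != '\0'; i++)
    {
        int flag = 0;
        // Always restart the search from the beginning of s,
        // and stop at its terminator rather than at MAX_STRING.
        for (size_t j = 0; s[j] != '\0'; j++)
        {
            if (word[i] == s[j])
            {
                flag = 1;
                break;
            }
        }
        if (!flag)
        {
            return 0; // word[i] does not occur in s
        }
    }
    return 1; // all characters found (also true for an empty word)
}
```

With the question's examples, this returns 1 for ("this_is_a_long_string", "gas") and 0 for the other two calls.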
Another unrelated point: A different algorithm is to hash the characters of string1 into a map, then check each character of string2 and see if it is present in the map. If all characters are present, return 1; as soon as you encounter one that is not present, return 0. If you are limited to ASCII characters, the hashing function is very easy: the character code itself can be the index. The longer your ASCII strings are, the better the performance of the second approach.
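A minimal sketch of that map idea uses a plain lookup table indexed by character code, so no real hash function is needed. The function name containsLettersTable is invented for this example.

```c
#include <limits.h>
#include <stddef.h>

// Lookup-table variant: mark every character of s, then test each
// character of word against the table. Each string is scanned once.
int containsLettersTable(const char *s, const char *word)
{
    unsigned char seen[UCHAR_MAX + 1] = { 0 }; // one slot per possible char value
    for (size_t i = 0; s[i] != '\0'; i++)
    {
        seen[(unsigned char) s[i]] = 1;
    }
    for (size_t i = 0; word[i] != '\0'; i++)
    {
        if (!seen[(unsigned char) word[i]])
        {
            return 0; // the first missing character decides the result
        }
    }
    return 1;
}
```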
Here is a one-liner solution, in keeping with Henry Spencer's Commandment 7 for C Programmers.
#include <string.h>
/*
* Does l contain every character that appears in r?
*
* Note degenerate cases: true if r is an empty string, even if l is empty.
*/
int contains(const char *l, const char *r)
{
return strspn(r, l) == strlen(r);
}
However, the problem statement is not about characters, but about letters. To solve the problem as literally given in the question, we must remove non-letters from the right string. For instance if r is the word error-prone, and l does not contain a hyphen, then the function returns 0, even if l contains every letter in r.
If we are allowed to modify the string r in place, then what we can do is replace every non-letter in the string with one of the letters that it does contain. (If it contains no letters, then we can just turn it into an empty string.)
void nuke_non_letters(char *r)
{
static const char *alpha =
"abcdefghijklmnopqrstuvwxyz"
"ABCDEFGHIJKLMNOPQRSTUVWXYZ";
while (*r) {
size_t letter_span = strspn(r, alpha);
size_t non_letter_span = strcspn(r + letter_span, alpha);
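/* Leading non-letters borrow the letter that follows them ('\0' if none),
   so a string such as "-abc" is not truncated to an empty string. */
char replace = (letter_span != 0) ? *r : r[non_letter_span];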
memset(r + letter_span, replace, non_letter_span);
r += letter_span + non_letter_span;
}
}
This also brings up another flaw: letters can be upper and lower case. If the right string is A, and the left one contains only a lower-case a, then we have failure.
One way to fix it is to filter the characters of both strings through tolower or toupper.
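As a rough illustration of handling both issues without modifying either string, the sketch below folds case with tolower and simply skips non-letters in the right string. The name containsLettersNoCase is hypothetical, and the code only handles single-byte characters as classified by the current C locale.

```c
#include <ctype.h>
#include <limits.h>
#include <stddef.h>

// Does l contain every letter that appears in r, ignoring case?
// Non-letters in r are skipped rather than required to match.
int containsLettersNoCase(const char *l, const char *r)
{
    unsigned char seen[UCHAR_MAX + 1] = { 0 };
    for (size_t i = 0; l[i] != '\0'; i++)
    {
        seen[(unsigned char) tolower((unsigned char) l[i])] = 1;
    }
    for (size_t i = 0; r[i] != '\0'; i++)
    {
        unsigned char c = (unsigned char) r[i];
        if (!isalpha(c))
        {
            continue; // hyphens, digits, etc. are not letters
        }
        if (!seen[(unsigned char) tolower(c)])
        {
            return 0;
        }
    }
    return 1;
}
```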
A third problem is that a letter is more than just the 26 letters of the English alphabet. A modern program should work with wide characters and recognize all Unicode letters as such so that it works in any language.
By the time we deal with all that, we may well surpass the length of some of the other answers.
Extending the idea in Rajiv's answer, you might build the character map incrementally, as in containsLetters2() below.
The containsLetters1() function is a simple brute force implementation using the standard string functions. If there are N characters in the string (haystack) and M in the word (needle), it has a worst-case performance of O(N*M) when the characters of the word being looked for only appear at the very end of the searched string. The strchr(needle, needle[i]) >= &needle[i] test is an optimization if there are likely to be repeated characters in the needle; if there won't be any repeats, it is a pessimization (but it can be removed and the code still works fine).
The containsLetters2() function searches through the string (haystack) at most once and searches through the word (needle) at most once, for a worst case performance of O(N+M).
#include <assert.h>
#include <stdio.h>
#include <string.h>
static int containsLetters1(char const *haystack, char const *needle)
{
for (int i = 0; needle[i] != '\0'; i++)
{
if (strchr(needle, needle[i]) >= &needle[i] &&
strchr(haystack, needle[i]) == 0)
return 0;
}
return 1;
}
static int containsLetters2(char const *haystack, char const *needle)
{
char map[256] = { 0 };
size_t j = 0;
for (int i = 0; needle[i] != '\0'; i++)
{
unsigned char c_needle = needle[i];
if (map[c_needle] == 0)
{
/* We don't know whether needle[i] is in the haystack yet */
unsigned char c_stack;
do
{
c_stack = haystack[j++];
if (c_stack == 0)
return 0;
map[c_stack] = 1;
} while (c_stack != c_needle);
}
}
return 1;
}
int main(void)
{
assert(containsLetters1("this_is_a_long_string","gagahats") == 1);
assert(containsLetters1("this_is_a_longstring","gaz") == 0);
assert(containsLetters1("hello","p") == 0);
assert(containsLetters2("this_is_a_long_string","gagahats") == 1);
assert(containsLetters2("this_is_a_longstring","gaz") == 0);
assert(containsLetters2("hello","p") == 0);
}
Since you can see the entire scope of the testing, this is not anything like thoroughly tested, but I believe it should work fine, regardless of how many repeats there are in the needle.
