Copying a string into a new array - c

I'm trying to read a string in an array, and if a character is not any of the excluded characters int a = ('a'||'e'||'i'||'o'||'u'||'y'||'w'||'h'); it should copy the character into a new array, then print it.
The code reads as:
void letter_remover (char b[])
{
int i;
char c[MAX];
int a = ('a'||'e'||'i'||'o'||'u'||'y'||'w'||'h');
for (i = 0; b[i] != '\0'; i++)
{
if (b[i] != a)
{
c[i] = b[i];
}
i++;
}
c[i] = '\0';
printf("New string without forbidden characters: %s\n", c);
}
However it only prints New string without forbidden characters: h, if the inputted array is, for example hello. I'd like the output of this to be ll (with h, e and o removed).

Use this:
if (b[i] != 'a' && b[i] != 'e' && b[i] != 'i' && b[i] != 'o' && b[i] != 'u' && b[i] != 'y' && b[i] != 'w' && b[i] != 'h')
The boolean OR operator just returns 0 or 1; it doesn't create an object that automatically tests against all the operands.
You could also use the strchr() function to search for a character in a string.
char a[] = "aeiouywh";
int j = 0; /* next free position in c[] */
for (i = 0; b[i] != '\0'; i++)
{
    if (!strchr(a, b[i]))
    {
        c[j++] = b[i]; /* separate output index, so c[] has no gaps */
    }
}
c[j] = '\0';
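To make the point about the two indexes concrete, here is a minimal sketch of the whole filter as a function. The name remove_letters and its parameter list are illustrative assumptions, not part of the question's code; the caller is assumed to supply a destination buffer at least as large as the source string.

```c
#include <string.h>

/* Copies src into dst, skipping every character listed in `excluded`.
 * dst must have room for at least strlen(src) + 1 bytes.
 * remove_letters is a hypothetical name used only for this sketch. */
void remove_letters(const char *src, const char *excluded, char *dst)
{
    size_t j = 0;                        /* next free slot in dst */
    for (size_t i = 0; src[i] != '\0'; i++) {
        if (strchr(excluded, src[i]) == NULL)
            dst[j++] = src[i];           /* keep it and advance the output index */
    }
    dst[j] = '\0';                       /* terminate at the output length, not at i */
}
```

Calling remove_letters("hello", "aeiouywh", out) with a large enough out leaves "ll" in out. The input index i advances on every character, while the output index j advances only when a character is kept.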

int a = ('a'||'e'||'i'||'o'||'u'||'y'||'w'||'h');
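...has an entirely different meaning than you expect. When you Boolean-OR together all those characters, a becomes 1. Since b[] contains no character with the value 1, no characters will be excluded. Also, even with a correct test, c[] would have gaps: you use the same index i for both arrays, so every excluded character leaves an unwritten slot in c[]. The extra i++ inside the loop body also makes you skip every second character of b[].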
You can use strcspn() to test if your string contains your forbidden characters. For example...
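// snip
size_t i = 0, j = 0;
const char *a = "aeiouywh";
while (b[i] != '\0')
{
    // length of the leading run of b[i..] that contains no forbidden character
    size_t idx = strcspn(&b[i], a);
    memcpy(&c[j], &b[i], idx); // copy that run
    j += idx;
    i += idx;
    if (b[i] != '\0')
        i++; // skip the forbidden character that ended the run
}
c[j] = '\0';
// etc...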
Also, you must be sure c[] is large enough to contain all the characters that might be copied.

Related

Checking for duplicate words in a string in C [duplicate]

I have written code in C to search for duplicate words in a string. It just copies every word of the string into a 2D string array, but it reports 0 for both the number of rows and the number of duplicate strings. What is the problem with the code?
int main() {
char str[50] = "C code find duplicate string";
char str2d[10][50];
int count = 0;
int row = 0, column = 0;
for (int i = 0; str[i] != '\0'; i++) {
if (str[i] != '\0' || str[i] != ' ') {
str2d[row][column] = str[i];
column += 1;
} else {
str2d[row][column] = '\0';
row += 1;
column = 0;
}
}
for (int x = 0; x <= row; x++) {
for (int y = x + 1; y <= row; y++) {
if (strcmp(str2d[x], str2d[y]) == 0 && (strcmp(str2d[y], "0") != 0)) {
count += 1;
}
}
}
printf("%i %i", row, count);
return 0;
}
There are multiple problems in your code:
the 2D array might be too small: there could be as many as 25 words in a 50 byte string, and even more if you consider sequences of spaces to embed empty words.
the test if (str[i] != '\0' || str[i] != ' ') is always true.
the last word in the string is not null terminated in the 2D array.
the word at str2d[row] is uninitialized if the string ends with a space
sequences of spaces cause empty words to be stored into the 2D array.
there is no point in testing strcmp(str2d[y], "0"). This might be a failed attempt at ignoring empty words, which could be tested with strcmp(str2d[y], "").
Here is a modified version:
#include <stdio.h>
#include <string.h>
int main() {
char str[50] = "C code find duplicate string";
char str2d[25][50];
int count = 0, row = 0, column = 0;
for (int i = 0;;) {
// skip initial spaces
while (str[i] == ' ')
i++;
if (str[i] == '\0')
break;
// copy characters up to the next space or the end of the string
column = 0; // each word starts at the beginning of its own row
while (str[i] != ' ' && str[i] != '\0')
str2d[row][column++] = str[i++];
str2d[row][column] = '\0';
row++;
}
for (int x = 0; x < row; x++) {
for (int y = x + 1; y < row; y++) {
if (strcmp(str2d[x], str2d[y]) == 0)
count += 1;
}
}
printf("%i %i\n", row, count);
return 0;
}
The problems are:
if (str[i] != '\0' || str[i] != ' ') should be if (str[i] != '\0' && str[i] != ' '). With the logical OR the condition is always true (no character can equal both '\0' and ' '), so the else branch is never reached.
if (strcmp(str2d[x], str2d[y]) == 0 && (strcmp(str2d[y], "0") != 0)) should be if (strcmp(str2d[x], str2d[y]) == 0). Otherwise, your code will not count duplicates when the word is "0".
a. To avoid confusion, use something like printf("Number of rows = %d, Number of duplicates = %d\n", row+1, count);. Since C arrays start at index 0, row holds the index of the last row, not the row count: when row is 0, the number of rows is 1.
b. If you haven't realised by now, there are no duplicates in your str variable: char str[50] = "C code find duplicate string";. So your code returns a correct value of 0. Change it to char str[50] = "C code find duplicate duplicate"; (for example), your code will correctly return 1.
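The fixes from both answers can be combined into one function. This is only a sketch: the name count_duplicate_pairs, the limits MAX_WORDS and MAX_LEN, and the use of strspn()/strcspn() to find word boundaries are choices made for this example, not something taken from the question.

```c
#include <string.h>

#define MAX_WORDS 25   /* assumed limit, matching the 25 rows suggested above */
#define MAX_LEN   50   /* assumed limit on the length of one word, incl. '\0' */

/* Splits s on spaces and returns the number of pairs of equal words.
 * Words beyond MAX_WORDS are ignored; longer words are truncated. */
int count_duplicate_pairs(const char *s)
{
    char words[MAX_WORDS][MAX_LEN];
    int n = 0;

    while (n < MAX_WORDS) {
        s += strspn(s, " ");                 /* skip runs of spaces */
        if (*s == '\0')
            break;
        size_t len = strcspn(s, " ");        /* length of the current word */
        size_t keep = len < (size_t)(MAX_LEN - 1) ? len : (size_t)(MAX_LEN - 1);
        memcpy(words[n], s, keep);
        words[n][keep] = '\0';               /* every stored word is terminated */
        n++;
        s += len;
    }

    int count = 0;
    for (int x = 0; x < n; x++)
        for (int y = x + 1; y < n; y++)
            if (strcmp(words[x], words[y]) == 0)
                count++;
    return count;
}
```

Note that this counts pairs, as the original loops do, so a word that appears three times contributes 3 to the result.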

Function that divides the string with given delimiter

I have a function named ft_split(char const *s, char c) that is supposed to take a string s and a delimiter char c and divide s into a bunch of smaller strings.
This is the 3rd or 4th day I have been trying to solve it. My approach:
It calculates the number of characters in the string, counting 1 delimiter at a time (if space is the delimiter and there are 2 or more spaces in a row, then it counts one space and no more. Why? That space is the memory for adding '\0' at the end of each split string).
It finds the size (k) of the characters between delimiters -> malloc memory -> copy from string to malloc -> copy from malloc to malloc -> start over.
But well... the function causes a segmentation fault. The debugger shows that after allocating the "big" memory it does not go inside the while loop, but straight to big[y][z] = small[z], after which it exits the function.
Any tips appreciated.
#include "libft.h"
#include <stdlib.h>
int ft_count(char const *s, char c)
{
int i;
int j;
i = 0;
j = 0;
while (s[i] != '\0')
{
i++;
if (s[i] == c)
{
i++;
while (s[i] == c)
{
i++;
j++;
}
}
}
return (i - j);
}
char **ft_split(char const *s, char c)
{
int i;
int k;
int y;
int z;
char *small;
char **big;
i = 0;
y = 0;
if (!(big = (char **)malloc((ft_count(s, c) + 1) * sizeof(char))))
return (0);
while (s[i] != '\0')
{
while (s[i] == c)
i++;
k = 0;
while (s[i] != c)
{
i++;
k++;
}
if (!(small = (char *)malloc(k * sizeof(char) + 1)))
return (0);
z = 0;
while (z < k)
{
small[z] = s[i - k + z];
z++;
}
small[k] = '\0';
z = 0;
while (z < k)
{
big[y][z] = small[z];
z++;
}
y++;
free(small);
}
big[y][i] = '\0';
return (big);
}
int main()
{
char a[] = "jestemzzbogiemzalfa";
ft_split(a, 'z');
}
I didn't follow everything the code is doing, but:
You have a char **big, it's a pointer-to-pointer-to-char, so presumably is supposed to point to an array of char *, which then point to strings. That would look like this:
[ big (char **) ] -> [ big[0] (char *) ][ big[1] (char *) ][ big[2] ... ]
                        |                  |
                        v                  v
                     [ big[0][0] (char) ]  ...
                     [ big[0][1] (char) ]
                     [ big[0][2] (char) ]
                     [ ... ]
Here, when you call big = malloc(N * sizeof(char *)), you allocate space for the middle pointers, big[0] to big[N-1], the ones on the top right in the horizontal array. It still doesn't set them to anything, and doesn't reserve space for the final strings (big[0][x] etc.)
Instead, you'd need to do something like
big = malloc(N * sizeof(char *));
for (i = 0; i < N; i++) {
big[i] = malloc(k);
}
for each final string individually, with the correct size etc. Or just allocate a big area in one go, and split it among the final strings.
Now, in your code, it doesn't look like you're ever assigning anything to big[y], so the pointers might be anything, which very likely explains the segfault when referencing big[y][z]. If you used calloc(), you'd know that big[y] was NULL; with malloc() it might be, or might not.
Also, here:
while (s[i] != '\0')
{
while (s[i] == c)
i++;
k = 0;
while (s[i] != c) /* here */
{
i++;
k++;
}
I wonder what happens if the end of string is reached at the while (s[i] != c), i.e. if s[i] is '\0' at that point? The loop should probably stop, but it doesn't look like it does.
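The two-level allocation described in this answer can be sketched as follows. The function names copy_strings and free_strings are hypothetical, and the input is a plain array of strings rather than the result of splitting, to keep the example focused on who owns which block of memory.

```c
#include <stdlib.h>
#include <string.h>

/* Builds a NULL-terminated array holding copies of the n strings in src.
 * Returns NULL if any allocation fails. Illustrative names only. */
char **copy_strings(const char *const *src, size_t n)
{
    char **big = malloc((n + 1) * sizeof *big);   /* room for the pointers only */
    if (big == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(src[i]);
        big[i] = malloc(len + 1);                 /* room for string i itself */
        if (big[i] == NULL) {
            while (i > 0)                         /* undo what was allocated */
                free(big[--i]);
            free(big);
            return NULL;
        }
        memcpy(big[i], src[i], len + 1);          /* copies the '\0' too */
    }
    big[n] = NULL;                                /* end marker */
    return big;
}

/* Releases every string, then the array of pointers. */
void free_strings(char **big)
{
    if (big == NULL)
        return;
    for (size_t i = 0; big[i] != NULL; i++)
        free(big[i]);
    free(big);
}
```

The first malloc() is sized with sizeof *big, which is the size of a char *, so it stays correct even if the element type changes. Each big[i] must be set before big[i][z] is ever touched.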
There are multiple problems in the code:
the ft_count() function is incorrect: you increment i before testing for separators, hence the number is incorrect if the string starts with separators. You should instead count the number of transitions from separator to non-separator:
int ft_count(char const *s, char c)
{
char last;
int i;
int j;
last = c;
i = 0;
j = 0;
while (s[i] != '\0')
{
if (last == c && s[i] != c)
{
j++;
}
last = s[i];
i++;
}
return j;
}
Furthermore, the ft_split() function is incorrect too:
the amount of memory allocated for the big array of pointers is invalid: you should multiply the number of elements by the element size, which is not char but char *.
you add an empty string at the end of the array if the string ends with separators. You should test for a null byte after skipping the separators.
you do not test for the null terminator when scanning for the separator after the item.
you do not store the small pointer into the big array of pointers. Instead of copying the string to big[y][...], you should just set big[y] = small and not free(small).
Here is a modified version:
char **ft_split(char const *s, char c)
{
int i;
int k;
int y;
int z;
char *small;
char **big;
if (!(big = (char **)malloc((ft_count(s, c) + 1) * sizeof(*big))))
return (0);
i = 0;
y = 0;
while (42) // aka 42 for ever :)
{
while (s[i] == c)
i++;
if (s[i] == '\0')
break;
k = 0;
while (s[i + k] != '\0' && s[i + k] != c)
{
k++;
}
if (!(small = (char *)malloc((k + 1) * sizeof(char))))
return (0);
z = 0;
while (z < k)
{
small[z] = s[i];
z++;
i++;
}
small[k] = '\0';
big[y] = small;
y++;
}
big[y] = NULL;
return (big);
}
42 rant:
These coding conventions (the norminette) are counterproductive! for loops are more readable and safer than these while loops, casts on the return value of malloc() are useless and confusing, and parentheses around the argument of return are childish.

How can I extract only numbers from a string?

I want to extract only numbers from a string, and to put them in an array.
For example, string is "fds34 21k34 k25j 6 10j340ii0i5".
I want to make one array, which elements are like following:
arr[0]=34, arr[1]=21, arr[2]=34, arr[3]=25, arr[4]=6, arr[5]=10, arr[6]=340, arr[7]=0, arr[8]=5;
my trial code:
#include <stdio.h>
int main()
{
char ch;
int i, j;
int pr[100];
i=0;
while ( (ch = getchar()) != '\n' ){
if( ch>='0' && ch<='9' ){
pr[i] = ch-'0';
i++;
}
} /* end of while */
for(j=0; j<i; j++)
printf("pr[%d]: %d\n", j, pr[j]);
return 0;
}
My code cannot recognize contiguous digits as one number; the pr array just has {3, 4, 2, 1, 3, 4, 2, 5, 6, 1, 0, 3, 4, 0, 0, 5}. Is there any method to achieve my objective?
Here is an algorithm:
Use a string to store the current number. At first, initialize it as an empty string.
When ch is a digit ('0'..'9'), append it to this string.
When ch is not a digit and the string is not empty, convert the current string to a number with the atoi function and store that number in the array. After that, reset the current string to empty.
Example: I have the string "ab34 56d1".
Use the string str to store the current number; at first str = "" (empty).
ch = 'a', do nothing (because the current string is empty)
ch = 'b', do nothing
ch = '3', append it to the string, so str = "3"
ch = '4', append it to str, now str = "34"
ch = ' ', convert "34" to 34, save it in the array, reset str = "" again
.....
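The steps above can be sketched in C as a function that works on a string instead of reading from getchar(). The name extract_numbers, the buffer size, and the rule that overly long digit runs are truncated are assumptions of this sketch; the 9-digit cap keeps atoi() within the range of a 32-bit int.

```c
#include <stdlib.h>

/* Collects each run of decimal digits in text into a small buffer and
 * converts it with atoi() when the run ends. Stores at most max numbers
 * in out and returns how many were stored. Digits beyond the 9th of a
 * run are dropped (assumption: int has at least 32 bits). */
size_t extract_numbers(const char *text, int *out, size_t max)
{
    char str[10];        /* the "current number" string from the steps above */
    size_t len = 0;      /* number of digits currently in str */
    size_t count = 0;

    for (;; text++) {
        char ch = *text;
        if (ch >= '0' && ch <= '9') {
            if (len < sizeof str - 1)
                str[len++] = ch;             /* put the digit in the string */
        } else {
            if (len > 0 && count < max) {
                str[len] = '\0';
                out[count++] = atoi(str);    /* run ended: convert and store */
            }
            len = 0;                         /* reset to the empty string */
            if (ch == '\0')
                break;                       /* the terminator also ends a run */
        }
    }
    return count;
}
```

Treating the terminating '\0' like any other non-digit means a number at the very end of the input is stored without a special case after the loop.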
Create a state machine.
Keep track of the previous character - was it a digit?
When a digit is detected ...
... If continuing a digit sequence, *10 and add
... Else start new sequence
Do not overfill pr[]
Use int ch to properly detect EOF
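//char ch;
int ch;
bool previous_digit = false; // requires <stdbool.h>
int pr[100];
int i = 0 - 1; // index of the number being built; -1 means none yet
while ((ch = getchar()) != '\n' && ch != EOF) {
    if (ch >= '0' && ch <= '9') {
        if (previous_digit) {
            pr[i] = pr[i] * 10 + ch - '0';
        } else {
            if (i + 1 >= 100)
                break; // pr[] is full: do not start another number
            i++;
            pr[i] = ch - '0';
        }
        previous_digit = true;
    } else {
        previous_digit = false;
    }
}
i++; // i is now the count of numbers stored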
Use scanf. Life becomes simpler when you use standard functions instead of making up your own algorithms.
This code uses scanf to read a line of user input and then parses it with sscanf. Each detected number is put into an array and the search index is shifted forward by the number of digits consumed.
char line[100];
int p[100];
int readNums = 0;
int readDigits = 0;
int len = 0; // stays 0 if the line is empty and scanf matches nothing
int index = 0;
//get line
scanf("%99[^\n]%n",line,&len);
while( index < len ){
if(line[index] <= '9' && line[index] >= '0'){
if(sscanf(line + index, "%d%n", p + readNums, &readDigits) != 1)
fprintf(stderr, "failed match!!!! D:\n");
index += readDigits;
readNums++;
}
index++;
}
//print results
printf("read %d ints\n", readNums);
for(int i = 0; i < readNums; i++)
printf("p[%d] = %d\n", i, p[i]);
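In the same spirit of leaning on the standard library, strtol() can replace the sscanf() call, because it reports where the number ended through its second argument. This is a different technique from the answer's, shown only as a sketch; the name extract_with_strtol and the choice of long for the output are assumptions.

```c
#include <ctype.h>
#include <stdlib.h>

/* Scans line and stores every unsigned run of digits as a long.
 * Stores at most max values in out and returns how many were stored. */
size_t extract_with_strtol(const char *line, long *out, size_t max)
{
    size_t count = 0;
    while (*line != '\0' && count < max) {
        if (isdigit((unsigned char)*line)) {
            char *end;
            out[count++] = strtol(line, &end, 10);
            line = end;                  /* end points just past the digits */
        } else {
            line++;                      /* not a digit: move on */
        }
    }
    return count;
}
```

Because strtol() is only called when the current character is a digit, it always consumes at least one character, so the loop cannot stall. A minus sign in front of a number is skipped like any other non-digit, which matches the behaviour of the other answers.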
Here is working code; I tried it 3-4 times and it worked.
chPrevious holds the previous value of ch. There is no need to store the digits in a string; we can simply use an integer for this purpose.
#include <stdio.h>
#define NONDIGIT 'a'
int main(void) {
    int ch, chPrevious = NONDIGIT; // chPrevious holds the previous value of ch
    int temp = 0;
    int pr[100];
    int i = 0;
    while ((ch = getchar()) != '\n' && ch != EOF) {
        int isDigit = (ch >= '0' && ch <= '9');
        int wasDigit = (chPrevious >= '0' && chPrevious <= '9');
        if (isDigit && wasDigit) {
            temp = temp * 10 + (ch - '0'); // continue the current number
        } else if (isDigit) {
            temp = ch - '0'; // start a new number
        } else if (wasDigit && i < 100) {
            pr[i++] = temp; // a number just ended: store it, even if it is 0
        }
        chPrevious = ch;
    }
    if (chPrevious >= '0' && chPrevious <= '9' && i < 100)
        pr[i++] = temp; // store a number that ends the line
    for (int j = 0; j < i; j++)
        printf("pr[%d]: %d\n", j, pr[j]);
    return 0;
}
There may be other, more efficient ways to do this. Please excuse the styling; you should improve this code further.

Sorting words out in a string array

My program is designed to allow the user to input a string and my program will output the number of occurrences of each letter and word. My program also sorts the words alphabetically.
My issue is: I output the words seen (first unsorted) and their occurrences as a table, and in my table I don't want duplicates. SOLVED
For example, if the word "to" was seen twice I just want the word "to" to appear only once in my table outputting the number of occurrences.
How can I fix this? Also, why is it that I can't simply set string[i] == delim to apply to every delimiter rather than having to assign it manually for each delimiter?
Edit: Fixed my output error. But how can I set a condition for string[i] to equal any of the delimiters in my code rather than just work for the space bar? For example, if I enter "you, you", my output is "you, you" rather than just "you". How can I write it so it removes the comma and treats "you, you" as the same word twice?
Any help is appreciated. My code is below:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
const char delim[] = ", . - !*()&^%$##<> ? []{}\\ / \"";
#define SIZE 1000
void occurrences(char s[], int count[]);
void lower(char s[]);
int main()
{
char string[SIZE], words[SIZE][SIZE], temp[SIZE];
int i = 0, j = 0, k = 0, n = 0, count;
int c = 0, cnt[26] = { 0 };
printf("Enter your input string:");
fgets(string, 256, stdin);
string[strlen(string) - 1] = '\0';
lower(string);
occurrences(string, cnt);
printf("Number of occurrences of each letter in the text: \n");
for (c = 0; c < 26; c++){
if (cnt[c] != 0){
printf("%c \t %d\n", c + 'a', cnt[c]);
}
}
/*extracting each and every string and copying to a different place */
while (string[i] != '\0')
{
if (string[i] == ' ')
{
words[j][k] = '\0';
k = 0;
j++;
}
else
{
words[j][k++] = string[i];
}
i++;
}
words[j][k] = '\0';
n = j;
printf("Unsorted Frequency:\n");
for (i = 0; i < n; i++)
{
strcpy(temp, words[i]);
for (j = i + 1; j <= n; j++)
{
if (strcmp(words[i], words[j]) == 0)
{
for (a = j; a <= n; a++)
strcpy(words[a], words[a + 1]);
n--;
}
} //inner for
}
i = 0;
/* find the frequency of each word */
while (i <= n) {
count = 1;
if (i != n) {
for (j = i + 1; j <= n; j++) {
if (strcmp(words[i], words[j]) == 0) {
count++;
}
}
}
/* count - indicates the frequecy of word[i] */
printf("%s\t%d\n", words[i], count);
/* skipping to the next word to process */
i = i + count;
}
printf("ALphabetical Order:\n");
for (i = 0; i < n; i++)
{
strcpy(temp, words[i]);
for (j = i + 1; j <= n; j++)
{
if (strcmp(words[i], words[j]) > 0)
{
strcpy(temp, words[j]);
strcpy(words[j], words[i]);
strcpy(words[i], temp);
}
}
}
i = 0;
while (i <= n) {
count = 1;
if (i != n) {
for (j = i + 1; j <= n; j++) {
if (strcmp(words[i], words[j]) == 0) {
count++;
}
}
}
printf("%s\n", words[i]);
i = i + count;
}
return 0;
}
void occurrences(char s[], int count[]){
int i = 0;
while (s[i] != '\0'){
if (s[i] >= 'a' && s[i] <= 'z')
count[s[i] - 'a']++;
i++;
}
}
void lower(char s[]){
int i = 0;
while (s[i] != '\0'){
if (s[i] >= 'A' && s[i] <= 'Z'){
s[i] = (s[i] - 'A') + 'a';
}
i++;
}
}
I have the solution to your problem, and its name is Wall. No, not the type you bang your head against when you encounter a problem that you can't seem to solve, but the one for the warnings that you want your compiler to emit: ALL OF THEM.
If you compile C code without -Wall, then you can commit all the errors that people cite as the reason C is so dangerous. But once you enable warnings, the compiler will tell you about them.
I have 4 for your program:
for (c; c< 26; c++) { That first c doesn't do anything; this could be written for (; c < 26; c++) { or perhaps better as for (c = 0; c < 26; c++) {
words[i] == NULL "Statement with no effect". Well that probably isn't what you wanted to do. The compiler tells you that that line doesn't do anything.
"Unused variable 'text'." That is pretty clear too: you have defined text as a variable but then never used it. Perhaps you meant to or perhaps it was a variable you thought you needed. Either way it can go now.
"Control reaches end of non-void function". In C main is usually defined as int main, i.e. main returns an int. Standard practice is to return 0 if the program successfully completed and some other value on error. Adding return 0; at the end of main will work.
You can simplify your delimiters. Anything that is not a-z (after lower casing it), is a delimiter. You don't [need to] care which one it is. It's the end of a word. Rather than specify delimiters, specify chars that are word chars (e.g. if words were C symbols, the word chars would be: A-Z, a-z, 0-9, and _). But, it looks like you only want a-z.
Here are some [untested] examples:
void
scanline(char *buf)
{
int chr;
char *lhs;
char *rhs;
char tmp[5000];
lhs = tmp;
for (rhs = buf; *rhs != 0; ++rhs) {
chr = *rhs;
if ((chr >= 'A') && (chr <= 'Z'))
chr = (chr - 'A') + 'a';
if ((chr >= 'a') && (chr <= 'z')) {
*lhs++ = chr;
char_histogram[chr] += 1;
continue;
}
*lhs = 0;
if (lhs > tmp)
count_string(tmp);
lhs = tmp;
}
if (lhs > tmp) {
*lhs = 0;
count_string(tmp);
}
}
void
count_string(char *str)
{
int idx;
int match;
match = -1;
for (idx = 0; idx < word_count; ++idx) {
if (strcmp(words[idx],str) == 0) {
match = idx;
break;
}
}
if (match < 0) {
match = word_count++;
strcpy(words[match],str);
}
word_histogram[match] += 1;
}
Using separate arrays is ugly. Using a struct might be better:
#define STRMAX 100 // max string length
#define WORDMAX 1000 // max number of strings
struct word {
int word_hist; // histogram value
char word_string[STRMAX]; // string value
};
int word_count; // number of elements in wordlist
struct word wordlist[WORDMAX]; // list of known words
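Putting the struct together with the "anything that is not a letter is a delimiter" idea gives something like the sketch below. The functions scan_words and word_frequency are hypothetical names introduced here, and isalpha()/tolower() from <ctype.h> stand in for the manual range checks; in the default "C" locale they accept only A-Z and a-z.

```c
#include <ctype.h>
#include <string.h>

#define STRMAX  100   /* max string length, as in the answer */
#define WORDMAX 1000  /* max number of strings, as in the answer */

struct word {
    int  word_hist;               /* histogram value */
    char word_string[STRMAX];     /* string value */
};

static int word_count;                    /* number of elements in wordlist */
static struct word wordlist[WORDMAX];     /* list of known words */

/* Adds one occurrence of str, creating an entry the first time it is seen. */
static void count_string(const char *str)
{
    int idx;
    for (idx = 0; idx < word_count; ++idx)
        if (strcmp(wordlist[idx].word_string, str) == 0)
            break;
    if (idx == word_count) {
        if (word_count == WORDMAX)
            return;                        /* table full: drop the word */
        strcpy(wordlist[idx].word_string, str);
        wordlist[idx].word_hist = 0;
        word_count++;
    }
    wordlist[idx].word_hist += 1;
}

/* Treats every non-letter as a delimiter and counts lower-cased words. */
void scan_words(const char *buf)
{
    char tmp[STRMAX];
    size_t len = 0;
    for (;; ++buf) {
        unsigned char chr = (unsigned char)*buf;
        if (isalpha(chr)) {
            if (len < (size_t)(STRMAX - 1))
                tmp[len++] = (char)tolower(chr);
            continue;
        }
        if (len > 0) {                     /* a word just ended */
            tmp[len] = '\0';
            count_string(tmp);
            len = 0;
        }
        if (chr == '\0')
            break;
    }
}

/* Returns how many times str was seen, or 0 if it is unknown. */
int word_frequency(const char *str)
{
    for (int idx = 0; idx < word_count; ++idx)
        if (strcmp(wordlist[idx].word_string, str) == 0)
            return wordlist[idx].word_hist;
    return 0;
}
```

With this approach "you, you" yields a single entry "you" with a count of 2, because the comma and the space both end a word and neither is ever stored.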

How do I allocate memory to my char pointer?

My assignment is to allow the user to enter any input and print the occurrences of letters and words, we also have to print out how many one letter, two, three, etc.. letter words are in the string. I have gotten the letter part of my code to work and have revised my word function several times, but still can't get the word finding function to even begin to work. The compiler says the char pointer word is undeclared when it clearly is. Do I have to allocate memory to it and the array of characters?
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void findLetters(char *ptr);
void findWords(char *point);
int main()
{
char textStream[100]; //up to 98 characters and '\n\ and '\0'
printf("enter some text\n");
if (fgets(textStream, sizeof (textStream), stdin)) //input up to 99 characters
{
findLetters(textStream);
findWords(textStream);
}
else
{
printf("fgets failed\n");
}
return 0;
}
void findLetters(char *ptr) //find occurences of all letters
{
int upLetters[26];
int loLetters[26];
int i;
int index;
for (i = 0; i < 26; i++) // set array to all zero
{
upLetters[i] = 0;
loLetters[i] = 0;
}
i = 0;
while (ptr[i] != '\0') // loop until prt[i] is '\0'
{
if (ptr[i] >= 'A' && ptr[i] <= 'Z') //stores occurrences of uppercase letters
{
index = ptr[i] - 'A';// subtract 'A' to get index 0-25
upLetters[index]++;//add one
}
if (ptr[i] >= 'a' && ptr[i] <= 'z') //stores occurrences of lowercase letters
{
index = ptr[i] - 'a';//subtract 'a' to get index 0-25
loLetters[index]++;//add one
}
i++;//next character in ptr
}
printf("Number of Occurrences of Uppercase letters\n\n");
for (i = 0; i < 26; i++)//loop through 0 to 25
{
if (upLetters[i] > 0)
{
printf("%c : \t%d\n", (char)(i + 'A'), upLetters[i]);
// add 'A' to go from an index back to a character
}
}
printf("\n");
printf("Number of Occurrences of Lowercase letters\n\n");
for (i = 0; i < 26; i++)
{
if (loLetters[i] > 0)
{
printf("%c : \t%d\n", (char)(i + 'a'), loLetters[i]);
// add 'a' to go back from an index to a character
}
}
printf("\n");
}
void findWords(char *point)
{
int i = 0;
int k = 0;
int count = 0;
int j = 0;
int space = 0;
int c = 0;
char *word[50];
char word1[50][100];
char* delim = "{ } . , ( ) ";
for (i = 0; i< sizeof(point); i++) //counts # of spaces between words
{
if ((point[i] == ' ') || (point[i] == ',') || (point[i] == '.'))
{
space++;
}
}
char *words = strtok(point, delim);
for(;k <= space; k++)
{
word[k] = malloc((words+1) * sizeof(*words));
}
while (words != NULL)
{
printf("%s\n",words);
strcpy(words, word[j++]);
words = strtok(NULL, delim);
}
free(words);
}
This is because (words+1) is a pointer, and you are trying to multiply that pointer by sizeof(*words); C does not allow multiplying a pointer. Change line 100 to:
word[k] = malloc(strlen(words)+1);
This will solve your compilation problem, but you still have other problems.
You've got a couple of problems in function findWords:
Here,
for (i = 0; i< sizeof(point); i++)
sizeof(point) is the same as sizeof(char*) because point is a char* in the function findWords. This is not what you want. Use
for (i = 0; i < strlen(point); i++)
instead. But this might be slow as strlen will be called in every iteration. So I suggest
int len = strlen(point);
for (i = 0; i < len; i++)
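As a small self-contained sketch of that loop, here is the separator count written as a function. The name count_separators is an assumption made for this example; the set of separators is the one tested in the question.

```c
#include <string.h>

/* Counts spaces, commas and periods in text. The parameter is a pointer, so
 * sizeof(text) would be the size of a char * (commonly 4 or 8), not the
 * length of the string; strlen() gives the length. */
int count_separators(const char *text)
{
    size_t len = strlen(text);           /* computed once, before the loop */
    int count = 0;
    for (size_t i = 0; i < len; i++)
        if (text[i] == ' ' || text[i] == ',' || text[i] == '.')
            count++;
    return count;
}
```

Using size_t for len and i matches the return type of strlen() and avoids a signed/unsigned comparison.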
The same problem lies here too:
word[k] = malloc((words+1) * sizeof(*words));
What you are trying to do with (words+1) doesn't make sense. I think you want
word[k] = malloc( strlen(words) + 1 ); //+1 for the NUL-terminator
You got the arguments all mixed up:
strcpy(words, word[j++]);
You actually wanted
strcpy(word[j++], words);
which copies the contents of words to word[j++].
Here:
free(words);
words was never allocated memory. Since you free a pointer that has not been returned by malloc/calloc/realloc, the code exhibits Undefined Behavior. So, remove that.
You allocated memory for each element of word. So free it using
for(k = 0; k <= space; k++)
{
free(word[k]);
}
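One way to combine these fixes is to allocate each buffer only when its token is known, instead of allocating everything up front from the first token. The sketch below assumes a hypothetical helper named collect_words and a caller-supplied array of pointers; it is not the only possible layout.

```c
#include <stdlib.h>
#include <string.h>

/* Splits text in place with strtok() and stores a heap copy of each token
 * in out[]. Stores at most max tokens and returns how many were stored.
 * The caller frees each out[i]. collect_words is an illustrative name. */
size_t collect_words(char *text, const char *delim, char **out, size_t max)
{
    size_t n = 0;
    for (char *tok = strtok(text, delim); tok != NULL && n < max;
         tok = strtok(NULL, delim)) {
        out[n] = malloc(strlen(tok) + 1);   /* sized for this token, +1 for '\0' */
        if (out[n] == NULL)
            break;                          /* out of memory: stop early */
        strcpy(out[n], tok);                /* destination first, source second */
        n++;
    }
    return n;
}
```

The returned count is also the number of pointers to free, so the cleanup loop runs from 0 to n - 1 and never touches an element that was not allocated. The token pointer returned by strtok() points into text and must not be passed to free().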
Your calculation with the pointer position + 1 is wrong. To make the compilation problem go away, change line 100 to:
word[k] = malloc( 1 + strlen(words));

Resources