C programming: function that keeps track of word lengths & counts?

I'm having trouble articulating what the function is supposed to do, so I think I will just show you guys an example. Say my program opens and scans a text file, which contains the following:
"The cat chased after the rooster to no avail."
Basically the function I'm trying to write is supposed to print out how many 1 letter words there are (if any), how many 2 letter words there are, how many 3 letter words, etc.
"
Length Count
2 2
3 3
5 2
6 1
7 1
"
Here's my attempt:
int word_length(FILE *fp, char file[80], int count)//count is how many total words there are; I already found this in main()
{
printf("Length\n");
int i = 0, j = 0;
while(j < count)
{
for(i = 0; i < count; i++)
{
if(strlen(file[i] = i)
printf("%d\n", i);
}//I intended for the for loop to print the lengths
++i;
printf("Count\n");
while()//How do you print the counts in this case?
}
}
I think the way I set up the loops causes words of the same length to be printed twice...so it'd look something like this, which is wrong. So how should I set up the loops?
"Length Count
2 1
2 2
"

This sounds like homework, so I will not write code for you, but I will give you some clues.
1. To hold several values you will need an array. The element with index i will contain the counter for words of length i.
2. Find a way to identify the boundaries of words (space, period, beginning of line, etc.). Then count the number of characters between boundaries.
3. Increase the relevant counter (see tip 1). Repeat.
Some details. You actually want to map one thing to another: the length of a word to the number of such words. For mapping there is a special data type, usually called a hash (table) or dictionary. But in your case an array works perfectly as a map because your keys are uniform and contiguous (1, 2, ... up to some maximum word length).

You can't use a single int to count all of that. You need an array and then in it at position 0 you keep track of how many 1 letter words, at position 1 you accumulate 2 letter words and so on.
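For readers who are past the homework stage, the array-as-map idea from the answers can be sketched roughly as follows. This is only one possible reading: the names count_word_lengths, print_word_lengths and MAX_WORD_LEN are made up for illustration, a "word" is assumed to be a run of letters, and index i holds the count for length i (as in the first answer, not the zero-based layout of the second).

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

#define MAX_WORD_LEN 64 /* assumed cap; longer words are counted in the last slot */

/* Fill counts[i] with the number of words in text that are i letters long. */
void count_word_lengths(const char *text, int counts[MAX_WORD_LEN + 1])
{
    assert(text != NULL && counts != NULL);
    for (int i = 0; i <= MAX_WORD_LEN; i++)
        counts[i] = 0;

    int len = 0; /* letters seen so far in the current word */
    for (const char *p = text;; p++) {
        if (isalpha((unsigned char)*p)) {
            len++;
            continue;
        }
        /* Any non-letter (space, period, end of text) is a word boundary. */
        if (len > 0) {
            counts[len > MAX_WORD_LEN ? MAX_WORD_LEN : len]++;
            len = 0;
        }
        if (*p == '\0')
            break;
    }
}

/* Print only the lengths that actually occurred. */
void print_word_lengths(const int counts[MAX_WORD_LEN + 1])
{
    printf("Length Count\n");
    for (int i = 1; i <= MAX_WORD_LEN; i++)
        if (counts[i] > 0)
            printf("%d %d\n", i, counts[i]);
}
```

Calling count_word_lengths on the sentence from the question and then print_word_lengths on the result should give the table the asker wanted. For a file, the same boundary logic could be applied to characters read one at a time with fgetc.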


Making a character array rotate its cells left/right n times

I'm totally new here but I heard a lot about this site and now that I've been accepted for a 7 months software development 'bootcamp' I'm sharpening my C knowledge for an upcoming test.
I've been assigned a question on a test that I've passed already, but I did not finish that question and it bothers me quite a lot.
The question was a task to write a program in C that moves a character (char) array's cells by 1 to the left (it doesn't quite matter in which direction for me, but the question specified left). And I also took upon myself NOT to use a temporary array/stack or any other structure to hold the entire array data during execution.
So a 'string' or array of chars containing '0' '1' '2' 'A' 'B' 'C' will become
'1' '2' 'A' 'B' 'C' '0' after using the function once.
Writing this was no problem, I believe I ended up with something similar to:
void ArrayCharMoveLeft(char arr[], int arrsize, int times) {
int i;
for (i = 0; i <= arrsize ; i++) {
ArraySwap2CellsChar(arr, i, i+1);
}
}
As you can see the function is somewhat modular since it allows to input how many times the cells need to move or shift to the left. I did not implement it, but that was the idea.
As far as I know there are 3 ways to make this:
Loop ArrayCharMoveLeft times times. This feels instinctively inefficient.
Use recursion in ArrayCharMoveLeft. This should resemble the first solution, but I'm not 100% sure on how to implement this.
This is the way I'm trying to figure out: No loop within loop, no recursion, no temporary array, the program will know how to move the cells x times to the left/right without any issues.
The problem is that after swapping say N times of cells in the array, the remaining array size - times are sometimes not organized. For example:
Using ArrayCharMoveLeft with 3 as times with our given array mentioned above will yield
ABC021 instead of the expected value of ABC012.
I've run the following function for this:
int i;
char* lastcell;
if (!(times % arrsize))
{
printf("Nothing to move!\n");
return;
}
times = times % arrsize;
// Input checking: in case the user inputs more than the array size, reduce times to the remainder modulo the array size
for (i = 0; i < arrsize-times; i++) {
printf("I = %d ", i);
PrintArray(arr, arrsize);
ArraySwap2CellsChar(arr, i, i+times);
}
As you can see, the for loop runs from 0 to array size - times. If this function is used with, say, an array containing 14 chars, then times = 5 makes the loop run from 0 to 8, so the last 5 cells (indices 9 - 13) are NOT in order (but the rest are).
The worst thing about this is that the remaining cells always maintain the sequence, but at a different position. Meaning instead of 0123 they could be 3012 or 2301... etc.
I've run different arrays with different times values and didn't find a particular pattern such as "if remaining cells = 3 then use ArrayCharMoveLeft on remaining cells with times = 1".
It always seems to be 1 out of 2 options: the remaining cells are in order, or they are shifted by different amounts. It seems to be something similar to this:
times shift+direction to align
1 0
2 0
3 0
4 1R
5 3R
6 5R
7 3R
8 1R
The numbers change with different times values and arrays. Anyone got an idea for this?
Even if you use recursion or loops within loops, I'd like to hear a possible solution. The only firm rule is not to use a temporary array.
Thanks in advance!
If irrespective of efficiency or simplicity for the purpose of studying you want to use only exchanges of two array elements with ArraySwap2CellsChar, you can keep your loop with some adjustment. As you noted, the given for (i = 0; i < arrsize-times; i++) loop leaves the last times elements out of place. In order to correctly place all elements, the loop condition has to be i < arrsize-1 (one less suffices because if every element but the last is correct, the last one must be right, too). Of course when i runs nearly up to arrsize, i+times can't be kept as the other swap index; instead, the correct index j of the element which is to be put at index i has to be computed. This computation turns out somewhat tricky, due to the element having been swapped already from its original place. Here's a modified variant of your loop:
for (i = 0; i < arrsize-1; i++)
{
printf("i = %d ", i);
int j = i+times;
while (arrsize <= j) j %= arrsize, j += (i-j+times-1)/times*times;
printf("j = %d ", j);
PrintArray(arr, arrsize);
ArraySwap2CellsChar(arr, i, j);
}
Use standard library functions memcpy, memmove, etc as they are very optimized for your platform.
Use the correct type for sizes - size_t not int
char *ArrayCharMoveLeft(char *arr, const size_t arrsize, size_t ntimes)
{
ntimes %= arrsize;
if(ntimes)
{
char temp[ntimes];
memcpy(temp, arr, ntimes);
memmove(arr, arr + ntimes, arrsize - ntimes);
memcpy(arr + arrsize - ntimes, temp, ntimes);
}
return arr;
}
But you want it without the temporary array (more memory efficient, very bad performance-wise):
char *ArrayCharMoveLeft(char *arr, size_t arrsize, size_t ntimes)
{
ntimes %= arrsize;
while(ntimes--)
{
char temp = arr[0];
memmove(arr, arr + 1, arrsize - 1);
arr[arrsize -1] = temp;
}
return arr;
}
https://godbolt.org/z/od68dKTWq
https://godbolt.org/z/noah9zdYY
Disclaimer: I'm not sure if it's common to share full working code here or not, since this is literally my first question asked here, so I'll refrain from doing so, assuming the idea is answering specific questions and not providing an example solution up for grabs (which might defeat the purpose of studying and exploring C). This argument is backed by the fact that this specific task is derived from a programming test used by a programming course, and its purpose is to filter out applicants who aren't fit for an intense 7 months of training in software development. If you still wish to see my code, message me privately.
So, with a great amount of help from @Armali, I'm happy to announce the question is answered! Together we came up with a function that takes an array of characters in C (a string) and, without using any previously written libraries (such as string.h) or even a temporary array, rotates all the cells in the array N times to the left.
Example: using ArrayCharMoveLeft() on the following array with N = 5:
Original array: 0123456789ABCDEF
Updated array: 56789ABCDEF01234
As you can see the first cell (0) is now the sixth cell (5), the 2nd cell is the 7th cell and so on. So each cell was moved to the left 5 times. The first 5 cells 'overflow' to the end of the array and now appear as the Last 5 cells, while maintaining their order.
The function works with various array lengths and N values.
This is not any sort of achievement, but rather an attempt to execute the task with as few variables as possible (only 4 ints besides the char array, also counting the sub-function used to swap the cells).
It was achieved using a nested loop, so by no means is it efficient runtime-wise, just memory-wise, while still using only self-coded functions, with no external libraries (except stdio.h).
Refer to Armali's posted solution, it should get you the answer for this question.
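For completeness, here is a different technique from the ones posted in this thread: the well-known "three reversals" rotation. It is not Armali's or the asker's method, just a sketch that also meets the "no temporary array" rule (one char of scratch space) and needs no nested loop. The function names rotate_left_in_place and reverse_range are invented for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Reverse the half-open range arr[lo..hi) in place with pairwise swaps. */
void reverse_range(char *arr, size_t lo, size_t hi)
{
    while (lo + 1 < hi) {
        hi--;
        char tmp = arr[lo];
        arr[lo] = arr[hi];
        arr[hi] = tmp;
        lo++;
    }
}

/* Rotate arr (arrsize cells) to the left by times cells.
   Reversing the first block, then the second block, then the whole
   array leaves both blocks in their original internal order but swapped. */
void rotate_left_in_place(char *arr, size_t arrsize, size_t times)
{
    assert(arr != NULL);
    if (arrsize == 0)
        return;
    times %= arrsize;                   /* full turns change nothing */
    reverse_range(arr, 0, times);       /* 012ABC -> 210ABC (times = 3) */
    reverse_range(arr, times, arrsize); /* 210ABC -> 210CBA */
    reverse_range(arr, 0, arrsize);     /* 210CBA -> ABC012 */
}
```

Each cell is touched at most twice, so the running time grows linearly with the array size regardless of times. Rotating right by k can be expressed as rotating left by arrsize - k.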

Looping through all character combinations with increasing number of elements

What I want to achieve:
I have a function where I want to loop through all possible combinations of printable ascii-characters, starting with a single character, then two characters, then three etc.
The part that makes this difficult for me is that I want this to work for as many characters as I can (leave it overnight).
For the record: I know that abc really is 97 98 99, so a numeric representation is fine if that's easier.
This works for few characters:
I could create a list of all possible combinations for n characters, and just loop through it, but that would require a huge amount of memory already when n = 4. This approach is literally impossible for n > 5 (at least on a normal desktop computer).
In the script below, all I do is increment a counter for each combination. My real function does more advanced stuff.
If I had unlimited memory I could do (thanks to Luis Mendo):
counter = 0;
some_function = @(x) 1;
number_of_characters = 1;
max_time = 60;
max_number_of_characters = 8;
tic;
while toc < max_time && number_of_characters < max_number_of_characters
number_of_characters = number_of_characters + 1;
vectors = [repmat({' ':'~'}, 1, number_of_characters)];
n = numel(vectors);
combs = cell(1,n);
[combs{end:-1:1}] = ndgrid(vectors{end:-1:1});
combs = cat(n+1, combs{:});
combs = reshape(combs, [], n);
for ii = 1:size(combs, 1)
counter = counter + some_function(combs(ii, :));
end
end
Now, I want to loop through as many combinations as possible in a certain amount of time, 5 seconds, 10 seconds, 2 minutes, 30 minutes, so I'm hoping to create a function that's only limited by the available time, and uses only some reasonable amount of memory.
Attempts I've made (and failed at) for more characters:
I've considered pre-computing the combinations for two or three letters using one of the approaches above, and use a loop only for the last characters. This would not require much memory, since it's only one (relatively small) array, plus one or more additional characters that gets looped through.
I manage to scale this up to 4 characters, but beyond that I start getting into trouble.
I've tried to use an iterator that just counts upwards. Every time I hit any(mod(number_of_ascii .^ 1:n, iterator) == 0) I increment the m'th character by one. So, the last character just repeats the cycle !"# ... ~, and every time it hits tilde, the second character increments. Every time the second character hits tilde, the third character increments etc.
Do you have any suggestions for how I can solve this?
It looks like you're basically trying to count in base-26 (or base 52 if you need CAPS). Each number in that base corresponds to a specific string of characters. For example,
0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F,G,H,I,J,K,L,M,N,O,P,10,11,12,...
Here, capital A through P are just the symbols used as digits 10 through 25 of the base-26 system. The sequence above simply represents these strings of characters:
a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z,ba,bb,bc,...
Then, you can simply do this:
symbols = ['0','1','2','3','4','5','6','7','8','9','A','B','C','D','E',...
'F','G','H','I','J','K','L','M','N','O','P']
characters = ['a','b','c','d','e','f','g','h','i','j','k','l',...
'm','n','o','p','q','r','s','t','u','v','w','x','y','z']
count=0;
while(true)
str_base26 = dec2base(count,26);
[~, idx] = ismember(str_base26, symbols); % char-by-char lookup of each base-26 digit
actual_str = characters(idx);
count=count+1;
end
Of course, this does not produce strings that begin with leading 0's (that is, strings starting with 'a'). But that should be pretty simple to handle.
You were not far with your idea of just getting an iterator that just counts upward.
What you need with this idea is a map from the integers to ASCII characters. As StewieGriffin suggested, you'd just need to work in base 95 (94 characters plus whitespace).
Why whitespace : You need something that will be mapped to 0 and be equivalent to it. Whitespace is the perfect candidate. You'd then just skip the strings containing any whitespace. If you don't do that and start directly at !, you'll not be able to represent strings like !! or !ab.
First let's define a function that will map (1:1) integers to string :
function [outstring,toskip]=dec2ASCII(m)
out=[];
while m~=0
out=[mod(m,95) out];
m=(m-out(1))/95;
end
if any(out==0)
toskip=1;
else
toskip=0;
end
outstring=char(out+32);
end
And then in your main script :
counter=1;
some_function = @(x) 1;
max_time = 60;
max_number_of_characters = 8;
currString='';
tic;
while numel(currString)<=max_number_of_characters&&toc<max_time
[currString,toskip]=dec2ASCII(counter);
if ~toskip
some_function(currString);
end
counter=counter+1;
end
Some random outputs of the dec2ASCII function :
dec2ASCII(47)
ans =
O
dec2ASCII(145273)
ans =
0)2
In terms of performance I can't really elaborate as I don't know what you want to do with your some_function. The only thing I can say is that the running time of dec2ASCII is around 2*10^(-5) s
Side note : iterating like this will be very limited in terms of speed. With the function some_function doing nothing, you'd just be able to cycle through 4 characters in around 40 minutes, and 5 characters would already take up to 64 hours. Maybe you'd want to reduce the amount of stuff you want to pass through the function you iterate on.
This code, though, is easily parallelizable, so if you want to check more combinations, I'd suggest trying to do it in a parallel manner.
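The "counting upward" idea from the question can also be done without any integer-to-string conversion by treating the string itself as an odometer: bump the last character, and on overflow reset it and carry to the left. Below is a rough sketch in C rather than MATLAB (most of the code on this page is C). The function name next_string and the alphabet bounds are assumptions for illustration; the alphabet is ' ' through '~', matching the ' ':'~' range in the question.

```c
#include <assert.h>
#include <stddef.h>

#define FIRST_CHAR ' ' /* assumed alphabet: the 95 printable ASCII characters */
#define LAST_CHAR '~'

/* Advance buf (current length *len, capacity cap including the NUL) to the
   next string in odometer order. Returns 1 on success, or 0 once every
   string of up to cap - 1 characters has been produced. */
int next_string(char *buf, size_t *len, size_t cap)
{
    assert(buf != NULL && len != NULL && cap >= 2);
    size_t i = *len;
    while (i > 0) {
        i--;
        if (buf[i] < LAST_CHAR) {
            buf[i]++; /* no carry needed */
            return 1;
        }
        buf[i] = FIRST_CHAR; /* wrap this position and carry to the left */
    }
    /* Every position wrapped (or the string was empty): grow by one. */
    if (*len + 1 >= cap)
        return 0;
    buf[*len] = FIRST_CHAR;
    (*len)++;
    buf[*len] = '\0';
    return 1;
}
```

Starting from an empty buffer, repeated calls yield every one-character string, then every two-character string, and so on, using only the buffer itself as state. A time limit could be added by checking time(NULL) or clock() every few thousand iterations of the calling loop.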

Filter out a list of strings

Given a list of strings such as [boo,koo,kool]
You want to extract the characters that occur in all of the strings and, of those, keep only the ones that occur the same number of times in each string, so in the case above you would return oo.
In my approach I was thinking of first making a struct for all unique letters in the first string, keeping a count of them, and then comparing against every other string. I think it might be overkill in terms of run time. Can anyone suggest a better approach?
You already have a struct of sorts to use: the type 'char', which (when signed and 8 bits wide) holds values from -128 to 127. Perhaps an array of ints for each word would do the trick, where you index the array with the char you have found in the string (cast to unsigned char so the index is never negative).
#define NUMWORDS 10 // assuming 10 words in list
int CountOfChars[NUMWORDS][256];
for each string n in list
{
for each char c in string n
{
CountOfChars[n][c]++;
}
}
THEN analyze each CountOfChars array to find the counts >=2
You can use a loop like this:
char SetFlag;
for each char c in the system // a - z, A - Z
{
    SetFlag = 0;
    if (CountOfChars[0][c]>1)
    {
        SetFlag = c; // assume c qualifies until some word disproves it
        for each string n in list except the first word
        {
            if (CountOfChars[n][c]<=1)
            {
                SetFlag = 0; // this word does not contain c twice
                break;
            }
        }
    }
    if (SetFlag)
        printf("%c",SetFlag); // prints a char found at least twice in all words
}
I have left this as pseudocode as it sounds like homework; I hope this gets you started.
You can solve this problem using a trie.
For example consider your given scenario:
+1 +2
b k
\ |
\ |
o +3 ----|
| | ==> this has contiguous max frequency.
o +3 ----|
|
l +1
Your array has only three strings, and the count of the contiguous run is 3 as well. Therefore the answer will be oo.
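As a concrete counterpart to the counting approach discussed above, here is a small sketch in C. It rests on one reading of the question, namely that a character is kept only if it appears the same nonzero number of times in every string, and is then emitted that many times. The function name common_equal_chars and its signature are invented for this example.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Write to out every character whose count is identical and nonzero in all
   words, repeated that many times. Returns the number of chars written. */
size_t common_equal_chars(const char *words[], size_t nwords,
                          char *out, size_t outcap)
{
    assert(words != NULL && out != NULL && outcap > 0);
    int first[UCHAR_MAX + 1] = {0}; /* counts taken from the first word */
    int same[UCHAR_MAX + 1];        /* 1 while every word agrees with the first */
    for (int c = 0; c <= UCHAR_MAX; c++)
        same[c] = 1;

    for (size_t w = 0; w < nwords; w++) {
        int cur[UCHAR_MAX + 1] = {0};
        /* Index with unsigned char so the subscript is never negative. */
        for (const unsigned char *p = (const unsigned char *)words[w]; *p; p++)
            cur[*p]++;
        for (int c = 0; c <= UCHAR_MAX; c++) {
            if (w == 0)
                first[c] = cur[c];
            else if (cur[c] != first[c])
                same[c] = 0;
        }
    }

    size_t written = 0;
    for (int c = 1; c <= UCHAR_MAX; c++) {
        if (!same[c] || first[c] == 0)
            continue;
        for (int k = 0; k < first[c] && written + 1 < outcap; k++)
            out[written++] = (char)c;
    }
    out[written] = '\0';
    return written;
}
```

Only two small count tables are alive at any time, so the memory use does not depend on the number of words, and each string is scanned once.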

C Storing Matrix in Array of Chars and Printing

Hey all I am trying to store a matrix in an array of chars and then print it out.
My code that I have written:
#include<stdio.h>
#include<stdlib.h>
int main() {
int i;
int j;
int row=0;
int col=0;
int temp=0;
char c;
int array[3][2] = {{}};
while((c=getchar()) !=EOF && c!=10){
if((c==getchar()) == '\n'){
array[col++][row];
break;
}
array[col][row++]=c;
}
for(i=0; i<=2; i++){
for(j=0; j<=3; j++){
printf("%c ", array[i][j]);
}
printf("\n");
}
}
Using a text file such as:
1 2 3 4
5 6 7 8
9 1 2 3
I would like to be able to print that back out to the user, however what my code outputs is:
1 2 3 4
3 4 5 6
5 6 7 8
I cannot figure out what is wrong with my code. Somehow I am off by an iteration in one of my loops, or it has something to do with not handling newlines properly. Thanks!
A few problems that I can see are:
As user3386109 mentioned in the comments, your array should be array[3][4] to match the input file.
The line array[col++][row]; does nothing but increment col, and then uselessly indexes the array and throws away the value. You can do the same thing with just col++;. However, you're not even using col at any later point in the code, so really you don't even need that. The break; all by itself does what you need. Which leads me to...
You're not populating the array like you think you are. You're incrementing col and then immediately breaking out of the loop. So how does the entire array ever get populated? Just by pure luck. As it turns out with your array declared as array[3][4], the array access array[0][4] (which isn't even technically supposed to exist) is equivalent to array[1][0]. This is because all multidimensional arrays (in C and just about any other language) are laid out in memory as flat arrays, because memory itself uses linear addressing. In C, this flattening of multidimensional arrays is done in so-called Row-major order, meaning that as you traverse the raw memory from first address to last, the corresponding multidimensional indices (i,j,k,...z, or in your case just i,j) increment in such a way that the last index will change the fastest. So, not only does col never get incremented except for right before you break out of the loop, but row never gets reset to 0, which means you're storing values in array[0][0], array[0][1], ... array[0][11], not array[0][0] .. array[0][3], array[1][0] .. array[1][3], array[2][0] .. array[2][3] as you were expecting. It was just luck that, thanks to row-major ordering, these two sets of indices were actually equivalent (and C doesn't do array bounds checking for you because it assumes you're doing it yourself).
This is just personal preference, but you will usually see arrays referenced as array[row][col], not array[col][row]. But like I said, that's just preference. If it's easier for you to visualize it as [col][row], then by all means do it that way. Just make sure you do it consistently and don't accidentally switch gears midway through your code to doing [row][col].
Your code will break and only print out part of the matrix if you accidentally put a trailing space at the end of one of your rows of numbers, because of the weird way you're checking for the end of input (doing a second getchar after each initial getchar and checking to see if the second character is \n). This method isn't wrong per se, in the sense that it will work, but it's not very robust and relies on your input data being precisely formatted and containing no trailing spaces. Anyone who has ever spent hours trying to figure out why their Makefile didn't work, only to find out that it was because they had leading spaces instead of tabs, can attest to the fact that those kinds of errors can be extremely time-consuming and frustrating to track down. Precisely formatted input data is always a good thing, but your code shouldn't break in unexpected and non-obvious ways (such as only printing out half of a matrix) when it doesn't get perfect input. Edit: It only occurred to me later on that you were actually intending to do two mutually exclusive things here: increment col for the next line of input, and break out of the loop after having (presumably) detected the end of input. You need to figure out which thing you're doing here, although thanks to item #3, your code actually (and oddly) works just by taking user3386109's advice and changing array[3][2] to array[3][4].
I can only assume you used <= 2 and <= 3 in your for loops instead of < 3 and < 4, respectively, because you prefer doing it that way. That's fine, but it generally makes for easier-to-read code if your for loop conditions match up with your array dimensions. Just speculating here, but perhaps that's why you had array[3][2] when you really meant array[3][4].
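Putting the points above together, a more forgiving reader might look something like the sketch below. It is not the asker's code with a one-line fix; it assumes a fixed 3x4 matrix of single-character entries, skips any whitespace (so trailing spaces do no harm), and computes the row and column from a running count. The names read_matrix, print_matrix, ROWS and COLS are chosen for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

#define ROWS 3
#define COLS 4

/* Read up to ROWS*COLS non-whitespace characters from in into m, filling
   it row by row. Returns the number of cells stored. */
int read_matrix(FILE *in, char m[ROWS][COLS])
{
    assert(in != NULL);
    int c;     /* int rather than char, so EOF stays distinguishable */
    int n = 0; /* cells stored so far */
    while (n < ROWS * COLS && (c = getc(in)) != EOF) {
        if (isspace(c))
            continue; /* spaces and newlines only separate entries */
        m[n / COLS][n % COLS] = (char)c; /* explicit row-major indexing */
        n++;
    }
    return n;
}

/* Print the matrix with loop bounds that match the array dimensions. */
void print_matrix(char m[ROWS][COLS])
{
    for (int i = 0; i < ROWS; i++) {
        for (int j = 0; j < COLS; j++)
            printf("%c ", m[i][j]);
        printf("\n");
    }
}
```

With the text file from the question on standard input, read_matrix(stdin, m) followed by print_matrix(m) should echo the three rows back. Every store stays inside the declared bounds, so nothing depends on how adjacent rows happen to sit in memory.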

Random Letter Distribution Limitations

I am writing a scrabble program that will randomly generate an array of 7 letters.
this is my code that generates the letters and puts them in the array and it works great.
char randomletters (char letters[8], int i){
srand((time(NULL)));
for(i=0; i<7; i++){
letters[i] = (rand() % 26 + 65);
}
return letters[8];
}
My only issue is figuring out how to limit the number of times a certain letter can appear, using the standard Scrabble distribution. For example, 'B' can only appear twice. I was thinking one way of doing it was 26 if statements that count how many of each letter there are, and if there are too many, start over? It seems like that isn't the best way of doing it, though.
Not looking for a code answer, just ideas on how to make it happen.
Thanks in advance.
char array[] = "AAAAAAAAABBCCD";
unsigned remaining = sizeof array - 1; // number of tiles, excluding the terminating '\0'
int get_a_letter(void)
{
    unsigned idx;
    int sample;
    if (!remaining) return EOF; // EOF comes from <stdio.h>; the bag is empty
    idx = urnd(remaining);
    sample = array[idx];
    array[idx] = array[--remaining]; // move the last live tile into the gap
    array[remaining] = sample; // #Note:1
    return sample;
}
urnd(xxx) is a function that should return a random value between 0 and (xxx-1), inclusive.
Update
#Note1: this statement is not necessary for drawing a random letter, but it helps for the next round: resetting remaining = sizeof array - 1; will suffice to start over. (The array will be scrambled, but all the initial letters are still present.)
Create an array of structs, each holding a letter and the number of times it can still be used. Randomly choose a number between 0 and 25; if the chosen cell has a count > 0, add the letter to the rack and decrease the letter's count. Repeat until the rack holds 7 tiles, and repeat the whole thing until fewer than 7 letters are left.
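A variation on the counting idea, sketched in C: instead of picking one of the 26 letters uniformly and retrying on an empty cell, pick a random tile among all tiles still in the bag and walk the counts to find which letter it is. That makes frequent letters proportionally more likely, like drawing from a real bag. The function name draw_tile is made up here, the counts are the usual English distribution without the two blanks, and rand() % total has a slight modulo bias that is ignored for this sketch.

```c
#include <assert.h>
#include <stdlib.h>

/* Draw one tile from the bag described by left[0..25] (tiles remaining
   for 'A'..'Z'). Returns the letter, or -1 when the bag is empty. */
int draw_tile(int left[26])
{
    assert(left != NULL);
    int total = 0;
    for (int i = 0; i < 26; i++)
        total += left[i];
    if (total == 0)
        return -1;

    int r = rand() % total; /* index of the chosen tile among those left */
    for (int i = 0; i < 26; i++) {
        if (r < left[i]) {
            left[i]--;      /* this tile can no longer be drawn */
            return 'A' + i;
        }
        r -= left[i];       /* skip past this letter's tiles */
    }
    return -1; /* not reached while total matches the counts */
}
```

A rack would then be seven calls to draw_tile on an array initialised with {9,2,2,4,12,2,3,2,9,1,1,4,2,6,8,2,1,6,4,6,4,2,2,1,2,1}, after seeding once with srand at program start rather than inside the drawing function.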
