I am having a lot of trouble starting my project. Here are the directions:
"Complete counts.c as follows:
Read characters from standard input until EOF (the end-of-file mark) is read. Do not prompt the user to enter text - just read data as soon as the program starts.
Keep a running count of each different character encountered in the input, and keep count of the total number of characters input (excluding EOF)."
The format my professor gave me to start is:
#include <stdio.h>
int main(int argc, char *argv[]) {
return 0;
}
In addition to how to start the problem, I'm also confused as to why the two parameters are given in the main function when nothing is going to be passed to it. Help would be much appreciated! Thank you!
Slightly tricky to see what you're having trouble with here. The title doesn't form a complete question, nor is there one in the body; and they seem to be hinting at entirely different questions.
The assignment tells you to read characters - not store them. You could have a loop that only reads them one at a time if you wish (for instance, using getchar). You're also asked to report counts of each character, which would make sense to store in an array. Given that this is of "each different character", the simplest way would be to size the array for all possible characters (limits.h defines UCHAR_MAX, which would help with this). Remember to initialize the array if it's automatically allocated (the default for function local variables).
Regarding the arguments to main, this program does not need them, and the C standard does allow you to leave them out. They're likely included as this is a template of a basic C program, to make it usable if command line arguments will be used also.
For more reference code you might want to compare the word count utility (wc); the character counting you want is the basis of a frequency analysis or histogram.
This should give you a start on what you need to learn to complete your task.
First declare a character input buffer of sufficient size to hold the input:
char input[SIZE];
Use fgets() to read characters from stdin:
if (fgets(input, sizeof input, stdin) == NULL) {
; // handle EOF
}
The input array now holds your string of characters, in which you can count occurrences. I did not understand what you mean by counting different characters, but you can traverse the whole array and count the characters you need.
First, luckily for you, we do not need dynamic memory allocation at all here, because we are not asked to store the input strings. We simply need to record how many times each ASCII code is input during the program run; as there is a constant and finite number of those, we can store the counts in a fixed-size array.
The functions we are looking at here (assuming we are using standard libs) are as follows:
getchar, to read chars from standard input
printf, to print the outputs back to stdout
The constructs we will need are:
do {} while, to loop around until a condition is false
The rest just needs simple mathematical operators, here is a short example which basically shows a sample solution:
#include <stdio.h>
int main(int argc, char *argv[])
{
    /* Create an array with entries for each char,
     * then init it to zeros */
    int AsciiCounts[256] = {0};
    int ReadChar;
    int TotalChars = 0;
    int Iterator = 0;
    do
    {
        /* Read a char from stdin */
        ReadChar = getchar();
        /* EOF is not a character: do not count it or use it as an index */
        if (ReadChar != EOF)
        {
            /* Increment the entry for its code in the array */
            AsciiCounts[ReadChar]++;
            TotalChars++;
        }
    } while (ReadChar != EOF);
    /* Stop once we have read an EOF */
    do
    {
        /* Print each char code and how many times it occurred */
        printf("Char code %#x occurred %d times\n", Iterator, AsciiCounts[Iterator]);
        Iterator++;
    } while (Iterator <= 255);
    /* Print the total length read in */
    printf("Total chars read (excluding EOF): %d\n", TotalChars);
    return 0;
}
That should achieve the basic goal. A few extension exercises would likely benefit your understanding of C. First, you could convert the second do-while loop to a for loop, which is more appropriate for the situation but which I avoided for simplicity's sake. Second, you could add a condition so the output phase skips codes that never occurred. Finally, it could be interesting to check which chars are printable and print the character itself instead of its hex code.
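The extension exercises above could be sketched roughly as follows. This is only one possible shape: format_count_line and print_counts are hypothetical helper names, not part of the assignment, and the exact wording of the output lines is an assumption.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: formats one report line into out.
 * Returns 0 and writes nothing when the code never occurred. */
int format_count_line(char *out, size_t out_size, int code, int count)
{
    if (count == 0)
        return 0;                       /* exercise 2: skip unused codes */
    if (isprint(code))                  /* exercise 3: show printable chars */
        return snprintf(out, out_size, "Char '%c' occurred %d times\n",
                        code, count);
    return snprintf(out, out_size, "Char code %#x occurred %d times\n",
                    code, count);
}

/* Hypothetical helper: the output phase written as a for loop (exercise 1). */
void print_counts(const int counts[256])
{
    char line[64];
    for (int code = 0; code < 256; code++) {
        if (format_count_line(line, sizeof line, code, counts[code]) > 0)
            fputs(line, stdout);
    }
}
```

In the sample solution, print_counts(AsciiCounts) would take the place of the second do-while loop.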
On the second part of the question: those parameters appear in main even though they are ignored because of the standard way C programs are started under most OSes. The environment passes the number of command-line arguments and the value of each argument, respectively, in case the program wishes to check them. If you really will not use them, you can declare int main(void) instead; however, this means slightly more work later if you choose to add command-line options, and it has no performance benefit.
In a group assignment, we have to read input which consists only of the (lowercase) letters a-z and has a maximum length of 255 characters. I wanted to check with getchar and ASCII, but my partner found a solution using sprintf and scanf, which I do not totally understand:
#include <stdio.h>
int main() {
int result = 0;
unsigned int length = 255;
char input[257];
result = readInput(input, &length);
return 1;
}
int readInput(char *output, int *length) {
char format_pattern[15];
sprintf(format_pattern, "%%%u[^\n]%%n", *length);
printf("Max allowed length is %d\n",*length);
scanf(format_pattern, output, length);
printf("Input length is %d\n",*length);
return 1;
}
Output:
Max allowed length is 255
testinput
Input length is 9
How does the format pattern in sprintf and scanf work?
Especially the three %%% before u and the two %% before n - I tried changing this to %u[^\n]%n because I thought the doubled %% only escapes the '%', but then I get an error, so they have to be there.
The only things I figured out are:
the %n can read the characters before it, e.g.:
int var;
printf("Some Text before%n and after",&var);
printf("characters before percent n = %d\n", var);
Output: Some Text before and aftercharacters before percent n = 16
but in my big example above there isn't a pointer variable where the amount of text could be stored, since *length is for %%%u?
The [^\n] means something like "read till new line"
I googled a lot but did not find a similar example - could somebody help me?
Assuming x is a (large enough) char array
sprintf(x, "%%"); // put a single % (and '\0') in x
sprintf(x, "%u", 8); // put 8 (and '\0') in x
sprintf(x, "%%%u", 8); // put %8 (and '\0') in x
Your partner is using sprintf() to dynamically create a format string for scanf(). This is a bit tricky, because the printf and scanf functions use mostly the same formatting language. The reason for the dynamic format creation appears to be to insert a configurable field width into the format. That's clever, but overcomplicated and wholly unnecessary.
How does the format pattern in sprintf and scanf work? Especially the three %%% before u and the two %% before n
%%%u is actually two directives. The first %% causes printf to emit a single % character, and that leaves %u, which I think you recognize. Similarly, the %%n is one directive (%%, as described above) plus a literal 'n' character, which is emitted as-is. The overall sprintf call prepares a format string something like "%255[^\n]%n".
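To see the result of that sprintf call in isolation, here is a minimal sketch. build_format is a hypothetical helper name, and snprintf is used instead of sprintf only to bound the write; the format string itself is the one from the question.

```c
#include <stdio.h>

/* Hypothetical helper: writes a scanf format such as "%255[^\n]%n" into out.
 * Returns the number of characters the format needs (as snprintf does). */
int build_format(char *out, size_t out_size, unsigned width)
{
    /* "%%" emits a literal '%', "%u" emits the width, the rest is copied */
    return snprintf(out, out_size, "%%%u[^\n]%%n", width);
}
```

With a width of 255 the buffer ends up holding the ten characters of "%255[^\n]%n", so the 15-byte format_pattern array in the question is large enough.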
How, then, does the resulting format work with scanf? You should probably read its documentation, available all over, such as here. That one happens to be for the GLIBC implementation, but you're not using anything non-standard, so it should explain everything you need to know about your particular format. Specifically, the 255 that you go to such trouble to introduce is a maximum field width for a field consisting of characters in the "scanset" described by [^\n] -- every character except newline. The %n does not consume any input; instead, it stores the number of characters so far read by that scanf call.
For your purposes, I see absolutely no reason to generate the scanf format dynamically. You are given a maximum field width, so you might as well use it in a literal format string. Dynamic generation only becomes relevant if the maximum field width must be chosen at runtime: unlike printf(), scanf() cannot take a field width from an argument, because in a scanf format a * means assignment suppression.
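With a fixed maximum, the literal-format version might look like the following sketch. read_field is a hypothetical name, and it takes a FILE * (pass stdin for the original use case) purely so that it can be tried on any stream.

```c
#include <stdio.h>

/* Hypothetical helper: reads up to 255 non-newline characters from in.
 * out must have room for at least 256 bytes (255 characters plus '\0').
 * Returns the number of characters stored, or -1 if nothing was read. */
int read_field(FILE *in, char *out)
{
    int length = 0;

    /* %n stores the count of characters consumed so far; it is not
     * included in the return value of fscanf */
    if (fscanf(in, "%255[^\n]%n", out, &length) != 1)
        return -1;
    return length;
}
```

Note that the terminating newline is left unread in the stream, exactly as with the dynamically built format.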
Bonus hint: be careful about where to use *length and where just length in your printf and scanf calls. Inside readInput, length is a pointer: where you want to output its value, you must pass the value (*length); where you want scanf to modify the value, you must pass the address where the new value should be stored (length). Also note that main declares length as unsigned int while readInput takes an int *, and the two types should agree.
I cannot understand when the putchar line is executed and how it helps to reverse the input lines. If EOF occurs, the return statement gets executed, but what happens after that line?
#include <stdio.h>
void fun_reverse(void);
int main(void) {
    fun_reverse();
    return 0;
}
void fun_reverse(void) {
    int ch;
    ch = getchar();
    if (ch == EOF)
        return;
    fun_reverse();
    putchar(ch); /* runs only after the inner call has returned */
}
Every time fun_reverse calls itself, it doesn't print the input char immediately; it just asks for another one, piling up the calls (and creating as many local variables, each storing one char) until EOF is reached.
When EOF is encountered, fun_reverse returns without calling fun_reverse again, ending the chain, making all callers return and eventually print the results.
The fact that the calls have been piled on due to recursion has the effect of reversing the output, because unpiling them is done the other way round.
This technique is often used to convert a number to a string without any extra buffer. Converting a number to a string yields the "wrong" end of the number first, so you have to hold on to the digits until the whole number has been processed. An algorithm similar to the one above stores the digits in the call stack and prints them in readable order.
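As a rough illustration of that number-printing idea, here is a minimal sketch. print_number is a hypothetical name, and the FILE * parameter is an assumption made so the output can go to any stream (pass stdout to print).

```c
#include <stdio.h>

/* Hypothetical helper: writes n in decimal to out, most significant
 * digit first, using only the call stack as storage. */
void print_number(unsigned n, FILE *out)
{
    if (n >= 10)
        print_number(n / 10, out);      /* deal with the leading digits first */
    fputc('0' + (int)(n % 10), out);    /* runs after the inner calls return */
}
```

Each call holds one digit, the least significant one in the outermost call, so the digits come out in reading order as the calls unwind, just as the characters do in fun_reverse.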
Though your question has already been answered, I would suggest you read about 'head recursion' and 'tail recursion'.
Have a look at the accepted answer to this question.
I'm making a program that counts the number of words contained within a file. My code works for certain test cases with files that have fewer than a certain number of words/characters... But when I test it with, let's say, a word like:
"loooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooong", (this is not random--this is an actual test case I'm required to check), it gives me this error:
*** stack smashing detected ***: ./wcount.out terminated
Abort (core dumped)
I know what the error means and that I have to implement some sort of malloc line of code to be able to allocate the right amount of memory, but I can't figure out where in my function to put it or how to do it:
int NumberOfWords(char* argv[1]) {
FILE* inFile = NULL;
char temp_word[20]; <----------------------I think this is the problem
int num_words_in_file;
int words_read = 0;
inFile = fopen(argv[1], "r");
while (!feof(inFile)) {
fscanf(inFile, "%s", temp_word);
words_read++;
}
num_words_in_file = words_read;
printf("There are %d word(s).\n", num_words_in_file - 1);
fclose(inFile);
return num_words_in_file;
}
As you've correctly identified by rendering your source code invalid (future tip: /* put your arrows in comments */), the problem is that temp_word only has enough room for 20 characters (one of which must be a terminal null character).
In addition, you should check the return value of fopen. I'll leave that as an exercise for you. I've answered this question in other questions (such as this one), but I don't think just shoving code into your face will help you.
In this case, I think it may pay to better analyse the problem you have, to see if you actually need to store words to count them. As we define a word (the kind read by scanf("%s", ...)) as a sequence of non-whitespace characters followed by a sequence of (zero or more) whitespace characters, we can see that a counting program such as yours needs to follow this procedure:
Read as much whitespace as possible
Read as much non-whitespace as possible
Increment the "word" counter if all was successful
You don't need to store the non-whitespace any more than you do the whitespace, because once you've read it you'll never revisit it. Thus you could write this as two loops embedded into one: one loop which reads as much whitespace as possible, another which reads non-whitespace, followed by your incrementation and then the outer loop repeats the whole lot... until EOF is reached...
This will be best achieved using the %*s directive, which tells scanf-related functions not to try to store the word. For example:
size_t word_count = 0;
/* fscanf returns EOF once no further word can be read */
while (fscanf(inFile, "%*s") != EOF) {
    word_count++;
}
You are limited by the size of your array. A simple solution would be to increase the size of your array. But you are always susceptible to stack smashing if someone enters a long word.
A word is delimited by spaces.
You can simply store a counter variable initialized to zero, and a variable that records the current char that you are looking at. Every time fgetc(inFile) returns a space right after a non-space character, a word has ended and you increment the counter (a final word that ends at EOF needs the same treatment).
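A character-by-character counter along these lines might look like the sketch below. count_words is a hypothetical name, and this variant counts a word when it starts rather than when it ends, which avoids the special case at EOF; isspace is assumed to be an acceptable definition of "space".

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: counts whitespace-separated words in a stream
 * without storing any of them, so word length is irrelevant. */
size_t count_words(FILE *in)
{
    size_t words = 0;
    int in_word = 0;    /* 1 while the previous char was non-whitespace */
    int c;

    while ((c = fgetc(in)) != EOF) {
        if (isspace(c)) {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;    /* first character of a new word */
            words++;
        }
    }
    return words;
}
```

Because nothing is copied into a fixed-size array, a very long word cannot overflow anything.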
In your current code you simply want to count the words. Therefore you are not interested in the words themselves. You can suppress the assignment with the optional * character:
fscanf(inFile, "%*s");
The Scoop:
I am creating a method that runs through a lengthy file in chunks: using pthreads. I am calling fread() to read the file in this sort of fashion:
fread( thread_data[i].buffer, 1, 50, f )
/*
thread_data is a data structure for each thread (hence i)
buffer is in thread_data as an array of length 50
*/
I am then directly calling a print statement to see what each thread is doing, as a weird pattern was showing up in some of the parts that I was printing. Namely, my print statement would look something like this:
this is suppose to be 50 characters, but it is only a fewgD4
That D4 directly above is what I have my question on. Every thread that I make, at the end of the string, we are printing D4, and in this case, followed by a g. Other times, it is followed by a d, and most commonly a �. Now, I did read the wikipedia page on this character, which states:
replacement character used to replace an unknown or unrepresentable character
My question:
What kind of an error am I running into? Why is the end of each read statement containing unknown characters, especially the weird gD4 guy?
Aside:
I am trying to make a function in C that utilizes pthreads to find the frequency of each word in a file, in case anyone was wondering. These weird characters were showing up in my list, which is something that I find slightly unpleasant. Finally, don't bother linking me to the obligatory Unicode article; I am already aware of it, and the characters are not outside of what I am working with.
The strings you are printing out are not null-terminated — fread() does not null-terminate its output, it simply reads in as many raw bytes as you asked for (or fewer). So when you print out your buffer, your print function is walking past the end of the data and printing out whatever garbage memory comes after the buffer, which in your case just happens to be gD4.
You need to either explicitly null-terminate your buffer; or, if your print function supports it, tell it exactly how many characters to print. Either way, you need to save the return value from fread to know how many characters you read. For example:
size_t n = fread(thread_data[i].buffer, 1, 50, f);
if (n < 50 && ferror(f)) /* Handle error */ ;
// Explicitly add a null terminator -- the buffer must hold at least 51 bytes for this!
thread_data[i].buffer[n] = 0;
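The other option mentioned above, telling the print function exactly how many characters to print, can be sketched with the "%.*s" conversion of the printf family. print_chunk is a hypothetical name, and the cast assumes the chunk length fits in an int, which holds for 50-byte chunks.

```c
#include <stdio.h>

/* Hypothetical helper: prints exactly n bytes of buf to out.
 * buf does not need to be null-terminated. */
void print_chunk(FILE *out, const char *buf, size_t n)
{
    /* the precision given via '*' limits how many bytes %s will read */
    fprintf(out, "%.*s", (int)n, buf);
}
```

Called with the value returned by fread as n, this leaves a 50-byte buffer as it is and never reads past the data that was actually read.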
This code comes from K&R. I have read it several times, but it still seems to escape my grasp.
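#include <stdio.h>
#define BUFSIZE 100
char buf[BUFSIZE];   /* buffer for ungetch */
int bufp = 0;        /* next free position in buf */
int getch(void)      /* get a (possibly pushed-back) character */
{
    return (bufp > 0) ? buf[--bufp] : getchar();
}
void ungetch(int c)  /* push a character back on input */
{
    if (bufp >= BUFSIZE)
        printf("too many characters");
    else
        buf[bufp++] = c;
}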
The purpose of these two functions, so K&R says, is to prevent a program from reading too much input. i.e. without this code a function might not be able to determine it has read enough data without first reading too much. But I don't understand how it works.
For example, consider getch().
As far as I can see, these are the steps it takes:
check if bufp is greater than 0.
if so then return the char value of buf[--bufp].
else return getchar().
I would like to ask a more specific question, but I literally don't know how this code achieves what it is intended to achieve, so my question is: What is (a) the purpose and (b) the reasoning of this code?
Thanks in advance.
NOTE: For any K&R fans, this code can be found on page 79 (depending on your edition, I suppose)
(a) The purpose of this code is to be able to read a character and then "un-read" it if it turns out you accidentally read a character too many (with a max. of 100 characters to be "un-read"). This is useful in parsers with lookahead.
(b) getch reads from buf if it has contents, indicated by bufp>0. If buf is empty, it calls getchar. Note that it uses buf as a stack: it reads it from right-to-left.
ungetch pushes a character onto the stack buf after doing a check to see if the stack isn't full.
The code is not really for "reading too much input"; instead, it is there so you can put back characters you have already read.
For example, you read one character with getch, see if it is a letter, put it back with ungetch and read all letters in a loop. This is a way of predicting what the next character will be.
This block of code is intended for use by programs that make decisions based on what they read from the stream. Sometimes such programs need to look at a few characters from the stream without actually consuming the input. For example, if your input looks like abcde12xy789 and you must split it into abcde, 12, xy, 789 (i.e. separate groups of consecutive letters from groups of consecutive digits), you do not know that you have reached the end of a group of letters until you see a digit. However, you do not want to consume that digit at the time you see it: all you need is to know that the group of letters is ending; you need a way to "put back" that digit. An ungetch comes in handy in this situation: once you see a digit after a group of letters, you put the digit back by calling ungetch. Your next iteration will pick that digit back up through the same getch mechanism, sparing you the need to preserve the character that you read but did not consume.
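The splitting example above could be sketched as follows. Two things here are assumptions made for illustration: getch reads from a string (set through the hypothetical set_source) instead of calling getchar, so the sketch can be tried without typing input, and next_group is a hypothetical name. Anything that is not a digit is treated as part of a letter group.

```c
#include <ctype.h>
#include <stdio.h>

#define BUFSIZE 100

static const char *source = "";  /* hypothetical input, stands in for stdin */
static char buf[BUFSIZE];        /* pushback stack, as in the K&R code */
static int bufp = 0;

/* Hypothetical helper: selects the string to read from. */
void set_source(const char *s) { source = s; bufp = 0; }

static int getch(void)
{
    if (bufp > 0)
        return buf[--bufp];
    return *source ? (unsigned char)*source++ : EOF;
}

static void ungetch(int c)
{
    if (bufp >= BUFSIZE)
        printf("too many characters");
    else
        buf[bufp++] = c;
}

/* Reads one group of digits or one group of non-digits into out.
 * max must be at least 2. Returns the group length, 0 at end of input. */
int next_group(char *out, int max)
{
    int c = getch();
    int n = 0;
    int want_digit;

    if (c == EOF)
        return 0;
    want_digit = isdigit(c) != 0;
    while (c != EOF && (isdigit(c) != 0) == want_digit && n < max - 1) {
        out[n++] = (char)c;
        c = getch();
    }
    if (c != EOF)
        ungetch(c);   /* read one character too many: put it back */
    out[n] = '\0';
    return n;
}
```

Calling next_group repeatedly after set_source("abcde12xy789") yields abcde, 12, xy and 789 in turn; the character that ends each group is pushed back and becomes the first character of the next one.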
1. The idea shown here can also be described as a very primitive I/O stack management system; the code gives the implementation of the functions getch() and ungetch().
2. To go a step further, suppose you want to design an operating system: how would you handle the memory that stores all the keystrokes?
This is solved by the above code snippet. An extension of this concept is used in file handling, especially in editing files. In that case, instead of getchar(), which takes input from standard input, a file is used as the source of input.
I have a problem with the code given in the question. Using a buffer in the form of a stack is not correct here: when more than one extra character is read and pushed onto the stack, later processing (getting input from the buffer) is affected in an undesired way.
This is because, during that later processing, the buffer (stack) hands back the extra input in reverse order (the last extra character pushed is returned first).
Because of the LIFO (last in, first out) property of a stack, the buffer in this code should be a queue, which works better when there is more than one extra character.
This mistake in the code confused me; in the end the buffer must be a queue, as shown below.
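#include <stdio.h>
#define BUFSIZE 100
char buf[BUFSIZE];
int bufr = 0;   /* index of the next character to read */
int buff = 0;   /* index of the next free slot */
int bufn = 0;   /* number of characters currently queued */
int getch(void)
{
    int c;
    if (bufn == 0)
        return getchar();
    c = buf[bufr];
    bufr = (bufr + 1) % BUFSIZE;
    bufn--;
    return c;
}
void ungetch(int c)
{
    if (bufn >= BUFSIZE) {
        printf("too many characters");
        return;
    }
    buf[buff] = c;
    buff = (buff + 1) % BUFSIZE;
    bufn++;
}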