Kernighan and Ritchie C exercise 1-16 - c

I tried to implement a solution for this exercise from K&R's C book. I wanted to ask here whether it can be considered a legal "solution", since I only modified main without changing anything inside the external functions.
Revise the main routine of the longest-line program so it will
correctly print the length of arbitrarily long input lines, and as much
as possible of the text.
#include <stdio.h>
#define MAXLINE 2 ////
int get_line1(char s[], int lim)
{
int c, i;
for (i = 0; i < lim - 1 && ((c = getchar()) != EOF) && c != '\n'; i++) {
s[i] = c;
}
if (c == '\n') {
s[i] = c;
i++;
}
s[i] = '\0';
return i;
}
int main()
{
int len;
int max = MAXLINE;
char line[MAXLINE];
int tot = 0;
int text_l = 0;
while ((len = get_line1(line, max)) > 0) {
if (line[len - 1] != '\n') {
tot = tot + len;
}
if (line[1] == '\n' || line[0] == '\n') {
printf("%d\n", tot + 1);
text_l = text_l + (tot + 1);
tot = 0;
}
}
printf("%d\n", text_l);
}
The idea is to set the maximum length of the string held in the array line to 2.
For a string such as abcdef\n, the array line will be ab. Since the last element of the array is not \n (so the line we are considering is not over), we save the length so far and repeat the cycle. We then get the array cd, then ef, and finally the array containing just \n. Then the second if condition is executed, since the first element of this array is \n, and we print the total length obtained from the previous additions. We add 1 in order to also count the newline character \n. This also works for strings of odd length: with abcdefg\n the process goes on until we reach g\n, and the sum is computed correctly.
Outside the loop then we print the total amount of text.
Is this a correct way to do the exercise?

The exercise says to “Revise the main routine,” but you altered the definition of MAXLINE, which is outside of main, so that is not a valid solution.
Also, your code does not have the copy or getline routines of the original. Your get_line1 appears to be identical to getline except for the name. However, a correct solution would use identical source code except for the code inside main.
Additionally, the exercise says to print “as much as possible of the text.” That is unclearly stated, but I expect it means to keep a buffer of MAXLINE characters (with MAXLINE at its original value of 1000) and use it to print the first MAXLINE−1 characters of the longest line.
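One way to read that suggestion is sketched below. It is an interpretation of the exercise, not the book's official solution. The names get_line and longest_line are hypothetical: get_line is the book's getline with a FILE * parameter added (an assumption, so it can be tried on any stream), and longest_line holds the logic that would go into the revised main. It remembers the first chunk of each logical line and keeps adding chunk lengths until a chunk ends in \n.

```c
#include <stdio.h>
#include <string.h>

#define MAXLINE 1000

/* K&R-style getline, except that it reads from `in` rather than stdin
   (an assumption made so the sketch works on any stream). */
int get_line(FILE *in, char s[], int lim)
{
    int c = EOF, i;

    for (i = 0; i < lim - 1 && (c = getc(in)) != EOF && c != '\n'; ++i)
        s[i] = (char)c;
    if (c == '\n')
        s[i++] = (char)c;
    s[i] = '\0';
    return i;
}

/* Plays the role of the revised main: returns the true length of the
   longest line and leaves its first lim-1 characters in longest[].
   lim must not exceed MAXLINE, and longest[] needs room for lim chars. */
long longest_line(FILE *in, char longest[], int lim)
{
    char line[MAXLINE], head[MAXLINE];
    long max = 0, total = 0;
    int len;

    longest[0] = '\0';
    while ((len = get_line(in, line, lim)) > 0) {
        if (total == 0)
            strcpy(head, line);          /* first chunk of a new logical line */
        total += len;
        if (line[len - 1] == '\n') {     /* the logical line is complete */
            if (total > max) {
                max = total;
                strcpy(longest, head);
            }
            total = 0;
        }
    }
    if (total > max) {                   /* last line had no trailing '\n' */
        max = total;
        strcpy(longest, head);
    }
    return max;
}
```

With MAXLINE left at 1000, calling longest_line(stdin, longest, MAXLINE) from main and printing the returned length followed by longest would behave like the original program, but with the true length reported for oversized lines.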

Related

Newb: Assignment: C Prog: saving getchar() input into 2D array

Newb: Learn C - using The C Programming Language - 2nd ed.
I've scoured stackoverflow - can't find a solution.
I'm opening a stream using getchar() (it's what the manual has me learning). The stream ends with EOF (<ctrl> z). The program looks for a newline and should put each line into a 2D array (row, width). Each row ought to hold an entire line of input.
Code runs, stream works. Doesn't appear to be populating the array though. Code is far from polished, just trying to get it to work before I polish. I used exit() simply as a short cut trying to get this to work. Any ideas? I added a counter which prints at the end r which ought to indicate number of rows created in array... it's zero... making me think array is not being built. The fate of the universe depends on you!!!
#include <stdlib.h>
#include <stdio.h>
#include <stdbool.h>
#define ROW 1000
#define WIDTH 5000
char arr[ROW][WIDTH] = {0}; // array to store char strings of sentences - each sentence stored in an individual row
int r = 0; // initiallizing rows
int i = 0;
char endprog(char arr[ROW][WIDTH]);
int main () {
int c; // int variable to accept input - will be copied into arraw
bool end = true; // count the character for each line to iterate through
while (end){
for ( i = 0; i < (c = getchar()) != EOF && c != '\n'; ++i){
if (c == EOF)
endprog(arr);
arr[r][i] = c;
if (c == '\n'){
arr[r][i] = c;
i++;
r++; // new line deterted row incremented to next row - ready for new line
}
arr[r][i] = '\0'; // copying input into array at position (row, i)
}
}
}
char endprog(char arr[ROW][WIDTH]){
int j;
for (j = 0; j <= r; j++){ // iterating through array to print rows of input strings
printf("R: %d", r);
printf("%s\n", arr[j]);
}
exit(0);
}
OUTPUT
Stream is working.
New lines are accepted
Help me Obi-Wan Kenobi
You're my only hope....
^Z
R: 0
--------------------------------
Process exited after 120.8 seconds with return value 0
Press any key to continue . . .
I modified your code a bit, I hope this is what you're looking for.
The comments in the code explain almost all the changes I've made to your code.
#include <stdio.h>
/* you don't need stdbool.h and stdlib.h for this code*/
#define ROW 1000
#define WIDTH 5000
char arr[ROW][WIDTH] = {0}; // array to store char strings of sentences - each sentence stored in an individual row
int r = 0; // initiallizing rows
int i = 0;
void endprog(char arr[ROW][WIDTH]); //the function is void since it's not returning anything
int main () {
int c; // int variable to accept input - will be copied into array
while (1){ //you don't need the for-loop and also the for-loop you've written has wrong logic
c=getchar();
if (c == EOF){
endprog(arr);
break;
}
else{
arr[r][i] = c;
i++; //increasing the column by 1
if (c == '\n'){
arr[r][i] = '\0'; //since c is '\n', it means the line is completed, so write the NUL character at the end
i=0;
r++; //increasing the row by 1 and changing the column to 0 since the next character has to be written in the first (0) index of the next row
}
}
}
}
void endprog(char arr[ROW][WIDTH]){
int j;
for (j = 0; j<r; j++){
printf("R: %d ", j); //you're supposed to print j, not r
printf("%s\n", arr[j]);
}
}
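The version above still trusts the input to fit in the array. A minimal bounds-checked sketch follows, under a few assumptions that are not in the original: the reader is a hypothetical function read_lines that takes a FILE *, the array is smaller than 1000 x 5000 to keep the example light, and the newline is dropped instead of stored.

```c
#include <stdio.h>

#define ROW 100      /* smaller than the original 1000, to keep the sketch light */
#define WIDTH 256    /* likewise smaller than the original 5000 */

/* Reads lines from `in` into arr, one line per row, newline stripped.
   Returns the number of rows filled. Characters beyond WIDTH-1 on a
   line are dropped, and reading stops once max_rows rows are full. */
int read_lines(FILE *in, char arr[][WIDTH], int max_rows)
{
    int c, r = 0, i = 0;

    while (r < max_rows && (c = getc(in)) != EOF) {
        if (c == '\n') {
            arr[r][i] = '\0';        /* terminate the finished row */
            r++;
            i = 0;
        } else if (i < WIDTH - 1) {
            arr[r][i++] = (char)c;   /* always leaves room for '\0' */
        }
    }
    if (i > 0 && r < max_rows) {     /* final line without a trailing '\n' */
        arr[r][i] = '\0';
        r++;
    }
    return r;
}
```

Calling read_lines(stdin, arr, ROW) and then printing arr[0] through arr[n-1] gives the same kind of output as endprog, including a last line that ends at EOF without a newline.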

Program to get an indefinite number of strings in C and print them out

As part of an assignment, I am supposed to write a small program that accepts an indefinite number of strings, and then print them out.
This program compiles, with the following warning:
desafio1.c:24:16: warning: format not a string literal and no format arguments [-Wformat-security]
printf(words[i]);
It prints the following characters on the screen: �����8 ���#Rl�. I guess the strings I entered with getchar were not properly terminated with the null byte, so it prints garbage. The logic of the program is to run a while loop until I press the Enter key (\n); whenever a space is found, the characters read so far form a word that is stored in the array words. Why am I running into problems if, in the else branch, once a space is found, I terminate the word with word[i] = '\0' and store the result in the array words?
#include <stdio.h>
#include <string.h>
int main()
{
char words[100][100];
int i,c;
char word[1000];
while((c = getchar()) != '\n')
{
if (c != ' '){
word[i++] = c;
c = getchar();
}
else{
word[i] = '\0';
words[i] == word;
}
}
int num = sizeof(words) / sizeof(words[0]);
for (i = 0; i < num; i++){
printf(words[i]);
}
return 0;
}
Here are some fixes to your code. As a pointer (as mentioned in other comments), make sure to enable compiler warnings, which will help you find 90% of the issues you had. (gcc -Wall)
#include <stdio.h>
#include <string.h>
int main() {
char words[100][100];
int i = 0;
int j = 0;
int c;
char word[1000];
while((c = getchar()) != '\n') {
if (c != ' '){
word[i++] = c;
} else {
word[i] = '\0';
strcpy(words[j++], word);
i = 0;
}
}
word[i] = '\0';
strcpy(words[j++], word);
for (i = 0; i < j; i++) {
printf("%s\n", words[i]);
}
return 0;
}
i was uninitialized, so its value was undefined. It should start at 0. It also needs to be reset to 0 after each word so it starts at the beginning.
The second c = getchar() was unnecessary, as this is done in every iteration of the loop. This was causing your code to skip every other letter.
You need two counters, one for the place in the word, and one for the number of words read in. That's what j is.
== is for comparison, not assignment. Either way, strcpy() was needed here since you are filling out an array.
Rather than looping through all 100 elements of the array, just loop through the words that have actually been filled (up to j).
The last word input was ignored by your code, since it ends with a \n, not a space. That's what the lines after the while are for.
When using printf(), the first argument should always be a format string ("%s"), followed by the values to print.
Of course, there are other things as well that I didn't fix (such as the disagreement between the 1000-character word and the 100-character words). If I were you, I'd think about what to do if the user entered, for some reason, more than 1000 characters in a word, or more than 100 words. Your logic will need to be modified in these cases to prevent illegal memory accesses (outside the bounds of the arrays).
As a reminder, this program does not accept an indefinite number of words, but only up to 100. You may need to rethink your solution as a result.
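If the limit really has to go away, one option is to grow the storage with realloc as the input arrives. The sketch below is only an illustration of that idea, not part of the fix above: read_words is a hypothetical helper that reads one line from a FILE * and returns a heap-allocated array of heap-allocated words.

```c
#include <stdio.h>
#include <stdlib.h>

/* Reads space-separated words from one line of `in`.
   Returns a malloc'd array of malloc'd strings and stores the word
   count in *count; returns NULL if the line has no words or an
   allocation fails. The caller frees each word and then the array. */
char **read_words(FILE *in, size_t *count)
{
    char **words = NULL;
    char *word = NULL;
    size_t nwords = 0, len = 0, cap = 0;
    int c;

    *count = 0;
    do {
        c = getc(in);
        if (c != ' ' && c != '\n' && c != EOF) {
            if (len + 1 >= cap) {                 /* keep room for the '\0' */
                size_t ncap = cap ? cap * 2 : 16;
                char *tmp = realloc(word, ncap);
                if (tmp == NULL)
                    goto fail;
                word = tmp;
                cap = ncap;
            }
            word[len++] = (char)c;
        } else if (len > 0) {                     /* a separator ends the word */
            char **tmp = realloc(words, (nwords + 1) * sizeof *words);
            if (tmp == NULL)
                goto fail;
            words = tmp;
            word[len] = '\0';
            words[nwords++] = word;
            word = NULL;                          /* the array owns it now */
            len = cap = 0;
        }
    } while (c != '\n' && c != EOF);

    *count = nwords;
    return words;

fail:
    free(word);
    while (nwords > 0)
        free(words[--nwords]);
    free(words);
    return NULL;
}
```

The only remaining limit is available memory. Runs of several spaces produce no empty words, because a separator is ignored when no characters have been collected.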

K&R exercise 1-22 Hints

The exercise states as follows
Write a program to "fold" long input lines into two or more shorter lines after the last non-blank character that occurs before the n-th column of input. Make sure your program does something intelligent with very long lines, and if there are no blanks or tabs before the specified column.
My question is on how to implement the foldStrings function. I have tried some things but none of them worked.
Can you give me some hints on how to do this, but please don't write the solution down I want to figure it myself.
I have written some code but I am stuck at the folding part
#include <stdio.h>
#include <string.h>
int getline(char s[], int lim);
void emptystring(char s[]);
void foldStrings(char s[],int len);
int main(){
int len ;
char line[255];
while((len = getline(line,255))>0)
{
foldStrings(line,len);
}
return 0;
}
int getline(char s[],int lim)
{
int c , i ;
for( i = 0 ; i < lim-1 && ( c = getchar()) != EOF && c !='\n';++i)
s[i] = c;
if ( c == '\n')
{
s[i] = c;
++i;
}
return i;
}
void foldStrings(char s[], int len)
{
}
void emptystring(char s[])
{
int i;
int len = strlen(s);
for( i = 0 ; i < len ; ++i){
s[i] = 0 ;
}
}
I am stuck at the foldStrings function.
P.S I am using the empty string function to print the lines, so print a segmented line and then empty it, fill it up again and print it and so on.
Update
I have tried doing the foldStrings, here is one of my implementations
void foldStrings(char s[], int len)
{
int i ;
char temp[255];
for(i = 1;i < len-1 ;++i)
{
if( i % 16 != 0)
{
temp[i-1] = s[i-1];
}
else if(i%16 == 0)
{
printf("%s",temp)
emptystring(temp);
}
}
}
When getline() is done, s is not necessarily null character terminated.
// for( i = 0 ; i < lim-1 && ...
for( i = 0 ; i < lim-2 && ( c = getchar()) != EOF && c !='\n';++i)
...
s[i] = '\0'; // add
return i;
Same for foldStrings(). missing null character.
temp[i-1] = s[i-1];
temp[i] = '\0'; // add
Other problems may exist
This exercise is a little challenging.
First, you don't need a line buffer at all: there are only two kinds of input characters (blank and non-blank), and a non-blank character always has to be printed, so your main loop can be something like
while((c = getchar()) != EOF) {
...
}
(much in the style of the exercises in K&R)
Note that when blank characters are input, you only have to count them, and reset the counter when \n is input.
As you asked not to reveal the final solution, I'll respect that, but the trick is to count characters as you read the line, outputting (and counting) non-blanks and only counting blanks without outputting them. If the character read is a blank and you have passed the limit, you fold the line (emit a \n).
EDIT 1
In my first attempt to write the code I discovered that a blank-to-non-blank transition that crosses the maximum line length forces the line to be broken at the first blank character, which means remembering all the non-blank characters read since then. For that I need storage of at most the maximum output line length: if more non-blank characters than that accumulate, the line must certainly be broken before that point.
My first attempt will store the number of blank characters read so far, followed by a buffer of the maximum output line length (not the input line length, which is unbounded, as specified in the problem). The possible states will be: (to follow in the next edit of this answer)

Program runs too slowly with large input - C

The goal for this program is for it to count the number of instances that two consecutive letters are identical and print this number for every test case. The input can be up to 1,000,000 characters long (thus the size of the char array to hold the input). The website which has the coding challenge on it, however, states that the program times out at a 2s run-time. My question is, how can this program be optimized to process the data faster? Does the issue stem from the large char array?
Also: I get a compiler warning "assignment makes integer from pointer without a cast" for the line str[1000000] = "" What does this mean and how should it be handled instead?
Input:
number of test cases
strings of capital A's and B's
Output:
Number of duplicate letters next to each other for each test case, each on a new line.
Code:
#include <stdio.h>
#include <string.h>
#include <math.h>
#include <stdlib.h>
int main() {
int n, c, a, results[10] = {};
char str[1000000];
scanf("%d", &n);
for (c = 0; c < n; c++) {
str[1000000] = "";
scanf("%s", str);
for (a = 0; a < (strlen(str)-1); a++) {
if (str[a] == str[a+1]) { results[c] += 1; }
}
}
for (c = 0; c < n; c++) {
printf("%d\n", results[c]);
}
return 0;
}
You don't need the line
str[1000000] = "";
scanf() adds a null terminator when it parses the input and writes it to str. This line is also writing beyond the end of the array, since the last element of the array is str[999999].
The reason you're getting the warning is that the type of str[1000000] is char, but a string literal used as a value is converted to a pointer (char *).
To speed up the program, take the call to strlen() out of the loop.
size_t len = strlen(str)-1;
for (a = 0; a < len; a++) {
...
}
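A variation on the same idea, shown here only as a sketch: the loop can stop at the terminating '\0' itself, so strlen() is never called and an empty string cannot make the length wrap around. The function name count_adjacent_pairs is hypothetical.

```c
#include <stddef.h>

/* Counts positions where a character equals the one right after it.
   Stops at the terminating '\0', so strlen() is never called. */
size_t count_adjacent_pairs(const char *s)
{
    size_t count = 0;

    if (s[0] == '\0')
        return 0;                    /* nothing to compare in an empty string */
    for (size_t i = 0; s[i + 1] != '\0'; i++) {
        if (s[i] == s[i + 1])
            count++;
    }
    return count;
}
```

In the original program the body of the outer loop would then reduce to results[c] = count_adjacent_pairs(str); after the scanf call.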
str[1000000] = "";
This does not do what you think it does, and you're overflowing the buffer, which results in undefined behaviour. Valid indices run from 0 up to, but not including, sizeof(str). So you either add one to the 1000000 when declaring the array or use 999999 to access the last element instead. To get rid of the compiler warning and produce cleaner code use:
str[1000000] = '\0';
Or
str[999999] = '\0';
Depending on what you did to fix it.
As to optimizing, you should look at the assembly and go from there.
count the number of instances that two consecutive letters are identical and print this number for every test case
For efficiency, the code needs a new approach, as suggested by @John Bollinger & @molbdnilo
void ReportPairs(const char *str, size_t n) {
int previous = EOF;
unsigned long repeat = 0;
for (size_t i=0; i<n; i++) {
int ch = (unsigned char) str[i];
if (isalpha(ch) && ch == previous) {
repeat++;
}
previous = ch;
}
printf("Pair count %lu\n", repeat);
}
char *testcase1 = "test1122a33";
ReportPairs(testcase1, strlen(testcase1));
or directly from input and "each test case, each on a new line."
int ReportPairs2(FILE *inf) {
int previous = EOF;
unsigned long repeat = 0;
int ch;
while ((ch = fgetc(inf)) != '\n') {
if (ch == EOF) return ch;
if (isalpha(ch) && ch == previous) {
repeat++;
}
previous = ch;
}
printf("Pair count %lu\n", repeat);
return ch;
}
while (ReportPairs2(stdin) != EOF);
Unclear how OP wants to count "AAAA" as 2 or 3. This code counts it as 3.
One way to dramatically improve the run-time for your code is to limit the number of times you read from stdin (basically, process input in bigger chunks). You can do this in a number of ways, but probably one of the most efficient would be with fread. Even reading in 8-byte chunks can provide a big improvement over reading a character at a time. One example of such an implementation, considering capital letters [A-Z] only, would be:
#include <stdio.h>
#define RSIZE 8
int main (void) {
char qword[RSIZE] = {0};
char last = 0;
size_t i = 0;
size_t nchr = 0;
size_t dcount = 0;
/* read up to 8-bytes at a time */
while ((nchr = fread (qword, sizeof *qword, RSIZE, stdin)))
{ /* compare each byte to byte before */
for (i = 1; i < nchr && qword[i] && qword[i] != '\n'; i++)
{ /* if not [A-Z] continue, else compare */
if (qword[i-1] < 'A' || qword[i-1] > 'Z') continue;
if (i == 1 && last == qword[i-1]) dcount++;
if (qword[i-1] == qword[i]) dcount++;
}
last = qword[i-1]; /* save last for comparison w/next */
}
printf ("\n sequential duplicated characters [A-Z] : %zu\n\n",
dcount);
return 0;
}
Output/Time with 868789 chars
$ time ./bin/find_dup_digits <dat/d434839c-d-input-d4340a6.txt
sequential duplicated characters [A-Z] : 434893
real 0m0.024s
user 0m0.017s
sys 0m0.005s
Note: the string was actually a string of '0's and '1's run with a modified test of if (qword[i-1] < '0' || qword[i-1] > '9') continue; rather than the test for [A-Z]...continue, but your results with 'A's and 'B's should be virtually identical. 1000000 would still be significantly under .1 seconds. You can play with the RSIZE value to see if there is any benefit to reading a larger (suggested 'power of 2') size of characters. (note: this counts AAAA as 3) Hope this helps.
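For experimenting with larger reads, here is a separate sketch of the same chunked approach. It is not the code timed above: count_dups and the 4096-byte CHUNK are assumptions, and it counts over the whole stream instead of a single line. The previous byte is carried in last across reads, so a pair split between two chunks is still counted, and a newline between lines prevents pairs from spanning two test cases.

```c
#include <stdio.h>

#define CHUNK 4096   /* assumed buffer size; any positive value works */

/* Counts adjacent equal capital letters in the whole stream, reading it
   in CHUNK-byte blocks. `last` carries the previous byte across block
   boundaries, so a pair split between two reads is still counted. */
size_t count_dups(FILE *in)
{
    char buf[CHUNK];
    size_t n, count = 0;
    int last = EOF;                  /* no previous character yet */

    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        for (size_t i = 0; i < n; i++) {
            int ch = (unsigned char)buf[i];
            if (ch >= 'A' && ch <= 'Z' && ch == last)
                count++;
            last = ch;
        }
    }
    return count;
}
```

Like the code above, this counts AAAA as 3. Reporting one number per test case would need the count to be printed and reset whenever ch is '\n'.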

C Program won't remove comments that take up the whole line

So I'm working through the K&R C book and there was a bug in my code that I simply cannot figure out.
The program is supposed to remove all the comments from a C program. Obviously I'm just using stdin
#include <stdio.h>
int getaline (char s[], int lim);
#define MAXLINE 1000 //maximum number of characters to put into string[]
#define OUTOFCOMMENT 0
#define INASINGLECOMMENT 1
#define INMULTICOMMENT 2
int main(void)
{
int i;
int isInComment;
char string[MAXLINE];
getaline(string, MAXLINE);
for (i = 0; string[i] != EOF; ++i) {
//finds whether loop is in a comment or not
if (string[i] == '/') {
if (string[i+1] == '/')
isInComment = INASINGLECOMMENT;
if (string[i+1] == '*')
isInComment = INMULTICOMMENT;
}
//fixes the problem of print messing up after the comment
if (isInComment == INASINGLECOMMENT && string[i] == '\0')
printf("\n");
//if the line is done, restates all the variables
if (string[i] == '\0') {
getaline(string, MAXLINE);
i = 0;
if (isInComment != INMULTICOMMENT)
isInComment = OUTOFCOMMENT;
}
//prints current character in loop
if(isInComment == OUTOFCOMMENT && string[i] != EOF)
printf("%c", string[i]);
//checks to see of multiline comment is over
if(string[i] == '*' && string[i+1] == '/' ) {
++i;
isInComment = OUTOFCOMMENT;
}
}
return 0;
}
So this works great except for one problem. Whenever a line starts with a comment, it prints that comment.
So for instance, if I had a line that was simply
//this is a comment
without anything before the comment begins, it will print that comment even though it's not supposed to.
I thought I was making good progress, but this bug has really been holding me up. I hope this isn't some super easy thing I've missed.
EDIT: Forgot the getaline function
//puts line into s[], returns length of that line
int getaline(char s[], int lim)
{
int c, i;
for (i = 0; i < lim-1 && (c = getchar()) != '\n'; ++i)
s[i] = c;
if (c == '\n') {
s[i] = c;
++i;
}
s[i] = '\0';
return i;
}
There are many problems in your code:
isInComment is not initialized in function main.
As pointed out by others, string[i] != EOF is wrong. You need to test for end of file more precisely, especially for files that do not end with a linefeed. This test only works if the char type is signed and EOF is a valid signed char value. It will nonetheless mistakenly stop on a stray \377 character, which is legal in a string or in a comment.
When you detect the end of line, you read another line and reset i to 0, but i will be incremented by the for loop before you test again for single line comment... hence the bug!
You do not handle special cases such as /* // */ or // /*
You do not handle strings. This is not a comment: "/*", nor this: '//'
You do not handle \ at end of line (escaped linefeed). This can be used to extend single line comments, strings, etc. There are more subtle cases related to \ handling and if you really want completeness, you should handle trigraphs too.
Your implementation has a limit for line size, this is not needed.
The problem you are assigned is a bit tricky. Instead of reading and parsing lines, read one character at a time and implement a state machine to parse escaped linefeeds, strings, and both comment styles. The code is not too difficult if you do it right with this method.
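A minimal sketch of such a state machine is given below, under stated assumptions: strip_comments is a hypothetical name, it reads from and writes to FILE * streams, each /* */ comment is replaced by a single space, and escaped linefeeds and trigraphs are deliberately left out.

```c
#include <stdio.h>

enum state { CODE, SLASH, LINE_COMMENT, BLOCK_COMMENT, BLOCK_STAR,
             QUOTED, QUOTED_ESC };

/* Copies `in` to `out` without C comments. String and character
   literals are copied verbatim. Escaped linefeeds and trigraphs are
   not handled in this sketch. */
void strip_comments(FILE *in, FILE *out)
{
    enum state st = CODE;
    int c, quote = 0;

    while ((c = getc(in)) != EOF) {
        switch (st) {
        case CODE:
            if (c == '/') {
                st = SLASH;                  /* might start a comment */
            } else {
                if (c == '"' || c == '\'') { quote = c; st = QUOTED; }
                putc(c, out);
            }
            break;
        case SLASH:
            if (c == '/') {
                st = LINE_COMMENT;
            } else if (c == '*') {
                st = BLOCK_COMMENT;
            } else {
                putc('/', out);              /* it was a plain '/' */
                putc(c, out);
                if (c == '"' || c == '\'') { quote = c; st = QUOTED; }
                else st = CODE;
            }
            break;
        case LINE_COMMENT:
            if (c == '\n') { putc(c, out); st = CODE; }
            break;
        case BLOCK_COMMENT:
            if (c == '*') st = BLOCK_STAR;
            break;
        case BLOCK_STAR:
            if (c == '/') { putc(' ', out); st = CODE; }
            else if (c != '*') st = BLOCK_COMMENT;
            break;
        case QUOTED:
            putc(c, out);
            if (c == '\\') st = QUOTED_ESC;  /* next char is escaped */
            else if (c == quote) st = CODE;
            break;
        case QUOTED_ESC:
            putc(c, out);
            st = QUOTED;
            break;
        }
    }
    if (st == SLASH)
        putc('/', out);                      /* input ended on a lone '/' */
}
```

Because the input is handled one character at a time, there is no line-length limit and no index to reset, so a comment at the very start of a line is treated like any other.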
if (string[i] == '\0') {
getaline(string, MAXLINE);
i = 0;
if (isInComment != INMULTICOMMENT)
isInComment = OUTOFCOMMENT;
}
When you start a new line, you initialize i to 0. But then in the next iteration:
for (i = 0; string[i] != EOF; ++i)
i will be incremented, so you'll begin the new line with index 1. Therefore there is a bug when the line begins with //.
You can see that it solves the problem if you write instead:
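if (string[i] == '\0') {
getaline(string, MAXLINE);
if (isInComment != INMULTICOMMENT)
isInComment = OUTOFCOMMENT;
i = -1; // the for loop's ++i brings this back to 0
continue; // skip the rest of the body, which would otherwise read string[-1]
}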
though it's usually considered as bad style to modify for loop indices inside the loop. You may redesign your implementation in a more readable way.

Resources