Comparing 2 substrings in C - c

I am having trouble reading a string of characters from a file and then comparing them, for the first part of my homework on Ubuntu using C.
The program compiles fine, but it seems to get stuck in an infinite loop when it reaches the while loop in the string-comparison portion of the code. Thanks.
Also, can I get some advice on how to take multiple inputs from the terminal, so that I can compare the string from the 'bar' file against each of the substrings given after it on the command line? My output should look like:
% echo "aaab" > bar
% ./p05 bar aa B
2
1
%
This is what I have so far:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main(void /*int argc, char *argv[]*/)
{
/******* Open, Read, Close file**********/
FILE *ReadFile;
ReadFile = fopen(/*argv[1]*/"bar", "r");
if(NULL == ReadFile)
{
printf("\n file did not open \n");
return 1;
}
fseek(ReadFile, 0 , SEEK_END);
int size = ftell(ReadFile);
rewind(ReadFile);
char *content = calloc( size +1, 1);
fread(content,1,size,ReadFile);
/*fclose(ReadFile); */
printf("you made it past opening and reading file\n");
printf("your file size is %i\n",size);
/*********************************/
/******String compare and print*****/
int count =0;
const char *tmp = "Helololll";
while (content = strstr(content,"a"))
{
count++;
tmp++;
}
printf("Your count is:%i\n",count);
/***********************************/
return 0;
}

The following loop is infinite if the character 'a' occurs in content.
while (content = strstr(content, "a"))
{
count ++;
tmp ++;
}
On the first iteration it resets content to point to the first occurrence of 'a'. Later iterations never change content again: it points to "aaab", so every call to strstr finds that same first 'a'. If you replace tmp++ with content++ inside your loop, it will be closer to what you want. I would probably write this as a for loop to make it a little clearer that you are iterating.
char const * const needle = "a";
for (char *haystack=content; haystack=strstr(haystack, needle); haystack++) {
count++;
}
Incrementing haystack after each match means the part of the string left to search keeps shrinking. Eventually the needle is no longer found, strstr returns NULL, and the loop terminates.
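As a rough sketch of how the counting loop could be wrapped up for the second part of the question (several patterns given on the command line), something like the following might work. The names count_occurrences and print_counts are made up for illustration and are not from the original post. Note that this sketch is case-sensitive, whereas the sample output in the question suggests that "B" should match 'b', so that part is left open.

```c
#include <stdio.h>
#include <string.h>

/* Count occurrences of needle in haystack, allowing overlaps
 * (so "aa" occurs twice in "aaab"). An empty needle counts as 0. */
size_t count_occurrences(const char *haystack, const char *needle)
{
    size_t count = 0;

    if (*needle == '\0')
        return 0;

    /* Step one character past each match so the search always advances. */
    for (const char *p = haystack; (p = strstr(p, needle)) != NULL; p++)
        count++;

    return count;
}

/* Hypothetical driver: print one count per pattern in argv[first..argc-1]. */
void print_counts(const char *content, int argc, char *argv[], int first)
{
    for (int i = first; i < argc; i++)
        printf("%zu\n", count_occurrences(content, argv[i]));
}
```

With main declared as int main(int argc, char *argv[]), the file name would be argv[1] and the call would be print_counts(content, argc, argv, 2).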

Related

Troubles with pointers when reading from a txt file

I'm trying to print out the strings from a txt file in order.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char *argv[])
{
// Check for command line args
if (argc != 2)
{
printf("Usage: ./read infile\n");
return 1;
}
// Create buffer to read into
char buffer[7];
// Create array to store plate numbers
char *plates[8];
FILE *infile = fopen(argv[1], "r");
int idx = 0;
while (fread(buffer, 1, 7, infile) == 7)
{
char buffer2[7];
// Replace '\n' with '\0'
buffer[6] = '\0';
strcpy(buffer2, buffer);
// Save plate number in array
plates[idx] = buffer2;
idx++;
}
for (int i = 0; i < 8; i++)
{
printf("%s\n", plates[i]);
}
}
The pasted code just writes one and the same string over and over again, and I can't for the life of me figure out what I'm doing wrong. When I debug the while loop, I see that the buffer updates keep overwriting every entry in the plates array.
In this while loop
while (fread(buffer, 1, 7, infile) == 7)
{
char buffer2[7];
// Replace '\n' with '\0'
buffer[6] = '\0';
strcpy(buffer2, buffer);
// Save plate number in array
plates[idx] = buffer2;
idx++;
}
you declared a local array with automatic storage duration
char buffer2[7];
that will not be alive after exiting the loop. All elements of the array plates are set to the address of the first element of buffer2, so within the while loop they all point to the same region of memory.
After exiting the loop the pointers will be invalid.
You need to allocate character arrays dynamically and assign their addresses to the elements of the array plates.
Also note that the function fread does not read a string: it does not append a terminating '\0'. So this statement
buffer[6] = '\0';
overwrites the last character stored in the array.
Using dynamic allocation should fix your problem. You could try something like this:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char *argv[])
{
// Check for command line args
if (argc != 2)
{
printf("Usage: ./read infile\n");
return 1;
}
// Create buffer to read into
char buffer[7];
// Create array to store plate numbers
char *plates[8];
FILE *infile = fopen(argv[1], "r");
int idx = 0;
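// Stop after 8 plates so the plates array cannot overflow
while (idx < 8 && fread(buffer, 1, 7, infile) == 7)
{
// Replace '\n' with '\0'
buffer[6] = '\0';
// Save plate number in array
plates[idx] = malloc(sizeof(buffer));
strcpy(plates[idx++], buffer);
}
// Print and free only the entries that were actually read
for (int i = 0; i < idx; i++)
{
printf("%s\n", plates[i]);
free(plates[i]);
}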
}
"The pasted code just writes one and the same string over and over again, and I can't for the life of me figure out what I'm doing wrong. When I debug the while loop, I see that the buffer updates keep overwriting every entry in the plates array."
@Vlad from Moscow gave you an explanation for this:
"that will not be alive after exiting the loop. All elements of the array plates are set to the address of the first element of buffer2, so within the while loop they all point to the same region of memory."
"I'm trying to print out the strings from a txt file in order."
As noted in the comments, fread(), as used in your implementation, is not the best way to read lines from a text file.
Answering these 2 questions (at minimum the first one) will provide important values to help in declaring and initializing the right sized (and shaped) buffers for reading lines from a file...
What is the longest line in the file?
How many lines are in the file? (may be optional if not storing all lines)
The following example(s) can be accomplished knowing only the answer to the first question, but knowing the answer to the second would be useful if it was necessary for example to store all of the lines into an array of strings. (This is out of scope here as you did not list that as a requirement for your code)
Unless you are comfortable with making an assumption on the maximum line length, i.e. hard-coded...
char line[guessed_max_line_length] = {0};
...a run-time assessment to determine the length of the longest line in the file is necessary to size the buffer such that it can safely contain lines that will later be read from file. Once this assessment is done, use the length of the longest line to create a line buffer during run-time. (dynamically allocate memory):
char *line = malloc(max_length + 1);
memset(line, 0, max_length + 1);
Using these methods, (and providing the implementation linked above) your code can be simplified to the following adaptation....
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
//prototype to get max line length in file
size_t longestLine(FILE *fi);
int main(int argc, char *argv[])
{
// Check for command line args
if (argc != 2)
{
printf("Usage: ./read infile\n");
return 1;
}
FILE *infile = fopen(argv[1], "r");
if(infile)
{
size_t max_length = longestLine(infile); //see linked implementation from above
rewind(infile);//suggest adding this line to longestLine() implementation.
char *line = malloc(max_length + 1);
if(line)
{
memset(line, 0, max_length + 1);
while(fgets(line, max_length + 1, infile)) //pass the full buffer size
{
fputs(line, stdout);
//or alternatively
//printf("%s", line);
}
free(line);
}
fclose(infile);
}
return 0;
}
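The linked longestLine() implementation is not reproduced here, so the following is only a hypothetical stand-in with the same prototype. It assumes that the returned length should count the trailing '\n' of a line, and it does the rewind itself, as suggested in the comment above.

```c
#include <stdio.h>

/* Hypothetical stand-in for the linked longestLine():
 * returns the length of the longest line, counting its '\n' if present,
 * and rewinds the stream so the caller can read the file again. */
size_t longestLine(FILE *fi)
{
    size_t longest = 0;
    size_t current = 0;
    int c;

    while ((c = fgetc(fi)) != EOF) {
        current++;
        if (c == '\n') {
            if (current > longest)
                longest = current;
            current = 0;
        }
    }
    /* The last line may not end with '\n'. */
    if (current > longest)
        longest = current;

    rewind(fi);
    return longest;
}
```

With this definition, a buffer of max_length + 1 bytes is large enough for fgets to return every line in one piece, provided the same size is passed to fgets.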

Strange behaviour of printf with array when passing pointer to another function

To study for the exam, we are trying to do some exercises from past exams.
In this exercise we get a header file, and we have to create a function that reads an input file and prints to stdout only the parts of strings that do not contain digits.
(We have to pass the pointer to the string that was read back to the main function.)
We tried to do it with an array, but when printing, the first word is empty or contains strange characters. Doing a malloc allocation instead works fine.
What is also strange is that printing an empty string before everything else fixes the code.
So we don't understand why, when using an array of char, the first word is not printed correctly, although it is saved in the buffer.
Including a printf before the while loop in the main function makes the problem disappear.
Using dynamic allocation (malloc) rather than a local array fixes the print.
Iterating over the whole array and setting all the memory to 0 does not fix the problem.
So the pointer seems to be correct, since after printing an empty string the word is printed correctly, but I really cannot understand what causes the issue.
The questions are:
How is it possible that printing an empty string makes the output correct?
The array is allocated on the stack, so it is deallocated when the program exits the scope. Why is only the first word broken and not all the words?
#include "word_reader.h"
#include <ctype.h>
#include <stdlib.h>
#include <string.h>
const char * read_next_word(FILE * f) {
char buffer[WORD_MAX_LEN];
char * word = buffer;
for (int i = 0; i < WORD_MAX_LEN; ++i)
buffer[i] = 0;
//char * buffer = malloc(sizeof(char) * WORD_MAX_LEN);
int found = 0;
int c = 0;
int i = 0;
while (!found && c != EOF) {
while ((c = fgetc(f)) != EOF && isalpha(c)) {
found = 1;
buffer[i] = c;
++i;
}
buffer[i] = '\0';
}
if (found) {
return word;
//return buffer; // when use malloc
}
return 0;
}
int main(int argc, char * argv[]) {
FILE * f = fopen(argv[1], "r");
if(!f) {
perror(argv[1]);
return EXIT_FAILURE;
}
const char * word = 0;
//printf(""); // adding this line fix the problem
while ((word = read_next_word(f))) {
printf("%s\n", word);
}
fclose(f);
return 0;
}
The header file contains only the read_next_word declaration and defines WORD_MAX_LEN as 1024. (Also include
the file to read (a simple .txt file)
ciao234 44242 toro
12Tiz23 where333
WEvo23
expected result:
ciao
toro
Tiz
where
WEvo
actual result
�rǫs+)co�0�*�E�L�mзx�<�/��d�c�q
toro
Tiz
where
WEvo
The first line is always either some garbage characters or an empty line.
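One thing that stands out in the code above is that read_next_word returns a pointer to buffer, a local array whose lifetime ends when the function returns. Using that pointer afterwards is undefined behavior in C, which would be consistent with the output changing when an unrelated printf is added. A minimal sketch that keeps the same signature, assuming a static buffer is acceptable for the exercise, might look like this (WORD_MAX_LEN is defined here only so the block stands alone):

```c
#include <ctype.h>
#include <stdio.h>

#ifndef WORD_MAX_LEN
#define WORD_MAX_LEN 1024   /* value stated for the question's header */
#endif

/* Returns the next run of alphabetic characters, or NULL at end of input.
 * The result lives in a static buffer and is overwritten by the next call. */
const char *read_next_word(FILE *f)
{
    static char buffer[WORD_MAX_LEN];
    int c;
    int i = 0;

    /* Skip everything that is not a letter. */
    while ((c = fgetc(f)) != EOF && !isalpha(c))
        ;
    if (c == EOF)
        return NULL;

    /* Collect letters, always leaving room for the terminator. */
    do {
        if (i < WORD_MAX_LEN - 1)
            buffer[i++] = (char)c;
    } while ((c = fgetc(f)) != EOF && isalpha(c));

    buffer[i] = '\0';
    return buffer;
}
```

The trade-off of a static buffer is that each call overwrites the previous word, so a caller that needs to keep a word must copy it first.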

Storing values of file into array leads to weird behaviour

Let's say I've got the file
5f2
3f6
2f1
And this code (the printf should print the second numbers, i.e. 2, 6 and 1, but it doesn't):
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <ctype.h>
int main (int argc, char * argv[])
{
FILE *ptr;
char str[100];
char * token;
int a, b, i;
int arr[4];
if(argc > 1)
{
ptr = fopen(argv[1],"r");
if(ptr == NULL)
{
exit(1);
}
}
else
{
exit(1);
}
//And I'm looking to parse the numbers between the "f" so..
while(fgets(str,100,ptr) != NULL)
{
token = strstr(str,"f");
if(token != NULL)
{
a = atol(str); // first number
b = atol(token+1); // second number
arr[i] = b; // store each b value (3 of em) into this array
}
i++;
printf("Values are %d\n",arr[i]); //should print 2,6 and 1
}
}
I've tried moving the printf outside the loop, but that seems to print an even weirder result. I've seen posts about storing integers from a file into an array before, but since this involves strstr, I'm not sure the procedure is the same.
int i,j=0;
while(fgets(str,sizeof(str),file) != NULL)
{
size_t n = strlen(str);
if(n>0 && str[n-1] == '\n')
str[n-1] = '\0';
i = str[strlen(str)-1] - '0'; /* Convert the character to int */
printf("%d\n",i);// Or save it to your int array arr[j++] = i;
}
Just move to the last character as shown and print it out as an integer.
PS: fgets() keeps the newline character in the buffer; you need to strip it as shown.
You never initialize i. You then write to arr[i] (which just happens not to crash right there), increment i (to "undefined value + 1"), and print arr[i], which is a different element that was never written. In other words, you are writing to and reading from uninitialized memory.
Besides, your FILE * is ptr, not file. And you should get into the habit of using strtol() instead of atol(), because the former allows you to properly check for success (and recover from errors).
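To illustrate the strtol() suggestion, here is a small sketch of parsing the number after the 'f' with error checking. The helper name parse_after_f is invented for this example and is not part of the original code.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Parse the integer that follows the first 'f' in line.
 * Returns 1 and stores the value in *out on success, 0 otherwise. */
int parse_after_f(const char *line, long *out)
{
    const char *token = strchr(line, 'f');
    if (token == NULL)
        return 0;

    char *end;
    errno = 0;
    long value = strtol(token + 1, &end, 10);
    if (end == token + 1 || errno == ERANGE)
        return 0;   /* no digits after 'f', or value out of range */

    *out = value;
    return 1;
}
```

In the reading loop, a counter initialized to 0 would then be used as the index, storing the value and printing that same element before the counter is incremented, and stopping once the array is full.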

rle compression algorithm c

I have to write an RLE algorithm in C with an escape character (Q).
For example, if I have an input like: AAAAAAABBBCCCDDDDDDEFG
the output has to be: QA7BBBCCCQD6EFG
This is the code that I wrote:
#include <stdio.h>
#include <stdlib.h>
void main()
{
FILE *source = fopen("Test.txt", "r");
FILE *destination = fopen("Dest.txt", "w");
char carCorrente; //in english: currentChar
char carSucc; // in english: nextChar
int count = 1;
while(fread(&carCorrente, sizeof(char),1, source) != 0) {
if (fread(&carCorrente, sizeof(char),1, source) == 0){
if(count<=3){
for(int i=0;i<count;i++){
fprintf(destination,"%c",carCorrente);
}
}
else {
fwrite("Q",sizeof(char),1,destination);
fprintf(destination,"%c",carCorrente);
fprintf(destination,"%d",count);
}
break;
}
else fseek(source,-1*sizeof(char), SEEK_CUR);
while (fread(&carSucc, sizeof(char), 1, source) != 0) {
if (carCorrente == carSucc) {
count++;
}
else {
if(count<=3){
for(int i=0;i<count;i++){
fprintf(destination,"%c",carCorrente);
}
}
else {
fwrite("Q",sizeof(char),1,destination);
fprintf(destination,"%c",carCorrente);
fprintf(destination,"%d",count);
}
count = 1;
goto OUT;
}
}
OUT:fseek(source,-1*sizeof(char), SEEK_CUR); //exit 2° while
}
}
The problem is when I have an input like this: ABBBCCCDDDDDEFGD
In this case the output is: QB4CCCQD5FFDD
and I don't know why :(
There is no need to use fseek to step back as you have done. Here is code that I have written without it, using a simple counter and the current sequence character.
C implementation:
#include<stdio.h>
#include<stdlib.h>
int main(void)
{
FILE *source = fopen("Test.txt", "r");
FILE *destination = fopen("Dest.txt", "w");
char currentChar = 0;
char seqChar = 0; // initialised so the first comparison reads a defined value
int count = 0;
while(1) {
int flag = (fread(&currentChar, sizeof(char),1, source) == 0);
if(flag||seqChar!=currentChar) {
if(count>3) {
char ch = 'Q';
char str[100];
int digits = sprintf(str,"%d",count);
fwrite(&ch,sizeof(ch),1,destination);
fwrite(&seqChar,sizeof(ch),1,destination);
fwrite(str,sizeof(char)*digits,1,destination);
}
}
else {
for(int i=0;i<count;i++)
fwrite(&seqChar,sizeof(char),1,destination);
}
seqChar = currentChar;
count =1;
}
else count++;
if(flag)
break;
}
fclose(source);
fclose(destination);
}
Your code has various problems. First, I'm not sure whether you should read straight from the file. In your case, it might be better to read the source string to a text buffer first with fgets and then do the encoding. (I think in your assignment, you should only encode letters. If source is a regular text file, it will have at least one newline.)
But let's assume that you need to read straight from the disk: You don't have to go backwards. You already have two variables for the current and the next char. Read the next char from disk once. Before reading further "next chars", assign the previous "next char" to the current char:
int carSucc, carCorr; // should be ints for getc
carSucc = getc(source); // read next character once before loop
while (carSucc != EOF) { // test for end of input stream
carCorr = carSucc; // this turn's char is last turn's "next"
carSucc = getc(source);
// ... encode ...
}
Going forward and backward makes the loop complicated. Besides, what happens if the second read reads zero characters, i.e. has reached the end of the file? Then you step back once and go into the second loop. That doesn't look as if it was intended.
Try to go only forward, and use the loop above as base for your encoding.
I think the major problem in your approach is that it's way too complicated with multiple different places where you read input and seek around in the input. RLE can be done in one pass, there should not be a need to seek to the previous characters. One way to solve this is to change the logic into looking at the previous characters and how many times they have been repeated, instead of trying to look ahead at future characters. For instance:
int repeatCount = 0;
int previousChar = EOF;
int currentChar; // type changed to 'int' for fgetc input
while ((currentChar = fgetc(source)) != EOF) {
if (currentChar != previousChar) {
// print out the previous run of repeated characters
outputRLE(previousChar, repeatCount, destination);
// start a new run with the current character
previousChar = currentChar;
repeatCount = 1;
} else {
// same character repeated
++repeatCount;
}
}
// output the final run of characters at end of input
outputRLE(previousChar, repeatCount, destination);
Then you can implement outputRLE to print out a run of the character c repeated count times (note that count can be 0); here's the function declaration:
void outputRLE(const int c, const int count, FILE * const destination)
You can do it pretty much the same way as in your current code, although it can be simplified greatly by combining the fwrite and two fprintfs to a single fprintf. Also, you might want to think what happens if the escape character 'Q' appears in the input, or if there is a run of 10 or more repeated characters. Deal with those cases in outputRLE.
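As one possible sketch of that outputRLE, assuming the rule from the question (runs longer than 3 are written as Q, the character, and the count; shorter runs are written literally), it could look like the code below. The wrapper rle_encode is a name made up here to hold the loop shown above. The two open cases just mentioned, a literal 'Q' in the input and what a decoder should do with multi-digit counts, are deliberately not handled.

```c
#include <stdio.h>

/* Write one run: short runs literally, longer runs as Q<char><count>.
 * Does nothing for an empty run (count 0 or c == EOF). */
void outputRLE(const int c, const int count, FILE * const destination)
{
    if (count <= 0 || c == EOF)
        return;

    if (count > 3) {
        fprintf(destination, "Q%c%d", c, count);
    } else {
        for (int i = 0; i < count; i++)
            fputc(c, destination);
    }
}

/* Hypothetical wrapper around the single-pass loop shown above. */
void rle_encode(FILE *source, FILE *destination)
{
    int repeatCount = 0;
    int previousChar = EOF;
    int currentChar;

    while ((currentChar = fgetc(source)) != EOF) {
        if (currentChar != previousChar) {
            /* Flush the previous run, then start a new one. */
            outputRLE(previousChar, repeatCount, destination);
            previousChar = currentChar;
            repeatCount = 1;
        } else {
            ++repeatCount;
        }
    }
    /* Flush the final run at end of input. */
    outputRLE(previousChar, repeatCount, destination);
}
```

Because the run is only written once it has ended, the encoder never needs to seek backwards in the input.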
An unrelated problem in your code is that the return type of main should be int, not void.
Thank you so much, I fixed my algorithm.
The problem was a variable in the first if after the while.
Before:
if (fread(&carCorrente, sizeof(char),1, source) == 0)
Now:
if (fread(&carSucc, sizeof(char),1, source) == 0){
My algorithm is certainly clumsy, though. I mean, it is far too slow!
I ran a test with my version and with Vikram Bhat's version, and I saw how much time my algorithm loses.
With getc() I can surely save more time.
Now I'm thinking about the decoding (decompression), and I can see a small problem.
Example:
If I have an input like: QA7QQBQ33TQQ10QQQ
how can I recognize which Q is the escape character?
Thanks

Need to reverse file in place but it only works for one line files

So I think I'm closer here, but I'm still getting funny results when printing the string reversed in place. I'll try to be detailed.
Here is the input:
Writing code in c
is fun
Here is what I want:
c in code Writing
fun is
Here is the actual output:
C
in code Writing
fun
is
Here is my code:
char str[1000]; /*making array large. I picked 1000 because it'll never be written over. A line will never hold 1000 characters so fgets won't write into memory where it doesn't belong*/
int reverse(int pos)
{
int strl = strlen(str)-1,i;
int substrstart = 0,substrend = 0;
char temp;
for(;;)
{
if( pos <= strl/2){ /*This will allow the for loop to iterate to the middle of the string. Once the middle is reached you no longer need to swap*/
temp = str[pos]; /*Classic swap algorithm where you move the value of the first into a temp variable*/
str[pos]= str[strl-pos]; /*Move the value of last index into the first*/
str[strl-pos] = temp; /*move the value of the first into the last*/
}
else
break;
pos++; /*Increment your position so that you are now swapping the next two indices inside the last two*/
} /* If you just swapped index 5 with 0 now you're swapping index 4 with 1*/
for(;substrend-1 <= strl;)
{
if(str[substrend] == ' ' || str[substrend] == '\0' ) /*in this second part of reverse we take the now completely reversed*/
{
for(i = 0; i <= ((substrend-1) - substrstart)/2; i++) /*Once we find a word delimiter we go into the word and apply the same swap algorithm*/
{
temp = str[substrstart+i]; /*This time we are only swapping the characters in the word so it looks as if the string was reversed in place*/
str[substrstart+i] = str[(substrend-1)-i];
str[(substrend-1)-i] = temp;
}
if(str[substrend] == '\t' || str[substrend] == '\n')
{
str[substrend] = ' ';
for(i = 0; i <= ((substrend-1) - substrstart)/2; i++) /*Once we find a word delimiter we go into the word and apply the same swap algorithm*/
{
temp = str[substrstart+i]; /*This time we are only swapping the characters in the word so it looks as if the string was reversed in place*/
str[substrstart+i] = str[(substrend-1)-i];
str[(substrend-1)-i] = temp;
}
}
if(str[substrend] == '\0')
{
break;
}
substrstart=substrend+1;
}
substrend++; /*Keep increasing the substrend until we hit a word delimiter*/
}
printf("%s\n", str); /*Print the reversed line and then jump down a line*/
return 0;
}
int main(int argc, char *argv[])
{
char *filename; /*creating a pointer to a filename*/
FILE *file20; /*creating FIlE pointer to a file to open*/
int n;
int i;
if (argc==1) /*If there is no line parameter*/
{
printf("Please use line parameter!\n");
return(5); /*a return of 5 should mean that no line parameter was given*/
}
if(argc>1){
for(i=1; i < argc; i++)
{
filename = argv[i]; //get first line parameter
file20 = fopen(filename, "r"); //read text file, use rb for binary
if (file20 == NULL){
printf("Cannot open empty file!\n");
}
while(fgets(str, 1000, file20) != NULL) {
reverse(0);
}
fclose(file20);
}
return(0); /*return a value of 0 if all the line parameters were opened, reversed and closed successfully*/
}
}
Can anyone point me to an error in the logic of my reverse function?
What you've written reads out the whole file into a single buffer and runs your reverse function over the whole file at once.
If you want the first line reversed then the next line reversed, etc, you'll need to read the lines one at a time using something like fgets. Run reverse over each line, one at a time and you should get what you want.
http://www.cplusplus.com/reference/cstdio/fgets/
Assuming you want to continue reading in the whole file into a single buffer and then doing the line-by-line reverse on the buffer all at once (instead of reading in one line, reversing it, reading in the next line, reversing it, and so on), you'll need to re-write your reverse() algorithm.
What you have in place seems to work already; I think you can get what you need by adding another loop around your existing logic, with a few modifications to your existing logic. Start with a pointer to the beginning of str[], let's call it char* cp1 = str. At the top of this new loop, create another pointer, char* cp2, and set it equal to cp1. Using cp2, scan to the end of the current line looking for a newline or '\0'. Now you have a pointer to the start of the current line (cp1) and a pointer to the end of the current line (cp2). Now modify your existing logic to use those pointers instead of str[] directly. You can compute the length of the current line by simply lineLen = cp2 - cp1; (you wouldn't want to use strlen() because the line might not have a terminating '\0'). After that, it will loop back up to the top of your new loop and continue with the next line (if *cp2 doesn't point to '\0')... just set cp1 = cp2+1 and continue with the next line.
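For the line-at-a-time variant, a compact sketch of the same two-step idea (reverse the whole line, then reverse each word back) is shown below. The function names are invented for this example. It strips the '\n' that fgets leaves at the end of each line before reversing, on the assumption that this newline is one likely reason the output in the question is split across extra lines.

```c
#include <string.h>

/* Reverse the characters of s[lo..hi] (inclusive) in place. */
static void reverse_range(char *s, size_t lo, size_t hi)
{
    while (lo < hi) {
        char tmp = s[lo];
        s[lo++] = s[hi];
        s[hi--] = tmp;
    }
}

/* Reverse the order of space-separated words in one line, in place.
 * A trailing '\n' (as left by fgets) is removed first. */
void reverse_words(char *line)
{
    line[strcspn(line, "\n")] = '\0';

    size_t len = strlen(line);
    if (len == 0)
        return;

    /* Step 1: reverse the whole line. */
    reverse_range(line, 0, len - 1);

    /* Step 2: reverse each word so its letters read forwards again. */
    size_t start = 0;
    for (size_t i = 0; i <= len; i++) {
        if (line[i] == ' ' || line[i] == '\0') {
            if (i > start)
                reverse_range(line, start, i - 1);
            start = i + 1;
        }
    }
}
```

In the reading loop, each buffer returned by fgets would be passed to reverse_words and then printed with a trailing newline.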

Resources