My program counts the frequency of words in a text file. The file is partitioned into n parts, and n threads each count the word frequencies in one part.
Assume the text file contains only letters and whitespace, and that uppercase and lowercase letters are treated as the same.
An example of the text file is
This is a text file which contains some numbers from one to ten some numbers are lower than the other numbers such as one is lower than two
My problem is that when I read a part of the text file using fseek and fread, it isn't read properly.
I'm using start to indicate the position to start reading from and end to indicate the number of bytes to read.
I checked start and end, but the strings that I get still aren't correct.
For example,
From 0 to 48, the string is 'This is a text file which contains some numbers '
From 48 to 92, the string is 'from one to ten some numbers are lower than '
From 92 to 139, the string is 'This is a text file which contains some numbers'
Also, the variables that are returned from thread_exit() don't seem to be correct either.
I made a check, and the words and frequencies that I got from wordFreq were not the same as what I got from myfreqword, the variable returned from thread_exit().
for example,
this is what I got from wordFreq of each thread
a 1
contains 1
are 1
file 1
as 1
from 1
is 1
is 1
lower 1
numbers 1
lower 1
numbers 1
some 1
numbers 1
one 1
text 1
one 1
some 1
this 1
other 1
ten 1
which 1
such 1
than 1
than 1
to 1
And this is what I got after returning from thread_exit(); there were some weird strings here.
the 1
a 1
two 1
contains 1
numbers 1
some 1
which 1
are 1
from 1
lower 1
numbers 1
one 1
some 1
ten 1
than 1
to 1
is! 1
? 1
??? 1
tw 1
I have no idea what went wrong here.
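The question does not show the reading code, so the following is only a sketch of one way to read a byte range, not a diagnosis. It assumes start and end are byte offsets as in the 'From 48 to 92' examples, so the number of bytes passed to fread is end - start. The helper name read_range is hypothetical. It opens its own FILE * on every call, so threads do not share a file position, and it returns a heap buffer, which stays valid after the thread that filled it has exited (a pointer to a thread's local array would not).

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read bytes [start, end) of `path` into a freshly
   allocated, NUL-terminated buffer. It opens its own FILE* so that threads
   calling it concurrently do not share a file position. Returns NULL on
   failure; the caller frees the result. */
char *read_range(const char *path, long start, long end)
{
    if (start < 0 || end < start)
        return NULL;

    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;

    size_t want = (size_t)(end - start);   /* a byte count, not an offset */
    char *buf = malloc(want + 1);
    if (buf == NULL || fseek(fp, start, SEEK_SET) != 0) {
        free(buf);
        fclose(fp);
        return NULL;
    }

    size_t got = fread(buf, 1, want, fp);  /* may be short near end of file */
    buf[got] = '\0';
    fclose(fp);
    return buf;
}
```

With the sample text above, read_range(path, 92, 139) should yield the last 47 bytes of the file rather than a repeat of the first part.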
I've been trying to search the Internet for answers but couldn't find any.
I'm trying to scan the following input:
100 0 1 3 10 3 6
101 0 4 4 2
200 1 2 5 1 2 3
300 1 7 6 1
Each line should be read as one string, and each string has whitespace between the numbers.
I tried using:
while(scanf("%[^\n]s", str) != EOF)
but it's stuck in an infinite loop. It only scans the first line.
I also tried fgets until EOF but that gives me a compile error saying that I cannot compare a pointer
to an integer.
I just want to scan each line -> run it through a parser so I can separate the numbers into different variables -> do my calculations.
Thanks in advance.
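A likely explanation for both symptoms: %[^\n] stops before the newline and leaves it in the input, so on the second call the scanset matches nothing and scanf returns 0 rather than EOF, which keeps the loop running forever. fgets returns a char *, so its result is compared against NULL, not EOF. Below is a minimal sketch of the fgets approach; parse_ints and process_lines are hypothetical names, and the buffer sizes are assumptions.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical parser: split one line into integers using strtol.
   Stores at most `max` values in `out` and returns how many were found. */
size_t parse_ints(const char *line, long *out, size_t max)
{
    size_t n = 0;
    const char *p = line;
    while (n < max) {
        char *end;
        long v = strtol(p, &end, 10);
        if (end == p)        /* no further number on this line */
            break;
        out[n++] = v;
        p = end;
    }
    return n;
}

/* Read `in` line by line until fgets returns NULL (end of file or error).
   Returns the number of lines seen. A line longer than the buffer arrives
   in several pieces, which this sketch does not handle. */
size_t process_lines(FILE *in)
{
    char str[256];
    long nums[64];
    size_t lines = 0;
    while (fgets(str, sizeof str, in) != NULL) {
        size_t n = parse_ints(str, nums, 64);
        /* ... use nums[0] .. nums[n-1] for the calculations ... */
        (void)n;
        lines++;
    }
    return lines;
}
```

Calling process_lines(stdin) would read the four input lines shown above one at a time.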
Consider an example:
5
1 0 5
1 1 7
1 0 3
2 1 0
2 1 1
Here, in the first line, 5 denotes the size of the array.
I'm entering five sequences one by one.
I want the first sequence, i.e. 1 0 5, to be stored in arr[0].
Note: 1, 0 and 5 are separated by spaces.
However, arr[0] should contain 105 without any space.
I want to accept the next sequence into arr[1] only after pressing 'Enter'.
So that arr[1] should contain 117, arr[2] should contain 103 and so on up to arr[4].
Is there any operator that I can use for this?
There are no operators that do I/O in C at all, so no.
I also don't think there's any standard function with those semantics; they tend to view all whitespace as equal.
You should write your own, probably using fgets() to read in whole lines and then extracting the digits to convert to integers.
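A minimal sketch of that suggestion, assuming each sequence fits on one line and the resulting number fits in an int. The names digits_to_int and read_sequences are made up for illustration.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: concatenate every decimal digit on the line and
   ignore everything else, so "1 0 5\n" becomes 105. No overflow check. */
int digits_to_int(const char *line)
{
    int value = 0;
    for (const char *p = line; *p != '\0'; p++) {
        if (isdigit((unsigned char)*p))
            value = value * 10 + (*p - '0');
    }
    return value;
}

/* Read the count from the first line, then one sequence per line into arr.
   Returns the number of sequences stored (at most `max`). */
int read_sequences(FILE *in, int *arr, int max)
{
    char line[128];
    int n = 0, count = 0;
    if (fgets(line, sizeof line, in) == NULL || sscanf(line, "%d", &n) != 1)
        return 0;
    while (count < n && count < max && fgets(line, sizeof line, in) != NULL)
        arr[count++] = digits_to_int(line);
    return count;
}
```

Because fgets only returns once a full line is available, each array element is filled only after Enter is pressed. Note that a sequence with leading zeros, such as 0 1 2, would be stored as 12.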
Suppose I have a data set arranged in the following way
19 10 1 1
12 15 1 1
13 12 4 5
10 5 2 3
...
and so on. At a particular iteration of a for loop I have to read only the 1st and the 4th rows, and in the next iteration I have to access some other set of rows, for example
1st iteration:
1st row: 19 10 1 1
4th row: 10 5 2 3
I will access my data using the fscanf() function. But how do I ensure that I read only the 1st and 4th rows, or any two rows for that matter, at a given iteration?
(I have not considered reading it into a 2D array since the size of data set is 10^8 )
Thank you.
As you read through your data (say, stored in a standard file), record the byte offset of each row by looking for row delimiters (newline characters). You can then jump straight to a row by passing its start offset to fseek() on the FILE * and reading from there up to the next row's offset. Storing a byte offset per row (typically a long, often eight bytes) is cheap compared with storing the row itself.
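A sketch of that idea, assuming four integers per row as in the sample. index_rows and read_row are hypothetical names. The offsets come from ftell(), so they can be handed back to fseek() unchanged. With 10^8 rows the full index is itself large, so in practice one might record only every k-th row or only the rows that are needed.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: scan the file once and record the offset at which
   each row starts. Returns a malloc'd array and stores the row count in
   *count; returns NULL on allocation failure. The caller frees the array. */
long *index_rows(FILE *fp, size_t *count)
{
    size_t cap = 1024, n = 0;
    long *offsets = malloc(cap * sizeof *offsets);
    if (offsets == NULL)
        return NULL;

    rewind(fp);
    for (;;) {
        long pos = ftell(fp);           /* offset of the row about to be read */
        int c = getc(fp);
        if (c == EOF || pos < 0)
            break;
        if (n == cap) {
            cap *= 2;
            long *tmp = realloc(offsets, cap * sizeof *tmp);
            if (tmp == NULL) { free(offsets); return NULL; }
            offsets = tmp;
        }
        offsets[n++] = pos;
        while (c != '\n' && c != EOF)   /* skip the rest of this row */
            c = getc(fp);
        if (c == EOF)
            break;
    }
    *count = n;
    return offsets;
}

/* Seek to a recorded row and read its four integers. The caller must pass
   row < count. Returns 1 on success, 0 otherwise. */
int read_row(FILE *fp, const long *offsets, size_t row, int v[4])
{
    if (fseek(fp, offsets[row], SEEK_SET) != 0)
        return 0;
    return fscanf(fp, "%d %d %d %d", &v[0], &v[1], &v[2], &v[3]) == 4;
}
```

For the first iteration in the question, read_row(fp, offsets, 0, v) and read_row(fp, offsets, 3, v) would fetch the 1st and 4th rows without reading the rows in between.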
I'm making a maze solver using Breadth-first search. Consider the following list of numbers in a text file
10 20
1 1
10 20
5 1
4 2
3 3
1 10
2 9
3 8
4 7
5 6
6 5
7 4
8 3
Here the first row denotes the size of my maze (10x20), the second row denotes the starting position (1, 1), and the third row denotes the ending position (10, 20). Every row after the third gives the coordinates of a block in the maze (i.e. a cell the solver has to move around).
Here's what this particular board will look like:
**********************
*s........*..........*
*........*...........*
*..*....*............*
*.*....*.............*
**....*..............*
*....*...............*
*...*................*
*..*.................*
*....................*
*...................e*
**********************
What I am trying to do:
If my text file has impossible coordinates for either the size or start/end coordinates, ignore those coordinates and continue processing input.
example:
10 0 => Invalid: Maze sizes must be greater than 0
15 7 => Maze becomes size 15 x 7
10 20 => Invalid: column 20 is outside range from 1 to 7
5 1 => Starting position is at position 5, 1
24 2 => Invalid: row 24 is outside of range from 1 to 15
3 3 => Ending position is at position 3, 3
1 10 => Invalid: column 10 is outside range from 1 to 7
2 9 => Invalid: column 9 is outside range from 1 to 7
3 8 => Invalid: column 8 is outside range from 1 to 7
4 7
5 6
5 1 => Invalid: attempting to block starting position
6 5
7 4
8 3
I know I'm supposed to use some fprintf or fscanf loop until the end of file is reached.
Can someone start me off in the right direction?
I want to print all coordinates in the file, with error messages further in the line, if necessary.
Are you asking how to read all the points? If so, you can do something like the following:
int n1, n2;
FILE *fp = fopen("myfile.txt", "r");
// ...read first three lines and do what you need with them
// read rest of points; fscanf returns the number of items converted,
// and EOF (which is nonzero) at end of file, so compare against 2
while (fscanf(fp, "%d %d", &n1, &n2) == 2) {
    if (checkPoints(n1, n2))       // check points are valid
        addPointsToBoard(n1, n2);  // add to board
}
If you're asking how to implement something like checkPoints, I'd say you haven't given enough information about how you plan to implement your code for someone to help.
NOTE: This assumes you have a well-formed input file. If you are concerned about invalid inputs, you will need to do sanity checking.
EDIT: Based on a comment, here is a way you can do a sanity check on the matrix size input (first line):
int valid_size = 0;
while (1) {
    if (fscanf(fp, "%d %d", &n1, &n2) == 2)
        valid_size = checkMatrixSizes(n1, n2);
    else
        exit(1); // never found a valid matrix size in the file
    if (valid_size)
        break;
}
The above loop will continuously loop until checkMatrixSizes finds valid sizes (I would also suspect it would create your board, etc. The above is pseudo code and far from complete). You could do similar loops for the second and third inputs. It should be noted that this code simply ignores any invalid input and moves on, which I think is the behavior you want based upon your question. Other behaviors might include adjusting the input to the closest acceptable value (i.e. if a column is out of range, set column to the highest possible value).
I need to read a file and store each number (int) in a variable. When it sees \n or a "-", it needs to move on and store into the next variable. The minus sign denotes a range, so 1-5 means that it should store the numbers from 1 to 5. How should I proceed?
I was thinking of using fgets() but I can't find a way to do what I want.
The input looks like this:
0
0
5 10
4
2 4
5-10 2 3 4 6 7-9
4 3
These are x y positions.
I'd use fscanf to read one int at a time, and when it's negative, it is obviously the second part of a range. Or is -4--2 a valid input?
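A rough sketch of that approach, assuming the input never contains genuinely negative numbers (so -4--2 is not valid). With %d, the text 7-9 is read as 7 followed by -9, so a negative value is taken as the upper end of a range starting at the previous value. The function name read_with_ranges is hypothetical, and the sketch flattens everything into one array, so line boundaries are not preserved.

```c
#include <stdio.h>

/* Hypothetical helper: read integers with fscanf and expand ranges.
   A negative value following a stored value closes a range; a negative
   value with no value before it is stored unchanged.
   Stores at most `max` values in `out`; returns how many were stored. */
size_t read_with_ranges(FILE *in, int *out, size_t max)
{
    size_t n = 0;
    int v, prev = 0, have_prev = 0;
    while (n < max && fscanf(in, "%d", &v) == 1) {
        if (v < 0 && have_prev) {
            /* prev is already stored; append prev+1 .. -v */
            for (int k = prev + 1; k <= -v && n < max; k++)
                out[n++] = k;
            have_prev = 0;
        } else {
            out[n++] = v;
            prev = v;
            have_prev = 1;
        }
    }
    return n;
}
```

For the line 5-10 2 3 4 6 7-9 this stores 5 through 10, then 2, 3, 4, 6, then 7 through 9. One limitation: 5 -10 with a space is indistinguishable from 5-10 to fscanf.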