How to get data from specific input file format skipping line with scanf [closed] - c

Closed. This question needs debugging details. It is not currently accepting answers.
Closed 1 year ago.
I have a .txt file that contains info like this:
9:0
B1 0 0 0 0 0
B2 0 0 0 0 0
B3 4 5 0 0 0
B4 1 2 3 0 0
9:1
B1 0 0 0 0 0
B2 0 0 0 0 0
B3 4 5 0 0 0
B4 1 2 3 0 0
9:2
B1 0 0 0 0 0
B2 0 0 0 0 0
B3 4 5 0 0 0
B4 1 2 3 0 0
(...)
As you can see, the format is a line with the time, followed by four lines that each hold a "B" identifier and the five elements of an array.
I can easily read the whole file by looping through it (I know how many lines the file has): first a scanf to get the time, then another scanf to read the "B" plus the identifier, then a loop four times with scanf again to get the integers of the array, and then back to reading the time.
This works fine, but (again) as you can see, there are a lot of zeros in the arrays, and it would be a lot faster to check the first element and, if it is zero, skip the rest and read the next box. I tried this with a break; statement, but break ruins the structure of my loop: sometimes a B identifier ends up stored in the time variable.
I'm wondering if there's any other way to skip the zeros when I find one and jump to the next B identifier. For example, after reading 9:0, I get B1 and then read the first zero, so I skip this line and get B2, and so on until I find a non-empty array.
If anyone could help me, please!

I think a reasonably fast solution might be to read the entire line into a static char array, skip to the next line if the fourth character is '0', and use sscanf to read the values otherwise.
My reasoning here is that the work of fscanf is split between (1) reading in a string from a file, and (2) parsing that string into numbers, and moving them into the provided variables. However, I know IO operations can be pretty slow, so here is another alternative that does slightly less IO.
Use fscanf to read in only the first two tokens of a line (namely, the B_ and the first number), and use fseek to skip to the next line if the first number is 0. Note that fseek needs a byte offset, so this only works if the rest of the line has a known, fixed length. I'm not super confident that this will be faster, but it's something you could try.
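The first suggestion (read the whole line, then parse it with sscanf only when needed) could look roughly like the sketch below. It is only a sketch under some assumptions: the names parse_line and print_nonempty_boxes are made up for illustration, the time line is assumed to be two integers separated by a colon, and lines are assumed to fit in 128 bytes. Instead of testing the fourth character directly, it reads the identifier and the first value with sscanf, so that identifiers with more than one digit would still work.

```c
#include <assert.h>
#include <stdio.h>

#define BOX_VALUES 5

/* What a single input line turned out to be. */
enum line_kind { LINE_TIME, LINE_BOX, LINE_EMPTY_BOX, LINE_UNKNOWN };

/* Classify and parse one line (hypothetical helper).
 * "9:0"          -> LINE_TIME, fills hour and minute
 * "B3 4 5 0 0 0" -> LINE_BOX, fills id and values
 * "B1 0 0 0 0 0" -> LINE_EMPTY_BOX, the remaining values are not parsed */
enum line_kind parse_line(const char *line, int *hour, int *minute,
                          int *id, int values[BOX_VALUES])
{
    int first = 0;
    int consumed = 0;

    assert(line != NULL);
    if (line[0] == 'B') {
        /* Read only the identifier and the first array element. */
        if (sscanf(line, "B%d %d%n", id, &first, &consumed) != 2)
            return LINE_UNKNOWN;
        if (first == 0)
            return LINE_EMPTY_BOX;
        values[0] = first;
        /* Non-empty box: parse the four remaining elements. */
        if (sscanf(line + consumed, "%d %d %d %d",
                   &values[1], &values[2], &values[3], &values[4]) != 4)
            return LINE_UNKNOWN;
        return LINE_BOX;
    }
    if (sscanf(line, "%d:%d", hour, minute) == 2)
        return LINE_TIME;
    return LINE_UNKNOWN;
}

/* Read the stream line by line and print only the non-empty boxes,
 * prefixed with the most recent time. Returns how many were printed. */
int print_nonempty_boxes(FILE *in)
{
    char line[128];
    int hour = 0, minute = 0, id = 0, printed = 0;
    int values[BOX_VALUES];

    while (fgets(line, sizeof line, in) != NULL) {
        if (parse_line(line, &hour, &minute, &id, values) == LINE_BOX) {
            printf("%d:%d B%d %d %d %d %d %d\n", hour, minute, id,
                   values[0], values[1], values[2], values[3], values[4]);
            printed++;
        }
    }
    return printed;
}
```

Because every line is consumed whole by fgets, skipping an empty box can never leave the stream in the middle of a line, which is the problem the break; version ran into.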

Related

How do you calculate the number of elements in a jagged array in F#?

I am new to F# and haven't found the answer to this anywhere. I am creating a jagged array that can hold 10 rows and 10 columns each with an increasing number of elements. The code I used for the array creation and printing is as follows:
let jagged = [| for a in 1 .. 10 do yield [| for a in 1 .. a do yield 0 |] |]
let mutable len = 0;
for arr in jagged do
    for col in arr do
        len <- (len + 1)
        printf "%i " col
    printfn "";
printfn "%i" len
The above code gives the following output
0
0 0
0 0 0
0 0 0 0
0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
55
Currently, I am calculating the number of elements manually but would like to know if there is a better way to do so.
If you want to calculate the length of a single array, you could use Array.length. But what you have is an array of arrays of different lengths, and you want to calculate the sum of their sizes. Rather than just give you the answer, I'll show you how you could use https://fsharpforfunandprofit.com/posts/list-module-functions/ (a site by Scott Wlaschin that's a really terrific resource, BTW) to find the answer yourself. This page presents a series of questions to help you find the functions you're looking for: starting from question 1, you move to other questions and eventually to a list of useful functions.
Question 1 on that page is, "What kind of collection do you have?" The choices are "I don't have a collection and I want to create one", or "I have one collection I want to work with", or several other choices where you have two or three or more collections. Here, we have one collection we want to work with, so the page directs us to question 9.
Question 9 on that page has a bunch of choices I won't repeat here, but one of them is "If you want to aggregate or summarize the collection into a single value". That sounds like what we want: we want the sum of the lengths of the sub-arrays. So we go to section 14, which has a bunch of functions we could use. And halfway down the list is sum and sumBy. Those sound intriguing. The sum function "returns the sum of the elements in the collection"... well, no, that won't work, because our array contains arrays, not numbers. But the sumBy function "returns the sum of the results generated by applying the function to each element of the collection." And we know there's a function for finding the length of a single array: Array.length. (The page talks about functions that work on lists, but pretty much any function that works on lists has a corresponding function that works on arrays and a similar corresponding function that works on sequences. The few exceptions are for things like how you can have infinite sequences, but not infinite arrays or lists, so there's a Seq.initInfinite function but there's no Array.initInfinite or List.initInfinite function).
So now that we've found that, we just need to write it.
let lengthOfJaggedArray arr = arr |> Array.sumBy Array.length
And that's it. Instead of calculating the length by hand via two nested for loops, there's a one-line solution that's quite simple and uses built-in functions. All you needed to do was know what functions are available — and since the entire list of available array/list/seq functions can be a little daunting when you're new to F#, Scott Wlaschin has made a very useful resource to help make it a bit less daunting.

find largest rectangle not (necessarily) aligned with image boundary in binary matrix

I am using this solution to find rectangles aligned with the image border in a binary matrix. Suppose now I want to find a rectangle that is not aligned with the image border, and I don't know its orientation; what would be the fastest way to find it?
For the sake of the example, let's look for a rectangle containing only 1's. For example:
1 1 1 1 0 0 0 0 0 1 0 0 1 1 1
0 1 1 1 1 1 0 0 0 1 0 0 1 1 0
0 0 0 1 1 1 1 1 0 1 0 0 1 0 0
0 0 0 0 0 1 1 1 1 1 0 0 0 0 0
0 0 0 0 0 0 0 1 1 1 1 1 0 0 0
0 0 0 0 0 0 0 0 0 1 1 1 1 1 0
Then the algorithm described in the solution I mentioned above would only find a rectangle of size 6 (3x2). I would like to find a bigger rectangle that is tilted; we can clearly see a rectangle of size 10 or more...
I am working in C/C++ but an algorithm description in any language or pseudo-code would help me a lot.
Some more details:
there can be more than one rectangle in the image: I need the biggest only
the rectangle is not a beautiful rectangle in the image (I adapted my example above a little bit)
I work on large images (1280x1024) so I'm looking for the fastest solution (a brute-force O(n³) algorithm will be very slow)
(optional) if the solution can be parallelized, that is a plus (then I can boost it more using GPU, SIMD, ...)
I only have a partial answer for this question, and only a few thoughts on complexity or speed for what I propose.
Brute Force
The first idea that I see is to use the fact that your problem is discrete to implement a rotation around the center of the image and repeat the algorithm you already use in order to find the axis aligned solution.
This has the downside of checking a whole lot of candidate rotations. However, the checks can be done in parallel since they are independent of one another. This is still probably very slow, although implementing it shouldn't be too hard, and once parallelized it would give a more definite answer to the speed question.
Note that since your work-space is a discrete matrix, there is only a finite number of rotations to browse through.
Other Approach
The second solution I see is:
To cut down your base matrix so as to separate the connected components [1] (corresponding to the value set you're interested in).
For each one of those smaller matrices -- note that they may be overlapping depending on the distribution -- find the minimum oriented bounding box for the value set you're interested in.
Still for each one of those, rotate your matrix so that the minimum oriented bounding box is now axis-aligned.
Launch the algorithm you already have to find the maximum axis-aligned rectangle containing only values from your value set.
The solution found by this algorithm would be the largest rectangle obtained from all the connected components.
This second solution would probably only give you an approximation of the solution, but I believe it might prove to be worth trying.
For reference
The only solutions that I have found for the problem of the maximum/largest empty rectangle are axis-aligned. I have seen many unanswered questions corresponding to the oriented version of this problem on 2D continuous space.
EDIT:
[1] Since what we want is to separate the connected component, if there is a degree of overlap, you should do as in the following example:
0 1 0 0
0 1 0 1
0 0 0 1
should be divided into:
0 0 0 0
0 0 0 1
0 0 0 1
and
0 1 0 0
0 1 0 0
0 0 0 0
Note that I kept the original dimensions of the matrix. I did that because I'm guessing from your post it has some importance and that a rectangle expanding further away from the boundaries would not be found as a solution (i.e. that we can't just assume there are zero values beyond the border).
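The separation step can be sketched in C as a flood fill that gives every connected component its own label; each label then plays the role of one of the sub-matrices above, while the original dimensions are kept. This is only a sketch under assumptions: the function name label_components is hypothetical, the matrix is assumed to be stored row by row in a flat array, and cells are treated as connected only through their four direct neighbours (the post does not say whether diagonals count).

```c
#include <assert.h>
#include <stdlib.h>

/* Label the 4-connected components of non-zero cells in a rows x cols
 * matrix stored row by row in `cells`. On return, labels[i] is 0 for a
 * zero cell and 1..k for a cell of one of the k components.
 * Returns k, or -1 if the work stack cannot be allocated. */
int label_components(const unsigned char *cells, int rows, int cols,
                     int *labels)
{
    static const int dr[4] = {-1, 1, 0, 0};
    static const int dc[4] = {0, 0, -1, 1};
    int total, count = 0;
    int *stack;

    assert(cells != NULL && labels != NULL && rows > 0 && cols > 0);
    total = rows * cols;
    /* Each cell is pushed at most once, so `total` slots are enough. */
    stack = malloc((size_t)total * sizeof *stack);
    if (stack == NULL)
        return -1;
    for (int i = 0; i < total; i++)
        labels[i] = 0;

    for (int start = 0; start < total; start++) {
        int top = 0;

        if (cells[start] == 0 || labels[start] != 0)
            continue;               /* zero cell, or already labelled */
        count++;
        labels[start] = count;
        stack[top++] = start;
        while (top > 0) {
            int cur = stack[--top];
            int r = cur / cols, c = cur % cols;

            for (int d = 0; d < 4; d++) {
                int nr = r + dr[d], nc = c + dc[d];
                int next;

                if (nr < 0 || nr >= rows || nc < 0 || nc >= cols)
                    continue;       /* nothing is assumed beyond the border */
                next = nr * cols + nc;
                if (cells[next] != 0 && labels[next] == 0) {
                    labels[next] = count;
                    stack[top++] = next;
                }
            }
        }
    }
    free(stack);
    return count;
}
```

On the 3x4 example above this yields two labels, one per column of 1's; the cells carrying a given label are exactly the non-zero cells of the corresponding sub-matrix.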
EDIT #2:
The choice of whether or not to keep the matrix dimensions is debatable since it will not directly influence the algorithm.
However, it is worth noting that if the matrices corresponding to connected components do not overlap on non-zero values, you may choose to store those matrices "in-place".
You also need to consider that if you wish to return the coordinates of the rectangle and you create a matrix with different dimensions for each connected component, you will have to store the position of each newly created matrix within the original one (actually, one point, say for instance the up-left one, should be enough).

Finding row with maximum no. of 1s if each row is sorted using logicalOR approach

Question similar to this may have been discussed before but I want to discuss a different approach to this.
Given a boolean 2D array where each row is sorted, find the row with the maximum number of 1s.
Input Matrix :
0 1 1 1
0 0 1 1
1 1 1 1
0 0 0 0
Output : 2
How about this approach: take the logical OR of column 0 across all rows and, if the answer is 1, return the index of the row that has the 1 and stop. In this case (0 | 0 | 1 | 0) the answer is 1, so that row index is returned. If the input matrix is something like:
Input matrix:
0 1 1 1
0 0 1 1
0 0 0 1
0 0 0 0
Output : 0
When I take the logical OR of column 0 across all rows, the answer is zero, so I move on to column 1 of each row, and the procedure continues until the logical OR is 1. I know other approaches to solve this problem, but I would like to have views on this approach.
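For concreteness, the column-by-column OR scan described above could be written as the C sketch below. The function name first_row_by_column_or and the flat row-by-row int layout are assumptions made for illustration. The OR is short-circuited: as soon as one cell in the current column is non-zero, the OR of that column is 1 and that row is returned.

```c
#include <assert.h>
#include <stddef.h>

/* Scan the matrix one column at a time, left to right. The first column
 * containing a 1 decides the answer: the first row with a 1 in that
 * column has the leftmost 1 and therefore the most 1s.
 * Returns the row index, or -1 if the matrix contains no 1 at all. */
int first_row_by_column_or(const int *m, int rows, int cols)
{
    assert(m != NULL && rows > 0 && cols > 0);
    for (int c = 0; c < cols; c++) {
        for (int r = 0; r < rows; r++) {
            if (m[r * cols + c] != 0)
                return r;           /* OR of this column is 1 */
        }
    }
    return -1;
}
```

With the two matrices above it returns 2 and 0 respectively. In the worst case every cell is visited before a 1 is found.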
If it's:
0 ... 0 1
0 ... 0 0
0 ... 0 0
0 ... 0 0
0 ... 0 0
You'd have to search many columns.
The maximum amount of work involved would be linear in the number of cells (O(mn)), and the other approaches outperform this here.
Specifically the approach where:
You start at the top right and
Repeatedly:
Search left until you find a 0 and
Search down until you find a 1
And return the last row where you found a 1
Is linear in the number of rows plus columns (O(m + n)).
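The steps above can be sketched in C as follows. The function name row_with_most_ones and the flat row-by-row int layout are assumptions for illustration. The column index only ever moves left and the row index only ever moves down, which is where the O(m + n) bound comes from.

```c
#include <assert.h>
#include <stddef.h>

/* Staircase walk starting from the top-right corner of a matrix whose
 * rows are sorted (all 0s before all 1s).
 * Returns the index of the first row with the most 1s, or -1 if the
 * matrix contains no 1 at all. */
int row_with_most_ones(const int *m, int rows, int cols)
{
    int best = -1;
    int col = cols - 1;

    assert(m != NULL && rows > 0 && cols > 0);
    for (int r = 0; r < rows && col >= 0; r++) {
        if (m[r * cols + col] != 1)
            continue;               /* search down until a 1 is found */
        /* Search left until a 0 is found (or the row is exhausted). */
        while (col >= 0 && m[r * cols + col] == 1)
            col--;
        best = r;                   /* last row where a 1 was found */
    }
    return best;
}
```

A later row replaces the current best only if it has a 1 strictly to the left of the best row's leftmost 1, so ties resolve to the earliest row.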
That would work since it's equivalent to finding the row for which the leftmost 1 is before (or at the same point as) any other row's leftmost 1. It would still be O(m * n) in the worst case:
Input Matrix :
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 1
Given that your rows are sorted, I would binary search for the position of the first one for each row, and return the row with the minimum position. This would be O(m * logn), although you might be able to do better.
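A minimal C sketch of the binary search idea, assuming a flat row-by-row int layout; the names first_one and row_with_most_ones_bsearch are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Index of the first 1 in a sorted row of length n, or n if the row
 * contains no 1. */
static int first_one(const int *row, int n)
{
    int lo = 0, hi = n;

    while (lo < hi) {
        int mid = lo + (hi - lo) / 2;

        if (row[mid] == 1)
            hi = mid;               /* the first 1 is at mid or before it */
        else
            lo = mid + 1;           /* the first 1 is after mid */
    }
    return lo;
}

/* Binary search each row for its first 1 and keep the row where that
 * position is smallest. Returns -1 if the matrix contains no 1 at all. */
int row_with_most_ones_bsearch(const int *m, int rows, int cols)
{
    int best = -1;
    int best_pos = cols;

    assert(m != NULL && rows > 0 && cols > 0);
    for (int r = 0; r < rows; r++) {
        int pos = first_one(m + (size_t)r * cols, cols);

        if (pos < best_pos) {
            best_pos = pos;
            best = r;
        }
    }
    return best;
}
```

Each row costs O(log n) comparisons, giving O(m * log n) overall.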
Your approach is likely to be orders of magnitude slower than the naive "go through the rows, and count the zeros, and remember the row with the fewest zeros." The reason is that, assuming your bits are stored one-row-at-a-time, with the bools packed tightly, then memory for the row will be in cache all at once, and bit-counting will cache beautifully.
Contrast this to your proposed approach, where for each row, the cache line will be loaded, and a single bit will be read from it. By the time you've cycled through all the rows in your array, the memory for the first row will (probably, if you've got any reasonable number of rows), be out of the cache, and the row will have to be loaded again.
Approximately, assuming a 64-byte cache line, the first approach is going to need 1/(64 * 8) memory accesses per bit in the array, compared to 1 memory access per bit for yours. Since counting the bits and remembering the max takes just a few cycles, it's reasonable to think that the memory accesses are going to dominate the running cost, which means the first approach will run approximately 64 * 8 = 512 times faster. Of course, you'll get some of that time back because your approach can terminate early, but the 512 times speed hit is a large cost to overcome.
If your rows are super-long, you may find that a hybrid between these two approaches works excellently: count the number of bits in the first cache-line's worth of data in each row (being careful to cache-line-align each row of your data in memory), and if every row has no bits set in the first cache-line, go to the second and so forth. This combines the cache-efficiency of the first approach with the early termination of the second approach.
As with all optimisations, you should measure results, and be sure that it's important that the code is fast. The efficient solution is likely to impose annoying restrictions (like 64-byte memory alignment for rows), and the code will be harder to read than a straightforward solution.

Create an "array style" countdown in c

What I want to know is whether it's possible to create a countdown in C, BUT have a condition for when it hits an "unusual" piece of data in the array. I'll explain better with examples.
This is also similar to: Read ahead in an array to predict later outcomes in C
however, it was poorly worded. So, I am rewording this question.
Ex: The array is an integer array with : 0 0 0 0 1 0 0 0 0 0 0 0 0 0 5 0 0 0 0 0 2
So, when it's zero, don't do anything.
However, when it's non-zero, display the text associated with the number (according to some condition).
With pseudocode it'd be something like this:
if 0 don't do anything ====> within this countdown till next non-zero
if != 0 then display text associated
reset countdown till next non-zero.
Is there any way this can be achieved? Basically, this would mean you could predict or read ahead in the array. Any help would really be appreciated!
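One possible reading of the pseudocode above, sketched in C: since the whole array is already in memory, "reading ahead" is just a scan forward from the current position to the next non-zero element. The names steps_to_next_nonzero and run_countdown are hypothetical, and the printed message stands in for whatever text is associated with each number.

```c
#include <assert.h>
#include <stdio.h>

/* Number of steps from index `from` to the next non-zero element at or
 * after it, or -1 if only zeros remain. */
int steps_to_next_nonzero(const int *values, int count, int from)
{
    assert(values != NULL && from >= 0);
    for (int i = from; i < count; i++) {
        if (values[i] != 0)
            return i - from;
    }
    return -1;
}

/* Walk the array: for each non-zero element, report how long the
 * countdown to it was, then restart the countdown after it.
 * Returns the number of non-zero elements reported. */
int run_countdown(const int *values, int count)
{
    int shown = 0;
    int i = 0;

    while (i < count) {
        int steps = steps_to_next_nonzero(values, count, i);

        if (steps < 0)
            break;                  /* only zeros remain */
        i += steps;
        /* Placeholder for "display the text associated with the number". */
        printf("countdown of %d, then value %d at index %d\n",
               steps, values[i], i);
        shown++;
        i++;                        /* reset: count from the next element */
    }
    return shown;
}
```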

preallocation of memory in matlab / octave [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 8 years ago.
I preallocate memory for a large array, but the new data is appended at the end of the array instead of overwriting the preallocated data. How can I fix this so that I can preallocate memory for a large array?
Note: the array is 44101x5001; I just used smaller numbers in the example.
Example:
clear all
xfreq=zeros(10,10); %allocate memory
for ww=1:1:10
xfreq_new = xfreq(:,1)+1+ww;
xfreq=[xfreq xfreq_new]; %would like this to over write and append the new data where the preallocated memory of zeros are instead of appending to the end of it.
end
If you run this you'll notice that it appends the ones instead of over writing the zeros.
Aloha
Rick
Hopefully this explains things better
Allocated array
1)Allocated memory of zeros
[0 0 0 0 0
0 0 0 0 0
0 0 0 0 0]
2) Overwriting allocated memory of zeros with a number, the number could be anything not just the number one, I used the number one as an example
[1 0 0 0 0
1 0 0 0 0
1 0 0 0 0]
3) Still Overwriting of allocated memory zeros with a number, the number could be anything not just the number one, I used the number one as an example
[1 1 0 0 0
1 1 0 0 0
1 1 0 0 0]
the problem is with this line
xfreq=[xfreq xfreq_new]; %would like this to over write and append the new data where the preallocated memory of zeros are instead of appending to the end of it.
end
This will work if you want all your entries equal to x
x = 1; % example value, any number works
n = 10; % example number of columns
A = zeros(10,n);
for i=1:n
A(:,i) = x;
end
If you want your columns equal to some other column you have to do
A = zeros(10,n);
for i=1:n
A(:,i) = v;
end
where v is a vector of size (10,1)
