COBOL programmers - How to use arrays - arrays

I am programming in COBOL and trying to put this client file in an array. I'm having trouble understanding this problem. I know that the array would probably be based on the bookingtype because there are 4 different options. Any help would be appreciated.
This is how I have the array defined so far:
01 Booking-Table.
05 BookingType OCCURS 4 TIMES PIC 9.
Here is the client file.

I guess the solution is about storing the costs in an array. To calculate the average the array would need to have cost + number with the booking type being the index used.
The "tricky" part may be sizing the totals: the maximum amount per type is the largest cost (9999.99) times the maximum number of customers with that type. All customers could have the same type, and since the client number has 3 numeric positions there can be up to 1000 of them (000 through 999), so each total must hold up to 9,999,990.00.
Something like
REPLACE ==MaxBookingType== BY ==4==.
01 Totals-Table.
05 Type-Total OCCURS MaxBookingType TIMES.
10 type-amount pic 9(8)V99 COMP.
10 type-customers pic 9(4) COMP.
Now loop through the file from start to end, check that BookingType >= 1 AND <= MaxBookingType (I'm always skeptical of claims that "data never changes and is always correct"), and then
ADD 1 TO type-customers(BookingType)
ADD trip-cost TO type-amount (BookingType)
and after end of file calculate the average for all 4 entries using a PERFORM VARYING.
The main benefit of using an "array" here is that you can update the program to have 20 booking types just by changing the value for MaxBookingType - and as you've added a check which tells you what "bad" number is seen in there you can adjust it quite fast.
I'm not sure if or how your compiler supports user-defined numeric constants. If it does, use one instead of REPLACE, which forces the compiler to scan for every occurrence of the text "MaxBookingType".
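For readers who want to see the accumulate-then-average logic outside COBOL, here is a minimal sketch in Python. The record layout (pairs of booking type and trip cost) and the names `average_cost_by_type` and `rejected` are assumptions made for illustration; the real client file layout is not shown in the question.

```python
MAX_BOOKING_TYPE = 4  # plays the role of MaxBookingType in the COBOL sketch


def average_cost_by_type(records, max_type=MAX_BOOKING_TYPE):
    """Return (averages, rejected) for an iterable of (booking_type, trip_cost).

    The (booking_type, trip_cost) layout is a hypothetical stand-in for the
    client record, which the question does not show.
    """
    totals = [0.0] * (max_type + 1)  # index 0 is unused so types index directly
    counts = [0] * (max_type + 1)
    rejected = []                    # booking types outside 1..max_type

    for booking_type, trip_cost in records:
        if 1 <= booking_type <= max_type:
            counts[booking_type] += 1
            totals[booking_type] += trip_cost
        else:
            rejected.append(booking_type)

    # Average per type; None when no customer used that type.
    averages = {
        t: (totals[t] / counts[t] if counts[t] else None)
        for t in range(1, max_type + 1)
    }
    return averages, rejected
```

Raising the number of booking types only means changing `MAX_BOOKING_TYPE`, and the `rejected` list plays the role of the "bad number" check described above.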

I believe the diagram is trying to say you need an enumeration. In COBOL, you'd implement this with
01 client-file-record.
*> ...
03 booking-type PIC 9.
88 cruise VALUE 1.
88 air-independent VALUE 2.
88 air-tour VALUE 3.
88 other VALUE 4.
*> ...
An array-approach is only necessary if the booking types (and/or their behaviour) varied at runtime.

Related

Is there a space efficent way to store and retrieve the order of a dataset?

Here's my problem. I have a set of 20 objects stored in memory as an array. I want to store a second piece of data that defines an order for the objects to be displayed.
The simplest way to store the order is as an array of 20 unsigned integers, each of which is 5 bits (aka 0-31). The position of the object in the output list would be defined by the number stored in this array at the same index as the object in its array.
But.. I know from statistics that there are only 20! (that's 20 factorial), ways to arrange these objects.
This could be stored in 62 bits, since 2^62 > 20!
I'm currently using 100 bits to store the same information.
So my question is this: Is there a space efficient way to store ORDER as a sequence of bits?
I have some additional constraints as well. This will run on an embedded device, so I can't use any huge arrays or high level math functions. I would need a simple iterative method.
Edit: Some clarification on why this is necessary. Say for example the objects are pictures, and they're stored in ROM (aka they can't be moved around). Now let's say I want to keep track of what order to display the images in, and I'm going to update that order every second. My device has 1k of storage with wear leveling, but each bit in the storage can only be written 1000 times before it becomes unreliable. If I need 1kb to store the order, then my device will only work for 1000 seconds. If I need 0.1kb, it will work for 10k seconds, and so on. Thus the device's longevity will be inversely proportional to the number of bits I need to update every cycle.
You can store the order in a single 64-bit value x:
For the first choice, 20 possibilities, compute the index as x % 20 and update x as x /= 20,
For the next choice, only 19 possibilities, compute x % 19 and update x as x /= 19.
Continue this process 17 more times and you are done.
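The divide-and-remainder steps above amount to a mixed-radix number with radices 20, 19, ..., 1. A minimal sketch in Python follows, shown only to illustrate the arithmetic; the function names and the use of a `remaining` list are assumptions, and on an embedded target the same loops would be written with a 64-bit unsigned integer and a small fixed array.

```python
def decode_order(x, n=20):
    """Turn the packed value x back into an ordering of the items 0..n-1."""
    remaining = list(range(n))        # items not yet placed
    order = []
    for radix in range(n, 0, -1):     # 20 choices, then 19, then 18, ...
        x, idx = divmod(x, radix)     # idx = x % radix, then x //= radix
        order.append(remaining.pop(idx))
    return order


def encode_order(order):
    """Pack an ordering of 0..n-1 into a single integer below n!."""
    n = len(order)
    remaining = list(range(n))
    x = 0
    multiplier = 1                    # product of the radices used so far
    for radix, item in zip(range(n, 0, -1), order):
        idx = remaining.index(item)   # position among the items still unused
        remaining.pop(idx)
        x += idx * multiplier
        multiplier *= radix
    return x
```

The largest value produced for 20 items is 20! - 1, which fits in 62 bits, so a 64-bit word is enough.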
I think I've found a partial solution to my own question. Assuming I start at the left side of the order array, for every move right there are fewer remaining possibilities for the position value. The number of possibilities is 20, 19, 18, etc. I can take advantage of this by populating the order array in a relative fashion. The first index will place a value in the order array. There are 20 possibilities, so this takes 5 bits. Placing the next value, there are only 19 positions available (still 5 bits). Proceeding through the whole array, the bits required are now 5,5,5,5,4,4,4,4,4,4,4,4,3,3,3,3,2,2,1,0. So that gets me down to 69 bits, much better.
There's still some "wasted" precision in each of the values, since for example the first position can store 32 possible values, even though there are only 20. I'm not sure how to deal with this, but I think it will have something to do with carrying a remainder from one calculation to the next.
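The bit counts above can be checked with a few lines of arithmetic. The sketch below (in Python, purely for illustration; the variable names are made up here) tallies the bits needed when each slot gets its own field and compares that with packing every slot into one number, which is what carrying a remainder between slots amounts to.

```python
# A slot with k possibilities must hold the values 0..k-1,
# which takes (k - 1).bit_length() bits when stored in its own field.
per_slot_bits = [(k - 1).bit_length() for k in range(20, 0, -1)]
separate_total = sum(per_slot_bits)

# When all slots share one number, the largest code is 20! - 1.
orderings = 1
for k in range(1, 21):
    orderings *= k
packed_total = (orderings - 1).bit_length()
```

Here `separate_total` comes out at 69 and `packed_total` at 62, so the shared representation recovers the 7 bits lost to per-slot rounding.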

Algorithm so that i can index 2^n combinations in a way so i can backtrack from any index value of 1 to 2^n without using an array

I am trying to do something but it is outside my field. To explain lets set n=3 to simplify things where n is the total number of the parameters in this example: A, B, C. These parameters can have a state of ON and OFF (aka 0 or 1).
The total number of combinations of these parameters is 2^n = 8 in this case which can be visualized as:
ABC
1: 000
2: 111
3: 100
4: 010
5: 001
6: 110
7: 011
8: 101
Of course the above list can be sorted in (2^n)! = 40320 ways.
I want an algorithm so that i can calculate the state of any of my parameters (0 or 1) given a number from 1 to 2^n. For example if i have the number of 3 using the table above i know state of A is 1 and B and C is 0. Of course you can have a table/array to look it up given a specific sorting, but even for relatively small values of n you need to have a huge table.
I'm not familiar with this and the methods you can do indexing that's why i need help.
Kind regards
Just realised you can actually look at it another way. What you want is a function encrypting N bits to another set of N bits. In practice this is the same as format preserving encryption. The question is, do you care whether:
all 2^n cases are covered, or just a large enough number close to 2^n (you have to choose the right encryption/hash method)
you want to do this one way or both ways (that is, do you ever want to ask - I have this number corresponding to that number, which permutation am I using)
If the answer is no to both, you can just find an FPE algorithm that doesn't require you to generate the whole table (some do).
I have seen another problem of finding all subsets of a given set using bitmask. You can use the same concept in your case. This link contains a good tutorial.
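A minimal sketch of the bitmask idea in Python: the state of each parameter is read straight from the bits of the index, so no table is needed. The `scramble` step is an assumption added here to show one cheap way to get a different but still reversible ordering (multiplying by an odd number modulo 2^n is a bijection); it is not a format-preserving encryption scheme and offers no secrecy.

```python
def parameter_state(code, param, n):
    """State (0 or 1) of parameter `param` (0 = A, 1 = B, ...) in an n-bit code."""
    return (code >> (n - 1 - param)) & 1


def scramble(index, n, multiplier=5, offset=3):
    """Map a 1-based index to an n-bit code; `multiplier` must be odd."""
    return ((index - 1) * multiplier + offset) % (2 ** n)


def unscramble(code, n, multiplier=5, offset=3):
    """Inverse of scramble: recover the 1-based index from the code."""
    inverse = pow(multiplier, -1, 2 ** n)  # modular inverse, Python 3.8+
    return ((code - offset) * inverse) % (2 ** n) + 1
```

With the plain ordering, use `parameter_state(index - 1, param, n)`; with the scrambled one, use `parameter_state(scramble(index, n), param, n)`. Either way the cost is a few integer operations per lookup, independent of n.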

SPSS Identifying Different Lagged Values Through Loops

I have this dataset with 2 variables: week and brand_chosen, where brand_chosen designates which product from e.g. a supermarket was chosen, and it looks like this.
Week brand_chosen
2 19
2 15
2 50
2 12
3 19
3 16
3 50
4 77
4 19
What I am trying to do is for each line, to note the week in which the brand purchase was made, and check if in the week before that the same brand purchase was made. In case it did, a variable dummy would take the value of 1, otherwise 0.
Because week appears multiple times I cannot take just the lag(week,1), so I probably need to loop through the week variables for each case, until it finds the first different value.
This is what i tried to do
loop i=1 to 70.
do if (week<>lag(week,i) and brand_chosen=lag(brand_chosen,i)).
compute dummy=1.
end loop.
else.
compute dummy=0.
end if.
end loop.
execute.
Where 70 is just an arbitrary number so that I am sure that it will check all the previous cases.
I get two problems with that. First the lag function needs to contain a number from what I understand but "i" is not considered a number here.
The second problem is that i would like to close the loop if the condition is satisfied, and move to the next case but I get an error.
I am new to spss syntax and I am struggling with that one, so any help is greatly appreciated.
I assume that every combination of week--brand_chosen is unique. In this case the solution is quite simple. Just reorder your dataset by brand_chosen and then week, and then run a simple lag command.
This should do the trick:
SORT CASES BY brand_chosen week.
COMPUTE dummy=0.
IF (brand_chosen=LAG(brand_chosen) AND week=LAG(week)+1) dummy = 1.
EXECUTE.
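To make the intended result concrete, here is the same check sketched in Python on the sample data from the question. It reads "the week before" as exactly week - 1, which is an assumption about the question, and the function name `flag_repeat_purchases` is made up for this example.

```python
def flag_repeat_purchases(rows):
    """rows: list of (week, brand_chosen) pairs.

    Returns one 0/1 dummy per row, in the original row order:
    1 if the same brand also appears in the immediately preceding week.
    """
    seen = set(rows)  # every (week, brand) combination present in the data
    return [1 if (week - 1, brand) in seen else 0 for week, brand in rows]


sample = [(2, 19), (2, 15), (2, 50), (2, 12),
          (3, 19), (3, 16), (3, 50),
          (4, 77), (4, 19)]
dummies = flag_repeat_purchases(sample)
```

On the sample data only the rows (3, 19), (3, 50) and (4, 19) get a 1, because brands 19 and 50 were bought in week 2 and brand 19 again in week 3.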

Generate unique identifier for chess board

I'm looking for something like a checksum for a chess board with pieces in specific places. I'm looking to see if a dynamic programming or memoized solution is viable for an AI chess player. The unique identifier would be used to easily check if two boards are equal or to use as indices in the arrays. Thanks for the help.
An extensively used checksum for board positions is the Zobrist signature.
It's an almost unique index number for any chess position, with the requirement that two similar positions generate entirely different indices. These index numbers are used for faster and space efficient transposition tables / opening books.
You need a set of randomly generated bitstrings:
one for each piece at each square;
one to indicate the side to move;
four for castling rights;
eight for the file of a valid en-passant square (if any).
If you want to get the Zobrist hash code of a certain position, you have to xor all random numbers linked to the given feature (details: here and Correctly Implementing Zobrist Hashing).
E.g the starting position:
[Hash for White Rook on a1] xor [White Knight on b1] xor ... ( all pieces )
... xor [White castling long] xor ... ( all castling rights )
XOR allows a fast incremental update of the hash key during make / unmake of moves.
Usually 64 bits are used as the standard size in modern chess programs (see The Effect of Hash Signature Collisions in a Chess Program).
You can expect to encounter a collision in a 32-bit hash after evaluating about √(2^32) = 2^16 positions. With a 64-bit hash, you can expect a collision after about 2^32, or roughly 4 billion, positions (birthday paradox).
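The scheme above can be sketched in a few lines of Python. The board representation (a dict from square index 0..63 to a piece letter), the fixed seed, and the function names are assumptions chosen for this example; a real engine would generate the keys once and update the hash inside its make/unmake routines.

```python
import random

PIECES = "PNBRQKpnbrqk"  # white pieces upper case, black pieces lower case


def make_zobrist_table(seed=2024):
    """Generate the random 64-bit keys; the seed is arbitrary but fixed."""
    rng = random.Random(seed)
    return {
        "piece": {(p, sq): rng.getrandbits(64) for p in PIECES for sq in range(64)},
        "black_to_move": rng.getrandbits(64),
        "castling": {c: rng.getrandbits(64) for c in "KQkq"},
        "ep_file": [rng.getrandbits(64) for _ in range(8)],
    }


def zobrist_hash(table, board, black_to_move=False, castling="", ep_file=None):
    """XOR together the keys of every feature present in the position."""
    h = 0
    for square, piece in board.items():
        h ^= table["piece"][(piece, square)]
    if black_to_move:
        h ^= table["black_to_move"]
    for right in castling:
        h ^= table["castling"][right]
    if ep_file is not None:
        h ^= table["ep_file"][ep_file]
    return h


def apply_quiet_move(table, h, piece, src, dst):
    """Incremental update for a non-capturing move, flipping the side to move."""
    h ^= table["piece"][(piece, src)]   # remove the piece from its old square
    h ^= table["piece"][(piece, dst)]   # add it on the new square
    h ^= table["black_to_move"]         # toggle whose turn it is
    return h
```

Because XOR is its own inverse, calling `apply_quiet_move` with `src` and `dst` swapped undoes the update, which is what makes unmake cheap.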
If you're looking for a checksum, the usual solution is Zobrist Hashing.
If you're looking for a true unique-identifier, the usual human-readable solution is Forsyth notation.
For a non-human-readable unique-identifier, you can store the type/color of the piece on each square using four-bits. Throw in another 3-bits for en-passant square, 4-bits for which castlings are still allowed, and one-bit for whose turn it is, and you end up with exactly 33 bytes for each board-setup.
You can use a checksum like md5, sha, just pass your chessboard cells as text, like:
TKBQKBHT
........
........
........
tkbqkbht
And get the checksum for generated text.
The checksums of two different boards will differ, with no relation between the values. At that point, building a unique string (or array of bits) directly may be the best way:
TKBQKBHT........................tkbqkbht
because it is also unique and is easy to compare with others.
If two games achieve the same configuration through different moves or move orders, they should still be "equal". e.g. You shouldn't have to distinguish between which pawn is in a particular location, as long as the location is the same. You don't seem to really want to hash, but to uniquely and correctly distinguish between these board states.
One method is to use a 64x12 square-by-piecetype membership matrix. You can store this as a bit vector and then compare vectors for the check. e.g. the first 64 addresses in the vector might show which locations on the board contain pawns. The next 64 show locations which contain knights. You could let the first 6 sections show membership of white pieces and the final 6 show membership of black pieces.
Binary membership matrix pseudocode:
bool[] memberships = zeros(64*12);
def move(piece,location,oldlocation):
    memberships(piece,location) = 1;     // piece now occupies the new square
    memberships(piece,oldlocation) = 0;  // and no longer occupies the old one
move(pawn,a3,a2);
This is cumbersome because you have to be careful how you implement it. e.g. make sure there is only one king maximum for each player. The advantage is that it only takes 768 bits to store a state.
Another way is a length-64 integer vector representing vectorized addresses for the board locations. In this case, the first 8 addresses might represent the state of the first row of the board.
Non-binary membership matrix pseudocode:
half[] memberships = zeros(64);
memberships[8] = 1; // white pawn at location a2
memberships[0] = 2; // white rook at location a1
...
memberships[62] = 11; // black knight at location g8
memberships[63] = 12; // black rook at location h8
The nice thing about the non-binary vector is you don't have as much freedom to accidentally assign multiple pieces to one location. The downside is that it is now larger to store each state. Larger representations will be slower to do equality comparisons on. (In my example, assuming each vector location stores a 16-bit half-word, we get 64*16=1024 bits to store one state compared to the 768 bits for the binary vector.)
Either way, you'd probably want to enumerate each piece and board location.
enumerate piece {
empty = 0;
white_pawn = 1;
white_rook = 2;
...
black_knight = 11;
black_rook = 12;
}
enumerate location {
a1 = 0;
...
}
And testing for equality is just comparing two vectors together.
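As a rough illustration of the binary membership matrix, the 64x12 bits can be held as twelve 64-bit masks, one per piece type. This is a sketch in Python under assumed names (`PIECE_TYPES`, `place`, `move`) and an assumed square numbering of a1 = 0 through h8 = 63; it does no legality checking beyond confirming the piece is on its source square.

```python
PIECE_TYPES = ["P", "R", "N", "B", "Q", "K",   # white
               "p", "r", "n", "b", "q", "k"]   # black


def empty_position():
    """Twelve 64-bit masks, one per piece type: 12 * 64 = 768 bits in total."""
    return [0] * len(PIECE_TYPES)


def place(position, piece, square):
    """Set the membership bit for `piece` on `square`."""
    position[PIECE_TYPES.index(piece)] |= 1 << square


def move(position, piece, src, dst):
    """Clear the bit on the old square and set it on the new one."""
    i = PIECE_TYPES.index(piece)
    if not (position[i] >> src) & 1:
        raise ValueError("piece is not on the source square")
    position[i] &= ~(1 << src)
    position[i] |= 1 << dst
```

Two positions are equal exactly when their twelve masks are equal, regardless of the move order that produced them, so `position_a == position_b` is the whole comparison.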
There are 64 squares. There are twelve different figures in chess that can occupy a square plus the possibility of no figure occupying it. Makes 13. You need 4 bits to represent those 13 (2^4 = 16). So you end up with 32 bytes to unambiguously store a chess board.
If you want to ease handling you can store 64 bytes instead, one byte per square, as bytes are easier to read and write.
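The 32-byte layout can be sketched as two squares per byte, four bits each. The piece codes (0 for empty, 1 to 12 for the twelve figures) and the function names below are assumptions for illustration; any fixed numbering works as long as both sides agree on it.

```python
def pack_board(squares):
    """Pack 64 square codes (0 = empty, 1..12 = pieces) into 32 bytes."""
    if len(squares) != 64 or any(not 0 <= s <= 12 for s in squares):
        raise ValueError("expected 64 codes in the range 0..12")
    # Even-indexed squares go in the high nibble, odd-indexed in the low nibble.
    return bytes((squares[i] << 4) | squares[i + 1] for i in range(0, 64, 2))


def unpack_board(packed):
    """Recover the 64 square codes from the 32 packed bytes."""
    squares = []
    for byte in packed:
        squares.extend((byte >> 4, byte & 0x0F))
    return squares
```

The packed `bytes` value is hashable, so it can be used directly as a dictionary key when memoizing positions.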
EDIT: I've read some more on chess and have come to the following conclusion: Two boards are only the same, if all previous boards since last capture or pawn move are also the same. This is because of the threefold repetition rule. If for the third time the board looks exactly the same in a game, a draw can be claimed. So in spite of seeing the same board in two matches, it may be considered unfortunate in one match to make a certain move, so as to avoid a draw, whereas in the other match there is no such danger.
It is up to you how you want to go about it. You would need a unique identifier of variable length due to the variable number of previous boards to store. Well, maybe you take it easy, turn a blind eye to this and just store the last five moves to detect directly repetitive moves that could lead to a third repetition of positions, this being the most frequently occurring reason.
If you want to store moves with the board: There are 64x63=4032 thinkable moves (12 bits necessary), but many of them are illegal of course. If I count correctly there are 1728 legal moves (A1->A2 = legal, A1->D2 illegal for instance), which would fit in 11 bits. I would still go for the 12 bits, however, to make interpretation as easy as possible by storing 0/1 for A1->A2 and 62/63 for H7->H8.
Then there is the 50 moves rule. You don't have to store moves here. Only the number of moves since last capture or pawn move from 0 to 50 (that's enough; it doesn't matter whether it's 50, 51 or more). So another six bits for this.
At last: Black's or white's move? Enpassantable pawn? Castlingable rook? Some additional bits for this (or extension of the 13 occupancies to save some bits).
EDIT again: So if you want to use the board to compare with other matches, then "two boards are only the same, if all previous boards since last capture or pawn move are also the same" applies. If you only want to detect repetition of positions in the same game, however, then you should be fine by just using the 15 occupancies x 64 squares plus one bit for whose move it is.

matlab error, attempt to reference field of non structure array

All the references to this error I could find searching online were completely inapplicable to my situation, they were dealing with some kind of variables involving dots, like a.b (structures in other words), whereas I am strictly using arrays. Nothing involves a dot, nor does my code ask about it.
Ok, I have this GINORMOUS array called tier2comparatorconnectionpoints. It is a 4-D array of size 400×10×20×10. Consider tier2comparatorconnectionpoints(counter,counter2,counter3,counter4).
counter is a number 1 to 400,
counter2 is a number 1 to numchromosomes(counter), and numchromosomes(counter) is bounded to 10,
counter3 is a number 1 to tier2numcomparators(counter,counter2), which is in turn bounded to 20.
counter4 is a number 1 to tier2inputspercomparator(counter,counter2,counter3), which is bounded to 10.
Now, so that I don't run out of RAM, I have tier2comparatorconnectionpoints as type int8, and UNFORTUNATELY at some point in my horrendous amount of code, I forgot to cast it to a double when I'm doing math with it, and a rounding error involved with multiplying it with a rand ends up with tier2comparatorconnectionpoints for some values of its 4 inputs exceeding what it's allowed to be.
The values it's allowed to have are 1 through tier1numcomparators(counter,counter2), which is bounded to 40, 41 through 40+tier2numcomparators(counter,counter2), with tier2numcomparators(counter,counter2) being bounded to 20, and 61 through 60+tier2numcomparators(counter,counter2), thus it's not allowed to be more than 80 since tier2numcomparators(counter,counter2) is bounded to 20 and it's not allowed to be more than 60+tier2numcomparators(counter,counter2), but it's also not allowed to be less than 40 but more than tier1numcomparators(counter,counter2) and it's not allowed to be less than 60 but more than 40+tier2numcomparators(counter,counter2). I became aware of the problem because it was being set to 81 somewhere.
This is an evolutionary simulation by the way, it's natural selection on simulated organisms. I need to hunt down the part of the code that is allowing the values of tier2comparatorconnectionpoints to exceed what it's allowed to be. But that is a separate problem.
A temporary fix of my data, just so that it at least is made to conform to its allowed values, is to set anything that is greater than tier1numcomparators(counter,counter2) but less than 40 to tier1numcomparators(counter,counter2), to set anything that is greater than 40+tier2numcomparators(counter,counter2) but less than 60 to 40+tier2numcomparators(counter,counter2), and to set anything that is greater than 60+tier2numcomparators(counter,counter2) to 60+tier2numcomparators(counter,counter2). I first found this problem because it was being set to 81, so it didn't just exceed 60+tier2numcomparators(counter,counter2), it exceeded 60+20, with tier2numcomparators being bounded to 20.
I hope this isn't all too-much-information, but I felt it might be necessary to get you to understand just what sort of variables these are.
So in my attempts to at least turn the data into valid data, I did the following:
for counter=1:size(tier2comparatorconnectionpoints,1)
    for counter2=1:size(tier2comparatorconnectionpoints,2)
        for counter3=1:size(tier2comparatorconnectionpoints,3)
            for counter4=1:size(tier2comparatorconnectionpoints,4)
                if tier2comparatorconnectionpoints(counter,counter2,counter3,counter4)>60+tier2numcomparators(counter,counter2)
                    tier2comparatorconnectionpoints(counter,counter2,counter3,counter4)=60+tier2numcomparators(counter,counter2);
                end
            end
        end
    end
end
And that worked just fine. And then:
for counter=1:size(tier2comparatorconnectionpoints,1)
    for counter2=1:size(tier2comparatorconnectionpoints,2)
        for counter3=1:size(tier2comparatorconnectionpoints,3)
            for counter4=1:size(tier2comparatorconnectionpoints,4)
                if tier2comparatorconnectionpoints(counter,counter2,counter3,counter4)>40+tier2numcomparators(counter,counter2)
                    if tier2comparatorconnectionpoints(counter,counter2,counter3,counter4)<60
                        tier2comparatorconnectionpoints(counter,counter2,counter3,counter4)=40+tier2numcomparators(counter,counter2);
                    end
                end
            end
        end
    end
end
And that's where it said "Attempt to reference field of non-structure array".
TBH it sounds like maybe you've made a typo and put a . in somewhere? Otherwise please post the entire error as maybe it's happening in a different function or something.
Either way you don't need all those for loops, it's simpler and usually quicker to do this (and should bypass your error):
First replicate your tier2numcomparators matrix so that it has the same dimension sizes as tier2comparatorconnectionpoints
T = repmat(tier2numcomparators + 40, 1, 1, size(tier2comparatorconnectionpoints, 3), size(tier2comparatorconnectionpoints, 4));
Now in one shot you can create a logical matrix of which elements meet your criteria:
ind = tier2comparatorconnectionpoints > T & tier2comparatorconnectionpoints < 60;
Finally employ logical indexing to set your desired elements:
tier2comparatorconnectionpoints(ind) = T(ind);
You can play around with bsxfun instead of repmat if this is slow or takes too much memory
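To spell out the rule the mask is meant to encode: an element is reset only when it is both above 40+tier2numcomparators and below 60. The following is a plain-Python sketch of that rule on nested lists, offered only as a way to check the logic on a tiny example; the function name and the `low_base`/`high_base` parameters are made up here, and it is not a replacement for the vectorized MATLAB code.

```python
def clamp_gap_values(points, tier2_counts, low_base=40, high_base=60):
    """Reset values that fall in the forbidden gap, in place.

    points[i][j][k][l] holds the connection points; tier2_counts[i][j]
    plays the role of tier2numcomparators(counter, counter2).
    """
    for i, block in enumerate(points):
        for j, plane in enumerate(block):
            limit = low_base + tier2_counts[i][j]
            for row in plane:
                for l, value in enumerate(row):
                    # Both conditions must hold, hence AND rather than OR.
                    if limit < value < high_base:
                        row[l] = limit
    return points
```

With a count of 3 the limit is 43, so 45 and 59 are pulled down to 43 while 61 and 10 are left alone.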

Resources