Having trouble identifying what is wrong in this FAT - fat

So I'm going through the table starting at 0000. I see 006E and that sends me down to row 0060 column E. I see 00A2 which sends me to row 00A0 colum 2. I see 00EA which sends me to row 00E0 row A. I see FFF0 indicating end of cluster. Is this wrong because more clusters are left in table? Am i going about this compeltely wrong??

This does not strike me as a particularly realistic presentation of a FAT. In the real world, the fact that each entry is two bytes long would mean that you need to multiply the cluster numbers by 2 to find their actual location in the table; that's clearly not workable here, since about half of the cluster links would point outside of the FAT as shown. If you're supposed to interpret the numbers as location offsets instead of actual cluster numbers, then they'd have to all be even - that odd 0049 at 00DA would be invalid.
You cannot truly evaluate a FAT without looking at the disk's directories, to know where each file actually starts. That 0062 at 0000 is NOT the start of a file; notice that there is a link to 0000 at 0056, and a link to 0056 at 00D2, so perhaps 00D2 is a valid file start.
A link that points to a FFFF unused cluster would indeed be a problem - I see at least one of those. Two links to the same cluster, or links that form a loop, would also be worth looking for.

I found unexpected end of file from 00AA (ends with FFFF block), recursion (begins from 00B4) and wrong adress 0049.
Further more some file chunks contains intersected data. Maybe this is some sort of FATs feature, then i can't suggest this like an anomaly )

Related

Can i find memory adress of moving or making actions in games

So I was wondering can I find memory adress of moving, doing some stuff or memory that saves enemy position in program lie cheatengine or am I need reverse engeneering for that? and hypothetically can I go futher and make bot or even ai based on that?
Your question is quite vague but I will try to answer what I understood. It will be difficult to help without code and I don't have a dissasembler open right now and I write everything by memory right now so take my instruction references with a grain of salt. They, are after all a guideline and should you make this question more accurate and potentially provide some more information about the game and/or its code I could revise my answer.
You can certainly find your character's position using unknown initial float value scan and increased/decreased value whenever you move.
I suggest going up and down because it's more clear to figure out if you increase or decrease the float value responsible for your character's Y position than going front/back and/or sideways because you don't really know if you are increasing or decreasing your X/Z value and you might need more scans.
When you find the value you're looking for you can find out what accesses or writes to that address and then get one of the instructions and find out what addresses that particular instruction accesses.
You might have a shared instruction between you and the enemies or other entities. If that's the case you can then dissect the data structures using your address and the addresses of a couple enemies or other entities and create 2 separate groups. One for the player and one for the enemies and/or other entities or 3 separate groups. One for the player, one for the enemies and one for other entities and find uncommon values between groups. I suggest int values,bytes or hex values and not floats.
Then you auto-compile an AOB script, make a new variable (you need to allocate space) called something like playerBaseCoords and compare the uncommon values between the groups and then pass the base register to [playerBaseCoords] e.g. If the instruction had [rsi+180] you do mov [playerBaseCoords],rsi
After that you add a new address to your cheat table called playerBaseCoords and make it a pointer or give it the offset that the register had. In this case 180
By the way, most of the times, the other coordinate values are close by in memory so look up and down 4-8 bytes apart from the value you found for some other float values.
Regarding AI and Bots:
I think you could probably do something like an aimbot but I have no idea how those things work.

Generate unique identifier for chess board

I'm looking for something like a checksum for a chess board with pieces in specific places. I'm looking to see if a dynamic programming or memoized solution is viable for an AI chess player. The unique identifier would be used to easily check if two boards are equal or to use as indices in the arrays. Thanks for the help.
An extensively used checksum for board positions is the Zobrist signature.
It's an almost unique index number for any chess position, with the requirement that two similar positions generate entirely different indices. These index numbers are used for faster and space efficient transposition tables / opening books.
You need a set of randomly generated bitstrings:
one for each piece at each square;
one to indicate the side to move;
four for castling rights;
eight for the file of a valid en-passant square (if any).
If you want to get the Zobrist hash code of a certain position, you have to xor all random numbers linked to the given feature (details: here and Correctly Implementing Zobrist Hashing).
E.g the starting position:
[Hash for White Rook on a1] xor [White Knight on b1] xor ... ( all pieces )
... xor [White castling long] xor ... ( all castling rights )
XOR allows a fast incremental update of the hash key during make / unmake of moves.
Usually 64bit are used as a standard size in modern chess programs (see The Effect of Hash Signature Collisions in a Chess Program).
You can expect to encounter a collision in a 32 bit hash when you have evaluated √ 232 == 216. With a 64 bit hash, you can expect a collision after about 232 or 4 billion positions (birthday paradox).
If you're looking for a checksum, the usual solution is Zobrist Hashing.
If you're looking for a true unique-identifier, the usual human-readable solution is Forsyth notation.
For a non-human-readable unique-identifier, you can store the type/color of the piece on each square using four-bits. Throw in another 3-bits for en-passant square, 4-bits for which castlings are still allowed, and one-bit for whose turn it is, and you end up with exactly 33 bytes for each board-setup.
You can use a checksum like md5, sha, just pass your chessboard cells as text, like:
TKBQKBHT
........
........
........
tkbqkbht
And get the checksum for generated text.
The checksum between one to other board will be different without any related value, at this point may be create a unique string (or array of bits) is the best way:
TKBQKBHT........................tkbqkbht
Because it will be unique too and is easily compare with others.
If two games achieve the same configuration through different moves or move orders, they should still be "equal". e.g. You shouldn't have to distinguish between which pawn is in a particular location, as long as the location is the same. You don't seem to really want to hash, but to uniquely and correctly distinguish between these board states.
One method is to use a 64x12 square-by-piecetype membership matrix. You can store this as a bit vector and then compare vectors for the check. e.g. the first 64 addresses in the vector might show which locations on the board contain pawns. The next 64 show locations which contain knights. You could let the first 6 sections show membership of white pieces and the final 6 show membership of black pieces.
Binary membership matrix pseudocode:
bool[] memberships = zeros(64*12);
move(pawn,a3,a2);
def move(piece,location,oldlocation):
memberships(pawn,location) = 1;
memberships(pawn,oldlocation) = 0;
This is cumbersome because you have to be careful how you implement it. e.g. make sure there is only one king maximum for each player. The advantage is that it only takes 768 bits to store a state.
Another way is a length-64 integer vector representing vectorized addresses for the board locations. In this case, the first 8 addresses might represent the state of the first row of the board.
Non-binary membership matrix pseudocode:
half[] memberships = zeros(64);
memberships[8] = 1; // white pawn at location a2
memberships[0] = 2; // white rook at location a1
...
memberships[63] = 11; // black knight at location g8
memberships[64] = 12; // black rook at location h8
The nice thing about the non-binary vector is you don't have as much freedom to accidently assign multiple pieces to one location. The downside is that it is now larger to store each state. Larger representations will be slower to do equality comparisons on. (in my example, assume each vector location stores a 16-bit half-word, we get 64*16=1014 bits to store one state compared to the 768 bits for the binary vector)
Either way, you'd probably want to enumerate each piece and board location.
enumerate piece {
empty = 0;
white_pawn = 1;
white_rook = 2;
...
black_knight = 11;
black_rook = 12;
}
enumerate location {
a1 = 0;
...
}
And testing for equality is just comparing two vectors together.
There are 64 squares. There are twelve different figures in chess that can occupy a square plus the possibility of no figure occupying it. Makes 13. You need 4 bits to represent those 13 (2^4 = 16). So you end up with 32 bytes to unambiguously store a chess board.
If you want to ease handling you can store 64 bytes instead, one byte per square, as bytes are easier to read and write.
EDIT: I've read some more on chess and have come to the following conclusion: Two boards are only the same, if all previous boards since last capture or pawn move are also the same. This is because of the threefold repetition rule. If for the third time the board looks exactly the same in a game, a draw can be claimed. So in spite of seeing the same board in two matches, it may be considered unfortunate in one match to make a certain move, so as to avoid a draw, whereas in the other match there is no such danger.
It is up to you, how you want to go about it. You would need a unique identifyer of variable length due to the variable number of previous boards to store. Well, maybe you take it easy, turn a blind eye to this and just store the last five moves to detect directly repetetive moves that could lead to a third repetion of positions, this being the most often occuring reason.
If you want to store moves with the board: There are 64x63=4032 thinkable moves (12 bits necessary), but many of them illegal of course. If I count correctly there are 1728 legal moves (A1->A2 = legal, A1->D2 illegal for instance), which would fit in 11 bits. I would still go for the 12 bits, however, as to make interpretion as easy as possible by storing 0/1 for A1->A2 and 62/63 for H7->H8.
Then there is the 50 moves rule. You don't have to store moves here. Only the number of moves since last capture or pawn move from 0 to 50 (that's enough; it doesn't matter whether it's 50, 51 or more). So another six bits for this.
At last: Black's or white's move? Enpassantable pawn? Castlingable rook? Some additional bits for this (or extension of the 13 occupancies to save some bits).
EDIT again: So if you want to use the board to compare with other matches, then "two boards are only the same, if all previous boards since last capture or pawn move are also the same" applies. If you only want to detect repetion of positions in the same game, however, then you should be fine by just using the 15 occupancies x 64 squares plus one bit for who's move it is.

What is a good approach to check if an item is in a very big hashset?

I have a hashset that cannot be entirely loaded into the memory. So let's say it has ABC part and each one could be loaded into memory but not all at one time.
I also have random entries coming in from time to time which I can barely tell which part it could potentially belong to. So one of the approaches could be that I load A first and then make a check, and then B, C. But next entry could belong to B so I have to unload C, and then load A, then B...Hopefully I make this understood.
This clearly would be very slow so I wonder is there a better way to do that? (if using db is not an alternative)
I suggest that you don't use some criteria to put data entry either to A or to B. In other words, A,B,C - it's just result of division of whole data to 3 equal parts. Am I right? If so I recommend you add some criteria when you adding new entry to your set. For example, if your entries are numbers put those who starts from 0-3 to A, those who starts from 4-6 - to B, from 7-9 to C. When your search something, you apriori now that you have to search in A or in B, or in C. If your entries are words - the same solution, but now criteria is first letter. May be here better use not 3 sets but 26 - size of english alphabet. Please note, that you anyway have to store one of sets in memory. You see one advantage - you do maximum 1 load/unload operation, you don't need to check all sets - you now which of them can really store your value. This idea is widely using in DB - partitioning. If you store in sets nor numbers nor words but some complex objects you anyway can invent some simple criteria.

Implementing a basic predator-prey simulation

I am trying to implement a predator-prey simulation, but I am running into a problem.
A predator searches for nearby prey, and eats it. If there are no near by prey, they move to a random vacant cell.
Basically the part I am having trouble with is when I advanced a "generation."
Say I have a grid that is 3x3, with each cell numbered from 0 to 8.
If I have 2 predators in 0 and 1, first predator 0 is checked, it moves to either cell 3 or 4
For example, if it goes to cell 3, then it goes on to check predator 1. This may seem correct
but it kind of "gives priority" to the organisms with lower index values.. I've tried using 2 arrays, but that doesn't seem to work either as it would check places where organisms are but aren't. ._.
Anyone have an idea of how to do this "fairly" and "correctly?"
I recently did a similar task in Java. Processing the predators starting from the top row to bottom not only gives "unfair advantage" to lower indices but also creates patterns in the movement of the both preys and predators.
I overcame this problem by choosing both row and columns in random ordered fashion. This way, every predator/prey has the same chance of being processed at early stages of a generation.
A way to randomize would be creating a linked list of (row,column) pairs. Then shuffle the linked list. At each generation, choose a random index to start from and keep processing.
More as a comment then anything else if your prey are so dense that this is a common problem I suspect you don't have a "population" that will live long. Also as a comment update your predators randomly. That is, instead of stepping through your array of locations take your list of predators and randomize them and then update them one by one. I think is necessary but I don't know if it is sufficient.
This problem is solved with a technique called double buffering, which is also used in computer graphics (in order to prevent the image currently being drawn from disturbing the image currently being displayed on the screen). Use two arrays. The first one holds the current state, and you make all decisions about movement based on the first array, but you perform the movement in the other array. Then, you swap their roles.
Edit: Looks like I didn't read your question thoroughly enough. Double buffering and randomization might both be needed, depending on how complex your rules are (but if there are no rules other than the ones you've described, randomization should suffice). They solve two distinct problems, though:
Double buffering solves the problem of correctness when you have rules where decisions about what will happen to a creature in a cell depends on the contents of neighbouring cells, and the decisions about neighbouring cells also depend on this cell. If you e.g. have a rule that says that if two predators are adjacent, they will both move away from each other, you need double buffering. Otherwise, after you've moved the first predator, the second one won't see any adjacent predator and will remain in place.
Randomization solves the problem of fairness when there are limited resources, such as when a prey only can be eaten by one predator (which seems to be the problem that concerned you).
How about some sort of round robin method. Put your predators in a circular linked list and keep a pointer to the node that's currently "first". Then, advance that first pointer to the next place in the list each generation. You could insert new predators either at the front or the back of your circular list with ease.

Loading tiles for a 2D game

Im trying to make an 2D online game (with Z positions), and currently im working with loading a map from a txt file. I have three different map files. One contains an int for each tile saying what kind of floor there is, one saying what kind of decoration there is, and one saying what might be covering the tile. The problem is that the current map (20, 20, 30) takes 200 ms to load, and I want it to be much much bigger. I have tried to find a good solution for this and have so far come up with some ideas.
Recently I'v thought about storing all tiles in separate files, one file per tile. I'm not sure if this is a good idea (it feels wrong somehow), but it would mean that I wouldn't have to store any unneccessary tiles as "-1" in a text file and I would be able to just pick the right tile from the folder easily during run time (read the file named mapXYZ). If the tile is empty I would just be able to catch the FileNotFoundException. Could anyone tell me a reason for this being a bad solution? Other solutions I'v thought about would be to split the map into smaller parts or reading the map during startup in a BackgroundWorker.
Try making a much larger map in the same format as your current one first - it may be that the 200ms is mostly just overhead of opening and initial processing of the file.
If I'm understanding your proposed solution (opening one file per X,Y or X,Y,Z coordinate of a single map), this is a bad idea for two reasons:
There will be significant overhead to opening so many files.
Catching a FileNotFoundException and eating it will be significantly slower - there is actually a lot of overhead with catching exceptions, so you shouldn't rely on them to perform application logic.
Are you loading the file from a remote server? If so, that's why it's taking so long. Instead you should embed the file into the game. I'm saying this because you probably take 2-3 bytes per tile, so the file's about 30kb and 200ms sounds like a reasonable download time for that size of file (including overhead etc, and depending on your internet connection).
Regarding how to lower the filesize - there are two easy techniques I can think of that will decrease the filesize a bit:
1) If you have mostly empty squares and only some significant ones, your map is what is often referred to as 'sparse'. When storing a sparse array of data you can use a simple compression technique (formally known as 'run-length encoding') where each time you come accross empty squares, you specify how many of them there are. So for example instead of {0,0,0,0,0,0,0,0,0,0,1,1,2,3,0,0,0,0,0,0,0,0,0,0,0,0,1} you could store {10 0's, 1, 1, 2, 3, 12 0's, 1}
2) To save space, I recommend that you store everything as binary data. The exact setup of the file mainly depends on how many possible tile types there are, but this is a better solution than storing the ascii characters corresponding to the base-10 representation of the numers, separated by delimiters.
Example Binary Format
File is organized into segments which are 3 or 4 bytes long, as explained below.
First segment indicates the version of the game for which the map was created. 3 bytes long.
Segments 2, 3, and 4 indicate the dimensions of the map (x, y, z). 3 bytes long each.
The remaining segments all indicate either a tile number and is 3 bytes long with an MSB of 0. The exception to this follows.
If one of the tile segments is an empty tile, it is 4 bytes long with an MSB of 1, and indicates the number of empty tiles including that tile that follow.
The reason I suggest the MSB flag is so that you can distinguish between segments which are for tiles, and segments which indicate the number of empty tiles which follow that segment. For those segments I increase the length to 4 bytes (you might want to make it 5) so that you can store larger numbers of empty tiles per segment.

Resources