Database file with 78 9C header? - database

I've come to work with a strange database file format.
Each DB comes with two files: one is "database.db" and the other is "database.key".
The ".db" file always starts with the binary header 0x78 0x9C, while the ".key" file always contains the string "1.00 Peter's B Tree" somewhere inside it, at no fixed position.
Looking online, I found that the header 0x78 0x9C could indicate zlib compression, but I have not found any way to view the contents of the database.
Does anyone here know something that could help me with this format? Thanks :)
Edit 1:
It appears that the ".db" file contains more than one zlib-deflated stream:
The signature 0x78 0x9C is present not only at the beginning of the file but also in different parts of it.
For example, these are some of the streams I can find in one file:
78 9C CB 63 40 07 33 76 5B 6A AF 78 DD 54 23 CE C9 90 C4 78 89 81 89 81 F1 22 86 9A ED 6A D7 44 F6 03 D5 B0 31 30 94 60 91 F6 D4 2A 76 3B 0C 94 E6 63 60 2C 51 B6 63 00 00 22 13 11 57
78 9C CB 63 40 07 2F 53 D7 B8 9F EC 8B B2 E1 7A F1 32 87 F1 12 03 23 03 E3 45 0C 35 4B B7 68 5B CD 90 2E E7 65 67 60 2A 51 B6 63 00 00 A6 E8 0C 5D
By inflating those 2 streams I get 2 new uncompressed streams.
I then wrote a C# program that loads a ".db" file and creates a list of byte arrays, where each byte array is one deflated stream.
To do this I simply split the file at every 78 9C.
This seems to work with some of the ".db" files, but in other cases it gave me errors such as "Invalid distance code", with this stream
78 9C E2 13 FD 2F 14 9F CD 9B 29 3E 65 9F A0 F8 BC 7C 92 E2 93 EF 29 8A CF B0 A7 29 3E 8D FE 4A F1 B9 F2 0C C5 27 C4 B3 14 EF F5 5B 28 DE B5 B7 52 BC FF 6E A3 78 27 DD 4E F1 9E B8 83 E2 DD 6D 27 C5 FB D4 2E FA F0 6A EE A6 78 EF 78 EE EA 2F AA D3 91 FE 1F 2F 94 78 6C
or "Invalid stored block lengths", with this stream
78 9C 90 35 CE 34 2F 0C 7D FE A5 57 C9 FF D5 2B 47 5B B7 C4 7F 69 EA 3F 0F AC 25 F4 45 49 3D CC FF 00 E5 AE 30 40
Maybe simply splitting the file at each 78 9C is not the correct way of doing it ...
As for the ".key" files: I was able to open them using Peter Graf's "PBL" library.
With "pblKfGetAbs()" (http://www.mission-base.com/peter/source/pbl/doc/keyfile.html) I managed to get all records related to each key in the file. These records are 4-byte values.
Searching for these values with a hex editor in a decompressed ".db" file (one that did not give me errors during the inflate process), I was able to get some hits but nothing more. I don't understand what those records in the key file mean...
Thank you for the help !

Yes, those are very likely zlib streams stored in the database.
There is nothing keeping 78 9c from appearing in the compressed data, so simply searching for that is not a good way to extract the contents of the file. Also 78 9c is not the only valid zlib header. The easiest way to find the valid zlib streams is to simply start decompressing at every byte. zlib will very quickly rule out most as not having a valid zlib header. For the rest you can decompress until it completes or fails. If it completes with a good integrity check (returning Z_STREAM_END), then it is extremely likely that that was an intentional compressed zlib stream.
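A minimal sketch of that brute-force scan, written in Python rather than the asker's C#. The helper name `find_zlib_streams` is made up for this illustration. It uses a cheap two-byte header check to skip most offsets, then lets zlib decide: a stream counts only if decompression reaches the end marker and passes the integrity check.

```python
import zlib


def find_zlib_streams(data: bytes):
    """Try every byte offset; return (offset, compressed_length, payload) tuples.

    Hypothetical helper for illustration, not part of any library.
    """
    view = memoryview(data)
    found = []
    for offset in range(len(data) - 1):
        cmf, flg = data[offset], data[offset + 1]
        # Quick reject: method must be 8 (deflate), window size code <= 7,
        # and the 16-bit header must be a multiple of 31.
        if cmf & 0x0F != 8 or (cmf >> 4) > 7 or ((cmf << 8) | flg) % 31 != 0:
            continue
        d = zlib.decompressobj()
        try:
            payload = d.decompress(view[offset:])
        except zlib.error:
            continue  # e.g. "invalid distance code": not a real stream
        if d.eof:  # end of stream reached and Adler-32 verified
            consumed = len(data) - offset - len(d.unused_data)
            found.append((offset, consumed, payload))
    return found
```

Because the compressed length of each hit is known, the next real stream can be looked for right after it instead of splitting the file at every 78 9C.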
You are trying to reverse-engineer a database format with what appears to be relatively little to go on. This is a detective job that Stack Overflow can't help with, unless someone here knows the format and recognizes it.

These are zlib magic headers widely used by different utilities (such as Git, Memcached, etc).
To uncompress the file, you can use the following command:
printf "\x1f\x8b\x08\x00\x00\x00\x00\x00" | cat - zlib-file.dump | gunzip
To skip some bytes before, use dd, e.g.
cat <(printf "\x1f\x8b\x08\x00\x00\x00\x00\x00") <(dd skip=100 if=zlib-file.dump bs=1 of=/dev/stdout) | gunzip
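Note that gunzip will normally still end with an error, because a zlib stream ends in a 4-byte Adler-32 rather than the CRC-32 and length trailer that gzip expects. Judge by the output: if nothing sensible is decompressed before the error, consider the data faulty.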

The .db files contain compressed data; the .key files hold key information used to find the wanted data in those .db files (like an index file). After you open them, you may not find string data in the .db files, because they are runtime databases: they contain binary data like 'packets', and it is compressed as he said.

78 9C is the zlib header for the default compression level.
Try Aluigi's offzip commandline tool to extract the data.
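As a rough illustration of why 78 9C means "deflate, 32 KiB window, default level", here is a small Python sketch that decodes the two header bytes following the zlib format (RFC 1950). The function name `describe_zlib_header` is only illustrative.

```python
def describe_zlib_header(cmf: int, flg: int) -> dict:
    """Decode the two-byte zlib header (CMF, FLG)."""
    return {
        "method": cmf & 0x0F,                       # 8 means deflate
        "window_bits": (cmf >> 4) + 8,              # 15 means a 32 KiB window
        "check_ok": ((cmf << 8) | flg) % 31 == 0,   # header must be a multiple of 31
        "preset_dict": bool(flg & 0x20),            # FDICT flag
        "level": ["fastest", "fast", "default", "maximum"][flg >> 6],
    }


print(describe_zlib_header(0x78, 0x9C))
```

The same check shows that 78 01, 78 5E and 78 DA are also valid headers, which is why searching for 78 9C alone can miss streams.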

Related

How does this video file crash discord?

https://cdn.discordapp.com/attachments/882079986599751680/954144030449631332/UOOOOPA.webm
This video will crash Discord if you open it in Discord (do it at your own risk).
I was wondering how this video achieves that. I downloaded and hexdumped the video and found that at the end of it a series of data is repeated, which might be what actually crashes Discord (it crashes when playback reaches the end).
You can see the pattern in the image.
This goes from offset 00036680 to 000375e0.
HEX : 40 b4 d3 4d 34 d3 4c 8c 00 00 00 08 40
ASCII : @ M 4 L NUL NUL NUL BS @
This looks to be the pattern that is getting repeated.
I also found another set of repeating data going from offset 00033c90 to 000364a0, which is immediately before the one above.
52 a9 52 a5 52 a5 4a a5 4a 95 4a 95 2a 95 2a
55 2a 54 aa 54 a9 54 a9 52 a9 52 a5 52 a5 4a
a5 4a 95 4a 95 2a 95 2a 55 2a 54 aa 54 a9 54
This looks to be the data that is repeating.
I don't know whether this is common in WebM videos or whether it is malicious data injected into the file.
How are these kinds of videos made?
(I've also seen some other WebM videos that have an infinite duration.)

Possible CRC-32 Checksum Reverse Engineer Crack

I'm trying to change some data from an old PS1 game save file but the data keeps getting corrupted even though I isolated the exact bit I wanted to change and made sure nothing else was altered.
I changed some data in each save file, and here are the possible checksum values I found:
8A B5 2E CC
BD E6 AE 3A
B7 88 25 21
61 EC 03 37
35 3E 6D 59
11 48 91 D0
77 4B B2 85
85 55 F7 B5
Any advice or help is appreciated.

h264 inside AVI, MP4 and "Raw" h264 streams. Different format of NAL units (or ffmpeg bug)

TL;DR: I want to read raw h264 streams from AVI/MP4 files, even broken/incomplete.
Almost every document about h264 tells me that it consists of NAL packets. Okay. Almost everywhere I am told that a packet should start with a signature like 00 00 01 or 00 00 00 01. For example, https://stackoverflow.com/a/18638298/8167678, https://stackoverflow.com/a/17625537/8167678
The format of H.264 is that it’s made up of NAL Units, each starting
with a start prefix of three bytes with the values 0x00, 0x00, 0x01
and each unit has a different type depending on the value of the 4th
byte right after these 3 starting bytes. One NAL Unit IS NOT one frame
in the video, each frame is made up of a number of NAL Units.
Okay.
I downloaded random_youtube_video.mp4 and stripped out one frame from it:
ffmpeg -ss 10 -i random_youtube_video.mp4 -frames 1 -c copy pic.avi
And got:
The red part is the AVI container; the rest is the actual data.
As you can see, here I have 00 00 24 A9 instead of 00 00 00 01.
This AVI file plays perfectly.
I did the same for the MP4 container:
As you can see, the bytes here are exactly the same.
This MP4 file plays perfectly.
I tried to strip out the raw data:
ffmpeg -i pic.avi -c copy pic.h264
This file can't be played in VLC, and even ffmpeg, which produced it, can't parse it:
I downloaded an MP4 stream analyzer and got:
MP4Box tells me:
Cannot find H264 start code
Error importing pic.h264: BitStream Not Compliant
It is very hard to learn the internals of h264 when nothing works.
So, I have questions:
What is the actual data inside the MP4?
What must I read to decode that data (I mean the different annexes)?
How do I read the stream and get a decoded image (even with ffmpeg) from this "broken" raw stream?
UPDATE:
It seems to be a bug in ffmpeg:
When I do a double conversion:
ffmpeg -ss 10 -i random_youtube_video.mp4 -frames 1 -c copy pic.mp4
ffmpeg -i pic.mp4 -c copy pic.h264
But when I convert the file directly:
ffmpeg -ss 10 -i random_youtube_video.mp4 -frames 1 -c copy pic.h264
I get the NAL signatures and one extra NAL unit. The other bytes are the same (selected).
Is this a bug?
UPDATE
No, this is not a bug. You must use the option -bsf h264_mp4toannexb to save the stream in "Annex B" format (with prefixes).
"I want to read raw h264 streams from AVI files, even broken/incomplete."
"Almost everywhere told to me that the packet should start with a signature like : 00 00 01 or 00 00 00 01"
"...As you can see, here I have 00 00 24 A9 instead of 00 00 00 01"
Your H264 is in AVCC format which means it uses data sizes (instead of data start codes). It is only Annex-B that will have your mentioned signature as start code.
You seek frames not by looking for start codes but by skipping ahead by each frame's size until you reach the correct offset of the requested frame...
AVI processing :
Read size (four) bytes (32-bit integer, Little Endian).
Extract the next following bytes up to size amount.
This is your H.264 frame (in AVCC format), decode the bytes to view image.
To convert into Annex-B, try replacing first 4 bytes of H.264 frame bytes with 00 00 00 01.
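A minimal sketch of the last step in Python, assuming the frame bytes have already been cut out of the container and use 4-byte big-endian length prefixes, as in the dump discussed in this answer. The function name `avcc_to_annexb` is hypothetical. It walks every length-prefixed NAL unit in the frame rather than only the first one, since a frame may hold several.

```python
START_CODE = b"\x00\x00\x00\x01"


def avcc_to_annexb(frame: bytes, length_size: int = 4) -> bytes:
    """Replace each AVCC length prefix with an Annex-B start code.

    length_size=4 is an assumption; the real value is signalled in the
    container's codec configuration.
    """
    out = bytearray()
    pos = 0
    while pos + length_size <= len(frame):
        nal_len = int.from_bytes(frame[pos:pos + length_size], "big")
        pos += length_size
        # A truncated file may hold fewer bytes than announced; keep what exists.
        out += START_CODE + frame[pos:pos + nal_len]
        pos += nal_len
    return bytes(out)
```

For a whole file, ffmpeg's h264_mp4toannexb bitstream filter mentioned in the question does the same job and also takes care of the parameter sets.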
Consider your shown AVI bytes (see first picture) :
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 4C 49 53 54 BA 24 00 00 6D 6F 76 69 ....LISTº$..movi
30 30 64 63 AD 24 00 00 00 00 24 A9 65 88 84 27 00dc.$....$©eˆ„'
C7 11 FE B3 C7 83 08 00 08 2A 7B 6E 59 B5 71 E1 Ç.þ³Çƒ...*{nYµqá
E3 9C 0E 73 E7 10 50 00 18 E9 25 F7 AA 7D 9C 30 ãœ.sç.P..é%÷ª}œ0
E6 2F 0F 20 00 3A 64 AA CA 5E 4F CA FF AE 20 04 æ/. .:dªÊ^OÊÿ® .
07 81 40 00 48 00 0A 28 71 21 84 48 06 18 90 0C ..@.H..(q!„H....
31 14 57 9E 7A CD 63 A0 E0 9B 96 69 C5 18 AE F2 1.WžzÍc à›–iÅ.®ò
E6 07 02 29 01 20 10 70 A1 0F 8C BC 73 F0 78 FA æ..). .p¡.Œ¼sðxú
9E 1D E1 C2 BF 8C 62 CE CE AC 14 5A A4 E1 45 44 ž.á¿ŒbÎά.Z¤áED
38 38 85 DB 12 57 3E F6 E0 FB AE 03 04 21 62 8D 88…Û.W>öàû®..!b.
F6 F1 1E 37 1C A2 FF 75 1C F1 02 66 0C 92 07 06 öñ.7.¢ÿu.ñ.f.’..
15 7C 90 15 6F 7D FC BD 13 1E 2B 0C 14 3C 0C 00 .|..o}ü½..+..<..
B0 EA 6F 53 B4 98 D7 80 7A 68 3E 34 69 20 D2 FA °êoS´˜×€zh>4i Òú
F0 91 FC 75 C6 00 01 18 C0 00 3B 9A C5 E2 7D BF ð‘üuÆ...À.;šÅâ}¿
Some explanation :
Ignore leading multiple 00 bytes.
4C 49 53 54 BA 24 00 00 6D 6F 76 69 = AVI "LIST" header (LIST, list size, then "movi"), followed by 30 30 64 63 = "00dc", the ID of the video chunk.
AD 24 00 00 == decimal 9389 is AVI's own size of H264 item (must read in Little Endian).
Notice that the AVI bytes include...
- a note of item's total size (AD 24 00 00... or reverse for Little Endian : 00 00 24 AD)
- followed by item data (00 00 24 A9 65 88 84 27 ... etc ... C5 E2 7D BF).
This size covers everything in the item data that follows: the 4-byte length entry at its start plus the frame's own bytes. Can be written simply as:
AVI_Item_Size = ( 4 + item_H264_Frame.length );
H.264 video frame bytes in AVI :
Next follows the item data, which is the H.264 video frame. By sheer coincidence of formats/bytes layout, it too holds a 4-byte entry for data's size (since your H264 is in AVCC format, if it was Annex-B then you would be seeing start code bytes here instead of size bytes).
Unlike AVI bytes, these H264 size bytes are written in Big Endian format.
00 00 24 A9 = size of bytes for this video frame (instead of start code : 00 00 00 01).
65 88 84 27 C7 11 FE B3 C7 = H.264 keyframe (always begins X5, where the X value is based on other settings).
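To check the two size fields against each other, here is a short Python sketch that reads them straight from the bytes shown in the dump above (starting at the 30 30 64 63 chunk ID). Only the standard `struct` module is used; the variable names are illustrative.

```python
import struct

# Bytes copied from the dump: chunk ID, AVI size, H.264 size, first frame bytes.
chunk = bytes.fromhex("30 30 64 63 AD 24 00 00 00 00 24 A9 65 88 84 27")

fourcc = chunk[0:4]                               # b"00dc"
(avi_size,) = struct.unpack_from("<I", chunk, 4)  # little endian, AVI side
(nal_size,) = struct.unpack_from(">I", chunk, 8)  # big endian, AVCC side
first_nal_byte = chunk[12]

print(fourcc, avi_size, nal_size, hex(first_nal_byte))
# prints: b'00dc' 9389 9385 0x65
```

The two values differ by exactly 4, which matches the formula given earlier in this answer.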
Remember: after the four size bytes (or after a start code), if the next byte is...
byte X5 = keyframe (IDR), example byte 65.
byte X1 = P or B frame, example byte 41.
byte X6 = SEI (Supplemental Enhancement Information).
byte X7 = SPS (Sequence Parameter Set).
byte X8 = PPS (Picture Parameter Set).
byte X9 = Access unit delimiter, example byte 09.
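The "X5", "X1", ... pattern in the list above comes from the NAL unit type being stored in the low 5 bits of that first byte. A small Python sketch of the lookup, with illustrative names and only the types listed above:

```python
NAL_TYPES = {
    1: "non-IDR slice (P or B frame)",
    5: "IDR slice (keyframe)",
    6: "SEI",
    7: "SPS",
    8: "PPS",
    9: "access unit delimiter",
}


def nal_unit_type(header_byte: int) -> int:
    """The NAL unit type is the low 5 bits of the first NAL byte."""
    return header_byte & 0x1F


def describe_nal(header_byte: int) -> str:
    t = nal_unit_type(header_byte)
    return NAL_TYPES.get(t, f"other (type {t})")


for b in (0x65, 0x41, 0x06, 0x67, 0x68, 0x09):
    print(f"{b:02X} -> {describe_nal(b)}")
```

The upper three bits hold the forbidden-zero bit and the reference priority, which is why the same type can show up as 25, 45 or 65.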
You can find the H.264 if you search for exact same bytes within AVI file. See third picture, these are your H.264 bytes (they are cut & pasted into the AVI container).
Sometimes a frame is sliced into different NAL units. So if you extract a key frame and it only shows 1/2 or 1/3 instead of full image, just grab next one or two NAL and re-try the decode.

Hexadecimal/Binary to QR Code

Is there any way to create a QR code from a byte array? I decoded one using "zxing", and now that I have changed it, I want to turn it back into a QR code. If there is a solution, please tell me. Here are the bytes produced by "zxing":
40 07 01 18 2b 3c ba 4c 0e 1d bd 8a b4 23 29 10
40 72 b0 fe 7f 12 7c 71 2f f2 2b 8e 2a 2b b9 88
21 93 94 83 c8 b2 57 d8 a1 5f 0f 70 c3 56 8f 88
81 16 70 1d b0 b8 dc 0d ce 4c 1e 7c 01 85 26 74
d3 ae ce 6b b0 4b 02 6a 45 50 11 1b 65 2c 5e e2
cc 4a 65 f2 04 94 27 84 6a 88 2c c1 92 8b 65 b3
4d a4 9a 07 4f 41 14 bd 6e b6 ab 02 ca cc 7b dd
fe 34 60 ec 11 ec 11 ec 11 ec
The array format and the spaces seem important. Now, here is the original QR Code I put into "zxing":
Thanks everyone!
---In Response to the On-Hold Message:---
I tried to convert the array to a QR Code, but it was much different from the original. I expected it to be the same.
There are different ways of encoding the same information for a QR code. One of several possible mask patterns is applied to the information in order to produce a QR code that has a well distributed pattern of black and white pixels. In a QR code on a piece of paper that is bent in some way, the apparent distances of the pixels may vary. Long series of white or black pixels could disturb the reading process. When reading a QR code, the edges between the white and black pixels are used to synchronize with the raster used.
In order to test if the printed QR code is correct, re-read it and compare the information with the information of the original QR code. Don't compare the pictures!
See QR code / Encoding on Wikipedia.
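A toy Python sketch of the masking idea, to show why two images of the same data can look different. It uses three of the standard mask conditions on a tiny made-up 4x4 grid; a real encoder masks only the data modules of a full symbol and picks the mask by a penalty score, neither of which is modelled here.

```python
# Three of the eight QR mask conditions (r = row, c = column).
MASKS = {
    0: lambda r, c: (r + c) % 2 == 0,
    1: lambda r, c: r % 2 == 0,
    2: lambda r, c: c % 3 == 0,
}


def apply_mask(modules, mask_id):
    """XOR every module with the mask; applying it twice undoes it."""
    cond = MASKS[mask_id]
    return [[bit ^ int(cond(r, c)) for c, bit in enumerate(row)]
            for r, row in enumerate(modules)]


data = [[1, 1, 1, 1],
        [1, 1, 1, 1],
        [0, 0, 0, 0],
        [0, 0, 0, 0]]

image_a = apply_mask(data, 0)
image_b = apply_mask(data, 1)
print(image_a != image_b)                  # the pictures differ
print(apply_mask(image_a, 0) == data)      # unmasking recovers the data
print(apply_mask(image_b, 1) == data)
```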

Trying to identify a filesystem / disk label, recover it

I'm trying to ID (and maybe recover) the filesystem/partition table. Friend brought a "broken" USB drive, Windows can't recognise the partition layout.
Under Linux, fdisk says the partition table is empty. Tried mounting it as NTFS, vfat, no luck. With fdisk/mkfs, created an empty: DOS partition table, ntfs and fat filesystems, tried to compare magic numbers in the first block of the respective three and the broken drive - none seem alike. dd'd the first 1MB of the drive to a file on disk (so that file doesn't say it's a block device), file said "data".
These are the first 8 lines of hd:
00000000 0e 21 e9 6e 2c 64 39 b5 63 bf a5 08 8b 07 85 a6 |.!.n,d9.c.......|
00000010 63 aa ec 58 c3 ff fb 92 64 ec 80 02 f4 3c 4c d1 |c..X....d....<L.|
00000020 8f 2a e4 58 24 39 ba 3d 86 4a 8e e0 d3 27 ac 60 |.*.X$9.=.J...'.`|
00000030 eb 81 73 9f 26 68 f6 15 72 60 02 6b 32 32 4c 75 |..s.&h..r`.k22Lu|
00000040 b1 0a cd ff ff ff f4 ea 23 c8 2a ba 25 01 20 9d |........#.*.%. .|
00000050 26 52 b1 31 2c 4d 72 b1 2f bc 9f 1f 59 5b 98 98 |&R.1,Mr./...Y[..|
00000060 41 9d 3c 10 17 d0 58 9a ab 24 d9 31 ff 3a 79 55 |A.<...X..$.1.:yU|
00000070 f3 88 08 6b 57 ec 7a 5f ff e0 21 c7 87 4c 62 83 |...kW.z_..!..Lb.|
Any idea how to proceed with the recovery?
If you study the fdisk code on Linux, you will see code to create and parse the Master Boot Record (MBR). It holds the partition table, which contains the type code of each partition, its starting block/offset, the bootable/non-bootable flag, etc. If this table is corrupted, then it is difficult to recover.
One option is to find out where the MBR is stored on the USB drive; it is at a standard location (the first sector of the device). If the data there is not readable, then go beyond it and see where the first file system block resides (most probably also at a fixed starting location). If the hex dump is recognizable at this location, create a partition table with this block number and see if the boot works.
The other option is to find out whether the file system keeps a duplicate copy of this table on the USB drive. Study the file system that formatted the USB drive and you may get closer.
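A minimal Python sketch of checking a first sector for a classic DOS/MBR partition table, assuming 512-byte sectors and the usual layout (four 16-byte entries at offset 446, signature 55 AA at offset 510). The function name `parse_mbr` is hypothetical.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446
# status, CHS of first sector, type, CHS of last sector, start LBA, sector count
ENTRY = struct.Struct("<B3sB3sII")


def parse_mbr(sector: bytes):
    """Return the non-empty partition entries, or None if there is no 55 AA signature."""
    if len(sector) < SECTOR_SIZE or sector[510:512] != b"\x55\xaa":
        return None
    parts = []
    for i in range(4):
        status, _chs_first, ptype, _chs_last, lba_start, sectors = ENTRY.unpack_from(
            sector, TABLE_OFFSET + ENTRY.size * i
        )
        if ptype != 0:  # type 0 marks an unused slot
            parts.append({
                "index": i,
                "bootable": status == 0x80,
                "type": ptype,
                "lba_start": lba_start,
                "sectors": sectors,
            })
    return parts

# Usage (the path is a placeholder for the dd image of the drive):
#   with open("usb.img", "rb") as f:
#       print(parse_mbr(f.read(SECTOR_SIZE)))
```

A result of None means the sector does not end in the 55 AA signature, so it is not a usable MBR.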
