In a disassembled dll (by IDA), I reached an array, which is commented as an array of int (but it may be of byte):
.rdata:000000018003CC00 ; int boxA[264]
.rdata:000000018003CC00 boxA dd 0 ; DATA XREF: BlockPrepXOR+5FC↑r
.rdata:000000018003CC04 db 0Eh
.rdata:000000018003CC05 db 0Bh
.rdata:000000018003CC06 db 0Dh
.rdata:000000018003CC07 db 9
.rdata:000000018003CC08 db 1Ch
.rdata:000000018003CC09 db 16h
.rdata:000000018003CC0A db 1Ah
.rdata:000000018003CC0B db 12h
.rdata:000000018003CC0C db 12h
.rdata:000000018003CC0D db 1Dh
.rdata:000000018003CC0E db 17h
.rdata:000000018003CC0F db 1Bh
Can I interpret the data as
{000000h, E0B0D09h, 1C161A12h, ..} or
{0, 90D0B0Eh, 121A161Ch, ...} or
{00h,00h,00h,00h, 0Eh, 0Bh, ..} ?
From the comment (from IDA), can you confirm that the array ends at CC00h + 264*4 - 1 = D01Fh? I have another array starting at D020h:
.rdata:000000018003D01D db 0F9h ; ù
.rdata:000000018003D01E db 0A2h ; ¢
.rdata:000000018003D01F db 3Fh ; ?
.rdata:000000018003D020 array4_1248 db 1 ; DATA XREF: BlockPrepXOR+39A↑o
.rdata:000000018003D021 db 2
.rdata:000000018003D022 db 4
.rdata:000000018003D023 db 8
That's just the AES decryption's T8 matrix as described in this paper.
You can easily identify it by searching for the DWORD values on Google (e.g. this is one of the results).
So that's just data for an AES decryption function.
Note also that the interpretation of a sequence of bytes as a sequence of multi-byte data (WORDs, DWORDs, QWORDs, and so on) depends on the architecture.
For x86, only the little-endian interpretation is correct (this is your case 2) but data may undergo arbitrary manipulations (e.g. it can be bswapped) so, when looking on Google, always use both the little and the big-endian versions of the data.
It's also worth noting that IDA can interpret the bytes as DWORDs (press d twice or use the context menu), showing the correct value based on the architecture of the disassembled binary.
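A minimal Python sketch of the two interpretations, using only the first 16 bytes listed in the question. The variable names are illustrative and not taken from IDA; the last lines simply redo the address arithmetic for `int boxA[264]`.

```python
import struct

# First 16 bytes of boxA as listed in the question (dd 0, then twelve db values).
raw = bytes([0x00, 0x00, 0x00, 0x00,
             0x0E, 0x0B, 0x0D, 0x09,
             0x1C, 0x16, 0x1A, 0x12,
             0x12, 0x1D, 0x17, 0x1B])

little = struct.unpack("<4I", raw)  # how an x86 CPU reads the DWORDs (case 2)
big = struct.unpack(">4I", raw)     # byte-swapped view (case 1), handy when searching

print([f"{v:08X}" for v in little])
print([f"{v:08X}" for v in big])

# int boxA[264] occupies 264 * 4 bytes, so the next object starts right after it.
start = 0x18003CC00
next_object = start + 264 * 4
print(hex(next_object))
```

Searching for both the little-endian and the big-endian values covers the case where the code byte-swaps the table entries before using them.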
I am wondering if it is safe to use only the first 22 characters, instead of all 44 characters, of an NFT's pubkey as the primary key of a MySQL table. The table holds a huge amount of data, so this approach could save a lot of space. For instance, given the following pubkey:
AQoKYV7tYpTrFZN6P5oUufbQKAUr9mNYGe1TTJC9wajM
Would it be safer to use the first 22 characters:
AQoKYV7tYpTrFZN6P5oUuf
Would it be safer to use the first 11 characters plus the trailing 11 characters, or doesn't it make any difference?
AQoKYV7tYpTe1TTJC9wajM
A public key is 32 bytes, so those "44 characters" are actually the base-58 representation of those 32 bytes.
If you're only storing 22 characters, let's simplify things and say that you're storing 16 bytes out of 32 total. The chance of two given pubkeys sharing the same 16-byte sequence is 1 / 256^16 = 1 / 2^128 = 2.9 * 10^-39, which is very unlikely, but possible. Assuming the keys are effectively random, it makes no difference which 16 bytes you keep.
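The figure above is for one pair of keys. For a whole table, the usual birthday-bound approximation gives a rough idea of the overall collision risk. This is a sketch under the assumption that the truncated keys behave like uniformly random 128-bit values; the function name is made up for illustration.

```python
import math

def collision_probability(n_keys: int, bits: int) -> float:
    """Birthday-bound approximation: P ~= 1 - exp(-n * (n - 1) / 2^(bits + 1))."""
    exponent = n_keys * (n_keys - 1) / 2 ** (bits + 1)
    # -expm1(-x) computes 1 - exp(-x) without losing precision for tiny x.
    return -math.expm1(-exponent)

for n in (10**6, 10**9, 10**12):
    print(f"{n:>15,d} keys -> P(collision) ~ {collision_probability(n, 128):.3e}")
```

Even for very large tables the probability stays tiny, but it is never exactly zero, so a unique constraint on the truncated column is still worth having.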
Here's another way to approach the problem -- how about storing the full pubkey as 32 bytes instead of as a string? Then you won't ever lose any precision.
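As a rough sketch of that idea, the base-58 string can be decoded to raw bytes before it is stored, for example in a fixed-width binary column. The decoder below is a plain illustration that assumes the Bitcoin-style base-58 alphabet; `b58decode` is a hypothetical helper, not a library function.

```python
# Bitcoin-style base-58 alphabet (assumed here): no 0, O, I or l.
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def b58decode(text: str) -> bytes:
    """Decode a base-58 string to bytes; each leading '1' is one zero byte."""
    n = 0
    for ch in text:
        n = n * 58 + ALPHABET.index(ch)  # raises ValueError on an invalid character
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    leading_zeros = len(text) - len(text.lstrip("1"))
    return bytes(leading_zeros) + body

raw = b58decode("AQoKYV7tYpTrFZN6P5oUufbQKAUr9mNYGe1TTJC9wajM")
print(len(raw), raw.hex())
```

Storing the decoded bytes takes less space than the 44-character string and keeps the full key.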
I am going to encrypt several fields in an existing table. Basically, the following encryption technique is going to be used:
CREATE MASTER KEY ENCRYPTION
BY PASSWORD = 'sm_long_password#'
GO
CREATE CERTIFICATE CERT_01
WITH SUBJECT = 'CERT_01'
GO
CREATE SYMMETRIC KEY SK_01
WITH ALGORITHM = AES_256 ENCRYPTION
BY CERTIFICATE CERT_01
GO
OPEN SYMMETRIC KEY SK_01 DECRYPTION
BY CERTIFICATE CERT_01
SELECT ENCRYPTBYKEY(KEY_GUID('SK_01'), 'test')
CLOSE SYMMETRIC KEY SK_01
DROP SYMMETRIC KEY SK_01
DROP CERTIFICATE CERT_01
DROP MASTER KEY
ENCRYPTBYKEY returns a varbinary with a maximum size of 8,000 bytes. Knowing the types of the table fields to be encrypted (for example: nvarchar(128), varchar(31), bigint), how can I determine the length of the new varbinary columns?
You can see the full specification here
So let's calculate:
16 byte key UID
_4 bytes header
16 byte IV (for AES, a 16 byte block cipher)
Plus then the size of the encrypted message:
_4 byte magic number
_2 bytes integrity bytes length
_0 bytes integrity bytes (warning: may be wrongly placed in the table)
_2 bytes (plaintext) message length
_m bytes (plaintext) message
CBC padding bytes
The CBC padding bytes should be calculated the following way:
16 - ((m + 4 + 2 + 2) % 16)
as padding is always applied. This will result in a number of padding bytes in the range 1..16. A sneaky shortcut is to just add 16 bytes to the total, but this may mean that you're specifying up to 15 bytes that are never used.
We can shorten this to 36 + 8 + m + 16 - ((m + 8) % 16) or 60 + m - ((m + 8) % 16). Or if you use the little trick specified above and you don't care about the wasted bytes: 36 + 8 + m + 16 = 60 + m, where m is the length of the message input in bytes.
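The breakdown above can be sketched as a small Python function. The byte counts used for the three example column types are assumptions: 2 bytes per character for nvarchar, 1 byte per character for varchar, and an 8-byte binary form for bigint, all with no integrity bytes.

```python
def encrypted_size(m: int) -> int:
    """Size in bytes of the ENCRYPTBYKEY output for an m-byte plaintext (AES, no integrity bytes)."""
    header = 16 + 4 + 16           # key GUID + header + IV
    inner = 4 + 2 + 0 + 2 + m      # magic number + integrity length + integrity bytes + message length + message
    padding = 16 - (inner % 16)    # CBC padding is always applied: 1..16 bytes
    return header + inner + padding

# Assumed maximum plaintext sizes in bytes for the columns named in the question.
columns = {"nvarchar(128)": 128 * 2, "varchar(31)": 31, "bigint": 8}
for name, m in columns.items():
    print(f"{name}: varbinary({encrypted_size(m)})")
```

Sizing each varbinary column for the maximum plaintext length of its source column is the safe choice, since shorter values only produce shorter ciphertexts.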
Notes:
beware that the first byte in the header contains the version number of the scheme; this answer does not and cannot specify how many bytes will be added or removed if a different internal message format or encryption scheme is used;
using integrity bytes is highly recommended in case you want to protect your DB fields against change (keeping the amount of money in an account confidential is less important than making sure the amount cannot be changed).
The example on the page assumes single byte encoding for text characters.
Based upon some tests in SQL Server 2008, the following formula seems to work. Note that @ClearText is VARCHAR():
52 + (16 * ( ((LEN(@ClearText) + 8)/ 16) ) )
This is roughly compatible with the answer by Maarten Bodewes, except that my tests showed the DATALENGTH(myBinary) to always be of the form 52 + (z * 16), where z is an integer.
LEN(myVarCharString)   DATALENGTH(encryptedString)
--------------------   -------------------------------------
0 through 7            usually 52, but occasionally 68 or 84
8 through 23           usually 68, but occasionally 84
24 through 39          usually 84
40 through 50          100
The "myVarCharString" was a table column defined as VARCHAR(50). The table contained 150,000 records. The mention of "occasionally" is an instance of about 1 out of 10,000 records that would get bumped into a higher bucket; very strange. For LEN() of 24 and higher, there were not enough records to get the weird anomaly.
Here is some Perl code that takes a proposed length for "myVarCharString" as input from the terminal and produces an expected size for the EncryptByKey() result. For the non-negative values used here, the function "int()" is equivalent to "Math.floor()".
# Read one proposed length per line and print the expected EncryptByKey() size.
while (my $len = <>) {
    chomp $len;
    print 52 + 16 * int(($len + 8) / 16), "\n";
}
You might want to use this formula to calculate a size, then add 16 to allow for the anomaly.
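A quick Python check suggests the tested formula is not just roughly compatible with the layout-based one from the other answer but identical to it, since 60 + m - ((m + 8) % 16) = 52 + 16 * floor((m + 8) / 16). The function names and the 16-byte safety margin parameter below are illustrative.

```python
def size_from_layout(m: int) -> int:
    """Formula derived from the documented ciphertext layout."""
    return 60 + m - ((m + 8) % 16)

def size_from_tests(m: int) -> int:
    """Formula found empirically on SQL Server 2008."""
    return 52 + 16 * ((m + 8) // 16)

# Both formulas agree for every plaintext length checked.
assert all(size_from_layout(m) == size_from_tests(m) for m in range(0, 8000))

def column_size(m: int, margin: int = 16) -> int:
    """Expected size plus one extra block to allow for the observed anomaly."""
    return size_from_tests(m) + margin

for m in (0, 7, 8, 23, 24, 39, 40, 50):
    print(m, size_from_tests(m), column_size(m))
```

The printed sizes fall into the same 52, 68, 84 and 100 byte buckets as the table above.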
I have some FAT32 structure output:
bytes_per_sector 0x200
sectors_per_cluster 0x8
reserved_sector_count 0x780
FATtable_count 0x2
root_entry_count 0
hidden_sector_count 0x38
total_sectors_32 0xf17fc8
FATtable_size_32 0x3c40
root_cluster 0x2
fat_info 0x1
backup_BS_sector 0x6
OK, based on this information, I skip over reserved_sector_count sectors and reach the FAT32 table. #1920 is its offset from the beginning, in sectors (0x780).
fat32Table #1920
cluster0 ffffff8
cluster1 ffffffff
cluster2 fffffff
cluster3 fffffff
OK, based on FATtable_size_32 and FATtable_count, I skip over the FAT tables and reach the data region at position #16777216 from the beginning. This is cluster 0, as I understand it. A cluster consists of 8 sectors of 512 bytes each. The first directory entry, named "F", is the volume label. The second, "HELLO.TXT", is a file. Each directory entry is 32 bytes long.
name #16777216 46 [F ]
ext [ ]
attrib 8
NTreserved 0
CrtTimeTenth 0
createtime 0
createdate 0
accessdate 0
clusterhigh 0
modifiedtime 81e3
modifieddate 4ace
clusterlow 0
filesize 0
name #16777216 48 [HELLO ]
ext [TXT]
attrib 20
NTreserved 18
CrtTimeTenth 5c
createtime 82ca
createdate 4ace
accessdate 4ace
clusterhigh 0
modifiedtime 82d4
modifieddate 4ace
clusterlow 3
filesize 24
Next, I want to find the data of the "hello.txt" file. I see clusterlow = 0x03. When I jump to the next cluster (+4096 bytes), I see the data of my file.
name #16781312 This is test
But I don't understand how clusterlow helps me find it. What do the entries ffffff8, ffffffff and fffffff in the FAT32 table mean? How do I find the file's data from all this information?
@tilz0R, thanks for the link, but I have already read the doc from MS, and more.
What I finally understood is that the data region (after the reserved sectors and the FAT32 tables) begins with cluster 2, so clusterlow = 0x03 means that the file's data is located in cluster 3, which starts 4096 bytes after the start of cluster 2.
It was easier to understand once I added one more file to the disk and saw that the first file is located in cluster 3 and the second in cluster 4.
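The offsets in this thread can be reproduced with a short Python sketch. The constants are copied from the boot-sector dump in the question; the function names are illustrative. The end-of-chain check reflects the FAT32 convention that only the low 28 bits of an entry are significant and that values of 0x0FFFFFF8 and above mark the last cluster of a chain.

```python
# Values copied from the boot-sector dump in the question.
BYTES_PER_SECTOR = 0x200
SECTORS_PER_CLUSTER = 0x8
RESERVED_SECTOR_COUNT = 0x780
FAT_COUNT = 0x2
FAT_SIZE_SECTORS = 0x3C40

CLUSTER_SIZE = BYTES_PER_SECTOR * SECTORS_PER_CLUSTER
FAT_OFFSET = RESERVED_SECTOR_COUNT * BYTES_PER_SECTOR
DATA_OFFSET = (RESERVED_SECTOR_COUNT + FAT_COUNT * FAT_SIZE_SECTORS) * BYTES_PER_SECTOR

def first_cluster(cluster_high: int, cluster_low: int) -> int:
    """Combine the two 16-bit directory-entry fields into a cluster number."""
    return (cluster_high << 16) | cluster_low

def cluster_offset(cluster: int) -> int:
    """Byte offset of a data cluster; the data region starts at cluster 2."""
    if cluster < 2:
        raise ValueError("clusters 0 and 1 have no data; their FAT entries are reserved")
    return DATA_OFFSET + (cluster - 2) * CLUSTER_SIZE

def is_end_of_chain(fat_entry: int) -> bool:
    """True if a FAT32 entry marks the last cluster of a file."""
    return (fat_entry & 0x0FFFFFFF) >= 0x0FFFFFF8

print(DATA_OFFSET)                            # start of cluster 2 (root directory)
print(cluster_offset(first_cluster(0, 0x3)))  # start of HELLO.TXT's data
print(is_end_of_chain(0x0FFFFFFF))            # FAT entry for cluster 3
```

For a file longer than one cluster, the FAT entry of each cluster holds the number of the next one, and the chain is followed until an end-of-chain value is reached.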
I'm trying to read a dataset that has 4030 observations and 23 variables. I'm doing that in proc fcmp, using the read_array (...) function.
Most of the variables are of character type, but when I run this code:
proc fcmp;
array a[&Numobs., &Nvar.] / NOSYMBOLS ;
rcl = read_array ("input", a);
res = write_array ('output', a);
quit;
I get error for every character variable:
ERROR: Column "Variable2" in data set "WORK.input" is not numeric in
function READ_ARRAY.
Does read_array work only for numeric variables? What am I doing wrong?
(the rest of my code is simple, and I'm sure it's correct).
I am using SAS Enterprise Guide 4.3.
In SAS all variables in an array must be of the same data type. Your Variable1 is probably numeric, Variable2 is character.
Read_array and write_array are numeric only. By default you're reading in all columns, but you can specify which columns you're interested in using quoted strings.
When using large arrays it would be nice to be able to adjust the array for a certain number of bytes per number. Mostly I want fast routines to read such adjusted multi-byte numbers to singles on the stack and conversely to store singles in the array adjusted for a certain number of bytes. In a 64-bit system there is a need for other single number arrays than one byte (c@ c!) and eight bytes (@ !).
So how to implement
cs@ ( ad b -- n )
cs! ( n ad b -- )
where b is the number of bytes. The word cs! seems to work as
: cs! ( n ad b -- ) >r sp@ cell+ swap r> cmove drop ;
but how about cs@ and how to do it in pure ANS Forth without sp@ or similar words?
The Forth200x committee has put quite some time into developing a Memory Access wordset that would suit. We have not included it in the standard thus far due to its size.
The compatible way is to use C@ and bitwise operations. To use the same byte order in memory as the Forth system itself, you need to detect endianness and compile the suitable versions of the relevant definitions.
\ These definitions use little-endian format in memory.
\ Assumption: char size and address unit size equal to 1 octet.
: MB! ( x addr u -- )
ROT >R OVER + SWAP
BEGIN 2DUP U> WHILE R> DUP 8 RSHIFT >R OVER C! 1+ REPEAT
2DROP RDROP
;
: MB@ ( addr u -- x )
0 >R OVER +
BEGIN 2DUP U< WHILE 1- DUP C@ R> 8 LSHIFT OR >R REPEAT
2DROP R>
;
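As a cross-check of the byte order these definitions produce, here is a small Python reference model of the same store and fetch loops. It is only a sketch: `mb_store` and `mb_fetch` are illustrative names, memory is modeled as a `bytearray`, and values are treated as unsigned.

```python
def mb_store(memory: bytearray, x: int, addr: int, u: int) -> None:
    """Store the low u bytes of x at addr, least significant byte first."""
    for i in range(u):
        memory[addr + i] = x & 0xFF
        x >>= 8

def mb_fetch(memory: bytearray, addr: int, u: int) -> int:
    """Fetch u bytes at addr as an unsigned little-endian number."""
    x = 0
    for i in reversed(range(u)):  # walk from the highest address down
        x = (x << 8) | memory[addr + i]
    return x

mem = bytearray(8)
mb_store(mem, 0x123456, 2, 3)
print(mem.hex())
print(hex(mb_fetch(mem, 2, 3)))
```

The result matches `int.from_bytes(mem[2:5], "little")`, which is a convenient way to generate test vectors for the Forth words.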
For higher performance it could be better to use implementation-specific features (including W@, T@, Q@, SP@, etc.) or even inline Forth assembler.
Note that a straightforward definition via DO loop usually has worse performance (depends on optimizer; 10% in SP-Forth/4.21). The code for reference:
: MB! ( x addr u -- )
OVER + SWAP ?DO DUP I C! 8 RSHIFT LOOP DROP
;
: MB@ ( addr u -- x )
DUP 0= IF NIP EXIT THEN
0 -ROT
1- OVER + DO 8 LSHIFT I C@ OR -1 +LOOP
;
We can't use ?DO in the second case because the loop index is decreasing and because of the +LOOP semantics: the loop is left when the index crosses "the boundary between the loop limit minus one and the loop limit".
\ little-endian (eg. pc, android)
: mb! ( n ad i -- ) 2>r here ! here 2r> cmove ;
: mb@ ( ad i -- n ) here 0 over ! swap cmove here @ ;
\ big-endian (eg. mac)
: mb! ( n ad i -- ) 2>r here ! here cell + r@ - 2r> cmove ;
: mb@ ( ad i -- n ) here 0 over ! cell + over - swap cmove here @ ;
\ little-endian test
1 here ! here c@ negate .
Of course HERE could be any one cell buffer.
Thanks ruvim for parsing the process forward!