Can't read every byte of binary files on Lua 5.1 - file

,Hello, friends! I've been trying to add compatibility with Lua 5.1 to a library I'm working on and that's written, originally, to Lua 5.3. Everything was going considerably fine until now.
I've stumbled with a behavior I have absolutely no clue about the cause. Here's the thing: apparently, I can't read binary files properly on Lua 5.1. For the sake of clarity, running this test snippet produces different outputs depending on the version it's ran.
local f = io.open("test.bin", "wb")
local t = {}
for i=1, 256 do t[i] = i-1 end
local unpack = unpack or table.unpack
local str = string.char(unpack(t))
f:write(str)
f:close()
f = io.open("test.bin", "rb")
local buffer = {}
for line in f:lines() do
print(#line)
for i=1, #line do
buffer[#buffer+1] = string.byte(line:sub(i,i))
end
end
print('Total:', #buffer)
f:close()
Using Lua 5.1:
245
Total: 245
Using Lua 5.3:
10
245
Total: 255
So, the way I see it is that version 5.1 simply jumps the first "line" of the file for some reason.
Any help will be profoundly appreciated.

This was a bug in Lua 5.1 and Lua 5.2 that was corrected in Lua 5.3.
Anyway, don't use f:lines() with binary files. Instead, read the whole file with f:read("*a")or read it by blocks.

Related

"Expecting miMATRIX type" error when reading MATLAB MAT-file with SciPy

This is a MATLAB question: the problem is caused by interactions with MATLAB files and Python/numpy. I am tying to write a 3-D array of type uint8 in MATLAB, and then read it in Python using numpy. This is the MATLAB code that creates the file:
voxels = zeros(30, 30, 30);
....
fileID1 = fopen(fullFileNameOut,'w','s');
fwrite(fileID1, voxels, 'uint8');
fclose(fileID1);
This is the Python code that tries to read the file:
filename = 'File3DArray.mat'
arr = scipy.io.loadmat(filename)['instance'].astype(np.uint8)
This is the error that I get when I run the python code:(I think this is the source of the problem. What is MM
raise TypeError('Expecting miMATRIX type here, got %d' % mdtype)
This is the output of the Linux command 'file' on the 3D array file
that I created (I think this is the source of the problem. What is MMDF Mailbox?):
File3DArray.mat: MMDF mailbox
This is the output of the same Linux command 'file' on another 3D array file
that was created by someone else in MATLAB:
GoodFile.mat: Matlab v5 mat-file (little endian) version 0x0100
I want the files I create in MATLAB to be the same as GoodFile.mat (so that I can read them with the Python/Numpy code segment above). The output of the Linux 'file' command should be the same as the GoodFile output, I think.
What is the MATLAB code that does that?
To create a MAT-file, use the MATLAB save command:
voxels = zeros(30, 30, 30, 'uint8');
save(fullFileNameOut, 'voxels', '-v7')
You need to add '-v7' (or '-v6') as an argument to save to create a file in an older format, as SciPy doesn't recognize the '-v7.3' files created by default.

difference yap and swi-prolog reading canonical lists

I have the following test code trying to read file into a list
open('raw250-split1.pl', read, Stream),
read(Stream,train_xs(TrainXs)),
length(TrainXs, MaxTrain).
I will omit part of the output due to the file is quite large.
It works well with yap,
➜ chill git:(master) ✗ yap [18/06/19| 5:48PM]
% Restoring file /usr/lib/Yap/startup.yss
YAP 6.2.2 (x86_64-linux): Sat Sep 17 13:59:03 UTC 2016
?- open('raw250-split1.pl', read, Stream),
read(Stream, train_xs(TrainXs)),
length(TrainXs, MaxTrain).
MaxTrain = 225,
Stream = '$stream'(3),
TrainXs = [[parse([which,rivers,run,through,states,bordering,new,mexico,/],answer(_A,(river(_A),traverse(_A,_B),next_to(_B,_C),const(_C,stateid('new mexico')))))],
<omited output>
,[parse([what,is,the,largest,state,capital,in,population,?],answer(_ST,largest(_SU,(capital(_ST),population(_ST,_SU)))))]]
But on swi-prolog, it will produce Type error
➜ chill git:(master) ✗ swipl [18/06/19| 7:24PM]
Welcome to SWI-Prolog (threaded, 64 bits, version 7.6.4)
SWI-Prolog comes with ABSOLUTELY NO WARRANTY. This is free software.
Please run ?- license. for legal details.
For online help and background, visit http://www.swi-prolog.org
For built-in help, use ?- help(Topic). or ?- apropos(Word).
?- open('raw250-split1.pl', read, Stream),
read(Stream, train_xs(TrainXs)),
length(TrainXs, MaxTrain).
ERROR: raw250-split1.pl:4:
Type error: `list' expected, found `parse(which.(rivers.(run.(through.(states.(bordering.(new.(mexico.((/).[])))))))),
<omited output>
,answer(_67604,(state(_67604),next_to(_67604,_67628),const(_67628,stateid(kentucky))))).[].(parse(what.((is).(the.(largest.(state.(capital.(in.(population.((?).[])))))))),answer(_67714,largest(_67720,(capital(_67714),population(_67714,_67720))))).[].[]))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))' (a compound)
In:
[10] throw(error(type_error(list,...),context(...,_67800)))
[7] <user>
Note: some frames are missing due to last-call optimization.
Re-run your program in debug mode (:- debug.) to get more detail.
What might be the problem for the error here?
File raw250-split1.pl can be found from the ftp url below, if you'd like to try it.
Thank you for the help!
I am trying to migrate an earlier code to SWI-Prolog, which was written in
SICStus 3 #3: Thu Sep 12 09:54:27 CDT 1996 or earlier
by Raymond J. Mooney ftp://ftp.cs.utexas.edu/pub/mooney/chill/.
All the questions with this tag are all related to this task. I'm new to prolog, helps and suggestions are welcomed!
The raw250-split1.pl was apparently written using canonical notation. The traditional list functor is ./2 but SWI-Prolog 7.x changed it to '[|]'/2 in order to use ./2 for other purposes. This results in the the variable TrainXs being instantiated by the read/2 call to a compound term whose argument is not a list:
?- open('raw250-split1.pl', read, Stream), read(Stream,train_xs(TrainXs)).
Stream = <stream>(0x7f8975e08e90),
TrainXs = parse(which.(rivers.(run.(through.(states.(bordering.(... . ...)))))), answer(_94, (river(_94), traverse(_94, _100), next_to(_100, _106), const(_106, stateid('new mexico'))))).[].(parse(what.((is).(the.(highest.(point.(... . ...))))), answer(_206, (high_point(_204, _206), const(_204, stateid(montana))))).[].(parse(what.((is).(the.(most.(... . ...)))), answer(_298, largest(_300, (population(_298, _300), state(...), ..., ...)))).[].(parse(through.(which.(states.(... . ...))), answer(_414, (state(_414), const(..., ...), traverse(..., ...)))).[].(parse(what.((is).(... . ...)), answer(_500, longest(_500, river(...)))).[].(parse(how.(... . ...), answer(_566, (..., ...))).[].(parse(... . ..., answer(..., ...)).[].(parse(..., ...).[].(... . ... .(... . ...))))))))).
YAP still uses the ./2 functor for lists, which explains why it can handle it. A workaround for SWI-Prolog is to start it with the --traditional command-line option:
$ swipl --traditional
...
?- open('raw250-split1.pl', read, Stream), read(Stream,train_xs(TrainXs)).
Stream = <stream>(0x7faeb2f77700),
TrainXs = [[parse([which, rivers, run, through, states, bordering|...], answer(_94, (river(_94), traverse(_94, _100), next_to(_100, _106), const(_106, stateid('new mexico')))))], [parse([what, is, the, highest, point|...], answer(_206, (high_point(_204, _206), const(_204, stateid(montana)))))], [parse([what, is, the, most|...], answer(_298, largest(_300, (population(_298, _300), state(...), ..., ...))))], [parse([through, which, states|...], answer(_414, (state(_414), const(..., ...), traverse(..., ...))))], [parse([what, is|...], answer(_500, longest(_500, river(...))))], [parse([how|...], answer(_566, (..., ...)))], [parse([...|...], answer(..., ...))], [parse(..., ...)], [...]|...].
The type error you get is due to the length/2 expecting a list when the first argument is bound.
There is a tilde as last character in that file, causing the syntax being invalid, so you should remove it before reading. I don't know why YAP accept the file as valid, should raise an error AFAIK.
There is a read option dotlists/2 in SWI-Prolog:
dotlists(Bool)
If true (default false), read .(a,[]) as a
list, even if lists are internally nor constructed
using the dot as functor. This is primarily intended
to read the output from write_canonical/1 from
other Prolog systems. See section 5.1.
http://www.swi-prolog.org/pldoc/man?predicate=read_term/2
This gives you the desired result, without changing the mode:
Welcome to SWI-Prolog (threaded, 64 bits, version 8.1.0)
?- read_term(X, [dotlists(true)]).
|: .(a,.(b,.(c,[]))).
X = [a, b, c].

Contiguous Hex file generation using GCC

I have a Hex file for STM32F427 that was built using GCC(gcc-arm-none-eabi) version 4.6 that had contiguous memory addresses. I wrote boot loader for loading that hex file and also added checksum capability to make sure Hex file is correct before starting the application.
Snippet of Hex file:
:1005C80018460AF02FFE07F5A64202F1D00207F5F9
:1005D8008E4303F1A803104640F6C821C2F2000179
:1005E8001A460BF053F907F5A64303F1D003184652
:1005F8000BF068F907F5A64303F1E80340F6FC1091
:10060800C2F2000019463BF087FF07F5A64303F145
:10061800E80318464FF47A710EF092FC07F5A643EA
:1006280003F1E80318460EF03DFC034607F5A64221
:1006380002F1E0021046194601F0F2FC07F56A5390
As you can see all the addresses are sequential. Then we changed the compiler to version 4.8 and i got the same type of Hex file.
But now we used compiler version 6.2 and the Hex file generated is not contiguous. It is somewhat like this:
:10016000B9BC0C08B9BC0C08B9BC0C08B9BC0C086B
:10017000B9BC0C08B9BC0C08B9BC0C08B9BC0C085B
:08018000B9BC0C08B9BC0C0865
:1001900081F0004102E000BF83F0004330B54FEA38
:1001A00041044FEA430594EA050F08BF90EA020FA5
As you can see after 0188 it is starting at 0190 means rest of 8 bytes(0189 to 018F) are 0xFF as they are not flashed.
Now boot loader is kind of dumb where we just pass the starting address and no of bytes to calculate the checksum.
Is there a way to make hex file in contiguous way as compiler 4.6 and compiler 4.8? the code is same in all the three times.
If post-processing the hex file is an option, you can consider using the IntelHex python library. This lets you manipulate hex file data (i.e. ignoring the 'markup'; record type, address, checksum etc) rather than as lines, will for instance create output with the correct line checksum.
A fast way to get this up and running could be to use the bundled convenience scripts hex2bin.py and bin2hex.py:
python hex2bin.py --pad=FF noncontiguous.hex tmp.bin
python bin2hex.py tmp.bin contiguous.hex
The first line converts the input file noncontiguous.hex to a binary file, padding it with FF where there is no data. The second line converts it the binary file back to a hex file.
The result would be
:08018000B9BC0C08B9BC0C0865
becomes
:10018000B9BC0C08B9BC0C08FFFFFFFFFFFFFFFF65
As you can see, padding bytes are added where the input doesn't have any data, equivalent to writing the input file to the device and reading it back out. Bytes that are in the input file are kept the same - and at the same address.
The checksum is also correct as changing the length byte from 0x08 to 0x10 compensates for the extra 0xFF bytes. If you padded with something else, IntelHex would output the correct checksum
You can skip the the creation of a temporary file by piping these: omit tmp.bin in the first line and replacing it with - in the second line:
python hex2bin.py --pad=FF noncontiguous.hex | python bin2hex.py - contiguous.hex
An alternative way could be to have a base file with all FF and use the hexmerge.py convenience script to merge gcc's output onto it with --overlap=replace
The longer, more flexible way, would be to implement your own tool using the IntelHex API. I've used this to good effect in situations similar to yours - tweak hex files to satisfy tools that are costly to change, but only handle hex files the way they were when the tool was written.
One of many possible ways:
Make your hex file with v6.2, e.g., foo.hex.
Postprocess it with this Perl oneliner:
perl -pe 'if(m/^:(..)(.*)$/) { my $rest=16-hex($1); $_ = ":10" . $2 . ("FF" x $rest) . "\n"; }' foo.hex > foo2.hex
Now foo2.hex will have all 16-byte lines
Note: all this does is FF-pad to 0x10 bytes. It doesn't check addresses or anything else.
Explanation
perl -pe '<some script>' <input file> runs <some script> for each line of <input file>, and prints the result. The script is:
if(m/^:(..)(.*)$/) { # grab the existing byte count into $1
my $rest=16 - hex($1); # how many bytes of 0xFF we need
$_ = ":10" . $2 . ("FF" x $rest) . "\n"; # make the new 16-byte line
# existing bytes-^^ ^^^^^^^^^^^^^^-pad bytes
}
Another solution is to change the linker script to ensure the preceding .isr_vector section ends on a 16 byte alignment, as the mapfile reveals that the following .text section is 16 byte aligned.
This will ensure there is no unprogrammed flash bytes between the two sections
You can use bincopy to fill all empty space with 0xff.
$ pip install bincopy
$ bincopy fill foo.hex
Use the -gap-fill option of objcopy, e.g.:
arm-none-eabi-objcopy --gap-fill 0xFF -O ihex firmware.elf firmware.hex

Access a 16 Bit DLL

I have been given the task to upgrade an existing 16 Bit Desktop application originally written in GFA Basic.
I want to know if there is a possibility to access the functions inside these 16 Bit Dlls via C/JNI (or any other programming language).
I guess, I have to write some sort of an intermediate DLL to access the functionalities from a Java class (or any other language).
For example
DLLTEST has the implementation of the functions
$Library
'LNK Exe d:\DLLtest.dll
Procedure LIBMAIN(hInst&, DSeg&, HpSz&, lpCmd%)
q_dllname$ = "DLLtext.dll"
RETVAL 1 ' If LIBMAIN is used, then RETVAL must be TRUE
Return
Procedure WEP(SysExit&)
' ##############################################
// SysExit = 1 - ExitWindows
// SysExit = 0 - DLL vrijgegeven
RETVAL 0 ' ???????????
Return
Procedure TextTest(dc&)
$EXPORT TextTest
SETDC dc&
RGBColor 0
Local t$ = "Hello world" + Chr$(0)
Text 10, 10, t$
Beep
~TextOut(dc&, 10, 50, V:t$, Len(t$))
Beep
Return
The above dll file is in turn used by TESTTEXT.exe
// destination exe file
'LNK Exe d:\testtext.exe
DLL #7, "dlltest.dll"
DECL LONG TextTest(W)
ENDDLL
OpenW # 1
h& = Win(1)
SETDC GetDC(h&)
' RGBCOLOR 0
' GRAPHMODE R2_COPYPEN
~##TextTest(_DC())
KeyGet k%
CloseW # 1
FreeDll 7
End
I want to rewrite this TESTTEXT.exe using Java/C (or any other moder programming language). I guess, I need to build a bridge between this dll and the new exe by building another dll.
I was hoping to get some help about writing this intermediate dll.
Also, let me know if this kind of solution makes sense!
Your help would be highly appreciated.
Thanks you for your time.
Using a 16-bit DLL from a 32-bit application involves what Microsoft calls (called) "flat thunking". Flat Thunking is available only in the 16/32-bit hybrid versions of Windows (Windows 95, 98, 98SE, Me).
What you want is not supported on any reasonably current version of Windows.
I agree with Jerry. In the meantime you might start looking at this post:
http://www.atari-forum.com/viewtopic.php?f=69&t=4826&start=20

How do I create a sparse file programmatically, in C, on Mac OS X?

I'd like to create a sparse file such that all-zero blocks don't take up actual disk space until I write data to them. Is it possible?
There seems to be some confusion as to whether the default Mac OS X filesystem (HFS+) supports holes in files. The following program demonstrates that this is not the case.
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
void create_file_with_hole(void)
{
int fd = open("file.hole", O_WRONLY|O_TRUNC|O_CREAT, 0600);
write(fd, "Hello", 5);
lseek(fd, 99988, SEEK_CUR); // Make a hole
write(fd, "Goodbye", 7);
close(fd);
}
void create_file_without_hole(void)
{
int fd = open("file.nohole", O_WRONLY|O_TRUNC|O_CREAT, 0600);
write(fd, "Hello", 5);
char buf[99988];
memset(buf, 'a', 99988);
write(fd, buf, 99988); // Write lots of bytes
write(fd, "Goodbye", 7);
close(fd);
}
int main()
{
create_file_with_hole();
create_file_without_hole();
return 0;
}
The program creates two files, each 100,000 bytes in length, one of which has a hole of 99,988 bytes.
On Mac OS X 10.5 on an HFS+ partition, both files take up the same number of disk blocks (200):
$ ls -ls
total 400
200 -rw------- 1 user staff 100000 Oct 10 13:48 file.hole
200 -rw------- 1 user staff 100000 Oct 10 13:48 file.nohole
Whereas on CentOS 5, the file without holes consumes 88 more disk blocks than the other:
$ ls -ls
total 136
24 -rw------- 1 user nobody 100000 Oct 10 13:46 file.hole
112 -rw------- 1 user nobody 100000 Oct 10 13:46 file.nohole
As in other Unixes, it's a feature of the filesystem. Either the filesystem supports it for ALL files or it doesn't. Unlike Win32, you don't have to do anything special to make it happen. Also unlike Win32, there is no performance penalty for using a sparse file.
On MacOS, the default filesystem is HFS+ which does not support sparse files.
Update: MacOS used to support UFS volumes with sparse file support, but that has been removed. None of the currently supported filesystems feature sparse file support.
This thread becomes a comprehensive source of info about the sparse files. Here is the missing part for Win32:
Decent article with examples
Tool that estimates if it makes sense to make file as sparse
Regards
hdiutil can handle sparse images and files but unfortunately the framework it links against is private.
You could try defining external symbols as defined by the DiskImages framework below but this is most likely not acceptable for production code, plus since the framework is private you'd have to reverse engineer its use cases.
cristi:~ diciu$ otool -L /usr/bin/hdiutil
/usr/bin/hdiutil:
/System/Library/PrivateFrameworks/DiskImages.framework/Versions/A/DiskImages (compatibility version 1.0.8, current version 194.0.0)
[..]
cristi:~ diciu$ nm /System/Library/PrivateFrameworks/DiskImages.framework/Versions/A/DiskImages | awk -F' ' '{print $3}' | c++filt | grep -i sparse
[..]
CSparseFile::sector2Band(long long)
CSparseFile::addIndexNode()
CSparseFile::readIndexNode(long long, SparseFileIndexNode*)
CSparseFile::readHeaderNode(CBackingStore*, SparseFileHeaderNode*, unsigned long)
[... cut for brevity]
Later Edit
You could use hdiutil as an external process and have it create an sparse disk image for you. From the C process you would then create a file in the (mounted) sparse disk image.
If you seek (fseek, ftruncate, ...) to past the end, the file size will be increased without allocating blocks until you write to the holes. But there's no way to create a magic file that automatically converts blocks of zeroes to holes. You have to do it yourself.
This may be helpful to look at (the OpenBSD cp command inserts holes instead of writing zeroes).
patch
If you want portability, the last resort is to write your own access function so that you manage an index and a set of blocks.
In essence you manage a single file as the OS manages the disk keeping the chain of the blocks that are part of the file, the bitmap of allocated/free blocks etc.
Of course this will lead to a non optimized and slower access, I would reccomend this apprach only if the requirement to save space is absolutely critical and you have enough time to write a robust set of access functions.
And even in that case, I would first investigate if your problem is in need of a different solution. Probably you should store your data differently?
It looks like OS X supports sparse files on UDF volumes. I tried titaniumdecoy's test program on OS X 10.9 and it did generate a sparse file on a UDF disk image. Also, not that UFS is no longer supported in OS X, so if you need sparse files, UDF is the only natively supported file system that supports them.
I also tried the program on SMB shares. When the server is Ubuntu (ext4 filesystem) the program creates a sparse file, but 'ls -ls' through SMB doesn't show that. If you do 'ls -ls' on the Ubuntu host itself it does show the file is sparse. When the server is Windows XP (NTFS filesystem) the program does not generate a sparse file.

Resources