Read and write directly from and to compressed files in C

In Java I think it is possible to cruise through jar files as if they were not compressed. Is there something similar (and portable) in C/C++?
I would like to import binary data into memory from a large (zipped or similar) file without decompressing it to disk first, and afterwards write it back to disk in compressed form.
Maybe some trick with shell pipes and the zip utility?

I think you want zlib:
http://www.zlib.net/

Related

Manipulation of Excel files in TwinCAT

Is there an available library for reading/writing Excel files, particularly XLSX or XLSM, for TwinCAT 3? How about TDMS files? Obviously I'd prefer something open source and free, if available.
Thank you
Using TwinCAT you can write CSV, JSON, and XML files.
Then, after writing the files, you can use Python to save the data as Excel files.
There is a Python book called "Automate the Boring Stuff with Python, 2nd Edition: Practical Programming for Total Beginners" with examples of how to read, write, and modify Excel and Word files.
Remember that a CSV file can be opened in Excel using Import or just with Ctrl+C/Ctrl+V. The delimiter used by TwinCAT is configured in the global variables; read about it in the Beckhoff Information System (by the way, a Google search works better and faster than the search on Beckhoff's website).
Information about the CSV function blocks is on these pages:
https://infosys.beckhoff.com/content/1033/tcplclib_tc2_utilities/34977931.html?id=7903313200164417832
https://infosys.beckhoff.com/content/1033/tcplclib_tc2_utilities/34979467.html?id=1113952616781398655

.bin files used for upgrading embedded devices

I am a bit confused about .bin files. In Linux we normally use ELF and .ko files for upgrading a box or copying onto it, but when upgrading the NAND flash in a router or other networking products, why is a .bin file always preferred? Is it some converged mix of all the OS-related files? Is it possible to see the contents of a .bin file, and how can one work with it? Is it something like the contents of the boot ROM? How is it prepared? How do we create and test it, and how does Linux support it? Are there any historical reasons behind this?
Speaking of routers, those files are usually just snapshots of a router's flash memory, probably compressed and with some headers added. Typical contents are a compressed squashfs image or simply a gzipped snapshot of memory.
There is no such thing as a .bin format; it's just a custom array of bytes, and every vendor interprets it in some vendor-specific way. Basically this extension means "it's not your business what's in the file; our device/software will handle it". You can try to identify (think: reverse-engineer) what's actually in those files by using the file utility, or by looking at them in a hex editor and trying to guess what's going on.

Is there a performance drop when we open a file in a directory that has huge numbers of files?

Suppose we want to open a file in a directory that contains a huge number of files. When I ask the program to open a file there, how fast can it find that particular file? Will there be a performance drop when looking up the requested file in this case?
P.S. This should also depend on the file system's implementation, yes?
Yes, it depends a lot on the file system implementation.
Some file systems have specific optimizations for large directories. One example I can think of is ext3, which uses HTree indexing for large directories.
Generally speaking there will usually be some delay to find the file. Once the file is located/opened, however, reading it should not be slower than reading any other file.
Some programs that need to handle a large amount of files (for caching, for example) put them in a large directory tree, to reduce the number of entries per directory.

Analyze VMDK (vmware virtual machine disk) files for changes

Is there a good way to analyze VMware delta VMDK files between snapshots to list changed blocks, so one can use a tool to tell which NTFS files are changed?
I do not know of a tool that does this out of the box, but it should not be too difficult to build one.
The VMDK file format specification is available, and the format is not that complex. As far as I remember, a VMDK file consists of many 64 KB blocks. At the beginning of the file there is a directory that records where each logical block is stored in the physical file.
It should be pretty easy to determine where a logical block is stored in both files and then compare the data in the two versions of the VMDK file.

Copying millions of files

I've got about 3 million files I need to copy from one folder to another over my company's SAN. What's the best way for me to do this?
If a straight copy is too slow (although a SAN with write-back caching would be about as fast as anything for this type of operation) you could tar the files up into one or more archives and then expand the archives out at the destination. This would slightly reduce the disk thrashing.
At a more clever level, you can do a trick with tar or cpio where you archive the files and write them to stdout which you pipe to another tar/cpio process to unravel them at their destination.
A sample command to do this with tar looks like:
tar cf - * | (cd [destination dir] ; tar xf - )
Some SANs will also directly clone a disk volume.
If you're on Windows, use robocopy. It's very robust and built for situations like this. It supports dead-link detection and can be told to retry a copy if it is interrupted.
Have you considered using rsync? It uses an algorithm that calculates hashes over chunks of the files to compare the two sites and sends only the deltas between them.
Microsoft SyncToy is in my experience very good at handling ridiculous numbers of files. And it's very easy to use.
Teracopy will do this I think.
http://www.codesector.com/teracopy.php
Or, if on *nix, try cuteftp.
If you ask me, plain system tools are the neatest way to copy. Just something like:
cp -pvr /pathtoolddir /pathtonewdir
on a Linux box will do and work great. Any compression in between will just slow the process down.
