How do I open a zip within a zip with libzip - c

I am trying to open a zip inside a zip
#include "zip.h"
#include "gtk.h"
zip_t *mainzipfile = zip_open(g_file_get_path(file), ZIP_CHECKCONS, &error);
zip_file_t *childzip = zip_fopen(mainzipfile, "child.zip", ZIP_RDONLY);// this segfaults
zip_file_t *childofchild = zip_fopen_index((zip_t*)childzip, 1, ZIP_RDONLY);
From what I can see, childzip is not being treated as a zip archive, so the call segfaults.
I tried casting because I know childzip is a zip file, but the program fails to see it as one.
How do I turn the zip_file_t into a zip_t so that I can also extract its children?

There is no generic support for opening a ZIP file inside a ZIP file. To some extent, this is because reading a ZIP file requires direct access to the data (the ability to seek by offset), and compressed entries inside a ZIP archive cannot be read by offset. The only way to reach a specific offset is to rewind the zip_file_t object and skip over bytes.
This leaves two possible scenarios (assuming the goal is to avoid extracting the inner zip into a file).
1. Reading from uncompressed zip.
In most cases, when a ZIP archive is placed into another ZIP archive, the zip program will realize that compression will not be effective and will use the 'store' method. In that situation, it is possible to use zip_source_zip to create a (seekable) zip_source, which can then be opened with zip_open_from_source.
See https://libzip.org/documentation/zip_source.html
// Find index
zip_int64_t child_idx= zip_name_locate(main_zip, "child.zip", flags);
// Create zip_source from the complete child.zip
zip_source_t *src = zip_source_zip(archive, main_zip, child_idx, flags, 0, 0);
// Create zip_t
zip_t *child_zip = zip_open_from_source(src, flags, &error);
// work with the child zip
2. Unzipping into memory.
As an alternative, and assuming that the child ZIP fits in memory, consider reading the whole child zip into memory, then using zip_source_buffer_create to build a zip_source from that buffer, which can be opened the same way. In theory, this is simpler to implement.
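// Look up the uncompressed size of child.zip
zip_stat_t st;
zip_stat(main_zip, "child.zip", 0, &st);
zip_uint64_t N = st.size;
// Decompress the whole entry into memory (error checks omitted)
zip_file_t *child_file = zip_fopen(main_zip, "child.zip", flags);
char *buffer = malloc(N);
zip_fread(child_file, buffer, N);
zip_fclose(child_file);
// Wrap the buffer in a zip_source; freep = 1 hands ownership of buffer to libzip
zip_source_t *src = zip_source_buffer_create(buffer, N, 1, &error);
// Create zip_t
zip_t *child_zip = zip_open_from_source(src, flags, &error);
// work with the child zip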

Related

How to delete a file in C using a file-descriptor?

In my code, I create a file with a random name using the mkstemp() function (I'm on Linux). This function returns an int, which is a file descriptor.
int fd;
char temp[] = "tempXXXXXX";
fd = mkstemp(temp);
Later I can access the file using fdopen() through that int file descriptor.
FILE *file_ptr = NULL;
file_ptr = fdopen(fd, "w+");
But at the end of my program, I would like to see if the file still exists with the random name it was given when I created it (the program should change that file name if successful). I can set a flag if the rename() function run on that file is successful, but I still don't know how to delete it when I only have its file descriptor.
if rename files => remove the temp file
How can I do that? Or is there a way to get the file's name if I have its file descriptor?
Neither C nor POSIX (since you are using POSIX library functions) defines a way to delete a file via an open file descriptor. And that makes sense, because the kind of deletion you're talking about is actually to remove a directory entry, not the file itself. The same file can be hard linked into the directory tree in multiple places, with multiple names. The OS takes care of removing its data from storage, or at least marking it as available for reuse, after the last hard link to it is removed from the directory tree and no process any longer has it open.
A file descriptor is associated directly with a file, not with any particular path, notwithstanding the fact that under many circumstances, you obtain one via a path. This has several consequences, among them that once a process opens a file, that file cannot be pulled out from under it by manipulating the directory tree. And that is the basis for one of the standard approaches to your problem: unlink (delete) it immediately after opening it, before losing its name. Example:
#include <stdlib.h>
#include <unistd.h>

int make_temp_file(void) {
    char filename[] = "my_temp_file_XXXXXX";
    int fd;

    fd = mkstemp(filename);
    if (fd == -1) {
        // handle failure to open ...
    } else {
        // file successfully opened, now unlink it
        int result = unlink(filename);
        // ... check for and handle error conditions ...
    }
    return fd;
}
Not only does that (nearly) ensure that the temp file does not outlive the need for it, but it also prevents the contents from being accessible to users and processes to which the owning process does not explicitly grant access.
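To make the point concrete, here is a minimal sketch showing that the descriptor stays usable after the name is gone. The helper names open_anonymous_temp and roundtrip are made up for illustration, and error handling is kept to the bare minimum.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Creates a file from a mkstemp() template and unlinks it right away.
   Returns the still-open descriptor, or -1 on failure.
   template_path must be a writable array ending in "XXXXXX". */
int open_anonymous_temp(char *template_path) {
    int fd = mkstemp(template_path);
    if (fd == -1)
        return -1;
    if (unlink(template_path) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Writes msg through fd, then reads the file back from offset 0.
   Returns the number of bytes read, or -1 on failure. */
ssize_t roundtrip(int fd, const char *msg, char *out, size_t out_size) {
    size_t len = strlen(msg);
    if (write(fd, msg, len) != (ssize_t)len)
        return -1;
    return pread(fd, out, out_size, 0);
}
```

After open_anonymous_temp() returns, the path no longer exists in the directory, yet reads and writes through the descriptor keep working until it is closed.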
Even though this doesn't exactly answer the question you're asking about mkstemp, consider creating a temporary file that will automatically be deleted, unless you rename it.
Instead of mkstemp you could call open combined with the creation flag O_TMPFILE to create a temporary, unnamed file that is automatically deleted when the file is closed.
See open(2):
O_TMPFILE (since Linux 3.11)
Create an unnamed temporary regular file. The pathname argument
specifies a directory; an unnamed inode will be created
in that directory's filesystem. Anything written to the
resulting file will be lost when the last file descriptor is
closed, unless the file is given a name.
Instead of a filename, you call open with the path where you prefer to place the temporary file, like:
temp_fd = open("/path/to/dir", O_TMPFILE | O_RDWR, S_IRUSR | S_IWUSR);
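If you would like to give the temporary file a permanent location/name, you can call linkat on it later. Note that the old path must be an empty string (not NULL), and that AT_EMPTY_PATH requires the CAP_DAC_READ_SEARCH capability:
linkat(temp_fd, "", AT_FDCWD, "/path/for/file", AT_EMPTY_PATH);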
Note: Filesystem support is required for O_TMPFILE, but mainstream Linux filesystems do support it.
readlink gives you the name of the file behind a file descriptor if you pass it the path /proc/self/fd/ followed by your fd.
Then use remove to delete the file, passing the name that readlink gave you.
ssize_t readlink(const char *path, char *buf, size_t bufsiz); (returns -1 and sets errno on failure)
int remove(const char *filename); (returns zero if successful, otherwise nonzero)
I hope something like this helps.
⚠ Don't copy/paste this blindly; adjust the buffer sizes and error handling to your needs ⚠
#include <limits.h>
#include <stdio.h>
#include <unistd.h>

int delete_file(int fd) {
    char link_path[64];
    char filename[PATH_MAX];

    // Build "/proc/self/fd/<fd>", a symlink to the file's current path
    snprintf(link_path, sizeof link_path, "/proc/self/fd/%d", fd);
    // readlink() does not null-terminate, so leave room and terminate manually
    ssize_t len = readlink(link_path, filename, sizeof filename - 1);
    if (len == -1)
        return -1;
    filename[len] = '\0';
    if (remove(filename) != 0) {
        printf("The file was not deleted\n");
        return -1;
    }
    printf("The file was deleted successfully\n");
    return 0;
}
(Feel free to edit this; I didn't test the code.)

Compress a file chunk by chunk - miniz

I'm building a program with the help of the library [miniz][1] for compressing files with sizes up to 3 GB. The computer that will run this program will also run another (heavy) application, so I want the compression program to load one chunk of each file at a time to keep it from using a lot of RAM (max chunk size = 0.5 GB), compress that chunk, and then proceed to the next chunk until all files are compressed.
Right now, this does not work as I want. For example, if a file named problem.txt is divided into 10 chunks, I get 10 files named problem.txt in my zip folder. Obviously I want the chunks to be merged together instead of being split up in the zip.
Is this possible to do with miniz?
The following text is written in the library file (the library consists of only one file), so I guess it is not possible, but I am asking anyway to see if anyone has a solution or another approach so that the program does not eat all the memory.
The ZIP archive API's where designed with simplicity and efficiency in mind, with just enough abstraction to
get the job done with minimal fuss. There are simple API's to retrieve file information, read files from
existing archives, create new archives, append new files to existing archives, or clone archive data from
one archive to another. It supports archives located in memory or the heap, on disk (using stdio.h),
or you can specify custom file read/write callbacks.
The program crashes with files larger than 0.9 GB.
[1]: https://code.google.com/p/miniz/ .
Please note that the program stores the whole file in the std::vector filesData. Each element is a chunk of data. In the final version, only one chunk shall be read and held by the program at a time. The problem in this version is that the library creates many files with the same name in the .zip, as described above.
Am I using the library wrongly right now? I open the files myself and store the data in the vector because I could not figure out how to make the function open the file itself.
for (i = 0; i < filesData.size(); ++i)
{
sprintf(data.at(i), filesData.at(i) );
sprintf(archive_filename, fileNames.at(i) );
// Add a new file to the archive. Note this is an IN-PLACE operation, so if it fails your archive is probably hosed (its central directory may not be complete) but it should be recoverable using zip -F or -FF. So use caution with this guy.
// A more robust way to add a file to an archive would be to read it into memory, perform the operation, then write a new archive out to a temp file and then delete/rename the files.
// Or, write a new archive to disk to a temp file, then delete/rename the files. For this test this API is fine.
status = mz_zip_add_mem_to_archive_file_in_place(s_Test_archive_filename, archive_filename, data.at(i), strlen(data.at(i)) + 1, s_pComment, (uint16)strlen(s_pComment), MZ_BEST_COMPRESSION);
if (!status)
{
printf("mz_zip_add_mem_to_archive_file_in_place failed! 2\n");
return EXIT_FAILURE;
}
}
I made a small test program (which fails).
#include "miniz.c"
#if defined(__GNUC__)
// Ensure we get the 64-bit variants of the CRT's file I/O calls
#ifndef _FILE_OFFSET_BITS
#define _FILE_OFFSET_BITS 64
#endif
#ifndef _LARGEFILE64_SOURCE
#define _LARGEFILE64_SOURCE 1
#endif
#endif
typedef unsigned char uint8;
typedef unsigned short uint16;
typedef unsigned int uint;
int main(int argc, char *argv[])
{
mz_zip_archive zip_archive;
const char *s_Test_archive_filename = "__mz_example2_test__.zip";
const char *s_pComment = "This is a comment";
remove(s_Test_archive_filename);
printf (argv[1] );
bool status = mz_zip_writer_add_file( (&zip_archive), s_Test_archive_filename, argv[1], s_pComment, (uint16)strlen(s_pComment), MZ_BEST_COMPRESSION );
if (!status)
{
printf("mz_zip_reader_init_file() failed!\n");
return EXIT_FAILURE;
}
else
{
printf("success\n");
}
return 0;
}

What is the correct approach to write multiple small pieces to a temp file in c, in multithreads?

I am simulating multithreaded file downloading. My strategy is that each thread receives small file pieces (each file piece has piece_length, piece_size, and start_writing_pos),
and then each thread writes to the same buffer. How do I implement this? Do I have to worry about collisions?
//=================== follow up ============//
So I wrote a small demo as follows:
#include <stdio.h>
int main(){
char* tempfilePath = "./testing";
FILE *fp;
fp = fopen(tempfilePath,"w+");//w+: for reading and writing
fseek( fp, 9, SEEK_SET);//starting in 10-th bytes
fwrite("----------",sizeof(char), 10, fp);
fclose(fp);
}
Before execution, I set the content of "./testing" to "XXXXXXXXXXXXXXXXXXX". After running the above, I get "^#^#^#^#^#^#^#^#^#----------". I wonder where the problem is.
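The result in the demo follows from the mode string: "w+" truncates an existing file to zero length when it is opened, so the old contents are gone before fseek() runs, and the nine skipped bytes are then filled with zero bytes. Opening with "r+" keeps the existing contents. A minimal sketch, where the function name overwrite_at is made up for illustration:

```c
#include <stdio.h>

/* Overwrites len bytes at the given offset in an existing file and leaves
   every other byte untouched. Returns 0 on success, -1 on failure. */
int overwrite_at(const char *path, long offset, const char *data, size_t len) {
    FILE *fp = fopen(path, "r+");   /* "r+": read/write, no truncation */
    if (fp == NULL)
        return -1;
    int ok = fseek(fp, offset, SEEK_SET) == 0 &&
             fwrite(data, 1, len, fp) == len;
    if (fclose(fp) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

Calling overwrite_at("./testing", 9, "----------", 10) on the file from the demo would replace bytes 9 through 18 and keep the first nine X characters. Note that "r+" fails if the file does not exist yet.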
Do what most torrent clients do. Create a file of the final size with the extension .part. Then assign non-overlapping parts of the file to each thread, each of which has its own file descriptor. Thus collisions are avoided. Rename the file to its final name when finished.
Unless you want to use a mutex, you can't use fwrite(). FILE *-based IO using fopen(), fwrite(), and all related functions simply isn't suitable for concurrent positioned writes: the FILE uses a SINGLE buffer, a SINGLE offset, etc.
You can't even use open() and lseek()/write() on a shared descriptor: multiple threads will interfere with each other by modifying the one offset an open file descriptor has.
Use open() to open the file, and use pwrite() to write data to exact offsets.
pwrite() man page:
pwrite() writes up to count bytes from the buffer starting at buf to
the file descriptor fd at offset offset. The file offset is not
changed.
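Putting the two answers together, a minimal sketch might look like the following. The helper names open_part_file and write_piece are made up for illustration; each downloading thread would call write_piece() with its own piece and start offset, and the caller would rename() the file once every piece has arrived.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Creates (or truncates) the download file and sizes it to total_size.
   Returns the descriptor, or -1 on failure. */
int open_part_file(const char *path, off_t total_size) {
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd == -1)
        return -1;
    if (ftruncate(fd, total_size) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Writes exactly len bytes at the given offset, retrying on short writes.
   pwrite() does not touch the descriptor's file offset, so threads may
   call this concurrently on one fd as long as their ranges do not overlap. */
int write_piece(int fd, const char *buf, size_t len, off_t offset) {
    while (len > 0) {
        ssize_t n = pwrite(fd, buf, len, offset);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted before writing: retry */
            return -1;
        }
        if (n == 0)
            return -1;              /* no progress; give up instead of spinning */
        buf += n;
        len -= (size_t)n;
        offset += n;
    }
    return 0;
}
```

Pieces can arrive in any order; writing bytes 5..9 before bytes 0..4 produces the same file as the reverse.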

Clearing file contents only using FILE * [duplicate]

I'm using C to write some data to a file. I want to erase the previous text written in the file in case it was longer than what I'm writing now.
I want to decrease the size of file or truncate until the end. How can I do this?
If you want to preserve the previous contents of the file up to some length (a length greater than zero; other answers cover truncating to zero), then POSIX provides the truncate() and ftruncate() functions for the job.
#include <unistd.h>
int ftruncate(int fildes, off_t length);
int truncate(const char *path, off_t length);
The name indicates the primary purpose - shortening a file. But if the specified length is longer than the previous length, the file grows (zero padding) to the new size. Note that ftruncate() works on a file descriptor, not a FILE *; you could use:
if (ftruncate(fileno(fp), new_length) != 0) ...error handling...
However, you should be aware that mixing file stream (FILE *) and file descriptor (int) access to a single file is apt to lead to confusion — see the comments for some of the issues. This should be a last resort.
It is likely, though, that for your purposes, truncate on open is all you need, and for that, the options given by others will be sufficient.
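If you do need to shorten a file that you already hold as a FILE *, one way to reduce the stream/descriptor confusion is to flush before truncating and reposition afterwards. A minimal sketch, assuming a POSIX system; the function name truncate_stream is made up for illustration:

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Sets the size of the file behind fp to new_length bytes.
   Returns 0 on success, -1 on failure. */
int truncate_stream(FILE *fp, off_t new_length) {
    /* Push any buffered output to the descriptor before changing the size */
    if (fflush(fp) != 0)
        return -1;
    if (ftruncate(fileno(fp), new_length) != 0)
        return -1;
    /* Move the stream position to the new end of the file */
    return fseeko(fp, 0, SEEK_END) == 0 ? 0 : -1;
}
```

Leaving the position at the new end is a design choice made here so that later writes append; seek elsewhere if you need a different position.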
For Windows, there is a function SetEndOfFile() and a related function SetFileValidData() that can do a similar job, but using a different interface. Basically, you seek to where you want to set the end of file and then call the function.
There's also a function _chsize() as documented in the answer by sofr.
On Windows systems there is no <unistd.h> header, but you can still truncate a file by using
_chsize( fileno(f), size);
That's a function of your operating system. The standard POSIX way to do it is:
open("file", O_TRUNC | O_WRONLY);
If this is to run under some flavor of UNIX, these APIs should be available:
#include <unistd.h>
#include <sys/types.h>
int truncate(const char *path, off_t length);
int ftruncate(int fd, off_t length);
According to the "man truncate" on my Linux box, these are POSIX-conforming. Note that these calls will actually increase the size of the file (!) if you pass a length greater than the current length.
<edit>
Ah, you edited your post, you're using C. When you open the file, open it with the mode "w+" like so, and it will truncate it ready for writing:
FILE* f = fopen("C:\\gabehabe.txt", "w+");
fclose(f);
</edit>
To truncate a file in C++, you can simply create an ofstream object to the file, using ios_base::trunc as the file mode to truncate it, like so:
ofstream x("C:\\gabehabe.txt", ios_base::trunc);
If you want to truncate the entire file, opening the file up for writing does that for you. Otherwise, you have to open the file for reading, and read the parts of the file you want to keep into a temporary variable, and then output it to wherever you need to.
Truncate entire file:
FILE *file = fopen("filename.txt", "w"); //automatically clears the entire file for you.
Truncate part of the file:
FILE *inFile = fopen("filename.txt", "r");
//read in the data you want to keep
fclose(inFile);
FILE *outFile = fopen("filename.txt", "w");
//output back the data you want to keep into the file, or what you want to output.
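The read-then-rewrite approach above can be sketched in portable C as follows. The function name keep_prefix is made up for illustration, and the sketch assumes the part you want to keep is the first keep bytes of the file and that it fits in memory.

```c
#include <stdio.h>
#include <stdlib.h>

/* Keeps only the first `keep` bytes of the file at path.
   Returns 0 on success, -1 on failure. */
int keep_prefix(const char *path, size_t keep) {
    FILE *in = fopen(path, "rb");
    if (in == NULL)
        return -1;
    char *data = malloc(keep > 0 ? keep : 1);
    if (data == NULL) {
        fclose(in);
        return -1;
    }
    /* fread may return fewer than `keep` bytes if the file is shorter */
    size_t got = fread(data, 1, keep, in);
    int read_failed = ferror(in);
    fclose(in);
    if (read_failed) {
        free(data);
        return -1;
    }

    FILE *out = fopen(path, "wb");   /* "wb" truncates the file to zero */
    if (out == NULL) {
        free(data);
        return -1;
    }
    int ok = fwrite(data, 1, got, out) == got;
    if (fclose(out) != 0)
        ok = 0;
    free(data);
    return ok ? 0 : -1;
}
```

Because the file is emptied before the kept data is written back, a crash in between loses the contents; writing to a temporary file and renaming it would be the safer variant.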

File processing in c?

I have been given a raw file that holds several jpg images. I have to go through the file, find each jpg image, and put each of those images in a separate file. So far I have code that can find where each image begins and ends. I have also written code that generates the file names I can use for the pictures. It is an array, char filename[], that holds the names image00.jpg - image29.jpg.
What I cannot figure out is how to open a file every time I find an image, and then close that file and open a new one for the next image. Do I need to use fwrite()? Also, each image is stored in blocks of 512 bytes, so I only have to check for a new image every 512 bytes once I find the first one. Do I need to account for that in fwrite()?
So, to summarize my questions, I don't understand how to use fwrite(), if that is what I should be using to write to these files.
Also, I do not know how to open the files using the names I have already created.
Thanks in advance for the help. Let me know if I need to post any other code.
Use fopen(rawfilename, "rb"); to open the raw file for reading, and fread to read from it.
Use fopen(outfilename, "wb"); to open output file for writing and fwrite to write to it.
As mentioned in my comment, you are assigning char *[] to char*, use char filename[] = "image00.jpg"; instead.
Don't forget to close each file with fclose() once you have finished reading or writing it.
Decide how many bytes to read each time by parsing the jpeg header. Use malloc to allocate the number of bytes to be read, and remember that every buffer you allocate must be freed later.
Pretty much any book on C programming should cover the functions you need. As MByD pointed out, you'll want to use the functions fopen(), fwrite(), and fclose().
I imagine your code may include fragments that look something like
/* Warning: untested and probably out-of-order code */
...
const char *filename[] = {
"image00.jpg", "image01.jpg", "image02.jpg",
...
"image29.jpg" };
...
int index = 0;
const int blocksize = 512; /* bytes */
...
index++;
...
FILE * output_file = fopen( filename[index], "wb");
fwrite( output_data, 1, blocksize, output_file );
fclose(output_file);
...
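Tying the fragments together, a minimal sketch of the whole loop might look like this. It is not the asker's code: the function name split_images and the prefix parameter are made up for illustration, and it assumes that a block beginning with the bytes FF D8 FF marks the start of a new JPEG and that an image runs until the next such block.

```c
#include <stdio.h>

#define BLOCK_SIZE 512

/* Assumption: a block that begins with FF D8 FF starts a new JPEG image */
static int starts_jpeg(const unsigned char *b) {
    return b[0] == 0xff && b[1] == 0xd8 && b[2] == 0xff;
}

/* Copies each image found in raw_path to <prefix>00.jpg, <prefix>01.jpg, ...
   Returns the number of images written, or -1 on an I/O error. */
int split_images(const char *raw_path, const char *prefix) {
    FILE *raw = fopen(raw_path, "rb");
    if (raw == NULL)
        return -1;

    unsigned char block[BLOCK_SIZE];
    char name[256];
    FILE *out = NULL;
    int count = 0;
    size_t n;

    while ((n = fread(block, 1, BLOCK_SIZE, raw)) > 0) {
        if (n >= 3 && starts_jpeg(block)) {
            /* New image: close the previous output file and open the next */
            if (out != NULL)
                fclose(out);
            snprintf(name, sizeof name, "%s%02d.jpg", prefix, count);
            out = fopen(name, "wb");
            if (out == NULL) {
                fclose(raw);
                return -1;
            }
            count++;
        }
        /* Blocks before the first image are skipped because out is NULL */
        if (out != NULL && fwrite(block, 1, n, out) != n) {
            fclose(out);
            fclose(raw);
            return -1;
        }
    }
    if (out != NULL)
        fclose(out);
    fclose(raw);
    return count;
}
```

With prefix "image", the output names match the image00.jpg, image01.jpg, ... scheme from the question. Only one output file is open at a time, and fwrite() simply receives each 512-byte block as it is read.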

Resources