Redirect output to compressed format (C)

Is it possible to use fprintf in such a way that it writes data to a compressed file?
For example:
fopen ("myfile.txt","w");
will write to a plain text file, so the file grows very large.

You can use zlib to write data to a compressed stream.
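#include <zlib.h>
gzFile fp;
fp = gzopen(NAME, "wb");                /* NAME is the output path; gzopen returns NULL on failure */
gzprintf(fp, "Hello, %s!\n", "world");  /* printf-style formatting, compressed as it is written */
gzclose(fp);                            /* flushes buffered data and finishes the gzip stream */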
Compile it like this:
gcc -Wall -Wextra -o zprog zprog.c -lz
Use zcat to print the contents of the file.

The minimally-invasive solution if you're on a system that has pipes would be to open a pipe to an external gzip process. That way you can use all the normal stdio output functions without having to replace everything with zlib calls.
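A minimal sketch of the pipe approach, assuming a POSIX system where popen() is available. The function name write_lines_through and the sample text are made up for illustration; the command string is whatever shell pipeline should receive the data.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes popen()/pclose() under strict -std modes */
#include <stdio.h>

/*
 * Write `count` formatted lines to the standard input of a shell command.
 * Returns the status reported by pclose() (0 if the command exited
 * successfully), or -1 if the pipe could not be opened.
 */
int write_lines_through(const char *command, int count)
{
    FILE *out = popen(command, "w");
    if (out == NULL)
        return -1;

    /* From here on it is an ordinary FILE *, so fprintf() works unchanged. */
    for (int i = 0; i < count; i++)
        fprintf(out, "line %d: written with plain stdio\n", i);

    /* pclose() waits for the command to finish reading and exit. */
    return pclose(out);
}
```

Calling write_lines_through("gzip -c > myfile.txt.gz", 1000) should leave a gzip file that zcat can print. Compared with the plain fopen() version, only the open and close calls change.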

On Linux, you could use the zlib library (link with -lz) and write through its compressed streams, for example with gzopen() and gzprintf().

Related

How is rustc able to compile source code from bash process substitution but gcc cannot?

$ rustc <(echo 'fn main(){ print!("Hello world!");}')
$ ls
63
$ gcc <(echo '#include<stdio.h> int main(){ printf("Hello world!\n"); return 0;}')
/dev/fd/63: file not recognized: Illegal seek
collect2: error: ld returned 1 exit status
Why can't ld link the program?
The gcc command is mostly a dispatch engine. For each input file, it determines what sort of file it is from the filename's extension, and then passes the file on to an appropriate processor. So .c files are compiled by the C compiler, .h files are compiled into precompiled headers, .go files are sent to the Go compiler (gccgo), and so on.
If the filename has no extension or the extension is not recognised, gcc assumes that it is some kind of object file which should participate in the final link step. These files are passed to the collect2 utility, which then invokes ld, possibly twice. This will be the case with process substitution, which produces filenames like /dev/fd/63, which do not include extensions.
ld does not rely on the filename to identify the object file format. It is generally built with several different object file recognisers, each of which depends on some kind of "magic number" (that is, a special pattern at or near the beginning of the file). It calls these recognisers one at a time until it finds one which is happy to interpret the file. If the file is not recognised as a binary format, ld assumes that it is a linker script (which is a plain text file) and attempts to parse it as such.
Naturally, between attempts ld needs to rewind the file, and since process substitution arranges for a pipe to be passed instead of a file, the seek will fail. (The same thing would happen if you attempted to pass the file through redirection of stdin to a pipe, which you can do: gcc will process stdin as a file if you specify - as a filename. But it insists that you tell it what kind of file it is. See below.)
Since ld can't rewind the file, it will fail after the file doesn't match its first guess. Hence the error message from ld, which is a bit misleading since you might think that the file has already been compiled and the subsequent failure was in the link step. That's not the case; because the filename had no extension, gcc skipped directly to the link phase and almost immediately failed.
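The "Illegal seek" text in the error is the message for errno ESPIPE, which is what a seek on a pipe produces. A small sketch, assuming a POSIX system (the function name rewind_fails_on_pipe is made up for this example):

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create a pipe and try to seek its read end back to offset 0.
 * Returns 1 if the seek fails with ESPIPE, 0 if it behaves differently,
 * and -1 if the pipe itself could not be created.
 */
int rewind_fails_on_pipe(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    errno = 0;
    off_t pos = lseek(fds[0], 0, SEEK_SET);
    /* Record the result before close() has a chance to change errno. */
    int failed = (pos == (off_t)-1 && errno == ESPIPE);

    close(fds[0]);
    close(fds[1]);
    return failed;
}
```

On a regular file the same lseek() call succeeds, which is why the failure only shows up with process substitution and other pipe-backed inputs.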
In the case of process substitution, pipes, stdin, and badly-named files, you can still manually tell gcc what the file is. You do that with the -x option, which is documented in the GCC manual section on options controlling the kind of output (although in this case, the option actually controls the kind of input).
There are a number of answers to questions like this floating around the Internet, including various answers here on StackOverflow, which claim that GCC attempts to detect the language of input files. It does not do that, and it never has. (And I doubt that it ever will, since some of the languages it compiles are sufficiently similar to each other that accurate detection would be impossible.) The only component which does automatic detection is ld, and it only does that once GCC has irrevocably decided to treat the input file as an object file or linker script.
At least in your case, you can use process substitution when specifying the input language manually, using -xc. However, you should put a newline after the #include directive.
$ gcc -xc <(echo '#include<stdio.h>
int main(){ printf("Hello world!\n"); return 0;}')
$ ls
a.out
$ ./a.out
Hello world!
For a possible reason why this works, see Charles' answer and the comments on this answer.

How to write gz file in C via zlib and compress2

I'm using zlib to write a program that compresses data in several threads, so I can't use gzwrite. I'm using compress2() instead.
*dest_len = compressBound(LOG_BUFF_SZ);
err = compress2((Bytef*)compressed_buff->buff, dest_len, (Bytef*)b->buff, size, GZ_INT_COMPRESSION_LEVEL);
write(fd, compressed_buff->buff, compressed_buff->full);
But when I try to decompress the file with gzip -d, I get the following output: "not in gzip format". What am I doing wrong? Thank you for your answers.
compress() and compress2() compress to the zlib format, not the gzip format. You need to use the lower-level functions to be able to select the gzip format. Those are deflateInit2(), deflate() and deflateEnd(). Read the documentation in zlib.h for those functions. After that, you should also look at the heavily documented example of their use.

Using zlib with C

I am currently learning C, and am having some issues with trying to make a small program that utilizes zlib.
I have managed to compile my application (using Codeblocks/MinGW) with the zlib libraries, and compilation works fine. I have used an example based upon the zpipe.c example found over at the official zlib site (zlib.net).
On execution, the output zip file is created, but it seems malformed and/or empty. I am unable to open it using 7zip.
Here is the code that I have modified. I have simply replaced the main() function within zpipe.c.
int main() {
    printf("Compression test...");
    int ret;
    FILE *fpsource;
    FILE *fpdest;
    fpsource = fopen("test.txt", "rb");
    fpdest = fopen("output.zip", "wb");
    ret = def(fpsource, fpdest, Z_DEFAULT_COMPRESSION);
    if (ret != Z_OK) {
        printf("failure\n");
        zerr(ret);
    }
    else {
        printf("success..\n");
    }
    fclose(fpsource);
    fclose(fpdest);
    return EXIT_SUCCESS;
}
I receive no errors, and my 'success' message is printed. It's just the output file is corrupt.
zpipe.c as-is will generate the zlib format, which is raw deflate data wrapped in a zlib header and trailer. 7zip won't recognize that. It will recognize the gzip or zip format, which are entirely different wrappers on the same raw deflate data.
You can modify zpipe.c to use deflateInit2 (and inflateInit2) instead of the versions without the "2" to select the gzip format instead of the zlib format: adding 16 to the windowBits argument requests the gzip wrapper. zlib.h documents the details.
The code discussed simply compresses the file using the DEFLATE algorithm. The appropriate structures that make it a zip or gzip file are missing.

Audio file format that can be written without seeking

I want to write audio data to stdout, preferably using libsndfile. When I output WAV to /dev/stdout I manage to write the header, but then I get an error
Error : could not open file : /dev/stdout
System error : Illegal seek.
I assume this is related to http://www.mega-nerd.com/libsndfile/FAQ.html#Q017: some file formats cannot be written without seeks. However, when I try to output SF_FORMAT_AU | SF_FORMAT_PCM_16 instead, I still get the same Illegal seek error.
Are there any audio file formats that can be written completely without seeking?
I'm using Linux.
EDIT: It might be obvious, but RAW format works (without seeking). Unfortunately I need a format that has meta information like sample rate.
You should finish reading that FAQ... the link you give us has all the answers.
However, there is at least one file format (AU) which is specifically designed to be written to a pipe.
So use AU instead of WAV.
Also make sure that you open the SNDFILE object with sf_open_fd, and not sf_open_virtual (or sf_open):
SNDFILE* sf_open_fd (int fd, int mode, SF_INFO *sfinfo, int close_desc) ;
SNDFILE* sf_open_virtual (SF_VIRTUAL_IO *sfvirtual, int mode, SF_INFO *sfinfo,
void *user_data) ;
If you use sf_open_fd, then libsndfile will use fstat to determine whether the file descriptor is a pipe or a regular file. If you use sf_open_virtual or sf_open, it will assume that the file is seekable. This appears to be a flaw in libsndfile, but you should be using sf_open_fd anyway.
Footnote: Don't open /dev/stdout to get standard output; it is already open and there is no need to open it again. Use file descriptor STDOUT_FILENO.
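The kind of fstat check mentioned above might look like the sketch below. This is not libsndfile's code; the helper name fd_is_pipe is made up, and a POSIX system is assumed.

```c
#include <sys/stat.h>
#include <unistd.h>

/*
 * Returns 1 if fd refers to a pipe or FIFO, 0 if it refers to anything
 * else (regular file, terminal, ...), and -1 if fstat() fails.
 */
int fd_is_pipe(int fd)
{
    struct stat st;
    if (fstat(fd, &st) != 0)
        return -1;
    return S_ISFIFO(st.st_mode) ? 1 : 0;
}
```

Calling fd_is_pipe(STDOUT_FILENO) before choosing an output format shows whether the program is writing into a pipeline, where seeking back to patch a header will not work.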
I ended up outputting an "infinite" WAV header, and then writing raw PCM data for as long as the audio lasts. Not really valid, but most players seem to understand it anyway.
The wav header is here, in case anyone wants it: https://gist.github.com/1428176
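For readers who cannot open the gist, here is a sketch of what such a header could look like. It is not the gist's code; it assumes the canonical 44-byte PCM WAV layout and sets both size fields to 0xFFFFFFFF so that nothing has to be patched afterwards. The function and helper names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Store v at p in little-endian byte order, as RIFF requires. */
static void put_le16(unsigned char *p, uint16_t v)
{
    p[0] = (unsigned char)(v & 0xff);
    p[1] = (unsigned char)(v >> 8);
}

static void put_le32(unsigned char *p, uint32_t v)
{
    for (int i = 0; i < 4; i++)
        p[i] = (unsigned char)((v >> (8 * i)) & 0xff);
}

/* Fill hdr with a 44-byte PCM WAV header whose RIFF and data chunk sizes
 * are set to the maximum value instead of the real (unknown) length. */
void make_streaming_wav_header(unsigned char hdr[44], uint32_t sample_rate,
                               uint16_t channels, uint16_t bits_per_sample)
{
    uint16_t block_align = (uint16_t)(channels * (bits_per_sample / 8));

    memcpy(hdr, "RIFF", 4);
    put_le32(hdr + 4, 0xFFFFFFFFu);               /* RIFF chunk size: "infinite" */
    memcpy(hdr + 8, "WAVE", 4);
    memcpy(hdr + 12, "fmt ", 4);
    put_le32(hdr + 16, 16);                       /* size of the fmt chunk */
    put_le16(hdr + 20, 1);                        /* format tag 1 = PCM */
    put_le16(hdr + 22, channels);
    put_le32(hdr + 24, sample_rate);
    put_le32(hdr + 28, sample_rate * block_align);/* bytes per second */
    put_le16(hdr + 32, block_align);              /* bytes per sample frame */
    put_le16(hdr + 34, bits_per_sample);
    memcpy(hdr + 36, "data", 4);
    put_le32(hdr + 40, 0xFFFFFFFFu);              /* data chunk size: "infinite" */
}
```

After filling the buffer, write its 44 bytes to stdout with fwrite() and follow them with little-endian PCM samples. As noted above, the result is not strictly valid WAV, so player support is not guaranteed.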
You could write to a temp file (perhaps in /tmp), let libsndfile seek to modify the .wav (RIFF) header of the temp file, and then, after libsndfile has closed the file, stream the temp file out to stdout.

How can I copy files in C without platform dependency?

It looks like this question is pretty simple, but I can't find a clear solution for copying files in C without platform dependency.
I used a system() call in my open source project for creating a directory, copying files, and running external programs. It works very well on Mac OS X and other Unix-ish systems, but it fails on Windows. The problems were:
system( "cp a.txt destination/b.txt" );
Windows uses backslashes for path separator. (vs slashes in Unix-ish)
Windows uses 'copy' for the internal copy command. (vs cp in Unix-ish)
How can I write copying code without platform dependency?
(Actually, I wrote macros to solve these problems, but it's not cool. http://code.google.com/p/npk/source/browse/trunk/npk/cli/tests/testutil.h, L22-56)
The system() function is a lot more trouble than it's worth; it invokes the shell in a separate process, and should usually be avoided.
Instead, fopen() a.txt and dest/b.txt, and use getc()/putc() to do the copying (because the standard library is more likely to do page-aligned buffering than you):
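FILE *src = fopen("a.txt", "rb");
FILE *dst = fopen("dest/b.txt", "wb");
int i;  /* int, not char: getc() returns EOF as an out-of-band value */
if (src != NULL && dst != NULL)  /* fopen() returns NULL on failure */
{
    for (i = getc(src); i != EOF; i = getc(src))
    {
        putc(i, dst);
    }
}
if (dst != NULL) fclose(dst);
if (src != NULL) fclose(src);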
You need to use the C standard library functions in stdio.h.
In particular, fopen, fread, fwrite, and fclose will be sufficient.
Be sure to include the b ("binary") option in the flags to fopen.
[edit]
Unfortunately, the file names themselves (forward-slashes vs. back-slashes) are still platform dependent. So you will need some sort of #ifdef or similar to deal with that.
Or you can use a cross-platform toolkit.
Use the standard C library stdio.h. First open the input file for reading using fopen(inputFilename, "rb") and open the output file for writing using fopen(outputFilename, "wb"), then copy the content using fread and fwrite. Finally, close both files using fclose.
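The fread/fwrite recipe above can be sketched as one function using only stdio.h. The name copy_file and the 4096-byte buffer size are arbitrary choices for this example.

```c
#include <stdio.h>

/*
 * Copy src_path to dst_path in binary mode.
 * Returns 0 on success and -1 if either file cannot be opened or an
 * I/O error occurs along the way.
 */
int copy_file(const char *src_path, const char *dst_path)
{
    FILE *src = fopen(src_path, "rb");
    if (src == NULL)
        return -1;

    FILE *dst = fopen(dst_path, "wb");
    if (dst == NULL) {
        fclose(src);
        return -1;
    }

    char buf[4096];
    size_t n;
    int ok = 1;

    /* fread() returns how many bytes it actually got; 0 means EOF or error. */
    while ((n = fread(buf, 1, sizeof buf, src)) > 0) {
        if (fwrite(buf, 1, n, dst) != n) {
            ok = 0;
            break;
        }
    }
    if (ferror(src))
        ok = 0;

    fclose(src);
    /* fclose() flushes buffered output, so its result matters for dst. */
    if (fclose(dst) != 0)
        ok = 0;

    return ok ? 0 : -1;
}
```

A call such as copy_file("a.txt", "destination/b.txt") replaces the system("cp ...") line from the question. The destination directory must already exist, because standard C has no function for creating directories.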

Resources