Writing to an unformatted, direct access binary file in C


I'm trying to use this diehard repo to test a stream of numbers for randomness (https://github.com/reubenhwk/diehard). It gives these specifics for the file type it reads:
Then the command
diehard
will prompt for the name of the file to be tested.
That file must be a form="unformatted",access="direct" binary
file of from 10 to 12 million bytes.
These are file conventions specific to Fortran, no? The problem is that I'm using an LFSR generator to produce my random bit stream, and that's in C. I tried just writing the stream with fputc to a text file or a .bin file, but diehard doesn't seem to accept it.
I have no experience with Fortran. Is there any way to create this file type using C? Will I just have to bite the bullet and have C call a Fortran subroutine that creates the file? Here's my C code for reference:
#include <stdio.h>
int main(void)
{
FILE *fp;
fp=fopen("numbers", "wb");
const unsigned int init = 1;
unsigned int v = init;
int counter = 0;
do {
v = shift_lfsr(v);
fputc( (((v & 1) == 0) ? '0' : '1'), fp);
counter += 1;
} while (counter < 11000000);
}

You're creating the binary file just fine. Your problem is that you're only writing a single random bit per byte (and expanding it into text). Diehard wants every bit to be random. So accumulate 8 bits at a time before you write:
do {
int b = 0;
for (int i = 0; i < 8; i += 1) {
v = shift_lfsr(v);
b <<= 1;
b |= (v & 1);
}
fputc(b, fp);
. . .
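A fuller sketch of the same idea, for reference. The question never shows shift_lfsr, so the version here is a hypothetical stand-in (a 16-bit Galois LFSR with tap mask 0xB400); its period is far too short for a real diehard run, and write_packed_bits is likewise an illustrative name. Only the packing loop is the point.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the asker's shift_lfsr(): a 16-bit Galois LFSR.
   Replace it with the real generator. */
static unsigned int shift_lfsr(unsigned int v)
{
    unsigned int lsb = v & 1u;
    v = (v & 0xFFFFu) >> 1;
    if (lsb)
        v ^= 0xB400u;
    return v;
}

/* Pack 8 generator bits into each output byte (first bit is the MSB).
   Returns the number of bytes actually written. */
size_t write_packed_bits(FILE *fp, unsigned int *state, size_t nbytes)
{
    assert(fp != NULL && state != NULL);
    unsigned int v = *state;
    size_t written = 0;
    for (; written < nbytes; written++) {
        unsigned int b = 0;
        for (int i = 0; i < 8; i++) {
            v = shift_lfsr(v);
            b = (b << 1) | (v & 1u);   /* one random bit per shift */
        }
        if (fputc((int)b, fp) == EOF)
            break;                     /* stop on write error */
    }
    *state = v;
    return written;
}
```

To produce the file, open it with fopen("numbers", "wb"), call write_packed_bits(fp, &state, 11000000) with state initialised to a non-zero seed, and fclose the file before running diehard.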

Related

How does fread() in C work inside a for loop?

I am new to C programming, but I need it to read some binary file which I describe below.
The India Meteorological Department (IMD) has provided historical weather data in .GRD files in their website. They have also provided sample C code to read those files. From their sample C code, I have written the following code that extracts the daily minimum temperatures on 15 April 1980 recorded on a 31x31 grid over India.
/* This program reads binary data for 365/366 days and writes in ascii file. */
#include <stdio.h>
int main() {
float t[31][31];
int i,j ,k;
FILE *fin,*fout;
fin = fopen("C:\\New folder\\Mintemp_MinT_1980.GRD","rb"); // Input file
fout = fopen("C:\\New folder\\MINT15APR1980.TXT","w"); // Output file
if(fin == NULL) {
printf("Can't open input file");
return 0;
}
if(fout == NULL) {
printf("Can't open output file");
fclose(fin);
return 0;
}
fprintf(fout,"Daily Minimum Temperature for 15 April 1980\n"); // write the header only after both files are known to be open
for(k=0 ; k<366 ; k++) {
fread(&t,sizeof(t),1,fin);
if(k == 105) {
for(i=0 ; i < 31 ; i++) {
fprintf(fout,"\n") ;
for(j=0 ; j < 31 ; j++)
fprintf(fout,"%6.2f",t[i][j]);
}
}
}
fclose(fin);
fclose(fout);
return 0;
}
/* end of main */
The file Mintemp_MinT_1980.GRD can be downloaded from the IMD website by selecting the year as 1980 against Minimum Temperature.
What I don't understand is how the fread() function actually works in the line fread(&t,sizeof(t),1,fin) within the loop for(k=0 ; k<366 ; k++). At first sight, the arguments of fread() here do not depend on the loop variable k, so it should read the same data into the matrix t[31][31] for every k. However, I have checked that, surprisingly, the data extracted by this program differ for different values of k in the line if(k == 105); for example, the data extracted for k == 105 and k == 32 are different.
I would very much appreciate it if someone could explain this.

Files contain sequential data. All the file operators are based on the premise that whatever you do to a file, you'll generally be doing it in a sequential way.
So when you read data, and then read more data, you will be getting sequential chunks of the file. Both the FILE datatype and the operating system itself do a number of things for you, including keeping track of your current position in the file and doing block buffering in memory to improve performance.
If you wanted to reread the same data over, or skip around in the file, you would need to use fseek() to change positions in the file before doing your next read.
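As a rough sketch of the fseek() approach, the function below jumps straight to one day's record instead of reading every earlier day. It assumes, as the question's loop implies, that the file is nothing but back-to-back 31x31 float grids with no header; read_day and GRID are names made up for this example.

```c
#include <assert.h>
#include <stdio.h>

#define GRID 31

/* Read the grid for one zero-based day index by seeking to its record.
   Returns 0 on success, -1 if the seek or the read fails. */
int read_day(FILE *fin, int day_index, float grid[GRID][GRID])
{
    assert(fin != NULL && day_index >= 0);
    /* Each record is GRID*GRID floats, so record k starts at k * record size. */
    long offset = (long)day_index * (long)(GRID * GRID * sizeof(float));
    if (fseek(fin, offset, SEEK_SET) != 0)
        return -1;
    if (fread(grid, sizeof(float), GRID * GRID, fin) != (size_t)(GRID * GRID))
        return -1;
    return 0;
}
```

With this, read_day(fin, 105, t) would fetch the same record that the loop in the question reaches on its 106th call to fread(), because each call to fread() advances the file position by one record.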

Quantization noise during writing wav file using C

I'm writing a library to read and write WAV files (I need it for audio processing). As a test, I read samples from a WAV file, convert them to double normalized to the range -1 to 1, and then convert them straight back to integers according to the bits per sample. (If the file has N bits per sample, I divide by 2^(N-1)-1 and later multiply by the same factor to restore the value.)
The problem is that the output WAV file has background noise (I'd say it sounds like quantization noise) and I don't know why. Can you help me find the cause?
the library is here: https://pastebin.com/mz5TWMPN
the header file is: https://pastebin.com/Lr2tbmnv
and a demo main function is like:
#include <stdio.h>
#include <math.h>
#include "wavreader.h"
#define FRAMESIZE 512
int main()
{
FILE *fh;
FILE *fhWrite;
struct WavHeader * header;
struct WavHeader * newHeader;
double frame[FRAMESIZE];
int iBytesWritten;
int i;
char test;
fh = fopen("D:/ArbeitsOrdner/advanced_pacev/AudioSample/spfg.wav", "rb+");
if (fh == NULL)
{
printf("Failed to open organ.wav\n");
return 1;
}
fhWrite = fopen("D:/ArbeitsOrdner/MyC/test_organ.wav", "wb+");
if (fhWrite == NULL)
{
printf("Failed to create test_organ.wav\n");
return 1;
}
header = readWaveHeader(fh);
printWaveHeader(header);
newHeader = createWaveHeader(header->iChannels, header->iSampleRate, header->iBitsPerSample);
WaveWriteHeader(fhWrite, newHeader);
while (WaveReadFrame(fh, header, FRAMESIZE, frame) != -1)
{
iBytesWritten = WaveWriteFrame(fhWrite, newHeader, FRAMESIZE, frame);
if (iBytesWritten < 0)
{
printf("Error occured while writing to new file\n");
return 1;
}
}
WaveWriteHeader(fhWrite, newHeader);
fclose(fhWrite);
fclose(fh);
return 0;
}

Thanks for viewing this post. I found the problem myself: I used char instead of unsigned char for the raw bytes. When converting them to int16 or int32 I had not accounted for the sign bit, so bytes of 0x80 and above were sign-extended and the converted values were not what they were expected to be.
The solution is one of the following:
either stay with signed char and use:
buffer[i] & 0xff
to get the correct raw data for conversion, or change the char types into unsigned char:
unsigned char * buffer;
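To make the sign problem concrete, here is a small sketch that assembles one 16-bit little-endian sample in three ways. The function names are invented for this example and are not taken from the pastebin library; the naive version is included only to show how a sign-extended low byte corrupts the result.

```c
#include <assert.h>
#include <stdint.h>

/* Turn an unsigned 16-bit pattern (0..65535) into a signed sample. */
static int16_t to_signed16(int v)
{
    assert(v >= 0 && v <= 0xFFFF);
    if (v >= 0x8000)
        v -= 0x10000;
    return (int16_t)v;
}

/* Correct: raw bytes held as unsigned char. */
int16_t le16_from_unsigned(const unsigned char *p)
{
    return to_signed16(p[0] | (p[1] << 8));
}

/* Correct: raw bytes held as signed char, masked with 0xff before use. */
int16_t le16_from_signed_masked(const signed char *p)
{
    return to_signed16((p[0] & 0xff) | ((p[1] & 0xff) << 8));
}

/* Wrong: a low byte of 0x80..0xFF is sign-extended and overwrites the high byte. */
int16_t le16_naive(const signed char *p)
{
    return (int16_t)(p[0] | (p[1] * 256));
}
```

For the byte pair 0x80 0x00, the first two functions yield 128 while the naive one yields -128, which is the kind of error that shows up as noise in the written file.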

Binary file that can only be printed in hex format, but not binary format

Here is a binary file that contains:
0xff 0xff 0xff
which is exactly three bytes.
I try to use the dump_file function here
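#include <stdio.h>
#include "table.h"
#include "debug.h"
typedef unsigned int Code;
void dump_code( Code code,int BitsNum ); // declared before first use
void dump_file( char* fileName[] )
{
unsigned char c; // unsigned so 0xff is not sign-extended when widened to Code
for (int i = 0; i < 4; ++i)
{
log_info("File: %s",fileName[i]);
FILE* file = fopen(fileName[i],"rb");
if( file == NULL ) continue; // skip files that cannot be opened
while( fread(&c,sizeof(char),1,file) == 1 ){ // loop on the read result rather than feof()
dump_code( c , 8 );
}
fclose(file);
}
}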
void dump_code( Code code,int BitsNum )
{
int mask = 1 << (BitsNum-1);
for (int i = 0; i < BitsNum ; ++i)
{
if(i%8==0)putchar('|');
putchar((mask & code) ? '1' : '0');
code <<= 1;
}
puts("");
}
to print the file in binary format, but it prints nothing. (Somehow it hits EOF immediately?)
I also used the Unix utility xxd.
When I tell xxd to print my file in binary, it prints nothing. But if I choose to print in hexadecimal, it prints as expected. What's wrong with this file?
This file is generated by a parser. The C program uses fseek to jump to various locations in a file and print the corresponding binary code. It might go like:
0th byte --> 1st byte --> 3rd byte --> 5th byte --> 2nd byte --> 4th byte --> 6th byte
It is guaranteed that there is no "leak" in the resulting file, i.e, every byte will be traversed.
What is the reason for this strange behavior?
Update 1
Although samgak pointed out that this might be due to the interpretation of 0xff, some of my other experiments indicate that even a file containing:
0x01 0x01 0x01
which results in the same phenomenon.
Update 2
Here's the relevant code that writes Code into the files:
#define CODE_FILE_NUM 3
void writeCode( FILE* out[] , Code code ){
for (int i = 0; i < CODE_FILE_NUM; ++i){
fwrite(&code,sizeof(char),1,out[i]);
code >>= 8;
}
}
Code is an unsigned int, which has 4 bytes. The function writeCode only considers the lower 3 bytes and writes each byte into one of 3 separate files.

I have found the reason.
It's because I forgot to close the output files.
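I tried to dump unclosed binary files (that is, I opened and read files that were still open for writing). Data written through stdio can remain in the stream's buffer until the file is flushed or closed, so a reader may see an empty or incomplete file, which resulted in the unpredictable behavior.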
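A minimal sketch of the safe ordering: finish writing, close the output (which flushes the stdio buffer), and only then open the file for reading. The helper names write_and_close and count_bytes are made up for this illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Write n bytes to path and close the file so the data reaches disk.
   Returns 0 on success, -1 on any failure. */
int write_and_close(const char *path, const unsigned char *data, size_t n)
{
    assert(path != NULL);
    FILE *out = fopen(path, "wb");
    if (out == NULL)
        return -1;
    size_t written = fwrite(data, 1, n, out);
    /* fclose() flushes buffered data; its return value must be checked too. */
    if (fclose(out) != 0 || written != n)
        return -1;
    return 0;
}

/* Count the bytes a fresh reader can see in path, or -1 if it cannot be opened. */
long count_bytes(const char *path)
{
    assert(path != NULL);
    FILE *in = fopen(path, "rb");
    if (in == NULL)
        return -1;
    long count = 0;
    while (fgetc(in) != EOF)
        count++;
    fclose(in);
    return count;
}
```

If the output has to stay open while another reader inspects it, calling fflush() on the output stream before reading is the usual alternative to closing it.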

Seeking single file encryption implementation which can handle whole file en/de-crypt in Delphi and C

[Update] I am offering a bonus for this. Frankly, I don't care which encryption method is used. Preferably something simple like XTEA, RC4, Blowfish ... but you choose.
I want minimum effort on my part, preferably just dropping the files into my projects and building.
Ideally you should already have used the code to en/de-crypt a file in Delphi and C (I want to trade files between an Atmel UC3 microprocessor (coding in C) and a Windows PC (coding in Delphi), encrypting and decrypting in both directions).
I have a strong preference for a single .PAS unit and a single .C/.H file. I do not want to use a DLL or a library supporting dozens of encryption algorithms, just one (and I certainly don't want anything with an install program).
I hope that I don't sound too picky here, but I have been googling and trying code for over a week and still can't find two implementations which match. I suspect that only someone who has already done this can help me ...
Thanks in advance.
As a follow-up to my previous post, I am still looking for some very simple code with which I can, with minimal effort, en/de-crypt a file and exchange it between Delphi on a PC and C on an Atmel UC3 microprocessor.
It sounds simple in theory, but in practice it's a nightmare. There are many possible candidates and I have spent days googling and trying them out, to no avail.
Some are humongous libraries, supporting many encryption algorithms, and I want something lightweight (especially on the C / microprocessor end).
Some look good, but one set of source offers only block manipulation, the other only strings (I would prefer whole file en/de-crypt).
Most seem to be very poorly documented, with meaningless parameter names and no example code to call the functions.
Over the past weekend (plus a few more days), I have burned my way through a slew of XTEA, XXTEA and Blowfish implementations, but while I can encrypt, I can't reverse the process.
Now I am looking at AES-256. Does anyone know of an implementation in C which is a single AES.C file? (plus AES.H, of course)
Frankly, I will take anything that will do whole file en/de-crypt between Delphi and C, but unless anyone has actually done this themselves, I expect to hear only "any implementation that meets the standard should do", which is a nice theory but just not working out for me :-(
Any simple AES-256 in C out there? I have some reasonable looking Delphi code, but won't be sure until I try them together.
Thanks in advance ...

I would suggest using the .NET Micro Framework on a secondary microcontroller (e.g. Atmel SAM7X) as a crypto coprocessor. You can test this out on a Netduino, which you can pick up for around $35 / £30. The framework includes an AES implementation within it, under the System.Security.Cryptography namespace, alongside a variety of other cryptographic functions that might be useful for you. The benefit here is that you get a fully tested and working implementation, and increased security via type-safe code.
You could use SPI or I2C to communicate between the two microcontrollers, or bit-bang your own data transfer protocol over several I/O lines in parallel if higher throughput is needed.
I did exactly this with an Arduino and a Netduino (using the Netduino to hash blocks of data for a hardware BitTorrent device) and implemented a rudimentary asynchronous system using various commands sent between the devices via SPI and an interrupt mechanism.
Arduino is SPI master, Netduino is SPI slave.
A GPIO pin on the Netduino is set as an output, and tied to another interrupt-enabled GPIO pin on the Arduino that is set as an input. This is the interrupt pin.
Arduino sends 0xF1 as a "hello" initialization message.
Netduino sends back 0xF2 as an acknowledgement.
When Arduino wants to hash a block, it sends 0x48 (ASCII 'H') followed by the data. When it is done sending data, it sets CS low. It must send whole bytes; setting CS low when the number of received bits is not divisible by 8 causes an error.
The Netduino receives the data, and sends back 0x68 (ASCII 'h') followed by the number of received bytes as a 2-byte unsigned integer. If an error occurred, it sends back 0x21 (ASCII '!') instead.
If it succeeded, the Netduino computes the hash, then sets the interrupt pin high. During the computation time, the Arduino is free to continue its job whilst waiting.
The Arduino sends 0x52 (ASCII 'R') to request the result.
The Netduino sets the interrupt pin low, then sends 0x72 (ASCII 'r') and the raw hash data back.
Since the Arduino can service interrupts via GPIO pins, it allowed me to make the processing entirely asynchronous. A variable on the Arduino side tracks whether we're currently waiting on the coprocessor to complete its task, so we don't try to send it a new block whilst it's still working on the old one.
You could easily adapt this scheme for computing AES blocks.
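The command bytes in the steps above could be captured in C roughly as follows. This is only a sketch: the enum and function names are invented here, the answer does not say in which byte order the 2-byte count is sent (big-endian is assumed), and treating a block longer than 65535 bytes as an error is also an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Command and response bytes as listed in the protocol description. */
enum {
    CMD_HELLO   = 0xF1,
    RSP_ACK     = 0xF2,
    CMD_HASH    = 0x48,  /* ASCII 'H' */
    RSP_HASH_OK = 0x68,  /* ASCII 'h' */
    RSP_ERROR   = 0x21,  /* ASCII '!' */
    CMD_RESULT  = 0x52,  /* ASCII 'R' */
    RSP_RESULT  = 0x72   /* ASCII 'r' */
};

/* Build the coprocessor's reply to a hash request.
   Returns the number of reply bytes placed in out (1 on error, 3 on success). */
size_t build_hash_reply(size_t bits_received, unsigned char out[3])
{
    assert(out != NULL);
    size_t nbytes = bits_received / 8;
    /* A partial byte is an error; so is a count that does not fit in 2 bytes (assumed). */
    if (bits_received % 8 != 0 || nbytes > 0xFFFF) {
        out[0] = RSP_ERROR;
        return 1;
    }
    out[0] = RSP_HASH_OK;
    out[1] = (unsigned char)(nbytes >> 8);    /* high byte first (assumed order) */
    out[2] = (unsigned char)(nbytes & 0xFF);
    return 3;
}
```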

Small C library for AES-256 by Ilya Levin. It is a short implementation with no assembly and simple usage. I'm not sure how it would work on your current micro CPU, though.
[Edit]
You've mentioned having some Delphi implementation, but in case the two don't work together, try this or this.
I've also found an Arduino (AVR-based) module using Ilya's library, so it should also work on your micro CPU.

Can you compile C code from Delphi? (You can compile Delphi code from C++ Builder; I'm not sure about vice versa.) Or maybe use the free Borland command-line C++ compiler, or even another C compiler.
The idea is to use the same C code in your Windows app as you use on your microprocessor. That way you can be reasonably sure that the code will work in both directions.
[Update] See
http://www.drbob42.com/examines/examin92.htm
http://www.hflib.gov.cn/e_book/e_book_file/bcb/ch06.htm (Using C++ Code in Delphi)
http://edn.embarcadero.com/article/10156#H11
It looks like you need to use a DLL, but you can statically link it if you don't want to distribute it

Here is RC4 code. It is very lightweight.
The C has been used in a production system for five years.
I have added lightly tested Delphi code. The Pascal is a line-by-line port, with unsigned char going to Byte. I have only run the Pascal in Free Pascal with the Delphi option turned on, not in Delphi itself. Both C and Pascal have simple file processors.
Scrambling the ciphertext gives the original cleartext back.
No bugs reported to date. Hope this solves your problem.
rc4.h
#ifndef RC4_H
#define RC4_H
/*
* rc4.h -- Declarations for a simple rc4 encryption/decryption implementation.
* The code was inspired by libtomcrypt. See www.libtomcrypt.org.
*/
typedef struct TRC4State_s {
int x, y;
unsigned char buf[256];
} TRC4State;
/* rc4.c */
void init_rc4(TRC4State *state);
void setup_rc4(TRC4State *state, char *key, int keylen);
unsigned endecrypt_rc4(unsigned char *buf, unsigned len, TRC4State *state);
#endif
rc4.c
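#include <stdio.h>  /* FILE, fopen, fread, fwrite, fclose */
#include <string.h> /* strlen */
#include "rc4.h"
void init_rc4(TRC4State *state)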
{
int x;
state->x = state->y = 0;
for (x = 0; x < 256; x++)
state->buf[x] = x;
}
void setup_rc4(TRC4State *state, char *key, int keylen)
{
unsigned tmp;
int x, y;
// use only first 256 characters of key
if (keylen > 256)
keylen = 256;
for (x = y = 0; x < 256; x++) {
y = (y + state->buf[x] + key[x % keylen]) & 255;
tmp = state->buf[x];
state->buf[x] = state->buf[y];
state->buf[y] = tmp;
}
state->x = 255;
state->y = y;
}
unsigned endecrypt_rc4(unsigned char *buf, unsigned len, TRC4State *state)
{
int x, y;
unsigned char *s, tmp;
unsigned n;
x = state->x;
y = state->y;
s = state->buf;
n = len;
while (n--) {
x = (x + 1) & 255;
y = (y + s[x]) & 255;
tmp = s[x]; s[x] = s[y]; s[y] = tmp;
tmp = (s[x] + s[y]) & 255;
*buf++ ^= s[tmp];
}
state->x = x;
state->y = y;
return len;
}
int endecrypt_file(FILE *f_in, FILE *f_out, char *key)
{
TRC4State state[1];
unsigned char buf[4096];
size_t n_read, n_written;
init_rc4(state);
setup_rc4(state, key, strlen(key));
do {
n_read = fread(buf, 1, sizeof buf, f_in);
endecrypt_rc4(buf, n_read, state);
n_written = fwrite(buf, 1, n_read, f_out);
} while (n_read == sizeof buf && n_written == n_read);
return (n_written == n_read) ? 0 : 1;
}
int endecrypt_file_at(char *f_in_name, char *f_out_name, char *key)
{
int rtn;
FILE *f_in = fopen(f_in_name, "rb");
if (!f_in) {
return 1;
}
FILE *f_out = fopen(f_out_name, "wb");
if (!f_out) {
fclose(f_in); /* f_in is a FILE *, so it must be released with fclose(), not close() */
return 2;
}
rtn = endecrypt_file(f_in, f_out, key);
fclose(f_in);
fclose(f_out);
return rtn;
}
#ifdef TEST
// Simple test.
int main(void)
{
char *key = "This is the key!";
endecrypt_file_at("rc4.pas", "rc4-scrambled.c", key);
endecrypt_file_at("rc4-scrambled.c", "rc4-unscrambled.c", key);
return 0;
}
#endif
Here is lightly tested Pascal. I can scramble the source code in C and descramble it with the Pascal implementation just fine.
type
RC4State = record
x, y : Integer;
buf : array[0..255] of Byte;
end;
KeyString = String[255];
procedure initRC4(var state : RC4State);
var
x : Integer;
begin
state.x := 0;
state.y := 0;
for x := 0 to 255 do
state.buf[x] := Byte(x);
end;
procedure setupRC4(var state : RC4State; var key : KeyString);
var
tmp : Byte;
x, y : Integer;
begin
y := 0;
for x := 0 to 255 do begin
y := (y + state.buf[x] + Integer(key[1 + x mod Length(key)])) and 255;
tmp := state.buf[x];
state.buf[x] := state.buf[y];
state.buf[y] := tmp;
end;
state.x := 255;
state.y := y;
end;
procedure endecryptRC4(var buf : array of Byte; len : Integer; var state : RC4State);
var
x, y, i : Integer;
tmp : Byte;
begin
x := state.x;
y := state.y;
for i := 0 to len - 1 do begin
x := (x + 1) and 255;
y := (y + state.buf[x]) and 255;
tmp := state.buf[x];
state.buf[x] := state.buf[y];
state.buf[y] := tmp;
tmp := (state.buf[x] + state.buf[y]) and 255;
buf[i] := buf[i] xor state.buf[tmp]
end;
state.x := x;
state.y := y;
end;
procedure endecryptFile(var fIn, fOut : File; key : KeyString);
var
nRead, nWritten : Longword;
buf : array[0..4095] of Byte;
state : RC4State;
begin
initRC4(state);
setupRC4(state, key);
repeat
BlockRead(fIN, buf, sizeof(buf), nRead);
endecryptRC4(buf, nRead, state);
BlockWrite(fOut, buf, nRead, nWritten);
until (nRead <> sizeof(buf)) or (nRead <> nWritten);
end;
procedure endecryptFileAt(fInName, fOutName, key : String);
var
fIn, fOut : File;
begin
Assign(fIn, fInName);
Assign(fOut, fOutName);
Reset(fIn, 1);
Rewrite(fOut, 1);
endecryptFile(fIn, fOut, key);
Close(fIn);
Close(fOut);
end;
{$IFDEF TEST}
// Very small test.
const
key = 'This is the key!';
begin
endecryptFileAt('rc4.pas', 'rc4-scrambled.pas', key);
endecryptFileAt('rc4-scrambled.pas', 'rc4-unscrambled.pas', key);
end.
{$ENDIF}

It looks like the easier route would be to take a reference AES implementation (which works on blocks) and add some code to handle CBC (or CTR) mode.
This would only require you to add about 30-50 lines of code, something like the following (for CBC):
aes_expand_key();
first_block = iv;
for (i = 0; i < filesize / 16; i++)
{
data_block = read(file, 16);
data_block = (data_block ^ iv);
iv = encrypt_block(data_block);
write(outputfile, iv);
}
// if filesize % 16 != 0, then you also need to add some padding and encrypt the last block
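A compilable sketch of the same CBC loop, with the padding step filled in. Two things here are assumptions rather than part of the answer: PKCS#7 is used as the padding scheme (the answer only says "some padding"), and the block cipher is passed in as a callback because no particular AES library is named. toy_block_encrypt is a deliberately insecure placeholder that exists only so the chaining can be exercised; a real AES block function would take its place.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BLOCK 16

/* Hypothetical callback: encrypts one 16-byte block in place. */
typedef void (*block_encrypt_fn)(unsigned char block[BLOCK], void *ctx);

/* Placeholder only, NOT a cipher: adds 1 to every byte. */
void toy_block_encrypt(unsigned char block[BLOCK], void *ctx)
{
    (void)ctx;
    for (int i = 0; i < BLOCK; i++)
        block[i] = (unsigned char)(block[i] + 1);
}

/* PKCS#7 padding: append 1..16 bytes, each equal to the pad length.
   buf must have room for len + BLOCK bytes. Returns the padded length. */
size_t pkcs7_pad(unsigned char *buf, size_t len)
{
    size_t pad = BLOCK - (len % BLOCK);
    memset(buf + len, (int)pad, pad);
    return len + pad;
}

/* CBC-encrypt buf in place; len must be a multiple of BLOCK.
   iv is updated to the last ciphertext block so calls can be chained. */
void cbc_encrypt(unsigned char *buf, size_t len, unsigned char iv[BLOCK],
                 block_encrypt_fn enc, void *ctx)
{
    assert(len % BLOCK == 0);
    for (size_t off = 0; off < len; off += BLOCK) {
        for (int i = 0; i < BLOCK; i++)
            buf[off + i] ^= iv[i];        /* XOR with previous ciphertext block */
        enc(buf + off, ctx);
        memcpy(iv, buf + off, BLOCK);     /* this ciphertext feeds the next block */
    }
}
```

Decryption runs the same chain in reverse (decrypt the block, then XOR with the previous ciphertext block) and strips the padding by reading the last byte.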

Assuming encryption strength isn't an issue (for example, you only need to satisfy an organization's Chinese Wall requirement), the very simple "Sawtooth" scheme of adding (i++ % 256) to each byte returned by fgetc(), starting at the beginning of the file, might work just fine.
Declaring i as an unsigned char (UCHAR) eliminates the need for the modulo, since a single-byte unsigned integer cannot help but cycle through its 0-255 range.
The code is so simple it's not worth posting. A little imagination, and you'll have some embellishments that can add a lot to the strength of this cypher. The primary vulnerability of this cypher is large blocks of identical characters. Rectifying this is a good place to start improving its strength.
This cypher works on every possible file type, and is especially effective if you've already 7Zipped the file.
Performance is phenomenal. You won't even know the code is there. Totally I/O bound.
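For completeness, a sketch of what the description above amounts to, written against an in-memory buffer rather than fgetc() so it is easy to try. The function names are invented, and as the answer itself says, this is obfuscation rather than real encryption.

```c
#include <assert.h>
#include <stddef.h>

/* Add a counter that cycles 0..255 to each byte ("sawtooth"). */
void sawtooth_encrypt(unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    unsigned char i = 0;                 /* wraps from 255 back to 0 by itself */
    for (size_t k = 0; k < len; k++)
        buf[k] = (unsigned char)(buf[k] + i++);
}

/* Undo sawtooth_encrypt() by subtracting the same counter. */
void sawtooth_decrypt(unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    unsigned char i = 0;
    for (size_t k = 0; k < len; k++)
        buf[k] = (unsigned char)(buf[k] - i++);
}
```

A run of identical input bytes comes out as a visible ramp (for example, "AAAA" becomes "ABCD"), which is the weakness with large blocks of identical characters mentioned above.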

Generating Random ASCII

I've been trying to work on a very simple encryption routine. It should work like this:
-- Generate a random key of ASCII characters (just a permutation of the ASCII table).
-- For every char in the file to be encrypted, get its decimal representation (X), then replace it with the char at index X in the key.
The problem is that it corrupts some files and I have no idea why.
Any help would be appreciated.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
int main()
{
int temp,used[256];
char *key,*mFile;
long i,fSize;
memset(used,0,sizeof(used));
srand(time(NULL));
FILE *pInput = fopen("Input.in","rb");
FILE *pOutput = fopen("Encrypted.out","wb");
FILE *pKeyOutput = fopen("Key.bin","wb");
if(pInput==NULL||pOutput==NULL||pKeyOutput==NULL)
{
printf("File I/O Error\n");
return 1;
}
key = (char*)malloc(255);
for(i=0;i<256;i++)
{
temp = rand()%256;
while(used[temp])
temp = rand()%256;
key[i] = temp;
used[temp] = 1;
}
fwrite(key,1,255,pKeyOutput);
fseek(pInput,0,SEEK_END);
fSize = ftell(pInput);
rewind(pInput);
mFile = (char*)malloc(fSize);
fread(mFile,1,fSize,pInput);
for(i=0;i<fSize;i++)
{
temp = mFile[i];
fputc(key[temp],pOutput);
}
fclose(pInput);
fclose(pOutput);
fclose(pKeyOutput);
free(mFile);
free(key);
return 0;
}
The Decryption Routine :
#include <stdio.h>
#include <stdlib.h>
int main()
{
int temp,j;
char *key,*mFile;
long i,fSize;
FILE *pKeyInput = fopen("key.bin","rb");
FILE *pInput = fopen("Encrypted.out","rb");
FILE *pOutput = fopen("Decrypted.out","wb");
if(pInput==NULL||pOutput==NULL||pKeyInput==NULL)
{
printf("File I/O Error\n");
return 1;
}
key = (char*)malloc(255);
fread(key,1,255,pKeyInput);
fseek(pInput,0,SEEK_END);
fSize = ftell(pInput);
rewind(pInput);
mFile = (char*)malloc(fSize);
fread(mFile,1,fSize,pInput);
for(i=0;i<fSize;i++)
{
temp = mFile[i];
for(j=0;j<256;j++)
{
if(key[j]==temp)
fputc(j,pOutput);
}
}
fclose(pInput);
fclose(pOutput);
fclose(pKeyInput);
free(mFile);
free(key);
return 0;
}

Make sure you use unsigned char; if char is signed, things will go wrong when you process characters in the range 0x80..0xFF. Specifically, you'll be accessing negative indexes in your 'mapping table'.
Of course, strictly speaking, ASCII is a 7-bit code set and any character outside the range 0x00..0x7F is not ASCII.
You only allocate 255 bytes but you then proceed to overwrite one byte beyond what you allocate. This is a basic buffer overflow; you invoke undefined behaviour (which means anything may happen, including the possibility that it seems to work correctly without causing trouble - on some machines).
Another problem is that you write mappings for 255 of the 256 possible byte codes, which is puzzling. What happens with the other byte value?
Of course, since you write the 256-byte mapping to the 'encrypted' file, it will be child's play to decode; the security in this scheme is negligible. However, as a programming exercise, it still has some merit.
There is no reason to slurp the entire file and then write it out byte by byte. You can perfectly well read it byte by byte as well as write it byte by byte. Or you could slurp the whole file, map it in situ, and then write the whole file in one go. Consistency is important in programming.
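Putting these points together, a corrected core might look like the sketch below: the table holds all 256 entries, every byte is handled as unsigned char, and decryption uses a precomputed inverse table instead of a linear search. The function names are invented, and a Fisher-Yates shuffle is swapped in for the question's retry loop to build the permutation; rand() is kept only because the question uses it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Fill key with a random permutation of 0..255 (Fisher-Yates shuffle). */
void make_key(unsigned char key[256])
{
    for (int i = 0; i < 256; i++)
        key[i] = (unsigned char)i;
    for (int i = 255; i > 0; i--) {
        int j = rand() % (i + 1);
        unsigned char tmp = key[i];
        key[i] = key[j];
        key[j] = tmp;
    }
}

/* Build the inverse mapping so that inv[key[x]] == x for every byte x. */
void invert_key(const unsigned char key[256], unsigned char inv[256])
{
    for (int i = 0; i < 256; i++)
        inv[key[i]] = (unsigned char)i;
}

/* Map a buffer in place through a 256-entry table.
   unsigned char indices are always in 0..255, so there are no negative indexes. */
void substitute(unsigned char *buf, size_t len, const unsigned char table[256])
{
    assert(buf != NULL || len == 0);
    for (size_t k = 0; k < len; k++)
        buf[k] = table[buf[k]];
}
```

Encrypting is substitute(buf, len, key) and decrypting is substitute(buf, len, inv); the key file then needs all 256 bytes written and read back.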
