I'm trying to calculate the entropy of a .exe file by giving it as an input. However, I'm getting a zero value instead of an answer.
The entropy of a file can be understood as the negated sum of p_i*log(p_i) over every character value i that occurs in the file, where p_i is the fraction of the file made up of that character. However, I always end up with 0. The '.exe' file for sure has an output.
Below is my code.
#include <stdio.h>
#include <stdlib.h>
#include "stdbool.h"
#include <string.h>
#include <conio.h>
#include <math.h>
#define MAXLEN 100
int makehist( char *S, int *hist, int len) {
    int wherechar[256];
    int i,histlen;
    histlen=0;
    for (i=0;i<256;i++)
        wherechar[i]=-1;
    for (i=0;i<len;i++) {
        if (wherechar[(int)S[i]]==-1) {
            wherechar[(int)S[i]]=histlen;
            histlen++;
        }
        hist[wherechar[(int)S[i]]]++;
    }
    return histlen;
}
double entropy(int *hist, int histlen, int len) {
    int i;
    double H;
    H=0;
    for (i=0;i<histlen;i++) {
        H-=(double)hist[i]/len*log((double)hist[i]/len);
    }
    return H;
}
void main() {
    char S[100];
    int len,*hist,histlen;
    int num;
    double H;
    int i=0;
    int count =0;
    FILE*file = fopen("freq.exe","r");
    while (fscanf(file,"%d",&num)>0)
    {
        S[i]=num;
        printf("%d",S[i]);
        i++;
    }
    hist=(int*)calloc(i,sizeof(int));
    histlen=makehist(S,hist,i);
    H=entropy(hist,histlen,i);
    printf("%lf\n",H);
    getch();
}
while (fscanf(file,"%d",&num)>0)
This reads numbers written as text: optional leading white space, an optional sign, and a sequence of decimal digits. As soon as any other character is encountered in your file (probably at the first byte), fscanf returns 0 and your loop stops. You need to read raw bytes, with getc or fread.
Also, please consider doing the most basic debugging before submitting a question to StackOverflow. Surely your printf in that loop never printed anything, yet you don't mention this in your question and apparently didn't investigate why.
Some other issues:
#define MAXLEN 100
This is never used.
void main()
This is not a valid definition of main. Use
int main(void)
char S[100];
You have undefined behavior if the input contains more than 100 chars, and a .exe file surely will. You really should be feeding the bytes into your histogram calculation as you read them, rather than storing them in a buffer. Easiest is to make wherechar and histlen globals, but you could also put everything you need into a struct and pass a pointer to the struct, together with each byte, to makehist, and again pass a pointer to the struct to entropy.
FILE*file = fopen("freq.exe","r");
Binary files must be opened with "rb" (it doesn't matter on Linux, but it does on Windows).
Also, you should check whether fopen succeeds.
hist=(int*)calloc(i,sizeof(int));
hist should have 256 elements. If you allocate this first, then you can process each byte as it is read per above.
You do a divide by zero in entropy if the file is empty ... you should check for len == 0.
wherechar[(int)S[i]] is undefined behavior if the file has chars with negative values, as it surely will. You should use unsigned char instead of char, and then the casts aren't necessary.
This line seems to be reading numbers:
fscanf(file,"%d",&num)
But I don't really expect to find many numbers in an EXE file.
They'd be random byte-values of all different types.
Numbers are only the digits 0-9 (and - & + signs as well).
Quite new to C so apologies for the long read and no doubt rubbish programming.
So my program is reading in a CSV file, containing the specification of 15 different transistors. It then compares an input to the data from the CSV file. After the data is input, the program determines which transistors are most suitable. This is meant to loop until the letter q is entered, which is when the program is meant to stop.
The input is entered as eg.
10 0.05 120 5000 20
for volt, amp, power, freq and gain respectively. The program doesn't care about type, company or price.
#include <stdio.h>
#include <string.h>
#include <math.h>
#include <stdlib.h>
#define MAX 15
#define MAX_IN 10000
struct data{
    char type[10], company[10];
    int volt, power, freq, gain;
    float amp, price;
};
struct input{
    int volt, power, freq, gain;
    float amp;
};
struct input inputs[MAX_IN];
struct data datas[MAX];
main()
{
    int chk = 1, num=0;
    while(chk == 1)
    {
        chk = input_function(&num);
    }
}
int input_function(int *number)
{
    int j=0;
    char check=0;
    scanf("%c%d %f %d %d %d", &check, &inputs[j].volt, &inputs[j].amp, inputs[j].power, &inputs[j].freq, &inputs[j].gain);
    if(check!='q')
    {
        check_function(j);
        ++*number;
        ++j;
        return 1;
    }
    else
    {
        return 0;
    }
}
num is an integer used to determine how many inputs I have put in so that the check function compares the correct input for the input structure.
In a previous, similar problem, using %c%d let it check for both a character and an integer at the same time, but this isn't working here.
Previously I had it almost working, but it took q as the first input to voltage - so every input after was off by one. Anyone have any ideas?
scanf, fscanf, and sscanf are broken-as-specified and shouldn't ever be used for anything. Forget you ever heard of them.
Use fgets (or getline if available) to read the entire line. Examine its first character manually. If it is q, exit. Otherwise, use a chain of calls to strtol and strtod to parse the line and convert text to machine numbers. (Pay close attention to the part of the strtol/strtod manpages where they explain how to check for errors; it's a little tricky.)
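As a rough sketch of that approach, the helpers below parse one already-read line with strtol and strtod. The struct mirrors the question's struct input; next_int, next_float and parse_line are names invented for this example, and trailing junk after the fifth value is deliberately not checked.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

struct input {
    int volt, power, freq, gain;
    float amp;
};

/* Parse one int at *p and advance *p past it. Returns 1 on success, 0 on failure. */
static int next_int(const char **p, int *out)
{
    char *end;

    errno = 0;
    long v = strtol(*p, &end, 10);
    if (end == *p)                       /* no digits were consumed */
        return 0;
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return 0;                        /* value does not fit in an int */
    *out = (int)v;
    *p = end;
    return 1;
}

/* Same idea for a floating-point field, using strtod. */
static int next_float(const char **p, float *out)
{
    char *end;

    errno = 0;
    double v = strtod(*p, &end);
    if (end == *p || errno == ERANGE)
        return 0;
    *out = (float)v;
    *p = end;
    return 1;
}

/* Returns 1 if line held volt, amp, power, freq and gain in that order,
   0 if it starts with 'q', and -1 if it could not be parsed. */
int parse_line(const char *line, struct input *in)
{
    const char *p = line;

    if (line[0] == 'q')
        return 0;
    if (next_int(&p, &in->volt) && next_float(&p, &in->amp) &&
        next_int(&p, &in->power) && next_int(&p, &in->freq) &&
        next_int(&p, &in->gain))
        return 1;
    return -1;
}
```

The caller would read each line with fgets(buf, sizeof buf, stdin) and stop the loop when parse_line returns 0; because the whole line is consumed each time, a stray q can no longer shift the following values by one.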
I'm relatively new to parsing, and here is an example that is simple but, for me, mind-bending.
I have a text file containing a series of numbers and letters. Each line of the text has three elements: a letter, another letter, and a number. Consider the first as the source, the second as the destination and the number as the size. I want to read them into an array of structures and be able to arrange them according to size. The first line is "a, b, 1", the second is "q, s, 5", and so on. Lastly, I want to print them in sorted order (that is, ordered by size).
Mind giving me some clues or starting points?
Update:
#include <stdio.h>
#include <conio.h>
#include <stdlib.h>
int main(){
    FILE *fp;
    fp= fopen("file.txt", "O");
    int i;
    struct arrangement{
        char source;
        char dest;
        int cost;
    };
    struct arrangement rng[22];
    for(i=0; i<22 ; i++){
        fscanf(fp, "%c, %c, %d", rng[i].source, rng[i].dest, rng[i].cost);
        printf("%c, %c, %d", rng[i].source, rng[i].dest, rng[i].cost);
    }
    getch();
    return 0;
}
Will this be able to store all elements in the array? I still don't have any idea how I will arrange these according to size/cost without the source and destination being left out.
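fscanf requires pointers to the variables, not the variables themselves (&rng[i].source, and so on). Passing the values is undefined behavior; the outcome depends on the compiler (gcc may emit a warning) and the platform. Note also that "O" is not a valid fopen mode; use "r" for reading.
You should also break out of the loop when fscanf reports EOF or fails to match all three fields. i then tells you how many entries were filled completely; the entry at index i may be partially filled, depending on the input.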
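A possible shape for the corrected reading loop, plus the sort the question asks about, is sketched below. It assumes one "a, b, 1" record per line; parse_arrangement, by_cost and read_arrangements are illustrative names, not anything from the question.

```c
#include <stdio.h>
#include <stdlib.h>

struct arrangement {
    char source;
    char dest;
    int cost;
};

/* Parse one "a, b, 1" line. Returns 1 if all three fields were read, 0 otherwise. */
int parse_arrangement(const char *line, struct arrangement *r)
{
    /* Note the & on every argument; spaces in the format skip any white space. */
    return sscanf(line, " %c , %c , %d", &r->source, &r->dest, &r->cost) == 3;
}

/* qsort comparator: ascending cost. qsort moves whole structs,
   so source and dest travel together with their cost. */
int by_cost(const void *a, const void *b)
{
    const struct arrangement *x = a;
    const struct arrangement *y = b;

    return (x->cost > y->cost) - (x->cost < y->cost);
}

/* Read up to max records from fp, stopping at EOF or at the first
   malformed line. Returns the number of complete records stored. */
size_t read_arrangements(FILE *fp, struct arrangement *rng, size_t max)
{
    char line[64];
    size_t n = 0;

    while (n < max && fgets(line, sizeof line, fp) != NULL) {
        if (!parse_arrangement(line, &rng[n]))
            break;
        n++;
    }
    return n;
}
```

After n = read_arrangements(fp, rng, 22), a call to qsort(rng, n, sizeof rng[0], by_cost) orders the records by cost, and a plain loop over the first n entries prints them.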
I am writing a program that takes in exactly 5 people's last names and their votes. It should display the names of the people entered, the corresponding votes, and also the winner.
I need the names of the people in one array of strings. That is where the program crashes. I'm not sure if I can modify this to make it work or if I need to redo it.
The malloc function seems to be a recurring fix for this type of problem?
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int TotalVotes(int voteArray[],int size)
{
    int Sum=0;
    for (int i=0;i<size;i++)
    {
        Sum+=voteArray[i];
    }
    return Sum;
}
int Winner(int voteArray[],int size)
{
    int max;
    max=0;
    if (voteArray[1]>voteArray[max])
        max=1;
    if (voteArray[2]>voteArray[max])
        max=2;
    if (voteArray[3]>voteArray[max])
        max=3;
    if (voteArray[4]>voteArray[max])
        max=4;
    return max;
}
void main()
{
    char nameArray[5];
    int voteArray[5],Total,winner;
    for (int i=0;i<5;i++)
    {
        voteArray[i]=0;
    }
    for (int j=0;j<5;j++)
    {
        printf("Enter the name of the candidate number %d\n",j+1);
        scanf("%s",nameArray[j]);
        printf("Enter that persons number of votes\n");
        scanf("%d",&voteArray[j]);
    }
    Total=TotalVotes(voteArray,5);
    winner=Winner(voteArray,5);
    printf("%s\t%s\t%s\n","Candidate","Votes Received","% of Total Votes");
    for (int y=0;y<5;y++)
    {
        printf("%s\t%d\t%0.2f\n",nameArray[y],voteArray[y],(float)voteArray[y]/Total);
    }
    printf("The Winner of The Election is %s\n",nameArray[winner]);
}
char nameArray[5]; declares only five single characters. It should be something like char nameArray[5][20];, which gives each of the five names room for up to 19 characters plus the terminating null.
In C, strings are represented by character arrays (char* or char[]).
Also, you should use fgets instead of scanf for strings, for two reasons:
fgets helps prevent buffer overflow because it is told the size of the buffer in advance.
fgets consumes the whole line, newline included (as long as the buffer is large enough), so it does not leave characters in the input buffer the way scanf does.
The prototype for fgets looks like this (you can pass stdin as the FILE pointer to read from the keyboard, but note that fgets keeps the newline in the buffer):
char *fgets( char *output_variable, int string_length, FILE *input_file );
Also, if you use scanf, you should do a lot more error checking for invalid input.
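A minimal sketch of reading one name with fgets follows. NAME_LEN, strip_newline and read_name are hypothetical names chosen here, with NAME_LEN matching the 20 in nameArray[5][20].

```c
#include <stdio.h>
#include <string.h>

#define NAME_LEN 20   /* assumed maximum, matching char nameArray[5][20] */

/* Cut the string at the first newline, if there is one. */
void strip_newline(char *s)
{
    s[strcspn(s, "\n")] = '\0';
}

/* Read one line from in into name and drop the newline that fgets keeps.
   Returns 1 on success, 0 on EOF or read error. */
int read_name(FILE *in, char name[NAME_LEN])
{
    if (fgets(name, NAME_LEN, in) == NULL)
        return 0;
    strip_newline(name);
    return 1;
}
```

In the question's loop, read_name(stdin, nameArray[j]) would take the place of scanf("%s",nameArray[j]). If the vote count is still read with scanf("%d", ...), the newline it leaves behind has to be consumed before the next name is read.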
I am attempting to read the size values from the header of a .pgm image file (mars.pgm), and assign the resulting values to the integer variables u and v using sscanf.
When executed the program prints P5 832 700 127 in the first line, which is correct (the 832 and 700 are the size values that I want to pick out).
In the second line that is meant to print u and v variables two very large numbers are printed, instead of the 832 and 700 values.
I cannot figure out why this is not working as desired. When using the small test program (located at the bottom of the post) sscanf picks out the values from a string like I expected it to.
#include<stdio.h>
#include <string.h>
int main()
{
    FILE *fin;
    fin= fopen ("mars.pgm","r+");
    if (fin == NULL)
    {
        printf ("ERROR");
        fclose(fin);
    }
    int u,v,i,d,c;
    char test[20];
    for (i=0; i<=20; i++)
    {
        test[i]=getc(fin);
    }
    sscanf(test,"%d,%d,%d,%d",&c,&u,&v,&d);
    printf("%s\n",test);
    printf("%d %d",u, v);
    fclose(fin);
}
Small test program:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
void main(void)
{
    int a;
    char s[3];
    s[0]='1';
    s[1]=' ';
    s[2]='2';
    sscanf(s,"%d",&a);
    printf("%d",a);
}
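First of all, I advise you to run a small test: initialize your variables to 0, for instance, and check what values they hold after the read.
Then, try removing the , characters from your format string and check whether it works.
The behavior you see happens because sscanf() and its relatives match the format literally and stop at the first mismatch: if the format contains commas and the data does not, everything after the first comma is left unassigned, so u and v still hold whatever garbage they started with. Note also that the header begins with P5, which %d cannot match, so the P5 has to be matched literally or skipped before the numbers are read.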
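For illustration, a header such as the "P5 832 700 127" printed in the question could be parsed as sketched below. parse_pgm_header is a made-up name, and the sketch assumes the header contains no '#' comment lines, which the PGM format otherwise allows.

```c
#include <stdio.h>

/* Parse a binary-PGM header held in a null-terminated string.
   "P5" is matched literally; each space in the format matches any run of
   white space, including newlines. Returns 1 if all three numbers were read. */
int parse_pgm_header(const char *hdr, int *width, int *height, int *maxval)
{
    return sscanf(hdr, "P5 %d %d %d", width, height, maxval) == 3;
}
```

Checking the return value is what reveals a failed match; the buffer passed in must also be null-terminated, which the test array in the question is not.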
So I am very much a beginner at C programming (I have used Ruby, Python and Haskell before) and I am having trouble getting the most simple thing to work in C (probably because of all the manual memory stuff). Anyway, what I am trying to do is (using simple constructs) make a script that just echoes what the user inputs to the console.
e.g. user inputs hi, console prints hi.
This is what I came up with.
Also, I haven't really mastered pointers, so none of that.
// echo C script
int echo();
int main() {
    echo();
    return 0;
}
int echo() {
    char input[500];
    while (1) {
        if (scanf("%[^\n]", input) > 0) {
            printf("%s\n", input);
        }
        input[0] = 0;
    }
    return 1;
}
I realize that there is a bunch of bad practices here, like setting a giant string array, but that is just for simplifying it.
Anyway, my problem is that it repeats the first input then the input freezes. As far as I can tell, it freezes during the while loop (1 is never returned).
Any help would be appreciated.
Oh, and using TCC as the compiler.
You don't need an array for echo:
#include <stdio.h>
int main(void)
{
    int c;
    while((c = getchar()) != EOF) putchar(c);
    return 0;
}
It's fine to have such a large string allocated, as long as it's possible for users to input a string of that length. What I would use for input is fgets (see its documentation for more information). Proper usage in your situation, given that you still would like to use the string of size 500, would be:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int echo(){
    char input[500];
    while(fgets(input, sizeof input, stdin)){ //read a line from stdin (the keyboard by default)
        printf("%s", input); //fgets keeps the newline, so none is added here
        memset(input, 0, sizeof input); //reset string to all 0's
    }
    return 1;
}
Note that changing the value of 500 to whatever smaller number (I would normally go with some power of 2 by convention, like 512, but it doesn't really matter) will limit the length of the user's input to that number. Also note that I didn't test my code but it should work.
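scanf("%[^\n]", input)
Should be:
scanf("%s",input)
Note that %s skips leading white space and stops at the next white space, so it reads one word at a time rather than a whole line.
Then after your if you should do:
memset(input,0,500);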
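The reason the original loop gets stuck can be shown with sscanf on a string that begins with a newline, which is what stdin looks like after the first line has been read: %[^\n] stops before the newline and never consumes it, so every later call fails to match and returns 0 immediately. The two function names below are invented for this demonstration, and the 499 width is added only to keep the sketch within the 500-byte buffer.

```c
#include <stdio.h>

/* The question's format (with a width added): does not skip a leading newline. */
int scan_line_no_skip(const char *src, char out[500])
{
    return sscanf(src, "%499[^\n]", out);
}

/* Same format with a leading space, which skips white space first. */
int scan_line_skip_ws(const char *src, char out[500])
{
    return sscanf(src, " %499[^\n]", out);
}
```

scan_line_no_skip("\nhi\n", buf) returns 0 and leaves buf untouched, whereas scan_line_skip_ws("\nhi\n", buf) returns 1 and stores "hi". Writing the question's call as scanf(" %[^\n]", input) would therefore be another way to keep the loop moving while still reading whole lines.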
There are many ways of accomplishing this task; however, the easiest would be to read from stdin one byte at a time and write each byte to stdout as you go.
Snippet:
#include <stdio.h>
int main( void ) {
    // Iterates until EOF is sent.
    for ( int byte = getchar(); byte != EOF; byte = getchar() ) {
        // Outputs to stdout the byte.
        putchar( byte );
    }
    return 0;
}
Remark:
You must store the byte read from stdin in an int. getchar returns an int so that EOF can be distinguished from every valid byte value; a char cannot hold that extra value, and whether plain char is signed or unsigned is not guaranteed (C has three distinct character types: char, signed char and unsigned char). Include <limits.h> and check CHAR_MIN to determine whether char is signed in your environment.
You must compile as C99 or later; otherwise, move the declaration of byte outside of the for loop.
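The remark about char signedness can be checked with a couple of lines, shown here next to a stream-copying variant of the loop. plain_char_is_signed and copy_stream are names made up for this sketch.

```c
#include <limits.h>
#include <stdio.h>

/* 1 if plain char is signed on this platform, 0 if it is unsigned. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Copy in to out one byte at a time. Returns the number of bytes copied. */
long copy_stream(FILE *in, FILE *out)
{
    long n = 0;
    int byte;   /* int, not char: it must hold every byte value plus EOF */

    while ((byte = getc(in)) != EOF) {
        putc(byte, out);
        n++;
    }
    return n;
}
```

Calling copy_stream(stdin, stdout) from main gives the same echo behavior as the snippet above.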