Does a quine print the ACTUAL code of the program (i.e. not obfuscated), or does it print the obfuscated program?
I don't think obfuscation has anything to do with it. Usually a quine prints the actual source code of the program itself.
Suppose that you had a C program which prints an "obfuscated" or otherwise cosmetically modified version of its source. For example, suppose there's a difference in whitespace or variable names.
Then that program would not be a quine, since by definition a quine is a program which prints itself, and by "itself" we mean the exact same sequence of bytes. But the output of that program, once compiled, would print the same thing as the original program (since it's just a cosmetic variant), i.e. itself. So the output is a quine.
This sometimes eases the process of writing a quine: write a "nearly-quine" that perhaps doesn't get the formatting exactly right, run it once, and the output is your actual quine.
This is all assuming a quine in C. A quine in x86 machine code would have to output not its C source, but the same sequence of bytes that makes up the .exe file.
I'm not sure what you mean by "ACTUAL code" as opposed to "obfuscated code", but to test whether something is a quine or not, you have to decide what language it's supposed to be a quine in. Maybe by deciding that you can answer your own question: do you just want a quine in C, or a quine that has something to do with your obfuscator?
Here is an actual quine in standard C, found on Wikipedia:
main() { char *s="main() { char *s=%c%s%c; printf(s,34,s,34); }"; printf(s,34,s,34); }
You will notice that its structure is relatively straightforward. It uses a string constant containing the text of the program as both the format string and one of the values to be formatted by printf().
When compiled and run, it prints exactly that single line of code.
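To see the trick in isolation: 34 is the ASCII code for the double-quote character, so the two %c conversions wrap the %s substitution in quotes, which is what lets the string reproduce its own quoted form. A small (non-quine) sketch of just that mechanism:

#include <stdio.h>

int main(void)
{
    /* Not a quine: just shows how the %c conversions with the value 34
       (ASCII for the double-quote character) wrap the %s substitution
       in quotes.  This prints:  char *s="char *s=%c%s%c;";            */
    const char *s = "char *s=%c%s%c;";
    printf(s, 34, s, 34);
    printf("\n");
    return 0;
}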
There are examples of quines in a variety of languages, including several more in C, in the Wikipedia article.
Following is a simple "quine". This source code needs to be saved as "quine_file.c", then compiled and executed.
Here a file pointer is used to read the source file line by line and print it to stdout.
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    FILE *fp = NULL;
    char *line = NULL;
    size_t len = 0;      /* getline() expects a size_t buffer length */
    ssize_t read;        /* getline() returns ssize_t (POSIX 2008)   */

    fp = fopen("quine_file.c", "r");
    if (fp == NULL)
        return EXIT_FAILURE;

    while ((read = getline(&line, &len, fp)) != -1)
    {
        printf("%s", line);
    }
    fclose(fp);
    if (line)
        free(line);
    return EXIT_SUCCESS;
}
A quine is a program which prints its own listing. This means that when the program is run, it must print out precisely those instructions which the programmer wrote as part of the program (including, of course, the instructions that do the printing, and the data used in the printing).
- David Madore
JavaScript example
$=_=>`$=${$};$()`;$();
When executed, this program will display the following string:
"$=_=>`$=${$};$()`;$();"
Featured in the "Art of Code" video at 30m21s
- Dylan Beattie
#include <stdio.h>

int main(void) {
    FILE *fp = fopen("printItself.c", "r");
    int c;
    if (fp == NULL)                 /* bail out if the source file isn't found */
        return 1;
    while ((c = getc(fp)) != EOF)
        putc(c, stdout);
    fclose(fp);
    return 0;
}
Save it in a file named printItself.c. The problem with the previous example is that if I add a line to the program, e.g. int x;, I also have to add it to the string, taking care of newlines and spaces etc., but in this example you can add whatever you want.
Related
I am trying to write a program to count the number of lines in a file by using the following code:
#include <stdio.h>

int main(int argc, char *argv[])
{
    if (argc != 2)
    {
        printf("invalid no. of arguments\nUsage:lc <FILENAME>\n");
        return 1;
    }
    FILE *in = fopen(argv[1], "r");
    char c;
    int count = 0;
    fscanf(in, "%c", &c);      //take the first character
    while (c != EOF)
    {
        if (c == '\n')
            count++;
        fscanf(in, "%c", &c);
    }
    fclose(in);
    printf("TOTAL LINES:%d\n", count);
    return 0;
}
I am aware that this can be achieved by using fgetc(), but I am curious to know why fscanf() doesn't work here.
All help is highly appreciated.
You forgot to test the result count of fscanf. In particular, your program is completely wrong when the processed file is an empty one.
while(c!=EOF)
That can never happen after a fscanf(in,"%c",&c). EOF is not a valid character: by definition it lies outside the set of values that %c will ever store into a char. For instance, on your system char could be an unsigned 8-bit type while EOF is -1 (EOF is documented -in §7.21.1 of the C11 standard- to be some negative int, not some char), in which case c != EOF is always true. In other words, the type of EOF is not char. And that is indeed what happens on my Linux/Debian/x86-64 system.
Read the documentation related to <stdio.h> much more carefully.
In general, read the documentation of every function that you are using. And learn How to debug small programs.
this can be achieved by using fgetc()
Yes, and in your case you would be better off using fgetc. Again, read its documentation. fgetc is very likely to be much faster than fscanf, since it is lower-level; most implementations of fscanf are built on top of fgetc (or its equivalent).
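For illustration, here is a minimal sketch of the counting loop rebuilt around fgetc, keeping the usage from the question (lc <FILENAME>) and checking the things mentioned above:

#include <stdio.h>

int main(int argc, char *argv[])
{
    if (argc != 2)
    {
        printf("invalid no. of arguments\nUsage:lc <FILENAME>\n");
        return 1;
    }
    FILE *in = fopen(argv[1], "r");
    if (in == NULL)                 /* always check that fopen succeeded */
    {
        perror(argv[1]);
        return 1;
    }
    int c;                          /* int, not char, so EOF is representable */
    int count = 0;
    while ((c = fgetc(in)) != EOF)
    {
        if (c == '\n')
            count++;
    }
    fclose(in);
    printf("TOTAL LINES:%d\n", count);
    return 0;
}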
Finally, some C standard library implementations are free software. On Linux, both GNU glibc and musl-libc are. You could study their source code; you'll learn a lot by doing so. Both are built on top of Linux syscalls(2).
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    FILE *fp1 = fopen(argv[0], "r");
    FILE *fp2;

    fseek(fp1, 0L, SEEK_END);
    long int file_size = ftell(fp1);
    printf("file size is : %ld\n", file_size);
    rewind(fp1);

    int cnt = 0;
    fp2 = fopen("newImage.jpg", "w");
    printf("file pointer is at %ld ", ftell(fp1));
    while (cnt < file_size) {
        fputc(fgetc(fp1), fp2);
        cnt++;
    }
    fclose(fp1);
    fclose(fp2);
    return EXIT_SUCCESS;
}
As you discovered from the comments, argv[0] is what burned you, technically. I wanted to offer some advice that would have saved you the trouble of posting to SO.
For anything posted to a public forum, it's advisable to format the code "correctly", as an aid to anyone who'll take the time to answer you. The better they understand what you're doing, and the more care you've taken with your question, the better answers you'll get.
For simple programs like this, it's useful to recapitulate the input. When you print the file size, print the name, too. You would have seen immediately you were copying the wrong file.
For any program, especially in C, and very especially where there's I/O involved, it's essential to check return codes for syscalls. assert(3) and err(3) are your friends here. Notably, if you don't check for EOF from fgetc, you stand to write -1 to every byte of the output file, with nary a message to the user. (Yes, you have every right to expect that if you can open and seek on a file, you'll be able to read from it. OTOH, a tiny mistake like reading from fp2 could go unnoticed.)
Strange as it may seem, careful error checking encourages writing minimal code. In your case, you really don't care how big the file is. You just want to read all the input, until EOF. If you do that, you naturally adopt the idiom while( (ch = fgetc(fp1)) != EOF ) and save yourself a call to find out how big the file is.
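To make that concrete, here is a minimal sketch of such a copy loop with the error checking described above, using err(3)/errx(3) (BSD/glibc extensions, not ISO C) and taking both file names from the command line rather than hard-coding them:

#include <stdio.h>
#include <err.h>     /* err(3)/errx(3): BSD/glibc extensions, not ISO C */

int main(int argc, char **argv)
{
    if (argc != 3)
        errx(1, "usage: %s input output", argv[0]);

    FILE *in = fopen(argv[1], "r");
    if (in == NULL)
        err(1, "%s", argv[1]);

    FILE *out = fopen(argv[2], "w");
    if (out == NULL)
        err(1, "%s", argv[2]);

    int ch;
    while ((ch = fgetc(in)) != EOF)   /* no need to know the size up front */
        if (fputc(ch, out) == EOF)
            err(1, "write error on %s", argv[2]);

    fclose(in);
    if (fclose(out) != 0)
        err(1, "close error on %s", argv[2]);
    return 0;
}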
Get to know getopt(3) and use it. That would have saved you stumbling on argv[0].
If you follow that advice, you may find as I have that more often than not the question answers itself. C is a simple language, and the C standard library is well documented. If you're careful about your inputs, fastidious in formatting, and relentless in checking for errors, the only mistakes left will be in your logic. Which is how you want it. :-)
I'm brand new to C and trying to learn how to read a file. My file is a simple file (just for testing) which contains the following:
this file
has been
successfully read
by C!
So I read the file using the following C code:
#include <stdio.h>

int main() {
    char str[100];
    FILE *file = fopen("/myFile/path/test.txt", "r");
    if (file == NULL) {
        puts("This file does not exist!");
        return -1;
    }
    while (fgets(str, 100, file) != '\0') {
        puts(str);
    }
    fclose(file);
    return 0;
}
This prints my text like this:

this file

has been

successfully read

by C!
When I compile and run it and pipe its output to hexdump -C, I can see an extra 0a at the end of each line.
Finally, why do I need to declare an array of chars to read from a file? What if I don't know how much data is on each line?
fgets() reads up to the newline and keeps the newline in the string and puts() always adds a newline to the string it is given to print. Hence you get double-spaced output when used as in your code.
Use fputs(str, stdout) instead of puts(); it does not add a newline.
The obsolete function gets() — removed from the 2011 version of the C standard — read up to the newline but removed it. The gets() and puts() pair worked well together, as do fgets() and fputs(). However, you should certainly NOT use gets(); it is a catastrophe waiting to happen. (The first internet worm in 1988 used gets() to migrate — Google search for 'morris internet worm').
In comments, inquisitor asked:
Why does the line need to be read into a char array of a specific size?
Because you need to make sure you don't overrun the space that is available. C does not do automatic allocation of space for strings. That is one of its weaknesses from some viewpoints; it is also a strength, but it routinely confuses newcomers to the language. If you want the input code to allocate enough space for a line, use the POSIX function getline().
So is it better to just read and output until I hit a '\0' since I won't always know the amount of chars on a given line?
No. In general, you won't hit '\0'; most text files do not contain any of those. If you don't want to allocate enough space for a line, then use:
int c;
while ((c = getchar()) != EOF)
putchar(c);
which reads one character at a time in the user code, but the underlying standard I/O packages buffer the input up so it isn't too costly — it is perfectly feasible to implement a program that way. If you need to work on lines, either allocate enough space for lines (I use char buffer[4096]; routinely) or use getline().
And Charlie Burns asked in a comment:
Why don't we see getline() suggested more often?
I think it is not mentioned all that often because getline() is relatively new, and not necessarily available everywhere yet. It was added to POSIX 2008; it is available on Linux and BSD. I'm not sure about the other mainline Unix variants (AIX, HP-UX, Solaris). It isn't hard to write for yourself (I've done it), but it is a nuisance if you need to write portable code (especially if 'portable' includes 'Microsoft'). One of its merits is that it tells you how long the line it read actually was.
Example using getline()
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    char *line = 0;
    size_t length = 0;
    char const name[] = "/myFile/path/test.txt";
    FILE *file = fopen(name, "r");

    if (file == NULL)
    {
        fprintf(stderr, "%s: failed to open file %s\n", argv[0], name);
        return -1;
    }
    while (getline(&line, &length, file) > 0)
        fputs(line, stdout);
    free(line);
    fclose(file);
    return 0;
}
fgets saves the newline character at the end of the line when reading line by line. This lets you determine whether a complete line was actually read or whether your buffer was too small.
puts always adds a newline when printing.
Either trim off the newline that fgets kept, or use printf:
printf("%s", str);
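For completeness, here is a sketch of the trimming approach applied to the loop from the question (same hard-coded path), using strcspn to cut the string at the newline:

#include <stdio.h>
#include <string.h>

int main(void) {
    char str[100];
    FILE *file = fopen("/myFile/path/test.txt", "r");
    if (file == NULL) {
        puts("This file does not exist!");
        return -1;
    }
    while (fgets(str, sizeof str, file) != NULL) {
        str[strcspn(str, "\n")] = '\0';   /* trim the newline kept by fgets */
        puts(str);                        /* puts() adds exactly one newline back */
    }
    fclose(file);
    return 0;
}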
I'm trying to use "echo FileName.txt | a.c" in the terminal and read the data from the file into an array declared in a header file, but the code I have so far just gives me an infinite loop. I tried storing the data in a local array as well, but I get the same result.
#include <stdio.h>

main(int argc, char *argv[]) {
    extern char test[];
    FILE *fp;
    int i = 0;
    int c;                    // was char c; originally
    if (argc == 1) {
        fp = stdin;
    } else {
        fp = fopen(argv[1], "r");
    }
    while ((c = getc(fp)) != EOF) {
        test[i] = c;
        printf("%c", test[i]);
        i++;
    }
}
(1) Change variable c to an int so it recognizes EOF.
(2) Don't increment i before your printf or you will be printing junk.
Not sure what you are trying to accomplish with the echo thing.
NOTE: I'm assuming that your program is intended to do what the code in your question actually does, namely read input from a file named by a command-line argument, or from stdin if no command-line argument is given. That's a very common way for programs to operate, particularly on Unix-like systems. Your question and the way you invoke the program suggest that you're doing something quite different, namely reading a file name from standard input. That's an unusual thing to do. If that's really your intent, please update your question to clarify just what you're trying to do.
Given this assumption, the problem isn't in your program, it's in how you're invoking it. This:
echo FileName.txt | a.c
will feed the string FileName.txt as input to your program; the program has no idea (and should have no idea) that it's a file name.
The way to pass a file name to your program is:
a.c FileName.txt
Or, if you want to read the contents of the file from stdin:
a.c < FileName.txt
(In the latter case, the shell will take care of opening the file for you.)
You could read a file name from stdin, but it's rarely the right thing to do.
A few other points (some already pointed out in comments):
An executable file probably shouldn't have a name ending in .c; the .c suffix indicates a C source file.
You should declare main with an explicit return type; as of C99, the "implicit int" rule was removed, and it was never a particularly good idea anyway:
int main(int argc, char *argv[]) { /* ... */ }
You're reading the entire contents of the input file into your test array. This is rarely necessary; in particular, if you're just copying an input file to stdout, you don't need to store more than one character at a time. If you do need to store the contents of a file in an array, you'll need to worry about not overflowing the array. You might want to abort the program with an error message if i exceeds the size of the array.
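For example, a self-contained sketch of that check, with a hypothetical TEST_SIZE constant and the array defined locally rather than in your header:

#include <stdio.h>

#define TEST_SIZE 1024          /* hypothetical: use your array's real size */
char test[TEST_SIZE];

int main(int argc, char *argv[]) {
    FILE *fp = (argc == 1) ? stdin : fopen(argv[1], "r");
    int i = 0;
    int c;

    if (fp == NULL) {
        fprintf(stderr, "cannot open %s\n", argv[1]);
        return 1;
    }
    while ((c = getc(fp)) != EOF) {
        if (i >= TEST_SIZE) {   /* abort rather than overflow the array */
            fprintf(stderr, "input too large for test[]\n");
            return 1;
        }
        test[i] = c;
        printf("%c", test[i]);
        i++;
    }
    return 0;
}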
See also the rest of the advice in Jonathan Leffler's comment; it's all good advice (except that I think he misunderstood the purpose of your test array, which you're using to store the contents of the file, not its name).
I am trying to create a lexical analyzer in C.
The program reads another program as input to convert it into tokens, and the source code is here:
#include <stdio.h>
#include <conio.h>
#include <string.h>

int main() {
    FILE *fp;
    char read[50];
    char seprators[] = "\n";
    char *p;

    fp = fopen("C:\\Sum.c", "r");
    clrscr();

    while (fgets(read, sizeof(read) - 1, fp) != NULL) {
        //Get the first token
        p = strtok(read, seprators);
        //Get and print other tokens
        while (p != NULL) {
            printf("%s\n", p);
            p = strtok(NULL, seprators);
        }
    }
    return 0;
}
And the contents of Sum.c are:
#include <stdio.h>

int main() {
    int x;
    int y;
    int sum;

    printf("Enter two numbers\n");
    scanf("%d%d", &x, &y);
    sum = x + y;
    printf("The sum of these numbers is %d", sum);
    return 0;
}
I am not getting the correct output and only see a blank screen in place of output.
Can anybody please tell me where I am going wrong?
Thank you so much in advance.
You've asked a few questions since this one, so I guess you've moved on. There are a few things that can be noted about your problem and your start at a solution that can help others starting on a similar problem. You'll also find that people can often be slow at answering things that are obviously homework. We often wait until homework deadlines have passed. :-)
First, I noted you used a few features specific to the Borland C compiler which are non-standard and would not make the solution portable or generic. You could solve the problem without them just fine, and that is usually a good choice. For example, you used #include <conio.h> just to clear the screen with clrscr(), which is probably unnecessary and not relevant to the lexer problem.
I tested the program, and as written it works! It transcribes all the lines of the file Sum.c to stdout. If you only saw a blank screen, it is because it could not find the file: either you did not put it in your C:\ directory or it has a different name. As already mentioned by @WhozCraig, you need to check that the file was found and opened properly.
I see you are using the C function strtok to divide the input up into tokens. There are some nice examples of using it in the documentation, which do more than your simple case and which you could adapt in your code. As mentioned by @Grijesh Chauhan, there are more separators to consider than \n (end-of-line). What about spaces and tabs, for example?
However, in programs, things are not always separated by spaces and lines. Take this example:
result=(number*scale)+total;
If we only used white space as a separator, then it would not identify the words used and only pick up the whole expression, which is obviously not tokenization. We could add these things to the separator list:
char seprators [] = "\n=(*)+;";
Then your code would pick out those words too. There is still a flaw in that strategy, because in programming languages those symbols are also tokens that need to be identified. The problem with programming-language tokenization is that there are no clear separators between tokens.
There is a lot of theory behind this, but basically we have to write down the patterns that form the basis of the tokens we want to recognise and not look at the gaps between them, because as has been shown, there aren't any! These patterns are normally written as regular expressions. Computer Science theory tells us that we can use finite state automata to match these regular expressions. Writing a lexer involves a particular style of coding, which has this style:
while ( NOT <<EOF>> ) {
    switch ( next_symbol() ) {
    case state_symbol[1]:
        ....
        break;
    case state_symbol[2]:
        ....
        break;
    default:
        error(diagnostic);
    }
}
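As a concrete illustration of that style, here is a small sketch of a character-at-a-time scanner reading stdin; the token categories and their printed names are invented for this example, not part of any assignment:

#include <stdio.h>
#include <ctype.h>

int main(void)
{
    int c = getchar();

    while (c != EOF) {
        if (isspace(c)) {                 /* skip whitespace between tokens */
            c = getchar();
        } else if (isalpha(c) || c == '_') {
            printf("IDENT: ");            /* identifier: letters, digits, '_' */
            while (isalnum(c) || c == '_') {
                putchar(c);
                c = getchar();
            }
            putchar('\n');
        } else if (isdigit(c)) {
            printf("NUMBER: ");           /* integer literal: a run of digits */
            while (isdigit(c)) {
                putchar(c);
                c = getchar();
            }
            putchar('\n');
        } else {
            switch (c) {                  /* single-character operator tokens */
            case '=': case '(': case ')':
            case '*': case '+': case ';':
                printf("OPERATOR: %c\n", c);
                break;
            default:
                printf("UNRECOGNISED: %c\n", c);
                break;
            }
            c = getchar();
        }
    }
    return 0;
}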
So, now, perhaps the value of the academic assignment becomes clearer.