From An Integrated Approach to Software Engineering
By Pankaj Jalote
Clearly, no meaningful program can be written as a sequence of simple
statements without any branching or repetition (which also involves
branching). So, how is the objective of linearizing the control flow
to be achieved? By making use of structured constructs. In
structured programming, a statement is not a simple assignment
statement, it is a structured statement. The key property of a
structured statement is that it has a single-entry and a
single-exit. That is, during execution, the execution of the (structured) statement starts from one defined point and the execution
terminates at one defined point. With single-entry and single-exit
statements, we can view a program as a sequence of (structured)
statements. And if all statements are structured statements, then
during execution, the sequence of execution of these statements will
be the same as the sequence in the program text. Hence, by using
single-entry and single-exit statements, the correspondence between
the static and dynamic structures can be obtained.
The most commonly used single-entry and single-exit statements are
Selection: if B then S1 else S2
if B then S1
Iteration: While B do S
repeat S until B
Sequencing: S1; S2; S3;...
What do "single-entry" and "single-exit" mean in a structured statement?
Why are the statements listed at the end single-entry and single-exit?
For example, in if B then S1 else S2, why is it single-exit, given that it can terminate at either S1 or S2?
Can you give a statement which is not single-entry?
Can you give a statement which is not single-exit?
In many languages, the only statements that do not have a single entry are those which happen to contain labels for use with goto or switch statements located outside them, and the only statements that do not have a single exit are those which contain a goto to an outside location, trigger an exception, or otherwise force stack unwinding. Note that for any particular call of a function, the only "normal" exit point will be the code immediately following that call.
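As a rough illustration of the distinction, here is a minimal C sketch. The function names sum_to_structured and sum_to_unstructured are hypothetical, chosen only for this example. The first function uses an ordinary while statement, which is entered at its top and left only when its condition fails. The second computes the same result, but its loop is entered through a label inside the body and left through a goto to a label outside it.

```c
#include <assert.h>

/* Structured: the while statement is entered only at its top and is
   left only when the condition i <= n becomes false. */
int sum_to_structured(int n)
{
    int total = 0;
    int i = 1;

    assert(n >= 0);
    while (i <= n) {
        total += i;
        i++;
    }
    return total;
}

/* Unstructured: same result, but control enters the loop through a
   label in the middle of its body and leaves it through a goto. */
int sum_to_unstructured(int n)
{
    int total = 0;
    int i = 1;

    assert(n >= 0);
    goto check;          /* entry into the middle of the loop body */
    for (;;) {
        total += i;
        i++;
check:
        if (i > n)
            goto done;   /* exit by jumping to a label outside the loop */
    }
done:
    return total;
}
```

Both functions return the same sum for the same non-negative n, but in the second one the order of execution can no longer be read off the program text from top to bottom.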
The notion of single entry/single exit may be unclear to those who have never worked with code that didn't use such an approach. Examples of the latter may be found when writing code for platforms like the Atari 2600, where RAM space is often at an absolute premium. If a piece of code will be invoked from the code that shows the title screen or from within the game logic, and one can't afford the two bytes of stack space necessary for a subroutine-call instruction, it would not be uncommon to jump to the code (rather than using a "JSR" [jump to subroutine] instruction), and have the code exit by checking whether a game is in progress and jumping back to the appropriate spot in the "show title screen" or "perform game logic" code. Such a design may be awkward to maintain if it becomes necessary to invoke the code from more places, but such techniques may be necessary if RAM is really tight (e.g. one only has 128 bytes total, as on the Atari 2600).
Related
I'm learning debugging. For this purpose I'm using x64dbg. So I understood how to place breakpoints (according to strings) and how to "block" the execution of the program near the assembly code that interests me.
Unfortunately I came across some strange cases.
In some cases the "flow" is not linear. The flow "follows" a sort of loop. With each pass through this loop, "something new" is written. For example, let's say that the software has to write HELLO. On the first pass just an H appears, on the second pass HE, and so on, and I can see this in the comments column of x64dbg.
The stranger thing is that if the program has to write PIZZA, the program goes through the SAME loop that previously wrote HELLO, but this time P appears first, then PI, then PIZ, and so on, and I can see this in the comments column of x64dbg.
In other words, whatever operation I perform in the software, the SAME piece of code is always executed. This code is executed hundreds of times (in a loop) and each pass contributes a little piece of the final result (which changes).
How is this possible? What should I do?
I'm debugging the goldfish android kernel (version 3.4), with kernel sources.
Now I have found that gdb sometimes jumps back and forth between lines, e.g., consider C source code like the following:
char *XXX;
int a;
...
if (...)
{
}
When I reached the if clause, I type in n and it will jump back to the int a part. Why is that?
If I execute that command again, it would enter the brackets in the if.
If possible, I want to avoid that part, and enter the if directly (of course, if condition matches)
When I reached the if clause, I type in n and it will jump back to the int a part. Why is that?
Because your code is compiled with optimization on, and the compiler can (and often does) re-arrange instructions of your program in such a way that instructions "belonging" to different source lines are interleaved (code motion optimizations attempt (among other things) to move load instructions to long before their results are needed; this helps to hide memory latency).
If you are using gcc-4.8 or later, build your sources with -Og. Else, see this answer.
I am reading some C text at the address https://cs.senecac.on.ca/~btp100/pages/content/const.html.
In the section "STRUCTURED PROGRAMMING", the author mentioned: "Structured programs are understandable, testable and readily modifiable. They consist of simple constructs, each of which has one entry point and one exit point."
I understand what a structured program is, but I do not really understand the idea of "one entry point and one exit point". What if we do not have such a thing?
Can anyone elaborate on that, please?
Look at the Flags example close to the bottom and Avoiding Jumps below that: https://cs.senecac.on.ca/~btp100/pages/content/const.html#fla
What they're basically trying to say here is that you could have some sort of loop (for/while/whatever) where you could use something like break to exit a loop prematurely, rather than waiting on the actual condition that you're checking in the loop to become false and have the loop exit normally. In this case you would have two exit points.
They suggest adding a flag variable to the loop's condition so that the loop has a single exit point, which makes sense.
The use of continue is another example where you can "break structure." You could use continue to stop the current iteration of the loop and reenter it, where in this case you would have multiple entry points.
Things like that can make code a lot harder to read and its flow harder to follow, even though sometimes it may seem necessary to do so.
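To make the break-versus-flag point concrete, here is a minimal C sketch. It is not taken from the linked page; the function names find_with_break and find_with_flag and the found flag are hypothetical. Both functions search an array for a value and return its index, or -1 if it is absent.

```c
#include <assert.h>
#include <stddef.h>

/* Two exit points: the loop ends either when i reaches n or when the
   break statement fires in the middle of the body. */
int find_with_break(const int *values, size_t n, int target)
{
    int index = -1;

    assert(values != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (values[i] == target) {
            index = (int)i;
            break;
        }
    }
    return index;
}

/* One exit point: the flag is part of the loop condition, so the loop
   is only ever left by its condition becoming false. */
int find_with_flag(const int *values, size_t n, int target)
{
    int index = -1;
    int found = 0;   /* flag: set once the target has been seen */

    assert(values != NULL || n == 0);
    for (size_t i = 0; i < n && !found; i++) {
        if (values[i] == target) {
            index = (int)i;
            found = 1;
        }
    }
    return index;
}
```

The two versions return the same result for the same input. The flag version costs an extra variable and an extra test per iteration, which is the trade-off being weighed when people argue over whether break is acceptable.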
I am writing a simulation of a dead-code remover in a very simple manner.
My idea is as follows:
Step 1: Read the input C program line by line and store it in a doubly linked list or an array (since deletion and insertion will be easier than with file operations).
Doubt: Is my approach correct? If so, how can I minimize traversing the linked list each time?
Step 2: The strings that were read will be analyzed in parallel, and tables will be created to keep track of variable names and their details, functions and their calls, etc.
Step 3: Each entry in the variable table will be searched for, and each variable will be replaced by the value it holds at that point.
(E.g.)
i=0;
if(i==3) will be replaced by if(0==3).
But in a situation like this:
get(a);
i=a;
if(i){}
Here, 'i' will not be replaced since it depends on another variable, and 'a' will not be replaced since it depends on user input.
Doubt: if the input is
if(5*5+6){print hello;} ,
the check is surely unnecessary. How can I evaluate this expression to simplify the code to
{
print hello;
}
Step 4: The strings will be searched for if(0), while(0), etc., and the action block is removed using a stack. if(0){//this will be removed*/}
Step 5: (E.g.) function foo(){/**/} ... if(0) foo(); ... Once all the dead code is removed, foo()'s entry in the function table is checked to get the number of times it is referenced in the code. If it is 0, that function is removed using the same stack method.
Step 6: In the remaining functions, the lines below the return statements (if any) are removed, except for the '}'. This removal continues to the end of the function. The end of the function is identified using a stack.
Step 7: I will then assume that my dead-code-free program is ready, and store the linked list or array in an output file.
My questions are:
1. Is my idea meaningful, and is it implementable? How can I improve this algorithm?
2. While trying to implement this idea, I have to deal more with string manipulation than with removing dead code. Is there any way to reduce the string manipulation in this algorithm?
Do not do it this way. C is a free-form language, and trying to process it line-by-line will result in supporting a subset of C that is so ridiculously restricted that it doesn't deserve the name.
What you need to do is to write a proper parser. There is copious literature about that out there. Find out which textbook your school uses for its compiler-construction course, and work through that -- or just take the course! Only when you've got the parser down should you even begin to consider semantics. Then do your work on abstract syntax trees instead of strings. Alternatively, find an already written and tested parser for C that you can reuse (but you'll still need to learn quite a bit in order to integrate it with your own processing).
If you end up writing the parser yourself, and it's only for your own edification, consider using a simpler language than C as your subject. Even though C at its core is fairly compact as languages go, getting all the details of the declaration syntax right is surprisingly tricky, and will probably distract you from what you're actually interested in. And the presence of the preprocessor is an issue in itself which can make it very difficult to design meaningful source-to-source transformations.
By the way, the transformations you sketch are known in the trade as "constant propagation", or (in more ambitious variants that will clone functions and loop bodies when they have differing constant inputs) "partial evaluation". Googling those terms may be interesting.
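To show why working on a tree is easier than working on strings, here is a minimal C sketch of evaluating constant subexpressions on a tiny expression tree. It is an illustration under assumptions, not a piece of a real C parser: the Node type, the NodeKind values, and the function fold_constants are all hypothetical, and only addition and multiplication of integers are handled.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { NODE_CONST, NODE_VAR, NODE_ADD, NODE_MUL } NodeKind;

typedef struct Node {
    NodeKind kind;
    int value;                  /* meaningful only for NODE_CONST */
    struct Node *left, *right;  /* operands of NODE_ADD / NODE_MUL */
} Node;

/* Rewrites the tree in place so that every operator whose operands are
   both constants becomes a single constant node. Returns 1 if the node
   is a constant afterwards, 0 otherwise. Child nodes are only detached,
   never freed; ownership is assumed to stay with the caller. */
int fold_constants(Node *node)
{
    assert(node != NULL);
    if (node->kind == NODE_CONST)
        return 1;
    if (node->kind == NODE_VAR)
        return 0;               /* value unknown until run time */

    /* Fold both sides before testing, so that a constant subtree is
       still simplified when its sibling is not constant. */
    int left_const = fold_constants(node->left);
    int right_const = fold_constants(node->right);
    if (!(left_const && right_const))
        return 0;

    int l = node->left->value;
    int r = node->right->value;
    node->value = (node->kind == NODE_ADD) ? l + r : l * r;
    node->kind = NODE_CONST;
    node->left = node->right = NULL;
    return 1;
}
```

For the tree representing 5*5+6, this leaves a single constant node holding 31. A later pass could then see that the condition of the if is a non-zero constant and keep only the body, with no string searching involved.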
How can I protect a C++ output file (PE file) from editing using a CRC (cyclic redundancy check)?
You can use CRCs to effectively check whether a file was accidentally altered, but they are not effective for copy protection or for preventing cheats in a game.
Usually, when a program has some sort of CRC check, I find the code which does the check and change the assembly instruction from a conditional branch to an unconditional branch. This is usually quite easy to find, because normally after a CRC failure the program displays a message and exits. I place a breakpoint where the message occurs and examine all the frames in the stack. I then put breakpoints on each point in the stack, run the program again, and see which one does the CRC check.
This isn't particularly difficult, and people often bundle little programs which will apply the same changes to the software of your choice.
You need a static variable in your code. The variable needs to be initialized to a value that can easily be found with a hex editor (e.g. DEADBEEF).
You need a CRC algorithm (try searching Google).
The tricky part: you need to get pointers in memory to the start and to the end of your exe. You can parse the PE file header for the code location and run the CRC algorithm from the start of the code to the end of the code. Then you have the value.
Of course, you have to compare the calculated value with the one in the static variable.
Inserting the value: depending on how often you build, you might want to program a tool. You can always run your program and set a breakpoint on the comparison. Then you note down the value and hex-edit it into the executable. Or you create a standalone program that parses the PE header as well, uses the same function (this time on the file), and patches the value in. This could be complicated though, because I don't know what is changed by the OS during loading.
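The CRC step itself is small. Here is a minimal C sketch, assuming the common reflected CRC-32 with polynomial 0xEDB88320 (the variant used by zip and PNG); the answer above does not say which CRC to use, so that choice is an assumption. The names crc32_compute and region_is_unmodified are hypothetical. Locating the code section through the PE header and patching the stored value are not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (reflected, polynomial 0xEDB88320, initial value and
   final XOR of 0xFFFFFFFF). Slow but short; a table-driven version
   would be faster. */
uint32_t crc32_compute(const unsigned char *data, size_t length)
{
    uint32_t crc = 0xFFFFFFFFu;

    assert(data != NULL || length == 0);
    for (size_t i = 0; i < length; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 1u)
                crc = (crc >> 1) ^ 0xEDB88320u;
            else
                crc >>= 1;
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Compares the CRC of the bytes in [start, end) with the expected
   value, e.g. the one patched into the static variable after a build.
   Returns 1 on a match and 0 otherwise. */
int region_is_unmodified(const unsigned char *start,
                         const unsigned char *end,
                         uint32_t expected)
{
    assert(start <= end);
    return crc32_compute(start, (size_t)(end - start)) == expected;
}
```

A quick sanity check for this variant is that the nine ASCII bytes "123456789" give 0xCBF43926. As the other answer points out, a check like this detects accidental changes but is easy for an attacker to bypass by patching the comparison.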