Assembly and Execution of Programs - Two pass assembler [closed] - c

While going through a book on machine instructions and programs, I came across a point which says that an assembler scans the entire source program twice: it builds a symbol table during the first pass and uses that table to resolve references throughout the program during the second pass. The assembler needs to provide an address for a function in a similar way.
Now, since the assembler passes through the program twice, why is it necessary to declare a function before it can be used? Couldn't the assembler record an address for the function during the first pass and then correlate it with the program during the second pass?
I am considering C programming in this case.

The simple answer is that C requires functions to be declared before they can be used because the C language was designed to be processed by a compiler in a single pass. It has nothing to do with assemblers or the addresses of functions. The compiler needs to know the type of a symbol, whether it's a function, a variable, or something else, before it can use it.
Consider this simple example:
int foo() { return bar(); }
int (*bar)();
In order to generate correct code, the compiler needs to know that bar isn't a function but a pointer to a function. The code only works if you put extern int (*bar)(); before the definition of foo, so the compiler knows what type bar is.
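Here is a minimal compilable sketch of that fix, reading top to bottom the way a one-pass compiler does (the helper forty_two and the main driver are invented for illustration):

#include <stdio.h>

extern int (*bar)(void);          /* tells the compiler bar's type up front */

int foo(void) { return bar(); }   /* now compiled as a call through a pointer */

static int forty_two(void) { return 42; }
int (*bar)(void) = forty_two;     /* the definition the compiler sees later */

int main(void) {
    printf("%d\n", foo());        /* prints 42 */
    return 0;
}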
While the language could, in theory, have been designed to require the compiler to make two passes, this would have required significant changes to the language's design. Requiring two passes would also increase the complexity of the compiler, decreasing the number of platforms that could host a C compiler. This was a very important consideration back when C was first being developed, when 64K (65,536) bytes of RAM was a lot of memory. Even today, it would have a noticeable impact on the compile times of large programs.
Note that the C language does sort of allow what you want anyway, by supporting implicit function declarations. (In my example above, this is what happens in foo when bar isn't declared previously.) However, this feature is obsolete, limited, and considered dangerous.

Related

Can I call a function from unknown source safely? [closed]

I want to receive some C code from the user and compile it just-in-time using the tcc compiler. The compiler then gives me a pointer to a function in the compiled code. I want to call this function safely, so that if it causes a crash it just returns an integer representing an error. Is this possible?
(This is an example of how I want to use the tcc compiler library.)
I want to call this function safely so that if this function causes a crash it just returns with an integer representing an error. Is this possible?
That alone is potentially possible. Most things that cause a crash will raise a signal, which means that you can call setjmp() before calling the unsafe code, then have signal handlers that use longjmp() to restore a known state if the unsafe code crashes.
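A minimal sketch of that approach, using the POSIX sigsetjmp()/siglongjmp() variants (the wrapper call_guarded() and its choice of signals are assumptions, and after a crash any global state the function touched may be inconsistent):

#include <setjmp.h>
#include <signal.h>

static sigjmp_buf recover;             /* state saved before the risky call */

static void crash_handler(int sig) {
    siglongjmp(recover, sig);          /* unwind back with the signal number */
}

/* Returns 0 on success, or the signal number if func crashed. */
int call_guarded(int (*func)(void), int *result) {
    struct sigaction sa;
    sa.sa_handler = crash_handler;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, NULL);     /* typical crash signals */
    sigaction(SIGFPE,  &sa, NULL);
    sigaction(SIGILL,  &sa, NULL);

    int sig = sigsetjmp(recover, 1);   /* 1 = also save the signal mask */
    if (sig != 0)
        return sig;                    /* arrived here via siglongjmp */

    *result = func();                  /* call the untrusted code */
    return 0;
}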
Can I call a function from unknown source safely?
That is a lot more than just guarding against crashes - you might also have to guard against deliberately malicious code that does not crash.
However, this depends on what you consider "safe" and how your software will be used. Typically (for personal computers, not servers) there is nothing the end user could do through your software that they couldn't also do by compiling their own code with their own compiler and running it themselves (and this includes loading your software into a forked process with malware injected into the virtual address space, then tampering with everything your code does); so "safe" (or "less safe than what the user can already do anyway") becomes hard to define in a meaningful way.
The only valid concern I can think of is when the user has fewer permissions/privileges than your software (where the user could abuse your software to gain permissions/privileges they didn't already have). In that case, you shouldn't be considering letting the user run arbitrary code at all; it's simply too hard to make it safe.

How is the assembler compiler programmed [closed]

I learned not long ago that most assemblers were written in C or other languages, yet we say assembly is the fastest language. But if the assembler is coded in C, how can assembly be faster than C itself? What does the compiler do then? Are there assemblers written in assembly? I do not really understand how it all works. I searched on the internet, but I did not find a clear explanation of what I was looking for.
Could you explain, or give me some links that could help me better understand, the concept of assemblers and compilers?
There are three concepts getting tossed around here:
Speed of a compiler
Speed of a processor
Speed of an executable
First, to get it out of the way, the time it takes to compile some executable has very little relationship to the time it takes for that executable to run. (The compiler can take longer to do some careful analysis and apply optimizations.)
The speed at which your processor can operate is another thing. Assembly language is the closest to machine language, which is what your processor understands. Any given instruction in machine language will operate at the speed that the machine processes that instruction.
Everything that executes on your processor must, by definition, be at some point converted to machine language so that your processor can understand and execute it.
That’s where things get tricky. An assembler will translate the code you write directly into machine language, but there is more to a program than just knowing how to convert it to machine language. Suppose you have a complex value, such as a collection of options. Those options must be maintained as strings, integers, floats, etc. How are they stored? How are they accessed?
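As a hypothetical illustration of those storage choices (the names are invented):

enum opt_type { OPT_INT, OPT_FLOAT, OPT_STRING };

/* one of several possible layouts: a tagged union; how the value is
   stored and accessed affects memory traffic and therefore speed */
struct opt_value {
    enum opt_type type;        /* which member of the union is live */
    union {
        int         i;
        float       f;
        const char *s;
    } value;
};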
The way in which all this is done can vary. The way you organize your program can vary. These variations make a difference in execution time.
So you can write a very slow program using assembly language and a very fast program using an interpreted language. And, frankly, compilers are often better at organizing the final machine code than you are, even if you are using an assembler directly.
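One way to see this: the same C source becomes very different machine code depending on the optimization level, and the optimized version is often better than a straightforward hand-written assembly equivalent. (A sketch; compare "gcc -O0 -S sum.c" with "gcc -O2 -S sum.c".)

/* sum.c */
long sum(const long *a, long n) {
    long total = 0;
    for (long i = 0; i < n; i++)   /* at -O2 this loop may be unrolled or vectorized */
        total += a[i];
    return total;
}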
So to bring it to a point: the compiler’s job is to transform your textual source code (C, or assembly, or whatever) into machine code, which is what your processor understands. Once done, the compiler is no longer necessary.
There is significantly more to it than that, but that is the general idea.

Questions about C as an intermediate language [closed]

I'm writing a language that compiles to C right now; when I say IL, I mean that C is the language my compiler emits, and the generated C is then compiled to assembly by another C compiler, e.g. gcc or clang.
Regarding the C code I generate:
If I do some simple optimisation passes (constant propagation, dead code removal, ...), will this reduce the amount of work the C compiler has to do, or make it harder because the input is no longer typical human-written C?
If I were to compile to three-address code, SSA, or some other intermediate form, and then express that in C using functions, labels, and variables, would that make it easier or harder for the C compiler to optimise?
These questions link together to form the following larger question...
What is the best way to produce good C code from a language that compiles to C?
Is it worth doing any optimisations at all, or should I leave that entirely to the C compiler?
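For concreteness, here is a hypothetical sketch of what "three-address style" C might look like next to ordinary C (the function names are invented for illustration):

/* ordinary C */
int f(int a, int b, int c) { return (a + b) * c; }

/* three-address style: one operation per statement, explicit temporaries */
int f_tac(int a, int b, int c) {
    int t0 = a + b;
    int t1 = t0 * c;
    return t1;
}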
Generally there's not much point doing peephole-type optimisations, because the C compiler will simply do those for you. What is expensive is: a) wasted or unnecessary "gift-wrapping" operations, b) memory accesses, c) branch mispredictions.
For a), make sure you're not passing data around too much, because whilst the C compiler will do constant propagation, there's a limit to how far it can detect that two buffers are in fact aliases of the same underlying data. For b), try to keep functions short, keep operations on the same data together, and limit heap memory use to improve cache performance. For c), the compiler understands for loops, but it doesn't understand goto loops. So it will figure that
for (i = 0; i < N; i++)
will usually take the loop body; it won't figure out that
if (++i < N) goto do_loop_again;
will usually take the jump. A compilable sketch of the two shapes follows.
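Here is that sketch; body() and sink are invented stand-ins for the generated loop body:

static volatile long sink;                /* keeps the loops observable */
static void body(int i) { sink += i; }

void structured(int n) {
    for (int i = 0; i < n; i++)           /* a shape the optimiser recognises */
        body(i);
}

void goto_style(int n) {
    if (n <= 0) return;
    int i = 0;
loop_again:
    body(i);
    if (++i < n) goto loop_again;         /* same semantics, less loop information */
}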
So really the rule is to make your automatically generated code as human-like as possible. Though if it's too human-like, that raises the question of what your language has to offer that C doesn't: the whole point of a non-C language is usually to allow a nice structure in the input script, even at the cost of a spaghetti of gotos in the generated C source.

If there are many functions with the same parameters, should I use a macro to avoid typing the parameters multiple times? [closed]

I have some old C programs to maintain. For some functions (at least 10) with exactly the same parameters, the programmer used a macro to avoid typing the same parameter list again and again. Here is the macro definition:
#define FUNC_DECL(foo) int foo(int p1, int p2, ....)
Then, to define a function with the same parameters, I need only type:
FUNC_DECL(func1) { /* function body */ }
Besides avoiding the tedious work of typing the same parameters many times, are there any other advantages to this implementation?
This kind of implementation also confuses me a little. Are there disadvantages to it?
Is this kind of implementation a good one?
As I noted in comments to the main question, the advantage of using a macro to declare the functions with the same argument list is that it ensures the definitions do have the same argument list.
The primary disadvantage is that it doesn't look like regular C, so people reading the code have to search more code to work out what it means.
On the whole, I don't like that sort of macro-based scheme, but occasionally there are good enough reasons to use it — this might be a borderline example.
There are at least ten functions with the same parameters. Currently, every function has only 3 parameters.
Oh, only 3 parameters? Then there's no excuse for using the macro; I thought it was 10 parameters. Clarity is more important, and I don't think the code will be clearer using the macro. The chances that you'll need to change 10 functions to use 4 parameters instead of 3 are rather limited, and you'd have to change the code to use the extra parameter anyway. The saving in typing is not relevant; the saving in time spent puzzling over the meaning of the macro is. The first person who has to puzzle over the code will spend longer doing that than you'd save by typing the function declarations out, even if you hunt and peck when typing.
Away with it — off with its head! Expunge the macro. Make your code happy again.
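If the goal is just to keep the signatures in sync without hiding the syntax, one hypothetical alternative is a function typedef (a sketch; the real code's parameter list is elided above):

typedef int common_func(int p1, int p2, int p3);

/* declarations are checked against the shared type... */
common_func func1;
common_func func2;

/* ...but a definition must still spell the parameters out, and the
   compiler rejects it if it no longer matches the declaration */
int func1(int p1, int p2, int p3) { return p1 + p2 + p3; }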
#define is a text-processing feature: whether you write the full function declaration out or let the preprocessor generate it, the compiler sees the same thing, so the resulting program behaves identically. Using #define can make a program shorter and arguably more readable without affecting the end result at all; more #define usage means slightly more preprocessing time during compilation and nothing else. But programs are generally run far more often than they are compiled, so the use of #define doesn't hamper your production environment at all.
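To see this for yourself, you can inspect the preprocessor's output; this sketch uses a hypothetical 3-parameter version of the macro:

/* "gcc -E file.c" prints the preprocessed source the compiler actually sees */
#define FUNC_DECL(name) int name(int p1, int p2, int p3)

FUNC_DECL(func1);                            /* int func1(int p1, int p2, int p3); */
FUNC_DECL(func1) { return p1 + p2 + p3; }    /* a full definition */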

Generally Ada seems to compile code slower than similar C code, why is this? [closed]

When I compile programs in Ada, I typically notice a longer compile time than for code of similar length and content written in C or C++.
While it is true that the compiler and system ultimately determine compile time, Ada compilation generally takes longer. Is this process radically different from the compile/link process of C or C++? Does it consist of different stages?
What about the Ada compilation process makes it take longer than C's?
It is all about the amount of time and effort put into making the compiler fast.
Compilers with a broader user base tend to have more money invested in making them fast; however, sometimes other factors are at stake. For example, a compiler might perform static type checking, various "extra" correctness checks, and other work (programming contract compliance, code quality checks, etc.) that adds to the compile time.
Ada has tended to have less money thrown at its compilers, and it is likely a slightly more complex language to parse than C. Both of these factors make it likely that its compilers will be slower.
Note that speed of compilation has little to do with the "quality" of the language. While C might have a larger footprint, Ada has made its mark on the programming world in other ways.
