Fastest assembly code for finding the square root. Explanation needed [closed] - c

I'm currently writing a program in C that needs to compute billions of square roots. I looked up which known code computes square roots the fastest and came across the code below, which is apparently the fastest. https://www.codeproject.com/Articles/69941/Best-Square-Root-Method-Algorithm-Function-Precisi
double inline __declspec (naked) __fastcall sqrt(double n)
{
    _asm fld qword ptr [esp + 4]   // load the 64-bit double argument from the stack onto the x87 FPU stack
    _asm fsqrt                     // compute the square root in hardware; the result stays in st(0)
    _asm ret 8                     // return, popping the 8-byte argument (callee cleans up the stack)
}
I don't know much about assembly language, so can someone please explain what this code does algorithmically and what those keywords mean?

This is a Microsoft-specific naked __fastcall implementation of the standard sqrt function.
For details, please check the Microsoft documentation.
The naked storage-class attribute is a Microsoft-specific extension to the C language. For functions declared with the naked storage-class attribute, the compiler generates code without prolog and epilog code. You can use this feature to write your own prolog/epilog code sequences using inline assembler code. Naked functions are particularly useful in writing virtual device drivers.
See: Naked functions.
The __fastcall calling convention specifies that arguments to functions are to be passed in registers, when possible. This calling convention only applies to the x86 architecture. Take a look at:
__fastcall
__fastcall was introduced a long time ago by Microsoft. Typically, fastcall calling conventions pass one or more arguments in registers, which reduces the number of memory accesses required for the call. With on-chip caching, the gain from passing arguments in registers is not as large as it used to be.
__stdcall may actually be faster now.
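As a side note (not part of the original answer): with a modern x86-64 compiler, the plain sqrt() from <math.h> usually compiles down to a single hardware square-root instruction (sqrtsd), so hand-written FPU assembly like the snippet above rarely buys anything. A minimal sketch, assuming GCC or Clang and a command line such as gcc -O2 -fno-math-errno:

#include <math.h>
#include <stdio.h>

int main(void)
{
    double sum = 0.0;
    /* the compiler typically inlines this call to a sqrtsd (or vectorized sqrtpd) instruction */
    for (long i = 0; i < 1000000; i++)
        sum += sqrt((double)i);
    printf("%f\n", sum);
    return 0;
}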

Related

In C Programming Language, where (i.e. in which register of the microprocessor) are the register storage class variables stored? [closed]

Are there any fixed registers for the register storage class?
register is a suggestion to the compiler that it might want to place the specified variable in a register.
It is not a command that it must put it in a register.
The compiler can choose which register to put the variable in, or ignore the suggestion completely.
In the previous century, register was a hint for the compiler to try to put that variable in a processor register.
Today, on most compilers, that hint is nearly ignored. But you are still not allowed to take the address (using the & unary operator) of a variable declared register. So today register means "I won't take the address of that variable" to the compiler (hence, the register storage class is almost never used in recently written code). Some people think that register could be deprecated in future standards (of C & C++) or that the keyword could be reused for other purposes.
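For example, here is a minimal sketch (mine, not from the original answer) of the one thing register still enforces:

int main(void)
{
    register int counter = 0;       /* hint: keep counter in a register */
    counter++;
    /* int *p = &counter; */        /* error: cannot take the address of a register variable */
    return counter;
}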
Optimizing compilers have sophisticated register allocation and instruction scheduling (see also this). Details depend upon the level of optimization, the target processor's instruction set architecture, the ABI, etc... So a given variable may be ignored entirely (if the compiler doesn't need it), or can sit in a register, or can sit on the call stack, etc... (and that status can vary at different points of your compiled function).
With GCC, you could compile your foo.c file with gcc -O -fverbose-asm -S foo.c and look into the generated foo.s assembler file (and you could vary the optimization level, e.g. with -O2 etc...).
Regarding performance, today the CPU cache matters a lot, much more than just registers.

How is the assembler compiler programmed [closed]

I learned not long ago that most assemblers are written in C or other languages, and we say assembly is the fastest language. But if the assembler is coded in C, how can assembly be faster than C itself? What does the compiler do then? Are there assemblers written in assembly? I do not really understand how it all works... I searched on the internet, but I did not find a clear answer to what I was looking for.
Could you explain, or give me some links, to help me better understand the concept of assemblers and compilers?
There are three concepts getting tossed around here:
Speed of a compiler
Speed of a processor
Speed of an executable
First, to get it out of the way, the time it takes to compile some executable has very little relationship to the time it takes for that executable to run. (The compiler can take longer to do some careful analysis and apply optimizations.)
The speed at which your processor can operate is another thing. Assembly language is the closest to machine language, which is what your processor understands. Any given instruction in machine language will operate at the speed that the machine processes that instruction.
Everything that executes on your processor must, by definition, be at some point converted to machine language so that your processor can understand and execute it.
That’s where things get tricky. An assembler will translate code you write directly to machine language, but there is more to a program than just knowing how to convert to machine language. Suppose you have a complex value, such as a collection of options. These options must be maintained as strings, integers, floats, etc. How are they stored? How are they accessed?
The way in which all this is done can vary. The way you organize your program can vary. These variations make a difference in executable time.
So you can write a very slow program using assembly language and a very fast program using an interpreted language. And, frankly, compilers are often better at organizing the final machine code than you are, even if you are using an assembler directly.
So to bring it to a point: the compiler’s job is to transform your textual source code (C, or assembly, or whatever) into machine code, which is what your processor understands. Once done, the compiler is no longer necessary.
There is significantly more to it than that, but that is the general idea.
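To make that concrete, here is a small sketch (mine, not part of the original answer) of the same trivial function in C and the x86-64 assembly an optimizing compiler might emit for it; the exact output will vary with compiler and flags:

/* add.c */
int add(int a, int b)
{
    return a + b;
}

/* Possible output of gcc -O2 -S add.c (AT&T syntax):
 *   add:
 *       leal  (%rdi,%rsi), %eax    # compute a + b and place it in the return register
 *       ret
 */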

Assembly and Execution of Programs - Two pass assembler [closed]

While going through a book on machine instructions and programs I came across a particular point which says that an assembler scans an entire source program twice. It builds a symbol table during the 1st pass/scan and associates the entire program with it during the second scan. The assembler needs to provide an address in a similar way for a function.
Now, since the assembler passes through the program twice, why is it necessary to declare a function before it can be used? Wouldn't the assembler provide an address for the function from the 1st pass and then correlate it to the program during the 2nd pass ?
I am considering C programming in this case.
The simple answer is that C requires functions to be declared before they can be used because the C language was designed to be processed by a compiler in a single pass. It has nothing to do with assemblers and addresses of functions. The compiler needs to know the type of a symbol, whether it's a function, a variable, or something else, before it can use it.
Consider this simple example:
int foo() { return bar(); }   /* bar is used here before it has been declared */
int (*bar)();                 /* ...but bar is actually a pointer to a function */
In order to generate the correct code the compiler needs to know that bar isn't a function, but a pointer to a function. The code only works if you put extern int (*bar)(); before the definition of foo so the compiler knows what type bar is.
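A minimal sketch of that fix (my phrasing, following the explanation above):

extern int (*bar)(void);          /* declaration: bar is a pointer to a function */

int foo(void) { return bar(); }   /* the compiler now generates an indirect call */

int (*bar)(void);                 /* the actual definition can come later */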
While the language could, in theory, have been designed to require the compiler to use two passes, this would have required some significant changes in the design of the language. Requiring two passes would also increase the required complexity of the compiler, decreasing the number of platforms that could host a C compiler. This was a very important consideration back in the day when C was first being developed, when 64K (65,536) bytes of RAM was a lot of memory. Even today it would have a noticeable impact on the compile times of large programs.
Note that the C language does sort of allow what you want anyway, by supporting implicit function declarations. (In my example above, this is what happens in foo when bar isn't declared previously.) However, this feature is obsolete, limited, and considered dangerous.
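For illustration, a minimal sketch (mine, with a hypothetical function name) of an implicit function declaration; this compiles under C89 rules, while modern compilers warn about it or reject it outright:

int main(void)
{
    return twice(21);         /* twice is implicitly declared here as: int twice(); */
}

int twice(int x)              /* the real definition appears only later */
{
    return 2 * x;
}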

C's coverage of assembly [closed]

In an argument with a friend, I made the remark that it is impossible to write, in any language besides C, a program that is faster than all C variants that do the same thing. My argument was based on an affirmative answer to the question below. Is it true?
If we think of "compiling" as a map from [C programs] to [assembly programs], then is this map surjective?
Caveat: Of course, you can include assembly in C programs, but pretend that isn't possible (makes for a more interesting question!).
The answer to the question If we think of "compiling" as a map from [C programs] to [assembly programs], then is this map surjective? is obviously NO.
It can be proven trivially:
* There could be assembly language instructions that the compiler will not generate, such as int 10, halt, jmp *eax, iret, sub esp,esp...
* You might be fiddling with registers in assembly that the C compiler never touches, such as segment registers.
There is just a world of creativity in assembly that the C language cannot express.
Regarding the other question, I'm not sure what you mean by
it is impossible to write, in any language besides C, a program that is faster than all variants in C, that do the same thing.
If you mean that a skilled programmer can always write a C program that will be faster at a given task than any other program written in any language, I think you are probably wrong there too, because the compiler itself is a fixed variable, and it is imperfect.
Imagine for example that the C compiler is very dumb and generates unoptimized code. It is obvious that an assembly program can be written that will beat the best C variation at the given task: all that is needed is to optimize the unoptimized code. Since the C compiler is imperfect, you can always find a task for which even the best C variation can be further optimized.
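To illustrate that point (my sketch, not from the original answer), compare what a compiler emits with optimization disabled against what a hand-optimizer, or the same compiler at -O2, would produce for a trivial function; the exact output varies by compiler and version:

/* square.c */
int square(int x) { return x * x; }

/* gcc -O0 -S square.c typically emits something like:
 *   pushq  %rbp
 *   movq   %rsp, %rbp
 *   movl   %edi, -4(%rbp)        # spill the argument to the stack
 *   movl   -4(%rbp), %eax        # ...and reload it
 *   imull  -4(%rbp), %eax
 *   popq   %rbp
 *   ret
 * while hand-written assembly (or gcc -O2) needs only:
 *   movl   %edi, %eax
 *   imull  %edi, %eax
 *   ret
 */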

Simulation of x86 32-bit assembly execution using C [closed]

So what I am looking for is not an assembly emulator.
Basically, what I am trying to do is translate assembly to C.
Although I have IDA Pro with the "F5" decompile function,
I am generally trying to take a simulation approach.
I made this example by hand to demonstrate my idea:
mov %eax, 10
add %eax, 5
jmp foo
I want to directly translate it into a simulated c procedure like this
unsigned v_eax = 0;
v_eax = 10;
v_eax += 5;
goto foo;
I think this is pretty much like an assembly simulator, which has a process like
assembly --> running in a CPU simulator in C --> output the results
But what I am trying to do is like this
assembly --> translate into a c source code --> compile --> run to get the results
After a quick search, I think this paper has an approach which is similar to what I am trying to do (however, I don't need any analysis work, just translation of some simple assembly code).
Could anyone give some help on this issue..?
Thank you!
What help are you looking for? If you have specific questions, ask those questions.
It looks like you've already got the general idea:
* Set up a bunch of variables to represent the registers and a large array to represent the memory.
* Implement either subroutines or macros (chunks of code generated in-line) that represent each instruction and do the Right Thing with those resources.
* Implement additional macros or subroutines which are wrappers for, or equivalent to, every operating system call or external library function which the programs might invoke (I/O most importantly).
* Write a "loader" for the executable file, then go through the program converting instructions to those macros. Be sure to fix up goto/call addresses properly, and hope like heck that the programmers kept data blocks and code blocks distinct.
Get it all debugged, and it should work. Extremely slowly, but that's what you've asked for.
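A minimal sketch of that approach (all names here are mine and hypothetical), applied to the three-instruction example from the question:

#include <stdint.h>
#include <stdio.h>

static uint32_t eax;                               /* simulated register */

#define MOV(reg, val)  ((reg) = (uint32_t)(val))   /* one macro per instruction */
#define ADD(reg, val)  ((reg) += (uint32_t)(val))

int main(void)
{
    /* translation of:  mov %eax, 10 ; add %eax, 5 ; jmp foo */
    MOV(eax, 10);
    ADD(eax, 5);
    goto foo;

foo:
    printf("eax = %u\n", eax);                     /* prints 15 */
    return 0;
}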

Resources