How floats are computed on a machine without an FPU - c

The C language has a data type float. Some machines have a floating point processor that carries out all the floating point computations. My question is: could there be machines without a floating point processor? How do such machines use floating point?

Many small controllers do not have floating point units. In that case, floating point arithmetic is provided by a software library.
In the mid-1980s, we considered ourselves blessed if our system had an 8087, the FPU for the 8086 and 8088. Unfortunately, our software had to work correctly whether an 8087 was present or not. That meant trapping and emulating 8087 instructions when it was missing.

The C standard allows floating point.
It is the compiler's responsibility to translate it to the specific hardware architecture.
If the hardware instruction set supports floating point [and most modern machines do], then the compiler will most likely use it.
Otherwise, it has to generate native code that simulates the behavior of floating point on its own. How is it done? You can read more about floating point on the Wikipedia page and in this more detailed article about floating point arithmetic.

In the x86 line, CPUs up to and including the 386 had no built-in FPU; the 486DX was the first to integrate one, and the budget 486SX shipped with the FPU disabled or absent.
As for microcontrollers, most of them do not have an FPU.

You'll find that nearly all modern desktop computers and servers include an FPU.
High-end mobile devices have begun to include FPUs, but below the high end you won't find many devices that have them.
In many applications, it's possible to do arithmetic on fractional numbers using "fixed point arithmetic", which doesn't require an FPU.
In other cases, you can do the same math that an FPU does, but it takes longer when you have to build it yourself out of other arithmetic primitives rather than having a complex chip take care of it for you.
My favorite example of floating point simulation on fixed point processors is provided in Donald Knuth's MMIXware, a complete processor simulation in very portable C.

Emulating floating point is a bit slow, but theoretically fairly simple. It works much the way most people learned in high school: you have a number with an exponent. To add or subtract, you have to adjust the numbers so they have the same exponent, then add/subtract the mantissas. To multiply or divide, you multiply/divide the mantissas and add/subtract the exponents.
When you've finished that, you normalize the result. In high school we used decimal and normally required exactly one digit before the decimal point, so (for example) 10001 would be written as 1.0001 x 10^4. On the computer, the details are a bit different (e.g., we're dealing in binary instead of decimal) but the basic idea is pretty much the same.
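To make the align/add/normalize steps concrete, here is a minimal sketch in C. It is not IEEE 754: the toy softfloat struct, its encoding, and the lack of signs, rounding and sticky bits are simplifications of my own, just enough to show the idea using only integer operations.

    #include <math.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Toy normalized binary float: value = mant * 2^exp, with the top bit of
     * mant set for nonzero values. Not IEEE 754 -- no sign, rounding, or
     * sticky bits -- just enough to show the align/add/normalize steps. */
    typedef struct {
        uint32_t mant;   /* mantissa */
        int      exp;    /* binary exponent */
    } softfloat;

    static softfloat sf_add(softfloat a, softfloat b)
    {
        if (b.exp > a.exp) { softfloat t = a; a = b; b = t; }   /* a has the larger exponent */

        int shift = a.exp - b.exp;                      /* align the smaller operand */
        uint32_t bm = (shift < 32) ? (b.mant >> shift) : 0;

        uint64_t sum = (uint64_t)a.mant + bm;           /* add mantissas, keep the carry */
        int exp = a.exp;

        if (sum >> 32) { sum >>= 1; exp += 1; }         /* normalize after a carry-out */

        softfloat r = { (uint32_t)sum, exp };
        return r;
    }

    int main(void)
    {
        softfloat three = { 0xC0000000u, -30 };         /* 3.0 = 0xC0000000 * 2^-30 */
        softfloat onep5 = { 0xC0000000u, -31 };         /* 1.5 = 0xC0000000 * 2^-31 */
        softfloat r = sf_add(three, onep5);
        printf("mant=%08x exp=%d value=%g\n",           /* prints 4.5 */
               (unsigned)r.mant, r.exp, ldexp((double)r.mant, r.exp));
        return 0;
    }

Subtraction, signs, guard/round/sticky bits and overflow handling are what real soft-float libraries add on top of this skeleton.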

Related

How was floating point conversion handled before the invention of the FPU and SSE?

I am trying to understand how floating point conversion is handled at the low level. So based on my understanding, this is implemented in hardware. So, for example, SSE provides the instruction cvttss2si which converts a float to an int.
But my question is: was floating point conversion always handled this way? Before the invention of the FPU and SSE, was the calculation done manually using assembly code?
It depends on the processor, and there have been a huge number of different processors over the years.
FPU stands for "floating-point unit". It's a more or less generic term that can refer to a floating-point hardware unit for any computer system. Some systems might have floating-point operations built into the CPU. Others might have a separate chip. Yet others might not have hardware floating-point support at all. If you specify a floating-point conversion in your code, the compiler will generate whatever CPU instructions are needed to perform the necessary computation. On some systems, that might be a call to a subroutine that does whatever bit manipulations are needed.
SSE stands for "Streaming SIMD Extensions", and is specific to the x86 family of CPUs. For non-x86 CPUs, there's no "before" or "after" SSE; SSE simply doesn't apply.
The conversion from floating-point to integer is considered a basic enough operation that the 387 instruction set already had such an instruction, FIST, although it is not directly useful for compiling the (int)f construct of C programs, because that instruction uses the current rounding mode.
Some RISC instruction sets have always considered that a dedicated conversion instruction from floating-point to integer was an unnecessary luxury, and that the conversion could be done with several instructions accessing the IEEE 754 floating-point representation. One basic scheme might look like the one in this blog post, although that post is about rounding a float to a float representing the nearest integer.
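As an illustration of the bit-manipulation approach, here is a simplified sketch of my own (not the scheme from the blog post): it truncates toward zero and ignores NaN, infinity and overflow, but it converts a float to an int using only integer operations on the IEEE 754 encoding.

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* Truncating float -> int32 using only integer operations on the
     * IEEE 754 single-precision bit pattern. No NaN/infinity/overflow
     * handling -- this is a sketch, not a production conversion. */
    static int32_t float_to_int_bits(float f)
    {
        uint32_t bits;
        memcpy(&bits, &f, sizeof bits);                  /* reinterpret the bits */

        uint32_t sign = bits >> 31;
        int32_t  exp  = (int32_t)((bits >> 23) & 0xFF) - 127;   /* unbias */
        uint32_t mant = (bits & 0x7FFFFFu) | 0x800000u;          /* implicit leading 1 */

        if (exp < 0)
            return 0;                                    /* |f| < 1 truncates to 0 */

        /* The value is mant * 2^(exp - 23); shift accordingly. */
        int32_t val = (exp >= 23) ? (int32_t)(mant << (exp - 23))
                                  : (int32_t)(mant >> (23 - exp));
        return sign ? -val : val;
    }

    int main(void)
    {
        printf("%d %d %d\n", float_to_int_bits(42.9f),
               float_to_int_bits(-3.5f), float_to_int_bits(0.25f));  /* 42 -3 0 */
        return 0;
    }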
Prior to the standardization of IEEE 754 arithmetic, there were many competing vendor-specific ways of doing floating-point arithmetic. These had different ranges, precision, and different behavior with respect to overflow, underflow, signed zeroes, and undefined results such as 0/0 or sqrt(-1).
However, you can divide floating point implementations into two basic groups: hardware and software. In hardware, you would typically see an opcode which performs the conversion, although coprocessor FPUs can complicate things. In software, the conversion would be done by a function.
Today, software floating point is still around, mostly on embedded systems. Not too long ago it was common even on mobile devices, and it remains the norm on smaller systems.
Indeed, floating point operations are a challenge for hardware engineers, as they require a lot of hardware (leading to a higher cost for the final product) and consume a lot of power. There are some architectures that do not contain a floating point unit. There are also architectures that do not provide instructions even for basic operations like integer division; many ARM cores are an example of this, where you have to implement division in software. Also, the floating point unit comes as an optional coprocessor in this architecture. It is worth thinking about this, considering that ARM is the main architecture used in embedded systems.
IEEE 754 (the floating point standard used today in most applications) is not the only way of representing real numbers. You can also represent them using a fixed point format. For example, on a 32 bit machine you can assume there is a binary point between bits 15 and 16 and perform operations keeping this in mind. This is a simple way of representing fractional numbers and it can be handled in software easily.
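A minimal sketch of that layout (Q16.16, i.e. 16 integer bits and 16 fraction bits; the type and helper names are my own, and overflow and rounding are ignored) could look like this:

    #include <stdint.h>
    #include <stdio.h>

    /* Q16.16 fixed point: upper 16 bits integer part, lower 16 bits fraction,
     * so value = raw / 65536.0. No overflow or rounding handling. */
    typedef int32_t q16_16;

    #define Q_ONE (1 << 16)

    static q16_16 q_from_int(int x)         { return (q16_16)(x * Q_ONE); }
    static q16_16 q_add(q16_16 a, q16_16 b) { return a + b; }   /* plain integer add */

    static q16_16 q_mul(q16_16 a, q16_16 b)
    {
        /* Widen to 64 bits, multiply, then drop the extra 16 fraction bits. */
        return (q16_16)(((int64_t)a * b) >> 16);
    }

    static q16_16 q_div(q16_16 a, q16_16 b)
    {
        /* Pre-shift the dividend so the quotient keeps 16 fraction bits. */
        return (q16_16)(((int64_t)a << 16) / b);
    }

    int main(void)
    {
        q16_16 a = q_from_int(3) + Q_ONE / 2;       /* 3.5  */
        q16_16 b = Q_ONE / 4;                       /* 0.25 */
        printf("%f\n", q_add(a, b) / 65536.0);      /* 3.75  */
        printf("%f\n", q_mul(a, b) / 65536.0);      /* 0.875 */
        printf("%f\n", q_div(a, b) / 65536.0);      /* 14.0  */
        return 0;
    }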
It depends on the implementation of the compiler. You can implement floating point math in just about any language (an example in C: http://www.jhauser.us/arithmetic/SoftFloat.html), so usually the compiler's runtime library will include a software implementation of things like floating point math (or the target hardware may have always supported native instructions for this; again, it depends on the hardware), and instructions which target the FPU or use SSE are offered as an optimization.
"Before floating point units" doesn't really apply, since some of the earliest computers, built back in the 1940s, supported floating point numbers: wiki - first electromechanical computers.
On processors without floating point hardware, the floating point operations are implemented in software, or on some computers in microcode, as opposed to being fully implemented in hardware: wiki - microcode. Or the operations could be handled by separate hardware components such as the Intel x87 series: wiki - x87.
But my question is: was the floating point conversion always handled this way?
No. There's no x87 or SSE on architectures other than x86, so there's no cvttss2si either.
Everything you can do with software, you can also do in hardware and vice versa.
The same goes for float conversion. If you don't have hardware support, just do some bit hacking. There's nothing really low level here, so you can do it in C or any other language easily. There are already a lot of solutions on SO:
Converting Int to Float/Float to Int using Bitwise
Casting float to int (bitwise) in C
Converting float to an int (float2int) using only bitwise manipulation
...
Yes. The exponent was changed to 0 by shifting the mantissa, denormalizing the number. If the result was too large for an int, an exception was generated. Otherwise the denormalized number (minus the fractional part and optionally rounded) is the integer equivalent.

Does the ALU read and write floating point numbers?

I know that an FPU (floating point unit) is required for floating point operations and that the ALU can only perform integer arithmetic. So I am using fixed point arithmetic.
These are the steps I am following:
Read the floating point number.
Convert it into fixed point.
Do all operations using fixed point arithmetic.
Convert the result into floating point.
Write the output.
My question is: if there is no FPU present in the system, how would it read floating point as input and output?
Does the ALU read and write floating point numbers? If yes, how?
No, the ALU cannot read or write floating point numbers as floating point numbers; only the FPU can. From the ALU's point of view, an FP number is just an arbitrary string of bits.
The FPU is present today for performance reasons; you have a dedicated piece of silicon on your CPU to perform FP operations.
Since floating point numbers are base two numbers with a mantissa and an exponent, you can always perform floating point operations using the ALU, which, again, is slower than using a hardware FPU but gets the job done anyway.
For example, there is FLIP, a floating point library implemented in C that performs floating point operations using just integer arithmetic; that is, the ALU.
FLIP is a C library that provides a software support for binary32
floating-point arithmetic on integer processors. This library is
particularly targeted to VLIW or DSP processors (that is, embedded
systems), and has been validated on VLIW integer processors like those
of the ST200 family from STMicroelectronics.
This library provides software implementation for the five basic
arithmetic operations (addition, subtraction, multiplication,
division, and square root) with subnormal numbers support, and for the
four rounding-direction attributes (RoundTiesToEven,
RoundTowardPositive, RoundTowardNegative, RoundTowardZero) required by
the IEEE 754-2008 standard.
The GCC compiler also contains a software emulation layer for floating point operations:
The software floating point library is used on machines which do not
have hardware support for floating point. It is also used whenever
-msoft-float is used to disable generation of floating point instructions. (Not all targets support this switch.)
With an ALU you can only use integer or fixed point arithmetic. Otherwise, you have to emulate floating point. The emulation can be done either at the compiler level (see soft float) or at the application level.
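As a small illustration of the compiler-level route (whether -msoft-float is accepted depends on the target, and the helper names below come from libgcc's soft-float support), an ordinary float expression compiled for a soft-float target is lowered to library calls rather than FPU instructions:

    /* Compiled for a soft-float target (for example with -msoft-float where
     * the target supports it), this function contains no FPU instructions;
     * the compiler typically lowers the arithmetic to libgcc helpers such
     * as __mulsf3 and __addsf3. */
    float scale_and_offset(float x, float a, float b)
    {
        return a * x + b;    /* roughly __addsf3(__mulsf3(a, x), b) */
    }

Disassembling the resulting object code shows immediately whether the toolchain emitted FPU instructions or calls to such helpers.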

Xilinx MicroBlaze Floating Point Compatibility

I have C code targeted at a MicroBlaze CPU.
When I debug the code as a C program in Eclipse + GCC or Visual Studio, I get the results I want.
Yet when I run it on the target, the results are different.
It happens only on floating point operations (multiplication and division).
How can I make it work with full floating point precision?
Are there special GCC flags?
P.S.
The MicroBlaze is configured with all the floating point hardware enabled.
I'm not very experienced with MicroBlaze, but the Wikipedia page states:
Also, key processor instructions which are rarely used but more expensive to implement in hardware can be selectively added/removed (i.e. multiply, divide, and floating-point ops.)
Emphasis mine.
So, make sure that your particular MicroBlaze actually has the floating point operations supported; otherwise I imagine your results will be very random.
Also make sure your compiler toolchain generates the proper instructions; toolchains for embedded development sometimes fall back to software-emulated floating point. This should be trivial to figure out by disassembling the final code and seeing how the floating-point operations are implemented.
MicroBlaze floating-point in hardware supports IEEE 754 with some exceptions that are listed in the MicroBlaze reference guide.
Floating-point is not 100% identical on all machines.
It depends on the actual precision used when executing the operations (hardware can use extended precision when executing single-precision operations), and it also depends on the configured rounding mode (IEEE defines four different rounding modes).
MicroBlaze does not support denormalized floating-point numbers (they will be considered to be zero).
However, normal code should avoid denormalized values anyway, since they have reduced accuracy.
What kind of difference do you see?
Göran Bilski

What is the difference between a fixed point and a floating point processor? How is float handled in each kind?

I am a bit confused about how floating point operations are handled in a processor that does not support floating point operations.
Also, how is a floating point processor different from a fixed point processor?
In which cases are IEEE floating point formats used?
First off, there are a number of different floating point formats, for various reasons. Some DSPs do not use IEEE for performance reasons; it carries a lot of extra baggage (which most folks never use).
From elementary school we learned how to count, then we learned how to add, which is just a shortcut for counting; then we learned to multiply, which is just a shortcut for adding. Likewise, subtraction and division are shortcuts for counting down rather than counting up. We also learned to do all of our math one column at a time, so if you have a processor that can do at least 1-bit math operations, you can do addition, subtraction, multiplication and division as wide (as many bits per operand) as you desire. It may take a lot of operations, but it is quite doable, and anyone that made it through grade school has the toolbox/skill set to do such a thing.
Floating point is a middle school thing: manipulate a decimal point and use powers of some base, e.g. (1.3 * 10^5) + (1.5 * 10^5). We know we have to get the powers of 10 the same, then we can just do basic elementary addition with the decimal points lined up. Multiplication is even easier, as you don't have to line up the decimal points; you just do the math on the significant digits and simply add the exponents.
When your processor has a multiply instruction, it is just a shortcut for having to do multiple additions. Depending on how many clock cycles they want to get the multiply operation down to, designers use an increasingly large amount of chip real estate. Likewise for division; that is why you don't see divide on a lot of instruction sets, and don't see multiply on some of the ones that don't have divide. It is a cost trade-off: performance vs power, chip real estate, yield, etc.
Floating point is just an extension of that: at the core of a floating point operation you still have fixed point operations. A floating point multiply requires a fixed point multiplication, an addition of the exponents, and some adjustment. A floating point addition requires some adjustment, an addition, and some more adjustment.
Now, what processors have FPUs and what don't? What processors with an FPU support IEEE and what don't? That is as easy to find as the information above, but I will leave you to solve that yourself.
If you are, for example, able to do math operations using scientific notation (1.345*10^4 + 2.456*10^6, or 2.3*10^6 * 4.5*10^7), then you should be able to break down the math steps involved and write your own soft float routines; not optimized, but you can see how a CPU that doesn't have an FPU, or a programmer that doesn't want to use the FPU, can do floating point operations. You have to be able to think in terms of base 2 rather than base 10, which actually makes the problem significantly easier: 1.101001*2^4 + 1.010101*2^5. In particular, multiplies get really easy, as sketched below.
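Here is one way that base-2 multiply could look, using only integer operations. This is a toy encoding of my own, not IEEE 754: 16-bit mantissas with the top bit set, no signs, no rounding.

    #include <math.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Toy binary "scientific notation": value = mant * 2^exp, where mant is a
     * 16-bit integer with its top bit set. Multiply the mantissas, add the
     * exponents, then normalize the 32-bit product back down to 16 bits. */
    typedef struct { uint16_t mant; int exp; } toyfloat;

    static toyfloat toy_mul(toyfloat a, toyfloat b)
    {
        uint32_t prod = (uint32_t)a.mant * b.mant;            /* 16x16 -> 32 bit product */
        int exp = a.exp + b.exp;

        if (prod & 0x80000000u) { prod >>= 16; exp += 16; }   /* renormalize to 16 bits */
        else                    { prod >>= 15; exp += 15; }

        toyfloat r = { (uint16_t)prod, exp };
        return r;
    }

    int main(void)
    {
        toyfloat a = { 0xC000, -15 };   /* 1.5 = 0xC000 * 2^-15 */
        toyfloat b = { 0xA000, -13 };   /* 5.0 = 0xA000 * 2^-13 */
        toyfloat r = toy_mul(a, b);
        printf("mant=0x%04x exp=%d value=%g\n",               /* prints 7.5 */
               r.mant, r.exp, ldexp((double)r.mant, r.exp));
        return 0;
    }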
When floating point is not supported in hardware, the calculations are done by heavily optimised pieces of assembly code, usually from a library.
One Google search away you could have found this about fixed-point. I assume you can find info about IEEE floating point yourself ;-)
Good luck!
Explanation and arbitrary fixed point library can be found here.

How to do floating point calculations with integers

I have a coprocessor attached to the main processor. Some floating point calculations need to be done on the coprocessor, but it does not support hardware floating point instructions, and emulation is too slow.
Now, one way is to have the main processor scale the floating point values so that they can be represented as integers, send them to the coprocessor, which performs some calculations, and scale those values back on return. However, that wouldn't work most of the time, as the numbers would eventually become too big or too small and go out of the range of those integers. So my question is: what is the fastest way of doing this properly?
You are saying emulation is too slow. I guess you mean emulation of floating point. The only remaining alternative, if scaled integers are not sufficient, is fixed point math, but it's not exactly fast either, even though it's much faster than emulated float.
Also, you are never going to escape the fact that with both scaled integers, and fixed point math, you are going to get less dynamic range than with floating point.
However, if your range is known in advance, the fixed point math implementation can be tuned for the range you need.
Here is an article on fixed point. The gist of the trick is deciding how to split the variable: how many bits go to the integer part and how many to the fractional part of the number.
A full implementation of fixed point for C can be found here. (BSD license.) There are others.
In addition to @Amigable Clark Kant's suggestion, Anthony Williams' fixed point math library provides a C++ fixed class that can be used almost interchangeably with float or double, and on ARM gives a 5x performance improvement over software floating point. It includes a complete fixed point version of the standard math library, including trig and log functions etc., using the CORDIC algorithm.
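For a feel of how CORDIC computes trig functions with nothing but integer shifts and adds, here is a minimal rotation-mode sketch. It is my own toy code, not Anthony Williams' library: it uses Q16.16, builds the arctangent table at startup for brevity, and is only valid for angles of roughly |angle| <= pi/2.

    #include <math.h>
    #include <stdint.h>
    #include <stdio.h>

    /* CORDIC rotation mode in Q16.16 fixed point: the inner loop uses only
     * integer add/subtract/shift. The atan(2^-i) table and the gain are
     * computed once here; a real embedded library would precompute them. */
    #define FRAC 16
    #define ITER 16

    static int32_t to_fix(double x)  { return (int32_t)lrint(x * (1 << FRAC)); }
    static double  to_dbl(int32_t x) { return x / (double)(1 << FRAC); }

    static void cordic_sincos(int32_t angle, int32_t *s, int32_t *c)
    {
        static int32_t atan_tab[ITER], inv_gain;
        if (atan_tab[0] == 0) {                       /* one-time table setup */
            double k = 1.0;
            for (int i = 0; i < ITER; i++) {
                atan_tab[i] = to_fix(atan(ldexp(1.0, -i)));
                k /= sqrt(1.0 + ldexp(1.0, -2 * i));
            }
            inv_gain = to_fix(k);                     /* ~0.60725 in Q16.16 */
        }

        int32_t x = inv_gain, y = 0, z = angle;
        for (int i = 0; i < ITER; i++) {              /* rotate by +/- atan(2^-i) */
            int32_t dx = y >> i, dy = x >> i;
            if (z >= 0) { x -= dx; y += dy; z -= atan_tab[i]; }
            else        { x += dx; y -= dy; z += atan_tab[i]; }
        }
        *c = x;                                       /* ~cos(angle) */
        *s = y;                                       /* ~sin(angle) */
    }

    int main(void)
    {
        int32_t s, c;
        cordic_sincos(to_fix(0.5), &s, &c);           /* 0.5 rad */
        printf("sin=%f cos=%f\n", to_dbl(s), to_dbl(c));   /* ~0.4794 ~0.8776 */
        return 0;
    }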

Resources