Octal number literals: When? Why? Ever? [closed] - c

I have never used octal numbers in my code nor come across any code that used them (hexadecimal and bit twiddling notwithstanding).
I started programming in C/C++ around 1994, so maybe I'm too young for this? Does older code use octal? C supports these literals by prefixing them with a 0, but where is the code that actually uses these base-8 number literals?

I recently had to write network protocol code that accesses 3-bit fields. Octal comes in handy when you want to debug that.
Just for effect, can you tell me what the 3-bit fields of this are?
0x492492
On the other hand, this same number in octal:
022222222
Now, finally, in binary (in groups of 3):
010 010 010 010 010 010 010 010
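To make that concrete, here is a minimal C sketch of the kind of debugging aid this enables; the field3() helper, the variable names and the 24-bit layout are illustrative assumptions, not part of any real protocol:

#include <stdio.h>

/* Illustrative helper: pull the i-th 3-bit field out of a 24-bit word.
   The octal mask 07 covers exactly one field, so the boundaries are obvious. */
static unsigned field3(unsigned word, unsigned i)
{
    return (word >> (3 * i)) & 07;
}

int main(void)
{
    unsigned word = 0x492492;               /* 022222222 in octal */
    for (unsigned i = 0; i < 8; i++)
        printf("field %u = %o\n", i, field3(word, i));   /* each field is 2 */
    return 0;
}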

The only place I come across octal literals these days is when dealing with the permission bits on files in Linux, which are normally represented as 3 octal digits, where each digit represents the permissions for the file owner, group and other users respectively.
e.g. 0755 (also just 755 with most command line tools) means the file owner has full permissions (read, write, execute), and the group and other users just have read and execute permissions.
Representing these bits in octal makes it easier to figure out what permissions are set. You can tell at a glance what 0755 means, but not 493 or 0x1ed.
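As a hedged illustration of how those octal literals typically appear in C on a POSIX system (chmod() and the mode constants are standard POSIX; "example.txt" is just a placeholder path):

#include <stdio.h>
#include <sys/stat.h>   /* chmod() and the S_I* permission bits (POSIX) */

int main(void)
{
    /* 0755: owner rwx, group r-x, others r-x. The octal digits line up
       one-for-one with the three permission groups; 493 or 0x1ed do not. */
    if (chmod("example.txt", 0755) != 0)
        perror("chmod");

    /* The same mode spelled out with the symbolic constants: */
    if (chmod("example.txt", S_IRWXU | S_IRGRP | S_IXGRP | S_IROTH | S_IXOTH) != 0)
        perror("chmod");
    return 0;
}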

From Wikipedia
At the time when octal originally became widely used in computing, systems such as the IBM mainframes employed 24-bit (or 36-bit) words. Octal was an ideal abbreviation of binary for these machines because eight (or twelve) digits could concisely display an entire machine word (each octal digit covering three binary digits). It also cut costs by allowing Nixie tubes, seven-segment displays, and calculators to be used for the operator consoles; where binary displays were too complex to use, decimal displays needed complex hardware to convert radixes, and hexadecimal displays needed to display letters.
All modern computing platforms, however, use 16-, 32-, or 64-bit words, with eight bits making up a byte. On such systems three octal digits would be required, with the most significant octal digit inelegantly representing only two binary digits (and in a series the same octal digit would represent one binary digit from the next byte). Hence hexadecimal is more commonly used in programming languages today, since a hexadecimal digit covers four binary digits and all modern computing platforms have machine words that are evenly divisible by four. Some platforms with a power-of-two word size still have instruction subwords that are more easily understood if displayed in octal; this includes the PDP-11. The modern-day ubiquitous x86 architecture belongs to this category as well, but octal is almost never used on this platform.
-Adam

I have never used octal numbers in my code nor come across any code that used them.
I bet you have. According to the standard, numeric literals which start with zero are octal. This includes, trivially, 0. Every time you have used or seen a literal zero, this has been octal. Strange but true. :-)

Commercial Aviation uses octal "labels" (basically message type ids) in the venerable Arinc 429 bus standard. So being able to specify label values in octal when writing code for avionics applications is nice...

I have also seen octal used in aircraft transponders. A mode-3a transponder code is a 12-bit number that everyone deals with as 4 octal digits. There is a bit more information on Wikipedia. I know it's not generally computer related, but the FAA uses computers too :).

It's useful for the chmod and mkdir functions in Unix land, but aside from that I can't think of any other common uses.

I came into contact with Octal through PDP-11, and so, apparently, did the C language :)

There are still a bunch of old Process Control Systems (Honeywell H4400, H45000, etc) out there from the late 60s and 70s which are arranged to use 24-bit words with octal addressing. Think about when the last nuclear power plants were constructed in the United States as one example.
Replacing these industrial systems is a pretty major undertaking so you may just be lucky enough to encounter one in the wild before they go extinct and gape in awe at their magnificent custom floating point formats!

tar files store numeric header fields (file size, mode, timestamps) as octal strings of ASCII digits.
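For example, the 12-byte size field of a ustar header is such an octal string; a rough sketch of decoding a field like that might look like this (octal_field() and the sample buffer are illustrative, not taken from any real tar implementation):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Decode an octal ASCII field of the given width, as found in a tar header. */
static unsigned long octal_field(const char *field, size_t width)
{
    char buf[32];
    if (width >= sizeof buf)
        width = sizeof buf - 1;
    memcpy(buf, field, width);
    buf[width] = '\0';                  /* make sure strtoul sees a terminator */
    return strtoul(buf, NULL, 8);       /* base 8: the on-disk digits are octal */
}

int main(void)
{
    const char size_field[] = "00000001750 ";   /* 01750 octal == 1000 bytes */
    printf("%lu\n", octal_field(size_field, 12));
    return 0;
}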

There is no earthly reason to modify a standard that goes back to the birth of the language and which exists in untold numbers of programs. I still remember ASCII characters by their octal values; I would have to think to come up with the hex value of A, but it is 101 in octal; numeric 0 is 060... ^C is 003...
That is to say, I often use the octal representation.
Now if you really want to bend your mind, take a look at the word format for the PDP-10...

Anyone who learned to program on a PDP-8 has a warm spot in his heart for octal numbers. Word size was 12 bits divided into 4 groups of 3 bits each, so -1 was 7777 octal. This scheme was perpetuated in the PDP-11 which had 16 bit words but still used octal representation for various things, hence the *NIX file permission scheme which lives to this day.

Octal is and was most useful with the first available display hardware (7-segment displays). These original displays did not have the decoders available later.
Thus the digital register outputs were grouped to fit the available display, which was capable of displaying only eight (8) symbols: 0, 1, 2, 3, 4, 5, 6, 7.
Also, the first CRT display tubes were raster scan displays, and the simplest character-symbol generators were equivalent to the 7-segment displays.
The motivating driver was, as always, the least expensive display possible.

Related

What does the following mean in the context of programming, specifically the C programming language?

representations of values on a computer can vary “culturally” from architecture to architecture or are determined by the type the programmer gave to the value. Therefore, we should try to reason primarily about values and not about representations if we want to write portable code.
Specifying values. We have already seen several ways in which numerical constants (literals) can be specified:
123 Decimal integer constant.
077 Octal integer constant.
0xFFFF Hexadecimal integer constant.
et cetera
Question: Are decimal integer constants and hexadecimal integer constants different ways to 'represent' values, or are they values themselves? If the latter, what are the different ways to represent them on different architectures?
The source of the aforementioned is the book "Modern C" by Jens Gustedt, which is freely available online, specifically pages 38 to 46.
The words "representation" can be used here in two different contexts.
One is when we (the programmers) specify e.g. integer constants. For example, the value 37 may be represented in the C code as 37 or 0x25 or 045. Regardless of which representation we have chosen, the C compiler will interpret this into the same value when generating the binary code. Hence, these statements all generate the same code:
int a = 37;
int a = 0x25;
int a = 045;
Another context is how the compiler chooses to store the value 37 internally. The C standard states a few requirements (e.g. that the representation of int must at least be able to represent values in the range -32767 to +32767). Within the rules of the C standard the compiler will use a bit representation which can be operated on efficiently by the native language of the target system's CPU. The most common representation for signed integers is Two's complement and usually a signed integer with type int will occupy 2 or 4 bytes of 8 bits each.
However, the C standard is sufficiently flexible to allow for other internal representations (e.g. bytes with more than 8 bits or Ones' complement representation of signed integers). A common difference between representations of multibyte integers on different systems is the use of different byte order.
The C standard is primarily concerned with the result of standard operations. E.g. 5+6 must give the same result no matter on which platform the expression is executed, but how 5, 6 and 11 are represented on the given platform is largely up to the compiler to decide.
It is of utmost importance to every C programmer to understand that C is an abstraction layer that shields you from the underlying hardware. This service is the raison d'être for the language, the reason it was developed. Among other things, the language shields you from the different internal byte patterns used to hold the same values on different platforms: You write a value and operations on it, and the compiler will see to producing the proper code. This would be different in assembler where you are intimately concerned with memory layout, register sizes etc.
In case it wasn't obvious: I'm emphasizing this because I struggled with these concepts myself when I learned C.
The first thing to hammer down is that C program code is text. What we deal with here are text representations of values, a succession of (most likely) ASCII codes much as if you wrote a letter to your grandma.
Integer literals like 0443 (the less usual octal format), 0x0123 or 291 are simply different string representations for the same value. Here and in the standard, "value" is a value in the mathematical sense. As much as we think "oh, C!" when we see "0x0123", it is nothing other than a way to write down the mathematical value of 291. That is what is meant by "value", for example when the standard specifies that "the type of an integer constant is the first of the corresponding list in which its value can be represented." The compiler has to create a binary representation of that value in the program's memory. This means it has to find out what value it is (291 in all cases) and then produce the proper byte pattern for it. The integer literal in the C code is not a binary form of anything, no matter whether you choose to write its string representation down in base 10, base 16 or base 8. In particular, 0x0123 does not mean that the two bytes 01 and 23 will be anywhere in the compiled program, or in which order.1
To demonstrate the abstraction consider the expression (0x0123 << 4) == 0x1230, which should be true on all machines. Both hex literals are of type int here. The beauty of hex code is that it makes bit manipulations in multiples of 4 really easy to compute.
On a typical contemporary Intel architecture an int has 4 bytes and is organized "little end first", or "little endian" for short: the lowest-value byte comes first if we inspect the memory in ascending order. 0x123 is represented as 00100011-00000001-00000000-00000000 (because the two highest-value bytes are zero for such a small number). 0x1230 is, consequently, 00110000-00010010-00000000-00000000. No left-shift whatsoever took place on the hardware (but also no right-shift!). The bit-shift operators' semantics are an abstraction: "Imagine a regular binary number, following the old Arab fashion of starting with the highest-value digit, and shift that imagined binary number." It is an abstraction that bears zero resemblance to anything happening on the hardware, and the compiler simply translates this abstract operation into the right thing for that particular hardware.
1Now admittedly, they probably are there, but on your prevalent x86 platform their order will be reversed, as assumed below.
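A small sketch of how to see this for yourself: it prints the in-memory bytes of the value 0x0123. The little-endian order described above is what a typical x86 machine shows, but the output is, by design, platform-dependent:

#include <stdio.h>
#include <string.h>

int main(void)
{
    int value = 0x0123;                        /* the abstract value 291 */
    unsigned char bytes[sizeof value];

    memcpy(bytes, &value, sizeof value);       /* inspect the actual byte pattern */
    for (size_t i = 0; i < sizeof value; i++)
        printf("byte %zu: %02x\n", i, bytes[i]);
    return 0;
}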
Are decimal integer constants and hexadecimal integer constants, different ways to 'represent' values or are they values themselves?
This is philosophy! They are different ways to represent values, like:
0x2 means 2 (for a C compiler)
two means 2 (the English language)
a couple means 2 (for an English speaker)
zwei means 2 (...)
A C compiler translates from "some form of human understandable language" to "a very precise form understandable by the machine": the only thing retained from the various forms is the intimate meaning (the value!).
It happens that C, in order to be more friendly, lets you specify integers in different ways: decimal and hexadecimal (ok, even octal and, more recently, binary notation). What the C compiler is interested in is the value and, as already noted in a comment, after the compiler has "understood" the value, there is no more difference between "0xC" and "12". From that point, the compiler must make the machine understand the value 12, using the representation the target machine uses and, again, what is important is the value.
Most probably, the phrase
we should try to reason primarily about values and not about representations
is an invitation to programmers to choose correct data types and values, but not only that: also to give useful names to types and variables and so on. A simple example: even if we know that a line feed is (often) represented by decimal 10, we should use LF or "\n" or similar, which names the value we want, not its representation.
About data types, especially integers, C is not particularly brilliant compared to other languages which let you define types based on their possible values (for example with a "-3 .. 5" notation, which states that the possible values go from -3 to 5 and lets the compiler choose the number of bits needed to represent that range).

About Memory Address convention? [duplicate]

Whenever I see C programs that refer directly to a specific location in memory (e.g. a memory barrier) it is done with hexadecimal numbers; also, on Windows, when you get a segfault it reports the faulting memory address as a hexadecimal number.
For example: *(int *)0x12DF
I am wondering why memory addresses are represented using hexadecimal numbers?
Is there a special reason for that or is it just a convention?
Memory is often manipulated in terms of larger units, such as pages or segments, which tend to have sizes that are powers of 2. So if addresses are expressed in hex, it's much easier to read them as page+offset or similar constructs. Decimal is difficult because of that pesky factor of 5, and binary addresses are too long to be easily readable.
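As a rough illustration of that page+offset reading, assuming 4 KiB pages (both the page size and the sample address are assumptions made up for the example):

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uintptr_t addr   = 0xb73eb1a4;                  /* some example address */
    uintptr_t page   = addr & ~(uintptr_t)0xFFF;    /* 4 KiB pages: low 12 bits are the offset */
    uintptr_t offset = addr &  (uintptr_t)0xFFF;

    /* In hex both parts are readable at a glance; in decimal they are not. */
    printf("page   = %#lx\n", (unsigned long)page);
    printf("offset = %#lx\n", (unsigned long)offset);
    return 0;
}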
It's a much shorter way to represent what would otherwise be written in binary. It is also very easy to convert hex to binary and back: each group of 4 binary digits corresponds to one hex digit.
Convention and convenience: hex shows more clearly what relationship various pointers have to address segmenting. (For example, shared libraries are usually loaded on even hex boundaries, and the data segment likewise is on an even boundary.) DEC minicomputer convention actually preferred octal, but IBM's hex preference won out in practice.
(As for why this matters: what's easier to remember, 0xb73eb000 or 3074338816? It's the address of one of the shared objects in my current shell on jinx.)
It's the shortest common number format, so the numbers don't take up much space and everybody knows what they mean.
A computer only understands binary, a collection of 0s and 1s, i.e. ON/OFF states. For human readability, a binary number representing an address or data has to be converted into a readable format, and hexadecimal is one such format. But why convert binary to hex rather than to decimal, octal, etc.? Because hex is the one that can be converted with the least overhead, in both hardware and software. That's why we show addresses as hex, but internally they are still just binary.
Hope it helps :)

Binary representation in C

In C, why is there no standard conversion specifier to print a number in its binary format, something like %b? Sure, one can write functions or hacks to do this, but I want to know why such a simple thing is not a standard part of the language.
Was there some design decision behind it? Since there are format specifiers for octal (%o) and hexadecimal (%x), is it that octal and hexadecimal are somehow "more important" than the binary representation?
Since in C/C++ one often encounters bitwise operators, I would imagine that it would be useful to have %b, or to directly input a binary representation of a number into a variable (the way one inputs hexadecimal numbers like int i = 0xf2).
Note: Threads like this discuss only the 'how' part of doing this and not the 'why'
The main reason is 'history', I believe. The original implementers of printf() et al at AT&T did not have a need for binary, but did need octal and hexadecimal (as well as decimal), so that is what was implemented. The C89 standard was fairly careful to standardize existing practice - in general. There were a couple of new parts (locales, and of course function prototypes, though there was C++ to provide 'implementation experience' for those).
You can read binary numbers with strtol() et al; specify a base of 2. I don't think there's a convenient way of formatting numbers in different bases (other than 8, 10, 16) that is the inverse of strtol() - presumably it should be ltostr().
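A small sketch of both directions, assuming reading is done with the standard strtoul() (a close relative of strtol()) at base 2, and writing with a hand-rolled helper; print_binary() is a hypothetical name, not a library function:

#include <stdio.h>
#include <stdlib.h>
#include <limits.h>

/* Hand-rolled output: print the value's bits, most significant first. */
static void print_binary(unsigned long value)
{
    for (int bit = (int)(sizeof value * CHAR_BIT) - 1; bit >= 0; bit--)
        putchar(((value >> bit) & 1UL) ? '1' : '0');
    putchar('\n');
}

int main(void)
{
    unsigned long parsed = strtoul("101101", NULL, 2);  /* base 2: parses to 45 */
    printf("%lu\n", parsed);
    print_binary(parsed);
    return 0;
}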
You ask "why" as if there must be a clear and convincing reason, but the reality is that there is no technical reason for not supporting a %b format.
K&R C was created be people who framed the language to meet what they thought were going to be their common use cases. An opposing force was trying to keep the language spec as simple as possible.
ANSI C was standardized by a committee whose members had diverse interests. Clearly %b did not end-up being a winning priority.
Languages are made by men.
The main reason, as I see it, is: what binary representation should one use? One's complement? Two's complement? Are you expecting the actual bits in memory or the abstract number representation?
Only the latter makes sense when C makes no requirements of word size or binary number representation. So since it wouldn't be the bits in memory, surely you would rather read the abstract number in hex?
Claiming an abstract representation is "binary" could lead to the belief that -0b1 ^ 0b1 == 0 might be true or that -0b1 | -0b10 == -0b11
Possible representations:
While there is only one meaningful hex representation (the abstract one), the number -0x79 can be represented in binary as:
-1111001 (the abstract number)
10000110 (one's complement)
10000111 (two's complement)
#Eric has convinced me that endianness != left-to-right order...
The problem is further compounded when numbers don't fit in one byte. The same number could be:
1111111110000110 as a one's-complement big-endian 16-bit number
1111111110000111 as a two's-complement big-endian 16-bit number
1000011011111111 as a one's-complement little-endian 16-bit number
1000011111111111 as a two's-complement little-endian 16-bit number
The concepts of endianness and binary representation don't apply to hex numbers, as there is no way they could be considered the actual bits-in-memory representation.
All these examples assume an 8-bit byte, which C does not mandate beyond requiring CHAR_BIT to be at least 8 (there have historically been machines with other byte sizes, such as 9 bits).
Why no decision is better than any decision:
Obviously one can arbitrarily pick one representation, or leave it implementation defined.
However:
if you are trying to use this to debug bitwise operations (which I see as the only compelling reason to use binary over hex), you want to use something close to what the hardware uses, which makes it impossible to standardise, so you want it implementation-defined.
Conversely if you are trying to read a bit sequence, you need a standard, not implementation defined format.
And you definitely want printf and scanf to use the same.
So it seems to me there is no happy medium.
One answer may be that hexadecimal formatting is much more compact. See for example the hex view of Total Commander's Lister.
%b would be useful in lots of practical cases. For example, if you write code to analyze network packets, you have to read the values of bits, and if printf had %b, debugging such code would be much easier. Even if omitting %b could be explained when printf was designed, it was definitely a bad idea.
I agree. I was a participant in the original ANSI C committee and made the proposal to include a binary representation in C. However, I was voted down, for some of the reasons mentioned above, although I still think it would be quite helpful when doing, e.g., bitwise operations, etc.
It is worth noting that the ANSI committee was for the most part composed of compiler developers, not users and C programmers. Their objective was to make the standard understandable to compiler developers, not necessarily to C programmers, and to do so with a document that was no longer than it needed to be, even if this meant it was a difficult read for C programmers.

Where did the octal/hex notations come from? [closed]

After all this time, I've never thought to ask this question; I understand this came from C++, but what was the reasoning behind it:
Specify decimal numbers as you normally would
Specify octal numbers by a leading 0
Specify hexadecimal numbers by a leading 0x
Why 0? Why 0x? Is there a natural progression for base-32?
C, the ancestor of C++ and Java, was originally developed by Dennis Ritchie at Bell Labs in the early 70s on DEC PDP machines, whose tooling and documentation leaned heavily on octal. On DEC's 12-bit PDP-8, for example, an address was most conveniently written as four 3-bit octal digits (first addressable word 0000 octal, last addressable word 7777 octal).
Octal does not map well to 8 bit bytes because each octal digit represents three bits, so there will always be excess bits representable in the octal notation. An all-TRUE-bits byte (1111 1111) is 377 in octal, but FF in hex.
Hex is easier for most people to convert to and from binary in their heads, since binary numbers are usually expressed in blocks of eight (because that's the size of a byte) and eight is exactly two Hex digits, but Hex notation would have been clunky and misleading in Dennis' time (implying the ability to address 16 bits). Programmers need to think in binary when working with hardware (for which each bit typically represents a physical wire) and when working with bit-wise logic (for which each bit has a programmer-defined meaning).
I imagine Dennis added the 0 prefix as the simplest possible variation on everyday decimal numbers, and easiest for those early parsers to distinguish.
I believe hex notation 0x__ was added to C slightly later. The parsing logic needed to distinguish 1-9 (first digit of a decimal constant), 0 (first, insignificant, digit of an octal constant), and 0x (indicating a hex constant to follow in subsequent digits) from each other is considerably more complicated than just using a leading 0 as the indicator to switch from parsing subsequent digits as octal rather than decimal.
Why did Dennis design it this way? Contemporary programmers don't appreciate that those early computers were often controlled by toggling instructions into the CPU by physically flipping switches on the CPU's front panel, or with a punch card or paper tape; all environments where saving a few steps or instructions represented savings of significant manual labor. Also, memory was limited and expensive, so saving even a few instructions had a high value.
In summary:
0 for octal because it was efficiently parseable and octal was user-friendly on DEC's PDPs (at least for address manipulation)
0x for hex probably because it was a natural and backward-compatible extension on the octal prefix standard and still relatively efficient to parse.
The zero prefix for octal, and 0x for hex, are from the early days of Unix.
The reason for octal's existence dates to when there was hardware with 6-bit bytes, which made octal the natural choice. Each octal digit represents 3 bits, so a 6-bit byte is two octal digits. The same goes for hex, from 8-bit bytes, where a hex digit is 4 bits and thus a byte is two hex digits. Using octal for 8-bit bytes requires 3 octal digits, of which the first can only have the values 0, 1, 2 and 3 (the first digit is really 'tetral', not octal).
There is no reason to go to base32 unless somebody develops a system in which bytes are ten bits long, so a ten-bit byte could be represented as two 5-bit "nybbles".
“New” numerals had to start with a digit, to work with existing syntax.
Established practice had variable names and other identifiers starting with a letter (or a few other symbols, perhaps underscore or dollar sign). So “a”, “abc”, and “a04” are all names. Numbers started with a digit. So “3” and “3e5” are numbers.
When you add new things to a programming language, you seek to make them fit into the existing syntax, grammar, and semantics, and you try to make existing code continue working. So, you would not want to change the syntax to make “x34” a hexadecimal number or “o34” an octal number.
So, how do you fit octal numerals into this syntax? Somebody realized that, except for “0”, there is no need for numerals beginning with “0”. Nobody needs to write “0123” for 123. So we use a leading zero to denote octal numerals.
What about hexadecimal numerals? You could use a suffix, so that "34x" means 34 in base 16. However, then the parser has to read all the way to the end of the numeral before it knows how to interpret the digits (unless it encounters one of the "a" to "f" digits, which would of course indicate hexadecimal). It is "easier" on the parser to know that the numeral is hexadecimal early. But you still have to start with a digit, and the zero trick has already been used, so we need something else. "x" was picked, and now we have "0x" for hexadecimal.
(The above is based on my understanding of parsing and some general history about language development, not on knowledge of specific decisions made by compiler developers or language committees.)
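A toy sketch of that parsing decision, just to make the 0 / 0x convention concrete; literal_base() is an illustrative helper, not how any real compiler is written:

#include <stdio.h>

/* Toy version of the lexer's choice: "0x" -> hex, a bare leading "0" -> octal
   (including plain "0" itself), anything else -> decimal. */
static int literal_base(const char *s)
{
    if (s[0] != '0')
        return 10;
    if (s[1] == 'x' || s[1] == 'X')
        return 16;
    return 8;
}

int main(void)
{
    const char *examples[] = { "123", "0755", "0x1ed", "0" };
    for (int i = 0; i < 4; i++)
        printf("%-6s is base %d\n", examples[i], literal_base(examples[i]));
    return 0;
}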
I dunno ...
0 is for 0ctal
0x is for, well, we've already used 0 to mean octal and there's an x in hexadecimal so bung that in there too
as for natural progression, best look to the latest programming languages which can affix subscripts such as
123_27 (interpret _ to mean subscript)
and so on
?
Mark
Is there a natural progression for base-32?
This is part of why Ada uses the form 16# to introduce hex constants, 8# for octal, 2# for binary, etc.
I wouldn't concern myself too much over needing space for "future growth" in basing though. This isn't like RAM or addressing space where you need an order of magnitude more every generation.
In fact, studies have shown that octal and hex are pretty much the sweet spot for human-readable representations that are binary-compatible. If you go any lower than octal, it starts to require a ridiculous number of digits to represent larger numbers. If you go any higher than hex, the math tables get ridiculously large. Hex is actually a bit too much already, but octal has the problem that it doesn't evenly fit in a byte.
There is a standard encoding for Base32. It is very similar to Base64, but it isn't very convenient to read. Hex is used because 2 hex digits can represent one 8-bit byte. And octal was used primarily for older systems that used 12-bit words. It made for a more compact representation of data when compared to displaying raw registers as binary.
It should also be noted that some languages use o### for octal and x## or h## for hex, as well as many other variations.
I think 0x actually came from the UNIX world and was picked up by C/C++ and other languages. But I don't know the exact reason or true origin.

What is the most efficient way to store and work with a floating point number with 1,000,000 significant digits in C?

I'm writing a utility to calculate π to a million digits after the decimal. On a 32- or 64-bit consumer desktop system, what is the most efficient way to store and work with such a large number accurate to the millionth digit?
clarification: The language would be C.
Forget floating point, you need bit strings that represent integers
This takes a bit less than 1/2 megabyte per number. "Efficient" can mean a number of things. Space-efficient? Time-efficient? Easy-to-program with?
Your question is tagged floating-point, but I'm quite sure you do not want floating point at all. The entire idea of floating point is that our data is only known to a few significant figures and even the famous constants of physics and chemistry are known precisely to only a handful or two of digits. So there it makes sense to keep a reasonable number of digits and then simply record the exponent.
But your task is quite different. You must account for every single bit. Given that, no floating point or decimal arithmetic package is going to work unless it's a template you can arbitrarily size, and then the exponent will be useless. So you may as well use integers.
What you really, really need is a string of bits. This is simply an array of a convenient type. I suggest <stdint.h> and simply using uint32_t[125000] (or a uint64_t equivalent) to get started. This might actually be a good use of the more obscure typedefs from that header that pick out bit widths that are fast on a given platform.
To be more specific we would need to know more about your goals. Is this for practice in a specific language? For some investigation into number theory? If the latter, why not just use a language that already supports bignums, like Ruby?
Then the storage is someone else's problem. But if what you really want to do is implement a big-number package, then I might suggest using BCD (4-bit) strings or even ordinary ASCII 8-bit strings with printable digits, simply because things will be easier to write and debug, and maximum space and time efficiency may not matter so much.
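To give a flavour of what such a string of bits buys you, here is a minimal sketch of multi-word addition with carry over a little-endian array of 32-bit limbs. The limb count is shrunk to 4 to keep the example readable, and bignum_add() is an illustrative name, not a library call:

#include <stdint.h>
#include <stdio.h>

#define LIMBS 4   /* would be on the order of 125000 for the real thing */

/* result = a + b over little-endian limb arrays, carry propagated upward.
   A 64-bit intermediate keeps the carry logic trivial with 32-bit limbs. */
static void bignum_add(uint32_t result[], const uint32_t a[], const uint32_t b[])
{
    uint64_t carry = 0;
    for (int i = 0; i < LIMBS; i++) {
        uint64_t sum = (uint64_t)a[i] + b[i] + carry;
        result[i] = (uint32_t)sum;       /* low 32 bits stay in this limb */
        carry = sum >> 32;               /* high bit feeds the next limb */
    }
}

int main(void)
{
    uint32_t a[LIMBS] = { 0xFFFFFFFF, 0, 0, 0 };   /* 2^32 - 1 */
    uint32_t b[LIMBS] = { 1, 0, 0, 0 };
    uint32_t r[LIMBS];

    bignum_add(r, a, b);                           /* expect 2^32 */
    printf("%08x %08x %08x %08x\n",
           (unsigned)r[3], (unsigned)r[2], (unsigned)r[1], (unsigned)r[0]);
    return 0;
}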
I'd recommend storing it as an array of short ints, one per digit, and then carefully writing utility functions to add and subtract portions of the number. You'll end up moving from this array of ints to floats and back, but you need a 'perfect' way of storing the number - so use its exact representation. This isn't the most efficient way in terms of space, but a million ints isn't very big.
It's all in the way you use the representation. Decide how you're going to 'work with' this number, and write some good utility functions.
If you're willing to tolerate computing pi in hex instead of decimal, there's a very cute algorithm (the Bailey-Borwein-Plouffe digit-extraction formula) that allows you to compute a given hexadecimal digit without knowing the previous digits. This means, by extension, that you don't need to store (or be able to do computation with) million-digit numbers.
Of course, if you want to get the nth decimal digit, you will need to know all of the hex digits up to that precision in order to do the base conversion, so depending on your needs, this may not save you much (if anything) in the end.
Unless you're writing this purely for fun and/or learning, I'd recommend using a library such as GNU Multiprecision. Look into the mpf_t data type and its associated functions for storing arbitrary-precision floating-point numbers.
If you are just doing this for fun/learning, then represent numbers as an array of chars, with each array element storing one decimal digit. You'll have to implement long addition, long multiplication, etc.
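If you go the GMP route, a minimal sketch might look like this; it computes sqrt(2) rather than pi purely to keep the example to a single library call, the precision and digit counts are arbitrary, and you link with -lgmp:

#include <stdio.h>
#include <gmp.h>    /* GNU Multiprecision; build with -lgmp */

int main(void)
{
    /* A decimal digit needs about 3.33 bits; 4 bits per digit leaves headroom. */
    mpf_set_default_prec(1000 * 4);

    mpf_t two, root;
    mpf_init_set_ui(two, 2);
    mpf_init(root);

    mpf_sqrt(root, two);             /* sqrt(2) to roughly 1000 decimal digits */
    gmp_printf("%.100Ff\n", root);   /* show the first 100 digits after the point */

    mpf_clear(two);
    mpf_clear(root);
    return 0;
}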
Try PARI/GP, see wikipedia.
You could store its decimal digits as text in a file and mmap it to an array.
I once worked on an application that used really large numbers (but didn't need good precision). What we did was store the numbers as logarithms, since you can store a pretty big number as a log10 within an int.
Think along these lines before resorting to bit stuffing or some complex bit representations.
I am not too good with complex math, but I reckon there are solutions which are elegant when storing numbers with millions of bits of precision.
IMO, any programmer of arbitrary-precision arithmetic needs an understanding of base conversion. It solves two problems anyway: being able to calculate pi in hex digits and convert the result to a decimal representation, and also finding the optimal container.
The dominant constraint is the number of correct bits in the multiplication instruction.
In JavaScript one always has 53 bits of accuracy, meaning that a Uint32Array with numbers having at most 26 bits can be processed natively (a waste of 6 bits per word).
On a 32-bit architecture with C/C++ one can easily get A*B mod 2^32, suggesting a basic element of 16 bits. (Those can be parallelized in many SIMD architectures, starting with MMX.) Also, each 16-bit result can hold a 4-digit decimal number (wasting about 2.5 bits per word).
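A small sketch of why 16-bit limbs are comfortable on a 32-bit machine: the product of two 16-bit values always fits in 32 bits, and the split into low/high halves is exactly the carry step of a schoolbook multiply (the values here are arbitrary):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint16_t a = 0xFFFF, b = 0xFFFF;              /* worst-case 16-bit limbs */
    uint32_t product = (uint32_t)a * b;           /* at most 0xFFFE0001, no overflow */

    uint16_t low  = (uint16_t)(product & 0xFFFF); /* stays in this limb */
    uint16_t high = (uint16_t)(product >> 16);    /* carries into the next limb */

    printf("product=%08x low=%04x high=%04x\n",
           (unsigned)product, (unsigned)low, (unsigned)high);
    return 0;
}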
