Whenever I see C programs that refer directly to a specific location on the memory (e.g. a memory barrier) it is done with hexadecimal numbers, also in windows when you get a segfualt it presents the memory being segfualted with a hexadecimal number.
For example: *(0x12DF)
I am wondering why memory addresses are represented using hexadecimal numbers?
Is there a special reason for that or is it just a convention?
Memory is often manipulated in terms of larger units, such as pages or segments, which
tend to have sizes that are powers of 2. So if addresses are expressed in hex, it's
much easier to read them as page+offset or similar constructs. Decimal is difficult because
of that pesky factor of 5, and binary addresses are too long to be easily readable.
Its a much shorter way to represent what would otherwise be written in binary. It is also very nice and easy to convert hex to binary and back. Each 4 digits of binary corresponds to one digit of hex.
Convention and convenience: hex shows more clearly what relationship various pointers have to address segmenting. (For example, shared libraries are usually loaded on even hex boundaries, and the data segment likewise is on an even boundary.) DEC minicomputer convention actually preferred octal, but IBM's hex preference won out in practice.
(As for why this matters: what's easier to remember, 0xb73eb000 or 3074338816? It's the address of one of the shared objects in my current shell on jinx.)
It's the shortest, common number format, thus the numbers don't take up much place and everybody knows what they mean.
Computer only understands binary language which is collection of 0's and 1's. That means ON/OFF. As in case of the human readability the binary number which may be representing some address or data has to be converted into human readable format. Hexadecimal is one of them. But the question can be why we have converted binary to HEX only why not decimal, octal etc. Answer is HEX is the one which can be easily converted with the least amount of overhead on both HW as well as SW. thats why we are using addresses as HEX. But internally they are used as binary only.
Hope it helps :)
Related
What functions and features need to be changed in the C compiler and standard library, to be able to write and compile code in a separate base system? Obviously, there's trouble that comes from using base systems greater than decimal, where you must represent digits using non-numeric symbols (assume for now they are simply the capital letters A-Z). The underlying features of integer and floating point math do not need to be changed, only the portions when binary components are converted into decimal.
Ultimately, I'd like to know the feasibility of such a task and the edge cases of binary to decimal translation, so a detailed explanation is best. If you could also indicate the modifications necessary for a system with less digits than decimal (e.g., heximal) in a separate category from extra modifications that are also necessary for a system with more digits (e.g., dozenal), it would be appreciated.
Right now I believe that the parser which turns constant literals into binary numbers, as well as the underlying code for printf and scanf functions needs to be modified slightly. I'm curious if there's other details in an arbitrary portion of the standard library that would also need modification.
Let say if we are given a byte of binary data, how can you know what that data represents?
Is it true that you cant really know what the data represents because you need to know whether the one byte of binary data is represented in base 2, if it unsigned, signed, etc.
or is it that you can know what it represents since binary is base 2?
I am sorry to tell that a byte of data has nothing to do with it's supposed representation.
You state that because it's a byte, it's a binary representation. this is purely assumption.
It depends on the intention of the guy who store the very data.
It might represent anything. As #nos told you, it really depend on the convention the setter used to store it.
You may have a complementary to 2 number, a signed byte on 7 bit, un unsigned on 8 bits, an octal representation (or a partial representation) or a mask (each group of byte within the byte may describer something totally different than another). It could also be a representation of a special coding. Etc.
This is truly unlimited.
In order to properly interpret it you need to know the underlying convention (a spec). #fede1024 told you about files, which use special character so that you can double check with the convention.
One more thing… Bear in mind that even binary data can be stored in natural order or in reverse order: that's endianness. So when you examine a number store in at least 2 bytes, you have to know whether the most significant byte is stored first or sec on din memory. If you misinterpret this, you won't understand the underlying piece of data. Endianness is a constant for a given processor.
Base-2 and binary refer to the same thing. Typically, you do need to know whether the byte is signed or unsigned at least (in C). As for what the data represents - well, "it depends". Whether you want to interpret it as a single byte, as a character (or not), etc. With multi-byte data, you often also have to take endianness (ordering of the bytes into larger words) into account.
Some files format start with a magic number, for example all PNG files starts with 89 50 4E 47 0D 0A 1A 0A. That said, if you have a general binary file without any kind of magic number, you can just guess about his contents.
You can try to open it with an hexadecimal editor, but there is no automatic way to understand what the data represents.
You know it's base 2 since it's a byte of binary data, as you said. To see if it is true, in C everything but 0 is true. If it's 0, then it's false.
I'm writing a utility to calculate π to a million digits after the decimal. On a 32- or 64-bit consumer desktop system, what is the most efficient way to store and work with such a large number accurate to the millionth digit?
clarification: The language would be C.
Forget floating point, you need bit strings that represent integers
This takes a bit less than 1/2 megabyte per number. "Efficient" can mean a number of things. Space-efficient? Time-efficient? Easy-to-program with?
Your question is tagged floating-point, but I'm quite sure you do not want floating point at all. The entire idea of floating point is that our data is only known to a few significant figures and even the famous constants of physics and chemistry are known precisely to only a handful or two of digits. So there it makes sense to keep a reasonable number of digits and then simply record the exponent.
But your task is quite different. You must account for every single bit. Given that, no floating point or decimal arithmetic package is going to work unless it's a template you can arbitrarily size, and then the exponent will be useless. So you may as well use integers.
What you really really need is a string of bits. This is simply an array of convenient types. I suggest <stdint.h> and simply using uint32_t[125000] (or 64) to get started. This actually might be a great use of the more obscure constants from that header that pick out bit sizes that are fast on a given platform.
To be more specific we would need to know more about your goals. Is this for practice in a specific language? For some investigation into number theory? If the latter, why not just use a language that already supports Bignum's, like Ruby?
Then the storage is someone else's problem. But, if what you really want to do is implement a big number package, then I might suggest using bcd (4-bit) strings or even ordinary ascii 8-bit strings with printable digits, simply because things will be easier to write and debug and maximum space and time efficiency may not matter so much.
I'd recommend storing it as an array of short ints, one per digit, and then carefully write utility classes to add and subtract portions of the number. You'll end up moving from this array of ints to floats and back, but you need a 'perfect' way of storing the number - so use its exact representation. This isn't the most efficient way in terms of space, but a million ints isn't very big.
It's all in the way you use the representation. Decide how you're going to 'work with' this number, and write some good utility functions.
If you're willing to tolerate computing pi in hex instead of decimal, there's a very cute algorithm that allows you to compute a given hexadecimal digit without knowing the previous digits. This means, by extension, that you don't need to store (or be able to do computation with) million digit numbers.
Of course, if you want to get the nth decimal digit, you will need to know all of the hex digits up to that precision in order to do the base conversion, so depending on your needs, this may not save you much (if anything) in the end.
Unless you're writing this purely for fun and/or learning, I'd recommend using a library such as GNU Multiprecision. Look into the mpf_t data type and its associated functions for storing arbitrary-precision floating-point numbers.
If you are just doing this for fun/learning, then represent numbers as an array of chars, which each array element storing one decimal digit. You'll have to implement long addition, long multiplication, etc.
Try PARI/GP, see wikipedia.
You could store its decimals digits as text in a file and mmap it to an array.
i once worked on an application that used really large numbers (but didnt need good precision). What we did was store the numbers as logarithms since you can store a pretty big number as a log10 within an int.
Think along this lines before resorting to bit stuffing or some complex bit representations.
I am not too good with complex math, but i reckon there are solutions which are elegant when storing numbers with millions of bits of precision.
IMO, any programmer of arbitrary precision arithmetics needs understanding of base conversion. This solves anyway two problems: being able to calculate pi in hex digits and converting the stuff to decimal representation and as well finding the optimal container.
The dominant constraint is the number of correct bits in the multiplication instruction.
In Javascript one has always 53-bits of accuracy, meaning that a Uint32Array with numbers having max 26 bits can be processed natively. (waste of 6 bits per word).
In 32-bit architecture with C/C++ one can easily get A*B mod 2^32, suggesting basic element of 16 bits. (Those can be parallelized in many SIMD architectures starting from MMX). Also each 16-bit result can contain 4-digit decimal numbers (wasting about 2.5 bits) per word.
I know how to convert binary to decimal. I know at least 2 methods: table and power ;-)
I want to convert binary to decimal and print this decimal. Moreover, I'm not interested in this `decimal'; I want just to print it.
But, as I wrote above, I know only 2 methods to convert binary to decimal and both of them required addition. So, I'm computing some value for 1 or 0 in binary and add it to the remembered value. This is a thin place. I have a really-really big number (1 and 64 zeros). While converting I need to place some intermediate result in some 'variable'. In C, I have an `int' type, which is 4 bytes only and not more than 10^11.
So, I don't have enough memory to store intermedite result while converting from binary to decimal. As I wrote above, I'm not interested in THAT decimal, I just want to print the result. But, I don't see any other ways to solve it ;-( Is there any solution to "just print" from binary?
Or, maybe, I should use something like BCD (Binary Coded Decimal) for intermediate representation? I really don't want to use this, 'cause it is not so cross-platform (Intel's processors have a built-in feature, but for other I'll need to write own implementation).
I would glad to hear your thoughts. Thanks for patience.
Language: C.
I highly recommend using a library such as GMP (GNU multiprecision library). You can use the mpz_t data type for large integers, the various import/export routines to get your data into an mpz_t, and then use mpz_out_str() to print it out in base 10.
Biggest standard integral data type is unsigned long long int - on my system (32-bit Linux on x86) it has range 0 - 1.8*10^20 which is not enough for you, so you need to create your own type (struct or array) and write basic math (basically you just need an addition) for that type.
If I were you (and memory is not an issue), I'd use an array - one byte per decimal digit rather then BCD. BCD is more compact as it stores 2 decimal digits per byte but you need to put much more effort working with high and low nibbles separately.
And to print you just add '0' (character, not digit) to every byte of your array and you get a printable string.
Well, when converting from binary to decimal, you really don't need ALL the binary bits at the same time. You just need the bits you are currently calculating the power of and probably a double variable to hold the results.
You could put the binary value in an array, lets say i[64], iterate through it, get the power depending on its position and keep adding it to the double.
Converting to decimal really means calculating each power of ten, so why not just store these in an array of bytes? Then printing is just looping through the array.
Couldn't you allocate memory for, say, 5 int's, and store your number at the beginning of the array? Then manually iterate over the array in int-sized chunks. Perhaps something like:
int* big = new int[5];
*big = <my big number>;
Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 9 years ago.
Improve this question
I have never used octal numbers in my code nor come across any code that used it (hexadecimal and bit twiddling notwithstanding).
I started programming in C/C++ about 1994 so maybe I'm too young for this? Does older code use octal? C includes support for these by prepending a 0, but where is the code that uses these base 8 number literals?
I recently had to write network protocol code that accesses 3-bit fields. Octal comes in handy when you want to debug that.
Just for effect, can you tell me what the 3-bit fields of this are?
0x492492
On the other hand, this same number in octal:
022222222
Now, finally, in binary (in groups of 3):
010 010 010 010 010 010 010 010
The only place I come across octal literals these days is when dealing with the permission bits on files in Linux, which are normally represented as 3 octal digits, where each digit represents the permissions for the file owner, group and other users respectively.
e.g. 0755 (also just 755 with most command line tools) means the file owner has full permissions (read, write, execute), and the group and other users just have read and execute permissions.
Representing these bits in octal makes it easier to figure out what permissions are set. You can tell at a glance what 0755 means, but not 493 or 0x1ed.
From Wikipedia
At the time when octal originally
became widely used in computing,
systems such as the IBM mainframes
employed 24-bit (or 36-bit) words.
Octal was an ideal abbreviation of
binary for these machines because
eight (or twelve) digits could
concisely display an entire machine
word (each octal digit covering three
binary digits). It also cut costs by
allowing Nixie tubes, seven-segment
displays, and calculators to be used
for the operator consoles; where
binary displays were too complex to
use, decimal displays needed complex
hardware to convert radixes, and
hexadecimal displays needed to display
letters.
All modern computing
platforms, however, use 16-, 32-, or
64-bit words, with eight bits making
up a byte. On such systems three octal
digits would be required, with the
most significant octal digit
inelegantly representing only two
binary digits (and in a series the
same octal digit would represent one
binary digit from the next byte).
Hence hexadecimal is more commonly
used in programming languages today,
since a hexadecimal digit covers four
binary digits and all modern computing
platforms have machine words that are
evenly divisible by four. Some
platforms with a power-of-two word
size still have instruction subwords
that are more easily understood if
displayed in octal; this includes the
PDP-11. The modern-day ubiquitous x86
architecture belongs to this category
as well, but octal is almost never
used on this platform.
-Adam
I have never used octal numbers in my
code nor come across any code that
used it.
I bet you have. According to the standard, numeric literals which start with zero are octal. This includes, trivially, 0. Every time you have used or seen a literal zero, this has been octal. Strange but true. :-)
Commercial Aviation uses octal "labels" (basically message type ids) in the venerable Arinc 429 bus standard. So being able to specify label values in octal when writing code for avionics applications is nice...
I have also seen octal used in aircraft transponders. A mode-3a transponder code is a 12-bit number that everyone deals with as 4 octal numbers. There is a bit more information on Wikipedia. I know it's not generally computer related, but the FAA uses computers too :).
It's useful for the chmod and mkdir functions in Unix land, but aside from that I can't think of any other common uses.
I came into contact with Octal through PDP-11, and so, apparently, did the C language :)
There are still a bunch of old Process Control Systems (Honeywell H4400, H45000, etc) out there from the late 60s and 70s which are arranged to use 24-bit words with octal addressing. Think about when the last nuclear power plants were constructed in the United States as one example.
Replacing these industrial systems is a pretty major undertaking so you may just be lucky enough to encounter one in the wild before they go extinct and gape in awe at their magnificent custom floating point formats!
tar files store information as an octal integer value string
There is no earthly reason to modify a standard that goes back to the birth of the language and which exists in untold numbers of programs. I still remember ASCII characters by their
octal values, would have to think to come up with the hex value of A, but it is 101 in octal; numeric 0 is 060... ^C is 003...
That is to say, I often use the octal representation.
Now if you really want to bend your mine, take a look at the word format for the PDP-10...
Anyone who learned to program on a PDP-8 has a warm spot in his heart for octal numbers. Word size was 12 bits divided into 4 groups of 3 bits each, so -1 was 7777 octal. This scheme was perpetuated in the PDP-11 which had 16 bit words but still used octal representation for various things, hence the *NIX file permission scheme which lives to this day.
Octal is and was most useful with the first available display hardware (7-segment displays). These original displays did not have the decoders available later.
Thus the digital register outputs were grouped to fit the available display which was capable of only displaying eight(8) symbols: 0,1,2 3,4,5,6,7 .
Also the first CRT display tubes were raster scan displays and simplest character-symbol generators were equivalent to the 7-segment displays.
The motivating driver was, as always, the least expensive display possible.