64 bit operations - c

I'm writing code for a primality-testing function that handles long long ints. Do I have to use special operators for such large numbers? Is there any documentation concerning large-number manipulation in C? I'm using the GNU standard library. Thanks.

No, you don't need to do anything special. You handle a long long int the same way you would handle an int. Just beware of overflows, as with every native integer type.

If long long ints are supported by your compiler, you don't have to do anything special. If your processor doesn't support 64-bit types (it's probably a 32-bit processor, then), the compiler will emulate the feature with sequences of assembly code that break each 64-bit operation into 32-bit ones.

long long is new in C99, though many compilers have supported that as an extension before that.
With GCC, a long long is 64 bits; you can use it like any other integer type, and nothing special is required.
There are a couple of things to be aware of, though. Integer constants in the source code need the LL suffix (or ULL if unsigned) to be safe across compilers; strictly, C99 gives an unsuffixed decimal constant the first of int, long, and long long that can hold it, but compilers in C90 mode will not go beyond long. So write
long long foo = 123412341234123LL;
and not
long long foo = 123412341234123;
Similarly, for outputting a long long with the printf family, you have to use the conversion specifier "%lld" instead of "%d" or "%ld" (or "%llu" if it's unsigned), e.g.
printf("foo = %lld",foo);
There are some docs about long long in GCC here

If the compiler supports long long int, it works with standard operators.
By the way, on 64-bit Unices (the LP64 model, where long alone is already 64 bits), long long is also 64 bits, not wider. Use int64_t from <stdint.h> if you need exactly 64 bits on all platforms. This matters on 64-bit Windows (LLP64), where long is still 32 bits and only long long is 64 bits.
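For instance, a minimal sketch of the fixed-width route (the value here is just an arbitrary example):
#include <stdint.h>
#include <inttypes.h>
#include <stdio.h>

int main(void)
{
    int64_t x = INT64_C(600851475143);  /* INT64_C gives the constant the right type */
    printf("x = %" PRId64 "\n", x);     /* PRId64 expands to the matching conversion spec */
    return 0;
}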

If you are just handling long long int, you don't need anything special as long as your compiler supports it. Take care of overflows while adding and multiplying two long long ints.
For handling very large numbers (range much greater than that of long long int), have a look at the GNU MP Bignum Library.
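To make the overflow point concrete, here is a minimal trial-division sketch (an illustration, not code from the question). Note the loop condition i <= n / i, which is equivalent to i * i <= n but cannot overflow near the top of the unsigned long long range:
#include <stdbool.h>

bool is_prime(unsigned long long n)
{
    if (n < 2)
        return false;
    if (n % 2 == 0)
        return n == 2;
    /* i <= n / i avoids computing i * i, which could overflow */
    for (unsigned long long i = 3; i <= n / i; i += 2)
        if (n % i == 0)
            return false;
    return true;
}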

Have a look at the GMP library: http://gmplib.org/

Related

size_t is unsigned long long under 64 bit system? [duplicate]

I notice that modern C and C++ code seems to use size_t instead of int/unsigned int pretty much everywhere - from parameters for C string functions to the STL. I am curious as to the reason for this and the benefits it brings.
The size_t type is the unsigned integer type that is the result of the sizeof operator (and the offsetof operator), so it is guaranteed to be big enough to contain the size of the biggest object your system can handle (e.g., a static array of 8 GB).
The size_t type may be bigger than, equal to, or smaller than an unsigned int, and your compiler might make assumptions about it for optimization.
You may find more precise information in the C99 standard, section 7.17, a draft of which is available on the Internet in pdf format, or in the C11 standard, section 7.19, also available as a pdf draft.
Classic C (the early dialect of C described by Brian Kernighan and Dennis Ritchie in The C Programming Language, Prentice-Hall, 1978) didn't provide size_t. The C standards committee introduced size_t to eliminate a portability problem.
Explained in detail at embedded.com (with a very good example)
In short, size_t is never negative, and it maximizes performance because it's typedef'd to be the unsigned integer type that's big enough -- but not too big -- to represent the size of the largest possible object on the target platform.
Sizes should never be negative, and indeed size_t is an unsigned type. Also, because size_t is unsigned, you can store numbers roughly twice as big as in the corresponding signed type, because the sign bit is used to represent magnitude like all the other bits in the unsigned integer. Gaining one more bit multiplies the range of numbers we can represent by a factor of about two.
So, you ask, why not just use an unsigned int? It may not be able to hold big enough numbers. On an implementation where unsigned int is 16 bits, the biggest number it can represent is 65535, yet some platforms, such as an I16LP32 (16-bit int, 32-bit long and pointers), can copy objects larger than 65535 bytes.
So, you ask, why not use an unsigned long int? It exacts a performance toll on some platforms. Standard C requires that a long occupy at least 32 bits. An IP16L32 platform implements each 32-bit long as a pair of 16-bit words. Almost all 32-bit operators on these platforms require two instructions, if not more, because they work with the 32 bits in two 16-bit chunks. For example, moving a 32-bit long usually requires two machine instructions -- one to move each 16-bit chunk.
Using size_t avoids this performance toll. According to this fantastic article, "Type size_t is a typedef that's an alias for some unsigned integer type, typically unsigned int or unsigned long, but possibly even unsigned long long. Each Standard C implementation is supposed to choose the unsigned integer that's big enough--but no bigger than needed--to represent the size of the largest possible object on the target platform."
The size_t type is the type returned by the sizeof operator. It is an unsigned integer capable of expressing the size in bytes of any memory range supported on the host machine. It is (typically) related to ptrdiff_t in that ptrdiff_t is a signed integer value such that sizeof(ptrdiff_t) and sizeof(size_t) are equal.
When writing C code you should always use size_t whenever dealing with memory ranges.
The int type, on the other hand, is basically defined as the size of the (signed) integer value that the host machine can use to perform integer arithmetic most efficiently. For example, on many older PC-type computers the value of sizeof(size_t) would be 4 (bytes) but sizeof(int) would be 2 (bytes): 16-bit arithmetic was faster than 32-bit arithmetic, though the CPU could handle a (logical) memory space of up to 4 GiB.
Use the int type only when you care about efficiency, as its actual precision depends strongly on both compiler options and machine architecture. In particular, the C standard specifies the following invariant: sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long), placing no other limitation on the actual precision available to the programmer for each of these primitive types.
Note: This is NOT the same as in Java (which actually specifies the bit precision for each of the types 'char', 'byte', 'short', 'int' and 'long').
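A small sketch of the size_t/ptrdiff_t pairing described above (%zu and %td are the C99 conversion specifiers for the two types):
#include <stdio.h>
#include <stddef.h>

int main(void)
{
    int a[10];
    int *p = &a[7], *q = &a[2];
    size_t n = sizeof a;     /* object sizes are size_t (unsigned) */
    ptrdiff_t d = p - q;     /* pointer differences are ptrdiff_t (signed) */
    printf("n = %zu, d = %td\n", n, d);
    return 0;
}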
Type size_t must be big enough to store the size of any possible object. Unsigned int doesn't have to satisfy that condition.
For example, on 64-bit systems int and unsigned int may be 32 bits wide, but size_t must be big enough to store sizes bigger than 4 GB.
This excerpt from the glibc manual 0.02 may also be relevant when researching the topic:
There is a potential problem with the size_t type and versions of GCC prior to release 2.4. ANSI C requires that size_t always be an unsigned type. For compatibility with existing systems' header files, GCC defines size_t in `stddef.h' to be whatever type the system's `sys/types.h' defines it to be. Most Unix systems that define size_t in `sys/types.h' define it to be a signed type. Some code in the library depends on size_t being an unsigned type, and will not work correctly if it is signed.
The GNU C library code which expects size_t to be unsigned is correct. The definition of size_t as a signed type is incorrect. We plan that in version 2.4, GCC will always define size_t as an unsigned type, and the `fixincludes' script will massage the system's `sys/types.h' so as not to conflict with this.
In the meantime, we work around this problem by telling GCC explicitly to use an unsigned type for size_t when compiling the GNU C library. `configure' will automatically detect what type GCC uses for size_t and arrange to override it if necessary.
If my compiler is set to 32 bit, size_t is nothing other than a typedef for unsigned int. If my compiler is set to 64 bit, size_t is nothing other than a typedef for unsigned long long.
size_t typically has the same width as a pointer.
So in 32 bits, or the common ILP32 (int, long, pointer) model, size_t is 32 bits,
and in 64 bits, or the common LP64 (long, pointer) model, size_t is 64 bits (int is still 32 bits).
There are other models, but these are the ones that g++ uses (at least by default).
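You can check which model your toolchain uses with a sketch like this (on ILP32 both values print as 4, on LP64 as 8):
#include <stdio.h>

int main(void)
{
    printf("sizeof(void *) = %zu\n", sizeof(void *));
    printf("sizeof(size_t) = %zu\n", sizeof(size_t));
    return 0;
}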

What is the difference between long int and long long int? [duplicate]

What's the difference between long long and long? And neither of them works with 12-digit numbers (600851475143); am I forgetting something?
#include <iostream>
using namespace std;
int main() {
    long long a = 600851475143;
}
Going by the standard, all that's guaranteed is:
int must be at least 16 bits
long must be at least 32 bits
long long must be at least 64 bits
On major 32-bit platforms:
int is 32 bits
long is 32 bits as well
long long is 64 bits
On major 64-bit platforms:
int is 32 bits
long is either 32 or 64 bits
long long is 64 bits as well
If you need a specific integer size for a particular application, rather than trusting the compiler to pick the size you want, #include <stdint.h> (or <cstdint>) so you can use these types:
int8_t and uint8_t
int16_t and uint16_t
int32_t and uint32_t
int64_t and uint64_t
You may also be interested in #include <stddef.h> (or <cstddef>):
size_t
ptrdiff_t
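To see what your own platform picked, a quick sketch (written in C; the same sizeof expressions work in C++):
#include <stdio.h>

int main(void)
{
    printf("int:       %zu bytes\n", sizeof(int));
    printf("long:      %zu bytes\n", sizeof(long));
    printf("long long: %zu bytes\n", sizeof(long long));
    return 0;
}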
long long does not exist in C++98/C++03, but does exist in C99 and in C++11 (formerly C++0x).
long is guaranteed at least 32 bits.
long long is guaranteed at least 64 bits.
To elaborate on #ildjarn's comment:
And they both don't work with 12 digit numbers (600851475143), am I forgetting something?
The compiler looks at the literal value 600851475143 without considering the variable that you're assigning it to/initializing it with. You've written it as an int typed literal, and it won't fit in an int.
Use 600851475143LL to get a long long typed literal.
If your C++ compiler supports long long, it is guaranteed to be at least 64 bits by the C99 standard (that's a C standard, not a C++ standard). See the Visual C++ header files to get the ranges on your system.
Recommendation
For new programs, it is recommended that one use only bool, char, int, and double, until circumstance arises that one of the other types is needed.
http://www.somacon.com/p111.php
Depends on your compiler. long long is 64 bits and should handle 12 digits. It looks like in your case the literal is being treated as a long, and hence 12 digits are not handled.

Program not recognizing the variable as long int nor as unsigned int

Instead of getting the number "2147483648" I get "-2147483648" because of a signed int overflow. I tried declaring the variable both as long int and as unsigned int, with no luck; it is just not recognized as either of those types. If anyone is wondering, I'm left-shifting the value.
int multiplier = 1, i;
long int mpy = 0;
for (i = 32; i >= 0; i--) {
    mpy = 1 << multiplier++;
    printf("mpy = %d\n", mpy);
}
Since the constant 1 is an int, when shifted left, it remains an int. If you want an unsigned long long, make it such:
unsigned long long mpy = 1ULL << multiplier++;
You could use one of the suffixes L or UL or LL for long, unsigned long and long long instead (and lower-case versions of these, but the suffix is best written in upper-case to avoid confusion of l and 1). The choice depends on what you're really trying to do.
Note that the type of the result of << is the type of the left-hand operand. The result of the shift is only subsequently converted to the type of the left-hand side of the assignment operator. The LHS of the assignment does not affect how the value on the RHS is calculated.
As user3528438 pointed out in a comment, and as I assumed (perhaps mistakenly) you would know — if multiplier (the RHS of the << operator) evaluates to a negative value or a value equal to or larger than the number of bits in the integer type, then you invoke undefined behaviour.
Note that long long and unsigned long long are standard in the decade-and-a-half old standard (C99) and the newer C11 standard — but they were not part of the quarter-century old C89/C90 standard. If you're stuck on a platform where the compiler is in a time-warp — release date in the 201x decade and C standard compatibility date of 1990 — then you have to go with alternative platform-specific 64-bit techniques. The loop in the updated question covers 33 values since you count from 32 down to and including 0. No 32-bit type will have distinct values for each of the 33 shifts.
(Advanced users might be interested in INT35-C Use correct integer precisions and N1899 — Integer precision bits update; they're a tad esoteric for most people as yet. I'm not sure whether I'll ever find it necessary to worry about the issue raised.)
Note also the discussion in the comments below about printf() formats. You should make sure you print the value with the correct format. For long int, that should be %ld; for unsigned long long, that would be %llu. Other types require other formats. Make sure you're using sensible compiler warning options. If you're using GCC, you should look at gcc -Wall -Wextra -Werror -std=c11 as a rather effective set of options; I use slightly more stringent options than even those when I'm compiling C code.
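Putting the advice together, a corrected version of the loop might look like this (a sketch, assuming the goal is simply to print successive powers of two):
#include <stdio.h>

int main(void)
{
    unsigned long long mpy;
    /* 1ULL makes the shifted operand unsigned long long, so the result
       is computed in 64 bits; %llu matches the operand type. */
    for (int shift = 1; shift <= 33; shift++) {
        mpy = 1ULL << shift;
        printf("mpy = %llu\n", mpy);
    }
    return 0;
}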
Depending on what compiler you are using, and if you are compiling in 32-bit vs. 64-bit mode, what you are seeing could be exactly as expected.
https://software.intel.com/en-us/articles/size-of-long-integer-type-on-different-architecture-and-os
tl;dr: with MSVC, both int and long are 32 bits; you need to graduate up to __int64 if you want to store a bigger number. With GCC or other compilers in 32-bit mode, you run into the same issue of int = long = 32 bits, which doesn't help you in your situation. Only when you make the move to 64-bit compilation on non-Microsoft compilers do int and long start to diverge.
edit per comments section:
int64_t or long long would also be standards-compliant types that could be used. Alternatively, unsigned would allow the poster to fit their value into 32 bits.

Architectures/ABIs where sizeof(long long) != 8

In the x86/amd64 world, sizeof(long long) is 8.
Let me quote a quite insightful 8-year-old mail by Zack Weinberg:
Scott Robert Ladd writes:
On a 64-bit AMD64 architecture, GCC defines long long as 64 bits, the same as a long. Given that certain 64-bit instructions (multiply) produce 128-bit results, doesn't it seem logical that long long be defined as 128 bits?
No, for two reasons:
1. The choice of 64-bit 'long long' has been written into the ABI of most LP64-model operating systems; we can't unilaterally change it.
2. This is actually the correct choice, as it removes the aberration that makes 'long' not the widest basic integral type. There is lots and lots of code in the wild written to the assumption that sizeof(long) >= sizeof(size_t) - this is at least potentially broken by ABIs where long long is wider than long.
(This was an extremely contentious topic during the development of C99. As best as I can tell from an outside perspective, 'long long' was only standardized due to pressure from Microsoft who can't for some reason implement an LP64 model. Everyone else hated the idea of making 'long' not necessarily the widest basic integral type.)
Best current practice appears to be to provide an "extended integral type" __int128. This doesn't have the problems of 'long long' because it's not a basic integral type (in particular, it cannot be used for size_t).
zw
long long is the widest basic integral type. It's 64 bits wide on any architecture/ABI I know of that isn't long dead. This allows for simple cross-platform (well, at least for many 32/64-bit architectures) typedefs:
typedef char s8;
typedef unsigned char u8;
typedef short s16;
typedef unsigned short u16;
typedef int s32;
typedef unsigned int u32;
typedef long long s64;
typedef unsigned long long u64;
that are nicer than the intXX_t types, because:
they use the same underlying type for 64-bit integers on different platforms
they avoid the verbose PRId64/PRIu64 format macros
(I am well aware that Visual C++ has supported %lld/%llu only since 2005)
But how portable this solution is can be expressed by answers to the following question.
What are the architectures/ABIs where sizeof(long long) != 8?
If you cannot provide any recent/modern ones, then go ahead with the old ones, but only if they are still in use.
The TI TMS320C55x architecture has CHAR_BIT equal to 16 and a 40-bit long long. Although the 40-bit long long violates ISO C (which requires at least 64 bits), its sizeof (long long) is indeed different from 8.
Actually nearly all the C99 implementations with CHAR_BIT > 8 have sizeof (long long) != 8.
TMS320C55x Optimizing C/C++ Compiler User’s Guide (2003)
http://www.ti.com/lit/ug/spru281f/spru281f.pdf
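If your code depends on long long being exactly 64 bits, a C11 compile-time check (a defensive sketch, not from the answer above) will catch targets like this one:
#include <limits.h>

/* Fails to compile where long long does not occupy exactly 64 bits,
   e.g. DSPs with CHAR_BIT == 16. Note this measures storage bits, so
   padding bits could still hide a narrower value range on exotic targets. */
_Static_assert(sizeof(long long) * CHAR_BIT == 64,
               "long long does not occupy 64 bits on this target");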
"(This was an extremely contentious topic during the development of C99. As best as I can tell from an outside perspective, 'long long' was only standardized due to pressure from Microsoft who can't for some reason implement an LP64 model. Everyone else hated the idea of making 'long' not necessarily the widest basic integral type.)"
It was contentious among Unix vendors, not because of Microsoft. Microsoft had a lot of I16LP32 code, where the only 32-bit integer type was long, so they plausibly did not want to change that.
Unix and other vendors had ILP32, or, like Amdahl, Convex, and others, ILP32LL64, and so needed a 64-bit data type, just as PDP-11s became IP16L32 in the mid-1970s to get a 32-bit data type instead of int X[2].
For the detailed history, here's a 2006 article in ACM Queue, later reprinted in the 2009 CACM. See especially Table 1.
"The Long Road to 64 Bits
Double, double, toil and trouble"
https://queue.acm.org/detail.cfm?id=1165766
Also, if you read the C99 Rationale, note that I wrote the rationale for long long.
In the meetings described in the paper, we were split between:
IL32LLP64 - leave long as 32-bit
ILP64 - make int and long 64 bits, introduce a new type for 32
LP64 - make long 64 bits, leave int at 32
There were good arguments for each, but the argument was effectively settled by the fact that the first two companies shipping 64-bit micros both went LP64.
Your "cross-platform" typedefs are just misguided. The correct ones are
#include <stdint.h>
typedef int8_t s8;
typedef uint8_t u8;
typedef int16_t s16;
...

What kind of data type is "long long"?

I don't know this type. Is it the biggest one of all? I think it is an integer type, right? Or is it a floating-point thing, bigger than double?
According to the C99 standard, long long is an integer type that is at least 64 bits wide. Two such types are specified: long long int and unsigned long long int.
So, yes, this is the biggest integer type specified by the C language standard (C99 version).
There is also the long double type specified by C99. It's an extended-precision floating-point type, 80 bits wide on the most popular x86-based platforms and implementations of C.
The short and simple answer is that long long is an integer type that is at least 64 bits wide. The rationale for this is here. Basically, it is a response to 64-bit architectures and to backwards compatibility, and the name long long was deemed the least bad of all the possibilities by the standards committee.
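To see the actual range on a given implementation, <limits.h> provides the bounds directly (a small sketch):
#include <stdio.h>
#include <limits.h>

int main(void)
{
    printf("long long:          %lld .. %lld\n", LLONG_MIN, LLONG_MAX);
    printf("unsigned long long: 0 .. %llu\n", ULLONG_MAX);
    return 0;
}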
