Is there any alternative to strtoull() function in C? - c

I need to convert a char* to unsigned long long int. The C standard library has a function called strtoull(), but it takes too much time. I need a quick conversion from char* to unsigned long long int. How can I write my own conversion function that is faster than the standard one?

Shortest/fastest code I can think of right now:
unsigned long long strtoull_simple(const char *s) {
unsigned long long sum = 0;
while (*s) {
sum = sum*10 + (*s++ - '0');
}
return sum;
}
No error checking. Profile to find if it improves performance. YMMV.
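If the missing error checking matters, a checked variant could look like the sketch below. The name strtoull_checked and its bool/out-parameter interface are assumptions made for illustration, not part of the answer, and the extra branches will likely cost some of the speed gained, so profile this too.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper (not from the answer): parse an unsigned decimal
 * string, rejecting empty input, non-digit characters, and overflow.
 * Returns true and stores the value in *out on success. */
bool strtoull_checked(const char *s, unsigned long long *out) {
    unsigned long long sum = 0;
    if (*s == '\0') {
        return false;                      /* empty string */
    }
    for (; *s != '\0'; s++) {
        if (*s < '0' || *s > '9') {
            return false;                  /* non-digit character */
        }
        unsigned digit = (unsigned)(*s - '0');
        if (sum > (ULLONG_MAX - digit) / 10) {
            return false;                  /* sum*10 + digit would overflow */
        }
        sum = sum * 10 + digit;
    }
    *out = sum;
    return true;
}
```

Unlike strtoull(), this sketch accepts no leading whitespace, sign, or base prefix; it only handles plain decimal digits.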
After accept: Tried a variation that does the initial calculation as unsigned before continuing on to unsigned long long. Marginal to negative improvements on my 64-bit machine depending on number set. Suspect it will be faster on machines where unsigned long long operations are expensive.
unsigned long long strtoull_simple2(const char *s) {
unsigned sumu = 0;
while (*s) {
sumu = sumu*10 + (*s++ - '0');
if (sumu >= (UINT_MAX-10)/10) break; // Break if next loop may overflow
}
unsigned long long sum = sumu;
while (*s) {
sum = sum*10 + (*s++ - '0');
}
return sum;
}
If code knows the length of the string then the following had some performance improvements (5%)
unsigned long long strtoull_2d(const char *s, unsigned len) {
unsigned sumu = 0;
#define INT_MAX_POWER_10 9
if (len > INT_MAX_POWER_10) {
len = INT_MAX_POWER_10;
}
while (len--) {
sumu = sumu * 10 + (*s++ - '0');
}
unsigned long long sum = sumu;
while (*s) {
sum = sum * 10 + (*s++ - '0');
}
return sum;
}
Conclusion: Improvements (I tried 7) on the simple original solution could yield small incremental speed gains, but they become more and more platform and data set dependent. I suggest that programming talent is better applied to higher-level code improvements.

The answer from #soerium, modified to use unsigned long long, gives better performance than strtoull().
unsigned long long fast_atoull(const char *str)
{
unsigned long long val = 0;
while(*str)
{
val = (val << 1) + (val << 3) + (*(str++) - 48);
}
return val;
}
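Before swapping a routine like this in for strtoull(), it may be worth checking that the two agree on the inputs you actually have. The sketch below repeats fast_atoull so it compiles on its own; agrees_with_strtoull is a hypothetical helper name, and the comparison is only meaningful for strings made purely of decimal digits that fit in unsigned long long.

```c
#include <stdlib.h>

/* Shift-and-add variant from the answer above, repeated here so the
 * block is self-contained. val*10 is computed as val*2 + val*8. */
unsigned long long fast_atoull(const char *str) {
    unsigned long long val = 0;
    while (*str) {
        val = (val << 1) + (val << 3) + (unsigned long long)(*str++ - '0');
    }
    return val;
}

/* Hypothetical check: returns 1 if fast_atoull and strtoull agree on s. */
int agrees_with_strtoull(const char *s) {
    return fast_atoull(s) == strtoull(s, NULL, 10);
}
```

The two functions differ on anything strtoull() treats specially: for example, strtoull() skips leading whitespace, while fast_atoull treats the space as if it were a digit and returns a meaningless value.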

Related

Unsigned Long Int overflow when calculating pow

I am trying to make a function that quickly calculates x^y mod z. It works well when calculating something like 2^63 mod 3, but at 2^64 mod 3 and higher exponents it just returns 0.
I am suspecting an overflow somewhere, but I can't pin it down. I have tried explicit casts at the places where calculations (* and mod) are made, and I have also made my storage variables (resPow, curPow) unsigned long long int (as suggested here), but that didn't help much.
typedef unsigned long int lint;
lint fastpow(lint nBase, lint nExp, lint nMod) {
int lastTrueBit = 0;
unsigned long long int resPow = 1ULL;
unsigned long long int curPow = nBase;
for (int i = 0; i < 32; i++) {
int currentBit = getBit(nExp, i);
if (currentBit == 1) {
for (lint j = 0; j < i - lastTrueBit; j++) {
curPow = curPow * curPow;
}
resPow =resPow * curPow;
lastTrueBit = i;
}
}
return resPow % nMod;
}
I am suspecting an overflow somewhere,
Yes, both curPow * curPow and resPow * curPow may mathematically overflow.
The usual way to contain overflow here is to perform mod on intermediate products.
// curPow = curPow * curPow;
curPow = (curPow * curPow) % nMod;
// resPow =resPow * curPow;
resPow = (resPow * curPow) % nMod;
This is sufficient when nMod < ULLONG_MAX/(nMod - 1). (The mod value is half the precision of unsigned long long). Otherwise more extreme measures are needed as in: Modular exponentiation without range restriction.
Minor stuff
for(int i = 0; i < 32; i++) assumes lint/unsigned long is 32 bits. Portable code would avoid that magic number. unsigned long is 64-bits on various platforms.
LL is not needed here. U remains useful to quiet various compiler warnings.
// unsigned long long int resPow = 1ULL;
unsigned long long int resPow = 1U;
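Putting the pieces together, here is a minimal sketch of modular exponentiation that reduces every intermediate product. It uses the plain right-to-left square-and-multiply loop rather than the question's nested loop, and the name modpow32 and the 32-bit modulus parameter are assumptions chosen so that the product of two reduced values always fits in 64 bits. The modulus is assumed to be nonzero.

```c
#include <stdint.h>

/* Hypothetical helper: computes (base^exp) mod mod.
 * Assumes mod != 0. Because mod fits in 32 bits, every reduced value is
 * below 2^32, so result*cur and cur*cur cannot overflow a uint64_t. */
uint32_t modpow32(uint32_t base, uint64_t exp, uint32_t mod) {
    if (mod == 1) {
        return 0;                          /* everything is 0 modulo 1 */
    }
    uint64_t result = 1;
    uint64_t cur = base % mod;             /* base^(2^k) mod mod, k = 0 */
    while (exp > 0) {
        if (exp & 1) {
            result = (result * cur) % mod; /* fold in this bit's power */
        }
        cur = (cur * cur) % mod;           /* square for the next bit */
        exp >>= 1;
    }
    return (uint32_t)result;
}
```

Looping while exp is nonzero also avoids hard-coding the bit width of the exponent type.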

Converting a string of chars into its decimal value then back to its character value

unsigned long long int power(int base, unsigned int exponent)
{
if (exponent == 0)
return 1;
else
return base * power(base, exponent - 1);
}
I am working on a program where I need to take in a string of 8 characters (e.g. "I want t") then convert this into a long long int in the pack function. I have the pack function working fine.
unsigned long long int pack(char unpack[])
{
/*converting string to long long int here
didn't post code because its large*/
}
After I enter "I want t" I get "Value in Decimal = 5269342824372117620" and then I send the decimal to the unpack function. So I need to convert 5269342824372117620 back into "I want t". I tried bit manipulation, which was unsuccessful. Any help would be greatly appreciated.
void unpack(long long int pack)
{
long long int bin;
char convert[100];
for(int i = 63, j = 0, k = 0; i >= 0; i--,j++)
{
if((pack & (1 << i)) != 0)
bin += power(2,j);
if(j % 8 == 0)
{
convert[k] = (char)bin;
bin = 0;
k++;
j = -1;
}
}
printf("String: %s\n", convert);
}
A simple solution for your problem is to consider the characters in the string to be digits in a large base that encompasses all possible values. For example base64 encoding can convert strings of 8 characters to 48-bit numbers, but you can only use a subset of at most 64 different characters in the source string.
To convert any 8 byte string into a number, you must use a base of at least 256.
Given your extra input, After I enter "I want t" I get "Value in Decimal = 5269342824372117620", and since 5269342824372117620 == 0x492077616e742074, you do indeed use base 256, big-endian order and ASCII encoding for the characters.
Here is a simple portable pack function for this method:
unsigned long long pack(const char *s) {
unsigned long long x = 0;
int i;
for (i = 0; i < 8; i++) {
x = x * 256 + (unsigned char)s[i];
}
return x;
}
The unpack function is easy to derive: compute the remainders of divisions in the reverse order:
char *unpack(char *s, unsigned long long x) {
/* s is assumed to have a length of at least 9 */
int i;
for (i = 8; i-- > 0; ) {
s[i] = x % 256;
x = x / 256;
}
s[8] = '\0'; /* set the null terminator */
return s;
}
For a potentially faster but less portable solution, you could use this, but you would get a different conversion on little-endian systems such as current Macs and PCs:
#include <string.h>
unsigned long long pack(const char *s) {
unsigned long long x;
memcpy(&x, s, 8);
return x;
}
char *unpack(char *s, unsigned long long x) {
memcpy(s, &x, 8);
s[8] = '\0';
return s;
}
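The base-256, big-endian round trip can also be written with shifts on a fixed-width type, which gives the same result on any host byte order. This is a sketch under assumptions: pack_be and unpack_be are hypothetical names, the input must contain at least 8 characters, and the destination must have room for 9.

```c
#include <stdint.h>

/* Hypothetical helper: pack the first 8 bytes of s, most significant
 * byte first. s must point to at least 8 readable characters. */
uint64_t pack_be(const char *s) {
    uint64_t x = 0;
    for (int i = 0; i < 8; i++) {
        x = (x << 8) | (unsigned char)s[i];
    }
    return x;
}

/* Hypothetical helper: inverse of pack_be. dest must have room for
 * 9 characters (8 bytes plus the null terminator). */
char *unpack_be(char *dest, uint64_t x) {
    for (int i = 7; i >= 0; i--) {
        dest[i] = (char)(x & 0xFF);        /* lowest byte goes last */
        x >>= 8;
    }
    dest[8] = '\0';
    return dest;
}
```

With the question's input, pack_be("I want t") yields 5269342824372117620 (0x492077616e742074), and unpack_be turns that value back into "I want t".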

C how to convert string of unsigned long long to uint32_t[]?

Is there any short code to convert a string holding an unsigned long long value to uint32_t[]?
eg, 11767989860 => uint32_t[] {0xaaaa, 0xbbbb}?
Ignoring overflow, this gives you an unsigned long long (which I believe is 64-bit, not 32):
unsigned long long r = 0;
while (*pstr >= '0' && *pstr <= '9') r = r * 10 + (*pstr++ - '0');
You have 11767989860 in string form, and you want to break it into the integer array {0x2,0xBD6D4664}.
You can first convert string to long long, then copy 4 bytes from that long long variable into your integer array.
Below is the sample program
#include <stdio.h>
#include <stdint.h>
#include <string.h> /* memcpy */
int main()
{
unsigned long long ll = 0;
uint32_t arr[2];
char str[]="11767989860";
char *tmpPtr = (char *)&ll; /* byte-wise view of ll */
sscanf(str,"%llu",&ll);
printf("ll=%llu",ll);
/* On a big-endian host a straight copy yields {high word, low word} */
memcpy(arr,&ll,sizeof(ll));
printf("\n%u %u\n",arr[0],arr[1]);
/* On a little-endian host the two halves must be swapped */
memcpy(&arr[0],&tmpPtr[4],sizeof(ll)/2);
memcpy(&arr[1],&tmpPtr[0],sizeof(ll)/2);
printf("\n%u %u\n",arr[0],arr[1]);
return 0;
}
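An alternative that does not depend on the host's byte order is to split the value with shifts and masks instead of memcpy. The sketch below assumes the string holds a valid decimal number that fits in 64 bits; split_decimal_u64 is a hypothetical name, and parsing is done with strtoull() without error checking.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: parse a decimal string and store the high
 * 32 bits in out[0] and the low 32 bits in out[1]. */
void split_decimal_u64(const char *str, uint32_t out[2]) {
    unsigned long long v = strtoull(str, NULL, 10);
    out[0] = (uint32_t)(v >> 32);            /* high word */
    out[1] = (uint32_t)(v & 0xFFFFFFFFu);    /* low word */
}
```

For "11767989860" this gives out[0] == 0x2 and out[1] == 0xBD6D4664 on both little-endian and big-endian machines.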

C bit manipulation DES permute

I was having trouble implementing the DES algorithm in Python, so I thought I'd switch to C. But I've run into an issue that I haven't been able to fix in hours; hopefully you can help me. Here's the source:
int PI[64] = {58,50,42,34,26,18,10,2,
60,52,44,36,28,20,12,4,
62,54,46,38,30,22,14,6,
64,56,48,40,32,24,16,8,
57,49,41,33,25,17,9,1,
59,51,43,35,27,19,11,3,
61,53,45,37,29,21,13,5,
63,55,47,39,31,23,15,7};
unsigned long getBit(unsigned long mot, unsigned long position)
{
unsigned long temp = mot >> position;
return temp & 0x1;
}
void setBit(unsigned long* mot, int position, unsigned long value)
{
unsigned long code = *mot;
code ^= (-value ^ code) & (1 << position);
*mot = code;
}
void permute( unsigned long * mot, int * ordre, int taille )
{
unsigned long res;
int i = 0;
unsigned long bit;
for (i = 0; i < taille; i++)
{ setBit(&res, i, getBit(*mot, ordre[i] - 1)); }
*mot = res;
}
int main(int argc, char *argv[])
{
unsigned long bloc = 0x0123456789ABCDEF;
permute(&bloc, PI, 64);
printf(" end %lx\n", bloc);
return 1;
}
I made this permutation manually and with my Python program, and the result of this permutation should be 0xcc00ccfff0aaf0aa but I get 0xffffffffcc00ccff (which is, somehow, half correct and half broken). What is going on? How to fix this?
I added UL at the end of my hex word, and I used uint64_t instead of unsigned long int. When I changed -value, I got either fffffffffffffff or 0, but with UL and uint64_t I'm getting the correct result, which probably means, as you guys suggested, that my unsigned longs were not 64-bit longs. Thanks !
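For reference, the fix described above can be sketched as follows. The key changes are using uint64_t throughout and shifting a 64-bit one (UINT64_C(1)) rather than the int constant 1, since shifting an int by 32 or more is undefined. The names get_bit, set_bit, and permute64 are hypothetical, and set_bit is written with a plain mask instead of the question's XOR trick.

```c
#include <stdint.h>

/* DES initial permutation table from the question (1-based positions). */
const int PI[64] = {58,50,42,34,26,18,10,2,
                    60,52,44,36,28,20,12,4,
                    62,54,46,38,30,22,14,6,
                    64,56,48,40,32,24,16,8,
                    57,49,41,33,25,17,9,1,
                    59,51,43,35,27,19,11,3,
                    61,53,45,37,29,21,13,5,
                    63,55,47,39,31,23,15,7};

/* Return bit `position` (0 = least significant) of word. */
uint64_t get_bit(uint64_t word, int position) {
    return (word >> position) & 1u;
}

/* Return word with bit `position` set to value (0 or 1). */
uint64_t set_bit(uint64_t word, int position, uint64_t value) {
    uint64_t mask = UINT64_C(1) << position;   /* 64-bit shift, not int */
    return value ? (word | mask) : (word & ~mask);
}

/* Bit i of the result is bit order[i]-1 of word, as in the question. */
uint64_t permute64(uint64_t word, const int *order, int size) {
    uint64_t res = 0;                          /* initialized, unlike the original */
    for (int i = 0; i < size; i++) {
        res = set_bit(res, i, get_bit(word, order[i] - 1));
    }
    return res;
}
```

Calling permute64(0x0123456789ABCDEF, PI, 64) returns 0xcc00ccfff0aaf0aa, the value the question expects. Printing a uint64_t portably needs PRIx64 from inttypes.h rather than %lx.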

C what is happening when printf converts long long to %o (unsigned int)

I'm trying to replicate printf because I'm not allowed to use the real one in assignments, and I don't understand what is happening when I pass it a value too large:
unsigned int n = 4294967286;
printf("%o", n); #=> 37777777766
my_printf("%o", n); #=> 4256507006;
I'm getting the value like that:
a = (unsigned int)(va_arg(f->l, unsigned int));
Then I'm using my ui_to_s to get the corresponding string:
char *ui_to_s_base(unsigned long long n, int base, const char *base_set)
{
const char *defaut_base = "0123456789abcdef";
char *res;
char *tmp;
unsigned long long i;
tmp = str_new_size(256);
i = 0;
tmp[i++] = base_set ? base_set[(n % base)] : defaut_base[(n % base)];
while ((n /= 10) > 0)
tmp[i++] = base_set ? base_set[(n % base)] : defaut_base[(n % base)];
tmp[i] = '\0';
res = str_reverse(tmp);
free(tmp);
return (res);
}
Am I doing something wrong?
while ((n /= 10) > 0)
You should be dividing by base. 10 will work like a charm just as long as you ask it to print in decimal.
I guess the moral of the story here is that if you have a bug in some code that contains a nontrivial constant, that is probably a good place to start investigating.
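A corrected conversion loop, dividing by base at every step, might look like the sketch below. It is not the asker's function: ull_to_base is a hypothetical name, it writes into a caller-supplied buffer instead of using the question's str_new_size and str_reverse helpers, and it only supports bases 2 through 16.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: write n in the given base (2..16) into out.
 * Returns out, or NULL if the base is unsupported or out is too small. */
char *ull_to_base(unsigned long long n, unsigned base, char *out, size_t out_size) {
    static const char digits[] = "0123456789abcdef";
    char tmp[sizeof(n) * CHAR_BIT + 1];    /* enough for base 2 plus '\0' */
    size_t pos = sizeof tmp;

    if (base < 2 || base > 16) {
        return NULL;
    }
    tmp[--pos] = '\0';
    do {                                   /* emits least significant digit first */
        tmp[--pos] = digits[n % base];
        n /= base;                         /* divide by base, not by 10 */
    } while (n > 0);

    size_t len = sizeof tmp - pos;         /* digit count plus terminator */
    if (len > out_size) {
        return NULL;
    }
    memcpy(out, tmp + pos, len);
    return out;
}
```

Filling the temporary buffer from the end removes the need for a separate reverse step. With n = 4294967286 and base 8 this produces "37777777766", matching the printf("%o") output in the question.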
