I'm working on a Ruby native method in C: power mod (modular exponentiation). Here's what I have:
#define TO_BIGNUM(x) (FIXNUM_P(x) ? rb_int2big(FIX2LONG(x)) : x)
#define CONST2BIGNUM(x) (TO_BIGNUM(INT2NUM(x)))
VALUE method_big_power_mod(VALUE self, VALUE base, VALUE exp, VALUE mod){
    VALUE res = TO_BIGNUM(INT2NUM(1));
    base = TO_BIGNUM(base);
    exp = TO_BIGNUM(exp);
    mod = TO_BIGNUM(mod);
    while (rb_big_cmp(exp, CONST2BIGNUM(0))) {
        if (rb_big_modulo(exp, CONST2BIGNUM(2))) {
            VALUE mul = rb_big_mul(res, base);
            res = rb_big_modulo(mul, mod);
        }
        base = rb_big_modulo(rb_big_pow(base, CONST2BIGNUM(2)), mod);
        exp = rb_big_div(exp, CONST2BIGNUM(2));
    }
    return res;
}
It segfaults every time. I isolated the problem to the rb_big_modulo calls. The gdb stack trace shows that it crashes in bigdivrem after the call to rb_big_modulo. I tried to look through the source of bignum.c, but I can't figure out what's causing the crash. Am I doing something wrong?
There are two problems that are causing the segfault:
1 - The rb_big_* functions don't always return a Bignum object, but when you call them the first argument must be a Bignum. For example:
if (rb_big_modulo(exp, CONST2BIGNUM(2))) {
    VALUE mul = rb_big_mul(res, base); // This may return a Fixnum
    res = rb_big_modulo(mul, mod);     // This will cause a segfault :(
}
2 - When you call rb_big_pow with both arguments Bignums, it emits a warning and returns a Float, which can't easily be converted back to a Bignum. So replace the line that calls it with:
    VALUE x = TO_BIGNUM(rb_big_pow(base, INT2NUM(2))); // square with a Fixnum exponent instead of a Bignum
    base = TO_BIGNUM(rb_big_modulo(x, mod));
The final implementation will be:
#define TO_BIGNUM(x) (FIXNUM_P(x) ? rb_int2big(FIX2LONG(x)) : x)
#define CONST2BIGNUM(x) (TO_BIGNUM(INT2NUM(x)))
VALUE method_big_power_mod(VALUE self, VALUE base, VALUE exp, VALUE mod){
    VALUE res = TO_BIGNUM(INT2NUM(1));
    base = TO_BIGNUM(base);
    exp = TO_BIGNUM(exp);
    mod = TO_BIGNUM(mod);
    // Compare against INT2FIX(0): rb_big_cmp and rb_big_modulo return
    // Ruby VALUEs, and even a boxed 0 is nonzero as a C value.
    while (rb_big_cmp(exp, CONST2BIGNUM(0)) != INT2FIX(0)) {
        if (rb_big_modulo(exp, CONST2BIGNUM(2)) != INT2FIX(0)) {
            VALUE mul = TO_BIGNUM(rb_big_mul(res, base));
            res = TO_BIGNUM(rb_big_modulo(mul, mod));
        }
        VALUE x = TO_BIGNUM(rb_big_pow(base, INT2NUM(2)));
        base = TO_BIGNUM(rb_big_modulo(x, mod));
        exp = TO_BIGNUM(rb_big_div(exp, CONST2BIGNUM(2)));
    }
    return res;
}
I don't know the performance impact of all these conversions. Note also that the TO_BIGNUM macro evaluates its argument more than once, so a wrapped rb_big_* call runs twice; you could test whether each value is a Fixnum or a Bignum and call the proper function for each case, or benchmark both approaches.
When I first ran it I got an infinite loop: the original conditions while (rb_big_cmp(...)) and if (rb_big_modulo(...)) test the returned VALUE itself, and even a boxed zero (INT2FIX(0)) is nonzero as a C value, so they were always true. That is why the code above compares the results against INT2FIX(0) explicitly.
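For completeness, a minimal sketch of how the function might be registered so it can be exercised from Ruby; the extension and module names here are hypothetical, not from the original question:

#include <ruby.h>

VALUE method_big_power_mod(VALUE self, VALUE base, VALUE exp, VALUE mod);

// Hypothetical init function for an extension built as power_mod.so:
// exposes PowerMod.big_power_mod(base, exp, mod).
void Init_power_mod(void)
{
    VALUE mPowerMod = rb_define_module("PowerMod");
    rb_define_module_function(mPowerMod, "big_power_mod",
                              RUBY_METHOD_FUNC(method_big_power_mod), 3);
}

From Ruby you can then sanity-check it against the built-in modular pow, e.g. PowerMod.big_power_mod(12345, 67890, 997) == 12345.pow(67890, 997) (the two-argument Integer#pow exists since Ruby 2.4).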
I am trying to convert the code below to Swift:
{
    // Set up the variables
    double totalUsedMemory = 0.00;
    mach_port_t host_port;
    mach_msg_type_number_t host_size;
    vm_size_t pagesize;

    // Get the variable values
    host_port = mach_host_self();
    host_size = sizeof(vm_statistics_data_t) / sizeof(integer_t);
    host_page_size(host_port, &pagesize);
    vm_statistics_data_t vm_stat;

    // Check for any system errors
    if (host_statistics(host_port, HOST_VM_INFO, (host_info_t)&vm_stat, &host_size) != KERN_SUCCESS) {
        // Error, failed to get Virtual memory info
        return -1;
    }

    // Memory statistics in bytes
    natural_t usedMemory = (natural_t)((vm_stat.active_count +
                                        vm_stat.inactive_count +
                                        vm_stat.wire_count) * pagesize);
    natural_t allMemory = [self totalMemory];
    return usedMemory;
}
My Swift code is:
{
    // Set up the variables
    var totalUsedMemory: Double = 0.00
    var host_port: mach_port_t
    var host_size: mach_msg_type_number_t
    var pagesize: vm_size_t

    // Get the variable values
    host_port = mach_host_self()
    host_size = mach_msg_type_number_t(MemoryLayout<vm_statistics_data_t>.stride / MemoryLayout<integer_t>.stride)
    // host_size = sizeof(vm_statistics_data_t) / sizeof(integer_t);
    host_page_size(host_port, &pagesize);
    var vm_stat: vm_statistics_data_t ;

    // Check for any system errors
    if (host_statistics(host_port, HOST_VM_INFO, (host_info_t)&vm_stat, &host_size) != KERN_SUCCESS) {
        // Error, failed to get Virtual memory info
        return -1;
    }

    // Memory statistics in bytes
    var usedMemory: Int64 = (Int64)((vm_stat.active_count + vm_stat.inactive_count + vm_stat.wire_count) * pagesize);
    return usedMemory;
}
I am getting these 2 errors:
Binary operator '&' cannot be applied to operands of type '(host_info_t).Type' (aka 'UnsafeMutablePointer.Type') and 'vm_statistics_data_t' (aka 'vm_statistics')
in this statement
host_statistics(host_port, HOST_VM_INFO, (host_info_t)&vm_stat, &host_size)
And
Binary operator '*' cannot be applied to operands of type 'UInt32' and 'vm_size_t' (aka 'UInt')
in this statement -
var usedMemory: Int64 = (Int64)((vm_stat.active_count + vm_stat.inactive_count + vm_stat.wire_count) * pagesize);
Swift is a lot more strict about pointer types than C is, which can make it a real pain to interact with functions like this that expect you to pass pointers to types other than the actual type of the thing you're trying to pass to the function. So I agree with the commenters that you're probably better off leaving this function in (Objective-)C. However, if you absolutely have to convert to Swift, you're probably going to have to do something like this:
// Initialize a blank vm_statistics_data_t
var vm_stat = vm_statistics_data_t()

// Get a raw pointer to vm_stat
let err: kern_return_t = withUnsafeMutableBytes(of: &vm_stat) {
    // Bind the raw buffer to Int32, since that's what host_statistics
    // seems to want a pointer to.
    let boundBuffer = $0.bindMemory(to: Int32.self)

    // Call host_statistics, and return its status out of the closure.
    return host_statistics(host_port, HOST_VM_INFO, boundBuffer.baseAddress, &host_size)
}

// Now take a look at what we got and compare it against KERN_SUCCESS
if err != KERN_SUCCESS {
    // Error, failed to get Virtual memory info
    return -1
}
I tried executing the Sieve of Eratosthenes algorithm using a large Integer array and a large Bool array.
The integer version seems to execute MUCH faster than the boolean one. What is the possible reason for this?
import Foundation

var n : Int = 100000000
var prime = [Bool](repeating: true, count: n+1)
var p = 2
let start = DispatchTime.now()
while((p*p)<=n)
{
    if(prime[p] == true)
    {
        var i = p*2
        while (i<=n)
        {
            prime[i] = false
            i = i + p
        }
    }
    p = p+1
}
let stop = DispatchTime.now()
let time = (Double)(stop.uptimeNanoseconds - start.uptimeNanoseconds) / 1000000.0
print("Time = \(time) ms")
Boolean array execution time : 78223.342295 ms
import Foundation

var n : Int = 100000000
var prime = [Int](repeating: 1, count: n+1)
var p = 2
let start = DispatchTime.now()
while((p*p)<=n)
{
    if(prime[p] == 1)
    {
        var i = p*2
        while (i<=n)
        {
            prime[i] = 0
            i = i + p
        }
    }
    p = p+1
}
let stop = DispatchTime.now()
let time = (Double)(stop.uptimeNanoseconds - start.uptimeNanoseconds) / 1000000.0
print("Time = \(time) ms")
Integer array execution time : 8535.54546 ms
TL;DR:
Do not attempt to optimize your code in a Debug build. Always run it through the Profiler. Int was faster than Bool in Debug, but the opposite was true when run through the Profiler.
Heap allocation is expensive. Use your memory judiciously. (This question discusses the complications in C, but it is also applicable to Swift.)
Long answer
First, let's refactor your code for easier execution:
func useBoolArray(n: Int) {
    var prime = [Bool](repeating: true, count: n+1)
    var p = 2
    while((p*p)<=n)
    {
        if(prime[p] == true)
        {
            var i = p*2
            while (i<=n)
            {
                prime[i] = false
                i = i + p
            }
        }
        p = p+1
    }
}

func useIntArray(n: Int) {
    var prime = [Int](repeating: 1, count: n+1)
    var p = 2
    while((p*p)<=n)
    {
        if(prime[p] == 1)
        {
            var i = p*2
            while (i<=n)
            {
                prime[i] = 0
                i = i + p
            }
        }
        p = p+1
    }
}
Now, run it in the Debug build:
let count = 100_000_000
let start = DispatchTime.now()
useBoolArray(n: count)
let boolStop = DispatchTime.now()
useIntArray(n: count)
let intStop = DispatchTime.now()
print("Bool array:", Double(boolStop.uptimeNanoseconds - start.uptimeNanoseconds) / Double(NSEC_PER_SEC))
print("Int array:", Double(intStop.uptimeNanoseconds - boolStop.uptimeNanoseconds) / Double(NSEC_PER_SEC))
// Bool array: 70.097249517
// Int array: 8.439799614
So Bool is a lot slower than Int, right? Let's run it through the Profiler by pressing Cmd + I and choosing the Time Profiler template. (Somehow the Profiler wasn't able to separate these functions, probably because they were inlined, so I had to run only one function per attempt):
let count = 100_000_000
useBoolArray(n: count)
// useIntArray(n: count)
// Bool: 1.15ms
// Int: 2.36ms
Not only are they an order of magnitude faster than in Debug, but the results are reversed: Bool is now faster than Int!!! The Profiler doesn't tell us why, so we must go on a witch hunt. Let's check the memory allocation by adding an Allocations instrument:
Ha! Now the differences are laid bare. The Bool array uses only one-eighth as much memory as the Int array. A Swift Array stores its elements in a buffer allocated on the heap, and heap allocation is slow.
When you think about it some more: a Bool value conceptually needs only 1 bit, and Swift represents it with a single byte, while an Int takes 8 bytes on a 64-bit machine, hence the memory ratio. In Debug, this difference may have caused all the difference, as the runtime must do all kinds of checks to ensure that it's actually dealing with a Bool value, so the Bool array method takes significantly longer.
Moral of the story: don't optimize your code in Debug mode. It can be misleading!
(A partial answer ...)
As @MartinR mentions in his comments to the question, there is no such major difference between the two cases if you build for release mode (with optimizations); the Bool case is slightly faster due to its smaller memory footprint (but equally fast as e.g. UInt8, which has the same footprint).
Running Instruments to profile the (non-optimized) debug build, we clearly see that array element access & assignment is the culprit in the Bool case (and, as far as my brief testing has shown, for all types except the integer ones: Int, UInt16, and so on).
We can further ascertain that it's not the writing part in particular that yields the overhead, but rather the repeated accessing of the i-th element.
The same explicit read-access tests for an array of integer elements show no such large overhead.
It would almost seem as if random element access is, for some reason, not working as it should (for non-integer types) when compiling with the debug build configuration.
I found a URL request containing suspicious code aimed at one of my Drupal sites. Can someone explain what this code does and advise on any precautions to be taken? Code:
function (){try{var _0x5757=["\x6C\x65\x6E\x67\x74\x68","\x72\x61\x6E\x64\x6F\x6D","\x66\x6C\x6F\x6F\x72"],_0xa438x1=this[_0x5757[0]],_0xa438x2,_0xa438x3;if(_0xa438x1==0){return};while(--_0xa438x1){_0xa438x2=Math[_0x5757[2]](Math[_0x5757[1]]()*(_0xa438x1 1));_0xa438x3=this[_0xa438x1];this[_0xa438x1]=this[_0xa438x2];this[_0xa438x2]=_0xa438x3;};}catch(e){}finally{return this}}
The site returned a "page not found" error and I observed no other issues.
Run this code through a beautifier and you will receive:
function () {
    try {
        var _0x5757 = ["\x6C\x65\x6E\x67\x74\x68", "\x72\x61\x6E\x64\x6F\x6D", "\x66\x6C\x6F\x6F\x72"],
            _0xa438x1 = this[_0x5757[0]],
            _0xa438x2, _0xa438x3;
        if (_0xa438x1 == 0) {
            return
        };
        while (--_0xa438x1) {
            _0xa438x2 = Math[_0x5757[2]](Math[_0x5757[1]]() * (_0xa438x1 1));
            _0xa438x3 = this[_0xa438x1];
            this[_0xa438x1] = this[_0xa438x2];
            this[_0xa438x2] = _0xa438x3;
        };
    } catch (e) {} finally {
        return this
    }
}
First, let's rename some variables and decode the array of strings in the third line. I've renamed _0x5757 to arr and unescaped the hex characters within the array. That gives you:
var arr = ["length", "random", "floor"],
So here we have a list of property and method names that will be used shortly. Substitute the strings in and rename the variables and you will receive:
function () {
    try {
        var arr = ["length", "random", "floor"],
            length_func = this["length"], // i.e. this.length
            rand_number, temp;
        if (length_func == 0) {
            return
        };
        while (--length_func) {
            rand_number = Math["floor"](Math["random"]() * (length_func 1));
            temp = this[length_func];
            this[length_func] = this[rand_number];
            this[rand_number] = temp;
        };
    } catch (e) {} finally {
        return this
    }
}
Notice how there is a syntax error in the script where the random number is generated:
* (length_func 1)
A token is missing between length_func and 1 (most likely a +), so this is not valid JavaScript syntax and the code as posted is not functional. I can still make a guess at what it was supposed to do: if we remove the obfuscation of calling the functions as Math["floor"] instead of Math.floor(), the important lines are
while (--length_func) {
    rand_number = Math.floor(Math.random() * (length_func 1));
    temp = this[length_func];
    this[length_func] = this[rand_number];
    this[rand_number] = temp;
};
It tries to compute a random index using Math.random() and Math.floor(), then swaps the element at position length_func with the element at that random index, all wrapped in a while (--length_func) loop that walks backwards through this.

As posted, the code cannot even generate that random index: Math.random() returns a float in [0.0, 1.0) and Math.floor() rounds its input down, so Math.floor(Math.random()) alone would be 0 nearly every time. The multiplication by (length_func 1) was presumably meant to scale the random value up to the current index, but with the missing operator the syntax is invalid. The this["length"] access also only makes sense on something with a length property, such as an array or a string; executing a bare length in a browser console gives 0, and length(1) fails because length is not a function.

In other words, this has exactly the shape of the classic Fisher-Yates "swap each element with a random earlier element" shuffle applied to this, typically attached to Array.prototype, but as it stands it is non-functional. Hope this helps you.
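For comparison, here is a minimal sketch of that shuffle (Fisher-Yates) in C; it assumes the missing token in the snippet above was indeed a +:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Fisher-Yates shuffle: walk the array backwards and swap each
 * element with a randomly chosen element at or below it; this is
 * what the deobfuscated JavaScript appears to attempt on `this`. */
static void shuffle(int *a, size_t n)
{
    for (size_t i = n - 1; i > 0; i--) {
        size_t j = (size_t)rand() % (i + 1); /* 0 <= j <= i */
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }
}

int main(void)
{
    int a[] = {1, 2, 3, 4, 5};
    srand((unsigned)time(NULL));
    shuffle(a, sizeof a / sizeof a[0]);
    for (size_t i = 0; i < sizeof a / sizeof a[0]; i++)
        printf("%d ", a[i]);
    putchar('\n');
    return 0;
}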
char (* text)[1][45+1];
text = calloc(5000,(130+1));
strcpy(0[*text],"sometext)");
Now I want to encode "sometext" to base58, however, I do not know how, and oddly enough, there isn't one example of BASE58 in C.
The base58 encoding I'm interested in uses these symbols:
123456789abcdefghijkmnopqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ
It's been optimized to lessen the risk of mis-reading, so 0 and 'O' are both gone, for instance.
P.S.
Don't mind the weird allocation and declaration of the variables; I was experimenting.
You're not supposed to encode strings, you're supposed to encode integers.
If starting with a string, you must first decide how to interpret it as an integer (for example, as big-endian base-256 digits, i.e. its raw bytes), then re-encode in base 58.
Satoshi has the reference implementation (https://github.com/bitcoin/bitcoin/blob/master/src/base58.h)
However, he uses some utility bignum class to do it, and it's in C++. If you have access to a bignum library, you just keep dividing by 58 and collecting the remainders until the number is used up. If you don't have a bignum library, you can still do the long division yourself over the raw bytes, as sketched below.
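A minimal sketch of that byte-array long division in C, using the question's alphabet (illustrative only: no handling of leading zero bytes, and O(n²) in the input size):

#include <stdlib.h>
#include <string.h>

static const char B58[] =
    "123456789abcdefghijkmnopqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ";

/* Interpret data as one big-endian base-256 integer and repeatedly
 * divide it by 58, collecting the remainders as base58 digits.
 * Returns a malloc'd NUL-terminated string, or NULL on failure. */
char *base58_encode(const unsigned char *data, size_t len)
{
    size_t cap = len * 138 / 100 + 2; /* log(256)/log(58) ~ 1.37 */
    char *out = malloc(cap);
    unsigned char *num = malloc(len ? len : 1);
    if (!out || !num) {
        free(out);
        free(num);
        return NULL;
    }
    memcpy(num, data, len);

    size_t start = 0, pos = cap - 1;
    out[pos] = '\0';
    while (start < len) {
        /* One long-division pass: num /= 58, remainder in rem. */
        unsigned rem = 0;
        for (size_t i = start; i < len; i++) {
            unsigned acc = rem * 256 + num[i];
            num[i] = (unsigned char)(acc / 58);
            rem = acc % 58;
        }
        out[--pos] = B58[rem];
        while (start < len && num[start] == 0) /* drop spent limbs */
            start++;
    }
    memmove(out, out + pos, cap - pos);
    free(num);
    return out;
}

Called as base58_encode((const unsigned char *)"sometext", 8), this treats the eight bytes of the string as one big-endian integer and encodes that.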
Here's an implementation in PHP that I created for Amithings, for large numbers beyond PHP's native integers (Integer -> http://php.net/manual/en/language.types.integer.php).
For example, try the example below (don't forget to pass your ID to the function in string format; use the PHP function strval()):
$number = '123456789009876543211234567890';
$result = base58_encode($number);
echo('Encoded: ' . $result . '<br>');
echo('Decoded: ' . base58_decode($result) . '<br>');
Important: You may want to change this routine by including some sort of key/password/encryption to ensure that others cannot decode your database IDs.
function base58_encode($input)
{
    $alphabet = '123456789abcdefghijkmnopqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ';
    $base_count = strval(strlen($alphabet));
    $encoded = '';
    while (floatval($input) >= floatval($base_count))
    {
        $div = bcdiv($input, $base_count);
        $mod = bcmod($input, $base_count);
        $encoded = substr($alphabet, intval($mod), 1) . $encoded;
        $input = $div;
    }
    if (floatval($input) > 0)
    {
        $encoded = substr($alphabet, intval($input), 1) . $encoded;
    }
    return($encoded);
}
function base58_decode($input)
{
    $alphabet = '123456789abcdefghijkmnopqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ';
    $base_count = strval(strlen($alphabet));
    $decoded = strval(0);
    $multi = strval(1);
    while (strlen($input) > 0)
    {
        $digit = substr($input, strlen($input) - 1);
        $decoded = bcadd($decoded, bcmul($multi, strval(strpos($alphabet, $digit))));
        $multi = bcmul($multi, $base_count);
        $input = substr($input, 0, strlen($input) - 1);
    }
    return($decoded);
}
My simple code using the Crypto++ library:
string base58_encode(Integer num, string vers)
{
    string alphabet[58] = {"1","2","3","4","5","6","7","8","9","A","B","C","D","E","F",
        "G","H","J","K","L","M","N","P","Q","R","S","T","U","V","W","X","Y","Z","a","b","c",
        "d","e","f","g","h","i","j","k","m","n","o","p","q","r","s","t","u","v","w","x","y","z"};
    int base_count = 58;
    string encoded;
    Integer div;
    Integer mod;
    while (num >= base_count)
    {
        div = num / base_count;
        mod = (num - (base_count * div));
        encoded = alphabet[mod.ConvertToLong()] + encoded;
        num = div;
    }
    encoded = vers + alphabet[num.ConvertToLong()] + encoded;
    return encoded;
}
It's meant for cryptocurrency wallet addresses (vers is the version prefix); the strings can be changed for other tasks.
Here is an implementation that seems to be pure C:
https://github.com/trezor/trezor-crypto/blob/master/base58.c
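For the reverse direction in plain C, here is a minimal decode sketch in the same spirit as the encoder above (again no handling of leading '1' digits; the caller supplies the output buffer):

#include <string.h>

static const char B58_ALPHABET[] =
    "123456789abcdefghijkmnopqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ";

/* Multiply-accumulate each base58 digit into a little-endian byte
 * buffer, then reverse it to big-endian. Returns the number of
 * bytes written, or -1 on an invalid character or overflow. */
int base58_decode(const char *in, unsigned char *out, size_t outcap)
{
    size_t outlen = 0;
    for (; *in; in++) {
        const char *p = strchr(B58_ALPHABET, *in);
        if (!p)
            return -1;
        unsigned carry = (unsigned)(p - B58_ALPHABET);
        /* out = out * 58 + carry */
        for (size_t i = 0; i < outlen; i++) {
            carry += (unsigned)out[i] * 58;
            out[i] = (unsigned char)(carry & 0xFF);
            carry >>= 8;
        }
        while (carry) {
            if (outlen >= outcap)
                return -1;
            out[outlen++] = (unsigned char)(carry & 0xFF);
            carry >>= 8;
        }
    }
    for (size_t i = 0; i < outlen / 2; i++) { /* reverse in place */
        unsigned char t = out[i];
        out[i] = out[outlen - 1 - i];
        out[outlen - 1 - i] = t;
    }
    return (int)outlen;
}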
When I remove the tests that compute the minimum and maximum from the loop, the execution time is actually longer than with the tests. How is that possible?
Edit:
After running more tests, it seems the runtime is not constant, i.e. the same code can run in 9 sec or 13 sec... So it was just a repeatable coincidence. Repeatable until you do enough tests, that is...
Some details :
execution time with the min max test : 9 sec
execution time without the min max test : 13 sec
CFLAGS=-Wall -O2 -fPIC -g
gcc 4.4.3 32 bit
The section to remove is now indicated in the code.
My guess:
bad cache interaction?
void FillFullValues(void)
{
    int i, j, k;
    double X, Y, Z;
    double p, q, r, p1, q1, r1;
    double Ls, as, bs;
    unsigned long t1, t2;

    t1 = GET_TICK_COUNT();
    MinLs = Minas = Minbs = 1000000.0;
    MaxLs = Maxas = Maxbs = 0.0;
    for (i=0;i<256;i++)
    {
        for (j=0;j<256;j++)
        {
            for (k=0;k<256;k++)
            {
                X = 0.4124*CielabValues[i] + 0.3576*CielabValues[j] + 0.1805*CielabValues[k];
                Y = 0.2126*CielabValues[i] + 0.7152*CielabValues[j] + 0.0722*CielabValues[k];
                Z = 0.0193*CielabValues[i] + 0.1192*CielabValues[j] + 0.9505*CielabValues[k];
                p = X * InvXn;
                q = Y;
                r = Z * InvZn;
                if (q>0.008856)
                {
                    Ls = 116*pow(q,third)-16;
                }
                else
                {
                    Ls = 903.3*q;
                }
                if (q<=0.008856)
                {
                    q1 = 7.787*q+seiz;
                }
                else
                {
                    q1 = pow(q,third);
                }
                if (p<=0.008856)
                {
                    p1 = 7.787*p+seiz;
                }
                else
                {
                    p1 = pow(p,third);
                }
                if (r<=0.008856)
                {
                    r1 = 7.787*r+seiz;
                }
                else
                {
                    r1 = pow(r,third);
                }
                as = 500*(p1-q1);
                bs = 200*(q1-r1);
                //
                // cast to char to reduce the array size
                //
                FullValuesLs[i][j][k] = (char) (Ls);
                FullValuesas[i][j][k] = (char) (as);
                FullValuesbs[i][j][k] = (char) (bs);
                //// Remove this and get slower code
                // (fabs, not abs: abs() truncates a double to int)
                if (MaxLs<Ls)
                    MaxLs = Ls;
                if ((fabs(Ls)<MinLs) && (fabs(Ls)>0))
                    MinLs = Ls;
                if (Maxas<as)
                    Maxas = as;
                if ((fabs(as)<Minas) && (fabs(as)>0))
                    Minas = as;
                if (Maxbs<bs)
                    Maxbs = bs;
                if ((fabs(bs)<Minbs) && (fabs(bs)>0))
                    Minbs = bs;
                //// End of Remove
            }
        }
    }
    TRACE(_T("LMax = %f LMin = %f\n"),(MaxLs),(MinLs));
    TRACE(_T("aMax = %f aMin = %f\n"),(Maxas),(Minas));
    TRACE(_T("bMax = %f bMin = %f\n"),(Maxbs),(Minbs));
    t2 = GET_TICK_COUNT();
    TRACE(_T("WhiteBalance init : %lu ms\n"), t2 - t1);
}
I think the compiler is trying to unroll the inner loop, because removing the min/max tests removes a dependency between iterations. But somehow this doesn't help in your case, maybe because the loop is too big and uses too many registers to be unrolled.
Try turning off unrolling and post the results again.
If that is the case, I would suggest you submit a performance issue to gcc.
PS. I think you can merge the if (q>0.008856) and if (q<=0.008856) branches, as sketched below.
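A minimal sketch of that merge, reusing the names from the code in the question (compute q1 once, then derive Ls from it):

/* One test on q instead of two; in the q > 0.008856 branch,
 * Ls = 116*q^(1/3) - 16 can reuse q1 = pow(q, third). */
if (q > 0.008856)
{
    q1 = pow(q, third);
    Ls = 116*q1 - 16;
}
else
{
    q1 = 7.787*q + seiz;
    Ls = 903.3*q;
}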
Maybe it's the cache, maybe unrolling problems; there is only one way to answer this: look at the generated code (e.g. by using the -S option). Maybe you can post it, or spot the difference yourself when comparing the two versions.
EDIT: As you have now clarified that it was just the measurement, I can only recommend (or better, command ;-)) that when you want runtime numbers, you ALWAYS put the code in a loop and average over several runs. Best to drive it from outside your program (in a shell script), so your cache is not already filled with the right data.