Create a list with millions of elements - arrays

I need to create and work with lists of 2**30 elements, but it's too slow. Is there any way to speed this up?
My code:
sup = []
for i in range(2**30):
    sup.append([i,pow(y,i,N)])
pow(y,i,N) == (y**i) % N, i.e. modular exponentiation.
I tried using a list comprehension, but it isn't fast enough.

Different approach: why do you want to store those numbers in a list?
You have your formula right there: whenever some piece of code needs sup[i], you just compute pow(y,i,N).
In other words, instead of storing the values in a list, compute them when you need them.
Edit: since it seems you have good reasons to store that data in an array, I would say: use the appropriate tool.
Meaning: instead of doing compute-intensive things directly in Python, look into the numpy framework, which is designed for exactly such purposes. Beyond that, I would also look at the way you are storing/preparing your data. For example, you mention later looking for identical entries in that array. I wonder whether that means you should use a dictionary instead of a list; or did you really intend to check 2**30 entries each time you look for equal pow values?
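A minimal sketch of the "compute instead of store" idea, assuming the values are consumed in increasing order of i. The generator name iter_modpow is hypothetical; it carries the previous value forward so that each step costs one multiplication instead of a full pow(y,i,N) call, and nothing is kept in memory.

```python
def iter_modpow(y, N, count):
    """Yield (i, y**i % N) for i in range(count) without building a list."""
    value = 1 % N                    # y**0 mod N (the "% N" covers N == 1)
    for i in range(count):
        yield i, value
        value = (value * y) % N      # next power derived from the previous one


# Example with small, made-up parameters.
pairs = list(iter_modpow(3, 7, 8))
```

Whether this is fast enough for 2**30 steps in pure Python still depends on what is done with each pair.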

Going by your comment and complementing the answer of GhostCat, go directly for the data you are looking for, for example like this
>>> from collections import defaultdict
>>> y = 2
>>> N = 10
>>> data = defaultdict(list)
>>> for i in range(100):
...     data[pow(y,i,N)].append(i)
...
>>> for x in data.items():
...     x
...
(8, [3, 7, 11, 15, 19, 23, 27, 31, 35, 39, 43, 47, 51, 55, 59, 63, 67, 71, 75, 79, 83, 87, 91, 95, 99])
(1, [0])
(2, [1, 5, 9, 13, 17, 21, 25, 29, 33, 37, 41, 45, 49, 53, 57, 61, 65, 69, 73, 77, 81, 85, 89, 93, 97])
(4, [2, 6, 10, 14, 18, 22, 26, 30, 34, 38, 42, 46, 50, 54, 58, 62, 66, 70, 74, 78, 82, 86, 90, 94, 98])
(6, [4, 8, 12, 16, 20, 24, 28, 32, 36, 40, 44, 48, 52, 56, 60, 64, 68, 72, 76, 80, 84, 88, 92, 96])
>>>
Or, more specifically, since you need a random sample, go for that from the start and don't waste time producing a huge amount of data you will not need. For example:
>>> import random
>>> random_data = defaultdict(list)
>>> for i in random.sample(range(2**30), 20):
...     random_data[pow(2,i,10)].append(i)
...
>>> for x in random_data.items():
...     x
...
(8, [633728687, 357300263, 208747091, 456291987, 1028949643, 23961003, 750842555])
(2, [602395153, 215460881, 144481457, 829193705])
(4, [752840814, 26689262])
(6, [423520476, 969809132, 326786996, 736424520, 929123176, 865279408, 338237708])
>>>
Depending on what you do with those i later on, you can instead try a more mathematical approach: uncover the underlying pattern that produces the i for which y^i mod N is the same. That way you can produce as many i as you need for that particular residue class.
For this example it is easy:
2^i = 8 (mod 10) for all i = 3 (mod 4) -> range(3,2**30,4)
2^i = 2 (mod 10) for all i = 1 (mod 4) -> range(1,2**30,4)
2^i = 4 (mod 10) for all i = 2 (mod 4) -> range(2,2**30,4)
2^i = 6 (mod 10) for all i = 0 (mod 4), i > 0 -> range(4,2**30,4)
2^i = 1 (mod 10) for i = 0
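The pattern above can also be found mechanically, since the sequence y^i mod N must eventually repeat. The following is only a sketch; the helper names power_cycle and exponents_for are hypothetical, and it assumes N is small enough that one full cycle fits in memory.

```python
def power_cycle(y, N):
    """Return (first_seen, start, period) for the sequence y**i % N.

    first_seen maps each residue to the smallest i that produces it;
    from index `start` on, the sequence repeats every `period` steps.
    """
    first_seen = {}
    value, i = 1 % N, 0
    while value not in first_seen:
        first_seen[value] = i
        value = (value * y) % N
        i += 1
    start = first_seen[value]
    return first_seen, start, i - start


def exponents_for(residue, y, N, limit):
    """All i < limit with y**i % N == residue, as a range.

    Raises KeyError if the residue never occurs.
    """
    first_seen, start, period = power_cycle(y, N)
    first = first_seen[residue]
    if first < start:                      # occurs once, before the cycle
        return range(first, min(first + 1, limit))
    return range(first, limit, period)
```

For y = 2 and N = 10 this reproduces the ranges listed above, for example exponents_for(8, 2, 10, 2**30) gives range(3, 2**30, 4).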

Related

multiplicative inverse with lookup table

I have a formula that I use multiple times in my subroutine, but my processor (M0) does not have a division instruction, so division is handled by a software library. To speed up this operation, I am considering using a lookup table to store the result of the inverse. However, that would still take up 2 KB of space (2 bytes per value). How can I optimize it further?
Formula is as follows, k is a constant known at compile time k = [10, 100]. x = [0, 1023]
(1000 * k) * ((1023/x) - 1)
EDIT: Clarification about precision. Since I have the "1000", I am considering using the result of the multiplication by 1000 to increase precision.
Assuming / is integer division
You don't need to store 1024 values, because many values of x result in the same value of 1023/x.
Specifically:
x: [ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 33, 34, 35, 36, 37, 39, 40, 42, 44, 46, 48, 51, 53, 56, 60, 63, 68, 73, 78, 85, 93, 102, 113, 127, 146, 170, 204, 255, 341, 511, 1023]
1023/x: [1023, 511, 341, 255, 204, 170, 146, 127, 113, 102, 93, 85, 78, 73, 68, 63, 60, 56, 53, 51, 48, 46, 44, 42, 40, 39, 37, 36, 35, 34, 33, 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
You need only to store these 62 values of x and the 62 results of 1023/x.
As a bonus: if you look carefully, you'll notice those values are symmetric. The values for x are the exact mirror of the values for 1023/x. So you only need to store one of these two arrays.
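As a rough illustration of the reduced table, here is a sketch in Python rather than the target's C. It assumes integer division, builds the 62 values instead of hard-coding them, and uses the mirror property described above so that a single array serves as both the breakpoints and the results. The names BREAKS and div_1023_by are made up for this example.

```python
from bisect import bisect_left

N = 1023

# Distinct values of N // x for x in 1..N, in ascending order. Per the answer
# above, the same numbers are also the largest x that yields each quotient.
BREAKS = sorted({N // x for x in range(1, N + 1)})


def div_1023_by(x):
    """Return 1023 // x for 1 <= x <= 1023 using only the small table."""
    idx = bisect_left(BREAKS, x)            # first breakpoint >= x
    return BREAKS[len(BREAKS) - 1 - idx]    # mirrored entry is the quotient
```

The binary search costs a handful of comparisons per lookup, so on real hardware it would need to be profiled against both the full table and the software division.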
You can easily shrink the lookup table to 256*2 bytes:
static inline uint16_t get1023divxminus1(uint16_t x)
{
    static const uint16_t table[256] = {0, 1022, 510, ....., 3};
    if (x >= 512) return 0;
    if (x >= 342) return 1;
    if (x >= 256) return 2;
    return table[x];
}
You could shrink the table even further, but I think it isn't worth the additional ifs.
You could compress the data in the table.
For example by storing full 2-byte values for every N-th value of x and store difference values for xs in between. The difference should fit in 1 byte in many cases.
If N would be 4, you'd store full values for x: 0, 4, 8, ... and difference values for x: 1, 2, 3, 5, 6, 7, 9, ...
To get the result for, say, x == 3, start with the 2-byte value stored for x == 0 and add the 1-byte difference values for x == 1, 2 and 3.
There will for sure be other 'tricks' to play if you'd have a close look at the data and think in the direction of data compression.
Accessing RAM is probably going to be slower than calculating long division, as long as your values fit within a register. In principle, long division is linear in the number of bits. Implement both and profile, but I am highly convinced that long division will be faster.
The algorithm is:
Left-shift the divisor until its most significant set bit lines up with the most significant set bit of the dividend.
If the divisor is not larger than the dividend, write 1 and subtract the divisor from the dividend; otherwise write 0. Right-shift the divisor by one. Repeat until the divisor is back at its original position.
Here is an explicit implementation:
https://codegolf.stackexchange.com/questions/24541/divide-two-numbers-using-long-division
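The shift-and-subtract steps described above can be sketched as follows. This is an illustration in Python for non-negative integers, not the implementation from the link; the function name long_divide is made up, and on the target the same loop would be written in C or assembly.

```python
def long_divide(dividend, divisor):
    """Binary long division; returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    # Align the divisor's top bit with the dividend's top bit.
    shift = max(dividend.bit_length() - divisor.bit_length(), 0)
    shifted = divisor << shift
    quotient = 0
    for _ in range(shift + 1):        # one quotient bit per position
        quotient <<= 1
        if shifted <= dividend:       # divisor fits: emit 1 and subtract
            dividend -= shifted
            quotient |= 1
        shifted >>= 1                 # move the divisor one bit right
    return quotient, dividend
```

For example, long_divide(1023, 3) walks through nine bit positions and returns (341, 0).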

When trying to remove just one element in a nested numpy array the whole subarray gets deleted

I have a 3 dimensional numpy array (temp_X) like:
[ [[23,34,45,56],[34,45,67,78],[23,45,67,78]],
[[12,43,65,43],[23,54,67,87],[12,32,34,43]],
[[43,45,86,23],[23,45,56,23],[12,23,65,34]] ]
I want to remove the 1st element of each 3rd sub-array (23, 12 and 12 in the example above).
Below is the code that I tried:
for i in range(len(temp_X)):
    temp_X = np.delete(temp_X[i][(len(temp_X[i]) - 1)], [0])
Somehow when I run the code the whole array gets deleted except for 3 values. Any help is much appreciated. Thank you in advance.
With a as the 3D input array, here's one way -
m = np.prod(a.shape[1:])
n = m-a.shape[-1]
out = a.reshape(a.shape[0],-1)[:,np.r_[:n,n+1:m]]
Alternative to last step with boolean-indexing -
out = a.reshape(a.shape[0],-1)[:,np.arange(m)!=n]
Sample input, output -
In [285]: a
Out[285]:
array([[[23, 34, 45, 56],
        [34, 45, 67, 78],
        [23, 45, 67, 78]],

       [[12, 43, 65, 43],
        [23, 54, 67, 87],
        [12, 32, 34, 43]],

       [[43, 45, 86, 23],
        [23, 45, 56, 23],
        [12, 23, 65, 34]]])
In [286]: out
Out[286]:
array([[23, 34, 45, 56, 34, 45, 67, 78, 45, 67, 78],
       [12, 43, 65, 43, 23, 54, 67, 87, 32, 34, 43],
       [43, 45, 86, 23, 23, 45, 56, 23, 23, 65, 34]])
Here's another with mask creation to mask along the last two axes -
mask = np.ones(a.shape[-2:],dtype=bool)
mask[-1,0] = 0
out = np.moveaxis(a,0,-1)[mask].T
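For comparison, here is a sketch that keeps np.delete but applies it once to a flattened view. It relies on two points: np.delete returns a new array, so the loop in the question replaces the whole of temp_X with the shortened row on its first pass, and a regular 3-D array cannot hold one row shorter than the others, so the result has to be 2-D. The variable names flat and drop are made up for this example.

```python
import numpy as np

a = np.array([[[23, 34, 45, 56], [34, 45, 67, 78], [23, 45, 67, 78]],
              [[12, 43, 65, 43], [23, 54, 67, 87], [12, 32, 34, 43]],
              [[43, 45, 86, 23], [23, 45, 56, 23], [12, 23, 65, 34]]])

flat = a.reshape(a.shape[0], -1)       # one row of 12 values per block
drop = flat.shape[1] - a.shape[-1]     # column holding the last sub-array's first element
out = np.delete(flat, drop, axis=1)    # new (3, 11) array; `a` is unchanged
```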

How to find the length of each element of a multidimensional array in Julia?

Suppose we have a n-element Array{Array{Array{Int64,1},1},1} in Julia, listed below:
1) Element 1: 1-element Array{Array{Int64,1},1}:
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10 … 141, 142, 143, 144, 145, 146, 147, 148, 149, 150]
2) Element 2: 2-element Array{Array{Int64,1},1}:
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10 … 41, 42, 43, 44, 45, 46, 47, 48, 49, 50]
[51, 52, 53, 54, 55, 56, 57, 58, 59, 60 … 141, 142, 143, 144, 145, 146, 147, 148, 149, 150]
and so on.
Actually, each element represents the connected components of several undirected graphs. Is there a command or a simple way to obtain the length of each deepest array (the number of connected components)? That is:
1) 150
2) 50 and 100
and so on.
Thank you!!
Given
a = [[rand(3), rand(4)], [rand(5)]]
the version you already commented would be
julia> map(x -> length.(x), a)
2-element Array{Array{Int64,1},1}:
[3, 4]
[5]
Alternatively, the following in my opinion would be more readable:
julia> [[length(x) for x in y] for y in a]
2-element Array{Array{Int64,1},1}:
[3, 4]
[5]
But @juliohm is right: there might be better data structures than deeply nested arrays. Maybe have a look at LightGraphs.jl if you're dealing with graph problems.

Performing complicated matrix manipulation operations with cblas_sgemm in order to carry out multiplication

I have 100 3x3x3 matrices that I would like to multiply with another large matrix of size 3x5x5 (similar to convolving one image with multiple filters, but not quite).
For the sake of explanation, this is what my large matrix looks like:
>>> x = np.arange(75).reshape(3, 5, 5)
>>> x
array([[[ 0,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14],
        [15, 16, 17, 18, 19],
        [20, 21, 22, 23, 24]],

       [[25, 26, 27, 28, 29],
        [30, 31, 32, 33, 34],
        [35, 36, 37, 38, 39],
        [40, 41, 42, 43, 44],
        [45, 46, 47, 48, 49]],

       [[50, 51, 52, 53, 54],
        [55, 56, 57, 58, 59],
        [60, 61, 62, 63, 64],
        [65, 66, 67, 68, 69],
        [70, 71, 72, 73, 74]]])
In memory, I assume all sub-matrices in the large matrix are stored in contiguous locations (please correct me if I'm wrong). What I want to do is, from this 3x5x5 matrix, extract the first 3 columns of each 5x5 sub-matrix (a 5x3 block each) and then join the three blocks horizontally to get a 5x9 matrix (I apologise if this part is not clear, I can explain in more detail if need be). If I were using numpy, I'd do:
>>> k = np.hstack(np.vstack(x)[:, 0:3].reshape(3, 5, 3))
>>> k
array([[ 0,  1,  2, 25, 26, 27, 50, 51, 52],
       [ 5,  6,  7, 30, 31, 32, 55, 56, 57],
       [10, 11, 12, 35, 36, 37, 60, 61, 62],
       [15, 16, 17, 40, 41, 42, 65, 66, 67],
       [20, 21, 22, 45, 46, 47, 70, 71, 72]])
However, I'm not using python so I do not have any access to the numpy functions that I need in order to reshape the data blocks into a form I want to carry out multiplication... I can only directly call the cblas_sgemm function (from the BLAS library) in C, where k corresponds to input B.
Here's my call to cblas_sgemm:
cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasTrans,
            100, 5, 9,
            1.0,
            A, 9,
            B, 9, // this is actually wrong, since I don't know how to specify the right parameter
            0.0,
            result, 5);
Basically, the ldb attribute is the offender here, because my data is not blocked the way I need it to be. I have tried different things, but I am not able to get cblas_sgemm to understand how I want it to read and understand my data.
In short, I don't know how to tell cblas_sgemm to read x like k. Is there a way I can smartly reshape my data in python before sending it to C, so that cblas_sgemm can work the way I want it to?
I will transpose k by setting CblasTrans, so during multiplication, B is 9x5. My matrix A is of shape 100x9. Hope that helps.
Any help would be appreciated. Thanks!
"In short, I don't know how to tell cblas_sgemm to read x like k."
You can't. You'll have to make a copy.
Consider k:
In [20]: k
Out[20]:
array([[ 0,  1,  2, 25, 26, 27, 50, 51, 52],
       [ 5,  6,  7, 30, 31, 32, 55, 56, 57],
       [10, 11, 12, 35, 36, 37, 60, 61, 62],
       [15, 16, 17, 40, 41, 42, 65, 66, 67],
       [20, 21, 22, 45, 46, 47, 70, 71, 72]])
In a two-dimensional array, the spacing of the elements in memory must be constant along each axis. You know from how x was created that the consecutive elements in memory are 0, 1, 2, 3, 4, ..., but your first row of k contains 0, 1, 2, 25, 26, .... There is no gap between 1 and 2 (i.e. the memory address increases by the size of one element of the array), but there is a large jump in memory between 2 and 25. So you'll have to make a copy to create k.
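If the data is prepared in Python before being handed to C, the copy can be made explicitly. The sketch below assumes float32 data (because sgemm is single precision) and uses an illustrative flag name, k_is_copy; it builds the same k as the hstack/vstack expression in the question and checks that NumPy really had to copy.

```python
import numpy as np

x = np.arange(75, dtype=np.float32).reshape(3, 5, 5)

# Take the first 3 columns of each 5x5 block, put the row axis first,
# then merge the (block, column) axes into one axis of length 9.
k = x[:, :, :3].transpose(1, 0, 2).reshape(5, 9)

# The merged axis has no constant stride in x, so reshape returned a copy.
k_is_copy = not np.shares_memory(k, x)

k = np.ascontiguousarray(k)   # no-op here; makes the C layout explicit
```

A contiguous 5x9 buffer like this should match the `B, 9` arguments combined with CblasTrans in the call from the question, though that has not been run against a BLAS library here.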
Having said that, there is an alternative method to efficiently achieve your desired final result using a bit of reshaping (without copying) and numpy's einsum function.
Here's an example. First define x and A:
In [52]: x = np.arange(75).reshape(3, 5, 5)
In [53]: A = np.arange(90).reshape(10, 9)
Here's my understanding of what you want to achieve; A.dot(k.T) is the desired result:
In [54]: k = np.hstack(np.vstack(x)[:, 0:3].reshape(3, 5, 3))
In [55]: A.dot(k.T)
Out[55]:
array([[ 1392,  1572,  1752,  1932,  2112],
       [ 3498,  4083,  4668,  5253,  5838],
       [ 5604,  6594,  7584,  8574,  9564],
       [ 7710,  9105, 10500, 11895, 13290],
       [ 9816, 11616, 13416, 15216, 17016],
       [11922, 14127, 16332, 18537, 20742],
       [14028, 16638, 19248, 21858, 24468],
       [16134, 19149, 22164, 25179, 28194],
       [18240, 21660, 25080, 28500, 31920],
       [20346, 24171, 27996, 31821, 35646]])
Here's how you can get the same result by slicing x and reshaping A:
In [56]: x2 = x[:,:,:3]
In [57]: A2 = A.reshape(-1, 3, 3)
In [58]: np.einsum('ijk,jlk', A2, x2)
Out[58]:
array([[ 1392,  1572,  1752,  1932,  2112],
       [ 3498,  4083,  4668,  5253,  5838],
       [ 5604,  6594,  7584,  8574,  9564],
       [ 7710,  9105, 10500, 11895, 13290],
       [ 9816, 11616, 13416, 15216, 17016],
       [11922, 14127, 16332, 18537, 20742],
       [14028, 16638, 19248, 21858, 24468],
       [16134, 19149, 22164, 25179, 28194],
       [18240, 21660, 25080, 28500, 31920],
       [20346, 24171, 27996, 31821, 35646]])

Rand() seems to not work properly [duplicate]

I've been experimenting with generating random numbers in C, and I've come across something weird. I don't know if it's only on my compiler but whenever I try to generate a pseudo-random number with the rand() function, it returns a very predictable number — the number generated with the parameter before plus 3.125 to be exact. It's hard to explain but here's an example.
srand(71);
int number = rand();
printf("%d", number);
This returns 270.
srand(72);
int number = rand();
printf("%d", number);
This returns 273.
srand(73);
int number = rand();
printf("%d", number);
This returns 277.
srand(74);
int number = rand();
printf("%d", number);
This returns 280.
Every eighth number is 4 higher. Otherwise it's 3.
This can't possibly be right. Is there something wrong with my compiler?
Edit: I figured it out — I created a function where I seed only once, then I loop the rand() and it generates random numbers. Thank you all!
The confusion here is about how pseudorandom number generators work.
Pseudorandom number generators like C's rand work by having a number representing the current 'state'. Every time the rand function is called, some deterministic computations are done on the 'state' number to produce the next 'state' number. Thus, if the generator is given the same input (the same 'state'), it will produce the same output.
So, when you seed the generator with srand(74), it will always generate the same string of numbers, every time. When you seed the generator with srand(75), it will generate a different string of numbers, etc.
The common way to ensure different output each time is to always provide a different seed, usually done by seeding the generator with the current time in seconds/milliseconds, e.g. srand(time(NULL)).
EDIT: Here is a Python session demonstrating this behavior. It is entirely expected.
>>> import random
If we seed the generator with the same number, it will always output the same sequence:
>>> random.seed(500)
>>> [random.randint(0, 100) for _ in xrange(20)]
[80, 95, 58, 25, 76, 37, 80, 34, 57, 79, 1, 33, 40, 29, 92, 6, 45, 31, 13, 11]
>>> random.seed(500)
>>> [random.randint(0, 100) for _ in xrange(20)]
[80, 95, 58, 25, 76, 37, 80, 34, 57, 79, 1, 33, 40, 29, 92, 6, 45, 31, 13, 11]
>>> random.seed(500)
>>> [random.randint(0, 100) for _ in xrange(20)]
[80, 95, 58, 25, 76, 37, 80, 34, 57, 79, 1, 33, 40, 29, 92, 6, 45, 31, 13, 11]
If we give it a different seed, even a slightly different one, the numbers will be totally different from the old seed, yet still the same if the same (new) seed is used:
>>> random.seed(501)
>>> [random.randint(0, 100) for _ in xrange(20)]
[64, 63, 24, 81, 33, 36, 72, 35, 95, 46, 37, 2, 76, 21, 46, 68, 47, 96, 39, 36]
>>> random.seed(501)
>>> [random.randint(0, 100) for _ in xrange(20)]
[64, 63, 24, 81, 33, 36, 72, 35, 95, 46, 37, 2, 76, 21, 46, 68, 47, 96, 39, 36]
>>> random.seed(501)
>>> [random.randint(0, 100) for _ in xrange(20)]
[64, 63, 24, 81, 33, 36, 72, 35, 95, 46, 37, 2, 76, 21, 46, 68, 47, 96, 39, 36]
How do we make our program have different behavior each time? If we supply the same seed, it will always behave the same. We can use the time.time() function, which will yield a different number each time we call it:
>>> import time
>>> time.time()
1347917648.783
>>> time.time()
1347917649.734
>>> time.time()
1347917650.835
So if we keep re-seeding it with a call to time.time(), we will get a different sequence of numbers each time, because the seed is different each time:
>>> random.seed(time.time())
>>> [random.randint(0, 100) for _ in xrange(20)]
[60, 75, 60, 26, 19, 70, 12, 87, 58, 2, 79, 74, 1, 79, 4, 39, 62, 20, 28, 19]
>>> random.seed(time.time())
>>> [random.randint(0, 100) for _ in xrange(20)]
[98, 45, 85, 1, 67, 25, 30, 88, 17, 93, 44, 17, 94, 23, 98, 32, 35, 90, 56, 35]
>>> random.seed(time.time())
>>> [random.randint(0, 100) for _ in xrange(20)]
[44, 17, 10, 98, 18, 6, 17, 15, 60, 83, 73, 67, 18, 2, 40, 76, 71, 63, 92, 5]
Of course, even better than constantly re-seeding it is to seed it once and keep going from there:
>>> random.seed(time.time())
>>> [random.randint(0, 100) for _ in xrange(20)]
[94, 80, 63, 66, 31, 94, 74, 15, 20, 29, 76, 90, 50, 84, 43, 79, 50, 18, 58, 15]
>>> [random.randint(0, 100) for _ in xrange(20)]
[30, 53, 75, 19, 35, 11, 73, 88, 3, 67, 55, 43, 37, 91, 66, 0, 9, 4, 41, 49]
>>> [random.randint(0, 100) for _ in xrange(20)]
[69, 7, 25, 68, 39, 57, 72, 51, 33, 93, 81, 89, 44, 61, 78, 77, 43, 10, 33, 8]
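The near-constant step seen in the question (270, 273, 277, 280 for seeds 71 to 74) is what a linear congruential generator produces when only the first output after each seed is inspected. The question does not say which C library was used, so the multiplier, increment and shift below are an assumption, chosen because they reproduce the reported numbers; the function name is made up as well.

```python
def first_rand_after_seed(seed):
    """First output of a simple 32-bit LCG seeded with `seed`.

    The constants are an assumption, not something stated in the question.
    """
    state = (seed * 214013 + 2531011) & 0xFFFFFFFF   # one state update
    return (state >> 16) & 0x7FFF                    # 15 bits taken from bit 16 up


outputs = [first_rand_after_seed(s) for s in (71, 72, 73, 74)]
print(outputs)  # [270, 273, 277, 280]
```

Raising the seed by 1 raises the state by 214013, which is roughly 3.3 after the shift by 16 bits, so the first output climbs by 3 or 4 each time. Later outputs from a single seed do not follow such a simple pattern, which is why seeding once and calling rand() repeatedly behaves as expected.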
Every invocation of rand() returns the next number in a predefined sequence whose starting point is determined by the seed supplied to srand(). That's why it's called a pseudo-random number generator, and not a random number generator.
rand() is implemented by a pseudo random number generator.
The numbers generated by consecutive calls to rand() have the statistical properties of random numbers, but their order is predetermined.
The 'start' number is determined by the seed that you provide.
You should give a PRNG a single seed only. Providing it with multiple seeds can radically alter the randomness of the generator. In addition, providing it the same seed over and over removes all randomness.
Generating a "random" number, regardless of the implementation, depends on a long deterministic sequence. The sequence is generated from the seed of the random function, and it is pseudo-random by nature. This explains why your number depends so strongly on the seed that you give the function.
In some implementations there is only one sequence and the seed selects the starting position within it. In others there are different sequences depending on the seed. If no seed is provided, some implementations derive one from the internal "clock"; C's rand() instead behaves as if srand(1) had been called.
The number is mapped into a range with lower and upper bounds by computing randValue % upperBound and then adding lowerBound. The implementation of a random function is very similar to that of a hash function. Depending on the architecture, the upper bound of the raw random value is the largest integer/double the implementation can produce, unless the user sets it lower.

Resources