Karatsuba Algorithm: splitting strings - c

I am trying to implement the Karatsuba algorithm in C.
I work with char strings (which are digits in a certain base), and although I think I have understood most of the Karatsuba algorithm, I do not get where I should split the strings to multiply.
For example, where should I cut 123 * 123, and where should I cut 123 * 12?
I can't get to a solution that works with both these calculations.
I tried cutting it in half and flooring the result when the length is odd, but it did not work, and ceiling does not work either.
Any clue?
Let a, b, c, and d be the parts of the strings.
Let's try with 123 * 12
First try (a = 1, b = 23, c = 1, d = 2) (fail)
z0 = a * c = 1
z1 = b * d = 46
z2 = (a + b) * (c + d) - z0 - z1 = 24 * 3 - 1 - 46 = 72 - 1 - 46 = 25
z0_padded = 100
z2_padded = 250
z0_padded + z1 + z2_padded = 100 + 46 + 250 = 396 != 123 * 12
Second try (a = 12, b = 3, c = 12, d = 0) (fail)
z0 = 144
z1 = 0
z2 = 15 * 12 - z1 - z0 = 180 - 144 = 36
z0_padded = 14400
z2_padded = 360
z0_padded + z1 + z2_padded = 14760 != 1476
Third try (a = 12, b = 3, c = 0, d = 12) (success)
z0 = 0
z1 = 36
z2 = 15 * 12 - z0 - z1 = 144
z0_padded = 0
z2_padded = 1440
z0_padded + z1 + z2_padded = 1476 == 1476
Let's try with 123 * 123
First try (a = 1, b = 23, c = 1, d = 23) (fail)
z0 = 1
z1 = 23 * 23 = 529
z2 = 24 * 24 - z0 - z1 = 46
z0_padded = 100
z2_padded = 460
z0_padded + z1 + z2_padded = 100 + 529 + 460 = 1089 != 15129
Second try (a = 12, b = 3, c = 12, d = 3) (success)
z0 = 12 * 12 = 144
z1 = 3 * 3 = 9
z2 = 15 * 15 - z0 - z1 = 72
z0_padded = 14400
z2_padded = 720
z0_padded + z1 + z2_padded = 15129 == 15129
Third try (a = 12, b = 3, c = 1, d = 23) (fail)
z0 = 12
z1 = 3 * 23 = 69
z2 = 15 * 24 - z0 - z1 = 279
z0_padded = 1200
z2_padded = 2790
z0_padded + z1 + z2_padded = 4059 != 15129
Here, I do not get where I messed this up. Note that my padding method adds n zeroes at the end of a number where n = m * 2 and m equals the size of the longest string divided by two.
EDIT
Now that I have understood that b and d must be of the same length, it works almost every time, but there are still exceptions, for example 1234*12:
a = 123
b = 4
c = 1
d = 2
z0 = 123
z1 = 8
z2 = 127 * 3 - 123 - 8 = 250
z0_padded = 1230000
z2_padded = 25000
z0_padded + z1 + z2_padded = 1255008 != 14808
Here, assuming I split the strings correctly, the problem is the padding, but I do not get how I should pad. I read on Wikipedia that I should pad depending on the size of the biggest string (see a few lines up), but there must be another solution.

The Karatsuba algorithm is a nice way to perform multiplications.
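If you want it to work, b and d must be of the same length m, because both numbers are split around the same power of the base (10^m). The padding then follows from that same m: 2m zeros for a×c and m zeros for the middle term.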
Here are two possibilities to compute 123x12 :
a= 1;b=23;c=0;d=12;
a=12;b= 3;c=1;d= 2;
Let's explain how it works for the second case :
123=12×10+3
12= 1×10+2
123×12=(12×10+3)×(1×10+2)
123×12=12×1×100+ (12×2+3×1)×10+3×2
123×12=12×1×100+((12+3)×(1+2)-12×1-3×2)×10+3×2
Let's explain how it works for the first case :
123=1×100+23
12=0×100+12
123×12=(1×100+23)×(0×100+12)
123×12=1×0×10000+ (1×12+23×0)×100+23×12
123×12=1×0×10000+((1+23)×(0+12)-1×0-23×12)×100+23×12
It also works with 10^k, 2^k or n instead of 10 or 100.
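The rule above can be sketched as follows. This is a minimal illustration in Python rather than C, and it works on integers instead of digit strings; both are simplifications assumed here for brevity. The names `num_digits` and `karatsuba` are hypothetical. Both operands are split at the same position m counted from the right, so b and d always have the same length, and the result is padded with 2m digits for a*c and m digits for the middle term.

```python
def num_digits(x, base=10):
    """Number of digits of a non-negative integer x in the given base."""
    count = 1
    while x >= base:
        x //= base
        count += 1
    return count


def karatsuba(x, y, base=10):
    """Multiply two non-negative integers with Karatsuba's method."""
    # Base case: one operand is a single digit.
    if x < base or y < base:
        return x * y

    # m = length of the low parts b and d (same for both operands).
    m = max(num_digits(x, base), num_digits(y, base)) // 2
    shift = base ** m

    a, b = divmod(x, shift)  # x = a * base^m + b
    c, d = divmod(y, shift)  # y = c * base^m + d

    z0 = karatsuba(a, c, base)                    # high parts
    z1 = karatsuba(b, d, base)                    # low parts
    z2 = karatsuba(a + b, c + d, base) - z0 - z1  # middle term

    # z0 is padded with 2m digits, z2 with m digits.
    return z0 * shift * shift + z2 * shift + z1


print(karatsuba(123, 12))
print(karatsuba(123, 123))
print(karatsuba(1234, 12))
```

For 1234 * 12 this gives m = 2, so a = 12, b = 34, c = 0, d = 12. The shorter number simply gets a high part of 0 instead of being split in its own middle.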

SympifyError: SympifyError: index when using a loop

I am having trouble using SymPy when changing the parameters in a loop. Before adding the loop it worked just fine, so I am a bit confused about what is going wrong. The idea is to calculate the fixed points of the equations below for varying parameters. I determined the parameters with a random algorithm beforehand.
data used
index c1 c2 c3 c4 c5
2 0.182984 2.016811 0.655393 1.581344 1000.0
3 0.481093 3.696431 0.174021 2.604066 1000.0
4 2.651888 0.665661 2.010521 1.004902 1000.0
5 4.356905 3.805205 0.169469 0.188154 1000.0
6 0.618898 1.205760 0.394822 0.624573 1000.0
7 1.628458 0.908339 0.117855 0.801636 1000.0
8 1.084346 0.251490 5.008077 4.606338 1000.0
9 0.314420 4.553279 0.279103 1.136288 1000.0
10 0.309323 3.447195 0.769426 1.058890 1000.0
11 1.353905 5.034620 3.025668 0.136687 1000.0
12 0.294230 0.590507 0.203964 0.105073 1000.0
13 0.433693 1.040195 0.197015 0.214636 1000.0
14 5.597691 2.734779 0.298786 6.869852 1000.0
15 0.106748 0.329506 1.642285 2.259433 1000.0
16 7.065243 0.138986 6.280275 0.265305 1000.0
17 0.676381 0.263757 6.540224 2.890927 1000.0
18 0.646750 2.573060 0.157341 1.779078 1000.0
19 2.829030 0.208247 0.102454 0.117786 1000.0
20 3.973703 0.134666 1.099034 4.255214 1000.0
df1 = df[df.columns[1]]
df2 = df[df.columns[2]]
df3 = df[df.columns[3]]
df4 = df[df.columns[4]]
EQ=[]
for i in df[:5]:
    a = df["c1"]
    b = df["c2"]
    c = df["c3"]
    d = df["c4"]
    Q = 1
    a1 = 0
    b1 = 0
    c1 = 0
    d1 = 0
    u,v = sm.symbols('u,v', negative=False)
    # equations
    U = a * u -a1* v**2 - b*v+b1*v + Q
    V = c * u -c1*u*v- d*v + d1 + Q
    # use sympy's way of setting equations to zero
    Uqual = sm.Eq(U, 0)
    Vqual = sm.Eq(V, 0)
    # compute fixed points
    equilibria = sm.solve( (Uqual, Vqual), u,v)
    print('The fixed point(s) of this system are: %s' %equilibria)
    equilibria.append(equilibria)
SympifyError Traceback (most recent call last)
<ipython-input-81-7104e05ced6a> in <module>
16 V = c * u -c1*u*v- d*v + d1 + Q
17 # use sympy's way of setting equations to zero
---> 18 Uqual = sm.Eq(U, 0)
19 Vqual = sm.Eq(V, 0)
20
~\anaconda3\lib\site-packages\sympy\core\relational.py in __new__(cls, lhs, rhs, **options)
501 rhs = 0
502 evaluate = options.pop('evaluate', global_parameters.evaluate)
--> 503 lhs = _sympify(lhs)
504 rhs = _sympify(rhs)
505 if evaluate:
~\anaconda3\lib\site-packages\sympy\core\sympify.py in _sympify(a)
510
511 """
--> 512 return sympify(a, strict=True)
513
514
~\anaconda3\lib\site-packages\sympy\core\sympify.py in sympify(a, locals, convert_xor, strict, rational, evaluate)
431
432 if strict:
--> 433 raise SympifyError(a)
434
435 if iterable(a):
SympifyError: SympifyError: index
1 0.32539361355594*u - 0.153951771353544*v + 1
2 0.111286178007145*u - 0.211620881593914*v + 1
3 0.410704332996077*u - 0.338148622964363*v + 1
4 1.39126513227539*u - 0.715390758416011*v + 1
5 0.289981428632838*u - 3.76334113661812*v + 1
...
96 0.450838908230239*u - 7.00849756407416*v + 1
97 4.59646738213032*u - 1.45107766000711*v + 1
98 6.28779804684458*u - 0.395831415205476*v + 1
99 0.196464087698782*u - 0.205057919337616*v + 1
100 1.69031014508742*u - 0.140571509904066*v + 1
Length: 100, dtype: object
In an isympy session:
Make a sample dataframe:
In [11]: import pandas as pd
In [12]: df = pd.DataFrame(np.arange(12).reshape(3,4))
In [13]: df
Out[13]:
0 1 2 3
0 0 1 2 3
1 4 5 6 7
2 8 9 10 11
Set up a non-iterative case:
In [15]: u,v = symbols('u,v', negative=False)
In [16]: a,a1,b,b1 = 1,2,3,4
In [17]: U = a * u -a1* v**2 - b*v+b1*v
In [18]: U
Out[18]:
       2
u - 2⋅v  + v
Versus one with dataframe values:
In [19]: a,b = df[0],df[1]
In [20]: a,b
Out[20]:
(0 0
1 4
2 8
Name: 0, dtype: int64,
0 1
1 5
2 9
Name: 1, dtype: int64)
In [21]: U = a * u -a1* v**2 - b*v+b1*v
In [22]: U
Out[22]:
0 -2*v**2 + 3*v
1 4*u - 2*v**2 - v
2 8*u - 2*v**2 - 5*v
dtype: object
This U is a pandas Series, with object elements (which are sympy expressions). But U itself is not sympy.
Eq applied to the simple expression:
In [23]: Eq(Out[18],0)
Out[23]:
       2
u - 2⋅v  + v = 0
Your error - Eq applied to the Series:
In [24]: Eq(Out[22],0)
---------------------------------------------------------------------------
SympifyError Traceback (most recent call last)
Input In [24], in <cell line: 1>()
----> 1 Eq(Out[22],0)
File /usr/local/lib/python3.8/dist-packages/sympy/core/relational.py:626, in Equality.__new__(cls, lhs, rhs, **options)
624 rhs = 0
625 evaluate = options.pop('evaluate', global_parameters.evaluate)
--> 626 lhs = _sympify(lhs)
627 rhs = _sympify(rhs)
628 if evaluate:
File /usr/local/lib/python3.8/dist-packages/sympy/core/sympify.py:528, in _sympify(a)
502 def _sympify(a):
503 """
504 Short version of :func:`~.sympify` for internal usage for ``__add__`` and
505 ``__eq__`` methods where it is ok to allow some things (like Python
(...)
526
527 """
--> 528 return sympify(a, strict=True)
File /usr/local/lib/python3.8/dist-packages/sympy/core/sympify.py:449, in sympify(a, locals, convert_xor, strict, rational, evaluate)
446 continue
448 if strict:
--> 449 raise SympifyError(a)
451 if iterable(a):
452 try:
SympifyError: SympifyError: 0 -2*v**2 + 3*v
1 4*u - 2*v**2 - v
2 8*u - 2*v**2 - 5*v
dtype: object
Eq() does not have an 'iterate over Series' (or even over list) capability.
We can iterate (list comprehension) and apply Eq to each term of the Series:
In [25]: [Eq(U[i],0) for i in range(3)]
Out[25]:
⎡     2                     2                   2          ⎤
⎣- 2⋅v  + 3⋅v = 0, 4⋅u - 2⋅v  - v = 0, 8⋅u - 2⋅v  - 5⋅v = 0⎦
As a general rule, sympy and pandas/numpy do not work well together.
It's hard to understand what you are trying to achieve with the code you posted above. So, the following represents my guess:
# NOTE: the following variables are of type
# pandas.core.series.Series. They are iterables
# (think of them as arrays)
a = df["c1"]
b = df["c2"]
c = df["c3"]
d = df["c4"]
# constants
Q = 1
a1 = 0
b1 = 0
c1 = 0
d1 = 0
# symbols (assumes: from sympy import symbols, solve)
u, v = symbols('u, v', negative=False)
# equations
# NOTE: because a, b, c, d are iterables, then U, V
# will be iterables too. Each element will be a SymPy
# expression because you used the symbols u and v.
U = a * u - a1 * v**2 - b * v + b1 * v + Q
V = c * u - c1 * u * v - d * v + d1 + Q
EQ = []
# loop over the equations and solve them
for u_eq, v_eq in zip(U, V):
    # here we are asking to solve u_eq=0 and v_eq=0
    equilibria = solve((u_eq, v_eq), (u, v))
    EQ.append(equilibria)
    print('The fixed point(s) of this system are: %s' % equilibria)
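Because a1, b1, c1 and d1 are all zero in the posted code, each row's system is linear in u and v, so the result of solve can be cross-checked without SymPy. The following is only a sketch under that assumption: it applies Cramer's rule to a*u - b*v + Q = 0 and c*u - d*v + Q = 0. The function name `fixed_point` is hypothetical, the two sample rows are taken from the question's table, and the sign restriction on u and v is ignored.

```python
def fixed_point(a, b, c, d, q=1.0):
    """Solve a*u - b*v + q = 0 and c*u - d*v + q = 0 for (u, v).

    Returns None when the determinant is zero (no unique solution).
    """
    det = b * c - a * d
    if det == 0:
        return None
    u = q * (d - b) / det
    v = q * (c - a) / det
    return u, v


# First two parameter rows (c1..c4) from the question's data.
rows = [
    (0.182984, 2.016811, 0.655393, 1.581344),
    (0.481093, 3.696431, 0.174021, 2.604066),
]

for a, b, c, d in rows:
    print(fixed_point(a, b, c, d))
```

If any of a1, b1, c1, d1 becomes non-zero the system is no longer linear and this shortcut does not apply.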

How many iterations of the while loop must be made in order to locate it?

I'm having some trouble finding out why the correct answer to this question is 4. Could anyone be kind enough to briefly explain why? Thanks in advance! Here's the question:
Consider the array a with values as shown:
4, 7, 19, 25, 36, 37, 50, 100, 101, 205, 220, 271, 306, 321
where 4 is a[0] and 321 is a[13]. Suppose that the search method is called with
first = 0 and last = 13 to locate the key 205. How many iterations of the while loop must be made in order to locate it?
My guess is that you have to use a binary search here, as the items are sorted.
Given this array
[0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13]
4, 7, 19, 25, 36, 37, 50, 100, 101, 205, 220, 271, 306, 321
You initialize with:
left and right indexes: l = 0, r = 14 (= length of array)
Then you need these iterations:
m = (l + r) / 2 = (0 + 14) / 2 = 7
[m = 7] = 100 is < 205 ==> l = 7 + 1
m = (l + r) / 2 = (8 + 14) / 2 = 11
[m = 11] = 271 is > 205 ==> r = 11 (r is exclusive in this variant)
m = (l + r) / 2 = (8 + 11) / 2 = 9 (integer division)
[m = 9] = 205 is = 205 ==> result = [9]
= 3 iterations!
However, a slight change to the algorithm can change the number of iterations. If you take r = N-1 instead of N as initial value then you get:
m = (l + r) / 2 = (0 + 13) / 2 = 6 (integer division)
[m = 6] = 50 is < 205 ==> l = 6 + 1
m = (l + r) / 2 = (7 + 13) / 2 = 10
[m = 10] = 220 is > 205 ==> r = 10 - 1
m = (l + r) / 2 = (7 + 9) / 2 = 8
[m = 8] = 101 is < 205 ==> l = 8 + 1
m = (l + r) / 2 = (9 + 9) / 2 = 9
[m = 9] = 205 is = 205 ==> result = [9]
= 4 iterations!
So the result depends on implementation details. Both variants are correct. Take care to choose the appropriate loop condition (I think l < r for the first and l <= r for the second algorithm).
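To make the dependence on implementation details concrete, here is a small sketch that counts loop iterations for both conventions. The function names are hypothetical and the question's actual search method is not shown, so this is only an illustration. The inclusive version starts with hi = N - 1 and loops while lo <= hi; the half-open version starts with hi = N, loops while lo < hi, and sets hi = mid, as is conventional for an exclusive upper bound.

```python
def search_inclusive(a, key):
    """Binary search on a[lo..hi], both ends inclusive."""
    lo, hi = 0, len(a) - 1
    iterations = 0
    while lo <= hi:
        iterations += 1
        mid = (lo + hi) // 2
        if a[mid] == key:
            return mid, iterations
        if a[mid] < key:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, iterations


def search_half_open(a, key):
    """Binary search on a[lo..hi), upper end exclusive."""
    lo, hi = 0, len(a)
    iterations = 0
    while lo < hi:
        iterations += 1
        mid = (lo + hi) // 2
        if a[mid] == key:
            return mid, iterations
        if a[mid] < key:
            lo = mid + 1
        else:
            hi = mid
    return -1, iterations


a = [4, 7, 19, 25, 36, 37, 50, 100, 101, 205, 220, 271, 306, 321]
print(search_inclusive(a, 205))  # (index, iterations) -> (9, 4)
print(search_half_open(a, 205))  # (index, iterations) -> (9, 3)
```

Since the question calls the method with first = 0 and last = 13, the inclusive variant is presumably the one intended, which matches the expected answer of 4.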
Just go from the last index.
You start at index 13; after the first iteration you are at index 12, and on the 4th iteration you are at index 9, which holds 205.

Vectorize 2d convolution on matlab

I have this code for computing the two-dimensional convolution of two given arrays.
[r,c] = size(x);
[m,n] = size(y);
h = rot90(y, 2);
center = floor((size(h)+1)/2);
Rep = zeros(r + m*2-2, c + n*2-2);
% copy x into the middle of the zero-padded array
for x1 = m : m+r-1
    for y1 = n : n+c-1
        Rep(x1,y1) = x(x1-m+1, y1-n+1);
    end
end
B = zeros(r+m-1,n+c-1);
for x1 = 1 : r+m-1
    for y1 = 1 : n+c-1
        for i = 1 : m
            for j = 1 : n
                B(x1, y1) = B(x1, y1) + (Rep(x1+i-1, y1+j-1) * h(i, j));
            end
        end
    end
end
How can I vectorize it so that no for loops remain?
Thanks in advance.
Here's what I came up with:
%// generate test matrices
x = randi(12, 4, 5)
y = [2 2 2;
2 0 2;
2 2 2]
[r,c] = size(x);
%[m,n] = size(y); %// didn't use this
h = rot90(y, 2);
center = floor((size(h)+1)/2);
Rep = zeros(size(x)+size(h)-1); %// create image of zeros big enough to pad x
Rep(center(1):center(1)+r-1, center(2):center(2)+c-1) = x; %// and copy x into the middle
%// all of this can be compressed onto one line, if desired
%// I'm just breaking it out into steps for clarity
CRep = im2col(Rep, size(h), 'sliding'); %// 'sliding' is the default, but just to be explicit
k = h(:); %// turn h into a column vector
BRow = bsxfun(@times, CRep, k); %// multiply k times each column of CRep
B = reshape(sum(BRow), r, c) %// take the sum of each column and reshape to match x
T = conv2(Rep, h, 'valid') %// take the convolution using conv2 to check
assert(isequal(B, T), 'Result did not match conv2.');
Here are the results of a sample run:
x =
11 12 11 2 8
5 9 2 3 2
7 9 3 4 8
7 10 8 5 4
y =
2 2 2
2 0 2
2 2 2
B =
52 76 56 52 14
96 120 106 80 50
80 102 100 70 36
52 68 62 54 34
T =
52 76 56 52 14
96 120 106 80 50
80 102 100 70 36
52 68 62 54 34
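For readers without the Image Processing Toolbox, the data flow of the im2col approach can be sketched in plain Python. This is an illustration of the idea only, not a vectorized implementation: it pads x with zeros, collects every sliding patch, and takes the dot product of each patch with the flattened kernel h. The names `rot180` and `conv2_im2col` are hypothetical, and the inputs are the matrices from the sample run above.

```python
def rot180(kernel):
    """Rotate a 2-D list by 180 degrees (like MATLAB's rot90(y, 2))."""
    return [row[::-1] for row in kernel[::-1]]


def conv2_im2col(x, y):
    """Patch-times-kernel sum with the same output size as x."""
    r, c = len(x), len(x[0])
    h = rot180(y)
    m, n = len(h), len(h[0])

    # Zero-based equivalent of center = floor((size(h)+1)/2).
    top, left = (m + 1) // 2 - 1, (n + 1) // 2 - 1

    # Zero-padded copy of x, large enough for every sliding patch.
    rep = [[0] * (c + n - 1) for _ in range(r + m - 1)]
    for i in range(r):
        for j in range(c):
            rep[top + i][left + j] = x[i][j]

    k = [value for row in h for value in row]  # flattened kernel

    out = []
    for i in range(r):
        out_row = []
        for j in range(c):
            # One "column" of im2col: the patch starting at (i, j).
            patch = [rep[i + p][j + q] for p in range(m) for q in range(n)]
            out_row.append(sum(a * b for a, b in zip(patch, k)))
        out.append(out_row)
    return out


x = [[11, 12, 11, 2, 8],
     [5, 9, 2, 3, 2],
     [7, 9, 3, 4, 8],
     [7, 10, 8, 5, 4]]
y = [[2, 2, 2],
     [2, 0, 2],
     [2, 2, 2]]

for row in conv2_im2col(x, y):
    print(row)
```

In MATLAB the two inner loops disappear because im2col builds all patches at once and bsxfun multiplies them by the kernel in a single call.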

Inserting One Row Each Time in a Sequence from Matrix into Another Matrix After Every nth Row in Matlab

I have matrix A and matrix B. Matrix A is 100*3. Matrix B is 10*3. I need to insert one row from matrix B each time in a sequence into matrix A after every 10th row. The result would be Matrix A with 110*3. How can I do this in Matlab?
Here's another indexing-based approach:
n = 10;
C = [A; B];
[~, ind] = sort([1:size(A,1) n*(1:size(B,1))+.5]);
C = C(ind,:);
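The trick here is that every row gets a sort key: rows of A get 1, 2, 3, ..., and the k-th row of B gets n*k + 0.5, which places it right after row n*k of A. A rough Python sketch of the same idea, using lists of rows and a hypothetical function name `interleave_by_sort`:

```python
def interleave_by_sort(a_rows, b_rows, n):
    """Insert one row of b_rows after every n-th row of a_rows."""
    # Keys 1..len(a_rows) for A; n*k + 0.5 for the k-th row of B.
    keys = list(range(1, len(a_rows) + 1))
    keys += [n * k + 0.5 for k in range(1, len(b_rows) + 1)]

    stacked = a_rows + b_rows
    order = sorted(range(len(stacked)), key=lambda i: keys[i])
    return [stacked[i] for i in order]


# Test data shaped like the question: A is 100 x 3, B is 10 x 3.
A = [[3 * i + 1, 3 * i + 2, 3 * i + 3] for i in range(100)]
B = [[1000 + 3 * j + 1, 1000 + 3 * j + 2, 1000 + 3 * j + 3] for j in range(10)]

C = interleave_by_sort(A, B, 10)
print(len(C))   # 110
print(C[10])    # first row of B
```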
For canonical purposes, here's how you'd do it via loops. This is a bit inefficient since you're mutating the array at each iteration, but it's really simple to read. Given that your two matrices are stored in A (100 x 3) and B (10 x 3), you would do:
out = [];
for idx = 1 : 10
    out = [out; A((idx-1)*10 + 1 : 10*idx,:); B(idx,:)];
end
At each iteration, we pick out 10 rows of A and 1 row of B and we concatenate these 11 rows onto out. This happens 10 times, resulting in 110 rows with 3 columns.
Here's an index-based approach:
%//pre-allocate output matrix
matrixC = zeros(110, 3);
%//create index array for the locations in matrixC that would be populated by matrixB
idxArr = (1:10) * 11;
%//place matrixB into matrixC
matrixC(idxArr,:) = matrixB;
%//place matrixA into matrixC
%//setdiff is used to exclude indexes already populated by values from matrixB
matrixC(setdiff(1:110, idxArr),:) = matrixA;
And just for fun here's the same approach sans magic numbers:
%//define how many rows to take from matrixA at once
numRows = 10;
%//get dimensions of input matrices
lengthA = size(matrixA, 1);
lengthB = size(matrixB, 1);
matrixC = zeros(lengthA + lengthB, 3);
idxArr = (1:lengthB) * (numRows + 1);
matrixC(idxArr,:) = matrixB;
matrixC(setdiff(1:size(matrixC, 1), idxArr),:) = matrixA;
Just for fun... Now with more robust test matrices!
A = ones(3, 100);
A(:) = 1:300;
A = A.'
B = ones(3, 10);
B(:) = 1:30;
B = B.' + 1000
C = reshape(A.', 3, 10, []);
C(:,end+1,:) = permute(B, [2 3 1]);
D = permute(C, [2 3 1]);
E = reshape(D, 110, 3)
Input:
A =
1 2 3
4 5 6
7 8 9
10 11 12
13 14 15
16 17 18
19 20 21
22 23 24
25 26 27
28 29 30
31 32 33
34 35 36
...
B =
1001 1002 1003
1004 1005 1006
...
Output:
E =
1 2 3
4 5 6
7 8 9
10 11 12
13 14 15
16 17 18
19 20 21
22 23 24
25 26 27
28 29 30
1001 1002 1003
31 32 33
34 35 36
...
Thanks to @Divakar for pointing out my previous error.
Solution Code
Here's an implementation based on logical indexing (also known as masking), which should be pretty efficient when working with large arrays -
%// Number of rows of A between two inserted rows of B (10 in the question)
A_cutrow = 10;
%// Get sizes of A and B
[M,d] = size(A);
N = size(B,1);
%// Mask of row indices where rows from A would be placed
mask_idx = reshape([true(A_cutrow,M/A_cutrow) ; false(1,N)],[],1);
%// Pre-allocate with zeros:
%// http://undocumentedmatlab.com/blog/preallocation-performance
out(M+N,d) = 0;
%// Insert A and B using mask and ~mask
out(mask_idx,:) = A;
out(~mask_idx,:) = B;
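The mask is simply the pattern "A_cutrow times true, then one false", repeated once per row of B. A small Python sketch of the same idea, assuming (as the MATLAB code does) that the number of rows of B equals the number of groups in A; the name `interleave_by_mask` is hypothetical:

```python
def interleave_by_mask(a_rows, b_rows, n):
    """Insert one row of b_rows after every n-th row of a_rows."""
    groups = len(a_rows) // n
    if len(a_rows) != groups * n or len(b_rows) != groups:
        raise ValueError("need len(a_rows) == n * len(b_rows)")

    # True marks an output row taken from A, False one taken from B.
    mask = ([True] * n + [False]) * groups

    a_iter, b_iter = iter(a_rows), iter(b_rows)
    return [next(a_iter) if from_a else next(b_iter) for from_a in mask]


A_rows = [[i, i, i] for i in range(1, 101)]    # 100 x 3
B_rows = [[-j, -j, -j] for j in range(1, 11)]  # 10 x 3

out_rows = interleave_by_mask(A_rows, B_rows, 10)
print(len(out_rows))  # 110
print(out_rows[10])   # first row of B
```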
Benchmarking
%// Setup inputs
A = rand(100000,3);
B = rand(10000,3);
A_cutrow = 10;
num_iter = 200; %// Number of iterations to be run for each approach
%// Warm up tic/toc.
for k = 1:50000
    tic(); elapsed = toc();
end
disp(' ------------------------------- With MASKING')
tic
for iter = 1:num_iter
    [M,d] = size(A);
    N = size(B,1);
    mask_idx = reshape([true(A_cutrow,M/A_cutrow) ; false(1,N)],[],1);
    out(M+N,d) = 0;
    out(mask_idx,:) = A;
    out(~mask_idx,:) = B;
    clear out
end
toc, clear mask_idx N M d iter
disp(' ------------------------------- With SORT')
tic
for iter = 1:num_iter
    C = [A; B];
    [~, ind] = sort([1:size(A,1) A_cutrow*(1:size(B,1))+.5]);
    C = C(ind,:);
end
toc, clear C ind iter
disp(' ------------------------------- With RESHAPE+PERMUTE')
tic
for iter = 1:num_iter
    [M,d] = size(A);
    N = size(B,1);
    C = reshape(A.', d, A_cutrow , []);
    C(:,end+1,:) = permute(B, [2 3 1]);
    D = permute(C, [2 1 3]);
    out = reshape(permute(D,[1 3 2]),M+N,[]);
end
toc, clear out D C N M d iter
disp(' ------------------------------- With SETDIFF')
tic
for iter = 1:num_iter
    lengthA = size(A, 1);
    lengthB = size(B, 1);
    matrixC = zeros(lengthA + lengthB, 3);
    idxArr = (1:lengthB) * (A_cutrow + 1);
    matrixC(idxArr,:) = B;
    matrixC(setdiff(1:size(matrixC, 1), idxArr),:) = A;
end
toc, clear matrixC idxArr lengthA lengthB
disp(' ------------------------------- With FOR-LOOP')
tic
for iter = 1:num_iter
    [M,d] = size(A);
    N = size(B,1);
    Mc = M/A_cutrow;
    out(M+N,d) = 0;
    for idx = 1 : Mc
        out( 1+(idx-1)*(A_cutrow +1): idx*(A_cutrow+1), :) = ...
            [A( 1+(idx-1)*A_cutrow : idx*A_cutrow , : ) ; B(idx,:)];
    end
    clear out
end
toc
Runtimes
Case #1: A as 100 x 3 and B as 10 x 3
------------------------------- With MASKING
Elapsed time is 4.987088 seconds.
------------------------------- With SORT
Elapsed time is 5.056301 seconds.
------------------------------- With RESHAPE+PERMUTE
Elapsed time is 5.170416 seconds.
------------------------------- With SETDIFF
Elapsed time is 35.063020 seconds.
------------------------------- With FOR-LOOP
Elapsed time is 12.118992 seconds.
Case #2: A as 100000 x 3 and B as 10000 x 3
------------------------------- With MASKING
Elapsed time is 1.167707 seconds.
------------------------------- With SORT
Elapsed time is 2.667149 seconds.
------------------------------- With RESHAPE+PERMUTE
Elapsed time is 2.603110 seconds.
------------------------------- With SETDIFF
Elapsed time is 3.153900 seconds.
------------------------------- With FOR-LOOP
Elapsed time is 19.822912 seconds.
Please note that num_iter was different for these two cases, as the idea was to keep the runtimes > 1 sec mark to compensate for tic-toc overheads.

Matlab - arranging numbers

I have vectors m, x, y & I want m1, x1, y1 as commented below:
% given
m = [-4 -3 -2 2 3 4];
x = [2 5 6 7 9 1];
y = [10 23 34 54 27 32];
% required
% m1 = [2 3 4]; % only +ve value from m
% x1 = [13 14 3]; % adding numbers(in x) corres. to -ve & +ve value in m & putting below 2, 3, 4 respectively
% y1 = [88 50 42]; % adding numbers(in y) corres. to -ve & +ve value in m & putting below 2, 3, 4 respectively
m1 = m(m > 0) % this gives me m1 as required
Any hint for x1, y1 will be very helpful.
Assuming m is built as [vectorNegativeReversed, vectorPositiveOriginal] the solution can be quite straightforward:
p = numel(m)/2;
m1 = m(p+1:end)
x1 = x(p+1:end) + x(p:-1:1)
y1 = y(p+1:end) + y(p:-1:1)
What about some flippy action:
m = [-4 -3 -2 2 3 4];
x = [2 5 6 7 9 1];
y = [10 23 34 54 27 32];
idx = find( (m > 0) );
xdi = find( ~(m > 0) );
m1 = m(idx)
x1 = fliplr( x(xdi) ) + x(idx)
y1 = fliplr( y(xdi) ) + y(idx)
returning:
m1 =
2 3 4
x1 =
13 14 3
y1 =
88 50 42
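Both answers rely on m being laid out as mirrored negatives followed by positives. If that ordering is not guaranteed, one alternative is to group by the absolute value of m instead. The following is a sketch of that variation (not of the answers' method), with a hypothetical function name `fold_by_abs`:

```python
def fold_by_abs(m, x, y):
    """Sum the x and y entries whose m values share an absolute value."""
    sums = {}
    for mi, xi, yi in zip(m, x, y):
        sx, sy = sums.get(abs(mi), (0, 0))
        sums[abs(mi)] = (sx + xi, sy + yi)

    m1 = sorted(sums)
    x1 = [sums[key][0] for key in m1]
    y1 = [sums[key][1] for key in m1]
    return m1, x1, y1


m = [-4, -3, -2, 2, 3, 4]
x = [2, 5, 6, 7, 9, 1]
y = [10, 23, 34, 54, 27, 32]

m1, x1, y1 = fold_by_abs(m, x, y)
print(m1)  # [2, 3, 4]
print(x1)  # [13, 14, 3]
print(y1)  # [88, 50, 42]
```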
