Symbolic Math with n - symbolic-math

Suppose we want to find summation formula for a sequence. The easiest one would be
x1 = 1; x2 = 2; ... ; xn = n; ...
We all know the sum of the first n items is (n+1)n/2.
My question is how to find the last formula using symbolic calculation with Sympy or matlab or any other software. The difficulty I have is how to deal with n. For example, if each entry in a sequence can be written as a function of n, such as, an = n^2, where n=1, 2, ... Now how do I use symbolic calculation to get a formula for a1+a2+...+an, please? Note I want a formula in terms of a general n without specify a value for n. Is this even possible? If so, how? Thank you!

The answer depends on the syntax of your symbolic math software. For example: here is the solution using Wolfram alpha: Sum[i, {i, 1, n}]
- n is just n - that's why it is called symbolic.

As #ChistianFries say, dealing with an arbitrary symbolic variable like your n is what symbolic math is all about.
In Matlab, you should start by reading up on the Symbolic Math toolbox and (optionally) MuPAD, which is a separate environment a bit like Mathematica. Symbolic summation can be accomplished in Matlab via symsum:
syms x n;
symsum(x,1,n)
which returns (n*(n + 1))/2 as expected. I recommend reading the documentation for symsum, and the other symbolic math functions, and trying out the examples provided.

Related

Matrix inverse calculation

can we calculate the inverse of a matrix in codesys?
I am trying to write a code for the following equation.
For Codesys there is a paid matrix library. For TwinCAT there is the free and open source TcMatrix.
This is a case where you don't need to calculate the inverse explicitly and indeed are better not to, both for performance and accuracy.
Changing notation, so that it's easier to type here, you want
V = inv( I + A'*A) *A'*B
This can be calculated like this:
compute
b = A'*B
C = I + A'*A
solve
C = L*L' for lower triangular L
This is cholesky decomposition, which can be done in place, and you should read up on
solve
L*x = b for x
L'*v = x for v
Note that because L is lower triangular (and so L' upper triangular) each of the above solutions can be done in O(dim*dim) and can be done in place if desired.
V is what you wanted.

Generating random number in sorted order

I want to generate random number in sorted order.
I wrote below code:
void CreateSortedNode(pNode head)
{
int size = 10, last = 0;
pNode temp;
while(size-- > 0) {
temp = (pnode)malloc(sizeof(struct node));
last += (rand()%10);
temp->data = last;//randomly generate number in sorted order
list_add(temp);
}
}
[EDIT:]
Expecting number will be generated in increased or decreased order: i.e {2, 5, 9, 23, 45, 68 }
int main()
{
int size = 10, last = 0;
while(size-- > 0) {
last += (rand()%10);
printf("%4d",last);
}
return 0;
}
Any better idea?
Solved back in 1979 (by Bentley and Saxe at Carnegie-Mellon):
https://apps.dtic.mil/dtic/tr/fulltext/u2/a066739.pdf
The solution is ridiculously compact in terms of code too!
Their paper is in Pascal, I converted it to Python so it should work with any language:
from random import random
cur_max=100 #desired maximum random number
n=100 #size of the array to fill
x=[0]*(n) #generate an array x of size n
for i in range(n,0,-1):
cur_max=cur_max*random()**(1/i) #the magic formula
x[i-1]=cur_max
print(x) #the results
Enjoy your sorted random numbers...
Without any information about sample size or sample universe, it's not easy to know if the following is interesting but irrelevant or a solution, but since it is in any case interesting, here goes.
The problem:
In O(1) space, produce an unbiased ordered random sample of size n from an ordered set S of size N: <S1,S2,…SN>, such that the elements in the sample are in the same order as the elements in the ordered set.
The solution:
With probability n/|S|, do the following:
add S1 to the sample.
decrement n
Remove S1 from S
Repeat steps 1 and 2, each time with the new first element (and size) of S until n is 0, at which point the sample will have the desired number of elements.
The solution in python:
from random import randrange
# select n random integers in order from range(N)
def sample(n, N):
# insist that 0 <= n <= N
for i in range(N):
if randrange(N - i) < n:
yield i
n -= 1
if n <= 0:
break
The problem with the solution:
It takes O(N) time. We'd really like to take O(n) time, since n is likely to be much smaller than N. On the other hand, we'd like to retain the O(1) space, in case n is also quite large.
A better solution (outline only)
(The following is adapted from a 1987 paper by Jeffrey Scott Vitter, "An Efficient Algorithm for Sequential Random Sampling". See Dr. Vitter's publications page.. Please read the paper for the details.)
Instead of incrementing i and selecting a random number, as in the above python code, it would be cool if we could generate a random number according to some distribution which would be the number of times that i will be incremented without any element being yielded. All we need is the distribution (which will obviously depend on the current values of n and N.)
Of course, we can derive the distribution precisely from an examination of the algorithm. That doesn't help much, though, because the resulting formula requires a lot of time to compute accurately, and the end result is still O(N).
However, we don't always have to compute it accurately. Suppose we have some easily computable reasonably good approximation which consistently underestimates the probabilities (with the consequence that it will sometimes not make a prediction). If that approximation works, we can use it; if not, we'll need to fallback to the accurate computation. If that happens sufficiently rarely, we might be able to achieve O(n) on the average. And indeed, Dr. Vitter's paper shows how to do this. (With code.)
Suppose you wanted to generate just three random numbers, x, y, and z so that they are in sorted order x <= y <= z. You will place these in some C++ container, which I'll just denote as a list like D = [x, y, z], so we can also say that x is component 0 of D, or D_0 and so on.
For any sequential algorithm that first draws a random value for x, let's say it comes up with 2.5, then this tells us some information about what y has to be, Namely, y >= 2.5.
So, conditional on the value of x, your desired random number algorithm has to satisfy the property that p(y >= x | x) = 1. If the distribution you are drawing from is anything like a common distribution, like uniform or Guassian, then it's clear to see that usually p(y >= x) would be some other expression involving the density for that distribution. (In fact, only a pathological distribution like a Dirac Delta at "infinity" could be independent, and would be nonsense for your application.)
So what we can speculate with great confidence is that p(y >= t | x) for various values of t is not equal to p(y >= t). That's the definition for dependent random variables. So now you know that the random variable y (second in your eventual list) is not statistically independent of x.
Another way to state it is that in your output data D, the components of D are not statistically independent observations. And in fact they must be positively correlated since if we learn that x is bigger than we thought, we also automatically learn that y is bigger than or equal to what we thought.
In this sense, a sequential algorithm that provides this kind of output is an example of a Markov Chain. The probability distribution of a given number in the sequence is conditionally dependent on the previous number.
If you really want a Markov Chain like that (I suspect that you don't), then you could instead draw a first number at random (for x) and then draw positive deltas, which you will add to each successive number, like this:
Draw a value for x, say 2.5
Draw a strictly positive value for y-x, say 13.7, so y is 2.5 + 13.7 = 16.2
Draw a strictly positive value for z-y, say 0.001, so z is 16.201
and so on...
You just have to acknowledge that the components of your result are not statistically independent, and so you cannot use them in an application that relies on statistical independence assumptions.

fast algorithm of finding sums in array

I am looking for a fast algorithm:
I have a int array of size n, the goal is to find all patterns in the array that
x1, x2, x3 are different elements in the array, such that x1+x2 = x3
For example I know there's a int array of size 3 is [1, 2, 3] then there's only one possibility: 1+2 = 3 (consider 1+2 = 2+1)
I am thinking about implementing Pairs and Hashmaps to make the algorithm fast. (the fastest one I got now is still O(n^2))
Please share your idea for this problem, thank you
Edit: The answer below applies to a version of this problem in which you only want one triplet that adds up like that. When you want all of them, since there are potentially at least O(n^2) possible outputs (as pointed out by ex0du5), and even O(n^3) in pathological cases of repeated elements, you're not going to beat the simple O(n^2) algorithm based on hashing (mapping from a value to the list of indices with that value).
This is basically the 3SUM problem. Without potentially unboundedly large elements, the best known algorithms are approximately O(n^2), but we've only proved that it can't be faster than O(n lg n) for most models of computation.
If the integer elements lie in the range [u, v], you can do a slightly different version of this in O(n + (v-u) lg (v-u)) with an FFT. I'm going to describe a process to transform this problem into that one, solve it there, and then figure out the answer to your problem based on this transformation.
The problem that I know how to solve with FFT is to find a length-3 arithmetic sequence in an array: that is, a sequence a, b, c with c - b = b - a, or equivalently, a + c = 2b.
Unfortunately, the last step of the transformation back isn't as fast as I'd like, but I'll talk about that when we get there.
Let's call your original array X, which contains integers x_1, ..., x_n. We want to find indices i, j, k such that x_i + x_j = x_k.
Find the minimum u and maximum v of X in O(n) time. Let u' be min(u, u*2) and v' be max(v, v*2).
Construct a binary array (bitstring) Z of length v' - u' + 1; Z[i] will be true if either X or its double [x_1*2, ..., x_n*2] contains u' + i. This is O(n) to initialize; just walk over each element of X and set the two corresponding elements of Z.
As we're building this array, we can save the indices of any duplicates we find into an auxiliary list Y. Once Z is complete, we just check for 2 * x_i for each x_i in Y. If any are present, we're done; otherwise the duplicates are irrelevant, and we can forget about Y. (The only situation slightly more complicated is if 0 is repeated; then we need three distinct copies of it to get a solution.)
Now, a solution to your problem, i.e. x_i + x_j = x_k, will appear in Z as three evenly-spaced ones, since some simple algebraic manipulations give us 2*x_j - x_k = x_k - 2*x_i. Note that the elements on the ends are our special doubled entries (from 2X) and the one in the middle is a regular entry (from X).
Consider Z as a representation of a polynomial p, where the coefficient for the term of degree i is Z[i]. If X is [1, 2, 3, 5], then Z is 1111110001 (because we have 1, 2, 3, 4, 5, 6, and 10); p is then 1 + x + x2 + x3 + x4 + x5 + x9.
Now, remember from high school algebra that the coefficient of xc in the product of two polynomials is the sum over all a, b with a + b = c of the first polynomial's coefficient for xa times the second's coefficient for xb. So, if we consider q = p2, the coefficient of x2j (for a j with Z[j] = 1) will be the sum over all i of Z[i] * Z[2*j - i]. But since Z is binary, that's exactly the number of triplets i,j,k which are evenly-spaced ones in Z. Note that (j, j, j) is always such a triplet, so we only care about ones with values > 1.
We can then use a Fast Fourier Transform to find p2 in O(|Z| log |Z|) time, where |Z| is v' - u' + 1. We get out another array of coefficients; call it W.
Loop over each x_k in X. (Recall that our desired evenly-spaced ones are all centered on an element of X, not 2*X.) If the corresponding W for twice this element, i.e. W[2*(x_k - u')], is 1, we know it's not the center of any nontrivial progressions and we can skip it. (As argued before, it should only be a positive integer.)
Otherwise, it might be the center of a progression that we want (so we need to find i and j). But, unfortunately, it might also be the center of a progression that doesn't have our desired form. So we need to check. Loop over the other elements x_i of X, and check if there's a triple with 2*x_i, x_k, 2*x_j for some j (by checking Z[2*(x_k - x_j) - u']). If so, we have an answer; if we make it through all of X without a hit, then the FFT found only spurious answers, and we have to check another element of W.
This last step is therefore O(n * 1 + (number of x_k with W[2*(x_k - u')] > 1 that aren't actually solutions)), which is maybe possibly O(n^2), which is obviously not okay. There should be a way to avoid generating these spurious answers in the output W; if we knew that any appropriate W coefficient definitely had an answer, this last step would be O(n) and all would be well.
I think it's possible to use a somewhat different polynomial to do this, but I haven't gotten it to actually work. I'll think about it some more....
Partially based on this answer.
It has to be at least O(n^2) as there are n(n-1)/2 different sums possible to check for other members. You have to compute all those, because any pair summed may be any other member (start with one example and permute all the elements to convince yourself that all must be checked). Or look at fibonacci for something concrete.
So calculating that and looking up members in a hash table gives amortised O(n^2). Or use an ordered tree if you need best worst-case.
You essentially need to find all the different sums of value pairs so I don't think you're going to do any better than O(n2). But you can optimize by sorting the list and reducing duplicate values, then only pairing a value with anything equal or greater, and stopping when the sum exceeds the maximum value in the list.

How is it possible to take an exponential of a matrix in MATLAB?

I have a MATLAB code which I have to convert to C language. According to the MATLAB code,
n1 = 11; x1 = randn(2,n1) + repmat([-1 1]’,1,n1);
w = [0 0]’;
here acccording to my calculation, the output of
w’*x1
will be a 1x3 matrix, that is a row vector as far as I know.
Then what will be the output of the following code,
z = exp(repmat(b,1,n1)+w’*x1);
where repmat() also creates a 1xn1 matrix (I'm not sure about this, figured it out from manual). My understanding is that addition of two 1x3 matrices will not give a scalar.
How is an exponential is taken here? Can exponential be applied to a matrix?
Like many MATLAB functions, the exp function operates element-wise when applied to arrays. For further details, please refer to the documentation.
Yes, you can apply exponentation to a matrix. From the Wikipedia article: Matrix exponential
Let X be an n×n real or complex matrix. The exponential of X, denoted by eX or exp(X), is the n×n matrix given by the power series
e^X = Sum(k=0, infinity) 1/k! * X^k
As #John Bartholomew pointed though, this is not what the exp() of Matlab does.

Perfect Power detection in linear time

I'm trying to write a C program which, given a positive integer n (> 1) detect whether exists numbers x and r so that n = x^r
This is what I did so far:
while (c>=d) {
double y = pow(sum, 1.0/d);
if (floor(y) == y) {
out = y;
break;
}
d++;
}
In the program above, "c" is the maxium value for the exponent (r) and "d" will start by being equal to 2. Y is the value to be checked and the variable "out" is set to output that value later on. Basically, what the script does, is to check if the square roots of y exists: if not, he tries with the square cube and so on... When he finds it, he store the value of y in "out" so that: y = out^d
My question is, is there any more efficient way to find these values? I found some documentation online, but that's far more complicated than my high-school algebra. How can I implement this in a more efficient way?
Thanks!
In one of your comments, you state you want this to be compatible with gigantic numbers. In that case, you may want to bring in the GMP library, which supports operations on arbitrarily large numbers, one of those operations being checking if it is a perfect power.
It is open source, so you can check out the source code and see how they do it, if you don't want to bring in the whole library.
If n fits in a fixed-size (e.g. 32-bit) integer variable, the optimal solution is probably just hard-coding the list of such numbers and binary-searching it. Keep in mind, in int range, there are roughly
sqrt(INT_MAX) perfect squares
cbrt(INT_MAX) perfect cubes
etc.
In 32 bits, that's roughly 65536 + 2048 + 256 + 128 + 64 + ... < 70000.
You need the r-base logarithm, use an identity to calculate it using the natural log
So:
log_r(x) = log(x)/log(r)
So you need to calculate:
x = log(n)/log(r)
(In my neck of the wood, this is highschool math. Which immediately explains my having to look up whether I remembered that identity correctly :))
After you are calculating y in
double y = pow(sum, 1.0/d);
you can get the nearest int to it and you can use your own power function to check for the
equality condition with sum.
int x = (int)(y+0.5);
int a = your_power_func(x,d);
if (a == sum)
break;
I guess this way you can confirm whether a number is integer power of some other number or not.

Resources