I tried modifying the code given here to solve linear equations for x, such as
(3*x+7)/3+(2*x)/9=6/10
by first splitting it into left and right expressions and then calling "SolveSimpleRoot". That worked and gave the value of x. But if the linear equation is written in the form
(3+2*x)/(5*x-2)=7, which you can multiply throughout by (5*x-2) and which is indeed linear, then the code fails at
// extract coefficients, solve known forms of order up to 1
MathNet.Symbolics.Expression[] coeff = MathNet.Symbolics.Polynomial.Coefficients(variable, simple);
with an error of:
The input sequence was empty. Parameter name: source
It also fails to solve an expression like (2*x+7)/x=2, which still expands out to be linear.
Any idea why?
The code is basically:
public void solveForX()
{
string eqn = "(3*x+7)/3+(2*x)/9=6/10";
string[] expString = eqn.Split('=');
var x = MathNet.Symbolics.Expression.Symbol("x");
MathNet.Symbolics.Expression aleft = MathNet.Symbolics.Infix.ParseOrThrow(expString[0]);
MathNet.Symbolics.Expression aright = MathNet.Symbolics.Infix.ParseOrThrow(expString[1]);
MathNet.Symbolics.Expression ans = SolveSimpleRoot(x, aleft - aright);
SelectionInsertText(MathNet.Symbolics.Infix.Print(ans));
}
private MathNet.Symbolics.Expression SolveSimpleRoot(MathNet.Symbolics.Expression variable, MathNet.Symbolics.Expression expr)
{
// try to bring expression into polynomial form
MathNet.Symbolics.Expression simple = MathNet.Symbolics.Algebraic.Expand(MathNet.Symbolics.Rational.Simplify(variable, expr));
// extract coefficients, solve known forms of order up to 1
MathNet.Symbolics.Expression[] coeff = MathNet.Symbolics.Polynomial.Coefficients(variable, simple);
switch (coeff.Length)
{
case 1: return variable;
case 2: return MathNet.Symbolics.Rational.Simplify(variable, MathNet.Symbolics.Algebraic.Expand(-coeff[0] / coeff[1]));
default: return MathNet.Symbolics.Expression.Undefined;
}
}
You can extend it to support such rational cases by multiplying both sides with the denominator (hence effectively taking the numerator only, with Rational.Numerator):
private Expr SolveSimpleRoot(Expr variable, Expr expr)
{
// try to bring expression into polynomial form
Expr simple = Algebraic.Expand(Rational.Numerator(Rational.Simplify(variable, expr)));
// extract coefficients, solve known forms of order up to 1
Expr[] coeff = Polynomial.Coefficients(variable, simple);
switch (coeff.Length)
{
case 1: return Expr.Zero.Equals(coeff[0]) ? variable : Expr.Undefined;
case 2: return Rational.Simplify(variable, Algebraic.Expand(-coeff[0] / coeff[1]));
default: return Expr.Undefined;
}
}
This way it can also handle (3+2*x)/(5*x-2)=7. I also fixed case 1 to return undefined if the coefficient is not equal to zero in this case, so the solution of the other example, (2*x+7)/x=2, will be undefined as expected.
Of course this still remains a very simplistic routine which will not be able to handle any higher order problems.
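To make the numerator trick concrete, here is a minimal Python sketch (not MathNet code) for equations of the assumed form (a + b*x)/(c*x + d) = k. The function name and its parameters are hypothetical; the sketch only mirrors the coefficient logic of the switch statement above, using exact fractions.

```python
from fractions import Fraction

def solve_rational_linear(a, b, c, d, k):
    """Solve (a + b*x) / (c*x + d) = k for x; return None if there is no unique root.

    Hypothetical helper for illustration only.
    """
    # The numerator of (a + b*x)/(c*x + d) - k is (a - k*d) + (b - k*c)*x.
    coeff0 = Fraction(a) - Fraction(k) * d
    coeff1 = Fraction(b) - Fraction(k) * c
    if coeff1 == 0:
        # Degree-0 numerator: no unique solution (compare the fixed "case 1").
        return None
    x = -coeff0 / coeff1
    # Reject a root that would make the original denominator zero.
    if c * x + d == 0:
        return None
    return x

print(solve_rational_linear(3, 2, 5, -2, 7))  # (3+2*x)/(5*x-2)=7
print(solve_rational_linear(7, 2, 1, 0, 2))   # (2*x+7)/x=2
```

The first call yields 17/33 and the second yields None, which corresponds to the "undefined" result described above.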
So, I know that some programming languages have a maximum integer value, and I am trying to find a way to guard against it. This is more of an algorithm and storage question. Basically, I am storing information in a database, and I want to test for the worst possible case (reaching a number so large that it is no longer supported) and find a solution to it. Assume I have a function called functionInTime that looks something like this:
functionInTime(){
currenttime = getCurrentTime();
foo(); // Random void function
endtime = getCurrentTime();
return (endtime - currenttime);
}
This function should essentially just check how long it takes to run the function foo. Additionally, assume there is a SUM function that looks similar to:
SUM(arrayOfNumbers){
var S = 0;
for(int i = 0; i < arrayOfNumbers.length; i++){
S = S + arrayOfNumbers[i];
}
return S;
}
There is also a function storeToDB, which looks something like this:
storeToDB(time){ // time is an INTEGER
dbInstance = new DBINSTANCE();
dbInstance.connectTo("wherever");
timesRun = dbInstance.get("times_run"); // return INTEGER
listOfTimes = dbInstance.get("times_list"); // return ARRAY
dbInstance.set("times_run", timesRun+1);
dbInstance.set("times_list", listOfTimes.addToArray(time));
dbInstance.close();
}
However, this is where problems start to stand out to me in terms of efficiency and lead me to believe that this is a terrible algorithm. Lastly assume I have a function called computeAverage, that is simply:
computeAverage(){
dbInstance = new DBINSTANCE();
dbInstance.connectTo("wherever");
timesRun = dbInstance.get("times_run"); // return INTEGER
listOfTimes = dbInstance.get("times_list"); // return ARRAY
return SUM(listOfTimes)/timesRun;
}
Since the end goal is to report the computed average each time, I think the above method should work. However, what I am trying to test is what happens if the value returned for the variable timesRun is so large that the language cannot support such a number. Parsing an array that long would take forever, but what if the size of an array is a number the language doesn't support? Removing or resetting the variables timesRun or listOfTimes would distort the data and throw off any other measurements that use this.
::: {"times_run": N} // N = number not supported by language
functionInTime();
printThisOut("The overall average is "+computeAverage()+" ms per run...");
::: ERROR (times_run is not a supported number)
How can I infinitely add data in a way that is both efficient and is resistant to a maximum number overflow?
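One possible direction, offered as a sketch rather than an answer from the thread: keep only a running mean and a count instead of the full list, and update the mean incrementally so that no ever-growing sum is stored. The class and method names below are hypothetical, and the persistence layer (dbInstance) is left out.

```python
class RunningAverage:
    """Tracks the mean of a stream of timings without storing them."""

    def __init__(self):
        self.count = 0    # number of samples seen so far
        self.mean = 0.0   # mean of those samples

    def add(self, value):
        # Incremental update: new_mean = old_mean + (value - old_mean) / n.
        self.count += 1
        self.mean += (value - self.mean) / self.count
        return self.mean

avg = RunningAverage()
for elapsed_ms in [10, 20, 30, 40]:
    avg.add(elapsed_ms)
print(avg.mean)
```

Only two values need to be stored per metric. The count itself can still hit a limit in a language with fixed-width integers, so this sketch assumes a sufficiently wide counter type there.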
I'm new to MATLAB, and I'm trying to implement the conjugate matrix transpose function ('), but I have no idea how to change the sign of only the imaginary part. I know it may be a stupid question, but thanks for any tips and advice.
I tried something like this, but I got these errors:
error: complex matrix type invalid as index value
error: assignment failed, or no method for ' = matrix'
function [ result ] = transpose_matrix( a )
[Row,Col] = size(a);
result = zeros(Col, Row);
iY=1;
for iRow=1:Row
iX=iRow;
for iCol=1:Col
result(iX)=a(iY);
iX=iX+Row;
iY=iY+1;
end
end
imag(result)=imag(result)*-1;
end
MATLAB is confused because the following statement tries to treat imag as a variable with result as an index since it's on the left-hand side of the assignment.
imag(result) = imag(result) * (-1);
Also, it's important to note that imag returns a real number: the signed coefficient of the imaginary component. Once you modify the output of imag, you need to multiply by sqrt(-1) to get it back to an imaginary number:
imag(a) * (-1) * 1i;
Now to modify only the imaginary component of result, you'll want to simply add this new imaginary component with the real component of result.
result = real(result) + imag(result) * (-1) * 1i;
Or more simply:
result = real(result) - imag(result) * 1i;
A Potential Alternative
If you can use the normal transpose function you could replace your entire function with the following:
result = transpose(a);
result = real(result) - imag(result) * 1i;
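For comparison, the same idea (transpose, keep the real part, negate the imaginary part) can be sketched in Python with a plain list-of-rows matrix. This is only an illustration of the technique, not MATLAB code, and the function name is hypothetical.

```python
def conjugate_transpose(a):
    """Return the conjugate transpose of a matrix given as a list of rows."""
    rows, cols = len(a), len(a[0])
    result = [[0j] * rows for _ in range(cols)]
    for i in range(rows):
        for j in range(cols):
            z = a[i][j]
            # Keep the real part, flip the sign of the imaginary part.
            result[j][i] = complex(z.real, -z.imag)
    return result

m = [[1 + 2j, 3 - 1j],
     [0 + 1j, 4 + 0j]]
print(conjugate_transpose(m))
```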
I am frequently needing to calculate mean and standard deviation for numeric arrays. So I've written a small protocol and extensions for numeric types that seems to work. I just would like feedback if there is anything wrong with how I have done this. Specifically, I am wondering if there is a better way to check if the type can be cast as a Double to avoid the need for the asDouble variable and init(_:Double) constructor.
I know there are issues with protocols that allow for arithmetic, but this seems to work ok and saves me from putting the standard deviation function into classes that need it.
protocol Numeric {
var asDouble: Double { get }
init(_: Double)
}
extension Int: Numeric {var asDouble: Double { get {return Double(self)}}}
extension Float: Numeric {var asDouble: Double { get {return Double(self)}}}
extension Double: Numeric {var asDouble: Double { get {return Double(self)}}}
extension CGFloat: Numeric {var asDouble: Double { get {return Double(self)}}}
extension Array where Element: Numeric {
var mean : Element { get { return Element(self.reduce(0, combine: {$0.asDouble + $1.asDouble}) / Double(self.count))}}
var sd : Element { get {
let mu = self.reduce(0, combine: {$0.asDouble + $1.asDouble}) / Double(self.count)
let variances = self.map{pow(($0.asDouble - mu), 2)}
return Element(sqrt(variances.mean))
}}
}
edit: I know it's kind of pointless to get [Int].mean and sd, but I might use numeric elsewhere so it's for consistency..
edit: as @Severin Pappadeux pointed out, variance can be expressed in a manner that avoids the triple pass on the array - mean then map then mean. Here is the final standard deviation extension:
extension Array where Element: Numeric {
var sd : Element { get {
let sss = self.reduce((0.0, 0.0)){ return ($0.0 + $1.asDouble, $0.1 + ($1.asDouble * $1.asDouble))}
let n = Double(self.count)
return Element(sqrt(sss.1/n - (sss.0/n * sss.0/n)))
}}
}
Swift 4 Array extension with FloatingPoint elements:
extension Array where Element: FloatingPoint {
func sum() -> Element {
return self.reduce(0, +)
}
func avg() -> Element {
return self.sum() / Element(self.count)
}
func std() -> Element {
let mean = self.avg()
let v = self.reduce(0, { $0 + ($1-mean)*($1-mean) })
return sqrt(v / (Element(self.count) - 1))
}
}
There's actually a class that provides this functionality already - called NSExpression. You could reduce your code size and complexity by using this instead. There's quite a bit of stuff to this class, but a simple implementation of what you want is as follows.
let expression = NSExpression(forFunction: "stddev:", arguments: [NSExpression(forConstantValue: [1,2,3,4,5])])
let standardDeviation = expression.expressionValueWithObject(nil, context: nil)
You can calculate mean too, and much more. Info here: http://nshipster.com/nsexpression/
In Swift 3 you might (or might not) be able to save yourself some duplication with the FloatingPoint protocol, but otherwise what you're doing is exactly right.
To follow up on Matt's observation, I'd do the main algorithm on FloatingPoint, taking care of Double, Float, CGFloat, etc. But then I'd do another rendition of this on BinaryInteger, to take care of all of the integer types.
E.g. on FloatingPoint:
extension Array where Element: FloatingPoint {
/// The mean average of the items in the collection.
var mean: Element { return reduce(Element(0), +) / Element(count) }
/// The unbiased sample standard deviation. Is `nil` if there are insufficient number of items in the collection.
var stdev: Element? {
guard count > 1 else { return nil }
return sqrt(sumSquaredDeviations() / Element(count - 1))
}
/// The population standard deviation. Is `nil` if there are insufficient number of items in the collection.
var stdevp: Element? {
guard count > 0 else { return nil }
return sqrt(sumSquaredDeviations() / Element(count))
}
/// Calculate the sum of the squares of the differences of the values from the mean
///
/// A calculation common for both sample and population standard deviations.
///
/// - calculate mean
/// - calculate deviation of each value from that mean
/// - square that
/// - sum all of those squares
private func sumSquaredDeviations() -> Element {
let average = mean
return map {
let difference = $0 - average
return difference * difference
}.reduce(Element(0), +)
}
}
But then on BinaryInteger:
extension Array where Element: BinaryInteger {
var mean: Double { return map { Double(exactly: $0)! }.mean }
var stdev: Double? { return map { Double(exactly: $0)! }.stdev }
var stdevp: Double? { return map { Double(exactly: $0)! }.stdevp }
}
Note, in my scenario, even when dealing with integer input data, I generally want floating point mean and standard deviations, so I arbitrarily chose Double. And you might want to do safer unwrapping of Double(exactly:). You can handle this scenario any way you want. But it illustrates the idea.
Not that I know Swift, but from a numerics point of view you're doing it a bit inefficiently.
Basically, you're doing two passes (actually, three) over the array to compute two values, where one pass should be enough. Variance can be expressed as E(X^2) - E(X)^2, so in C++-like pseudo-code:
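#include <cmath>
#include <utility>
#include <vector>

// Returns (mean, population standard deviation) computed in a single pass.
std::pair<float, float> get_mean_sd(const std::vector<float>& data) {
    float s = 0.0f;   // running sum of the values
    float s2 = 0.0f;  // running sum of the squared values
    for (float v : data) {
        s += v;
        s2 += v * v;
    }
    const float count = static_cast<float>(data.size());
    s /= count;       // E(X)
    s2 /= count;      // E(X^2)
    s2 -= s * s;      // variance = E(X^2) - E(X)^2
    // Clamp small negative values caused by rounding before taking the root.
    return {s, std::sqrt(s2 > 0.0f ? s2 : 0.0f)};
}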
Just a heads-up, but when I tested the code outlined by Severin Pappadeux the result was a "population standard deviation" rather than a "sample standard deviation". You would use the first in an instance where 100% of the relevant data is available to you, such as when you are computing the variance around an average grade for all 20 students in a class. You would use the second if you did not have universal access to all the relevant data, and had to estimate the variance from a much smaller sample, such as estimating the height of all males within a large country.
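The difference between the two definitions comes down to the divisor, n versus n - 1. A small Python sketch (function names are hypothetical, and the data set is made up for illustration) shows both side by side:

```python
import math

def sum_squared_deviations(values):
    """Sum of squared differences from the mean (two passes over the data)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def stdev_population(values):
    # Divide by n: all relevant data is available.
    return math.sqrt(sum_squared_deviations(values) / len(values))

def stdev_sample(values):
    # Divide by n - 1: the data is a sample of a larger population.
    return math.sqrt(sum_squared_deviations(values) / (len(values) - 1))

grades = [2, 4, 4, 4, 5, 5, 7, 9]
print(stdev_population(grades), stdev_sample(grades))
```

For this data the population value is exactly 2.0, and the sample value is slightly larger because of the smaller divisor.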
The population standard deviation is often denoted as StDevP. The Swift 5.0 code I used is shown below. Note that this is not suitable for very large arrays, because the low-order bits of small values are lost as the running sums get large. Especially when the variance is close to zero, rounding can push the computed variance slightly below zero, in which case sqrt returns NaN. For such serious work you might have to introduce an algorithm called compensated summation.
import Foundation
extension Array where Element: FloatingPoint
{
var sum: Element {
return self.reduce( 0, + )
}
var average: Element {
return self.sum / Element( count )
}
/**
(for a floating point array) returns a tuple containing the average and the "standard deviation for populations"
*/
var averageAndStandardDeviationP: ( average: Element, stDevP: Element ) {
let sumsTuple = sumAndSumSquared
let populationSize = Element( count )
let average = sumsTuple.sum / populationSize
let expectedXSquared = sumsTuple.sumSquared / populationSize
let variance = expectedXSquared - (average * average )
return ( average, sqrt( variance ) )
}
/**
(for a floating point array) returns a tuple containing the sum of all the values and the sum of all the values-squared
*/
private var sumAndSumSquared: ( sum: Element, sumSquared: Element ) {
return self.reduce( (Element(0), Element(0) ) )
{
( arg0, x) in
let (sumOfX, sumOfSquaredX) = arg0
return ( sumOfX + x, sumOfSquaredX + ( x * x ) )
}
}
}
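As a rough illustration of the compensated summation mentioned above, here is a minimal Python sketch of the Kahan variant. It is not part of the Swift extension; the function name is hypothetical, and Python's math.fsum is printed only as a reference value.

```python
import math

def kahan_sum(values):
    """Sum floats while carrying a correction term for lost low-order bits."""
    total = 0.0
    compensation = 0.0  # running estimate of the rounding error so far
    for v in values:
        y = v - compensation            # apply the previous correction
        t = total + y                   # low-order bits of y may be lost here
        compensation = (t - total) - y  # recover what was lost
        total = t
    return total

data = [0.1] * 1_000_000
print(sum(data), kahan_sum(data), math.fsum(data))
```

The same correction term could be carried for both the sum and the sum of squares in a one-pass standard deviation, at the cost of a few extra operations per element.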
So I'm trying to do what I've said above. The user will enter a precision, such as 3 decimal places, and then using the trapezium rule, the program will keep adding strips on until the 3rd decimal place is no longer changing, and then stop and print the answer.
I'm not sure of the best way to approach this. Because the function is sinusoidal, the integral over one period of 2PI will be almost 0. I feel that checking this would be the best way of approaching the problem, but I have no idea how to go about it. At the moment I'm checking the y value for each x value to see when it becomes less than the required precision; however, it never really gets low enough. At x = 10 million, for example, y = -0.0002, which is still relatively large for such a large x value.
for (int i = 1; i < 1000000000; i++)
{
sumFirstAndLast += func(z);
z += stripSize;
count++;
printf("%lf\n", func(z));
if(fabs(func(z))<lowestAddition/stripSize){
break;
}
}
So this above is what I'm trying to do currently. Where func is the function. The stripSize is set to 0.01, just something relatively small to make the areas of the trapeziums more accurate. sumFirstAndLast is the sum of the first and last values, set at 0.001 and 1000000. Just a small value and a large value.
As I mentioned, I "think" the best way to do this, would be to check the value of the integral over every 2PI, but once again not sure how to go about this. My current method gives me the correct answer if I take the precision part out, but as soon as I try to put a precision in, it gives a completely wrong answer.
For a non-periodic function that converges to zero you can (sort of) do a check of the function's value and compare to a minimum error value, but this doesn't work for a periodic function as you get an early exit before the integrand sum converges (as you've found out). For a non-periodic function you can simply check the change in the integrand sum on each iteration to a minimum error but that won't work here either.
Instead, you'll have to do as a few comments suggest and check for convergence relative to the period of the function, PI in this case (I found it works better than using 2*PI). To implement this, do something like the following code (note that I changed your sum to be the actual area instead of scaling it at the end):
sumFirstAndLast = (0.5*func(a) + 0.5*func(b)) * stripSize;
double z = a + stripSize;
double CHECK_RANGE = 3.14159265359;
double NextCheck = CHECK_RANGE;
double LastCheckSum = 0;
double MinError = 0.0001;
for (int i = 1; i < 1000000000; i++)
{
sumFirstAndLast += func(z) * stripSize;
if (z >= NextCheck)
{
if (fabs(LastCheckSum - sumFirstAndLast ) < MinError) break;
NextCheck += CHECK_RANGE;
LastCheckSum = sumFirstAndLast;
}
z += stripSize;
count++;
}
This seems to work and give the result to the specified accuracy according to the value of MinError. There are probably other (better) ways to check for convergence when numerically integrating a periodic function. A quick Google search reveals this paper for example.
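The per-period convergence check can also be sketched as a self-contained Python function. The function name, its parameters, and the test integrand exp(-x/2)*cos(x) are assumptions chosen for illustration (the integrand from the question is not shown); this one has the known exact integral 0.4 over [0, infinity), which makes the result easy to check.

```python
import math

def integrate_until_stable(f, a, strip, period, min_error, max_strips=10_000_000):
    """Trapezium rule from a upwards; stop once a full period changes the sum by < min_error."""
    total = 0.0
    last_check_total = 0.0
    next_check = a + period
    x, fx = a, f(a)
    for _ in range(max_strips):
        x_next = x + strip
        f_next = f(x_next)
        total += 0.5 * (fx + f_next) * strip  # area of one trapezium
        x, fx = x_next, f_next
        if x >= next_check:
            # Compare against the running total one period ago.
            if abs(total - last_check_total) < min_error:
                return total
            last_check_total = total
            next_check += period
    raise RuntimeError("no convergence within max_strips")

# Assumed test integrand, not the one from the question.
result = integrate_until_stable(lambda x: math.exp(-0.5 * x) * math.cos(x),
                                a=0.0, strip=0.001, period=math.pi, min_error=1e-6)
print(result)
```

A slowly decaying integrand would need far more strips before the per-period change drops below the tolerance, which is why max_strips is exposed as a parameter.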
The integral from 0 to infinity of cos(x)/sqrt(x), or of sin(x)/sqrt(x), is well known to be sqrt(pi/2). So evaluating pi to any number of digits is an easier problem. Newton did it by integrating a quarter circle to get the area = pi/4. The integrals are evaluated by the methods of complex analysis. They are done in many textbooks on the subject, and on one of my final exams in graduate school.
I am working on a program which generates C code for one function. This generated C function resides in the central loop of another target program; this function is performance sensitive. The generated function is used to call another function, based on a bool value -- this boolean value is fetched using 2 ints passed to the generated function: a state number and a mode number. Generated function looks like so:
void dispatch(System* system, int state, int mode) {
// Some other code here...
if (truthTable[state][mode]) {
doExpensiveCall(system, state, mode);
}
}
Some facts:
The ranges of 'state' and 'mode' values start at 0 and end at some number < 10,000. Their possible values are sequential, with no gaps in between. So, for example, if the end value of 'state' is 1000, then we know that there are 1001 states (including state 0).
The code generator is aware of the states and modes, and it knows ahead of time which combinations of state+mode will yield a value of true. Theoretically, any combination of state+mode could yield a true value, and thus make a call to doExpensiveCall, but in practice it will mostly be a handful of state+mode combinations that yield true. Again, this info is known during code generation.
Since this function will be called a lot, I want to optimize the check for the truth value. In the average case, I expect the test to yield false the vast majority of the time. On average, I expect that less than 1% of the calls will yield a value of true. But, theoretically, it could be as high as 100% of the time (this point depends on the end-user).
I am exploring the different ways that I could compute whether a state+mode will yield a call to doExpensiveCall(). In the end, I'll have to choose something, so I'm exploring my options now. These are the different ways that I could think of so far:
1) Create a precomputed two-dimensional array, which contains booleans. This is what I'm using in the example above. This yields the fastest possible check that I can think of. The problem is that if state and mode have large ranges (say 10,000x1000), the generated table starts becoming very big (in the case of 10,000x1000, that's 10MB for just that table). Example:
// STATE_COUNT=4, MODE_COUNT=3
static const char truthTable[STATE_COUNT][MODE_COUNT] = {
{0,1,0},
{0,0,0},
{1,1,0},
{0,0,1}
};
2) Create a table like #1, but compressed: instead of each array entry being a single boolean, it would be a char bitfield. Then, during the check, I would do some computation with state+mode to decide how to index into the array. This reduces the size of the precomputed table by a factor of 8 (each row takes MODE_COUNT/8 bytes). The downside is that the reduction is not that much, and there is now a need to compute the index of the boolean in the bitfield table, instead of just a simple array access as in #1.
3) Since the amount of state+mode combinations that will yield a value of true is expected to be small, a switch statement is also possible (using the truthTable in #1 as reference):
switch(state){
case 0: // row
switch(mode){ // col
case 1: doExpensiveCall(system, state, mode);
break;
}
break;
case 2:
switch(mode){
case 0:
case 1: doExpensiveCall(system, state, mode);
break;
}
break;
case 3:
switch(mode){
case 2: doExpensiveCall(system, state, mode);
break;
}
break;
}
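Option 2 above (the bit-packed table) can be sketched as follows. This is Python rather than generated C, purely to show the index arithmetic; the helper names are hypothetical, and the true entries are the ones from the example truthTable in option 1.

```python
STATE_COUNT, MODE_COUNT = 4, 3
TRUE_PAIRS = [(0, 1), (2, 0), (2, 1), (3, 2)]  # same entries as the example truthTable

def build_bit_table(pairs, state_count, mode_count):
    """Pack one bit per (state, mode) combination into a bytes object."""
    table = bytearray((state_count * mode_count + 7) // 8)
    for state, mode in pairs:
        index = state * mode_count + mode      # flat row-major position
        table[index >> 3] |= 1 << (index & 7)  # byte index, then bit within byte
    return bytes(table)

def is_set(table, state, mode, mode_count):
    """Look up the bit for (state, mode)."""
    index = state * mode_count + mode
    return bool((table[index >> 3] >> (index & 7)) & 1)

table = build_bit_table(TRUE_PAIRS, STATE_COUNT, MODE_COUNT)
print(len(table), is_set(table, 2, 1, MODE_COUNT), is_set(table, 1, 1, MODE_COUNT))
```

The lookup costs one multiply-add, two shifts and a mask, in exchange for a table one eighth the size of the byte-per-entry version.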
QUESTION:
What other ways, given the facts above, can be used to calculate the boolean value needed to call doExpensiveCall()?
Thanks
Edit:
I thought about Jens's sample code, and the following occurred to me. In order to have just one switch statement, I can do this computation in the generated code:
// #if STATE_COUNT > MODE_COUNT
int i = s * STATE_COUNT + m;
// #else
int i = m * MODE_COUNT + s;
// #endif
switch(i) {
case 1: // use computed values here, too.
case 8:
case 9:
case 14:
doExpensiveCall(system, s, m);
}
I'd try to use a modified version of (3), where you actually have only one call, and all the switch/case stuff leads to that call. That way you can ensure that the compiler will choose whatever heuristics it has for optimizing this.
Something along the line of
switch(state) {
default: return;
case 0: // row
switch(mode){ // col
default: return;
case 1: break;
}
break;
case 2:
switch(mode){
default: return;
case 0: break;
case 1: break;
}
break;
case 3:
switch(mode){
default: return;
case 2: break;
}
break;
}
doExpensiveCall(system, state, mode);
That is, you'd only have "control" inside the switch. The compiler should be able to sort this out nicely.
These heuristics will probably differ between architectures and compilation options (e.g. -O3 versus -Os), but this is what compilers are for: making choices based on platform-specific knowledge.
As for time efficiency, if your function call is really as expensive as you claim, this part will just be buried in the noise; don't worry about it. (Or otherwise benchmark your code to be sure.)
If the code generator knows the percentage of the table that's in use it can choose the algorithm at build time.
So if it is about 50% true/false use the 10 MB table.
Otherwise use a hash table or a radix tree.
A hash table would choose a hash function and a number of buckets. You'd compute the hash, find the bucket and search the chain for the true (or false) values.
The radix tree would choose a radix (like 10) and you'd have 10 entries with pointers to NULL (no true values down there) and one would have a pointer to another 10 entries, until you finally reach a value.
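The hash table variant described above might look roughly like this. It is a Python sketch with hypothetical names and an arbitrary bucket count; a real generator would emit the buckets as static C arrays and would pick the hash function and bucket count from the known true entries.

```python
def build_hash_table(pairs, mode_count, bucket_count):
    """Place the key of every true (state, mode) pair into a bucket chain."""
    buckets = [[] for _ in range(bucket_count)]
    for state, mode in pairs:
        key = state * mode_count + mode
        buckets[key % bucket_count].append(key)  # simple modulo hash
    return buckets

def lookup(buckets, state, mode, mode_count):
    """Hash the key, then search the (short) chain in that bucket."""
    key = state * mode_count + mode
    return key in buckets[key % len(buckets)]

true_pairs = [(0, 1), (2, 0), (2, 1), (3, 2)]
buckets = build_hash_table(true_pairs, mode_count=3, bucket_count=5)
print(lookup(buckets, 2, 1, 3), lookup(buckets, 1, 1, 3))
```

Memory then scales with the number of true entries rather than with the full state-by-mode grid.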