Dafny: Using "forall" quantifiers with the "reads" or "modifies" clauses - arrays

So I am trying to implement Dijkstra's single source shortest paths algorithm in Dafny based directly on the description of the algorithm in the CLRS algorithms book as part of an undergraduate project. As part of the implementation, I have defined a "Vertex" object with two fields representing the current length of shortest path from source and the predecessor vertex:
class Vertex {
    var wfs: int;
    var pred: Vertex;
}
As well as a "Graph" object that contains an array of "Vertex"-es:
class Graph{
var vertices: array<Vertex>;
....
I am trying to state some properties of the fields in each "Vertex" of the vertices array using a predicate in the "Graph" object:
predicate vertexIsValid()
    reads this;
    reads this.vertices;
{
    vertices != null &&
    vertices.Length == size &&
    forall m :: 0 <= m < vertices.Length ==>
        vertices[m].wfs != 900000 &&
        vertices[m].pred != null
}
To my understanding, the "reads" and "modifies" clauses in Dafny only work on one layer and I'd have to specify to Dafny that I would be reading each entry in the vertices array ( reads this.vertices[x] ) . I tried using a "forall" clause to do it:
forall m :: 0 <= m < vertices.Length ==> reads this.vertices[m]
but this doesn't seem to be a feature in Dafny. Does anyone know if there is a way to use quantifiers with the "reads" clause or otherwise tell Dafny to read the fields in each entry of an array containing objects?
Thanks for the help.

You can do that most easily by using a set as a reads clause.
For your example, this additional reads clause on vertexIsValid worked for me:
reads set m | 0 <= m < vertices.Length :: vertices[m]
You can think of this set expression as saying "the set of all elements vertices[m] where m is in bounds".


Structuring a for loop to output classifier predictions in python

I have an existing .py file that prints the result of classifier.predict for an SVC model. I would like to loop through each row in the X feature set and return a prediction for each.
I am currently trying to define the loop variable to iterate over, so that it can be used to define the test statistic feature set X.
The test statistic feature set X is written in code as:
X_1 = xspace.iloc[testval-1:testval, 0:5]
testval is the element name used in the for loop in the above line:
for testval in X.T.iterrows():
    print(testval)
I am having trouble returning a basic set of index values for X (X is the pandas dataframe).
I have tested the following with no success.
for index in X.T.iterrows():
    print(index)
for index in X.T.iteritems():
    print(index)
I am looking for the set of index values, with base 1 if possible, like 1,2,3,4,5,6,7,8,9,10...n
Seemingly simple stuff, but I haven't located an existing question via Stack Overflow or Google.
ALSO, the individual dataframes I used as the basis for X were refined with the line:
df1.set_index('Date', inplace = True)
Because dates were used as the basis for concatenating the individual dataframes, the loops as written above return date values rather than the location values I would prefer, hence:
X_1 = xspace.iloc[testval-1:testval, 0:5]
where iloc selects rows by integer location.
please ask for additional code if you'd like to see more
the loops i've done thus far are returning date values, I would like to return index values of the location of the rows to accommodate the line:
X_1 = xspace.iloc[testval-1:testval, 0:5]
The loop structure below seems to be working for my application.
i = 1
j = list(range(1, len(X) + 1))  # 1-based row positions 1..n
for i in j:
    X_1 = xspace.iloc[i-1:i, 0:5]

How to make a function to know if disjoint set array represent single partition?

Suppose I have disjoint set with array implementation like this.
Consider the disjoint set array [0,0,3,3], which represents the partition {0,1}{2,3}. As you can see, the array [0,0,3,3] represents two partition classes, {0,1} and {2,3}.
Now consider [0,0,1,2], which represents the partition {0,1,2,3}. The array [0,0,1,2] represents a single partition.
How do I write a function that tells whether an array represents a single partition or not? The function should return true if the array passed to it represents a single partition and false otherwise.
Another way to put it is, (see here) how do I know whether all vertices are in one single tree.
Javascript or python code are welcome. Actionscript is preferred.
Any help is appreciated.
You just count the number of root sets. That's the number of partitions you have.
In the disjoint set representation that you are using, the roots are the sets that link to themselves. For example in [0,0,3,3], 0->0 and 3->3 are roots, so you have 2 partitions. In [0,0,1,2], only 0->0 is a root, so there is one partition.
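The root-counting idea above can be sketched in a few lines. This is a minimal Python illustration, assuming the array stores each element's parent and that a root points to itself; the names count_roots and is_single_partition are made up for this example.

```python
def count_roots(parent):
    """Count the elements that are their own parent, i.e. the roots."""
    return sum(1 for i, p in enumerate(parent) if p == i)


def is_single_partition(parent):
    """Return True when the forest has exactly one root (one tree)."""
    return count_roots(parent) == 1


print(is_single_partition([0, 0, 3, 3]))  # False: 0 and 3 are both roots
print(is_single_partition([0, 0, 1, 2]))  # True: only 0 is a root
```

This is a single O(n) pass and needs no extra bookkeeping in the structure itself.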
One easy way to accomplish this is to store additional information in Disjoint Set Union (DSU) data structure.
If we store not only parent information but the size of each disjoint set, then we can easily check if we are only left with one disjoint set, by comparing size of first disjoint set with the total amount of elements.
There's an easy way to implement this without using extra space:
In the tutorial you linked, P[u] stores the parent of element u, and if u is the root of a disjoint set it stores itself, so u is a root if P[u] === u.
In our modified implementation we mark root nodes with negative numbers, so u is the root of a disjoint set if P[u] < 0. This also lets us store the size of the set as a negative number: if P[u] >= 0 it acts as in the standard DSU implementation and gives the parent of the node, and if it is negative the current node is a root and -P[u] is the size of the disjoint set this root represents.
Sample code (JavaScript, using only the path compression optimization, so the amortized complexity of all functions is O(log N)):
const Find = (P,u) => P[u] < 0 ? u : P[u] = Find(P,P[u]);
const Union = (P,u,v) => {
    u = Find(P,u);
    v = Find(P,v);
    if(u === v)return;
    P[u] += P[v]; //we merge 2 sets, so the size of the new set is the sum of the sizes of the sets we are merging
    P[v] = u;
}
const iSSinglePartition = (P) => -P[Find(P,0)] === P.length;
const N = 5;
let P = new Array(N);
P.fill(-1); //now -P[u] represent size of disjoint set, initially all elements are disjoint so we set sizes to 1
console.log(iSSinglePartition(P));
Union(P,0,1);
console.log(iSSinglePartition(P));
Union(P,1,2);
console.log(iSSinglePartition(P));
Union(P,3,4);
console.log(iSSinglePartition(P));
Union(P,3,0);
console.log(iSSinglePartition(P));

merge two sorted arrays in julia

Is there a neat function in julia which will merge two sorted arrays and return the sorted array for me? I have written:
c = 1
p = 1
i = 1
n = length(tc) + length(tp)
t = Array{Float64}(undef, n)  # Julia 0.7+; older versions: Array{Float64}(n)
while c <= length(tc) && p <= length(tp)
    if tp[p] < tc[c]
        t[i] = tp[p]
        p = p + 1
        i = i + 1
    else
        t[i] = tc[c]
        c = c + 1
        i = i + 1
    end
end
while p <= length(tp)
    t[i] = tp[p]
    i = i + 1
    p = p + 1
end
while c <= length(tc)
    t[i] = tc[c]
    i = i + 1
    c = c + 1
end
but is there no native function in base julia to do this?
Contrary to the other answers, there is in fact a method to do this in base Julia. BUT, it only works for arrays of integers, AND it will only work if the arrays are unique (in the sense that no integer is repeated in either array). Simply use the IntSet type (renamed BitSet in Julia 0.7 and later) as follows:
a = [2, 3, 4, 8]
b = [1, 5]
union(IntSet(a), IntSet(b))
If you run the above code, you'll note that the union function removes duplicates from the output, which is why I stated initially that your arrays must be unique (or else you must be happy to have duplicates removed in the output). You'll also notice that the union operation on the IntSet works much faster than union on a sorted Vector{Int}, since the former exploits the fact that an IntSet is pre-sorted.
Of course, the above is not really in the spirit of the question, which more concerns a solution for any type for which the lt operator is defined, as well as allowing for duplicates.
Here is a function that efficiently finds the union of two pre-sorted unique vectors. I've never had a need for the non-unique case myself so have not written a function that covers that case I'm afraid:
"union <- Return the union of the inputs as a new sorted vector"
function union_vec(x::Vector{T}, y::Vector{T})::Vector{T} where {T}
    (nx, ny) = (1, 1)
    z = T[]
    while nx <= length(x) && ny <= length(y)
        if x[nx] < y[ny]
            push!(z, x[nx])
            nx += 1
        elseif y[ny] < x[nx]
            push!(z, y[ny])
            ny += 1
        else
            # Equal elements: keep one copy and advance both inputs
            push!(z, x[nx])
            nx += 1
            ny += 1
        end
    end
    # At most one input still has elements left; copy them over
    if nx <= length(x)
        append!(z, x[nx:end])
    elseif ny <= length(y)
        append!(z, y[ny:end])
    end
    return z
end
Another option is to look at sorted dictionaries, available in the DataStructures.jl package. I haven't done it myself, but a method that just inserts all observations into a sorted dictionary (checking for key duplication as you go) and then iterates over (keys, values) should also be a fairly efficient way to attack this problem.
Although an explicit function to merge two sorted vectors seems to be missing, one can be constructed easily from the existing building blocks (the question actually demonstrated this, but it doesn't define a function).
The following method tries to leverage the existing sort code and still remain efficient.
In code:
mergesorted(a,b) = sort!(vcat(a,b))
The following is an example:
julia> a = [1:2:11...];
julia> b = [2:3:20...];
julia> show(a)
[1,3,5,7,9,11]
julia> show(b)
[2,5,8,11,14,17,20]
julia> show(mergesorted(a,b))
[1,2,3,5,5,7,8,9,11,11,14,17,20]
I didn't benchmark the function, but QuickSort (the default sort algorithm) usually performs well on pre-sorted arrays, so it should be OK, and the allocation of a result vector is required in any implementation.
I keep coming across this in different projects, so I made a package MergeSorted (https://github.com/vvjn/MergeSorted.jl). You can use it as follows.
using MergeSorted
a = sort!(rand(1000))
b = sort!(rand(1000))
c = mergesorted(a,b)
sort!(vcat(a,b)) == c
Or without allocating new memory.
mergesorted!(c, a, b)
You can also use all of the sort options.
a = sort!(rand(1000), order=Base.Reverse)
b = sort!(rand(1000), order=Base.Reverse)
c = mergesorted(a,b, order=Base.Reverse)
sort!(vcat(a,b), order=Base.Reverse) == c
It's around 4-6 times faster than sort!(vcat(a,b)), which uses QuickSort by default, and twice as fast as sort!(vcat(a,b), alg=MergeSort) but MergeSort uses more memory.
No, no such function exists. And actually I have not seen a language which has such a function out of the box.
To do this, you have to maintain a pointer into each of the two arrays, compare the values they point at, and advance the one with the smaller value (based on what I see, this is exactly what you do).
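For readers who want to see the two-pointer technique in isolation, here is a minimal sketch in Python rather than Julia. The function name merge_sorted is chosen for this example only; it assumes both inputs are already sorted in ascending order and it keeps duplicates.

```python
def merge_sorted(a, b):
    """Merge two ascending lists into one ascending list, keeping duplicates."""
    merged = []
    i = j = 0
    # Walk both lists, always taking the smaller front element.
    while i < len(a) and j < len(b):
        if b[j] < a[i]:
            merged.append(b[j])
            j += 1
        else:
            merged.append(a[i])
            i += 1
    # At most one of these slices is non-empty.
    merged.extend(a[i:])
    merged.extend(b[j:])
    return merged


print(merge_sorted([1, 3, 5, 7, 9, 11], [2, 5, 8, 11, 14, 17, 20]))
# [1, 2, 3, 5, 5, 7, 8, 9, 11, 11, 14, 17, 20]
```

Taking from a on ties keeps the merge stable, and the whole pass is linear in the combined length.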

How to use sets of numbers as array indices?

I want to implement the Travelling Salesman Problem (Dynamic Programming) in C. I have the following pseudocode:
** Ignoring the base cases **
** C[i][j] is the cost of the edge from i to j **
for m = 2, 3, 4, ..., n:
    for each set S of size m which is a subset of {1, 2, 3, ..., n}:
        for each j in S, j ≠ 1:
            A[S][j] = min of { A[S-{j}][k] + C[k][j] } for all k in S, k ≠ j
return min of { A[{1, 2, 3, ..., n}][j] + C[j][1] } for all j from 2 to n
A[S][j] stores the shortest path from 1 to j which visits all vertices in S exactly once. (S includes 1 and j).
The time complexity is O(n^2 * 2^n).
My problem is that in this pseudocode they have used sets as array indices and the time complexity indicates that the lookup for a set without an element j (S - {j}) takes constant time.
What I have thought of is using a 3D array indexed by m, i and j, where i points to a set stored in a different array of sets indexed by m and i.
But the problem is that I cannot do the lookup A[S-{j}][k] in constant time.
My question is: how do I implement an array indexed by a set without changing the time complexity of the original algorithm?
Let each path be represented by a binary string, where each bit represents whether or not a city is visited.
So
(123456)
011001
means cities 2, 3 and 6 are visited.
You use the above as array index.
When you want to look-up the path without a city, just set that bit to 0 and use the output as index.
The first city will always be visited so you really don't need a bit for that city.
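The bitmask indexing described above can be sketched as follows. This is an illustrative Python version rather than C, under these assumptions: cities are numbered from 0, city 0 is the start and gets no bit, and bit (k - 1) of the mask stands for city k. The name tsp_cost is made up for this example.

```python
def tsp_cost(C):
    """Cheapest tour that starts and ends at city 0 (dynamic programming).

    C[i][j] is the cost of the edge from i to j. A set of visited cities is
    an integer bitmask: bit (k - 1) is set when city k is in the set.
    """
    n = len(C)
    if n == 1:
        return 0
    INF = float("inf")
    full = (1 << (n - 1)) - 1  # mask with every city 1..n-1 visited
    # A[mask][j]: cheapest path from 0 to j visiting exactly the cities in mask.
    A = [[INF] * n for _ in range(full + 1)]
    for j in range(1, n):
        A[1 << (j - 1)][j] = C[0][j]  # base case: go straight from 0 to j
    for mask in range(1, full + 1):
        for j in range(1, n):
            bit_j = 1 << (j - 1)
            if not mask & bit_j:
                continue  # j must be in the set
            prev = mask & ~bit_j  # S - {j}: clear j's bit, constant time
            if prev == 0:
                continue  # single-city set, already handled as a base case
            for k in range(1, n):
                if prev & (1 << (k - 1)):
                    A[mask][j] = min(A[mask][j], A[prev][k] + C[k][j])
    return min(A[full][j] + C[j][0] for j in range(1, n))


C = [[0, 10, 15, 20],
     [10, 0, 35, 25],
     [15, 35, 0, 30],
     [20, 25, 30, 0]]
print(tsp_cost(C))  # 80
```

Because prev is always numerically smaller than mask, iterating masks in increasing order guarantees every sub-result is ready when it is looked up, and each lookup is a plain array index.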

MATLAB excluding data outside 1 standard deviation

I'm inexperienced with MATLAB, so sorry for the newbie question:
I've got a large vector (905350 elements) storing a whole bunch of data in it.
I have the standard deviation and mean, and now I want to cut out all the data points that are above/below one standard deviation from the mean.
I just have no clue how. From what I gather I have to make a double loop of some sort?
It's like: mean-std < data i want < mean + std
If the data is in variable A, with the mean stored in meanA and the standard deviation stored in stdA, then the following will extract the data you want while maintaining the original order of the data values:
B = A((A > meanA-stdA) & (A < meanA+stdA));
Here are some helpful documentation links that touch on the concepts used above: logical operators, matrix indexing.
You can simply use the Element-wise logical AND:
m = mean(A);
sd = std(A);
B = A( A>m-sd & A<m+sd );
Also, knowing that: |x|<c iff -c<x<c, you can combine both into one as:
B = A( abs(A-m)<sd );
Taking A as your original vector and B as the final one (note that sorting discards the original order of the data):
B = sort(A);
B = B(find(B > meanA-stdA,1,'first'):find(B < meanA+stdA,1,'last'));
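y = x(x > meanA-stdA);
y = y(y < meanA+stdA);
should work, with x as the data vector. Each comparison produces a logical mask that selects the matching elements directly, so the FIND command is not needed here; see FIND if you want the indices themselves.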
