I am striving to create a function in VBA that calculates the number of missing values in each column of a matrix of nxn dimensions.
Each column should contain the numbers 1 to n only once.
However, if this is not the case, I want the function to state how many values are missing. For example, in a column (1,2,1,3) of a 4x4 matrix, one value (4) is missing, so the function should return 1.
I am very new to VBA and by no means a master, but this is what I have done so far...
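Function calcCost(sol() As Integer, n As Integer) As Integer
    'Assumes sol is a 1-based n x n array indexed as sol(row, column)
    Dim ArrayOfTruth() As Boolean
    Dim col As Integer, r As Integer, i As Integer, cost As Integer
    cost = 0
    For col = 1 To n
        ReDim ArrayOfTruth(1 To n) 'resets every flag to False for this column
        For r = 1 To n
            For i = 1 To n
                If sol(r, col) = i Then ArrayOfTruth(i) = True
            Next i
        Next r
        For i = 1 To n
            'a value that never appeared in this column is missing
            If Not ArrayOfTruth(i) Then cost = cost + 1
        Next i
    Next col
    calcCost = cost
End Function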
Assuming that the requirement of a square range of cells supersedes the description of the 'matrix's' values, I'm not sure why an array is needed at all.
Function calcCost(rTopLeft As Range, n As Long)
    Dim c As Long, r As Long
    For c = 1 To n
        If Not CBool(Application.CountIf(rTopLeft.Resize(n, n), c)) Then _
            r = r + 1
    Next c
    calcCost = r
End Function
Syntax:
=calcCost(<top left corner of desired range>, <number of cells both right and down>)
Example:
=calcCost(E9, 18)
The above implementation could also be written as,
=18-SUMPRODUCT(--SIGN(COUNTIF(OFFSET(E9,0,0,18,18), ROW(1:18))))
Related
All -
I'm wondering if there's an efficient way to "shift" elements of a 2-dimensional array. Effectively, what I have is triangular data, saved in a VBA array (n x m, where n <= m):
      0  1  2  3  4  5
    ------------------
0 |   A  B  C  D  E  F
1 |   G  H  I  J
2 |   K  L
I'd like to "restructure" this array to:
      0  1  2  3  4  5
    ------------------
0 |   A  B  C  D  E  F
1 |         G  H  I  J
2 |               K  L
The blank values in the array are actually empty strings (""). I'd imagine there's some looping that I could do to perform this with some compute cost, but I'm wondering if there's an efficient approach for subset "shifting" within VBA...
As @TimWilliams correctly commented, you can't do this without any loops. One possible approach that reduces the looping, however, is to
write the initial array (named e.g. v) row-wise to an empty target range, applying individual column offsets that you calculate beforehand, and then
assign the range back to the array as a so-called data field array.
The following example code should give you an idea. One small drawback: you always get the array back as a 1-based array.
'~~> assign initial (variant) array v as in OP
'...
'~~> calculate column offsets per array row, e.g. {0,2,4}
'...
'~~> shift array rows and write them to any empty target area
Dim startCell As Range: Set startCell = Sheet1.Range("X1")
Dim i As Long, j As Long, tmp As Variant
For i = 1 To UBound(v)
'~~> get individual column offset per row, e.g. {0,2,4}
j = Array(0, 2, 4)(i - 1)
'~~> write next row to target range
startCell.Offset(i, j).Resize(1, UBound(v, 2)) = Application.Index(v, i, 0)
Next i
'~~> get rearranged 1-based 2-dim datafield array
v = startCell.Offset(1).Resize(UBound(v), UBound(v, 2))
If you shift elements within a larger array, you could write the entire array to the target and overwrite only the rows that need rearranging (remembering to clear those row ranges first).
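For comparison, here is a minimal sketch of doing the same shift purely in memory, without a helper range. The name RightAlignRows is hypothetical, and the sketch assumes the blanks are empty strings (or Empty) and that each row should simply be pushed to the right edge, as in the question's example. It keeps the array's original lower bounds, so a 0-based array stays 0-based.

```vba
' Hypothetical helper: right-align the non-blank cells of every row of a
' 2-D array held in a Variant. Works in place and keeps the array's bounds.
Sub RightAlignRows(v As Variant)
    Dim r As Long, c As Long, dst As Long
    For r = LBound(v, 1) To UBound(v, 1)
        dst = UBound(v, 2)                      ' next free slot, filled from the right
        For c = UBound(v, 2) To LBound(v, 2) Step -1
            If v(r, c) <> "" Then
                v(r, dst) = v(r, c)
                If dst <> c Then v(r, c) = ""   ' blank the cell that was vacated
                dst = dst - 1
            End If
        Next c
    Next r
End Sub
```

Calling RightAlignRows v on the array from the question should move G H I J to columns 2 to 5 and K L to columns 4 to 5. It still loops over every cell, but it avoids the round trip through the worksheet.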
I have a 2d array, with flexible dimensions:
arr_emissions(1 to n, 0 to m)
Where n is 22 or larger, and m is 6 or larger.
In the smallest case (m = 6), column 6 should contain the sum of columns 2 to 5.
I could of course simply add them, but as the dimensions of the array are flexible I would like to implement a more robust method that preferably doesn't loop over the entire array.
I was hoping to use the native Application.WorksheetFunction.Sum(). I saw an implementation in this answer, but that only works for complete rows or columns.
Example:
I have arr_emissions(0 to 111,1 to 6). It is populated in a loop from 1 to 111.
The data in the array is as follows:
(1,1) #3-4-2020# 'a date value
(1,2) 1,379777
(1,3) 0
(1,4) Empty
(1,5) Empty
Don't know if this helps, but this takes a source array v and then populates a new array w with the sum of columns 2-4 of the corresponding row of v.
Sub x()
    Dim v, i As Long, w()
    'just to populate source array
    v = Range("A1").CurrentRegion.Value
    ReDim w(1 To UBound(v, 1))
    For i = 1 To UBound(w)
        'each element of w is sum of columns 2-4 of corresponding row of v
        w(i) = Application.Sum(Application.Index(v, i, Array(2, 3, 4)))
    Next i
    'write w to sheet
    Range("G1").Resize(UBound(w)) = Application.Transpose(w)
End Sub
Thanks to the answer from SJR I found myself a working solution. This is all within a larger piece of code, but for this example I filled some variables with fixed numbers to match my example from my question.
Dim days As Integer
days = 111
Dim emissions_cols As Integer
emissions_cols = 6
ReDim arr_emissions(0 To days, 1 To emissions_cols) As Variant
Dim arr_sum As Variant
Dim sum_str As String
'columns 2 to (last - 1); with emissions_cols = 6 this is "Transpose(row(2:5))"
sum_str = "Transpose(row(2:" & emissions_cols - 1 & "))"
arr_sum = Application.Evaluate(sum_str) '= Array(2,3,4,5)
'Index counts rows from 1 while arr_emissions starts at 0, hence the + 1
arr_emissions(emissions_index, emissions_cols) = Application.Sum(Application.Index(arr_emissions, emissions_index + 1, arr_sum))
The code builds a string from the variables so that it spans the second column up to the second-to-last column; that string is then evaluated into an array.
That array is then used within the sum function, to only sum over those columns.
The result is then written to the last column of arr_emissions().
emissions_index is an index that is used to loop over the array.
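If the worksheet-function route is not required, a plain loop over just the columns being summed is an alternative. The sketch below is only an illustration: SumRowSlice is a hypothetical name, and skipping Empty and non-numeric cells (such as the date in column 1) is my assumption about what should count towards the sum.

```vba
' Hypothetical helper: sum columns firstCol..lastCol of one row of a 2-D array.
' Indices are the array's own indices (no +1 adjustment as with Application.Index).
Function SumRowSlice(arr As Variant, rowIdx As Long, firstCol As Long, lastCol As Long) As Double
    Dim c As Long, total As Double
    For c = firstCol To lastCol
        ' Skip Empty cells and anything that is not a number (text, dates)
        If Not IsEmpty(arr(rowIdx, c)) And IsNumeric(arr(rowIdx, c)) Then
            total = total + CDbl(arr(rowIdx, c))
        End If
    Next c
    SumRowSlice = total
End Function
```

With the variables from the code above, this would be used as arr_emissions(emissions_index, emissions_cols) = SumRowSlice(arr_emissions, emissions_index, 2, emissions_cols - 1). It only visits the cells being summed, so the cost per row is the same regardless of how many rows the array has.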
I have a table of information, and I would like to create a function that would find two columns in a table range that match the headers that I provide, then store the difference between each of the rows of the two columns as an array. After getting this array, I want the function to return the average, max and min of the array. The output will be horizontal and placed in 3 adjacent cells.
I am not doing this manually as the table is quite large and I have to get the difference and average of many permutations (435 permutations) of two rows, so manual calculation would be too tedious.
Function MatchDiff(header1 As String, header2 As String, tbl As Range) As Variant()
Dim c, r, a, Lcol As Single
Dim temp_spreads(), temp_final() As Variant
Dim Average As Double
Dim tbl1, tbl2 As Range
ReDim temp_diff(0)
ReDim temp_final(0)
For c = 1 To tbl.Columns.Count
If header1 = tbl.Cells(1, c) Then
tbl1 = tbl.Range(tbl.Cells(2, c), tbl.Cells(tbl.Rows.Count, c))
ElseIf header2 = tbl.Cells(1, c) Then
tbl2 = tbl.Range(tbl.Cells(2, c), tbl.Cells(tbl.Rows.Count, c))
End If
Next c
For r = 1 To tbl1.Rows.Count
temp_diff(UBound(temp_diff)) = (tbl1.Cells(r, 1).Value - tbl2.Cells(r, 1).Value)
ReDim Preserve temp_diff(UBound(temp_diff) + 1)
Next r
Average = Application.WorksheetFunction.Average(temp_diff)
temp_final(UBound(temp_final)) = Average
ReDim Preserve temp_final(UBound(temp_final) + 1)
Min = Application.WorksheetFunction.Min(temp_diff)
temp_final(UBound(temp_final)) = Min
ReDim Preserve temp_final(UBound(temp_final) + 1)
Max = Application.WorksheetFunction.Max(temp_diff)
temp_final(UBound(temp_final)) = Max
ReDim Preserve temp_final(UBound(temp_final) + 1)
Lcol = Range(Application.Caller.Address).Rows.Count
For a = UBound(temp_final) To Lcol
temp_final(UBound(temp_final)) = ""
ReDim Preserve temp_final(UBound(temp_final) + 1)
Next a
ReDim Preserve temp_final(UBound(temp_final) - 1)
MatchDiff = temp_final
End Function
This is what I have tried to do but it returns an invalid name error. I am extremely new to vba (have only used python and R) and really need some help. Thanks in advance!
No need for VBA.
If headers is the range representing the header labels and data the range representing the data (excluding the header row) then
=INDEX(data,0,MATCH(header1,headers,0))
provides an array corresponding to the column of your data table labelled header1.
So, your maximum, minimum and average values can be simply obtained as
=MAX(INDEX(data,0,MATCH(header1,headers,0))-INDEX(data,0,MATCH(header2,headers,0)))
=MIN(INDEX(data,0,MATCH(header1,headers,0))-INDEX(data,0,MATCH(header2,headers,0)))
=AVERAGE(INDEX(data,0,MATCH(header1,headers,0))-INDEX(data,0,MATCH(header2,headers,0)))
where header1 and header2 are your two selected header labels.
Each formula needs to be entered as an array formula using CTRL+SHIFT+ENTER rather than just ENTER when committing from the formula bar. The formula will then appear inside curly braces {...} in the formula bar, confirming it is an array formula.
Since you have 435 permutations, I'm guessing that your data table has 30 columns.
If you wanted to, you could easily generate the results for all 435 possible permutations.
To do this create a list of 435 pairs (n,m) such that n is less than m and n, m are each in range 1,...,30. Create the list starting from (1,2) and ending at (29,30). Now MATCH(header1,headers,0) and MATCH(header2,headers,0) can simply be replaced by n and m, respectively in the formulae to give
=MAX(INDEX(data,0,n)-INDEX(data,0,m))
=MIN(INDEX(data,0,n)-INDEX(data,0,m))
=AVERAGE(INDEX(data,0,n)-INDEX(data,0,m))
as the required results for pair (n,m), where again these formulae should be entered as array formulae with CTRL+SHIFT+ENTER.
The picture below shows the results of applying this approach for all 15 permutations of an example data table with 25 rows and 6 columns.
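If typing out the list of pairs by hand is too tedious, it can be generated with a few lines of VBA. This is only a sketch: ColumnPairs is a hypothetical name, and it assumes at least two columns. It returns the pairs in the order described above, from (1,2) to (29,30) when called with 30.

```vba
' Hypothetical helper: all pairs (n, m) with 1 <= n < m <= colCount,
' returned as a 1-based 2-D array with one pair per row.
' Assumes colCount >= 2.
Function ColumnPairs(colCount As Long) As Variant
    Dim pairs() As Long, n As Long, m As Long, k As Long
    ReDim pairs(1 To colCount * (colCount - 1) \ 2, 1 To 2)
    For n = 1 To colCount - 1
        For m = n + 1 To colCount
            k = k + 1
            pairs(k, 1) = n
            pairs(k, 2) = m
        Next m
    Next n
    ColumnPairs = pairs
End Function
```

For example, Range("A1").Resize(435, 2).Value = ColumnPairs(30) would write the 435 pairs to columns A and B, and the formulae can then refer to those cells in place of n and m.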
11/24/14 - as per below.....
Still trying to figure this out - might it be easier by creating a smaller array which could roll through the larger array? ...then any necessary calcs could be done on the entirety of the small array.
I cannot figure out how to isolate just a (rolling) subset of an array. The rolling subset could be used for moving averages, standard devs, max/min, etc.
11/21/14 - I have made several attempts, this is the latest iteration. It shouldn't produce meaningful output until the minimum periods have been looped thru (stdev_periods = 10).
--pct_chg_array() is an array which holds percent change data from i=2 to i = 2541... declared as variant
--stdev_periods = 10 ...declared as integer
--i is a counter ...declared as integer
--stdev_array() is an empty array which I hope to populate with a standard deviation calculation for a rolling n period range ...declared as variant
--Option Base 1 and Option Explicit are on
For i = 2 To 2541
If IsNumeric(i) And i <> 0 Then
stdev_array(i, 1) = Application.WorksheetFunction.stdev(Range(pct_chg_array(i, 1).Offset(-stdev_periods, 0), pct_chg_array(i, 0)))
Else
stdev_array(i, 1) = 0
End If
Next i
Any guidance would be immensely appreciated. Thanks!
----EDIT----
Just to simplify, this is how I would express it in a worksheet formula...
=IF(ISNUMBER(OFFSET($E3,-stdev_periods+1,0)),STDEV(OFFSET($E3,0,0,-stdev_periods)),0)
...with "stdev_periods" = 10 and column E holding 1 period %chg data (ie =$E3/$E2-1).
Put this function in the module:
Public Function Slice(vntInputArray As Variant, lngStartIndex As Long, lngEndIndex As Long)
    'Use to return an arbitrary-sized subset from a 1 dimensional array.
    'Assumes developer is using Option Base 1
    Dim vntSubArray() As Variant, lngInputIndex As Long
    Dim lngElementCountIndex As Long: lngElementCountIndex = 1
    'Size the result once to the number of elements requested
    ReDim vntSubArray(1 To lngEndIndex - lngStartIndex + 1)
    For lngInputIndex = lngStartIndex To lngEndIndex
        vntSubArray(lngElementCountIndex) = vntInputArray(lngInputIndex)
        lngElementCountIndex = (lngElementCountIndex + 1)
    Next lngInputIndex
    Slice = vntSubArray
End Function
Adding the function to your code:
For i = 2 To 2541
    If i > stdev_periods Then 'first full window covers rows 2 to stdev_periods + 1
        'window of exactly stdev_periods elements ending at i
        stdev_array(i, 1) = WorksheetFunction.StDev(Slice(pct_chg_array, i - stdev_periods + 1, i))
    Else
        stdev_array(i, 1) = 0
    End If
Next i
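Since the question indexes pct_chg_array with two subscripts, a variant that reads the window straight out of a 2-D array may also be useful. The following is a rough sketch, not a drop-in replacement: WindowStDev is a hypothetical name, it assumes every value in the window is numeric, and it uses the sample (n - 1) definition, the same one the worksheet STDEV function uses.

```vba
' Hypothetical helper: sample standard deviation of rows firstRow..lastRow
' in column colIdx of a 2-D array. Assumes all values in the window are numeric.
Function WindowStDev(arr As Variant, colIdx As Long, firstRow As Long, lastRow As Long) As Double
    Dim r As Long, n As Long
    Dim total As Double, mean As Double, sumSq As Double
    n = lastRow - firstRow + 1
    If n < 2 Then Exit Function         ' returns 0 when the window is too short
    For r = firstRow To lastRow
        total = total + arr(r, colIdx)
    Next r
    mean = total / n
    For r = firstRow To lastRow
        sumSq = sumSq + (arr(r, colIdx) - mean) ^ 2
    Next r
    WindowStDev = Sqr(sumSq / (n - 1))
End Function
```

Inside the loop this could be called as stdev_array(i, 1) = WindowStDev(pct_chg_array, 1, i - stdev_periods + 1, i), which covers a window of exactly stdev_periods rows ending at row i. No temporary array is built per step, which may matter over a couple of thousand rows.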
I'm trying to create a matrix of size 121x101 such that the ith column is made up of V_t*e, where V_t = 1000*10^((i-1)/20) and e is a 121-long column of ones.
Clearly i is to be varied from 1 to 101, but how would I apply that to a matrix without only yielding the final value in the results (applying this to every column without repeating commands)?
From your question, it looks like each row is the same. Thus, you can calculate one row and replicate it using REPMAT:
iRow = 1:101;
V_t = 1000*10.^((iRow-1)/20);
V_te = repmat(V_t,121,1);
If you want to have e be 1 in row 1, 2 in row 2, etc, you can use NDGRID to create two arrays of the same size as the output, which contain the values of e and i for every element (i,j) of the output
[ee,ii] = ndgrid(1:121,1:101);
V_te = 1000*10.^((ii-1)/20) .* ee;
or you can use BSXFUN to do the expansion of e and i for you
iRow = 1:101;
V_t = 1000*10.^((iRow-1)/20);
V_te = bsxfun(@times,V_t,(1:121)');