VBA two dimensional arrays connecting to a database - arrays

I am working with vba in excel and a database in access. The access database is a table that contains 3 columns; OrderIDs which is a column of numbers saying what order the particular item was in, OrderDescription which is a column that contains the description of the item, and Item # which is a column that gives a number to each particular item (if the item is the same as another, they both are the same item).
I need to build a 2-dimensional array in excel using VBA holding which items were purchased in which orders. The rows will be the Order ID and the columns will be the Item ID. The elements of this array will contain an indicator (like True or a “1”) that indicates that this order contains certain items. For example, row 6 (representing order ID 6) will have “True” in columns 1, 5, and 26 if that order purchased item IDs 1, 5, and 26. All other columns for that order will be blank.
In order to do this, i think I will have to determine the max order number (39) and the max item number(33). This information is available in the database which I can connect to using a .connection and .recordset. Some order numbers and some item numbers may not appear.
Note also that this will likely be a sparse array (not many entries) as most orders contain only a few items. We do not care how many of an item a customer purchased, only that the item was purchased on this order.
MY QUESTION is how can I set up this array? I tried a loop that would assign the values of the order numbers to an array and the items numbers to an array and then dimensioning the array to those sizes, but it wont work.
is there a way to make an element of an array return a value of True if it exists?
Thanks for your help

It seems to me that the best bet may be a cross tab query run on an access connection. You can create your array with the ADO method GetRows : http://www.w3schools.com/ado/met_rs_getrows.asp.
TRANSFORM Nz([Item #],0)>0 AS Val
SELECT OrderNo
FROM Table
GROUP BY OrderNo
PIVOT [Item #]
With a Counter table containing integers from 1 to maximum number of items in a column (field) Num.
TRANSFORM First(q.Val) AS FirstOfVal
SELECT q.OrderNo
FROM (SELECT t.OrderNo, c.Num, Nz([Item #],0)>0 AS Val
FROM TableX t RIGHT JOIN [Counter] c ON t.[Item #] = c.Num
WHERE c.Num<12) q
GROUP BY q.OrderNo
PIVOT q.Num
Output:
OrderNo 1 2 3 4 5 6 7 8 9 10 11
0 0 0 0 0 0
1 -1 -1 -1 -1
2 -1 -1 -1 -1

Related

Using the window function "last_value", when the values of the sorted field are same, the value snowflake returns is not the last value

As we all known, the window function "last_value" returns the last value within an ordered group of values.
In the following example, group by field "A" and sort by field "B" in positive order.
In the group of "A = 1", the last value is returned, which is, the C value 4 when B = 2.
However, in the group of "A = 2", the values of field "B" are the same.
At this time, instead of the last value, which is, the C value 4 in line 6, the first C value 1 in B = 2 is returned.
This puzzles me why the last value within an ordered group of values is not returned when I encounter the value I want to use for sorting.
Example
row_number
A
B
C
LAST_VALUE(C) IGNORE NULLS OVER (PARTITION BY A ORDER BY B ASC)
1
1
1
2
4
2
1
1
1
4
3
1
1
3
4
4
1
2
4
4
5
2
2
1
1
6
2
2
4
1
This puzzles me why the last value within an ordered group of values is not returned when I encounter the value I want to use for sorting.
For partition A equals 2 and column B, there is a tie:
The sort is NOT stable. To achieve stable sort a column or a combination of columns in ORDER BY clause must be unique.
To ilustrate it:
SELECT C
FROM tab
WHERE A = 2
ORDER BY B
LIMIT 1;
It could return either 1 or 4.
If you sort by B within A then any duplicate rows (same A and B values) could appear in any order and therefore last_value could give any of the possible available values.
If you want a specific row, based on some logic, then you would need to sort by all columns within the group to reflect that logic. So in your case you would need to sort by B and C
Good day Bill!
Right, the sorting is not stable and it will return different output each time.
To get stable results, we can run something like below
select
column1,
column2,
column3,
last_value(column3) over (partition by column1 order by
column2,column3) as column2_last
from values
(1,1,2), (1,1,1), (1,1,3),
(1,2,4), (2,2,1), (2,2,4)
order by column1;

Counting rows in a table based on multiple array criterias

I am trying to count rows in a table based on multiple criteria in different columns of that table. The criteria are not directly in the formula though; they are arrays which I would like to refer to (and not list them in the formula directly).
Range table example:
Group Status
a 1
b 4
b 6
c 4
a 6
d 5
d 4
a 2
b 2
d 3
b 2
c 1
c 2
c 1
a 4
b 3
Criteria/arrays example:
group
a
b
status
1
2
3
I am able to do this if i only have one array search through a range (1 column) in that table:
{=SUM(COUNTIFS(data[Group],group[group]))}
Returns "9" as expected (=sum of rows in the group column of the data table which match any values in group[group])
But if I add a second different range and a different array I get an incorrect result:
{=SUM(COUNTIFS(data[Group],group[group], data[Status],status[status]))}
Returns "3" but should return "5" (=sum of rows which have 1, 2 or 3 in the status column and a or b in the group column)
I searched and googled for various ideas related to using sumproduct or defining arrays within the formula instead of classifying the whole formula as an array but I was not able to get expected results via those means.
Thank you for your help.
Because your group and status criteria are a different number of values (2 values for group, but 3 values for status), I'm not sure you can do this in a single formula. Best way I know of to do this would be to use a helper column (which can be hidden if preferred).
Put this array formula in a helper column and copy down the length of your data (array formulas must be confirmed with Ctrl+Shift+Enter):
=AND(OR(data[#Group]=group[group]),OR(data[#Status]=status[status]))
And then get the count with: =COUNTIF(helpercolumn,TRUE)
You could use a slightly different approach, using Power Query / Power Pivot.
Name your tables Data, Group and Status, then create the following query, named Filtered Data:
let
tbData = Excel.CurrentWorkbook(){[Name="Data"]}[Content],
tbGroup = Excel.CurrentWorkbook(){[Name="Group"]}[Content],
tbStatus = Excel.CurrentWorkbook(){[Name="Status"]}[Content],
#"Merged Group" = Table.NestedJoin(tbData,{"Group"},tbGroup,{"Group"},"tbGroup",JoinKind.Inner),
#"Merged Status" = Table.NestedJoin(#"Merged Group",{"Status"},tbStatus,{"Status"},"Merged Status",JoinKind.Inner),
#"Removed Columns" = Table.RemoveColumns(#"Merged Status",{"tbGroup", "Merged Status"}),
#"Changed Type" = Table.TransformColumnTypes(#"Removed Columns",{{"Status", type number}})
in
#"Changed Type"
Load To as connection only, and tick Load to Data Model
Now create a DAX measure:
Status Sum:=SUM ( 'Filtered Data'[Status] )
You can then use the following formula on your worksheet, to get the Sum of Status values, for rows matching the criteria specified in the Group and Status tables:
=CUBEVALUE("ThisWorkbookDataModel","[Measures].[Status Sum]")
Simply refresh the data connection to update the value.

Update table with random numbers in kdb+q

when I run the following script:
tbl: update prob: 1?100 from tbl;
I was expecting that I get a new column created with each row having a random number. However, I get back a column containing the same number for all the rows in the table.
How do I resolve this? I need to update my existing table and not create a table from scratch.
When you are using 1?100 you are only requesting 1 random value within the range of 0-100. If you use 10?100, you will be returned a list of 10 random values between 0-100.
So to do this in an update you want to use something like this
tbl:([]time:5?.z.p;sym:5?`3;price:5?10f;qty:5?10)
time sym price qty
-----------------------------------------------
2012.02.19D18:34:27.148501760 gkn 8.376952 9
2008.07.29D20:23:13.601434560 odo 7.041609 3
2007.02.07D08:17:59.482332864 pbl 0.955069 9
2001.04.27D03:36:44.475531384 aph 1.127308 2
2010.03.03D03:35:55.253069888 mgi 0.7663449 6
update r:abs count[i]?0h from tbl
time sym price qty r
-----------------------------------------------------
2012.02.19D18:34:27.148501760 gkn 8.376952 9 23885
2008.07.29D20:23:13.601434560 odo 7.041609 3 19312
2007.02.07D08:17:59.482332864 pbl 0.955069 9 10372
2001.04.27D03:36:44.475531384 aph 1.127308 2 25281
2010.03.03D03:35:55.253069888 mgi 0.7663449 6 27503
Note that I am using type short and abs to return positive values.
You need to seed your initial data, using something like rand(time), otherwise it will use the same seed, and thus, give the same sequence of random numbers.
EDIT: Per https://code.kx.com/wiki/Reference/SystemCommands
Use \S?n, where n is any integer.
EDIT2: Check out https://code.kx.com/wiki/Reference/SystemCommands#.5CS_.5Bn.5D_-_random_seed for how to use random numbers, please.
Just generate as many random numbers as you have rows using count tbl:
First create your table tbl:
tbl:([]date:reverse .z.d-til 100;price:sums 100?1f)
date price
--------------------
2018.04.26 0.2426471
2018.04.27 0.6163571
2018.04.28 1.179559
..
Then add a column of random numbers between 0 and 100:
update rdn:(count tbl)?100 from tbl
date price rdn
------------------------
2018.04.26 0.2426471 25
2018.04.27 0.6163571 33
2018.04.28 1.179559 13
..

Compare two numbers and set a value of 1 or 0 to the Highest or Lowest in Excel or SQL Server 2012

I have a list of players and a list of their scores in a game. I'd like to compare the two columns, and return a value of 1 or 0. 1 = Highest number, 0 = Lowest number. If the two values are equal, I'd like to return two 1's.
How can I do this using Excel or SQL Server 2012?
If I understood you correctly, you want the player with the highest score in a team to be marked with 1 and the lowest with 0. In that case:
SELECT t.team_id,t.player_id,t.score,
case when t.score = s.max_score then 1
when t.score = s.min_score then 0
end as pro_ind
FROM YourTable t
INNER JOIN(select team_id,max(score) as max_score,min(score) as min_score
FROM YourTable
GROUP BY team_id) s
ON (t.team_id = s.team_id)
Of course I guessed the names of the columns, so you will have to adjust it.

Linq - Limit list to 1 row per unique values based on value (minimum) of single field

I have a stored procedure (I cannot edit) that I am calling via linq.
The stored procedure returns values (more complex but important data below):
Customer Stock Item Date Price Priority Qty
--------------------------------------------------------
CUST1 TAP 01-04-2012 £30 30 1 - 30
CUST1 TAP 05-04-2012 £33 30 1 - 30
CUST1 TAP 01-04-2012 £29 20 31 - 99
CUST1 TAP 01-04-2012 £28 10 1 - 30
I am trying to limit this list to rows which have unique Dates and unique quantities in LINQ.
I want to remove items with the HIGHER priority leaving rows with unique dates and qty's.
I have tried several group by's using Max and order by's but have not been able to get a result.
Is there any way to do this via linq?
EDIT:
Managed to convert brad-rem's answer into VB.net.
Syntax below if anyone needs it:
returnlist = (From p In returnlist
Order By p.Qty Ascending, p.Priority
Group By AllGrp = p.Date, p.Qty Into g = Group
Select g.First).ToList
How about the following. It groups by Date and Qty and orders it so that the lower priorities come first. Then, it just selects the first item from each group, which are all the lower priority items:
var result = from d in dbData
orderby d.Priority
group d by new
{
d.Date,
d.Qty
} into group1
select group1.First();

Resources