This might be a nightmare.
Let's say I have two tables, each with two rows of one character each. In Table1, Row1 is A and Row2 is B; in Table2 it's reversed: Row1 is B and Row2 is A.
I also have a third table that contains three columns. The first two are columns to be joined on, and the third is the resulting value, depending on what was joined in the first two columns.
A,A= 1
A,B=.8
A,C=.2
B,A=.8
B,B= 1
B,C=.6
C,A=.2
C,B=.6
C,C= 1
What I'm trying to do, in essence, is find the highest-rated pairs from Table1 and Table2 using the associated values in Table3. The result I want is:
A,A= 1
B,B= 1
Because of the matching A's in Table1+2 and the matching B's in Table1+2. Instead, by just aimlessly joining the tables, I get this:
A,A= 1
A,B=.8
B,A=.8
B,B= 1
However, I'm getting ALL possible pairs, and that won't work. The problem is that I can't do a direct JOIN between Table1 and Table2, because a value in Table1 might not match anything in Table2. For instance...
Say Row1 and Row2 in Table1 are A,B while Row1 and Row2 in Table2 are B,C. If I do a direct JOIN, A and C won't line up with each other, leaving me with only the pair of B's.
I thought of one more problem, though: with a subquery, the subquery would be re-run constantly, meaning that previously selected rows would be up for grabs again the next time around, leading to incorrect values.
For instance, with A,B and B,C... I would expect to get this returned via subqueries:
A,B=.8
B,B= 1
Unless, of course, there's a way to disqualify a row from being used again.
Any suggestions or ideas? I'm using Access but I'm sure the concepts apply to any database solution.
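For what it's worth, outside of SQL this is a small assignment-matching problem. Here is a minimal greedy sketch in Python (the data is hypothetical; greedy picking is not guaranteed to maximize the total score, the Hungarian algorithm would, but it does disqualify each row once it has been used):

```python
# Hypothetical data: one character per row, plus the score lookup from Table3.
table1 = ["A", "B"]
table2 = ["B", "C"]
table3 = {("A", "A"): 1.0, ("A", "B"): 0.8, ("A", "C"): 0.2,
          ("B", "A"): 0.8, ("B", "B"): 1.0, ("B", "C"): 0.6,
          ("C", "A"): 0.2, ("C", "B"): 0.6, ("C", "C"): 1.0}

def best_pairs(t1, t2, scores):
    """Repeatedly take the highest-scoring available pair, then retire
    both rows so neither can be matched again (greedy, not optimal)."""
    left, right = list(t1), list(t2)
    result = []
    while left and right:
        a, b = max(((x, y) for x in left for y in right),
                   key=lambda p: scores[p])
        result.append((a, b, scores[(a, b)]))
        left.remove(a)
        right.remove(b)
    return result

print(best_pairs(table1, table2, table3))
# (B,B)=1 is taken first, which leaves A to pair with C.
```

Note that for the A,B vs B,C example, greedy gives (B,B)=1 then (A,C)=.2, total 1.2, whereas (A,B)=.8 plus (B,C)=.6 totals 1.4; whether you want greedy or globally optimal matching is worth deciding up front.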
In the Snowflake docs it says:
First, prune micro-partitions that are not needed for the query.
Then, prune by column within the remaining micro-partitions.
What is meant by the second step?
Let's take the example table t1 shown in the link. On this example table I run the following query:
SELECT * FROM t1
WHERE
Date = '11/3' AND
Name = 'C'
Because of Date = '11/3' it would only scan micro-partitions 2, 3 and 4. Because of Name = 'C' it can prune even more and only scan micro-partitions 2 and 4.
So in the end only micro-partitions 2 and 4 would be scanned.
But where does the second step come into play? What is meant by "prune by column within the remaining micro-partitions"?
Does it mean that only rows 4, 5 and 6 of micro-partition 2 and row 1 of micro-partition 4 are scanned, because Date is my clustering key and the data is sorted, so you can prune even further by date?
So in the end only 4 rows would be scanned?
Benefits of Micro-partitioning:
Columns are stored independently within micro-partitions, often referred to as columnar storage.
This enables efficient scanning of individual columns; only the columns referenced by a query are scanned.
It is recommended to avoid SELECT * and specify required columns explicitly.
It simply means to only select the columns that are required for the query. So in your example it would be:
SELECT col_1, col_2 FROM t1
WHERE
Date = '11/3' AND
Name = 'C'
I'm trying to get to an array function that will select multiple rows and columns of data and present the data cleanly in a single cell. The data in the orange-headed block has multiple rows (speakers) for each ID (sessions). The data in the blue block is the unique list of IDs (sessions) where I'm trying to get a formatted output. The ideal output would be (name) - (title) separated by line breaks so multiple speakers stack neatly in the same cell.
This gives an imperfect result with everything separated by dashes:
=TEXTJOIN(" - ",1,QUERY(A2:D17,"select B,C where A matches '"&F4&"' and D = 'T1'",0))
This should be closer to what I'm looking for but spits out a row mismatch error:
=QUERY({A:A,B:B&" - "&C:C&char(10),D:D},"select Col2 where Col1 matches '"&F4&"' and Col3 ='T1'",0)
And of course neither of these does this as an array and I'm trying to avoid having to maintain the sheet as sessions and speakers are added. I struggle with the intricacies of the query function so any help/instruction you can give would be very much appreciated!
Example Data
try:
=ARRAYFORMULA(TEXTJOIN(CHAR(10), 1, QUERY({A:A, B:B&" - "&C:C, D:D},
"select Col2
where Col1 matches '"&F4&"'
and Col3 ='T1'", 0)))
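In case it helps to see the logic outside the QUERY syntax, here is a plain-Python sketch of what that formula does (the column layout and sample rows are assumptions):

```python
# Assumed layout: (session_id, name, title, track), mirroring columns A-D.
rows = [
    ("S1", "Alice", "Keynote", "T1"),
    ("S1", "Bob",   "Panel",   "T1"),
    ("S2", "Cara",  "Demo",    "T1"),
]

def speakers_cell(session_id, track, data):
    """Build the one-cell value: "name - title" lines joined by newlines,
    i.e. the TEXTJOIN(CHAR(10), ...) step of the formula."""
    lines = [f"{name} - {title}"
             for sid, name, title, trk in data
             if sid == session_id and trk == track]
    return "\n".join(lines)

print(speakers_cell("S1", "T1", rows))
```

The filter corresponds to the QUERY's `where` clause, and the join corresponds to TEXTJOIN with CHAR(10) as the delimiter.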
Let's say I have items A, B, C in Table1.
They all have attribute f1. However, A and B have f2, which does not apply to C.
Table1 would be designed as:
itemName   f1   f2
------------------------------------
A         100   50
A          43   90
B          66   10
C          23
There would be another table, Table2, containing all the possible values of f2:
itemName   f2 (possible value)
------------------------------------
A          50
A          90
A          77
B          10
Let's say I now want to add a record with the highest value of f2 into Table1, depending on the itemName. Things work fine for A and B. But in the case of C, when I loop through Table2, since there is no record of C in Table2, I cannot tell whether the table is corrupted or C simply does not have attribute f2.
The only two ways I can think of to solve this are:
1. Adding a constraint in the code, like:
if (itemName == C)
    "Do not search Table2"
else
    search Table2
    if (no record)
        return "Corrupted Table"
Or
2. Adding another bool field "having_f2" in Table1 to help identify that f2 does not apply to C.
The above is just an example of where to put such business-logic constraints: in the DB or in the code.
Can you give me more opinions on the trade-off between the two approaches? In other words, which one makes more sense?
Since this is basically field validation ("can MyModel have property f2 set to NULL, i.e. nonexistent?"), I would say you should do that in a validator of your model.
Only if that is impossible, add some columns to model tables.
The rule I use is the following: the database is used to store model data. You should try to store nothing but data, if possible. In your case, has_f2 is not data but a business rule.
Of course, there are exceptions to this rule. For example, sometimes business logic must be controlled by the user and in this case it is perfectly ok to store it in the database.
Regarding your second proposal: you can typically just query for a NULL value in the table, which would be equivalent to adding and setting a boolean attribute (and better in terms of redundancy). This would also be the way to detect whether the table is "corrupt". However, you can also start your query by collecting all "itemName" entries from Table2, building an intersection with Table1, and inserting the cases of interest into Table1:
1.) Intersect the "itemName" from table1 and table2 => table3
2.) Join the table3 and table2 on "itemName", "f2" => insert each tuple into table1
Alternatively, you can also split table1 into two tables { "itemName", "f1" } and { "itemName", "f2" } which would eliminate your problem.
The Problem
I'm building an SSRS report which requires regular group headings on certain rows. The data returned from the database query includes a column with a bit flag indicating which rows need to be treated as group subheadings.
Here's a snippet of the source data:
Note the IsGroupHeading column, where the flag is set to 1 on the first row ("0401").
I want to produce output which looks like this Excel mockup:
So every time the report encounters a row where IsGroupHeading equals 1, it generates a bold group heading row followed by a row with column headings.
What's Happening
I've tried creating a row group in SSRS with the expression =Fields!IsGroupHeading.Value = 1 but I get unexpected results: (1) Only the first group heading is treated specially, and (2) the group heading row is repeated underneath the heading. The result looks like this:
Notice that the "0401" row is repeated under the group heading. In addition, only the first group heading ever gets this special treatment. The report simply ignores subsequent group headings and renders them as normal data rows.
The Question
I've spent hours trying to get this right, and this is the closest I've been able to get. My googling on row groups turns up pages mostly about creating subtotals, so I'm throwing this one out to the community, hoping some SSRS experts can help me with this.
I'm going to assume that you're doing this in SQL and that all tariff numbers start with the group header tariff number (in this case, 0401).
Let's say your SQL currently looks like this:
SELECT TariffNumber, RowDescription, TariffRate, IsGroupHeading
FROM Tariffs
What we want to do is join this table to itself to put the group TariffNumber and RowDescription columns on every row, so we can group on them. We also want to exclude the group-header tariff from the detail rows. So we get something like this:
SELECT TariffGroup.TariffNumber AS GroupNumber, TariffGroup.RowDescription AS GroupDescription,
TariffDetail.TariffNumber, TariffDetail.RowDescription, TariffDetail.TariffRate
FROM Tariffs AS TariffDetail
INNER JOIN Tariffs AS TariffGroup
  ON TariffGroup.TariffNumber = LEFT(TariffDetail.TariffNumber, CHARINDEX('.', TariffDetail.TariffNumber) - 1)
 AND TariffDetail.IsGroupHeading = 0
Now you just need to group on GroupNumber and you're done.
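Here is a runnable sketch of that self-join using SQLite via Python's sqlite3 (the sample tariff rows are invented; SQLite's substr()/instr() stand in for LEFT/CHARINDEX, and note that CHARINDEX takes the search string first, the opposite of instr()):

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.executescript("""
CREATE TABLE Tariffs (TariffNumber TEXT, RowDescription TEXT,
                      TariffRate TEXT, IsGroupHeading INTEGER);
INSERT INTO Tariffs VALUES
  ('0401',    'Milk and cream',       '',   1),
  ('0401.10', 'Fat content <= 1%',    '5%', 0),
  ('0401.20', 'Fat content 1% to 6%', '7%', 0);
""")
# Each detail row joins to the heading row whose TariffNumber equals the
# part of the detail's TariffNumber before the dot; heading rows are
# excluded from the detail side by IsGroupHeading = 0.
rows = cur.execute("""
SELECT g.TariffNumber AS GroupNumber, g.RowDescription AS GroupDescription,
       d.TariffNumber, d.RowDescription, d.TariffRate
FROM Tariffs AS d
JOIN Tariffs AS g
  ON g.TariffNumber = substr(d.TariffNumber, 1, instr(d.TariffNumber, '.') - 1)
 AND d.IsGroupHeading = 0
""").fetchall()
for r in rows:
    print(r)
```

With GroupNumber/GroupDescription on every detail row, the SSRS row group can simply group on GroupNumber.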
I have three tables:
documents
attributes
attributevalues
Documents can have many attributes,
and these attributes have their values in the attributevalues table.
What I want is a single query that gets all documents and the assigned attributes of each document, one document per row.
(Assume every document has the same attributes assigned; we don't need the complexity of different attributes for now.)
for example
docid  attvalue1  attvalue2
1      2          2
2      2          2
3      1          1
How can I do that in a single query?
Off the top of my head, I don't think you can do this without dynamic SQL.
The crux of the Entity-Attribute-Value (EAV) technique (which is what you are using) is to store columns as rows. What you want to do is convert those rows back to columns for the purpose of this query. Using PIVOT makes this possible. However, PIVOT requires knowing the number of rows that need to be converted to columns at the time the query is written. So assuming you are using EAV because you need flexible attributes/values, you won't know this information when you write the query.
So the solution would be to use dynamic SQL in conjunction with PIVOT. A quick search turned up this, which looks promising (I didn't read the whole thing):
http://www.simple-talk.com/community/blogs/andras/archive/2007/09/14/37265.aspx
For the record, I am not a fan of dynamic SQL and would recommend finding another approach to the larger problem (e.g. pivoting in application code).
If you know all the attributes (and their IDs) at design-time:
SELECT d.docid,
       a1.attvalue AS attvalue1,
       a2.attvalue AS attvalue2
FROM documents d
JOIN attributevalues a1 ON d.docid = a1.docid
JOIN attributevalues a2 ON d.docid = a2.docid
WHERE a1.attrid = 1
AND a2.attrid = 2
If you don't, things get quite a bit messier and difficult to answer without knowing your schema.
Let's make an example.
documents table's columns
docid,docname,createddate,createduser
and values
1 account.doc 10.10.2010 aeon
2 hr.doc 10.11.2010 aeon
attributes table's columns
attid,name,type
and values
1 subject string
2 recursive int
attributevalues table's columns
attvalueid,docid,attid,attvalue(sql_variant)
and values
1 1 1 "accounting doc"
2 1 2 0
3 2 1 "humen r doc"
4 2 2 1
and I want query result
docid, docname, atribvalue1, atribvalue2, ..., atribvalueN
1 account.doc "accounting doc" 0
2 hr.doc "humen r doc" 1
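Following the dynamic-SQL suggestion, here is a sketch using Python with SQLite that discovers the attributes at run time and generates one LEFT JOIN per attribute (the schema and sample data are taken from the question; in production the generated SQL should be built more defensively than simple interpolation, even though here the IDs come from our own attributes table):

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.executescript("""
CREATE TABLE documents (docid INTEGER, docname TEXT);
CREATE TABLE attributes (attid INTEGER, name TEXT, type TEXT);
CREATE TABLE attributevalues (attvalueid INTEGER, docid INTEGER,
                              attid INTEGER, attvalue);
INSERT INTO documents VALUES (1,'account.doc'),(2,'hr.doc');
INSERT INTO attributes VALUES (1,'subject','string'),(2,'recursive','int');
INSERT INTO attributevalues VALUES
  (1,1,1,'accounting doc'),(2,1,2,0),(3,2,1,'humen r doc'),(4,2,2,1);
""")
# Step 1: find out which attributes exist (this is what static SQL can't do).
attrs = cur.execute("SELECT attid, name FROM attributes").fetchall()
# Step 2: generate one aliased LEFT JOIN per attribute.
selects = ", ".join(f"a{i}.attvalue AS attvalue{i+1}"
                    for i, _ in enumerate(attrs))
joins = " ".join(
    f"LEFT JOIN attributevalues a{i} "
    f"ON a{i}.docid = d.docid AND a{i}.attid = {attid}"
    for i, (attid, _) in enumerate(attrs))
sql = (f"SELECT d.docid, d.docname, {selects} "
       f"FROM documents d {joins} ORDER BY d.docid")
for row in cur.execute(sql):
    print(row)
```

LEFT JOINs are used so a document missing one attribute value still comes back as a row (with NULL in that column) rather than disappearing entirely.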