Apriori algorithm - Find combinations of 2 - sql-server

I have an Order table like this:
ORDER_ID PRODUCT_ID
1 1230
1 1231
1 1232
2 1231
2 2000
3 1230
3 3567
and a Product table:
PRODUCT_ID NAME
1230 A
1231 B
1232 C
My first question: how do I get every combination of 2 products from the Product table, and what should the structure of the new table be?
For example:
{1230,1231}, {1230,1232}, {1231,1232}
But I don't want {1231,1230}, because that pair has already been added as {1230,1231}.
Second, the Order table stores the products sold in one session. What should my new table look like?
Example:
orderid products
1 {1230,1231,1232}
Finally, I want to find the support value of products that are sold together,
e.g.: {1231,1230} count : 2
{1230,1232} count : 0
Thanks in advance.
Edit: I want to do something like this: http://webdocs.cs.ualberta.ca/~zaiane/courses/cmput499/slides/Lect10/sld054.htm

If I have interpreted your requirement correctly:
;WITH T(P1, P2, ORDER_ID)
     AS (SELECT p1.PRODUCT_ID,
                p2.PRODUCT_ID,
                O.ORDER_ID
         FROM   Product p1
                JOIN Product p2
                  ON p1.PRODUCT_ID < p2.PRODUCT_ID
                JOIN [ORDER] o
                  ON o.PRODUCT_ID IN ( p1.PRODUCT_ID, p2.PRODUCT_ID )
         GROUP  BY p1.PRODUCT_ID,
                   p2.PRODUCT_ID,
                   O.ORDER_ID
         HAVING COUNT(*) = 2)
SELECT P1,
       P2,
       COUNT(*) AS Cnt
FROM   T
GROUP  BY P1,
          P2
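For anyone who wants to try the idea without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. It is a slightly different formulation from the query above, not the same one: it self-joins the order lines instead of starting from the Product table, so it also reports pairs involving products that are missing from Product. The table name Orders is an assumption (chosen to avoid the reserved word ORDER), and SQLite syntax may differ from T-SQL in places.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- "Orders" is an assumed name; the sample rows come from the question.
    CREATE TABLE Orders (ORDER_ID INTEGER, PRODUCT_ID INTEGER);
    INSERT INTO Orders VALUES
        (1, 1230), (1, 1231), (1, 1232),
        (2, 1231), (2, 2000),
        (3, 1230), (3, 3567);
""")

# Two different products on the same order form a pair.
# a.PRODUCT_ID < b.PRODUCT_ID keeps {1230,1231} and drops the mirrored {1231,1230}.
pair_support = conn.execute("""
    SELECT a.PRODUCT_ID AS P1,
           b.PRODUCT_ID AS P2,
           COUNT(DISTINCT a.ORDER_ID) AS Cnt
    FROM Orders a
    JOIN Orders b
      ON a.ORDER_ID = b.ORDER_ID
     AND a.PRODUCT_ID < b.PRODUCT_ID
    GROUP BY a.PRODUCT_ID, b.PRODUCT_ID
    ORDER BY P1, P2
""").fetchall()

for p1, p2, cnt in pair_support:
    print(p1, p2, cnt)
```

With the sample rows, every pair that occurs does so in exactly one order. Pairs that never occur together do not show up at all, so a count of 0 would need an outer join from the full list of product pairs.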

I don't really understand questions 2 or 3, so please clarify in your question.
The first one is tricky, but I think you're looking for something like this:
SELECT * FROM products p1, products p2 GROUP BY ((p1.PRODUCT_ID*p2.PRODUCT_ID)+p1.PRODUCT_ID+p2.PRODUCT_ID)
Because it would group by rows only where the two numbers are the same, without caring about order. There might be a more elegant way to create what's basically a unique id for that combination, but I can't think of any.
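As a possible alternative to building an arithmetic id, a self-join with a strict `<` comparison lists each unordered pair once. The sketch below uses Python's sqlite3 module rather than SQL Server, so treat it as an illustration only. It also checks the arithmetic key: x*y + x + y equals (x+1)*(y+1) - 1, so two different pairs can share the same key. The helper arithmetic_key is a hypothetical name introduced just for this check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (PRODUCT_ID INTEGER PRIMARY KEY, NAME TEXT);
    INSERT INTO Product VALUES (1230, 'A'), (1231, 'B'), (1232, 'C');
""")

# p1.PRODUCT_ID < p2.PRODUCT_ID yields each unordered pair exactly once
# and never pairs a product with itself.
pairs = conn.execute("""
    SELECT p1.PRODUCT_ID, p2.PRODUCT_ID
    FROM Product p1
    JOIN Product p2 ON p1.PRODUCT_ID < p2.PRODUCT_ID
    ORDER BY p1.PRODUCT_ID, p2.PRODUCT_ID
""").fetchall()
print(pairs)


def arithmetic_key(x, y):
    """Hypothetical helper: the grouping key from the answer above."""
    return x * y + x + y


# Different pairs, same key: (1, 5) and (2, 3) both map to 11.
print(arithmetic_key(1, 5), arithmetic_key(2, 3))
```

The collision shown at the end is why the comparison-based join is probably the safer way to remove mirrored pairs.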

Related

Sql Server Weird CASE Statement

I am attempting to do something, but I am not sure if it is possible. I don't really know how to look up something like this, so I'm asking a question here.
Say this is my table:
Name | Group
-----+--------
John | Alpha
Dave | Alpha
Dave | Bravo
Alex | Bravo
I want to do something like this:
SELECT TOP 1 CASE
WHEN Group = 'Alpha' THEN 1
WHEN Group = 'Bravo' THEN 2
WHEN Group = 'Alpha' AND
Group = 'Bravo' THEN 3
ELSE 0
END AS Rank
FROM table
WHERE Name = 'Dave'
I understand why this won't work, but this was the best way that I could explain what I am trying to do. Basically, I just need to know when one person is a part of both groups. Does anyone have any ideas that I could use?
Create a column that holds the values you want to sum, then sum them. This is probably easiest to do via a subquery:
Select Name, SUM(Val) as Rank
FROM (SELECT Name, CASE WHEN [Group] = 'Alpha' THEN 1 -- Group is a reserved word, so it needs brackets
WHEN [Group] = 'Bravo' THEN 2
ELSE 0 END AS Val
FROM table
WHERE Name = 'Dave') T
GROUP BY Name
You can add TOP 1 and ORDER BY SUM(Val) to get the top ranked row if required.
After reading your comment, it could be simplified further to:
Select Name, COUNT([GROUP]) GroupCount
FROM table
GROUP BY Name
HAVING COUNT([GROUP]) > 1
That simply returns every name that has more than one group row.
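Both variants can be tried out with the sample data from the question. This is a rough sketch using Python's sqlite3 module instead of SQL Server; the table name Membership and the alias GroupRank are assumptions, and the reserved word Group is quoted with double quotes here where T-SQL would normally use brackets.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- "Membership" is an assumed table name; rows are from the question.
    CREATE TABLE Membership (Name TEXT, "Group" TEXT);
    INSERT INTO Membership VALUES
        ('John', 'Alpha'), ('Dave', 'Alpha'),
        ('Dave', 'Bravo'), ('Alex', 'Bravo');
""")

# Alpha contributes 1 and Bravo contributes 2, so a sum of 3 means "both groups".
ranks = conn.execute("""
    SELECT Name,
           SUM(CASE WHEN "Group" = 'Alpha' THEN 1
                    WHEN "Group" = 'Bravo' THEN 2
                    ELSE 0 END) AS GroupRank
    FROM Membership
    GROUP BY Name
    ORDER BY Name
""").fetchall()
print(ranks)

# Simpler check: names that appear in more than one distinct group.
both_groups = conn.execute("""
    SELECT Name
    FROM Membership
    GROUP BY Name
    HAVING COUNT(DISTINCT "Group") > 1
""").fetchall()
print(both_groups)
```

COUNT(DISTINCT ...) is used in the second query so that a duplicated row for the same group would not be mistaken for membership in two groups.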

SQL Join one-to-many tables, selecting only most recent entries

This is my first post - so I apologise if it's in the wrong section!
I'm joining two tables with a one-to-many relationship using their respective ID numbers: but I only want to return the most recent record for the joined table and I'm not entirely sure where to even start!
My original code for returning everything is shown below:
SELECT table_DATES.[date-ID], *
FROM table_CORE LEFT JOIN table_DATES ON [table_CORE].[core-ID] = table_DATES.[date-ID]
WHERE table_CORE.[core-ID] Like '*'
ORDER BY [table_CORE].[core-ID], [table_DATES].[iteration];
This returns a group of records: showing every matching ID between table_CORE and table_DATES:
table_CORE date-ID iteration
1 1 1
1 1 2
1 1 3
2 2 1
2 2 2
3 3 1
4 4 1
But I need to return only the date with the maximum value in the "iteration" field as shown below
table_CORE date-ID iteration Additional data
1 1 3 MoreInfo
2 2 2 MoreInfo
3 3 1 MoreInfo
4 4 1 MoreInfo
I really don't even know where to start - obviously it's going to be a JOIN query of some sort - but I'm not sure how to get the subquery to return only the highest iteration for each item in table 2's ID field?
Hope that makes sense - I'll reword if it comes to it!
--edit--
I'm wondering how to integrate that when I'm needing all the fields from table 1 (table_CORE in this case) and all the fields from table2 (table_DATES) joined as well?
Both tables have additional fields that will need to be merged.
I'm pretty sure I can just add the fields into the "SELECT" and "GROUP BY" clauses, but there are around 40 fields altogether (and typing all of them will be tedious!)
Try using the MAX aggregate function like this with a GROUP BY clause.
SELECT
[ID1],
[ID2],
MAX([iteration])
FROM
table_CORE
LEFT JOIN table_DATES
ON [table_CORE].[core-ID] = table_DATES.[date-ID]
WHERE
table_CORE.[core-ID] Like '*' --LIKE '%something%' ??
GROUP BY
[ID1],
[ID2]
Your example field names don't match your sample query so I'm guessing a little bit.
Just to make sure that I have everything you’re asking for right, I am going to restate some of your question and then answer it.
Your source tables (table_core and table_dates) and your current and desired outputs are the ones shown in the question.
In order to make that happen all you need to do is use a subquery (or a CTE) as a “cross-reference” table. (I used temp tables to recreate your data example and _ in place of the - in your column names).
--Loading the example data
create table #table_core
(
core_id int not null
)
create table #table_dates
(
date_id int not null
, iteration int not null
, additional_data varchar(25) null
)
insert into #table_core values (1), (2), (3), (4)
insert into #table_dates values (1,1, 'More Info 1'),(1,2, 'More Info 2'),(1,3, 'More Info 3'),(2,1, 'More Info 4'),(2,2, 'More Info 5'),(3,1, 'More Info 6'),(4,1, 'More Info 7')
--select query needed for desired output (using a CTE)
; with iter_max as
(
select td.date_id
, max(td.iteration) as iteration_max
from #table_dates as td
group by td.date_id
)
select tc.*
, td.*
from #table_core as tc
left join iter_max as im on tc.core_id = im.date_id
-- left join here as well, so core rows without any dates are not dropped
left join #table_dates as td on im.date_id = td.date_id
and im.iteration_max = td.iteration
select *
from
(
SELECT table_CORE.*, table_DATES.*
-- number the rows of each core-ID, highest iteration first
, row_number() over (partition by table_CORE.[core-ID] order by table_DATES.iteration desc) as rn
FROM table_CORE
LEFT JOIN table_DATES
ON [table_CORE].[core-ID] = table_DATES.[date-ID]
WHERE table_CORE.[core-ID] LIKE '%' -- the question's '*' is the Access wildcard; SQL Server uses '%'
) tt
where tt.rn = 1
ORDER BY [core-ID]
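The "join back to the per-group maximum" idea from the CTE answer can be checked quickly with Python's sqlite3 module. This is only a sketch under a few assumptions: it runs on SQLite rather than SQL Server, it uses plain tables instead of temp tables, and it adds a core row with id 5 that has no dates (not in the original data) to show what the outer joins do with it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_core (core_id INTEGER PRIMARY KEY);
    CREATE TABLE table_dates (date_id INTEGER, iteration INTEGER, additional_data TEXT);
    -- core_id 5 is an extra row (an assumption) with no matching dates.
    INSERT INTO table_core VALUES (1), (2), (3), (4), (5);
    INSERT INTO table_dates VALUES
        (1, 1, 'More Info 1'), (1, 2, 'More Info 2'), (1, 3, 'More Info 3'),
        (2, 1, 'More Info 4'), (2, 2, 'More Info 5'),
        (3, 1, 'More Info 6'),
        (4, 1, 'More Info 7');
""")

# Step 1 (subquery im): highest iteration per date_id.
# Step 2: join back to table_dates to pick up the rest of that row.
latest = conn.execute("""
    SELECT tc.core_id, td.iteration, td.additional_data
    FROM table_core tc
    LEFT JOIN (
        SELECT date_id, MAX(iteration) AS iteration_max
        FROM table_dates
        GROUP BY date_id
    ) im ON im.date_id = tc.core_id
    LEFT JOIN table_dates td
           ON td.date_id = im.date_id
          AND td.iteration = im.iteration_max
    ORDER BY tc.core_id
""").fetchall()

for row in latest:
    print(row)
```

Because both joins are outer joins, the core row without dates is still returned, with NULLs (None in Python) in the date columns.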

How to loop through the number of rows from a table in SQL Server?

I have a situation like this:
I want to loop through a table and pass each row as a parameter to another procedure.
Something like this:
while ( select @myTitle=Title from tblBooks )
select * from tblBorrowed where Title = @myTitle
This is just pseudocode; it raises an error in SQL.
How can I do this in SQL Server to get a result like the one below?
tblBooks:
ID Title
------------
1 A
2 B
3 C
4 D
tblBorrowed:
ID Title
------------
1 A
2 A
3 A
4 D
5 C
6 C
7 D
And I want to make a result like this
Title Borrowed
------------
A 3
B 0
C 2
D 2
where the 'Borrowed' column is the number of times each book has been borrowed.
How can I do that? Any ideas? I'd appreciate it.
Thank you so much.
SQL is not really a procedural language where you loop through rows to get your result. You can do that, but it's really slow.
Most things in SQL can be done without loops. It looks like all you need is a group by.
Start with something like this:
select
Title, -- Title field for the group
count(*) as Borrowed -- "Borrowed" is the number of rows in the group
from tblBorrowed
group by Title -- groups are determined by looking at Title field
This makes SQL Server split the results into groups based on Title, like this:
ID Title
------------
1 A
2 A
3 A
4 D
7 D
5 C
6 C
Then for each of those groups, it counts the number of rows - count(*) and reports one row back for each group, with its row-count.
Title Borrowed
------------
A 3
D 2
C 2
Notice that you don't have B in there, because there are no rows to count. If you do need to have a row for B, then you'll have to use a join.
select
book.Title,
borrow.ID
from tblBooks book
left join tblBorrowed borrow on borrow.Title=book.Title
Now you are starting with the tblBooks table (including B) and trying to match each row to one or more tblBorrowed rows. If there's no tblBorrowed row, you still have the one tblBooks row.
You get something like this:
Title ID
---------
A 1
A 2
A 3
B null
C 5
C 6
D 4
D 7
So now you can group that query and count it again:
select
book.Title,
count(borrow.ID) as Borrowed -- count only matched tblBorrowed rows, so B gets 0 rather than 1
from tblBooks book
left join tblBorrowed borrow on borrow.Title=book.Title
group by book.Title
And it gets split into groups:
Title ID
---------
A 1
A 2
A 3
B null
C 5
C 6
D 4
D 7
And each group is counted:
Title Borrowed
---------
A 3
B 0
C 2
D 2
Try this:
SELECT TB.Title, COUNT(BO.ID) AS Borrowed
FROM tblBooks AS TB LEFT OUTER JOIN
tblBorrowed AS BO ON BO.Title = TB.Title
GROUP BY TB.Title
Use a tblBorrowed column in the COUNT() function.
For a book that was never borrowed, COUNT(BO.ID) returns 0, whereas COUNT(*) returns 1.
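The difference between the two counts is easy to see side by side. Here is a small sketch with the question's data, run through Python's sqlite3 module rather than SQL Server (an assumption made only so the example is self-contained); the alias RowsInGroup is a made-up name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblBooks (ID INTEGER PRIMARY KEY, Title TEXT);
    CREATE TABLE tblBorrowed (ID INTEGER PRIMARY KEY, Title TEXT);
    INSERT INTO tblBooks VALUES (1, 'A'), (2, 'B'), (3, 'C'), (4, 'D');
    INSERT INTO tblBorrowed VALUES
        (1, 'A'), (2, 'A'), (3, 'A'), (4, 'D'), (5, 'C'), (6, 'C'), (7, 'D');
""")

# COUNT(BO.ID) skips the NULL produced by the outer join for book B;
# COUNT(*) counts the joined row itself, so B would be reported as 1.
counts = conn.execute("""
    SELECT TB.Title,
           COUNT(BO.ID) AS Borrowed,
           COUNT(*)     AS RowsInGroup
    FROM tblBooks AS TB
    LEFT OUTER JOIN tblBorrowed AS BO ON BO.Title = TB.Title
    GROUP BY TB.Title
    ORDER BY TB.Title
""").fetchall()

for title, borrowed, rows_in_group in counts:
    print(title, borrowed, rows_in_group)
```

Only the row for B differs between the two columns, which is exactly the case the loop in the question was trying to handle.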

Count each unique content

How do I count how many times the content of the field Name appears in my table?
Name | Other
Brad | smth
Brad | smth
Daniel | smth
Matt | smth
Matt | smth
Matt | smth
For example, for the above table I would like to know how many times I have 'Brad', how many times 'Daniel' and how many times 'Matt'. How do I do this with just one select?
I'm interested in this because I want to display only the names that appear more times than a given value.
My actual code:
select director.LastName,director.FirstName,count(director.FirstName)as counter,film.title
from director,film
where film.Id_Director=director.id
group by director.LastName,director.FirstName,film.title
having count(Director.FirstName)>2
Baz Luhrmann 1 Paranormal activity 4
Baz Luhrmann 1 Struck by lightning
Baz Luhrmann 1 The big bang theory
Baz Luhrmann 1 The family
Baz Luhrmann 1 The Quarterback
Brad Falchuk 1 A Kitty or a Gaga
Brad Falchuk 1 All or nothing
Brad Falchuk 1 Bridesmaids
Brian Dan 1 All or nothing
I was expecting it to count exactly how many times 'Baz' appears in the table (this should be done for every name) and to display the row only if the count is greater than 3, for example.
Group by the name and use a count()
select name, count(*) as name_count
from your_table
group by name
Aggregate functions like count are applied for each group.
To display only names that appear more than 1 time you can do
select name, count(*) as name_count
from your_table
group by name
having count(*) > 1
Having is like a where clause but for groups.
Edit
select d.LastName, d.FirstName, count(f.Id_Director) as counter
from director d
inner join film f on f.Id_Director = d.id
group by d.LastName, d.FirstName
having count(f.Id_Director) > 2
You had grouped by the film too. That won't work. You basically queried for directors that are more than 2 times part of a film.
The problem is that you are grouping by film. Since each group then holds a single director/film combination, it will always count as 1.
If you want to keep the film names in the result set, I suggest selecting the movies and using a subquery to count how many films that director is joined to.
I just wrote an example on SQLFiddle.
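One way the subquery suggestion could look is sketched below with Python's sqlite3 module. The rows are made up for illustration (the titles are borrowed from the question's output, but which director made what is an assumption), and film_count is a hypothetical alias.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE director (id INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);
    CREATE TABLE film (id INTEGER PRIMARY KEY, title TEXT, Id_Director INTEGER);
    -- Made-up rows for illustration only.
    INSERT INTO director VALUES (1, 'Baz', 'Luhrmann'), (2, 'Brian', 'Dan');
    INSERT INTO film (title, Id_Director) VALUES
        ('Struck by lightning', 1), ('The family', 1), ('The Quarterback', 1),
        ('All or nothing', 2);
""")

# The subquery counts films per director and keeps only directors with more than 2.
# Joining it back to film keeps one output row per title.
rows = conn.execute("""
    SELECT d.LastName, d.FirstName, f.title, c.film_count
    FROM director d
    JOIN film f ON f.Id_Director = d.id
    JOIN (
        SELECT Id_Director, COUNT(*) AS film_count
        FROM film
        GROUP BY Id_Director
        HAVING COUNT(*) > 2
    ) c ON c.Id_Director = d.id
    ORDER BY d.LastName, f.title
""").fetchall()

for row in rows:
    print(row)
```

The director with a single film is filtered out by the HAVING clause inside the subquery, while every film of the remaining director is listed along with the total.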

how to select a row based on a single distinct value

If I have 4 rows and want to select rows based on a single column's distinct values, and I don't mind which row supplies the rest of the data, how do I do this? There doesn't seem to be a 'distinct' function for a single column that keeps the rest of the row data.
E.g.:
Name, value
john 1
john 2
michael 3
michael 5
result
john 1
michael 5
Note that it could have been john 2 or michael 3; I don't care which row it uses for John or Michael for the rest of the values.
GROUP BY Name and apply an aggregate function such as MIN or MAX to value, since you don't care which value is returned:
SELECT Name, MIN(value)
FROM table
GROUP BY Name
Try this
select a.* from TAbleName a
inner join
(
-- one Id per name; GROUP BY already makes the names unique, so DISTINCT is not needed
select name,min(Id) as id from TAbleName
group by name
) as b
on a.name= b.name
and a.id=b.id
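The join-on-minimum-Id approach returns whole rows, which matters as soon as the table has more than two columns. A minimal sketch using Python's sqlite3 module follows; the table name People and the Id column are assumptions, since the question's sample shows only Name and value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- "People" and the Id column are assumptions added for this sketch.
    CREATE TABLE People (Id INTEGER PRIMARY KEY, Name TEXT, value INTEGER);
    INSERT INTO People VALUES
        (1, 'john', 1), (2, 'john', 2), (3, 'michael', 3), (4, 'michael', 5);
""")

# Pick the smallest Id per name, then join back to fetch that complete row.
one_per_name = conn.execute("""
    SELECT a.Id, a.Name, a.value
    FROM People a
    JOIN (
        SELECT Name, MIN(Id) AS Id
        FROM People
        GROUP BY Name
    ) b ON a.Name = b.Name AND a.Id = b.Id
    ORDER BY a.Name
""").fetchall()
print(one_per_name)
```

Unlike taking MIN or MAX of each column separately, this guarantees that all returned values come from the same original row.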
