I am using MS SQL to create a report to merge 2 tables. The problem is that I need 2 different headers and 1 column needs to have values from 2 different fields from the 2 tables.
Sample
Material | Plant
-------------------------------------------------
Component | Quantity
XXX - Material | ABC--Plant
--------------------------------------------------
YYYY-Component | 3000- Quantity
Is this even possible?
SELECT *
FROM (SELECT *, rn = ROW_NUMBER() OVER (ORDER BY column1) FROM table1) x
JOIN (SELECT *, rn1 = ROW_NUMBER() OVER (ORDER BY column2) FROM table2) y
ON x.rn = y.rn1
First, add an extra row-number column to table1, then do the same for table2.
You can then join the two result sets on those row numbers.
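To see the row-number pairing run end to end, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption made only so the example is self-contained). The column names and the second row in each table are hypothetical; only XXX, ABC, YYYY and 3000 come from the sample above.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (material TEXT, plant TEXT);
    CREATE TABLE table2 (component TEXT, quantity INTEGER);
    -- First row of each table follows the sample; second rows are made up.
    INSERT INTO table1 VALUES ('XXX', 'ABC'), ('ZZZ', 'DEF');
    INSERT INTO table2 VALUES ('YYYY', 3000), ('ZZZZ', 150);
""")

# Number the rows of each table independently, then join on that number.
paired = conn.execute("""
    SELECT x.material, x.plant, y.component, y.quantity
    FROM (SELECT *, ROW_NUMBER() OVER (ORDER BY material) AS rn FROM table1) AS x
    JOIN (SELECT *, ROW_NUMBER() OVER (ORDER BY component) AS rn FROM table2) AS y
      ON x.rn = y.rn
    ORDER BY x.rn
""").fetchall()

for row in paired:
    print(row)
```

Note that the pairing depends entirely on the two ORDER BY clauses. If the tables share a real key, joining on that key is the safer choice.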
I have a question about adding data to a particular column of a table. I posted yesterday and a user guided me (thanks for that), saying an UPDATE was the way to go for what I need, but I still can't achieve my goal.
I have two tables: the table the information will be taken from and the table the information will be added to. Here is an example:
source_table (has only one column, called "name_expedient_reviser", which is nvarchar(50))
name_expedient_reviser
kim
randy
phil
cathy
josh
etc.
On the other hand, I have the destination table. It has two columns: one with the ids and one where the names will be inserted. The name values are currently NULL, and the existing ids are the ones that will be used.
This is what the other table looks like:
dbo_expedient_reviser has 2 columns: unique_reviser_code (numeric, PK, NOT AI) and name_expedient_reviser (nvarchar(50)), which holds the users who check expedients. This is how the table looks now:
dbo_expedient_reviser
unique_reviser_code | name_expedient_reviser
1 | NULL
2 | NULL
3 | NULL
4 | NULL
5 | NULL
6 | NULL
What I need is for the information in source_table to be written into the name_expedient_reviser column, so the result should look like this:
dbo_expedient_reviser
unique_reviser_code | name_expedient_reviser
1 | kim
2 | randy
3 | phil
4 | cathy
5 | josh
6 | etc.
How can I get the information into this table? What do I have to do?
EDIT
The query I found that should have worked doesn't update anything. This is it:
UPDATE dbo_expedient_reviser
SET dbo_expedient_reviser.name_expedient_reviser = source_table.name_expedient_reviser
FROM source_table
JOIN dbo_expedient_reviser ON source_table.name_expedient_reviser = dbo_expedient_reviser.name_expedient_reviser
WHERE dbo_expedient_reviser.name_expedient_reviser IS NULL
The query was supposed to copy the information from source_table into the table wherever name_expedient_reviser is NULL, which it is, but it doesn't work.
Since the Names do not have an Id associated with them I would just use ROW_NUMBER and join on ROW_NUMBER = unique_reviser_code. The only problem is, knowing what rows are null. From what I see, they all appear null. In your data, is this the case or are there names sporadically in the table like 5,17,29...etc? If the name_expedient_reviser is empty in dbo_expedient_reviser you could also truncate the table and insert values directly. Hopefully that unique_reviser_code isn't already linked to other things.
WITH CTE (name_expedient_reviser, unique_reviser_code)
AS
(
SELECT name_expedient_reviser
,ROW_NUMBER() OVER (ORDER BY name_expedient_reviser)
FROM source_table
)
UPDATE er
SET er.name_expedient_reviser = cte.name_expedient_reviser
FROM dbo_expedient_reviser er
JOIN CTE on cte.unique_reviser_code = er.unique_reviser_code
Or Truncate:
TRUNCATE TABLE dbo_expedient_reviser
-- The column list must be in the same order as the SELECT list below
INSERT INTO dbo_expedient_reviser (unique_reviser_code, name_expedient_reviser)
SELECT
unique_reviser_code = ROW_NUMBER() OVER (ORDER BY name_expedient_reviser)
,name_expedient_reviser
FROM source_table
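The query from the question's EDIT matches nothing because it joins on name_expedient_reviser, which is NULL on the destination side, and NULL never compares equal to anything. As a rough check of the row-number approach, here is a sketch using Python's sqlite3 module in place of SQL Server (an assumption for the sake of a runnable example). It uses a correlated subquery rather than UPDATE ... FROM so it runs on older SQLite versions.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source_table (name_expedient_reviser TEXT);
    CREATE TABLE dbo_expedient_reviser (
        unique_reviser_code INTEGER PRIMARY KEY,
        name_expedient_reviser TEXT
    );
    INSERT INTO source_table VALUES ('kim'), ('randy'), ('phil'), ('cathy'), ('josh');
    INSERT INTO dbo_expedient_reviser VALUES
        (1, NULL), (2, NULL), (3, NULL), (4, NULL), (5, NULL), (6, NULL);
""")

# Number the source names, then copy each name into the row whose id
# equals that number. Rows without a matching number stay NULL.
conn.execute("""
    WITH numbered AS (
        SELECT name_expedient_reviser,
               ROW_NUMBER() OVER (ORDER BY name_expedient_reviser) AS rn
        FROM source_table
    )
    UPDATE dbo_expedient_reviser
    SET name_expedient_reviser = (
        SELECT n.name_expedient_reviser
        FROM numbered AS n
        WHERE n.rn = dbo_expedient_reviser.unique_reviser_code
    )
    WHERE name_expedient_reviser IS NULL
""")

rows = conn.execute("""
    SELECT unique_reviser_code, name_expedient_reviser
    FROM dbo_expedient_reviser
    ORDER BY unique_reviser_code
""").fetchall()
print(rows)
```

Because the numbering is ordered by name, ids are assigned alphabetically (cathy gets 1, not kim). source_table has no column that records its original order, so reproducing the exact order in the question would need such a column.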
It is not possible to INSERT data into a single column of rows that already exist. An UPDATE that moves the data you want is the only way to go in that case.
Let's say that I have to store the following information in my database,
Now my database tables will be designed and structured like this,
At a later date, if I have to add another sub category level, how will I be able to achieve this without having to change the database structure at all?
I have heard of defining the columns as row data in a table and using pivots to extract the details later on...Is that the proper way to achieve this?
Can someone please enlighten me or guide me in the proper direction? Thanks in advance...
:)
It would be difficult to add more columns to your table when new levels are to be generated. The best way is to use a Hierarchy table to maintain Parent-Child relationship.
Table : Items
x----x------------------x------------x
| ID | Items | CategoryId |
|----x------------------x------------x
| 1 | Pepsi | 3 |
| 2 | Coke | 3 |
| 3 | Wine | 4 |
| 4 | Beer | 4 |
| 5 | Meals | 2 |
| 6 | Fried Rice | 2 |
| 7 | Black Forest | 7 |
| 8 | XMas Cake | 7 |
| 9 | Pinapple Juice | 8 |
| 10 | Apple Juice | 8 |
x----x------------------x------------x
Table : Category
In category table, you can add categories to n levels. In Items table, you can store the lowest level category. For example, take the case of Pepsi - its categoryId is 3. In Category table, you can find its parent using JOINs and find parent's parents using Hierarchy queries.
In the Category table, categories whose ParentId is NULL (that is, with no parent) are the main categories, and the other rows, which do have a ParentId, are sub categories.
EDIT :
Either way you need to alter the tables: with your current schema you cannot keep adding columns to the first table, because the number of sub categories may keep changing. Even if you create a table as per Rhys Jones's answer, you have to join two tables on a string. The problem with joining on a string is that when a sub category or main category name has to change, you must change it in every table, which will cause trouble in the future and is not good database design. So I suggest you follow the pattern below.
Here is the query that get the parents for child items.
DECLARE @ITEM VARCHAR(30) = 'Black Forest'
;WITH CTE AS
(
-- Finds the original parent for an ITEM ie, Black Forest
SELECT I.ID,I.ITEMS,C.CategoryId,C.Category,ParentId,0 [LEVEL]
FROM #ITEMS I
JOIN #Category C ON I.CategoryId=C.CategoryId
WHERE ITEMS = @ITEM
UNION ALL
-- Now it finds the parents with hierarchy level for ITEM
-- ie, Black Forest. This is called Recursive query, which works like loop
SELECT I.ID,I.ITEMS,C.CategoryId,C.Category,C.ParentId,[LEVEL] + 1
FROM CTE I
JOIN #Category C ON C.CategoryId=I.ParentId
)
-- Here we keep a column to show header for pivoting ie, CATEGORY0,CATEGORY1 etc
-- and keep these records in a temporary table #NEWTABLE
SELECT ID,ITEMS,CATEGORYID,CATEGORY,PARENTID,
'CATEGORY'+CAST(ROW_NUMBER() OVER(PARTITION BY ITEMS ORDER BY [LEVEL] DESC)-1 AS VARCHAR(4)) COLS,
ROW_NUMBER() OVER(PARTITION BY ITEMS ORDER BY [LEVEL] DESC)-1 [LEVEL]
INTO #NEWTABLE
FROM CTE
ORDER BY ITEMS,[LEVEL]
OPTION(MAXRECURSION 0)
Here is the result from the above query
Explanation
Black Forest comes under Cake.
Cake comes under Bakery.
Bakery comes under Food.
In this way you can create children or parents for any number of levels. If you now want to add a parent above Food and Beverage, e.g. Food Industry, just add Food Industry to the Category table and set Food Industry's Id as the ParentId of Food and Beverage. That's all.
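The parent walk described above can be tried out with a small self-contained sketch. It uses Python's sqlite3 module rather than SQL Server (an assumption, so that it runs anywhere), and because the Category table contents are not shown in the text, the category ids and ParentId values below are hypothetical. Only the chain Black Forest, Cake, Bakery, Food comes from the explanation.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Category (CategoryId INTEGER PRIMARY KEY, Category TEXT, ParentId INTEGER);
    CREATE TABLE Items (ID INTEGER PRIMARY KEY, Items TEXT, CategoryId INTEGER);
    -- Hypothetical ids; a NULL ParentId marks a top-level category.
    INSERT INTO Category VALUES (1, 'Food', NULL), (6, 'Bakery', 1), (7, 'Cake', 6);
    INSERT INTO Items VALUES (7, 'Black Forest', 7);
""")


def category_path(item_name):
    """Return the item's categories from top-level parent down to its own."""
    rows = conn.execute("""
        WITH RECURSIVE chain(CategoryId, Category, ParentId, lvl) AS (
            -- Anchor: the category the item is stored under
            SELECT c.CategoryId, c.Category, c.ParentId, 0
            FROM Items AS i
            JOIN Category AS c ON c.CategoryId = i.CategoryId
            WHERE i.Items = ?
            UNION ALL
            -- Recursive step: move up to the parent category
            SELECT c.CategoryId, c.Category, c.ParentId, chain.lvl + 1
            FROM chain
            JOIN Category AS c ON c.CategoryId = chain.ParentId
        )
        SELECT Category FROM chain ORDER BY lvl DESC
    """, (item_name,)).fetchall()
    return [r[0] for r in rows]


path = category_path("Black Forest")
print(path)
```

Adding a new level only means inserting one more Category row and pointing an existing row's ParentId at it. Neither the query nor the table definitions change.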
Now if you want do pivoting, you can follow the below procedures.
1. Get values from column to show those values as column in pivot
DECLARE @cols NVARCHAR (MAX)
SELECT @cols = COALESCE (@cols + ',[' + COLS + ']', '[' + COLS + ']')
FROM (SELECT DISTINCT COLS,[LEVEL] FROM #NEWTABLE) PV
ORDER BY [LEVEL]
2. Now use the below PIVOT query
DECLARE @query NVARCHAR(MAX)
SET @query = 'SELECT * FROM
(
SELECT ITEMS, CATEGORY, COLS
FROM #NEWTABLE
) x
PIVOT
(
MIN(CATEGORY)
FOR COLS IN (' + @cols + ')
) p
ORDER BY ITEMS;'
EXEC SP_EXECUTESQL @query
You will get the below result after the pivot
NOTE
If you want the records for every item rather than a single one, remove the WHERE clause inside the CTE.
The pivot columns above are ordered DESC, i.e. it shows the top-level parent first and the item's immediate parent last. If you want the item's immediate parent first, followed by the next level, with the top-level parent last, change DESC inside the ROW_NUMBER() to ASC.
According to your schema there's no relationship between 'main category' and 'sub category', but your sample data suggests there would be a relationship, i.e. Alcohol IS A Beverage etc. This sounds like a hierarchy of categories, in which case you could use a single self-referencing Category table instead;
create table dbo.Category (
CategoryID int not null constraint PK_Category primary key clustered (CategoryID),
ParentCategoryID int not null,
CategoryName varchar(100) not null
)
alter table dbo.Category add constraint FK_Category_Category foreign key(ParentCategoryID) references dbo.Category (CategoryID)
insert dbo.Category values (1, 1, 'Beverages')
insert dbo.Category values (2, 1, 'Soft Drink')
insert dbo.Category values (3, 1, 'Alcohol')
This way you can create as many levels of category as you want. Any category where ParentCategoryID = CategoryID is a top level category.
Hope this helps,
Rhys
To add a new sub category, first add the category to the "ItemSubCategory1" table. After that you can easily add the item to the "Drinks" table.
For Example:
If there is a new category name "Hot Drinks" and a new item "Coffee" which comes in Beverages main category (let CatId=1, MainCatText='Beverages' in ItemMainCategory table) then
INSERT INTO ItemSubCategory1(CatId,SubCatText) VALUES(4,'Hot Drinks')
INSERT INTO Drinks(ItemId,ItemName,ItemMainCategory,ItemSubCategory)
VALUES(5,'Coffee',1,4)
In MS-SQL, I have a View 'ListingResult' which contains rows from tables 'ListingCategoryXref' and 'Listing'. This is the View statement:
SELECT
dbo.Listing.ListingName,
dbo.Listing.ListingId,
dbo.ListingCategoryXref.CategoryId
FROM dbo.Listing INNER JOIN
dbo.ListingCategoryXref ON dbo.Listing.ListingId = dbo.ListingCategoryXref.ListingId
GROUP BY
dbo.Listing.ListingName,
dbo.Listing.ListingId,
dbo.ListingCategoryXref.CategoryId
Listings can have many rows in ListingCategoryXref, thus.
ListingResult (View)
Listing (table)
ListingId ListingName StateId
1 Toms bar 3
2 French place 5
ListingCategoryXref (table)
ListingId CategoryId
1 10
1 15
The query below returns a row per Listing per ListingCategoryXref.
SELECT TOP(26)
[ListingResult].[ListingId],
[ListingResult].[ListingName]
FROM [ListingResult]
WHERE [ListingResult].[StateId] = 3
So 'Tom's Bar' is returned twice as it has two categories. I figure I can either change the query above, or change the ListingResult View in SQL. I still need to return 26 results which I can't dictate if I use a wrapped select statement with ROW_NUMBER() OVER(PARTITION BY ListingId. (Is that true?) I'm using LLBLGen to access the DB so I'd prefer to change the view, if that is possible? Apologies for my newness to SQL being that obvious.
From the query above, the following result will be returned...
ListingName | ListingId | CategoryId
Toms bar | 1 | 10
Toms bar | 1 | 15
If you only want Toms bar to be returned once, you'll need either to remove the CategoryId column from both the result set and the GROUP BY clause, or to wrap CategoryId in an aggregate function and remove it from the GROUP BY clause, i.e.
SELECT
dbo.Listing.ListingName,
dbo.Listing.ListingId,
COUNT(dbo.ListingCategoryXref.CategoryId) as Categories
FROM dbo.Listing
INNER JOIN dbo.ListingCategoryXref ON dbo.Listing.ListingId = dbo.ListingCategoryXref.ListingId
GROUP BY dbo.Listing.ListingName, dbo.Listing.ListingId
Which will return...
ListingName | ListingId | Categories
Toms bar | 1 | 2
Can you give an example of what you would like to see?
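If the category is only needed as a filter and not in the output, one further option is to test for it with EXISTS instead of joining, which keeps one row per listing so a row limit still applies to listings. The sketch below is only an illustration under that assumption. It runs on SQLite through Python's sqlite3 module, so LIMIT 26 takes the place of TOP(26), and it queries the base tables directly because the view as posted does not expose StateId.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Listing (ListingId INTEGER PRIMARY KEY, ListingName TEXT, StateId INTEGER);
    CREATE TABLE ListingCategoryXref (ListingId INTEGER, CategoryId INTEGER);
    INSERT INTO Listing VALUES (1, 'Toms bar', 3), (2, 'French place', 5);
    INSERT INTO ListingCategoryXref VALUES (1, 10), (1, 15);
""")

# EXISTS only checks that at least one category row is present, so a
# listing with two categories still produces a single output row.
listings = conn.execute("""
    SELECT l.ListingId, l.ListingName
    FROM Listing AS l
    WHERE l.StateId = 3
      AND EXISTS (SELECT 1
                  FROM ListingCategoryXref AS x
                  WHERE x.ListingId = l.ListingId)
    ORDER BY l.ListingId
    LIMIT 26
""").fetchall()
print(listings)
```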
I have a table that looks similar to this:
session_id | sku
------------|-----
a | 1
a | 2
a | 3
a | 4
b | 2
b | 3
c | 3
I want to pivot this into a table similar to this:
sku1 | sku2 | score
------|------|------
1 | 2 | 1
1 | 3 | 1
1 | 4 | 1
2 | 3 | 2
2 | 4 | 1
3 | 4 | 1
The idea is to store a denormalised table that allows one to look up for a given sku, what other skus are related to sessions it has been related to, and how many times both skus are related to the same session.
What algorithms, patterns or strategies could you suggest for implementing this in PostgreSQL or other technologies?
I realise that this kind of lookup can be done on the original table using counts, or using a faceted search engine. However, I want to make the reads more performant, and just want to keep the overall statistics. The idea is that I will be performing this pivot regularly on the newest few thousand rows in the first table, then storing the result in the second. I'm only concerned with approximate statistics for the second table.
I've got some SQL that works, but VERY slowly. Also looking into the potential for using a graph database of some sort, but wanted to avoid adding another technology for a small part of the app.
Update: The SQL below seems performant enough. I can convert 1.2 million rows in the first table (tags) into 250k rows in the second table (product_relations) with around 2-3k variations of sku in about 5 minutes on my iMac. I will realistically be denormalising only up to 10k rows per day. Question is whether this is actually the best approach. Seems a little dirty to me.
BEGIN;
CREATE
TEMPORARY TABLE working_tags(tag_id int, session_id varchar, sku varchar) ON COMMIT DROP;
INSERT INTO working_tags
SELECT id,
session_id,
sku
FROM tags
WHERE time < now() - interval '12 hours'
AND processed_product_relation IS NULL
AND sku IS NOT NULL LIMIT 200000;
CREATE
TEMPORARY TABLE working_relations (sku1 varchar, sku2 varchar, score int) ON COMMIT DROP;
INSERT INTO working_relations
SELECT a.sku AS sku1,
b.sku AS sku2,
count(DISTINCT a.session_id) AS score
FROM working_tags AS a
INNER JOIN working_tags AS b ON a.session_id = b.session_id
AND a.sku < b.sku
WHERE a.sku IS NOT NULL
AND b.sku IS NOT NULL
GROUP BY a.sku,
b.sku;
UPDATE product_relations
SET score = working_relations.score+product_relations.score
FROM working_relations
WHERE working_relations.sku1 = product_relations.sku1
AND working_relations.sku2 = product_relations.sku2;
INSERT INTO product_relations (sku1, sku2, score)
SELECT working_relations.sku1,
working_relations.sku2,
working_relations.score
FROM working_relations
LEFT OUTER JOIN product_relations ON (working_relations.sku1 = product_relations.sku1
AND working_relations.sku2 = product_relations.sku2)
WHERE product_relations.sku1 IS NULL;
UPDATE tags
SET processed_product_relation = TRUE
WHERE id IN
(SELECT tag_id
FROM working_tags);
COMMIT;
If I've interpreted your intention correctly (per comments) this should do it:
SELECT
s1.sku AS sku1,
s2.sku AS sku2,
count(session_id)
FROM session s1
INNER JOIN session s2 USING (session_id)
WHERE s1.sku < s2.sku
GROUP BY s1.sku, s2.sku
ORDER BY 1,2;
See: http://sqlfiddle.com/#!15/2e0b2/1
In other words: Self-join session, then find all pairings of SKUs for each session ID, excluding ones where the left is greater than or equal to the right in order to avoid repeating pairings - if we have (1,2,count) we don't want (2,1,count) as well. Then group by the SKU pairings and count how many rows are found for each pairing.
You may want to count(distinct session_id) instead, if your SKU pairings can repeat and you want to exclude duplicates. There will probably be more efficient ways to do that, but that's the simplest.
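As a quick check of the self-join against the sample data in the question, here is a sketch using Python's sqlite3 module instead of PostgreSQL (an assumption made so the example has no external dependencies). It uses the count(DISTINCT ...) variant mentioned above, and the table name session follows the answer's query.

```python
import sqlite3

# In-memory SQLite database standing in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE session (session_id TEXT, sku INTEGER)")
conn.executemany(
    "INSERT INTO session VALUES (?, ?)",
    [("a", 1), ("a", 2), ("a", 3), ("a", 4), ("b", 2), ("b", 3), ("c", 3)],
)

# Pair every SKU with each larger SKU seen in the same session, then
# count the distinct sessions in which each pair occurs.
pairs = conn.execute("""
    SELECT s1.sku, s2.sku, COUNT(DISTINCT s1.session_id) AS score
    FROM session AS s1
    JOIN session AS s2
      ON s1.session_id = s2.session_id AND s1.sku < s2.sku
    GROUP BY s1.sku, s2.sku
    ORDER BY s1.sku, s2.sku
""").fetchall()

for sku1, sku2, score in pairs:
    print(sku1, sku2, score)
```

The result matches the target table in the question: only the pair (2, 3) scores 2, because it is the only pair that appears in both session a and session b.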
An index on at least session_id will be very useful. You may also want to mess with planner cost parameters to make sure it chooses a good plan - in particular, make sure effective_cache_size is accurate and random_page_cost vs seq_page_cost reflects your caching and I/O costs. Finally, throw as much work_mem at it as you can afford.
If you're creating a materialized view, just use CREATE UNLOGGED TABLE whatever AS SELECT .... That way you minimise the number of writes/rewrites/overwrites.
Basically what I'm trying to figure out is,
Say I have
table 1, tbl1
ID | Name
and table 2, tbl2
ID | Name
Then I have a mapping table mt
ID | tbl1ID | tbl2ID
Data really isn't important here, and these tables are examples.
How do I make a view that will grab all the items in tbl1 that aren't mapped in mt?
I'm using Microsoft SQL-server 2008 by the way.
CREATE VIEW v_unmapped
AS
SELECT *
FROM tbl1
WHERE id NOT IN
(
SELECT tbl1Id
FROM mt
)
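One caveat with NOT IN: if mt.tbl1Id can ever be NULL, the comparison becomes unknown for every unmatched row and the view returns nothing. NOT EXISTS does not have that problem. The sketch below shows the difference using Python's sqlite3 module as a stand-in for SQL Server 2008 (an assumption for runnability; both follow the standard NULL rules here). The sample rows and the two view names are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl1 (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE mt (ID INTEGER PRIMARY KEY, tbl1ID INTEGER, tbl2ID INTEGER);
    INSERT INTO tbl1 VALUES (1, 'one'), (2, 'two'), (3, 'three');
    -- Hypothetical data: the second mapping row has a NULL tbl1ID.
    INSERT INTO mt VALUES (1, 1, 10), (2, NULL, 20);

    CREATE VIEW v_unmapped_not_in AS
        SELECT * FROM tbl1
        WHERE ID NOT IN (SELECT tbl1ID FROM mt);

    CREATE VIEW v_unmapped_not_exists AS
        SELECT * FROM tbl1 AS t
        WHERE NOT EXISTS (SELECT 1 FROM mt WHERE mt.tbl1ID = t.ID);
""")

not_in_rows = conn.execute(
    "SELECT ID FROM v_unmapped_not_in ORDER BY ID").fetchall()
not_exists_rows = conn.execute(
    "SELECT ID FROM v_unmapped_not_exists ORDER BY ID").fetchall()

print("NOT IN:    ", not_in_rows)
print("NOT EXISTS:", not_exists_rows)
```

If tbl1Id is declared NOT NULL in mt, the two views return the same rows and the NOT IN version above is fine as written.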