Create materialized view from multiple tables with identical schemas - database

I am trying to create a materialized view in Redshift.
I have 100 tables of the form
user_1
user_2
...
user_100
Each table has the same schema. For example:
user_1
| id | user_id | date | expense |
|----|---------|----------|---------|
| 1 | 1 | 20200521 | 200 |
| 2 | 2 | 20200601 | 100 |
| 3 | 1 | 20200603 | 90 |
user_2
| id | user_id | date | expense |
|----|---------|----------|---------|
| 1 | 1 | 20200521 | 250 |
| 2 | 3 | 20200204 | 10 |
| 3 | 2 | 20200403 | 50 |
What I want to do is to create a materialized view which has all the rows from all 100 tables.
The objective is to run queries that calculate sum(expense) for a given user_id between certain dates.
So my materialized view will be something like this:
user_1
| id | user_id | date | expense |
|----|---------|----------|---------|
| 1 | 1 | 20200521 | 200 |
| 2 | 2 | 20200601 | 100 |
| 3 | 1 | 20200603 | 90 |
| 4 | 1 | 20200521 | 250 |
| 5 | 3 | 20200204 | 10 |
| 6 | 2 | 20200403 | 50 |
I am having trouble with the CREATE query for this view.
Any guidance on the query to create this view is appreciated.
Thank you.

How about UNIONing the tables?
Keep in mind the difference between UNION, which does an implicit DISTINCT, and UNION ALL, which just combines the rows from the tables as they come.
CREATE MATERIALIZED VIEW users
AS
SELECT * FROM user_1 UNION ALL
SELECT * FROM user_2 UNION ALL
SELECT * FROM user_3 UNION ALL
...
SELECT * FROM user_100
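To see the UNION ALL approach end to end, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for Redshift. This is an assumption made purely so the example can run locally: SQLite has no materialized views, so a plain CREATE VIEW is used instead, and only user_1 and user_2 from the question are created. The helper total_expense is a hypothetical name, not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Two of the identically shaped tables from the question (the rest are omitted).
for name in ("user_1", "user_2"):
    cur.execute(
        f"CREATE TABLE {name} (id INTEGER, user_id INTEGER, date TEXT, expense INTEGER)"
    )
cur.executemany(
    "INSERT INTO user_1 VALUES (?, ?, ?, ?)",
    [(1, 1, "20200521", 200), (2, 2, "20200601", 100), (3, 1, "20200603", 90)],
)
cur.executemany(
    "INSERT INTO user_2 VALUES (?, ?, ?, ?)",
    [(1, 1, "20200521", 250), (2, 3, "20200204", 10), (3, 2, "20200403", 50)],
)

# Stand-in for CREATE MATERIALIZED VIEW: UNION ALL keeps every row, duplicates included.
cur.execute("""
    CREATE VIEW users AS
    SELECT * FROM user_1 UNION ALL
    SELECT * FROM user_2
""")

def total_expense(user_id, start, end):
    """Sum expense for one user between two yyyymmdd dates (inclusive)."""
    row = cur.execute(
        "SELECT COALESCE(SUM(expense), 0) FROM users "
        "WHERE user_id = ? AND date BETWEEN ? AND ?",
        (user_id, start, end),
    ).fetchone()
    return row[0]

row_count = cur.execute("SELECT COUNT(*) FROM users").fetchone()[0]
total_user_1 = total_expense(1, "20200501", "20200630")
print(row_count, total_user_1)
```

Note that the id values are carried over unchanged from the source tables, so they repeat across tables rather than running from 1 to 6 as in the desired output.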

Related

Sorting Table in hierarchical order

Is it possible to sort a query's result in hierarchical order, like this:
Expected
+----+--------+-----------+-------+--------+-----------+-----------+---------+
| ID | Code | Name | Qty | Amount | is_parent | parent_id | remarks |
+----+--------+-----------+-------+--------+-----------+-----------+---------+
| 1 | ABC | Parent1 | 2 | 1,000 | 1 | 0 | xxx |
+----+--------+-----------+-------+--------+-----------+-----------+---------+
| 4 | FFLK | Product Z | 10 | 2,500 | 0 | 1 | xxx |
+----+--------+-----------+-------+--------+-----------+-----------+---------+
| 5 | P6DT | Product 5 | 7 | 1,700 | 0 | 1 | xxx |
+----+--------+-----------+-------+--------+-----------+-----------+---------+
| 6 | P2GL | Product T | 5 | 1,100 | 0 | 1 | xxx |
+----+--------+-----------+-------+--------+-----------+-----------+---------+
| 2 | DHG | Parent2 | 5 | 1,500 | 1 | 0 | xxx |
+----+--------+-----------+-------+--------+-----------+-----------+---------+
| 3 | LMSJ | Product U | 4 | 600 | 0 | 2 | xxx |
+----+--------+-----------+-------+--------+-----------+-----------+---------+
This is the original data table:
+----+--------+-----------+-------+--------+-----------+-----------+---------+
| ID | Code | Name | Qty | Amount | is_parent | parent_id | remarks |
+----+--------+-----------+-------+--------+-----------+-----------+---------+
| 1 | ABC | Parent1 | 2 | 1,000 | 1 | 0 | xxx |
+----+--------+-----------+-------+--------+-----------+-----------+---------+
| 2 | DHG | Parent2 | 5 | 1,500 | 1 | 0 | xxx |
+----+--------+-----------+-------+--------+-----------+-----------+---------+
| 3 | LMSJ | Product U | 4 | 600 | 0 | 2 | xxx |
+----+--------+-----------+-------+--------+-----------+-----------+---------+
| 4 | FFLK | Product Z | 10 | 2,500 | 0 | 1 | xxx |
+----+--------+-----------+-------+--------+-----------+-----------+---------+
| 5 | P6DT | Product 5 | 7 | 1,700 | 0 | 1 | xxx |
+----+--------+-----------+-------+--------+-----------+-----------+---------+
| 6 | P2GL | Product T | 5 | 1,100 | 0 | 1 | xxx |
+----+--------+-----------+-------+--------+-----------+-----------+---------+
is_parent column = 1 if the row is a parent, 0 if the row is a child
parent_id column = 0 if the row is a parent; otherwise it holds the ID of the parent row
I'm using SQL Server to generate the data.
It looks like the actual question is how to query the data in hierarchical order. This is possible using recursive queries but a faster alternative is to use SQL Server's support for hierarchical data.
A recursive query that returns the data in hierarchical order would look like this :
WITH h AS
(
SELECT
ID,Code,Name,Qty,Amount,is_parent,parent_id,remarks
FROM
dbo.ThatTable
WHERE
parent_id=0
UNION ALL
SELECT
c.ID,c.Code,c.Name,c.Qty,c.Amount,c.is_parent,c.parent_id,c.remarks
FROM
dbo.ThatTable c
INNER JOIN h ON
c.parent_id= h.Id
)
SELECT * FROM h
This query's performance will be acceptable if the ID and Parent_ID fields are indexed, but not great.
Adding a hierarchyid field to the table would make the query simpler and far faster. Assuming there's a hierarchy field, the query would be just :
SELECT *
FROM ThatTable
ORDER BY hierarchy
Adding an index on hierarchy will make this query, and any query that looks e.g. for the children of a specific node, very fast. Instead of querying recursively, the server only needs to look into that single index.
The article Lesson 1: Converting a Table to a Hierarchical Structure shows how to create a new table with a hierarchyid and populate it from parent/child data.
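A recursive query does not by itself promise any particular row order, so one way to get the parent-then-children order explicitly is to build a sortable path while recursing. The sketch below tries this with Python's sqlite3 module standing in for SQL Server (an assumption made for runnability: SQLite needs the RECURSIVE keyword and offers printf for zero-padding, while SQL Server omits the keyword and would need its own padding expression). The sort_path column is a hypothetical helper, and only a few of the question's columns are kept.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute(
    "CREATE TABLE ThatTable (ID INTEGER PRIMARY KEY, Code TEXT, Name TEXT, parent_id INTEGER)"
)
cur.executemany(
    "INSERT INTO ThatTable VALUES (?, ?, ?, ?)",
    [
        (1, "ABC", "Parent1", 0),
        (2, "DHG", "Parent2", 0),
        (3, "LMSJ", "Product U", 2),
        (4, "FFLK", "Product Z", 1),
        (5, "P6DT", "Product 5", 1),
        (6, "P2GL", "Product T", 1),
    ],
)

# sort_path (hypothetical helper column) holds zero-padded IDs from the root down,
# so ordering by it lists each parent immediately before its children.
rows = cur.execute("""
    WITH RECURSIVE h AS (
        SELECT ID, Code, Name, parent_id, printf('%010d', ID) AS sort_path
        FROM ThatTable
        WHERE parent_id = 0
        UNION ALL
        SELECT c.ID, c.Code, c.Name, c.parent_id,
               h.sort_path || '/' || printf('%010d', c.ID)
        FROM ThatTable c
        INNER JOIN h ON c.parent_id = h.ID
    )
    SELECT ID, Name FROM h ORDER BY sort_path
""").fetchall()

ordered_ids = [r[0] for r in rows]
print(ordered_ids)
```

With the sample data from the question, the IDs come back as 1, 4, 5, 6, 2, 3, which matches the expected table.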

Find the newest entry of a crosstable per record?

I have three tables:
1. A table of my products with their IDs and their features.
2. A table of treatments of my products, with a treatment ID, a method, and a date. The treatments are done in batches of many products, so there is
3. a crosstable with the product IDs, the treatment IDs, and a Boolean value for the success of the treatment.
Each product can undergo many different treatments so there is a many-to-many relation. I now want to add to the product table (1.) for every product a value that shows the method of its most recent successful treatment if there is any.
I made a query that groups the crosstable's entries by product ID, but I don't know how to show the method and date of its last treatment.
table 1:
| productID | size | weight | height | ... |
|-----------|:----:|-------:|--------|-----|
| 1 | 13 | 16 | 9 | ... |
| 2 | 12 | 17 | 12 | ... |
| 3 | 11 | 15 | 15 | ... |
| ... | ... | ... | ... | ... |
table 2:
| treatmentID | method | date |
|-------------|:--------:|-----------:|
| 1 | dye blue | 01.02.2016 |
| 2 | dye red | 01.02.2017 |
| 3 | dye blue | 01.02.2018 |
| ... | ... | ... |
table 3:
| productID | treatmentID | success |
|-----------|:-----------:|--------:|
| 1 | 1 | yes |
| 1 | 2 | yes |
| 1 | 3 | no |
| ... | ... | ... |
I need table 1 to be like:
table 1:
| productID | size | weight | height | latest successful method |
|-----------|:----:|-------:|--------|--------------------------|
| 1 | 13 | 16 | 9 | dye red |
| 2 | 12 | 17 | 12 | ... |
| 3 | 11 | 15 | 15 | ... |
| ... | ... | ... | ... | ... |
My query:
SELECT table3.productID, table2.method
FROM table2 INNER JOIN table3 ON table2.treatmentID = table3.treatmentID
GROUP BY table3.productID, table2.method
HAVING (((table3.productID)=Max([table2].[date])))
ORDER BY table3.productID DESC;
but this does NOT show only one (the most recent) entry but all of them.
The simplest solution here would be to write a subquery within your SQL, or to create a new saved query to act as a subquery (it will look like a table), to help identify (or eliminate) the records you want to see.
I'm using similar but slightly different source data, as you only gave one example.
Table1
| ProductID | Size | Weight | Height |
|-----------|------|--------|--------|
| 1 | 13 | 16 | 9 |
| 2 | 12 | 17 | 12 |
| 3 | 11 | 15 | 15 |
Table2
| TreatmentID | Method | Date |
|-------------|------------|----------|
| 1 | dye blue | 1/2/2016 |
| 2 | dye red | 1/2/2017 |
| 3 | dye blue | 1/2/2018 |
| 4 | dye yellow | 1/4/2017 |
| 5 | dye brown | 1/5/2018 |
Table3
| ProductID | TreatmentID | Success |
|-----------|-------------|---------|
| 1 | 1 | yes |
| 1 | 2 | yes |
| 1 | 3 | no |
| 2 | 4 | no |
| 2 | 5 | yes |
First order of business is to get the max(dates) and productIds of successful treatments.
We'll do this by aggregating the date along with the productIDs and "success".
SELECT Table3.productid, Max(Table2.Date) AS MaxOfdate, Table3.success
FROM Table2 INNER JOIN Table3 ON Table2.treatmentid = Table3.treatmentid
GROUP BY Table3.productid, Table3.success;
This should give us something along the lines of:
| ProductID | MaxofDate | Success |
|-----------|-----------|---------|
| 1 | 1/2/2018 | No |
| 1 | 1/2/2017 | Yes |
| 2 | 1/4/2017 | No |
| 2 | 1/5/2018 | Yes |
We'll save this query as a "regular" query. I named mine "max"; you should probably use something more descriptive. You'll see "max" in the next query.
Next we'll join tables 1 to 3 together, and also use the saved "max" query to link Table1 and Table2: productID matches on both sides, MaxOfDate matches the treatment date, and success = "yes". This finds the details of the most recent SUCCESSFUL treatment.
SELECT table1.productid, table1.size, table1.weight, table1.height, Table2.method
FROM ((table1 INNER JOIN [max] ON table1.productid = max.productid)
INNER JOIN Table2 ON max.MaxOfdate = Table2.date) INNER JOIN Table3 ON
(Table2.treatmentid = Table3.treatmentid) AND (table1.productid = Table3.productid)
WHERE (((max.success)="yes"));
The design will look something like this:
[Image: query design]
(P.S. You can add saved queries to the query design editor by clicking the "Queries" tab when you are adding tables to your query design. They act just like tables; just be careful, as very detailed queries tend to bog down Access.)
Running this query should give us our final results.
| ProductID | Size | Weight | Height | Method |
|-----------|------|--------|--------|-----------|
| 1 | 13 | 16 | 9 | dye red |
| 2 | 12 | 17 | 12 | dye brown |
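The two-step idea above (first find the latest successful date per product, then join back to fetch the method) can be tried outside Access. The sketch below is a variation rather than the exact saved-query setup: it uses Python's sqlite3 module, writes the "max" step as a CTE named latest (a hypothetical name), and filters on success before grouping. The dates are rewritten in ISO yyyy-mm-dd form, reading the originals as day/month/year; that reading is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
    CREATE TABLE Table1 (productID INTEGER, size INTEGER, weight INTEGER, height INTEGER);
    CREATE TABLE Table2 (treatmentID INTEGER, method TEXT, date TEXT);
    CREATE TABLE Table3 (productID INTEGER, treatmentID INTEGER, success TEXT);
    INSERT INTO Table1 VALUES (1, 13, 16, 9), (2, 12, 17, 12), (3, 11, 15, 15);
    -- Dates stored as ISO yyyy-mm-dd (an assumption) so MAX() on text follows date order.
    INSERT INTO Table2 VALUES
        (1, 'dye blue', '2016-02-01'), (2, 'dye red', '2017-02-01'),
        (3, 'dye blue', '2018-02-01'), (4, 'dye yellow', '2017-04-01'),
        (5, 'dye brown', '2018-05-01');
    INSERT INTO Table3 VALUES
        (1, 1, 'yes'), (1, 2, 'yes'), (1, 3, 'no'), (2, 4, 'no'), (2, 5, 'yes');
""")

rows = cur.execute("""
    WITH latest AS (
        -- Step 1: most recent successful treatment date per product.
        SELECT t3.productID, MAX(t2.date) AS max_date
        FROM Table2 t2
        JOIN Table3 t3 ON t2.treatmentID = t3.treatmentID
        WHERE t3.success = 'yes'
        GROUP BY t3.productID
    )
    -- Step 2: join back to pick up the method of that treatment.
    SELECT t1.productID, t1.size, t1.weight, t1.height, t2.method
    FROM Table1 t1
    JOIN latest l ON l.productID = t1.productID
    JOIN Table3 t3 ON t3.productID = t1.productID AND t3.success = 'yes'
    JOIN Table2 t2 ON t2.treatmentID = t3.treatmentID AND t2.date = l.max_date
    ORDER BY t1.productID
""").fetchall()

for row in rows:
    print(row)
```

Because these are inner joins, product 3, which has no successful treatment, does not appear in the result; a LEFT JOIN from Table1 would keep it with an empty method.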

Count rows which has the same ID and display on the table

This is the original table:
| ID | Card_No |
|----+---------|
| 1 | 6453671 |
| 1 | 8795732 |
| 1 | 9948495 |
| 2 | 7483009 |
| 2 | 1029001 |
| 3 | 7463094 |
Is it possible to make it like this, by adding a calculated column to the original table?
| ID | Card_No | Total |
|----+---------|-------|
| 1 | 6453671 | 3 |
| 1 | 8795732 | 3 |
| 1 | 9948495 | 3 |
| 2 | 7483009 | 2 |
| 2 | 1029001 | 2 |
| 3 | 7463094 | 1 |
I'm using Microsoft Access, and I've tried code like this:
SELECT ID, COUNT (*) AS Total FROM Table GROUP BY ID
But I did not get the result I want.
First of all, saving the calculated value back into the table is not only unnecessary but also bad design.
Options:
build a report that counts records with an expression in a textbox
build an aggregate query, then another query joining the aggregate query to the table
use the DCount() domain aggregate function in a query
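The second option (an aggregate query joined back to the table) could look roughly like the sketch below. It runs against Python's sqlite3 module rather than Access, which is an assumption made only so the example is runnable, and the table is named cards (a hypothetical name) because Table is a reserved word.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE cards (ID INTEGER, Card_No TEXT)")
cur.executemany(
    "INSERT INTO cards VALUES (?, ?)",
    [
        (1, "6453671"), (1, "8795732"), (1, "9948495"),
        (2, "7483009"), (2, "1029001"),
        (3, "7463094"),
    ],
)

# The inner SELECT is the aggregate query; joining it back on ID
# repeats each ID's row count next to every one of its cards.
rows = cur.execute("""
    SELECT c.ID, c.Card_No, t.Total
    FROM cards AS c
    INNER JOIN (SELECT ID, COUNT(*) AS Total FROM cards GROUP BY ID) AS t
        ON c.ID = t.ID
    ORDER BY c.ID, c.Card_No
""").fetchall()

totals = [r[2] for r in rows]
for row in rows:
    print(row)
```

The count is computed each time the query runs, so nothing has to be stored in the table itself.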

Group by Date bigger than SQL Server

I'm working with Microsoft SQL Server 2012 and I have this table:
Table
+-----------+--------+------------+
| Id_Client | Amount | Date |
+-----------+--------+------------+
| 1 | 100 | 24/08/2015 |
| 2 | 100 | 24/07/2015 |
| 3 | 100 | 24/06/2015 |
| 3 | 100 | 24/05/2015 |
+-----------+--------+------------+
And I need to make a query like this:
Query
SELECT ID_CLIENT,
CASE WHEN DATE <= '01/07/2015' THEN 'OLD' ELSE 'NEW' END,
SUM(AMOUNT) FROM TABLE
GROUP BY ID_CLIENT
How do I group by the date condition, instead of by each individual date?
I expect something like:
Expected result
+---+-----+-----+
| 1 | NEW | 100 |
| 2 | NEW | 100 |
| 3 | OLD | 200 |
+---+-----+-----+
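One possible way to get this result is to repeat the CASE expression in the GROUP BY clause, so that rows are grouped by the OLD/NEW label instead of by each date. The sketch below checks the idea with Python's sqlite3 module as a stand-in for SQL Server. The table name payments and the aliases Age and Total are hypothetical, and the dates are stored as ISO yyyy-mm-dd text (an assumption) so that string comparison follows date order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE payments (Id_Client INTEGER, Amount INTEGER, Date TEXT)")
# Dates stored as ISO yyyy-mm-dd (an assumption) so text comparison matches date order.
cur.executemany(
    "INSERT INTO payments VALUES (?, ?, ?)",
    [
        (1, 100, "2015-08-24"),
        (2, 100, "2015-07-24"),
        (3, 100, "2015-06-24"),
        (3, 100, "2015-05-24"),
    ],
)

# Grouping by the same CASE expression that is selected collapses all
# dates on one side of the cutoff into a single OLD or NEW bucket.
rows = cur.execute("""
    SELECT Id_Client,
           CASE WHEN Date <= '2015-07-01' THEN 'OLD' ELSE 'NEW' END AS Age,
           SUM(Amount) AS Total
    FROM payments
    GROUP BY Id_Client,
             CASE WHEN Date <= '2015-07-01' THEN 'OLD' ELSE 'NEW' END
    ORDER BY Id_Client
""").fetchall()

for row in rows:
    print(row)
```

A client with payments on both sides of the cutoff would get two rows, one OLD and one NEW.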

How to implement Auto_Increment per User, on the same table?

I would like to have multiple users that share the same tables in the database, but with one auto_increment value per user. I will use an embedded database, JavaDB, and as far as I know it doesn't support this functionality. How can I implement it?
Should I implement a trigger on inserts that looks up the user's last inserted row and then adds one, or is there a better alternative? Or is it better to implement this in the application code?
Or is this just a bad idea? I think this is easier to maintain than creating new tables for every user.
Example:
table
+----+-------------+---------+------+
| ID | ID_PER_USER | USER_ID | DATA |
+----+-------------+---------+------+
| 1 | 1 | 2 | 3454 |
| 2 | 2 | 2 | 6567 |
| 3 | 1 | 3 | 6788 |
| 4 | 3 | 2 | 1133 |
| 5 | 4 | 2 | 4534 |
| 6 | 2 | 3 | 4366 |
| 7 | 3 | 3 | 7887 |
+----+-------------+---------+------+
SELECT * FROM table WHERE USER_ID = 3
+----+-------------+---------+------+
| ID | ID_PER_USER | USER_ID | DATA |
+----+-------------+---------+------+
| 3 | 1 | 3 | 6788 |
| 6 | 2 | 3 | 4366 |
| 7 | 3 | 3 | 7887 |
+----+-------------+---------+------+
SELECT * FROM table WHERE USER_ID = 2
+----+-------------+---------+------+
| ID | ID_PER_USER | USER_ID | DATA |
+----+-------------+---------+------+
| 1 | 1 | 2 | 3454 |
| 2 | 2 | 2 | 6567 |
| 4 | 3 | 2 | 1133 |
| 5 | 4 | 2 | 4534 |
+----+-------------+---------+------+
If you can guarantee that there will only be one session per user, then it would be pretty safe to do. If a user can have more than one session, then whether you do this in a trigger or in the application code, you will need to take an exclusive table lock to make sure that your session is the only one to get that next number.
But don't go for a table per user. That would make your SQL really ugly and prevent any sort of SQL plan sharing.
You may be better served by using a timestamp instead of a serial number.
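The application-code variant with an exclusive lock might be sketched as follows. This uses Python's sqlite3 module instead of JavaDB (an assumption made for runnability), where BEGIN EXCLUSIVE locks the whole database and so plays the role of the exclusive table lock. The table name items and the function insert_for_user are hypothetical.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# BEGIN/COMMIT statements below are issued explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("""
    CREATE TABLE items (
        ID INTEGER PRIMARY KEY AUTOINCREMENT,
        ID_PER_USER INTEGER NOT NULL,
        USER_ID INTEGER NOT NULL,
        DATA INTEGER,
        UNIQUE (USER_ID, ID_PER_USER)
    )
""")

def insert_for_user(user_id, data):
    """Insert a row, assigning the next ID_PER_USER for this user."""
    # Hold an exclusive lock across the read-then-insert sequence so that
    # no other session can claim the same number in between.
    conn.execute("BEGIN EXCLUSIVE")
    try:
        next_id = conn.execute(
            "SELECT COALESCE(MAX(ID_PER_USER), 0) + 1 FROM items WHERE USER_ID = ?",
            (user_id,),
        ).fetchone()[0]
        conn.execute(
            "INSERT INTO items (ID_PER_USER, USER_ID, DATA) VALUES (?, ?, ?)",
            (next_id, user_id, data),
        )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return next_id

# Same insertion order as the example table in the question.
for user_id, data in [(2, 3454), (2, 6567), (3, 6788), (2, 1133),
                      (2, 4534), (3, 4366), (3, 7887)]:
    insert_for_user(user_id, data)

rows_user_3 = conn.execute(
    "SELECT ID, ID_PER_USER, USER_ID, DATA FROM items WHERE USER_ID = 3 ORDER BY ID"
).fetchall()
print(rows_user_3)
```

The UNIQUE constraint on (USER_ID, ID_PER_USER) is an extra safety net: if two sessions ever did compute the same number, the second insert would fail instead of silently creating a duplicate.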
