I have the following staging table and a destination table with the same data:
ID | Name | Job | Hash
1 | A | IT | XYZ1
2 | B | Driver | XYZ2
The staging table gets truncated each time and new data gets inserted. Sometimes a person can get a second job. In that case, we have 2 records with ID 2 and Name B, but with a different `Job` and `Hash` in the staging table.
ID | Name | Job | Hash
1 | A | IT | XYZ1
2 | B | Driver | XYZ2
2 | B | IT | XYY4
If this happens, I need to insert all records with ID 2 into the destination table. I already have a Lookup (LKP) that checks for (un-)matching IDs, but how can I "tell" SSIS to take ALL records from the staging table based on the IDs I get from the no match output?
You tell SSIS by linking the no match output of the Lookup to the destination. This assumes you have already set 'Redirect rows to no match output' on the General page of the Lookup, and that the Lookup checks for a matching ID (I'm not sure how you check for non-matching ones). This way, the Lookup sends all rows that did not match (by ID) to the destination.
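If the Lookup turns out to be awkward for this, the same selection can be expressed as a set-based query, for example in an Execute SQL Task or as the source query of the data flow. Below is a minimal sketch using Python's `sqlite3` as a stand-in database so it can be run anywhere. It assumes a row counts as "unmatched" when its `ID` + `Hash` combination is missing from the destination; that matching rule is my assumption, not something stated in the question.

```python
import sqlite3

# In-memory stand-in for the staging and destination tables from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE staging     (ID INTEGER, Name TEXT, Job TEXT, Hash TEXT);
CREATE TABLE destination (ID INTEGER, Name TEXT, Job TEXT, Hash TEXT);
INSERT INTO destination VALUES (1, 'A', 'IT', 'XYZ1'), (2, 'B', 'Driver', 'XYZ2');
INSERT INTO staging VALUES
    (1, 'A', 'IT', 'XYZ1'),
    (2, 'B', 'Driver', 'XYZ2'),
    (2, 'B', 'IT', 'XYY4');
""")

# Inner query: IDs that have at least one staging row with no ID+Hash match
# in the destination (assumed matching rule).
# Outer query: ALL staging rows for those IDs, matched or not.
rows_to_load = conn.execute("""
SELECT s.ID, s.Name, s.Job, s.Hash
FROM staging AS s
WHERE s.ID IN (
    SELECT s2.ID
    FROM staging AS s2
    LEFT JOIN destination AS d
           ON d.ID = s2.ID AND d.Hash = s2.Hash
    WHERE d.ID IS NULL
)
ORDER BY s.ID, s.Hash
""").fetchall()

print(rows_to_load)
```

With the sample data this returns both rows for ID 2 and nothing for ID 1. Note that it only selects the rows; whether the existing ID 2 row in the destination has to be deleted or updated first is a separate decision.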
I have a table which holds records of the sessions that players have played in a group music play (musical instruments).
If a user joins a session and leaves, one row is created. If they join the same session twice, two rows are created.
Table: music_sessions_user_history
| Column | Type | Default |
| --- | --- | --- |
| id | character varying(64) | uuid_generate_v4() |
| user_id | user_id | |
| created_at | timestamp without time zone | now() |
| session_removed_at | timestamp without time zone | |
| max_concurrent_connections | integer | |
| music_session_id | character varying(64) | |
This table basically records the amount of time a user was in a given session, so you can think of each row as a time range (a tsrange in PG). The max_concurrent_connections column is a count of the number of users who were in the session at once.
So the query, at its heart, needs to find overlapping time ranges for different users in the same session, and then count each overlap as a pair that played together.
The query needs to report each user that played in a music session with others, and who those other users were.
So for example, if a userA played with userB, and that's the only data in the database, then two rows would be returned like:
| User | Other users in the session |
| --- | --- |
|userA | [userB] |
|userB | [userA] |
But if userA played with both userB and UserC, then three rows would be like:
| User | Other users in the session |
| --- | --- |
|userA | [userB, userC]|
|userB | [userA, userC]|
|userC | [userA, userB]|
Any help with constructing this query is much appreciated.
update:
I am able to get overlapping records using this query.
select m1.user_id, m1.created_at, m1.session_removed_at, m1.max_concurrent_connections, m1.music_session_id
from music_sessions_user_history m1
where exists (select 1
from music_sessions_user_history m2
where tsrange(m2.created_at, m2.session_removed_at, '[]') && tsrange(m1.created_at, m1.session_removed_at, '[]')
and m2.music_session_id = m1.music_session_id
and m2.id <> m1.id);
I need to find a way to convert these results into pairs.
Create a cursor and, for each fetched record, determine which other records intersect it by checking whether their start and end times overlap its start and end time.
Append the intersecting results to a temporary table.
Select the results from the temporary table.
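As an alternative to the cursor, the pairs can probably be produced with a plain self-join. The sketch below runs against SQLite through Python's `sqlite3` so it is self-contained. Since SQLite has no `tsrange`, the closed-range overlap is written as two comparisons on ISO-formatted timestamps, and the sample rows are made up for illustration. Rows where `session_removed_at` is still NULL are not handled here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE music_sessions_user_history (
    id TEXT PRIMARY KEY,
    user_id TEXT,
    created_at TEXT,            -- ISO timestamps compare correctly as text
    session_removed_at TEXT,
    music_session_id TEXT
);
-- Hypothetical sample data: A, B and C overlap in session s1;
-- A rejoins s1 later on their own; D plays alone in s2.
INSERT INTO music_sessions_user_history VALUES
    ('1', 'userA', '2020-01-01 10:00', '2020-01-01 11:00', 's1'),
    ('2', 'userB', '2020-01-01 10:30', '2020-01-01 11:30', 's1'),
    ('3', 'userC', '2020-01-01 10:45', '2020-01-01 10:50', 's1'),
    ('4', 'userA', '2020-01-01 12:00', '2020-01-01 12:30', 's1'),
    ('5', 'userD', '2020-01-01 10:00', '2020-01-01 11:00', 's2');
""")

# Self-join: same session, different user, and the two closed ranges overlap
# (each one starts no later than the other one ends).
pairs = conn.execute("""
SELECT DISTINCT m1.user_id, m2.user_id
FROM music_sessions_user_history AS m1
JOIN music_sessions_user_history AS m2
  ON m2.music_session_id = m1.music_session_id
 AND m2.user_id <> m1.user_id
 AND m2.created_at <= m1.session_removed_at
 AND m1.created_at <= m2.session_removed_at
ORDER BY 1, 2
""").fetchall()

# Collapse the pairs into one list of co-players per user.
played_with = {}
for user, other in pairs:
    played_with.setdefault(user, []).append(other)

print(played_with)
```

In PostgreSQL the same join should work with the `tsrange(...) && tsrange(...)` condition from the question, and the final grouping could be done in SQL with `array_agg(DISTINCT m2.user_id)` and `GROUP BY m1.user_id` instead of in application code.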
I have a need to manage a dataset for multiple customers - each customer manages a small table to update procedure volumes for the next five years. The table is structured like so:
+-------------+--------+--------+--------+--------+--------+
| | Year 1 | Year 2 | Year 3 | Year 4 | Year 5 |
+-------------+--------+--------+--------+--------+--------+
| Procedure A | 5 | 10 | 14 | 12 | 21 |
+-------------+--------+--------+--------+--------+--------+
| Procedure B | 23 | 23 | 2 | 3 | 4 |
+-------------+--------+--------+--------+--------+--------+
| Procedure C | 5 | 6 | 7 | 8 | 12 |
+-------------+--------+--------+--------+--------+--------+
The values in this table will be managed by each customer via MS PowerApps.
This same structure exists for every single customer. What is the best way to put all of these in one dataset?
Should I just add a column for CUSTOMER ID and just put all the data in there?
The process:
Utilizing PowerApps, a new customer deal will be generated and a row will be added for them in the SQL DB in a customer records table.
Simultaneously, the blank template of the above table should be generated for them.
Now, the customer can interface with this SQL table within PowerApps and add their respective procedure volumes.
The question isn't explained well but:
I would assume all of the customer specific data has at least one column that is the same. For instance CustomerName. You could create your own table with CustomerId, CustomerName, (any other fields you would like to see). If there isn't a concept of CustomerId on the customer's tables, you would have to join them on CustomerName. You could populate your own CustomerId for the new table.
I would be happy to help more if you could clarify the question and show a few examples.
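For what it's worth, here is one way the "add a CUSTOMER ID column" idea could look. This is only a sketch in SQLite via Python's `sqlite3`; the table names, column names and the `add_customer` helper are all hypothetical. It stores one row per customer, procedure and year (a "long" layout) rather than five year columns, and creates the blank template rows when the customer is added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (
    customer_id   INTEGER PRIMARY KEY,
    customer_name TEXT NOT NULL
);
-- One row per customer / procedure / year; volume stays NULL until filled in.
CREATE TABLE procedure_volumes (
    customer_id    INTEGER NOT NULL REFERENCES customers(customer_id),
    procedure_name TEXT    NOT NULL,
    year_number    INTEGER NOT NULL CHECK (year_number BETWEEN 1 AND 5),
    volume         INTEGER,
    PRIMARY KEY (customer_id, procedure_name, year_number)
);
""")

PROCEDURES = ["Procedure A", "Procedure B", "Procedure C"]

def add_customer(conn, name):
    """Insert a customer and generate their blank 3 x 5 template."""
    cur = conn.execute("INSERT INTO customers (customer_name) VALUES (?)", (name,))
    customer_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO procedure_volumes (customer_id, procedure_name, year_number) "
        "VALUES (?, ?, ?)",
        [(customer_id, p, y) for p in PROCEDURES for y in range(1, 6)],
    )
    return customer_id

cid = add_customer(conn, "Example Customer")

# The customer later fills in a value, e.g. Procedure A, Year 1 = 5.
conn.execute(
    "UPDATE procedure_volumes SET volume = 5 "
    "WHERE customer_id = ? AND procedure_name = 'Procedure A' AND year_number = 1",
    (cid,),
)

template_rows = conn.execute(
    "SELECT COUNT(*) FROM procedure_volumes WHERE customer_id = ?", (cid,)
).fetchone()[0]
print(template_rows)
```

Keeping the five year columns and simply adding a customer ID column to the existing layout would also work; the long layout mainly makes it easier to add years or procedures later.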
So I have a transaction table (postgres) that inserts a new row whenever a user renews their subscription for our service. The table subscription looks like this:
+--------+--------+------------+
| userId | prodId | renew_date |
+--------+--------+------------+
| 1 | 1 | 2018-05-01 |
| 1 | 1 | 2018-06-01 |
| 1 | 1 | 2018-07-01 |
| 2 | 3 | 2017-04-16 |
| 2 | 3 | 2017-05-16 |
+--------+--------+------------+
If the analysts want to figure out the Nth renewal or latest renewal for a particular user or product, I have two solutions to give them that:
1.) During my ETL process, I truncate the data warehouse target table and re-populate it with:
select *
, row_number() over (partition by userId, prodId order by renew_date asc) as nth_renewal
from subscription
I can't think of a way to add 1 to the previous renewal number if I were to do incremental updates. And what if this is the customer's first renewal?
2.) I just copy the exact OLTP table over to the data warehouse and do incremental updates every day. This way, I let the analysts calculate the nth renewal themselves. (also as a follow up question: is it ever OK to have a duplicate copy of a transactional table in my data warehouse?)
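One possible way to keep `nth_renewal` while loading incrementally is to number each new row by counting the renewals for the same user and product up to its date, which naturally yields 1 for a first renewal. The sketch below uses SQLite via Python's `sqlite3` as a stand-in; `dw_subscription` is a hypothetical target table name. It assumes at most one renewal per user, product and date, and that no earlier-dated rows arrive after later ones have been loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE subscription (userId INTEGER, prodId INTEGER, renew_date TEXT);
CREATE TABLE dw_subscription (
    userId INTEGER, prodId INTEGER, renew_date TEXT, nth_renewal INTEGER,
    PRIMARY KEY (userId, prodId, renew_date)
);
""")

# Load only rows the warehouse does not have yet. nth_renewal is the number
# of source renewals for the same user/product dated on or before this row.
INCREMENTAL_LOAD = """
INSERT INTO dw_subscription (userId, prodId, renew_date, nth_renewal)
SELECT s.userId, s.prodId, s.renew_date,
       (SELECT COUNT(*)
          FROM subscription AS p
         WHERE p.userId = s.userId
           AND p.prodId = s.prodId
           AND p.renew_date <= s.renew_date)
FROM subscription AS s
WHERE NOT EXISTS (
    SELECT 1 FROM dw_subscription AS d
     WHERE d.userId = s.userId
       AND d.prodId = s.prodId
       AND d.renew_date = s.renew_date)
"""

# Day 1: initial rows arrive and are loaded.
conn.executemany("INSERT INTO subscription VALUES (?, ?, ?)",
                 [(1, 1, "2018-05-01"), (1, 1, "2018-06-01"), (2, 3, "2017-04-16")])
conn.execute(INCREMENTAL_LOAD)

# Day 2: two further renewals plus a brand-new customer's first renewal.
conn.executemany("INSERT INTO subscription VALUES (?, ?, ?)",
                 [(1, 1, "2018-07-01"), (2, 3, "2017-05-16"), (3, 2, "2018-01-01")])
conn.execute(INCREMENTAL_LOAD)

dw_rows = conn.execute(
    "SELECT userId, prodId, renew_date, nth_renewal FROM dw_subscription "
    "ORDER BY userId, prodId, renew_date"
).fetchall()
print(dw_rows)
```

The trade-off is that the count runs against the full source history on every load; if the source is purged, the count would have to come from the warehouse table instead.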
Yesterday, I was asked the same question by two different people. Their tables have a field that groups records together, like a year or location. Within those groups, they want to have a unique ID that starts at 1 and increments up sequentially. Obviously, you could search for MAX(ID), but if these applications have a lot of traffic, they'd need to lock the entire table to ensure the same ID wasn't returned multiple times. I thought about using sequences but that would mean dynamically creating a sequence for each group.
Example 1:
Records created during the year should increment by one and then restart at 1 at the beginning of the next year.
| Year | ID |
|------|----|
| 2016 | 1 |
| 2016 | 2 |
| 2017 | 1 |
| 2017 | 2 |
| 2017 | 3 |
Example 2:
A company has many locations and they want to generate a unique ID for each customer, combining the location ID with an incrementing ID.
| Site | ID |
|------|----|
| XYZ | 1 |
| ABC | 1 |
| XYZ | 2 |
| XYZ | 3 |
| DEF | 1 |
| ABC | 2 |
One trick that is often under-used is to create a clustered index on Site / ID or Year / ID, but change the sort order of the ID column to DESC rather than ASC.
This way, when you need to scan the clustered index to get the next ID value, it only needs to check 1 row in the index. I've used this on multi-billion-record tables and it runs quite quickly. You can get even better performance by partitioning the table by Site or Year; then you get the added benefit of partition elimination when you run your MAX(ID) queries.
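To make the MAX(ID) approach concrete, here is a small sketch. It uses SQLite through Python's `sqlite3` purely as a stand-in, so the index below is an ordinary unique index with the ID column sorted descending, not a SQL Server clustered index. The `customers` table and the `next_id` helper are hypothetical names. The read of MAX(id) and the insert happen inside one write transaction, and the unique index acts as a second guard against duplicate IDs.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode so that the
# BEGIN / COMMIT statements below are controlled explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.executescript("""
CREATE TABLE customers (site TEXT NOT NULL, id INTEGER NOT NULL);
-- Highest id per site comes first in the index, so MAX(id) is a short seek.
CREATE UNIQUE INDEX ix_customers_site_id ON customers (site, id DESC);
""")

def next_id(conn, site):
    """Allocate the next per-site id and insert the row in one transaction."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading MAX
    try:
        (new_id,) = conn.execute(
            "SELECT COALESCE(MAX(id), 0) + 1 FROM customers WHERE site = ?",
            (site,),
        ).fetchone()
        conn.execute("INSERT INTO customers (site, id) VALUES (?, ?)", (site, new_id))
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return new_id

# Same arrival order as Example 2 in the question.
assigned = [(s, next_id(conn, s)) for s in ["XYZ", "ABC", "XYZ", "XYZ", "DEF", "ABC"]]
print(assigned)
```

On a server database the locking would be done differently (the details depend on the engine and isolation level), but the shape is the same: find the current maximum for the group through the index, add 1, and insert within the same transaction.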
I'm completely new to databases and am having some difficulty setting up relationships between 3 tables in MS Access 2013.
The idea is that I have a table with account info, a table with calls related to these accounts, and one table with all the possible call responses. I tried different combinations between them, but nothing works.
1st table - Accounts : AccountID(PK) | AccountName | Language | Country | Email
2nd table - Calls : CallID(PK) | Account | Response | Comment | Date
3rd table - Responses: ResponseID(PK) | Response
When you have a table, it usually has a Primary Key field that is the main index of the table. In order for you to connect it with other tables, you usually do that by setting Foreign Key on the other table.
Let's say you have your Accounts table, and it has AccountID field as Primary Key. This field is unique (meaning no duplicate value for this field).
Now, you have the other table called Calls and you have a Foreign Key field called AccountID there, which points to the Accounts table.
Essentially you have Accounts with the following data:
AccountID| AccountName | Language | Country | Email
1 | FirstName | EN | US | some#email.com
2 | SecondName | EN | US | some#email.com
Now you have the other table Calls with Many calls
CallID(PK) | AccountID(FK) | ResponseID(FK) | Comment | Date
1 | 1 | 1 | a comment | 26/10
2 | 1 | 1 | a comment | 26/10
3 | 2 | 3 | a comment | 26/10
4 | 2 | 3 | a comment | 26/10
You can see the One to Many relationship: One accountID (in my example AccountID=1) to Many Calls (in my example 2 rows with AccountID=1 as foreign keys, rows 1 & 2) and AccountID=2 has also 2 rows of Calls (rows 3 and 4)
Same goes for the Responses table
Using this table structure:
Accounts : AccountID(PK) | AccountName | Language | Country | Email
Calls : CallID(PK) | AccountID(FK) | ResponseID(FK) | Comment | Date
Responses: ResponseID(PK) | Response
Accounts.AccountID is referenced by Calls.AccountID. 1:n – many calls for one account possible, but each call concerns just one account.
Responses.ResponseID is referenced by Calls.ResponseID. 1:n – many calls can get the same response from the prepared set, but each call gets exactly one of them.
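The same structure can be written down as table definitions with foreign keys. The sketch below uses SQLite through Python's `sqlite3` only so that it is runnable as-is; in Access you would define these relationships in the Relationships window rather than in code. The response texts are made-up placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE Accounts (
    AccountID   INTEGER PRIMARY KEY,
    AccountName TEXT, Language TEXT, Country TEXT, Email TEXT
);
CREATE TABLE Responses (
    ResponseID INTEGER PRIMARY KEY,
    Response   TEXT
);
-- Each call points at exactly one account and exactly one response.
CREATE TABLE Calls (
    CallID     INTEGER PRIMARY KEY,
    AccountID  INTEGER NOT NULL REFERENCES Accounts(AccountID),
    ResponseID INTEGER NOT NULL REFERENCES Responses(ResponseID),
    Comment    TEXT,
    Date       TEXT
);
INSERT INTO Accounts VALUES
    (1, 'FirstName',  'EN', 'US', 'some#email.com'),
    (2, 'SecondName', 'EN', 'US', 'some#email.com');
INSERT INTO Responses VALUES (1, 'Placeholder response 1'), (3, 'Placeholder response 3');
INSERT INTO Calls VALUES
    (1, 1, 1, 'a comment', '26/10'),
    (2, 1, 1, 'a comment', '26/10'),
    (3, 2, 3, 'a comment', '26/10'),
    (4, 2, 3, 'a comment', '26/10');
""")

# A call that references a non-existent account is rejected by the foreign key.
try:
    conn.execute(
        "INSERT INTO Calls (AccountID, ResponseID, Comment, Date) "
        "VALUES (99, 1, 'orphan call', '26/10')"
    )
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# One account, many calls.
calls_per_account = conn.execute(
    "SELECT AccountID, COUNT(*) FROM Calls GROUP BY AccountID ORDER BY AccountID"
).fetchall()
print(orphan_rejected, calls_per_account)
```

The rejected insert is what "Enforce Referential Integrity" in Access is meant to give you: a call cannot exist without a valid account and a valid response.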
To actually define the relationships in Access, open the Relationships window, then follow the detailed instructions here:
How to define relationships between tables in an Access database