PostgreSQL - Find overlapping time ranges for different users in the same session and present them as pairs

I have a table that records the sessions players have taken part in during group music play (with music instruments).
When a user joins a session and then leaves, one row is created. If they join even the same session twice, two rows are created.
Table: music_sessions_user_history
| Column | Type | Default |
| --- | --- | --- |
| id | character varying(64) | uuid_generate_v4() |
| user_id | user_id | |
| created_at | timestamp without time zone | now() |
| session_removed_at | timestamp without time zone | |
| max_concurrent_connections | integer | |
| music_session_id | character varying(64) | |
Each row is basically the span of time a user was in a given session, so you can think of it as a timerange or tsrange in PG. max_concurrent_connections is a count of the number of users who were in the session at once.
So the query, at its heart, needs to find overlapping time ranges for different users in the same session, and then gather them up as pairs that played together.
The query needs to do this: it should report each user that played in a music session with others, and who those other users were.
So for example, if a userA played with userB, and that's the only data in the database, then two rows would be returned like:
| User | Other users in the session |
| --- | --- |
|userA | [userB] |
|userB | [userA] |
But if userA played with both userB and userC, then three rows would be returned:
| User | Other users in the session |
| --- | --- |
|userA | [userB, userC]|
|userB | [userA, userC]|
|userC | [userA, userB]|
Any help constructing this query is much appreciated.
Update:
I am able to get overlapping records using this query.
select m1.user_id, m1.created_at, m1.session_removed_at,
       m1.max_concurrent_connections, m1.music_session_id
from music_sessions_user_history m1
where exists (
    select 1
    from music_sessions_user_history m2
    where tsrange(m2.created_at, m2.session_removed_at, '[]') && tsrange(m1.created_at, m1.session_removed_at, '[]')
      and m2.music_session_id = m1.music_session_id
      and m2.id <> m1.id
);
Need to find a way to convert these results into pairs.

One approach: create a cursor and, for each fetched record, determine which other records intersect using a BETWEEN on the start and end times; append the intersecting results to a temporary table, then select the results from the temporary table.
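Alternatively, the pairing can be done in a single set-based query, with no cursor or temporary table. A minimal sketch, assuming the table above: self-join rows of the same session whose ranges overlap, then aggregate the distinct partners per user.

```sql
-- Pair each row with every overlapping row from a different user in the
-- same session, then collect the partners per user.
select m1.user_id                     as "User",
       array_agg(distinct m2.user_id) as "Other users in the session"
from music_sessions_user_history m1
join music_sessions_user_history m2
  on  m2.music_session_id = m1.music_session_id
  and m2.user_id <> m1.user_id
  and tsrange(m1.created_at, m1.session_removed_at, '[]')
   && tsrange(m2.created_at, m2.session_removed_at, '[]')
group by m1.user_id;
```

For the userA/userB/userC example this returns the three rows shown above; users who never overlapped with anyone simply don't appear in the output.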

Related

Using tables to categorise resources with

I'm trying to design a database that allows filtering according to whether a specific resource fits certain categories. I've gotten to the point where I can input data the way it should be filled out, but I'm not sure how I should pull it out again.
The main resource table looks like this:
Table1 - resources
| Field | Type |
| --- | --- |
| resourceID | AutoNum |
| title | short text |
| author | short text |
| publish date | date |
| type | short text |
Table2 - Department Categories
| Field | Type |
| --- | --- |
| ID | AutoNum |
| 1 | Yes/No |
| 2 | Yes/No |
| fID | Number |
Table3 - Categories
| Field | Type |
| --- | --- |
| ID | AutoNum |
| cat | Yes/No |
| dog | Yes/No |
| bird | Yes/No |
| fID | Number |
I have built a form where you can fill in items against the resource ID and, at the same time, check off the Yes/No boxes in tables 2 & 3.
I'm copying the primary key ID from table 1 into tables 2 & 3, with referential integrity set to cascade deletes and updates, which I think is the right way to do this.
Currently, I've learnt that I can implement a search function for the columns in table 1, and this seems to work fine. However, I am stuck on applying the relevant columns in tables 2 and 3 as filters.
Apply search >
[X] - Cats
should only return records from table 1 where the relevant column in table 3 has a tick in the Yes/No box.
I hope I have explained this properly; I'm very new to Access and databases, so if you need clarification, just ask.
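For the filter itself, a first sketch in Access SQL might look like this (assuming the table and field names above; a ticked Yes/No field compares against True):

```sql
SELECT r.*
FROM resources AS r
INNER JOIN Categories AS c ON c.fID = r.resourceID
WHERE c.cat = True;
```

Each additional checked box would add another condition to the WHERE clause.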

How best to Calculate Nth subscription renewal?

So I have a transaction table (postgres) that inserts a new row whenever a user renews their subscription for our service. The table subscription looks like this:
+--------+--------+------------+
| userId | prodId | renew_date |
+--------+--------+------------+
| 1 | 1 | 2018-05-01 |
| 1 | 1 | 2018-06-01 |
| 1 | 1 | 2018-07-01 |
| 2 | 3 | 2017-04-16 |
| 2 | 3 | 2017-05-16 |
+--------+--------+------------+
If the analysts want to figure out the Nth renewal or the latest renewal for a particular user or product, I have two solutions for them:
1.) During my ETL process, I truncate the data warehouse target table and re-populate it with:
select *
     , row_number() over (partition by userId, prodId order by renew_date asc) as nth_renewal
from subscription
I can't think of a way to add 1 to the previous renewal if I were to do incremental updates - and what if this is the customer's first renewal? (A possible incremental approach is sketched after option 2.)
2.) I just copy the exact OLTP table over to the data warehouse and do incremental updates every day. This way, I let the analysts calculate the nth renewal themselves. (Also, as a follow-up question: is it ever OK to have a duplicate copy of a transactional table in my data warehouse?)
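For option 1, the truncate-and-reload could be avoided by deriving the next ordinal from what is already loaded. A sketch, assuming a hypothetical warehouse table subscription_fact (the same columns plus nth_renewal) and a staging table staging_renewals holding only the new rows; the COALESCE covers a customer's first renewal:

```sql
insert into subscription_fact (userId, prodId, renew_date, nth_renewal)
select s.userId, s.prodId, s.renew_date,
       coalesce(f.max_nth, 0)                 -- 0 when there are no prior renewals
         + row_number() over (partition by s.userId, s.prodId
                              order by s.renew_date) as nth_renewal
from staging_renewals s
left join (
    -- highest ordinal already in the warehouse per user/product
    select userId, prodId, max(nth_renewal) as max_nth
    from subscription_fact
    group by userId, prodId
) f on f.userId = s.userId and f.prodId = s.prodId;
```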

duplicate amount column for payment transaction design

There are a few payment methods: credit/debit card, cash, bitcoin.
This is my payment transaction table:
Transaction:
| ID | AMOUNT | METHOD |
| --- | --- | --- |
| 1 | 80 | credit |
| 2 | 100 | cash |
Transaction_credit:
| ID | AMOUNT | TYPE | TRANSACTION_ID |
| --- | --- | --- | --- |
| 1 | 80 | sale | 1 |
| 2 | -80 | reversal | 1 |
Transaction_cash:
| ID | AMOUNT | TYPE | TRANSACTION_ID |
| --- | --- | --- | --- |
| 2 | 100 | payment | 2 |
| 2 | -100 | refund | 2 |
Do you think it is a good idea to have an amount in the card, cash, and bitcoin sub-tables?
How can I solve the duplicate amount in the sub-tables?
I think your database design needs some improvements.
Firstly: the Transaction entity (table) in an accounting system holds all money transactions. If a sale is reversed, you should create a new Transaction row, and likewise if a payment is refunded.
Secondly: the details of each transaction should be saved in second-level entities (tables), as you have designed correctly. Transaction types (e.g. card, cash, bitcoin) have many different attributes, so putting all types in one entity creates design traps such as columns that are NULL for most rows.
Thirdly: if you want a complete accounting system that supports all accounting functions (like generating a balance sheet), you will need to add many other entities.
But in any case, you should hold Amount in Transaction. Finding the amount in the sub-tables is difficult when you want to run queries based on the overall amount per transaction.
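As a rough illustration of that advice (a sketch only; the PostgreSQL syntax, names, and types here are assumptions): Amount lives on the parent row, and each sub-table carries only its method-specific attributes.

```sql
create table transaction (
    id     bigserial primary key,
    amount numeric(12,2) not null,   -- single source of truth for the amount
    method text not null
        check (method in ('credit', 'cash', 'bitcoin'))
);

create table transaction_credit (
    id             bigserial primary key,
    transaction_id bigint not null references transaction(id),
    type           text not null check (type in ('sale', 'reversal')),
    card_last4     char(4)           -- hypothetical method-specific attribute
);
```

A reversal or refund then becomes a new transaction row with its own amount, rather than a negative detail row duplicated under the original.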

SQL Server - Multiple Identity Ranges in the Same Column

Yesterday, I was asked the same question by two different people. Their tables have a field that groups records together, like a year or location. Within those groups, they want a unique ID that starts at 1 and increments sequentially. Obviously, you could search for MAX(ID), but if these applications have a lot of traffic, they'd need to lock the entire table to ensure the same ID wasn't returned multiple times. I thought about using sequences, but that would mean dynamically creating a sequence for each group.
Example 1:
Records created during the year should increment by one and then restart at 1 at the beginning of the next year.
| Year | ID |
|------|----|
| 2016 | 1 |
| 2016 | 2 |
| 2017 | 1 |
| 2017 | 2 |
| 2017 | 3 |
Example 2:
A company has many locations, and they want to generate a unique ID for each customer by combining the location ID with an incrementing ID.
| Site | ID |
|------|----|
| XYZ | 1 |
| ABC | 1 |
| XYZ | 2 |
| XYZ | 3 |
| DEF | 1 |
| ABC | 2 |
One trick that is often under-used is to create a clustered index on Site/ID or Year/ID - but change the sort order of the ID column to DESC rather than ASC.
That way, when you need to scan the clustered index to get the next ID value, it only has to check one row. I've used this on multi-billion-record tables and it runs quite quickly. You can get even better performance by partitioning the table by Site or Year; then you also get the benefit of partition elimination when you run your MAX(ID) queries.
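A sketch of the idea in T-SQL, using a hypothetical dbo.Orders table; the UPDLOCK/HOLDLOCK hints stop two concurrent sessions from claiming the same next ID:

```sql
create table dbo.Orders (
    [Year] int not null,
    ID     int not null
);
-- ID descending: the MAX(ID) for a given year is the first row in the index.
create unique clustered index CIX_Orders on dbo.Orders ([Year] asc, ID desc);

begin transaction;

declare @next int;
select @next = coalesce(max(ID), 0) + 1
from dbo.Orders with (updlock, holdlock)
where [Year] = 2017;

insert into dbo.Orders ([Year], ID) values (2017, @next);

commit transaction;
```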

Reducing amount of http requests by grouping queries

So I have a bunch of chatty HTTP requests in my Angular 1 app which are bottlenecking many of the other requests.
Imagine I have a list of Users from a totally different data source and I make calls to 5 different tables such as:
user.signup
+-----+------------+
| uid | date |
+-----+------------+
| 1 | 2016-12-13 |
| 2 | 2016-12-01 |
+-----+------------+
user.favourite_color
+-----+-------+
| uid | color |
+-----+-------+
| 1 | red |
| 5 | blue |
| 7 | green |
+-----+-------+
user.location
+-----+-----------+
| uid | location |
+-----+-----------+
| 2 | uk |
| 3 | france |
| 9 | greenland |
+-----+-----------+
The reason they are in different tables is that the fields are optional.
The way I see it I have 3 options:
Put them in 1 table
So I could just group them all in 1 table and have a bunch of null columns but that just doesn't sit right with me in terms of DB design.
+-----+------------+-----------+-------+
| uid | date | location | color |
+-----+------------+-----------+-------+
| 1 | 2016-12-13 | null | red |
| 2 | 2016-12-01 | uk | null |
| 3 | null | greenland | null |
| 5 | null | null | blue |
+-----+------------+-----------+-------+
Join them all with 1 request
So I could just have one query that joins all these tables, but the way I see it they would have to be full joins, with the expectation that some uids wouldn't exist in some tables, e.g.
+------+------------+-------+-----------+-------+-------+
| uid | date | l_uid | location | c_uid | color |
+------+------------+-------+-----------+-------+-------+
| 1 | 2016-12-13 | null | null | 1 | red |
| 2 | 2016-12-01 | 2 | uk | null | null |
| null | null | 3 | greenland | null | null |
| null | null | null | null | 5 | blue |
+------+------------+-------+-----------+-------+-------+
which is probably even worse!
Change the way the requests are made?
Maybe make some clever changes how the requests are made:
function activate() {
    $q.all([requestSignupDate(), requestFaveColor(), requestLocation(), ....])
        .then(function (data) {
            // do a bunch of stuff with the data
        });
}
which I want to change to:
function activate() {
    requestUserData();
}
Any suggestions?
This is a typical ORM problem: modelled this way, the database does not provide a good means of storing the user as a single entity.
The entity's properties are spread out over multiple tables, and because they are optional, you are taxed with left joins against multiple tables.
So you essentially have to solve that problem first. You have several options (or non-options, without knowing your requirements):
Put them in one table
If you can use nullable columns and refactor your application, this is preferable. I see that your other tables have just one or two more fields each. Heavily normalized tables save some space, but with no data repetition the other normalization benefits are moot.
Join them all with 1 request
Only if your query stays performant. Can you use a left join (sketched below)?
You would do this if the option above is difficult. Use it only as a quick fix.
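A sketch of that single request (the users table here is hypothetical, standing in for the master list that comes from the other data source, and the per-attribute table names are flattened for illustration):

```sql
select u.uid, s.date, l.location, c.color
from users u
left join user_signup          s on s.uid = u.uid
left join user_location        l on l.uid = u.uid
left join user_favourite_color c on c.uid = u.uid;
```

Users missing from an optional table simply come back with NULL in that column, which is the same shape of result as the single-table option.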
Other options to solve the database problem
Use server-side caching if feasible.
If feasible, use a different NoSQL database (e.g. MongoDB).
Change the way the requests are made?
Do you really need all the properties upfront? The web is asynchronous, so why not keep things async? Use $q.all only if you need all the properties at once. For example, the user may not even navigate or scroll to a certain part of the page, so why make those queries in the first place?
Along with this, you can cluster your server side and the database so that these queries fall on multiple machines and the load gets distributed; you may get some items retrieved in parallel.
If these columns and tables are all you have, i.e. the supplemental tables have only a few properties each, I would go with the "put them in 1 table" option.
Why doesn't using nullable fields sit right with you? NULL exists because it's useful, and a profile table with optional values for a fixed set of fields is practically the textbook case for invoking it. If fields can be dynamically redefined (e.g. swapping out "favorite color" for "favorite food"), that's another story, but that's not a requirement in what you've described.
