How to aggregate number of notes sent to each user? - sql-server

Consider the following tables
group (obj_id here is user_id)
group_id obj_id role
--------------------------
100 1 A
100 2 root
100 3 B
100 4 C
notes
obj_id  ref_obj_id  note    note_id
-------------------------------------------
1       2                   10
1       3                   10
1       0           foobar  10
1       4                   20
1       2                   20
1       0           barbaz  20
2       0           caszes  30
2       1                   30
4       1                   70
4       0           taz     70
4       3                   70
Note: a note in the system can be assigned to multiple users (for instance: an admin could write "sent warning to 2 users" and link it to 2 user_ids). The first user the note gets linked to is stored differently than the other linked users. The note itself is linked to the first linked user only. Whenever group.obj_id = notes.obj_id then ref_obj_id = 0 and note <> null
I need to make an overview of the notes per user. Normally I would do this by joining on group.obj_id = notes.obj_id, but here that goes wrong because ref_obj_id is 0 for the primary user (in which case I should join on notes.obj_id instead of notes.ref_obj_id).
There are 4 notes in this system (foobar, barbaz, caszes and taz).
The desired output is:
obj_id  user_is_primary  notes_primary  user_is_linked  notes_linked
-------------------------------------------------------------------
1       2                10;20          2               30;70
2       1                30             2               10;20
3       0                               2               10;70
4       1                70             1               20
How can I get to this aggregated result?
I hope that I was able to explain the situation clearly; perhaps it is my inexperience but I find the data model not the most straightforward.

Couldn't you simply put this in the ON clause of your join?
case when notes.ref_obj_id = 0 then notes.obj_id else notes.ref_obj_id end = group.obj_id
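As a rough sketch of how that join condition could feed the aggregated overview, the following loads the sample rows into an in-memory SQLite database through Python's sqlite3 module and combines the CASE expression with conditional aggregation. This is an assumption-laden illustration, not the asker's schema: the column types are guessed, and GROUP_CONCAT is SQLite's aggregate (on SQL Server, STRING_AGG in 2017 and later plays a similar role).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Sample data from the question; column types are assumptions.
conn.executescript("""
CREATE TABLE "group" (group_id INTEGER, obj_id INTEGER, role TEXT);
INSERT INTO "group" VALUES (100,1,'A'),(100,2,'root'),(100,3,'B'),(100,4,'C');
CREATE TABLE notes (obj_id INTEGER, ref_obj_id INTEGER, note TEXT, note_id INTEGER);
INSERT INTO notes VALUES
  (1,2,NULL,10),(1,3,NULL,10),(1,0,'foobar',10),
  (1,4,NULL,20),(1,2,NULL,20),(1,0,'barbaz',20),
  (2,0,'caszes',30),(2,1,NULL,30),
  (4,1,NULL,70),(4,0,'taz',70),(4,3,NULL,70);
""")

# The CASE in the ON clause picks the user a row belongs to:
# obj_id for the primary row (ref_obj_id = 0), ref_obj_id otherwise.
rows = conn.execute("""
SELECT g.obj_id,
       COUNT(CASE WHEN n.ref_obj_id = 0 THEN 1 END)                       AS user_is_primary,
       GROUP_CONCAT(CASE WHEN n.ref_obj_id = 0 THEN n.note_id END, ';')   AS notes_primary,
       COUNT(CASE WHEN n.ref_obj_id <> 0 THEN 1 END)                      AS user_is_linked,
       GROUP_CONCAT(CASE WHEN n.ref_obj_id <> 0 THEN n.note_id END, ';')  AS notes_linked
FROM "group" AS g
LEFT JOIN notes AS n
  ON CASE WHEN n.ref_obj_id = 0 THEN n.obj_id ELSE n.ref_obj_id END = g.obj_id
GROUP BY g.obj_id
ORDER BY g.obj_id
""").fetchall()

for row in rows:
    print(row)
```

The counts and note ids per user match the desired output in the question; the order of ids inside each concatenated list is not guaranteed by SQLite, and notes_primary comes back as NULL for user 3.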

Related

Stored procedure returning customers a user cannot support

I've created a stored procedure to allocate a customer to a user based on the number and type of customer requests versus the skills of the user. Below is an extract of part of the stored procedure - status = 0 means unallocated.
SELECT TOP 1
    gdd.customerReference
FROM
    customerRequests gdd
LEFT OUTER JOIN
    userSkills us ON us.requestTypeId = gdd.requestTypeId
                 AND us.userId = @pinUserId
LEFT OUTER JOIN
    requestAttributes dt ON dt.requestTypeId = gdd.requestTypeId
WHERE
    gdd.status = 0
GROUP BY
    gdd.requestDateTime, gdd.customerReference, gdd.requestId, gdd.requestTypeId
HAVING
    COUNT(*) = COUNT(us.userId)
Example data:
Customer Requests
requestId  requestTypeId  policyNumber  customerReference  requestDateTime   userId  status
1          3              Policy A      Customer 1         30/11/2015 10:13          0
2          4              Policy A      Customer 1         30/11/2015 10:33          0
3          11             Policy B      Customer 2         26/11/2015 15:26          0
4          17             Policy B      Customer 2         26/11/2015 15:27          0
5          1              Policy B      Customer 2         27/11/2015 10:05          0
Users skills:
skillId userId requestTypeId
1 user1 3
2 user1 17
3 user1 11
4 user1 1
5 user2 1
6 user2 3
7 user4 4
Request attributes:
requestTypeId description priority tolerance
1 Type A 200 90
2 Type B 999 999999
3 Type C 100 7
4 Type D 100 5
5 Type E 50 5
6 Type F 100 999999
7 Type G 999 999999
8 Type H 999 999999
9 Type I 999 999999
10 Type J 999 999999
11 Type K 100 999999
12 Type L 999 999999
13 Type M 999 999999
14 Type N 999 999999
15 Type O 100 5
16 Type P 100 10
17 Type Q 100 10
With TOP 1 in the select, the query correctly returns Customer 2 when I substitute user1 for @pinUserId. The problem is that if I take TOP 1 out of the query, I also see Customer 1 in the result set, even though user1 is not able to handle both of Customer 1's requests. If Customer 1's requests had come in before Customer 2's, user1 would have been incorrectly allocated Customer 1.
Can anyone suggest a solution to this problem?
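One possible direction, offered only as a sketch: because the GROUP BY includes requestId, each group is a single request, so the HAVING test passes for any individual request the user can handle. Grouping by customerReference alone makes the same test apply to all of a customer's open requests together. The example below tries this on the sample data using Python's sqlite3 module; the function name supportable_customers is hypothetical, the dates are rewritten in ISO form so they sort as text, and the requestAttributes join is left out because it does not affect the filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Sample data from the question (dates converted to ISO text; an assumption for sorting).
conn.executescript("""
CREATE TABLE customerRequests (requestId INTEGER, requestTypeId INTEGER,
    customerReference TEXT, requestDateTime TEXT, status INTEGER);
INSERT INTO customerRequests VALUES
  (1, 3,  'Customer 1', '2015-11-30 10:13', 0),
  (2, 4,  'Customer 1', '2015-11-30 10:33', 0),
  (3, 11, 'Customer 2', '2015-11-26 15:26', 0),
  (4, 17, 'Customer 2', '2015-11-26 15:27', 0),
  (5, 1,  'Customer 2', '2015-11-27 10:05', 0);
CREATE TABLE userSkills (skillId INTEGER, userId TEXT, requestTypeId INTEGER);
INSERT INTO userSkills VALUES
  (1,'user1',3),(2,'user1',17),(3,'user1',11),(4,'user1',1),
  (5,'user2',1),(6,'user2',3),(7,'user4',4);
""")

def supportable_customers(user_id):
    """Customers whose unallocated requests are ALL covered by the user's skills."""
    sql = """
    SELECT gdd.customerReference
    FROM customerRequests AS gdd
    LEFT JOIN userSkills AS us
           ON us.requestTypeId = gdd.requestTypeId
          AND us.userId = ?
    WHERE gdd.status = 0
    GROUP BY gdd.customerReference           -- one group per customer, not per request
    HAVING COUNT(*) = COUNT(us.userId)       -- every request found a matching skill
    ORDER BY MIN(gdd.requestDateTime)        -- oldest outstanding request first
    """
    return [r[0] for r in conn.execute(sql, (user_id,))]

print(supportable_customers("user1"))
print(supportable_customers("user2"))
```

With the sample data, user1 gets only Customer 2, and user2 gets nothing because neither customer's requests are fully covered. In the stored procedure, TOP 1 together with an ORDER BY on the earliest request time would then pick the customer to allocate.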

Comparisons across multiple rows in Stata (household dataset)

I'm working on a household dataset and my data looks like this:
input id id_family mother_id male
1 2 12 0
2 2 13 1
3 3 15 1
4 3 17 0
5 3 4 0
end
What I want to do is identify the mother in each family. A mother is a member of the family whose id is equal to one of the mother_id's of another family member. In the example above, for the family with id_family=3, individual 5 has mother_id=4, which makes individual 4 her mother.
I create a family size variable that tells me how many members there are per family. I also create a rank variable for each member within a family. For families of three, I then have the following piece of code that works:
bysort id_family: gen family_size=_N
bysort id_family: gen rank=_n
gen mother=.
bysort id_family: replace mother=1 if male==0 & rank==1 & family_size==3 & (id[_n]==id[_n+1] | id[_n]==id[_n+2])
bysort id_family: replace mother=1 if male==0 & rank==2 & family_size==3 & (id[_n]==id[_n-1] | id[_n]==id[_n+1])
bysort id_family: replace mother=1 if male==0 & rank==3 & family_size==3 & (id[_n]==id[_n-1] | id[_n]==id[_n-2])
What I get is:
id id_family mother_id male family_size rank mother
1 2 12 0 2 1 .
2 2 13 1 2 2 .
3 3 15 1 3 1 .
4 3 17 0 3 2 1
5 3 4 0 3 3 .
However, in my real data set, I have to get the mother for families of size 4 and higher (up to 9), which makes this procedure very inefficient (in the sense that there are too many row elements to compare "manually").
How would you obtain this in a cleaner way? Would you make use of permutations to index the rows? Or would you use a for-loop?
Here's an approach using merge.
// create sample data
clear
input id id_family mother_id male
1 2 12 0
2 2 13 1
3 3 15 1
4 3 17 0
5 3 4 0
end
save families, replace
clear
// do the job
use families
drop id male
rename mother_id id
sort id_family id
duplicates drop
list, clean abbreviate(10)
save mothers, replace
use families, clear
merge 1:1 id_family id using mothers, keep(master match)
generate byte is_mother = _merge==3
list, clean abbreviate(10)
The second list yields
id id_family mother_id male _merge is_mother
1. 1 2 12 0 master only (1) 0
2. 2 2 13 1 master only (1) 0
3. 3 3 15 1 master only (1) 0
4. 4 3 17 0 matched (3) 1
5. 5 3 4 0 master only (1) 0
where I retained _merge only for expositional purposes.
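For readers who want to check the logic outside Stata, the same matching idea can be sketched in a few lines of plain Python: collect every (id_family, mother_id) pair, then flag a person as a mother when their own (id_family, id) appears in that collection. The variable names here are illustrative and not part of the Stata answer.

```python
# Rows are (id, id_family, mother_id, male), copied from the sample data.
people = [
    (1, 2, 12, 0),
    (2, 2, 13, 1),
    (3, 3, 15, 1),
    (4, 3, 17, 0),
    (5, 3, 4, 0),
]

# Every (family, mother_id) pair that some family member points to;
# this plays the role of the "mothers" dataset in the merge.
mother_keys = {(fam, mother_id) for _, fam, mother_id, _ in people}

# A person is a mother if their own (family, id) is one of those pairs,
# which corresponds to _merge == 3 in the Stata version.
is_mother = {pid: int((fam, pid) in mother_keys) for pid, fam, _, _ in people}

print(is_mother)
```

Like the merge, this works for any family size because it never compares neighbouring rows.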

SQL Server query to add up column values across rows

I have a table ExamSubjects:
ExId  SubId  GroupId  SubOrder  NoQuestions
1     2      1        1         60
1     3      1        2         60
1     1      2        3         120
I want output in which NoQuestions is accumulated from one row to the next within each GroupId:
ExId  SubId  StartNo  EndNo
1     2      1        60
1     3      61       120
1     1      1        120
Is there any method other than using a loop and a temp table? Currently I am using a WHILE loop to generate the output.
SELECT ExId,
       SubId,
       ISNULL((SELECT SUM(NoQuestions)
               FROM ExamSubjects ex2
               WHERE ex2.GroupId = ex1.GroupId
                 AND ex2.SubOrder < ex1.SubOrder), 0) + 1 AS StartNo,
       ISNULL((SELECT SUM(NoQuestions)
               FROM ExamSubjects ex2
               WHERE ex2.GroupId = ex1.GroupId
                 AND ex2.SubOrder < ex1.SubOrder), 0) + NoQuestions AS EndNo
FROM ExamSubjects ex1
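An alternative to the correlated subqueries, sketched here under the assumption that windowed aggregates are available (SQL Server 2012 and later supports SUM() OVER with ORDER BY), is a running total per GroupId. The example runs the query in SQLite through Python's sqlite3 module so it can be tried without a server; the OVER clause syntax shown is common to both engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Sample rows from the question.
conn.executescript("""
CREATE TABLE ExamSubjects (ExId INTEGER, SubId INTEGER, GroupId INTEGER,
                           SubOrder INTEGER, NoQuestions INTEGER);
INSERT INTO ExamSubjects VALUES (1,2,1,1,60),(1,3,1,2,60),(1,1,2,3,120);
""")

# EndNo is the running total within the group; StartNo is that total
# minus the current row's own question count, plus one.
rows = conn.execute("""
SELECT ExId,
       SubId,
       SUM(NoQuestions) OVER (PARTITION BY GroupId ORDER BY SubOrder
                              ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
           - NoQuestions + 1 AS StartNo,
       SUM(NoQuestions) OVER (PARTITION BY GroupId ORDER BY SubOrder
                              ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
           AS EndNo
FROM ExamSubjects
ORDER BY GroupId, SubOrder
""").fetchall()

for row in rows:
    print(row)
```

On the sample rows this gives the same StartNo and EndNo values as the desired output, with the table read only once.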

Select rows where count() = n

I'm implementing search functionality where the results page should show, for each result, the main image and up to 3 more thumbnails.
Right now in the production version, it makes one select per ad to return the images from the database, which is terrible for performance, so I've changed it to a single query that does basically the following:
select * from AdImages order by IsMainImage desc, AdImageId
and returns something like:
AdImageId AdId IsMainImage FilePath
----------- ----------- ----------- ----------------------------------------
1 1 1 9c513f10-5480-4e41-89c6-074b36051999.jpg
5 2 1 f64f9c12-398e-445f-9724-baebe40930b1.jpg
6 4 1 8187d566-b296-4ab0-85e5-b9fc86f293b7.jpg
8 5 1 b8165008-09b3-4258-bf54-043195138344.jpg
10 6 1 86c636ed-f4ed-4f7e-8c7e-fc0b24faa956.jpg
11 7 1 4409a3fd-2bc0-4512-9850-6f5146193e50.jpg
13 8 1 b9b66c48-92b7-479a-a85d-dc6d26b03ebc.jpg
14 9 1 9f3f06ad-4fe1-43a5-8cce-3bb804bb10b7.jpg
16 10 1 016c30dc-5ee8-40d8-9d0f-398f444d7a7b.jpg
19 11 1 e5e56602-1af7-492b-8a8e-b61ac86b751b.jpg
2 1 0 02d44ce1-0de6-4e22-b4ef-043a72e9b5e8.jpg
3 1 0 8c4e19db-faff-44c2-9aab-6a96ab2a8e22.jpg
4 1 0 d8c2464a-277c-40fa-ab43-d2455e819e7e.jpg
7 4 0 d1430ae0-df51-43b7-acea-50d606eee5ba.jpg
9 5 0 b947ae4c-653d-4c27-9edd-567d977e1af3.jpg
12 7 0 3080c947-3769-4762-bb29-f1f9c5303ecd.jpg
15 9 0 d2543ce3-1e65-4a18-80d6-584de0025f1a.jpg
17 10 0 03b26d6a-4e0c-4393-9b5a-d9f2a24d36da.jpg
18 10 0 cde5dacd-3984-4cea-b56f-c3a6c5b82fa0.jpg
20 11 0 9e286ac0-25b1-4a05-af83-26e5d0002c2a.jpg
21 11 0 b1266770-9926-462c-8ec0-e965b21021eb.jpg
22 11 0 0542bd2a-4c4b-41d4-b51b-d311f42f0da9.jpg
23 11 0 b1cc44c9-50c4-4e81-bc9a-a0a4b515e709.jpg
My local db is very small, but I could notice a very good performance gain. Still, I think it could be better if the query returned only up to 4 rows for each ad instead of all the rows for each ad, as it does now. To do so, it would need something like where count(AdId) == 4, which I'm not sure is possible.
I'm also using Entity Framework here. Any extra advice would be very welcome.
Use a window function:
select AdImageId, AdId, IsMainImage, FilePath
from (
    select row_number() over (partition by AdId order by IsMainImage desc, AdImageId) rn, *
    from AdImages
) a
where rn <= 4
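To see the per-ad limit in action, here is a small sketch that runs the same ROW_NUMBER pattern against a subset of the sample rows (ads 1, 2 and 11) in SQLite via Python's sqlite3 module. FilePath is omitted to keep the example short; that and the choice of subset are simplifications, not part of the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Subset of the question's rows: ad 1 has 4 images, ad 2 has 1, ad 11 has 5.
conn.executescript("""
CREATE TABLE AdImages (AdImageId INTEGER, AdId INTEGER, IsMainImage INTEGER);
INSERT INTO AdImages VALUES
  (1,1,1),(2,1,0),(3,1,0),(4,1,0),
  (5,2,1),
  (19,11,1),(20,11,0),(21,11,0),(22,11,0),(23,11,0);
""")

# Number the images inside each ad, main image first, then keep the first four.
rows = conn.execute("""
SELECT AdImageId, AdId, IsMainImage
FROM (
    SELECT a.*,
           ROW_NUMBER() OVER (PARTITION BY AdId
                              ORDER BY IsMainImage DESC, AdImageId) AS rn
    FROM AdImages AS a
) AS ranked
WHERE rn <= 4
ORDER BY AdId, rn
""").fetchall()

for row in rows:
    print(row)
```

Ad 11 has five images in the sample, so its last thumbnail (AdImageId 23) is dropped, while ads with four or fewer images come back complete.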
If I am understanding you correctly, you can just return the TOP xx results.
SELECT TOP(3) * from AdImages order by IsMainImage desc, AdImageId;
This will return only the top 3 results.

Updating syntax

I have the following scenario:
Table is _etblpricelistprices
Columns are as follows:
iPriceListNameID iPricelistNameID iStockID fExclPrice
1 1 1 10
2 2 1 20
3 3 1 30
4 4 1 40
5 5 1 100
6 6 1 200
7 7 1 300
8 8 1 400
9 1 2 1000
10 2 2 2000
11 3 2 3000
12 4 2 4000
13 5 2 50
14 6 2 40
15 7 2 30
16 8 2 20
There are only two stock items here, but a lot more in the DB. The first column is the PK, which auto-increments. The second column is the pricelist, split as follows: (1-4) is current pricing and (5-8) is future pricing. The third column is the stock item's ID, and the fourth column is the pricing of the item.
I need a script to update this table to swap the future and current pricing per item. Please help
Observe, if you will, that swapping the iPricelistNameID values will achieve the same overall effect as swapping the fExclPrice values, and can be performed using a formula:
UPDATE _etblpricelistprices
SET
    iPricelistNameID = CASE
        WHEN iPricelistNameID > 4 THEN iPricelistNameID - 4
        ELSE iPricelistNameID + 4
    END
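A quick way to convince yourself that the swap behaves as intended is to run it on the sample rows. The sketch below does so in SQLite through Python's sqlite3 module; the primary key column is named id here purely as a placeholder, since the question does not give its real name, and price_for is a hypothetical helper for reading results back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE _etblpricelistprices (
    id INTEGER PRIMARY KEY,        -- placeholder name for the auto-increment PK
    iPricelistNameID INTEGER,
    iStockID INTEGER,
    fExclPrice REAL
)""")

# Prices for pricelists 1..8 per stock item, as listed in the question.
prices = {
    1: [10, 20, 30, 40, 100, 200, 300, 400],
    2: [1000, 2000, 3000, 4000, 50, 40, 30, 20],
}
for stock_id, values in prices.items():
    conn.executemany(
        "INSERT INTO _etblpricelistprices (iPricelistNameID, iStockID, fExclPrice) "
        "VALUES (?, ?, ?)",
        [(pricelist, stock_id, price) for pricelist, price in enumerate(values, start=1)],
    )

# The swap from the answer: 5..8 become 1..4 and 1..4 become 5..8.
conn.execute("""
UPDATE _etblpricelistprices
SET iPricelistNameID = CASE
        WHEN iPricelistNameID > 4 THEN iPricelistNameID - 4
        ELSE iPricelistNameID + 4
    END
""")

def price_for(stock_id, pricelist_id):
    """Look up the price now stored under a given stock item and pricelist."""
    return conn.execute(
        "SELECT fExclPrice FROM _etblpricelistprices "
        "WHERE iStockID = ? AND iPricelistNameID = ?",
        (stock_id, pricelist_id),
    ).fetchone()[0]

print(price_for(1, 1), price_for(1, 5))
```

After the update, stock item 1 has the former future price under pricelist 1 and the former current price under pricelist 5. Running the same UPDATE a second time restores the original assignment, since the mapping is its own inverse.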
