Nested Query in SOQL

Nested Query in SOQL - salesforce

Can someone please help me generate SOQL for the below query.
Getting this error - Nesting of semi join sub-selects is not supported
SELECT External_ID_vod__c, FirstName, LastName, Middle_vod__c
FROM Account
where Id IN (select Account_vod__c from EM_Attendee_vod__c WHERE Id IN (SELECT Incurred_Expense_Attendee_vod__c
FROM Expense_Header_vod__c
where CALENDAR_YEAR(CreatedDate) > 2020 and Status_vod__c = 'Paid_in_Full_vod'))

Yes, with WHERE clauses you can go "down" the related list only 1 level, looks like you'd need 2 levels.
Couple ideas.
Can you do it in 2 steps? First select Account_vod__c from EM_Attendee_vod__c..., then pass the results to 2nd query.
See if you can eliminate a level by using rollup summary fields - but with this case might be tricky, rollup of all payments in 2020 might be not possible.
See if you can run a report that's close to what you need (even if it'd only grab these Account_vod__c) and you could use "reporting snapshot" - save the intermediate results of the report in a helper custom object. That could make it easier to query.
See if you can run the query by going "up". For example Account_vod__c is a real lookup/master-detail you could try with something like
select Account_vod__r.External_ID_vod__c, Account_vod__r.FirstName, Account_vod__r.LastName, Account_vod__r.Middle_vod__c
from EM_Attendee_vod__c
WHERE Id IN (SELECT Incurred_Expense_Attendee_vod__c
FROM Expense_Header_vod__c
where CALENDAR_YEAR(CreatedDate) > 2020 and Status_vod__c = 'Paid_in_Full_vod')
It's not perfect, it'd give you duplicate accounts if they have multiple attendees but it could work good enough. And in a pinch you could always try to deduplicate it with a GROUP BY Account_vod__r.External_ID_vod__c, Account_vod__r.FirstName, Account_vod__r.LastName, Account_vod__r.Middle_vod__c (although GROUP BY doesn't like to have more than 200 results... you could then cheat with LIMIT + OFFSET if you expect to have 2K accounts max)

Related

Extracting all records using a conditional SQL Server query?

I have a long database of observations for individuals. There are multiple observations for each individual, all assigned different medcodeid's.
I want to extract all records of individuals with certain medcodeid's assigned, but only if they at some point have had a smaller list of specific codes assigned.
This is an example of what I start with:
long dataset, multiple observations
and this is the records I'd like to extract:
multiple observations, but patients 3 and 5 are not extracted, as they never had a medcode 12
Would this be an additional WHERE clause? I am struggling as this will then only extract the second AND medcodeid list. But I want it to extract all, if the individual has had one of these certain fewer codes at some point. I hope that makes some sense. I am unfamiliar with IF command? And cannot see how CASE WHEN would work either.
Thank you very much in advance!

You definitely don't want to filter out all the rows so you're right that an additional condition won't help with that. And where only lets you look at the current row and you're trying to make a decision based all the rows belonging to the patient.
This query just uses a table expression and an analytic count() that tags each row with the number of matches as it lets you look outside the current row just like you need.
-- my additions to your query are in lowercase
with data as (
SELECT obs.patid, yob, obsdate, medcodeid,
count(case when medcodeid IN (<list of mandatory codes>) then 1 end)
over (partition by obs.patid) as medcode_count
-- assuming the relationship looks something like this
from obs inner join medcode on medcode.patid = obs.patid
WHERE medcodeid IN (<list of codes>)
AND obsdate BETWEEN '2004-12-31' AND GETDATE()
AND patienttypeid = 3 AND acceptable = 1 AND gender = 2
AND YEAR(obsdate) - yob > 15 AND YEAR(obsdate) - yob < 45
)
select * from data where medcode_count > 0;
At first I thought you were requiring that at least five of the codes from the full set were found. Now that you've edited the question I believe that you want to require that at least one code from a smaller subset is present. Either way this approach will work.

If I'm understanding what you're asking, I think what you need is an additional WHERE clause with a subquery. This could be done with and EXIST or a join but I find an IN query to be easier to work with.
You left the FROM out of your query so I had to guess at it but try this:
SELECT
obs.patid,
yob,
obsdate,
medcodeid
FROM
obs
WHERE
medcodeid IN (list of 20 codes)
AND (obsdate BETWEEN '2004-12-31' AND GETDATE())
AND patienttypeid = 3
AND acceptable = 1
AND gender = 2
AND ((YEAR(obsdate))-yob) > 15
AND ((YEAR(obsdate)) - yob) < 45
AND obs.patid IN (
SELECT
obs.patid
FROM
obs
WHERE
medcodeid IN (5 of the 20 codes)
);

Is there a way to sum an entire quantity in SQL with unique values

I am trying to get a total summation of both the ItemDetail.Quantity column and ItemDetail.NetPrice column. For sake of example, let's say the quantity that is listed is for each individual item is 5, 2, and 4 respectively. I am wondering if there is a way to display quantity as 11 for one single ItemGroup.ItemGroupName
The query I am using is listed below
select Location.LocationName, ItemDetail.DOB, SUM (ItemDetail.Quantity) as "Quantity",
ItemGroup.ItemGroupName, SUM (ItemDetail.NetPrice)
from ItemDetail
Join ItemGroupMember
on ItemDetail.ItemID = ItemGroupMember.ItemID
Join ItemGroup
on ItemGroupMember.ItemGroupID = ItemGroup.ItemGroupID
Join Location
on ItemDetail.LocationID = Location.LocationID
Inner Join Item
on ItemDetail.ItemID = Item.ItemID
where ItemGroup.ItemGroupID = '78' and DOB = '11/20/2019'
GROUP BY Location.LocationName, ItemDetail.DOB, Item.ItemName,
ItemDetail.NetPrice, ItemGroup.ItemGroupName

If you are using SQL Server 2012 , you can use the summation on partition to display the
details and aggregates in the same query.
SUM(SalesYTD) OVER (ORDER BY DATEPART(yy,ModifiedDate)),1)
Link :
https://learn.microsoft.com/en-us/sql/t-sql/functions/sum-transact-sql?view=sql-server-ver15

We can't be certain without seeing sample data. But I suspect you need to remove some fields from you GROUP BY clause -- probably Item.ItemName and ItemDetail.NetPrice.
Generally, you won't GROUP BY a column that you are applying an aggregate function to in the SELECT -- as in SUM(ItemDetail.NetPrice). And it is not very common, in my experience, to GROUP BY columns that aren't included in the SELECT list - as you are doing with Item.ItemName.
I think you need to go back to basics and read about what GROUP BY does.

First of all welcome to the overflow...
Second: The answer is going to be "It depends"
Any time you aggregate data you will need to Group by the other fields in the query, and you have that in the query. The gotcha is what happens when data is spread across multiple locations.
My suggestion is to rethink your problem and see if you really need these other fields in the query. This will depend on what the person using the data really wants to know.
Do they need to know how many of item X there are, or do they really need to know that item X is spread out over three sites?
You might find you are better off with two smaller queries.

SQL column update to solve data quality issue

I am unable to figure out how to write 'smart code' for this.
In this case I would like the end result for the first two case to be:
product_cat_name
A_SEE
A_BEE
Business is rule is such that one product_cat_name can belong to only one group but due to data quality issues we sometimes have a product_cat_name belonging to 2 different groups. As a special case in such a situation we would like to append group to the product_cat_name so that product_cat_name becomes unique.
It sounds so simple yet I am cracking my head over this.
Any help much appreciated.

Something like this:
with names as (
select prod_cat_nm , prod_cat_nm+group as new_nm from (query that joins 3 tables together) as qry
join
(Select prod_cat_nm, count(distinct group)
from (query that joins 3 tables together) as x
group by
prod_cat_nm
having count(distinct group) > 1) dups
on dups.prod_cat_nm = qry.prod_cat_nm
)
SELECT prod_cat_nm, STRING_AGG(New_nm, '') WITHIN GROUP (ORDER BY New_Nm ASC) AS new_prod_cat_nm
FROM names
GROUP BY prod_cat_nm;
I've used the 2017 STRING_AGG() here as its shortest to write - But you could easily change this to use Recursions or XML path

It is simple if you break it down into small pieces.
You need to UPDATE the table obviously, and change the value of product_cat_name. That's easy.
The new value should be group + product_cat_name. That's easy.
You only want to do this when a product_cat_name is associated with more than one group. That's probably the tricky part, but it can also be broken down into small pieces that are easy.
You need to identify which product_cat_names have more than one group. That's easy. GROUP BY product_cat_name HAVING COUNT(DISTINCT Group) > 1.
Now you need to use that to limit your UPDATE to only those product_cat_names. That's easy. WHERE product_cat_name IN (Subquery using above logic to get PCNs that have more than one Group).
All easy steps. Put them together and you've got your solution.

SalesForce SOQL highest number from size column

i'm new to SOQL and SF, so bear with me :)
I have Sales_Manager__c object tied to Rent__c via Master-Detail relationship. What i need is to get Manager with highest number of Rent deals for a current year. I figured out that the Rents__r.size column stores that info. The question is how can i gain access to the size column and retrieve highest number out of it?
Here is the picture of query i have
My SOQL code
SELECT (SELECT Id FROM Rents__r WHERE StartDate__c=THIS_YEAR) FROM Sales_Manager__c

Try other way around, by starting the query from Rents and then going "up". This will let you sort by the count and stop after say top 5 men (your way would give you list of all managers, some with rents, some without and then what, you'd have to manually go through the list and find the max).
SELECT Manager__c, Manager__r.Name, COUNT(Id)
FROM Rent__c
WHERE StartDate__c=THIS_YEAR
GROUP BY Manager__c, Manager__r.Name
ORDER BY COUNT(Id) DESC
LIMIT 10
(I'm not sure if Manager__c is the right name for your field, experiment a bit)

RunningDifference(x) for multiple x values

My Table tbl_data(event_time, monitor_id,type,event_date,status)
select status,sum(runningDifference(event_time)) as delta from (SELECT status,event_date,event_time FROM tbl_data WHERE event_date >= '2018-05-01' AND monitor_id =3 ORDER BY event_time ASC) group by status
Result will be
status delta
1 4665465
2 965
This query result give me right answer for single monitor_id, Now I required it for multiple monitor_id,
How can I achieve it in single/same query??

Usually this is achieved with conditional expressions. Like SELECT ...,if(monitor_id = 1, status, NULL) AS status1,... and then you do your aggregate function that, as you might know, skips NULL values. But I did some testing and it turns out that because of clickhouse internals runningDifference() can't distinguish columns originated from the same source. At the same time it distinguishes columns that came from different sources just fine. It is a bug.
I opened an issue on Github: https://github.com/yandex/ClickHouse/issues/2590
UPDATE: Devs reacted incredibly fast and with the latest source from master you can get what you want with the strategy I described. See the issue for code example.