I have the following DB Structure (simplified):
Payments
----------------------
Id | int
InvoiceId | int
Active | bit
Processed | bit
Invoices
----------------------
Id | int
CustomerOrderId | int
CustomerOrders
------------------------------------
Id | int
ApprovalDate | DateTime
ExternalStoreOrderNumber | nvarchar
Each Customer Order has an Invoice and each Invoice can have multiple Payments.
The ExternalStoreOrderNumber is a reference to the order from the external partner store we imported the order from and the ApprovalDate the timestamp when that import happened.
Now we have the problem that we had a wrong import and need to change some payments to other invoices (several hundred, so too many to do by hand) according to the following logic:
Search the Invoice of the Order which has the same external number as the current one but starts with 0 instead of the current digit.
To do that I created the following query:
UPDATE DB.dbo.Payments
SET InvoiceId=
(SELECT TOP 1 I.Id FROM DB.dbo.Invoices AS I
WHERE I.CustomerOrderId=
(SELECT TOP 1 O.Id FROM DB.dbo.CustomerOrders AS O
WHERE O.ExternalOrderNumber='0'+SUBSTRING(
(SELECT TOP 1 OO.ExternalOrderNumber FROM DB.dbo.CustomerOrders AS OO
WHERE OO.Id=I.CustomerOrderId), 1, 10000)))
WHERE Id IN (
SELECT P.Id
FROM DB.dbo.Payments AS P
JOIN DB.dbo.Invoices AS I ON I.Id=P.InvoiceId
JOIN DB.dbo.CustomerOrders AS O ON O.Id=I.CustomerOrderId
WHERE P.Active=0 AND P.Processed=0 AND O.ApprovalDate='2012-07-19 00:00:00')
Now I started that query on a test system using the live data (~250,000 rows in each table) and it has now been running for 16 hours - did I do something completely wrong in the query, or is there a way to speed it up a little?
It is not required to be really fast, as it is a one-time task, but several hours seems long to me, and as I want to learn for the (hopefully not happening) next time, I would like some feedback on how to improve.
You might as well kill the query. Your update subquery is completely uncorrelated to the table being updated. From the looks of it, when it completes, EVERY SINGLE dbo.Payments record will have the same value.
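To see the difference on a toy example (hypothetical table and column names, purely for illustration):

-- Uncorrelated: the subquery ignores the row being updated, so every row gets the same value
UPDATE T SET Col = (SELECT TOP 1 x FROM Other)

-- Correlated: the subquery references the row being updated, so each row can get its own value
UPDATE T SET Col = (SELECT TOP 1 x FROM Other WHERE Other.TId = T.Id)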
To break down your query, you might find that the subquery runs fine on its own.
SELECT TOP 1 I.Id FROM DB.dbo.Invoices AS I
WHERE I.CustomerOrderId=
(SELECT TOP 1 O.Id FROM DB.dbo.CustomerOrders AS O
WHERE O.ExternalOrderNumber='0'+SUBSTRING(
(SELECT TOP 1 OO.ExternalOrderNumber FROM DB.dbo.CustomerOrders AS OO
WHERE OO.Id=I.CustomerOrderId), 1, 10000))
That is always a BIG worry.
The next thing is that it is running this row-by-row for every record in the table.
You are also double-dipping into Payments: the WHERE clause selects IDs from a subquery that joins Payments to itself. You can reference the table being updated directly in the JOIN clause using this pattern:
UPDATE P
....
FROM DB.dbo.Payments AS P
JOIN DB.dbo.Invoices AS I ON I.Id=P.InvoiceId
JOIN DB.dbo.CustomerOrders AS O ON O.Id=I.CustomerOrderId
WHERE P.Active=0 AND P.Processed=0 AND O.ApprovalDate='2012-07-19 00:00:00'
Moving on, another mistake is to use TOP without ORDER BY. That's asking for random results. If you know there's only one result, you wouldn't even need TOP. In this case, maybe you're OK with randomly choosing one from many possible matches. Since you have three levels of TOP(1) without ORDER BY, you might as well just mash them all up (join them) and take a single TOP(1) across all of them. That would make it look like this:
SET InvoiceId=
(SELECT TOP 1 I.Id
FROM DB.dbo.Invoices AS I
JOIN DB.dbo.CustomerOrders AS O
ON I.CustomerOrderId=O.Id
JOIN DB.dbo.CustomerOrders AS OO
ON O.ExternalOrderNumber='0'+SUBSTRING(OO.ExternalOrderNumber,1,100)
AND OO.Id=I.CustomerOrderId)
However, as I mentioned very early on, this is not being correlated to the main FROM clause at all. We move the entire search into the main query so that we can make use of JOIN-based set operations rather than row-by-row subqueries.
Before I show the final query (fully commented): I think your SUBSTRING is supposed to address the logic "but starts with 0 instead of the current digit". However, if that means what I think it means, then for an order number '5678' you're looking for '0678', which would also mean that SUBSTRING should be using 2,10000 instead of 1,10000.
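A quick check of the offsets (using the made-up order number '5678'; SUBSTRING in T-SQL is 1-based):

SELECT SUBSTRING('5678', 1, 10000)         -- '5678' (keeps the current leading digit)
SELECT '0' + SUBSTRING('5678', 2, 10000)   -- '0678' (what the requirement describes)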
UPDATE P
SET InvoiceId=II.Id
FROM DB.dbo.Payments AS P
-- invoices for payments
JOIN DB.dbo.Invoices AS I ON I.Id=P.InvoiceId
-- orders for invoices
JOIN DB.dbo.CustomerOrders AS O ON O.Id=I.CustomerOrderId
-- another order with '0' as leading digit
JOIN DB.dbo.CustomerOrders AS OO
ON OO.ExternalOrderNumber='0'+substring(O.ExternalOrderNumber,2,1000)
-- invoices for this other order
JOIN DB.dbo.Invoices AS II ON OO.Id=II.CustomerOrderId
-- conditions for the Payments records
WHERE P.Active=0 AND P.Processed=0 AND O.ApprovalDate='2012-07-19 00:00:00'
It is worth noting that SQL Server allows UPDATE .. FROM .. JOIN, which is less widely supported by other DBMSs, e.g. Oracle. The caveat is that for a single row in Payments (the update target), there could be many candidate II.Id values to choose from across all those joins, and you will get one of the possible II.Id values essentially at random.
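If you want to check whether that ambiguity actually occurs in your data before running the update, a quick sanity check could look like this (a sketch reusing the same joins as the final query, not tested against your data):

-- Payments that would match more than one candidate invoice
SELECT P.Id, COUNT(DISTINCT II.Id) AS CandidateInvoices
FROM DB.dbo.Payments AS P
JOIN DB.dbo.Invoices AS I ON I.Id=P.InvoiceId
JOIN DB.dbo.CustomerOrders AS O ON O.Id=I.CustomerOrderId
JOIN DB.dbo.CustomerOrders AS OO ON OO.ExternalOrderNumber='0'+SUBSTRING(O.ExternalOrderNumber,2,1000)
JOIN DB.dbo.Invoices AS II ON OO.Id=II.CustomerOrderId
WHERE P.Active=0 AND P.Processed=0 AND O.ApprovalDate='2012-07-19 00:00:00'
GROUP BY P.Id
HAVING COUNT(DISTINCT II.Id) > 1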
I think something like this will be more efficient, if I understood your query right. As I wrote it by hand and didn't run it, it may have some syntax errors.
UPDATE DB.dbo.Payments
SET InvoiceId=(SELECT TOP 1 I.Id FROM DB.dbo.Invoices AS I
               INNER JOIN DB.dbo.CustomerOrders AS O ON I.CustomerOrderId=O.Id
               INNER JOIN DB.dbo.CustomerOrders AS OO ON OO.Id=I.CustomerOrderId
                   AND O.ExternalOrderNumber='0'+SUBSTRING(OO.ExternalOrderNumber, 1, 10000))
FROM DB.dbo.Payments
JOIN DB.dbo.Invoices AS I ON I.Id=Payments.InvoiceId
JOIN DB.dbo.CustomerOrders AS O ON O.Id=I.CustomerOrderId
WHERE Payments.Active=0
  AND Payments.Processed=0
  AND O.ApprovalDate='2012-07-19 00:00:00'
Try to re-write the query using JOINs. This will highlight some of the problems. Will the following do roughly the same? (The queries are somewhat different, but I guess this is roughly what you're trying to do.)
UPDATE P
SET InvoiceId = I.Id
FROM DB.dbo.Payments AS P
CROSS JOIN DB.dbo.Invoices AS I
INNER JOIN DB.dbo.CustomerOrders AS O
    ON I.CustomerOrderId = O.Id
INNER JOIN DB.dbo.CustomerOrders AS OO
    ON O.ExternalOrderNumber = '0' + SUBSTRING(OO.ExternalOrderNumber, 1, 10000)
    AND OO.Id = I.CustomerOrderId
WHERE P.Active=0 AND P.Processed=0 AND O.ApprovalDate='2012-07-19 00:00:00'
As you see, two problems stand out:
The unconditional join between Payments and Invoices (of course, you've cut this off with a TOP 1, but set-wise it's still unconditional) - I'm not sure if this really is a problem in your query. It will be in mine though :).
The join on a 10000-character SUBSTRING expression, embedded in a join condition. This is highly inefficient.
If you need a one-time speedup, just run the queries on each table separately, store the intermediate results in temporary tables, create indexes on those temporary tables, and use the temporary tables to perform the update.
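As a rough sketch of that temp-table approach for this particular update (the intermediate shape, the LIKE filter and the index name are my assumptions, untested):

-- 1. Materialize the target invoice per '0'-prefixed external order number
SELECT O.ExternalOrderNumber, MIN(I.Id) AS InvoiceId
INTO #TargetInvoice
FROM DB.dbo.CustomerOrders AS O
JOIN DB.dbo.Invoices AS I ON I.CustomerOrderId=O.Id
WHERE O.ExternalOrderNumber LIKE '0%'
GROUP BY O.ExternalOrderNumber

CREATE INDEX IX_TargetInvoice ON #TargetInvoice (ExternalOrderNumber)

-- 2. Update the payments by joining to the temp table
UPDATE P
SET InvoiceId=T.InvoiceId
FROM DB.dbo.Payments AS P
JOIN DB.dbo.Invoices AS I ON I.Id=P.InvoiceId
JOIN DB.dbo.CustomerOrders AS O ON O.Id=I.CustomerOrderId
JOIN #TargetInvoice AS T ON T.ExternalOrderNumber='0'+SUBSTRING(O.ExternalOrderNumber,2,1000)
WHERE P.Active=0 AND P.Processed=0 AND O.ApprovalDate='2012-07-19 00:00:00'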
Good evening all!
I'm running into a really odd issue that I'm having trouble understanding.
I have 3 tables (parts table, parts move history and a parts detail table).
What I'm trying to do is have the result set return lot#,part#,product description,quantity,part location, what's currently in inventory (versus full history) and who last moved the product.
Now, for the query. When I run the below query, I get a result set of 4,751 rows; which lines up perfectly with my expected results. However, when I try to add in the userid field, I then get a result set of 186,573. This large result set appears to pull in all historic data versus just matching the userid to the 4,751 rows I actually need.
From the Parts Table I need (prod_desc)
From the Parts Detail Table I need (lot,part#,lotquantity,prtlocation)
From the Parts Move History Table I need (move_date,user_id)
4,751 Query:
SELECT DISTINCT
inv.lot,
inv.part#,
prt.prod_desc,
inv.lotquantity,
inv.prtlocation,
MAX(mv.move_date) AS 'Move Date'
FROM invdet AS inv
LEFT JOIN movetable AS mv ON inv.part# = mv.part#
LEFT JOIN partmstr AS prt ON inv.part# = prt.part#
WHERE inv.lot IS NOT NULL
GROUP BY inv.lot,inv.part#,prt.prod_desc,inv.lotquantity,inv.prtlocation
ORDER BY inv.prtlocation
186,573 Query:
SELECT DISTINCT
inv.lot,
inv.part#,
prt.prod_desc,
inv.lotquantity,
inv.prtlocation,
MAX(mv.move_date) AS 'Move Date',
mv.user_id
FROM invdet AS inv
LEFT JOIN movetable AS mv ON inv.part# = mv.part#
LEFT JOIN partmstr AS prt ON inv.part# = prt.part#
WHERE inv.lot IS NOT NULL
GROUP BY inv.lot,inv.part#,prt.prod_desc,inv.lotquantity,inv.prtlocation,mv.user_id
ORDER BY inv.prtlocation
If I don't use the MAX function, I do not get current inventory and instead get all results in the table, which I do not need. I'm still learning and my GROUP BY's leave a lot to be desired as I'm still wrapping my head around it (open to suggestions!). I'm sure there's a subquery I can throw in here somewhere, but I'm still figuring those out as well. Any help is greatly appreciated!
I think the problem is that when you add mv.user_id from the movetable table, you get all of the part's movements and not only the last one, i.e. the one with date MAX(mv.move_date).
One way is to remove the LEFT JOIN to movetable and instead use a CROSS APPLY, like:
SELECT inv.lot,inv.part,prt.prod_desc,inv.lotquantity,inv.prtlocation,x.move_date,x.user_id
FROM invdet AS inv
CROSS APPLY(SELECT TOP 1
mv.user_id,mv.move_date
FROM movetable mv
WHERE inv.part=mv.part
ORDER BY mv.move_date DESC) AS x
LEFT JOIN partmstr AS prt ON inv.part=prt.part
WHERE inv.lot IS NOT NULL
ORDER BY inv.prtlocation
I've not tested it but it should be fine, though maybe a bit slow, because CROSS APPLY executes one subquery per row in the inv table. If it is too slow, you can use ROW_NUMBER to build a derived table composed of only the last movements and then use it in the LEFT JOIN, as follows:
SELECT inv.lot,inv.part,prt.prod_desc,inv.lotquantity,inv.prtlocation,y.move_date,y.user_id
FROM invdet AS inv
LEFT JOIN(SELECT x.user_id,x.move_date,x.part
FROM (SELECT mv.user_id,mv.move_date,mv.part,rn=ROW_NUMBER() OVER(PARTITION BY mv.part ORDER BY mv.move_date DESC)
FROM movetable mv) AS x
WHERE x.rn=1) AS y ON y.part=inv.part
LEFT JOIN partmstr AS prt ON inv.part=prt.part
WHERE inv.lot IS NOT NULL
ORDER BY inv.prtlocation
Hope it helps.
The following CTE Query throws "The statement terminated. The maximum recursion 100 has been exhausted before statement completion."
WITH MyCTE
AS (
SELECT o.organizationid,
organization AS organization
FROM organization o
INNER JOIN store s ON s.organizationid = o.organizationid
UNION ALL
SELECT store.storeid,
CAST(storeNAme AS NVARCHAR(50)) AS storeNAme
FROM store
INNER JOIN MyCTE ON store.organizationid = MyCTE.organizationid)
SELECT DISTINCT
Organization
FROM MyCTE
When executing the subqueries before and after the UNION ALL separately, the following results are returned.
Anchor query:-
SELECT o.organizationid,
organization AS organization
FROM organization o
INNER JOIN store s ON s.organizationid = o.organizationid
Result:-
organizationid |organization
--------------------------------
3 | Org1
query after union all:-
SELECT store.storeid,
CAST(storeNAme AS NVARCHAR(50)) AS storeNAme
FROM store
Result:-
StoreId |StoreName
--------------------------------
3 | Warehouse1
May I know the reason why?
Specify the maxrecursion option at the end of the query:
...
from MyCTE
option (maxrecursion 0)
That allows you to specify how often the CTE can recurse before generating an error. Maxrecursion 0 allows infinite recursion.
I think we can try this to increase the max recursion depth:
SET GLOBAL cte_max_recursion_depth=10000;
According to this: https://forums.mysql.com/read.php?100,681245,681245
When you are running a recursive query, you allow a query to call itself. It doesn't matter how many rows the query returns, but how many times the query calls itself. Therefore there is a risk it might go into an infinite loop. In order to prevent this situation, SQL Server has a setting which determines how many recursions are allowed. The default value for this setting is 100, so you will get this message when your query exceeds 100 recursions (self calls).
You can override this setting by adding OPTION (MAXRECURSION nn) at the end of the query, where nn is the new maximum.
You can also remove the protection completely by setting this value to 0 - it will look like this: OPTION (MAXRECURSION 0). However, this is not recommended, since a runaway query will keep consuming resources.
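Applied to the query from the question, the hint goes at the very end of the statement (200 here is just an arbitrary example limit):

WITH MyCTE
AS (
    SELECT o.organizationid,
        organization AS organization
    FROM organization o
    INNER JOIN store s ON s.organizationid = o.organizationid
    UNION ALL
    SELECT store.storeid,
        CAST(storeNAme AS NVARCHAR(50)) AS storeNAme
    FROM store
    INNER JOIN MyCTE ON store.organizationid = MyCTE.organizationid)
SELECT DISTINCT
    Organization
FROM MyCTE
OPTION (MAXRECURSION 200)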
The OP asked for a suggestion on how to avoid a potential infinite loop, and to create exit criteria for the recursion.
As far as I can see you are trying to build a data hierarchy. You might consider introducing an additional column with the hierarchy level and stopping at some level (I chose 10 in the example):
WITH MyCTE
AS (
SELECT o.organizationid,
organization AS organization,
1 lvl
FROM organization o
INNER JOIN store s ON s.organizationid = o.organizationid
UNION ALL
SELECT store.storeid,
CAST(storeNAme AS NVARCHAR(50)) AS storeNAme,
lvl+1
FROM store
INNER JOIN MyCTE ON store.organizationid = MyCTE.organizationid
WHERE lvl<10
)
SELECT DISTINCT
Organization
FROM MyCTE
We have been learning SQL Server programming in Database Systems class. The professor goes exceptionally fast and is not very open to asking questions. I did ask him this, but he just advised me to review the code he'd given (which doesn't actually answer the question).
When making a query, what is the difference between using the term JOIN and using the "=" operator? For example, I have the following query:
SELECT VENDOR_NAME, ITEM_NAME, QTY
FROM VENDOR, VENDOR_ORDER, INVENTORY
WHERE VENDOR.VENDOR_ID = VENDOR_ORDER.VENDOR_ID
AND VENDOR_ORDER.INV_ID = INVENTORY.INV_ID
ORDER BY VENDOR_NAME
In class the professor has used the following code:
SELECT DISTINCT CUS_CODE, CUS_LNAME, CUS_FNAME
FROM CUSTOMER JOIN INVOICE USING (CUS_CODE)
JOIN LINE USING (INV_NUMBER)
JOIN PRODUCT USING (P_CODE)
WHERE P_DESCRIPT = 'Claw hammer';
It seems to me that using a join is performing the same function as the "=" is in mine? Am I correct or is there a difference that I am unaware of?
Edit:
Trying to use Inner Join based on things I've found on Google. I ended up with the following.
SELECT VENDOR_NAME, ITEM_NAME, QTY
FROM VENDOR, VENDOR_ORDER, INVENTORY
INNER JOIN VENDOR_ORDER USING (VENDOR_ID)
INNER JOIN INVENTORY USING (INV_ID)
ORDER BY VENDOR_NAME
Now I get the error message: "VENDOR_ID" is not a recognized table hints option. If it is intended as a parameter to a table-valued function or to the CHANGETABLE function, ensure that your database compatibility mode is set to 90.
I'm using 2014, so my compatibility level is 120.
The difference between what you are doing (in your first example) and what your professor is doing is that you are creating a set of all possible combinations of the rows in those tables, then narrowing your results to the ones that match the way you want them to. He is creating a set of only the rows that match the way you want them to in the first place.
If your tables were:
Table1
ID1
1
2
3
Table2
ID2
1
2
3
Your query starts with basically a cross join:
Select * from Table1, Table2
ID1 ID2
1 1
2 1
3 1
1 2
2 2
3 2
1 3
2 3
3 3
Then narrows that result set down by applying the where ID1 = ID2
ID1 ID2
1 1
2 2
3 3
This is inefficient and somewhat difficult to read in more complex examples, as people have mentioned in the comments.
Your professor is building the criteria to relate the two tables into the join itself, so he is effectively skipping the first step. In our example tables, this would be Select * from Table1 join Table2 on ID1 = ID2.
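Written out, the two equivalent forms look like this:

-- Old style: cross product first, then filter
SELECT * FROM Table1, Table2 WHERE ID1 = ID2

-- Explicit join: the matching condition is part of the join itself
SELECT * FROM Table1 JOIN Table2 ON ID1 = ID2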
There are several types of joins in SQL, which differ based on how you want to handle cases where a value exists in one of your tables but has no match in the other table. See the traditional Venn diagram explanation from http://www.codeproject.com/Articles/33052/Visual-Representation-of-SQL-Joins.
Don't worry, it's your professor's issue, not yours. Make sure you give appropriate feedback at the end of the course ;)
Hang in there.
So here is some info:
So the first issue is: your professor should not be teaching you USING, because it has limited implementation (it definitely won't work in SQL Server), and IMHO it's a bad idea anyway because you should explicitly list join columns.
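For example, your professor's query rewritten with explicit ON conditions (this is the form SQL Server accepts; USING (CUS_CODE) just means "join where both sides have an equal CUS_CODE", and I'm assuming the usual column placement for that schema):

SELECT DISTINCT C.CUS_CODE, C.CUS_LNAME, C.CUS_FNAME
FROM CUSTOMER AS C
JOIN INVOICE AS I ON I.CUS_CODE = C.CUS_CODE
JOIN LINE AS L ON L.INV_NUMBER = I.INV_NUMBER
JOIN PRODUCT AS P ON P.P_CODE = L.P_CODE
WHERE P.P_DESCRIPT = 'Claw hammer'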
Here are some queries that will work in SQL Server - let's build them up bit by bit. I will need to make some assumptions.
First just join vendor to vendor order:
SELECT VENDOR.VENDOR_NAME, VENDOR_ORDER.QTY
FROM VENDOR
INNER JOIN
VENDOR_ORDER
ON VENDOR.VENDOR_ID = VENDOR_ORDER.VENDOR_ID
By using inner join we match these two tables on VENDOR_ID
If you have seven records in VENDOR_ORDER with VENDOR_ID = 7, and one record in table VENDOR then the result of this will be.... 7 records, with the data from the VENDOR table repeating seven times.
Now to that, join in inventory
SELECT VENDOR.VENDOR_NAME, INVENTORY.ITEM_NAME, VENDOR_ORDER.QTY
FROM VENDOR
INNER JOIN
VENDOR_ORDER
ON VENDOR.VENDOR_ID = VENDOR_ORDER.VENDOR_ID
INNER JOIN
INVENTORY ON INVENTORY.INV_ID = VENDOR_ORDER.INV_ID
ORDER BY VENDOR.VENDOR_NAME
This 'INNER JOIN' syntax is the modern version (often referred to as SQL-92). Having a comma-separated list of tables after the FROM clause is 'old school'.
Both methods work the same way but the old school way causes ambiguities if you start using outer joins. So get into the habit of doing it the new way.
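For example, an outer join is straightforward in the modern syntax, while the comma style has no clean way to express it (the old *= operator is no longer supported in current SQL Server versions):

-- All vendors, including those with no orders at all
SELECT V.VENDOR_NAME, ORD.QTY
FROM VENDOR AS V
LEFT OUTER JOIN VENDOR_ORDER AS ORD ON V.VENDOR_ID = ORD.VENDOR_ID
ORDER BY V.VENDOR_NAME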
Lastly, to neaten things up you can use an 'alias', which means you give each table a shorter name and then use that. I've also added in the order's inventory ID so you can get an idea of what's going on:
SELECT V.VENDOR_NAME, I.ITEM_NAME, ORD.INV_ID, ORD.QTY
FROM VENDOR As V
INNER JOIN
VENDOR_ORDER As ORD
ON V.VENDOR_ID = ORD.VENDOR_ID
INNER JOIN
INVENTORY As I ON I.INV_ID = ORD.INV_ID
ORDER BY V.VENDOR_NAME
I have a performance problem on a query.
First table is a Customer table which has millions records in it. Customer table has a column of email address and some other information about customer.
Second table is a CommunicationInfo table which contains just Email addresses.
And what I want here is: how many times does each email address in the CommunicationInfo table repeat in the Customer table? What would be the most performant query?
The basic query with which I can explain this situation is:
Select ci.Email, count(*) from Customer c left join
CommunicationInfo ci on c.Email1 = ci.Email or c.Email2 = ci.Email
Group by ci.Email
But as it stands, it takes about 5-6 minutes to execute.
Thanks in Advance.
This query is about as good as it gets, provided you have indexes on Customer.Email1 and Customer.Email2, and another on CommunicationInfo.Email:
Select
c.Email, count(*)
from Customer c
left join CommunicationInfo ci on c.Email1 = ci.Email
left join CommunicationInfo ci2 on c.Email2 = ci2.Email
Group by c.Email
You mention:
And what I want here is: how many times does each email address in the CommunicationInfo table repeat in the Customer table? What would be the most performant query?
To me, that sounds like you could easily use an INNER JOIN - this would most likely be a lot faster, since it will limit the search scope to just those customers who really do have an e-mail - anyone who doesn't have an e-mail at all (and thus a count(*) = 0) will not even be looked at - that might make a big difference even just in the number of rows SQL Server has to count and group.
So try this:
SELECT
ci.Email, COUNT(*)
FROM
dbo.Customer c
INNER JOIN dbo.CommunicationInfo ci
ON c.Email1 = ci.Email OR c.Email2 = ci.Email
GROUP BY
ci.Email
How does that perform in your case??
Using the OR condition robs the optimizer of the opportunity to use a HASH JOIN or MERGE JOIN.
Use this:
SELECT q2.Email, SUM(cnt)
FROM (
SELECT ci.Email, COUNT(c.Email) AS cnt
FROM CommunicationInfo ci
LEFT JOIN
Customer c
ON c.Email1 = ci.Email
GROUP BY
ci.Email
UNION ALL
SELECT ci.Email, COUNT(c.Email) AS cnt
FROM CommunicationInfo ci
LEFT JOIN
Customer c
ON c.Email2 = ci.Email
GROUP BY
ci.Email
) q2
GROUP BY
q2.Email
or this:
SELECT ci.Email, COUNT(q.Email)
FROM CommunicationInfo ci
LEFT JOIN
(
SELECT Email1 AS email
FROM Customer c
UNION ALL
SELECT Email2
FROM Customer
) q
ON q.Email = ci.Email
GROUP BY
ci.Email
Make sure that you have indexes on Customer(Email1) and Customer(Email2).
The first query will be more efficient if your emails are mostly not filled, the second one — if most emails are filled.
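For reference, the suggested indexes could be created like this (the index names are just placeholders):

CREATE INDEX IX_Customer_Email1 ON Customer (Email1)
CREATE INDEX IX_Customer_Email2 ON Customer (Email2)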
Depending on your environment there may not be much you can do to optimize this.
A couple of questions:
How many records in CommunicationInfo?
How often do you really need to run this query? Is it a one time analysis, or are multiple people going to be running this every 10 minutes?
Are the fields indexed? I'll make a guess that neither Email1 nor Email2 field is indexed. However, I wouldn't suggest adding an index without taking the balance of the whole system into consideration.
Why are you using a left join? Do you really need EVERYTHING from the Customer table? You're counting, so no harm in doing an INNER JOIN.
Suggestions:
Run the query through the Query Optimization wizard to see if there is anything SQL Server would recommend.
An extreme suggestion would be to dump the Email1 and Email2 columns into a temp table and join to that. I've seen queries run slowly because of a large amount of stress on a particular table, so sometimes copying the records into a temp table is faster, but this technique is very dependent on how much memory there is, how fast IO is, and the amount of stress on a particular table.
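A sketch of that temp-table idea (untested; the temp table and index names are made up):

-- Dump both email columns into one narrow temp table
SELECT Email1 AS Email INTO #CustEmails FROM Customer WHERE Email1 IS NOT NULL
INSERT INTO #CustEmails (Email)
SELECT Email2 FROM Customer WHERE Email2 IS NOT NULL

CREATE INDEX IX_CustEmails_Email ON #CustEmails (Email)

-- Count occurrences per CommunicationInfo email
SELECT ci.Email, COUNT(ce.Email) AS Occurrences
FROM CommunicationInfo ci
LEFT JOIN #CustEmails ce ON ce.Email = ci.Email
GROUP BY ci.Email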
This is going to seem like a lame question for all experts in SQL server views but...
So I have a small set of data that my client needs for reporting purposes. I have to admit that although I did ask them their reporting requirements, it isn't till now that I see that my db could be better optimised.
One of the pieces of data they want is the time difference between two tasks that may have run:
select caseid, hy.createdate
from app_history hy
where hy.activityid in (303734, 303724)
This gives me two rows (after edit) per case submission, which then have to be measured; but there are a few wiggles:
Activity 303734 will always run, activity 303724 might run.
Each 303734 and 303724 combo matches up. Conceivably a case can have one unmatched 303734 with a matched pair afterwards on the 2nd submission. Matching these might be down to intuition. Not good.
There may be more than one submission per caseid, and if that is the case then both activities will run every subsequent time.
There is no way to write the submission number to this table.
The app_history table holds userid, caseid and activityid as foreign keys. The PK is the identity column ID.
Is there a better way to write the query?
After help from KM:
select
c.id, c.submissionno, hya.caseid, hya.createtime, hyb.caseid, hyb.createtime
,CASE
WHEN hyb.caseid IS NOT NULL THEN DATEDIFF(mi,hya.createtime,hyb.createtime)
ELSE NULL
END AS Difference
from app_case c
inner join app_history hya on c.id = hya.caseid
left outer join app_history hyb on c.id = hyb.caseid
where hya.activityid in (303734) and hyb.activityid in (303724)
order by c.id asc
This nearly works.
I now have this issue:
460509|2|460509|15:15:39.000|460509|15:16:13.000|1
460509|2|460509|15:15:39.000|460509|15:18:13.000|3
460509|2|460509|15:17:52.000|460509|15:16:13.000|-1
460509|2|460509|15:17:52.000|460509|15:18:13.000|1
So I am now getting 1 row comparing each of the two for each of the four rows... mmm I think it is the best I can hope for. :(
Use a LEFT JOIN:
SELECT
a.caseid, a.createdate
,b.caseid, b.createdate
,CASE
WHEN b.caseid IS NOT NULL THEN DATEDIFF(mi,a.createdate,b.createdate)
ELSE NULL
END AS Difference
FROM app_history a
LEFT OUTER JOIN app_history b ON b.activityid=303724
WHERE a.activityid=303734
EDIT after a little more schema info...
SELECT
a.caseid, a.createdate
,b.caseid, b.createdate
,CASE
WHEN b.caseid IS NOT NULL THEN DATEDIFF(mi,a.createdate,b.createdate)
ELSE NULL
END AS Difference
FROM (SELECT MAX(ID) AS MaxID FROM app_history WHERE activityid=303734) aa
INNER JOIN app_history a ON aa.MaxID=a.ID
LEFT OUTER JOIN (SELECT MAX(ID) AS MaxID FROM app_history WHERE activityid=303724) bb ON 1=1
LEFT OUTER JOIN app_history b ON bb.MaxID=b.ID
Do something like this:
select datediff(
day,
(select isnull(hy.createdate,0) from app_history hy where hy.activityid =303734),
(select isnull(hy.createdate,0) from app_history hy where hy.activityid =303724)
)