mysql complex select query from multiple tables - database

Table Visits;
fields[id,patient_id(fk),doctor_id(fk),flag(Xfk),type(Xfk),time_booked,date,...]
Xfk = it refer to other table, but its not a must to exist so i dont use constrain.
SELECT `v`.`date`, `v`.`time_booked`, `v`.`stats`, `p`.`name` as pt_name,
`d`.`name` as dr_name, `f`.`name` as flag_name, `f`.`color` as flag_color,
`vt`.`name` as type, `vt`.`color` as type_color
FROM (`visits` v, `users` p, `users` d, `flags` f, `visit_types` vt)
WHERE `p`.`id`=`v`.`patient_id`
AND `d`.`id`=`v`.`doctor_id`
AND `v`.`flag`=`f`.`id`
AND `v`.`type`=`vt`.`id`
AND `v`.`date` >= '2013-02-27'
AND (v.date <= DATE_ADD('2013-02-27', INTERVAL 7 DAY))
AND (`v`.`doctor_id`='00002' OR `v`.`doctor_id`='00001')
ORDER BY `v`.`date` ASC, `v`.`time_booked` ASC;
One big statmeant i have !
my question is,
1: should i consider using join instead of select multiple tables ?
and if i should why ?
this query execution time is 0.0009 so i think its fine, and since i get all my data in one query, or is it bad practice ?
2: in the select part i want to say
if v.type != 0 select f.name,f.color else i dont want to select them nither there tables flags f
is it possible ?
also currently if flag was not found, it replicate all rows as much as flag table have in rows ! is there a way i can prevent this ? both for
flag and visit_types table ?

If it's running fast, I wouldn't mess with it. I generally prefer to use joins instead of matching stuff in the where clause.
Any chance you'd remove the ` characters? Just makes it a bit harder to read in my opinion.
Look at the case statement for MySQL: http://dev.mysql.com/doc/refman/5.0/en/case.html
select case when v.type <> 0 then
f.name
else
''
end as name, ...

Related

Does SQL Server CASE expression not do short circuit?

Appropriate if somebody could help us to optimize below query
As per execution plane it seems the else part sub query is always executing, irrespective of conditions.
Won't CASE be short circuited? Why is it executing, even though it is not necessary?
IF OBJECT_ID('#Calculation') IS NOT NULL
DROP TABLE #Calculation;
SELECT
Result.IdDeckungsbeitrag,
Result.wert AS Wert,
REPLACE(#Formula, '<#PackagingCosts> ', Result.wert) AS Kalkulation
INTO
#Calculation
FROM
(SELECT
deck.IdDeckungsbeitrag,
CASE
WHEN lp.ID_VERPACKUNG_2 IS NULL
AND lp.ID_VERPACKUNG_3 IS NULL
THEN vg.VERPACKUNGSKOSTEN_PRO_EINHEIT * deck.Menge
ELSE
(
SELECT SUM(temp.me) * vg.VERPACKUNGSKOSTEN_PRO_EINHEIT
FROM
(
SELECT SUM(gv.MENGE) AS me
FROM dbo.KUNDENRECHNUNG_POSITION krp
LEFT JOIN dbo.LIEFERSCHEIN_POSITION lp
ON lp.ID_LIEFERSCHEIN_POSITION = krp.ID_LIEFERSCHEIN_POSITION
LEFT JOIN dbo.GEBINDE_VERLADEN gv
ON gv.ID_LIEFERSCHEIN_POSITION = lp.ID_LIEFERSCHEIN_POSITION
LEFT JOIN dbo.MATERIAL_BESTAND mb
ON mb.ID_MATERIAL_BESTAND = gv.ID_MATERIAL_BESTAND
LEFT JOIN dbo.MATERIAL_GEBINDE mg
ON mg.ID_MATERIAL_BESTAND = mg.ID_MATERIAL_BESTAND
WHERE mg.CHARGE_NUMMER = deck.Charge
GROUP BY mg.ID_VERPACKUNG
) temp
)
END AS wert
FROM #DeckungsbeitragCalculationPositions_TVP deck
LEFT JOIN dbo.KUNDENRECHNUNG_POSITION krp
ON krp.ID_KUNDENRECHNUNG_POSITION = deck.IdDeckungsbeitrag
LEFT JOIN dbo.LIEFERSCHEIN_POSITION lp
ON lp.ID_LIEFERSCHEIN_POSITION = krp.ID_LIEFERSCHEIN_POSITION
LEFT JOIN dbo.VERPACKUNG vg
ON vg.ID_VERPACKUNG = lp.ID_VERPACKUNG_1
WHERE deck.IdMandant = #Id_Mandant
) Result;
Don't think of it as short-circuit or not. That's a procedural code mindset rather than a set-based mindset.
In SQL, the actual "execution" happens in the realm of table or index scans and seeks, hash matches, sorts, and the like. Pull data from different tables into a working set that will eventually produce the desired relational result.
Not seeing any real or projected execution plan, in this case (no pun intended), I suspect the query optimizer decided it was most efficient to first produce this result set in memory:
SELECT mg.ID_VERPACKUNG, mg.CHARGE_NUMMER, SUM(gv.MENGE) AS me
FROM dbo.KUNDENRECHNUNG_POSITION krp
LEFT JOIN dbo.LIEFERSCHEIN_POSITION lp
ON lp.ID_LIEFERSCHEIN_POSITION = krp.ID_LIEFERSCHEIN_POSITION
LEFT JOIN dbo.GEBINDE_VERLADEN gv
ON gv.ID_LIEFERSCHEIN_POSITION = lp.ID_LIEFERSCHEIN_POSITION
LEFT JOIN dbo.MATERIAL_BESTAND mb
ON mb.ID_MATERIAL_BESTAND = gv.ID_MATERIAL_BESTAND
LEFT JOIN dbo.MATERIAL_GEBINDE mg
ON mg.ID_MATERIAL_BESTAND = mg.ID_MATERIAL_BESTAND
GROUP BY mg.ID_VERPACKUNG, mg.CHARGE_NUMMER
This is the subquery from the ELSE clause, minus the WHERE clause conditions and with additional info added to the SELECT to make the match more effective. If the query optimizer can't be confident of meeting the WHEN clause a high percentage of the time, it might believe producing this larger ELSE set once to match against as needed is more efficient. Put another way, if it thinks it will have to run that subquery a lot anyway, it may try to pre-load it for all possible data.
We don't know enough about your database to suggest real solutions, but indexing around the ID_VERPACKUNG_2, ID_VERPACKUNG_3, and CHARGE_NUMMER fields might help. You might also be able to use a CTE, temp table, or table variable to help Sql Server to a better job of caching this data just once.
If the conditions of where are not met then the else will become active. If this is always the case then you would expect the following to return no rows:
select *
FROM #DeckungsbeitragCalculationPositions_TVP deck
LEFT JOIN dbo.KUNDENRECHNUNG_POSITION krp ON krp.ID_KUNDENRECHNUNG_POSITION = deck.IdDeckungsbeitrag
LEFT JOIN dbo.LIEFERSCHEIN_POSITION lp ON lp.ID_LIEFERSCHEIN_POSITION = krp.ID_LIEFERSCHEIN_POSITION
WHERE lp.ID_VERPACKUNG_2 IS NULL
AND lp.ID_VERPACKUNG_3 IS NULL

need to know how to remove duplicate rows from SQL to fetch data from a variety of tables;

I need data from a variety of tables and below is the only way I know to do it (I just know the basics). The query below works fine but shows duplicates. I need to know how to remove those.
SELECT DISTINCT
a.int_order_id, a.trans_id, a.wtn,a.isp_id,
d.first_name, d.middle_initial, d.last_name,
d.company_name, d.emaiL,
a.ilec_lob, a.node_type_id, a.cddd,
a.isp_ban, a.tos_version, a.isp_ckt_id,
a.isp_circuit_type, a.atm_vpi, a.atm_vci,
a.frs_dlci, b.order_create_date, b.pon,
b.order_status_id, e.trans_status_id,
e.description, c.STREET_NUMBER,
c.STREET_NUMBER_SUFFIX, c.DIRECTIONAL_ID,
c.street_name, c.thoroughfare_id,
c.street_suffix, c.address_line1, c.address_line2,
c.unit_type_id, c.unit_value, c.city, c.state_id, c.zip
FROM
VZEXTRACT1.vvov_os_ord_dsl A
JOIN
VZEXTRACT1.vvov_os_order_details B ON a.int_order_id = b.int_order_id
JOIN
VZEXTRACT1.vvov_os_ord_address C ON b.int_order_id = c.int_order_id
JOIN
vzextract1.vvov_os_ord_contact D ON c.int_order_id = d.int_order_id
JOIN
VZEXTRACT1.vv0v_trans_status E On b.order_status_id = e.trans_status_id
WHERE
a.isp_id NOT IN (657,500)
AND B.ORDER_CREATE_DATE >= to_date('01-may-15')
AND B.ORDER_CREATE_DATE < to_date('30-JUL-15')
When you want to remove duplicate rows you will have to use
Distinct : It will consider distinct of all selected columns so you will have to find out because of which column you are getting duplicate rows even if you have used distinct
Group by : You can use group by clause to get distinct rows. You can use aggregate function for column causing duplicate rows from point 1 and avoid duplicates.
over(partition by) : you can also use this clause for column causing duplicates. Like you concat values of such column ,
wm_concat(my_col)over(partition by id)

SQL Server LEFT JOIN

This query has been keeping me busy for the last couple of days. I tried to rewrite it with different ideas but I keep having the same problem. To simplify the problem I put part of my query in a view, this view returns 23 records. Using a left join I would like to add fields coming from the table tblDatPositionsCalc to these 23 records. As you can see I have an additional condition on the tblDatPositionsCalc in order to only consider the most recent records. With this condition it would return 21 records. The join should be on two fields together colAccount and colId.
I simply want the query to return the 23 records from the view and where possible have the information from tblDatPositionsCalc. There is actually only 2 records in the view without corresponding id and account in tblDatPositionsCalc, that means out of the 23 records only 2 will have missing values in the fields coming from the table tblDatPositionsCalc.
The problem with my query is that it only returns the 21 records from tblDatPositionsCalc. I don't understand why. I tried to move the condition on date in just after the JOIN condition but that did not help.
SELECT TOP (100) PERCENT
dbo.vwCurrPos.Account,
dbo.vwCurrPos.Id,
dbo.vwCurrPos.TickerBB,
dbo.vwCurrPos.colEquityCode,
dbo.vwCurrPos.colType,
dbo.vwCurrPos.colCcy,
dbo.vwCurrPos.colRegion,
dbo.vwCurrPos.colExchange,
dbo.vwCurrPos.[Instr Type],
dbo.vwCurrPos.colMinLastDay,
dbo.vwCurrPos.colTimeShift,
dbo.vwCurrPos.Strike,
dbo.vwCurrPos.colMultiplier,
dbo.vwCurrPos.colBetaVol,
dbo.vwCurrPos.colBetaEq,
dbo.vwCurrPos.colBetaFloor,
dbo.vwCurrPos.colBetaCurv,
dbo.vwCurrPos.colUndlVol,
dbo.vwCurrPos.colUndlEq,
dbo.vwCurrPos.colUndlFut,
tblDatPositionsCalc_1.colLots,
dbo.vwCurrPos.[Open Positions],
dbo.vwCurrPos.colListMatShift,
dbo.vwCurrPos.colStartTime,
tblDatPositionsCalc_1.colPrice,
tblDatPositionsCalc_1.colMktPrice,
dbo.vwCurrPos.colProduct,
dbo.vwCurrPos.colCalendar,
CAST(dbo.vwCurrPos.colExpiry AS DATETIME) AS colExpiry,
dbo.vwCurrPos.colEndTime,
CAST(tblDatPositionsCalc_1.colDate AS datetime) AS colDate,
dbo.vwCurrPos.colFund,
dbo.vwCurrPos.colExchangeTT,
dbo.vwCurrPos.colUserTag
FROM dbo.vwCurrPos
LEFT OUTER JOIN dbo.tblDatPositionsCalc AS tblDatPositionsCalc_1
ON tblDatPositionsCalc_1.colId = dbo.vwCurrPos.Id
AND tblDatPositionsCalc_1.colAccount = dbo.vwCurrPos.Account
WHERE (tblDatPositionsCalc_1.colDate =
(SELECT MAX(colDate) AS Expr1 FROM dbo.tblDatPositionsCalc))
ORDER BY
dbo.vwCurrPos.Account,
dbo.vwCurrPos.Id,
dbo.vwCurrPos.colEquityCode,
dbo.vwCurrPos.colRegion
Any idea what might cause the problem?
(Option 1) DrCopyPaste is right so your from clause would look like:
...
FROM dbo.vwCurrPos
LEFT OUTER JOIN dbo.tblDatPositionsCalc AS tblDatPositionsCalc_1
ON tblDatPositionsCalc_1.colId = dbo.vwCurrPos.Id
AND tblDatPositionsCalc_1.colAccount = dbo.vwCurrPos.Account
and (tblDatPositionsCalc_1.colDate =
(SELECT MAX(colDate) AS Expr1 FROM dbo.tblDatPositionsCalc))
...
reason: the where clause restriction of left joined to column = some expression with fail to return for "null = something" so the row will be removed.
(Option 2) As oppose to pushing code in to additional views where it is harder to maintain you can nest sql select statements;
select
X.x1,X.x2,
Y.*
from X
left join
(select Z.z1 as y1, Z.z2 as y2, Z.z3 as y3
from Z
where Z.z1 = (select max(Z.z1) from Z)
) as Y
on x.x1 = Y.y1 and X.x2 = Y.y2
The advantage here is you check each nested sub query a move out quickly. Although if you still building up more logic check out common table expressions (CTE's) http://msdn.microsoft.com/en-us/library/ms175972.aspx

Yet another subquery issue

Hello from an absolute beginner in SQL!
I have a field I want to populate based on another table. For this I have written this query, which fails with: Msg 512, Level 16, State 1, Line 1
Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression.
The statement has been terminated.
oK, here goes:
Update kre.CustomerOrderLineCopy
SET DepNo = (SELECT customerordercopy.DepNo
FROM kre.CustomerOrderCopy , kre.CustomerOrderLineCopy
WHERE CustomerOrderLineCopy.OrderCopyNo =kre.CustomerOrderCopy.OrderCopyNo)
WHERE CustomerOrderLineCopy.OrderCopyNo = (SELECT CustomerOrderCopy.OrderCopyNo
FROM kre.CustomerOrderCopy, kre.CustomerOrderLineCopy
WHERE kre.CustomerOrderLineCopy.OrderCopyNo = kre.CustomerOrderCopy.OrderCopyNo)
What I'm trying to do is to change DepNo in CustomerOrderLineCopy, with the value in DepNo in CustomerOrderCopy - based on the same OrderCopyNo in both tables.
I'm open for all suggestion.
Thanks,
ohalvors
If you just join the tables together the update is easier:
UPDATE A SET A.DepNo = B.DepNo
FROM kre.CustomerOrderLineCopy A
INNER JOIN kre.CustomerOrderCopy B ON A.OrderCopyNo = B.OrderCopyNo
The problem is that at least one of your sub queries return more than one value. Think about this:
tablePerson(name, age)
Adam, 11
Eva, 11
Sven 22
update tablePerson
set name = (select name from tablePerson where age = 11)
where name = 'Sven'
Which is equivalent to: set Sven's name to Adam and Eva. Which is not possible.
If you want to use sub queries, either make sure your sub queries can only return one value or force one value by using:
select top 1 xxx from ...
This may be enough to quieten it down:
Update kre.CustomerOrderLineCopy
SET DepNo = (SELECT customerordercopy.DepNo
FROM kre.CustomerOrderCopy --, kre.CustomerOrderLineCopy
WHERE CustomerOrderLineCopy.OrderCopyNo =kre.CustomerOrderCopy.OrderCopyNo)
WHERE CustomerOrderLineCopy.OrderCopyNo = (SELECT CustomerOrderCopy.OrderCopyNo
FROM kre.CustomerOrderCopy --, kre.CustomerOrderLineCopy
WHERE kre.CustomerOrderLineCopy.OrderCopyNo = kre.CustomerOrderCopy.OrderCopyNo)
(Where I've commented out kre.CustomerOrderLineCopy in the subqueries) That is, you were hopefully trying to correlate these subqueries with the outer table - not introduce another instance of kre.CustomerOrderLineCopy.
If you still get an error, then you still have multiple rows in kre.CustomerOrderCopy which have the same OrderCopyNo. If that's so, you need to give us (and SQL Server) the rules that you want to apply for how to select which row you want to use.
The danger of switching to the FROM ... JOIN form shown in #Avitus's answer is that it will no longer report if there are multiple matching rows - it will just silently pick one of them - which one is never made clear.
Now I look at the query again, I'm not sure it even needs a WHERE clause now. I think this is the same:
Update kre.CustomerOrderLineCopy
SET DepNo = (
SELECT customerordercopy.DepNo
FROM kre.CustomerOrderCopy
WHERE CustomerOrderLineCopy.OrderCopyNo = kre.CustomerOrderCopy.OrderCopyNo)

SQL Server 2008 Stored Proc Performance where Column = NULL

When I execute a certain stored procedure (which selects from a non-indexed view) with a non-null parameter, it's lightning fast at about 10ms. When I execute it with a NULL parameter (resulting in a FKColumn = NULL query) it's much slower at about 1200ms.
I've executed it with the actual execution plan and it appears the most costly portion of the query is a clustered index scan with the predicate IS NULL on the fk column in question - 59%! The index covering this column is (AFAIK) good.
So what can I do to improve the performance here? Change the fk column to NOT NULL and fill the nulls with a default value?
SELECT top 20 dbo.vwStreamItems.ItemId
,dbo.vwStreamItems.ItemType
,dbo.vwStreamItems.AuthorId
,dbo.vwStreamItems.AuthorPreviewImageURL
,dbo.vwStreamItems.AuthorThumbImageURL
,dbo.vwStreamItems.AuthorName
,dbo.vwStreamItems.AuthorLocation
,dbo.vwStreamItems.ItemText
,dbo.vwStreamItems.ItemLat
,dbo.vwStreamItems.ItemLng
,dbo.vwStreamItems.CommentCount
,dbo.vwStreamItems.PhotoCount
,dbo.vwStreamItems.VideoCount
,dbo.vwStreamItems.CreateDate
,dbo.vwStreamItems.Language
,dbo.vwStreamItems.ProfileIsFriendsOnly
,dbo.vwStreamItems.IsActive
,dbo.vwStreamItems.LocationIsFriendsOnly
,dbo.vwStreamItems.IsFriendsOnly
,dbo.vwStreamItems.IsDeleted
,dbo.vwStreamItems.StreamId
,dbo.vwStreamItems.StreamName
,dbo.vwStreamItems.StreamOwnerId
,dbo.vwStreamItems.StreamIsDeleted
,dbo.vwStreamItems.RecipientId
,dbo.vwStreamItems.RecipientName
,dbo.vwStreamItems.StreamIsPrivate
,dbo.GetUserIsFriend(#RequestingUserId, vwStreamItems.AuthorId) as IsFriend
,dbo.GetObjectIsBookmarked(#RequestingUserId, vwStreamItems.ItemId) as IsBookmarked
from dbo.vwStreamItems WITH (NOLOCK)
where 1 = 1
and vwStreamItems.IsActive = 1
and vwStreamItems.IsDeleted = 0
and vwStreamItems.StreamIsDeleted = 0
and (
StreamId is NULL
or
ItemType = 'Stream'
)
order by CreateDate desc
When it's not null, do you have
and vwStreamItems.StreamIsDeleted = 0
and (
StreamId = 'xxx'
or
ItemType = 'Stream'
)
or
and vwStreamItems.StreamIsDeleted = 0
and (
StreamId = 'xxx'
)
You have an OR clause there which is most likely the problem, not the IS NULL as such.
The plans will show why: the OR forces a SCAN but it's manageable with StreamId = 'xxx'. When you use IS NULL, you lose selectivity.
I'd suggest changing your index make StreamId the right-most column.
However, a view is simply a macro that expands so the underlying query on the base tables could be complex and not easy to optimise...
The biggest performance gain would be for you to try to loose GetUserIsFriend and GetObjectIsBookmarked functions and use JOIN to make the same functionality. Using functions or stored procedures inside a query is basically the same as using FOR loop - the items are called 1 by 1 to determine the value of a function. If you'd use joining tables instead, all of the items values would be determined together as a group in 1 pass.

Resources