Generate hierarchy of employees - sql-server

I have two tables: EmployeeMaster and EmployeeDetails. The schema of both is shown below:
Sample data in both tables is shown:
I want to generate the hierarchy primarily from the EmployeeDetails table. This table contains a column named Manager. The EmployeeId of the manager needs to be looked up in the EmployeeMaster table.
This is how the hierarchy needs to be formed. An EmployeeId is passed as a parameter to a stored procedure. The two supervisors above this employee need to be picked, along with 10 employees below this employee in seniority.
For instance, I pass the EmployeeId of Josh.Berkus to the stored procedure. The stored procedure query should return hierarchy as below:
I want the final output in this format:
Employee_Id .... Manager_Id
----------- .... ------------
Please note that Manager_Id is the EmployeeId of Manager.
I tried using a CTE with a UNION ALL query, but I was not able to get it to work correctly.

Actually, you will need to work out the recursion, since one manager can have a manager of their own...
Take a look at:
http://msdn.microsoft.com/en-us/library/ms190766(v=sql.105).aspx
http://msdn.microsoft.com/en-us/library/ms186243(v=sql.105).aspx
The thing is that you're going to need 2 queries: one to go "up" the hierarchy and one to go "down", and then union the results.
Why don't you merge the two tables, since one person can't have 2 managers, right? Especially because a manager is also an employee. This will simplify everything.
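The "one query up, one query down, then union" idea can be sketched with two recursive CTEs. This is only a rough sketch under assumptions, since the schema images are missing: it assumes EmployeeMaster has EmployeeId and UserName columns, that EmployeeDetails has EmployeeId and a Manager column holding the manager's user name, and it reads "10 employees below" as the first 10 subordinates ordered by depth. The id value 42 is made up.

```sql
DECLARE @EmployeeId int = 42;  -- hypothetical id; would be the stored procedure parameter

WITH Links AS (
    -- Resolve each employee's manager user name to the manager's EmployeeId (assumed columns)
    SELECT ed.EmployeeId, em.EmployeeId AS ManagerId
    FROM EmployeeDetails ed
    LEFT JOIN EmployeeMaster em ON em.UserName = ed.Manager
),
Up AS (
    -- Anchor: the employee itself
    SELECT l.EmployeeId, l.ManagerId, 0 AS Lvl
    FROM Links l
    WHERE l.EmployeeId = @EmployeeId
    UNION ALL
    -- Walk up, stopping after two supervisors
    SELECT l.EmployeeId, l.ManagerId, u.Lvl + 1
    FROM Links l
    INNER JOIN Up u ON l.EmployeeId = u.ManagerId
    WHERE u.Lvl < 2
),
Down AS (
    -- Anchor: direct reports
    SELECT l.EmployeeId, l.ManagerId, 1 AS Lvl
    FROM Links l
    WHERE l.ManagerId = @EmployeeId
    UNION ALL
    -- Walk down through reports of reports
    SELECT l.EmployeeId, l.ManagerId, d.Lvl + 1
    FROM Links l
    INNER JOIN Down d ON l.ManagerId = d.EmployeeId
)
SELECT EmployeeId AS Employee_Id, ManagerId AS Manager_Id
FROM Up
UNION ALL
SELECT Employee_Id, Manager_Id
FROM (SELECT TOP (10) EmployeeId AS Employee_Id, ManagerId AS Manager_Id
      FROM Down
      ORDER BY Lvl, EmployeeId) AS d;
```

If the data contains a cycle (an employee who is indirectly their own manager), the recursion will stop with an error once it hits the default MAXRECURSION limit of 100 rather than loop forever.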

You can use a CROSS JOIN to pair up all your records and then add a condition that keeps only the rows that have a manager-employee relationship between them.
The code should be something like this:
SELECT
    ed.employeeid AS 'Employee ID',
    em.employeeid AS 'Manager ID'
FROM EMPLOYEEMASTER em CROSS JOIN EMPLOYEEDETAILS ed
WHERE ed.manager = em.username

You’ll need to implement some recursion here in order to get full hierarchy.
Here is a quick and dirty example of how you can implement this to get the manager hierarchy. You would need something similar for the lower-level hierarchy too.
create function dbo.GetManagerHierarchy
(
    @EmpID int
)
returns varchar(100)
as
begin
    declare @result varchar(100)
    declare @managerId int
    -- Assumes EmployeeDetails.Manager holds the manager's EmployeeId
    SET @managerId = (select top 1 Manager from EmployeeDetails where EmployeeId = @EmpID)
    if @managerId is not null
        -- Recurse upwards first, then append this level's manager
        SET @result = dbo.GetManagerHierarchy(@managerId) + '-' + CONVERT(varchar(100), @managerId)
    else
        SET @result = ''
    return @result
end

Related

What is the easiest and most efficient way to find a specific value in database tables?

As per my requirement, I have to find which tables and columns contain a value such as xyz@test.com. The database is huge and has more than 2,500 tables.
Can anyone please suggest an efficient way to find this type of value in the database? I've created a loop query, which took more than 9 hours to run.
9 hours is clearly a long time. Furthermore, 2,500 tables seems close to insanity to me.
Here is one approach that will run one query per table, not one per column. I have no idea how this will perform against 2,500 tables. I suspect it may be horrible. That said, I would strongly suggest testing with a filter first, like Table_Name like 'OD%'.
Example
Declare @Search varchar(max) = 'cappelletti' -- Exact match '"cappelletti"'
Create Table #Temp (TableName varchar(500),RecordData xml)
Declare @SQL varchar(max) = ''
Select @SQL = @SQL+ ';Insert Into #Temp Select TableName='''+concat(quotename(Table_Schema),'.',quotename(table_name))+''',RecordData = (Select A.* for XML RAW) From '+concat(quotename(Table_Schema),'.',quotename(table_name))+' A Where (Select A.* for XML RAW) like ''%'+@Search+'%'''+char(10)
From INFORMATION_SCHEMA.Tables
Where Table_Type ='BASE TABLE'
  and Table_Name like 'OD%' -- **** Would REALLY Recommend a REASONABLE Filter *** --
Exec(@SQL)
Select A.TableName
      ,B.*
      ,A.RecordData
From #Temp A
Cross Apply (
    Select ColumnName = a.value('local-name(.)','varchar(100)')
          ,Value = a.value('.','varchar(max)')
    From A.RecordData.nodes('/row') as C1(n)
    Cross Apply C1.n.nodes('./@*') as C2(a)
    Where a.value('.','varchar(max)') Like '%'+@Search+'%'
) B
Drop Table #Temp
Returns
If it Helps, the individual queries would look like this
Select TableName='[dbo].[OD]'
,RecordData= (Select A.* for XML RAW)
From [dbo].[OD] A
Where (Select A.* for XML RAW) like '%cappelletti%'
On a side-note, you can search numeric data and even dates.
Make a procedure that collects the table names and their VARCHAR columns from the system tables and stores them in a temp table.
Then loop over each record and build a dynamic query that compares the column with the input parameter (the email address) using an = condition.
If the condition is matched in any statement, checked with IF EXISTS, store that table name and column name in another temp table, and retrieve the list of those records from that temp table at the end of the execution.
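The procedure described above could be sketched roughly as follows. This is a sketch under assumptions rather than a tested solution: #Hits and the variable names are made up for illustration, and a single concatenated batch stands in for an explicit loop. It issues one IF EXISTS probe per character column, so on 2,500 tables it will still be slow unless the searched columns are indexed.

```sql
DECLARE @Search nvarchar(200) = N'xyz@test.com';  -- value to look for

-- Hypothetical result table: one row per column that contains the value
CREATE TABLE #Hits (TableName nvarchar(600), ColumnName sysname);

DECLARE @SQL nvarchar(max) = N'';

-- Build one "IF EXISTS (...) INSERT INTO #Hits ..." statement per character column
SELECT @SQL = @SQL
    + N'IF EXISTS (SELECT 1 FROM ' + x.FullName
    + N' WHERE ' + QUOTENAME(c.COLUMN_NAME) + N' = @Search) '
    + N'INSERT INTO #Hits (TableName, ColumnName) VALUES (N'''
    + REPLACE(x.FullName, N'''', N'''''') + N''', N'''
    + REPLACE(c.COLUMN_NAME, N'''', N'''''') + N''');' + NCHAR(10)
FROM INFORMATION_SCHEMA.COLUMNS AS c
INNER JOIN INFORMATION_SCHEMA.TABLES AS t
    ON t.TABLE_SCHEMA = c.TABLE_SCHEMA AND t.TABLE_NAME = c.TABLE_NAME
CROSS APPLY (SELECT QUOTENAME(c.TABLE_SCHEMA) + N'.' + QUOTENAME(c.TABLE_NAME) AS FullName) AS x
WHERE t.TABLE_TYPE = 'BASE TABLE'
  AND c.DATA_TYPE IN ('varchar', 'nvarchar', 'char', 'nchar');

-- The search value is passed as a parameter, not concatenated into the SQL text
EXEC sp_executesql @SQL, N'@Search nvarchar(200)', @Search = @Search;

SELECT TableName, ColumnName FROM #Hits;
DROP TABLE #Hits;
```

Because the comparison is an exact match (=) rather than LIKE '%...%', SQL Server can use an index on the column where one exists, which is the main reason this can beat a wildcard scan.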

SSRS stepped report based on number

Using SSRS with SQL Server 2008 R2 (Visual Studio environment).
I am trying to produce a stepped-down report based on a level/value in a table on SQL Server. The level acts as an indent position, with sort_value being the recursive parent in the report.
Sample of table in SQL Server:
Sample of output required
OK, I've come up with a solution but please note the following before you proceed.
1. The process relies on the data being in the correct order, as per your sample data.
2. If this is your real data structure, I strongly recommend you review it.
OK, so the first thing I did was recreate your table exactly as per your example. I called the table Stepped as I couldn't think of anything else!
The following code can then be used as your dataset in SSRS but you can obviously just run the T-SQL directly to see the output.
-- Create a copy of the data with a row number. This means the input data MUST be in the correct order.
DECLARE @t TABLE(RowN int IDENTITY(1,1), Sort_Order int, [Level] int, Qty int, Currency varchar(20), Product varchar(20))
INSERT INTO @t (Sort_Order, [Level], Qty, Currency, Product)
SELECT * FROM Stepped
-- Update the table so each row where the sort_order is NULL will take the sort order from the row above
UPDATE a SET Sort_Order = b.Sort_Order
FROM @t a
JOIN @t b on a.RowN = b.rowN+1
WHERE a.Sort_Order is null and b.Sort_Order is not null
-- repeat this until we're done.
WHILE @@ROWCOUNT >0
BEGIN
    UPDATE a SET Sort_Order = b.Sort_Order
    FROM @t a
    JOIN @t b on a.RowN = b.rowN+1
    WHERE a.Sort_Order is null and b.Sort_Order is not null
END
-- Now we can select from our new table sorted by both sort order and level.
-- We also separate out the products based on their level.
SELECT
    CASE Level WHEN 1 THEN Product ELSE NULL END as ProdLvl_1
    , CASE Level WHEN 2 THEN Product ELSE NULL END as ProdLvl_2
    , CASE Level WHEN 3 THEN Product ELSE NULL END as ProdLvl_3
    , QTY
    , Currency
FROM @t s
ORDER BY Sort_Order, Level
The output looks like this...
You may also want to consider swapping out the final statement for this.
-- Alternatively use this style and use a single column in the report.
-- This is better if the number of levels can change.
SELECT
REPLICATE('--', Level-1) + Product as Product
, QTY
, Currency
FROM @t s
ORDER BY Sort_Order, Level
As this will give you a single column for 'product' indented like this.

Downsides to MERGE with dummy USING table?

I'm creating some stored procedures specifically for use from C# to update various tables in our database. A large number of items require a predictable function that will:
1) Check if a matching row already exists
2) If it doesn't exist, insert data
3) Gather ID of row and return to user
I know this can be done in a number of ways, but the most elegant way I can imagine seems to be using a MERGE with a dummy table and using the procedure params for the ON clause, such as:
CREATE PROCEDURE dbo.UpdatePerson(@PersonID INT, @FirstName VARCHAR(50)) AS
MERGE dbo.Person p
USING (SELECT 1 One) One
ON p.Person_ID = @PersonID
WHEN MATCHED THEN
    UPDATE SET First_Name = @FirstName
WHEN NOT MATCHED THEN
    INSERT (Person_ID, First_Name) VALUES (@PersonID, @FirstName);
This wraps it all together in one nice bundle, even though I'm not working with an actual table to merge in. I know the same basic idea could be accomplished with:
...
USING (SELECT @PersonID Person_ID, @FirstName First_Name) NewPerson
ON p.Person_ID = NewPerson.Person_ID
...
and maybe this would offer some kind of performance benefit?
Can anyone offer any solid reasons for/against this kind of usage of MERGE?
Instead of using MERGE, you can use an IF condition.
Suppose you have the incoming data in a temp table:
CREATE TABLE #Table(PersonID INT,First_Name VARCHAR(100))
-- BEFORE THAT INSERT INTO TEMP TABLE
IF EXISTS(SELECT 1 FROM YOURTABLE WHERE PERSONID IN(SELECT PERSONID FROM #TABLE))
BEGIN
-------YOUR UPDATE QUERY
END
ELSE
BEGIN
-------INSERT QUERY
END
DROP TABLE #Table
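For the third requirement in the question (return the ID of the row to the caller), one option is to keep the MERGE and add an OUTPUT clause. The following is only a sketch: the procedure name is made up, the source row is built from the parameters as in the question's second variant, and the HOLDLOCK hint is an addition that is commonly recommended so two concurrent callers cannot both take the NOT MATCHED branch for the same key.

```sql
CREATE PROCEDURE dbo.UpsertPersonReturnId  -- hypothetical name
    @PersonID  INT,
    @FirstName VARCHAR(50)
AS
BEGIN
    SET NOCOUNT ON;

    MERGE dbo.Person WITH (HOLDLOCK) AS p
    USING (SELECT @PersonID AS Person_ID, @FirstName AS First_Name) AS src
        ON p.Person_ID = src.Person_ID
    WHEN MATCHED THEN
        UPDATE SET First_Name = src.First_Name
    WHEN NOT MATCHED THEN
        INSERT (Person_ID, First_Name) VALUES (src.Person_ID, src.First_Name)
    -- One row comes back to the caller: what happened and which row it was
    OUTPUT $action AS MergeAction, inserted.Person_ID;
END
```

From C#, the single output row can be read with ExecuteReader; MergeAction will be 'INSERT' or 'UPDATE'.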

Update statement to update multiple rows

I have a question regarding the following syntax. Is there a cleaner way to roll this up into one statement rather than two? I've tried several iterations, but this seems to be the only way I can successfully execute these two statements.
UPDATE employee
SET hire_date = '1979-03-15'
WHERE emp_id = 'PMA42628M'
UPDATE employee
SET hire_date = '1988-12-22'
where emp_id = 'PSA89086M';
I tried this as well, and I also tried using an AND statement. Neither worked. Basically I am looking for a less newbie way than the method above, if one exists. I spent a long time searching and did not find one.
UPDATE employee
SET hire_date = ('1979-03-15', '1988-12-22')
WHERE emp_id = ('PMA42628M', 'PSA89086M');
I appreciate any advice on this one, and by the way, I am using SQL Server.
Thanks
Try this one, this will combine multiple selects and returns them as if they come from the DB:
UPDATE e
SET hire_date = t.hire_date
FROM dbo.employee e
JOIN (
SELECT emp_id = 'PMA42628M', hire_date = '1979-03-15'
UNION ALL
SELECT emp_id = 'PSA89086M', hire_date = '1988-12-22'
) t ON t.emp_id = e.emp_id
If you are using SQL Server 2008 or later version, you could also use a different syntax for the derived table:
UPDATE e
SET hire_date = t.hire_date
FROM dbo.employee e
JOIN (
VALUES
('PMA42628M', '1979-03-15'),
('PSA89086M', '1988-12-22')
) t (emp_id, hire_date) ON t.emp_id = e.emp_id
I am looking for a less newbie way
Doing two separate update statements is (in my opinion) "the less newbie way". You could complicate things and do something like this:
update employee
set hire_date = case emp_id
when 'PMA42628M' then '1979-03-15'
when 'PSA89086M' then '1988-12-22'
end
where emp_id in ('PMA42628M', 'PSA89086M')
but what would that gain you? The entire update would run in one implicit transaction so if you want your two updates to be in a transaction you just use begin transaction .... commit.
You can make a temporary table or a table variable containing the updates you want to do, then run the UPDATE statement linking the table to the table you intend to update.
Note that for two updates, you get two statements: the INSERT into the update table and the UPDATE statement itself. The number of statements remains two, however many updates you need to do.
CREATE TABLE #employee (emp_id VARCHAR(9) NOT NULL PRIMARY KEY,hire_date DATE NOT NULL);
INSERT INTO #employee (emp_id,hire_date)
VALUES ('PMA42628M','2013-06-05'),('PSA89086M','2013-06-05');
CREATE TABLE #target_updates(emp_id VARCHAR(9) NOT NULL,hire_date DATE NOT NULL);
INSERT INTO #target_updates (emp_id,hire_date)
VALUES ('PMA42628M','1979-03-15'),('PSA89086M','1988-12-22');
UPDATE
#employee
SET
hire_date=tu.hire_date
FROM
#employee AS e
INNER JOIN #target_updates AS tu ON
tu.emp_id=e.emp_id;
SELECT
*
FROM
#employee
ORDER BY
emp_id;
DROP TABLE #target_updates;
DROP TABLE #employee;
update table_name set column_name = 'value' where orgid in (idnum1, idnum2)

Need help improving query performance

I need help with improving the performance of the following SQL query. The database design of this application is based on OLD mainframe entity designs. All the query does is return a list of clients based on some search criteria:
@advisers: only returns clients that were captured by these advisers.
@outlets: just ignore this one
@searchText: (firstname, surname, suburb, policy number) any combination of these
What I'm doing is creating a temporary table, then querying all the tables involved to build my own dataset, and then inserting that dataset into an easily understandable table (@Clients).
This query takes 20 seconds to execute and currently only returns 7 rows!
Screenshot of all table count can be found here: Table Record Count
Any ideas where I can start to optimize this query?
ALTER PROCEDURE [dbo].[spOP_SearchDashboard]
@advisers varchar(1000),
@outlets varchar(1000),
@searchText varchar(1000)
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
-- Set the prefixes to search for (firstname, surname, suburb, policy number)
DECLARE @splitSearchText varchar(1000)
SET @splitSearchText = REPLACE(@searchText, ' ', ',')
DECLARE @AdvisersListing TABLE
(
adviser varchar(200)
)
DECLARE @SearchParts TABLE
(
prefix varchar(200)
)
DECLARE @OutletListing TABLE
(
outlet varchar(200)
)
INSERT INTO @AdvisersListing(adviser)
SELECT part as adviser FROM SplitString (@advisers, ',')
INSERT INTO @SearchParts(prefix)
SELECT part as prefix FROM SplitString (@splitSearchText, ',')
INSERT INTO @OutletListing(outlet)
SELECT part as outlet FROM SplitString (@outlets, ',')
DECLARE @Clients TABLE
(
source varchar(2),
adviserId bigint,
integratedId varchar(50),
rfClientId bigint,
ifClientId uniqueidentifier,
title varchar(30),
firstname varchar(100),
surname varchar(100),
address1 varchar(500),
address2 varchar(500),
suburb varchar(100),
state varchar(100),
postcode varchar(100),
policyNumber varchar(100),
lastAccess datetime,
deleted bit
)
INSERT INTO @Clients
SELECT
source, adviserId, integratedId, rfClientId, ifClientId, title,
firstname, surname, address1, address2, suburb, state, postcode,
policyNumber, max(lastAccess) as lastAccess, deleted
FROM
(SELECT DISTINCT
'RF' as Source,
advRel.SourceEntityId as adviserId,
cast(pe.entityId as varchar(50)) AS IntegratedID,
pe.entityId AS rfClientId,
cast(ifClient.Id as uniqueidentifier) as ifClientID,
ISNULL(p.title, '') AS title,
ISNULL(p.firstname, '') AS firstname,
ISNULL(p.surname, '') AS surname,
ISNULL(ct.address1, '') AS address1,
ISNULL(ct.address2, '') AS address2,
ISNULL(ct.suburb, '') AS suburb,
ISNULL(ct.state, '') AS state,
ISNULL(ct.postcode, '') AS postcode,
ISNULL(contract.policyNumber,'') AS policyNumber,
coalesce(pp.LastAccess, d_portfolio.dateCreated, pd.dateCreated) AS lastAccess,
ISNULL(client.deleted, 0) as deleted
FROM
tbOP_Entity pe
INNER JOIN tbOP_EntityRelationship advRel ON pe.EntityId = advRel.TargetEntityId
AND advRel.RelationshipId = 39
LEFT OUTER JOIN tbOP_Data pd ON pe.EntityId = pd.entityId
LEFT OUTER JOIN tbOP__Person p ON pd.DataId = p.DataId
LEFT OUTER JOIN tbOP_EntityRelationship ctr ON pe.EntityId = ctr.SourceEntityId
AND ctr.RelationshipId = 79
LEFT OUTER JOIN tbOP_Data ctd ON ctr.TargetEntityId = ctd.entityId
LEFT OUTER JOIN tbOP__Contact ct ON ctd.DataId = ct.DataId
LEFT OUTER JOIN tbOP_EntityRelationship ppr ON pe.EntityId = ppr.SourceEntityId
AND ppr.RelationshipID = 113
LEFT OUTER JOIN tbOP_Data ppd ON ppr.TargetEntityId = ppd.EntityId
LEFT OUTER JOIN tbOP__Portfolio pp ON ppd.DataId = pp.DataId
LEFT OUTER JOIN tbOP_EntityRelationship er_policy ON ppd.EntityId = er_policy.SourceEntityId
AND er_policy.RelationshipId = 3
LEFT OUTER JOIN tbOP_EntityRelationship er_contract ON er_policy.TargetEntityId = er_contract.SourceEntityId AND er_contract.RelationshipId = 119
LEFT OUTER JOIN tbOP_Data d_contract ON er_contract.TargetEntityId = d_contract.EntityId
LEFT OUTER JOIN tbOP__Contract contract ON d_contract.DataId = contract.DataId
LEFT JOIN tbOP_Data d_portfolio ON ppd.EntityId = d_portfolio.EntityId
LEFT JOIN tbOP__Portfolio pt ON d_portfolio.DataId = pt.DataId
LEFT JOIN tbIF_Clients ifClient on pe.entityId = ifClient.RFClientId
LEFT JOIN tbOP__Client client on client.DataId = pd.DataId
where
p.surname <> ''
AND (advRel.SourceEntityId IN (select adviser from @AdvisersListing)
OR
pp.outlet COLLATE SQL_Latin1_General_CP1_CI_AS in (select outlet from @OutletListing)
)
) as RFClients
group by
source, adviserId, integratedId, rfClientId, ifClientId, title,
firstname, surname, address1, address2, suburb, state, postcode,
policyNumber, deleted
SELECT * FROM @Clients --THIS ONLY RETURNS 10 RECORDS WITH MY CURRENT DATASET
END
Clarifying questions
What is the MAIN piece of data that you are querying on - advisers, search-text, outlets?
It feels like your criteria allow users to search in many different ways. A sproc will normally reuse the SAME cached plan for every question you ask of it, whatever the parameter values. You may get better performance by using several sprocs, each tuned for a specific search scenario (e.g. I bet you could write something blazingly fast for querying just by policy number).
If you can separate your search-text into INDIVIDUAL parameters then you may be able to:
1. Search for adviser relationships matching your supplied list and store them in a temp table (or table variable).
2. IF ANY surnames have been specified then delete all records from temp which aren't for people with your supplied names.
3. Repeat for other criteria lists, all the time reducing your temp records.
4. THEN join to the outer-join stuff and return the results.
In your notes you say that outlets can be ignored. If this is true then taking them out would simplify your query. The "or" clause in your example means that SQL-Server needs to find ALL relationships for ALL portfolios before it can realistically get down to the business of filtering the results that you actually want.
Breaking the query up
Most of your query consists of outer joins that are not involved in filtering. Try moving these joins into a separate select (i.e. AFTER you have applied all of your criteria). When SQL-Server sees lots of tables then it switches off some of its possible optimisations. So your first step (assuming that you always specify advisers) is just:
SELECT advRel.SourceEntityId as adviserId,
advRel.TargetEntityId AS rfClientId
INTO #temp1
FROM @AdvisersListing advisers
INNER JOIN tbOP_EntityRelationship advRel
ON advRel.SourceEntityId = advisers.adviser
AND advRel.RelationshipId = 39;
The link to tbOP_Entity (aliased as "pe") does not look like it is needed for its data. So you should be able to replace all references to "pe.EntityId" with "advRel.TargetEntityId".
The DISTINCT clause and the GROUP-BY are probably trying to achieve the same thing - and both of them are really expensive. Normally you find ONE of these used when a previous developer has not been able to get his results right. Get rid of them - check your results - if you get duplicates then try to filter the duplicates out. You may need ONE of them if you have temporal data - you definitely don't need both.
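Continuing from the #temp1 step above, the later steps (narrow the candidates first, do the outer joins last) might look roughly like this. It is only a sketch meant to run in the same procedure after #temp1 has been filled: @Surnames is a hypothetical table variable for surname prefixes passed as their own parameter, and the final column list is cut down for brevity.

```sql
-- Hypothetical: surname prefixes supplied separately from the other search text
DECLARE @Surnames TABLE (prefix varchar(200));

-- Only if surnames were supplied, remove candidates that match none of them
IF EXISTS (SELECT 1 FROM @Surnames)
BEGIN
    DELETE t
    FROM #temp1 t
    WHERE NOT EXISTS (
        SELECT 1
        FROM tbOP_Data pd
        INNER JOIN tbOP__Person p ON p.DataId = pd.DataId
        INNER JOIN @Surnames s ON p.surname LIKE s.prefix + '%'
        WHERE pd.entityId = t.rfClientId
    );
END

-- Last step: outer-join the descriptive tables against the (now small) candidate list
SELECT t.adviserId,
       t.rfClientId,
       ISNULL(p.title, '')     AS title,
       ISNULL(p.firstname, '') AS firstname,
       ISNULL(p.surname, '')   AS surname
FROM #temp1 t
LEFT OUTER JOIN tbOP_Data pd    ON pd.entityId = t.rfClientId
LEFT OUTER JOIN tbOP__Person p  ON p.DataId = pd.DataId;
```

The point of the ordering is that the expensive outer joins only ever touch the handful of rows that survived the filters.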
Indexes
Make sure that the @AdvisersListing.adviser column is the same data type as SourceEntityId and that SourceEntityId is indexed. If the column has a different data type then SQL-Server won't want to use the index (so you would want to change the data type on @AdvisersListing).
The tbOP_EntityRelationship table sounds like it should have an index something like:
CREATE UNIQUE INDEX advRel_idx1 ON tbOP_EntityRelationship (SourceEntityId,
RelationshipId, TargetEntityId);
If this exists then SQL-Server should be able to get everything it needs by ONLY going to the index pages (rather than to the table pages). This is known as a "covering" index.
There should be a slightly different index on tbOP_Data (assuming it has a clustered primary key on DataId):
CREATE INDEX tbOP_Data_idx1 ON tbOP_Data (entityId) INCLUDE (dateCreated);
SQL-Server will store the keys from the table's clustered index (which I assume will be DataId) along with the value of "dateCreated" in the index leaf pages. So again we have a "covering" index.
Most of the other tables (tbOP__Client, etc) should have indexes on DataId.
Query plan
Unfortunately I couldn't see the explain-plan picture (our firewall ate it). However, one useful tip is to hover your mouse over some of the join lines. It tells you how many records will be accessed.
Watch out for full table scans. If SQL-Server needs to use them then it has pretty much given up on your indexes.
Database structure
It's been designed as a transaction database. The level of normalization (and all of the EntityRelationship-this and Data-that) is really painful for reporting. You really need to consider having a separate reporting database that unravels some of this information into a more usable structure.
If you are running reports directly against your production database then I would expect a bunch of locking problems and resource contention.
Hope this has been useful - it's the first time I've posted here. It has been ages since I last tuned a query in my current company (they have a bunch of stern-faced DBAs for sorting this sort of thing out).
Looking at your execution plan... 97% of the cost of your query is in processing the DISTINCT clause. I'm not sure it is even necessary since you are taking all that data and doing a group by on it anyway. You might want to take it out and see how that affects the plan.
That kind of query is just going to take time. With that many joins and that many temp tables, there's just nothing easy or efficient about it. One trick I have been using is copying the parameters into local variables. It might not be an all-out solution, but if it shaves a few seconds, it's worth it.
DECLARE @Xadvisers varchar(1000)
DECLARE @Xoutlets varchar(1000)
DECLARE @XsearchText varchar(1000)
SET @Xadvisers = @advisers
SET @Xoutlets = @outlets
SET @XsearchText = @searchText
Believe me, I have tested it thoroughly, and it helps with complicated scripts. It has to do with the way SQL Server handles local variables: their values are not known when the plan is compiled, so the plan is not tailored to one particular set of parameter values (in other words, it sidesteps parameter sniffing). Good luck!
