SQL Server Dynamic Search Based on Stored Search Parameters - sql-server

I have to create a dynamic search. The search criteria are stored in tables, and there is a main table for the stored records. Here is the structure:
--Main table. It stores a user's records; basically we store files in this table. Each file is associated with a single city.
DECLARE @Records TABLE(
[RecordId] INT IDENTITY(1,1),
[FileName] VARCHAR(100),
[OwnerId] INT,
[CityId] INT
)
INSERT INTO @Records
SELECT 'A', 100, 101 UNION
SELECT 'B', 100, 102 UNION
SELECT 'C', 100, 103 UNION
SELECT 'D', 100, 104
--The next table is used to associate a file with multiple friends.
DECLARE @FriendRecords TABLE
(
[Id] INT IDENTITY(1,1),
[RecordId] INT,
[FriendId] INT
)
INSERT INTO @FriendRecords
SELECT 1, 201 UNION --File '1' is associated with 'FriendId' 201
SELECT 1, 202 UNION --File '1' is associated with 'FriendId' 202
SELECT 2, 201 UNION --File '2' is associated with 'FriendId' 201
SELECT 3, 202 --File '3' is associated with 'FriendId' 202
--The following table is used to create a criteria for a user.
DECLARE @Criteria TABLE
(
[CriteriaId] INT IDENTITY(1,1),
[CriteriaName] VARCHAR(50),
[OwnerId] INT
)
INSERT INTO @Criteria
SELECT 'SampleCriteria', 100 --Criteria created by user 100
--The following table stores the cities that need to be searched in the 'Records' table for the owner of criteriaId '1'.
DECLARE @CriteriaCities TABLE(
[CriteriaCityId] INT IDENTITY(1,1),
[CriteriaId] INT,
[CityId] INT
)
INSERT INTO @CriteriaCities
SELECT 1, 101 UNION
SELECT 1, 102 UNION
SELECT 1, 103
--The following table stores the friends that need to be searched in the 'FriendRecords' table for the owner of criteriaId '1'.
DECLARE @CriteriaFriend TABLE(
[CriteriaCityId] INT IDENTITY(1,1),
[CriteriaId] INT,
[FriendId] INT
)
INSERT INTO @CriteriaFriend
SELECT 1, 202;
Basically, the user can create a criteria (@Criteria table) and store the search parameters (@CriteriaCities and @CriteriaFriend tables). The requirement is to get the files according to the stored criteria. The query I am looking for should return records from the @Records table that have CityIds 101, 102 and 103 AND FriendId '201'. The result is only 'C' from the @Records table. If I left join all the tables, I get the other records for OwnerId '100' as well. If I use inner joins between the tables, I get no records when there is no entry for the criteria in the @CriteriaCities or @CriteriaFriend table. What query would search for records in the main table based on the rows that exist in the link tables (@CriteriaCities, @CriteriaFriend)? If a search parameter is not stored in one of these tables, that table should not be joined (that is, it should not filter the results).

I'm not sure you have fully worked out your logic, as your question doesn't quite make sense (see my comment), but if I follow the rules as you have stated them, the following select statement returns your data. I am not sure how you would pass in which criteria you are searching for, though, as you have not explained how that part of your process works:
-- Count the records in each criteria table.
declare @CriteriaCitiesCount int = (select count(1) from @CriteriaCities);
declare @CriteriaFriendCount int = (select count(1) from @CriteriaFriend);
select *
from @Records r
left join @CriteriaCities cc
on(r.CityId = cc.CityId)
left join @FriendRecords f
on(r.RecordId = f.RecordId)
left join @CriteriaFriend cf
on(f.FriendId = cf.FriendId)
where (@CriteriaCitiesCount = 0 -- Only return records where we aren't filtering,
or cc.CityId is not null -- Or we are filtering and we get a match.
)
and (@CriteriaFriendCount = 0
or cf.FriendId is not null
);
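The query above counts rows across the whole criteria tables rather than for one stored criteria. A minimal sketch of scoping the same "no stored parameter means no filter" rule to a single criteria follows. It assumes a hypothetical @CriteriaId parameter (not in the question) and, so that it runs on its own, redeclares trimmed-down copies of the question's table variables with explicit RecordId values instead of IDENTITY.

```sql
-- Hypothetical parameter: which stored criteria to apply (an assumption, not part of the question).
DECLARE @CriteriaId INT = 1;

-- Trimmed-down copies of the question's tables so the batch is self-contained.
DECLARE @Records TABLE (RecordId INT, [FileName] VARCHAR(100), OwnerId INT, CityId INT);
INSERT INTO @Records VALUES (1,'A',100,101), (2,'B',100,102), (3,'C',100,103), (4,'D',100,104);

DECLARE @FriendRecords TABLE (RecordId INT, FriendId INT);
INSERT INTO @FriendRecords VALUES (1,201), (1,202), (2,201), (3,202);

DECLARE @Criteria TABLE (CriteriaId INT, CriteriaName VARCHAR(50), OwnerId INT);
INSERT INTO @Criteria VALUES (1,'SampleCriteria',100);

DECLARE @CriteriaCities TABLE (CriteriaId INT, CityId INT);
INSERT INTO @CriteriaCities VALUES (1,101), (1,102), (1,103);

DECLARE @CriteriaFriend TABLE (CriteriaId INT, FriendId INT);
INSERT INTO @CriteriaFriend VALUES (1,202);

SELECT r.RecordId, r.[FileName], r.CityId
FROM @Records AS r
JOIN @Criteria AS c
  ON c.CriteriaId = @CriteriaId
 AND c.OwnerId = r.OwnerId              -- only the criteria owner's files
WHERE (   -- city filter: skipped when this criteria stores no cities
          NOT EXISTS (SELECT 1 FROM @CriteriaCities AS cc
                      WHERE cc.CriteriaId = c.CriteriaId)
       OR EXISTS (SELECT 1 FROM @CriteriaCities AS cc
                  WHERE cc.CriteriaId = c.CriteriaId
                    AND cc.CityId = r.CityId))
  AND (   -- friend filter: skipped when this criteria stores no friends
          NOT EXISTS (SELECT 1 FROM @CriteriaFriend AS cf
                      WHERE cf.CriteriaId = c.CriteriaId)
       OR EXISTS (SELECT 1
                  FROM @FriendRecords AS fr
                  JOIN @CriteriaFriend AS cf
                    ON cf.FriendId = fr.FriendId
                  WHERE fr.RecordId = r.RecordId
                    AND cf.CriteriaId = c.CriteriaId));
```

Because the filters are EXISTS checks rather than joins, a file linked to several matching friends is returned once. With the sample rows this should return files A and C, since both are linked to FriendId 202, the only friend stored for the criteria.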

Sorry, I am not able to understand your expected output.
Why should you get only FileName C?
Try something like:
-- @ownerid is the owner whose criteria should be applied; declare and set it first.
;With CTE as
(
select cc.CityId,cf.FriendId
from @Criteria C
inner join @CriteriaCities CC
on c.CriteriaId=cc.CriteriaId
inner join @CriteriaFriend CF
on c.CriteriaId=cf.CriteriaId
where c.OwnerId=@ownerid
)
select *
from @Records R
inner join @FriendRecords FR
on r.RecordId=fr.RecordId
where EXISTS(
select cityid from cte c where r.CityId=c.cityid and c.FriendId=fr.FriendId
)

Related

Efficient query to filter for list of values across columns/distinct rows

SQL Server version is 2016+/Azure SqlDb (flexible if additive, compatible with both, forward-compatible).
Use case is API users sending a single-column list of values to filter some target table. The target table has 2-n columns whose values are unique across rows (maybe columns, doesn't matter) for the table/range being queried. (So far n <= 5, but that's a detail/not guaranteed.)
Here's a good-enough sample table DDL:
IF NOT EXISTS (SELECT 1 FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_NAME = 'SomeTable')
BEGIN
CREATE TABLE dbo.SomeTable (
ID int IDENTITY(1, 1) not null PRIMARY KEY CLUSTERED
, NaturalKey1 nvarchar(10) not null UNIQUE NONCLUSTERED
, NaturalKey2 nvarchar(10) not null UNIQUE NONCLUSTERED
, NaturalKey3 nvarchar(10) not null UNIQUE NONCLUSTERED
);
END
IF NOT EXISTS (SELECT 1 FROM dbo.SomeTable)
BEGIN
INSERT INTO dbo.SomeTable VALUES
('A', 'AA', 'ZZZZZ')
,('B', 'B', 'YYYYY')
,('C', 'CC', 'XXX')
,('D', 'DDD', 'WWWWW')
,('E', 'EEEE', 'V')
,('F', 'FF', 'UUUUUUUUU')
,('G', 'GGGGGGGG', 's')
-- lots more
;
END
SELECT * FROM dbo.SomeTable;
-- DROP TABLE dbo.SomeTable;
Assumptions are that all NaturalKey columns are of same type (probably nvarchar); filtering happens db-side; and in as few steps as possible, ideally one execution, in a stored procedure. Parameter will be string list or TVP, doesn't matter really. Result will include all data in any row of SomeTable that matches any value on any column. Target table is of unknown size.
Here's an example parameter for our pal above:
DECLARE @filterValues nvarchar(1000) = 'DDD,XXX,E,HH,ok,whatever,YYYYY';
SELECT * FROM string_split(@filterValues, ',');
I know a couple ways to do this, and can imagine several more, so it's not that kind of stuck. I'll bet someone knows a better trick than either of the two I'll illustrate.
Approach 1 Build a temp table updated for existence and join on it (concise and nice to audit, that's about it for pros)
DECLARE @filterValues nvarchar(1000) = 'DDD,XXX,E,HH,ok,whatever,YYYYY';
SELECT value AS InValue, CONVERT(int, null) AS IDMatch
INTO #filters
FROM string_split(@filterValues, ',');
UPDATE f
SET f.IDMatch = st.ID
FROM #filters AS f
INNER JOIN dbo.SomeTable AS st ON f.InValue IN (st.NaturalKey1, st.NaturalKey2, st.NaturalKey3);
SELECT * FROM #filters; -- Audit
SELECT st.* FROM #filters AS f INNER JOIN dbo.SomeTable AS st ON f.IDMatch = st.ID;
IF OBJECT_ID('tempdb..#filters') IS NOT NULL DROP TABLE #filters;
Approach 2 Unpivot SomeTable (I like the nifty cross apply trick) and just join (at scale there be ogres methinks)
SELECT
st.*
FROM
dbo.SomeTable AS st
CROSS APPLY (VALUES (st.NaturalKey1)
, (st.NaturalKey2)
, (st.NaturalKey3)
) AS nk(Val)
INNER JOIN #filters AS f ON nk.Val = f.InValue;
IF OBJECT_ID('tempdb..#filters') IS NOT NULL DROP TABLE #filters;
Is there a question in our future
Works is better than doesn't work, but looking for better/more efficient/more scalable methods from the T-SQL gurus. Unknown dimensions in columns and rows, response time is an SLA, filter size may or may not be capped. Bonus points if this ports neatly from SomeTable to SomeTableVersionN. (No dynamic SQL in an API.)
Could be dupe question, couldn't find it, pointing that out is just fine thank you.
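One more shape worth measuring, offered as a sketch rather than a benchmarked answer: join the filter list to each NaturalKey column separately and combine the branches with UNION. Each branch is a plain equality join on a uniquely indexed column, so the optimizer may be able to seek each index instead of scanning; whether it actually does depends on table size and statistics. The temp table name #filterDemo is hypothetical, and the block assumes dbo.SomeTable from the DDL above already exists.

```sql
-- Assumes dbo.SomeTable from the question's DDL exists (STRING_SPLIT needs SQL Server 2016+).
DECLARE @filterValues nvarchar(1000) = N'DDD,XXX,E,HH,ok,whatever,YYYYY';

IF OBJECT_ID('tempdb..#filterDemo') IS NOT NULL DROP TABLE #filterDemo;

-- De-duplicate the incoming values once, up front.
SELECT DISTINCT value AS InValue
INTO #filterDemo
FROM STRING_SPLIT(@filterValues, N',');

-- One equality join per natural-key column; UNION removes rows
-- that match on more than one column.
SELECT st.ID, st.NaturalKey1, st.NaturalKey2, st.NaturalKey3
FROM dbo.SomeTable AS st
JOIN #filterDemo AS f ON st.NaturalKey1 = f.InValue
UNION
SELECT st.ID, st.NaturalKey1, st.NaturalKey2, st.NaturalKey3
FROM dbo.SomeTable AS st
JOIN #filterDemo AS f ON st.NaturalKey2 = f.InValue
UNION
SELECT st.ID, st.NaturalKey1, st.NaturalKey2, st.NaturalKey3
FROM dbo.SomeTable AS st
JOIN #filterDemo AS f ON st.NaturalKey3 = f.InValue;

DROP TABLE #filterDemo;
```

The trade-off is that the statement grows by one branch per key column, so it does not port to SomeTableVersionN without editing, which the question lists as a bonus goal.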

SQL to join nvarchar(max) column with int column

I need some expert help to do left join on nvarchar(max) column with an int column. I have a Company table with EmpID as nvarchar(max) and this column holds multiple employee ID's separated with commas:
1221,2331,3441
I wanted to join this column with Employee table where EmpID is int.
I did something like below, But this doesn't work when I have 3 empId's or just 1 empID.
SELECT
A.*, B.empName AS empName1, D.empName AS empName2
FROM
[dbo].[Company] AS A
LEFT JOIN
[dbo].[Employee] AS B ON LEFT(A.empID, 4) = B.empID
LEFT JOIN
[dbo].[Employee] AS D ON RIGHT(A.empID, 4) = D.empID
My requirement is to get all empNames if there are multiple empID's in separate columns. Would highly appreciate any valuable input.
You should, if possible, normalize your database.
Read "Is storing a delimited list in a database column really that bad?", where you will see a lot of reasons why the answer to this question is "Absolutely yes!".
If, however, you can't change the database structure, you can use LIKE:
SELECT A.*, B.empName
FROM [dbo].[Company] AS A
LEFT JOIN [dbo].[Employee] AS B
-- Cast the int to a string; '%,' + int would otherwise raise a conversion error.
ON ',' + A.empID + ',' LIKE '%,' + CAST(B.empID AS varchar(11)) + ',%'
You can give STRING_SPLIT a shot.
SQL Server (starting with 2016)
https://learn.microsoft.com/en-us/sql/t-sql/functions/string-split-transact-sql
CREATE TABLE #Test
(
RowID INT IDENTITY(1,1),
EID INT,
Names VARCHAR(50)
)
INSERT INTO #Test VALUES (1,'John')
INSERT INTO #Test VALUES (2,'James')
INSERT INTO #Test VALUES (3,'Justin')
INSERT INTO #Test VALUES (4,'Jose')
GO
CREATE TABLE #Test1
(
RowID INT IDENTITY(1,1),
ID VARCHAR(MAX)
)
INSERT INTO #Test1 VALUES ('1,2,3,4')
GO
SELECT Value,T.* FROM #Test1
CROSS APPLY STRING_SPLIT ( ID , ',' )
INNER JOIN #Test T ON value = EID
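Applied to the shape of the tables in the question, the same idea might look like the sketch below. The temp tables #Company and #Employee and their sample rows are hypothetical stand-ins for the asker's tables. OUTER APPLY keeps companies whose list is empty or NULL, and TRY_CONVERT turns any non-numeric fragment into NULL instead of raising an error. STRING_SPLIT requires database compatibility level 130 or higher.

```sql
IF OBJECT_ID('tempdb..#Company') IS NOT NULL DROP TABLE #Company;
IF OBJECT_ID('tempdb..#Employee') IS NOT NULL DROP TABLE #Employee;

-- Hypothetical stand-ins for the asker's Company and Employee tables.
CREATE TABLE #Company (CompanyName varchar(50), empID nvarchar(max));
CREATE TABLE #Employee (empID int PRIMARY KEY, empName varchar(50));

INSERT INTO #Company VALUES
    ('ABC Corp',  N'1221,2331,3441'),  -- three ids
    ('Solo Ltd',  N'1221'),            -- a single id
    ('Empty Inc', NULL);               -- no ids at all

INSERT INTO #Employee VALUES (1221, 'John'), (2331, 'James'), (3441, 'Justin');

-- One output row per (company, employee id in its list).
SELECT c.CompanyName, e.empID, e.empName
FROM #Company AS c
OUTER APPLY STRING_SPLIT(c.empID, N',') AS s
LEFT JOIN #Employee AS e
       ON e.empID = TRY_CONVERT(int, LTRIM(RTRIM(s.value)));

DROP TABLE #Company;
DROP TABLE #Employee;
```

This returns the names as rows, not as separate columns; because the number of ids per company varies, turning them into columns would need a pivot with a fixed maximum or string aggregation on top of this result.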
It sounds like you need a table to link employees to companies in a formal way. If you had that, this would be trivial. As it is, this is cumbersome and super slow. The below script creates that linkage for you. If you truly want to keep your current structure (bad idea), then the part you want is under the "insert into..." block.
--clean up the results of any prior runs of this test script
if object_id('STACKOVERFLOWTEST_CompanyEmployeeLink') is not null
drop table STACKOVERFLOWTEST_CompanyEmployeeLink;
if object_id('STACKOVERFLOWTEST_Employee') is not null
drop table STACKOVERFLOWTEST_Employee;
if object_id('STACKOVERFLOWTEST_Company') is not null
drop table STACKOVERFLOWTEST_Company;
go
--create two example tables
create table STACKOVERFLOWTEST_Company
(
ID int
,Name nvarchar(max)
,EmployeeIDs nvarchar(max)
,primary key(id)
)
create table STACKOVERFLOWTEST_Employee
(
ID int
,FirstName nvarchar(max)
,primary key(id)
)
--drop in some test data
insert into STACKOVERFLOWTEST_Company values(1,'ABC Corp','1,2,3,4,50')
insert into STACKOVERFLOWTEST_Company values(2,'XYZ Corp','4,5,6,7,8')--note that annie(#4) works for both places
insert into STACKOVERFLOWTEST_Employee values(1,'Bob') --bob works for abc corp
insert into STACKOVERFLOWTEST_Employee values(2,'Sue') --sue works for abc corp
insert into STACKOVERFLOWTEST_Employee values(3,'Bill') --bill works for abc corp
insert into STACKOVERFLOWTEST_Employee values(4,'Annie') --annie works for abc corp
insert into STACKOVERFLOWTEST_Employee values(5,'Matthew') --Matthew works for xyz corp
insert into STACKOVERFLOWTEST_Employee values(6,'Mark') --Mark works for xyz corp
insert into STACKOVERFLOWTEST_Employee values(7,'Luke') --Luke works for xyz corp
insert into STACKOVERFLOWTEST_Employee values(8,'John') --John works for xyz corp
insert into STACKOVERFLOWTEST_Employee values(50,'Pat') --Pat works for abc corp
--create a new table which is going to serve as a link between employees and their employer(s)
create table STACKOVERFLOWTEST_CompanyEmployeeLink
(
CompanyID int foreign key references STACKOVERFLOWTEST_Company(ID)
,EmployeeID INT foreign key references STACKOVERFLOWTEST_Employee(ID)
)
--this join looks for a match in the csv column.
--it is slow and cannot use an index, but it answers your original question.
--store the matches in the linking table,
--which then becomes the formal link between employees and their employer(s)
insert into STACKOVERFLOWTEST_CompanyEmployeeLink
select c.id,e.id
from
STACKOVERFLOWTEST_Company c
--wrap both the list and the id in commas so every id is delimited on both sides;
--otherwise employee "5" would also match "50" (',5' is a substring of ',50')
inner join STACKOVERFLOWTEST_Employee e on
0 < charindex(',' + convert(nvarchar(max),e.id) + ',', ',' + c.employeeids + ',')
order by
c.id, e.id
--show final results using the official linking table
select
co.Name as Employer
,emp.FirstName as Employee
from
STACKOVERFLOWTEST_Company co
inner join STACKOVERFLOWTEST_CompanyEmployeeLink link on link.CompanyID = co.id
inner join STACKOVERFLOWTEST_Employee emp on emp.id = link.EmployeeID

Query Large (in the millions) Data Faster

I have two tables:
Tbl1 has 2 columns: name and state
Tbl2 has name and state and additional columns about the fields
I am trying to match tbl1 name and state with tbl2 name and state. I have removed all exact matches, but I see that I could match more if I could account for misspellings and name variations by using a scalar function that compares the 2 names and returns an integer showing how close a match they are (the lower the number returned, the better the match).
The issue is that Tbl1 has over 2M records and Tbl2 has over 4M records; it takes about 30 seconds just to search for one record from Tbl1 in Tbl2.
Is there some way I could arrange the data or query so the search could be completed faster?
Here’s the table structure:
CREATE TABLE Tbl1
(
Id INT NOT NULL IDENTITY( 1, 1 ) PRIMARY KEY,
Name NVARCHAR(255),
[State] VARCHAR(50),
Phone VARCHAR(50),
DoB SMALLDATETIME
)
GO
CREATE INDEX tbl1_Name_indx ON dbo.Tbl1( Name )
GO
CREATE INDEX tbl1_State_indx ON dbo.Tbl1( [State] )
GO
CREATE TABLE Tbl2
(
Id INT NOT NULL IDENTITY( 1, 1 ) PRIMARY KEY,
Name NVARCHAR(255),
[State] VARCHAR(50)
)
GO
CREATE INDEX tbl2_Name_indx ON dbo.Tbl2( Name )
GO
CREATE INDEX tbl2_State_indx ON dbo.Tbl2( [State] )
GO
Here's a sample function that I tested with to try to rule out function complexity:
CREATE FUNCTION [dbo].ScoreHowCloseOfMatch
(
@SearchString VARCHAR(200) ,
@MatchString VARCHAR(200)
)
RETURNS INT
AS
BEGIN
DECLARE @Result INT;
SET @Result = 1;
RETURN @Result;
END;
Here's some sample data:
INSERT INTO Tbl1
SELECT 'Bob Jones', 'WA', '555-333-2222', 'June 10, 1971' UNION
SELECT 'Melcome T Homes', 'CA', '927-333-2222', 'June 10, 1971' UNION
SELECT 'Janet Rengal', 'WA', '555-333-2222', 'June 10, 1971' UNION
SELECT 'Matt Francis', 'TN', '234-333-2222', 'June 10, 1971' UNION
SELECT 'Same Bojen', 'WA', '555-333-2222', 'June 10, 1971' UNION
SELECT 'Frank Tonga', 'NY', '903-333-2222', 'June 10, 1971' UNION
SELECT 'Jill Rogers', 'WA', '555-333-2222', 'June 10, 1971' UNION
SELECT 'Tim Jackson', 'OR', '757-333-2222', 'June 10, 1971'
GO
INSERT INTO Tbl2
SELECT 'BobJonez', 'WA' UNION
SELECT 'Malcome X', 'CA' UNION
SELECT 'Jan Regal', 'WA'
GO
Here's the query:
WITH cte as (
SELECT t1Id = t1.Id ,
t1Name = t1.Name ,
t1State = t1.State,
t2Name = t2.Name ,
t2State = t2.State ,
t2.Phone ,
t2.DoB,
Score = dbo.ScoreHowCloseOfMatch(t1.Name, t2.Name)
FROM dbo.Tbl1 t2
JOIN dbo.Tbl2 t1
ON t1.State = t2.State
)
SELECT *
INTO CompareResult
FROM cte
ORDER BY cte.Score ASC
GO
One possibility would be to add a column with a normalized name used only for matching purposes. You would remove all the white spaces, remove accents, replace first names by abbreviated first names, replace known nicknames by real names etc.
You could even sort the first name and the last name of one person alphabetically in order to allow swapping both.
Then you can simply join the two tables by this normalized name column.
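A minimal sketch of that idea, assuming Tbl1 and Tbl2 from the question already exist. The column name NameNormalized, the index names, and the normalization rule (lower-case, spaces removed) are all hypothetical placeholders; real rules for accents, nicknames, and abbreviations would be more involved and might need a regular column filled by an UPDATE instead of a computed one.

```sql
-- Hypothetical normalized column: lower-cased name with spaces removed.
-- PERSISTED stores the value so it can be indexed and is not recomputed per query.
ALTER TABLE dbo.Tbl1 ADD NameNormalized AS LOWER(REPLACE(Name, N' ', N'')) PERSISTED;
ALTER TABLE dbo.Tbl2 ADD NameNormalized AS LOWER(REPLACE(Name, N' ', N'')) PERSISTED;
GO

-- Composite indexes so the join can seek on state and normalized name together.
CREATE INDEX tbl1_State_NameNormalized_indx ON dbo.Tbl1 ([State], NameNormalized);
CREATE INDEX tbl2_State_NameNormalized_indx ON dbo.Tbl2 ([State], NameNormalized);
GO

-- Equality join on the normalized value instead of calling a scalar function per pair.
SELECT t1.Id   AS Tbl1Id,
       t2.Id   AS Tbl2Id,
       t1.Name AS Tbl1Name,
       t2.Name AS Tbl2Name
FROM dbo.Tbl1 AS t1
JOIN dbo.Tbl2 AS t2
  ON t2.[State] = t1.[State]
 AND t2.NameNormalized = t1.NameNormalized;
```

This rule only catches differences in spacing and case. A misspelling such as 'Bob Jones' versus 'BobJonez' still would not match, so the scoring function would remain necessary for whatever is left over, just on far fewer candidate pairs.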
JOIN dbo.Tbl2 t1
ON t1.State = t2.State
You are joining 2M x 4M rows on a join criterion with at most about 50 distinct values. No wonder this is slow. You need to go back to the drawing board and redefine your problem. If you really want to compute the 'close match' of everybody with everybody else in the same state, then be prepared to pay the price...

T-SQL: Two Level Aggregation in Same Query

I have a query that joins a master and a detail table. Master table records are duplicated in the results, as expected. I aggregate on the detail table and that works fine. But I also need another aggregation on the master table at the same time, and because the master rows are duplicated, that aggregate is inflated too.
The following script demonstrates the situation:
If Object_Id('tempdb..#data') Is Not Null Drop Table #data
Create Table #data (Id int, GroupId int, Value int)
If Object_Id('tempdb..#groups') Is Not Null Drop Table #groups
Create Table #groups (Id int, Value int)
/* insert groups */
Insert #groups (Id, Value)
Values (1,100), (2,200), (3, 200)
/* insert data */
Insert #data (Id, GroupId, Value)
Values (1,1,10),
(2,1,20),
(3,2,50),
(4,2,60),
(5,2,70),
(6,3,90)
My select query is
Select Sum(data.Value) As Data_Value,
Sum(groups.Value) As Group_Value
From #data data
Inner Join #groups groups On groups.Id = data.GroupId
The result is;
Data_Value Group_Value
300 1000
Expected result is;
Data_Value Group_Value
300 500
Please note that, derived table or sub-query is not an option. Also Sum(Distinct groups.Value) is not suitable for my case.
If I am not wrong, you just want to sum the Value column of each table and show both sums in a single row. In that case you don't need to join the tables; just select each sum as a column, like this:
SELECT (SELECT SUM(Value) FROM #data) AS Data_Value,
(SELECT SUM(Value) FROM #groups) AS Group_Value
SELECT
(
Select Sum(d.Value) From #data d
WHERE EXISTS (SELECT 1 FROM #groups WHERE Id = d.GroupId )
) AS Data_Value
,(
SELECT Sum( g.Value) FROM #groups g
WHERE EXISTS (SELECT 1 FROM #data WHERE GroupId = g.Id)
) AS Group_Value
I'm not sure what you are looking for. But it seems like you want the value from one group and the collected value that represents a group in the data table.
In that case I would suggest something like this.
select Sum(t.Data_Value) as Data_Value, Sum(t.Group_Value) as Group_Value
from
(select Sum(data.Value) As Data_Value, groups.Value As Group_Value
from #data data
inner join #groups groups on groups.Id = data.GroupId
group by groups.Id, groups.Value)
as t
The edit should do the trick for you.

TSQL Select Clause with Case Statement

I have a basic select statement that is getting me a list of types that are stored in the database:
SELECT teType
FROM BS_TrainingEvent_Types
WHERE source = @source
FOR XML PATH ('options'), TYPE, ELEMENTS, ROOT ('types')
My table contains a type column and a source column.
There is a record in that table where I need to include it for two separate sources but I can't create a separate record for it.
Table data:
type  | source
test  | users
test2 | members
test3 | admins
I need a case statement to be able to say IF source = admins also give me the type test2.
Does this make sense and is it possible to do with a basic select?
Update
I came up with this temp solution but I still think there is a better way to handle this.:
DECLARE @tmp AS TABLE (
QID VARCHAR (10));
INSERT INTO @tmp (QID)
SELECT DISTINCT qid
FROM tfs_adhocpermissions;
SELECT t.QID,
emp.FirstName,
emp.LastName,
emp.NTID,
(SELECT accessKey
FROM TFS_AdhocPermissions AS p
WHERE p.QID = t.QID
FOR XML PATH ('key'), TYPE, ELEMENTS, ROOT ('keys'))
FROM @tmp AS t
LEFT OUTER JOIN
dbo.EmployeeTable AS emp
ON t.QID = emp.QID
FOR XML PATH ('data'), TYPE, ELEMENTS, ROOT ('root');
try this
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--create temp table for testing
IF OBJECT_ID('Tempdb..#BS_TrainingEvent_Types') IS NOT NULL
DROP TABLE #BS_TrainingEvent_Types
SELECT [type] ,
[source]
INTO #BS_TrainingEvent_Types
FROM ( VALUES ( 'test', 'users'), ( 'test2', 'members'),
( 'test3', 'admins') ) t ( [type], [source] )
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
--final query
DECLARE @Source VARCHAR(10) = 'users'
IF @Source = 'admins'
BEGIN
SELECT [Type]
FROM #BS_TrainingEvent_Types
WHERE source = @Source
OR [type] = 'test2'
FOR XML PATH('options') ,
TYPE ,
ELEMENTS ,
ROOT('types')
END
ELSE
BEGIN
SELECT [Type]
FROM #BS_TrainingEvent_Types
WHERE source = @Source
FOR XML PATH('options') ,
TYPE ,
ELEMENTS ,
ROOT('types')
END
select sq.teType
from (
SELECT t.teType
FROM BS_TrainingEvent_Types t
WHERE t.source = @source
union all
SELECT t.teType
FROM BS_TrainingEvent_Types t
WHERE @source = 'admins' and t.source = 'members'
) sq
FOR XML PATH ('options'), TYPE, ELEMENTS, ROOT ('types');
Though normally it would be better to introduce an additional table that would store these relationships, so that the whole idea would be more expandable.
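A minimal sketch of what such an additional table could look like. The temp tables #Types and #SourceIncludes, and the column includedSource, are hypothetical names chosen for illustration; #Types stands in for BS_TrainingEvent_Types with the sample rows from the question.

```sql
IF OBJECT_ID('tempdb..#Types') IS NOT NULL DROP TABLE #Types;
IF OBJECT_ID('tempdb..#SourceIncludes') IS NOT NULL DROP TABLE #SourceIncludes;

-- Stand-in for BS_TrainingEvent_Types, using the question's sample rows.
CREATE TABLE #Types (teType varchar(20), source varchar(20));
INSERT INTO #Types VALUES ('test', 'users'), ('test2', 'members'), ('test3', 'admins');

-- Hypothetical relationship table: a source also sees the types of includedSource.
CREATE TABLE #SourceIncludes (source varchar(20), includedSource varchar(20));
INSERT INTO #SourceIncludes VALUES ('admins', 'members');

DECLARE @source varchar(20) = 'admins';

-- Types for the requested source, plus types of any source it includes.
SELECT t.teType
FROM #Types AS t
WHERE t.source = @source
   OR t.source IN (SELECT si.includedSource
                   FROM #SourceIncludes AS si
                   WHERE si.source = @source)
FOR XML PATH ('options'), TYPE, ELEMENTS, ROOT ('types');

DROP TABLE #Types;
DROP TABLE #SourceIncludes;
```

Adding another shared type later then becomes an INSERT into the relationship table instead of a change to the query.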

Resources