I have a front-end search box where the user can search for someone by first name, middle name, surname or job title, and the bulk of the backend code looks like this:
SELECT TOP 50 * FROM (SELECT [EmployeeId], SUM(MatchOrder) as MatchOrder
FROM (SELECT
[EmployeeId],
CASE WHEN A.[EmployeeFieldId] = 4 Then 15 --Surname
WHEN A.[EmployeeFieldId] in (1, 2) Then 15 --PreferredName, FirstName
WHEN A.[EmployeeFieldId] = 3 Then 5 --MiddleName
WHEN A.[EmployeeFieldId] = 5 Then 20 --JobTitle
ELSE 3
END as MatchOrder
FROM [latest].[EmployeeAttributes] A
WHERE (' + @search + ')
) internal
GROUP BY EmployeeId) A
join dbo.vwEmployees E on E.EmployeeId = A.EmployeeId -- TEMP
ORDER BY 2 DESC'
Each EmployeeId is given a score (MatchOrder), which is the total of the weights for every criterion above that is met (e.g. First Name + Surname match = 30), and the results are then ordered by that score for display in the front end. The problem is that if someone's first name and surname are very similar, e.g. Patrick Patterson, and I search for Pat Rice, then Patrick Patterson (30 pts) appears above Patrick Rice (30 pts), because the first-name term is being matched twice.
I'd like to either lower the score when a match is made twice, or modify my CASE statement to somehow handle this (a nested CASE?).
Do you know how I can combat this? Any help would be appreciated.
Thanks
Since [EmployeeFieldId] is always mapped to the same [MatchOrder], you should be able to control this by including [EmployeeFieldId] in the "internal" result set and slapping a DISTINCT clause on the SELECT:
SELECT DISTINCT
[EmployeeId],
[EmployeeFieldId],
CASE WHEN A.[EmployeeFieldId] = 4 Then 15 --Surname
WHEN A.[EmployeeFieldId] in (1, 2) Then 15 --PreferredName, FirstName
WHEN A.[EmployeeFieldId] = 3 Then 5 --MiddleName
WHEN A.[EmployeeFieldId] = 5 Then 20 --JobTitle
ELSE 3
END as MatchOrder
FROM [latest].[EmployeeAttributes] A
WHERE (' + @search + ')
That way, each employee will get at most one of each field ID applied towards their score.
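To see what the DISTINCT does to the totals, here is a minimal Python sketch of the scoring step. It is only a model of the query, not T-SQL: the weights are copied from the CASE expression, and the sample rows and employee IDs are hypothetical.

```python
# Weights copied from the CASE expression in the query; keys are EmployeeFieldId values.
FIELD_WEIGHTS = {1: 15, 2: 15, 3: 5, 4: 15, 5: 20}
DEFAULT_WEIGHT = 3  # the ELSE branch


def score(matches):
    """Sum the weights per employee.

    matches: iterable of (employee_id, field_id) rows that passed the WHERE clause.
    """
    totals = {}
    # set() plays the role of SELECT DISTINCT: each (employee, field) pair counts once.
    for employee_id, field_id in set(matches):
        weight = FIELD_WEIGHTS.get(field_id, DEFAULT_WEIGHT)
        totals[employee_id] = totals.get(employee_id, 0) + weight
    return totals


# Hypothetical rows: employee 7 matched field 2 twice and field 4 once.
rows = [(7, 2), (7, 2), (7, 4), (8, 5)]
print(score(rows))
```

Without the set() call, employee 7 would be credited twice for the same field.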
I have a few strings with numbers like this; there are around 3000 records.
Column
------------
Cell 233567-3455
Cell123-4567
Cell#123-7449
Local 456-0987
1 616 468-7796
1234567-5x2345
234/625-1234
(C)755-7442
5732878-2
5721899-23
6712909-3
7894200-234
2144-57238
5673893/588218
437-4737-5772
How can I find the records like the ones below?
Column
-------------
5732878-2
5721899-23
6712909-3
7894200-234
Once I find these, I need to split them into two parts:
1st column | 2nd column
-----------|-----------
5732878    | 5732872
5721899    | 5721823
6712909    | 6712903
7894200    | 7894234
I tried to solve this using PATINDEX and CHARINDEX, but somehow it's not working. Please help.
I don't know your filtering logic to get to your intermediate set, but this should get your expected final result set. I assumed you only want records where the length of the string to the left of the hyphen is greater than the length on the right and also exclude records with more than 1 hyphen.
SELECT LEFT(telephone, CHARINDEX('-', telephone)-1) AS [1stTelephone],
STUFF(
--get the string before the hyphen
LEFT(telephone, CHARINDEX('-', telephone)-1),
--get the starting location of chars we are going to replace
LEN(LEFT(telephone, CHARINDEX('-', telephone)))-LEN(RIGHT(telephone, CHARINDEX('-', REVERSE(telephone))-1)),
--get the length of the section we are replacing
LEN(RIGHT(telephone, CHARINDEX('-', REVERSE(telephone))-1)),
--replace that section with the string after the hyphen
RIGHT(telephone, CHARINDEX('-', REVERSE(telephone))-1)
) AS [2nd telephone]
FROM your_table
WHERE LEN(LEFT(telephone, CHARINDEX('-', telephone))) > LEN(RIGHT(telephone, CHARINDEX('-', REVERSE(telephone))))
AND len(telephone) - len(REPLACE(telephone, '-', '')) = 1
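The overlay arithmetic in the STUFF call can be checked outside SQL Server with a short Python sketch. This is a model of the idea rather than a translation of the query: it additionally assumes that both sides of the hyphen are digits only, which the WHERE clause above does not enforce.

```python
import re

# Assumption: digits, exactly one hyphen, then digits. The SQL filter only
# checks the hyphen count and the relative lengths of the two sides.
PATTERN = re.compile(r"(\d+)-(\d+)")


def split_numbers(value):
    """Return (first, second), or None when the value does not qualify."""
    m = PATTERN.fullmatch(value)
    if m is None:
        return None
    base, suffix = m.groups()
    if len(suffix) >= len(base):
        return None  # the right side must be shorter than the left side
    # Overwrite the tail of the base number with the suffix (what STUFF does).
    return base, base[: len(base) - len(suffix)] + suffix


for sample in ["5732878-2", "5721899-23", "7894200-234", "2144-57238", "437-4737-5772"]:
    print(sample, "->", split_numbers(sample))
```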
Somewhat dirty method (looks specifically for 7 digits followed by hyphen followed by any number of digits):
SELECT BasePhone AS Phone1, LEFT(BasePhone, 7-LEN(OtherPhoneEnd)) + OtherPhoneEnd AS Phone2
FROM (
SELECT LEFT(Telephone, 7) AS BasePhone, SUBSTRING(Telephone,9,7) AS OtherPhoneEnd
FROM Telephones
WHERE Telephone LIKE '[0-9][0-9][0-9][0-9][0-9][0-9][0-9]-%'
) AS t
Based on the information you have given, I assumed that you want numbers with a hyphen (-) at the 8th position. Try this:
create table #TelNo (
Tel varchar(30)
)
insert #TelNo(Tel)
values ('5732878-2'),
('5721899-23'),
('6712909-3'),
('7894200-234'),
('2144-57238'),
('5673893/588218'),
('437-4737-5772')
select Tel, LEFT(Tel, Len(tel) - len(suffix)) + suffix [SecondTel] from (
select substring(Tel, 1, 7) [Tel], substring(Tel, 9, 10) [suffix] from #TelNo
where CHARINDEX('-', Tel) = 8
)a
You could use something like this:
DDL
use tempdb
create table TelNo (
Tel varchar(30)
)
insert TelNo(Tel)
values ('5732878-2'),
('5721899-23'),
('6712909-3'),
('7894200-234'),
('2144-57238'),
('5673893/588218'),
('437-4737-5772')
Code
select Tel,
case
when Tel like '%_-[0-9]' then left(Tel, len(Tel)-2)
when Tel like '%__-[0-9][0-9]' then left(Tel, len(Tel)-3)
when Tel like '%___-[0-9][0-9][0-9]' then left(Tel, len(Tel)-4)
else Tel
end Tel1,
case
when Tel like '%_-[0-9]' then left(Tel, len(Tel)-3) + right(Tel, 1)
when Tel like '%__-[0-9][0-9]' then left(Tel, len(Tel)-5) + right(Tel, 2)
when Tel like '%___-[0-9][0-9][0-9]' then left(Tel, len(Tel)-7) + right(Tel, 3)
else NULL
end Tel2
from TelNo
We handle a lot of sensitive data and I would like to mask passenger names, keeping only the first and last letter of each name part and joining these with three asterisks (***).
For example: the name 'John Doe' will become 'J***n D***e'
For a name that consists of two parts this is doable by finding the space using the expression:
LEFT(PassengerName, 1) +
'***' +
CASE WHEN CHARINDEX(' ', PassengerName) = 0
THEN RIGHT(PassengerName, 1)
ELSE SUBSTRING(PassengerName, CHARINDEX(' ', PassengerName) -1, 1) +
' ' +
SUBSTRING(PassengerName, CHARINDEX(' ', PassengerName) +1, 1) +
'***' +
RIGHT(PassengerName, 1)
END
However, the passenger name can have more than two parts; there is no real limit to it. How can I find the indices of all spaces within an expression? Or should I maybe tackle this problem in a different way?
Any help or pointer is much appreciated!
This solution does what you want it to, but is really the wrong approach to use when trying to hide personally identifiable data, as per Gordon's explanation in his answer.
SQL:
declare @t table(n nvarchar(20));
insert into @t values('John Doe')
,('JohnDoe')
,('John Doe Two')
,('John Doe Two Three')
,('John O''Neill');
select n
,stuff((select ' ' + left(s.item,1) + '***' + right(s.item,1)
from dbo.fn_StringSplit4k(t.n,' ',null) as s
for xml path('')
),1,1,''
) as mask
from @t as t;
Output:
+--------------------+-------------------------+
| n | mask |
+--------------------+-------------------------+
| John Doe | J***n D***e |
| JohnDoe | J***e |
| John Doe Two | J***n D***e T***o |
| John Doe Two Three | J***n D***e T***o T***e |
| John O'Neill | J***n O***l |
+--------------------+-------------------------+
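For comparison, the same split, mask and re-join steps can be sketched in a few lines of Python. This is only a model of the logic, not a replacement for the T-SQL; the function name mask_name is hypothetical, and unlike the SQL it collapses repeated spaces because str.split() drops empty parts.

```python
def mask_name(full_name):
    """Keep the first and last letter of each name part, joined by '***'."""
    parts = full_name.split()  # split on whitespace, ignoring empty parts
    return " ".join(part[0] + "***" + part[-1] for part in parts)


for name in ["John Doe", "JohnDoe", "John Doe Two Three", "John O'Neill"]:
    print(name, "->", mask_name(name))
```

As in the SQL version, a one-letter part such as "J" becomes "J***J", which shows how little this kind of masking hides for short names.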
String splitting function based on Jeff Moden's Tally Table approach:
create function [dbo].[fn_StringSplit4k]
(
@str nvarchar(4000) = ' ' -- String to split.
,@delimiter as nvarchar(1) = ',' -- Delimiting value to split on.
,@num as int = null -- Which value to return, null returns all.
)
returns table
as
return
-- Start tally table with 10 rows.
with n(n) as (select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1)
-- Select the same number of rows as characters in @str as incremental row numbers.
-- Cross joins increase exponentially to a max possible 10,000 rows to cover largest @str length.
,t(t) as (select top (select len(isnull(@str,'')) a) row_number() over (order by (select null)) from n n1,n n2,n n3,n n4)
-- Return the position of every value that follows the specified delimiter.
,s(s) as (select 1 union all select t+1 from t where substring(isnull(@str,''),t,1) = @delimiter)
-- Return the start and length of every value, to use in the SUBSTRING function.
-- ISNULL/NULLIF combo handles the last value where there is no delimiter at the end of the string.
,l(s,l) as (select s,isnull(nullif(charindex(@delimiter,isnull(@str,''),s),0)-s,4000) from s)
select rn
,item
from(select row_number() over(order by s) as rn
,substring(@str,s,l) as item
from l
) a
where rn = @num
or @num is null;
GO
If you consider PassengerName as sensitive information, then you should not be storing it in clear text in generally accessible tables. Period.
There are several different options.
One is to have reference tables for sensitive information. Any table that references this would have an id rather than the name. Voilà. No sensitive information is available without access to the reference table, and that would be severely restricted.
A second method is a reversible encryption algorithm. This would allow the value to be gibberish, but with the right knowledge, it could be transformed back into a meaningful value. Typical methods for this are the public key encryption algorithms devised by Rivest, Shamir, and Adleman (RSA encryption).
If you want to do first and last letters of names, I would be really careful about Asian names. Many of them consist of two or three letters when written in Latin script. That isn't much hiding. SQL Server does not have simple mechanisms to do this. You can write a user-defined function with a loop to manage the process. However, I view this as the least secure and least desirable approach.
This uses Jeff Moden's DelimitedSplit8K, as well as the new functionality in SQL Server 2017 STRING_AGG. As I don't know what version you're using, I've just gone "whole hog" and assumed you're using the latest version.
Jeff's function is invaluable here, as it returns the ordinal position, something which Microsoft have foolishly omitted from their own function, STRING_SPLIT (and didn't add in 2017 either). Ordinal position is key here, so we can't make use of the built in function.
WITH VTE AS(
SELECT *
FROM (VALUES ('John Doe'),('Jane Bloggs'),('Edgar Allan Poe'),('Mr George W. Bush'),('Homer J Simpson')) V(FullName)),
Masking AS (
SELECT *,
ISNULL(STUFF(Item, 2, LEN(item) -2,'***'), Item) AS MaskedPart
FROM VTE V
CROSS APPLY dbo.delimitedSplit8K(V.Fullname, ' '))
SELECT STRING_AGG(MaskedPart,' ') WITHIN GROUP (ORDER BY ItemNumber) AS MaskedFullName
FROM Masking
GROUP BY Fullname;
Edit: Never mind, the OP has commented that they are using 2008, so STRING_AGG is out of the question. @iamdave, however, has posted an answer which is very similar to my own; it just does it the "old-fashioned XML way".
Depending on your version of SQL Server, you may be able to use the built-in string split to rows on spaces in the name, do your string formatting, and then roll back up to name level using an XML path.
create table dataset (id int identity(1,1), name varchar(50));
insert into dataset (name) values
('John Smith'),
('Edgar Allen Poe'),
('One Two Three Four');
with split as (
select id, cs.Value as Name
from dataset
cross apply STRING_SPLIT (name, ' ') cs
),
formatted as (
select
id,
name,
left(name, 1) + '***' + right(name, 1) as out
from split
)
SELECT
id,
(SELECT ' ' + out
FROM formatted b
WHERE a.id = b.id
FOR XML PATH('')) [out_name]
FROM formatted a
GROUP BY id
Result:
id out_name
1 J***n S***h
2 E***r A***n P***e
3 O***e T***o T***e F***r
You can do that using this function.
create function [dbo].[fnMaskName] (@var_name varchar(100))
RETURNS varchar(100)
WITH EXECUTE AS CALLER
AS
BEGIN
declare @var_part varchar(100)
declare @var_return varchar(100)
declare @n_position smallint
set @var_return = ''
set @n_position = 1
WHILE @n_position<>0
BEGIN
SET @n_position = CHARINDEX(' ', @var_name)
IF @n_position = 0
SET @n_position = LEN(@var_name)
SET @var_part = SUBSTRING(@var_name, 1, @n_position)
SET @var_name = SUBSTRING(@var_name, @n_position+1, LEN(@var_name))
if @var_part<>''
SET @var_return = @var_return + stuff(@var_part, 2, len(@var_part)-2, replicate('*',len(@var_part)-2)) + ' '
END
RETURN(@var_return)
END
I have a question about the Any-Operator.
On Technet it says
For example, the following query finds customers located in a territory not covered by any sales persons.
Use AdventureWorks2008R2;
GO
SELECT
CustomerID
FROM
Sales.Customer
WHERE
TerritoryID <> ANY
(
SELECT
TerritoryID
FROM
Sales.SalesPerson
);
Further
The results include all customers, except those whose sales territories are NULL, because every territory that is assigned to a customer is covered by a sales person. The inner query finds all the sales territories covered by sales persons, and then, for each territory, the outer query finds the customers who are not in one.
But that query returns all customers.
I updated a customer's TerritoryID to a value that no row in Sales.SalesPerson has, but the query still returns all customers instead of just the one I expected.
Am I missing something?
Might it be that the Technet article is simply wrong?
https://technet.microsoft.com/de-de/library/ms187074(v=sql.105).aspx (german)
There is one customer with TerritoryID = 13
Inner query result (SELECT TerritoryID FROM Sales.SalesPerson) :
4
2
4
3
6
5
1
4
6
1
1
6
9
1
8
10
7
And the table Sales.Customer has a row with TerritoryID = 13, which is the one not covered by a salesperson.
create table #t1
(
id int
)
insert into #t1
values(1),(2),(3)
As you can see, #t1 has three values.
Now let's see how ANY works.
When "is equal to" (=) is used with ANY, it works like IN:
select * from #t1 where id=
any(select 0)--no result
When ANY is used with >, it returns all the values that are greater than the minimum value in the subquery; with <>, it returns all the values that differ from at least one value in the subquery:
select * from #t1 where id<>
any(select 1)--2,3
select * from #t1 where id<>
any(select 0)--1,2,3
If your subquery returns a single value, the outer query simply compares each row against that one value.
<> ANY means any Sales.Customer with a TerritoryID that is greater than or less than at least one of the TerritoryIDs in Sales.SalesPerson,
so TerritoryID = 13 is greater than all of your examples (4 2 4 3 6 5 1 4 6 1 1 6 9 1 8 10 7), so it's included.
<> ALL is the equivalent of NOT IN, so that is what you're confusing <> ANY with.
Look at <> ANY as: if there are any records in the set that are not equal to the qualifier, then include it.
The following query has the same result as <> ALL:
SELECT CustomerID FROM Sales.Customer
WHERE TerritoryID NOT IN (SELECT TerritoryID FROM Sales.SalesPerson)
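The difference between <> ANY and <> ALL can be modeled with Python's built-in any() and all(). This is a sketch of the comparison semantics only, using the territory list from the question; NULL handling is deliberately ignored, and the helper names ne_any and ne_all are hypothetical.

```python
def ne_any(value, subquery):
    """Model of `value <> ANY (subquery)`: differs from at least one row."""
    return any(value != s for s in subquery)


def ne_all(value, subquery):
    """Model of `value <> ALL (subquery)`, i.e. NOT IN: differs from every row."""
    return all(value != s for s in subquery)


# TerritoryID values returned by the inner query in the question.
territories = [4, 2, 4, 3, 6, 5, 1, 4, 6, 1, 1, 6, 9, 1, 8, 10, 7]

for territory_id in (4, 13):
    print(territory_id, ne_any(territory_id, territories), ne_all(territory_id, territories))
```

Territory 4 passes <> ANY because it differs from 2, even though it is covered; only <> ALL singles out territory 13.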
I'm trying to order items by a list of names that are not in alphabetical order. After completing the list I am trying to continue the rest in alphabetical order without the ones I initially selected.
See example:
INPUT:
print 'Results:'
select * from Geniuses
order by ('Charles Babbage',
'Albert Einstein',
'Adrien-Marie Legendre',
'Niels Henrik Abel')
then finally sort the rest in alphabetical order...
OUTPUT:
Results:
Charles Babbage ... details
Albert Einstein ...
Adrien-Marie Legendre ...
Niels Henrik Abel ...
Arthur Cayley ...
...
select * from Geniuses
order by
-- First, order by your set order...
case FullName
when 'Charles Babbage' then 1
when 'Albert Einstein' then 2
when 'Adrien-Marie Legendre' then 3
when 'Niels Henrik Abel' then 4
else 5
end,
-- Then do a secondary sort on FullName for everyone else.
FullName
EDIT:
I saw your comment that it's configurable by each user. In that case, you'd have to have a FavoriteGeniuses table that tracks which user prefers which Geniuses, and then have a sort order specified in that table:
select *
from
Geniuses g left join
FavoriteGeniuses fg
ON fg.GeniusID = g.GeniusID
AND fg.UserID = @UserID
order by
-- The higher the number, the first up on the list.
-- This will put the NULLs (unspecified) below the favorites.
fg.SortPriority DESC,
g.FullName
Try it like this:
select * from Geniuses
order by
case when columnName = 'Charles Babbage' then 0
when columnName = 'Albert Einstein' then 1
when columnName = 'Adrien-Marie Legendre' then 2
when columnName = 'Niels Henrik Abel' then 3
else 4
end,
columnName
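Both CASE-based answers build the same two-level sort key: a rank for the pinned names, then the name itself for everyone else. A minimal Python sketch of that key follows, assuming the pinned list from the question; "Blaise Pascal" is a hypothetical extra row added only to show the alphabetical tail.

```python
# Names to pin at the top, in the order given in the question.
PINNED = ["Charles Babbage", "Albert Einstein", "Adrien-Marie Legendre", "Niels Henrik Abel"]
RANK = {name: position for position, name in enumerate(PINNED)}


def sort_key(name):
    # Pinned names get their list position; all others share one higher rank
    # (the ELSE branch of the CASE) and fall back to alphabetical order.
    return (RANK.get(name, len(PINNED)), name)


names = [
    "Arthur Cayley",
    "Niels Henrik Abel",
    "Albert Einstein",
    "Blaise Pascal",  # hypothetical, not in the question
    "Charles Babbage",
    "Adrien-Marie Legendre",
]
ordered = sorted(names, key=sort_key)
print(ordered)
```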
I'm doing some reporting against a silly database and I have to do
SELECT [DESC] as 'Description'
FROM dbo.tbl_custom_code_10 a
INNER JOIN dbo.Respondent b ON CHARINDEX(',' + a.code + ',', ',' + b.CC10) > 0
WHERE recordid = 116
which returns multiple rows:
Palm
Compaq
Blackberry
Edit:
Schema is
Respondent table (at a glance) ...
recordid  lname  fname  address  CC10    CC11    CC12    CC13
116       Smith  John   Street   1,4,5,  1,3,4,  1,2,3,  NULL
Tbl_Custom_Code10
code  desc
0     None
1     Palm
10    Samsung
11    Treo
12    HTC
13    Nokia
14    LG
15    HP
16    Dash
Result set will always be 1 row, so John Smith: | 646-465-4566 | Has a Blackberry, Palm, Compaq | Likes: Walks on the beach, Rainbows, Saxophone
However, I need to be able to use this within another query, like:
Select b.Name, c.Number, d.MultilineCrap FROM Tables
How can I go about this? Thanks in advance ...
BTW, I could also do it in LINQ if anybody has any ideas ...
Here is one way to make a comma-separated list based on a query (just replace the query inside the first WITH block). Now, how that joins up with your query against b and c, I have no idea. You'll need to supply a more complete question - including specifics on how many rows come back from the second query and whether "MultilineCrap" is the same for each of those rows or if it depends on data in b/c.
;WITH x([DESC]) AS
(
SELECT d FROM (VALUES('Palm'),('Compaq'),('Blackberry')) AS x(d)
)
SELECT STUFF((SELECT ',' + [DESC]
FROM x
FOR XML PATH(''), TYPE).value(N'./text()[1]', N'varchar(max)'),1,1,'');
EDIT
Given the new requirements, perhaps this is the best way:
CREATE FUNCTION dbo.GetMultiLineCrap
(
@s VARCHAR(MAX)
)
RETURNS VARCHAR(MAX)
AS
BEGIN
DECLARE @x VARCHAR(MAX) = '';
SELECT @x += ',' + [desc]
FROM dbo.tbl_custom_code_10
WHERE ',' + @s LIKE '%,' + RTRIM(code) + ',%';
RETURN (SELECT STUFF(@x, 1, 1, ''));
END
GO
SELECT r.LName, r.FName, MultilineCrap = dbo.GetMultiLineCrap(r.CC10)
FROM dbo.Respondent AS r
WHERE recordid = 116;
Please use aliases that make a little bit of sense, instead of just serially applying a, b, c, etc. Your queries will be easier to read, I promise.
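The LIKE test in the function wraps each code in commas so that code 1 does not also match inside 10 or 11. That matching rule can be sketched in Python as follows; the lookup table is a subset of the codes listed in the question, while the input value "1,10,12," and the function name describe are hypothetical.

```python
# Subset of Tbl_Custom_Code10 from the question (code -> desc).
CODES = {"0": "None", "1": "Palm", "10": "Samsung", "11": "Treo", "12": "HTC"}


def describe(cc):
    """Turn a trailing-comma list such as '1,10,12,' into a list of descriptions."""
    wrapped = "," + cc  # same trick as ',' + @s in the T-SQL function
    # A code matches only when it appears as a whole ',code,' token.
    return ",".join(desc for code, desc in CODES.items() if "," + code + "," in wrapped)


print(describe("1,10,12,"))
```

A plain substring test on "1" would also pick up "10" and "12", which is why both sides are delimited.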