Lookup delimited values in a table in sql-server - sql-server

In table A I have a column city-id (varchar(30)) that holds comma-separated values, e.g. 1,2,3 or 2,4.
The description of each value is stored in another table B, e.g.
1 Amsterdam
2 The Hague
3 Maastricht
4 Rotterdam
How do I join table A with table B to get the descriptions in one or more rows?

Assuming this is what you meant:
Table A:
id
-------
1
2
3
Table B:
id | Place
-----------
1 | Amsterdam
2 | The Hague
3 | Maastricht
4 | Rotterdam
Keep the id column in both tables as an auto-increment primary key.
Then a simple inner join is enough:
select * from A inner join B on (A.id = B.id);

The ideal way to deal with such scenarios is to have a normalized table, as Collin suggested. In case that can't be done, here is a way to go about it:
You would need a table-valued function to split the comma-separated value. If you are on SQL Server 2016 or later, there is a built-in STRING_SPLIT function (its output column is named value); if not, you would need to create one as shown in this link.
create table dbo.sCity(
CityId varchar(30)
);
create table dbo.sCityDescription(
CityId int
,CityDescription varchar(30)
);
insert into dbo.sCity values
('1,2,3')
,('2,4');
insert into dbo.sCityDescription values
(1,'Amsterdam')
,(2,'The Hague')
,(3,'Maastricht')
,(4,'Rotterdam');
select ctds.CityDescription
,sst.Value as 'CityId'
from dbo.sCity ct
cross apply dbo.SplitString(CityId,',') sst
join dbo.sCityDescription ctds
on sst.Value = ctds.CityId;
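The split-then-join idea above can be tried without SQL Server. This is only a minimal sketch, assuming Python's built-in sqlite3 module as a sandbox; SQLite has no CROSS APPLY or SplitString, so a recursive CTE stands in for the split function here (the names split, list_no, item and rest are made up for the illustration).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sCity (CityId TEXT);
    CREATE TABLE sCityDescription (CityId INTEGER, CityDescription TEXT);
    INSERT INTO sCity VALUES ('1,2,3'), ('2,4');
    INSERT INTO sCityDescription VALUES
        (1, 'Amsterdam'), (2, 'The Hague'), (3, 'Maastricht'), (4, 'Rotterdam');
""")

# Hypothetical stand-in for the split function: peel one item per step.
split_and_join = """
WITH RECURSIVE split(list_no, item, rest) AS (
    -- seed row: nothing extracted yet; the appended comma terminates the last item
    SELECT rowid, '', CityId || ',' FROM sCity
    UNION ALL
    -- take the text before the first comma, keep the remainder for the next step
    SELECT list_no,
           substr(rest, 1, instr(rest, ',') - 1),
           substr(rest, instr(rest, ',') + 1)
    FROM split
    WHERE rest <> ''
)
SELECT s.list_no, d.CityId, d.CityDescription
FROM split AS s
JOIN sCityDescription AS d ON d.CityId = CAST(s.item AS INTEGER)
WHERE s.item <> ''
ORDER BY s.list_no, d.CityId
"""

rows = conn.execute(split_and_join).fetchall()
for row in rows:
    print(row)
```

Each source row yields one output row per id in its list, which is the "more rows" shape asked for in the question.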

Related

Extracting data from a table into another table based on a common value

I have a table which somewhat looks like this
Table A:
Voter_id Id
----------------------
null | DEPT 1f7h
null | DEPT 3k9n
null | DEPT 2lp0
null | DEPT 2f6k
(250,000 rows like this)
This table Table A has close to 250,000 rows.
I have another table Table B which looks like this
Name_of_variable |Id | value_of_variable
--------------------------------------------------
Voter_id |DEPT 1f7h | 12OK9MJL
First_Name |DEPT adas | Umar
DOB |DEPT opwe | 20-02-199
Age |DEPT jqwq | 24
Voter_id |DEPT 90aa | 189H8MLI
(almost 1 million rows like this)
The Id column of Table B is indexed.
I want to fill the Voter_id column of Table A from Table B, such that Table A's Voter_id = Table B's value_of_variable where Table B's Name_of_variable is 'Voter_id' and TableA.Id = TableB.Id.
I have used this query to extract the data and it works fine on my development database, which has 15,000 records in Table A. I want to know whether I can optimize it further, because it may not perform as well on bigger data.
update TableA
set Voter_id =(select value_of_variable
from TableB
where Name_of_variable like 'Voter_id'
and TableA.Id = TableB.id
limit 1);
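You need to create an index on TableA.Id:
CREATE UNIQUE INDEX Id_idx ON TableA (Id);
If TableA.Id can contain duplicate entries, remove UNIQUE.
You might also want to experiment with a covering index on TableB (given a different name so it cannot clash with the first one; again, drop UNIQUE if TableB.Id is not unique):
CREATE UNIQUE INDEX TableB_Id_idx ON TableB (Id) INCLUDE (Name_of_variable);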
I have resolved this question by changing my update query like this
update TableA set Voter_id = TableB.value_of_variable
from TableB where TableA.id = TableB.id and TableB.Name_of_variable='Voter_id';
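For anyone who wants to experiment on a small copy first, here is a rough sketch using Python's built-in sqlite3 module as a sandbox. It runs the correlated-subquery form from the question with plain indexes on both Id columns. The row 'DEPT 90aa' in TableA and the index names are assumptions added for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (Voter_id TEXT, Id TEXT);
    CREATE TABLE TableB (Name_of_variable TEXT, Id TEXT, value_of_variable TEXT);
    INSERT INTO TableA (Id) VALUES ('DEPT 1f7h'), ('DEPT 3k9n'), ('DEPT 90aa');
    INSERT INTO TableB VALUES
        ('Voter_id',   'DEPT 1f7h', '12OK9MJL'),
        ('First_Name', 'DEPT adas', 'Umar'),
        ('Age',        'DEPT jqwq', '24'),
        ('Voter_id',   'DEPT 90aa', '189H8MLI');
    -- hypothetical index names; non-unique so duplicates are tolerated
    CREATE INDEX TableA_Id_idx ON TableA (Id);
    CREATE INDEX TableB_Id_idx ON TableB (Id);
""")

# Correlated subquery: looks up the matching Voter_id row for each TableA row.
conn.execute("""
    UPDATE TableA
    SET Voter_id = (SELECT b.value_of_variable
                    FROM TableB AS b
                    WHERE b.Name_of_variable = 'Voter_id'
                      AND b.Id = TableA.Id
                    LIMIT 1)
""")
conn.commit()

rows = conn.execute("SELECT Id, Voter_id FROM TableA ORDER BY Id").fetchall()
for row in rows:
    print(row)
```

One behavioural difference worth checking on real data: the subquery form writes NULL into every TableA row that has no match, while the UPDATE ... FROM form only touches rows that do match.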

Simplify multiple joins

I have a Claims table with 70 columns, 16 of which contain diagnosis codes. The codes mean nothing, so I need to pull the descriptions for each code located in a separate table.
There has to be a simpler way of pulling these code descriptions:
-- This is the claims table
FROM
[database].[schema].[claimtable] AS claim
-- [StagingDB].[schema].[Diagnosis] table where the codes located
-- [ICD10_CODE] column contains the code
LEFT JOIN
[StagingDB].[schema].[Diagnosis] AS diag1 ON claim.[ICDDiag1] = diag1.[ICD10_CODE]
LEFT JOIN
[StagingDB].[schema].[Diagnosis] AS diag2 ON claim.[ICDDiag2] = diag2.[ICD10_CODE]
LEFT JOIN
[StagingDB].[schema].[Diagnosis] AS diag3 ON claim.[ICDDiag3] = diag3.[ICD10_CODE]
-- and so on, up to ....
LEFT JOIN
[StagingDB].[schema].[Diagnosis] AS diag16 ON claim.[ICDDiag16] = diag16.[ICD10_CODE]
-- reported column will be [code_desc]
-- ie:
-- diag1.[code_desc] AS Diagnosis1
-- diag2.[code_desc] AS Diagnosis2
-- diag3.[code_desc] AS Diagnosis3
-- diag4.[code_desc] AS Diagnosis4
-- etc.
I think what you are doing is already correct for the given scenario.
Another option, from a programming point of view, is the following; you can give it a try and compare the performance:
i) Unpivot the Claim table on those 16 diagnosis code columns.
ii) Join the unpivoted column with [StagingDB].[schema].[Diagnosis].
Yet another option is to copy the [StagingDB].[schema].[Diagnosis] table into a #temp table
instead of touching the large staging table 16 times.
Some analysis of the data is needed, though, to decide whether either approach actually helps.
You can go for UNPIVOT of the claimTable and then join with Diagnosis table.
TEST SETUP
create table #claimTable(ClaimId INT, Diag1 VARCHAR(10), Diag2 VARCHAR(10))
CREATE table #Diagnosis(code VARCHAR(10), code_Desc VARCHAR(255))
INSERT INTO #ClaimTable
VALUES (1, 'Fever','Cold'), (2, 'Headache','toothache');
INSERT INTO #Diagnosis
VALUEs ('Fever','Fever Desc'), ('cold','cold desc'),('headache','headache desc'),('toothache','toothache desc');
Query to Run
;WITH CTE_Claims AS
(SELECT ClaimId,DiagnosisNumeral, code
FROM #claimTable
UNPIVOT
(
code FOR DiagnosisNumeral in ([Diag1],[Diag2])
) as t
)
SELECT c.ClaimId, c.code, d.code_Desc
FROM CTE_Claims AS c
INNER JOIN #Diagnosis as d
on c.code = d.code
ResultSet
+---------+-----------+----------------+
| ClaimId | code | code_Desc |
+---------+-----------+----------------+
| 1 | Fever | Fever Desc |
| 1 | Cold | cold desc |
| 2 | Headache | headache desc |
| 2 | toothache | toothache desc |
+---------+-----------+----------------+
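UNPIVOT is T-SQL syntax. As a rough, portable sketch of the same reshaping, the snippet below uses UNION ALL instead, run through Python's built-in sqlite3 module. The table names drop the # prefix, and COLLATE NOCASE is added on the assumption that the original test ran under a case-insensitive collation (otherwise 'Cold' would not match 'cold').

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE claimTable (ClaimId INTEGER, Diag1 TEXT, Diag2 TEXT);
    CREATE TABLE Diagnosis (code TEXT, code_Desc TEXT);
    INSERT INTO claimTable VALUES (1, 'Fever', 'Cold'), (2, 'Headache', 'toothache');
    INSERT INTO Diagnosis VALUES
        ('Fever', 'Fever Desc'), ('cold', 'cold desc'),
        ('headache', 'headache desc'), ('toothache', 'toothache desc');
""")

# One SELECT per diagnosis column turns the wide row into one row per code.
query = """
WITH claims AS (
    SELECT ClaimId, 1 AS DiagnosisNumeral, Diag1 AS code FROM claimTable
    UNION ALL
    SELECT ClaimId, 2, Diag2 FROM claimTable
)
SELECT c.ClaimId, c.DiagnosisNumeral, c.code, d.code_Desc
FROM claims AS c
JOIN Diagnosis AS d ON c.code = d.code COLLATE NOCASE
ORDER BY c.ClaimId, c.DiagnosisNumeral
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With 16 diagnosis columns this needs 16 branches, so it is not shorter than the 16 joins; the difference is that the lookup table is joined once and the result is one row per claim and code rather than 16 description columns.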

SQLite query with unknown foreign key

I am playing around with a SQLite database in a vb.net application. The database is supposed to store time series data for many variables.
Right now I am trying to build the database with 2 tables as followed:
Table varNames:
CREATE TABLE IF NOT EXISTS varNames(id INTEGER PRIMARY KEY, varName TEXT UNIQUE);
It looks like this:
ID | varName
---------------
1 | var1
2 | var2
... | ...
Table varValues:
CREATE TABLE IF NOT EXISTS varValues(timestamp INTEGER, varValue FLOAT, id INTEGER, FOREIGN KEY(id) REFERENCES varNames(id) ON DELETE CASCADE);
It looks like this:
timestamp | varValue | id
------------------------------
1 | 1.0345 | 1
4 | 3.5643 | 1
1 | 7.7866 | 2
3 | 4.5668 | 2
... | .... | ...
The first table contains all variable names with IDs. The second table contains the values of each variable for many time steps (indicated by the timestamps). A foreign key links the tables through the variable IDs.
Building up the database works fine.
Now I want to query the database and plot the time series for selected variables. For this I use the following statement:
select [timestamp], [varValue] FROM varValues WHERE (SELECT id from varNames WHERE varName= '" & NAMEvariable & "');
Since the user does not know the variable ID, only the name of the variable (in NAMEvariable), I use the ... WHERE (SELECT ... construct. It seems this really slows down performance. The time series have up to 50k points.
Is there any better way to query values for a specific variable which can only be addressed by its name?
You probably should use a join query, something like:
SELECT a.[timestamp], a.varValue
FROM varValues AS a, varNames AS b
WHERE b.varName = <name>
AND a.id = b.ID
edit: To query for more than one parameter, use something like this:
SELECT a.[timestamp], a.varValue
FROM varValues AS a, varNames AS b
WHERE b.varName IN (<name1>, <name2>, ...)
AND a.id = b.ID
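The join above can be checked against the sample data from the question. Below is a minimal sketch using Python's built-in sqlite3 module in place of the vb.net code; the helper series_for, the index varValues_id_ts, and the use of a bound parameter instead of string concatenation are assumptions added for the illustration. It also runs the WHERE (SELECT ...) form from the question for comparison: without an "id =" in front of the subquery, the condition is just a non-zero number, so no rows are filtered out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE IF NOT EXISTS varNames (id INTEGER PRIMARY KEY, varName TEXT UNIQUE);
    CREATE TABLE IF NOT EXISTS varValues (
        timestamp INTEGER, varValue FLOAT, id INTEGER,
        FOREIGN KEY (id) REFERENCES varNames (id) ON DELETE CASCADE
    );
    INSERT INTO varNames VALUES (1, 'var1'), (2, 'var2');
    INSERT INTO varValues VALUES
        (1, 1.0345, 1), (4, 3.5643, 1), (1, 7.7866, 2), (3, 4.5668, 2);
    -- hypothetical index so the lookup by id does not scan the whole table
    CREATE INDEX IF NOT EXISTS varValues_id_ts ON varValues (id, timestamp);
""")

def series_for(name):
    """Return (timestamp, value) pairs for one variable, looked up by name."""
    return conn.execute(
        """SELECT v.timestamp, v.varValue
           FROM varValues AS v
           JOIN varNames AS n ON n.id = v.id
           WHERE n.varName = ?
           ORDER BY v.timestamp""",
        (name,),
    ).fetchall()

var2 = series_for("var2")

# The form from the question: the subquery result is used as a truth value.
unfiltered = conn.execute(
    "SELECT timestamp, varValue FROM varValues "
    "WHERE (SELECT id FROM varNames WHERE varName = ?)",
    ("var2",),
).fetchall()

print(var2)
print(len(unfiltered))
```

Binding the name with ? also avoids quoting problems when a variable name contains an apostrophe.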

Find one record that exists as two records in another vendor database

I have two vendor databases that have become horribly out-of-sync over the years that I'm trying to correct. A single customer can have multiple id_numbers, and these IDs exist in both vendor databases. All of the IDs for a single customer are correctly attached to one customer record in the Vendor1 database (meaning they belong to the same customer_code). The problem, however, is that those same IDs might be split amongst multiple customers in the Vendor2 database, which is incorrect. I will need to merge those multiple customers together in the Vendor2 database.
I'm trying to identify which customers are represented as two or more customers in the second vendor database. So far I have joined the two together, but I can't figure out how to find only the customers that have two or more distinct MemberInternalKeys for the same customer_code.
Here's what I have so far:
select top 10
c.customer_code,
i.id_number,
cc.MemberInternalKey
from Vendor1.dbo.customer_info as c
join Vendor1.dbo.customer_ids as i
on c.customer_code = i.customer_code
join Vendor2.dbo.Clubcard as cc
on (i.id_number collate Latin1_General_CI_AS_KS) = cc.ClubCardId
where i.id_code = 'PS'
In the example below, I would expect to only get back the last two rows in the table. The first two rows should not be included in the results because they have the same MemberInternalKey for both records and belong to the same customer_code. The third row should also not be included since there is a 1-1 match between both vendor databases.
customer_code | id_number | MemberInternalKey
--------------|-----------|------------------
5549032 | 4000 | 4926877
5549032 | 4001 | 4926877
5031101 | 4007 | 2379218
2831779 | 4029 | 1763760
2831779 | 4062 | 4950922
Any help is greatly appreciated.
If I understand correctly, you can use window functions for this logic:
select c.*
from (select c.customer_code, i.id_number, cc.MemberInternalKey,
min(cc.MemberInternalKey) over (partition by c.customer_code) as minmik,
max(cc.MemberInternalKey) over (partition by c.customer_code) as maxmik
from Vendor1.dbo.customer_info c join
Vendor1.dbo.customer_ids i
on c.customer_code = i.customer_code join
Vendor2.dbo.Clubcard as cc
on (i.id_number collate Latin1_General_CI_AS_KS) = cc.ClubCardId
where i.id_code = 'PS'
) c
where minmik <> maxmik;
This calculates the minimum and maximum MemberInternalKey for each customer_code. The outer where then returns only rows where these are different.
Another option is
Declare @YourTable table (customer_code int, id_number int, MemberInternalKey int)
Insert Into @YourTable values
(5549032,4000,4926877),
(5549032,4001,4926877),
(5031101,4007,2379218),
(2831779,4029,1763760),
(2831779,4062,4950922)
Select A.*
From @YourTable A
Join (
Select customer_code
From @YourTable
Group By customer_code
Having min(MemberInternalKey)<>max(MemberInternalKey)
) B on A.customer_code=B.customer_code
Returns
customer_code id_number MemberInternalKey
2831779 4029 1763760
2831779 4062 4950922
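A small variation on the grouped query, sketched with Python's built-in sqlite3 module as a sandbox: COUNT(DISTINCT MemberInternalKey) > 1 states the "two or more distinct keys" condition directly and gives the same rows as min <> max on this sample (NULL keys are ignored either way). The table name YourTable is kept from the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE YourTable (customer_code INTEGER, id_number INTEGER, MemberInternalKey INTEGER);
    INSERT INTO YourTable VALUES
        (5549032, 4000, 4926877),
        (5549032, 4001, 4926877),
        (5031101, 4007, 2379218),
        (2831779, 4029, 1763760),
        (2831779, 4062, 4950922);
""")

# Inner query: customer codes that map to more than one distinct key.
# Outer query: all detail rows for those customer codes.
rows = conn.execute("""
    SELECT a.customer_code, a.id_number, a.MemberInternalKey
    FROM YourTable AS a
    JOIN (SELECT customer_code
          FROM YourTable
          GROUP BY customer_code
          HAVING COUNT(DISTINCT MemberInternalKey) > 1) AS b
      ON a.customer_code = b.customer_code
    ORDER BY a.id_number
""").fetchall()

for row in rows:
    print(row)
```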

Join two columns from two different tables

I have a problem in joining the two columns from two different tables.
The Scenario is: I have a table A with 11 columns and another table B with 6 columns.
There is a column named SAMPLE1 which exists in both tables. SAMPLE2 from the first table A and ABC from the second table B hold the same kind of value but under different column names. The same goes for SAMPLE3 and DEF. Now I would like to merge those columns into a single column (containing data from both tables), and the rest of the columns should also appear in the final table.
Example:
Table A
SAMPLE1 SAMPLE2 SAMPLE3 .........SAMPLE 11 (Total 11 columns in this table)
US 75.2 US1_US NULL
INDIA 71 I3_INDIA NULL
UK 1851.23 UK1_UK NULL
Table B
SAMPLE1 ABC DEF............. XYZ (Total 6 columns in this table)
CHINA 123.2 C1_CHINA 2
JAPAN 1.1 J1_JAPAN 2
GERMANY 10.2314 G1_GERMANY 2
SINGAPORE 100.22 S1_SINGAPORE 2
Now I would like to see the output like this:
SomeTable
SAMPLE1 SOMENAME1 SOMENAME2..SAMPLE 11 ABC DEF ..... SOMENAME3
US 75.2 US1_US NULL NULL NULL NULL
INDIA 71 I3_INDIA NULL NULL NULL NULL
UK 1851.23 UK1_UK NULL NULL NULL NULL
CHINA 123.2 C1_CHINA NULL NULL NULL 2
JAPAN 1.1 J1_JAPAN NULL NULL NULL 2
GERMANY 10.2314 G1_GERMANY NULL NULL NULL 2
SINGAPORE 100.22 S1_SINGAPORE NULL NULL NULL 2
In short:
SELECT
(SAMPLE1(FROM TABLE A) + SAMPLE1(FROM TABLE B)) AS SAMPLE1,
(SAMPLE2 + ABC) AS SOMENAME1,
(SAMPLE3 + DEF) AS SOMENAME2,
A.SAMPLE4, A.SAMPLE5,...,
B.GHI, B.JKL,...
(A.SAMPLE11 +B.XYZ) AS SOMENAME3
I have used union but it didn't work.
select SAMPLE1,SAMPLE2,SAMPLE3,...,SAMPLE 11 from TABLE A
UNION
SELECT SAMPLE1, ABC, DEF, ...., XYZ FROM TABLE B
Now I am getting an error:
Msg 205, Level 16, State 1, Line 1
All queries combined using a UNION,
INTERSECT or EXCEPT operator must have
an equal number of expressions in
their target lists.
I have used union, coalesce, full outer join (all the suggestions or answers below)
The final code will be used in a select statement. How?
You're probably looking for a join, like:
select a.sample1
, a.sample2
, b.abc
, b.def
... etc ...
from TableA as a
full outer join
TableB as b
on a.sample1 = b.sample1
Join the two tables using FULL OUTER (preserve data from either table where it does not exist on the other), then use COALESCE to get the common SAMPLE1 column from whichever table has it.
SELECT COALESCE(A.SAMPLE1, B.SAMPLE1) SAMPLE1,
A.SAMPLE2,
A.SAMPLE3,
...
A.SAMPLE11,
B.ABC,
B.DEF,
...
B.XYZ
FROM table1 A FULL OUTER JOIN table2 B on A.SAMPLE1 = B.SAMPLE1
References: MSDN - Using Outer Joins / COALESCE
Try this, it may be useful.
Table 1 contains 5 columns:
select col1,col2...col5 from table1
Table 2 contains 3 columns:
select col1,col2,col3 from table2
Query:
select col1,col2,col3,col4,col5 from table1
union
select col1,col2,col3,'','' from table2
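Applied to the tables in the question, the padding idea might look like the sketch below, run through Python's built-in sqlite3 module. It is only an illustration under assumptions: the tables are cut down to a few columns, the SAMPLE4 and GHI values are invented, NULL is used as the filler instead of '', and UNION ALL is used so that no rows are merged as duplicates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (SAMPLE1 TEXT, SAMPLE2 REAL, SAMPLE3 TEXT, SAMPLE4 TEXT, SAMPLE11 INTEGER);
    CREATE TABLE TableB (SAMPLE1 TEXT, ABC REAL, DEF TEXT, GHI TEXT, XYZ INTEGER);
    -- SAMPLE4 and GHI values are made up for the illustration
    INSERT INTO TableA VALUES ('US', 75.2, 'US1_US', 'a4', NULL), ('INDIA', 71, 'I3_INDIA', 'b4', NULL);
    INSERT INTO TableB VALUES ('CHINA', 123.2, 'C1_CHINA', 'g1', 2), ('JAPAN', 1.1, 'J1_JAPAN', 'g2', 2);
""")

# Both branches list the same number of expressions, in the same order.
# Shared columns line up; a column only one table has is padded with NULL.
cursor = conn.execute("""
    SELECT SAMPLE1, SAMPLE2 AS SOMENAME1, SAMPLE3 AS SOMENAME2,
           SAMPLE4, NULL AS GHI, SAMPLE11 AS SOMENAME3
    FROM TableA
    UNION ALL
    SELECT SAMPLE1, ABC, DEF, NULL, GHI, XYZ
    FROM TableB
    ORDER BY 1
""")
columns = [d[0] for d in cursor.description]
rows = cursor.fetchall()

print(columns)
for row in rows:
    print(row)
```

The column names of the result come from the first SELECT, which is where the SOMENAME aliases go. The Msg 205 error in the question appears whenever the two branches do not list the same number of expressions.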

Resources