Rearranging and deduplicating SQL columns based on column data

Rearranging and deduplicating SQL columns based on column data - sql-server

Sorry I know that's a rubbish Title but I couldn't think of a more concise way of describing the issue.
I have a (MSSQL 2008) table that contains telephone numbers:
| CustomerID | Tel1 | Tel2 | Tel3 | Tel4 | Tel5 | Tel6 |
| Cust001 | 01222222 | 012333333 | 07111111 | 07222222 | 01222222 | NULL |
| Cust002 | 07444444 | 015333333 | 07555555 | 07555555 | NULL | NULL |
| Cust003 | 01333333 | 017777777 | 07888888 | 07011111 | 016666666 | 013333 |
I'd like to:
Remove any duplicate phone numbers
Rearrange the telephone numbers so that anything beginning with "07" is the first phone number. If there are multiple 07's, they should be in the first fields. The order of the numbers apart from that doesn't really matter.
So, for example, after processing, the table would look like:
| CustomerID | Tel1 | Tel2 | Tel3 | Tel4 | Tel5 | Tel6 |
| Cust001 | 07111111 | 07222222 | 01222222 | 012333333 | NULL | NULL |
| Cust002 | 07444444 | 07555555 | 015333333 | NULL | NULL | NULL |
| Cust003 | 07888888 | 07011111 | 016666666 | 013333 | 01333333 | 017777777 |
I'm struggling to figure out how to efficiently achieve my goal (there are 600,000+ records in the table). Can anyone help?
I've created a fiddle if it'll help anyone play around with the scenario.

You can break up the numbers into individual rows using UNPIVOT, then reorder them based on the occurence of the '07' prefix using ROW_NUMBER(), and finally recombine it using PIVOT to end up with the 6 Tel columns again.
select *
FROM
(
select CustomerID, Col, Tel
FROM
(
select *, Col='Tel' + RIGHT(
row_number() over (partition by CustomerID
order by case
when Tel like '07%' then 1
else 2
end),10)
from phonenumbers
UNPIVOT (Tel for Seq in (Tel1,Tel2,Tel3,Tel4,Tel5,Tel6)) seqs
) U
) P
PIVOT (MAX(TEL) for Col IN (Tel1,Tel2,Tel3,Tel4,Tel5,Tel6)) V;
SQL Fiddle

Perhaps using cursor to collect all customer id and sorting the fields...traditional sorting technique as we used to do in school c++ ..lolz...like to know if any other method possible.
If you dont get any then it is the last way . It will take a long time for sure to execute.

Related

SQL Server - Is it possible to define a table column as a table?

I know that this is possible in Oracle and I wonder if SQL Server also supports it (searched for answer without success).
It would greatly simplify my life in the current project if I could define a column of a table to be a table itself, something like:
Table A:
Column_1 Column_2
+----------+----------------------------------------+
| 1 | Columns_2_1 Column_2_2 |
| | +-------------+------------------+ |
| | | 'A' | 12345 | |
| | +-------------+------------------+ |
| | | 'B' | 777777 | |
| | +-------------+------------------+ |
| | | 'C' | 888888 | |
| | +-------------+------------------+ |
+----------+----------------------------------------+
| 2 | Columns_2_1 Column_2_2 |
| | +-------------+------------------+ |
| | | 'X' | 555555 | |
| | +-------------+------------------+ |
| | | 'Y' | 666666 | |
| | +-------------+------------------+ |
| | | 'Z' | 000001 | |
| | +-------------+------------------+ |
+----------+----------------------------------------+
Thanks in advance.

There is one option where you can store data as XML
Declare #YourTable table (ID int,XMLData xml)
Insert Into #YourTable values
(1,'<root><ID>1</ID><Active>1</Active><First_Name>John</First_Name><Last_Name>Smith</Last_Name><EMail>john.smith#email.com</EMail></root>')
,(2,'<root><ID>2</ID><Active>0</Active><First_Name>Jane</First_Name><Last_Name>Doe</Last_Name><EMail>jane.doe#email.com</EMail></root>')
Select ID
,Last_Name = XMLData.value('(root/Last_Name)[1]' ,'nvarchar(50)')
,First_Name = XMLData.value('(root/First_Name)[1]' ,'nvarchar(50)')
From #YourTable
Returns
ID Last_Name First_Name
1 Smith John
2 Doe Jane

Actually, for a normalized database we do not require such functionality.
Because if we need to insert a table within a column than we can create a child table and reference it as a foreign key in the parent table.
In spite, if you still insist to such functionality than you can use SQL Server 2016 to support JSON data where you can store any associative list in JSON format.
Like:
DECLARE #json NVARCHAR(4000)
SET #json =
N'{
"info":{
"type":1,
"address":{
"town":"Bristol",
"county":"Avon",
"country":"England"
},
"tags":["Sport", "Water polo"]
},
"type":"Basic"
}'
SELECT
JSON_VALUE(#json, '$.type') as type,
JSON_VALUE(#json, '$.info.address.town') as town,
JSON_QUERY(#json, '$.info.tags') as tags
SELECT value
FROM OPENJSON(#json, '$.info.tags')
In older versions, this can be achieved through xml as shown in previous answer.
Your can also make use of "sql_variant" datatype to map your table.
Previously, I was also in search of such features as available in Oracle. But after reading various articles and blogs from experts, I was convinced, such features will make the things more complex beside helping.
Only storing the data in required format is not important, It is worthy when it is also efficiently available (readable).
Hope this will help you to take your decision.

SQL Server group and inverse

I'm stuck with an easy SQL Server thing for my own personal project.
I have a table X:
| Name | Lecture | Points |
|:-----------|------------:|:------------:|
| John | Math | 2
| John | Bio | 5
| Tom | Physics | 8
| Tom | Math | 2
| Bob | Physics | 1
| Bob | Bio | 6
And I want to group by Name and to put all points I one row for each person:
| Name | Math | Bio | Physics |
|:-----------|------:|:----:|:-------:|
| John | 2 | 5 | NULL
| Tom | 2 | NULL | 8
| Bob | NULL | 6 | 1
I tried doing this:
SELECT Name, ? AS Math, ? AS Bio, ? AS Physics
FROM X
GROUP BY Name
but I don't know what to put instead of "?". How can I do that ?

You need a pivot table and you need to know the values, unless you want to use dynamic sql (not recommended unless absolutely necessary):
SELECT Name,
ISNULL([Math], 0) as [Math],
ISNULL([Bio], 0) as [Bio],
ISNULL([Physics], 0) as [Physics]
FROM
(
SELECT Name, Lecture, SUM(Points)
FROM Table X
GROUP BY Name, Lecture
) as t1
PIVOT (SUM([Points]) for [Lecture] in ([Math], [Bio], [Physics])) as t2

Join two Select Statements into a single row when one select has n amount of entries?

Is it possible in SQL Server to take two select statements and combine them into a single row without knowing how many entries one of the select statements got?
I've been looking around at various Join solutions but they all seem to work on the basis that the amount of columns is predetermined. I have a case here where one table has a determined amount of columns (t1) and the other table have an undetermined amount of entries (t2) which all use a key that matches one entry in t1.
+----+------+-----+
| id | name | ... |
+----+------+-----+
| 1 | John | ... |
+----+------+-----+
And
+-------------+----------------+
| activity_id | account_number |
+-------------+----------------+
| 1 | 12345467879 |
| 1 | 98765432515 |
| ... | ... |
| ... | ... |
+-------------+----------------+
The number of account numbers belonging to the first query is unknown.
After the query it would become:
+----+------+-----+----------------+------------------+-----+------------------+
| id | name | ... | account_number | account_number_2 | ... | account_number_n |
+----+------+-----+----------------+------------------+-----+------------------+
| 1 | John | ... | 12345467879 | 98765432515 | ... | ... |
+----+------+-----+----------------+------------------+-----+------------------+
So I don't know how many account numbers could be associated with the id beforehand.

T-SQL Merging data

I've imported data from an XML file by using SSIS to SQL Server.
The result what I got in the database is similar to this:
+-------+---------+---------+-------+
| ID | Name | Brand | Price |
+-------+---------+---------+-------+
| 2 | NULL | NULL | 100 |
| NULL | SLX | NULL | NULL |
| NULL | NULL | Blah | NULL |
| NULL | NULL | NULL | 100 |
+-------+---------+---------+-------+
My desired result would be:
+-------+---------+---------+-------+
| ID | Name | Brand | Price |
+-------+---------+---------+-------+
| 2 | SLX | Blah | 100 |
+-------+---------+---------+-------+
Is there a pretty solution to solve this in T-SQL?
I've already tried it with a SELECT MAX(ID) and then a GROUP BY ID, but I'm still stuck with the NULL values. Also I've tried it with MERGE, but also a failure.
Could someone give me a direction where to search further?

You can select MAX on all columns....
SELECT MAX(ID), MAX(NAME), MAX(BRAND), MAX(PRICE)
FROM [TABLE]
Click here for a fiddley fidd fiddle...

Is this a good design for a table?

Each user/person could know one or more languages.
All I can think is a table like
+----------+------+-----+------------+-----+-----+-----+-------+
| PersonID | Java | PHP | Javascript | C++ | C | CSS | HTML |
+----------+------+-----+------------+-----+-----+-----+-------+
| 1 | Yes | Yes | No | Yes | No | Yes | No |
| 2 | No | Yes | Yes | No | Yes | No | No |
| 3 | Yes | No | Yes | Yes | Yes | Yes | No |
+----------+------+-----+------------+-----+-----+-----+-------+
Considering I'm going to need at least 100 columns for all the languages, is it normal to have that many columns? Something tells me this is the wrong approach.
Thank you very much and sorry about my english!

I would suggest you to create three tables.
One table contains the information of the Person like his Name etc.
Second table contains two columns LanguageId and Language name.
+------------+-----------+
| LanguageID | Name |
+------------+-----------+
| 1 | Javascript|
| 2 | C |
| 3 | C++ |
+------------+-----------+
Third table contains the Id, PersonId, LanguageID. In this table you can join the above two tables record.
+---+----------+------------+
|ID | PersonID | LanguageID |
+---+----------+------------+
|1 | 1 | 1 |
|2 | 2 | 2 |
|3 | 3 | 3 |
+---+----------+------------+
Reasons to support my answer:
In future if you want to add any new language in your table then it
would be easier to add that in the main table.
You can join the two tables easily and get the result

A little improvement we can do over Rahul Tripathi response is to remove the "Known" column. You need only two tables for this case. One containing PersonId and LanguageId the person knows. The second table is for the languages only.
You know what languages one person knows by joining both tables. For example if you need to know the list of known languages you can do:
SELECT p.PersonId, l.Name
FROM Person p INNER JOIN Language l ON (p.LanguageId = l.LanguageId)
WHERE (p.PersonId = theIdYouNeedToKnow)

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight

Rearranging and deduplicating SQL columns based on column data - sql-server

Perhaps using cursor to collect all customer id and sorting the fields...traditional sorting technique as we used to do in school c++ ..lolz...like to know if any other method possible. If you dont get any then it is the last way . It will take a long time for sure to execute.

Related

SQL Server - Is it possible to define a table column as a table?

SQL Server group and inverse

Join two Select Statements into a single row when one select has n amount of entries?

T-SQL Merging data

Is this a good design for a table?

Categories

Resources