when duplicate values found then - sql-server

I want a query that selects all duplicate values in a column. If those values meet a condition, I'd like the query to return only those values.
Class Student_ID Location
Biology 511 4A
Biology 512 15B
Biology 513 15B
English 514 6A
Biology 521 6A
Spanish 522 6A
Spanish 523 15B
Chemistry 524 4A
English 531 15B
Biology 532 4A
Chemistry 534 4A
Select all duplicate values in the Class column and, if the rows for a class include both location 4A and location 15B, assign 1.
CASE WHEN count(class) > 1 AND (Location = '4A' AND Location = '15B') THEN 1
ELSE 0 END
What matters most is how to select the duplicate values as a group and then test the condition (the locations must include both 4A and 15B). The query must first group the duplicated values in the Class column and then check whether the rows within each group satisfy the location condition. For example, grouping the Class column gives 5x Biology; this is treated as one group, and if that group contains one row with location 4A and one row with location 15B, then and only then is Biology assigned the value 1. Almost all the values in the Class column have duplicates.
Desired Output
Class Location
Biology 1
Chemistry 0
English 0
Spanish 0

As an alternative to Tim Schmelter's answer, you can also do this with a LEFT JOIN.
SELECT yt1.Class, IIF(COUNT(yt2.Class) > 0, 1, 0) AS IsMatch
FROM YourTable yt1
LEFT JOIN YourTable yt2 ON yt1.Location = '4A' AND yt2.Location = '15B' AND
yt2.Class = yt1.Class
GROUP BY yt1.Class
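The group-then-check logic the question asks for can also be sketched outside SQL. The Python below is only an illustration: the function name flag_classes and the required parameter are made up for this sketch, and the rows are copied from the question's sample table.

```python
from collections import defaultdict

# Rows copied from the question's sample table: (Class, Student_ID, Location).
rows = [
    ("Biology", 511, "4A"), ("Biology", 512, "15B"), ("Biology", 513, "15B"),
    ("English", 514, "6A"), ("Biology", 521, "6A"), ("Spanish", 522, "6A"),
    ("Spanish", 523, "15B"), ("Chemistry", 524, "4A"), ("English", 531, "15B"),
    ("Biology", 532, "4A"), ("Chemistry", 534, "4A"),
]

def flag_classes(rows, required=("4A", "15B")):
    """Return {class: 1 or 0}; 1 only if the class has every required location.

    flag_classes is a hypothetical helper, not part of any answer in the thread.
    """
    locations = defaultdict(set)
    for class_name, _student_id, location in rows:
        # Collect the distinct locations seen for each class (the "group").
        locations[class_name].add(location)
    # A class is flagged when the required locations are a subset of its set.
    return {c: int(set(required) <= locs) for c, locs in sorted(locations.items())}

result = flag_classes(rows)
print(result)
```

A class that has both required locations necessarily has at least two rows, so the separate "duplicate" test is implied by the location test.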

Related

Pulling ordered list using array functions in Excel

I have a report in excel that displays the sales results from each employee. The columns are Location, Region, Username & Sales. It is sorted by Sales descending, showing which employee has the best sales in the company.
I am attempting to have an additional sheet per region that displays the results for all employees in that region, also sorted by Sales (to avoid sorting the results of the many regions myself every day).
An example version of the first 12 rows of the Data Sheet:
G H I J K X
Row Location Username Sales Region Region
1 38 John.Doe 85 North1 North1
2 154 John.Smith 83 South2
3 23 E.Williams 83 North1
4 210 M.Williams 79 East5
5 139 Joe.Dawn 77 North2
6 22 Kay.Smith 69 South2
7 51 Jay.Smith 69 South2
8 125 L.Smith 69 East2
9 51 L.Day 69 South2
10 23 23.Guest2 67 North1
11 92 U.Goode 65 North4
I have successfully created an array function that pulls the Sales column of only the results in the specified region.
{=LARGE(SMALL(IF(IF(ISERROR(K:K),"",K:K)=$X$2,J:J),
ROW(INDIRECT("1:"&COUNTIF(K:K,$X$2)))),F2)}
I am attempting now for an array function that pulls the Username that matches the corresponding sales amount in the original array, and also matches the region. I am having trouble when a single region has 'ties' or more than one employee with the same sales that month. Here is what I started with for that function:
=INDEX(I:I,MATCH(1,(Y2=J:J)*($X$1=K:K),0))
That formula fails when a single region has multiple users with the same sales. So I am trying a conditional to accommodate this, falling back to the function I know works when a sales value occurs only once in that region.
{=IF(COUNTIF($AB$2:AB2,AB2)>1,
INDEX(I:I,
SMALL(IF(J:J=AB2,
IF(K:K=$AB$2,ROW(K:K)-ROW(INDEX(K:K,1,1))+1)),
COUNTIF($AB$2:AB2,AB2))),
INDEX(I:I,MATCH(1,(AC2=J:J)*($AB$2=K:K),0)))}
The inner piece may be sufficient if it worked, excluding the need for the conditional:
{=INDEX(I:I,
SMALL(IF(J:J=AB2,
IF(K:K=$AB$2,ROW(K:K)-ROW(INDEX(K:K,1,1))+1)),
COUNTIF($AB$2:AB2,AB2)))}
I'll use the same function for Username.
Expected results for two regions:
X Y Z AA AB AC AD AE
Region Sales Username Location Region Sales Username Location
North1 85 John.Doe 38 South2 83 John.Smith 154
83 E.Williams 23 69 Kay.Smith 22
67 23.Guest2 23 69 Jay.Smith 51
69 L.Day 51
Since beginning to type this question I have found a workaround that uses a few additional columns to complete the calculation, but I still wanted to ask this to see if it was possible, for knowledge's sake.
With North1 in X2, these are the formulas for Y2 (Sales), Z2 (Username) and AA2 (Location), in that order.
=IFERROR(AGGREGATE(14, 6, ($J$2:$J$999)/($K$2:$K$999=X$2), ROW(1:1)), "")
=IFERROR(INDEX($I:$I, AGGREGATE(15, 6, ROW($2:$999)/(($K$2:$K$999=X$2)*($J$2:$J$999=Y2)), COUNTIF(Y$2:Y2, Y2))), "")
=IFERROR(INDEX($H:$H, AGGREGATE(15, 6, ROW($2:$999)/(($K$2:$K$999=X$2)*($J$2:$J$999=Y2)), COUNTIF(Y$2:Y2, Y2))), "")
Fill down as necessary.
With South2 in AB2, copy Y2:AA2 to AC2:AE2 and fill down as necessary.
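For readers who want to check what the formulas should return, here is a rough Python sketch of the same idea: filter to one region, rank by Sales descending, and resolve ties by sheet order (which is what the COUNTIF occurrence counter achieves). The function name region_report is invented for this sketch, and the data is the eleven sample rows from the question.

```python
# First rows of the question's data sheet: (Location, Username, Sales, Region).
data = [
    (38, "John.Doe", 85, "North1"), (154, "John.Smith", 83, "South2"),
    (23, "E.Williams", 83, "North1"), (210, "M.Williams", 79, "East5"),
    (139, "Joe.Dawn", 77, "North2"), (22, "Kay.Smith", 69, "South2"),
    (51, "Jay.Smith", 69, "South2"), (125, "L.Smith", 69, "East2"),
    (51, "L.Day", 69, "South2"), (23, "23.Guest2", 67, "North1"),
    (92, "U.Goode", 65, "North4"),
]

def region_report(data, region):
    """Return (Sales, Username, Location) rows for one region, best sales first.

    region_report is a hypothetical helper used only for illustration.
    """
    in_region = [row for row in data if row[3] == region]
    # sorted() is stable, so employees with equal sales keep their sheet order.
    ranked = sorted(in_region, key=lambda row: row[2], reverse=True)
    return [(sales, user, loc) for loc, user, sales, _region in ranked]

for line in region_report(data, "South2"):
    print(line)
```

On the sample rows this reproduces the expected results table from the question, including the three-way tie at 69 in South2.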

SQL Server : GPS distance returning invalid floating point

I have a table in a SQL Server database that contains columns:
ID, Latitude, Longitude
I want to calculate the distance between each pair of points and keep only the pairs that are less than 10 km apart, but when I apply WHERE distance < 10 I get an error:
An invalid floating point operation occurred
I'm not sure what I'm doing wrong, because it works perfectly without the WHERE clause.
My code:
SELECT
[t5].Name as Hotel1, [t6].Name as Hotel2, Distance
FROM
(SELECT
[t2].ID as GPS1, [T1].ID as GPS2,
(ACOS(Cos(PI()*[t1].Latitude/180.0) * Cos(PI() * [t1].Longitude/180.0) * Cos(PI()*[t2].Latitude/180.0) * Cos(PI() * [t2].Longitude/180.0) + Cos(PI()*[t1].Latitude/180.0) * Sin(PI() * [t1].Longitude/180.0) * Cos(PI()*[t2].Latitude/180.0) * Sin(PI() *[t2].Longitude/180.0) + Sin(PI()*[t1].Latitude/180.0) * Sin(PI() * [t2].[Latitude]/180.0)) * 6371) AS Distance
FROM
GPSLocations [t1]
JOIN
GPSLocations [t2] ON [t1].ID <> [t2].ID) [t4]
JOIN
Hotels [t5] ON [t5].FK_GPSLocationID = GPS1
JOIN
Hotels [t6] ON [t6].FK_GPSLocationID = GPS2
WHERE
distance < 10
Sample data:
GPSLocations table looks like this
Latitude Longitude ID
39,7531224 -105,0001446 847
55,10309 12,31064 581
55,10317 12,37527 684
55,10382 9,35923 740
55,1097 12,50471 636
55,11163 10,77026 358
55,11366 11,74766 668
Hotel table looks like this:
ID Name FK_GPSLocationID
64 Hotel Findus 59
65 Best Western CPH 60
66 Comwell Middelfart 61
67 Hotel Middelfart 62
68 Master demo 1 63
69 Sky hotel 1 64
70 Rene bigstart hotel 3 65
This is just sample data, and therefore the foreign keys do not match.
All the GPS locations can be downloaded here: http://noxiaz.dk/GPSLocations.txt
The joins with the Hotels table make no difference to the error I'm getting.
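No confirmed answer is recorded here, so the following is an assumption rather than a diagnosis. A common cause of this error with the spherical law of cosines is rounding: for two identical or nearly identical points the sum passed to ACOS can land a hair above 1, which is outside its domain. With the WHERE clause present, the engine may also evaluate the distance expression before the ID <> ID filter has removed the identical pairs. The Python sketch below mirrors the query's formula and clamps the value into [-1, 1] before taking the arc cosine; the name distance_km is made up for this sketch.

```python
import math

EARTH_RADIUS_KM = 6371.0  # same factor as in the query

def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance using the same three-term sum as the query.

    distance_km is a hypothetical helper; the clamp is the assumed fix.
    """
    p1, l1, p2, l2 = (math.radians(v) for v in (lat1, lon1, lat2, lon2))
    dot = (math.cos(p1) * math.cos(l1) * math.cos(p2) * math.cos(l2)
           + math.cos(p1) * math.sin(l1) * math.cos(p2) * math.sin(l2)
           + math.sin(p1) * math.sin(p2))
    # Rounding can push dot slightly outside [-1, 1]; acos would then fail.
    dot = max(-1.0, min(1.0, dot))
    return math.acos(dot) * EARTH_RADIUS_KM

# Two rows from the sample data (decimal commas written as points).
print(distance_km(55.10309, 12.31064, 55.10317, 12.37527))
```

In T-SQL the equivalent guard would be a CASE expression around the ACOS argument that caps it at 1 and floors it at -1.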

Get all employees count whose first name starts alphabetically

In T-SQL, how do I get the starting letter of each employee's fname and the number of employees whose name starts with that letter? I have to do this on the pubs database.
select ASCII(fname) 'ASCII CODE',SUBSTRING(fname,1,1) 'LETTER' from employee
Output
65 A
65 A
65 A
65 A
67 C
67 C
68 D
68 D
69 E
.. ..
Expecting output
10 A
20 B
30 C
.. ..
Since it involves grouping by the first letter of fname, I included GROUP BY fname, but the output did not change. What is the exact SQL I need to run?
SELECT LEFT(fname,1), COUNT(1)
FROM employee
GROUP BY LEFT(fname,1)
Edit: Damn! Ninja'd - SO can be a bit slow to update sometimes lol
You just need to group by the same expression you select:
SELECT SUBSTRING(fname,1,1) 'LETTER', COUNT(*) cnt
FROM employee
GROUP BY SUBSTRING(fname,1,1)
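Both answers group on the first character rather than on the whole name, which is why GROUP BY fname left the output unchanged. The same counting can be sketched in Python; the names below are invented stand-ins for employee.fname, and folding to upper case is an assumption (in SQL Server that behaviour depends on the column's collation).

```python
from collections import Counter

# Hypothetical first names standing in for the employee.fname column.
fnames = ["Aria", "Ann", "Carlos", "Diego", "Anabela", "Carine"]

def count_by_first_letter(names):
    """Return (letter, count) pairs sorted by letter.

    count_by_first_letter is a hypothetical helper for illustration only.
    """
    # Group on the first character only; empty names are skipped.
    counts = Counter(name[0].upper() for name in names if name)
    return sorted(counts.items())

print(count_by_first_letter(fnames))
```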

Formatting link lists using TSQL

Shog9 keeps on making my link lists look awesome.
Essentially, I write a bunch of queries that pull out results from the Stackoverflow data dump. However, my link lists look very ugly and are hard to understand.
Using some formatting magic Shog9 manages to make the link lists look a lot nicer.
So, for example, I will write a query that returns the following:
question id,title,user id, other info
4,When setting a form’s opacity should I use a decimal or double?,8,Eggs McLaren, some other stuff lots of text
And I want it to paste it into an answer on meta and make it look like this:
Question Id User Name Other Info
When setting a form’s opacity... Eggs Mclaren Some other stuff...
So assuming my starting point is the query that returns the start info.
What is the smallest number of steps I can run in Query Analyzer to turn the results into:
<h3> Question Id User Name Other Info </h3>
<pre>
When setting a form’s opacity... Eggs Mclaren Some other stuff...
</pre>
My initial thoughts are to insert the results into a temp table and then run a stored proc that will iron the data into my desired structure. Run the proc, cut and paste and be done.
Any candidate TSQL based solutions to this problem?
EDIT: Accepting my answer; it's the only solution with an implementation.
Not sure of your exact requirements, but have you considered selecting the data as XML and then applying an XSLT transform to the results?
I'll update this post with my progress as I refine my proc:
Example:
select top 20
UserId = u.Id,
UserName = u.DisplayName,
u.Reputation,
sum(case when p.ParentId is null then 1 else 0 end) as Questions,
sum(case when p.ParentId is not null then 1 else 0 end) as Answers
into #t
from Users u
join Posts p on p.OwnerUserId = u.Id
where p.CommunityOwnedDate is null and p.ClosedDate is null
group by u.Id, u.DisplayName, u.Reputation
having sum(case when p.ParentId is not null then 1 else 0 end) < sum(case when p.ParentId is null then 1 else 0 end) / 6
order by Reputation desc
exec spShog9
Results:
User Reputation Questions Answers
Edward Tanguay 8317 465 24
me 5767 311 29
Joan Venge 4844 226 14
Blankman 4546 310 1
acidzombie24 4359 371 32
Thanks 4350 416 21
Masi 4193 555 74
LazyBoy 3230 94 12
KingNestor 3187 92 11
Nick 2084 79 6
George2 1973 263 1
Xaisoft 1944 174 12
John 1929 160 24
danmine 1901 53 3
zsharp 1771 145 16
carrier 1742 56 8
JC Grubbs 1550 50 5
vg1890 1534 56 2
Coocoo4Cocoa 1514 143 0
Keand64 1513 83 5
The proc is on gist: http://gist.github.com/165544
You could do something like:
with
data (question_id,title,user_id, username ,other_info) as
(
select 4,'When setting a form''s opacity should I use a decimal or double?',8,'Eggs McLaren', 'some other stuff lots of text'
union all
select 5,'Another q title',9,'OtherUsername', 'some other stuff lots of text')
select
(select 'http://stackoverflow.com/questions/' + cast(question_id as varchar(10)) as [@href], title as [*] for xml path('a')) as questioninfo
,(select 'http://stackoverflow.com/users/' + cast(user_id as varchar(10)) + '/' + replace(username, ' ', '-') as [@href], username as [*] for xml path('a')) as userinfo
, other_info
from data
...but see how you go. I personally find that FOR XML PATH is very powerful for getting marked-up results in a way that suits me.
Rob

Finding bigram in a location index

I have a table which indexes the locations of words in a bunch of documents.
I want to identify the most common bigrams in the set.
How would you do this in MSSQL 2008?
The table has the following structure:
LocationID -> DocID -> WordID -> Location
I have thought about trying some kind of complicated join... and I'm just doing my head in.
Is there a simple way of doing this?
I think I'd better edit this on Monday in order to bump it up in the questions.
Sample Data
LocationID DocID WordID Location
21952 534 27 155
21953 534 109 156
21954 534 4 157
21955 534 45 158
21956 534 37 159
21957 534 110 160
21958 534 70 161
It's been years since I've written SQL, so my syntax may be a bit off; however, I believe the logic is correct.
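-- [index] is bracketed because INDEX is a reserved word in T-SQL
SELECT CAST(i.WordID AS varchar(10)) + '|' + CAST(j.WordID AS varchar(10)) AS bigram, COUNT(*) AS freq
FROM [index] AS i
JOIN [index] AS j ON j.DocID = i.DocID AND j.Location = i.Location + 1
GROUP BY i.WordID, j.WordID
ORDER BY freq DESC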
You can also add the actual word IDs to the select list if that's useful, and add a join to whatever table you've got that dereferences WordID to actual words.
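The self-join pairs each word with the word at the next Location in the same document. A small Python sketch of that pairing, run on the question's sample rows, may make the logic easier to check. The name count_bigrams is invented here, and the sketch assumes at most one word per (DocID, Location).

```python
from collections import Counter

# Sample rows from the question: (LocationID, DocID, WordID, Location).
index_rows = [
    (21952, 534, 27, 155), (21953, 534, 109, 156), (21954, 534, 4, 157),
    (21955, 534, 45, 158), (21956, 534, 37, 159), (21957, 534, 110, 160),
    (21958, 534, 70, 161),
]

def count_bigrams(rows):
    """Return [((first_word_id, second_word_id), freq), ...], most frequent first.

    count_bigrams is a hypothetical helper; it assumes one word per
    (DocID, Location), as the sample data suggests.
    """
    word_at = {(doc_id, loc): word_id for _loc_id, doc_id, word_id, loc in rows}
    bigrams = Counter()
    for (doc_id, loc), word_id in word_at.items():
        # Same condition as the join: same document, Location + 1.
        next_word = word_at.get((doc_id, loc + 1))
        if next_word is not None:
            bigrams[(word_id, next_word)] += 1
    return bigrams.most_common()

print(count_bigrams(index_rows))
```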
