How to limit the results of a query and group them together

I have the two tables pictured from a "city jail" DB: one holds the sentences given to criminals and the other holds criminal information. I am trying to write a query that lists only the criminal_id, first name, and last name of criminals with more than one sentence (i.e. the criminal_ids that have more than one sentence_id associated with them).
I have tried this query but get an error.
select
criminals.last, sentences.criminal_id,
count(sentences.sentence_id) as 'Number of Sentences'
from
criminals
join
sentences on criminals.criminal_id = sentences.criminal_id
where
count(sentences.sentence_id) > 1
group by
criminals.last
order by
'Number of Sentences' desc;
I get this error:
An aggregate may not appear in the WHERE clause unless it is in a subquery contained in a HAVING clause or a select list, and the column being aggregated is an outer reference.
I would appreciate any suggestions on how to go about this one.

Filtering on aggregates such as COUNT happens in the HAVING clause, which is evaluated after GROUP BY, so you may use this version:
SELECT c.last, s.criminal_id, COUNT(s.sentence_id) AS [Number of Sentences]
FROM criminals c
INNER JOIN sentences s
ON c.criminal_id = s.criminal_id
GROUP BY c.last, s.criminal_id
HAVING COUNT(s.sentence_id) > 1
ORDER BY COUNT(s.sentence_id) DESC;
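To see the HAVING filter in action on a small data set, here is a rough sketch. It is only an illustration under assumptions: it uses Python's built-in sqlite3 module as a stand-in for SQL Server, the sample rows are invented, and the `last` column is renamed to `last_name` (a hypothetical name) to stay clear of keyword clashes in SQLite.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE criminals (criminal_id INTEGER PRIMARY KEY, last_name TEXT);
    CREATE TABLE sentences (sentence_id INTEGER PRIMARY KEY, criminal_id INTEGER);
    -- Hypothetical sample rows, not taken from the original database.
    INSERT INTO criminals VALUES (1, 'Adams'), (2, 'Baker'), (3, 'Clark');
    INSERT INTO sentences VALUES (10, 1), (11, 1), (12, 1), (13, 2), (14, 3), (15, 3);
""")

query = """
    SELECT c.last_name, s.criminal_id, COUNT(s.sentence_id) AS number_of_sentences
    FROM criminals c
    INNER JOIN sentences s ON c.criminal_id = s.criminal_id
    GROUP BY c.last_name, s.criminal_id   -- groups are formed first
    HAVING COUNT(s.sentence_id) > 1       -- then whole groups are filtered
    ORDER BY number_of_sentences DESC
"""
repeat_offenders = conn.execute(query).fetchall()
for row in repeat_offenders:
    print(row)
```

Baker has a single sentence, so that group is dropped by HAVING; the other two criminals remain, ordered by their sentence count.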

Related

How to select the first rows distinct by a column name in a sub-query in SQL Server?

I am building a Skype-like tool in which I have to show the last 10 distinct users who have logged in to my web application.
I maintain a table in SQL Server that has a field called last_active_time. My requirement is to sort the table by last_active_time and show all the columns for the last 10 distinct users.
There is another field called WWID which uniquely identifies a user.
I am able to find the distinct WWIDs, but I am not able to select all the columns of those rows.
I am using the query below to find the distinct WWIDs:
select distinct(wwid) from(select top 100 * from dbo.rvpvisitors where last_active_time!='' order by last_active_time DESC) as newView;
But how do I find those distinct rows? I want to show how long each user has been away from the web app, using the difference between the current time and the last active time.
I am new to SQL; maybe the question is naive, but I am struggling to get it right.

If you are using proper data types for your columns, you won't need a subquery to get that result. The following query should do the trick:
SELECT TOP 10
[wwid]
,MAX([last_active_time]) AS [last_active_time]
FROM [dbo].[rvpvisitors]
WHERE
[last_active_time] != ''
GROUP BY
[wwid]
ORDER BY
[last_active_time] DESC
If the column [last_active_time] is of type varchar/nvarchar (which is probably the case, since you check for empty strings in the WHERE clause), you might need to use CAST or CONVERT to treat it as an actual date and be able to use functions like MIN/MAX on it.
In general I suggest using proper data types for your columns: if you have date or timestamp data, use the "date" or "datetime2" data types.
Edit:
The query aggregates the data based on the column [wwid], and for each returns the maximum [last_active_time].
The result is then sorted and filtered.
In order to add more columns "as-is" (without aggregating them) just add them in the SELECT and GROUP BY sections.
If you need more aggregated columns add them in the SELECT with the appropriate aggregation function (MIN/MAX/SUM/etc)
I suggest you have a look at GROUP BY on W3
To know more about the "execution order" of the instruction you can have a look here
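The aggregate-then-sort behaviour described in the edit can be tried out on a few rows. This is just a sketch under assumptions: Python's sqlite3 module stands in for SQL Server, LIMIT plays the role of TOP, and the sample rows are made up (stored as ISO-8601 text, which happens to sort chronologically).

```python
import sqlite3

# SQLite stand-in for SQL Server (assumption); rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE rvpvisitors (wwid TEXT, last_active_time TEXT);
    INSERT INTO rvpvisitors VALUES
        ('u1', '2024-01-01 09:00:00'),
        ('u1', '2024-01-03 12:30:00'),
        ('u2', '2024-01-02 08:15:00'),
        ('u3', '2024-01-04 17:45:00'),
        ('u3', '2024-01-01 07:00:00');
""")

latest_per_user = conn.execute("""
    SELECT wwid, MAX(last_active_time) AS last_active_time
    FROM rvpvisitors
    WHERE last_active_time != ''
    GROUP BY wwid                        -- one output row per user
    ORDER BY MAX(last_active_time) DESC  -- most recently active first
    LIMIT 2                              -- SQLite's counterpart of TOP 2
""").fetchall()
print(latest_per_user)
```

Each wwid appears once, carrying only its latest timestamp, and the limit is applied after the sort.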

You can solve a problem like this by ranking the rows within each key and keeping only the top-ranked row per key. This removes duplicates while still returning every column of each user's most recent row.
;WITH RankOrdered AS
(
SELECT
*,
wwidRank = ROW_NUMBER() OVER (PARTITION BY wwid ORDER BY last_active_time DESC)
FROM
dbo.rvpvisitors
WHERE
last_active_time != ''
)
-- wwidRank = 1 is each user's most recent row; the ORDER BY makes TOP(10) pick the 10 most recent users
SELECT TOP(10) * FROM RankOrdered WHERE wwidRank = 1 ORDER BY last_active_time DESC
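For anyone who wants to experiment with the ROW_NUMBER approach, here is a small sketch. Assumptions: it runs against SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or later), LIMIT replaces TOP, and the `page` column and all rows are invented to show that whole rows survive.

```python
import sqlite3

# SQLite stand-in for SQL Server (assumption); `page` is a hypothetical extra column.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE rvpvisitors (wwid TEXT, page TEXT, last_active_time TEXT);
    INSERT INTO rvpvisitors VALUES
        ('u1', 'home',    '2024-01-01 09:00:00'),
        ('u1', 'reports', '2024-01-03 12:30:00'),
        ('u2', 'home',    '2024-01-02 08:15:00'),
        ('u3', 'admin',   '2024-01-04 17:45:00'),
        ('u3', 'home',    '2024-01-01 07:00:00');
""")

latest_rows = conn.execute("""
    WITH RankOrdered AS (
        SELECT wwid, page, last_active_time,
               ROW_NUMBER() OVER (
                   PARTITION BY wwid               -- restart numbering per user
                   ORDER BY last_active_time DESC  -- newest row gets rank 1
               ) AS wwidRank
        FROM rvpvisitors
        WHERE last_active_time != ''
    )
    SELECT wwid, page, last_active_time
    FROM RankOrdered
    WHERE wwidRank = 1
    ORDER BY last_active_time DESC
    LIMIT 10
""").fetchall()
for row in latest_rows:
    print(row)
```

Unlike a plain GROUP BY, this keeps the non-aggregated columns of the winning row, which is what the question asked for.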

If my understanding is right, the query below will give the desired output.
You can add conditions according to your needs.
select top 10 wwid from dbo.rvpvisitors group by wwid order by max(last_active_time) desc

SQL Server Select Top records from one table in another table

I have a table with thousands of keywords. I would like to isolate the top 25 negative keywords in that table and then from those top keywords create a join to find the sentences linked to those keywords in another table. The final result will be id_file, sentence_id, sentiment, sentence, token. Both tables have the tokens.
The token table (tbl_token) has the following columns:
id_file, sentence_id, sentiment, token
The filters to isolate the top 25 from tbl_token are as follows:
id_file = 3, sentiment = 'negative'
The sentence table (tbl_sentence) has the following columns:
id_file, sentence_id, sentiment, **sentence**, token
The sentence_id in both tables have a one to many relationship so a join on those will pull out the sentences. The tokens from the top query exist in tbl_sentence.
My current solution is to first run a top 25 from tbl_token for the same filters as above, count token, sort it in descending order.
SELECT TOP (25)
COUNT(token) AS Count, token
FROM
tbl_token
GROUP BY
token, sentiment, id_file
HAVING
(sentiment = N'negative') AND (id_file = 3)
ORDER BY
COUNT(token) DESC
Then I link that to all the tokens in a view which has the sentence_id. Then I can link sentence_id from the view to tbl_sentence to isolate the sentences based on the top 25 negative keywords.
This works, but I am just wondering whether it can be done in one stored procedure.

This is a simple query using a SELECT TOP with an INNER JOIN. Have you researched JOINS? Also, are you sure you don't mean one to many? If the token appears in multiple sentences then you will only get the first 25 results as you specified instead of multiple matches of the top 25 tokens. The ORDER BY is relatively important as the TOP 25 will not always be in a predictable order unless you specify a rank order.
SELECT TOP 25
ts.id_file,
ts.sentence_id,
ts.sentiment,
ts.sentence,
ts.token
FROM
tbl_token tt
INNER JOIN tbl_sentence ts on ts.sentence_id=tt.sentence_id
WHERE
tt.id_file=3
AND
tt.sentiment='negative'
ORDER BY
tt.SomeFieldToRank25ByDateOrPriority
Edited for One to Many!
SELECT
ts.id_file,
ts.sentence_id,
ts.sentiment,
ts.sentence,
ts.token,
SentenceCount=COUNT(*)
FROM
(
SELECT TOP 25
tt.sentence_id
FROM
tbl_token tt
WHERE
tt.id_file=3
AND
tt.sentiment='negative'
ORDER BY
tt.SomeFieldToRank25ByDateOrPriority
)AS X
INNER JOIN tbl_sentence ts on ts.sentence_id=x.sentence_id
GROUP BY
ts.id_file,
ts.sentence_id,
ts.sentiment,
ts.sentence,
ts.token
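As a rough sketch of doing the "top tokens, then their sentences" step in a single statement, one option is a derived table that picks the most frequent tokens and is joined straight back to tbl_sentence. This is an illustration under assumptions, not necessarily what the asker needs: it joins on token (the question says both tables carry it), runs on SQLite via Python's sqlite3 module with LIMIT in place of TOP, uses a top 2 instead of a top 25, and all rows are invented.

```python
import sqlite3

# SQLite stand-in for SQL Server (assumption); all rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_token (id_file INTEGER, sentence_id INTEGER, sentiment TEXT, token TEXT);
    CREATE TABLE tbl_sentence (id_file INTEGER, sentence_id INTEGER, sentiment TEXT,
                               sentence TEXT, token TEXT);
    INSERT INTO tbl_token VALUES
        (3, 1, 'negative', 'bad'),  (3, 2, 'negative', 'bad'), (3, 3, 'negative', 'bad'),
        (3, 1, 'negative', 'slow'), (3, 4, 'negative', 'slow'),
        (3, 5, 'negative', 'ugly'),
        (1, 7, 'negative', 'ugly'), (1, 8, 'negative', 'ugly'), (1, 9, 'negative', 'ugly');
    INSERT INTO tbl_sentence VALUES
        (3, 1, 'negative', 'The service was bad and slow', 'bad'),
        (3, 1, 'negative', 'The service was bad and slow', 'slow'),
        (3, 2, 'negative', 'Bad food', 'bad'),
        (3, 5, 'negative', 'Ugly decor', 'ugly'),
        (3, 6, 'positive', 'Good coffee', 'good');
""")

sentences_for_top_tokens = conn.execute("""
    SELECT ts.id_file, ts.sentence_id, ts.sentiment, ts.sentence, ts.token
    FROM tbl_sentence ts
    INNER JOIN (
        SELECT token                      -- most frequent negative tokens in file 3
        FROM tbl_token
        WHERE id_file = 3 AND sentiment = 'negative'
        GROUP BY token
        ORDER BY COUNT(*) DESC
        LIMIT 2                           -- would be TOP (25) in the real query
    ) AS top_tokens ON top_tokens.token = ts.token
    WHERE ts.id_file = 3 AND ts.sentiment = 'negative'
    ORDER BY ts.sentence_id, ts.token
""").fetchall()
for row in sentences_for_top_tokens:
    print(row)
```

Filtering with WHERE before grouping keeps rows from other files out of the token counts, so 'ugly' does not make the cut here even though it is frequent in file 1.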

Basic T-SQL COUNT

I'm new to T-SQL and this question is T-SQL Count 101.
I'm studying T-SQL with this site http://sqlmag.com/t-sql/t-sql-101-lesson-4, but I can't figure out which part of the code tells COUNT what to count, the way a WHERE (column_name) condition would. In other words, how does this COUNT know what to count? It just says to count everything from the MovieReview table as Reviews.
SELECT MovieName,
LEFT(REPLICATE('* ',AVG(Stars)),10)
AS 'Stars',
COUNT(*) AS 'Reviews'
FROM MovieReview
GROUP BY MovieName
HAVING COUNT(*) >= 4
ORDER BY Stars
Result:
The TABLE name is MovieReview that contains the ratings that the five employees have given to movies they’ve watched in their spare time. This table contains four columns: EmployeeID, Genre, MovieName, and Stars. The Stars field specifies the movie’s rating, where 1 star is the worst rating and 5 is the best rating.
I understand the code below because it specifies a WHERE clause: count everything as '...' from the Employee table where the salary is less than 30000.
SELECT COUNT(*)
AS 'Impoverished'
FROM Employee
WHERE Salary < 30000
I need to learn to create reports from a data warehouse. I learned SQL, but most sites use T-SQL when creating reports, and I don't know why.
Thanks in advance.

COUNT(*) counts rows. If a WHERE clause is given, only the rows that match it are counted; if a GROUP BY clause is given, the count is computed separately for each distinct combination of the GROUP BY columns.
Apart from that, COUNT(*) ignores the values in those rows.
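A quick way to convince yourself of this is to run the same COUNT(*) with and without WHERE, GROUP BY, and HAVING. The sketch below is an illustration only: it uses Python's sqlite3 module rather than SQL Server, and the review rows are invented (only the column names come from the question).

```python
import sqlite3

# SQLite stand-in for SQL Server (assumption); review rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MovieReview (EmployeeID INTEGER, Genre TEXT, MovieName TEXT, Stars INTEGER);
    INSERT INTO MovieReview VALUES
        (1, 'Drama',  'Movie A', 4),
        (2, 'Drama',  'Movie A', 5),
        (3, 'Drama',  'Movie A', 3),
        (1, 'Comedy', 'Movie B', 2),
        (2, 'Comedy', 'Movie B', 1);
""")

# No WHERE, no GROUP BY: every row in the table is counted.
total_rows = conn.execute("SELECT COUNT(*) FROM MovieReview").fetchone()[0]

# WHERE narrows the rows before they are counted.
good_reviews = conn.execute(
    "SELECT COUNT(*) FROM MovieReview WHERE Stars >= 4"
).fetchone()[0]

# GROUP BY yields one count per movie.
per_movie = conn.execute(
    "SELECT MovieName, COUNT(*) FROM MovieReview GROUP BY MovieName ORDER BY MovieName"
).fetchall()

# HAVING then keeps only the groups whose count is large enough.
popular = conn.execute(
    "SELECT MovieName, COUNT(*) FROM MovieReview GROUP BY MovieName HAVING COUNT(*) >= 3"
).fetchall()

print(total_rows, good_reviews, per_movie, popular)
```

In the query from the question there is no WHERE, so COUNT(*) simply counts all review rows in each MovieName group.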

Can I say that one column is more important than another in a full-text search?

I use Full-text indexing in SQL Server 2008.
I can't seem to find an answer to this question. Say that I have a full-text index on a table with the columns "Name" and "Description". I want to make the "Name" column much more important than the "Description" column.
So if you search for "Internet", the result with the name "Internet" will always come out on top, no matter how many occurrences of "internet" there are in the description. It must be possible, right?

I found this article just now.
http://www.goodercode.com/wp/?p=10
In my code it became like this. Works exactly as I want it to :) Thanks for your help!
SELECT dbo.Item.Name, dbo.Item.[Description],NAME_SRCH.RANK AS rank1, DESC_SRCH.RANK AS rank2
FROM dbo.Item LEFT OUTER JOIN
FREETEXTTABLE(dbo.Item, name, 'Firefox') NAME_SRCH ON
dbo.Item.ItemId = NAME_SRCH.[KEY] LEFT OUTER JOIN
FREETEXTTABLE(dbo.Item, *, 'Firefox') DESC_SRCH ON
dbo.Item.ItemId = DESC_SRCH.[KEY]
ORDER BY rank1 DESC, rank2 DESC

You could add a computed column to your select list using CASE, assigning it a value based on which column the search term occurs in, and then order by that column. For example, something like:
SELECT (CASE WHEN Name LIKE '%searchTerm%' THEN 10
WHEN Description LIKE '%searchTerm%' THEN 5
ELSE 0 END) AS computedValue
FROM myTable
ORDER BY computedValue DESC
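Here is a small runnable sketch of that CASE idea. Assumptions: it uses SQLite through Python's sqlite3 module instead of SQL Server, plain LIKE matching instead of a full-text index, and made-up rows; the weights 10 and 5 are the ones from the snippet above.

```python
import sqlite3

# SQLite stand-in for SQL Server (assumption); rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE myTable (Name TEXT, Description TEXT);
    INSERT INTO myTable VALUES
        ('Router',   'Connects your home to the internet. Internet, internet!'),
        ('Internet', 'A global network'),
        ('Toaster',  'Makes toast');
""")

search_term = "internet"
pattern = f"%{search_term}%"  # passed as a parameter, never pasted into the SQL text

weighted_rows = conn.execute("""
    SELECT Name,
           CASE WHEN Name LIKE ? THEN 10          -- a hit in Name outranks everything
                WHEN Description LIKE ? THEN 5    -- a hit only in Description
                ELSE 0 END AS computedValue
    FROM myTable
    ORDER BY computedValue DESC
""", (pattern, pattern)).fetchall()
for row in weighted_rows:
    print(row)
```

The row named "Internet" comes first even though the router's description mentions the term several times, because the weight depends on which column matched, not on how often.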

As far as I know, there is no T-SQL syntax to specify a weight for a column.
But you can get this effect with a SELECT trick
(assuming you use FREETEXTTABLE, but you can rewrite it with other FTS constructs):
SELECT CASE
WHEN hi_prior.RANK IS NULL THEN low_prior.RANK
ELSE hi_prior.RANK END -- the high-priority rank is used whenever possible
FROM
FREETEXTTABLE (Table1, (HiPriorColumn), @query) hi_prior
FULL OUTER JOIN
FREETEXTTABLE (Table1, (LowPriorColumn), @query) low_prior
ON (hi_prior.[KEY] = low_prior.[KEY])
If you need both results, use UNION, but scale the lower-priority rank by some factor, for example low_prior.RANK * 0.7

How do I assign weights to different columns in a full text search?

In my full text search query, I want to assign particular columns a higher weightage. Consider this query:
SELECT Key_Table.RANK, FT_Table.* FROM Restaurants AS FT_Table
INNER JOIN FREETEXTTABLE(Restaurants, *, 'chilly chicken') AS Key_Table
ON FT_Table.RestaurantID = Key_Table.[KEY]
ORDER BY Key_Table.RANK DESC
Now, I want the Name column to have a higher weightage in the results (Name, Keywords and Location are full-text indexed). Currently, if the result is found in any of the three columns, the ranks are not affected.
For example, I'd like a row with Name "Chilly Chicken" to have higher rank than one with Keywords "Chilly Chicken", but another name.
Edit:
I'm not eager to use ContainsTable, because that would mean separating the phrases (Chilly AND Chicken, etc.), which would involve me having to search all possible combinations - Chilly AND Chicken, Chilly OR Chicken, etc. I would like the FTS engine to automatically figure out which results match best, and I think FREETEXT does a fine job this way.
Apologies if I've misunderstood how CONTAINS/CONTAINSTABLE works.

The best solution is to use ContainsTable. Use a union to create a query that searches all 3 columns and adds an integer used to indicate which column was searched. Sort the results by that integer and then rank desc.
The rank is internal to sql server and not something you can adjust.
You could also manipulate the returned rank by dividing the rank by the integer (Name would be divided by 1, Keyword and Location by 2 or higher). That would cause the appearance of different rankings.
Here's some example sql:
--Recommend using start change tracking and start background updateindex (see books online)
SELECT 1 AS ColumnLocation, Key_Table.Rank, FT_Table.* FROM Restaurants AS FT_Table
INNER JOIN ContainsTable(Restaurants, Name, 'chilly chicken') AS Key_Table ON
FT_Table.RestaurantId = Key_Table.[Key]
UNION SELECT 2 AS ColumnLocation, Key_Table.Rank, FT_Table.* FROM Restaurants AS FT_Table
INNER JOIN ContainsTable(Restaurants, Keywords, 'chilly chicken') AS Key_Table ON
FT_Table.RestaurantId = Key_Table.[Key]
UNION SELECT 3 AS ColumnLocation, Key_Table.Rank, FT_Table.* FROM Restaurants AS FT_Table
INNER JOIN ContainsTable(Restaurants, Location, 'chilly chicken') AS Key_Table ON
FT_Table.RestaurantId = Key_Table.[Key]
ORDER BY ColumnLocation, Rank DESC
In a production environment, I would insert the output of the query into a table variable to perform any additional manipulation before returning the results (may not be necessary in this case). Also, avoid using *, just list the columns you really need.
Edit: You're right about ContainsTable: you would have to modify the keywords to be '"chilly*" AND "chicken*"'. I do this using a process that tokenizes an input phrase. If you don't want to do that, just replace every instance of ContainsTable above with FreeTextTable; the query will still work the same.
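The "divide the rank by the column integer" idea mentioned above can be sketched outside the database too. This is a plain-Python illustration under assumptions: the `Hit` type, the restaurant ids, and the rank values are all made up, and the divisors are simply the ColumnLocation numbers from the query.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """One row of the UNION result (hypothetical shape)."""
    restaurant_id: int
    column_location: int  # 1 = Name, 2 = Keywords, 3 = Location
    rank: int             # rank value as returned by the full-text engine


def weighted_rank(hit: Hit) -> float:
    # Dividing by the column number favours Name hits over Keywords/Location hits.
    return hit.rank / hit.column_location


# Invented ranks, purely for illustration.
hits = [
    Hit(restaurant_id=101, column_location=2, rank=180),
    Hit(restaurant_id=102, column_location=1, rank=120),
    Hit(restaurant_id=103, column_location=3, rank=240),
]

ordered = sorted(hits, key=weighted_rank, reverse=True)
for hit in ordered:
    print(hit.restaurant_id, weighted_rank(hit))
```

With these numbers the Name hit ends up first although its raw rank is the lowest; choosing the divisors is a tuning decision, and 1, 2, 3 is only a starting point.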