I am using SQL Server 2008 R2 and have come across a database that stores numeric values as varchar(4). For example, with this query:
SELECT [num]
FROM [TABLE1]
WHERE num > '95'
I get the following results:
96
97
98
99
999
However, when I run the same query without the quotes, i.e.
SELECT [num]
FROM [TABLE1]
WHERE num > 95
then I get
100
101
102
103
104
105
106
107
108
109
110
111
112
113
116
117
120
7001
7002
7003
7004
7005
7006
7007
96
97
98
99
999
In neither case do I get the numbers in numeric order, i.e. 95, 96, 97, 98, 99. I understand this is because they are stored as varchar(4), i.e. as strings. Can someone explain what happens in each situation and how the comparison is performed in both cases?
Also, can someone help me write the code to convert these varchar(4) values to numeric on the fly so I can order them properly?
Much appreciated.
When you use > '95', both sides are strings, so the values are compared character by character (alphabetical order) rather than numerically. That is why '999' qualifies but '100' does not. When you use > 95, SQL Server implicitly casts the column to a number before comparing, which is why the result is different.
To be sure what actually happens, do the cast yourself. And of course, you should not store numbers as varchars.
You get the correct ordering with
order by convert(int, num)
but it will fail if there are non-numeric values in the column.
The > operator compares strings lexicographically, not numerically. So the rows that qualify, and the order you would get from ORDER BY num ASC, follow string order.
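To see the difference between the two comparisons outside the database, here is a small Python sketch. The list `nums` is made-up sample data, not the contents of TABLE1. Python compares strings character by character, which mirrors the varchar comparison in spirit, although the exact string ordering in SQL Server also depends on the collation.

```python
# Hypothetical sample values, kept as strings like the varchar(4) column.
nums = ["7", "95", "96", "99", "100", "110", "999", "7001"]

# String comparison: character by character, so "100" is not > "95"
# because "1" sorts before "9".
as_strings = [n for n in nums if n > "95"]

# Numeric comparison: convert each value first, as the implicit cast does.
as_numbers = [n for n in nums if int(n) > 95]

# Numeric ordering, the counterpart of ORDER BY CONVERT(int, num).
ordered = sorted(nums, key=int)

print(as_strings)
print(as_numbers)
print(ordered)
```

As with CONVERT(int, num), `int(n)` raises an error as soon as it meets a non-numeric value.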
I have an SQL table that has this data
I need the data to be formatted so that instead of showing this string of numbers and characters, I want to show time in minutes without the string. For example (in minutes):
88
85
85
67
63
76
71
75
75
42
I echo Larnu's comment.
You can try something like the following.
declare @string varchar(20) = '1 hour 28 mins'
-- hours part: the text before 'hour', multiplied by 60
Select @string,case when CHARINDEX('hour',@string)>1 then
SUBSTRING(@string,1,CHARINDEX('hour',@string)-1) * 60 else 0 end
+
-- minutes part: the two characters just before ' mins'
case when CHARINDEX('mins',@string)>1 then
SUBSTRING(@string,CHARINDEX('mins',@string)-3,2) else 0 end
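The same parsing idea can be sketched in Python. The input format ('1 hour 28 mins') is taken from the answer above; the function name `to_minutes` and the other sample strings are assumptions for illustration, since the question's actual data is not shown.

```python
import re

def to_minutes(text):
    """Convert a string such as '1 hour 28 mins' to a number of minutes."""
    # Digits directly before 'hour' (also matches 'hours').
    hours = re.search(r"(\d+)\s*hour", text)
    # Digits directly before 'min' (also matches 'mins').
    mins = re.search(r"(\d+)\s*min", text)
    total = 0
    if hours:
        total += int(hours.group(1)) * 60
    if mins:
        total += int(mins.group(1))
    return total

print(to_minutes("1 hour 28 mins"))
```

Unlike the fixed two-character SUBSTRING, the regular expression accepts a minutes part of any length.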
I have the following problem:
data example;
input channel $ program $ item1 item2 GOAL1 GOAL2;
datalines;
CS A 100 10 100 10
CS A 101 9 100 9
CS B 102 11 102 11
CS B 101 14 101 11
BD A 200 210 200 210
BD A 201 209 200 209
BD B 202 211 202 211
BD B 201 214 201 214
;
run;
First, note that all operations are performed at the channel-program level.
Second, a third variable called THIRD equals item1 in its first entry by group. However, in the second entry of THIRD it will vary: if item1_entry1
data poli;
set poli;
by channel program;
array prog{*} A B; /*IN my original data I have 3 programs, so the solution has to be general*/
third=item1; /*So the first entry of item1 will be equal in third*/
do k=1 to dim(prog);
if program=prog{k} then do;
if lag(item1)<lag(item1) then THIRD=lag(item1)
else THIRD=item1;
end;
end;
run;
As expected the code does not give me what I want.
Specifically, THIRD and FOURTH should be equal to the variables GOAL1 and GOAL2.
NOTE: The idea behind the comparison is that always the higher levels are going to be greater or equal than the lower levels, and the lower levels cannot be greater than the upper levels: I can't have 100 and then 101, it should be 100 and 100 for one group.
Your description is a bit unclear. I think you want to compute a new variable want1, which is set to the value of item1 at the beginning of each by group. And within a by group, it decreases if item1 decreases, else stays the same. I would try (untested):
data want;
set example;
by channel program;
retain want1;
if first.program then want1=item1;
else want1=min(want1,item1);
run;
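For readers who do not use SAS, the logic of that data step (reset at the start of each group, then keep a running minimum) can be sketched in Python. The rows are the channel, program and item1 columns from the question's example; the function name `running_min_by_group` is made up for this sketch.

```python
from itertools import groupby

# (channel, program, item1) taken from the example data in the question.
rows = [
    ("CS", "A", 100), ("CS", "A", 101),
    ("CS", "B", 102), ("CS", "B", 101),
    ("BD", "A", 200), ("BD", "A", 201),
    ("BD", "B", 202), ("BD", "B", 201),
]

def running_min_by_group(rows):
    """Append a running minimum of item1 that restarts for each channel-program group."""
    out = []
    # groupby splits on consecutive equal keys, similar to BY-group processing.
    for _, group in groupby(rows, key=lambda r: (r[0], r[1])):
        current = None
        for channel, program, item1 in group:
            # The first row of a group seeds the value; later rows can only lower it.
            current = item1 if current is None else min(current, item1)
            out.append((channel, program, item1, current))
    return out

result = running_min_by_group(rows)
for row in result:
    print(row)
```

The last element of each tuple reproduces the GOAL1 column from the question.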
In order to use the unnest function, I want to convert a list to an array.
This is my list of type text. It's an output of this function (How get all positions in a field in PostgreSQL?):
108,109,110,114,115,116,117,156,157,200,201,205
I convert to array with
array[108,109,110,114,115,116,117,156,157,200,201,205]
The result is of type text[]:
"{"108,109,110,114,115,116,117,156,157,200,201,205"}"
The unnest function doesn't work with this kind of array, so I think I need to convert it to an array of int.
Thanks
with the_data(str) as (
select '108,109,110,114,115,116,117,156,157,200,201,205'::text
)
select elem
from the_data,
unnest(string_to_array(str, ',')) elem;
elem
------
108
109
110
114
115
116
117
156
157
200
201
205
(12 rows)
If I understand correctly, you need this (there is no need to convert to INT):
select unnest( string_to_array('108,109,110,114,115,116,117,156,157,200,201,205', ',' ) )
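As a language-neutral illustration of what string_to_array followed by unnest does, here is a minimal Python sketch that splits the same text on commas. The final conversion to integers is optional and shown only in case numeric values are needed later; the variable names are assumptions.

```python
# The comma-separated text from the question.
text = "108,109,110,114,115,116,117,156,157,200,201,205"

# Splitting on ',' corresponds to string_to_array(str, ',').
parts = text.split(",")

# Optional: convert each element to an integer.
positions = [int(part) for part in parts]

# Iterating over the list yields one element per row, like unnest.
for position in positions:
    print(position)
```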
Shog9 keeps on making my link lists look awesome.
Essentially, I write a bunch of queries that pull out results from the Stack Overflow data dump. However, my link lists look very ugly and are hard to understand.
Using some formatting magic Shog9 manages to make the link lists look a lot nicer.
So, for example, I will write a query that returns the following:
question id,title,user id, other info
4,When setting a form’s opacity should I use a decimal or double?,8,Eggs McLaren, some other stuff lots of text
And I want to paste it into an answer on meta and make it look like this:
Question Id User Name Other Info
When setting a form’s opacity... Eggs Mclaren Some other stuff...
So assuming my starting point is the query that returns the start info.
What is the smallest number of steps I can run in Query Analyzer to turn the results into:
<h3> Question Id User Name Other Info </h3>
<pre>
When setting a form’s opacity... Eggs Mclaren Some other stuff...
</pre>
My initial thoughts are to insert the results into a temp table and then run a stored proc that will iron the data into my desired structure. Run the proc, cut and paste and be done.
Any candidate TSQL based solutions to this problem?
EDIT: Accepting my answer; it's the only solution with an implementation.
Not sure of your exact requirements, but have you considered selecting the data as XML and then applying an XSLT transform to the results?
I'll update this post with my progress as I refine my proc:
Example:
select top 20
UserId = u.Id,
UserName = u.DisplayName,
u.Reputation,
sum(case when p.ParentId is null then 1 else 0 end) as Questions,
sum(case when p.ParentId is not null then 1 else 0 end) as Answers
into #t
from Users u
join Posts p on p.OwnerUserId = u.Id
where p.CommunityOwnedDate is null and p.ClosedDate is null
group by u.Id, u.DisplayName, u.Reputation
having sum(case when p.ParentId is not null then 1 else 0 end) < sum(case when p.ParentId is null then 1 else 0 end) / 6
order by Reputation desc
exec spShog9
Results:
User             Reputation  Questions  Answers
Edward Tanguay   8317        465        24
me               5767        311        29
Joan Venge       4844        226        14
Blankman         4546        310        1
acidzombie24     4359        371        32
Thanks           4350        416        21
Masi             4193        555        74
LazyBoy          3230        94         12
KingNestor       3187        92         11
Nick             2084        79         6
George2          1973        263        1
Xaisoft          1944        174        12
John             1929        160        24
danmine          1901        53         3
zsharp           1771        145        16
carrier          1742        56         8
JC Grubbs        1550        50         5
vg1890           1534        56         2
Coocoo4Cocoa     1514        143        0
Keand64          1513        83         5
The proc is on gist: http://gist.github.com/165544
You could do something like:
with
data (question_id,title,user_id, username ,other_info) as
(
select 4,'When setting a form''s opacity should I use a decimal or double?',8,'Eggs McLaren', 'some other stuff lots of text'
union all
select 5,'Another q title',9,'OtherUsername', 'some other stuff lots of text')
select
(select 'http://stackoverflow.com/questions/' + cast(question_id as varchar(10)) as [@href], title as [*] for xml path('a')) as questioninfo
,(select 'http://stackoverflow.com/users/' + cast(user_id as varchar(10)) + '/' + replace(username, ' ', '-') as [@href], username as [*] for xml path('a')) as userinfo
, other_info
from data
...but see how you go. I personally find that FOR XML PATH is very powerful for getting marked-up results in a way that suits me.
Rob
I have a table which indexes the locations of words in a bunch of documents.
I want to identify the most common bigrams in the set.
How would you do this in MSSQL 2008?
the table has the following structure:
LocationID -> DocID -> WordID -> Location
I have thought about trying to do some kind of complicated join... and I'm just doing my head in.
Is there a simple way of doing this?
I think I'd better edit this on Monday in order to bump it up in the questions.
Sample Data
LocationID DocID WordID Location
21952 534 27 155
21953 534 109 156
21954 534 4 157
21955 534 45 158
21956 534 37 159
21957 534 110 160
21958 534 70 161
It's been years since I've written SQL, so my syntax may be a bit off; however, I believe the logic is correct.
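-- [index] stands for the word-location table; self-join each word to the next one
SELECT CAST(i.WordID AS varchar(10)) + '|' + CAST(j.WordID AS varchar(10)) AS bigram,
       COUNT(*) AS freq
FROM [index] AS i
JOIN [index] AS j
  ON j.DocID = i.DocID
 AND j.Location = i.Location + 1
GROUP BY i.WordID, j.WordID
ORDER BY freq DESC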
You can also add the actual word IDs to the select list if that's useful, and add a join to whatever table you've got that dereferences WordID to actual words.
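The self-join logic (pair each word with the word at the next location in the same document, then count the pairs) can be checked on a small data set with a Python sketch. The first seven rows come from the sample data in the question; the two rows for document 535 are hypothetical and added only so that one bigram occurs twice. The function name `count_bigrams` is likewise an assumption.

```python
from collections import Counter

# (DocID, WordID, Location) rows from the question's sample data.
rows = [
    (534, 27, 155), (534, 109, 156), (534, 4, 157), (534, 45, 158),
    (534, 37, 159), (534, 110, 160), (534, 70, 161),
    # Hypothetical extra document, added so that (27, 109) occurs twice.
    (535, 27, 1), (535, 109, 2),
]

def count_bigrams(rows):
    """Count pairs of word IDs that sit at consecutive locations in the same document."""
    # Look up the word at a given (document, location).
    word_at = {(doc, loc): word for doc, word, loc in rows}
    counts = Counter()
    for (doc, loc), word in word_at.items():
        # Same condition as the join: same DocID, Location + 1.
        next_word = word_at.get((doc, loc + 1))
        if next_word is not None:
            counts[(word, next_word)] += 1
    return counts

counts = count_bigrams(rows)
print(counts.most_common(3))
```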