Problem: append the row number to a column's value in Vertica.
For example:
Table T has two columns: Id, name
I want to use a script to append the row number to the value of name. In MySQL, I run the following script to do the update:
set @i=0;
update T set name = (CONCAT(name, (@i:=@i+1)));
However, Vertica doesn't support variables.
Is there a way to achieve this in Vertica?
As Vertica supports window functions, something like this can be used to retrieve this data:
select name,
row_number() over (order by name) as rn
from T;
I am not sure how this could be moved into an UPDATE statement though, and I don't have a Vertica installation available to test it:
update T
set name = name || tx.rn
from (
select id,
row_number() over (order by name) as rn
from T
) as tx
where tx.id = T.id;
I don't know if that qualifies as a "self-join", which isn't allowed. But maybe that points you in the right direction.
Related
I have a table (call it oldtable) and the relevant columns are name, group, zip code. I have selected those into a new table (call that newtable). My issue is that some of the zip codes in the first table are NULL. I want to replace the NULL zip codes with the mode (most common value) of their group.
For example, say a row in newtable looks like this:
Name Group ZipCode
Blah G1 NULL
I want to replace that NULL with the most common zip code over all the people in G1 in oldtable. I am having trouble even getting started on pulling the mode of one column when grouped by another column.
I am using Microsoft SQL Server 2016.
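This can be done using CROSS APPLY on an UPDATE. The subquery reads from oldtable, as the question asks, and skips NULL zip codes so that NULL itself can never be picked as the mode.
UPDATE n SET
zipcode = x.zipcode
FROM newtable n
CROSS APPLY( SELECT TOP 1 zipcode, COUNT(*) cnt
FROM oldtable o
WHERE n.[group] = o.[group]
AND o.zipcode IS NOT NULL
GROUP BY zipcode
ORDER BY cnt DESC) x
WHERE n.zipcode IS NULL;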
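To try the idea without touching real tables, here is a small self-contained sketch. The table variables and sample rows are made up for illustration, and it assumes the mode should be taken from oldtable, as the question describes. It also adds a second sort key so that ties between equally common zip codes are broken the same way on every run.

```sql
-- Throwaway table variables standing in for oldtable and newtable (hypothetical data).
DECLARE @oldtable TABLE (name varchar(20), [group] varchar(5), zipcode varchar(10));
DECLARE @newtable TABLE (name varchar(20), [group] varchar(5), zipcode varchar(10));

INSERT INTO @oldtable (name, [group], zipcode) VALUES
    ('Ann',  'G1', '90210'),
    ('Bob',  'G1', '90210'),
    ('Cy',   'G1', '10001'),
    ('Blah', 'G1', NULL);

INSERT INTO @newtable (name, [group], zipcode)
SELECT name, [group], zipcode FROM @oldtable;

-- Fill each NULL zip code with the most common non-NULL zip code of its group.
UPDATE n
SET zipcode = x.zipcode
FROM @newtable AS n
CROSS APPLY (
    SELECT TOP (1) o.zipcode
    FROM @oldtable AS o
    WHERE o.[group] = n.[group]
      AND o.zipcode IS NOT NULL
    GROUP BY o.zipcode
    ORDER BY COUNT(*) DESC, o.zipcode   -- second key breaks ties deterministically
) AS x
WHERE n.zipcode IS NULL;

SELECT name, [group], zipcode FROM @newtable;  -- Blah now has zip code 90210
```

Because CROSS APPLY drops rows for which the subquery returns nothing, a group whose zip codes are all NULL is simply left unchanged.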
I am trying to figure out a way to check whether there are repeated values across the columns of a single row.
Example:
HMOID  Name  Addon10    Addon15    Addon20
RFFF   Blah  img path1  img path2  img path1
For my example, I would like to check whether any of the addons for RFFF have a repeated value. In the example above, 'RFFF' has the same image in Addon10 and Addon20. (The images are stored as paths, so currently they look like http://oc2-reatest.regalmed.local/ocupgrade52/images/NDL_SCAN_SR.PNG.)
I would like to be able to do this for multiple rows. I thought the following would give me an idea how to begin:
select * from HlthPlan
Group By HMO1A, HMONM
Having COUNT(*) > 1
However, it throws the following error:
Msg 8120, Level 16, State 1, Line 1
Column 'HlthPlan.HMOID' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
I am fairly new to SQL and any suggestions would be appreciated.
Don't use * in your select list. Include only the columns that appear in the GROUP BY clause, plus any aggregates:
SELECT HMO1A, HMONM, COUNT(*) from HlthPlan
GROUP BY HMO1A, HMONM
HAVING COUNT(*) > 1;
With only three columns to check, assuming non-null values across a single row:
select * from HlthPlan
where Addon10 in (Addon15, Addon20) or Addon15 = Addon20
You can also use cross apply to unpivot the values for grouping:
select HMOID, addon
from HlthPlan cross apply (
select addon
from (values (Addon01), (Addon02), (Addon03), ... (Addon20)) as pvt(addon)
) a
where addon is not null
group by HMOID, addon
having count(*) > 1;
http://rextester.com/QWIW87618
You'll get multiple rows for each HMOID where there are different groups of columns having the same value. By the way, reporting on the names of the specific columns involved would add another degree of difficulty to the query.
One way you can check for this is using UNPIVOT to compare your results:
create table #hmo (hmoid varchar(6), name varchar(25), Addon10 varchar(25),
Addon15 varchar(25), addon20 varchar(25));
insert into #hmo
values ('RFFF', 'Blah','img path1', 'img path2', 'img path1');
select hmoid, name, addval, addcount = count(adds)
FROM #hmo
UNPIVOT
(
addval FOR adds IN
(addon10, addon15, addon20)
) as unpvt
group by hmoid, name, addval having count(*) > 1
This gives the following result:
hmoid  name  addval     addcount
RFFF   Blah  img path1  2
This way you can check against every row in the table and will see any row that has any two or more columns with the same value.
This does have the potential to get tedious if you have a lot of columns; the easiest way to correct for that is to build your query dynamically using info from sys.tables and sys.columns.
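A rough sketch of that dynamic approach follows. It assumes the table is named HlthPlan, that the columns to compare all start with 'Addon' and share the same data type and length (UNPIVOT requires this), and that STRING_AGG is available, which means SQL Server 2017 or later. The variable names are made up for illustration.

```sql
DECLARE @cols nvarchar(max), @sql nvarchar(max);

-- Collect the Addon column names from the catalog views,
-- quoted so that unusual names stay valid identifiers.
SELECT @cols = STRING_AGG(CAST(QUOTENAME(c.name) AS nvarchar(max)), ', ')
FROM sys.columns AS c
JOIN sys.tables  AS t ON t.object_id = c.object_id
WHERE t.name = 'HlthPlan'
  AND c.name LIKE 'Addon%';

-- Build the same UNPIVOT query as above with the discovered column list.
SET @sql = N'
SELECT HMOID, addval, COUNT(*) AS addcount
FROM HlthPlan
UNPIVOT (addval FOR adds IN (' + @cols + N')) AS unpvt
GROUP BY HMOID, addval
HAVING COUNT(*) > 1;';

EXEC sys.sp_executesql @sql;
```

On versions before 2017 the column list would have to be concatenated another way, for example with FOR XML PATH.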
I am new to SQL Server. I'm using SQL Server in Azure, and I am looking for the best way to accomplish setting a status field when a new record is entered.
I have the following data:
I need to set/calculate the Quote_Status field.
When a new version is added, the new version's Quote_Status should be "Open"
For the previous version (or all other versions), Quote_Status should be "Versioned"
A row counts as a new version of an existing quote when Quote_System, Quote_Date, and Quote_ID are all equal.
When a new version is added, the Quote_Status should look like this:
I've considered triggers and calculated fields, but I've never done anything like this and don't know how to start the SQL. Thanks!
If the column can be calculated on the fly, something like the following could work:
with cte as (
select *,
row_number() over (
partition by Quote_System, Quote_Date, Quote_ID
order by Quote_Version desc
) as rn
from dbo.yourTable
)
select *, case when rn = 1 then 'Open' else 'Versioned' end as Quote_Status
from cte;
Essentially, for each grouping of (Quote_System, Quote_Date, Quote_ID), I'm enumerating the versions in descending order. With that, the first (i.e. rn = 1) is the Open one while the rest are Versioned. In actual use, I'd add a where clause to the final select so that there's a reasonable chance of it performing well.
If you need it to be persisted and Quote_Version is monotonically increasing, I'd prefer to do it in a stored procedure. Like so:
create procedure dbo.insert_Quote (
@Quote_System varchar(20),
@Quote_Date date,
@Quote_ID varchar(20),
@Quote_Version int
)
as
begin
update dbo.yourTable
set Quote_Status = 'Versioned'
where Quote_System = @Quote_System
and Quote_Date = @Quote_Date
and Quote_ID = @Quote_ID
and Quote_Status <> 'Versioned';
insert into dbo.yourTable
(Quote_System, Quote_Date, Quote_ID, Quote_Version, Quote_Status)
values
(@Quote_System, @Quote_Date, @Quote_ID, @Quote_Version, 'Open');
end
If you really need a trigger, I can come up with something like that, too. But it's my least preferable solution.
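For completeness, a trigger along those lines might look roughly like the sketch below. It is untested against your schema: dbo.yourTable and the column names are taken from the question, the trigger name is made up, and it assumes the highest Quote_Version within a quote is always the open one.

```sql
create trigger dbo.trg_yourTable_set_status   -- hypothetical name
on dbo.yourTable
after insert
as
begin
    set nocount on;

    -- Rank every version of each quote that just received new rows,
    -- then mark the highest version Open and all others Versioned.
    with affected as (
        select t.Quote_Status,
               row_number() over (
                   partition by t.Quote_System, t.Quote_Date, t.Quote_ID
                   order by t.Quote_Version desc
               ) as rn
        from dbo.yourTable as t
        where exists (
            select 1
            from inserted as i
            where i.Quote_System = t.Quote_System
              and i.Quote_Date   = t.Quote_Date
              and i.Quote_ID     = t.Quote_ID
        )
    )
    update affected
    set Quote_Status = case when rn = 1 then 'Open' else 'Versioned' end;
end
```

It joins against the inserted pseudo-table rather than assuming a single row, so multi-row inserts are handled as well. The trigger only issues an UPDATE, so it does not fire itself again.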
My question needs a little explanation, so I'd like to explain it this way:
I've got a table (let's call it RootTable) with one million records in no particular order. I'm trying to get a given number of rows (@ParamCount) from RootTable; at the same time the records must be sorted and have an additional column (with unique data) added on the fly to serve as a key for row identification, which will be used later in the program. The query can take any number of parameters, but my basic parameters are the two mentioned below.
It's needed for SQL SERVER environment.
e.g.
RootTable
ColumnA ColumnB ColumnC
ABC city cellnumber
ZZC city1 cellnumber
BCD city2 cellnumber
BCC city3 cellnumber
Passing the number of rows to return (@ParamCount) and the prefix that ColumnA must start with (@ParamNameStartsWith):
@ParamCount: 2
@ParamNameStartsWith: BC
desired result:
Id(added on the fly) ColumnA ColumnB ColumnC
101 BCC city3 cellnumber
102 BCD city2 cellnumber
One more point about the Id column: it must keep its position from the full sorted table. In the result above it starts from 101 because 100 is already assigned to the first row of the sorted table; that row starts with "ABC", so it is not in the result set.
Any kind of help would be appreciated.
NOTE: My question title might not reflect my requirement, but I couldn't get any other title.
So first you need your on-the-fly ID. This is created by the ROW_NUMBER() function, which is available from SQL Server 2005 onwards. What ROW_NUMBER() does is pretty self-explanatory, I think. However, it numbers rows within a partition, and the partition is specified in the OVER clause. If you include PARTITION BY within the OVER clause, you will have multiple partitions, each numbered from 1. In your case there is only one partition, which is the whole table, so PARTITION BY is not necessary. However, an ORDER BY is required so that the system knows which record should get which row number in the partition. The query you get is:
SELECT ROW_NUMBER() OVER (ORDER BY ColumnA) ID, ColumnA,ColumnB,ColumnC
FROM RootTable
Now you have a row number for your whole table. You cannot include a condition such as your @ParamNameStartsWith parameter here, because you wanted the row numbers assigned over the whole table. The query above has to be a subquery that provides the set on which the condition can be applied. I use a CTE here; I think that is better for readability:
;WITH OrderedList AS (
SELECT ROW_NUMBER() OVER (ORDER BY ColumnA) ID, ColumnA,ColumnB,ColumnC
FROM RootTable
)
SELECT *
FROM OrderedList
WHERE ColumnA LIKE @ParamNameStartsWith+'%'
Please note that I added the wildcard % after the parameter, so that the condition is basically "starts with" @ParamNameStartsWith.
Finally, if I got you right, you wanted only @ParamCount rows. You can use your parameter directly with the TOP keyword; passing a variable to TOP in parentheses is also only possible with SQL Server 2005 or later. TOP needs an ORDER BY to return a predictable set of rows, so the final query orders by ID.
;WITH OrderedList AS (
SELECT ROW_NUMBER() OVER (ORDER BY ColumnA) ID, ColumnA,ColumnB,ColumnC
FROM RootTable
)
SELECT TOP (@ParamCount) *
FROM OrderedList
WHERE ColumnA LIKE @ParamNameStartsWith+'%'
ORDER BY ID
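The queries above number rows from 1, while the desired result in the question starts the numbering at 100. A minimal sketch of how that offset could be added is shown below, using a table variable loaded with the sample rows from the question in place of the real RootTable. The offset of 99 is an assumption read off the example (first row gets 100).

```sql
-- Table variable standing in for RootTable, loaded with the question's sample rows.
DECLARE @RootTable TABLE (ColumnA varchar(10), ColumnB varchar(10), ColumnC varchar(20));
INSERT INTO @RootTable (ColumnA, ColumnB, ColumnC) VALUES
    ('ABC', 'city',  'cellnumber'),
    ('ZZC', 'city1', 'cellnumber'),
    ('BCD', 'city2', 'cellnumber'),
    ('BCC', 'city3', 'cellnumber');

DECLARE @ParamCount int = 2,
        @ParamNameStartsWith varchar(10) = 'BC';

WITH OrderedList AS (
    -- Number the whole table first, shifted so that the first row gets 100.
    SELECT ROW_NUMBER() OVER (ORDER BY ColumnA) + 99 AS ID,
           ColumnA, ColumnB, ColumnC
    FROM @RootTable
)
SELECT TOP (@ParamCount) *
FROM OrderedList
WHERE ColumnA LIKE @ParamNameStartsWith + '%'
ORDER BY ID;   -- returns 101 BCC city3 and 102 BCD city2
```

Because the numbering happens inside the CTE, before the filter is applied, each row keeps the ID it has in the full sorted table.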
I am using a WHERE ... IN condition in SQL Server. I want to get the result without any reordering, because I gave a list to the IN condition.
For example
select * from blabla where column in ('03.01.KO61.01410',
'03.02.A081.15002',
'03.02.A081.15016',
'03.02.A081.15003',
'02.03.A081.57105')
How can I do this?
If you want the rows returned such that they're in the same order as the items in your IN, you need to find some way to specify that in an ORDER BY clause - the only way to get SQL Server to define an order. E.g.:
select * from blabla where column in ('03.01.KO61.01410',
'03.02.A081.15002',
'03.02.A081.15016',
'03.02.A081.15003',
'02.03.A081.57105')
order by
CASE column
when '03.01.KO61.01410' then 1
when '03.02.A081.15002' then 2
when '03.02.A081.15016' then 3
when '03.02.A081.15003' then 4
when '02.03.A081.57105' then 5
end
In my experience, SQL Server returns the rows of a WHERE ... IN query in no guaranteed order if you do not specify how to order them.
So, if you want the rows in the order of your IN list, you must supply something to order by that reflects the order you passed. Otherwise, the order of your result set is arbitrary.
You're already doing it - if you don't explicitly specify an order by using ORDER BY, then there is no implied order.
If you want to totally randomize the output, you could add an ORDER BY NEWID() clause:
SELECT (list of columns)
FROM dbo.blabla
WHERE column IN ('03.01.KO61.01410', '03.02.A081.15002',
'03.02.A081.15016', '03.02.A081.15003', '02.03.A081.57105')
ORDER BY NEWID()
If you have an autoincrement id in your table, use it in an order clause. And if you don't, consider adding one...
Try this:
CREATE TYPE varchar20_list_type AS TABLE (
id INT IDENTITY PRIMARY KEY,
val VARCHAR(20) NOT NULL UNIQUE
)
DECLARE @mylist varchar20_list_type
INSERT @mylist (val) VALUES
('03.01.KO61.01410'),
('03.02.A081.15002'),
('03.02.A081.15016'),
('03.02.A081.15003'),
('02.03.A081.57105')
SELECT
*
FROM
blabla
JOIN @mylist AS t
ON
blabla.col = t.val
ON
blabla.col = t.val
ORDER BY
t.id
More information from http://www.sommarskog.se/arrays-in-sql-2008.html
By the way, this can be easily done in PostgreSQL with VALUES: http://www.postgresql.org/docs/9.0/static/queries-values.html
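SQL Server 2008 and later also accept a VALUES list as a derived table, so a similar join can be written without creating a table type. The sketch below assumes the filtered column is called col, as in the answer above, and the alias names v, pos and val are made up for illustration.

```sql
SELECT b.*
FROM blabla AS b
JOIN (VALUES
        (1, '03.01.KO61.01410'),
        (2, '03.02.A081.15002'),
        (3, '03.02.A081.15016'),
        (4, '03.02.A081.15003'),
        (5, '02.03.A081.57105')
     ) AS v(pos, val)          -- pos records the position in the list
  ON b.col = v.val
ORDER BY v.pos;
```

The join does the filtering that the IN list did, and the explicit pos column gives ORDER BY something to sort on.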