Field is being updated with same value - sql-server

I have a table that has a new column, and updating the values that should go in the new column. For simplicity sake I am reducing my example table structure as well as my query. Below is how i want my output to look.
IDNumber NewColumn
1 1
2 1
3 1
4 2
5 2
WITH CTE_Split
AS
(
select
*,ntile(2) over (order by newid()) as Split
from TableA
)
Update a
set NewColumn = a.Split
from CTE_Split a
Now when I do this I get my table and it looks as such
IDNumber NewColumn
1 1
2 1
3 1
4 1
5 1
However when I do the select only I can see that I get the desire output, now I have done this before to split result sets into multiple groups and everything works within the select but now that I need to update the table I am getting this weird result. Not quiet sure what I'm doing wrong or if anyone can provide any sort of feedback.
So after a whole day of frustration I was able to compare this code and table to another that I had already done this process to. The reason that this table was getting updated to all 1s was because turns out that whoever made the table thought this was supposed to be a bit flag. When it reality it should be an int because in this case its actually only two possible values but in others it could be more than two.
Thank you for all your suggestion and help and it should teach me to scope out data types of tables when using the ntile function.

Try updating your table directly rather than updating your CTE. This makes it clearer what your UPDATE statement does.
Here is an example:
WITH CTE_Split AS
(
SELECT
*,
ntile(2) over (order by newid()) as Split
FROM TableA
)
UPDATE a
SET NewColumn = c.Split
FROM
TableA a
INNER JOIN CTE_Split c ON a.IDNumber = c.IDNumber
I assume that what you want to achieve is to group your records into two randomized partitions. This statement seems to do the job.

Related

SQL Server : Row Number without ordering

I want to create a Select statement that ranks the column as is without ordering.
Currently, the table is in the following order:
ITEM_Description1
ITEM_Description2
ITEM_StockingType
ITEM_RevisionNumber
I do not want the results to be numerical in any way, nor depend on the VariableID numbers, but with ROW_Number(), I have to choose something. Does anyone know how I can have the results look like this?
Row| VariableName
---------------------
1 | ITEM_Description1
2 | ITEM_Description2
3 | ITEM_StockingType
4 | ITEM_RevisionNumber
My code for an example is shown below.
SELECT
VariableName,
ROW_NUMBER() OVER (ORDER BY VariableID) AS RowNumber
FROM
SeanVault.dbo.TempVarIDs
Using ORDER BY (SELECT NULL) will give you the results your looking for.
SELECT
VariableName,
ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS rownum
FROM
SeanVault.dbo.TempVarIDs
Your problem seems to be with this sentence:
Currently, the table is in the following order:
No, your table is NOT implicitly ordered!!
Although it might look like this...
The only way to enforce the resultset's sort order is an ORDER BY-clause at the outer most SELECT.
If you want to maintain the sort order of your inserts you can use
a column like ID INT IDENTITY (which will automatically increase a sequence counter)
Using GETDATE() on insert will not solve this, as multiple row inserts might get the same DateTime value.
You do not have to show this in your output of course...
Your table has no inherent order. Even if you get that order a 100 times in a row is no guarantee it will be that order on the 101 time.
You can add an identity column to the table.

How to join two tables based on Grouping of 1 column in both the tables

I have come up a situation which is not easy to explain in sentence so i will go ahead and give the complete scenario here.
I have one result set like the below :-
It shows header_equipment_id(s) in a group of jil_equipment_id,relationship_name,cell_group.. For example 3159398,4622903 lies in one group.
The other result set is given below, This is the table where i want to update 3 columns namely Is_Applicable_Price,prc_content_rid,prc_type_name
If you notice clearly, You will find the same header_equipment_id column here. If you group it with the result found above, You will find 3 different groups for. But out of those 3 groups, one group is red, It is red because they belong to different cell_group/relationship_name.
**
Yellow and green are passed scenario and Red, Blue are fail.
**
I want to update the columns Is_Applicable_Price,prc_content_rid,prc_type_name if the Group of header_equipment_id(s) fall under the same cell_group and relationship_name.
So the final result set would look something like below -
Please help me with any inputs if possible. It's a situation where i know one single query won't work. But i will need to have multiple Temp tables for the transformation. But this is the shortest i have came across.
I am using Microsoft sql server 2012.
Please help. Even a small hint would be of great help to me. Thanks in advance.
It seems that the only thing the 2 tables have in common is that a cell_group can have a one or more rows of header_equipment_id. If we can generate a unique value based on header_equipment_id then we can join the 2 tables on this value. Note I have used a simple division , you may wish to check that this method is unique enough for your purposes.
/*create table a
(jil_equimentid int,relationship_name varchar(20),header_equipment_id int,
smart_equipment_id int,cell_group int,new_price_flag int,is_applicable_price int,prc_content_rid int,prc_type_name varchar(20))
truncate table a
insert into a values
(1282977,'default',3159398,1282977,3,1,1,106347924,'New Price'),
(1282977,'default',4622903,1262578,3,1,1,106347924,'New Price'),
(1282977,'default',1659861,1282977,6,1,1,106347925,'New Price'),
(1282977,'default',4622904,1282977,6,1,1,106347925,'New Price')
go
drop table t
go
create table t
(jil_equimentid int,relationship_name varchar(20),header_equipment_id int,
smart_equipment_id int,cell_group int,new_price_flag int,is_applicable_price int,prc_content_rid int,prc_type_name varchar(20))
truncate table t
insert into t values
(1282977,'128297711111 default',4622903,1282977,1,1,null,null,null),
(1282977,'128297711211 default',3159398,1262578,2,1,null,null,null),
(1282977,'128297712111 default',4622904,1282977,4,1,null,null,null),
(1282977,'128297712211 default',1659861,1282977,5,1,null,null,null),
(1282977,'128297711101 default',3159398,1262578,1,1,null,null,null),
(1282977,'128297711101 default',4622903,1282977,1,1,null,null,null),
(1282977,'default' ,3159398,1262578,2,1,null,null,null),
(1282977,'default' ,4622903,1282977,2,1,null,null,null),
(1282977,'128297711101 default',1659861,1262577,3,1,null,null,null),
(1282977,'128297711101 default',4622904,1282977,3,1,null,null,null),
(1282977,'default' ,1659861,1262577,4,1,null,null,null),
(1282977,'default' ,4622904,1262577,4,1,null,null,null)
*/
DROP TABLE #TEMPA;
;WITH CTE AS
(SELECT a.cell_group,
sum(a.header_equipment_id / 10000000.0000) uniqueval
from a
group by a.cell_group
)
SELECT DISTINCT CTE.UNIQUEVAL ,IS_APPLICABLE_PRICE ,PRC_CONTENT_RID ,PRC_TYPE_NAME
INTO #TEMPA
FROM CTE
JOIN A ON A.CELL_GROUP = CTE.CELL_GROUP
;WITH CTE AS
(
SELECT t.relationship_name,t.cell_group,
sum(t.header_equipment_id / 10000000.0000) uniqueval
from t
group by t.relationship_name,t.cell_group having count(*) > 1
)
SELECT T.*,CTE.UNIQUEVAL,ta.*
FROM CTE
JOIN T ON T.RELATIONSHIP_NAME = CTE.RELATIONSHIP_NAME AND T.CELL_GROUP = CTE.CELL_GROUP
join #tempa ta on ta.uniqueval = cte.uniqueval

Using Set and Between in the same query to populate a table

What I am trying to do is populate a row using a counter or some function of sql I am unaware of.
I can use
INSERT INTO EmpSkillsBridge
(EmpIdFK , EmpSkillFK)
Values
(1,1);
This updates one entire record I can even do it in batches of parentheses.
However since I am setting up a the table for the first time and since the data is for test (cough homework cough) reasons. I am trying make it so that the I can create a whole batch of data I have 200 EmpID and 6 different skills. all I want right now is to make the first 25 EmpIdFK and the EmpSkillFK be (1,1)
If i use a where empIdFk < 26 I get an error.
I tried using a loop but being new I got a little lost on how to implement
Then I read I could use the between statement. so my question is can I use a set statement in conjuction with between and make the code work that way?
Set into EmpSkillsBridge
(EmpIdFK , EmpSkillFK)
WHERE (EMPID BETWEEN 1 AND 26)
Values
(1,1);
would this be the best way to go around that?
You can do this:
Insert into EmpSkillsBridge
Select empid, 1
from employees
where empid between 1 and 25;
Something like this should work for you:-
insert into EmpSkillsBridge
(EmpID, EmpSkillsFK)
select empid, 1 from
(
select ROW_NUMBER() over(order by number) empid
from master..spt_values
) v
where empid between 1 and 25

SQL Server 2005 SELECT TOP 1 from VIEW returns LAST row

I have a view that may contain more than one row, looking like this:
[rate] | [vendorID]
8374 1234
6523 4321
5234 9374
In a SPROC, I need to set a param equal to the value of the first column from the first row of the view. something like this:
DECLARE #rate int;
SET #rate = (select top 1 rate from vendor_view where vendorID = 123)
SELECT #rate
But this ALWAYS returns the LAST row of the view.
In fact, if I simply run the subselect by itself, I only get the last row.
With 3 rows in the view, TOP 2 returns the FIRST and THIRD rows in order. With 4 rows, it's returning the top 3 in order. Yet still top 1 is returning the last.
DERP?!?
This works..
DECLARE #rate int;
CREATE TABLE #temp (vRate int)
INSERT INTO #temp (vRate) (select rate from vendor_view where vendorID = 123)
SET #rate = (select top 1 vRate from #temp)
SELECT #rate
DROP TABLE #temp
.. but can someone tell me why the first behaves so fudgely and how to do what I want? As explained in the comments, there is no meaningful column by which I can do an order by. Can I force the order in which rows are inserted to be the order in which they are returned?
[EDIT] I've also noticed that: select top 1 rate from ([view definition select]) also returns the correct values time and again.[/EDIT]
That is by design.
If you don't specify how the query should be sorted, the database is free to return the records in any order that is convenient. There is no natural order for a table that is used as default sort order.
What the order will actually be depends on how the query is planned, so you can't even rely on the same query giving a consistent result over time, as the database will gather statistics about the data and may change how the query is planned based on that.
To get the record that you expect, you simply have to specify how you want them sorted, for example:
select top 1 rate
from vendor_view
where vendorID = 123
order by rate
I ran into this problem on a query that had worked for years. We upgraded SQL Server and all of a sudden, an unordered select top 1 was not returning the final record in a table. We simply added an order by to the select.
My understanding is that SQL Server normally will generally provide you the results based on the clustered index if no order by is provided OR off of whatever index is picked by the engine. But, this is not a guarantee of a certain order.
If you don't have something to order off of, you need to add it. Either add a date inserted column and default it to GETDATE() or add an identity column. It won't help you historically, but it addresses the issue going forward.
While it doesn't necessarily make sense that the results of the query should be consistent, in this particular instance they are so we decided to leave it 'as is'. Ultimately it would be best to add a column, but this was not an option. The application this belongs to is slated to be discontinued sometime soon and the database server will not be upgraded from SQL 2005. I don't necessarily like this outcome, but it is what it is: until it breaks it shall not be fixed. :-x

Is the 'BETWEEN' function very expensive in SQL Server?

I'm trying to join two relatively simple tables together, but my query is experiencing serious hangups. I'm not sure why, but I think it might have something to do with the 'between' function. My first table looks something like this (with a lot of other columns, but this would be the only column I'm pulling):
RowNumber
1
2
3
4
5
6
7
8
My second table "groups" my rows into "blocks", and has the following schema:
BlockID RowNumberStart RowNumberStop
1 1 3
2 4 7
3 8 8
The desired result I'm looking to get is to link the RowNumber with the BlockID like below, with the same number of rows with the first table. So the result would look like this:
RowNumber BlockID
1 1
2 1
3 1
4 2
5 2
6 2
7 2
8 3
In order to get that, I used the following query, writing the results into a temp table:
select A.RowNumber, B.BlockID
into TEMP_TABLE
from TABLE_1 A left join TABLE_2 B
on A.RowNumber between B.RowNumberStart and B.RowNumberStop
TABLE_1 and TABLE_2 are actually very large tables. Table 1 is about 122M Rows, and TABLE_2 is about 65M rows. In TABLE_1, the RowNumber is defined as a 'bigint', and in TABLE_2, the BlockID, RowNumberStart, and RowNumberStop are all defined as 'int'. Not sure that makes a difference, but just wanted include that information, too.
The query has now been hung up for eight hours. Similar queries on this type and volume of data are not taking anywhere near this long. So I'm wondering if it could be the 'between' statement that's hanging up this query.
Definitely would welcome any suggestions on how to make this more efficient.
BETWEEN is simply shorthand for :
select A.RowNumber, B.BlockID
into TEMP_TABLE
from TABLE_1 A left join TABLE_2 B
on A.RowNumber >= B.RowNumberStart AND A.RowNumber <= B.RowNumberStop
If execution plan goes from B to A (but left join would indicate it has to go from A to B, really), then I'm assuming TABLE_1 is indexed on RowNumber (and that should be covering on this query). If it's only got a clustered index on RowNumber and the table is very wide, I recommend a non-clustered index only on RowNumber, since you'll fit a lot more rows per page that way.
Otherwise, you want to index on TABLE_2 on RowNumberStart DESC or RowNumberStop ASC, because for given A you'd need a DESC on RowNumberStart to match.
I think you might want to change your join to an INNER JOIN, the way your join criteria is set up. (Are you ever going to get TABLE_1 in no block?)
If you look at your execution plan, you should get more clues as to why the performance might be bad, but the Stop criteria is probably not used on the seek into TABLE_1.
Unfortunately SQLMenace's answer about the SELECT INTO has been deleted. My comment regarding that was meant to be: #Martin SELECT INTO performance isn't as bad as it once was, but I still recommend a CREATE TABLE for most production because SELECT INTO will infer types and NULLability. This is fine if you verify it is doing what you think it is doing, but creating a super long varchar or a decimal column with very strange precision can result in not only odd tables, but performance issues (especially with some of those big varchars when you forget a LEFT or whatever). I think it just helps to make it clear what you are expecting the table to look like. Often I will SELECT INTO using WHERE 0 = 1 and check out the schema and then script it with my tweaks (like adding an IDENTITY or adding a column with a timestamp default).
You have one main problem: you want to display too much data volume at once. Ar you really sure you want handle the result of ALL 122M rows from table 1 at once? Do you really need that?

Resources