According to this article:
When grouping by a column that contains NULLs in a GROUP BY statement, the NULL values are all put into one group in your result set.
However, what I want is to prevent rows from being grouped together on the NULL value.
The following code gives me one row:
IF(OBJECT_ID('tempdb..#TestTable') IS NOT NULL)
DROP TABLE #TestTable
GO
CREATE TABLE #TestTable ( ID INT, Value INT )
INSERT INTO #TestTable(ID, Value) VALUES
(NULL, 70),
(NULL, 70)
SELECT
ID
, Value
FROM #TestTable
GROUP BY ID, Value
The output is:
ID Value
NULL 70
However, I would like to have two rows. My desired result looks like this:
NULL 70
NULL 70
Is it possible to have two rows with GROUP BY?
UPDATE:
What I need is to count those rows:
SELECT
COUNT(1) AS rows
FROM (SELECT 1 AS foo
FROM #TestTable
GROUP BY ID, Value
)q
OUTPUT: 1
But there are actually two rows; I need the output to be 2.
What you need is a way to make the NULL values in Id unique. The following code makes them unique while still grouping the non-NULL values, because a CASE expression without an ELSE defaults to NULL:
group by Id, case when Id is NULL then NewId() end, Value
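As a minimal sketch combining that grouping with the count from the update (against the #TestTable above; assuming the NEWID() surrogate behaves as described, the count comes back as 2):
-- Each NULL ID row gets its own NEWID(), so the two NULL rows stay in separate groups.
SELECT ID, Value
FROM #TestTable
GROUP BY ID, CASE WHEN ID IS NULL THEN NEWID() END, Value

-- Wrapping the same grouping gives the count of 2 asked for in the update.
SELECT COUNT(1) AS rows
FROM (SELECT 1 AS foo
      FROM #TestTable
      GROUP BY ID, CASE WHEN ID IS NULL THEN NEWID() END, Value
     ) q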
Assuming you want this behavior because you do want to group by the values of the nullable column (Id in your example), you can add a ROW_NUMBER() where the Id column is NULL, using a common table expression, to create an artificial difference between the duplicate groups, like this:
-- Adding some more rows to the table
INSERT INTO #TestTable(ID, Value) VALUES
(NULL, 70),
(NULL, 70),
(1, 70),
(1, 70),
(2, 70);
The query, with the cte:
WITH CTE AS
(
SELECT Id, Value, IIF(Id IS NULL, ROW_NUMBER() OVER(ORDER BY Id), NULL) As Surrogate
FROM #TestTable
)
SELECT
ID
, Value
FROM CTE
GROUP BY ID, Surrogate, Value
Results:
ID Value
NULL 70
NULL 70
1 70
2 70
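To get the count from the update, the same CTE can be wrapped in the counting query; the count equals the number of groups, i.e. the number of result rows shown above:
WITH CTE AS
(
    SELECT Id, Value, IIF(Id IS NULL, ROW_NUMBER() OVER(ORDER BY Id), NULL) AS Surrogate
    FROM #TestTable
)
SELECT COUNT(1) AS rows
FROM (SELECT 1 AS foo
      FROM CTE
      GROUP BY ID, Surrogate, Value
     ) q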
I need help inserting 2 million rows into a table. The table I am inserting into has 4 billion rows, and the source table has 2 million. The insert rate is around 190 rows per minute.
DECLARE @BatchSize INT = 5000
WHILE 1 = 1
BEGIN
INSERT INTO [dbo].[a] ([a].[col1], [a].[col2], [a].[adate], [a].[importdate])
SELECT TOP(@BatchSize)
b.col1,
b.col2,
b.adate,
b.importdate
FROM
b
WHERE
NOT EXISTS (SELECT 1
FROM dbo.[a]
WHERE [a].col1 = b.col1
AND [a].col2 = b.col2
AND [a].adate = b.adate)
--AND [sent].aDate > getdate()-10)
IF @@ROWCOUNT < @BatchSize BREAK
END;
In the above query, col1, col2, and col3 are the primary key of table a (non-clustered). I want to insert every record from table b into table a ...
Table a has 3 indexes: one on col1, col2; a second on col1, col2, col3; and a third on col1 only ......
Can anyone offer any idea about making it faster?
I have 128 Gb RAM on SQL Server 2008 R2.
Thanks
Since you want all of the rows in B inserted into A, there should be no need to use an EXISTS. The problem becomes one of tracking the rows already transferred in prior batches. The following example generates a row number and uses it to group rows into batches. If the row number is ordered by an existing index, then there should be no sort pass required on the select side.
-- Sample data.
declare @A as Table ( Col1 Int, Col2 Int );
declare @B as Table ( Col1 Int, Col2 Int );
insert into @B ( Col1, Col2 ) values
( 1, 1 ), ( 1, 2 ), ( 1, 3 ), ( 1, 4 ), ( 1, 5 ),
( 2, 1 ), ( 2, 2 ), ( 2, 3 ), ( 2, 4 ), ( 2, 5 );
-- Rows to transfer in each batch.
declare @BatchSize as Int = 5;
-- First row to transfer in the current batch.
declare @BatchMark as Int = 1;
-- Count of rows processed (starts non-zero so the loop is entered).
declare @RowsProcessed as Int = 1;
-- Process the batches.
while @RowsProcessed > 0
begin
  insert into @A ( Col1, Col2 )
    select Col1, Col2
    from ( select Col1, Col2, Row_Number() over ( order by Col1, Col2 ) as RN from @B ) as PH
    where @BatchMark <= RN and RN < @BatchMark + @BatchSize;
  select @RowsProcessed = @@RowCount, @BatchMark += @BatchSize;
  select * from @A; -- Show progress.
end;
Alternatives would include adding a flag column to the B table to mark processed rows, using an existing id in the B table to track the maximum value already processed, using an additional table to track the index values of processed rows, deleting processed rows from B, and so on.
An output clause may prove useful for some of the alternatives.
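For instance, here is a minimal sketch of the delete-processed-rows alternative combined with an OUTPUT clause, assuming the dbo.a and dbo.b tables from the question and that dbo.a is a valid OUTPUT ... INTO target (no enabled triggers, not part of a foreign key). Note that it empties b as it goes, which is what that alternative implies:
DECLARE @BatchSize INT = 5000;

WHILE 1 = 1
BEGIN
    -- Move one batch: rows removed from b are written straight into a,
    -- so nothing extra is needed to track which rows were already transferred.
    DELETE TOP (@BatchSize) FROM dbo.b
    OUTPUT deleted.col1, deleted.col2, deleted.adate, deleted.importdate
        INTO dbo.a (col1, col2, adate, importdate);

    IF @@ROWCOUNT < @BatchSize BREAK;
END;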
Rebuilding the index with a suitable fill-factor before transferring the data may help. See here. It depends on knowledge of the index values which is not available in your question.
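As a sketch, with a hypothetical index name; the real names and fill-factor value depend on your schema and insert pattern:
-- Leave roughly 20% free space in each leaf page so the new rows cause fewer page splits.
-- IX_a_col1_col2 is a placeholder for whichever index on dbo.a is affected.
ALTER INDEX IX_a_col1_col2 ON dbo.a
REBUILD WITH (FILLFACTOR = 80);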
*Updated - please see below (past the picture)
I am really stuck on this particular problem. I have two tables, Projects and Project Allocations, joined by the Project ID.
My goal is to populate a modified projects table's columns using the rows of the project allocations table. I've included an image below to illustrate what I'm trying to achieve.
A project can have up to 6 Project Allocations. Each Project Allocation has an auto-increment ID (Allocation ID), but I can't use this ID in sub-selects because it isn't in a range of 1-6, so I can't distinguish who is the first PA, who is PA2, and who is PA3.
Example:
(SELECT pa1.name FROM table where project.projectid = project_allocations.projectid and JVID = '1') as [PA1 Name],
(SELECT pa2.name FROM table where project.projectid = project_allocations.projectid and JVID = '1') as [PA2 Name],
The modified Projects table has columns for PA1, PA2, PA3. I need to populate these columns based on the project allocations table. So the first record in the database FOR EACH project will be PA1.
I've put together an SQL Agent job that drops and re-creates this table with the added columns, so this is more about writing the project allocation rows into the modified projects table by row_num?
Any advice?
--Update
What I need to do now is to add the row_number as a column for EACH project, in descending order.
So the first row for each project ID will be 1 and for each row after that will be 2,3,4,5,6.
I've found the following code on this website:
use db_name
with cte as
(
select *
, new_row_id=ROW_NUMBER() OVER (ORDER BY eraprojectid desc)
from era_project_allocations_m
where era_project_allocations_m.eraprojectid = era_project_allocations_m.eraprojectid
)
update cte
set row_id = new_row_id
I've added row_id as a column in the previous SQL Agent step, and this code runs, but it doesn't produce a row_number FOR EACH projectid.
As you can see from the above image; I need to have 1-2 FOR Each project ID - effectively giving me thousands of 1s, 2s, 3s, 4s.
That way I can sort them into columns :)
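A minimal sketch of that numbering, assuming the same era_project_allocations_m table and row_id column as above; PARTITION BY eraprojectid is what restarts the count at 1 for each project (the ORDER BY column here is only a guess and just decides which row within a project becomes 1):
with cte as
(
    select *,
           -- Restart the numbering for every project.
           new_row_id = ROW_NUMBER() OVER (PARTITION BY eraprojectid ORDER BY eraprojectid desc)
    from era_project_allocations_m
)
update cte
set row_id = new_row_id;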
From what I can tell, a query using ROW_NUMBER is what you are after. (It might also be a job for PIVOT; a sketch of that follows the results below.)
Example:
create table Something (
someId int,
someValue varchar(255)
);
insert into Something values (1, 'one'), (1, 'two'), (1, 'three'), (1, 'four'), (2, 'ein'), (2, 'swei'), (3, 'un')
with cte as (
select someId,
someValue,
row_number() over(partition by someId order by someId) as rn
from Something
)
select distinct someId,
(select someValue from cte where ct.someId = someId and rn = 1) as value1,
(select someValue from cte where ct.someId = someId and rn = 2) as value2,
(select someValue from cte where ct.someId = someId and rn = 3) as value3,
(select someValue from cte where ct.someId = someId and rn = 4) as value4
into somethingElse
from cte ct;
select * from somethingElse;
Result:
someId value1 value2 value3 value4
1 one two three four
2 ein swei NULL NULL
3 un NULL NULL NULL
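For the PIVOT alternative mentioned above, here is a sketch against the same Something table; the rn column plays the same role, and the result set is the same as the one shown:
with cte as (
    select someId,
           someValue,
           row_number() over(partition by someId order by someId) as rn
    from Something
)
select someId,
       [1] as value1, [2] as value2, [3] as value3, [4] as value4
from cte
pivot (max(someValue) for rn in ([1], [2], [3], [4])) as p;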
All,
Please see below data values in two tables:
FIRST table (the driver table): contains the filter criteria for selecting IDs from the second table
Key1 Value
1 Banks
1 Col1|Small
2 InsuranceCompany
2 Col2|Global
3 Banks
3 Col1|Big
3 Col2|Local
4 CreditUnion
Script
GO
CREATE TABLE [dbo].[TEST_DRIVER](
[Key1] [int] NOT NULL,
[Value] [varchar](50) NOT NULL
);
GO
INSERT INTO dbo.TEST_DRIVER(Key1, Value)
VALUES('1', 'Banks');
GO
INSERT INTO dbo.TEST_DRIVER(Key1, Value)
VALUES('1', 'Col1|Small');
GO
INSERT INTO dbo.TEST_DRIVER(Key1, Value)
VALUES('2', 'InsuranceCompany');
GO
INSERT INTO dbo.TEST_DRIVER(Key1, Value)
VALUES('2', 'Col2|Global');
GO
INSERT INTO dbo.TEST_DRIVER(Key1, Value)
VALUES('3', 'Banks');
GO
INSERT INTO dbo.TEST_DRIVER(Key1, Value)
VALUES('3', 'Col1|Big');
GO
INSERT INTO dbo.TEST_DRIVER(Key1, Value)
VALUES('3', 'Col2|Local');
GO
INSERT INTO dbo.TEST_DRIVER(Key1, Value)
VALUES('4', 'CreditUnion');
GO
Note:
a) Filter Criteria may exist in 1, 2 or 3 rows.
b) The first selection criterion will always join to the InstitutionType column of the second table; the second and third criteria may or may not exist, and the column each joins to is specified in the data itself, with | separating the column name from the value.
SECOND Table: IDs from this table need to be found based on filter criteria in FIRST Table
ID   InstitutionType   Col1    Col2
100  Banks             Small
200  Banks                     Global
300  Banks             Big     Local
400  InsuranceCompany  Small   Local
500  InsuranceCompany          Global
600  CreditUnion       Small   Local
700  CreditUnion               Global
800  CDO                       Global
Script
CREATE TABLE [dbo].[TEST_TARGET](
[ID] [int] NOT NULL,
[InstitutionType] [varchar](50) NOT NULL,
[Col1] [varchar](50) NULL,
[Col2] [varchar](50) NULL
);
GO
INSERT INTO [dbo].[TEST_TARGET](ID, InstitutionType, Col1, Col2)
VALUES('100', 'Banks', 'Small', '');
GO
INSERT INTO [dbo].[TEST_TARGET](ID, InstitutionType, Col1, Col2)
VALUES('200', 'Banks', '', 'Global');
GO
INSERT INTO [dbo].[TEST_TARGET](ID, InstitutionType, Col1, Col2)
VALUES('300', 'Banks', 'Big', 'Local');
GO
INSERT INTO [dbo].[TEST_TARGET](ID, InstitutionType, Col1, Col2)
VALUES('400', 'InsuranceCompany', 'Small', 'Local');
GO
INSERT INTO [dbo].[TEST_TARGET](ID, InstitutionType, Col1, Col2)
VALUES('500', 'InsuranceCompany', '', 'Global');
GO
INSERT INTO [dbo].[TEST_TARGET](ID, InstitutionType, Col1, Col2)
VALUES('600', 'CreditUnion', 'Small', 'Local');
GO
INSERT INTO [dbo].[TEST_TARGET](ID, InstitutionType, Col1, Col2)
VALUES('700', 'CreditUnion', '', 'Global');
GO
INSERT INTO [dbo].[TEST_TARGET](ID, InstitutionType, Col1, Col2)
VALUES('800', 'CDO', '', 'Global');
GO
EXPECTED OUTPUT:
ID
100
300
500
600
700
I am able to do it using a cursor/while loop; however, I want to do it with set-based query logic. Can someone please try to answer this interesting problem?
The code below works and produces exactly your sample output, but the way you are matching records seems a bit off. It looks like you only match on Value and ignore Col1 and Col2 in the driver table when both are missing (which is why you are selecting 600 and 700), but not when one or the other exists. I would expect that, once the driver table is transformed with Col1 and Col2 in the proper format, you would just join to it on all the columns, like this (a full version of that stricter join appears after the query below):
on t.InstitutionType = d.Value
and coalesce(t.col1,'') = coalesce(d.col1,'')
and coalesce(t.col2,'') = coalesce(d.col2,'')
If there happened to be a mistake in your sample output, you can update the join condition in the code below.
with test_driver_cte AS
(select
Key1,
max(case when Value not like '%|%' then Value end) Value,
max(case when Value like '%|%' and col = 'Col1' then value2 end) Col1,
max(case when Value like '%|%' and col = 'Col2' then value2 end) Col2
from (
select
key1,
Value,
case when value like '%|%'
then substring(Value, 1, 4)
end col,
case when value like '%|%'
then substring(Value, 6, 10) --3rd parameter just needs to cover the longest value; len(Value) would also work
end value2
from test_driver
) td
group by key1
)
select distinct ID
from test_target t
join test_driver_cte d
on (t.InstitutionType = d.Value and d.col1 is null and d.col2 is null)
or (t.InstitutionType = d.Value and t.col1 = d.col1)
or (t.InstitutionType = d.Value and t.col2 = d.col2)
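If the stricter matching on all three columns described above is what you actually want, here is a sketch reusing the same transformation; with the sample data it would return 100, 300 and 500, dropping 600 and 700:
with test_driver_cte as
(
    select key1,
           max(case when Value not like '%|%' then Value end) as Value,
           max(case when col = 'Col1' then value2 end) as Col1,
           max(case when col = 'Col2' then value2 end) as Col2
    from (
        select key1, Value,
               case when Value like '%|%' then substring(Value, 1, 4) end as col,
               case when Value like '%|%' then substring(Value, 6, 10) end as value2
        from test_driver
    ) td
    group by key1
)
select distinct ID
from test_target t
join test_driver_cte d
  on t.InstitutionType = d.Value
 and coalesce(t.Col1, '') = coalesce(d.Col1, '')
 and coalesce(t.Col2, '') = coalesce(d.Col2, '');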
What you really want is a table like this
Key1  Value             Col1   Col2
1     Banks             Small
2     InsuranceCompany         Global
3     Banks             Big    Local
4     CreditUnion
If you can't make it start that way then you can transform it like this:
SELECT D.Key1, D.Value, SUBSTRING(c1.Value, 6, LEN(c1.Value)) AS Col1, SUBSTRING(c2.Value, 6, LEN(c2.Value)) AS Col2
FROM DRIVER D
LEFT JOIN DRIVER c1 ON D.Key1 = c1.Key1 AND LEFT(c1.Value, 5) = 'Col1|'
LEFT JOIN DRIVER c2 ON D.Key1 = c2.Key1 AND LEFT(c2.Value, 5) = 'Col2|'
WHERE LEFT(D.Value, 3) != 'Col'
Now you just join to get your result:
SELECT ID
FROM TABLE2
JOIN (
    SELECT D.Key1, D.Value, SUBSTRING(c1.Value, 6, LEN(c1.Value)) AS Col1, SUBSTRING(c2.Value, 6, LEN(c2.Value)) AS Col2
    FROM DRIVER D
    LEFT JOIN DRIVER c1 ON D.Key1 = c1.Key1 AND LEFT(c1.Value, 5) = 'Col1|'
    LEFT JOIN DRIVER c2 ON D.Key1 = c2.Key1 AND LEFT(c2.Value, 5) = 'Col2|'
    WHERE LEFT(D.Value, 3) != 'Col'
) x ON TABLE2.InstitutionType = x.Value
   AND COALESCE(x.Col1, TABLE2.Col1) = TABLE2.Col1
   AND COALESCE(x.Col2, TABLE2.Col2) = TABLE2.Col2
You could also write the join with a row-value constructor like this (the intent is the same, though note that SQL Server's T-SQL does not support this comparison syntax, unlike some other databases)
) x ON (x.Key1, COALESCE(x.col1,TABLE2.Col1), COALESCE(x.col2,TABLE2.Col2)) =
(TABLE2.InstitutionType, TABLE2.Col1, TABLE2.Col2)
I did not test so I might have a typo or an off by one on the substring.
Can anyone help me with T-SQL to sort this table
ID Comment ParentId
-- ------- --------
3 t1 NULL
4 t2 NULL
5 t1_1 3
6 t2_1 4
7 t1_1_1 5
to look like this
ID Comment ParentId
-- ------- --------
3 t1 NULL
5 t1_1 3
7 t1_1_1 5
4 t2 NULL
6 t2_1 4
Kind regards,
Lennart
try this:
DECLARE @YourTable table (id int, Comment varchar(10), parentID int)
INSERT INTO @YourTable VALUES (3, 't1' , NULL)
INSERT INTO @YourTable VALUES (4, 't2' , NULL)
INSERT INTO @YourTable VALUES (5, 't1_1' , 3)
INSERT INTO @YourTable VALUES (6, 't2_1' , 4)
INSERT INTO @YourTable VALUES (7, 't1_1_1', 5)
;with c as
(
SELECT id, comment, parentid, CONVERT(varchar(8000),RIGHT('0000000000'+CONVERT(varchar(10),id),10)) as SortBy
from @YourTable
where parentID IS NULL
UNION ALL
SELECT y.id, y.comment, y.parentid, LEFT(c.SortBy+CONVERT(varchar(8000),RIGHT('0000000000'+CONVERT(varchar(10),y.id),10)),8000) AS SortBy
FROM c
INNER JOIN @YourTable y ON c.ID = y.ParentID
)
select * from C ORDER BY SortBy
EDIT
here is output
id comment parentid SortBy
----------- ---------- ----------- ---------------------------------
3 t1 NULL 0000000003
5 t1_1 3 00000000030000000005
7 t1_1_1 5 000000000300000000050000000007
4 t2 NULL 0000000004
6 t2_1 4 00000000040000000006
(5 row(s) affected)
Hmm, ORDER BY?
http://t-sql.pro/t-sql/ORDER-BY.aspx
SELECT ID, Comment, ParentId
FROM TestTable
ORDER BY Comment, ParentId asc
This sounds very much like a homework question, but here's some hints on where to go with this:
You'll want to do a quick google or StackOverflow search for the ORDER BY clause to be able to get a set of results ordered by the column you want to use (i.e. the 'Comment' column).
Once you've got that, you can start writing a SQL statement to order your results.
If you then need to re-order the actual table (and not just get the results in a specific order), you'll need to look up temporary tables (try searching for 'DECLARE @table TABLE'). Much like any temp swap, you can place the results in a temporary table, delete the old data, and then repopulate the table from the temporary copy, this time in the order you want.
But just ordering by Comment will give you that? Or have I missed the point?!
declare #table table
(
Comment varchar(10)
)
insert into #table (Comment) values ('t1')
insert into #table (Comment) values ('t2')
insert into #table (Comment) values ('t1_1')
insert into #table (Comment) values ('t2_1')
insert into #table (Comment) values ('t1_1_1')
select * from #table order by comment