Can't add data to datetime2 field - sql-server

I'm using SQL Server Express 2008 and I'm trying to add data to a field in a table which has a datatype of datetime2(7).
This is what I'm trying to add:
'2012-02-02 12:32:10.1234'
But I am getting the error
Msg 8152, Level 16, State 4, Line 1
String or binary data would be truncated.
The statement has been terminated.
Does this mean that it's too long to be added to the field and should be cut down a bit? If so, can you give me an example of how it should look?
Note - I've also tried it in this format:
'01/01/98 23:59:59.999'
Thanks
EDIT:
The actual statement:
INSERT INTO dbo.myTable
(
nbr,
id,
name,
dsc,
start_date,
end_date,
last_date,
condition,
condtion_dsc,
crte_dte,
someting,
activation_date,
denial_date,
another_date,
a_name,
prior_auth_start_date,
prior_auth_end_date,
history_cmnt,
cmnt,
source,
program,
[IC-code],
[IC-description],
another_start_date,
another_start_date,
ver_nbr,
created_by,
creation_date,
updated_by,
updated_date)
VALUES
(
26,
'a',
'sometinh',
'c',
01/01/98 23:59:59.999,
01/01/98 23:59:59.999,
01/01/98 23:59:59.999,
'as',
'asdf',
01/01/98 23:59:59.999,
'lkop',
01/01/98 23:59:59.999,
01/01/98 23:59:59.999,
01/01/98 23:59:59.999,
'a',
01/01/98 23:59:59.999,
01/01/98 23:59:59.999,
'b',
'c',
'd',
'b',
'c',
'd',
01/01/98 23:59:59.999,
01/01/98 23:59:59.999,
423,
'Monkeys',
01/01/98 23:59:59.999,
'Goats',
01/01/98 23:59:59.999
);

Take a close look at the table you are trying to insert into. I bet one of the values you're trying to insert into a char/varchar/nchar/nvarchar column is too long.
SELECT
name,
max_length / CASE WHEN system_type_id IN (231, 239) -- nvarchar/nchar store 2 bytes per character
THEN 2 ELSE 1 END AS max_length_chars
FROM sys.columns
WHERE [object_id] = OBJECT_ID('dbo.TargetTableName')
AND system_type_id IN (167, 175, 231, 239); -- varchar, char, nvarchar, nchar
This will get you a list like:
name     max_length_chars
-------- ----------------
col1     32
col5     64
col7     12
Now, compare this list to the literals you have in your VALUES clause. As I suggested in a comment, I bet one of these has more characters than the table allows.
There's a chance there are binary or varbinary columns, and the issue is there, but I strongly suspect this is a simple "string is too long" problem - and has absolutely nothing to do with your DATETIME2(7) value.
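Separately, note that the datetime literals in the posted INSERT are unquoted: SQL Server will not parse 01/01/98 23:59:59.999 as a date without quotes. A minimal sketch (using a hypothetical one-column table) of inserting the values from the question into a datetime2(7) column, once the string-length problem is fixed:
CREATE TABLE #demo (evt datetime2(7));

-- Quoted ISO 8601-style literals parse unambiguously regardless of DATEFORMAT settings
INSERT INTO #demo (evt) VALUES ('2012-02-02 12:32:10.1234');
INSERT INTO #demo (evt) VALUES ('1998-01-01 23:59:59.999');

SELECT evt FROM #demo;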

Related

Snowflake - How to create summary table containing unique records

I'm looking for some Snowflake syntax assistance in how to generate a summary table or view from an existing table. My summary table should have 1 row per unique id from the existing table along with boolean values indicating if the various milestones (as per the summary column names) have been hit. Any help is appreciated as I am a Snowflake novice. Thanks.
Existing Table
Desired Summary Table/View
So using Himanshu's data, thank you:
WITH fake_data(id, updated, pipeline_id, stage_id) AS (
SELECT column1, to_date(column2,'mm/dd/yyyy hh:mm:ss'), column3, column4
FROM VALUES
(1111, '02/01/2022 09:01:00', 'A', '1' ),
(1111, '02/01/2022 10:01:00', 'A', '2' ),
(1111, '02/01/2022 11:01:00', 'B', '5' ),
(2222, '02/02/2022 13:01:00', 'A', '1' ),
(2222, '02/03/2022 18:01:00', 'B', '5' ),
(2222, '02/04/2022 07:01:00', 'B', '6' ),
(3333, '02/02/2022 14:01:00', 'A', '1' ),
(3333, '02/03/2022 18:01:00', 'A', '2' ),
(3333, '02/03/2022 07:01:00', 'C', '7' ),
(3333, '02/03/2022 21:01:00', 'C', '8' ),
(3333, '02/05/2022 17:01:00', 'C', '9' )
)
We aggregate across each id and use COUNT_IF to count the rows that meet each criterion; if the count is > 0, the milestone was hit:
SELECT
id,
count_if(pipeline_id='A')>0 AS hit_stage_a,
count_if(pipeline_id='B')>0 AS hit_stage_b,
count_if(pipeline_id='C')>0 AS hit_stage_c,
count_if(stage_id='4')>0 AS hit_stage_4,
count_if(stage_id='5')>0 AS hit_stage_5,
count_if(stage_id='6')>0 AS hit_stage_6
FROM fake_data
GROUP BY 1
ORDER BY 1;
gives:
ID    HIT_STAGE_A  HIT_STAGE_B  HIT_STAGE_C  HIT_STAGE_4  HIT_STAGE_5  HIT_STAGE_6
1111  TRUE         TRUE         FALSE        FALSE        TRUE         FALSE
2222  TRUE         TRUE         FALSE        FALSE        TRUE         TRUE
3333  TRUE         FALSE        TRUE         FALSE        FALSE        FALSE
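If you prefer boolean aggregates over counts, Snowflake's BOOLOR_AGG should produce the same result; a sketch against the same fake_data CTE:
SELECT
id,
boolor_agg(pipeline_id = 'A') AS hit_stage_a,
boolor_agg(pipeline_id = 'B') AS hit_stage_b,
boolor_agg(pipeline_id = 'C') AS hit_stage_c,
boolor_agg(stage_id = '4') AS hit_stage_4,
boolor_agg(stage_id = '5') AS hit_stage_5,
boolor_agg(stage_id = '6') AS hit_stage_6
FROM fake_data
GROUP BY 1
ORDER BY 1;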
Try this and see if it helps to get what you want.
SELECT ID,
decode(HIT_PIPELINE_A, NULL, FALSE, TRUE),
decode(HIT_PIPELINE_B, NULL, FALSE, TRUE),
decode(HIT_PIPELINE_C, NULL, FALSE, TRUE),
decode(HIT_STAGE_4, NULL, FALSE, TRUE),
decode(HIT_STAGE_5, NULL, FALSE, TRUE),
decode(HIT_STAGE_6, NULL, FALSE, TRUE)
FROM
(
SELECT * FROM tab1
PIVOT(MAX(PIPELINE_ID) FOR stage_id IN ('1','2','3','4','5','6'))
AS P(ID, DT, HIT_PIPELINE_A, HIT_PIPELINE_B, HIT_PIPELINE_C, HIT_STAGE_4, HIT_STAGE_5, HIT_STAGE_6)
) ORDER BY ID;
The setup used for the example above:
create or replace table Tab1 (ID varchar2(100), updated date, pipeline_id varchar2(100), stage_id varchar2(10));
insert into tab1 values(1111, to_date('02/01/2022 09:01:00','mm/dd/yyyy hh:mm:ss'), 'A', '1' );
insert into tab1 values(1111, to_date('02/01/2022 10:01:00','mm/dd/yyyy hh:mm:ss'), 'A', '2' );
insert into tab1 values(1111, to_date('02/01/2022 11:01:00','mm/dd/yyyy hh:mm:ss'), 'B', '5' );
insert into tab1 values(2222, to_date('02/02/2022 13:01:00','mm/dd/yyyy hh:mm:ss'), 'A', '1' );
insert into tab1 values(2222, to_date('02/03/2022 18:01:00','mm/dd/yyyy hh:mm:ss'), 'B', '5' );
insert into tab1 values(2222, to_date('02/04/2022 07:01:00','mm/dd/yyyy hh:mm:ss'), 'B', '6' );
insert into tab1 values(3333, to_date('02/02/2022 14:01:00','mm/dd/yyyy hh:mm:ss'), 'A', '1' );
insert into tab1 values(3333, to_date('02/03/2022 18:01:00','mm/dd/yyyy hh:mm:ss'), 'A', '2' );
insert into tab1 values(3333, to_date('02/03/2022 07:01:00','mm/dd/yyyy hh:mm:ss'), 'C', '7' );
insert into tab1 values(3333, to_date('02/03/2022 21:01:00','mm/dd/yyyy hh:mm:ss'), 'C', '8' );
insert into tab1 values(3333, to_date('02/05/2022 17:01:00','mm/dd/yyyy hh:mm:ss'), 'C', '9' );

Snowflake: How do I update a column with values taken at random from another table?

I've been struggling with this for a while now. Imagine I have these two tables:
CREATE TEMPORARY TABLE tmp_target AS (
SELECT * FROM VALUES
('John', 43, 'm', 17363)
, ('Mark', 21, 'm', 16354)
, ('Jean', 25, 'f', 74615)
, ('Sara', 63, 'f', 26531)
, ('Alyx', 32, 'f', 42365)
AS target (name, age, gender, zip)
);
and
CREATE TEMPORARY TABLE tmp_source AS (
SELECT * FROM VALUES
('Cory', 42, 'm', 15156)
, ('Fred', 51, 'm', 71451)
, ('Mimi', 22, 'f', 45624)
, ('Matt', 61, 'm', 12734)
, ('Olga', 19, 'f', 52462)
, ('Cleo', 29, 'f', 23352)
, ('Simm', 31, 'm', 62445)
, ('Mona', 37, 'f', 23261)
, ('Feng', 44, 'f', 64335)
, ('King', 57, 'm', 12225)
AS source (name, age, gender, zip)
);
I would like to update the tmp_target table by taking 5 rows at random from the tmp_source table for the column(s) I'm interested in. For example, maybe I want to replace all the names with 5 random names from tmp_source, or maybe I want to replace the names and the ages.
My first attempt was this:
UPDATE tmp_target t SET t.name = s.name FROM tmp_source s;
However, when I examine the target table, I notice that quite a few of the names are duplicated, usually in pairs. As well, Snowflake gives me number of rows updated: 5 as well as number of multi-joined rows updated: 5. I believe this is due to the non-deterministic nature of what's happening, possibly as noted in the Snowflake documentation on updates. Not to mention I get the nagging feeling that this is somehow horribly inefficient if the tables had many records.
Then I tried something to grab 5 random rows from the source table:
UPDATE tmp_target t SET t.name = cte.name
FROM (
WITH upd AS (SELECT name FROM tmp_source SAMPLE ROW (5 ROWS))
SELECT name FROM upd
) AS cte;
But I seem to run into the exact same issue, both when I examine the target table, and as reported by the number of multi-joined rows.
I was wondering if I can use row numbering somehow, but while I can generate row numbers in the subquery, I don't know how to do that in the SET part of the outside query.
I want to add that neither table has any identifiers or indexes that can be used, and I'm looking for a solution that wouldn't require any.
I would very much appreciate it if anyone can provide solutions or ideas that are as clean and tidy as possible, with some consideration given to efficiency (imagine a target table of 100K rows and a source table of 10M rows).
Thank you!
I like the two answers already provided, but let me give you a simple answer to solve the simple case:
UPDATE tmp_target t
SET t.name = (
select array_agg(s.name) possible_names
from tmp_source s
)[uniform(0, 9, random())]
;
The secret of this solution is building an array of possible values and choosing one at random for each updated row. Note that uniform(0, 9, random()) assumes the source holds exactly 10 names (array indices 0 through 9) and samples with replacement, so duplicates are still possible.
Update: Now with a JavaScript UDF that will help us choose each name from source only once
create or replace function incremental_thing()
returns float
language javascript
as
$$
if (typeof(inc) === "undefined") inc = 0;
return inc++;
$$
;
UPDATE tmp_target t
SET t.name = (
select array_agg(s.name) within group (order by random())
from tmp_source s
)[incremental_thing()::integer]
;
Note that the JS UDF returns an incremental value each time it’s called, and that helps me choose the next value from a sorted array to use on an update.
Since the value is incremented inside the JS UDF, this will work as long as there's only one JS env involved. To force single-node processing and avoid parallelism, choose an XS warehouse and test.
Two examples follow: the first uses a temporary table to house the joined data by a rownum; the second includes everything in one query. Note that I used UPPER and lower case strings to make sure the records were being updated the way I wanted.
CREATE OR REPLACE TEMPORARY TABLE tmp_target AS (
SELECT * FROM VALUES
('John', 43, 'm', 17363)
, ('Mark', 21, 'm', 16354)
, ('Jean', 25, 'f', 74615)
, ('Sara', 63, 'f', 26531)
, ('Alyx', 32, 'f', 42365)
AS target (name, age, gender, zip)
);
CREATE OR REPLACE TEMPORARY TABLE tmp_source AS (
SELECT * FROM VALUES
('CORY', 42, 'M', 15156)
, ('FRED', 51, 'M', 71451)
, ('MIMI', 22, 'F', 45624)
, ('MATT', 61, 'M', 12734)
, ('OLGA', 19, 'F', 52462)
, ('CLEO', 29, 'F', 23352)
, ('SIMM', 31, 'M', 62445)
, ('MONA', 37, 'F', 23261)
, ('FENG', 44, 'F', 64335)
, ('KING', 57, 'M', 12225)
AS source (name, age, gender, zip)
);
CREATE OR REPLACE TEMPORARY TABLE t1 as (
with src as (
SELECT tmp_source.*, row_number() over (order by 1) tmp_id
FROM tmp_source SAMPLE ROW (5 ROWS)),
tgt as (
SELECT tmp_target.*, row_number() over (order by 1) tmp_id
FROM tmp_target SAMPLE ROW (5 ROWS))
SELECT src.name as src_name,
src.age as src_age,
src.gender as src_gender,
src.zip as src_zip,
src.tmp_id as tmp_id,
tgt.name as tgt_name,
tgt.age as tgt_age,
tgt.gender as tgt_gender,
tgt.zip as tgt_zip
FROM src, tgt
WHERE src.tmp_id = tgt.tmp_id);
UPDATE tmp_target a
SET a.name = b.src_name,
a.gender = b.src_gender
FROM (SELECT * FROM t1) b
WHERE a.name = b.tgt_name
AND a.age = b.tgt_age
AND a.gender = b.tgt_gender
AND a.zip = b.tgt_zip;
UPDATE tmp_target a
SET a.name = b.src_name,
a.gender = b.src_gender
FROM (
with src as (
SELECT tmp_source.*, row_number() over (order by 1) tmp_id
FROM tmp_source SAMPLE ROW (5 ROWS)),
tgt as (
SELECT tmp_target.*, row_number() over (order by 1) tmp_id
FROM tmp_target SAMPLE ROW (5 ROWS))
SELECT src.name as src_name,
src.age as src_age,
src.gender as src_gender,
src.zip as src_zip,
src.tmp_id as tmp_id,
tgt.name as tgt_name,
tgt.age as tgt_age,
tgt.gender as tgt_gender,
tgt.zip as tgt_zip
FROM src, tgt
WHERE src.tmp_id = tgt.tmp_id) b
WHERE a.name = b.tgt_name
AND a.age = b.tgt_age
AND a.gender = b.tgt_gender
AND a.zip = b.tgt_zip;
At a first pass, this is all that came to mind. I'm not sure if it suits your example perfectly, since it involves reloading the table.
It should be comparably performant to any other solution that uses a generated rownum. At least to my knowledge, in Snowflake, an update is no more performant than an insert (at least in this case where you're touching every record, and every micropartition, regardless).
INSERT OVERWRITE INTO tmp_target
with target as (
select
age,
gender,
zip,
row_number() over (order by 1) rownum
from tmp_target
)
,source as (
select
name,
row_number() over (order by 1) rownum
from tmp_source
SAMPLE ROW (5 ROWS)
)
SELECT
s.name,
t.age,
t.gender,
t.zip
from target t
join source s on t.rownum = s.rownum;

SQL query to determine if groups of rows have a particular value in a particular column

I am working in SQL Server 2016. I have the following table and sample data:
CREATE TABLE A
(
col1 char(1)
,col2 int
,indicator_flag char(4)
)
;
INSERT INTO A
VALUES
('A', 1, 'Pass')
,('A', 2, 'Pass')
,('A', 3, 'Fail')
,('B', 10, 'Pass')
,('C', 19, 'Fail')
,('D', 1, 'Fail')
,('D', 2, 'Fail')
,('E', 1, 'Pass')
,('E', 2, 'Pass')
,('F', 20, 'Fail')
,('F', 21, 'Fail')
,('F', 100, 'Pass')
;
The indicator_flag column will only ever hold values 'Pass' and 'Fail'. For every distinct value in col1, I want to return a collapsed indicator_flag value according to the following rule -- if all values are 'Pass', then 'Pass'; else, 'Fail'.
So, for the sample data, I expect the following output:
col1 collapsed_indicator_flag
A Fail
B Pass
C Fail
D Fail
E Pass
F Fail
How can I achieve this output? The solution needs to perform well. (My actual table is very large.)
One method is to use aggregation:
select col1, min(indicator_flag) as indicator_flag
from a
group by col1;
This uses the observation that 'Pass' > 'Fail'.
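If you would rather not rely on the accident that 'Pass' sorts after 'Fail', the same aggregation can be written explicitly with conditional aggregation:
select col1,
       case when max(case when indicator_flag = 'Fail' then 1 else 0 end) = 1
            then 'Fail' else 'Pass'
       end as collapsed_indicator_flag
from A
group by col1;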
If you want performance, then you could speed this up if you have the right indexes and another table with just col1 values:
select t.col1, coalesce(a.indicator_flag, 'Pass') as indicator_flag
from col1table t outer apply
(select a.*
from a
where a.col1 = t.col1 and a.indicator_flag = 'Fail'
) a;
The index for this query would be a(col1, indicator_flag).
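For example, against the table A from the question:
CREATE INDEX IX_A_col1_indicator_flag ON A (col1, indicator_flag);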

SQL Server: rearrange and flatten out table of values?

I am working on a Stored Procedure that retrieves certain values from my database.
I am able to get the values I need, but I'm having a hard time figuring out how to order them the way I need.
Below is what I'm trying to achieve. I already have table properties (left), and need to create table newProperties by running a SELECT on properties.
Please note:
the field valueTypeID will ALWAYS be either 68 or 80.
the field value will never be the same. Each value will be a long string of chars that changes for each value (I have simplified for my question)
I think this should be the starting point of your SELECT statement:
select
p.genericID,
p68.Value as ValueTypeId68,
p80.Value as ValueTypeId80
from partials p
join partialsProps p68
on p.genericId = p68.genericId
and p68.valueTypeID = 68
join partialsProps p80
on p.genericId = p80.genericId
and p80.valueTypeID = 80
You can add the #cycle and #clientAccountID conditions on top of it.
SQL Fiddle: http://www.sqlfiddle.com/#!6/472de/1
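A conditional-aggregation variant of the same idea (a sketch against the same partials/partialsProps tables) avoids joining partialsProps twice:
select p.genericID,
       max(case when pp.valueTypeID = 68 then pp.Value end) as ValueTypeId68,
       max(case when pp.valueTypeID = 80 then pp.Value end) as ValueTypeId80
from partials p
join partialsProps pp
  on p.genericID = pp.genericID
group by p.genericID;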
Are 68 and 80 the only possible values for the column valueTypeID? If so, then the answer provided by w0lf should work for you, though I won't say it is the only solution.
However, if there can be values for valueTypeID apart from 68 and 80, then I would like to propose a solution here.
I am assuming that you have numeric and non-numeric values in the column value, and that the problem is that they come together in a single column, hence you want to separate them out. This is the scenario I imagine.
NOTE: If this does not provide a satisfactory solution, please describe the scenario in more detail.
Go to SQLFiddle
-- Create table as you have provided in the question
CREATE TABLE #partials
([genericID] int);
INSERT INTO #partials ([genericID])
VALUES (11),(12),(13),(14);
CREATE TABLE #partialsProps
([genericID] int, [valueTypeID] int, [Value] varchar(1));
-- Insert values similar
INSERT INTO #partialsProps
([genericID], [valueTypeID], [Value])
VALUES
(11, 68, 'A'),
(11, 80, '1'),
(12, 68, 'Z'),
(12, 80, '2'),
(13, 68, 'B'),
(13, 80, '3'),
(14, 68, 'Y'),
(14, 80, '4')
;
select r1.value as Val1, r2.value as Val2
from
(select pp.genericID, value, valueTypeID
 from #partialsProps pp join #partials p on p.genericID = pp.genericID
 where ISNUMERIC(value) = 0
 group by valueTypeID, value, pp.genericID) r1
join
(select pp.genericID, value, valueTypeID
 from #partialsProps pp join #partials p on p.genericID = pp.genericID
 where ISNUMERIC(value) = 1
 group by valueTypeID, value, pp.genericID) r2
on r1.genericID = r2.genericID;
drop table #partials;
drop table #partialsProps;

Moving Average PER TICKER for each day

I am trying to calculate the, say, 3 day moving average (in reality 30 day) volume for stocks.
I'm trying to get the average of the last 3 date entries (rather than today minus 3 days). I've been trying to do something with row_number in SQL Server 2012, but with no success. Can anyone help out? Below is a template schema and my rubbish attempt at SQL. I have tried various incarnations of the SQL below with GROUP BYs, but it's still not working. Many thanks!
select dt_eod, ticker, volume
from
(
select dt_eod, ticker, avg(volume)
row_number() over(partition by dt_eod order by max_close desc) rn
from mytable
) src
where rn >= 1 and rn <= 3
order by dt_eod
Sample Schema:
CREATE TABLE yourtable
([dt_date] int, [ticker] varchar(1), [volume] int);
INSERT INTO yourtable
([dt_date], [ticker], [volume])
VALUES
(20121201, 'A', 5),
(20121201, 'B', 7),
(20121201, 'C', 6),
(20121202, 'A', 10),
(20121202, 'B', 8),
(20121202, 'C', 7),
(20121203, 'A', 10),
(20121203, 'B', 87),
(20121203, 'C', 74),
(20121204, 'A', 10),
(20121204, 'B', 86),
(20121204, 'C', 67),
(20121205, 'A', 100),
(20121205, 'B', 84),
(20121205, 'C', 70),
(20121206, 'A', 258),
(20121206, 'B', 864),
(20121206, 'C', 740);
Three day average for each row:
with top3Values as
(
select t.ticker, t.dt_date, top3.volume
from yourtable t
outer apply
(
select top 3 top3.volume
from yourtable top3
where t.ticker = top3.ticker
and t.dt_date >= top3.dt_date
order by top3.dt_date desc
) top3
)
select ticker, dt_date, ThreeDayVolume = avg(volume)
from top3Values
group by ticker, dt_date
order by ticker, dt_date
SQL Fiddle demo.
Latest value:
with tickers as
(
select distinct ticker from yourtable
), top3Values as
(
select t.ticker, top3.volume
from tickers t
outer apply
(
select top 3 top3.volume
from yourtable top3
where t.ticker = top3.ticker
order by dt_date desc
) top3
)
select ticker, ThreeDayVolume = avg(volume)
from top3Values
group by ticker
order by ticker
SQL Fiddle demo.
Realistically you wouldn't need to create the tickers CTE for the second query, as you'd be basing this on a [ticker] table, and you'd probably have some sort of date parameter in the query, but hopefully this will get you on the right track.
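For instance, a sketch under those assumptions (a hypothetical tickers table and a hypothetical @asOfDate parameter):
declare @asOfDate int = 20121206;  -- hypothetical as-of date, matching the int yyyymmdd format above

with top3Values as
(
  select t.ticker, top3.volume
  from tickers t  -- hypothetical table of distinct tickers
  outer apply
  (
    select top 3 top3.volume
    from yourtable top3
    where t.ticker = top3.ticker
      and top3.dt_date <= @asOfDate
    order by top3.dt_date desc
  ) top3
)
select ticker, ThreeDayVolume = avg(volume)
from top3Values
group by ticker
order by ticker;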
You mentioned SQL 2012, which means that you can leverage a much simpler paradigm.
select dt_date, ticker, avg(1.0*volume) over (
partition by ticker
order by dt_date
ROWS BETWEEN 2 preceding and current row
)
from yourtable
I find this much more transparent as to what is actually trying to be accomplished.
You may want to look at yet another technique that is presented here: SQL-Server Moving Averages set-based algorithm with flexible window-periods and no self-joins.
The algorithm is quite speedy (much faster than APPLY and does not degrade in performance like APPLY does as data-points-window expands), easily adaptable to your requirement, works with pre-SQL2012, and overcomes the limitations of SQL-2012's windowing functionality that requires hard-coding of window-width in the OVER/PARTITION-BY clause.
For a stock-market type application with moving price-averages, it is a common requirement to allow a user to vary the number of data-points included in the average (from a UI selection, like allowing a user to choose 7 day, 30 day, 60 day, etc), and SQL-2012's OVER clause cannot handle this variable partition-width requirement without dynamic SQL.
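For completeness, a sketch (with a hypothetical @days parameter) of working around the hard-coded frame extent on SQL 2012 via dynamic SQL:
declare @days int = 3;  -- hypothetical user-selected window width

declare @sql nvarchar(max) = N'
select dt_date, ticker,
       avg(1.0*volume) over (
           partition by ticker
           order by dt_date
           rows between ' + cast(@days - 1 as nvarchar(10)) + N' preceding and current row
       ) as moving_avg_volume
from yourtable;';

exec sp_executesql @sql;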
