Using an INSERT query where one column comes from another table - sql-server

With these tables:
Table1
Date | Barcode | Sold | AmountSold

Table2
Barcode | Description | RetailPrice
00001   | Item1       | 1.00
00002   | Item2       | 2.00
00003   | Item3       | 3.00
00004   | Item4       | 4.00
00005   | Item5       | 5.00
Is there a way to use an INSERT to Table1, like this:
INSERT INTO dbo.Table1
VALUES ('07/11/2017', '00003', 5, (? * 5))
With the ? being the RetailPrice (which is 3.00) of 00003 from Table2, then multiplied with Sold (which is 5)?
I have stumbled upon INSERT INTO SELECT, but that requires every inserted column to have a matching value in the SELECT, which is not what I need.
Note: the first three values will come from an external source, so the 4th value is the only one that needs to come from another table.
I could of course run a separate query first to get the RetailPrice before inserting, but I'm trying to avoid that approach to reduce loading time.

I believe that you are after something like this:
INSERT INTO dbo.Table1 (Date, Barcode , Sold , AmountSold)
SELECT '07/11/2017', '00003', 5, 5 * RetailPrice
FROM Table2
-- WHERE Barcode = 'XXX'

INSERT INTO dbo.table1
VALUES ('07/11/2017', '00003', 5,
        ((SELECT RetailPrice
          FROM dbo.table2
          WHERE dbo.table2.Barcode = '00003') * 5))
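Both answers above hinge on a scalar subquery supplying the 4th value at insert time. A minimal runnable sketch of that pattern, using Python's sqlite3 for illustration (the table and column names mirror the question; the same syntax works in SQL Server):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Table1 (Date TEXT, Barcode TEXT, Sold INTEGER, AmountSold REAL);
CREATE TABLE Table2 (Barcode TEXT, Description TEXT, RetailPrice REAL);
INSERT INTO Table2 VALUES ('00003', 'Item3', 3.00);
""")

# The 4th value is computed at insert time: RetailPrice from Table2 times Sold.
con.execute("""
INSERT INTO Table1
VALUES ('07/11/2017', '00003', 5,
        (SELECT RetailPrice FROM Table2 WHERE Barcode = '00003') * 5)
""")

amount = con.execute("SELECT AmountSold FROM Table1").fetchone()[0]
print(amount)  # 15.0
```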

Related

SQL query to sum multiple columns across multiple rows

I have a table in SQL Server that has multiple rows per id. There are values that need to be subtracted in two columns as well. In the sample below, I need to subtract 1033.90 - 1033.90 - 1181.60 to equal -1181.60.
ID | Value1  | Value2
1  | 1033.90 | 0.00
1  | 0.00    | 1033.90
1  | 1181.60 | 0.00
I have tried a few different ways found from others' questions but nothing has worked yet. Cross Joins or Unions seemed to be the way but have yet to give the result needed. Can anyone lend any clues?
All you seem to want is SUM() with a GROUP BY id:
DROP TABLE IF EXISTS T;
CREATE TABLE T
(ID INT, Value1 DECIMAL(10,2), Value2 DECIMAL(10,2));
INSERT INTO T VALUES
(1, 1033.90 ,0.00),
(1, 0.00 ,1033.90),
(1, 1181.60 ,0.00);
SELECT ID , SUM(VALUE1) - SUM(VALUE2) AS TOT
FROM T
GROUP BY ID;
+------+---------+
| ID | TOT |
+------+---------+
| 1 | 1181.60 |
+------+---------+
1 row in set (0.001 sec)
The GROUP BY is unnecessary if you have only one id. If you want a running total (which is where I think Lamu is coming from), then you would need some way of ordering the events.
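To illustrate the running-total variant just mentioned, here is a sketch using Python's sqlite3 with an assumed `seq` column supplying the ordering (the original table has no such column; SUM() OVER works the same way in SQL Server):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE T (ID INT, seq INT, Value1 DECIMAL(10,2), Value2 DECIMAL(10,2));
INSERT INTO T VALUES
(1, 1, 1033.90, 0.00),
(1, 2, 0.00, 1033.90),
(1, 3, 1181.60, 0.00);
""")
# Running total of Value1 - Value2, ordered by the assumed seq column.
rows = con.execute("""
SELECT seq,
       SUM(Value1 - Value2) OVER (PARTITION BY ID ORDER BY seq) AS running
FROM T
ORDER BY seq
""").fetchall()
print(rows)  # [(1, 1033.9), (2, 0.0), (3, 1181.6)]
```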

Combining multiple rows in SQL Server into one

I have a table that looks like this
ID RefernceID Field1 Field2 Field3
-- ---------- ------ -------- -------
1 A01 Cat NULL Dog
2 A01 Cat Fish NULL
3 A02 Banana Apple NULL
4 A02 Banana NULL Mango
I'm trying to get this
ID RefernceID Field1 Field2 Field3
-- ---------- ------ -------- -------
1 A01 Cat Fish Dog
3 A02 Banana Apple Mango
So basically the rows are GROUPED by ReferenceID and Field 1 and then I want them to merge with the NULL's replaced.
Any help would be appreciated.
EDIT: Sorry, forgot to add that there are other columns as well (I just didn't mention them), and I still need one of the ID values.
You want aggregation:
select referenceid, field1, max(field2), max(field3)
from table t
group by referenceid, field1;
You can use a simple aggregation (MAX) as:
select RefernceID,
min(ID) as ID,
max(field1) as field1,
max(field2) as field2,
max(field3) as field3
from tab
group by RefernceID
A neat trick is to use MAX or MIN, which simply ignore NULLs. So if there is only one non-NULL value in the group, MAX will return it. Since you just need one of the IDs, you can arbitrarily use MIN, which returns the result shown in the question:
SELECT MIN(id), referenceid, field1, MAX(field2), MAX(field3)
FROM mytable
GROUP BY referenceid, field1
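A quick runnable check of the NULL-skipping behaviour the answers rely on, using Python's sqlite3 for illustration (MIN/MAX have the same NULL semantics in SQL Server):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE t (ID INT, RefernceID TEXT, Field1 TEXT, Field2 TEXT, Field3 TEXT);
INSERT INTO t VALUES
(1, 'A01', 'Cat',    NULL,    'Dog'),
(2, 'A01', 'Cat',    'Fish',  NULL),
(3, 'A02', 'Banana', 'Apple', NULL),
(4, 'A02', 'Banana', NULL,    'Mango');
""")
# MAX skips NULLs, so the single non-NULL value in each group survives;
# MIN(ID) keeps one of the original IDs per group.
rows = con.execute("""
SELECT MIN(ID), RefernceID, Field1, MAX(Field2), MAX(Field3)
FROM t
GROUP BY RefernceID, Field1
ORDER BY RefernceID
""").fetchall()
print(rows)  # [(1, 'A01', 'Cat', 'Fish', 'Dog'), (3, 'A02', 'Banana', 'Apple', 'Mango')]
```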

Oracle Data Masking using random names from a temp table

We need to mask some Personally Identifiable Information in our Oracle 10g database. The process I'm using is based on another masking script that we are using for Sybase (which works fine), but since the information in the Oracle and Sybase databases is quite different, I've hit a bit of a roadblock.
The process is to select all data out of the PERSON table, into a PERSON_TRANSFER table. We then use a random number to select a random name from the PERSON_TRANSFER table, and then update the PERSON table with that random name. This works fine in Sybase because there is only one row per person in the PERSON table.
The issue I've encountered is that in the Oracle DB, there are multiple rows per PERSON, and the name may or may not be different for each row, e.g.
|PERSON|
:-----------------:
|PERSON_ID|SURNAME|
|1 |Purple |
|1 |Purple |
|1 |Pink | <--
|2 |Gray |
|2 |Blue | <--
|3 |Black |
|3 |Black |
The PERSON_TRANSFER is a copy of this table. The table is in the millions of rows, so I'm just giving a very basic example here :)
The logic I'm currently using would just update all rows to be the same for that PERSON_ID, e.g.
|PERSON|
:-----------------:
|PERSON_ID|SURNAME|
|1 |Brown |
|1 |Brown |
|1 |Brown | <--
|2 |White |
|2 |White | <--
|3 |Red |
|3 |Red |
But this is incorrect as the name that is different for that PERSON_ID needs to be masked differently, e.g.
|PERSON|
:-----------------:
|PERSON_ID|SURNAME|
|1 |Brown |
|1 |Brown |
|1 |Yellow | <--
|2 |White |
|2 |Green | <--
|3 |Red |
|3 |Red |
How do I get the script to update the distinct names separately, rather than just update them all based on the PERSON_ID? My script currently looks like this
DECLARE
v_SURNAME VARCHAR2(30);
BEGIN
select pt.SURNAME
into v_SURNAME
from PERSON_TRANSFER pt
where pt.PERSON_ID = (SELECT PERSON_ID FROM
( SELECT PERSON_ID FROM PERSON_TRANSFER
ORDER BY dbms_random.value )
WHERE rownum = 1);
END;
Which causes an error because too many rows are returned for that random PERSON_ID.
1) Is there a more efficient way to update the PERSON table so that names are randomly assigned?
2) How do I ensure that the PERSON table is masked correctly, in that the various surnames are kept distinct (or the same, if they are all the same) for any single PERSON_ID?
I'm hoping this is enough information. I've simplified it a fair bit (the table has a lot more columns, such as First Name, DOB, TFN, etc.) in the hope that it makes the explanation easier.
Any input/advice/help would be greatly appreciated :)
Thanks.
One of the complications is that the same surname may appear under different person_id's in the PERSON table. You may be better off using a separate, auxiliary table holding surnames that are distinct (for example you can populate it by selecting distinct surnames from PERSONS).
Setup:
create table persons (person_id, surname) as (
select 1, 'Purple' from dual union all
select 1, 'Purple' from dual union all
select 1, 'Pink' from dual union all
select 2, 'Gray' from dual union all
select 2, 'Blue' from dual union all
select 3, 'Black' from dual union all
select 3, 'Black' from dual
);
create table mask_names (person_id, surname) as (
select 1, 'Apple' from dual union all
select 2, 'Banana' from dual union all
select 3, 'Grape' from dual union all
select 4, 'Orange' from dual union all
select 5, 'Pear' from dual union all
select 6, 'Plum' from dual
);
commit;
CTAS to create PERSON_TRANSFER:
create table person_transfer (person_id, surname) as (
select ranked.person_id, rand.surname
from ( select person_id, surname,
dense_rank() over (order by surname) as rk
from persons
) ranked
inner join
( select surname, row_number() over (order by dbms_random.value()) as rnd
from mask_names
) rand
on ranked.rk = rand.rnd
);
commit;
Outcome:
SQL> select * from person_transfer order by person_id, surname;
PERSON_ID SURNAME
---------- -------
1 Pear
1 Pear
1 Plum
2 Banana
2 Grape
3 Apple
3 Apple
Added at OP's request: The scope has been extended; the requirement now is to update surname in the original table (PERSONS). This is best done with a MERGE statement and the join (sub)query I demonstrated earlier. It works best when the PERSONS table has a PK, and indeed the OP said the real-life PERSONS table has such a PK, made up of the person_id column and an additional column, date_from. In the script below, I drop persons and recreate it to include this additional column. Then I show the query and the result.
Note - a mask_names table is still needed. A tempting alternative would be to just shuffle the surnames already present in persons, so there would be no need for a "helper" table. Alas, that won't work. Consider the trivial case where persons has only one row: to obfuscate the surname, one MUST come up with a surname not in the original table. More interestingly, assume every person_id has exactly two rows with distinct surnames, but those surnames in every case are 'John' and 'Mary'. It doesn't help to just shuffle those two names. One does need a "helper" table like mask_names.
New setup:
drop table persons;
create table persons (person_id, date_from, surname) as (
select 1, date '2016-01-04', 'Purple' from dual union all
select 1, date '2016-01-20', 'Purple' from dual union all
select 1, date '2016-03-20', 'Pink' from dual union all
select 2, date '2016-01-24', 'Gray' from dual union all
select 2, date '2016-03-21', 'Blue' from dual union all
select 3, date '2016-04-02', 'Black' from dual union all
select 3, date '2016-02-13', 'Black' from dual
);
commit;
select * from persons;
PERSON_ID DATE_FROM SURNAME
---------- ---------- -------
1 2016-01-04 Purple
1 2016-01-20 Purple
1 2016-03-20 Pink
2 2016-01-24 Gray
2 2016-03-21 Blue
3 2016-04-02 Black
3 2016-02-13 Black
7 rows selected.
New query and result:
merge into persons p
using (
select ranked.person_id, ranked.date_from, rand.surname
from (
select person_id, date_from, surname,
dense_rank() over (order by surname) as rk
from persons
) ranked
inner join (
select surname, row_number() over (order by dbms_random.value()) as rnd
from mask_names
) rand
on ranked.rk = rand.rnd
) t
on (p.person_id = t.person_id and p.date_from = t.date_from)
when matched then update
set p.surname = t.surname;
commit;
select * from persons;
PERSON_ID DATE_FROM SURNAME
---------- ---------- -------
1 2016-01-04 Apple
1 2016-01-20 Apple
1 2016-03-20 Orange
2 2016-01-24 Plum
2 2016-03-21 Grape
3 2016-04-02 Banana
3 2016-02-13 Banana
7 rows selected.
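The key invariant of the rank-and-join approach is that every occurrence of the same original surname gets the same DENSE_RANK, and therefore the same mask name, while distinct surnames stay distinct. A sketch of that property using Python's sqlite3 (mask_names simplified to one column, and sqlite's random() standing in for dbms_random.value; the rand ranking is materialized into a table first so the random assignment is fixed):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE persons (person_id INT, surname TEXT);
INSERT INTO persons VALUES
(1,'Purple'),(1,'Purple'),(1,'Pink'),
(2,'Gray'),(2,'Blue'),
(3,'Black'),(3,'Black');
CREATE TABLE mask_names (surname TEXT);
INSERT INTO mask_names VALUES
('Apple'),('Banana'),('Grape'),('Orange'),('Pear'),('Plum');
""")
# Fix the random ordering once by materializing it.
con.execute("""
CREATE TABLE rand AS
SELECT surname, ROW_NUMBER() OVER (ORDER BY random()) AS rnd
FROM mask_names
""")
rows = con.execute("""
SELECT ranked.surname AS orig, rand.surname AS masked
FROM (SELECT surname, DENSE_RANK() OVER (ORDER BY surname) AS rk
      FROM persons) ranked
JOIN rand ON ranked.rk = rand.rnd
""").fetchall()

# Build the original -> mask mapping; it must be a well-defined, injective map.
mapping = {}
for orig, masked in rows:
    mapping.setdefault(orig, masked)
```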

Returning a column in a result set from merging results in another table

Please forgive the title, I wasn't sure how exactly to describe the situation below...
I have 2 tables. One column in the first table has a set of comma delimited codes that are in a second table. So, the two tables look like this:
Table1
RepID | RepDate | RepLocation
1 1/1/2010 BH,,,,AH,,,
2 2/1/2010 ,,,,,AH,,,
Table2
LocID | LocName
BH Bliss Hall
AH Agans Hall
I can successfully select from both tables using joins, and I obviously get multiple rows in the resultset:
RepID | RepDate | RepLocation
1 1/1/2010 Bliss Hall
1 1/1/2010 Agans Hall
2 2/1/2010 Agans Hall
But what I'd really like to do is get a result that looks like this:
RepID | RepDate | AllRepLocations
1 1/1/2010 Bliss Hall Agans Hall
2 2/1/2010 Agans Hall
I've never tried to do this before, and I'm having trouble coming up with the T-SQL to get this result, if it is even possible. I am calling a stored procedure, so if I need to do some extra coding or machinations to get the result I want, it is not a problem as I can do them in the stored procedure. This is on SQL Server 2008 R2.
Thank you.
Okay I think your best bet is to use dynamic SQL with REPLACE(). Try this out:
Your Table
CREATE TABLE Table1 (RepID INT,RepDate DATE,RepLocation VARCHAR(100));
INSERT INTO Table1
VALUES (1,'20100101','BH,AH'),
(2,'20100201','AH');
CREATE TABLE Table2 (LocID CHAR(2),LocName VARCHAR(25));
INSERT INTO Table2
VALUES ('BH','Bliss Hall'),
('AH','Agans Hall');
Actual Query
DECLARE @Replace VARCHAR(MAX);
SELECT @Replace = COALESCE('REPLACE( ' + @Replace,'REPLACE(RepLocation + '',''') + ',''' + LocId + ','',''' + LocName + ' '')'
FROM Table2
EXEC
(
'SELECT RepID,RepDate,' + @Replace + ' AS AllRepLocations
FROM Table1' --Change Table1 to your actual tableName
)
Results:
RepID RepDate AllRepLocations
----------- ---------- -------------------------
1 2010-01-01 Bliss Hall Agans Hall
2 2010-02-01 Agans Hall
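What the nested REPLACE() calls end up doing can be sketched in plain Python: each LocID followed by a comma is swapped for its LocName plus a space. The `table2` dict here is a stand-in for the lookup table, and the data mirrors the answer's simplified 'BH,AH' strings (not the original comma-padded ones):

```python
# Stand-in for the Table2 LocID -> LocName lookup.
table2 = {'BH': 'Bliss Hall', 'AH': 'Agans Hall'}

def all_rep_locations(rep_location):
    s = rep_location + ','                    # mirrors RepLocation + ',' in the query
    for loc_id, loc_name in table2.items():
        s = s.replace(loc_id + ',', loc_name + ' ')
    return s.rstrip()

print(all_rep_locations('BH,AH'))  # Bliss Hall Agans Hall
print(all_rep_locations('AH'))     # Agans Hall
```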

SQL Server - Update Column Handling Duplicate and Unique Rows Based Upon Timestamp

I'm working with SQL Server 2005 and looking to export some data from a table I have. However, prior to doing that I need to update a status column based upon a field called "VisitNumber", which can contain multiple entries with the same value. I have a table set up in the following manner. There are more columns to it, but I am just putting in what's relevant to my issue.
ID Name MyReport VisitNumber DateTimeStamp Status
-- --------- -------- ----------- ----------------------- ------
1 Test John Test123 123 2014-01-01 05.00.00.000
2 Test John Test456 123 2014-01-01 07.00.00.000
3 Test Sue Test123 555 2014-01-02 08.00.00.000
4 Test Ann Test123 888 2014-01-02 09.00.00.000
5 Test Ann Test456 888 2014-01-02 10.00.00.000
6 Test Ann Test789 888 2014-01-02 11.00.00.000
Field Notes
ID column is a unique ID in incremental numbers
MyReport is a text value and can actually be thousands of characters. Shortened for simplicity. In my scenario the text would be completely different
Rest of fields are varchar
My Goal
I need to set a status of "F" under two conditions:
* If there is only one row for a VisitNumber, set its status to "F"
* If there is more than one row for a visit number, put "F" only on the one with the earliest timestamp; for the other ones, put in a status of "A"
So going back to my table, here is the expectation
ID Name MyReport VisitNumber DateTimeStamp Status
-- --------- -------- ----------- ----------------------- ------
1 Test John Test123 123 2014-01-01 05.00.00.000 F
2 Test John Test456 123 2014-01-01 07.00.00.000 A
3 Test Sue Test123 555 2014-01-02 08.00.00.000 F
4 Test Ann Test123 888 2014-01-02 09.00.00.000 F
5 Test Ann Test456 888 2014-01-02 10.00.00.000 A
6 Test Ann Test789 888 2014-01-02 11.00.00.000 A
I was thinking I could handle this by splitting out each type of duplicate/triplicate+ (2, 3, 4, 5 rows), updating every other row (or every 3rd, 4th, 5th row), then deleting those from the original table and combining them back together to export the data in SSIS. But I am thinking there is a much more efficient way of handling it.
Any thoughts? I can accomplish this by updating the table directly in SQL for this status column and then export normally through SSIS. Or if there is some way I can manipulate the column for the exact conditions I need, I can do it all in SSIS. I am just not sure how to proceed with this.
WITH cte AS
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY VisitNumber ORDER BY DateTimeStamp) rn from MyTable
)
UPDATE cte
SET [status] = (CASE WHEN rn = 1 THEN 'F' ELSE 'A' END)
I put together a test script to check the results. For your purposes, use the update statements and replace the temp table with your table name.
create table #temp1 (id int, [name] varchar(50), myreport varchar(50), visitnumber varchar(50), dts datetime, [status] varchar(1))
insert into #temp1 (id,[name],myreport,visitnumber, dts) values (1,'Test John','Test123','123','2014-01-01 05:00')
insert into #temp1 (id,[name],myreport,visitnumber, dts) values (2,'Test John','Test456','123','2014-01-01 07:00')
insert into #temp1 (id,[name],myreport,visitnumber, dts) values (3,'Test Sue','Test123','555','2014-01-01 08:00')
insert into #temp1 (id,[name],myreport,visitnumber, dts) values (4,'Test Ann','Test123','888','2014-01-01 09:00')
insert into #temp1 (id,[name],myreport,visitnumber, dts) values (5,'Test Ann','Test456','888','2014-01-01 10:00')
insert into #temp1 (id,[name],myreport,visitnumber, dts) values (6,'Test Ann','Test789','888','2014-01-01 11:00')
select * from #temp1;
update #temp1 set status = 'F'
where id in (
select id from #temp1 t1
join (select min(dts) as mindts, visitnumber
from #temp1
group by visitNumber) t2
on t1.visitnumber = t2.visitnumber
and t1.dts = t2.mindts)
update #temp1 set status = 'A'
where id not in (
select id from #temp1 t1
join (select min(dts) as mindts, visitnumber
from #temp1
group by visitNumber) t2
on t1.visitnumber = t2.visitnumber
and t1.dts = t2.mindts)
select * from #temp1;
drop table #temp1
Hope this helps
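The ROW_NUMBER() logic from the CTE answer can be verified end to end with Python's sqlite3. SQLite cannot update through a CTE the way SQL Server can, so this sketch uses a correlated subquery instead, but the partition-and-rank idea is the same:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE visits (id INT, visitnumber TEXT, dts TEXT, status TEXT);
INSERT INTO visits VALUES
(1, '123', '2014-01-01 05:00', NULL),
(2, '123', '2014-01-01 07:00', NULL),
(3, '555', '2014-01-02 08:00', NULL);
""")
# Earliest row per visitnumber (rn = 1) gets 'F'; all later rows get 'A'.
con.execute("""
UPDATE visits SET status = (
  SELECT CASE WHEN x.rn = 1 THEN 'F' ELSE 'A' END
  FROM (SELECT id, ROW_NUMBER() OVER
               (PARTITION BY visitnumber ORDER BY dts) AS rn
        FROM visits) x
  WHERE x.id = visits.id)
""")
result = con.execute("SELECT id, status FROM visits ORDER BY id").fetchall()
print(result)  # [(1, 'F'), (2, 'A'), (3, 'F')]
```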
