I toggle the display of PHI (protected health information) when generating data extracts from a vendor's EHR system.
To date, I've been manually enabling and disabling these fields in my script files:
-- PHI enabled
SELECT MRN
--,HASHBYTES('SHA2_256',MRN) MRN_HASH
...
GO
-- PHI disabled
SELECT -- MRN
,HASHBYTES('SHA2_256',MRN) MRN_HASH
...
GO
Is there a way to do this dynamically?
--
-- disable this variable when running `SQLCMD` from command line
-- PS> sqlcmd -E -S server -d database -i .\script.sql -v hide_phi=1
--
:setvar hide_phi 0
:out c:\users\x\desktop\patients.csv
SELECT
<if $(hide_phi)=0 then hide MRN>
<if $(hide_phi)=1 then hide MRN_HASH>
...
GO
SQLCMD accepts scripting variables. You can pass a variable to your .sql file and, inside the file, do a conditional check on its value: a CASE expression can test the variable and return the appropriate column.
Pseudo sample query:
select "MRN" = case
when '$(hide_phi)' = '1' then HASHBYTES('SHA2_256', MRN)
else MRN
END
...
GO
or possibly this:
select "MRN" = case '$(hide_phi)'
when '1' then HASHBYTES('SHA2_256', MRN)
else MRN
END
More info: https://msdn.microsoft.com/en-us/library/ms188714.aspx
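The CASE toggle above can be sketched outside T-SQL as well; here is a minimal Python sketch (the function name is made up, and hashlib stands in for HASHBYTES) showing how a single hide_phi flag selects between the raw MRN and its SHA-256 hash:

```python
import hashlib

def extract_mrn(mrn: str, hide_phi: bool) -> str:
    """Return the raw MRN, or its SHA-256 hex digest when PHI is hidden."""
    if hide_phi:
        return hashlib.sha256(mrn.encode("utf-8")).hexdigest()
    return mrn

print(extract_mrn("12345678", hide_phi=False))  # raw MRN
print(extract_mrn("12345678", hide_phi=True))   # 64-char hex digest
```

Like the CASE version, the same column is always emitted; only its contents change with the flag.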
I am attempting what I thought was a relatively easy thing. I use a SQL task to look for a filename in a table. If it exists, do something, if not, do nothing.
Here is my setup in SSIS:
My SQL Statement in 'File Exists in Table' is as follows, and ResultSet is 'Single Row':
SELECT ISNULL(id,0) as id FROM PORG_Files WHERE filename = ?
My Constraint is:
When I run it, there are no files in the table yet, so it should return nothing. I've tried ISNULL and COALESCE to set a value. I receive the following error:
Error: 0xC002F309 at File Exist in Table, Execute SQL Task: An error occurred while assigning a value to variable "id": "Single Row result set is specified, but no rows were returned.".
Not sure how to fix this. ISNULL and COALESCE were suggestions I found on SO and on MSDN.
Try changing your SQL statement to a COUNT; then your comparison expression would read @ID > 0. If you have files that match your pattern, the count will be greater than 0; if there are no files, it will return 0.
SELECT COUNT(id) as id FROM PORG_Files WHERE filename = ?
If you want to check whether a row exists, then you should use COUNT as @jradich1234 mentioned.
SELECT COUNT(*) as id FROM PORG_Files WHERE filename = ?
If you are looking to check whether a row exists and to store the id in a variable to use later in the package, first you have to use TOP 1, since you are selecting a single-row result set, and you can use logic similar to this:
DECLARE @Filename VARCHAR(4000) = ?
IF EXISTS(SELECT 1 FROM PORG_Files WHERE filename = @Filename)
    SELECT TOP 1 id FROM PORG_Files WHERE filename = @Filename
ELSE
    SELECT 0 AS id
Then, if id = 0, no row exists.
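The exists-or-zero pattern can be checked quickly outside SSIS; here is a sketch using Python's sqlite3 (the table and function names are illustrative) that always produces a single-row result, which is what the Execute SQL Task requires:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PORG_Files (id INTEGER, filename TEXT)")

def file_id_or_zero(conn, filename):
    # Mirrors the T-SQL above: return the matching id if a row exists,
    # else 0, so a single-row result set is always produced.
    row = conn.execute(
        "SELECT id FROM PORG_Files WHERE filename = ? LIMIT 1", (filename,)
    ).fetchone()
    return row[0] if row else 0

print(file_id_or_zero(conn, "report.csv"))  # 0: table is empty
conn.execute("INSERT INTO PORG_Files VALUES (7, 'report.csv')")
print(file_id_or_zero(conn, "report.csv"))  # 7
```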
Another option is to UNION an empty row so that the query always returns a value. See if this works for you:
SELECT foo AS nameA, fa AS nameB
UNION ALL
SELECT NULL AS nameA, NULL AS nameB
Then, in your precedence constraint, you can check whether the variable is NULL and disallow the path if it is.
I have a table in SQL like this:
CD_MATERIAL | CD_IDENTIFICACAO
1 | 002323
2 | 00322234
... | ...
and so on (5000+ rows).
I need to use that info to search and replace across multiple external XML files in a folder. All the tags in those XML files contain numbers like the CD_IDENTIFICACAO values from the SQL query, and I need to replace each one with the corresponding CD_MATERIAL (e.g., 002323 becomes 1).
I used this query to extract all the cd_identificacao to use on Notepad++:
declare @result varchar(max)
select @result = COALESCE(@result + '', '') + CONCAT('(',CD_IDENTIFICACAO,')|') from TBL_MATERIAIS WHERE CD_IDENTIFICACAO <> '' ORDER BY CD_MATERIAL
select @result
That would bring me ex.:
(1TEC45D025)|(1TEC800039)|(999999999)|(542251)|(2TEC58426)|(234852)
and I changed the parameters to build the replace pattern, e.g.:
(? 2000)|(? 2001)|(? 2002)|(? 2003)|(? 2004)|(? 2005)
but I don't know how to add an incrementing number in front of the "?" so Notepad++ would understand it (the search and replace would have 5000+ results, so it's not practical to add the increments manually).
I was able to find a workaround for this. I used this query to get all the find-and-replace terms I needed (one per line):
select concat('<cProd>',cd_identificacao,'</cProd>'), concat('<cProd>',cd_material,'</cProd>') from tbl_materiais where cd_identificacao <> '' order by cd_material
That would result in:
<cProd>1TEC460054</cProd> <cProd>1</cProd>
<cProd>1TEC240035</cProd> <cProd>2</cProd>
(I added the tags too, to make sure no other information could be replaced, as there were many number combinations that could have led to incorrect replacements.)
Then I pasted it into a txt file and used Notepad++ to replace the space between columns 1 and 2 with \r\n, which resulted in:
<cProd>1TEC460054</cProd>
<cProd>1</cProd>
<cProd>1TEC240035</cProd>
<cProd>2</cProd>
Then I used the "Ecobyte Replace Text" tool: I pasted my result file as a new selection in the bottom frame, loaded all my files into a new replace group in the top frame (in the group's properties you can change the directory and options), and executed the replacement. It worked perfectly.
Thanks.
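The Notepad++ round trip could also be skipped entirely by scripting the replacement. A minimal Python sketch (the mapping values and function name are hypothetical) that applies the tag-guarded <cProd> substitutions to a piece of XML text:

```python
import re

# Hypothetical mapping pulled from TBL_MATERIAIS:
# CD_IDENTIFICACAO -> CD_MATERIAL
mapping = {"1TEC460054": "1", "1TEC240035": "2"}

def replace_cprod(xml_text: str) -> str:
    # Only touch values wrapped in <cProd> tags, mirroring the
    # tag-guarded replacement used in the workaround above.
    def swap(match):
        value = match.group(1)
        return "<cProd>%s</cProd>" % mapping.get(value, value)
    return re.sub(r"<cProd>([^<]+)</cProd>", swap, xml_text)

print(replace_cprod("<det><cProd>1TEC460054</cProd></det>"))
# -> <det><cProd>1</cProd></det>
```

Looping this over every XML file in the folder (e.g., with glob) would handle all 5000+ mappings in one pass.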
I'm using cqlsh to add data to Cassandra with a BATCH statement. I can load the data with a query using the "-e" flag, but not from a file using the "-f" flag. I think that's because the file is local and Cassandra is remote. Details below:
This is a sample of my query (there are more rows to insert, obviously):
BEGIN BATCH;
INSERT INTO keyspace.table (id, field1) VALUES ('1','value1');
INSERT INTO keyspace.table (id, field1) VALUES ('2','value2');
APPLY BATCH;
If I enter the query via the "-e" flag then it works no problem:
>cqlsh -e "BEGIN BATCH; INSERT INTO keyspace.table (id, field1) VALUES ('1','value1'); INSERT INTO keyspace.table (id, field1) VALUES ('2','value2'); APPLY BATCH;" -u username -p password -k keyspace 99.99.99.99
But if I save the query to a text file (query.cql) and call as below, I get the following output:
>cqlsh -f query.cql -u username -p password -k keyspace 99.99.99.99
Using 3 child processes
Starting copy of keyspace.table with columns ['id', 'field1'].
Processed: 0 rows; Rate: 0 rows/s; Avg. rate: 0 rows/s
0 rows imported from 0 files in 0.076 seconds (0 skipped).
Cassandra obviously accepts the command but doesn't read the file; I'm guessing that's because Cassandra is located on a remote server and the file is located locally. The Cassandra instance I'm using is a managed service shared with other users, so I don't have access to the server to copy files into folders.
How do I run this query on a remote instance of Cassandra where I only have CLI access?
I want to be able to use another tool to build the query.cql file and have a batch job run the command with the "-f" flag, but I can't work out where I'm going wrong.
You're executing a local cqlsh client, so it should be able to access your local query.cql file.
Try removing the BEGIN BATCH and APPLY BATCH lines, leaving just the two INSERT statements in query.cql, and retry.
One other way to insert data quickly is to provide a CSV file and use the COPY command inside cqlsh. Read this blog post: http://www.datastax.com/dev/blog/new-features-in-cqlsh-copy
Scripting the inserts by generating one cqlsh -e '...' per line is feasible, but it will be horribly slow.
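Since another tool is meant to build query.cql, it may help to emit one complete INSERT per line with no batch wrapper, per the suggestion above. A small sketch (the keyspace, table, and column names are the question's; the generator function is made up):

```python
rows = [("1", "value1"), ("2", "value2")]  # sample data

def build_cql(rows):
    # One self-contained INSERT per line; no BEGIN/APPLY BATCH wrapper.
    lines = [
        "INSERT INTO keyspace.table (id, field1) VALUES ('%s','%s');" % (rid, val)
        for rid, val in rows
    ]
    return "\n".join(lines) + "\n"

with open("query.cql", "w") as f:
    f.write(build_cql(rows))
```

The resulting file can then be run with cqlsh -f as in the question.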
I've got a package in SSIS 2012 that has an Execute SQL task in the control flow level.
The SQL in question does an upsert via the SQL MERGE statement. What I want to do is return the count of records inserted and records updated (no deletes going on here to worry about). I'm using the OUTPUT option to write the changed records to a table variable.
I've tried returning the values as:
Select Count(*) as UpdateCount from @mergeOutput where Action = 'Update'
and
Select Count(*) as InsertCount from @mergeOutput where Action = 'Insert'
I've tried setting the ResultSet to both Single row and Full result set, but I'm not seeing anything returned to the package variables I've set for them (intInsertcount and intUpdatecount).
What am I doing wrong?
Thanks.
You should try the following:
Select UpdateCount = (Select Count(*) from @mergeOutput where Action = 'Update'),
       InsertCount = (Select Count(*) from @mergeOutput where Action = 'Insert')
Using a single result set this should give you an output along the lines of
UpdateCount | InsertCount
# | #
Then just map the result set, changing the name of each result, and use breakpoints to test and make sure the variables update through the process.
This is what I use when I want to return multiple result sets from different tables in the same query, however I don't know how it works with the output of merge statements.
Set the Execute SQL Task's ResultSet to Single row, with the SQLStatement as:
Select sum(case when Action='Update' then 1 else 0 end) Update_count,
       sum(case when Action='Insert' then 1 else 0 end) Insert_count
from @mergeOutput;
In the Result Set tab, click the Add button and set your variables to point to the two outputs of the above query, positionally:
0 => intUpdatecount; 1 => intInsertcount
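The conditional-aggregation pattern above can be verified quickly outside SSIS; here is a sketch using Python's sqlite3 with a stand-in table (the names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mergeOutput (Action TEXT)")
conn.executemany(
    "INSERT INTO mergeOutput VALUES (?)",
    [("Update",), ("Update",), ("Insert",), ("Update",)],
)

# Same shape as the T-SQL above: a single row with two columns,
# which is what a Single row ResultSet expects.
row = conn.execute(
    """
    SELECT SUM(CASE WHEN Action='Update' THEN 1 ELSE 0 END) AS Update_count,
           SUM(CASE WHEN Action='Insert' THEN 1 ELSE 0 END) AS Insert_count
    FROM mergeOutput
    """
).fetchone()
print(row)  # (3, 1)
```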
How can I resume a copy from the last position when a connection error occurs, copying a ~100 GB PostgreSQL DB from a remote server?
I have a PostgreSQL DB on a remote server; it is ~100 GB. I'm using the PostgreSQL shell to copy from the remote DB to a file on my PC:
\copy BINARY table TO 'C:\\TEMP\\file.txt';
After copying 10-30 GB, the connection usually drops and I get an error something like:
error server lost connections terminate abnormal
So how can I resume the copy from the point where the error occurred? Say I have copied 20 GB and the connection was lost: I want to start from that 20 GB point and continue, NOT from byte 0 at the beginning.
I CANNOT use pg_dump, BUT I can use PHP and the PostgreSQL shell!
I have NO superuser rights!
If I use pgAdmin query tool:
copy BINARY table TO 'C:\\TEMP\\file.txt';
I get error:
mydb=> copy BINARY table TO 'C:\\TEMP\\file.txt';
ERROR: must be superuser to COPY to or from a file
HINT: Anyone can COPY to stdout or from stdin. psql's \copy command
also works for anyone.
So I'm using the
\copy BINARY table TO 'C:\\TEMP\\file.txt';
command in the PostgreSQL shell.
Alternatively, I could use a tool that reads the data from the remote PostgreSQL server and can resume from the last position/last read row! Please help!
With those restrictions, you'll have to resort to doing what's possible with a SELECT command.
One approach would be to write a script which selects records from remote in primary key order. It should select a block of records (e.g. 1000) and insert them in the local host.
After each block is selected from remote and inserted on local, the loop should restart, with the starting point being the last inserted key + 1.
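The batching loop itself is simple; here is a minimal Python/sqlite sketch of the select-a-block-then-resume idea (two in-memory databases stand in for the remote and local servers; all names are made up):

```python
import sqlite3

remote = sqlite3.connect(":memory:")
local = sqlite3.connect(":memory:")
remote.execute("CREATE TABLE big_table (id INTEGER PRIMARY KEY, name TEXT)")
local.execute("CREATE TABLE big_table (id INTEGER PRIMARY KEY, name TEXT)")
remote.executemany(
    "INSERT INTO big_table VALUES (?, ?)",
    [(i, "row-%d" % i) for i in range(1, 2501)],
)

BATCH = 1000  # tune according to performance

def migrate(remote, local):
    while True:
        # Resume from the highest key already present locally.
        last = local.execute(
            "SELECT COALESCE(MAX(id), 0) FROM big_table"
        ).fetchone()[0]
        rows = remote.execute(
            "SELECT id, name FROM big_table WHERE id > ? ORDER BY id LIMIT ?",
            (last, BATCH),
        ).fetchall()
        if not rows:
            break
        local.executemany("INSERT INTO big_table VALUES (?, ?)", rows)
        local.commit()  # each committed batch survives a dropped connection

migrate(remote, local)
print(local.execute("SELECT COUNT(*) FROM big_table").fetchone()[0])  # 2500
```

If the connection drops mid-run, rerunning migrate() picks up from the last committed batch rather than from the start.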
Since the source table doesn't have a primary key, you can, as a last resort, use the internal ctid. But in this case it's very important that no changes are made to the source table while its data is being copied, or there will be missing or duplicate records in the copy.
For example, using a test database prepared like this to represent the source database:
$ psql -tAc "CREATE ROLE so21202282 LOGIN PASSWORD 'so21202282'"
$ createdb so21202282 -O so21202282
$ psql -tAc "GRANT ALL PRIVILEGES ON DATABASE so21202282 to so21202282"
$ psql so21202282
# create table big_table (name varchar not null);
# alter table big_table owner to so21202282;
# insert into big_table SELECT md5(random()::text) AS name FROM (SELECT * FROM generate_series(1,10000) AS id) AS x;
# select count(*) from big_table;
count
-------
10000
(1 row)
# select * from big_table limit 10;
name
----------------------------------
121d50c1152512074770be4803147561
8f1dbf52867fd95d585d5a6a02116957
0f227635bcde2abc4e9d77b6911f43f2
7b4490a325a978f10c8c1651bc44fc84
79c1a77a0cb29f5a540653945f0cd1a0
03595e4afc31f987874a7824ba58f8ee
588ad3e66f70e109ccb780215d1e8e00
ac63376de3e4dcf067283bd7de475d6d
9b648ad199095b2cc9f575afe378e11e
9e70c3c4a24ece1ac0e8c54be41ffd0a
(10 rows)
# \q
And another database like this to represent the target database:
$ createdb so21202282_target -O so21202282
$ psql so21202282_target -tAc 'CREATE EXTENSION IF NOT EXISTS "dblink"'
$ psql -tAc "GRANT ALL PRIVILEGES ON DATABASE so21202282_target to so21202282"
$ psql so21202282_target
# create table big_table_target (source_ctid tid primary key, name varchar not null);
# alter table big_table_target owner to so21202282;
# \q
This sample code will migrate data from the source database to the target database:
create or replace function get_last_local_ctid() returns tid as $$
select case
when COUNT(*) > 0 then (select max(source_ctid) from big_table_target)
when COUNT(*) = 0 then '(0,0)'::tid
end as max_ctid
from big_table_target
$$ language sql;
create or replace function insert_from_next_remote_batch(start_ctid tid) returns integer as $$
declare
inserted_rows integer default 0;
begin
raise notice 'Copying data starting from ctid %', start_ctid;
insert into big_table_target
select *
from dblink(
'dbname=so21202282 host=localhost user=so21202282 password=so21202282',
'select ctid, name from big_table ' ||
'where ctid > ''' || start_ctid || '''' ||
' order by ctid limit 1000' -- Change limit according to performance.
)
as source(ctid tid, name varchar);
GET DIAGNOSTICS inserted_rows = ROW_COUNT;
raise notice '% records copied', inserted_rows;
return inserted_rows;
end;
$$ language plpgsql;
create or replace function migrate_data() returns void as $$
declare
last_cid tid default '(0,0)';
inserted_rows integer default 0;
done boolean default false;
begin
raise notice 'Starting data migration';
while not done loop
last_cid = (select * from get_last_local_ctid());
raise notice 'Last local cid is %', last_cid;
-- Use a new session via dblink to create an independent transaction.
inserted_rows = (
select *
from dblink(
'dbname=so21202282_target host=localhost user=so21202282 password=so21202282',
'select insert_from_next_remote_batch(''' || last_cid || ''')'
)
as f(inserted_rows integer)
);
raise notice '% rows copied', inserted_rows;
done = (inserted_rows = 0);
end loop;
raise notice 'Data migration complete';
end;
$$ language plpgsql;
This will be the output of running select migrate_data();:
NOTICE: Starting data migration
NOTICE: Last local cid is (0,0)
NOTICE: 1000 rows copied
NOTICE: Last local cid is (8,40)
NOTICE: 1000 rows copied
NOTICE: Last local cid is (16,80)
NOTICE: 1000 rows copied
NOTICE: Last local cid is (24,120)
NOTICE: 1000 rows copied
NOTICE: Last local cid is (33,40)
NOTICE: 1000 rows copied
NOTICE: Last local cid is (41,80)
NOTICE: 1000 rows copied
NOTICE: Last local cid is (49,120)
NOTICE: 1000 rows copied
NOTICE: Last local cid is (58,40)
NOTICE: 1000 rows copied
NOTICE: Last local cid is (66,80)
NOTICE: 1000 rows copied
NOTICE: Last local cid is (74,120)
NOTICE: 1000 rows copied
NOTICE: Last local cid is (83,40)
NOTICE: 0 rows copied
NOTICE: Data migration complete