When we create an external function that calls a Lambda to fetch, say, an encryption key to decrypt a particular column in a Snowflake table, does the Lambda get triggered for each row in the table? Or, once the encryption key is retrieved, is it available for the entire session?
Thanks
It would trigger one call to get the encryption key per batch (of up to 4,000 rows) and use that key for each row in the batch.
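You can reduce the per-batch fetches further by caching the key inside the Lambda itself. Below is a minimal sketch of that pattern; the handler event shape follows Snowflake's external-function convention (`{"data": [[row_number, value], ...]}`), and `fetch_key_from_kms` and `decrypt` are hypothetical stand-ins for your real key-retrieval and decryption calls.

```python
_cached_key = None  # module-level: survives invocations of a warm Lambda container

def fetch_key_from_kms():
    # Stand-in for the real key-retrieval call (e.g. Secrets Manager / KMS).
    return "0123456789abcdef"

def decrypt(value, key):
    # Placeholder "decryption" for illustration only.
    return f"dec({value})"

def handler(event, context=None):
    """Snowflake external-function handler: invoked once per batch of rows."""
    global _cached_key
    if _cached_key is None:
        _cached_key = fetch_key_from_kms()  # at most one fetch per warm container
    rows = event["data"]  # [[row_number, value], ...]
    out = [[i, decrypt(v, _cached_key)] for i, v in rows]
    return {"data": out}
```

With this pattern a warm container fetches the key once and reuses it across batches, not just across rows within a batch.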
I created a lookup table for an existing table, and now I want to undo that and restore the table to its original form.
Unfortunately, I don't know how to do so.
Creating a lookup table generates a table and a sequence, named <column_name>_LOOKUP (the table) and <column_name>_LOOKUP_SEQ (the sequence).
They can both be dropped in the object browser.
I have a flat file with the following columns:
Device Name
Device Type
Device Location
Device Zone
I need to insert these into a SQL Server table called Devices, which has the following structure:
DeviceName
DeviceTypeId (foreign key from DeviceType table)
DeviceLocationId (foreign key from DeviceLocation table)
DeviceZoneId (foreign key from DeviceZone table)
DeviceType, DeviceLocation and DeviceZone tables are already prepopulated.
Now I need to write an ETL package that reads the flat file and, for each row, gets the DeviceTypeId, DeviceLocationId and DeviceZoneId from the corresponding tables and inserts the row into the Devices table.
I am sure this is not a new problem, but it's been a while since I worked on SSIS packages, and any help would be appreciated.
Load the flat content into a staging table and write a stored procedure to handle the inserts and updates in T-SQL.
FK relationships between the destination tables can cause a lot of trouble with a single data flow and a multicast.
The problem is that you have no control over the order of the inserts, so a child record could be inserted before its parent.
Also, with identity columns on the tables, you cannot retrieve the identity value from one stream and use it in another without subsequent merge joins.
The simplest way to do this is to use a Lookup Transformation to get the ID for each value. Be aware that duplicates can cause problems: make sure each value appears only once in the lookup (foreign) tables.
Also, make sure to redirect rows that have no match to a staging table so you can check them later.
You can refer to the following article for a step by step guide to Lookup Transformation:
An Overview of the LOOKUP TRANSFORMATION in SSIS
I have to synchronize data from a file (Excel) to a database (MySQL) using Spring Batch.
The file will be processed record by record. Adding and updating database records works fine but I wonder how to detect and delete entries from the database that were removed from the file?
I'm considering implementing this:
read the file record-by-record
create or update the record in the database and remember the primary key
remove all records whose primary keys were not remembered (a final step, after all records have been processed)
Do you know how to collect and pass all processed primary keys to a final step?
Or do you recommend another implementation?
Thanks,
Patrick
Update: I'm not allowed to alter the database tables.
Use a column to mark updated/added records.
After the main step, add a step that deletes the records that were not marked.
If DB schema modification is not an option:
Step 1. Dump the primary keys from the DB to a CSV file (original.csv)
Step 2. Create/update the DB records and store the primary keys of the updated data in a second CSV file (updated.csv)
Step 3. Create a differential file: original.csv minus updated.csv (diff.csv)
Step 4. Read diff.csv and delete the records by PK
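Step 3 is just a set difference over the two key files. A minimal sketch (file names follow the steps above; single-column CSVs of primary keys are assumed):

```python
import csv

def diff_keys(original_csv, updated_csv, diff_csv):
    """Write original-minus-updated primary keys to diff_csv (step 3).

    Keys present in original.csv but absent from updated.csv belong to
    records that were removed from the input file and should be deleted.
    """
    def read_keys(path):
        with open(path, newline="") as f:
            return {row[0] for row in csv.reader(f) if row}

    stale = read_keys(original_csv) - read_keys(updated_csv)
    with open(diff_csv, "w", newline="") as f:
        writer = csv.writer(f)
        for pk in sorted(stale):
            writer.writerow([pk])
    return stale
```

In Spring Batch itself the same effect can be had with a Tasklet step that reads both files and writes the difference, keeping each step restartable.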
I have a requirement to change a "broken" computed column in a table to an identity column and as part of this work update some of the field values. This column is a pseudo primary key so doesn't have any constraints defined against it. I therefore need to determine if any other tables in the database contain a pseudo foreign key back to this column.
Before writing something myself I'd like to know if there is a script/tool in existence that when given a value (not a column name) can search across the data in all of the tables within an SQL Server database and show where that value exists?
Thanks in advance.
A quick Google search found this page/script:
http://vyaskn.tripod.com/search_all_columns_in_all_tables.htm
I don't personally know of a pretty GUI-interfaced utility that'll do it.
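The linked script walks the catalog views and probes every column; the same idea in miniature, using sqlite3's catalog as a stand-in for SQL Server's INFORMATION_SCHEMA (a sketch for illustration, not a production search tool):

```python
import sqlite3

def find_value(conn, target):
    """Return (table, column) pairs whose data contains target (exact match)."""
    hits = []
    cur = conn.cursor()
    # Enumerate user tables from the catalog (INFORMATION_SCHEMA.TABLES in SQL Server).
    tables = [r[0] for r in cur.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    for table in tables:
        # Enumerate the table's columns (INFORMATION_SCHEMA.COLUMNS in SQL Server).
        cols = [r[1] for r in cur.execute(f'PRAGMA table_info("{table}")')]
        for col in cols:
            count = cur.execute(
                f'SELECT COUNT(*) FROM "{table}" WHERE "{col}" = ?',
                (target,)).fetchone()[0]
            if count:
                hits.append((table, col))
    return hits
```

On a large database this brute-force scan is slow, which is why in practice you would restrict it to columns whose type could plausibly hold the value.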
From what I gather, Linq to SQL doesn't actually execute any database commands (including opening a database connection) until the SubmitChanges() method is called. If that's the case, I would like to increase the efficiency of a few methods. Is it possible for me to retrieve the ID of an object before inserting it? I'd rather not call SubmitChanges() twice if it's possible to know the value of the ID before the row is actually inserted into the database. From a logical point of view, it would only make sense to have to open a connection to the database in order to find out the value, but does an insertion also have to take place?
Thanks
The usual technique to solve this is to generate a unique identifier in the application layer (such as a GUID) and use it as the ID. That way you do not have to retrieve the ID on a subsequent call.
Of course, using a GUID as a primary key can have its drawbacks. If you decide to go this way, look up COMB GUIDs.
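For illustration, a COMB-style GUID interleaves a timestamp with random bytes so that newly generated values cluster together in the index instead of scattering the way purely random GUIDs do. A minimal sketch (assumed layout: 80 random bits followed by a 48-bit millisecond timestamp in the trailing bytes, since SQL Server's uniqueidentifier ordering weighs the trailing byte group most heavily; real COMB implementations vary in their exact layout):

```python
import os
import time
import uuid

def comb_guid(now_ms=None):
    """COMB-style GUID: random bytes plus a millisecond timestamp in the
    trailing bytes, so values generated close in time sort close together
    under SQL Server's uniqueidentifier ordering."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return uuid.UUID(bytes=os.urandom(10) + now_ms.to_bytes(6, "big"))
```

The trade-off is slightly less entropy (80 random bits instead of 122), in exchange for far less index fragmentation on insert-heavy tables.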
Well, here is the problem: you somehow get an ID before inserting into the database and do some processing with it. At the same time, another thread does the same and gets the same ID, and you have a conflict.
In other words, I don't think there is an easy way of doing this.
I don't necessarily recommend this, but I have seen it done. You can generate your own integer IDs using a stored procedure and a table that holds the value of the next ID. The stored procedure selects the value from the table to return it, then increments it. The table will look something like the following:
CREATE TABLE Keys (
    name VARCHAR(128) NOT NULL PRIMARY KEY,
    nextID INT NOT NULL
)
One thing to note before doing this: if you select and then update in two different batches, you risk a key collision. Both steps need to be treated as a single atomic transaction.
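The atomic claim can be sketched as follows, using sqlite3 to stand in for the stored procedure (update first, then read back, all inside one transaction, so two callers can never see the same value):

```python
import sqlite3

def next_id(conn, name):
    """Atomically claim the next ID for `name` from the Keys table."""
    with conn:  # sqlite3: one transaction, committed on success
        conn.execute(
            "UPDATE Keys SET nextID = nextID + 1 WHERE name = ?", (name,))
        (after,) = conn.execute(
            "SELECT nextID FROM Keys WHERE name = ?", (name,)).fetchone()
    return after - 1  # the value before the increment is the one we claimed
```

Doing the UPDATE before the SELECT matters: the update takes the write lock up front, so a concurrent caller blocks until the transaction commits instead of reading the stale value. In T-SQL the same effect is typically achieved with an UPDATE using the OUTPUT clause, or a SELECT with UPDLOCK, inside one transaction.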