How to move SNOWPIPE from one schema to another?

How can I move a Snowpipe from one schema to another? Does it replace the existing pipe if I use the following command?
Create or replace pipe <target_schema>.<mypipe_name> as
I couldn't find an ALTER statement to rename a pipe or change its schema.

You can clone it. There are some considerations with regard to how you have defined the pipe source: fully qualified name versus just the pipe name.
https://docs.snowflake.com/en/user-guide/object-clone.html#cloning-and-pipes
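If cloning the containing schema is not an option, another route is to recreate the pipe under the target schema and drop the old one. A minimal sketch, using placeholder names (source_schema, target_schema, mypipe_name, my_table, my_stage) and assuming the original COPY definition is reused:

-- Capture the existing definition first
SHOW PIPES LIKE 'mypipe_name' IN SCHEMA source_schema;

-- Recreate the pipe in the target schema with the same COPY statement
CREATE OR REPLACE PIPE target_schema.mypipe_name
  AUTO_INGEST = TRUE
  AS
  COPY INTO target_schema.my_table
  FROM @target_schema.my_stage;

-- After confirming the new pipe works, remove the old one
DROP PIPE source_schema.mypipe_name;

Note that a recreated pipe does not inherit the original pipe's load history, so take care not to re-ingest files that were already loaded (for example via ALTER PIPE ... REFRESH).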

Related

Query to return the Target Table of a Snowpipe

I'm trying to write a script to retrieve the COPY_HISTORY of tables that are target tables for my various pipes, but I can't find a way to start with a PIPE name and return the target table that I need to query.
Preferably, I would like to avoid pulling out the pipe definition and regexing through it for the INTO clause.
I could not find a way to do it with SHOW PIPES, INFORMATION_SCHEMA.PIPES, or INFORMATION_SCHEMA.LOAD_HISTORY.
Try SNOWFLAKE.ACCOUNT_USAGE.COPY_HISTORY. I see PIPE_NAME and TABLE_NAME.
LOAD_HISTORY does not return the history of data loaded using Snowpipe.
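A minimal sketch of that lookup, assuming the latency of the ACCOUNT_USAGE share is acceptable (PIPE_NAME and TABLE_NAME are the columns mentioned above):

-- Map each pipe to the table(s) it has loaded into
SELECT DISTINCT pipe_name, table_name
FROM snowflake.account_usage.copy_history
WHERE pipe_name IS NOT NULL
ORDER BY pipe_name;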

Oracle impdp remap one schema to many failed with "invalid value for parameter, 'remap_schema"

I created a dump file, dumpfile.dmp, in Oracle 12c for a schema, say A, from the source database. I then tried to import the dump file into several schemas, say B, C, and D, on another database, TESTDB, with one command using the remap_schema option. The command looks like this:
impdp system/password@TESTDB directory=mydirectory dumpfile=dumpfile.dmp remap_schema=A:B,C,D remap_tablespace=TBS_A:TBS_B,TBS_A:TBS_C,TBS_A:TBS_D logfile=mylogfile.log
I even put the command in a .par file, but I still get the same error.
It always comes back with the error "UDI-00014: invalid value for parameter, 'remap_schema'".
I would appreciate it if anyone could tell me what I am doing wrong.
You need to take a closer look at the syntax for REMAP_SCHEMA (and REMAP_TABLESPACE).
There is no provision for remapping one exported schema (or tablespace) to multiple destination schemas (or tablespaces).
https://docs.oracle.com/en/database/oracle/oracle-database/12.2/sutil/datapump-import-utility.html#GUID-5DA84A72-B71C-4491-9DD8-7075D9A4B04F
If your follow-up question is 'So how do I accomplish this?', the answer is to run a separate import for each destination schema.
tried to import the dump file to several schemas say B, C, D
remap_schema=A:B,C,D
You can't do that; Data Pump doesn't support that kind of remap. remap_schema must be a 1:1 relationship, as must remap_tablespace, and the source must be unique (i.e. you can only remap schema A once per import). Per the documentation:
Multiple REMAP_SCHEMA lines can be specified, but the source schema must be different for each one.
You will have to run separate imports for each target schema.
impdp system/password@TESTDB directory=mydirectory dumpfile=dumpfile.dmp remap_schema=A:B remap_tablespace=TBS_A:TBS_B logfile=mylogfile.log
impdp system/password@TESTDB directory=mydirectory dumpfile=dumpfile.dmp remap_schema=A:C remap_tablespace=TBS_A:TBS_C logfile=mylogfile.log
impdp system/password@TESTDB directory=mydirectory dumpfile=dumpfile.dmp remap_schema=A:D remap_tablespace=TBS_A:TBS_D logfile=mylogfile.log
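Since the question mentions trying a .par file, the same single-remap import can also be expressed as a parameter file (file names here are illustrative):

import_B.par:
directory=mydirectory
dumpfile=dumpfile.dmp
remap_schema=A:B
remap_tablespace=TBS_A:TBS_B
logfile=import_B.log

impdp system/password@TESTDB parfile=import_B.par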

If a database has 2 or more pipes, will all of them have the same notification_channel?

(An interesting conversation thread which may be of value to other Snowflake users...)
Q:
I have created two pipes for a database to pump data from AWS S3. The pipes are almost identical, with the exception of the target table.
The idea was to insert data into different tables based on the original S3 prefix. I created two event notifications on the S3 bucket to watch two different prefixes, but since the notification_channel generated by Snowflake is the same for both pipes, it is impossible to distinguish the events on the Snowflake side, and both pipes insert data from both S3 folders.
So is it a bug or a feature?
A:
It is not uncommon for different pipes to have the same ARN value in the same region. As long as your target destinations are different in the copy statements, you should be fine ingesting data correctly. Pipes in the same region share the same queue to ingest data. It all comes down to configuring the destination properly.
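A quick way to see this in practice (a sketch; the notification_channel column appears in SHOW PIPES output):

show pipes;
-- compare the notification_channel column for gen2_pipe and gen3_pipe: both report the same SQS ARN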
Q:
My target destinations are different:
create or replace pipe gen3_pipe auto_ingest = true
as
copy into gen3_data (device_id, event_datetime, load, origin, inserted_at)
from (select $1:device_id, TO_TIMESTAMP_NTZ($1:event_datetime), $1:load, metadata$filename, CURRENT_TIMESTAMP from @rawdata);

create or replace pipe gen2_pipe auto_ingest = true
as
copy into gen2_data (device_id, event_datetime, load, origin, inserted_at)
from (select $1:device_id, TO_TIMESTAMP_NTZ($1:event_datetime), $1:load, metadata$filename, CURRENT_TIMESTAMP from @rawdata);
Both tables, gen2_data and gen3_data, are getting the same data; basically gen2_data is a copy of gen3_data.
If both pipes are subscribed to the same ARN, how can I distinguish between events?
A:
Sharing of the queue (ARN) between pipes is normal. As long as your target table is different in both statements (for both pipes) and neither pipe includes PURGE, when a file comes into the bucket, it is loaded into both tables, gen3_data and gen2_data. You can drop a file in and test it out. Let me know if you have any problems and we can debug.
Q:
That is exactly the problem. I want to separate the data coming from S3, and for that I have two separate S3 events, hoping each Snowflake pipe can listen to its own event. But both pipes have the same ARN, so separation on this level seems to be impossible, or is it?
A:
Maybe I am not understanding this correctly. Do you want to load the same file into both target tables, or are you differentiating between files so they load into their respective tables? You can include the prefix in the statement when you are creating the external stage. When a data file is uploaded into the bucket, all pipes whose stage matches the directory path perform a one-time load of the file into their corresponding target tables.
For example, create two stages:
the stage for DEV is "s3://bucket/sub_folder_1" and the stage for PROD is "s3://bucket/sub_folder_2".
Then create COPY INTO commands for each target table pointing to its respective stage as well (see the sketch below).
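A minimal sketch of that setup, assuming hypothetical stage, integration, and table names (dev_stage/prod_stage, my_s3_int, dev_table/prod_table):

create stage dev_stage
  url = 's3://bucket/sub_folder_1'
  storage_integration = my_s3_int;   -- my_s3_int is a hypothetical storage integration

create stage prod_stage
  url = 's3://bucket/sub_folder_2'
  storage_integration = my_s3_int;

-- each pipe points at its own stage, so each only sees files under its own prefix
create or replace pipe dev_pipe auto_ingest = true as
  copy into dev_table from @dev_stage;

create or replace pipe prod_pipe auto_ingest = true as
  copy into prod_table from @prod_stage;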
Q:
Yes, thank you! I figured out the solution with prefixes/folders while creating the external stages.
The initial idea was to utilize two events on the S3 side with different prefixes, since we need them for other purposes as well.
So there is no way to tell a pipe explicitly which queue to listen to (i.e. assign an ARN), is there?
A:
That is correct. The stage reference has to be different for each pipe; otherwise they can load the same set of files into one or more target tables.

Need to map csv file to target table dynamically

I have several CSV files and their corresponding tables (which have the same columns as the CSVs, with appropriate data types) in the database, each with the same name as its CSV. So every CSV has a table in the database.
I somehow need to map them all dynamically. Once I run the mapping, the data from all the CSV files should be transferred to the corresponding tables. I don't want to have a different mapping for every CSV.
Is this possible through Informatica?
Appreciate your help.
PowerCenter does not provide such a feature out of the box. Unless the structures of the source files and target tables are the same, you need to define separate source/target definitions and create mappings that use them.
However, you can use Stage Mapping Generator to generate a mapping for each file automatically.
My understanding is that you have many CSV files with different column layouts and you need to load them into the appropriate tables in the database.
Approach 1: If you use any RDBMS, you should have some kind of import option. Explore that route to create tables based on the CSV files. This is a manual task.
Approach 2: Open the CSV file and write formulae using the header to generate a CREATE TABLE statement. Execute the formula result in your DB, so you will have all the tables created. Now use Informatica to read the CSVs, import all the table definitions, and load the data into the tables.
Approach 3: Using Informatica. You need to do a lot of coding to create a dynamic mapping on the fly.
Proposed solution:
Mapping 1:
1. Read the CSV file and pass the header information to a Java transformation.
2. The Java transformation should normalize and split the header column into rows. You can write them to a text file.
3. Now you have all the columns in a text file. Read this text file and use a SQL transformation to create the tables in the database.
Mapping 2:
Now that the table is available, read the CSV file (excluding the header) and load the data into the above table via a SQL transformation (INSERT statement) created by mapping 1.
You can follow this approach for all the CSV files. I haven't tried this solution at my end, but I am sure the above approach would work.
If you're not using any transformations, it's wise to use the import option of the database (e.g. a BTEQ script in Teradata). But if you are doing transformations, then you have to create as many sources and targets as the number of files you have.
On the other hand, you can achieve this in one mapping.
1. Create a separate flow for every file (i.e. Source-Transformation-Target) in the single mapping.
2. Use the target load plan to choose which file gets loaded first.
3. Configure the file names and corresponding database table names in the session for that mapping.
If all the mappings (if you have to create them separately) are the same, use the Indirect File Method. In the session properties, under the Mappings tab, source options, you will find this setting. The default is Direct; change it to Indirect.
I don't have the tool at hand right now to explore further and guide you clearly, but explore this Indirect file load type in Informatica (a sketch of the file list follows below). I am sure that this will solve the requirement.
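For reference, with the Indirect option the file named in the session is just a plain-text list of the actual data files, one path per line (paths below are illustrative):

file_list.txt:
/data/incoming/customers.csv
/data/incoming/orders.csv
/data/incoming/products.csv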
I have written a workflow in Informatica that does it, but some of the complex steps are handled inside the database. The workflow watches a folder for new files. Once it sees all the files that constitute a feed, it starts to process the feed. It takes a backup in a time stamped folder and then copies all the data from the files in the feed into an Oracle table. An Oracle procedure gets to work and then transfers the data from the Oracle table into their corresponding destination staging tables and finally the Data Warehouse. So if I have to add a new file or a feed, I have to make changes in configuration tables only. No changes are required either to the Informatica Objects or the db objects. So the short answer is yes this is possible but it is not an out of the box feature.

How to name an SQLite database so it doesn't have the default name of main?

How can I name an SQLite database so it doesn't have the default name of main?
I don't think so.
The main database has a special meaning.
You can attach other databases with other names.
From http://www.sqlite.org/sqlite.html
The ".databases" command shows a list of all databases open in the current connection. There will always be at least 2. The first one is "main", the original database opened. The second is "temp", the database used for temporary tables. There may be additional databases listed for databases attached using the ATTACH statement. The first output column is the name the database is attached with, and the second column is the filename of the external file.
You can't. "main" is simply the name which SQLite always uses for the primary database that you have open. (If necessary, you can add extra databases using ATTACH, though.)
http://www.sqlite.org/lang_attach.html
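A minimal sketch of the ATTACH route (the file name, alias, and table name are illustrative):

-- "main" always refers to the database the connection was opened on
ATTACH DATABASE 'other.db' AS appdata;
SELECT * FROM appdata.some_table;   -- qualify objects with the attached name
DETACH DATABASE appdata;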
