We have staged log files in an external S3 stage. The staged log files are in CEF format. How do we parse the CEF files from the stage and move the data into Snowflake?
If the files have a fixed format (i.e. there are record and field delimiters and each record has the same number of columns) then you can just treat them as text files and create an appropriate file format.
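A minimal sketch of that route (the stage name my_s3_stage, the table, and the format options are assumptions, not objects from your account): since the CEF header is pipe-delimited, you can declare '|' as the field delimiter.

-- CEF header layout: CEF:version|vendor|product|version|signature_id|name|severity|extension
CREATE OR REPLACE FILE FORMAT cef_pipe_format
  TYPE = 'CSV'
  FIELD_DELIMITER = '|'
  RECORD_DELIMITER = '\n'
  -- the trailing extension field can itself contain pipes, so don't fail on extras
  ERROR_ON_COLUMN_COUNT_MISMATCH = FALSE;

CREATE OR REPLACE TABLE cef_events (
  cef_version STRING, device_vendor STRING, device_product STRING,
  device_version STRING, signature_id STRING, name STRING,
  severity STRING, extension STRING
);

COPY INTO cef_events
  FROM @my_s3_stage
  FILE_FORMAT = (FORMAT_NAME = cef_pipe_format);

Note that with this approach any pipes inside the extension field would spill into extra (dropped) columns, so it only works cleanly if your extensions are pipe-free.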
If the file has a semi-structured format then you should be able to load it into a variant column - whether you can create multiple rows per file or only one depends on the file structure. If you can only create one record per file then you may run into issues with file size, as a VARIANT value has a maximum size (16 MB).
Once the data is in a variant column you should be able to process it to extract usable data. If there is a structure Snowflake can process natively (e.g. XML or JSON) then you can use the built-in capabilities. If there is no recognisable structure then you'd have to write your own parsing logic in a stored procedure.
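For example, if you first land each raw line in a single STRING column (raw_cef(line) is an assumed staging table, loaded with a file format whose FIELD_DELIMITER is NONE), the pipe-delimited CEF header can be split with ordinary string functions, ignoring CEF's \| escaping:

SELECT
  SPLIT_PART(line, '|', 2) AS device_vendor,
  SPLIT_PART(line, '|', 3) AS device_product,
  SPLIT_PART(line, '|', 4) AS device_version,
  SPLIT_PART(line, '|', 5) AS signature_id,
  SPLIT_PART(line, '|', 6) AS name,
  SPLIT_PART(line, '|', 7) AS severity,
  -- everything after the 7th pipe is the key=value extension block
  SUBSTR(line, LENGTH(REGEXP_SUBSTR(line, '^([^|]*\\|){7}')) + 1) AS extension
FROM raw_cef;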
Alternatively, you could try to find another tool that will convert your files to an XML/JSON format; Snowflake can then easily process those files.
I have a question about SSIS.
For instance, I have 100 flat files with the same metadata columns to be loaded using an incremental load. My question is: how can we find which flat file contains error data while loading with a Foreach Loop container? Any solution is appreciated.
Simplest solution: since you are using a Foreach Loop container, the file path is mapped to a variable. You can simply add a Derived Column Transformation and use this variable within the expression as follows (assuming the variable name is FilePath):
@[User::FilePath]
Then insert it along with the erroneous rows.
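If you'd rather do the same thing in code, a rough Script Component (transformation) sketch - assuming you have exposed User::FilePath under ReadOnlyVariables and added a SourceFile output column, both of which are names I made up - would be:

// C# Script Component under the Data Flow tab.
// SourceFile (DT_WSTR) is an assumed extra output column.
public override void Input0_ProcessInputRow(Input0Buffer Row)
{
    // Stamp every row with the file it came from, so rows that
    // later fail can be traced back to their source file.
    Row.SourceFile = Variables.FilePath;
}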
I'm trying to create a simple project in which I'd like to download XML files from a given website. I have stored the file names in a database table. Following this tutorial - Implementing Foreach Looping Logic in SSIS - here is what I have done:
a. Read all distinct rows from my table (let's call it XMLTable)
b. Assigned the result of this query to a user variable called nameOfFileToDownload
c. Created a Foreach Loop container
d. Configured it to locally assign each row's file name (the file to download) to the nameFileForeachLoop variable
e. Downloaded each file from the concatenated link as a path, using the HTTP connection manager with a dynamic file name taken from the nameFileForeachLoop variable
f. Created an XMLFlatFile connection for a dummy file - which I assumed was needed after reading the above tutorial
The problem now is that the loop container runs but doesn't download the files separately - it still writes to one file, which ends up empty. My nameFileForeachLoop variable is not updated on each loop iteration. What's more, I have noticed that during flat file creation only the CSV and TXT extensions are available. I have tried many approaches but without results. Can you help me download the XML files?
For example, I have the following link to an XML file: nbp.pl/kursy/xml/c001z180102.xml. What changes is the last part of the link (the file name with the XML extension), which I get from my XMLTable.
I have configured my components as follows:
You are on the right track, but need some amendments.
Do not create and configure a Flat File Destination connection manager unless you are writing tables out to .CSV or .TXT files. In the tutorial you followed, the author selects data with dynamic queries and stores the results in dynamically named .TXT files. As I understand it, this is not your case.
Here are some examples of how to download and save files over HTTP in SSIS: a sample download script and a review of different approaches to HTTP download.
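For reference, a bare-bones Script Task sketch of the download step (the connection manager name "HTTP Connection", the local folder, and the base URL are assumptions based on your example):

// C# Script Task. User::nameFileForeachLoop must be listed in ReadOnlyVariables.
// HttpClientConnection lives in Microsoft.SqlServer.Dts.Runtime.
public void Main()
{
    string fileName = Dts.Variables["User::nameFileForeachLoop"].Value.ToString();

    object nativeObject = Dts.Connections["HTTP Connection"].AcquireConnection(null);
    HttpClientConnection http = new HttpClientConnection(nativeObject);
    http.ServerURL = "http://www.nbp.pl/kursy/xml/" + fileName;

    // Save each iteration's file under its own name so nothing is overwritten.
    http.DownloadFile(@"C:\Downloads\" + fileName, true);

    Dts.TaskResult = (int)ScriptResults.Success;
}

Because the target path changes on every iteration, no Flat File connection manager (and hence no CSV/TXT restriction) is involved at all.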
I know this may be a simple task, but I have yet to find a simple answer. I have a large SQL table that I want to export into multiple flat files (.csv, to be exact) of 10,000 records each. I want to do this using SSIS, and from what I gather I will need a Foreach Loop container. This is as far as I have got. As an added bonus, a few of the columns have commas in the data itself, so when the file is comma-delimited the data still needs to be preserved without losing the original commas.
All the videos I have come across use scripts, or split the files by the type of data, or some other approach. I just want CSV files with a set number of records in each file. Any help is much appreciated.
Has anyone been able to get a variable record length text file (CSV) into SQL Server via SSIS?
I have tried time and again to get a CSV file into a SQL Server table, using SSIS, where the input file has varying record lengths. For this question, the two different record lengths are 63 and 326 bytes. All records will be imported into the same 326-byte-wide table.
There are over 1 million records to import.
I have no control of the creation of the import file.
I must use SSIS.
I have confirmed with MS that this has been reported as a bug.
I have tried several workarounds. Most have been attempts to write custom code to intercept the record, and I can't seem to get that to work as I want.
I had a similar problem, and used custom code (a Script Task), and a Script Component under the Data Flow tab.
I have a Flat File Source feeding into a Script Component. Inside there I use code to manipulate the incoming data and fix it up for the destination.
My issue was that the provider used '000000' to mean no date available, and another column had a padding/trim issue.
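The fix-up itself was only a few lines; a simplified sketch (the column names here are invented for illustration):

// C# Script Component; RawDate and AccountCode are placeholder column names.
public override void Input0_ProcessInputRow(Input0Buffer Row)
{
    // The provider used '000000' to mean "no date available".
    if (!Row.RawDate_IsNull && Row.RawDate == "000000")
        Row.RawDate_IsNull = true;

    // Another column arrived space-padded.
    if (!Row.AccountCode_IsNull)
        Row.AccountCode = Row.AccountCode.Trim();
}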
You should have no problem importing this file. Just make sure that when you create the Flat File connection manager, you select the Delimited format, then set the SSIS column length to the maximum column length in the file so it can accommodate any data.
It appears you are using the Fixed width format, which is not correct for CSV files (since you have variable-length columns), or maybe you've set the column delimiter incorrectly.
Same issue. In my case, the target CSV file has header and footer records with formats completely different from the body of the file; the header/footer are used to validate the completeness of file processing (date/times, record counts, amount totals - a "checksum" by any other name...). This is a common format for files from "mainframe" environments. Though I haven't started on it yet, I expect to have to use scripting to strip off the header/footer, save the rest as a new file, process the new file, and then do the validation. Can't exactly expect MS to have that out of the box (but it sure would be nice, wouldn't it?).
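A bare-bones sketch of that strip-and-save step (the paths, and the one-header-record/one-footer-record layout, are assumptions):

// C# Script Task: copy everything except the first (header) and last (footer) lines.
public void Main()
{
    string[] lines = System.IO.File.ReadAllLines(@"C:\In\extract.csv");

    // The header/footer can be parsed separately for the validation step.
    string[] body = new string[lines.Length - 2];
    System.Array.Copy(lines, 1, body, 0, body.Length);
    System.IO.File.WriteAllLines(@"C:\In\extract_body.csv", body);

    Dts.TaskResult = (int)ScriptResults.Success;
}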
You can write a Script Task using C# to iterate through each line and pad it out with the proper number of commas. This assumes, of course, that all of the data aligns with the proper columns.
That is, as you read each record, you can "count" the number of commas, then append commas to the end of the record until it reaches the correct number.
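A stripped-down sketch of that idea (the expected comma count, the file paths, and the no-quoted-commas assumption are all mine):

// C# Script Task: pad short records out to a fixed number of delimiters.
public void Main()
{
    const int expectedCommas = 325;   // assumed full-width record layout

    using (var reader = new System.IO.StreamReader(@"C:\In\input.csv"))
    using (var writer = new System.IO.StreamWriter(@"C:\In\padded.csv"))
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            // Count delimiters; assumes no commas inside quoted fields.
            int commas = line.Split(',').Length - 1;
            writer.WriteLine(line + new string(',', expectedCommas - commas));
        }
    }

    Dts.TaskResult = (int)ScriptResults.Success;
}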
Excel has an issue that causes this kind of file to be created when converting to CSV.
If you can do this "by hand" the best way to solve this is to open the file in Excel, create a column at the "end" of the record, and fill it all the way down with 1s or some other character.
Nasty, but can be a quick solution.
If you don't have the ability to do this, you can do the same thing programmatically as described above.
Why can't you just import it as a text file and set the column delimiter to "," and the row delimiter to CRLF?