Exclude specific Sub Folders - sql-server

I have a package that runs through a folder and its subfolders to get client data. The agreement has changed, and the client will now post his data in a differently named folder every time. I was wondering if I can run a Foreach loop on the main folder and exclude specific folders, such as archive.
I don't have any experience writing scripts, so I was wondering if SSIS can do that without a script.

Using Execute Script Task
Get the filtered list of files using an Execute Script Task before entering the loop, then loop over them using a Foreach Loop container (From Variable enumerator)
You have to add an SSIS variable (ex: User::FilesList) with type System.Object (Scope: Package)
Add an Execute Script Task before the Foreach Loop container and add User::FilesList as a ReadWrite variable
In the script, write the following code:
Imports System.Linq
Imports System.IO
Imports System.Collections.Generic
Public Sub Main()
    ' Root folder to scan and the subfolder to exclude
    Dim strDirectory As String = "C:\Temp"
    Dim strSubDirectory As String = strDirectory & "\New Folder"
    Dim lstFiles As New List(Of String)
    ' Scan all subfolders, skipping files under the excluded one
    lstFiles.AddRange(Directory.GetFiles(strDirectory, "*.*", SearchOption.AllDirectories).Where(Function(x) Not x.Contains(strSubDirectory)).ToList)
    Dts.Variables.Item("FilesList").Value = lstFiles
    Dts.TaskResult = ScriptResults.Success
End Sub
In the Foreach Loop container, choose the enumeration type Foreach From Variable Enumerator and choose the FilesList variable as the source
Using Expression Task
For more details you can refer to my answer in the following link (it is a similar case) WildCards in SSIS Collection {not include} name xlsx

You may have more control if you use a Script Task.
Here is sample code that I have used in one of my SSIS packages:
// Requires: using System; using System.Collections.Generic; using System.Data.SqlClient;
//           using System.IO; using System.Linq;
// Fetch the list of directories to exclude from a table
// (assumes the DirList column holds folder names as text)
List<string> excludeDirs = new List<string>();
using (SqlConnection conn = new SqlConnection(@"Data Source=.\SQLEXPRESS;AttachDbFilename=C:\testDB.mdf;Integrated Security=True;User Instance=True"))
using (SqlCommand cmd = new SqlCommand("select DirList from excludeDir", conn))
{
    conn.Open();
    using (SqlDataReader dr = cmd.ExecuteReader())
    {
        while (dr.Read())
        {
            excludeDirs.Add(dr.GetString(dr.GetOrdinal("DirList")));
        }
    }
}
// Compare each subdirectory against the exclude list and process
string[] dirs = Directory.GetDirectories(@"C:\My Sample Path\");
foreach (string path in dirs)
{
    // Compare on the folder name only, ignoring case
    string dirName = new DirectoryInfo(path).Name;
    bool isExcluded = excludeDirs.Contains(dirName, StringComparer.OrdinalIgnoreCase);
    if (!isExcluded)
    {
        // Set flags accordingly and execute
    }
}

You can't do this in the foreach loop properties, but what you can do is start the tasks inside the loop with a script task that checks to see if the folder name is a value that you want to exclude, and if it is, do nothing but loop to the next folder.
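The check described above could be sketched in C# as follows. This is only a minimal illustration, not the answerer's code: the class name FolderFilter, the method IsExcluded, and the variables User::CurrentFolder and User::SkipFolder mentioned afterwards are hypothetical names chosen for the example.

```csharp
using System;
using System.Linq;

public static class FolderFilter
{
    // Returns true when any segment of the path matches an excluded folder name
    // (case-insensitive), so "C:\Data\Archive\2012" is excluded by "archive".
    public static bool IsExcluded(string folderPath, params string[] excludedNames)
    {
        if (string.IsNullOrEmpty(folderPath)) return false;
        // Split on both separator styles so the check works on whole folder names,
        // not on substrings such as "archived_clients".
        string[] segments = folderPath.Split(
            new[] { '\\', '/' }, StringSplitOptions.RemoveEmptyEntries);
        return segments.Any(s => excludedNames.Contains(s, StringComparer.OrdinalIgnoreCase));
    }
}
```

Inside the Script Task, one possible use (assuming both variables exist in the package) is to store FolderFilter.IsExcluded(Dts.Variables["User::CurrentFolder"].Value.ToString(), "archive") in User::SkipFolder and let a precedence constraint with the expression !@[User::SkipFolder] decide whether the remaining tasks in the loop run.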

I would achieve this (without a Script Task) by setting the Disable property on the Tasks within the For Each Loop Container using an Expression, e.g.
FINDSTRING ( @[User::Each_File_Path] , "archive" , 1 ) > 0

Related

Can a array list of C# be used to populate SSIS object variable?

I have populated a list in C# script and assigned its value to SSIS object variable.
Then I used that object variable to execute a SQL query by looping through it with a Foreach ADO enumerator.
I tried doing this with the Foreach ADO enumerator, but I get the error
X variable doesn't contain a valid data object.
Can anybody provide any input?
You're using a list, not a recordset, and therefore you need to enumerate over a variable.
If you want to use ADO Recordset, you need to fill a datatable instead.
This shows you how to write to object with a variable list
This shows you how to write to object with recordset (using multiple values)
Like this:
1. C# Script code - Write to object with list using variable enumerator
public void Main()
{
// TODO: Add your code here
List<string> NewList = new List<string>();
NewList.Add("Ost");
NewList.Add("Hest");
Dts.Variables["User::NameList"].Value = NewList;
Dts.TaskResult = (int)ScriptResults.Success;
}
1. Variable settings in ssis
1. Foreach loop container settings
Use Foreach Variable Enumerator and use your object variable
Map your outcome to a variable(s)
1. Execute SQL Task test case
Write your SQL with variables
Map your variable to Parameter mapping
1. Result
2. C# Script code - Write to object with datatable using ADO enumerator
public void Main()
{
// TODO: Add your code here
DataTable dt = new DataTable();
dt.Columns.Add("FilmName",typeof(string));
dt.Columns.Add("ActorName",typeof(string));
dt.Rows.Add("Starwars", "Harrison ford");
dt.Rows.Add("Pulp fiction", "Samuel Jackson");
Dts.Variables["User::NameList"].Value = dt;
Dts.TaskResult = (int)ScriptResults.Success;
}
2. Variable settings in ssis
2. Foreach loop container settings
Use Foreach ADO Enumerator and your object as variable
Map your outcome to variable(s)
2. Execute sql task test case
Write your SQL with variables
Map your variable(s) to Parameter mapping
2. Result
Thanks @plaidDK
The second approach solved my problem:
2. C# Script code - Write to object with datatable using ADO enumerator
Instead of a list, I populated a data table:
public DataTable ToDataTable<T>(List<T> items)
{
DataTable dataTable = new DataTable(typeof(T).Name);
//Get all the properties by using reflection
PropertyInfo[] Props = typeof(T).GetProperties(BindingFlags.Public | BindingFlags.Instance);
foreach (PropertyInfo prop in Props)
{
//Setting column names as Property names
dataTable.Columns.Add(prop.Name);
}
foreach (T item in items)
{
var values = new object[Props.Length];
for (int i = 0; i < Props.Length; i++)
{
values[i] = Props[i].GetValue(item, null);
}
dataTable.Rows.Add(values);
}
return dataTable;
}
//Variable passed as below
Variables.vFailedTransactionNo = dt;
And then the ADO enumerator did the rest of the job.
Thanks for help!

How to load multiple sheets of an Excel File in SSIS with header information in the sheets

I have a file with 2 sheets in it, each with header information. I tried a Foreach Loop container to load the data.
The error I got is:
[SSIS.Pipeline] Error: "Excel Source" failed validation and returned validation status "VS_NEEDSNEWMETADATA".
I tried removing the header row manually from the sheets, ran the Foreach Loop container again, and it worked perfectly fine.
But per my requirements, I will be getting the header row followed by a blank row in each sheet.
What do I do in this case?
I believe we need to use a Script Task to eliminate the header and the blank row that follows it, and then read the rest of the records.
My problem is that I am bad at C# code logic.
Your help is much appreciated.
Thank you,
swathi
The following Script Task will delete the top 2 rows from every worksheet in the file (you'll need to create the variable 'ExcelFilePath' in SSIS and pass that in to the task, along with 'System::TaskName'):
public void Main()
{
MainTask();
GC.Collect();
GC.WaitForPendingFinalizers();
}
private void MainTask()
{
xl.Application xlApp = null;
xl.Workbook excelFile = null;
string excelFilePath = Dts.Variables["User::ExcelFilePath"].Value.ToString();
string thisTask = Dts.Variables["System::TaskName"].Value.ToString();
try
{
xlApp = new xl.Application();
excelFile = xlApp.Workbooks.Open(excelFilePath);
xlApp.DisplayAlerts = false;
foreach (xl.Worksheet ws in excelFile.Worksheets)
{
ws.Rows["1:2"].EntireRow.Delete();
}
xlApp.DisplayAlerts = true;
excelFile.Save();
excelFile.Close();
xlApp.Quit();
Dts.TaskResult = (int)ScriptResults.Success;
}
catch (Exception ex)
{
Dts.Events.FireError(0, thisTask, ex.Message, String.Empty, 0);
if (excelFile != null) excelFile.Close(SaveChanges:false);
if (xlApp != null) xlApp.Quit();
}
}
You will need to add references to 'COM' > 'Microsoft Excel [version number] Object Library' (whichever version you have) and '.NET' > 'Microsoft.CSharp'. You'll then need to declare using xl = Microsoft.Office.Interop.Excel; in your 'Namespaces' region.

Import Latest csv file in a folder - SSIS

I want to import the latest csv file into a table using SSIS. I currently have a step that gets the last file in a folder:
Report_201209030655.csv
Report_201209030655.csv
Report_201209030655.csv
Based on the created time, I want steps to import the data of the latest csv into a table.
Refer to this solution:
http://blog.sqlauthority.com/2011/05/12/sql-server-import-csv-file-into-database-table-using-ssis/
Then use a Script Task to populate the file name and pass that variable as the file name for the source component.
Code to get the most recent files (those modified in the last 24 hours):
public void Main()
{
string[] files = System.IO.Directory.GetFiles(@"C:\SSIS\Files");
DataTable NewList=new DataTable();
DataColumn col = new DataColumn("FileName");
NewList.Columns.Add(col);
System.IO.FileInfo finf;
foreach (string f in files)
{
finf = new System.IO.FileInfo(f);
if (finf.LastWriteTime > DateTime.Now.AddHours(-24))
{
NewList.Rows.Add(f);
}
}
Dts.Variables["User::FileNameArray"].Value = NewList;
Dts.TaskResult = (int)ScriptResults.Success;
}
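The code above collects every file modified in the last 24 hours. If only the single newest file is wanted, ordered by creation time as the question asks, a sketch along these lines may help. The names LatestFileFinder and GetLatestFile are hypothetical and not part of the original answer.

```csharp
using System;
using System.IO;
using System.Linq;

public static class LatestFileFinder
{
    // Returns the full path of the most recently created file matching the pattern,
    // or null when the folder contains no matching file.
    public static string GetLatestFile(string folder, string pattern)
    {
        return new DirectoryInfo(folder)
            .GetFiles(pattern)
            .OrderByDescending(f => f.CreationTimeUtc)
            .Select(f => f.FullName)
            .FirstOrDefault();
    }
}
```

In a Script Task, the result of GetLatestFile(@"C:\SSIS\Files", "*.csv") could be written to a string variable that the connection manager's ConnectionString expression reads. Checking for null first avoids running the data flow when no file has arrived.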

Object Variable in script tasks

In my package, I have an Execute Sql Task that sets the result set to a User variable. I then have a c# script task that needs to reference this User variable as a result set. I need the entire result set sent into my script tasks as the web service I am calling needs the entire result set in one shot.
This is the current code I am testing with. It isn't much as I am still trying to figure out where to go with it.
Any help with this is greatly appreciated
public void Main()
{
Variable resultSet = Dts.Variables["User::ZBatch_Order_Export_ResultSet"];
Dts.TaskResult = (int)ScriptResults.Success;
}
This is the updated working code:
public void Main()
{
DataTable dt = new DataTable();
OleDbDataAdapter oleDa = new OleDbDataAdapter();
oleDa.Fill(dt, Dts.Variables["User::ZBatch_Order_Export_ResultSet"].Value);
foreach (DataRow row in dt.Rows)
{
Dts.Events
.FireError(0, "ZBatch - Script Task", row["orderDate"]
.ToString(), String.Empty, 0);
// Do some Webservice magic
}
Dts.TaskResult = (int)ScriptResults.Success;
}
So very close. To access the value of a variable, you need to hit the Value property:
public void Main()
{
object resultSet = Dts.Variables["User::ZBatch_Order_Export_ResultSet"].Value;
// do stuff here with resultSet and the webservice
Dts.TaskResult = (int)ScriptResults.Success;
}

Programatically add document to Hummingbird/OpenText eDocs database

I am working with the OpenText eDocs (formerly Hummingbird Enterprise) document management system.
http://www.opentext.com/2/global/products/products-opentext-edocs-products/products-opentext-edocs-document-management.htm
We are still using Hummingbird 5.1.0.5.
I have been reviewing the API docs for this software, but some areas are slightly vague.
So far, I can create my Profile form, populate some values.
DOCSObjects.Application docApp = null;
DOCSObjects.IProfile profile = null;
Type fType = Type.GetTypeFromProgID("DOCSObjects.Application");
docApp = (DOCSObjects.Application)Activator.CreateInstance(fType);
try { profile = docApp.CurrentLibrary.CreateProfile("DEF_PROF"); }
catch (Exception ex) { System.Diagnostics.Debug.WriteLine(ex.Message); }
if (profile != null)
{
try
{
profile.Columns["DOCNAME"].Value = "New PDF Document";
profile.Columns["APP_ID"].Value = "ACROBAT";
profile.ShowProfile(1);
// not sure how to set a document here
profile.SetDocument(docApp.CurrentLibrary.Name, document);
profile.Save(); // requires a short flag, but what?
}
catch (Exception ex)
{
System.Diagnostics.Debug.WriteLine(ex.Message);
}
}
else
{
MessageBox.Show("Profile is null");
}
Where I am having trouble is how to save a document with the profile.
I am using C#, and the API docs and IntelliSense simply ask for an object for the document.
Does that mean the path or do I need to load the PDF into some specific DOCSObjects type?
Also, the API docs references a Constant such as OF_NORMAL when saving the document. I assume this is 0, but are there others I should know about? There are many Constants referenced in the docs that have no values defined. (All examples are in C++/VB).
I know it's a long shot anyone is using this software, but thought I would give it a try.
Thank you and any assistance is appreciated.
I have done it in VB - using an API wrapper that I created. You should use the PCDClient under DM API folder instead of the DOCSObjects.
This code here probably won't work right away for you because it is heavily customized, but play around with it and you can probably figure it out. Good Luck!
Public Sub CreateProfile(ByRef Doc As Profile)
Try
'SET THE STATIC META DATA
Doc.objDoc.SetProperty("TYPE_ID", "DOCS") ' DOCUMENT TYPE IS ALWAYS DOCS
Doc.objDoc.SetProperty("TYPIST_ID", RDIMSAPI._UserID)
Doc.objDoc.SetProperty("APP_ID", RDIMSData.GetApp(Doc.FileToImport)) ' FILE TO IMPORT
'CREATE THE DOCUMENT
Doc.objDoc.Create()
If Doc.objDoc.ErrNumber <> 0 Then
Throw New Exception(Doc.objDoc.ErrNumber & " - " & Doc.objDoc.ErrDescription)
End If
'RETRIEVE THE NEW DOCUMENT PROFILE
Dim DocNumber As Integer = Doc.objDoc.GetReturnProperty("%OBJECT_IDENTIFIER")
Dim VersionID As Integer = Doc.objDoc.GetReturnProperty("%VERSION_ID")
'ADD THE DOCUMENT TO THE PROFILE
Dim objPutDoc As New PCDClient.PCDPutDoc
objPutDoc.SetDST(RDIMSAPI._sDST)
objPutDoc.AddSearchCriteria("%TARGET_LIBRARY", RDIMSAPI._Library)
objPutDoc.AddSearchCriteria("%DOCUMENT_NUMBER", DocNumber)
objPutDoc.AddSearchCriteria("%VERSION_ID", VersionID)
objPutDoc.Execute()
If objPutDoc.ErrNumber <> 0 Then
Throw New Exception(objPutDoc.ErrNumber & " - " & objPutDoc.ErrDescription)
End If
objPutDoc.NextRow()
'UPLOAD THE DOCUMENT
Dim objPutStream As PCDClient.PCDPutStream = objPutDoc.GetPropertyValue("%CONTENT")
Dim fs As FileStream = System.IO.File.OpenRead(Doc.FileToImport)
Dim fi As FileInfo = New System.IO.FileInfo(Doc.FileToImport)
Dim br As BinaryReader = New BinaryReader(fs)
' ReadBytes already loads the whole file, so no second read is needed
Dim addDocBytes As Byte() = br.ReadBytes(CInt(fs.Length))
br.Close()
Dim bytesWritten As Integer = 0
objPutStream.Write(addDocBytes, addDocBytes.Length, bytesWritten)
objPutStream.SetComplete()
'UNLOCK THE DOCUMENT
Dim objDoc As New PCDClient.PCDDocObject
objDoc.SetDST(RDIMSAPI._sDST)
objDoc.SetObjectType("0_RDIMSPROF_SYS")
objDoc.SetProperty("%TARGET_LIBRARY", RDIMSAPI._Library)
objDoc.SetProperty("%OBJECT_IDENTIFIER", DocNumber)
objDoc.SetProperty("%VERSION_ID", VersionID)
objDoc.SetProperty("%STATUS", "%UNLOCK")
objDoc.Update()
objDoc.Fetch()
' Check the unlock object itself before releasing it
If objDoc.ErrNumber <> 0 Then
Throw New Exception(objDoc.ErrNumber & " - " & objDoc.ErrDescription)
End If
objDoc = Nothing
'RELEASE ALL OBJECTS AND RETURN DOCUMENT NUMBER
objPutDoc = Nothing
Catch ex As Exception
'IF EXCEPTION, LOG ERROR AND DISPLAY MESSAGE
Throw New Exception("(" & Me.GetType().FullName & "." & New StackTrace(0).GetFrame(0).GetMethod.Name & ") " & ex.Message)
Exit Sub
End Try
End Sub
I don't know if you're still trying, but here's my C# code for this. It's part of a larger module, so it won't work immediately. The profile parameter would be, for example, "DEF_PROF".
This also uses the PCDClientLib. My understanding is that these are server-side libraries, which you should use only on the server, and that you should use the library you've already used for client-side code.
// All variable prepended with an underscore are class fields etc...
// DMImportException is a custom exception, nothing special really
/// <summary>
/// Import a file into the library previously logged in to.
/// </summary>
/// <param name="profile">The name of the used profile.</param>
/// <param name="profileNameValues">A dictionary of strings containing the profile values which should be saved for the document.</param>
/// <param name="FileName">The path and filename of the file to import.</param>
public virtual void ImportFile(string profile, Dictionary<string, string> profileNameValues, string FileName)
{
if (!_isLoggedIn)
{
throw new DMImportException("Trying to import a file while not logged in into DM.");
}
int TotalBytesWritten;
byte[] bdata;
bdata = File.ReadAllBytes(FileName);
PCDDocObject objDoc = new PCDDocObject();
objDoc.SetProperty("%TARGET_LIBRARY", _library);
objDoc.SetDST(_dst);
objDoc.SetObjectType(profile);
foreach (var profileNameValuePair in profileNameValues)
{
objDoc.SetProperty(profileNameValuePair.Key, profileNameValuePair.Value);
}
objDoc.Create();
if (objDoc.ErrNumber != 0)
{
throw new DMImportException("Error while creating a new objDoc. Check the inner error.", objDoc.ErrNumber, objDoc.ErrDescription);
}
_docNumber = objDoc.GetReturnProperty("%OBJECT_IDENTIFIER").ToString();
_versionID = objDoc.GetReturnProperty("%VERSION_ID").ToString();
PCDPutDoc objPutDoc = new PCDPutDoc();
objPutDoc.SetDST(_dst);
objPutDoc.AddSearchCriteria("%TARGET_LIBRARY", _library);
objPutDoc.AddSearchCriteria("%DOCUMENT_NUMBER", _docNumber);
objPutDoc.AddSearchCriteria("%VERSION_ID", _versionID);
objPutDoc.Execute();
if (objPutDoc.ErrNumber != 0)
{
throw new DMImportException("RecentEdit Failure on Execute: Error while trying to get a handle to the newly created doc. Check the inner error.", objPutDoc.ErrNumber, objPutDoc.ErrDescription);
}
objPutDoc.NextRow();
PCDPutStream objPutStream = (PCDPutStream)objPutDoc.GetPropertyValue("%CONTENT");
objPutStream.Write((object)bdata, (int)bdata.Length, out TotalBytesWritten);
objPutStream.SetComplete();
objPutStream = null;
objDoc = null;
objDoc = new PCDDocObject();
objDoc.SetDST(_dst);
objDoc.SetObjectType(profile);
objDoc.SetProperty("%TARGET_LIBRARY", _library);
objDoc.SetProperty("%OBJECT_IDENTIFIER", _docNumber);
objDoc.SetProperty("%VERSION_ID", _versionID);
objDoc.SetProperty("%STATUS", "%UNLOCK");
objDoc.Update();
if (objDoc.ErrNumber != 0)
{
throw new DMImportException("Error while trying to unlock the just imported file. Check the inner error.", objDoc.ErrNumber, objDoc.ErrDescription);
}
objPutDoc = null;
objDoc = null;
return;
}
P.S. I'd recommend you update to a later version of eDocs (we're upgrading from 5.1.0.5 to 5.2.1 end of this week ;-D)
--- EDIT ---
I think you need
Application.CurrentLibrary.CreateProfile("PROF_DEF").CreateVersionFromFile( /* filePath is one of the params */);
if you really need to do this with the DM Ext. API instead of the DM API
