Detecting an empty flat file with SSIS script task - sql-server

I am looking for a script task that identifies a zero-KB file in a folder and reports it in a mail or a text file.
Thanks in advance. Let me know if you have any questions.

Something like this:
string filePath = Dts.Variables["User::FilePath"].Value.ToString();
string strContents;
// Requires "using System.IO;". The using block closes the reader even if reading fails.
using (StreamReader sReader = File.OpenText(filePath))
{
    strContents = sReader.ReadToEnd();
}
if (strContents.Length == 0)
    MessageBox.Show("Empty file");
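The check above reads the whole file just to learn that it is empty. As a rough sketch of the folder-level task from the question, written in Python rather than as an SSIS script task, the file size can be inspected directly and the result written to a text file. The function names `find_empty_files` and `write_report` are hypothetical, introduced only for this illustration.

```python
import os


def find_empty_files(folder):
    """Return the sorted paths of regular files in `folder` that are 0 bytes long."""
    empty = []
    for entry in os.scandir(folder):
        # stat() reads the size from file metadata; the contents are never opened
        if entry.is_file() and entry.stat().st_size == 0:
            empty.append(entry.path)
    return sorted(empty)


def write_report(paths, report_path):
    """Write one path per line to a plain text report (hypothetical helper)."""
    with open(report_path, "w", encoding="utf-8") as report:
        for path in paths:
            report.write(path + "\n")
```

In C#, the equivalent size check would be `new FileInfo(filePath).Length == 0`.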

Related

ssis script task (vb) issues when reading large file

I'm using the code below inside an SSIS script task to modify the contents of a file. I'm basically creating one JSON document from a file that contains many JSON documents, one after the other.
This code works perfectly up to a file of around 1 GB (reading the 1 GB file uses almost 7 GB of memory in SSIS); beyond that it crashes (I assume due to memory). I need to read files of up to 5 GB.
Any help, please.
Public Sub Main()
Dim filePath As String = Dts.Variables("User::filepath").Value.ToString()
Dim content As String = File.ReadAllText(filePath).Replace("}", "},")
content = content.Substring(0, Len(content) - 1)
content = "{ ""query"" : [" + content + "] }"
File.WriteAllText(filePath, content)
Dts.TaskResult = ScriptResults.Success
End Sub
Using File.ReadAllText(filePath) to read big flat files is not recommended, because it loads the entire content into memory. I think you should use a simple data flow task to transfer the data from this flat file to a new flat file, and do the transformation you need in a script component on each row.
Alternatively, you can read the file line by line in a script with a StreamReader and write it to a new file with a StreamWriter; when finished, delete the original file and rename the new one.
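The read-then-write-to-a-new-file idea can be sketched as follows, in Python rather than VB and reading fixed-size chunks instead of lines (the input may not contain line breaks at all). The function name `wrap_json_stream` and the chunk size are assumptions made for this illustration. It keeps the question's logic unchanged: every `}` becomes `},`, the last character is dropped, and the result is wrapped in a `query` array.

```python
import os
import tempfile


def wrap_json_stream(path, chunk_size=1024 * 1024):
    """Rewrite the file at `path` without holding more than one chunk in memory."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the same folder so the final rename stays on one volume
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with open(path, "r", encoding="utf-8", newline="") as src, \
            os.fdopen(fd, "w", encoding="utf-8", newline="") as dst:
        dst.write('{ "query" : [')
        pending = ""  # transformed text whose last character is still held back
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            # "}" is a single character, so a chunk boundary cannot split it
            pending += chunk.replace("}", "},")
            dst.write(pending[:-1])
            pending = pending[-1:]
        # The held-back character is discarded, like Substring(0, Len - 1) in the question
        dst.write("] }")
    os.replace(tmp_path, path)
```

As in the original code, this only produces valid JSON when the input objects are flat and the file ends with `}`; nested objects would also get commas after their inner closing braces.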
References
How to open a large text file in C#
File System Read Buffer Efficiency
c# - How to read a large (5GB) txt file in .NET?

MD5 of file downloaded from database, from a JSONObject

My requirement is to compare the MD5 hashes of a file on the local disk and a file downloaded from a database.
The file is stored on SQL Server in a VARBINARY(MAX) column. The file can be any type. I'm currently testing with a PDF file. I get the file from the database using a HttpPost request. A JSONObject is built using the HttpResponse object. The JSONObject contains the file contents in binary format.
Now I have to compare the MD5 hash of the received binary data against the MD5 hash of the same file on disk. I have written the following code but the MD5 hashes do not match.
I think I'm going wrong in simply calculating the MD5 of the downloaded binary contents. Is there a correct way to do this? Thanks in advance.
// Read response from a HttpResponse object 'response'
BufferedReader reader = new BufferedReader(new InputStreamReader(response.getEntity().getContent()));
String line="";
StringBuilder sb = new StringBuilder();
while((line=reader.readLine())!=null) {
sb.append(line);
}
// construct jsonobject
JSONObject jsonResponse = new JSONObject(sb.toString());
//Read file from disk
FileInputStream fis = new FileInputStream(new File(this.getClass().getResource("C:\\demo.pdf").getPath()));
// Calculate MD5 of file read from disk
String md5Request = org.apache.commons.codec.digest.DigestUtils.md5Hex(fis);
// Calculate MD5 of binary contents. "binfile" is name of key in the JSONObject
// and binary contents of downloaded file are in its corresponding value field
String md5Response = org.apache.commons.codec.digest.DigestUtils.md5Hex(jsonResponse.getString("binfile"));
Assert.assertEquals("Hash sums of request and response must match", md5Request, md5Response);
When I debug, I see this value against the binfile key in the JSONObject 'jsonResponse'
binfile=[37,80,68,70,45,49,46,52,13,37,-30,-29,-49,-45,13,10,52,48...]
and what follows is a lengthy stream of binary data.
OK, in SQL Server there's a built-in function that looks like this:
select *,
convert(varchar(50),master.sys.fn_repl_hash_binary(a.BinaryField),2) as 'MD5Hash'
from SomeTable a
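You pass fn_repl_hash_binary the binary field you're reading, and it returns the MD5 hash as a binary value. The "2" is the style argument of CONVERT, not of the hash function: it formats the binary value as a hexadecimal string without the leading 0x (style "1" keeps the 0x prefix).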
And in Java, you can use something like this:
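// Requires: import java.math.BigInteger; import java.security.MessageDigest;
private String getMD5Hash(byte[] bytes) throws java.lang.Exception {
    MessageDigest m = MessageDigest.getInstance("MD5");
    m.update(bytes, 0, bytes.length);
    // %032x pads to 32 hex digits; toString(16) would drop leading zeros
    return String.format("%032x", new BigInteger(1, m.digest()));
}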
This should do the trick. Best of luck, CodeWarrior.
This is not a new post, but here is a possible solution, as I faced this problem too in Python and ran a bunch of tests to find out how to do it.
Since you treat all data as binary, you need to open the file you are comparing in binary mode.
My original code, which failed every time to produce the correct MD5 checksum:
with open(filepath, "r") as file_to_check:
    tile_file = file_to_check.read()
Corrected code:
with open(filepath, "rb") as file_to_check:
    tile_file = file_to_check.read()
Simply adding the b (binary) after the read (r) flag lets Python know it needs to read the file as binary, and now it works.
This might help you find your problem. Hope it helps!
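To tie this back to the question: the `binfile` value shown in the debugger is a list of signed byte values, so hashing its string form cannot match the hash of the file on disk. A minimal Python sketch of the comparison, assuming the JSON array has already been parsed into a list of integers, could look like this. Both function names are hypothetical.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Hash a file opened in binary mode, one chunk at a time."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def md5_of_signed_bytes(values):
    """Hash a list of Java-style signed byte values (-128..127)."""
    # `& 0xFF` maps each signed value to its unsigned form, e.g. -30 becomes 226
    raw = bytes(value & 0xFF for value in values)
    return hashlib.md5(raw).hexdigest()
```

The two hex strings are equal only when the byte sequences are identical, which is why both sides have to be handled as raw bytes and never as text.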

reading text files in Adobe AIR

Recently I've found that not all text (.txt) files can be read the way I need in Adobe AIR, because of different file encodings (Unicode, UTF-8, ASCII).
For example:
var fDataStream:FileStream;
var textfile:File = new File ("C:\\myfile.txt");
var sContent:String;
fDataStream = new FileStream();
fDataStream.open(textfile,FileMode.READ);
sContent = fDataStream.readUTFBytes(fDataStream.bytesAvailable);
fDataStream.close ();
If 'myfile.txt' is not UTF-8 encoded, I get a string like "ÿþE".
I know that there is the fDataStream.readMultiByte() method, but it requires a string naming the file's charset, which can't be known beforehand (input .txt files for the app could be in different charsets). I'm out of ideas.
Thanks.
I think you want to use .readBytes instead of .readUTFBytes
That should load anything you give it.
see
http://help.adobe.com/en_US/FlashPlatform/reference/actionscript/3/flash/filesystem/FileStream.html#readBytes()
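Once the raw bytes are loaded, the charset still has to be chosen. The "ÿþ" in the question is what the UTF-16 little-endian byte order mark (bytes FF FE) looks like when it is decoded with the wrong charset, so one common heuristic is to check for a byte order mark first. The sketch below shows that idea in Python rather than ActionScript. The function name `decode_text` and the cp1252 fallback are assumptions, and UTF-32 is not handled.

```python
import codecs

# Byte order marks checked in order, paired with the codec used for the rest of the data
BOMS = (
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
)


def decode_text(raw, fallback="cp1252"):
    """Decode bytes using a BOM if present, else try UTF-8, else a fallback charset."""
    for bom, encoding in BOMS:
        if raw.startswith(bom):
            return raw[len(bom):].decode(encoding)
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: files without a BOM that are not valid UTF-8 use a Windows code page
        return raw.decode(fallback, errors="replace")
```

In AIR, the detected charset name would then be passed to `readMultiByte()`.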

How to read a file in Groovy into a string?

I need to read a file from the file system and load the entire contents into a string in a groovy controller, what's the easiest way to do that?
String fileContents = new File('/path/to/file').text
If you need to specify the character encoding, use the following instead:
String fileContents = new File('/path/to/file').getText('UTF-8')
The shortest way is indeed just
String fileContents = new File('/path/to/file').text
but in this case you have no control on how the bytes in the file are interpreted as characters. AFAIK groovy tries to guess the encoding here by looking at the file content.
If you want a specific character encoding you can specify a charset name with
String fileContents = new File('/path/to/file').getText('UTF-8')
See API docs on File.getText(String) for further reference.
A slight variation...
new File('/path/to/file').eachLine { line ->
println line
}
In my case new File() doesn't work, it causes a FileNotFoundException when run in a Jenkins pipeline job. The following code solved this, and is even easier in my opinion:
def fileContents = readFile "path/to/file"
I still don't understand this difference completely, but maybe it'll help anyone else with the same trouble. Possibly the exception was caused because new File() creates a file on the system which executes the groovy code, which was a different system than the one that contains the file I wanted to read.
the easiest way would be
new File(filename).getText()
which means you could just do:
new File(filename).text
Here are some other ways to do the same.
Read a file as a list of lines.
File file1 = new File("C:\\Build\\myfolder\\myTestfile.txt")
List<String> yourData = file1.readLines()
Read the full file.
File file1 = new File("C:\\Build\\myfolder\\myfile.txt")
String yourData = file1.getText()
Read a file line by line.
File file1 = new File("C:\\Build\\myfolder\\myTestfile.txt")
// Read the lines once, then log the first 30 (change 30 to the number of lines you need)
file1.readLines().take(30).each { line ->
    log.info line
}
Create a new file.
new File("C:\\Temp\\FileName.txt").createNewFile()

Copy text from WPF DataGrid to Clipboard to Excel

I have a WPF DataGrid (VS2010, C#). I copy the data from the DataGrid to the Clipboard and write it to an Excel file. Below is my code.
dataGrid1.SelectAllCells();
dataGrid1.ClipboardCopyMode = DataGridClipboardCopyMode.IncludeHeader;
ApplicationCommands.Copy.Execute(null, dataGrid1);
dataGrid1.UnselectAllCells();
string path1 = "C:\\test.xls";
string result1 = (string)Clipboard.GetData(DataFormats.CommaSeparatedValue);
Clipboard.Clear();
System.IO.StreamWriter file1 = new System.IO.StreamWriter(path1);
file1.WriteLine(result1);
file1.Close();
Everything works OK, except that when I open the Excel file it gives me two warnings:
"The file you are trying to open
'test.xls' is in a different format
than specified by the file extension.
Verify that the file is not corrupted
and is from a trusted source before
opening the file. Do you want to open
the file now?"
"Excel has detected that 'test.xls' is
a SYLK file, but cannot load it."
After I click through them, Excel still opens the file OK and the data is formatted as it is supposed to be. But I can't find out how to get rid of the two warnings that appear before the file opens.
You need to use csv as the extension. Xls is the Excel file extension.
So
string path1 = "C:\\test.csv";
should work.
A problem like yours has already been described here: generating/opening CSV from console - file is in wrong format error.
Does it help to solve yours?
Edit: Here is the related Microsoft KB => http://support.microsoft.com/kb/323626
