Reading text files in Adobe AIR

Recently I found that not all text (.txt) files can be read the way I need in Adobe AIR, because of their different file encodings (Unicode, UTF-8, ASCII).
For example:
var fDataStream:FileStream;
var textfile:File = new File("C:\\myfile.txt"); // backslash must be escaped in a string literal
var sContent:String;
fDataStream = new FileStream();
fDataStream.open(textfile,FileMode.READ);
sContent = fDataStream.readUTFBytes(fDataStream.bytesAvailable);
fDataStream.close ();
If 'myfile.txt' is not UTF-8 encoded, I get a string like "ÿþE".
I know there is a fDataStream.readMultiByte() method, but it requires a string naming the file's charset, which can't be known beforehand (the app's input .txt files could be in different charsets). I'm out of ideas.
Thanks.

I think you want to use readBytes() instead of readUTFBytes().
That should load anything you give it.
See:
http://help.adobe.com/en_US/FlashPlatform/reference/actionscript/3/flash/filesystem/FileStream.html#readBytes()
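
The "ÿþ" at the start of the garbled string is what the bytes FF FE, the UTF-16 little-endian byte order mark (BOM), look like when misread as Latin-1 text. Once the raw bytes are available, one possible approach is to sniff the BOM and choose the charset from it. The sketch below is Python, not AIR code, and only illustrates the idea; the function name decode_with_bom and the UTF-8 fallback are assumptions, and a file without a BOM in a legacy code page would still need some other heuristic.

```python
import codecs

# Checked in this order so the UTF-32 LE mark (FF FE 00 00) is not
# mistaken for the UTF-16 LE mark (FF FE), which is its prefix.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def decode_with_bom(raw: bytes, fallback: str = "utf-8") -> str:
    """Decode raw bytes using the charset implied by a leading BOM.

    If no BOM is present, the (assumed) fallback charset is used and
    undecodable bytes are replaced rather than raising.
    """
    for bom, name in _BOMS:
        if raw.startswith(bom):
            return raw[len(bom):].decode(name)
    return raw.decode(fallback, errors="replace")
```

In AIR, the same check could be made on the first bytes returned by readBytes(), and the detected charset name then passed to readMultiByte().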

Related

Have to embed fonts when using utf-8 in libharu?

Yes, I have to ask another question. I just want to generate a PDF with Russian text, and I found that libharu (RELEASE_2_3_0) can help me.
With this code:
HPDF_UseUTFEncodings(pdf);
HPDF_SetCurrentEncoder(pdf,"UTF-8");
detail_font_name = HPDF_LoadTTFontFromFile (pdf, "ttfont/arial.ttf", HPDF_TRUE);
/* add a new page object. */
page = HPDF_AddPage (pdf);
detail_font = HPDF_GetFont (pdf, detail_font_name, "UTF-8");
HPDF_Page_BeginText (page);
/* move the position of the text to top of the page. */
HPDF_Page_MoveTextPos(page, 10, 280);
HPDF_Page_SetFontAndSize (page, detail_font, 16);
HPDF_Page_MoveTextPos (page, 0, -20);
HPDF_Page_ShowText (page, "Об были вероломно программном чем");
It works for me, but it embeds the font into the PDF, so the PDF is too big. I want to know how to generate the PDF without embedding the font.
If I cannot use UTF-8, how can I get a PDF with Russian text?
Any Russian friends here?
Here is the same question, which got no answer:
utf8 in libharu: is embedding fonts really necessary?
" I want to know how to generate pdf without embedding the font."
detail_font_name = HPDF_LoadTTFontFromFile (pdf, "ttfont/arial.ttf", HPDF_FALSE);
That's it.
If you do not want to use UTF-8, you need to know the encoding of your text. For example, if the text is hard-coded in your source code, Visual Studio will very likely save it as UTF-8, in which case you have to use UTF-8.
If you load text dynamically, you must specify the correct encoding name for your text.
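
To illustrate why the encoding name has to match the bytes you actually pass in, here is a small Python sketch (not libharu code). It encodes the first word of the sample string both as UTF-8 and as CP1251. CP1251 is used only as an assumed example of a single-byte Cyrillic code page; the helper name decodes_as is hypothetical.

```python
text = "Об"  # first word of the sample string in the question

utf8_bytes = text.encode("utf-8")      # two bytes per Cyrillic letter
cp1251_bytes = text.encode("cp1251")   # one byte per Cyrillic letter


def decodes_as(raw: bytes, encoding: str) -> bool:
    """Return True if raw is a valid byte sequence in the given encoding."""
    try:
        raw.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# The same text yields different byte sequences, and the CP1251 bytes
# are not valid UTF-8, so a mismatched encoder name cannot work.
print(utf8_bytes.hex(), cp1251_bytes.hex())
print(decodes_as(cp1251_bytes, "utf-8"))
```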

WPF find all regex matches in a xps document

I need to search for an expression inside an XPS document and then list all matches (with the page number of each match).
I searched on Google, but found no reference or sample that addresses this issue.
So: how can I search an XPS document and get this information?
The first thing to note is that an XPS file is an Open Packaging package. It can be opened and the contents accessed via the System.IO.Packaging.Package class. This makes any operations on the contents much easier.
Here's an example of how to search the page content with a given regex, while also tracking which page the match occurs on.
// Namespaces this snippet relies on
using System;
using System.IO;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

var regex = new Regex(@"th\w+", RegexOptions.Compiled | RegexOptions.IgnoreCase | RegexOptions.Multiline);
using (var xps = System.IO.Packaging.Package.Open(@"C:\path\to\regex.oxps"))
{
    // Every page of the document is a separate package part
    var pages = xps.GetParts()
        .Where(p => p.ContentType == "application/vnd.ms-package.xps-fixedpage+xml")
        .ToList();
    for (var i = 0; i < pages.Count; i++)
    {
        var page = pages[i];
        using (var reader = new StreamReader(page.GetStream()))
        {
            // The regex runs over the page's raw XML markup
            var s = reader.ReadToEnd();
            var matches = regex.Matches(s);
            if (matches.Count > 0)
            {
                var matchText = matches
                    .Cast<Match>()
                    .Aggregate(new StringBuilder(), (agg, m) => agg.AppendFormat("{0} ", m.Value));
                Console.WriteLine("Found matches on page {0}: {1}", i + 1, matchText);
            }
        }
    }
}
It is not going to be as simple as you might have thought. XPS files are compressed (zipped) files containing a somewhat complex folder structure containing all the text, fonts, graphics and other items. You can use compression tools such as 7-Zip or WinZip etc. to extract the entire folder structure from an XPS file.
Having said that, you can use the following sequence of steps to do what you want:
1. Extract the contents of your XPS file programmatically into a temp folder. You can use the ZipFile class for this purpose if you're using .NET 4.5 or later.
2. The extracted folder will have the following structure:
_rels
Documents
    1
        _rels
        MetaData
        Pages
            _rels
        Resources
            Fonts
MetaData
3. Go to the Documents\1\Pages\ subfolder. Here you'll find one or more .fpage files, one for each page of your document. These files are in XML format and contain all the text of the page in a structured manner.
4. Use a simple loop to iterate through all the .fpage files, opening each of them with an XML reader such as XDocument or XmlDocument, and search for the required text in the node values using Regex.IsMatch(). If a match is found, record the page number in a List and move on.
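
The steps above can be sketched as follows. This is a rough Python illustration rather than .NET code, and it reads the archive in place instead of extracting it to a temp folder. It makes several assumptions: the page text is taken from the UnicodeString attributes in each .fpage file, the page number is taken from the file name (1.fpage, 2.fpage, ...), and the package holds a single document. The function name find_matches is hypothetical.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

# Assumption: page parts are named "<number>.fpage".
PAGE_NAME = re.compile(r"(\d+)\.fpage$", re.IGNORECASE)


def find_matches(xps_file, pattern):
    """Return [(page_number, [matched strings])] for pages with matches.

    xps_file may be a path or a binary file-like object.
    """
    regex = re.compile(pattern, re.IGNORECASE)
    results = []
    with zipfile.ZipFile(xps_file) as archive:
        pages = []
        for name in archive.namelist():
            found = PAGE_NAME.search(name)
            if found:
                pages.append((int(found.group(1)), name))
        # Sort numerically so that page 10 comes after page 2.
        for number, name in sorted(pages):
            root = ET.fromstring(archive.read(name))
            # Join the text of every element carrying a UnicodeString
            # attribute, whatever XML namespace the page uses.
            text = " ".join(
                el.get("UnicodeString")
                for el in root.iter()
                if el.get("UnicodeString")
            )
            hits = [m.group(0) for m in regex.finditer(text)]
            if hits:
                results.append((number, hits))
    return results
```

Because the text of each run is joined with spaces, a match that spans two separate text runs on a page would be missed or altered; handling that would need layout-aware logic.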

MD5 of file downloaded from database, from a JSONObject

My requirement is to compare the MD5 hashes of a file on the local disk and a file downloaded from a database.
The file is stored on SQL Server in a VARBINARY(MAX) column. The file can be any type. I'm currently testing with a PDF file. I get the file from the database using a HttpPost request. A JSONObject is built using the HttpResponse object. The JSONObject contains the file contents in binary format.
Now I have to compare the MD5 hash of the received binary data against the MD5 hash of the same file on disk. I have written the following code but the MD5 hashes do not match.
I think I'm going wrong in simply calculating the MD5 of the downloaded binary contents. Is there a correct way to do this? Thanks in advance.
// Read response from a HttpResponse object 'response'
BufferedReader reader = new BufferedReader(new InputStreamReader(response.getEntity().getContent()));
String line="";
StringBuilder sb = new StringBuilder();
while((line=reader.readLine())!=null) {
sb.append(line);
}
// construct jsonobject
JSONObject jsonResponse = new JSONObject(sb.toString());
//Read file from disk
FileInputStream fis = new FileInputStream(new File(this.getClass().getResource("C:\\demo.pdf").getPath()));
// Calculate MD5 of file read from disk
String md5Request = org.apache.commons.codec.digest.DigestUtils.md5Hex(fis);
// Calculate MD5 of binary contents. "binfile" is name of key in the JSONObject
// and binary contents of downloaded file are in its corresponding value field
String md5Response = org.apache.commons.codec.digest.DigestUtils.md5Hex(jsonResponse.getString("binfile"));
Assert.assertEquals("Hash sums of request and response must match", md5Request, md5Response);
When I debug, I see this value against the binfile key in the JSONObject 'jsonResponse'
binfile=[37,80,68,70,45,49,46,52,13,37,-30,-29,-49,-45,13,10,52,48...]
and what follows is a lengthy stream of binary data.
OK, in SQL there's a built-in function that looks like this:
select *,
convert(varchar(50),master.sys.fn_repl_hash_binary(a.BinaryField),2) as 'MD5Hash'
from SomeTable a
You give fn_repl_hash_binary the name of the binary field you're reading, and it returns the MD5 hash of that value. The "2" is the style argument of CONVERT: it formats the binary hash as a hex string without the leading "0x".
And in Java, you can use something like this:
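import java.math.BigInteger;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

private String getMD5Hash(byte[] bytes) throws NoSuchAlgorithmException {
    MessageDigest m = MessageDigest.getInstance("MD5");
    m.update(bytes, 0, bytes.length);
    // Pad to 32 hex digits; BigInteger.toString(16) alone drops leading zeros
    return String.format("%032x", new BigInteger(1, m.digest()));
}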
This should do the trick. Best of luck, CodeWarrior.
This is not a new post, but here is a possible solution: I faced this problem too in Python and ran a bunch of tests to find out how to solve it.
Since you treat all the data as binary, you need to open the file you compare against in binary mode.
My original code, which failed every time to produce the correct MD5 checksum:
with open(filepath, "r") as file_to_check:
    tile_file = file_to_check.read()
Corrected code:
with open(filepath, "rb") as file_to_check:
    tile_file = file_to_check.read()
Simply adding b (binary) after the r (read) flag tells Python to read the file as binary, and now it works.
This might help you find your problem. Hope it helps!
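
The debugger output in the question shows binfile as an array of signed byte values (for example -30 for 0xE2), so hashing its string form would hash the text of the array rather than the file's bytes. A possible approach, sketched in Python under that assumption, is to map the signed values back to raw bytes before hashing and to read the local file in binary mode. The function names are hypothetical; only the key name "binfile" comes from the question.

```python
import hashlib
import json


def signed_values_to_bytes(values):
    """Map Java-style signed bytes (-128..127) to a bytes object."""
    return bytes(v & 0xFF for v in values)


def md5_of_json_field(json_text, key="binfile"):
    """MD5 hex digest of the byte array stored under key in a JSON object."""
    values = json.loads(json_text)[key]
    return hashlib.md5(signed_values_to_bytes(values)).hexdigest()


def md5_of_file(path, chunk_size=65536):
    """MD5 hex digest of a file, read in binary mode in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

With these helpers, the comparison in the question would come down to checking md5_of_json_field(response_text) against md5_of_file(local_path).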

How to read a file in Groovy into a string?

I need to read a file from the file system and load the entire contents into a string in a groovy controller, what's the easiest way to do that?
String fileContents = new File('/path/to/file').text
If you need to specify the character encoding, use the following instead:
String fileContents = new File('/path/to/file').getText('UTF-8')
The shortest way is indeed just
String fileContents = new File('/path/to/file').text
but in this case you have no control over how the bytes in the file are interpreted as characters. AFAIK Groovy tries to guess the encoding here by looking at the file content.
If you want a specific character encoding you can specify a charset name with
String fileContents = new File('/path/to/file').getText('UTF-8')
See API docs on File.getText(String) for further reference.
A slight variation...
new File('/path/to/file').eachLine { line ->
println line
}
In my case new File() doesn't work, it causes a FileNotFoundException when run in a Jenkins pipeline job. The following code solved this, and is even easier in my opinion:
def fileContents = readFile "path/to/file"
I still don't understand this difference completely, but maybe it'll help someone else with the same trouble. Possibly the exception occurred because new File() refers to a file on the system that executes the Groovy code, which was a different system than the one containing the file I wanted to read.
The easiest way would be
new File(filename).getText()
which means you could just do:
new File(filename).text
Here are some other ways to do the same thing.
Read a file into a list of lines:
File file1 = new File("C:\\Build\\myfolder\\myTestfile.txt")
List<String> lines = file1.readLines()
Read the full file into a string:
File file1 = new File("C:\\Build\\myfolder\\myfile.txt")
String yourData = file1.getText()
Read a file line by line:
File file1 = new File("C:\\Build\\myfolder\\myTestfile.txt")
// read at most the first 30 lines; take() avoids an exception on shorter files
file1.readLines().take(30).each { line ->
    log.info line
}
Create a new file:
new File("C:\\Temp\\FileName.txt").createNewFile()

How do I get a temporary File object (of correct content-type, without writing to disk) directly from a ZipEntry (RubyZip, Paperclip, Rails 3)?

I'm currently trying to attach image files to a model directly from a zip file (i.e. without first saving them on a disk). It seems like there should be a clearer way of converting a ZipEntry to a Tempfile or File that can be stored in memory to be passed to another method or object that knows what to do with it.
Here's my code:
def extract (file = nil)
  Zip::ZipFile.open(file) { |zip_file|
    zip_file.each { |image|
      photo = self.photos.build
      # photo.image = image # this doesn't work
      # photo.image = File.open image # also doesn't work
      # photo.image = File.new image.filename
      photo.save
    }
  }
end
But the problem is that photo.image is an attachment (via paperclip) to the model, and assigning something as an attachment requires that something to be a File object. However, I cannot for the life of me figure out how to convert a ZipEntry to a File. The only way I've seen of opening or creating a File is to use a string to its path - meaning I have to extract the file to a location. Really, that just seems silly. Why can't I just extract the ZipEntry file to the output stream and convert it to a File there?
So the ultimate question: Can I extract a ZipEntry from a Zip file and turn it directly into a File object (or attach it directly as a Paperclip object)? Or am I stuck actually storing it on the hard drive before I can attach it, even though that version will be deleted in the end?
UPDATE
Thanks to blueberry fields, I think I'm a little closer to my solution. Here's the line of code that I added, and it gives me the Tempfile/File that I need:
photo.image = zip_file.get_output_stream image
However, my Photo object won't accept the file that's getting passed, since it's not an image/jpeg. In fact, checking the content_type of the file shows application/x-empty. I think this may be because getting the output stream seems to append a timestamp to the end of the file, so that it ends up looking like imagename.jpg20110203-20203-hukq0n. Edit: Also, the tempfile that it creates doesn't contain any data and is of size 0. So it's looking like this might not be the answer.
So, next question: does anyone know how to get this to give me an image/jpeg file?
UPDATE:
I've been playing around with this some more. It seems output stream is not the way to go, but rather an input stream (which is which has always kind of confused me). Using get_input_stream on the ZipEntry, I get the binary data in the file. I think now I just need to figure out how to get this into a Paperclip attachment (as a File object). I've tried pushing the ZipInputStream directly to the attachment, but of course, that doesn't work. I really find it hard to believe that no one has tried to cast an extracted ZipEntry as a File. Is there some reason that this would be considered bad programming practice? It seems to me like skipping the disk write for a temp file would be perfectly acceptable and supported in something like Zip archive management.
Anyway, the question still stands:
Is there a way of converting an Input Stream to a File object (or Tempfile)? Preferably without having to write to a disk.
Try this:
Zip::ZipFile.open(params[:avatar].path) do |zipfile|
  zipfile.each do |entry|
    filename = entry.name
    basename = File.basename(filename)
    tempfile = Tempfile.new(basename)
    tempfile.binmode
    tempfile.write entry.get_input_stream.read
    tempfile.rewind # so the data is read back from the start
    user = User.new
    user.avatar = {
      :tempfile => tempfile,
      :filename => filename
    }
    user.save
  end
end
Check out the get_input_stream and get_output_stream messages on ZipFile.
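
As a language-neutral illustration of the "no disk write" idea, the Python sketch below reads each zip entry into an in-memory, file-like buffer. It is an analogue of the technique, not RubyZip or Paperclip code; in Ruby, StringIO plays a similar role, and whether a given Paperclip version accepts such an object is not confirmed here. The function name entries_as_memory_files is hypothetical.

```python
import io
import zipfile


def entries_as_memory_files(zip_source):
    """Yield (name, file_like) for each file entry without touching disk.

    zip_source may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(zip_source) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # skip directory entries
            buffer = io.BytesIO(archive.read(info))
            # Some consumers infer the file type from a .name attribute.
            buffer.name = info.filename
            yield info.filename, buffer
```

Each buffer starts at position 0, so a consumer can read it exactly as it would read a freshly opened file.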

Resources