Parse text file as list of variables? - text-parsing

I have a text file (currently in CSV format) as follows:
RACE,"race_human"
GENDER,"male"
AGE,30
ALIGNMENT,"align_lawful_good"
DEITY,"Old Faith"
However, I want to interpret the text file as if it were a list of variables. I.e.:
var RACE:string = "race_human";
Is there an easy way to do this, for instance reformatting the text file in the native language used by the program code?

You could split each line on the comma then place each item into a dictionary (provided the keys are unique). Use Dictionary.TryGetValue if there's a chance that a key does not exist.
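// requires: using System; using System.IO; using System.Linq;
string[] input = File.ReadAllLines(csvFilePath); // csvFilePath: path to your CSV file
var dict = input.Select(s => s.Split(new[] { ',' }, 2)) // split on the first comma only
                .ToDictionary(s => s[0], s => s[1].Trim('"')); // drop the surrounding quotes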
// show alignment
string alignment = dict["ALIGNMENT"];
Console.WriteLine(alignment);
// show all values
foreach (var key in dict.Keys)
{
Console.WriteLine("{0}: {1}", key, dict[key]);
}
EDIT: you might be interested in FileHelpers to work with CSV files.
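For comparison, here is a minimal sketch of the same dictionary idea in Python rather than C#. It assumes every line is a KEY,value pair as in the question; the function names load_variables and coerce are made up for illustration. The csv module strips the surrounding quotes, and dict.get plays the role of TryGetValue for keys that may be missing.

```python
import csv
import io


def coerce(value):
    """Return value as an int when it parses as one, else leave it a string."""
    try:
        return int(value)
    except ValueError:
        return value


def load_variables(text):
    """Parse KEY,value lines into a dictionary keyed by variable name."""
    variables = {}
    for row in csv.reader(io.StringIO(text)):
        if len(row) != 2:
            continue  # skip blank or malformed lines
        key, value = row
        variables[key] = coerce(value)
    return variables


sample = (
    'RACE,"race_human"\n'
    'GENDER,"male"\n'
    'AGE,30\n'
    'ALIGNMENT,"align_lawful_good"\n'
    'DEITY,"Old Faith"\n'
)
variables = load_variables(sample)
print(variables["ALIGNMENT"])
# .get returns a fallback instead of raising when the key is absent
print(variables.get("CLASS", "unknown"))
```

Note that once the quotes are stripped, a quoted "30" and a bare 30 look the same, so this sketch converts both to integers.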

Related

Reading data from CSV file and converting data into multidimensional array in Swift

I'm new to Swift. I can read data (many rows and columns of names and mailing addresses) from a CSV file. I have several of these files, so I created a function that reads a file and extracts the data into a multidimensional array (names, addresses, city, state, country). I read each line from the file and try to append it to the multidimensional array, but I get errors: either index out of range or a type mismatch. What's the best way to do this? See the code below.
func getMailing(fileName: String) -> ([[String]])? {
let totalList = 243
var tempList: [String] = []
var arrayList = [[String]]()
guard let path = Bundle.main.url(forResource: fileName, withExtension: "csv") else {
print("File Error")
arrayList = [[""]]
return (arrayList)
}
do {
// get mailing data from file
let content = try String(contentsOf:path, encoding: String.Encoding.utf8)
// separate each line entry
tempList = content.components(separatedBy: "\r\n")
for index in 0...totalList - 1 {
// get each line from list and post into an array
let singleLine = tempList[index].components(separatedBy: ",").dropFirst().prefix(5)
// store each line data into into a multidimensional array for easy retrieval
arrayList[index].append(singleLine)
}
}
return (arrayList)
} catch {
print("File Error")
arrayList = [[""]]
return (arrayList)
}
}
Based on the code you've shown, it looks like you're indexing into two arrays 243 times without checking how many entries either of them holds. You have a loop set up to iterate based on your totalList constant, but I have no idea where that value came from. It would be wise to determine it programmatically if you can.
You're setting both tempList and arrayList as empty arrays:
var tempList: [String] = []
var arrayList = [[String]]()
But then you're going through a loop and trying to change the value of an entry that doesn't exist, hence your index out of range error. tempList is filled from the file before the loop starts, but arrayList is still empty, so arrayList[index] fails the first time through the loop because there is no entry at index 0. tempList[index] can fail the same way if the file yields fewer than 243 lines (for example, if its lines end in \n rather than \r\n, the whole file becomes a single entry). The type mismatch comes from append(singleLine): singleLine is an ArraySlice<String>, not a String, so it has to be added with append(contentsOf:). If you're going to loop through an array, it's always wise to do it based on the count of the array, or at least use a quick fix like this when you need to use an index from two different arrays:
// Iterate only as far as the shorter of the two arrays allows
let maxLoop = min(tempList.count, arrayList.count)
// 0..<maxLoop is an empty range when maxLoop is 0; 0...(maxLoop - 1) would crash in that case
for index in 0..<maxLoop {
    // get each line from the list and split it into fields
    let singleLine = tempList[index].components(separatedBy: ",").dropFirst().prefix(5)
    // store each line's fields in the multidimensional array for easy retrieval
    arrayList[index].append(contentsOf: singleLine)
}
Note that the little chunk of code above won't go through the loop even once while arrayList is still empty. Somewhere you need to take your mailing data and parse it so that it populates tempList and arrayList.
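To make the "loop over what is actually there" advice concrete, here is a rough sketch in Python rather than Swift. It assumes, as the question's code suggests, that the first column of each line is skipped and the next five are kept; parse_mailing and its parameters are hypothetical names. Each row is appended as a new list, so nothing is ever indexed before it exists.

```python
def parse_mailing(content, skip=1, keep=5):
    """Split CSV text into rows, keeping `keep` fields after the first `skip`."""
    rows = []
    # splitlines() handles both \r\n and \n line endings
    for line in content.splitlines():
        if not line.strip():
            continue  # ignore blank lines, e.g. a trailing newline
        fields = line.split(",")
        rows.append(fields[skip:skip + keep])
    return rows


sample = (
    "1,Ann,1 Main St,Springfield,IL,USA,extra\r\n"
    "2,Bob,2 Oak Ave,Salem,OR,USA\n"
)
mailing = parse_mailing(sample)
print(len(mailing))  # the row count comes from the data, not a constant
```

A plain split on "," does not handle quoted fields that contain commas; a real CSV parser would be needed for that.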

Sort an Array, populated by an IsolatedStorageFile text file seems to move the 0 entry

I have the code below that splits a text file from IsolatedStorage, populates an Array with the data, sorts it, and then assigns it as the source for a ListPicker:
var splitFile = fileData.Split(';');
string[] testArray = splitFile;
Array.Sort<string>(testArray);
testLocationPicker.ItemsSource = testArray;
However, it doesn't seem to be populating the array correctly, and the sorting doesn't appear to be working as expected either.
The testArray[0] is blank, when it should be populated. When the output is shown the entry that should be at [0] appears at the bottom.
BEFORE SORTING: (screenshot not included)
AFTER SORTING: (screenshot not included)
It's only in sorting the array that it seems to screw up the order.
UPDATE: I tried the suggested:
var splitFile = fileData.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries);
string[] testArray = splitFile;
Array.Sort<string>(testArray);
testLocationPicker.ItemsSource = testArray;
This still results in the second screenshot, above.
When the app first ever runs I do this:
StringBuilder sb = new StringBuilder(); // Use a StringBuilder to construct output.
var store = IsolatedStorageFile.GetUserStoreForApplication(); // Create a store
store.CreateDirectory("testLocations"); // Create a directory
IsolatedStorageFileStream rootFile = store.CreateFile("locations.txt"); // Create a file in the root.
rootFile.Close(); // Close File
string[] filesInTheRoot = store.GetFileNames(); // Store all files names in an array
Debug.WriteLine(filesInTheRoot[0]); // Show first file name retrieved (only one stored at the moment)
string filePath = "locations.txt";
if (store.FileExists(filePath)) {
Debug.WriteLine("Files Exists");
StreamWriter sw =
new StreamWriter(store.OpenFile(filePath,
FileMode.Open, FileAccess.Write));
Debug.WriteLine("Writing...");
sw.WriteLine("Chicago, IL;");
sw.WriteLine("Chicago, IL (Q);");
sw.WriteLine("Dulles, VA;");
sw.WriteLine("Dulles, VA (Q);");
sw.WriteLine("London, UK;");
sw.WriteLine("London, UK (Q);");
sw.WriteLine("San Jose, CA;");
sw.WriteLine("San Jose, CA (Q);");
sw.Close();
Debug.WriteLine("Writing complete");
}
Then when I add to the file I do this:
StringBuilder sb = new StringBuilder(); // Use a StringBuilder to construct output.
var store = IsolatedStorageFile.GetUserStoreForApplication(); // Create a store
string[] filesInTheRoot = store.GetFileNames(); // Store all files names in an array
Debug.WriteLine(filesInTheRoot[0]); // Show first file name retrieved (only one stored at the moment)
byte[] data = Encoding.UTF8.GetBytes(locationName + ";");
string filePath = "locations.txt";
if (store.FileExists(filePath))
{
using (var stream = new IsolatedStorageFileStream(filePath, FileMode.Append, store))
{
Debug.WriteLine("Writing...");
stream.Write(data, 0, data.Length); // Semi Colon required for location separation in text file
stream.Close();
Debug.WriteLine(locationName + "; added");
Debug.WriteLine("Writing complete");
}
}
I'm splitting on ";". Could this be the issue?
There's no problem with the sort: 'space' is considered to come before 'a', so it appears on top of the list. The real problem is: why do you have an empty entry to begin with?
My guess is that, when creating the file, you're separating every entry with ;, including the last one. Therefore, when parsing the data with the string.Split method, you're left with an empty entry at the end of your array.
An easy way to prevent that is to use an overload of the string.Split method that filters empty entries:
var splitFile = fileData.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries);
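As a rough illustration in Python, the sketch below splits on ";", trims each piece, and drops empty pieces before sorting. The trimming step is an addition to the answer above: the question's setup code writes each entry with WriteLine, which appends a line break after the ";", so entries may carry leading whitespace that an empty-entry filter alone would not remove. The name parse_locations is hypothetical.

```python
def parse_locations(file_data):
    """Split on ';', strip whitespace, drop empty entries, and sort."""
    entries = (part.strip() for part in file_data.split(";"))
    return sorted(entry for entry in entries if entry)


# Entries written one per line, each followed by ';' and a line break
file_data = "London, UK;\r\nChicago, IL;\r\nDulles, VA;\r\nChicago, IL (Q);\r\n"
locations = parse_locations(file_data)
for name in locations:
    print(name)
```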
I went a different way, using IsolatedStorageSettings and storing arrays/lists to do what I wanted.

google visualization data.addRow - get data out of html data attribute

I have data inside a data attribute, like so:
<div class="dashboard-module" data-rows="new Date(2013,10,04),12,"OR"##new Date(2013,10,17),2,"OR"##new Date(2013,10,09),2,"CA""></div>
I'm trying to split this string up and use it in the data.addRow function:
rows = el.data('rows');
rowsarray = rows.split('##');
// Error: Row given with size different than 3 (the number of columns in the table).
$.each(rowsarray, function(index, value) {
data.addRow( [value] );
});
// the following works
data.addRow([new Date(2013,10,04),12,"OR"]);
data.addRow([new Date(2013,10,09),2,"CA"]);
data.addRow([new Date(2013,12,12),14,"AL"]);
I guess the commas inside the new date are being counted as different parts of the array?
I'm assuming that the double-quotes inside your data-rows attribute are escaped (otherwise the HTML is malformed).
When you call rowsarray = rows.split('##');, you are getting an array of strings, like this:
[
'new Date(2013,10,04),12,"OR"',
'new Date(2013,10,17),2,"OR"',
'new Date(2013,10,09),2,"CA"'
]
not an array of arrays. If you want to store your data in an HTML attribute, your best bet is to use a JSON-compatible format. The problem then becomes storing dates, since Date objects are not JSON-compatible, but that is easy to work around. Store your data like this instead:
[["Date(2013,10,04)",12,"OR"],["Date(2013,10,17)",2,"OR"],["Date(2013,10,09)",2,"CA"]]
I did two things with the data-rows attribute. First, I changed the dates from a format like new Date(2013,10,17) to a string like "Date(2013,10,17)". Second, I converted the string to a JSON representation of an array of arrays (which uses the standard JavaScript array brackets [ and ]). Note that JSON requires double quotes around all internal strings, so you must either escape those double quotes inside the data-rows attribute or use single quotes around the attribute value (e.g. data-rows='<string>').
You can then parse that string for entry into your DataTable:
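// read the raw attribute text: jQuery's .data() already converts JSON-looking
// values, so JSON.parse would receive an array instead of a string
rows = JSON.parse(el.attr('data-rows'));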
// convert date strings to Date objects
for (var i = 0; i < rows.length; i++) {
var dateStr = rows[i][0];
var dateArr = dateStr.substring(5, dateStr.length - 1).split(',');
rows[i][0] = new Date(dateArr[0], dateArr[1], dateArr[2]);
}
data.addRows(rows);
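The same parsing step can be sketched outside the browser. The Python version below is only an illustration of the technique (parse the JSON, then turn each "Date(y,m,d)" string into a real date); parse_rows is a hypothetical name. One detail to keep in mind: JavaScript's Date constructor counts months from 0, so Date(2013,10,04) means 4 November 2013, and the sketch adds 1 to the month for Python's 1-based date type.

```python
import json
from datetime import date


def parse_rows(raw):
    """Parse a JSON array of rows and convert 'Date(y,m,d)' strings to dates."""
    rows = json.loads(raw)
    for row in rows:
        # strip the leading 'Date(' and the trailing ')'
        inner = row[0][len("Date("):-1]
        year, month, day = (int(part) for part in inner.split(","))
        # JavaScript months are zero-based; Python's are one-based
        row[0] = date(year, month + 1, day)
    return rows


raw = (
    '[["Date(2013,10,04)",12,"OR"],'
    '["Date(2013,10,17)",2,"OR"],'
    '["Date(2013,10,09)",2,"CA"]]'
)
rows = parse_rows(raw)
for row in rows:
    print(row)
```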

Adding a big data table from MS word to an array?

I am working on a database project, where I want to create an array as follows:
unsigned char* drillSize[][10] = {
{"0.3678", "23.222", "MN", "89000", ".000236", "678", "ZX", "8563", "LX", "0.678"},
{"0.3678", "23.222", "MN", "89000", ".000236", "678", "ZX", "8563", "LX", "0.678"},
.
.
.
//around 6000 rows }
I have been provided with this data in a Microsoft Word file. If I were to key in the data manually it might take weeks; is there a way to insert the commas and quotation marks for each element automatically?
You can try it with regular expressions.
If your data is structured in any way like a csv file you can just import it into your program in some way and then slice it up with basic string functions.
In PHP I'd do something like this:
$input = file_get_contents($myfile); // let's say this contains the text: 1,2,3,4,5
$sliced = explode(",", $input);
$myarray = null; // our variable
$output = "\$myarray = array("; // build the array declaration as a string
// some logic with the $sliced array if you need it
...
$output .= implode(",", $sliced); // put the values back together as a string
$output .= ");"; // close the array
eval($output); // after this, $myarray is in PHP's scope
print_r($myarray);
Basically that's what you need, in a more complex form. If the text is not that structured you might need a regular expression library for C, but I'd recommend generating the text with Python, Perl or another language that has better support for this and is more flexible, then copying the generated code back into your C source.
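As a sketch of that "generate the C code with a script" approach, the Python below turns delimited text into a C array initializer. It assumes the Word table has first been saved as plain text with one row per line and a known delimiter (a comma here); the function names, the const char * declaration, and the sample values are illustrative only.

```python
import csv
import io


def c_string(cell):
    """Quote one cell as a C string literal, escaping backslashes and quotes."""
    escaped = cell.strip().replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def to_c_initializer(text, name="drillSize", columns=10, delimiter=","):
    """Build a C array-of-strings initializer from delimited text."""
    lines = []
    for row in csv.reader(io.StringIO(text), delimiter=delimiter):
        if not row:
            continue  # skip blank lines
        if len(row) != columns:
            raise ValueError(f"expected {columns} columns, got {len(row)}: {row}")
        lines.append("    {" + ", ".join(c_string(cell) for cell in row) + "},")
    header = f"const char *{name}[][{columns}] = {{\n"
    return header + "\n".join(lines) + "\n};\n"


sample = "0.3678,23.222,MN\n.000236,678,ZX\n"
generated = to_c_initializer(sample, name="drillSize", columns=3)
print(generated)
```

The generated text can then be pasted into the C source or written to a header file. Raising an error on rows with the wrong column count is a deliberate choice, so that a badly exported row is caught before it reaches the compiler.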

Unzip from a SQL Server text column to an image column

I have images of various formats (.png, .jpg, .bmp, etc.) stored as compressed text in a text column in a SQL Server 2005 table. I need to read the row, unzip the image and store it in an image column in another table.
I am using the SharpZip library, and all of the examples deal with file sources and destinations. I can't find anything that covers unzipping from a variable to another variable. A code snippet illustrating this or a link to a relevant resource would be much appreciated.
EDIT: A bit more information - the data is stored in a TEXT column. It appears as follows (text column abbreviated for display):
ImageID ImageData
1 FORMAT-ZIPV3 UEsDBBQAAAAIAOV6wzxdTnDvshs...
2 FORMAT-ZIPV3 UEsDBBQAAAAIAAF2yjxGncjOLgA...
3 FORMAT-ZIPV3 UEsDBBQAAAAIAKd6yjyjnQNr6gg...
4 FORMAT-ZIPV3 UEsDBBQAAAAIALdNyzyrPC8EMJw...
5 FORMAT-ZIPV3 UEsDBBQAAAAIAA1rOD1nZY1t0f0...
6 FORMAT-ZIPV3 UEsDBBQAAAAIANZplj2seyJ+VmM...
7 FORMAT-ZIPV3 UEsDBBQAAAAIAC5vhD27LPbPcv8...
8 FORMAT-ZIPV3 UEsDBBQAAAAIAK1qKz5DJNH3xMg...
9 FORMAT-ZIPV3 UEsDBBQAAAAIAHVkEztC3th/9hs...
10 FORMAT-ZIPV3 UEsDBBQAAAAIAEtXKz7DXHUdvow...
What I know for certain is that the images were compressed at some point in the process using SharpZip before being inserted into the table. It appears that the format information was added to the beginning of the data prior to inserting.
Looking at this data, would anyone have any insight on how this image data has been manipulated? Again, I need to get the uncompressed image data into a column of a data type conducive to reading for display on a web page.
EDIT: Ok, I'm stumped. Executing the following code produces the error, "Failed to convert parameter value from a Int32 to a Byte[]". It appears to be placing the length of the byte array into the byte array's value...
commandUncompressed.Connection = connectionUncompressed;
commandUncompressed.Parameters.Add("#Image_k", SqlDbType.VarChar, 10);
commandUncompressed.Parameters.Add("#ImageContents", SqlDbType.Image);
commandUncompressed.CommandText = sqlSaveImage;
connectionUncompressed.Open();
reader = command.ExecuteReader();
if (reader.HasRows)
{
while (reader.Read())
{
Console.WriteLine(reader["Image_k"].ToString()); // Merely for testing
String format = reader["ImageContents_Compressed"].ToString().Substring(0, 12);
var offset = 13; //"FORMAT-ZIPV3 ".Length;
var s = reader["ImageContents_Compressed"].ToString().Substring(offset);
var bytes = Convert.FromBase64String(s);
if (format == "FORMAT-ZIPV2 ")
{
bytes = ConvertStringToBytes(s); // Not a Base-64 encoded string? External conversion function utilized.
}
using (var zis = new ZipInputStream(new MemoryStream(bytes)))
{
ZipEntry zipEntry = zis.GetNextEntry(); // Doesn't seem to work unless an entry has been referenced
byte[] buffer = new byte[zis.Length];
commandUncompressed.Parameters["#Image_k"].Value = reader["Image_k"].ToString();
commandUncompressed.Parameters["#ImageContents"].Value = zis.Read(buffer, 0, buffer.Length);
commandUncompressed.ExecuteNonQuery();
}
}
}
It appears to be reading the data from the source text column just fine. I just cannot figure out how to get that into the image type parameter. The value for buffer variable shows the length of the byte array, rather than the actual bytes. Maybe that's what the value property typically shows for byte arrays? I'm so close and yet so far away. :/
EDIT: Ok, I'm a knucklehead. I made the following correction, and it works!
zis.Read(buffer, 0, buffer.Length);
commandUncompressed.Parameters["#ImageContents"].Value = buffer;
At this point I am only able to process FORMAT-ZIPV3 data, as I haven't figured out how to decode the FORMAT-ZIPV2 strings yet. Following is a sampling of the V2 data. If anyone is able to determine the encoding, let me know. Would it be different if zipped using BZIP instead of ZIP format?
ImageID ImageData
1 FORMAT-ZIPV2 504B03041400020008005157422A2E25FDBAF26701008D6901000E...
2 FORMAT-ZIPV2 504B03041400020008009159422A7FC94BA2B2540500D35705000E...
3 FORMAT-ZIPV2 504B0304140002000800685A422A0CAA51F4473A0600B97206000E...
4 FORMAT-ZIPV2 504B03041400020008001D5D422A770BD3ED201902002C4A02000E...
5 FORMAT-ZIPV2 504B0304140002000800325E422A4B6C2FB4045001001C6E01000E...
6 FORMAT-ZIPV2 504B03041400020008006F72422A5F793AC1A1F00200ECF302000E...
7 FORMAT-ZIPV2 504B0304140002000800D572422A1B348A731DE5000085EB00000E...
8 FORMAT-ZIPV2 504B03041400020008003D73422A8AEBB7F855640300DD1B04000E...
9 FORMAT-ZIPV2 504B03041400020008006368D528C5D0A6BA794900004A2502000E...
10 FORMAT-ZIPV2 504B03041400020008008E5B6C2A2D9E9C33D7AF05005CEC05000E...
In response to a similar question, someone on sqlmonster.com provided a nifty VarBinaryStream class. It works with a column type of varbinary(max).
If your data is stored in a varbinary(max), and is in zip format, you could use that class to instantiate a VarBinaryStream, then instantiate a ZipInputStream around that, and ba-da-boom, you're there. Just read from the ZipInputStream.
In C# it might look like this
using (var imageSrc = new VarBinarySource(connection,
"Table.Name",
"Column",
"KeyColName",
1))
{
using (var s = new VarBinaryStream(imageSrc))
{
using(var zis = new ZipInputStream(s))
{
....
}
}
}
If the images are small, then you probably wouldn't want all this streaming stuff. If the column is a binary(n) or a varbinary(n) where n is less than 8000, just use the SqlBinary type and read in all the data into memory, then instantiate a MemoryStream around that. Simpler. In VB.NET it looks something like this:
Dim bytes As Byte()
bytes = dr.GetSqlBinary(columnNumber).Value
Using ms As New MemoryStream(bytes)
Using zis As New ZipInputStream(ms)
...
End Using
End Using
Finally, I'm going to question the wisdom of applying zip compression to .jpg images, and similar. The jpg format is already compressed; compressing it again before putting the data into SQL Server won't cause the data to become appreciably smaller. It only increases processing time. If possible, I'd suggest you reconsider your design for storage of compressed images.
OK, with the update you provided, containing the data format, you can draw some conclusions.
The data is an actual string. Suspecting that it is a Base64-encoded string, I did a small test and used Convert.ToBase64String() on a byte stream that contains a zip file. It looks like this: UEsDBBQAAAAIAJJyYyk3M56F+QIAA...
Aha! You have a base64-encoded (string) version of the byte data for a bona fide zip file. To decode it, strip the prefix and then use FromBase64String() to get the byte array, insert it into a MemoryStream, then read it with ZipInputStream.
Something like this:
var offset = "FORMAT-ZIPV3 ".Length;
var s = sqlReader["CompressedImage"].ToString().Substring(offset);
var bytes = Convert.FromBase64String(s);
using (var zis = new ZipInputStream(new MemoryStream(bytes)))
{
...
zis.Read(...);
...
}
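The same decode-then-unzip sequence can be tried end to end in a few lines of Python, which may help when checking what the stored text actually contains. This is only a sketch using Python's standard base64 and zipfile modules instead of SharpZip; unzip_first_entry is a hypothetical name, and the sample archive is built in memory purely for demonstration.

```python
import base64
import io
import zipfile

PREFIX = "FORMAT-ZIPV3 "


def unzip_first_entry(text_value):
    """Strip the prefix, base64-decode, and return the first zip entry's bytes."""
    if not text_value.startswith(PREFIX):
        raise ValueError("unexpected format prefix")
    zipped = base64.b64decode(text_value[len(PREFIX):])
    # BytesIO lets zipfile read the archive from memory, with no file on disk
    with zipfile.ZipFile(io.BytesIO(zipped)) as archive:
        first_name = archive.namelist()[0]
        return archive.read(first_name)


# Build a stand-in for one stored value: zip some bytes, then base64 them
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("image.png", b"\x89PNG stand-in image bytes")
stored = PREFIX + base64.b64encode(buffer.getvalue()).decode("ascii")

image_bytes = unzip_first_entry(stored)
print(len(image_bytes))
```

The returned bytes are the uncompressed image, which is what would go into the image column.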
If the data is "really long", you're going to want to stream it out of that table rather than read it into one big string and convert it. I don't know how large text columns can be, but supposing one could hold 500 MB, you don't want a 500 MB string, and you don't want to convert a 500 MB string with Convert.FromBase64String(). In that case you need to use a Base64Stream, or the FromBase64Transform class in the System.Security.Cryptography namespace.
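To show what streaming the conversion means in practice, here is a small Python sketch that decodes base64 text in pieces instead of all at once. It is an illustration of the idea, not the .NET classes named above: decode_base64_stream is a hypothetical name, and the sketch assumes the text contains no embedded whitespace or line breaks. Base64 encodes data in groups of four characters, so the loop only decodes whole groups and carries any leftover characters into the next read.

```python
import base64
import io


def decode_base64_stream(reader, writer, chunk_chars=4096):
    """Decode base64 text from reader into writer without loading it all."""
    pending = ""
    while True:
        chunk = reader.read(chunk_chars)
        if not chunk:
            break
        pending += chunk
        # decode only complete 4-character groups; keep the remainder
        usable = len(pending) - len(pending) % 4
        writer.write(base64.b64decode(pending[:usable]))
        pending = pending[usable:]
    if pending:
        raise ValueError("base64 text ended in the middle of a group")


payload = bytes(range(256)) * 10
encoded = base64.b64encode(payload).decode("ascii")

source = io.StringIO(encoded)
target = io.BytesIO()
# a deliberately awkward chunk size to exercise the carry-over logic
decode_base64_stream(source, target, chunk_chars=10)
print(target.getvalue() == payload)
```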
Editorial comment. It is sort of backwards to zip-compress image data. The images are probably compressed already. But to compound that backwardsness by then doing a base64 encode, thereby expanding the data... ??? That is triple backwards. That makes noooooo sense at all. I understand that's how your vendor supplied it.
OK, with your further update, using this as the format:
ImageID ImageData
1 FORMAT-ZIPV2 504B03041400020008005157422A2E25FDBAF26701008D6901000E...
2 FORMAT-ZIPV2 504B03041400020008009159422A7FC94BA2B2540500D35705000E...
That data is still zipfile data, but it is encoded as simple hex digits. You need to convert that to a byte array. Here's some code to do it.
public static class ConvertEx
{
static readonly String prefix= "FORMAT-ZIPV2 ";
public static string ToHexString(byte[] b)
{
System.Text.StringBuilder sb1 = new System.Text.StringBuilder();
int i = 0;
for (i = 0; i < b.Length; i++)
{
sb1.Append(System.String.Format("{0:X2}", b[i]));
}
return sb1.ToString().ToLower();
}
public static byte[] ToByteArray(string s)
{
if (s.StartsWith(prefix))
{
System.Console.WriteLine("removing prefix");
s = s.Substring(prefix.Length);
}
s= s.Trim(); // whitespace
System.Console.WriteLine("length: {0}", s.Length);
var r= new byte[s.Length/2];
for (int i = 0; i < s.Length; i+=2)
{
r[i/2] = (byte) Convert.ToUInt32(s.Substring(i,2), 16);
}
return r;
}
}
You can use that this way:
string s = GetStringContentFromDatabase();
var decoded = ConvertEx.ToByteArray(s);
using (var ms = new MemoryStream(decoded))
{
// use DotNetZip to read the zip file
// SharpZipLib is something similar...
using (var zip = ZipFile.Read(ms))
{
// print out the list of entries in the zipfile
foreach (var e in zip)
{
System.Console.WriteLine("{0}", e.FileName);
}
}
}
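For a quick check of the hex interpretation, the conversion can also be sketched in Python, where bytes.fromhex does the digit-pair decoding. The sample values begin with 504B0304, which is the hex form of the "PK" signature at the start of a zip archive, so the V2 data is still ZIP rather than BZIP. The function name hex_text_to_bytes and the in-memory sample archive are made up for this illustration.

```python
import io
import zipfile

PREFIX = "FORMAT-ZIPV2 "


def hex_text_to_bytes(text_value):
    """Strip the optional prefix and convert pairs of hex digits to bytes."""
    s = text_value.strip()
    if s.startswith(PREFIX):
        s = s[len(PREFIX):]
    # bytes.fromhex accepts upper- or lower-case digits
    return bytes.fromhex(s.strip())


# Build a stand-in for one stored value: zip some bytes, then hex-encode them
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("image.bmp", b"BM stand-in image bytes")
stored = PREFIX + buffer.getvalue().hex().upper()

decoded = hex_text_to_bytes(stored)
with zipfile.ZipFile(io.BytesIO(decoded)) as archive:
    # list the entries, as the C# example above does
    for name in archive.namelist():
        print(name)
    first_entry = archive.read(archive.namelist()[0])
```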
The examples on the SharpZip Wiki use Stream objects - while the sample does use a File, you could easily use a MemoryStream object here and the sample would work the same.
