Writing data into RandomAccessFile

In my project I need to create a file for each student, and I think I have the method created; here it is below:
public void addStudent(String fullName, int grn, String formClass, String formTeacher) throws IOException
{
    // Default values
    int creativity = 0;
    int action = 0;
    int service = 0;
    int total = 0;
    // Initialize file and append at the end
    RandomAccessFile adding = new RandomAccessFile(new File(fullName + ".dat"), "rw");
    long fileSize = adding.length();
    adding.seek(fileSize);
    // Variables from method parameters
    adding.writeUTF(fullName + "\n");
    adding.writeInt(grn);            // writeInt takes an int, not a String
    adding.writeUTF(formClass + "\n");
    adding.writeUTF(formTeacher + "\n");
    // Variables created in method
    adding.writeInt(creativity);
    adding.writeInt(action);
    adding.writeInt(service);
    adding.writeInt(total);
    adding.close();
}
I just keep thinking that it's not right and would like some clarification about certain parts, such as this line:
RandomAccessFile adding = new RandomAccessFile(new File(fullName + ".dat"), "rw");
fullName is a variable that is passed into the method; it is the name and surname of a student (e.g. John Lennon). What I want to do is have the file named "John Lennon.dat". However, I keep thinking my approach here is wrong.
Another question is about the integer values. They will be updated from time to time, but by simple addition of current + new. How do I do that?

You have to be careful if you use possible user input (fullName) unfiltered for naming your files. This can lead to a security hole. You should check fullName for special characters which are not allowed in your file system or which would change your directory. Imagine someone could input ../importantfile as fullName; without checking, it would be possible to overwrite important files in other directories.
The safest way is to use a generic naming scheme for your files (like data1.dat, data2.dat) and to store the mapping from fullName to filename in another place (maybe a file index.dat).
I assume you have a good reason to use a RandomAccessFile here. According to your code it is possible to have more than one record in one file. If you do not store each record's starting position somewhere else, then you have to read one record after another. Once you have found the record to change, you have to read in all fields before your integer value so that the file position points to the start of the integer. Then you can read the integer, compute the new value, move the file position 4 bytes back (note that seek() takes an absolute position, so use seek(getFilePointer() - 4)), and write the modified integer value.
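A minimal sketch of that read/seek-back/write sequence (the file name, the record layout, and the helper name addToIntAt are made up for illustration):

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

public class InPlaceIntUpdate {
    // Adds delta to the int stored at the current file position,
    // leaving the position just after the rewritten value.
    static void addToIntAt(RandomAccessFile raf, int delta) throws IOException {
        int current = raf.readInt();          // position advances 4 bytes
        raf.seek(raf.getFilePointer() - 4);   // move back over the int
        raf.writeInt(current + delta);        // overwrite in place
    }

    public static void main(String[] args) throws IOException {
        new File("demo.dat").delete();        // start from a clean demo file
        RandomAccessFile raf = new RandomAccessFile("demo.dat", "rw");
        raf.writeUTF("John Lennon");          // a string field before the int
        raf.writeInt(10);                     // e.g. the creativity score
        // To update: re-read the string fields to position past them, then patch the int.
        raf.seek(0);
        raf.readUTF();
        addToIntAt(raf, 5);                   // current + new
        raf.seek(0);
        raf.readUTF();
        System.out.println(raf.readInt());    // prints 15
        raf.close();
    }
}
```

Note that readUTF() is needed to skip the string because writeUTF stores a length prefix, so string fields have variable size.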
Alternatives:
You can read the whole file in, modify the integer values, and then write the whole file out again, overwriting the old one. For short files this is less complex, without a significant performance penalty (but this depends; if you have a thousand files to change in a short time, this alternative is not recommended).
You can store the file positions of your integer values in another place and use these to directly access these values. This only works if your strings are immutable.
You can use an alternative file format like XML, JSON or serialized objects. But none of these supports in-place changes.
You can use an embedded database like SQLite or H2 and let the database care about file access and indexing.
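The first alternative (read everything in, modify in memory, rewrite the file) could look roughly like this for a single record, assuming the field order from the question's addStudent method; the names addCreativity and student.dat are made up for the example:

```java
import java.io.*;

public class RewriteRecord {
    // Reads one record laid out as in the question, adds points to the
    // creativity field, and rewrites the whole file from scratch.
    static void addCreativity(File f, int extra) throws IOException {
        String name, formClass, formTeacher;
        int grn, creativity, action, service, total;
        try (DataInputStream in = new DataInputStream(new FileInputStream(f))) {
            name = in.readUTF();
            grn = in.readInt();
            formClass = in.readUTF();
            formTeacher = in.readUTF();
            creativity = in.readInt();
            action = in.readInt();
            service = in.readInt();
            total = in.readInt();
        }
        creativity += extra;                    // current + new
        total = creativity + action + service;  // keep the total consistent
        try (DataOutputStream out = new DataOutputStream(new FileOutputStream(f))) {
            out.writeUTF(name);
            out.writeInt(grn);
            out.writeUTF(formClass);
            out.writeUTF(formTeacher);
            out.writeInt(creativity);
            out.writeInt(action);
            out.writeInt(service);
            out.writeInt(total);
        }
    }

    public static void main(String[] args) throws IOException {
        File f = new File("student.dat");
        try (DataOutputStream out = new DataOutputStream(new FileOutputStream(f))) {
            out.writeUTF("John Lennon");
            out.writeInt(1234);
            out.writeUTF("4B");
            out.writeUTF("Ms. Smith");
            out.writeInt(10); out.writeInt(2); out.writeInt(3); out.writeInt(15);
        }
        addCreativity(f, 5);
        try (DataInputStream in = new DataInputStream(new FileInputStream(f))) {
            in.readUTF(); in.readInt(); in.readUTF(); in.readUTF();
            System.out.println(in.readInt()); // creativity: prints 15
        }
    }
}
```

Since the whole record is rewritten, the string fields are free to change length, which the in-place approach cannot handle.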

Related

Can you embed blob data in a script for sql server?

I'm testing the retrieval of blobs by a web application.
There are some difficulties uploading blobs programmatically from the JavaScript code, so I decided to prepopulate the database with some data instead. However, I'm running into some problems with that as well.
We have a database versioning process that expects all the schema + data for the database to be in scripts that can be run by sqlcmd.
This post seems to show how to insert blobs. However this script requires that you specify an absolute path to a file on the server.
Is there another way? We are using source control and continuous integration and so wouldn't ever really want to refer to a file in a specific place outside a given copy of a repository on one machine.
If not, it seems like there are two options:
Take the hit and never change or delete anything from a random directory on the database server as well. The data will need to be split between several locations. Furthermore, we either ban blobs from production config deployment or just have to bear in mind we have to do something crazy if we ever need them; we won't be in control of the directory structure on a remote server. This probably won't be a massive problem, to be fair: I can't see us wanting to ship any config in blob form really.
or
write a program that does something crazy, like remotely create a temporary directory on the server, copy the file there at the correct version, and output a script with that filename in it.
It doesn't really seem like having things under source control and not wanting to hardcode paths is exactly an outlandish scenario, but the poor quality of database tools stopped surprising me a while ago!
Assuming you are referring to a field of type BINARY / VARBINARY / IMAGE, you should be able to just specify the Hex Bytes such as:
0x0012FD...
For example:
INSERT INTO TableName (IDField, BlobField) VALUES (1, 0x0012FD);
You just need to get that string of hex digits from the file. If you already have such a value in the DB, then just select that row and field in SSMS, and copy/paste the value from the cell (in "Results to Grid" mode) into your SQL script.
You can also wrap long lines using a backslash as follows:
INSERT INTO TableName (IDField, BlobField) VALUES (1, 0x0012FD\
12B36D98\
D523);
If wrapping via backslash, be sure to start each new line at the first character position, as the entire thing is treated as one continuous string. Indenting the lines that come immediately after a backslash would put spaces between the hex digits, which is not valid. For example:
INSERT INTO TableName (IDField, BlobField) VALUES (1, 0x0012FD\
    12B36D98\
    D523);
equates to:
0x0012FD 12B36D98D523
If you have access to C#, here's a function that I've used that will take a binary blob and spit out a SQL script that sets a varbinary(max) variable to the contents of the blob. It will format it nicely and take into account length restrictions on SQL statements (which can be an issue with very large blobs). So basically it will output something like:
select @varname = 0x4d5a90000300000004000000ffff0000b8000000000000 +
0x0040000000000000000000000000000000000000000000000000000000000000 +
0x0000000000800000000e1fba0e00b409cd21b8014ccd21546869732070726f67 +
...
0x007365745f4d6574686f64007365745f53656e644368756e6b65640053747265;
select @varname = @varname + 0x616d004765745265717565737453747265 +
0x616d0053797374656d2e5465787400456e636f64696e6700476574456e636f64 +
...
You just have to make sure to declare the variable at the front of the script it gives you. You could build a little utility that runs this function on a file (or wherever your blobs come from) to help in creating your scripts.
public static string EncodeBinary(string variable, byte[] binary)
{
    StringBuilder result;
    int column;
    int concats;
    bool newLine;

    if (binary.Length == 0)
    {
        return "select " + variable + " = null;";
    }
    result = new StringBuilder("select ");
    result.Append(variable);
    result.Append(" = 0x");
    column = 12 + variable.Length;
    concats = 0;
    for (int i = 0; i < binary.Length; i++)
    {
        newLine = false;
        if (column > 64)
        {
            concats++;
            newLine = true;
        }
        if (newLine)
        {
            if (concats == 64)
            {
                result.Append(";\r\nselect ");
                result.Append(variable);
                result.Append(" = ");
                result.Append(variable);
                result.Append(" + 0x");
                column = 15 + variable.Length * 2;
                concats = 1;
            }
            else
            {
                result.Append(" +\r\n0x");
                column = 2;
            }
        }
        result.Append(binary[i].ToString("x2"));
        column += 2;
    }
    result.Append(";\r\n");
    return result.ToString();
}

Manipulating (read/write) data of a file from a program in non-binary mode

I am in the process of making a school database system for a project. The requirements prohibit me from using binary modes (i.e. "rb"). The problem, though, is that one of the seven options, option 1, requires me to load all the files. Let's say I use option 3, which adds a student record (student ID and student name) to the file; when I load all the files again in option 1, the program is meant to recognize the information already in the file. So if I then use option 3 one more time, the program should prevent me from inputting the same student ID when adding a new student record (as that is one of the project restrictions). As far as I know (which is not much), manipulating file data in binary mode makes this easier because you can manipulate blocks of data. I'm not entirely familiar with this and am hoping someone could suggest some functions I could use. More specifically, from what I understand the advantage of binary mode is that it is easy to calculate the offset, so perhaps I need to be able to manipulate the offset in non-binary mode? If so, could anyone enlighten me?
In short, it's a text file?
Then, you should never try to "update" the file. Just read it all (parsing it in the process), update the parsed data, and overwrite the file.
This can be done with the "create new, then delete old and rename new" method, which has the advantage of being able to do it without loading the whole file in memory (just read one record from the input file, update it when appropriate, and write it in the output stream, rinse, repeat).
#include <stdio.h>

struct Record { ... };
int ReadAndParseRecord(FILE*, struct Record*);   /* will contain calls to fgets(), fscanf() */
void WriteRecord(FILE*, struct Record const *);  /* will contain calls to fputs(), fprintf() */
void DoWorkOnOneRecord(struct Record *);

void DoWork(char const *filenameIn, char const *filenameOut)
{
    FILE* fileIn = fopen(filenameIn, "r");
    FILE* fileOut = fopen(filenameOut, "w");
    struct Record currentRecord;
    while (ReadAndParseRecord(fileIn, &currentRecord))
    {
        DoWorkOnOneRecord(&currentRecord);
        WriteRecord(fileOut, &currentRecord);
    }
    /* If there are any records to add, do it here */
    /* ... */
    fclose(fileIn);
    fclose(fileOut);
}
And when you have to update the file, you DoWork() into a new file, then delete the old one and rename the new one (you can also have a backup of the old one, in which case change the sequence to "delete the backup, rename the old one, rename the new one").

open text file, modify text, place into sql database with groovy

I have a text file containing a large grouping of numbers (a 137 MB text file) and am looking to use Groovy to open the text file, read it line by line, modify the numbers, and then place them into a database (as strings). There will be two items per line that need to be written to separate, related database columns.
My text file looks as such:
A.12345
A.14553
A.26343
B.23524
C.43633
C.23525
So the flow would be:
Step 1. The file is opened.
Step 2. Line 1 is read.
Step 3. Line 1 is split into a letter/number pair at the ".".
Step 4. The number is divided by 10.
Step 5. The letter is written to the letter database (as a string).
Step 6. The number is written to the number database (as a string).
Step 7. The letter:number pair is also written to a separate comma-separated text file.
Step 8. Proceed to the next line (line 2).
Output text file should look like this:
A,1234.5
A,1455.3
A,2634.3
B,2352.4
C,4363.3
C,2352.5
Database for numbers should look like this:
1:1234.5
2:1455.3
3:2634.3
4:2352.4
5:4363.3
6:2352.5
*lead numbers are database index locations, for relational purpose
Database for letters should look like this:
1:A
2:A
3:A
4:B
5:C
6:C
*lead numbers are database index locations, for relational purpose
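The per-line transformation in steps 3 and 4 can be sketched in plain Java (Groovy would be nearly identical). One detail worth noting: split() takes a regular expression, so a literal dot has to be escaped; this is also why a bare split(".") fails.

```java
public class SplitAndScale {
    // Turns "A.12345" into the pair ("A", "1234.5"): split on the literal
    // dot, then divide the numeric part by 10.
    static String[] transform(String line) {
        String[] parts = line.split("\\.", 2);  // "." must be escaped: split takes a regex
        double scaled = Integer.parseInt(parts[1]) / 10.0;
        return new String[] { parts[0], String.valueOf(scaled) };
    }

    public static void main(String[] args) {
        String[] r = transform("A.12345");
        System.out.println(r[0] + "," + r[1]); // prints A,1234.5
    }
}
```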
I have been able to do most of this; the issue I am running into is not being able to use the .eachLine { line -> } function correctly... and I have NO clue how to do the database output or the persistence thing.
There is one more thing I am quite dense about, and that is the case where the script encounters an error. The text file has TONS of entries (around 9,000,000), so I am wondering if there is a way to make it so that if the script fails or anything happens, I can restart the script from the last modified line.
Meaning, if the script has an error (my computer gets shut down somehow) and stops running at line 125122 of the text file (after completing the modification of line 125122), how do I make it so that when I start the script a second time it resumes at line 125123?
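One simple approach to restartability is to keep a small progress file holding the count of lines already processed, and skip that many lines on startup. A sketch in Java (the file names, and the small demo input standing in for the 137 MB file, are hypothetical):

```java
import java.io.*;
import java.nio.file.*;

public class Restartable {
    // Returns the number of lines already processed in a previous run.
    static long readCheckpoint(Path p) throws IOException {
        if (!Files.exists(p)) return 0;
        return Long.parseLong(new String(Files.readAllBytes(p)).trim());
    }

    public static void main(String[] args) throws IOException {
        // Demo setup: a tiny input file in place of the real one.
        Path data = Paths.get("file.txt");
        Files.write(data, "A.12345\nA.14553\nB.23524\n".getBytes());
        Path checkpoint = Paths.get("file.txt.progress");
        Files.deleteIfExists(checkpoint);

        long done = readCheckpoint(checkpoint);
        long processed = 0;
        try (BufferedReader in = Files.newBufferedReader(data)) {
            String line;
            long lineNo = 0;
            while ((line = in.readLine()) != null) {
                lineNo++;
                if (lineNo <= done) continue;  // skip lines handled in an earlier run
                // ... modify the line and write it out here ...
                processed++;
                // Persist progress after each line; a crash loses at most one line.
                Files.write(checkpoint, String.valueOf(lineNo).getBytes());
            }
        }
        System.out.println("processed " + processed + " lines");
        System.out.println("resume from line " + (readCheckpoint(checkpoint) + 1));
    }
}
```

Rewriting the checkpoint after every line is the safest but slowest option; updating it every N lines trades a little rework after a crash for much less I/O.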
Here is my sample code so far:
//openfile
myFile = new File("C:\\file.txt")
//set fileline to target
printFileLine = { it }
//set target to argument
numArg = myFile.eachLine( printFileLine )
//set argument to array split at "."
numArray = numArg.split(".")
//set int array for numbers after the first char, which is a letter
def intArray = numArray[2] { it as int } as int
//set string array for numbers after the first char, which is a letter
def letArray = numArray[1] { it as string }
//No clue how to write to a database or file... or do the persistence thing.
Any help would be appreciated.
I would use a loop to cycle over every line within the text file, I would also use Java methods for manipulating strings.
def file = new File('C:\\file.txt')
StringBuilder sb = new StringBuilder();
file.eachLine { line ->
    // reset the StringBuilder to the new line
    sb.setLength(0);
    sb.append(line);
    // format the string: "A.12345" -> "A,1234.5"
    sb.setCharAt(1, ',');
    sb.insert(6, '.');
}
You could then write each line to a new text file, example here. You could use a simple counter (e.g. counter = 0; and then counter++;) to store the latest line that has been read/written, and use that if an error occurs. You could also catch possible errors within a try/catch statement if you are regularly getting crashes.
This guide should give you a good start with working with a database (presuming SQL).
Warning, all of this code is untested and should hopefully give you more direction. There are probably many other ways to solve this differently, so keep an open mind.

SAS dataset fieldtype num to char with current format

I have currently a dataset which contains the variables I need together with the needed formats.
Now I am using the getvarc() function (among others) in a loop to get those variables to write to a file.
The first problem occurring, of course, is that some variables are not char type but num type. I could use getvarn() to retrieve those, but then the format goes to waste, and I really need it.
E.g. a date in num type: 18750, with format yymmdd10., thus looking like 2011-05-03.
Using value = getvarn() for this field would retrieve 18750. If I then want to output (using PUT) this value, it would not give me the 2011-05-03 date that I want.
So now I am looking for a better way to do this.
My first option is to use the sashelp.vcolumn data set, to retrieve the format on this field, and use it in the put statement to output in the right format.
My second option is to convert the data sets containing these fields (in reality it is about multiple data sets), converting all num type fields to char type while retaining the right format.
Which option should I go with?
And in case of the second option, can this be done in a generic way (as I said, this is not about one variable only) ?
EDIT:
After Cmjohn's answer I found a way to fix my problem: not using sashelp.vcolumn but using the varfmt function.
So a short explanation of what I am doing in my code:
data _null_;
  set dataset1;
  file file1;
  if somefield = x then do;
    dsid = open(datasetfield, i);
    rc = fetchobs(dsid);
    varnum = varnum(dsid, anotherfield);
    **varfmt = varfmt(dsid, anotherfield);
    if vartype(dsid, anotherfield) = 'N' then do;**
      value = getvarn(dsid, varnum);
      **value_formatted = putn(value, varfmt);**
    **end;
    else do;**
      value = getvarc(dsid, varnum);
      **value_formatted = putc(value, varfmt);
    end;**
    put value_formatted;
  end;
run;
So this is in general (quickly out of my head) what I am doing now. The bold part of the solution is what I came up with after Cmjohn's response. That answers my first question, 'How to do it?'.
But my added question is: what would be most efficient in the long run, keeping this process, or restructuring my data sets so that all data can be read with getvarc alone, without the type check and the varfmt() call?
I'd suggest that you use the vvalue function, or any of the other techniques provided in the answers to this question to put formatted values to a file.

How to load text file into sort of table-like variable in Lua?

I need to load a file into Lua variables.
Let's say I've got:
name address email
There is a space between each field. I need a text file with any number of such lines to be loaded into some kind of object, or at least each line cut into an array of strings divided by spaces.
Is this kind of job possible in Lua, and how should I do it? I'm pretty new to Lua, but I couldn't find anything relevant on the Internet.
You want to read about Lua patterns, which are part of the string library. Here's an example function (not tested):
function read_addresses(filename)
  local database = { }
  for l in io.lines(filename) do
    local n, a, e = l:match '(%S+)%s+(%S+)%s+(%S+)'
    table.insert(database, { name = n, address = a, email = e })
  end
  return database
end
This function just grabs three substrings made up of nonspace (%S) characters. A real function would have some error checking to make sure the pattern actually matches.
To expand on uroc's answer:
local file = io.open("filename.txt")
if file then
  for line in file:lines() do
    -- unpack turns a table like the one given (if you use the recommended
    -- version) into a bunch of separate variables
    local name, address, email = unpack(line:split(" "))
    -- do something with that data
  end
else
end
-- you'll need a split method; I recommend the python-like version at
-- http://lua-users.org/wiki/SplitJoin
-- (not provided here because of possible license issues)
This however won't cover the case that your names have spaces in them.
If you have control over the format of the input file, you will be better off storing the data in Lua format as described here.
If not, use the io library to open the file and then use the string library like:
local f = io.open("foo.txt")
while 1 do
  local l = f:read()
  if not l then break end
  print(l) -- use the string library to split the string
end
