Groovy script to modify a text file line by line

I have an input file from which a Groovy script reads its input. Once a particular input is processed, the script should comment out the input line it used and then move on.
File content:
1
2
3
When it has processed line 1 and line 2, the input file should look like this:
'1
'2
3
This way, if I re-run the Groovy script, it will start from the line where it stopped last time. If an input was used but processing failed, that particular line shall not be commented out (') so that a retry can be attempted.
I would appreciate any help drafting such a Groovy script.
Thanks

AFAIK, in Groovy you can only append text at the end of a file.
Hence, to add ' to each line as it is processed, you need to rewrite the entire file.
You can use the following approach, but I only recommend it for small files since it loads all the lines into memory. In summary, an approach for your question could be:
// open the file
def file = new File('/path/to/sample.txt')
// get all lines
def lines = file.readLines()
try {
    // for each line
    lines.eachWithIndex { line, index ->
        // if the line does not already start with the comment "'"
        if (!line.startsWith("'")) {
            // call your process and apply your logic...
            // if it fails, throw an exception, since
            // you cannot use 'break' inside a closure
            if (!yourProcess(line)) throw new Exception()
            // the line was processed, so prepend "'"
            // to the current line
            lines.set(index, "'${line}")
        }
    }
} catch (Exception e) {
    // catch the exception so that the progress
    // made so far is still saved to the file
}
// join the lines and rewrite the file
file.text = lines.join(System.properties.'line.separator')

// define your process...
def yourProcess(line) {
    // a simple condition, only for testing...
    return line.size() != 3
}
A better approach for large files, which avoids loading all the lines into memory, is to use a reader for the file contents and a writer on a temporary file for the result. An optimized version could be:
// open the file
def file = new File('/path/to/sample.txt')
// create the "processed" file
def resultFile = new File('/path/to/sampleProcessed.txt')
// use a writer to write the result
resultFile.withWriter { writer ->
    // once a line fails we stop processing but keep copying,
    // so no input line is ever lost from the result file
    def failed = false
    def newline = System.properties.'line.separator'
    // read the file using a reader
    file.withReader { reader ->
        def line
        while ((line = reader.readLine()) != null) {
            if (failed || line.startsWith("'")) {
                // already commented, or a previous line failed:
                // copy the line through unchanged
                writer << line << newline
            } else if (yourProcess(line)) {
                // the line was processed, so prepend the "'"
                // before writing it to the result file
                writer << "'${line}" << newline
            } else {
                // processing failed: keep the line uncommented
                // so it can be retried on the next run
                failed = true
                writer << line << newline
            }
        }
    }
}

// define your process...
def yourProcess(line) {
    // a simple condition, only for testing...
    return line.size() != 3
}

Related

Writing to output stream in chunks in Groovy

I'm getting Jenkins console logs and writing them into an output stream like this:
ByteArrayOutputStream stream = new ByteArrayOutputStream()
currentBuild.rawBuild.getLogText().writeLogTo(0, stream)
However, the downside of this approach is that the writeLogTo() method is limited to 10,000 lines:
https://github.com/jenkinsci/stapler/blob/master/core/src/main/java/org/kohsuke/stapler/framework/io/LargeText.java#L572
In this case, if the Jenkins console log is longer than 10,000 lines, the data from line 10,000 onward is lost and never written into the buffer.
I'm trying to rewrite the above approach in the simplest way possible to account for cases where the log has more than 10,000 lines.
I feel like my attempt is very complicated and error-prone. Is there an easier way to introduce the new logic?
Please note that the code below is not tested, this is just a draft of how I'm planning to implement it:
ByteArrayOutputStream stream = new ByteArrayOutputStream()
def log = currentBuild.rawBuild.getLogText()
def offset = 0
def maxNumOfLines = 10000
// get the total number of lines in the log
// def totalLines = (still trying to figure out how to get it)
if (totalLines > maxNumOfLines) {
    def numOfExecutions = round(totalLines / maxNumOfLines)
}
for (int i = 0; i < numOfExecutions; i++) {
    log.writeLogTo(offset, stream)
    offset += maxNumOfLines
}
writeLogTo(long start, OutputStream out)
According to the comments, this method returns the offset at which to start the next write operation.
So the code could look like this:
def logFile = currentBuild.rawBuild.getLogText()
def start = 0
while (logFile.length() > start) {
    start = logFile.writeLogTo(start, stream)
}
stream could be a FileOutputStream, to avoid reading the whole log into memory.
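For example, a minimal sketch of that idea, assuming a placeholder destination path /tmp/build.log on the node running the script, could be:
// stream the build log to a local file in chunks; each call to
// writeLogTo() returns the offset where the next call should start
new File('/tmp/build.log').withOutputStream { out ->
    def logFile = currentBuild.rawBuild.getLogText()
    long start = 0
    while (logFile.length() > start) {
        start = logFile.writeLogTo(start, out)
    }
}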
There is another method, readAll().
So, to read the whole log as text, the code could be as simple as this:
def logText=currentBuild.rawBuild.getLogText().readAll().getText()
Or if you want to transfer it to a local file:
new File('path/to/file.log').withWriter('UTF-8') { w ->
    w << currentBuild.rawBuild.getLogText().readAll()
}

Get file names and content, and then merge into another file with MapReduce

I have several files with data in them.
For example: file01.csv with x lines in it, file02.csv with y lines in it.
I would like to process and merge them with MapReduce in order to get a file where the x lines begin with "file01" followed by the line content, and the y lines begin with "file02" followed by the line content.
I have two issues here:
I know how to get the lines of a file with MapReduce by setting FileInputFormat.setInputPath(job, new Path(inputFile));
But I don't understand how I can get the lines of each file in a folder.
Once I have those lines in my mapper, how can I access the corresponding filename, so that I can create the data I want?
Thank you for your consideration.
Ambre
You do not need MapReduce in your situation, because you want to preserve the order of lines in the result file. In this case, single-threaded processing will be faster.
Just run a Java client with code like this:
FileSystem fs = FileSystem.get(new Configuration());
OutputStream os = fs.create(outputPath); // stream for the result file
PrintWriter pw = new PrintWriter(new OutputStreamWriter(os));
for (String inputFile : inputs) { // read each input file
    Path path = new Path(inputFile);
    InputStream is = fs.open(path);
    BufferedReader br = new BufferedReader(new InputStreamReader(is));
    String line;
    while ((line = br.readLine()) != null) {
        // prefix each line with the name of the file it came from
        pw.println(path.getName() + "," + line);
    }
    br.close();
}
pw.close();

Need program to loop correctly

Got it.
while 1:
    line = sub.readline().split()
    if line == []:
        new = main
        break
    else:
        new = main.replace(line[0], line[1])
        main = new
This seems to work for me. Thanks for the help =).
Try another loop, and in that loop index the word swap that you need to occur:
I assume each line of sub.txt has the substitutions that you want. Read all the lines of sub.txt, storing each line in an indexable array. Set up a loop around your main code, and in that loop index through the array, referencing sequentially the line of sub.txt that you want each time.
As Cameron pointed out, this method will overwrite the output file every time, thus recording only your last change.
The error in this block is:
while True:
    word = substitute.readline().split()
    print(word)
    if word == []:  # --- indentation ---
        break
    else:
        new = (main_story.read().replace(word[0], word[1]))
        new_story.write(new)
You need to read the complete file at once, make all the changes, and then write them to the file.
Or you could read from the first file and then do subsequent reads/writes on the new file.

Not writing to the file in Scala

I have the following code, which is supposed to write to a file one line at a time, until it reaches ^EOF.
import java.io.PrintWriter
import java.io.File

object WF {
  def writeline(file: String)(delim: String): Unit = {
    val writer = new PrintWriter(new File(file))
    val line = Console.readLine(delim)
    if (line != "^EOF") {
      writer.write(line + "\n")
      writer.flush()
    }
    else {
      sys.exit()
    }
  }
}

var counter = 0
val filename = Console.readLine("Enter a file: ")
while (true) {
  counter += 1
  WF.writeline(filename)(counter.toString + ": ")
}
For some reason, at the console, everything looks like it works fine, but then, when I actually read the file, nothing has been written to it! What is wrong with my program?
Every time you create a new PrintWriter you're wiping out the existing file. Use something like a FileWriter, which allows you to specify that you want to open the file for appending:
val writer = new PrintWriter(new FileWriter(new File(file), true))
That should work, although the logic here is pretty confusing.
Use val writer = new FileWriter(new File(file), true) instead. The second parameter tells the FileWriter to append to the file. See http://docs.oracle.com/javase/6/docs/api/java/io/FileWriter.html
I'm guessing the problem is that you forgot to close the writer.

atomically creating a file lock in MATLAB (file mutex)

I am looking for a simple already implemented solution for atomically creating a file lock in MATLAB.
Something like:
file_lock('create', 'mylockfile'); %this will block until it creates the lock file.
file_lock('remove', 'mylockfile'); %this will remove the lock file:
This question has already been asked several times, with some proposed solution ideas (such as using a Java FileLock),
but I didn't find a simple, already-implemented solution.
Are you aware of such a solution?
Notes:
locking file access OR exchanging messages bw Matlab Instances
Thread Subject: Safe file mutex without race condition
I've settled on a pretty simple solution for combining error/logging messages from multiple worker threads into a single file. Every time I want to write to that file, I first write the output to the thread's own temporary file. Next, I append that temporary file to the "master" log file using flock. Skipping over some details here, the idea is:
fid=fopen(threadtemp, 'w');
fprintf(fid, 'Error message goes here');
fclose(fid);
runme = sprintf('flock -x %s -c ''cat %s >> %s''', LOGFILE, threadtemp, LOGFILE);
system(runme);
See the flock man page for details, but the call above is acquiring an eXclusive lock on the logfile, running the provided Command under the lock, and then releasing it.
This obviously only works if you're on a system which has flock (Linux/OS X, and only certain types of file systems at that) and you're doing something that can be done from the command line, but I'd bet that it's a pretty common use-case.
Depending on which Java version you're using, perhaps this will work (translated from: http://www.javabeat.net/2007/10/locking-files-using-java/)
classdef FileLock < handle
    properties (Access = private)
        fileLock = []
        file
    end
    methods
        function this = FileLock(filename)
            this.file = java.io.RandomAccessFile(filename, 'rw');
            fileChannel = this.file.getChannel();
            this.fileLock = fileChannel.tryLock();
        end
        function val = hasLock(this)
            if ~isempty(this.fileLock) && this.fileLock.isValid()
                val = true;
            else
                val = false;
            end
        end
        function delete(this)
            this.release();
        end
        function release(this)
            if this.hasLock
                this.fileLock.release();
            end
            this.file.close();
        end
    end
end
Usage would be:
lock = FileLock('my_lock_file');
if lock.hasLock
    %// do something here
else
    %// I guess not
end
%// Manually release the lock, or just delete it (or let MATLAB clean it up)
I like the object-wrapping pattern for IO so that the release happens even when an exception is thrown.
EDIT: The file reference must be kept around and closed manually, or you won't be able to edit the locked file. That means this code is only really useful for pure lock files, I think.
If you only need to run on OS X and Linux (not Windows), you can use the following:
pathLock = '/tmp/test.lock';
% Try to create and lock this file.
% In my case I use -r 0 to avoid retrying.
% You could use -r -1 to retry forever, or for a particular amount of time,
% etc.; see `man lockfile` for details.
if ~system(sprintf('lockfile -r 0 %s', pathLock))
    % We succeeded, so perform some task which needs to be serialized.
    % runSerializedTask()
    % Now remove the lockfile
    system(sprintf('rm -f %s', pathLock));
end
Write to a new file, then rename it. Renaming is an atomic operation, and all the new content will become visible at once.
In the end I did an implementation based on two consecutive tests (movefile, and verifying the contents of the moved file).
It's not very well written, but it works for me for now.
+++++ file_lock.m ++++++++++++++++++++++++
function file_lock(op, filename)
%this will block until it creates the lock file:
%file_lock('create', 'mylockfile')
%
%this will remove the lock file:
%file_lock('remove', 'mylockfile')

% todo: verify that there are no bugs

filename = [filename '.mat'];

if isequal(op, 'create')
    id = [tempname() '.mat']
    while true
        save(id, 'id');
        success = fileattrib(id, '-w');
        if success == 0; error('fileattrib'); end
        while true
            if exist(filename, 'file') %first test
                fprintf('file lock exists(1). waiting...\n');
                pause(1);
                continue;
            end
            status = movefile(id, filename); %second test
            if status == 1; break; end
            fprintf('file lock exists(2). waiting...\n');
            pause(1);
        end
        temp = load(filename, 'id'); %third test
        if isequal(id, temp.id); break; end
        fprintf('file lock exists(3). waiting...\n');
        pause(1)
    end
elseif isequal(op, 'remove')
    %delete(filename);
    execute_rs(@() delete(filename));
else
    error('invalid op');
end

function execute_rs(f)
while true
    try
        lastwarn('');
        f();
        if ~isequal(lastwarn, ''); error(lastwarn); end %such as: Warning: File not found or permission denied
        break;
    catch exception
        fprintf('Error: %s\nRetrying...\n', exception.message);
        pause(.5);
    end
end
++++++++++++++++++++++++++++++++++++++++++
