Is there an easy way to parse a file for a directive (as I can't think of a better word for it)?
I need to scan a file for <!--#directive parameter=value -->, copy the value, find the location and length where this directive was in the file, so it can be replaced with whatever.
I come from microcontrollers, and don't have a lot of experience with extra / full libraries.
Is there a better way to implement this than manually scanning it line by line? (I guess with ftell for position, fgets, and then parse with sscanf, fseek back to last position if it was a match).
Here is a regular expression which can help:
<!--\s*#.+=(\S+)\s?-->
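The group with index 1 in each match is your value. The match itself gives you the position and length of the whole directive, which is what you need for the replacement.
You can test it here: https://regex101.com/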
Also consider using a high-level language for this. Here is a snippet in C# that prints all values from a text file:
using System;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;
var inputText = File.ReadAllText("D:\\myTextFile.txt");
var regex = new Regex("<!--\\s*#.+=(\\S+)\\s?-->");
var matches = regex.Matches(inputText);
foreach (var g in matches.Cast<Match>().Select(match => match.Groups[1]))
    Console.WriteLine(g.ToString());
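The question also asks for the location and length of each directive so it can be replaced. A minimal sketch of that part, written in Python for illustration: it assumes directives look like <!--#name parameter=value --> with a whitespace-free value, and the names DIRECTIVE, find_directives and replace_directives are hypothetical helpers, not part of any library.

```python
import re

# Assumed directive shape: <!--#name parameter=value -->
# The value is matched lazily and cannot contain whitespace, so two
# directives on the same line are matched separately.
DIRECTIVE = re.compile(r"<!--\s*#(\w+)\s+(\w+)=(\S+?)\s*-->")


def find_directives(text):
    """Return (start, length, name, parameter, value) for every directive."""
    return [
        (m.start(), m.end() - m.start(), m.group(1), m.group(2), m.group(3))
        for m in DIRECTIVE.finditer(text)
    ]


def replace_directives(text, lookup):
    """Replace each directive with whatever lookup(name, parameter, value) returns."""
    return DIRECTIVE.sub(
        lambda m: lookup(m.group(1), m.group(2), m.group(3)), text
    )
```

find_directives gives the offsets if you want to patch the text yourself; replace_directives does the substitution in one pass when a callback is enough.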
I need to iterate and parse 108 lines from a file and then sort them into 3 different hashes. (In one iterator.) (In Ruby)
I have the file loaded into the program and into the array I need to parse. Whenever I try to use the regex match method in the iterator, I get an error about an undefined method. Is it as simple as not being able to use that method on an array?
lines = File.readlines('access_log')
lines.each.match(/^([:\d\.]+) .*\[.*\].*\"[A-Z]+ *(.+) HTTP/)
Neither this nor any other way I have tried to use the match method has worked.
This also doesn't do anything for the hashes, as I haven't written that part yet.
/^([:\d\.]+) .*\[.*\].*\"[A-Z]+ *(.+) HTTP/.match(lines)
I have also tried this, but the error output suggests that you can't run it on the array alone. I believe this is where I would need to tie the iterator in, but I'm stumped.
What's happening is that readlines slurps the entire text file into an array.
So you have an array with the content of the text file, one line per element (the trailing newline is kept in every string).
After that, you're calling lines.each without a block, which returns an enumerator. Then you're calling .match on the enumerator instead of on each string.
The proper way to do this would be
lines.each { |line| line.match(/^([:\d\.]+) .*\[.*\].*\"[A-Z]+ *(.+) HTTP/) }
However, the above won't actually do anything useful, because all you're doing is iterating over each element, checking whether it matches the regex, and discarding the result. If you want to keep the results, try...
matches = lines.map { |line| line.match(/^([:\d\.]+) .*\[.*\].*\"[A-Z]+ *(.+) HTTP/) }
Remember that the match method is defined on strings (and regexps), not on arrays. If match finds something, it returns an object of the class MatchData; otherwise it returns nil.
I have a text file that I read using the usual URLRequest and URLLoader functions. It consists of a series of names, each separated by \r\n. I want to create an array of those names, but I want to eliminate both the \r and the \n. This code does a great job of splitting the names into an array, but it leaves the carriage return in each string.
names = testfile.split(String.fromCharCode(10));
And this leaves the new line:
names = testfile.split(String.fromCharCode(13));
I'm mainly a C/C++/assembly programmer, and AS3 has some things that seem rather odd to me. Is there a way to do this? I've tried searching the resulting string array members, but I get errors from the compiler. This is very easy to do in C/C++/assembly, but I haven't quite figured AS3 out yet.
You should be able to use a RegExp to do this. Something like:
var noLines:String = withLines.replace( /[\r\n]/g, "" );
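That'll remove every carriage return and line feed from a string. Apply it to each name after splitting; if you run it on the whole text before splitting, the names are joined together and there is nothing left to split on.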
If your string is in the form:
name1
name2
name3
Then you might even be able to get away with splitting using a RegExp (\r?\n treats a \r\n pair as a single separator, so you don't get empty entries between the names):
var names:Array = withLines.split( /\r?\n/ );
You can test out the RegExp provided here: http://regexr.com?38dmk (click on the replace tab and clear the replace input)
Seems like a standard enough problem to warrant a standard design in the solution:
Say I want to write x+2 (or fewer) strings to a file. x strings make up the content of a section, and the other two strings form a kind of header and footer for that section. The catch is that I should not write the header/footer strings if there are no strings in the content. Furthermore, these x strings are written from disparate places in the code. So the current flow is:
write header string
write content strings
write footer string
This leads to the header/footer strings being written even if the content is empty, and I have to address that, i.e. not writing the header/footer strings in this case.
The solution that I can think of is writing the header string before the first content string that is being written (implemented by funnelling each content string write with a header string write, with a boolean flag preventing multiple header string writes), and then writing the footer string only if the header string has been written (governed by a boolean flag).
This is the top level gist of it, just wondering if there are standard approaches available for cases like these.
Thanks!
There are a number of solutions to this:
Write the header and data lines to an in-memory cache and output them at the time you try to write the footer (but only if there are data lines, otherwise output nothing).
Same thing but using a temporary file for the data cache in case it's too big.
Remember the header and whether or not you've output it.
Since the first two solutions involve inefficiencies (caching possibly large amounts of data, or using relatively slow external storage), I'll concentrate on the last one. See the note at the bottom on how to do the caching (a).
The approach which doesn't require caching the data is to just have an indicator as to whether or not you've written the header. Before each data line, output the header (and set the flag) only if the flag is not yet set. You can also use this flag to control the footer (if the header hasn't been output, neither should the footer be):
def outHeader (str):
    headerText = str
    headerSent = false
def outData (str):
    if not headerSent:
        write headerText
        headerSent = true
    write str
def outFooter (str):
    if headerSent:
        write str
This solution is arguably the simplest as well, since no data caching is required.
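A minimal runnable sketch of the flag approach, in Python. The Section class and its write/close method names are hypothetical, chosen only for illustration; out can be any object with a write method, such as an open file.

```python
import io


class Section:
    """Emits the header lazily, just before the first content line."""

    def __init__(self, out, header, footer):
        self.out = out
        self.header = header
        self.footer = footer
        self.header_sent = False

    def write(self, line):
        # The first content line triggers the header; later lines skip it.
        if not self.header_sent:
            self.out.write(self.header + "\n")
            self.header_sent = True
        self.out.write(line + "\n")

    def close(self):
        # The footer is written only if the header was, i.e. if there was content.
        if self.header_sent:
            self.out.write(self.footer + "\n")
```

Because the flag lives in the object, the content lines can come from disparate places in the code as long as they all go through the same Section instance.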
(a) If you did want to go with the caching solution (despite the advice that it's a sub-optimal solution), the following pseudo-code shows how it could be done:
def outHeader (str):
    cachedHeader = str
    cachedData = ""
def outData (str):
    cachedData = cachedData + str + "\n"
def outFooter (str):
    if cachedData != "":
        write cachedHeader
        write cachedData
        write str
The only differences between that in-memory cache and a file-based cache are:
creating an empty temporary file and setting lineCount to 0 where you currently create cachedData in outHeader().
sending str to the temporary file and incrementing lineCount in outData().
using lineCount to decide if there's cached data in outFooter and reading the lines back from the temporary file for output as data.
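The three changes above could be sketched in Python roughly as follows. This is an illustration under assumptions, not a definitive implementation: TempFileSection and its method names are hypothetical, and the standard library's tempfile.TemporaryFile stands in for whatever temporary-file facility your platform offers.

```python
import io
import shutil
import tempfile


class TempFileSection:
    """Caches content lines in a temporary file until the footer is due."""

    def __init__(self, out, header, footer):
        self.out = out
        self.header = header
        self.footer = footer
        # Empty temporary file plus a counter, replacing the in-memory string.
        self.cache = tempfile.TemporaryFile(mode="w+", encoding="utf-8")
        self.line_count = 0

    def write(self, line):
        self.cache.write(line + "\n")
        self.line_count += 1

    def close(self):
        try:
            if self.line_count > 0:
                self.out.write(self.header + "\n")
                # Rewind and stream the cached lines to the real output.
                self.cache.seek(0)
                shutil.copyfileobj(self.cache, self.out)
                self.out.write(self.footer + "\n")
        finally:
            # Closing the temporary file also deletes it.
            self.cache.close()
```

Nothing reaches out until close is called, so an empty section leaves the output untouched.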
I want to read from the file "hello.txt" one line at a time, and then write each line to "bye.text" and to the screen.
How can I do this?
The only funcs I see in "File" are:
ReadAllText
ReadAllLines
WriteAllLines, etc.
As Jack says, you need to use the StreamWriter and StreamReader types if you want to work with files (or any other streams) using line-by-line functions. Just use the constructor like this:
open System.IO
let addLine (line:string) =
    use wr = new StreamWriter("D:\\temp\\test.txt", true)
    wr.WriteLine(line)
Here, we're using an overload of the StreamWriter constructor that takes the path (as a string) and a boolean specifying that we want to append to an existing file. Also note that I'm using the use keyword to make sure that the file is closed when addLine completes.
To read content as a sequence of lines, you can use StreamReader similarly: create an instance of the type using its constructor and then call the ReadLine method until you get null as a result.
The methods in the System.IO.File class only support reading/writing the entire file. If you want a more granular approach (e.g., reading/writing line-by-line) you need to use something like StreamReader and StreamWriter.
Is there any way to get the extension of a file from its filename?
The only algorithm I could come up with is to find the last '.' and take the rest of the string up to the end.
But I'm not sure how to get the index of the final '.' in the given string.
Any new ideas or suggestions?
I'm actually trying to develop a filter that processes only image files and skips everything else. Is there a way of doing this with a built-in function?
Use the built-in function "fileparts". Its third output is the extension, including the leading dot.
You can use the regexp function with the 'split' option:
output = regexp(your_string, '\.', 'split')
The output is a cell array of the pieces between the dots; the last element, output{end}, is the extension.
http://www.mathworks.com/help/techdoc/ref/regexp.html