Regular expression to replace (if|then) - arrays

I have some verse references in articles that I want to link to the adjacent verses file.
Example:
some text (Gen 2:15, 16), other text (Ex 4:12, 13) more.. etc.
I could replace the first one with the following regex:
\(Gen \1: \2, \3\)
Here I hard-coded the "1" (book=) and the "Gen".
But I couldn't figure out how to use an if/then construct so that I could give it a whole list of books (Gen|Ex|Lev, etc.) and have it replace Gen with book number "1", Ex with "2", and so on.

You need to define the book order somewhere, and you'll need some sort of scripting language, not just a plain regex. For example, you could do something along the lines of:
books = ["Gen", "Ex", ..., "Rev"]
...and then replace book_name with books.index(book_name)+1
The exact code/syntax obviously depends on which language you choose to use.
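As a rough illustration of that idea, here is a minimal Python sketch. The URL layout (verses.html?book=...&chapter=...&verse=...) and the helper names are assumptions made up for the example, since the original link format is not shown; only the three book names mentioned in the question are listed, and the list would need to be extended in canonical order.

```python
import re

# Only the books named in the question; extend this list in canonical order.
BOOKS = ["Gen", "Ex", "Lev"]
BOOK_NUMBER = {name: number for number, name in enumerate(BOOKS, start=1)}

# Matches references such as "(Gen 2:15, 16)".
VERSE_REF = re.compile(
    r"\((" + "|".join(map(re.escape, BOOKS)) + r") (\d+):(\d+), (\d+)\)"
)

def link_reference(match):
    """Build a link for one reference; the URL layout is hypothetical."""
    book, chapter, first_verse, _second_verse = match.groups()
    url = f"verses.html?book={BOOK_NUMBER[book]}&chapter={chapter}&verse={first_verse}"
    return f'<a href="{url}">{match.group(0)}</a>'

def link_verses(text):
    """Replace every known verse reference in text with a link."""
    return VERSE_REF.sub(link_reference, text)

print(link_verses("some text (Gen 2:15, 16), other text (Ex 4:12, 13) more"))
```

The replacement callback is what stands in for the "if/then": the regex only finds the reference, and the dictionary lookup turns the book name into its number.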

With Notepad++ you won't be able to get the book numbers.
But everything else is possible. You need to put each reference on a new line:
find \), and replace by \n
Then use this pattern:
[a-z\s]+\(([a-z]+)\s+([0-9:]+)\,\s+([0-9]+)\)
and replace by:
\1: \2, \3
You'll get the list of URLs, which you can then merge back into one line if needed.
The only problem is the book number.
Demo is here: https://regex101.com/r/qN8mO7/2

Related

Regular Expression Generation for AngularJS ng-pattern

I'm using a regex to validate a form input. So basically a user can input "SELECT some_name of select_match".
So far I have the regex: \bSELECT\b \bof select_match\b
The last part is the middle part, which I think should be [a-zA-Z] but I'm not sure how to place it in the middle. I've read multiple pages but can't get it to work.
Also, preferably I'd like the regex to tolerate extra spaces between "SELECT" and "of select_match", meaning that "SELECT blabla of select_match" would be validated as correct whether blabla is surrounded by one space or by several.
Can anyone tell me how to do this? Thank you.
If I understood you correctly, this should work:
/^SELECT\s+(\w+)\s+of select_match$/
Notes:
This allows any number of spaces (but at least one) between "SELECT" and the match_name, and between the match_name and the "of". To allow zero spaces, change the \s+ to \s*.
After that, the rest of the string must be exactly like that (same spaces and words exactly).
The match_name will be in match group 1.
If this doesn't work, show a bit of your code (where you use it) and we can try to find the problem.
Note: If you are using it in ng-pattern lose the "/"s (being the pattern: ^SELECT\s+(\w+)\s+of select_match$).
Note 2: If you are using it in a string literal, remember you might need to escape every "\" (making it "\\"), which gives: ^SELECT\\s+(\\w+)\\s+of select_match$
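To check the behaviour described in the notes, here is a small Python sketch using the same pattern. Python's re module is used only for illustration (ng-pattern runs the expression in JavaScript), and parse_select is a hypothetical helper name.

```python
import re

# Same pattern as in the answer, written as a raw string so "\" needs no doubling.
SELECT_PATTERN = re.compile(r"^SELECT\s+(\w+)\s+of select_match$")

def parse_select(text):
    """Return the captured name (group 1), or None if the input is invalid."""
    match = SELECT_PATTERN.match(text)
    return match.group(1) if match else None

print(parse_select("SELECT blabla of select_match"))
print(parse_select("SELECT    blabla    of select_match"))
print(parse_select("SELECT blabla of  select_match"))
```

The last call has two spaces inside "of  select_match", which the pattern requires to be written exactly, so it is rejected.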

regex: extract text between two string with text that match a specific word

I'm refactoring a very big C project and I need to find the parts of the code written by a specific programmer.
Fortunately everyone involved in this project marks their own code using their email address in standard C-style comments.
Someone could say that this could be achieved easily with grep from the command line, but this is not my goal: I may need to remove these comments or substitute them with other text, so a regex is the only solution.
Ex.
/*********************************************
*
* ... some text ....
*
* author: user#domain.com
*
*********************************************/
From this post I found the right expression to search for C style comments which is:
\/\*(\*(?!\/)|[^*])*\*\/
But that is not enough! I only need the comments which contains a specific email address. Fortunately the domain of email address I'm looking for seems to be unique in the whole project so this could make it simpler.
I think I must use some positive lookahead assertion, I've tried this one:
(\/\*)(\*(?!\/)|[^*](?=.*domain.com))*(\*\/)
but it doesn't run!
Any advice?
You can use
\/\*[^*]*(?:\*(?!\/)[^*]*)*#domain\.com[^*]*(?:\*(?!\/)[^*]*)*\*\/
See the regex demo
Pattern details:
/\* - comment start
[^*]*(?:\*(?!\/)[^*]*)* - everything but */
#domain\.com - literal #domain.com
[^*]*(?:\*(?!\/)[^*]*)* - everything but */
\*\/ - comment end
A faster alternative (as the first part will be looking for everything but the comment end and the word #domain):
\/\*[^*#]*(?:\*(?!\/)[^*#]*|#(?!domain\.com)[^*#]*)*#domain\.com[^*]*(?:\*(?!\/)[^*]*)*\*\/
See another demo
In these patterns, I used an unrolled construct for (\*(?!\/)|[^*])*: [^*]*(?:\*(?!\/)[^*]*)*. Unrolling helps construct more efficient patterns.
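A quick way to try the first pattern is a short Python sketch like the one below. The sample source text and the names COMMENT_WITH_DOMAIN and strip_domain_comments are made up for the example; the "/" characters are left unescaped because Python does not need them escaped.

```python
import re

# First pattern from the answer: a C comment that contains "#domain.com".
COMMENT_WITH_DOMAIN = re.compile(
    r"/\*[^*]*(?:\*(?!/)[^*]*)*#domain\.com[^*]*(?:\*(?!/)[^*]*)*\*/"
)

# Hypothetical input: one comment by another author, one by the target domain.
SOURCE = """/* author: someone#other.org */
int a;
/****
 * author: user#domain.com
 ****/
int b;
"""

def strip_domain_comments(source):
    """Remove only the comments that mention #domain.com."""
    return COMMENT_WITH_DOMAIN.sub("", source)

print(COMMENT_WITH_DOMAIN.findall(SOURCE))
print(strip_domain_comments(SOURCE))
```

Because neither half of the pattern can step over a "*/", a match never starts in one comment and ends in another.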

Regular expression to extract string in C Code (not inside comment)

I have this code in C, and I want a regular expression that extracts only the strings that are not inside comments:
1. /* * "path_build()" function in "home.c" for more information.
2. * this is an example basic"
3. */
4.
5. /*** Free ***/
6. VALOR = string_make(format("%sxtra", libpath));
7. event_signal_string(EVENT_INITSTATUS, "Inicializando...");
should only return:
"%sxtra"
"Inicializando..."
I try:
".*"
but it doesn't work: it shows me all the text inside "", including the strings inside /*...*/
I use EditPad Pro, RegExp panel.
It's a game translation project: I take the strings from every C file and translate them to Spanish. I can't remove the comments from the original file.
The only thing I have clear is that this is the regex to find comments in C, maybe that will help the solution:
(/\*([^*]|[\r\n]|(\*+([^*/]|[\r\n])))*\*+/)|(//.*)
Any help?
Edit: I put number of lines.
Hernaldo, this is an interesting question.
Here are two versions because I am not sure if you want to capture the "inside of the string" or "the whole string"
The regexps below capture the strings into capture Group 1. You completely ignore the overall match (Group 0) and just focus on Group 1. To retrieve the strings, just iterate over the Group 1 matches in your language (discarding empty strings if any).
Version 1: "The inside of the string"
(?s)/\*.*?\*/|"([^"]+)"
This will capture %sxtra and Inicializando... to Group 1.
Version 2: "The whole string"
(?s)/\*.*?\*/|("[^"]+")
This will capture "%sxtra" and "Inicializando..." to Group 1.
Please let me know if you have any questions!
Note: I did not handle /* nested /* comments */ */ as that was not specified in the question. That would require a bit of tweaking and probably a regex engine supporting recursion.
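The "iterate over Group 1" step could look roughly like this in Python, using Version 2 and the code from the question as input. This is only a sketch of the technique in a scripting language, not the EditPad workflow; extract_strings is a hypothetical helper name.

```python
import re

# Version 2: either a whole comment (no group) or a quoted string (Group 1).
STRING_OUTSIDE_COMMENT = re.compile(r'(?s)/\*.*?\*/|("[^"]+")')

SAMPLE = '''/* * "path_build()" function in "home.c" for more information.
 * this is an example basic"
 */

/*** Free ***/
VALOR = string_make(format("%sxtra", libpath));
event_signal_string(EVENT_INITSTATUS, "Inicializando...");
'''

def extract_strings(source):
    """Collect Group 1 matches; comment matches leave Group 1 unset (None)."""
    return [
        match.group(1)
        for match in STRING_OUTSIDE_COMMENT.finditer(source)
        if match.group(1) is not None
    ]

print(extract_strings(SAMPLE))
```

The comment alternative consumes each comment as one match, so the quotes inside it are never tried as string starts.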
The final solution for EditPad 6/7 is:
(?<!^[ \t]*/?[*#][^"\n]*")(?<=^[^"\n]*")[^"]+
Link:
Regular expression for a string that does not start with a /*

Removing characters from Actionscript 3 Strings

I have a text file that I read using the usual URLRequest and URLLoader functions. It consists of a series of names, each separated by \r\n. I want to create an array of those names, but I want to eliminate both the \r and the \n. This code does a great job at splitting the names into arrays, but it leaves the carriage return in the string.
names = testfile.split(String.fromCharCode(10));
And this leaves the new line:
names = testfile.split(String.fromCharCode(13));
I'm mainly a C/C++/assembly programmer, AS3 has some things that seem rather odd to me. Is there a way to do this? I've tried searching the resulting string array members but I get errors from the compiler. Very easy to do in C/C++/assembly, but I haven't quite figured AS3 out yet.
You should be able to use a RegExp to do this. Something like:
var noLines:String = withLines.replace( /[\r\n]/g, "" );
That'll remove all new lines from your string; whether you do that before or after splitting is up to you.
If your string is in the form:
name1
name2
name3
Then you might even be able to get away with splitting using a RegExp:
var names:Array = withLines.split( /[\r\n]/ );
You can test out the RegExp provided here: http://regexr.com?38dmk (click on the replace tab and clear the replace input)
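One thing to watch for with a character class such as [\r\n] is that a \r\n pair counts as two separators, so an empty entry appears between them. The Python sketch below (an analogue for illustration only, not AS3 code; split_names is a hypothetical helper) shows the effect and one way around it, treating \r\n as a single separator and dropping empty entries.

```python
import re

def split_names(text):
    """Split on \\r\\n or \\n and drop empty entries (e.g. after a trailing newline)."""
    return [name for name in re.split(r"\r?\n", text) if name]

data = "name1\r\nname2\r\nname3\r\n"

# Splitting on the character class treats \r and \n as separate delimiters.
print(re.split(r"[\r\n]", "name1\r\nname2"))
print(split_names(data))
```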

Calling functions from plain text descriptions

I have an app which has common maths functions behind the scenes:
add(x, y)
multiply(x, y)
square(x)
The interface is a simple Google-style text field. I want the user to be able to enter a plain text description -
'2*3'
'2 times 3'
'multiply 2 and 3'
'take the product of 2 and 3'
and get a mathematical answer.
Question is, how should I map the text descriptions to the functions ? I'm guessing I need to
tokenise the text
identify key tokens (function names, arguments)
try and map token combinations to function signatures
However I'm guessing this is already a 'solved problem' in the machine learning space. Should I be using Natural Language Processing ? Plain text search ? Something else ?
All ideas gratefully received, plus implementation suggestions [I'm using Python/AppEngine; I know about NLTK and Whoosh]
[PS I understand Google does this already, at least for the first two queries on the list. I'm guessing they also do it statistically, having a very large amount of search data. I don't have a large amount of data available, so I will need an alternative approach].
After you tokenise the text, you need parsing to get a syntax tree of your natural language phrase. Once you have this, you can map the parse tree to a mathematical expression, and then evaluate the expression. I do not think this is a solved problem. I would start with several templates, say the first two, and experiment. The larger the domain of possible descriptions, the harder the task is.
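As a starting point for the template idea, here is a minimal Python sketch that covers the first three example queries with regular-expression templates. It is not a parser and not an NLP solution; the template list, the NUM fragment, and the evaluate function are assumptions made up for the example.

```python
import re

def add(x, y):
    return x + y

def multiply(x, y):
    return x * y

# A number: optional sign, digits, optional decimal part.
NUM = r"(-?\d+(?:\.\d+)?)"

# Each template pairs a pattern with the function it maps to.
TEMPLATES = [
    (re.compile(rf"^\s*{NUM}\s*\*\s*{NUM}\s*$"), multiply),
    (re.compile(rf"^\s*{NUM}\s+times\s+{NUM}\s*$", re.IGNORECASE), multiply),
    (re.compile(rf"^\s*multiply\s+{NUM}\s+and\s+{NUM}\s*$", re.IGNORECASE), multiply),
    (re.compile(rf"^\s*{NUM}\s*\+\s*{NUM}\s*$"), add),
]

def evaluate(query):
    """Return the result for the first matching template, or None if none match."""
    for pattern, func in TEMPLATES:
        match = pattern.match(query)
        if match:
            return func(*(float(group) for group in match.groups()))
    return None

for query in ["2*3", "2 times 3", "multiply 2 and 3", "take the product of 2 and 3"]:
    print(query, "->", evaluate(query))
```

The last query falls through to None, which shows the limit of the approach: every new phrasing needs a new template, so a real grammar becomes attractive as the set of descriptions grows.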
I would recommend a tool that lets you define grammars/patterns over text, such as SimpleParse for Python: http://www.ibm.com/developerworks/linux/library/l-simple.html. As a Java programmer I would prefer GATE or graph-expression.

Resources