Regex Help on finding string with non repeating highlighter - reactjs

I am trying to write a regex and can't figure out how to stop it from matching the repeated characters.
Input String : 0.000000
Search String : 000
Output String highlighting : 0.**000**000 -> first match with the exact length, ignoring the rest
I tried the expression below in JavaScript :
/000/
It gives 0.**000000**
Example in editor :
https://codesandbox.io/s/react-highlighter-with-emotion-forked-614pp?file=/src/index.tsx

Use a lookbehind-based regex to match only the first 000 (tested in the OP's sandbox and confirmed to work):
<Highlighter search={/(?<=^\d*\.)000/}>0.000000</Highlighter>
See proof:
Explanation
--------------------------------------------------------------------------------
(?<= look behind to see if there is:
--------------------------------------------------------------------------------
^ the beginning of the string
--------------------------------------------------------------------------------
\d* 0 or more digits (0-9)
--------------------------------------------------------------------------------
\. '.'
--------------------------------------------------------------------------------
) end of look-behind
--------------------------------------------------------------------------------
000 '000'
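
For readers outside JavaScript, here is a minimal Python sketch of the same idea. Python's built-in re module rejects variable-width lookbehinds such as (?<=^\d*\.), so this sketch swaps in a capture group and uses its span instead. The function name highlight_first and the ** markers are illustrative assumptions, not part of the Highlighter component.

```python
import re


def highlight_first(text: str, needle: str) -> str:
    """Wrap only the first `needle` that directly follows the decimal point."""
    # Group 1 plays the role of the lookbehind context (^\d*\.),
    # group 2 is the part to highlight.
    match = re.search(r'^(\d*\.)(' + re.escape(needle) + r')', text)
    if match is None:
        return text  # nothing to highlight
    start, end = match.span(2)
    return text[:start] + '**' + text[start:end] + '**' + text[end:]


print(highlight_first('0.000000', '000'))
```

Because the pattern is anchored at the start of the string, at most one span is ever highlighted.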

Related

Is there a Regex to trim numbers after two decimal places?

I am trying to reduce the number of decimals and for some reason, using a "fixed decimal" 258.2 does not change all numbers in my column to two decimal places.
Pretty much I have the following after specifying the number as a fixed decimal with 2 places:
6.933141
5.13
1.56
2.94
1.54
6.470931
Changing the number of fixed decimals did not do it for me, so I have been trying to use RegEx and came up with (^\d+.\d{2}). This, however, only identifies what I want to keep.
Is there a way to do this using Regex_Replace?
Thank you all in advance for your help!
Use
^(\d+\.\d{2})\d+$
Replacement: $1. See regex proof.
EXPLANATION
--------------------------------------------------------------------------------
^ the beginning of the string
--------------------------------------------------------------------------------
( group and capture to $1:
--------------------------------------------------------------------------------
\d+ digits (0-9) (1 or more times (matching
the most amount possible))
--------------------------------------------------------------------------------
\. '.'
--------------------------------------------------------------------------------
\d{2} digits (0-9) (2 times)
--------------------------------------------------------------------------------
) end of $1
--------------------------------------------------------------------------------
\d+ digits (0-9) (1 or more times (matching
the most amount possible))
--------------------------------------------------------------------------------
$ before an optional \n, and the end of the
string
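
As a rough illustration of the replacement above, the following Python sketch applies the same pattern with re.sub (Python writes the $1 backreference as \1). The function name trim_to_two_decimals is an assumption made for this example. Note that the pattern truncates rather than rounds.

```python
import re

# Same pattern as above: keep "digits.dd" in group 1, drop any further digits.
TRIM = re.compile(r'^(\d+\.\d{2})\d+$')


def trim_to_two_decimals(value: str) -> str:
    """Cut a decimal string down to two decimal places (no rounding)."""
    return TRIM.sub(r'\1', value)


for sample in ['6.933141', '5.13', '1.56', '6.470931']:
    print(sample, '->', trim_to_two_decimals(sample))
```

Values that already have two decimals do not match the pattern (it requires at least one extra digit), so they pass through unchanged.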

Regex to reject if all numbers and reject colon

I am trying to write a regex that should
reject input that is all numbers
accept alphanumeric input
reject a colon ':'
I tried
ng-pattern="/[^0-9]/" and
ng-pattern="/[^0-9] [^:]*$/"
For example,
"Block1 Grand-street USA" must be accepted
"111132322" must be rejected
"Block 1 grand : " must be rejected
You may use
ng-pattern="/^(?!\d+$)[^:]+$/"
See the regex demo.
To only forbid a : at the end of the string, use
ng-pattern="/^(?!\d+$)(?:.*[^:])?$/"
See another regex demo
The pattern matches
^ - start of string
(?!\d+$) - no 1+ digits to the end of the string
[^:]+ - one or more chars other than :
(?:.*[^:])? - an optional non-capturing group that matches 1 or 0 occurrences of
.* - any 0+ chars other than line break chars, as many as possible
[^:] - any char other than : (if you do not want to match an empty string, remove the (?: and )?)
$ - end of string.
According to comments, you want to match any character but colon.
This should do the job:
ng-pattern="/^(?!\d+$)[^:]+$/"
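
To check the two patterns above against the sample inputs outside of Angular, a small Python sketch can help. The constant and function names here are assumptions for illustration; the patterns themselves are copied from the answers.

```python
import re

# Rejects all-digit input and any input containing a colon.
NO_COLON = re.compile(r'^(?!\d+$)[^:]+$')
# Rejects all-digit input and input that ends with a colon.
NO_TRAILING_COLON = re.compile(r'^(?!\d+$)(?:.*[^:])?$')


def is_valid(text: str, pattern: re.Pattern = NO_COLON) -> bool:
    """Return True when `text` satisfies the given validation pattern."""
    return pattern.search(text) is not None


for sample in ['Block1 Grand-street USA', '111132322', 'Block 1 grand : ']:
    print(repr(sample), is_valid(sample))
```

The second pattern is more permissive: it accepts a colon anywhere except as the last character, and it also accepts an empty string because the group is optional.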

StringRegExp pattern not working properly

I have this string:
{"name": "Fancy HaXXor123Name","profession": 1,"race": 2,"map_id": 1052,"world_id": 268435461,"team_color_id": 0,"commander": false,"fov": 0.768}
I want to get an array back which includes the following information (from left to right from the string):
Fancy HaXXor123Name
1
2
1052
268435461
0
false
0.768
I experimented with RegexBuddy and got a promising pattern, which looks like this:
(\d{1,}).(\d{1,})|(\d{1,})|(?i)"(.*?)"
This is what I got back
name
Fancy HaXXor123Name
profession
1
race
2
map_id
10
2
world_id
2684354
1
team_color_id
0
commander
fov
0
768
So there are large gaps between the values, the numbers are torn apart, and false is missing. I can't fix this problem and I'm completely new to StringRegExp.
I'm using AutoIt, which uses the PCRE regex engine (at least I think so).
You may use a regex like the following:
"\s*:\s*(?:"\K[^"]*|\K[^][\s,{}]+)
See the regex demo
Details:
"\s*:\s* - a literal ", 0+ whitespaces, :, 0+ whitespaces
(?:"\K[^"]*|\K[^][\s,{}]+) - A non-capturing group matching 2 alternatives:
"\K[^"]* - a ", then \K zeros the text matched so far, and then matches 0+ chars other than " with [^"]*
\K[^][\s,{}]+ - \K drops the text matched so far, and [^][\s,{}]+ matches 1+ chars other than ], [, whitespace, ,, { and }.
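
The same extraction can be sketched in Python for comparison. Python's built-in re module has no \K, so this version swaps it for two capture groups, one per alternative; that substitution, and the function name extract_values, are assumptions of this sketch rather than part of the AutoIt answer.

```python
import re

# '"' + optional spaces + ':' + optional spaces, then either a quoted
# value (group 1) or a bare value such as a number or false (group 2).
VALUE = re.compile(r'"\s*:\s*(?:"([^"]*)|([^\]\[\s,{}]+))')


def extract_values(text: str) -> list:
    """Return every value that follows a "key": prefix, in order."""
    return [
        m.group(1) if m.group(1) is not None else m.group(2)
        for m in VALUE.finditer(text)
    ]


data = ('{"name": "Fancy HaXXor123Name","profession": 1,"race": 2,'
        '"map_id": 1052,"world_id": 268435461,"team_color_id": 0,'
        '"commander": false,"fov": 0.768}')
print(extract_values(data))
```

Since the input is valid JSON, a JSON parser would be the sturdier choice wherever one is available; the regex is mainly useful when only a pattern-matching function is at hand.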

Replace values only if they are different

I have a vcf file like this:
http://www.1000genomes.org/node/101
Here's the example from that site:
##fileformat=VCFv4.0
##fileDate=20090805
##source=myImputationProgramV3.1
##reference=1000GenomesPilot-NCBI36
##phasing=partial
##INFO=<ID=NS,Number=1,Type=Integer,Description="Number of Samples With Data">
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Depth">
##INFO=<ID=AF,Number=.,Type=Float,Description="Allele Frequency">
##INFO=<ID=AA,Number=1,Type=String,Description="Ancestral Allele">
##INFO=<ID=DB,Number=0,Type=Flag,Description="dbSNP membership, build 129">
##INFO=<ID=H2,Number=0,Type=Flag,Description="HapMap2 membership">
##FILTER=<ID=q10,Description="Quality below 10">
##FILTER=<ID=s50,Description="Less than 50% of samples have data">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Read Depth">
##FORMAT=<ID=HQ,Number=2,Type=Integer,Description="Haplotype Quality">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT NA00001 NA00002 NA00003
20 14370 rs6054257 G A 29 PASS NS=3;DP=14;AF=0.5;DB;H2 GT:GQ:DP:HQ 0|0:48:1:51,51 1|0:48:8:51,51 1/1:43:5:.,.
20 17330 . T A 3 q10 NS=3;DP=11;AF=0.017 GT:GQ:DP:HQ 0|0:49:3:58,50 0|1:3:5:65,3 0/0:41:3
20 1110696 rs6040355 A G,T 67 PASS NS=2;DP=10;AF=0.333,0.667;AA=T;DB GT:GQ:DP:HQ 1|2:21:6:23,27 2|1:2:0:18,2 2/2:35:4
20 1230237 . T . 47 PASS NS=3;DP=13;AA=T GT:GQ:DP:HQ 0|0:54:7:56,60 0|0:48:4:51,51 0/0:61:2
20 1234567 microsat1 GTCT G,GTACT 50 PASS NS=3;DP=9;AA=G GT:GQ:DP 0/1:35:4 0/2:17:2 1/1:40:3
After the header lines, each line has fields that contain genotypes starting with the 10th field. The 10th field is below the NA0001 heading; the 11th field is genotype NA0002, etc. I have a file with 123 different genotypes, so going from position 10 to 133 (NA0001 until NA0123). What is shown in these fields can be 0/0, 0/1, 0/2 .... till 8/9 for instance. Now I want to replace all the non-equal ones. So I would like to keep 0/0, 1/1, 2/2, etc. And replace 0/1, 0/2, 1/2, 4/5, 4/6 etc by ./.
I would like to write this in a C script. Thought about using sed y/regexp/replacement/ but no idea how to write all those unequal values in a regular expression. And on other positions in the file there could also be these values, so really only positions 10 till 133 should be replaced. And it needs to be replaced; I will be needing the rest of the file with the new values.
Hope it is clear. Anyone any idea how to do this?
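This regex should do what you want: \s(\d)[|\/](?!\1)\d: (the trailing colon is part of the pattern). Replace each match with a space followed by ./.: because the match consumes the leading whitespace and the trailing colon.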
Breakdown:
\s(\d) matches a space followed by a single digit, capturing the digit in capture group #1
[|\/] matches a pipe or slash (since it seems that the VCF format allows either)
(?!\1)\d uses a negative lookahead to ensure that the next character is not the same as capture group #1, and matches the digit
Caveats:
I matched a leading space and trailing : to try to ensure it matches only the intended values. I couldn't work out a good way to limit it to fields 10 and after.
Example using perl:
perl -pe 's#\s(\d)[|/](?!\1)\d:# ./.:#g' testfile.vcf > testfile_afterchange.vcf
Note: I used # as the delimiter to avoid having to escape the / characters in the regex.
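
One possible way to restrict the replacement to fields 10 and later is to split each data line into columns first. The Python sketch below assumes the file is tab-delimited (as VCF normally is) and that every genotype is followed by a colon, as in the sample; the function name mask_unequal_genotypes is made up for this example.

```python
import re

# A genotype at the start of a sample field: two different single digits
# separated by "/" or "|", followed by ":" (checked but not consumed).
UNEQUAL_GT = re.compile(r'^(\d)[|/](?!\1)\d(?=:)')


def mask_unequal_genotypes(line: str) -> str:
    """Replace unequal genotypes with ./. in fields 10 and later."""
    if line.startswith('#'):
        return line  # header lines are left untouched
    body = line.rstrip('\n')
    ending = line[len(body):]  # keep the original line ending, if any
    fields = body.split('\t')
    # Index 9 is the 10th field, the first sample column.
    fields[9:] = [UNEQUAL_GT.sub('./.', field) for field in fields[9:]]
    return '\t'.join(fields) + ending


row = ('20\t1234567\tmicrosat1\tGTCT\tG,GTACT\t50\tPASS\t'
       'NS=3;DP=9;AA=G\tGT:GQ:DP\t0/1:35:4\t0/2:17:2\t1/1:40:3')
print(mask_unequal_genotypes(row))
```

Splitting on tabs also keeps the original delimiters intact, whereas substituting a literal space for \s would turn tabs into spaces.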

Lex/Flex - Split the phone number Up?

I am writing a program that has to split a phone number into its parts. The parts are separated by hyphens (or spaces, or '( )', or nothing at all).
Example: Input: 0xx-xxxx-xxxx or 0xxxxxxxxxx or (0xx)xxxx-xxxx
Output: code 1: 0xx
code 2: xxxx
code 3: xxxx
My problem is that sometimes "Code 1" is just 0x, in which case "Code 2" must be xxxxx (the first part always has a hyphen or parentheses when it is 2 digits long).
Can anyone give me a hand? I would be grateful.
According to your comments, the following regex will extract the information you need
^\(?(0\d{1,2})\)?[- ]?(\d{4,5})[- ]?(\d{4})$
Break down:
^\(?(0\d{1,2})\)? matches 0x, 0xx, (0xx) and (0x) at the beginning of the string
[- ]? as parentheses can only be used for the first group, the only valid separators left are the space and the hyphen. ? means 0 or 1 time.
(\d{4,5}) will match the second group. As the length of the 3rd group is fixed (4 digits), backtracking settles how many digits go to groups 1 and 2.
(\d{4})$ matches the 4 digits at the end of the number.
See it in action
You can then extract the data from capture groups 1, 2 and 3.
Note: As mentioned in the comments on the question, this only extracts data from correctly formed numbers. It will also match some ill-formed numbers.
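
To see how the three capture groups come out for the formats in the question, here is a short Python sketch using the same pattern. It is not a Lex/Flex rule, and the function name split_phone is an assumption for illustration.

```python
import re

# Same pattern as in the answer above.
PHONE = re.compile(r'^\(?(0\d{1,2})\)?[- ]?(\d{4,5})[- ]?(\d{4})$')


def split_phone(number: str):
    """Return (code 1, code 2, code 3), or None when the pattern fails."""
    match = PHONE.match(number)
    return match.groups() if match else None


for sample in ['012-3456-7890', '01234567890', '(012)3456-7890', '01-23456-7890']:
    print(sample, '->', split_phone(sample))
```

As the note above warns, some ill-formed inputs still match: for example, an unbalanced parenthesis such as (012-3456-7890 is accepted because each parenthesis is optional on its own.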
