I have this code for matching an IP address pattern, but it doesn't seem to work and I don't know why. It always prints "No match" on the terminal:
regex_t regex;
int reti;
char msgbuf[100];
reti = regcomp(&regex, "^([0-9]{1,3}).([0-9]{1,3}).([0-9]{1,3}).([0-9]{1,3})$", 0);
if (reti) {
fprintf(stderr, "Could not compile regex\n");
exit(1);
}
reti = regexec(&regex, "124.168.21.3", 0, NULL, 0);
if (!reti) {
puts("Match");
} else if (reti == REG_NOMATCH) {
puts("No match");
} else {
regerror(reti, &regex, msgbuf, sizeof(msgbuf));
fprintf(stderr, "Regex match failed: %s\n", msgbuf);
exit(1);
}
regfree(&regex);
Any idea?
I found it: the cflags argument of regcomp must be REG_EXTENDED, not 0.
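For reference, here is a minimal sketch of the check with REG_EXTENDED set, so that {1,3} is treated as an interval, and with each dot written as \\. so that it only matches a literal dot. The function name looks_like_ipv4 is made up for illustration, and the pattern only checks the shape of the string, not that each number is at most 255.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if s is four dot-separated groups of
 * 1-3 digits, 0 if it is not, and -1 on a regex error.
 * Shape check only: "999.1.1.1" is accepted. */
int looks_like_ipv4(const char *s)
{
    regex_t regex;
    int rc;

    /* REG_EXTENDED enables {1,3}; REG_NOSUB because no captures are needed. */
    if (regcomp(&regex,
                "^[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}$",
                REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    rc = regexec(&regex, s, 0, NULL, 0);
    regfree(&regex);

    if (rc == 0)
        return 1;
    return rc == REG_NOMATCH ? 0 : -1;
}
```

Calling looks_like_ipv4("124.168.21.3") should report a match, while a string such as "124x168x21x3" should not, because the dots are no longer wildcards.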
You should escape the dots. And you probably don't need the capturing groups. Replace
"^([0-9]{1,3}).([0-9]{1,3}).([0-9]{1,3}).([0-9]{1,3})$"
with
"^[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}$"
Related
I am trying to build a regular expression with the regex.h library.
I checked my expression at https://regex101.com/ with the input
"00001206 ffffff00 00200800 00001044" and I checked it in Python as well; both gave me the expected result.
When I ran the code below in C (on Unix), it printed "No match".
Does anyone have a suggestion?
regex_t regex;
int reti;
reti = regcomp(&regex, "([0-9a-fA-F]{8}( |$))+$", 0);
if (reti)
{
fprintf(stderr, "Could not compile regex\n");
exit(1);
}
reti = regexec(&regex, "00001206 ffffff00 00200800 00001044", 0, NULL, 0);
if (!reti)
{
printf("Match");
}
else if (reti == REG_NOMATCH) {
printf("No match bla bla\n");
}
Your pattern uses unescaped grouping (...), alternation with |, the + quantifier and the interval quantifier {m}, all of which are extended (ERE) syntax, so you need to pass REG_EXTENDED to regcomp:
regex_t regex;
int reti;
reti = regcomp(&regex, "([0-9a-fA-F]{8}( |$))+$", REG_EXTENDED); // <-- See here
if (reti)
{
fprintf(stderr, "Could not compile regex\n");
exit(1);
}
reti = regexec(&regex, "00001206 ffffff00 00200800 00001044", 0, NULL, 0);
if (!reti)
{
printf("Match");
}
else if (reti == REG_NOMATCH) {
printf("No match bla bla\n");
}
See the online C demo printing Match.
However, I believe you need to match the entire string, and disallow whitespace at the end, so probably
reti = regcomp(&regex, "^[0-9a-fA-F]{8}( [0-9a-fA-F]{8})*$", REG_EXTENDED);
will be more precise as it will not allow any arbitrary text in front and won't allow a trailing space.
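As a rough sketch of the stricter check, the anchored pattern can be wrapped in a small function. The name is_hex_word_line is hypothetical; the pattern is the one suggested above.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if s consists only of 8-digit hex words
 * separated by single spaces, 0 if not, -1 on a regex error. */
int is_hex_word_line(const char *s)
{
    regex_t regex;
    int rc;

    if (regcomp(&regex, "^[0-9a-fA-F]{8}( [0-9a-fA-F]{8})*$",
                REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    rc = regexec(&regex, s, 0, NULL, 0);
    regfree(&regex);

    if (rc == 0)
        return 1;
    return rc == REG_NOMATCH ? 0 : -1;
}
```

With this version, leading text such as "junk 00001206" and a trailing space as in "00001206 " are both rejected, which the unanchored pattern would have let through.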
I am trying to match strings like 'sdb-iof-pool 1008.56M 884K' using this regular expression: ^(.*)([\s]+)([-+]?[0-9]*\.?[0-9]+)([K|M|G|T|P]{1})([\s]+)([-+]?[0-9]*\.?[0-9]+)([K|M|G|T|P]{1})(.*)$
My C code is the following:
int reti;
regex_t regex;
size_t maxGroups = 8;
regmatch_t groupArray[maxGroups];
const char * pattern = "^(.*)([\\s]+)([-+]?[0-9]*\\.?[0-9]+)([K|M|G|T|P]{1})([\\s]+)([-+]?[0-9]*\\.?[0-9]+)([K|M|G|T|P]{1})(.*)$";
reti = regcomp(&regex, pattern, REG_EXTENDED);
if (reti) {
regerror(reti, &regex, log_buffer, IOF_MAX_MSG);
snprintf(error, IOF_MAX_MSG, "%s: Failed to compile regex '%s': (%d) '%s'", __FUNCTION__, pattern, reti, log_buffer);
return FAIL;
}
reti = regexec(&regex, cmd_output, maxGroups, groupArray, 0);
if (reti == REG_NOMATCH) {
regerror(reti, &regex, log_buffer, IOF_MAX_MSG);
regfree(&regex);
snprintf(error, IOF_MAX_MSG, "Failed to match regex '%s' on '%s': %s", pattern, cmd_output, log_buffer);
return FAIL;
}
regfree(&regex);
Even though tools like this seem to confirm that the regular expression works fine, my program returns:
"Failed to match regex '^(.*)([\s]+)([-+]?[0-9]*\.?[0-9]+)([K|M|G|T|P]{1})([\s]+)([-+]?[0-9]*\.?[0-9]+)([K|M|G|T|P]{1})(.*)$' on 'sdb-iof-pool 1008.56M 884K': No match"
After several rounds of trial and error with grep -E (extended regular expressions) and reading its manual, I suspected that [\s] was the culprit. POSIX syntax does not recognize \s as a whitespace shorthand; inside a bracket expression, [\s] simply matches a backslash or the letter s. The character class [[:space:]] must be used instead.
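Here is a minimal sketch of the same match with [[:space:]] in place of [\s]. The function names parse_two_sizes and copy_group are made up for illustration. The pattern is also simplified under my own assumptions: the unit is folded into each number group, and [KMGTP] replaces [K|M|G|T|P], which would also have accepted a literal |.

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy one capture group into out, NUL-terminated. */
static void copy_group(const char *s, const regmatch_t *m,
                       char *out, size_t outsz)
{
    size_t len;

    if (outsz == 0)
        return;
    len = (size_t)(m->rm_eo - m->rm_so);
    if (len >= outsz)
        len = outsz - 1;            /* truncate rather than overflow */
    memcpy(out, s + m->rm_so, len);
    out[len] = '\0';
}

/* Hypothetical helper: extract the two "<number><unit>" fields.
 * Returns 0 on success, -1 on no match or regex error. */
int parse_two_sizes(const char *line, char *first, char *second, size_t bufsz)
{
    const char *pattern =
        "^(.*)[[:space:]]+([-+]?[0-9]*\\.?[0-9]+[KMGTP])"
        "[[:space:]]+([-+]?[0-9]*\\.?[0-9]+[KMGTP])(.*)$";
    regex_t regex;
    regmatch_t groups[5];           /* whole match + 4 capture groups */
    int rc;

    if (regcomp(&regex, pattern, REG_EXTENDED) != 0)
        return -1;
    rc = regexec(&regex, line, 5, groups, 0);
    regfree(&regex);
    if (rc != 0)
        return -1;

    copy_group(line, &groups[2], first, bufsz);
    copy_group(line, &groups[3], second, bufsz);
    return 0;
}
```

For the sample line 'sdb-iof-pool 1008.56M 884K', the two output buffers should end up holding the two size fields.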
I want to check whether a string only contains alphanumeric characters or not in C. I do not want to use the isalnum function. Why doesn't the code below work correctly?
int main()
{
printf("test regular expression\n");
int retval = 0;
regex_t re;
char line[8] = "4a.zCCb";
char msgbuf[100];
if (regcomp(&re,"[a-zA-z0-9]{2,8}", REG_EXTENDED) != 0)
{
fprintf(stderr, "Failed to compile regex\n");
return EXIT_FAILURE;
}
if ((retval = regexec(&re, line, 0, NULL, 0)) == 0)
printf("Match : %s\n", line);
else if (retval == REG_NOMATCH)
printf("does not match : %s\n", line);
else {
regerror(retval, &re, msgbuf, sizeof(msgbuf));
fprintf(stderr, "Regex match failed: %s\n", msgbuf);
exit(1);
}
regfree(&re);
}
If you want the entire string to be alphanumeric, you need to include begin and end anchors in the regex. Note also that the range A-z should be A-Z; in ASCII, A-z additionally covers [, \, ], ^, _ and `:
"^[a-zA-Z0-9]{2,8}$"
As it stands, the unanchored regex succeeds on any run of two or more alphanumerics anywhere in the string, such as 4a at the start or zCCb at the end.
Try using \w* to match all word characters. Note that \w is a GNU extension rather than POSIX syntax, and that it also matches the underscore.
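A portable alternative to both the explicit ranges and \w is the POSIX class [[:alnum:]]. The sketch below swaps it in, so it is not the exact pattern from the answers above, and the name is_alnum_2_to_8 is hypothetical. It assumes the default "C" locale, where [[:alnum:]] covers only ASCII letters and digits.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if the whole string is 2 to 8
 * alphanumeric characters, 0 if not, -1 on a regex error. */
int is_alnum_2_to_8(const char *s)
{
    regex_t re;
    int rc;

    /* ^ and $ force the whole string to match, not just a substring. */
    if (regcomp(&re, "^[[:alnum:]]{2,8}$", REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    rc = regexec(&re, s, 0, NULL, 0);
    regfree(&re);

    if (rc == 0)
        return 1;
    return rc == REG_NOMATCH ? 0 : -1;
}
```

With the anchors in place, "4a.zCCb" is rejected because of the dot, and so is "a_b", which the A-z typo would have accepted.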
I am trying to write a program to check whether a given string is hex or not, so the string must contain only characters in 0-9, A-F and a-f. How can I accomplish this in C?
The program I tried is given below, but the regex pattern is not working well. What is the error in this pattern?
#include <sys/types.h>
#include <regex.h>
#include <stdio.h>
int main(int argc, char *argv[]){
regex_t regex;
int reti;
char msgbuf[100];
/* Compile regular expression */
reti = regcomp(&regex, "^[a-fA-F0-9]+$", 0);
if( reti )
{
fprintf(stderr, "Could not compile regex\n");
//exit(1);
}
/* Execute regular expression */
reti = regexec(&regex, "ABC123defG", 0, NULL, 0);
if( !reti ){
puts("Match");
}
else if( reti == REG_NOMATCH ){
puts("No match");
}
else{
regerror(reti, &regex, msgbuf, sizeof(msgbuf));
fprintf(stderr, "Regex match failed: %s\n", msgbuf);
//exit(1);
}
/* Free compiled regular expression if you want to use the regex_t again */
regfree(&regex);
return 0;
}
You need to specify REG_EXTENDED in the flags argument to regcomp. If you don't, you end up with "basic" regular expression syntax, which doesn't include the + operator, amongst other things.
It's slightly surprising that "basic" regular expressions still exist, never mind being the default. But that's backwards-compatibility for you.
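To make the difference concrete, here is a small sketch that compiles the same pattern with and without REG_EXTENDED. The helper name regex_matches is made up for illustration. In basic syntax an unescaped + is a literal plus sign, and the portable way to write "one or more" there is the interval \{1,\}.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: compile pattern with the given cflags and test text.
 * Returns 1 on match, 0 on no match, -1 on a regex error. */
int regex_matches(const char *pattern, int cflags, const char *text)
{
    regex_t regex;
    int rc;

    if (regcomp(&regex, pattern, cflags | REG_NOSUB) != 0)
        return -1;

    rc = regexec(&regex, text, 0, NULL, 0);
    regfree(&regex);

    if (rc == 0)
        return 1;
    return rc == REG_NOMATCH ? 0 : -1;
}
```

With cflags 0, "^[a-fA-F0-9]+$" means "one hex digit followed by a literal +", so it accepts "A+" but not "ABC123def". With REG_EXTENDED the same pattern accepts "ABC123def" and rejects "ABC123defG". The basic-syntax spelling "^[a-fA-F0-9]\\{1,\\}$" behaves like the extended one.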
I'm trying to create a collection of regexes in C, without much success.
Currently I'm trying to find include statements with the following regex:
(#include <.+>)|(#include \".+\")
here is my code:
#include <stdio.h>
#include <stdlib.h>
#include <regex.h>
char *regex_str = "(#include <.+>)|(#include \".+\")";
char *str = "#include <stdio.h>";
regex_t regex;
int reti;
int main() {
/* Compile Regex */
reti = regcomp(&regex, regex_str, 0);
if (reti) {
printf("Could not compile regex.\n");
exit(1);
}
/* Exec Regex */
reti = regexec(&regex, str, 0, NULL, 0);
if (!reti) {
printf("Match\n");
} else if (reti == REG_NOMATCH) {
printf("No Match\n");
} else {
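/* Use a separate writable buffer: str points to a string literal,
   and sizeof(str) is only the size of a pointer. */
char msgbuf[100];
regerror(reti, &regex, msgbuf, sizeof(msgbuf));
printf("Regex match failed: %s\n", msgbuf);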
exit(1);
}
/* Free compiled regular expression if you want to use the regex_t again */
regfree(&regex);
return 0;
}
The result I get is: No Match
What am I doing wrong?
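With cflags set to 0 the pattern is compiled as a basic regular expression, where unescaped ( ) | and + are literal characters. You need to escape the group, and the two alternatives can be rolled into one pattern:
char *regex_str = "\\(#include [\"<].*[\">]\\)";
Alternatively, keep the original pattern and pass REG_EXTENDED to regcomp.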
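As a sketch of the REG_EXTENDED route, the original two-alternative pattern can be kept unchanged, and the capture groups then tell you which form matched. The function name include_kind and its return codes are made up for illustration. It relies on regexec setting rm_so to -1 for a group that did not take part in the match.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 for #include <...>, 2 for #include "...",
 * 0 if the line contains neither, -1 on a regex error. */
int include_kind(const char *line)
{
    regex_t regex;
    regmatch_t groups[3];           /* whole match + 2 capture groups */
    int rc;

    if (regcomp(&regex, "(#include <.+>)|(#include \".+\")",
                REG_EXTENDED) != 0)
        return -1;

    rc = regexec(&regex, line, 3, groups, 0);
    regfree(&regex);

    if (rc == REG_NOMATCH)
        return 0;
    if (rc != 0)
        return -1;

    /* Only the alternative that matched has a valid start offset. */
    return groups[1].rm_so != -1 ? 1 : 2;
}
```

For "#include <stdio.h>" the first group is set, so the angle-bracket form is reported; a quoted include sets only the second group.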