How to use a regular expression in a char array in C

EDIT:
Once again, I'll try to explain it more precisely.
I am writing an IRC client, a network application based on sockets. It receives data from the server with the recv() function and stores it in the rBuf[] array one character after another.
The data is then copied to the array rBufh[], which is passed to the parsing function.
The important thing is that the server sends just one character at a time (one char per recv()). The keys (char arrays stored in the two-dimensional array key1) are strings that the parser looks for in the received data; when it finds one, it needs to react properly, basing its reaction on the 3-digit code whose meaning is specified in the protocol documentation. So much for the basics.
The parsing algorithm/function works as follows:
1. key1 (containing the keys) and rBufh are the arguments passed to it.
2. Take the first char of the first string stored in key1 and compare it with the first char of rBufh.
3. If they match, return int ret = 2 (which means 'part of a key has been found in the received data').
4. Now recv() adds one char (as the server only sends one char each time) to the rBufh array, which is passed to the parsing function once again, and the algorithm starts again until it finds a matching key, returning int ret = 1 (found a matching key), or int ret = 0 if in step 3 they did not match.
While the parser is working, printing to stdout is stopped. Based on the value returned by the parser, printing is either held back for longer (if part of one of the keys from the key1 array has been found) or allowed (if a whole key has been found, or the parser can't match the character passed to it in the rBufh array to any of the keys stored in key1).
We are getting closer to understanding what I asked for - I hope : )
The number of strings stored in the key1 array is known beforehand, as they are specified in the protocol documentation, so the array can be initialized right away in its definition.
What I asked is:
Is it possible to pass to key1 (not necessarily using sprintf as I did; it just seemed convenient) a string that contains some piece of code meaning 'any three digits'?
e.g.
Let's mark by XXX the piece of code mentioned three lines above:
key1[0] = "irc2.gbatemp.net XXX nickname"
Now if the parser comes across any string matching key1[0] that contains ANY three digits in place of XXX, it will return 1, so another function will have the possibility to react properly.
This way I don't have to store many strings with different values of XXX in the key1 array, but just one which matches all of them.
OK, now I really did my best to help you understand it and so gain an answer : )
Greetings
Tom
P.S.
Feel free to change the title of this topic, as I don't really know how I should describe the problem in a few words, as you have already noticed. (It is as it is because my first idea was to pass some regex to key1.)

If you just want to insert an arbitrary 3 digit number in your string then do this:
int n = 123; // some integer value that you want to represent as 3 digits
sprintf(key1[1], ":irc2.gbatemp.net %03d %s ", n, nickname);
Note that the %03d format specifier is not a regular expression - it just specifies that the integer parameter will be displayed as a 3 digit decimal value, with leading zeroes if needed.
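If the goal is to recognise any three digits in incoming data rather than to format one known number, a small hand-written matcher may be enough. The sketch below is only an illustration: the function name match_key and the convention that '#' in a key stands for any single decimal digit are assumptions, not part of the original code. Its return values follow the question's scheme (1 = whole key found, 2 = the text so far is a prefix of the key, 0 = mismatch). On POSIX systems, regcomp() and regexec() from <regex.h> are another option if real regular expressions are wanted.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Compare the text received so far against one key.
 * Assumed convention (hypothetical): '#' in the key matches any decimal digit.
 * Returns 1 if text matches the whole key,
 *         2 if text matches only the beginning of the key (more data needed),
 *         0 if text cannot match the key.
 * An empty text counts as a partial match of a non-empty key.
 */
int match_key(const char *key, const char *text)
{
    size_t i = 0;

    for (; text[i] != '\0'; i++) {
        if (key[i] == '\0')
            return 0;                      /* text is longer than the key */
        if (key[i] == '#') {
            if (!isdigit((unsigned char)text[i]))
                return 0;                  /* a digit was required here */
        } else if (key[i] != text[i]) {
            return 0;                      /* literal character differs */
        }
    }
    return key[i] == '\0' ? 1 : 2;
}
```

With a key such as ":irc2.gbatemp.net ### nickname", a single entry in key1 would then cover every three-digit code.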

You can't tell the compiler that. Signature is
int sprintf ( char * str, const char * format, ... );
The compiler is not required to look inside the format string or to check that your %d and %s match up with the int and char* arguments (though many compilers can warn about mismatches). The range check will have to happen at runtime. Just something like
int printKey(char* key, int n) {
if(!(0 <= n && n <= 999)) {
return -1; //C-style throw an error
}
sprintf(key, ":irc2.gbatemp.net %03d %s", n, nickname); // %03d pads to three digits
return 0;
}
Also use snprintf, not sprintf, or you're asking for memory trouble.
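A minimal sketch of the same idea with snprintf, so the buffer size is respected. The name format_key and the choice to pass the nickname and buffer size as parameters are assumptions made for the example; the original uses a nickname variable defined elsewhere.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Write ":irc2.gbatemp.net NNN nickname" into key, where NNN is code
 * printed as exactly three digits.
 * Returns 0 on success, -1 if code is outside 0..999 or the text
 * does not fit into size bytes (including the terminating NUL).
 */
int format_key(char *key, size_t size, int code, const char *nickname)
{
    if (code < 0 || code > 999)
        return -1;

    int written = snprintf(key, size, ":irc2.gbatemp.net %03d %s",
                           code, nickname);

    /* snprintf returns the length it wanted to write; compare with size */
    if (written < 0 || (size_t)written >= size)
        return -1;
    return 0;
}
```

Checking the return value matters because snprintf truncates silently; without the comparison a shortened key could go unnoticed.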

Related

Searching for all integers that occur twice in a vector

I got the task in university to realize an input of a maximum of 10 integers, which shall be stored in a one dimensional vector. Afterwards, every integer of the vector needs to be displayed on the display (via printf).
However, I don't know how to check the vector for each number. I thought something along the lines of letting the pointer of the vector run from 0 to 9 and comparing the value of each element with all elements again, but I am sure there is a much smarter way. I don't in any case know how to code this idea since I am new to C.
Here is what I have tried:
#include <stdio.h>
int main(void)
{
int vector[10];
int a;
int b;
int c;
a = 0;
b = 0;
c = 0;
printf("Please input 10 integers.\n\n");
while (a <= 10);
{
for (scanf_s("%lf", &vektor[a]) == 0)
{
printf("This is not an integer. Please try again.\n");
fflush(stdin);
}
a++;
}
for (b <= 10);
{
if (vector[b] != vector[c]);
{
printf("&d", vector[b]);
c++;
}
b++;
}
return 0;
}
Your code has several problems, some syntactic and some semantic. Your compiler will help with many of the former kind, such as
misspelling of variable name vector in one place (though perhaps this was a missed after-the-fact edit), and
incorrect syntax for a for loop
Some compilers will notice that your scanf format is mismatched with the corresponding argument. Also, you might even get a warning that clues you in to the semicolons that are erroneously placed between your loop headers and their intended bodies. I don't know any compiler that would warn you that bad input will cause your input loop to spin indefinitely, however.
But I guess the most significant issue is that the details of your approach to printing only non-duplicate elements simply will not serve. For this purpose, I recommend figuring out how to describe in words how the computer (or a person) should solve the problem before trying to write C code to implement it. These are really two different exercises, especially for someone whose familiarity with C is limited. You can reason about the prose description without being bogged down and distracted by C syntax.
For example, here are some words that might suit:
1. Consider each element, E, of the array in turn, from first to last.
2. Check all the elements preceding E in the array for one that contains the same value.
3. If none of the elements before E contains the same value as E then E contains the first appearance of its value, so print it. Otherwise, E's value was already printed when some previous element was processed, so do not print it again.
4. Consider the next E, if any (go back to step 1).
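The steps above can be sketched in C roughly as follows. The function names are made up for the example, and input handling is left out on purpose so that only the duplicate check is shown.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Copy the first appearance of each value in v[0..n-1] to out and
 * return how many values were copied. out must have room for n ints.
 */
size_t first_occurrences(const int *v, size_t n, int *out)
{
    size_t count = 0;

    for (size_t e = 0; e < n; e++) {          /* consider each element E */
        int seen_before = 0;

        for (size_t p = 0; p < e; p++) {      /* check the elements before E */
            if (v[p] == v[e]) {
                seen_before = 1;
                break;
            }
        }
        if (!seen_before)                     /* first appearance: keep it */
            out[count++] = v[e];
    }
    return count;
}

/* Print each distinct value once, in order of first appearance. */
void print_first_occurrences(const int *v, size_t n)
{
    for (size_t e = 0; e < n; e++) {
        size_t p = 0;

        while (p < e && v[p] != v[e])
            p++;
        if (p == e)                           /* no earlier element matched */
            printf("%d\n", v[e]);
    }
}
```

For at most 10 elements the nested loop is perfectly adequate; a faster approach would only matter for much larger inputs.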

Using chars into an int datatype

Using C, One array with 5 memory spaces
int call[5]
I'm trying to figure out how to use the first 3 spaces of the array to make a base-36 conversion (meaning 1K0 base-36 equals to 2040 in base-10), the other 2 spaces would be filled with data (probably more ints).
However... does 1K0 actually look like an int? (To me K looks like a char, and in theory a char, with its range of -127 to 127, should be enough for the conversion using base 36.)
What would happen if I tried to do this using int instead of char?
Is there any alternative for using a base-36 conversion in the first elements only, mixed with ints in the rest of the array?
Does it matter (since the array was declared int)?
EDIT: to be clear, I just want to know if I can declare an int array and fill it with chars, and if I can't, how can I achieve this?
I'm not exactly sure you can tell the compiler that. You can write hex (0x prefix), octal (leading 0) and, on some compilers, binary (0b prefix) literals, but base 36 is odd enough not to be standard.
You can always write a function that does the base-36 string to base-10 conversion.
I'll do my best in C terms
int base10Number = 0;
int weight = 1; // 36 raised to the digit's position, counted from the end
int base36StringLength = strlen(base36String); // needs <string.h>
for (int i = base36StringLength - 1; i >= 0; i--) { // count from the end of the string back
int c = base36String[i];
if (c <= '9') {
c -= '0'; // gives a 0 to 9 range
}
else { // assuming that the string is perfect and has no extra characters, it is safe to just do this
c = tolower(c); // first make sure it's lowercase (needs <ctype.h>)
c = c - 'a' + 10; // this makes the letters start at 10 decimal
}
base10Number += c * weight; // the weight says which 'wheel' you're turning (imagine an old analog odometer); turn it c times and add that to the overall sum
weight *= 36; // the next digit to the left is worth 36 times more
}
This whole code works on the principle that every digit, starting from the last, is worth its base-10 value multiplied by 36 to the power of its position from the last.
So the last digit is worth c * 36^0 (36^0 being one), the next one c * 36^1, and so on, similar to how 2*10^1 equals 20 when the 2 is in the second-to-last position.
Hope that some of this makes sense to you, and don't forget to make your base-36 number a string.
EDIT:
I saw your edit to the question, and the short answer is yes, you can totally do that. It would be a waste of space, though, since (with a typical 4-byte int) you'll have 3 bytes per element unused at all times; you can simply make a char array. Besides, all string functions expect a char array, and an int array holding one character per element is not laid out like one, but if you're storing one digit per array element, char will do the trick. If your array is dynamic and/or you need to do some math on it, the base-36 to base-10 conversion will let you do that math and trade the whole array for a single int or float. But if you're just going to store it to display it later, or to feed it to another function in the same format, the conversion is not necessary at all. (If you're working with a big amount of these numbers and you need to put them in a database, converting to base 10 and storing each in a single int will save tons of space.)
PS: also edited the code to use ' ' enclosed chars instead of ASCII numbers, thanks for the request!
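As a cross-check of the conversion described above, here is a compact sketch that processes the digits from left to right instead. The function name base36_to_long and the -1 error value are choices made for this example, and it assumes an ASCII-style character set in which the letters are contiguous. The standard library can also do this directly: strtol() accepts any base from 2 to 36.

```c
#include <ctype.h>
#include <stdlib.h>

/*
 * Convert a base-36 string (digits 0-9, letters A-Z in either case)
 * to a long. Returns -1 for an empty string or an invalid character.
 * Overflow is not checked in this sketch.
 */
long base36_to_long(const char *s)
{
    long value = 0;

    if (*s == '\0')
        return -1;

    for (; *s != '\0'; s++) {
        int c = (unsigned char)*s;
        int digit;

        if (isdigit(c))
            digit = c - '0';                  /* 0..9 */
        else if (isalpha(c))
            digit = toupper(c) - 'A' + 10;    /* A..Z -> 10..35 */
        else
            return -1;

        value = value * 36 + digit;           /* shift left one base-36 place */
    }
    return value;
}

/* The same conversion using the standard library. */
long base36_with_strtol(const char *s)
{
    return strtol(s, NULL, 36);
}
```

Note that "1K0" with a zero as the last digit is 1*1296 + 20*36 + 0 = 2016, while "1KO" with the letter O is 2040, so the 2040 in the question appears to correspond to the letter O.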

Ncurses - why does printing a string using mvprintw() not work when multiple arguments are present in the function call?

I've just started using ncurses. An array validity defined as
static const char* validity[] = {
"invalid",
"valid"
};
allows me to map the 0 to invalid and the 1 to valid where necessary in order to make a more readable output in my application.
I also have the following lines in my code:
if(data->pos != NULL) {
mvprintw(4, 0,
"Position: X %5d Y %5d Z %5d \n",
data->pos->x, data->pos->y, data->pos->z);
// ...
}
This allows me to output a 3D coordinate that is stored inside a struct data, which is updated multiple times every second and filled with new coordinates. It works without any issues.
Next, I use mvprintw(...) to output a single string argument:
if(data->detected != NULL) {
mvprintw(5, 0,
"Detected: %s\n",
booleans[data->detected]);
}
where booleans is very similar to the validity array but with true and false strings inside to map boolean values to strings. This works too!
However the following code outputs a strange result:
if(data->validity_check != NULL) {
mvprintw(11, 0,
"Validity: %d [%s]\n",
data->validity_check->timestamp,
validity[data->validity_check->valid]);
}
validity_check is just another struct that contains a timestamp (as long integer) and a valid flag which can be either 0 or 1.
The output should look for example like
Validity: 123456789 [invalid]
but instead I get
Validity: 123456789 [(null)]
I was quite surprised by this and made an output using printf() to see if valid actually contained valid data or not (that's the problem with C-arrays - no check for out of bounds). The result: it did. The values were jumping between 0 and 1 as expected. For some reason however the second argument appeared to be "broken". I even removed the retrieval from the array and used a hard-coded argument value but still no change.
Further investigation made me rewrite this part of the code like this:
if(data->validity_check != NULL) {
mvprintw(11, 0,
"Validity: %d",
data->validity_check->timestamp);
printw(" [%s]\n", validity[data->validity_check->valid]);
}
It worked without any issues and I got the desired result. However, I have no idea what I'm doing wrong here. I have looked at the documentation multiple times, but I'm probably missing something, since I can't find anything that can explain this behaviour.
You write:
validity_check is just another struct that contains a timestamp (as long integer) and a valid flag which can be either 0 or 1.
If timestamp is a long int, you should use the %ld format instead of just %d:
if (data->validity_check != NULL) {
mvprintw(11, 0,
"Validity: %ld [%s]\n",
data->validity_check->timestamp,
validity[data->validity_check->valid]);
}
The printf family of functions are variadic functions, and the format string must tell them exactly which type each argument has. (This is different from normal functions, where the types of the parameters are known and arguments get narrowed or promoted as appropriate.)
That's a common type of error, but enabling warnings should tell you about such format mismatches, at least in GCC and Clang.
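The effect of the format specifier can be tried without ncurses, since mvprintw() takes printf-style formats. The sketch below uses snprintf() instead; the helper name format_validity is made up for the example. Compiling with -Wall on GCC or Clang turns on the format checks mentioned above.

```c
#include <stddef.h>
#include <stdio.h>

static const char *validity[] = {
    "invalid",
    "valid"
};

/*
 * Format a long timestamp and a 0/1 flag into buf.
 * %ld matches the long argument; using %d here would be a mismatch.
 * Returns what snprintf returns (the length of the full text).
 */
int format_validity(char *buf, size_t size, long timestamp, int valid)
{
    /* valid != 0 keeps the index inside the two-element array */
    return snprintf(buf, size, "Validity: %ld [%s]",
                    timestamp, validity[valid != 0]);
}
```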

Variable length arithmetic calculator WITHOUT using strings?

I'm trying to create a calculator that solves arithmetic expressions of different lengths (e.g. 2+3/4 or 7*8/2+12-14), and I was wondering if it was possible to do so without the use of strings.
I've found countless tutorials explaining how to make a simple calculator with only two numbers and an operator, and I've also found examples using sscanf and strings to get the input.
However, my question is: Is there a way (is it even possible) to get variable length inputs without using strings?
At first I thought I could simply add more specifiers:
int num1 , num2, num3;
char op1, op2;
printf("Please enter your equation to evaluate: ");
scanf("%d%c%d%c%d", &num1, &op1, &num2, &op2, &num3);
but obviously, that doesn't work for equations longer than 3 numbers or less than 3 numbers.
I was also thinking of perhaps using some sort of recursive function, but I'm not sure how I would do that if I need to ask for the entire equation up front?
If you intend to read ASCII from user input, or from command line arguments, then you're pretty inescapably in the world of strings. What you can do is convert them into something else as early as possible.
You could abandon ASCII altogether and define a binary file format.
For example, you might say that each pair of bytes is a token. The first byte is an element type (signed integer, unsigned integer, float, operator), the second byte is the value.
Pseudocode:
while(!done) {
int type = read(f);
int value = read(f);
switch(type) {
case TYPE_INTEGER:
push_to_stack(value);
break;
case TYPE_FLOAT:
push_to_stack_as_float(value);
break;
case TYPE_OPERATOR:
execute_operator(value);
break;
}
}
Quite why you would force yourself down this route, I don't know. You'd probably find yourself wanting to write a program to convert ASCII input into your binary file format; which would use strings. So why did you run away from strings in the first place?
You can create a list of structs. Every struct will have to contain a value OR a sub-list, an operator (char?) and a reference to the next (and/or previous) element.
Then you just ask the user for a number (or "("/")") and an operator sign. Every number + operator is a new element in the list, every ( starts a sub-list, and every ) returns to the parent list. (You don't even have to create a sub-list; you can evaluate it on the fly and return just the result, like a recursive function.)
The struct and the code can also be extended to support multiple parameters.
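One way to picture the "evaluate it on the fly" idea is a small recursive-descent reader that consumes the input one character at a time and never stores the expression in a char array. This is a sketch under several assumptions: all function names are invented for the example, only non-negative integers and + - * / are handled (no parentheses), division is integer division, and malformed input is not diagnosed.

```c
#include <ctype.h>
#include <stdio.h>

/* Skip blanks and return the next character without consuming it. */
static int peek_char(FILE *in)
{
    int c;

    do {
        c = getc(in);
    } while (c == ' ' || c == '\t');
    if (c != EOF)
        ungetc(c, in);
    return c;
}

/* Read a non-negative decimal number digit by digit (0 if none found). */
static long read_number(FILE *in)
{
    long value = 0;
    int c;

    peek_char(in);                            /* skip leading blanks */
    while ((c = getc(in)) != EOF && isdigit(c))
        value = value * 10 + (c - '0');
    if (c != EOF)
        ungetc(c, in);                        /* give back the non-digit */
    return value;
}

/* term := number { ('*' | '/') number } */
static long read_term(FILE *in)
{
    long value = read_number(in);

    for (;;) {
        int op = peek_char(in);

        if (op != '*' && op != '/')
            return value;
        getc(in);                             /* consume the operator */
        long rhs = read_number(in);
        if (op == '*')
            value *= rhs;
        else
            value = (rhs != 0) ? value / rhs : 0;  /* placeholder for div by 0 */
    }
}

/* expression := term { ('+' | '-') term } */
long eval_stream(FILE *in)
{
    long value = read_term(in);

    for (;;) {
        int op = peek_char(in);

        if (op != '+' && op != '-')
            return value;                     /* newline, EOF or unknown char */
        getc(in);                             /* consume the operator */
        long rhs = read_term(in);
        value = (op == '+') ? value + rhs : value - rhs;
    }
}
```

Calling eval_stream(stdin) after the prompt would evaluate whatever the user types up to the end of the line. Splitting the work into a term level and an expression level is what gives * and / their higher precedence.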

memset() not setting memory in c

I apologize if my formatting is incorrect as this is my first post, I couldn't find a post on the site that dealt with the same issue I am running into. I'm using plain C on ubuntu 12.04 server. I'm trying to concatenate several strings together into a single string, separated by Ns. The string sizes and space between strings may vary, however. A struct was made to store the positional data as several integers that can be passed to multiple functions:
typedef struct pseuInts {
int pseuStartPos;
int pseuPos;
int posDiff;
int scafStartPos;
} pseuInts;
As well as a string struct:
typedef struct string {
char *str;
int len;
} myString;
Since there are break conditions for the concatenated string multiple nodes of a dynamically linked list were assembled containing an identifier and the concatenated string:
typedef struct entry {
myString title;
myString seq;
struct entry *next;
} entry;
The memset call is as follows:
} else if ((*pseuInts)->pseuPos != (*pseuInts)->scafStartPos) {
(*pseuEntry)->seq.str = realloc ((*pseuEntry)->seq.str, (((*pseuEntry)->seq.len) + (((*pseuInts)->scafStartPos) - ((*pseuInts)->pseuPos)))); //realloc the string being extended to account for the Ns
memset (((*pseuEntry)->seq.str + ((*pseuEntry)->seq.len)), 'N', (((*pseuInts)->scafStartPos) - ((*pseuInts)->pseuPos))); //insert the correct number of Ns
(*pseuEntry)->seq.len += (((*pseuInts)->scafStartPos) - ((*pseuInts)->pseuPos)); //Update the length of the now extended string
(*pseuInts)->pseuPos += (((*pseuInts)->scafStartPos) - ((*pseuInts)->pseuPos)); //update the position values
}
These are all being dereferenced as this else if decision is in a function being called by a function called from main, but the changes to the pseuEntry struct need to be updated in main so as to be passed to another function for further processing.
I've double checked the numbers being used in pseuInts by inserting some printf commands and they are correct in the positioning of how many Ns need to be added, even as they change between different short strings. However, when the program is run the memset only inserts Ns the first time it's called. IE:
GATTGT and TAATTTGACT are separated by 4 spaces and they become:
GATTGTNNNNTAATTTGACT
The second time it is called on the same concatenated string it doesn't work though. IE:
TAATTTGACT and TCTCC are separated by 6 spaces so the long string should become:
GATTGTNNNNTAATTTGACTNNNNNNTCTCC
but it only shows:
GATTGTNNNNTAATTTGACTTCTCC
I've added printfs to display the concatenated string immediately before and after the memset, and they are identical in output.
Sometimes the insertion is adding extra character spaces, but not initializing them so they print nonsense (as would be expected). IE:
GAATAAANNNNNNNNNNNNNNNNN¬GCTAATG
should be
GAATAAANNNNNNNNNNNNNNNNNGCTAATG
I've switched the memset with a for or a while loop and I get the same result. I used an intermediate char * to realloc and still get the same result. I'm looking for suggestions as to where I should look to try and detect the error.
If you are okay with considering a completely different approach, I would like to offer this:
I understand your intent to be: Replace existing spaces between two strings with an equal number of "N"s. memset() (and associated memory allocations) is the primary method to perform the concatenations.
The problems you have described with your current concatenation attempts are :
1) garbage embedded in resulting string.
2) writing "N" in some unintended memory locations.
3) "N" not being written in other intended memory locations.
Different approach:
First: verify that the memory allocated to the string being modified is sufficient to contain the results.
Second: verify that all strings to be concatenated are \0 terminated before attempting concatenation.
Third: use strcat() and a for(;;) loop to append all the "N"s and, eventually, the subsequent strings.
e.g.
for(i=0;i<numNs;i++)//compute numNs with your existing variables
{
strcat(firstStr, "N");//Note: "N" is already NUL-terminated, and strcat() also ensures NUL termination.
}
strcat(firstStr, lastStr); //a null terminated concatenation
I know this approach is vastly different from what you were doing, but it does address at least the issues identified from your problem statement. If this makes no sense, please let me know and I will address questions as I am able to. (currently have other projects going on)
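The first two checks (enough memory, and a terminator at the end) can also be folded into small helpers that keep the length-tracked string from the question. This is a sketch only: append_fill and append_text are hypothetical names, and it assumes the string is meant to be NUL-terminated at all times, which the question does not state. It is not a diagnosis of the original bug.

```c
#include <stdlib.h>
#include <string.h>

typedef struct string {
    char *str;
    int len;
} myString;

/*
 * Append count copies of fill to s, keeping s->str NUL-terminated.
 * Returns 0 on success, -1 on failure (s is left unchanged).
 */
int append_fill(myString *s, char fill, int count)
{
    if (count < 0)
        return -1;

    /* +1 leaves room for the terminating NUL */
    char *grown = realloc(s->str, (size_t)s->len + (size_t)count + 1);
    if (grown == NULL)
        return -1;

    memset(grown + s->len, fill, (size_t)count);
    s->len += count;
    grown[s->len] = '\0';
    s->str = grown;
    return 0;
}

/* Append a NUL-terminated text to s. Returns 0 on success, -1 on failure. */
int append_text(myString *s, const char *text)
{
    size_t n = strlen(text);
    char *grown = realloc(s->str, (size_t)s->len + n + 1);

    if (grown == NULL)
        return -1;

    memcpy(grown + s->len, text, n + 1);      /* copies the terminator too */
    s->len += (int)n;
    s->str = grown;
    return 0;
}
```

Starting from a myString with str set to NULL and len set to 0, appending "GATTGT", four Ns, "TAATTTGACT", six Ns and "TCTCC" in turn builds the string expected in the question. Assigning the result of realloc to a temporary first avoids losing the old buffer if the allocation fails.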
Looking at your memset:
memset (((*pseuEntry)->seq.str + ((*pseuEntry)->seq.len))), ...
That's the destination. Shouldn't it be:
(memset (((*pseuEntry)->seq.str + ((*pseuEntry)->seq.len) + ((*pseuEntry)->seq.pseuStartPos))
Otherwise I'm missing the meaning of pseuInts.
