C Xor Encryption (beginner) - c

I found this code for XOR encryption, and it works:
string encryptDecrypt(string toEncrypt)
{
char key = 'K'; //Any char will work
string output = toEncrypt;
int z = toEncrypt.size();
for (int i = 0; i < z; i++)
{
output[i] = toEncrypt[i] ^ key;
}
return output;
}
int main(int argc, const char * argv[])
{
string encrypted = encryptDecrypt("kylewbanks.com");
string decrypted = encryptDecrypt(encrypted);
return 0;
}
but when I changed it to this form:
CHAR* encryptDecrypt(CHAR* toEncrypt)
{
char key = 'K'; //Any char will work
CHAR* output = toEncrypt;
for (int i = 0; i < strlen(toEncrypt); i++)
{
output[i] = toEncrypt[i] ^ key;
}
return output;
}
int main(int argc, CHAR* argv[])
{
CHAR* encrypted = encryptDecrypt("kylewbanks.com");
CHAR* decrypted = encryptDecrypt(encrypted);
return 0;
}
The program breaks with this message: Access violation writing location 0x001CCC90
I need to use CHAR* or WCHAR* for my variables (I am not allowed to use string).
My questions are:
1- What is the problem in the second code?
2- How can I change the key to a word instead of a single character?
Thanks for the help :)
Edit:
The problem is that it breaks immediately after line 7.
It never completes this line:
output[i] = toEncrypt[i] ^ key;

1. In encryptDecrypt, output is a local pointer that points at the same string literal as toEncrypt. String literals live in read-only memory, so writing through output causes the access violation. Usually you create a buffer and pass it into encryptDecrypt. Calling malloc inside the function also works, but then the caller has to free the result. BTW, strlen is also dangerous here, because the encrypted data may contain a zero byte and so may not be a valid string.
2. To work with wide characters, use WCHAR and wcslen instead of CHAR and strlen; a key such as L'k' should work.

1- what is the problem in second code?
As far as I understand, you want to encrypt and decrypt a C string. Strings in C are character arrays terminated with a 0. You are also using the pointers incorrectly here.
While:
string output = toEncrypt;
does a copy operation of the whole string,
CHAR* output = toEncrypt;
will just create a new pointer and make it point to the same location that toEncrypt currently points at. In the end you have two pointers to the same memory, which is no different from using a single pointer.
I would suggest that you make a new buffer in that method to store the encrypted data:
CHAR* output = new CHAR[strlen(toEncrypt)];
Though this could lead to undefined behavior if the data you encrypt is not zero-terminated, as strlen keeps counting characters until it finds a 0. So I also suggest passing the length of the data to encrypt to your function:
CHAR* encryptDecrypt(CHAR* toEncrypt, int length)
Also, string literals are read-only, so a pointer to one should be declared const. Writing to a string literal is what triggers the access violation.
So a possible solution would be:
wchar_t* encryptDecrypt(const wchar_t* toEncrypt, int length)
{
wchar_t key = L'K'; //Any char will work
wchar_t* output = new wchar_t[length]; // Make a temporary buffer
for (int i = 0; i < length; i++)
{
output[i] = toEncrypt[i] ^ key; // Encrypt every char of the array to encrypt
}
return output;
}
int main(int argc, CHAR* argv[])
{
// Your Data to encrypt / decrypt
const wchar_t* sourceString = L"kylewbanks.com";
const int sourceStrLen = wcslen(sourceString); // Get the length of your data before you encrypt it, because you can't use wcslen afterwards. Alternatively, just enter the number of characters in your string here.
// Encrypt / Decrypt your String
wchar_t* encrypted = encryptDecrypt(sourceString, sourceStrLen);
wchar_t* decrypted = encryptDecrypt(encrypted, sourceStrLen);
// Free the allocated buffers
delete[] encrypted;
delete[] decrypted;
return 0;
}
Note that if you want to display this string somewhere using methods like printf(), it will probably display some memory garbage after your string as the encryption returns just the encrypted data which is not zero-terminated.
2- how can I change key to a word not a character?
There are several ways of XORing with a key string, but one of the simplest is to XOR the source string with the key string repeated over and over.
Something like:
kylewbanks.com
XOR keykeykeykeyke
__________________
RESULT
if your key-string is "key"
Possible solution:
wchar_t* encryptDecrypt(const wchar_t* toEncrypt, int length)
{
const wchar_t* key = L"KEY"; // A readonly wide String
wchar_t* output = new wchar_t[length]; // Make a temporary buffer
for (int i = 0; i < length; i++)
{
output[i] = toEncrypt[i] ^ key[i % wcslen(key)]; // i % 3 will cycle through 0, 1, 2
}
return output;
}

Related

Cannot access empty string from array of strings in C

I'm using an array of strings in C to hold arguments given to a custom shell. I initialize the array of buffers using:
char *args[MAX_CHAR];
Once I parse the arguments, I send them to the following function to determine the type of IO redirection if there are any (this is just the first of 3 functions to check for redirection and it only checks for STDIN redirection).
int parseInputFile(char **args, char *inputFilePath) {
char *inputSymbol = "<";
int isFound = 0;
for (int i = 0; i < MAX_ARG; i++) {
if (strlen(args[i]) == 0) {
isFound = 0;
break;
}
if ((strcmp(args[i], inputSymbol)) == 0) {
strcpy(inputFilePath, args[i+1]);
isFound = 1;
break;
}
}
return isFound;
}
Once I compile and run the shell, it crashes with a SIGSEGV. Using GDB I determined that the shell is crashing on the following line:
if (strlen(args[i]) == 0) {
This is because the address of args[i] (the first empty string after the parsed commands) is inaccessible. Here is the error from GDB and all relevant variables:
(gdb) next
359 if (strlen(args[i]) == 0) {
(gdb) p args[0]
$1 = 0x7fffffffe570 "echo"
(gdb) p args[1]
$2 = 0x7fffffffe575 "test"
(gdb) p args[2]
$3 = 0x0
(gdb) p i
$4 = 2
(gdb) next
Program received signal SIGSEGV, Segmentation fault.
parseInputFile (args=0x7fffffffd570, inputFilePath=0x7fffffffd240 "") at shell.c:359
359 if (strlen(args[i]) == 0) {
I believe that the p args[2] returning $3 = 0x0 means that because the index has yet to be written to, it is mapped to address 0x0 which is out of the bounds of execution. Although I can't figure out why this is because it was declared as a buffer. Any suggestions on how to solve this problem?
EDIT: Per Kaylum's comment, here is a minimal reproducible example
#include<stdio.h>
#include<string.h>
#include<stdlib.h>
#include<unistd.h>
#include<sys/types.h>
#include<sys/wait.h>
#include <sys/stat.h>
#include<readline/readline.h>
#include<readline/history.h>
#include <fcntl.h>
// Defined values
#define MAX_CHAR 256
#define MAX_ARG 64
#define clear() printf("\033[H\033[J") // Clear window
#define DEFAULT_PROMPT_SUFFIX "> "
char PROMPT[MAX_CHAR], SPATH[1024];
int parseInputFile(char **args, char *inputFilePath) {
char *inputSymbol = "<";
int isFound = 0;
for (int i = 0; i < MAX_ARG; i++) {
if (strlen(args[i]) == 0) {
isFound = 0;
break;
}
if ((strcmp(args[i], inputSymbol)) == 0) {
strcpy(inputFilePath, args[i+1]);
isFound = 1;
break;
}
}
return isFound;
}
int ioRedirectHandler(char **args) {
char inputFilePath[MAX_CHAR] = "";
// Check if any redirects exist
if (parseInputFile(args, inputFilePath)) {
return 1;
} else {
return 0;
}
}
void parseArgs(char *cmd, char **cmdArgs) {
int na;
// Separate each argument of a command to a separate string
for (na = 0; na < MAX_ARG; na++) {
cmdArgs[na] = strsep(&cmd, " ");
if (cmdArgs[na] == NULL) {
break;
}
if (strlen(cmdArgs[na]) == 0) {
na--;
}
}
}
int processInput(char* input, char **args, char **pipedArgs) {
// Parse the single command and args
parseArgs(input, args);
return 0;
}
int getInput(char *input) {
char *buf, loc_prompt[MAX_CHAR] = "\n";
strcat(loc_prompt, PROMPT);
buf = readline(loc_prompt);
if (strlen(buf) != 0) {
add_history(buf);
strcpy(input, buf);
return 0;
} else {
return 1;
}
}
void init() {
char *uname;
clear();
uname = getenv("USER");
printf("\n\n \t\tWelcome to Student Shell, %s! \n\n", uname);
// Initialize the prompt
snprintf(PROMPT, MAX_CHAR, "%s%s", uname, DEFAULT_PROMPT_SUFFIX);
}
int main() {
char input[MAX_CHAR];
char *args[MAX_CHAR], *pipedArgs[MAX_CHAR];
int isPiped = 0, isIORedir = 0;
init();
while(1) {
// Get the user input
if (getInput(input)) {
continue;
}
isPiped = processInput(input, args, pipedArgs);
isIORedir = ioRedirectHandler(args);
}
return 0;
}
Note: If I forgot to include any important information, please let me know and I can get it updated.
When you write
char *args[MAX_CHAR];
you allocate room for MAX_CHAR pointers to char. You do not initialise the array. If it were a global variable, all the pointers would be initialised to NULL, but it is declared inside a function, so the elements in the array can point anywhere. You should not dereference them before you have set the pointers to point at something you are allowed to access.
You do set them, though, in parseArgs(), with this line:
cmdArgs[na] = strsep(&cmd, " ");
There are two potential issues here, but let's deal with the one you hit first. When strsep() is through the tokens you are splitting, it returns NULL. You test for that to get out of parseArgs() so you already know this. However, where your program crashes you seem to have forgotten this again. You call strlen() on a NULL pointer, and that is a no-no.
There is a difference between NULL and the empty string. An empty string is a pointer to a buffer that has the zero char first; the string "" is a pointer to a location that holds the character '\0'. The NULL pointer is a special value for pointers, often address zero, that means that the pointer doesn't point anywhere. Obviously, the NULL pointer cannot point to an empty string. You need to check if an argument is NULL, not if it is the empty string.
If you want to check both for NULL and the empty string, you could do something like
if (!args[i] || strlen(args[i]) == 0) {
If args[i] is NULL then !args[i] is true, so you will enter the if body if you have NULL or if you have a pointer to an empty string.
(You could also check the empty string with !(*args[i]); *args[i] is the first character that args[i] points at. So *args[i] is zero if you have the empty string; zero is interpreted as false, so !(*args[i]) is true if and only if args[i] is the empty string. Not that this is more readable, but it shows again the difference between empty strings and NULL).
I mentioned another issue with the parsed arguments. Whether it is a problem or not depends on the application. But when you parse a string with strsep(), you get pointers into the parsed string. You have to be careful not to free that string (it is input in your main() function) or to modify it after you have parsed the string. If you change the string, you have changed what all the parsed strings look at. You do not do this in your program, so it isn't a problem here, but it is worth keeping in mind. If you want your parsed arguments to survive longer than they do now, after the next command is passed, you need to copy them. The next command that is passed will change them as it is now.
In main
char input[MAX_CHAR];
char *args[MAX_CHAR], *pipedArgs[MAX_CHAR];
are all uninitialized. They contain indeterminate values. This could be a potential source of bugs, but is not the reason here, as
getInput modifies the contents of input to be a valid string before any reads occur.
pipedArgs is unused, so raises no issues (yet).
args is modified by parseArgs to (possibly!) contain a NULL sentinel value, without any indeterminate pointers being read first.
Firstly, in parseArgs it is possible to completely fill args without setting the NULL sentinel value that other parts of the program should rely on.
Looking deeper, in parseInputFile the following
if (strlen(args[i]) == 0)
contradicts the limits imposed by parseArgs that disallows empty strings in the array. More importantly, args[i] may be the sentinel NULL value, and strlen expects a non-NULL pointer to a valid string.
This termination condition should simply check if args[i] is NULL.
With
strcpy(inputFilePath, args[i+1]);
args[i+1] might also be the NULL sentinel value, and strcpy also expects non-NULL pointers to valid strings. You can see this in action when inputSymbol is a match for the final token in the array.
args[i+1] may also evaluate as args[MAX_ARG], which is beyond the range of slots that parseArgs fills.
Additionally, inputFilePath has a string length limit of MAX_CHAR - 1, and args[i+1] is (possibly!) a dynamically allocated string whose length might exceed this.
Some edge cases found in getInput:
Both arguments to
strcat(loc_prompt, PROMPT);
are of the size MAX_CHAR. loc_prompt initially has a length of 1, so if PROMPT has the length MAX_CHAR - 1, the resulting string will have the length MAX_CHAR. This would leave no room for the NUL terminating byte.
readline can return NULL in some situations, so
buf = readline(loc_prompt);
if (strlen(buf) != 0) {
can again pass the NULL pointer to strlen.
A similar issue as before, on success readline returns a string of dynamic length, and
strcpy(input, buf);
can cause a buffer overflow by attempting to copy a string greater in length than MAX_CHAR - 1.
buf is a pointer to data allocated by malloc. It's unclear what add_history does, but this pointer must eventually be passed to free.
Some considerations.
Firstly, it is a good habit to initialize your data, even if it might not matter.
Secondly, using constants (#define MAX_CHAR 256) might help to reduce magic numbers, but they can lead you to design your program too rigidly if used in the same way.
Consider building your functions to accept a limit as an argument, and return a length. This allows you to more strictly track the sizes of your data, and prevents you from always designing around the maximum potential case.
A slightly contrived example of designing like this. We can see that find does not have to concern itself with possibly checking MAX_ARGS elements, as it is told precisely how long the list of valid elements is.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX_ARGS 100
char *get_input(char *dest, size_t sz, const char *display) {
char *res;
if (display)
printf("%s", display);
if ((res = fgets(dest, sz, stdin)))
dest[strcspn(dest, "\n")] = '\0';
return res;
}
size_t find(char **list, size_t length, const char *str) {
for (size_t i = 0; i < length; i++)
if (strcmp(list[i], str) == 0)
return i;
return length;
}
size_t split(char **list, size_t limit, char *source, const char *delim) {
size_t length = 0;
char *token;
while (length < limit && (token = strsep(&source, delim)))
if (*token)
list[length++] = token;
return length;
}
int main(void) {
char input[512] = { 0 };
char *args[MAX_ARGS] = { 0 };
puts("Welcome to the shell.");
while (1) {
if (get_input(input, sizeof input, "$ ")) {
size_t argl = split(args, MAX_ARGS, input, " ");
size_t redirection = find(args, argl, "<");
puts("Command parts:");
for (size_t i = 0; i < redirection; i++)
printf("%zu: %s\n", i, args[i]);
puts("Input files:");
if (redirection == argl)
puts("[[NONE]]");
else for (size_t i = redirection + 1; i < argl; i++)
printf("%zu: %s\n", i, args[i]);
}
}
}

Size problem while decoding ciphered text in C

[EDIT]
This is the ciphered text that needs to be decoded:
bURCUE}__V|UBBQVT
I have a decoder that successfully reads the ciphered text but only decodes it up to a certain point. The rest of the decoded message is gibberish. I checked the size of the buffer and the char pointer; both seem correct, and I couldn't find the flaw.
Message I expect to see:
SecretLongMessage
Decrypted message on the screen looks like this:
SecretLong|drs`fe
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define BUZZ_SIZE 1024
char* encryptDecrypt(const char* toEncrypt, int length)
{
char key[] = "1011011011";
char* output = malloc(length + 1);
output[length] = '\0'; //buffer
for (int i = 0; i < length; i++)
{
output[i] = toEncrypt[i] ^ key[i % (sizeof(key)/sizeof(char))];
}
return output;
}
int main(int argc, char* argv[])
{
char buff[BUZZ_SIZE];
FILE *f;
f = fopen("C:\\Users\\Dell\\source\\repos\\XOR\\XOR\\bin\\Debug\\cipher.txt", "r"); // read mode
fgets(buff, BUZZ_SIZE, f);
printf("Ciphered text: %s, size = %d\n", buff,sizeof(buff));
fclose(f);
char* sourceString = buff;
//Decrypt
size_t size = strlen(sourceString);
char* decrypted = encryptDecrypt(buff, size);
//printf("\nsize = %d\n",size);
printf("\nDecrypted is: ");
printf(decrypted);
// Free the allocated buffers
return 0;
}
Here is my C# code that gives cipher
String szEncryptionKey = "1011011011";
public Form1()
{
InitializeComponent();
}
string EncryptOrDecrypt(string text, string key)
{
var result = new StringBuilder();
for (int c = 0; c < text.Length; c++)
{
// take next character from string
char character = text[c];
// cast to a uint
uint charCode = (uint)character;
// figure out which character to take from the key
int keyPosition = c % key.Length; // use modulo to "wrap round"
// take the key character
char keyChar = key[keyPosition];
// cast it to a uint also
uint keyCode = (uint)keyChar;
// perform XOR on the two character codes
uint combinedCode = charCode ^ keyCode;
// cast back to a char
char combinedChar = (char)combinedCode;
// add to the result
result.Append(combinedChar);
}
return result.ToString();
}
private void Button1_Click(object sender, EventArgs e)
{
String str = textBox1.Text;
var cipher = EncryptOrDecrypt(str, szEncryptionKey);
System.IO.File.WriteAllText(@"C:\\Users\\Dell\\source\\repos\\XOR\\XOR\\bin\\Debug\\cipher.txt", cipher);
}
You want to use all characters from
char key[] = "1011011011";
for your encryption. But the array key includes a terminating '\0' which is included in the calculation when you use
key[i % (sizeof(key)/sizeof(char))]
because sizeof(key) includes the terminating '\0'.
You could either use strlen to calculate the string length, or use key[i % (sizeof(key)/sizeof(char) - 1)], or initialize the array as
char key[] = {'1', '0', '1', '1', '0', '1', '1', '0', '1', '1' };
to omit the terminating '\0'. In the latter case you can use sizeof to calculate the key index as in your original code.
After the C# code was added to the question it is clear that the encryption doesn't include a '\0' in the key. key.Length is comparable to strlen(key) in C, not sizeof(key).
BTW: The variable name String szEncryptionKey = "1011011011"; in C# is misleading because it is not a zero terminated string as it would be in C.
Note: strlen(key) is the same as sizeof(key)-1 in your case because you don't specify the array size and initialize the array to a string. It might not be the same in other cases.

C - strtok_s - Access violation reading location

I can't understand why I get the following message for my function below (in Visual Studio 2015).
0xC0000005: Access violation reading location 0x0000002C.
I have read this answer but it does not help me.
What this code is about.
There is a string of ints separated in groups of "index,value" pairs. Indexes are unique. Each group is separated by a semi-colon. Example: 1,2;3,5;2,2;3,4
I am trying to get an array of int with each value at its index.
My code so far extracts the strings and puts it into a char* buffer.
Then I separate the groups of "index,value" by the semi-colons and store them in char** arrayKeyValue, which is a member of struct inputElement. The other struct member is an int representing the number of "index,value" groups in the array. I do this with the function "separateStringBySemicolon".
Then I try to separate each group of "index,value" into a new array, where at each "index" will match its "value". I do this by passing my struct to the function "separateKeyValue". I use strtok_s but I get an error.
The first call to the function below (token2 = strtok_s(arrayOfKeyValue[j], sepComma, &next_token2);) brings the error. I understand that token2 or next_token2 cannot be accessed, but I am not sure. And if so, why?
double* separateKeyValue(struct inputElement* inputElement)
{
int count = inputElement->length;
char** arrayOfKeyValue = inputElement->data;
double* arrayDecimal = malloc(count * sizeof(double));
char sepComma = ','; //wrong should be char sepComma[] = ",";
char* token2 = NULL;
char* next_token2;
printf("Value in arrayofkeyvalue: %s", arrayOfKeyValue[0]);
for (size_t j = 0; j < count; j++)
{
token2 = strtok_s(arrayOfKeyValue[j], sepComma, &next_token2);
unsigned int index;
sscanf_s(token2, "%d", &index);
double value;
sscanf_s(next_token2, "%d", &value);
arrayDecimal[index] = value;
printf("res[%d] = %d\n", index, arrayDecimal[index]);
printf("\n");
}
return arrayDecimal;
}
You are specifying a char constant, sepComma, as the second parameter to strtok_s, where it expects a string of delimiter characters.
(not so) Coincidentally, the ASCII value of ',' is 0x2C.

When I try to get a number from a string of characters, the function always returns zero

EDIT: So it looks like the problem is that the string that getNum is supposed to convert to a float is not actually a string containing all the characters of the token. Instead it contains the character immediately following the token, which is usually not a digit, so atof converts it to 0. I'm not sure why this behavior is occurring.
I'm working on a scanner + parser that evaluates arithmetic expressions. I am trying to implement a method that gets a token (stored as a string) which is a number and turns it into a float, but it always returns 0 no matter what the token is.
I was given the code for a get_character function, which I am not sure is correct. I'm having a little trouble parsing what's going on with it though, so I'm not sure:
int get_character(location_t *loc)
{
int rtn;
if (loc->column >= loc->line->length) {
return 0;
}
rtn = loc->line->data[loc->column++];
if (loc->column >= loc->line->length && loc->line->next) {
loc->line = loc->line->next;
loc->column = 0;
}
return rtn;
}
I used it in my getNum() function assuming it was correct. It is as follows:
static float getNum(){
char* tokenstr;
tokenstr = malloc(tok.length * sizeof(char));
int j;
for(j = 0; j < tok.length; j++){
tokenstr[j] = get_character(&loc);
}
match(T_LITERAL); /*match checks if the given token class is the same as the token
class of the token currently being parsed. It then moves the
parser to the next token.*/
printf("%f\n", atof(tokenstr));
return atof(tokenstr);
}
Below is some additional information that is required to understand what is going on in the above functions. These are details about some struct files which organize the input data.
In order to store and find tokens, three types of structs are used. A line_t struct, a location_t struct, and a token_t struct. The code for these are posted, but to summarize:
Lines contain an array of characters (the input from that line of the
input file), an int for the length of the line, an int that is the
line number as a form of identification, and a pointer to the next
line of input that was read into memory.
Locations contain a pointer to a specific line, and an int that
specifies a specific "column" of the line.
Tokens contain an int for the length of the token, a location describing where the token begins, and token class describing what kind of token it is for the parser.
Code for these structs:
typedef struct line {
char * data;
int line_num;
int length; /* number of non-NUL characters == index of trailing NUL */
struct line * next;
} line_t;
typedef struct {
line_t *line;
int column;
} location_t;
typedef struct {
token_class tc;
location_t location;
int length; /* length of token in characters (may span lines) */
} token_t;
It appears that the default behavior intended is to extract a character and then advance to the next prior to returning.
Yet if the line length is exceeded (or the column value isn't initialized to less than the line length), the function will not advance.
Try this:
if (loc->column >= loc->line->length) {
loc->line = loc->line->next;
loc->column = 0;
return 0;
}
And make sure that the column location is properly initialized.
Personally, I would change the function to this:
int get_character(location_t *loc)
{
int rtn = 0;
if (loc->column < loc->line->length) {
rtn = loc->line->data[loc->column++];
}
if (loc->column >= loc->line->length && loc->line->next) {
loc->line = loc->line->next;
loc->column = 0;
}
return rtn;
}
I'd also use unsigned values for the column and length, just to avoid the possibility of negative array indices.
I see a number of potential problems with this code:
char* tokenstr;
tokenstr = malloc(tok.length * sizeof(char));
int j;
for(j = 0; j < tok.length; j++){
tokenstr[j] = get_character(&loc);
}
match(T_LITERAL); /*match checks if the given token class is the same as the token
class of the token currently being parsed. It then moves the
parser to the next token.*/
printf("%f\n", atof(tokenstr));
return atof(tokenstr);
You create space for a new token string tokenstr and copy into it, but you don't null-terminate it afterwards, nor is enough space allocated for the token plus the string terminator \0. There is also a memory leak at the end, as tokenstr isn't freed. I might consider a change to something like:
char* tokenstr;
float floatVal;
/* Make sure we have enough space including \0 to terminate string */
tokenstr = malloc((tok.length + 1) * sizeof(char));
int j;
for(j = 0; j < tok.length; j++){
tokenstr[j] = get_character(&loc);
}
/* add the end of string \0 character */
tokenstr[j] = '\0';
match(T_LITERAL); /*match checks if the given token class is the same as the token
class of the token currently being parsed. It then moves the
parser to the next token.*/
floatVal = atof(tokenstr);
/* Free up the `malloc`ed tokenstr as it is no longer needed */
free(tokenstr);
printf("%f\n", floatVal);
return floatVal;

Extracting key=value with scanf in C

I need to extract a value for a given key from a string. I made this quick attempt:
char js[] = "some preceding text with\n"
"new lines and spaces\n"
"param_1=123\n"
"param_2=321\n"
"param_3=string\n"
"param_2=321\n";
char* param_name = "param_2";
char *key_s, *val_s;
char buf[32];
key_s = strstr(js, param_name);
if (key_s == NULL)
return 0;
val_s = strchr(key_s, '=');
if (val_s == NULL)
return 0;
sscanf(val_s + 1, "%31s", buf);
printf("'%s'\n", buf);
And it in fact works ok (printf gives '321'). But I suppose the scanf/sscanf would make this task even easier but I have not managed to figure out the formatting string for that.
Is it possible to pass the content of the variable param_name into sscanf so that it is evaluated as part of the format string? In other words, I need to instruct sscanf that in this case it should look for the pattern param_2=%s (the param_name in fact comes from a function argument).
Not directly, no.
In practice, there's of course nothing stopping you from building the format string for sscanf() at runtime, with e.g. snprintf().
Something like:
void print_value(const char **js, size_t num_js, const char *key)
{
char tmp[32], value[32];
snprintf(tmp, sizeof tmp, "%s=%%31s", key);
for(size_t i = 0; i < num_js; ++i)
{
if(sscanf(js[i], tmp, value) == 1)
{
printf("found '%s'\n", value);
break;
}
}
}
OP has a good first step:
char *key_s = strstr(js, param_name);
if (key_s == NULL)
return 0;
The rest may be simplified to
if (sscanf(&key_s[strlen(param_name)], "=%31s", buf) != 1) {
return 0;
}
printf("'%s'\n", buf);
Alternatively one could use " =%31s" to allow spaces before =.
OP's approach gets fooled by "param_2 321\n" "param_3=string\n".
Note: a weakness of all answers so far is that they do not parse an empty value.
One issue that bears consideration is the difference between finding a 'key=value' setting in the string for a specific key value (such as param_2 in the question), and finding any 'key=value' setting in the string (with no specific key in mind a priori). The techniques to be used are rather different.
Another issue that has not self-evidently been considered is the possibility that you're looking for a key param_2 but the string also contains param_22=xyz and t_param_2=abc. The simple-minded approaches using strstr() to hunt for param_2 will pick up either of those alternatives.
In the sample data, there is a collection of characters that are not in the 'key=value' format to be skipped before the any 'key=value' parts. In the general case, we should assume that such data appears before, in between, and after the 'key=value' pairs. It appears that the values do not need to support complications such as quoted strings and metacharacters, and the value is delimited by white space. There is no comment convention visible.
Here's some workable code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
enum { MAX_KEY_LEN = 31 };
enum { MAX_VAL_LEN = 63 };
int find_any_key_value(const char *str, char *key, char *value);
int find_key_value(const char *str, const char *key, char *value);
int find_any_key_value(const char *str, char *key, char *value)
{
char junk[256];
const char *search = str;
while (*search != '\0')
{
int offset;
if (sscanf(search, " %31[a-zA-Z_0-9]=%63s%n", key, value, &offset) == 2)
return(search + offset - str);
int rc;
if ((rc = sscanf(search, "%255s%n", junk, &offset)) != 1)
return EOF;
search += offset;
}
return EOF;
}
int find_key_value(const char *str, const char *key, char *value)
{
char found[MAX_KEY_LEN + 1];
int offset;
const char *search = str;
while ((offset = find_any_key_value(search, found, value)) > 0)
{
if (strcmp(found, key) == 0)
return(search + offset - str);
search += offset;
}
return offset;
}
int main(void)
{
char js[] = "some preceding text with\n"
"new lines and spaces\n"
"param_1=123\n"
"param_2=321\n"
"param_3=string\n"
"param_4=param_2=confusion\n"
"m= x\n"
"param_2=987\n";
const char p2_key[] = "param_2";
int offset;
const char *str;
char key[MAX_KEY_LEN + 1];
char value[MAX_VAL_LEN + 1];
printf("String being scanned is:\n[[%s]]\n", js);
str = js;
while ((offset = find_any_key_value(str, key, value)) > 0)
{
printf("Any found key = [%s] value = [%s]\n", key, value);
str += offset;
}
str = js;
while ((offset = find_key_value(str, p2_key, value)) > 0)
{
printf("Found key %s with value = [%s]\n", p2_key, value);
str += offset;
}
return 0;
}
Sample output:
$ ./so24490410
String being scanned is:
[[some preceding text with
new lines and spaces
param_1=123
param_2=321
param_3=string
param_4=param_2=confusion
m= x
param_2=987
]]
Any found key = [param_1] value = [123]
Any found key = [param_2] value = [321]
Any found key = [param_3] value = [string]
Any found key = [param_4] value = [param_2=confusion]
Any found key = [m] value = [x]
Any found key = [param_2] value = [987]
Found key param_2 with value = [321]
Found key param_2 with value = [987]
$
If you need to handle different key or value lengths, you need to adjust the format strings as well as the enumerations. If you pass the size of the key buffer and the size of the value buffer to the functions, then you need to use snprintf() to create the format strings used by sscanf(). There is an outside chance that you might have a single 'word' of 255 characters followed immediately by the target 'key=value' string. The chances are ridiculously small, but you might decide you need to worry about that (it prevents this code being bomb-proof).

Resources