Extracting key=value with scanf in C

I need to extract a value for a given key from a string. I made this quick attempt:
char js[] = "some preceding text with\n"
"new lines and spaces\n"
"param_1=123\n"
"param_2=321\n"
"param_3=string\n"
"param_2=321\n";
char* param_name = "param_2";
char *key_s, *val_s;
char buf[32];
key_s = strstr(js, param_name);
if (key_s == NULL)
return 0;
val_s = strchr(key_s, '=');
if (val_s == NULL)
return 0;
sscanf(val_s + 1, "%31s", buf);
printf("'%s'\n", buf);
It does in fact work (printf gives '321'). I suspect scanf/sscanf could make this task even easier, but I have not managed to figure out the format string for that.
Is it possible to pass the contents of the variable param_name into sscanf so that it is treated as part of the format string? In other words, I need to tell sscanf that in this case it should look for the pattern param_2=%s (param_name in fact comes from a function argument).

Not directly, no.
In practice, there's of course nothing stopping you from building the format string for sscanf() at runtime, with e.g. snprintf().
Something like:
void print_value(const char **js, size_t num_js, const char *key)
{
char tmp[32], value[32];
snprintf(tmp, sizeof tmp, "%s=%%31s", key);
for(size_t i = 0; i < num_js; ++i)
{
if(sscanf(js[i], tmp, value) == 1)
{
printf("found '%s'\n", value);
break;
}
}
}

OP has a good first step:
char *key_s = strstr(js, param_name);
if (key_s == NULL)
return 0;
The rest may be simplified to
if (sscanf(&key_s[strlen(param_name)], "=%31s", buf) != 1) { /* also catches EOF */
return 0;
}
printf("'%s'\n", buf);
Alternatively one could use " =%31s" to allow spaces before =.
OP's approach gets fooled by "param_2 321\n" "param_3=string\n".
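Note: a weakness of all the answers so far is that none of them handles an empty value such as "param_2=\n", because %s skips leading white space, including the newline.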
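The empty-value case mentioned in the note (for example "param_2=" followed directly by a newline) can be covered by measuring the value with strcspn() instead of reading it with %s. The following is only a sketch: get_value is a hypothetical helper name, it assumes a value ends at the first white-space character, and it does not guard against the key appearing as the tail of a longer name.

```c
#include <string.h>

/* Hypothetical helper: copy the value of the first "key=" found in text
 * into buf. Returns 1 if the key was found (the value may be empty) and
 * 0 otherwise. Assumes a value ends at the first white-space character. */
int get_value(const char *text, const char *key, char *buf, size_t bufsize)
{
    size_t klen = strlen(key);
    const char *p = text;

    if (bufsize == 0 || klen == 0)
        return 0;
    while ((p = strstr(p, key)) != NULL) {
        if (p[klen] == '=') {
            const char *val = p + klen + 1;
            size_t vlen = strcspn(val, " \t\r\n");  /* 0 for an empty value */
            if (vlen >= bufsize)
                vlen = bufsize - 1;                 /* truncate to fit */
            memcpy(buf, val, vlen);
            buf[vlen] = '\0';
            return 1;
        }
        p++;  /* key text not followed by '=': keep looking */
    }
    return 0;
}
```

With this approach "param_2 321\n" is skipped because the key is not followed by '=', and "param_2=\n" yields an empty string rather than the next line.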

One issue that bears consideration is the difference between finding a 'key=value' setting in the string for a specific key value (such as param_2 in the question), and finding any 'key=value' setting in the string (with no specific key in mind a priori). The techniques to be used are rather different.
Another issue that has not self-evidently been considered is the possibility that you're looking for a key param_2 but the string also contains param_22=xyz and t_param_2=abc. The simple-minded approaches using strstr() to hunt for param_2 will pick up either of those alternatives.
In the sample data, there is a collection of characters that are not in the 'key=value' format and must be skipped before any 'key=value' parts. In the general case, we should assume that such data appears before, in between, and after the 'key=value' pairs. It appears that the values do not need to support complications such as quoted strings and metacharacters, and the value is delimited by white space. There is no comment convention visible.
Here's some workable code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
enum { MAX_KEY_LEN = 31 };
enum { MAX_VAL_LEN = 63 };
int find_any_key_value(const char *str, char *key, char *value);
int find_key_value(const char *str, const char *key, char *value);
int find_any_key_value(const char *str, char *key, char *value)
{
char junk[256];
const char *search = str;
while (*search != '\0')
{
int offset;
if (sscanf(search, " %31[a-zA-Z_0-9]=%63s%n", key, value, &offset) == 2)
return(search + offset - str);
int rc;
if ((rc = sscanf(search, "%255s%n", junk, &offset)) != 1)
return EOF;
search += offset;
}
return EOF;
}
int find_key_value(const char *str, const char *key, char *value)
{
char found[MAX_KEY_LEN + 1];
int offset;
const char *search = str;
while ((offset = find_any_key_value(search, found, value)) > 0)
{
if (strcmp(found, key) == 0)
return(search + offset - str);
search += offset;
}
return offset;
}
int main(void)
{
char js[] = "some preceding text with\n"
"new lines and spaces\n"
"param_1=123\n"
"param_2=321\n"
"param_3=string\n"
"param_4=param_2=confusion\n"
"m= x\n"
"param_2=987\n";
const char p2_key[] = "param_2";
int offset;
const char *str;
char key[MAX_KEY_LEN + 1];
char value[MAX_VAL_LEN + 1];
printf("String being scanned is:\n[[%s]]\n", js);
str = js;
while ((offset = find_any_key_value(str, key, value)) > 0)
{
printf("Any found key = [%s] value = [%s]\n", key, value);
str += offset;
}
str = js;
while ((offset = find_key_value(str, p2_key, value)) > 0)
{
printf("Found key %s with value = [%s]\n", p2_key, value);
str += offset;
}
return 0;
}
Sample output:
$ ./so24490410
String being scanned is:
[[some preceding text with
new lines and spaces
param_1=123
param_2=321
param_3=string
param_4=param_2=confusion
m= x
param_2=987
]]
Any found key = [param_1] value = [123]
Any found key = [param_2] value = [321]
Any found key = [param_3] value = [string]
Any found key = [param_4] value = [param_2=confusion]
Any found key = [m] value = [x]
Any found key = [param_2] value = [987]
Found key param_2 with value = [321]
Found key param_2 with value = [987]
$
If you need to handle different key or value lengths, you need to adjust the format strings as well as the enumerations. If you pass the size of the key buffer and the size of the value buffer to the functions, then you need to use snprintf() to create the format strings used by sscanf(). There is an outside chance that you might have a single 'word' of 255 characters followed immediately by the target 'key=value' string. The chances are ridiculously small, but you might decide you need to worry about that (it prevents this code being bomb-proof).
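Building the format string from the buffer sizes could look roughly like this. It is a sketch only: scan_key_value is a hypothetical name that is not part of the code above, and it scans a single pair at the start of the string rather than searching for one.

```c
#include <stdio.h>

/* Hypothetical helper: scan one "key=value" pair from the start of str,
 * using field widths derived from the buffer sizes.
 * Returns 1 on success, 0 otherwise. */
int scan_key_value(const char *str, char *key, size_t keysize,
                   char *value, size_t valsize)
{
    char fmt[64];
    int n;

    if (keysize < 2 || valsize < 2)
        return 0;
    /* "%%" yields a literal '%'; each width leaves room for the null byte,
     * so sizes 32 and 64 produce " %31[a-zA-Z_0-9]=%63s". */
    n = snprintf(fmt, sizeof fmt, " %%%zu[a-zA-Z_0-9]=%%%zus",
                 keysize - 1, valsize - 1);
    if (n < 0 || (size_t)n >= sizeof fmt)
        return 0;
    return sscanf(str, fmt, key, value) == 2;
}
```

The same idea extends to the "%n" conversions used in find_any_key_value(); only the two field widths need to be generated at run time.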

Related

How to replace substring with int

I've seen a lot of functions like str_replace(str, substr, newstring), but none of them work with numbers, so I was wondering if anyone has one that works with both chars and ints, or just ints. I've been looking everywhere and can't figure out how to write my own.
My goal is to be able to replace a substring with an int value, not just a string with a string.
Below is the function I use to replace strings, and it works just fine:
void strrpc(char *target, const char *needle, const char *replacement)
{
char buffer[1024] = { 0 };
char *insert_point = &buffer[0];
const char *tmp = target;
size_t needle_len = strlen(needle);
size_t repl_len = strlen(replacement);
while (1) {
const char *p = strstr(tmp, needle);
// walked past last occurrence of needle; copy remaining part
if (p == NULL) {
strcpy(insert_point, tmp);
break;
}
// copy part before needle
memcpy(insert_point, tmp, p - tmp);
insert_point += p - tmp;
// copy replacement string
memcpy(insert_point, replacement, repl_len);
insert_point += repl_len;
// adjust pointers, move on
tmp = p + needle_len;
}
// write altered string back to target
strcpy(target, buffer);
}
You can turn an integer into a string by "printing" it to a string:
int id = get_id();
char idstr[20];
sprintf(idstr, "%d", id);
Now you can
char msg[1024] = "Processing item {id} ...";
strrpc(msg, "{id}", idstr);
puts(msg);
But note that the implementation of strrpc you found will work only if the string after replacement is at most 1023 characters long. Also note that the example above could more easily be written as just:
printf("Processing item %d ...\n", get_id());
without the danger of buffer overflow. I don't know what exactly you want to achieve, but perhaps string replacement is not the best solution here. (Just sayin'.)
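If a replacement function is still wanted, the conversion and the replacement can be combined in one bounded snprintf() call. This is a minimal sketch, assuming only the first occurrence needs replacing; replace_with_int is a hypothetical name, not an existing library function.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write src into dst with the first occurrence of
 * needle replaced by the decimal form of value. Returns 0 on success and
 * -1 if the result would not fit in dstsize bytes. */
int replace_with_int(char *dst, size_t dstsize,
                     const char *src, const char *needle, int value)
{
    const char *p = strstr(src, needle);
    int n;

    if (p == NULL)
        n = snprintf(dst, dstsize, "%s", src);   /* nothing to replace */
    else
        n = snprintf(dst, dstsize, "%.*s%d%s",
                     (int)(p - src), src,        /* text before needle */
                     value,
                     p + strlen(needle));        /* text after needle */
    return (n >= 0 && (size_t)n < dstsize) ? 0 : -1;
}
```

Because snprintf() never writes more than dstsize bytes, an oversized result is reported through the return value instead of overflowing the buffer.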

Replace underscore with white spaces and make the first letter of the name and the surname to upper case

I want to replace _ (underscore) with a space and capitalize the first letter of the name and the surname when printing the nameList in the searchKeyword method.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void searchKeyword(const char * nameList[], int n, const char keyword[])
{
int i,name=0;
char *str;
const char s[2] = " " ;
for(i=0;i<n;i++)
{
char *str = (char *) malloc((strlen(nameList[0])+1)*sizeof(char));
strcpy(str,nameList[i]);
strtok(str,"_");
if(strcmp(keyword,strtok(NULL,"_"))==0) // argument NULL will start string
{ // from last point of previous string
name++;
if(nameList[i] == '_')
strcpy(nameList[i],s);
//nameList[i] = ' ';
printf("%s\n",nameList[i]);
}
}
if(name==0)
{
printf("No such keyword found\n");
}
free(str); //deallocating space
}
int main()
{
char p1[] = "zoe_bale";
char p2[] = "sam_rodriguez";
char p3[] = "jack_alonso";
char p4[] = "david_studi";
char p5[] = "denzel_feldman";
char p6[] = "james_bale";
char p7[] = "james_willis";
char p8[] = "michael_james";
char p9[] = "dustin_bale";
const char * nameList[9] = {p1, p2, p3, p4, p5, p6, p7, p8, p9};
char keyword[100];
printf("Enter a keyword: ");
scanf("%s", keyword);
printf("\n");
searchKeyword(nameList, 9, keyword);
printf("\n");
for (int i = 0; i < 9; i++)
printf("%s\n",nameList[i]);
return 0;
}
Search through the strings and print the ones whose surname part is equal to keyword.
As shown in the example runs below, the strings are printed in “Name Surname” format (the first letters are capitalized).
Output should be like this:
Enter a keyword: james
Michael James
zoe_bale
sam_rodriguez
jack_alonso
david_studi
denzel_feldman
james_bale
james_willis
michael_james
dustin_bale
There is no reason to dynamically allocate storage for your name and surname. Looking at your input, neither will exceed 9-characters, so simply using an array for each of 64-chars provides 6X the storage required (if you are unsure, double that to 128-chars and have 1200% additional space). That avoids the comparatively expensive calls to malloc.
To check whether keyword exists in nameList[i], you don't need to separate the values first and then compare. Simply use strstr (nameList[i], keyword) to determine if keyword is contained in nameList[i]. If you then want to match only the name or surname you can compare again after they are separated. (up to you)
To parse the names from the nameList[i] string, all you need is a single pointer to locate the '_' character. A simple call to strchr() will do and it does not modify nameList[i] so there is no need to duplicate.
After using strchr() to locate the '_' character, simply memcpy() from the start of nameList[i] up to your pointer into your name array, increment the pointer and then strcpy() from p to surname. Now that you have separated name and surname, simply call toupper() on the first character of each and then output the names separated by a space, e.g.
...
#include <ctype.h>
#define NLEN 64
void searchKeyword (const char *nameList[], int n, const char keyword[])
{
for (int i = 0; i < n; i++) { /* loop over each name in list */
if (strstr (nameList[i], keyword)) { /* does name contain keyword? */
char name[NLEN], surname[NLEN]; /* storage for name, surname */
const char *p = nameList[i]; /* pointer to parse nameList[i] */
if ((p = strchr(p, '_'))) { /* find '_' in nameList[i] */
/* copy first-name to name */
memcpy (name, nameList[i], p - nameList[i]);
name[p++ - nameList[i]] = 0; /* nul-terminate first name */
*name = toupper ((unsigned char)*name); /* convert 1st char to upper */
/* copy last name to surname */
strcpy (surname, p);
*surname = toupper ((unsigned char)*surname); /* convert 1st char to upper */
printf ("%s %s\n", name, surname); /* output "Name Surname" */
}
}
}
}
Example Use/Output
Used with the remainder of your code, searching for "james" locates those names containing "james" and provides what looks like the output you requested, e.g.
$ ./bin/keyword_surname
Enter a keyword: james
James Bale
James Willis
Michael James
zoe_bale
sam_rodriguez
jack_alonso
david_studi
denzel_feldman
james_bale
james_willis
michael_james
dustin_bale
(note: to match only the name or surname add an additional strcmp before the call to printf to determine which you want to output)
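The extra comparison mentioned in the note could look like the sketch below. surname_matches is a hypothetical helper, not part of the code above; it would be called on nameList[i] before printing.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if the part of entry after the first '_'
 * equals keyword exactly, 0 otherwise (including when there is no '_'). */
int surname_matches(const char *entry, const char *keyword)
{
    const char *p = strchr(entry, '_');
    return p != NULL && strcmp(p + 1, keyword) == 0;
}
```

With this check, searching for "james" prints only Michael James, which is the output the question asks for.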
Notes On Your Existing Code
Additional notes continuing from the comments on your existing code,
char *str = (char *) malloc((strlen(nameList[0])+1)*sizeof(char));
should simply be
str = malloc (strlen (nameList[i]) + 1);
You have previously declared char *str; so the declaration before your call to malloc() shadows your previous declaration. If you are using gcc/clang, you can add -Wshadow to your compile string to ensure you are warned of shadowed variables. (they can have dire consequences in other circumstances)
Next, sizeof (char) is always 1 and should be omitted from your size calculation. There is no need to cast the return of malloc() in C. See: Do I cast the result of malloc?
Your comparison if (nameList[i] == '_') is a comparison between a pointer and integer and will not work. Your compiler should be issuing a diagnostic telling you that is incorrect (do not ignore compiler warnings -- do not accept code until it compiles without warning)
Look things over and let me know if you have further questions.
This worked for me and has no memory leaks:
void searchKeyword(const char * nameList[], int n, const char keyword[])
{
int found = 0;
const char delim = '_';
for (int i = 0; i < n; i++) {
const char *fst = nameList[i];
for (const char *tmp = fst; *tmp != '\0'; tmp++) {
if (*tmp == delim) {
const char *snd = tmp + 1;
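int fst_length = (int)(snd - fst) - 1;
/* compare whole names so that a keyword such as "jamesx" does not match "james" */
if ((strlen(keyword) == (size_t)fst_length &&
strncmp(fst, keyword, fst_length) == 0) ||
strcmp(snd, keyword) == 0) {
found = 1;
/* toupper() needs <ctype.h>; it is safer than subtracting 32 */
printf("%c%.*s %c%s\n",
toupper((unsigned char)fst[0]), fst_length-1, fst+1,
toupper((unsigned char)snd[0]), snd+1);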
}
break;
}
}
}
if (!found)
puts("No such keyword found");
}
Hopefully it's fine for you too, although I use string.h functions very rarely.

How to word-wrap using specific delimiters, without dynamic allocation

I have a program that displays UTF-8 encoded strings with a size limitation (say MAX_LEN).
Whenever I get a string with a length > MAX_LEN, I want to find out where I could split it so it would be printed gracefully.
For example:
#define MAX_LEN 30U
const char big_str[] = "This string cannot be displayed on one single line: it must be splitted";
Without any processing, the output will look like:
"This string cannot be displaye" // Truncated because of size limitation
"d on one single line: it must "
"be splitted"
The client would be able to choose eligible delimiters for the split, but for now I have defined a default list of delimiters:
#define DEFAULT_DELIMITERS " ;:,)]" // Delimiters to track in the string
So I am looking for an elegant and lightweight way of handling this issue without using malloc: my API should not return the sub-strings, I just want the positions of the sub-strings to display.
I already have some ideas that I will propose in an answer: any feedback (e.g. pros and cons) would be appreciated, but most of all I am interested in alternative solutions.
I just want the positions of the sub-strings to display.
So all you need is one function analysing your input returning the positions where a delimiter was found.
A possible approach using strpbrk(), assuming at least C99:
#include <unistd.h> /* for ssize_t */
#include <string.h>
#define DELIMITERS (" ;.")
void find_delimiter_positions(
const char * input,
const char * delimiters,
ssize_t * delimiter_positions)
{
ssize_t dp_current = 0;
const char * p = input;
while (NULL != (p = strpbrk(p, delimiters)))
{
delimiter_positions[dp_current] = p - input;
++dp_current;
++p;
}
}
int main(void)
{
char input[] = "some randrom data; more.";
size_t input_length = strlen(input);
ssize_t delimiter_positions[input_length];
for (size_t s = 0; s < input_length; ++s)
{
delimiter_positions[s] = -1;
}
find_delimiter_positions(input, DELIMITERS, delimiter_positions);
for (size_t s = 0; -1 != delimiter_positions[s]; ++s)
{
/* print out positions */
}
}
Why C99: C99 introduced variable-length arrays (VLAs), which are needed here to get around the restriction on dynamic memory allocation.
If VLAs may not be used either, one needs to fall back to defining a maximum number of possible occurrences of delimiters per string. That should be feasible, since the maximum length of the string to be parsed is given, which in turn bounds the number of delimiters per string.
In that case, these lines from the example above
char input[] = "some randrom data; more.";
size_t input_length = strlen(input);
ssize_t delimiter_positions[input_length];
could be replaced by
char input[MAX_INPUT_LEN] = "some randrom data; more.";
size_t input_length = strlen(input);
ssize_t delimiter_positions[MAX_INPUT_LEN];
An approach that doesn't require additional storage is to make the wrapping function call a callback function for each substring. In the example below, the string is just printed with plain old printf, but the callback could call any other API function.
Things to note:
There is a function next that should advance a pointer to the next UTF-8 character. The encoding width of a UTF-8 character can be seen from its first byte.
The space and punctuation delimiters are treated slightly differently: spaces are appended neither to the end nor to the beginning of a line. (If there aren't any consecutive spaces in the text, that is.) Punctuation is retained at the end of a line.
Here's an example implementation:
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#define DELIMITERS " ;:,)]"
/*
* Advance to next character. This should advance the pointer by
* one to four bytes, depending on the UTF-8 encoding. (But at the
* moment, it doesn't.)
*/
static const char *next(const char *p)
{
return p + 1;
}
typedef struct {
const char *begin;
const char *end;
} substr_t;
/*
* Wraps the text and stores the found substrings' ranges into
* the lines struct. Return the number of word-wrapped lines.
*/
int wrap(const char *text, int width, substr_t *lines, uint32_t max_num_lines)
{
const char *begin = text;
const char *split = NULL;
uint32_t num_lines = 1;
int l = 0;
while (*text) {
if (strchr(DELIMITERS, *text)) {
split = text;
if (*text != ' ') split++;
}
if (l++ == width) {
if (split == NULL) split = text;
lines[num_lines - 1].begin = begin;
lines[num_lines - 1].end = split;
//write(fileno(stdout), begin, split - begin);
text = begin = split;
while (*begin == ' ') begin++;
split = NULL;
l = 0;
num_lines++;
if (num_lines > max_num_lines) {
//abort();
return -1;
}
}
text = next(text);
}
lines[num_lines - 1].begin = begin;
lines[num_lines - 1].end = text;
//write(fileno(stdout), begin, split - begin);
return num_lines;
}
int main()
{
const char *text = "I have a program that displays UTF-8 encoded strings "
"with a size limitation (say MAX_LEN). Whenever I get a string with a "
"length > MAX_LEN, I want to find out where I could split it so it "
"would be printed gracefully.";
substr_t lines[100];
const uint32_t max_num_lines = sizeof(lines) / sizeof(lines[0]);
const int num_lines = wrap(text, 48, lines, max_num_lines);
if (num_lines < 0) {
fprintf(stderr, "error: can't split into %u lines\n", (unsigned)max_num_lines);
return EXIT_FAILURE;
}
//printf("num_lines = %d\n", num_lines);
for (int i=0; i < num_lines; i++) {
FILE *stream = stdout;
const ptrdiff_t line_length = lines[i].end - lines[i].begin;
write(fileno(stream), lines[i].begin, line_length);
fputc('\n', stream);
}
return EXIT_SUCCESS;
}
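The next() stub above advances one byte at a time. A UTF-8 aware version might look like the following sketch. next_utf8 is a hypothetical name, and the code assumes mostly well-formed UTF-8: it derives the sequence length from the lead byte and never steps past the terminating null.

```c
/* Hypothetical replacement for next(): advance p by one UTF-8 encoded
 * character. The lead byte determines the sequence length (1 to 4 bytes). */
const char *next_utf8(const char *p)
{
    unsigned char c = (unsigned char)*p;
    int len;

    if (c == 0)
        return p;                    /* already at the end of the string */
    else if (c < 0x80)  len = 1;     /* 0xxxxxxx: ASCII */
    else if (c >= 0xF0) len = 4;     /* 11110xxx */
    else if (c >= 0xE0) len = 3;     /* 1110xxxx */
    else if (c >= 0xC0) len = 2;     /* 110xxxxx */
    else                len = 1;     /* stray continuation byte: skip it */

    while (len-- > 0 && *p != '\0')  /* stop early on truncated input */
        p++;
    return p;
}
```

With such a function the counter l in wrap() counts characters rather than bytes, while the begin and end pointers still mark byte positions.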
Addendum: Here's another approach that builds loosely on the strtok pattern, but without modifying the string. It requires a state and that state must be initialised with the string to print and the maximum line width:
struct wrap_t {
const char *src;
int width;
int length;
const char *line;
};
int wrap(struct wrap_t *line)
{
const char *begin = line->src;
const char *split = NULL;
int l = 0;
if (begin == NULL) return -1;
while (*begin == ' ') begin++;
if (*begin == '\0') return -1;
while (*line->src) {
if (strchr(DELIMITERS, *line->src)) {
split = line->src;
if (*line->src != ' ') split++;
}
if (l++ == line->width) {
if (split == NULL) split = line->src;
line->line = begin;
line->length = split - begin;
line->src = split;
return 0;
}
line->src = next(line->src);
}
line->line = begin;
line->length = line->src - begin;
return 0;
}
All definitions not shown (DELIMITERS, next) are as above and the basic algorithm hasn't changed. I think this method is easy to use for the client:
int main()
{
const char *text = "I have a program that displays UTF-8 encoded strings "
"with a size limitation (say MAX_LEN). Whenever I get a string with a "
"length > MAX_LEN, I want to find out where I could split it so it "
"would be printed gracefully.";
struct wrap_t line = {text, 60};
while (wrap(&line) == 0) {
printf("%.*s\n", line.length, line.line);
}
return 0;
}
Solution1
A function that is called repeatedly until the whole string is processed; each call returns the number of bytes to copy to create the next sub-string:
The API:
/**
* Return the length between the beginning of the string and the
* last delimiter (such that returned length <= max_length)
*/
size_t get_next_substring_length(
const char * str, // The string to be split
const char * delim, // String of eligible delimiters for a split
size_t max_length); // The maximum length of resulting substring
On the client side:
size_t shift = 0;
for(;;)
{
// Where do we start within big_str ?
const char * tmp = big_str + shift;
size_t count = get_next_substring_length(tmp, DEFAULT_DELIMITERS, MAX_LEN);
if(count)
{
// Allocate a sub-string and recopy "count" bytes
// Display the sub-string
shift += count;
}
else // End Of String (or error)
{
// Handle potential error
// Exit the loop
}
}
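For illustration, get_next_substring_length could be implemented along these lines. This is only a sketch under assumptions not fixed by the API comment: it breaks just after the last delimiter within max_length bytes, falls back to a hard break at max_length when no delimiter is in range, and works on bytes, so a hard break may land inside a multi-byte UTF-8 sequence.

```c
#include <string.h>

/* Sketch: returns how many bytes of str to display next (at most
 * max_length). Returns 0 only for an empty string or max_length == 0. */
size_t get_next_substring_length(const char *str, const char *delim,
                                 size_t max_length)
{
    size_t len = 0;
    size_t last = 0;   /* length up to and including the last delimiter */

    while (len < max_length && str[len] != '\0') {
        if (strchr(delim, str[len]) != NULL)
            last = len + 1;
        len++;
    }
    if (str[len] == '\0')      /* the rest of the string fits */
        return len;
    return last > 0 ? last : max_length;
}
```

Since the client only needs positions, it can print count bytes with printf("%.*s\n", (int)count, tmp) instead of allocating a sub-string.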
Solution2
Define a custom structure to store positions and lengths of sub-strings:
const char * str = "This is a long test string";
struct substrings
{
const char * str; // Beginning of the substring
size_t length; // Length of the substring
} sub[] = { {&str[0], 4},
{&str[5], 2},
{&str[8], 1},
{&str[10], 4},
{&str[15], 4},
{&str[20], 6},
{NULL, 0} };
The API:
size_t find_substrings(
struct substrings * substr, // Array to fill, terminated by {NULL, 0}
size_t max_length, // Number of elements in substr
const char * delimiters,
const char * str);
On the client side:
#define ARRAY_LENGTH 20U
struct substrings substr[ARRAY_LENGTH];
// Fill the structure
find_substrings(
substr,
ARRAY_LENGTH,
DEFAULT_DELIMITERS,
big_str);
// Browse the structure
for (struct substrings * sub = &substr[0]; sub->str; sub++)
{
// Display sub->length bytes of sub->str
}
Some things are bothering me though:
in Solution1 I don't like the infinite loop, as such loops are often bug-prone
in Solution2 I fixed ARRAY_LENGTH arbitrarily, but it should vary depending on the input string length

Formatting a string based on Input and predefined values

I have 26 values that I treat as special symbols. They are marked with the special delimiter "$", so the symbols run from $A to $Z.
I also have a predefined template:
I have $A,$B,$C.....
The user can enter a string that contains special symbols followed by their values, for example:
Input - $ACar $BBike $CTruck.
Then my output should be: I have Car,Bike,Truck...
since every special symbol has been replaced by its value.
Note 1. If the input is $A Car $A Bike, then $A should be taken as Car and the rest should be discarded.
2. If the input string doesn't contain any special symbol, the output should be unchanged, i.e.
I have $A,$B,$C.....
3. If the input starts like "i am a men $A glass", then everything before $A should be discarded.
Which approach should I follow to make this possible?
I am thinking of running strstr on the input string, comparing the matches with my special symbols, storing the position of each special symbol in a list, and then extracting the values by position, but I don't think it will work for me.
Processing is simpler with a dynamic string, like this:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
typedef struct dstr {
size_t size;
size_t capacity;
char *str;
} Dstr;//dynamic string
Dstr *dstr_make(void){
Dstr *s;
s = (Dstr*)malloc(sizeof(Dstr));
s->size = 0;
s->capacity=16;
s->str=(char*)realloc(NULL, sizeof(char)*(s->capacity += 16));
return s;
}
void dstr_addchar(Dstr *ds, const char ch){
ds->str[ds->size] = ch;
if(++ds->size == ds->capacity)
ds->str=(char*)realloc(ds->str, sizeof(char)*(ds->capacity += 16));
}
void dstr_addstr(Dstr *ds, const char *s){
while(*s) dstr_addchar(ds, *s++);
//dstr_addchar(ds, '\0');
}
void dstr_free(Dstr *ds){
free(ds->str);
free(ds);
}
void dic_entry(char *dic[26], const char *source){
char *p, *backup, ch;
p = backup = strdup(source);
for(;NULL!=(p=strtok(p, " \t\n"));p=NULL){
if(*p == '$' && isupper(ch=*(p+1))){
if(dic[ch -'A'] == NULL)
dic[ch -'A'] = strdup(p+2);
}
}
free(backup);
}
void dic_clear(char *dic[26]){
int i;
for(i=0;i<26;++i){
if(dic[i]){
free(dic[i]);
dic[i] = NULL;
}
}
}
int main(void){
const char *template = "I have $A,$B,$C.";
char *dic[26] = { 0 };
char buff[1024];
const char *cp;
Dstr *ds = dstr_make();
printf("input special value setting: ");
fgets(buff, sizeof(buff), stdin);
dic_entry(dic, buff);
for(cp=template;*cp;++cp){
if(*cp == '$'){
char ch;
if(isupper(ch=*(cp+1)) && dic[ch - 'A']!=NULL){
dstr_addstr(ds, dic[ch - 'A']);
++cp;
} else {
dstr_addchar(ds, *cp);
}
} else {
dstr_addchar(ds, *cp);
}
}
dstr_addchar(ds, '\0');
printf("result:%s\n", ds->str);
dic_clear(dic);
dstr_free(ds);
return 0;
}
/* DEMO
>a
input special value setting: $ACar $BBike $CTruck
result:I have Car,Bike,Truck.
>a
input special value setting: $BBike
result:I have $A,Bike,$C.
*/
What you're describing is called a Macro Processor or Macro Expander.
You can store your symbol table in an array indexed by the input char.
char *symtab[256] = {0};
Since the symbol names are single-letters, you can use strchr to find the first '$' and check if the next char is a letter (isupper()).
For the actual replacement, it will require some delicate memory management unless you just use really big buffers and make sure to only feed it small data.
If symtab['A'] points to "Car", you can find the symbol with loc = strstr(line, "$A"). Then loc-line is the length of the prefix part, 2 is the length of the symbol name being deleted, strlen("Car") is the length of the replacement, and strlen(loc+2) is the length of the suffix part. The prefix length already excludes the symbol, so the new string size should be
char *result = malloc( (loc-line) + strlen(symtab['A']) + strlen(loc+2) + 1);
Then patching up the new string is
memcpy(result, line, loc-line);
strcpy(result + (loc-line), symtab['A']);
strcpy(result + (loc-line) + strlen(symtab['A']), loc+2);
Notice these are strcpy, not strcat (which appends strings together). Only the prefix is copied from line, since copying the whole line could overflow result when the replacement is shorter than the symbol; each strcpy call then writes at the exact offset where the previous piece ended.
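Put together as a function, the size calculation and the three copies might look like this sketch. expand_symbol is a hypothetical name, and it handles only the first occurrence of one symbol; a full expander would loop over the line and look each symbol up in symtab.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a malloc'd copy of line with the first
 * occurrence of sym (e.g. "$A") replaced by value. Returns NULL if sym
 * is absent or allocation fails; the caller frees the result. */
char *expand_symbol(const char *line, const char *sym, const char *value)
{
    const char *loc = strstr(line, sym);
    size_t prefix, symlen, vlen, suffix;
    char *result;

    if (loc == NULL)
        return NULL;
    prefix = (size_t)(loc - line);       /* text before the symbol */
    symlen = strlen(sym);
    vlen = strlen(value);
    suffix = strlen(loc + symlen);       /* text after the symbol */

    result = malloc(prefix + vlen + suffix + 1);
    if (result == NULL)
        return NULL;
    memcpy(result, line, prefix);
    memcpy(result + prefix, value, vlen);
    memcpy(result + prefix + vlen, loc + symlen, suffix + 1); /* tail + null */
    return result;
}
```

Calling it once per symbol, freeing the previous result each time, expands a whole template at the cost of one allocation per substitution.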

C: strings and pointers. changing sub-strings within a string in a specific pattern

I'm having a hard time understanding how I should do the following:
I have a list of words defined like so:
typedef struct _StringNode {
char *str;
struct _StringNode* next;
} StringNode;
Now I need to write a function which receives a string, and two word lists of the same length, and I need to replace every appearance of a word from the first list in the string with the corresponding word from the second list.
Example:
text: "stack overflow siteoverflow overflow stack"
patterns: [ "stack", "overflow", "site" ]
replacements: [ "Hello", "guys", "here" ]
result: "Hello guys hereguys guys Hello"
For each word, I'm trying to use strstr() to get a pointer to an occurrence of the word in a copy of the string, then change the word and advance the pointer into the copy of the text string.
char* replace(const char *text,
const StringNode *patterns,
const StringNode *replacements);
You can use this
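#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/* Replace the first occurrence of orig in st with repl.
 * Returns a malloc'd string that the caller must free, or NULL on failure. */
char *strnreplace(const char *st, const char *orig, const char *repl)
{
    const char *ch = strstr(st, orig);
    char *buffer;
    if (ch == NULL) {                 /* no match: return a plain copy */
        buffer = malloc(strlen(st) + 1);
        if (buffer != NULL)
            strcpy(buffer, st);
        return buffer;
    }
    buffer = malloc(strlen(st) - strlen(orig) + strlen(repl) + 1);
    if (buffer == NULL)
        return NULL;
    memcpy(buffer, st, ch - st);      /* text before the match */
    sprintf(buffer + (ch - st), "%s%s", repl, ch + strlen(orig));
    return buffer;
}
/* Applies each pattern/replacement pair in turn (first occurrence only). */
char *replace(const char *text,
              const StringNode *patterns,
              const StringNode *replacements)
{
    const StringNode *pat, *rep;
    char *temp = strnreplace(text, "", "");   /* start from a copy of text */
    for (pat = patterns, rep = replacements;
         temp != NULL && pat != NULL && rep != NULL;
         pat = pat->next, rep = rep->next) {
        char *next = strnreplace(temp, pat->str, rep->str);
        free(temp);
        temp = next;
    }
    return temp;
}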
Perhaps something like this:
char* replace(const char *text,
const StringNode *patterns,
const StringNode *replacements)
{
char *out = malloc(1024), *put = out;
while(*text != '\0')
{
const StringNode *piter, *riter;
int found = 0;
/* Check if current start of text matches any pattern. */
for(piter = patterns, riter = replacements;
piter != NULL;
piter = piter->next, riter = riter->next)
{
const size_t plen = strlen(piter->str);
if(strncmp(text, piter->str, plen) == 0)
{
/* Hit found, emit replacement. */
const size_t rlen = strlen(riter->str);
memcpy(put, riter->str, rlen); /* write at the current output position */
put += rlen; /* keep out pointing at the start for the return */
text += plen;
found = 1;
break;
}
}
if(!found)
*put++ = *text++;
}
*put = '\0';
return out;
}
Note that the above does not handle buffer overflows, omitted for brevity. I would recommend implementing something like this on top of a dynamic string data type, to make the core operation (append) automatically grow the destination string as needed.
UPDATE: In response to the comment, the algorithm the code above is trying to implement is:
set output to empty string
while text remaining
if start of text matches pattern[i]
append replacement[i] to output
remove len(pattern[i]) characters from start of text
else
append first character of text to output
remove first character of text
So, it repeatedly checks for pattern-matches, as long as there is anything left in text.
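As a bounds-checked illustration of that algorithm, here is a sketch that writes into a caller-supplied buffer instead of a fixed malloc(1024). replace_into is a hypothetical name; the StringNode definition is repeated from the question so that the block is self-contained, and empty patterns are skipped so the loop always makes progress.

```c
#include <stddef.h>
#include <string.h>

typedef struct _StringNode {
    char *str;
    struct _StringNode *next;
} StringNode;

/* Hypothetical helper: returns 0 on success, -1 if the result does not
 * fit in outsize bytes (out is then left unterminated). */
int replace_into(char *out, size_t outsize, const char *text,
                 const StringNode *patterns, const StringNode *replacements)
{
    size_t used = 0;

    if (outsize == 0)
        return -1;
    while (*text != '\0') {
        const StringNode *p = patterns, *r = replacements;
        const char *src = text;          /* default: copy one character */
        size_t srclen = 1, skip = 1;

        for (; p != NULL && r != NULL; p = p->next, r = r->next) {
            size_t plen = strlen(p->str);
            if (plen > 0 && strncmp(text, p->str, plen) == 0) {
                src = r->str;            /* hit: emit the replacement */
                srclen = strlen(r->str);
                skip = plen;
                break;
            }
        }
        if (srclen >= outsize - used)    /* keep room for the terminator */
            return -1;
        memcpy(out + used, src, srclen);
        used += srclen;
        text += skip;
    }
    out[used] = '\0';
    return 0;
}
```

Patterns are tried in list order at each position, so when one pattern is a prefix of another the earlier list entry wins.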

Resources