Lowercase urls in Varnish (inline C) - c

In Varnish (3.0), urls are treated in a case sensitive way. By that I mean http://test.com/user/a4556 is treated differently from http://test.com/user/A4556. On my web server they're treated as the same url. What I'd like to do is have varnish lowercase all request urls as they come in.
I managed to find this discussion but the creator of Varnish indicates that I will have to use inline C to do it. I could achieve this in a simplistic way using multiple regexes but that just seems like it's bound to fail.
Ideally, what I'd like is a VCL configuration to do this (an example of this can be found here) but I'd settle for a C function that takes in a const char * and returns const char * (I'm not a C programmer so forgive me if I get the syntax wrong).

It must be mentioned that Varnish includes the ability to uppercase and lowercase strings in the std vmod ( https://www.varnish-cache.org/docs/trunk/reference/vmod_std.generated.html#func-tolower )
This is much cleaner than the embedded C route (which is disabled by default in Varnish 4). Here's an example I use to normalize the request Host and url;
import std;
sub vcl_recv {
# normalize Host header
set req.http.Host = std.tolower(regsub(req.http.Host, ":[0-9]+", ""));
....
}
sub vcl_hash {
# set cache key to lowercased req.url
hash_data(std.tolower(req.url));
....
}

Okay, I went ahead and solved this for myself. Here's the VCL:
C{
#include <ctype.h>
//lovingly lifted from:
//https://github.com/cosimo/varnish-accept-language/blob/master/examples/accept-language.vcl
static void strtolower(const char *s) {
register char *c;
for (c=s; *c; c++) {
if (isupper(*c)) {
*c = tolower(*c);
}
}
return;
}
}C
sub vcl_recv {
C{
strtolower(VRT_r_req_url(sp));
}C
}
I put this in a separate VCL file and then added an include for it.

I'll just share my solution, which expands Richard's code into a complete solution.
If URL contains upper case letters, we redirect user to the correct URL, instead of simply normalizing the URL before entering the cache machinery. This prevents search engines from indexing mixed case URLs separately from lower-case.
# Define a function that converts a string to lower-case in-place.
# http://stackoverflow.com/questions/6857445
C{
#include <ctype.h>
static void strtolower(char *c) {
for (; *c; c++) {
if (isupper(*c)) {
*c = tolower(*c);
}
}
}
}C
sub vcl_recv {
if (req.http.host ~ "[A-Z]" || req.url ~ "[A-Z]") {
# Convert host and path to lowercase in-place.
C{
strtolower(VRT_GetHdr(sp, HDR_REQ, "\005host:"));
strtolower((char *)VRT_r_req_url(sp));
}C
# Use req.http.location as a scratch register; any header will do.
set req.http.location = "http://" req.http.host req.url;
error 999 req.http.location;
}
# Fall-through to default
}
sub vcl_error {
# Check for redirects - redirects are performed using: error 999 "http://target-url/"
# Thus we piggyback the redirect target in the error response variable.
if (obj.status == 999) {
set obj.http.location = obj.response;
set obj.status = 301;
set obj.response = "Moved permanently";
return(deliver);
}
# Fall-through to default
}
There's an ugly cast from const char * to char * when converting req.url to lower-case... basically, we're modifying the string in-place despite Varnish telling us not to. It seems to work. :-)

Nearly 5 years after the original question was asked I think we have a cleaner answer available now. This SO question still comes up top in a search for "lowercase Varnish".
Here is a simplified variation on the example that Fastly recommends:
# at the top of your VCL
import std;
sub vcl_recv {
# Lowercase all incoming URLs. It will also be lowercase by the time the hash is computed.
set req.url = std.tolower(req.url);
}
https://www.fastly.com/blog/varnish-tip-case-insensitivity

If you are looking for a C function that converts an upper case string to lower case, this will do:
#include <ctype.h>
static char *
to_lower (char *str)
{
char *s = str;
while (*s)
{
if (isupper (*s))
*s = tolower (*s);
s++;
}
return str;
}
Note that this modifies the string in-place. So you may want to pass a copy of the original string as argument.

Note that to set the URL from the C block and avoid crashing use:
VRT_l_req_url(sp,"new-string", vrt_magic_string_end);
(Pulled this detail from "varnishd -C" output.) Here's an untested revision to the first answer:
C{
#include <ctype.h>
//lovingly lifted from:
//https://github.com/cosimo/varnish-accept-language/blob/master/examples/accept-language.vcl
static void strtolower(const char *s) {
register char *c;
for (c=s; *c; c++) {
if (isupper(*c)) {
*c = tolower(*c);
}
}
return;
}
}C
sub vcl_recv {
C{
const char *url = VRT_r_req_url(sp);
char urlRewritten[1000];
strcat(urlRewritten, url);
strtolower(urlRewritten);
VRT_l_req_url(sp, urlRewritten, vrt_magic_string_end);
}C
}

Related

Using C to modify response in VCL

The varnish docs say that we can include C snippets inside a VCL file, like
sub vcl_hash {
C{
int i = /* Some logic to get a number */
}C
}
But now how can I use the value of the integer i to set a response header, or cookie
See varnish.vcc
And the functions:
VRT_SetHdr
VRT_GetHdr
in varnish 4 there is ctx variabl defined for the context (as opposed to sp in varnish 3 ) (source)
example:
sub vcl_hash {
C{
const char *hash = calc_hash(...);
const struct gethdr_s hdr = {
HDR_BERESP,
"\010X-Hash:" // length prefixed string, in octal
};
VRT_SetHdr(ctx, &hdr, hash, vrt_magic_string_end);
}C
}
see here for another example
Why don't you just use VCL?
set resp.http.x-header = header to set any header you wanna set.
I'd encourage you to write a vmod directly, It'll be way more comfortable. You can find a (old but still relevant) guide here: https://info.varnish-software.com/blog/creating-a-vmod-vmod-str

SWIG: "out" typemap for structure field needs to access another field of the same structure

I am attempting to wrap a C library (which I did not write, and whose interfaces cannot be changed) using SWIG. It's mostly straightforward, but there's one field of one struct that's giving me trouble. The relevant struct definition looks like this:
struct Token {
const char *buffer;
const char *word;
unsigned short wordlen;
// ... other fields ...
};
buffer is a normal C string and should be exposed normally (but immutably). word is the problem field. It is a pointer to somewhere within the buffer string, and is meant to be understood as a string of length wordlen. I want to expose this to high-level languages as a normal read-only string, so they don't always have to be taking a slice.
I think the way to handle this is with an "out" typemap specifically for Token::word, something like this:
struct Token {
%typemap (out) const char *word {
$result = SWIG_FromCharPtrAndSize($1, ?wordlen?);
}
}
and this is where I got stuck: How do I access the wordlen field of the parent structure from this typemap?
Or if there's a better way to handle this entire issue, please tell me about that instead.
Unfortunately, it looks like SWIG has no support for mapping multiple structure members at the same time. Inspecting the generated output we learn that (arg1) points to the input structure. Thus, we have to:
Make the word immutable, so that the set wrapper is not generated.
Import the SWIG_FromCharPtrAndSize fragment - it's not available by default.
Map word using SWIG_FromCharPtrAndSize as you wished, referring to (arg1)->wordlen.
Skip wordlen, so that it's not mapped (either by %ignore-ing it, or not providing it in the struct visible to SWIG).
The following is a complete example. First, the header:
// main.h
#pragma once
struct Token {
const char *word;
unsigned short wordlen;
};
struct Token *make_token(void);
extern char *word_check;
And the SWIG module - note that we use the header verbatim, only overriding the definition of struct Token:
// token_mod.i
%module token_mod
%{#include "main.h"%}
%ignore Token;
%include "main.h"
%rename("%s") Token;
struct Token {
%immutable word;
%typemap (out, fragment="SWIG_FromCharPtrAndSize") const char *word {
$result = SWIG_FromCharPtrAndSize($1, (arg1)->wordlen);
}
const char *word;
%typemap (out) const char *word;
};
The demo code that uses Python to check that things work:
// https://github.com/KubaO/stackoverflown/tree/master/questions/swig-pair-53915787
#include <assert.h>
#include <stdlib.h>
#include <Python.h>
#include "main.h"
struct Token *make_token(void) {
struct Token *r = malloc(sizeof(struct Token));
r->word = "1234";
r->wordlen = 2;
return r;
}
char *word_check;
#if PY_VERSION_HEX >= 0x03000000
# define SWIG_init PyInit__token_mod
PyObject*
#else
# define SWIG_init init_token_mod
void
#endif
SWIG_init(void);
int main()
{
PyImport_AppendInittab("_token_mod", SWIG_init);
Py_Initialize();
PyRun_SimpleString(
"import sys\n"
"sys.path.append('.')\n"
"import token_mod\n"
"from token_mod import *\n"
"token = make_token()\n"
"cvar.word_check = token.word\n");
assert(word_check && strcmp(word_check, "12") == 0);
Py_Finalize();
return 0;
}
Finally, the CMakeLists.txt that makes the demo - it can be used with Python 2.7 or 3.x. Note: To switch Python versions, the build directory must be wiped (or at least the cmake caches in it have to be wiped).
cmake_minimum_required(VERSION 3.2)
set(Python_ADDITIONAL_VERSIONS 3.6)
project(swig-pair)
find_package(SWIG 3.0 REQUIRED)
find_package(PythonLibs 3.6 REQUIRED)
include(UseSwig)
SWIG_MODULE_INITIALIZE(${PROJECT_NAME} python)
SWIG_ADD_SOURCE_TO_MODULE(${PROJECT_NAME} swig_generated_sources "token_mod.i")
add_executable(${PROJECT_NAME} "main.c" ${swig_generated_sources})
target_include_directories(${PROJECT_NAME} PRIVATE ${PYTHON_INCLUDE_DIRS} ".")
target_link_libraries(${PROJECT_NAME} PRIVATE ${PYTHON_LIBRARIES})
The higher level languages don't care that wordlen is the size of word. Only the C does. If you cant change the C that you are swiging then you have to leave it as is and remember as you write in the higher languages that the char has a size. Also Swig and const don't like each other. Hereis there documentation on consts

bilingual program in console application in C

I have been trying to implement a way to make my program bilingual : the user could chose if the program should display French or English (in my case).
I have made lots of researches and googling but I still cannot find a good example on how to do that :/
I read about gettext, but since this is for a school's project we are not allowed to use external libraries (and I must admit I have nooo idea how to make it work even though I tried !)
Someone also suggested to me the use of arrays one for each language, I could definitely make this work but I find the solution super ugly.
Another way I thought of is to have to different files, with sentences on each line and I would be able to retrieve the right line for the right language when I need to. I think I could make this work but it also doesn't seem like the most elegant solution.
At last, a friend said I could use DLL for that. I have looked up into that and it indeed seems to be one of the best ways I could find... the problem is that most resources I could find on that matter were coded for C# and C++ and I still have no idea how I would do to implement in C :/
I can grasp the idea behind it, but have no idea how to handle it in C (at all ! I do not know how to create the DLL, call it, retrieve the right stuff from it or anything >_<)
Could someone point me to some useful resources that I could use, or write a piece of code to explain the way things work or should be done ?
It would be seriously awesome !
Thanks a lot in advance !
(Btw, I use visual studio 2012 and code in C) ^^
If you can't use a third party lib then write your own one! No need for a dll.
The basic idea is the have a file for each locale witch contains a mapping (key=value) for text resources.
The name of the file could be something like
resources_<locale>.txt
where <locale> could be something like en, fr, de etc.
When your program stars it reads first the resource file for specified locale.
Preferably you will have to store each key/value pair in a simple struct.
Your read function reads all key/value pair into a hash table witch offers a very good access speed. An alternative would be to sort the array containing the key/value pairs by key and then use binary search on lookup (not the best option, but far better than iterating over all entries each time).
Then you'll have to write a function get_text witch takes as argument the key of the text resource to be looked up an return the corresponding text in as read for the specified locale. You have to handle keys witch have no mapping, the simplest way would be to return key back.
Here is some sample code (using qsort and bsearch):
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
#define DEFAULT_LOCALE "en"
#define NULL_ARG "[NULL]"
typedef struct localized_text {
char* key;
char* value;
} localized_text_t;
localized_text_t* localized_text_resources = NULL;
int counter = 0;
char* get_text(char*);
void read_localized_text_resources(char*);
char* read_line(FILE*);
void free_localized_text_resources();
int compare_keys(const void*, const void*);
void print_localized_text_resources();
int main(int argc, char** argv)
{
argv++;
argc--;
char* locale = DEFAULT_LOCALE;
if(! *argv) {
printf("No locale provided, default to %s\n", locale);
} else {
locale = *argv;
printf("Locale provided is %s\n", locale);
}
read_localized_text_resources(locale);
printf("\n%s, %s!\n", get_text("HELLO"), get_text("WORLD"));
printf("\n%s\n", get_text("foo"));
free_localized_text_resources();
return 0;
}
char* get_text(char* key)
{
char* text = NULL_ARG;
if(key) {
text = key;
localized_text_t tmp;
tmp.key = key;
localized_text_t* result = bsearch(&tmp, localized_text_resources, counter, sizeof(localized_text_t), compare_keys);
if(result) {
text = result->value;
}
}
return text;
}
void read_localized_text_resources(char* locale)
{
if(locale) {
char localized_text_resources_file_name[64];
sprintf(localized_text_resources_file_name, "resources_%s.txt", locale);
printf("Read localized text resources from file %s\n", localized_text_resources_file_name);
FILE* localized_text_resources_file = fopen(localized_text_resources_file_name, "r");
if(! localized_text_resources_file) {
perror(localized_text_resources_file_name);
exit(1);
}
int size = 10;
localized_text_resources = malloc(size * sizeof(localized_text_t));
if(! localized_text_resources) {
perror("Unable to allocate memory for text resources");
}
char* line;
while((line = read_line(localized_text_resources_file))) {
if(strlen(line) > 0) {
if(counter == size) {
size += 10;
localized_text_resources = realloc(localized_text_resources, size * sizeof(localized_text_t));
}
localized_text_resources[counter].key = line;
while(*line != '=') {
line++;
}
*line = '\0';
line++;
localized_text_resources[counter].value = line;
counter++;
}
}
qsort(localized_text_resources, counter, sizeof(localized_text_t), compare_keys);
// print_localized_text_resources();
printf("%d text resource(s) found in file %s\n", counter, localized_text_resources_file_name);
}
}
char* read_line(FILE* p_file)
{
int len = 10, i = 0, c = 0;
char* line = NULL;
if(p_file) {
line = malloc(len * sizeof(char));
c = fgetc(p_file);
while(c != EOF) {
if(i == len) {
len += 10;
line = realloc(line, len * sizeof(char));
}
line[i++] = c;
c = fgetc(p_file);
if(c == '\n' || c == '\r') {
break;
}
}
line[i] = '\0';
while(c == '\n' || c == '\r') {
c = fgetc(p_file);
}
if(c != EOF) {
ungetc(c, p_file);
}
if(strlen(line) == 0 && c == EOF) {
free(line);
line = NULL;
}
}
return line;
}
void free_localized_text_resources()
{
if(localized_text_resources) {
while(counter--) {
free(localized_text_resources[counter].key);
}
free(localized_text_resources);
}
}
int compare_keys(const void* e1, const void* e2)
{
return strcmp(((localized_text_t*) e1)->key, ((localized_text_t*) e2)->key);
}
void print_localized_text_resources()
{
int i = 0;
for(; i < counter; i++) {
printf("Key=%s value=%s\n", localized_text_resources[i].key, localized_text_resources[i].value);
}
}
Used with the following resource files
resources_en.txt
WORLD=World
HELLO=Hello
resources_de.txt
HELLO=Hallo
WORLD=Welt
resources_fr.txt
HELLO=Hello
WORLD=Monde
run
(1) out.exe /* default */
(2) out.exe en
(3) out.exe de
(4) out.exe fr
output
(1) Hello, World!
(2) Hello, World!
(3) Hallo, Welt!
(4) Hello, Monde!
gettext is the obvious answer but it seems it's not possible in your case. Hmmm. If you really, really need a custom solution... throwing out a wild idea here...
1: Create a custom multilingual string type. The upside is that you can easily add new languages afterwards, if you want. The downside you'll see in #4.
//Terrible name, change it
typedef struct
{
char *french;
char *english;
} MyString;
2: Define your strings as needed.
MyString s;
s.french = "Bonjour!";
s.english = "Hello!";
3: Utility enum and function
enum
{
ENGLISH,
FRENCH
};
char* getLanguageString(MyString *myStr, int language)
{
switch(language)
{
case ENGLISH:
return myStr->english;
break;
case FRENCH:
return myStr->french;
break;
default:
//How you handle other values is up to you. You could decide on a default, for instance
//TODO
}
}
4: Create wrapper functions instead of using plain old C standard functions. For instance, instead of printf :
//Function should use the variable arguments and allow a custom format, too
int myPrintf(const char *format, MyString *myStr, int language, ...)
{
return printf(format, getLanguageString(myStr, language));
}
That part is the painful one : you'll need to override every function you use strings with to handle custom strings. You could also specify a global, default language variable to use when one isn't specified.
Again : gettext is much, much better. Implement this only if you really need to.
the main idea of making programs translatable is using in all places you use texts any kind of id. Then before displaying the test you get the text using the id form the appropriate language-table.
Example:
instead of writing
printf("%s","Hello world");
You write
printf("%s",myGetText(HELLO_WORLD));
Often instead of id the native-language string itself is used. e.g.:
printf("%s",myGetText("Hello world"));
Finally, the myGetText function is usually implemented as a Macro, e.g.:
printf("%s", tr("Hello world"));
This macro could be used by an external parser (like in gettext) for identifying texts to be translated in source code and store them as list in a file.
The myGetText could be implemented as follows:
std::map<std::string, std::map<std::string, std::string> > LangTextTab;
std::string GlobalVarLang="en"; //change to de for obtaining texts in German
void readLanguagesFromFile()
{
LangTextTab["de"]["Hello"]="Hallo";
LangTextTab["de"]["Bye"]="Auf Wiedersehen";
LangTextTab["en"]["Hello"]="Hello";
LangTextTab["en"]["Bye"]="Bye";
}
const char * myGetText( const char* origText )
{
return LangTextTab[GlobalVarLang][origText ].c_str();
}
Please consider the code as pseudo-code. I haven't compiled it. Many issues are still to mention: unicode, thread-safety, etc...
I hope however the example will give you the idea how to start.

How to extract filename from path

There should be something elegant in Linux API/POSIX to extract base file name from full path
See char *basename(char *path).
Or run the command "man 3 basename" on your target UNIX/POSIX system.
Use basename (which has odd corner case semantics) or do it yourself by calling strrchr(pathname, '/') and treating the whole string as a basename if it does not contain a '/' character.
Here's an example of a one-liner (given char * whoami) which illustrates the basic algorithm:
(whoami = strrchr(argv[0], '/')) ? ++whoami : (whoami = argv[0]);
an additional check is needed if NULL is a possibility. Also note that this just points into the original string -- a "strdup()" may be appropriate.
You could use strstr in case you are interested in the directory names too:
char *path ="ab/cde/fg.out";
char *ssc;
int l = 0;
ssc = strstr(path, "/");
do{
l = strlen(ssc) + 1;
path = &path[strlen(path)-l+2];
ssc = strstr(path, "/");
}while(ssc);
printf("%s\n", path);
The basename() function returns the last component of a path, which could be a folder name and not a file name. There are two versions of the basename() function: the GNU version and the POSIX version.
The GNU version can be found in string.h after you include #define _GNU_SOURCE:
#define _GNU_SOURCE
#include <string.h>
The GNU version uses const and does not modify the argument.
char * basename (const char *path)
This function is overridden by the XPG (POSIX) version if libgen.h is included.
char * basename (char *path)
This function may modify the argument by removing trailing '/' bytes. The result may be different from the GNU version in this case:
basename("foo/bar/")
will return the string "bar" if you use the XPG version and an empty string if you use the GNU version.
References:
basename (3) - Linux Man Pages
Function: char * basename (const char *filename), Finding Tokens in a String.
Of course if this is a Gnu/Linux only question then you could use the library functions.
https://linux.die.net/man/3/basename
And though some may disapprove these POSIX compliant Gnu Library functions do not use const. As library utility functions rarely do. If that is important to you I guess you will have to stick to your own functionality or maybe the following will be more to your taste?
#include <stdio.h>
#include <string.h>
int main(int argc, char *argv[])
{
char *fn;
char *input;
if (argc > 1)
input = argv[1];
else
input = argv[0];
/* handle trailing '/' e.g.
input == "/home/me/myprogram/" */
if (input[(strlen(input) - 1)] == '/')
input[(strlen(input) - 1)] = '\0';
(fn = strrchr(input, '/')) ? ++fn : (fn = input);
printf("%s\n", fn);
return 0;
}
template<typename charType>
charType* getFileNameFromPath( charType* path )
{
if( path == NULL )
return NULL;
charType * pFileName = path;
for( charType * pCur = path; *pCur != '\0'; pCur++)
{
if( *pCur == '/' || *pCur == '\\' )
pFileName = pCur+1;
}
return pFileName;
}
call:
wchar_t * fileName = getFileNameFromPath < wchar_t > ( filePath );
(this is a c++)
You can escape slashes to backslash and use this code:
#include <stdio.h>
#include <string.h>
int main(void)
{
char path[] = "C:\\etc\\passwd.c"; //string with escaped slashes
char temp[256]; //result here
char *ch; //define this
ch = strtok(path, "\\"); //first split
while (ch != NULL) {
strcpy(temp, ch);//copy result
printf("%s\n", ch);
ch = strtok(NULL, "\\");//next split
}
printf("last filename: %s", temp);//result filename
return 0;
}
I used a simpler way to get just the filename or last part in a path.
char * extract_file_name(char *path)
{
int len = strlen(path);
int flag=0;
printf("\nlength of %s : %d",path, len);
for(int i=len-1; i>0; i--)
{
if(path[i]=='\\' || path[i]=='//' || path[i]=='/' )
{
flag=1;
path = path+i+1;
break;
}
}
return path;
}
Input path = "C:/Users/me/Documents/somefile.txt"
Output = "somefile.txt"
#Nikolay Khilyuk offers the best solution except.
1) Go back to using char *, there is absolutely no good reason for using const.
2) This code is not portable and is likely to fail on none POSIX systems where the / is not the file system delimiter depending on the compiler implementation. For some windows compilers you might want to test for '\' instead of '/'. You might even test for the system and set the delimiter based on the results.
The function name is long but descriptive, no problem there. There is no way to ever be sure that a function will return a filename, you can only be sure that it can if the function is coded correctly, which you achieved. Though if someone uses it on a string that is not a path obviously it will fail. I would have probably named it basename, as it would convey to many programmers what its purpose was. That is just my preference though based on my bias your name is fine. As far as the length of the string this function will handle and why anyone thought that would be a point? You will unlikely deal with a path name longer than what this function can handle on an ANSI C compiler. As size_t is defined as a unsigned long int which has a range of 0 to 4,294,967,295.
I proofed your function with the following.
#include <stdio.h>
#include <string.h>
char* getFileNameFromPath(char* path);
int main(int argc, char *argv[])
{
char *fn;
fn = getFileNameFromPath(argv[0]);
printf("%s\n", fn);
return 0;
}
char* getFileNameFromPath(char* path)
{
for(size_t i = strlen(path) - 1; i; i--)
{
if (path[i] == '/')
{
return &path[i+1];
}
}
return path;
}
Worked great, though Daniel Kamil Kozar did find a 1 off error that I corrected above. The error would only show with a malformed absolute path but still the function should be able to handle bogus input. Do not listen to everyone that critiques you. Some people just like to have an opinion, even when it is not worth anything.
I do not like the strstr() solution as it will fail if filename is the same as a directory name in the path and yes that can and does happen especially on a POSIX system where executable files often do not have an extension, at least the first time which will mean you have to do multiple tests and searching the delimiter with strstr() is even more cumbersome as there is no way of knowing how many delimiters there might be. If you are wondering why a person would want the basename of an executable think busybox, egrep, fgrep etc...
strrchar() would be cumbersome to implement as it searches for characters not strings so I do not find it nearly as viable or succinct as this solution. I stand corrected by Rad Lexus this would not be as cumbersome as I thought as strrchar() has the side effect of returning the index of the string beyond the character found.
Take Care
My example (improved):
#include <string.h>
const char* getFileNameFromPath(const char* path, char separator = '/')
{
if(path != nullptr)
{
for(size_t i = strlen(path); i > 0; --i)
{
if (path[i-1] == separator)
{
return &path[i];
}
}
}
return path;
}

How do I write a dispatcher, if my compiler's support for pointers-to-functions is broken?

I am working on an embedded application where the device is controlled through a command interface. I mocked the command dispatcher in VC and had it working to my satisfaction; but when I then moved the code over to the embedded environment, I found out that the compiler has a broken implementation of pointer-to-func's.
Here's how I originally implemented the code (in VC):
/* Relevant parts of header file */
typedef struct command {
const char *code;
void *set_dispatcher;
void *get_dispatcher;
const char *_description;
} command_t;
#define COMMAND_ENTRY(label,dispatcher,description) {(const char*)label, &set_##dispatcher, &get_##dispatcher, (const char*)description}
/* Dispatcher data structure in the C file */
const command_t commands[] = {
COMMAND_ENTRY("DH", Dhcp, "DHCP (0=off, 1=on)"),
COMMAND_ENTRY("IP", Ip, "IP Address (192.168.1.205)"),
COMMAND_ENTRY("SM", Subnet, "Subunet Mask (255.255.255.0)"),
COMMAND_ENTRY("DR", DefaultRoute, "Default router (192.168.1.1)"),
COMMAND_ENTRY("UN", Username, "Web username"),
COMMAND_ENTRY("PW", Password, "Web password"),
...
}
/* After matching the received command string to the command "label", the command is dispatched */
if (pc->isGetter)
return ((get_fn_t)(commands[i].get_dispatcher))(pc);
else
return ((set_fn_t)(commands[i].set_dispatcher))(pc);
}
Without the use of function pointers, it seems like my only hope is to use switch()/case statements to call functions. But I'd like to avoid having to manually maintain a large switch() statement.
What I was thinking of doing is moving all the COMMAND_ENTRY lines into a separate include file. Then wraps that include file with varying #define and #undefines. Something like:
/* Create enum's labels */
#define COMMAND_ENTRY(label,dispatcher,description) SET_##dispatcher, GET_##dispatcher
typedef enum command_labels = {
#include "entries.cinc"
DUMMY_ENUM_ENTRY} command_labels_t;
#undefine COMMAND_ENTRY
/* Create command mapping table */
#define COMMAND_ENTRY(label,dispatcher,description) {(const char*)label, SET_##dispatcher, GET_##dispatcher, (const char*)description}
const command_t commands[] = {
#include "entries.cinc"
NULL /* dummy */ };
#undefine COMMAND_ENTRY
/*...*/
int command_dispatcher(command_labels_t dispatcher_id) {
/* Create dispatcher switch statement */
#define COMMAND_ENTRY(label,dispatcher,description) case SET_##dispatcher: return set_##dispatcher(pc); case GET_##dispatcher: return get_##dispatcher(pc);
switch(dispatcher_id) {
#include "entries.cinc"
default:
return NOT_FOUND;
}
#undefine COMMAND_ENTRY
}
Does anyone see a better way to handle this situation? Sadly, 'get another compiler' is not a viable option. :(
--- Edit to add:
Just to clarify, the particular embedded environment is broken in that the compiler is supposed to create a "function-pointer table" which is then used by the compiler to resolve calls to functions through a pointer. Unfortunately, the compiler is broken and doesn't generate a correct function-table.
So I don't have an easy way to extract the func address to invoke it.
--- Edit #2:
Ah, yes, the use of void *(set|get)_dispatcher was my attempt to see if the problem was with the typedefine of the func pointers. Originally, I had
typedef int (*set_fn_t)(cmdContext_t *pCmdCtx);
typedef int (*get_fn_t)(cmdContext_t *pCmdCtx);
typedef struct command {
const char *code;
set_fn_t set_dispatcher;
get_fn_t get_dispatcher;
const char *_description;
} command_t;
You should try changing your struct command so the function pointers have the actual type:
typedef struct command {
const char *code;
set_fn_t set_dispatcher;
get_fn_t get_dispatcher;
const char *_description;
} command_t;
Unfortunately, function pointers are not guaranteed to be able to convert to/from void pointers (that applies only to pointers to objects).
What's the embedded environment?
Given the information posted in the updates to the question, I see that it's really a bugged compiler.
I think that your proposed solution seems pretty reasonable - it's probably similar to what I would have come up with.
A function pointer isn't actually required to fit in a void*. You could check to make sure that the value you're calling is actually the address of the function. If not, use a function pointer type in the struct: either get_fn_t, or IIRC void(*)(void) is guaranteed to be compatible with any function pointer type.
Edit: OK, assuming that calling by value can't be made to work, I can't think of a neater way to do what you need than auto-generating the switch statement. You could maybe use an off-the-shelf ASP-style preprocessor mode for ruby/python/perl/php/whatever prior to the C preprocessor. Something like this:
switch(dispatcher_id) {
<% for c in commands %>
case SET_<% c.dispatcher %>: return set_<% c.dispatcher %>(pc);
case GET_<% c.dispatcher %>: return get_<% c.dispatcher %>(pc);
<% end %>
default:
return NOT_FOUND;
}
might be a bit more readable than the macro/include trick, but introducing a new tool and setting up the makefiles is probably not worth it for such a small amount of code. And the line numbers in the debug info won't relate to the file you think of as the source file unless you do extra work in your preprocessor to specify them.
Can you get the vendor to fix the compiler?
To what extent is the pointer-to-function broken?
If the compiler allows you to get the address of a function (I'm from C++, but &getenv is what I mean), you could wrap the calling convention stuff into assembler.
As said, I'm a C++ssie, but something in the way of
; function call
push [arg1]
push [arg2]
call [command+8] ; at the 4th location, the setter is stored
ret
If even that is broken, you could define an array of extern void* pointers which you define, again, in assembly.
try this syntax:
return (*((get_fn_t)commands[i].get_dispatcher))(pc);
It's been awhile since I've done C & function pointers, but I believe the original C syntax required the * when dereferencing function pointers but most compilers would let you get away without it.
Do you have access to the link map?
If so, maybe you can hack your way around the wonky function-pointer table:
unsigned long addr_get_dhcp = 0x1111111;
unsigned long addr_set_dhcp = 0x2222222; //make these unique numbers.
/* Relevant parts of header file */
typedef struct command {
const char *code;
unsigned long set_dispatcher;
unsigned long get_dispatcher;
const char *_description;
} command_t;
#define COMMAND_ENTRY(label,dispatcher,description) {(const char*)label,
addr_set_##dispatcher, addr_get_##dispatcher, (const char*)description}
Now compile, grab the relevant addresses from the link map, replace the constants, and recompile. Nothing should move, so the map ought to stay the same. (Making the original constants unique should prevent the compiler from collapsing identical values into one storage location. You may need a long long, depending on the architecture)
If the concept works, you could probably add a post-link step running a script to do the replacement automagically. Of course, this is just a theory, it may fail miserably.
Maybe, you need to look into the structure again:
typedef struct command {
const char *code;
void *set_dispatcher; //IMO, it does not look like a function pointer...
void *get_dispatcher; //more like a pointer to void
const char *_description;
} command_t;
Let say your dispatchers have the following similar function definition:
//a function pointer type definition
typedef int (*genericDispatcher)(int data);
Assume that the dispatchers are like below:
int set_DhcpDispatcher(int data) { return data; }
int get_DhcpDispatcher(int data) { return 2*data; }
So, the revised structure will be:
typedef struct command {
const char *code;
genericDispatcher set_dispatcher;
genericDispatcher get_dispatcher;
const char *_description;
} command_t;
Your macro will be:
#define COMMAND_ENTRY(label,dispatcher,description) \
{ (const char*)label, \
set_##dispatcher##Dispatcher, \
get_##dispatcher##Dispatcher, \
(const char*)description }
Then, you can set your array as usual:
int main(int argc, char **argv)
{
int value1 = 0, value2 = 0;
const command_t commands[] = {
COMMAND_ENTRY("DH", Dhcp, "DHCP (0=off, 1=on)")
};
value1 = commands[0].set_dispatcher(1);
value2 = commands[0].get_dispatcher(2);
printf("value1 = %d, value2 = %d", value1, value2);
return 0;
}
Correct me, if I am wrong somewhere... ;)

Resources