I am looking for a function that checks whether a string exactly matches the pattern described by a format string together with the additional arguments that correspond to it.
Like this:
/* int strcmpf (char *str1, char *format, ...); */
char *test = "Hello World !";
if(!strcmpf(test, "%s%*s %c", "Hello ", '!'))
return HELLO_HAS_BEEN_SAID;
else
return PROGRAM_ISNT_POLITE;
I assume implementing this myself will be very difficult, but such a function could be very useful for semantic parsing of content. Before I attempt to write it myself, I need to know if there is already a library or code snippet that provides an implementation of a function like this.
To be more specific, I need pattern-matching behavior. So test must match exactly the pattern specified by the data corresponding to the format parameter.
I need to know if there is already a library or code snippet that provides implementation of a function like this
The standard library has no such functionality. Requests for third-party library recommendations are off-topic here, but to the extent that I understand the functionality you want, I am anyway unaware of an existing third-party implementation.
As I said in comments, I suggest that you design the pattern-matching aspect around bona fide regular expressions instead of around printf() or scanf() formats (which are not entirely the same). There are several regular expression libraries available to support that part.
I am new to coding using MISRA C guidelines.
The following are two rules in MISRA C 2004:
Rule 16.1 (required): Functions shall not be defined with a variable number of arguments.
Rule 20.9 (required): The input/output library <stdio.h> shall not be used in production code.
This clearly means that I can't use printf in production code for it to be MISRA C compliant, because printf is a part of <stdio.h> and allows a variable number of arguments. So I set out on a quest to find out how I can write my own printf statement. So far I am unable to find any solution for this predicament. Any help from fellow developers would be appreciated.
so far I am unable to find any solution for this predicament
You have to use functions that print one thing at a time. An example interface you might want to implement could look like the following:
print_string("Hello");
print_int(5);
print_char('\n');
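One way such an interface might be implemented without <stdio.h> or variadic functions is sketched below. The in-memory buffer out_buf and the routine out_byte are assumptions standing in for whatever output device the project really has (for example a UART transmit function), and the code is not claimed to pass a MISRA checker as written.

```c
#include <stddef.h>

/* Hypothetical sink: replace out_byte() with the real device write. */
static char out_buf[64];
static size_t out_len = 0U;

static void out_byte(char c)
{
    /* Keep one byte free so the buffer stays null-terminated. */
    if (out_len < (sizeof out_buf - 1U)) {
        out_buf[out_len] = c;
        out_len++;
        out_buf[out_len] = '\0';
    }
}

void print_char(char c)
{
    out_byte(c);
}

void print_string(const char *s)
{
    while (*s != '\0') {
        out_byte(*s);
        s++;
    }
}

void print_int(int value)
{
    char digits[3U * sizeof(unsigned int)]; /* enough decimal digits */
    size_t n = 0U;
    /* Work on the magnitude as unsigned so the most negative int is safe. */
    unsigned int mag = (value < 0) ? (0U - (unsigned int)value)
                                   : (unsigned int)value;

    if (value < 0) {
        out_byte('-');
    }
    do {
        digits[n] = (char)('0' + (char)(mag % 10U));
        n++;
        mag /= 10U;
    } while (mag != 0U);
    while (n > 0U) {
        n--;
        out_byte(digits[n]); /* digits were produced in reverse order */
    }
}
```

Each function takes exactly one argument of a fixed type, so the compiler can check every call against its prototype.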
so I set out on a quest to find out how I can write my own printf statement
Most MISRA-C systems are embedded systems where printf is just a bloated wrapper around a UART library. The usual solution is to develop your own logging/messaging tool instead. It need not be UART-based: it might just as well use some other serial bus, 8 parallel data lines, or an LCD or 7-segment display, all depending on what you need to display and whether you intend this to be part of the production code.
So how to do this is highly project-specific and it's typically more of a system design and electronics problem than a programming one.
EDIT
Since you seem to be making some sort of general-purpose library, one solution is to simply provide an API that returns strings to the caller, then let the caller worry about how to present them. That makes your lib MISRA-C compliant, while allowing the caller to print strings in whatever application-specific way they have available. For example:
void lib_getmsg (char* msg, size_t bufsize);
Where "lib" is some prefix for your library. Leave string allocation to the caller. Alternatively, the old-fashioned way:
lib_result_t lib_dosomething (void);
// Returns LIB_OK if went OK, returns LIB_ERR in case of errors.
// To get more information, call lib_get_lastmsg.
const char* lib_get_lastmsg (void);
This returns a pointer to an internal static string allocated by your library. The downside of this is that it won't work well in multi-threaded environments, because every caller shares the same string.
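A minimal sketch of how the caller-allocated variant, lib_getmsg, might be implemented is shown below. The stored message text and the internal variable lib_lastmsg are made up for illustration; only the function signature comes from the answer above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical internal state: text of the most recent library message. */
static const char lib_lastmsg[] = "sensor timeout";

/* Copies the message into the caller's buffer, truncating if needed.
 * The result is always null-terminated when bufsize > 0. */
void lib_getmsg(char *msg, size_t bufsize)
{
    size_t n;

    if ((msg == NULL) || (bufsize == 0U)) {
        return;
    }
    n = strlen(lib_lastmsg);
    if (n >= bufsize) {
        n = bufsize - 1U; /* leave room for the terminator */
    }
    memcpy(msg, lib_lastmsg, n);
    msg[n] = '\0';
}
```

Because the caller owns the buffer, the library never has to decide how or where the text is displayed.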
You need to understand the rationale for the MISRA C guidelines, understand the context they are used in, and the circumstances of your own code.
You also need to understand that the MISRA Guidelines are not to be blindly followed with a tick-box mentality... you then need to appreciate that those nice folk at MISRA provide several chapters of useful material before the actual guidelines. Part of that is the Deviation procedure.
If you can justify why you feel you need to violate a guideline, then use the deviation procedure that is specified. This requires that you understand the nature of the violation, and what you are going to do to ensure the integrity of your application.
If you genuinely need to use printf() and you can justify that, use it with a deviation.
On Linux, running on a modern x86_64 processor:
int main()
{
char *s = "Hello, World!\n";
long l = 14;
long fd = 1;
long syscall = 1;
long ret = 0;
__asm__ volatile("syscall"
: "=a"(ret)
: "a"(syscall),
"D"(fd),
"S"(s),
"d"(l)
: "rcx", "r11", "memory"); /* the syscall instruction overwrites rcx and r11 */
return 0;
}
Output:
Hello, World!
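For comparison, the same write system call can be reached without inline assembly through the POSIX write() wrapper declared in <unistd.h>, which also avoids <stdio.h>. This is only a sketch; the function name say_hello is hypothetical.

```c
#include <unistd.h>

/* Writes the greeting to the given file descriptor.
 * Returns the number of bytes written, or -1 on error. */
ssize_t say_hello(int fd)
{
    static const char msg[] = "Hello, World!\n";

    /* sizeof msg counts the terminating null byte, which is not sent. */
    return write(fd, msg, sizeof msg - 1U);
}
```

Calling say_hello(STDOUT_FILENO) prints the same line as the assembly version on a POSIX system.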
How can I get a function's name without calling/invoking it, or is that even possible?
I have an array of sorting functions, and my goal is to be able to list the name of each one, dynamically, without having to invoke any.
After searching on the web, I couldn't find any solution that doesn't require the function to be invoked and use __FUNCTION__ or __func__.
The array of functions that I use:
// Pointer to functions
char *(*srtFunc[])(int *, int) = {selection, bubble, recursiveBubble, insertion, recursiveInsertion};
More information about what I want to achieve with this:
I want to loop over each function in the given array, create a file with the name of the function, invoke the function 100 times with different arguments each time, and print the time spent by the function each time in its dedicated file, redo for the remaining functions.
Unfortunately, not easily. C is not built for introspection and doesn't have features like this-- the name of function foo and the call to function foo are compiled down to just some jump and call instructions in the output; the actual name "foo" is essentially a convenience for you when programming and disappears in the compiled output.
__FUNCTION__ is not really a preprocessor macro: like the standard __func__, it is a predefined identifier that the compiler implicitly declares inside each function body, which is why, as you note, it only works within a function. All it does is give you the name of the function that is currently being compiled. It is very "dumb" and cannot be applied to some other function from the outside.
There are various ways to get the effective result you want here, the simplest being to manually build a table of string literals that have the same names as your functions. You can do this in fairly clean ways (see @nielsen's answer for a useful snippet) using macros. But the preprocessor/compiler can't help you derive or enforce a table from the actual functions, so you will always have some risk of an issue at runtime when you make changes to it. Unfortunately C just doesn't have the capability for the kind of elegance you're looking for in this design.
You may be able to do something with smart preprocessor tricks, but your code would be difficult to read. I think I would go for the really low-tech solution here and just add an array of the function names matching the array of function pointers:
#define ARRAY_SIZE(A) (sizeof(A)/sizeof(A[0]))
// Pointer to functions
char *(*srtFunc[])(int *, int) = {selection, bubble, recursiveBubble, insertion, recursiveInsertion};
const char *srtFuncNames[] = {"selection", "bubble", "recursiveBubble", "insertion", "recursiveInsertion"};
_Static_assert(ARRAY_SIZE(srtFuncNames)==ARRAY_SIZE(srtFunc), "Function table and names out of synch!");
Having the two definitions right after each other makes it easy to keep them synchronized, and the code is easy to read. The _Static_assert (available from C11) helps you remember to add new names as new functions are added.
Alternatively, a structure can be defined holding a function pointer and corresponding name. This can be initialized using a macro as follows:
typedef struct
{
char *(*srtFunc)(int *, int);
const char *srtName;
} sortMethod;
#define SORT_METHOD(S) {(S), #S}
sortMethod methods[] = {
SORT_METHOD(selection),
SORT_METHOD(bubble),
SORT_METHOD(recursiveBubble),
SORT_METHOD(insertion),
SORT_METHOD(recursiveInsertion)
};
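A further variation, sketched here under the assumption that a little macro machinery is acceptable, is the so-called X-macro technique: the function list is written once and expanded twice, so the pointer table and the name table cannot drift apart. The stub functions below are placeholders for the real sorting routines, and SORT_LIST, AS_POINTER, AS_STRING and find_sort_index are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Placeholder implementations; the real functions would sort the array. */
static char *selection(int *a, int n)          { (void)a; (void)n; return NULL; }
static char *bubble(int *a, int n)             { (void)a; (void)n; return NULL; }
static char *recursiveBubble(int *a, int n)    { (void)a; (void)n; return NULL; }
static char *insertion(int *a, int n)          { (void)a; (void)n; return NULL; }
static char *recursiveInsertion(int *a, int n) { (void)a; (void)n; return NULL; }

/* The single source of truth: add a new function here and nowhere else. */
#define SORT_LIST(X) \
    X(selection) X(bubble) X(recursiveBubble) X(insertion) X(recursiveInsertion)

#define AS_POINTER(name) name,
#define AS_STRING(name)  #name,

static char *(*const srtFunc[])(int *, int) = { SORT_LIST(AS_POINTER) };
static const char *const srtFuncNames[]     = { SORT_LIST(AS_STRING) };

enum { SORT_COUNT = sizeof srtFunc / sizeof srtFunc[0] };

/* Returns the table index of the named function, or -1 if it is unknown. */
int find_sort_index(const char *name)
{
    int i;

    for (i = 0; i < SORT_COUNT; i++) {
        if (strcmp(srtFuncNames[i], name) == 0) {
            return i;
        }
    }
    return -1;
}
```

The price is readability: anyone maintaining the code has to know the idiom, which is why the plain two-array version may still be preferable.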
I am reading about the use of safe strings at the following location:
https://www.securecoding.cert.org/confluence/pages/viewpage.action?pageId=5111861
It is mentioned as below.
SafeStr strings, when used properly, can eliminate many of these errors and provide backward compatibility to legacy code as well.
My question is: what does the author mean by "provide backward compatibility to legacy code as well"? Please explain with an example.
Thanks for your time and help
It means that functions from the standard libc (and others) which expect plain, null-terminated char arrays will work even on those SafeStrs. This is probably achieved by putting a control structure at a negative offset (or some other trick) from the start of the string.
Examples: strcmp() printf() etc can be used directly on the strings returned by SafeStr.
In contrast, there are also other string libraries for C which are very "smart" and dynamic, but their strings cannot be passed to "old school" functions without conversion.
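To make the "negative offset" idea concrete, here is a minimal sketch of a string whose bookkeeping lives just before the characters. It is not the SafeStr implementation or its API; the names hs_header, hs_new, hs_len and hs_free are invented for this illustration.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Bookkeeping stored immediately in front of the character data. */
typedef struct {
    size_t len;
    char data[]; /* the pointer handed to callers points here */
} hs_header;

/* Allocates a copy of src and returns a plain, null-terminated char *. */
char *hs_new(const char *src)
{
    size_t len = strlen(src);
    hs_header *h = malloc(sizeof *h + len + 1U);

    if (h == NULL) {
        return NULL;
    }
    h->len = len;
    memcpy(h->data, src, len + 1U);
    return h->data;
}

/* Reads the stored length by stepping back from the string to its header.
 * Only valid for pointers returned by hs_new(). */
size_t hs_len(const char *s)
{
    const hs_header *h =
        (const hs_header *)(const void *)(s - offsetof(hs_header, data));

    return h->len;
}

void hs_free(char *s)
{
    if (s != NULL) {
        free(s - offsetof(hs_header, data));
    }
}
```

Because hs_new() returns an ordinary char *, the result can be passed straight to strcmp() or printf(), while hs_len() still has access to the extra metadata.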
From that page:
The library is based on the safestr_t type which is completely
compatible with char *. This allows casting of safestr_t structures to
char *.
That's some backward compatibility with all the existing code that takes char * or const char * pointers.
Why this distinction? I've landed up with terrible problems, assuming itoa to be in stdlib.h and finally ending up with linking a custom version of itoa with a different prototype and thus producing some crazy errors.
So, why isn't itoa a standard function? What's wrong with it? And why is the standard partial towards its twin brother atoi?
No itoa has ever been standardised, so adding one to the standard would require both a compelling reason and a good interface.
Most itoa interfaces that I have seen either use a static buffer which has re-entrancy and lifetime issues, allocate a dynamic buffer that the caller needs to free or require the user to supply a buffer which makes the interface no better than sprintf.
An "itoa" function would have to return a string. Since strings aren't first-class objects, the caller would have to pass a buffer + length and the function would have to have some way to indicate whether it ran out of room or not. By the time you get that far, you've created something similar enough to sprintf that it's not worth duplicating the code/functionality. The "atoi" function exists because it's less complicated (and arguably safer) than a full "scanf" call. An "itoa" function wouldn't be different enough to be worth it.
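The point about ending up with something close to sprintf can be seen in a minimal sketch: once the function takes a buffer, a length, and reports truncation, it is little more than a thin wrapper around snprintf. The name int_to_str is hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Writes the decimal form of value into buf (at most size bytes,
 * including the terminator). Returns 0 on success, -1 if the buffer
 * was too small or formatting failed. */
int int_to_str(int value, char *buf, size_t size)
{
    int n = snprintf(buf, size, "%d", value);

    return ((n >= 0) && ((size_t)n < size)) ? 0 : -1;
}
```

For example, converting 123 into a 4-byte buffer succeeds, while -123 needs 5 bytes and is reported as a failure.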
The itoa function probably isn't standard because there is no consistent definition of it. Different compiler and library vendors have introduced subtly different versions of it, possibly as an invention to serve as a complement to atoi.
If some non-standard function is widely provided by vendors, the standard's job is to codify it: basically add a description of the existing function to the standard. This is possible if the function has more or less consistent argument conventions and behavior.
Because multiple flavors of itoa are already out there, such a function cannot be added into ISO C. Whatever behavior is described would be at odds with some implementations.
itoa has existed in forms such as:
void itoa(int n, char *s); /* Given in _The C Programming Language_, 1st ed. (K&R1) */
void itoa(int input, void (*subr)(char)); /* Ancient Unix library */
void itoa(int n, char *buf, int radix);
char *itoa(int in, char *buf, int radix);
Microsoft provides it in their Visual C Run Time Library under the altered name: _itoa.
Not only have C implementations historically provided it under differing definitions, C programs also provide a function named itoa for themselves, which is another source of possible clashes.
Basically, the itoa identifier is "radioactive" with regard to standardization as an external name or macro. If such a function is standardized, it will have to be under a different name.
Bearing in mind the answers given to a question about a safer formatting library for C, I'm wondering whether there is a safe C formatting library?
What I mean is:
there's no possibility to mismatch the format string from the arguments
there's no possibility to crash by passing the wrong type
there're no platform-dependent aspects
Please don't answer about the Microsoft Safe String Library, or libraries that are less unsafe but still not totally safe, as I'm aware of these, and they don't satisfy the requirements for total safety.
Thanks in advance
You're writing in C. C is not type-safe. You cannot avoid undefined behaviour if you pass an int* instead of a char*. There's no such thing as "there's no possibility" if your variables are not statically type checked / tagged for runtime checking.
If you have something that produces warnings, that's quite good already...
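As an example of getting such warnings, GCC and Clang support a format function attribute that makes the compiler check the arguments of your own printf-style functions against the format string. This is a compiler extension rather than standard C, and the names PRINTF_LIKE and checked_format below are hypothetical.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Expands to the GCC/Clang attribute where available, to nothing elsewhere. */
#if defined(__GNUC__)
#define PRINTF_LIKE(fmt_index, first_arg) \
    __attribute__((format(printf, fmt_index, first_arg)))
#else
#define PRINTF_LIKE(fmt_index, first_arg)
#endif

/* Argument 3 is the format string; the checked arguments start at 4. */
int checked_format(char *out, size_t size, const char *fmt, ...)
    PRINTF_LIKE(3, 4);

int checked_format(char *out, size_t size, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(out, size, fmt, ap);
    va_end(ap);
    return n; /* number of characters the full result needs */
}
```

With format warnings enabled (for example -Wall), a call that passes an int where the format says %s is flagged at compile time. It remains a diagnostic, not a guarantee, so it narrows the problem rather than eliminating it.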
If you really need or want safety, you may want to have a look at cyclone (C dialect), or some completely different languages.
there's no possibility to mismatch the format string from the arguments
If you want a format string, without special compiler support you basically can't do it. That said, you could have a safe formatting library in C if you forgo the format string. I'm not aware of any, but I would not be surprised if they existed.
One could have an interface like:
typedef ... FORMATTER;
FORMATTER create_formatter(void);
int fmt_add_string_default(FORMATTER f, const char *s);
int fmt_add_string(FORMATTER f, const char *s, int maxlength, const char fill, enum fmt_alignment align);
...
int fmt_add_decimal_default(FORMATTER f, int d);
... // you get the idea
int fmt_write_result(FORMATTER f, char *out, int out_length);
void destroy_formatter(FORMATTER f);
Something like this would be perfectly safe, if a bit verbose.
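A minimal sketch of how part of that interface might be implemented follows. The fixed 256-byte internal buffer, the struct layout behind FORMATTER, and the helper fmt_append are assumptions made for illustration; only the public function names and signatures come from the interface above.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Assumed representation: a fixed-size accumulation buffer. */
typedef struct formatter {
    char text[256];
    size_t used;
} *FORMATTER;

FORMATTER create_formatter(void)
{
    return calloc(1U, sizeof(struct formatter));
}

void destroy_formatter(FORMATTER f)
{
    free(f);
}

/* Internal helper: appends n bytes, or fails without changing anything. */
static int fmt_append(FORMATTER f, const char *s, size_t n)
{
    if ((f == NULL) || (n >= sizeof f->text - f->used)) {
        return -1;
    }
    memcpy(f->text + f->used, s, n);
    f->used += n;
    f->text[f->used] = '\0';
    return 0;
}

int fmt_add_string_default(FORMATTER f, const char *s)
{
    return (s == NULL) ? -1 : fmt_append(f, s, strlen(s));
}

int fmt_add_decimal_default(FORMATTER f, int d)
{
    char tmp[32];
    int n = snprintf(tmp, sizeof tmp, "%d", d);

    if ((n < 0) || ((size_t)n >= sizeof tmp)) {
        return -1;
    }
    return fmt_append(f, tmp, (size_t)n);
}

/* Copies the accumulated text to out. Returns its length, or -1 if
 * out is too small. */
int fmt_write_result(FORMATTER f, char *out, int out_length)
{
    if ((f == NULL) || (out == NULL) || (out_length <= 0) ||
        (f->used >= (size_t)out_length)) {
        return -1;
    }
    memcpy(out, f->text, f->used + 1U);
    return (int)f->used;
}
```

Every value goes through a function with a fixed prototype, so passing an int to fmt_add_string_default is diagnosed by the compiler instead of being silently misread at run time.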
No, because whatever "safety" you introduce can be suborned by the language. It's like building your castle on sand - it doesn't matter how good the castle is, it can still be made to fall if you dig out the sand from underneath it.
There is no mechanism in C to enforce specific parameter types, nor should there be.
If people don't use your tools as they're meant to, that's their own problem, in my opinion. You're not supposed to be providing software to three-year-olds - they're expected to have some modicum of intelligence.