I am currently parsing some MAC addresses. The output I am given does not include leading zeros, like so:
char* host = "0:25:25:0:25:25";
and I would like to format it like so
char* host = "00:25:25:00:25:25";
What would be the easiest way to go about this?
For those wondering, I am using the libpcap library.
I may be missing something in the question. Assuming you know it is a valid MAC, and the input string is thus parsable, have you considered something as simple as:
const char *host1 = "0:25:25:0:AB:25";
const char *host2 = "0:1:02:3:0a:B";
char result[19];
unsigned int a,b,c,d,e,f;   // %x expects a pointer to unsigned int
// the question sample
if (sscanf(host1, "%x:%x:%x:%x:%x:%x", &a,&b,&c,&d,&e,&f) == 6)
    sprintf(result, "%02X:%02X:%02X:%02X:%02X:%02X", a,b,c,d,e,f);
printf("host1: %s\n", result);
// a more daunting sample
if (sscanf(host2, "%x:%x:%x:%x:%x:%x", &a,&b,&c,&d,&e,&f) == 6)
    sprintf(result, "%02X:%02X:%02X:%02X:%02X:%02X", a,b,c,d,e,f);
printf("host2: %s\n", result);
Output
host1: 00:25:25:00:AB:25
host2: 00:01:02:03:0A:0B
Obviously for the ultra-paranoid you would want to make sure a-f are all <= 255 (0xFF), which is probably preferable: a larger value prints more than two digits and could overflow result. The fundamental reasons I prefer this where performance isn't a critical issue are the many things you may not be considering in your question. It handles all of the following:
1. Lead values of "n:", where n is any hex digit, not just zero. Examples: "5:", "0:"
2. Mid values of ":n:", again under the same conditions as (1) above. Examples: ":A:", ":0:"
3. Tail values of ":n", once more under the same conditions as (1) above. Examples: ":b", ":0"
4. Hex-digit agnostic when reading; it works with both upper- and lower-case digit chars.
5. Most important, it does nothing (except upper-case the hex values) if your input string is already properly formatted.
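For the "ultra-paranoid" variant mentioned above, a minimal sketch could wrap the same sscanf call in a function that also rejects values above 0xFF and trailing characters. The name normalize_mac and its return convention are assumptions made for illustration, not part of libpcap or the original answer.

```c
#include <stdio.h>

/* Hypothetical helper: writes the zero-padded form of `in` to `out`,
 * which must hold at least 18 bytes (17 characters plus the terminator).
 * Returns 0 on success, -1 if `in` is not six hex groups of at most 0xFF. */
int normalize_mac(const char *in, char out[18])
{
    unsigned int v[6];
    int n = 0;

    /* %n stores how many characters were consumed, so trailing junk shows up. */
    if (sscanf(in, "%x:%x:%x:%x:%x:%x%n",
               &v[0], &v[1], &v[2], &v[3], &v[4], &v[5], &n) != 6)
        return -1;
    if (in[n] != '\0')
        return -1;

    /* Each group must fit in one byte, otherwise %02X prints extra digits. */
    for (int i = 0; i < 6; i++)
        if (v[i] > 0xFF)
            return -1;

    snprintf(out, 18, "%02X:%02X:%02X:%02X:%02X:%02X",
             v[0], v[1], v[2], v[3], v[4], v[5]);
    return 0;
}
```

Note that %x is lenient: it skips leading whitespace and accepts an optional 0x prefix, so this sketch is stricter than the original snippet but still not a full validator.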
Roughly like this:
Allocate an output string to hold the reformatted MAC address.
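Iterate over the input string using strtok with ":" as the delimiter (strtok modifies the string it scans, so work on a writable copy rather than a string literal). In each iteration convert the token (one or two characters) into a numerical value with strtol and base 16; atoi will not do here, because it parses decimal only and stops at hex digits such as A-F. If the result is < 16 (i.e., < 0x10), set "0" into the output string at the current position and the result in hex at the following position; otherwise copy the two characters of the token. Append ":" to the output string after every group except the last. Continue until the end of the input.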
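The steps above can be sketched as follows. This is only an illustration under stated assumptions: the name pad_mac_strtok, the 32-byte scratch copy, and the lower-case output are choices made for the example, and strtoul is used in place of atoi so that hex digits are parsed.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper illustrating the strtok approach.
 * `in` is copied first because strtok modifies the string it scans.
 * `out` must hold at least 18 bytes. Returns 0 on success, -1 otherwise. */
int pad_mac_strtok(const char *in, char out[18])
{
    char copy[32];
    size_t pos = 0;
    int groups = 0;

    if (strlen(in) >= sizeof copy)
        return -1;
    strcpy(copy, in);

    for (char *tok = strtok(copy, ":"); tok != NULL; tok = strtok(NULL, ":")) {
        char *end;
        unsigned long v = strtoul(tok, &end, 16);   /* base 16, unlike atoi */

        /* Reject empty or non-hex tokens, values over one byte, extra groups. */
        if (end == tok || *end != '\0' || v > 0xFF || groups == 6)
            return -1;

        /* %02lx pads single-digit values with a leading zero. */
        pos += (size_t)snprintf(out + pos, 18 - pos,
                                groups ? ":%02lx" : "%02lx", v);
        groups++;
    }
    return groups == 6 ? 0 : -1;
}
```

One caveat of this design: strtok treats consecutive delimiters as a single one, so an input such as "0::25:25:0:25:25" is not flagged as malformed by this sketch.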
I am making a game where the answer is stored in client_challenges->answer while the client inputs the answer (which is stored in buffer) in the following format:
A: myanswer
If the input starts with the letter A, then I need to compare myanswer with the pre-stored answer. Using the code below, I get the correct buffer and answer lengths, but if I print out my store array and the answer, the results differ. For example, if I input A: color, store contains colo instead of color. However, store-2 works in some cases. How can I fix this?
if (buffer[0] == 'A')
{
printf("ans len %zu, buff len %zu\n", strlen(client_challenges->answer), strlen(buffer)-4);
if(strlen(client_challenges->answer) == (strlen(buffer)-4))
{
char store[100];
for (int i = 1; i<= strlen(client_challenges->answer);i++)
{
store[i-1]=buffer[2+i];
}
store[strlen(store)-2] = '\0';
//store[strlen(client_challenges->answer)+1]='\0';
printf("Buffer: <%s>\n", buffer);
printf("STORE: %s\n",store);
printf("ANSWER: %s\n",client_challenges->answer);
if(strcmp(store,client_challenges->answer)==0)
{
send(file_descriptor, correct, strlen(correct), 0);
}
}
}
Example:
Client enters
A: Advancement
ans len 11, buff len 11
But when I print out store, it is Advancemen while the answer is Advancement. However, in my previous attempt, answer was soon and I entered "soon". It worked then.
Although I cannot pinpoint the exact cause of this bug from the given input, I can share my experience of how to find the faulty spot efficiently.
Always verify your input.
Never trust an input. You only printed out the lengths of the inputs, not their content. You should check every byte (preferably in hex) to spot non-printable characters. Some IDEs provide an integrated debugger that shows buffer contents.
Use defines, constants, or other human-readable names instead of 4 or 2. This makes life much easier. For instance:
/* what is 4 here */
strlen(buffer)-4
should have been:
/* remove "A: " (A, colon, and space); I do not know what the 4th item is, presumably a trailing newline */
strlen(buffer) - USER_ADDED_HEADERS
Get more familiar with C library
You did not actually need the store array here. C provides the strncmp function to compare two strings up to "n" characters, and memcmp to compare two buffers. Either one saves the copy operation (CPU cycles) and the stack memory.
A clearer version of your code fragment (without error checks) could have been written as:
if (buffer[0] == 'A') {
/* verify input here */
/* #define ANSWER_START 3 // offset of the answer after "A: "; your loop starts copying at buffer[3] */
/* compare lengths here if they are not equal return sth accordingly */
/* supplied answer correct? */
if (memcmp(client_challenges->answer,
buffer + ANSWER_START,
strlen(client_challenges->answer)) == 0) {
/* do whatever you want here */
}
}
Consistent code formatting
Code formatting DOES matter. Be consistent with indentation, curly braces, tabs vs. spaces, spaces before/after tokens, etc. You do not have to stick to one particular format, but you have to be consistent.
Use a debugger
The debugger is your best friend. Learn how to use it. A bug like this one can be identified very easily with a debugger.
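To make the advice above concrete, here is a minimal sketch of a hex dump for inspecting the input and a comparison that needs no store array. It assumes the input has the form "A: <answer>" optionally followed by a line ending, as in the question's example; ANSWER_PREFIX, dump_hex, and answer_matches are hypothetical names introduced for illustration.

```c
#include <stdio.h>
#include <string.h>

#define ANSWER_PREFIX "A: "   /* assumed header, taken from the question's example */

/* Print every byte in hex so non-printable characters become visible. */
void dump_hex(const char *s)
{
    for (; *s != '\0'; s++)
        printf("%02x ", (unsigned char)*s);
    printf("\n");
}

/* Hypothetical helper: returns 1 if `buffer` is "A: <answer>", optionally
 * followed by "\n" or "\r\n", and 0 otherwise. No copy is made. */
int answer_matches(const char *buffer, const char *answer)
{
    size_t prefix_len = strlen(ANSWER_PREFIX);
    size_t answer_len = strlen(answer);
    const char *rest;

    if (strncmp(buffer, ANSWER_PREFIX, prefix_len) != 0)
        return 0;
    if (strncmp(buffer + prefix_len, answer, answer_len) != 0)
        return 0;

    /* Whatever follows the answer must be nothing or a line ending. */
    rest = buffer + prefix_len + answer_len;
    return strcmp(rest, "") == 0 || strcmp(rest, "\n") == 0
        || strcmp(rest, "\r\n") == 0;
}
```

Calling dump_hex(buffer) before the comparison would show directly whether the client sent a trailing 0a or 0d 0a, which is the kind of detail the printed lengths alone do not reveal.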
How do I split a string into two strings (array name, index number) only if the string is matching the following string structure: "ArrayName[index]".
The array name can be 31 characters at most and the index 3 at most.
I found the following example, which is supposed to work with "Matrix[index1][index2]". I could not understand how it works well enough to extract just the part I need for my strings.
sscanf(inputString, "%32[^[]%*[[]%3[^]]%*[^[]%*[[]%3[^]]", matrixName, index1,index2) == 3
My attempt below did not succeed. What am I missing?
sscanf(inputString, "%32[^[]%*[[]%3[^]]", arrayName, index) == 2
How do I split a string into two strings (array name, index number) only if the string is matching the following string structure: "ArrayName[index]".
With sscanf, you don't. Not if you mean that you can rely on nothing being modified in the event that the input does not match the pattern. This is because sscanf, like the rest of the scanf family, processes its input and format linearly, without backtracking, and by design it fills input fields as they are successfully matched. Thus, if you scan with a format that assigns multiple fields or has trailing literal characters then it is possible for results to be stored for some fields despite a matching failure occurring.
But if that's ok with you then @gsamaras's answer provides a nearly-correct approach to parsing and validating a string according to your specified format, using sscanf. That answer also presents a nice explanation of the meaning of the format string. The problem with it is that it provides no way to distinguish between the input fully matching the format and the input failing to match at the final ], or including additional characters after it.
Here is a variation on that code that accounts for those tail-end issues, too:
char array_name[32] = {0}, idx[4] = {0}, c = 0;
int n;
if (sscanf(str, "%31[^[][%3[^]]%c%n", array_name, idx, &c, &n) >= 3
&& c == ']' && str[n] == '\0')
printf("arrayName = %s\nindex = %s\n", array_name, idx);
else
printf("Not in the expected format \"ArrayName[idx]\"\n");
The difference in the format is the replacement of the literal terminating ] with a %c directive, which matches any one character, and the addition of a %n directive, which causes the number of characters of input read so far to be stored, without itself consuming any input.
With that, if the return value is at least 3 then we know that the whole format was matched (a %n never produces a matching failure, but docs are unclear and behavior is inconsistent on whether it contributes to the returned field count). In that event, we examine variable c to determine whether there was a closing ] where we expected to find one, and we use the character count recorded in n to verify that all characters of the string were parsed (so that str[n] refers to a string terminator).
You may at this point be wondering at how complicated and cryptic that all is. And you would be right to do so. Parsing structured input is a complicated and tricky proposition, for one thing, but also the scanf family functions are pretty difficult to use. You would be better off with a regex matcher for cases like yours, or maybe with a machine-generated lexical analyzer (see lex), possibly augmented by machine-generated parser (see yacc). Even a hand-written parser that works through the input string with string functions and character comparisons might be an improvement. It's still complicated any way around, but those tools can at least make it less cryptic.
Note: the above assumes that the index can be any string of up to three characters. If you meant that it must be numeric, perhaps specifically a decimal number, perhaps specifically non-negative, then the format can be adjusted to serve that purpose.
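The hand-written alternative mentioned above could look roughly like the following sketch, which uses only strchr and memcpy. The function name split_array_ref and its return convention are assumptions made for illustration; like the sscanf version, it treats the index as any string of one to three characters.

```c
#include <string.h>

/* Hypothetical helper: splits "ArrayName[index]" into name (at most 31
 * characters) and idx (at most 3 characters). The outputs are written only
 * when the whole string matches. Returns 1 on a match, 0 otherwise. */
int split_array_ref(const char *s, char name[32], char idx[4])
{
    const char *open = strchr(s, '[');
    const char *close;
    size_t name_len, idx_len;

    if (open == NULL)
        return 0;
    close = strchr(open + 1, ']');
    if (close == NULL || close[1] != '\0')   /* ']' must be the last character */
        return 0;

    name_len = (size_t)(open - s);
    idx_len = (size_t)(close - open - 1);
    if (name_len == 0 || name_len > 31 || idx_len == 0 || idx_len > 3)
        return 0;

    /* All checks passed: only now touch the caller's buffers. */
    memcpy(name, s, name_len);
    name[name_len] = '\0';
    memcpy(idx, open + 1, idx_len);
    idx[idx_len] = '\0';
    return 1;
}
```

Because every check runs before anything is copied, a failed match leaves name and idx untouched, which is the guarantee sscanf cannot give.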
A naive example to get you started:
#include <stdio.h>
#include <string.h>
int main(void)
{
char str[] = "myArray[123]";
char array_name[32] = {0}, idx[4] = {0};
if(sscanf(str, "%31[^[][%3[^]]]", array_name, idx) == 2)
printf("arrayName = %s\nindex = %s\n", array_name, idx);
else
printf("Not in the expected format \"ArrayName[idx]\"\n");
return 0;
}
Output:
arrayName = myArray
index = 123
This catches the easy cases of input that is not in the expected format, such as "ArrayNameidx]" and "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOP[idx]", but not "ArrayName[idx".
The essence of sscanf() is to tell it where to stop, otherwise %s would read until the next whitespace.
This negated scanset %[^[] means read until you find an opening bracket.
This negated scanset %[^]] means read until you find a closing bracket.
Note: I used 31 and 3 as the width specifiers respectively, because the name of the array is assumed to be 31 characters at most and the index 3 at most, and the last slot of each buffer must be reserved for the null terminator. The size of the array for each token is therefore the maximum allowed length plus one.
How can I use sscanf to analyze string data?
Use "%n" to detect a completed scan.
array name can be 31 characters at most and the index 3 at most.
For illustration, let us assume the index is limited to a numeric value in the range [0 - 999].
Use string literal concatenation to present the format more clearly.
char array_name[32];  // array name can be 31 characters
#define NAME_FMT "%31[^[]"
char idx[4];          // index can be 3 digits
#define IDX_FMT "%3[0-9]"
int n = 0;            // be sure to initialize
sscanf(str, NAME_FMT "[" IDX_FMT "]" "%n", array_name, idx, &n);
// Did scan complete (is `n` non-zero) with no extra text?
if (n && str[n] == '\0') {
printf("arrayName = %s\nindex = %d\n", array_name, atoi(idx));
} else {
printf("Not in the expected format \"ArrayName[idx]\"\n");
}
I have a struct that contains a string and a length:
typedef struct string {
char* data;
size_t len;
} string_t;
Which is all fine and dandy. But, I want to be able to output the contents of this struct using a printf-like function. data may not have a nul terminator (or have it in the wrong place), so I can't just use %s. But the %.*s specifier requires an int, while I have a size_t.
So the question now is, how can I output the string using printf?
Assuming that your string doesn't have any embedded NUL characters in it, you can use the %.*s specifier after casting the size_t to an int:
string_t *s = ...;
printf("The string is: %.*s\n", (int)s->len, s->data);
That's also assuming that your string length is less than INT_MAX. If you have a string longer than INT_MAX, then you have other problems (it will take quite a while to print out 2 billion characters, for one thing).
A simple solution would just be to use unformatted output:
fwrite(x.data, 1, x.len, stdout);
There is no need to wrap this call in a retry loop: on success fwrite writes the entire requested range, and a return value smaller than x.len indicates a write error, so checking the return value is enough.
Since x.len is a size_t, it can never exceed SIZE_MAX and needs no separate range check.
how can I output the string using printf?
In a single call? You can't in any meaningful way, since you say you might have null terminators in strange places. In general, if your buffer might contain unprintable characters, you'll need to figure out how you want to print (or not) those characters when outputting your string. Write a loop, test each character, and print it (or not) as your logic dictates.
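A minimal sketch of such a loop follows. The escaping policy (printable bytes as-is, everything else as \xHH) and the name print_escaped are assumptions chosen for illustration; the struct is repeated from the question so the block is self-contained.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

typedef struct string {
    char *data;
    size_t len;
} string_t;

/* Hypothetical helper: writes s to fp, showing printable bytes as they are
 * and every other byte (including '\0') as a \xHH escape. */
void print_escaped(FILE *fp, const string_t *s)
{
    for (size_t i = 0; i < s->len; i++) {
        /* Convert to unsigned char first: isprint is undefined for negative values. */
        unsigned char ch = (unsigned char)s->data[i];
        if (isprint(ch))
            fputc(ch, fp);
        else
            fprintf(fp, "\\x%02X", (unsigned)ch);
    }
}
```

Calling print_escaped(stdout, &s) prints exactly s.len bytes' worth of content regardless of where any nul characters sit, at the cost of one library call per byte.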
I'm reading a txt file and getting all the chars that aren't space, transforming them to int using (int)c-'0' and that is working.
The problem is if the number has more than 1 digit, because I'm reading char by char.
How can I read a sequence of chars and transform that whole sequence into an int?
I tried using a string, but when I try to pass this string to my other function, it treats each index as a number, but what I need is that the whole string is treated as one number.
Any ideas?
A convenient way to do the conversion is to read the whole number into a buffer (string) and then call atoi. Make triple sure that the string is properly null-terminated.
One solution, I won't say it's good or bad in your case since you don't provide any code, but you could do something like this: (pseudoish code)
size_t i;
int val = 0;
const char *string = "5238785";
for (i = 0; string[i] != '\0'; i++) {
    // atoi() takes a string, not a single char, so convert the digit directly
    val = val * 10 + (string[i] - '0');
}
NOTE: I simplified it, and you should add more checks on the string to make sure you don't go out of bounds, etc. Make sure the string is null-terminated ('\0'). The concept is that you read one digit at a time and move what you've read so far "one step left" to make room for the next digit.
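Applied to the original situation of reading a file char by char, the same "shift left and add" idea could be sketched like this. The function name read_next_int is hypothetical, only non-negative decimal numbers are handled, and overflow is deliberately not checked.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: reads the next non-negative decimal integer from fp,
 * skipping any non-digit characters in front of it. Returns 1 and stores the
 * value in *out if a number was read, 0 at end of file. No overflow check. */
int read_next_int(FILE *fp, int *out)
{
    int c;
    int val = 0;

    /* Skip everything that is not a digit (spaces, newlines, ...). */
    while ((c = fgetc(fp)) != EOF && !isdigit(c))
        ;
    if (c == EOF)
        return 0;

    /* Accumulate consecutive digits: shift one decimal place, add the digit. */
    do {
        val = val * 10 + (c - '0');
    } while ((c = fgetc(fp)) != EOF && isdigit(c));

    if (c != EOF)
        ungetc(c, fp);   /* leave the separator for the next call */
    *out = val;
    return 1;
}
```

A caller would loop with while (read_next_int(fp, &n)) and pass each n on as a whole number. If the file is known to contain only whitespace-separated integers, fscanf(fp, "%d", &n) achieves the same with less code.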
I'm trying to use Mac OS X's listxattr C function and turn it into something useful in Python. The man page tells me that the function returns a string buffer, which is a "simple NULL-terminated UTF-8 strings and are returned in arbitrary order. No extra padding is provided between names in the buffer."
In my C file, I have it set up correctly it seems (I hope):
char buffer[size];
res = listxattr("/path/to/file", buffer, size, options);
But when I went to print it, I got only the first attribute, which was two characters long, even though its size is 25. So then I manually set buffer[3] = 'z' and, lo and behold, when I printed buffer again I got the first two attributes.
I think I understand what is going on. The buffer is a sequence of NULL-terminated strings, and stops printing as soon as it sees a NULL character. But then how am I supposed to unpack the entire sequence into ALL of the attributes?
I'm new to C and using it to figure out the mechanics of extending Python with C, and ran into this doozy.
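1. char *p = buffer;
2. If p has reached buffer + res (the end of the data that listxattr returned), stop. Otherwise get the length with strlen(p); if the length is 0, stop.
3. Process the current chunk.
4. p = p + length + 1;
5. Go back to step 2.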
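The steps above can be sketched as a small loop over the buffer. This example does not call listxattr itself; it only assumes that len is the byte count listxattr returned. The names for_each_name and print_name are hypothetical, and memchr is used instead of strlen so the scan cannot run past the valid data.

```c
#include <stdio.h>
#include <string.h>

/* Example visitor: prints one attribute name per line. */
void print_name(const char *name)
{
    printf("%s\n", name);
}

/* Hypothetical helper: calls visit() once for each name in a buffer of
 * back-to-back null-terminated strings. `len` is the number of valid bytes,
 * e.g. the value returned by listxattr. Returns the number of names seen. */
size_t for_each_name(const char *buffer, size_t len,
                     void (*visit)(const char *name))
{
    size_t count = 0;
    const char *p = buffer;
    const char *end = buffer + len;

    while (p < end) {
        /* Find this name's terminator without reading beyond the valid data. */
        const char *nul = memchr(p, '\0', (size_t)(end - p));
        if (nul == NULL)      /* last name is not terminated: stop */
            break;
        visit(p);
        count++;
        p = nul + 1;          /* the next name starts right after the terminator */
    }
    return count;
}
```

With the question's variables this would be called as for_each_name(buffer, (size_t)res, print_name) after checking that res is not negative; in a Python extension the visitor would append each name to a list instead of printing it.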
So you guessed pretty much right.
The listxattr function returns a bunch of null-terminated strings packed in next to each other. Since strings (and arrays) in C are just blobs of memory, they don't carry around any extra information with them (such as their length). The convention in C is to use a null character ('\0') to represent the end of a string.
Here's one way to traverse the list, in this case changing it to a comma-separated list.
int i = 0;
for (; i < res; i++)
if (buffer[i] == '\0' && i != res -1) //we're in between strings
buffer[i] = ',';
Of course, you'll want to make these into Python strings rather than just substituting in commas, but that should give you enough to get started.
It looks like listxattr returns the size of the buffer it has filled, so you can use that to help you. Here's an idea:
for(int i=0; i<res-1; i++)
{
if( buffer[i] == 0 )
buffer[i] = ',';
}
Now, instead of being separated by null characters, the attributes are separated by commas.
Actually, since I'm going to send it to Python, I don't have to process it C-style after all. Just use Py_BuildValue, passing it the format string s#, which knows what to do with it. You'll also need the size.
return Py_BuildValue("s#", buffer, size);
You can process it into a list on Python's end using split('\x00'). I found this after trial and error, but I'm glad to have learned something about C.