Need help figuring out parameters of __isoc99_scanf() - c

I have some C code that I'm trying to understand which uses the function __isoc99_scanf(). I haven't encountered that function before. I looked it up and it turns out that it is some kind of variation of scanf(). This is the code:
__isoc99_scanf(&DAT_00400d18,local_78,(undefined4 *)((long)puVar3 + 4));
&DAT_00400d18 is a C string containing the value "%s". local_78 is an array of unknown data type. puVar3 is a pointer that points to the last element of that array.
What really confuses me is why that function call has three parameters. I know that scanf() takes two parameters: the first one is the format string, and the second one is the memory address to save the data into. However, __isoc99_scanf() here is invoked with three parameters. I cannot understand why the third parameter is there. The first parameter &DAT_00400d18 is just "%s", which suggests that the second parameter is the memory location where the string is saved. But why do you need the third parameter when it's not even specified in the format string?
This is not my code, I didn't write it. Actually it is a disassembled version of the assembly code for a particular application that I'm trying to debug. But I've never seen __isoc99_scanf() before because I only used scanf() in my own code.

When you compile a call to scanf against glibc, the call is redirected to the __isoc99_scanf function in libc, which is the C99-conforming version of scanf. If you compile this code:
#include <stdio.h>
int main() {
    char buf[32];
    scanf("%31s", buf); /* width 31 leaves room for the terminating '\0' */
    return 0;
}
and decompile in GHIDRA, you get:
__isoc99_scanf(&DAT_001007b4,local_38);
where DAT_001007b4 is "%31s" and local_38 is a buffer. It behaves exactly the same as normal scanf. One important thing to keep in mind when using GHIDRA is that it doesn't know exactly how many arguments a variadic function like scanf expects at a given call site, so if a function appears to be passed too many arguments, as in your case, you should just ignore the extra arguments since the code will too.
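To see the "extra arguments are ignored" behavior in isolation, here is a minimal sketch that uses sscanf so the input is fixed. The helper name read_word and its parameters are hypothetical, chosen only for illustration. It relies on the C standard's rule that once the format is exhausted, any remaining arguments are evaluated but otherwise ignored.

```c
#include <stdio.h>

/* Hypothetical helper: reads one whitespace-delimited word from `input`.
 * The format has a single conversion, so `extra` is never written to:
 * surplus arguments to the scanf family are evaluated but otherwise ignored. */
int read_word(const char *input, char buf[32], int *extra)
{
    /* Kept in a variable, much like the DAT_... pointer in the decompiled
     * call, so the format is not a literal at the call site. */
    const char *fmt = "%31s";

    return sscanf(input, fmt, buf, extra); /* 1 when a word was read */
}
```

Calling read_word("hello world", buf, &extra) should store "hello" in buf and leave extra unchanged, which is the same situation as the three-argument call in the question.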

Related

Prompting user to enter number in MEX code

I wonder whether there is any way to prompt the user to enter an integer within MEX code.
Something similar to input in MATLAB or scanf in C.
I heard about mexCallMATLAB and its use in
str = mxCreateString("Enter extension: ");
mexCallMATLAB(1,&new_number,1,&str,"input");
However I do not really understand what is the point of mxCreateString and what does &str do. I will be really appreciative if anyone can elaborate a little about this or give me another technique to prompt user to enter data.
Let's start from the beginning. mexCallMATLAB calls a MATLAB function, user-defined MATLAB function or MEX file within MEX code. The function declaration is such that:
int mexCallMATLAB(int nlhs, mxArray *plhs[], int nrhs, mxArray *prhs[],
const char *functionName);
The parameters in detail are:
nlhs: The total number of output parameters that the MATLAB or MEX function is expected to produce.
*plhs[]: An array of pointers where each element is a pointer to an output argument
nrhs: The total number of input parameters that the MATLAB or MEX function is expected to take in.
*prhs[]: An array of pointers where each element is a pointer to an input argument.
functionName: A C string that contains the function name.
Take note that *plhs[] and *prhs[] must be an array of pointers to MEX-type variables. This is important because this will be used to understand what is going to happen next. Using the above logic, take a look at the call to mexCallMATLAB that you have referenced:
mexCallMATLAB(1,&new_number,1,&str,"input");
As we can see, the function to call in MATLAB is the input function which is a MATLAB function where the input argument is the string prompt that is used to display in the Command Window before taking in an input from the user and storing this into the output variable. Take note that what is expected is a numerical expression, usually a number or some operation on numbers.
An example call would look like so:
out = input('Enter a number: ');
Enter a number: would thus be displayed in the Command Window and whatever number you type in gets stored into the variable out.
When using mexCallMATLAB, you are doing the equivalent of the above but invoking this in MEX code. There is one input argument into this function and one output argument that is expected. The second parameter is technically a pointer to an output argument where this would be an array of just one element. The output of input will thus be stored in the variable new_number which is going to contain a number. The str variable is a MEX string that is created using mxCreateString. You must create a MEX string because remember that the expected inputs for the input variables for the function to call through mexCallMATLAB must be MEX variables. Therefore, str is a MEX string and &str would be a pointer to a MEX string. This is also technically an array of pointers with one element as well.
Once this function is called, you enter a number in the MATLAB Command Window, and that number is sent back into MEX and stored into new_number in your MEX code.
This seems to be an elegant way to get a variable from the MATLAB Command Window into MEX. I haven't encountered any other method from what I have seen in my MEX experience, so keep using it!

makes pointer from integer without a cast... DS1307, RTC, BCD

I know this question has been asked before. It's all over Google and on this site too, but I can't understand people when they explain it. I have already spent far too many hours trying to understand and I still don't so please try to understand that there's something fundamental I am NOT understanding... here we go.
When programming in C on Proteus, I often get the warning and/or error (in this case warning):
makes pointer from integer without a cast
and I don't get it. Like I said, I've already spent hours looking into it and I get it has to do with types, and/or pointers, blah. Someone please explain it to me like a normal person.
Also, I get this a lot. Could it be possible to get this warning from other types of variables without a cast? A character? How would I go about fixing this problem now, and avoiding it in the future?
Here's the context...
#include <avr/io.h>
#include <avr/interrupt.h>
#include <util/delay.h>
#include "stdlib.h"
#include "USART.h"
#include "I2C.h"
#include "ds1307.h"
void Wait()
{
uint8_t i;
for(i=0;i<20;i++)
_delay_loop_2(0);
}
uint8_t ss,mm,hh,dd,nn,yy,x; // Appropriately labeled variables
uint16_t sec[3],min[3],hr[3],day[3],month[3],year[3],mode[2];
uint16_t secs,mins,hrs,days,months,years,modes;
int main(void)
{
_delay_ms(50);
USART_interrupt_init(); //
USART_send('\r'); // Send carriage return
_delay_ms(100); // Allows for the LCD module to initialize
I2CInit(); // Initialize i2c Bus
DS1307Write(0x07,0x10); // Blink output at 1Hz
while(1)
{
int i=0;
/* SECONDS */
DS1307Read(0x00,&ss); // Read seconds address
/* MINUTES */
DS1307Read(0x01,&mm); // Read minutes address
/* HOURS */
DS1307Read(0x02,&hh); // Read hours address
/* DAY */
DS1307Read(0x04,&dd); // Read day-of-month address
/* MONTH */
DS1307Read(0x05,&nn); // Read month address
/* YEAR */
DS1307Read(0x06,&yy); // Read year address
for(i=0;i<5;i++)
{Wait();i++;}
sec[0]=(0b00001111 & ss);
sec[1]=((0b01110000 & ss)>>4);
sec[2]='\0';
itoa(sec[0],secs,10);
USART_putstring(secs); // place string in buffer
and the 2 errors:
../main.c:59: warning: passing argument 2 of 'itoa' makes pointer from integer without a cast
../main.c:62: warning: passing argument 1 of 'USART_putstring' makes pointer from integer without a cast
Your compiler is telling you that the function expects a pointer, but you've passed an integer. So it's going to automatically treat your integer value as an address and use that as the pointer.
For example, itoa expects a pointer to a memory location for its second parameter - that's where it stores the resulting string that it builds from the integer you pass it. But you've passed secs for that - a uint16_t. The compiler is warning you that whatever value is in that integer is going to be used as the address where itoa puts its resulting string.
This kind of thing would cause a segfault on most targets, but I'm not familiar with Proteus.
Anyway, as an example, to fix the itoa warning, use something like the following:
char secs[3];
...
itoa(sec[0], secs, 10);
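Since itoa is not part of standard C, here is a minimal sketch of the same idea using snprintf as a portable stand-in. It assumes the seconds register holds packed BCD, as the masks in the question suggest; bcd_to_decimal and format_seconds are hypothetical helper names, not part of any DS1307 library.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: converts a packed-BCD seconds value (tens in bits 4-6,
 * ones in bits 0-3, matching the masks used in the question) to a number. */
uint8_t bcd_to_decimal(uint8_t bcd)
{
    uint8_t tens = (uint8_t)((bcd & 0x70) >> 4);
    uint8_t ones = (uint8_t)(bcd & 0x0F);
    return (uint8_t)(tens * 10 + ones);
}

/* Writes the value as two digits into a caller-supplied char buffer.
 * The second argument is a real pointer to storage, not an integer.
 * Returns whatever snprintf returns (2 for a two-digit value). */
int format_seconds(uint8_t bcd, char out[3])
{
    return snprintf(out, 3, "%02u", (unsigned)bcd_to_decimal(bcd));
}
```

The filled buffer is a char array, so it can be passed to any function that expects a char *, which is what the second warning in the question is about.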
Hope that helps.
So here's a completely different answer to the question, at a much higher level, and to make the point clear, we're going to take a step back from C programming and talk about building houses. We're going to give instructions to the people building the house, but we're going to imagine we have some rigid, codified way of doing it, sort of like function calls.
Suppose it's time to paint the outside of the house. Suppose there's a "function" paint_the_house() that looks like this:
paint_the_house(char *main_color, char *trim_color);
You decide you want white trim on a yellow house, so you "call"
paint_the_house("white", "yellow");
and the painters dutifully paint the house white with yellow trim. Whoops! You made a mistake, and nobody caught it, and now the house is the wrong color.
Suppose there's another function, finish_the_floors() that looks like this:
finish_the_floors(char *floor_material, char *color)
The floor_material argument is supposed to be a string like "hardwood", "carpet", "linoleum", or "tile". You decide you want red tile floors in your house, so you call
finish_the_floors("red", "tile");
But the guy who installs the floors comes back and says, "Listen, buddy, 'red' is not a floor material, and 'tile' is not a color, so do you want to try that again?" This time, someone caught your mistake.
Finally, suppose there's a function
furnish_the_bathroom(char *bath_or_shower, int number_of_sinks)
where bath_or_shower is supposed to be the string "bathtub" or "shower", and the second argument is supposed to be the number of sinks you want. You decide you want two sinks and a bathtub and, continuing your careless ways, you call:
furnish_the_bathroom(2, "bathtub");
This time, your bogus "function call" doesn't even make it to the guy who's going to build the bathtub. The architect's dim-bulb nephew, who his brother conned him into hiring for the summer, who can't even tell the difference between a toaster oven and a two-by-four, he's been put in charge of relaying instructions from you to the laborers, and even he can see that there's something wrong. "Um, wait a minute," he whines. "I thought the first thing was supposed to be a string, and the second thing was supposed to be a number?"
And now we can go back to your question, because that's basically what's going on here. When you call a function, you have to pass the right arguments in the right order (just like your instructions to the builders). The compiler can't catch all your mistakes, but it can at least notice that you're doing something impossibly wrong, like passing an int where you're supposed to pass a pointer.
It means that the compiler implicitly converts it for you; however, it notifies you that it did so by emitting a warning.
Here's an example using numbers:
float f = 1.0;
int i = f;
Depending on the platform, language and compiler settings, a couple of scenarios are possible:
compiler implicitly casts float to int without a warning (bad)
idem but issues a warning (better)
compiler settings changed to treat warnings as errors (safe, security critical etc...)
Warnings are a good hint on possible bugs or errors and it's generally wise to fix them instead of suppressing or ignoring them.
In your specific case, I've been looking for a USART_putstring and the first one I found was this:
void USART_putstring(char* StringPtr)
No need to look further, passing an int to a function expecting char* (if this is the case), 'might' produce an unexpected result.
Solution
Read the documentation of USART_putstring and ensure you 'transform' your input data to the correct type it accepts, the warning will vanish by itself.
EDIT:
+1 for Aenimated1
Ensure that you understand what are the differences between 'integer' and 'pointer to integer' too, he explained that rather well :)
Integers are for counting. Pointers are an abstract indication of where a variable may be found.
To keep things clear in your head it is a good idea to not mix up the two, even if you are on a system where the concrete implementation of a pointer is the same as the implementation of an integer.
It is an error to convert integer to pointer or vice versa, unless you write a cast to say "I know what I'm doing here".
Unfortunately some compilers will spit out "warning" and then generate a bogus binary. If possible, see if you can use compiler switches that will make the compiler say "error" for this case.
If you see this error it usually means that you supplied an integer where the compiler was expecting you to supply a pointer.
In your code, the call itoa(sec[0],secs,10); is the problem. The itoa function signature is:
char * itoa ( int value, char * str, int base );
You supplied secs, which is a uint16_t (a 16-bit integer), for the parameter char * str. This is an error because it expects the address of an object, but you supplied a number.
To fix this you need to stop supplying integers for parameters that are pointers.
For assistance with how to convert the output of DS1307Read to a display string, post a question asking about that specifically.

Two approaches to writing functions

I am asking this question in the context of the C language, though it applies really to any language supporting pointers or pass-by-reference functionality.
I come from a Java background, but have written enough low-level code (C and C++) to have observed this interesting phenomenon. Supposing we have some object X (not using "object" here in the strictest OOP sense of the word) that we want to fill with information by way of some other function, it seems there are two approaches to doing so:
Returning an instance of that object's type and assigning it, e.g. if X has type T, then we would have:
T func(){...}
X = func();
Passing in a pointer / reference to the object and modifying it inside the function, and returning either void or some other value (in C, for instance, a lot of functions return an int corresponding to the success/failure of the operation). An example of this here is:
int func(T* x){...*x = 1;...}
func(&X);
My question is: what situations make one method better than the other? Are they equivalent approaches to accomplishing the same outcome? What are the restrictions of each?
Thanks!
There is a reason that you should always consider using the second method, rather than the first. If you look at the return values for the entirety of the C standard library, you'll notice that there's almost always an element of error handling involved in them. For example, you have to check the return value of the following functions before you assume they've succeeded:
calloc, malloc and realloc
getchar
fopen
scanf and family
strtok
There are other non-standard functions that follow this pattern:
pthread_create, etc.
socket, connect, etc.
open, read, write, etc.
Generally speaking, a return value conveys a number of items successfully read/written/converted or a flat-out boolean success/fail value, and in practice you'll almost always need such a return value, unless you're going to exit(EXIT_FAILURE); at any errors (in which case I would rather not use your modules, because they give me no opportunity to clean up within my own code).
There are functions that don't use this pattern in the standard C library, because they use no resources (e.g. allocations or files) and so there's no chance of any error. If your function is a basic translation function (e.g. like toupper, tolower and friends which translate single character values), for example, then you don't need a return value for error handling because there are no errors. I think you'll find this scenario quite rare indeed, but if that is your scenario, by all means use the first option!
In summary, you should always strongly consider using option 2, reserving the return value for a similar use, for the sake of consistency with the rest of the world, and because you might later decide that you need the return value for communicating errors or the number of items processed.
Method (1) passes the object by value, which requires that the object be copied. It's copied when you pass it in and copied again when it's returned. Method (2) passes only a pointer. When you're passing a primitive, (1) is just fine, but when you're passing an object, a struct, or an array, that's just wasted space and time.
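As a rough illustration of the two methods with a large object, here is a minimal sketch. The type Big and the functions make_big and fill_big are hypothetical names invented for this example; whether the by-value version really costs an extra copy depends on the compiler and calling convention.

```c
#include <stddef.h>

/* Hypothetical large type used only to illustrate the two calling styles. */
typedef struct {
    int values[1000];
} Big;

/* Method 1: build the object locally and return it by value.
 * Conceptually the whole struct is copied back to the caller. */
Big make_big(int seed)
{
    Big b;
    for (size_t i = 0; i < 1000; i++)
        b.values[i] = seed + (int)i;
    return b;
}

/* Method 2: the caller owns the storage; only a pointer is passed.
 * The return value is free to report success (0) or failure (-1). */
int fill_big(Big *out, int seed)
{
    if (out == NULL)
        return -1;
    for (size_t i = 0; i < 1000; i++)
        out->values[i] = seed + (int)i;
    return 0;
}
```

Both produce the same contents; the difference is who provides the storage and what the return value is used for.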
In Java and many other languages, objects are always passed by reference. Behind the scenes, only a pointer is copied. This means that even though the syntax looks like (1), it actually works like (2).
I think I got you.
These two approaches are very different.
The question you have to ask yourself whenever you are trying to decide which approach to take is:
Which class would have the responsibility?
If you pass a reference to the object, you decouple the creation of the object from the function and leave it to the caller. This makes the functionality more like a service: you can create a util class in which all of the functions are stateless; they get an object, manipulate the input, and return it.
The other approach is more like an API: you are requesting an operation.
For example, if you are getting an array of bytes and would like to convert it to a string, you would probably choose the first approach.
And if you would like to do some operation in a DB, you would choose the second one.
Whenever you have more than one function from the first approach covering the same area, you would encapsulate them into a util class; the same applies to the second, which you would encapsulate into an API.
In method 2, we call x an output parameter. This is actually a very common design utilized in a lot of places...think some of the various built-in C functions that populate a text buffer, like snprintf.
This has the benefit of being fairly space-efficient, since you won't be copying structs/arrays/data onto the stack and returning brand new instances.
A really, really convenient quality of method 2 is that you can essentially have any number of "return values." You "return" data through the output parameters, but you can also return a success/error indicator from the function.
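A minimal sketch of that idea: two results come back through output parameters while the return value carries only the status. The names DivStatus and divide are hypothetical and exist just for this illustration.

```c
#include <limits.h>

/* Hypothetical status codes, named only for this illustration. */
typedef enum { DIV_OK = 0, DIV_BY_ZERO, DIV_OVERFLOW } DivStatus;

/* Two results are "returned" through the output parameters; the actual
 * return value is reserved for the success/error indicator.
 * On failure the output parameters are left untouched. */
DivStatus divide(int numerator, int denominator, int *quotient, int *remainder)
{
    if (denominator == 0)
        return DIV_BY_ZERO;
    if (numerator == INT_MIN && denominator == -1)
        return DIV_OVERFLOW; /* the quotient would not fit in an int */

    *quotient = numerator / denominator;
    *remainder = numerator % denominator;
    return DIV_OK;
}
```

A caller would declare int q, r; and check the status of divide(17, 5, &q, &r) before using q and r.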
A good example of method 2 being used effectively is in the built-in C function strtol. This function converts a string to a long (basically, parses a number from a string). One of the parameters is a char **. When calling the function, you declare char * endptr locally, and pass in &endptr.
The function will return either:
the converted value if it was successful,
0 if it failed, or
LONG_MIN or LONG_MAX if it was out of range
as well as set the endptr to point to the first non-digit it found.
This is great for error reporting if your program depends on user input, because you can check for failure in so many ways and report different errors for each.
If endptr doesn't point to the string's terminating null character ('\0') after the call to strtol, then you know precisely that the user entered something that is not a pure integer, and you can print straight away the character that the conversion stopped on if you'd like.
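The checks that strtol makes possible can be sketched as a small wrapper. This is only one way to arrange them; parse_long and the PARSE_* codes are hypothetical names, and base 10 is an assumption.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical status codes for this sketch. */
enum { PARSE_OK = 0, PARSE_NO_DIGITS, PARSE_TRAILING_CHARS, PARSE_OUT_OF_RANGE };

/* Parses a base-10 long from `text`. The converted value is stored through
 * `out` only on success; otherwise the return value says what went wrong. */
int parse_long(const char *text, long *out)
{
    char *endptr;
    long value;

    errno = 0;
    value = strtol(text, &endptr, 10);

    if (endptr == text)
        return PARSE_NO_DIGITS;       /* nothing was converted */
    if (errno == ERANGE)
        return PARSE_OUT_OF_RANGE;    /* result clamped to LONG_MIN/LONG_MAX */
    if (*endptr != '\0')
        return PARSE_TRAILING_CHARS;  /* endptr points at the offending char */

    *out = value;
    return PARSE_OK;
}
```

With this arrangement, input such as "12x" is reported differently from "abc" or from a number too large for a long, which is the kind of distinct error reporting described above.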
Like Thom points out, Java makes implementing method 2 simpler by simulating pass-by-reference behavior, which is just pointers behind the scenes without the pointer syntax in the source code.
To answer your question: I think C lends itself well to the second method. Functions like realloc are there to give you more space when you need it. However, there isn't much stopping you from using the first method.
Maybe you're trying to implement some kind of immutable object. The first method will be the choice there. But in general, I opt for the second.
(Assuming we are talking about returning only one value from the function.)
In general, the first method is used when type T is relatively small. It is definitely preferable with scalar types. It can be used with larger types. What is considered "small enough" for these purposes depends on the platform and the expected performance impact. (The latter is caused by the fact that the returned object is copied.)
The second method is used when the object is relatively large, since this method does not perform any copying. And with non-copyable types, like arrays, you have no choice but to use the second method.
Of course, when performance is not an issue, the first method can be easily used to return large objects.
An interesting matter is optimization opportunities available to C compiler. In C++ language compilers are allowed to perform Return Value Optimizations (RVO, NRVO), which effectively turn the first method into the second one "under the hood" in situations when the second method offers better performance. To facilitate such optimizations C++ language relaxes some address-identity requirements imposed on the involved objects. AFAIK, C does not offer such relaxations, thus preventing (or at least impeding) any attempts at RVO/NRVO.
Short answer: take 2 if you don't have a necessary reason to take 1.
Long answer: In the world of C++ and its derived languages, Java, C#, exceptions help a lot. In the C world, there is not very much you can do. Following is a sample API I took from the CUDA library, which is a library I like and consider well designed:
cudaError_t cudaMalloc (void **devPtr, size_t size);
compare this API with malloc:
void *malloc(size_t size);
in old C interfaces, there are many such examples:
int open(const char *pathname, int flags);
FILE *fopen(const char *path, const char *mode);
I would argue to the end of the world that the interface CUDA provides is much more obvious and leads to correct results.
There is another set of interfaces where the valid return-value space actually overlaps with the error codes, so the designers of those interfaces scratched their heads and came up with ideas that are not brilliant at all, say:
ssize_t read(int fd, void *buf, size_t count);
An everyday function like reading a file's contents is restricted by the definition of ssize_t. Since the return value has to encode the error code too, it has to allow negative numbers. On a 32-bit system, the maximum of ssize_t is 2G, which very much limits the number of bytes you can read from your file in one call.
If your error designator is encoded inside of the function return value, I bet 10/10 programmers won't try to check it, though they really know they should; they just don't, or don't remember, because the form is not obvious.
Another reason is that human beings are very lazy and not good at dealing with ifs. The documentation of these functions will say:
if return value is NULL then ... blah.
if return value is 0 then ... blah.
yak.
With the CUDA-style form, things change. How do you judge whether the value has been returned? No NULL or 0 any more. You have to use SUCCESS, FAILURE1, FAILURE2, or something similar. This interface forces users to code more safely and makes the code much more robust.
With these macros or enums, it's much easier for programmers to learn about the effect of the API and the cause of different failures too. With all these advantages, there is actually no extra runtime overhead for it either.
I will try to explain :)
Let's say you have to load a giant rocket into a semi.
Method 1)
The truck driver leaves the truck in a parking lot and goes off to find a hooker; you are stuck with putting the load onto a forklift or some kind of trailer to bring it to the truck.
Method 2)
The truck driver forgets the hooker and backs the truck right up to the rocket, so you just need to push it in.
That is the difference between those two :). What it boils down to in programming is:
Method 1)
The caller reserves an address for the called function to return its value to, but how the called function gets the value there does not matter to the caller, and neither does whether it has to reserve another address: "I need something returned, it is your job to get it to me" :). So the called function reserves an address for its calculations, stores the value at that address, and then returns the value to the caller. The caller then says, "Oh, thank you, let me just copy it to the address I reserved earlier."
Method 2)
The caller says, "Hey, I will help you: I will give you the address that I have reserved; store whatever calculations you do in it." This way you save not only memory but also time.
And I think second is better, and here is why:
So let's say that you have a struct with 1000 ints inside of it. Method 1 would be pointless: it will have to reserve 2*1000*32 bits of memory, which is 64,000 bits, plus you have to copy the data to the first location and then copy it to the second one. So if copying each bit took 1 millisecond, you would need to wait 64 seconds to store and copy the variables, whereas if you pass the address you only have to store it once.
They are equivalent to me but not in the implementation.
#include <stdio.h>
#include <stdlib.h>
int func(int a,int b){
return a+b;
}
int funn(int *x){
*x=1;
return 777;
}
int main(void){
int sx,*dx;
/* case static' */
sx=func(4,6); /* looks legit */
funn(&sx); /* looks wrong in this case */
/* case dynamic' */
dx=malloc(sizeof(int));
if(dx){
*dx=func(4,6); /* looks wrong in this case */
sx=funn(dx); /* looks legit */
free(dx);
}
return 0;
}
In a static' approach it is more comfortable to me doing your first method. Because I don't want to mess with the dynamic part (with legit pointers).
But in a dynamic' approach I'll use your second method. Because it is made for it.
So they are equivalent but not the same, the second approach is clearly made for pointers and so for the dynamic part.
And so far more clear ->
int main(void){
int sx,*dx;
sx=func(4,6);
dx=malloc(sizeof(int));
if(dx){
sx=funn(dx);
free(dx);
}
return 0;
}
than ->
int main(void){
int sx,*dx;
funn(&sx);
dx=malloc(sizeof(int));
if(dx){
*dx=func(4,6);
free(dx);
}
return 0;
}

Comparison between the two printf statements

Please take a look at the following two C statements:
printf("a very long string");
printf("%s","a very long string");
They produce the same result, but there is definitely some difference under the hood, so what is the difference and which one is better? Please share your ideas!
If you know what the string contents are, you should use the first form because it is more compact. If the string you want to print can come from the user or from any other source such that you do not know what the string contents are, you must use the second form; otherwise, your code will be wide open to format string injection attacks.
The first printf works like this
'a' is not a special character: print it
' ' is not a special character: print it
'v' is not a special character: print it
...
'g' is not a special character: print it
The second printf works like this
'%' is a special character:
's' print the contents of the string pointed to by the 2nd parameter
The first one passes one parameter and the second passes 2, so the call is slightly faster in the first one.
But in the first one, printf() has to scan the long string for format specifications and in the second one, the format string is very short, so the actual processing is probably faster in the second one.
More important (to me anyway) is that "a very long string" is not likely to be a constant string as it is in this example. If you're printf'ing a long string, you're probably using a pointer to something that the program generated. In that case, it's a MUCH better idea to use the second form because otherwise somewhere, somehow, sometime, the long string will contain a printf format specification, and that will cause printf to go looking for another argument and your program will crash. This exact problem just happened to me about a week ago in code that we have been using for nearly 20 years.
The bottom line is that your printf format specification should always be a constant string. If you need to output a variable, use printf("%s",var) or better yet, fputs(var, stdout).
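A minimal sketch of the "constant format string" rule follows. The helper names format_untrusted and print_untrusted are hypothetical; text stands for any string whose contents the program does not control.

```c
#include <stdio.h>

/* The format is a constant; `text` is only ever data, so any '%'
 * characters inside it are copied literally. */
int format_untrusted(char *out, size_t out_size, const char *text)
{
    return snprintf(out, out_size, "%s", text);
}

/* fputs does no format processing at all. */
int print_untrusted(const char *text)
{
    return fputs(text, stdout);
}

/* By contrast, printf(text) would treat "%s" or "%n" inside `text`
 * as conversions and read arguments that were never passed. */
```

Passing a string such as "100%s done %n" to either helper should reproduce it character for character, which is exactly what the one-argument printf form cannot guarantee.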
The first is no less efficient than the second. Since there are no format sequences and no corresponding arguments, no formatting work must be done by the printf() function. In the second case, if the compiler isn't smart enough to catch this, you will be calling for unnecessary work (note: minuscule compared to actually sending (and reading!) the output at the terminal).
printf was designed for printing with formatting. It is more useful to provide formatting arguments for the sake of debugging although they aren't required.
%s takes a value of a const char* whereas leaving no argument just prints the literal expression.
You could still cast a different pointer to the const char* explicitly and change its contents without changing the output expression.
First of all you should define "better" better, since it is not specific enough by itself. Better in what way? Performance, maintenance, readability, extensibility ...
With the one line of code presented I would choose option 1 for almost all versions of 'better':
It's more readable
It does what it should do and nothing more (KISS principle)
It's faster (no pointless moving memory around to stuff one string into another). But unless you are doing this printf a hell of a lot of times in a loop, this is not that big a plus.

sprintf not copying?

I am writing a fuzzer that uses the system() function and I need to copy:
char a[1100]; /* full of A's with null ending */
into:
char tmp[10000];
I used:
sprintf(tmp, "%s", a);
When I printf tmp there is nothing printed. What am I doing wrong?
There's no way to say what you are doing wrong without seeing the whole thing.
The above sprintf should work, although strcpy would make more sense for that purpose. I'd guess that sprintf works fine. Could be that your a array is not "full of A's" as you believe, but rather an empty string (full of zeros). Or maybe it is your printing that either doesn't work or it works but you don't see the output for some reason.
My bet would be that your a is an empty string. No A's there. Where and how do you put those A's into the a array?
Output is often line-buffered. If the string you're printing has no newline, you might not see it without calling fflush first (also see http://c-faq.com/stdio/fflush.html). But as AndreyT said, we can't tell without seeing the rest of your code.
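Putting the suggestions from both answers together, here is a minimal sketch of what a working version might look like. The helper names fill_with_a and copy_and_show are hypothetical, and snprintf is used in place of sprintf as a bounded alternative; the question's actual code is not shown, so this is only a guess at its shape.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: fills `a` with 'A' characters and terminates it.
 * Without the explicit '\0', `a` is not a valid C string. */
void fill_with_a(char *a, size_t size)
{
    if (size == 0)
        return;
    memset(a, 'A', size - 1);
    a[size - 1] = '\0';
}

/* Copies `a` into `tmp` with a bounded call, prints it, and flushes stdout
 * so the text appears even though it has no trailing newline.
 * Returns the number of characters copied, or -1 if `tmp` was too small. */
int copy_and_show(char *tmp, size_t tmp_size, const char *a)
{
    int n = snprintf(tmp, tmp_size, "%s", a);
    if (n < 0 || (size_t)n >= tmp_size)
        return -1;
    fputs(tmp, stdout);
    fflush(stdout);
    return n;
}
```

With char a[1100] and char tmp[10000] as in the question, fill_with_a(a, sizeof a) followed by copy_and_show(tmp, sizeof tmp, a) should copy and print 1099 characters.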
