Why does strlen return size_t? - C

In C:
My string length function is returning a size_t value?
Why is it not returning an integer, which I thought was conventional? And one more thing I noticed: when I was trying to concatenate this string with another string, I received a bus error when I ran the program.
Context: I was playing with the GMP library, converting big numbers to strings, and I ended up with the above situation.
What kind of a string is that? Is my operating system playing a role in this issue? I use a Mac with a 64-bit OS.
Edited: The error message I received was:
: warning: format ‘%d’ expects type ‘int’, but argument 3 has type ‘size_t’
Thanks!
@all: Thanks for the answers, but I will post the bus error as another question because it seems to be a different issue.

The problem is that int might not be wide enough to store the whole range of possible length values. For example, on a 64-bit system you can have a string longer than 4 gigabytes, and if int is 32 bits you can't possibly return the length of such a long string via an int variable.
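As a rough illustration of that truncation (the length value below is made up, and the sketch assumes a platform with a 64-bit size_t and a 32-bit int):

#include <stdio.h>

int main(void)
{
    size_t len = 5000000000u;     /* hypothetical length of a ~5 GB string */
    int narrowed = (int)len;      /* does not fit in 32 bits; result is implementation-defined */

    printf("as size_t: %zu\n", len);
    printf("as int:    %d\n", narrowed);
    return 0;
}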

strlen() has always returned size_t ... and the POSIX standard also says so.
I guess the reason is that int is signed, and even an unsigned int might not have enough capacity to hold the size of an object (say, a 32-bit int on x86-64 with 16 GB of RAM) ... the example is extreme, but possible.

POSIX strlen() does return size_t.
As to what caused the bus error, it's impossible to say without seeing the code and knowing more details about the exact nature of your changes. One possibility is that you caused a buffer overrun or did something you shouldn't have with a NULL pointer.

To address your warning (which is really an error: passing the wrong type to printf invokes undefined behavior), use %zu rather than %d when printing size_t values.
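For example, a minimal sketch of the corrected call (the string here is just a stand-in for whatever GMP produced):

#include <stdio.h>
#include <string.h>

int main(void)
{
    const char *num = "123456789012345678901234567890";
    size_t len = strlen(num);

    printf("length = %zu\n", len);   /* %zu matches size_t; %d here would be undefined behavior */
    return 0;
}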

strlen() has returned size_t since at least ISO C90 -- I just checked my copy. And that standard should have no technical differences from ANSI C89.
There was a change of convention (size_t wasn't in K&R C), but it was a long time ago.

There is a very simple and logical reason for all of the standard library functions to work with size_t when it comes to lengths of memory blocks: the built-in sizeof operator yields a size_t result as well.
Moreover, size_t is unsigned, of a particular size, tied to the architecture, and semantically different from a generic int, which is meant for storing any number, from the count of trees around your office to your SO reputation.
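A short sketch of that consistency, where sizeof, strlen and the loop index all use size_t (the buffer is arbitrary):

#include <stdio.h>
#include <string.h>

int main(void)
{
    char buf[64] = "hello";

    size_t cap = sizeof buf;     /* sizeof yields a size_t */
    size_t len = strlen(buf);    /* and so does strlen */

    for (size_t i = 0; i < len; i++)   /* size_t index: no signed/unsigned mismatch */
        putchar(buf[i]);
    putchar('\n');

    printf("capacity %zu, length %zu\n", cap, len);
    return 0;
}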


Should I use a more universal variable alias/type, or use the variable types that the functions I will use take?

Introduction to the problem
A few months back, while I was really struggling to get better at C, I decided it was time to ditch the usual variable types (char, short, int, long) in favor of the aliases provided in stdint.h, i.e. uint8_t, etc. The initial motivation was that long has different byte sizes on different machines, but soon I started to see that it became a snob thing for me to "differentiate" myself from other IT students.
I'm now doing a project and, as always, I started to use the aliases mentioned above, as they are easy for me to read and write, and I don't have to wonder what storage size my computer will choose (I know the size doesn't change at will and is dependent on the computer running it, but it not being evident and explicit causes me to obsess over it).
As usual, my compiler starts complaining from time to time, especially about pointer conversions like this one:
passing 'int8_t *' (aka 'signed char *') to parameter of type 'char *' converts between pointers to integer types where one is of the unique plain 'char' type and the other is not [-Wpointer-sign]
The Question
Although I usually just cast them explicitly, I am wondering if I'm making the code worse because of a preference of mine, and what good practice is: using a more universal (to the project) variable alias, or using the aliases depending on how they will be used (this implies using 2-3 different aliases per file and probably 10+ project-wide)?
Example
png_voidp per_chunck_ptr;
size_t readnum;
FILE *fp;

if ((fp = fopen(file, "rb")) == NULL) {
    fp = NULL;
    //ERROR()
    //ABORT(GENERAL_OPENING_READ_FILE_ERROR, " ERR01 ");
}

int8_t *pop;
pop = malloc(GENERAL_PNG_SIG_SIZE * sizeof(int8_t));

readnum = fread(pop, sizeof(int8_t), GENERAL_PNG_SIG_SIZE, fp);
if (readnum != GENERAL_PNG_SIG_SIZE) {
    fclose(fp);
    //ERR 5
    fp = NULL;
    fprintf(stderr, GENERAL_READ_FILE_ERROR, "ERR02");
    exit(5);
}

if (png_sig_cmp((png_const_bytep)pop, 0, GENERAL_PNG_SIG_SIZE) != 0) {
    fclose(fp);
    fp = NULL;
    fprintf(stderr, "File is not recognized as png file\n");
    //err 6
    exit(6);
}

free(pop);
The error checking is terrible; I use 6-7 different ways to print errors, for no other reason than that I was learning new functions or got distracted, used a certain one once, and forgot it there. Although the answer could extend to that, let's focus only on the variable pop for now.
At the time I was pretty sure that the pointer "pop" would later be used as a parameter of a function which typically prefers png_bytep, or in this case png_const_bytep. Furthermore, the library has its own function for allocating memory, although from what I've seen it's not much different from using malloc (I haven't read the manual for the specific implementation and only know a few generic theoretical concepts about it; if you haven't figured it out, it's libpng).
Now let's focus on size_t readnum. This part goes against what I said earlier, since I'm using another alias when I could've just said uint64_t. The thing is that, if I'm not wrong, size_t can store the maximum size of a theoretically possible object of any type (including arrays). (There are types for bigger "words", like long double, which from what I've read is 10 bytes long, or __uint128_t, which is 16 bytes, but these aren't as easily handled by the CPU and require software support to store and manipulate; again, correct me if I'm wrong.) So I shouldn't use a fixed-width type here, because if someday CPU manufacturers decide to increase the bit size of the registers, ALU, and memory addresses (I heard my professor say that modern Intel ones sometimes had around 80 bits) to something like 128 bits, size_t would grow and my function could fail if a fixed-width type ever overflowed. This is what kept me from using uint64_t.
Conclusion
So should I just drop my preference for exact-size types and instead use the type aliases that the functions use, or is using exact-size aliases really better, as they allow certain behavior to be defined more precisely?
As a general rule, use the exact-width types when you need to control exactly how big a particular variable is, when doing any kind of bit manipulation, or when dealing with raw data. Use the base types when interacting with an API that specifies those types, i.e. if a function expects a long, pass it a long instead of a uint64_t.
Regarding the specific warning you're getting, that's because the type char may be either signed or unsigned, depending on the implementation. So if you were to pass an int8_t * to a function expecting a char *, and that system happens to define char as an unsigned type, then you have a mismatch. (Strictly speaking, plain char, signed char, and unsigned char are three distinct types, so the pointer types differ even when char is signed.)
This isn't a big deal, however, since you are allowed to access an integer object via a pointer to either the signed or unsigned version of its type.
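A minimal sketch of the warning scenario and the explicit cast that documents the conversion (the function and buffer below are illustrative, not taken from the question):

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical API that expects plain char *. */
static size_t text_length(const char *s)
{
    return strlen(s);
}

int main(void)
{
    int8_t raw[4] = { 'a', 'b', 'c', '\0' };   /* byte buffer held as int8_t */

    /* Passing raw directly triggers -Wpointer-sign; the cast makes the intent explicit. */
    size_t n = text_length((const char *)raw);

    printf("%zu\n", n);
    return 0;
}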
First of all, why are you trying to convert an int8_t * into a char *? The warning is there for your code's safety. It's not because you're using aliases, but because you're implicitly converting between a pointer to a signed integer type and a pointer to plain char, whose signedness is implementation-defined.
In practice, you should use whatever your API expects. And if you're not using any other library or the like, use whatever suits you best. If you want to be specific about the size, use the stdint.h types; if you want to be vague about the sizes (e.g. you want the second-biggest type), use the normal C keywords. Beware that the "second biggest" type may be the same as the biggest or the third biggest.

Why do I get this error passing an array element to crypt()?

I got two error messages for the code. I tried to figure it out myself by searching online, but it didn't help. The messages are like this:
error: incompatible integer to pointer conversion passing 'char' to parameter of type 'const char *'; take the address with & [-Werror,-Wint-conversion]
strcpy(genHash,crypt( letters[i], "abc"));
The other one is the same message but for passW[0]. I just want to understand what happened. I would appreciate any help. Also, can anyone recommend good reading material about char arrays and pointers to char arrays? Thanks.
When you call a function and pass letters[i] or passW[0], you are passing a single character. But it sounds like these functions expect entire strings, that is, arrays of characters.
You might be able to get away with passing letters instead of letters[i], and passW instead of passW[0]. (But it's hard to be sure, because you haven't shown us your code.)
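A minimal sketch of that fix, assuming letters is a whole string rather than a table of single characters; crypt() is declared in <unistd.h> (or <crypt.h> on some systems) and usually needs linking with -lcrypt:

#define _XOPEN_SOURCE 700   /* expose crypt() in <unistd.h> on glibc */
#include <unistd.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    char letters[] = "secret";             /* hypothetical key string */
    char genHash[64];
    char *hashed = crypt(letters, "ab");   /* pass the whole array, not letters[i]; "ab" is the salt */

    if (hashed != NULL) {
        strcpy(genHash, hashed);
        printf("%s\n", genHash);
    }
    return 0;
}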
If you want to learn about char arrays and char array pointers, then you should grab a copy of 'The C Programming Language' by Kernighan and Ritchie. It has a whole chapter on them. It's also a spectacular reference for C in general.
char variables are small (1-byte) integers in C/C++; whether they are signed or unsigned is implementation-defined, so their range is either -128 to 127 or 0 to 255. When you use fputs() or other print functions in C/C++, these integer values are rendered as characters according to the character set in use.
Have you checked the type that crypt() returns? If you're using a genuine GNU compiler you can use the typeof extension to check it, or you could look for the declaration of crypt() in your header files. It's possible that it isn't the input to crypt() that's causing the problem; it could be nesting crypt() inside strcpy() without an appropriate cast.

sprintf corrupting arrays in IAR microcontroller

I am currently learning embedded programming, and thus working on an IAR-platform using a TI microcontroller with ARM architecture. Since I am not at all familiar with the technicalities related to this kind of programming, or C programming in general, I would like to ask a basic question:
I have the following simple code snippet:
int i;

for (i = 0; i < NUM_SAMPLES; i++)
{
    sinTable[i] = sinf(2*i*dT*PI);
}

for (i = 0; i < NUM_SAMPLES; i++)
{
    char out[32];
    sprintf(out, "sin: %.7f, %.7f;", i*dT, sinTable[i]);
    putString(out);
    delay(DELAY_100US);
}
where sinTable[] is a global array of size NUM_SAMPLES, putString(char *) is a function which writes to an RS-232 port, and delay(float) is a simple delay function.
My problem is that once the sprintf(...) is called, it corrupts sinTable, giving some very peculiar results when plotting the table on the receiver end of the COM-signal.
I don't expect that I'm running out of memory, as the microcontroller has 64 KB of SRAM.
Does anyone have any thoughts?
Ensure that your stack pointer is on a 64-bit boundary when main is reached.
The symptom you are seeing is typical of a stack aligned on an odd 32-bit boundary. Everything seems to work properly until a double is used as a variadic argument; this breaks when the code expects such arguments to be on 8-byte boundaries.
Upon further review:
I suspect that Michael Burr's response regarding stack utilization was on the right track. Selecting a smaller printf library might be sufficient, but if you can increase your stack size, that seems safer. Note that the IAR C/C++ Development Guide includes info on linker stack usage analysis.
Original:
When I upgraded from IAR 6.1 (licensed) to 6.4 (kickstart), I ran into a similar problem - vsnprintf was writing "all over RAM", even though the return value indicated the number of characters written was well within the target bounds. The "solution" was to avoid the printf library that has multibyte support.
Project Options > General > Library Options > printf small w/o multi-byte
might want to also uncheck
Project Options > C/C++ Compiler / Language 2 / enable multibyte
I tried to report this to IAR, but since my support contract is expired ...
Unfortunately, a similar problem is back with IAR 7.3.4, and the multibyte "fix" does NOT seem to be sufficient. Happens with both sprintf() and snprintf(), although the out-of-bounds corruption is not identical between those 2.
You seem pretty confident that your result string is only 31 characters long. The only way your sprintf statement can corrupt another variable is if the result string is longer than the 32-byte buffer (31 characters plus the nul terminator), thereby overwriting other parts of memory. Make your numbers smaller or make your temporary buffer larger.
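A hedged sketch of a safer variant using snprintf, so an over-long result is truncated instead of overrunning the buffer (the function name, buffer size and field widths are just examples):

#include <stdio.h>

void format_sample(char *out, size_t outsz, float t, float value)
{
    /* snprintf never writes more than outsz bytes, terminator included. */
    int needed = snprintf(out, outsz, "sin: %.7f, %.7f;", t, value);

    if (needed < 0 || (size_t)needed >= outsz) {
        /* The full text did not fit; report or handle the truncation here. */
    }
}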
Thank you to everyone who suggested a solution to this problem. I eventually ended up writing a conversion routine that gives a hex representation of the string and transmitted that instead, omitting sprintf(...) completely.
It is very crude, but it suits my needs.

Why can't I declare unsigned char* test = "Some text"?

This isn't working in Visual Studio 2010; it gives me an error on the following code:
void main (void)
{
    unsigned char* test = "ATGST";
}
Edit 1: My question is why this works on embedded systems but doesn't work on a PC.
But when I change it to:
char* test = "ATGST";
it works.
The main thing is that I write code for embedded systems using C, and I use Visual Studio to test some functions so I don't have to test them in real time on a microcontroller.
I need an explanation, because the microcontroller's compiler accepts the first code.
Edited to conform to the removal of the C++ tag and to appease the embedded tag.
First, the problem at hand: you are trying to assign a char[] string literal to an unsigned char*. You can't really equate char with either unsigned char or signed char; it is a bit special in that regard. Also, a string literal is given unique storage and should never be modified. If you're dealing with characters, you should use a plain char*, which a char[] can decay into. You could forcefully cast it, but I don't like to recommend such things. It is safe to do, as one of the comments pointed out; it is actually one of the rare cases that are a real safety no-brainer.
But there is too little space in a short answer to fully qualify such reinterpreting casts, which are basically telling the compiler that you know what you're doing. That is potentially dangerous and should only be done when you're quite sure about the problem at hand. Plain char is just generic, neither explicitly signed nor unsigned. Since unsigned char has a bigger positive range than signed char, and character data usually lives in the positive subset common to both (as does any other data that fits there), if your data stays in that range you're good to go. But do conform to the environment and code safely.
On the entry point function - conforming edit
Since it has been established that you work on an embedded system, your program is very likely not required to return anything, so it can remain void main() (it could also be that the given embedded system specifies a very different return; the OP knows the most about the requirements his system imposes). In a lot of cases, the reason you can stay with void is that there is no environment/OS to appease, nobody to communicate with. But embedded systems can also be quite specialized, and it is best to study the given platform in detail in order to satisfy the requirements imposed (if any).
For one, you need a const in there. And secondly, char != unsigned char, and also (uniquely) != signed char.
String literals are of type const char[N] in C++ (in C their type is char[N], but they are still effectively read-only) for an appropriate size N, and therefore convert naturally to a const char*. Note that the language has a special rule allowing you to use them without the const, but it's still UB to modify a string literal, making it a terribly bad idea to do so.
The micro-controller's C implementation is non-conforming in this regard. It would be better to simply use const char*, as is correct, rather than try to hack VS into accepting incorrect code.
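A small sketch of that advice (the explicit cast variant is only for the case where an API genuinely wants unsigned bytes):

#include <stdio.h>

int main(void)
{
    /* Preferred: a string literal is text, so point at it with const char *. */
    const char *test = "ATGST";

    /* If an API really needs unsigned bytes, convert explicitly at the call site. */
    const unsigned char *bytes = (const unsigned char *)test;

    printf("%s %u\n", test, (unsigned)bytes[0]);
    return 0;
}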
I believe this is a case of assigning a string to an unsigned char *.
When you assign a string literal, the characters' values (e.g. their ASCII codes) are what get stored, so you should use char * in place of unsigned char *.
If you want to store values other than strings or characters, then your declaration is fine.
Hope it helps.
If you are using character types for text, use the unqualified char.
If you are using character types as numbers, use unsigned char, which gives you at least the 0 to 255 range.
For more information: What is an unsigned char?
My question is why this works on Embedded systems, but doesn't work on PC?
Most likely because you are accidentally compiling the PC code as C++, which has stricter type checking than C. In C it matters far less whether you use unsigned char or plain char; the code will still compile, at worst with a warning.
However, there are some issues with your code that should be fixed, as suggested in other answers. If the code needs to run on embedded and Windows both, you should rewrite it as:
#ifdef _WIN32
int main (void)
#else
void main (void)
#endif
{
    const unsigned char* test = "ATGST";
}

Why can't a C constant be stored in a short type?

As the title implies, I don't understand why it is like that.
The Code:
#include <stdio.h>

#define try 32

int main(void)
{
    printf("%ld\n", sizeof try);
    return 0;
}
The Question:
1.) When I use the sizeof operator to get the size of the storage the constant try occupies, I get 4 bytes, which is 32 bits.
2.) Why doesn't C store it in a 16-bit short, since that is large enough to hold it?
3.) Is there any way to make a constant be stored in a short type?
Thank you for reading my question; much appreciated.
You misunderstand the C preprocessor. The C preprocessor simply performs textual substitution - it has no knowledge of types. The '32' will get interpreted according to the language rules applying to the context where it gets inserted - typically, as an int.
To make your defined value always be seen as a short, you could do:
#define try ((short)32)
See also: How do I write a short literal in C++?
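A tiny sketch of the difference (the macro names are made up, and the printed sizes assume a typical platform where int is 4 bytes and short is 2):

#include <stdio.h>

#define plain_try 32            /* expands to an int constant */
#define short_try ((short)32)   /* expands to an expression of type short */

int main(void)
{
    printf("%zu %zu\n", sizeof plain_try, sizeof short_try);   /* typically prints "4 2" */
    return 0;
}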
Declared constants are probably what you are looking for:
const short try = 32;
This is typically preferred because, as far as I know, the compiler will not (without optimizations, or maybe even with them) try to squeeze a value into the smallest number of addressable bytes on its own.
What you call a constant definition is in fact a macro definition that says the label try will be replaced by 32 at preprocessing time.
So the compiler sees sizeof(32), and the unsuffixed constant 32 has type int by default, whose size is system-dependent.
To make a short constant, you will either have to cast the value in the macro definition or declare it as a const variable, like so:
const short try = 32;
I will take a crack at this, but I'm not entirely sure if I'm correct.
Since #define just tells the preprocessor to replace try with your number, it acts as if you had written the value literally. On your specific machine, that value occupies 4 bytes. The only "solution" to that I can think of is to use a machine whose architecture uses a default 16-bit size for an int.
Even if it were a short, it might still occupy 4 bytes in memory because of alignment, but it really depends.
