Converting int64_t to string in C - c

I am working with C programming language and using Postgres database to insert records. One of the columns of the table in the database is of type timestamp. The data I am getting in the C program is of type int64_t and I need to pass this value as a parameter in the INSERT statement like shown below. In order to do that, I tried to convert the int64_t value to a string using sprintf().
char str[21];
sprintf(str, "%" PRId64 "\n", someTimeReceived inTheProgram);
const char * const paramValues[1] = { str };
PGresult *res = res = PQexecParams(
conn,
"INSERT INTO dummy VALUES(1,to_timestamp($1)::timestamp)",
1, /* one parameter */
NULL,
paramValues,
NULL, /* parameter lengths are not required for strings */
NULL, /* all parameters are in text format */
0 /* result shall be in text format */
);
But on using sprintf(), some garbage value are being inserted at the end of string. For example if the received int64_t value is 132408394771256230 then sprintf() is converting it to 132408394771256230\n\0\003. Because of the extra garbage values appended at the end of string, the program fails to insert the given value and then terminates.

if the received int64_t value is 132408394771256230 then sprintf() is converting it to 132408394771256230\n\0\003
no, sprintf only writes 132408394771256230\n\0 into str and let the rest of the array unchanged, the character of code 3 was present before the call of sprintf (the array is not initialized) but of course it can be anything else.
Because of the extra garbage values appended at the end of string, the program fails to insert the given value and then terminates.
no again, PQexecParams manages params as an array of strings (one string in your case) following the convention a string is ended by a null character, PQexecParams does not read/care about what there is after the null character, and does know even about the 'envelop' of your string (an array of 21 characters in your case).
Your problem probably comes because of the undesirable newline in your string, producing the command to be the string
INSERT INTO dummy VALUES(1,to_timestamp(132408394771256230\n)::timestamp)
just remove \n in the format of your sprintf.
Supposing it is needed to have the newline your array must be sized 22 rather than 21 to manage negative value without an undefined behavior writing out of the array.
...some garbage value
you certainly see the null character and the byte after because you use a debugger and that debugger knows str is sized 21 and when you ask to see the value of str the debugger consider str as an array of characters rather than a string. If you ask your debugger to show (char*) str it will consider str as a string and will not show the null character nor the character after.
[edit]
The function request the date-time as a string yyyy-mm-dd hh:mm:ss rather than your big number being an ua_datetime
To convert an ua_datetime to a the expected string use (see documentation) :
nt ua_datetime_snprintf (char * dst, size_t size, const ua_datetime * dt)
the result is a string YYYY-MM-DDThh:mm:ss.nnnZ so after just replace the last dot by a null character and 'T' by a space (supposing a 'T' is really produced, try and adapt)
An other way is to use :
time_t ua_datetime_to_time_t(const ua_datetime * dt)
then to make your string with standard function from the time_t
Anyway you have to understand you loose a lot of precision, ua_datetime are in 100 nanosecond while *postgress timestamp are in second only

Related

Why the strlen() function doesn't return the correct length for a hex string?

I have a hex string for example \xF5\x17\x30\x91\x00\xA1\xC9\x00\xDF\xFF, when trying to use strlen() function to get the length of that hex string it returns 4!
const char string_[] = { "\xF5\x17\x30\x91\x00\xA1\xC9\x00\xDF\xFF" };
unsigned int string_length = strlen(string_);
printf("%d", string_length); // the result: 4
Is the strlen() function dealing with that hex as a string, or is something unclear to me?
For string functions in the C standard library, a character with value zero, also called a null character, marks the end of a string. Your string contains \x00, which designates a null character, so the string ends there. There are four non-null characters before it, so strlen returns four.
C 2018 7.1.1 1 says:
A string is a contiguous sequence of characters terminated by and including the first null character… The length of a string is the number of bytes preceding the null character…
C 2018 7.24.6.3 2 says:
The strlen function computes the length of the string pointed to by s [its first argument].
You could compute the size of your array as sizeof string_ (because it is an array of char) or sizeof string_ / sizeof *string_ (to compute the number of elements regardless of type), but this will include a terminating null character because defining an array with [] and letting the length be computed from a string literal initializer includes the terminating null character of the string literal. You may need to hard-code the length of the array, possibly using #define to define a preprocessor macro, and use that length in the array definition and in other places where the length is needed.
It is because you have zero at index [4]
string_[0] == 0xF5
string_[1] == 0x17
string_[2] == 0x30
string_[3] == 0x91
string_[4] == 0
...
"\xf5" puts char having integer value 0xf5 at position [0]
To see it as a string you need to escape the \ character
const char string_[] = "\\xF5\\x17\\x30\\x91\\x00\\xA1\\xC9\\x00\\xDF\\xFF";
At compile time, your "string" appears as consecutive hex values expressed in C syntax inside a pair of quotation marks.
strlen() is a run time function that scans through a series of bytes, looking for the first instance of a zero-value byte.
It's good to understand the difference between "compile time" and "run time".

char array in a struct data type

I actually have a question regarding the concept of a char array, especially the one which is declared and initialized like below.
char aString[10] = "";
What i was taught was that this array can store up to 10 characters (index 0-9) and that at index 10 there is an automatically placed null terminating character (i know that accessing it would not be right) such that if we use string handling functions (printf, scanf, strcmp, etc.) they would know when the string stops.
However when I tried making a struct data type like below,
typedef struct customer{
char accountNum[10];
char name[100];
char idNum[15];
char address[200];
char dateOfBirth[10];
unsigned long long int balance;
char dateOpening[10];
}CUSTOMER;
inserted 10 characters into accountNum (any method, e.g. scanf), and printf it, what is printed out will be accountNum and values in the first word of name (i know that printf will stop at a space or a '\0'). This indicates that a char array does not have a terminating null at the end of the array.
Does this mean that if we have a char array of size 10 (char aString[10]), its maximum number of char it can store is 9 characters? or does things work differently in a struct? It would be nice if someone can help me the concept because it seems like i may have been working with undefined behaviour this whole time.
char aString[10] = "";
What i was taught was that this array can store up to 10 characters (index 0-9)
Yes.
and that at index 10 there is an automatically placed null terminating character
That is wrong. For one thing, index 10 would be out of bounds of the array. The compiler will certainly not initialize data outside of the memory it has reserved for the array.
What actually happens is that the compiler will copy the entire string literal including the null-terminator into the array, and if there are any remaining elements then they will be set to zeros. If the string literal is longer than the array can hold, the compile will simply fail.
In your example, the string literal has a length of 1 char (the null terminator), so the entire array ends up initialized with zeros.
i know that accessing it would not be right
There is no problem with accessing the null terminator, as long as it is inside the bounds of the array.
such that if we use string handling functions (printf, scanf, strcmp, etc.) they would know when the string stops.
Yes, they expect C-style strings and so will look for a null terminator - unless they are explicitly told the actual string length, ie by using a precision modifier for %s, or using strncmp(), etc.
However when I tried making a struct data type like below,
<snip>
inserted 10 characters into accountNum (any method, e.g. scanf), and printf it, what is printed out will be accountNum and values in the first word of name
That means you either forgot to null-terminate accountNum, or you likely overflowed it by writing too many characters into it. For instance, that is very easy to do when misusing scanf(), strcpy(), etc.
i know that printf will stop at a space or a '\0'
printf() does not stop on a space, only on a null terminator. Unless you tell it the max length explicitly, eg:
CUSTOMER c;
strncpy(c.accountNum, "1234567890", 10); // <-- will not be null terminated!
printf("%.10s", c.accountNum); // <-- stops after printing 10 chars!
If it has not encountered a null terminator by the time it reaches the 10th character, it will stop itself.
This indicates that a char array does not have a terminating null at the end of the array.
An array is just an array, there is no terminator, only a size. If you want to treat a character array as a C-style string, then you are responsible for making sure the array contains a nul character in it. But that is just semantics of the character data, the compiler will not do anything to ensure that behavior for you (except for in the one case of initializing a character array with a string literal).
Does this mean that if we have a char array of size 10 (char aString[10]), its maximum number of char it can store is 9 characters?
Its maximum storage will always be 10 chars, period. But if you want to treat the array as a C-style string, then one of those chars must be a nul.
or does things work differently in a struct?
No. Where an array is used does not matter. The compiler treats all array the same, regardless of context (except for the one special case of initializing a character array with a string literal).
What i was taught was that this array can store up to 10 characters (index 0-9) and that at index 10 there is an automatically placed null terminating character (i know that accessing it would not be right) such that if we use string handling functions (printf, scanf, strcmp, etc.) they would know when the string stops.
Yes, but accessing the null terminating character is absolutely safe.
inserted 10 characters into accountNum (any method, e.g. scanf), and printf it, what is printed out will be accountNum and values in the first word of name (i know that printf will stop at a space or a '\0'). This indicates that a char array does not have a terminating null at the end of the array.
printf does not stop for a space, only for a null terminating character. In this case, printf will print all characters until it sees '\0'.
Does this mean that if we have a char array of size 10 (char aString[10]), its maximum number of char it can store is 9 characters?
Yes.
or does things work differently in a struct?
There is no difference.

what are non null terminated string?

The "sz" part of the prefix is important, because some strings in the Windows world (especially when talking about the DDK) are not zero-terminated.reading this in STR,LPSTR section
can anyone tell me what are those non null terminated string?
In computer science, a string is a sequence of characters. A sequence has some length—there are some number of characters in it. To work with a string, one generally has to know the length of the string.
The length may be indicated in various ways. One way is to indicate the end of the sequence with a sentinel value, which is simply a chosen value that is not used in the sequence. With character strings, it is common to use zero as a sentinel: The string continues from its start until a zero character is found. When using a sentinel, the sentinel value cannot appear inside the string, since it marks the end.
Another way to indicate the length is to keep it separately from the string. For example, the length is passed to the C memcmp routine as a separate parameter. This allows memcmp to compare arbitrary sequences of bytes in memory, including sequences that contain zero bytes.
Sometimes the length is treated as part of the data structure for the string. It might be in the first byte or first several bytes of the string. So software using the string would get the length by reading the first byte, and the bytes after that would contain the characters of the string.
Another method, related to the sentinel method, is to use delimiters. For example, we commonly write strings such as "abc" in source code, text, and in shell commands. The quote marks are delimiters that mark the beginnings and ends of strings. Various methods are used to allow the delimiters themselves to be characters in the strings, such as “quoting” the delimiters with other special characters, as in: "This is a quote mark: \".".
In summary, the concept of a string that is not null-terminated is broad and open: Any method of indicating the length of a string other than marking the end with a null character is a string that is not null-terminated.
In windows kernel programming, the most often used string type is UNICODE_STRING, a non-null terminated string type:
typedef struct _UNICODE_STRING {
USHORT Length;
USHORT MaximumLength;
PWSTR Buffer;
} UNICODE_STRING
The purpose of this data structure is to efficiently processing string along the
stack drivers. Each driver in the stack may append text or modify the string in the range of "MaximumLenth" without allocating a new buffer.
For example, below is a typical unicode string stored in a continuous 64 bytes buffer:
address + 0 : 22 (Length)
address + 4 : 48 (MaximumLength)
address + 8 : buffer + 16 (Buffer)
address + 16: "Hello World" (UTF16 string, may without null terminated)
The standard string manipulating function can not use on UNICODE_STRING instead you should use the Rtl*UnicodeString() functions.
It would be easier to answer when we can use non null terminated strings.
Some API functions take only string pointer (SetWindowText, CreateFile) and strings have to be terminated with null character. Other functions (ExtTextOut, WriteConsole) take pointer and some form of length (usually number of chars, TCHARs or wchar_ts. These strings don't have to be terminated by null character.
// No termination NUL charcter bellow.
TCHAR hello[] = { 'H','E','L','L','O' };
ExtTextOut( hdc, 100, 100, 0, hello, 5, 0 );
TCHAR hello2[] = _T("HELLO WORLD!");
ExtTextOut( hdc, 100, 100, 0, hello2, 5, 0 );
In second ExtTextOut we don't have to artificially cut hello2 string (or copy it to temporary buffer). This function allows to use parts of string without null termination requirements.

Doesn't %[] or %[^] specifier in scanf(),sscanf() or fscanf() store the input in null-terminated character array?

Here's what the Beez C guide (LINK) tells about the %[] format specifier:
It allows you to specify a set of characters to be stored away (likely in an array of chars). Conversion stops when a character that is not in the set is matched.
I would appreciate if you can clarify some basic questions that arise from this premise:
1) Are the input fetched by those two format specifiers stored in the arguments(of type char*) as a character array or a character array with a \0 terminating character (string)? If not a string, how to make it store as a string , in cases like the program below where we want to fetch a sequence of characters as a string and stop when a particular character (in the negated character set) is encountered?
2) My program seems to suggest that processing stops for the %[^|] specifier when the negated character | is encountered.But when it starts again for the next format specifier,does it start from the negated character where it had stopped earlier?In my program I intend to ignore the | hence I used %*c.But I tested and found that if I use %c and an additional argument of type char,then the character | is indeed stored in that argument.
3) And lastly but crucially for me,what is the difference between passing a character array for a %s format specifier in printf() and a string(NULL terminated character array)?In my other program titled character array vs string,I've passed a character array(not NULL terminated) for a %s format specifier in printf() and it gets printed just as a string would.What is the difference?
//Program to illustrate %[^] specifier
#include<stdio.h>
int main()
{
char *ptr="fruit|apple|lemon",type[10],fruit1[10],fruit2[10];
sscanf(ptr, "%[^|]%*c%[^|]%*c%s", type,fruit1, fruit2);
printf("%s,%s,%s",type,fruit1,fruit2);
}
//character array vs string
#include<stdio.h>
int main()
{
char test[10]={'J','O','N'};
printf("%s",test);
}
Output JON
//Using %c instead of %*c
#include<stdio.h>
int main()
{
char *ptr="fruit|apple|lemon",type[10],fruit1[10],fruit2[10],char_var;
sscanf(ptr, "%[^|]%c%[^|]%*c%s", type,&char_var,fruit1, fruit2);
printf("%s,%s,%s,and the character is %c",type,fruit1,fruit2,char_var);
}
Output fruit,apple,lemon,and the character is |
It is null terminated. From sscanf():
The conversion specifiers s and [ always store the null terminator in addition to the matched characters. The size of the destination array must be at least one greater than the specified field width.
The excluded characters are unconsumed by the scan set and remain to be processed. An alternative format specifier:
if (sscanf(ptr, "%9[^|]|%9[^|]|%9s", type,fruit1, fruit2) == 3)
The array is actually null terminated as remaining elements will be zero initialized:
char test[10]={'J','O','N' /*,0,0,0,0,0,0,0*/ };
If it was not null terminated then it would keep printing until a null character was found somewhere in memory, possibly overruning the end of the array causing undefined behaviour. It is possible to print a non-null terminated array:
char buf[] = { 'a', 'b', 'c' };
printf("%.*s", 3, buf);
1) Are the input fetched by those two format specifiers stored in the
arguments(of type char*) as a character array or a character array
with a \0 terminating character (string)? If not a string, how to make
it store as a string , in cases like the program below where we want
to fetch a sequence of characters as a string and stop when a
particular character (in the negated character set) is encountered?
They're stored in ASCIIZ format - with a NUL/'\0' terminator.
2) My program seems to suggest that processing stops for the %[^|]
specifier when the negated character | is encountered.But when it
starts again for the next format specifier,does it start from the
negated character where it had stopped earlier?In my program I intend
to ignore the | hence I used %*c.But I tested and found that if I use
%c and an additional argument of type char,then the character | is
indeed stored in that argument.
It shouldn't consume the next character. Show us your code or it didn't happen ;-P.
3) And lastly but crucially for me,what is the difference between
passing a character array for a %s format specifier in printf() and a
string(NULL terminated character array)?In my other program titled
character array vs string,I've passed a character array(not NULL
terminated) for a %s format specifier in printf() and it gets printed
just as a string would.What is the difference?
(edit: the following addresses the question above, which talks about array behaviours generally and is broader than the code snippet in the question that specifically posed the case char[10] = "abcd"; and is safe)
%s must be passed a pointer to a ASCIIZ text... even if that text is explicitly in a char array, it's the mandatory presence of the NUL terminator that defines the textual content and not the array length. You must NUL terminate your character array or you have undefined behaviour. You might get away with it sometimes - e.g. strncpy into the array will NUL terminate it if-and-only-if there's room to do so, and static arrays start with all-0 content so if you only overwrite before the final character you'll have a NUL, your char[10] example happens to have elements for which values aren't specified populated with NULs, but you should generally take responsibility for ensuring that something is ensuring NUL termination.

C Sprintf() appends junk characters

I tried to use sprintf to append a int, string and an int.
sprintf(str,"%d:%s:%d",count-1,temp_str,start_id);
Here, the value of start_id is always the same. The value of temp_str which is a char * increases every time. I get correct output for some time and then my sprintf starts printing junk characters between temp_str and startid. So my str get corrupted.
Can anyone explain this behavior ?
example
at count 11
11:1:2:3:1:2:3:1:2:3:1:21:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2
at count 8
8:1:2:3:1:2:3:1:2:3:1:21:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1:2:3:1�:2
I don't understand why and how "�" is appended to the string
Either temp_str is not null-terminated at some point or you've blown the buffer for str and some other memory access is affecting it.
Without seeing the code, it's a little hard to tell but, if you double the size of str and the problem behaviour changes, then it's probably the latter.
1> try to memset your str buffer with 0 befor using sprintf
2> The value of temp_str which is a char * increases every time
what do u mean by this ?
this should be normal charachter pointer which will point some string and that string should be null terminated and tha will be copied to str
3> the total size by combing all three argument should not be exceed the size of str buffer
It looks like the string temp_str isn't NUL-terminated. You can either terminate it before the call to sprintf, or if you know the length you want to print, use the %.*s formatting operator like this:
int str_len = ...; // Calculate length of temp_str
sprintf(str, "%d:%.*s:%d", count-1, str_len, temp_str, start_id);
You are running off the end of temp_str. Check your bounds and make sure it's null terminated. Stop incrementing sun you get to the end.

Resources