Store in array with some spaces - c

I have a problem using memcpy().
I have an array of 36 bytes. The first 20 should be filled with the mobile number and the other 16 with the voucher number. If the mobile number is shorter than 20 characters, the rest should be filled with spaces. But when I fill in the voucher number, it overwrites the first value. Below is my code.
char tempMobileNo[20],tempVoucherNo[16],o2RecordData[50];
memset(tempMobileNo,' ',20);
memset(tempVoucherNo,' ',16);
memset(o2RecordData,' ',RECORD_DATA_L);
memcpy(tempMobileNo,ValueB,20);
memcpy(tempVoucherNo,ValueC,16);
memcpy(&o2RecordData[0],tempMobileNo,20);
memcpy(&o2RecordData[22],tempVoucherNo,16);

The problem
memcpy always copies exactly the number of bytes you specify. It neither knows nor cares whether the "contents" of a buffer end earlier, so it will not stop copying because of that.
You first fill your buffers with spaces, but then unconditionally copy the full length into each buffer in (A) and (B), so your spaces are overwritten by whatever 20 and 16 bytes, respectively, are found at ValueB and ValueC.
memcpy(tempMobileNo, ValueB, 20); // (A)
memcpy(tempVoucherNo, ValueC, 16); // (B)
Thoughts
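If you are dealing with C-style strings (i.e. null-terminated strings), consider using strncpy instead of memcpy.
strncpy(dst, src, n) copies at most n characters and stops reading src at its terminating null byte. Note that if src is shorter than n, the rest of dst is filled with null bytes rather than spaces, and if src is n characters or longer, dst is not null-terminated.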
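A minimal sketch of one way to get space-padded fixed-width fields, assuming the inputs are null-terminated strings. The names put_field, build_record and the *_LEN macros are hypothetical and not from the question; the idea is simply to copy only the real contents and pad the remainder.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MOBILE_LEN  20
#define VOUCHER_LEN 16
#define RECORD_LEN  (MOBILE_LEN + VOUCHER_LEN)

/* Copy at most `width` bytes of the C string `src` into `dst` and pad
 * the rest of the field with spaces. No null terminator is written. */
void put_field(char *dst, const char *src, size_t width)
{
    assert(dst != NULL && src != NULL);
    size_t n = strlen(src);
    if (n > width)
        n = width;                    /* truncate over-long input */
    memcpy(dst, src, n);              /* copy only the real contents */
    memset(dst + n, ' ', width - n);  /* pad the remainder with spaces */
}

/* Build the 36-byte record: mobile number first, voucher number second. */
void build_record(char record[RECORD_LEN],
                  const char *mobile, const char *voucher)
{
    put_field(record, mobile, MOBILE_LEN);
    put_field(record + MOBILE_LEN, voucher, VOUCHER_LEN);
}
```

The voucher field starts at offset 20, directly after the mobile field, so the two fields cannot overlap. The record itself is not null-terminated, so it should be written out by length rather than printed with %s.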

Note: this post was created prior to OP editing his question, it's no longer of relevance.
memcpy(&o2RecordData[22],tempVoucherNo,22);
should be
memcpy(&o2RecordData[20],tempVoucherNo,16);

Related

How to determine the size of a string in C, or at least ensuring that it doesn't exceed a maximum number of bytes?

Is it possible to determine the size in bytes of a string in C?
I'm trying to ensure that JSON strings built in C do not exceed a 1 MB size limit before passing them to the requesting application. I don't know the strings at compile time.
I've read that it is just strlen * sizeof( char ); but I don't understand that, because I read elsewhere that UTF-8 can have characters of size up to four bytes and sizeof( char ) is always one.
I am likely misunderstanding something basic.
If a character array is allocated as char JSON[1048576], does this allocate that many characters or that many bytes? If it is bytes, then as long as something like snprintf is used when writing to the JSON array, would this guarantee that it can never exceed 1 MB in size, even if there were characters in that array that take more than one byte?
Thank you.
Since you are after a size limit of 1 MB and not a string length limit per se, you can just use strlen(json_str), which returns the number of bytes before the terminating '\0'. This works provided that your JSON string is null-terminated.
If you allocate char JSON[1048576], that gives you an array of that many bytes. snprintf(JSON, 1048576, "<json string>", ...) then guarantees that you never write past the end of the array.
It does not guarantee, however, that your string is valid UTF-8, since the last character may be a multi-byte character that is split in the middle.
A C char is not the same as a UTF-8 character. In C, a char is by definition 1 byte, but in UTF-8 the visual character that you want, like the heart in your comment, may be represented by several bytes of data.
One byte gives you 256 different values, and since there are far more than 256 Unicode "characters", more than one byte is needed to encode many of them. The designers of UTF-8 were clever, though: the first 128 code points (0 to 127) are encoded using just one byte each, so text that uses only those characters is both valid UTF-8 and valid ASCII.
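As a rough sketch of the point about split characters, the code below checks the return value of snprintf to detect truncation and, if the output was cut, trims the buffer back to the last complete UTF-8 character. The functions utf8_trim_len and format_json are hypothetical names, the input is assumed to be valid UTF-8, and no JSON escaping is attempted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Length in bytes of the longest prefix of s[0..len) that does not end
 * in the middle of a UTF-8 sequence (assumes valid UTF-8 was cut short). */
size_t utf8_trim_len(const char *s, size_t len)
{
    assert(s != NULL);
    size_t i = len;
    /* Step back over continuation bytes, which look like 10xxxxxx. */
    while (i > 0 && ((unsigned char)s[i - 1] & 0xC0) == 0x80)
        i--;
    if (i == 0)
        return 0;
    unsigned char lead = (unsigned char)s[i - 1];
    size_t need = 1;                           /* ASCII byte */
    if ((lead & 0xE0) == 0xC0) need = 2;       /* 110xxxxx */
    else if ((lead & 0xF0) == 0xE0) need = 3;  /* 1110xxxx */
    else if ((lead & 0xF8) == 0xF0) need = 4;  /* 11110xxx */
    size_t have = len - (i - 1);   /* bytes present for the last character */
    return (have >= need) ? len : i - 1;
}

/* Returns 1 if everything fit, 0 if the output was truncated (and
 * trimmed to whole characters), -1 on an encoding error. */
int format_json(char *buf, size_t cap, const char *name)
{
    assert(buf != NULL && name != NULL && cap > 0);
    int n = snprintf(buf, cap, "{\"name\":\"%s\"}", name);
    if (n < 0)
        return -1;
    if ((size_t)n < cap)
        return 1;                  /* complete, size is n bytes */
    buf[utf8_trim_len(buf, cap - 1)] = '\0';
    return 0;
}
```

A truncated JSON document is not usable anyway, so in practice the return value matters most: a result of 0 means the 1 MB limit (or whatever cap is passed) would have been exceeded.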

Last value of char array unknown - C

I'm making a simple program in C which checks the length of a char array and, if it's less than 8, fills a new array with zeroes and adds it to the former array. Here comes the problem: I don't know why the last values printed are garbage characters.
char* hexadecimalno = decToHex(decimal,hexadecimal);
printf("Hexadecimal: %s\n", hexadecimalno);
char zeroes [8 - strlen(hexadecimalno)];
if(strlen(hexadecimalno) < 8){
    for(i = 0; i < (8-strlen(hexadecimalno)); i++){
        zeroes[i]='0';
    }
}
printf("zeroes: %s\n",zeroes);
strcat(zeroes,hexadecimalno);
printf("zeroes: %s\n",zeroes);
In C, strings (which are, as you are aware, arrays of characters) do not have any special metadata that tells you their length. Instead, the convention is that the string stops at the first character whose char value is 0. This is called "null-termination". The way your code is initializing zeroes does not put any null character at the end of the array. (Do not confuse the '0' characters you are putting in with NUL characters -- they have char value 48, not 0.)
All of the string manipulation functions assume this convention, so when you call strcat, it is looking for that 0 character to decide the point at which to start adding the hexadecimal values.
C also does not automatically allocate memory for you. It assumes you know exactly what you are doing. Your code uses a C99 feature (a variable-length array) to create an array zeroes with exactly as many elements as the number of '0' characters you need to prepend. You aren't allocating an extra byte for a terminating NUL character, and strcat is also going to assume that you have allocated space for the contents of hexadecimalno, which you have not. In C, this does not trigger a bounds check error. It just writes over memory that you shouldn't write over. So you need to be very careful that you allocate enough memory, and that you only write to memory you have actually allocated.
In this case, you want hexadecimalno to always be 8 digits long, left-padding it with zeroes. That means you need an array with 8 char values, plus one for the NUL terminator. So, zeroes needs to be a char[9].
After your loop that sets zeroes[i] = '0' for the correct number of zeroes, you need to set the next element to char value 0. The fact that you are zero-padding confuses things, but again, remember that '0' and 0 are two different things.
Provided you allocate enough space (at least 9 characters, assuming that hexadecimalno will never be longer than 8 characters), and then that you null terminate the array when putting the zeroes into it for padding, you should get the expected result.
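A minimal sketch of the fix described above, assuming the hex string is never meant to exceed 8 digits. The function name pad_hex and the macro HEX_WIDTH are hypothetical; the output buffer has 9 bytes so that the terminating NUL fits.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HEX_WIDTH 8

/* Write `hex` into `out`, left-padded with '0' to HEX_WIDTH digits.
 * `out` must have room for HEX_WIDTH + 1 bytes.
 * Returns 0 on success, -1 if `hex` is longer than HEX_WIDTH. */
int pad_hex(char out[HEX_WIDTH + 1], const char *hex)
{
    assert(out != NULL && hex != NULL);
    size_t len = strlen(hex);
    if (len > HEX_WIDTH)
        return -1;
    size_t pad = HEX_WIDTH - len;
    memset(out, '0', pad);            /* the character '0' (value 48) */
    memcpy(out + pad, hex, len + 1);  /* digits plus the terminating 0 */
    return 0;
}
```

Copying len + 1 bytes brings the NUL terminator of hex along, so the result is always a proper string and can be printed with %s.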

Sending † character instead of Space character in Char array

I've migrated my project from XE5 to 10 Seattle. I'm still using ANSI codes to communicate with devices. With my new build, the Seattle IDE is sending the † character instead of the space character (which is #32 in ANSI) in a Char array. I need to send space character data to a text file, but I can't.
I tried #32 (like before I used), #032 and #127 but it doesn't work. Any idea?
Here is how I use:
fillChar(X,50,#32);
Method signature (var X; count:Integer; Value:Ordinal)
Despite its name, FillChar() fills bytes, not characters.
Char is an alias for WideChar (2 bytes) in Delphi 2009+, in prior versions it is an alias for AnsiChar (1 byte) instead.
So, if you have a 50-element array of WideChar elements, the array is 100 bytes in size. When you call fillChar(X,50,#32), it fills in the first 50 bytes with a value of $20 each. Thus the first 25 WideChar elements will have a value of $2020 (aka Unicode codepoint U+2020 DAGGER, †) and the second 25 elements will not have any meaningful value.
This issue is explained in the FillChar() documentation:
Fills contiguous bytes with a specified value.
In Delphi, FillChar fills Count contiguous bytes (referenced by X) with the value specified by Value (Value can be of type Byte or AnsiChar)
Note that if X is a UnicodeString, this may not work as expected, because FillChar expects a byte count, which is not the same as the character count.
In addition, the filling character is a single-byte character. Therefore, when Buf is a UnicodeString, the code FillChar(Buf, Length(Buf), #9); fills Buf with the code point $0909, not $09. In such cases, you should use the StringOfChar routine.
This is also explained in Embarcadero's Unicode Migration Resources white papers, for instance on page 28 of Delphi Unicode Migration for Mere Mortals: Stories and Advice from the Front Lines by Cary Jensen:
Actually, the complexity of this type of code is not related to pointers and buffers per se. The problem is due to Chars being used as pointers. So, now that the size of Strings and Chars in bytes has changed, one of the fundamental assumptions that much of this code embraces is no longer valid: That individual Chars are one byte in length.
Since this type of code is so problematic for Unicode conversion (and maintenance in general), and will require detailed examination, a good argument can be made for refactoring this code where possible. In short, remove the Char types from these operations, and switch to another, more appropriate data type. For example, Olaf Monien wrote, "I wouldn't recommend using byte oriented operations on Char (or String) types. If you need a byte-buffer, then use ‘Byte’ as [the] data type: buffer: array[0..255] of Byte;."
For example, in the past you might have done something like this:
var
Buffer: array[0..255] of AnsiChar;
begin
FillChar(Buffer, Length(Buffer), 0);
If you merely want to convert to Unicode, you might make the following changes:
var
Buffer: array[0..255] of Char;
begin
FillChar(Buffer, Length(buffer) * SizeOf(Char), 0);
On the other hand, a good argument could be made for dropping the use of an array of Char as your buffer, and switch to an array of Byte, as Olaf suggests. This may look like this (which is similar to the first segment, but not identical to the second, due to the size of the buffer):
var
Buffer: array[0..255] of Byte;
begin
FillChar(Buffer, Length(buffer), 0);
Better yet, use this second argument to FillChar which works regardless of the data type of the array:
var
Buffer: array[0..255] of Byte;
begin
FillChar(Buffer, Length(buffer) * SizeOf(Buffer[0]), 0);
The advantage of these last two examples is that you have what you really wanted in the first place, a buffer that can hold byte-sized values. (And Delphi will not try to apply any form of implicit string conversion since it's working with bytes and not code units.) And, if you want to do pointer math, you can use PByte. PByte is a pointer to a Byte.
The one place where changes like these may not be possible is when you are interfacing with an external library that expects a pointer to a character or character array. In those cases, they really are asking for a buffer of characters, and these are normally AnsiChar types.
So, to address your issue, since you are interacting with an external device that expects Ansi data, you need to declare your array as using AnsiChar or Byte elements instead of (Wide)Char elements. Then your original FillChar() call will work correctly again.
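The byte-versus-element mismatch described above is not specific to Delphi. As an illustration only, here is the same effect sketched in C, where memset plays the role of FillChar and uint16_t stands in for a 2-byte WideChar. The names fill_bytes and COUNT are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define COUNT 50   /* number of 16-bit elements, i.e. 100 bytes */

/* Fill the first `nbytes` BYTES of a 16-bit buffer with `value`,
 * the way a byte-oriented fill routine does. */
void fill_bytes(uint16_t *buf, size_t nbytes, unsigned char value)
{
    assert(buf != NULL);
    memset(buf, value, nbytes);
}
```

Calling fill_bytes(buf, COUNT, 0x20) on a zeroed 50-element array touches only 50 of its 100 bytes: the first 25 elements each become 0x2020 (the dagger code point) and the remaining 25 are left unchanged. Passing sizeof of the array as the byte count fills every element, though each one still holds 0x2020 rather than 0x0020, which is why the element type has to change to a 1-byte type for ANSI data.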
If you want to use ANSI for communication with devices, you would define the array as
x: array[1..50] of AnsiChar;
In this case to fill it with space characters you use
FillChar(x, 50, #32);
Using an array of AnsiChar as a communication buffer may become troublesome in a Unicode environment, so I would suggest using a byte array as the communication buffer instead
x: array[1..50] of byte;
and initialize it with
FillChar(x, 50, 32);

Scanning a file and allocating correct space to hold the file

I am currently using fscanf to get space delimited words. I establish a char[] with a fixed size to hold each of the extracted words. How would I create a char[] with the correct number of spaces to hold the correct number of characters from a word?
Thanks.
Edit: If I do a strdup on a char[1000] and the char[1000] actually only holds 3 characters, will the strdup reserve space on the heap for 1000 or 4 (for the terminating char)?
Here is a solution involving only two allocations and no realloc:
Determine the size of the file by seeking to the end and using ftell.
Allocate a block of memory this size and read the whole file into it using fread.
Count the number of words in this block.
Allocate an array of char * able to hold pointers to this many words.
Loop through the block of text again, assigning to each pointer the address of the beginning of a word, and replacing the word delimiter at the end of the word with 0 (the null character).
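The steps above can be sketched roughly as follows. This is one possible reading of the approach, not the answerer's own code: read_file and split_words are hypothetical names, words are assumed to be separated by whitespace, and seeking to the end of a binary stream to get the size is assumed to work on the target platform.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Allocation 1: read the whole file into one null-terminated block.
 * Returns NULL on failure; the caller frees the result. */
char *read_file(const char *path)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return NULL;
    char *text = NULL;
    if (fseek(f, 0, SEEK_END) == 0) {
        long size = ftell(f);
        if (size >= 0 && fseek(f, 0, SEEK_SET) == 0) {
            text = malloc((size_t)size + 1);
            if (text != NULL) {
                size_t got = fread(text, 1, (size_t)size, f);
                text[got] = '\0';
            }
        }
    }
    fclose(f);
    return text;
}

/* Allocation 2: an array of pointers into `text`, one per word.
 * Delimiters in `text` are overwritten with '\0' in place. */
char **split_words(char *text, size_t *count)
{
    assert(text != NULL && count != NULL);
    size_t n = 0;
    int in_word = 0;
    for (const char *p = text; *p; p++) {      /* pass 1: count words */
        int sp = isspace((unsigned char)*p);
        if (!sp && !in_word)
            n++;
        in_word = !sp;
    }
    char **words = malloc((n ? n : 1) * sizeof *words);
    if (words == NULL)
        return NULL;
    size_t i = 0;
    in_word = 0;
    for (char *p = text; *p; p++) {            /* pass 2: record and cut */
        if (isspace((unsigned char)*p)) {
            *p = '\0';                         /* terminate the word */
            in_word = 0;
        } else if (!in_word) {
            words[i++] = p;                    /* start of a new word */
            in_word = 1;
        }
    }
    *count = n;
    return words;
}
```

The pointer array and the text block are the only two allocations, so cleanup is two calls to free, and every word is already a properly terminated string.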
Also, a slightly philosophical matter: If you think this approach of inserting string terminators in-place and breaking up one gigantic string to use it as many small strings is ugly, hackish, etc., then you should probably forget about programming in C and use Python or some other higher-level language. The ability to do radically-more-efficient data manipulation operations like this while minimizing the potential points of failure is pretty much the only reason anyone should be using C for this kind of computation. If you want to go and allocate each word separately, you're just making life a living hell for yourself by doing it in C; other languages will happily hide this inefficiency (and abundance of possible failure points) behind friendly string operators.
There's no one-and-only way. The idea is to just allocate a string large enough to hold the largest possible string. After you've read it, you can then allocate a buffer of exactly the right size and copy it if needed.
In addition, you can also specify a width in your fscanf format string to limit the number of characters read, to ensure your buffer will never overflow.
But if you allocate a buffer of, say, 250 characters, it's hard to imagine a single word not fitting in that buffer.
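A small sketch of that idea, assuming whitespace-delimited words of at most 249 characters: the word is read into a fixed temporary buffer using a width in the format string, then copied into a heap block of exactly the right size. The name read_word and the macro MAX_WORD are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_WORD 249   /* must match the width in the format string */

/* Read the next whitespace-delimited word from `in` and return a heap
 * copy of exactly strlen + 1 bytes. Returns NULL at end of input or on
 * allocation failure. The caller frees the result. */
char *read_word(FILE *in)
{
    assert(in != NULL);
    char tmp[MAX_WORD + 1];
    if (fscanf(in, "%249s", tmp) != 1)   /* width keeps tmp from overflowing */
        return NULL;
    size_t len = strlen(tmp);
    char *word = malloc(len + 1);        /* +1 for the terminating '\0' */
    if (word != NULL)
        memcpy(word, tmp, len + 1);
    return word;
}
```

The copy is sized from the string length, not from the size of the temporary buffer, so a 3-character word costs 4 bytes on the heap. A word longer than 249 characters would be split across two calls.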
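char *ptr;
ptr = malloc(size_of_string + 1); /* +1 for the terminating null character */
if (ptr != NULL) {
    /* copy the word into ptr here, then use it like any char array */
    char first = ptr[0];
}
/* etc. */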

Doubt in count value passed to strncat

Suppose I have an array of 10 characters (memset to 0) which I am passing to strncat as the destination, and as the source I am passing a null-terminated string which is, say, 20 characters long. Should I pass the 'count' as 10 or 9?
The doubt is: does strncpy consider the 'count' to be the size of the destination buffer, or does it just copy 10 characters to the destination and then append a null terminating character in the 11th position?
Sorry if the question appears too trivial, but I was unable to make this out from the help documentation of strncpy.
You should probably just read the man page a bit more. The operative sentence seems to be this one:
The strncat() function is similar,
except that it will use at most n
characters from src. Since the result
is always terminated with '\0', at
most n+1 characters are written.
strncat will append the null terminator for you. Since it has no knowledge of the size of the buffer you are pointing to, it will assume there is space for the null terminator. So you want to pass in 9.
If your array only has room for 10 characters then your count should be 9, as strncat will try to append count characters from src plus a null terminator.
Also, in this case your destination should have a null terminator in the first position, because that is where you need it to start appending.
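A minimal sketch of computing the count, assuming the destination already holds a null-terminated string (possibly empty). The wrapper name append_bounded is hypothetical; the count passed to strncat is the free space minus one byte reserved for the terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Append as much of `src` as fits into `dst`, whose total capacity is
 * `dst_size` bytes including the terminator. `dst` must already contain
 * a null-terminated string. Returns the number of characters appended. */
size_t append_bounded(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);
    size_t used = strlen(dst);
    assert(used < dst_size);
    size_t room = dst_size - used - 1;  /* keep one byte for '\0' */
    strncat(dst, src, room);            /* writes at most room + 1 bytes */
    return strlen(dst) - used;
}
```

With an empty 10-byte destination, room is 9, which matches the answer above. If the destination already contains text, the count shrinks accordingly.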
$ man strncat
If src contains n or more characters, strncat() writes n+1 characters
to dest (n from src plus the terminating null byte). Therefore, the
size of dest must be at least strlen(dest)+n+1
$
The simple answer is: ensure your buffer is big enough. If your buffer is to hold 10 characters, add one to the size of the buffer to accommodate the nul character \0. That cannot be stressed enough and is one of the biggest stumbling blocks of learning C.
If you did not specify the appropriate length excluding the nul character, the buffer overflows and unpredictable results will occur, such as program crash, or jump off into the woods never to be seen again.
Hope this helps,
Best regards,
Tom.

Resources