Sending † character instead of Space character in Char array - arrays

I've migrated my project from XE5 to 10 Seattle. I'm still using ANSI codes to communicate with devices. With my new build, the Seattle IDE is sending the † character instead of the space character (which is #32 in ANSI) in a Char array. I need to send space character data to a text file but I can't.
I tried #32 (like before I used), #032 and #127 but it doesn't work. Any idea?
Here is how I use:
fillChar(X,50,#32);
Method signature (var X; count:Integer; Value:Ordinal)

Despite its name, FillChar() fills bytes, not characters.
Char is an alias for WideChar (2 bytes) in Delphi 2009+, in prior versions it is an alias for AnsiChar (1 byte) instead.
So, if you have a 50-element array of WideChar elements, the array is 100 bytes in size. When you call fillChar(X,50,#32), it fills in the first 50 bytes with a value of $20 each. Thus the first 25 WideChar elements will have a value of $2020 (aka Unicode codepoint U+2020 DAGGER, †) and the second 25 elements will not have any meaningful value.
This issue is explained in the FillChar() documentation:
Fills contiguous bytes with a specified value.
In Delphi, FillChar fills Count contiguous bytes (referenced by X) with the value specified by Value (Value can be of type Byte or AnsiChar)
Note that if X is a UnicodeString, this may not work as expected, because FillChar expects a byte count, which is not the same as the character count.
In addition, the filling character is a single-byte character. Therefore, when Buf is a UnicodeString, the code FillChar(Buf, Length(Buf), #9); fills Buf with the code point $0909, not $09. In such cases, you should use the StringOfChar routine.
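The byte-versus-element mismatch is not specific to Delphi. As a rough illustration in C (not Delphi code; the helper names byte_fill and element_fill are made up for this sketch), a byte-oriented fill over an array of 16-bit elements produces the same $2020 pattern, while an element-wise fill gives the intended $0020:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte-oriented fill: writes `value` into the first `nbytes` BYTES of buf,
   which is what a FillChar-style routine does. */
static void byte_fill(uint16_t *buf, size_t nbytes, unsigned char value)
{
    assert(buf != NULL);
    memset(buf, value, nbytes);
}

/* Element-oriented fill: writes `value` into each of `count` 16-bit elements. */
static void element_fill(uint16_t *buf, size_t count, uint16_t value)
{
    assert(buf != NULL);
    for (size_t i = 0; i < count; i++)
        buf[i] = value;
}
```

Calling byte_fill(buf, 50, 0x20) on a 50-element array touches only the first 25 elements and leaves each of them holding 0x2020; element_fill(buf, 50, 0x20) sets all 50 elements to 0x0020.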
This is also explained in Embarcadero's Unicode Migration Resources white papers, for instance on page 28 of Delphi Unicode Migration for Mere Mortals: Stories and Advice from the Front Lines by Cary Jensen:
Actually, the complexity of this type of code is not related to pointers and buffers per se. The problem is due to Chars being used as pointers. So, now that the size of Strings and Chars in bytes has changed, one of the fundamental assumptions that much of this code embraces is no longer valid: That individual Chars are one byte in length.
Since this type of code is so problematic for Unicode conversion (and maintenance in general), and will require detailed examination, a good argument can be made for refactoring this code where possible. In short, remove the Char types from these operations, and switch to another, more appropriate data type. For example, Olaf Monien wrote, "I wouldn't recommend using byte oriented operations on Char (or String) types. If you need a byte-buffer, then use ‘Byte’ as [the] data type: buffer: array[0..255] of Byte;."
For example, in the past you might have done something like this:
var
Buffer: array[0..255] of AnsiChar;
begin
FillChar(Buffer, Length(Buffer), 0);
If you merely want to convert to Unicode, you might make the following changes:
var
Buffer: array[0..255] of Char;
begin
FillChar(Buffer, Length(buffer) * SizeOf(Char), 0);
On the other hand, a good argument could be made for dropping the use of an array of Char as your buffer, and switch to an array of Byte, as Olaf suggests. This may look like this (which is similar to the first segment, but not identical to the second, due to the size of the buffer):
var
Buffer: array[0..255] of Byte;
begin
FillChar(Buffer, Length(buffer), 0);
Better yet, use this second argument to FillChar which works regardless of the data type of the array:
var
Buffer: array[0..255] of Byte;
begin
FillChar(Buffer, Length(buffer) * SizeOf(Buffer[0]), 0);
The advantage of these last two examples is that you have what you really wanted in the first place, a buffer that can hold byte-sized values. (And Delphi will not try to apply any form of implicit string conversion since it's working with bytes and not code units.) And, if you want to do pointer math, you can use PByte. PByte is a pointer to a Byte.
The one place where changes like this may not be possible is when you are interfacing with an external library that expects a pointer to a character or character array. In those cases, they really are asking for a buffer of characters, and these are normally AnsiChar types.
So, to address your issue, since you are interacting with an external device that expects Ansi data, you need to declare your array as using AnsiChar or Byte elements instead of (Wide)Char elements. Then your original FillChar() call will work correctly again.

If you want to use ANSI for communication with devices, you would define the array as
x: array[1..50] of AnsiChar;
In this case to fill it with space characters you use
FillChar(x, 50, #32);
Using an array of AnsiChar as a communication buffer may become troublesome in a Unicode environment, so I would suggest using a byte array as the communication buffer
x: array[1..50] of byte;
and initialize it with
FillChar(x, 50, 32);


Appending a char w/ null terminator in C

perhaps a lil trivial, but im just learning C and i hate doing with 2 lines, what can be done with one(as long as it does not confuse the code of course).
anyway, im building strings by appending one character at a time. im doing this by keeping track of the char index of the string being built, as well as the input file string's(line) index.
str[strIndex] = inStr[index];
str[strIndex + 1] = '\0';
str is used to temporarily store one of the words from the input line.
i need to append the terminator every time i add a char.
i guess what i want to know; is there a way to combine these in one statement, without using strcat()(or clearing str with memset() every time i start a new word) or creating other variables?
Simple solution: Zero out the string before you add anything to it. The NULs will already be at every location ahead of time.
// On the stack:
char str[STRLEN] = {0};
// On the heap
char *str = calloc(STRLEN, sizeof(*str));
In the calloc case, for large allocations, you won't even pay the cost of zeroing the memory explicitly (in bulk allocation mode, it requests memory directly from the OS, which is either lazily zero-ed (Linux) or has been background zero-ed before you ask for it (Windows)).
Obviously, you can avoid even this amount of work by deferring the NUL termination of the string until you're done building it, but if you might need to use it as a C-style string at any time, guaranteeing it's always NUL-terminated up front isn't unreasonable.
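A minimal sketch of the zero-first approach, assuming the buffer was zero-initialised before the first call. The helper name append_char and its capacity check are additions for illustration, not part of the question's code:

```c
#include <assert.h>
#include <stddef.h>

/* Append one character to a buffer that was zeroed up front.
   `cap` is the total buffer size, `len` the current string length.
   Returns the new length; the length is unchanged if the buffer is full.
   No explicit terminator is written: the byte after the new character
   is still zero from the initialisation. */
static size_t append_char(char *str, size_t cap, size_t len, char c)
{
    assert(str != NULL);
    if (len + 1 >= cap)      /* keep the last byte as the terminator */
        return len;
    str[len] = c;
    return len + 1;
}
```

Note that this only holds for the first word built in the buffer; reusing the buffer for a second, shorter word brings back the need to terminate (or re-zero) explicitly.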
I believe the way you are doing it now is the neatest that satisfies your requirement of
1) Not having string all zero to start with
2) At every stage the string is valid (as in always has a termination).
Basically you want to add two bytes each time. And really the most neat way to do that is the way you are doing it now.
If you are wanting to make the code seem neater by having the "one line" but not calling a function then perhaps a macro:
#define SetCharAndNull( String, Index, Character ) \
do { \
(String)[(Index)] = (Character); \
(String)[(Index) + 1] = '\0'; \
} while (0)
And use it like:
SetCharAndNull( str, strIndex, inStr[index]);
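If a macro feels heavy-handed, a small function gives the same one-line call site while evaluating each argument exactly once. This is only a sketch; the name set_char_and_nul is hypothetical, and the caller is assumed to guarantee that index + 1 is inside the buffer:

```c
#include <assert.h>
#include <stddef.h>

/* Store `c` at s[index] and a terminator right after it.
   The caller must ensure index + 1 is a valid position in s. */
static inline void set_char_and_nul(char *s, size_t index, char c)
{
    assert(s != NULL);
    s[index] = c;
    s[index + 1] = '\0';
}
```

The call then reads set_char_and_nul(str, strIndex, inStr[index]); and most compilers will inline it, so there should be no cost over the two-statement version.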
Otherwise the only other thing I can think of which would achieve the result is to write a "word" at a time (two bytes, so an unsigned short) in most cases. You could do this with some horrible typecasting and pointer arithmetic. I would strongly recommend against this though as it won't be very readable, also it won't be very portable. It would have to be written for a particular endianness, also it would have problems on systems that require alignment on word access.
[Edit: Added the following]
Just for completeness I'm putting that awful solution I mentioned here:
*((unsigned short*)&str[strIndex]) = (unsigned short)(inStr[index]);
This is type casting the pointer of str[strIndex] to an unsigned short which on my system (OSX) is 16 bits (two bytes). It is then setting the value to a 16 bit version of inStr[index] where the top 8 bits are zero. Because my system is little endian, then the first byte will contain the least significant one (which is the character), and the second byte will be the zero from the top of the word. But as I said, don't do this! It won't work on big endian systems (you would have to add in a left shift by 8), also this will cause alignment problems on some processors where you can not access a 16bit value on a non 16-bit aligned address (this will be setting address with 8bit alignment)
Declare a char array:
char str[100];
or,
char * str = (char *)malloc(100 * sizeof(char));
Add all the character one by one in a loop:
for(i = 0; i<length; i++){
str[i] = inStr[index + i]; /* index is assumed to be where the word starts in inStr */
}
Finish it with a null character (outside the loop):
str[i] = '\0';

Store in array with some spaces

I have a problem using memcpy().
I have an array of 36 bytes. The first 20 should be filled with the mobile number and the other 16 with the voucher number. If the mobile number is shorter than 20 characters, the remainder should be filled with spaces. But when I fill in the voucher number it overwrites the first value. Below is my code.
char tempMobileNo[20],tempVoucherNo[16],o2RecordData[50];
memset(tempMobileNo,' ',20);
memset(tempVoucherNo,' ',16);
memset(o2RecordData,' ',RECORD_DATA_L);
memcpy(tempMobileNo,ValueB,20);
memcpy(tempVoucherNo,ValueC,16);
memcpy(&o2RecordData[0],tempMobileNo,20);
memcpy(&o2RecordData[22],tempVoucherNo,16);
The problem
memcpy is implemented in such way that you will always copy the number of specified bytes, it doesn't know if the "contents" of a buffer ends earlier and whether it shall stop copying because of this, nor does it care.
Since you first fill your buffers with spaces, but then unconditionally copy the specified length into the buffers in (A) and (B), your spaces will be overwritten by whatever 20 and 16 bytes, respectively, are available in ValueB and ValueC.
memcpy(tempMobileNo, ValueB, 20); // (A)
memcpy(tempVoucherNo, ValueC, 16); // (B)
Thoughts
If you are dealing with c-style strings (ie. null-terminated strings), consider using strncpy instead of memcpy.
strncpy (dst, src, n) will copy at most n characters, unless it hits the end of src (a null-byte).
Note: this post was created prior to OP editing his question, it's no longer of relevance.
memcpy(&o2RecordData[22],tempVoucherNo,22);
should be
memcpy(&o2RecordData[20],tempVoucherNo,16);
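Putting the two observations together (copy only as many bytes as the value actually has, and place the voucher at offset 20), a space-padded record could be built roughly as follows. This is a sketch that assumes the inputs are NUL-terminated C strings; put_field, build_record and the enum names are invented for the example:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { MOBILE_W = 20, VOUCHER_W = 16, RECORD_W = MOBILE_W + VOUCHER_W };

/* Write src into a fixed-width field: left-aligned, padded with spaces,
   truncated if longer than the field. No terminator is stored. */
static void put_field(char *dst, size_t width, const char *src)
{
    size_t len = strlen(src);
    if (len > width)
        len = width;
    memset(dst, ' ', width);   /* pad the whole field first */
    memcpy(dst, src, len);     /* then copy only the real characters */
}

/* Build the 36-byte record: 20 bytes of mobile number, 16 of voucher number. */
static void build_record(char record[RECORD_W],
                         const char *mobile, const char *voucher)
{
    assert(record != NULL && mobile != NULL && voucher != NULL);
    put_field(record, MOBILE_W, mobile);
    put_field(record + MOBILE_W, VOUCHER_W, voucher);
}
```

The record is not NUL-terminated, so it should be written out with an explicit length of RECORD_W rather than passed to string functions.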

Why would one add 1 or 2 to the second argument of snprintf?

What is the role of 1 and 2 in these snprintf functions? Could anyone please explain it
snprintf(argv[arg++], strlen(pbase) + 2 + strlen("ivlpp"), "%s%ccivlpp", pbase, sep);
snprintf(argv[arg++], strlen(defines_path) + 1, "-F\"%s\"", defines_path);
The role of the +2 is to allow for a terminal null and the embedded character from the %c format, so that there is exactly the right amount of space for formatting the first string. But (as 6502 points out), the size actually provided is one byte shorter than needed because the strlen("ivlpp") doesn't match the civlpp in the format itself. This means that the last character (the second 'p') will be truncated in the output.
The role of the +1 is also to cause snprintf() to truncate the formatted data. The format string contains 4 literal characters, and you need to allow for the terminal null, so the code should allocate strlen(defines_path)+5. As it is, the snprintf() truncates the data, leaving off 4 characters.
I'm dubious about whether the code really works reliably...the memory allocation is not shown, but will have to be quite complex - or it will have to over-allocate to ensure that there is no danger of buffer overflow.
Since a comment from the OP says:
I don't know the use of snprintf()
int snprintf(char *restrict s, size_t n, const char *restrict format, ...);
The snprintf() function formats data like printf(), but it writes it to a string (the s in the name) instead of to a file. The first n in the name indicates that the function is told exactly how long the string is, and snprintf() therefore ensures that the output data is null terminated (unless the length is 0). It reports how long the string should have been; if the reported value is longer than the value provided, you know the data got truncated.
So, overall, snprintf() is a relatively safe way of formatting strings, provided you use it correctly. The examples in the question do not demonstrate 'using it correctly'.
One gotcha: if you work on MS Windows, be aware that the MSVC implementation of snprintf() does not exactly follow the C99 standard (and it looks a bit as though MS no longer provides snprintf() at all; only various alternatives such as _snprintf()). I forget the exact deviation, but I think it means that the string is not properly null-terminated in all circumstances when it should be longer than the space provided.
With locally defined arrays, you normally use:
nbytes = snprintf(buffer, sizeof(buffer), "format...", ...);
With dynamically allocated memory, you normally use:
nbytes = snprintf(dynbuffer, dynbuffsize, "format...", ...);
In both cases, you check whether nbytes contains a non-negative value less than the size argument; if it does, your data is OK; if the value is equal to or larger, then your data got chopped (and you know how much space you needed to allocate).
The C99 standard says:
The snprintf function returns the number of characters that would have been written
had n been sufficiently large, not counting the terminating null character, or a negative
value if an encoding error occurred. Thus, the null-terminated output has been
completely written if and only if the returned value is nonnegative and less than n.
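The return-value check described in that passage can be wrapped up as follows. This is a sketch only; the function name format_define_flag is hypothetical, and the format string is borrowed from the question:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Format -F"path" into buf, which holds `size` bytes.
   Returns 1 if the whole string (plus terminator) fit,
   0 if it was truncated or an encoding error occurred. */
static int format_define_flag(char *buf, size_t size, const char *path)
{
    assert(buf != NULL && path != NULL);
    int n = snprintf(buf, size, "-F\"%s\"", path);
    return n >= 0 && (size_t)n < size;
}
```

With a buffer of strlen(path) + 1 bytes, as in the question, the function reports truncation; with strlen(path) + 5 bytes or more it reports success.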
The programmer whose code you are reading doesn't know how to use snprintf properly. The second argument is the buffer size, so it should almost always look like this:
snprintf(buf, sizeof buf, "..." ...);
The above is for situations where buf is an array, not a pointer. In the latter case you have to pass the buffer size along:
snprintf(buf, bufsize, "...", ...);
Computing the buffer size is unneeded.
By the way, since you tagged the question as qt-related. There is a very nice QString class that you should use instead.
At a first look both seem incorrect.
In the first case the correct computation would be path + sep + name + NUL so 2 would seem ok, but for the name the strlen call is using ivlpp while the formatting code is instead using civlpp, which is one char longer.
In the second case the number of chars added is 4 (-F"") so the number to add should be 5 because of the ending NUL.
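One way to avoid counting characters by hand is to let snprintf() compute the length itself: C99 allows a call with a null buffer and a size of 0, which only returns the length that would be needed. A sketch under that assumption (join_path is a made-up name, and the caller owns the returned memory):

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Return a newly allocated "base<sep>name" string, or NULL on failure.
   The caller must free() the result. */
static char *join_path(const char *base, char sep, const char *name)
{
    assert(base != NULL && name != NULL);

    /* First pass: measure only, nothing is written. */
    int need = snprintf(NULL, 0, "%s%c%s", base, sep, name);
    if (need < 0)
        return NULL;

    char *out = malloc((size_t)need + 1);   /* +1 for the terminator */
    if (out == NULL)
        return NULL;

    /* Second pass: format into a buffer of exactly the right size. */
    snprintf(out, (size_t)need + 1, "%s%c%s", base, sep, name);
    return out;
}
```

Because the same format string is used for measuring and for writing, the size and the format cannot drift apart the way strlen("ivlpp") and the format did in the question.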

delphi declaring size of ansi string

It's easy to define a string with a size of 3 (in old Delphi code)
st:string[3];
now, we wish to move the code to ansi
st:ansiString[3];
won't work!
and for the advanced OEM type
st:oemString[3];
same problem, where
type
OemString = Type AnsiString(CP_OEMCP);
how could be declared a fixed length ansi string and the new oem type?
update: i know it will create a fixed length string. it is part of the design of the software to protect against mistakes, and is essential for the program.
You don't need to define the size of an AnsiString.
The notation
string[3]
is for short strings used by Pascal (and Delphi 1) and it is mostly kept for legacy purposes.
Short strings can be 1 to 255 bytes long. The first ("hidden") byte contains the length.
AnsiString is a pointer to a character buffer (0 terminated). It has some internal magic like reference counting. And you can safely add characters to an existing string because the compiler will handle all the nasty details for you.
UnicodeStrings are like AnsiStrings, but with unicode chars (2 bytes in this case). The default string now (Delphi 2009) maps to UnicodeString.
the type AnsiString has a construct to add a codepage (used to define the characters above 127) hence the CP_OEMCP:
OemString = Type AnsiString(CP_OEMCP);
"Short Strings" are "Ansi" strings, because they are only available for backward compatibility with pre-Delphi code.
st: string[3];
will always create a fixed-length "short string" with the current Ansi Code Page / Char Set, since Delphi 2009.
But such short strings are NOT the same as the so-called AnsiString. There is no code page for short strings, just as there is no reference count for short strings.
The code page exists only for the AnsiString type, which is not fixed-length but variable-length and reference counted, so it is a completely different type from a short string defined by string[...].
You can't just mix Short String and AnsiString type declarations, by design. Both are called 'strings' but they are different types.
Here is the mapping of a Short String
st[0] = length(st)
st[1] = 1st char (if any) in st
st[2] = 2nd char (if any) in st
st[3] = 3rd (if any) in st
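For readers more at home in C, the short string layout above can be modelled as a length byte followed by a fixed character array. This is only a model of the layout, not Delphi's implementation; the struct and function names are invented, and the truncating assignment mirrors how a value longer than the declared size is cut off:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { SHORT_CAP = 3 };          /* models string[3] */

struct short_str3 {
    unsigned char len;           /* corresponds to st[0] */
    char data[SHORT_CAP];        /* corresponds to st[1]..st[3] */
};

/* Assign a C string, truncating to the fixed capacity.
   There is no terminator and no reference count, only the length byte. */
static void short_assign(struct short_str3 *s, const char *src)
{
    assert(s != NULL && src != NULL);
    size_t n = strlen(src);
    if (n > SHORT_CAP)
        n = SHORT_CAP;
    s->len = (unsigned char)n;
    memcpy(s->data, src, n);
}
```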
Here is the memory mapping of an AnsiString or UnicodeString type:
st = nil if st=''
st = PAnsiChar if st<>''
and here is the PSt: PAnsiChar layout:
PWord(PSt-12)^ = code page
PWord(PSt-10)^ = element size (1 for AnsiChar, 2 for UnicodeChar)
PInteger(PSt-8)^ = reference count
PInteger(PSt-4)^ = length(st) in AnsiChar or UnicodeChar count
PAnsiChar(PSt) / PWideChar(PSt) = Ansi or Unicode text stored in st, finished by a #0 char (AnsiChar or UnicodeChar)
So while there are some similarities between the AnsiString and UnicodeString types, the short string type is totally different, and can't be mixed as you wished.
That would only be useful if String[3] in Unicode versions of Delphi defaulted to 3 WideChars. That would surprise me, but in case it does, use:
st: array[1..3] of AnsiChar;
The size of an ansistring and unicodestring will grow dynamically. The compiler and runtime code handle all this stuff for you.
See: http://delphi.about.com/od/beginners/l/aa071800a.htm
For a more in depth explanation see: http://www.codexterity.com/delphistrings.htm
The length can be anything from 1 char to 2GB.
Unlike the old ShortString type, the newer string types in Delphi are dynamic. They grow and shrink as needed. You can preallocate a string to a given length by calling SetLength(), which is useful to avoid re-allocating memory if you have to add data piece by piece to a string whose final length you already know, but even after that the string can still grow and shrink when data is added or deleted.
If you need static strings you can use array[0..n] of chars, whose size won't change dynamically.

How do I send an array of integers over TCP in C?

I'm led to believe that write() can only send data buffers of bytes (i.e. signed char), so how do I send an array of long integers using the C write() function in the sys/socket.h library?
Obviously I can't just cast or convert long to char, as any numbers over 127 would be malformed.
I took a look at the question, how to decompose integer array to a byte array (pixel codings), but couldn't understand it - please could someone dumb it down a little if this is what I'm looking for?
Follow up question:
Why do I get weird results when reading an array of integers from a TCP socket?
the prototype for write is:
ssize_t write(int fd, const void *buf, size_t count);
so while it writes in units of bytes, it can take a pointer of any type. Passing an int* will be no problem at all.
EDIT:
I would, however, recommend that you also send the number of integers you plan to send first, so the receiver knows how much to read. Something like this (error checking omitted for brevity):
int x[10] = { ... };
int count = 10;
write(sock, &count, sizeof(count));
write(sock, x, sizeof(x));
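NOTE: if the array is dynamically allocated (for example with malloc), you cannot use sizeof on it, because sizeof on the pointer gives the size of the pointer, not of the array. In that case the byte count passed to write() would be: sizeof(int) * element_count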
EDIT:
As Brian Mitchell noted, you will likely need to be careful of endian issues as well. This is the case when sending any multibyte value (as in the count I recommended as well as each element of the array). This is done with the: htons/htonl and ntohs/ntohl functions.
Write can do what you want it to, but there's some things to be aware of:
1: You may get a partial write that's not on an int boundary, so you have to be prepared to handle that situation
2: If the code needs to be portable, you should convert your array to a specific endianness, or encode the endianness in the message.
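The partial-write point is usually handled with a small loop that keeps calling write() until every byte has gone out. A minimal sketch for POSIX systems (write_all is a hypothetical helper name, not a library function):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Write exactly `len` bytes from buf to fd, retrying after partial writes
   and after interruption by a signal. Returns 0 on success, -1 on error. */
static int write_all(int fd, const void *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    const unsigned char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;        /* interrupted before writing: try again */
            return -1;
        }
        p += n;                  /* skip what was actually written */
        len -= (size_t)n;
    }
    return 0;
}
```

Tracking the position in bytes rather than in ints is what makes a write that stops in the middle of an int harmless: the next call simply resumes from that byte.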
The simplest way to send a single int (assuming 4-byte ints) is :
int tmp = htonl(myInt);
write(socket, &tmp, 4);
where htonl is a function that converts the int to network byte order. (Similarly, when you read from the socket, the function ntohl can be used to convert back to host byte order.)
For an array of ints, you would first want to send the count of array members as an int (in network byte order), then send the int values.
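The count-then-values layout can be prepared in a byte buffer before anything is written to the socket. The sketch below assumes 32-bit values and a POSIX system providing htonl in arpa/inet.h; encode_int_array is an invented name:

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Encode `count` followed by the values, each as a 32-bit word in
   network byte order. Returns the number of bytes produced, or 0 if
   `out` (capacity `cap` bytes) is too small. */
static size_t encode_int_array(const int32_t *vals, uint32_t count,
                               unsigned char *out, size_t cap)
{
    assert(out != NULL && (vals != NULL || count == 0));
    size_t need = ((size_t)count + 1) * sizeof(uint32_t);
    if (need > cap)
        return 0;

    uint32_t word = htonl(count);
    memcpy(out, &word, sizeof word);               /* element count first */
    for (uint32_t i = 0; i < count; i++) {
        word = htonl((uint32_t)vals[i]);
        memcpy(out + ((size_t)i + 1) * sizeof word, &word, sizeof word);
    }
    return need;
}
```

The receiver reads the first four bytes, applies ntohl to get the count, then reads count more words and converts each one back with ntohl.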
Yes, you can just cast a pointer to your buffer to a pointer to char, and call write() with that. Casting a pointer to a different type in C doesn't affect the contents of the memory being pointed to -- all it does is indicate the programmer's intention that the contents of memory at that address be interpreted in a different way.
Just make sure that you supply write() with the correct size in bytes of your array -- that would be the number of elements times sizeof (long) in your case.
It would be better to have serialize/de-serialize functionality in your client /server program.
Whenever you want to send data, serialize the data into a byte buffer and send it over TCP with byte count.
When receiving data, de-serialize the data from buffer to your own interpretation .
You can interpret byte buffer in any form as you like. It can contain basic data type, objects etc.
Just make sure to take care of endianness and alignment as well.
Declare a character array. In each location of the array, store integer numbers, not characters.
Then you just send that.
For example:
char tcp[100];
tcp[0] = 0;
tcp[1] = 0xA;
tcp[2] = 0xB;
tcp[3] = 0xC;
.
.
// Send the character array
write(sock, tcp, sizeof(tcp));
I think what you need to come up with here is a protocol.
Suppose your integer array is:
100, 99, 98, 97
Instead of writing the ints directly to the buffer, I would "serialize" the array by turning it into a string representation. The string might be:
"100,99,98,97"
That's what would be sent over the wire. On the receiving end, you'd split the string by the commas and build the array back up.
This is more standardised, is human readable, and means people don't have to think about hi/lo byte orders and other silly things.
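A possible sender side for such a text protocol, sketched with snprintf(). The function name ints_to_csv is hypothetical, and the comma-separated format is just the one from the example above:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Write vals as comma-separated decimal text into out (capacity cap bytes).
   Returns the string length, or -1 if the buffer is too small. */
static int ints_to_csv(const int *vals, size_t count, char *out, size_t cap)
{
    assert(out != NULL && (vals != NULL || count == 0));
    size_t used = 0;
    if (cap == 0)
        return -1;
    out[0] = '\0';
    for (size_t i = 0; i < count; i++) {
        const char *sep = (i == 0) ? "" : ",";
        int n = snprintf(out + used, cap - used, "%s%d", sep, vals[i]);
        if (n < 0 || (size_t)n >= cap - used)
            return -1;           /* would not fit, including terminator */
        used += (size_t)n;
    }
    return (int)used;
}
```

The receiver still needs to know where the message ends, for instance via a trailing newline or a length prefix; that part is left out of the sketch.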
// Sarcasm
If you were working in .NET or Java, you'd probably encode it in XML, like this:
<ArrayOfInt><Int>100</Int><Int>99</Int><Int>98</Int><Int>97</Int></ArrayOfInt>
:)
