Custom regional language

I am writing code in C for an 8051-family microcontroller (AT89C51) to display a regional language on a 16x2 LCD.
Because the LCD doesn't support regional languages by default, I created custom characters and converted each letter into hex. What I don't understand is where to put the converted hex values in my code so that they are displayed the way I want.
void main()
{
...
str_lcd("HELLO & WELCOME");
delay_ms(3000);
cmd_lcd(0x80);
cmd_lcd(0x01);
...
}
For "HELLO & WELCOME" the hex values are:
{0x40,0x60,0x30,0x1c,0x14,0x14,0x14,0x14},
{0x78,0x08,0x10,0x20,0x18,0x08,0x08,0x08},
{0x20,0x40,0x7c,0x24,0x24,0x04,0x0a,0x11},
{0x78,0x08,0x10,0x20,0x18,0x08,0x08,0x08},
{0x38,0x28,0x38,0x10,0x38,0x28,0x28,0x28},
{0x44,0x44,0x64,0x24,0x24,0x24,0x24,0x3c},
{0x3c,0x40,0x40,0x20,0x18,0x08,0x08,0x08},
{0x00,0x7f,0x55,0x55,0x55,0x55,0x77,0x00},
{0x7c,0x54,0x54,0x54,0x04,0x04,0x04,0x04},
{0x7c,0x10,0x1c,0x04,0x1f,0x04,0x04,0x04},
{0x48,0x48,0x48,0x4e,0x48,0x48,0x48,0x78},
};
Can anyone tell me where to put these hex values so that they are displayed on the LCD?

Assuming each 8-byte array corresponds to a specific character, you could have a table of 128 such 8-byte arrays anywhere in the code, for example by having a static array of arrays of constant bytes, like
static const unsigned char character_data[128][8] = {
// Your data here, one entry per character
};
Most of the data in the table above would simply be zero.
Where you put this table doesn't really matter; the compiler and linker will make sure it ends up in the correct segment (most likely in the text segment with the code). But since I declared it as static, it should be placed in the source file that does the translation between the characters and the data sent to the LCD panel.
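As a rough sketch of what that translation code could look like, the following assumes the 16x2 module uses an HD44780-compatible controller, which the question does not confirm. Such controllers hold only 8 user-defined characters at a time in CGRAM and display only the low 5 bits of each row byte. cmd_lcd() comes from the question; data_lcd(), lcd_define_char(), lcd_show_custom() and the lcd_log buffer are hypothetical names, and the two LCD functions are stubbed here so the sketch can run on a PC.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the real LCD bus so the sketch runs on a PC.
   On the 8051 these two functions would drive the RS/EN/data pins. */
static uint8_t lcd_log[64];
static size_t  lcd_log_len = 0;

static void cmd_lcd(uint8_t cmd)     /* name taken from the question */
{
    if (lcd_log_len < sizeof lcd_log) lcd_log[lcd_log_len++] = cmd;
}

static void data_lcd(uint8_t value)  /* hypothetical data-write routine */
{
    if (lcd_log_len < sizeof lcd_log) lcd_log[lcd_log_len++] = value;
}

/* First two patterns from the question, one byte per pixel row. */
static const uint8_t glyphs[2][8] = {
    {0x40, 0x60, 0x30, 0x1c, 0x14, 0x14, 0x14, 0x14},
    {0x78, 0x08, 0x10, 0x20, 0x18, 0x08, 0x08, 0x08},
};

/* Copy one 8-row pattern into CGRAM slot 0..7. */
static void lcd_define_char(uint8_t slot, const uint8_t rows[8])
{
    uint8_t i;
    cmd_lcd((uint8_t)(0x40 | ((slot & 0x07) << 3))); /* set CGRAM address */
    for (i = 0; i < 8; i++)
        data_lcd(rows[i] & 0x1F);   /* only the low 5 bits are displayed */
}

/* Define two custom characters, then print them on line 1. */
static void lcd_show_custom(void)
{
    lcd_define_char(0, glyphs[0]);
    lcd_define_char(1, glyphs[1]);
    cmd_lcd(0x80);                  /* back to DDRAM, line 1, column 0 */
    data_lcd(0);                    /* character code 0 = CGRAM slot 0 */
    data_lcd(1);                    /* character code 1 = CGRAM slot 1 */
}
```

Several of the question's row bytes have bits set above 0x1F, so under this assumption the patterns would need to be redrawn on a 5-pixel-wide grid rather than simply masked.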

Related

How to code ASCII Text Based protocol over RS-232 in C

I have to implement a relatively simple communication protocol on top of RS-232.
It's an ASCII based text protocol with a couple of frame types.
Each frame looks something like this:
* ___________________________________
* | | | | |
* | SOH | Data | CRC-16 | EOT |
* |_____|_________|_________|________|
* 1B nBytes 2B 1B
Start Of Header (1 Byte)
Data (n-Bytes)
CRC-16 (2 Bytes)
EOT (End Of Transmission)
Each data-field needs to be separated by semicolon ";":
for example, for HEADER type data (contains code,ver,time,date,src,id1,id2 values):
{code};{ver};{time};{date};{src};{id1};{id2}
what is the most elegant way of implementing this in C is my question?
I have tried defining multiple structs for each type of frame, for example:
typedef struct {
uint8_t soh;
char code;
char ver;
Time_t time;
Date_t date;
char src; // Unsigned char
char id1[20]; // STRING_20
char id2[20]; // STRING_20
char crlf;
uint16_t crc;
uint8_t eot;
} stdHeader_t;
I have declared a global buffer:
uint8_t DATA_BUFF[BUFF_SIZE];
I then have a function sendHeader() in which I want to use RS-232 send function to send everything byte by byte by casting the dataBuffer to header struct and filling out the struct:
static enum_status sendHeader(handle_t *handle)
{
uint16_t len;
enum_RETURN_VALUE rs232_err = OK;
enum_status err = STATUS_OK;
stdHeader_t *header = (stdHeader_t *)DATA_BUFF;
memset(DATA_BUFF, 0, size);
header ->soh= SOH,
header ->code= HEADER,
header ->ver= 10, // TODO
header ->time= handle->time,
header ->date= handle->date,
header ->src= handle->config->source,
memset(header ->id1,handle->config->id1, strlen(handle->config->id1));
memset(header ->id2,handle->config->id2, strlen(handle->config->id1));
header ->crlf = '\r\n',
header ->crc = calcCRC();
header ->eot = EOT;
len = sizeof(stdHeader_t );
do
{
for (uint16_t i = 0; i < len; i++)
{
rs232_err= rs232_tx_send(DATA_BUFF[i], 1); // Send one byte
if (rs232_err!= OK)
{
err = STATUS_ERR;
break;
}
}
// Break do-while loop if there is an error
if (err == STATUS_ERR)
{
break;
}
} while (conditions);
return err;
}
My problem is that I do not know how to approach the problem of handling ascii text based protocol,
the above principle would work very well for byte based protocols.
Also, I do not know how to implement semicolon ";" separation of data in the above snippet. As everything is sent byte by byte, I would need additional logic to know when to send ";", and with the current implementation that would not look very good.
For fields id1 and id2, I am receiving string values as part of handle->config. They can be of any length up to a maximum of 20. Because of that, with the current implementation I would be sending more than needed when the actual length is less than 20, but I cannot use pointers to char inside the struct, because in that case only the pointer value would get sent.
So to summarize, the main question is:
How to implement the above described text based protocol for rs-232 in a nice and proper way?

what is the most elegant way of implementing this (ASCII Text Based protocol) in C is my question?
Since this is ASCII, avoid endian issues of trying to map a multi-byte integer. Simply send an integer (including char) as decimal text. Likewise for floating point, use exponential notation and sufficient precision. E.g. sprintf(buf, "%.*e", DBL_DECIMAL_DIG-1, some_double);. Allow "%a" notation.
Do not use the same code for SOH and EOT. Different values reduce receiver confusion.
Send date and time using ISO 8601 as your guide. E.g. "2022-11-10", "23:38:42".
Send string with a leading/trailing ". Escape non-printable ASCII characters, and ", \, ;. Example for 10 long string 123\\;\"\xFF456 --> "123\\\;\"\xFF456".
Error check, like crazy, the received data. Reject packets of data for all sorts of reasons: field count wrong, string too long, value outside field range, bad CRC, timeout, any non-ASCII character received.
Use ASCII hex characters for CRC: 4 hex characters instead of 2 bytes.
Consider a CRC 32 or 64.
Any out-of-band input (bytes received before a SOH) is silently dropped. This nicely allows an optional LF after each command.
Consider the only characters between SOH/EOT should be printable ASCII: 32-126. Escape others as needed.
Since "it's an ASCII based text protocol with a couple of frame types.", I'd expect a type field.
See What type of framing to use in serial communication for more ideas.
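A minimal sketch of how several of these points could fit together: fields are formatted as decimal text joined by ";", and the CRC is appended as 4 hex characters between SOH and EOT. The function names, the choice of CRC-16/CCITT-FALSE (polynomial 0x1021, initial value 0xFFFF) and the decision to compute the CRC over the payload only are assumptions made for illustration; the question does not specify them. String escaping is left out.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define SOH 0x01   /* ASCII Start Of Header */
#define EOT 0x04   /* ASCII End Of Transmission */

/* CRC-16/CCITT-FALSE: an assumed choice, not taken from the question. */
static uint16_t crc16_ccitt(const char *data, size_t len)
{
    uint16_t crc = 0xFFFF;
    size_t i;
    int bit;

    for (i = 0; i < len; i++) {
        crc ^= (uint16_t)((unsigned char)data[i] << 8);
        for (bit = 0; bit < 8; bit++)
            crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x1021)
                                 : (uint16_t)(crc << 1);
    }
    return crc;
}

/* Format the HEADER fields as text, separated by ';'.
   Returns the payload length, or -1 if it does not fit. */
static int build_header_payload(char *out, size_t out_size,
                                int code, int ver,
                                const char *time_s, const char *date_s,
                                int src, const char *id1, const char *id2)
{
    int n = snprintf(out, out_size, "%d;%d;%s;%s;%d;%s;%s",
                     code, ver, time_s, date_s, src, id1, id2);
    return (n < 0 || (size_t)n >= out_size) ? -1 : n;
}

/* Wrap a payload as SOH + payload + 4 hex CRC characters + EOT.
   Returns the frame length, or -1 if it does not fit. */
static int build_frame(char *out, size_t out_size, const char *payload)
{
    size_t len = strlen(payload);

    if (out_size < len + 7)   /* SOH + payload + 4 hex + EOT + NUL */
        return -1;
    return snprintf(out, out_size, "%c%s%04X%c",
                    SOH, payload,
                    (unsigned)crc16_ccitt(payload, len), EOT);
}
```

The resulting buffer can then be handed to the byte-wise send loop from the question, using the returned length instead of sizeof on a struct, so id1 and id2 take only as many bytes as they actually contain.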

First of all, structs are really not good for representing data protocols. The struct in your example will be filled to the brim with padding bytes everywhere, so it is not a proper nor portable representation of the protocol. In particular, forget all about casting a struct to/from a raw uint8_t array - that's problematic for even more reasons: the first address alignment and pointer aliasing.
In case you insist on using a struct, you must write serialization/deserialization routines that manually copy to/from each member into the raw uint8_t buffer, which is the one that must be used for the actual transmission.
(De)serialization routines might not be such a bad idea anyway, because of another issue not addressed by your post: network endianness. RS-232 protocols are by tradition almost always big-endian, but don't count on it; endianness must be documented explicitly.
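A minimal sketch of such routines, assuming a big-endian wire format. The struct sample_msg_t and its fields are hypothetical and only stand in for a real frame layout. Each member is copied byte by byte, so neither struct padding nor the host's byte order reaches the wire.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical message used only for illustration. */
typedef struct {
    uint8_t  code;
    uint16_t counter;
    uint32_t timestamp;
} sample_msg_t;

#define SAMPLE_MSG_WIRE_SIZE 7u   /* 1 + 2 + 4 bytes, no padding */

/* Write the message into buf in big-endian order; returns bytes written. */
static size_t serialize_msg(uint8_t *buf, const sample_msg_t *m)
{
    size_t n = 0;

    buf[n++] = m->code;
    buf[n++] = (uint8_t)(m->counter >> 8);
    buf[n++] = (uint8_t)(m->counter & 0xFF);
    buf[n++] = (uint8_t)(m->timestamp >> 24);
    buf[n++] = (uint8_t)((m->timestamp >> 16) & 0xFF);
    buf[n++] = (uint8_t)((m->timestamp >> 8) & 0xFF);
    buf[n++] = (uint8_t)(m->timestamp & 0xFF);
    return n;
}

/* Rebuild the message from SAMPLE_MSG_WIRE_SIZE received bytes. */
static void deserialize_msg(const uint8_t *buf, sample_msg_t *m)
{
    m->code      = buf[0];
    m->counter   = (uint16_t)(((uint16_t)buf[1] << 8) | buf[2]);
    m->timestamp = ((uint32_t)buf[3] << 24) | ((uint32_t)buf[4] << 16)
                 | ((uint32_t)buf[5] << 8)  |  (uint32_t)buf[6];
}
```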
My problem is that I do not know how to approach the problem of handling ascii text based protocol, the above principle would work very well for byte based protocols.
That is a minor problem compared to the above. Often it is acceptable to have a mix of raw data (essentially everything but the data payload) and ASCII text. If you want a pure ASCII protocol you could consider something like "AT commands", but they don't have much in the way of error handling. You really should have a CRC16 as well as sync bytes. Hint: preferably pick the first sync byte as something that doesn't match 7-bit ASCII, that is, something with the MSB set. 0xAA is popular.
Once you've sorted out data serialization, endianness and protocol structure, you can start to worry about details such as string handling in the payload part.
And finally, RS232 is dinosaur stuff. There aren't many reasons why one shouldn't use RS422/RS485. The last argument for using RS232, "computers come with RS232 COM ports", went obsolete some 15-20 years back.

One thing your struct implementation is missing is packing. For efficiency reasons, depending on which processor your code is running on, the compiler will add padding to the structure to align members on certain byte boundaries. Normally this doesn't affect your code that much, but if you are sending this data across a serial stream where every byte matters, then you will be sending those stray padding bytes across as well.
This article explains padding well, and how to pack your structures for use cases like yours
Structure Padding

What happens when we make an array defined using characters instead of integers in C?

This is a code I have used to define an array:
int characters[126];
following which I wanted to get a record of the frequencies of all the characters recorded for which I used the while loop in this format:
while((a=getchar())!=EOF){
characters[a]=characters[a]+1;
}
Then using a for loop I print the values of integers in the array.
How exactly is this working?
Does C assign a specific number for letters ie. a,b,c, etc in the array?

What happens when we make an array defined using characters instead of integers in C?
Let's be sure we are clear: you are using integer values returned by getchar() as indexes into your array. This is not defining the array, it is just accessing its elements.
Does C assign a specific number for letters ie. a,b,c, etc in the array?
There are no letters in the array. There are ints. However, yes, the characters read by getchar() are encoded as integer values, so they are, in principle, suitable array indexes. Thus, this line ...
characters[a]=characters[a]+1;
... reads the int value then stored at index a in array characters, adds 1 to it, and then assigns the result back to element a of the array, provided that the value of a is a valid index into the array.
More generally, it is important to understand that although one of its major uses is to represent characters, type char is an integer type. Its values are numbers. The mapping from characters to numbers is implementation and context dependent, but it is common enough for the mapping to be consistent with the ASCII code that you will often see programs that assume such a mapping.
Indeed, your code makes exactly such an assumption (and others) by allowing only for character codes less than 126.
You should also be aware that if your characters array is declared inside a function then it is not initialized. The code depends on all elements initially being zero. I would recommend this declaration instead:
int characters[UCHAR_MAX + 1] = {0};
That upper bound will be sufficient for all the non-EOF values returned by getchar(), and the explicit zero-initialization will ensure the needed initial values regardless of where the array is declared.
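Put together, the counting logic could be sketched as follows. To keep the example easy to try, it counts the characters of a string rather than reading from getchar(), and the name count_chars is made up for illustration. The cast to unsigned char is needed here because a plain char may be negative, whereas getchar() already returns non-EOF values in the unsigned char range.

```c
#include <limits.h>
#include <stddef.h>

/* Count how often each character value occurs in a NUL-terminated string.
   counts must have UCHAR_MAX + 1 elements; it is cleared first. */
static void count_chars(const char *text, int counts[UCHAR_MAX + 1])
{
    size_t i;

    for (i = 0; i <= UCHAR_MAX; i++)
        counts[i] = 0;

    for (i = 0; text[i] != '\0'; i++)
        counts[(unsigned char)text[i]]++;   /* character value used as index */
}
```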

I have realized that the character set that can serve as input for getchar() is part of the ASCII table and is represented as an int. I used the following code to find that out:
#include <stdio.h>
int main(){
int a[128];
a['b']=4;
printf("%d",a[98]); //it is 98 as according to the table 'b' is assigned the value of 98
}
Executing this code, I get the output 4.
I am really new to coding so feel free to correct me.

Character values are represented using some kind of integer encoding - ASCII (very common), EBCDIC (mostly IBM mainframes), UTF-8 (backward-compatible to ASCII), etc.
The character value 'a' maps to some integer value - 97 in ASCII and UTF-8, 129 in EBCDIC. So yes, you can use a character value to index into an array - arr['a']++ would be equivalent to arr[97]++ if you were using ASCII or UTF-8.
The C language does not dictate this - it's determined by the underlying platform.

How to find the address of a variable when using AVR?

I am trying to write a program that detects pixel-level collisions of bitmaps on a Teensy microcontroller, compiling with AVR-GCC. I am trying to work out how to calculate the position of a single byte of the bitmap on the screen and have been told I should be using pointers. I don't see the relationship between the physical address of a bitmap byte and its position on the screen, but I would like to investigate. The problem is, I have no way of printing this address. AVR doesn't have printf and I don't know how to get it to display on the screen. Does anyone have a way of producing this address somehow in the terminal?
i.e. if I have a bitmap and want to print out the address of the first byte, what would I need to write to complete this:
??? *bit = &car_bitmap[1];
???("%??? ", bit);

Use snprintf and send the resulting string to the terminal. Be aware that it is very costly on an AVR microcontroller. If you use GCC's address-space extensions, you may have to link in the support for long numbers.

Assuming you have a working printf(), this should work:
void * const bit = &car_bitmap[1];
printf("%p\n", bit);
Where %p is how to print a void *. Other pointer types should be cast to void * in order to match, but I used a void * for the address into the framebuffer anyway.
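If neither printf() nor snprintf() is available or affordable, the address can also be turned into text by hand. The following is only a sketch: format_hex is a hypothetical helper, and how the resulting string reaches the terminal or screen depends on whatever output routine the project already has.

```c
#include <stddef.h>
#include <stdint.h>

/* Write value as "0x" followed by upper-case hex digits into out.
   Returns the number of characters written (excluding the NUL),
   or 0 if out_size is too small. */
static size_t format_hex(uintptr_t value, char *out, size_t out_size)
{
    static const char digits[] = "0123456789ABCDEF";
    char tmp[2 * sizeof(uintptr_t)];
    size_t n = 0, i = 0;

    do {                          /* collect digits, least significant first */
        tmp[n++] = digits[value & 0xF];
        value >>= 4;
    } while (value != 0);

    if (out_size < n + 3)         /* "0x" + digits + NUL */
        return 0;

    out[i++] = '0';
    out[i++] = 'x';
    while (n > 0)                 /* emit digits, most significant first */
        out[i++] = tmp[--n];
    out[i] = '\0';
    return i;
}
```

For the bitmap in the question, the call would look like format_hex((uintptr_t)&car_bitmap[1], buf, sizeof buf), with buf being a small char array that is then passed to the display or serial output function.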

printing the hex values after storing it to an array using C

I have read from a file and stored the contents as hex values (say the first values are D4 C3). These are then stored in a buffer of char datatype. But whenever I print the buffer I get values like buff[0]=ffffffD4; buff[1]=ffffffC3 and so on.
How can I store the actual value to the buffer without any added bytes?
Attaching the snippet along with this
ic= (char *)malloc(1);
temp = ic;
int i=0;
char buff[1000];
while ((c = fgetc(pFile)) != EOF)
{
printf("%x",c);
ic++;
buff[i]=c;
i++;
}
printf("\n\nNo. of bytes written: %d", i);
ic = temp;
int k;
printf("\nBuffer value is : ");
for(k=0;k<i;k++)
{
printf("%x",buff[k]);
}

The problem is a combination of two things:
First is that when you pass a smaller type to a variable argument function like printf it's converted to an int, which might include sign extension.
The second is that the format "%x" you are using expects the corresponding argument to be an unsigned int and treats it as such.
If you want to print a hexadecimal character, then use the prefix hh, as in "%hhx".
See e.g. this printf (and family) reference for more information.
Finally, if you only want to treat the data you read as binary data, then you should consider using uint8_t (or unsigned char) for the buffer. An unsigned element type avoids the sign extension altogether, and on any platform with 8-bit char it has the same size as char while giving more information to the reader of the code (saying "this is binary data and not a string").
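A small sketch of the effect, under the assumption that the buffer stays a plain char array as in the question: converting each element to unsigned char before it is promoted keeps the value in the range 0..255, so exactly two hex digits come out. The helper name bytes_to_hex is made up for illustration.

```c
#include <stddef.h>
#include <stdio.h>

/* Render len bytes of buf as lower-case hex into out.
   Returns the number of hex characters written, or -1 if out is too small. */
static int bytes_to_hex(const char *buf, size_t len, char *out, size_t out_size)
{
    size_t i;

    if (out_size < 2 * len + 1)
        return -1;

    for (i = 0; i < len; i++) {
        /* The cast stops sign extension when the char is promoted. */
        snprintf(out + 2 * i, 3, "%02x", (unsigned)(unsigned char)buf[i]);
    }
    out[2 * len] = '\0';
    return (int)(2 * len);
}
```

In the question's print loop, the equivalent one-line change would be printf("%02x", (unsigned char)buff[k]).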

By default, char is signed on many platforms (the standard doesn't dictate its signedness). When a char is passed in a variable argument list, the default promotions such as char -> int are applied. If char is unsigned, 0xd3 remains the integer 0xd3. If char is signed, 0xd3 becomes 0xffffffd3 (for a 32-bit int), because that is the representation of the same integer value, -45.
NB: if you weren't aware of this, you should recheck the entire program, because such errors are very subtle. I once dealt with a tool that worked properly only with -funsigned-char forced into make's CFLAGS. OTOH, this flag, if available to you, could be a quick-and-dirty solution to this issue (but I suggest avoiding it as a long-term approach).
The approach I use consistently is to pass 0xff & c rather than c to printf()-like functions; it's visually easy to understand and stable across compiler versions. You can consider using the hh modifier (UPD: as @JoachimPileborg has already suggested), but I'm unsure it's supported in all real C flavors, including MS and embedded ones. (MSDN doesn't list it at all.)

You did store the actual values in the buffer without the added bytes. You're just outputting the signed numbers with more digits. It's like you have "-1" in your buffer but you're outputting it as "-01". The value is the same, it's just you're choosing to sign extend it in the output code.

C Newbie, ascii control function

I have written a C program that works well and converts non-readable ASCII to their character values. I would appreciate it if a C master would show me a better way of doing it than I have currently done, mainly this section:
if (isascii(ch)) {
switch (ch) {
case 0:
printControl("NUL");
break;
case 1:
printControl("SOH");
break;
.. etc (32 in total)
default:
putchar(ch);
break;
}
}
Is it normal to make a switch that big? Or should I be using some other method (input from an ascii table?)

If you're always doing the same operation (e.g., putchar), you can statically initialize an array that maps each character code to the string it should produce. You can then look up the mapping by using the incoming character as the index into the array.
For example (in rough code; it's been a while since I wrote in C), you would define:
const char *map[] = {"NUL", "SOH", ...};
and then index that smartly via something like:
const char* val = map[((int)ch)];
to get your value.
You would not be able to use this if your "from" values are not sequential; in that case, you would need to have some conditional blocks. But if you can leverage the sequentiality, you should.
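Spelled out for the 32 control codes from the question, the table approach could be sketched like this. The names control_names and control_name are made up for illustration; the abbreviations are the standard ASCII ones, and DEL is handled separately because 127 is not contiguous with 0..31.

```c
#include <stddef.h>

/* Standard ASCII abbreviations for control codes 0..31, indexed by code. */
static const char *const control_names[32] = {
    "NUL", "SOH", "STX", "ETX", "EOT", "ENQ", "ACK", "BEL",
    "BS",  "HT",  "LF",  "VT",  "FF",  "CR",  "SO",  "SI",
    "DLE", "DC1", "DC2", "DC3", "DC4", "NAK", "SYN", "ETB",
    "CAN", "EM",  "SUB", "ESC", "FS",  "GS",  "RS",  "US"
};

/* Return the abbreviation for a control character,
   or NULL if ch is an ordinary character that can be printed as-is. */
static const char *control_name(int ch)
{
    if (ch >= 0 && ch < 32)
        return control_names[ch];
    if (ch == 127)
        return "DEL";
    return NULL;
}
```

The switch in the question then shrinks to a single lookup: call printControl() when control_name(ch) returns a name, and putchar(ch) when it returns NULL.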

Too many years ago when assembly languages for 8-bit micros were how I spent my time, I would have written something like
printf("%3.3s",
&("NULSOHSTXETXEOTENQACKBELBS HT LF VT FF CR SO SI "
"DLEDC1DC2DC3DC4NAKSYNETBCANEM SUBESCFS GS RS US ")[3*ch]);
but not because it's particularly better. And the multiply by three is annoying because 8-bit micros don't multiply, so it would have required both a shift and an add, as well as a spare register.
A much more C-like result would be to use a table with four bytes per control, with the NUL bytes included. That allows each entry to be referred to as a string constant, but saves the extra storage for 32 pointers.
const char *charname(int ch) {
if (ch >= 0 && ch <= 0x20)
return ("NUL\0" "SOH\0" "STX\0" "ETX\0" /* 00..03 */
"EOT\0" "ENQ\0" "ACK\0" "BEL\0" /* 04..07 */
"BS\0\0" "HT\0\0" "LF\0\0" "VT\0\0" /* 08..0B */
"FF\0\0" "CR\0\0" "SO\0\0" "SI\0\0" /* 0C..0F */
"DLE\0" "DC1\0" "DC2\0" "DC3\0" /* 10..13 */
"DC4\0" "NAK\0" "SYN\0" "ETB\0" /* 14..17 */
"CAN\0" "EM\0\0" "SUB\0" "ESC\0" /* 18..1B */
"FS\0\0" "GS\0\0" "RS\0\0" "US\0\0" /* 1C..1F */
"SP\0\0") + (ch<<2); /* 20 */
if (ch == 0x7f)
return "DEL";
if (ch == EOF)
return "EOF";
return NULL;
}
I've tried to format the main table so its organization is clear. The function returns NULL for characters that name themselves, or are not 7-bit ASCII. Otherwise, it returns a pointer to a NUL-terminated ASCII string containing the conventional abbreviation of that control character, or "EOF" for the non-character EOF returned by C standard IO routines on end of file.
Note the effort taken to pad each character name slot to exactly four bytes. This is a case where building this table with a scripting language or a separate program would be a good idea. In that case, the simple answer is to build a 129-entry table (or 257-entry) containing the names of all 7-bit ASCII (or 8-bit extended in your preferred code page) characters with an extra slot for EOF.
See the sources to the functions declared in <ctype.h> for a sample of handling the extra space for EOF.

You can make a switch this big but it does become a bit difficult to manage.
The way I would approach this is to build an array with char c; char* ctrl; for each item. Then you could just loop through the array. This would make it a little easier to maintain the data.
Note that if you use every character in a particular range (for example, character 0 through 32), then your array would only need the name and it wouldn't be necessary to store the character value.

I would say build a table with the vals (0-32) and their corresponding control string ("NUL", "SOH"). (In this case the table requires just an array)
Then you can just check if it is in range and index into the table to get the string to pass to your printControl() function.
