Declaring Pascal-style strings in C

In C, is there a good way to define length-first, Pascal-style strings as constants, so they can be placed in ROM? (I'm working with a small embedded system with a non-GCC ANSI C compiler.)
A C string is 0-terminated, e.g. {'f','o','o',0}.
A Pascal string has the length in the first byte, e.g. {3,'f','o','o'}.
I can declare a C-string to be placed in ROM with:
const char *s = "foo";
For a Pascal-string, I could manually specify the length:
const char s[] = {3, 'f', 'o', 'o'};
But, this is awkward. Is there a better way? Perhaps in the preprocessor?

I think the following is a good solution, but don't forget to enable packed structs:
#include <stdio.h>
#define DEFINE_PSTRING(var,str) const struct {unsigned char len; char content[sizeof(str)];} (var) = {sizeof(str)-1, (str)}
DEFINE_PSTRING(x, "foo");
/* Expands to following:
const struct {unsigned char len; char content[sizeof("foo")];} x = {sizeof("foo")-1, "foo"};
*/
int main(void)
{
printf("%d %s\n", x.len, x.content);
return 0;
}
One catch is that it adds an extra NUL byte after your string, but that can be desirable because you can then use it as a normal C string too. You also need to cast it to whatever type your external library is expecting.
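Whether packing needs to be enabled depends on the compiler: a struct made of an unsigned char followed by a char array has no padding on most ABIs. Here is a minimal sketch for checking the layout, assuming a tagged variant of the macro. The names DEFINE_TAGGED_PSTRING, greeting, pstring_is_packed and pstring_raw_byte are illustrative and not part of the answer above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative variant of DEFINE_PSTRING: the struct gets a tag
   (<var>_pstr) so that offsetof can inspect its layout. */
#define DEFINE_TAGGED_PSTRING(var, str) \
    const struct var##_pstr { unsigned char len; char content[sizeof(str)]; } \
    var = { sizeof(str) - 1, str }

DEFINE_TAGGED_PSTRING(greeting, "foo");

/* Returns 1 when the characters start right after the length byte,
   i.e. the compiler inserted no padding. */
int pstring_is_packed(void)
{
    return offsetof(struct greeting_pstr, content) == 1;
}

/* Returns the raw byte at index i, as code reading the ROM image would see it. */
unsigned char pstring_raw_byte(size_t i)
{
    assert(i < sizeof greeting);
    return ((const unsigned char *)&greeting)[i];
}
```

If pstring_is_packed() returns 0 on your toolchain, that is the case where the packing pragma or attribute is actually required.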

Clang and Apple's GCC builds (and possibly others) accept the -fpascal-strings option, which lets you declare Pascal-style string literals by starting the string with \p, e.g. "\pfoo". Not exactly portable, but certainly nicer than funky macros or constructing the strings at run time.

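You can still use a const char * literal with an escape sequence as its first character that gives the length. Use an octal escape (or split the literal), because a hex escape consumes every following hex digit: in "\x03foo" the f would be read as part of the escape.
const char *pascal_string = "\003foo";
It will still be null-terminated, but that probably doesn't matter.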
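One caveat with escape-prefixed literals is that a \x escape keeps consuming hex digits, and f is a hex digit. The sketch below shows two spellings that avoid the problem and a helper that compares them byte for byte. The names pascal_octal, pascal_split and pascal_literals_match are made up for the illustration.

```c
#include <assert.h>
#include <string.h>

/* Octal escapes take at most three digits, so "\003" cannot swallow the 'f'. */
static const char pascal_octal[] = "\003foo";

/* Splitting the literal ends the hex escape before the text starts;
   adjacent literals are joined after escapes have been processed. */
static const char pascal_split[] = "\x03" "foo";

/* Returns 1 if both spellings produce the bytes {3, 'f', 'o', 'o', 0}. */
int pascal_literals_match(void)
{
    static const char expected[] = {3, 'f', 'o', 'o', 0};
    assert(sizeof pascal_octal == sizeof expected);
    return memcmp(pascal_octal, expected, sizeof expected) == 0
        && memcmp(pascal_split, expected, sizeof expected) == 0;
}
```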

It may sound a little extreme but if you have many strings of this kind that need frequent updating you may consider writing your own small tool (a perl script maybe?) that runs on the host system, parses an input file with a custom format that you can design to your own taste and outputs a .c file. You can integrate it to your makefile or whatever and live happily ever after :)
I'm talking about a program that will convert this input (or another syntax that you prefer):
s = "foo";
x = "My string";
To this output, which is a .c file:
const char s[] = {3, 'f', 'o', 'o'};
const char x[] = {9, 'M', 'y', ' ', 's', 't', 'r', 'i', 'n', 'g'};
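The core of such a generator is small. As a rough sketch in C rather than Perl, the hypothetical function emit_pstring_decl below formats one declaration into a buffer. It assumes the text contains only printable characters without quotes or backslashes; a real tool would have to escape those and parse the input file.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical host-side helper: writes "const char NAME[] = {LEN, 'c', ...};"
   into out. Returns 0 on success, -1 if the text is longer than 255
   characters or the output buffer is too small.
   Assumption: text contains no quote or backslash characters. */
int emit_pstring_decl(const char *name, const char *text, char *out, size_t outsize)
{
    size_t len, used, i;
    int n;

    assert(name != NULL && text != NULL && out != NULL);
    len = strlen(text);
    if (len > 255)
        return -1;
    n = snprintf(out, outsize, "const char %s[] = {%u", name, (unsigned)len);
    if (n < 0 || (size_t)n >= outsize)
        return -1;
    used = (size_t)n;
    for (i = 0; i < len; i++) {
        n = snprintf(out + used, outsize - used, ", '%c'", text[i]);
        if (n < 0 || (size_t)n >= outsize - used)
            return -1;
        used += (size_t)n;
    }
    n = snprintf(out + used, outsize - used, "};");
    if (n < 0 || (size_t)n >= outsize - used)
        return -1;
    return 0;
}
```

Because this runs on the host, it can use C99 library functions such as snprintf even if the target compiler is ANSI C only.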

My approach would be to create functions for dealing with Pascal strings:
void cstr2pstr(const char *cstr, char *pstr) {
int i;
for (i = 0; cstr[i]; i++) {
pstr[i+1] = cstr[i];
}
pstr[0] = i;
}
void pstr2cstr(const char *pstr, char *cstr) {
int i;
for (i = 0; i < (unsigned char)pstr[0]; i++) { /* cast: plain char may be signed, lengths above 127 would go negative */
cstr[i] = pstr[i+1];
}
cstr[i] = 0;
}
Then I could use it this way:
int main(int arg, char *argv[]) {
char cstr[] = "ABCD", pstr[5], back[5];
cstr2pstr(cstr, pstr);
pstr2cstr(pstr, back);
printf("%s\n", back);
return 0;
}
This seems simple, straightforward, less error-prone and not especially awkward. It may not be the solution to your problem, since the conversion happens at run time rather than in ROM, but I would recommend that you at least think about using it.
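The functions above trust the caller to supply large enough buffers. A possible bounds-checked variant is sketched below; the names cstr_to_pstr and pstr_to_cstr and the extra size parameters are assumptions added for the illustration, not part of the answer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Converts a C string to a Pascal string in pstr (pstr_size bytes available).
   Returns 0 on success, -1 if the string exceeds 255 characters or the
   destination is too small. */
int cstr_to_pstr(const char *cstr, unsigned char *pstr, size_t pstr_size)
{
    size_t len;

    assert(cstr != NULL && pstr != NULL);
    len = strlen(cstr);
    if (len > 255 || pstr_size < len + 1)
        return -1;
    pstr[0] = (unsigned char)len;   /* length byte first */
    memcpy(pstr + 1, cstr, len);    /* characters, without the '\0' */
    return 0;
}

/* Converts a Pascal string back to a C string in cstr (cstr_size bytes).
   Returns 0 on success, -1 if the destination is too small. */
int pstr_to_cstr(const unsigned char *pstr, char *cstr, size_t cstr_size)
{
    size_t len;

    assert(pstr != NULL && cstr != NULL);
    len = pstr[0];
    if (cstr_size < len + 1)
        return -1;
    memcpy(cstr, pstr + 1, len);
    cstr[len] = '\0';
    return 0;
}
```

Using unsigned char for the Pascal buffer sidesteps the question of whether plain char is signed on the target.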

You can apply sizeof to string literals as well. This allows the slightly less awkward
const char s[] = {sizeof "foo" - 1u, 'f', 'o', 'o'};
Note that the sizeof a string literal includes the terminating NUL character, which is why you have to subtract 1. But still, it's a lot of typing and obfuscated :-)

One option might be to abuse the preprocessor. By declaring a struct of the right size and populating it on initialization, it can be const.
#define DECLARE_PSTR(id,X) \
struct pstr_##id { char len; char data[sizeof(X)]; }; \
static const struct pstr_##id id = {sizeof(X)-1, X};
#define GET_PSTR(id) (const char *)&(id)
#pragma pack(push)
#pragma pack(1)
DECLARE_PSTR(bob, "foo");
#pragma pack(pop)
int main(int argc, char *argv[])
{
const char *s = GET_PSTR(bob);
int len;
len = *s++;
printf("len=%d\n", len);
while(len--)
putchar(*s++);
return 0;
}

This is why flexible array members were introduced in C99 (to replace the "struct hack"). IIRC, Pascal strings were limited to a maximum length of 255. Note that this version allocates at run time, so it does not place the string in ROM.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <limits.h> // For CHAR_BIT
struct pstring {
unsigned char len;
char dat[];
};
struct pstring *pstring_new(char *src, size_t len)
{
struct pstring *this;
if (!len) len = strlen(src);
/* if the size does not fit in the ->len field: just truncate ... */
if (len >=(1u << (CHAR_BIT * sizeof this->len))) len = (1u << (CHAR_BIT * sizeof this->len))-1;
this = malloc(sizeof *this + len);
if (!this) return NULL;
this->len = len;
memcpy (this->dat, src, len);
return this;
}
int main(void)
{
struct pstring *pp;
pp = pstring_new("Hello, world!", 0);
printf("%p:[%u], %*.*s\n", (void*) pp
, (unsigned int) pp->len
, (int) pp->len /* field width for %*.*s must be an int */
, (int) pp->len /* precision for %*.*s must be an int */
, pp->dat
);
return 0;
}

You can define an array in the way you like, but note that this syntax is not adequate:
const char *s = {3, 'f', 'o', 'o'};
You need an array instead of a pointer:
const char s[] = {3, 'f', 'o', 'o'};
Note that a char will only store numbers up to 255 (considering it's not signed) and this will be your maximum string length.
Don't expect this to work where other strings would, however. A C string is expected to terminate with a null character not only by the compiler, but by everything else.
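Standard functions can still handle such an array if you pass the length explicitly. As a small sketch, printf-style functions accept a precision with %.*s, which limits how many bytes are read, so no terminator is needed. The names rom_name and pstr_format are invented for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* A Pascal string without a terminating '\0'. */
static const char rom_name[] = {3, 'f', 'o', 'o'};

/* Formats the Pascal string p into out as an ordinary C string.
   Returns the number of characters written, or -1 if out is too small. */
int pstr_format(const char *p, char *out, size_t outsize)
{
    int len, n;

    assert(p != NULL && out != NULL);
    len = (unsigned char)p[0];                      /* length byte, 0..255 */
    n = snprintf(out, outsize, "%.*s", len, p + 1); /* reads at most len bytes */
    return (n < 0 || (size_t)n >= outsize) ? -1 : n;
}
```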

Here's my answer, complete with an append operation that uses alloca() for automatic storage.
#include <stdio.h>
#include <string.h>
#include <alloca.h>
struct pstr {
unsigned length;
char *cstr;
};
#define PSTR(x) ((struct pstr){sizeof x - 1, x})
struct pstr pstr_append (struct pstr out,
const struct pstr a,
const struct pstr b)
{
memcpy(out.cstr, a.cstr, a.length);
memcpy(out.cstr + a.length, b.cstr, b.length + 1);
out.length = a.length + b.length;
return out;
}
#define PSTR_APPEND(a,b) \
pstr_append((struct pstr){0, alloca(a.length + b.length + 1)}, a, b)
int main()
{
struct pstr a = PSTR("Hello, Pascal!");
struct pstr b = PSTR("I didn't C you there.");
struct pstr result = PSTR_APPEND(PSTR_APPEND(a, PSTR(" ")), b);
printf("\"%s\" is %u chars long.\n", result.cstr, result.length);
return 0;
}
You could accomplish the same thing using C strings and strlen. Because both alloca and strlen prefer short strings, I think that would make more sense.

Related

Can't print a string in a struct

I'm trying to copy a string into a struct, but it doesn't display anything. Can you help me figure out where the problem is?
typedef struct{
long tipo;
char *buffer;
}msg;
msg mess;
strcpy(mess.buffer,"hello");
printf("%s\n",mess.buffer);
Look at the declaration of strcpy:
char * strcpy ( char * destination, const char * source );
It copies the chars from source and stores them in destination. Note that the length of destination is not specified, so problems arise if the destination is:
Smaller than the source (buffer overflow)
Not pointing to allocated memory (typically a segmentation fault)
This is because strcpy copies char by char until it reaches the end of the string. A simplified implementation looks like this:
char *strcpy(char *destination , const char *source ){
char *saved = destination;
while (*source){ // while the current char is not the terminating '\0'
*destination++ = *source++; // copy one char and advance both pointers
}
*destination = 0; // last position is set to 0 ('\0', the end of the string)
return saved;
}
So when you call strcpy(mess.buffer,"hello"), mess.buffer has never been set to point at allocated memory: it is an uninitialized (or null) pointer, so strcpy writes through an invalid address. That is undefined behavior, which typically shows up as a segmentation fault.
Finally, you could do:
/* Note that "hello" occupies 6 char spaces: 'h', 'e', 'l', 'l', 'o', '\0' */
int mySize = 10;
mess.buffer = malloc(mySize * sizeof(char));
strcpy(mess.buffer, "hello"); // 10 > 6 so OK
Either assign the string directly (note that you will no longer be able to modify this string):
msg mess = {.buffer= "hello"};
or use the malloc and memcpy functions from libc to allocate memory and copy the bytes of your string, including the terminating '\0', into it:
if (!(mess.buffptr = malloc(sizeof(char) * (strlen(s) + 1))))
return 1;
memcpy(mess.buffptr, s, strlen(s) + 1);
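The allocate-then-copy step can be wrapped in a helper so the size is computed once and the terminator is never forgotten. This is only a sketch; dup_cstring is a hypothetical name, and the caller is responsible for calling free on the result.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a freshly allocated, modifiable copy of src (including its
   terminating '\0'), or NULL if the allocation fails. */
char *dup_cstring(const char *src)
{
    size_t size;
    char *copy;

    assert(src != NULL);
    size = strlen(src) + 1;        /* +1 for the terminating '\0' */
    copy = malloc(size);
    if (copy != NULL)
        memcpy(copy, src, size);   /* copies the terminator as well */
    return copy;
}
```

With the struct from the question, this would be used as mess.buffer = dup_cstring("hello"); followed by a NULL check.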
But I think that what you are looking for is rather to use char buffer[128] instead of a char pointer, see the following example:
#include <stdint.h>
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
typedef struct{
long tipo;
char buffer[128];
char *buffptr; // remove if not needed
} msg;
int main(){
char *s = "hello";
msg mess = {.buffer= "hello"};
mess.buffer[1] = 'a';
printf("'%s' assigned at struct initialization\n",mess.buffer);
(void)mess.buffer;
strcpy(mess.buffer,"hello");
printf("'%s' char array strcpy\n",mess.buffer);
size_t len = strlen(s) + 1; // to account for '\0'
if (!(mess.buffptr = malloc(len)))
return 1;
memcpy(mess.buffptr, s, len); // buffptr holds exactly len bytes
printf("'%s' char ptr malloced\n",mess.buffptr);
free(mess.buffptr);
return 0;
}
It is worth reading up on the difference between a char pointer and a char array.

Initializing strings in C in Visual Studio

I am trying to learn the XOR algorithm using C. I have found a great example on Kyle Banks's GitHub:
#include <stdio.h>
#include <string.h>
void encryptDecrypt(char *input, char *output) {
char key[] = {'K', 'C', 'Q'}; //Can be any chars, and any size array
int i;
for(i = 0; i < strlen(input); i++) {
output[i] = input[i] ^ key[i % (sizeof(key)/sizeof(char))];
}
}
int main (int argc, char *argv[]) {
char baseStr[] = "kylewbanks.com";
char encrypted[strlen(baseStr)];
encryptDecrypt(baseStr, encrypted);
printf("Encrypted:%s\n", encrypted);
char decrypted[strlen(baseStr)];
encryptDecrypt(encrypted, decrypted);
printf("Decrypted:%s\n", decrypted);
}
The above works well under Linux and gcc.
However, it does not compile in Visual Studio under Windows.
I am using build tools included in Visual Studio 2017.
What am I doing wrong?
Microsoft's compiler does not support C99 VLAs, so array sizes must be constant expressions. The code is also broken because it fails to allocate room for, and write, a NUL terminator in the output.
In this case, decrypted and encrypted might be declared thus:
char encrypted[sizeof(baseStr)] ;
...
char decrypted[sizeof(baseStr)] ;
And encryptDecrypt() modified thus:
void encryptDecrypt(char *input, char *output) {
...
output[i] = 0 ;
}
Finally, the signed/unsigned mismatch warning can be cleaned up by declaring i as type size_t.
On Windows of course you could always use MinGW/GCC if you want more modern C support. Or you could use C++ and std::string or std::vector containers if you want to stick with Microsoft's compiler.
Use malloc for dynamic memory allocation. Requires #include <stdlib.h>
char baseStr[] = "123";
char *encrypted = malloc(strlen(baseStr) + 1);
...
free(encrypted);
As mentioned before, you have to add 1 for the null terminator at the end.
The char* pointer is one piece of information, it shows where the string begins. But where does it end? strlen and other C functions have no idea where the string ends, so they go through all the characters until a '\0' character is encountered.
For efficiency, take strlen(input) out of the loop and calculate it only once:
void encryptDecrypt(char *input, char *output)
{
char key[] = { 'K', 'C', 'Q' };
int keysize = sizeof(key);
size_t i;
size_t len = strlen(input);
for(i = 0; i < len; i++)
output[i] = input[i] ^ key[i % keysize];
output[len] = 0; //will be same as output[i] = 0;
}
The function int main should return zero. Note that this method cannot be described as "encryption" by modern standards. You can call it "obfuscation".
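One more pitfall: XOR produces a zero byte whenever an input byte equals its key byte, so the "encrypted" data is not a valid C string, and printing it with %s or measuring it with strlen can cut it short. A sketch that treats the data as a buffer with an explicit length is shown below; xor_buffer is a hypothetical name and the key is the one from the question.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* XORs len bytes of in with the repeating key and stores them in out.
   The length is passed explicitly because the output may contain zero bytes. */
void xor_buffer(const unsigned char *in, unsigned char *out, size_t len)
{
    static const unsigned char key[] = { 'K', 'C', 'Q' };
    size_t i;

    assert(in != NULL && out != NULL);
    for (i = 0; i < len; i++)
        out[i] = in[i] ^ key[i % sizeof key];
}
```

Applying the function twice with the same length restores the original bytes.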

C Macro to convert a string to a pascal string type

I would like some ideas for a macro to convert a preprocessor defined string to a pascal type string and then be able to use the macro to initialize const char arrays and the like.
Something like this would be great:
#define P_STRING_CONV(str) ...???...
const char *string = P_STRING_CONV("some string");
struct
{
char str[30];
...
}some_struct = {.str = P_STRING_CONV("some_other_string")};
I already tried something like this:
#define DEFINE_PASCAL_STRING(var, str, strlen) struct {uint8_t len; char content[strlen-1];} (var) = {sizeof(str)-1, (str)}
(The strlen parameter could be removed, but I need a defined size.)
That works fine, but cannot be used to initialize elements in a struct. And for const char arrays I need to cast it to some other variable.
Any great ideas?
to convert a string to a pascal string type
To convert a string literal, a _Generic selection and a compound literal get close to the OP's objective.
For a better solution, more details and example use cases would help illustrate the OP's goal.
#define P_STRING_CONV(X) _Generic((X)+0, \
char *: &((struct {char len; char s[sizeof(X)-1]; }){ (char)(sizeof(X)-1), (X) }).len \
)
void dump(const char *s) {
unsigned length = (unsigned char) *s++;
printf("L:%u \"", length);
while (length--) {
printf("%c", *s++);
}
printf("\"\n");
}
int main(void) {
dump(P_STRING_CONV(""));
dump(P_STRING_CONV("A"));
dump(P_STRING_CONV("AB"));
dump(P_STRING_CONV("ABC"));
return 0;
}
Output
L:0 ""
L:1 "A"
L:2 "AB"
L:3 "ABC"
@Jonathan Leffler recommended that the created Pascal-like string also contain a terminating null character. To do so with the above code, simply change the array size sizeof(X)-1 to sizeof(X) (the length initializer stays sizeof(X)-1). Then pascal_like_string + 1 is a pointer to a valid C string.
(X)+0 converts the array type to a pointer, so that _Generic can match char *.
sizeof(X)-1 is the length of the string literal, not counting its \0.
struct {char len; char s[sizeof(X)-1]; } is a right-sized Pascal-like structure.
(struct {char len; char s[sizeof(X)-1]; }){ (char)(sizeof(X)-1), (X) } is a compound literal of that type.
The following will convert a C string to a pascal like string. Note that as a pascal like string, there is no '\0'.
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>
char *pstring_convert(char *s) {
size_t len = strlen(s);
assert(len <= UCHAR_MAX);
memmove(s+1, s, len);
s[0] = (char) (unsigned char) len;
return s;
}
You could split the macro into two:
#define PASCAL_STRING_TYPE(size) struct { unsigned char len; char content[(size) - 1]; }
#define PASCAL_STRING_INIT(str) { .len = sizeof(str) - 1, .content = (str) }
Then use it like so:
static const PASCAL_STRING_TYPE(100) foo = PASCAL_STRING_INIT("foo");
struct bar {
int answer;
PASCAL_STRING_TYPE(100) question;
};
static const struct bar quux = {
.answer = 42,
.question = PASCAL_STRING_INIT("The Answer")
};
(Not tested.)
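To check the idea, here is a self-contained sketch of the same two macros with a small comparison helper. The only change to the macros is that the string is passed to .content without parentheses, since a parenthesized string literal is not strictly a valid array initializer. The helper question_equals is a made-up name.

```c
#include <assert.h>
#include <string.h>

#define PASCAL_STRING_TYPE(size) struct { unsigned char len; char content[(size) - 1]; }
#define PASCAL_STRING_INIT(str) { .len = sizeof(str) - 1, .content = str }

struct bar {
    int answer;
    PASCAL_STRING_TYPE(100) question;   /* 1 length byte + 99 content bytes */
};

static const struct bar quux = {
    .answer = 42,
    .question = PASCAL_STRING_INIT("The Answer")
};

/* Returns 1 if the Pascal string stored in b->question equals text. */
int question_equals(const struct bar *b, const char *text)
{
    size_t n;

    assert(b != NULL && text != NULL);
    n = strlen(text);
    return n == b->question.len && memcmp(b->question.content, text, n) == 0;
}
```

The unused tail of content is zero-filled by the initializer, so the stored text also happens to be a valid C string as long as it is shorter than the array.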

What is a good way to allocate a sized string buffer on the stack?

Using pure C (and the C-preprocessor) I would like to create a string buffer that includes a length parameter, so that I could pass it around easily and any functions that operate on it could do so without danger of writing past the end. This is simple enough:
typedef struct strbuf_t_ {
int length;
char str[1];
} strbuf_t;
but then if I want a small scratch space on the stack to format some output text, there is no trivial way to allocate a strbuf_t on the stack. What I would like is some clever macro that allows me to do:
STRBUF(10) foo;
printf("foo.length = %d\n", foo.length); // outputs: foo.length = 10
strncpy(foo.str, "this is too long", foo.length);
Unfortunately I don't seem to be able to do that. The best I've come up with is:
#define STRBUF(size, name) \
struct {\
strbuf_t buf;\
char space[size - 1];\
char zero;\
} name ## _plus_space_ = { .buf={.length=size}, .space="", .zero='\0'};\
strbuf_t *name = &name ## _plus_space_.buf
int main(void)
{
STRBUF(10, a);
strncpy(a->str, "Hello, world!", a->length);
printf("a->length = %d\n", a->length); // outputs: a->length = 10
puts(a->str); // outputs: Hello, wor
}
This meets all of the requirements I listed, but a is a pointer not the structure itself, and the allocation is certainly not intuitive.
Has anyone come up with something better?
I think you are already pretty close to a solution. Just keep a char* in your struct and point it at a separate char array. To guarantee a trailing zero at the end of the string, allocate one extra char beyond the size and initialize the whole array with zeroes.
typedef struct
{
int length;
char* str;
} strbuf_t;
#define STRBUF(varname, size) \
char _buffer_ ## varname[size + 1] = {'\0'}; \
strbuf_t varname = { size, _buffer_ ## varname }
int main()
{
STRBUF(a, 10);
strncpy(a.str, "Hello, world!", a.length);
printf("a.length = %d\n", a.length);
puts(a.str);
}
Perhaps the following. Allocate the memory with an aligned VLA and then overlay.
typedef struct strbuf_t_ {
int length;
char str[];
} strbuf_t;
#include <stddef.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <stdalign.h>
int main(void) {
char *s = "Hello";
size_t length = strlen(s);
size_t n = length + 1; // characters plus the terminating '\0'
_Alignas(strbuf_t) unsigned char mem[sizeof (strbuf_t) + n];
strbuf_t *xx = (strbuf_t*) mem;
xx->length = length;
memcpy(xx->str, s, n); // copy exactly n bytes, including the '\0'
printf("int:%zu s:%zu n:%zu mem:%zu\n",
sizeof xx->length, sizeof (strbuf_t), n, sizeof mem);
return 0;
}
Output
int:4 s:4 n:6 mem:10
Note: C99 allows the last struct member to have an indefinite array count of []

Trouble Copying Char* to Char* in C

I have the following program I am trying to run but surely, due to my lack of good knowledge, my program crashes runtime:
#include <stdio.h>
#include "ptref.h"
mystruct_t *FRSt = NULL;
int main(int argc, char* argv[])
{
char ct[2] = {0, 1, '\0'};
char dd[2] = {0, 1, '\0'};
populate_contents(FRSt, 2, "FRES", ct, dd);
return 0;
}
HEADER
/*
* ptref.h
*
*/
#ifndef PTREF_H_
#define PTREF_H_
typedef struct mystruct
{
char* ct[2]; //
char* dd[2]; // = "0\0";
char* name[]; // = "1\0";
} mystruct_t;
extern mystruct_t p;
void populate_contents(mystruct_t* mystruct_var, int arrSize, char* name[], char* dd[], char* ct[])
{
/* Initialise arrays */
int i;
i = 0;
strncpy(mystruct_var->name, name, sizeof(name));
for (i = 0; i < arrSize; i++)
{
mystruct_var->dd[i] = dd[i];
mystruct_var->ct[i] = ct[i];
}
return;
}
#endif /* PTREF_H_ */
Because I am going to implement this in a real-time computer, I am not sure if using malloc will cause me any trouble. However, I have a feeling that I am having trouble because I have not used malloc for my mystruct_var pointer, or maybe it is my moronic code. Either way, further education and advice will be highly appreciated.
P.S. I have looked into the other relevant post but my problem is quite different. So, I posed a new question.
Firstly, in main(), the initialization char ct[2] = {0, 1, '\0'}; is incorrect: the array is declared with size 2 but given 3 initializers.
In the call populate_contents(FRSt, 2, "FRES", ct, dd);, the third argument is a character string, so the corresponding parameter should be a char array (char name[]) or a char pointer (char *name), not an array of pointers (char *name[]) as you defined it. The same goes for ct and dd: the parameters should be plain char pointers, since that is the type of the arguments passed.
Judging by how you use its members, your declaration of mystruct_t is also incorrect.
As Grijesh said, sizeof(name) is not what you want, because name is a pointer whose size is typically 4 or 8 bytes; use strlen() to get the length of the string you received.
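The crash itself most likely comes from FRSt being NULL when populate_contents dereferences it. Since malloc is a concern on a real-time system, one option is to give the struct static storage and store the data in fixed-size arrays. The sketch below is one possible layout under that assumption; NAME_CAPACITY, ARR_SIZE and frs_storage are hypothetical names, and the parameter order follows the call in main().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAME_CAPACITY 16   /* hypothetical maximum name size, including '\0' */
#define ARR_SIZE 2

typedef struct mystruct {
    char ct[ARR_SIZE];
    char dd[ARR_SIZE];
    char name[NAME_CAPACITY];
} mystruct_t;

/* Static storage: no malloc needed, and the pointer passed in is never NULL. */
static mystruct_t frs_storage;

void populate_contents(mystruct_t *dst, size_t arr_size,
                       const char *name, const char *ct, const char *dd)
{
    size_t i;

    assert(dst != NULL && arr_size <= ARR_SIZE);
    /* Bound the copy by the destination size, not by sizeof a pointer. */
    strncpy(dst->name, name, sizeof dst->name - 1);
    dst->name[sizeof dst->name - 1] = '\0';
    for (i = 0; i < arr_size; i++) {
        dst->ct[i] = ct[i];
        dst->dd[i] = dd[i];
    }
}
```

A call then looks like populate_contents(&frs_storage, 2, "FRES", ct, dd); with ct and dd declared as two-element char arrays.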
