What chars are special in C?

Hey guys, I have this text with special chars in it and I would like to "escape" the special chars so that my program compiles, but I do not know: are / : - = special chars? Do I need to escape them as well? Here is an example:
static const char *postthis="text and spec chars";
and here is example of text which I want to put in
<text:Text
text:text="http://text.text.text/text/text/"
text:text="text:text:text-text-text">
<text:Text>
<text text="http://text.com/text">
<productID>20630175</textID>
</text>
</text:text>
</text:text>
So I put \ before < and " but I still got an error. What do I need to escape, and how?
static const char *postthis="\<text:Text
text:text=\"http://text.text.text/text/text/\"
text:text=\"text:text:text-text-text\"\>
\<text:Text\>
\<text text="http://text.com/text\"\>
\<textID\>20630175\</textID\>
\</text\>
\</text:text\>
\</text:text\>";

I think you need to escape the end of each line in a multi-line initializer, as well as the quotes. If there were tabs or newlines, you would need to write those as \t and \n.
static const char *postthis="<text:Text\n\
text:text=\"http://text.text.text/text/text/\"\n\
text:text=\"text:text:text-text-text\">\n\
<text:Text>\n\
<text text=\"http://text.com/text\">\n\
<textID>20630175</textID>\n\
</text>\n\
</text:text>\n\
</text:text>";

The only characters you need to escape inside a C string literal are \ and ". There's no reason to escape <. Newlines can be escaped with a trailing backslash, or you can write them as \n (which I prefer, since it lets you indent more nicely IMO). You cannot simply embed a raw newline inside the quotes, however (which is probably your main problem).
XML supports both ' and " as quotation, so you can almost always simplify your life by using '.
static const char *postthis=
"<text:Text\n"
" text:text='http://text.text.text/text/text/'\n"
" text:text='text:text:text-text-text'>\n"
" <text:Text>\n"
" <text text='http://text.com/text'>\n"
" <productID>20630175</textID>\n"
" </text>\n"
" </text:text>\n"
"</text:text>";
Of course XML does not need the newlines, so you could just drop them unless you want this to be human-read.
Note that the above uses a multi-part C string (multiple "..." pieces that the compiler joins together). Concatenation of adjacent string literals has been part of standard C since C89, so it is not a GNU extension. If you would rather not use it, you can escape the newlines instead:
static const char *postthis=
"<text:Text\
text:text='http://text.text.text/text/text/'\
...
</text:text>";
But it makes the indentation a little harder to customize without making your code look worse.

Your time is much better spent putting that text into a file and reading that. Hand editing XML is guaranteed to backfire.
If you are working on an iOS project you'd be able to use the NSString initialiser:
- (id)initWithContentsOfFile:(NSString *)path usedEncoding:(NSStringEncoding *)enc error:(NSError **)error
Or if you are using vanilla C something along the lines of
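#include <stdio.h>
#define BUFF_SIZE 1024
int main(void) {
char s[BUFF_SIZE];
int c, i = 0;
FILE *fp = fopen( "file.xml", "r" );
if ( fp == NULL ) {
perror("file.xml");
return 1;
}
/* Test fgetc's return value; feof() only becomes true after a read has failed. */
while ( i < BUFF_SIZE-1 && (c = fgetc(fp)) != EOF ) {
s[i++] = (char)c;
}
s[i]='\0';
fclose(fp);
printf("My string is here %s\n", s);
return 0;
}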

See http://en.cppreference.com/w/cpp/language/escape for all escape sequences in a string literal. These will need to be escaped if you want them unchanged. Any new lines you want in your string will need to be written as \n.
As already mentioned earlier, you could keep the text in a file and read it at run time.
If you have a lot of text you want to compile into your program, another easy solution is to use objcopy:
objcopy -I binary -O elf64-x86-64 -B i386 --rename-section .data=.rodata,alloc,load,readonly,data,contents your_text.txt your_text.o
This will give you an object file with the following symbols in it:
_binary_your_text_txt_start
_binary_your_text_txt_end
_binary_your_text_txt_size
Link against your_text.o and use the symbols to access the text. For example:
#include <stdio.h>
extern char _binary_your_text_txt_start;
extern char _binary_your_text_txt_end;
extern char _binary_your_text_txt_size;
int main(int argc, char *argv[])
{
const char * b = &_binary_your_text_txt_start;
const char * e = &_binary_your_text_txt_end;
size_t s = (size_t)&_binary_your_text_txt_size;
fwrite(b, s, 1, stdout);
return 0;
}

Related

SWIG convert C-Pointer stringvalue to tcl string

Because of my limited knowledge of C and SWIG, I couldn't manage to adapt any public example for converting C char pointers to Tcl strings.
I always get stuck on the problem that my Tcl variable just doesn't get dereferenced,
like this:
tcl_str = _30e84c05ef550000_p_stringout2
string_pointer.c
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <time.h>
#include "string_pointer.h"
stringout2 Itla_Get_Model_Version (int laser, char * mv_string)
{
stringout2 * pointer2;
char *mod_ver ="PPCL600";
pointer2 = malloc( sizeof(stringout2) );
pointer2-> modelvers= *mod_ver;
printf ( "Itla_Get_Model_Version : read %s \n", mod_ver );
return *pointer2 ;
}
string_pointer.h
#include <sys/types.h>
#include <sys/resource.h>
typedef struct {
char * modelvers;
} stringout2;
stringout2 Itla_Get_Model_Version (int laser, char * mv_string) ;
string_pointer.swig
/* File : string_pointer.swig */
%module string_pointer
%{
#include "string_pointer.h"
%}
%include "typemaps.i"
%include "cpointer.i"
%include "cstring.i"
%typemap(argout) char* (char tmp) %{
$1 = &tmp;
%}
stringout2 Itla_Get_Model_Version (int laser, char *OUTPUT) ;
%include "string_pointer.h"
test.tcl
load ./string_pointer.so
proc test { laser } {
scan [Itla_Get_Model_Version $laser ] %s a
puts "$a "
return $a
}
set name [test 1 ]
puts "Itla_Get_Model_Version= $name"
when executing the tcl-script you get :
Itla_Get_Model_Version : read PPCL600
_f0a759f8d9550000_p_stringout2
Itla_Get_Model_Version= _f0a759f8d9550000_p_stringout2
So I finally need to dereference the pointer to get its value,
but I don't know how to do that.
The C function is given and can't be modified!
Does anybody out there know how to do it?
If your strings are basically ASCII or UTF-8, all you need to do is to tell SWIG that your function has allocated the string it is returning. For details, see the SWIG docs on C strings.
yourcode.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "yourcode.h"
char *Itla_Get_Model_Version (int laser, char * mv_string) {
// I assume this is a proxy for something more complicated...
const char *mod_ver ="PPCL600";
size_t len = strlen(mod_ver) + 1;
char *output = malloc(len);
if (output == NULL) return NULL; // allocation failed
memcpy(output, mod_ver, len);
printf ( "Itla_Get_Model_Version : read %s \n", mod_ver );
return output;
}
yourcode.h
char *Itla_Get_Model_Version(int laser, char * mv_string);
yourcode.swig
%module yourcode /* module name is a placeholder */
%{
#include "yourcode.h"
%}
/* Tell SWIG that this function returns something to be freed */
%newobject Itla_Get_Model_Version;
/* And now we can use the standard C header */
%include "yourcode.h"
If the above simple solution doesn't work…
Things get a lot more complicated if you are using a different encoding for your strings or if you wrap them inside a structure (as you did in your question). That's when you need a typemap, particularly ones of the Tcl variety. Correctly writing a typemap depends on understanding the semantics of the values that you are producing and/or consuming and the semantics of the language that you're using. Assuming you want the wrapping, here's a very simple output typemap that might work:
%typemap(out) stringout2* {
Tcl_SetObjResult(interp, Tcl_NewStringObj($1->modelvers, -1));
free($1);
}
Your function also needs to be modified to return a stringout2* by doing return pointer2;, and not a stringout2 since otherwise you will be leaking memory on every call. You can return a stringout2, but if you are doing that then you should not allocate it with malloc, but rather keep it as a structure directly in a local variable.
In that case, the typemap you'd use is:
%typemap(out) stringout2 {
Tcl_SetObjResult(interp, Tcl_NewStringObj($1.modelvers, -1));
}
(Note the different type, different access to the field, and lack of free.)
And your structure should be declared as containing a const char * if it really is that.
If you have strings in a different encoding (and it isn't ISO 8859-1, for which you can cheat and use a binary string using Tcl_NewByteArrayObj; that's also what you want for slabbing a chunk of binary data over) then you'll need to write a typemap using Tcl_ExternalToUtfDString, and the amount of boilerplate code goes up. Tcl insists that its internal strings are in (almost) UTF-8, and ASCII is OK too as that's a strict subset; everything else must be converted.
Ask another question if that's what you need. You probably are either dealing with ASCII or binary data, so I'll leave (quite a bit more complex!) encoding conversion alone until requested.

Which lines are necessary to use espeak in our C/C++ program?

I found this code on the internet:
#include <string.h>
#include <malloc.h>
#include <espeak/speak_lib.h>
espeak_POSITION_TYPE position_type;
espeak_AUDIO_OUTPUT output;
char *path=NULL;
int Buflength = 1000, Options=0;
void* user_data;
t_espeak_callback *SynthCallback;
espeak_PARAMETER Parm;
char Voice[] = {"English"};
char text[30] = {"this is a english test"};
unsigned int Size,position=0, end_position=0, flags=espeakCHARS_AUTO, *unique_identifier;
int main(int argc, char* argv[] )
{
output = AUDIO_OUTPUT_PLAYBACK;
int I, Run = 1, L;
espeak_Initialize(output, Buflength, path, Options );
espeak_SetVoiceByName(Voice);
const char *langNativeString = "en"; //Default to US English
espeak_VOICE voice;
memset(&voice, 0, sizeof(espeak_VOICE)); // Zero out the voice first
voice.languages = langNativeString;
voice.name = "US";
voice.variant = 2;
voice.gender = 1;
espeak_SetVoiceByProperties(&voice);
Size = strlen(text)+1;
espeak_Synth( text, Size, position, position_type, end_position, flags,
unique_identifier, user_data );
espeak_Synchronize( );
return 0;
}
I only want espeak to read the strings in my program, and the above code can do it, but I want to know: is all of this code necessary for that purpose? (I mean, is it possible to simplify it?)
Also, I would like to know whether there is a way to use espeak as a system function, i.e. system("espeak \"something\""); ?
The usage of eSpeak itself seems pretty minimal - you need to read the documentation for that. There are some minor C coding simplifications possible, but perhaps hardly worth the effort:
The memset() is unnecessary. The structure can be initialised to zero thus:
espeak_VOICE voice = {0} ;
If you declare text thus:
char text[] = "this is a English test";
Then you can avoid using strlen() and replace Size with sizeof(text).
The variables I, Run and L are unused and can be removed.
To be able to pass the text as a string on the command line, and thus be able to issue system( "espeak \"Say Something\"") ; for example, you simply need to pass argv[1] to espeak_Synth() instead of text (but you will need to reinstate the strlen() call to get the size).

Checking for a blank line in C - Regex

Goal:
Find if a string contains a blank line. Whether it be '\n\n',
'\r\n\r\n', '\r\n\n', '\n\r\n'
Issues:
I don't think my current regex for finding '\n\n' is right. This is my first time really using regex outside of simple use of * when removing files in command line.
Is it possible to check for all of these cases (listed above) in one regex, or do I have to do 4 separate calls to compile_regex?
Code:
int checkForBlankLine(char *reader) {
regex_t r;
compile_regex(&r, "*\n\n");
match_regex(&r, reader);
return 0;
}
void compile_regex(regex_t *r, char *matchText) {
int status;
regcomp(r, matchText, 0);
}
int match_regex(regex_t *r, char *reader) {
regmatch_t match[1];
int nomatch = regexec(r, reader, 1, match, 0);
if (nomatch) {
printf("No matches.\n");
} else {
printf("MATCH!\n");
}
return 0;
}
Notes:
I only need to worry about finding one blank line, that's why my regmatch_t match[1] is only one item long
reader is the char array containing the text I am checking for a blank line.
I have seen other examples and tried to base the code off of those examples, but I still seem to be missing something.
Thank you kindly for the help/advice.
If anything needs to be clarified please let me know.
It seems that you have to compile the regex as extended:
regcomp(&re, "\r?\n\r?\n", REG_EXTENDED);
The first atom, \r? is probably unnecessary, because it doesn't add to the blank-line condition if you don't capture the result.
In the above, blank line really means empty line. If you want blank line to mean a line that has no characters except for white space, you can use:
regcomp(&re, "\r?\n[ \t]*\r?\n", REG_EXTENDED);
(I don't think you can use the space character pattern, \s here instead of [ \t], because that would include carriage return and new-line.)
As others have already hinted at, the "simple use of * in the command line" is not a regular expression. This wildcard matching is called file globbing and has different semantics.
Check what the * in a regex means. It's not like the wildcard "anything" in the command line. The * means that the previous component can appear any number of times. The wildcard in regex is the dot (.). So if you want to say match anything you can do .*, which would be anything, any number of times.
So in your case you can do .*\n\n.* which would match anything that has \n\n.
Finally, you can use or in a regex and ( ) to group stuff. So you can do something like .*(\n\n|\r\n\r\n).* And that would match anything that has a \n\n or a \r\n\r\n.
Hope that helps.
Rather than looking for only \r or \n, why not look for anything that is not \r or \n?
Your regex would simply be
'[^\r\n]'
and, applied to one line at a time, a match result of false indicates a blank line to your specification.

How to remove the path to get the filename

How does one remove the path of a filepath, leaving only the filename?
I want to extract only the filename from a fts_path and store this in a char *fileName.
Here's a function to remove the path on POSIX-style (/-separated) pathnames:
#include <string.h>
const char *base_name(const char *pathname)
{
const char *lastsep = strrchr(pathname, '/');
return lastsep ? lastsep+1 : pathname;
}
If you need to support legacy systems with odd path separators (like MacOS 9 or Windows), you might need to adapt the above to search for multiple possible separators. For example on Windows, both / and \ are path separators and any mix of them can be used.
You want basename(3).
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <libgen.h>
int main(void)
{
char * path = "/homes/mk08/Desktop/lala.c";
char * tmp = strdup(path);
if(tmp) {
printf("%s\n", basename(tmp));
free(tmp);
}
return EXIT_SUCCESS;
}
This will output:
lala.c
I'm sure there is a less roundabout way of doing this, but you could always search through the filepath (I assume it is stored as a char array?), find the position of the final path separator ('/' on POSIX systems, '\' on Windows), and then discard everything up to and including it.

How to split a string literal across multiple lines in C / Objective-C?

I have a pretty long sqlite query:
const char *sql_query = "SELECT statuses.word_id FROM lang1_words, statuses WHERE statuses.word_id = lang1_words.word_id ORDER BY lang1_words.word ASC";
How can I break it in a number of lines to make it easier to read?
If I do the following:
const char *sql_query = "SELECT word_id
FROM table1, table2
WHERE table2.word_id = table1.word_id
ORDER BY table1.word ASC";
I am getting an error.
Is there a way to write queries in multiple lines?
There are two ways to split strings over multiple lines:
Each string on its own line. Works only with strings:
Plain C:
char *my_string = "Line 1 "
"Line 2";
Objective-C:
NSString *my_string = @"Line1 "
"Line2"; // the second @ is optional
Using \ at the end of the line (line continuation), which can be used for any expression:
Plain C:
char *my_string = "Line 1 \
Line 2";
Objective-C:
NSString *my_string = @"Line1 \
Line2";
The first approach is better, because the indentation of the continuation lines does not end up inside the string. For a SQL query, however, both are possible, since the extra whitespace is harmless there.
NOTE: With a #define, you have to add a \ at the end of the line so that the macro definition continues onto the next line:
Plain C:
#define kMyString "Line 1"\
"Line 2"
There's a trick you can do with the pre-processor.
It has the potential down sides that it will collapse white-space, and could be confusing for people reading the code.
But, it has the up side that you don't need to escape quote characters inside it.
#define QUOTE(...) #__VA_ARGS__
const char *sql_query = QUOTE(
SELECT word_id
FROM table1, table2
WHERE table2.word_id = table1.word_id
ORDER BY table1.word ASC
);
the preprocessor turns this into:
const char *sql_query = "SELECT word_id FROM table1, table2 WHERE table2.word_id = table1.word_id ORDER BY table1.word ASC";
I've used this trick when I was writing some unit tests that had large literal strings containing JSON. It meant that I didn't have to escape every quote character \".
You could also go into Xcode -> Preferences, select the Indentation tab, and turn on Line Wrapping.
That way, you won't have to type anything extra, and it will work for the stuff you already wrote. :-)
One annoying thing though is...
if (you're long on indentation
&& short on windows) {
then your code will
end up squished
against th
e side
li
k
e
t
h
i
s
}
I am having this problem all the time, so I made a tiny tool to convert text to an escaped multi-line Objective-C string:
http://multilineobjc.herokuapp.com/
Hope this saves you some time.
Extending the Quote idea for Objective-C:
#define NSStringMultiline(...) [[NSString alloc] initWithCString:#__VA_ARGS__ encoding:NSUTF8StringEncoding]
NSString *sql = NSStringMultiline(
SELECT name, age
FROM users
WHERE loggedin = true
);
One more solution for the pile, change your .m file to .mm so that it becomes Objective-C++ and use C++ raw literals, like this:
const char *sql_query = R"(SELECT word_id
FROM table1, table2
WHERE table2.word_id = table1.word_id
ORDER BY table1.word ASC)";
Raw literals ignore everything until the termination sequence, which in the default case is parenthesis-quote.
If the parenthesis-quote sequence has to appear in the string somewhere, you can easily specify a custom delimiter too, like this:
const char *sql_query = R"T3RM!N8(
SELECT word_id
FROM table1, table2
WHERE table2.word_id = table1.word_id
ORDER BY table1.word ASC
)T3RM!N8";
GCC adds C++ multiline raw string literals as a C extension
C++11 has raw string literals as mentioned at: https://stackoverflow.com/a/44337236/895245
However, GCC also adds them as a C extension, you just have to use -std=gnu99 instead of -std=c99. E.g.:
main.c
#include <assert.h>
#include <string.h>
int main(void) {
assert(strcmp(R"(
a
b
)", "\na\nb\n") == 0);
}
Compile and run:
gcc -o main -pedantic -std=gnu99 -Wall -Wextra main.c
./main
This can be used for example to insert multiline inline assembly into C code: How to write multiline inline assembly code in GCC C++?
Now you just have to sit back and wait for it to be standardized in C20XY.
C++ was asked at: C++ multiline string literal
Tested on Ubuntu 16.04, GCC 6.4.0, binutils 2.26.1.
You can also do:
NSString * query = @"SELECT * FROM foo "
@"WHERE "
@"bar = 42 "
@"AND baz = datetime() "
@"ORDER BY fizbit ASC";
An alternative is to use any tool for removing line breaks. Write your string using any text editor, once you finished, paste your text here and copy it again in xcode.
