Reverse a string containing ASCII chars and non-ASCII chars - c

I have a problem: how do I reverse a string such as 'abcd汉字efg'?
str_to_reverse = "abcd汉字efg"; /* those non-ASCII chars are Chinese characters, each of them takes 2 bytes */
After reversal, it should be:
str_to_reverse = "gfe字汉dcba";
I think that, to reverse the string, I have to identify the non-ASCII chars, because simply reversing every byte won't give the right answer.
How can I do it?
PS:
I wrote this program under Ubuntu, 32-bit.
then I printed every byte:
for(i = 0; i < strlen(s); i++)
printf("%c", s[i]);
I got some gibberish text instead of "汉字".

Pure C89 answer:
#include <stdlib.h>
#include <stdio.h>
#include <locale.h>
#include <string.h>
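int main(void)
{
    char const* str;
    size_t slen;
    char* buf;
    char* rev;
    setlocale(LC_ALL, ""); /* pick up the environment's multibyte encoding */
    str = "abcd汉字efg";
    printf("%s\n", str);
    slen = strlen(str);
    buf = (char*)malloc(slen+1);
    if (buf == NULL) {
        fprintf(stderr, "Out of memory\n");
        return EXIT_FAILURE;
    }
    rev = buf+slen; /* fill the buffer from the end towards the start */
    *rev = '\0';
    while (*str != '\0') {
        int clen, i;
        clen = mblen(str, slen); /* number of bytes in the next character */
        if (clen <= 0) {
            fprintf(stderr, "Bad encoding\n");
            free(buf);
            return EXIT_FAILURE;
        }
        for (i = 0; i < clen; ++i) {
            *--rev = str[clen-1-i]; /* bytes of one character stay in order */
        }
        str += clen;
        slen -= clen; /* bytes still to be read */
    }
    printf("%s\n", rev); /* rev == buf once every byte has been copied */
    free(buf);
    return 0;
}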

If the string is encoded as utf8, it is pretty simple. You can obtain the length of well formed utf8 sequences by inspecting only the first byte.
In a first pass you reverse only the utf8 "subsequences" (those with length > 1)
In a second pass you reverse the whole string.
Voila.

Related

Copying valid strings to 2d array in C

I am checking whether a function returns true and, if so, printing out the valid strings according to some other function I was given. At the moment, it prints them correctly, but it also prints empty lines, which seem to correspond to the invalid strings.
How can I make these empty lines go away?
Here is my code:
int main()
{
int i, count = 0;
char input[10];
char validStr[10][60] = {""};
for (i = 0; i < 60; ++i){
if(fgets(input,10, stdin) == NULL){
break;
}
input[strcspn(input,"\n")] = '\0';
if(checkIfValid(input)){
memcpy(validStr[i],input,sizeof(input));
count++;
}
}
printf("%d\n",count);
for (int j = 0 ; j < count; ++j){
printf("%s\n",validStr[j]);
}
}
The count indicates it is printing only the valid strings but as you can tell by the pic it prints white lines.
Note: For various reasons the program needs to follow the current order so the output is printed after the first for loop.
Thanks in advance!
Instead of this:
if(checkIfValid(input)){
memcpy(validStr[i],input,sizeof(input));
count++;
}
This:
if(checkIfValid(input)){
memcpy(validStr[count],input,sizeof(input));
count++;
}
As others have pointed out in the comments, you should also make that string copy safe. May I suggest:
if(checkIfValid(input)){
    char* dst = validStr[count];
    size_t MAXLEN = 10;
    strncpy(dst, input, MAXLEN);
    dst[MAXLEN-1] = '\0'; /* strncpy does not terminate when input fills the buffer */
    count++;
}
Continuing from the comment, if you want to store the entire string, you need to provide adequate space for the nul-terminating character.
AAAAAAAAAA
QELETIURTE
...
contain strings that are 10 characters long and will not fit in input as declared char[10].
Instead of looping with a for, let the return value of fgets() control your read loop, and keep count as a loop condition to ensure you protect your array bounds, e.g.
#include <stdio.h>
#include <string.h>
#define MAXC 128 /* if you need a constant, #define one (or more) */
#define NSTR 10
int checkIfValid (const char *s) { (void)s; return 1; } /* stub: accepts every string */
int main(void)
{
size_t count = 0;
char input[MAXC];
char validStr[NSTR][MAXC] = {""};
while (count < NSTR && fgets (input, sizeof input, stdin)) {
input[strcspn(input,"\n")] = '\0';
if(checkIfValid(input)){
strcpy (validStr[count], input);
count++;
}
}
printf ("%zu\n",count);
for (size_t j = 0 ; j < count; ++j) {
printf("%s\n",validStr[j]);
}
}
(adjust your array declaration for 60 strings of 10 characters each)
If you want to cut off at 9 characters and ensure the strings are nul-terminated, @selbie has that covered.
Example Use/Output
With your data (as well as I could read it) in dat/validstr.txt you could do:
$ ./bin/validstring <dat/validstr.txt
6
AAAAAAAAAA
QELETIURTE
321qweve
sdsdsdfFF
GRSGGFDDSS
toLotssAAA

find palindromes in the row and display them

A string consisting of words, no longer than 100 characters, is supplied. Words consist of Latin characters and are separated by a single space. The program must write to the standard output stream a string containing only the words that are palindromes.
The source data must be read into memory as a whole and all manipulations must be carried out in memory, the result obtained in memory and then printed.
#define _CRT_SECURE_NO_WARNINGS
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int check(char str[])
{
int i, length;
length = strlen(str);
for (i = 0; i < length; i++)
if (str[i] != str[(length - 1) - i]) return 0;
return 1;
}
int main(void)
{
char str[100];
char* t;
gets(str);
t = strtok(str, " ");
while (t != NULL) {
if (check(t) == 1) {
printf("%s ", t);
}
t = strtok(NULL, " ");
}
_getch();
return 0;
}
This is my code (it fails the tests).
Please help me fix the code.
Instead of gets, use a function such as getline(&buffer, &size, stdin), for example.
The for loop in your check function works fine, but it solves a different problem from the one you expected.
A palindrome is a word or group of words that reads the same forwards from the beginning and backwards from the end.

How get ASCII of a string

I need to get the ASCII representation (in int and hex format) of a string, char by char. For example, if I have the string "hello", I would get 104 101 108 108 111 for int ASCII
and 68 65 6C 6C 6F for hex.
How about:
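const char *str = "hello";
while (*str) {
    /* cast through unsigned char so bytes above 127 are not sign-extended */
    printf("%c %u %X\n", *str, (unsigned)(unsigned char)*str, (unsigned)(unsigned char)*str);
    str++;
}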
In C, a string is just a number of chars in neighbouring memory locations. There are two things to do: (1) loop over the string, character by character; (2) output each char.
The solution for (1) depends on the string's representation (0-terminated or with explicit length?). For 0-terminated strings, use
char *c = "a string";
for (char *i = c; *i; ++i) {
// do something with *i
}
Given an explicit length, use
for (int i = 0; i < length; ++i) {
// do something with c[i]
}
The solution for (2) obviously depends on what you are trying to achieve. To simply output the values, follow cnicutar's answer and use printf. To get a (0-terminated) string containing the representation,
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
/* Convert a 0-terminated string to a 0-terminated string of its ASCII values,
 * separated by spaces. The caller is responsible for calling free() on the result.
 */
char *to_ascii(const char *inputstring) {
    // allocate the maximum needed to store the ASCII representation:
    // up to three digits plus a space per character, plus the terminator
    char *output = malloc(sizeof(char) * (strlen(inputstring) * 4 + 1));
    char *output_end = output;
    if (!output) // allocation failed
        exit(EXIT_FAILURE);
    *output_end = '\0';
    for (; *inputstring; ++inputstring) {
        // the cast keeps each value in 0..255, so four bytes per character suffice
        output_end += sprintf(output_end, "%u ", (unsigned)(unsigned char)*inputstring);
        //assert(*output_end == '\0');
    }
    return output;
}
If you need to output an explicit-length string, use strlen() or the difference (size_t)(output_end-output).
#include <stdio.h>
int main(void)
{
    const char *str = "hello";
    const char *p;
    /* first pass: decimal values */
    for (p = str; *p; p++)
        printf("%u\t", (unsigned)(unsigned char)*p);
    printf("\n");
    /* second pass: hexadecimal values */
    for (p = str; *p; p++)
        printf("%X\t", (unsigned)(unsigned char)*p);
    printf("\n");
    return 0;
}
Hope this does what you want. If you want to store the values in a uint8_t array, just declare one and assign each character to it.
I know this is 5 years old, but my first real program converted strings to ASCII, and it did so in a clean and simple way: assign the result of getchar() to a variable and print it with printf() as an integer. All of this happens in a loop, of course, because getchar() reads only one character per call.
#include <stdio.h>
int main()
{
int i = 0;
while((i = getchar()) != EOF)
printf("%d ", i);
return 0;
}
And here's the original version, using a for() loop instead, because I wanted to see just how small I could make the program.
#include <stdio.h>
int main()
{
    for(int i = 0; (i = getchar()) != EOF; printf("%d ", i));
}
#include <stdio.h>
#include <string.h>
/* Receives a string and returns an unsigned integer
   equal to the sum of its ASCII values */
unsigned int str2int(const unsigned char *str){
    size_t str_len = strlen((const char *)str);
    unsigned int str_int = 0;
    size_t counter = 0;
    while(counter < str_len){ /* stop before the terminating '\0' */
        str_int += str[counter];
        printf("Accumulator:%u\n", str_int);
        counter++;
    }
    return str_int;
}

Problems with simple c task

So after a few years of inactivity after studying at uni, I'm trying to build up my C experience with a simple string reverser.
Here is my code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/*
*
*/
int main(int argc, char** argv) {
reverser();
return(0);
}
int reverser(){
printf("Please enter a String: ");
//return (0);
int len;
char input[10];
scanf("%s",&input);
int quit = strcmp(input,"quit");
if(quit == 0){
printf("%s\n","Program quitting");
return(0);
}
len = strlen(input);
printf("%i\n",len);
char reversed[len];
int count = 0;
while (count <= (len-1)){
//printf("%i\n",(len-count));
reversed[count] = input[(len-1)-count];
count++;
}
//printf("%s\n",input);
printf(reversed);
printf("\n");
reverser();
}
When I input "hello", you would expect "olleh" as the response, but I get "olleh:$a ca&#",
How do I just get the string input reversed and returned?
Bombalur
Add a '\0' at the end of the array. (as in, copy only chars until you reach '\0' - which is the point at array[strlen(array)], then when you're done, add a '\0' at the next character)
Strings are conventionally terminated by a zero byte. So it should be
char reversed[len+1];
And you should clear the last byte
reversed[len] = (char)0;
You forgot the \0 at the end of the string.
This is because reversed is created with exactly len elements and nothing writes a terminating \0 after the copied characters, so printf keeps reading whatever junk values follow it in memory. Copy exactly strlen(input) characters, then terminate the result.
The array must have a size of string length + 1 so that there is room for a \0 at the end of the string; that will solve your problem.
The following is a simple and minimal implementation of a string reverse program (obviously, error conditions, corner cases, blank spaces, wider character sets, etc. have not been considered).
#include <stdio.h>
/* named my_strlen and my_strrev so they do not clash with library names */
int my_strlen(char *s)
{
    char *p = s;
    while (*p)
        p++;
    return p - s;
}
char * my_strrev(char a[])
{
    int i, j;
    char temp;
    /* swap characters from both ends until the indices meet */
    for (i=0, j=my_strlen(a)-1 ; i<j ; i++, j--) {
        temp = a[i];
        a[i] = a[j];
        a[j] = temp;
    }
    return a;
}
int main()
{
    char str[100];
    printf("Enter string: ");
    if (scanf("%99s", str) != 1) /* width keeps the input inside str */
        return 1;
    printf("The reverse is %s \n", my_strrev(str));
    return 0;
}
Hope this helps!

Iterating over string/strlen with umlauted characters

This is a follow-up to my previous question . I succeeded in implementing the algorithm for checking umlauted characters. The next problem comes from iterating over all characters in a string. I do this like so:
int main()
{
char* str = "Hej du kalleåäö";
printf("length of str: %d", strlen(str));
for (int i = 0; i < strlen(str); i++)
{
printf("%s ", to_morse(str[i]));
}
putchar('\n');
return 0;
}
The problem is that, because of the umlauted characters, it prints 18, and it also makes the to_morse function fail (ignoring these characters). The to_morse function accepts an unsigned char as a parameter. What would be the best way to solve this? I know I can check for the umlaut character here instead of in the letterNr function, but I don't know if that would be a pretty or logical solution.
Normally, you'd store the string as an array of wchar_t and use something like wcslen to get its length; that gives you the number of characters as opposed to the number of bytes you stored.
You really shouldn't be implementing UTF or Unicode or whatever multibyte character handling yourself - there are libraries for that sort of thing.
On OS X, Cocoa is a solution - note the use of "%C" in NSLog - that's an unichar (16-bit Unicode character):
#import <Cocoa/Cocoa.h>
int main()
{
NSAutoreleasePool * pool = [NSAutoreleasePool new];
NSString * input = @"Hej du kalleåäö";
printf("length of str: %d", [input length]);
int i=0;
for (i = 0; i < [input length]; i++)
{
NSLog(@"%C", [input characterAtIndex:i]);
}
[pool release];
}
You could do something like
for (int i = 0; str[i]!='\0'; ++i){
//do something with str[i]
}
Strings in C are terminated with '\0'. So it is possible to check for the end of the string like that.
EDIT: What locale are you using?
If you are going to iterate over a string, don't bother getting its length with strlen. Just iterate until you see a NUL character:
char *p = str;
while(*p != '\0') {
printf("%c\n", *p);
++p;
}
As for the umlauted characters and such, are they UTF-8? If the string is multi-byte, you could do something like this:
size_t n = strlen(str);
char *p = str;
char *e = p + n;
while(*p != '\0') {
wchar_t wc;
int l = mbtowc(&wc, p, e - p);
if(l <= 0) break;
p += l;
/* do whatever with wc which is now in wchar_t form */
}
I honestly don't know if mbtowc will simply return -1 if it encounters a NUL in the middle of a MB character. If it does, you could just pass MB_CUR_MAX instead of e - p and do away with the strlen call. But I have a feeling this is not the case.

Resources