convert UTF-16LE to UTF-8 with iconv() - c

I'm trying to convert UTF-16LE to UTF-8 with iconv() on Linux, and I think it is mostly done.
But I ran into some trouble with my code.
I think the two versions below are the same, but the first one does not work; only the second one does.
#include "stdio.h"
#include "string.h"
#include "iconv.h"
#include "errno.h"
#if 1
int fn2Utf8(char inBuf[], char outBuf[]) {
size_t readBytes = sizeof(inBuf);
size_t writeBytes = sizeof(outBuf);
char* in = inBuf;
char* out = outBuf;
iconv_t convert = iconv_open("UTF-8","UTF-16LE");
if (iconv(convert, &in, &readBytes, &out, &writeBytes) < 0) {
return (-1);
}
iconv_close(convert);
printf("[%s] [%s]\n", inBuf, outBuf);
return (out - outBuf);
}
int main() {
char inBuf[128] ="\x5c\x00\xbd\xac\x01\xc6\x00\xd3\x5c\x00\x00\xb3\x78\xc6\x44\xbe\x5c\x00\x2a\x00\x00\x00";
char outBuf[128];
fn2Utf8(inBuf, outBuf);
return 0;
}
#else
int main() {
char inBuf[128] = "\x5c\x00\xbd\xac\x01\xc6\x00\xd3\x5c\x00\x00\xb3\x78\xc6\x44\xbe\x5c\x00\x2a\x00\x00\x00";
char outBuf[128];
size_t readBytes = sizeof(inBuf);
size_t writeBytes = sizeof(outBuf);
char* in = inBuf;
char* out = outBuf;
iconv_t convert = iconv_open("UTF-8","UTF-16LE");
if (iconv(convert, &in, &readBytes, &out, &writeBytes) < 0) {
return (-1);
}
iconv_close(convert);
printf("[%s] [%s]\n", inBuf, outBuf);
return 0;
}
#endif
You can compile either version by switching between #if 0 and #if 1.
I need the #if 1 version (the one with the function) to work.

Here's the problem:
size_t readBytes = sizeof(inBuf);
size_t writeBytes = sizeof(outBuf);
When you pass arrays to a function, they decay to pointers to their first element. Your call
fn2Utf8(inBuf, outBuf);
is equal to
fn2Utf8(&inBuf[0], &outBuf[0]);
That means that in the function the arguments are not arrays, but pointers. And when you do sizeof on a pointer you get the size of the pointer and not what it's pointing to.
There are two solutions: The first is to pass in the lengths of the arrays as arguments to the function, and use those. The second, at least for the inBuf argument, is to rely on the fact that it's a null-terminated string and use strlen instead. Be aware, though, that strlen stops at the first zero byte, and UTF-16LE text like the data here contains zero bytes inside ordinary characters (for example \x5c\x00), so for this particular input the explicit length is the safer choice.
The second way, with strlen, could only ever apply to inBuf as I already said; it doesn't work on outBuf, where you have to use the first way and pass in the size as an argument.
It works in the program without the function because there you are doing sizeof on the array, and not on a pointer. When you have an array and not a pointer, sizeof will give you the size in bytes of the array.
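The first solution might look roughly like the sketch below. The name utf16le_to_utf8 and its parameters are made up for illustration, and the choice to null-terminate the output is an assumption about what the caller wants. It also checks the result against (size_t)-1, because iconv() returns a size_t and a test such as `< 0` can never be true.

```c
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper: converts in_len bytes of UTF-16LE from in_buf into
 * UTF-8 in out_buf (capacity out_size bytes) and null-terminates the result.
 * Returns the number of UTF-8 bytes written, or -1 on failure. */
long utf16le_to_utf8(const char *in_buf, size_t in_len,
                     char *out_buf, size_t out_size)
{
    if (out_size == 0)
        return -1;

    iconv_t cd = iconv_open("UTF-8", "UTF-16LE");
    if (cd == (iconv_t)-1)
        return -1;

    char *in = (char *)in_buf;      /* iconv() takes char **, not const char ** */
    char *out = out_buf;
    size_t in_left = in_len;        /* lengths come from the caller, not sizeof */
    size_t out_left = out_size - 1; /* keep one byte for the terminator */

    size_t rc = iconv(cd, &in, &in_left, &out, &out_left);
    iconv_close(cd);
    if (rc == (size_t)-1)           /* iconv() signals errors with (size_t)-1 */
        return -1;

    *out = '\0';
    return (long)(out - out_buf);
}
```

The caller still owns the arrays, so it can pass sizeof outBuf for the capacity, plus the number of input bytes that are actually meaningful (22 for the data in the question, including the two trailing zero bytes, or 20 without them).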

Related

C Pointer of array of strings garbled when retrieved later

I have read a lot of the answers on the theoretical issues with memory allocation to pointer to arrays, but have not been able to fix my code...so turning to you.
I have an array of strings in a STRUCT, which I need to write to and read from. Declared as:
typedef struct client_mod
{
/* Client ad_file */
char *ad_filenames[10];
/* Client's current ad array index*/
unsigned int ad_index;
} client;
Then, inside a function, I assign values to the pointers:
static int get_spots (client_mod *client)
{
char buf[512];
FILE *ptr;
if ((ptr = popen("php /media/cdn/getspot.php", "r")) != NULL) {
/* Read one byte at a time, up to BUFSIZ - 1 bytes, the last byte will be used for null termination. */
size_t byte_count = fread(buf, 1, 512 - 1, ptr);
/* Apply null termination so that the read bytes can be treated as a string. */
buf[byte_count] = 0;
}
(void) pclose(ptr);
// parse extracted string here...
int i = 0;
client->ad_filenames[i] = strdup(strtok(buf,"|"));
while(client->ad_filenames[i]!= NULL && i<5)
{
client->ad_filenames[++i] = strdup(strtok(NULL,"|"));
if (client->ad_filenames[i] != NULL && strlen(client->ad_filenames[i]) > 5) {
LOG("TESTING FOR CORRECT FILE NAMES %s\n", client->ad_filenames[i]);
}
}
}
The problem comes when I retrieve the values later:
/* in looping code block */
LOG("Checking file under index = %d, file is %s", client->ad_index, client->ad_filenames[client->ad_index]);
The first two members of the array are retrieved normally; everything after that is garbled.
I would appreciate any guidance. Thanks!
I understand this probably comes from undefined behaviour when assigning directly to the pointer, but I can't figure out how to solve it.
I think the problem is with assigning to this struct element.
char *ad_filenames[10];
ad_filenames is an array of 10 pointers to char.
What that means is that memory has to be allocated separately for each element.
Something like
client->ad_filenames[0] = strdup(var1);
strdup() does both malloc() and strcpy() within this function.
client should be a variable name. You already defined client as a type.
Here is working code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct client_mod
{
/* Client ad_file */
char *ad_filenames[10];
/* Client's current ad array index*/
unsigned int ad_index;
}CLIENT1;
CLIENT1 *client;
int func( char *var1 ) {
client->ad_filenames[0] = strdup(var1);
return client->ad_filenames[0] != NULL; /* 1 on success, 0 if strdup failed */
}
int
main(void)
{
char str1[10];
client = malloc( sizeof *client ); /* size of the struct, not of the pointer */
strcpy( str1, "Hello" );
func( str1 );
printf("%s\n", client->ad_filenames[0] );
free(client->ad_filenames[0]);
free (client);
}
Your problem is with the line,
size_t byte_count = fread(buf, 1, 1000 - 1, ptr);
Read the man fread page,
size_t fread(void *ptr, size_t size, size_t nmemb, FILE *stream);
You read 1000-1 members of size 1 into buf, which is only declared as buf[512]. Either enlarge buf or reduce fread's third argument:
char buf[1000+1];
size_t byte_count = fread(buf, 1, sizeof(buf)-1, ptr);
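A separate hazard in the question's parsing loop is that strtok() returns NULL once the tokens run out, and that NULL is handed straight to strdup(), which is undefined behaviour. Below is a minimal sketch of a split that checks each token first. The names split_ad_names and dup_string are hypothetical, and dup_string only stands in for strdup() so the example builds as strict ISO C.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for strdup(): heap copy of s, or NULL on failure. */
static char *dup_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}

/* Hypothetical helper: splits buf in place on '|' and stores a heap copy of
 * each token in names[]. Stops after max_names tokens or on allocation
 * failure. Returns the number of entries stored; the caller frees them. */
size_t split_ad_names(char *buf, char *names[], size_t max_names)
{
    size_t count = 0;
    char *tok = strtok(buf, "|");

    while (tok != NULL && count < max_names) {
        names[count] = dup_string(tok);   /* tok is checked before copying */
        if (names[count] == NULL)
            break;
        count++;
        tok = strtok(NULL, "|");
    }
    return count;
}
```

With the struct from the question, the call would be along the lines of split_ad_names(buf, client->ad_filenames, 10), and any index at or beyond the returned count should be treated as unset.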

Assigning a char array to a char*

I'm trying to write a function that prefixes a string with its length. I can't seem to assign a char[] to a char *. Mysteriously, if I print out some debugging code before the assignment, it works.
char *prefixMsgWLength(char *msg){
char *msgWLength;
int msgLength = strlen(msg);
if (msgLength == 0){
msgWLength = "2|";
}
else{
int nDigits = floor(log10(abs(msgLength))) + 1;
int nDigits2 = floor(log10(abs(msgLength + nDigits + 1))) + 1;
if (nDigits2 > nDigits){
nDigits = nDigits2;
}
msgLength += nDigits + 1;
char prefix[msgLength];
sprintf(prefix, "%d|", msgLength);
strcat(prefix, msg);
// if I uncomment the below, msgWLength is returned correctly
// printf("msg: %s\n", prefix);
msgWLength = prefix;
}
return msgWLength;
}
The problem in your code is
msgWLength = prefix;
Here you are assigning the address of a local variable (prefix) to the pointer and then trying to return it.
Once the function finishes execution, the local variable's lifetime ends and the returned pointer is invalid. Printing the buffer beforehand only appears to help; the behaviour is undefined either way.
If you want the buffer to outlive the function, make prefix a pointer and allocate its memory dynamically.
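A minimal sketch of the dynamic-allocation approach follows. The name prefix_with_length is made up, and as a simplification the prefix here is just strlen(msg), not the self-inclusive total that the question computes.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated "<len>|<msg>" string,
 * where <len> is strlen(msg). The caller must free() the result.
 * Returns NULL on failure. */
char *prefix_with_length(const char *msg)
{
    size_t len = strlen(msg);

    /* With a NULL buffer and size 0, snprintf only reports the length needed. */
    int needed = snprintf(NULL, 0, "%zu|%s", len, msg);
    if (needed < 0)
        return NULL;

    char *result = malloc((size_t)needed + 1);
    if (result == NULL)
        return NULL;

    snprintf(result, (size_t)needed + 1, "%zu|%s", len, msg);
    return result;
}
```

Because the memory comes from malloc(), the pointer stays valid after the function returns, and ownership passes to the caller.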
String reallocation to the exact length can be very cumbersome in C. You'd probably be much better off just using a sufficiently large buffer. Here, I use limits.h to determine the size of a line buffer according to the system (LINE_MAX):
#include <stdio.h>
#include <limits.h>
#include <string.h>
int main()
{
/* Our message */
char const msg[] = "Hello, world!";
/* Buffer to hold the result */
char buffer[LINE_MAX];
/* Prefix msg with length */
snprintf(buffer, LINE_MAX, "%zu|%s", strlen(msg)+1, msg); /* %zu is the conversion for size_t */
/* Print result */
printf("%s\n", buffer);
return 0;
}

memcpy in a different function having a pointer to pointer argument

I have the following function, process, which calls a routine dataFileBuffer. That routine takes a pointer to a pointer and does a memcpy to the dereferenced pointer location.
int dataFileBuffer(uint8_t *index, char **tempBuf,int size)
{
if(index != stop_address)) /*stop_address is a fixed pointer to the end buffer*/
{
if(*tempBuf)
{
if(index + size < stop_address)
memcpy(*tempBuf,index,size);
else
{
size = stop_address-index-1;
memcpy(*tempBuf,index,size);
}
}
else
size = 0;
}
else
size = 0;
return size;
}
int process()
{
char *readBuf=NULL;
char *tBuf = (char *)malloc(MAX_LENGTH);
int readBytes = -1;
uint8_t *index = start_address;
uint8_t *complete = stop_address;
do
{
readBuf = tBuf+(sizeof(char)*40);
readBytes = 0;
readBytes = dataFileBuffer(index,&readBuf,MAX_LENGTH);
if(readBytes > 0)
{
index = index+readBytes;
}
}while(index <= complete);
return readBytes;
}
My process function intermittently sees stack corruption, which makes me think that something is wrong with my implementation of the copy.
I just want to understand whether we can pass a pointer to a pointer as an argument and safely memcpy to the dereferenced location in the called function.
There are several things wrong with the question's code. Apart from some syntax errors, there is notably the function
dataFileBuffer(index, char **tempBuf,int size)
which does not compile for two reasons, there is no type declared for the argument index, and there is no return value declared - note that the function ends with
return size;
and is called like this:
readBytes = dataFileBuffer(index,&readBuf,MAX_LEN);
and my guess is that it should be
int dataFileBuffer(char *index, char **tempBuf, int size)
but I am puzzled why you have reversed the arguments given to dataFileBuffer() for the memcpy().
Next, you have used MAX_LEN, MAX_LENGTH and 40 to define buffer sizes or offsets, but there is no clear definition or checking of the size of the buffer index that you copy into (or is that the one you copy from? :-). It is more usual to pass a buffer size than a pointer limit.
You also have
...
readBytes = dataFileBuffer(index,&readBuf,MAX_LEN);
if(readBytes > 0)
{
index = index+readBytes;
}
} while(index <= complete);
which is likely to cause an infinite loop when readBytes == 0, and anyway will copy the same data on subsequent loops.
Sorry I can't offer a proper solution, as it's all a confused mess.
Added after OP comment
In reply to the specific question about dereferencing a **pointer, this example succeeds in doing that, by finding the string length.
#include <stdio.h>
#include <string.h>
// return the length of the string
size_t slen(char **tempBuf)
{
return strlen (*tempBuf);
}
int main(void) {
char string[] = "abcde";
char *sptr = string;
printf ("Length of '%s' is %zu\n", string, slen (&sptr)); // %zu matches size_t
return 0;
}
Program output:
Length of 'abcde' is 5
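Copying through a char ** is fine in itself; what matters is that the copy never exceeds the space behind the destination. In the question, readBuf starts 40 bytes into a MAX_LENGTH allocation yet up to MAX_LENGTH bytes are copied into it, so only MAX_LENGTH - 40 bytes are really available. A bounded version might be sketched as below; the name copy_chunk and its parameters are hypothetical, not the asker's API.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copies at most dst_size bytes from [src, end) into the
 * buffer that *dst points to. Returns the number of bytes copied, which is 0
 * when the source is exhausted or the destination pointer is missing. */
size_t copy_chunk(const uint8_t *src, const uint8_t *end,
                  char **dst, size_t dst_size)
{
    if (dst == NULL || *dst == NULL || src >= end)
        return 0;

    size_t remaining = (size_t)(end - src);
    size_t n = remaining < dst_size ? remaining : dst_size;

    memcpy(*dst, src, n);   /* n never exceeds the source or the destination */
    return n;
}
```

A caller can advance its source pointer by the return value and stop as soon as that value is 0, which also avoids the infinite loop described above.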

Overwriting parts of a string with parts of another string

I'm trying to overwrite a part of a string with parts of another String.
Basically, I want to access a given index of a string, write a given number of chars from another given index of another string.
So a function like memcpy(stringa[indexa], stringb[indexb], length);, except that this does not work.
Using strncpy would also suffice.
More code, as requested:
void mymemset(char* memloc, char* cmd, int data_blocks[], int len)
{
int i = 0;
while(i < len)
{
//missing part. Where I want the "memcpy" operation to take place
i++;
}
return;
}
memloc is the string we want to overwrite, cmd is the string we are overwriting from, data_blocks contains information about where in memloc we are supposed to write, and len is the number of operations we are executing. So I want to overwrite at location data_blocks[i], from cmd 8 chars at a time.
EDIT: I think I just forgot an &, so sorry to have confused you and thanks for your time. This seems to work:
void mymemset(char* memloc, char* cmd, int data_blocks[], int len)
{
int i = 0;
while(i < len)
{
memcpy(&memloc[data_blocks[i]], &cmd[i*8], 8);
i++;
}
return;
}
Takes 8 bytes at a time from cmd, stores them in memloc at the index given by data_blocks[i]. As commented, data_blocks contains information about different indexes in memloc that is available, and segmentation of the string cmd can occur.
Supposing stringa and stringb are declared as follows
char stringa[] = "Hello" ;
char stringb[] = "World" ;
This should work:
memcpy(&stringa[1], &stringb[1], 2) ;
Your example should not compile, or if it does compile it is likely to crash or cause undefined behaviour, because it passes char values where memcpy expects pointers:
memcpy(stringa[1], stringb[1], 2) ;
Your naming is confusing: memset works on bytes. If you manipulate strings you have to take extra precautions: think of the terminating \0.
I think you want something like that:
void my_str_overwrite(char* dest, const char* ref, int idx, size_t count)
{
size_t input_len = strlen(dest);
if(input_len <= idx+count)
{
// Error: not enough space, so leave dest untouched
return;
}
for(size_t i=0; i<count; i++)
{
dest[idx+i] = ref[i];
}
return;
}
You don't need to pass the whole data_blocks[] array; you are only interested in the one element that holds the offset for your copy, if I understood correctly.
As you don't modify cmd, it should be const.
The code above does not handle the terminating null byte, which should be appended to memloc if it is actually a string.
So I want to overwrite at location data_blocks[i], from cmd 8 chars at a time.
This one is confusing. If you know that you only ever want 8 bytes copied per call, then in the code above make count a local variable within the function and fix it: size_t count = 8;
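For the asker's original shape (fixed 8-byte blocks taken from cmd and written at the offsets listed in data_blocks), a bounds-checked sketch could look like this. scatter_blocks, BLOCK_SIZE and the use of size_t offsets are illustrative choices, and the code assumes cmd holds at least n_blocks * 8 bytes.

```c
#include <stddef.h>
#include <string.h>

#define BLOCK_SIZE 8   /* block size taken from the question */

/* Hypothetical helper: copies n_blocks consecutive 8-byte blocks from cmd to
 * memloc at the byte offsets given in offsets[]. Assumes cmd holds at least
 * n_blocks * BLOCK_SIZE bytes. Returns 0 on success, or -1 (writing nothing)
 * if any block would fall outside memloc's memloc_size bytes. */
int scatter_blocks(char *memloc, size_t memloc_size, const char *cmd,
                   const size_t offsets[], size_t n_blocks)
{
    /* Validate every offset first so a failure leaves memloc untouched. */
    for (size_t i = 0; i < n_blocks; i++) {
        if (offsets[i] > memloc_size || memloc_size - offsets[i] < BLOCK_SIZE)
            return -1;
    }

    for (size_t i = 0; i < n_blocks; i++)
        memcpy(&memloc[offsets[i]], &cmd[i * BLOCK_SIZE], BLOCK_SIZE);

    return 0;
}
```

Since memcpy never writes a terminator, memloc_size should exclude the final '\0' if memloc is meant to stay a valid string.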
If the strings are the same size then you can just use memcpy:
#include <string.h>
char text[] = "Hello James!";
char name[] = "Jenny";
char* pos = strstr(text, "James");
memcpy(pos, name, strlen(name)); // strlen excludes the '\0', so text keeps its own terminator
If they're not, then you must reallocate the string, as the length will change:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>
#define STR "Hello James!"
void replace(char** src, char* find, char* rep) {
char* ret = NULL;
char* pos = strstr(*src, find);
if (!pos)
return; // no changes
int l = (1 + strlen(*src) + strlen(rep) - strlen(find));
ret = (char*)malloc(sizeof(char) * l);
ret[l-1] = 0;
int ind = (int)(pos - *src);
strncpy(ret, *src, ind);
printf("ind: %d; %s\n", ind, ret);
strncpy(&ret[ind], rep, strlen(rep));
strncpy(&ret[ind+strlen(rep)], &pos[strlen(find)], strlen(pos)-strlen(find));
printf("%s\n", ret);
free(*src);
*src = ret;
}
int main() {
char *str = NULL;
str = (char*)malloc(sizeof(char) * (strlen(STR)+1));
assert(str);
strcpy(str, STR);
printf("before: %s\n", str);
replace(&str, "James", "John");
printf("after: %s\n", str);
free(str);
return 0;
}
This code is not optimized.

String (array) capacity via pointer

I am trying to create a sub-routine that inserts a string into another string. I want to check that the host string is going to have enough capacity to hold all the characters, and if not, return an error integer. This requires using something like sizeof, but one that can be called using a pointer. My code is below and I would be very grateful for any help.
#include<stdio.h>
#include<conio.h>
//#include "string.h"
int string_into_string(char* host_string, char* guest_string, int insertion_point);
int main(void) {
char string_one[21] = "Hello mother"; //12 characters
char string_two[21] = "dearest "; //8 characters
int c;
c = string_into_string(string_one, string_two, 6);
printf("Sub-routine string_into_string returned %d and creates the string: %s\n", c, string_one);
getch();
return 0;
}
int string_into_string(char* host_string, char* guest_string, int insertion_point) {
int i, starting_length_of_host_string;
//check host_string is long enough
if(strlen(host_string) + strlen(guest_string) >= sizeof(host_string) + 1) {
//host_string is too short
sprintf(host_string, "String too short(%d)!", sizeof(host_string));
return -1;
}
starting_length_of_host_string = strlen(host_string);
for(i = starting_length_of_host_string; i >= insertion_point; i--) { //make room
host_string[i + strlen(guest_string)] = host_string[i];
}
//i++;
//host_string[i] = '\0';
for(i = 1; i <= strlen(guest_string); i++) { //insert
host_string[i + insertion_point - 1] = guest_string[i - 1];
}
i = strlen(guest_string) + starting_length_of_host_string;
host_string[i] = '\0';
return strlen(host_string);
}
C does not allow you to pass arrays as function arguments, so all arrays of type T[N] decay to pointers of type T*. You must pass the size information manually. However, you can use sizeof at the call site to determine the size of an array:
int string_into_string(char * dst, size_t dstlen, char const * src, size_t srclen, size_t offset, size_t len);
char string_one[21] = "Hello mother";
char string_two[21] = "dearest ";
string_into_string(string_one, sizeof string_one, // gives 21
string_two, strlen(string_two), // gives 8
6, strlen(string_two));
If you are creating dynamic arrays with malloc, you have to store the size information somewhere separately anyway, so this idiom will still fit.
(Beware that sizeof(T[N]) == N * sizeof(T), and I've used the fact that sizeof(char) == 1 to simplify the code.)
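As an illustration of the idiom, here is a minimal sketch of an insert routine that receives the capacity explicitly. It uses a simpler, hypothetical signature (insert_string) instead of the six-argument prototype above, and relies on memmove because the source and destination ranges overlap.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: inserts guest into host at index pos. host_capacity is
 * the total size of the host array in bytes, terminator included. Returns the
 * new string length, or -1 (leaving host unchanged) if pos is out of range or
 * the result would not fit. */
int insert_string(char *host, size_t host_capacity,
                  const char *guest, size_t pos)
{
    size_t host_len = strlen(host);
    size_t guest_len = strlen(guest);

    if (pos > host_len || host_len + guest_len + 1 > host_capacity)
        return -1;

    /* Shift the tail, including its '\0', to open a gap for the guest. */
    memmove(host + pos + guest_len, host + pos, host_len - pos + 1);
    memcpy(host + pos, guest, guest_len);

    return (int)(host_len + guest_len);
}
```

With the arrays from the question, insert_string(string_one, sizeof string_one, string_two, 6) fits exactly: 12 + 8 characters plus the terminator make 21 bytes.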
This code needs a whole lot more error handling, but it should do what you need without any obscure loops. To speed it up, you could also pass the length of the source string as a parameter, so the function does not need to calculate it at run time.
#include <string.h>
#include <stdlib.h>
#include <stdio.h>
signed int string_into_string (char* dest_buf,
int dest_size,
const char* source_str,
int insert_index)
{
int source_str_size;
char* dest_buf_backup;
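if (insert_index < 0 || insert_index >= dest_size) // sanity check of parameters
{
return -1;
}
// make sure the combined string plus its terminator fits in the buffer
if (strlen(dest_buf) + strlen(source_str) + 1 > (size_t)dest_size)
{
return -1;
}
// save data from the original buffer into temporary backup buffer
dest_buf_backup = malloc (dest_size - insert_index);
if (dest_buf_backup == NULL)
{
return -1;
}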
memcpy (dest_buf_backup,
&dest_buf[insert_index],
dest_size - insert_index);
source_str_size = strlen(source_str);
// copy new data into the destination buffer
strncpy (&dest_buf[insert_index],
source_str,
source_str_size);
// restore old data at the end
strcpy(&dest_buf[insert_index + source_str_size],
dest_buf_backup);
// delete temporary buffer
free(dest_buf_backup);
return (int)strlen(dest_buf); // new length of the combined string
}
int main()
{
char string_one[21] = "Hello mother"; //12 characters
char string_two[21] = "dearest "; //8 characters
(void) string_into_string (string_one,
sizeof(string_one),
string_two,
6);
puts(string_one);
return 0;
}
I tried using a macro and changing string_into_string to take a size argument, but I still strike out when I call the function from within another function. I tried the following macro:
#define STRING_INTO_STRING( a, b, c) (string_into_string2(a, sizeof(a), b, c))
The other function, which causes the failure, is below. It fails because string has already become a pointer there, so sizeof gives 4:
int string_replace(char* string, char* string_remove, char* string_add) {
int start_point;
int c;
start_point = string_find_and_remove(string, string_remove);
if(start_point < 0) {
printf("string not found: %s\n ABORTING!\n", string_remove);
while(1);
}
c = STRING_INTO_STRING(string, string_add, start_point);
return c;
}
Looks like this function will have to proceed at risk. Looking at strcat, it also proceeds at risk, in that it doesn't check that the string you are appending to is large enough to hold its intended contents (perhaps for the very same reason).
Thanks for everyone's help.
