Given wchar_t* str; which points to a null-terminated utf32 (or utf16) string, what command should I use to print it in lldb?
I assume you want to print it as utf8. It's a little involved - you need to create a summary provider for the type in python that returns a utf8 string for printing. It's not especially complicated though. Create a little python file like ~/lldb/wcharsummary.py with contents like
import lldb

def wchar_SummaryProvider(valobj, dict):
    e = lldb.SBError()
    s = u'"'
    # GetValue() returns a string, so compare the numeric pointer value instead
    if valobj.GetValueAsUnsigned(0) != 0:
        i = 0
        newchar = -1
        while newchar != 0:
            # read next wchar character out of memory
            data_val = valobj.GetPointeeData(i, 1)
            size = data_val.GetByteSize()
            if size == 1:
                newchar = data_val.GetUnsignedInt8(e, 0)    # utf-8
            elif size == 2:
                newchar = data_val.GetUnsignedInt16(e, 0)   # utf-16
            elif size == 4:
                newchar = data_val.GetUnsignedInt32(e, 0)   # utf-32
            else:
                return '<error>'
            if e.fail:
                return '<error>'
            i = i + 1
            # add the character to our string 's'
            if newchar != 0:
                s = s + unichr(newchar)  # Python 2; use chr() under Python 3
        s = s + u'"'
    return s.encode('utf-8')
Load this into lldb and set this python function as the summary provider for wchar_t*; it is easiest to put this in your ~/.lldbinit file for re-use:
command script import ~/lldb/wcharsummary.py
type summary add -F wcharsummary.wchar_SummaryProvider "wchar_t *"
then given some source that has some utf32 encoded characters in 32-bit wchar_t's,
NSString *str = @"こんにちは"; // 5 characters long
wchar_t *str_utf32_wchar = (wchar_t*) [[str dataUsingEncoding:NSUTF32StringEncoding] bytes];
lldb will print them in utf8 for us:
Process 22278 stopped
* thread #1: tid = 0x1c03, 0x0000000100000e92 a.out`main + 146 at a.m:11, stop reason = step over
#0: 0x0000000100000e92 a.out`main + 146 at a.m:11
8
9 NSString *str = @"こんにちは"; // 5 characters long
10 wchar_t *str_utf32_wchar = (wchar_t*) [[str dataUsingEncoding:NSUTF32StringEncoding] bytes];
-> 11 printf ("0x%llx 0x%llx 0x%llx 0x%llx\n", (uint64_t) str_utf32_wchar[0], (uint64_t) str_utf32_wchar[1],
12 (uint64_t) str_utf32_wchar[2], (uint64_t) str_utf32_wchar[3]);
13
14 [pool release];
(lldb) fr va
(NSAutoreleasePool *) pool = 0x0000000100108190
(NSString *) str = 0x0000000100001068 @"こんにちは"
(wchar_t *) str_utf32_wchar = 0x0000000100107f80 "こんにちは"
(lldb) p str_utf32_wchar
(wchar_t *) $0 = 0x0000000100107f80 "こんにちは"
(lldb) x/16b `str_utf32_wchar`
0x100107f80: 0xff 0xfe 0x00 0x00 0x53 0x30 0x00 0x00
0x100107f88: 0x93 0x30 0x00 0x00 0x6b 0x30 0x00 0x00
(lldb)
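For reference, the conversion the summary provider relies on (code point in, UTF-8 bytes out) can be sketched in C. This is only an illustration of the encoding step that unichr() plus encode('utf-8') perform; the function name utf8_encode is made up for this example and is not part of lldb.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode one Unicode code point as UTF-8 into out (at least 4 bytes).
   Returns the number of bytes written, or 0 for an invalid code point.
   utf8_encode is a hypothetical helper name used only in this sketch. */
size_t utf8_encode(uint32_t cp, unsigned char *out)
{
    assert(out != NULL);
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;               /* surrogates are not valid code points */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

The memory dump above shows 0x53 0x30 0x00 0x00, i.e. U+3053 stored little-endian; passing 0x3053 to this function yields the three bytes E3 81 93.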
I've modified Jason's code a little bit to handle wxString directly instead of having to change the summary for int* pointers.
Test it by typing the script command in the Xcode debugger console, then pasting the code below and pressing Ctrl-D. Then, back at the lldb prompt, run: type summary add --python-function wxString_SummaryProvider "wxString". This works well with my wxWidgets build.
def wxString_SummaryProvider(valobj, dict):
    e = lldb.SBError()
    charPointer = valobj.GetChildMemberWithName('m_impl').GetChildMemberWithName('_M_dataplus').GetChildMemberWithName('_M_p')
    valobj = charPointer
    s = u'"'
    # GetValue() returns a string, so compare the numeric pointer value instead
    if valobj.GetValueAsUnsigned(0) != 0:
        i = 0
        newchar = -1
        while newchar != 0:
            # read next wchar character out of memory
            data_val = valobj.GetPointeeData(i, 1)
            size = data_val.GetByteSize()
            if size == 1:
                newchar = data_val.GetUnsignedInt8(e, 0)    # utf-8
            elif size == 2:
                newchar = data_val.GetUnsignedInt16(e, 0)   # utf-16
            elif size == 4:
                newchar = data_val.GetUnsignedInt32(e, 0)   # utf-32
            else:
                return '<error>'
            if e.fail:
                return '<error>'
            i = i + 1
            # add the character to our string 's'
            # print "char2 = %s" % newchar
            if newchar != 0:
                s = s + unichr(newchar)  # Python 2; use chr() under Python 3
        s = s + u'"'
    return s.encode('utf-8')
I'm trying to understand the PE format and the source code of "hook_finder", found here:
"https://github.com/Mr-Un1k0d3r/EDRs/blob/main/hook_finder64.c"
I know that this snippet is trying to calculate the export table offset:
VOID DumpListOfExport(VOID *lib, BOOL bNt) {
    DWORD dwIter = 0;
    CHAR* base = (CHAR*)lib;
    CHAR* PE = base + (unsigned char)*(base + 0x3c);
    DWORD ExportDirectoryOffset = *((DWORD*)PE + (0x8a / 4));
    CHAR* ExportDirectory = base + ExportDirectoryOffset;
    DWORD dwFunctionsCount = *((DWORD*)ExportDirectory + (0x14 / 4));
    DWORD OffsetNamesTableOffset = *((DWORD*)ExportDirectory + (0x20 / 4));
    CHAR* OffsetNamesTable = base + OffsetNamesTableOffset;
    printf("------------------------------------------\nBASE\t\t\t0x%p\t%s\nPE\t\t\t0x%p\t%s\nExportTableOffset\t0x%p\nOffsetNameTable\t\t0x%p\nFunctions Count\t\t0x%x (%d)\n------------------------------------------\n",
        base, base, PE, PE, ExportDirectory, OffsetNamesTable, dwFunctionsCount, dwFunctionsCount);
    for(dwIter; dwIter < dwFunctionsCount - 1; dwIter++) {
        DWORD64 offset = *((DWORD*)OffsetNamesTable + dwIter);
        CHAR* current = base + offset;
        GetBytesByName((HANDLE)lib, current, bNt);
    }
}
0x3c is the e_lfanew offset. However, I can't understand what the other hex values are, and why they are divided by 4 bytes.
Further,
VOID GetBytesByName(HANDLE hDll, CHAR *name, BOOL bNt) {
    FARPROC ptr = GetProcAddress((HMODULE)hDll, name);
    DWORD* opcode = (DWORD*)*ptr;
    if(bNt) {
        if(name[0] != 'N' && name[1] != 't') {
            return;
        }
    }
    if((*opcode << 24) >> 24 == 0xe9) {
        if(!IsFalsePositive(name)) {
            printf("%s is hooked\n", name);
        }
    }
}
What exactly is being left- and right-shifted, and why by 24 specifically?
From my understanding of EDRs, an EDR adds a JMP instruction at the very beginning of the function, which is why the condition checks for 0xe9. But how does the code follow the function flow and be certain about it?
And is this applicable only to ntdll.dll?
Sorry, I'm just starting to study PE behavior and trying to make things very clear.
Thank you in advance.
The function DumpListOfExport reads only a single byte at offset 0x3c (the low byte of e_lfanew) to locate the NT headers, so it assumes the headers start within the first 256 bytes of the image. That is not always the case, since it depends on the size of the DOS stub. Probably this code makes that assumption for ntdll.dll.
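On the question of the other hex values and the division by 4: adding n to a DWORD* advances the address by 4*n bytes, so the code divides each byte offset by 4 to get a DWORD index. As far as I can tell from the documented layout, 0x14 and 0x20 are the offsets of NumberOfFunctions and AddressOfNames in IMAGE_EXPORT_DIRECTORY, and 0x8a / 4 truncates to 34, i.e. byte 0x88 from the PE signature, which in a 64-bit (PE32+) image is the export entry of the data directory (4-byte signature, 20-byte file header, 112 bytes into the optional header). The sketch below mirrors the structure with plain fixed-width types so it compiles without windows.h; the struct and helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of the documented IMAGE_EXPORT_DIRECTORY layout,
   declared here so the sketch compiles without <windows.h>. */
struct export_directory {
    uint32_t Characteristics;        /* 0x00 */
    uint32_t TimeDateStamp;          /* 0x04 */
    uint16_t MajorVersion;           /* 0x08 */
    uint16_t MinorVersion;           /* 0x0a */
    uint32_t Name;                   /* 0x0c */
    uint32_t Base;                   /* 0x10 */
    uint32_t NumberOfFunctions;      /* 0x14 */
    uint32_t NumberOfNames;          /* 0x18 */
    uint32_t AddressOfFunctions;     /* 0x1c */
    uint32_t AddressOfNames;         /* 0x20 */
    uint32_t AddressOfNameOrdinals;  /* 0x24 */
};

/* Read a 32-bit field located byte_offset bytes after p. */
uint32_t read_u32_at(const void *p, size_t byte_offset)
{
    uint32_t v;
    assert(p != NULL);
    memcpy(&v, (const unsigned char *)p + byte_offset, sizeof v);
    return v;
}

/* Byte offset actually reached by "(DWORD *)p + (byte_offset / 4)". */
size_t dword_index_to_bytes(size_t byte_offset)
{
    return (byte_offset / 4) * sizeof(uint32_t);
}
```

Reading fields through a proper structure definition (or a byte offset, as in read_u32_at) avoids relying on integer truncation the way 0x8a / 4 does.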
In the function GetBytesByName, if the first byte of the procedure is a JMP instruction (here the near, relative jmp whose opcode is E9) and the procedure name is not in the false-positives list, the function decides that the procedure is hooked.
Suppose the 4 bytes pointed to by opcode have the value 0xca0e4be9. Left-shifting it by 24 gives 0xe9000000, and right-shifting that by 24 gives 0x000000e9, which is the value of the first byte at ptr (x86 is little-endian, so the lowest byte of the DWORD is the first byte in memory).
That procedure can be simplified as follows.
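VOID GetBytesByName(HANDLE hDll, CHAR *name, BOOL bNt) {
    FARPROC ptr = GetProcAddress((HMODULE)hDll, name);
    BYTE* opcode = (BYTE*)ptr;
    if(bNt) {
        /* skip everything that does not start with "Nt"; the original used &&,
           which also lets through any name that starts with 'N' or has 't'
           as its second character */
        if(name[0] != 'N' || name[1] != 't') {
            return;
        }
    }
    if(!IsFalsePositive(name) && *opcode == 0xe9) {
        printf("%s is hooked\n", name);
    }
}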
As a note: I can say that the code isn't written well and doesn't follow any good coding style.
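To make the shift explanation concrete, here is a small sketch showing that (x << 24) >> 24 on a 32-bit unsigned value is the same as masking with 0xFF, and that reading the first byte through a byte pointer gives the same answer without any shifting. The function names are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Same expression as the original check: keep only the low 8 bits. */
uint32_t low_byte_by_shift(uint32_t value)
{
    return (value << 24) >> 24;
}

/* Equivalent and more conventional form. */
uint32_t low_byte_by_mask(uint32_t value)
{
    return value & 0xFFu;
}

/* First byte in memory at p, independent of the machine's byte order. */
uint8_t first_byte(const void *p)
{
    assert(p != 0);
    return *(const uint8_t *)p;
}
```

The shift works on x86 only because the low byte of the DWORD is the first byte in memory; first_byte() does not depend on that.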
I'm working on an Android rom for a mobile phone and I want to make the kernel load the wifi MAC address from the device's NV partition. My code looks like this:
#include <linux/kernel.h>
#include <linux/random.h>
#include <linux/syscalls.h>

#define ETHER_ADDR_LEN 6
#define FILE_WIFI_MACADDR "/dev/block/mmcblk0p7"

static int bcm_wifi_get_mac_addr(unsigned char *buf)
{
    int ret = 0;
    mm_segment_t oldfs;
    int i;
    int fp;
    int macbyte;
    int readlen = 0;
    uint rand_mac;
    static unsigned char mymac[ETHER_ADDR_LEN] = {0,};
    const unsigned char nullmac[ETHER_ADDR_LEN] = {0,};
    const unsigned char bcastmac[] = {0xFF,0xFF,0xFF,0xFF,0xFF,0xFF};

    if (buf == NULL)
        return -EAGAIN;

    memset(buf, 0x00, ETHER_ADDR_LEN);

    oldfs = get_fs();
    set_fs(get_ds());

    fp = sys_open(FILE_WIFI_MACADDR, O_RDONLY, 0);
    if (fp < 0) {
        pr_err("%s: Failed to read error %d for %s\n",
            __FUNCTION__, fp, FILE_WIFI_MACADDR);
        goto random_mac;
    }

    for (i = 0; i<12; i++) {
        macbyte=0;
        sys_lseek( fp,i+7680,SEEK_SET);
        readlen = sys_read(fp,&macbyte,1);
        if (i)
            sprintf(macaddr,"%s%c",macaddr,macbyte);
        else
            sprintf(macaddr,"%c",macbyte);
    }

    if (readlen > 0) {
        unsigned char* macbin;
        macbin = (unsigned char*)macaddr;
        pr_info("%s: READ MAC ADDRESS %02X:%02X:%02X:%02X:%02X:%02X\n",
            __FUNCTION__,
            macbin[0], macbin[1], macbin[2],
            macbin[3], macbin[4], macbin[5]);

        if (memcmp(macbin, nullmac, ETHER_ADDR_LEN) == 0 ||
            memcmp(macbin, bcastmac, ETHER_ADDR_LEN) == 0) {
            sys_close(fp);
            goto random_mac;
        }
        memcpy(buf, macbin, ETHER_ADDR_LEN);
    } else {
        sys_close(fp);
        goto random_mac;
    }

    sys_close(fp);
    return ret;

random_mac:
    set_fs(oldfs);
    pr_debug("%s: %p\n", __func__, buf);

    if (memcmp( mymac, nullmac, ETHER_ADDR_LEN) != 0) {
        /* Mac displayed from UI is never updated..
           So, mac obtained on initial time is used */
        memcpy(buf, mymac, ETHER_ADDR_LEN);
        return 0;
    }

    srandom32((uint)jiffies);
    rand_mac = random32();
    buf[0] = 0x00;
    buf[1] = 0x90;
    buf[2] = 0x4c;
    buf[3] = (unsigned char)rand_mac;
    buf[4] = (unsigned char)(rand_mac >> 8);
    buf[5] = (unsigned char)(rand_mac >> 16);

    memcpy(mymac, buf, 6);

    pr_info("[%s] Exiting. MAC %02X:%02X:%02X:%02X:%02X:%02X\n",
        __FUNCTION__,
        buf[0], buf[1], buf[2], buf[3], buf[4], buf[5] );

    return 0;
}
The idea is to load the NV partition, located at /dev/block/mmcblk0p7, then read the MAC address, which is located at offset 7680 on the NV. The problem is that the MAC address is written in hex, so I'm trying to print it to an ASCII string using sprintf().
for (i = 0; i<12; i++) {
    macbyte=0;
    sys_lseek( fp,i+7680,SEEK_SET);
    readlen = sys_read(fp,&macbyte,1);
    if (i)
        sprintf(macaddr,"%s%c",macaddr,macbyte);
    else
        sprintf(macaddr,"%c",macbyte);
}
In the nv the MAC looks something like this: 34 30 42 30 46 41 36 35 39 33 34 39, which in ASCII is 40B0FA659349. But instead the resulting MAC is 34:30:42:30:46:41, which tells me that the hex values are not getting converted at all.
What would be the proper way to export the hex values into an ASCII string? I'm new to programming and I was hoping someone could give me some tips.
Thanks in advance.
In your loop you are reading single bytes and converting them to hex strings, while what you actually need to do is read the hex string and convert it to byte values. Unless you actually want a hex string, in which case no conversion is necessary.
You have 12 hex characters representing 6 bytes so:
#define MAC_LEN 6
uint8_t macbytes[MAC_LEN] ;
for( i = 0; i < MAC_LEN; i++ )
{
    char hex_str[3] = {0} ;   /* two hex digits plus the terminating NUL sscanf needs */
    unsigned byte_val = 0 ;
    sys_lseek( fp, (i*2) + 7680, SEEK_SET ) ;
    readlen = sys_read( fp, hex_str, 2 ) ;
    sscanf( hex_str, "%2X", &byte_val ) ;
    macbytes[i] = (uint8_t)byte_val ;
}
The data in NV is already ASCII coded hexadecimal; for example 0x34 is the ASCII code for the hex digit '4', and 0x30 that for '0', together the ASCII character pair "40" represent the single 8 bit integer value 0x40. So the conversion you need is ASCII to byte array, not "hex to ASCII" (which makes no semantic sense).
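The ASCII-to-byte conversion described above can also be done without sscanf. The following is a plain userspace sketch, not kernel code, and the names hex_digit_value and hex_to_bytes are made up for the illustration; it assumes the 12 hex characters have already been read into a buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Value of one ASCII hex digit, or -1 if c is not a hex digit. */
int hex_digit_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Convert 2*n ASCII hex digits in hex[] into n bytes in out[].
   Returns 0 on success, -1 if a non-hex character is found. */
int hex_to_bytes(const char *hex, uint8_t *out, size_t n)
{
    size_t i;
    assert(hex != NULL && out != NULL);
    for (i = 0; i < n; i++) {
        int hi = hex_digit_value(hex[2 * i]);      /* high nibble */
        int lo = hex_digit_value(hex[2 * i + 1]);  /* low nibble */
        if (hi < 0 || lo < 0)
            return -1;
        out[i] = (uint8_t)((hi << 4) | lo);
    }
    return 0;
}
```

With the example from the question, hex_to_bytes("40B0FA659349", mac, 6) fills mac with 0x40, 0xB0, 0xFA, 0x65, 0x93, 0x49, which can then be copied into buf.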
I think this is OP's stumbling block: forming a string version of the MAC address.
I'll make this wiki for anyone to modify, borrow or steal.
sys_lseek(fp, 7680, SEEK_SET);   /* seek once; i is not set yet at this point */
char macaddr[100];
char *p = macaddr;
const char *sep = "";
for (i = 0; i < 12; i++) {
    unsigned char macbyte;
    int readlen = sys_read(fp, &macbyte, 1);
    if (readlen != 1) Handle_Error();
    p += sprintf(p, "%s%02X", sep, macbyte);
    sep = ":";
}
puts(macaddr);
I'm trying to free a malloc'd buffer I made for a string, but free() gives me an error.
As I see it, the value of the pointer doesn't change, and both arrays are malloc'd. So it should be possible to free them?
I can't think of what I have done wrong.
Here is the code:
/* dump
 * this function dumps the entry array to the command line
 * */
void dump(PasswordEntry * entries, int numLines) {
    int index = 0;
    unsigned char *hexSalt = malloc(SALT_HEX_LENGTH+1), *hexHash = malloc(MAX_HASH_LEN+1); /* pointers for salt and hash, because we need them in hex instead of byte */
    while (index < numLines) { /* go through every line */
        /* gets us the salt in hex */
        toHexBinary(hexSalt, entries[index].salt, SALT_HEX_LENGTH);
        /* gets us the hash in hex, with length according to set algorithm */
        toHexBinary(hexHash, entries[index].hash, (entries[index].algorithm == HASH_ALG_SHA1)?SHA1_HEX_LENGTH:SHA2_HEX_LENGTH);
        /* prints one line to command line */
        printf("%s: %s = %s (%s/%s)\n", entries[index].username, hexHash, (entries[index].password == NULL)?"???":entries[index].password, (entries[index].algorithm == HASH_ALG_SHA1)?"SHA1":"SHA2", hexSalt);
        index++;
    }
    /* don't need these anymore, we can free them */
    free(hexSalt);
    free(hexHash);
}

/* takes a string in binary and returns it in hex (properly escaped) */
unsigned char * toHexBinary(unsigned char * to, unsigned char * from, int length) {
    unsigned char c = '0';
    int second = 0, first = 0;
    if (to == NULL) { /* if to is null, we need to allocate it */
        to = malloc(length+1);
    }
    to[length] = '\0';
    while (length-- > 0) { /* go through the string, starting at the end */
        length--; /* we always need to read two characters */
        c = from[length/2];
        second = c % 16;
        first = (c - second) / 16;
        to[length] = toHex(first);
        to[length+1] = toHex(second);
    }
    return to;
}

/* takes a numeric value and returns its hex representation */
char toHex(int c) {
    if (c < 10) return (char)(NUMBER_BEGIN + c); /* if it is under 10, we get the appropriate digit */
    else return (char)(UPPER_BEGIN + (c - 10)); /* if it is 10 or over, we get the appropriate UPPERCASE character */
}
Here is the output of gdb:
Starting program: /crack -b ./hashes.txt 1 2
Breakpoint 1, dump (entries=0x604700, numLines=9) at crack.c:435
435 unsigned char *hexSalt = malloc(SALT_HEX_LENGTH+1), *hexHash = malloc(MAX_HASH_LEN+1); /* pointers for salt and hash, because we need them in hex instead of byte */
(gdb) next
437 while (index < numLines) { /* go through every line */
(gdb) p hexSalt
$1 = (unsigned char *) 0x604390 ""
(gdb) p hexHash
$2 = (unsigned char *) 0x604510 ""
(gdb) continue
Continuing.
Breakpoint 2, dump (entries=0x604700, numLines=9) at crack.c:449
449 free(hexSalt);
(gdb) p hexSalt
$3 = (unsigned char *) 0x604390 "1234567890FEDCBA0000"
(gdb) p hexHash
$4 = (unsigned char *) 0x604510 "05F770BDD6D78ED930A9B6B9A1F22776F13940B908679308C811978CD570E057"
(gdb) next
450 free(hexHash);
(gdb) next
*** Error in `/crack': free(): invalid next size (fast): 0x0000000000604510 ***
Program received signal SIGABRT, Aborted.
0x00007ffff7602267 in __GI_raise (sig=sig@entry=6)
at ../sysdeps/unix/sysv/linux/raise.c:55
55 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
toHexBinary(hexHash, entries[index].hash, (entries[index].algorithm == HASH_ALG_SHA1)?SHA1_HEX_LENGTH:SHA2_HEX_LENGTH);
You allocate only MAX_HASH_LEN+1 bytes for hexHash, but you are passing SHA1_HEX_LENGTH or SHA2_HEX_LENGTH as the length.
If either of these values is greater than MAX_HASH_LEN, you have a problem, since toHexBinary() writes to[length] and the bytes before it, which runs past the end of the block you allocated and corrupts the heap. This is probably what happens. You can't pass a value that's greater than MAX_HASH_LEN.
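One way to keep this kind of overflow from going unnoticed is to pass the destination capacity along with the pointer and refuse to write when it is too small. A minimal sketch follows; bytes_to_hex is a hypothetical replacement written for this example, not the OP's toHexBinary, and it takes the number of input bytes rather than the number of hex digits.

```c
#include <assert.h>
#include <stddef.h>

/* Write the hex form of from[0..n-1] into to, which has room for to_size bytes.
   Returns 0 on success, -1 if to cannot hold 2*n digits plus the NUL. */
int bytes_to_hex(const unsigned char *from, size_t n, char *to, size_t to_size)
{
    static const char digits[] = "0123456789ABCDEF";
    size_t i;
    assert(from != NULL && to != NULL);
    if (to_size < 2 * n + 1)
        return -1;                  /* would overflow the destination */
    for (i = 0; i < n; i++) {
        to[2 * i]     = digits[from[i] >> 4];    /* high nibble */
        to[2 * i + 1] = digits[from[i] & 0x0F];  /* low nibble */
    }
    to[2 * n] = '\0';
    return 0;
}
```

In dump(), the buffer for the hash would then be allocated with the larger of the two hex lengths plus one, and that same size passed as to_size.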
I encountered a similar error: "free(): invalid next size" together with "../sysdeps/unix/sysv/linux/raise.c: No such file or directory.".
I was running a peripheral init job, followed by ROS init and some other jobs.
The peripheral init job runs fine on its own, and so does the ROS init job. But when I run them together, this error is always reported.
Finally I found that this was a memory problem: in a malloc() call I had missed a * in sizeof(), so the size of the allocated memory was not correct.
Posting this for anyone in the same boat.
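The missing * in sizeof() mentioned above usually looks like malloc(sizeof(p)) where malloc(sizeof(*p)) was meant: the first allocates only the size of a pointer. A small sketch of the correct form, with a made-up struct record purely for illustration:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical record type, only for illustration. */
struct record {
    char payload[64];
    int  id;
};

/* The struct is much larger than a pointer to it, which is why
   sizeof(r) instead of sizeof(*r) under-allocates. */
static_assert(sizeof(struct record) > sizeof(struct record *),
              "record must be larger than a pointer");

/* Allocate one zeroed record. Note sizeof *r (the struct),
   not sizeof r (the pointer, typically 4 or 8 bytes). */
struct record *record_new(int id)
{
    struct record *r = calloc(1, sizeof *r);
    if (r != NULL)
        r->id = id;
    return r;
}
```

Writing past an under-sized block is what damages the allocator's bookkeeping and later surfaces as "invalid next size" in free().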
Suppose I know the Unicode code points of these two Chinese characters, 你好, and have them in str.
How can I convert the code points in this char * str to Chinese characters and assign them to wchar_t * wstr?
char * str = "4F60 597D";
wchar_t * wstr;
I know that I can directly assign like this and problem solved.
wchar_t * wstr = L"\u4F60\u597D";
But my problem is more complicated than that, my situation does not allow that.
How can I do the conversion from literal codepoint to wchar_t * ?
Thanks.
I am using MS Visual C with the character set set to MBCS; assume that I cannot use the UNICODE character set.
UPDATE:
Sorry, I just corrected wchar_t wstr to wchar_t * wstr.
UPDATE:
The char * str contains a sequence of UTF-8 code units for the two Chinese characters 你好:
char * str = "\xE4\xBD\xA0\xE5\xA5\xBD";
size_t len = strlen(str) + 1;
wchar_t * wstr = new wchar_t[len];
size_t convertedSize = 0;
_locale_t local = _create_locale( LC_ALL , "Chinese");
_mbstowcs_s_l(&convertedSize, wstr, len, str, _TRUNCATE, local);
MessageBoxW( NULL, wstr , (LPCWSTR)L"Hello", MB_OK);
Why is the MessageBox printing Japanese characters instead of Chinese ones? What is the right locale name to use?
I can think about this function:
#define GetValFromHex(x) ((x) > '9' ? (x) - 'A' + 10 : (x) - '0')

wchar_t GetChineesChar(const char* strInput)
{
    // strInput is assumed to look like \xHH\xLL: HH is the high byte, LL the low byte
    unsigned high = GetValFromHex(strInput[2]) * 16 + GetValFromHex(strInput[3]);
    unsigned low  = GetValFromHex(strInput[6]) * 16 + GetValFromHex(strInput[7]);
    // combine by value so the result does not depend on the machine's byte order
    return (wchar_t)((high << 8) | low);
}

wchar_t* GetChineesString(const char* strInput)
{
    size_t len = strlen(strInput) / 8;
    wchar_t* returnVal = new wchar_t[len + 1];   // +1 for the terminating L'\0'
    for (size_t i = 0; i < len; i++)
    {
        returnVal[i] = GetChineesChar(&strInput[i*8]);
    }
    returnVal[len] = L'\0';
    return returnVal;
}
Then you should just call GetChineesString(). Of course, you can add more validation to check that the first two characters are \x and that the fifth and sixth characters are \x too before moving forward. This is a starting point for more robust code; it is not robust and not tested.
Edit:
I am assuming all hex values are Upper Case.
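For the input format given in the question ("4F60 597D", hex code points separated by spaces), a plain C sketch using strtoul could look like the following. The function name codepoints_to_wcs is made up for this example, and it assumes every code point fits in a single wchar_t; with a 16-bit wchar_t, as on Windows, that means code points up to U+FFFF only.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/* Parse space-separated hex code points (e.g. "4F60 597D") into a newly
   allocated, NUL-terminated wchar_t string. Returns NULL on bad input or
   allocation failure. The caller frees the result with free(). */
wchar_t *codepoints_to_wcs(const char *text)
{
    size_t count = 0, i;
    const char *p;
    char *end;
    wchar_t *out;

    assert(text != NULL);

    /* First pass: count the numbers so the buffer can be sized exactly. */
    for (p = text; ; p = end) {
        (void)strtoul(p, &end, 16);
        if (end == p)
            break;                  /* nothing more could be parsed */
        count++;
    }
    while (*p == ' ')
        p++;                        /* allow trailing spaces */
    if (*p != '\0')
        return NULL;                /* stopped on something that is not hex */

    out = malloc((count + 1) * sizeof *out);
    if (out == NULL)
        return NULL;

    /* Second pass: convert each number and store it as one wide character. */
    p = text;
    for (i = 0; i < count; i++) {
        out[i] = (wchar_t)strtoul(p, &end, 16);
        p = end;
    }
    out[count] = L'\0';
    return out;
}
```

Usage would be along the lines of wchar_t *wstr = codepoints_to_wcs(str); followed by free(wstr) once the string is no longer needed.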
I tried to memcpy measure_msg (struct test) to a buffer. However, the code below doesn't seem to copy the data. The values printed are:
**** ptr:0xb781c238
**** ptr:0xb781c23c
**** ptr:0xb781c244
buff[0]=5 - buff[1]=0 - buff[2]=0 - buff[3]=0 - buff[4]=W - buff[5]= - buff[6]= - buff[7]= - buff[8]= - buff[9]= - buff[10]= - buff[11]= -
What has gone wrong in this chunk of code?
struct test{
    int mode;
    int id;
};

int func()
{
    int i, size;
    struct test measure_msg;
    char buff[20];
    char* ptr;

    memset(&measure_msg, 0x00, sizeof(struct test));
    ptr = buff;
    fprintf(stderr, "**** ptr:%p\n", ptr);
    sprintf(ptr, "%02d%02d", 50, 0);
    ptr += 4;
    size = 4;
    size += sizeof(struct test);
    fprintf(stderr, "**** ptr:%p\n", ptr);
    measure_msg.id = 9999;
    measure_msg.mode = 1111;
    memcpy(ptr, &measure_msg, sizeof(struct test));
    ptr += sizeof(struct test);
    fprintf(stderr, "**** ptr:%p\n", ptr);
    for (i=0; i<size; i++){
        fprintf(stderr, "buff[%d]=%c - ", i, buff[i]);
    }
    return 0;
}
You're doing something strange, but look at this:
sprintf(ptr, "%02d%02d", 50, 0);
This writes a string to your buffer, so buff now contains "5000". Please note that it does not contain the values 50 and 0 but their string representation!
When you then copy your struct into the buffer, its fields are stored as raw bytes, not as text, so printing them as characters shows whatever character each byte value happens to encode. Note that on this line:
fprintf(stderr, "buff[%d]=%c - ", i, buff[i]);
you print the content of the buffer as characters. '5' is stored as 0x35 (53 in decimal), so that is the content of the first byte of your buffer, and so on; most of the struct bytes that follow have no printable character.
If this is really what you want to do, your code is correct (but you're playing too much with pointers; is it just a test?). Otherwise it's really strange, and you're walking in the wrong direction for what you need.
When you memcpy your measure_msg to buff, you are copying int values. After that, you are printing char values. An int is composed of 4 bytes, which may have no printable representation: e.g. the int value 33752069, 0x02030405 in hex, has 4 bytes that, when printed as chars, give the char values 0x02, 0x03, 0x04 and 0x05.
Change your format specifier to print int values and cast each buff[i] to int, and your values will be printed.
fprintf(stderr, "buff[%d]=%d - ", i, (int)buff[i]);
The memcpy () call is working all right on my system (GCC/MinGW, Windows). You aren't getting the proper output because some of the "characters" getting copied into buff are non-printable.
Try
fprintf (stderr, "buff[%d]=%x - ", i, buff[i]);
instead.
The data will be stored as
buff [0] = 0x35 /* ASCII for '5' */
buff [1] = 0x30 /* ASCII for '0' */
buff [2] = 0x30
buff [3] = 0x30
buff [4] = 0x57 /* as 1111 is 0x00000457 in hex */
buff [5] = 0x04 /* stored in little endian convention */
buff [6] = 0x00 /* and here size of int = 4 */
buff [7] = 0x00
buff [8] = 0x0F /* as 9999 is 0x0000270F in hex */
buff [9] = 0x27
buff [10] = 0x00
buff [11] = 0x00
But what are you trying to do anyway, by copying a struct to an array of chars?
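If copying the struct into a char buffer is really what is wanted, the round trip and the hex dump can be sketched like this. The helper names pack_message, unpack_message and dump_hex are made up for this example; the byte values of the struct part depend on the machine's int size and byte order, so only the round trip is portable.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct test {
    int mode;
    int id;
};

/* Copy msg into buf after a 4-character text header; returns the number
   of bytes used, or 0 if buf is too small. */
size_t pack_message(const struct test *msg, unsigned char *buf, size_t buf_size)
{
    const size_t header_len = 4;
    assert(msg != NULL && buf != NULL);
    if (buf_size < header_len + sizeof *msg)
        return 0;
    memcpy(buf, "5000", header_len);            /* text header, no NUL */
    memcpy(buf + header_len, msg, sizeof *msg); /* raw struct bytes */
    return header_len + sizeof *msg;
}

/* Read the struct back out of a buffer produced by pack_message(). */
struct test unpack_message(const unsigned char *buf)
{
    struct test msg;
    assert(buf != NULL);
    memcpy(&msg, buf + 4, sizeof msg);
    return msg;
}

/* Print every byte as two hex digits so non-printable bytes stay visible. */
void dump_hex(const unsigned char *buf, size_t len, FILE *out)
{
    size_t i;
    for (i = 0; i < len; i++)
        fprintf(out, "buff[%zu]=%02x - ", i, (unsigned)buf[i]);
    fputc('\n', out);
}
```

Using unsigned char for the buffer avoids sign extension when a byte of 0x80 or above is printed with %x.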