I'm trying to disassemble an arm64 binary, and I want to know how I can reconstruct a structure the way it was laid out in the source. I don't mean the strings, but at least the values, arranged as they were in the code.
Take this example:
static struct mystruct cmn = {
{ 0xFF, 0x03, {0x98, 0x81, 0x03} },
{ 0x01, 0x01, {0x00} },
{ 0x02, 0x01, {0x00} },
{ 0x03, 0x01, {0x53} },
};
But in the binary this is hard to keep track of, and I sometimes make mistakes while reversing. So, is it possible to get the bytes arranged exactly like this in IDA Pro 7.2 or radare2?
https://del.dog/raw/fomukovata
ENVIRONMENT
radare2: radare2 4.2.0-git 23519 # linux-x86-64 git.4.1.1-84-g0c46c3e1e commit: 0c46c3e1e30bb272a5a05fc367d874af32b41fe4 build: 2020-01-08__09:49:0
system: Ubuntu 18.04.3 LTS
SOLUTION
As @David Hoelzer mentioned, you must first derive the format of the data in memory.
If you know the structure of the data, you can use two commands in radare2 to structure that area of memory.
Command 1: pf.name [0|cnt]fmt # Define a new named format
Command 2: Cf[?][-] [sz] [0|cnt][fmt] [a0 a1...] [@addr] # format memory (see pf?)
EXAMPLE
The following uses a structure close to what you provided.
user@host:~$ r2 /bin/ls
[0x1000011e8]> pf.mystruct [5]c[3]c[3]c[3]c
[0x1000011e8]> Cf 14 ? (mystruct)example
[0x1000011e8]> pd 1
;-- rip:
0x1000011e8 format ? (mystruct)example {
example :
struct<mystruct>
0x1000011e8 = [ 'U', 'H', '.', '.', 'A' ]
0x1000011ed = [ 'W', 'A', 'V' ]
0x1000011f0 = [ 'A', 'U', 'A' ]
0x1000011f3 = [ 'T', 'S', 'H' ]
} 14
[0x1000011e8]> q
user@host:~$
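Once the record layout is known, the raw bytes can also be turned back into C-initializer form outside the disassembler. The following is a minimal Python sketch, not a radare2 or IDA feature. It assumes a hypothetical layout of one ID byte, one count byte, and a fixed three-byte payload array per record. The initializer in the question suggests this layout, but it has to be confirmed against the binary. format_records and the sample blob are illustrative names and data.

```python
# Sketch: print fixed-size records as C initializers.
# ASSUMED layout (hypothetical): 1 byte id, 1 byte count, DATA_MAX payload bytes.
DATA_MAX = 3


def format_records(blob, data_max=DATA_MAX):
    """Return one C-initializer line per record found in blob."""
    rec_size = 2 + data_max
    lines = []
    for off in range(0, len(blob) - rec_size + 1, rec_size):
        reg, count = blob[off], blob[off + 1]
        # Only the first `count` payload bytes are meaningful; the rest is padding.
        payload = blob[off + 2 : off + 2 + min(count, data_max)]
        data = ", ".join(f"0x{b:02X}" for b in payload)
        lines.append(f"{{ 0x{reg:02X}, 0x{count:02X}, {{{data}}} }},")
    return lines


# Sample bytes as they would appear in memory for the struct in the question.
blob = bytes([0xFF, 0x03, 0x98, 0x81, 0x03,
              0x01, 0x01, 0x00, 0x00, 0x00,
              0x02, 0x01, 0x00, 0x00, 0x00,
              0x03, 0x01, 0x53, 0x00, 0x00])

for line in format_records(blob):
    print(line)
```

The bytes could be exported from radare2 or IDA and fed to the function in place of the sample blob.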
I am writing a program for the STM8L151G6 in IAR Embedded Workbench for STM8 (version 3.11.1).
I need to place the instruction JPF 0xf000 at address 0x008426.
This is what I do. In the C code:
__root static const uint8_t jpfat0x8426 [] @ "ENTRY_POINT" = {0xac, 0x00, 0xf0, 0x00}; // jpf 0xf000
In .icf file:
define region EntryPoint = [from 0x8426 to 0x842A];
define region VectorsRegion = [from 0x8000 size 0x80];
define region NearFuncCode = [from 0x8080 to 0xEF7F] - EntryPoint;
define region FarFuncCode = [from 0x8080 to 0xEF7F] - EntryPoint;
define region HugeFuncCode = [from 0x8080 to 0xEF7F] - EntryPoint;
place at start of EntryPoint { ro section ENTRY_POINT };
place in EntryPoint { };
The linker lays out the code as follows:
...
"A2": 0x80
INTVEC 0x008000 0x80 <Block>
.intvec const 0x008000 0x80 interrupt.o [4]
- 0x008080 0x80
"A3": 0x4
ENTRY_POINT const 0x008426 0x4 project51.o [1]
- 0x00842a 0x4
"P3-P5": 0x20cb
.near_func.text ro code 0x00842b 0x3a6 float.o [4]
.near_func.text ro code 0x0087d1 0x2a1 data_exchange.o [1]
.near_func.text ro code 0x008a72 0x1fa fuel_gauge.o [1]
...
This is correct. But the range [from 0x008080 to 0x008425] is empty, so the code is not compact.
I lose close to 1 KB, which is too much for an STM8 MCU. For example, the object float.o (size 0x3a6)
could be placed in this range, but the linker doesn't do this. Is there some way to tell the linker to pack the code more densely and fill the empty chunks with objects?
Thank you.
I have the following string stored:
16 bytes for the characters 1-F and 4 null bytes at the end,
e.g. 1234567890ABCDEF0000
unsigned char input[] = {0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37, 0x38, 0x39, 0x30, 0x41, 0x42, 0x43, 0x44, 0x45, 0x46, 0x00, 0x00, 0x00, 0x00};
How do I get the 10-byte binary value of this?
EDIT:
I'm trying to use the SHA1 function of the OpenSSL crypto library properly.
I have the task of reading a "salt" and a "password" from the command line.
Then I join them so that I have "salt" + "|" + "password".
If no salt is passed, the salt is just "\0\0\0\0\0\0\0\0\0\0", which is 10 bytes, right? But if a salt is passed, it could be "1234567890ABCDEF".
I then have to pad this on the right with null bytes so that I have 10 bytes in total. But "1234567890ABCDEF" is already 16 bytes, so I have to convert it. I don't know; I'm really struggling with the memory part in C.
The easiest way could be to:
Create a 0-initialized array of 10 bytes:
unsigned char salt[10] = { 0 };
then read in the hex digits bytewise with sscanf() (the cast is needed because input is an unsigned char array):
sscanf((const char *)input, "%02hhx%02hhx%02hhx%02hhx%02hhx%02hhx%02hhx%02hhx%02hhx%02hhx",
&salt[0], &salt[1], &salt[2], &salt[3], &salt[4],
&salt[5], &salt[6], &salt[7], &salt[8], &salt[9]);
This will convert as many bytes as needed; if only 6 hexdigits are given as salt, the first three bytes are filled and the rest remains 0.
This should do what you expect.
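To cross-check which bytes the sscanf() call above should leave in salt, the same conversion can be sketched in Python. This is only a verification aid under the assumption that the salt arrives as a string of hex digits; salt_from_hex and SALT_LEN are hypothetical names. Note that bytes.fromhex() raises ValueError for an odd number of hex digits.

```python
SALT_LEN = 10  # salt size in bytes, as stated in the question


def salt_from_hex(hex_digits, salt_len=SALT_LEN):
    """Convert hex digits to bytes, then truncate or zero-pad to salt_len."""
    raw = bytes.fromhex(hex_digits)[:salt_len]
    return raw.ljust(salt_len, b"\x00")


# 16 hex digits become 8 bytes, followed by 2 bytes of zero padding.
print(salt_from_hex("1234567890ABCDEF").hex())
# No salt given: 10 zero bytes.
print(salt_from_hex("").hex())
```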
Hey, I didn't get much from your example, but what you describe below, plus the constraints, could be solved like this. See the snippet.
If no salt is passed, the salt is just "\0\0\0\0\0\0\0\0\0\0" which is 10 bytes right? but if a salt is passed it could be "1234567890ABCDEF"
#include <stdio.h>
#include <string.h>

#define SALT_MAX_BYTES 10

int main(int argc, char *argv[]) {
    // Init the whole array with 0-s, which is the same value as '\0'
    char salt[SALT_MAX_BYTES] = {0};

    // Here get the input, now assuming ./a.out [salt]
    if (argc > 1) // The executable name is always passed
    {
        printf("Input: %s\n", argv[1]);
        // Assuming ASCII...
        // Assuming you want to use the ASCII value representation of input "42"
        // and not the number 42 ...
        strncpy(salt, argv[1], SALT_MAX_BYTES);
        // Note: from here on you must strictly handle salt as length terminated.
        // => input may have more than SALT_MAX_BYTES
    }
    else
    {
        puts("Usage: ...");
        return -1;
    }

    // Left aligned output, showing nothing for \0 bytes...
    printf("Entered salt is : <%-*.*s>\n", SALT_MAX_BYTES, SALT_MAX_BYTES, salt);
    return 0;
}
I am building a Single Page Application for Arduino. It graphically displays analog pin values on a wifi connected tablet.
I have built the sketch but want to clean it up. I have been able to upload a sketch to my Arduino (Uno WiFi Rev 2), initialize the WiFi, and connect to it with a tablet. I am able to send the static page "frame" to the tablet.
That static frame is able to request and receive Arduino analog pin values using the XMLHttpRequest object.
But sending the bulky static page is clunky. Tutorials do stuff like,
client.println("<html><body>");
client.println("Hello World!");
client.println("</body></html>");
I tried to get slick and create a FileText.h header file:
#define constFileText \
"<html><body>" \
"Hello World!" \
"</body></html>"
and combine that with:
#include "FileText.h"
client.println(constFileText);
What I would like to do is create a standard FileText.html:
<html><body>
Hello World!
</body></html>
And process it with something like:
ifstream hFile ("FileText.html");
while (getline(hFile, strLine))
client.println(strLine);
That would make it much easier to edit the html file. It would eliminate the waste of including all those serial.println calls. It would also eliminate the maximum length constraint on constant values.
Is there any way to provide a text file to the Arduino compiler and have the Arduino Server send it to the Arduino's client?
C++ has 'raw string literals'. You can put a constant string, without escaping special characters, into the source code between an opening and a closing 'tag'. You can choose the tag to be something that does not occur in the raw string. In the following example the tag is =====.
const char* s1 = R"=====(Hello
"World")=====";
is the same as
const char* s2 = "Hello\n\"World\"";
This way you can put your large strings into separate .h files and include them. On AVR use PROGMEM to save RAM.
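If the HTML should stay in its own file, the wrapping into a raw string literal can be done by a small build step. The following Python sketch is one possible way to do that, not part of the Arduino toolchain; raw_literal_header, the variable name page, and the output file name are hypothetical. It refuses text that contains the closing delimiter, because that would end the literal early.

```python
def raw_literal_header(var_name, text, tag="====="):
    """Return C++ source that defines var_name as a raw string literal."""
    closing = ")" + tag + '"'
    if closing in text:
        raise ValueError("delimiter tag occurs in the text; pick another tag")
    return f'const char {var_name}[] = R"{tag}({text}){tag}";\n'


html = "<html><body>\nHello World!\n</body></html>"
header = raw_literal_header("page", html)
print(header)

# To use it in a sketch, write the result next to the .ino file, for example:
# with open("FileText.h", "w") as f:
#     f.write(header)
```

The sketch would then #include the generated header and pass page to client.println().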
You can use the xxd tool to generate an include file from your HTML. For example, given a file test.html:
<html><body>
Hello World!
</body></html>
Using xxd -i test.html > test_html.h results in test_html.h containing:
unsigned char test_html[] = {
0x3c, 0x68, 0x74, 0x6d, 0x6c, 0x3e, 0x3c, 0x62, 0x6f, 0x64, 0x79, 0x3e,
0x0d, 0x0a, 0x48, 0x65, 0x6c, 0x6c, 0x6f, 0x20, 0x57, 0x6f, 0x72, 0x6c,
0x64, 0x21, 0x0d, 0x0a, 0x3c, 0x2f, 0x62, 0x6f, 0x64, 0x79, 0x3e, 0x3c,
0x2f, 0x68, 0x74, 0x6d, 0x6c, 0x3e
};
unsigned int test_html_len = 42;
You can then #include "test_html.h" in your sketch and pass the array to client.print(). This gets round the limits on the size of a string. Unfortunately you lose the ability to loop through the array line by line, so you would have to write a function to do that yourself if required.
xxd is a *nix tool, but there are Windows ports if you need one.
Armed with @jfowkes' format (but not @Juraj's), I created a Python program. It works on both Windows and Linux.
To use it, place all text files (for me, .html and .js) in a subdirectory of the sketch. Then, from the sketch folder, run python TextEncode.py "SubdirectoryName". Add an #include "SubdirectoryName.h" line at the beginning of the sketch. That header file includes a function, void SendPage(WiFiClient hClient), which sends the contents of the files in the subdirectory to the client; call it when appropriate.
(It does send the files in alphabetical order, so I precede the files with numbers such as "F210". I think of files as modules. By having many modules like this, I can disable modules by selectively commenting out code. I actually have two development modules [a .js and a .html] and one production module [a .js]; I have a copy of the SendPage function in the main sketch. By selectively commenting out code, I can choose whether or not I want to see the results of the XMLHttpRequest function calls.)
I know that this is a lot more complex than either of the other proposed solutions, but it helps the development cycle: (1) Edit html/js code in my favorite IDE (2) run the python program (3) compile the sketch.
Here's the contents of my TextEncode.py:
# program to convert text files to file with constant array of ascii code of file characters
# converted file is to be used by Arduino compiler to efficiently send html/js code to Arduino
# Usage:
# 1) place files to be encoded into subfolder, "ClientHtml"
# 2) from console, 'python TextEncode.py "ClientHtml"'.
#
import os
import sys
import binascii

c_nCharsPerLine = 16
strFolderIn = sys.argv[1]
astrPseudos = [] # array of file pseudonyms. to be used to create inclusive [ClientHtml].h
if len(sys.argv) > 2:
    strClientHandle = sys.argv[2]
else:
    strClientHandle = "hClient"

for strFileIn in os.listdir(strFolderIn):
    # encode each file in subdirectory
    # it is easier to re encode every file than it is to check timestamps to re encode only updated files
    strFilePseudo = strFileIn.replace (".", "_") # to be used in name of encoded file and name of variable with contents of file.
    astrPseudos.append(strFilePseudo)
    strContents = ""; # contents read from file itself, in pairs of hex digits
    with open(strFolderIn + "/" + strFileIn, "r") as fileIn:
        nChar = 0;
        for strLineIn in fileIn:
            for chIn in strLineIn:
                # strContents = strContents + chIn.encode("hex") + "," # works on Linux
                strContents = strContents + hex(ord(chIn)) + ","
                nChar += 1
                if nChar % c_nCharsPerLine == 0:
                    strContents += "\n"
    # truncate trailing \n, if it exists
    if nChar % c_nCharsPerLine == 0:
        strContents = strContents[:-1]
    strContents += "0\n"
    with open (strFilePseudo + ".h", "w") as fileOut:
        fileOut.write("const unsigned char c_" + strFilePseudo + "[] = {\n")
        # fileOut.write("unsigned char c_" + strFilePseudo + "[] = {\n")
        fileOut.write(strContents)
        fileOut.write("};\n")

with open (strFolderIn + ".h", "w") as fileOut:
    fileOut.write("// .h files with encoded files to be included:\n")
    astrPseudos.sort()
    for strFilePseudo in astrPseudos:
        fileOut.write("#include \"" + strFilePseudo + ".h\"\n")
    fileOut.write("/*\n")
    fileOut.write("// Arduino Compiler function to send encoded files to web client:\n")
    fileOut.write("// Comment these out if you don't want to use the functionality\n")
    fileOut.write("void SendPage(WiFiClient " + strClientHandle + ")\n")
    fileOut.write("{\n")
    fileOut.write(" String strData;\n")
    for strFilePseudo in astrPseudos:
        fileOut.write(" strData=c_" + strFilePseudo + ";\n")
        fileOut.write(" " + strClientHandle + ".println(strData);\n")
    fileOut.write("}\n")
    fileOut.write("*/\n")
I need a function which can calculate the length of an x86-64 instruction.
For example, it would be usable like so:
char ret[] = { 0xc3 };
size_t length = instructionLength(ret);
length would be set to 1 in this example.
I do not want to include an entire disassembly library, since the only information I require is the length of the instruction.
I am looking for a minimalist approach, written in C, and ideally as small as possible.
100% complete x86-64 instruction set is not strictly necessary (very obscure ones such as vector register set instructions can be omitted).
A similar answer to what I am looking for (but for the wrong architecture):
Get size of assembly instructions
There is the XED library from Intel for working with x86/x86_64 instructions: https://github.com/intelxed/xed. It is Intel's own encoder/decoder, which makes it the most reliable way to work with Intel machine code.
The xed_decode function provides all information about an instruction: https://intelxed.github.io/ref-manual/group__DEC.html
https://intelxed.github.io/ref-manual/group__DEC.html#ga9a27c2bb97caf98a6024567b261d0652
And xed_ild_decode is for instruction length decoding:
https://intelxed.github.io/ref-manual/group__DEC.html#ga4bef6152f61997a47c4e0fe4327a3254
XED_DLL_EXPORT xed_error_enum_t xed_ild_decode ( xed_decoded_inst_t * xedd,
const xed_uint8_t * itext,
const unsigned int bytes
)
This function just does instruction length decoding.
It does not return a fully decoded instruction.
Parameters
xedd the decoded instruction of type xed_decoded_inst_t . Mode/state sent in via xedd; See the xed_state_t .
itext the pointer to the array of instruction text bytes
bytes the length of the itext input array. 1 to 15 bytes, anything more is ignored.
Returns:
xed_error_enum_t indicating success (XED_ERROR_NONE) or
failure. Only two failure codes are valid for this function:
XED_ERROR_BUFFER_TOO_SHORT and XED_ERROR_GENERAL_ERROR. In general
this function cannot tell if the instruction is valid or not. For
valid instructions, XED can figure out if enough bytes were provided
to decode the instruction. If not enough were provided, XED returns
XED_ERROR_BUFFER_TOO_SHORT. From this function, the
XED_ERROR_GENERAL_ERROR is an indication that XED could not decode the
instruction's length because the instruction was so invalid that even
its length may vary across implementations.
To get length from xedd filled by xed_ild_decode, use xed_decoded_inst_get_length: https://intelxed.github.io/ref-manual/group__DEC.html#gad1051f7b86c94d5670f684a6ea79fcdf
static XED_INLINE xed_uint_t xed_decoded_inst_get_length ( const xed_decoded_inst_t * p )
Return the length of the decoded instruction in bytes.
Example code ("Apache License, Version 2.0", by Intel 2016): https://github.com/intelxed/xed/blob/master/examples/xed-ex-ild.c
#include "xed/xed-interface.h"
#include <stdio.h>

int main()
{
    xed_bool_t long_mode = 1;
    xed_decoded_inst_t xedd;
    xed_state_t dstate;
    unsigned char itext[15] = { 0xf2, 0x2e, 0x4f, 0x0F, 0x85, 0x99,
                                0x00, 0x00, 0x00 };

    xed_tables_init(); // one time per process

    if (long_mode)
        dstate.mmode=XED_MACHINE_MODE_LONG_64;
    else
        dstate.mmode=XED_MACHINE_MODE_LEGACY_32;

    xed_decoded_inst_zero_set_mode(&xedd, &dstate);
    xed_ild_decode(&xedd, itext, XED_MAX_INSTRUCTION_BYTES);
    printf("length = %u\n",xed_decoded_inst_get_length(&xedd));
    return 0;
}
If you're on Windows, you can just use IDebugControl::Disassemble(..., &end_address) from dbgeng.dll. See this question for example usage.
I have an strace of my program that interacts with USB, and I am wondering what the following write command tells me. I understand that each writev iovec structure consists of the data array pointer followed by the length, but what does the "@\10\335 \320\2w\4\240K\252\0\7" in the data array denote? I'm particularly wondering what the @ symbol, 2w, and 240K mean, as those are not hex data values as I would expect them to be.
I'm running on Linux and here is the writev line:
writev(6, [{"@\10\335 \320\2w\4\240K\252\0\7", 13}, {"\0\0\0\4\0\0\0\4", 8}], 2) = 21
From the man page of writev:
ssize_t writev(int fd, const struct iovec *iov, int iovcnt);
That is, the second argument is an array of struct iovec elements, and the third argument gives the number of elements in it (2 in your case).
When strace prints those, it octal-escapes unprintable characters but displays all others exactly as they can be printed. Hence, @ is just the byte corresponding to @, K is the byte corresponding to K, and so on.
Answering your questions in the comment, another look at the man page shows
struct iovec {
    void *iov_base; /* Starting address */
    size_t iov_len; /* Number of bytes to transfer */
};
Which means that {"@\10\335 \320\2w\4\240K\252\0\7", 13} is to be read as iov_len = 13 and iov_base is a memory area containing the bytes printed as @\10\335 \320\2w\4\240K\252\0\7. Fire up gdb if you want to see the binary values:
[mihai@winterfell 1]$ gdb -q
(gdb) p/x "@\10\335 \320\2w\4\240K\252\0\7"
$1 = {0x40, 0x8, 0xdd, 0x20, 0xd0, 0x2, 0x77, 0x4, 0xa0, 0x4b, 0xaa, 0x0, 0x7, 0x0}
Where the last 0x0 is the null terminator of the string and should be ignored.
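The same decoding can be done without gdb. The following Python sketch converts strace's escaped notation into raw bytes; strace_bytes is a hypothetical helper, and the escape table covers only the common C-style escapes, so treat it as an approximation of strace's output format. It is applied to the part of the first buffer after its first character, and to the second buffer, to show that "\2w" is the byte 0x02 followed by 'w' (0x77) and "\240K" is 0xa0 followed by 'K' (0x4b).

```python
import re

# Single-character escapes; any other escaped character maps to itself.
_SIMPLE = {"n": 0x0A, "t": 0x09, "r": 0x0D, "v": 0x0B, "f": 0x0C}

# Order matters: octal escape, hex escape, other escape, plain character.
_TOKEN = re.compile(r"\\([0-7]{1,3})|\\x([0-9a-fA-F]{1,2})|\\(.)|(.)", re.DOTALL)


def strace_bytes(text):
    """Decode a string as printed by strace into the bytes it stands for."""
    out = bytearray()
    for octal, hexa, simple, plain in _TOKEN.findall(text):
        if octal:
            out.append(int(octal, 8) & 0xFF)
        elif hexa:
            out.append(int(hexa, 16))
        elif simple:
            out.append(_SIMPLE.get(simple, ord(simple)))
        else:
            out.append(ord(plain))
    return bytes(out)


# Raw strings keep the backslashes exactly as strace printed them.
print(strace_bytes(r"\10\335 \320\2w\4\240K\252\0\7").hex(" "))
print(strace_bytes(r"\0\0\0\4\0\0\0\4").hex(" "))
```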