Compare data in bytes to a hex number - c

I need to read data from stdin where I need to check if 2 bytes are equal to 0x001 or 0x002 and then depending on the value execute one of two types of code. Here is what i tried:
uint16_t type;
uint16_t type1 = 0x0001;
while( ( read(0, &type, sizeof(type)) ) > 0 ) {
if (type == type1) {
//do smth
}
}
I'm not sure what numeric system the read uses since printing the value of type both as decimal and as hex returns something completely different even when i wrtie 0001 on stdin

The characters 0x0001 and 0x0002 are control characters in UTF-8, you wont be able to type them manually.
If you are trying to compare the character input - use gets and atoi to convert the character-based input of stdin into an integer, or if you want to continue using read - compare against the characters '0' '0' '0' '1'/'2' instead of the values to get proper results

You didn't say what kind of OS you're using. If you're using Unix or Linux or MacOS, at the command line, it can be pretty easy to rig up simple ways of testing this sort of program. If you're using Windows, though, I'm sorry, my answer here is not going to be very useful.
(I'm also going to be making heavy use of the Unix/Linux
"pipe" concept, that is, the vertical bar | which sends the output of one program to the input of the next. If you're not familiar with this concept, my answer may not make much sense at first.)
If it were me, after compiling your program to a.out, I would just run
echo 00 00 00 01 00 02 | unhex | a.out
In fact, I did compile your program to a.out, and tried this, and... it didn't work. It didn't work because my machine (like yours) uses little-endian byte order, so what I should have done was
echo 00 00 01 00 02 00 | unhex | a.out
Now it works perfectly.
The only problem, as you've discovered if you tried it, is that unhex is not a standard program. It's one of my own little utilities in my personal bin directory. It's massively handy, and it's be super useful if something like it were standard, but it's not. (There's probably a semistandard Linux utility for doing this sort of thing, but I don't know what it is.)
What is pretty standard, and you probably can use it, is a base-64 decoder. The problem there is that there's no good way to hand-generate the input you need. But I can generate it for you (using my unhexer). If you can run
echo AAABAAIA | base64 -d | a.out
that will perform an equivalent test of your program. AAABAAIA is the base-64 encoding of the three 16-bit little-endian words 0000 0001 and 0002, as you can see by running
echo AAABAAIA | base64 -d | od -h
On some systems, the base64 command uses a -D option, rather than -d, to request decoding. That is, if you get an error message like "invalid option -- d", you can try
echo AAABAAIA | base64 -D | a.out

Related

How to check or change encoding in USLT (ID3)?

It's not a duplicate. I've read em all.
I have a Nokia-N8-00. It's music player supports USLT (UnSynchronised Lyrics/Text). I use a tool called spotdl (https://github.com/Ritiek/Spotify-Downloader) that fetches song titles from "spotify" and downloads them from other sources (generally youtube) and merges metadata as well.
The problem is then, the music downloaded by that tool have lyrics on all my devices except N8. Fortunately, I got a music that had embedded lyrics that is supported on my phone too. I then analyzed both the files and found that in binary sequence, they have a very little difference (just for USLT section but they are different songs). The differences are :-
The one that supports :
55 53 4C 54 00 00 0A 56 00 00 03 58 58 58
The one that doesn't :
55 53 4C 54 00 00 07 38 00 00 01 58 58 58
(These sequences are for "USLT" declaration in the file)
I think it's an encoding difference. If I am right, what encoding is present and in which one? If it's not encoding, what is it?
I know these sequences can't elaborate the situation. So, here are the files I'm trying https://github.com/gaurav712/music.
I don't need supported USLT, I am just curious about it as I wanna make an implementation of it in C (I don't need language specific help though).
Here is what I got:
55 53 4C 54
Translates to:
USLT
So we got that right. Now, I believe we can merge that result with this answer:
Frame ID $xx xx xx xx (four characters)
Size $xx xx xx xx
Flags $xx xx
Encoding $xx
Text
(Taken from: ID3v2 Specification)
(or see this: https://web.archive.org/web/20161022105303/http://id3.org/id3v2-chapters-1.0)
Now, I couldn't get this from the source (because the site was down) but there is also this:
Encoding flag explanation:
• $00 ISO-8859-1 [ISO-8859-1]
• $01 UTF-16 [UTF-16]
• $02 UTF-16BE [UTF-16]
• $03 UTF-8 [UTF-8]
So, according to these findings (which I'm not too sure about), the one that is supported is UTF-8 encoded and the one not supported is UTF-16.
EDIT
I've downloaded and viewed your mp3 files for further inspection. Here are my new findings:
First of all, we were correct about the encodings:
UTF-8 is on supported:
UTF-16 is on unsupported:
Does this mean you can just turn '01' into '03' and it'll magically work? I doubt. It depends on the driver. What if the driver sees '\x00' bytes and iterprets it as end of string (as in end of USLT payload). To test this, you can try manually converting the encoding on the file (by removing extra bytes).
Secondly, running eyeD3 on linux on both files, I recovered that:
supported.mp3 -> ID3 v2.4
unsupported.mp3 -> ID3 v2.3
Perhaps that's an issue?
Also, note that the location of USLT tag in both files are different:
supported.mp3:
unsupported.mp3:
On linux, there are further tools to give you extra information, if need be:
mp3info, id3info, id3tool, exiftool, eyeD3, lltag
Are a couple examples. However, I think the main problem is in the text encoding. I was able to recover the lyrics quite fine using the above tools. But some of the tools give different answers because of ID3 version being different and so on.

Contiguous Hex file generation using GCC

I have a Hex file for STM32F427 that was built using GCC(gcc-arm-none-eabi) version 4.6 that had contiguous memory addresses. I wrote boot loader for loading that hex file and also added checksum capability to make sure Hex file is correct before starting the application.
Snippet of Hex file:
:1005C80018460AF02FFE07F5A64202F1D00207F5F9
:1005D8008E4303F1A803104640F6C821C2F2000179
:1005E8001A460BF053F907F5A64303F1D003184652
:1005F8000BF068F907F5A64303F1E80340F6FC1091
:10060800C2F2000019463BF087FF07F5A64303F145
:10061800E80318464FF47A710EF092FC07F5A643EA
:1006280003F1E80318460EF03DFC034607F5A64221
:1006380002F1E0021046194601F0F2FC07F56A5390
As you can see all the addresses are sequential. Then we changed the compiler to version 4.8 and i got the same type of Hex file.
But now we used compiler version 6.2 and the Hex file generated is not contiguous. It is somewhat like this:
:10016000B9BC0C08B9BC0C08B9BC0C08B9BC0C086B
:10017000B9BC0C08B9BC0C08B9BC0C08B9BC0C085B
:08018000B9BC0C08B9BC0C0865
:1001900081F0004102E000BF83F0004330B54FEA38
:1001A00041044FEA430594EA050F08BF90EA020FA5
As you can see after 0188 it is starting at 0190 means rest of 8 bytes(0189 to 018F) are 0xFF as they are not flashed.
Now boot loader is kind of dumb where we just pass the starting address and no of bytes to calculate the checksum.
Is there a way to make hex file in contiguous way as compiler 4.6 and compiler 4.8? the code is same in all the three times.
If post-processing the hex file is an option, you can consider using the IntelHex python library. This lets you manipulate hex file data (i.e. ignoring the 'markup'; record type, address, checksum etc) rather than as lines, will for instance create output with the correct line checksum.
A fast way to get this up and running could be to use the bundled convenience scripts hex2bin.py and bin2hex.py:
python hex2bin.py --pad=FF noncontiguous.hex tmp.bin
python bin2hex.py tmp.bin contiguous.hex
The first line converts the input file noncontiguous.hex to a binary file, padding it with FF where there is no data. The second line converts it the binary file back to a hex file.
The result would be
:08018000B9BC0C08B9BC0C0865
becomes
:10018000B9BC0C08B9BC0C08FFFFFFFFFFFFFFFF65
As you can see, padding bytes are added where the input doesn't have any data, equivalent to writing the input file to the device and reading it back out. Bytes that are in the input file are kept the same - and at the same address.
The checksum is also correct as changing the length byte from 0x08 to 0x10 compensates for the extra 0xFF bytes. If you padded with something else, IntelHex would output the correct checksum
You can skip the the creation of a temporary file by piping these: omit tmp.bin in the first line and replacing it with - in the second line:
python hex2bin.py --pad=FF noncontiguous.hex | python bin2hex.py - contiguous.hex
An alternative way could be to have a base file with all FF and use the hexmerge.py convenience script to merge gcc's output onto it with --overlap=replace
The longer, more flexible way, would be to implement your own tool using the IntelHex API. I've used this to good effect in situations similar to yours - tweak hex files to satisfy tools that are costly to change, but only handle hex files the way they were when the tool was written.
One of many possible ways:
Make your hex file with v6.2, e.g., foo.hex.
Postprocess it with this Perl oneliner:
perl -pe 'if(m/^:(..)(.*)$/) { my $rest=16-hex($1); $_ = ":10" . $2 . ("FF" x $rest) . "\n"; }' foo.hex > foo2.hex
Now foo2.hex will have all 16-byte lines
Note: all this does is FF-pad to 0x10 bytes. It doesn't check addresses or anything else.
Explanation
perl -pe '<some script>' <input file> runs <some script> for each line of <input file>, and prints the result. The script is:
if(m/^:(..)(.*)$/) { # grab the existing byte count into $1
my $rest=16 - hex($1); # how many bytes of 0xFF we need
$_ = ":10" . $2 . ("FF" x $rest) . "\n"; # make the new 16-byte line
# existing bytes-^^ ^^^^^^^^^^^^^^-pad bytes
}
Another solution is to change the linker script to ensure the preceding .isr_vector section ends on a 16 byte alignment, as the mapfile reveals that the following .text section is 16 byte aligned.
This will ensure there is no unprogrammed flash bytes between the two sections
You can use bincopy to fill all empty space with 0xff.
$ pip install bincopy
$ bincopy fill foo.hex
Use the -gap-fill option of objcopy, e.g.:
arm-none-eabi-objcopy --gap-fill 0xFF -O ihex firmware.elf firmware.hex

LPCXpresso error CreateProcess: No such file or directory

I know there are multiple questions regarding this subject, but they did not help.
When trying to compile, whatever, I keep getting the same error:
arm-none-eabi-gcc.exe: error: CreateProcess: No such file or directory
I guess it means that it can not find the compiler.
I have tried tweaking the path settings
C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\nxp\LPCXpresso_7.6.2
326\lpcxpresso\tools\bin;
Seems to be right?
I have tried using Sysinternals process monitor
I can see that a lot of arm-none-eabi-gcc.exe are getting a result of name not found but there are a lot of successful results too.
I have also tried reinstalling the compiler and the LPCXpresso, no luck.
If i type arm-none-eabi-gcc -v i get the version, so it means its working
but when i am trying to compile in CMD like this arm-none-eabi-gcc led.c
i get the same error as stated above
arm-none-eabi-gcc.exe: error: CreateProcess: No such file or directory
Tried playing around more with PATH in enviroments, no luck. I feel like something is stopping LPCXpresso from finding the compiler
The only Antivirus this computer has is Avira and i disabled it. I also allowed compiler and LPCXpresso through the firewall
I have tried some more things, i will add it shortly after trying to duplicate the test.
It seems your problem is a happy mess with Vista and GCC. Long story short, a CRT function, access, has a different behavior on Windows and Linux. This difference is actually mentioned on Microsoft documentation, but the GCC folks didn't notice. This leads to a bug on Vista because this version of Windows is more strict on this point.
This bug is mentioned here : https://gcc.gnu.org/bugzilla/show_bug.cgi?id=33281
I have no proof that your problem comes from here, but the chances are good.
The solutions :
Do not use Vista
Recompile arm-none-eabi-gcc.exe with the flag -D__USE_MINGW_ACCESS
Patch arm-none-eabi-gcc.exe
The 3rd is the easiest, but it's a bit tricky. The goal is to hijack the access function and add an instruction to prevent undesired behavior. To patch your gcc, you have two solutions : you upload your .exe and I patch it for you, or I give you the instructions to patch it yourself. (I can also patch it for you, then give the instructions if it works). The patching isn't really hard and doesn't require advanced knowledge, but you must be rigorous.
As I said, I don't have this problem myself, so I don't know if my solution really works. The patch seems to be working for this problem.
EDIT2:
The exact problem is that the linux access have a parameter flag to check whether a file is executable. The Windows access cannot check for this. The behavior of most Windows versions is just to ignore this flag, and check if the file exists instead, which will usually give the same behavior. The problem is that Vista doesn't ignore this, and whenever the access is used to check for executability, it will return an error. This lead to GCC programs to think that some executables are not here. The patch induced by -D__USE_MINGW_ACCESS, or done manually, is to delete the flag when access is called, thus checking for existence instead just like other Windows versions.
EDIT:
The patching is actually needed for every GCC program that invokes other executables, and not only gcc.exe. So far there is only gcc.exe and collect2.exe.
Here are the patching instruction :
Backup your arm-none-eabi-gcc.exe.
Download and install CFF Explorer (direct link here).
Open arm-none-eabi-gcc.exe with CFF Explorer.
On the left panel, click on Import Directory.
In the module list that appears, click on the msvcrt.dll line.
In the import list that appears, find _access. Be careful here, the list is long, and there are multiple _access entries. The last one (the very last entry for me) is probably the good one.
When you click on the _access line, an address should appear on the list header, in the 2nd column 2nd row, just below FTs(IAT). Write down that address on notepad (for me, it is 00180948, it may be different). I will refer to this address as F.
On the left panel, click on Address Converter.
Three fields should appear, enter address F in the File Offset field.
Write down on notepad a 6 bytes value : the first two bytes are FF 25, the last 4 are the address that appeared in the VA field, IN REVERSE. For example, if 00586548 appeared in the VA field, write down FF 25 48 65 58 00 (spaces added for legibility). I will refer to this value as J. This value J is the instruction that jumps to the _access function.
On the left panel, click on Section Headers.
In the section list that appeared on the right, click on the .text line (the .text section is where the code resides).
In the editor panel that appeared below, click on the magnifier and, in the Hex search bar, search for a series of 11 90 (9090909090..., 90 is NOP in assembly). This is to find a code cave (unused space) to insert the patch, which is 11 bytes long. Once you found a code cave, write down the offset of the first 90. The exact offset is displayed on the very bottom as Pos : xxxxxxxx. I will refer to this offset as C.
Use the editor to change the sequence of 11 90 : the first 5 bytes are 80 64 E4 08 06. These 5 bytes are the instruction that prevents the wrong behavior. The last 6 bytes are the value J (edit the next 6 bytes to J, ex. FF 25 48 65 58 00), to jump back to the _access function.
Click on the arrow icon (Go To Offset) a bit below, and enter 0, to navigate to the beginning of the file.
Use the Hex search bar again to search for value J. If you find the bytes you just modified, skip. The J value you need is located around many value containing FF 25 and 90 90. That is the DLL jump table. Write down the offset of the value J you found (offset of the first byte, FF). I will refer to this offset as S. Note 1: If you can't find the value, maybe you picked the wrong _access in step 6, or did something wrong between step 6 to 10. Note 2: The search bar doesn't loop when it hit the end; go to offset 0 manually to re-find.
Use a hexadecimal 32-bit 2-complement calculator (like this one : calc.penjee.com) to calculate C - S - 5. If your offset C is 8C0 and your offset S is 6D810, you must obtain FF F9 30 AB (8C0 minus 6D810, minus 5).
Replace the value J you found in the file (at step 16) by 5 bytes : the first byte is E9, the last 4 are the result of the last operation, IN REVERSE. If you obtained FF F9 30 AB, you must replace the value J (ex: FF 25 48 65 58 00) by E9 AB 30 F9 FF. The 6th byte of J can be left untouched. These 5 bytes are the jump to the patch.
File -> Save
Notes : You should have modified a total of 16 bytes. If the patched program crash, you did something wrong. Even if it doesn't work, this patch can't induce a crash.
Let me know if you have difficulties somewhere.

Simple C Program does not give Expected Output

I was writing a small program which was supposed to display a ☻ character to the screen. The program is listed below:
#include <stdio.h>
main()
{
printf("☻\n");
}
However, when I run this program, I get the output of
Γÿ║
Why am I getting this output, and what should I do to get the output I want?
You're getting that because whatever terminal program you're using isn't that compatible with some Unicode encodings.
For example, my Debian box compiles that fine and it actually prints out the smiley face, because gnome-terminal is a damn fine piece of software :-)
The fact that you're seeing three characters instead of one is a fairly good indication that it's outputting UTF-8. In fact, if I run that program on my Debian box and capture the binary output with od -xcb, I see:
0000000 98e2 0abb
342 230 273 \n
342 230 273 012
0000004
showing that it is coming out in UTF-8, it's just that gnome-terminal is smart enough to turn that back into the correct glyph.
Those bytes translate to binary as follows:
e2 98 bb
1110 0010 : 1001 1000 : 1011 1011
And, using this excellent answer here, stating that bit patterns starting with 10 are continuation bytes, we can decode it as follows:
U+000800-U+00ffff 1110yyyy yyyyyyyy xxxxxxxx
10yyyyxx
10xxxxxx
e2 98 bb
1110 0010 : 1001 1000 : 1011 1011
yyyy yy yyxx xx xxxx
Hence the code point is 0010 0110 : 0011 1011 which equates to 263b which, in a total lack of coincidence, is the black smiling face character.
In terms of fixing the problem of Windows not displaying Unicode correctly, as indicated by your comment:
I am on Windows Command Prompt. How should I make cmd.exe work with unicode?
You may want to look at this question, particularly the answer about using chcp to change the code page to 65001 (UTF-8). Note I haven't tested this, I provide it only as a pointer for you.
#include <fcntl.h>
_setmode(_fileno(stdout), _O_U16TEXT);
wprintf(L"☻\n");
valter

How Cobol dynamic call works using group as program identifier?

I have the following call statement :
038060 CALL PROG USING
038070 DFH
038080 L000
038090 ZONE-E
038100 ZONE-S.
This call is dynamic and use PROG.
PROG is a group defined as :
018630 01 XX00.
018640 10 PROG.
018650 15 XX00-S06 PICTURE X(6)
018660 VALUE SPACE.
018670 15 XX00-S02 PICTURE X(2)
018680 VALUE SPACE.
018690 10 XX00-S92 PICTURE 9(02)
018700 VALUE ZERO.
018710 10 XX00-S91 PICTURE 9(1)
018720 VALUE ZERO.
018730 10 XX00-S9Z PICTURE 9(1)
018740 VALUE ZERO.
018750 10 XX00-9B0 PICTURE X(05)
018760 VALUE SPACE.
018770 10 XX00-0B0 PICTURE X(02)
018780 VALUE SPACE.
018790 10 XX00-BB1 PICTURE X(01)
018800 VALUE SPACE.
018810 10 XX00-SFN PICTURE X(07)
I cut here but there is a lot of field after...
It seems that actual progname to use is stored in :
XX00-S06
and
XX00-S02
I've also other cases where the name is on 3 or 4 fields, and the progname length is not always 8.
So my question is how Cobol know where to pick the good program name in the group? What are the resolution rules?
Configuration : I use Microfocus Net Express compiler and the environment is UniKix.
Dynamic call rules in COBOL are fairly simple. Given something like:
CALL WS-NAME USING...
COBOL will resolve the program name currently stored in WS-NAME against the load module libraries
available to it based on
a linear search. The first matching load module entry point name that matches WS-NAME is used.
It doesn't matter how complex, or simple, the definition of WS-NAME is. The total length used for the name
is whatever the length of WS-NAME is. For example:
01 WS-NAME.
05 WS-NAME-FIRST-PART PIC X(3).
05 WS-NAME-MIDDLE-PART PIC X(2).
05 WS-NAME-LAST-PART PIC X(3).
WS-NAME is composed of 3 subordinate fields giving a total of 8 characters. You can populate these individually or just move
something into WS-NAME as a whole. If the length of WS-NAME is less than 8 characters, the trailing characters will be
set to spaces on any receiving field. For example:
01 WS-SHORT-NAME.
05 WS-SHORT-NAME-FIRST-PART PIC X(4) VALUE 'AAAA'.
05 WS-SHORT-NAME-LAST-PART PIC X(2) VALUE 'BB'.
Here WS-SHORT-NAME is only 6 characters long. MOVING WS-SHORT-NAME to any longer PIC X type variable as in:
MOVE WS-SHORT-NAME TO WS-NAME
Will result in WS-NAME taking on the value 'AAAABBbb' (note the two trailing spaces). During libary search
for a matching entry point name, the trailing spaces are not significant so on the CALL statement you could use
either:
CALL WS-NAME
or
CALL-WS-SHORT-NAME
And they will resolve to the same entry point.
I am not sure what the length rules are for MicroFocus COBOL but, for IBM z/os dynamically called
program names cannot exceed 8 characters (if they do, the name is truncated to 8 characters).
I will add little more to NeilB with specific information about Micro Focus COBOL.
fyi: PROGRAM-ID, ENTRY-POINTS are restricted to 30-31 characters (check your "System Limits and Programming Restrictions" section in the docs).

Resources