Good way to access mixed 8/16/32-bit words - arrays

I have a big lump of binary data in memory and I need to read/write from randomly accessed, byte-aligned addresses. However, sometimes I need to read/write 8-bit words, sometimes (big-endian) 16-bit words, and sometimes (big-endian) 32-bit ones.
There's the naïve solution of representing the data as a ByteArray and implementing 16/32-bit reads/writes by hand:
class Blob(val image: ByteArray, var ptr: Int = 0) {
    fun readWord8(): Byte = image[ptr++]

    fun readWord16(): Short {
        val hi = readWord8().toInt() and 0xff
        val lo = readWord8().toInt() and 0xff
        return ((hi shl 8) or lo).toShort()
    }

    fun readWord32(): Int {
        val hi = readWord16().toLong() and 0xffff
        val lo = readWord16().toLong() and 0xffff
        return ((hi shl 16) or lo).toInt()
    }
}
(and similarly for writeWord8/writeWord16/writeWord32).
Is there a better way to do this? It just seems so inefficient doing all this byte-shuffling when Java itself already uses big-endian representation inside...
To reiterate, I need both read and write access, random seeks, and 8/16/32-bit access to big-endian words.

You can use Java NIO ByteBuffer:
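import java.nio.ByteBuffer

val array = ByteArray(100)
val buffer = ByteBuffer.wrap(array)
val b = buffer.get()      // reads 1 byte and advances the position
val s = buffer.getShort() // reads 2 bytes
val i = buffer.getInt()   // reads 4 bytes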
The byte order of a newly created ByteBuffer is BIG_ENDIAN, but it can still be changed with the order(ByteOrder) function.
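Also, use ByteBuffer.allocate(size) and buffer.array() if you want to avoid creating a ByteArray explicitly. For the random access the question asks for, the absolute variants such as buffer.getShort(index) and buffer.putInt(index, value) read and write at a given offset without moving the position.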
More about ByteBuffer usage: see this question.


Convert two 16 Bit Registers to 32 Bit float value flutter

I am using a Modbus Flutter library. Reading 2 registers, I obtain:
[16177, 4660]. I need a function to convert this to a 32-bit float value: 0.7
I found this function:
ByteBuffer buffer = new Uint16List.fromList(fileBytes).buffer;
ByteData byteData = new ByteData.view(buffer);
double x = byteData.getFloat32(0);
It returns 0.000045747037802357227, so the bytes seem to be swapped.
Can you help?
I always find it easiest to start with the byte array and insert stuff into that. What's happening is that your code is defaulting to the native endianness (apparently little) but you need to be doing this in big endianness.
Here's an overly verbose solution from which you can cut the print statements (and the hex codec):
import 'dart:typed_data';
import 'package:convert/convert.dart'; // only needed for debugging

void main() {
  final registers = [16177, 4660];
  final bytes = Uint8List(4); // start with the 4 bytes = 32 bits
  final byteData = bytes.buffer.asByteData(); // ByteData lets you choose the endianness
  // Modbus registers are unsigned 16-bit values, so store them with setUint16.
  byteData.setUint16(0, registers[0], Endian.big); // Note big is the default here, but shown for completeness
  byteData.setUint16(2, registers[1], Endian.big);
  print(bytes); // just for debugging - does my byte order look right?
  print(hex.encode(bytes)); // ditto
  final f32 = byteData.getFloat32(0, Endian.big);
  print(f32); // approximately 0.6917
}
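For comparison, here is a minimal sketch of the same conversion in C. It assumes the first register holds the high 16 bits and that float is a 32-bit IEEE 754 value on the target platform; the function name registers_to_float is made up for this example.

```c
#include <stdint.h>
#include <string.h>

/* Combine two 16-bit registers (high word first) into a 32-bit float.
 * Assumes float uses the IEEE 754 binary32 layout. */
float registers_to_float(uint16_t high, uint16_t low)
{
    uint32_t bits = ((uint32_t)high << 16) | low;
    float value;
    memcpy(&value, &bits, sizeof value); /* reinterpret the bit pattern */
    return value;
}
```

Building the 32-bit pattern with shifts keeps the result independent of the host byte order, and memcpy avoids the aliasing problems of a pointer cast.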

How to get an unsigned integer from a byte array in Kotlin JVM?

Kotlin 1.3 introduced unsigned integer types, but I can't seem to figure out how to get an unsigned integer from a ByteArray in Kotlin JVM.
Kotlin Native has a convenient ByteArray.getUIntAt() method, but this does not exist for Kotlin JVM.
val bytes: ByteArray = byteArrayOf(1, 1, 1, 1)
val uint: UInt // = ???
What are my options here? Is there a more elegant way than using a ByteBuffer, or bit-shifting my way out of this?
As mentioned in the comments there is no out of the box solution in the JVM version of Kotlin. An extension function doing the same as the Kotlin/Native function might look like this:
fun ByteArray.getUIntAt(idx: Int) =
    ((this[idx].toUInt() and 0xFFu) shl 24) or
    ((this[idx + 1].toUInt() and 0xFFu) shl 16) or
    ((this[idx + 2].toUInt() and 0xFFu) shl 8) or
    (this[idx + 3].toUInt() and 0xFFu)

fun main(args: Array<String>) {
    // 16843009
    println(byteArrayOf(1, 1, 1, 1).getUIntAt(0))
    // 4294967295, which is UInt.MAX_VALUE
    println(byteArrayOf(-1, -1, -1, -1).getUIntAt(0))
}
To expand upon @Alexander's excellent answer, I wrote a version that accepts ByteArrays that have fewer than four values. I've found this particularly useful when parsing ByteArrays with a mix of signed and unsigned integers (in my case Bluetooth characteristic update notifications) that I need to slice into sub-arrays.
fun ByteArray.fromUnsignedBytesToInt(): Int {
    //Note: UInt is always 32 bits (4 bytes) regardless of platform architecture
    val bytes = 4
    val paddedArray = ByteArray(bytes)
    for (i in 0 until bytes-this.size) paddedArray[i] = 0
    for (i in bytes-this.size until paddedArray.size) paddedArray[i] = this[i-(bytes-this.size)]
    return (((paddedArray[0].toULong() and 0xFFu) shl 24) or
            ((paddedArray[1].toULong() and 0xFFu) shl 16) or
            ((paddedArray[2].toULong() and 0xFFu) shl 8) or
            (paddedArray[3].toULong() and 0xFFu)).toInt()
}
The test class below confirms the expected values:
import org.junit.Test
import org.junit.Assert.*

internal class ByteArrayExtKtTest {
    @Test
    fun testAsUnsignedToIntTwoBytes() {
        val bytes = byteArrayOf(0x0A.toByte(), 0xBA.toByte())
        assertEquals(2746, bytes.fromUnsignedBytesToInt())
    }

    @Test
    fun testAsUnsignedToIntFourBytes() {
        val bytes = byteArrayOf(1, 1, 1, 1)
        assertEquals(16843009, bytes.fromUnsignedBytesToInt())
    }
}
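The padding step can also be avoided by folding the bytes in one at a time. Here is a rough sketch of that idea in C, where unsigned types make the masking unnecessary; the function name be_bytes_to_u32 is hypothetical, and inputs longer than four bytes are simply cut off after the fourth byte.

```c
#include <stddef.h>
#include <stdint.h>

/* Interpret up to 4 bytes as an unsigned big-endian integer.
 * Shorter inputs behave as if they were left-padded with zero bytes. */
uint32_t be_bytes_to_u32(const unsigned char *bytes, size_t len)
{
    uint32_t value = 0;
    for (size_t i = 0; i < len && i < 4; ++i)
        value = (value << 8) | bytes[i]; /* shift in the next byte */
    return value;
}
```

With the two-byte input 0x0A 0xBA this gives 2746, and with four bytes of 0x01 it gives 16843009, matching the test values above.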

Compare 2 different hexadecimal

I have a hexadecimal in unsigned char *hex_1 that contains:
hex_1[0] = 0x5b
hex_1[1] = 0x83
hex_1[2] = 0xb6
hex_1[3] = 0xe9
and I want to compare it with a hex value: 1ca0aaf9.
What should I do? Should I create a new character array, split 1ca0aaf9 into 1c, a0, aa, f9, then do memcpy()?
EDIT: I actually want them to either tell me whether "THEY ARE THE SAME!" or "THEY ARE NOT THE SAME!".
EDIT 2: I want it to be like hex[0] to be compared with 1c, etc...
You probably want this:
uint32_t val = *(uint32_t*)(hex_1); // needs <stdint.h> for uint32_t and <stdio.h> for printf
if (val == 0x1ca0aaf9)
    printf("THEY ARE THE SAME!\n");
else
    printf("THEY ARE NOT THE SAME!\n");
On a big-endian architecture, you are done. On Intel and other little-endian architectures you need to decide whether that byte array is logically meant to be interpreted in network byte order, as 0x5b83b6e9 (decimal 1535358697), or in host byte order, as 0xe9b6835b (decimal 3921052507). If the byte array is in network byte order, then you'll need to swap the bytes. That's what the ntohl function does.
uint32_t val = *(uint32_t*)(hex_1);
val = ntohl(val); // <arpa/inet.h> or <winsock2.h>
if (val == 0x1ca0aaf9)
    printf("THEY ARE THE SAME!\n");
else
    printf("THEY ARE NOT THE SAME!\n");
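If the pointer cast is a concern (it can run into alignment and aliasing trouble on some platforms), a byte-wise comparison along the lines of the question's second edit is one alternative. The sketch below assumes hex_1[0] should match the most significant byte of the constant; the function name matches_be32 is made up for this example.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 if the four bytes at p equal `expected` written
 * most significant byte first, otherwise 0. */
int matches_be32(const unsigned char *p, uint32_t expected)
{
    const unsigned char want[4] = {
        (unsigned char)(expected >> 24),
        (unsigned char)(expected >> 16),
        (unsigned char)(expected >> 8),
        (unsigned char)expected
    };
    return memcmp(p, want, sizeof want) == 0;
}
```

This gives the same answer on big-endian and little-endian hosts, because the constant is split into bytes explicitly rather than read through memory.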

Get the character dominant from a string

Okay, as the title says, I am trying to figure out a function that returns the character that dominates in a string. I might be able to figure it out, but something seems to be wrong with my logic and I failed at this. If someone can come up with this without problems I will be extremely glad. Thank you.
I say "in a string" to keep it simple. I am actually doing this on buffered data containing a BMP image, trying to output the base color (the dominant pixel).
What I have for now is this unfinished function I started:
(char *FILE_NAME)
dword size = bmp_dgets(FILE_NAME, byte);
FILE* fp = fopen(convert(FILE_NAME), "r");
BYTE *PIX_ARRAY = malloc(size-54+1), *PIX_CUR = calloc(sizeof(RGB), sizeof(BYTE));
dword readed, i, l;
RGB color, prime_color;
fseek(fp, 54, SEEK_SET); readed = fread(PIX_ARRAY, 1, size-54, fp);
for(i = 54; i<size-54; i+=3)
color = bitfox_pixel_init(PIXEL_ARRAY[i], PIXEL_ARRAY[i+1], PIXEL_ARRAY[i+2);
memmove(PIX_CUR, color, sizeof(RGB));
for(l = 54; l<size-54; l+=3)
if (PIX_CUR[2] == PIXEL_ARRAY[l] && PIX_CUR[1] == PIXEL_ARRAY[l+1] &&
Note that RGB is a struct containing 3 bytes (R, G and B).
I know that's nothing, but that's all I have for now.
Is there any way I can finish this?
If you want this done fast, throw a stack of RAM at it (if available, of course). You can use a large direct-lookup table with the RGB trio to manufacture a sequence of 24-bit indexes into a contiguous array of counters. In partial-pseudo, partial code, something like this:
// create a zero-filled 2^24 array of unsigned counters.
uint32_t *counts = calloc(256*256*256, sizeof(*counts));
uint32_t max_count = 0; // highest count seen so far
uint32_t max_idx = 0;   // colour index that reached max_count

// enumerate your buffer of RGB values, three bytes at a time:
unsigned char rgb[3];
while (getNextRGB(src, rgb)) // returns false when no more data.
{
    uint32_t idx = (((uint32_t)rgb[0]) << 16) | (((uint32_t)rgb[1]) << 8) | (uint32_t)rgb[2];
    if (++counts[idx] > max_count)
    {
        max_count = counts[idx];
        max_idx = idx;
    }
}

// the dominant colour is the index with the highest count
R = (max_idx >> 16) & 0xFF;
G = (max_idx >> 8) & 0xFF;
B = max_idx & 0xFF;

// free when you have no more images to process. for each new
// image you can memset the buffer to zero and reset the max
// for a fresh start.
free(counts);
That's it. If you can afford to throw a big hulk of memory at this (it would be 64MB in this case, at 4 bytes per entry for 16.7M entries), then this becomes O(N). If you have a succession of images to process you can simply memset() the array back to zeros, clear max_count, and repeat for each additional file. Finally, don't forget to free your memory when finished.
Best of luck.
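For anyone who wants something to compile, here is a self-contained sketch of the counting-table idea over a buffer that is already in memory. It assumes tightly packed 3-byte pixels (so any BMP header and row padding must have been skipped beforehand), and the function name dominant_color is made up for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns the most frequent pixel of a packed 3-bytes-per-pixel buffer as a
 * 24-bit value (first byte in bits 16-23), or -1 if the buffer is empty or
 * the 64MB counter table cannot be allocated. */
int32_t dominant_color(const unsigned char *pixels, size_t n_pixels)
{
    if (n_pixels == 0)
        return -1;
    uint32_t *counts = calloc((size_t)1 << 24, sizeof *counts);
    if (counts == NULL)
        return -1;

    uint32_t best_idx = 0, best_count = 0;
    for (size_t i = 0; i < n_pixels; ++i) {
        const unsigned char *p = pixels + 3 * i;
        uint32_t idx = ((uint32_t)p[0] << 16) | ((uint32_t)p[1] << 8) | p[2];
        if (++counts[idx] > best_count) {
            best_count = counts[idx]; /* remember the count ... */
            best_idx = idx;           /* ... and which colour reached it */
        }
    }
    free(counts);
    return (int32_t)best_idx;
}
```

The three channel bytes can be recovered from the result with the same shifts and masks as above.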

C Library for compressing sequential positive integers

I have the very common problem of creating an index for an in-disk array of strings. In short, I need to store the position of each string in the in-disk representation. For example, a very naive solution would be an index array as follows:
uint64 idx[] = { 0, 20, 500, 1024, ..., 103434 };
Which says that the first string is at position 0, the second at position 20, the third at position 500 and the nth at position 103434.
The positions are always non-negative 64 bits integers in sequential order. Although the numbers could vary by any difference, in practice I expect the typical difference to be inside the range from 2^8 to 2^20. I expect this index to be mmap'ed in memory, and the positions will be accessed randomly (assume uniform distribution).
I was thinking about writing my own code for doing some sort of block delta encoding or other more sophisticated encoding, but there are so many different trade-offs between encoding/decoding speed and space that I would rather get a working library as a starting point and maybe even settle for something without any customizations.
Any hints? A C library would be ideal, but a C++ one would also allow me to run some initial benchmarks.
A few more details if you are still following. This will be used to build a library similar to cdb on top of the library cmph. In short, it is for a large disk-based read-only associative map with a small index in memory.
Since it is a library, I don't have control over the input, but the typical use case that I want to optimize has hundreds of millions of values, an average value size in the range of a few kilobytes, and a maximum value at 2^31.
For the record, if I don't find a library ready to use I intend to implement delta encoding in blocks of 64 integers with the initial bytes specifying the block offset so far. The blocks themselves would be indexed with a tree, giving me O(log (n/64)) access time. There are way too many other options and I would prefer not to discuss them. I am really looking for ready-to-use code rather than ideas on how to implement the encoding. I will be glad to share with everyone what I did once I have it working.
I appreciate your help and let me know if you have any doubts.
I use fastbit (Kesheng Wu, LBL.GOV). It seems you need something good, fast and NOW, and fastbit is a highly competent improvement on Oracle's BBC (byte-aligned bitmap code, berkeleydb). It's easy to set up and very good generally.
However, given more time, you may want to look at a Gray code solution; it seems optimal for your purposes.
Daniel Lemire has released a number of libraries for C/C++/Java. I've read over some of his papers and they are quite nice, with several advancements on fastbit and alternative approaches for column re-ordering with permuted Gray codes.
Almost forgot, I also came across Tokyo Cabinet. Though I do not think it is well suited for my current project, I might have considered it more if I had known about it before ;). It has a large degree of interoperability:
Tokyo Cabinet is written in the C
language, and provided as API of C,
Perl, Ruby, Java, and Lua. Tokyo
Cabinet is available on platforms
which have API conforming to C99 and
As you referred to CDB, the TC benchmark has a TC mode (TC supports several operational constraints for varying performance) where it surpassed CDB by 10 times for read performance and 2 times for write.
With respect to your delta encoding requirement, I am quite confident in bsdiff and its ability to out-perform any file.exe content patching system; it may also have some fundamental interfaces for your general needs.
Google's new binary compression application, Courgette, may be worth checking out in case you missed the press release: 10x smaller diffs than bsdiff in the one test case I have seen published.
You have two conflicting requirements:
You want to compress very small items (8 bytes each).
You need efficient random access for each item.
The second requirement is very likely to impose a fixed length for each item.
What exactly are you trying to compress? If you are thinking about the total space of index, is it really worth the effort to save the space?
If so one thing you could try is to chop the space into half and store it into two tables. First stores (upper uint, start index, length, pointer to second table) and the second would store (index, lower uint).
For fast searching, indices would be implemented using something like B+ Tree.
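A rough sketch of how such a split could look in C is below. It is only one reading of the idea: the positions are assumed to be sorted, so entries that share the same upper 32 bits form contiguous runs, the second table is a dense array (which makes the stored index implicit), and a binary search stands in for the B+ tree. All names are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* One run of consecutive entries that share the same upper 32 bits. */
struct hi_run {
    uint32_t upper;  /* shared upper half of the positions */
    size_t   start;  /* index of the first entry in the run */
    size_t   length; /* number of entries in the run */
};

/* Rebuild position i from the run table and the per-entry lower halves.
 * The caller must pass at least one run and an i covered by the runs. */
uint64_t split_index_get(const struct hi_run *runs, size_t n_runs,
                         const uint32_t *lower, size_t i)
{
    size_t lo = 0, hi = n_runs; /* the wanted run lies in [lo, hi) */
    while (hi - lo > 1) {
        size_t mid = lo + (hi - lo) / 2;
        if (runs[mid].start <= i)
            lo = mid;
        else
            hi = mid;
    }
    return ((uint64_t)runs[lo].upper << 32) | lower[i];
}
```

Each entry then costs 4 bytes plus a small share of the run table, at the price of one binary search per lookup.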
I did something similar years ago for a full-text search engine. In my case, each indexed word generated a record which consisted of a record number (document id) and a word number (it could just as easily have stored word offsets) which needed to be compressed as much as possible. I used a delta-compression technique which took advantage of the fact that there would be a number of occurrences of the same word within a document, so the record number often did not need to be repeated at all. And the word offset delta would often fit within one or two bytes. Here is the code I used.
Since it's in C++, the code may not be useful to you as is, but it can be a good starting point for writing compression routines.
Please excuse the Hungarian notation and the magic numbers strewn within the code. Like I said, I wrote this many years ago :-)
// index compressor class
#pragma once
#include "File.h"

const int IC_BUFFER_SIZE = 8192;

// index compressor
class IndexCompressor
{
private :
    File *m_pFile;
    WA_DWORD m_dwRecNo;
    WA_DWORD m_dwWordNo;
    WA_DWORD m_dwRecordCount;
    WA_DWORD m_dwHitCount;
    // output buffer; this declaration is reconstructed from its uses in the .cpp file
    WA_BYTE m_byBuffer[IC_BUFFER_SIZE];
    WA_DWORD m_dwBytes;
    bool m_bDebugDump;

    void FlushBuffer(void);

public :
    IndexCompressor(void) { m_pFile = 0; m_bDebugDump = false; }
    ~IndexCompressor(void) {}
    void Attach(File& File) { m_pFile = &File; }
    void Begin(void);
    void Add(WA_DWORD dwRecNo, WA_DWORD dwWordNo);
    void End(void);
    WA_DWORD GetRecordCount(void) { return m_dwRecordCount; }
    WA_DWORD GetHitCount(void) { return m_dwHitCount; }
    void DebugDump(void) { m_bDebugDump = true; }
};
// index compressor class
#include "stdafx.h"
#include "IndexCompressor.h"

void IndexCompressor::FlushBuffer(void)
{
    ASSERT(m_pFile != 0);
    if (m_dwBytes > 0)
        m_pFile->Write(m_byBuffer, m_dwBytes);
    m_dwBytes = 0;
}

void IndexCompressor::Begin(void)
{
    ASSERT(m_pFile != 0);
    m_dwRecNo = m_dwWordNo = m_dwRecordCount = m_dwHitCount = 0;
    m_dwBytes = 0;
}

void IndexCompressor::Add(WA_DWORD dwRecNo, WA_DWORD dwWordNo)
{
    ASSERT(m_pFile != 0);
    WA_BYTE buffer[16];
    int nbytes = 1;

    ASSERT(dwRecNo >= m_dwRecNo);
    if (dwRecNo != m_dwRecNo)
        m_dwWordNo = 0;
    // The two counter updates below were missing from the listing and are
    // an assumption based on the member names.
    if (m_dwRecordCount == 0 || dwRecNo != m_dwRecNo)
        ++m_dwRecordCount;
    ++m_dwHitCount;

    WA_DWORD dwRecNoDelta = dwRecNo - m_dwRecNo;
    WA_DWORD dwWordNoDelta = dwWordNo - m_dwWordNo;
    if (m_bDebugDump)
        TRACE("%8X[%8X] %8X[%8X] : ", dwRecNo, dwRecNoDelta, dwWordNo, dwWordNoDelta);

    if (dwRecNoDelta == 0 && dwWordNoDelta < 128)
    {
        buffer[0] = 0x80 | WA_BYTE(dwWordNoDelta);
    }
    else if (dwRecNoDelta == 0 && dwWordNoDelta < 16384)
    {
        buffer[0] = 0x40 | WA_BYTE(dwWordNoDelta >> 8);
        buffer[1] = WA_BYTE(dwWordNoDelta & 0x00ff);
        nbytes += sizeof(WA_BYTE);
    }
    else if (dwRecNoDelta < 32 && dwWordNoDelta < 65536)
    {
        buffer[0] = 0x20 | WA_BYTE(dwRecNoDelta);
        WA_WORD *p = (WA_WORD *) (buffer+1);
        *p = WA_WORD(dwWordNoDelta);
        nbytes += sizeof(WA_WORD);
    }
    else
    {
        // 0001rrww
        buffer[0] = 0x10;

        // encode recno
        if (dwRecNoDelta < 256)
        {
            buffer[nbytes] = WA_BYTE(dwRecNoDelta);
            nbytes += sizeof(WA_BYTE);
        }
        else if (dwRecNoDelta < 65536)
        {
            buffer[0] |= 0x04;
            WA_WORD *p = (WA_WORD *) (buffer+nbytes);
            *p = WA_WORD(dwRecNoDelta);
            nbytes += sizeof(WA_WORD);
        }
        else
        {
            buffer[0] |= 0x08;
            WA_DWORD *p = (WA_DWORD *) (buffer+nbytes);
            *p = dwRecNoDelta;
            nbytes += sizeof(WA_DWORD);
        }

        // encode wordno
        if (dwWordNoDelta < 256)
        {
            buffer[nbytes] = WA_BYTE(dwWordNoDelta);
            nbytes += sizeof(WA_BYTE);
        }
        else if (dwWordNoDelta < 65536)
        {
            buffer[0] |= 0x01;
            WA_WORD *p = (WA_WORD *) (buffer+nbytes);
            *p = WA_WORD(dwWordNoDelta);
            nbytes += sizeof(WA_WORD);
        }
        else
        {
            buffer[0] |= 0x02;
            WA_DWORD *p = (WA_DWORD *) (buffer+nbytes);
            *p = dwWordNoDelta;
            nbytes += sizeof(WA_DWORD);
        }
    }

    // update current setting
    m_dwRecNo = dwRecNo;
    m_dwWordNo = dwWordNo;

    // add compressed data to buffer
    ASSERT(buffer[0] != 0);
    ASSERT(nbytes > 0 && nbytes < 10);
    if (m_dwBytes + nbytes > IC_BUFFER_SIZE)
        FlushBuffer(); // assumed: this statement was missing from the listing
    CopyMemory(m_byBuffer + m_dwBytes, buffer, nbytes);
    m_dwBytes += nbytes;
    if (m_bDebugDump)
    {
        for (int i = 0; i < nbytes; ++i)
            TRACE("%02X ", buffer[i]);
    }
}

void IndexCompressor::End(void)
{
    FlushBuffer(); // assumed: the body was missing from the listing
}
You've omitted critical information about the number of strings you intend to index.
But given that you say you expect the minimum length of an indexed string to be 256, storing the indices as 64-bit integers incurs at most about 3% overhead. If the total length of the string file is less than 4GB, you could use 32-bit indices and incur about 1.5% overhead. These numbers suggest to me that if compression matters, you're better off compressing the strings, not the indices. For that problem a variation on LZ77 seems in order.
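For reference, those overhead figures come from dividing the size of one index entry by the shortest expected string, taken here as 256 bytes (the 2^8 lower bound the question gives for the typical difference between positions):

```latex
\[
\frac{8\ \text{bytes}}{256\ \text{bytes}} = 3.125\%,
\qquad
\frac{4\ \text{bytes}}{256\ \text{bytes}} \approx 1.56\%
\]
```

Longer strings only push these ratios lower.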
If you want to try a wild idea, put each string in a separate file, pull them all into a zip file, and see how you can do with zziplib. This probably won't be great, but it's nearly zero work on your part.
More data on the problem would be welcome:
Number of strings
Average length of a string
Maximum length of a string
Median length of strings
Degree to which the strings file compresses with gzip
Whether you are allowed to change the order of strings to improve compression
The comment and revised question make the problem much clearer. I like your idea of grouping, and I would try a simple delta encoding, group the deltas, and use a variable-length code within each group. I wouldn't wire in 64 as the group size; I think you will probably want to determine that empirically.
You asked for existing libraries. For the grouping and delta encoding I doubt you will find much. For variable-length integer codes, I'm not seeing much in the way of C libraries, but you can find variable-length codings in Perl and Python. There are a ton of papers and some patents on this topic, and I suspect you're going to wind up having to roll your own. But there are some simple codes out there, and you could give UTF-8 a try: in its original six-byte form it can code unsigned integers up to 31 bits, and you can grab C code from Plan 9 and I'm sure many other sources.
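To make the grouping idea concrete, here is a minimal sketch in C of one group: the first position is stored in full and every later one as a delta, all with a simple base-128 variable-length code (7 data bits per byte, high bit set when more bytes follow). This particular code is my choice for illustration rather than UTF-8, all function names are made up, and the table of per-group byte offsets needed for random access is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Write v as a base-128 varint; returns the number of bytes written (1..10). */
size_t varint_put(unsigned char *out, uint64_t v)
{
    size_t n = 0;
    while (v >= 0x80) {
        out[n++] = (unsigned char)(v | 0x80); /* low 7 bits + "more" flag */
        v >>= 7;
    }
    out[n++] = (unsigned char)v;
    return n;
}

/* Read one varint into *v; returns the number of bytes consumed. */
size_t varint_get(const unsigned char *in, uint64_t *v)
{
    uint64_t result = 0;
    unsigned shift = 0;
    size_t n = 0;
    for (;;) {
        unsigned char b = in[n++];
        result |= (uint64_t)(b & 0x7F) << shift;
        if (!(b & 0x80))
            break;
        shift += 7;
    }
    *v = result;
    return n;
}

/* Encode `count` ascending positions as one group: the first position,
 * then count - 1 deltas. `out` needs room for 10 bytes per position. */
size_t group_encode(const uint64_t *pos, size_t count, unsigned char *out)
{
    size_t n = 0;
    if (count == 0)
        return 0;
    n += varint_put(out + n, pos[0]);
    for (size_t i = 1; i < count; ++i)
        n += varint_put(out + n, pos[i] - pos[i - 1]);
    return n;
}

/* Return the k-th position (0-based) of a group by summing its deltas. */
uint64_t group_get(const unsigned char *in, size_t k)
{
    uint64_t value, delta;
    in += varint_get(in, &value);
    for (size_t i = 0; i < k; ++i) {
        in += varint_get(in, &delta);
        value += delta;
    }
    return value;
}
```

A lookup costs at most one pass over a group, so the group size trades space against access time, which is why it is worth measuring rather than fixing at 64.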
Are you running on Windows? If so, I recommend creating the mmap file using the naive solution you originally proposed, and then compressing the file using NTFS compression. Your application code never knows the file is compressed, and the OS does the file compression for you. You might not think this would be very performant or get good compression, but I think you'll be surprised if you try it.
