Kotlin: get string part out of an array of bytes

From a Bluetooth device I get data in an array of bytes. 20 bytes are reserved for a string; the other bytes contain data like short and int values.
The bytes for the string are converted to a string using Charset.UTF_8 (or US_ASCII). The problem is I cannot get rid of the part that contains the trailing zero bytes, like in other languages such as C, C# and C++. I tried dropLast() after determining the first zero character. Nothing works. What am I missing?
The piece of code is this:
val bytes = job.characteristic.value
var index = 0
var tempBytes = ByteArray(30)
while (index < 20) {
    if (bytes[index] != 0.toByte())
        tempBytes[index] = bytes[index]
    else
        break
    ++index
}
val newString = tempBytes.toString(Charsets.ISO_8859_1).dropLast(20 - index)
Log.i("BleDeviceVM", "Received for newString: " + newString)
The output in Android Studio looks like this:
I/BleDeviceVM:
Received for newString: LEDServer��������������������
instead of:
LEDServer
Thanks, Broot.
Coming from C, C#, some C++ and some Java, Kotlin is a bit confusing at first. This piece of code works fine, I think:
var position = job.characteristic.value.indexOf(0)
if ((position > 20) || (position == -1))
    position = 20
_deviceID.value = String(job.characteristic.value, 0, position)

You have a bug in dropLast(). Your tempBytes is of size 30, but in dropLast() you subtract index from 20, not from 30. This is why it is usually better to use constants or to reference the collection size directly:
tempBytes.toString(Charsets.ISO_8859_1).dropLast(tempBytes.size - index)
Also, there is no need to use dropLast() if we need the first n items, because take() does exactly this:
tempBytes.toString(Charsets.ISO_8859_1).take(index)
But honestly, your code is pretty overcomplicated. You can achieve a similar effect by replacing your whole code with simply:
val newString = String(bytes, 0, bytes.indexOf(0))
There are some differences compared to your code, e.g. it searches past index 20 and it requires a null byte somewhere. Depending on your specific case it may need to be adjusted.
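If you need both fixes at once (stop at the first zero byte, but never read past the 20-byte field), a minimal sketch might look like the following; the function name and the 20-byte default are just illustrative, not part of the answer above:
// Minimal sketch: decode a zero-terminated string from a fixed-size field.
// `maxLen` (20 here) and the charset are assumptions taken from the question.
fun decodeCString(bytes: ByteArray, maxLen: Int = 20): String {
    val limit = minOf(maxLen, bytes.size)
    var end = 0
    while (end < limit && bytes[end] != 0.toByte()) end++
    return String(bytes, 0, end, Charsets.UTF_8)
}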

Related

Reading TrueType 'cmap' Format 4 Subtable in Swift 3

How might one go about writing the following C code in Swift?
glyphIndex = *(&idRangeOffset[i] + idRangeOffset[i] / 2 + (c - startCode[i]))
I'm attempting to read a Format 4 TrueType character mapping table using a simple little binary data reader. All is well and good until pointer manipulation is involved, as I can barely make heads or tails of pointers when working in C, let alone ones masquerading around with Unsafe prefixes attached to them.
I've tried multiple things, but nothing seems to work quite right. I guess I'm just not exactly sure how to work with pointer "addresses" in Swift.
For example, here's a more complete idea of where I am:
// The following variables are all [UInt16]:
// - startCodes
// - endCodes
// - idDeltas
// - idRangeOffsets
var gids = [Int]()
// Iterate segments, skipping the last character code (0xFFFF)
for i in 0 ..< segCount - 1 {
    let start = startCodes[i]
    let end = endCodes[i]
    let delta = idDeltas[i]
    let rangeOffset = idRangeOffsets[i]
    let charRange = start ..< (end + 1)
    if rangeOffset == 0 {
        gids.append(contentsOf: charRange.map { charCode in
            return Int(charCode &+ delta) // &+ wraps like C's modulo-65536 arithmetic
        })
    }
    else {
        for charCode in charRange {
            // ???
        }
    }
}
In the code above, you'll notice ???. This is where I retrieve the glyph index using the strange C-pointer-address-pointer-huh trick mentioned above. The problem is, I just can't figure it out. Replacing the variables I actually understand, here's what I've got:
for charCode in charRange {

          Not too sure about this    Actual value of idRangeOffset[i]
                  |                       |
                  v                       v
    glyphIndex = *(&idRangeOffset[i] + rangeOffset / 2 + (charCode - start))
                                                             ^
                                                             |
                                                          Or this
}
Are there any Swift 3 pointer gurus out there that can lead me on the path to enlightenment? Any help would be greatly appreciated!
If I translate your pseudo-C code word for word into Swift, it would be something like this:
// "May not work" example, do not use this
glyphIndex = withUnsafePointer(to: &idRangeOffset[i]) { idRangeOffsetPointer in
    // In some cases `idRangeOffsetPointer` can work like `&idRangeOffset[i]` in C...
    (idRangeOffsetPointer + Int(rangeOffset) / 2 + Int(charCode - start)).pointee
    // To do pointer arithmetic in Swift, all integers need to be `Int`,
    // and to match C's integer conversion rules, you need to cast each part to `Int` before adding.
    // The equivalent of C's dereference operator `*` in Swift is the `pointee` property.
}
But this may not work as expected because of the copy-in/copy-out semantics of Swift's inout parameters. Swift may create a temporary region containing just the single element idRangeOffset[i] and pass its address to idRangeOffsetPointer, so the result of the pointer arithmetic may point somewhere near that temporary region, which is completely useless.
If you want a meaningful result from the pointer arithmetic, you need to work in a context where all elements of the array are guaranteed to be placed in one contiguous region.
You also need to know that the C statement:
glyphIndex = *(&idRangeOffset[i] + idRangeOffset[i] / 2 + (c - startCode[i]))
relies on the fact that the whole of idRangeOffset and glyphIdArray are placed in one contiguous region without any gaps or padding. (I assume you know Format 4 well.)
So, if your idRangeOffset contains only segCount elements, the following code will not work.
//"Should work" in a certain condition
glyphIndex = idRangeOffset.withUnsafeBufferPointer{idRangeOffsetBufferPointer in
let idRangeOffsetPointer = idRangeOffsetBufferPointer.baseAddress! + i
//`idRangeOffsetPointer` is equivalent to `&idRangeOffset[i]` in C inside this closure
return (idRangeOffsetPointer + Int(rangeOffset) / 2 + Int(charCode - start)).pointee
}
But given the pointer and array semantics of C, the code above is equivalent to this:
glyphIndex = idRangeOffset[i + Int(rangeOffset) / 2 + Int(charCode - start)]
// `*(&arr[i] + n)` is equivalent to `arr[i + n]` in C
I repeat: the array idRangeOffset needs to contain the whole content of idRangeOffset and glyphIdArray.
To add to what @OOPer said, I would suggest reading the whole cmap, or the subtables of interest, into memory and working with them in Swift using the documentation as a guide; see, for example, https://developer.apple.com/fonts/TrueType-Reference-Manual/RM06/Chap6cmap.html. You could also use a C implementation as a reference, but that is not a good path unless you are an experienced C programmer: there are too many subtleties in C that can and will bite you. In C it's really convenient to work with a cmap in terms of pointers and offsets. Because Swift is not that pointer-friendly, it's better to just use offsets into a table. You will likely run into problems interpreting different parts of the map as values of different types, but at least you won't have to deal with pointer magic.
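As a concrete sketch of that offsets-only approach, the lookup for one character might look like this. It is an illustration, not the poster's code: the combined array (called mapping here, a name invented for the example) is assumed to hold idRangeOffset immediately followed by glyphIdArray, exactly as they sit in the file, with idDelta applied as Format 4 specifies:
// Sketch only: `mapping` holds idRangeOffset followed by glyphIdArray,
// mirroring the contiguous layout in the font file.
func glyphIndex(for charCode: UInt16, segment i: Int,
                startCodes: [UInt16], idDeltas: [UInt16],
                mapping: [UInt16]) -> Int {
    let rangeOffset = mapping[i]
    if rangeOffset == 0 {
        return Int(charCode &+ idDeltas[i])           // &+ wraps modulo 65536, per the spec
    }
    let index = i + Int(rangeOffset) / 2 + Int(charCode - startCodes[i])
    let glyph = mapping[index]
    return glyph == 0 ? 0 : Int(glyph &+ idDeltas[i]) // 0 means "missing glyph"
}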

Appending string much faster than appending character

I was doing the https://www.hackerrank.com/challenges/30-review-loop problem on HackerRank and I was running into a timeout issue that I resolved in a roundabout way. I was hoping someone on here can explain to me why one approach is faster than the other, or point me to documentation that explains this phenomenon.
If you don't have an account, here's a description of the problem: you feed in the number of test cases and then a string for each case; for each string, your code is to print the characters at the even (0-based) indices, then a space, then the characters at the odd indices. Example input:
2
Hacker
Rank
returns
Hce akr
Rn ak
Simple, right? Here's the code I made.
if let line = readLine(), numOftests = Int(line) {
    for iter in 0..<numOftests {
        var evenString = ""
        var oddString = ""
        var string = readLine()!
        var arrChars = [Character](string.characters) // 1
        for idx in 0..<string.characters.count {
            if idx % 2 == 0 {
                oddString.append(arrChars[idx]) // 1
                //oddString.append(string[string.startIndex.advancedBy(idx)]) // 2 <= Times out
            }
            else {
                evenString.append(arrChars[idx]) // 1
                //evenString.append(string[string.startIndex.advancedBy(idx)]) // 2 <= Times out
            }
        }
        print("\(oddString) \(evenString)")
    }
}
Originally I used the commented-out code. This led to a timeout. To sum up, my problem is that using the subscripting system for a string is a lot slower than indexing into an array of characters. It caught me by surprise, and if it wasn't for the discussion group on HackerRank I wouldn't have found a solution. Now it goads me, because I don't know why this would make a difference.
The issue isn't the speed of appending a string vs. appending a character. The issue is how long it takes to locate the value you are appending.
Indexing an array is O(1) which means it happens in the same time whether you are accessing the first character or the 97th. It is efficient because Swift knows the size of the array elements, so it can just multiply the index by the size of the element to find the nth element.
string.startIndex.advancedBy(idx) is O(idx). It will take longer depending on how far you go into the string. Accessing the 97th character will take about 97 times as long as accessing the first character. Why? Because characters in Swift are not uniform in size. Swift strings are fully unicode compatible and a "😀" takes more bytes to represent than "A". So, it is necessary to look at every character from the startIndex to the one you are accessing.
That said, there is no reason for you to start at startIndex each time. If you kept the current index in a variable, you could advance it by 1 each time which would make the string indexing version about the same speed as the character array indexing version.
var currentIndex = string.startIndex
for idx in 0..<string.characters.count {
    if idx % 2 == 0 {
        oddString.append(string[currentIndex])
    }
    else {
        evenString.append(string[currentIndex])
    }
    currentIndex = currentIndex.successor()
}
That said, I would probably write it like this:
for (idx, char) in string.characters.enumerate() {
    if idx % 2 == 0 {
        oddString.append(char)
    }
    else {
        evenString.append(char)
    }
}
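For what it's worth, in Swift 3 enumerate() was renamed enumerated(), and from Swift 4 on String is a Sequence of Character again, so .characters is no longer needed. A sketch of the modern spelling:
// Swift 4+ spelling of the same loop.
let string = "Hacker"
var evenString = ""
var oddString = ""
for (idx, char) in string.enumerated() {
    if idx % 2 == 0 {
        oddString.append(char)
    } else {
        evenString.append(char)
    }
}
print("\(oddString) \(evenString)") // Hce akr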

Efficient search for series of values in an array? Ideally OpenCL usable?

I have a massive array I need to search (actually it's a massive array of smaller arrays, but for all intents and purposes, let's consider it one huge array). What I need to find is a specific series of numbers. Obviously, a simple for loop will work:
Pseudocode:
for (x = 0; x < arrayLength; x++) {
    if (array[x] == searchfor[location])
        location++;
    else
        location = 0;
    if (location >= strlen(searchfor))
        return FOUND_IT;
}
The thing is, I want this to be efficient. And in a perfect world, I do NOT want to return the prepared data from an OpenCL kernel and then do a simple search loop.
I'm open to non-OpenCL ideas, but something I can implement across a work group size of 64 on a target array length of 1024 would be ideal.
I'm kicking around ideas (split the target across work items; compare each item, in a loop, against each target; if it matches, set a flag; after all work items complete, check the flags. Though as I write that, it sounds very inefficient), but I'm sure I'm missing something.
Another idea was, since the target array is uchar, to lump it together as a double and check 8 indexes at a time. Not sure I can do that in OpenCL easily.
I'm also toying with the idea of hashing the search target with something fast, MD5 likely, then grabbing strlen(searchtarget) characters at a time, hashing that, and seeing if it matches. Not sure how much the hashing will kill my search speed though.
Oh, and the code is in C, so no C++ maps (something I found while googling that seemed like it might help?).
Based on the comments above, for future reference: it seems a simple for loop scanning the range IS the most efficient way to find matches given an OpenCL implementation.
Create an index array with one entry per possible uchar value. For each uchar in the search string, set array[uchar] = the position of that uchar in the search string. The rest of the array contains (unsigned)-1.
unsigned searchindexing[UCHAR_MAX + 1];              /* UCHAR_MAX is in <limits.h> */
memset(searchindexing, 0xFF, sizeof searchindexing); /* every entry becomes (unsigned)-1 */
for (i = 0; i < strlen(searchfor); i++)
    searchindexing[(unsigned char)searchfor[i]] = i;
Because the loop starts at the beginning of the search string, a uchar occurring more than once ends up with the position of its last occurrence; if you don't start at the beginning, the wrong position gets entered into searchindexing.
Then you search the array, stepping by strlen(searchfor), unless you find a uchar that occurs in searchfor.
for (i = 0; i < MAXARRAYLEN; i += strlen(searchfor))
    if ((unsigned)-1 != searchindexing[array[i]]) {
        i -= searchindexing[array[i]];   /* back up to where a match would start */
        if (!memcmp(searchfor, &array[i], strlen(searchfor)))
            return FOUND_IT;
    }
If most of the uchars in the array aren't in searchfor, this is probably the fastest way. Note the code has not been optimized.
Example: searchfor = "banana", so strlen is 6. searchindexing['a'] = 5, ['b'] = 0, ['n'] = 4, and the rest hold a value not between 0 and 5, like -1 or maxuint. If array[i] is something not in "banana", like a space, i increments by 6. If array[i] is 'a', you might be inside "banana", and it could be any of the 3 'a's. So we assume the last 'a', move 5 places back, and compare with searchfor. If it succeeds, we found it; otherwise we step 6 places forward.
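Putting the pieces together, here is a small self-contained sketch of the whole scheme. The find_series name and the test data are invented for illustration, and note one caveat: with repeated characters in the search string (such as "banana"), this skip scheme can miss some alignments, so it is exact only when all the characters are distinct.
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Returns the start index of needle in haystack, or -1 if not found. */
static long find_series(const unsigned char *haystack, size_t haylen,
                        const unsigned char *needle, size_t needlelen) {
    unsigned searchindexing[UCHAR_MAX + 1];
    memset(searchindexing, 0xFF, sizeof searchindexing); /* all (unsigned)-1 */
    for (size_t j = 0; j < needlelen; j++)
        searchindexing[needle[j]] = (unsigned)j;         /* last occurrence wins */
    for (size_t i = 0; i < haylen; i += needlelen) {
        unsigned pos = searchindexing[haystack[i]];
        if (pos != (unsigned)-1 && i >= pos) {
            size_t start = i - pos;                      /* where a match would begin */
            if (start + needlelen <= haylen &&
                memcmp(needle, haystack + start, needlelen) == 0)
                return (long)start;
        }
    }
    return -1;
}

int main(void) {
    const unsigned char data[] = "xxxxhello world of things";
    printf("%ld\n", find_series(data, sizeof data - 1,
                                (const unsigned char *)"world", 5)); /* prints 10 */
}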

Simulating an appendable array in C...Kinda

I'm trying to write C code that does what a chunk of Python code I have written does.
I tried to keep all its lines simple, but there still turns out to be some stuff I wrote that C cannot do.
My code will take an array of coordinates and replace/add items to that array over time.
For example:
[[[0,1]],[[2,1],[1,14]],[[1,1]]] ==> [[[0,1]],[[2,1],[1,14],[3,2]],[[1,1]]]
or
[[[0,1]],[[2,1],[1,14]],[[1,1]]] ==> [[[0,1]],[[40]],[[1,1]]]
I think this is impossible in C, but how about instead using strings to represent the lists so they can be added to? Like this:
[['0$1$'],['2$1$1$14$'],['1$1$']] ==> [['0$1$'],['2$1$1$14$3$2'],['1$1$']]
and
[['0$1$'],['2$1$1$14$'],['1$1$']] ==> [['0$1$'],['40$'],['1$1$']]
In my code, I know each array in the array is either one or more pairs of numbers or just one number, so this method works for me.
Can C do this? If so, please provide an example.
If you know that both the length of a string and the number of said strings won't exceed a certain value, you can do this:
char Strings[NUMBER_OF_STRINGS][MAX_STRING_LENGTH + 1]; // for the null terminator
It would then be good practice to zero all this memory:
for (size_t i = 0; i < NUMBER_OF_STRINGS; i++)
    memset(Strings[i], 0, MAX_STRING_LENGTH + 1);
And if you want to append a string, use strcat:
strcat(Strings[i], SourceString);
A safer (though slightly more costly, since you need to call strlen(), which walks the entire string) solution would be:
strncat(Strings[i], SourceString, MAX_STRING_LENGTH - strlen(Strings[i]));
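If the sizes aren't known up front, the usual C idiom is a growable buffer managed with realloc(). Here is a minimal sketch (the IntList type, its function name, and the doubling policy are invented for illustration); it stores the flattened pairs from the example above and grows as needed:
#include <stdio.h>
#include <stdlib.h>

typedef struct {
    int *data;   /* flat storage */
    size_t len;  /* elements in use */
    size_t cap;  /* allocated elements */
} IntList;

static int intlist_append(IntList *l, int value) {
    if (l->len == l->cap) {
        size_t newcap = l->cap ? l->cap * 2 : 8;
        int *p = realloc(l->data, newcap * sizeof *p);
        if (!p) return -1;          /* leave the list intact on failure */
        l->data = p;
        l->cap = newcap;
    }
    l->data[l->len++] = value;
    return 0;
}

int main(void) {
    IntList l = {0};
    int pairs[] = {2, 1, 1, 14, 3, 2};   /* [[2,1],[1,14],[3,2]] flattened */
    for (size_t i = 0; i < sizeof pairs / sizeof *pairs; i++)
        intlist_append(&l, pairs[i]);
    for (size_t i = 0; i < l.len; i += 2)
        printf("[%d,%d] ", l.data[i], l.data[i + 1]);
    putchar('\n');
    free(l.data);
}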

In Swift, how do I read an existing binary file into an array?

As part of my projects, I have a binary data file consisting of a large series of 32-bit integers that one of my classes reads in on initialization. In my C++ library, I read it in with the following initializer:
Evaluator::Evaluator() {
    m_HandNumbers.resize(32487834);
    ifstream inputReader;
    inputReader.open("/path/to/file/7CHands.dat", ios::binary);
    int inputValue;
    for (int x = 0; x < 32487834; ++x) {
        inputReader.read((char *) &inputValue, sizeof (inputValue));
        m_HandNumbers[x] = inputValue;
    }
    inputReader.close();
};
and in porting to Swift, I decided to read the entire file into one buffer (it's only about 130 MB) and then copy the bytes out of the buffer.
So, I've done the following:
public init() {
    var inputStream = NSInputStream(fileAtPath: "/path/to/file/7CHands.dat")!
    var inputBuffer = [UInt8](count: 32487834 * 4, repeatedValue: 0)
    inputStream.open()
    inputStream.read(&inputBuffer, maxLength: inputBuffer.count)
    inputStream.close()
}
and it works fine, in that when I debug it, I can see inputBuffer contains the same array of bytes that my hex editor says it should. Now, I'd like to get that data out of there effectively. I know it's stored little-endian, where the least significant bytes come first (i.e. the number 0x00011D4A is represented as '4A 1D 01 00' in the file). I'm tempted to just iterate through it manually and calculate the byte values by hand, but I'm wondering if there's a quick way I can pass an array of [Int32] and have it read those bytes in. I tried using NSData, such as with:
let data = NSData(bytes: handNumbers, length: handNumbers.count * sizeof(Int32))
data.getBytes(&inputBuffer, length: inputBuffer.count)
but that didn't seem to load the values (all the values were still zero). Can anyone please help me convert this byte array into some Int32 values? Better yet would be to convert them to Int (i.e. 64-bit integers), just to keep my variable sizes the same across the project.
Not sure about your endianness, but I use the following function. The difference from your code is using NSRanges of the actual required type, rather than lengths of bytes. This routine reads one value at a time (it's for ESRI files, whose contents vary field by field), but it should be easily adaptable.
func getBigIntFromData(data: NSData, offset: Int) -> Int {
    var rng = NSRange(location: offset, length: 4)
    var i = [UInt32](count: 1, repeatedValue: 0)
    data.getBytes(&i, range: rng)
    return Int(i[0].bigEndian) // return Int(i[0]) for littleEndian
}
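For example, adapted to the file in the question (a hypothetical loop, not the poster's code; it assumes the values are little-endian, as the '4A 1D 01 00' layout suggests):
// Hypothetical usage: walk the whole file 4 bytes at a time.
let data = NSData(contentsOfFile: "/path/to/file/7CHands.dat")!
var values = [Int]()
for offset in stride(from: 0, to: data.length, by: 4) {
    var i = [UInt32](count: 1, repeatedValue: 0)
    data.getBytes(&i, range: NSRange(location: offset, length: 4))
    values.append(Int(i[0].littleEndian))
}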
Grimxn provided the backbone of the solution to my problem, which showed me how to read sections of the buffer into an array; he then showed me a way to read the entire buffer in all at once. Rather than convert all of the items of the array needlessly to Int, I simply read the array into the buffer as UInt32 and did the casting to Int in the function that accesses that array.
For now, since I don't have my utility class defined yet, I integrated Grimxn's code directly into my initializer. The class initializer now looks like this:
public class Evaluator {
    let HandNumberArraySize = 32487834
    var handNumbers: [Int32]
    public init() {
        let data = NSData(contentsOfFile: "/path/to/file/7CHands.dat")!
        var dataRange = NSRange(location: 0, length: HandNumberArraySize * 4)
        handNumbers = [Int32](count: HandNumberArraySize, repeatedValue: 0)
        data.getBytes(&handNumbers, range: dataRange)
        println("Evaluator loaded successfully")
    }
    ...
}
... and the function that references them is now:
public func cardVectorToHandNumber(#cards: [Int], numberToUse: Int) -> Int {
    var output: Int
    output = Int(handNumbers[53 + cards[0] + 1])
    for i in 1 ..< numberToUse {
        output = Int(handNumbers[output + cards[i] + 1])
    }
    return Int(handNumbers[output])
}
Thanks to Grimxn and thanks once again to StackOverflow for helping me in a very real way!
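For anyone porting this later: in Swift 3 and beyond, NSData is usually replaced by Data. A sketch of the same bulk read under that API (same file path, and assuming the byte count is a multiple of 4):
import Foundation

// Sketch: bulk-read the file into [Int32] using Data (Swift 3+).
let url = URL(fileURLWithPath: "/path/to/file/7CHands.dat")
let data = try Data(contentsOf: url)
var handNumbers = [Int32](repeating: 0, count: data.count / 4)
_ = handNumbers.withUnsafeMutableBufferPointer { dest in
    data.copyBytes(to: dest) // returns the number of bytes copied
}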
