How to extract a pattern from string containing binary data

I have this array that comes from a previous a=array.unpack("C*") command.
a = [9, 32, 50, 53, 56, 53, 57, 9, 73, 78, 70, 79, 9, 73, 78, 70, 79, 53, 9,
32, 55, 52, 32, 50, 51, 32, 48, 51, 32, 57, 50, 32, 48, 48, 32, 48, 48, 32,
48, 48, 32, 69, 67, 32, 48, 50, 32, 49, 48, 32, 48, 48, 32, 69, 50, 32, 48,
48, 32, 55, 55, 9, 0, 0, 0, 0, 1, 12, 1, 0, 0, 0, 57, 254, 70, 6, 1, 6, 0, 3,
0, 3, 198, 0, 2, 198, 31, 147, 23, 0, 226, 7, 12, 17, 18, 56, 55, 3, 101, 1,
1, 0, 134, 7, 145, 5, 148, 37, 150, 133, 241, 135, 5, 22, 109, 145, 53, 38,
171, 4, 3, 2, 6, 192, 173, 22, 160, 20, 48, 18, 6, 9, 42, 134, 58, 0, 137, 97,
58, 1, 0, 164, 5, 48, 3, 129, 1, 7, 225, 16, 2, 1, 1, 4, 11, 9, 1, 10, 10, 6,
2, 19, 105, 145, 103, 116, 226, 35, 48, 3, 194, 1, 242, 48, 3, 194, 1, 241, 48,
3, 194, 1, 246, 48, 3, 194, 1, 245, 48, 3, 194, 1, 244, 48, 3, 194, 1, 243, 48,
3, 194, 1, 247, 177, 13, 10, 1, 1, 4, 8, 10, 6, 2, 19, 105, 145, 103, 116, 0, 0,
42, 3, 0, 0, 48, 48, 48, 48, 48, 48, 48, 50, 9, 82, 101, 99, 101, 105, 118, 101,
9, 50, 51, 9, 77, 111, 110, 32, 32]
When I convert it to characters, it looks like this:
irb(main):4392:0> a.map(&:chr).join
=> "\t 25859\tINFO\tINFO5\t 74 23 03 92 00 00 00 EC 02 10 00 E2 00 77\t\x00\x00\x00\x00
\x01\f\x01\x00\x00\x009\xFEF\x06\x01\x06\x00\x03\x00\x03\xC6\x00\x02\xC6\x1F\x93\x17\x00
\xE2\a\f\x11\x1287\x03e\x01\x01\x00\x86\a\x91\x05\x94%\x96\x85\xF1\x87\x05\x16m\x915&\xAB
\x04\x03\x02\x06\xC0\xAD\x16\xA0\x140\x12\x06\t*\x86:\x00\x89a:\x01\x00\xA4\x050\x03\x81
\x01\a\xE1\x10\x02\x01\x01\x04\v\t\x01\n\n\x06\x02\x13i\x91gt\xE2#0\x03\xC2\x01\xF20\x03
\xC2\x01\xF10\x03\xC2\x01\xF60\x03\xC2\x01\xF50\x03\xC2\x01\xF40\x03\xC2\x01\xF30\x03\xC2
\x01\xF7\xB1\r\n\x01\x01\x04\b\n\x06\x02\x13i\x91gt\x00\x00*\x03\x00\x000000..."
I would like to extract the hexadecimal values between INFO5\t and \t..., so the output would be
"74 23 03 92 00 00 00 EC 02 10 00 E2 00 77"
I'm doing the following, but it only removes the first unwanted part and leaves \n\n\x06...000 behind.
How can I fix this?
irb(main)>: a.map(&:chr).join.gsub(/(\t .*\t )|(\t.*)/,"")
=> "74 23 03 92 00 00 00 EC 02 10 00 E2 00 77\n\n\x06\x02\x13i\x91gt\xE2#0
\x03\xC2\x01\xF20\x03\xC2\x01\xF10\x03\xC2\x01\xF60\x03\xC2\x01\xF50\x03\xC2
\x01\xF40\x03\xC2\x01\xF30\x03\xC2\x01\xF7\xB1\r\n\x01\x01\x04\b\n\x06\x02\
x13i\x91gt\x00\x00*\x03\x00\x0000000002"
Thanks for the help in advance.
UPDATE
A sample binary file is attached below.
input.dat

Here are two approaches (a below is abbreviated from that given in the question).
a = [9, 32, 50, 53, 56, 53, 57, 9, 73, 78, 70, 79, 9, 73, 78, 70, 79, 53, 9,
32, 55, 52, 32, 50, 51, 32, 48, 51, 32, 57, 50, 32, 48, 48, 32, 48, 48,
32, 48, 48, 32, 69, 67, 32, 48, 50, 32, 49, 48, 32, 48, 48, 32, 69, 50,
32, 48, 48, 32, 55, 55, 9, 0, 0]
Extract from the string that had been unpacked to create a
str = a.pack("C*")
#=> "\t 25859\tINFO\tINFO5\t 74 23 03 92 00 00 00 EC 02 10 00 E2 00 77\t\x00\x00"
str[/(?<=INFO5\t).+?(?=\t)/].strip
#=> "74 23 03 92 00 00 00 EC 02 10 00 E2 00 77"
str is the string that was unpacked to produce a (a = str.unpack("C*")), so in practice it need not be recomputed.
(?<=INFO5\t) and (?=\t) are, respectively, a positive lookbehind and a positive lookahead. They must be matched but are not part of the match that is returned. The ("non-greedy") question mark in .+? ensures that the match terminates immediately before the first tab is encountered. By contrast, a greedy .+ extends to the last tab:
"abc\td\tef"[/(?<=a).+(?=\t)/]
#=> "bc\td"
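One thing to watch with the bracket form: str[regex] returns nil when nothing matches, so chaining .strip onto it raises NoMethodError. Below is a minimal sketch of a nil-safe wrapper. The helper name field_after is hypothetical (it is not part of the answer above), and the sample string is the abbreviated one from this answer.

```ruby
# Hypothetical helper (not from the original answer): returns the text that
# follows `prefix` up to the next tab, or nil when the pattern is absent.
def field_after(str, prefix)
  # Regexp.escape guards against regex metacharacters in the prefix.
  pattern = /(?<=#{Regexp.escape(prefix)}).+?(?=\t)/
  # &. skips the strip call when the match is nil.
  str[pattern]&.strip
end

# Abbreviated sample data; .b tags the copy as binary (ASCII-8BIT).
str = "\t 25859\tINFO\tINFO5\t 74 23 03 92 00 00 00 EC 02 10 00 E2 00 77\t\x00\x00".b

p field_after(str, "INFO5\t")    # the hex field, without the leading space
p field_after(str, "MISSING\t")  # nil, no exception
```

The safe-navigation operator keeps the one-liner style of the answer while making the "no match" case explicit.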
Extract from a and convert to a string
def idx_start(a, arr)
  arr_size = arr.size
  a.each_index.find { |i| a[i, arr_size] == arr }
end
pfix = "INFO5\t".unpack("C*")
  #=> [73, 78, 70, 79, 53, 9]
pfix_size = pfix.size
  #=> 6
sfix = [pfix.last]
  #=> [9]
start = idx_start(a, pfix) + pfix_size
  #=> 19
a[start..idx_start(a[start..-1], sfix) + start - 1].pack("C*").strip
  #=> "74 23 03 92 00 00 00 EC 02 10 00 E2 00 77"

I assume you mean a=str.unpack("C*") - you can unpack a string but not an array.
To get the result you want, you don't need to use unpack at all [1] - just perform a regex match:
str.match(/INFO5\t(.*?)\t/).to_a[1]
# => " 74 23 03 92 00 00 00 EC 02 10 00 E2 00 77"
Note that there's a leading space in the result, but you can adjust the regex according to your needs; I'm not going to try to guess the specification of this format.
Tips:
The ? in .*? is needed to make the * non-greedy.
The to_a avoids raising an error when nothing matches: match returns nil, nil.to_a is [], and [][1] is nil.
EDIT
Your comment regarding "invalid byte sequence in UTF-8" indicates that your data is probably ASCII-8BIT (i.e. it's not compatible with UTF-8), but it's stored in a string whose encoding attribute is "UTF-8". It would help if you explain how you obtained that string, because the string's encoding appears to be wrong.
Solution 1 (this is ideal):
Read in the file as ASCII-8BIT:
str = File.read("input.dat", encoding: 'ASCII-8BIT')
Solution 2 (a workaround, if you can't control the input encoding):
# NOTE: this changes the encoding on `str`
str.force_encoding("ASCII-8BIT")
After you've done this, the .match should work.
Further Explanation
Your map(&:chr).join works because Integer#chr produces either US-ASCII or ASCII-8BIT strings (the latter for bytes above 127), never UTF-8.
When you join those strings, the result is ASCII-8BIT if any byte was above 127. So this is effectively the same as calling force_encoding("ASCII-8BIT"), except that map/join builds a new string and leaves the original string's encoding untouched, whereas force_encoding modifies it.
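The encoding behaviour described above can be checked with a short sketch. The sample string is made up for illustration, and the snippet assumes a UTF-8 source file (Ruby's default); String#b is shown as a non-mutating alternative to force_encoding.

```ruby
# A string tagged UTF-8 that contains a byte (0xFE) which is not valid UTF-8.
# Illustrative data only, not the contents of input.dat.
raw = "INFO5\t 74 23\t\xFE\x00".dup
puts raw.valid_encoding?   # false: the UTF-8 tag does not fit the bytes

# String#b returns a binary (ASCII-8BIT) copy and leaves `raw` untouched.
copy = raw.b

# force_encoding retags `raw` itself; the bytes are unchanged.
raw.force_encoding("ASCII-8BIT")
puts raw.valid_encoding?   # true: every byte sequence is valid as binary

# The ASCII-only regex now matches against the binary string.
p raw.match(/INFO5\t(.*?)\t/).to_a[1]
```

If the original string must keep its encoding tag, matching against the copy from String#b avoids the side effect noted in Solution 2.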
[1] unpack is unnecessary because a.map(&:chr).join yields the same bytes as a.pack('C*'), which gives you the original str. Even if you had to unpack the string for another purpose, I recommend using the original string instead of re-packing the array. Maybe you can encapsulate this into a data structure, e.g.:
i_data = InfoData.new(str)
i_data.bytes # array of bytes
i_data.hex_string # "74 23 03 ..."
Note that the above code won't work as-is - you need to write the InfoData class yourself.
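For illustration, here is one way such a class might look. This is a sketch under assumptions: only the class name and the two method names come from the snippet above; the regex, the binary copy, and the sample input are choices made here, not a specification of the format.

```ruby
# Hypothetical InfoData class. The class and method names follow the
# snippet above; the body is an assumption about how it might be written.
class InfoData
  HEX_FIELD = /INFO5\t(.*?)\t/

  def initialize(str)
    # Keep a binary copy so bytes above 127 cannot cause encoding errors.
    @str = str.b
  end

  # Array of byte values, equivalent to str.unpack("C*").
  def bytes
    @str.bytes
  end

  # The hex field after "INFO5\t", or nil when the marker is missing.
  def hex_string
    @str.match(HEX_FIELD).to_a[1]&.strip
  end
end

# Made-up sample input, much shorter than the real record.
i_data = InfoData.new("\tINFO\tINFO5\t 74 23 03\t\x00\xFE")
p i_data.bytes.first(3)  # => [9, 73, 78]
p i_data.hex_string      # => "74 23 03"
```

Keeping the original string inside the object means the byte array and the hex field are both derived from one source, so nothing has to be re-packed.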

I assume that you don't need the binary tail, so as a first step I trim the array at the first null byte using take_while.
Then I convert the integers to a string using map(&:chr).join.
Finally I match the result against the regex /INFO5\t ?([^\t]*)\t/, which assumes the interesting part lies between INFO5\t and the next \t.
a = str.unpack("C*")
a.take_while { |e| e > 0 }.map(&:chr).join.match(/INFO5\t ?([^\t]*)\t/)[1]
# => "74 23 03 92 00 00 00 EC 02 10 00 E2 00 77"

Related

Pearson hash 8-bit implementation is producing very non-uniform values

I am implementing a Pearson hash in order to create a lightweight dictionary structure for a C project which requires a table of file names paired with file data - I want the nice constant search property of hash tables. I'm no math expert so I looked up good text hashes and Pearson came up, with it being claimed to be effective and having a good distribution. I tested my implementation and found that no matter how I vary the table size or the filename max length, the hash is very inefficient, with for example 18/50 buckets being left empty. I trust Wikipedia to not be lying, and yes I am aware I can just download a third party hash table implementation, but I would dearly like to know why my version isn't working.
In the following code (a function to insert values into the table), "csString" is the filename, the string to be hashed, "cLen" is the length of the string, "pData" is a pointer to some data which is inserted into the table, and "pTable" is the table struct. The initial condition cHash = cLen - csString[0] is something I experimentally found to marginally improve uniformity. I should add that I am testing the table with entirely randomised strings (using rand() to generate ASCII values) with randomised lengths within a certain range - this is in order to easily generate and test large amounts of values.
typedef struct StaticStrTable {
unsigned int nRepeats;
unsigned char nBuckets;
unsigned char nMaxCollisions;
void** pBuckets;
} StaticStrTable;
static const char cPerm256[256] = {
227, 117, 238, 33, 25, 165, 107, 226, 132, 88, 84, 68, 217, 237, 228, 58, 52, 147, 46, 197, 191, 119, 211, 0, 218, 139, 196, 153, 170, 77, 175, 22, 193, 83, 66, 182, 151, 99, 11, 144, 104, 233, 166, 34, 177, 14, 194, 51, 30, 121, 102, 49,
222, 210, 199, 122, 235, 72, 13, 156, 38, 145, 137, 78, 65, 176, 94, 163, 95, 59, 92, 114, 243, 204, 224, 43, 185, 168, 244, 203, 28, 124, 248, 105, 10, 87, 115, 161, 138, 223, 108, 192, 6, 186, 101, 16, 39, 134, 123, 200, 190, 195, 178,
164, 9, 251, 245, 73, 162, 71, 7, 239, 62, 69, 209, 159, 3, 45, 247, 19, 174, 149, 61, 57, 146, 234, 189, 15, 202, 89, 111, 207, 31, 127, 215, 198, 231, 4, 181, 154, 64, 125, 24, 93, 152, 37, 116, 160, 113, 169, 255, 44, 36, 70, 225, 79,
250, 12, 229, 230, 76, 167, 118, 232, 142, 212, 98, 82, 252, 130, 23, 29, 236, 86, 240, 32, 90, 67, 126, 8, 133, 85, 20, 63, 47, 150, 135, 100, 103, 173, 184, 48, 143, 42, 54, 129, 242, 18, 187, 106, 254, 53, 120, 205, 155, 216, 219, 172,
21, 253, 5, 221, 40, 27, 2, 179, 74, 17, 55, 183, 56, 50, 110, 201, 109, 249, 128, 112, 75, 220, 214, 140, 246, 213, 136, 148, 97, 35, 241, 60, 188, 180, 206, 80, 91, 96, 157, 81, 171, 141, 131, 158, 1, 208, 26, 41
};
void InsertStaticStrTable(char* csString, unsigned char cLen, void* pData, StaticStrTable* pTable) {
unsigned char cHash = cLen - csString[0];
for (int i = 0; i < cLen; ++i) cHash ^= cPerm256[cHash ^ csString[i]];
unsigned short cTableIndex = cHash % pTable->nBuckets;
long long* pBucket = pTable->pBuckets[cTableIndex];
// Inserts data and records how many collisions there are - it may look weird as the way in which I decided to pack the data into the table buffer is very compact and arbitrary
// It won't affect the hash though, which is the key issue!
for (int i = 0; i < pTable->nMaxCollisions; ++i) {
if (i == 1) {
pTable->nRepeats++;
}
long long* pSlotID = pBucket + (i << 1);
if (pSlotID[0] == 0) {
pSlotID[0] = csString;
pSlotID[1] = pData;
break;
}
}
}

FYI (This is not an answer, I just need the formatting)
These are just single runs from a simulation, YMMV.
distributing 50 elements randomly over 50 bins:
kalender_size=50 nperson = 50
E/cell| Ncell | frac | Nelem | frac |h/cell| hops | Cumhops
----+---------+--------+----------+--------+------+--------+--------
0: 18 (0.360000) 0 (0.000000) 0 0 0
1: 18 (0.360000) 18 (0.360000) 1 18 18
2: 10 (0.200000) 20 (0.400000) 3 30 48
3: 4 (0.080000) 12 (0.240000) 6 24 72
----+---------+--------+----------+--------+------+--------+--------
4: 50 50 1.440000 72
Similarly: distribute persons over a birthday-calendar (ignoring leap days ...); note that this run used 356 slots and 356 persons:
kalender_size=356 nperson = 356
E/cell| Ncell | frac | Nelem | frac |h/cell| hops | Cumhops
----+---------+--------+----------+--------+------+--------+--------
0: 129 (0.362360) 0 (0.000000) 0 0 0
1: 132 (0.370787) 132 (0.370787) 1 132 132
2: 69 (0.193820) 138 (0.387640) 3 207 339
3: 19 (0.053371) 57 (0.160112) 6 114 453
4: 6 (0.016854) 24 (0.067416) 10 60 513
5: 1 (0.002809) 5 (0.014045) 15 15 528
----+---------+--------+----------+--------+------+--------+--------
6: 356 356 1.483146 528
For N items distributed uniformly over N slots, the expected number of empty slots and the expected number of slots holding exactly one item are nearly equal. Both densities approach 1/e (about 0.368) as N grows.
The final number (1.483146) is the number of ->next pointer traversals per found element (when using a chained hash table). Any optimal hash function will almost reach 1.5.
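The 1/e figure can be checked with a short calculation. This is a sketch in Ruby, assuming items are placed uniformly and independently; the two function names are made up for this example.

```ruby
# Expected fraction of empty slots when n items land uniformly at random
# in n slots: a given slot is missed by all n items with probability (1 - 1/n)^n.
def expected_empty_fraction(n)
  (1.0 - 1.0 / n)**n
end

# Expected fraction of slots holding exactly one item:
# n * (1/n) * (1 - 1/n)^(n-1) = (1 - 1/n)^(n-1).
def expected_single_fraction(n)
  (1.0 - 1.0 / n)**(n - 1)
end

[50, 356].each do |n|
  printf("n=%d empty=%.4f single=%.4f 1/e=%.4f\n",
         n, expected_empty_fraction(n), expected_single_fraction(n), Math.exp(-1))
end
```

For n = 50 the expected empty fraction is about 0.364, i.e. roughly 18 empty buckets out of 50, which is in line with the simulated run above and with the 18/50 reported in the question.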

How to convert ArrayBuffer to blob so it can be converted to URL for video playback

I'm trying to create a React.js app that can record short video clips, store them in MongoDB, then retrieve those clips from Mongo at another time and play them back for the user. I'm able to record video using the react-video-recorder which returns a videoBlob after recording. This videoBlob can be converted via URL.createObjectURL and set to the src attribute in the HTML video tag. In this case, the video plays back just fine.
However...
If I store this videoBlob in MongoDB, it's converted into a BSON document which contains an ArrayBuffer element in the following format.
ArrayBuffer {
[Uint8Contents]: <1a 45 df a3 9f 42 86 81 01 42 f7 81 01 42 f2 81 04 42 f3 81 08 42 82 84 77 65 62 6d 42 87 81 04 42 85 81 02 18 53 80 67 01 ff ff ff ff ff ff ff 15 49 a9 66 99 2a d7 b1 83 0f 42 40 4d 80 86 43 68 72 6f 6d 65 57 41 86 43 68 72 6f 6d 65 16 54 ae 6b ea ae bd d7 81 01 73 c5 87 2a 21 fb 84 af c3 05 83 81 ... 86733 more bytes>,
byteLength: 86833
}
Here's how the same ArrayBuffer appears when console.logged in the browser:
(86833) [26, 69, 223, 163, 159, 66, 134, 129, 1, 66, 247, 129, 1, 66, 242, 129, 4, 66, 243, 129, 8, 66, 130, 132, 119, 101, 98, 109, 66, 135, 129, 4, 66, 133, 129, 2, 24, 83, 128, 103, 1, 255, 255, 255, 255, 255, 255, 255, 21, 73, 169, 102, 153, 42, 215, 177, 131, 15, 66, 64, 77, 128, 134, 67, 104, 114, 111, 109, 101, 87, 65, 134, 67, 104, 114, 111, 109, 101, 22, 84, 174, 107, 234, 174, 189, 215, 129, 1, 115, 197, 135, 42, 33, 251, 132, 175, 195, 5, 131, 129, …]
[0 … 9999]
[10000 … 19999]
[20000 … 29999]
[30000 … 39999]
[40000 … 49999]
[50000 … 59999]
[60000 … 69999]
[70000 … 79999]
[80000 … 86832]
length: 86833
I haven't been able to successfully convert this ArrayBuffer back into the original blob which can be used to play back the video. How would I go about doing this? Here is what I've already tried:
var returnedArrayBuffer = video.data;//extracting the ArrayBuffer element from BSON document
console.log(returnedArrayBuffer);
var newVideoBlob = new Blob([returnedArrayBuffer], { type: 'video/webm;codecs="vp8,opus"' });//have also tried with a type of just "video/webm", original videoBlob MIME type is exactly as defined here as well
console.log(newVideoBlob);
var url = URL.createObjectURL(newVideoBlob);
However, this returns a blob that is roughly three times the size of the original blob and fails to playback at all. What am I doing wrong here? Any assistance is greatly appreciated.

Taking minimum value of each entry +- 10 rows either side in numpy array

I have a 3d numpy array and want to generate a secondary array consisting of the minimum of each value and the values in the 10 rows directly above and 10 rows directly below (i.e each entry is the minimum value from 21 values) for each 2d array.
I've been trying to use 'numpy.clip' to deal with the edges of the array - here the range of values which the minimum is taken from should simply reduce to 10 at the values on the top/bottom of the array. I think something like 'scipy.signal.argrelmin' seems to be what I'm after.
Here's my code so far, definitely not the best way to go about it:
import numpy as np
array_3d = np.random.random_integers(50, 80, (3, 50, 18))
minimums = np.zeros(array_3d.shape)
for array_2d_index in range(len(array_3d)):
    for row_index in range(len(array_3d[array_2d_index])):
        for col_index in range(len(array_3d[array_2d_index][row_index])):
            minimums[array_2d_index][row_index][col_index] = min(array_3d[array_2d_index][np.clip(row_index-10, 0, 49):np.clip(row_index+10, 0, 49)][col_index])
The main issue I think is that this is taking the minimum from the columns either side of each entry instead of the rows, which has been giving index errors.
Would appreciate any advice, thanks.

Approach #1
Here's one approach with np.lib.stride_tricks.as_strided -
def strided_3D_axis1(array_3d, L):
    s0,s1,s2 = array_3d.strides
    strided = np.lib.stride_tricks.as_strided
    m,n,r = array_3d.shape
    nL = n-L+1
    return strided(array_3d, (m,nL,L,r),(s0,s1,s1,s2))

out = strided_3D_axis1(array_3d, L=21).min(axis=-2)
Sample run -
1) Input :
In [179]: array_3d
Out[179]:
array([[[73, 65, 51, 76, 59],
[74, 57, 75, 53, 70],
[60, 74, 52, 54, 60],
[54, 52, 62, 75, 50],
[68, 56, 68, 63, 77]],
[[62, 70, 60, 79, 74],
[70, 68, 50, 74, 57],
[63, 57, 69, 65, 54],
[63, 63, 68, 58, 60],
[70, 66, 65, 78, 78]]])
2) Strided view :
In [180]: strided_3D_axis1(array_3d, L=3)
Out[180]:
array([[[[73, 65, 51, 76, 59],
[74, 57, 75, 53, 70],
[60, 74, 52, 54, 60]],
[[74, 57, 75, 53, 70],
[60, 74, 52, 54, 60],
[54, 52, 62, 75, 50]],
[[60, 74, 52, 54, 60],
[54, 52, 62, 75, 50],
[68, 56, 68, 63, 77]]],
[[[62, 70, 60, 79, 74],
[70, 68, 50, 74, 57],
[63, 57, 69, 65, 54]],
[[70, 68, 50, 74, 57],
[63, 57, 69, 65, 54],
[63, 63, 68, 58, 60]],
[[63, 57, 69, 65, 54],
[63, 63, 68, 58, 60],
[70, 66, 65, 78, 78]]]])
3) Strided view based min :
In [181]: strided_3D_axis1(array_3d, L=3).min(axis=-2)
Out[181]:
array([[[60, 57, 51, 53, 59],
[54, 52, 52, 53, 50],
[54, 52, 52, 54, 50]],
[[62, 57, 50, 65, 54],
[63, 57, 50, 58, 54],
[63, 57, 65, 58, 54]]])
Approach #2
Here's another with broadcasting upon creating all sliding indices along the second axis -
array_3d[:,np.arange(array_3d.shape[1]-L+1)[:,None] + range(L)].min(-2)
Approach #3
Here's another using Scipy's 1D minimum filter -
from scipy.ndimage import minimum_filter1d as minf
L = 21
hL = (L-1)//2
out = minf(array_3d,L,axis=1)[:,hL:-hL]
Runtime test -
In [231]: array_3d = np.random.randint(50, 80, (3, 50, 18))
In [232]: %timeit strided_3D_axis1(array_3d, L=21).min(axis=-2)
10000 loops, best of 3: 54.2 µs per loop
In [233]: %timeit array_3d[:,np.arange(array_3d.shape[1]-L+1)[:,None] + range(L)].min(-2)
10000 loops, best of 3: 81.3 µs per loop
In [234]: L = 21
...: hL = (L-1)//2
...:
In [235]: %timeit minf(array_3d,L,axis=1)[:,hL:-hL]
10000 loops, best of 3: 32 µs per loop

Ruby array conversion

I have a string of digits:
s = "12345678910"
As you can see it is the numbers 1 through 10 listed in increasing order. I want to convert it to an array of those numbers:
a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
How can I do it?

How about this:
a = ["123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899"]
b = a.first.each_char.map {|n| n.to_i }
if b.size > 8
  c = b[0..8]
  c += b[9..b.size].each_slice(2).map(&:join).map(&:to_i)
end
# It would yield as follows:
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99]
For later numbers beyond 99, modify existing predicate accordingly.
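One way to avoid adjusting the slice width by hand is to walk the string and consume the decimal form of each expected number in turn. The sketch below assumes the string is exactly the concatenation 1, 2, 3, ... with no gaps; the method name split_consecutive is made up for this example, and the endless range needs Ruby 2.6 or later.

```ruby
# Hypothetical generalisation (not from the answer above): consume the
# decimal representation of 1, 2, 3, ... from the front of the string.
def split_consecutive(s)
  out = []
  rest = s
  n = 1
  until rest.empty?
    token = n.to_s
    # Fail loudly if the string is not the expected concatenation.
    raise ArgumentError, "expected #{token} at #{rest[0, 10].inspect}" unless rest.start_with?(token)
    out << n
    rest = rest[token.size..]   # drop the digits just consumed
    n += 1
  end
  out
end

p split_consecutive("12345678910")
p split_consecutive((1..120).to_a.join).last(3)
```

Because the width of each token comes from n.to_s, the same loop handles the transitions from one to two digits, two to three, and so on.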

Assuming a monotonic sequence, here's my run at it.
a = ["12345678910"]  # one-element array, as in the answer above
input = a.first.chars
output = []
previous_int = 0
until input.empty?
  temp = []
  temp << input.shift until temp.join.to_i > previous_int
  previous_int = temp.join.to_i
  output << previous_int
end
puts output.to_s
#=> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

Assumptions
the first (natural) number extracted from the string is the first character of the string converted to an integer;
if the number n is extracted from the string, the next number extracted, m, satisfies n <= m (i.e., the sequence is monotonically non-decreasing);
if n is extracted from the string, the next number extracted will have as few digits as possible (i.e., at most one greater than the number of digits in n); and
there is no need to check the validity of the string (e.g., "54632" is invalid).
Code
def split_it(str)
  return [] if str.empty?
  a = [str[0]]
  offset = 1
  while offset < str.size
    sz = a.last.size
    sz += 1 if str[offset, sz] < a.last
    a << str[offset, sz]
    offset += sz
  end
  a.map(&:to_i)
end
Examples
split_it("12345678910")
#=> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
split_it("12343636412252891407189118901")
#=> [1, 2, 3, 4, 36, 36, 41, 225, 289, 1407, 1891, 18901]

Using Grid-Like Data in C

I recently wrote code for a project euler problem, and by the time I had worked around every bug I ran into my code was pretty convoluted and no longer pretty and efficient. I had to manually manipulate my data far too much for my liking. I cannot find a straight forward answer elsewhere and would like a more graceful solution.
I'm not even sure this is possible in C, so keep that in mind.
The problem requires analyzing a grid of data that is in plain text.
The grid is as follows...
08 02 22 97 38 15 00 40 00 75 04 05 07 78 52 12 50 77 91 08
49 49 99 40 17 81 18 57 60 87 17 40 98 43 69 48 04 56 62 00
81 49 31 73 55 79 14 29 93 71 40 67 53 88 30 03 49 13 36 65
52 70 95 23 04 60 11 42 69 24 68 56 01 32 56 71 37 02 36 91
22 31 16 71 51 67 63 89 41 92 36 54 22 40 40 28 66 33 13 80
24 47 32 60 99 03 45 02 44 75 33 53 78 36 84 20 35 17 12 50
32 98 81 28 64 23 67 10 26 38 40 67 59 54 70 66 18 38 64 70
67 26 20 68 02 62 12 20 95 63 94 39 63 08 40 91 66 49 94 21
24 55 58 05 66 73 99 26 97 17 78 78 96 83 14 88 34 89 63 72
21 36 23 09 75 00 76 44 20 45 35 14 00 61 33 97 34 31 33 95
78 17 53 28 22 75 31 67 15 94 03 80 04 62 16 14 09 53 56 92
16 39 05 42 96 35 31 47 55 58 88 24 00 17 54 24 36 29 85 57
86 56 00 48 35 71 89 07 05 44 44 37 44 60 21 58 51 54 17 58
19 80 81 68 05 94 47 69 28 73 92 13 86 52 17 77 04 89 55 40
04 52 08 83 97 35 99 16 07 97 57 32 16 26 26 79 33 27 98 66
88 36 68 87 57 62 20 72 03 46 33 67 46 55 12 32 63 93 53 69
04 42 16 73 38 25 39 11 24 94 72 18 08 46 29 32 40 62 76 36
20 69 36 41 72 30 23 88 34 62 99 69 82 67 59 85 74 04 36 16
20 73 35 29 78 31 90 01 74 31 49 71 48 86 81 16 23 57 05 54
01 70 54 71 83 51 54 69 16 92 33 48 61 43 52 01 89 19 67 48
The idea is to find the largest possible product of four adjacent numbers (vertical, horizontal, or diagonal).
In the end my solution involved manually inputting this into a two-dimensional int array and manually changing all 08's or 09's to 8's and 9's to avoid the octal number problem.
Like so...
int str[20][20] = {{ 8, 02, 22, 97, 38, 15, 00, 40, 00, 75, 04, 05, 07, 78, 52, 12, 50, 77, 91, 8},{49, 49, 99, 40, 17, 81, 18, 57, 60, 87, 17, 40, 98, 43, 69, 48, 04, 56, 62, 00},{81, 49, 31, 73, 55, 79, 14, 29, 93, 71, 40, 67, 53, 88, 30, 03, 49, 13, 36, 65},{52, 70, 95, 23, 04, 60, 11, 42, 69, 24, 68, 56, 01, 32, 56, 71, 37, 02, 36, 91},{22, 31, 16, 71, 51, 67, 63, 89, 41, 92, 36, 54, 22, 40, 40, 28, 66, 33, 13, 80},{24, 47, 32, 60, 99, 03, 45, 02, 44, 75, 33, 53, 78, 36, 84, 20, 35, 17, 12, 50},{32, 98, 81, 28, 64, 23, 67, 10, 26, 38, 40, 67, 59, 54, 70, 66, 18, 38, 64, 70},{67, 26, 20, 68, 02, 62, 12, 20, 95, 63, 94, 39, 63, 8, 40, 91, 66, 49, 94, 21},{24, 55, 58, 05, 66, 73, 99, 26, 97, 17, 78, 78, 96, 83, 14, 88, 34, 89, 63, 72},{21, 36, 23, 9, 75, 00, 76, 44, 20, 45, 35, 14, 00, 61, 33, 97, 34, 31, 33, 95},{78, 17, 53, 28, 22, 75, 31, 67, 15, 94, 03, 80, 04, 62, 16, 14, 9, 53, 56, 92},{16, 39, 05, 42, 96, 35, 31, 47, 55, 58, 88, 24, 00, 17, 54, 24, 36, 29, 85, 57},{86, 56, 00, 48, 35, 71, 89, 07, 05, 44, 44, 37, 44, 60, 21, 58, 51, 54, 17, 58},{19, 80, 81, 68, 05, 94, 47, 69, 28, 73, 92, 13, 86, 52, 17, 77, 04, 89, 55, 40},{04, 52, 8, 83, 97, 35, 99, 16, 07, 97, 57, 32, 16, 26, 26, 79, 33, 27, 98, 66},{88, 36, 68, 87, 57, 62, 20, 72, 03, 46, 33, 67, 46, 55, 12, 32, 63, 93, 53, 69},{04, 42, 16, 73, 38, 25, 39, 11, 24, 94, 72, 18, 8, 46, 29, 32, 40, 62, 76, 36},{20, 69, 36, 41, 72, 30, 23, 88, 34, 62, 99, 69, 82, 67, 59, 85, 74, 04, 36, 16},{20, 73, 35, 29, 78, 31, 90, 01, 74, 31, 49, 71, 48, 86, 81, 16, 23, 57, 05, 54},{01, 70, 54, 71, 83, 51, 54, 69, 16, 92, 33, 48, 61, 43, 52, 01, 89, 19, 67, 48}};
This is not only tedious but it seems inefficient as well. Is there a way in C to take this data from the plain text grid, besides using a char string? And if not what would be a more elegant way to take this data?
I am self taught so I apologize for any glaring holes in what I know.

Is there a way in c to take this data from the plain text grid, besides using a char string? And if not what would be a more elegant way to take this data?
The approach to take is: save the data as a file (say input.txt), redirect it to your program, and read all of the entries through stdin. It would look like the following:
// Compile-time constants, so the array can be zero-initialized
// (a variable-length array cannot take an initializer).
enum { ROWS = 20, COLS = 20 };
int arr[ ROWS ][ COLS ] = { 0 };
int crow = 0;
int ccol = 0;
int num;
// Reads until input runs out, a token fails to parse, or the grid
// is full. "%d" reads decimal, so "08" and "09" parse as 8 and 9.
while ( crow < ROWS && scanf( "%d", &num ) == 1 ) {
    // Store num at the current row and column.
    arr[ crow ][ ccol ] = num;
    // Advance to the next column; after the last column, wrap to
    // column 0 of the next row.
    ccol++;
    if ( ccol >= COLS ) {
        ccol = 0;
        crow++;
    }
}
... this would be inside a function in your driver file (possibly main). It reads the numbers one at a time, populating the array as it goes, and stops when EOF (end of file) is reached. See the scanf documentation for further details.
You would then run your program as follows to redirect the input file to its stdin:
./program.out < input.txt
Remark:
I am not using a dynamically allocated array here. If you plan to receive an arbitrarily large file then I suggest allocating the array on the heap (e.g. with malloc), as the stack is rather small in comparison.
