Matlab wrong in array multiplication? - arrays

I have this simple program:
% Read Image:
I=imread('Bureau.bmp');
% calculate Hist:
G= unique(I); % Calculate the different gray values
Hist= zeros(size(G)); % initialize an array with the same size as G
% For each different gray value, loop all the image, and each time you find
% a value that equals the gray value, increment the hist by 1
for j=1:numel(G)
for i= 1:numel (I)
if G(j)== I(i)
Hist(j)=Hist(j)+1;
end
end
end
Now look at this multiplication:
>> G(2)
ans =
1
>> Hist(2)
ans =
550
>> Hist(2)*G(2)
ans =
255
And it's giving me 255 not only for the index 2, but for any combination of indexes!

Two things for your problem.
First, the reason for the multiplication result: mixed types. I, and therefore G, are of type uint8, while Hist is of type double. When you multiply a double by an integer type, Matlab returns the integer type, here uint8. So the result of Hist(2)*G(2) is a uint8, and anything above 255 saturates to 255. Converting first, as in Hist(2)*double(G(2)), gives 550.
Second: please DON'T compute a histogram this way. Matlab has numerous functions for that (hist and histc are the most common ones), so please read the doc and use them instead of writing your own code. If you nevertheless want to write your own function (for learning purposes), this code is far too slow. It goes through the whole image once per distinct gray value, up to 256 times, which is unnecessary. Instead, a classic way would be:
Hist = zeros(1,256);
for i=1:numel(I)
Hist(int32(I(i))+1) = Hist(int32(I(i))+1)+1;
end
You use the value of the current pixel directly (+1 because indexing starts at 1 in Matlab) to access the corresponding slot of your histogram. Also, you must cast the pixel value to a wider type such as int32 first: uint8 arithmetic saturates, so 255+1 stays 255 and pixels with value 255 would be counted in the slot meant for 254.
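For readers following the C questions further down this page, the same single-pass idea can be sketched in C. This is only an illustration under assumptions: the function name gray_histogram and the flat pixel buffer are hypothetical, not part of the original Matlab code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count how often each 8-bit gray value occurs, visiting every pixel once.
   hist must have room for 256 counters. The flat pixel buffer is an
   assumption made for this sketch. */
void gray_histogram(const uint8_t *pixels, size_t count, unsigned long hist[256])
{
    assert(pixels != NULL || count == 0);
    for (size_t v = 0; v < 256; v++)
        hist[v] = 0;
    for (size_t i = 0; i < count; i++)
        hist[pixels[i]]++;   /* the pixel value itself is the 0-based index */
}
```

C arrays are 0-based and the index is promoted to int, so neither the +1 offset nor the widening cast is needed here.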
I don't want to be pedantic here, but Matlab comes with thousands of functions (not to mention the dozens of toolboxes) and very well-written documentation, so please read it and use every suitable function you can find in it. That's the best advice I can give to anybody who is starting to learn Matlab.

Related

Pre-allocate logical array with unassigned elements (not true or false)

I'm looking for the most efficient method of pre-allocating a logical array in MATLAB without specifying true or false at the time of pre-allocation.
When pre-allocating e.g. a 1×5 numeric array I can use nan(1,5). To my mind, this is better than using zeros(1,5), since I can easily tell which slots have been filled with data versus those that are yet to be filled. If using the zeros() solution it's hard to know whether any 0s are intentional 0s or just unfilled slots in the array.
I'm aware that I can pre-allocate a logical array using true(1,5) or false(1,5). The problem with these is similar to the use of zeros() in the numeric example; there's no way of knowing whether a slot is filled or not.
I know that one solution to this problem is to treat the array as numeric and pre-allocate using nan(1,5), and only converting to a logical array later when all the slots are filled. But this strikes me as inefficient.
Is there some smart way to pre-allocate a logical array in MATLAB and remain agnostic as to the actual content of that array until it is ready to be filled?
The short answer is no. The point of a logical array is that each element takes a single byte, and the implementation is capable of storing only two states (true=1 or false=0). You might assume that logicals only need a single bit, but in fact they use 8 bits (a byte) to avoid compromising on performance.
If memory is a concern, you could use a single array instead of a double array, moving from 64-bit to 32-bit numbers and still capable of storing NaN. Then you can cast to logical whenever required (assuming you have no NaNs by that point, otherwise it will error).
If it was really important to track whether a value was ever assigned whilst also reducing memory, you could keep a second logical array which you update at the same time as the first, and which simply stores whether each value was ever assigned. This can then be used to check whether you have any default values left after the assignments. Now we've dropped from 32-bit singles to two 8-bit logicals, which is worse than one logical but still twice as efficient as using floating point numbers for the sake of the NaN. Obviously assignment operations now take twice as long as with a single logical array; I don't know how they compare to float assignments.
Going off-piste, you could make your own class to do this assignment-tracking for you, and display the logical array as if it was capable of storing NaNs. This isn't really recommended but I've written the below code to complete the thought experiment.
Note you originally asked for "the most efficient method"; in terms of execution time this is definitely not going to be as efficient as the native implementation of logical arrays.
classdef nanBool
properties
assigned % Tracks whether element of "value" was ever assigned
value % Tracks boolean array
end
methods
function obj = nanBool(varargin)
% Constructor: initialise main and tracking arrays to false
% handles same inputs as using "false()" normally
obj.value = false(varargin{:});
obj.assigned = false(size(obj.value));
end
function b = subsref(obj,S)
% Override the indexing operator so that indexing works like it
% would for a logical array unless accessing object properties
if strcmp(S.type,'.')
b = obj.(S.subs);
else
b = builtin('subsref',obj.value,S);
end
end
function obj = subsasgn(obj,S,B)
% Override the assignment operator so that the value array is
% updated when normal array indexing is used. In sync, update
% the assigned state for the corresponding elements
obj.value = builtin('subsasgn',obj.value,S,B);
obj.assigned = builtin('subsasgn',obj.assigned,S,true(size(B)));
end
function disp(obj)
% Override the disp function so printing to the command window
% renders NaN for elements which haven't been assigned
a = double(obj.value);
a(~obj.assigned) = NaN;
disp(a);
end
end
end
Test cases:
>> a = nanBool(3,1)
a =
NaN
NaN
NaN
>> a(2) = true
a =
NaN
1
NaN
>> a(3) = false
a =
NaN
1
0
>> a(:) = true
a =
1
1
1
>> whos a
Name Size Bytes Class Attributes
a 1x1 6 nanBool
>> b = false(3,1); whos b
Name Size Bytes Class Attributes
b 3x1 3 logical
Note the whos test shows this custom class has the same memory footprint as two logical arrays of the same size. It also shows that the size is reported incorrectly, indicating we'd also have to override the size function in our custom class. I'm sure there are lots of other similar edge cases you'd want to handle.
You could check whether there are any "logical NaNs" (unassigned values) with something like this, or add a function which does this to the class:
fullyAssigned = all(a.assigned);
In 21b and newer you can do some more controlled indexing overrides for custom classes instead of subsref and subsasgn, but I can't test this:
https://uk.mathworks.com/help/matlab/customize-object-indexing.html

How do you use bitwise operators, masks, to find if a number is a multiple of another number?

So I have been told that this can be done and that bitwise operations and masks can be very useful but I must be missing something in how they work.
I am trying to calculate whether a number, say x, is a multiple of y. If x is a multiple of y great end of story, otherwise I want to increase x to reach the closest multiple of y that is greater than x (so that all of x fits in the result). I have just started learning C and am having difficulty understanding some of these tasks.
Here is what I have tried but when I input numbers such as 5, 9, or 24 I get the following respectively: 0, 4, 4.
if(x&(y-1)){ //if not 0 then multiple of y
x = x&~(y-1) + y;
}
Any explanations, examples of the math that is occurring behind the scenes, are greatly appreciated.
EDIT: So to clarify, I somewhat understand the shifting of bits to get whether an item is a multiple. (As was explained in a reply 10100 is a multiple of 101 as it is just shifted over). If I have the number 16, which is 10000, its complement is 01111. How would I use this complement to see if an item is a multiple of 16? Also can someone give a numerical explanation of the code given above? Showing this may help me understand why it does not work. Once I understand why it does not work I will be able to problem solve on my own I believe.
Why would you even think about using bit-wise operations for this? They certainly have their place but this isn't it.
A better method is to simply use something like:
unsigned multGreaterOrEqual(unsigned x, unsigned y) {
if ((x % y) == 0)
return x;
return (x / y + 1) * y;
}
In the trivial cases, multiplying a number by a power of 2 just shifts it to the left (this doesn't apply when the shift would alter the sign bit).
For example
10100
is 4 times
101
and
10100
is 2 times
1010
As for other multiples, they would have to be found by combining the outputs of two shifts. You might want to look up some primitive means of computer division, where division looks roughly like
x = a / b
implemented like
buffer = a
x = 0
while buffer is at least b; do
subtract b from buffer
add 1 to x
done
Faster routines try to figure out the higher place values first, skipping lots of subtractions. All of these routines can be done bitwise, but it is a big pain. In the ALU these routines are done bitwise. You might want to look at a digital logic design book for more ideas.
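As a rough sketch of the "higher place values first" idea, here is shift-and-subtract (binary long) division in C. The function name divide_shift_subtract and the 32-bit operand width are assumptions for illustration; real hardware and library routines differ in detail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Binary long division: returns a / b and, if rem is not NULL,
   stores a % b in *rem. One compare-and-subtract per bit of a. */
uint32_t divide_shift_subtract(uint32_t a, uint32_t b, uint32_t *rem)
{
    assert(b != 0);
    uint32_t quotient = 0;
    uint64_t remainder = 0;   /* wider type so the shift below cannot overflow */
    for (int bit = 31; bit >= 0; bit--) {
        /* bring down the next bit of a, most significant first */
        remainder = (remainder << 1) | ((a >> bit) & 1u);
        if (remainder >= b) {
            remainder -= b;                  /* b fits at this place value */
            quotient |= (uint32_t)1 << bit;  /* so this quotient bit is 1 */
        }
    }
    if (rem != NULL)
        *rem = (uint32_t)remainder;
    return quotient;
}
```

The loop always runs 32 times, however large the quotient is, whereas the repeated-subtraction version needs one iteration per unit of the quotient.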
Ok, so I have discovered what the error was in my code and since the majority say that it is impossible to calculate whether a number is a multiple of another number using masks I figured I would share what I have learned.
It is possible! - if you are using the correct data types that is.
The code given above works if y is declared as a const unsigned long, because the x being passed in is also an unsigned long. The key point is not the long or const part but that the numbers are unsigned: in a signed number the top bit indicates the sign, and bitwise operations on it can give muddled results. Note that the mask trick also requires y to be a power of two.
So here is my code if we are looking for multiples of 16:
const unsigned long y = 16; //declared globally in my case
Then an unsigned long is passed to the function which runs the following code:
if(x&(y-1)){ //if not 0 then x is not a multiple of y
x = (x&~(y-1)) + y; //parentheses needed: + binds more tightly than &
}
x will now be the smallest multiple of 16 that is greater than or equal to its original value.
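A minimal sketch of the mask-based rounding wrapped in a function, assuming y is a power of two (the name round_up_pow2 is made up for this example). It also makes the operator precedence explicit, since in C + binds more tightly than &.

```c
#include <assert.h>

/* Round x up to the next multiple of y. Only valid when y is a power
   of two, because only then is (y - 1) a mask of the low bits. */
unsigned long round_up_pow2(unsigned long x, unsigned long y)
{
    assert(y != 0 && (y & (y - 1)) == 0);   /* exactly one bit set */
    if (x & (y - 1))                        /* low bits set: not a multiple */
        x = (x & ~(y - 1)) + y;             /* clear low bits, then step up */
    return x;
}
```

Worked through for x = 24 and y = 16: y - 1 is 01111, x is 11000, so x & (y - 1) is 01000 (non-zero, not a multiple). Clearing the low bits gives 10000 (16), and adding y gives 100000 (32).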

Perfect Power detection in linear time

I'm trying to write a C program which, given a positive integer n (> 1), detects whether there exist numbers x and r such that n = x^r
This is what I did so far:
while (c>=d) {
double y = pow(sum, 1.0/d);
if (floor(y) == y) {
out = y;
break;
}
d++;
}
In the program above, "c" is the maximum value for the exponent (r) and "d" starts at 2. "sum" is the value to be checked and the variable "out" is set so that the result can be output later on. Basically, the code checks whether sum has an integer square root; if not, it tries the cube root, and so on. When it finds one, it stores the value of y in "out", so that: sum = out^d
My question is, is there any more efficient way to find these values? I found some documentation online, but that's far more complicated than my high-school algebra. How can I implement this in a more efficient way?
Thanks!
In one of your comments, you state you want this to be compatible with gigantic numbers. In that case, you may want to bring in the GMP library, which supports operations on arbitrarily large numbers, one of those operations being checking if it is a perfect power.
It is open source, so you can check out the source code and see how they do it, if you don't want to bring in the whole library.
If n fits in a fixed-size (e.g. 32-bit) integer variable, the optimal solution is probably just hard-coding the list of such numbers and binary-searching it. Keep in mind, in int range, there are roughly
sqrt(INT_MAX) perfect squares
cbrt(INT_MAX) perfect cubes
etc.
In 32 bits, that's roughly 65536 + 2048 + 256 + 128 + 64 + ... < 70000.
You need the base-r logarithm; use an identity to calculate it from the natural log.
So:
log_r(x) = log(x)/log(r)
So you need to calculate:
x = log(n)/log(r)
(In my neck of the woods, this is high-school math. Which immediately explains my having to look up whether I remembered that identity correctly :))
After calculating y in
double y = pow(sum, 1.0/d);
you can take the nearest int to it and use your own integer power function to check for
equality with sum.
int x = (int)(y+0.5);
int a = your_power_func(x,d);
if (a == sum)
break;
This way you can confirm whether a number is an integer power of some other number without relying on an exact floating-point comparison.
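A hedged sketch of that verification idea taken one step further: instead of rounding the result of pow(), it binary-searches the integer root for each exponent and compares with exact integer arithmetic. This swaps the floating-point root for a different technique, and the names cmp_power and perfect_power are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Compare base^exp with n without overflowing: returns -1, 0 or 1.
   Requires base >= 2. */
static int cmp_power(uint64_t base, unsigned exp, uint64_t n)
{
    uint64_t result = 1;
    for (unsigned i = 0; i < exp; i++) {
        if (result > n / base)
            return 1;            /* result * base would exceed n */
        result *= base;
    }
    return (result < n) ? -1 : 0;   /* result can no longer exceed n here */
}

/* If n == x^r for some x >= 2 and r >= 2, store the pair with the
   smallest such r and return 1; otherwise return 0. */
int perfect_power(uint64_t n, uint64_t *x_out, unsigned *r_out)
{
    assert(n > 1);
    for (unsigned r = 2; r < 64; r++) {
        if (cmp_power(2, r, n) > 0)
            break;               /* even 2^r is too big, so is every later r */
        uint64_t lo = 2, hi = n;
        while (lo <= hi) {       /* binary search for the r-th root */
            uint64_t mid = lo + (hi - lo) / 2;
            int c = cmp_power(mid, r, n);
            if (c == 0) {
                *x_out = mid;
                *r_out = r;
                return 1;
            }
            if (c < 0)
                lo = mid + 1;
            else
                hi = mid - 1;
        }
    }
    return 0;
}
```

For 64-bit inputs this tries at most 62 exponents with about 64 search steps each, so no floating-point rounding is involved at any point.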

How to map a long integer number to a N-dimensional vector of smaller integers (and fast inverse)?

Given a N-dimensional vector of small integers is there any simple way to map it with one-to-one correspondence to a large integer number?
Say, we have N=3 vector space. Can we represent a vector X=[(int16)x1,(int16)x2,(int16)x3] using an integer (int48)y? The obvious answer is "Yes, we can". But the question is: "What is the fastest way to do this and its inverse operation?"
Will this new 1-dimensional space possess some very special useful properties?
For the above example you have 3 * 32 = 96 bits of information, so without any a priori knowledge you need 96 bits for the equivalent long integer.
However, if you know that your x1, x2, x3, values will always fit within, say, 16 bits each, then you can pack them all into a 48 bit integer.
In either case the technique is very simple you just use shift, mask and bitwise or operations to pack/unpack the values.
Just to make this concrete, if you have a 3-dimensional vector of 8-bit numbers, like this:
uint8_t vector[3] = { 1, 2, 3 };
then you can join them into a single (24-bit number) like so:
uint32_t all = (vector[0] << 16) | (vector[1] << 8) | vector[2];
This number would, if printed using this statement:
printf("the vector was packed into %06x", (unsigned int) all);
produce the output
the vector was packed into 010203
The reverse operation would look like this:
uint8_t v2[3];
v2[0] = (all >> 16) & 0xff;
v2[1] = (all >> 8) & 0xff;
v2[2] = all & 0xff;
Of course this all depends on the size of the individual numbers in the vector and the length of the vector together not exceeding the size of an available integer type, otherwise you can't represent the "packed" vector as a single number.
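For the case in the question (three 16-bit values in 48 bits), the same shift-and-mask approach might look like the sketch below. The function names pack3x16 and unpack3x16 are hypothetical, and a uint64_t is used as the container because standard C has no 48-bit integer type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack three 16-bit values into the low 48 bits of a 64-bit integer. */
uint64_t pack3x16(const uint16_t v[3])
{
    assert(v != NULL);
    /* widen before shifting so the shifts are done in 64 bits */
    return ((uint64_t)v[0] << 32) | ((uint64_t)v[1] << 16) | (uint64_t)v[2];
}

/* Inverse of pack3x16: recover the three 16-bit values. */
void unpack3x16(uint64_t packed, uint16_t v[3])
{
    assert(v != NULL);
    v[0] = (uint16_t)((packed >> 32) & 0xffff);
    v[1] = (uint16_t)((packed >> 16) & 0xffff);
    v[2] = (uint16_t)(packed & 0xffff);
}
```

The casts to uint64_t before shifting matter: shifting a 16-bit value left by 32 in plain int arithmetic would be undefined on platforms with a 32-bit int.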
If you have sets Si, i=1..n of size Ci = |Si|, then the cartesian product set S = S1 x S2 x ... x Sn has size C = C1 * C2 * ... * Cn.
This motivates an obvious way to do the packing one-to-one. If you have elements e1,...,en, one from each set, each in the range 0 to Ci-1, then you give the element e=(e1,...,en) the value e1 + C1*(e2 + C2*(e3 + ... + C(n-1)*en)).
You can do any permutation of this packing if you feel like it, but unless the values are perfectly correlated, the size of the full set must be the product of the sizes of the component sets.
In the particular case of three 32 bit integers, if they can take on any value, you should treat them as one 96 bit integer.
If you particularly want to, you can map small values to small values through any number of means (e.g. filling out spheres with the L1 norm), but you have to specify what properties you want to have.
(For example, one can map (n,m) to (max(n,m)-1)^2 + k where k=n if n<=m and k=n+m if n>m--you can draw this as a picture of filling in a square like so:
1 2 5 | draw along the edge of the square this way
4 3 6 v
8 7
if you start counting from 1 and only worry about positive values; for integers, you can spiral around the origin.)
I'm writing this without having time to check details, but I suspect the best way is to represent your long integer via modular arithmetic, using k different integers which are mutually prime. The original integer can then be reconstructed using the Chinese remainder theorem. Sorry this is a bit sketchy, but hope it helps.
To expand on Rex Kerr's generalised form, in C you can pack the numbers like so:
X = e[n];
X *= MAX_E[n-1] + 1;
X += e[n-1];
/* ... */
X *= MAX_E[0] + 1;
X += e[0];
And unpack them with:
e[0] = X % (MAX_E[0] + 1);
X /= (MAX_E[0] + 1);
e[1] = X % (MAX_E[1] + 1);
X /= (MAX_E[1] + 1);
/* ... */
e[n] = X;
(Where MAX_E[n] is the greatest value that e[n] can have). Note that these maximum values are likely to be constants, and may be the same for every e, which will simplify things a little.
The shifting / masking implementations given in the other answers are a generalisation of this, for cases where the MAX_E + 1 values are powers of 2 (and thus the multiplication and division can be done with a shift, the addition with a bitwise-or and the modulus with a bitwise-and).
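The unrolled steps above can also be written as loops. This is a sketch under assumptions: radix[i] stands for MAX_E[i] + 1, the function names are invented, and the product of all radices is assumed to fit in 64 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack n digits into one integer; e[i] must be in 0 .. radix[i] - 1.
   Assumes the product of all radix[i] fits in a uint64_t. */
uint64_t pack_mixed_radix(const unsigned *e, const unsigned *radix, size_t n)
{
    uint64_t x = 0;
    for (size_t i = n; i-- > 0; ) {    /* last element first, as above */
        assert(e[i] < radix[i]);
        x = x * radix[i] + e[i];
    }
    return x;
}

/* Inverse of pack_mixed_radix: peel the digits off from e[0] upwards. */
void unpack_mixed_radix(uint64_t x, unsigned *e, const unsigned *radix, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        e[i] = (unsigned)(x % radix[i]);
        x /= radix[i];
    }
}
```

For example, with radices {4, 2, 7} the vector {3, 0, 6} packs to 3 + 4*(0 + 2*6) = 51, and unpacking 51 recovers 3, 0 and 6 in that order.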
There are some totally non-portable ways to make this really fast using packed unions and direct memory access. That you really need this kind of speed is doubtful. Methods using shifts and masks should be fast enough for most purposes. If not, consider using specialized processors such as GPUs, which are optimized for (parallel) vector operations.
This naive storage does not have any useful property that I can foresee, except that you can perform some computations (add, sub, bitwise logical operators) on the three coordinates at once, as long as you use positive integers only and the adds and subs don't overflow.
You'd better be quite sure you won't overflow (or go negative for sub), or the vector will become garbage.
#include <stdint.h> // for uint8_t
long x;
uint8_t * p = (uint8_t *)&x; // cast needed: &x has type long *
or
union X {
long L;
uint8_t A[sizeof(long)/sizeof(uint8_t)];
};
works if you don't care about endianness. In my experience compilers generate better code with the union because it doesn't set off their "you took the address of this, so I must keep it in RAM" rules as quickly. These rules will get set off if you try to index the array with values that the compiler can't optimize away.
If you do care about endianness then you need to mask and shift.
I think what you want can be solved using multi-dimensional space filling curves. The link gives a lot of references on this, which in turn give different methods and insights. Here's a specific example of an invertible mapping. It works for any dimension N.
As for useful properties, these mappings are related to Gray codes.
Hard to say whether this was what you were looking for, or whether the "pack 3 16-bit ints into a 48-bit int" does the trick for you.

How to define 2-bit numbers in C, if possible?

For my university project I'm simulating a process called random sequential adsorption.
One of the things I have to do involves randomly depositing squares (which cannot overlap) onto a lattice until there is no more room left, repeating the process several times in order to find the average 'jamming' coverage %.
Basically I'm performing operations on a large array of integers, of which 3 possible values exist: 0, 1 and 2. The sites marked with '0' are empty, the sites marked with '1' are full. Initially the array is defined like this:
int i, j;
int n = 1000000000;
int array[n][n];
for(j = 0; j < n; j++)
{
for(i = 0; i < n; i++)
{
array[i][j] = 0;
}
}
Say I want to deposit 5*5 squares randomly on the array (that cannot overlap), so that the squares are represented by '1's. This would be done by choosing the x and y coordinates randomly and then creating a 5*5 square of '1's with the topleft point of the square starting at that point. I would then mark sites near the square as '2's. These represent the sites that are unavailable since depositing a square at those sites would cause it to overlap an existing square. This process would continue until there is no more room left to deposit squares on the array (basically, no more '0's left on the array)
Anyway, to the point. I would like to make this process as efficient as possible, by using bitwise operations. This would be easy if I didn't have to mark sites near the squares. I was wondering whether creating a 2-bit number would be possible, so that I can account for the sites marked with '2'.
Sorry if this sounds really complicated, I just wanted to explain why I want to do this.
You can't create a datatype that is 2-bits in size since it wouldn't be addressable. What you can do is pack several 2-bit numbers into a larger cell:
struct Cell {
unsigned a : 2;
unsigned b : 2;
unsigned c : 2;
unsigned d : 2;
};
This specifies that each of the members a, b, c and d should occupy two bits in memory.
EDIT: This is just an example of how to create 2-bit variables, for the actual problem in question the most efficient implementation would probably be to create an array of int and wrap up the bit fiddling in a couple of set/get methods.
Instead of a two-bit array you could use two separate 1-bit arrays. One holds filled squares and one holds adjacent squares (or available squares if this is more efficient).
I'm not really sure that this has any benefit though over packing 2-bit fields into words.
I'd go for byte arrays unless you are really short of memory.
The basic idea
Unfortunately, there is no way to do this directly in C. You can create arrays of 1 byte, 2 bytes, etc., but you can't create arrays of bits.
The best thing you can do, then, is to write a new library for yourself, which makes it look like you're dealing with arrays of 2 bits, but in reality does a lot of hard work. The same way that the string libraries give you functions that work on "strings" (which in C are just arrays), you'll be creating a new library which works on "bit arrays" (which in reality will be arrays of integers, with a few special functions to deal with them as-if they were arrays of bits).
NOTE: If you're new to C, and haven't learned the ideas of "creating a new library/module", or the concept of "abstraction", then I'd recommend learning about them before you continue with this project. Understanding them is IMO more important than optimizing your program to use a little less space.
How to implement this new "library" or module
For your needs, I'd create a new module called "2-bit array", which exports functions for dealing with the 2-bit arrays, as you need them.
It would have a few functions that deal with setting/reading bits, so that you can work with it as if you have an actual array of bits (you'll actually have an array of integers or something, but the module will make it seem like you have an array of bits).
Using this module would look something like this:
// This is just an example of how to use the functions in the twoBitArray library.
twoB my_array = Create2BitArray(size); // This will "create" a twoBitArray and return it.
SetBit(my_array, 5, 1); // Set bit 5 to 1
bit b = GetBit(my_array, 5); // Where bit is typedefed to an int by your module.
What the module will actually do is implement all these functions using regular-old arrays of integers.
For example, the function GetBit(), for GetBit(my_arr, 17), will calculate that it's the 1st bit in the 4th integer of your array (depending on sizeof(int), obviously), and you'd return it by using bitwise operations.
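A minimal sketch of such a module, storing 2-bit cells (values 0 to 3) four to a byte rather than single bits. All names here (TwoBitArray, two_bit_create and so on) are hypothetical and only illustrate the idea; they are not the functions named above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    size_t count;     /* number of 2-bit cells */
    uint8_t *bytes;   /* backing storage, four cells per byte */
} TwoBitArray;

/* Allocate a zeroed array of count cells; returns 1 on success, 0 on failure. */
int two_bit_create(TwoBitArray *a, size_t count)
{
    a->count = count;
    a->bytes = calloc(count / 4 + 1, 1);
    return a->bytes != NULL;
}

/* Read cell i (a value from 0 to 3). */
unsigned two_bit_get(const TwoBitArray *a, size_t i)
{
    assert(i < a->count);
    unsigned shift = (unsigned)(i % 4) * 2;      /* position inside the byte */
    return (a->bytes[i / 4] >> shift) & 0x3u;
}

/* Write value (0 to 3) into cell i, leaving its neighbours untouched. */
void two_bit_set(TwoBitArray *a, size_t i, unsigned value)
{
    assert(i < a->count && value < 4);
    unsigned shift = (unsigned)(i % 4) * 2;
    uint8_t cleared = a->bytes[i / 4] & (uint8_t)~(0x3u << shift);
    a->bytes[i / 4] = (uint8_t)(cleared | (value << shift));
}

/* Release the storage. */
void two_bit_destroy(TwoBitArray *a)
{
    free(a->bytes);
    a->bytes = NULL;
    a->count = 0;
}
```

For the lattice in the question, a site (i, j) on an n by n grid could be mapped to cell index i * n + j, with 0, 1 and 2 keeping their meanings of empty, full and blocked.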
You can compact one dimension of the array into sub-integer cells. To convert a coordinate (let's say x, for example) to a position inside a byte holding four 2-bit values:
unsigned char cell = array[i][ x / 4 ];
unsigned shift = 2 * (x % 4); // each value is 2 bits wide
unsigned char data = (cell >> shift) & 0x03;
To write data, do the reverse.

Resources