How to put a datetime value into a numpy array?

I am learning Python, so please bear with me. I have been trying to get a datetime variable into a numpy array, but have not been able to figure out how. I need to calculate differences between times for each index later on, so I didn't know if I should put the datetime variable into the array, or convert it to another data type. I get the error:
'NoneType' object does not support item assignment
Is my dtype variable constructed correctly? This says nothing about datetime type.
import numpy as np
from liblas import file
f = file.File(project_file, mode = 'r')
num_points = int(f.__len__())
# dtype should be [float, float, float, int, int, datetime]
dt = [('x', 'f4'), ('y', 'f4'), ('z', 'f4'), ('i', 'u2'), ('c', 'u1'), ('time', 'datetime64')]
xyzict = np.empty(shape=(num_points, 6), dtype = dt)
# Load all points into numpy array
counter = 0
for p in f:
    newrow = [p.x, p.y, p.z, p.i, p.c, p.time]
    xyzict[counter] = newrow
    counter += 1
Thanks in advance
EDIT: I should note that I plan on sorting the array by date before proceeding.
p.time is in the following format:
>>>p.time
datetime.datetime(1971, 6, 26, 19, 37, 12, 713269)
>>>str(p.time)
'1971-06-26 19:37:12.713275'

I don't really understand how you are getting a datetime object out of your file, or what p is for that matter, but assuming you have a list of tuples (not lists, see my comment above), you can do the setting all in one step:
import datetime
import numpy as np
dat = [(.5, .5, .5, 0, 34, datetime.datetime(1971, 6, 26, 19, 37, 12, 713269)),
       (.3, .3, .6, 1, 23, datetime.datetime(1971, 6, 26, 19, 34, 23, 345293))]
dt = [('x', 'f4'), ('y', 'f4'), ('z', 'f4'), ('i', 'u2'), ('c', 'u1'), ('time', 'datetime64[us]')]
datarr = np.array(dat, dt)
Then you can access the fields by name:
>>> datarr['time']
array(['1971-06-26T15:37:12.713269-0400', '1971-06-26T15:34:23.345293-0400'], dtype='datetime64[us]')
Or sort by field:
>>> np.sort(datarr, order='time')
array([ (0.3, 0.3, 0.6, 1, 23, datetime.datetime(1971, 6, 26, 19, 34, 23, 345293)),
(0.5, 0.5, 0.5, 0, 34, datetime.datetime(1971, 6, 26, 19, 37, 12, 713269))],
dtype=[('x', '<f4'), ('y', '<f4'), ('z', '<f4'), ('i', '<u2'), ('c', 'u1'), ('time', '<M8[us]')])
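Since the question preallocates the array and also needs time differences later, here is a minimal sketch of that route. It assumes the points arrive as tuples (the `points` list is a hypothetical stand-in for the rows read from the file) and that a 1-D array of records is wanted, so the shape is `len(points)` rather than `(num_points, 6)`.

```python
import datetime
import numpy as np

dt = np.dtype([('x', 'f4'), ('y', 'f4'), ('z', 'f4'),
               ('i', 'u2'), ('c', 'u1'), ('time', 'datetime64[us]')])

# Hypothetical stand-in for the points read from the file.
points = [
    (0.5, 0.5, 0.5, 0, 34, datetime.datetime(1971, 6, 26, 19, 37, 12, 713269)),
    (0.3, 0.3, 0.6, 1, 23, datetime.datetime(1971, 6, 26, 19, 34, 23, 345293)),
]

# One record per point: the array is 1-D, the six fields live in the dtype.
arr = np.empty(len(points), dtype=dt)
for n, row in enumerate(points):
    arr[n] = row  # each row is a tuple, not a list

# Sort by time, then take differences between consecutive timestamps.
arr_sorted = np.sort(arr, order='time')
deltas = np.diff(arr_sorted['time'])        # dtype timedelta64[us]
seconds = deltas / np.timedelta64(1, 's')   # plain floats, in seconds
```

Keeping the field as `datetime64[us]` means the subtraction stays inside numpy, and dividing by `np.timedelta64(1, 's')` converts the result to seconds when plain numbers are easier to work with.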

Related

How to make a query with a custom order by parameter using array?

I have an algorithm that outputs an array in a particular order. Example:
arr = [0, 1, 21, 2, 22, 23, 24, 25, 3, 27, 35, 36, 28, 37, 38, 4, 29, 5, 34, 6, 7, 8, 9, 10, 11, 12]
The array will differ depending on the user's input, so the example above is only one of many possibilities: it may be longer, shorter, or hold different values (all values will be integers). So I won't be able to use CASE in my query.
I want to produce an SQL-Server query in my views.py to display all objects in my model in that exact order.
Here is my "query" at the moment but obviously it doesn't work.
test = QuoteAssemblies.objects.raw("""SELECT qmaQuoteAssemblyID,
qmaPartID,
qmaLevel,
qmaPartShortDescription,
qmaQuantityPerParent
FROM QuoteAssemblies
WHERE qmaQuoteAssemblyID IN arr
ORDER BY qmaQuoteAssemblyID = arr""")
In essence I want the query to be ordered by qmaQuoteAssemblyID as long as it is in the same order of the array (not ASC, DESC etc).
qmaQuoteAssemblyID = 0
qmaQuoteAssemblyID = 1
qmaQuoteAssemblyID = 21
qmaQuoteAssemblyID = 2
etc...
There is a similar example for MySQL here. I just need something like that but for MSSQL. Cheers.
If your version of SQL Server supports JSON querying (i.e. 2016+), you can use openjson() function to number the elements of your array, and then use that number for sorting:
declare @Arr nvarchar(max) = '[0, 1, 21, 2, 22, 23, 24, 25, 3, 27, 35, 36, 28, 37, 38, 4, 29, 5, 34, 6, 7, 8, 9, 10, 11, 12]';
SELECT q.qmaQuoteAssemblyID,
q.qmaPartID,
q.qmaLevel,
q.qmaPartShortDescription,
q.qmaQuantityPerParent
FROM dbo.QuoteAssemblies q
inner join openjson(@Arr) ar on ar.[value] = q.qmaQuoteAssemblyID
-- openjson returns [key] as nvarchar, so cast it to sort numerically
ORDER BY cast(ar.[key] as int);
If you can't utilise JSON for this task, you will need to somehow produce a rowset with your array elements being correctly numbered, and use it in a similar fashion. There are lots of ways to achieve this, and it doesn't necessarily have to be done on server side. For example, you can create a 2 column key-value user-defined table type in your database, and provide the data as a parameter for your query.
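On the application side, both ideas can be sketched in a few lines of Python. This assumes the rows come back as dictionaries keyed by column name; the `rows` list below is made-up illustration data, not output from the real table.

```python
import json

arr = [0, 1, 21, 2, 22, 23]

# JSON text to bind to the query parameter; openjson numbers the
# elements 0, 1, 2, ... in the order they appear here.
arr_json = json.dumps(arr)

# Client-side alternative: fetch the rows in any order, then sort them
# by the position of their ID in arr.
position = {value: index for index, value in enumerate(arr)}
rows = [  # hypothetical result rows
    {"qmaQuoteAssemblyID": 2},
    {"qmaQuoteAssemblyID": 21},
    {"qmaQuoteAssemblyID": 0},
]
ordered = sorted(rows, key=lambda r: position[r["qmaQuoteAssemblyID"]])
```

Sorting in the application avoids any dependency on the SQL Server version, at the cost of pulling the unsorted rows first.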
Another approach is to supply the data in the form of XML, something like this:
declare @Ax xml = N'<r>
<i n="0" v="0" />
<i n="1" v="1" />
<i n="2" v="21" />
...
</r>';
SELECT q.qmaQuoteAssemblyID,
q.qmaPartID,
q.qmaLevel,
q.qmaPartShortDescription,
q.qmaQuantityPerParent
FROM dbo.QuoteAssemblies q
inner join @Ax.nodes('/r/i') ar(c) on ar.c.value('./@v', 'int') = q.qmaQuoteAssemblyID
ORDER BY ar.c.value('./@n', 'int');
Still, it is better to number the XML nodes in the application, as there is no efficient way to do this on the database side. Performance may also be noticeably worse than with the JSON approach.
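Numbering the nodes in the application could look roughly like this in Python. It is only a sketch using the standard library; the element and attribute names (`r`, `i`, `n`, `v`) follow the XML shown above.

```python
import xml.etree.ElementTree as ET

arr = [0, 1, 21, 2]

# Build <r><i n="position" v="value" />...</r>, numbering with enumerate.
root = ET.Element("r")
for n, v in enumerate(arr):
    ET.SubElement(root, "i", n=str(n), v=str(v))

# Text to pass as the xml parameter of the query.
xml_text = ET.tostring(root, encoding="unicode")
```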

combine multiple numpy ndarrays as list

I have three equally dimensioned numpy arrays.
I would like to store the data from all three in an array of the same dimensions and size.
To do this, I would like to store three bytes of information per item in the array. I assume this would be a list.
e.g.
>>>red = np.array([[150,25],[37,214]])
>>>green = np.array([[190,27],[123,231]])
>>>blue = np.array([[10,112],[123,119]])
insert combination magic to make a combined array called RGB
>>>RGB
array([(150,190,10),(25,27,112)],[(37,123,123),(214,231,119)])
For a start, each is 2x2. Combining them in a list and passing it to np.array, the same construction used to make red, produces a 3x2x2 array.
In [344]: red = np.array([[150,25],[37,214]])
In [345]: green = np.array([[190,27],[123,231]])
In [346]: blue = np.array([[10,112],[123,119]])
In [347]: np.array([red,green,blue])
Out[347]:
array([[[150, 25],
[ 37, 214]],
[[190, 27],
[123, 231]],
[[ 10, 112],
[123, 119]]])
In [348]: _.shape
Out[348]: (3, 2, 2)
That's not the order you want, but we can easily reshape, and if needed transpose.
The target, with an added set of []
In [350]: np.array([[(150,190,10),(25,27,112)],[(37,123,123),(214,231,119)]])
Out[350]:
array([[[150, 190, 10],
[ 25, 27, 112]],
[[ 37, 123, 123],
[214, 231, 119]]])
In [351]: _.shape
Out[351]: (2, 2, 3)
so try moving the 3 shape to the end with transpose:
In [352]: np.array([red,green,blue]).transpose(1,2,0)
Out[352]:
array([[[150, 190, 10],
[ 25, 27, 112]],
[[ 37, 123, 123],
[214, 231, 119]]])
===========================
I should have suggested stack. This is a newish version of concatenate that lets us join arrays on a new dimension. With axis=0 it behaves like np.array. But to join on the last axis, putting the rgb dimension last, use:
In [467]: np.stack((red,green,blue),axis=-1)
Out[467]:
array([[[150, 190, 10],
[ 25, 27, 112]],
[[ 37, 123, 123],
[214, 231, 119]]])
In [468]: _.shape
Out[468]: (2, 2, 3)
Note that this expression does not assume anything about the shape of red, etc, except that they are equal. So it will work with 3d arrays as well.
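A short self-contained check of the points above: that stack with axis=-1 matches the transpose version, and that it carries over to 3d inputs. The conversion to uint8 is an added assumption, based on the question asking for three bytes per item; `vol` is a made-up 3d array.

```python
import numpy as np

red = np.array([[150, 25], [37, 214]])
green = np.array([[190, 27], [123, 231]])
blue = np.array([[10, 112], [123, 119]])

# Join on a new last axis, so each [row, col] holds an (r, g, b) triple.
rgb = np.stack((red, green, blue), axis=-1)

# Same result as stacking on axis 0 and moving that axis to the end.
same = np.array_equal(rgb, np.array([red, green, blue]).transpose(1, 2, 0))

# All values fit in 0..255, so one byte per channel is enough.
rgb8 = rgb.astype(np.uint8)

# The same call works unchanged for 3d inputs.
vol = np.zeros((2, 2, 2))
cube = np.stack((vol, vol, vol), axis=-1)
```

Indexing `rgb[0, 0]` then returns the three channel values for the first pixel.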

Methods of creating a structured array

I have the following information and I can produce a numpy array of the desired structure. Note that the values x and y have to be determined separately since their ranges may differ so I cannot use:
xy = np.random.random_integers(0,10,size=(N,2))
The extra list(...) conversion is needed for this to work in Python 3.4; it is unnecessary but harmless in Python 2.7.
The following works:
>>> # attempts to formulate [id,(x,y)] with specified dtype
>>> N = 10
>>> x = np.random.random_integers(0,10,size=N)
>>> y = np.random.random_integers(0,10,size=N)
>>> id = np.arange(N)
>>> dt = np.dtype([('ID','<i4'),('Shape',('<f8',(2,)))])
>>> arr = np.array(list(zip(id,np.hstack((x,y)))),dt)
>>> arr
array([(0, [7.0, 7.0]), (1, [7.0, 7.0]), (2, [5.0, 5.0]), (3, [0.0, 0.0]),
(4, [6.0, 6.0]), (5, [6.0, 6.0]), (6, [7.0, 7.0]),
(7, [10.0, 10.0]), (8, [3.0, 3.0]), (9, [7.0, 7.0])],
dtype=[('ID', '<i4'), ('Shape', '<f8', (2,))])
I cleverly thought I could circumvent the above nasty bits by simply creating the array in the desired vertical structure and applying my dtype to it, hoping that it would work. The stacked array is correct in the vertical form
>>> a = np.vstack((id,x,y)).T
>>> a
array([[ 0, 7, 6],
[ 1, 7, 7],
[ 2, 5, 9],
[ 3, 0, 1],
[ 4, 6, 1],
[ 5, 6, 6],
[ 6, 7, 6],
[ 7, 10, 9],
[ 8, 3, 2],
[ 9, 7, 8]])
I tried several ways of reformulating the above array so that my dtype would work and I just can't figure it out (this included vstacking a vstack etc). So my question is: how can I use the vstack version and get it into a format that meets my dtype requirements without having to go through the procedure that I did? I am hoping it is obvious, but I have sliced, stacked and ellipsed myself into an endless loop.
SUMMARY
Many thanks to hpaulj. I have included two incarnations based upon his suggestions for others to consider. The pure numpy solution is substantially faster and a lot cleaner.
"""
Script: pnts_StackExch
Author: Dan.Patterson@carleton.ca
Modified: 2015-08-24
Purpose:
To provide some timing options on point creation in preparation for
point-to-point distance calculations using einsum.
Reference:
http://stackoverflow.com/questions/32224220/
methods-of-creating-a-structured-array
Functions:
decorators: profile_func, timing, arg_deco
main: make_pnts, einsum_0
"""
import numpy as np
import random
import time
from functools import wraps
np.set_printoptions(edgeitems=5,linewidth=75,precision=2,suppress=True,threshold=5)
# .... wrapper funcs .............
def delta_time(func):
    """timing decorator function"""
    import time
    @wraps(func)
    def wrapper(*args, **kwargs):
        print("\nTiming function for... {}".format(func.__name__))
        t0 = time.time() # start time
        result = func(*args, **kwargs) # ... run the function ...
        t1 = time.time() # end time
        print("Results for... {}".format(func.__name__))
        print(" time taken ...{:12.9f} sec.".format(t1-t0))
        #print("\n print results inside wrapper or use <return> ... ")
        return result # return the result of the function
    return wrapper
def arg_deco(func):
    """This wrapper just prints some basic function information."""
    @wraps(func)
    def wrapper(*args,**kwargs):
        print("Function... {}".format(func.__name__))
        #print("File....... {}".format(func.__code__.co_filename))
        print(" args.... {}\n kwargs. {}".format(args,kwargs))
        #print(" docs.... {}\n".format(func.__doc__))
        return func(*args, **kwargs)
    return wrapper
# .... main funcs ................
@delta_time
@arg_deco
def pnts_IdShape(N=1000000,x_min=0,x_max=10,y_min=0,y_max=10):
    """Make N points with uniformly distributed random integer coordinates,
    with optional min/max values for Xs and Ys
    """
    dt = np.dtype([('ID','<i4'),('Shape',('<f8',(2,)))])
    IDs = np.arange(0,N)
    Xs = np.random.random_integers(x_min,x_max,size=N) # note below
    Ys = np.random.random_integers(y_min,y_max,size=N)
    a = np.array([(i,j) for i,j in zip(IDs,np.column_stack((Xs,Ys)))],dt)
    return IDs,Xs,Ys,a
@delta_time
@arg_deco
def alternate(N=1000000,x_min=0,x_max=10,y_min=0,y_max=10):
    """ after hpaulj and his mods to the above and this. See docs
    """
    dt = np.dtype([('ID','<i4'),('Shape',('<f8',(2,)))])
    IDs = np.arange(0,N)
    Xs = np.random.random_integers(0,10,size=N)
    Ys = np.random.random_integers(0,10,size=N)
    c_stack = np.column_stack((IDs,Xs,Ys))
    a = np.ones(N, dtype=dt)
    a['ID'] = c_stack[:,0]
    a['Shape'] = c_stack[:,1:]
    return IDs,Xs,Ys,a
if __name__=="__main__":
    """time testing for various methods
    """
    id_1,xs_1,ys_1,a_1 = pnts_IdShape(N=1000000,x_min=0, x_max=10, y_min=0, y_max=10)
    id_2,xs_2,ys_2,a_2 = alternate(N=1000000,x_min=0, x_max=10, y_min=0, y_max=10)
Timing results for 1,000,000 points are as follows
Timing function for... pnts_IdShape
Function... **pnts_IdShape**
args.... ()
kwargs. {'N': 1000000, 'y_max': 10, 'x_min': 0, 'x_max': 10, 'y_min': 0}
Results for... pnts_IdShape
time taken ... **0.680652857 sec**.
Timing function for... **alternate**
Function... alternate
args.... ()
kwargs. {'N': 1000000, 'y_max': 10, 'x_min': 0, 'x_max': 10, 'y_min': 0}
Results for... alternate
time taken ... **0.060056925 sec**.
There are 2 ways of filling a structured array (http://docs.scipy.org/doc/numpy/user/basics.rec.html#filling-structured-arrays) - by row (or rows with list of tuples), and by field.
To do this by field, create the empty structured array, and assign values by field name
In [19]: a=np.column_stack((id,x,y))
# same as your vstack().T
In [20]: Y=np.zeros(a.shape[0], dtype=dt)
# empty, ones, etc
In [21]: Y['ID'] = a[:,0]
In [22]: Y['Shape'] = a[:,1:]
# (2,) field takes a 2 column array
In [23]: Y
Out[23]:
array([(0, [8.0, 8.0]), (1, [8.0, 0.0]), (2, [6.0, 2.0]), (3, [8.0, 8.0]),
(4, [3.0, 2.0]), (5, [6.0, 1.0]), (6, [5.0, 6.0]), (7, [7.0, 7.0]),
(8, [6.0, 1.0]), (9, [6.0, 6.0])],
dtype=[('ID', '<i4'), ('Shape', '<f8', (2,))])
On the surface
arr = np.array(list(zip(id,np.hstack((x,y)))),dt)
looks like an OK way of constructing the list of tuples needed to fill the array. But the result duplicates the values of x instead of using y. I'll have to look at what is wrong.
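One likely explanation, sketched with small made-up arrays: hstack joins x and y end to end into a single 1-D array of length 2N, and zip stops after the N ids, so each ID is paired with one x value and y is never reached. That single value then ends up in both slots of 'Shape', which matches the output in the question.

```python
import numpy as np

ids = np.arange(3)
x = np.array([7, 5, 0])
y = np.array([6, 9, 1])
dt = np.dtype([('ID', '<i4'), ('Shape', ('<f8', (2,)))])

# hstack concatenates: [7, 5, 0, 6, 9, 1], not pairs of (x, y).
flat = np.hstack((x, y))

# zip stops at the shorter input, so only the x values are ever paired.
pairs = list(zip(ids, flat))

# Building each tuple from one x and one y gives the intended records.
good = np.array([(i, [j, k]) for i, j, k in zip(ids, x, y)], dt)
```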
You can take a view of an array like a if the dtype is compatible - the data buffer for 3 int columns is laid out the same way as one with 3 int fields.
a.view('i4,i4,i4')
But your dtype wants 'i4,f8,f8', a mix of 4 and 8 byte fields, and a mix of int and float. The a buffer will have to be transformed to achieve that. view can't do it. (don't even ask about .astype.)
corrected list of tuples method:
In [35]: np.array([(i,j) for i,j in zip(id,np.column_stack((x,y)))],dt)
Out[35]:
array([(0, [8.0, 8.0]), (1, [8.0, 0.0]), (2, [6.0, 2.0]), (3, [8.0, 8.0]),
(4, [3.0, 2.0]), (5, [6.0, 1.0]), (6, [5.0, 6.0]), (7, [7.0, 7.0]),
(8, [6.0, 1.0]), (9, [6.0, 6.0])],
dtype=[('ID', '<i4'), ('Shape', '<f8', (2,))])
The list comprehension produces a list like:
[(0, array([8, 8])),
(1, array([8, 0])),
(2, array([6, 2])),
....]
For each tuple in the list, the [0] goes in the first field of the dtype, and [1] (a small array), goes in the 2nd.
The tuples could also be constructed with
[(i,[j,k]) for i,j,k in zip(id,x,y)]
dt1 = np.dtype([('ID','<i4'),('Shape',('<i4',(2,)))])
is a view compatible dtype (still 3 integers)
In [42]: a.view(dtype=dt1)
Out[42]:
array([[(0, [8, 8])],
[(1, [8, 0])],
[(2, [6, 2])],
[(3, [8, 8])],
[(4, [3, 2])],
[(5, [6, 1])],
[(6, [5, 6])],
[(7, [7, 7])],
[(8, [6, 1])],
[(9, [6, 6])]],
dtype=[('ID', '<i4'), ('Shape', '<i4', (2,))])
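The two routes can be put side by side in a small sketch. It assumes the stacked array holds 4-byte integers, which is why it is cast explicitly; on platforms where the default integer is 8 bytes the view would not line up otherwise. The x and y values are made up.

```python
import numpy as np

ids = np.arange(5)
x = np.array([7, 7, 5, 0, 6])
y = np.array([6, 7, 9, 1, 1])

# Force 4-byte ints so each row is exactly three '<i4' values.
a = np.column_stack((ids, x, y)).astype('<i4')

# View route: no copy, but the fields must stay integers.
dt1 = np.dtype([('ID', '<i4'), ('Shape', ('<i4', (2,)))])
v = a.view(dt1)       # shape (5, 1), shares memory with a
flat_view = v[:, 0]   # drop the trailing length-1 axis

# Field-assignment route: copies, and converts Shape to float.
dt = np.dtype([('ID', '<i4'), ('Shape', ('<f8', (2,)))])
out = np.zeros(a.shape[0], dtype=dt)
out['ID'] = a[:, 0]
out['Shape'] = a[:, 1:]
```

The view is free but keeps the integer layout; assigning by field is what converts the Shape values to floats.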

Type-safe rectangular multidimensional array type

How do you represent a rectangular 2-dimensional (or multidimensional) array data structure in Scala?
That is, each row has the same length, verified at compile time, but the dimensions are determined at runtime?
Seq[Seq[A]] has the desired interface, but it permits the user to provide a "ragged" array, which can result in a run-time failure.
Seq[(A, A, A, A, A, A)] (and similar) does verify that the lengths are the same, but it also forces this length to be specified at compile time.
Example interface
Here's an example interface of what I mean (of course, the inner dimension doesn't have to be tuples; it could be specified as lists or some other type):
// Function that takes a rectangular array
def processArray(arr : RectArray2D[Int]) = {
// do something that assumes all rows of RectArray are the same length
}
// Calling the function (OK)
println(processArray(RectArray2D(
( 0, 1, 2, 3),
(10, 11, 12, 13),
(20, 21, 22, 23)
)))
// Compile-time error
println(processArray(RectArray2D(
( 0, 1, 2, 3),
(10, 11, 12),
(20, 21, 22, 23, 24)
)))
This is possible using the Shapeless library's sized types:
import shapeless._
def foo[A, N <: Nat](rect: Seq[Sized[Seq[A], N]]) = rect
val a = Seq(Sized(1, 2, 3), Sized(4, 5, 6))
val b = Seq(Sized(1, 2, 3), Sized(4, 5))
Now foo(a) compiles, but foo(b) doesn't.
This allows us to write something very close to your desired interface:
case class RectArray2D[A, N <: Nat](rows: Sized[Seq[A], N]*)
def processArray(arr: RectArray2D[Int, _]) = {
// Run-time confirmation of what we've verified at compile-time.
require(arr.rows.map(_.size).distinct.size == 1)
// Do something.
}
// Compiles and runs.
processArray(RectArray2D(
Sized( 0, 1, 2, 3),
Sized(10, 11, 12, 13),
Sized(20, 21, 22, 23)
))
// Doesn't compile.
processArray(RectArray2D(
Sized( 0, 1, 2, 3),
Sized(10, 11, 12),
Sized(20, 21, 22, 23)
))
Using encapsulation to ensure proper size.
import scala.reflect.ClassTag
// A ClassTag is needed so that Array.ofDim can create an Array[T]
final class Matrix[T: ClassTag]( cols: Int, rows: Int ) {
  private val container: Array[Array[T]] = Array.ofDim[T]( cols, rows )
  def get( col: Int, row: Int ): T = container(col)(row)
  def set( col: Int, row: Int )( value: T ): Unit = { container(col)(row) = value }
}
Note: I misread the question, mistaking a rectangle for a square. Oh, well, if you're looking for squares, this would fit. Otherwise, you should go with @Travis Brown's answer.
This solution may not be the most generic one, but it coincides with the way Tuple classes are defined in Scala.
class Rect[T] private (val data: Seq[T])
object Rect {
def apply[T](a1: (T, T), a2: (T, T)) = new Rect(Seq(a1, a2))
def apply[T](a1: (T, T, T), a2: (T, T, T), a3: (T, T, T)) = new Rect(Seq(a1, a2, a3))
// Continued...
}
Rect(
(1, 2, 3),
(3, 4, 5),
(5, 6, 7))
This is the interface you were looking for and the compiler will stop you if you have invalid-sized rows, columns or type of element.

How can I force the type of an array when initialized in Scala?

Basically, I have an array like this:
val base_length = Array(
0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 12, 14, 16, 20, 24, 28, 32, 40, 48, 56,
64, 80, 96, 112, 128, 160, 192, 224, 0
);
And when scala sees it, it wants to do this:
base_length: Array[Int] = Array(...)
But I would prefer for it to do this:
base_length: Array[Byte] = Array(...)
I tried:
val base_length = Array[Byte](...)
But scala says:
<console>:4: error: type arguments [Byte] do not conform to method apply's type
parameter bounds [A <: AnyRef]
val base_length = Array[Byte](1,2,3,4,5)
This seems to me to basically be telling me that the Array constructor wants to figure out what the type of the array is from the arguments. Normally that's awesome, but in this instance I have good reasons for wanting the array elements to be Bytes.
I have looked around for guidance on this, but I don't seem to be able to find anything. Any help would be great!
It should be:
C:\prog\>scala
Welcome to Scala version 2.7.5.final (Java HotSpot(TM) Client VM, Java 1.6.0_16).
Type in expressions to have them evaluated.
Type :help for more information.
scala> val gu: Array[Byte] = Array(18, 19, 20)
gu: Array[Byte] = Array(18, 19, 20)
This is not immutable. A Seq would be a step in that direction even if it is only a trait (as Christopher mentions in the comments) adding finite sequences of elements. A Scala List would be immutable.
Works in Scala 2.8.0:
Welcome to Scala version 2.8.0.r18502-b20090818020152 (Java HotSpot(TM) 64-Bit Server VM, Java 1.6.0_15).
Type in expressions to have them evaluated.
Type :help for more information.
scala> Array[Byte](0, 1, 2)
res0: Array[Byte] = Array(0, 1, 2)
