I have some functions like the sample below. The objective is to pass the first function as an argument to the second function and use Numba to speed it up. But if I turn on cache=True for the second function, I get this error message:
TypeError: can't pickle weakref objects
My numba version is 0.49.0
Is there any solution/alternative solution to this problem so that I can cache the compilation?
from numba import njit
import numpy as np

@njit(cache=True)
def func_test(t1, t2):
    return np.corrcoef(np.argsort(np.argsort(t1)), np.argsort(np.argsort(t2)))

@njit(cache=True)
def test_func_input2(func_1):
    t = list(range(500))
    t2 = list(range(500))
    for i in range(1000):
        t.pop(0)
        t2.pop(0)
        t.append(i)
        t2.append(i)
    a = np.array(t)
    b = np.array(t2)
    x = func_1(a, b)
    return x

if __name__ == '__main__':
    test_func_input2(func_test)
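For reference (this is not from the original post, just a commonly used pattern): caching does work when the jitted helper is referred to by its global name instead of being passed in as an argument, at the cost of fixing the callee at compile time. A minimal sketch with a toy helper of our own:

from numba import njit
import numpy as np

@njit(cache=True)
def helper(a, b):  # hypothetical stand-in for func_test
    return (a * b).sum()

@njit(cache=True)
def driver(a, b):
    # Refer to the helper by its global name rather than receiving it as an
    # argument; both functions can then be compiled with cache=True.
    return helper(a, b)

if __name__ == '__main__':
    print(driver(np.arange(5.0), np.arange(5.0)))  # 30.0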
I run into a problem when I give cuda.jit multiple choices of type signatures. In my function, I reset data back to ref when the done flag is set. The data type could be an int32 or a float32 array. It is just an experiment, so do not worry about the dummy function itself. The function runs fine if I just use cuda.jit with no signature, so I am fairly sure there is an issue with my type declaration, but it looks like the correct way according to the documentation. By the way, I am using Numba 0.54.
from numba import cuda
from numba import float32, int32

@cuda.jit([(int32[:], int32[:], int32[:], int32),
           (float32[:], float32[:], int32[:], int32)])
def reset_when_done(data, ref, done, force_reset):
    env_id = cuda.blockIdx.x
    tid = cuda.threadIdx.x
    if tid == 0:
        if force_reset > 0.5 or done[env_id] > 0.5:
            data[env_id] = ref[env_id]
And I got this error:
TypeError: [(array(int32, 1d, A), array(int32, 1d, A), array(int32, 1d, A), int32), (array(float32, 1d, A), array(float32, 1d, A), array(int32, 1d, A), int32)] is not a callable object
According to the documentation, the first parameter of cuda.jit needs to be either a signature or a function. The same is true for the CPU JIT. The catch is that the signature can be a list of basic signatures with the CPU JIT, but not, so far, with the CUDA JIT. This is a bug. It has been reported, is still open, and is planned to be fixed in Numba 0.57.
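Since the kernel reportedly works with a plain, signature-less cuda.jit, lazy dispatch is the simplest workaround. If explicit typing is wanted anyway, one possible sketch (the specialized names below are ours, not from the question) is to compile one specialization per signature from the same Python function:

from numba import cuda, float32, int32

def _reset_when_done(data, ref, done, force_reset):
    env_id = cuda.blockIdx.x
    tid = cuda.threadIdx.x
    if tid == 0:
        if force_reset > 0.5 or done[env_id] > 0.5:
            data[env_id] = ref[env_id]

# cuda.jit accepts a single signature, so build one specialization per type:
reset_when_done_i32 = cuda.jit((int32[:], int32[:], int32[:], int32))(_reset_when_done)
reset_when_done_f32 = cuda.jit((float32[:], float32[:], int32[:], int32))(_reset_when_done)

The caller then picks the specialization matching the dtype of the arrays it launches the kernel with.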
I'm reading a text file and have the syntax set up correctly to do so. What I want to do now is append all the integers to an array, but when I use a print statement to check what's going on, nothing shows up in the terminal.
package lecture

import scala.io.{BufferedSource, Source}

object LectureQuestion {
  def fileSum(fileName: String): Int = {
    var arrayOfnumbers = Array[String]()
    var fileOfnumbers: BufferedSource = Source.fromFile(fileName)
    for (line <- fileOfnumbers.getLines()) {
      val splits: Array[String] = line.split("#")
      for (number <- splits) {
        arrayOfnumbers :+ number
        println(arrayOfnumbers.mkString(""))
      }
      //println(splits.mkString(" "))
    }
    3
  }

  def main(args: Array[String]): Unit = {
    println(fileSum("data/fileOfnumbers.txt"))
  }
}
I set up a blank array to append the numbers to. I tried switching var to val, but that wouldn't make sense, as var is mutable, meaning it can change. I'm pretty sure the way to add things to an array in Scala is :+, so I'm not sure what's going on.
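For what it's worth, :+ does not modify the array in place; it builds and returns a new array, so the result has to be assigned back. A minimal illustration, separate from the file-reading code:

var numbers = Array[String]()
numbers :+ "12"                 // builds a new array and immediately discards it
numbers = numbers :+ "12"       // reassign to keep the appended element
println(numbers.mkString(","))  // prints: 12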
In Scala, all you need is to flatMap the list of lists and then sum the result.
Here is your example simplified, assuming the lines have already been extracted:
import scala.util.Try

def listSum(lines: List[String]): Int = {
  (for {
    line <- lines
    number <- line.split("#").map(n => Try(n.trim.toInt).getOrElse(0))
  } yield number).sum
}

listSum(List("12#43#134#bad", "13#54#47")) // -> 303
No vars and no mutability needed, just a nice for-comprehension ;).
And for comparison, the solution with flatMap:
def listSum(lines: List[String]): Int = {
  lines
    .flatMap(_.split("#").map(n => Try(n.trim.toInt).getOrElse(0)))
    .sum
}
I am trying to call a function from an external DLL in Python.
The function prototype is:
void Myfunction(int32_t *ArraySize, uint64_t XmemData[])
This function creates a table of uint64 with "ArraySize" elements. The DLL is generated by LabVIEW.
Here is the Python code to call this function:
import ctypes
# Load the library
dllhandle = ctypes.CDLL("SharedLib.dll")
#specify the parameter and return types
dllhandle.Myfunction.argtypes = [ctypes.c_int,ctypes.POINTER(ctypes.c_uint64)]
# Next, set the return types...
dllhandle.Myfunction.restype = None
#convert our Python data into C data
Array_Size = ctypes.c_int(10)
Array = (ctypes.c_uint64 * Array_Size.value)()
# Call function
dllhandle.Myfunction(Array_Size,Array)
for index, value in enumerate(Array):
    print Array[index]
When executing this, I get the error:
dllhandle.ReadXmemBlock(Array_Size,Array)
WindowsError: exception: access violation reading 0x0000000A
I guess I am not passing the parameters to the function correctly, but I can't figure it out.
Reading simple data from the LabVIEW DLL, like a single uint64, works fine; but as soon as I try to pass arrays of uint64, I'm stuck.
Any help will be appreciated.
It looks like the call is trying to access the memory address 0x0000000A (which is 10). This is because you're passing an int with value 10 where the function expects a pointer to an int, so the 10 gets interpreted as an address.
I'd start with:
import ctypes

# Load the library
dllhandle = ctypes.CDLL("SharedLib.dll")

# Specify the parameter and return types
dllhandle.Myfunction.argtypes = [ctypes.POINTER(ctypes.c_int),  # make this a pointer
                                 ctypes.c_uint64 * 10]
# Next, set the return type...
dllhandle.Myfunction.restype = None

# Convert our Python data into C data
Array_Size = ctypes.c_int(10)
Array = (ctypes.c_uint64 * Array_Size.value)()

# Call function
dllhandle.Myfunction(ctypes.byref(Array_Size), Array)  # pass a pointer using byref

for index, value in enumerate(Array):
    print Array[index]
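If the array length is not fixed at 10, an alternative (a sketch, assuming the exported function really takes a plain pointer as the prototype suggests) is to declare the second argument as a pointer type; ctypes accepts an instance of c_uint64 * n wherever POINTER(c_uint64) is declared:

import ctypes

dllhandle = ctypes.CDLL("SharedLib.dll")
dllhandle.Myfunction.argtypes = [ctypes.POINTER(ctypes.c_int),
                                 ctypes.POINTER(ctypes.c_uint64)]
dllhandle.Myfunction.restype = None

array_size = ctypes.c_int(10)
array = (ctypes.c_uint64 * array_size.value)()

dllhandle.Myfunction(ctypes.byref(array_size), array)  # the array decays to a pointer
print(list(array))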
I'm new to Scala and was playing around with the Array.tabulate method. I get a StackOverflowError when executing this simplified code snippet (originally a DP problem).
import Lazy._

class Lazy[A](x: => A) {
  lazy val value = x
}

object Lazy {
  def apply[A](x: => A) = new Lazy(x)
  implicit def fromLazy[A](z: Lazy[A]): A = z.value
  implicit def toLazy[A](x: => A): Lazy[A] = Lazy(x)
}

def tabulatePlay(): Int = {
  lazy val arr: Array[Array[Lazy[Int]]] = Array.tabulate(10, 10) { (i, j) =>
    if (i == 0 && j == 0)
      0 // some number
    else
      arr(0)(0)
  }
  arr(0)(0)
}
Debugging, I noticed that since arr is lazy, when evaluation reaches the arr(0)(0) expression it tries to evaluate arr by running the Array.tabulate initializer again, over and over, infinitely.
What am I doing wrong? (I updated the code snippet since I was basing it off the solution given in "Dynamic programming in the functional paradigm", in particular Antal S-Z's answer.)
You have effectively caused an infinite recursion. You simply can't reference a lazy val from within its own initialization code. You need to compute arr(0)(0) separately.
I'm not sure why you are trying to access arr before it's built; tabulate is meant to fill the array with a function, so calling arr from inside it will always result in infinite recursion.
See Rex's example in "In a multidimensional sequence created with tabulate, is the innermost seq the 1st dimension?" (and a vote for him); perhaps that will help.
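The effect is easy to reproduce without tabulate: any lazy val whose initializer reads the value itself keeps re-entering its own initializer. A minimal illustration (our own toy example, not from the question):

def selfRef(): Int = {
  // Forcing xs runs the initializer, which reads xs again before the
  // "initialized" flag is set, so the initializer re-enters itself.
  lazy val xs: List[Int] = 0 :: xs
  xs.head // throws StackOverflowError
}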
I was able to solve this by wrapping arr(0)(0) in Lazy so it is evaluated as a call-by-name parameter, thereby not evaluating arr inside the tabulate call. The code I referenced was converting it automatically using implicits (around the binary + operator), so this wasn't clear-cut.
def tabulatePlay(): Int = {
  lazy val arr: Array[Array[Lazy[Int]]] = Array.tabulate(10, 10) { (i, j) =>
    if (i == 0 && j == 0)
      1 // some number
    else
      new Lazy(arr(0)(0))
  }
  arr(0)(0)
}
Thanks all.
I have a C++ library and I'm trying to wrap it for Python using Cython. One function returns an array of 3D vectors (float (*x)[3]), and I want to access that data from Python. I was able to do so with something like
res = [
    (self.thisptr.x[j][0], self.thisptr.x[j][1], self.thisptr.x[j][2])
    for j in xrange(self.natoms)
]
but I would like to access this as a numpy array, so I tried numpy.array on that and it was much slower. I also tried
cdef np.ndarray res = np.zeros([self.thisptr.natoms, 3], dtype=np.float)
cdef int i
for i in range(self.natoms):
    res[i][0] = self.thisptr.x[i][0]
    res[i][1] = self.thisptr.x[i][1]
    res[i][2] = self.thisptr.x[i][2]
But this is about three times slower than the list version.
Any suggestions on how to convert these vectors to a NumPy array faster?
The complete code is:
cimport cython
import numpy as np
cimport numpy as np

ctypedef np.float_t ftype_t

cdef extern from "ccxtc.h" namespace "ccxtc":
    cdef cppclass xtc:
        xtc(char []) except +
        int next()
        int natoms
        float (*x)[3]
        float time

cdef class pyxtc:
    cdef xtc *thisptr

    def __cinit__(self, char fname[]):
        self.thisptr = new xtc(fname)

    def __dealloc__(self):
        del self.thisptr

    property natoms:
        def __get__(self):
            return self.thisptr.natoms

    property x:
        def __get__(self):
            cdef np.ndarray res = np.zeros([self.thisptr.natoms, 3], dtype=np.float)
            cdef int i
            for i in range(self.natoms):
                res[i][0] = self.thisptr.x[i][0]
                res[i][1] = self.thisptr.x[i][1]
                res[i][2] = self.thisptr.x[i][2]
            return res
            #return [ (self.thisptr.x[j][0],self.thisptr.x[j][1],self.thisptr.x[j][2]) for j in xrange(self.natoms)]

    @cython.boundscheck(False)
    def next(self):
        return self.thisptr.next()
Define the type of res:
cdef np.ndarray[np.float64_t, ndim=2] res = ...
Use a full index:
res[i, 0] = ...
Turn off boundscheck and wraparound:
@cython.boundscheck(False)
@cython.wraparound(False)
To summarize what HYRY said and to ensure Cython can generate fast indexing code, try something like the following:
cimport cython
import numpy as np
cimport numpy as np

ctypedef np.float_t ftype_t

cdef extern from "ccxtc.h" namespace "ccxtc":
    cdef cppclass xtc:
        xtc(char []) except +
        int next()
        int natoms
        float (*x)[3]
        float time

cdef class pyxtc:
    cdef xtc *thisptr

    def __cinit__(self, char fname[]):
        self.thisptr = new xtc(fname)

    def __dealloc__(self):
        del self.thisptr

    property natoms:
        def __get__(self):
            return self.thisptr.natoms

    @cython.boundscheck(False)
    @cython.wraparound(False)
    cdef _ndarray_from_x(self):
        cdef np.ndarray[np.float_t, ndim=2] res = np.zeros([self.thisptr.natoms, 3], dtype=np.float)
        cdef int i
        for i in range(self.thisptr.natoms):
            res[i, 0] = self.thisptr.x[i][0]
            res[i, 1] = self.thisptr.x[i][1]
            res[i, 2] = self.thisptr.x[i][2]
        return res

    property x:
        def __get__(self):
            return self._ndarray_from_x()

    @cython.boundscheck(False)
    def next(self):
        return self.thisptr.next()
All I did was put the fast loop inside a cdef method, apply the right optimizing decorators to it, and call that method inside the property's __get__(). You should also make sure to refer to self.thisptr.natoms inside the range() call rather than the natoms property, which has a lot of Python overhead associated with it.
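Hypothetical usage once the extension is built (the module name and trajectory file name here are assumptions, not from the question):

from pyxtc_ext import pyxtc    # assumed name of the compiled extension module

traj = pyxtc(b"traj.xtc")      # made-up file name
while traj.next():             # assuming next() returns nonzero while frames remain
    coords = traj.x            # (natoms, 3) ndarray built by _ndarray_from_x()
    print(coords.mean(axis=0))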