Try statement in Cython for cimport (for use with mpi4py)

Is there a way to have the equivalent of the Python try statement in Cython for the cimport?
Something like this:
try:
    cimport something
except ImportError:
    pass
I would need this to write a Cython extension that can be compiled with or without mpi4py. This is very standard in compiled languages, where the MPI calls can be put between #ifdef and #endif preprocessor directives. How can we obtain the same result in Cython?
I tried this but it does not work:
try:
    from mpi4py import MPI
    from mpi4py cimport MPI
    from mpi4py.mpi_c cimport *
except ImportError:
    rank = 0
    nb_proc = 1

# solve an incompatibility between openmpi and mpi4py versions
cdef extern from 'mpi-compat.h': pass

does_it_work = 'Not yet'
Actually it works well if mpi4py is correctly installed, but if import mpi4py raises an ImportError, the Cython file does not compile and I get the error:
Error compiling Cython file:
------------------------------------------------------------
...
try:
from mpi4py import MPI
from mpi4py cimport MPI
^
------------------------------------------------------------
mod.pyx:4:4: 'mpi4py.pxd' not found
The file setup.py:
from setuptools import setup, Extension
from Cython.Distutils import build_ext
import os
here = os.path.abspath(os.path.dirname(__file__))
include_dirs = [here]
try:
    import mpi4py
except ImportError:
    pass
else:
    INCLUDE_MPI = '/usr/lib/openmpi/include'
    include_dirs.extend([
        INCLUDE_MPI,
        mpi4py.get_include()])

name = 'mod'

ext = Extension(
    name,
    include_dirs=include_dirs,
    sources=['mod.pyx'])

setup(name=name,
      cmdclass={"build_ext": build_ext},
      ext_modules=[ext])

Using a try-catch block in this way is something you won't be able to do.
The extension module you are making must be statically compiled and linked against the things it uses cimport to load at the C-level. A try-catch block is something that will be executed when the module is imported, not when it is compiled.
On the other hand, in theory, you should be able to get the effect you're looking for using Cython's support for conditional compilation.
In your setup.py file you can check to see if the needed modules are defined and then define environment variables to be passed to the Cython compiler that, in turn, depend on whether or not the needed modules are present.
There's an example of how to do this in one of Cython's tests.
There they pass a dictionary containing the desired environment variables to the constructor of Cython's Extension class as the keyword argument pyrex_compile_time_env (which has since been renamed cython_compile_time_env; for Cython.Build.Dependencies.cythonize it is called compile_time_env).
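For reference, here is a minimal sketch of the cythonize variant mentioned above (my own illustration, not from the linked test; it assumes a single mod.pyx that checks an MPI4PY compile-time name with IF/ELSE):

from setuptools import setup, Extension
from Cython.Build import cythonize

# detect mpi4py at build time
try:
    import mpi4py
    HAVE_MPI = True
except ImportError:
    HAVE_MPI = False

ext = Extension('mod', sources=['mod.pyx'])

setup(
    name='mod',
    # compile_time_env makes the MPI4PY name visible to IF/ELSE blocks in mod.pyx
    ext_modules=cythonize([ext], compile_time_env={'MPI4PY': HAVE_MPI}),
)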

Thank you for your very useful answer, @IanH. I include an example to show what it gives.
The file setup.py:
from setuptools import setup
from Cython.Distutils.extension import Extension
from Cython.Distutils import build_ext
import os
here = os.path.abspath(os.path.dirname(__file__))
import numpy as np
include_dirs = [here, np.get_include()]
try:
    import mpi4py
except ImportError:
    MPI4PY = False
else:
    MPI4PY = True
    INCLUDE_MPI = '/usr/lib/openmpi/include'
    include_dirs.extend([
        INCLUDE_MPI,
        mpi4py.get_include()])

name = 'mod'

ext = Extension(
    name,
    include_dirs=include_dirs,
    cython_compile_time_env={'MPI4PY': MPI4PY},
    sources=['mod.pyx'])

setup(name=name,
      cmdclass={"build_ext": build_ext},
      ext_modules=[ext])

if not MPI4PY:
    print('Warning: since importing mpi4py raises an ImportError,\n'
          '         the extensions are compiled without mpi and\n'
          '         will work only sequentially.')
And the file mod.pyx, with a few real MPI calls:
import numpy as np
cimport numpy as np
try:
    from mpi4py import MPI
except ImportError:
    nb_proc = 1
    rank = 0
else:
    comm = MPI.COMM_WORLD
    nb_proc = comm.size
    rank = comm.Get_rank()

IF MPI4PY:
    from mpi4py cimport MPI
    from mpi4py.mpi_c cimport *

    # solve an incompatibility between openmpi and mpi4py versions
    cdef extern from 'mpi-compat.h': pass

    print('mpi4py ok')
ELSE:
    print('no mpi4py')

n = 8

if n % nb_proc != 0:
    raise ValueError('The number of processes is incorrect.')

if rank == 0:
    data_seq = np.ones([n], dtype=np.int32)
    s_seq = data_seq.sum()
else:
    data_seq = np.zeros([n], dtype=np.int32)

if nb_proc > 1:
    data_local = np.zeros([n // nb_proc], dtype=np.int32)
    comm.Scatter(data_seq, data_local, root=0)
else:
    data_local = data_seq

s = data_local.sum()

if nb_proc > 1:
    s = comm.allreduce(s, op=MPI.SUM)

if rank == 0:
    print('s: {}; s_seq: {}'.format(s, s_seq))
    assert s == s_seq
Build with python setup.py build_ext --inplace and test with python -c "import mod" and mpirun -np 4 python -c "import mod". If mpi4py is not installed, one can still build the module and use it sequentially.

Related

Using glob to import txt files to an array for interpolation

Currently I am using data (wavelength, flux) in txt format and have six txt files. The wavelengths are the same but the fluxes are different. I have imported the txt files using pd.read_csv (as can be seen in the code) and assigned each flux a different name. These differently named fluxes are placed in an array. Finally, I interpolate the fluxes over a temperature array. The code works, and because I currently have only six files, writing it this way is OK. The problem moving forward is that when I have hundreds of txt files I will need a better method.
How can I use glob to import the txt files, assign a different name to each flux (if that is necessary) and finally interpolate? Any help would be appreciated. Thank you.
import pandas as pd
import numpy as np
from scipy import interpolate
fcf = 0.0000001 # flux conversion factor
wcf = 10 #wave conversion factor
temperature = np.array([725,750,775,800,825,850])
# import files and assign column headers; blank to ignore spaces
c1p = pd.read_csv("../c/725.txt",sep=" ",header=None)
c1p.columns = ["blank","0","blank","blank","1"]
c2p = pd.read_csv("../c/750.txt",sep=" ",header=None)
c2p.columns = ["blank","0","blank","blank","1"]
c3p = pd.read_csv("../c/775.txt",sep=" ",header=None)
c3p.columns = ["blank","0","blank","blank","1"]
c4p = pd.read_csv("../c/800.txt",sep=" ",header=None)
c4p.columns = ["blank","0","blank","blank","1"]
c5p = pd.read_csv("../c/825.txt",sep=" ",header=None)
c5p.columns = ["blank","0","blank","blank","1"]
c6p = pd.read_csv("../c/850.txt",sep=" ",header=None)
c6p.columns = ["blank","0","blank","blank","1"]
wave = np.array(c1p['0']/wcf)
c1fp = np.array(c1p['1']*fcf)
c2fp = np.array(c2p['1']*fcf)
c3fp = np.array(c3p['1']*fcf)
c4fp = np.array(c4p['1']*fcf)
c5fp = np.array(c5p['1']*fcf)
c6fp = np.array(c6p['1']*fcf)
cfp = np.array([c1fp,c2fp,c3fp,c4fp,c5fp,c6fp])
flux_int = interpolate.interp1d(temperature,cfp,axis=0,kind='linear',bounds_error=False,fill_value='extrapolate')
My attempts so far... I think I have loaded the files into a list using glob:
import pandas as pd
import numpy as np
from scipy import interpolate
import glob
c_list=[]
path = "../c/*.*"
for file in glob.glob(path):
    print(file)
    c = pd.read_csv(file, sep=" ", header=None)
    c.columns = ["blank", "0", "blank", "blank", "1"]
    c_list.append(c)
I am still unsure how to extract just the fluxes into an array in order to interpolate. I will continue to post my attempts.
My updated code
fcf = 0.0000001
import pandas as pd
import numpy as np
from scipy import interpolate
import glob
c_list=[]
path = "../c/*.*"
for file in glob.glob(path):
    print(file)
    c = pd.read_csv(file, sep=" ", header=None)
    c.columns = ["blank", "0", "blank", "blank", "1"]
    c = c['1'] * fcf
    c_list.append(c)
fluxes = np.array(c_list)
temperature = np.array([7250,7500,7750,8000,8250,8500])
flux_int =interpolate.interp1d(temperature,fluxes,axis=0,kind='linear',bounds_error=False,fill_value='extrapolate')
When I run this code I get the following error
raise ValueError("x and y arrays must be equal in length along "
ValueError: x and y arrays must be equal in length along interpolation axis.
I think the part of the code that needs correcting is fluxes = np.array(c_list): this is one list of all the fluxes, but I need a list of fluxes from each file. How is this done?
Final attempt
import pandas as pd
import numpy as np
from scipy import interpolate
import glob
c_list=[]
path = "../c/*.*"
for file in glob.glob(path):
    print(file)
    c = pd.read_csv(file, sep=" ", header=None)
    c.columns = ["blank", "0", "blank", "blank", "1"]
    c = c['1'] * 0.0000001
    c_list.append(c)
c1=np.array(c_list[0])
c2=np.array(c_list[1])
c3=np.array(c_list[2])
c4=np.array(c_list[3])
c5=np.array(c_list[4])
c6=np.array(c_list[5])
fluxes = np.array([c1,c2,c3,c4,c5,c6])
temperature = np.array([7250,7500,7750,8000,8250,8500])
flux_int = interpolate.interp1d(temperature,fluxes,axis=0,kind='linear',bounds_error=False,fill_value='extrapolate')
This code works, but I am still not sure about
c1=np.array(c_list[0])
c2=np.array(c_list[1])
c3=np.array(c_list[2])
c4=np.array(c_list[3])
c5=np.array(c_list[4])
c6=np.array(c_list[5])
Is there a better way to write this?
Here are two things you can do:
Instead of
c = c['1'] * 0.0000001
try doing
c = c['1'].to_numpy() * 0.0000001
This builds a list of NumPy arrays rather than a list of pandas Series.
When constructing fluxes, you can then just do
fluxes = np.array(c_list)
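Putting both suggestions together, a minimal sketch of the loop (my own illustration, not from the answer; it assumes the files live under ../c/ and that sorting the glob result puts them in the same order as the temperature array):

import glob
import numpy as np
import pandas as pd
from scipy import interpolate

fcf = 0.0000001  # flux conversion factor
temperature = np.array([7250, 7500, 7750, 8000, 8250, 8500])

c_list = []
for file in sorted(glob.glob("../c/*.txt")):  # sorted so files match the temperature order
    c = pd.read_csv(file, sep=" ", header=None)
    c.columns = ["blank", "0", "blank", "blank", "1"]
    c_list.append(c["1"].to_numpy() * fcf)  # NumPy array, not a pandas Series

fluxes = np.array(c_list)  # shape (n_files, n_wavelengths)
flux_int = interpolate.interp1d(temperature, fluxes, axis=0, kind='linear',
                                bounds_error=False, fill_value='extrapolate')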

Error when Importing keras in embedded python in C

I'm trying to embed Python in my C application. I downloaded the package from the official Python website and managed to do a simple Hello World.
Now I want to go deeper and use some Python libraries like numpy, keras, tensorflow...
I'm working with Python 3.5.4, and I installed all the needed packages on my PC with pip3:
pip3 install keras
pip3 install tensorflow
...
Then I created my script and launched it in a Python environment; it works fine:
Python:
# Importing the libraries
#
import numpy as np
import pandas as pd
dataset2 = pd.read_csv('I:\RNA\dataset19.csv')
X_test = dataset2.iloc[:, 0:228].values
y_test = dataset2.iloc[:, 228].values
# 2.
import pickle
sc = pickle.load(open('I:\RNA\isVerb_sc', 'rb'))
X_test = sc.transform(X_test)
# 3.
import keras
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
classifier = Sequential()
classifier.add(Dense(units = 114, kernel_initializer = 'uniform', activation = 'relu', input_dim = 228))
classifier.add(Dropout(p = 0.3))
classifier.add(Dense(units = 114, kernel_initializer = 'uniform', activation = 'relu'))
classifier.add(Dropout(p = 0.3))
classifier.add(Dense(units = 1, kernel_initializer = 'uniform', activation = 'sigmoid'))
classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
classifier.load_weights('I:\RNA\isVerb_weights.h5')
y_pred = classifier.predict(X_test)
y_pred1 = (y_pred > 0.5)
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred1)
But when I execute the same script in a C environment with embedded Python, it doesn't work.
At first I executed my script directly with PyRun_SimpleFile with no luck, so I split it into multiple instructions with PyRun_SimpleString to locate the problem:
C:
result = PyRun_SimpleString("import numpy as np"); // result = 0 (ok)
result = PyRun_SimpleString("import pandas as pd"); // result = 0 (ok)
...
result = PyRun_SimpleString("import pickle"); // result = 0 (ok)
... (all instructions above work)
result = PyRun_SimpleString("import keras"); // result = -1 !!
... (everything below this fails)
But there is not a single stack trace for this error. I tried this, but I just got:
"Here's the output: (null)"
My initialization of Python in C seems correct, since other libraries import fine:
// Python
wchar_t *stdProgramName = L"I:\\LIBs\\cpython354";
Py_SetProgramName(stdProgramName);
wchar_t *stdPythonHome = L"I:\\LIBs\\cpython354";
Py_SetPythonHome(stdPythonHome);
wchar_t *stdlib = L"I:\\LIBs\\cpython354;I:\\LIBs\\cpython354\\Lib\\python35.zip;I:\\LIBs\\cpython354\\Lib;I:\\LIBs\\cpython354\\DLLs;I:\\LIBs\\cpython354\\Lib\\site-packages";
Py_SetPath(stdlib);
// Initialize Python
Py_Initialize();
From a Python command prompt, import keras takes some time (about 3 s) but works (there is a warning, but I found no harm in it):
>>> import keras
I:\LIBs\cpython354\lib\site-packages\h5py\__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
from ._conv import register_converters as _register_converters
Using TensorFlow backend.
>>>
I'm at a loss now; I don't know where to look since there is no stack trace.
It seems that when you import keras, it executes this line:
sys.stderr.write('Using TensorFlow backend.\n')
but sys.stderr is not defined in Python embedded on Windows.
A simple fix is to define sys.stderr, for example:
import sys

class CatchOutErr:
    def __init__(self):
        self.value = ''
    def write(self, txt):
        self.value += txt

catchOutErr = CatchOutErr()
sys.stderr = catchOutErr
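As a further debugging aid (my own sketch, not part of the original answer), you can redirect both streams before importing keras and read back whatever was written, which usually exposes the real traceback of a failing import:

import sys
import traceback

class CatchOutErr:
    def __init__(self):
        self.value = ''
    def write(self, txt):
        self.value += txt

catchOutErr = CatchOutErr()
sys.stderr = catchOutErr  # keras writes its backend message here
sys.stdout = catchOutErr

try:
    import keras
except Exception:
    # keep the traceback, which would otherwise be lost without a real stderr
    catchOutErr.value += traceback.format_exc()

# the C host can then retrieve catchOutErr.value to see what actually went wrong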

FileNotFoundError in Python when importing a dataset

I want to import the dataset, but it fails with an error whose path contains two \ characters; I do not know how they were generated.
import pickle
import numpy as np
import os

def load_cifar_batch(filename):
    with open(filename, 'rb') as f:
        datadict = pickle.load(f, encoding='bytes')
        x = datadict[b'data']
        y = datadict[b'labels']
        x = x.reshape(10000, 3, 32, 32).transpose(0, 2, 3, 1).astype('float')
        y = np.array(y)
        return x, y

def load_cifar10(root):
    xs = []
    ys = []
    for b in range(1, 6):
        f = os.path.join(root, 'data_batch_%d' % (b,))
        x, y = load_cifar_batch(f)
        xs.append(x)
        ys.append(y)
    Xtrain = np.concatenate(xs)  # 1
    Ytrain = np.concatenate(ys)
    del x, y
    Xtest, Ytest = load_cifar_batch(os.path.join(root, 'test_batch'))  # 2
    return Xtrain, Ytrain, Xtest, Ytest
and then I try to run
import numpy as np
from data_utils import load_cifar10
import matplotlib.pyplot as plt
datadir='E:\python\waa\cifar10\cifar-10-batches-bin'
x_train,y_train,x_test,y_test=load_cifar10('datadir')
the error is
FileNotFoundError: [Errno 2] No such file or directory:'datadir\\data_batch_1'
There are two \ in my error; how do I fix it?

Functor misses implicit value for parameter instance (only after sbt clean)

I did some experiments with Kittens (https://github.com/milessabin/kittens) and have issues compiling my code. I receive the following error.
[error] ...danirey\scala\kittens\Kittens.scala:23: could not find implicit value for parameter instance: cats.Functor[danirey.scala.kittens.AdtDefns.Tree]
[error] val funct = Functor[Tree]
[error] ^
[error] one error found
[error] (compile:compileIncremental) Compilation failed
The complete File is as follows
package danirey.scala.kittens
/**
 * @author Dani
 */
import cats.Functor
import cats.syntax.AllSyntax
import cats.derived.functor._
import legacy._
import cats.derived.iterable.legacy._
import org.typelevel.discipline.scalatest.Discipline
import shapeless.cachedImplicit
object Kittens extends App {
  val ft = new FunctorExperiment()
  ft.print()
}

class FunctorExperiment extends AllSyntax {
  import AdtDefns._

  def print(): Unit = {
    val funct = Functor[Tree]
    val tree: Tree[String] = Node(
      Leaf("Reto"),
      Node(
        Leaf("Sandra"),
        Leaf("Mike")
      )
    )
    println(funct.map(tree)(_.length))
  }
}
I have used almost identical code in a ScalaTest, which compiles without any issues.
package danirey.scala.kittens
import cats.Functor
import cats.syntax.AllSyntax
import cats.derived.functor._
import legacy._
import cats.derived.iterable.legacy._
import org.scalatest.FunSuite
import org.typelevel.discipline.scalatest.Discipline
import shapeless.cachedImplicit
/**
 * @author Dani
 */
class FunctorExperimentTest extends FunSuite with Discipline with AllSyntax {
  import AdtDefns._

  test("functors experiment") {
    val funct = Functor[Tree]
    val tree: Tree[String] = Node(
      Leaf("Reto"),
      Node(
        Leaf("Sandra"),
        Leaf("Mike")
      )
    )
    println(funct.map(tree)(_.length))
  }
}
My build.sbt looks as follows
name := "shapeless-experiments"
version := "1.0-SNAPSHOT"
scalaVersion := "2.11.8"
exportJars := true
libraryDependencies ++= Seq(
  "com.chuusai" % "shapeless_2.11" % "2.3.0",
  "org.typelevel" % "kittens_2.11" % "1.0.0-M2",
  "org.scalatest" %% "scalatest" % "3.0.0-M7" % "test"
)

scalacOptions ++= Seq(
  "-feature",
  "-language:higherKinds",
  "-language:implicitConversions",
  "-unchecked"
)
The most interesting thing is that it compiles as part of an incremental compile.
If I comment out lines 16, 23 and 32, execute "sbt compile", then remove the comments again and execute "sbt compile" or "sbt package", it compiles and I can even execute the program. But as soon as I run "sbt clean", it will not compile anymore.
The AdtDefns Object is basically a copy of https://github.com/milessabin/kittens/blob/master/core/src/test/scala/cats/derived/adtdefns.scala
The relevant part is
object AdtDefns {
  sealed trait Tree[T]

  final case class Leaf[T](t: T) extends Tree[T]
  final case class Node[T](l: Tree[T], r: Tree[T]) extends Tree[T]
}
PS: Would be nice if someone could create a tag for scala-kittens
@DaniRey we use kittens in our projects, but only the sequence part. I am not aware of any project that uses kittens derivation. What's your use case?

waf - build works, custom build targets fail

The waf command waf build shows compiler errors (if there are any), while waf debug or waf release does not and always fails, using the following wscript file (or maybe the wscript file has some other shortcomings I am currently not aware of):
APPNAME = 'waftest'
VERSION = '0.0.1'

def configure(ctx):
    ctx.load('compiler_c')
    ctx.define('VERSION', VERSION)
    ctx.define('GETTEXT_PACKAGE', APPNAME)
    ctx.check_cfg(atleast_pkgconfig_version='0.1.1')
    ctx.check_cfg(package='glib-2.0', uselib_store='GLIB', args=['--cflags', '--libs'], mandatory=True)
    ctx.check_cfg(package='gobject-2.0', uselib_store='GOBJECT', args=['--cflags', '--libs'], mandatory=True)
    ctx.check_cfg(package='gtk+-3.0', uselib_store='GTK3', args=['--cflags', '--libs'], mandatory=True)
    ctx.check_cfg(package='libxml-2.0', uselib_store='XML', args=['--cflags', '--libs'], mandatory=True)
    ctx.check_large_file(mandatory=False)
    ctx.check_endianness(mandatory=False)
    ctx.check_inline(mandatory=False)

    ctx.setenv('debug')
    ctx.env.CFLAGS = ['-g', '-Wall']
    ctx.define('DEBUG', 1)

    ctx.setenv('release')
    ctx.env.CFLAGS = ['-O2', '-Wall']
    ctx.define('RELEASE', 1)

def pre(ctx):
    print('Building [[[' + ctx.variant + ']]] ...')

def post(ctx):
    print('Building is complete.')

def build(ctx):
    ctx.add_pre_fun(pre)
    ctx.add_post_fun(post)

    # if not ctx.variant:
    #     ctx.fatal('Do "waf debug" or "waf release"')

    exe = ctx.program(
        features = ['c', 'cprogram'],
        target = APPNAME + '.bin',
        source = ctx.path.ant_glob(['src/*.c']),
        includes = ['src/'],
        export_includes = ['src/'],
        uselib = 'GOBJECT GLIB GTK3 XML'
    )

    # for item in exe.includes:
    #     print(item)

from waflib.Build import BuildContext

class release(BuildContext):
    cmd = 'release'
    variant = 'release'

class debug(BuildContext):
    cmd = 'debug'
    variant = 'debug'
Error resulting from waf debug:
Build failed
-> task in 'waftest.bin' failed (exit status -1):
{task 46697488: c qqq.c -> qqq.c.1.o}
[useless filepaths]
I had a look at the waf demos and read section 6.2.2 of the waf book, but those did not supply me with the information needed to fix this issue.
What's wrong, and how do I fix it?
You need to do at least the following:
def configure(ctx):
    ...
    ctx.setenv('debug')
    ctx.load('compiler_c')
    ...

You need this because the cfg.setenv function resets the whole previous environment. If you want to keep the previous environment, you can do cfg.setenv('debug', env=cfg.env.derive()).
Also, you don't need to explicitly specify features = ['c', 'cprogram'], since it's redundant when you call bld.program(...).
P.S. Don't forget to reconfigure after modifying the wscript file.
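Applied to the wscript above, the configure function might look roughly like this (my own sketch, untested; it keeps the pkg-config checks in the base environment and derives the two variants from it so the check results are not lost):

def configure(ctx):
    ctx.load('compiler_c')
    ctx.define('VERSION', VERSION)
    ctx.define('GETTEXT_PACKAGE', APPNAME)
    ctx.check_cfg(package='glib-2.0', uselib_store='GLIB', args=['--cflags', '--libs'], mandatory=True)
    # ... the remaining check_cfg / check_* calls as in the original ...

    base_env = ctx.env  # holds the toolchain and the results of the checks above

    ctx.setenv('debug', env=base_env.derive())   # inherit the base settings
    ctx.env.CFLAGS = ['-g', '-Wall']
    ctx.define('DEBUG', 1)

    ctx.setenv('release', env=base_env.derive())
    ctx.env.CFLAGS = ['-O2', '-Wall']
    ctx.define('RELEASE', 1)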
