Has anyone had any success developing CUDA programs for the NVIDIA Shield?

Has anyone managed to get a CUDA program to work on the NVIDIA Shield? In particular, has anyone gotten the wonderful NVIDIA profiling tools to work?

The NVIDIA Shield's SoC is based on Tegra 4, and Tegra K1 is the first Tegra processor that supports CUDA. So you can expect that it is not possible to run CUDA programs on the (current) NVIDIA Shield.
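A hedged sketch of how one might verify this at runtime on a Tegra board, assuming the Linux-for-Tegra CUDA toolkit is installed (link against the CUDA runtime, e.g. -lcudart); a Tegra K1 device would be reported here, while a Tegra 4 Shield would not:

    /* Query the CUDA runtime for any CUDA-capable device. */
    #include <stdio.h>
    #include <cuda_runtime_api.h>

    int main(void)
    {
        int count = 0;
        cudaError_t err = cudaGetDeviceCount(&count);
        if (err != cudaSuccess || count == 0) {
            printf("No CUDA-capable device found (%s)\n", cudaGetErrorString(err));
            return 1;
        }
        struct cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, 0);
        printf("%d CUDA device(s); device 0: %s, compute capability %d.%d\n",
               count, prop.name, prop.major, prop.minor);
        return 0;
    }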

Related

Where to start with ARM Cortex-A programming

I have experience with Cortex-M controllers (the LPC series from NXP) and Keil.
I want to move to Cortex-A because my application needs more speed.
I found on the internet that these processors usually come running Linux.
How can I run my code directly on the hardware rather than under Linux?
I don't need I/O pins.
Where should I start? Which IDE should I use?
I have also read that debugging Cortex-A processors is harder because an OS is involved. Is that true?
And is there any way to reach higher clock speeds (around a gigahertz) without going to Cortex-A?
By the Cortex-M series, I suppose you mean you have experience with the M0 and M3, right?
If you plan on using the A series, you should know that they are designed more for running operating systems than the M series is (for example, they have memory management units for virtual memory). That's why you won't find many bare-metal programming guides for these processors.
Also, these devices don't usually have on-chip ROM, so there is no embedded flash; you basically boot them from an SD card or eMMC.
You can use Linux (easier for you, but not real-time) or an RTOS (also easy). If neither suits you, you can use U-Boot from the SD card or eMMC and perform a couple of non-trivial, architecture-dependent steps to run your bare-metal software (also loaded from the SD card or eMMC).
I suggest you buy a BeagleBone and start from there.
You can still use a Cortex-A for a normal bare-metal application, and that way you will have something similar to what you had with applications running on a Cortex-M.
However, it really depends on what you want:
If you want to understand how the Cortex-A works, or you are bringing up a custom platform that is not yet stable, then bare-metal coding is your answer, and with it you will learn a lot about Cortex-A functionality (see the sketch after this answer).
If you want to use the Cortex-A from a user's point of view, then you need to compile a Linux kernel for your Cortex-A-based board and start developing on top of the running kernel.
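As a rough illustration of the bare-metal route mentioned above, here is a hedged sketch of a minimal C entry point that prints over a memory-mapped UART. The UART0_BASE address and the c_entry name are hypothetical placeholders: the real register address comes from the SoC's reference manual, the function is reached from board-specific startup assembly and a linker script, and the image would typically be loaded and started from U-Boot.

    #include <stdint.h>

    #define UART0_BASE 0x10009000u                  /* hypothetical UART base address */
    #define UART_DR    (*(volatile uint32_t *)(UART0_BASE + 0x00))  /* data register  */

    static void uart_puts(const char *s)
    {
        while (*s)
            UART_DR = (uint32_t)*s++;               /* busy-write; no FIFO check here */
    }

    void c_entry(void)                              /* jumped to from startup assembly */
    {
        uart_puts("bare-metal Cortex-A up\r\n");
        for (;;)
            ;                                       /* nothing to return to            */
    }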

SoftKinetic DS325 (aka Creative Senz3D) and Raspberry Pi?

Is there a way to get a depth stream from the DS325 on devices such as the Raspberry Pi?
What about using an Intel Galileo board instead?
If you have an x86 or x64 board, you can use the SoftKinetic driver and SDK for Linux right away.
The Raspberry Pi is ARM architecture; it can be done if you compile the driver yourself:
http://www.hirotakaster.com/weblog/openni2-ds325-driver-for-android-and-arm-linuxraspberry-pi/

Is it possible to execute OpenCL code on an ARM CPU (Cortex-A7) using the Mali OpenCL SDK?

The Mali OpenCL SDK allows executing OpenCL code on the Mali GPU.
Is it possible to execute OpenCL code on an ARM CPU (Cortex-A7) using the Mali OpenCL SDK?
Not at present - ARM have only publicly released drivers that support OpenCL on Mali GPUs. However, a couple of months ago they passed conformance for OpenCL running on an ARM CPU, so one might expect that this will be possible in the future:
(from the Khronos conformant products page)
ARM Limited, 2014-06-13, OpenCL 1.1
Linux 3.9.0 with ARM drivers on a v7 CPU
Compute Device Type: CL_DEVICE_TYPE_CPU
Compute Device Name: ARM Cortex-A15 NEON
Compute Device Version: OpenCL 1.1
Compute Device Driver Version: 1.1
Another option for running OpenCL on ARM CPUs is to use pocl, an open-source project.
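As a hedged sketch in plain C (not tied to any particular vendor SDK), this is how one could check whether any CPU OpenCL device is visible. With the Mali SDK today only GPU devices would be enumerated; with a CPU-capable implementation such as pocl installed, a CL_DEVICE_TYPE_CPU device should show up. Link against -lOpenCL:

    #include <stdio.h>
    #include <CL/cl.h>

    int main(void)
    {
        cl_platform_id platforms[8];
        cl_uint nplat = 0;
        clGetPlatformIDs(8, platforms, &nplat);

        for (cl_uint p = 0; p < nplat; ++p) {
            cl_device_id devs[8];
            cl_uint ndev = 0;
            if (clGetDeviceIDs(platforms[p], CL_DEVICE_TYPE_CPU,
                               8, devs, &ndev) != CL_SUCCESS)
                continue;                       /* this platform has no CPU device */
            for (cl_uint d = 0; d < ndev; ++d) {
                char name[256] = {0};
                clGetDeviceInfo(devs[d], CL_DEVICE_NAME, sizeof name, name, NULL);
                printf("CPU OpenCL device: %s\n", name);
            }
        }
        return 0;
    }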

Embedded uClinux footprint on Cortex-M3

I am having trouble with this question: somebody (hopefully mistakenly) moved the previous version to the Unix/Linux site, which has zero uClinux-tagged questions. This is more of an embedded Linux question.
I have a question about the footprint of uClinux. I have looked around for a breakdown of its requirements, but there is no good information on the net. The modules of interest are:
Core kernel
TCP/IP stack
Serial driver
DHCP
WiFi support (any of the vendor stacks is OK)
I am looking for a RAM/Flash breakdown. I don't need a filesystem, but there is a chance I will need one because of the Linux driver model.
Bonus question: porting drivers from Linux to uClinux. I know the memory architecture is different. Assuming the driver doesn't do anything special with memory, could I just recompile it and expect it to work under uClinux?
I understand the drivers work pretty well. This driver has been ported to uClinux on Blackfin and STM32: http://www.sagrad.com/index.php?option=com_content&view=article&id=130&Itemid=130
They are running a sale on the ICs, and most of their modules that support WiFi and Linux work with uClinux.
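On the bonus question, the part of the no-MMU memory model that usually bites first is user space rather than well-behaved drivers: uClinux has no copy-on-write, so fork() is unavailable and process creation is done with vfork() plus exec. A hedged illustration of that pattern (the "ls" invocation is just a placeholder):

    #include <stdio.h>
    #include <unistd.h>
    #include <sys/wait.h>

    int main(void)
    {
        pid_t pid = vfork();                        /* child shares the parent's memory */
        if (pid == 0) {
            execlp("ls", "ls", "/", (char *)NULL);  /* replace the child image          */
            _exit(127);                             /* only _exit/exec are safe here    */
        }
        if (pid > 0)
            waitpid(pid, NULL, 0);
        return 0;
    }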

Intel Atom or ARM for a heavy signal-processing workload

I would like to know which is the better (in performance) option:
Get an Intel dual-core Atom-based board
Get an ARM Cortex-A9-based board (PandaBoard, etc.)
I would like to run some light version of Linux and do some very CPU-intensive computations such as image/video processing (maybe 3D later), and also process audio on them. Of course, all floating-point mathematics.
Definitely option 2; the PandaBoard is an OMAP4 platform.
OMAP4 contains not only the ARM Cortex-A9 (which on its own is not likely to compete with a dual-core Atom) but also, and this is crucial, a full C674x DSP core, with both floating- and fixed-point mathematics.
The embedded DSP core in OMAP4 is fully capable of handling 1080p H.264 decode, with some resources to spare. I have yet to see an Atom platform capable of that.
(shameless plug - my company is using OMAP3 and evaluating OMAP4 for some of our niche markets, and we might be interested in assisting in yours as well)
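As a hedged aside on the floating-point side: whichever board you pick, the Cortex-A9 also has a NEON unit, and a lot of audio/image inner-loop work maps onto it. A minimal sketch (built with something like -mfpu=neon) that scales a float buffer four samples at a time:

    #include <arm_neon.h>
    #include <stddef.h>

    void scale_f32(float *buf, size_t n, float gain)
    {
        float32x4_t g = vdupq_n_f32(gain);           /* broadcast the gain           */
        size_t i = 0;
        for (; i + 4 <= n; i += 4) {
            float32x4_t v = vld1q_f32(buf + i);      /* load 4 floats                */
            vst1q_f32(buf + i, vmulq_f32(v, g));     /* multiply and store them back */
        }
        for (; i < n; ++i)                           /* scalar tail                  */
            buf[i] *= gain;
    }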
