The concept of 'sending' data to the GPU in OpenGL ES

The concept of 'sending' data to the GPU in OpenGL ES - mobile

Having only used Direct3D and OpenGL on desktop computer, I have this concept in my head that whenever buffers need to be updated you need to send this data to the GPU with calls such as glBufferData()/glBufferSubData(), and that this type of thing should be minimised at all costs.
Since OpenGL ES works with embedded systems such as phones, and the devices I don't think have dedicated GPU RAM, I was wondering if such an API call does completely different things if compiled on one device (Android) vs on another device (Windows desktop computer). The calls seem the same or similar whether on OpenGL or OpenGL ES, and I was wondering if I write my program on Windows it'll work on a mobile device. My guess is that on a mobile device these function calls will load data onto the system RAM whereas on a desktop it will be sent to the GPU RAM, if a GPU is available.
If this is the case, then is it an advantage for the mobile device in that data is only kept in one place and never sent anywhere else (ie., to the GPU)?

The only significant difference is that in one case there is a DMA transfer over PCIe (desktop), in the other case it's just a memcpy to a driver-owned buffer in system RAM (mobile).
In both cases it can be relatively expensive (e.g. memory allocation, data copy, and possibly a need for cache maintenance on some systems), so it should still be minimized whenever possible. It's far more efficient to upload resources to wherever they need to be at the start of a game level and then just reference them from then onwards.

Related

Does libdrm talk to kernel DRM/graphics card via ioctl()?

This can be a silly question as I do not know much about this topic at all... It seems that user applications can talk directly to GPU to render an image, for example using OpenGL, through mesa and libdrm, where the libdrm is a wrapper around various ioctl() calls, as illustrated in this graph. Does it mean that for every new frame of a 3D game, the game application needs to call the ioctl() once (or maybe even twice if KMS needs to be reached)? That sounds like a lot of user-kernel space barrier crossing (thinking about a 120 fps game).

libdrm is an user space wrapper to perform fine grained access of the underlying KMS driver features like modesetting, checking if plane being used is an overlay plane or primary plane etc. libdrm implementations are generally different for various CPU/GPU/OS combinations, as the h/w driver running in kernel tend to support different set of functionalities apart from the standard ones. The standard way of working with libdrm is to open a drm device available in /dev/ node and perform libdrm function calls using the fd returned from open().
More often than not, the display compositor software for a particular OS like X11, wayland, hardware-composer will need to be in control of the drm device, which means non privileged applications have no way of being DRM master. Most of the libdrm mode setting functionalities do not work if the application trying to use them are not the DRM master. Recommended practice instead of using libdrm directly, is to use a standard graphics library like openGL or VULKAN to prepare and render frames in your application.
The number of ioctls required to interact with the kernel DRM module is most likely not the biggest bottleneck you will face when trying to render high FPS applications. The preferred way to run high fps applications while cooperating with the display compositor of the target system is to have
a double or triple buffered setup for rendering, where the next buffer to be rendered is ready to be rendered before the current frame has finished rendered.
Take advantage of h/w acceleration wherever possible, e.g for performing scaling/resizing/image format conversions/color space conversions.
Pre compute and reuse shader elements
Try to reuse texture elements as much as possible instead of computing a lot of textures for every frame being rendered.
Use vector/SIMD/SSEv2,3,4/AVX/neon instructions wherever possible to take advantage of modern CPU pipelines

Is it wrong to log inner working of an IoT device in case of failure?

I'm currently working on an IoT project and I want to log the execution of my software and hardware.
I want to log them then send them to some DB in case I need to have a look at my device remotely.
The wip IoT device will have to be as minimal as possible so the act of having to write very often inside a flash memory module seems weird to me.
I know that it will run the RTOS OS Nucleus on an Cortex-M4 with some modules connected through SPI.
Can someone with more expertise enlighten me ?
Thanks.

You will have to estimate your hourly/daily/whatever data volume that needs to go into the log and extrapolate to the expected lifetime of your product. Microcontroller flash usually isn't made for logging and thus it features neither enduring flash cells (some 10K-100K write cycles usually compared to 1M or more for dedicated data chips - look it up in the uC spec sheet) nor wear leveling. Wear leveling is any method which prevents software from writing to the same physical cell too frequently (which would e.g. be the directory for a simple file system).
For your log you will have to create a quite clever or complex method to circumvent any flash lifetime problems.
But the problems don't stop there: usually the MCU isn't able to read from Flash memory when writing to it where "writing" means a prolonged (several microseconds up to milliseconds depending on the chip) sequence of instructions controlling the internal Flash statemachine (programming voltage, saturation times, etc.) until the new values have reliably settled in the memory. And, maybe you guessed it, "reading" in this context also means reading instructions, that is you have to make sure that whichever code and interrupts that may occur during the Flash write are only executing code in RAM, cache or other memories and not in the normal instruction memory. It is doable but the more complex the SW system that you are running above the HW layer, the less likely it will work reliably.

Can I run a program from a SD memory instead of flash on an evaluation board (embedded programming)?

I have an evaluation board (Olimex STM32-P103) which supports a SD-card connector. I want to put my program in to a SD memory instead of internal flash of the micro-controler; and run it from there.
I don't know if it is possible to do that according to boot-loader issue!
P.S my goal is running linux on this board and then port my application over it.

To run programs from SD-Card in general you should know that you can't run them "right away". This means, you have to load it in a executable memory somewhere in your address space which is done by a (more or less) simple bootloader. In the simplest instance, the bootloader is capable to read from a SD-Card a specific binary and copy it into the memory.
That being said you should think about this considering you only got 20k of RAM and 128k of Flash on your board. So where should your program go? Or better: Why not flashing the program in the 128k of Flash from the very beginning? Especially you should know that Linux is a bit "hungry" in terms of memory.
If your goal is to run a "normal" Linux on this board, I'm afraid you're screwed. This because from what I know Linux needs a MMU to run and the chip on this board does not provide one (as far as researchable without access to datasheets from ST).
If you're lucky you can go with uCLinux. I'm not sure if a finished port exists for the STM32 but it seems there are some resources based on a short google search for "STM32 uCLinux". But even if you manage to run uCLinux I'm afraid there's not much left in your system for your application, so the result might be a bit disappointing.
Depending on why you are looking for Linux running on this MCU, there are maybe other solutions like a FreeRTOS in combination with a lwIP-stack (if networking is needed) or a FAT library like FullFAT if you are looking for reading SD-Cards and stuff.
Edit: One thing i'd like to add is that booting from the SD-Card is typically something you do with "bigger" (not much but slightly) systems where you have enough RAM to keep the whole image you'd like to run in it and still have some space left for the data you want to process.

You're going to have to have some code in the STM's onboard flash (typically called a "boot loader") that implements this since the "bare metal" very likely can't boot from SD card.
You're going to have to build that code, which figures out how to use the STM's onboard peripherals to talk to the SD card, finds the file you want to run in the file system (which you also have to implement), and loads it.
I wanted to include a link to the STM standard peripheral library, but it seems to be down (being moved). :/

The data on the SD card is not memory mapped, so cannot be executed directly.
It is possible to dynamically load the data from the card into RAM for execution. WindRiver's VxWorks RTOS supports loading and linking object modules dynamically, I know of no other OS that would scale to a Cortex-M that directly supports that but it would be possible to write your own.
However, I would suggest that in the case of the microcontroller you are using the idea is ill-advised; optimal performance on Cortex-M is achieved when the code is in on-chip flash and data in RAM allowing the data and instruction to be fetch to occur simultaneously on the separate buses (Harvard architecture). If you execute the code from RAM the performance will be severely hit since then data and instructions must be fetched sequentially over the same bus.
The board is entirely unsuited to running Linux, with only 128K Bytes Program Flash, and 20K Bytes RAM is is not at all feasible. Even the smallest Linux distribution requires 600Kb RAM plus whatever is needed by application code. uClinux can just about run on higher-end STM32 with external RAM and Flash, but that would suffer from the same bus contention performance hit and Linux without an MMU is rather missing the one major benefit of using Linux at all. The part on your board lacks an external memory interface, so cannot be expanded to support Linux.
If you need an OS consider a RTOS such as uC/OS-II, FreeRTOS, or emBOS for example.

AS other says you cannot directly execute your code directly from the SD CARD.
But like those "linux board", you can load the stored kernel/programm into an external SDRAM that can be mapped and execute it from there.
You'll still need to write that "bootloader" and store it in the internal flash.
That'is a lot work to my opinion, for limited application.
If you want to write your application in a linux environnement then port it suck small target, I would rather design my application using dependency injection, or even use an emulator.

How its made such as digicoder vcr dvd players graphical user interfaces from poweron till user interface?

I have C/Java knowledge but i never understand yet, how some hardwares show there own screens/graphics from poweron stage to user interface (where it never shows linux/unix boot screen nor it shows windows booting screens).
My question is, Compared to VCR/TV digicoders poweron till user interfaces, how its made? Do we use regular linux kernel or is there any special open source framework which allow us to develop such?
Thanks

Many embedded systems use u-boot as a boot loader. U-boot provides the ability to display a "splash" screen while the linux kernel is booting.

A device will start the bootloader right after the CPU comes out of reset (usually milliseconds after power-on at most). The bootloader code can initialize the display and show a splash screen if it wants (in the same way most modern non-embedded Linux distributions have a graphical grub splashscreen). The kernel can avoid changing the display configuration, and on an embedded device the kernel can boot pretty quickly to running userspace (at least an initramfs), which can take over the display and show whatever animation, progress bar, etc until the full UI is ready.

An operating systems such a Windows or Linux are both large and general purpose. They have to initialise themselves and the hardware, which includes interrogating all connected devices for "plug & play". The OS does not know in advance which such devices are connected; it has to "discover" the hardware every time it starts. The connected hardware may even have changed since it last booted.
Embedded systems do not usually have large operating systems (or often do not have an operating system at all), and they usually have very specific hardware known to the system a priori, so do not need to test and determine the correct configuration for such devices. Often also these devices are far simpler, and are often 'on-chip' peripherals.
That said, your PC is capable of instantly displaying a user interface (just not Windows). The BIOS boot process outputs text to the display almost immediately, and the BIOS console is an interactive user interface that starts on request during boot. Also last time I booted MS-DOS on a modern PC, it took only a few seconds to start.
Not all embedded systems start "instant-on", my digital TV PVR even has a progress bar while booting, but being application specific, it still starts far faster than a general purpose computer. My Network Attached Storage (NAS) device which is an embedded system running Linux on the other hand, takes considerable time to boot since among other things, it has to start the file-system, network, USB interfaces, print server, DNLA server, and web-server. In fact many of the things required for a general purpose computer (but it has no display, the UI is presented via the web-server)
Some embedded systems with large operating systems and complex hardware can achieve "instant-on" by never truly switching off, but rather going into a low power mode where the system state is retained in memory while all the high powered devices such as a screen, WiFi, Bluetooth etc. are switched off.

Can I create a file system accessible from CE 6.0 and my bootloader?

I have a CE 6.0 project on a PXA310 where I need to be able to download OS updates (nk.bin) via Wi-Fi and safely flash the new OS to my device. I'm open to other suggestions about how to do this, but I'm considering saving the nk.bin to my file system in NAND flash, then restarting and have the bootloader locate the file in the file system and flash it to the BINFS partition. Is this possible, and if so, can you give me an outline of what I'd need to do?
One caveat is that this needs to be very robust since the devices are deployed in the field and are not field serviceable. I need to be sure that if the OS flash fails (due to power failure, etc.) that upon reboot the bootloader can try again. That is why I'd like to store the downloaded image in persistent flash and avoid having to re-download the image.

Technically just about anything is possible. For this strategy what you would need is code for your bootloader to mount the NAND flash as a drive and have a FAT driver so that it can traverse that file system and find the image. That is a lot of work if you don't already have it.
THe other option is to just store it in flash outside of the file system in a known address location. That's a lot easier from the bootloader perspective as all you have to do is map to the address and copy. Of course it makes the writes more challenging because then you're doing it from the OS and you have to disable any other flash accesses completely while you do your write to prevent corruption by two threads sending flash commands to the chip at the same time.
In either case, if you have the space it's a good idea to store a "known-good" image elsewhere too, so that if the new image has a problem (fails a checksum or x number of load attempts fails) then you have a working OS that the bootloader can fall back to.

Clearly a lot depends on your hardware setup, but we've done this without making the Bootloader support the Flash Filesystem.
In our product, the OS image is loaded from Flash to execute from RAM -- I think most WinCE devices work this way nowadays. So to update the OS we use a special Flash driver which lets an application, running under WinCE, update the OS blocks in the Flash -- then all you need is a hard reboot and the Bootloader loads the new flash image into RAM in order to execute it. We've found this pretty reliable in the field (with some not-very-technical end-users!).
A special Flash driver was needed because the MS Flash Filesystem drivers have no access to the OS image sectors of the Flash, in order to prevent trashing the OS by accident.
You do need to load the NK.BIN into some memory which the OS programming application can read, normally the NAND Flash, but if you had enough RAM it could just go into the root of the filestore. However either way you can delete it when you've finished programming the OS sectors before the reboot so it's only a temporary requirement.

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight