_BitScanForward64 using MinGW on Windows

_BitScanForward64 using MinGW on Windows - c

I am trying to use _BitScanForward64 intrinsic (https://software.intel.com/sites/landingpage/IntrinsicsGuide/#expand=381,421,373&cats=Bit%20Manipulation) using MinGW 64 on Windows (GCC 4.8.1).
I tried:
#include "immintrin.h"
#include "x86intrin.h"
#include "xmmintrin.h"
but it still doesn't see the function (according to Intel guide it should be in "immintrin.h").
How do I get it to work using MinGW64 on Windows?

GCC does not use a intrinsic for this. It uses a built in function _builtin_ffs. Wikipedia has a nice summary for each compiler.
The reason Intel lists this intrinsic I think is that the Intel C++ compilers tries to support the same intrinsics which Microsoft created at least when used in Visual Studio on Windows. This unfortunately means that some intrinsics are not defined for each compiler. Even worse is that sometimes they are defined differently. For example Intel's definition of addcarry-u64 disagrees with Microsoft's definition but looking at the assembly shows that Intel uses Microsoft's definition and GCC and Clang use Intel's defintion.
For most x86 SIMD intrinsics GCC, Intel, Clang, and MSVC agree (with a few exceptions coming from MSVC) but for other intrinsics you have to get used to them being only defined in some compilers or being defined differently or having to use builtin functions or a library function.
The Intel compiler even has it's own intrinsics for this: _bit_scan_forward.

Related

MSVC's supported subset of C

So Microsoft's MSVC from Visual Studio 2019 doesn't support C99 (or later), but it does support C89 and some constructs from C99.
Is there an option for the GCC C-compiler (not C++-compiler) to use a standard that would guarantee that the source can also be compiled with MSVC? If it compiles with -std=iso9899:199409 -pedantic under GCC, can MSVC compile it?

Is there a reason you need to care if MSVC can compile it? GCC ("mingw") can target Windows PE object files with ABI compatible with MSVC, so users could build with that, or you could even ship them binary object files/library files to use so they don't need any tooling.
Policing your code base for compatibility with a known-broken/intentionally-broken compiler does not seem like a worthwhile activity unless you actually have reason to want to use that compiler, rather than just allowing users of that compiler to link with your code.

Pretty much the only way to you can ensure it builds with MSVC and GCC is to build the code with both toolsets. In addition to language constructs, there are a number of differences in the handling of compiler-defined preprocessor symbols, differences in what the preprocessor can handle, etc.
Personally I've been doing a lot of work getting C++ code to build with MSVC and Clang, and I've hit many minor issues that have to be fixed to get things to build with both toolsets. The C/C++ language standards help make the code portable, but you still have to run it through more than one toolset to get it to build 'cleanly'.
If you want your code to be robustly portable you also should build it for multiple architectures.
For my GitHub libraries, I build for ARM, ARM64, x86, x64 on MSVC, VS 2015 Update 3/VS 2017/VS 2019, targeting Win32 desktop, UWP, and Xbox One. I also build with clang for Windows for x86 and x64. Each one finds slightly different issues, but the end result is a lot more portable.

Compiling half float neon instructions for iOS

The issue I am having is with some neon instructions which I believe are supported on the arm7 architecture. I am using the default compiler (Apple LLVM 5.0), it recognises other neon instructions although it does not like the half-float instruction.
Here is the code:
vcvt.f32.f16, q0, d1
This has compiled on gcc although the apple compiler does not like this instruction and gives the error: Instruction requires: half-float
Is there a compiler flag I can give to XCode? I can't find out how to enable the half float instructions googling around.
Thanks!

The half-float format is actually not supported on all ARM v7 implementations. See the ARM manual here. It's required by vfp4, so if your chip supports that, that's a good start. In general I would recommend using run-time detection and dispatching. To enable the instruction in general, you would need to use one of several floating point support options, in general "fp16" is the keyword, for example:
-mfpu=neon-fp16 if you are sure that your target supports it for neon. I couldn't find all of the examples for llvm either, but I think they are generally compatible with the GCC options, found in the GCC manual.

GCC Error while compiling for ARM

I am getting the following error while trying to compile some code for an ARM Cortex-M4
using
gcc -mcpu=cortex-m4 arm.c
`-mcpu=' is deprecated. Use `-mtune=' or '-march=' instead.
arm.c:1: error: bad value (cortex-m4) for -mtune= switch
I was following GCC 4.7.1 ARM options. Not sure whether I am missing some critical option. Any kickstart for using GCC for ARM will also be really helpful.

As starblue implied in a comment, that error is because you're using a native compiler built for compiling for x86 CPUs, rather than a cross-compiler for compiling to ARM.
GCC only supports a single general architecture type in any given compiler binary -- so, although the same copy of GCC can compile for both 32-bit and 64-bit x86 machines, you can't compile to both x86 and ARM with the same copy of GCC -- you need an ARM-specific GCC.
(As auselen suggests, getting a pre-built one will save you quite a lot of work, even if you're only using it as a starting point to get things set up. You need to have GCC, binutils, and a C library as a minimum, and those are all separate open-source projects that the pre-built versions have already done the work of combining. I'll recommend Sourcery CodeBench Lite since that's the one my company makes and I do think it's a fairly good one.)

As the error message says -mcpu is deprecated, and you should use the other options stated. However "deprectated" simply means that its use may not continue to be supported; it will still work.
ARM Cortex-M4 is ARM Architecture V7E-M, so you should use -march=armv7-m (the documentation does not specifically list armv7e-m, but that may have been added since the documentation was last updated. The E is essentially the difference between M3 and M4 - the DSP instructions, so the compiler will not generate code that takes advantage of these instructions. Using ARM's Cortex-M DSP library is probably the best way to use these instructions to benefit your application. If your part has an FPU, then other options will be needed enable code generation for that.

Like others already pointed out, you are using a compiler for your host machine, and you need a compiler for generating code for your target processor instead (a cross compiler). Like #Brooks suggested, you can use a pre-built toolchain, but if you want to roll out your own cross-compiler, libc and binutils, there is a nice tool called Crosstool-NG. It greatly simplifies the process of building a cross-compiler optimized to generate code for a specific processor, so you're not stuck with a generic prebuilt toolchain, which usually builds code for a family of compatible processors (e.g. you could tune the toolchain for generating ASM for your specific target, or floating point code for a hardware FPU which is specific to your processor, instead of using only software floating point routines, which are default to most pre-built toolchains).

Optimal Windows C tool stack

I am planning a bigger C-only project. It should run on both Linux and Windows. My question is, what is the optimal development stack (compiler, IDE) on Windows? Problem is, we would like to use C99 (if possible).
On Linux it's quite easy because usually combination GCC+VIM+GIT is optimal. But on Windows?
I am concerning: Visual Studio 2010 (no C99 support), MinGW and Intel C++ Compiler.
How do they compare with each other in terms of performance?

For IDE, I've happily used VC++, MonoDevelop, and Code::Blocks (the latter two are cross-platform, as a plus). C isn't a very difficult language to make an IDE for, so most anything will work and it'll just come down to personal preference.
For compiler... C is very quick to compile on all of them, so I guess by performance you mean of generated code?
In my experience, Intel C++ optimizes best if you're targeting Intel CPUs. MinGW GCC optimizes best for everything else. VC++ optimizes very good, but not quite as much as GCC or ICC. This is of course in the general sense only -- I've had plenty experiences where VC++ bests them both.
VC++ compiler can integrate with some non-VC++ IDEs, like Code::Blocks. As you said, lacks C99 support (though, it does have stdint.h).
MinGW integrates with most IDEs, except VC++. Once upon a time its port of libstdc++ lacked support for wchar_t, making Unicode apps very difficult to write. I'm not sure if this has changed.
ICC integrates with the VC++ IDE, some non-VC++ IDEs, as well as supporting C99 and Linux, however it has been shown in the past to deliberately use sub-optimal code when used with non-Intel CPUs -- I'm not sure if this is still the case.
Agner Fog's Optimizing software in C++ provides a decent comparison of compilers, included optimization capabilities.

Well, GCC is GCC. Performance is identical over different OSes, minus ABI and C stdlib differences that may impact performance.
This is an easy problem to solve, use GCC everywhere, meaning MinGW, or the more up to date mingw-w64 (includes 32 and 64-bit compilation capabilities). It provides all (most, meaning 99.9%) of the Win32 API when you need it.
Note that although it's the same compiler, ABI is different for say, Windows x64 vs Linux x64 (in this case: the size of long), and you should ensure the code compiles and works on all platforms you intend to target, regularly.
Using GCC with -pedantic-errors -Wall -Wextra is a nice help for this (if you silence all warnings!), but not perfect.
The Intel compiler will bring better performance, but is only free for personal use on Linux (you can't distribute binaries produced by the free version if I remember correctly), so if you want free tools, that's out. Visual C sucks at C99. It's a C89 compiler, and that isn't going to change soon.
Most development tools are available on Windows as well, see for example msysgit and vim.

What GNU C extensions are available that aren't trivial to implement in C99?

How come the Linux kernel can compile only with GCC? What GNU C extensions are really necessary for some projects and why?

Here's a couple gcc extensions the Linux kernel uses:
inline assembly
gcc builtins, such as __builtin_expect,__builtin_constant,__builtin_return_address
function attributes to specify e.g. what registers to use (e.g. __attribute__((regparm(0)),__attribute__((packed, aligned(PAGE_SIZE))) ) )
specific code depending on gcc predefined macros (e.g. workarounds for certain gcc bugs in certain versions)
ranges in switch cases (case 8 ... 15:)
Here's a few more: http://www.ibm.com/developerworks/linux/library/l-gcc-hacks/
Much of these gcc specifics are very architecture dependent, or is made possible because of how gcc is implemented, and probably do not make sense to be specified by a C standard. Others are just convenient extensions to C. As the Linux kernel is built to rely on these extensions, other compilers have to provide the same extensions as gcc to be able to build the kernel.
It's not that Linux had to rely on these features of gcc, e.g. the NetBSD kernel relies very little on gcc specific stuff.

GCC supports Nested Functions, which are not part of the C99 standard. That said, some analysis is required to see how prevalent they actually are within the linux kernel.

Linux kernel was written to be compiled by GCC, so standard compliance was never an objective for kernel developers.
And if GCC offers some useful extensions that make coding easier or compiled kernel smaller or faster, it was a natural choice to use these extensions.

I guess it's not they are really that necessary. Just there are many useful ones and cross-compiler portability is not that much an issue for Linux kernel to forgo the niceties. Not to mention sheer amount of work it would take to get rid of relying on extensions.