Does Google-MLKit face detection use an MTCNN model? - face-detection

Does Google MLKit face detection use MTCNN? I'm using a FaceNet model that was trained on images aligned by MTCNN, but Google MLKit face detection seems to work well for aligning the face, so I'm wondering if it would be a clean substitute.
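For context, here is a minimal sketch of the MTCNN-plus-FaceNet pipeline described above. It uses the facenet-pytorch package purely as an illustration (an assumption; the asker's actual stack isn't specified). The point is that whichever detector produces the crop, MTCNN or MLKit, its cropping and alignment should match what the embedding model was trained on.

```python
# Illustrative only: facenet-pytorch is one possible MTCNN + FaceNet stack,
# not necessarily the one the question refers to.
from PIL import Image
from facenet_pytorch import MTCNN, InceptionResnetV1

mtcnn = MTCNN(image_size=160, margin=0)                    # detector + aligner
resnet = InceptionResnetV1(pretrained='vggface2').eval()   # FaceNet embedder

img = Image.open('person.jpg')        # hypothetical input image
face = mtcnn(img)                     # aligned 3x160x160 face tensor, or None
if face is not None:
    embedding = resnet(face.unsqueeze(0))   # 1x512 FaceNet embedding
    # If MLKit replaced MTCNN here, its crop would need the same size,
    # margin, and normalization for the embeddings to stay comparable.
```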

Related

Can the Google Vision API detect the outline of a face in an image?

I want to draw lines around a face (including the forehead) and cut that face out of the image. Can I use the Google Vision API to achieve this? I have tested the Google Vision API for face detection on some images, and it only returns the bounding poly (the rectangular area) around the face, the landmarks, and the facial expression. It does not return the coordinates of the outline around the face. How can I do that with the Vision API? If the Vision API cannot do it, then what library should I use?
The Vision API service offers a Face Detection feature that can be used to detect multiple faces within an image along with the associated key facial attributes. Based on this, the Vision API feature that best fits your requirement is the fdBoundingPoly. As mentioned in the official documentation:
The fdBoundingPoly bounding polygon is tighter than the boundingPoly,
and encloses only the skin part of the face. Typically, it is used to
eliminate the face from any image analysis that detects the "amount of
skin" visible in an image
I recommend checking the FACE_DETECTION response example, which you can use as a reference to learn more about this functionality.
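For reference, a minimal sketch of reading fdBoundingPoly with the google-cloud-vision Python client (2.x-style API assumed; the file name is hypothetical):

```python
from google.cloud import vision

# Assumes GOOGLE_APPLICATION_CREDENTIALS points at valid credentials.
client = vision.ImageAnnotatorClient()

with open("face.jpg", "rb") as f:          # hypothetical image path
    image = vision.Image(content=f.read())

response = client.face_detection(image=image)

for face in response.face_annotations:
    # fd_bounding_poly is the tighter, skin-only polygon described above;
    # bounding_poly is the looser rectangle the asker already gets.
    skin_poly = [(v.x, v.y) for v in face.fd_bounding_poly.vertices]
    print("fdBoundingPoly vertices:", skin_poly)
```

Note that even fdBoundingPoly is still a polygon around the face region rather than a pixel-accurate contour, which is why a feature request may be the way to go for a true outline.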
In case this feature doesn't cover your current needs, you can use the Send Feedback button, located at the lower left and upper right corners of the service's public documentation, or take a look at the Issue Tracker tool, in order to raise a Vision API feature request and notify Google about this desired functionality.

Video Camera and Stage3D on Mobile Adobe Flash AIR. Augmented-Reality on AIR

Quick Question
How can I show the webcam feed quickly in an Adobe AIR application that uses Stage3D?
Detailed question
About
My goal is to create a prototype of an AR (Augmented Reality) mobile application. I have chosen Adobe Flash AIR for its good 3D graphics support on mobile, and because AIR apps are easy to port to many mobile platforms (iOS, Android, BlackBerry PlayBook).
Purpose
I want to show a complex 3D model (so I need to use Stage3D) with video from the front camera underneath it, as in a typical AR application.
Here is an example (source: augmentedplanet.com)
Problem
Stage3D is not transparent at all, so I can't use StageVideo to show the camera feed quickly, because StageVideo is not visible under Stage3D.
So
The only solution I have found is to create a flat surface with a dynamically updated texture.
Here is an example of integrating webcam video with the Starling Framework (Stage3D). But on many ordinary mobile devices the texture update is so large (almost the size of the screen resolution) that any app drops to a low frame rate or even crashes, which is what happened on my Galaxy Note, for example. With a 320x200 texture size the performance is fairly good, but it looks ugly in an AR app.
So, is there any brilliant solution for creating AR on AIR? Has anybody faced the same challenge?
This use case is unfortunately not well supported in AIR. Your best bet is really the manual upload. It might help to add votes to feature requests on Adobe forums for transparent Stage3D.
Now for why this feature was low priority: if you are doing AR, you are probably already doing CPU work on the video. That means you already read back the camera data for processing either on the CPU or as a Stage3D texture. That's the expensive part, not uploading a texture back to Stage3D.
In order for this to be useful there would need to be a lot of complicated code paths working together flawlessly. On all supported devices:
Read back low resolution camera for CPU or GPU AR image processing
Show passthrough high resolution camera image
Overlay 3D with blending
This is unfortunately very hard. On many mobile chipsets video/camera, CPU, and 3D are very separate units so it is hard to share data between them without stalling or copying. It can be done very well IF you target specific hardware. I know this does not solve your problem but I hope it explains why this use case does not work well in AIR yet. I think you have those options:
Go with AIR and readback/upload. It will be very slow on some HW, but it will work reliably.
Go native. It will be a huge win in the best case, but you need a lot of custom code and testing for every single target.
Go native on a single very narrow platform. Many very cool AR demos do this. Look at SDKs for AR from GPU vendors. Most of them have one.
Make the best of a bad situation: stick with low res and uploads, but add some interesting filters on the video. Once you have paid for the upload, doing Stage3D work with your texture is very cheap.
I hope this helps a bit. While developing Stage3D this exact use case came up every now and then. I still think it is really cool! Maybe this post explained why it never made the top of the list yet.

Water simulation using shaders in WPF

I am searching for a way to implement a water effect in WPF like the one demonstrated on Lee Whitney's blog. This is for a full-screen application, so it should rely on shaders in order to utilize the GPU.
So far I have found the following methods & code examples:
WPF: This example does not meet the requirement of simulating drops of water, as it renders a single water ripple. However, it performs lightning fast because it uses shaders. I tried adding several layers of ripples on top of each other, but that did not look right either.
Silverlight: This implementation offers the right features in terms of simulating drops and their interaction, as opposed to rendering a single ripple. However, it does not perform well at all. I suspect the example may not use the GPU at all and instead calculates every pixel of the bitmap in software. I may have misunderstood the code, though, as I am not strong in Silverlight.
[C++]: This example is similar to the Silverlight example. It performs a lot faster, but when scaled to full HD size it gets too slow. As with the Silverlight one, the example appears to rely heavily on software calculations.
Windows Surface appears to have a similar effect implemented in its pond application. God knows how they did it.
Any ideas?
Take a look at this amazing lib. http://perspectivefx.codeplex.com/
Not really an answer, but look at this: WPF Shaders.
I'm learning too and found this resource; I hope it helps you.
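For context on what those examples are doing under the hood: both the Silverlight and C++ implementations are variants of the classic two-buffer height-field ripple update, which is exactly the per-texel arithmetic you would move into a WPF pixel shader. A small CPU-only Python/NumPy sketch follows (illustrative; grid size and damping constant are arbitrary, and edges wrap around):

```python
import numpy as np

W, H = 256, 256            # simulation grid size (arbitrary for the sketch)
prev = np.zeros((H, W))    # height field at the previous step
curr = np.zeros((H, W))    # height field at the current step
DAMPING = 0.99             # energy loss per step (tunable)

def drop(x, y, strength=5.0):
    """Disturb the surface where a 'drop' lands."""
    curr[y, x] -= strength

def step():
    """Each cell becomes the average of its four neighbours minus its own
    previous value, then gets damped; np.roll wraps at the borders."""
    global prev, curr
    neighbours = (np.roll(curr, 1, 0) + np.roll(curr, -1, 0) +
                  np.roll(curr, 1, 1) + np.roll(curr, -1, 1)) * 0.5
    new = (neighbours - prev) * DAMPING
    prev, curr = curr, new

drop(128, 128)             # one drop, then a few simulation steps
for _ in range(10):
    step()
# 'curr' now holds the ripple heights; a shader would run the same update
# per texel and use the heights to offset texture lookups for refraction.
```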

The correct choice of tools for a new Deep Zoom application

I want to create a new application. It will basically be a Deep Zoom application that users can draw annotations on (the annotations will be saved to a DB so other users can see them). At first it will just run in a browser. However, the app would be useful to enthusiasts in the field, so the ability to run on smartphones or other handheld devices would be massively beneficial. 3G/4G signal is likely to be practically non-existent in those places, so having the ability to download all the images and info for an "area" would be good.
I can't decide which technology to use. Silverlight Deep Zoom apps look really nice in browsers, but I have heard that it is not a widely supported technology, that MS might be ditching it anyway, and that the only smartphones capable of running Silverlight would be Windows Phones, which hold a very small share of the smartphone market. Flash will probably never run on iPhones/Apple products in general. So should I use HTML5? HTML5 all seems a little confusing to me at the moment; would it even be possible to make an HTML5 Deep Zoom application that users could annotate?
Any thoughts and advice would be really handy, thanks for reading.
I wrote a Deep Zoom app that supported annotation for a proof of concept a couple of years ago.
I used Django for this; however, it is not an approach I would recommend. If I were doing the same job again I would use CanvasZoom, which is based on HTML5. CanvasZoom can be embedded into a webpage through JavaScript. There is a guide on how to do this here:
a link
Unfortunately you need to run Microsoft Deep Zoom Composer on the original image first in order to generate the deep zoom data that CanvasZoom will use. If you want your app to run in a browser, it is likely that you will have to go for the following approach.
User selects image.
Image gets uploaded to the server.
Server creates the deep zoom information (see the sketch after this list).
Use a PHP-based approach so you have a CanvasZoom page for the image.
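As referenced in the list above, here is a hedged sketch of the "server creates the deep zoom information" step: building a power-of-two image pyramid cut into fixed-size tiles, which is the kind of data a viewer like CanvasZoom consumes. It uses Pillow, and the paths, tile size, and file layout are illustrative; a real DZI pyramid from Deep Zoom Composer has a more specific folder and XML structure.

```python
import math, os
from PIL import Image   # Pillow; all names below are illustrative

TILE = 256  # tile edge length in pixels

def build_pyramid(src_path, out_dir):
    """Write a simple power-of-two tile pyramid (Deep-Zoom-like layout)."""
    img = Image.open(src_path).convert("RGB")
    w, h = img.size
    max_level = int(math.ceil(math.log2(max(w, h))))
    for level in range(max_level, -1, -1):
        scale = 2 ** (max_level - level)            # highest level = full size
        lw, lh = max(1, w // scale), max(1, h // scale)
        level_img = img.resize((lw, lh))
        level_dir = os.path.join(out_dir, str(level))
        os.makedirs(level_dir, exist_ok=True)
        for ty in range(0, lh, TILE):
            for tx in range(0, lw, TILE):
                tile = level_img.crop((tx, ty, min(tx + TILE, lw), min(ty + TILE, lh)))
                tile.save(os.path.join(level_dir, f"{tx // TILE}_{ty // TILE}.jpg"))

# build_pyramid("uploaded_image.jpg", "tiles/uploaded_image")  # hypothetical paths
```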
The annotations will probably complicate matters; I did this with JavaScript when I attempted it. The trick is to work out when the image has been zoomed (with CanvasZoom there are preset zoom levels) and redraw the annotation regions. I found this approach non-trivial, but not overly complicated.
Canvas Zoom is MIT licensed, so you can do what you like with it.
Good luck with your project.

WPF 3-D performance for head-tracking app

I’m working on creating a full-screen 3-D app (based on Johnny Lee's Wii head tracking app) with some augmented reality features, and it seems that WPF is too slow to render even the simple models I’m using at a reasonable frame rate. I think the problem is that I need to change both the view and projection of the camera on just about every frame, because of the nature of the app (it uses a web cam to track your face, and uses that data to move the camera around and change its perspective).
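For reference, the per-frame work in a Johnny Lee style head-tracking display is an off-axis (asymmetric-frustum) projection: the tracked head position moves the eye point, and the frustum edges are recomputed against a fixed physical "screen" rectangle. A small NumPy sketch of that matrix is below; it is generic math rather than WPF or XNA API code, and the screen dimensions and head position are placeholders.

```python
import numpy as np

def off_axis_projection(eye, screen_half_w, screen_half_h, near, far):
    """OpenGL-style asymmetric frustum for an eye at (x, y, z), measured
    relative to the centre of a screen lying in the z = 0 plane."""
    ex, ey, ez = eye                      # ez > 0: distance from the screen
    # Frustum edges on the near plane, shifted opposite to the head offset.
    left   = (-screen_half_w - ex) * near / ez
    right  = ( screen_half_w - ex) * near / ez
    bottom = (-screen_half_h - ey) * near / ez
    top    = ( screen_half_h - ey) * near / ez
    return np.array([
        [2 * near / (right - left), 0, (right + left) / (right - left), 0],
        [0, 2 * near / (top - bottom), (top + bottom) / (top - bottom), 0],
        [0, 0, -(far + near) / (far - near), -2 * far * near / (far - near)],
        [0, 0, -1, 0],
    ])

# Per frame: plug in the head position reported by the tracker (placeholder
# numbers in metres, relative to the screen centre).
proj = off_axis_projection(eye=(0.05, -0.02, 0.6),
                           screen_half_w=0.26, screen_half_h=0.16,
                           near=0.1, far=100.0)
```

Whichever back end you pick, rebuilding this matrix each frame is cheap; the slowness described here is more likely in the rendering pipeline than in the matrix math itself.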
I've spent a lot of time trying to narrow down the problem, and it's definitely related to the graphics, and not the speed of the head-tracking API that I'm using. Also, I recreated the app in XNA, and it seems to work fine there (28 FPS versus 9 in WPF). Finally, when I remove the "walls" or make the window much smaller (say, 800 x 600), WPF's performance greatly improves, which makes me think that the bottleneck is the graphics calculations.
As a result, I need to either find a new graphics back-end to work with, or find a way to make WPF much faster for this app. I’m mostly looking at DirectX and XNA, and possibly OpenGL. Any recommendations on which of these APIs would be best to use for this app in .NET? Or alternatively, any idea what I'm doing wrong in WPF that's slowing things down?
I've not done enough with 3D in WPF to be able to say what could be causing the slowness, but I did notice that its model data structure isn't very efficient. While this might not be the cause, it could be symptomatic of general slowness across the whole pipeline.
One thing that does spring to mind is that WPF may be rendering the scene in software rather than using the hardware acceleration on the graphics card. The fact that you get better performance (though you don't say how much better) with a smaller window would be consistent with that.
If you remove any textures from your model do you get better performance too?
I don't think it really matters whether you go for Direct3D or OpenGL - virtually all modern cards support them equally well. XNA is the obvious choice if you're sticking with .NET as it's integrated into Visual Studio - even the Express edition.
I would check out SlimDX, a much thinner DirectX wrapper than XNA or WPF.
The problem with changing the projection on every frame is that APIs like XNA/WPF weren't designed with this in mind, so they were optimized to have the projection set once in the initialization phase and then not again.
I would suggest a hybrid approach here: use WPF for what it's good at (Windows UI, composition, etc.) and use XNA to render the 3D. There are samples out there that demonstrate combining XNA with WinForms. It should be possible to do the same "trick" to render XNA onto surfaces in WPF.
Update: there are supposedly issues with using XNA directly with WPF. This thread indicates that using XNA with WinForms and then hosting the WinForms control in WPF is a workaround to these issues. I've not tested this myself though (yet).
