I'm trying to build a GAN that trains on hundreds of thousands to millions of coordinate sets. The coordinate sets are collected by a JavaScript mousemove event listener, and are appended into an array. The array should be about length 250.
Without any prior knowledge of ML, and a requirement to build this with a GAN (as the discriminator used to analyze the mouse movement is quite strict), what would be the proper approach?
Here is an end-to-end example of an image generator using a GAN with Microsoft's CNTK and C# .NET.
The author walks through building a GAN step by step. You can adopt the same approach and adapt the code to your training data set.
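Whichever framework you pick, the mouse traces first have to become fixed-size numeric inputs. Here is a minimal Python sketch of that preparation step, assuming each sample is a list of (x, y) pairs and a target length of 250 as in the question. The function name, the linear resampling and the scaling into [0, 1] are my own assumptions for illustration; they are not taken from the referenced article.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def to_fixed_length(points: List[Point], length: int = 250) -> List[Point]:
    """Linearly resample a trace to `length` points and scale it into [0, 1].

    Hypothetical helper: `length` must be at least 2.
    """
    if not points:
        raise ValueError("empty trace")
    if length < 2:
        raise ValueError("length must be at least 2")

    if len(points) == 1:
        # A single sample carries no path, so just repeat it.
        resampled = [points[0]] * length
    else:
        resampled = []
        step = (len(points) - 1) / (length - 1)
        for i in range(length):
            pos = i * step
            lo = min(int(pos), len(points) - 2)  # index of the left neighbour
            frac = pos - lo                      # position between neighbours
            x = points[lo][0] + frac * (points[lo + 1][0] - points[lo][0])
            y = points[lo][1] + frac * (points[lo + 1][1] - points[lo][1])
            resampled.append((x, y))

    xs = [p[0] for p in resampled]
    ys = [p[1] for p in resampled]
    # One common scale for both axes keeps the aspect ratio of the movement.
    span = max(max(xs) - min(xs), max(ys) - min(ys)) or 1.0
    x0, y0 = min(xs), min(ys)
    return [((x - x0) / span, (y - y0) / span) for x, y in resampled]
```

Each resulting trace can then be flattened to 500 numbers and fed to the generator/discriminator pair in place of the image pixels.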
Reference:
Farragher, M. (2019), Create Any Image with C# And A Generative Adversarial Network, Medium article.
We are in the process of developing our first website made using Three.js. It of course uses a collection of 3D models, some of which are fairly busy cityscapes. We made them low poly, and are avoiding animation at this point, but would like to add some moving elements eventually.
My 3D designer is more used to working with objects used in Unity games, and he says that the industry standard is to keep each model below 100K polygons. Is there a similar limit that is typically used for Three.js?
In my mind, the issue should rather be focussed on file size, so we are trying to optimize this of course. I was just wondering if anyone knows whether there are other concerns to take into consideration in terms of poly-count?
I'm trying to write my own software for security camera motion detection, but in the area of interest outside my house there is a lot of vegetation motion that will obviously trigger recording if I use one of the simpler algorithms that rely only on the difference between images. Does anyone have any recommendations? I'm struggling to find motion detection information online. I'm guessing that I'll have to employ some edge detection, or maybe a filtering process.
Cheers,
Zan
Without having seen any of your recordings, I would suspect that motion from the vegetation looks quite noisy and random, with only a few local edges. In contrast, I would expect much stronger connected edges for people moving through the scene. Also, edges from objects moving along the ground will mostly be oriented in specific directions for a longer period of time.
My first attempt would be:
- median filter on the input image to reduce noise
- difference image against the previous (maybe the second previous) image
- some edge detector
- build edge lists based on the stronger edges
- filter out weak/short edges
- match edges from objects in the last frame against the newly found ones
- apply some tracking of positions and other features
- classify object behaviour based on these features, to trigger your recording:
  - consistent movement in one direction
  - consistently strong edges on the same object
  - object size
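To make the idea concrete, here is a rough Python sketch of the first and last parts of that pipeline. Note that it is a simplification and not the full approach: instead of building and matching edge lists, it thresholds the difference image and measures the size of the largest connected changed region, on the assumption that vegetation produces many small scattered regions while a person produces one large one. Frames are plain lists of grayscale rows, and all names and threshold values are made up for illustration.

```python
from collections import deque
from typing import List

Frame = List[List[int]]  # grayscale rows, values 0..255


def changed_mask(prev: Frame, curr: Frame, threshold: int = 25) -> List[List[bool]]:
    """Mark pixels whose brightness changed by more than `threshold`."""
    return [[abs(c - p) > threshold for p, c in zip(prow, crow)]
            for prow, crow in zip(prev, curr)]


def largest_blob(mask: List[List[bool]]) -> int:
    """Size in pixels of the largest 4-connected region of changed pixels."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    best = 0
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill from this unvisited changed pixel.
            size = 0
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                size += 1
                for ny, nx in ((cy + 1, cx), (cy - 1, cx), (cy, cx + 1), (cy, cx - 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            best = max(best, size)
    return best


def should_record(prev: Frame, curr: Frame, min_blob: int = 6, threshold: int = 25) -> bool:
    """Trigger only when one connected changed region is large enough."""
    return largest_blob(changed_mask(prev, curr, threshold)) >= min_blob
```

In practice you would still want the median filter in front of this and some tracking across frames behind it, and `min_blob` would have to be tuned to your camera resolution.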
Alternatively, you can jump on the recent hype around deep neural networks:
- look up online information and tools (and maybe embedded hardware) to train and run a convolutional deep neural network (CDNN)
- split your current videos into
  - videos where you do not want to be warned
  - videos where you do want to be warned
- let the magic happen.
I am working on an AR-based solution in which I am rendering some 3D models using SceneKit and ARKit. I have also integrated Core ML to identify objects and render corresponding 3D objects in the scene.
But right now I am just rendering the model in the center of the screen as soon as I detect the object (only for the list of objects that I have). Is it possible to get the position of the real-world object so that I can show an overlay above it?
That is, if I have scanned a water bottle, I should be able to get the position of the water bottle. It could be anywhere on the water bottle, but shouldn't be outside of it. Is this possible using SceneKit?
All parts of what you ask are theoretically possible, but a) for several parts, there’s no integrated API to do things for you, and b) you’re probably signing yourself up for a more difficult problem than you think.
What you presumably have with your Core ML integration is an image classifier, as that’s what most of the easy to find ML models do. Image classification answers one question: “what is this a picture of?”
What you’re looking for involves at least two additional questions:
“Given that this image has been classified as containing (some specific object), where in the 2D image is that object?”
“Given the position of a detected object in the 2D video image, where is it in the 3D space tracked by ARKit?”
Question 1 is pretty reasonable. There are models that do both classification and detection (location/bounds within an image) in the ML community. Probably the best known one is YOLO — here’s a blog post about using it with Core ML.
Question 2 is the “research team and five years” part. You’ll notice in the YOLO papers that it gives you only coarse bounding boxes for detected objects — that is, it’s working in 2D image space, not doing 3D scene reconstruction.
To really know the shape, or even the 3D bounding box of an object means integrating object detection with scene reconstruction. For example, if an object has some height in the 2D image, are you looking at a 3D object that’s tall with a small footprint, or one that’s long and low, receding into the distance? Such integration would require taking apart the inner workings of ARKit, which nobody outside Apple can do, or recreating an ARKit-alike from scratch.
There might be some assumptions you can make to get very rough estimates of 3D shape from a 2D bounding box, though. For example, if you do AR hit tests on the lower corners of a box and find that they’re on a horizontal plane, you can guess that the 2D height of the box is proportional to the 3D height of the object, and that its footprint on the plane is proportional to the box’s width. You’d have to do some research and testing to see if assumptions like that hold up, especially in whatever use cases your app covers.
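The proportionality in that rough estimate is just the pinhole camera model. Here is a small Python sketch of the arithmetic, assuming you already have the 2D box size in pixels, the distance to the hit-test point in meters, and the camera's focal length in pixels (in ARKit these would presumably come from the hit-test result and the camera intrinsics). The function name and the sample numbers are hypothetical, and the result is only meaningful if the object is roughly upright and facing the camera.

```python
from typing import Tuple


def estimate_size_m(box_width_px: float, box_height_px: float,
                    distance_m: float, focal_length_px: float) -> Tuple[float, float]:
    """Pinhole-camera estimate of real-world width and height in meters.

    real_size = pixel_size * distance / focal_length
    """
    if distance_m <= 0 or focal_length_px <= 0:
        raise ValueError("distance and focal length must be positive")
    scale = distance_m / focal_length_px  # meters per pixel at that distance
    return box_width_px * scale, box_height_px * scale


# Made-up example: a 120 x 300 px box, 0.5 m away, focal length 1500 px.
width_m, height_m = estimate_size_m(120, 300, 0.5, 1500)
```

Whether an estimate like this is good enough to keep an overlay inside the bottle is exactly the kind of thing you would have to test.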
I would like to use D3 to build simple charts with literally hundreds of millions of data points.
Obviously, I won't be attempting to plot millions of points at a time. Only a very, very tiny fraction of those points (<1000) would be in view at any given time. I'll download pre-processed data "on-demand" from the server depending on the current view and zoom level, and would like to use D3's built-in zoom and pan behaviors.
Basically, imagine an infinitely wide bar chart that pans back and forth, and alters itself to show the appropriate level of detail depending on the current zoom level (i.e. semantic zoom).
What techniques are available in D3 to achieve this, yet still have it feel responsive and smooth? What should I avoid doing? Are there any examples of this out there?
Examples: Have a look at Fabian Fischer's BankSafe, an award-winning entry to this year's VAST Challenge. Not sure if the code is available, but the report summarising the techniques he used certainly is. The dataset was also in the order of "hundreds of millions" and - if I remember correctly - had a zoom technique similar to the one you describe.
I would highly recommend you look into using canvas over svg. From what I've seen, having thousands of SVG elements doesn't scale particularly well. Microsoft has a pretty good writeup for how to know which to choose: http://msdn.microsoft.com/en-us/library/ie/gg193983(v=vs.85).aspx#Using_Canvas_AndOr_SVG
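For the semantic zoom part of the question, the core logic is independent of D3: pick the aggregation level from the visible range so that the number of bars stays under a budget, then request only the bins in view. Here is a Python sketch of that selection, assuming the server holds pre-aggregated levels whose bin width grows by a fixed factor per level. The function names, the factor of 10 and the budget of 1000 are assumptions for illustration, not anything D3 provides.

```python
import math
from typing import Tuple


def choose_level(visible_span: float, max_points: int = 1000,
                 base_bin: float = 1.0, factor: int = 10) -> int:
    """Smallest level whose bin width keeps the view at or under `max_points` bins."""
    level = 0
    while visible_span / (base_bin * factor ** level) > max_points:
        level += 1
    return level


def request_for_view(x_min: float, x_max: float, max_points: int = 1000,
                     base_bin: float = 1.0, factor: int = 10) -> Tuple[int, int, int]:
    """Return (level, first_bin, last_bin) to fetch for the current view."""
    level = choose_level(x_max - x_min, max_points, base_bin, factor)
    bin_width = base_bin * factor ** level
    first_bin = math.floor(x_min / bin_width)
    last_bin = math.ceil(x_max / bin_width)
    return level, first_bin, last_bin
```

The zoom handler would recompute this on every zoom or pan event and only hit the server when the level or the bin range actually changes, which helps keep panning smooth.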
I'm trying to write a CAD-like application in WPF (.NET 4.0) that needs to be able to display a lot of 2D points/lines. It will be used to display CAD plans of entire cities with zoom, pan, rotate and point snapping on mouseover.
Right now I use pure WPF. I read the objects from the CAD file, draw them into a StreamGeometry, use that as the geometry of a new Path, and add it to a Canvas with several transforms.
My problem is that this solution doesn't scale well enough. It works fine with small CAD files, but when I want to display something like half a city (with houses and land boundaries) it becomes very slow.
I also tried to convert my CAD-file to an image, but
- a resolution of 32000x32000 is sometimes not enough
- when zooming out the lines are too thin.
In the end I need to be able to place this on a Canvas(2D/3D) as background.
What are my best options here?
Thanks,
Niklas
WPF is not good for large 3D models; I'm afraid it is too slow. Your best bet is Direct3D or OpenGL.
However, even with the speed of Direct3D or OpenGL, you will still need to work out how to cull as many polygons/vertices as possible before rendering the scene if you are trying to show an entire city.
There is a large amount of information on this (generally under game development).
There are a few techniques, including frustum culling and near and far plane culling.
Also, since you probably have a static scene, you may be able to use binary space partitioning.
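To illustrate the culling idea for a static 2D plan, here is a Python sketch that uses a uniform grid as the spatial index. To be clear, this is a simpler stand-in and not binary space partitioning, but it serves the same purpose: only objects whose bounding box overlaps the current viewport are handed to the renderer. The class and method names are made up for this example.

```python
import math
from collections import defaultdict
from typing import Dict, Iterator, List, Tuple

Box = Tuple[float, float, float, float]  # min_x, min_y, max_x, max_y


def overlaps(a: Box, b: Box) -> bool:
    """True if two axis-aligned boxes intersect (touching counts)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


class GridIndex:
    """Uniform grid over a static scene; each cell lists the boxes touching it."""

    def __init__(self, cell_size: float) -> None:
        self.cell_size = cell_size
        self.boxes: List[Box] = []
        self.cells: Dict[Tuple[int, int], List[int]] = defaultdict(list)

    def _cells_for(self, box: Box) -> Iterator[Tuple[int, int]]:
        cs = self.cell_size
        for cx in range(math.floor(box[0] / cs), math.floor(box[2] / cs) + 1):
            for cy in range(math.floor(box[1] / cs), math.floor(box[3] / cs) + 1):
                yield cx, cy

    def insert(self, box: Box) -> int:
        """Add a bounding box and return its id."""
        idx = len(self.boxes)
        self.boxes.append(box)
        for cell in self._cells_for(box):
            self.cells[cell].append(idx)
        return idx

    def query(self, view: Box) -> List[int]:
        """Ids of all boxes overlapping the viewport, in insertion order."""
        found = set()
        for cell in self._cells_for(view):
            for idx in self.cells.get(cell, ()):
                if overlaps(self.boxes[idx], view):
                    found.add(idx)
        return sorted(found)
```

For a whole city you would build the index once at load time and call `query` with the visible rectangle after every pan or zoom. When zoomed far out, a query touches many cells, so a coarser grid or a level-of-detail scheme would be needed on top of this.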
As I understand it, the subject is a 2D CAD system in WPF.
Great! I use it...
OpenGL and DirectX apps typically sit in an endless OnDraw loop, so the CPU is working all the time.
WPF/Silverlight 2D uses a smarter, retained-mode model.
Yes, the total number of elements (for example, primitives inherited from Shape) must not be too large. But how many is too many?
I tested my own app (Silverlight). I expect WPF to be a bit faster.
Here are my 2D CAD results. Performance is still great. Each beam consists of multiple primitives.
Use a VirtualCanvas like this one from Chris Lovett.