Google Maps Fusion Tables feasibility - google-app-engine

I am wondering if someone can provide some insight about an approach for Google Maps. Currently I am developing a visualization with the Google Maps API v3. This visualization will map out polygons for country, state, zip code, city, etc., as well as three other marker types (balloon, circle, ...). The data is dynamically driven by an underlying report which can have filters applied and can be drilled to many levels. The biggest problem I am running into is dynamically rendering the polygons. The data necessary to generate a polygon with Google Maps v3 is large, and it also requires a good deal of processing at runtime.
My thought is that, since my visualization will never allow the user to return very large data sets (e.g. all zip codes for the USA), I could employ dynamically created Fusion Tables.
Let's say each run of my report returns 50 states or 50 zip codes, and users can drill from state > zip.
On the first run of the visualization, the user runs a report and it returns the state name and 4 metrics. Would it be possible to dynamically create a Fusion Table based on this information? Would I be able to pass through the 4 metrics and the formatting for all of the different markers to be drawn on the map?
On the second run, the user drills from state to zip code. The report then returns 50 zip codes and 4 metrics. Could the initial table be dropped and another table created to draw the map with the same requirements as above, providing the Fusion Table the zip codes (22054, 55678, ...), the 4 metric values, and the formatting?
Sorry for being long-winded. Even after reading the Fusion Tables documentation I am not 100% certain about this.

Fully-hosted solution
If you can upload the full dataset and get Google to do the drill-down, you could check out the Google Maps Engine platform. It's built to handle big sets of geospatial data, so you don't have to do the heavy lifting.
Product page is here: http://www.google.com/intl/en/enterprise/mapsearth/products/mapsengine.html
API doco here: https://developers.google.com/maps-engine/
Details on hooking your data up with the normal Maps API here: https://developers.google.com/maps/documentation/javascript/mapsenginelayers
Dynamic hosted solution
However, since you want to do this dynamically it's a little trickier. Neither the Fusion Tables API nor the Maps Engine API supports table creation at this point in time, so your best option is to model your data in a consistent schema so you can create your tables (in either platform) ahead of time and use the API to upload and delete data on demand.
For example, you could create a table in Maps Engine ahead of time for each drill-down level (e.g. one for state, one for zip code) and use the batchInsert method to add data at run-time.
If you prefer Fusion Tables, you can use insert or importRows.
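For illustration, a minimal sketch of that second approach (pre-created Fusion Table, rows inserted at run-time) might look like the following. This is an assumption-laden sketch, not a definitive implementation: the table ID, OAuth token, and column names are placeholders, and the v1 SQL query endpoint shown should be verified against the current Fusion Tables documentation.

```javascript
// Rough sketch only: push one report row into a pre-created Fusion Table at run-time.
// tableId, accessToken, and the column names (State, Metric1..4) are placeholders.
function insertReportRow(tableId, accessToken, stateName, metrics) {
  var sql = "INSERT INTO " + tableId +
            " (State, Metric1, Metric2, Metric3, Metric4) VALUES ('" +
            stateName + "', " + metrics.join(", ") + ")";
  var xhr = new XMLHttpRequest();
  // Fusion Tables v1 SQL endpoint (verify against the current docs).
  xhr.open("POST", "https://www.googleapis.com/fusiontables/v1/query?sql=" +
                   encodeURIComponent(sql));
  xhr.setRequestHeader("Authorization", "Bearer " + accessToken);
  xhr.onload = function () { console.log("Insert returned HTTP " + xhr.status); };
  xhr.send();
}
```

The same query endpoint also accepted DELETE statements, which is one way to clear the table when the user drills from the state level to the zip-code level rather than dropping and recreating it.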
Client-side solution
The above solutions are fairly complex & you may be better off generating your shapes using the Maps v3 API drawing features (e.g. simple polygons).
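To make the client-side option concrete, here is a minimal sketch of drawing one region with the core Maps v3 drawing primitives. The path coordinates and the colourForMetric() helper are illustrative placeholders you would drive from your report data.

```javascript
// Minimal client-side alternative: draw one region as a google.maps.Polygon
// and style its fill from the report's metric value.
function drawRegion(map, path, metricValue) {
  var polygon = new google.maps.Polygon({
    paths: path,                              // array of {lat, lng} literals
    strokeColor: "#333333",
    strokeWeight: 1,
    fillColor: colourForMetric(metricValue),  // hypothetical choropleth helper
    fillOpacity: 0.6
  });
  polygon.setMap(map);
  return polygon;
}

// Example usage with a toy triangle:
// drawRegion(map, [{lat: 38.9, lng: -77.0}, {lat: 39.1, lng: -77.2}, {lat: 38.8, lng: -77.3}], 42);
```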
If your data mapping is quite complex, you may find it easier to bind your data to a Google Map using D3.js. There's a good example here. Unfortunately, this does mean investigating yet another API.

Related

Azure maps indoor module: How to access to Indoor Map GeoJSON from Azure Maps Web SDK

I have been working on a project using Azure Indoor Maps, and I started with the Azure Maps Web SDK. I have been looking for a way to loop over all features that are loaded automatically by the SDK, without making a request to the WFS API https://learn.microsoft.com/en-us/rest/api/maps/v2/wfs/get-feature.
Since I can see the map loaded, I think this information should be accessible directly through the SDK and I should not need to make another request. But maybe I am wrong.
I have found a method that does something similar to what I need, getRenderedShapes, but it only returns the features that are visible when the method is called, and I need all the features in the indoor map, or at least on one floor.
Does anybody know if this is possible? On one hand I think it should be something similar to getRenderedShapes; on the other hand, I suspect the front-end only has the visual information, since Azure Indoor Maps uses a vector tile source that is optimized in the back-end and only serves the front-end the required information.
https://learn.microsoft.com/en-us/azure/azure-maps/web-sdk-best-practices#optimize-data-sources
The Web SDK has two data sources:
GeoJSON source: known as the DataSource class, this manages raw location data in GeoJSON format locally. Good for small to medium data sets (upwards of hundreds of thousands of features).
Vector tile source: known as the VectorTileSource class, this loads data formatted as vector tiles for the current map view, based on the map's tiling system. Ideal for large to massive data sets (millions or billions of features).
As you noted, the map SDK only loads indoor maps via vector tiles, which are a condensed form of the data set clipped to the areas of the viewport. This loads only a small subset of the data, which makes it possible to build a large, scalable indoor map platform that in theory could support every building in the world in real time. As you also noted, the getRenderedShapes function can retrieve data from the vector tiles, but only for features in the current viewport (plus a small buffer). I believe the only way to get the full data as GeoJSON is via the WFS GetFeatures service: https://learn.microsoft.com/en-us/rest/api/maps/v2/wfs/get-features
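A rough sketch of the two options discussed above, assuming the Web SDK's atlas.Map instance is in a variable called map. The WFS URL shape, the geography prefix, and the "unit" collection name are assumptions to verify against the GetFeatures reference linked above.

```javascript
// 1) What is currently rendered (viewport + small buffer only):
var renderedShapes = map.layers.getRenderedShapes();

// 2) Everything in the dataset, fetched as GeoJSON from the WFS API.
//    datasetId, the "us" geography prefix, and the "unit" collection are placeholders.
function fetchAllUnits(datasetId, subscriptionKey) {
  var url = "https://us.atlas.microsoft.com/wfs/datasets/" + datasetId +
            "/collections/unit/items?api-version=2.0" +
            "&subscription-key=" + subscriptionKey;
  return fetch(url).then(function (response) {
    return response.json(); // GeoJSON FeatureCollection of the collection's features
  });
}
```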

How do I manage multiple training sets using the Watson NLC Toolkit

From what I see, there's no way to upload multiple training sets to the new Watson NLC tooling. I need to manage separate training sets and their associated classifiers. What am I missing here?
Preferred option: Provision an NLC service instance for each set of training data you'd like to work with and separately access the tooling for each.
Workaround: Currently, the flow for managing multiple training sets in one NLC service instance is as follows:
1. (Optional, to start fresh) Go to the training data page and click the garbage icon to delete all training data.
2. Upload a training set on the training data page using the upload icon.
3. Manipulate the data as necessary: add texts and classes, tag texts with classes, etc.
4. Create a classifier. When you create a classifier, it is essentially a snapshot of your current training data, since you are able to retrieve it later from the classifiers page.
Repeat steps 1-4 as necessary until you have uploaded all of your training data sets and created the corresponding classifiers.
When you want to continue working on a previous training set:
1. Clear your training data (step 1 from above).
2. Go to the classifiers page.
3. Click the download icon for the classifier which contains the training data you'd like to work with.
4. Return to the training data page and upload the file downloaded in step 3.
The best way to manage multiple training sets is to use a different NLC service instance for each training set.
The current beta NLC tooling is not intended to manage separate training sets within a single service instance. For example, the tool makes suggestions when you add texts without classes; these are based on the most recently trained classifier, which won't make sense if that classifier was built from a completely different training set.
The workaround suggested by John Bufe will work if you have a hard limit on the number of NLC services you can use for some reason, e.g. you have reached your limit of Bluemix services. Cost is not a factor here, as additional NLC service instances will not increase the overall price: the monthly charge is for trained classifier instances. For example, if you have four service instances with a single classifier in each, you'll see three charged and one free.
If you want to use the NLC beta tooling to manage your training data, I would recommend using separate NLC services for each training set you require.

Spatial Search Objectify, appengine

I want to use Objectify for spatial search. I have entities that have longitude and latitude associated with them. The latitude and longitude information is dynamic, e.g. service providers (electricians, carpenters) in a city. I want to implement a query that gives me the service providers offering a specific service within a 1 km radius. Searching on Google reveals the following options:
Use Objectify with geohashes - not sure how accurate and scalable this solution is.
Use Google Search - it would need entities (or parts of them) duplicated in the form of documents, and would it be able to support dynamically updated locations?
Use another database such as MongoDB.
Assuming a few million entities with latitude/longitude dynamically updated, please suggest an appropriate option.
thanks
Ittium
I've used geohashes. It works, although you end up selecting more data than the exact bounds you are looking for and then filtering out the extra. This might or might not be a good solution depending on your specific application. It requires writing more code but has fewer moving parts (all in the datastore).
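The over-select-then-filter pattern described above looks roughly like this. It is a language-agnostic sketch in JavaScript rather than actual Objectify code: geohashRangeForRadius() and queryProvidersByGeohashRange() are hypothetical stand-ins for your geohash library and your datastore query, while the distance post-filter is real haversine math.

```javascript
// Great-circle distance in kilometres between two lat/lng points (haversine).
function haversineKm(lat1, lng1, lat2, lng2) {
  var toRad = function (d) { return d * Math.PI / 180; };
  var dLat = toRad(lat2 - lat1);
  var dLng = toRad(lng2 - lng1);
  var a = Math.sin(dLat / 2) * Math.sin(dLat / 2) +
          Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) *
          Math.sin(dLng / 2) * Math.sin(dLng / 2);
  return 6371 * 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a)); // Earth radius ~6371 km
}

function findProvidersNear(lat, lng, radiusKm) {
  // 1) Over-select: every entity whose geohash falls in the prefix range covering
  //    the search circle (a coarser hash means more false positives).
  var range = geohashRangeForRadius(lat, lng, radiusKm);                 // hypothetical
  var candidates = queryProvidersByGeohashRange(range.min, range.max);   // hypothetical
  // 2) Post-filter the extras by exact distance.
  return candidates.filter(function (p) {
    return haversineKm(lat, lng, p.lat, p.lng) <= radiusKm;
  });
}
```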
Google Search and "other database" are basically the same architectural pattern - use the task queue to replicate updates to an external index. If you want a quick solution, the Search service is probably the easiest to wrap your head around.
Just pick one solution and run with it for a while. You can always reindex the data into a different solution.
It really depends on your query rate, but I usually prefer to use Google Search. Building and maintaining the docs is pretty simple, and you get a separate quota to handle these queries.

Confused about Google App Engine and Google Docs options

I want to use Google App Engine to store my data and then query/display/edit it using Google Spreadsheets as the user interface, with multiple concurrent users each having their own view of the data. The problem I have now is that if I put everyone's data in the same Google Spreadsheet that everyone accesses, we can't each sort and filter at the same time.
Is there a way to do this, and is it a good idea to build a simple system this way? I'll eventually need to query a series of Google Word Processor documents as well.
Can someone point me in the right direction on this or suggest other options?
I would ask what the advantage of doing something like this is, as opposed to, say, hosting your application on Google App Engine and building a JavaScript front end with grids to help sort, filter, and view data.
Anyway, to answer your questions: you can build your interface over Google Spreadsheets using Google Apps Script. This will allow you to do things like authenticate your user and query, update, and display data. If you merely want to display data, it turns out that Google Spreadsheets has some built-in functions for that.
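A minimal Apps Script sketch of that idea is below. It assumes a hypothetical GAE endpoint at https://your-app.appspot.com/api/rows returning a JSON array of rows and a sheet named "Data"; both the URL and the payload shape are placeholders.

```javascript
// Google Apps Script sketch: pull rows from a (hypothetical) App Engine endpoint
// and write them into a sheet, so each user can sort/filter their own copy.
function refreshFromAppEngine() {
  var response = UrlFetchApp.fetch("https://your-app.appspot.com/api/rows"); // placeholder URL
  var rows = JSON.parse(response.getContentText()); // e.g. [["Name", "Value"], ["A", 1], ...]
  var sheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName("Data");
  sheet.clearContents();
  sheet.getRange(1, 1, rows.length, rows[0].length).setValues(rows);
}
```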
Regarding consistency, you should read up on GAE's Datastore as well as features like transactions. The Datastore is not an RDBMS; it is an object database which stores objects against keys. Again, this is something to consider if you are planning to do a lot of data analysis and computation (summations, aggregations).
Overall I would recommend doing a rough design of your system without fixing on particular technologies (like GAE, and Google Spreadsheets). Once you identify what your key goals are for your application, then you can figure out which technologies and resources would make the most sense within your budget.

Custom Database integration with MOSS 2007

Hopefully someone has been down this road before and can offer some sound advice as far as which direction I should take. I am currently involved in a project in which we will be utilizing a custom database to store data extracted from Excel files based on pre-established templates (to maintain consistency). We currently have a process (written in C#.NET 2008) that can extract the necessary data from the spreadsheets and import it into our custom database.
What I am primarily interested in is figuring out the best method for integrating that process with our portal. What I would like to do is let SharePoint keep track of the metadata about the spreadsheet itself and let the custom database keep track of the data contained within the spreadsheet. So, one thing I need is a way to link spreadsheets from SharePoint to the custom database and vice versa. As these spreadsheets will be updated periodically, I need a tried and true way of ensuring that the data remains synchronized between SharePoint and the custom database. I am also interested in finding out how to use the data from the custom database to create reports within the SharePoint portal. Any and all information will be greatly appreciated.
I have actually written a similar system in SharePoint for a large Financial institution as well.
The way we approached it was to have an event receiver on the Document library. Whenever a file was uploaded or updated the event receiver was triggered and we parsed through the data using Aspose.Cells.
The key to matching data in the excel sheet with the data in the database was a small header in a hidden sheet that contained information about the reporting period and data type. You could also use the SharePoint Item's unique ID as a key or the file's full path. It all depends a bit on how the system will be used and your exact requirements.
I think this might be awkward. The Business Data Catalog (BDC) functionality will enable you to tightly integrate with your database, but simultaneously trying to remain perpetually in sync with a separate spreadsheet might be tricky. I guess you could do it by catching the update events for the document library that handles the spreadsheets themselves and subsequently pushing the right info into your database. If you're going to do that, though, it's not clear to me why you can't choose just one or the other:
Spreadsheets in a document library, or
BDC integration with your database
If you go with #1, then you still have the ability to search within the documents themselves and updating them is painless. If you go with #2, you don't have to worry about sync'ing with an actual sheet after the initial load, and you could (for example) create forms as needed to allow people to modify the data.
Also, depending on your use case, you might benefit from the MOSS server-side Excel services. I think the "right" decision here might require more information about how you and your team expect to interact with these sheets and this data after it's initially uploaded into your SharePoint world.
So... I'm going to assume that you are leveraging Excel because it is an easy way to define, build, and test the math required. Your spreadsheet has a set of input data elements, a bunch of math, and then some output elements. Have you considered using Excel Services? In this scenario you would avoid running a batch process to generate your output elements; instead, you can call Excel Services directly from SharePoint and run through your calculations. More information is available online.
You can also surface information in SharePoint directly from the spreadsheet. For example, if you have a graph in the spreadsheet, you can link to that graph and expose it. When the data changes, so does the graph.
There are also some High Performance Computing (HPC) Excel options coming out from Microsoft in the near future. If your spreadsheet is really, really big then the Excel Services route might not work. There is some information available online (search for HPC excel - I can't post the link).
