How to store location data to speed up distance calculation? - database

I am building an application that fetches a list of all the hospitals within a certain radius of the user's location. Suppose our database holds every hospital on the planet with its GPS location: do we need to compute the distance from the user to each one of them and then list the hospitals that are closer than the prescribed radius? With multiple users accessing our application concurrently, this may crash our servers. Is there a way to optimize this?
How should I store the location data of the hospitals such that the p

Although your question does not seem quite finished, I recommend taking a look at R-trees as your data structure of choice:
http://en.wikipedia.org/wiki/R-tree
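The idea behind any spatial index is to discard most candidates cheaply before computing exact distances. The sketch below is not an R-tree; it uses a simpler fixed grid of buckets to illustrate the same pruning step. The names `GridIndex`, `CELL_DEG` and the sample coordinates are assumptions made for this example, and the code ignores wrap-around at the antimeridian and behaviour near the poles.

```python
import math
from collections import defaultdict

EARTH_RADIUS_KM = 6371.0
CELL_DEG = 1.0  # grid cell size in degrees (assumed; tune to your typical radius)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


class GridIndex:
    """Buckets points by grid cell so a radius query only scans nearby cells."""

    def __init__(self):
        self.cells = defaultdict(list)

    @staticmethod
    def _cell(value):
        return math.floor(value / CELL_DEG)

    def add(self, name, lat, lon):
        self.cells[(self._cell(lat), self._cell(lon))].append((name, lat, lon))

    def within(self, lat, lon, radius_km):
        # Bounding box of the search circle, in degrees.
        dlat = math.degrees(radius_km / EARTH_RADIUS_KM)
        # Longitude degrees shrink with latitude; use the poleward edge to stay conservative.
        edge_lat = min(abs(lat) + dlat, 89.0)
        dlon = dlat / math.cos(math.radians(edge_lat))
        hits = []
        for ci in range(self._cell(lat - dlat), self._cell(lat + dlat) + 1):
            for cj in range(self._cell(lon - dlon), self._cell(lon + dlon) + 1):
                for name, plat, plon in self.cells.get((ci, cj), ()):
                    d = haversine_km(lat, lon, plat, plon)
                    if d <= radius_km:  # exact check only for the few candidates left
                        hits.append((name, d))
        return sorted(hits, key=lambda h: h[1])


index = GridIndex()
index.add("near", 52.53, 13.38)   # illustrative coordinates
index.add("far", 48.14, 11.58)
nearby = index.within(52.52, 13.405, radius_km=10)
```

In a real database the same effect usually comes from a built-in spatial index rather than hand-written code; the point here is only that the exact distance is computed for a handful of candidates instead of every stored hospital.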

Related

AnyLogic: Set Agents to Nodes on a GIS Map

I am working in a project team and we want to create a kind of digital twin showing all the logistical streams within a city. We have therefore already implemented all the stores, pharmacies, doctors, etc. as individual GIS points. In the next step we want to assign certain agents to the GIS points.
To do this we tried to create a population of agents, which we thought we could later connect to the correct nodes representing the addresses within the GIS map. That is where our problem occurred: we were only able to select one node, although we set our population of agents to 10. Is there a trick to solve this problem, or do we have to abandon our approach and instead consolidate all of the addresses in an Excel sheet that we can use as a database for our agents?
Thanks a lot in advance :)
For the node's initial location you can use a function that returns a GISPoint.
In this function you define the rules for placing the agents, either randomly across the nodes or in whatever way you want.
For example, if the function does this:
if (randomTrue(0.5)) return gisPoint;
else return gisPoint1;
then all the agents would be placed randomly on either gisPoint or gisPoint1.

Data organisation in graph database

This is more of a logical question rather than technical one. I am asking for data organisation guidance for my requirements. Please keep in mind that I am willing to use a graph database for this purpose (though I am pretty new at that). So guidance in graph database context would be much appreciated.
Let me provide an overview of the scenario. There are two entities in the app, User and House. A user can own a house or rent a house. If a user rents a house, the time period for which the user has rented it should be recorded. A user may rent the same house for different periods.
Demo Dataset:
A (User) -owns-> H1, H2, H3 (House) - one-liner for brevity
X -rents-> H2 (start=DATE1, end=DATE2)
Y -rents-> H2 (start=DATE3, end=DATE4)
X -rents-> H2 (start=DATE5, end=DATE6) - user rents same house again
I am assuming that User and House would be nodes and owns and rents would be edges. Rent period would be properties of rents edges. Please point out if there is any better way.
Questions:
Is it possible in graph databases in general to have multiple edges of the same type between two nodes? Should I keep just one rents edge from a specific user to a specific house and add periods to it? Or should I maintain multiple edges for multiple periods?
Is it possible to query for something like: "fetch all the houses that were empty for a period of 3 months"? This should fetch the houses that have a gap of 3 months between one rental's end date and the next rental's start date. These houses may not be empty now.
I have checked neo4j, cayley, dgraph etc. Which may be better with this scenario?
Any guidance of how I should keep the data with relationships would be much appreciated. Have a nice day.
I think this may be solved, but I would just add that TerminusDB may be useful to assess as part of your process. The reason that I say this is that you are:
TerminusDB uses an expressive logical schema language that allows anything that is logically expressible. So you could have multiple edges of same type between two nodes. However, data modeling is something of an art - as your question suggests - so it will depend on your domain. (I always think that 'deal' could be an edge or a node depending on context - you can participate in a deal or you could strike a deal with another party).
As TerminusDB is a native revision control database, time bound queries can be relatively straightforward. You can get a delta, or a series of deltas, between two events.
There could be a better answer than this, but I am posting my experience with graph databases on the given requirement in case it is of any help to you.
I think a graph DB is a good fit for your requirement, so here are my answers to your questions.
It is mostly a matter of designing your graph model to suit the purpose, and I think you can have multiple rent edges with different periods from a user node to a house node.
That way you can maintain the history, and you can later delete the older/expired period edges if you want.
[Just to avoid duplicates] You need to make sure that an edge is created between the nodes (user and house) only if the period slot is free.
You can add that logic to the query that creates the edge between the nodes.
With the given demo data set, here is the sample graph I have created based on the scenario you have described.
http://console.neo4j.org/?id=bxu3sp
Click on the console link above and run the Cypher query below in the query window at the bottom:
MATCH (user:User)-[rent:RENT]->(house:House)
WITH house, rent
ORDER BY rent.startDate
WITH collect(rent) as rents, house
UNWIND range(0, size(rents)-2) as index
WITH rents, index, house
WHERE duration.inDays(date(rents[index].endDate), date(rents[index+1].startDate)).days > 30
RETURN DISTINCT house
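This returns the houses that had no tenant for longer than the given period between two consecutive rentals. The query uses a 30-day threshold; adjust it to the gap you need (for example, about 90 days for three months).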
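The same gap check can be written outside the database, which may help when reasoning about what the query should return. This is a minimal sketch, assuming rentals are available as `(house, start, end)` tuples; the function name `houses_with_gap` and the sample dates are made up for the example. Unlike the query, it tracks the latest end date seen so far, so overlapping rentals do not produce false gaps.

```python
from collections import defaultdict
from datetime import date


def houses_with_gap(rentals, min_gap_days):
    """Return houses with more than min_gap_days between one rental's end
    and the next rental's start.

    rentals: iterable of (house, start_date, end_date) tuples.
    """
    by_house = defaultdict(list)
    for house, start, end in rentals:
        by_house[house].append((start, end))

    result = []
    for house, periods in by_house.items():
        periods.sort()                    # order by start date
        latest_end = periods[0][1]
        for start, end in periods[1:]:
            if (start - latest_end).days > min_gap_days:
                result.append(house)
                break                     # one qualifying gap is enough
            latest_end = max(latest_end, end)
    return result


# Hypothetical data mirroring the demo dataset: X, Y, X rent H2 in turn.
rentals = [
    ("H2", date(2020, 1, 1), date(2020, 3, 31)),
    ("H2", date(2020, 4, 15), date(2020, 6, 30)),
    ("H2", date(2020, 11, 1), date(2020, 12, 31)),
    ("H1", date(2020, 1, 1), date(2020, 6, 30)),
    ("H1", date(2020, 7, 10), date(2020, 12, 31)),
]
empty_for_three_months = houses_with_gap(rentals, min_gap_days=90)
```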
I'm not an expert and have never used anything other than neo4j. In my experience so far, its documentation is really good and it is powerful, with additions like Kafka integrations, GraphQL, Halin monitoring, APOC, etc.
I'd say there is a learning curve; just explore and play around with it to get yourself into the graph DB world.
Update:
In case the same user rents the same house for different periods, the graph would look something like the one below. As said, you should avoid creating duplicates by not allowing edges with the same or overlapping periods between a user node and a house node. In this graph I have created edges with different, non-overlapping start/end dates, so they are valid and not duplicates.

How to handle lots of unchanging backing data that is kind of unrelated to my application

Background
I'm creating a layered .NET Core application to handle tracking campaigns for a board game. Because of this there is a lot of data that comes from the game itself, for example:
Characters
Weapons
Equipment
Missions
Objectives that belong to a mission
Rewards that belong to objectives
Etc
The application is not to manipulate this data. This data is typically printed on cards that come with the board game so it won't change. The only changes it may have are when I manually add new characters or something due to a new expansion being released.
As far as the app is concerned, these are similar to how you might have a lookup table of States in the US. The app needs to list them so you can select them, entities in the domain hold references to them, but their actual data is irrelevant to the application itself. It's just lookup data.
Except there is a lot of this data and some of it is related. For example an objective belongs to a specific mission and a reward belongs to a specific objective.
The Problem
If my application was being designed to manage this data there would be no problem. However this is not the case. It is designed to manage "Campaigns", which are 2-5 players sitting down to play a game with these cards. It is managing "instances" of this data that have additional properties.
For example a new campaign is created and a row is added to the Campaign table. Now a mission must be added to it.
I can't just add a reference to the Mission data because I also need to store the outcome of the mission specific to this campaign. So I create a CampaignMission entity that references the mission data, the campaign id, and has a column for the mission outcome.
But that Mission data had related Objective data. The data just holds things like objective name, description, rewards etc, but in the campaign I also need to store the outcome of this objective specific to this CampaignMission. So again I create a CampaignObjective that references the Objective data, the CampaignMission, and has a column for the objective outcome.
Before you know it I am doing this for everything. CampaignCharacter, CampaignWeapon, CampaignReward. I feel like I'm just replicating the structure of the game data, relationships included.
Where the game data has relationships, my Campaign entities feel like they're mirroring the relationships to the point where, from the same object, you can access the same piece of game data by following two separate paths, the original game data relationship or the Campaign entity "replica" relationship.
For example if I want the name of the first reward for the first objective in the first campaign mission, you can access it in two ways:
Campaign.CampaignMissions[0].Mission.Objectives[0].Rewards[0].Name
Campaign.CampaignMissions[0].CampaignObjectives[0].CampaignRewards[0].Reward.Name
Both of these point to the same piece of game data. I really feel like there should only be one path:
Campaign.Missions[0].Objectives[0].Rewards[0].Name
Where I'm Stuck
I'm not sure if this is normal but it all just feels wrong. Almost as though the game data shouldn't even be part of the application. I mean the game data could be hosted on some 3rd party API and it wouldn't make any difference to my actual application. It's just data I need to read but I feel it's impacting my app structure in ways it shouldn't be.
My application doesn't really need to know the difference between Mission game data and a Mission in a campaign. All it needs to care about is that a campaign can have missions, and those missions have a name etc and an outcome. It doesn't feel like the Mission game data itself needs to be an entity in my domain.
What I've Tried
I tried keeping single entities in my domain and keeping them separate in my database. So for example a Mission in the domain would include both the game data fields like mission name, the mission outcome and a list of domain Objectives.
When a domain Mission for a campaign is requested from the data layer, the entry is retrieved from the CampaignMission table, along with its game data from the Mission table, then flattened via AutoMapper and returned to the domain as a single Mission entity containing everything.
This just caused a bit of a nightmare with Entity Framework and handling the mappings back and forth between data layer and domain because the CampaignMission in the database also had CampaignObjectives which linked to Objectives that also had to be flattened etc, and I had to keep track of the primary keys for all of these throughout my domain so everything could be unflattened and mapped back again when I want to persist something. It just didn't make sense, in terms of tracking primary keys/identity, for a single domain entity to be represented by entries in multiple tables.
What I'm Now Considering
I'm considering just moving all of the game data into a totally separate project, completely unrelated to my application. My application could then query that project as though it were some third-party API and get any data it needs, and I can keep it all out of my solution.
Since the game data would no longer have IDs in my application, when I add a mission to a campaign it would simply have a column for "name" which would hold the mission name. When I want to use that mission I would grab it from the db and map it to a domain entity, so at this point it contains the campaign-specific data such as mission outcome, and also the name. Then I'd query the game data project using the mission name and map all the returned data back on to the entity as well, leaving me with a complete entity.
This is essentially replicating the behaviour of what I already tried but removing the need to track identity for the game data by simply using a name that I can query. It removes the concept of backing game data from my domain and leaves me with a single entity, Mission.
The Question
I've wasted a lot of time on this so far and I'm sure it must be a common problem in similar types of applications. I was wondering if anyone had a better solution for dealing with this kind of situation before I go ahead and try completely separating the data.
I have to admit, typing out the "What I'm Now Considering" section has clarified a few things for myself but I would still love to hear if there is a better way.
Thank you in advance if anyone reads all of this.
Here's what you should be doing. First, add the game data entities to the DbContext as DbQuery<T>:
public DbQuery<Mission> Missions { get; set; }
This will allow you to query it, but will not allow changes. Then, since the game data is static, you might want to actually just persist it on a singleton, which you can then inject where you need it.
In either case, on the actual campaign data that's being persisted, you should only store the id of the game data concept. For example, MissionId, not CampaignMission.Mission. When you need the actual Mission info, just look it up based on the MissionId, either directly from your DbQuery<Mission> property on your context or from your singleton class.
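The "store only the id, look up the rest" idea is independent of Entity Framework. Here is a minimal, language-neutral sketch in Python; the class names, the `MISSIONS` catalog and the sample mission names are all hypothetical stand-ins for the read-only game data and the persisted campaign rows.

```python
from dataclasses import dataclass
from types import MappingProxyType
from typing import Optional


@dataclass(frozen=True)
class Mission:
    """Static game data: never modified by the application."""
    id: int
    name: str


# Read-only catalog, loaded once at startup (plays the role of the singleton).
MISSIONS = MappingProxyType({
    m.id: m for m in (Mission(1, "First Mission"), Mission(2, "Second Mission"))
})


@dataclass
class CampaignMission:
    """Persisted campaign data: holds only the id of the game data it refers to."""
    campaign_id: int
    mission_id: int
    outcome: Optional[str] = None

    @property
    def name(self) -> str:
        # Game data is looked up on demand instead of being mapped as a relationship.
        return MISSIONS[self.mission_id].name


played = CampaignMission(campaign_id=10, mission_id=2, outcome="won")
```

With this shape there is only one path from a campaign mission to its name, and the campaign side never needs to mirror the relationships that exist inside the game data.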

Database solution for route matching

I'm working on an application that lets users search for trips from point A to point B.
It needs to solve the following use cases:
find trips that go from point A to point B
find trips that start at some other point, but go through point A to point B
I'm now looking for a database solution that would best support such use cases.
For now we are using MongoDB, but I had to figure out a workaround for the first use case and I have a feeling that it's not possible to solve the second use case with it.
It seems to me that all the available NoSQL DBs that support spatial features allow only one geospatial index on a document, node, etc. This is fine for queries like "show me all shops within a radius of 5 km from this point" and the like.
So I'm looking for a solution that could solve both use cases. Is there something like that available?
pgRouting could indeed be used. The first solution that comes to mind: when the first user has entered New York and Columbus as the source and destination of their trip, perform a routing query and store the path as a PostGIS linestring geometry.
When a second user enters From: Pittsburgh To: Columbus into the search form, geocode the city names to locations and run PostGIS queries to find how far those points (or city boundaries) are from the first user's route. If they are close enough and the first user is driving in a suitable direction, they could share a car.
Second idea: after the first user has entered the trip details, perform a routing query and store the names of all places the route passes through in the database.
Both solutions could easily be implemented with Postgres+PostGIS+pgRouting. The biggest disadvantage of pgRouting is its low speed (it's possible to improve performance by reducing the data in the routing graph; routing speed may also not be so important, etc.). It's also possible to export the road data to external files, use a high-speed routing engine (like OSRM, MoNav, etc.) and, if necessary, write the result back to PostGIS. But this definitely requires much more effort.
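To make the first idea concrete, here is a rough sketch of the "close enough and in a suitable direction" check done by hand in Python rather than in PostGIS. It assumes the stored route is a list of `(lat, lon)` points and uses a flat-earth approximation around the route's mean latitude, which is only reasonable for coarse matching. The names `locate` and `can_share`, the detour threshold and the coordinates are illustrative assumptions, not part of any library.

```python
import math

EARTH_RADIUS_KM = 6371.0


def _to_xy(lat, lon, ref_lat):
    """Project lat/lon to approximate planar kilometres (equirectangular)."""
    x = math.radians(lon) * math.cos(math.radians(ref_lat)) * EARTH_RADIUS_KM
    y = math.radians(lat) * EARTH_RADIUS_KM
    return x, y


def locate(point, route):
    """Return (distance to the route in km, position along the route in km)."""
    ref_lat = sum(lat for lat, _ in route) / len(route)
    px, py = _to_xy(point[0], point[1], ref_lat)
    pts = [_to_xy(lat, lon, ref_lat) for lat, lon in route]
    best_dist, best_pos, walked = math.inf, 0.0, 0.0
    for (ax, ay), (bx, by) in zip(pts, pts[1:]):
        dx, dy = bx - ax, by - ay
        seg_len = math.hypot(dx, dy)
        # Parameter of the closest point on this segment, clamped to its ends.
        t = 0.0 if seg_len == 0 else max(
            0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len ** 2))
        d = math.hypot(px - (ax + t * dx), py - (ay + t * dy))
        if d < best_dist:
            best_dist, best_pos = d, walked + t * seg_len
        walked += seg_len
    return best_dist, best_pos


def can_share(route, pickup, dropoff, max_detour_km):
    """True if both points lie near the route and pickup comes before dropoff."""
    d1, pos1 = locate(pickup, route)
    d2, pos2 = locate(dropoff, route)
    return d1 <= max_detour_km and d2 <= max_detour_km and pos1 < pos2


# Very coarse route: New York -> Pittsburgh area -> Columbus (approximate coordinates).
route = [(40.71, -74.01), (40.44, -80.00), (39.96, -83.00)]
match = can_share(route, (40.44, -79.99), (39.96, -83.00), max_detour_km=20.0)
```

Comparing positions along the route is what handles the direction requirement: a request from Columbus to Pittsburgh lies near the same line but fails the ordering test.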
Also, if you choose to avoid the database route (no pun intended), you could use the GeoTools graph library for Java.
http://docs.geotools.org/latest/userguide/extension/graph/index.html
Here is some example code and data I produced myself to demonstrate how it can be used.
http://usefulpracticalgeoblog.blogspot.ch/2012/09/geotools-routing.html
It is pretty flexible in terms of the spatial data formats that can be used to build the street network graph, and how the results can be outputted.
Then to find if the starting point of trip B is close to the pre-calculated route for Trip A, you could use JTS (Java Topology Suite), which is part of the GeoTools library. Here is an example of the analysis you might use.
https://gis.stackexchange.com/questions/7699/for-a-given-feature-find-the-closest-point-along-a-given-path
Postgresql with postgis and pgrouting. You need nothing else.

Is the Cell ID stored in the HLR database, and how to get the location of a Cell ID?

Is it easy to get the current list of mobile phone users under a particular tower (Cell ID) from the Home Location Register? Do network operators or service providers have a mapping from a particular cell to location information such as latitude and longitude?
To get the actual Cell ID of a mobile phone from the HLR/VLR, you should use the SS7 signaling commands sendRoutingInfoForSM and provideSubscriberInfo.
To get the latitude/longitude of a Cell ID, you can use services like opencellid.org or locationapi.org.
You will not get this information from the HLR.
Obviously operators know where their own cells are but it's often considered business intelligence and not public information.
There are some commercial and non-commercial lists of cell towers and their GPS locations, but that information may not be updated in real time and may mislead you, because MNOs continuously carry out operational tasks such as moving cell stations to other locations, installing them or decommissioning them.
There are several ways to get the current Cell ID of a mobile number, but all of them require integration with the MNO's equipment, using either the SS7 protocol suite or other available proprietary interfaces:
Sending an AnyTimeInterrogation MAP operation to the HLR and processing its response
Getting Location Update dumps by means of a core-network-specific interface, or by port mirroring on the L3 switches of the HLR for all Sigtran links between the HLR and the MSSs
Using the MNO's existing SMLC platform
But this would not be enough to build an up-to-date geolocation database. In any case you would need to get the Cell ID/LAC/GPS location list regularly, as an exportable dump, from the MNO you want to work with.
