I once read about an open source database for postal codes with geolocation data but now I can't remember its name. Can someone help?
My guess is that your lost friend is GeoNames.
Find postal code data for all countries at GeoNames, or run a ready-parsed MySQL script (INSERT statements):
Download delivery_zip.zip (11 MB)
You're looking for an open GIS database; the main keyword here is "GIS", which will help you find results.
The information you're looking for is usually commercial-grade (i.e., you need to pay for the data), but you can see what's available on the open-source GIS websites:
http://www.osgeo.org/
http://opensourcegis.org/
http://freegis.org/
I'm currently using this one: https://www.back4app.com/database/back4app/zip-codes-all-countries-in-the-world. It's free and very useful!
You can also contribute to this dataset to improve it even more. :)
Yes, GeoNames is the place. To be more specific, as slacy said:
https://download.geonames.org/export/zip/
It has postal code files for individual countries as well as an allCountries zip file.
The data is updated daily.
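If you want to consume that export programmatically, here is a minimal Python sketch that downloads and iterates over it; the URL and the allCountries.txt member name come from the GeoNames export page, while everything else (including the lack of error handling) is just illustrative:

    # Minimal sketch: download GeoNames' allCountries postal code export
    # and walk its tab-separated lines (country code, postal code, place
    # name, ... per the readme on the export page).
    import io
    import urllib.request
    import zipfile

    URL = "https://download.geonames.org/export/zip/allCountries.zip"

    with urllib.request.urlopen(URL) as resp:
        archive = zipfile.ZipFile(io.BytesIO(resp.read()))

    with archive.open("allCountries.txt") as f:
        for raw in f:
            fields = raw.decode("utf-8").rstrip("\n").split("\t")
            country, postal_code, place = fields[0], fields[1], fields[2]
            # ... load into your own database here ...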
Check out the postal code database. It's low cost, with free updates included for over a year.
There is a video demonstration of the database and the supported fields on YouTube:
https://www.youtube.com/watch?v=PzjHzMDKyYw
I want to extract information about a particular person from a document that may contain information about many people. Statements like "he works for XYZ COMPANY" should also be attributed to that particular person, and nicknames should be handled as well.
I have tried using NLTK and spaCy and have managed to extract entities from the document. I am not sure how to proceed.
Try using a more complete NLP library; Stanford CoreNLP may help you, not least because its coreference resolution can link mentions like "he" back to a person: https://stanfordnlp.github.io/CoreNLP/
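To make the extraction step concrete, here is a minimal spaCy sketch of the part the asker already has working; the model name and sample text are placeholders, and linking "He" back to the person is exactly the coreference step that plain NER does not do:

    # Minimal spaCy sketch: extract named entities from a document.
    # Assumes the small English model is installed first:
    #   python -m spacy download en_core_web_sm
    import spacy

    nlp = spacy.load("en_core_web_sm")
    text = "John Smith lives in Boston. He works for XYZ COMPANY."
    doc = nlp(text)

    for ent in doc.ents:
        print(ent.text, ent.label_)  # e.g. "John Smith PERSON"

    # Attributing "He works for XYZ COMPANY" to John Smith requires
    # coreference resolution, e.g. via CoreNLP or a spaCy add-on.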
I'm trying to set up Solr so that it understands English. For example, I've indexed our company website (www.biginfolabs.com), but it could be any other website or our own data.
If I put in some English-like queries, I should get a one-word answer, just like Google does. Example queries:
Where is India located?
Who is the father of Obama?
What I've tried so far:
I integrated UIMA and Mahout with Solr (person name and city name extraction is done).
I read the book "Taming Text" and implemented https://github.com/tamingtext/book, but it did not give me what I want.
Can anyone please tell me how to move forward? It can be anything; our team is ready to do it.
This task is called Named Entity Recognition (NER). You can look up this tutorial to see how they use Solr for extracting atomic elements of text into predefined categories such as the names of persons, organizations, locations, expressions of time, quantities, monetary values, percentages, etc., and then learning a model to answer queries.
Also have a look at Stanford NLP for more ideas on algorithms that you can use.
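As a rough illustration of how extracted entities could then be indexed and queried in Solr, here is a sketch using the pysolr client; the core name, field names, and sample document are all assumptions, and your Solr schema would need matching fields:

    # Sketch: index entity-annotated documents into Solr, then query by
    # entity. Assumes a local Solr core named "qa" with "text" and
    # "entities" fields; all names here are placeholders.
    import pysolr

    solr = pysolr.Solr("http://localhost:8983/solr/qa", always_commit=True)

    solr.add([{
        "id": "doc1",
        "text": "New Delhi is the capital of India.",
        "entities": ["New Delhi", "India"],
    }])

    # A question like "Where is India located?" could be reduced to a
    # keyword query over the entity field.
    for result in solr.search("entities:India"):
        print(result["text"])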
I want a .csv list, MySQL database, or any other list of all U.S. states and cities, including which cities are in which state. From this list I will generate a MySQL database with the following tables:
    CREATE TABLE states (
        id INT AUTO_INCREMENT PRIMARY KEY,
        name VARCHAR(255)
    );

    CREATE TABLE cities (
        id INT AUTO_INCREMENT PRIMARY KEY,
        stateId INT,  -- id of the state from the states table to which this city belongs
        name VARCHAR(255),
        FOREIGN KEY (stateId) REFERENCES states(id)
    );
Thanks in advance.
You can get city/state information in tab-separated value format from GeoNames.org. The data is free, comprehensive and well structured. For US data, grab the US.txt file at the free postal code data page. The readme.txt file on that page describes the format.
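If it helps, here is a minimal Python sketch that reduces US.txt to the state/city pairs the question asks for; the column positions (place name in the third column, state name in the fourth) follow the readme.txt description, and the file path is an assumption:

    # Sketch: collect city names per state from GeoNames' US.txt.
    # Columns per readme.txt: 0 country code, 1 postal code,
    # 2 place name, 3 admin name1 (the state), ...
    import csv

    cities_by_state = {}
    with open("US.txt", encoding="utf-8", newline="") as f:
        for row in csv.reader(f, delimiter="\t"):
            state, city = row[3], row[2]
            if state:
                cities_by_state.setdefault(state, set()).add(city)

    for state in sorted(cities_by_state):
        print(state, len(cities_by_state[state]))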
I spent a while looking for such a file and ended up building one myself. You can get it here:
https://github.com/grammakov/us_cities_and_states/tree/master
Check out the MySQL world sample database. This database is used in the MySQL documentation as a sample to test queries on.
It already has the city table you are looking for.
Are you ready to pay for this content?
If YES, then you can find it at uscities.trumpetmarketing.net
I have also seen this information provided along with some programming books, especially ones dealing with .NET database programming. Let me check my library and get back to you on this.
You can also refer to the following:
http://www.world-gazetteer.com/wg.php?x=1129163518&men=stdl&lng=en&gln=xx&dat=32&srt=npan&col=aohdq
http://www.geobytes.com/FreeServices.htm
Please don't bother voting for this answer. There is no information here that cannot be obtained via a simple Google search!
Someone has posted a list here:
http://mhinze.com/archive/list-of-us-cities-all-city-names-and-states-regex-groups/
I use the us city and county database for this purpose and I just checked that it got updated in August. They claim to include 198,703 populated places (a GNIS term for a city or village). I see you need full state names and these names are included in a free database called us state list.
Both of them are CSV files, and they provide very detailed instructions on how to import them into both local and remote MySQL servers. You can join them in a SELECT statement to pull records with full state names for your needs.
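To illustrate that join in a self-contained way, here is a Python sketch using sqlite3; the file names and column layouts are assumptions, so adjust them to the actual CSV headers:

    # Sketch: join a city CSV against a state-name CSV.
    # Assumed layouts: cities.csv has (city, state_abbr) columns and
    # states.csv has (abbr, name) columns -- adjust to the real headers.
    import csv
    import sqlite3

    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE cities (city TEXT, state_abbr TEXT)")
    con.execute("CREATE TABLE states (abbr TEXT, name TEXT)")

    with open("cities.csv", newline="") as f:
        con.executemany("INSERT INTO cities VALUES (?, ?)",
                        ((r["city"], r["state_abbr"])
                         for r in csv.DictReader(f)))

    with open("states.csv", newline="") as f:
        con.executemany("INSERT INTO states VALUES (?, ?)",
                        ((r["abbr"], r["name"])
                         for r in csv.DictReader(f)))

    query = """SELECT c.city, s.name
               FROM cities c JOIN states s ON c.state_abbr = s.abbr"""
    for city, state in con.execute(query):
        print(city, state)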
You can find CSV, SQL, and HTML formats at the website below. They have cities and states for some countries, including the USA.
http://www.countrystatecity.com/. They keep updating the site and are doing a good job. Hope this helps other developers as well.
For the USA, check the link below:
http://www.countrystatecity.com/USAStatesCities.php
That's a tall order. Consider creating one by scraping the links off this page:
WP: List of cities, towns, and villages in the US. It is much simpler if you scrape the wiki markup code rather than the raw HTML.
It will require some skill with regexes, or at least with parsers, but it should be doable.
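For instance, here is a rough Python sketch of pulling the raw wiki markup and regexing out the [[...]] link targets; the page title is only an example, and the real list pages are often split into per-state sub-pages:

    # Sketch: fetch a Wikipedia page's raw wiki markup and extract
    # [[Target]] / [[Target|label]] link targets.
    import re
    import urllib.parse
    import urllib.request

    title = "List of cities and towns in California"  # placeholder title
    url = ("https://en.wikipedia.org/w/index.php?action=raw&title="
           + urllib.parse.quote(title))
    req = urllib.request.Request(
        url, headers={"User-Agent": "city-list-sketch/0.1"})

    with urllib.request.urlopen(req) as resp:
        wikitext = resp.read().decode("utf-8")

    for target in re.findall(r"\[\[([^\]|#]+)", wikitext):
        print(target)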
This helped me a great deal: http://www.farinspace.com/us-cities-and-state-sql-dump/
It has a .zip file with three .sql files which you can just run in SSMS.
You may need to replace some of the weirdly encoded single quotes with double quotes.
Are there any good technical solutions for extremely long term archiving of data, for example for 25 to 100 years?
Somehow I just don't have a lot of confidence that a SQL 2000 backup file will be usable in court cases or for historians in 25 to 100 years.
This is a customer requirement, not just speculation.
This is comparable to trying to do something useful with a backup from an ENIAC, or reading AtariWriter word-processing files. The hardware doesn't necessarily exist anymore, the storage media is likely corrupt, the professionals who knew how to use the technology probably don't exist anymore, etc.
Actually, printing on acid-free paper is probably a much better solution than any more advanced technological one. It is much more likely that the IT of 100 years from now will be able to high-speed scan and load printed pages than to read digital data storage that depends on 100-year-old media-access hardware, 100-year-old disk/file format standards, and 100-year-old data encoding standards.
Disagree? I've got a whole attic full of vinyl records, CDs, 8-tracks, cassette tapes, and floppy disks (4 different densities!) that argue otherwise. And they are only 20 years old! (OK, the 8-tracks are closer to 30.)
The fact is that there is only one data storage and archiving technology that has ever withstood the test of time over 100 years or more while remaining cost-effectively retrievable, and that's writing/printing on physical media.
My advice? Don't trust any archival strategy until it's been tested, and there's only one that has passed the 100-year test so far.
You'll need to convert to text - perhaps XML.
Then upload it to the cloud, make archival copies etc.
I think you need to pick a multi-modal approach.
If you have the budget: http://www.archives.gov/era/papers/thic-04.html
<joke>Print it.</joke>
Script the data into flat files (either one file per table, or summarizing multiple tables into a file) and write them to high-end archival CDs. In 100 years they will have to load this data into whatever "database" they have, so some manual conversion will be necessary; a clean schema dump scripted into a single file would help the poor person trying to read these files and make the proper joins.
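A rough sketch of that kind of dump, using Python with sqlite3 purely as a stand-in for whatever database is actually involved:

    # Sketch: dump every table to a tab-separated flat file, plus one
    # schema file, so a future reader can reconstruct the joins.
    import sqlite3

    con = sqlite3.connect("archive_source.db")  # placeholder database

    with open("schema.sql", "w", encoding="utf-8") as schema:
        for (ddl,) in con.execute(
                "SELECT sql FROM sqlite_master WHERE sql IS NOT NULL"):
            schema.write(ddl + ";\n")

    tables = [r[0] for r in con.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    for table in tables:
        with open(table + ".txt", "w", encoding="utf-8") as out:
            cur = con.execute("SELECT * FROM " + table)
            out.write("\t".join(c[0] for c in cur.description) + "\n")
            for row in cur:
                out.write("\t".join(str(v) for v in row) + "\n")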
EDIT
Offer the client a service contract where you make sure they are up to date with the latest archival technology on a yearly basis. This could be a good thing for you financially, too.
I suggest you consult a specialist company in this field.
You might also be interested in this article:
Strategies for long-term data retention
It might help to speak to one of those companies/organisations.
I don't know if anyone still reads this thread, but there is a really good solution for this.
There is a new company called Millenniata; they have a product called M-Disc. The M-Disc is essentially a DVD made out of rock-like materials that give it an estimated shelf life of 1,000+ years. You have to have a special DVD burner to burn the discs, but it is not that expensive, and any normal DVD reader can read them. I have a professor at BYU who helped form this company; it is some pretty cool technology. Good luck.
Link to M-Disc Website
I was trying to programmatically go through presidential campaign contributions to see which web 2.0 people contributed to which candidates. You can get the data file indiv08.zip on the site http://www.fec.gov/finance/disclosure/ftpdet.shtml#a2007_2008
I can parse out who contributed and how much they contributed, but I cannot figure out who the contribution went to. There seems to be some ID, but it does not match the candidate IDs in any other file I can find. If you can help with this, I think I could make a pretty nice page showing who contributed to which candidate.
Update: I wonder if I just misunderstand this contribution system. Perhaps candidates cannot receive contributions at all, only committees? For example, I see that "C00431445OBAMA FOR AMERICA" received a lot of contributions. That makes it a bit more complicated, then: those committees have to be associated with candidates. Basically, I want to know who supported Obama and who supported McCain.
Page 3 of the tutorial linked at the top of the page you linked to contains the column names, including "Candidate Identification".
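If it helps, here is a rough Python sketch of tallying contribution totals per recipient committee ID; the file name, the pipe delimiter, and the column positions are all assumptions to be checked against that data dictionary before trusting any numbers:

    # Sketch: sum itemized contributions per recipient committee ID.
    # ASSUMPTIONS: pipe-delimited rows, committee ID in the first column,
    # transaction amount at AMOUNT_COL -- verify both against the FEC
    # data dictionary (page 3 of the tutorial).
    import csv
    from collections import defaultdict

    AMOUNT_COL = 14  # placeholder index; confirm in the data dictionary

    totals = defaultdict(int)
    with open("itcont.txt", encoding="latin-1", newline="") as f:
        for row in csv.reader(f, delimiter="|"):
            try:
                totals[row[0]] += int(row[AMOUNT_COL])
            except (IndexError, ValueError):
                continue  # skip malformed rows

    # Top recipients; committee IDs like C00431445 still need to be
    # mapped to candidates via a separate linkage file.
    for cmte, total in sorted(totals.items(), key=lambda kv: -kv[1])[:20]:
        print(cmte, total)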