Where can I find historical raw weather data for a project I am doing, with a focus on the USA and Canada? I mainly need temperatures, but other details would be nice. I am having a very hard time finding this data. I really don't want to have to scrape a weather site.
I found myself asking this same question, and will share my experience for future Googlers.
Data sources
I wanted raw data, and lots of it... an API wouldn't do. I needed to head directly to the source. The best source for all of that data seemed to be either the NCEP or NCDC NOMADS servers:
http://nomads.ncdc.noaa.gov/dods/ <- good for historical data
http://nomads.ncep.noaa.gov/dods/ <- good for recent data
(Note: A commenter indicated that you must now use https rather than http. I haven't tested it yet, but if you're having issues, try that!)
To give an idea of the amount of data, their data goes all the way back to 1979! If you're looking for Canada and the US, the North American Regional Reanalysis dataset is probably your best answer.
Using the data
I'm a big Python user, and either pydap or NetCDF seemed like a good tool to use. For no particular reason, I started playing around with pydap.
To give an example of how to get all of the temperature data for a particular location from the nomads website, try the following in python:
from pydap.client import open_url
# set up the connection
url = 'http://nomads.ncdc.noaa.gov/dods/NCEP_NARR_DAILY/197901/197901/narr-a_221_197901dd_hh00_000'
modelconn = open_url(url)
tmp2m = modelconn['tmp2m']
# grab the data
lat_index = 200  # you could tie this to tmp2m.lat[:]
lon_index = 200  # you could tie this to tmp2m.lon[:]
print(tmp2m.array[:, lat_index, lon_index])
The above snippet will get you a time series (every three hours) of data for the entire month of January, 1979! If you need multiple locations or all of the months, the above code could easily be modified to accommodate that, as sketched below.
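For example, to pull the same grid point for every month of 1979, something along these lines should work. This is only a sketch: it assumes the URL pattern above simply substitutes the year/month, so check the directory listing on the NOMADS server first.
from pydap.client import open_url
# assumed monthly URL pattern -- verify against the NOMADS listing
base = ('http://nomads.ncdc.noaa.gov/dods/NCEP_NARR_DAILY/'
        '{ym}/{ym}/narr-a_221_{ym}dd_hh00_000')
lat_index = 200   # same hypothetical grid point as above
lon_index = 200
for month in range(1, 13):
    ym = '1979{:02d}'.format(month)
    modelconn = open_url(base.format(ym=ym))
    tmp2m = modelconn['tmp2m']
    print(ym, tmp2m.array[:, lat_index, lon_index])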
To super-data... and beyond!
I wasn't happy stopping there. I wanted this data in a SQL database so that I could easily slice and dice it. A great option for doing all of this is the Python forecasting module.
Disclosure: I put together the code behind the module. The code is all open source -- you can modify it to better meet your needs (maybe you're forecasting for Mars?) or pull out little snippets for your project.
My goal was to be able to grab the latest forecast from the Rapid Refresh model (your best bet if you want accurate info on current weather):
from forecasting import Model
rap = Model('rap')                             # Rapid Refresh model
rap.connect(database='weather', user='chef')   # connect to the target SQL database
fields = ['tmp2m']                             # 2 m temperature
rap.transfer(fields)                           # pull the data into the database
and then to plot the data on a map of the good ol' USA (plot not reproduced here).
The data for the plot came directly from SQL, and the query could easily be modified to pull out any type of data desired.
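Purely as an illustration of that, a query for one grid point could look like the snippet below. The table and column names are placeholders (the real schema is whatever the forecasting module creates), and a PostgreSQL backend is assumed only because the module connects with a database/user pair.
import psycopg2  # assumed backend; adjust for your database
conn = psycopg2.connect(database='weather', user='chef')
cur = conn.cursor()
# 'rap', 'datetime', 'lat_index', 'lon_index' and 'tmp2m' are hypothetical
# names -- replace them with whatever the module actually created
cur.execute("""
    SELECT datetime, tmp2m
      FROM rap
     WHERE lat_index = %s AND lon_index = %s
     ORDER BY datetime
""", (200, 200))
rows = cur.fetchall()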
If the above example isn't enough, check out the documentation, where you can find more examples.
At the United States National Severe Storms Laboratory Historical Weather Data Archive (note: this has since been retired).
Also, the United States National Climatic Data Center Geodata Portal.
The United States National Climatic Data Center Climate Data Online.
The United States National Climatic Data Center Most Popular Products.
wunderground.com has a good API. It is free for 500 calls per day.
http://www.wunderground.com/weather/api/
I am working on an online tool that serves a number of merchants (let's say retail merchants, for example). The application takes data from different merchants and provides some results about their retail shop. The idea is that any merchant can sign up for the tool, send their transaction and inventory data (perhaps uploaded through Excel, or passed in as a JSON object), and get the results back.
My application has a domain model that is intrinsic to the application and contains all the data points that can be used by merchants, e.g.
Product {
productId,
productName,
...
}
But the problem I am facing is that each merchant will have their own way of representing data; for example, merchant X may call a product prod, while merchant Y may call it proddt.
Now I need a way to convert data represented in a merchant's format into a form the application understands, i.e. each time there is a request from merchant X, the application should map prod to product, and so on.
At first I was thinking of coding these mappers, but that is not a viable solution, since I can't really code these mappings for the thousands of merchants that may join my application.
Another solution I was thinking of was to let the merchant map the fields from their domain to the application domain through a UI, save this mapping somewhere in the DB, and then on each request from that merchant first look up the mapping in the DB and apply it to the incoming request (though I am still unsure how this can be done).
Has anyone faced a similar design issue before, and do you know of a better way of solving this problem?
If you can fix the order of the fields, then you can easily map the data sent by your client and return the result. For example, in Excel your client could supply data in this format:
product | name | quantity | cost
Condition: ALL your clients must send data in this format.
Then it will be easy for you to map these fields, access them through the correct DTO, and later save and process the data.
I appreciate this "language" concern, and in fact multi-lingual applications do it the way you describe. You need to standardize your terminology at your end, so that each term has only one meaning and only one word/term to describe it. You could even use mnemonics for that, e.g. for "favourite product" you use "Fav_Prod" in your app and in your DB. Then, when you present data to your customer, your app looks up their preferred term for it in a look-up table, and uses "favourite product" for customer one (and perhaps the admin), "favr prod" for customer two, etc...
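A minimal sketch of that look-up-table idea in Python (the table contents are invented, just to show the shape):
# standardized internal terms are used everywhere in the app and the DB;
# each customer gets their own preferred wording via a look-up table
CUSTOMER_TERMS = {
    "customer_one": {"Fav_Prod": "favourite product"},
    "customer_two": {"Fav_Prod": "favr prod"},
}

def display_term(customer_id, internal_term):
    """Return the customer's preferred wording, falling back to the internal term."""
    return CUSTOMER_TERMS.get(customer_id, {}).get(internal_term, internal_term)

print(display_term("customer_two", "Fav_Prod"))  # -> favr prod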
Look at SQL and DB design; you'll find that this is a form of normalization.
Are you dealing with legacy systems and/or APIs at the customer end? If so, someone will indeed have to type in the data.
If you have 1000s of customers, but there are only 10..50 terms, it may be best to let the customers, not you, set the terms.
You might be lucky, and be able to cluster customers together who use similar or close enough terminology. For new customers you could offer them a menu of terms that they can choose from.
If merchants were required to input their mapping with their data, your tool would not require a DB. In JSON, the input could be like the following:
input = {mapping: {ID: "productId", name: "productName"}, data: {productId: 0, productName: ""}}
Then, you could convert data represented in any merchant's format to your tool's format as follows:
ID = input.data[input.mapping.ID]
name = input.data[input.mapping.name]
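In Python, applying such a mapping could look roughly like this (the merchant field names are invented for illustration):
def normalize(payload):
    """Translate one merchant's record into the tool's internal field names."""
    mapping = payload["mapping"]
    data = payload["data"]
    return {internal: data[merchant_field]
            for internal, merchant_field in mapping.items()}

payload = {
    "mapping": {"ID": "prod", "name": "prodName"},   # merchant X's own field names
    "data": {"prod": 42, "prodName": "Blue widget"},
}
print(normalize(payload))  # -> {'ID': 42, 'name': 'Blue widget'}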
To recap:
You have an application
You want to load client data (merchants in this case) into your application
Your clients “own” and manage this data
There are N ways in which such client data can be managed, where N <= the number of possible clients
You will run out of money and your business will close before you can build support for all N models
Unless you are Microsoft, or Amazon, or Facebook, or otherwise have access to significant resources (time, people, money)
This may seem harsh, but it is pragmatic. You should assume NOTHING about how potential clients will be storing their data. Get anything wrong, your product will process and return bad results, and you will lose that client. Unless your clients are using the same data management tools—and possibly even then—their data structures and formats will differ, and could differ significantly.
Based on my not entirely limited experience, I see three possible ways to handle this.
1) Define your own way of modeling data. Require your customers to provide data in this format. Accept that this will limit your customer base.
2) Identify the most likely ways (models) your potential clients will be storing data (e.g. the most common existing software systems they might be using for this). Build import structures and formats to support these models. This, too, will limit your customer base.
3) Start with either of the above. Then, as part of your business model, agree to build out your system to support clients who sign up. If you already support their data model, great! If not, you will have to build it out. Maybe you can work the expense of this into what you charge them, maybe not. Your customer base will be limited by how efficiently you can add new and functional means of loading data to your system.
In this post I am looking for some advice/clarification regarding performance for an external DB (or API) vs local storage, specifically for a VueJS/React application (but that doesn't matter).
I am developing a website where the user can list, sort, and filter around 200 products. During development I have stored the data in vuex using object literal format:
{
name: "Productname",
description: "Description of product",
price: 20,
student: true
}
My plan was to put the data in a noSQL database in production. The reasoning was that it would give better performance and remove a thousand lines of code from the store. However, using (e.g.) MongoDB, I would probably fetch all the products on mount and put them in the store anyway.
If I assume all users will see all 200 products, is it a bad idea to fetch all 200 products (from an API or DB) into vuex/redux on mount? I mean, we would not risk loads of fetch calls (which could become costly), and the data would be preloaded for the user. For this specific example we are talking under 10,000 lines of ASCII JSON (but I am also curious in general).
Let's say a user wants to filter for student products; could there be any benefit to doing that remotely? Why not do products.filter(p => p.student) in the store or a component (we have already fetched all the products)?
If 1 and 2 are true, then why use an external DB at all? Is it mostly for maintaining the data stored in these places, for example adding/removing products, that we use them? Can this statement be made: "Yes, of course, if you have X products then external storage is not needed", and if so, what is X?
It is considered a bad idea performance-wise and network-wise. The advantage of using an API in this case is that you can limit the data you send and paginate it. Most of the time a user doesn't need to see 200 items at once, and it can happen that the item they want to see/update/delete is among the first ones returned, which means you sent a lot of data that didn't need to be sent. That is why you have pagination or infinite scroll (when you get to the bottom of the page it loads more data).
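As a rough sketch of what such a paginated endpoint could look like (Flask is used here purely as an example; the route, parameter names, and the in-memory product list are all made up):
from flask import Flask, jsonify, request

app = Flask(__name__)
PRODUCTS = [{"name": "Product %d" % i, "price": i, "student": i % 2 == 0}
            for i in range(200)]  # stand-in for a real database query

@app.route("/products")
def products():
    # the client asks for one page at a time instead of all 200 items at once
    page = request.args.get("page", 1, type=int)
    per_page = request.args.get("per_page", 20, type=int)
    start = (page - 1) * per_page
    return jsonify(PRODUCTS[start:start + per_page])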
You could first filter the data that has already been fetched, and if that doesn't return anything, you could then call an endpoint defined in your backend and query your DB there to return the data the user is searching for.
A user can delete their localStorage and then all the items go bye-bye unless they are hard-coded, in which case why even use localStorage? If your data is in a DB, and you took all the precautions to make it secure and built the API without security faults, then you can make sure your users always have data available to them. It doesn't really matter how big X is supposed to be; what really matters is: would various users have access to the same data, which needs to be the same for all of them? Can the users alter the data in any way?
This is really what I've learned, and you need to think more about what your application will actually do. Your state manager in the frontend should be considered more of a, well, state manager. It will manage the data you fetched so you can guarantee one source of truth for your application.
I hope this somewhat helps, and I would also appreciate if someone with more experience could explain it better or tell me why I'm wrong.
I am developing an application which would have users answer maybe 10 questions, with 3-4 options for each question. At the end of the 10th question, based on the responses, it would need to suggest a certain solution. Since there are hundreds of permutations and combinations, what logic would be required, and what database design?
thanks
EDIT: some more detailed explanation
Say my application is used to recommend a data plan from various mobile operators, based on the user answering questions like the time spent on the internet, the type of files being downloaded, and so on. If the response to question 1 was a and to question 2 was c, it would recommend a certain plan. If the response to question 1 was b and to question 2 was c, it would recommend a different plan. So, with 10 questions, the number of combinations can be quite large. Is there a certain algorithm that can handle this?
I. What would be the logic?
If I understand correctly, you would define "rules" such as
If the answer to question 5 is either A or B, then the suggested plan would be planB; otherwise execute the rest of the rules.
So you would use a rule engine, e.g. http://www.jboss.org/drools/
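If a full rule engine feels like overkill, the same idea can be sketched as plain code; the plan names and questions below are invented just to show the structure:
# each rule inspects the answers dict and either returns a plan or None
def rule_question5(answers):
    if answers.get(5) in ("A", "B"):
        return "planB"
    return None

def rule_heavy_downloader(answers):
    if answers.get(1) == "b" and answers.get(2) == "c":
        return "planC"
    return None

RULES = [rule_question5, rule_heavy_downloader]  # evaluated in order

def suggest_plan(answers, default="planA"):
    for rule in RULES:
        plan = rule(answers)
        if plan is not None:
            return plan
    return default

print(suggest_plan({5: "A"}))           # -> planB
print(suggest_plan({1: "b", 2: "c"}))   # -> planC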
II. What would be the database design?
This is quite simple:
USERS table,
QUESTIONS table and
ANSWERS table which would refer to the two others
Possibly there would be a QUESTIONNAIRE table as well, and the QUESTIONS table would refer to it.
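To make those relationships concrete, here is a minimal sketch (SQLite through Python, with illustrative column names):
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users     (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE questions (id INTEGER PRIMARY KEY, text TEXT);
    CREATE TABLE answers   (
        id          INTEGER PRIMARY KEY,
        user_id     INTEGER REFERENCES users(id),
        question_id INTEGER REFERENCES questions(id),
        choice      TEXT    -- e.g. 'a', 'b', 'c'
    );
""")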
Just a 'quick' comment: consider letting the user see how the recommended companies/plans change as they answer each question.
For example, if I am most interested in price that would be the question I would answer first and immediately see the 3 cheapest plans/products recommended to me.
The second question could be about coverage, and if I could then see the 3 plans with the best coverage (in my area), that would be interesting too.
When I answer the third question, about smartphone features, and say I want internet, then the first question's list should now show the 3 cheapest plans/products that include internet; obviously they could change.
And so on...
Maybe it also could be a good idea to let the user "dive into" each question and see the full range of options for that answer. As a user I would appreciate that.
The above comments are just how I would appreciate a form being made for me: I don't want to answer 10 questions about things I don't really put any value on. Each user is different and will prefer to make their choice based on their own questions.
So, based on the above, it would be like a checklist where the top answers would be the plans/products with the most fitting check marks. To give immediate responses (as the user answers/alters each question), AJAX would probably be your choice here.
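A sketch of that checklist-style scoring (the plans and their attributes are invented for illustration):
# each plan lists the attributes it "checks off"; the user's answers form a set
# of desired attributes, and plans are ranked by how many boxes they tick
PLANS = {
    "planA": {"cheap"},
    "planB": {"cheap", "internet"},
    "planC": {"internet", "good_coverage"},
}

def rank_plans(wanted, top_n=3):
    scored = [(len(attrs & wanted), name) for name, attrs in PLANS.items()]
    scored.sort(reverse=True)
    return [name for score, name in scored[:top_n]]

print(rank_plans({"cheap", "internet"}))  # -> ['planB', 'planC', 'planA']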
How would one design a neural network for the purpose of a recommendation engine? I assume each user would require their own network, but how would you design the inputs and the outputs for recommending an item in a database? Are there any good tutorials or something?
Edit: I was thinking more about how one would design the network itself. As in, how many input neurons, and how would the output neurons point to a record in a database? Would you have, say, 6 output neurons, convert them to an integer (which could be anything from 0 to 63), and use that as the ID of the record in the database? Is that how people do it?
I would suggest looking into neural networks that use unsupervised learning, such as self-organising maps. It's very difficult to use normal supervised neural networks to do what you want unless you can classify the data very precisely for learning. Self-organising maps don't have this problem, because the network learns the classification groups all on its own.
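For instance, with the MiniSom library (a small Python SOM implementation; the feature set below is made up, and you should double-check the calls against the library's documentation):
import numpy as np
from minisom import MiniSom  # pip install minisom

# each row is one user's profile across four invented taste features, scaled 0..1
profiles = np.random.rand(100, 4)

som = MiniSom(6, 6, 4, sigma=1.0, learning_rate=0.5)  # 6x6 map, 4-dimensional input
som.train_random(profiles, 1000)

# users whose profiles land on the same (or a neighbouring) node are similar,
# so items liked by one can be recommended to the others
print(som.winner(profiles[0]))  # e.g. (2, 4)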
Have a look at this paper, which describes a self-organising-map-based music recommendation system:
http://www.springerlink.com/content/xhcyn5rj35cvncvf/
There are many more papers written about the topic; see Google Scholar:
http://www.google.com.au/search?q=%09+A+Self-Organizing+Map+Based+Knowledge+Discovery+for+Music+Recommendation+Systems+&ie=utf-8&oe=utf-8&aq=t&rls=com.ubuntu:en-US:official&client=firefox-a&safe=active
First you have to decide what exactly you are recommending and under what circumstances. There are many things to take into account. Are you going to consider the "other users who bought X also bought Y?" Are you going to only recommend items that have a similar nature to each other? Are you recommending items that have a this-one-is-more-useful-with-that-one type of relationship?
I'm sure there are many more decisions, and each one of them has their own goals in mind. It would be very difficult to train one giant network to handle all of the above.
Neural networks all boil down to the same thing. You have a given set of inputs. You have a network topology. You have an activation function. You have weights on the nodes' inputs. You have outputs, and you have a means to measure and correct error. Each type of neural network might have its own way of doing each of those things, but they are present all the time (to my limited knowledge). Then, you train the network by feeding in a series of input sets that have known output results. You run this training set as much as you'd like without over or under training (which is as much your guess as it is the next guy's), and then you're ready to roll.
Essentially, your input set can be described as a certain set of qualities that you believe have relevance to the underlying function at hand (for instance: precipitation, humidity, temperature, illness, age, location, cost, skill, time of day, day of week, work status, and gender may all have an important role in deciding whether or not a person will go golfing on a given day). You must therefore decide what exactly you are trying to recommend and under what conditions. Your network inputs can be boolean in nature (0.0 being false and 1.0 being true, for instance) or mapped in a pseudo-continuous space (where 0.0 may mean not at all, .45 means somewhat, .8 means likely, and 1.0 means yes). This second option may give you the tools to map a confidence level for a certain input, or simply a math calculation you believe is relevant.
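For example, the golfing-style inputs described above might be encoded roughly like this (the feature list and scaling choices are arbitrary, just to show the idea):
def encode(person, weather):
    """Map raw attributes onto a fixed-length vector of values in [0, 1]."""
    return [
        1.0 if weather["precipitation"] else 0.0,   # boolean input
        weather["humidity"] / 100.0,                # percentage -> 0..1
        min(weather["temperature"] / 40.0, 1.0),    # crude temperature scaling
        1.0 if person["ill"] else 0.0,
        person["age"] / 100.0,
        {"none": 0.0, "somewhat": 0.45, "likely": 0.8, "yes": 1.0}[person["interest"]],
    ]

x = encode({"ill": False, "age": 35, "interest": "likely"},
           {"precipitation": False, "humidity": 60, "temperature": 22})
print(x)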
Hope this helped. You didn't give much to go on :)
I want a .csv list, MySQL database, or any other list of all U.S. states and cities, including which cities are in which state. From this list I will generate a MySQL database with the following fields:
states:
- id (int, auto_increment, primary)
- name (varchar 255)
cities:
- id (int, auto_increment, primary)
- stateId (id of the state from states table to which this city belongs)
- name (varchar 255)
Thanks in advance.
You can get city/state information in tab-separated value format from GeoNames.org. The data is free, comprehensive and well structured. For US data, grab the US.txt file at the free postal code data page. The readme.txt file on that page describes the format.
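For example, a quick sketch for pulling unique (state, city) pairs out of US.txt. The column positions are the ones described in readme.txt as I read it (country code, postal code, place name, state name, state code, and so on), so verify them before relying on this:
import csv

pairs = set()
with open("US.txt", encoding="utf-8") as f:
    for row in csv.reader(f, delimiter="\t"):
        place_name, state_name = row[2], row[3]   # assumed columns; check readme.txt
        pairs.add((state_name, place_name))

for state, city in sorted(pairs)[:5]:
    print(state, "-", city)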
I spent a while looking for such a file, and ended up making one myself; you can get it from here:
https://github.com/grammakov/us_cities_and_states/tree/master
Check out the MySQL world sample database. This DB is used by the MySQL documentation as a sample DB to test queries on.
It already has the 'cities' table you are looking for.
Are you ready to pay for this content?
If YES, then you can find it at uscities.trumpetmarketing.net
I have also seen this information provided along with some programming books, especially ones dealing with .NET database programming. Let me refer to my library and get back to you on this.
You can also refer to the following:
http://www.world-gazetteer.com/wg.php?x=1129163518&men=stdl&lng=en&gln=xx&dat=32&srt=npan&col=aohdq
http://www.geobytes.com/FreeServices.htm
Please don't bother voting for this answer. There is no information here that cannot be obtained via a simple Google search!
Someone has posted a list here:
http://mhinze.com/archive/list-of-us-cities-all-city-names-and-states-regex-groups/
I use the US city and county database for this purpose, and I just checked that it was updated in August. They claim to include 198,703 populated places (a GNIS term for a city or village). I see you need full state names, and these are included in a free database called the US state list.
Both of them are in CSV files and they provide very detailed instructions about how to import them to both local and remote MySQL servers. You can join them in a select statement to pull records with full state names for your needs.
You can find CSV, SQL, or HTML formats at the website below. They have cities and states for some countries, including the USA.
http://www.countrystatecity.com/. They keep updating the site and do a good job; hopefully this will help other developers too.
For the USA you can check the link below:
http://www.countrystatecity.com/USAStatesCities.php
That's a tall order. Consider creating one by scraping the links off this page:
WP: List of cities, towns, and villages in the US. It is much simpler if you scrape the wiki markup code rather than the raw HTML.
It will require some skill with regexes, or at least parsers, but should be doable.
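A rough sketch of that approach (the article title below is a placeholder for the page linked above, and the regex only pulls out [[...]] wiki-link targets, so the result will still need filtering by hand):
import re
import urllib.request

# placeholder title -- substitute the actual article linked above
title = "List_of_cities,_towns,_and_villages_in_the_US"
url = "https://en.wikipedia.org/w/index.php?title=" + title + "&action=raw"
req = urllib.request.Request(url, headers={"User-Agent": "city-list-scraper/0.1"})
markup = urllib.request.urlopen(req).read().decode("utf-8")

# wiki links look like [[Target]] or [[Target|label]]; capture the targets
links = re.findall(r"\[\[([^\]|#]+)", markup)
print(links[:20])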
This helped me a great deal: http://www.farinspace.com/us-cities-and-state-sql-dump/
It has a .zip file with 3 .sql files which you can just run in ssms.
You may need to replace some of the weirdly encoded single quotes with double quotes.