Problems with current web AI systems

I'm going into my third year of studies as an AI student and am planning my third-year project. I have been considering a recommendation system of some sort. My motivation is to gain an understanding of how people evaluate products (what makes them desirable) and then attempt to build a system that captures this. Currently my thinking is along the lines of a system that can differentiate between different priorities in people's likes and dislikes. For instance, a person who is very environmentally aware probably wouldn't want to buy products that are not environmentally friendly.
So the question is:
- What aspects of modern web AI systems (Google, Amazon, Last.fm and so on) are most in need of repair or further development?
My project is limited to about 6 months, but I would be interested to hear any thoughts on the subject.

Some of the things that you might want to look at are Facebook OpenSocial Graph and Google Prediction API.

Related

Genetic Algorithm vs Expert System

I'm having some doubts about which approach I should use for a new piece of software.
No code has been written yet; I'm breaking down all the requirements first and will only then start coding.
This will be implemented in a computer company that provides services for other companies, onsite and remotely.
These are my variables:
Number of technicians
Location of customer
Type of problem
Services already scheduled for the technician
Expertise of the technician about the situation
Customer priority
Maybe some are missing, but these are the most important ones.
This job is currently done manually, and as humans, we sometimes fail to see the best route to take.
Let's say that a customer calls with a printer problem.
First, check which tech knows about printers.
Then: is the tech available? Is he far from the customer? Can the job be done remotely (a software issue)?
Can it be done by another tech who is closer to the customer's location?
Does this customer have more priority than the other where the same tech should be going?
Is the technician's schedule full? If so, pass the job to another printer/hardware tech.
I know my English is not perfect (it's not my native language), but I'll try to provide more details or correct the text as needed.
So my question is this: what kind of approach would you take? A genetic algorithm seems nice for this kind of job, and I also have some experience with GAF and WatchMaker (a Java GA framework). However, reading the text above, an expert system also seems appropriate.
Has anyone done something like this? I searched for this kind of software and couldn't find anything similar.
Would another approach be better than the two I asked about?
Also, I'm building a table of all the techs' capabilities and expertise, with simple ratings from 1 to 5 for each area of expertise. This is also a decision factor.
Thanks.
Why not do both? Use an expert system (a rule engine) to define your constraints, and use a metaheuristic (such as Local Search or Genetic Algorithms) to solve them. The planning engine OptaPlanner (Java, open source) does exactly that, using the rule engine Drools.
Here's a video demonstrating the constraint flexibility on the vehicle routing problem (VRP). Your problem seems to be an advanced variant on VRP (which is a variant on TSP).
Maybe you can start off with TSP (http://en.m.wikipedia.org/wiki/Travelling_salesman_problem), though I guess it only deals with distance.
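To make the hybrid idea concrete, here is a minimal sketch in plain Python (not OptaPlanner; the field names, weights, and data below are all invented for illustration). The constraints are expressed as rule-like penalty functions, and a simple local search mutates job-to-tech assignments to minimize the total penalty.

    import random

    # Hypothetical toy data: technicians with 1-5 skill ratings and a location.
    TECHS = {
        "alice": {"skills": {"printer": 5, "network": 2}, "location": 0},
        "bob":   {"skills": {"printer": 2, "network": 4}, "location": 8},
    }
    JOBS = [
        {"type": "printer", "location": 1, "priority": 3},
        {"type": "network", "location": 7, "priority": 1},
    ]

    def score(assignment):
        """Rule-like constraints expressed as penalties (lower is better)."""
        penalty = 0
        for job_idx, tech in assignment.items():
            tech_data, job = TECHS[tech], JOBS[job_idx]
            skill = tech_data["skills"].get(job["type"], 0)
            if skill == 0:
                penalty += 1000                      # hard rule: tech must know the domain
            penalty += (5 - skill) * 10              # soft rule: prefer expertise
            penalty += abs(tech_data["location"] - job["location"])  # soft rule: travel
            penalty -= job["priority"]               # soft rule: reward high priority
        return penalty

    def local_search(steps=1000):
        """Start from a random assignment and keep improving mutations."""
        best = {i: random.choice(list(TECHS)) for i in range(len(JOBS))}
        best_score = score(best)
        for _ in range(steps):
            candidate = dict(best)
            candidate[random.randrange(len(JOBS))] = random.choice(list(TECHS))
            if score(candidate) < best_score:
                best, best_score = candidate, score(candidate)
        return best, best_score

    print(local_search())

A genetic algorithm would replace the single-mutation loop with a population, crossover, and selection, but the scoring function stays the same; that is why the rule-engine/metaheuristic split works.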

Looking for a good project to work on as my graduation project at university that involves AI / Machine Learning, please help me

I need help choosing a project to work on for my master's graduation. The project must involve AI / machine learning or business intelligence, but suggestions outside these topics are OK too. Please help me.
One of the most rapidly growing areas in AI today is Computer Vision. There are many practical needs where the results of your Master's thesis could be helpful. You could try researching something like Emotion Detection, Eye-Tracking, etc.
An appropriate piece of work for an MS in CS at any good university could survey the current state of research in this field and compare different approaches and algorithms. As a practical component, it is also a lot of fun when your program recognizes your mood properly :)
Netflix
If you want to work on a non-trivial dataset (not Google-sized, but not trivial either, and with a real application) with an objective measure of success, why not work on the Netflix challenge (the first one)? You can get all the data for free, there are many papers on it, and you have a pretty good way to compare your results against other people's (since everyone used exactly the same dataset, and it was not so easy to "cheat", contrary to what happens quite often in the academic literature). While not trivial in size, you can work on it with only one computer (assuming it is recent enough), and depending on the type of algorithms you are using, you can implement them in a language other than C/C++, at least for prototyping (for example, I got decent results doing everything in Python).
Bonus point, it passes the "family" test: easy to tell your parents what you are working on, which is always a pain in my experience :)
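One family of techniques that did well on the Netflix Prize is matrix factorization trained by stochastic gradient descent. Here is a minimal sketch on toy data (the hyperparameters are illustrative, not tuned):

    import random

    # Toy (user, item, rating) triples standing in for the Netflix data.
    ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 2, 1.0), (2, 1, 2.0)]
    n_users, n_items, k = 3, 3, 2            # k latent factors per user/item
    lr, reg, epochs = 0.01, 0.05, 500

    random.seed(0)
    P = [[random.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[random.gauss(0, 0.1) for _ in range(k)] for _ in range(n_items)]

    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(P[u][f] * Q[i][f] for f in range(k))
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)   # SGD step with L2 regularization
                Q[i][f] += lr * (err * pu - reg * qi)

    # Predict the unseen rating of user 2 for item 0.
    print(sum(P[2][f] * Q[0][f] for f in range(k)))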
Music-related tasks
A bit more original: something that is cool and non-trivial but not too complicated in data handling is anything around music, like music genre recognition (classical / electronic / jazz / etc.). You would need to know about signal processing as well, though - I would not advise it if you cannot get easy access to professors who know the topic.
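For a sense of the pipeline, a common baseline extracts spectral features such as MFCCs and feeds them to an ordinary classifier. A minimal sketch, assuming the librosa and scikit-learn libraries and a hypothetical labeled set of audio files:

    import librosa
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    def mfcc_features(path):
        """Summarize a clip as the mean of its MFCC frames."""
        y, sr = librosa.load(path, duration=30.0)       # first 30 s of the track
        return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).mean(axis=1)

    # Hypothetical labeled clips; in practice you'd want hundreds per genre.
    files = ["clip1.wav", "clip2.wav"]
    genres = ["classical", "jazz"]

    X = np.array([mfcc_features(f) for f in files])
    clf = RandomForestClassifier(n_estimators=100).fit(X, genres)
    print(clf.predict([mfcc_features("unknown.wav")]))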
I can use the same answer I used on a previous, similar question:
Russ Greiner has a great list of project topics for his machine learning course, so that's a great place to start.
Both GAs and ANNs are learners/classifiers. So I ask you the question, what is an interesting "thing" to learn? Maybe it's:
Detecting cancer
Predicting the outcome between two sports teams
Filtering spam
Detecting faces
Reading text (OCR)
Playing a game
The sky is the limit, really!
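To give one of these concrete shape, spam filtering is probably the quickest to prototype. A minimal sketch with scikit-learn's Naive Bayes (the training messages are made up):

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline

    # Tiny, made-up training set: 1 = spam, 0 = ham.
    messages = [
        "win a free prize now", "cheap meds click here",
        "meeting moved to tuesday", "can you review my patch",
    ]
    labels = [1, 1, 0, 0]

    # Bag-of-words features feeding a multinomial Naive Bayes classifier.
    model = make_pipeline(CountVectorizer(), MultinomialNB())
    model.fit(messages, labels)

    print(model.predict(["claim your free prize"]))   # most likely classified as spam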
Since it has a business tie-in: given some input set, determine probable business fraud from the input (something the SEC seems challenged in doing). We now have several examples (Madoff and others). Or a system to estimate investment risk (there are apparently lots of such systems, but were any accurate in the case of Lehman, for example?).
A starting point might be the Chen book Genetic Algorithms and Genetic Programming in Computational Finance.
Here's an AAAI writeup of an award to the National Association of Securities Dealers for a system that monitors NASDAQ insider trading.
Many great answers have been posted already, but I wanted to add my 2 cents. There is one hot topic into which big companies all around are pouring lots of resources, and which is still very challenging and full of potential: automated detection of fake news.
This is even more relevant nowadays, when most of us connect through social media and a huge crisis is looming over.
Fake news, content removal, source reliability... the problem is huge and very exciting. It is, as I said, challenging because it can be approached from many perspectives (from analysing images to detect fakes using adversarial networks, to detecting fake written news based on text content (NLP), to using graph theory to trace sources), and the possibilities for a research project are endless.
I suggest you read some general articles (e.g., this or this) or have a look at research articles from the last couple of years (a quick Google search will turn up a lot of related work).
I wish I had the opportunity to start a project on this topic from scratch. I think it's going to be of the utmost relevance in the next few years.

Artificial Intelligence undergraduate project: help with an idea and its influence on a later master's degree [closed]

I am a Computer Science student. I want to do an AI project for my 4th year with two other students. (It's a 5-year degree at my university, so I can pursue the same project for two consecutive years if I want to.) Our knowledge of AI is very basic at the moment, since we'll be specializing in it over the coming two years, so a very advanced idea will probably be hard to accomplish. We're not expected to break new ground either, so the more resources available, the better.
I'm interested in ideas that can benefit people, not just in applying algorithms and techniques. I want to do a master's after graduation, but I'm not sure in what field yet.
I'd love to do a medical application or a project that is of some use to the handicapped.
Some projects already pursued at the university included recognizing breast cancer and teaching sign language to the deaf.
I'm wondering:
1) what other ideas we can work on in those fields?
2) how much will my choice of graduation project affect my application for a masters degree?
3) Is a stock price prediction expert system too advanced for us?
Thanks a lot.
1) what other ideas we can work on in those fields?
It's amazing to me how little imagination computer science students seem to have. Stackoverflow.com is rife with questions about first projects from beginners and students.
I think that using statistics and data in novel ways, like Peter Norvig's spell checker, would be most interesting and fruitful.
Dr. Peter Norvig is a well-known computer scientist and AI guru. He's the Director of Research at Google now. Perhaps you can mine a choice out of his writings.
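That spell checker famously fits in a page of Python. Here is a condensed sketch in the spirit of Norvig's essay, assuming a plain-text corpus file (big.txt is an assumption here) to supply word frequencies:

    import re
    from collections import Counter

    # Word frequencies from a corpus file (big.txt is assumed to exist).
    WORDS = Counter(re.findall(r"[a-z]+", open("big.txt").read().lower()))

    def edits1(word):
        """All strings one edit away: deletes, transposes, replaces, inserts."""
        letters = "abcdefghijklmnopqrstuvwxyz"
        splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
        deletes = [a + b[1:] for a, b in splits if b]
        transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
        replaces = [a + c + b[1:] for a, b in splits if b for c in letters]
        inserts = [a + c + b for a, b in splits for c in letters]
        return set(deletes + transposes + replaces + inserts)

    def correction(word):
        """Pick the known candidate with the highest corpus frequency."""
        candidates = ({word} & WORDS.keys()) or (edits1(word) & WORDS.keys()) or {word}
        return max(candidates, key=WORDS.get)

    print(correction("speling"))

The interesting part for a project is that the "intelligence" is entirely in the data: a bigger, better corpus improves the corrector without changing a line of code.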
2) how much will my choice of graduation project affect my application for a masters degree?
That depends on too many other factors you don't mention, like your past record as a student. Probably a minor factor, in my opinion. Nobody is admitted to a master's program on the basis of a graduation project. Neither your undergrad project nor a master's thesis is a doctoral dissertation; don't get them confused.
3) Is a stock price prediction expert system too advanced for us?
I think stock price prediction is too advanced for anybody. After years of people applying Fourier analysis, statistical models, Monte Carlo simulations, etc., if it were possible it would have been done already.
2) how much will my choice of graduation project affect my application for a masters degree?
If you are applying for a PhD, the faculty in the prospective department tend to favor students who are interested in the research they are doing, or who have demonstrated the ability to do their own research. For a Masters these are not much of an issue, but they can make a little difference.
3) Is a stock price prediction expert system too advanced for us?
Well, if you did build one, you would start using it to make money; others would see what you were doing and imitate you, so pretty soon your arbitrage opportunity would be gone.
Still, these kinds of systems are often built by students in machine learning classes, mostly because there is a lot of freely available, well-formatted data on stock prices, so it's easy to get started writing the program. It is a good way to gain insight into machine learning algorithms.
1) What other ideas we can work on in those fields?
Find a problem that you are passionate about, that you will learn something from by tackling, and that is within the scope of your time, effort, and ability. Projects like this are relevant not only for grad school but also when applying for entry-level jobs (even if those are still a few years off after a master's degree). It helps to pick something you can put on a resume that shows your level of accomplishment and ability to complete a task.
2) How much will my choice of graduation project affect my application for a masters degree?
The topic choice probably won't matter much, except perhaps for top-tier programs or if you have notable weaknesses in other admissions criteria. In the latter case a good project may help, but even that is uncertain. Master's program admissions are, I think, generally handled by administrative staff, so they are probably more interested in whether you did a project at all than in what the topic was.
3) Is a stock price prediction expert system too advanced for us?
Yes, a stock price prediction system is far too difficult if you want a system that actually can work reasonably well over anything other than a small training data set.
The market is neither a natural system, a machine, nor even a system of rational collective behavior. Its pricing mechanism is in general irrational: investors/traders may make transactions at prices that are reasonable for them relative to their own decision criteria, but the market as a whole is generally not rational. The market is more an aggregation of behavior rather than collective behavior.
The above alone would make for an intensively difficult problem to solve with AI methods, but beyond that there are issues of problem scale, the amount of training data which is needed, etc.
There are of course a large number of Wall Street trading firms using quantitative methods for high-frequency trading, etc. They are effective, however, because they are focused on narrow problems (price trends over the next few seconds-to-minutes in highly-liquid stocks, S&P index futures, etc.), they put a lot of work into their models and generally are constantly rebuilding the latter on a daily/weekly basis, and they understand the market's nature, i.e., it's largely irrational as a whole and is a competitive, shifting landscape of exploiting the pricing inefficiencies inherent to large money flows.
I would only recommend this problem domain if you have an intense personal interest in financial markets and have already spent a lot of time studying them, are prepared to fail, and are interested in learning a lot. Trying to work on this problem is certainly a good learning opportunity, but it will be hard to achieve any real success except for small problems unless you have many years to devote.
1) what other ideas we can work on in those fields?
Dr. Russell Greiner has a nice list of possible student projects in machine learning, several of which are related to medicine.
2) how much will my choice of graduation project affect my application for a masters degree?
It probably won't matter very much. However, choosing a ridiculously easy project probably won't help. I'm sure that you'll be vetting whatever you choose with your prof, so don't worry about that so much. Find a topic you're passionate about first and foremost.
3) Is a stock price prediction expert system too advanced for us?
Yes. Don't bother with that nonsense. The game of Go will be solved before anyone figures out the stock market.
1) what other ideas we can work on in those fields?
Are there any faculty members at your university who work in the field of bioinformatics? If so, talk to them and see whether they can give you a suitable project idea that gets you excited. If you decide to take this path, try to enroll in an Intro to Bioinformatics course, as it will help you get familiar with the field and generally make things easier.

Looking for an example of when screen scraping might be worthwhile

Screen scraping seems like a useful tool - you can go onto someone else's site and steal their data - how wonderful!
But I'm having a hard time seeing how useful this could be.
Most application data is pretty specific to that application even on the web. For example, let's say I scrape all of the questions and answers off of StackOverflow or all of the results off of Google (assuming this were possible) - I'm left with data that is not very useful unless I either have a competing question and answer site (in which case the stolen data will be immediately obvious) or a competing search engine (in which case, unless I have an algorithm of my own, my data is going to be stale pretty quickly).
So my question is, under what circumstances could the data from one app be useful to some external app? I'm looking for a practical example to illustrate the point.
It's useful when a site publicly provides data that is (still) not available as an XML service. I had a client who used scraping to pull flight tracking data into one of his company's intranet applications.
The technique is also used for research. I had a client who wanted to compare the contents of several online dictionaries by part of speech, and all of these sites had to be scraped.
It is not a technique for "stealing" data. All ordinary usage restrictions apply. Many sites implement CAPTCHA mechanisms to prevent scraping, and it is inappropriate to work around these.
A good example is StackOverflow - no need to scrape data as they've released it under a CC license. Already the community is crunching statistics and creating interesting graphs.
There's a whole bunch of popular mashup examples on ProgrammableWeb. You can even meet up with fellow mashupers (O_o) at events like BarCamps and Hack Days (take a sleeping bag). Have a look at the wealth of information available from Yahoo APIs (particularly Pipes) and see what developers are doing with it.
Don't steal and republish, build something even better with the data - new ways of understanding, searching or exploring it. Always cite your data sources and thank those who helped you. Use it to learn a new language or understand data or help promote the semantic web. Remember it's for fun not profit!
Hope that helps :)
If the site has data that would benefit from being accessible through an API (and it would be free and legal to do so), but they just haven't implemented one yet, screen scraping is a way of essentially creating that functionality for yourself.
Practical example -- screen scraping would allow you to create some sort of mashup that combines information from the entire SO family of sites, since there's currently no API.
Well, to collect data from a mainframe. That's one reason why some people use screen scraping. Mainframes are still in use in the financial world and often it's running software that has been written in the previous century. The people who wrote it might already be retired and since this software is very critical for these organizations, they really hate it when some new code needs to be added. So, screenscraping offers an easy interface to communicate with the mainframe to collect information from the mainframe and then send it onwards to any process that needs this information.
Rewrite the mainframe application, you say? Well, software on mainframes can be very old. I've seen software on mainframes that was over 30 years old, written in COBOL. Often, those applications work just fine and companies don't want to risk rewriting parts because it might break some code that had been working for over 30 years! Don't fix things if they're not broken, please. Of course, additional code could be written but it takes a long time for mainframe code to be used in a production environment. And experienced mainframe developers are hard to find.
I myself had to use screen scraping in a software project too. This was a scheduling application which had to capture the console output of every child process it started. It's the simplest form of screen scraping, actually, and many people don't even realize that if you redirect the output of one application to the input of another, it's still a kind of screen scraping. :)
Basically, screen scraping allows you to connect one (web) application with another one. It's often a quick solution, used when other solutions would cost too much time. Everyone hates it, but the amount of time it saves still makes it very efficient.
Let's say you wanted to get scores from a popular sports site that did not offer the information available with an XML feed or API.
For one project we found a (cheap) commercial vendor that offered translation services for a specific file format. The vendor didn't offer an API (it was, after all, a cheap vendor) and instead had a web form to upload and download from.
With hundreds of files a day, the only way to do this was to use WWW::Mechanize in Perl: screen scrape the way through the login and upload forms, submit the file, and save the returned file. It's ugly and definitely fragile (if the vendor changes the site in the least, it could break the app), but it works. It's been working now for over a year.
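The same login-upload-download dance in Python today would typically use the requests library; a minimal sketch (the URLs and form field names below are invented for illustration):

    import requests

    # Invented endpoints and field names; a real vendor site would differ.
    BASE = "https://vendor.example.com"

    with requests.Session() as s:
        # Step 1: log in, letting the session keep the auth cookie.
        s.post(f"{BASE}/login", data={"user": "me", "password": "secret"})

        # Step 2: upload the file through the same web form a browser would use.
        with open("input.dat", "rb") as f:
            r = s.post(f"{BASE}/translate", files={"upload": f})

        # Step 3: save whatever the site returns.
        with open("output.dat", "wb") as out:
            out.write(r.content)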
One example from my experience.
I needed a list of major cities throughout the world with their latitude and longitude for an iPhone app I was building. The app would use that data along with the geolocation feature on the iPhone to show which major city each user of the app was closest to (so as not to show exact location), and plot them on a 3D globe of the earth.
I couldn't find an appropriate list in XML/Excel/CSV type format anywhere easily, but I did find this wikipedia page with (roughly) the info I needed. So I wrote up a quick script to scrape that page and load the data into a database.
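A minimal version of that kind of script, assuming the requests and BeautifulSoup libraries (the URL and column layout below are illustrative, not the actual page I used):

    import requests
    from bs4 import BeautifulSoup

    # Illustrative URL; the real page and its column layout may differ.
    URL = "https://en.wikipedia.org/wiki/List_of_cities_by_latitude"

    html = requests.get(URL, headers={"User-Agent": "city-scraper/0.1"}).text
    soup = BeautifulSoup(html, "html.parser")

    cities = []
    for row in soup.select("table.wikitable tr")[1:]:   # skip the header row
        cells = [td.get_text(strip=True) for td in row.find_all("td")]
        if len(cells) >= 3:
            # Assumed column order: city, latitude, longitude.
            cities.append((cells[0], cells[1], cells[2]))

    print(cities[:5])

From there it's one insert loop into SQLite or whatever database the app uses. The fragility caveat from the previous answer applies here too: if the page layout changes, the column assumptions break.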
Any time you need a computer to read the data on a website. Screen scraping is useful in exactly the same instances that any website API is useful. Some websites, however, don't have the resources to create an API themselves; screen scraping is the developer's way around that.
For instance, in the earlier days of Stack Overflow, someone built a tool to track changes to your reputation over time, before Stack Overflow itself provided that feature. Since Stack Overflow had no API at the time, the only way to do that was to screen scrape.
The obvious case is when a webservice doesn't offer reverse search. You can implement that reverse search over the same data set, but it requires scraping the entire dataset.
This may be fair use if the reverse search also requires significant pre-processing, e.g. because you need to support partial matching. The data source may not have the technical skills or computing resources to provide the reverse search option.
I use screen scraping daily. I run some eCommerce sites and have screen-scraping scripts running every day to gather product lists automatically from my suppliers' wholesale sites. This gives me up-to-date information on all the products available to me from several suppliers and lets me flag non-economical margins due to price changes.

What are areas where you can program artificial intelligence? [closed]

Welcome!
I really enjoyed programming artificial intelligence during my studies - neural networks, expert systems, and more. But at work I mainly develop web applications.
Now I'm thinking about returning to that kind of programming, maybe as a hobby, or maybe at work. Are there areas where AI is commonly used in application development, where a programmer with such skills could look for work?
Or maybe I could sell some ideas to my boss and use AI to extend some of our applications.
What are your experiences and ideas with using AI in applications?
I recently started reading the book Programming Collective Intelligence. It's an excellent book which discusses exactly what you are looking for - using AI techniques in web applications.
The book is written clearly and is easy to understand. It explains everything in terms of real applications (it covers how some commonly used technology works: Google PageRank, Amazon's recommendation system, matchmaking websites, link recommenders, Bayesian spam filters and more), and it uses genuinely useful examples built on real data (the eBay API, Facebook API, etc. are used to collect data). In one chapter it even explains how to lay out graphs (the data structure, not bar/line charts) optimally, so that no nodes are too close together and overlapping lines are minimized, which could be useful for, say, mapping social networks.
I would recommend having a look at it and see the different ways AI can be applied to web applications.
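As a taste of what the book covers, here is a minimal user-based collaborative filtering sketch in the style of its early chapters (the ratings data is made up):

    from math import sqrt

    # Made-up ratings: user -> {item: score}.
    ratings = {
        "ann": {"Dune": 5, "Alien": 4, "Heat": 1},
        "bob": {"Dune": 4, "Alien": 5, "Heat": 2},
        "cam": {"Heat": 5, "Alien": 1},
    }

    def similarity(a, b):
        """Inverse-distance similarity over the items both users rated."""
        shared = ratings[a].keys() & ratings[b].keys()
        if not shared:
            return 0.0
        dist = sqrt(sum((ratings[a][i] - ratings[b][i]) ** 2 for i in shared))
        return 1.0 / (1.0 + dist)

    def recommend(user):
        """Score unseen items by similarity-weighted ratings from other users."""
        totals, sim_sums = {}, {}
        for other in ratings:
            if other == user:
                continue
            sim = similarity(user, other)
            for item, score in ratings[other].items():
                if item not in ratings[user]:
                    totals[item] = totals.get(item, 0) + sim * score
                    sim_sums[item] = sim_sums.get(item, 0) + sim
        return sorted(((totals[i] / sim_sums[i], i) for i in totals), reverse=True)

    print(recommend("cam"))   # e.g. suggests "Dune" based on similar users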
As a counter-example, parsing data acquired from water testing equipment would probably be a bad place to use artificial intelligence:
The Daily WTF: No, We Need a Neural Network
Just a reminder for all of us to choose the right tool for the right job.
Neural networks are great for working on images, so one area of web applications you could use AI for would be identifying and/or manipulating patterns in images over large data sets. For example, a site like Flickr or Facebook might have some interesting training material for identifying people based on their faces, or for associating groupings of pixels (those being the features you work with) with certain items mentioned in captions or tags.
In terms of text manipulation, there's a lot of stuff, but it's usually icing on the cake for other web apps. I'm talking mostly in the areas of automatic completion in search bars and back-end things the user doesn't usually see, like automatic machine translation or improved search capability.
The problem with putting AI at the front of an application's offering is that artificial intelligence is usually not a feature in and of itself, but rather a way of negotiating large data sets effectively without regular prompts from the designer. In general, a user interacts with an application on a one-to-one basis, and therefore judges it only on the quality of a relatively small number of responses.
Email spam filtering systems - definitely.
Any other security application that needs to spot patterns of malicious activity, too.
You could analyze the behavior of your web applications' visitors - how they navigate the site - to provide a better, optimized interface. It depends on what kind of web applications you're working on. For online shopping, you can come up with suggestions extrapolated from customers' habits.
You can also detect "abnormal" behavior and fraud. Fraud and bot detection can take advantage of AI.
Forecasting, of course.
It has immense value for businesses (e.g., inventory optimization) and is especially valuable in a time of global crisis.
Games do need AI.
Expert systems too.
Outside of games, I've seen very few commercial uses of AI.
It could, in theory, be very useful in industrial robotics and imaging, but those fields also tend to be very conservative, and uncomfortable with non-deterministic algorithms.
You might want to research what iRobot does, but even they use rather simple algorithms in their commercial robots.
In the area of cognitive architectures (e.g. Soar, ACT-R, etc), rather than concentrating on algorithms like A* and games, researchers investigate models of human behavior including decision-making, cultural interchange and learning. They often focus on cognitive plausibility, i.e. how close does a model track what a human would do, including timing, etc.
These systems tend to be strictly research-based with limited commercial applications. So far anyway. Military applications, especially for training, are fairly common though.
Image processing for detecting cancer! (We actually implement IEEE papers on it; devising the algorithms is much harder than coding them, so we write papers about the performance of other papers.)
Risk assessment is a pretty good case for neural networks, mostly because they're pretty good at pattern matching. Insurance and credit companies use them to some degree for determining the risk of a customer.
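A toy version of such a risk model, assuming scikit-learn and entirely made-up applicant features:

    from sklearn.neural_network import MLPClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    # Made-up features: [income in k$, debt ratio, late payments]; label 1 = defaulted.
    X = [[80, 0.2, 0], [25, 0.9, 4], [60, 0.4, 1], [30, 0.8, 3], [95, 0.1, 0]]
    y = [0, 1, 0, 1, 0]

    # Feature scaling matters for neural nets; a tiny MLP is plenty for a demo.
    model = make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0),
    )
    model.fit(X, y)

    print(model.predict_proba([[45, 0.6, 2]]))   # estimated [no-default, default] probabilities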
I have done some extensive research on using Artificial Neural Networks to classify underwater sound sources. The algorithm seemed to work quite well, especially since I devoted a big portion of the work to figuring out which combination of Fourier transform coefficients made the best feature set for the classification (with Principal Component Analysis).
Anything (seriously):
http://highlevellogic.blogspot.com/2010/09/high-level-logic-rethinking-software.html
The High Level Logic (HLL) Open Source project is about finding and coding the high-level logic under which all other AI (and, in fact, all programming) fits. There are serious, concrete ideas and code. HLL is already an application framework.
