I was wondering if you creative minds out there could think of some situations or applications in the web environment where neural networks would be suitable or would provide an interesting spin.
Edit: Some great ideas here. I was thinking more web centric. Maybe bot detectors or AI in games.
To name a few:
Any type of recommendation system (whether it's movies, books, or targeted advertisement)
Systems where you want to adapt behaviour to user preferences (spam detection, for example; see the sketch after this list)
Recognition tasks (intrusion detection)
Computer Vision oriented tasks (image classification for search engines and indexers, specific objects detection)
Natural Language Processing tasks (document/article classification, again search engines and the like)
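For the spam-detection item above, here is a minimal sketch of how such an adaptive classifier might look, using scikit-learn's small feed-forward network (MLPClassifier). The messages and labels are made up purely for illustration.

```python
# Minimal sketch: a tiny feed-forward net that learns to flag spam.
# The messages and labels below are made-up illustration data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neural_network import MLPClassifier

messages = [
    "Win a free iPhone now", "Cheap meds, no prescription",
    "Meeting moved to 3pm", "Lunch tomorrow?",
]
labels = [1, 1, 0, 0]  # 1 = spam, 0 = ham

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(messages)

net = MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000, random_state=0)
net.fit(X, labels)

print(net.predict(vectorizer.transform(["Free prize, click here"])))  # likely [1]
```

In a real system you would retrain (or incrementally update) the model as users mark messages, which is exactly the "adapt to user preferences" angle.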
The game located at 20q.net is one of my favorite web-based neural networks. You could adapt this idea to create a learning system that knows how to play a simple game and slowly learns how to beat humans at it. As it plays human opponents, it records data on game situations, the actions taken, and whether or not the NN won the game. Every time it plays, win or lose, it gets a little better. (Note: don't try this with too simple a game like checkers; an overly simple game can have every possible combination of moves pre-computed, which defeats the purpose of using the NN.)
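A rough sketch of that record-and-retrain loop, assuming game situations and candidate moves can be encoded as feature vectors; the encode() helper and the logged data here are hypothetical.

```python
# Sketch: learn from logged (situation + move, won?) pairs and use the net
# to score candidate moves. encode() is a hypothetical feature encoder.
import numpy as np
from sklearn.neural_network import MLPClassifier

def encode(situation, move):
    # Hypothetical: turn a game situation and a candidate move into numbers.
    return np.array(situation + move, dtype=float)

# Logged games: features of (situation, move) and whether the NN side won.
X = np.array([encode([1, 0, 2], [0]), encode([0, 1, 1], [1]),
              encode([2, 2, 0], [1]), encode([1, 1, 1], [0])])
y = np.array([1, 0, 1, 0])  # 1 = won, 0 = lost

model = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)
model.fit(X, y)

# At play time: score each legal move and pick the most promising one.
situation, legal_moves = [1, 0, 2], [[0], [1]]
scores = [model.predict_proba([encode(situation, m)])[0, 1] for m in legal_moves]
best_move = legal_moves[int(np.argmax(scores))]
print(best_move, scores)
```

After each finished game you would append the new (situation, move, outcome) rows and retrain, which is the "gets a little better every time" part.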
Any sort of classification system based on multiple criteria might be worth looking at. I have heard of some company developing a NN that looks at employee records and determines which ones are the least satisfied or the most likely to quit.
Neural networks are also good for doing certain types of language processing, including OCR or converting text to speech. Try creating a system that can decipher captchas, either from the graphical representation or the audio representation.
If you screen-scrape or accept other sites' item sales info for price comparison, a NN can be used to flag possible errors in the item description for a human to then eyeball.
Often, as one example, computer hardware descriptions portray the wrong capacity, speed, or features. Your NN will learn that, generally, a video card description should not contain a "RAID 10" string. If a trend emerges of adding RAID to GPUs, then your NN will learn this over time as the human reviewer accepts such adverts, teaching the NN that this is now a new class of hardware.
This hardware example can be extended to other industries.
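A rough sketch of that flagging idea, assuming each scraped item carries a description and a listed category; the training data below is invented, and a real system would learn from the human reviewer's accept/reject decisions.

```python
# Sketch: predict a category from the description text and flag items whose
# listed category disagrees with the prediction. Data below is invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

descriptions = ["GeForce GPU 8GB GDDR6 HDMI", "RAID 10 controller 8-port SAS",
                "Radeon graphics card 4GB", "SATA RAID card PCIe"]
categories = ["video card", "raid controller", "video card", "raid controller"]

vec = TfidfVectorizer()
clf = LogisticRegression(max_iter=1000).fit(vec.fit_transform(descriptions), categories)

def flag(description, listed_category):
    predicted = clf.predict(vec.transform([description]))[0]
    return predicted != listed_category  # True means: send to a human to eyeball

print(flag("GPU with RAID 10 support", "video card"))  # probably flagged
```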
Web advertising based on consumer choice prediction
Forecasting a user's Web browsing direction at micro-scale and over the very short term (the current session). This idea is a generalisation of the first one. A user browsing the Web could be offered suggestions for other potentially interesting websites. The suggestions could be relevance-ranked according to a prediction calculated in real time from the user's activity. For instance, a list of proposed links, categories or tags could be displayed as a cloud, with font size indicating rank score. Every click a user makes is an input to the forecasting system, so the forecast is constantly refined to provide the user with suggestions that match their interests as accurately as possible.
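A minimal sketch of that per-click refinement, assuming each click is reduced to a small feature vector and the target is the next category visited; SGDClassifier's partial_fit lets the model be updated click by click. The feature extraction here is a made-up stand-in.

```python
# Sketch: refine a next-category predictor online, one click at a time.
# Feature extraction here is a made-up stand-in for real session features.
import numpy as np
from sklearn.linear_model import SGDClassifier

categories = ["news", "sport", "tech"]
model = SGDClassifier(loss="log_loss")

def features(current_category, seconds_on_page):
    # Hypothetical session features for the current click.
    return np.array([[categories.index(current_category), seconds_on_page]])

# First update must declare all possible classes.
model.partial_fit(features("news", 30), ["sport"], classes=categories)

# Every subsequent click refines the forecast...
model.partial_fit(features("sport", 12), ["tech"])

# ...and the current prediction ranks suggestions (e.g. sizes in a tag cloud).
print(dict(zip(model.classes_, model.predict_proba(features("tech", 5))[0])))
```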
Ignoring the "common web problems" angle of the request and taking the "interesting spin" view instead.
One of the many ways that a NN can be viewed/configured is as a giant self-adjusting, multi-input, multi-output kind of case/flow control.
So when you want to offer match-ups that are fuzzy (not to be confused with fuzzy logic per se, which is another area of maths/computing), a NN may offer a usable alternative.
Say, to save energy, you offer a lift-club site for one-off or regular trips. People enter where they are, where they want to go, and at what time. You sort by city and display the results in a browse control.
Using a NN you could, over time, match transport owners to transport seekers by watching which owners and seekers actually link up, since an owner may not live in the same suburb as a seeker. The NN learns over time what variance in the physical distance between owners and seekers appears to be acceptable, so it can then expand its search area when offering a seeker potential owners.
An idea.
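To make that matching idea concrete, a minimal sketch that learns, from which owner/seeker pairs actually linked up, how much distance tends to be acceptable; the distances and outcomes below are invented.

```python
# Sketch: learn an acceptable distance between owner and seeker from past
# link-ups, then use the predicted acceptance probability to widen the search.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Distance in km between owner and seeker, and whether they linked up.
distance_km = np.array([[1.0], [2.5], [4.0], [7.0], [10.0], [15.0]])
linked_up = np.array([1, 1, 1, 0, 0, 0])

model = LogisticRegression().fit(distance_km, linked_up)

# Offer a seeker any owner whose predicted acceptance probability is high enough.
candidates_km = np.array([[3.0], [6.0], [12.0]])
probs = model.predict_proba(candidates_km)[:, 1]
print([f"{d[0]:.0f} km -> {p:.2f}" for d, p in zip(candidates_km, probs)])
```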
Search! Recognize! Classify! Basically everything search engines do nowadays could benefit from a dose of neural networks and fuzzy logic. This applies in particular to multimedia content (e.g. content-indexing images and videos) since that's where current search technologies are lagging behind.
One thing that always amazes me is that we still don't have any pseudo-intelligent firewalling technology. Something that says "hey, this range of URLs is making too many requests when it isn't supposed to", blocks them, and sends a report to an administrator. That could be done with a neural network.
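A rough sketch of that idea; here an IsolationForest stands in for the neural network (it is simpler because it can be fitted on "normal" traffic only), and the per-client request counts are invented.

```python
# Sketch: flag clients whose request pattern looks abnormal. IsolationForest
# is a simple stand-in for the neural-net idea; the counts are invented.
import numpy as np
from sklearn.ensemble import IsolationForest

# Features per client: [requests per minute, distinct URLs hit per minute]
normal_traffic = np.array([[12, 5], [8, 4], [15, 6], [10, 5], [9, 3]])

detector = IsolationForest(contamination="auto", random_state=0).fit(normal_traffic)

suspect = np.array([[400, 2]])  # hammering the same couple of URLs
if detector.predict(suspect)[0] == -1:
    print("block and report to an administrator")
```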
On the nasty side of things, some virus makers could find lucrative uses for neural networks: adaptive trojans that "recognize" credit card numbers on a hard drive (instead of looking for certain cookies) or that "learn" how to mask themselves from detectors automatically.
I've been having fun trying to implement a bot based on a neural net for the Diplomacy board game, interacting via DAIDE protocols. It turns out to be extremely tricky, so I've turned to XCS to simplify the problem.
Suppose eBay used neural nets to predict how likely a particular item was to sell; predict the best day to list items of that type; suggest a starting price or "Buy It Now" price; or grade your description based on how likely it was to attract buyers. All of those could be useful features, if they worked well enough.
Neural net applications are great for representing discrete choices and the whole behavior of how an individual acts (or how groups of individuals act) when mucking around on the web.
Take news reading for instance:
Back in the olden days, you picked up usually one newspaper (a choice), picked a section (a choice), scanned a page and chose an article (a choice), and read the basics or the entire article (another choice).
Now you choose which news site to visit and continue as above, but now you can drop one paper, pick up another, click on ads, change sections, and keep going with few limits.
The whole use of the web and the choices people make based on their demographics, interests, experience, politics, time of day, location, etc. is a very rich area for NN application. This is especially relevant to news organizations, web page design, ad revenue, and may even be an under explored area.
Of course, it's very hard to predict what one person will do, but put 10,000 of them that are the same age, income, gender, time of day, etc. together and you might be able to predict behavior that will lead to better designs. Imagine a newspaper (or even a game) that could be scaled to people's needs based on demographics. An ad man's dream!
How about connecting users to the closest DNS server, and making sure there are as few hops as possible between the request and the destination?
Friend recommendation in social apps (LinkedIn, Facebook, etc.)
I want to improve IBM's Watson Assistant results.
So, I want to know the algorithm that determines a dialog in Watson Assistant's conversations.
Is it an SVM algorithm?
A paper is welcome.
There are a number of ML/NLP technologies under the covers of Watson Assistant. So it's not just a single algorithm. Knowing them is not going to help you improve your results.
I want to improve IBM's Watson Assistant results.
There are a number of ways.
Representative questions
Focus on getting truly representative questions from the end users: not only in the language that they use, but, if possible, from the same medium you plan to use WA on (e.g. mobile device, web, audio).
This is the first factor that reduces accuracy. Manufacturing intents means you may build an intent that a customer will never ask about (even if you think they will). Second, you will tend to use language/terms with similar patterns, which makes it harder for WA to train.
Total training questions
It's possible to train an intent with one question, but for best results use 10-20 example questions. Where intents are close together, more examples are needed.
Testing
The current process is to run what is called k-fold cross-validation (sample script). If your questions are representative, then the results should give you an accurate indicator of how well it is performing.
However, it is possible to overfit the training, so you should also use a blind set. This is 10-20% of all questions (a random sample) that should never be used to train WA. Run them against the system once it is trained. Your blind-set and k-fold results should fall within 5% of each other.
You can look at the results of the k-fold to fix issues, but you should not do the same with the blind set. Blind sets can go stale as well, so try to create a new blind set after 2-3 training cycles.
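Outside of WA's own sample script, the same k-fold versus blind-set check can be illustrated generically with scikit-learn on a toy intent classifier; the questions and intents below are invented.

```python
# Generic illustration of the k-fold vs blind-set comparison (not the WA script).
# Questions and intents are invented; in practice they come from real users.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline

questions = ["my phone won't turn on", "phone is dead", "reset my password",
             "forgot my password", "how do I pay my bill", "billing question"] * 5
intents = ["phone_issue", "phone_issue", "password", "password", "billing", "billing"] * 5

# Hold back a blind set that is never used for training or error analysis.
train_q, blind_q, train_i, blind_i = train_test_split(
    questions, intents, test_size=0.2, stratify=intents, random_state=0)

pipeline = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))

kfold_acc = cross_val_score(pipeline, train_q, train_i, cv=5).mean()
blind_acc = pipeline.fit(train_q, train_i).score(blind_q, blind_i)

# The two numbers should fall within roughly 5% of each other.
print(f"k-fold: {kfold_acc:.2f}, blind: {blind_acc:.2f}")
```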
End user testing
No matter how well your system is trained, I can guarantee you that new things will pop up when put in front of end users. So you should plan to have users test before you put it into production.
When getting users to test, ensure they understand the general areas it has been trained on. You can do this with user stories, but try not to prime the user into asking a narrowly scoped question.
Example:
"Your phone is not working and you need to get it fixed" - Good. They will ask questions you will never have seen before.
"The wifi on your phone is not working. Ask how you would fix it". - Bad. Very narrow scope and people will mention "wifi" even if they don't know what it means.
I'm not sure if this is the right place to ask this, but here goes.
I have been a programmer for about 12 years now with experience in PHP, Java, C#, VB.NET and ASP. I have always been rather intrigued about Artificial Intelligence. I think it is really the ultimate challenge for any developer.
I have written many simple scripts to play games but nothing compared to what I want to do next. I want to write an AI program that will play an MMORTSG (Massively Multiplayer Online Real-Time Strategy Game). I have been searching through many AI techniques, but none seem to tackle the problems that I know I will face:
Problems I can foresee:
The game has no "win situation"; instead, the best strategy is the one with the greatest growth in comparison to that of other players. Growth is determined by three factors: economy, military and research.
Parts of the game state are unpredictable. Other players can attack me at random.
The game is time-based and actions take time, e.g. constructing a new building may take several hours. While that building is being built, no other buildings can be built.
All the AI systems I have researched require some sort of "winning function" to test whether the AI has reached an end state, whereas in my situation it would more likely be something like "I have options X, Y, Z; the best one is X".
P.S. Sample code would be awesome. Even pseudocode would be great.
I've seen a few applications of Artificial Intelligence in the gaming area, but most of these were for FPS, MMORPG and RTS games. The genre you appear to be describing sounds similar to 'Clash of Clans', where research, military and economic development, as well as random attacks, occur over time, and the game runs for an endless period.
It seems that a model would be used at key points in the game (a building is finished, research becomes available, or the castle is full) to make strategic decisions about progression. Perhaps a genetic algorithm could be applied at key moments to determine a suitable sequence of future steps. A modular neural network could be defined to decide which growth factor to pursue, but training such a network may be difficult as the game rules can change over time (from previously unknown resources, research options, military options and even game updates). Scripts are quite common as well in the MMORPG genre, but defining the manual rules could also be difficult without knowing all of the available options. The fact is that there are so many ways your challenge can be addressed that it would be difficult to give a clear-cut answer to your problem, let alone code or pseudocode.
Looking briefly over the problem, it appears that the contributing factors would be the current economic state, current military state, current research state, time lost if saving for the next upgrade, time required to build the next upgrade, and cost of the upgrade, as well as other unknown factors.
Given that the problem has no clear winning objectives, I guess it is a matter of maintaining a healthy balance between the three growth factors. But how does one define the balance? Is research more important? Should you always have money, or just save enough for the next planned upgrade? Should military be as large as possible?
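Since sample code was requested, here is a hedged sketch of one simple way to score the next upgrade with a hand-tuned utility over those factors; all the names, numbers and weights are hypothetical, and a GA or NN could later be used to tune the weights instead of guessing them.

```python
# Sketch: pick the next upgrade by scoring candidates with a weighted utility.
# Upgrade data and weights are hypothetical; a GA/NN could tune the weights.
candidates = [
    {"name": "barracks", "economy": 0, "military": 5, "research": 0, "cost": 300, "hours": 2},
    {"name": "market",   "economy": 6, "military": 0, "research": 0, "cost": 400, "hours": 3},
    {"name": "lab",      "economy": 0, "military": 0, "research": 4, "cost": 500, "hours": 5},
]

weights = {"economy": 1.0, "military": 0.8, "research": 1.2}

def utility(upgrade, gold_per_hour=100):
    growth = sum(weights[k] * upgrade[k] for k in weights)
    # Penalise expensive, slow upgrades: time spent saving plus build time.
    time_cost = upgrade["cost"] / gold_per_hour + upgrade["hours"]
    return growth / time_cost

best = max(candidates, key=utility)
print(best["name"], round(utility(best), 3))
```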
The challenge that you are placing before yourself is quite adventurous, but I would recommend taking on smaller challenges if you are not yet familiar with the models that AI has to offer. There are quite a number of gaming applications of AI available to inspire your model (including ziggystar's examples noted above).
I need to implement a multi-agent system for an assignment. I have been brainstorming for ideas as to what I should implement, but I did not come up with anything great yet. I do not want it to be a traffic simulation application, but I need something just as useful.
I once saw an application of multiagent systems for studying/simulating fire evacuation plans in large buildings. Imagine a large building with thousands of people; in case of fire, you want these people to follow some rules for evacuating the building. To evaluate the effectiveness of your evacuation plan rules, you may want to simulate various scenarios using a multiagent system. I think it's a useful and interesting application. If you search the Web, you will find papers and works in this area, from which you might get further inspiration.
A few come to mind:
Exploration and mapping: send a team of agents out into an environment to explore, then assimilate all of their observations into consistent maps (not an easy task!)
Elevator scheduling: how to service call requests during peak capacities considering the number and location of pending requests, car locations, and their capacities (not too far removed from traffic-light scheduling, though)
Air traffic control: consider landing priorities (e.g. fuel, number of passengers, emergency conditions, etc.), airplane position and speed, and landing conditions (e.g. number of runways, etc.). Then develop a set of rules so that each "agent" (i.e. airplane) assumes its place in a landing sequence. Note that this is a harder version of the flocking problem mentioned in another reply.
Not sure what you mean by "useful" but... you can always have a look at swarm-based AI (schools of fish, flocks of birds, etc.). Each agent (boid) is very simple in this case. Make the individual agents follow each other, stay away from a predator, etc.
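A minimal boids-style sketch of those rules (cohesion, alignment, separation) using NumPy; all the coefficients and world dimensions are arbitrary illustration values.

```python
# Minimal boids sketch: each agent steers by cohesion, alignment and separation.
# All coefficients are arbitrary illustration values.
import numpy as np

rng = np.random.default_rng(0)
pos = rng.uniform(0, 100, (20, 2))   # 20 boids in a 100x100 world
vel = rng.uniform(-1, 1, (20, 2))

def step(pos, vel, dt=1.0):
    new_vel = vel.copy()
    for i in range(len(pos)):
        offsets = pos - pos[i]
        dist = np.linalg.norm(offsets, axis=1)
        neighbours = (dist > 0) & (dist < 20)
        if neighbours.any():
            cohesion = offsets[neighbours].mean(axis=0) * 0.01
            alignment = (vel[neighbours].mean(axis=0) - vel[i]) * 0.05
            too_close = (dist > 0) & (dist < 5)
            separation = -offsets[too_close].sum(axis=0) * 0.05 if too_close.any() else 0
            new_vel[i] += cohesion + alignment + separation
    return pos + new_vel * dt, new_vel

for _ in range(100):
    pos, vel = step(pos, vel)
print(pos[:3])
```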
It's not quite multi-agent, but have you considered a variation on ant colony optimisation?
http://en.wikipedia.org/wiki/Ant_colony_optimization_algorithms
So, I've just decided to build my own fantasy sports web site.
You know the type of site where you pick players from your favourite league and, depending on how they perform, your team earns a certain number of points. There are fantasy teams for all types of leagues and sports; I'm sure you know what I'm talking about.
I haven't settled on a specific sport or league just yet because I want the basics to fit different types of team-based sports.
I have a few expectations of it myself. If you can come up with any others, I'll be glad to hear them.
I expect the site to be dynamic and receive many visits during a game, but to serve almost only static content otherwise.
Player points should be updated in real-time during a game.
I would need a list that shows each game being played and the points of every player in that game. It should also show minutes played, goals, assists etc.
Each registered user would be able to see the points and players of his/her team updated in real time.
I need the site to scale so that if I start with 1000 teams I could end up with 5 million.
I probably won't be needing language support right now, but who knows in the future.
Based on these prerequisites, what would be best to use in terms of language/platform (PHP, .NET, Drupal or other CMSs), database (MySQL, SQL Server, XML) and other techniques?
Maybe it doesn't really matter what I use?
I guess the dynamic and real time update of each player's points is where I need help the most.
Thanks in advance!
/Niklas
EDITED
I could use an array with the following data for a specific game week:
Player ID
Minutes played
Sport-specific points (goals, assists, penalties, yellow cards, man-of-the-match bonus), etc.
Total points in current game week
When the game is over I'd add these to a DB and sum this data with any previous game weeks, plus player value, number of teams that have selected this player, etc.
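A minimal sketch of that per-game-week record and the end-of-game aggregation, shown in Python for illustration even though the site itself may end up in PHP/.NET; the field names mirror the list above and the scoring rules are placeholders.

```python
# Sketch of the per-game-week record described above and how it could be
# summed into season totals once a game finishes. Scoring is a placeholder.
from dataclasses import dataclass

@dataclass
class GameWeekStats:
    player_id: int
    minutes_played: int
    goals: int = 0
    assists: int = 0
    penalties: int = 0
    yellow_cards: int = 0
    motm_bonus: int = 0

    def total_points(self):
        # Hypothetical scoring rules; tune per sport/league.
        return (self.goals * 4 + self.assists * 3 + self.motm_bonus * 2
                - self.penalties - self.yellow_cards)

season_totals = {}  # player_id -> running points; in production this lives in the DB

def finish_game_week(stats_list):
    for s in stats_list:
        season_totals[s.player_id] = season_totals.get(s.player_id, 0) + s.total_points()

finish_game_week([GameWeekStats(player_id=7, minutes_played=90, goals=2, assists=1)])
print(season_totals)
```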
You are probably going to have to go down the custom route for your "game" code rather than using a CMS, although depending on your experience you may be able to leverage a framework (e.g. CodeIgniter) to speed up some of your development time.
This type of site would be pretty language-agnostic; however, the most scalable solution / set of techniques to deploy would depend on the actual number of users you are looking at.
One of the biggest considerations you are going to have to look at would be the design of the data model, and the platform that this sits on.
If you want to process near-real-time updates, you are going to want to focus your efforts on making the DB queries / processing as efficient as possible.
One big consideration that you have not discussed here is caching. There is some data on your site that I am sure will be static for long periods of time (such as weekly totals etc), and there is data that will be very much real time (but only during match days).
However, during match days you will have a lot more traffic than on non-match days, and therefore a lot of requests for the same data in a short period of time, so employing a good caching strategy will save you masses of CPU power. What I am thinking of is calculating a player's score and then caching it for one minute at a time, so that each time that specific player is requested you retrieve the score from the cache rather than recalculating it.
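A minimal sketch of that one-minute caching idea, shown in Python for illustration (the same pattern applies with memcached/Redis in whatever stack you choose); the score calculation is a placeholder.

```python
# Sketch: cache each player's computed score for 60 seconds so match-day
# requests hit the cache instead of recalculating. Score logic is a placeholder.
import time

CACHE_TTL = 60          # seconds
_cache = {}             # player_id -> (expires_at, score)

def calculate_score(player_id):
    # Placeholder for the expensive DB query / points calculation.
    return player_id * 2

def get_player_score(player_id):
    now = time.time()
    entry = _cache.get(player_id)
    if entry and entry[0] > now:
        return entry[1]                       # served from cache
    score = calculate_score(player_id)
    _cache[player_id] = (now + CACHE_TTL, score)
    return score

print(get_player_score(7))  # computed
print(get_player_score(7))  # cached for the next minute
```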
I need help choosing a project to work on for my master's graduation. The project must involve AI / machine learning or business intelligence, but suggestions outside these topics are OK too. Please help me.
One of the most rapid growing areas in AI today is Computer Vision. There are many practical needs where the results of your Master Thesis can be helpful. You can try research something like Emotion Detection, Eye-Tracking, etc.
An appropriate work for an MS in CS at any good university can highlight the current status of research in this field and compare different approaches and algorithms. As a practical part, it is also a lot of fun when your program recognizes your mood properly :)
Netflix
If you want to work on a non-trivial dataset (not Google-sized, but not trivial either, and with a real application) with an objective measure of success, why not work on the Netflix challenge (the first one)? You can get all the data for free, there are many papers on it, and you have a pretty good way to compare your results against other people's (since everyone used exactly the same dataset, and it was not so easy to "cheat", contrary to what happens quite often in the academic literature). While not trivial in size, you can work on it with only one computer (assuming it is recent enough), and depending on the type of algorithms you are using, you can implement them in a language other than C/C++, at least for prototyping (for example, I could get decent results doing things entirely in Python).
Bonus point: it passes the "family" test; it's easy to tell your parents what you are working on, which is always a pain in my experience :)
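As a taste of the kind of approach that did well on that challenge, here is a very small matrix-factorisation-by-SGD sketch on made-up ratings; the real dataset is obviously far larger, but the core update rule is the same.

```python
# Tiny matrix-factorisation sketch (SGD) for Netflix-style rating prediction.
# Ratings below are made up; the real dataset has ~100 million of them.
import numpy as np

ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 2, 1.0), (2, 1, 2.0)]
n_users, n_items, k = 3, 3, 2

rng = np.random.default_rng(0)
U = rng.normal(0, 0.1, (n_users, k))   # user latent factors
V = rng.normal(0, 0.1, (n_items, k))   # item latent factors

lr, reg = 0.05, 0.02
for epoch in range(200):
    for u, i, r in ratings:
        err = r - U[u] @ V[i]
        U[u] += lr * (err * V[i] - reg * U[u])
        V[i] += lr * (err * U[u] - reg * V[i])

# Predict an unseen (user, item) pair.
print(round(float(U[0] @ V[2]), 2))
```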
Music-related tasks
A bit more original: something that is cool, non-trivial, but not too complicated in terms of data handling is anything around music, like music genre recognition (classical / electronic / jazz / etc.). You would need to know about signal processing as well, though; I would not advise it if you cannot get easy access to professors who know about the topic.
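A rough sketch of the usual first step for genre recognition: turn each track into MFCC summary features and train a plain classifier on them. librosa is assumed installed, and the file paths and labels are placeholders.

```python
# Sketch: MFCC summary features per track + a plain classifier.
# File paths and labels are placeholders; librosa is assumed installed.
import numpy as np
import librosa
from sklearn.svm import SVC

def track_features(path):
    y, sr = librosa.load(path, duration=30)          # first 30 seconds
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
    return mfcc.mean(axis=1)                         # 13-dim summary per track

paths = ["classical_01.wav", "jazz_01.wav", "electronic_01.wav"]  # placeholders
labels = ["classical", "jazz", "electronic"]

X = np.array([track_features(p) for p in paths])
clf = SVC().fit(X, labels)

print(clf.predict([track_features("unknown_track.wav")]))
```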
I can use the same answer I used on a previous, similar question:
Russ Greiner has a great list of project topics for his machine learning course, so that's a great place to start.
Both GAs and ANNs are learners/classifiers. So I ask you the question, what is an interesting "thing" to learn? Maybe it's:
Detecting cancer
Predicting the outcome between two sports teams
Filtering spam
Detecting faces
Reading text (OCR)
Playing a game
The sky is the limit, really!
Since it has a business tie-in: given some input set, determine probable business fraud from the input (something the SEC seems challenged in doing); we now have several examples (Madoff and others). Or a system to estimate investment risk (there are apparently lots of such systems, but were any accurate in the case of Lehman, for example?).
A starting point might be the Chen book Genetic Algorithms and Genetic Programming in Computational Finance.
Here's an AAAI writeup of an award to the National Association of Securities Dealers for a system that monitors NASDAQ insider trading.
Many great answers have been posted already, but I wanted to add my 2 cents. There is one hot topic into which big companies all around are investing lots of resources, and which is still very challenging with lots of potential: automated detection of fake news.
This is even more relevant nowadays where most of us are connecting though social media and there's a huge crisis looming over.
Fake news, content removal, source reliability... The problem is huge and very exciting. It is, as I said, challenging, as it can be approached from many perspectives (from analysing images to detect fakes using adversarial networks, to detecting fake written news based on text content (NLP), to using graph theory to find sources), and the possibilities for a research project are endless.
I suggest you read some general articles (e.g. this or this) or have a look at research articles from the last couple of years (a quick Google search will turn up a lot of related material).
I wish I had the opportunity to start a project based on this topic. I think it's going to be of the utmost relevance in the next few years.