I am looking for a graduation project idea in the AI and machine learning field.
The idea may require a front-end user interface to attract users.
I am thinking about how AI and machine learning can help in daily life.
Any help or hints about interesting new ideas?
Thanks
Edit:
I am talking about practical ideas that may be used in real life, not ideas that only prove theoretical points. Something like an OS (or an add-on to an existing one) that adapts to the way you work, or a word processor that helps you collect information about what you are writing.
What about a project that uses Markov-chain text generation to generate answers for Stack Overflow questions? (^__^)
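For what it's worth, a word-level Markov chain is only a handful of lines in Python. Here is a minimal sketch (corpus.txt is a made-up placeholder for whatever text dump you train on):

import random
from collections import defaultdict

def build_chain(text, order=2):
    # map each `order`-word prefix to the words that follow it in the corpus
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        chain[tuple(words[i:i + order])].append(words[i + order])
    return chain

def generate(chain, length=30):
    # random-walk the chain to produce `length` words of pseudo-text
    key = random.choice(list(chain.keys()))
    out = list(key)
    for _ in range(length):
        followers = chain.get(key)
        if not followers:                      # dead end: restart from a random state
            key = random.choice(list(chain.keys()))
            followers = chain[key]
        out.append(random.choice(followers))
        key = tuple(out[-len(key):])
    return " ".join(out)

with open("corpus.txt") as f:                  # placeholder corpus file
    chain = build_chain(f.read())
print(generate(chain))

Higher `order` values give more grammatical but less original output.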
Two ideas:
Write a current, robust version of SHRDLU with understandable source code.
Write a SHRDLU-like program that manipulates actual code instead of imaginary blocks. Such a tool could be used for manipulating extremely large, complicated programs, including its own code!
Imagine giving commands like the following...
(a) Scan web site X and list any sentences you failed to parse.
(b) Scan document Y and list any grammar rules you didn't need.
(c) Instead of iterating over every element of "proplist" in your "search" function, only process the cdr of "proplist" if the initial call to "lookup" returns nil. After you make the modification, confirm the sentence "pick up a very very big block" will succeed and the sentence "pick up a very and very big block" will fail.
(d) Your "conjoin" grammar currently requires a coordinator word like "and", but that requirement is wrong. Split your "coordination" grammar into "syndetic coordination" and "asyndetic coordination" as follows: conjoins using "and", as in "quickly and quietly, he walked into the bank" are called "syndetic coordinations". Conjoins without a coordinator, as in "quickly, quietly, he walked into the bank" are "asyndetic coordinations". Now scan corpus Z to see if fewer sentences fail to parse.
One component of intelligence is imagination.
It wouldn't take much to Google for "artificial intelligence research projects" and see what other people are doing at other schools. Since it's not a Ph.D., there's no uniqueness requirement for you.
You could also look at Peter Norvig's text to see what's been done before and adapt it.
I'd also recommend doing something with the reams of data that are available to you on the web. Take a look at "Programming Collective Intelligence" and "Beautiful Data" to see how you could use information to teach a program to adapt its behavior based on new information (neural nets, genetic algorithms, ant colony algorithms, etc.).
What interests you?
AI is used in a great many areas, so find something you are passionate about and then see how to use AI for it.
For example, if you are interested in games, you could come up with a more interesting chasing algorithm for the ghosts in Pac-Man and use some more interesting mazes. You may find someone who is interested in doing a 3D project; they could write a 3D version while you make the algorithm more interesting.
Or, you may be interested in robotics. Again, it would be ideal if you could find someone with an interest in making a robot and you could write the AI part. So, for example, you could see if you can figure out how to get a robot to determine the difference between a farm crop and a weed/grass.
Basically, your starting point should be on what really interests you.
Perform clustering on documents, as the former Clusty search engine did. This is clearly an attractive application.
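As a rough sketch of a starting point (assuming scikit-learn is available; the documents below are toy stand-ins for a real crawl), TF-IDF vectors plus k-means already give Clusty-style groupings:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

docs = [
    "machine learning for text classification",
    "deep learning and neural networks",
    "football world cup results",
    "premier league football scores",
]

vec = TfidfVectorizer(stop_words="english")
X = vec.fit_transform(docs)                        # sparse TF-IDF matrix

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
for label, doc in zip(km.labels_, docs):
    print(label, doc)

The interesting graduation-project work would sit on top of this: choosing the number of clusters automatically and labelling each cluster with a readable phrase, which is what made Clusty useful.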
I am programming a vocabulary trainer in Haskell in my spare time.
I have a file with the words, where entries are modeled as an algebraic data type and look like:
Word { _frq=1
, _fra="le"
, _eng="the; him, her, it, them"
, _deu="der,die,das; er,sie,es"
, _uses=[Determiner [], Pronoun []]
, _phrase=" vive la politique, vive l'amour"
, _sentence="long live politics, long live love"
, _satz="Lang lebe die Politik, lang lebe die Liebe."
}
The German translation _deu= and the German sentence _satz= are most often just empty strings, which I want to update within the program.
Now I have a few questions:
1. Is there a database for Haskell that uses Haskell datatypes (I would really like type safety in my database too)?
The things I found were the HDBC bindings to MySQL and the like, and some other XML/JSON stuff.
2. If I update the file instead of using a database, is there a way around recompiling the whole program? It would be a bit tedious to do that.
And a third question:
3. I want to save the learnt vocabulary in a data structure that needs to be updated often: in each learning step I update a number indicating my knowledge of the word, and I sort the structure while inserting (or afterwards). Then I pick a new word based on its position in this structure. Lists seem inefficient because they require a full traversal, and sorting is a big effort if there is a better solution.
One last note: I only have about 5000 entries, so maybe worrying about speed is looking in the wrong place?
Database-wise, take a look at Acid-State. There's also a tutorial for it as part of the Happstack Crash Course.
It does what you ask in terms of maintaining type safety in the model. I'm not sure how useful this'll be for you, but I've put it to use in a couple of web-apps, including here, and here (that second one is part of a benchmarking attempt pitting HDBC against MongoDB and AcidState, so you can use it to see how the three compare implementation-wise in the context of a Haskell web-application).
To your third question, at 5000 inserts/reads, you really shouldn't be worried about performance. If you take a look at those benchmarks I mentioned, the "large" benchmark runs a (relatively small) 50 000 transactions in very short order, and they were meatier insertions than what you seem to be doing.
Check out Persistent from Yesod:
Persistent is Yesod’s answer to data storage- a type-safe, universal data store interface for Haskell.
[...]
Persistent allows us to choose among existing databases that are highly tuned for different data storage use cases, interoperate with other programming languages, and to use a safe and productive query interface, while still keeping the type safety of Haskell datatypes.
Persistent follows the guiding principles of type safety and concise, declarative syntax.
I am developing an application in which users answer maybe 10 questions, each with 3-4 options. After the 10th question, based on the responses, it would need to suggest a certain solution. Since there are hundreds of permutations and combinations, what logic would be required, and what database design?
thanks
EDIT: some more detailed explanation.
Say my application is used to recommend a data plan from various mobile operators, based on the user answering questions such as the time spent on the internet, the type of files being downloaded, and so on. If the response to question 1 was A and to question 2 was C, it would recommend a certain plan. If the response to question 1 was B and to question 2 was C, it would recommend a different plan. With 10 questions, the number of combinations can be quite large. Is there an algorithm that can handle this?
I. What would be the logic?
If I understand correctly, you would define "rules" such as:
If the answer to question 5 is either A or B, then the suggested plan is planB; otherwise execute the rest of the rules.
So you would use a rule engine, e.g. http://www.jboss.org/drools/
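If a full rule engine such as Drools feels heavyweight for 10 questions, the same idea can be sketched as plain data. In this hedged Python sketch (the rules and plan names are invented for illustration), each rule is a set of required answers plus a plan, evaluated in order with a catch-all default:

# Each rule: {question number: set of acceptable answers} plus the plan it implies.
# The first matching rule wins; the empty-condition entry is the default.
RULES = [
    ({5: {"A", "B"}},        "planB"),
    ({1: {"A"}, 2: {"C"}},   "planX"),
    ({1: {"B"}, 2: {"C"}},   "planY"),
    ({},                     "default_plan"),
]

def recommend(answers):
    # answers maps question number -> the option the user picked
    for conditions, plan in RULES:
        if all(answers.get(q) in allowed for q, allowed in conditions.items()):
            return plan

print(recommend({1: "A", 2: "C", 5: "D"}))   # -> planX

Keeping the rules as data (in a table or a config file) rather than hard-coding them means you can add questions and plans without touching the evaluation logic.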
II. What would be the database design?
This is quite simple:
USERS table,
QUESTIONS table and
ANSWERS table which would refer to the two others
Possibly there would be a QUESTIONNAIRE table as well, and the QUESTIONS table would refer to it.
Just a quick comment: consider letting the user see how the recommended companies change as they answer each question.
For example, if I am most interested in price that would be the question I would answer first and immediately see the 3 cheapest plans/products recommended to me.
The second question could be coverage and if I then could see the 3 plans with best coverage (in my area) that would be interesting too.
When I answer the third question, about smartphone features, and say I want internet, then the first list should show the 3 cheapest plans/products that include internet; obviously the recommendations could change.
And so on...
Maybe it also could be a good idea to let the user "dive into" each question and see the full range of options for that answer. As a user I would appreciate that.
The comments above are just how I would like a form to be made for me. I don't want to answer 10 questions about things I don't really care about; each user is different and will prefer to make their choice based on their own questions.
So, based on the above, it would be like a checklist where the top answers are the plans/products with the most matching check marks. To give immediate responses (as the user answers or alters each question), AJAX would probably be your choice.
Just out of curiosity, because I've always wondered this: how does the application Shazam work? I know how to use it; I'm asking in terms of programming. How does the application listen to any part of a song and then give you the results? Obviously it gets its song information from a database, but there is no way someone could enter every single song known to man into that database. Also, how is it that Shazam doesn't seem to need constant updating? New songs are released all the time, yet it is as if Shazam already had future songs programmed into it. This is mind-boggling to me, and I would just like to know how it all works. I know this is not a help question, but could someone please clarify? Thanks!
Shazam only starts with Fourier transforms (which isn't surprising since pretty much all audio processing works this way).
You can read Avery Wang's original paper, if you like. He is the inventor of the Shazam algorithm. I happen to think that it is best explained as a nearest neighbor technique, which is why we included it as an example in Chapter 9 of "Data Mining Techniques, 3rd Edition".
You might be interested in what we have to say there (http://www.amazon.com/Data-Mining-Techniques-Relationship-Management/dp/0470650931/ref=pd_sim_b_5).
They don't say much on the link diciu posted.
The algorithm is based on the Fourier transform, which expresses a mathematical function as a linear sum of harmonic functions. The transform maps between the time and frequency domains, which is exactly what you need for audio recognition. I find it hard to believe that Shazam has a patent on the Fourier transform itself. But if you try to build a "second Shazam" you'll probably struggle, since they have already taken over the market...
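To make the time-to-frequency step concrete, here is a hedged NumPy sketch of the very first stage only (nothing like Shazam's actual patented pipeline): slice the signal into frames, take the magnitude spectrum of each frame with an FFT, and keep the strongest frequency per frame as a crude "landmark". Wang's paper then hashes pairs of such peaks, which is the part this sketch leaves out.

import numpy as np

def crude_landmarks(samples, rate, frame=4096, hop=2048):
    # return (time in seconds, dominant frequency in Hz) for each analysis frame
    window = np.hanning(frame)
    landmarks = []
    for start in range(0, len(samples) - frame, hop):
        chunk = samples[start:start + frame] * window
        spectrum = np.abs(np.fft.rfft(chunk))        # magnitude spectrum
        peak_bin = int(np.argmax(spectrum[1:])) + 1  # skip the DC bin
        landmarks.append((start / rate, peak_bin * rate / frame))
    return landmarks

# two seconds of a fake 440 Hz tone, standing in for real audio
rate = 44100
t = np.arange(2 * rate) / rate
signal = np.sin(2 * np.pi * 440 * t)
print(crude_landmarks(signal, rate)[:3])             # expect peaks near 440 Hz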
How would one design a neural network for the purpose of a recommendation engine? I assume each user would require their own network, but how would you design the inputs and the outputs for recommending an item in a database? Are there any good tutorials or something?
Edit: I was thinking more about how one would design the network, as in how many input neurons to use and how the output neurons point to a record in a database. Would you have, say, 6 output neurons, convert their output to an integer (anything from 0 to 63), and use that as the ID of the record in the database? Is that how people do it?
I would suggest looking into neural networks that use unsupervised learning, such as self-organising maps. It's very difficult to use normal supervised neural networks to do what you want unless you can classify the data very precisely for learning. Self-organising maps don't have this problem because the network learns the classification groups all on its own.
Have a look at this paper, which describes a self-organising-map-based music recommendation system:
http://www.springerlink.com/content/xhcyn5rj35cvncvf/
There are many more papers on the topic on Google Scholar:
http://www.google.com.au/search?q=%09+A+Self-Organizing+Map+Based+Knowledge+Discovery+for+Music+Recommendation+Systems+&ie=utf-8&oe=utf-8&aq=t&rls=com.ubuntu:en-US:official&client=firefox-a&safe=active
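To make the self-organising-map suggestion above a bit more concrete, here is a minimal, hedged training-loop sketch in Python/NumPy (the feature vectors are random placeholders; in a recommender they would be derived from user or item data):

import numpy as np

rng = np.random.default_rng(0)

grid_w, grid_h, dim = 10, 10, 5          # 10x10 map of 5-dimensional prototype vectors
weights = rng.random((grid_w, grid_h, dim))
data = rng.random((200, dim))            # placeholder item/user feature vectors

# grid coordinates of every node, used by the neighbourhood function
coords = np.stack(np.meshgrid(np.arange(grid_w), np.arange(grid_h),
                              indexing="ij"), axis=-1)

for epoch in range(50):
    lr = 0.5 * np.exp(-epoch / 25)       # decaying learning rate
    radius = 3.0 * np.exp(-epoch / 25)   # decaying neighbourhood radius
    for x in data:
        dists = np.linalg.norm(weights - x, axis=2)
        bmu = np.unravel_index(np.argmin(dists), dists.shape)   # best matching unit
        grid_dist = np.linalg.norm(coords - np.array(bmu), axis=2)
        influence = np.exp(-(grid_dist ** 2) / (2 * radius ** 2))
        weights += lr * influence[..., None] * (x - weights)

# after training, items whose feature vectors map to nearby nodes are "similar",
# so recommending neighbours of the user's best matching unit is one simple strategy
print(np.unravel_index(np.argmin(np.linalg.norm(weights - data[0], axis=2)),
                       (grid_w, grid_h)))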
First you have to decide what exactly you are recommending and under what circumstances. There are many things to take into account. Are you going to consider the "other users who bought X also bought Y?" Are you going to only recommend items that have a similar nature to each other? Are you recommending items that have a this-one-is-more-useful-with-that-one type of relationship?
I'm sure there are many more decisions, and each one of them has their own goals in mind. It would be very difficult to train one giant network to handle all of the above.
Neural networks all boil down to the same thing. You have a given set of inputs. You have a network topology. You have an activation function. You have weights on the nodes' inputs. You have outputs, and you have a means to measure and correct error. Each type of neural network might have its own way of doing each of those things, but they are present all the time (to my limited knowledge). Then, you train the network by feeding in a series of input sets that have known output results. You run this training set as much as you'd like without over or under training (which is as much your guess as it is the next guy's), and then you're ready to roll.
Essentially, your input set can be described as a certain set of qualities that you believe have relevance to the underlying function at hand (for instance: precipitation, humidity, temperature, illness, age, location, cost, skill, time of day, day of week, work status, and gender may all play an important role in deciding whether or not a person will go golfing on a given day). You must therefore decide what exactly you are trying to recommend and under what conditions. Your network inputs can be boolean in nature (0.0 being false and 1.0 being true, for instance) or mapped into a pseudo-continuous space (where 0.0 may mean not at all, 0.45 means somewhat, 0.8 means likely, and 1.0 means yes). This second option may give you the tools to express a confidence level for a certain input, or simply a mathematical calculation you believe is relevant.
Hope this helped. You didn't give much to go on :)
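To illustrate the encoding question from the edit above: rather than binary-encoding a record ID across 6 output neurons, a common choice is one output neuron per candidate item, then taking the highest-scoring one as the recommendation. This is a hedged NumPy sketch with untrained random weights and made-up features, only to show the shapes involved:

import numpy as np

rng = np.random.default_rng(0)

# one input row per user/context, e.g. [is_weekend, likes_outdoors, temperature/40, budget/100]
x = np.array([1.0, 0.8, 0.65, 0.3])

item_ids = [101, 205, 307, 412, 518]     # database IDs of the candidate items
n_in, n_hidden, n_out = len(x), 8, len(item_ids)

# in a real system these weights come from training (e.g. backpropagation);
# random values here only demonstrate the forward pass
W1, b1 = rng.normal(size=(n_hidden, n_in)), np.zeros(n_hidden)
W2, b2 = rng.normal(size=(n_out, n_hidden)), np.zeros(n_out)

h = np.tanh(W1 @ x + b1)                         # hidden layer activations
scores = W2 @ h + b2                             # one score per candidate item
probs = np.exp(scores) / np.exp(scores).sum()    # softmax over the items

print("recommend item", item_ids[int(np.argmax(probs))])

One output per item avoids the brittleness of decoding a binary ID (where flipping a single output bit jumps to an unrelated record), at the cost of fixing the set of candidate items in the network architecture.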
I'm writing a genetic programming (GP) system (in C but that's a minor detail). I've read a lot of the literature (Koza, Poli, Langdon, Banzhaf, Brameier, et al) but there are some implementation details I've never seen explained. For example:
I'm using a steady state population rather than a generational approach, primarily to use all of the computer's memory rather than reserve half for the interim population.
Q1. In GP, as opposed to GA, when you perform crossover you select two parents but do you create one child or two, or is that a free choice you have?
Q2. In steady state GP, as opposed to a generational system, what members of the population do the children created by crossover replace? This is what I haven't seen discussed. Is it the two parents, or is it two other, randomly-selected members? I can understand if it's the latter, and that you might use negative tournament selection to choose members to replace, but would that not create premature convergence? (After a crossover event the population contains the two original parents plus two children of those parents, and two other random members get removed. Elitism is inherent.)
Q3. Is there a Web forum or mailing list focused on GP? Oddly I haven't found one. Yahoo's GP group is used almost exclusively for announcements, the Poli/Langdon Field Guide forum is almost silent, and GP discussions on general/game programming sites like gamedev.net are very basic.
Thanks for any help you can provide!
Firstly, relax.
There are no "correct" methods in GP. GP is more art than science. Try lots of schemes and pick the ones that work best.
Q1: 1, 2, or many. You choose.
Q2: Replace 1, 2, or all. Or try some elitism.
Q3: You probably won't find forums discussing these questions because there are no right/best answers. Sorry.
PS. In my research, crossover never really performed well...
If you can read Python, you may want to take a look at Pyevolve. I am mainly involved in it on the GA side, but it has support for GP as well. Maybe you can get some hints there.
Q1 is your choice, but single child would probably be more common. Every time you do the lottery selection of parents, you're applying selection pressure, which is what you want.
Q2: Negative tournament selection is exactly the right approach. Yes, losing low-fitness members of the population causes rapid convergence initially, but once your population gets into the hard-to-search part of the solution space, it won't be as cut-and-dried which ones lose the tournament / lottery. What you do have to beware of is stagnation of the gene pool; I suggest monitoring the entropy of the genome to track its heterogeneity. "elitism is inherent" -- Well, yeah, that's the point! ;-)
Q3: comp.ai.genetic is probably your best bet. Sometimes the topic is picked up in game development fora, like on Gamasutra.
P.S. Genetic programming in C?!? How are you assuring the viability of the offspring? Doing genetic programming in a non-homoiconic language is a real challenge.
Check out MetaOptimize.com for your stacky needs.
As Ray says, it's mostly up to you, but typically in a steady-state setup you would only create a single offspring.
Again you have options. I wouldn't replace the parents. If they've been picked as parents based on their fitness you could be eliminating some of the fittest members of the population. Easiest is just to randomly pick an individual to be replaced. Alternatively, you could replace the least fit individual, but that can lead to premature convergence. Another option is to use the same selection strategy that you use to choose parents but use the inverse fitness so that it favours less fit individuals.
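As a hedged sketch of that last option, in Python rather than C and with the population as a plain list of (individual, fitness) pairs, one steady-state step with a negative tournament for replacement might look like this (crossover, mutate and evaluate stand in for whatever your GP system provides):

import random

def tournament(population, k=3, worst=False):
    # pick k random members and return the index of the best (or worst) of them
    contenders = random.sample(range(len(population)), k)
    key = lambda i: population[i][1]
    return min(contenders, key=key) if worst else max(contenders, key=key)

def steady_state_step(population, crossover, mutate, evaluate):
    # breed one child from two tournament-selected parents,
    # then overwrite a member chosen by a negative tournament
    mum = population[tournament(population)][0]
    dad = population[tournament(population)][0]
    child = mutate(crossover(mum, dad))
    victim = tournament(population, worst=True)   # inverse selection pressure
    population[victim] = (child, evaluate(child))

Raising or lowering the tournament size k is one convenient knob for the selection pressure mentioned earlier.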
You could try comp.ai.genetic on USENET (and Google Groups).
It sounds like some of your questions are not necessarily specific to genetic programming; if that's true, you might have some luck asking the folks over at the NEAT Users Group.
They primarily discuss the Neuroevolution of Augmenting Topologies (or NEAT) algorithm, which is a genetic algorithm used to evolve neural networks. But topics like elitism and crossover strategies are pretty general, and can apply to both GA and GP algorithms.
Otherwise, as Dan and Ray have said, a lot of these decisions are made after experimentation with one's particular software and domain. Try applying your algorithm to different problems and pay attention to how it behaves -- after a while, you'll probably develop an intuition for what works and what doesn't.
I would create an unlimited number of offspring, but only on the basis of success, and let older members of the population die. Lack of fitness can also lead to early death. This just seems to follow a natural order.
Q1. In GP, as opposed to GA, when you perform crossover you select two parents but do you create one child or two, or is that a free choice you have?
Yes, it's your choice; but generally it's not advisable to create many individuals with the same parents, because the variation among individuals created by the same parents is very limited. The processing time and memory could instead be spent on other individuals showing different trends and behaviours that require analysis (although creating more individuals is not a problem if the evolution process is close to reaching its endpoint).
Q2. In steady state GP...
It is advisable to replace individuals based on the ranking provided by the fitness function you have adopted.