Let's say we're programming a snake game.
The game has a 20x20 playing field, which means there are 400 individual cells.
In Snake, a new apple has to appear on a random unoccupied cell.
There are two common ways to do this (both sketched in C right after this list):
Try to place the apple on a random cell, and repeat until you hit a free one
Build a list of all free cells, and randomly choose one of them to place the apple on
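For concreteness, here is a minimal C sketch of both approaches. The grid representation and the helper names (occupied, place_apple_*) are illustrative assumptions, not part of the question:

    #include <stdbool.h>
    #include <stdlib.h>

    #define W 20
    #define H 20

    /* true if the snake currently occupies cell (x, y) */
    bool occupied[W][H];

    /* Approach 1: rejection sampling. Keep picking random cells until a
     * free one turns up. (Loops forever if no cell is free.) */
    void place_apple_rejection(int *ax, int *ay) {
        do {
            *ax = rand() % W;
            *ay = rand() % H;
        } while (occupied[*ax][*ay]);
    }

    /* Approach 2: build a list of all free cells, then pick one uniformly. */
    void place_apple_list(int *ax, int *ay) {
        int free_x[W * H], free_y[W * H], n = 0;
        for (int x = 0; x < W; x++)
            for (int y = 0; y < H; y++)
                if (!occupied[x][y]) {
                    free_x[n] = x;
                    free_y[n] = y;
                    n++;
                }
        int pick = rand() % n;    /* assumes at least one free cell */
        *ax = free_x[pick];
        *ay = free_y[pick];
    }

    int main(void) {
        int x, y;
        place_apple_rejection(&x, &y);   /* on an empty board this returns immediately */
        place_apple_list(&x, &y);
        return 0;
    }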
Most people on here would recommend the second approach because of "better performance", and I thought the same. But after taking statistics in my computer science class, I'm not so sure anymore:
At the start of the game, the first approach should logically be faster, because it would very likely find a free cell instantly, whereas the second approach has the overhead of building a list of free cells.
A little performance test in JS confirms this.
At the end of the game (which isn't reached often), when there is only one free cell left, the second approach would probably win in speed, because it always finds the cell in one go. The first approach needs far more tries; using logarithms we can calculate how many.
50% of the time, it takes at most 277 tries. 90% of the time, it takes at most 920 tries. 99.9999% of the time, it takes at most 5520 tries. And so on: the number of tries needed is log base 399/400 of (1 - p), where p is the desired probability. A few thousand extra tries are nothing for a modern computer, so it should only be a little slower. This is confirmed by another performance test:
0%-4% slower on average ... negligible.
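Spelling out the calculation behind those try counts: with a single free cell out of 400, each independent try succeeds with probability 1/400, so for a desired success probability p,

    P(\text{found within } k \text{ tries}) = 1 - \left(\tfrac{399}{400}\right)^{k} \ge p
    \quad\Longleftrightarrow\quad
    k \ge \log_{399/400}(1 - p).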
And most of the time, most cells are free, which means the first approach is way faster on average.
On top of that, in many languages, for example in C, the first approach would be shorter in terms of code: there is no need for a second array.
This brings me to the conclusion that randomly choosing a cell until you find a free one is the best practice, and that creating a list of empty cells is premature optimization which actually does the opposite (it makes performance worse on average because of the added overhead).
Do you agree? Did I miss something?
What's the best practice in your opinion and why?
I encountered an interview question where, given an array of integers, I needed to find a pair of elements that adds up to a given sum.
The first solution that came to me was to check all possible pairs, which takes about O(n^2) time. The interviewer asked me to improve the running time, so I suggested sorting the array and then doing a binary search, but that is still O(n log n).
Overall I failed to come up with the O(n) solution. Googling afterwards, I learned that it can be achieved with extra memory using a set.
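For reference, a minimal C sketch of that set-based O(n) approach. C has no built-in set, so this uses a small open-addressing hash table; the table size and hash constant are arbitrary illustrative choices:

    #include <stdbool.h>
    #include <stdio.h>
    #include <string.h>

    #define TABLE_SIZE 1024   /* power of two, comfortably larger than 2*n */

    typedef struct { int key; bool used; } Slot;

    static size_t hash_int(int x) {
        return ((size_t)x * 2654435761u) & (TABLE_SIZE - 1);
    }

    static bool contains(const Slot *t, int key) {
        size_t i = hash_int(key);
        while (t[i].used) {
            if (t[i].key == key) return true;
            i = (i + 1) & (TABLE_SIZE - 1);
        }
        return false;
    }

    static void insert(Slot *t, int key) {
        size_t i = hash_int(key);
        while (t[i].used) {
            if (t[i].key == key) return;
            i = (i + 1) & (TABLE_SIZE - 1);
        }
        t[i].used = true;
        t[i].key = key;
    }

    /* Returns true if two elements of a[] add up to target: one pass,
     * checking whether the complement of each element was seen before. */
    bool has_pair_with_sum(const int *a, size_t n, int target) {
        Slot table[TABLE_SIZE];
        memset(table, 0, sizeof table);
        for (size_t i = 0; i < n; i++) {
            if (contains(table, target - a[i])) return true;
            insert(table, a[i]);
        }
        return false;
    }

    int main(void) {
        int a[] = {3, 8, 1, 5, 9};
        printf("%d\n", has_pair_with_sum(a, 5, 9));   /* prints 1: 8 + 1 == 9 */
        return 0;
    }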
I know that there cannot be any fixed rules for thinking about algorithms, but I am optimistic and think there must be some heuristic or mental model for thinking about algorithms on arrays. I want to know if there is any generic strategy, or array-specific way of thinking, that would help me explore the solution space rather than getting stuck.
Generally, think about how to do it naively first. If in an interview, make clear what you are doing, say "well the naive algorithm would be ...".
Then see if you can spot any repeated work or redundant steps. Interview questions tend to be a bit unrealistic, mathematical special-case type questions. Real problems more often come down to using hash tables or sorted arrays. A sort is O(N log N), but it makes all subsequent searches O(log N), so it's usually worth sorting data. If the data is dynamic, keep it sorted via a binary search tree (C++ "set").
Secondly, can you "divide and conquer" or "build up"? Is the N = 2 case trivial? In that case, can we divide N = 4 into two N = 2 cases plus an integration step? You might need to divide the input into two groups, low and high, in which case it is "divide and conquer", or you might need to start with random pairs, then merge into fours, eights and so on, in which case it is "build up".
If the problem is geometrical, can you exploit local coherence? If the problem is realistic rather than mathematical, are there typical inputs you can exploit (real travelling salesmen don't travel between cities on a random grid, but over a hub-and-spoke transport system with fast roads connecting major cities and slow roads branching out to customer destinations)?
Which would be the best method for uniformly distributing values into buckets? The values are generated using a Gaussian distribution, so most values are near the median.
I am implementing bucket sort in CUDA. Since most of the values are generated near the median, they end up in only 4-5 buckets. I can make a large number of buckets and would like to distribute the values evenly across all or most buckets instead of just 3-4 of them.
It seems you're looking for a histogram.
If you are looking for performance, use the CUB or Thrust libraries, as the two comments point out; otherwise you'll end up spending a lot of time and still not reach their performance levels.
If you do decide to implement the histogram yourself, I recommend starting with the simplest implementation: a two-step approach. In the first step you count the number of elements that fall into each bucket, so you can create the container structure with the right array sizes. The second step simply copies the elements into the corresponding array of the structure.
From there, you can evolve to more complex versions, for example using a prefix sum to calculate the starting position of each bucket within one large array.
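A plain C sketch of that count-then-scatter structure (a real CUDA version would parallelize each pass, e.g. with atomic increments, a parallel prefix sum, or CUB's histogram primitives, but the shape is the same). The bucket count and the bucket_of() mapping are illustrative assumptions:

    #include <stdio.h>

    #define NUM_BUCKETS 16

    /* Map a value in [min, max] to a bucket index. */
    static int bucket_of(float v, float min, float max) {
        int b = (int)((v - min) / (max - min) * NUM_BUCKETS);
        if (b < 0) b = 0;
        if (b >= NUM_BUCKETS) b = NUM_BUCKETS - 1;
        return b;
    }

    /* Two-pass bucket fill:
     *   pass 1: count how many elements land in each bucket,
     *   prefix sum: turn the counts into per-bucket start offsets,
     *   pass 2: scatter each element into its bucket's slice of out[]. */
    void fill_buckets(const float *in, int n, float min, float max,
                      float *out, int start[NUM_BUCKETS + 1]) {
        int count[NUM_BUCKETS] = {0};
        int cursor[NUM_BUCKETS];

        for (int i = 0; i < n; i++)                  /* pass 1: count */
            count[bucket_of(in[i], min, max)]++;

        start[0] = 0;                                /* exclusive prefix sum */
        for (int b = 0; b < NUM_BUCKETS; b++)
            start[b + 1] = start[b] + count[b];

        for (int b = 0; b < NUM_BUCKETS; b++)
            cursor[b] = start[b];

        for (int i = 0; i < n; i++) {                /* pass 2: scatter */
            int b = bucket_of(in[i], min, max);
            out[cursor[b]++] = in[i];
        }
    }

    int main(void) {
        float data[] = {0.10f, 0.50f, 0.52f, 0.48f, 0.90f, 0.51f};
        float bucketed[6];
        int start[NUM_BUCKETS + 1];
        fill_buckets(data, 6, 0.0f, 1.0f, bucketed, start);
        for (int b = 0; b < NUM_BUCKETS; b++)
            if (start[b + 1] > start[b])
                printf("bucket %d holds %d elements\n", b, start[b + 1] - start[b]);
        return 0;
    }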
The application is bound by memory traffic (there is almost no arithmetic workload), so try to improve locality and access patterns as much as you can.
Of course, check out the open source code to get some ideas.
Here is the problem:
You have a list of ingredients (assume each has unit value) with
their respective quantities, and a list of products. Each product has
a price and a recipe that lists the needed ingredients and their
quantities.
You need to maximize the total proceeds from those products with
the given ingredients.
The first thing that came to mind is to compute a price/(number of needed items) ratio and start making the products with the highest ratio. I know that this is a kind of greedy algorithm (if I'm not wrong) and doesn't always lead to the best solution, but I had no other implementable ideas.
Another way might be to brute-force all the possibilities, but I can't see how to implement it; I'm not very familiar with brute-forcing. My first brute-force algorithm was this one, but it was easy because it worked on plain numbers and, furthermore, the next element was not precluded by the previous elements.
Here things are different, because the next choice depends on the available ingredients, which are affected by the previously made products, and so on.
Do you have any hints? This is a kind of homework, so I'd prefer not a direct solution but something to start from!
The language I have to use is C
Many thanks in advance :)
I would first try looking at this as a linear programming problem; there are algorithms available to solve them efficiently.
If your problem can't accept a fractional number of items, then it is actually an integer programming problem. There are algorithms available to solve these as well, but in general it can be difficult (as in time-consuming) to solve large integer programming problems exactly.
Note that a linear programming solution may be a good first approximation to an integer programming solution, e.g. if your production quantities are large.
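For reference, the integer-programming formulation is small. The symbols here are illustrative: x_j is how many units of product j to make, p_j its price, a_ij the amount of ingredient i its recipe needs, and b_i the stock of ingredient i on hand:

    \begin{aligned}
    \text{maximize}   \quad & \sum_j p_j\, x_j \\
    \text{subject to} \quad & \sum_j a_{ij}\, x_j \le b_i \quad \text{for every ingredient } i, \\
                            & x_j \in \mathbb{Z}_{\ge 0}  \quad \text{for every product } j.
    \end{aligned}

Dropping the integrality constraint on x_j gives the linear-programming relaxation mentioned above.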
If you have the CPU cycles to do it (and efficiency doesn't matter), brute force is probably the best way to go, because it's the simplest and also guaranteed to always (eventually) find the best answer.
Probably the first thing to do is figure out how to enumerate your options -- i.e. come up with a way to list all the different possible combinations of pastries you could make with the given ingredients. Don't worry about prices at first.
As a (contrived) example, with a cup of milk and a dozen eggs and some flour and sugar, I could make:
12 brownies
11 brownies and 1 cookie
10 brownies and 2 cookies
[...]
1 brownie and 11 cookies
12 cookies
Then once you have that list, you can iterate over the list, calculate how much money you would make on each option, and choose the one that makes the most money.
As far as generating the list of options goes, I would start by calculating how many cookies you could make if you were to make only cookies; then how many brownies you could make if you were to make only brownies, and so on. That will give you an absolute upper bound on how many of each item you ever need to consider. Then you can just consider every combination of items with per-type-numbers less than or equal to that bound, and throw out any combinations that turn out to require more ingredients than you have on hand. This would be really inefficient and slow, of course, but it would work.
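A small C sketch of that enumerate-and-score idea, since the question mentions C. The products, recipes, prices and stock below are made-up placeholders; the recursion simply tries every feasible count of each product and keeps the best total revenue:

    #include <stdio.h>

    #define NUM_PRODUCTS    2
    #define NUM_INGREDIENTS 3

    /* need[p][i]: amount of ingredient i used by one unit of product p.
     * Every product must need at least one ingredient, or the count loop
     * below would never terminate. */
    static const int need[NUM_PRODUCTS][NUM_INGREDIENTS] = {
        {2, 1, 0},    /* product 0 */
        {1, 0, 3},    /* product 1 */
    };
    static const int price[NUM_PRODUCTS] = {5, 4};

    /* Try every count of product p that the stock allows, recurse on the
     * remaining products, and keep the best total revenue. */
    static int best_revenue(int p, const int stock[NUM_INGREDIENTS]) {
        if (p == NUM_PRODUCTS)
            return 0;

        int best = 0;
        for (int count = 0; ; count++) {
            int feasible = 1;
            for (int i = 0; i < NUM_INGREDIENTS; i++)
                if (count * need[p][i] > stock[i]) { feasible = 0; break; }
            if (!feasible)
                break;

            int remaining[NUM_INGREDIENTS];
            for (int i = 0; i < NUM_INGREDIENTS; i++)
                remaining[i] = stock[i] - count * need[p][i];

            int revenue = count * price[p] + best_revenue(p + 1, remaining);
            if (revenue > best)
                best = revenue;
        }
        return best;
    }

    int main(void) {
        int stock[NUM_INGREDIENTS] = {10, 4, 9};
        printf("best revenue: %d\n", best_revenue(0, stock));    /* 28 */
        return 0;
    }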
This might not be the proper place to ask this question, but I didn't find a better one. I have a program that has, for example, 10 parameters. Every time I run it, it can lead to one of 3 results: 0, 0.5 or 1. I don't know how the parameters influence the final result. I need something to improve my program little by little so that it gets more 1s and fewer 0s.
First, just to get the terminology right, this is really a "search" problem, not a "machine learning" problem (you're trying to find a very good solution, not trying to recognize how inputs relate to outputs). Your problem sounds like a classic "function optimization" search problem.
There are many techniques that can be used. The right one depends on a few different factors, but the biggest one is the size and shape of the solution space. The key question there is "how sensitive is the output to small changes in the inputs?" If you hold all the inputs except one the same and make a tiny change, are you going to get a huge change in the output or just a small change? Do the inputs interact with each other, especially in complex ways?
The smaller and "smoother" the solution space (that is, the less sensitive it is to tiny changes in inputs), the more you would want to pursue straightforward statistical techniques, guided search, or perhaps, if you wanted something a little more interesting, simulated annealing.
The larger and more complex the solution space, the more that would guide you towards either more sophisticated statistical techniques or my favorite class of algorithms, genetic algorithms, which can search a large solution space very rapidly.
Just to sketch out how you might apply genetic algorithms to your problem, let's assume that the inputs are independent from each other (a rare case, I know; a compact C sketch of these steps also follows the list):
Create a mapping to your inputs from a series of binary digits 0011 1100 0100 ...etc...
Generate a random population of some significant size using this mapping
Determine the fitness of each individual in the population (in your case, "count the 1s" in the output)
Choose two "parents" by lottery:
For each half-point in the output, an individual gets a "lottery ticket" (in other words, an output that has 2 "1"s and 3 "0.5"s will get 7 "tickets" while one with 1 "1" and 2 "0.5"s will get 4 "tickets")
Choose a lottery ticket randomly. Since "more fit" individuals will have more "tickets" this means that "more fit" individuals will be more likely to be "parents"
Create a child from the parents' genomes:
Start copying one parent's genome from left to right: 0011 11...
At every step, switch to the other parent with some fixed probability (say, 20% of the time)
The resulting child will have some amount of one parent's genome and some amount of the other's. Because the child was created from "high fitness" individuals, it is likely that the child will have a fitness higher than the average of the current generation (although it is certainly possible that it might have a lower fitness)
Replace some percentage of the population with children generated in this manner
Repeat from the "Determine fitness" step... In the ideal case, every generation will have an average fitness that is higher than the previous generation and you will find a very good (or maybe even ideal) solution.
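Here is a compact C sketch of those steps, just to make the mechanics concrete. The genome length, population size, crossover probability and the stand-in evaluate() are all assumptions; in the real setting evaluate() would decode the bits into the 10 parameters, run the black-box program a few times and add up the half-points:

    #include <stdio.h>
    #include <stdlib.h>

    #define GENOME_BITS 40
    #define POP_SIZE    50
    #define GENERATIONS 100
    #define SWITCH_PROB 0.2    /* chance of switching parents at each bit */

    typedef struct { unsigned char bits[GENOME_BITS]; double fitness; } Individual;

    /* Stand-in fitness: count the 1 bits. Replace with the real program. */
    static double evaluate(const Individual *ind) {
        double score = 0;
        for (int i = 0; i < GENOME_BITS; i++) score += ind->bits[i];
        return score;
    }

    /* Lottery ("roulette wheel") selection: fitter individuals hold more
     * tickets, so they are more likely to be picked as parents. */
    static const Individual *lottery_pick(const Individual *pop, double total) {
        double ticket = (double)rand() / RAND_MAX * total, acc = 0;
        for (int i = 0; i < POP_SIZE; i++) {
            acc += pop[i].fitness;
            if (acc >= ticket) return &pop[i];
        }
        return &pop[POP_SIZE - 1];
    }

    int main(void) {
        static Individual pop[POP_SIZE], next[POP_SIZE];
        srand(42);

        for (int i = 0; i < POP_SIZE; i++)          /* random initial population */
            for (int b = 0; b < GENOME_BITS; b++)
                pop[i].bits[b] = rand() % 2;

        for (int gen = 0; gen < GENERATIONS; gen++) {
            double total = 0;
            for (int i = 0; i < POP_SIZE; i++) {    /* determine fitness */
                pop[i].fitness = evaluate(&pop[i]);
                total += pop[i].fitness;
            }

            for (int i = 0; i < POP_SIZE; i++) {    /* breed the next generation */
                const Individual *a = lottery_pick(pop, total);
                const Individual *b = lottery_pick(pop, total);
                const Individual *src = a;
                for (int bit = 0; bit < GENOME_BITS; bit++) {
                    if ((double)rand() / RAND_MAX < SWITCH_PROB)
                        src = (src == a) ? b : a;   /* crossover: switch parent */
                    next[i].bits[bit] = src->bits[bit];
                }
            }
            for (int i = 0; i < POP_SIZE; i++) pop[i] = next[i];
        }

        double best = 0;
        for (int i = 0; i < POP_SIZE; i++) {
            double f = evaluate(&pop[i]);
            if (f > best) best = f;
        }
        printf("best fitness after %d generations: %.0f\n", GENERATIONS, best);
        return 0;
    }

This sketch replaces the whole population each generation and has no mutation step, exactly as described above; in practice you would usually keep the best individuals and mutate a few bits to avoid getting stuck.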
Are you just trying to modify the parameters so the results come out to 1? It sounds like the program is a black box where you pick the input parameters and then see the results. Since that is the case, I think it would be best to choose a range for each input parameter, cycle through those inputs, and view the outputs to try to discern a pattern. If you can automate it, that will help a lot; a small sketch of such a sweep follows. After you run through the data you may be able to spot-check which parameters give you which results, or you could apply some machine-learning techniques to determine which parameters lead to which outputs.
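A minimal sketch of that sweep in C, assuming the black box can be wrapped in a function; run_program() below is a hypothetical stand-in, and the number of parameters and the level grid are illustrative:

    #include <stdio.h>

    #define NUM_PARAMS 3     /* the real program has 10; kept small here */
    #define NUM_LEVELS 5     /* how many values to try per parameter */

    /* Hypothetical stand-in for the black-box program: returns 0, 0.5 or 1. */
    static double run_program(const double p[NUM_PARAMS]) {
        double s = p[0] + p[1] - p[2];
        return s > 1.0 ? 1.0 : (s > 0.0 ? 0.5 : 0.0);
    }

    int main(void) {
        static const double levels[NUM_LEVELS] = {0.0, 0.25, 0.5, 0.75, 1.0};
        double params[NUM_PARAMS], best[NUM_PARAMS];
        double best_score = -1.0;

        /* Cycle through every combination of levels and remember which
         * parameter settings produced the best output. */
        for (int a = 0; a < NUM_LEVELS; a++)
            for (int b = 0; b < NUM_LEVELS; b++)
                for (int c = 0; c < NUM_LEVELS; c++) {
                    params[0] = levels[a];
                    params[1] = levels[b];
                    params[2] = levels[c];
                    double score = run_program(params);
                    if (score > best_score) {
                        best_score = score;
                        for (int i = 0; i < NUM_PARAMS; i++) best[i] = params[i];
                    }
                }

        printf("best result %.1f at (%.2f, %.2f, %.2f)\n",
               best_score, best[0], best[1], best[2]);
        return 0;
    }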
As Larry said, this looks like a combinatorial search, and the solution will depend on the "topology" of the problem.
If you can, try to get The Algorithm Design Manual (S. Skiena); it has a chapter on this that can help you determine a good method for this problem...
I have written an artificial neural network (ANN) implementation for myself (it was fun). I am now thinking about where I can use it.
What are the key areas in the real world, where ANN is being used?
ANNs are an example of a "learning" system, one that "trains" on input data (in some domain) in order to effectively classify (unseen) data in that domain. They've been used for everything from character recognition to computer games and beyond.
If you're trying to find a domain, pick some topic or field that interests you, and see what kinds of classification problems exist there.
Most often for classifying noisy inputs into fixed categories, like handwritten letters into their equivalent characters, spoken voice into phonemes, or noisy sensor readings into a set of fixed values. Usually, the set of categories is small (26 letters, a couple of dozen phonemes, etc.).
Others will point out how all these things are better done with specialized algorithms....
I once wrote an ANN to predict the stock market. It succeeded with about 80% accuracy.
The key here was to first get hold of a couple of million rows of real stock data. I used this data to train the network and prime it for real data. There were about 8-10 input variables and a single output value that would indicate the predicted value of the stock on the next day.
You could also check out the (ancient) ALVINN network where a car learnt to drive by itself by observing road data when a human driver was behind the wheel.
ANNs are also widely used in bioinformatics.