Generating Strings - c

I am building a distributed password cracker that uses brute force, so I need every possible string.
To distribute the work, the server will give each client a range of strings, for example from "aaaa" to "bxyz". I am assuming a string length of four, so I need to check every string between these two bounds.
I am trying to generate these strings in C. I have tried to work out the logic but keep failing, and searching Google did not help. Any ideas?
EDIT
Sorry, I would like to clarify.
I want the strings within a range, say between aaaa and aazz: aaaa aaab aaac aaad ..... aazx aazy aazz. My character set is just the upper- and lowercase English letters, so 52 characters. I want to check every combination of 4 characters, but the server will distribute ranges of strings among its clients. My question is: if one client gets the range between aaaa and aazz, how do I generate only the strings between these bounds?

If your strings contain only single-byte characters (extended ASCII), you'll have, as an upper limit, 256 characters, or 2^8 characters.
Since your strings are 4 characters long, you'll have 2^8 * 2^8 * 2^8 * 2^8 combinations,
or (2^8)^4 = 2^32 combinations.
Simply split the range of numbers and start the combinations in each machine.
You'll probably be interested in this: Calculating Nth permutation step?
Edit:
Considering your edit, your space of combinations would be 52^4 = 7,311,616 combinations.
Then you simply need to divide these "tasks" among the machines, so 7,311,616 / n = r (integer division), with r the number of permutations calculated by each machine; the last machine computes r + (7,311,616 % n) combinations.
Since you know how many combinations each machine has to build, you'll have to execute the following on each machine:
    function check_permutations(begin, end, chars) {
        for (i = begin; i < end; i++) {
            nth_perm = nth_permutation(chars, i);
            check_permutation(nth_perm); // your function of verification
        }
    }
The function nth_permutation() is not hard to derive, and I'm quite sure you can get it in the link I've posted.
After this, you would simply start a process with such a function as check_permutations, giving the begin, end, and the vector of characters chars.
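One way to read nth_permutation() for this problem is as a base-52 number conversion: every 4-letter string corresponds to an index from 0 to 52^4 - 1. The sketch below is only an illustration under assumptions that are not in the question: the alphabet order (a-z followed by A-Z) and the helper names str_to_index and index_to_str are made up for the example.

```c
#include <string.h>

#define LEN 4     /* fixed string length from the question */
#define BASE 52   /* lower- plus uppercase English letters */

/* Assumed ordering: 'a'..'z' first, then 'A'..'Z'. */
static const char ALPHABET[] =
    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

/* Map a LEN-character string to its index in [0, BASE^LEN).
 * Returns -1 if the string is too short or has a character outside ALPHABET. */
long str_to_index(const char *s)
{
    long idx = 0;
    for (int i = 0; i < LEN; i++) {
        const char *p = s[i] ? strchr(ALPHABET, s[i]) : NULL;
        if (p == NULL)
            return -1;
        idx = idx * BASE + (long)(p - ALPHABET);
    }
    return idx;
}

/* Inverse mapping: write the string with index idx into out. */
void index_to_str(long idx, char out[LEN + 1])
{
    out[LEN] = '\0';
    for (int i = LEN - 1; i >= 0; i--) {
        out[i] = ALPHABET[idx % BASE];
        idx /= BASE;
    }
}
```

A client that receives the bounds "aaaa" and "aazz" could convert both with str_to_index, loop an integer from the lower to the upper index, and call index_to_str on each value before checking it. With the assumed ordering, that range also contains strings such as "aaaA", because uppercase letters sort after lowercase ones.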

You can generate a tree containing all the permutations, e.g. as in this pseudocode:
    strings(root, len)
        if (len == 0) return
        for (c = 'a' to 'z')
            root->next[c] = new node for c
            strings(root->next[c], len - 1)
Invoke it as strings(root, 4).
After that you can traverse the tree to get all the permutations.
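The tree never has to be stored: a depth-first recursion visits the same strings in the same order while keeping only one buffer in memory. This is a sketch of that variation, not the tree-building code itself; the names enumerate, visit_fn and count_string are hypothetical.

```c
#include <stddef.h>

/* Callback invoked once per generated string; ctx is caller-defined state. */
typedef void (*visit_fn)(const char *s, void *ctx);

/* Fill buf[pos..len-1] with every choice from alphabet, depth first.
 * buf must have room for len + 1 characters. */
void enumerate(char *buf, size_t pos, size_t len,
               const char *alphabet, visit_fn visit, void *ctx)
{
    if (pos == len) {               /* reached a leaf of the implicit tree */
        buf[len] = '\0';
        visit(buf, ctx);
        return;
    }
    for (const char *c = alphabet; *c != '\0'; c++) {
        buf[pos] = *c;              /* descend into the child for this letter */
        enumerate(buf, pos + 1, len, alphabet, visit, ctx);
    }
}

/* Example visitor: counts how many strings were produced. */
void count_string(const char *s, void *ctx)
{
    (void)s;
    ++*(unsigned long *)ctx;
}
```

Calling enumerate(buf, 0, 4, "abcdefghijklmnopqrstuvwxyz", count_string, &n) with n initialised to 0 would leave 26^4 in n; a real visitor would test the candidate password instead of counting.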

Related

Storing large numbers from user input into an array of integers

I am currently working on a C project that requires the creation, storage and mathematical usage of numbers that are too large to be put into normal variable types. To do this, we were instructed to represent numbers as a sequence of digits stored in an array of integers. I use a struct defined as so:
struct BigInt {
int val[300000];
int size;
};
(I know I can dynamically allocate memory, and that that is
preferable, however this is how I am most comfortable doing it, it has
worked perfectly fine so far and this is how the professor instructed us to do it.)
I then define member A:
struct BigInt A={NULL};
I can generate and store, then add, subtract and multiply random numbers with this, and they can have any number of digits up to 300000 (far more than I will ever need to account for). For example, if the number 1432 was generated and stored into BigInt A, A.size would be 4 and A.val[2] would be 3.
Now I need a way to store user input into this type. For example, the user needs to be able to go straight from inputting 50! to having the result stored in this struct array type I have created. How would I go about doing this?
The only ways that I could think of would be to store the user input as a string then have the math in that string be executed multiple times, each time storing a different digit, or reading numbers straight off of stdout, but I don't know if either of those are even possible or would solve my problem.
You can try using string as follows:
    char s[300001];
    if (scanf("%300000s", s) == 1) {      /* width limit prevents overflowing s */
        A.size = strlen(s);
        for (int i = 0; i < A.size; i++) {
            A.val[i] = s[i] - '0';        /* ASCII digit to numeric value */
        }
    }
I think it will solve your problem, but this way of implementation for big integers is not efficient though.
Sorry about my previous answer; to solve this in C you need to use an array of chars to store the digits.
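For the 50! example specifically, the digits cannot come from the input string; they have to be computed. A possible sketch, assuming the struct from the question and a hypothetical helper named bigint_factorial, multiplies the digit array by 2, 3, ..., n in turn with schoolbook carrying:

```c
#define MAX_DIGITS 300000

struct BigInt {
    int val[MAX_DIGITS];   /* one decimal digit per element */
    int size;              /* number of digits in use */
};

/* Store n! in *out, most significant digit first (as in the question).
 * Returns 0 on success, -1 if the result needs more than MAX_DIGITS digits. */
int bigint_factorial(struct BigInt *out, int n)
{
    /* Work least significant digit first so carries move to higher indices. */
    out->val[0] = 1;
    out->size = 1;
    for (int f = 2; f <= n; f++) {
        long long carry = 0;
        for (int i = 0; i < out->size; i++) {
            long long cur = (long long)out->val[i] * f + carry;
            out->val[i] = (int)(cur % 10);
            carry = cur / 10;
        }
        while (carry > 0) {            /* append any remaining carry digits */
            if (out->size == MAX_DIGITS)
                return -1;
            out->val[out->size++] = (int)(carry % 10);
            carry /= 10;
        }
    }
    /* Reverse so that val[0] holds the most significant digit. */
    for (int i = 0, j = out->size - 1; i < j; i++, j--) {
        int t = out->val[i];
        out->val[i] = out->val[j];
        out->val[j] = t;
    }
    return 0;
}
```

The input "50!" could be parsed with something like sscanf(line, "%d!", &n) before calling the helper. Because the struct is about 1.2 MB, declaring it static or global is safer than putting it on the stack.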

Implementing Radix sort in java - quite a few questions

Although it is not clearly stated in my exercise, I am supposed to implement radix sort recursively. I've been working on the task for days, but so far I have only managed to produce garbage. We are required to work with two methods. The sort method receives an array with numbers ranging from 0 to 999 and the digit we are looking at. We are supposed to generate a two-dimensional matrix here in order to distribute the numbers inside the array. So, for example, 523 is positioned at the fifth row and 27 is positioned at the 0th row since it is interpreted as 027.
I tried to do this with the help of a switch-case-construct, dividing the numbers inside the array by 100, checking for the remainder and then position the number with respect to the remainder. Then, I somehow tried to build buckets that include only the numbers with the same digit, so for example, 237 and 247 would be thrown in the same bucket in the first "round". I tried to do this by taking the whole row of the "fields"-matrix where we put in the values before.
In the putInBucket method, I am required to extend the bucket (which I managed to do right, I guess) and then return it.
I am sorry, I know that the code is total garbage, but maybe there's someone out there who understands what I am up to and can help me a little bit.
I simply don't see how I need to work with the buckets here, I don't even understand why I have to extend them, and I don't see any way of returning them to the sort method (which, I think, I am required to do).
Further description:
The whole thing is meant to work as follows: We take an array with integers ranging from 0 to 999. Every number is then sorted by its first digit, as mentioned above. Imagine you have buckets labelled 0 to 9. You start the sorting by putting 523 in bucket 5, 672 in bucket 6 and so on. This is easy when there is only one number (or no number at all) in a bucket. But it gets harder (and that's where recursion might come in handy) when you want to put more than one number in one bucket. The mechanism now goes as follows: We put two numbers with the same first digit in one bucket, for example 237 and 245. Now we want to sort these numbers again with the same algorithm, meaning we call the sort method (somehow) again with an array that contains only these two numbers, but this time looking at the second digit, so we would compare 3 and 4. We sort every number inside the array like this, and at the end, in order to get a sorted array, we start at the end, meaning at bucket 9, and then just put everything together. If we were at bucket 2, the algorithm would look into the recursive step, receive the already sorted array [237, 245] and deliver it in order to complete the whole thing.
My own problems:
I don't understand why we need to extend a bucket and I can't figure it out from the description. It is simply stated that we are supposed to do so. I'd imagine that we do it to copy another element into it, because if we have the buckets from 0 to 9, putting two numbers into the same bucket would just mean that we overwrite the first value. This might be the reason why we need to return the new, extended bucket, but I am not sure about that. Plus, I don't know how to go further from there. Even if I have an extended bucket now, it's not like I can simply attach it to the old matrix and copy another element into it again.
    public static int[] sort(int[] array, int digit) {
        if (array.length == 0)
            return array;
        int[][] fields = new int[10][array.length];
        int[] bucket = new int[array.length];
        int i = 0;
        for (int j = 0; j < array.length; j++) {
            switch (array[j] / 100) {
                case 0: i = 0; break;
                case 1: i = 1; break;
                ...
            }
            fields[i][j] = array[j]
            bucket[i] = fields[i][j];
        }
        return bucket;
    }
    private static int[] putInBucket(int [] bucket, int number) {
        int[] bucket_new = int[bucket.length+1];
        for (int i = 1; i < bucket_new.length; i++) {
            bucket_new[i] = bucket[i-1];
        }
        return bucket_new;
    }
    public static void main (String [] argv) {
        int[] array = readInts("Please type in the numbers: ");
        int digit = 0;
        int[] bucket = sort(array, digit);
    }
You don't use digit in sort, which is quite suspicious.
The switch/case looks like a rather convoluted way to write i = array[j] / 100.
I'd recommend reading the Wikipedia description of radix sort.
The expression to extract a digit from a base-10 number is (number / (int) Math.pow(10, digit)) % 10, where digit 0 is the ones place. The cast matters: Math.pow returns a double, so without it the division is not an integer division.
Note that you can count digits from left to right or from right to left; make sure you get this right.
I suppose you first want to sort for digit 0, then for digit 1, then for digit 2. So there should be a recursive call at the end of sort that does this.
Your buckets array needs to be 2-dimensional. You'll need to call it this way: buckets[i] = putInBucket(buckets[i], array[j]). If you handle null in putInBucket, you don't need to initialize it.
The reason why you need a 2D bucket array and putInBucket (instead of your fixed-size fields) is that you don't know how many numbers will end up in each bucket.
The second phase (reading back from the buckets to the array) is missing before the recursive call.
Make sure to stop the recursion after 3 digits.
Good luck
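To make the recursion concrete, here is a sketch of a most-significant-digit radix sort for values 0 to 999. It is written in C rather than Java and swaps in one technique: instead of growing each bucket with putInBucket, it counts how many numbers fall into each bucket first, so every bucket becomes a slice of the same array. The names digit_at and radix_sort_msd are invented for the example, and pos counts digits from the left (0 = hundreds).

```c
#include <stdlib.h>
#include <string.h>

/* Digit of n at position pos, counted from the left of a 3-digit number. */
static int digit_at(int n, int pos)
{
    static const int divisor[3] = {100, 10, 1};
    return (n / divisor[pos]) % 10;
}

/* Sort a[0..n-1] (values 0..999) ascending. Returns 0 on success,
 * -1 if temporary memory could not be allocated. */
int radix_sort_msd(int *a, size_t n, int pos)
{
    size_t count[10] = {0}, start[10], next[10], total = 0;

    if (n < 2 || pos > 2)                 /* nothing left to distinguish */
        return 0;

    for (size_t i = 0; i < n; i++)        /* size of each bucket */
        count[digit_at(a[i], pos)]++;
    for (int d = 0; d < 10; d++) {        /* where each bucket begins */
        start[d] = next[d] = total;
        total += count[d];
    }

    int *tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL)
        return -1;
    for (size_t i = 0; i < n; i++)        /* distribute into the buckets */
        tmp[next[digit_at(a[i], pos)]++] = a[i];
    memcpy(a, tmp, n * sizeof *a);        /* read the buckets back in order */
    free(tmp);

    for (int d = 0; d < 10; d++)          /* sort each bucket by the next digit */
        if (radix_sort_msd(a + start[d], count[d], pos + 1) != 0)
            return -1;
    return 0;
}
```

The initial call would be radix_sort_msd(array, length, 0). A bucket holding 237 and 245 is handed to the recursive call with pos = 1, which separates them by their second digits 3 and 4.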

Looping through all character combinations with increasing number of elements

What I want to achieve:
I have a function where I want to loop through all possible combinations of printable ascii-characters, starting with a single character, then two characters, then three etc.
The part that makes this difficult for me is that I want this to work for as many characters as I can (leave it overnight).
For the record: I know that abc really is 97 98 99, so a numeric representation is fine if that's easier.
This works for few characters:
I could create a list of all possible combinations for n characters, and just loop through it, but that would require a huge amount of memory already when n = 4. This approach is literally impossible for n > 5 (at least on a normal desktop computer).
In the script below, all I do is increment a counter for each combination. My real function does more advanced stuff.
If I had unlimited memory I could do (thanks to Luis Mendo):
    counter = 0;
    some_function = @(x) 1;
    number_of_characters = 1;
    max_time = 60;
    max_number_of_characters = 8;
    tic;
    while toc < max_time && number_of_characters < max_number_of_characters
        number_of_characters = number_of_characters + 1;
        vectors = [repmat({' ':'~'}, 1, number_of_characters)];
        n = numel(vectors);
        combs = cell(1,n);
        [combs{end:-1:1}] = ndgrid(vectors{end:-1:1});
        combs = cat(n+1, combs{:});
        combs = reshape(combs, [], n);
        for ii = 1:size(combs, 1)
            counter = counter + some_function(combs(ii, :));
        end
    end
Now, I want to loop through as many combinations as possible in a certain amount of time, 5 seconds, 10 seconds, 2 minutes, 30 minutes, so I'm hoping to create a function that's only limited by the available time, and uses only some reasonable amount of memory.
Attempts I've made (and failed at) for more characters:
I've considered pre-computing the combinations for two or three letters using one of the approaches above, and use a loop only for the last characters. This would not require much memory, since it's only one (relatively small) array, plus one or more additional characters that gets looped through.
I manage to scale this up to 4 characters, but beyond that I start getting into trouble.
I've tried to use an iterator that just counts upwards. Every time I hit any(mod(number_of_ascii .^ 1:n, iterator) == 0) I increment the m'th character by one. So, the last character just repeats the cycle !"# ... ~, and every time it hits tilde, the second character increments. Every time the second character hits tilde, the third character increments etc.
Do you have any suggestions for how I can solve this?
It looks like you're basically trying to count in base 26 (or base 52 if you need CAPS). Each number in that base corresponds to a specific string of characters. For example,
0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F,G,H,I,J,K,L,M,N,O,P,10,11,12,...
Here, capital A through P are just the symbols used for the digits 10 through 25 in a base-26 system. The sequence above simply represents this string of characters:
a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z,ba,bb,bc,...
Then, you can simply do this:
    symbols = ['0','1','2','3','4','5','6','7','8','9','A','B','C','D','E',...
               'F','G','H','I','J','K','L','M','N','O','P'];
    characters = ['a','b','c','d','e','f','g','h','i','j','k','l',...
                  'm','n','o','p','q','r','s','t','u','v','w','x','y','z'];
    count = 0;
    while(true)
        str_base26 = dec2base(count, 26);
        [~, idx] = ismember(str_base26, symbols);  % position of each base-26 digit
        actual_str = characters(idx);              % char-by-char lookup
        count = count + 1;
    end
Of course, this does not produce strings that start with the zero symbol a (such as aa or ab), because leading zeros are dropped. But that should be pretty simple to handle.
You were not far with your idea of just getting an iterator that just counts upward.
What you need with this idea is a map from the integers to ASCII characters. As StewieGriffin suggested, you'd just need to work in base 95 (94 characters plus whitespace).
Why whitespace : You need something that will be mapped to 0 and be equivalent to it. Whitespace is the perfect candidate. You'd then just skip the strings containing any whitespace. If you don't do that and start directly at !, you'll not be able to represent strings like !! or !ab.
First let's define a function that will map (1:1) integers to string :
    function [outstring,toskip]=dec2ASCII(m)
        out=[];
        while m~=0
            out=[mod(m,95) out];
            m=(m-out(1))/95;
        end
        if any(out==0)
            toskip=1;
        else
            toskip=0;
        end
        outstring=char(out+32);
    end
And then in your main script :
    counter=1;
    some_function = @(x) 1;
    max_time = 60;
    max_number_of_characters = 8;
    currString='';
    tic;
    while numel(currString)<=max_number_of_characters&&toc<max_time
        [currString,toskip]=dec2ASCII(counter);
        if ~toskip
            some_function(currString);
        end
        counter=counter+1;
    end
Some random outputs of the dec2ASCII function :
dec2ASCII(47)
ans =
O
dec2ASCII(145273)
ans =
0)2
In terms of performance I can't really elaborate as I don't know what you want to do with your some_function. The only thing I can say is that the running time of dec2ASCII is around 2*10^(-5) s
Side note : iterating like this will be very limited in terms of speed. With the function some_function doing nothing, you'd just be able to cycle through 4 characters in around 40 minutes, and 5 characters would already take up to 64 hours. Maybe you'd want to reduce the amount of stuff you want to pass through the function you iterate on.
This code, though, is easily parallelizable, so if you want to check more combinations, I'd suggest trying to do it in a parallel manner.
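An alternative to converting a counter on every step is to keep the current string in a buffer and increment it in place like an odometer, which needs no division and no skipping. The sketch below is in C, not MATLAB, and rests on assumptions: the name next_string is invented, and the character range '!' to '~' mirrors the answer's choice of leaving out the space (change FIRST_CH to ' ' to include it).

```c
#include <stddef.h>

#define FIRST_CH '!'   /* first printable character used */
#define LAST_CH  '~'   /* last printable character used */

/* Advance buf to the next string in length-then-lexicographic order.
 * *len is the current length; buf needs room for max_len + 1 characters.
 * Start with *len == 0 and buf[0] == '\0'. Returns 1 on success and 0 once
 * every string up to max_len has been produced (buf is then no longer
 * meaningful). */
int next_string(char *buf, size_t *len, size_t max_len)
{
    size_t i = *len;
    while (i > 0) {
        i--;
        if (buf[i] < LAST_CH) {     /* room to increment this position */
            buf[i]++;
            return 1;
        }
        buf[i] = FIRST_CH;          /* wrap around and carry to the left */
    }
    if (*len >= max_len)            /* all positions wrapped: need more length */
        return 0;
    buf[*len] = FIRST_CH;           /* grow by one character */
    (*len)++;
    buf[*len] = '\0';
    return 1;
}
```

A driver loop would call next_string until it returns 0 or a time limit expires, handing buf to the worker function each time. Memory use stays at max_len + 1 bytes regardless of how many strings are visited.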

Word ranking efficiency

I am not sure how to solve this problem within the constraints.
Consider a "word" as any sequence of capital letters A-Z (not limited to just "dictionary words"). For any word with at least two different letters, there are other words composed of the same letters but in a different order (for instance, STATIONARILY/ANTIROYALIST, which happen to both be dictionary words; for our purposes "AAIILNORSTTY" is also a "word" composed of the same letters as these two). We can then assign a number to every word, based on where it falls in an alphabetically sorted list of all words made up of the same set of letters. One way to do this would be to generate the entire list of words and find the desired one, but this would be slow if the word is long. Write a program which takes a word as a command line argument and prints to standard output its number. Do not use the method above of generating the entire list. Your program should be able to accept any word 25 letters or less in length (possibly with some letters repeated), and should use no more than 1 GB of memory and take no more than 500 milliseconds to run. Any answer we check will fit in a 64-bit integer.
Sample words, with their rank:
ABAB = 2
AAAB = 1
BAAA = 4
QUESTION = 24572
BOOKKEEPER = 10743
examples:
AAAB - 1
AABA - 2
ABAA - 3
BAAA - 4
AABB - 1
ABAB - 2
ABBA - 3
BAAB - 4
BABA - 5
BBAA - 6
I thought about using a binary search for a word and all the possible words built from the characters (1 - permutation(word)) but I think that would take too long. O(logN) might be too slow.
I found this solution but I am a bit confused and need a bit of help understanding it:
Consider the n-letter word { x1, x2, ... , xn }. My solution is based on the idea that the word number will be the sum of two quantities:
The number of combinations starting with letters lower in the alphabet than x1, and
how far we are into the arrangements that start with x1.
The trick is that the second quantity happens to be the word number of the word { x2, ... , xn }. This suggests a recursive implementation.
Getting the first quantity is a little complicated:
Let uniqLowers = { u1, u2, ... , um } = all the unique letters lower than x1
For each uj, count the number of permutations starting with uj.
Add all those up.
The solution says that the answer consists of two numbers. Look at the following picture describing the words that can be made from the word QUESTION:
EIONQSTU (first word lexicographically, rank 1)
...
...
... (last word starting with a letter before Q, rank A)
QEIONSTU
....
....
QUESTION (our given word, rank x)
...
The phrase "how far we are into the arrangements that start with x1" is the quantity (x-A); call it B. The thing is, B is exactly equal to the word rank of "UESTION", which is our original word with the first letter cut off. This is asking the same question but with a subset of our input, suggesting a recursive solution.
It then remains to find A: the number of permutations that begin with a letter that comes before Q. So A = number of words beginning with one of {E, I, N, O}.
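The two quantities can be sketched in C as follows. This version unrolls the recursion into a loop over positions, and it counts the arrangements of the remaining letters as a product of binomial coefficients so that no intermediate value exceeds the count itself (25! alone would overflow 64 bits). The function names are hypothetical, and the input is assumed to contain only the capital letters A-Z.

```c
#include <stdint.h>
#include <string.h>

static uint64_t gcd_u64(uint64_t a, uint64_t b)
{
    while (b != 0) {
        uint64_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Number of distinct orderings of the multiset given by count[0..25]. */
static uint64_t arrangements(const int count[26])
{
    uint64_t result = 1, placed = 0;
    for (int c = 0; c < 26; c++) {
        for (int i = 1; i <= count[c]; i++) {
            /* result = result * placed / i, dividing first to avoid overflow */
            placed++;
            uint64_t g = gcd_u64(placed, (uint64_t)i);
            result = result / ((uint64_t)i / g) * (placed / g);
        }
    }
    return result;
}

/* 1-based rank of word among all distinct arrangements of its letters. */
uint64_t word_rank(const char *word)
{
    int count[26] = {0};
    size_t n = strlen(word);
    uint64_t rank = 1;

    for (size_t i = 0; i < n; i++)
        count[word[i] - 'A']++;

    for (size_t i = 0; i < n; i++) {
        int x = word[i] - 'A';
        for (int u = 0; u < x; u++) {      /* each unused letter lower than x */
            if (count[u] == 0)
                continue;
            count[u]--;                    /* put u at position i ...          */
            rank += arrangements(count);   /* ... and count what can follow it */
            count[u]++;
        }
        count[x]--;                        /* fix x and move to the next letter */
    }
    return rank;
}
```

For QUESTION, the first position contributes 4 * 7! = 20160 (words starting with E, I, N or O), which is the quantity A from the explanation; the remaining positions add up B, the rank of UESTION.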

C Newbie: Split a number into pairs and sum up

I'm teaching myself C (not C++, not yet). After searching the web in general and SO specifically for a couple of hours, I'm still stumped on how to do something fairly basic. Split a number into pairs, then sum those pairs. Something like:
1234567890 -->
12 + 34 + 56 + 78 + 90 = 270
I tried treating the number as a string, putting it into an array, splitting off each number and then concatenating those into pairs, and started getting lost around that point.
What's the best way to do this? Do I have to treat the number as a string to get the pairs, or is there a better way?
You could do
    while (number) {
        x = number % 100; /* Get the last two digits. */
        number /= 100;    /* Get rid of them. */
    }
It also depends on what you plan to do if you have an odd number of digits.
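Wrapped into a complete function, the loop could look like this. The name sum_pairs and the choice of unsigned long long are assumptions made for the sketch; pairs are taken from the right, so with an odd number of digits the leftmost digit ends up as a group of its own.

```c
/* Sum the two-digit groups of number, taken from the right. */
unsigned long long sum_pairs(unsigned long long number)
{
    unsigned long long sum = 0;
    while (number != 0) {
        sum += number % 100;   /* add the last two digits */
        number /= 100;         /* drop them */
    }
    return sum;
}
```

With this, sum_pairs(1234567890ULL) gives 90 + 78 + 56 + 34 + 12 = 270, and sum_pairs(12345ULL) gives 45 + 23 + 1 = 69.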
