Odd Repetitions of Patterns When Using Rand() - c

A sample random password/string generator that produces 32-character strings. It generates random numbers and keeps those between 33 and 126, since these are the ASCII values of printable, non-space characters.
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
int main()
{
    srand(time(0));
    clock_t start = clock();
    long long iterations = 0;
    printf("Generating String...\n\n\t\" ");
    for (int i = 0; i < 32; i++)
    {
        long long holder = 0;
        while(holder < 33 || holder > 126)
        {
            holder = rand();
            iterations++;
        }
        putchar(holder);
    }
    clock_t end = clock();
    printf(" \"\n\n%.2lf s , %lld iterations & %lld avg\n",(double)(end - start)/CLOCKS_PER_SEC,iterations,iterations/32);
    return 0;
}
Output repeats the string DEX&H1_(okd/YVf8;49=el%<j:#"T,NU in one form or another.
Some Outputs :
Generating String...
" DEX&H1_(okd/YVf8;49=el%<j:#"T,NU "
9.11 s , 893836506 iterations & 27932390 avg
Generating String...
" xq?!#O]tDEX&H1_(okd/YVf8;49=el%< "
7.59 s , 768749018 iterations & 24023406 avg
Generating String...
" MJxq?!#O]tDEX&H1_(okd/YVf8;49=el "
7.63 s , 748742990 iterations & 23398218 avg
Compiled with cc file.c -o file on Clang/macOS.

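The way you're trying to get random numbers in a range is extremely inefficient: rand() returns values from 0 to RAND_MAX (which should be 2^31 - 1 on macOS), and only 94 of them pass your test, so each character costs about 2^31 / 94 ≈ 23 million calls on average, which is in line with the averages in your output. It's also most likely the source of the repetition you're seeing.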
You should instead reduce the number returned to be within the desired range.
for (int i = 0; i < 32; i++)
{
    int holder = (rand() % (126 - 33 + 1)) + 33;
    putchar(holder);
}
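One caveat with the modulo approach: unless the size of the range divides RAND_MAX + 1 exactly, some values come up slightly more often than others. A common workaround (not part of the answer above, just a sketch) is to throw away the few raw values at the top of rand()'s range that would cause the imbalance. The helper name rand_in_range is hypothetical, and the code assumes lo <= hi and a range no wider than RAND_MAX + 1.

```c
#include <stdlib.h>

/* Hypothetical helper: uniform integer in [lo, hi], built on rand().
 * Assumes lo <= hi and (hi - lo + 1) <= RAND_MAX + 1. */
int rand_in_range(int lo, int hi)
{
    unsigned long span = (unsigned long)hi - (unsigned long)lo + 1UL;
    unsigned long buckets = (unsigned long)RAND_MAX + 1UL;  /* count of raw values */
    unsigned long limit = buckets - buckets % span;         /* largest multiple of span */
    unsigned long r;

    do {
        r = (unsigned long)rand();   /* retry only for the few values >= limit */
    } while (r >= limit);

    return lo + (int)(r % span);
}
```

With this, the loop body becomes putchar(rand_in_range(33, 126)). Unlike the loop in the question, a retry here is rare: fewer than 94 of the raw values are ever rejected.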

The question of how to do it right has been addressed in another answer already. This is about the "odd repetitions" part, which may not be so "odd" after all.
The following assumes a typical rand() implementation where:
all possible values are taken exactly once before rand() returns a previous value;
the next rand() value depends only on the previous value.
Under these assumptions, the 94 values between 33 = '!' and 126 = '~' will be returned in a fixed cyclic order, which then repeats unchanged. The posted code will therefore always return 32 consecutive characters from that cycle (including wraparound).
Suppose for example that the first run returns the 32-char string DEX&H1_(okd/YVf8;49=el%<j:#"T,NU. That means the 94-char cycle of the rand() generator includes that string followed by some permutation of the remaining 62 characters.
Then the next run will produce a 32-char string that overlaps the first one for at least, say, 16 characters iff the first eligible character has an index in [0, 16] or [78, 93]. That is 33 of the 94 possible positions, so if the starting position is roughly uniform, the probability of that happening is 33 / 94 ≈ 35%. Conversely, the probability of not having such an overlap is ≈ 65%, and the probability of no overlaps in the next 7 runs is 0.65^7 ≈ 0.05. So the chance of getting a "repetition" of the first string in the next 7 runs is ≈ 95%, i.e. not surprising at all.
[ EDIT ]   Also, it follows by a straight pigeonhole argument that any 6 runs will produce at least two strings that have a substring in common of length 16 or more.
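To sanity-check overlap estimates like the ones above, a brute-force count under the same cycle assumptions is easy to write. This is only a sketch: CYCLE, WINDOW and both function names are made up for illustration. It puts the first string at positions 0..31 of the 94-character cycle and counts how many start positions give a second string that shares at least a given number of positions with it.

```c
enum { CYCLE = 94, WINDOW = 32 };   /* cycle length and string length from the question */

/* Number of cycle positions shared by the windows starting at a and b. */
int window_overlap(int a, int b)
{
    int shared = 0;
    for (int k = 0; k < WINDOW; k++) {
        int pos = (a + k) % CYCLE;               /* k-th position of window a */
        int offset = (pos - b + CYCLE) % CYCLE;  /* forward distance from b */
        if (offset < WINDOW)
            shared++;                            /* pos also lies in window b */
    }
    return shared;
}

/* How many start positions overlap the window at 0 by at least min_overlap. */
int count_overlapping_starts(int min_overlap)
{
    int count = 0;
    for (int s = 0; s < CYCLE; s++)
        if (window_overlap(0, s) >= min_overlap)
            count++;
    return count;
}
```

Dividing count_overlapping_starts(16) by 94 gives the per-run probability, provided the start position can be treated as uniformly distributed over the cycle.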

Related

Maximum product of 13 adjacent numbers of a 1000-digit number

I have to find the largest product of 13 adjacent numbers of a 1000-digit number below. My code for the problem is as follows:
#include <stdio.h>
int main()
{
char arr[1000] =
"731671765313306249192251196744265747423553491949349698352031277450"
"632623957831801698480186947885184385861560789112949495459501737958"
"331952853208805511125406987471585238630507156932909632952274430435"
"576689664895044524452316173185640309871112172238311362229893423380"
"308135336276614282806444486645238749303589072962904915604407723907"
"138105158593079608667017242712188399879790879227492190169972088809"
"377665727333001053367881220235421809751254540594752243525849077116"
"705560136048395864467063244157221553975369781797784617406495514929"
"086256932197846862248283972241375657056057490261407972968652414535"
"100474821663704844031998900088952434506585412275886668811642717147"
"992444292823086346567481391912316282458617866458359124566529476545"
"682848912883142607690042242190226710556263211111093705442175069416"
"589604080719840385096245544436298123098787992724428490918884580156"
"166097919133875499200524063689912560717606058861164671094050775410"
"022569831552000559357297257163626956188267042825248360082325753042"
"0752963450";
int i, j;
long int max;
max = 0;
long int s = 1;
for (i = 0; i < 988; i++) {
int a = 0;
for (j = 1; j <= 13; j++) {
printf("%c", arr[i + a]);
s = s * arr[i + a];
a++;
}
printf("%c%d", '=', s);
printf("\n");
if (s > max) {
max = s;
}
}
printf("\nMaximum product is %d", max);
getchar();
}
Some outputs are zero even if none of the input is zero. The second output happens to be negative. The answers don't even match. Any help is appreciated.
Many sets of 13 digits in your char array arr contain zeroes, which is why the product of those sets is 0.
There are a couple of issues with your code:
You are using %d instead of %ld to print long int. Using the wrong conversion specifier will result in undefined behaviour.
If any argument is not the correct type for the corresponding conversion specification, the behavior is undefined.
You are not converting the ASCII value of the digit into its actual value before multiplication. (ASCII value of '0' is 48). This results in integer overflow and is the cause for negative values to be printed.
So the statement:
s = s * arr[i + a];
should be changed to:
s = s * (arr[i + a] - '0');
You are also not resetting the product s to 1 before each run of the inner for loop (that is, at the start of every iteration of the outer loop). Because of this, you end up multiplying values from different sets of 13.
After making these changes, you can see the live demo here.
There are a few issues to tackle in this code:
Clean up spacing and variable names (an edit by another user helped resolve this issue). Remove redundant variables like a, which j could easily represent by iterating from 0 to 12 rather than 1 to 13. This seems cosmetic but will make it easier for you to understand your program state, so it's actually critical.
Numerical overflow: As with all PE problems, you'll be dealing with extremely large numbers, which may overflow the long int datatype (only guaranteed to hold values up to 2^31 - 1). Use unsigned long long to store your max and s (which I'd call product) variables. Print the result with %llu.
Convert chars to ints: arr[i+j] - '0'; so that you're multiplying the actual numbers the chars represent rather than their ASCII values (which are 48 higher).
s (really product) is not reset on each iteration of the outer loop, so you're taking the product of the entire 1000-sized input (or trying to, until your ints start to overflow).
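Putting the points from both answers together, the computation could be sketched as a small function. The name max_adjacent_product and its parameters are made up for illustration. It assumes the input contains only ASCII digits and that the window is short enough for the product to fit in an unsigned long long (13 digits is well within that, since 9^13 is about 2.5 * 10^12).

```c
#include <stddef.h>

/* Largest product of `window` adjacent digits in a string of ASCII digits.
 * Returns 0 if the window is empty or longer than the input. */
unsigned long long max_adjacent_product(const char *digits, size_t len, size_t window)
{
    unsigned long long best = 0;

    if (window == 0 || window > len)
        return 0;

    for (size_t i = 0; i + window <= len; i++) {
        unsigned long long product = 1;   /* reset for every window */
        for (size_t j = 0; j < window; j++)
            product *= (unsigned long long)(digits[i + j] - '0');  /* char -> digit value */
        if (product > best)
            best = product;
    }
    return best;
}
```

For the question's data this would be called as max_adjacent_product(arr, 1000, 13) and printed with %llu.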

How to store 100 digit number in C using strings

My problem is that I don't know what these functions do. They come from a program written by my teacher (this is not the whole program, just the functions).
I just want to ask what these functions do, and mainly why the number is stored from right to left in the array. Thanks.
#include<stdio.h>
#include<string.h>
#define MAX 1000
void str_to_num(char *str, char *number, int *dlzka)
{
int i;
for(i=0; i < MAX; i++)
number[i] = 0;
*dlzka = strlen(str);
for(i = 0; i < *dlzka; i++)
cis[(*dlzka) - 1 - i] = str[i] - '0';
}
void plus(char *cislo, int *d1, char *cis2, int d2)
{
int i; prenos = 0;
if(*d1 < d2)
*d1 = d2;
for(i = 0; i < *d1; i++)
{
pom = number[i] + number[i];
pom += prenos;
number[i] = pom % 10;
prenos = pom / 10;
}
}
Here is the lesson your teacher should be teaching:
There is a difference between the numerical value of 1, and the computer code (ASCII for example) that is used to represent character 1 displayed on the screen or typed on the keyboard.
Every time you see 1 on the screen, your computer sees 49 in memory.
0 is 48, 2 is 50 and so on.
Conveniently, all digit characters are arranged in a sequence from 0 to 9, so to convert their character codes to their numeric values all you have to do is subtract the character code of zero to get the digit position in the sequence.
For example: 49 - 48 = 1 --> '1' - '0' = 1
And this is how the first function, str_to_num works.
C does not provide a built-in integer type large enough to hold 100-digit numbers, so you need to sum them up one digit at a time.
The second function has completely wrong variable names, but it is still pretty obvious what it is trying to do:
It sums up two single-digit numbers, then stores the ones digit of the result in an array and the tens digit (if the sum is > 9) in a helper variable.
As already suggested in the comments, this is how you sum up numbers manually on a page one digit at a time.
I don't know what prenos means in your language, but in English this variable should be called carry and it keeps the overflowing tens digit for the next round.
There is however something missing from the sum function: if the sum of the last (leftmost) two digits is more than 9, the extra 1 will be lost, and the result will be wrong.
Check the original code your teacher gave you - either you copied it wrong, or he is giving a bad example.
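As for why the digits are stored right to left: with the least significant digit at index 0, index i always holds the 10^i place, so two numbers of different lengths line up without any shifting, and a carry simply moves to the next index. Below is a minimal sketch of what the two functions were probably meant to look like, with the final carry handled. The name add_num and the English variable names are mine, not the teacher's. Both arrays are assumed to have MAX entries filled by str_to_num, and the input strings are assumed to contain only digits and at most MAX of them.

```c
#include <string.h>

#define MAX 1000

/* Store the decimal string in number[], least significant digit first. */
void str_to_num(const char *str, char *number, int *len)
{
    memset(number, 0, MAX);                 /* unused places stay 0 */
    *len = (int)strlen(str);
    for (int i = 0; i < *len; i++)
        number[*len - 1 - i] = (char)(str[i] - '0');   /* '7' -> 7 */
}

/* a = a + b, digit by digit, exactly like adding on paper. */
void add_num(char *a, int *len_a, const char *b, int len_b)
{
    int carry = 0;

    if (*len_a < len_b)
        *len_a = len_b;                     /* result is at least as long as b */

    for (int i = 0; i < *len_a; i++) {
        int sum = a[i] + b[i] + carry;
        a[i] = (char)(sum % 10);            /* ones digit stays in place */
        carry = sum / 10;                   /* tens digit moves one place up */
    }

    if (carry != 0 && *len_a < MAX)
        a[(*len_a)++] = (char)carry;        /* the step missing from the original */
}
```

To print the result, walk the array from index len - 1 down to 0 and add '0' back to each digit.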

storing multiple values in c language without using arrays [closed]

I have to reverse a number and get the difference between the original and the reversed number.
The input consists of N numbers, where N is an arbitrary positive integer. The first line of the input
contains only a positive integer N. Then follows one or more lines with the N numbers; these numbers
should all be non-negative and may be single or multiple digits. These are the original numbers you need
to generate their N corresponding magic numbers.
I was thinking of maybe using a while loop and just handling one input at a time. Does anyone have any thoughts?
What I have so far:
#include <stdio.h>
int reverseInteger();
int generateMagicNumber();
int main()
{
int n,i;
char all;
printf("How many magic numbers do you want");
scanf("%d",&n);
while (i<n){ //
while (n != 0) //reversing number
{
rev = rev * 10;
rev = rev + n%10;
n = n/10;
i++;
all = n;
}
}
}
Assignment 1:
Reverse Number Magic Sequence
Due: Wednesday January 27, 2016 11:59pm EST
A reverse number is a number written in arabic numerals, but where the
order of digits is reversed. The first digit becomes the last and vice
versa. For example, the number 1245 when its digits are reversed it
would become 5421. Note that all the leading zeros are omitted. That
means if the number ends with a zero, the zero is lost by reversing
(e.g. 1200 gives 21). Also note that the reversed number never has any
trailing zeros. Finally, every single digit number (i.e. 0-9) is its
own reverse number. In order to generate a magic number, we reverse a
given original number and store the absolute value of the difference
between the original number and its reversed version. For example,
given the number 476, we will generate the reverse number 674 and then
compute the absolute value of the difference between 476 and 674 to be
198. We then reverse 198 to display the number 891; we call that the magic number!
We need your help to compute the magic numbers of a given sequence.
Your task is to calculate the difference between a given number and
its reverse version, and output the reverse of the difference. Of
course, the result is not unique because any particular number is a
reversed form of several numbers (e.g. 21 could be 12, 120 or 1200
before reversing). Thus we must assume that no zeros were lost by
reversing (e.g. assume that the original number was 12).
Input
The input consists of N numbers, where N is an arbitrary positive integer.
The first line of the input contains only a positive integer N. Then
follows one or more lines with the N numbers; these numbers should all
be non-negative and may be single or multiple digits. These are the
original numbers you need to generate their N corresponding magic
numbers.
Output
For each original number in the sequence, print
exactly one integer – its magic number. Omit any leading zeros in the
output. On a separate line, output the largest absolute difference
encountered in the sequence.
Sample Input
6
24 1 4358 754 305 794
Sample Output
81 0 6714 792 891 792
4176
Specific Requirements: [15 pts]
[ 3 pts] Write a function called reverseInteger, that takes as input an unsigned integer and returns its reversed digits version as an
unsigned integer.
[ 3 pts] Write a function called generateMagicNumber, that takes as input an unsigned integer and return its magic number as described in
the problem.
[ 3 pts] Display the sequence of magic numbers correctly. (shown in the script file)
[ 2 pts] Display the largest absolute difference (shown in the script file)
[ 3 pts] Demonstrate the complete program using a main function capable of processing the input of any sequence and producing its
corresponding output.
[ 1 pt] Compilation on the CS server gcc compiler without errors and warnings.
Failure to properly document your entire code will receive a mark of
zero.
You are to submit the following:
Source code file: assign1.c
Script file demonstrating the compilation and execution : assign1.txt
To generate the script file use the following command from the CS
server:
cp assign1.c assign1.backup
typescript assign1.txt
cc assign1.c
a.out
[test your code here with at least 3 different input test cases in addition to the example given]
exit
[These steps will create a file called assign1.txt. Do not edit its contents - just submit it!]
Hint: This table explains the work done in this example:
Original number   Reverse   Absolute difference   Reverse (magic number)
X                 Xr        |X-Xr|                |X-Xr|r
24                42        18                    81
1                 1         0                     0
4358              8534      4176                  6714
754               457       297                   792
305               503       198                   891
794               497       297                   792
Note that your program should not use arrays and should be able to
read a sequence of N size, for any value of N (a 32 bit integer). Of
course, memory space optimization should be considered since there is
no need to store all the N numbers in memory all at once at any given
time.
You should read a new number in each iteration of the while loop:
#include <stdio.h>
int reverseInteger();
int generateMagicNumber();
int main() {
    int n, i;
    char all;
    printf("How many magic numbers do you want");
    if (scanf("%d", &n) != 1)
        return 1;
    for (i = 0; i < n; i++) {
        int num, temp, rev, magic;
        if (scanf("%d", &num) != 1)
            return 2;
        rev = 0;
        temp = num;
        while (temp != 0) { //reversing number
            rev = rev * 10;
            rev = rev + temp % 10;
            temp = temp / 10;
        }
        if (rev < num)
            magic = num - rev;
        else
            magic = rev - num;
        printf("%d ", magic);
    }
    printf("\n");
    return 0;
}
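If you enter all the numbers on one line, the answers will appear on a single line below it. Note that this prints the absolute difference |X - Xr|; to get the magic number the assignment asks for, reverse that difference as well, and keep track of the largest difference so you can print it at the end.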
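The two functions the assignment asks for could be sketched like this. reverseInteger and generateMagicNumber are the names from the assignment; absoluteDifference is a helper name I made up. The sketch assumes the reversed value still fits in an unsigned int, which is not true for every 10-digit input.

```c
/* Reverse the decimal digits of n, e.g. 1200 -> 21. */
unsigned int reverseInteger(unsigned int n)
{
    unsigned int rev = 0;
    while (n != 0) {
        rev = rev * 10 + n % 10;   /* append the last digit of n */
        n /= 10;                   /* and drop it from n */
    }
    return rev;
}

/* Hypothetical helper: |n - reverse(n)|, computed without going negative. */
unsigned int absoluteDifference(unsigned int n)
{
    unsigned int rev = reverseInteger(n);
    return n > rev ? n - rev : rev - n;
}

/* Magic number: the reverse of the absolute difference. */
unsigned int generateMagicNumber(unsigned int n)
{
    return reverseInteger(absoluteDifference(n));
}
```

In main, read N, then read one number per loop iteration with scanf("%u", ...), print generateMagicNumber of it, and keep a running maximum of absoluteDifference to print on the last line. No array is needed because each number is finished with before the next one is read.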

C print first million Fibonacci numbers

I am trying to write C code which will print the first 1 million Fibonacci numbers.
UPDATE: The actual problem is I want to get the last 10 digits of F(1,000,000)
I understand how the sequence works and how to write the code to achieve that however as F(1,000,000) is very large I am struggling to find a way to represent it.
This is code I am using:
#include<stdio.h>
int main()
{
unsigned long long n, first = 0, second = 1, next, c;
printf("Enter the number of terms\n");
scanf("%d",&n);
printf("First %d terms of Fibonacci series are :-\n",n);
for ( c = 0 ; c < n ; c++ )
{
if ( c <= 1 )
next = c;
else
{
next = first + second;
first = second;
second = next;
}
printf("%d\n",next);
}
return 0;
}
I am using long long to try and make sure there are enough bits to store the number.
This is the output for the first 100 numbers:
First 100 terms of Fibonacci series are :-
0
1
1
2
3
5
8
13
21
34
55
89
144
233
377
610
987
1597
2584
4181
6765
10946
17711
28657
46368
75025
121393
196418
317811
514229
832040
1346269
2178309
3524578
5702887
9227465
14930352
24157817
39088169
63245986
102334155
165580141
267914296
433494437
701408733
1134903170
1836311903
-1323752223
512559680
-811192543
-298632863
-1109825406
-1408458269
...
Truncated the output but you can see the problem, I believe the size of the number generated is causing the value to overflow to negative. I don't understand how to stop it in all honesty.
Can anybody point me in the right direction to how to actually handle numbers of this size?
I haven't tried to print the first million because if it fails on printing F(100) there isn't much hope of it printing F(1,000,000).
You want the last 10 digits of Fib(1000000). Read much more about Fibonacci numbers (and read twice).
Without thinking much, you could use some bignum library like GMPlib. You would loop to compute Fib(1000000) using a few mpz_t bigint variables (you certainly don't need an array of a million mpz_t, just fewer mpz_t variables than you have fingers on one hand). Of course, you wouldn't print all the Fibonacci numbers, only the 1000000th one (a cheap laptop today has enough memory for that, and would spit out that number in less than an hour). As John Coleman answered, it has about 200K digits (i.e. 2500 lines of 80 digits each).
(BTW, when thinking of a program producing some big output, you had better estimate the typical size of that output and the typical time to get it; if it does not fit in your room, or in your desktop computer, you have a problem, perhaps an economic one: you need to buy more computing resources.)
Notice that efficient bignum arithmetic is a hard subject. Clever algorithms exist for bignum arithmetic which are much more efficient than the naive one you would imagine.
Actually, you don't need any bigints. Read some math textbook about modular arithmetic. The remainder of a sum (or a product) is congruent to the sum (resp. the product) of the remainders. Use that property. A 10-digit integer fits in a 64-bit int64_t, so with some thinking you don't need any bignum library.
(I guess that with slightly more thinking, you don't need any computer or any C program to compute that. A cheap calculator, a pencil and a paper should be enough, and probably the calculator is not needed at all.)
The lesson to learn when programming (or when solving math exercises) is to think about the problem and try to reformulate the question before starting coding. J.Pitrat (an Artificial Intelligence pioneer in France, now retired, but still working on his computer) has several interesting blog entries related to that: Is it possible to define a problem?, When Donald and Gerald meet Robert, etc.
Understanding and thinking about the problem (and sub-problems too!) is an interesting part of software development. If you work on software development, you'll first be asked to solve real-world problems (e.g. make a selling website, or an autonomous vacuum cleaner) and you'll need to think to transform that problem into something which is codable on a computer. Be patient, you'll need ten years to learn programming.
To "get the last 10 digits of F(1,000,000)", simply apply the remainder function % when calculating next and use the correct format specifier: "%llu".
There is no need to sum digits more significant than the 10 least significant digits.
// scanf("%d",&n);
scanf("%llu",&n);
...
{
// next = first + second;
next = (first + second) % 10000000000;
first = second;
second = next;
}
// printf("%d\n",next);
printf("%010llu\n",next);
My output (x'ed the last 5 digits to not give-away the final answer)
66843xxxxx
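Pulled out into a standalone function, the same idea might look like the sketch below. The name fib_mod is made up here. It uses the convention F(0) = 0, F(1) = 1 and assumes 0 < m <= 2^63, so that the sum of two reduced values cannot overflow an unsigned long long.

```c
/* F(n) modulo m, with F(0) = 0 and F(1) = 1.
 * Assumes 0 < m <= 2^63 so that a + b never overflows. */
unsigned long long fib_mod(unsigned long long n, unsigned long long m)
{
    unsigned long long a = 0;       /* F(0) */
    unsigned long long b = 1 % m;   /* F(1), reduced in case m == 1 */

    for (unsigned long long i = 0; i < n; i++) {
        unsigned long long next = (a + b) % m;   /* keep only the low digits */
        a = b;
        b = next;
    }
    return a;                       /* after n steps, a holds F(n) mod m */
}
```

The loop in the question prints F(0) first, so its millionth term is F(999999); the last ten digits would then be printed with printf("%010llu\n", fib_mod(999999, 10000000000ULL)).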
By Binet's Formula the nth Fibonacci Number is approximately the golden ratio (roughly 1.618) raised to the power n and then divided by the square root of 5. A simple use of logarithms shows that the millionth Fibonacci number thus has over 200,000 digits. The average length of one of the first million Fibonacci numbers is thus over 100,000 = 10^5. You are thus trying to print 10^11 = 100 billion digits. I think that you will need more than a big int library to do that.
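The "simple use of logarithms" can be sketched in a few lines. Since F(n) is approximately phi^n / sqrt(5), its number of decimal digits is about floor(n * log10(phi) - log10(sqrt(5))) + 1. The function name fib_digit_count is made up, and the two constants are hard-coded rounded values of the logarithms, so this is an estimate that could in principle be off by one when F(n) is extremely close to a power of ten.

```c
/* Approximate number of decimal digits of F(n), from F(n) ~ phi^n / sqrt(5). */
long fib_digit_count(long n)
{
    const double LOG10_PHI   = 0.20898764024997873;  /* log10((1 + sqrt(5)) / 2) */
    const double LOG10_SQRT5 = 0.34948500216800940;  /* log10(sqrt(5)) */
    double log10_fn;

    if (n <= 0)
        return 1;                    /* F(0) = 0 is written with one digit */

    log10_fn = (double)n * LOG10_PHI - LOG10_SQRT5;
    if (log10_fn < 0.0)
        return 1;                    /* F(1) = 1 */

    return (long)log10_fn + 1;       /* the cast truncates, i.e. floor for values >= 0 */
}
```

For n around one million the estimate lands just under 209,000 digits, consistent with the "over 200,000 digits" figure.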
On the other hand -- if you want to simply compute the millionth number, you can do so -- though it would be better to use a method which doesn't compute all of the intermediate numbers (as simply computing rather than printing them all would still be infeasible for large enough n). It is well known (see this) that the nth Fibonacci number is one of the 4 entries of the nth power of the matrix [[1,1],[1,0]]. If you use exponentiation by squaring (which works for matrix powers as well since matrix multiplication is associative) together with a good big int library -- it becomes perfectly feasible to compute the millionth Fibonacci number.
[On Further Edit]: Here is a Python program to compute very large Fibonacci numbers, modified to now accept an optional modulus. Under the hood it is using a good C bignum library.
def mmult(A,B,m = False):
    #assumes A,B are 2x2 matrices
    #m is an optional modulus
    a = A[0][0]*B[0][0] + A[0][1]*B[1][0]
    b = A[0][0]*B[0][1] + A[0][1]*B[1][1]
    c = A[1][0]*B[0][0] + A[1][1]*B[1][0]
    d = A[1][0]*B[0][1] + A[1][1]*B[1][1]
    if m:
        return [[a%m,b%m],[c%m,d%m]]
    else:
        return [[a,b],[c,d]]

def mpow(A,n,m = False):
    #assumes A is 2x2
    if n == 0:
        return [[1,0],[0,1]]
    elif n == 1: return [row[:] for row in A] #copy A
    else:
        d,r = divmod(n,2)
        B = mpow(A,d,m)
        B = mmult(B,B,m)
        if r > 0:
            B = mmult(B,A,m)
        return B

def Fib(n,m = False):
    Q = [[1,1],[1,0]]
    return mpow(Q,n,m)[0][1]

n = Fib(999999)
print(len(str(n)))
print(n % 10**10)
googol = 10**100
print(Fib(googol, googol))
Output (with added whitespace):
208988
6684390626
3239047153240982923932796604356740872797698500591032259930505954326207529447856359183788299560546875
Note that what you call the millionth Fibonacci number, I call the 999,999th -- since it is more standard to start with 1 as the first Fibonacci number (and call 0 the 0th if you want to count it as a Fibonacci number). The first output number confirms that there are over 200,000 digits in the number and the second gives the last 10 digits (which is no longer a mystery). The final number is the last 100 digits of the googolth Fibonacci number -- computed in a small fraction of a second. I haven't been able to do a googolplex yet :)
This question comes without doubt from some programming competition, and you have to read these questions carefully.
The 1 millionth Fibonacci number is HUGE. Probably about 200,000 digits or so. Printing the first 1,000,000 Fibonacci numbers will kill a whole forest of trees. But read carefully: Nobody asks you for the 1 millionth Fibonacci number. You are asked for the last ten digits of that number.
So if you have the last 10 digits of Fib(n-2) and of Fib(n-1), how can you find the last 10 digits of Fib(n)? How do you calculate the last ten digits of a Fibonacci number without calculating the number itself?
PS. You can't print long long numbers with %d. Use %lld.
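Your algorithm is actually correct as long as the values fit: an unsigned long long holds every Fibonacci number up to F(93). Beyond that, unsigned arithmetic wraps modulo 2^64, which is not a multiple of 10^10, so the wrapped values no longer have the right last 10 decimal digits; to get those for larger terms, reduce modulo 10^10 at every step. The negative numbers in your output, though, come from something else.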
The problem is in the format specifier you're using for the output:
printf("%d\n",next);
The %d format specifier expects an int, but you're passing an unsigned long long. Using the wrong format specifier invokes undefined behavior.
What's most likely happening in this particular case is that printf is picking up the low-order 4 bytes of next (as your system seems to be little endian) and interpreting them as a signed int. This ends up displaying the correct values for the first 47 numbers (up to 1836311903, the largest Fibonacci number below 2^31), but incorrect ones after that.
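That explanation can be checked against the output in the question. The sketch below only models the guess (it is not what printf is required to do, since the behavior is undefined): it keeps the low 32 bits of a value and reads them as a two's-complement signed number. The function name low32_as_signed is made up.

```c
#include <stdint.h>

/* Keep the low 32 bits of v and reinterpret them as a two's-complement int. */
int32_t low32_as_signed(unsigned long long v)
{
    uint32_t low = (uint32_t)(v & 0xFFFFFFFFu);

    /* Values with the top bit set represent low - 2^32 in two's complement. */
    if (low >= 0x80000000u)
        return (int32_t)((int64_t)low - 4294967296LL);
    return (int32_t)low;
}
```

Feeding it the first value that no longer fits, 2971215073, gives -1323752223, and 4807526976 gives 512559680, which are exactly the two lines that follow 1836311903 in the question's output.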
Use the correct format specifier, and you'll get the correct results:
printf("%llu\n",next);
You also need to do the same when reading / printing n:
scanf("%llu",&n);
printf("First %llu terms of Fibonacci series are :-\n",n);
Here's the output of numbers 45-60:
701408733
1134903170
1836311903
2971215073
4807526976
7778742049
12586269025
20365011074
32951280099
53316291173
86267571272
139583862445
225851433717
365435296162
591286729879
956722026041
You can print Fibonacci(1,000,000) in C, it takes about 50 lines, a minute and no library :
Some headers are required :
#include <stdio.h>
#include <stdlib.h>
#define BUFFER_SIZE (16 * 3 * 263)
#define BUFFERED_BASE (1LL << 55)
struct buffer {
size_t index;
long long int data[BUFFER_SIZE];
};
Some functions too :
void init_buffer(struct buffer * buffer, long long int n){
buffer->index = BUFFER_SIZE ;
for(;n; buffer->data[--buffer->index] = n % BUFFERED_BASE, n /= BUFFERED_BASE);
}
void fly_add_buffer(struct buffer *buffer, const struct buffer *client) {
long long int a = 0;
size_t i = (BUFFER_SIZE - 1);
for (; i >= client->index; --i)
(a = (buffer->data[i] = (buffer->data[i] + client->data[i] + a)) > (BUFFERED_BASE - 1)) && (buffer->data[i] -= BUFFERED_BASE);
for (; a; buffer->data[i] = (buffer->data[i] + a), (a = buffer->data[i] > (BUFFERED_BASE - 1)) ? buffer->data[i] -= BUFFERED_BASE : 0, --i);
if (++i < buffer->index) buffer->index = i;
}
A base converter is used to format the output in base 10 :
#include "string.h"
// you must free the returned string after usage
static char *to_string_buffer(const struct buffer * buffer, const int base_out) {
static const char *alphabet = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
size_t a, b, c = 1, d;
char *s = malloc(c + 1);
strcpy(s, "0");
for (size_t i = buffer->index; i < BUFFER_SIZE; ++i) {
for (a = buffer->data[i], b = c; b;) {
d = ((char *) memchr(alphabet, s[--b], base_out) - alphabet) * BUFFERED_BASE + a;
s[b] = alphabet[d % base_out];
a = d / base_out;
}
while (a) {
s = realloc(s, ++c + 1);
memmove(s + 1, s, c);
*s = alphabet[a % base_out];
a /= base_out;
}
}
return s;
}
Example usage :
#include <sys/time.h>
double microtime() {
struct timeval time;
gettimeofday(&time, 0);
return (double) time.tv_sec + (double) time.tv_usec / 1e6;
}
int main(void){
double a = microtime();
// memory for the 3 numbers is allocated on the stack.
struct buffer number_1 = {0}, number_2 = {0}, number_3 = {0};
init_buffer(&number_1, 0);
init_buffer(&number_2, 1);
for (int i = 0; i < 1000000; ++i) {
number_3 = number_1;
fly_add_buffer(&number_1, &number_2);
number_2 = number_3;
}
char * str = to_string_buffer(&number_1, 10); // output in base 10
puts(str);
free(str);
printf("took %gs\n", microtime() - a);
}
Example output :
The 1000000th Fibonacci number is :
19532821287077577316320149475 ... 03368468430171989341156899652
took 30s including 15s of base 2^55 to base 10 conversion.
Also it's using a nice but slow base converter.
Thank You.

Non-recursive combination algorithm to generate distinct character strings

This problem has been irritating me for too long. I need a non-recursive algorithm in C to generate every character string of a given length over a character set, with repeated characters allowed. For instance, if the character set is 26 characters long and the strings are of length 2, then there are 26^2 such strings.
Please note that these are distinct combinations, aab is not the same as baa or aba. I've searched S.O., and most solutions produce non-distinct combinations. Also, I do not need permutations.
The algorithm can't rely on any libraries. I'm going to translate this C code into CUDA, where standard C libraries don't work (at least not efficiently).
Before I show you what I started, let me explain an aspect of the program. It is multithreaded on a GPU, so I initialize the beginning string with a few characters, aa in this case. To create a combination, I add one or more characters depending on the desired length.
Here's one method that I have attempted:
int main(void){
//Declarations
char final[12] = {0};
char b[3] = "aa";
char charSet[27] = "abcdefghijklmnopqrstuvwxyz";
int max = 4; //Set for demonstration purposes
int ul = 1;
int k,i;
//This program is multithreaded on a GPU. Each thread is initialized
//to a starting value for the string. In this case, it is aa
//Set final with a starting prefix
int pref = strlen(b);
memcpy(final, b, pref+1);
//Determine the number of non-distinct combinations
for(int j = 0; j < length; j++) ul *= strlen(charSet);
//Start concatenating characters to the current character string
for(k = 0; k < ul; k++)
{
final[pref+1] = charSet[k];
//Do some work with the string
}
...
It should be obvious that this program does nothing useful, except if I'm only appending one character from charSet.
My professor suggested that I try using a mapping (this isn't homework; I asked him about possible ways to generate distinct combinations without recursion).
His suggestion is similar to what I started above. Using the number of combinations calculated, he suggested to decompose it according to mod 10. However, I realized it wouldn't work.
For example, say I need to append two characters. This gives me 676 combinations using the character set above. If I am on the 523rd combination, the decomposition he demonstrated would yield
523 % 10 = 3
52 % 10 = 2
5 % 10 = 5
It should be obvious that this doesn't work. For one, it yields three characters, and two, if my character set is larger than 10 characters, the mapping ignores those above index 9.
Still, I believe a mapping is key to the solution.
The other method I explored utilized for loops:
//Pseudocode
c = charset;
for(i = 0; i < length(charset); i++){
    concat string
    for(j = 0; j < length(charset); j++){
        concat string
        for...
However, this hardcodes the length of the string I want to compute. I could use an if statement with a goto to break it, but I would like to avoid this method.
Any constructive input is appreciated.
Given a string, to find the next possible string in the sequence:
1. Find the last character in the string which is not the last character in the alphabet.
2. Replace it with the next character in the alphabet.
3. Replace every character to the right of that character with the first character in the alphabet.
Start with a string which is a repetition of the first character of the alphabet. When step 1 fails (because the string is all the last character of the alphabet) then you're done.
Example: the alphabet is "ajxz".
Start with aaaa.
First iteration: the rightmost character which is not z is the last one. Change it to the next character: aaaj
Second iteration. Ditto. aaax
Third iteration: Again. aaaz
Fourth iteration: Now the rightmost non-z character is the second last one. Advance it and change all characters to the right to a: aaja
Etc.
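The stepping rule above could be sketched in C as follows. The function name next_string and its parameters are made up for illustration. It uses no library calls (in line with the question's constraint) and assumes every character of the string occurs in the alphabet.

```c
#include <stddef.h>

/* Advance s (len characters) to the next string over alphabet[0 .. n-1].
 * Returns 1 on success, 0 if s was already the last string.
 * Assumes every character of s occurs in the alphabet. */
int next_string(char *s, size_t len, const char *alphabet, size_t n)
{
    for (size_t i = len; i > 0; i--) {
        /* Locate s[i-1] in the alphabet. */
        size_t pos = 0;
        while (pos < n && alphabet[pos] != s[i - 1])
            pos++;

        if (pos + 1 < n) {                     /* step 1: not the last letter */
            s[i - 1] = alphabet[pos + 1];      /* step 2: advance it */
            for (size_t k = i; k < len; k++)   /* step 3: reset everything to its right */
                s[k] = alphabet[0];
            return 1;
        }
    }
    return 0;   /* every character was the last letter: done */
}
```

A caller would fill the string with alphabet[0], process it, and keep calling next_string until it returns 0. With a fixed prefix such as "aa", pass a pointer to the part after the prefix so only the suffix is stepped.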
First, thanks for everyone's input; it was helpful. Since I am translating this algorithm into CUDA, I need it to be as efficient as possible on a GPU. The methods proposed certainly work, but they are not necessarily optimal for GPU architecture. I came up with a different solution using modular arithmetic that takes advantage of the base of my character set. Here's an example program, primarily in C with a mix of C++ for output, and it's fairly fast.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <iostream>
using namespace std;
typedef unsigned long long ull;
int main(void){
//Declarations
int init = 2;
char final[12] = {'a', 'a'};
char charSet[27] = "abcdefghijklmnopqrstuvwxyz";
ull max = 2; //Modify as need be
int base = strlen(charSet);
int placeHolder; //Maps to character in charset (result of %)
ull quotient; //Quotient after division by base
ull nComb = 1;
char comb[max+1]; //Array to hold combinations
int c = 0;
ull i,j;
//Compute the number of distinct combinations ((size of charset)^length)
for(j = 0; j < max; j++) nComb *= strlen(charSet);
//Begin computing combinations
for(i = 0; i < nComb; i++){
quotient = i;
for(j = 0; j < max; j++){ //No need to check whether the quotient is zero
placeHolder = quotient % base;
final[init+j] = charSet[placeHolder]; //Copy the indicated character
quotient /= base; //Divide the number by its base to calculate the next character
}
string str(final);
c++;
//Print combinations
cout << final << "\n";
}
cout << "\n\n" << c << " combinations calculated";
getchar();
}
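The core of the program above is the mapping from a combination index to characters: the index is written in base 26 (the size of the character set) instead of base 10, which fixes both problems with the mod 10 decomposition. Isolated as plain C, it might look like the sketch below. The function name index_to_suffix is made up, and out is assumed to have room for length + 1 characters.

```c
/* Write the `length` base-`base` digits of index into out, least significant
 * digit first, using charset as the digit symbols. out needs length + 1 bytes. */
void index_to_suffix(unsigned long long index, const char *charset,
                     unsigned base, char *out, unsigned length)
{
    for (unsigned j = 0; j < length; j++) {
        out[j] = charset[index % base];   /* current digit picks a character */
        index /= base;                    /* move on to the next digit */
    }
    out[length] = '\0';
}
```

For the 523rd combination from the question with two appended characters: 523 % 26 = 3 selects 'd', and 523 / 26 = 20 selects 'u'. Exactly two characters are produced and the whole character set is reachable. Because each index is converted independently, every GPU thread can compute its own string from its thread index without any shared state.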

Resources