Thursday, January 29, 2009

A nice permutations problem

Hello all,
It has been a while since I wrote in this blog. I'm back now with a nice permutation problem that I just solved correctly today. Given a string with size <= 20 and an index i (0-based) return the ith permutation of the string characters. This link is for the problem statement on TopCoder. You can view my code in the practice room as I've having problems with the "less than" signs here.

So, my idea is based on the fact that, having a string of n different chars, the number of permutations that begin with any given char is equal to Factorial(n-1). The proof is so simple, I will keep the first char in its place while trying to get all possible permutations of the remaining n-1 chars which is, as we learned in school, equal to Factorial(n-1);

When the string has some repeated chars it's a little bit different but still very simple. You count the frequencies of each char in the string excluding the first one. This is because you will leave it as it is, so it doesn't matter whether it is repeated or not. Then, for each obtained frequency of chars, divide Factorial(n-1) by Factorial(frequency[i]);

If index is not less than p, then you should try another char as the first char of the string, decrement your index by p and try again. The reason why you have to decrement index by p is because you are now beginning with a different char, which means that you counted p different permutations before getting to begin with that char. You should also note that, when selecting another char to begin with, you should select the one that comes after the current char in alphabetical order.

After selecting this char, put it at the beginning of the string and sort the remaining chars. So, if the current string is "abcd" and index is greater than or equal to p, select 'b', replace it with 'a' and sort the remaining chars to get "bacd". In this example, after replacing 'a' and 'b' yielded an order substring "acd", but it's not always the case, so sorting is required after replacing the first char with a new one.

One more thing to say, the boolean calcFreq vector is used to ensure that you count the frequency of a given char only once. If you have a string "aabc", at the first iteration of i you will get the frequency of the first 'a' charecter which is 2. At the second iteration you will calculate the frequency of second 'a' char which is again 2. But you shouldn't do that because this way you will divide by Fact(2) twice while you should only do it once.

Note:
The "<=20" constraint is required to solve the problem in this way. The maximum Factorial that can fit in a long long is Factorial(19) and since you don't have to calculate Factorial(n), just Factorial(n-1) then 20 is the maximum size of the string allowable.

No comments: