#include <stdio.h> /** Let < denote the lexicographic order when applied to words. Now * rank(W) - 1 is the number of words W' < W on the same letter multiset * as W. We partition these as follows. Let S_n(W) be the set of words * W' < W on the same letter multiset where W and W' disagree in the nth * position from the end, but agree in all positions left of that one. * * For example, let W be the word POOLS. * * S_1(POOL|S) = S_1(POOLS) is the set of words W' < POOLS that begin POOL * and have the same letter multiset as POOLS. Then the last letter of W' * must be S, but then it is not the case that W' < POOLS, so S_1 is empty. * * S_2(POO|LS) is the set of words POOxy where xy is a word on letter multiset * LS so that x < L. Again there are none. (S was a candidate, but S >= L.) * * S_3(PO|OLS) is the set of words POxyz where xyz is a word on letter * multiset LOS so that x < O. (For this purpose we can ignore the prefix PO * and focus on the suffix OLS.) Then xyz is one of LOS, LSO. In fact, the * size #S_3 of set S_3 is the multinomial coefficient * #S_3 = m(2; 1, 1) = m(1, 1) = (1 + 1)! / (1! * 1!), * for we know x is L, and once this is determined, there are m(1, 1) ways * to choose y and z from letter multiset OS (there is one O and one S). * * S_4(P|OOLS) is the set of words Pxyzw where xyzw is a word on letter * multiset LOOS and x < O. Then x can be only L, and yzw is a word on letter * multiset OOS. Then * #S_4 = m(2, 1) = (2 + 1)! / (2! * 1!) = 3 * (yzw is one of OOS, OSO, SOO). * * S_5(|POOLS) is the set of words xyzwv on letter multiset LOOPS with x < P. * Then x is one of OL, and * #S_5 = m(1, 1, 1, 1) + m(2, 1, 1). * * Finally, rank(POOLS) - 1 is * m(1, 1) + m(2, 1) + m(1, 1, 1, 1) + m(2, 1, 1) = 2 + 3 + 24 + 12 = 41. * That is, rank(POOLS) = 42. * * A useful insight is that each multinomial coefficient that we consider * differs only a little from the previous one. Instead of computing from * scratch each time, we just multiply and divide by little factors. * * Some definitions will be useful. * * Let the sequence of tail words considered in the loop be denoted * W_1, W_2, ..., e.g., * W_0 empty, W_1 = S, W_2 = LS, W_3 = OLS, W_4 = OOLS, W_5 = POOLS. * * Let M_n denote the set of arrangements of the letters of W_n. * Let w_n(c) be the number of occurrences of character c in W_n. * Then |M_n| is a multinomial coefficient, * |M_n| = m_n = (w_n(A) + ... + w_n(Z))! / (w_n(A)! * ... * w_n(Z)!). * * In the loop, *t refers to the character at the head of W_n. * Let w_n(c < *t) denote the number of characters c in W_n that precede * *t in the alphabet. * * Finally, a note on overflow. * * In this problem, words can be 25 letters long, but 24! doesn't fit in a * 64-bit integer. A way to avoid overflow is to separate factors of 2: * 6! = 1 * 2 * 3 * 4 * 5 * 6 = 2 * 3 * (2^2) * 5 * (2 * 3) = 45 * 2^4. * We store the number of factors 2 rather than 2 to a power. The odd * part of 24! fits in around 58 bits. Likewise for each multinomial * coefficient we consider: m(k; ...) has no more of each prime factor than * does k!, and k! has no more of each prime factor than does 24!. **/ int odd[] = {0,1,1,3,1,5,3,7,1,9,5,11,3,13,7,15,1,17,9,19,5,21,11,23,3,25}; int even[] = {0,0,1,0,2,0,1,0,3,0,1, 0,2, 0,1, 0,4, 0,1, 0,2, 0, 1, 0,3, 0}; void rank(char *s) { char L[32]; // L[k]: kth distinct character seen int C[32]; // C[k]: no. chars. L[k] in W_n int D = 0; // D: no. distinct characters seen int next[32]; // Let i(0) = 0; i(k) = next[i(k-1)] if k > 0; and next[i(D)] next[0] = 0; // = 0. The alphabetically kth distinct char. seen is L[i(k)]. unsigned long long rank = 1, m_odd = 1; int m_even = 0; // m = m_odd * 2 to the m_even power char *t; // tail string W_n for(t = s; *t; t++); // advance t to end of string for(int n = 1; t-- != s; n++) { // read string right to left int k = 0, w = 0; // w: w_n(c < *t) for(; next[k] && L[next[k]] < *t; k = next[k]) w += C[next[k]]; // now L[k] is *t's alphabetical predecessor if(!next[k] || L[next[k]] > *t) { // *t is new; update L, C, D, and next D++; L[D] = *t; C[D] = 0; next[D] = next[k]; next[k] = D; } int ct = ++C[next[k]]; // w_n(*t) // rank += m_{n-1} * w_n(c < *t) / w_n(*t); see comments @1, @2 if(w) rank += m_odd * odd[w] / odd[ct] << m_even + even[w] - even[ct]; // m = m_n = m_{n-1} * n / w_n(*t) m_odd = m_odd * odd[n] / odd[ct]; m_even = m_even + even[n] - even[ct]; } printf("%llu\n", rank); } /** Comment @1: * This formula is a straightforward consequence of the idea (above) to make * small adjustments to multinomial coefficients. A combinatorial * interpretation would be nice. * * Let the cartesian product of sets A and B be denoted A x B. Then a * combinatorial interpretation means a one-to-one correspondence from: * S_n x {1, 2, ..., w_n(*t)} * to: * M_{n-1} x {1, 2, ..., w_n(c < *t)}. * * Here's one. We can think of an element (U, k) of S_n x {1, ..., w_n(*t)} * as a word U < W_n (on the letters of W_n) with its kth character *t * underscored. (By definition, there are w_n(*t) occurrences of the letter * *t in W_n. Put an underscore under the kth one, but don't tie it to * the character, as the character will be removed shortly.) * * E.g., if W_n = BAABA, then U could be AAABB, and k could be 1, giving * AAAbB, where we use lower case to indicate the underscored character. * * Consider the following operation: take such an element (U, k), * pop off its head letter, and replace the underscored character by it. * The result is a word on the letters of W_{n-1} with a character c < *t * underscored. * * E.g., AAAbB -> AAaB. * * The inverse operation is to remove the underscored character to the * front and fill its vacated place by *t. **/ /** Comment @2: * Overflow may occur only when we reconstitute a multinomial coefficient * from its odd and even parts or when we add it to the tally. Either way, * if overflow occurs, it is because the answer itself is too large. **/ int main(int argc, char **argv) { if(argc < 2) { printf("Usage: %s WORD\n", argv[0]); return 1; } if(argv[1][0] != '-') rank(argv[1]); else { // take words from stdin instead of argv (for performance testing) char line[30]; while(0 < scanf("%s", line)) rank(line); } return 0; }