- EEPP
- EPEP
- EPPE
**PEEP**- PEPE
- PPEE

*S*_{1}(POOL|S) = *S*_{1}(POOLS) is the set of words *W*' < POOLS that begin POOL
and have the same letter multiset as POOLS. Then the last letter of *W*'
must be S, but then it is not the case that *W*' < POOLS, so *S*_{1} is empty.

*S*_{2}(POO|LS) is the set of words POO*xy* where *xy* is a word on letter multiset
LS so that *x* < L. Again there are none.

*S*_{3}(PO|OLS) is the set of words PO*xyz* where *xyz* is a word on letter
multiset LOS so that *x* < O. (For this purpose we can ignore the prefix PO
and focus on the suffix OLS.) Then *xyz* is one of LOS, LSO. In fact, the
size #*S*_{3} of set *S*_{3} is the multinomial coefficient

#for we knowS_{3}=m(2; 1, 1) =m(1, 1) = (1 + 1)! / (1! · 1!) = 2,

*S*_{4}(P|OOLS) is the set of words P*xyzw* where *xyzw* is a word on letter
multiset LOOS and *x* < O. Then *x* can be only L, and *yzw* is a word on letter
multiset OOS. Then

#as there are 2 instances of O and 1 of S;S_{4}=m(2, 1) = (2 + 1)! / (2! · 1!) = 3,

*S*_{5}(|POOLS) is the set of words *xyzwv* on letter multiset LOOPS with *x* < P.
Then *x* is one of OL, and

#S_{5}=m(1, 1, 1, 1) +m(2, 1, 1).

In all,

That is,N(POOLS) - 1 =m(1, 1) +m(2, 1) +m(1, 1, 1, 1) +m(2, 1, 1) = 2 + 3 + 24 + 12 = 41.

W_{0}empty,W_{1}= S,W_{2}= LS,W_{3}= OLS,W_{4}= OOLS,W_{5}= POOLS.

Let *M _{n}* denote the set of arrangements of the letters of

|M| =_{n}m= [_{n}w(A) + ... +_{n}w(Z)]! / [_{n}w(A)! · ... ·_{n}w(Z)!]._{n}

In the code below, `*t` refers to the character at the head of *W _{n}*.
Let

6! = 1 · 2 · 3 · 4 · 5 · 6 = 2 · 3 · 2We store the number of factors 2 rather than a power of 2. For example, the^{2}· 5 · (2 · 3) = 45 · 2^{4}.

#include <stdio.h> int odd[] = {0,1,1,3,1,5,3,7,1,9,5,11,3,13,7,15,1,17,9,19,5,21,11,23,3,25}; int even[] = {0,0,1,0,2,0,1,0,3,0,1, 0,2, 0,1, 0,4, 0,1, 0,2, 0, 1, 0,3, 0};We need odd and even parts of only a few small numbers. Precompute them for speed.

void rank(char *s) { char L[32]; // L[k]: kth distinct character seen int C[32]; // C[k]: no. chars. L[k] in W_n int D = 0; // D: no. distinct characters seen int next[32]; // Let i(0) = 0; i(k) = next[i(k-1)] if k > 0; and next[i(D)] next[0] = 0; // = 0. The alphabetically kth distinct char. seen is L[i(k)]. unsigned long long rank = 1, m_odd = 1; int m_even = 0; // m = m_odd * (2 ^ m_even) char *t; // tail string W_n for(t = s; *t; t++); // advance t to end of string for(int n = 1; t-- != s; n++) { // read string right to left int k = 0, w = 0; // w: w_n(c < *t) for(; next[k] && L[next[k]] < *t; k = next[k]) w += C[next[k]]; // now L[k] is *t's alphabetical predecessor if(!next[k] || L[next[k]] > *t) { // *t is new; update L, C, D, and next D++; L[D] = *t; C[D] = 0; next[D] = next[k]; next[k] = D; } int ct = ++C[next[k]]; // ct: w_n(*t) // rank += m_{n-1} * w_n(c < *t) / w_n(*t); see comments @1, @2 if(w) rank += m_odd * odd[w] / odd[ct] << m_even + even[w] - even[ct]; // m = m_n = m_{n-1} * n / w_n(*t) m_odd = m_odd * odd[n] / odd[ct]; m_even = m_even + even[n] - even[ct]; } printf("%llu\n", rank); }

toS× {1, 2, ...,_{n}w(_{n}*t)}

M_{n-1}× {1, 2, ...,w(_{n}c<*t)}.

Here's one. Think of an element (*U*, *k*) of *S _{n}* × {1, ...,

E.g., if *W _{n}* = BAABA, then

Consider the following operation: take such an element (*U*, *k*),
pop off its head letter, and replace the underscored character by it.
The result is a word on the letters of *W*_{n-1} with a character *c* < `*t`
underscored.

E.g., AAA**B**B -> AA**A**B. That is, A_{1}A_{2}A_{3}**B**_{i}B_{ii} -> A_{2}A_{3}**A**_{1}B_{ii}.

The inverse operation is to remove the underscored character to the
front and fill its vacated place with `*t`.

It would be easier to avoid overflow by using a big-integer library such as GMP, but slower. If longer words are to be considered, we might try to separate more prime factors. Because factorials lack large prime factors, it should be possible to pack their prime-factor representations into a single 64-bit integer for strings 33 characters long. Combining this with the idea of separating 2 from the next 12 primes, it should be possible to pack prime-factor representations (say, with a 5-bit count of each prime) into a single 64-bit integer for strings of 43 characters or so.

int main(int argc, char **argv) { if(argc < 2) { printf("Usage: %s WORD\n", argv[0]); return 1; } if(argv[1][0] != '-') rank(argv[1]); else { // take words from stdin instead of argv (for performance testing) char line[30]; while(0 < scanf("%s", line)) rank(line); } return 0; }

Here is the complete program source.