Barrett reduction for polynomials

Let F denote some field (e.g. the real numbers, or integers modulo some prime), and let F[x] denote the ring of polynomials over that field, and let P, S, T denote elements of that polynomial ring. Polynomial division can then be defined; division of one polynomial by another yields a (unique) quotient and a (unique) remainder. Following the Python 3 notation where // denotes floor division, let S // P denote the quotient obtained from dividing S by P. The key property of Barrett reduction is then:

Provided that deg(S // T) ≤ 0, S // P equals (S * (T // P)) // T.

This lets us turn division by P into multiplication by T // P followed by division by T. Said division is trivial if T is chosen to be xⁿ (with n ≥ deg(S) to make the precondition hold), and T // P can be precomputed if P is fixed.

Proving that this property is true turns out to be surprisingly elusive; all the good expositions on the internet are about Barrett reduction for integers rather than for polynomials. Accordingly, I'm providing a proof in this post. The proof proceeds in four parts:

Defining / (true division rather than floor division) for (extended) polynomials, and defining term-slicing P[j:k] for (extended) polynomials.
Showing that S // P equals (S * (T / P)) // T.
Showing that (S*(T/P))//T equals (S*(T//P))//T plus (S*(T/P)[-∞:0])//T.
Showing that (S*(T/P)[-∞:0])//T equals zero. This final part is the only part which requires the precondition of deg(S // T) ≤ 0.

These four parts combine in the obvious way to prove the overall reduction property.

Part 1: Definitions

P being an element of the polynomial ring F[x] just means that P equals sum(p_i * xⁱ), where i runs over some finite set of non-negative integers, and each p_i is a non-zero element of F. Note that deg(P) denotes the maximum such i from this set (or -∞ if this set is empty).

Extended polynomials can be defined similarly; P equals sum(p_i * xⁱ), where i runs over some (possibly infinite) set of (possibly negative) integers, and each p_i is a non-zero element of F. The requirement on p_i being non-zero can also be dropped, provided that deg(P) is redefined as the maximum i such that p_i is non-zero. As described, deg(P) could be +∞, but we'll restrict ourselves to deg(P) being finite (or being -∞ for the zero polynomial).

Every polynomial is trivially an extended polynomial. Going the other way is harder; an extended polynomial can only be trivially turned into a polynomial when it has a finite number of terms and has no xⁱ terms with negative i.

It is also useful to define term-slicing; given P and integer j and integer k, P[j:k] denotes sum(p_i * xⁱ) restricted to just j ≤ i < k. As a shorthand, P[j] denotes P[j:j+1]. It is trivially true that P equals P[-∞:j] plus P[j:+∞].

Addition and multiplication of extended polynomials follows that of normal polynomials. Division is where things get interesting; the whole point of introducing extended polynomials is to allow the definition of true division (/) for extended polynomials, which is just like floor division (//) for normal polynomials, except that it produces a quotient with an infinite number of (negative i) terms (and no remainder), rather than terminating with a finite quotient and a remainder. For comparison, the division algorithms are:

def floor_division(S, P):
    Q = 0
    while S != 0 and deg(S) >= deg(P):
        T = S[deg(S)] / P[deg(P)]
        Q = Q + T
        S = S - T * P
    return Q, S

def true_division(S, P):
    Q = 0
    while S != 0:
        T = S[deg(S)] / P[deg(P)]
        Q = Q + T
        S = S - T * P
    return Q

Note that as written true_division often never terminates. For actual computation, it needs to be re-expressed as a generator which yields terms of Q in descending i order (and then only a finite number of terms used). Both algorithms involve division of monomonials, which is just division of the field elements and subtraction of the x powers.

Given the similarity in algorithms, it is clear that S // P equals (S / P)[0:+∞]. Various other properties hold; most notably (S / P) // T equals S // (P * T).

One benefit of true division over floor division is that true division is associative with multiplication; that is (S / P) * T equals S / (P * T). Accordingly, S / P * T can expressed without any parentheses. Additionally, it commutes with multiplication, so S / P * T equals S * T / P, and it commutes with itself in so far as S / P / T equals S / T / P.

Part 2: S // P equals (S * (T / P)) // T

This is just algebraic manipulation:

S // P == (S      / P    ) // 1
       == (S *  T / P / T) // 1
       == (S *  T / P    ) // T
       == (S * (T / P)   ) // T

Part 3: (S*(T/P))//T equals (S*(T//P))//T plus (S*(T/P)[-∞:0])//T

This involves term-slicing on T / P, splitting it into (T / P)[-∞:0] plus (T / P)[0:+∞], then noting that the latter is T // P:

(S*(T/P))//T == ( S*((T /P)[0:+∞] +    (T/P)[-∞:0]) )//T
             == ( S*((T//P)       +    (T/P)[-∞:0]) )//T
             ==  (S* (T//P))//T   + (S*(T/P)[-∞:0] )//T

Part 4: (S*(T/P)[-∞:0])//T equals zero

Start with some algebraic manipulation:

(S*(T/P)[-∞:0])//T == ( S    * (T/P)[-∞:0]  )//T
                   == ( S    * (T/P)[-∞:0]/T)//1
                   == ((S/T) * (T/P)[-∞:0]  )//1
                   == ((S/T) * (T/P)[-∞:0]  )[0:+∞]

As deg(S // T) ≤ 0 (by hypothesis), it is also the case that deg(S / T) ≤ 0. Clearly also deg((T/P)[-∞:0]) < 0 by the nature of term-slicing. Hence the deg of their product is also less than zero (as multiplication of polynomials sums their degrees). Hence the [0:+∞] term-slicing results in no terms — or in other words, results in the zero polynomial.