Jupyter:-SetOutputRendererByType( algebraic, "text/latex" ): # ignore this
We first do some symbolic processing and setup for the perturbation analysis and companion matrix methods we will use. This is not the famous Wilkinson polynomial! This one (p. 67 in Part II, cited above) "arose in filter design."
f := mul(z^2 + z*A[i] + B[i], i = 1 .. 7) - k*mul((z + C[i])^2, i = 1 .. 6);
Numerical values for the parameters $A_i$, $B_i$, $C_i$, and $k$ will be given soon.
varlist := [seq(A[i], i = 1 .. 7), seq(B[i], i = 1 .. 7), seq(C[i], i = 1 .. 6), k]:
G := Student:-MultivariateCalculus:-Gradient(f, varlist):
with(LinearAlgebra):
#
# BHIP: Barycentric Hermite Interpolation Program
#
# (c) Robert M. Corless, December 2007, August 2012
#
# Compute the barycentric form of the unique Hermite interpolant
# of the polynomial given by values and derivative values of
# p(t) at the nodes tau.
#
# CALLING SEQUENCES
#
# ( p, gam ) := BHIP( flist, tau, t );
# ( p, gam ) := BHIP( ftayl, tau, t, 'Taylor' = true, 'Denominator' = q );
# ( p, gam, DD ) := BHIP( ftayl, tau, t, <opts>, 'Dmat'=true )
#
# Input:
# flist :: list of lists of derivative values
# [ [f[1,0],f[1,1],...,f[1,s[1]-1]],
# [f[2,0],f[2,1],...,f[2,s[2]-1]],
# ...
# [f[n,0],f[n,1],...,f[n,s[n]-1]] ]
# or, equivalently, when 'Taylor' is true,
# ftayl :: list of lists of Taylor Coefficients,
# same as above but divided by factorials
# [ [f[1,0]/0!,f[1,1]/1!,...,f[1,s[1]-1]/(s[1]-1)!],
# [f[2,0]/0!,f[2,1]/1!,...,f[2,s[2]-1]/(s[2]-1)!],
# ...
# [f[n,0]/0!,f[n,1]/1!,...,f[n,s[n]-1]/(s[n]-1)!] ]
#
# As a convenience, singleton lists may be entered without []
# If flist[i] = [] is empty, then nothing
# is known about f at t = tau[i], and
# tau[i] will not figure in the output.
# Sometimes this is useful.  Not every flist[i] may
# be empty, though: at least one node must carry data.
#
# tau :: list of distinct complex nodes
# [tau[1],tau[2],...,tau[n]]
#
# t :: name of the variable to use
#
# Taylor :: boolean, 'true' if input is in Taylor form
#
# Denominator :: polynomial in t (default 1 if absent), or
#                a list of lists of its values and derivatives at the nodes
#                (Taylor coefficients instead, if Taylor is true)
#
# N.B. f[i,j] = (D@@j)(f)(tau[i]) 0 <= j <= s[i]-1
# = flist[i][j+1]
# Dmat :: boolean, 'true' if differentiation matrix desired
#
# Output:
# p :: The unique (rational) Hermite interpolant, in barycentric
# form. Namely,
#        w(t)*add( add( gam[i,j]/(t-tau[i])^(1+j)
#                       *add( ftayl[i][1+k]*(t-tau[i])^k, k=0..j ),
#                  j=0..s[i]-1 ), i=1..n )
# This can be converted to distributed form by calling
# distrib(p);
#
# gam :: The Array(1..n, 0..smax-1) of coefficients
#
# Here w(t) = mul( ( t-tau[i])^s[i], i = 1..n )
# smax = max( op(s) )
# and s = list of sizes of each list in flist,
# so s[i] is the number of pieces of
# information at node tau[i].
#
# DD --- Differentiation matrix on the nodes tau
#
# Processing: local Laurent series.
# This approach is different to that of
# Reference: C. Schneider & W.Werner, "Hermite
# Interpolation: The Barycentric Approach",
# Computing 46, 1991, pp 35-51.
#
BHIP := proc( pin::list, tau::list, t::name,
{Taylor::truefalse:=true},
{Conditioning::truefalse:=false},
{Dmat::truefalse:=false},
{Denominator::{algebraic,list}:=1} )
local brks, d, DD, denr, dens, dgam,
dr, g, gam, ghat,h, i, irow, j,
k, mu, n, numr, nums,
p, P, q, r, rs, rt, s, smax, sq;
n := nops(tau);
if nops(pin) <> n then
error "Mismatched size of node list and data list"
end if;
if nops(convert(tau,set)) < n then
error "Nodes must be distinct, with confluency explicitly specified."
end if;
p := map(t -> `if`(t::list,t,[t]),pin); # singletons ok
s := map(nops,p); # confluency
smax := max(op(s));
if smax = 0 then
error "At least one piece of data is necessary."
end if;
d := -1 + add( s[i], i=1..nops(s) ); # degree bound
p := `if`( Taylor, p, [seq([seq(p[i][j]/(j-1)!,j=1..s[i])],i=1..n)] );
gam := Array( 1..n, 0..smax-1 ); #default 0
if Conditioning then
dgam := Array( 1..n, 0..smax-1, 1..n ); #default 0
end if;
# The following works for n>=1
for i to n do
if s[i] > 0 then # ignore empty lists
h[i] := mul( (t-tau[j])^s[j], j = 1..i-1 )*
mul( (t-tau[j])^s[j], j=i+1..n );
r[i] := series( 1/h[i], t=tau[i], s[i] );
for j to s[i] do
gam[i,s[i]-j] := coeff( r[i], t-tau[i], j-1) ; #op( 2*j-1, r[i] );
end do;
if Conditioning then
# We could compose a series for 1/(t-tau[k]) with
# what we know, but using the kernel function "series"
# is likely faster.
for k to i-1 do
dr[i,k] := series( s[k]/h[i]/(t-tau[k]), t=tau[i], s[i] );
for j to s[i] do
dgam[i, s[i]-j, k] := coeff( dr[i,k], t-tau[i], j-1 );
end do;
end do;
# We could reuse earlier series, and do one O(n^2)
# computation to get gam[i,-1], but it's simpler to
# use series (and likely faster because series is in the kernel)
dr[i,i] := series( 1/h[i], t=tau[i], s[i]+1 );
# We implicitly divide this by t-tau[i], and take
# coefficients one higher.
for j to s[i] do
dgam[i,s[i]-j,i] := j*coeff( dr[i,i], t-tau[i], j );
end do;
for k from i+1 to n do
dr[i,k] := series( s[k]/h[i]/(t-tau[k]), t=tau[i], s[i] );
for j to s[i] do
dgam[i, s[i]-j, k] := coeff( dr[i,k], t-tau[i], j-1 );
end do;
end do;
end if;
end if;
end do;
if not (Denominator::algebraic and Denominator=1) then
# adjust gam by folding in q
if Denominator::list then
if nops(Denominator)<>n then
error "Denominator list (q) has the wrong length."
end if;
q := `if`( Taylor, Denominator, [seq([seq(Denominator[i][j]/(j-1)!,j=1..s[i])],i=1..n)] );
else
ghat := Array( 1..n );
for i to n do
sq := series(Denominator,t=tau[i],s[i]);
ghat[i] :=[seq(coeff(sq,t-tau[i],j),j=0..s[i]-1)];
end do;
q := [seq(ghat[i],i=1..n)];
end if;
ghat := Array( 1..n, 0..smax-1 );
for i to n do
for j from 0 to s[i]-1 do
ghat[i,j] := add( gam[i,j+k]*q[i][k+1], k=0..s[i]-j-1 );
end do;
end do;
gam := ghat;
end if;
P := mul( (t-tau[i])^s[i],i=1..n)*
add(add(gam[i,j]/(t-tau[i])^(1+j)*
add(p[i][1+k]*(t-tau[i])^k, k=0..j),
j=0..s[i]-1),
i=1..n );
# Translated from Matlab. Nearly working.
if Dmat then
# Compute differentiation matrix
DD := Matrix( d+1, d+1 );
brks := [seq(add(s[j],j=1..i-1),i=1..nops(s))]; #cumsum([0,s.']);
irow := 0;
for k to n do
# trivial rows
for j to s[k]-1 do
irow := irow+1;
# next available row
DD[irow,brks[k]+j+1] := j; # result is in Taylor form
end;
# Nontrivial row
irow := irow+1;
for i in [seq(j,j=1..k-1),seq(j,j=k+1..n)] do
for j to s[i] do
g := 0;
for mu from j-1 to s[i]-1 do
g := g + gam[i,mu]*(tau[k]-tau[i])^(j-2-mu);
end;
DD[irow,brks[i]+j] := g/gam[k,s[k]-1];
end;
end;
DD[irow,brks[k]+2..brks[k]+s[k]] := -gam[k,0..s[k]-2]/gam[k,s[k]-1];
# Final entry
DD[irow,brks[k]+1] := -add( DD[irow,brks[j]+1], j=1..nops(brks) );
DD[irow,1..-1] := DD[irow,1..-1]*s[k]; # want Taylor form of derivative
end;
end if;
return P, gam, `if`(Conditioning,dgam,NULL), `if`(Dmat,DD,NULL) ;
end proc:
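To make the calling sequences above concrete, here is a tiny usage sketch (my addition; the data are arbitrary). The Hermite data $f(0)=1$, $f'(0)=0$, $f(1)=2$, given as derivative values (so Taylor must be set to false), determine the quadratic $t^2+1$.
# Tiny usage sketch (my addition): derivative-value data at nodes 0 and 1.
( pex, gamex ) := BHIP( [[1, 0], [2]], [0, 1], t, ':-Taylor' = false ):
normal( pex );   # simplify the barycentric form; this should give t^2 + 1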
#Condition number
BBHIP := proc( pin::list, tau::list, t::name, {Taylor::boolean:=false} )
local B, gam, n, i, j, k, p, s;
n := nops(tau);
p := map(t -> `if`(t::list,t,[t]),pin); # singletons ok
s := map(nops,p); # confluency
( p, gam ) := BHIP( pin, tau, t, ':-Taylor'=Taylor );
return abs(mul((t-tau[i])^s[i],i=1..n))*add( add(
abs( add( gam[i,j+k]/k!/(t-tau[i])^(j+1),j=0..s[i]-1-k) ),
k=0..s[i]-1),
i=1..n )
end proc:
# Generalized companion matrix pencil
# Upended October 2012
# Transposed and negated to agree with Standard Triples Paper May 2018
# Remaining TODO: incorporate matrix polynomials via Kronecker product
CMP := proc( pin::list, tau::list, t::name, {Taylor::boolean:=false} )
local C0, C1, gam, n, i, j, k, p, P, s, d;
n := nops(tau);
p := map(t -> `if`(t::list,t,[t]),pin); # singletons ok
s := map(nops,p); # confluency
p := `if`(Taylor, p, [seq([seq(p[i][j]/(j-1)!,j=1..s[i])],i=1..n)]);
d := -1 + add( s[i], i=1..n );
# P is irrelevant here, bar shape
( P, gam ) := BHIP( p, tau, t, ':-Taylor'=true );
C1 := Matrix( d+2, d+2, (i,j)->`if`(i<>j,0,`if`(i=1,0,1)) );
C0 := Matrix( d+2, d+2, 0 );
k := 0;
for i to n do
for j to s[i] do
k := k+1;
if j < s[i] then
C0[d+3-(k),d+3-(k+1)] := 1;
end if;
C0[d+3-k,d+3-k] := tau[i];
C0[1,d+3-k] := -p[i][j];
C0[d+3-k,1] := gam[i,j-1];
end do;
end do;
return C0, C1
end proc:
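As a small sanity check of CMP (my addition, a sketch): take Lagrange data for $q(t)=t^2-1$ at the nodes $0,2,3$. The determinant of the resulting pencil should vanish exactly at $\pm 1$; the pencil is $4\times 4$ while $q$ has degree $2$, so two eigenvalues are at infinity.
# Tiny check of CMP (my addition): Lagrange data for q(t) = t^2 - 1.
C0t, C1t := CMP( [[-1], [3], [8]], [0, 2, 3], t ):
factor( Determinant( s*C1t - C0t ) );   # expect a constant multiple of (s-1)*(s+1)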
# distribution utility
distrib := proc( p )
local w;
if op(0,p) <> `*` then
return p
elif op(-1,p)::`+` then
w := p/op(-1,p);
return add( w*t, t in [op(op(-1,p))] )
else
return p
end;
end:
TPHC := proc( n, m )
local B, Bc, ctrs, i, mx, p, plts, sx, t, x, y;
p := [[seq((-1)^k, k = 0 .. n-1)], [seq((-1)^k, k = 0 .. m-1)]];
B := BBHIP( p, [0,2], t );
sx := fsolve( diff( `assuming`([simplify(B)], [t::(RealRange(0, 2))]), t ),
t=0.0001..1.9999 );
if [sx] = [] then error "Failed to locate the maximum" end;
if has(sx,t) then sx := 1; end;
mx := eval(B, t = sx);
Bc := eval(B, t=x+I*y );
ctrs := mx*[seq(100^k, k = 0 .. 3)];
plts := NULL;
for i in ctrs do
plts := plts, plots[implicitplot](Bc-i, x = -1.5 .. 3.5, y = -2 .. 2, colour =
BLACK, grid = 2*[50, 50], gridrefine = 2, view=[-1.5..3.5, -2..2],
scaling=CONSTRAINED )
end do;
return (mx, {plts});
end:
# This routine writes out the coefficients of 1 in the Hermite basis
# in reverse order with an extra zero in front RMC May 2018
# This gives us our X for the standard triple for Hermite/Lagrange linearizations
rho2x := proc( pin, {Taylor::truefalse:=false} )
local i, j, n, one, p, s;
p := map(t -> `if`(t::list,t,[t]),pin); # singletons ok
n := nops(p);
s := map(nops,p); # confluency
p := `if`(Taylor, p, [seq([seq(p[i][j]/(j-1)!,j=1..s[i])],i=1..n)]);
one := Vector[row]([0,seq(op([seq(0,j=2..s[n+1-i]),1]),i=1..n)] );
end:
Here are the coefficients in the Wilkinson filter design polynomial. After we input them, we solve for the roots with Maple's built-in polynomial solver (fsolve), at high precision.
Digits := 40:
A := Vector(7, [2.008402247, 1.974225110, 1.872661356, 1.714140938, 1.583160527, 1.512571776, 1.485030592]):
B := Vector(7, [1.008426206, 0.9749050168, 0.8791058345, 0.7375810928, 0.6279419845, 0.5722302977, 0.5513324340]):
C := Vector(6, [0, 0.7015884551, 0.6711668301, 0.5892018711, 1.084755941, 1.032359024]):
k := 0.1380e-7:
rts := fsolve(f, z, complex):
We now have accurate roots, computed by Maple's built-in solver, for comparison. We will now compute them using two different companion pencils. I had hoped to get to the new "Algebraic Linearization" style companion pencil, but found I was having too much fun with these. Let's see how far we get.
Jupyter:-SetOutputRendererByType( algebraic, default ): # ignore this
rtplot := plots[complexplot]([seq(rts[j], j = 1 .. nops([rts]))], style = POINT, scaling = constrained, gridlines):
plots[display](rtplot, labels=[Re(lambda),Im(lambda)]);
Now let's use a Hermite interpolational companion pencil to find those roots, instead of using the general-purpose fsolve.
We choose to interpolate $f$ and $f'$ at $-C_i$, plus just $f$ at 3 other nodes. This is not optimal, but it is something that someone might try (if they got mixed up and thought $k$ was large and not small). But as we will see, it works anyway, partly because the two pieces of the polynomial almost have roots in common.
Jupyter:-SetOutputRendererByType( algebraic, "text/latex" ): # ignore this
Digits := 15:
nodes := [seq(-C[i], i = 1 .. 6), -0.8, -0.9, -1.0];
0 | -.7015884551 | -.6711668301 | -.5892018711 | -1.084755941 | -1.032359024 | -.8 | -.9 | -1.0 |
rho := [seq([eval(f, z = nodes[i]), eval(diff(f, z), z = nodes[i])], i = 1 .. 6), seq([eval(f, z = nodes[i])], i = 7 .. 9)];
.126287765219049 | 2.06213618858892 |
.605505365886479e-12 | .815161144963878e-10 |
.184170713582482e-10 | .175669428835534e-8 |
.704348208520194e-8 | .402288357954962e-6 |
.966822391131390e-10 | -.805664141294696e-8 |
.305598214392876e-12 | -.473585628114844e-10 |
.170313496239327e-14 | 0 |
.944297283269356e-14 | 0 |
.267201106433612e-15 | 0 |
Oh, that printed in an interesting way in this version of Jupyter with a Maple kernel. This is a new interface, and some of the kinks are still being worked out (as you can see).
Anyway, what we just did was to specify $15$ pieces of information about that polynomial: $f$ and $f'$ at 6 nodes, and just $f$ at three more (making $15$). Those three zeros at the end are just because of the list-to-array style formatting output: they should be blank. The fact that it's a list of lists and of numbers probably confused it.
C0, C1 := CMP(rho, nodes, t);
C0;
(output elided by the interface: 16 x 16 Matrix)
Here's what that $C_0$ matrix looks like (kind of like Matlab's "spy" command). Black boxes are nonzero entries. We can see the arrowhead shape, with little Jordan block transposes on the diagonal.
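The actual picture did not survive in this export; a hedged way to regenerate a spy-style view (my addition) is Maple's plots:-sparsematrixplot, which marks the nonzero entries of a Matrix (its orientation may differ from the printed matrix layout).
# Spy-style view of the nonzero structure of C0 (my addition).
plots:-sparsematrixplot( C0 );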
The $C_1$ matrix is just an identity matrix with its first entry zeroed out:
C1;
The Hermite interpolational polynomial is just $\det(\lambda C_1 - C_0 )$. Note the pencil is $16$ by $16$ while the degree of the polynomial is $14$. So we will have spurious eigenvalues at infinity.
In the scalar case, this is actually no bother. In the matrix polynomial case, it gets to be annoying.
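A quick numerical sanity check of that claim (my addition, a sketch): $\det(s C_1 - C_0)$ should be a constant multiple of $f(s)$, so the ratio of the two, evaluated at a couple of test points away from the roots, should be essentially the same.
# Check that det(s*C1 - C0) is proportional to f(s) (my addition, a sketch).
r1 := Determinant( 0.3*C1 - C0 )/eval(f, z = 0.3):
r2 := Determinant( 1.2*C1 - C0 )/eval(f, z = 1.2):
r1, r2;   # expect the two ratios to (nearly) agree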
We let Maple's interface to the QZ algorithm (which is the NAG library version) deal with the balancing.
lambda := Eigenvalues(C0, C1);
Double precision arithmetic (IEEE standard) is used underneath, then translated to Maple's floats if necessary. Notice that both spurious infinite eigenvalues were computed exactly. We always get one exactly; sometimes the other. This is because the zero is in the top left corner and the arrow points up and to the left and so the first infinite eigenvalue gets deflated without rounding error simply by executing the standard algorithm (thanks, Piers, for pointing that out, and arguing for up-and-to-the-left).
We now want to see how good an answer we got. We'll compute a structured backward error.
finiteroots := [seq(lambda[i], i = 3 .. 16)]:
NB: The existing backward error theory for Hermite interpolational companion pencils is not well-developed. I know of only one paper, by Piers Lawrence and myself, from 2011, published only in extended abstract form.
In short: backward stable, with errors $O(\mathbf{u}\,\| \rho \|)$ (not $\| \rho \|^2$), similar to Lagrange bases but not quite as good, because a factor $(\tau_i-\tau_j)^{-s}$, where $s$ is the confluency, is hidden in that $O$ symbol.
But in any case that theory doesn't apply here, because the data for this polynomial are the $A$, $B$, $C$, and $k$ coefficients, and they enter nonlinearly. So we have to do a special-purpose analysis. $\newcommand{\e}{\varepsilon}$ We replace each $A_i$, $B_i$, $C_i$, and $k$ with $A_i + \e \Delta A_i$, etc, then expand and ignore terms of $O(\e^2)$.
macro(e = varepsilon):
Digits := 2*Digits: # Residuals need to be accurate
fpdf := convert(series(mul(z^2 + (e*dA[i] + A[i])*z + B[i] + e*dB[i], i = 1 .. 7) - (dk*e + k)*mul((e*dC[i] + z + C[i])^2, i = 1 .. 6), e, 2), polynom):
We constrain the changes in the coefficients to satisfy the residuals (and keep it real):
constrs := [seq(eval(fpdf, z = finiteroots[2*i - 1]), i = 1 .. 7)]:
constrs := map(t -> evalc(Re(t)), constrs):
constrs := eval(constrs, e = 1):
constrs := map(t -> t = 0, constrs):
obj := add(dA[i]^2, i = 1 .. 7) + add(dB[i]^2, i = 1 .. 7)
+ add(dC[i]^2, i = 1 .. 6) + dk^2;
We minimize the sum of squares of the changes, subject to the constraints being satisfied (Maple's minimization methods, from NAG, prefer smooth objective functions). We could have done this with the SVD instead, but this is just fine; a sketch of the SVD alternative appears after the Optimization call below.
errorbackward := Optimization:-Minimize(obj, constrs):
dvarlist := [seq(dA[i], i = 1 .. 7), seq(dB[i], i = 1 .. 7), seq(dC[i], i = 1 .. 6), dk]:
max( map(abs, eval( dvarlist, errorbackward[2] ) ) );
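Here is a sketch of that SVD alternative (my addition): the constraints are linear in the perturbations, so the minimum 2-norm correction can be obtained from a least-squares solve. This assumes that LinearAlgebra:-LeastSquares with method = 'SVD' returns the minimum-norm solution of the underdetermined system, which I believe it does.
# SVD-based alternative to the Optimization call (my addition, a sketch).
Mcon, bcon := GenerateMatrix( constrs, dvarlist ):
dmin := LeastSquares( evalf(Mcon), evalf(bcon), method = 'SVD' ):
max( map(abs, convert(dmin, list)) );   # should be comparable to the value above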
fullfdf := mul(z^2 + (dA[i] + A[i])*z + B[i] + dB[i], i = 1 .. 7)
- (dk + k)*mul((dC[i] + z + C[i])^2, i = 1 .. 6):
computedfdf := eval( fullfdf, errorbackward[2] ):
max( map(abs, [seq( eval(computedfdf, z=finiteroots[i]),i=1..14 )]) );
That showed that the componentwise backward error is no larger than 3.1e-12, which (given that Wilkinson gave the coefficients only to at most 10 digits) is fine.
Ordinary double precision arithmetic, using the simple QZ algorithm on a suboptimal Hermite interpolational companion pencil, found the roots of the Wilkinson filter polynomial to all the accuracy that could be desired.
What if instead we were naive and expanded the polynomial in the monomial basis? Wilkinson warned against this, and said he had to use "treble precision" in that case.
Traditionally expansion in monomials is very bad, but in this case, because the roots are mostly inside the unit disk, where monomials are great, this should actually work reasonably well, considering. Let's try.
Digits := 15:
fex := expand(f):
Bcond := map(abs, fex);
The condition number of that polynomial (by the Farouki–Rajan formulation) on the unit disk (which is mostly where we are working) is no more than the sum of those coefficients, which is about $6000$. So if our backward error is good here, we should wind up with pretty accurate roots.
(Narrator: That story is wrong, and it's wrong in at least two ways.)
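For concreteness, a quick check of that bound (my addition): the sum of the absolute values of the monomial coefficients bounds the Farouki–Rajan function $B(z)=\sum_i |a_i|\,|z|^i$ on the unit disk.
# Sum of |coefficients| of the expanded polynomial (my addition).
evalf( add( abs(coeff(fex, z, i)), i = 0 .. degree(fex, z) ) );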
Frob := CompanionMatrix(fex, z);
lambdaex := Eigenvalues(Frob);
Digits := 4*Digits: # Residuals need to be accurate
resex := [seq( eval(f,z=lambdaex[i]), i=1..14 )]:
max( map(abs, resex));
constrs := map(expand,[seq(eval(fpdf, z = lambdaex[i]), i = 1 .. 14)]):
constrs := eval(constrs, e = 1):
constrs := map(t -> op([evalc(Re(t)), evalc(Im(t))]), constrs):
constrs := map(t -> t = 0, constrs):
obj := add(dA[i]^2, i = 1 .. 7) + add(dB[i]^2, i = 1 .. 7)
+ add(dC[i]^2, i = 1 .. 6) + dk^2;
constrs[1];
.195898703900313161791243947325665959368038158774879892550175e-12-.126806030899680498551899507584511018975186619307264296339426e-9*dA[7]+.178735617507357017711981628726250294111247044889610986834719e-9*dB[7]-.566240573151992483818004098622691007220203681814427167852162e-10*dA[6]+.798127327083590175782903874621530693555117941064410312690825e-10*dB[6]-.171877013061620193220834941652812756482975571733682760320048e-10*dA[5]+.242264061471907143602860498613732792377968333610219877232459e-10*dB[5]-.560420210447053867059078436090948006641184301572631831370566e-11*dA[4]+.789923410323455893866839451551116312229890801536874412871322e-11*dB[4]-.258042691854863520264004341995010514330973429264801569227920e-11*dA[3]+.363716296020869179419372889135480891366964383278943777468588e-11*dB[3]-.179092256623234929856975801562237099101609004828178630436616e-11*dA[2]+.252434090486311388814804094116755747378242314593037734510388e-11*dB[2]-.159974932094656672399793837129166032223425137880563026672486e-11*dA[1]+.225487842106317444029768377675335101437730793187132244552031e-11*dB[1]-.830560140402223770978250754704613188509709188128634844669231e-18*dC[6]-.714601094555001571973549568196699379127688837748292608599322e-18*dC[5]+.223005807659714631336824523399529333875364619729897992195757e-17*dC[4]+.700322477657096985570116312086084730825324211071867606861259e-17*dC[3]+.340640065322292464586427522897078084756822809961217766264240e-16*dC[2]+.378013281441662425816112598983251479112607477997830142547835e-18*dC[1]-.971687869221101495284880837562652532837058136606673539745688e-11*dk = 0
errorbackward := Optimization:-Minimize(obj, constrs,
feasibilitytolerance=10^(3-Digits)):
max( map(abs, eval( dvarlist, errorbackward[2] ) ) );
fullfdf := mul(z^2 + (dA[i] + A[i])*z + B[i] + dB[i], i = 1 .. 7)
- (dk + k)*mul((dC[i] + z + C[i])^2, i = 1 .. 6):
computedfdf := eval( fullfdf, errorbackward[2] ):
max( map(abs, [seq( eval(computedfdf, z=lambdaex[i]),i=1..14 )]) );
A backward error of more than 0.4? But the actual backward error is 4.0e-10? Worse than before? I'm probably not using the Optimization package correctly.
Oh, smacks forehead! Of course that constraint matrix is ill-conditioned: most of the roots come in complex conjugate pairs, so the constraints are (nearly) duplicated.
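A diagnostic that makes this concrete (my addition, a sketch): form the constraint matrix and look at its extreme singular values; the (nearly) duplicated rows from conjugate-pair roots force the smallest singular value to be (nearly) zero.
# Extreme singular values of the constraint matrix (my addition, a sketch).
Mcon2, bcon2 := GenerateMatrix( constrs, dvarlist ):
sv := SingularValues( evalf[15](Mcon2) ):
sv[1], sv[-1];   # a huge gap signals the (near) rank deficiency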
But a direct use of the SVD also fails to find a decent backward error (or at least I couldn't make it work, in my jet-lagged state).
Note that two of those computed eigenvalues were real, this time. Not looking good anyway. How about direct solution of the expanded polynomial?
Digits := 15:
fsolvex := Vector( [fsolve( fex, z, complex )] );
Holy moly.
Digits := 2*Digits: # Residuals need to be accurate
resex := [seq( eval(f,z=fsolvex[i]), i=1..14 )]:
max( map(abs, resex));
Jupyter:-SetOutputRendererByType( algebraic, default ): # ignore this
pfsolvex := plots[complexplot](convert(fsolvex,list), style = point, colour = [red], scaling = constrained, symbol = solidbox):
plamdaex := plots[complexplot](convert(lambdaex,list), style = point, colour = [red], scaling = constrained):
plots[display](pfsolvex,plamdaex);
Some of those roots are vastly different, even though both computations work from the monomial basis.
pfsolveorig := plots[complexplot]([rts], style = point, colour = [blue], scaling = constrained, symbol = solidbox):
plamdaHermite := plots[complexplot](finiteroots, style = point, colour = [blue], scaling = constrained):
plots[display](pfsolveorig,plamdaHermite);
plots[display](pfsolvex,plamdaex,pfsolveorig,plamdaHermite);
We see that the roots computed via the Frobenius companion matrix are in fact entirely different from the correct roots (and different again from the results of fsolve on the expanded polynomial, though less different), while the Hermite interpolational companion got things beautifully (remember the very small componentwise backward error). Yet the condition number of the expanded form wasn't that bad. What happened?
A) There are some nonlinear interactions between the parameters when they are multiplied out; the backward errors are not the same at all.
B) Maple is using the NAG library QZ algorithm and not the right algorithm. What's the right algorithm? Well, here's one of the papers (we heard Leonardo Robol talk yesterday about more):
The backward error bounds for simple QZ applied to a companion pencil contain a factor $\| p\|$, I believe. Say, $10^4$, then? I didn't scale the companion pencil, which makes it worse. Conditioning still plays a role, though, another factor of $10^4$ or so. That explains some of what we saw.
Why are Lagrange and Hermite interpolational bases better? Besides the backward error results of 2015, the condition numbers are also good. Remember, Bernstein bases are optimal, but, sometimes, Lagrange bases are better (2004)
(the 2015 paper was published in a good place, but the 2004 paper was not)
Rediscovered independently and published in 2019 in the Journal of Approximation Theory by Carnicer, Khiar, and Peña. A much better place! A better paper, too, he says somewhat grudgingly.
But in this case, we really want the structured condition number anyway.
Jupyter:-SetOutputRendererByType( algebraic, "text/latex" ): # ignore this
k := 'k':
Digits := 40;
polyk := discrim(convert(f, rational), z):
badk := [fsolve(polyk, k)]:
possibles := select(t -> abs(t) < 0.00010, badk):
evalf[4]( min( possibles ) );
We have found that for $k=1.386\times 10^{-8}$, less than half a percent different from Wilkinson's value for $k$, the polynomial has a multiple root! So the structured condition number of this polynomial isn't great.
To see what it really is, we have to use some calculus. But before we do that, let's try a Lagrange interpolational basis. After all, they are "sometimes better than optimal!"
Digits := 15;
eval(f, k=0 );
Digits := 15
Those are just quadratics. We can solve them one at a time. In fact, fsolve will do this because it will detect the factored form. Since $k$ is small in the original filter, we expect the zeros to be close to these zeros, and in that case the Lagrange basis condition number really is optimal (it's just $1$ if the interpolation points include the zeros!)
kzeroroots := [fsolve( eval(f, k=0 ),z,complex )]:
But we need one more node. Let's pick $-1$.
lagrangenodes := [op(kzeroroots), -1]:
k := 1.380e-8:
lagrangerho := [seq([eval(f, z = lagrangenodes[i])], i = 1 .. 15)]:
C0L, C1L := CMP(lagrangerho, lagrangenodes, t);
That's a pure arrowhead shape (no Jordan blocks).
We could make them better balanced by insisting that the first row have norm 1 and the first column have norm 1. This is free here and can be done independently: we can multiply a polynomial's values by any scalar and not change the roots, and we can multiply the barycentric weights by any scalar and not change the roots (by the 2nd barycentric form).
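Here is a minimal sketch of that balancing (my addition; the computation below keeps the unbalanced pencil). Because the first row and first column of $C_{1L}$ are zero, scaling the first row and first column of $C_{0L}$ is a diagonal equivalence of the pencil and does not change the eigenvalues.
# Balance the Lagrange companion pencil (my addition, a sketch).
C0Lbal := copy( C0L ):
C0Lbal[1, 2 .. -1] := C0Lbal[1, 2 .. -1]/Norm( C0Lbal[1, 2 .. -1], 2 ):
C0Lbal[2 .. -1, 1] := C0Lbal[2 .. -1, 1]/Norm( C0Lbal[2 .. -1, 1], 2 ):
# (C0Lbal, C1L) has the same eigenvalues as (C0L, C1L).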
lagrangecompanionlambda := Eigenvalues(C0L, C1L):
flagroots := [seq(lagrangecompanionlambda[i], i=3..16)]:
Digits := 2*Digits;
constrs := [seq(eval(fpdf, z = flagroots[2*i - 1]), i = 1 .. 7)]:
constrs := map(t -> evalc(Re(t)), constrs):
constrs := eval(constrs, e = 1):
constrs := map(t -> t = 0, constrs):
obj := add(dA[i]^2, i = 1 .. 7) + add(dB[i]^2, i = 1 .. 7)
+ add(dC[i]^2, i = 1 .. 6) + dk^2;
errorbackward := Optimization:-Minimize(obj, constrs):
max( map(abs, eval(dvarlist, errorbackward[2]) ) );
So the Lagrange basis companion gets a structured backward error at the machine precision level. (To compute the backward error we had to work in quadruple precision, but the result that achieves that backward error was computed entirely in double precision.)
So we see that the Lagrange basis companion pencil, as expected, computed more accurate roots, from this backward error point of view. We found the exact roots of a polynomial with different $A$, $B$, $C$, and $k$, where the differences were no more than $3.3\times 10^{-15}$.
So, what will the roots do if we change these? That's a multivariate calculus question, and we have already computed the gradient vectors. We can look at the magnitude of these.
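Concretely (my addition, just spelling out that calculus): if $z_*$ is a simple root and $p$ is any one of the parameters, implicit differentiation of $f(z_*(p);p)=0$ gives \begin{equation} \frac{\partial z_*}{\partial p} \;=\; -\left.\frac{\partial f/\partial p}{\partial f/\partial z}\right|_{z=z_*}, \end{equation} so the quantities computed below, a norm of the parameter-gradient of $f$ at the root divided by $|f'(z_*)|$, measure the first-order sensitivity of each root to perturbations of the parameters.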
A := Vector(7, [2.008402247, 1.974225110, 1.872661356, 1.714140938, 1.583160527, 1.512571776, 1.485030592]):
B := Vector(7, [1.008426206, 0.9749050168, 0.8791058345, 0.7375810928, 0.6279419845, 0.5722302977, 0.5513324340]):
C := Vector(6, [0, 0.7015884551, 0.6711668301, 0.5892018711, 1.084755941, 1.032359024]):
k := 0.1380e-7:
[seq(
[evalf[3](abs(eval(diff(f, z), z = flagroots[i]))^(-1)*Norm(eval(G, z = flagroots[i])))]
, i=1..14 )];
[[20.1], [20.1], [321.], [321.], [.731e4], [.731e4], [.411e5], [.411e5], [.514e5], [.514e5], [.572e4], [.572e4], [.402e4], [.402e4]]
The largest of those ratios is about $5.1\times 10^4$. That's our structured condition number.
Roughly, then, changes in the parameters will be amplified by at worst 1.0e5 in the location of the roots. That's quite ill-conditioned. For the Hermite interpolational companion we had backward error about 1.0e-12, so we will have about 7 figures accurate in the roots (at least). For the Lagrange interpolational companion we had backward error about 1.0e-15, so about 10 figures accurate in the roots (at least).
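As a hedged check of that estimate (my addition), we can compare the Hermite-pencil roots with the 40-digit fsolve roots computed at the start; the worst relative discrepancy should be consistent with the product of the backward error and the structured condition number.
# Worst relative discrepancy between the pencil roots and the fsolve roots (my addition).
max( seq( min( seq( abs(finiteroots[i] - rts[j]), j = 1 .. nops([rts]) ) )/abs(finiteroots[i]), i = 1 .. nops(finiteroots) ) );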
I demonstrated the Hermite interpolational companion pencil on a historically interesting example, which was not presented in any standard basis, and showed that the pencil gave roots with excellent componentwise backward error in a specialized sense appropriate to the example.
I demonstrated that the monomial basis did a quite bad job in comparison (I computed a forward error but failed to find a good estimate of the componentwise backward error, because my linearized method failed).
I demonstrated that a Lagrange interpolational companion pencil was even better.
I showed that the original polynomial was moderately ill-conditioned in a structured sense.
Thank you for listening.
If $f(z;a,b,c,k) = P_0(z;a,b) + kP_1(z,c)$ and $k$ is small then one may wish to expand $z$ in a series in $k$. The zeroth term is of course $z=z_0$ where $z_0$ is any root of $P_0$. Since here \begin{equation} P_0 = \prod_{i=1}^7 (z^2 + A_i z + B_i) \end{equation} we must have $z_0$ being one of $(-A_i \pm \sqrt{A_i^2 - 4B_i})/2$ for some $i$ by the quadratic formula. So, in principle, we should be able to compute as many terms as we like in each expansion.
Let's just do one as an example.
k := 'k':
P0 := eval(f,k=0):
P1 := diff(f,k):
f - P0-k*P1;
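The next two commands set up the Davidenko (implicit differentiation) equation: differentiating $P_0(z(k)) + k\,P_1(z(k)) = 0$ with respect to $k$ and isolating $dz/dk$ gives \begin{equation} \frac{dz}{dk} = -\,\frac{P_1(z)}{P_0'(z) + k\,P_1'(z)} . \end{equation}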
Davidenko := isolate( diff(P0,z)*dzdk + P1 + k*diff(P1,z)*dzdk, dzdk ):
Davidenko := diff(z(k),k) = eval(rhs(Davidenko),z=z(k)):
Digits := 15:
rts0 := Vector( [fsolve( P0, z, complex )] );
Jupyter:-SetOutputRendererByType( algebraic, "text/latex" ): # ignore this
Order := 5;
sol1 := eval(z(k), evalf( dsolve( {Davidenko, z(0)=rts0[1]}, z(k), series ) ) );
sol5 := eval( convert(sol1,polynom), k=1.38e-8 );
From a previous run with Order=3 we have the following result:
sol3 := -1.00377939005873 - 0.001461234650*I;
(sol3-sol5)/sol5;
res5 := eval(f,[z=sol5,k=1.38e-8]);
res3 := eval(f, [z=sol3,k=1.38e-8]);
So, the perturbation series appears to be working, but only just. This is of course because of the nearby singularity at $k\approx 1.386\times 10^{-8}$: our evaluation point $k = 1.380\times 10^{-8}$ is less than half a percent away from it, so the terms of the series decay very slowly.