MATRICES AND MATRIX OPERATIONS: Unit 22
Dr. Wlodzislaw Kostecki
The Papua New Guinea University of Technology (PNGUT)
Department of Electrical and Communication Engineering
Lae, Morobe Province
Papua New Guinea
Copyright © 2000 by Wlodzislaw Kostecki
All rights reserved
-------------------------------------------------------------------
(22) Similarity of matrices
OBJECTIVES :
To state the condition necessary for square matrices to be similar and express it in the form of two equivalent relations.
To provide examples of similar matrices.
To introduce the function issimilar for testing matrices for similarity.
To investigate properties of certain operations involving similar matrices.
To define the modal matrix.
To state the condition necessary for the eigenvectors of a matrix to be linearly independent .
To propose a universal method of extracting eigenvectors and associating them with distinct eigenvalues of a matrix in a unique way.
To introduce the term 'unique' (enclosed in single quotes) for use in connection with a modal matrix whose columns are eigenvectors uniquely associated with distinct eigenvalues of a given matrix.
To suggest an application of the 'unique' modal matrix to computing functions of matrices and stress that permutations of elements in modal-matrix columns do not affect such computations.
To present several variants of a 'unique' modal matrix associated with a given matrix having distinct eigenvalues.
To define the Jordan form of a square matrix.
To emphasise that the Jordan form of a square matrix must be unique for computation of matrix functions.
To point out that the function jordan does not return a unique matrix and provide alternative methods of constructing the unique Jordan form of a square matrix.
To provide equivalent expressions of a statement that every square matrix is similar to the associated Jordan form of the matrix and vice versa, and give supporting examples.
To discuss application of matrix similarity for computing functions of matrices.
To provide rules for computation of functions of diagonal and non-diagonal matrices using matrix similarity.
To provide alternative methods of computation of functions of diagonal matrices.
> restart : interface(warnlevel=0) : with(linalg, augment, charpoly, coldim, diag, eigenvals, eigenvects, inverse, issimilar, jordan, rowdim, trace) :
A. Similar matrices
Two square matrices of the same order, [ A ] and [ B ], are similar if and only if there exists an invertible matrix [ S ] such that
[ A ] = Inv [ S ] [ B ] [ S ]
If [ A ] is similar to [ B ], then [ B ] is also similar to [ A ], i.e.
[ B ] = [ S ] [ A ] Inv [ S ]
A relationship equivalent to either of the above is
[ S ] [ A ] = [ B ] [ S ]
Exemplarily, consider
(
Χ
)
matrices [
A
] and [
B
] given as
> A := matrix(3, 3, [6, 1, 0, 0, 6, 1, 0, 0, 6]) : B := matrix(3, 3, [6, 3, 0, 0, 6, 3, 0, 0, 6]) :
> A = matrix(A) ; B = matrix(B) ;
and find out if matrices [ A ] and [ B ] are mutually similar.
Let an invertible matrix [ S ] be given in the form
> S := matrix(3, 3, [9, 0, 0, 0, 3, 0, 0, 0, 1]) : S = matrix(S) ;
(1) The product Inv [ S ] [ B ] [ S ] yields the matrix
> `inv(S) B S` := evalm(inverse(S) &* B &* S) : Inv(S)*B*S = matrix(`inv(S) B S`) ;
which is equal to matrix [ A ]. Therefore, matrix [ A ] is found to be similar to matrix [ B ].
This operation may be presented in "like-in-a-book" form, viz.
> matrix(A) = inverse(S) * matrix(B) * matrix(S) ;
(2) The product [ S ] [ A ] Inv [ S ] yields the matrix
> `S A inv(S)` := evalm(S &* A &* inverse(S)) : S*A*Inv(S) = matrix(`S A inv(S)`) ;
which is equal to matrix [ B ]. Therefore, matrix [ B ] is similar to matrix [ A ].
This operation may be presented in "like-in-a-book" form, viz.
> matrix(B) = matrix(S) * matrix(A) * inverse(S) ;
(3a) The matrix product [ S ] [ A ] yields the matrix
> `SA` := evalm(S &* A) : S*A = matrix(`SA`) ;
(3b) The matrix product [ B ] [ S ] yields the matrix
> `BS` := evalm(B &* S) : `B S` = matrix(`BS`) ;
Both product matrices of (3a) and (3b) are equal, which verifies that matrices [ A ] and [ B ] are similar. This verification may be done directly using the issimilar function, viz.
> issimilar(A, B) ;
The
Boolean
value
returned by the
issimilar
function verifies the fact.
* * *
N.B. Similar matrices have the same characteristic equation and, therefore, the same eigenvalues and the same trace.
Exemplarily, consider the similar matrices [ A ] and [ B ] defined above.
(a) The characteristic equation, eigenvalues, and trace of matrix [ A ] are
> l := lambda :
> char_eq(A) := charpoly(A, l) = 0 : charroots(A) := eigenvals(A) : `trace(A)` := trace(A) :
> char_eq(A) ; char_roots(A) = charroots(A) ; Trace(A) = `trace(A)` ;
(b) The characteristic equation, eigenvalues, and trace of matrix [ B ] are
> char_eq(B) := charpoly(B, l) = 0 : charroots(B) := eigenvals(B) : `trace(B)` := trace(B) :
> char_eq(B) ; char_roots(B) = charroots(B) ; Trace(B) = `trace(B)` ;
It is easily observed that the results of (a) and (b) are identical.
* * *
N.B.
If [
X
] is an eigenvector of a matrix [
A
] associated with eigenvalue
and [
A
] is similar to a matrix [
B
], then [
Y
]
=
[
S
] [
X
] is an eigenvector for [
B
] corresponding to the same eigenvalue.
Exemplarily, consider the similar matrices [ A ] and [ B ] defined earlier and the same invertible matrix [ S ] with which the equality [ A ] = Inv [ S ] [ B ] [ S ] holds.
(a) The eigenvalues of [ A ] are
> charroots(A) := eigenvals(A) : char_roots(A) = charroots(A) ;
from which it can be seen that matrix [ A ] has one root of multiplicity three.
Denoting this root as
and extracting it from from the solution sequence give
> root(A) := charroots(A)[1] : l[A] = root(A) ;
(b) The sequence of lists containing eigenvalues and eigenvectors of [ A ] is
> `roots&vectors(A)` := eigenvects(A) : roots_and_vectors(A) = `roots&vectors(A)` ;
(c) Extracting the set of the eigenvectors of [ A ] yields
> vectors(A) := `roots&vectors(A)`[3] : char_vectors(A) = vectors(A) ;
(d) Extracting the eigenvector from the above set gives
> `vector(A)` := vectors(A)[1] : char_vector[l](A) = op(`vector(A)`) ;
or, as a column matrix (vector),
> X[l(A)] := convert(`vector(A)`, matrix) : X[l[A]] = matrix(X[l(A)]) ;
(e) The eigenvalues of [ B ] are
> charroots(B) := eigenvals(B) : char_roots(B) = charroots(B) ;
from which it can be seen that matrix [ B ] also has one root of multiplicity three.
Denoting this root as
and extracting it from from the solution sequence gives
> root(B) := charroots(B)[1] : l[B] = root(B) ;
(f) The sequence of lists containing eigenvalues and eigenvectors of [ B ] is
> `roots&vectors(B)` := eigenvects(B) : roots_and_vectors(B) = `roots&vectors(B)` ;
(g) Extracting the set of the eigenvectors of [ B ] yields
> vectors(B) := `roots&vectors(B)`[3] : char_vectors(B) = vectors(B) ;
(h) Extracting the eigenvector from the above set gives
> `vector(B)` := vectors(B)[1] : char_vector[l](B) = op(`vector(B)`) ;
or, as a column matrix (vector),
> X[l(B)] := convert(`vector(B)`, matrix) : X[l[B]] = matrix(X[l(B)]) ;
(i) The product matrix [ Y ] = [ S ] [ X ] is the following column matrix (vector):
> Y := evalm(S &* X[l(A)]) : Y = matrix(Y) ;
(j) From the inspection of
and [
Y
], it can be easily noticed that
> Y = 9*X[l[B]] ;
or
> matrix(Y) = 9*matrix(X[l(B)]) ;
(k) According to the statement found in Unit (21), the product of the eigenvector of [
B
], i.e.
,
and an arbitrary scalar, i.e.
in this particular case, is also the eigenvector of [
B
]. Therefore, [
Y
] is the eigenvector of [
B
]. This implies that [
Y
] is a solution to the equation (4) of Unit (21). Using the designations applicable to this particular case, equation (4) takes the form
(
[
B
]
[
U
]
)
[
Y
]
=
[
0
]
(l) With the unit matrix [ U ] appropriately sized for this case, i.e.
> U := diag(1, 1, 1) : U = matrix(U) ;
and respective substitutions, equation (4) assumes the form
> '(B - lambda[B]*U)' * Y = (matrix(B) - root(B) * matrix(U)) * matrix(Y) ;
Evaluation of this equation yields the
column matrix (vector)
> (matrix(B) - root(B) * matrix(U)) * matrix(Y) = evalm((B - root(B) * U) &* Y) ;
which verifies that [ Y ] = [ S ] [ X ] is an eigenvector for [ B ] corresponding to the same eigenvalue.
* * *
B. Modal matrix
If the eigenvalues of a square matrix [ A ] are distinct , whether real or complex, then a so-called modal matrix [ M ] associated with [ A ] is defined as a matrix of the same order as [ A ], having as its columns all the (linearly independent) eigenvectors of [ A ] corresponding to the eigenvalues of [ A ].
Eigenvectors are linearly independent if they correspond to distinct eigenvalues.
Exemplarily, consider a
(
Χ
)
matrix [
A
] given as
> A := matrix(4, 4, [1, -2, 3, -4, -4, 1, -2, 3, 3, -4, 1, -2, -2, 3, -4, 1]) : A = matrix(A) ;
and obtain the modal matrix [ M ] associated with [ A ].
Step 1 . Find the eigenvalues of [ A ]:
> charroots(A) := eigenvals(A) : char_roots(A) = charroots(A) ;
Extracting the four distinct eigenvalues
yields
> No_roots(A) := nops([charroots(A)]) :
> for i to No_roots(A) do ch_r[i](A) := charroots(A)[i] : print(l[i] = ch_r[i](A)) : od :
Step 2 . Find the sequence of lists containing eigenvalues and eigenvectors of [ A ]:
> `roots&vectors(A)` := eigenvects(A) : roots_and_vectors(A) = `roots&vectors(A)` ;
Step 3
. Extract the eigenvectors
corresponding to eigenvalues
of [
A
]:
As mentioned in Unit (21), the eigenvects function returns an unordered sequence of the component lists. To extract the eigenvectors so that each of them corresponds to a proper eigenvalue, the following selection method is developed. The method works for matrices of any order, whose eigenvalues are distinct.
> for i to No_roots(A) do e[i] := charroots(A)[i] : List[i] := `roots&vectors(A)`[i] : od :
> for j to No_roots(A) do for i to No_roots(A) do if List[i][1] = e[j] then Lst[j] := `roots&vectors(A)`[i] : fi : od : od :
> for i to No_roots(A) do ch_v[i](A) := op(Lst[i][3]) : print(v[lambda[i]] = eval(ch_v[i](A)), ` --> ` * lambda[i] = ch_r[i](A)) : od :
Conversion of the eigenvectors
into column matrices
yields
> for i to No_roots(A) do X[l[i]] := convert(ch_v[i](A), matrix) : print(X[l[i]] = matrix(X[l[i]])) : od :
Step 4 . Construct the 'unique' modal matrix [ M ] associated with [ A ]:
> i := 'i' : `seq(X[l[i]])` := seq(X[l[i]], i=1..coldim(A)) : M := augment(`seq(X[l[i]])`) : M=matrix(M) ;
[ Refer to Unit (2) H where the augment function is used to create a rectangular matrix from given column matrices. Also recall that its synonim is the concat function. ]
* * *
N.B. In general, any permutation of the columns of [ M ] will produce another equally acceptable modal matrix. However, if [ M ] is to be used in computation of matrix functions (see Section D of this Unit), the ordering of columns in [ M ] must correspond to the unique sequence of the eigenvalues of [ A ] as returned by the eigenvals function. This means that [ M ] must be unique in this sense . However, such a uniqueness of [ M ] is not "absolute" since the eigenvectors are not unique for a given matrix.
Consequently, the modal matrix thus created is referred to as 'unique' (enclosed in single quotes) in this and subsequent Units.
In this particular case, it has been found that the eigenvectors corresponding to eigenvalues
,
,
and
can be returned with their components at different locations. Some of the obtained variants of the modal matrix are presented below.
> M1 = matrix(4, 4, [-1, 1, I, -I, 1, 1, -1, -1, -1, 1, -I, I, 1, 1, 1, 1]) ; M2 = matrix(4, 4, [1, 1, -I, I, -1, 1, 1, 1, 1, 1, I, -I, -1, 1, -1, -1]) ; M3 = matrix(4, 4, [-1, 1, 1, 1, 1, 1, I, -I, -1, 1, -1, -1, 1, 1, -I, I]) ; M4 = matrix(4, 4, [-1, 1, -1, -1, 1, 1, -I, I, -1, 1, 1, 1, 1, 1, I, -I]) ;
However, permutations of the elements in columns 1, 3, and 4 do not affect computation of matrix functions.
* * *
C. The Jordan form of a matrix
The
Jordan
form [
J
] of a square matrix [
A
] is defined as a matrix of the same order as [
A
] whose diagonal elements are the eigenvalues of [
A
] and whose other elements are all
.
Exemplarily, consider a
(
Χ
)
matrix [
A
] given as
> A := matrix(3, 3, [3, 2, -1, 2, 3, -1, -1, -1, 4]) : A = matrix(A) ;
and obtain its Jordan form.
Step 1 . Find the eigenvalues of [ A ]:
> charroots(A) := eigenvals(A) : char_roots(A) = charroots(A) ;
Extracting the distinct eigenvalues
yields
> No_roots(A) := nops([charroots(A)]) :
> for i to No_roots(A) do ch_r[i](A) := charroots(A)[i] : print(l[i] = ch_r[i](A)) : od :
Step 2 . Construct the Jordan form [ J ] of [ A ]:
This may be done using either of the following methods.
Method 1 . Using the diag function with manually entered diagonal elements:
> J := diag(ch_r[1](A), ch_r[2](A), ch_r[3](A)) : J = matrix(J) ;
Method 2 . Using the double for -loop construct and the diagonal indexing function:
> J_A := array(diagonal, 1..rowdim(J), 1..coldim(J)) :
> for i to rowdim(A) do for j to coldim(A) do if j = i then J_A[i,j] := ch_r[i](A) fi : od : od :
> J := matrix(J_A) : J = matrix(J) ;
* * *
N.B. In Maple , the jordan function may be used to compute directly the Jordan form [ J ] of a matrix. For the above matrix [ A ], the Jordan form is
> J := jordan(A) : J = matrix(J) ;
WARNING
:
The order of matrix eigenvalues returned by the
eigenvals
function is always a unique, reproducible sequence of the eigenvalues of [
A
] that may, for instance, be designated as
,
,
, ...,
.
The jordan function returns the Jordan form [ J ] of [ A ] that is unique up to permutations of the diagonal entries. In other words, the locations of the diagonal elements in [ J ] are not unique. This implies that the eigenvalues of [ A ] may, occasionally, appear at different locations of the diagonal of [ J ], which, in general, is acceptable.
Where it is essential that locations of the diagonal elements of [ J ] correspond to the unique ordering of eigenvalues returned by the eigenvals function, the Jordan form [ J ] of [ A ] should be constructed using Method 1 or 2. This is an imperative in computation of matrix functions using similarity of matrices refer to Units (23) to (25).
* * *
N.B. Every square matrix [ A ] is similar to the associated Jordan form [ J ] of the matrix. This implies that if [ M ] is a modal matrix for [ A ], then
[ A ] = [ M ] [ J ] Inv [ M ]
or, equivalently,
[ A ] [ M ] = [ M ] [ J ]
Exemplarily, consider the same matrix [ A ] as before, i.e.
> A = matrix(A) ;
(1) The 'unique' modal matrix [ M ] for [ A ] is:
> `roots&vectors(A)` := eigenvects(A) :
> for i to No_roots(A) do e[i] := charroots(A)[i] : List[i] := `roots&vectors(A)`[i] : od :
> for j to No_roots(A) do for i to No_roots(A) do if List[i][1] = e[j] then Lst[j] := `roots&vectors(A)`[i] : fi : od : od :
> for i to No_roots(A) do ch_v[i](A) := op(Lst[i][3]) : X[l[i]] := convert(ch_v[i](A), matrix) : od :
> i := 'i' : `seq(X[l[i]])` := seq(X[l[i]], i=1..coldim(A)) : M := augment(`seq(X[l[i]])`) : M=matrix(M) ;
(2) With the unique Jordan form [ J ] of [ A ] computed earlier (see Method 2 above), i.e.
> J := matrix(J_A) : J = matrix(J) ;
the matrix product [ M ] [ J ] Inv [ M ] yields the matrix
> `M J Inv(M)` := evalm(M &* J &* M^(-1)) : M*J*Inv(M) = matrix(`M J Inv(M)`) ;
which is equal to matrix [ A ].
(3a) The matrix product [ A ] [ M ] yields the matrix
> `AM` := evalm(A &* M) : A*M = matrix(`AM`) ;
(3b) The matrix product [ M ] [ J ] yields the matrix
> `MJ` := evalm(M &* J) : `M J` = matrix(`MJ`) ;
Both product matrices of (3a) and (3b) are equal, which verifies that matrix [ A ] is similar to [ J ]. This verification may be done directly using the issimilar function, viz.
> issimilar(A, J) ;
The
Boolean
value
returned by the
issimilar
function verifies the fact.
* * *
N.B. The Jordan form [ J ] of a square matrix [ A ] is similar to [ A ]. This implies that if [ M ] is a modal matrix for [ A ], then
[ J ] = Inv [ M ] [ A ] [ M ]
Exemplarily, using the same matrix [ A ] as before together with its 'unique' modal matrix [ M ] and the unique Jordan form [ J ] of [ A ], evaluation of the matrix product Inv [ M ] [ A ] [ M ] yields the matrix
> `Inv(M) A M` := evalm(M^(-1) &* A &* M) : Inv(M)*A*M = matrix(`Inv(M) A M`) ;
which is equal to the unique Jordan form [ J ] of [ A ]. This verifies the above relationship. This verification may be done directly using the issimilar function, viz.
> issimilar(J, A) ;
The
Boolean
value
returned by the
issimilar
function verifies the fact.
* * *
D. Application of matrix similarity
The foregoing considerations are useful in computing functions of square matrices. For this purpose, it is convenient to divide matrices into two classes as follows.
(1) Diagonal matrices
If [ A ] is a diagonal matrix, then a function of the matrix is also a diagonal matrix of the same order whose elements are the function of the elements of [ A ].
For example, if [
A
] is given as the following
(
Χ
)
scalar matrix
> l := 'l' : l := lambda :
> A := diag(l[1], l[2], l[3]) : A = matrix(A) ;
then a function f of the matrix may be obtained using either of the following alternative methods.
Method 1 . Using the array function together with the diagonal indexing function:
> f(A) := array(diagonal, 1..3, 1..3, [(1,1)=f(A[1,1]), (2,2)=f(A[2,2]), (3,3)=f(A[3,3])]) : 'f(A)' = f(A) ;
N.B. The diag function cannot be used to construct a diagonal matrix whose entries are functions of scalars because this function works only if the diagonal elements are scalars in numerical or symbolic form. That is why the array function together with the diagonal indexing function must be used if this method is chosen.
If large diagonal matrices are involved, the method proposed below will be more efficient.
Method 2 . Using the double for -loop construct and the diagonal indexing function:
> a := array(diagonal, 1..rowdim(A), 1..coldim(A)) :
> for i to rowdim(A) do for j to coldim(A) do if j = i then a[i,j] := f(A[i,j]) fi : od : od :
> f(A) := matrix(a) : 'f(A)' = f(A) ;
N.B. The construction of this method is such that it has no effect on [ A ] in view of a possible need for using the matrix in further computations. For instance, this is the case where verification of f( [ A ] ) using Maclaurin s series is performed refer to Units (23) to (25).
(2) Non-diagonal matrices
If a non-diagonal matrix [ A ] is similar to the associated Jordan form of the matrix, i.e. if
[ A ] = [ M ] [ J ] Inv [ M ]
then a function of [ A ] is given by
f( [ A ] ) = [ M ] f( [ J ] ) Inv [ M ]
where [ M ] is the modal matrix associated with [ A ].
The formula requires that two necessary conditions are satisfied, viz.
(a) The ordering of eigenvector columns in [ M ] must correspond to the unique sequence of the eigenvalues of [ A ] as returned by the eigenvals function.
(b) The locations of the diagonal elements of [ J ] must correspond to the unique ordering of eigenvalues returned by the eigenvals function.
Both conditions may be summarised as follows:
"The diagonal elements of [ J ] are the eigenvalues associated with the eigenvectors of [ A ], in the same order as their corresponding eigenvectors in [ M ]."
* * *
Proceed to Unit (23) for " Functions of constant matrices ".
-------------------------------------------------------------------