# FreeBSD Manual Pages

PZSTEIN(l)			       )			    PZSTEIN(l)

NAME
PZSTEIN - compute the eigenvectors of a symmetric tridiagonal matrix in
parallel, using inverse iteration

SYNOPSIS
SUBROUTINE PZSTEIN( N, D, E, M, W, IBLOCK, ISPLIT, ORFAC,  Z,  IZ,  JZ,
DESCZ,  WORK, LWORK,	IWORK, LIWORK, IFAIL, ICLUSTR,
GAP,	INFO )

INTEGER	   INFO, IZ, JZ, LIWORK, LWORK,	M, N

DOUBLE	   PRECISION ORFAC

INTEGER	   DESCZ( * ), IBLOCK( * ), ICLUSTR( * ), IFAIL( *  ),
ISPLIT( * ),	IWORK( * )

DOUBLE	   PRECISION D(	* ), E(	* ), GAP( * ), W( * ), WORK( *
)

COMPLEX*16	   Z( *	)

PURPOSE
PZSTEIN computes	the eigenvectors of a symmetric	tridiagonal matrix  in
parallel, using inverse iteration. The eigenvectors found correspond to
user specified eigenvalues. PZSTEIN does	not orthogonalize vectors that
are  on	different  processes.  The extent of orthogonalization is con-
trolled by the input parameter LWORK.  Eigenvectors that	are to be  or-
thogonalized  are  computed by the same process.	PZSTEIN	decides	on the
allocation of work among	the processes and then calls DSTEIN2 (modified
LAPACK  routine)	 on each individual process. If	insufficient workspace
is allocated, the expected orthogonalization may	not be done.

Note : If the eigenvectors obtained are not orthogonal, increase
LWORK and	run the	code again.

Notes
=====

Each global data	object is described by an associated description  vec-
tor.  This vector stores	the information	required to establish the map-
ping between an object element and its corresponding process and	memory
location.

Let  A  be  a generic term for any 2D block cyclicly distributed	array.
Such a global array has an associated description vector	DESCA.	In the
following  comments,  the  character _ should be	read as	"of the	global
array".

NOTATION	       STORED IN      EXPLANATION
---------------	--------------	--------------------------------------
DTYPE_A(global) DESCA( DTYPE_ )The descriptor type.  In this case,
DTYPE_A =	1.
CTXT_A (global) DESCA( CTXT_ ) The BLACS	context	handle,	indicating
the BLACS	process	grid A is distribu-
ted over.	The context itself is glo-
bal, but the handle (the integer
value) may vary.
M_A    (global) DESCA( M_ )    The number of rows in the	global
array A.
N_A    (global) DESCA( N_ )    The number of columns in the global
array A.
MB_A   (global) DESCA( MB_ )   The blocking factor used to distribute
the rows of the array.
NB_A   (global) DESCA( NB_ )   The blocking factor used to distribute
the columns of the array.
RSRC_A (global) DESCA( RSRC_ ) The process row over which the first
row  of  the  array  A  is  distributed.
CSRC_A (global) DESCA( CSRC_ ) The process column over which the
first column of the array	A is
distributed.
LLD_A  (local)  DESCA( LLD_ )  The leading dimension of the local
array.  LLD_A >= MAX(1,LOCr(M_A)).

Let K be	the number of rows or columns of a distributed matrix, and as-
sume that its process grid has dimension	r x c.
LOCr(  K	) denotes the number of	elements of K that a process would re-
ceive if	K were distributed over	the r processes	of its process column.
Similarly, LOCc(	K ) denotes the	number of elements of K	that a process
would receive if	K were distributed over	the c processes	of its process
row.
The values of LOCr() and	LOCc() may be determined via  a	 call  to  the
ScaLAPACK tool function,	NUMROC:
LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ),
LOCc(  N	) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ).  An upper
bound for these quantities may be computed by:
LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A
LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A

ARGUMENTS
P = NPROW * NPCOL is the	total number of	processes

N       (global input) INTEGER
The order of the	tridiagonal matrix T.  N >= 0.

D       (global input) DOUBLE PRECISION array, dimension	(N)
The n diagonal elements of the tridiagonal matrix T.

E       (global input) DOUBLE PRECISION array, dimension	(N-1)
The (n-1) off-diagonal elements of the tridiagonal matrix T.

M       (global input) INTEGER
The total number	of eigenvectors	to be found. 0 <= M <= N.

W       (global input/global output) DOUBLE PRECISION array, dim	(M)
On input, the first M elements of W contain all the eigenvalues
for  which  eigenvectors	 are  to  be computed. The eigenvalues
should be grouped by split-off block and	ordered	from  smallest
to  largest  within  the	block (The output array	W from PDSTEBZ
with ORDER='b' is expected here). This array should  be	repli-
cated  on  all processes.  On output, the first M elements con-
tain the	input eigenvalues in ascending order.

Note : To obtain	orthogonal vectors, it is best if  eigenvalues
are  computed to	highest	accuracy ( this	can be done by setting
ABSTOL to the underflow threshold = DLAMCH('U') ---  ABSTOL  is
an input	parameter to PDSTEBZ )

IBLOCK  (global input) INTEGER array, dimension (N)
The  submatrix indices associated with the corresponding	eigen-
values in W -- 1	for eigenvalues	belonging to the first	subma-
trix  from  the top, 2 for those	belonging to the second	subma-
trix, etc. (The output array IBLOCK from	 PDSTEBZ  is  expected
here).

ISPLIT  (global input) INTEGER array, dimension (N)
The  splitting  points,	at which T breaks up into submatrices.
The first submatrix consists of rows/columns  1	to  ISPLIT(1),
the second of rows/columns ISPLIT(1)+1 through ISPLIT(2), etc.,
and the NSPLIT-th consists of  rows/columns  ISPLIT(NSPLIT-1)+1
through	ISPLIT(NSPLIT)=N (The output array ISPLIT from PDSTEBZ
is expected here.)

ORFAC   (global input) DOUBLE PRECISION
ORFAC specifies which eigenvectors  should  be  orthogonalized.
Eigenvectors  that  correspond  to eigenvalues which are	within
ORFAC*||T|| of each other are to	be  orthogonalized.   However,
if  the	workspace  is insufficient (see	LWORK),	this tolerance
may be decreased	until all eigenvectors	to  be	orthogonalized
can  be	stored	in  one	process.  No orthogonalization will be
done if ORFAC equals zero.  A default value of 10^-3 is used if
ORFAC is	negative.  ORFAC should	be identical on	all processes.

Z       (local output) COMPLEX*16 array,
dimension  (DESCZ(DLEN_), N/npcol + NB) Z contains the computed
eigenvectors associated with  the  specified  eigenvalues.  Any
vector  which  fails  to	converge is set	to its current iterate
after MAXITS iterations ( See DSTEIN2 ).	 On output, Z is  dis-
tributed	across the P processes in block	cyclic format.

IZ      (global input) INTEGER
Z's global row index, which points to the beginning of the sub-
matrix which is to be operated on.

JZ      (global input) INTEGER
Z's global column index,	which points to	the beginning  of  the
submatrix which is to be	operated on.

DESCZ   (global and local input)	INTEGER	array of dimension DLEN_.
The array descriptor for	the distributed	matrix Z.

WORK    (local workspace/global output) DOUBLE PRECISION	array,
dimension  (  LWORK ) On	output,	WORK(1)	gives a	lower bound on
the workspace ( LWORK ) that guarantees the  user  desired  or-
thogonalization	(see  ORFAC).  Note that this may overestimate
the minimum workspace needed.

LWORK   (local input) integer
LWORK  controls the extent of orthogonalization	which  can  be
done. The number	of eigenvectors	for which storage is allocated
on each process is NVEC	=  floor((  LWORK-  max(5*N,NP00*MQ00)
)/N).   Eigenvectors  corresponding  to	eigenvalue clusters of
size NVEC - ceil(M/P) + 1 are guaranteed	to be orthogonal ( the
orthogonality  is similar to that obtained from ZSTEIN2).  Note
:  LWORK	  must	be  no	smaller	 than:	max(5*N,NP00*MQ00)   +
ceil(M/P)*N,  and  should have the same input value on all pro-
cesses.	It is the minimum value	of LWORK  input	 on  different
processes that is significant.

If LWORK	= -1, then LWORK is global input and a workspace query
is assumed; the routine only calculates the minimum and optimal
size  for  all work arrays. Each	of these values	is returned in
the first entry of the corresponding work array,	and  no	 error
message is issued by PXERBLA.

IWORK   (local workspace/global output) INTEGER array,
dimension  (  3*N+P+1 ) On return, IWORK(1) contains the	amount
of integer workspace required.  On return, the IWORK(2) through
IWORK(P+2)  indicate the	eigenvectors computed by each process.
Process I  computes  eigenvectors  indexed  IWORK(I+2)+1	 thru'
IWORK(I+3).

LIWORK  (local input) INTEGER
Size of array IWORK.  Must be >=	3*N + P	+ 1

If  LIWORK  =  -1,  then	LIWORK is global input and a workspace
query is	assumed; the routine only calculates the  minimum  and
optimal	size  for all work arrays. Each	of these values	is re-
turned in the first entry of the	corresponding work array,  and
no error	message	is issued by PXERBLA.

IFAIL   (global output) integer array, dimension	(M)
On normal exit, all elements of IFAIL are zero.	If one or more
eigenvectors fail to converge after MAXITS  iterations  (as  in
ZSTEIN),	 then  INFO > 0	is returned.  If mod(INFO,M+1)>0, then
for I=1 to mod(INFO,M+1), the eigenvector corresponding to  the
eigenvalue W(IFAIL(I)) failed to	converge ( W refers to the ar-
ray of eigenvalues on output ).

ICLUSTR (global output) integer	array,	dimension  (2*P)  This
output  array contains indices of eigenvectors corresponding to
a cluster of eigenvalues	that could not be  orthogonalized  due
to  insufficient	 workspace (see	LWORK, ORFAC and INFO).	Eigen-
vectors	corresponding  to  clusters  of	 eigenvalues   indexed
ICLUSTR(2*I-1)  to ICLUSTR(2*I),	I = 1 to INFO/(M+1), could not
be orthogonalized due to	lack of	workspace. Hence the eigenvec-
tors  corresponding  to	these  clusters	may not	be orthogonal.
ICLUSTR is a zero  terminated  array  ---  (  ICLUSTR(2*K).NE.0
.AND.  ICLUSTR(2*K+1).EQ.0  ) if	and only if K is the number of
clusters.

GAP     (global output) DOUBLE PRECISION	array, dimension (P)
This output array contains the gap  between  eigenvalues	 whose
eigenvectors  could  not	 be  orthogonalized. The INFO/M	output
values in this array correspond to the INFO/(M+1) clusters  in-
dicated	by the array ICLUSTR. As a result, the dot product be-
tween eigenvectors corresponding	to the I^th cluster may	be  as
high as ( O(n)*macheps )	/ GAP(I).

INFO    (global output) INTEGER
= 0:  successful	exit
<  0:   If the i-th argument is an array	and the	j-entry	had an
illegal value, then INFO	= -(i*100+j), if the i-th argument  is
a  scalar  and had an illegal value, then INFO =	-i.  < 0 :  if
INFO = -I, the I-th argument had	an illegal value
> 0 :  if mod(INFO,M+1) = I, then I eigenvectors	failed to con-
verge in	MAXITS iterations. Their indices are stored in the ar-
ray IFAIL.  if INFO/(M+1) = I, then eigenvectors	 corresponding
to I clusters of	eigenvalues could not be orthogonalized	due to
insufficient workspace. The indices of the clusters are	stored
in the array ICLUSTR.

ScaLAPACK version 1.7		13 August 2001			    PZSTEIN(l)

NAME | SYNOPSIS | PURPOSE | ARGUMENTS

Want to link to this manual page? Use this URL:
<https://www.freebsd.org/cgi/man.cgi?query=pzstein&manpath=FreeBSD+12.0-RELEASE+and+Ports>