Skip site navigation (1)Skip section navigation (2)

FreeBSD Manual Pages

  
 
  

home | help
PZSTEIN(l)			       )			    PZSTEIN(l)

NAME
       PZSTEIN - compute the eigenvectors of a symmetric tridiagonal matrix in
       parallel, using inverse iteration

SYNOPSIS
       SUBROUTINE PZSTEIN( N, D, E, M, W, IBLOCK, ISPLIT, ORFAC,  Z,  IZ,  JZ,
			   DESCZ,  WORK, LWORK,	IWORK, LIWORK, IFAIL, ICLUSTR,
			   GAP,	INFO )

	   INTEGER	   INFO, IZ, JZ, LIWORK, LWORK,	M, N

	   DOUBLE	   PRECISION ORFAC

	   INTEGER	   DESCZ( * ), IBLOCK( * ), ICLUSTR( * ), IFAIL( *  ),
			   ISPLIT( * ),	IWORK( * )

	   DOUBLE	   PRECISION D(	* ), E(	* ), GAP( * ), W( * ), WORK( *
			   )

	   COMPLEX*16	   Z( *	)

PURPOSE
       PZSTEIN computes	the eigenvectors of a symmetric	tridiagonal matrix  in
       parallel, using inverse iteration. The eigenvectors found correspond to
       user specified eigenvalues. PZSTEIN does	not orthogonalize vectors that
       are  on	different  processes.  The extent of orthogonalization is con-
       trolled by the input parameter LWORK.  Eigenvectors that	are to be  or-
       thogonalized  are  computed by the same process.	PZSTEIN	decides	on the
       allocation of work among	the processes and then calls DSTEIN2 (modified
       LAPACK  routine)	 on each individual process. If	insufficient workspace
       is allocated, the expected orthogonalization may	not be done.

       Note : If the eigenvectors obtained are not orthogonal, increase
	      LWORK and	run the	code again.

       Notes
       =====

       Each global data	object is described by an associated description  vec-
       tor.  This vector stores	the information	required to establish the map-
       ping between an object element and its corresponding process and	memory
       location.

       Let  A  be  a generic term for any 2D block cyclicly distributed	array.
       Such a global array has an associated description vector	DESCA.	In the
       following  comments,  the  character _ should be	read as	"of the	global
       array".

       NOTATION	       STORED IN      EXPLANATION
       ---------------	--------------	--------------------------------------
       DTYPE_A(global) DESCA( DTYPE_ )The descriptor type.  In this case,
				      DTYPE_A =	1.
       CTXT_A (global) DESCA( CTXT_ ) The BLACS	context	handle,	indicating
				      the BLACS	process	grid A is distribu-
				      ted over.	The context itself is glo-
				      bal, but the handle (the integer
				      value) may vary.
       M_A    (global) DESCA( M_ )    The number of rows in the	global
				      array A.
       N_A    (global) DESCA( N_ )    The number of columns in the global
				      array A.
       MB_A   (global) DESCA( MB_ )   The blocking factor used to distribute
				      the rows of the array.
       NB_A   (global) DESCA( NB_ )   The blocking factor used to distribute
				      the columns of the array.
       RSRC_A (global) DESCA( RSRC_ ) The process row over which the first
				      row  of  the  array  A  is  distributed.
       CSRC_A (global) DESCA( CSRC_ ) The process column over which the
				      first column of the array	A is
				      distributed.
       LLD_A  (local)  DESCA( LLD_ )  The leading dimension of the local
				      array.  LLD_A >= MAX(1,LOCr(M_A)).

       Let K be	the number of rows or columns of a distributed matrix, and as-
       sume that its process grid has dimension	r x c.
       LOCr(  K	) denotes the number of	elements of K that a process would re-
       ceive if	K were distributed over	the r processes	of its process column.
       Similarly, LOCc(	K ) denotes the	number of elements of K	that a process
       would receive if	K were distributed over	the c processes	of its process
       row.
       The values of LOCr() and	LOCc() may be determined via  a	 call  to  the
       ScaLAPACK tool function,	NUMROC:
	       LOCr( M ) = NUMROC( M, MB_A, MYROW, RSRC_A, NPROW ),
	       LOCc(  N	) = NUMROC( N, NB_A, MYCOL, CSRC_A, NPCOL ).  An upper
       bound for these quantities may be computed by:
	       LOCr( M ) <= ceil( ceil(M/MB_A)/NPROW )*MB_A
	       LOCc( N ) <= ceil( ceil(N/NB_A)/NPCOL )*NB_A

ARGUMENTS
       P = NPROW * NPCOL is the	total number of	processes

       N       (global input) INTEGER
	       The order of the	tridiagonal matrix T.  N >= 0.

       D       (global input) DOUBLE PRECISION array, dimension	(N)
	       The n diagonal elements of the tridiagonal matrix T.

       E       (global input) DOUBLE PRECISION array, dimension	(N-1)
	       The (n-1) off-diagonal elements of the tridiagonal matrix T.

       M       (global input) INTEGER
	       The total number	of eigenvectors	to be found. 0 <= M <= N.

       W       (global input/global output) DOUBLE PRECISION array, dim	(M)
	       On input, the first M elements of W contain all the eigenvalues
	       for  which  eigenvectors	 are  to  be computed. The eigenvalues
	       should be grouped by split-off block and	ordered	from  smallest
	       to  largest  within  the	block (The output array	W from PDSTEBZ
	       with ORDER='b' is expected here). This array should  be	repli-
	       cated  on  all processes.  On output, the first M elements con-
	       tain the	input eigenvalues in ascending order.

	       Note : To obtain	orthogonal vectors, it is best if  eigenvalues
	       are  computed to	highest	accuracy ( this	can be done by setting
	       ABSTOL to the underflow threshold = DLAMCH('U') ---  ABSTOL  is
	       an input	parameter to PDSTEBZ )

       IBLOCK  (global input) INTEGER array, dimension (N)
	       The  submatrix indices associated with the corresponding	eigen-
	       values in W -- 1	for eigenvalues	belonging to the first	subma-
	       trix  from  the top, 2 for those	belonging to the second	subma-
	       trix, etc. (The output array IBLOCK from	 PDSTEBZ  is  expected
	       here).

       ISPLIT  (global input) INTEGER array, dimension (N)
	       The  splitting  points,	at which T breaks up into submatrices.
	       The first submatrix consists of rows/columns  1	to  ISPLIT(1),
	       the second of rows/columns ISPLIT(1)+1 through ISPLIT(2), etc.,
	       and the NSPLIT-th consists of  rows/columns  ISPLIT(NSPLIT-1)+1
	       through	ISPLIT(NSPLIT)=N (The output array ISPLIT from PDSTEBZ
	       is expected here.)

       ORFAC   (global input) DOUBLE PRECISION
	       ORFAC specifies which eigenvectors  should  be  orthogonalized.
	       Eigenvectors  that  correspond  to eigenvalues which are	within
	       ORFAC*||T|| of each other are to	be  orthogonalized.   However,
	       if  the	workspace  is insufficient (see	LWORK),	this tolerance
	       may be decreased	until all eigenvectors	to  be	orthogonalized
	       can  be	stored	in  one	process.  No orthogonalization will be
	       done if ORFAC equals zero.  A default value of 10^-3 is used if
	       ORFAC is	negative.  ORFAC should	be identical on	all processes.

       Z       (local output) COMPLEX*16 array,
	       dimension  (DESCZ(DLEN_), N/npcol + NB) Z contains the computed
	       eigenvectors associated with  the  specified  eigenvalues.  Any
	       vector  which  fails  to	converge is set	to its current iterate
	       after MAXITS iterations ( See DSTEIN2 ).	 On output, Z is  dis-
	       tributed	across the P processes in block	cyclic format.

       IZ      (global input) INTEGER
	       Z's global row index, which points to the beginning of the sub-
	       matrix which is to be operated on.

       JZ      (global input) INTEGER
	       Z's global column index,	which points to	the beginning  of  the
	       submatrix which is to be	operated on.

       DESCZ   (global and local input)	INTEGER	array of dimension DLEN_.
	       The array descriptor for	the distributed	matrix Z.

       WORK    (local workspace/global output) DOUBLE PRECISION	array,
	       dimension  (  LWORK ) On	output,	WORK(1)	gives a	lower bound on
	       the workspace ( LWORK ) that guarantees the  user  desired  or-
	       thogonalization	(see  ORFAC).  Note that this may overestimate
	       the minimum workspace needed.

       LWORK   (local input) integer
	       LWORK  controls the extent of orthogonalization	which  can  be
	       done. The number	of eigenvectors	for which storage is allocated
	       on each process is NVEC	=  floor((  LWORK-  max(5*N,NP00*MQ00)
	       )/N).   Eigenvectors  corresponding  to	eigenvalue clusters of
	       size NVEC - ceil(M/P) + 1 are guaranteed	to be orthogonal ( the
	       orthogonality  is similar to that obtained from ZSTEIN2).  Note
	       :  LWORK	  must	be  no	smaller	 than:	max(5*N,NP00*MQ00)   +
	       ceil(M/P)*N,  and  should have the same input value on all pro-
	       cesses.	It is the minimum value	of LWORK  input	 on  different
	       processes that is significant.

	       If LWORK	= -1, then LWORK is global input and a workspace query
	       is assumed; the routine only calculates the minimum and optimal
	       size  for  all work arrays. Each	of these values	is returned in
	       the first entry of the corresponding work array,	and  no	 error
	       message is issued by PXERBLA.

       IWORK   (local workspace/global output) INTEGER array,
	       dimension  (  3*N+P+1 ) On return, IWORK(1) contains the	amount
	       of integer workspace required.  On return, the IWORK(2) through
	       IWORK(P+2)  indicate the	eigenvectors computed by each process.
	       Process I  computes  eigenvectors  indexed  IWORK(I+2)+1	 thru'
	       IWORK(I+3).

       LIWORK  (local input) INTEGER
	       Size of array IWORK.  Must be >=	3*N + P	+ 1

	       If  LIWORK  =  -1,  then	LIWORK is global input and a workspace
	       query is	assumed; the routine only calculates the  minimum  and
	       optimal	size  for all work arrays. Each	of these values	is re-
	       turned in the first entry of the	corresponding work array,  and
	       no error	message	is issued by PXERBLA.

       IFAIL   (global output) integer array, dimension	(M)
	       On normal exit, all elements of IFAIL are zero.	If one or more
	       eigenvectors fail to converge after MAXITS  iterations  (as  in
	       ZSTEIN),	 then  INFO > 0	is returned.  If mod(INFO,M+1)>0, then
	       for I=1 to mod(INFO,M+1), the eigenvector corresponding to  the
	       eigenvalue W(IFAIL(I)) failed to	converge ( W refers to the ar-
	       ray of eigenvalues on output ).

	       ICLUSTR (global output) integer	array,	dimension  (2*P)  This
	       output  array contains indices of eigenvectors corresponding to
	       a cluster of eigenvalues	that could not be  orthogonalized  due
	       to  insufficient	 workspace (see	LWORK, ORFAC and INFO).	Eigen-
	       vectors	corresponding  to  clusters  of	 eigenvalues   indexed
	       ICLUSTR(2*I-1)  to ICLUSTR(2*I),	I = 1 to INFO/(M+1), could not
	       be orthogonalized due to	lack of	workspace. Hence the eigenvec-
	       tors  corresponding  to	these  clusters	may not	be orthogonal.
	       ICLUSTR is a zero  terminated  array  ---  (  ICLUSTR(2*K).NE.0
	       .AND.  ICLUSTR(2*K+1).EQ.0  ) if	and only if K is the number of
	       clusters.

       GAP     (global output) DOUBLE PRECISION	array, dimension (P)
	       This output array contains the gap  between  eigenvalues	 whose
	       eigenvectors  could  not	 be  orthogonalized. The INFO/M	output
	       values in this array correspond to the INFO/(M+1) clusters  in-
	       dicated	by the array ICLUSTR. As a result, the dot product be-
	       tween eigenvectors corresponding	to the I^th cluster may	be  as
	       high as ( O(n)*macheps )	/ GAP(I).

       INFO    (global output) INTEGER
	       = 0:  successful	exit
	       <  0:   If the i-th argument is an array	and the	j-entry	had an
	       illegal value, then INFO	= -(i*100+j), if the i-th argument  is
	       a  scalar  and had an illegal value, then INFO =	-i.  < 0 :  if
	       INFO = -I, the I-th argument had	an illegal value
	       > 0 :  if mod(INFO,M+1) = I, then I eigenvectors	failed to con-
	       verge in	MAXITS iterations. Their indices are stored in the ar-
	       ray IFAIL.  if INFO/(M+1) = I, then eigenvectors	 corresponding
	       to I clusters of	eigenvalues could not be orthogonalized	due to
	       insufficient workspace. The indices of the clusters are	stored
	       in the array ICLUSTR.

ScaLAPACK version 1.7		13 August 2001			    PZSTEIN(l)

NAME | SYNOPSIS | PURPOSE | ARGUMENTS

Want to link to this manual page? Use this URL:
<https://www.freebsd.org/cgi/man.cgi?query=pzstein&manpath=FreeBSD+12.0-RELEASE+and+Ports>

home | help