ports/117923: USE_FORTRAN=yes: shared libraries for blas, lapack, and atlas do not contain any dependencies on other libs

From:Thomas Ludwig <tludwig@smr.ch>
Date:Thu, 8 Nov 2007 13:41:52 GMT
Subject:USE_FORTRAN=yes: shared libraries for blas, lapack, and atlas do not contain any dependencies on other libs
Send-pr version:www-3.1

Number:117923
Category:ports
Synopsis:USE_FORTRAN=yes: shared libraries for blas, lapack, and atlas do not contain any dependencies on other libs
Severity:non-critical
Priority:low
Responsible:maho@FreeBSD.org
State:feedback
Class:sw-bug
Arrival-Date:Thu Nov 08 13:50:01 UTC 2007
Closed-Date:
Last-Modified:Fri Jul 11 00:30:09 UTC 2008
Originator:Thomas Ludwig
Release:6.3-PRERELEASE

Organization:
SMR
 
Environment:
FreeBSD pingu.smr-internal.ch 6.3-PRERELEASE FreeBSD 6.3-PRERELEASE #0: Thu Nov 1 16:38:55 CET 2007 root@pingu.smr-internal.ch:/usr/obj/usr/src/sys/GENERIC i386
Description:
The USE_FORTRAN=yes directive present in the ports Makefiles for math/blas, math/lapack, and math/atlas does not record any dependencies in the resulting shared libraries, not even the dependency on libgfortran:

$ ldd /usr/local/lib/libblas.so.2
/usr/local/lib/libblas.so.2:

This in turn leads to problems when linking with such shared libraries.
 
How-To-Repeat:
 
Fix:
Release-Note:
 
Audit-Trail:
Responsible Changed
From-To:freebsd-ports-bugs->maho
By:edwin
When:Thu Nov 8 21:28:36 UTC 2007
Why:The situation wants that maho@ is maintainer of all three ports.

State Changed
From-To:open->feedback
By:maho
When:Sun Nov 11 08:32:31 UTC 2007
Why:Could you please explain more about the defect?

Reply via E-mail
From:Maho NAKATA <chat95@mac.com>
Date:Sun, 11 Nov 2007 17:30:38 +0900 (JST)
From: edwin@FreeBSD.org
Subject: Re: ports/117923: USE_FORTRAN=yes: shared libraries for blas, lapack, and atlas do not contain any dependencies on other libs
Date: Thu, 08 Nov 2007 21:31:02 +0000 (GMT)

> Synopsis: USE_FORTRAN=yes: shared libraries for blas, lapack, and atlas do not contain any dependencies on other libs
>
> Responsible-Changed-From-To: freebsd-ports-bugs->maho
> Responsible-Changed-By: edwin
> Responsible-Changed-When: Thu Nov 8 21:28:36 UTC 2007
> Responsible-Changed-Why:
> The situation wants that maho@ is maintainer of all three ports.
>
> http://www.freebsd.org/cgi/query-pr.cgi?pr=117923

Verified. But what is the actual problem?
Linking some objects against libblas does require libgfortran
or something like that, though.

All the best,
-- Nakata Maho (maho@FreeBSD.org)


Reply via E-mail
From:bf <bf2006a@yahoo.com>
Date:Tue, 29 Apr 2008 22:39:57 -0700 (PDT)
I think, Maho, that he is referring to two problems
that were not present before because many of these
ports used static libraries, and/or because Fortran
was in the base FreeBSD system. Now the problems are:

1) Some of the libraries built by these ports have
runtime dependencies on libraries associated with the
compiler used to build them -- gfortran, for instance,
for some of these ports built with lang/gcc42. At the
moment, this runtime dependency is not recorded in
LIB_DEPENDS for the port. Instead, USE_FORTRAN only
records a BUILD_DEPENDS. This is appropriate for
USE_FORTRAN, which is trying to retain some
flexibility for users by permitting them to use
several different Fortran compilers, and which is
intended to also apply to ports that require a Fortran
compiler during the build stage, but not afterwards.
However, for those ports that do need the compiler
libraries at runtime, failing to record this
dependency can break a port if, for instance, a
compiler is first used to build the port and then the
compiler is deinstalled or updated to a new compiler
version with a different ABI or some subtly-different
library behavior. This may also create some problems
for people using packages instead of ports. The
solution is to record the runtime dependency in
LIB_DEPENDS, but it is a bit of a pain because some of
the logic used in bsd.gcc.mk has to be repeated in the
relevant port Makefiles for each compiler accepted by
USE_FORTRAN, or some hack like using libmap.conf(5)
has to be employed.

2) In some of these ports, we are assembling shared
libraries from static libraries. This is simply
because some of these older, standard ports first used
only static libraries, and then we subsequently
altered them to also make shared libraries in the
simplest and quickest way. However, the resulting
shared libraries, because of the way that they are
built, lack some of the nice features of shared
libraries built in other ports. For instance, the
lapack library ${LOCALBASE}/lib/liblapack.so.4 is
built by the command:

cd ${WRKSRC_SHARED} ; ld -Bshareable -o liblapack.so.${SVERSION} -x \
    -soname liblapack.so.${SVERSION} --whole-archive liblapack.a ; \
    ${LN} -s liblapack.so.${SVERSION} liblapack.so

in the math/lapack Makefile, and the resulting library
lacks things like ELF DT_NEEDED tags that make it
easier to resolve shared library dependencies. You
can't use "ldd" or "objdump -x ... | grep -ie needed
-ie rpath" (as in
${PORTSDIR}/Tools/scripts/neededlibs.sh or
sysutils/libchk) to find the needed libraries --
instead you have to laboriously go through and resolve
undefined symbols by trying to find them in other
libraries that may be dependencies, or by carefully
looking through makefiles, documentation, and compiler
settings. This can probably be fixed by tinkering
with the commands used to assemble the shared
libraries.
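
The check described in point 2 can be sketched as a small shell helper. This is only a sketch under stated assumptions: the function names needed_from_dump and needed_libs are hypothetical, and it assumes GNU binutils objdump, whose -p output prints one NEEDED line per DT_NEEDED tag.

```shell
#!/bin/sh
# needed_from_dump (hypothetical name): read "objdump -p" output on stdin
# and print one DT_NEEDED library name per line. It prints nothing when
# the object records no NEEDED entries, which is the case reported here.
needed_from_dump() {
    awk '$1 == "NEEDED" { print $2 }'
}

# needed_libs (hypothetical wrapper): inspect a real shared object.
# Assumes GNU binutils objdump is available in PATH.
needed_libs() {
    objdump -p "$1" | needed_from_dump
}
```

Running `needed_libs /usr/local/lib/libblas.so.2` on a library assembled with a bare "ld --whole-archive" step would be expected to print nothing, matching the empty ldd output in the original report.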

b.




Reply via E-mail
From:bf <bf2006a@yahoo.com>
Date:Tue, 29 Apr 2008 23:05:30 -0700 (PDT)
On second thought, it may be easier to solve the first
problem in my earlier reply by changing USE_FORTRAN in
bsd.gcc.mk to USE_FORTRAN_BUILD and USE_FORTRAN_RUN or
something similar, like we already have for PERL, TCL,
TK, TWISTED, PYTHON, etc. in ports. That way we could
properly account for different types of dependencies
and avoid cluttering up Makefiles for individual
ports.

b.



Reply via E-mail
From:bf <bf2006a@yahoo.com>
Date:Fri, 4 Jul 2008 06:34:25 -0700 (PDT)


Here are two patches, for math/blas and math/lapack, that (1) extend the regression test targets, (2) clean up some parts of the build, (3) add some Netlib mirrors, and (4) build the shared libraries directly, using the Fortran compiler for linking (which seems to be the best way to record the Fortran-related library dependencies) and the methods of /usr/share/mk/bsd.lib.mk, rather than converting static libraries into shared libraries, as is currently done (as I suggested earlier, the conversion seems to cause a loss of information). The last step could also be accomplished as in NetBSD pkgsrc, by using devel/libtool15 as a wrapper during compilation and linking, but I didn't think that this extra dependency was necessary. Tested on 7-Stable i386 with lang/gcc42.

This is a first step towards making linking with these ports easier, and towards allowing multiple blas/lapack variants to be used interchangeably in Ports. Some additional changes are still needed to correctly record the Fortran-related runtime dependencies, but at least such dependencies are now correctly recorded within the blas and lapack shared libraries. Comments?
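
The difference in step (4) is that the link is driven by the Fortran compiler rather than by a bare "ld". The compiler driver adds its own runtime libraries to the link, so the linker can record them as DT_NEEDED entries. A minimal sketch of that link line follows; shared_link_cmd is a hypothetical helper that only prints the command, and the driver name gfortran42 used below is an assumption about what lang/gcc42 installs.

```shell
#!/bin/sh
# shared_link_cmd (hypothetical helper): print, without running it, a link
# line that uses the Fortran compiler driver instead of a bare "ld".
#   $1 = compiler driver (for example, the port's ${FC})
#   $2 = library base name
#   $3 = shared library major version
#   remaining arguments = position-independent object files
shared_link_cmd() {
    fc=$1
    name=$2
    major=$3
    shift 3
    soname="lib${name}.so.${major}"
    # Same flags as the patched makefile.lib: strip local symbols (-x)
    # and set the soname through the driver's -Wl pass-through.
    echo "$fc -shared -Wl,-x -Wl,-soname,$soname -o $soname $*"
}
```

For example, `shared_link_cmd gfortran42 blas 2 caxpy.So ccopy.So` prints a single command that could be pasted into a build directory holding those objects.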

Regards,
b.f.




Download blas.txt
diff -ruN blas.orig/Makefile blas/Makefile
--- blas.orig/Makefile   2008-02-21 00:11:43.000000000 -0500
+++ blas/Makefile        2008-07-03 07:40:32.305931580 -0400
@@ -7,13 +7,18 @@
 
 PORTNAME=       blas
 PORTVERSION=    1.0
-PORTREVISION=   2
+PORTREVISION=   3
 CATEGORIES=     math
-MASTER_SITES=   http://www.netlib.org/blas/ \
-                ftp://ftp.mirrorservice.org/sites/netlib.bell-labs.com/netlib/blas/ \
-                ftp://netlib.bell-labs.com/netlib/blas/
-DISTNAME=       ${PORTNAME}
-EXTRACT_SUFX=   .tgz
+MASTER_SITES=   ftp://ftp.netlib.org/blas/ \
+                http://www.netlib.org/blas/ \
+                http://netlib.sandia.gov/blas/ \
+                http://www.mirrorservice.org/sites/netlib.bell-labs.com/netlib/blas/ \
+                http://www.netlib.no/netlib/blas/
+DISTFILES=      ${PORTNAME}.tgz sblat1 dblat1 cblat1 zblat1 \
+                sblat2 sblat2d dblat2 dblat2d cblat2 cblat2d zblat2 zblat2d \
+                sblat3 sblat3d dblat3 dblat3d cblat3 cblat3d zblat3 zblat3d
+DIST_SUBDIR=    ${PORTNAME}
+EXTRACT_ONLY=   ${PORTNAME}.tgz
 
 MAINTAINER=     maho@FreeBSD.org
 COMMENT=        Basic Linear Algebra, level 1, 2, and 3
@@ -21,11 +26,42 @@
 USE_LDCONFIG=   yes
 USE_FORTRAN=    yes
 WRKSRC=         ${WRKDIR}/BLAS
+SHLIB_MAJOR=    2
 
-PLIST_FILES=    lib/libblas.a lib/libblas.so lib/libblas.so.2
+PLIST_FILES=    lib/libblas.a lib/libblas_p.a lib/libblas.so \
+                lib/libblas.so.${SHLIB_MAJOR}
+
+.include <bsd.port.pre.mk>
+
+.if ${ARCH} == "sparc64" || ${ARCH} == "amd64"
+PICFLAG=        -fPIC
+.else
+PICFLAG=        -fpic
+.endif
+
+AR?=            ar
+NM?=            nm
+RANLIB?=        ranlib
+
+.if ${CC} == "icc"
+POFLAG= -p
+.else
+POFLAG= -pg
+.endif
+
+MAKE_ENV+=      PICFLAG="${PICFLAG}" POFLAG="${POFLAG}" AR="${AR}" NM="${NM}" \
+                RANLIB="${RANLIB}" SHLIB_MAJOR="${SHLIB_MAJOR}" ECHO_CMD="${ECHO_CMD}" \
+                LDFLAGS="${LDFLAGS}" RM="${RM}" MV="${MV}" MKDIR="${MKDIR}" LN="${LN}"
 
 do-configure:
-        @${INSTALL_DATA} ${FILESDIR}/makefile.lib ${WRKSRC}/Makefile
-        @${REINPLACE_CMD} -e 's+@FFLAGS@+${FFLAGS}+g' ${WRKSRC}/Makefile
+        @${CP} ${FILESDIR}/makefile.lib ${WRKSRC}/Makefile
+        @cd ${DISTDIR}/${DIST_SUBDIR} && ${CP} ${DISTFILES:N*.tgz} ${WRKSRC}
+
+regression-test:
+        @( cd ${WRKSRC} && ${SETENV} ${MAKE_ENV} ${MAKE} test ; )
+                @${ECHO_CMD} ""
+        @${ECHO_CMD} " Examine the *.out and *.SUMM files in ${WRKSRC}/sharedtests and "
+        @${ECHO_CMD} " ${WRKSRC}/statictests for test outcomes "
+        @${ECHO_CMD} ""
 
-.include <bsd.port.mk>
+.include <bsd.port.post.mk>
diff -ruN blas.orig/distinfo blas/distinfo
--- blas.orig/distinfo   2008-02-21 00:11:43.000000000 -0500
+++ blas/distinfo        2008-07-03 07:40:32.305931580 -0400
@@ -1,3 +1,63 @@
-MD5 (blas.tgz) = 7e6af7022440d8688d16be86d55fb358
-SHA256 (blas.tgz) = bc2f25898141c3ed9513abe3b3f15e00f0d2e8881c7f26b74950cdee45fb541d
-SIZE (blas.tgz) = 98957
+MD5 (blas/blas.tgz) = 7e6af7022440d8688d16be86d55fb358
+SHA256 (blas/blas.tgz) = bc2f25898141c3ed9513abe3b3f15e00f0d2e8881c7f26b74950cdee45fb541d
+SIZE (blas/blas.tgz) = 98957
+MD5 (blas/sblat1) = 14c8578c6bef465d4092b38b9dcad351
+SHA256 (blas/sblat1) = f5bcbc542f1de381cebe4adbb1d16507cbecdf4b0cebe51dc343ae9c2c6d7dbe
+SIZE (blas/sblat1) = 31203
+MD5 (blas/dblat1) = 20d35821fa6e4fa5e1a72a6f377504ee
+SHA256 (blas/dblat1) = bced74cc3c3fc399e92f7606a80a3e579e2f7a3ae33dec19273b1c962804396d
+SIZE (blas/dblat1) = 31203
+MD5 (blas/cblat1) = cac7aba98e33e64e87d08a639c5ec462
+SHA256 (blas/cblat1) = 92978040d4d0300414d46552a506a5c5d3d174bbda2b2302e1462d54e89e446f
+SIZE (blas/cblat1) = 31188
+MD5 (blas/zblat1) = 2de5acfebdffb67e3051128703205dfa
+SHA256 (blas/zblat1) = df9e3f864bcca8681f7752e96525f5163327ae7eb4262dd6efd5bf18a9373f02
+SIZE (blas/zblat1) = 31188
+MD5 (blas/sblat2) = f373eadeed0f1411c9312ca2a0ad4b12
+SHA256 (blas/sblat2) = 77381852d9681314ebfe814f647938c7ef53b17ffa5e9d64d1423c987e412fd2
+SIZE (blas/sblat2) = 111315
+MD5 (blas/sblat2d) = 44c3d207ba01976dbb6e97cf068d7ac3
+SHA256 (blas/sblat2d) = 0a4eaa18ded58529306fbda6caba8b363702576c8355f8068b0a941ecea4b93d
+SIZE (blas/sblat2d) = 1466
+MD5 (blas/dblat2) = 50358b20506ff32e40a259857b36ff7c
+SHA256 (blas/dblat2) = 3a0653c952bf151901722db76de59ef6699f9b751e05d8f1c1d9026448c79d49
+SIZE (blas/dblat2) = 111388
+MD5 (blas/dblat2d) = 8d17b803ded05d8f306b54959bf61abd
+SHA256 (blas/dblat2d) = 7b2689a74db3a46e0d2ddf209188420b8099644093e8481b76ff2bada764d29b
+SIZE (blas/dblat2d) = 1466
+MD5 (blas/cblat2) = 86dfcec29632eb8fd89c0d404a42ea5d
+SHA256 (blas/cblat2) = f4900c334efb96fb7030ac5d9240c8a48cc7d44655825a58acfcc4f69dd030ab
+SIZE (blas/cblat2) = 115732
+MD5 (blas/cblat2d) = ea5074ba38527483943de87f7fa89f90
+SHA256 (blas/cblat2d) = 52dde13a56556d771e1c02799b584356ebbe6bd07f042ee50d910d5b532b0278
+SIZE (blas/cblat2d) = 1546
+MD5 (blas/zblat2) = fd08715785f80d29565db16f508bc113
+SHA256 (blas/zblat2) = bf5677f5614501ce729f7ad61f1d342dab604b455a0d2495070ebbe28f3eb172
+SIZE (blas/zblat2) = 116080
+MD5 (blas/zblat2d) = 3e40d3166714973566e076f7cf4865aa
+SHA256 (blas/zblat2d) = bbaa4d57b37fc895f92fb4c448063d9b7a1990e232d3dbf7fa96ace7f289b8af
+SIZE (blas/zblat2d) = 1546
+MD5 (blas/sblat3) = 074e4eea05be5f6ddf2e061f4dd1064b
+SHA256 (blas/sblat3) = 9a2f6c45f9caa6ddb219d8e058d18fadc7743859a23f5e0d22bbada2e7585aa1
+SIZE (blas/sblat3) = 102977
+MD5 (blas/sblat3d) = 7453aeea348b100538f6e7107a4fdfbb
+SHA256 (blas/sblat3d) = cb1f2a3615d3a2ac3a7e886154f594284c251087417d3c2d2dd247a87c416fde
+SIZE (blas/sblat3d) = 882
+MD5 (blas/dblat3) = 67109216e71370bd74e5a0573ceadbda
+SHA256 (blas/dblat3) = 1df6fb4f1971d604f1ce33559a9a8753494f7becc9ed3846bbedff5fab8987ed
+SIZE (blas/dblat3) = 103029
+MD5 (blas/dblat3d) = 71ca8160f36a74d8cee313038a5e6dd4
+SHA256 (blas/dblat3d) = c4c95934a7bb7a715c913c35e53c75a0559eb275f0bd4ffa582555344a092a78
+SIZE (blas/dblat3d) = 882
+MD5 (blas/cblat3) = a7f2ff2684cee68fa2dc36573cec1fae
+SHA256 (blas/cblat3) = d91446c1d05d70f9dc929755b383a03740a3f9d194a51f67d0e48901ed22f259
+SIZE (blas/cblat3) = 130271
+MD5 (blas/cblat3d) = 071bc85efe3b78583202f7e2a0c109ac
+SHA256 (blas/cblat3d) = 73dd9efdcbe12fadda8eca57754b548c37da31393e68608ede6fac657d75fd05
+SIZE (blas/cblat3d) = 1046
+MD5 (blas/zblat3) = 0fd36e72f1226d7a09ff4f43b13a7b77
+SHA256 (blas/zblat3) = accc44079788b6e4a887a25c49cfea4c01141af5a228f30bbb1ec62ba2245660
+SIZE (blas/zblat3) = 130561
+MD5 (blas/zblat3d) = aa83e4fd400cf72d5445ce4553a40735
+SHA256 (blas/zblat3d) = e3372bad1f0fb2e15a36df3a3523cc5cda1a3459cb0bccba6da77f95525d5d83
+SIZE (blas/zblat3d) = 1046
diff -ruN blas.orig/files/makefile.lib blas/files/makefile.lib
--- blas.orig/files/makefile.lib 2008-02-21 00:11:43.000000000 -0500
+++ blas/files/makefile.lib      2008-07-03 07:40:32.305931580 -0400
@@ -1,18 +1,5 @@
-#       @(#)Makefile    5.7 (Berkeley) 6/27/91
-FFLAGS= @FFLAGS@
+.SUFFIXES: .o .po .So .f
 
-LIBDIR= ${PREFIX}/lib
-.if (${OSVERSION} > 600007)
-NO_PROFILE= no
-.else
-NOPROFILE= no
-.endif
-
-SHLIB_MAJOR= 2
-
-# BLAS sources
-LIB=blas
-#NOPROFILE=1
 SRCS =  caxpy.f  ccopy.f  cdotc.f  cdotu.f  cgbmv.f  cgemm.f  cgemv.f   \
         cgerc.f  cgeru.f  chbmv.f  chemm.f  chemv.f  cher.f   cher2.f   \
         cher2k.f cherk.f  chpmv.f  chpr.f   chpr2.f  crotg.f  cscal.f   \
@@ -36,6 +23,76 @@
         zsyr2k.f zsyrk.f  ztbmv.f  ztbsv.f  ztpmv.f  ztpsv.f  ztrmm.f   \
         ztrmv.f  ztrsm.f  ztrsv.f
 
-CLEANFILES+= *.c
+TESTSRCS =      sblat1 dblat1 cblat1 zblat1 sblat2 dblat2 cblat2 zblat2 \
+                sblat3 dblat3 cblat3 zblat3
 
-.include <bsd.lib.mk>
+OBJS=   ${SRCS:.f=.o}
+POBJS=  ${SRCS:.f=.po}
+SOBJS=  ${SRCS:.f=.So}
+
+.f.o:
+        @${RM} -f ${.TARGET}
+        ${FC} ${FFLAGS} -o ${.TARGET} -c ${.IMPSRC}
+
+.f.po:
+        @${RM} -f ${.TARGET}
+        ${FC} ${POFLAG} ${FFLAGS} -o ${.TARGET} -c ${.IMPSRC}
+
+.f.So:
+        @${RM} -f ${.TARGET}
+        ${FC} ${PICFLAG} -DPIC ${FFLAGS} -o ${.TARGET} -c ${.IMPSRC}
+
+libblas.a: ${OBJS}
+        @${RM} -f ${.TARGET}
+        @${ECHO_CMD} "building static library"
+        @${AR} cq ${.TARGET} `NM='${NM}' lorder ${OBJS} | tsort -q`
+        ${RANLIB} ${.TARGET}
+
+libblas_p.a: ${POBJS}
+        @${RM} -f ${.TARGET}
+        @${ECHO_CMD} "building profiled static library"
+        @${AR} cq ${.TARGET} `NM='${NM}' lorder ${POBJS} | tsort -q`
+        ${RANLIB} ${.TARGET}
+
+libblas.so.${SHLIB_MAJOR}: ${SOBJS}
+        @${RM} -f ${.TARGET}
+        @${ECHO_CMD} "building shared library"
+        ${FC} ${PICFLAG} -DPIC ${FFLAGS} ${LDFLAGS} -shared -Wl,-x \
+        -o ${.TARGET} -Wl,-soname,libblas.so.${SHLIB_MAJOR} \
+        `lorder ${SOBJS} | tsort -q`
+
+all: libblas.a libblas_p.a libblas.so.${SHLIB_MAJOR}
+
+install: all
+        ${BSD_INSTALL_SCRIPT} libblas.a ${PREFIX}/lib
+        ${BSD_INSTALL_SCRIPT} libblas_p.a ${PREFIX}/lib
+        ${BSD_INSTALL_PROGRAM} libblas.so.${SHLIB_MAJOR} ${PREFIX}/lib
+        ${LN} -fs ${PREFIX}/lib/libblas.so.${SHLIB_MAJOR} ${PREFIX}/lib/libblas.so
+
+test: all
+        @${ECHO_CMD} "testing static library"
+        ${MKDIR} statictests
+.for _S in ${TESTSRCS}
+        @${MV} ${_S} ${_S}.f
+        @${RM} -f ${_S}.o ./statictests/${_S}statictest
+        ${FC} ${FFLAGS} -o ${_S}.o -c ${_S}.f
+        ${FC} ${FFLAGS} ${LDFLAGS} ${_S}.o libblas.a -o ./statictests/${_S}statictest
+.endfor
+.for _l in s d c z
+        cd statictests && ./${_l}blat1statictest > ${_l}blat1.out && \
+        ./${_l}blat2statictest < ../${_l}blat2d && \
+        ./${_l}blat3statictest < ../${_l}blat3d
+.endfor
+        @${ECHO_CMD} "testing shared library"
+        ${MKDIR} sharedtests
+.for _S in ${TESTSRCS}
+        @${RM} -f ${_S}.So ./sharedtests/${_S}sharedtest
+        ${FC} ${PICFLAG} -DPIC ${FFLAGS} -o ${_S}.So -c ${_S}.f
+        ${FC} ${PICFLAG} -DPIC ${FFLAGS} ${LDFLAGS} ${_S}.So \
+        libblas.so.${SHLIB_MAJOR} -o ./sharedtests/${_S}sharedtest
+.endfor
+.for _l in s d c z
+        cd sharedtests && ./${_l}blat1sharedtest > ${_l}blat1.out && \
+        ./${_l}blat2sharedtest < ../${_l}blat2d && \
+        ./${_l}blat3sharedtest < ../${_l}blat3d
+.endfor


Download lapack.txt
diff -ruN lapack.orig/Makefile lapack/Makefile
--- lapack.orig/Makefile 2008-06-18 22:08:21.000000000 -0400
+++ lapack/Makefile      2008-07-03 08:37:41.394319630 -0400
@@ -7,9 +7,13 @@
 
 PORTNAME=       lapack
 PORTVERSION=    3.1.1
-PORTREVISION=   1
+PORTREVISION=   2
 CATEGORIES=     math
-MASTER_SITES=   ftp://ftp.netlib.org/lapack/
+MASTER_SITES=   ftp://ftp.netlib.org/lapack/ \
+                http://www.netlib.org/lapack/ \
+                http://netlib.sandia.gov/lapack/ \
+                http://www.mirrorservice.org/sites/netlib.bell-labs.com/netlib/lapack/ \
+                http://www.netlib.no/netlib/lapack/
 DISTFILES=      lapack-${PORTVERSION}.tgz manpages-${PORTVERSION}.tgz
 
 MAINTAINER=     maho@FreeBSD.org
@@ -40,8 +44,11 @@
 WRKSRC_PROFILE=${WRKSRC}_profile
 FFLAGS_PROFILE=-pg
 
-SVERSION=4
-BLAS=   -L${LOCALBASE}/lib -lblas
+SVERSION=       4
+PLIST_SUB+=     SVERSION="${SVERSION}"
+BLAS?=          -L${LOCALBASE}/lib -lblas
+LAPACKOBJ=      ${MAKE} -C ${WRKSRC_SHARED}/SRC -V ALLOBJ
+TMGOBJ=         ${MAKE} -C ${WRKSRC_SHARED}/TESTING/MATGEN -V ALLOBJ
 
 pre-fetch:
         @${ECHO} "You can override FC and FFLAGS on the command line."
@@ -56,52 +63,65 @@
                           -e 's,%%FFLAGS%%,${FFLAGS},g' \
                           -e 's,%%EXTRAFLAGS%%,,g' \
                           -e 's,%%BLAS%%,${BLAS},g' \
+                          -e 's,%%PLAT%%,,g' \
                                 ${WRKSRC}/make.inc
         @${REINPLACE_CMD} -e 's,%%F77%%,${F77},g' \
                           -e 's,%%FFLAGS%%,${FFLAGS},g' \
                           -e 's,%%EXTRAFLAGS%%,${FFLAGS_SHARED},g' \
                           -e 's,%%BLAS%%,${BLAS},g' \
+                          -e 's,%%PLAT%%,,g' \
                                 ${WRKSRC_SHARED}/make.inc
         @${REINPLACE_CMD} -e 's,%%F77%%,${F77},g' \
                           -e 's,%%FFLAGS%%,${FFLAGS},g' \
                           -e 's,%%EXTRAFLAGS%%,${FFLAGS_PROFILE},g' \
                           -e 's,%%BLAS%%,${BLAS},g' \
+                          -e 's,%%PLAT%%,_p,g' \
                                 ${WRKSRC_PROFILE}/make.inc
 
 do-build:
         @${ECHO_CMD} "Building static lapack library"
-        cd ${WRKSRC} ; ${MAKE} ${.MAKEFLAGS} ARCH=ar
-        @${ECHO_CMD} "Building shared lapack library"
-        cd ${WRKSRC_SHARED} ; ${MAKE} ${.MAKEFLAGS} ARCH=ar
-        @${ECHO_CMD} "Building profile lapack library"
-        cd ${WRKSRC_PROFILE} ; ${MAKE} ${.MAKEFLAGS} ARCH=ar
+        cd ${WRKSRC} && ${MAKE} ${.MAKEFLAGS} ARCH=ar lib
+        @${ECHO_CMD} "Building shared lapack library components"
+        cd ${WRKSRC_SHARED} && ${MAKE} ${.MAKEFLAGS} lapack_install
+        cd ${WRKSRC_SHARED}/SRC && ${MAKE} ${.MAKEFLAGS} `${LAPACKOBJ}`
+        cd ${WRKSRC_SHARED}/TESTING/MATGEN && ${MAKE} ${.MAKEFLAGS} `${TMGOBJ}`
+        @${ECHO_CMD} "Building profiled static lapack library"
+        cd ${WRKSRC_PROFILE} && ${MAKE} ${.MAKEFLAGS} ARCH=ar lib
 
 post-build:
-        ${CP} ${WRKSRC}/lapack_FREEBSD.a ${WRKSRC}/liblapack.a
-        ${CP} ${WRKSRC}/tmglib_FREEBSD.a ${WRKSRC}/libtmglib.a
-        ${CP} ${WRKSRC_SHARED}/lapack_FREEBSD.a ${WRKSRC_SHARED}/liblapack.a
-        ${CP} ${WRKSRC_SHARED}/tmglib_FREEBSD.a ${WRKSRC_SHARED}/libtmglib.a
-        ${CP} ${WRKSRC_PROFILE}/lapack_FREEBSD.a ${WRKSRC_PROFILE}/liblapack_p.a
-        ${CP} ${WRKSRC_PROFILE}/tmglib_FREEBSD.a ${WRKSRC_PROFILE}/libtmglib_p.a
-        cd ${WRKSRC_SHARED} ; ld -Bshareable -o liblapack.so.${SVERSION} -x -soname liblapack.so.${SVERSION} --whole-archive liblapack.a ; ${LN} -s liblapack.so.${SVERSION} liblapack.so
-        cd ${WRKSRC_SHARED} ; ld -Bshareable -o libtmglib.so.${SVERSION} -x -soname libtmglib.so.${SVERSION} --whole-archive libtmglib.a ; ${LN} -s libtmglib.so.${SVERSION} libtmglib.so
+        @${ECHO_CMD} "Assembling shared lapack library from components"
+        cd ${WRKSRC_SHARED}/SRC && \
+                lorder `${LAPACKOBJ}` | tsort -q | ${XARGS} -J % ${FC} \
+                ${FFLAGS} ${FFLAGS_SHARED} ${LDFLAGS} -shared \
+                -Wl,-x -o ../liblapack.so.${SVERSION} \
+                -Wl,-soname,liblapack.so.${SVERSION} % ${BLAS}
+         cd ${WRKSRC_SHARED}/TESTING/MATGEN && \
+                lorder `${TMGOBJ}` | tsort -q | ${XARGS} -J % ${FC} \
+                ${FFLAGS} ${FFLAGS_SHARED} ${LDFLAGS} -shared \
+                -Wl,-x -o ../../libtmglib.so.${SVERSION} \
+                -Wl,-soname,libtmglib.so.${SVERSION} % ${BLAS}
 
 do-install:
-        ${INSTALL_DATA} ${WRKSRC}/liblapack.a ${PREFIX}/lib
-        ${INSTALL_DATA} ${WRKSRC}/libtmglib.a ${PREFIX}/lib
-        ${INSTALL_DATA} ${WRKSRC_SHARED}/liblapack.so.${SVERSION} ${PREFIX}/lib
-        ${INSTALL_DATA} ${WRKSRC_SHARED}/libtmglib.so.${SVERSION} ${PREFIX}/lib
-        ${LN} -sf liblapack.so.${SVERSION} ${PREFIX}/lib/liblapack.so
-        ${LN} -sf libtmglib.so.${SVERSION} ${PREFIX}/lib/libtmglib.so
-        ${INSTALL_DATA} ${WRKSRC_PROFILE}/liblapack_p.a ${PREFIX}/lib
-        ${INSTALL_DATA} ${WRKSRC_PROFILE}/libtmglib_p.a ${PREFIX}/lib
-        ${INSTALL_MAN} ${WRKSRC}/manpages/man/manl/[a-c]*.l ${PREFIX}/man/manl
-        ${INSTALL_MAN} ${WRKSRC}/manpages/man/manl/[d-l]*.l ${PREFIX}/man/manl
-        ${INSTALL_MAN} ${WRKSRC}/manpages/man/manl/[m-s]*.l ${PREFIX}/man/manl
-        ${INSTALL_MAN} ${WRKSRC}/manpages/man/manl/[t-z]*.l ${PREFIX}/man/manl
+        ${INSTALL_SCRIPT} ${WRKSRC}/lapack.a ${PREFIX}/lib/liblapack.a
+        ${INSTALL_SCRIPT} ${WRKSRC}/tmglib.a ${PREFIX}/lib/libtmglib.a
+        ${INSTALL_PROGRAM} ${WRKSRC_SHARED}/liblapack.so.${SVERSION} ${PREFIX}/lib
+        ${INSTALL_PROGRAM} ${WRKSRC_SHARED}/libtmglib.so.${SVERSION} ${PREFIX}/lib
+        ${LN} -sf ${PREFIX}/lib/liblapack.so.${SVERSION} ${PREFIX}/lib/liblapack.so
+        ${LN} -sf ${PREFIX}/lib/libtmglib.so.${SVERSION} ${PREFIX}/lib/libtmglib.so
+        ${INSTALL_SCRIPT} ${WRKSRC_PROFILE}/lapack_p.a ${PREFIX}/lib/liblapack_p.a
+        ${INSTALL_SCRIPT} ${WRKSRC_PROFILE}/tmglib_p.a ${PREFIX}/lib/libtmglib_p.a
+        ${INSTALL_MAN} ${WRKSRC}/manpages/man/manl/*.l ${PREFIX}/man/manl
 
 regression-test: build
         @${ECHO_CMD} "Testing static lapack library"
-        cd ${WRKSRC}/TESTING ; ${MAKE} ${.MAKEFLAGS} ARCH=ar
+        cd ${WRKSRC}/TESTING && ${MAKE} ${.MAKEFLAGS} ARCH=ar
+        @${ECHO_CMD} "Testing shared lapack library"
+        cd ${WRKSRC_SHARED}/TESTING && ${MAKE} -E LAPACKLIB \
+        -E TMGLIB ${.MAKEFLAGS} ARCH=ar LAPACKLIB=liblapack.so.${SVERSION} \
+        TMGLIB=libtmglib.so.${SVERSION}
+        @${ECHO_CMD} ""
+        @${ECHO_CMD} " Examine the *.out files in ${WRKSRC}/TESTING and "
+        @${ECHO_CMD} " ${WRKSRC_SHARED}/TESTING for test outcomes "
+        @${ECHO_CMD} ""
 
 .include <bsd.port.post.mk>
diff -ruN lapack.orig/files/patch-INSTALL+make.inc.gfortran lapack/files/patch-INSTALL+make.inc.gfortran
--- lapack.orig/files/patch-INSTALL+make.inc.gfortran    1969-12-31 19:00:00.000000000 -0500
+++ lapack/files/patch-INSTALL+make.inc.gfortran 2008-07-03 07:49:15.759906038 -0400
@@ -0,0 +1,36 @@
+--- INSTALL/make.inc.gfortran.orig      2007-02-23 15:07:35.000000000 -0500
++++ INSTALL/make.inc.gfortran   2008-07-02 13:46:47.204240167 -0400
+@@ -8,7 +8,7 @@
+ #
+ #  The machine (platform) identifier to append to the library names
+ #
+-PLAT = _LINUX
++PLAT = %%PLAT%%
+ #  
+ #  Modify the FORTRAN and OPTS definitions to refer to the
+ #  compiler and desired compiler options for your machine.  NOOPT
+@@ -16,11 +16,11 @@
+ #  selected.  Define LOADER and LOADOPTS to refer to the loader and 
+ #  desired load options for your machine.
+ #
+-FORTRAN  = gfortran 
+-OPTS     = -O2
++FORTRAN  = %%F77%%
++OPTS     = %%FFLAGS%% %%EXTRAFLAGS%%
+ DRVOPTS  = $(OPTS)
+-NOOPT    = -O0
+-LOADER   = gfortran
++NOOPT    = -O0 %%EXTRAFLAGS%%
++LOADER   = %%F77%%
+ LOADOPTS =
+ #
+ # Timer for the SECOND and DSECND routines
+@@ -48,7 +48,7 @@
+ #  machine-specific, optimized BLAS library should be used whenever
+ #  possible.)
+ #
+-BLASLIB      = ../../blas$(PLAT).a
++BLASLIB      = %%BLAS%%
+ LAPACKLIB    = lapack$(PLAT).a
+ TMGLIB       = tmglib$(PLAT).a
+ EIGSRCLIB    = eigsrc$(PLAT).a
diff -ruN lapack.orig/files/patch-Makefile lapack/files/patch-Makefile
--- lapack.orig/files/patch-Makefile     2007-09-29 08:53:55.000000000 -0400
+++ lapack/files/patch-Makefile  1969-12-31 19:00:00.000000000 -0500
@@ -1,14 +0,0 @@
---- Makefile    2007-09-29 10:36:38.000000000 +0900
-+++ Makefile    2007-09-29 10:38:25.000000000 +0900
-@@ -7,7 +7,11 @@
- include make.inc
- 
- 
-+.if defined(ENABLE_TESTING) && ${ENABLE_TESTING} == "YES"
- all: lapack_install lib lapack_testing blas_testing
-+.else
-+all: lapack_install lib
-+.endif
- 
- lib: lapacklib tmglib
- #lib: blaslib lapacklib tmglib
diff -ruN lapack.orig/files/patch-make.inc.gfortran lapack/files/patch-make.inc.gfortran
--- lapack.orig/files/patch-make.inc.gfortran    2007-09-29 08:53:55.000000000 -0400
+++ lapack/files/patch-make.inc.gfortran 1969-12-31 19:00:00.000000000 -0500
@@ -1,36 +0,0 @@
---- INSTALL/make.inc.gfortran.orig      2007-02-24 05:07:35.000000000 +0900
-+++ INSTALL/make.inc.gfortran   2007-09-29 10:22:01.000000000 +0900
-@@ -8,7 +8,7 @@
- #
- #  The machine (platform) identifier to append to the library names
- #
--PLAT = _LINUX
-+PLAT = _FREEBSD
- #  
- #  Modify the FORTRAN and OPTS definitions to refer to the
- #  compiler and desired compiler options for your machine.  NOOPT
-@@ -16,11 +16,11 @@
- #  selected.  Define LOADER and LOADOPTS to refer to the loader and 
- #  desired load options for your machine.
- #
--FORTRAN  = gfortran 
--OPTS     = -O2
-+FORTRAN  = %%F77%%
-+OPTS     = %%FFLAGS%% %%EXTRAFLAGS%%
- DRVOPTS  = $(OPTS)
--NOOPT    = -O0
--LOADER   = gfortran
-+NOOPT    = -O0 %%EXTRAFLAGS%%
-+LOADER   = %%F77%%
- LOADOPTS =
- #
- # Timer for the SECOND and DSECND routines
-@@ -48,7 +48,7 @@
- #  machine-specific, optimized BLAS library should be used whenever
- #  possible.)
- #
--BLASLIB      = ../../blas$(PLAT).a
-+BLASLIB      = %%BLAS%%
- LAPACKLIB    = lapack$(PLAT).a
- TMGLIB       = tmglib$(PLAT).a
- EIGSRCLIB    = eigsrc$(PLAT).a
diff -ruN lapack.orig/pkg-plist lapack/pkg-plist
--- lapack.orig/pkg-plist        2007-09-29 08:53:54.000000000 -0400
+++ lapack/pkg-plist     2008-07-03 07:49:15.759906038 -0400
@@ -1,8 +1,8 @@
 lib/liblapack.a
 lib/liblapack.so
-lib/liblapack.so.4
+lib/liblapack.so.%%SVERSION%%
 lib/liblapack_p.a
 lib/libtmglib.a
 lib/libtmglib.so
-lib/libtmglib.so.4
+lib/libtmglib.so.%%SVERSION%%
 lib/libtmglib_p.a



Reply via E-mail
From:bf <bf2006a@yahoo.com>
Date:Tue, 8 Jul 2008 18:00:40 -0700 (PDT)
Using the regression tests in the patches I sent in earlier, it now appears that ICAMAX in math/blas is not behaving as desired. It seems to be a problem that has been experienced elsewhere -- see, for example,

https://bugs.launchpad.net/ubuntu/hardy/+source/blas/+bug/202869

and

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34616

which are thought to be a variant of infamous gcc bug 323:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=323

There are several different ways to handle this problem, some of which are discussed in the "323" link above, or in the periodic discussions of floating-point calculations on FreeBSD that have taken place on the cvs-src mailing list, usually after modifications of relevant code by bde@ and das@. From what little I know about this, I'm not convinced that using CFLAGS+=-ffloat-store for icamax.f and izamax.f if ${CC} or ${FC} are versions of gcc, as adopted by Debian, Ubuntu and others, is the best solution. I'll see if I can come up with something.

These precision-related problems are starting to surface more often, not only with these venerable ports that are now being scrutinized more carefully, but also with newer software that often makes use of extended and mixed precision. The whole range of problems needs to be addressed as simply and consistently as possible, considering the full range of platforms and compilers that we now have. Perhaps someone could draw up some detailed guidelines for dealing with them in the base system and ports, in a way that doesn't involve wholesale rewriting of existing third-party software? (At least until we move to a compiler that can better handle these problems without intervention ... )

Regards,
b.




Reply via E-mail
From:David Schultz <das@FreeBSD.ORG>
Date:Tue, 8 Jul 2008 22:30:44 -0400
On Tue, Jul 08, 2008, bf wrote:
> Using the regression tests in the patches I sent in earlier, it now appears that ICAMAX in math/blas is not behaving as desired. It seems to be a problem that has been experienced elsewhere -- see, for example,
>
> https://bugs.launchpad.net/ubuntu/hardy/+source/blas/+bug/202869
>
> and
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34616
>
> which are thought to be a variant of infamous gcc bug 323:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=323

Hi Mr. or Ms. "bf",

FreeBSD isn't subject to gcc bug 323. On i386, we deliberately set
the FPU precision to double by default to avoid this problem. On
other supported architectures, extended precision is not an issue
in the first place.

I'm not sure why you think -ffloat-store is needed. Perhaps it
would help if you could explain what the actual problem is, or
even produce a minimal test that fails. Is this a FORTRAN-specific
issue? Thanks!

Reply via E-mail
From:bf <bf2006a@yahoo.com>
Date:Wed, 9 Jul 2008 02:15:39 -0700 (PDT)

Download lapack.txt
(Attachment content not viewable.)


Download testdata.tar.bz2
(Binary attachment not viewable.)



Reply via E-mail
From:David Schultz <das@FreeBSD.ORG>
Date:Wed, 9 Jul 2008 07:51:50 -0400
On Wed, Jul 09, 2008, bf wrote:
> COMPLEX Z1, Z2
> REAL S1

Aah, but the REAL type in FORTRAN is IEEE 754 single precision, so
you do run into a similar issue. (The FPU is set to double
precision in FreeBSD, so you get more precision than you asked for
in intermediate calculations.) There's usually very little
performance advantage to using single precision instead of double
precision; double precision is certainly a lot faster than using
-ffloat-store.

Another option is to tell the compiler to use SSE, or switch to
amd64 where that is the default. Then you won't run into these
issues.

> * SCABS1 computes absolute value of a complex number
> *
> * .. Intrinsic Functions ..
> INTRINSIC ABS,AIMAG,REAL
> * ..
> SCABS1 = ABS(REAL(Z)) + ABS(AIMAG(Z))

This is not the correct formula for the absolute value of a
complex number, by the way.

Reply via E-mail
From:Bruce Evans <brde@optusnet.com.au>
Date:Thu, 10 Jul 2008 02:36:16 +1000 (EST)
On Thu, 10 Jul 2008, Bruce Evans wrote:

> [... points about side issues deleted]
> On Wed, 9 Jul 2008, David Schultz wrote:
>
> > On Wed, Jul 09, 2008, bf wrote:
> > > COMPLEX Z1, Z2
> > > REAL S1
> >
> > Aah, but the REAL type in FORTRAN is IEEE 754 single precision, so
> > you do run into a similar issue. (The FPU is set to double
> > precision in FreeBSD, so you get more precision than you asked for
> > in intermediate calculations.) There's usually very little
> > performance advantage to using single precision instead of double
> > precision;
> ...

Back to the main point. I looked at the test data. This problem is
fully understood. It is essentially the ancient gcc spilling bug that
gives f(x) != f(x) almost everywhere for all interesting sub-default-precision
libm functions f. This is because exactly one of the f(x)'s is spilled
unless you use -O0 or -ffloat-store (not a bug -- spilling is unavoidable),
both of the f(x)'s are normally evaluated and returned in extra precision
(extra precision for the evaluation is a feature and for the return
it is a required bugfeature for C99), and spilling is broken in the
presence of non-explicit extra precision (this is the bug). Spilling
loses the extra precision (and also any extra exponent range, but this
is unusual) so the results compare unequal.

FreeBSD changes the default precision to 53-bits (double) to mostly
avoid this bug. Spilling still breaks float precision (when x and
f(x) are floats and f(x) has extra precision) and cases where the extra
exponent range has an effect.
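
The float case can be imitated without an x87 FPU. The following Python
sketch is only a model, under these assumptions: a Python float (an IEEE
double) stands in for the 53-bit value held in a register, a round trip
through struct's "f" format stands in for a spill to a float-sized stack
slot, and the names spill_to_float and f are hypothetical.

```python
import struct


def spill_to_float(x):
    """Model a spill: round a double to IEEE single precision and reload it."""
    return struct.unpack("f", struct.pack("f", x))[0]


def f(x):
    """Stand-in for a float function whose result carries extra precision."""
    return x / 3.0


x = 1.0
in_register = f(x)              # keeps all 53 significand bits
spilled = spill_to_float(f(x))  # extra bits discarded by the store

print(in_register == spilled)                  # prints False: f(x) != f(x)
print(spill_to_float(in_register) == spilled)  # prints True: both sides rounded
```

The second comparison rounds both sides to the declared type before
comparing, which is roughly the effect that storing every float to
memory has.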

The test program is a little different -- it uses f(x) == testdata[x]
on floats. For C, the test should be for inequality (or better, use
equality with extra-precision testdata[x]) for certain values of
FLT_EVAL_METHOD including the i386 one (actually the one in the missing
documentation for the weird i386 FLT_EVAL_METHOD), since there is no
way that a float testdata[x] can match an extra-precision f(x), and
there is normally no spill to lose any extra precision in f(x). The
C bugfeature requires returning any extra precision in the calculation
of f(x) (modulo i386's FLT_EVAL_METHOD allowing anything), and the
test is of cases where such extra precision actually occurs. Fortran
is unlikely to have the same bugfeature as C here, so it might require
equality.

Patience might be required waiting for this to be fixed ;-(. I've
been waiting for 20 years for it to be fixed in C so far. Fortran seems
to be using the C back end too much since it has the same bugs. Even
without the bugs, there might be semantic differences like C's bugfeature
for return values not being present in Fortran.

Bruce

Reply via E-mail
From:bf <bf2006a@yahoo.com>
Date:Wed, 9 Jul 2008 12:31:15 -0700 (PDT)
Thanks for all the valuable information, guys. math/blas is an old, standard library (most of it was written in 1978), and I suppose that single precision was more important at that time. The "absolute value" was the author's terminology -- I would have called it the l^1 norm -- but of course it's "equivalent", in the analytical sense, to the usual absolute value (the l^2 norm), and I suppose it's easier to evaluate.
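
To make the two norms concrete, here is a small Python sketch. It is an illustration only, not code from math/blas: the function name scabs1 simply mirrors the Fortran routine, and the bound |z| <= |Re z| + |Im z| <= sqrt(2) |z| is the standard norm-equivalence inequality for the plane.

```python
import math


def scabs1(z):
    """|Re z| + |Im z|, the quantity the BLAS routine SCABS1 computes."""
    return abs(z.real) + abs(z.imag)


z = complex(3.0, 4.0)
print(scabs1(z))  # l^1 norm, prints 7.0
print(abs(z))     # usual modulus (l^2 norm), prints 5.0

# Norm equivalence: the two differ by at most a factor of sqrt(2).
assert abs(z) <= scabs1(z) <= math.sqrt(2.0) * abs(z)
```

The l^1 form needs no multiplication or square root.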

I suppose then that we should leave the library as it is, except perhaps adding a compilation with -ffloat-store only for the two susceptible routines when they are compiled on vulnerable platforms -- pre-SSE i386 and m68k, am I missing any others?

Supposedly gcc 4.3+ have better control over precision than their predecessors. Right now the default choice (set in /usr/ports/Mk/bsd.gcc.mk ) of compiler for FreeBSD Fortran Ports is lang/gcc42, and the base system C compiler for allied C code. Would it be better if we were to change this default to lang/gcc43, and use some of the newer compiler's features to protect the unwary user on the vulnerable platforms? If we did so, what should we use -- the new -mpcXX compiler flags? FENV_ACCESS? And are you aware of any problems that may occur when mixing code compiled with lang/gcc4X and the base system C compiler? I notice that there are a few differences between gcc42 from lang/gcc42 and the base system C compiler. For instance, on RELENG7 the base system C compiler reports LDBL_MANT_DIG = 64 while lang/gcc42 gives LDBL_MANT_DIG = 53. I am not sure if this has consequences, or if there are other significant differences.

The modern incarnation of BLAS is at

http://crd.lbl.gov/~xiaoye/XBLAS/

I started to make a port for this candidate reference implementation of the new mixed- and extended-precision BLAS, but ran into a few problems on i386. Do you have any comments about their chosen method of juggling precisions, and how best to handle it on FreeBSD as it is now?

Regards,
b.




Reply via E-mail
From:Bruce Evans <brde@optusnet.com.au>
Date:Thu, 10 Jul 2008 00:22:20 +1000 (EST)
I just read the entire gcc bugzilla #323 thread (118 comments) and
many links. Some points of interest, especially to me:
- the guy from inria agrees with me and wants Linux to use the same
  precision hack as in FreeBSD. Linux used this in the early 90's but
  was changed to default to 64-bit precision as soon as gcc started
  pretending to support 80-bit long doubles.
- the Microsoft Visual C++ document pointed to in a comment shows
  that VC++ handled this better in 2005 (the doc is old (last update
  June 2004) but documents a 2005 version of VC++). VC++ apparently
  switched from a default of 64-bit precision back(?) to 53-bit
  precision in 2005, presumably to more or less avoid this problem
  in the usual case, as in FreeBSD. The document also shows almost
  correct but rather inefficient handling of the problem:
  - compiler flag fp:precise fixes all of the gcc precision bugs
    except for spilling:
    - assignments discard any extra precision as required by C99.
      fp:precise is claimed to be efficient, or at least as efficient
      as possible, but it cannot be efficient with this, except in
      code that uses complicated expressions to avoid assignments.
      It is indeed about as efficient as possible given the C99
      requirement.
    - casts discard any extra precision as required by C99. There
      must be some way to do this, and an explicit cast is as good as
      any.
    - function calls discard any extra precision as required by C99.
      gcc does this too (it is required by ABIs).
    - function returns discard any extra precision as required by
      POLA and possibly a future IEEE standard but prohibited by
      C99, efficiency and accuracy (the accuracy and probably the C99
      requirement for not discarding here is that expressions in return
      statements shouldn't have a different evaluation method than
      expressions in other statements).
    - spilling of intermediate results that have extra precision is
      completely broken, as in gcc. It's strange to fix assignment,
      which has a large runtime cost but a small POLA cost, while not
      fixing this, which has a small runtime cost and a large POLA
      cost.
  - compiler flag fp:fast mode is like gcc's -ffast-math.
  - compiler flag fp:strict is fp:precise, plus the strictness required
    for fenv access including exceptions, plus no contraction (no fma...).
  - C99 pragmas and extensions to control all this are implemented.

On Wed, 9 Jul 2008, David Schultz wrote:

> On Wed, Jul 09, 2008, bf wrote:
> > COMPLEX Z1, Z2
> > REAL S1
>
> Aah, but the REAL type in FORTRAN is IEEE 754 single precision, so
> you do run into a similar issue. (The FPU is set to double
> precision in FreeBSD, so you get more precision than you asked for
> in intermediate calculations.) There's usually very little
> performance advantage to using single precision instead of double
> precision;

Only in practice. Single precision is 2 to 4 times faster in many
FreeBSD libm functions, and that is without much parallelism. This
is about half due to specialized algorithms and half due to reduced
memory traffic combined with easier optimization. With full
vectorization, float SSE can stream 2 to 4 times faster than double
SSE (2 times faster due to twice as many elements per operation and
another factor of 2 times faster on >= 2 year old CPUs where doubles are
not pipelined as well). Fortran is more vectorizable than C, so
single precision is probably more useful for efficiency in it.

> double precision is certainly a lot faster than using
> -ffloat-store.

Except in rare cases where the latency of -ffloat-store can be hidden.

> Another option is to tell the compiler to use SSE, or switch to
> amd64 where that is the default. Then you won't run into these
> issues.

Of course, not using extended precision defeats the point of having it.
Does Fortran even permit extra precision?

The default precision should be chosen by the library according to the
requirements of the language and its implementation, not by the kernel.

> > * SCABS1 computes absolute value of a complex number
> > *
> > * .. Intrinsic Functions ..
> > INTRINSIC ABS,AIMAG,REAL
> > * ..
> > SCABS1 = ABS(REAL(Z)) + ABS(AIMAG(Z))
>
> This is not the correct formula for the absolute value of a
> complex number, by the way.

:-).

Complex arithmetic also benefits from extra precision. Just implementing
multiplication efficiently and fairly accurately is almost impossible
without it (using it avoids all overflow possibilities for multiplications
and reduces cancellation errors significantly). Complex numbers are
ancient technology in Fortran so they are presumably used more in it.
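
Both effects can be seen in a small model. The Python sketch below is an
illustration under stated assumptions, not a description of what any
compiler or FPU does: single precision is emulated by rounding through
struct's "f" format, Python's double stands in for the wider format,
and the helper names to_f32, real_part_single and real_part_extra are
hypothetical. It computes only the real part ac - bd of (a+bi)(c+di).

```python
import math
import struct


def to_f32(x):
    """Round a Python float (double) to IEEE single; overflow gives +/-inf."""
    try:
        return struct.unpack("f", struct.pack("f", x))[0]
    except OverflowError:
        return math.copysign(math.inf, x)


def real_part_single(a, b, c, d):
    """Re((a+bi)(c+di)) with every intermediate rounded to single."""
    return to_f32(to_f32(a * c) - to_f32(b * d))


def real_part_extra(a, b, c, d):
    """Same value, intermediates kept in double, one final rounding."""
    return to_f32(a * c - b * d)


# Cancellation: ac and bd agree to 24 bits, so single loses the answer.
a, b, c, d = 1.0 + 2.0 ** -12, 1.0, 1.0 + 2.0 ** -12, 1.0 + 2.0 ** -11
print(real_part_single(a, b, c, d))  # prints 0.0
print(real_part_extra(a, b, c, d))   # prints 5.960464477539063e-08 (2**-24)

# Overflow: 2**140 does not fit in single, so inf - inf gives nan.
big = 2.0 ** 70
print(real_part_single(big, big, big, big))  # prints nan
print(real_part_extra(big, big, big, big))   # prints 0.0
```

All inputs are exactly representable in single precision, so the
differences come only from how the intermediates are rounded.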

Bruce

Unformatted:
 