NCCOPY(1)		       UNIDATA UTILITIES		     NCCOPY(1)

NAME
       nccopy  -  Copy a netCDF	file, optionally changing format, compression,
       or chunking in the output.

SYNOPSIS
       nccopy [-k  kind_name ] [-kind_code] [-d	 n ]  [-s]  [-c	  chunkspec  ]
	      [-u]  [-w]  [-[v|V] var1,...]  [-[g|G] grp1,...]	[-m  bufsize ]
	      [-h  chunk_cache ] [-e  cache_elems ] [-r] [-F  filterspec ] [-L
	      n	] [-M  n ]  infile  outfile

DESCRIPTION
       The  nccopy utility copies an input netCDF file in any supported	format
       variant to an output netCDF file, optionally converting the  output  to
       any compatible netCDF format variant, compressing the data, or rechunk-
       ing the data.  For example, if  built  with  the	 netCDF-3  library,  a
       netCDF  classic file may	be copied to a netCDF 64-bit offset file, per-
       mitting larger variables.  If built with	the netCDF-4 library, a	netCDF
       classic	file may be copied to a	netCDF-4 file or to a netCDF-4 classic
       model file as  well,  permitting	 data  compression,  efficient	schema
       changes,	larger variable	sizes, and use of other	netCDF-4 features.

       If  no  output  format  is  specified,  with  either  -k	 kind_name  or
       -kind_code, then	the output will	use the	same format as the input,  un-
       less  the input is classic or 64-bit offset and either chunking or com-
       pression	is specified, in which case the	output will be netCDF-4	 clas-
       sic  model format.  Attempting some kinds of format conversion will re-
       sult in an error, if the	conversion is not possible.  For  example,  an
       attempt to copy a netCDF-4 file that uses features of the enhanced mod-
       el, such	as groups or variable-length strings,  to  any	of  the	 other
       kinds  of  netCDF  formats that use the classic model will result in an
       error.

       nccopy also serves as an	example	of a generic  netCDF-4	program,  with
       its  ability  to	 read  any valid netCDF	file and handle	nested groups,
       strings,	and user-defined types,	including arbitrarily nested  compound
       types, variable-length types, and data of any valid netCDF-4 type.

       If  DAP	support	 was  enabled when nccopy was built, the file name may
       specify a DAP URL. This may be used to convert data on DAP  servers  to
       local netCDF files.

OPTIONS
	-k   kind_name
	      Use  format  name	to specify the kind of file to be created and,
	      by  inference,  the  data	 model	(i.e.  netcdf-3	 (classic)  or
	      netcdf-4 (enhanced)).  The possible arguments are:

		     'nc3' or 'classic'	=> netCDF classic format

		     'nc6' or '64-bit offset' => netCDF 64-bit offset format

		     'nc4'  or	'netCDF-4'  =>	netCDF-4 format	(enhanced data
		     model)

		     'nc7' or 'netCDF-4	classic	 model'	 =>  netCDF-4  classic
		     model format

	      Note: The old format numbers '1', '2', '3', and '4', equivalent
	      to the format names 'nc3', 'nc6', 'nc4', and 'nc7' respectively,
	      are still accepted but deprecated, because format numbers are
	      easily confused with format names.
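
	      A minimal sketch of choosing the output format by name and then
	      confirming it. The function name convert_kind and the file names
	      in the usage line are hypothetical; the sketch assumes ncdump(1)
	      is installed, whose -k option prints the kind of a netCDF file.

```shell
# convert_kind KIND INFILE OUTFILE
# Copy INFILE to OUTFILE in the format named by KIND (nc3, nc6, nc4, or nc7),
# then print the kind of the file that was written.
convert_kind() {
    nccopy -k "$1" "$2" "$3" && ncdump -k "$3"
}
```

	      For example, "convert_kind nc4 in.nc out.nc" copies in.nc and
	      then prints the kind of out.nc, provided the copy succeeds.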

	-kind_code
	      Use format numeric code (instead of format name) to specify  the
	      kind  of	file  to  be created and, by inference,	the data model
	      (i.e. netcdf-3 (classic) versus netcdf-4 (enhanced)).   The  nu-
	      meric codes are:

		     3 => netCDF classic format

		     6 => netCDF 64-bit offset format

		     4 => netCDF-4 format (enhanced data model)

		     7 => netCDF-4 classic model format

	      The numeric code "7" is used because "7=3+4", specifying the
	      format that uses the netCDF-3 data model for compatibility with
	      the netCDF-4 storage format for performance. Credit is due to
	      NCO for use of these numeric codes instead of the old and
	      confusing format numbers.

	-d   n
	      For netCDF-4 output, including netCDF-4 classic  model,  specify
	      deflation	level (level of	compression) for variable data output.
	      0	corresponds to no compression and 9  to	 maximum  compression,
	      with higher levels of compression	requiring marginally more time
	      to  compress  or	uncompress  than  lower	 levels.   Compression
	      achieved may also	depend on output chunking parameters.  If this
	      option is	specified for a	classic	format or 64-bit offset	format
	      input  file, it is not necessary to also specify that the	output
	      should be	netCDF-4 classic model,	as that	will be	 the  default.
	      If  this	option	is  not	 specified and the input file has com-
	      pressed variables, the compression will still  be	 preserved  in
	      the output, using	the same chunking as in	the input by default.

	      Note  that  nccopy requires all variables	to be compressed using
	      the same compression level, but the API has no such restriction.
	      With  a  program you can customize compression for each variable
	      independently.

	-s    For netCDF-4 output, including netCDF-4 classic  model,  specify
	      shuffling	of variable data bytes before compression or after de-
	      compression.  Shuffling refers to	 interlacing  of  bytes	 in  a
	      chunk  so	 that  the first bytes of all values are contiguous in
	      storage, followed	by all the second bytes, and so	on, which  of-
	      ten  improves compression.  This option is ignored unless	a non-
	      zero deflation level is specified.  Using	-d0 to specify no  de-
	      flation  on  input  data	that  has been compressed and shuffled
	      turns off	both compression and shuffling in the output.
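
	      A small sketch of combining -d and -s, with a guard that rejects
	      levels outside the 0 to 9 range described above. The function
	      name compress_copy and the file names are hypothetical.

```shell
# compress_copy LEVEL INFILE OUTFILE
# Copy INFILE to OUTFILE with deflation LEVEL (0..9) and byte shuffling.
compress_copy() {
    case "$1" in
        [0-9]) ;;                                   # a single digit, 0 through 9
        *) echo "deflation level must be 0..9" >&2
           return 2 ;;
    esac
    nccopy -d "$1" -s "$2" "$3"
}
```

	      For example, "compress_copy 4 foo.nc bar.nc". Running "ncdump
	      -hs bar.nc" afterwards should list the special attributes that
	      record the compression settings of each variable.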

	-u    Convert any unlimited size dimensions in the input to fixed size
	      dimensions  in the output.  This can speed up variable-at-a-time
	      access, but slow down record-at-a-time access to multiple	 vari-
	      ables along an unlimited dimension.

	-w    Keep output in memory (as a diskless netCDF file) until output
	      is closed, at which time the output file is written to disk.
	      This can greatly speed up operations such as converting
	      unlimited dimensions to fixed size (-u option), chunking,
	      rechunking, or compressing the input. It requires that
	      available memory is large enough to hold the output file. This
	      option may provide a larger speedup than careful tuning of the
	      -m, -h, or -e options, and it is certainly a lot simpler.

	-c  chunkspec
	      For netCDF-4 output, including netCDF-4 classic  model,  specify
	      chunking (multidimensional tiling) for variable data in the out-
	      put.  This is useful to specify the units	of disk	 access,  com-
	      pression,	 or  other  filters  such  as checksums.  Changing the
	      chunking in a netCDF file can also greatly speed up access, by
	      choosing	chunk  shapes that are appropriate for the most	common
	      access patterns.

	      The chunkspec argument has several forms.	The first form is  the
	      original,	deprecated form	and is a string	of comma-separated as-
	      sociations, each specifying a dimension name, a  '/'  character,
	      and  optionally  the  corresponding chunk	length for that	dimen-
	      sion.  No	blanks should appear in	the chunkspec  string,	except
	      possibly	escaped	 blanks	 that are part of a dimension name.  A
	      chunkspec	names at least one dimension, and may omit  dimensions
	      which  are  not  to  be  chunked	or for which the default chunk
	      length is	desired.  If a dimension name is  followed  by	a  '/'
	      character	 but  no subsequent chunk length, the actual dimension
	      length is	assumed.   If  copying	a  classic  model  file	 to  a
	      netCDF-4	output	file  and  not	naming	all  dimensions	in the
	      chunkspec, unnamed dimensions will also use the actual dimension
	      length  for  the	chunk  length.	 An example of a chunkspec for
	      variables	that use 'm' and 'n' dimensions	might be 'm/100,n/200'
	      to specify 100 by	200 chunks. To see the chunking	resulting from
	      copying with a chunkspec,	use the	'-s' option of ncdump  on  the
	      output file.
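
	      A sketch of rechunking with a per-dimension chunkspec and then
	      listing the resulting chunk sizes. The function name
	      rechunk_and_show is hypothetical, and the blank check is a
	      simplification: it also rejects the escaped blanks that are
	      permitted inside dimension names. The sketch assumes that
	      ncdump -s reports chunk sizes in a _ChunkSizes attribute.

```shell
# rechunk_and_show CHUNKSPEC INFILE OUTFILE
# Copy INFILE to OUTFILE using CHUNKSPEC, then print the chunk sizes
# that ncdump reports for the output file.
rechunk_and_show() {
    case "$1" in
        *" "*) echo "chunkspec must not contain blanks" >&2
               return 2 ;;
    esac
    nccopy -c "$1" "$2" "$3" && ncdump -hs "$3" | grep '_ChunkSizes'
}
```

	      For example, "rechunk_and_show m/100,n/200 in.nc out.nc".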

	      The chunkspec '/'	that omits all dimension names and correspond-
	      ing chunk	lengths	specifies that no chunking is to occur in  the
	      output, so can be used to unchunk all the chunked variables.

	      As  an  I/O optimization,	nccopy has a threshold for the minimum
	      size of non-record variables that	get  chunked,  currently  8192
	      bytes. The -M flag can be	used to	override this value.

	      Note that nccopy requires variables that share a dimension to
	      also share the chunk size associated with that dimension, but
	      the programming interface has no such restriction. If you need
	      to customize chunking for variables independently, you will need
	      to use the second form of chunkspec.

	      The second form of chunkspec has this syntax: var:n1,n2,...,nn.
	      This assumes that the variable named "var" has rank n. The
	      chunking to be applied to each dimension of the variable is
	      specified by the values of n1 through nn. This second form of
	      chunking specification can be repeated multiple times to specify
	      the exact chunking for different variables. If the variable is
	      specified but no chunk sizes are specified (i.e. -c var:), then
	      chunking is disabled for that variable. If the same variable is
	      specified more than once, the second and later specifications
	      are ignored. Also, this second form, per-variable chunking,
	      takes precedence over any per-dimension chunking except the bare
	      "/" case.

	      The third	form of	the chunkspec has the syntax:  var:compact  or
	      var:contiguous.	This  explicitly  attempts to set the variable
	      storage type as compact or contiguous, respectively.  These  may
	      be overridden if other flags require the variable	to be chunked.

	-v   var1,...
	      The output will include data values for the specified variables,
	      in addition to the declarations of  all  dimensions,  variables,
	      and  attributes. One or more variables must be specified by name
	      in the comma-delimited list following this option. The list must
	      be  a  single  argument to the command, hence cannot contain un-
	      escaped blanks or	other white space characters. The named	 vari-
	      ables  must be valid netCDF variables in the input-file. A vari-
	      able within a group in a netCDF-4	file may be specified with  an
	      absolute	path  name,  such  as "/GroupA/GroupA2/var".  Use of a
	      relative path name such as  'var'	 or  "grp/var"	specifies  all
	      matching	variable names in the file.  The default, without this
	      option, is to include data values	for   all   variables  in  the
	      output.

	-V   var1,...
	      The output will include only the specified variables, along with
	      all dimensions and global or group attributes. One or more variables
	      must  be specified by name in the	comma-delimited	list following
	      this option. The list must be a single argument to the  command,
	      hence cannot contain unescaped blanks or other white space char-
	      acters. The named	variables must be valid	 netCDF	 variables  in
	      the input-file. A	variable within	a group	in a netCDF-4 file may
	      be   specified   with   an   absolute   path   name,   such   as
	      '/GroupA/GroupA2/var'.   Use  of	a  relative  path name such as
	      'var' or 'grp/var' specifies all matching	variable names in  the
	      file.   The  default,  without  this  option, is to include  all
	      variables	in the output.
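
	      Because the variable list must be a single comma-delimited
	      argument, a wrapper can build it from separate names. This is a
	      sketch only; the function name copy_only_vars and the variable
	      names in the usage line are hypothetical.

```shell
# copy_only_vars INFILE OUTFILE VAR...
# Join the variable names with commas and pass them to -V as one argument.
copy_only_vars() {
    src=$1
    dst=$2
    shift 2
    # "$*" joins the remaining arguments with the first character of IFS.
    list=$(IFS=,; printf '%s' "$*")
    nccopy -V "$list" "$src" "$dst"
}
```

	      For example, "copy_only_vars in.nc out.nc lat lon time" runs
	      nccopy with "-V lat,lon,time".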

	-g   grp1,...
	      The output will include  data  values  only  for	the  specified
	      groups.	One  or	 more  groups must be specified	by name	in the
	      comma-delimited list following this option. The list must	 be  a
	      single  argument	to the command.	The named groups must be valid
	      netCDF groups in the input-file. The default, without  this  op-
	      tion, is to include data values for all groups in	the output.

	-G   grp1,...
	      The  output will include only the	specified groups.  One or more
	      groups must be specified by name	in  the	 comma-delimited  list
	      following	this option. The list must be a	single argument	to the
	      command. The named groups	must be	valid netCDF groups in the in-
	      put-file.	 The  default,	without	this option, is	to include all
	      groups in	the output.

	-m   bufsize
	      An integer or floating-point number that specifies the size,  in
	      bytes,  of the copy buffer used to copy large variables.	A suf-
	      fix of K,	M, G, or T multiplies the  copy	 buffer	 size  by  one
	      thousand,	 million, billion, or trillion,	respectively.  The de-
	      fault is 5 Mbytes, but will be increased if necessary to hold at
	      least one	chunk of netCDF-4 chunked variables in the input file.
	      You may want to specify a	value  larger  than  the  default  for
	      copying  large files over	high latency networks.	Using the '-w'
	      option may provide better	performance, if	 the  output  fits  in
	      memory.
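
	      To make the suffix arithmetic concrete, the sketch below expands
	      a size such as 5M into bytes using the decimal multipliers
	      described above. The function name to_bytes is hypothetical, and
	      the sketch handles integer sizes only, whereas nccopy also
	      accepts floating-point numbers.

```shell
# to_bytes SIZE
# Expand an integer SIZE with an optional K, M, G, or T suffix into bytes.
to_bytes() {
    num=${1%[KMGT]}                          # strip one trailing suffix letter
    case "$1" in
        *K) echo $((num * 1000)) ;;
        *M) echo $((num * 1000000)) ;;
        *G) echo $((num * 1000000000)) ;;
        *T) echo $((num * 1000000000000)) ;;
        *)  echo "$num" ;;
    esac
}

to_bytes 5M      # prints 5000000, the default copy buffer size
```

	      A larger buffer would then be requested with, for example,
	      "nccopy -m 64M big.nc copy.nc".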

	-h   chunk_cache
	      For  netCDF-4 output, including netCDF-4 classic model, an inte-
	      ger or floating-point number that	specifies the size in bytes of
	      chunk  cache allocated for each chunked variable.	 This is not a
	      property of the file, but	merely a performance tuning  parameter
	      for avoiding compressing or decompressing	the same data multiple
	      times while copying and changing chunk shapes.  A	suffix	of  K,
	      M, G, or T multiplies the	chunk cache size by one	thousand, mil-
	      lion,  billion,  or  trillion,  respectively.   The  default  is
	      4.194304	Mbytes	(or  whatever was specified for	the configure-
	      time constant  CHUNK_CACHE_SIZE  when  the  netCDF  library  was
	      built).  Ideally,	the nccopy utility should accept only one mem-
	      ory buffer size and divide it optimally between  a  copy	buffer
	      and  chunk cache,	but no general algorithm for computing the op-
	      timum chunk cache	size has been implemented yet. Using the  '-w'
	      option  may  provide  better  performance, if the	output fits in
	      memory.
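
	      When picking a value for -h, it can help to know how many bytes
	      one output chunk occupies. The sketch below multiplies an
	      element size by the chunk lengths; the function name chunk_bytes
	      is hypothetical, and the 4-byte element size in the usage line
	      is an assumption about the data type.

```shell
# chunk_bytes ELEM_SIZE LEN...
# Print the uncompressed size in bytes of one chunk whose lengths are LEN...
chunk_bytes() {
    total=$1
    shift
    for n in "$@"; do
        total=$((total * n))
    done
    echo "$total"
}

chunk_bytes 4 1000 40 40    # prints 6400000
```

	      A 1000 by 40 by 40 chunk of 4-byte values is therefore larger
	      than the 4.194304 Mbyte default, so a setting such as "-h 8M"
	      may be a reasonable starting point for that chunk shape. This is
	      a rule of thumb rather than documented nccopy guidance.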

	-e   cache_elems
	      For netCDF-4 output, including netCDF-4 classic model, specifies
	      number  of  chunks that the chunk	cache can hold.	A suffix of K,
	      M, G, or T multiplies the	number of chunks that can be  held  in
	      the  cache  by  one thousand, million, billion, or trillion, re-
	      spectively.  This	is not a property of the file,	but  merely  a
	      performance  tuning parameter for	avoiding compressing or	decom-
	      pressing the same	data multiple times while copying and changing
	      chunk  shapes.   The  default is 1009 (or	whatever was specified
	      for the  configure-time  constant	 CHUNK_CACHE_NELEMS  when  the
	      netCDF  library  was built).  Ideally, the nccopy	utility	should
	      determine	an optimum value for this parameter,  but  no  general
	      algorithm	 for  computing	the optimum number of chunk cache ele-
	      ments has	been implemented yet.

	-r    Read netCDF classic or 64-bit offset input file into a  diskless
	      netCDF  file in memory before copying.  Requires that input file
	      be small enough to fit into memory.  For	nccopy,	 this  doesn't
	      seem  to provide any significant speedup,	so may not be a	useful
	      option.

	-L  n Set the log level; only usable if	nccopy supports	netCDF-4  (en-
	      hanced).

	-M  n Set  the	minimum	 chunk	size;  only  usable if nccopy supports
	      netCDF-4 (enhanced).

	-F  filterspec
	      For netCDF-4 output, including netCDF-4 classic model, specify a
	      filter  to  apply	to a specified set of variables	in the output.
	      As a rule, the filter is a  compression/decompression  algorithm
	      with  a unique numeric identifier	assigned by the	HDF Group (see
	      https://support.hdfgroup.org/services/filters.html).

	      The filterspec argument has this general form:

		     fqn1|fqn2...,filterid,param1,param2...paramn

	      or

		     *,filterid,param1,param2...paramn

	      An fqn (fully qualified name) is the name of a variable prefixed
	      by its containing groups, with the group names separated by
	      forward slashes ('/'). An example might be /g1/g2/var.
	      Alternatively, just the variable name can be given if it is in
	      the root group: e.g. var. Backslash escapes may be used as
	      needed. A note of warning: the '|' separator is a bash reserved
	      character, so you will probably need to put the filter spec in
	      some kind of quotes or otherwise escape it.

	      The filterid is an unsigned positive integer representing	the id
	      assigned by the HDF Group to the filter. Following the id is a
	      sequence	of  parameters	defining  the operation	of the filter.
	      Each parameter is	a 32-bit unsigned integer.

	      This option may be repeated multiple times with different
	      variable names.
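
	      A sketch of passing a filterspec safely through the shell. The
	      function name filter_copy and the variable names temp and pres
	      are hypothetical. The usage line assumes that a bzip2 filter
	      plugin (id 307 in the HDF Group registry) is installed where the
	      library can find it, with 9 as its single parameter.

```shell
# filter_copy FILTERSPEC INFILE OUTFILE
# Pass FILTERSPEC to -F as one quoted argument so that the shell does not
# interpret the '|' separator as a pipe.
filter_copy() {
    nccopy -F "$1" "$2" "$3"
}
```

	      For example, "filter_copy 'temp|pres,307,9' in.nc out.nc". The
	      single quotes at the call site are what protect the '|'.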

EXAMPLES
       Make a copy of foo1.nc, a netCDF	file of	any type, to foo2.nc, a	netCDF
       file of the same	type:

	      nccopy foo1.nc foo2.nc

       Note that the above copy	will not be as fast as use of cp or other sim-
       ple copy	utility, because the file is copied using only the netCDF API.
       If the input file has extra bytes after the end	of  the	 netCDF	 data,
       those  will  not	be copied, because they	are not	accessible through the
       netCDF interface.  If the original file was generated in	"No fill" mode
       so  that	fill values are	not stored for padding for data	alignment, the
       output file may have different padding bytes.

       Convert a netCDF-4 classic model	file, compressed.nc,  that  uses  com-
       pression, to a netCDF-3 file classic.nc:

	      nccopy -k	classic	compressed.nc classic.nc

       Note that 'nc3' could be	used instead of	'classic'.

       Download	the variable 'time_bnds' and its associated attributes from an
       OPeNDAP server and copy the result to a netCDF file named 'tb.nc':

	      nccopy 'http://test.opendap.org/opendap/data/nc/sst.mnmean.nc.gz?time_bnds' tb.nc

       Note  that  URLs	that name specific variables as	command-line arguments
       should generally	be quoted, to avoid  the  shell	 interpreting  special
       characters such as '?'.

       Compress	 all  the variables in the input file foo.nc, a	netCDF file of
       any type, to the	output file bar.nc:

	      nccopy -d1 foo.nc	bar.nc

       If foo.nc was a classic or 64-bit offset	netCDF file, bar.nc will be  a
       netCDF-4	classic	model netCDF file, because the classic and 64-bit off-
       set format  variants  don't  support  compression.   If	foo.nc	was  a
       netCDF-4	 file  with  some variables compressed using various deflation
       levels, the output will also be a netCDF-4 file of the same  type,  but
       all  the	 variables, including any uncompressed variables in the	input,
       will now	use deflation level 1.

       Assume the input	data includes gridded variables	that  use  time,  lat,
       lon  dimensions,	 with 1000 times by 1000 latitudes by 1000 longitudes,
       and that	the time dimension varies most slowly.	Also assume that users
       want  quick  access  to	data  at  all times for	a small	set of lat-lon
       points.	Accessing data for 1000	times would typically require  access-
       ing 1000	disk blocks, which may be slow.

       Reorganizing  the  data	into  chunks on	disk that have all the time in
       each chunk for a	few lat	and lon	coordinates  would  greatly  speed  up
       such  access.   To  chunk  the data in the input	file slow.nc, a	netCDF
       file of any type, to the output file fast.nc, you could use:

	      nccopy -c	time/1000,lat/40,lon/40	slow.nc	fast.nc

       to specify data chunks of 1000 times, 40	latitudes, and 40  longitudes.
       If you had enough memory	to contain the output file, you	could speed up
       the rechunking operation	significantly by creating the output in	memory
       before writing it to disk on close (using the -w	flag):

	      nccopy -w	-c time/1000,lat/40,lon/40 slow.nc fast.nc

       Alternatively, one could write this using the alternate,
       variable-specific chunking specification, assuming that time, lat,
       and lon are variables:

	      nccopy -c	time:1000 -c lat:40 -c lon:40 slow.nc fast.nc

CHUNKING RULES
       The complete set	of chunking rules is captured here.  As	a rough	summa-
       ry, these rules preserve	all chunking properties	from the  input	 file.
       These  rules apply only when the	selected output	format supports	chunk-
       ing, i.e. for the netcdf-4 variants.

       The variable specific chunking  specification  should  be  obvious  and
       translates  directly  to	 the  corresponding  "nc_def_var_chunking" API
       call.

       The original per-dimension chunking specification requires some
       interpretation by nccopy. The following rules are applied in the
       given order, independently for each variable to be copied from input
       to output. The rules are written assuming we are trying to determine
       the chunking for a given output variable Vout that comes from an
       input variable Vin.

       1.     If there is no '-c' option that applies to a  variable  and  the
	      corresponding  input variable is contiguous or the input is some
	      netcdf-3 variant,	then let the netcdf-c library make all	chunk-
	      ing decisions.

       2.     For  each	 dimension of Vout explicitly specified	on the command
	      line (using the '-c' option), apply the chunking value for  that
	      dimension	regardless of input format or input properties.

       3.     For  dimensions  of Vout not named on the	command	line in	a '-c'
	      option, preserve chunk sizes from	the corresponding input	 vari-
	      able, if it is chunked.

       4.     If  Vin  is  contiguous, and none	of its dimensions are named on
	      the command line,	and chunking is	not mandated by	other options,
	      then make	Vout be	contiguous.

       5.     If the input variable is contiguous (or is some netcdf-3
	      variant) and there are no options requiring chunking, or the '/'
	      special case for the '-c' option is specified, then the output
	      variable Vout is marked as contiguous.

       6.     Final, default case: some	or all chunk sizes are not  determined
	      by  the  command	line  or the input variable. This includes the
	      non-chunked input	cases such as  netcdf-3,  cdf5,	 and  DAP.  In
	      these cases retain all chunk sizes determined by previous	rules,
	      and use the full dimension size as the default. The exception is
	      unlimited	dimensions, where the default is 4 megabytes.
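
       The precedence among rules 2, 3, and 6 for a single fixed-size
       dimension can be modelled as below. This is a simplified illustration,
       not nccopy's own code: the function name resolve_chunk is hypothetical,
       and the contiguous cases and unlimited dimensions are left out.

```shell
# resolve_chunk CMDLINE_LEN INPUT_CHUNK_LEN DIM_LEN
# Pass an empty string for a value that is not available.
resolve_chunk() {
    if [ -n "$1" ]; then
        echo "$1"        # rule 2: a length named with -c wins
    elif [ -n "$2" ]; then
        echo "$2"        # rule 3: otherwise keep the input chunk length
    else
        echo "$3"        # rule 6: otherwise use the full dimension length
    fi
}

resolve_chunk 40 25 1000    # prints 40
resolve_chunk "" 25 1000    # prints 25
resolve_chunk "" "" 1000    # prints 1000
```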

SEE ALSO
       ncdump(1), ncgen(1), netcdf(3)

Release	4.2			  2012-03-08			     NCCOPY(1)
