NCCOPY(1)		       UNIDATA UTILITIES		     NCCOPY(1)

NAME
       nccopy  -  Copy a netCDF	file, optionally changing format, compression,
       or chunking in the output.

SYNOPSIS
       nccopy [-k  kind_name ] [-kind_code] [-d	 n ]  [-s]  [-c	  chunkspec  ]
	      [-u]  [-w]  [-[v|V] var1,...]  [-[g|G] grp1,...]	[-m  bufsize ]
	      [-h  chunk_cache ] [-e  cache_elems ] [-r] [-F  filterspec ] [-L
	      n	] [-M  n ]  infile  outfile

DESCRIPTION
       The  nccopy utility copies an input netCDF file in any supported	format
       variant to an output netCDF file, optionally converting the  output  to
       any compatible netCDF format variant, compressing the data, or rechunk-
       ing the data.  For example, if  built  with  the	 netCDF-3  library,  a
       netCDF  classic file may	be copied to a netCDF 64-bit offset file, per-
       mitting larger variables.  If built with	the netCDF-4 library, a	netCDF
       classic	file may be copied to a	netCDF-4 file or to a netCDF-4 classic
       model file as  well,  permitting	 data  compression,  efficient	schema
       changes,	larger variable	sizes, and use of other	netCDF-4 features.

       If  no  output  format  is  specified,  with  either  -k	 kind_name  or
       -kind_code, then	the output will	use the	same format as the input,  un-
       less  the input is classic or 64-bit offset and either chunking or com-
       pression	is specified, in which case the	output will be netCDF-4	 clas-
       sic  model format.  Attempting some kinds of format conversion will re-
       sult in an error, if the	conversion is not possible.  For  example,  an
       attempt to copy a netCDF-4 file that uses features of the enhanced mod-
       el, such	as groups or variable-length strings,  to  any	of  the	 other
       kinds  of  netCDF  formats that use the classic model will result in an
       error.

       nccopy also serves as an	example	of a generic  netCDF-4	program,  with
       its  ability  to	 read  any valid netCDF	file and handle	nested groups,
       strings,	and user-defined types,	including arbitrarily nested  compound
       types, variable-length types, and data of any valid netCDF-4 type.

       If  DAP	support	 was  enabled when nccopy was built, the file name may
       specify a DAP URL. This may be used to convert data on DAP  servers  to
       local netCDF files.

OPTIONS
	-k   kind_name
	      Use  format  name	to specify the kind of file to be created and,
	      by  inference,  the  data	 model	(i.e.  netcdf-3	 (classic)  or
	      netcdf-4 (enhanced)).  The possible arguments are:

		     'nc3' or 'classic'	=> netCDF classic format

		     'nc6' or '64-bit offset' => netCDF 64-bit offset format

		     'nc4'  or	'netCDF-4'  =>	netCDF-4 format	(enhanced data
		     model)

		     'nc7' or 'netCDF-4	classic	 model'	 =>  netCDF-4  classic
		     model format

	      Note:  The  old format numbers '1', '2', '3', '4', equivalent to
	      the format names 'nc3', 'nc6', 'nc4', or 'nc7' respectively, are
	      also  still  accepted  but deprecated, due to easy confusion be-
	      tween format numbers and format names.
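
	      The correspondence in the note above can be sketched as a small
	      shell helper.  This is only an illustration: kind_name_for is a
	      hypothetical function, not part of nccopy, and in.nc and out.nc
	      are placeholder file names.  The last line prints the command
	      rather than running it.

```sh
# kind_name_for is a hypothetical helper, not part of nccopy.
# It maps a deprecated format number to the equivalent -k format name.
kind_name_for() {
    case "$1" in
        1) echo nc3 ;;   # netCDF classic format
        2) echo nc6 ;;   # netCDF 64-bit offset format
        3) echo nc4 ;;   # netCDF-4 format (enhanced data model)
        4) echo nc7 ;;   # netCDF-4 classic model format
        *) echo "unknown format number: $1" >&2; return 1 ;;
    esac
}

# Dry run: print the command instead of executing it.
echo nccopy -k "$(kind_name_for 3)" in.nc out.nc
```

	      Note that the old format number '3' selects netCDF-4, while the
	      numeric kind code 3 selects the classic format, which is the
	      kind of confusion the deprecation is meant to avoid.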

       [-kind_code]
	      Use format numeric code (instead of format name) to specify  the
	      kind  of	file  to  be created and, by inference,	the data model
	      (i.e. netcdf-3 (classic) versus netcdf-4 (enhanced)).   The  nu-
	      meric codes are:

		     3 => netCDF classic format

		     6 => netCDF 64-bit offset format

		     4 => netCDF-4 format (enhanced data model)

		     7 => netCDF-4 classic model format

	      The numeric code "7" is used because "7=3+4", specifying the
	      format that uses the netCDF-3 data model for compatibility with
	      the netCDF-4 storage format for performance.  Credit is due to
	      NCO for use of these numeric codes instead of the old and
	      confusing format numbers.

	-d   n
	      For netCDF-4 output, including netCDF-4 classic  model,  specify
	      deflation	level (level of	compression) for variable data output.
	      0	corresponds to no compression and 9  to	 maximum  compression,
	      with higher levels of compression	requiring marginally more time
	      to  compress  or	uncompress  than  lower	 levels.   Compression
	      achieved may also	depend on output chunking parameters.  If this
	      option is	specified for a	classic	format or 64-bit offset	format
	      input  file, it is not necessary to also specify that the	output
	      should be	netCDF-4 classic model,	as that	will be	 the  default.
	      If  this	option	is  not	 specified and the input file has com-
	      pressed variables, the compression will still  be	 preserved  in
	      the output, using	the same chunking as in	the input by default.

	      Note  that  nccopy requires all variables	to be compressed using
	      the same compression level, but the API has no such restriction.
	      With  a  program you can customize compression for each variable
	      independently.

	-s    For netCDF-4 output, including netCDF-4 classic  model,  specify
	      shuffling	of variable data bytes before compression or after de-
	      compression.  Shuffling refers to	 interlacing  of  bytes	 in  a
	      chunk  so	 that  the first bytes of all values are contiguous in
	      storage, followed	by all the second bytes, and so	on, which  of-
	      ten  improves compression.  This option is ignored unless	a non-
	      zero deflation level is specified.  Using	-d0 to specify no  de-
	      flation  on  input  data	that  has been compressed and shuffled
	      turns off	both compression and shuffling in the output.
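
	      As a rough sketch of how -d and -s combine, the helper below
	      builds the two options from a deflation level and a shuffle
	      request.  compression_args is a hypothetical function, not part
	      of nccopy, and foo.nc and bar.nc are placeholder names.  The
	      command is printed, not executed.

```sh
# compression_args is a hypothetical helper, not part of nccopy.
# Usage: compression_args LEVEL SHUFFLE   (LEVEL is 0-9, SHUFFLE is yes or no)
compression_args() {
    level=$1
    shuffle=$2
    case "$level" in
        [0-9]) ;;
        *) echo "deflation level must be 0-9: $level" >&2; return 1 ;;
    esac
    if [ "$shuffle" = yes ] && [ "$level" -gt 0 ]; then
        echo "-d$level -s"
    else
        # -s is ignored without a nonzero deflation level, so leave it out.
        echo "-d$level"
    fi
}

# Dry run: print the command instead of executing it.
# The substitution is unquoted on purpose so the options split into words.
echo nccopy $(compression_args 5 yes) foo.nc bar.nc
```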

	-u    Convert any unlimited size dimensions in the input to fixed size
	      dimensions  in the output.  This can speed up variable-at-a-time
	      access, but slow down record-at-a-time access to multiple	 vari-
	      ables along an unlimited dimension.

	-w    Keep output in memory (as a diskless netCDF file) until output
	      is closed, at which time the output file is written to disk.
	      This can greatly speed up operations such as converting an
	      unlimited dimension to fixed size (-u option), chunking,
	      rechunking, or compressing the input.  It requires that
	      available memory is large enough to hold the output file.  This
	      option may provide a larger speedup than careful tuning of the
	      -m, -h, or -e options, and it is certainly a lot simpler.

	-c  chunkspec
	      For netCDF-4 output, including netCDF-4 classic model, specify
	      chunking (multidimensional tiling) for variable data in the
	      output.  This is useful to specify the units of disk access,
	      compression, or other filters such as checksums.  Changing the
	      chunking in a netCDF file can also greatly speed up access, by
	      choosing chunk shapes that are appropriate for the most common
	      access patterns.

	      The chunkspec argument has two forms.  The  first	 form  is  the
	      original,	deprecated form	and is a string	of comma-separated as-
	      sociations, each specifying a dimension name, a  '/'  character,
	      and  optionally  the  corresponding chunk	length for that	dimen-
	      sion.  No	blanks should appear in	the chunkspec  string,	except
	      possibly	escaped	 blanks	 that are part of a dimension name.  A
	      chunkspec	names at least one dimension, and may omit  dimensions
	      which  are  not  to  be  chunked	or for which the default chunk
	      length is	desired.  If a dimension name is  followed  by	a  '/'
	      character	 but  no subsequent chunk length, the actual dimension
	      length is	assumed.   If  copying	a  classic  model  file	 to  a
	      netCDF-4	output	file  and  not	naming	all  dimensions	in the
	      chunkspec, unnamed dimensions will also use the actual dimension
	      length  for  the	chunk  length.	 An example of a chunkspec for
	      variables	that use 'm' and 'n' dimensions	might be 'm/100,n/200'
	      to specify 100 by	200 chunks. To see the chunking	resulting from
	      copying with a chunkspec,	use the	'-s' option of ncdump  on  the
	      output file.

	      The chunkspec '/'	that omits all dimension names and correspond-
	      ing chunk	lengths	specifies that no chunking is to occur in  the
	      output, so can be used to unchunk all the chunked variables.

	      As  an  I/O optimization,	nccopy has a threshold for the minimum
	      size of non-record variables that	get  chunked,  currently  8192
	      bytes. The -M flag can be	used to	override this value.

	      Note that nccopy requires variables that share a dimension to
	      also share the chunk size associated with that dimension, but
	      the programming interface has no such restriction.  If you need
	      to customize chunking for variables independently, you will need
	      to use the second form of chunkspec.

	      The second form of chunkspec has this syntax: var:n1,n2,...,nn.
	      This assumes that the variable named "var" has rank n.  The
	      chunking to be applied to each dimension of the variable is
	      specified by the values of n1 through nn.  This second form of
	      chunking specification can be repeated multiple times to specify
	      the exact chunking for different variables.  If the variable is
	      specified but no chunk sizes are specified (i.e. -c var:), then
	      chunking is disabled for that variable.  If the same variable is
	      specified more than once, the second and later specifications
	      are ignored.  Also, this second form, per-variable chunking,
	      takes precedence over any per-dimension chunking except the bare
	      "/" case.
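
	      The two chunkspec forms can be assembled mechanically.  The
	      sketch below uses two hypothetical helpers (dim_chunkspec and
	      var_chunkspec, neither part of nccopy) and a made-up variable
	      name, temp, to build each form.  The commands are printed, not
	      executed.

```sh
# Hypothetical helpers, not part of nccopy.

# dim_chunkspec NAME LENGTH [NAME LENGTH ...]
# Joins pairs into the per-dimension form, e.g. m/100,n/200.
dim_chunkspec() {
    spec=""
    while [ "$#" -ge 2 ]; do
        spec="${spec:+$spec,}$1/$2"
        shift 2
    done
    printf '%s\n' "$spec"
}

# var_chunkspec VAR [N1 N2 ...]
# Joins a variable name and chunk lengths into the form var:n1,n2,...
# With no lengths it yields "var:", which disables chunking for VAR.
var_chunkspec() {
    var=$1
    shift
    lens=""
    for n in "$@"; do
        lens="${lens:+$lens,}$n"
    done
    printf '%s\n' "$var:$lens"
}

# Dry run: print the commands instead of executing them.
echo nccopy -c "$(dim_chunkspec m 100 n 200)" in.nc out.nc
echo nccopy -c "$(var_chunkspec temp 10 40 40)" in.nc out.nc
```

	      Because neither form may contain unescaped blanks, building the
	      string in one place and passing it as a single quoted argument
	      avoids accidental word splitting.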

	-v   var1,...
	      The output will include data values for the specified variables,
	      in  addition  to	the declarations of all	dimensions, variables,
	      and attributes. One or more variables must be specified by  name
	      in the comma-delimited list following this option. The list must
	      be a single argument to the command, hence  cannot  contain  un-
	      escaped  blanks or other white space characters. The named vari-
	      ables must be valid netCDF variables in the input-file. A	 vari-
	      able  within a group in a	netCDF-4 file may be specified with an
	      absolute path name, such as  "/GroupA/GroupA2/var".   Use	 of  a
	      relative	path  name  such  as  'var' or "grp/var" specifies all
	      matching variable	names in the file.  The	default, without  this
	      option,  is  to  include	data values for	 all  variables	in the
	      output.

	-V   var1,...
	      The output will include the specified variables only but all di-
	      mensions	and  global or group attributes. One or	more variables
	      must be specified	by name	in the comma-delimited list  following
	      this  option. The	list must be a single argument to the command,
	      hence cannot contain unescaped blanks or other white space char-
	      acters.  The  named  variables must be valid netCDF variables in
	      the input-file. A	variable within	a group	in a netCDF-4 file may
	      be   specified   with   an   absolute   path   name,   such   as
	      '/GroupA/GroupA2/var'.  Use of a	relative  path	name  such  as
	      'var'  or	'grp/var' specifies all	matching variable names	in the
	      file.  The default, without this	option,	 is  to	 include   all
	      variables	in the output.

	-g   grp1,...
	      The  output  will	 include  data	values	only for the specified
	      groups.  One or more groups must be specified  by	 name  in  the
	      comma-delimited  list  following this option. The	list must be a
	      single argument to the command. The named	groups must  be	 valid
	      netCDF  groups  in the input-file. The default, without this op-
	      tion, is to include data values for all groups in	the output.

	-G   grp1,...
	      The output will include only the specified groups.  One or  more
	      groups  must  be	specified  by name in the comma-delimited list
	      following	this option. The list must be a	single argument	to the
	      command. The named groups	must be	valid netCDF groups in the in-
	      put-file.	The default, without this option, is  to  include  all
	      groups in	the output.

	-m   bufsize
	      An  integer or floating-point number that	specifies the size, in
	      bytes, of	the copy buffer	used to	copy large variables.  A  suf-
	      fix  of  K,  M,  G,  or T	multiplies the copy buffer size	by one
	      thousand,	million, billion, or trillion, respectively.  The  de-
	      fault is 5 Mbytes, but will be increased if necessary to hold at
	      least one	chunk of netCDF-4 chunked variables in the input file.
	      You  may	want  to  specify  a value larger than the default for
	      copying large files over high latency networks.  Using the  '-w'
	      option  may  provide  better  performance, if the	output fits in
	      memory.
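
	      To make the suffix arithmetic concrete, the sketch below expands
	      a size such as 5M into bytes using the decimal multipliers
	      described above.  expand_size is a hypothetical helper for
	      illustration only; nccopy parses the suffix itself, and this
	      version handles integer values only, not floating-point ones.

```sh
# expand_size is a hypothetical helper, not part of nccopy.
# It expands an integer with an optional K, M, G, or T suffix into bytes.
expand_size() {
    num=${1%[KMGT]}          # strip one trailing suffix letter, if present
    case "$1" in
        *K) mult=1000 ;;
        *M) mult=1000000 ;;
        *G) mult=1000000000 ;;
        *T) mult=1000000000000 ;;
        *)  mult=1 ;;
    esac
    echo $((num * mult))
}

expand_size 5M      # prints 5000000
expand_size 8192    # prints 8192
```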

	-h   chunk_cache
	      For netCDF-4 output, including netCDF-4 classic model, an	 inte-
	      ger or floating-point number that	specifies the size in bytes of
	      chunk cache allocated for	each chunked variable.	This is	not  a
	      property	of the file, but merely	a performance tuning parameter
	      for avoiding compressing or decompressing	the same data multiple
	      times  while  copying and	changing chunk shapes.	A suffix of K,
	      M, G, or T multiplies the	chunk cache size by one	thousand, mil-
	      lion,  billion,  or  trillion,  respectively.   The  default  is
	      4.194304 Mbytes (or whatever was specified  for  the  configure-
	      time  constant  CHUNK_CACHE_SIZE	when  the  netCDF  library was
	      built).  Ideally,	the nccopy utility should accept only one mem-
	      ory  buffer  size	 and divide it optimally between a copy	buffer
	      and chunk	cache, but no general algorithm	for computing the  op-
	      timum  chunk cache size has been implemented yet.	Using the '-w'
	      option may provide better	performance, if	 the  output  fits  in
	      memory.

	-e   cache_elems
	      For netCDF-4 output, including netCDF-4 classic model, specifies
	      number of	chunks that the	chunk cache can	hold. A	suffix	of  K,
	      M,  G,  or T multiplies the number of chunks that	can be held in
	      the cache	by one thousand, million, billion,  or	trillion,  re-
	      spectively.   This  is  not a property of	the file, but merely a
	      performance tuning parameter for avoiding	compressing or	decom-
	      pressing the same	data multiple times while copying and changing
	      chunk shapes.  The default is 1009 (or  whatever	was  specified
	      for  the	configure-time	constant  CHUNK_CACHE_NELEMS  when the
	      netCDF library was built).  Ideally, the nccopy  utility	should
	      determine	 an  optimum  value for	this parameter,	but no general
	      algorithm	for computing the optimum number of chunk  cache  ele-
	      ments has	been implemented yet.

	-r    Read  netCDF classic or 64-bit offset input file into a diskless
	      netCDF file in memory before copying.  Requires that input  file
	      be  small	 enough	 to fit	into memory.  For nccopy, this doesn't
	      seem to provide any significant speedup, so may not be a	useful
	      option.

	-L  n Set  the log level; only usable if nccopy	supports netCDF-4 (en-
	      hanced).

	-M  n Set the minimum chunk  size;  only  usable  if  nccopy  supports
	      netCDF-4 (enhanced).

	-F  filterspec
	      For netCDF-4 output, including netCDF-4 classic model, specify a
	      filter to	apply to a specified set of variables in  the  output.
	      As  a  rule, the filter is a compression/decompression algorithm
	      with a unique numeric identifier assigned	by the HDF Group  (see
	      https://support.hdfgroup.org/services/filters.html).

	      The filterspec argument has one of these general forms:

		     fqn1|fqn2...,filterid,param1,param2...paramn

		     *,filterid,param1,param2...paramn

	      An fqn (fully qualified name) is the name of a variable prefixed
	      by its containing groups, with the group names separated by
	      forward slashes ('/').  An example might be /g1/g2/var.
	      Alternatively, just the variable name can be given if it is in
	      the root group, e.g. var.  Backslash escapes may be used as
	      needed.  A note of warning: the '|' separator is special to bash
	      and other shells, so you will probably need to put the
	      filterspec in some kind of quotes or otherwise escape it.

	      The filterid is an unsigned positive integer representing the id
	      assigned by the HDF Group to the filter.  Following the id is a
	      sequence of parameters defining the operation of the filter.
	      Each parameter is a 32-bit unsigned integer.

	      This option may be repeated multiple times with different
	      variable names.
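
	      A filterspec can be built up from its parts, which also keeps
	      the '|' away from the shell.  In the sketch below, join_with is
	      a hypothetical helper, not part of nccopy.  The filter id 307
	      is assumed here to be the id registered for bzip2, and 9 is
	      assumed to be its compression level parameter; check the HDF
	      Group registry before relying on either.  The command is
	      printed, not executed.

```sh
# join_with is a hypothetical helper, not part of nccopy.
# Usage: join_with SEPARATOR ITEM [ITEM ...]
join_with() {
    sep=$1
    shift
    out=""
    for item in "$@"; do
        out="${out:+$out$sep}$item"
    done
    printf '%s\n' "$out"
}

# Variable names are joined with '|', then the id and parameters with ','.
names=$(join_with '|' /g1/g2/var var2)
spec=$(join_with ',' "$names" 307 9)   # 307 and 9 are assumed values

# Dry run: print the command instead of executing it.
# Typed literally on a command line, the spec would need quotes.
echo nccopy -F "$spec" in.nc out.nc
```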

EXAMPLES
       Make a copy of foo1.nc, a netCDF	file of	any type, to foo2.nc, a	netCDF
       file of the same	type:

	      nccopy foo1.nc foo2.nc

       Note that the above copy	will not be as fast as use of cp or other sim-
       ple copy	utility, because the file is copied using only the netCDF API.
       If  the	input  file  has extra bytes after the end of the netCDF data,
       those will not be copied, because they are not accessible  through  the
       netCDF interface.  If the original file was generated in	"No fill" mode
       so that fill values are not stored for padding for data alignment,  the
       output file may have different padding bytes.

       Convert	a  netCDF-4  classic model file, compressed.nc,	that uses com-
       pression, to a netCDF-3 file classic.nc:

	      nccopy -k	classic	compressed.nc classic.nc

       Note that 'nc3' could be	used instead of	'classic'.

       Download	the variable 'time_bnds' and its associated attributes from an
       OPeNDAP server and copy the result to a netCDF file named 'tb.nc':

	      nccopy 'http://test.opendap.org/opendap/data/nc/sst.mnmean.nc.gz?time_bnds' tb.nc

       Note that URLs that name	specific variables as  command-line  arguments
       should  generally  be  quoted,  to avoid	the shell interpreting special
       characters such as '?'.

       Compress	all the	variables in the input file foo.nc, a netCDF  file  of
       any type, to the	output file bar.nc:

	      nccopy -d1 foo.nc	bar.nc

       If  foo.nc was a	classic	or 64-bit offset netCDF	file, bar.nc will be a
       netCDF-4	classic	model netCDF file, because the classic and 64-bit off-
       set  format  variants  don't  support  compression.   If	 foo.nc	 was a
       netCDF-4	file with some variables compressed  using  various  deflation
       levels,	the  output will also be a netCDF-4 file of the	same type, but
       all the variables, including any	uncompressed variables in  the	input,
       will now	use deflation level 1.

       Assume  the  input  data	includes gridded variables that	use time, lat,
       lon dimensions, with 1000 times by 1000 latitudes by  1000  longitudes,
       and that	the time dimension varies most slowly.	Also assume that users
       want quick access to data at all	times  for  a  small  set  of  lat-lon
       points.	 Accessing data	for 1000 times would typically require access-
       ing 1000	disk blocks, which may be slow.

       Reorganizing the	data into chunks on disk that have  all	 the  time  in
       each  chunk  for	 a  few	lat and	lon coordinates	would greatly speed up
       such access.  To	chunk the data in the input  file  slow.nc,  a	netCDF
       file of any type, to the output file fast.nc, you could use:

	      nccopy -c	time/1000,lat/40,lon/40	slow.nc	fast.nc

       to  specify data	chunks of 1000 times, 40 latitudes, and	40 longitudes.
       If you had enough memory	to contain the output file, you	could speed up
       the rechunking operation	significantly by creating the output in	memory
       before writing it to disk on close (using the -w	flag):

	      nccopy -w -c time/1000,lat/40,lon/40 slow.nc fast.nc

       Alternatively, one could write this using the variable-specific
       chunking specification, assuming that time, lat, and lon are
       variables:

	      nccopy -c	time:1000 -c lat:40 -c lon:40 slow.nc fast.nc

Chunking Rules
       The complete set	of chunking rules is captured here.  As	a rough	summa-
       ry,  these  rules preserve all chunking properties from the input file.
       These rules apply only when the selected	output format supports	chunk-
       ing, i.e. for the netcdf-4 variants.

       The  variable  specific	chunking  specification	 should	be obvious and
       translates directly  to	the  corresponding  "nc_def_var_chunking"  API
       call.

       The original per-dimension, chunking specification requires some	inter-
       pretation by nccopy.  The following rules are applied in	the given  or-
       der  independently for each variable to be copied from input to output.
       The rules are written assuming we are trying to determine the  chunking
       for a given output variable Vout	that comes from	an input variable Vin.

       1.     If  there	 is  no	'-c' option that applies to a variable and the
	      corresponding input variable is contiguous or the	input is  some
	      netcdf-3	variant, then let the netcdf-c library make all	chunk-
	      ing decisions.

       2.     For each dimension of Vout explicitly specified on  the  command
	      line  (using the '-c' option), apply the chunking	value for that
	      dimension	regardless of input format or input properties.

       3.     For dimensions of	Vout not named on the command line in  a  '-c'
	      option,  preserve	chunk sizes from the corresponding input vari-
	      able, if it is chunked.

       4.     If Vin is	contiguous, and	none of	its dimensions	are  named  on
	      the command line,	and chunking is	not mandated by	other options,
	      then make	Vout be	contiguous.

       5.     If the input variable is contiguous (or is some netcdf-3
	      variant) and there are no options requiring chunking, or the
	      '/' special case for the '-c' option is specified, then the
	      output variable Vout is marked as contiguous.

       6.     Final,  default case: some or all	chunk sizes are	not determined
	      by the command line or the input	variable.  This	 includes  the
	      non-chunked  input  cases	 such  as  netcdf-3, cdf5, and DAP. In
	      these cases retain all chunk sizes determined by previous	rules,
	      and use the full dimension size as the default. The exception is
	      unlimited	dimensions, where the default is 4 megabytes.
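
       The order of precedence in rules 2, 3, and 6 can be sketched for a
       single dimension as follows.  This is a simplified model, not the
       nccopy implementation: resolve_chunk is a hypothetical helper, and it
       ignores the contiguous cases (rules 1, 4, and 5) and the exception for
       unlimited dimensions.

```sh
# resolve_chunk is a hypothetical helper, not part of nccopy.
# Usage: resolve_chunk CLI_LEN INPUT_CHUNK_LEN DIM_LEN
# Pass an empty string for a value that is not available.
resolve_chunk() {
    cli_len=$1
    input_chunk_len=$2
    dim_len=$3
    if [ -n "$cli_len" ]; then
        echo "$cli_len"           # rule 2: dimension named with -c
    elif [ -n "$input_chunk_len" ]; then
        echo "$input_chunk_len"   # rule 3: preserve the input chunk size
    else
        echo "$dim_len"           # rule 6: default to the full dimension size
    fi
}

resolve_chunk 40 100 1000   # prints 40
resolve_chunk "" 100 1000   # prints 100
resolve_chunk "" "" 1000    # prints 1000
```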

SEE ALSO
       ncdump(1), ncgen(1), netcdf(3)

Release	4.2			  2012-03-08			     NCCOPY(1)
