File::Formula(3)      User Contributed Perl Documentation     File::Formula(3)

NAME
       Chemistry::File::Formula - Molecular formula reader/formatter

SYNOPSIS
           use Chemistry::File::Formula;

           my $mol = Chemistry::Mol->parse("H2O", format => 'formula');
           print $mol->print(format => 'formula');
           print $mol->formula;    # this is a shorthand for the above
           print $mol->print(format => 'formula',
               formula_format => "%s%d{<sub>%d</sub>}");

DESCRIPTION
       This module converts a molecule object to a string with the formula and
       back.  It registers the 'formula' format with Chemistry::Mol.  Besides
       its obvious use, it is included in the Chemistry::Mol distribution
       because it is a very simple example of a Chemistry::File derived I/O
       module.

   Writing formulas
       The format can be specified as a	printf-like string with	the following
       control sequences, which	are specified with the formula_format
       parameter to $mol->print	or $mol->write.

       %s  symbol
       %D  number of atoms
       %d  number of atoms, included only when it is greater than one
       %d{substr}
           substr is included only when the number of atoms is greater
           than one
       %j{substr}
           substr is inserted between the formatted strings for each
           element. (The 'j' stands for 'joiner'.) The format should have
           only one joiner, but its location in the format string doesn't
           matter.
       %%  a percent sign

       If no format is specified, the default is "%s%d". Some examples follow.
       Let's assume that the formula is C2H6O, as it would be formatted by
       default.

       "%s%D"
           Like the default, but include explicit indices for all atoms.  The
           formula would be formatted as "C2H6O1".

       "%s%d{<sub>%d</sub>}"
           HTML format. The output would be "C<sub>2</sub>H<sub>6</sub>O".

       "%D %s%j{, }"
           Use a comma followed by a space as a joiner. The output would be
           "2 C, 6 H, 1 O".
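
       As a rough illustration of how the control sequences combine, the
       following sketch expands a format string in plain Perl. It is not the
       module's implementation: the names format_formula_sketch and
       _expand_code are hypothetical, the element order is supplied by the
       caller, and it assumes that a %d inside %d{...} is replaced by the atom
       count, as the HTML example above suggests.

```perl
use strict;
use warnings;

# Hypothetical helper: return the text for one control sequence.
# $inner is the body of %d{...} (undef otherwise); $code is one of s, D, d, %.
sub _expand_code {
    my ($inner, $code, $symbol, $n) = @_;
    if (defined $inner) {
        return '' unless $n > 1;      # conditional block is dropped for n == 1
        $inner =~ s/%d/$n/g;          # assumption: %d inside the block is the count
        return $inner;
    }
    return $symbol if $code eq 's';
    return $n      if $code eq 'D';
    return $n > 1 ? $n : '' if $code eq 'd';
    return '%';                       # %% is a literal percent sign
}

# Hypothetical helper: format a list of [symbol, count] pairs.
sub format_formula_sketch {
    my ($format, @pairs) = @_;

    # Remove the joiner from the format; its position does not matter.
    my $joiner = $format =~ s/%j\{(.*?)\}// ? $1 : '';

    my @items;
    for my $pair (@pairs) {
        my ($symbol, $n) = @$pair;
        (my $item = $format)
            =~ s/%(?:d\{(.*?)\}|([sDd%]))/_expand_code($1, $2, $symbol, $n)/ge;
        push @items, $item;
    }
    return join $joiner, @items;
}

my @c2h6o = ([C => 2], [H => 6], [O => 1]);
print format_formula_sketch('%s%d', @c2h6o), "\n";         # C2H6O
print format_formula_sketch('%D %s%j{, }', @c2h6o), "\n";  # 2 C, 6 H, 1 O
```

       The sketch does not try to handle a literal "%%" placed directly
       before "j{" or "d{"; a real formatter would need to tokenize the
       format string first.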

       Symbol Sort Order

       The elements in the formula are sorted by default in the	"Hill order",
       which means that:

       1) if the formula contains carbon, C goes first,	followed by H, and the
       rest of the symbols in alphabetical order. For example, "CH2BrF".

       2) if there is no carbon, all the symbols (including H) are listed
       alphabetically.	For example, "BrH".
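
       The two rules can be sketched as a small subroutine. The name
       hill_order_sketch is hypothetical and this is not the module's own
       sorting code; it only mirrors the rules stated above, taking a hash
       reference of symbol => count and returning the symbols in order.

```perl
use strict;
use warnings;

# Hypothetical helper: order element symbols according to the Hill rules.
sub hill_order_sketch {
    my ($formula_hash) = @_;
    my @symbols = sort keys %$formula_hash;

    # Rule 2: without carbon, everything (including H) is alphabetical.
    return @symbols unless exists $formula_hash->{C};

    # Rule 1: C first, then H (if present), then the rest alphabetically.
    my @rest = grep { $_ ne 'C' && $_ ne 'H' } @symbols;
    return ('C', (exists $formula_hash->{H} ? 'H' : ()), @rest);
}

print join('', hill_order_sketch({ C => 1, H => 2, Br => 1, F => 1 })), "\n";  # CHBrF
print join('', hill_order_sketch({ H => 1, Br => 1 })), "\n";                  # BrH
```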

       It is possible to supply	a custom sorting subroutine with the
       'formula_sort' option. It expects a subroutine reference	that takes a
       hash reference describing the formula (similar to what is returned by
       parse_formula, discussed	below),	and that returns a list	of symbols in
       the desired order.

       For example, this will sort the symbols in reverse asciibetical order:

           my $formula = $mol->print(
               format          => 'formula',
               formula_sort    => sub {
                   my $formula_hash = shift;
                   return reverse sort keys %$formula_hash;
               },
           );

   Parsing Formulas
       Formulas can also be parsed back into Chemistry::Mol objects.  The
       formula may have parentheses, square brackets, and angle brackets
       (<>), and it may have the following abbreviations:

	   Me => '(CH3)',
	   Et => '(CH3CH2)',
	   Bu => '(C4H9)',
	   Bn => '(C6H5CH2)',
	   Cp => '(C5H5)',
	   Ph => '(C6H5)',
	   Bz => '(C6H5CO)',

       The formula may also be preceded	by a number, which multiplies the
       whole formula. Some examples of valid formulas:

	   Formula		Equivalent to
	   CH3(CH2)3CH3		C5H12
	   C6H3Me3		C9H12
	   2Cu[NH3]4(NO3)2	Cu2H24N12O12
	   2C(C[C<C>5]4)3	C152
	   2C(C(C(C)5)4)3	C152
	   C 1 0 H 2 2		C10H22 (whitespace is completely ignored)
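
       To make the grouping and multiplier rules concrete, here is a minimal
       recursive parser written in plain Perl. It is a sketch, not the code
       used by this module: parse_formula_sketch and _parse_group are
       hypothetical names, only integer counts are handled, and the
       abbreviation table is copied from the list above.

```perl
use strict;
use warnings;

my %CLOSER = ('(' => ')', '[' => ']', '<' => '>');
my %ABBREV = (
    Me => '(CH3)',     Et => '(CH3CH2)', Bu => '(C4H9)', Bn => '(C6H5CH2)',
    Cp => '(C5H5)',    Ph => '(C6H5)',   Bz => '(C6H5CO)',
);

# Parse one group up to $closer ('' means "up to the end of the string").
# The match position is kept in pos() of the shared string.
sub _parse_group {
    my ($str_ref, $closer) = @_;
    my %counts;
    while (1) {
        if ($$str_ref =~ /\G([A-Z][a-z]?)(\d*)/gc) {        # element + count
            $counts{$1} += length($2) ? $2 : 1;
        }
        elsif ($$str_ref =~ /\G([(\[<])/gc) {               # nested group
            my $inner = _parse_group($str_ref, $CLOSER{$1});
            my $n = $$str_ref =~ /\G(\d+)/gc ? $1 : 1;
            $counts{$_} += $inner->{$_} * $n for keys %$inner;
        }
        elsif ($closer ne '' && $$str_ref =~ /\G\Q$closer\E/gc) {
            return \%counts;                                # group closed
        }
        elsif ($closer eq '' && (pos($$str_ref) // 0) == length $$str_ref) {
            return \%counts;                                # end of formula
        }
        else {
            die "cannot parse formula near offset " . (pos($$str_ref) // 0) . "\n";
        }
    }
}

# Hypothetical entry point; returns a hash reference of symbol => count.
sub parse_formula_sketch {
    my ($text) = @_;
    $text =~ s/\s+//g;                                      # ignore whitespace
    my $multiplier = $text =~ s/^(\d+)// ? $1 : 1;          # leading number
    my $abbrev_re = join '|', keys %ABBREV;
    $text =~ s/($abbrev_re)/$ABBREV{$1}/g;                  # expand Me, Et, ...
    my $counts = _parse_group(\$text, '');
    $_ *= $multiplier for values %$counts;
    return $counts;
}

my $counts = parse_formula_sketch('2Cu[NH3]4(NO3)2');
print join('', map { "$_$counts->{$_}" } sort keys %$counts), "\n";  # Cu2H24N12O12
```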

       When a formula is parsed, a molecule object is created which consists
       of the set of the atoms in the formula (no bonds	or coordinates,	of
       course).	 The atoms are created in alphabetical order, so the molecule
       object for C2H5Br would have the	atoms in the following sequence: Br,
       C, C, H,	H, H, H, H.

       If you don't want to create a molecule object, but would	rather have a
       simple hash with	the number of atoms for	each element, use the
       "parse_formula" method:

	   my %formula = Chemistry::File::Formula->parse_formula("C2H6O");
	   use Data::Dumper;
	   print Dumper	\%formula;

       which prints something like

           $VAR1 = {
                     'H' => 6,
                     'O' => 1,
                     'C' => 2
                   };

       The "parse_formula" method is called internally by the "parse_string"
       method.

       Non-integer numbers in formulas

       The "parse_formula" method can also accept formulas that	contain
       floating-point numbers, such as H1.5N0.5. The numbers must be positive,
       and numbers smaller than	one should include a leading zero (e.g., 0.9,
       not .9).

       When formulas with non-integer numbers of atoms are turned into
       molecule	objects	as described in	the previous section, the number of
       atoms is	always rounded up. For example,	H1.5N0.5 will produce a
       molecule	object with two	hydrogen atoms and one nitrogen	atom.
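
       The rounding rule, combined with the alphabetical atom order described
       for parsed molecules, can be sketched with the core POSIX module. The
       helper name atom_list_sketch is hypothetical; it only shows which atom
       symbols would result, not how the module builds its atom objects.

```perl
use strict;
use warnings;
use POSIX qw(ceil);

# Hypothetical helper: expand symbol => count pairs into a flat list of
# atom symbols in alphabetical order, rounding fractional counts up.
sub atom_list_sketch {
    my (%counts) = @_;
    return map { ($_) x ceil($counts{$_}) } sort keys %counts;
}

print join(', ', atom_list_sketch(H => 1.5, N => 0.5)), "\n";       # H, H, N
print join(', ', atom_list_sketch(C => 2, H => 5, Br => 1)), "\n";  # Br, C, C, H, H, H, H, H
```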

       There is	currently no way of producing formulas with non-integer
       numbers;	perhaps	a future version will include an "occupancy" property
       for atoms that will result in non-integer formulas.


SEE ALSO
       Chemistry::Mol, Chemistry::File

       For discussion about Hill order, search the web for 'formula "hill
       order"'. The original reference is J. Am. Chem. Soc. 1900, 22, 478-494.

       The PerlMol website <>

AUTHORS
       Ivan Tubert-Brohman <>.

       Formula parsing code contributed	by Brent Gregersen.

       Patch for non-integer formulas by Daniel	Scott.

COPYRIGHT
       Copyright (c) 2005 Ivan Tubert-Brohman. All rights reserved. This
       program is free software; you can redistribute it and/or	modify it
       under the same terms as Perl itself.

perl v5.32.0			  2009-05-10		      File::Formula(3)

