Text::Shellwords::Cursor(3)  User Contributed Perl Documentation  Text::Shellwords::Cursor(3)

NAME
       Text::Shellwords::Cursor	- Parse	a string into tokens

SYNOPSIS
	use Text::Shellwords::Cursor;
	my $parser = Text::Shellwords::Cursor->new();
	my $str	= 'ab cdef "ghi"    j"k\"l "';
	my ($tok1) = $parser->parse_line($str);
	# $tok1 = ['ab', 'cdef', 'ghi', 'j', 'k"l ']
	my ($tok2, $tokno, $tokoff) = $parser->parse_line($str,	cursorpos => 6);
	# as above, but $tokno=1, $tokoff=3  (under the 'f')

DESCRIPTION
       This module is very similar to Text::Shellwords and Text::ParseWords.
       However,	it has one very	significant difference:	it keeps track of a
       character position in the line it's parsing.  For instance, if you pass
       it ("zq fmgb", cursorpos=>6), it	would return (['zq', 'fmgb'], 1, 3).
       The cursorpos parameter tells where in the input	string the cursor
       resides (just before the	'b'), and the result tells you that the	cursor
       was on token 1 ('fmgb'),	character 3 ('b').  This is very useful	when
       computing command-line completions involving quoting, escaping, and
       tokenizing characters (like '(' or '=').
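
       The bookkeeping can be pictured with a much-simplified sketch.  The
       helper below (split_with_cursor, a hypothetical name that is not part
       of this module) splits on whitespace only and ignores quoting,
       escaping, and token characters.  It only shows how a cursor position
       maps to a token number and an offset within that token.  Unlike the
       real parser, it does not create a zero-length token when the cursor
       sits in the middle of whitespace.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Text::Shellwords::Cursor: splits on
# whitespace only and reports which token holds the cursor.
sub split_with_cursor {
    my ($line, $cursorpos) = @_;
    my (@tokens, $tokno, $tokoff);
    while ($line =~ /(\S+)/g) {
        my $start = $-[1];    # offset of the token's first character
        my $end   = $+[1];    # offset one past its last character
        push @tokens, $1;
        # The cursor belongs to the first token that contains it or that
        # ends immediately before it.
        if (defined $cursorpos && !defined $tokno
            && $cursorpos >= $start && $cursorpos <= $end) {
            $tokno  = $#tokens;
            $tokoff = $cursorpos - $start;
        }
    }
    return (\@tokens, $tokno, $tokoff);
}

my ($tokens, $tokno, $tokoff) = split_with_cursor('zq fmgb', 6);
print "token $tokno ('$tokens->[$tokno]'), offset $tokoff\n";
# prints: token 1 ('fmgb'), offset 3
```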

       A few helper utilities are included as well.  You can escape a string
       to ensure that parsing it will produce the original string
       (parse_escape).	You can	also reassemble	the tokens with	a visually
       pleasing	amount of whitespace between them (join_line).

       This module started out as an integral part of Term::GDBUI using	code
       loosely based on	Text::ParseWords.  However, it is now basically	a
       ground-up reimplementation.  It was split out of	Term::GDBUI for
       version 0.8.

METHODS
       new
	  Creates a new	parser.	 Takes the named arguments listed below.

	  keep_quotes
	      Normally all unescaped, unnecessary quote	marks are stripped.
	      If you specify "keep_quotes=>1", however,	they are preserved.
	      This is useful if	you need to know whether the string was	quoted
	      or not (string constants)	or what	type of	quotes was around it
	      (affecting variable interpolation, for instance).

	  token_chars
	      This argument specifies the characters that should be considered
	      tokens all by themselves.	 For instance, if I pass
	      token_chars=>'=',	then 'ab=123' would be parsed to ('ab',	'=',
	      '123').  Without token_chars, 'ab=123' remains a single string.

	      NOTE: you	cannot change token_chars after	the constructor	has
	      been called!  The	regexps	that use it are	compiled once (m//o).
	      Also, until the GNU Readline library can accept "=[]," without
	      diving into an endless loop, we will not tell history expansion
	      to use token_chars (it uses " \t\n()<>;&|" by default).

	  debug
	      Turns on rather copious debugging	to try to show what the	parser
	      is thinking at every step.

	  space_none
	  space_before
	  space_after
	      These variables affect how whitespace in the line	is normalized
	      when the tokens are reassembled into a string.  See the
	      join_line routine.

	  error
	      This is a	reference to a routine that should be called to
	      display a	parse error.  The routine takes	two arguments: a
	      reference	to the parser, and the error message to	display	as a
	      string.
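
	  As a rough illustration of the constructor arguments above, the
	  sketch below sets token_chars and installs an error callback that
	  collects messages instead of printing them.  The input strings and
	  the @errors array are made up for the example.  It also assumes
	  that an unterminated quote is reported through the callback, which
	  the fixclosequote and messages descriptions suggest but do not
	  state outright.

```perl
use strict;
use warnings;
use Text::Shellwords::Cursor;

my @errors;    # filled by the callback instead of printing

my $parser = Text::Shellwords::Cursor->new(
    token_chars => '=',
    # Receives the parser and the message, as described under "error".
    error       => sub { my ($self, $msg) = @_; push @errors, $msg },
);

# With token_chars => '=', the '=' becomes a token of its own.
my ($tokens) = $parser->parse_line('ab=123');
print join(' | ', @$tokens), "\n";

# Assumption: a missing close quote is a parse error that reaches the
# callback, since fixclosequote is off and messages are on by default.
$parser->parse_line('ab "cd');
print "parse error: $_\n" for @errors;
```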

	  parsebail(msg)
	      If the parsel routine or any of its subroutines runs into	a
	      fatal error, they	call parsebail to present a very descriptive
	      diagnostic.

	  parsel
	      This is the heinous routine that actually	does the parsing.  You
	      should never need	to call	it directly.  Call parse_line instead.

	  parse_line(line, named args)
	      This is the entrypoint to	this module's parsing functionality.
	      It converts a line into tokens, respecting quoted	text, escaped
	      characters, etc.	It also	keeps track of a cursor	position on
	      the input	text, returning	the token number and offset within the
	      token where that position	can be found in	the output.

	      This routine originally bore some	resemblance to
	      Text::ParseWords.	 It has	changed	almost completely, however, to
	      support keeping track of the cursor position.  It	also has nicer
	      failure modes, modular quoting, token characters (see
	      token_chars in "new"), etc.  This	routine	now does much more.

	      Arguments:

	      line
		 This is a string containing the command-line to parse.

	      This routine also	accepts	the following named parameters:

	      cursorpos
		 This is the character position	in the line to keep track of.
		 Pass undef (by	not specifying it) or the empty	string to have
		 the line processed with cursorpos ignored.

		 Note that passing undef is not	the same as passing some
		 random	number and ignoring the	result!	 For instance, if you
		 pass 0	and the	line begins with whitespace, you'll get	a
		 0-length token	at the beginning of the	line to	represent the
		 cursor	in the middle of the whitespace.  This allows command
		 completion to work even when the cursor is not	near any
		 tokens.  If you pass undef, all whitespace at the beginning
		 and end of the	line will be trimmed as	you would expect.

		 If it is ambiguous whether the	cursor should belong to	the
		 previous token	or to the following one	(i.e. if it's between
		 two quoted strings, say "a""b"	or a token_char), it always
		 gravitates to the previous token.  This makes more sense when
		 completing.

	      fixclosequote
		 Sometimes you want to try to recover from a missing close
		 quote (for instance, when calculating completions), but
		 usually you want a missing close quote	to be a	fatal error.
		 fixclosequote=>1 will implicitly insert the correct quote if
		 it's missing.	fixclosequote=>0 is the	default.

	      messages
		 parse_line is capable of printing very	informative error
		 messages.  However, sometimes you don't care enough to	print
		 a message (like when calculating completions).	 Messages are
		 printed by default, so	pass messages=>0 to turn them off.

	      This function returns a list of three items, as shown in the
	      SYNOPSIS:

	      tokens
		 The tokens that the line was separated	into (ref to an
		 array of strings).

	      tokno
		 The number of the token (index	into the previous array) that
		 contains cursorpos.

	      tokoff
		 The character offset into tokno of cursorpos.

	      If the cursor is at the end of the token,	tokoff will point to 1
	      character	past the last character	in tokno, a nonexistent
	      character.  If the cursor	is between tokens (surrounded by
	      whitespace), a zero-length token will be created for it.
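
	      A completion routine might use these values roughly as follows.
	      This is only a sketch: the line is the one from the SYNOPSIS,
	      and $prefix is an illustrative variable.  Going by the
	      SYNOPSIS, the call should report token 1 and offset 3, which
	      would make the typed prefix 'cde'.

```perl
use strict;
use warnings;
use Text::Shellwords::Cursor;

my $parser = Text::Shellwords::Cursor->new();
my $line   = 'ab cdef "ghi"    j"k\"l "';    # the line from the SYNOPSIS

# Completion-style call: tolerate a missing close quote and stay silent.
my ($tokens, $tokno, $tokoff) = $parser->parse_line(
    $line,
    cursorpos     => 6,
    fixclosequote => 1,
    messages      => 0,
);

# The part of the current token that lies to the left of the cursor.
my $prefix = (defined $tokens && defined $tokno)
    ? substr($tokens->[$tokno], 0, $tokoff)
    : '';
print "completing token $tokno, typed so far: '$prefix'\n" if defined $tokno;
```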

	  parse_escape(lines)
	      Escapes characters that would be otherwise interpreted by	the
	      parser.  Will accept either a single string or an	arrayref of
	      strings (which will be modified in-place).
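
	      The round trip described in the DESCRIPTION can be checked
	      with a short sketch like the one below.  The sample strings are
	      arbitrary, and the arrayref form is used because its in-place
	      behaviour is the one documented above.

```perl
use strict;
use warnings;
use Text::Shellwords::Cursor;

my $parser = Text::Shellwords::Cursor->new();

my @strings = ('two words', 'say "hi"');
my @escaped = @strings;    # copy, because the arrayref form edits in place
$parser->parse_escape(\@escaped);

my @ok;
for my $i (0 .. $#strings) {
    # Parsing the escaped text should give back one token: the original.
    my ($tokens) = $parser->parse_line($escaped[$i]);
    $ok[$i] = ($tokens && @$tokens == 1 && $tokens->[0] eq $strings[$i]) ? 1 : 0;
    print "$strings[$i] -> $escaped[$i] : ",
        ($ok[$i] ? "round-trips" : "differs"), "\n";
}
```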

	  join_line(tokens)
	      This routine does	a somewhat intelligent job of joining tokens
	      back into	a command line.	 If token_chars	(see "new") is empty
	      (the default), then it just escapes backslashes and quotes, and
	      joins the	tokens with spaces.

	      However, if token_chars is nonempty, it tries to insert a
	      visually pleasing	amount of space	between	the tokens.  For
	      instance,	rather than 'a ( b , c )', it tries to produce 'a (b,
	      c)'.  It won't reformat any tokens that aren't found in
	      $self->{token_chars}, of course.

	      To change	the formatting,	you can	redefine the variables
	      $self->{space_none}, $self->{space_before}, and
	      $self->{space_after}.  Each variable is a	string containing all
	      characters that should not be surrounded by whitespace, should
	      have whitespace before, and should have whitespace after,
	      respectively.  Any character found in token_chars, but not in
	      any of these space_ variables, will have space placed both
	      before and after.
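
	      The spacing rule can be sketched in plain Perl.  The function
	      below (join_with_spacing, a hypothetical name) is not the
	      module's implementation and does no escaping.  It is one
	      possible reading of the rule, in which a space is placed between
	      two tokens only when the left one accepts space after it and
	      the right one accepts space before it.  The values passed for
	      the space_ variables are chosen for the example and are not
	      claimed to be the module's defaults.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the spacing step of join_line; no escaping.
sub join_with_spacing {
    my ($tokens, %opt) = @_;
    my %set;
    for my $name (qw(token_chars space_none space_before space_after)) {
        $set{$name} = { map { $_ => 1 } split //, $opt{$name} // '' };
    }

    # Does token $t accept whitespace on the given side ('before'/'after')?
    my $wants = sub {
        my ($t, $side) = @_;
        return 1 unless length($t) == 1 && $set{token_chars}{$t};  # ordinary token
        return 0 if $set{space_none}{$t};
        return 1 if $set{"space_$side"}{$t};
        # Listed only for the other side: no space here.
        # Listed in none of the three: space on both sides.
        return ($set{space_before}{$t} || $set{space_after}{$t}) ? 0 : 1;
    };

    my $out = '';
    for my $i (0 .. $#$tokens) {
        $out .= ' ' if $i > 0
            && $wants->($tokens->[$i - 1], 'after')
            && $wants->($tokens->[$i], 'before');
        $out .= $tokens->[$i];
    }
    return $out;
}

print join_with_spacing(
    ['a', '(', 'b', ',', 'c', ')'],
    token_chars  => '(),',
    space_before => '(',
    space_after  => ',)',
), "\n";
# prints: a (b, c)
```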

BUGS
       None known.

LICENSE
       Copyright (c) 2003-2011 Scott Bronson, all rights reserved.  This
       program is covered by the MIT license.

AUTHOR
       Scott Bronson <bronson@rinspin.com>

perl v5.24.1			  2012-02-03	   Text::Shellwords::Cursor(3)
