
bin/16393: /bin/sh doesn't strip comments on shebang line

From:ryand@amazon.com
Date:Wed, 26 Jan 2000 18:37:59 -0800 (PST)
Subject:/bin/sh doesn't strip comments on shebang line
Send-pr version:www-1.0

Number:16393
Category:bin
Synopsis:/bin/sh doesn't strip comments on shebang line
Severity:critical
Priority:low
Responsible:gad@FreeBSD.org
State:closed
Class:sw-bug
Arrival-Date:Wed Jan 26 18:40:01 PST 2000
Closed-Date:Mon May 30 22:13:29 GMT 2005
Last-Modified:Mon May 30 22:13:29 GMT 2005
Originator:Ryan Davis
Release:3.3-STABLE (with latest /bin/sh)

Organization:
Amazon.com
 
Environment:
FreeBSD qa-tools.amazon.com 3.3-STABLE FreeBSD 3.3-STABLE #0: Tue Jan 11 12:50:31 PST 2000 root@qa-tools.amazon.com:/usr/src/sys/compile/RWD_BSD_v3 i386
Description:
Basically, if I follow the suggestions in the perl book to make
portably executable scripts, I must use a shebang hack where the
perl script starts being executed as a sh script. sh will pass it off
to perl. Currently sh chokes on the # after --, treating it as the
script to run.

[503] x.pl
#: Can't open #

What should happen is that sh strips the # and everything after it.
With no args, the file should be executed in the same sh. Then
the eval/exec will transfer responsibility to perl. This works on
DEC UNIX, Linux, and several others.
How-To-Repeat:
Run the following script:

#!/bin/sh -- # -*- perl -*-

eval "exec perl $0 -S ${1+'$@'}"
if 0;

print "1+1=", (1+1), "\n";
Fix:
Release-Note:
 
Audit-Trail:
From:Shawn Halpenny <malachai@iname.com>
Date:Thu, 10 Feb 2000 16:00:30 -0500
I've run into this, too. The problem seems to have two parts.

First, the kernel parses the shebang line into white-space-separated
tokens without any regard to the presence of a '#' character (which in
the case we're interested in, denotes a comment). The first patch
makes the parsing slurp up everything from the '#' to the end-of-line
and store it as a single word. This is necessary so that /bin/sh
knows where the comment ends, because when the interpreter is
started, the name of the script is tacked on as the last argument.
Otherwise (as happens currently), /bin/sh receives the '#' and any
comment-words as separate arguments and cannot tell where the comment
ends.
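
For illustration only, here is a minimal userspace sketch of the
tokenizing rule described above: split on blanks, but let a token that
starts with '#' absorb the rest of the line. The function name
split_shebang and the MAX_TOKENS / MAX_TOKEN_LEN limits are assumptions
made for this example; this is not the kernel code from
imgact_shell.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_TOKENS    16   /* illustrative limit, not a kernel constant */
#define MAX_TOKEN_LEN 128  /* illustrative limit, not a kernel constant */

/*
 * Hypothetical helper: split the text that follows "#!" into
 * white-space-separated tokens.  A token that starts with '#' absorbs
 * everything up to the end of the line, so the whole comment arrives
 * as a single word.  Returns the number of tokens stored in out[].
 */
static int
split_shebang(const char *line, char out[MAX_TOKENS][MAX_TOKEN_LEN])
{
        const char *p = line;
        int ntok = 0;

        assert(line != NULL);
        while (*p != '\0' && *p != '\n' && ntok < MAX_TOKENS) {
                size_t len = 0;
                int comment;

                /* Skip blanks between tokens. */
                while (*p == ' ' || *p == '\t')
                        p++;
                if (*p == '\0' || *p == '\n')
                        break;

                /* A leading '#' switches to "copy until end of line". */
                comment = (*p == '#');

                while (*p != '\0' && *p != '\n' &&
                    (comment || (*p != ' ' && *p != '\t'))) {
                        if (len < MAX_TOKEN_LEN - 1)
                                out[ntok][len++] = *p;
                        p++;
                }
                out[ntok][len] = '\0';
                ntok++;
        }
        return ntok;
}
```

Fed the text after "#!" from the script in How-To-Repeat, this sketch
yields three words: the interpreter path, "--", and the whole comment.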

Second, /bin/sh will take the first non-option as a file name
(according to sh(1)), which means it starts looking for a file named
the first word of the comment on the shebang line. So, I modified
(see second patch) /bin/sh to ignore any command line words that begin
with '#' when searching for a file to interpret. This continues to
allow things like:

sh -c '# this is a nop'

and also preserves the original command line for the interpreter.

I'm not sure if this is the best way to fix things, but it appears to
be consistent with current behavior and address the problem.

Patches below.


--
Shawn Halpenny | Maniacal@I Ache, Ohm | "Universal Danger!"
+- - - - - - - - - - - - - - - - + - - - - - - - - - - - - - - \
| vi:G3kfM~lxfAPXh~l~2x2FirllpfcxlrifaprmfOX~Xp2hr.lrcelyl2p
- - - - - - - -| fU~X~refsPprnlxppri2lxlpr,pFrpprrfaPlpfiprgllxp~3Xlpfndw



Download patch-1.diff
--- /usr/src/sys/kern/imgact_shell.c~    Wed Feb  9 17:14:09 2000
+++ /usr/src/sys/kern/imgact_shell.c     Wed Feb  9 17:14:13 2000
@@ -51,6 +51,7 @@
 exec_shell_imgact(imgp)
         struct image_params *imgp;
 {
+        const char *comment = NULL;
         const char *image_header = imgp->image_header;
         const char *ihp, *line_endp;
         char *interp;
@@ -112,7 +113,15 @@
                          *      because this is at the front of the string buffer
                          *      and the maximum shell command length is tiny.
                          */
-                        while ((ihp < line_endp) && (*ihp != ' ') && (*ihp != '\t')) {
+                        while ((ihp < line_endp) &&
+                            ((*ihp != ' ') && (*ihp != '\t') || comment)) {
+
+                                /* Shell comment characters at the start of a token cause
+                                 *      everything to EOL to be one token.
+                                 */
+                                if (*ihp == '#')
+                                        comment = ihp;
+
                                 *imgp->stringp++ = *ihp++;
                                 imgp->stringspace--;
                         }

--- /usr/src/bin/sh/options.c~  Thu Feb 10 11:02:38 2000
+++ /usr/src/bin/sh/options.c   Thu Feb 10 13:41:10 2000
@@ -108,6 +108,15 @@
                         optlist[i].val = 0;
         arg0 = argv[0];
         if (sflag == 0 && minusc == NULL) {
+                /* Skip any arguments that start with shell-comment character
+                 * since it is unlikely the filename of a script given on
+                 * the command line will start with one.
+                 */
+                while (*argptr && **argptr == '#')
+                {
+                        argptr++;
+                }
+
                 commandname = arg0 = *argptr++;
                 setinputfile(commandname, 0);
         }



State Changed
From-To:open->closed
By:cracauer
When:Tue Feb 15 09:50:06 MET 2000
Why:Fixed for 4.0.
Will be merged into 3.x after some time.
Thanks for the bug report

From:Ahmon Dancy <dancy@dancysoft.com>
Date:Tue, 19 Feb 2002 08:26:26 -0800
We ran into a problem today related to this issue (we used the #
character as a switch to our program). I surveyed various other
operating systems, and FreeBSD hosts that have the modifications
suggested by bin/16393 fall short. Here are the results of my study:

Given a file called '/tmp/x2' with shebang line:
#!/tmp/interp -a -b -c #dee eee

If /tmp/x2 is exec'd, the operating system runs /tmp/interp w/ the
following arguments:

Solaris 8:
args: "/tmp/interp" "-a" "/tmp/x2"

Tru64 4.0:
args: "interp" "-a -b -c #dee eee" "/tmp/x2"

FreeBSD 2.2.7:
args: "/tmp/interp" "-a" "-b" "-c" "#dee" "eee" "/tmp/x2"

FreeBSD 4.0:
args: "/tmp/interp" "-a" "-b" "-c" "/tmp/x2"

Linux 2.4.12:
args: "/tmp/interp" "-a -b -c #dee eee" "/tmp/x2"

Linux 2.2.19:
args: "interp" "-a -b -c #dee eee" "/tmp/x2"

Irix 6.5:
args: "/tmp/interp" "-a -b -c #dee eee" "/tmp/x2"

HPUX 11.00:
args: "/tmp/x2" "-a -b -c #dee eee" "/tmp/x2"

AIX 4.3:
args: "interp" "-a -b -c #dee eee" "/tmp/x2"

Mac OS X:
args: "interp" "-a -b -c #dee eee" "/tmp/x2"


The most common behavior is:
argv[0]: full path of interpreter
argv[1]: all remaining args, coalesced into one string
argv[2]: The file exec'd.

FreeBSD's behavior is way out there. No other system treats "#" in any special
way.
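
As a rough sketch of the "most common behavior" listed above, the
following C helpers split the text after "#!" into an interpreter and
one coalesced argument, then render an argument vector in the same
`args:` style as the survey. The names split_coalesced and format_args
are hypothetical and chosen for this example; no real system's exec
code is being quoted.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: the first word is the interpreter, everything
 * else up to the end of the line is ONE argument.  Returns 1 if an
 * argument was found, 0 if the line names only an interpreter.
 */
static int
split_coalesced(const char *line, char *interp, size_t isz,
    char *arg, size_t asz)
{
        size_t n = 0;

        assert(line != NULL && isz > 0 && asz > 0);
        while (*line == ' ' || *line == '\t')
                line++;
        while (*line != '\0' && *line != '\n' &&
            *line != ' ' && *line != '\t') {
                if (n < isz - 1)
                        interp[n++] = *line;
                line++;
        }
        interp[n] = '\0';

        /* Skip the blanks that separate interpreter and argument. */
        while (*line == ' ' || *line == '\t')
                line++;
        n = 0;
        while (*line != '\0' && *line != '\n') {
                if (n < asz - 1)
                        arg[n++] = *line;
                line++;
        }
        arg[n] = '\0';
        return n > 0;
}

/*
 * Hypothetical helper: render argv as `args: "a" "b" ...` into buf,
 * truncating if buf is too small.  Returns buf.
 */
static char *
format_args(int argc, char *const argv[], char *buf, size_t bufsz)
{
        int i;

        assert(buf != NULL && bufsz > 0);
        snprintf(buf, bufsz, "args:");
        for (i = 0; i < argc; i++) {
                size_t used = strlen(buf);

                if (used + 1 >= bufsz)
                        break;
                snprintf(buf + used, bufsz - used, " \"%s\"", argv[i]);
        }
        return buf;
}
```

A probe interpreter along the lines of /tmp/interp would only need a
main() that passes its own argc and argv to format_args and prints the
result; running split_coalesced on the /tmp/x2 shebang line and
appending the script path gives the same three-element vector as the
Linux 2.4.12 and Irix 6.5 entries.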




Responsible Changed
From-To:freebsd-bugs->gad
Why:

State Changed
From-To:closed->analyzed
By:gad
When:Tue Mar 1 23:50:34 GMT 2005
Why:A fix for the "doesn't strip comments" problem was committed in 2000,
but that caused trouble for other people (as documented in this PR).
A fix for those problems in kern/imgact_shell.c was committed
to 5.3-stable in late 2004, but that change broke the "strip-comments"
processing that perl expects.
See the thread on "Bug in #! processing - One More Time" in freebsd-arch
for more details. I intend to fix this for real with another set of
changes, but those changes aren't going to be ready for 5.4-release.

State Changed
From-To:analyzed->closed
By:gad
When:Mon May 30 22:11:46 GMT 2005
Why:A change has been made to sys/kern/imgact_shell.c which will probably be
the final fix for this issue. This has been committed to the 6.x-current
branch, but it is an incompatible change and thus will probably not be
MFC'ed into 5.x-stable.

Unformatted:
 