Parse a string into words like a POSIX shell does.
Current Version: 0.2
When using Python 2.3 or later, you may want to use the standard
shlex module instead of shellwords. To do so, use this code:
    import shlex
    words = shlex.split(input_line)
Why this module?
Out in the wild there are quite a few modules for executing commands
in a sub-process. Most of them take a string (the command-line) as
input and use os.exec* for executing the program. This requires
splitting the string into words exactly like the shell does.
If this parsing/splitting is incorrect, you can have quite a fun
time debugging. ;-)
Since I didn't find any other module like that, I decided to
develop this module.
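To illustrate why the splitting matters, here is a small sketch that
compares naive whitespace splitting with the standard library's
shlex.split (available in Python 2.3 or later). The command line and
the file name in it are made up for this example.

```python
import shlex

# Hypothetical command line with one quoted argument.
command_line = 'grep "hello world" notes.txt'

# Naive splitting breaks the quoted argument apart and keeps the quotes.
naive = command_line.split()

# Shell-like splitting keeps the quoted argument as a single word.
shell_like = shlex.split(command_line)

print(naive)
print(shell_like)
```

Passing the naive list to one of the os.exec* functions would hand
grep four arguments instead of three, which is the kind of error that
is tedious to track down.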
Benefits of using this module
Using this module has the following benefits:
- Enables your application or module to mimic word splitting like a
POSIX shell without effort.
- Saves you debugging time when doing word splitting.
- Avoids confusing the users of your application/module when splitting
shell command lines into words, since this module behaves exactly
like a POSIX shell does.
- The unit test suite checks the correct word splitting. Currently 75
command lines are used, each testing a special pattern. The input
data for this test suite consists of command lines which are split
by the shell on the fly. You can add your own test-patterns without
This module parses a string into words according to the parsing rules
of a POSIX shell. These parsing rules are (quoted after 'man bash'):
- Words are split at whitespace characters; these are Space, Tab,
Newline, Carriage-Return, Vertical-Tab (0x0B) and Form-Feed (0x0C).
NB: Quotes do not separate words! Thus
will be parsed into a single word:
- A non-quoted backslash (\) is the escape character. It preserves
the literal value of the next character that follows.
- Enclosing characters in single quotes preserves the literal value
of each character within the quotes. A single quote may not occur
between single quotes, even when preceded by a backslash.
This means: backslash (\) has no special meaning within single
quotes. All characters within single quotes are taken as-is.
- Enclosing characters in double quotes preserves the literal value
of all characters within the quotes, with the exception of \. The
backslash retains its special meaning only when followed by " or \. A
double quote may be quoted within double quotes by preceding it
with a backslash.
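The rules listed above can be sketched as a small state machine. This
is only an illustration of those rules, not the actual implementation
of this module; the function name split_words is hypothetical, and the
sketch deliberately ignores everything the rules do not mention (for
example, backslash-newline line continuation).

```python
# Whitespace characters that separate words, as listed in the rules.
WHITESPACE = " \t\n\r\x0b\x0c"


def split_words(line):
    """Split line into words following the rules above (a sketch)."""
    words = []
    current = []     # characters of the word being built
    in_word = False  # True once a word has started, even an empty quoted one
    i = 0
    n = len(line)
    while i < n:
        ch = line[i]
        if ch in WHITESPACE:
            # Unquoted whitespace ends the current word, if there is one.
            if in_word:
                words.append("".join(current))
                current = []
                in_word = False
            i += 1
        elif ch == "\\":
            # Unquoted backslash: take the next character literally.
            if i + 1 >= n:
                raise ValueError("trailing backslash")
            current.append(line[i + 1])
            in_word = True
            i += 2
        elif ch == "'":
            # Single quotes: everything up to the next ' is taken as-is.
            end = line.find("'", i + 1)
            if end == -1:
                raise ValueError("unterminated single quote")
            current.append(line[i + 1:end])
            in_word = True
            i = end + 1
        elif ch == '"':
            # Double quotes: backslash is special only before " or \.
            i += 1
            while True:
                if i >= n:
                    raise ValueError("unterminated double quote")
                ch = line[i]
                if ch == '"':
                    break
                if ch == "\\" and i + 1 < n and line[i + 1] in '"\\':
                    current.append(line[i + 1])
                    i += 2
                else:
                    current.append(ch)
                    i += 1
            in_word = True
            i += 1  # skip the closing quote
        else:
            # Ordinary character: quotes do not end a word, so just append.
            current.append(ch)
            in_word = True
            i += 1
    if in_word:
        words.append("".join(current))
    return words
```

With this sketch, split_words('a"b c"d') returns ['ab cd'], because
the quotes group characters but do not separate words.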
Frequently Asked Questions
Q: Hey, there is 'shlex' coming with Python. Why is there a need for
this module?
A: I know 'shlex' and I gave it a try. But 'shlex' takes quotes as
word delimiters, which differs from the shell semantics (see above).
And even if 'shlex' parsed strings as needed, I would have
written a (very, very) thin layer on top, since 'shlex' is simple
but seldom used for this kind of job.
© Copyright 2002 by Hartmut Goebel <email@example.com>
License: Python Software Foundation License
Requires Python >= 2.0