The subprocess module provides very useful functionality for working with external programs from Python applications, but it is often criticised as being harder to use than it needs to be. See, for example, Kenneth Reitz’s Envoy project, which aims to provide an easy-to-use wrapper over subprocess. There’s also Andrew Moffat’s pbs project, which aims to let you do things like:
from pbs import ifconfig
print(ifconfig("eth0"))
It does this by replacing sys.modules['pbs'] with an instance of a subclass of the module type which overrides __getattr__ to look for programs on the path. That’s nice, and I can see that it would be useful in some contexts, but I don’t find that wc(ls("/etc", "-1"), "-l") is more readable than call("ls /etc -1 | wc -l") in the general case.
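The module-replacement trick can be sketched roughly as follows. This is not pbs’s actual code, just a minimal illustration of the idea; the names ProgramModule, run_program and progs are made up for the example.

```python
import shutil
import subprocess
import sys
import types


class ProgramModule(types.ModuleType):
    """Hypothetical module type whose missing attributes resolve to programs on the path."""

    def __getattr__(self, name):
        # __getattr__ is only called when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        path = shutil.which(name)
        if path is None:
            raise AttributeError("no program named %r found on the path" % name)

        def run_program(*args):
            # Arguments are passed as a list, so no shell is involved.
            return subprocess.check_output([path] + list(args),
                                           universal_newlines=True)

        run_program.__name__ = name
        return run_program


# Register an instance under a made-up module name, so that
# "from progs import <program>" triggers the lookup above.
sys.modules["progs"] = ProgramModule("progs")
```

With that in place, on a system that has an echo executable on the path, `from progs import echo` followed by `echo("hello")` runs the program and returns its output as a string.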
I’ve been experimenting with my own wrapper for subprocess, called sarge. The main things I need are:
- I want to use command pipelines, but using subprocess out of the box often leads to deadlocks because pipe buffers get filled up.
- I want to use bash-style pipe syntax on Windows as well as Posix, but Windows shells don’t support some of the syntax I want to use, like &&, ||, |& and so on.
- I want to process output from commands in a flexible way, and communicate() is not always flexible enough for my needs - for example, if I need to process output a line at a time.
- I want to avoid shell injection problems by having the ability to quote command arguments safely, and I want to minimise the use of shell=True, which I generally have to use when using pipelined commands.
- I don’t want to set arbitrary limits on passing data between processes, such as Envoy’s 10MB limit.
- subprocess allows you to let stderr be the same as stdout, but not the other way around - and I sometimes need to do that.
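To make the pipeline and quoting points concrete, here is a rough sketch of what they look like with only the standard library. It is not how sarge does it; run_pipeline and quoted_command are hypothetical helper names, and shlex.quote assumes Python 3.3 or later and a POSIX shell.

```python
import shlex
import subprocess
import sys


def run_pipeline(producer_args, consumer_args):
    """Run the equivalent of `producer | consumer` without shell=True."""
    producer = subprocess.Popen(producer_args, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(consumer_args, stdin=producer.stdout,
                                stdout=subprocess.PIPE)
    # Close our copy of the pipe so the producer can receive SIGPIPE
    # if the consumer exits early.
    producer.stdout.close()
    # communicate() keeps draining the consumer's output, so a full pipe
    # buffer cannot block the consumer.
    output, _ = consumer.communicate()
    producer.wait()
    return [producer.returncode, consumer.returncode], output


def quoted_command(template, *args):
    """Fill a template with shell-quoted values, for when a shell is unavoidable."""
    return template.format(*[shlex.quote(str(arg)) for arg in args])


if __name__ == "__main__":
    # Use the Python interpreter itself as both commands to stay portable.
    emit = [sys.executable, "-c", "print('a'); print('b'); print('c')"]
    count = [sys.executable, "-c",
             "import sys; print(len(sys.stdin.readlines()))"]
    print(run_pipeline(emit, count))
    print(quoted_command("cat {0}", "a file; rm -rf x"))
```

Even for two commands this needs a fair amount of care (closing the right pipe end, waiting on both processes), which is the kind of bookkeeping a wrapper can hide.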
I’ve been working on supporting these use cases, so sarge offers the following features:
A simple run function which allows a rich subset of Bash-style shell command syntax, but parsed and run by sarge so that you can run cross-platform on Posix and Windows without cygwin:
>>> p = run('false && echo foo')
>>> p.commands
[Command('false')]
>>> p.returncodes
[1]
>>> p.returncode
1
>>> p = run('false || echo foo')
foo
>>> p.commands
[Command('false'), Command('echo foo')]
>>> p.returncodes
[1, 0]
>>> p.returncode
0
The ability to format shell commands with placeholders, such that variables are quoted to prevent shell injection attacks:
>>> from sarge import shell_format
>>> shell_format('ls {0}', '*.py')
"ls '*.py'"
>>> shell_format('cat {0}', 'a file name with spaces')
"cat 'a file name with spaces'"
The ability to capture output streams without requiring you to program your own threads. You just use a Capture object and then you can read from it as and when you want:
>>> from sarge import Capture, run
>>> with Capture() as out:
... run('echo foobarbaz', stdout=out)
...
<sarge.Pipeline object at 0x175ed10>
>>> out.read(3)
'foo'
>>> out.read(3)
'bar'
>>> out.read(3)
'baz'
>>> out.read(3)
'\n'
>>> out.read(3)
''
A Capture object can capture the output from multiple commands:
>>> from sarge import run, Capture
>>> p = run('echo foo; echo bar; echo baz', stdout=Capture())
>>> p.stdout.readline()
'foo\n'
>>> p.stdout.readline()
'bar\n'
>>> p.stdout.readline()
'baz\n'
>>> p.stdout.readline()
''
Delays in commands are honoured in asynchronous calls:
>>> from sarge import run, Capture
>>> cmd = 'echo foo & (sleep 2; echo bar) & (sleep 1; echo baz)'
>>> p = run(cmd, stdout=Capture(), async=True) # returns immediately
>>> p.close() # wait for completion
>>> p.stdout.readline()
'foo\n'
>>> p.stdout.readline()
'baz\n'
>>> p.stdout.readline()
'bar\n'
Here, the sleep commands ensure that the asynchronous echo calls occur in the order foo (no delay), baz (after a delay of one second) and bar (after a delay of two seconds); the capturing works as expected.
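For comparison, this is roughly the thread you would otherwise have to write yourself to read a child’s output a line at a time with plain subprocess. It is only a sketch of the general technique, not sarge’s implementation; start_with_line_queue and pump are made-up names.

```python
import queue
import subprocess
import sys
import threading


def start_with_line_queue(args):
    """Start a process; return it with a queue that receives its stdout lines."""
    proc = subprocess.Popen(args, stdout=subprocess.PIPE,
                            universal_newlines=True)
    lines = queue.Queue()

    def pump():
        # Runs in a background thread: forward each line as soon as it
        # arrives, so the caller never blocks on the pipe directly.
        for line in proc.stdout:
            lines.put(line)
        proc.stdout.close()
        lines.put(None)  # sentinel: no more output

    threading.Thread(target=pump, daemon=True).start()
    return proc, lines


if __name__ == "__main__":
    child = [sys.executable, "-u", "-c",
             "import time; print('foo'); time.sleep(0.2); print('bar')"]
    proc, lines = start_with_line_queue(child)
    # iter() with a sentinel stops when the pump thread signals end of output.
    for line in iter(lines.get, None):
        print(repr(line))
    proc.wait()
```

The caller can pull lines from the queue whenever it likes, with or without a timeout, while the child keeps running.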
Sarge hasn’t been released yet, but it’s not far off being ready. It’s meant for Python >= 2.6.5 and is tested on 2.6, 2.7, 3.1, 3.2 and 3.3 on Linux, Mac OS X, Windows XP and Windows 7 (not all versions are tested on all platforms, but the overall test coverage is comfortably over 90%).
I have released the sarge documentation on Read The Docs; I’m hoping people will read this and give some feedback about the API and feature set being proposed, so that I can fill in any gaps where possible and perhaps make it more useful to other people. Please add your comments here, or via the issue tracker on the BitBucket project for the docs.
Another (far less ambitious) one to add to your list: my own ShellCommand (http://shell-command.readthedocs.org).
It makes no attempt to be cross-platform - it's really just the thinnest possible layer I can devise to make subprocess usable for system administration style tasks without wanting to punch the screen and without creating easy openings for shell injection attacks against naive applications. (I need it to be minimalist, because I plan to add the relevant APIs directly to subprocess.)
The API design is guided by one simple rule: do not reinvent anything a POSIX system shell handles well. That means no specific pipelining support or anything else. It's directly inspired by the simplicity of Perl's shell invocation syntax, just with the implicit interpolation replaced by explicit interpolation, interpolated values quoted by default and the special literal syntax replaced by calls to convenience functions.
Yes - I've had a look at ShellCommand, and borrowed some ideas from it in the area of shell formatting (I do mention the inspiration you provided in the docs).
The reason I've encroached into the Posix shell's core competence area (at least as far as command parsing goes) is a desire for cross-platform support (but partly just as an experiment, to see how far I could take it).
I got into this whole area through some ruminations born out of maintaining the python-gnupg project, which is a cross-platform subprocess wrapper for GnuPG. While python-gnupg works directly with subprocess, and doesn't have to worry about doing long pipelines, the problem was interesting enough that I wanted to see if I could take it beyond a very minimal wrapper.
AFAICT some of the enhancements I've added in sarge could go into subprocess proper, but they're hardly mature enough for that. You know how stdlib backwards compatibility constraints can be a millstone :-(