The subprocess module provides some very useful functionality for working with external programs from Python applications, but it is often criticised for being harder to use than it needs to be. See, for example, Kenneth Reitz’s Envoy project, which aims to provide an ease-of-use wrapper over subprocess. There’s also Andrew Moffat’s pbs project, which aims to let you do things like:

from pbs import ifconfig
print ifconfig("eth0")

It does this by replacing sys.modules['pbs'] with an instance of a subclass of the module type, which overrides __getattr__ to look for programs on the path. That’s nice, and I can see that it would be useful in some contexts, but I don’t find that wc(ls("/etc", "-1"), "-l") is more readable than call("ls /etc -1 | wc -l") in the general case.
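The module-replacement trick can be sketched roughly as follows. This is not pbs's actual code: the module name programs and the class ProgramModule are made up for illustration, and the sketch assumes Python 3.3 or later for shutil.which.

```python
import shutil
import subprocess
import sys
import types


class ProgramModule(types.ModuleType):
    """Module subclass whose missing attributes resolve to programs on PATH."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails. Dunder names are
        # rejected so that the import machinery's own probes behave normally.
        if name.startswith('__'):
            raise AttributeError(name)
        path = shutil.which(name)
        if path is None:
            raise AttributeError('no program named %r on PATH' % name)

        def runner(*args):
            # Arguments are passed as a list, so no shell is involved.
            return subprocess.check_output([path] + list(args),
                                           universal_newlines=True)

        runner.__name__ = name
        return runner


# Hypothetical module name: registering the instance in sys.modules is what
# makes "from programs import <some program>" resolve through __getattr__.
sys.modules['programs'] = ProgramModule('programs')
```

With this in place, an import such as "from programs import ls" would hand back a function that runs the ls found on the path, assuming one exists there.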

I’ve been experimenting with my own wrapper for subprocess, called sarge. The main things I need are:

  • I want to use command pipelines, but using subprocess out of the box often leads to deadlocks because pipe buffers get filled up.
  • I want to use bash-style pipe syntax on Windows as well as Posix, but Windows shells don’t support some of the syntax I want to use, like &&, ||, |& and so on.
  • I want to process output from commands in a flexible way, and communicate() is not always flexible enough for my needs - for example, if I need to process output a line at a time.
  • I want to avoid shell injection problems by having the ability to quote command arguments safely, and I want to minimise the use of shell=True, which I generally have to use when using pipelined commands.
  • I don’t want to set arbitrary limits on passing data between processes, such as Envoy’s 10MB limit.
  • subprocess allows you to let stderr be the same as stdout, but not the other way around - and I sometimes need to do that.
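For the quoting requirement, the standard library already offers a building block. The following is a minimal sketch of placeholder-based quoting built on shlex.quote (Python 3.3 and later; pipes.quote is the 2.x equivalent). The function name quote_format is an assumption made for illustration and is not part of sarge, and the quoting rules are those of Posix shells only.

```python
import shlex


def quote_format(template, *args, **kwargs):
    """Format a command template, shell-quoting every substituted value.

    Hypothetical helper: each positional or keyword value is passed through
    shlex.quote before being substituted, so shell metacharacters in the
    values are treated as literal text by a Posix shell.
    """
    quoted_args = [shlex.quote(str(a)) for a in args]
    quoted_kwargs = dict((k, shlex.quote(str(v))) for k, v in kwargs.items())
    return template.format(*quoted_args, **quoted_kwargs)


# A value containing a command separator stays a single, inert argument.
command = quote_format('cat {0}', 'notes.txt; rm -rf /')
```

Only the substituted values are quoted; the template itself is trusted, so it should never be built from user input.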

I’ve been working on supporting these use cases, so sarge offers the following features:

  • A simple run function which allows a rich subset of Bash-style shell command syntax, but parsed and run by sarge so that you can run cross-platform on Posix and Windows without cygwin:

    >>> p = run('false && echo foo')
    >>> p.commands
    [Command('false')]
    >>> p.returncodes
    [1]
    >>> p.returncode
    1
    >>> p = run('false || echo foo')
    foo
    >>> p.commands
    [Command('false'), Command('echo foo')]
    >>> p.returncodes
    [1, 0]
    >>> p.returncode
    0
  • The ability to format shell commands with placeholders, such that variables are quoted to prevent shell injection attacks:

    >>> from sarge import shell_format
    >>> shell_format('ls {0}', '*.py')
    "ls '*.py'"
    >>> shell_format('cat {0}', 'a file name with spaces')
    "cat 'a file name with spaces'"
  • The ability to capture output streams without requiring you to program your own threads. You just use a Capture object and then you can read from it as and when you want:

    >>> from sarge import Capture, run
    >>> with Capture() as out:
    ...     run('echo foobarbaz', stdout=out)
    ...
    <sarge.Pipeline object at 0x175ed10>
    >>> out.read(3)
    'foo'
    >>> out.read(3)
    'bar'
    >>> out.read(3)
    'baz'
    >>> out.read(3)
    '\n'
    >>> out.read(3)
    ''

    A Capture object can capture the output from multiple commands:

    >>> from sarge import run, Capture
    >>> p = run('echo foo; echo bar; echo baz', stdout=Capture())
    >>> p.stdout.readline()
    'foo\n'
    >>> p.stdout.readline()
    'bar\n'
    >>> p.stdout.readline()
    'baz\n'
    >>> p.stdout.readline()
    ''

    Delays in commands are honoured in asynchronous calls:

    >>> from sarge import run, Capture
    >>> cmd = 'echo foo & (sleep 2; echo bar) & (sleep 1; echo baz)'
    >>> p = run(cmd, stdout=Capture(), async=True) # returns immediately
    >>> p.close() # wait for completion
    >>> p.stdout.readline()
    'foo\n'
    >>> p.stdout.readline()
    'baz\n'
    >>> p.stdout.readline()
    'bar\n'
    >>>

    Here, the sleep commands ensure that the asynchronous echo calls occur in the order foo (no delay), baz (after a delay of one second) and bar (after a delay of two seconds); the capturing works as expected.
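To give an idea of the kind of threading code that a Capture object saves you from writing, here is a minimal sketch using only the standard library. It is not sarge's implementation: the helper name start_capture is invented for illustration, and a Python child process stands in for an external command so that the example runs on any platform.

```python
import subprocess
import sys
import threading
from queue import Queue


def start_capture(stream):
    """Drain `stream` on a background thread, returning a queue of lines."""
    lines = Queue()

    def reader():
        # readline() returns '' at end of stream, which stops the loop.
        for line in iter(stream.readline, ''):
            lines.put(line)
        stream.close()
        lines.put(None)  # sentinel marking the end of the output

    thread = threading.Thread(target=reader)
    thread.daemon = True
    thread.start()
    return lines


# Stand-in for an external command that writes three lines to stdout.
child_code = 'print("foo"); print("bar"); print("baz")'
proc = subprocess.Popen([sys.executable, '-c', child_code],
                        stdout=subprocess.PIPE, universal_newlines=True)
captured = start_capture(proc.stdout)
proc.wait()  # the reader thread keeps the pipe from filling up meanwhile

# Consume the output a line at a time, whenever convenient.
collected = []
while True:
    line = captured.get(timeout=10)
    if line is None:
        break
    collected.append(line)
```

Because the pipe is drained continuously in the background, the child never blocks on a full pipe buffer, and the caller can still process output line by line instead of waiting for communicate() to return everything at once.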

Sarge hasn’t been released yet, but it’s not far off being ready. It’s meant for Python >= 2.6.5 and is tested on 2.6, 2.7, 3.1, 3.2 and 3.3 on Linux, Mac OS X, Windows XP and Windows 7 (not all versions are tested on all platforms, but the overall test coverage is comfortably over 90%).

I have released the sarge documentation on Read The Docs; I’m hoping people will read this and give some feedback about the API and feature set being proposed, so that I can fill in any gaps where possible and perhaps make it more useful to other people. Please add your comments here, or via the issue tracker on the BitBucket project for the docs.


When I set up xrdp on Raspbian Jessie a while ago, the keyboard layout appeared to be wrong - commonly used keys seemed to be returning US keycodes rather than UK ones. I found this post very helpful in resolving the problem, but it didn't quite fit the bill when I tried to do the same with a Raspbian Stretch instance recently. Here's what I did on Raspbian Stretch to set up xrdp to provide the correct keycodes.

First, I checked that the keyboard layout was as expected:

$ cat /etc/default/keyboard | grep LAYOUT
XKBLAYOUT="gb"

Then, I generated a keyboard mapping file using xrdp-genkeymap:

$ xrdp-genkeymap km-00000809.ini

This filename follows the current filename convention (under Jessie, it was km-0809.ini).

The implementation of PEP 391 (Dictionary-Based Configuration for Logging) provides, under the hood, the basis for a flexible, general-purpose configuration mechanism. The class which performs the logging configuration work is DictConfigurator, and it's based on another class, BaseConfigurator.
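As a small illustration of using BaseConfigurator outside logging, the sketch below resolves ext:// and cfg:// references in an ordinary dictionary. The dictionary keys (server, path_joiner, listen_port) are made up for the example; only the two prefixes and the BaseConfigurator class come from the logging.config module.

```python
import logging.config
import os.path

# Hypothetical settings: "ext://" refers to an importable object,
# "cfg://" refers to another value inside the same configuration.
settings = {
    'server': {'host': 'localhost', 'port': 8080},
    'path_joiner': 'ext://os.path.join',
    'listen_port': 'cfg://server.port',
}

configurator = logging.config.BaseConfigurator(settings)

# Values are converted lazily, when they are looked up via .config.
joiner = configurator.config['path_joiner']
port = configurator.config['listen_port']
```

Here joiner ends up being the os.path.join function itself, and port is the integer stored under server.port.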

Last year, Mark Hammond proposed PEP 397 (Python launcher for Windows), to bring some much needed functionality for Python on Windows. Historically, Python on Windows does not add itself to the system path; this needs to be done manually by users as a separate step. This may change in the future, but it remains the case for Python versions that are already released.

With the acceptance of PEP 414 (Explicit Unicode Literal for Python 3.3), string literals with u prefixes will be permitted syntax in Python 3.3, though they cause a SyntaxError in Python 3.2 (and earlier 3.x versions). The motivation behind the PEP is to make it easier to port any 2.x project which has a lot of Unicode literals to 3.3 using a single codebase strategy. That’s a strategy which avoids the need to repeatedly run 2to3 on the code, either during development or at installation time.


Eric Holscher, one of the creators of Read The Docs, recently posted about the importance of a documentation culture in Open Source development, and about things that could be done to encourage this. He makes some good points, and Read The Docs is a very nice looking showcase for documentation.
All content on this blog is Copyright © 2012-2016 Vinay Sajip.