I recently got into a small discussion on Hacker News with someone who suggested that "Using python ... is almost always a better choice than trying to write even one loop, if statement, or argument parser in a bash script."
I've written my fair share of Python and Bash, and in my opinion that view is misguided: Bash and Python each have their own niche to fill, and each excels in its own domain.
So if you're someone who writes only Python scripts, maybe this post will encourage you to explore Bash.
Here are two strong use cases for Bash. I was going to write more, but felt that the article was getting too long for what I thought was a relatively straightforward topic.
If you know exactly what you need a script to do, and you only need to run it once, it just makes sense to write it in Bash. The pipe operator is extremely powerful and massively reduces the amount of text you need to write. Take, for example, scraping images off a website.
This is how I would do it in a Bash script.
```bash
#!/usr/bin/env bash
set -euo pipefail

URL="https://example.com"

curl -s "$URL" \
  | pup 'img.myClass attr{src}' \
  | grep -v 'myFilter' \
  | xargs -n 1 -- wget -P imageDir
```
Look at that beauty. Now let's say I want to do it in Python. Maybe this is just a skill issue thing, but this is how I would do it.
```python
import os

from requests import get
from bs4 import BeautifulSoup

URL = "https://example.com"
IMAGE_DIR = "imageDir"

# wget -P creates the target directory for you; here we have to do it ourselves
os.makedirs(IMAGE_DIR, exist_ok=True)

res = get(URL)
soup = BeautifulSoup(res.text, 'html.parser')

for img in soup.select('img.myClass'):
    url = img['src']
    if 'myFilter' in url:
        continue
    filename = url.split("/")[-1]
    with get(url, stream=True) as r:
        with open(os.path.join(IMAGE_DIR, filename), "wb") as f:
            for ck in r.iter_content(chunk_size=8192):
                f.write(ck)
```
Boy, that was a handful! And I didn't even bother to add error handling with try-except!
Let's look at some issues:
The Python script depends on the external dependencies `requests` and `BeautifulSoup`. Technically, you could write the same script using the `HTMLParser` and `http.client` modules from Python's stdlib, but that would make the script even more complex.
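If you're curious what that looks like, here's a rough stdlib-only sketch of just the URL-collecting half, reusing the made-up `myClass`/`myFilter` placeholders from the earlier example; downloading the images would add yet more code on top:

```python
from html.parser import HTMLParser
from urllib.request import urlopen

class ImgCollector(HTMLParser):
    """Collect the src of every <img class="myClass"> tag."""
    def __init__(self):
        super().__init__()
        self.srcs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img" and "myClass" in (attrs.get("class") or "").split():
            if "src" in attrs:
                self.srcs.append(attrs["src"])

with urlopen("https://example.com") as res:
    html = res.read().decode()

parser = ImgCollector()
parser.feed(html)

# The grep -v equivalent
print([u for u in parser.srcs if "myFilter" not in u])
```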
Aha! You have committed a folly, for you have forgotten the Bash script depends on `pup`!
Well, the difference lies in how each dependency is packaged. Firstly, the Python dependencies must be installed per interpreter. So unless you're creating a virtual environment for every Python script you write, that means installing the dependencies globally.
And because these tools do not follow KISS, it is very likely that some time down the road, a version will be released that will break your script.
But I'm only using the simple APIs, those aren't going to change!
True, but the other issue is that down the road, you might need a different library that depends on a specific version of these dependencies, and that's going to drop you into shit creek without a paddle, as they say.
`pup` is a tool made with the UNIX philosophy in mind, and it does a very specific job. Breaking changes are hard to fathom here, though I'm sure they're possible.
Most CLI tools are purpose-built and have minimal dependencies, if any. Python's design, on the other hand, inherently relies on importing modules, and while it's thankfully not too common, this can cause dependency conflicts.
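As an aside, the "installed per interpreter" point from earlier is easy to see for yourself; each Python binary reports its own `site-packages` (a tiny stdlib sketch):

```python
import site
import sys

# A different interpreter (system python, pyenv build, venv, ...) will
# print a different executable and a different site-packages directory.
print(sys.executable)
print(site.getsitepackages())
```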
I don't think I need to say much here.
The great thing about tools that follow KISS is that it's hard to get them wrong. The tools are designed for a very specific, common task, and as such their APIs are extremely accessible.
Python on the other hand, being an extremely powerful language, comes with a lot of the associated complexity. Everything is an object, and all those objects have a bazillion methods. Who has time to remember all the specific methods and their options? Why is there a `res.json()` and a `res.text` and a `res.raw` and a `res.content`?
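For the record, they're all different views of the same response body. A quick sketch, assuming `res` comes from `requests.get` against some JSON endpoint (httpbin.org is just an illustrative public test service):

```python
import requests

res = requests.get("https://httpbin.org/get")

body_bytes = res.content  # the body as raw bytes
body_str = res.text       # the body decoded to str using a guessed encoding
body_json = res.json()    # the body parsed as JSON (raises if it isn't JSON)
raw_sock = res.raw        # the underlying urllib3 response; mostly useful with stream=True
```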
And thank god the filter string was a simple string instead of a regex, because then we'd have had to interact with the `re` library and its objects and methods.
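If it had been a regex, the comparison would look something like this (the thumbnail pattern here is made up purely for illustration), while the Bash side would only change from `grep -v` to `grep -vE`:

```python
import re

# Made-up filter: drop thumbnail URLs like ".../thumb_01.png"
pattern = re.compile(r"thumb_\d+\.png$")

urls = [
    "https://example.com/img/full_01.png",
    "https://example.com/img/thumb_01.png",
]

# In Bash this whole block is: ... | grep -vE 'thumb_[0-9]+\.png$'
print([u for u in urls if not pattern.search(u)])
```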
On a Linux system, the command line is King. Whatever you want to configure on your system, you can almost always do it faster from the CLI than through a GUI. A lot of services and daemons ship CLI tools to interact with them.
I cannot fathom why anyone would want to interact with, say, `cron`, `docker`, `nginx`, or `ssh` with Python as a middleman. You would need to use the `subprocess` module, and then decide between `Popen()` versus `run()`, and in both cases it's a PITA to access STDOUT and STDERR. And forget piping if you're going that route.
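To make that pain concrete, here's a sketch of `docker ps | grep nginx` through `subprocess` (assuming `docker` and `grep` are on your PATH); the hand-wired pipe in the second half is the part Bash gives you for free with `|`:

```python
import subprocess

# run() is fine for a single command, as long as you remember the flags
# needed to actually capture the output.
result = subprocess.run(
    ["docker", "ps"],
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout)

# Piping means wiring Popen objects together by hand.
ps = subprocess.Popen(["docker", "ps"], stdout=subprocess.PIPE, text=True)
grep = subprocess.Popen(
    ["grep", "nginx"],
    stdin=ps.stdout,
    stdout=subprocess.PIPE,
    text=True,
)
ps.stdout.close()  # so `docker ps` gets SIGPIPE if grep exits early
out, _ = grep.communicate()
print(out)
```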
Now, what Python has over Bash in terms of simple APIs is data manipulation. For example, working with arrays, maps, and floats just isn't fun in Bash. Let's do something that would really suck: let's square an array of floats! We can employ Python inside our Bash script.
```bash
UGLY_FLOAT_ARR=("1.1" "2.2" "3.3" "10.123")

SQUARED_FLOAT_ARR=($(python <<EOF
x = [$(printf '%s,' "${UGLY_FLOAT_ARR[@]}" | sed 's/,$//')]
print([a**2 for a in x])
EOF
))

echo "${SQUARED_FLOAT_ARR[@]}"
# Output:
# [
#   1.2100000000000002,
#   4.840000000000001,
#   10.889999999999999,
#   102.47512899999998
# ]
```
Ah, nice to see you again, my old friend floating-point inaccuracy. Now, it's also true that you could embed Bash commands inside a Python script, like so:
```python
import subprocess

bash_command = """
cat /etc/os-release | grep -o "^NAME=.*"
"""

result = subprocess.run(
    bash_command,
    shell=True,
    capture_output=True,
    text=True
)

print(result.stdout)
# Output:
# NAME=NixOS
```
And I think that works decently, too. I haven't explored Bash-in-Python as much, so it's possible there are some additional benefits here that I'm not recognizing, but I believe the most significant factor boils down to whether most of your script is suited for native Bash or native Python. Anyhow, I hope this helps some people in their scripting endeavors.