I recently got into a small discussion on Hacker News with someone who suggested that "Using python ... is almost always a better choice than trying to write even one loop, if statement, or argument parser in a bash script."
I've written my fair share of Python and Bash, and in my opinion that view is misguided: Bash and Python each have their own niche to fill, and each excels in its own domain.
So if you're someone who writes only Python scripts, maybe this post will encourage you to explore Bash.
Here are two strong use cases for Bash. I was going to write more, but felt that the article was getting too long for what I thought was a relatively straightforward topic.
If you know exactly what you need a script to do, and you only need to run it once, it just makes sense to write it in Bash. The pipe operator is extremely powerful and massively reduces the amount of text you need to write. Take, for example, scraping images off a website.
This is how I would do it in a Bash script.
```bash
#!/usr/bin/env bash
set -euo pipefail

URL="https://example.com"

curl -s "$URL" \
  | pup 'img.myClass attr{src}' \
  | grep -v 'myFilter' \
  | xargs -n 1 -- wget -P imageDir
```
Look at that beauty. Now let's say I want to do it in Python. Maybe this is just a skill issue thing, but this is how I would do it.
```python
import os

from requests import get
from bs4 import BeautifulSoup

URL = "https://example.com"
IMAGE_DIR = "imageDir"

# wget -P creates the target directory for you; here we have to do it ourselves
os.makedirs(IMAGE_DIR, exist_ok=True)

res = get(URL)
soup = BeautifulSoup(res.text, 'html.parser')

for img in soup.select('img.myClass'):
    url = img['src']
    if 'myFilter' in url:
        continue
    filename = url.split("/")[-1]
    with get(url, stream=True) as r:
        with open(os.path.join(IMAGE_DIR, filename), "wb") as f:
            for ck in r.iter_content(chunk_size=8192):
                f.write(ck)
```
Boy, that was a handful! And I didn't even bother to add error handling with try-except!
Let's look at some issues:
The Python script depends on the external dependencies `requests` and `BeautifulSoup`. Technically, you could write the same script using the `HTMLParser` and `http.client` modules from Python's stdlib, but that would make the script even more complex.
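If you're curious what that looks like, here's a rough stdlib-only sketch of just the URL-collecting half, reusing the made-up `myClass`/`myFilter` placeholders from the earlier example; downloading the images would add yet more code on top:

```python
from html.parser import HTMLParser
from urllib.request import urlopen

class ImgCollector(HTMLParser):
    """Collect the src of every <img class="myClass"> tag."""
    def __init__(self):
        super().__init__()
        self.srcs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img" and "myClass" in (attrs.get("class") or "").split():
            if "src" in attrs:
                self.srcs.append(attrs["src"])

with urlopen("https://example.com") as res:
    html = res.read().decode()

parser = ImgCollector()
parser.feed(html)

# The grep -v equivalent
print([u for u in parser.srcs if "myFilter" not in u])
```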
Aha! You have committed a folly, for you have forgotten the Bash script depends on `pup`!
Well, the difference lies in how each dependency is packaged. Firstly, the Python dependencies must be installed per interpreter. So unless you're creating a virtual environment for every Python script you write, that means installing the dependencies globally.
And because these tools do not follow KISS, it is very likely that some time down the road, a version will be released that will break your script.
But I'm only using the simple APIs, those aren't going to change!
True, but the other issue is that down the road, you might need a different library that depends on a specific version of these dependencies, and that's going to drop you into shit creek without a paddle, as they say.
`pup` is a tool made with the UNIX philosophy in mind, and it does a very specific job. Breaking changes are hard to fathom here, though I'm sure they're possible.
Most CLI tools are purpose-built and have minimal dependencies, if any. Python's design, on the other hand, inherently relies on importing modules, and while it's thankfully not too common, this can cause dependency conflicts.
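As an aside, the "installed per interpreter" point from earlier is easy to see for yourself; each Python binary reports its own `site-packages` (a tiny stdlib sketch):

```python
import site
import sys

# A different interpreter (system python, pyenv build, venv, ...) will
# print a different executable and a different site-packages directory.
print(sys.executable)
print(site.getsitepackages())
```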
I don't think I need to say much here.
The great thing about tools that follow KISS is that it's hard to get them wrong. The tools are designed for a very specific, common task, and as such their APIs are extremely accessible.
Python on the other hand, being an extremely powerful language, comes with a lot of the associated complexity. Everything is an object, and all those objects have a bazillion methods. Who has time to remember all the specific methods and their options? Why is there a `res.json()` and a `res.text` and a `res.raw` and a `res.content`?
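For the record, they're all different views of the same response body. A quick sketch, assuming `res` comes from `requests.get` against some JSON endpoint (httpbin.org is just an illustrative public test service):

```python
import requests

res = requests.get("https://httpbin.org/get")

body_bytes = res.content  # the body as raw bytes
body_str = res.text       # the body decoded to str using a guessed encoding
body_json = res.json()    # the body parsed as JSON (raises if it isn't JSON)
raw_sock = res.raw        # the underlying urllib3 response; mostly useful with stream=True
```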
And thank god the filter string was a simple string instead of a regex, because then we'd have had to interact with the `re` library and its objects and methods.
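If it had been a regex, the comparison would look something like this (the thumbnail pattern here is made up purely for illustration), while the Bash side would only change from `grep -v` to `grep -vE`:

```python
import re

# Made-up filter: drop thumbnail URLs like ".../thumb_01.png"
pattern = re.compile(r"thumb_\d+\.png$")

urls = [
    "https://example.com/img/full_01.png",
    "https://example.com/img/thumb_01.png",
]

# In Bash this whole block is: ... | grep -vE 'thumb_[0-9]+\.png$'
print([u for u in urls if not pattern.search(u)])
```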
On a Linux system, the command line is King. Whatever you want to configure on your system, you can almost always do it faster from the CLI than through a GUI. A lot of services and daemons ship CLI tools to interact with them.
I cannot fathom why anyone would want to interact with, say, `cron`, `docker`, `nginx`, or `ssh` with Python as a middleman. You would need to use the `subprocess` module, and then decide between `Popen()` versus `run()`, and in both cases it's a PITA to access STDOUT and STDERR. And forget piping if you're going that route.
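To make that pain concrete, here's a sketch of `docker ps | grep nginx` through `subprocess` (assuming `docker` and `grep` are on your PATH); the hand-wired pipe in the second half is the part Bash gives you for free with `|`:

```python
import subprocess

# run() is fine for a single command, as long as you remember the flags
# needed to actually capture the output.
result = subprocess.run(
    ["docker", "ps"],
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout)

# Piping means wiring Popen objects together by hand.
ps = subprocess.Popen(["docker", "ps"], stdout=subprocess.PIPE, text=True)
grep = subprocess.Popen(
    ["grep", "nginx"],
    stdin=ps.stdout,
    stdout=subprocess.PIPE,
    text=True,
)
ps.stdout.close()  # so `docker ps` gets SIGPIPE if grep exits early
out, _ = grep.communicate()
print(out)
```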
Now, what Python has over Bash in terms of simple APIs is data manipulation. For example, working with arrays, maps, and floats just isn't fun in Bash. Let's do something that would really suck: let's square an array of floats! We can employ Python inside our Bash script.
```bash
UGLY_FLOAT_ARR=("1.1" "2.2" "3.3" "10.123")

SQUARED_FLOAT_ARR=($(python <<EOF
x = [$(printf '%s,' "${UGLY_FLOAT_ARR[@]}" | sed 's/,$//')]
print([a**2 for a in x])
EOF
))

echo "${SQUARED_FLOAT_ARR[@]}"
# Output:
# [
#   1.2100000000000002,
#   4.840000000000001,
#   10.889999999999999,
#   102.47512899999998
# ]
```
Ah, nice to see you again, my old friend floating-point inaccuracy. Now, it's also true that you could embed Bash commands inside a Python script, like so:
```python
import subprocess

bash_command = """
cat /etc/os-release | grep -o "^NAME=.*"
"""

result = subprocess.run(
    bash_command,
    shell=True,
    capture_output=True,
    text=True
)

print(result.stdout)
# Output:
# NAME=NixOS
```
And I think that works decently, too. I haven't explored Bash-in-Python as much, so it's possible there are some additional benefits here that I'm not recognizing, but I believe the most significant factor boils down to whether most of your script is suited for native Bash or native Python. Anyhow, I hope this helps some people in their scripting endeavors.