Ghost in the shell script: Boffins reckon they can catch bugs before programs run
- Reference: 1746005226
- News link: https://www.theregister.co.uk/2025/04/30/shell_script_code_correctness/
- Source link:
The team argues it's possible to analyze shell scripts ahead of execution, offering developers pre-runtime guarantees more typical of statically typed languages. Their research focuses on taming the brittle and unpredictable behavior of shell environments like Bash and Zsh — where a single poorly constructed rm -rf can potentially reduce a system to rubble.
Unix and Linux environments have long relied on shells like Bash and Zsh, which serve as command line interpreters for interacting with the system. Shell programming remains hugely popular – it was the eighth most popular programming language in 2024, [1]according to GitHub.
"The Unix shell has been around for more than half a century at this point," Nikos Vasilakis, assistant professor of computer science at Brown University in the US, told The Register. "Because of certain characteristics that it has, it's unusual. It's a source of many, many serious bugs or problems, both in terms of supply chain security and in terms of correctness."
Vasilakis pointed to high profile shell-related bugs affecting Nvidia drivers, Apple iTunes, and the [5]2015 Steam shell scripting blunder that wiped files from Linux PCs.
But according to Vasilakis, shell programming doesn't get much attention from academics because of its unusual semantics.
"Most programming languages already have a principal design, so their syntax and semantics follow a very, very principled approach," he explained. "But the shell is actually one of the oldest environments out there. And it was designed at a time when people didn't design languages and environments in such a principled way, so it was a Wild West."
Shell scripts can therefore be difficult to debug, develop, and maintain. And yet they're everywhere.
"Shell programs are sort of the underlying infrastructure used for all sorts of continuous integration and continuous deployment," said Vasilakis. "And so everything, in some sense, runs on shell programs, but it's the kind of infrastructure that you do not easily see."
So Vasilakis and his academic colleagues – Lukas Lazarek, Seong-Heon Jung, Evangelos Lamprou, Zekai Li, Anirudh Narsipur, Eric Zhao, Michael Greenberg, Konstantinos Kallas, and Konstantinos Mamouras – have been developing ways to apply static analysis - a method for analyzing how code will perform without having to actually execute it - to evaluate shell scripts. Their idea is to make it possible to check a script for correctness before it gets the chance to nuke your files.
They describe their efforts in [11]a forthcoming paper [PDF] titled "From Ahead-of- to Just-in-Time and Back Again: Static Analysis for Unix Shell Programs," which they will present at the [12]HotOS XX conference in May. (The event’s 20th edition brings with it a Roman numeral that has nothing to do with the adult entertainment industry.)
The paper, which will eventually be formally available at [14]this URL, argues that making shell scripts amenable to static analysis needs three things:
Breaking out and recognizing elements suitable for static guarantees;
Using large language models to check shell command documentation against actual behavior;
Deploying safety-aware runtime monitoring to catch serious bugs before they do damage.
"We're developing essentially a series of systems that alleviate these problems by checking the correctness of these computations before the execution of the program," said Vasilakis. "So basically within a second you can tell whether your program is going to crash or whether it's going to execute as expected."
Static analysis is currently not particularly well suited to shell scripts, the paper points out. Shell scripts are dynamic in nature, with runtime code evaluation and [15]shell parameter expansion that can't easily be anticipated.
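A toy illustration of that dynamism (the variable names here are invented for the sketch, not taken from the paper): the command a script runs can be assembled at runtime from the environment and arguments, so the script's text alone doesn't determine its behavior.

```shell
#!/bin/sh
# Which command runs here is decided at runtime, so a static
# analyzer cannot tell from the text whether this is harmless.
action="${CLEANUP_CMD:-echo}"   # defaults to a harmless echo...
target="${1:-/tmp/scratch}"     # ...but both pieces come from outside
# With CLEANUP_CMD=rm and a hostile first argument, this same
# final line becomes destructive.
"$action" "$target"
```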
Vasilakis said that his colleagues and collaborators from other institutions have created compilers and analysis systems to help with the parallelizing and distribution of shell programs.
"And now we're building on these compilers and analysis systems to tackle a very different challenge, which is correctness," he explained. "Can we say something about the correct execution of these programs across environments? That is a new thing."
We're told the team's code so far for performing this analysis will be shared shortly.
"This is the third serious attempt on this problem, but the first successful one," said Vasilakis. "The first time we tried to solve this problem was in 2022 at MIT with a team of researchers from the Max Planck Institute in Germany. We failed. Then, I tried again with a larger team during my first year at Brown — with collaborators from several institutions in the US and Europe. We semi-failed: we found a way to bypass the narrow version of the problem, in some environments, and with some assumptions – but we did not solve it."
Assuming the authors' efforts pan out – this is the first in a series of papers under submission that attempt to address the shell scripting problem – shell scripting could become far more predictable. ®
[1] https://github.blog/news-insights/octoverse/octoverse-2024/#the-most-popular-programming-languages
[5] https://www.theregister.com/2015/01/17/scary_code_of_the_week_steam_cleans_linux_pcs/
[11] https://nikos.vasilak.is/p/sash:hotos:2025.pdf
[12] https://sigops.org/s/conferences/hotos/2025/program.html
[14] https://doi.org/10.1145/3713082.3730395
[15] https://www.gnu.org/software/bash/manual/html_node/Shell-Parameter-Expansion.html
Shellcheck
There's an amazing GitHub project called shellcheck that already does a lot of this work... I wonder if the researchers were aware of it?
Re: Shellcheck
I have a feeling the researchers weren't interested in existing solutions, instead they had AI to peddle. From TFA:
Using large language models to check shell command documentation against actual behavior
Re: Shellcheck
Great, we can look forward to correctness when compared against some bash/ksh combined AI slop scripting language.
Shell programming checks - apply Occam's razor.
I'll clam up now.
Boffins reckon they can catch bugs before programs run
So did CrowdStrike; that did not end well. ;)
Bash compiler ?
Surely this is a bag of a fag packet task for "AI" these days ?
Re: Bash compiler ?
" bag of a fag packet task "
Sometime a typo just makes it golden. :))
HotOS XX conference?
I'll wait 10 years for the really hot one
Two easy bash script tests
1) The first line should be "#! /bin/bash -e". That -e means exit on error. If the author missed it then when the script finds something unexpected it will plow on regardless and do things you do not want. This tells you enough about the author to not run his scripts.
2) If the script is over 50 lines long it should have been written in a proper high level language.
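For what it's worth, the difference -e makes can be seen in two one-liners:

```shell
# Without -e the shell plows on past a failed command; with -e it
# stops at the first failure.
sh -c  'false; echo "plowed on regardless"'
sh -ec 'false; echo "plowed on regardless"' || echo "with -e: stopped at the failure"
```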
Re: Two easy bash script tests
Nobody expects the Spanish inquisition. Our three tests are 1) -e, 2) script length, and...
3) bash -n
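bash -n only parses, so it flags syntax errors but nothing semantic; a quick demonstration with throwaway files:

```shell
# bash -n parses a script without executing it: syntax errors are
# caught, but runtime mistakes (typo'd variables, failing commands)
# pass silently.
printf 'echo ok\n' > /tmp/good.sh
printf 'if true; then echo broken\n' > /tmp/bad.sh   # missing "fi"
bash -n /tmp/good.sh && echo "good.sh: syntax OK"
bash -n /tmp/bad.sh 2>/dev/null || echo "bad.sh: syntax error"
```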
Re: Two easy bash script tests
4) set -u, to make using an undeclared variable an error (good for catching typos like FILENAME=$(foo) ; rm -f $FILENME)
Example
Worse:
DIRNAME=TMP
rm -rf /${DIRNAM}
Yes, there are (multiple) things programmers can do to prevent the bad results from this type of error. Despite that, someone at Valve made this sort of error with ~/ in a script that was pushed to all online Steam users some years back, wiping out all the affected users' stats and saved games.
(Luckily for me, my STEAM-running games PC was powered off that week.)
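That failure mode is exactly what set -u turns into a hard error; a minimal demonstration of the DIRNAM typo above:

```shell
# Without -u, the unset DIRNAM silently expands to nothing and the
# path collapses to "/". With -u the shell aborts before any harm.
sh -c  'DIRNAME=TMP; echo "would remove: /${DIRNAM}"'
sh -uc 'DIRNAME=TMP; echo "would remove: /${DIRNAM}"' 2>/dev/null \
    || echo "with -u: aborted, DIRNAM is unset"
```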
Re: Two easy bash script tests
I agree that the exit-on-error & error-on-undefined flags should always be set either via the shebang or the set command at the top of the file.
If using bash, then set -o pipefail is also a good idea.
But the first line should be #!/bin/sh -e not #!/bin/bash -e unless the script is actually using bash features like shopt -s lastpipe, arrays, local variables etc.
Using bash for shell scripts that only need basic POSIX shell features wastes a lot of resources, especially if said script is going to be run repeatedly from cron or via other background triggering.
On 64-bit Raspbian for example:-
$ ls -l /bin/{ba,da,}sh
-rwxr-xr-x 1 root root 1346480 Mar 29 2024 /bin/bash
-rwxr-xr-x 1 root root 133640 Jan 5 2023 /bin/dash
lrwxrwxrwx 1 root root 4 Jan 5 2023 /bin/sh -> dash
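To illustrate the pipefail point: by default a pipeline's exit status is that of its last command, so an upstream failure is silently swallowed.

```shell
# "false" stands in for any failing upstream stage (grep, curl...).
# Without pipefail the pipeline "succeeds" because wc -l did;
# with pipefail the failed stage's status propagates.
bash -c 'false | wc -l > /dev/null' \
    && echo "without pipefail: failure hidden"
bash -c 'set -o pipefail; false | wc -l > /dev/null' \
    || echo "with pipefail: failure detected"
```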
Re: Two easy bash script tests
$ ls -l /bin/bash
ls: /bin/bash: No such file or directory
If you must use bash features it should be launched with #!/usr/bin/env bash. /usr/bin/env is almost standardised.
Moreover set -e is overrated and unreliable. If a command can fail the script should check for the failure explicitly.
Unix (which is what's really being programmed here; the shell language is a distraction) is user friendly: it's just picky about who its friends are.
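One concrete example of set -e's unreliability: it is suppressed for any command whose status is being tested, including the entire body of a function called from that position.

```shell
# set -e is disabled inside any command tested by if, && or || --
# and that includes every command inside a function called from
# such a position. The "false" below does not abort the script.
bash -ec '
step() { false; echo "kept going despite false"; }
if step; then echo "and step even reported success"; fi
'
```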
Re: Two easy bash script tests
I always go for bash first because there is always some utility that is quicker to write in bash than some other higher level language.
- It's always: #!/usr/bin/env bash
- set -euo pipefail (or set -eo pipefail if the script is not destructive), but if not -u then we do...
- local varname=${some_other_varname:?undefined variable, abort}
Because the :? is a very useful bash variable expansion...
You can do some awful awful things with a combination of tools like yq + jq and still store your "configuration" in semi-readable YAML.
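For readers unfamiliar with it, ${var:?message} aborts a non-interactive shell with the given message when the variable is unset or empty, acting as a targeted, per-variable set -u (CONFIG_FILE is an invented name for illustration):

```shell
# ${var:?message} fails the expansion -- and a non-interactive
# shell exits -- when var is unset or empty.
sh -c ': "${CONFIG_FILE:?must be set}"; echo "using $CONFIG_FILE"' 2>/dev/null \
    || echo "aborted: CONFIG_FILE missing"
CONFIG_FILE=/etc/app.conf sh -c 'echo "using ${CONFIG_FILE:?must be set}"'
```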
Re: Two easy bash script tests
I know it works and is the more portable thing to do but #!/usr/bin/env bash seems illogical from a purely abstract point of view.
Avoiding an absolute path for invoking one program via the absolute path to another seems daft!
Anyway, back on topic, some other ways of improving code quality are always double-quoting substitutions (completely, e.g. cmd1 "$(cmd2 "${var}")"), never using backticks for command substitutions, and using xargs or mapfile instead of command substitution.
As someone said, installing shellcheck to lint all your scripts is a good place to start and will pick up these and many other potential gotchas.
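A small demonstration of the quoting and mapfile advice (the file name is invented for illustration):

```shell
# Unquoted expansion word-splits (and glob-expands); quoting keeps
# the value as a single argument. mapfile reads lines into an array
# without the splitting pitfalls of a bare $(...).
bash -c '
f="file with  spaces.txt"
printf "%s\n" $f   | wc -l    # unquoted: split into three words
printf "%s\n" "$f" | wc -l    # quoted: one argument, one line
mapfile -t lines < <(printf "a\nb\nc\n")
echo "read ${#lines[@]} lines"
'
```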
Re: Two easy bash script tests
Moreover set -e is overrated and unreliable. If a command can fail the script should check for the failure explicitly.
I agree that if the failure of a command would lead to the script leaving things in an inconsistent state, then it should be explicitly invoked via if or by checking $?.
But checking the result of every external command, shell built-in, and shell function that might fail is exhausting; a simple abort of the script - maybe with a trap on EXIT to clean up - is less error prone, and in any case set -e will act as a last-resort assertion.
Shell
I take some kind of morbid pride in the horrendous shell scripts that litter the landscape of many a company I've worked for over the past 30 years. If your eyes don't bleed, you've done it wrong.
Really?
I find the very concept of checking a shell script is a hiding to nothing.
The issue is that very little of what is written in a shell script is actually written in shell commands. A huge amount is just herding other (external) commands to get a task done, passing data through pipelines of non-shell tools to achieve the result.
As such, I like to think of the shell (pretty much any variant since the UNIX Edition 6 shell, which predates the Bourne shell) as more of a harness for holding things together with some programming structures to make life easier, rather than a fully specified programming language in its own right. In addition, it has handling of wild-card and variable substitution that makes it very suitable for tasks that would be very difficult to code in a more formal language.
There are two reasons I think this. A while ago, some article or other issued a challenge to write some shell to do something. I can't remember exactly what it was, but it was date related. The majority of posted solutions were posted as shell scripts, but in which, most of the actual processing was not done using shell built-ins. They were shell commands calling other tools like cal, or date, and using something like awk, sed and various other tools to process the output to produce the required result.
I actually tried to write a version that was pure shell. And it was very difficult. I was working in ksh88, but I also wrote versions in ksh93. This is not what shells were written to do!
The second reason is that I have been recently reviewing some Python which is being used to run Ansible playbooks, and this often involves quite a few hoops to run the external commands required for automation tasks, and then process the results in Python to work out how well it worked or whether it failed. I look at these programs, and then imagine what I would do in a shell, and find that quite often it would be much, much easier to do it as a series of commands in something like a pipeline, run from shell, rather than trying to do it in Python. For my own tasks that I have been told I need to write, I'm really thinking of writing shell scripts together with a deploy method in Ansible, and playbooks that just call the shell scripts. It would be much easier to write, and (IMHO) easier to maintain, although it's obviously subverting the whole reasoning of Ansible.
In conclusion, what I think I'm trying to say is that the way shell and shell scripts are used should not be treated as a formal language, and as such is very difficult to subject to automated code review.
rc
I recall the Plan 9 shell, rc, was a more modern design intended to address some of the shortcomings of the Bourne and Korn shells ([t]csh wasn't used for serious scripting by anyone who valued their sanity), and bash either didn't exist or was in its early days.
Years ago I had rc running on an HP-UX workstation with an X11 port of the window manager 9wm(?), which as I recall was sort of OK.
I know I knocked out a colossal quantity of shell script, mostly to glue unrelated applications together.
Awk and Perl always seemed to have an impedance mismatch with the OS and file system, which just made shell seem an easier choice.
Until the ascendance of Linux distributions the variety of Unix meant that nothing much beyond Bourne shell could be assumed. Even awk was old awk, perl ancient or not present, definitely no python by default. The init script on hpux 10.20 that configured (multiple) network interfaces and aliases was an impenetrable wonder of Bourne shell gymnastics.
Part of the problem, I suspect, is that users would prefer their scripting language to be the same as their interactive shell, which probably presents irreconcilable design goals.
Re: rc
To tell you the truth, for all I mistrust it, I suspect that something like PowerShell will end up being the way forward. But in using this, you need an OS whose components can and do understand complex data objects.
I actually quite liked the VM/CMS implementation of REXX as a command processing language, but the OS/2 and AIX implementations were not complete, so never really gained any headway.
One of the biggest problems (and past strengths as well) is that UNIX-like commands are designed to work on streams of bytes, often arranged as lines. This made every sense when OSes were simpler than they are now, and commands were written to be run as interactive commands. But modern problems are often much more complex than can be represented by just a stream of bytes.
If an OS's command set was re-implemented to use some form of object passing, this would make much of the convoluted shell scripting that has been required in the past redundant.
But it would no longer really be a UNIX-like OS!
Boy, am I glad I'm retiring shortly.
Double-guessing the user's intention is so reliable. Maybe I really do mean m -rfy /