Pure Bash Web Server

133 points by shakna 2 years ago

For your consideration, the jukebox from my old CS club:

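    #!/bin/bash
    cd songs || exit 1
    # Create the queue FIFO on first run.
    if [[ ! -p q ]]; then
        mkfifo q
    fi

    # Listener: answer each connection, take the request path, drop the leading "/", and queue it if it looks like a URL.
    (while true; do echo "playing" | nc -l 10000 | head -n 1 | cut -f 2 -d " " | cut -b 2- | grep -E "^[a-zA-Z0-9/:.?=-]{1,}$" > q; done) &

    # Player: download each queued URL, then play it.
    while true; do
        if read -r a <q; then
            echo "$a"
            fn=$(youtube-dl -f m4a --id --get-filename "$a")
            youtube-dl -f m4a --id "$a"
            echo "$fn"
            mplayer "$fn"
        fi
    done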

To queue up a song, we'd find it on youtube and prepend "http://jukebox.local:10000/" to the url

zamadatix 2 years ago

More nc, cut, grep, head, mkfifo, youtube-dl, and mplayer than anything about Bash.
- NegativeLatency 2 years ago
  
  If bash is passing and handling input from an http request that’s pretty much good enough for me.
  
  zamadatix 2 years ago
  
  It's not really doing that though. Nc+head/cut/grep are and youtube-dl is grabbing the data. Bash in this case is just orchestrating the order of communication between the tools doing the actual work, as a normal Bash script.
  The post is about doing all of this in pure bash builtins like /dev/tcp and bash functions. Not about gluing together tools which do the work.
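For a sense of what the builtin-only version of that pipeline's text handling could look like, here is a minimal sketch. The function name `parse_target` is made up for illustration, and it only covers the parsing half (head/cut/grep), not the socket.

```shell
#!/usr/bin/env bash
# Hypothetical helper: pull the request target out of an HTTP request line
# using only bash builtins. Roughly mirrors:
#   head -n 1 | cut -f 2 -d " " | cut -b 2- | grep -E "^[a-zA-Z0-9/:.?=-]{1,}$"
parse_target() {
    local line=$1 method target version
    local re='^[a-zA-Z0-9/:.?=-]+$'
    line=${line%$'\r'}                        # strip the CR of the CRLF line ending
    read -r method target version <<<"$line"  # split on whitespace, like cut -d " "
    target=${target:1}                        # drop the first character, like cut -b 2-
    [[ $target =~ $re ]] || return 1          # same whitelist as the grep
    printf '%s\n' "$target"
}
```

Calling `parse_target 'GET /https://example.com/watch?v=abc HTTP/1.1'` prints the URL without the leading slash; anything containing characters outside the whitelist returns non-zero and prints nothing.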
  
  gtroja 2 years ago
  
  Nonetheless pretty wicked app made with bash
  
  zamadatix 2 years ago
  
  Yeah, it's a nifty little tool. On the other hand when a really interesting post on pure bash exposition comes up the comments section turns to talking about plain bash scripts they've made or know of instead. There are several normal scripts replied in this post and a few others via links, more than actual conversation around what the post is actually about (so far). That's all I was commenting on, not the value of the tool. It's a hill :).
  
  QuadmasterXLII 2 years ago
  
  your hill is valid, sorry
  
  stonekyx 2 years ago
  
  Even the post isn't using /dev/tcp, but compiled a C file into bash "loadable builtin" (which is something I learned today). It still feels kind of cheating to me tbh.. But cool enough!
  
  zamadatix 2 years ago
  
  It is a bit of a cheat, to be fair. Official bash loadable module, but not necessarily a part of the static bash binary.
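For anyone else who hadn't met loadable builtins: they are pulled in with `enable -f`. A rough sketch follows. The wrapper name `load_builtin` is invented here, and the path and builtin name in the example call are assumptions about a particular install, not something taken from the project.

```shell
#!/usr/bin/env bash
# Try to load a builtin from a shared object; return non-zero if that fails
# (file missing, or a bash built without dynamic loading).
load_builtin() {
    local so=$1 name=$2
    enable -f "$so" "$name" 2>/dev/null
}

# Example call (both arguments are assumptions; adjust to your system):
#   load_builtin /usr/lib/bash/accept accept && type accept
```

Once loaded, the new command runs inside the bash process like any other builtin, which is why it sits in a grey area between "pure bash" and "external tool".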
  
  remram 2 years ago
  
  That's not the rules used in this submission though.
  > A purely bash web server, no socat, netcat, etc...
- agumonkey 2 years ago
  
  as always
- samatman 2 years ago
  
  Yeah that's how bash scripting works.
  
  zamadatix 2 years ago
  
  See the original GitHub post for how this post is not about scripting other binaries with Bash but pure Bash implementations of functionality. The posted script e.g. does not call nc or grep to make the socket or process the text. What makes this post interesting and upvoted is that it's esoteric.
  It's like replying to a thread about implementing a web server in C macros with how you implemented a music jukebox as a standard C program. Great, but also not the point or really even related.

solatic 2 years ago

Projects like this reiterate just how important it is, from a security perspective, to ensure your production services are running in containers without an included shell. If an attacker can get a shell, they can do pretty much anything.

Debug containers are now a stable feature in Kubernetes. It honestly boggles my mind how companies will throw so much time, money, and effort into the cybersecurity product du jour when they can get the vast majority of the value by moving everything into distroless, shell-less containers running on managed VMs that are optimized for container workloads.

robinhoodexe 2 years ago

Out of curiosity, how would you run Python or R workloads in kubernetes without a distro or shell?
- jonhohle 2 years ago
  
  Python needs an ld.so and libc (minimally) but not a shell or other external utilities. Shebang scripts are loaded by ld.so, not the shell.
  
  khrbtxyz 2 years ago
  
  Shebang scripts are supported directly by the kernel via the exec family of system calls, so ld.so shouldn't be involved.
  https://github.com/torvalds/linux/blob/master/fs/Kconfig.bin...
  config BINFMT_SCRIPT
      tristate "Kernel support for scripts starting with #!"
      default y
      help
        Say Y here if you want to execute interpreted scripts starting with
        #! followed by the path to an interpreter.
  
  thecodedmessage 2 years ago
  
  Yeah, I remember reading the code in the kernel that handles shebang a long time ago. ld.so is not involved.
  
  usr1106 2 years ago
  
  Python with batteries included, doesn't that mean exploit tools included?
  No personal attack intended, I am wondering this about my own embedded product which contains Python.
- ttymck 2 years ago
  
  https://stackoverflow.com/questions/62581924/is-there-a-way-...
- oneshtein 2 years ago
  
  Python is better than shell, so intruder will use it first.
riddley 2 years ago

Please do a write-up of these debug features. I'd love to learn about them.
- deathanatos 2 years ago
  
  Have you read the docs? https://kubernetes.io/docs/tasks/debug/debug-application/deb...
swozey 2 years ago

I work on multi-tenant k8s clusters at CDNs and used to work at Rancher and have seen just about every multi-tenant / federated deployment there is, nasa, meta, etc. Stuff where even the hardware vfio paths mounting the nic or gpu channels keep users apart from one another, and the entire path out of the cluster are completely apart from k8s and any userspace and would be something like multus as a shim- what exactly are you referring to that can cgroup hop via a shell that we're not currently mitigating? It's the hardware being infected by something we worry about at this level.
A lot of the CDNs even use tools like kubevirt where the segmentation is even further. And then we have gvisor, firecracker, etc.
I admittedly haven't touched k8s code since 1.18 but I can't think of anything like you're referring to and I definitely would like to know about it.
Thanks.
kiririn 2 years ago

Also can go even simpler and use apparmor and/or systemd hardening instead of containers
deathanatos 2 years ago

Debug containers have been a bit of a let down in UX. It's rather difficult to do sort of basic stuff, and a lot of stuff is hidden behind flags.
E.g., if you just sort of roll with the defaults, you're dropped into a pod in a very confused state: `ps -ef` says nothing in your debuggee is running, and the filesystem of the debuggee is nowhere to be found.
You can work around both of those (the first is --target, but the latter requires an intricate SO answer¹) but it's the sort of thing that would be nicer out of the box?
The node debugging mode is a bit better: by default, puts you in at least the host pidns, and mounts the host FS.
¹https://stackoverflow.com/questions/73355970/how-to-get-acce...
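For reference, the two invocations I mean look roughly like this. It is a sketch: the wrapper names, the busybox default image, and the `KUBECTL` override variable are my own placeholders; `--target`, `--image`, `-it`, and the `node/NAME` form are the actual kubectl debug options.

```shell
#!/usr/bin/env bash
# KUBECTL can be overridden (for example with "echo") to preview the command.
KUBECTL=${KUBECTL:-kubectl}

# Ephemeral debug container that shares the process namespace of one
# container in the pod, so ps can see the debuggee's processes.
debug_pod() {
    local pod=$1 container=$2 image=${3:-busybox}
    "$KUBECTL" debug -it "$pod" --image="$image" --target="$container"
}

# Node debugging: starts a pod on the node with the host namespaces.
debug_node() {
    local node=$1 image=${2:-busybox}
    "$KUBECTL" debug "node/$node" -it --image="$image"
}
```

Running `KUBECTL=echo; debug_pod web app` prints the command line instead of executing it, which is handy for checking the flags before pointing it at a real cluster.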
- bravetraveler 2 years ago
  
  As someone who doesn't do much (any) k8s...
  Seems only marginally better than using nsenter on a privileged container to just go muck with the host
- freedomben 2 years ago
  
  I've also been disappointed with debug containers. They are often not useful for debugging tricky production-only issues because so many of those issues are related to container state, which can be (often is) different inside the container. Certain languages/platforms and developer discipline are better about this than others, like if you're using functional/immutable languages then it's less of an issue.
  For applications that aren't super high security, I've been really appreciating using immutable hosts (that get regularly updated/rotated), along with CI/CD that is constantly rebuilding from source, applying latest software updates, and deploying the latest version of the app. Combined with other tools like scanners, and de-bloating your images, it really raises the height of the fruit.
- solatic 2 years ago
  
  > UX
  I would agree, but so much of day-to-day Kubernetes is arcane CLI commands to begin with. Other stuff that is non-trivial to do on the CLI but comes up in most reasonable production deployments:
  * Rotating secrets without exposing the secret to the shell history file (hint: kubectl apply -f can take - to signify stdin, but not kubectl patch!)
  * Ensuring your edits to a ConfigMap pass application-level validation (i.e. your configuration changes won't crash your app, not just that it's a valid ConfigMap)
  * Anything to do with user auth or RBAC
  * Scaling the default persistent volume size of a StatefulSet
  The truth is that Kubernetes is a platform, and just like how most people don't want to run a bare copy of Bash or VIM on their laptop, people will figure out aliases, one-liners, and other functions to help make them effective. So some of working effectively with Kubernetes means, yes, building your own custom debug containers, and writing your own helper shell stuff.
  
  arve0 2 years ago
  
  > without exposing the secret to the shell history file
  Any command in shell with a space before it will be omitted from history.
  Agree it should take input from stdin.
  
  sirn 2 years ago
  
  This requires `setopt HIST_IGNORE_SPACE` (zsh) or `HISTCONTROL=ignorespace`/`HISTCONTROL=ignoreboth` (bash/ksh) to be set and may not be enabled by default in many distros (e.g. NixOS doesn't, Alpine doesn’t, etc).
  Always check your shell before assuming this will work!
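A quick way to do that check in bash, as a sketch (the function name is made up; zsh needs a different test because it uses the `HIST_IGNORE_SPACE` option rather than a variable):

```shell
#!/usr/bin/env bash
# Succeeds only if HISTCONTROL (a colon-separated list) contains
# ignorespace or ignoreboth, i.e. space-prefixed commands are skipped.
history_ignores_space() {
    local IFS=: item
    for item in $HISTCONTROL; do
        [[ $item == ignorespace || $item == ignoreboth ]] && return 0
    done
    return 1
}
```

Something like `history_ignores_space || echo "leading space will NOT hide commands here"` before pasting a secret is cheap insurance.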

dzove855 2 years ago

Creator here:

I'm really happy to see somebody shared it on hackernews :D If you have some questions, feel free to ask me

zamadatix 2 years ago

Hey, great work! Reading through, I started to wonder how necessary the loadables are. It'd be fun to have one that's not dependent on loadables, even if it's not as clean. E.g. could mktemp be replaced with a timestamp-named directory or something? Can rm be avoided by just allowing garbage to pile up? Is finfo something that can be worked around in some way?
- dzove855 2 years ago
  
  Hello,
  You could avoid loadables.
  Finfo <- load file inside a variable and get the size
  Mktemp <- like you said with timestamp
  Rm <- with a fifo or variable
  
  zamadatix 2 years ago
  
  Ah "stick it in a variable" seems obvious now, good point!
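Sketching those two ideas out with builtins only, in case it helps someone. These are my guesses at the approach, not code from the project; the function names are invented, and the size trick only works for files without NUL bytes since the data has to survive a trip through a variable.

```shell
#!/usr/bin/env bash
# Size of a text file in bytes without stat/finfo: slurp it into a variable
# and take the length. LC_ALL=C makes ${#data} count bytes, not characters.
file_size() {
    local LC_ALL=C data=
    [[ -r $1 ]] || return 1
    IFS= read -r -d '' data <"$1" || true   # read returns non-zero at EOF; data is still set
    printf '%s\n' "${#data}"
}

# Timestamp-named temp file without mktemp. noclobber makes the redirect
# fail instead of overwriting if the name is somehow already taken.
make_temp() {
    local path
    printf -v path '%s/tmp.%(%s)T.%s.%s' "${TMPDIR:-/tmp}" -1 "$$" "$RANDOM"
    ( set -o noclobber; : >"$path" ) 2>/dev/null || return 1
    printf '%s\n' "$path"
}
```

The `%(%s)T` format needs bash 4.2 or later. Without rm, the closest thing to cleanup is truncating the file with `: >"$path"` and letting the empty files pile up.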

cduzz 2 years ago

I suspect lots of people have pulled "use tcpserver or inetd and feed stdout to a shell script" antics.

The thing is, shell can't cope with nulls -- if you do something like

  n=$(gzip -9 < /etc/passwd)
  gzip -9 < /etc/passwd | sum
  echo "$n" | sum

This falls apart because shell just can't deal with nulls.

You can probably hack around all those issues, and may not run into this too much, at first, in a web server, but golly you'll pretty quickly fall into a pit.
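A smaller version of the same demonstration, counting bytes instead of checksumming (function names are just for illustration):

```shell
#!/usr/bin/env bash
# Five bytes go in: a, NUL, b, NUL, c.
bytes_through_pipe() {
    printf 'a\0b\0c' | wc -c
}

# The same bytes captured in a variable first: bash drops the NULs
# (newer versions also print a warning on stderr, silenced here).
bytes_through_var() {
    local v
    { v=$(printf 'a\0b\0c'); } 2>/dev/null
    printf '%s' "$v" | wc -c
}
```

The pipe version reports 5 bytes and the variable version reports 3: the two NULs are silently gone before the data ever reaches `wc`.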

mike_d 2 years ago
```
  tr -dc '[[:print:]]'
```
Will sanitize strings of non-printable characters. While it is true that you can't have nulls inside bash variables, your example actually contains the correct syntax if you just remove the first and last lines.
- cduzz 2 years ago
  
  well, sure, you can use some external program to process the stdout; then it's no longer "pure bash" which is fine, nobody grades ingots of script based on if they're 90% or 99% or 70% "pure" shell.
  But -- importantly -- running
  n=$(gzip -9 < /etc/passwd | tr -dc '[[:print:]]')
  may process the nulls, but is it reversible? Can I now send $n into gzip -d and get whatever I put into it out?
  I can do things that are reversible --
  n=$(gzip -9 < /etc/passwd | base64 )
  But now I can't process the output "natively" except by calling base64 every time.
  And maybe I've gotten myself into this hole because sometimes the contents of $n have nulls and other times not?
  Pure shell is a road to madness. Don't ask me how I know...
  
  mike_d 2 years ago
  
  You are in this mess because you are supposed to be piping binary data or using temp files, not putting it in variables. Also bash scripts are just glue between external programs.
  I mentioned the sanitization in the context of taking user input (since we are talking about a bash web server) because I thought you were pointing out a user could do bad things by feeding in nulls.
  
  cduzz 2 years ago
  
  I'm just pointing out the insanity / inanity of "pure" shell anything. There are lots of other gotchas hiding in shell that you wouldn't encounter in other languages.
  As glue, shell's wonderful. Reading from /dev/tcp/ and such is a cute trick but ultimately a dead dead dead end.
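For anyone who hasn't seen the trick: bash itself interprets /dev/tcp/HOST/PORT in a redirection and opens an outbound connection. A minimal sketch of an HTTP/1.0 client built on it (the function name is hypothetical, and it assumes a bash compiled with network redirections, which most distro builds are):

```shell
#!/usr/bin/env bash
# Fetch a path over plain HTTP using only builtins. Returns non-zero if the
# connection cannot be opened. Text responses only: NUL bytes will not
# survive the read loop, and a final line without a newline is dropped.
http_get() {
    local host=$1 port=$2 path=${3:-/} line
    {
        printf 'GET %s HTTP/1.0\r\nHost: %s\r\n\r\n' "$path" "$host" >&3
        while IFS= read -r line <&3; do
            printf '%s\n' "${line%$'\r'}"   # strip the CR from CRLF lines
        done
    } 3<>"/dev/tcp/$host/$port"
}

# Example call (host is a placeholder):
#   http_get example.com 80 /
```

Note this only covers the client side. /dev/tcp cannot listen for connections, which is why a server needs something extra to accept them.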
  
  mike_d 2 years ago
  
  As with any language there are going to be foot-guns, gotchas, and edge cases. If you don't feel comfortable with a language (including bash) then don't use it.
  Before I really knew how to program I was a systems administrator and used nothing but bash via CGI to build a $2k/month revenue site, so obviously your claims of it being a dead end are just hyperbole.
  
  cduzz 2 years ago
  
  It's not a matter of comfort or discomfort with a tool; A long time ago I wrote: a web server in shell (using either inetd or tcp_server) in 1996, and wrote a mail user agent / web server to read mail out of a maildir and display it as html (that one had both a y2k bug and a "time_t is 11 digits now?" bug); that one only used tcp_server... I also wrote a web server (in shell) to manage a wireless captive portal. Some of this was on solaris, some on ultrix, some on freebsd....
  I'm proud of you for making a ton of money on bash using CGI; "it's not dumb if it works" but ... doing complex stuff like this in shell is ... dumb.
  At least, I certainly know better. You do you, though.
  
  mike_d 2 years ago
  
  > "it's not dumb if it works"
  It's not dumb if you know what you are doing. I and others have pointed out how to properly handle binary data in shell scripts.
  This might be a good start if you'd like to learn more: https://tldp.org/HOWTO/Bash-Prog-Intro-HOWTO.html
  
  cduzz 2 years ago
  
  I'm simply pointing out that a "pure bash" or other shell program is folly once you get past a trivial level of complexity.
  That How-to doesn't mention that shell eats nulls. If you're using shell as glue, that doesn't matter, but if you're using shell to process (not pass to another program) raw tcp connections, you'll need to manage binary data, which is full of nulls.
  Perhaps you're not even aware of these issues? Anyhow, go on about your life grand troll, you're the winner.
- throwway120385 2 years ago
  
  Yeah you can pipe nulls between processes just fine. You just can't print them in a shell.
  
  cduzz 2 years ago
  
  In the case of
  gzip -9 < /dev/random | dd of=/tmp/gibberish
  the shell's not actually doing anything but forking things and connecting file descriptors of processes to each other
  gzip -9 < /dev/random | while read line ; do echo "$line" ; done > /tmp/gibberish
  The stdout of gzip is being processed by shell, and will make all the nulls go away.
  (edited to add another example:)
  Similarly - it isn't printing that you can't do -- it's anything -- consider:
  case $(cat /bin/sh) in $(cat /bin/bash)) echo "they're the same!" ;; *) echo "they're not the same!" ;; esac
  This is obviously an insane way to see if two files are identical, but worse -- it's going to fail for two different files whose only difference is how many nulls are in the file.
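To be fair, here is one of the hacks around it, as a sketch: split the stream at each NUL with `read -d ''` and remember the chunk boundaries in an array. The names `read_chunks`, `write_chunks`, and `chunks` are invented for this example.

```shell
#!/usr/bin/env bash
# Read stdin into the global array "chunks", one element per NUL-separated
# piece. The NULs themselves are never stored; their positions are implied
# by the element boundaries.
read_chunks() {
    chunks=()
    local chunk
    while IFS= read -r -d '' chunk; do
        chunks+=("$chunk")
    done
    chunks+=("$chunk")   # whatever followed the last NUL (possibly empty)
}

# Write the chunks back out, re-inserting a NUL between elements.
write_chunks() {
    local i last=$(( ${#chunks[@]} - 1 ))
    for i in "${!chunks[@]}"; do
        printf '%s' "${chunks[i]}"
        if (( i < last )); then
            printf '\0'
        fi
    done
}
```

It round-trips, e.g. `printf 'a\0b\0\0c' | { read_chunks; write_chunks; }` emits the same bytes, but every operation on the data now has to be chunk-aware, which is rather the pit being described.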
- 082349872349872 2 years ago
  
  or use vis(1) on both sides of bash-land

dang 2 years ago

Show HN: A pure bash web server. No netcat, socat, etc. - https://news.ycombinator.com/item?id=29794979 - Jan 2022 (97 comments)

gtroja 2 years ago

The Conectiva Linux distro had a programmer who wrote a book on shell scripting in which he implemented a bash server, but as Apache CGI scripts. I learned to program properly with that book.

https://www.amazon.com.br/Script-Profissional-Aurelio-Marinh...

goombacloud 2 years ago

When socat is around a simple server can also be constructed with it:

        tee /tmp/server > /dev/null <<'EOF'
        #!/bin/bash
        set -euo pipefail
        SERVE="$1"
        TYPE="$2"
        read -a WORDS
        if [ "${#WORDS[@]}" != 3 ] || [ "${WORDS[0]}" != "GET" ]; then
          echo -ne "HTTP/1.1 400 Bad request\r\n\r\n"; exit 0
        fi
        # Subfolders are not supported for security reasons as this avoids having to deal with ../../ attacks
        FILE="${SERVE}/$(basename -- "${WORDS[1]}")"
        if [ -d "${FILE}" ] || [ ! -e "${FILE}" ]; then
          echo -ne "HTTP/1.1 404 Not found\r\n\r\n" ; exit 0
        fi
        echo -ne "HTTP/1.1 200 OK\r\n"
        echo -ne "Content-Type: ${TYPE}\r\n"
        LEN=$(stat -L --printf='%s\n' "${FILE}")
        echo -ne "Content-Length: ${LEN}\r\n"
        echo -ne "\r\n"
        cat "${FILE}"
        EOF
        chmod +x /tmp/server
        # switch from "text/plain" to "application/octet-stream" for file downloads
        socat TCP-LISTEN:8000,reuseaddr,fork SYSTEM:'/tmp/server /tmp/ text/plain'
        # test: curl -v http://localhost:8000/server

cf100clunk 2 years ago
There are other such tiny web server tricks out there too, but his GitHub README says:
```
  A purely bash web server, no socat, netcat, etc...
```

VWWHFSfQ 2 years ago

A long time ago I made a similarly pure bash version of something like tcpdump just parsing various packets and protocols off a raw socket. I wish I still had that code somewhere. It was pretty much the slowest and least-robust thing of all time but was kind of fun to play around with.

cool project

calvinmorrison 2 years ago

Of course 9front ships a rc based http server

https://werc.cat-v.org/docs/web-server-setup/rc-httpd

cf100clunk 2 years ago

Related to this?
https://news.ycombinator.com/item?id=7614718

recursivedoubts 2 years ago