urlencode and urldecode in sh

This is a fun piece of shell I thought I’d share. For gnome-doc-tool, I need to convert file paths into URLs and back, which means urlencoding and urldecoding them. I searched around and found a few solutions, most of them a few dozen lines of awk. Now, I’ve been known to write some crazy stuff in awk (like a RELAX NG compact syntax parser), but this seemed like too much work for a simple problem.

Then I remembered printf(1). It can do all the work of converting characters into hex byte representations and back. All you need to write is a loop to iterate over the string.
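As a quick illustration of the two conversions involved, here is a minimal sketch. It assumes bash, since the \x escape is not part of POSIX printf, and the helper names char_to_hex and hex_to_char are made up for this example.

```shell
# Hypothetical helpers, for illustration only (assumes bash).

# Character to hex: an argument that starts with a quote makes printf
# use the numeric value of the character that follows it.
char_to_hex() {
    printf '%02X' "'$1"
}

# Hex to character: bash's builtin printf expands \xHH escapes in the
# format string. The caller must pass exactly two hex digits.
hex_to_char() {
    printf "\x$1"
}

char_to_hex ' '; echo    # prints 20
hex_to_char 41; echo     # prints A
```

The %02X width matters: without it, a byte below 0x10 (a newline, say) would come out as a single hex digit.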


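# Force the C locale so string manipulation is handled byte-by-byte.
# LC_ALL takes precedence over LANG and the other LC_* variables.
# Note: the substring expansion ${arg:$i:1} and printf's \x escape used
# below are bash features, not POSIX sh.
export LC_ALL=C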

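urlencode() {
    arg="$1"
    i=0
    while [ "$i" -lt "${#arg}" ]; do
        c=${arg:$i:1}
        case "$c" in
            # Pass through unreserved characters, plus / and : for paths.
            [a-zA-Z0-9/:_.~-]) printf '%s' "$c" ;;
            # Everything else becomes % plus two uppercase hex digits.
            *) printf '%%%02X' "'$c" ;;
        esac
        i=$((i+1))
    done
}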

urldecode() {
    arg="$1"
    i=0
    while [ "$i" -lt "${#arg}" ]; do
        c0=${arg:$i:1}
        if [ "$c0" = "%" ]; then
            c1=${arg:$((i+1)):1}
            c2=${arg:$((i+2)):1}
            # No validation: c1 and c2 are assumed to be hex digits.
            printf "\x$c1$c2"
            i=$((i+3))
        else
            printf '%s' "$c0"
            i=$((i+1))
        fi
    done
}

That’s it. If you use these functions on potentially garbage input, you might want to add some error checking. In particular, the decoder should probably check that there are two more characters, and that they are valid hex digits.
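The error checking described above could look something like this. It is only a sketch, assuming bash; the name urldecode_strict and the wording of the error message are invented for this example.

```shell
# Hypothetical strict variant of urldecode (assumes bash).
# Returns 1 if a % is not followed by exactly two hex digits.
urldecode_strict() {
    local arg="$1" i=0 c0 hex
    while [ "$i" -lt "${#arg}" ]; do
        c0=${arg:$i:1}
        if [ "$c0" = "%" ]; then
            # Grab the two characters after the %. Near the end of the
            # input this may be shorter than two characters.
            hex=${arg:$((i+1)):2}
            case "$hex" in
                [0-9A-Fa-f][0-9A-Fa-f])
                    printf "\x$hex"
                    ;;
                *)
                    echo "urldecode_strict: bad escape at offset $i" >&2
                    return 1
                    ;;
            esac
            i=$((i+3))
        else
            printf '%s' "$c0"
            i=$((i+1))
        fi
    done
}
```

Anything decoded before the bad escape has already been written to stdout by the time the function fails, so a caller that cares should check the return status before using the output.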

This work by Shaun McCance is licensed under a Creative Commons Attribution 3.0 United States License.