“Magic” signatures, enforced at offset 0

Obvious
● PE\0\0
● \x7FELF
● BPG\xFB
● \x89PNG\x0D\x0A\x1A\x0A
● dex\n035\0
● RAR\x1a\7\0
● BZ
● GIF89a
● BM
● RIFF

Egocentric
● MZ (DOS header): Mark Zbikowski
● PK\3\4 (ZIP): Philip Katz
● BPG\xFB: Fabrice Bellard

Not obvious
● GZip: 1F 8B
● JPG: FF D8

Specific logic
● TIFF:
  ○ II: Intel (little) endianness
  ○ MM: Motorola (big) endianness
● Flash:
  ○ FWS: ShockWave Flash (Flat)
  ○ CWS: (zlib) compressed
  ○ ZWS: LZMA compressed

Not obvious, but l33tsp34k ^_^
● CAFEBABE: Java / universal (old) Mach-O
● D0CF11E0: Office
● FEEDFACE: Mach-O
● FEEDFACF: Mach-O (64b)
File formats not enforcing signature at offset 0
● not enforcing signature at offset 0: ZIP, 7z, RAR, HTML
● actually enforcing signature at offset 0: bzip2, GZip
(ZIP is used in many formats: APK, ODT, DOCX, JAR…)
ZIP actually enforces “finishing” near the end of the file: its End of Central Directory record has to sit there.
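To make the "near the end" rule concrete, here is a minimal sketch of how a reader might locate the End of Central Directory record by scanning backwards. The helper name `find_eocd` is made up for this example, and a real parser would also validate the record's fields instead of trusting the first match.

```python
import io
import zipfile

EOCD_MAGIC = b"PK\x05\x06"   # End of Central Directory signature
EOCD_MIN_SIZE = 22           # fixed part of the record, without comment
MAX_COMMENT = 0xFFFF         # the comment length is a 16-bit field


def find_eocd(data: bytes) -> int:
    """Return the offset of the last EOCD signature in the tail window, or -1."""
    window_start = max(0, len(data) - EOCD_MIN_SIZE - MAX_COMMENT)
    return data.rfind(EOCD_MAGIC, window_start)


# Build a tiny archive in memory to try it on.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("hello.txt", "hi")
archive = buf.getvalue()

print(find_eocd(archive), len(archive))
print(find_eocd(b"junk" * 10 + archive))  # prepended data just shifts the offset
```

Nothing here looks at offset 0, which is why data placed in front of the archive does not bother this kind of lookup.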
Hardware-bound formats: code/data at offset 0
● TAR: Tape Archive
● Disk images: ISO, Master Boot Record
● TGA (image)
● (Console) roms
‘header’ often (optionally) later in the memory space
a good magic signature:
● enforced at offset 0
● unique
no magic ⇒ no excuse
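A signature check at offset 0 can be sketched in a few lines. The table below only holds a handful of the magics listed earlier, and the names `MAGICS` and `sniff` are illustrative, not taken from any existing tool.

```python
# A few of the signatures from the slides, keyed by their raw bytes.
MAGICS = {
    b"\x7fELF": "ELF",
    b"\x89PNG\r\n\x1a\n": "PNG",
    b"MZ": "DOS header (PE)",
    b"PK\x03\x04": "ZIP",
    b"GIF89a": "GIF",
    b"\x1f\x8b": "GZip",
    b"\xff\xd8": "JPG",
    b"\xca\xfe\xba\xbe": "Java / universal (old) Mach-O",
}


def sniff(data: bytes) -> list:
    """Return every format whose magic sits exactly at offset 0."""
    return [name for magic, name in MAGICS.items() if data.startswith(magic)]


print(sniff(b"\x89PNG\r\n\x1a\n" + b"rest of the file"))
print(sniff(b"junk" + b"GIF89a"))  # magic present, but not at offset 0
```

A checker like this claims offset 0 for one format only. Formats that skip the check (or have no magic at all) are the ones that leave room for a second file type.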
Standard tool: checks magic, chooses path, never returns...
Another common yet important property (useful for abuses)
It’s a complete cow (you can see its whole body), with something next: appending something doesn’t invalidate the start.
Remember: there’s nothing to parse after the terminator.
formats not enforced at offset 0 + formats tolerating appended data = polyglots by concatenation (pictured: PE, PDF, HTML, ZIP)
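Concatenation is easy to try with Python's standard `zipfile` module, which reads the archive from its end and copes with data placed in front of it. In this sketch the HTML snippet merely stands in for any front format that tolerates appended data; the file names are made up.

```python
import io
import zipfile


def make_zip(name: str, text: str) -> bytes:
    """Build a one-file ZIP archive in memory."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(name, text)
    return buf.getvalue()


# Front part: any format that tolerates appended data (HTML here, as a stand-in).
front = b"<html><body>hello</body></html>\n"
polyglot = front + make_zip("inner.txt", "hidden")

# The ZIP reader finds its structures from the end and ignores the front part.
with zipfile.ZipFile(io.BytesIO(polyglot)) as zf:
    names = zf.namelist()
    content = zf.read("inner.txt")

print(names, content)
```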
a JAR(JAR) || BINK polyglot (JAR = ZIP(CLASS))
If a cow keeps a frog in its mouth, it can also speak 2 languages! (the outer leaves space for an inner)
Ok, I know… here is a more realistic analogy...
...if our cow swallows a microSD, it’s still a valid cow! Even if it contains foreign data, that is tolerated by the system.
2 infection chains in one file: the PDF part is stored in a Java buffer
Such parasites exist already in the wild (they just use unallocated space)
PoC||GTFO 0x2: MBR || PDF || ZIP
by Travis Goodspeed PoC||GTFO 0x3: JPG || AFSK || AES(PNG) || PDF || ZIP
PoC||GTFO 0x4: TrueCrypt || PDF || ZIP
by Alex Inführ PoC||GTFO 0x5: Flash || ISO || PDF || ZIP
PoC||GTFO 0x6: TAR || PDF || ZIP

$ tar -tvf pocorgtfo06.pdf
-rw-r--r-- Manul/Laphroaig      0 2014-10-06 21:33 %PDF-1.5
-rw-r--r-- Manul/Laphroaig 525849 2014-10-06 21:33 1.png
-rw-r--r-- Manul/Laphroaig 273658 2014-10-06 21:33 2.bmp

$ unzip -l pocorgtfo06.pdf
Archive: pocorgtfo06.pdf
warning [pocorgtfo06.pdf]: 10672929 extra bytes at... (attempting to process anyway)
   Length       Date   Time  Name
---------  ---------- -----  ----
     4095  11/24/2014 23:44  64k.txt
   818941  08/18/2014 23:28  acsac13_zaddach.pdf
     4564  10/05/2014 00:06  burn.txt
   342232  11/24/2014 23:44  davinci.tgz.dvs
     3785  11/24/2014 23:44  davinci.txt
     5111  09/28/2014 21:05  declare.txt
        0  08/23/2014 19:21  ecb2/
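The listing above hints at the trick: TAR has no magic at offset 0, and its first field is a file name, so an entry named `%PDF-1.5` puts a PDF signature at the very start. A small sketch with the standard `tarfile` and `zipfile` modules follows. The other member names and payloads are placeholders, and the result is not a working PDF, it only shows where the bytes land.

```python
import io
import tarfile
import zipfile


def build_tar() -> bytes:
    """TAR whose first (empty) member is named like a PDF header."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tf:
        tf.addfile(tarfile.TarInfo("%PDF-1.5"))       # size 0, name only
        payload = b"stand-in payload"                  # placeholder content
        info = tarfile.TarInfo("inner.bin")
        info.size = len(payload)
        tf.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def build_zip() -> bytes:
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("note.txt", "appended archive")
    return buf.getvalue()


polyglot = build_tar() + build_zip()

# TAR is read from the start and stops at its end-of-archive blocks...
with tarfile.open(fileobj=io.BytesIO(polyglot), mode="r:") as tf:
    tar_names = tf.getnames()
# ...while ZIP is read from the end.
with zipfile.ZipFile(io.BytesIO(polyglot)) as zf:
    zip_names = zf.namelist()

print(polyglot[:8], tar_names, zip_names)
```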
Extreme files bypass filters
Farmer got denied permit to build a horse shelter. So he builds a giant table & chairs which don’t need a permit.
a mini PDF (Adobe-only, 36 bytes) ⇒ skipped by scanners yet valid !
a 64K sections PE (all executed) ⇒ crashes a lot of software, evades scanning
This is how a user sees a cow.
This is how a dev sees a cow…
This is how another dev sees a cow! (this one: Brazilian beef cut - previous: French beef cut)
Same data, different parsers. It would have been too easy ;)
a schizophrenic PDF: 3 different trailers, seen by 3 different readers (annotations: commented line, missing trailer keyword)
a schizophrenic PDF (screen ⇔ printer)
a (generated) PDF || PE || JAR [JAVA+ZIP] || HTML polyglot... (labels: PDF viewer, PDF slides)
But brandings can be faked! or “patched” into another symbol ⇒ attribution is hard
… and in a pure PoC||GTFO fashion, @munin forged a branding iron !
an encrypted file is not always “encrypted” ⇒ encrypt(file) is not always “random”: encrypt(file) can be valid
We want to encrypt a DATA file to a TEXT file.
.D.A.T.A.[.1.2.3.4.5.6.7.8.9.A.B.C.D.E.F.].E.N.D
?
.T.E.X.T 0A .t.h.i.s. .i.s. .a. .t.e.x.t 0A
● DATA tolerates appended data after its END marker
● TEXT accepts /* */ comment chunks (think ‘parasite in a host’)

.D.A.T.A.[.1.2.3.4.5.6.7.8.9.A.B.C.D.E.F.].E.N.D → <random>
If we encrypt, we get a random result: we can’t control AES output & input together.
AES works with blocks. File encryption applies AES via a mode of operation.
Electronic Code Book: penguin = bad
choose the IV to control both first blocks (P1 & C1)
.D.A.T.A.[.1.2.3.4.5.6.7.8.9.A.B.C.D.E.F.].E.N.D +IV1 → .T.E.X.T <something we control> <random rest>
Encrypt with pure AES, then determine IV to control the output block
.D.A.T.A.[.1.2.3.4.5.6.7.8.9.A.B.C.D.E.F.].E.N.D +IV2 → .T.E.X.T./.* <ignored random rest>
We can’t control the rest of the garbage… so let’s put a comment start in the first block.

.D.A.T.A.[.1.2.3.4.5.6.7.8.9.A.B.C.D.E.F.].E.N.D → .T.E.X.T./.* <ignored random rest> .*./ 0A .t.h.i.s. .i.s. .a. .t.e.x.t 0A
If we close the comment and append the target file’s data to the encrypted file, then this file is valid and equivalent to our initial target.

.D.A.T.A.[.1.2.3.4.5.6.7.8.9.A.B.C.D.E.F.].E.N.D <pre-decrypted ignored random> +IV2 → .T.E.X.T./.* <ignored random rest> .*./ 0A .t.h.i.s. .i.s. .a. .t.e.x.t 0A
...then we decrypt that file: we get the original source file followed by some random data, which will be ignored since it’s appended data.

.D.A.T.A.[.1.2.3.4.5.6.7.8.9.A.B.C.D.E.F.].E.N.D <pre-decrypted ignored random> +IV2 → .T.E.X.T./.* <ignored random rest> .*./ 0A .t.h.i.s. .i.s. .a. .t.e.x.t 0A
Since AES CBC only depends on previous blocks, this DATA file will indeed encrypt to a TEXT file.
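The whole sequence can be replayed in a few lines. Python's standard library has no AES, so this sketch swaps in a toy 16-byte Feistel cipher built on SHA-256: it is NOT AES and not secure, it only has the same block interface. The CBC logic is the part that matters: for CBC encryption C1 = E(P1 xor IV), hence IV = D(C1) xor P1. The key, padding bytes and helper names are assumptions made for the example.

```python
import hashlib

BLOCK = 16  # same block size as AES


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _f(half: bytes, key: bytes, i: int) -> bytes:
    # Round function of the toy cipher (stand-in for AES internals).
    return hashlib.sha256(key + bytes([i]) + half).digest()[:BLOCK // 2]


def enc_block(block: bytes, key: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for i in range(4):
        left, right = right, xor(left, _f(right, key, i))
    return left + right


def dec_block(block: bytes, key: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for i in reversed(range(4)):
        left, right = xor(right, _f(left, key, i)), left
    return left + right


def cbc_encrypt(data: bytes, key: bytes, iv: bytes) -> bytes:
    out, prev = [], iv
    for off in range(0, len(data), BLOCK):
        prev = enc_block(xor(data[off:off + BLOCK], prev), key)
        out.append(prev)
    return b"".join(out)


def cbc_decrypt(data: bytes, key: bytes, iv: bytes) -> bytes:
    out, prev = [], iv
    for off in range(0, len(data), BLOCK):
        block = data[off:off + BLOCK]
        out.append(xor(dec_block(block, key), prev))
        prev = block
    return b"".join(out)


def pad(data: bytes, fill: bytes) -> bytes:
    return data + fill * (-len(data) % BLOCK)


key = b"hypothetical key"
source = pad(b"DATA[123456789ABCDEF]END", b"\0")

# 1. Pick the IV so that the first ciphertext block opens a comment.
first_block = b"TEXT/*".ljust(BLOCK)
iv = xor(dec_block(first_block, key), source[:BLOCK])

# 2. Encrypt the source, close the comment, append the target's data.
wanted = cbc_encrypt(source, key, iv) + pad(b"*/\nthis is a text\n", b"\n")

# 3. Decrypt everything: source file + pre-decrypted appended data.
crafted = cbc_decrypt(wanted, key, iv)

# 4. Encrypting the crafted DATA file yields the TEXT file.
result = cbc_encrypt(crafted, key, iv)
print(crafted[:24], result[:6], result[-18:])
```

With a real AES library the same four steps apply unchanged. One caveat of this sketch: nothing stops the random part from containing `*/` by accident, so a real generator would have to check for that.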
Chimera (if you skip identified bodies, you’ll miss other files)
a JPEG || ZIP || PDF Chimera
a chimera defeats sequential parsing with optimization (diagram label: image data)
a Picture of Cat (BMP ! uncompressed ! OMG)
BMP lets us define bit masks for each color:
32 bits: 0000000000000000rrrrrggggggbbbbb (no alpha) ⇒ 16 bits of free space!
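As a sketch of what those masks buy us: with the colors confined to 16 bits of each 32-bit pixel, the other 16 bits can carry anything. The code follows the mask exactly as written above (colors in the low half, free bits in the high half) and assumes little-endian words. The helper names are made up, and which half ends up carrying what in a real file depends on the chosen layout.

```python
import struct

# Masks matching 0000000000000000rrrrrggggggbbbbb
R_MASK, G_MASK, B_MASK = 0xF800, 0x07E0, 0x001F


def pack_pixel(r5: int, g6: int, b5: int, spare16: int) -> bytes:
    """One 32-bit pixel: color in the masked bits, extra data in the rest."""
    color = (r5 << 11) | (g6 << 5) | b5
    return struct.pack("<I", (spare16 << 16) | color)


def unpack_color(pixel: bytes) -> tuple:
    """What an image viewer honoring the masks would keep."""
    (word,) = struct.unpack("<I", pixel)
    return (word & R_MASK) >> 11, (word & G_MASK) >> 5, word & B_MASK


def unpack_spare(pixel: bytes) -> int:
    """The 16 bits the masks never look at."""
    (word,) = struct.unpack("<I", pixel)
    return word >> 16


pixel = pack_pixel(31, 0, 31, 0xBEEF)
print(unpack_color(pixel), hex(unpack_spare(pixel)))
```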
let’s play the picture! no, seriously :)
Consider the BMP as RAW 32b PCM
1. store sound in the lower 16 bits:
  ○ sound ignored by BMP
  ○ image data too low to be audible
2. store a picture encoded as sound
  ○ viewable as spectrogram
http://wiki.yobi.be/wiki/BMP_PCM_polyglot
an RGB BMP || raw (3-channel spectrogram) polyglot by @doegox
Cerbero same type of heads, one body
an RGB picture...
RGB picture data = bytes triplets for R, G, B colors

...with an unused palette
palette picture data = each byte is an index in the palette
in theory, it could be used:

How to make a pic-ception
adjust each RGB value to the closest palette index ⇒ store a second picture with the same data…. (original idea by @reversity)
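One possible reading of that recipe, as a sketch only: assume the second picture is 1-bit (black/white) and that the palette alternates black and white by index parity. Each RGB byte is then nudged by at most 1 so that, read as a palette index, it lands on the wanted color. The parity palette and the `embed` helper are assumptions for illustration, not the original implementation.

```python
# Hypothetical palette: even indexes are black, odd indexes are white.
PALETTE = [(255, 255, 255) if i % 2 else (0, 0, 0) for i in range(256)]


def embed(rgb: bytes, bits: list) -> bytes:
    """Nudge each RGB byte so its parity encodes one pixel of the 2nd picture."""
    out = bytearray()
    for value, bit in zip(rgb, bits):
        if value % 2 != bit:
            value += 1 if value < 255 else -1  # closest index with the right color
        out.append(value)
    return bytes(out)


rgb = bytes([10, 255, 7])          # three channel values of the first picture
shared = embed(rgb, [1, 0, 1])     # wanted second picture: white, black, white

print(list(shared))
print([PALETTE[b] for b in shared])
```

The first picture changes by at most one level per channel, which is invisible, while the same bytes read through the palette give the second picture.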
a polyglot collision (multiple use for a single backdoor)
Pwnie award… for the best song! err… what is it pwning exactly ?
Even songs should also have a nice PoC (never forget to load your PDFs in your favorite NES emulator)
Do you remember this ?
A Super NES & Megadrive rom (and PDF at the same time)
Ange’s recipes :)
Never forget to:
● open your PDFs in a hex editor
● open your pictures in a sound player
● run your documents in a console emulator
● encrypt/decrypt with any cipher
● double-check what you printed

Security advice: DON’T*
* It’s easy to blame others - new insecure paths appear every day

Research advice: DO*
* PoC||GTFO! Stop the marketing!
cheap blamers ⇔ blatant marketers?
F.F.F. conclusion
● many abuses of the specs
  ○ specs often are wrong or misleading
● few parsers, even fewer dissectors
● standard tools evolve the wrong way
  ○ try to repair ‘corrupted’ file outside the specs
  ○ standard and recovery mode
For technical details, check my previous talks.
Bonus
After the talk, we tried some PoCs on professional (very expensive!) forensic software:
● polyglot files
  ○ a single file format found + no warning whatsoever
● schizophrenic files
  ○ no warning, yet different tabs of the same software showing different content :D
BIG FAIL - yet we trust them for court cases?
a PDF:
● containing the game as ZIP
● hand-written
  ○ with walkthrough’s screenshots (in original resolution)
  ○ a lightweight title
  ○ while maintaining compatibility
a good way to distribute as a single file!

$ unzip -t ZeroNights2014-Is-Infosec-A-Game.pdf
Archive: ZeroNights2014-Is-Infosec-A-Game.pdf
warning [ZeroNights2014-Is-Infosec-A-Game.pdf]: 6381506 extra bytes (attempting to process anyway)
testing: ZN14GAME/ OK
testing: ZN14GAME/COMMON/ OK
...
Quine: prints its own source
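For reference, a classic two-line Python quine, wrapped in a small harness that runs it and compares its output with its own source. The harness (`run`, `QUINE`) is only there for the check; the quine itself is the content of the string.

```python
import contextlib
import io

# The quine: a string that formats itself with its own repr.
QUINE = r"""s = 's = %r\nprint(s %% s)'
print(s % s)
"""


def run(source: str) -> str:
    """Execute Python source and return what it printed."""
    out = io.StringIO()
    with contextlib.redirect_stdout(out):
        exec(source, {})
    return out.getvalue()


output = run(QUINE)
print(output == QUINE)
```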
a PE quine (in assembler, no linker)
Most quines aren’t very sexy. Using a compiler is cheap :p
Quine Relay: A prints B’s source, B prints A’s source
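A minimal two-stage relay can be sketched the same way. Both stages are Python here to keep the example runnable in one place, whereas relays usually hop between languages. The two programs differ by a single flag that each one flips when printing the other. The names `A`, `B` and `run` are illustrative.

```python
import contextlib
import io

# Program A: prints program B (the same text with the flag t flipped).
A = r"""s = 's = %r\nt = %d\nprint(s %% (s, 1 - t))'
t = 0
print(s % (s, 1 - t))
"""
# Program B: identical except for the flag, so it prints A back.
B = A.replace("t = 0", "t = 1")


def run(source: str) -> str:
    """Execute Python source and return what it printed."""
    out = io.StringIO()
    with contextlib.redirect_stdout(out):
        exec(source, {})
    return out.getvalue()


print(run(A) == B, run(B) == A, A != B)
```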