4042

SECCON CTF Quals just ended and NIA finished 122nd /o/.
Here is a writeup for 4042.

Points: 100
Category: Unknown

Very little information was given about the encoded text, which hinted at a recon aspect to the challenge. The very first result for a search on “2005” and “4042” is RFC 4042, an April Fools' RFC that describes two encodings, UTF-9 and UTF-18. Unlike most of the other joke RFCs published through the IETF, the encoding described in this document can actually be implemented.

Below is the Ruby code that decodes the UTF-9 data into Unicode code points (effectively UTF-32 values) and prints the resulting characters:

# Returns the character for a code point given as a hexadecimal string
# https://gist.github.com/O-I/6758583
def get_character(hexnum)
    char = ''
    char << hexnum.to_i(16)
end

# Read the encoded data file.
# The data is a sequence of nonets, i.e. groups of 9 bits.
# The most significant bit of each nonet is a continuation flag;
# the other 8 bits form an octet.
# If the continuation flag is set, the accumulated value is shifted left by 8 and the next octet is appended.
# Error handling has not been incorporated.

# Split into groups of 3 octal digits, i.e. 9 bits
strings = File.read('data.txt').scan(/.{1,3}/)
# strings = '403221'.scan(/.{1,3}/)

val = 0

for i in 0...strings.length

    # Parse the three octal digits as an integer
    nonet_val = strings[i].to_i(8)

    # Check whether the MSB (the continuation bit) is 1
    continuation_char = (nonet_val & 256)/256

    # Find value of the remaining 8 bits forming the octet
    octet_val         = nonet_val & 255

    # Merge the octet into the low 8 bits of val.
    # If val was carried over due to continuation, its low 8 bits are zero after the shift, so OR simply appends the octet.
    val = val | octet_val

    if continuation_char == 1
        val = val << 8
    else
        print get_character(val.to_s(16))

        # Reset val
        val = 0
    end
end

puts
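
As a sanity check, the same logic can be wrapped in a function and run on the commented-out sample '403221'. This is only a sketch: the name decode_utf9 is made up here and was not part of the script used during the CTF, and it assumes the input is a plain string of octal digits with three digits per nonet. 403 in octal has the continuation bit set and carries the octet 0x03, 221 carries 0x91 with the bit clear, so the pair should decode to U+0391.

```ruby
# Hypothetical helper (not from the original script): decodes a string of
# octal digits, three digits per nonet, into a UTF-8 Ruby string.
def decode_utf9(octal_digits)
  code_points = []
  val = 0

  octal_digits.scan(/[0-7]{3}/) do |nonet|
    n = nonet.to_i(8)

    # Shift the accumulated value left and append the low 8 bits.
    val = (val << 8) | (n & 0xFF)

    # Continuation bit (bit 8) clear: the code point is complete.
    if (n & 0x100).zero?
      code_points << val
      val = 0
    end
  end

  # Pack the code points as UTF-8 characters.
  code_points.pack('U*')
end

puts decode_utf9('403221')  # Greek capital letter alpha, U+0391
puts decode_utf9('101102')  # AB
```

Collecting the code points in an array and packing them at the end keeps the decoding separate from the printing, which makes the function easy to test on small inputs before running it on the challenge data.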

Easy challenge :)