Files
php-malware-scanner/b64p.py
nichogenius 4014f414dc This is how I generate base64 sample patterns.
Example usage:

I want to see if a giant block of base64 code contains any references to the string 'base64'. 
The naive approach is to convert the string to it's base64 equivalent, YmFzZTY0.

There are two problems with this approach.  The first is that the string will be different depending on the position of the first character 'Y' in the input string.  Possible offents are 0 bits, 2 bits or 4 bits.  The above example only calculates the 0 bit offset.  There should be 3 separate base64 strings to look for.

The second problem is that base64 strings use a 6 bit encoding, so the characters don't align the same as 8 bit encoding.  This leads to character bleeding at the beginning and ends of a string where the string will change depending on its immediate context.  This script calculates the maximum constant string length that should be present.  Unfortunately it requires trimming characters which can often lead to very short strings.
2017-07-28 05:15:39 -06:00

1.9 KiB