This repository was archived by the owner on Jun 7, 2023. It is now read-only.

IndexError when outputting Base64 strings to a table #797

@SimonHertoge

Description

When scanning an Office document using the office module with the --vba option, one of the sections olevba returns is "Base64 strings":
https://github.com/viper-framework/viper-modules/blob/master/office.py#L405

This is eventually printed as a table by terminaltables, which crashes when the decoded Base64 text contains characters that act as line breaks (\r, \n, \f, and possibly others). As a result, viper-web outputs nothing and returns the following error:
We were unable to complete the command surface . Error: list index out of range

This is what happens in olevba first:

import base64
encoded = "go4ihosidlsHDasd"
decoded = base64.b64decode(encoded).decode('utf8', errors='replace')

Later on, those decoded strings are passed to the table(header, rows) function in viper/common/out.py, which uses an AsciiTable to print them:
https://github.com/viper-framework/viper/blob/master/viper/common/out.py#L58

Example:

from terminaltables import AsciiTable
table_data = [
    ['Decoded', 'Raw'],
    ['\\xefbfbd\\xefbfbd"\\xefbfbd\\xefbfbd"v[\\x07\r\\xefbfbd\\x1d', 'go4ihosidlsHDasd'],
    ['\x0cQ\r', 'DFEN'],
    ['0\\xefbfbd\\xf2808ca4\\xefbfbd\\xefbfbdF\\x0e,\\xefbfbd', 'MNfygIyki6FGDiyd'],
    ['nyl\\xefbfbd\\xefbfbd', 'bnlsweSd']
]
table = AsciiTable(table_data)
print(table.table)

This code fails because terminaltables splits each cell on line breaks in order to format multiline strings:
https://github.com/Robpol86/terminaltables/blob/master/terminaltables/width_and_alignment.py#L58
It eventually crashes on this line, because some columns now contain more lines than others:
https://github.com/Robpol86/terminaltables/blob/master/terminaltables/build.py#L140
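The mismatch can be reproduced without terminaltables. The sketch below decodes the "DFEN" row from the example above and compares two ways of counting its lines. It assumes (this is a reading of the linked code, not something confirmed here) that the row height is derived by counting "\n" while the cell content is split with str.splitlines(), which also breaks on \r and \f:

```python
import base64

# Same decoding step as olevba, applied to the "DFEN" row from the example table.
decoded = base64.b64decode("DFEN").decode("utf8", errors="replace")

# str.splitlines() treats \x0c (form feed) and \r as line boundaries,
# so this single cell is split into two lines.
lines = decoded.splitlines()

# Counting only "\n" (assumed height calculation) yields a height of one line.
newline_height = decoded.count("\n") + 1

print(repr(decoded))    # '\x0cQ\r'
print(lines)            # ['', 'Q']
print(newline_height)   # 1
```

A cell that splits into two lines inside a row that is expected to be one line high would explain the "list index out of range" error.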

The easiest solution seems to be escaping all special characters in the table() function, so that the input is no longer split into multiple lines.
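A minimal sketch of such escaping follows. The helper names escape_cell and escape_rows are hypothetical and do not exist in viper; the idea is that table() would call something like escape_rows on its rows before building the AsciiTable. Only the characters that str.splitlines() treats as line boundaries are escaped, so ordinary text is left untouched:

```python
# Characters that str.splitlines() treats as line boundaries.
LINE_BOUNDARIES = "\n\r\x0b\x0c\x1c\x1d\x1e\x85\u2028\u2029"


def escape_cell(value):
    """Hypothetical helper: return str(value) with line-boundary characters escaped."""
    text = str(value)
    return "".join(
        # unicode_escape turns e.g. a raw carriage return into the two characters \ and r
        ch.encode("unicode_escape").decode("ascii") if ch in LINE_BOUNDARIES else ch
        for ch in text
    )


def escape_rows(rows):
    """Hypothetical helper: escape every cell before the rows reach AsciiTable."""
    return [[escape_cell(cell) for cell in row] for row in rows]


# Two rows taken from the example table above.
rows = [
    ["\x0cQ\r", "DFEN"],
    ["nyl\\xefbfbd\\xefbfbd", "bnlsweSd"],
]
escaped = escape_rows(rows)
print(escaped[0][0])  # \x0cQ\r
```

After escaping, every cell is a single line, so all columns in a row have the same height.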

Example output with this solution:

+--------------------------------------------------------+------------------+
| Decoded                                                | Raw              |
+--------------------------------------------------------+------------------+
| \xefbfbd\xefbfbd"\xefbfbd\xefbfbd"v[\x07\r\xefbfbd\x1d | go4ihosidlsHDasd |
| \x0cQ\r                                                | DFEN             |
| 0\xefbfbd\xf2808ca4\xefbfbd\xefbfbdF\x0e,\xefbfbd      | MNfygIyki6FGDiyd |
| nyl\xefbfbd\xefbfbd                                    | bnlsweSd         |
+--------------------------------------------------------+------------------+
