Skip to content

Prefixed and others reset offsets returned by Tell #1014

@ntrrgc

Description

@ntrrgc

Imagine we have this simple example where we want to record the position of the "payload" object within our file to be able to reference it:

In [58]: Struct(
    ...:     "version" / Int16ub,
    ...:     "box" / Struct(
    ...:         "size" / Int16ub,
    ...:         "payload_position" / Tell,
    ...:         "payload" / Bytes(this.size),
    ...:     ),
    ...: ).parse(bytes([0, 1, 0, 5, 1, 2, 3, 4, 5]))
Out[58]: Container(version=1, box=Container(size=5, payload_position=4, payload=b'\x01\x02\x03\x04\x05'))

In that example, payload_position is 4, as expected. Our payload starts at byte index 4 in the parsed stream.

Now, let's write an alternative version using Prefixed(), which a user could expect to be functionally equivalent:

In [59]: Struct(
    ...:     "version" / Int16ub,
    ...:     "box" / Prefixed(Int16ub, Struct(
    ...:         "payload_position" / Tell,
    ...:         "payload" / GreedyBytes,
    ...:     )),
    ...: ).parse(bytes([0, 1, 0, 5, 1, 2, 3, 4, 5]))
Out[59]: Container(version=1, box=Container(payload_position=0, payload=b'\x01\x02\x03\x04\x05'))

Oops, payload_position=0. We lost track of the position of the payload. That's not a very useful Tell in this case.

The reason is Tell relies stream.tell(), and several subconstructs create io.BytesIO() from data read from the original substream, which makes stream.tell() return 0. Furthermore, as far as I've been able to find, the position of the parent stream at the time of creating the BytesIO is not recorded, so this information cannot be recovered from the Container.

$ git grep --heading -p '_parsereport(io.BytesIO'
construct/core.py
class OffsettedEnd(Subconstruct):
        return self.subcon._parsereport(io.BytesIO(data), context, path)
class Prefixed(Subconstruct):
        return self.subcon._parsereport(io.BytesIO(data), context, path)
class FixedSized(Subconstruct):
        return self.subcon._parsereport(io.BytesIO(data), context, path)
class NullTerminated(Subconstruct):
        return self.subcon._parsereport(io.BytesIO(data), context, path)
class NullStripped(Subconstruct):
        return self.subcon._parsereport(io.BytesIO(data), context, path)
class Transformed(Subconstruct):
        return self.subcon._parsereport(io.BytesIO(data), context, path)
class ProcessXor(Subconstruct):
        return self.subcon._parsereport(io.BytesIO(data), context, path)
class ProcessRotateLeft(Subconstruct):
        return self.subcon._parsereport(io.BytesIO(data), context, path)

Metadata

Metadata

Assignees

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions