Ticket #145 (new defect)

Opened 1 year ago

Last modified 1 year ago

CHM parser updates

Reported by: nneonneo <nneonneo@gmail.com>
Assigned to: haypo
Priority: normal
Milestone:
Component: parser
Keywords:
Cc:

Description

[note: I really want to have an LZX decompressor, but I cannot find any; if I get more time I may write one in Python :)]

The CHM parser is updated with some features: it can now parse all uncompressed files (well, at least as many as I could find ;)), so we're mainly waiting on the ability to decompress the LZX section.

At some point, when the decompressor is made, the content parser section can be wrapped in it, and a substream popped open. The substream will have, as arguments, all files which were not parsed in the decompressed stream, so that it is like a "sub archive" in a sense.

Attachments

chm.diff (9.8 kB) - added by nneonneo <nneonneo@gmail.com> on 07/02/07 06:28:41.
Patch to hachoir-parser/hachoir_parser/misc/chm.py
chm2.patch (9.0 kB) - added by haypo on 07/12/07 02:15:50.
Cleaned CHM patch
chm.2.diff (9.9 kB) - added by nneonneo <nneonneo@gmail.com> on 07/12/07 03:54:28.
Changed to use SFS; when the directory appears AFTER the data, we have little choice…

Change History

07/02/07 06:28:41 changed by nneonneo <nneonneo@gmail.com>

  • attachment chm.diff added.

Patch to hachoir-parser/hachoir_parser/misc/chm.py

07/02/07 06:30:38 changed by nneonneo <nneonneo@gmail.com>

Oh yes, not to mention fixing a major bug in the way the CWord was handled (multibyte CWords did not shift by 7 before adding the last byte, resulting in a completely wrong value)
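
For anyone reading along, here is a minimal sketch of the kind of decoding involved. It assumes the CWord is the usual CHM variable-length integer: 7 payload bits per byte, most significant group first, with the high bit set on every byte except the last. The function name `decode_cword` is made up for illustration and is not the code in the patch.

```python
def decode_cword(data, offset=0):
    """Decode a variable-length integer (assumed layout: 7 bits per byte,
    most significant group first, high bit set means "more bytes follow").

    Returns (value, next_offset).
    """
    value = 0
    while True:
        byte = data[offset]          # bytes indexing yields an int in Python 3
        offset += 1
        # Shift the accumulated value by 7 for EVERY byte, including the
        # last one, before adding the new 7-bit group.
        value = (value << 7) | (byte & 0x7F)
        if not (byte & 0x80):        # high bit clear: this was the last byte
            return value, offset


print(decode_cword(b"\x7f"))         # single-byte value
print(decode_cword(b"\x81\x00"))     # two-byte value
```

Skipping the shift on the final byte would turn the two-byte input above into 1 + 0 instead of (1 << 7) + 0, which is the sort of completely wrong value described above.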

07/12/07 02:15:08 changed by haypo

The patch doesn't work on 7zip.chm (file of Hachoir testcase). CHMParser should be a SeekableFieldSet since CHM is a Microsoft file format and Microsoft loves the seek method...

07/12/07 02:15:50 changed by haypo

  • attachment chm2.patch added.

Cleaned CHM patch

07/12/07 03:54:28 changed by nneonneo <nneonneo@gmail.com>

  • attachment chm.2.diff added.

Changed to use SFS; when the directory appears AFTER the data, we have little choice...
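
To illustrate why a purely sequential parse gets stuck here, below is a rough sketch using a toy archive layout. The layout, `build_toy_archive` and `read_files` are all invented for this example; it is not the real CHM structure and it does not use the Hachoir API. The point is only that the reader must jump forward to the directory and then back to the data.

```python
import io
import struct


def build_toy_archive():
    """Build a toy archive (hypothetical layout): a 4-byte directory offset,
    then the file data, then the directory itself."""
    data = b"helloworld!"            # "hello" at offset 4, "world!" at offset 9
    directory = (
        b"\x02"                                   # entry count
        + b"\x01a" + struct.pack("<II", 4, 5)     # name "a", offset 4, size 5
        + b"\x01b" + struct.pack("<II", 9, 6)     # name "b", offset 9, size 6
    )
    return struct.pack("<I", 4 + len(data)) + data + directory


def read_files(stream):
    """Read every file from the toy archive by seeking to the directory first."""
    (dir_offset,) = struct.unpack("<I", stream.read(4))
    stream.seek(dir_offset)          # jump over the data to reach the directory
    (count,) = struct.unpack("<B", stream.read(1))
    entries = []
    for _ in range(count):
        (name_len,) = struct.unpack("<B", stream.read(1))
        name = stream.read(name_len).decode("ascii")
        offset, size = struct.unpack("<II", stream.read(8))
        entries.append((name, offset, size))
    files = {}
    for name, offset, size in entries:
        stream.seek(offset)          # jump back to where the data lives
        files[name] = stream.read(size)
    return files


print(read_files(io.BytesIO(build_toy_archive())))
```

Without the directory, the data bytes cannot be split into files, so a parser that only moves forward has nothing to work with until it is too late.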

07/12/07 03:56:52 changed by nneonneo <nneonneo@gmail.com>

Grr...stupid Microsoft...

