$Id: OLE2_README,v 1.1 1997/08/23 21:31:11 dps Exp dps $

This information may be inaccurate, since it comes from reverse
engineering some actual files. LAOLA did the same but got a lot
further. The following is heavily based on the LAOLA docs from
Martin Schwartz <schartz@cs.tu-berlin.de>.

According to an unintentional Microsoft tip-off (in the OLE 2.01 spec,
which documents little more than the API and is useless for non-MS
systems), OLE is based on the DOS FAT format. Reverse engineering
suggests the following:

At 0x30 there is a 2 or 4 byte number, which is the offset of the main
table divided by 512. This strongly suggests that it is a sector number
and that sectors are 512 bytes each (as in DOS).
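
A minimal sketch of the reading above, in Python. It takes the note
literally (offset = number * 512); the little endian byte order, the
default 4 byte width and the function names are assumptions made for
illustration, not something the files have confirmed.

```python
import struct

SECTOR_SIZE = 512  # assumed from the note: sectors are 512 bytes, as in DOS

def main_table_sector(header, width=4):
    """Read the number stored at 0x30 (assumed little endian, 2 or 4 bytes)."""
    fmt = {2: "<H", 4: "<I"}[width]
    return struct.unpack_from(fmt, header, 0x30)[0]

def main_table_offset(header, width=4):
    """Offset of the main table under the note's reading: number * 512."""
    return main_table_sector(header, width) * SECTOR_SIZE
```

Passing width=2 covers the case where the number turns out to be only
2 bytes wide.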

The main table is formatted as 4 byte, little endian numbers that almost
certainly are sector numbers. -2 indicates the end of a chain.
?-1 is free space
?-3 is FAT
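
Following a chain through the main table might look like the sketch
below. It assumes the table works like a DOS FAT, i.e. entry N holds the
number of the sector that follows sector N; that, the signed
interpretation of the numbers and the function names are all
assumptions. Only -2 is treated as a known marker, since -1 and -3 are
still guesses.

```python
import struct

END_OF_CHAIN = -2  # from the note; -1 (free) and -3 (FAT) are only tentative

def parse_table(raw):
    """Decode a table of 4 byte, little endian, signed numbers."""
    count = len(raw) // 4
    return list(struct.unpack_from("<%di" % count, raw))

def follow_chain(table, start):
    """Return the sector numbers of the chain that begins at `start`."""
    chain = []
    seen = set()
    sector = start
    while sector != END_OF_CHAIN:
        # Any other negative value, an out-of-range number or a loop
        # means the chain cannot be followed under these assumptions.
        if sector < 0 or sector >= len(table) or sector in seen:
            raise ValueError("broken chain at %d" % sector)
        seen.add(sector)
        chain.append(sector)
        sector = table[sector]
    return chain
```

The loop check is there because a damaged or misread table would
otherwise make the walk run forever.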

Microsoft documentation: maximum 32 characters per name, including the NULL.
Further Microsoft documentation: a first character of 1 to 1F (hex) means
control stream. Currently only 4 are used:
1 - basic control streams
2 - OLE presentation
3 - Used by container of this object
4 - used by storage implementation to hold properties
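
The naming rule above can be sketched as a small lookup. The function
name and the wording returned for the values that are not currently
used are made up for illustration.

```python
CONTROL_KINDS = {
    1: "basic control streams",
    2: "OLE presentation",
    3: "used by container of this object",
    4: "used by storage implementation to hold properties",
}

def control_kind(name):
    """Describe a control stream name, or return None for ordinary names."""
    if not name:
        return None
    first = ord(name[0])
    if 0x01 <= first <= 0x1F:
        return CONTROL_KINDS.get(first, "control stream (value not currently used)")
    return None
```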

So the entry format is
0 - 63 <=32 character Unicode string including null, null terminated
64 - 65 Length of name in bytes, including NULL  (LAOLA)
66      Type 1=Storage (directory), 2=stream, 5=root
67      Unknown
68 - 71 Previous pps
72 - 75 Next pps
76 - 79 Directory pps
80 - 95 ClassId, see below
96 - 99 Unknown

120 - 123 Little endian long data length (0 for IStorage, must be a clue)
124 - 127 0L at the moment, reserved??
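
As a rough illustration, the layout above can be decoded as follows.
The 128 byte entry size is inferred from the last field ending at byte
127, and the pps numbers are assumed to be little endian signed longs;
both are assumptions, as are all the names used. Bytes 67, 96 - 99 and
100 - 119 are skipped because nothing is known about them here.

```python
import struct
from collections import namedtuple

ENTRY_SIZE = 128  # assumed: the last described field ends at byte 127

Entry = namedtuple(
    "Entry", "name entry_type prev_pps next_pps dir_pps class_id size")

TYPE_NAMES = {1: "storage", 2: "stream", 5: "root"}

def parse_entry(raw):
    """Decode one entry using only the fields described above."""
    if len(raw) < ENTRY_SIZE:
        raise ValueError("entry too short")
    # Bytes 64 - 65: name length in bytes, including the terminating null.
    name_len = struct.unpack_from("<H", raw, 64)[0]
    if name_len > 64:
        raise ValueError("name length larger than the name field")
    name = raw[:max(name_len - 2, 0)].decode("utf-16-le")
    entry_type = raw[66]
    # Bytes 68 - 79: previous, next and directory pps (assumed signed).
    prev_pps, next_pps, dir_pps = struct.unpack_from("<3i", raw, 68)
    class_id = raw[80:96]
    # Bytes 120 - 123: little endian long data length.
    size = struct.unpack_from("<I", raw, 120)[0]
    return Entry(name, entry_type, prev_pps, next_pps, dir_pps, class_id, size)
```

TYPE_NAMES.get(entry.entry_type, "unknown") then gives a readable name
for the type byte.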



ClassId format is

Bytes     4               2    2      8
  Long (little endian)-short-short-8x1 byte
e.g.
Id is: 00020906-0000-0000-c000000000000046
Bytes in file are (word 97 document):
06 09 02 00 00 00 00 00 c0 00 00 00 00 00 00 46

Same word7 document after a little emacs diddling:
Id is: 00020906-0301-0205-c000000000000046
Bytes in file are
06 09 02 00 01 03 05 02 c0 00 00 00 00 00 00 46

The significance of the ID is unclear.
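
The byte order of the ClassId can be checked with a short Python
sketch. The function name is made up; treating the two shorts as little
endian is an assumption, though the second example above (01 03 read as
0301) supports it.

```python
import struct

def format_class_id(raw):
    """Render 16 ClassId bytes in the long-short-short-8x1 byte form."""
    if len(raw) != 16:
        raise ValueError("a ClassId is 16 bytes")
    long_part, short1, short2, tail = struct.unpack("<IHH8s", raw)
    return "%08x-%04x-%04x-%s" % (long_part, short1, short2, tail.hex())

word_bytes = bytes.fromhex("06 09 02 00 00 00 00 00 c0 00 00 00 00 00 00 46")
print(format_class_id(word_bytes))  # 00020906-0000-0000-c000000000000046
```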

     

