Checking File Preambles and Watermarks
I'm constantly having to inspect the preambles of files for some kind of signature (or "watermark") to validate the file's format.
I've re-invented the wheel many times over while doing this. Time, I thought, for a little helper routine or two.
First, a little routine to check the next N bytes from a stream for a given sequence of bytes (the "watermark").
function StreamHasWatermark(const Stm: TStream;
  const Watermark: array of Byte): Boolean;
var
  StmPos: Int64;        // stream position on entry
  Buf: array of Byte;   // bytes read from the stream
  I: Integer;
begin
  Assert(Length(Watermark) > 0, 'No "watermark" specified');
  Result := False;
  StmPos := Stm.Position;
  try
    // Not enough bytes left in the stream to hold the watermark
    if Stm.Size - StmPos < Length(Watermark) then
      Exit;
    SetLength(Buf, Length(Watermark));
    Stm.ReadBuffer(Pointer(Buf)^, Length(Buf));
    // Compare what was read with the watermark, byte by byte
    for I := Low(Buf) to High(Buf) do
      if Buf[I] <> Watermark[I] then
        Exit;
    Result := True;
  finally
    // Always restore the original stream position
    Stm.Position := StmPos;
  end;
end;
Pass it a stream and an array containing the required "watermark" and the routine checks to see if the watermark exists at the current position† in the stream. The original stream position is restored after checking. This means that the routine can be called more than once to test for different watermarks without having to worry about keeping track of the stream position.
† I designed the routine to check from the current stream position rather than the beginning of the stream because some of the files / streams may have the required sequence of bytes offset from the start.
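As a rough illustration of that position-preserving behaviour, here's a minimal sketch that runs several checks against an in-memory stream. It assumes StreamHasWatermark from above is in scope and that the Classes unit is in the uses clause. The procedure name DemoStreamPosition and the sample bytes are made up for the example.

```pascal
// Sketch only: assumes StreamHasWatermark (above) is in scope and that the
// unit uses Classes. DemoStreamPosition and the sample data are hypothetical.
procedure DemoStreamPosition;
const
  Data: array[0..5] of Byte = ($00, $00, $50, $4B, $03, $04);
var
  MS: TMemoryStream;
begin
  MS := TMemoryStream.Create;
  try
    MS.WriteBuffer(Data, SizeOf(Data));
    // Move to offset 2, where the four bytes of interest begin
    MS.Position := 2;
    // Two different tests in a row, both starting from offset 2
    Assert(not StreamHasWatermark(MS, [$FF, $FF]));
    Assert(StreamHasWatermark(MS, [$50, $4B, $03, $04]));
    // Too few bytes remain for a five byte watermark, so the result is False
    Assert(not StreamHasWatermark(MS, [$50, $4B, $03, $04, $00]));
    // The stream position is unchanged after all three calls
    Assert(MS.Position = 2);
  finally
    MS.Free;
  end;
end;
```

Note that the checks only run when assertions are enabled in the compiler options.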
That's the core functionality taken care of, so how can we use the routine?
How about this generalised routine to check a file watermark or preamble?
function FileHasWatermark(const FileName: string;
  const Watermark: array of Byte; const Offset: Integer = 0): Boolean;
  overload;
var
  FS: TFileStream;
begin
  FS := TFileStream.Create(FileName, fmOpenRead or fmShareDenyNone);
  try
    FS.Position := Offset;
    Result := StreamHasWatermark(FS, Watermark);
  finally
    FS.Free;
  end;
end;
This routine is pretty self-explanatory: it looks for the given sequence of bytes (Watermark) in the named file. It also has an optional parameter that lets you specify the offset of the watermark in the file.
Quite often watermarks are specified as ASCII text, so I've created an overloaded function that takes an ASCII (actually ANSI) watermark instead of an array of bytes. Here it is:
function FileHasWatermark(const FileName: string;
  const Watermark: AnsiString; const Offset: Integer = 0): Boolean;
  overload;
var
  Bytes: array of Byte;
  I: Integer;
begin
  SetLength(Bytes, Length(Watermark));
  for I := 1 to Length(Watermark) do
    Bytes[I - 1] := Ord(Watermark[I]);
  Result := FileHasWatermark(FileName, Bytes, Offset);
end;
Finally, a few examples, all of which assume the name of the required file is in a string variable named FileName:
- A zip file created by PKZip has the preamble 50 4B 03 04 in hex. So a test for such a zip file could be:
  if FileHasWatermark(FileName, [$50, $4B, $03, $04]) then
    ShowMessage('PKZip file');
- Some versions of the old style Windows help file have the byte sequence 00 00 FF FF FF FF at offset 6. The test is:
  if FileHasWatermark(FileName, [$00, $00, $FF, $FF, $FF, $FF], 6) then
    ShowMessage('WinHelp file');
- I'm tinkering about with the Game of Life at the moment and have found there are two versions of the Life file format - 1.05 and 1.06 - which both use the .lif file extension but have different ASCII preambles. Here's a way to distinguish them using the ASCII overload of our function:
  if FileHasWatermark(FileName, '#Life 1.05') then
    ShowMessage('Life v1.05 file format')
  else if FileHasWatermark(FileName, '#Life 1.06') then
    ShowMessage('Life v1.06 file format')
  else
    ShowMessage('Invalid Life file format');
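If you find yourself testing for several formats at once, the individual checks above could be driven from a table instead. The following is only a sketch of that idea, not part of the original routines: TSignature, Signatures and IdentifyFile are names invented for the example, and it assumes the AnsiString overload of FileHasWatermark is in scope. It sticks to watermarks made of 7 bit ASCII characters, so the WinHelp test (which needs $FF bytes) is left out.

```pascal
// Sketch only: assumes the AnsiString overload of FileHasWatermark (above)
// is in scope. TSignature, Signatures and IdentifyFile are hypothetical.
type
  TSignature = record
    Watermark: AnsiString;  // expected bytes, as 7 bit ASCII text
    Offset: Integer;        // where the bytes start in the file
    Description: string;    // text to report on a match
  end;

const
  Signatures: array[0..2] of TSignature = (
    (Watermark: 'PK'#3#4; Offset: 0; Description: 'PKZip file'),
    (Watermark: '#Life 1.05'; Offset: 0; Description: 'Life v1.05 file format'),
    (Watermark: '#Life 1.06'; Offset: 0; Description: 'Life v1.06 file format')
  );

function IdentifyFile(const FileName: string): string;
var
  I: Integer;
begin
  // Return the description of the first signature that matches
  for I := Low(Signatures) to High(Signatures) do
    if FileHasWatermark(
      FileName, Signatures[I].Watermark, Signatures[I].Offset
    ) then
    begin
      Result := Signatures[I].Description;
      Exit;
    end;
  Result := 'Unknown file format';
end;
```

A call such as ShowMessage(IdentifyFile(FileName)) would then report the first matching format. Each table entry opens the file again, which is fine for a handful of signatures but wasteful for a long list.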
Hope that's useful to someone.
EDIT: Versions of these routines are now available from the Code Snippets Database.