Splitting Files (Small Code & Blazing Fast)
Problem/Question/Abstract:

How can I split a file into smaller pieces of a specified size while keeping the source code simple?

Answer:

Why would one want to split up a file? One reason is that it may be too large to transfer reliably to another computer. So you chop it up into smaller, manageable pieces, transfer the pieces and reassemble them on the target computer.

Here is a very small, simple and very fast function for splitting a specified file into smaller files of a specified size (in bytes). The function uses streams and is more or less self-explanatory. Error handling is currently minimal and can be extended. The function does not modify the original file in any manner; it merely creates new files in the same directory as the original file, with sequenced extensions (.001, .002, ...).

What's the use of splitting if you cannot put the pieces together again? To join the split files, you can use the command line:

Copy /B File1 + File2 + File3 ... TargetFile

Save the following code to a file named "SplitFl.pas", use it in your source with the "Uses SplitFl" clause and you are ready to split (hopefully not with laughter)!

{******************************************************}
{* Description: Splits a specified file into pieces   *}
{*              of specified size.                    *}
{******************************************************}
{* Last Modified : 12-Mar-2001                        *}
{* Author        : Paramjeet Reen                     *}
{******************************************************}
{* I do not guarantee the fitness of this program.    *}
{* Please use it at your own risk.                    *}
{******************************************************}
{* Category : Freeware.                               *}
{******************************************************}

unit SplitFl;

interface

procedure SplitFile(const pFileName: AnsiString; const pSplitSize: LongInt);

implementation

uses
  Classes, SysUtils, Dialogs;

// Returns a if a < b; otherwise returns b, or 0 if b is not positive.
function Smaller(const a, b: LongInt): LongInt;
begin
  if (a < b) then
    Result := a
  else if (b > 0) then
    Result := b
  else
    Result := 0;
end;

procedure SplitFile(const pFileName: AnsiString; const pSplitSize: LongInt);
var
  vInpFl: TFileStream;
  vOutFl: TFileStream;
  vCtr: Integer;
begin
  // A count of 0 makes CopyFrom copy the whole source, so reject such sizes.
  if (pSplitSize <= 0) then
  begin
    MessageDlg('Split size must be greater than zero!', mtError, [mbOk], 0);
    Exit;
  end;

  vInpFl := TFileStream.Create(pFileName, fmOpenRead);
  try
    if (vInpFl.Size > pSplitSize) then
    begin
      vCtr := 0;
      while (vInpFl.Position < vInpFl.Size) do
      begin
        Inc(vCtr);
        // Pieces are named <original>.001, <original>.002, ...
        vOutFl := TFileStream.Create(pFileName + '.' + FormatFloat('000', vCtr), fmCreate);
        try
          // The last piece holds whatever is left of the input.
          vOutFl.CopyFrom(vInpFl, Smaller(pSplitSize, vInpFl.Size - vInpFl.Position));
        finally
          vOutFl.Free;
        end;
      end;
    end
    else
      MessageDlg('File too small to split!', mtInformation, [mbOk], 0);
  finally
    vInpFl.Free;
  end;
end;

end.

= = = = = = = = = = = = = =
File Split Act-I Scene-II
= = = = = = = = = = = = = =

The story so far: I believed I had written a decent file-splitting function that was both small and fast. However, it was pointed out that it is not fast when it comes to handling HUGE files. I then discovered the $F000 limit on the intermediate memory buffer used by TStream.CopyFrom and thought it to be the cause. It was also suggested that opening the input and output files with the "FILE_FLAG_SEQUENTIAL_SCAN" flag would yield performance benefits. Keeping all of the above in mind, I reworked my original code into the version given below. Surprisingly, however, there is no appreciable speed benefit! Perhaps someone can tell me why and suggest improvements...

unit SplitFl;

interface

procedure SplitFile(const pFileName: AnsiString; const pSplitSize: LongInt);

implementation

uses
  Classes, SysUtils, Dialogs, Windows;

// Returns a if a < b; otherwise returns b, or 0 if b is not positive.
function Smaller(const a, b: LongInt): LongInt;
begin
  if (a < b) then
    Result := a
  else if (b > 0) then
    Result := b
  else
    Result := 0;
end;

procedure SplitFile(const pFileName: AnsiString; const pSplitSize: LongInt);
var
  vInpFlHandle: Integer;
  vOutFlHandle: Integer;
  vInpBytesLft: Integer;
  vOutBytesLft: Integer;
  vBufferSize: Integer;
  vBytesDone: Integer;
  vBuffer: Pointer;
  vCtr: Integer;
begin
  // A non-positive split size would make the loops below never finish.
  if (pSplitSize <= 0) then
  begin
    MessageDlg('Split size must be greater than zero!', mtError, [mbOk], 0);
    Exit;
  end;

  // Use one of the following options to open the file.
  // FILE_FLAG_SEQUENTIAL_SCAN belongs in the flags-and-attributes argument
  // (OR-ed with FILE_ATTRIBUTE_NORMAL); the final argument is the template handle.
  //vInpFlHandle := Integer(CreateFile(PChar(pFileName), GENERIC_READ, FILE_SHARE_READ, nil, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL or FILE_FLAG_SEQUENTIAL_SCAN, 0));
  vInpFlHandle := FileOpen(pFileName, fmOpenRead);
  vInpBytesLft := FileSeek(vInpFlHandle, 0, 2); // Seek to the end to get the file size.

  if (vInpBytesLft > pSplitSize) then
  begin
    vBufferSize := Smaller(GetHeapStatus.TotalUncommitted, pSplitSize);
    // Fall back to a modest buffer if the heap reports no uncommitted space.
    if (vBufferSize <= 0) then
      vBufferSize := Smaller($F000, pSplitSize);
    GetMem(vBuffer, vBufferSize);
    FileSeek(vInpFlHandle, 0, 0); // Back to the start of the file.
    vCtr := 0;

    while (vInpBytesLft > 0) do
    begin
      Inc(vCtr);
      // Use one of the following options to open the file.
      //vOutFlHandle := Integer(CreateFile(PChar(pFileName + '.' + FormatFloat('000', vCtr)), GENERIC_READ or GENERIC_WRITE, 0, nil, CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL or FILE_FLAG_SEQUENTIAL_SCAN, 0));
      vOutFlHandle := FileCreate(pFileName + '.' + FormatFloat('000', vCtr));
      vOutBytesLft := Smaller(vInpBytesLft, pSplitSize);

      while (vOutBytesLft > 0) do
      begin
        vBytesDone := FileRead(vInpFlHandle, vBuffer^, Smaller(vOutBytesLft, vBufferSize));
        if (vBytesDone <= 0) then
        begin
          // Read error or unexpected end of file: stop instead of looping forever.
          vInpBytesLft := 0;
          Break;
        end;
        FileWrite(vOutFlHandle, vBuffer^, vBytesDone);
        Dec(vInpBytesLft, vBytesDone);
        Dec(vOutBytesLft, vBytesDone);
      end;

      FileClose(vOutFlHandle);
    end;

    FreeMem(vBuffer);
  end
  else
    MessageDlg('File too small to split!', mtInformation, [mbOk], 0);

  FileClose(vInpFlHandle);
end;

end.

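Whichever version of SplitFile produced the pieces, they can be rejoined from code as well as with Copy /B. The following is only a minimal sketch: the unit name JoinFl and the function JoinFiles are hypothetical (they are not part of the original SplitFl unit), and it assumes the pieces sit next to each other, use the .001, .002, ... naming shown above, and that the first missing number marks the end of the set.

```pascal
unit JoinFl;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

interface

// Hypothetical counterpart to SplitFile (an assumption, not original code).
// Appends pBaseName.001, pBaseName.002, ... to pTargetName until the next
// numbered piece does not exist. Returns the number of pieces joined.
function JoinFiles(const pBaseName, pTargetName: string): Integer;

implementation

uses
  Classes, SysUtils;

function JoinFiles(const pBaseName, pTargetName: string): Integer;
var
  vOutFl: TFileStream;
  vInpFl: TFileStream;
  vPiece: string;
begin
  Result := 0;
  vOutFl := TFileStream.Create(pTargetName, fmCreate);
  try
    vPiece := pBaseName + '.' + FormatFloat('000', Result + 1);
    while FileExists(vPiece) do
    begin
      vInpFl := TFileStream.Create(vPiece, fmOpenRead or fmShareDenyWrite);
      try
        // A count of 0 tells CopyFrom to copy the whole source stream.
        vOutFl.CopyFrom(vInpFl, 0);
      finally
        vInpFl.Free;
      end;
      Inc(Result);
      // Name of the next piece to look for.
      vPiece := pBaseName + '.' + FormatFloat('000', Result + 1);
    end;
  finally
    vOutFl.Free;
  end;
end;

end.
```

A call such as JoinFiles('Big.zip', 'Big.joined.zip') (file names purely illustrative) would read Big.zip.001, Big.zip.002 and so on. Writing to a separate target name keeps the original file untouched, in the same spirit as SplitFile.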
TFileStream.Create calls FileCreate in SysUtils. I've had some success by creating a separate TFileStream constructor, called TFileStream.CreateSeqScan, that calls the following SeqScanFileCreate instead of FileCreate and adds FILE_FLAG_SEQUENTIAL_SCAN to the Windows API CreateFile call:

function SeqScanFileCreate(const FileName: string): Integer;
begin
  Result := Integer(CreateFile(PChar(FileName), GENERIC_READ or GENERIC_WRITE, 0, nil, CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL or FILE_FLAG_SEQUENTIAL_SCAN, 0));
end;

This allows the operating system to read ahead as much as memory allows and to write in larger chunks than the $F000
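The comment above shows only the helper function, not the stream class around it. One possible shape for such a class is sketched below. It is an assumption throughout: the names SeqScanStream and TSeqScanFileStream are hypothetical, the class derives from THandleStream rather than adding a constructor to TFileStream itself, and it assumes THandleStream.Create accepts a THandle (as in recent Delphi versions and Free Pascal; older Delphi versions declare the parameter as Integer and would need a cast). It is Windows-only and makes no claim about the resulting speed.

```pascal
unit SeqScanStream;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

interface

uses
  Classes;

type
  // Hypothetical class name. Owns the handle it opens, as TFileStream does.
  TSeqScanFileStream = class(THandleStream)
  private
    FOpened: Boolean; // True once a valid handle has been handed to the base class.
  public
    constructor CreateSeqScan(const FileName: string);
    destructor Destroy; override;
  end;

implementation

uses
  SysUtils, Windows;

constructor TSeqScanFileStream.CreateSeqScan(const FileName: string);
var
  vHandle: THandle;
begin
  // Same CreateFile call as SeqScanFileCreate above: the sequential-scan
  // hint is OR-ed into the flags-and-attributes argument.
  vHandle := CreateFile(PChar(FileName), GENERIC_READ or GENERIC_WRITE, 0, nil,
    CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL or FILE_FLAG_SEQUENTIAL_SCAN, 0);
  if (vHandle = INVALID_HANDLE_VALUE) then
    raise EFCreateError.CreateFmt('Cannot create file %s', [FileName]);
  inherited Create(vHandle);
  FOpened := True;
end;

destructor TSeqScanFileStream.Destroy;
begin
  // THandleStream does not close its handle, so close it here.
  if FOpened then
    FileClose(Handle);
  inherited Destroy;
end;

end.
```

With a class like this, the output stream in the first SplitFile could be created with TSeqScanFileStream.CreateSeqScan(...) in place of TFileStream.Create(..., fmCreate); the rest of the procedure would stay the same because both are TStream descendants.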