两个字符串列表的比较的算法(50)

  • 主题发起人 主题发起人 海无崖
  • 开始时间 开始时间

海无崖

Unregistered / Unconfirmed
GUEST, unregistred user!
现有两个Tstringlist一个是s1123232342223423另一个是s2 (s2有几十万行)12412,......23442,......23422,.................如何才能快速地将s1中每一行字符串在s2中检索哪一行是否包涵了这个字串.检索结果为23422,......请大侠们支招!
 
如果s2每行的长度固定,可优化:n := pos(s1, s2.txt);index := n div Length(s2[0]);如果不固定,使用前人的StrFunc.Pas文件,是汇编查找,其查找速度号称最快
 
放到数据库里用SQL语句查询
 
pos这个方法不是最优的算法.
 
效率来自汇编
 
用汇编又将怎么来写代码喃?
 
如果将1千行的字符列表和一个30多万条的字符列表比较,pos将会非常慢啊,难道就没有解决办法吗???
 
s2要先排序,TStringList.Sorted:=True遍历s1,在s2中查询s1的每一行,这时候是按二分法查询,速度要快些,效率是来自算法,不一定是汇编
 
首先,将一个30多万条的记录sorted就非常慢了,还不用说比较。
 
给你一段老外的代码:unit FastStrings;interfaceuses {$IFNDEF LINUX} Windows, {$ENDIF} SysUtils;//This TYPE declaration will become apparent latertype TBMJumpTable = array[0..255] of Integer; TFastPosProc = function (const aSource, aFind: Pointer; const aSourceLen, aFindLen: Integer; var JumpTable: TBMJumpTable): Pointer; TFastPosIndexProc = function (const aSourceString, aFindString: string; const aSourceLen, aFindLen, StartPos: Integer; var JumpTable: TBMJumpTable): Integer; TFastTagReplaceProc = procedure (var Tag: string; const UserData: Integer);//Boyer-Moore routinesprocedure MakeBMTable(Buffer: PChar; BufferLen: Integer; var JumpTable: TBMJumpTable);procedure MakeBMTableNoCase(Buffer: PChar; BufferLen: Integer; var JumpTable: TBMJumpTable);function BMPos(const aSource, aFind: Pointer; const aSourceLen, aFindLen: Integer; var JumpTable: TBMJumpTable): Pointer;function BMPosNoCase(const aSource, aFind: Pointer; const aSourceLen, aFindLen: Integer; var JumpTable: TBMJumpTable): Pointer;function FastAnsiReplace(const S, OldPattern, NewPattern: string; Flags: TReplaceFlags): string;procedure FastCharMove(const Source; var Dest; Count : Integer);function FastCharPos(const aSource : string; const C: Char; StartPos : Integer): Integer;function FastCharPosNoCase(const aSource : string; C: Char; StartPos : Integer): Integer;function FastPos(const aSourceString, aFindString : string; const aSourceLen, aFindLen, StartPos : Integer) : Integer;function FastPosNoCase(const aSourceString, aFindString : string; const aSourceLen, aFindLen, StartPos : Integer) : Integer;function FastPosBack(const aSourceString, aFindString : string; const aSourceLen, aFindLen, StartPos : Integer) : Integer;function FastPosBackNoCase(const aSourceString, aFindString : string; const aSourceLen, aFindLen, StartPos : Integer) : Integer;function FastReplace(const aSourceString : string; const aFindString, aReplaceString : string; CaseSensitive : Boolean = False) : string;function FastTagReplace(const SourceString, TagStart, TagEnd: string; FastTagReplaceProc: TFastTagReplaceProc; const UserData: Integer): string;function SmartPos(const SearchStr,SourceStr : string; const CaseSensitive : Boolean = TRUE; const StartPos : Integer = 1; const ForwardSearch : Boolean = TRUE) : Integer;implementationconst cDeltaSize = 1.5;var GUpcaseTable : array[0..255] of char; GUpcaseLUT: Pointer;//MakeBMJumpTable takes a FindString and makes a JumpTableprocedure MakeBMTable(Buffer: PChar; BufferLen: Integer; var JumpTable: TBMJumpTable);begin if BufferLen = 0 then raise Exception.Create('BufferLen is 0'); asm push EDI push ESI mov EDI, JumpTable mov EAX, BufferLen mov ECX, $100 REPNE STOSD mov ECX, BufferLen mov EDI, JumpTable mov ESI, Buffer dec ECX xor EAX, EAX@@loop: mov AL, [ESI] lea ESI, ESI + 1 mov [EDI + EAX * 4], ECX dec ECX jg @@loop pop ESI pop EDI end;end;procedure MakeBMTableNoCase(Buffer: PChar; BufferLen: Integer; var JumpTable: TBMJumpTable);begin if BufferLen = 0 then raise Exception.Create('BufferLen is 0'); asm push EDI push ESI mov EDI, JumpTable mov EAX, BufferLen mov ECX, $100 REPNE STOSD mov EDX, GUpcaseLUT mov ECX, BufferLen mov EDI, JumpTable mov ESI, Buffer dec ECX xor EAX, EAX@@loop: mov AL, [ESI] lea ESI, ESI + 1 mov AL, [EDX + EAX] mov [EDI + EAX * 4], ECX dec ECX jg @@loop pop ESI pop EDI end;end;function BMPos(const aSource, aFind: Pointer; const aSourceLen, aFindLen: Integer; var JumpTable: TBMJumpTable): Pointer;var LastPos: Pointer;begin LastPos := Pointer(Integer(aSource) + aSourceLen - 1); asm push ESI push EDI push EBX mov EAX, aFindLen mov ESI, aSource lea ESI, ESI + EAX - 1 std mov EBX, JumpTable@@comparetext: cmp ESI, LastPos jg @@NotFound mov EAX, aFindLen mov EDI, aFind mov ECX, EAX push ESI //Remember where we are lea EDI, EDI + EAX - 1 xor EAX, EAX@@CompareNext: mov al, [ESI] cmp al, [EDI] jne @@LookAhead lea ESI, ESI - 1 lea EDI, EDI - 1 dec ECX jz @@Found jmp @@CompareNext@@LookAhead: //Look up the char in our Jump Table pop ESI mov al, [ESI] mov EAX, [EBX + EAX * 4] lea ESI, ESI + EAX jmp @@CompareText@@NotFound: mov Result, 0 jmp @@TheEnd@@Found: pop EDI //We are just popping, we don't need the value inc ESI mov Result, ESI@@TheEnd: cld pop EBX pop EDI pop ESI end;end;function BMPosNoCase(const aSource, aFind: Pointer; const aSourceLen, aFindLen: Integer; var JumpTable: TBMJumpTable): Pointer;var LastPos: Pointer;begin LastPos := Pointer(Integer(aSource) + aSourceLen - 1); asm push ESI push EDI push EBX mov EAX, aFindLen mov ESI, aSource lea ESI, ESI + EAX - 1 std mov EDX, GUpcaseLUT@@comparetext: cmp ESI, LastPos jg @@NotFound mov EAX, aFindLen mov EDI, aFind push ESI //Remember where we are mov ECX, EAX lea EDI, EDI + EAX - 1 xor EAX, EAX@@CompareNext: mov al, [ESI] mov bl, [EDX + EAX] mov al, [EDI] cmp bl, [EDX + EAX] jne @@LookAhead lea ESI, ESI - 1 lea EDI, EDI - 1 dec ECX jz @@Found jmp @@CompareNext@@LookAhead: //Look up the char in our Jump Table pop ESI mov EBX, JumpTable mov al, [ESI] mov al, [EDX + EAX] mov EAX, [EBX + EAX * 4] lea ESI, ESI + EAX jmp @@CompareText@@NotFound: mov Result, 0 jmp @@TheEnd@@Found: pop EDI //We are just popping, we don't need the value inc ESI mov Result, ESI@@TheEnd: cld pop EBX pop EDI pop ESI end;end;//NOTE : FastCharPos and FastCharPosNoCase do not require you to pass the length// of the string, this was only done in FastPos and FastPosNoCase because// they are used by FastReplace many times over, thus saving a LENGTH()// operation each time. I can't see you using these two routines for the// same purposes so I didn't do that this time !function FastCharPos(const aSource : string; const C: Char; StartPos : Integer) : Integer;var L : Integer;begin //If this assert failed, it is because you passed 0 for StartPos, lowest value is 1 !! Assert(StartPos > 0); Result := 0; L := Length(aSource); if L = 0 then exit; if StartPos > L then exit; Dec(StartPos); asm PUSH EDI //Preserve this register mov EDI, aSource //Point EDI at aSource add EDI, StartPos mov ECX, L //Make a note of how many chars to search through sub ECX, StartPos mov AL, C //and which char we want @Loop: cmp Al, [EDI] //compare it against the SourceString jz @Found inc EDI dec ECX jnz @Loop jmp @NotFound @Found: sub EDI, aSource //EDI has been incremented, so EDI-OrigAdress = Char pos ! inc EDI mov Result, EDI @NotFound: POP EDI end;end;function FastCharPosNoCase(const aSource : string; C: Char; StartPos : Integer) : Integer;var L : Integer;begin Result := 0; L := Length(aSource); if L = 0 then exit; if StartPos > L then exit; Dec(StartPos); if StartPos < 0 then StartPos := 0; asm PUSH EDI //Preserve this register PUSH EBX mov EDX, GUpcaseLUT mov EDI, aSource //Point EDI at aSource add EDI, StartPos mov ECX, L //Make a note of how many chars to search through sub ECX, StartPos xor EBX, EBX mov BL, C mov AL, [EDX+EBX] @Loop: mov BL, [EDI] inc EDI cmp Al, [EDX+EBX] jz @Found dec ECX jnz @Loop jmp @NotFound @Found: sub EDI, aSource //EDI has been incremented, so EDI-OrigAdress = Char pos ! mov Result, EDI @NotFound: POP EBX POP EDI end;end;//The first thing to note here is that I am passing the SourceLength and FindLength//As neither Source or Find will alter at any point during FastReplace there is//no need to call the LENGTH subroutine each time !function FastPos(const aSourceString, aFindString : string; const aSourceLen, aFindLen, StartPos : Integer) : Integer;var JumpTable: TBMJumpTable;begin //If this assert failed, it is because you passed 0 for StartPos, lowest value is 1 !! Assert(StartPos > 0); if aFindLen < 1 then begin Result := 0; exit; end; if aFindLen > aSourceLen then begin Result := 0; exit; end; MakeBMTable(PChar(aFindString), aFindLen, JumpTable); Result := Integer(BMPos(PChar(aSourceString) + (StartPos - 1), PChar(aFindString),aSourceLen - (StartPos-1), aFindLen, JumpTable)); if Result > 0 then Result := Result - Integer(@aSourceString[1]) +1;end;function FastPosNoCase(const aSourceString, aFindString : string; const aSourceLen, aFindLen, StartPos : Integer) : Integer;var JumpTable: TBMJumpTable;begin //If this assert failed, it is because you passed 0 for StartPos, lowest value is 1 !! Assert(StartPos > 0); if aFindLen < 1 then begin Result := 0; exit; end; if aFindLen > aSourceLen then begin Result := 0; exit; end; MakeBMTableNoCase(PChar(AFindString), aFindLen, JumpTable); Result := Integer(BMPosNoCase(PChar(aSourceString) + (StartPos - 1), PChar(aFindString),aSourceLen - (StartPos-1), aFindLen, JumpTable)); if Result > 0 then Result := Result - Integer(@aSourceString[1]) +1;end;function FastPosBack(const aSourceString, aFindString : string; const aSourceLen, aFindLen, StartPos : Integer) : Integer;var SourceLen : Integer;begin if aFindLen < 1 then begin Result := 0; exit; end; if aFindLen > aSourceLen then begin Result := 0; exit; end; if (StartPos = 0) or (StartPos + aFindLen > aSourceLen) then SourceLen := aSourceLen - (aFindLen-1) else SourceLen := StartPos; asm push ESI push EDI push EBX mov EDI, aSourceString add EDI, SourceLen Dec EDI mov ESI, aFindString mov ECX, SourceLen Mov Al, [ESI] @ScaSB: cmp Al, [EDI] jne @NextChar @CompareStrings: mov EBX, aFindLen dec EBX jz @FullMatch @CompareNext: mov Ah, [ESI+EBX] cmp Ah, [EDI+EBX] Jnz @NextChar @Matches: Dec EBX Jnz @CompareNext @FullMatch: mov EAX, EDI sub EAX, aSourceString inc EAX mov Result, EAX jmp @TheEnd @NextChar: dec EDI dec ECX jnz @ScaSB mov Result,0 @TheEnd: pop EBX pop EDI pop ESI end;end;function FastPosBackNoCase(const aSourceString, aFindString : string; const aSourceLen, aFindLen, StartPos : Integer) : Integer;var SourceLen : Integer;begin if aFindLen < 1 then begin Result := 0; exit; end; if aFindLen > aSourceLen then begin Result := 0; exit; end; if (StartPos = 0) or (StartPos + aFindLen > aSourceLen) then SourceLen := aSourceLen - (aFindLen-1) else SourceLen := StartPos; asm push ESI push EDI push EBX mov EDI, aSourceString add EDI, SourceLen Dec EDI mov ESI, aFindString mov ECX, SourceLen mov EDX, GUpcaseLUT xor EBX, EBX mov Bl, [ESI] mov Al, [EDX+EBX] @ScaSB: mov Bl, [EDI] cmp Al, [EDX+EBX] jne @NextChar @CompareStrings: PUSH ECX mov ECX, aFindLen dec ECX jz @FullMatch @CompareNext: mov Bl, [ESI+ECX] mov Ah, [EDX+EBX] mov Bl, [EDI+ECX] cmp Ah, [EDX+EBX] Jz @Matches //Go back to findind the first char POP ECX Jmp @NextChar @Matches: Dec ECX Jnz @CompareNext @FullMatch: POP ECX mov EAX, EDI sub EAX, aSourceString inc EAX mov Result, EAX jmp @TheEnd @NextChar: dec EDI dec ECX jnz @ScaSB mov Result,0 @TheEnd: pop EBX pop EDI pop ESI end;end;//My move is not as fast as MOVE when source and destination are both//DWord aligned, but certainly faster when they are not.//As we are moving characters in a string, it is not very likely at all that//both source and destination are DWord aligned, so moving bytes avoids the//cycle penality of reading/writing DWords across physical boundariesprocedure FastCharMove(const Source; var Dest; Count : Integer);asm//Note: When this function is called, delphi passes the parameters as follows//ECX = Count//EAX = Const Source//EDX = Var Dest //If no bytes to copy, just quit altogether, no point pushing registers cmp ECX,0 Je @JustQuit //Preserve the critical delphi registers push ESI push EDI //move Source into ESI (generally the SOURCE register) //move Dest into EDI (generally the DEST register for string commands) //This may not actually be neccessary, as I am not using MOVsb etc //I may be able just to use EAX and EDX, there may be a penalty for //not using ESI, EDI but I doubt it, this is another thing worth trying ! mov ESI, EAX mov EDI, EDX //The following loop is the same as repNZ MovSB, but oddly quicker ! @Loop: //Get the source byte Mov AL, [ESI] //Point to next byte Inc ESI //Put it into the Dest mov [EDI], AL //Point dest to next position Inc EDI //Dec ECX to note how many we have left to copy Dec ECX //If ECX <> 0 then loop Jnz @Loop //Another optimization note. //Many people like to do this //Mov AL, [ESI] //Mov [EDI], Al //Inc ESI //Inc ESI //There is a hidden problem here, I wont go into too much detail, but //the pentium can continue processing instructions while it is still //working out the result of INC ESI or INC EDI //(almost like a multithreaded CPU) //if, however, you go to use them while they are still being calculated //the processor will stop until they are calculated (a penalty) //Therefore I alter ESI and EDI as far in advance as possible of using them //Pop the critical Delphi registers that we have altered pop EDI pop ESI @JustQuit:end;function FastAnsiReplace(const S, OldPattern, NewPattern: string; Flags: TReplaceFlags): string;var BufferSize, BytesWritten: Integer; SourceString, FindString: string; ResultPChar: PChar; FindPChar, ReplacePChar: PChar; SPChar, SourceStringPChar, PrevSourceStringPChar: PChar; FinalSourceMarker: PChar; SourceLength, FindLength, ReplaceLength, CopySize: Integer; FinalSourcePosition: Integer;begin //Set up string lengths BytesWritten := 0; SourceLength := Length(S); FindLength := Length(OldPattern); ReplaceLength := Length(NewPattern); //Quick exit if (SourceLength = 0) or (FindLength = 0) or (FindLength > SourceLength) then begin Result := S; Exit; end; //Set up the source string and find string if rfIgnoreCase in Flags then begin SourceString := AnsiUpperCase(S); FindString := AnsiUpperCase(OldPattern); end else begin SourceString := S; FindString := OldPattern; end; //Set up the result buffer size and pointers try if ReplaceLength <= FindLength then //Result cannot be larger, only same size or smaller BufferSize := SourceLength else //Assume a source string made entired of the sub string BufferSize := (SourceLength * ReplaceLength) div FindLength; //10 times is okay for starters. We don't want to //go allocating much more than we need. if BufferSize > (SourceLength * 10) then BufferSize := SourceLength * 10; except //Oops, integer overflow! Better start with a string //of the same size as the source. BufferSize := SourceLength; end; SetLength(Result, BufferSize); ResultPChar := @Result[1]; //Set up the pointers to S and SourceString SPChar := @S[1]; SourceStringPChar := @SourceString[1]; PrevSourceStringPChar := SourceStringPChar; FinalSourceMarker := @SourceString[SourceLength - (FindLength - 1)]; //Set up the pointer to FindString FindPChar := @FindString[1]; //Set the pointer to ReplaceString if ReplaceLength > 0 then ReplacePChar := @NewPattern[1] else ReplacePChar := nil; //Replace routine repeat //Find the sub string SourceStringPChar := AnsiStrPos(PrevSourceStringPChar, FindPChar); if SourceStringPChar = nil then Break; //How many characters do we need to copy before //the string occurs CopySize := SourceStringPChar - PrevSourceStringPChar; //Check we have enough space in our Result buffer if CopySize + ReplaceLength > BufferSize - BytesWritten then begin BufferSize := Trunc((BytesWritten + CopySize + ReplaceLength) * cDeltaSize); SetLength(Result, BufferSize); ResultPChar := @Result[BytesWritten + 1]; end; //Copy the preceeding characters to our result buffer Move(SPChar^, ResultPChar^, CopySize); Inc(BytesWritten, CopySize); //Advance the copy position of S Inc(SPChar, CopySize + FindLength); //Advance the Result pointer Inc(ResultPChar, CopySize); //Copy the replace string into the Result buffer if Assigned(ReplacePChar) then begin Move(ReplacePChar^, ResultPChar^, ReplaceLength); Inc(ResultPChar, ReplaceLength); Inc(BytesWritten, ReplaceLength); end; //Fake delete the start of the source string PrevSourceStringPChar := SourceStringPChar + FindLength; until (PrevSourceStringPChar > FinalSourceMarker) or not (rfReplaceAll in Flags); FinalSourcePosition := Integer(SPChar - @S[1]); CopySize := SourceLength - FinalSourcePosition; SetLength(Result, BytesWritten + CopySize); if CopySize > 0 then Move(SPChar^, Result[BytesWritten + 1], CopySize);end;function FastReplace(const aSourceString : string; const aFindString, aReplaceString : string; CaseSensitive : Boolean = False) : string;var PResult : PChar; PReplace : PChar; PSource : PChar; PFind : PChar; PPosition : PChar; CurrentPos, BytesUsed, lResult, lReplace, lSource, lFind : Integer; Find : TFastPosProc; CopySize : Integer; JumpTable : TBMJumpTable;begin LSource := Length(aSourceString); if LSource = 0 then begin Result := aSourceString; exit; end; PSource := @aSourceString[1]; LFind := Length(aFindString); if LFind = 0 then exit; PFind := @aFindString[1]; LReplace := Length(aReplaceString); //Here we may get an Integer Overflow, or OutOfMemory, if so, we use a Delta try if LReplace <= LFind then SetLength(Result,lSource) else SetLength(Result, (LSource *LReplace) div LFind); except SetLength(Result,0); end; LResult := Length(Result); if LResult = 0 then begin LResult := Trunc((LSource + LReplace) * cDeltaSize); SetLength(Result, LResult); end; PResult := @Result[1]; if CaseSensitive then begin MakeBMTable(PChar(AFindString), lFind, JumpTable); Find := BMPos; end else begin MakeBMTableNoCase(PChar(AFindString), lFind, JumpTable); Find := BMPosNoCase; end; BytesUsed := 0; if LReplace > 0 then begin PReplace := @aReplaceString[1]; repeat PPosition := Find(PSource,PFind,lSource, lFind, JumpTable); if PPosition = nil then break; CopySize := PPosition - PSource; Inc(BytesUsed, CopySize + LReplace); if BytesUsed >= LResult then begin //We have run out of space CurrentPos := Integer(PResult) - Integer(@Result[1]) +1; LResult := Trunc(LResult * cDeltaSize); SetLength(Result,LResult); PResult := @Result[CurrentPos]; end; FastCharMove(PSource^,PResult^,CopySize); Dec(lSource,CopySize + LFind); Inc(PSource,CopySize + LFind); Inc(PResult,CopySize); FastCharMove(PReplace^,PResult^,LReplace); Inc(PResult,LReplace); until lSource < lFind; end else begin repeat PPosition := Find(PSource,PFind,lSource, lFind, JumpTable); if PPosition = nil then break; CopySize := PPosition - PSource; FastCharMove(PSource^,PResult^,CopySize); Dec(lSource,CopySize + LFind); Inc(PSource,CopySize + LFind); Inc(PResult,CopySize); Inc(BytesUsed, CopySize); until lSource < lFind; end; SetLength(Result, (PResult+LSource) - @Result[1]); if LSource > 0 then FastCharMove(PSource^, Result[BytesUsed + 1], LSource);end;function FastTagReplace(const SourceString, TagStart, TagEnd: string; FastTagReplaceProc: TFastTagReplaceProc; const UserData: Integer): string;var TagStartPChar: PChar; TagEndPChar: PChar; SourceStringPChar: PChar; TagStartFindPos: PChar; TagEndFindPos: PChar; TagStartLength: Integer; TagEndLength: Integer; DestPChar: PChar; FinalSourceMarkerStart: PChar; FinalSourceMarkerEnd: PChar; BytesWritten: Integer; BufferSize: Integer; CopySize: Integer; ReplaceString: string; procedure AddBuffer(const Buffer: Pointer; Size: Integer); begin if BytesWritten + Size > BufferSize then begin BufferSize := Trunc(BufferSize * cDeltaSize); if BufferSize <= (BytesWritten + Size) then BufferSize := Trunc((BytesWritten + Size) * cDeltaSize); SetLength(Result, BufferSize); DestPChar := @Result[BytesWritten + 1]; end; Inc(BytesWritten, Size); FastCharMove(Buffer^, DestPChar^, Size); DestPChar := DestPChar + Size; end;begin Assert(Assigned(@FastTagReplaceProc)); TagStartPChar := PChar(TagStart); TagEndPChar := PChar(TagEnd); if (SourceString = '') or (TagStart = '') or (TagEnd = '') then begin Result := SourceString; Exit; end; SourceStringPChar := PChar(SourceString); TagStartLength := Length(TagStart); TagEndLength := Length(TagEnd); FinalSourceMarkerEnd := SourceStringPChar + Length(SourceString) - TagEndLength; FinalSourceMarkerStart := FinalSourceMarkerEnd - TagStartLength; BytesWritten := 0; BufferSize := Length(SourceString); SetLength(Result, BufferSize); DestPChar := @Result[1]; repeat TagStartFindPos := AnsiStrPos(SourceStringPChar, TagStartPChar); if (TagStartFindPos = nil) or (TagStartFindPos > FinalSourceMarkerStart) then Break; TagEndFindPos := AnsiStrPos(TagStartFindPos + TagStartLength, TagEndPChar); if (TagEndFindPos = nil) or (TagEndFindPos > FinalSourceMarkerEnd) then Break; CopySize := TagStartFindPos - SourceStringPChar; AddBuffer(SourceStringPChar, CopySize); CopySize := TagEndFindPos - (TagStartFindPos + TagStartLength); SetLength(ReplaceString, CopySize); if CopySize > 0 then Move((TagStartFindPos + TagStartLength)^, ReplaceString[1], CopySize); FastTagReplaceProc(ReplaceString, UserData); if Length(ReplaceString) > 0 then AddBuffer(@ReplaceString[1], Length(ReplaceString)); SourceStringPChar := TagEndFindPos + TagEndLength; until SourceStringPChar > FinalSourceMarkerStart; CopySize := PChar(@SourceString[Length(SourceString)]) - (SourceStringPChar - 1); if CopySize > 0 then AddBuffer(SourceStringPChar, CopySize); SetLength(Result, BytesWritten);end;function SmartPos(const SearchStr,SourceStr : string; const CaseSensitive : Boolean = TRUE; const StartPos : Integer = 1; const ForwardSearch : Boolean = TRUE) : Integer;begin // NOTE: When using StartPos, the returned value is absolute! if (CaseSensitive) then if (ForwardSearch) then Result:= FastPos(SourceStr,SearchStr,Length(SourceStr),Length(SearchStr),StartPos) else Result:= FastPosBack(SourceStr,SearchStr,Length(SourceStr),Length(SearchStr),StartPos) else if (ForwardSearch) then Result:= FastPosNoCase(SourceStr,SearchStr,Length(SourceStr),Length(SearchStr),StartPos) else Result:= FastPosBackNoCase(SourceStr,SearchStr,Length(SourceStr),Length(SearchStr),StartPos)end;var I: Integer;initialization {$IFNDEF LINUX} for I:=0 to 255 do GUpcaseTable := Chr(I); CharUpperBuff(@GUpcaseTable[0], 256); {$ELSE} for I:=0 to 255 do GUpcaseTable := UpCase(Chr(I)); {$ENDIF} GUpcaseLUT := @GUpcaseTable[0];end.
 
To:szhcracker大侠,你帖出的代码怎么用啊???麻烦按我的需求写个例子,谢谢了。
 
那个S1的字符串内容都是数字,且是长度为5的字串??还是说不定?
 
你可能主要用到以下几个函数:FastCharPos、FastCharPosNoCase、FastPos、FastPosNoCase
 
我用了FastPos,速度和以前一样,没什么变化.
 
那个FastPos,就是Pos的意思,大概代码就是:var SrcData: string; RetVal, Len: Integer; LineStart, LineEnd: PChar; LineData: string;SrcData := S2.Text;for I := 0 to S1.Count - 1 do [ RetVal := FindPosNotCase(SrcData, S1, Length(SrcData), Length(S1, 1); if RetVal > 0 then [ // find LineStart := PChar(SrcData) + RetVal + 1; LineEnd := LineStart; while (LineEnd^ in [#13, #10, #0]) do Inc(lineEnd); Len := LineEnd - LineStart; // 得到该行 SetString(LineData, LineStart, Len); // ... ];]随便写的,没试。
 
To:QQ在线,我也是这样写的,但对速度并无帮助.我觉得关键就是在pos这种方法对一个几十万条的记录,可能的确是有问题的.
 
你這裡根本都沒有建立起一個好的數據結構出來,談何用算法來優化查找時間啊~~~~~~麻煩你先建好你自己的數據結構。你這樣相當于只是一個鏈錶結構,可用的算法不多!
 
是这样的,第一个字符串列表只是身份证号比如:522342342342323452234234234232345223423423423234...............第二个字符串列表是包括有身份证号,姓名,身高等所有人员信息如:'234523523452345','张三','男'....'234523523452345','李四','男'....'234523523232345','王五','女'.....................................我想把在第二个列表中所有包括第一个的身份证号的找出来.
 
我建议你先把2个都排序,然后再用那个fastpos第一个搜索全部,第2个就从上一个搜到的位置开始向下搜索,这样越搜索需要找的范围就越小
 
那用hash的方法试下。大概是这样:uses IniFiles;function CreateHash(List: TStrings): TStringHash;var S: string; I: Integer;begin Result := TStringHash.Create(List.Count); for I := 0 to List.Count - 1 do begin S := List; Result.Add(S, I); end;end;procedure DoFilter(S1, S2: TStrings; DisplayList: TStrings);var I, Ret: Integer; Line, S: string; Hash: TStringHash;begin Hash := CreateHash(S1); DisplayList.BeginUpdate; try for I := 0 to S2.Count - 1 do begin Line := S2; // S := 取Line的第一字段数据,我不知你怎么取的 Ret := Pos(',', Line); S := Copy(Line, 1, Ret - 1); // 存在就放到Display中 if Hash.ValueOf(S) <> -1 then DisplayList.Add(Line); end; finally DisplayList.EndUpdate; end;end;调用就是:List1.LoadFromFile('身份证号文件');List2.LoadFromFile('所有信息文件');DoFilter(List1, List2, Memo1.Lines);
 

Similar threads

S
回复
0
查看
1K
SUNSTONE的Delphi笔记
S
S
回复
0
查看
913
SUNSTONE的Delphi笔记
S
D
回复
0
查看
923
DelphiTeacher的专栏
D
D
回复
0
查看
715
DelphiTeacher的专栏
D
后退
顶部