An assembly wildcard-matching function for Delphi 7

  • Thread starter: e271828

Author: 李均宇 (Li Junyu), e271828@tom.com, QQ: 165442523

The Pos() function in Delphi 7 does not support wildcards, but its assembly source is public, so for years I have wondered whether it could be adapted to accept them. I recently revisited the problem, found that it is feasible, and implemented it. I was short of time, so mistakes are likely. If you find a bug in this function, please tell me where it is. Thanks in advance.

Below is the assembly source of Posli(), which I derived from Delphi 7's Pos(). In my tests the wildcards work, but I do not know what bugs remain.

Only ? and * are supported; [] sets are not. A ? matches exactly one character, and a Chinese (double-byte) character counts as one character.

Return value: if the substring contains no *, the function returns the index at which the first character of the substring matched. If the substring contains *, it returns the index at which the character right after the last * matched. For example, with substring "Edit*1*2*3?4*5" and source "Edit111www222123国45qEdit222www333qq", it returns the index at which the first character after the last * matched. In short, any return value greater than 0 means the match succeeded.

The function handles Chinese text. With substring "谢" and source "中华" it returns 0, even though the trailing byte of 中 and the leading byte of 华 together form the bytes of 谢; the function checks for this case.

The parameters are PChar, so strings longer than 255 characters can be passed.

Note: Posli() has one bug that I cannot fix. If a * in the substring lines up with a literal * in the source, the result is wrong. REPE CMPSB does not stop there, because the two bytes compare equal, so the wildcard handling never gets a chance to run. The only cure would be to stop using CMPSB, which would complicate the code considerably. For example, with substring wwww*www and source aaaawwww*qwww the function should return a value greater than 0, but it returns 0. For now the only workaround is to disallow * in the source string.

http://blog.sina.com.cn/s/blog_5a8269570100kois.html?retcode=0
http://blog.csdn.net/e271828/archive/2010/08/28/5846826.aspx

function Posli( substr :pchar ; s : pchar ) : Integer;
var
  dlen,sublen,esi0,edi0,starnum,starnum2,ifbacknum:integer;
asm
{ ->EAX Pointer to substr }
{ EDX Pointer to string }
{ <-EAX Position of substr in s or 0 }
  PUSH EBX
  PUSH ESI
  PUSH EDI
  MOV ESI,EAX { Point ESI to substr }
  MOV EDI,EAX
  MOV starnum,0
  MOV starnum2,0
  MOV dlen,0
  MOV sublen,0
  XOR ECX,ECX
  MOV CL,[EDI]
  INC EDI

  XOR ECX,ECX
  MOV ECX,0FFFFFFFFH
  XOR AL,AL
  REPNE SCASB
  NOT ECX
  MOV sublen,ECX
  //SUB sublen,2

  MOV EDI,ESI
  MOV AL,'*'
@@start0:
  REPNE SCASB
  JNE @@start
  ADD starnum,1
  JMP @@start0

@@start:
  MOV EDI,EDX
  INC EDI
  XOR ECX,ECX
  MOV ECX,0FFFFFFFFH
  XOR AL,AL
  REPNE SCASB
  NOT ECX
  MOV dlen,ECX
  //SUB dlen,1

  MOV EDI,EDX { Point EDI to s }
  MOV esi0,ESI
  MOV edi0,EDI
  XOR ECX,ECX { ECX = Length(s) }
  MOV CL,[EDI]
  MOV ECX, dlen
  PUSH EDI { remember s position to calculate index }
  //INC EDI { Point EDI to first char of s }
  XOR EDX,EDX { EDX = Length(substr) }
  MOV DL,[ESI]
  //INC ESI { Point ESI to first char of substr }
  MOV EDX, sublen
  CMP EDX,0 { EDX = Length(substr) - 1 }
  JS @@fail { < 0 ? return 0 }
  MOV AL,[ESI] { AL = first char of substr }
  //INC ESI { Point ESI to 2'nd char of substr }
  DEC EDX
  SUB ECX,EDX { #positions in s to look at }
              { = Length(s) - Length(substr) + 1 }
  ADD ECX,starnum
  JLE @@fail
  PUSH ESI { save outer loop substr pointer }
  PUSH EDI { save outer loop s pointer }
  MOV ECX,sublen
  ADD EDI,1
  JMP @@star
@@loop:
  REPNE SCASB
  JNE @@fail
  MOV EBX,ECX { save outer loop counter }
  PUSH ESI { save outer loop substr pointer }
  PUSH EDI { save outer loop s pointer }
  MOV ECX,EDX
@@loopwww:
  // MOV AL,[ESI]
  // MOV AL,[ESI-1]
  //MOV AL,[ESI-2]
  REPE CMPSB
  //PUSH ESI
  JE @@found
  //INC EDI
  //MOV AL,[ESI]
  //MOV AL,[ESI-1]
  //MOV AL,[EDI-1]
  //CMP AL,[ESI-1]
  //JE @@found
  CMP ECX,0
  JE @@iffound1
  {MOV AL,[ESI]
  CMP AL,$12
  JE @@found
  CMP AL,$0
  JE @@found
  CMP AL,$FF
  JE @@found}
@@iffound2:
  //PUSH EAX
  MOV AL,[ESI]
  SUB ESI,1
  MOV AL,[ESI]
  INC ESI
  //INC ESI
  CMP AL,'?'
  //POP ESI
  JE @@what
  CMP AL,'*'
  JE @@star
  //MOV AL,[ESI]
  //CMP AL,$12
  //JE @@fail2
  //CMP AL,$0
  //JE @@fail2
  //POP EAX
  MOV AL,[EDI]
  CMP AL,$12
  JE @@fail2
  CMP AL,$0
  JE @@fail2
  // the source may be shorter than the substring here, because ? matches a whole Chinese character
  POP EDI { restore outer loop s pointer }
  POP ESI { restore outer loop substr pointer }
  MOV ECX,EBX { restore outer loop counter }
  JMP @@loopOK
@@what:
  MOV EAX,0
  MOV EAX,EDI
  SUB EAX,dlen
  CMP EAX,edi0
  JG @@fail2
  // if the source string has already ended, the answer must be no
  //BUG
  MOV AL,[ESI]
  CMP AL,$12 // treat this as end of string
  //POP EAX
  JE @@found
  CMP AL,$0
  JE @@found
  //push eax
  //MOV AL,[ESI]
  //CMP AL,$0 // this also counts as the end; found by experiment, reason unknown
  //POP EAX
  CMP ECX,0
  JE @@found
  MOV AL,[EDI]
  CMP AL,$80
  JNB @@chinese
@@whatchinese:
  MOV AL,[ESI]
  JMP @@loopwww
@@chinese:
  ADD EDI,1
  JMP @@whatchinese
@@star:
  ADD starnum2,1
  SUB EDI,1
  MOV AL,[ESI]
  CMP AL,$12
  //POP EAX
  JE @@found
  CMP AL,$0
  JE @@found
  // POP EAX
  // POP EAX
  //XOR ECX,ECX
  // MOV CL,[EDI]
  // INC EDI { Point EDI to first char of s }
  // PUSH EDI { remember s position to calculate index }
  // XOR EDX,EDX { EDX = Length(substr) }
  // MOV DL,[ESI]
  // INC ESI { Point ESI to first char of substr }
  // DEC EDX { EDX = Length(substr) - 1 }
  // JS @@fail { < 0 ? return 0 }
  // XOR EAX,EAX
  //PUSH EAX
@@www:
  CMP ECX,0
  JE @@found
  MOV EAX,0
  MOV EAX,EDI
  SUB EAX,dlen
  CMP EAX,edi0
  JG @@fail2
  MOV AL,[ESI]
  ADD ESI,1
  SUB ECX,1
  //MOV AL,[ESI]
  //INC ESI
  CMP AL,'?'
  //POP ESI
  JE @@qq
  CMP AL,'*'
  JE @@www
  CMP AL,$12
  //POP EAX
  JE @@found
  CMP AL,$0
  JE @@found
  //POP EAX
  SUB ESI,1
  ADD ECX,1
  POP EAX
  POP EAX
@@loopOK:
  MOV AL,[ESI] { AL = first char of substr }
  // INC ESI { Point ESI to 2'nd char of substr }
  // SUB ECX,EDX { #positions in s to look at }
  //             { = Length(s) - Length(substr) + 1 }
  // JLE @@fail
  //MOV ECX,dlen-(EDI-edi0)-(sublen-(ESI-esi0))+1+starnum//-starnum2
  MOV ECX,dlen
  SUB ECX,EDI
  ADD ECX,edi0
  SUB ECX,sublen
  ADD ECX,ESI // when ESI is 1, esi0 is 0, so the count is always off by one; add one more below
  SUB ECX,esi0
  //ADD ECX,2 //STRING
  ADD ECX,1 //PCHAR
  ADD ECX,starnum
  //SUB ECX,starnum2
  CMP ECX,0
  JLE @@fail
  REPNE SCASB
  JNE @@fail
  //MOV EBX,ECX { save outer loop counter }
  PUSH EAX
  PUSH EDI
  SUB EDI,1
  MOV AL,[EDI]
  CMP AL,$80
  POP EDI
  POP EAX
  JNB @@IFBACK
@@IFLEAD:
  //ADD ESI,1
  MOV ECX,sublen
  SUB ECX,ESI
  ADD ECX,esi0
  SUB ECX,1 // only needed for PChar
  //SUB ECX,1
  PUSH ESI { save outer loop substr pointer }
  INC ESI
  PUSH EDI { save outer loop s pointer }
  //PUSH EDX
  MOV ECX,ECX
  CMP ECX,0
  JE @@found
  //POP EDX
  JMP @@loopwww
@@IFBACK:
  PUSH EDI
  PUSH EAX
  MOV ifbacknum,0
  SUB EDI,1
@@ifback2:
  ADD EDI,1
  CMP EDI,edi0
  JE @@ifback1
  MOV AL,[EDI]
  CMP AL,$80
  JB @@ifback1
  NOT ifbacknum
  JMP @@ifback2
@@ifback3:
  POP EAX
  POP EDI
  JMP @@IFLEAD
@@ifback1:
  CMP ifbacknum,0
  JNE @@ifback3
  POP EAX
  POP EDI
  ADD EDI,1
  JMP @@loopOK
@@qq:
  POP EAX
  ADD EDI,1
  //PUSH EAX
  MOV AL,[EDI]
  CMP AL,$80
  JNB @@chinese0
@@whatchinese0:
  //POP EAX
  PUSH EDI
  JMP @@www
@@qqq:
  ADD EDI,1
  PUSH EAX
  MOV AL,[EDI]
  CMP AL,$80
  JNB @@chinese0
  POP EAX
  PUSH EDI
  JMP @@www
@@chinese0:
  ADD EDI,1
  JMP @@whatchinese0
@@fail2:
  POP EDX
  POP EDX
@@fail:
  POP EDX { get rid of saved s pointer }
  XOR EAX,EAX
  JMP @@exit
@@iffound1:
  MOV AL,[ESI]
  MOV AL,[ESI-1]
  MOV AL,[EDI-1]
  CMP AL,[ESI-1]
  JE @@found
  JMP @@iffound2
@@found:
  POP EDI { restore outer loop s pointer }
  POP ESI { restore outer loop substr pointer }
  POP EDX { restore pointer to first char of s }
  MOV EAX,EDI { EDI points of char after match }
  SUB EAX,EDX { the difference is the correct index }
@@exit:
  POP EDI
  POP ESI
  POP EBX
end;
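For comparison, the matching rules described in the post (? matches one character, with a double-byte character counting as one; * matches any run of characters) can be sketched in plain Pascal. This is not the author's Posli() and makes no claim about its internals. The names WildPos, MatchHere and CharLen are hypothetical, the sketch assumes a GBK-style encoding in which any byte >= $80 starts a two-byte character, and it returns the index where the whole match starts rather than the index near the last *. Because it backtracks instead of relying on CMPSB, a literal * in the source string is handled, at the cost of speed.

```pascal
// Number of bytes used by the character at P (P^ must not be #0).
// Assumption: any byte >= $80 is the lead byte of a two-byte character.
function CharLen(P: PAnsiChar): Integer;
begin
  if (Ord(P^) >= $80) and (P[1] <> #0) then
    Result := 2
  else
    Result := 1;
end;

// True if Pat matches a prefix of S. S must point at a character boundary.
function MatchHere(Pat, S: PAnsiChar): Boolean;
begin
  Result := False;
  while Pat^ <> #0 do
  begin
    if Pat^ = '*' then
    begin
      Inc(Pat);
      // Backtracking: try the rest of the pattern at every
      // character boundary in the rest of S, including the end.
      repeat
        if MatchHere(Pat, S) then
        begin
          Result := True;
          Exit;
        end;
        if S^ = #0 then
          Exit; // nothing left to skip; Result is still False
        Inc(S, CharLen(S));
      until False;
    end;
    if S^ = #0 then
      Exit; // pattern still has characters but the source has ended
    if Pat^ = '?' then
      Inc(S, CharLen(S)) // one whole character, 1 or 2 bytes
    else
    begin
      if Pat^ <> S^ then
        Exit; // literal byte mismatch
      Inc(S);
    end;
    Inc(Pat);
  end;
  Result := True;
end;

// 1-based byte index of the first match of Pat in S, or 0 if none.
function WildPos(Pat, S: PAnsiChar): Integer;
var
  Start: PAnsiChar;
begin
  Result := 0;
  if (Pat = nil) or (S = nil) or (Pat^ = #0) then
    Exit;
  Start := S;
  while S^ <> #0 do
  begin
    if MatchHere(Pat, S) then
    begin
      Result := (S - Start) + 1;
      Exit;
    end;
    // Advance by whole characters so a match never starts on a trail byte.
    Inc(S, CharLen(S));
  end;
end;
```

Under these assumptions, WildPos('wwww*www', 'aaaawwww*qwww') returns 5, and searching for 谢 (GBK bytes D0 BB) in 中华 (GBK bytes D6 D0 BB AA) returns 0, because the scan only ever starts at character boundaries.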
 
Anyone searching with wildcards presumably isn't worried about efficiency anyway; you might as well use regular expressions.
 
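First, regular expressions are bound to be slower than my assembly wildcard function. Second, regular expressions are complex and hard to use, and not as convenient as a wildcard function.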
 
How does it compare with Masks.MatchesMask?
 
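The advantage of regular expressions is that they can handle strings of any length. In speed, of course, they cannot compete with ASM.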
 
MatchesMask is absurdly slow. My assembly function can also handle strings of any length, since it takes PChar.
 
Either way, a bump for the original poster. Writing something yourself always improves your own skills.
 