Example on RegExp and screen scape ?
This message was discovered on the ASPFriends.com 'aspngregexp' list.
Ian Vink
After I screen-scrape a page, I need to get the address that sits inside
this HTML block:
<td width="275" colspan="2">
<font face="MS Sans Serif" size="2">P.O. Box 1507</font>
</td>
I need to get "P.O. Box 1507"
Any simple examples? (I have the HTML in a string called sHTML) The HTML
is always exactly like this.
Ian
Håvard Boland (howard@botrykk.no)
It depends ...
[Original message clipped]
First you need to know whether other places in the page use the same font
tag; that can cause problems when mining.
Basically, you use a regular expression to pull the text out.
Try something like:
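' Requires: Imports System.Text.RegularExpressions
Dim r As Regex
Dim m As Match
Dim str As String
' Lookbehind for the opening font tag, lazy match, lookahead for </font>
r = New Regex("(?<=<font face=""MS Sans Serif"" size=""2"">)(.|\n)*?(?=</font>)", RegexOptions.IgnoreCase)
m = r.Match(sHTML)
str = m.Groups(0).Value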
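The lookaround approach can be sketched as a complete console program. This is only an illustration under some assumptions: the module name `LookaroundExample`, the helper `ExtractAddress`, and the sample HTML string are made up for the example, and the check on `m.Success` is an addition so that "no match" can be told apart from an empty address.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module LookaroundExample
    ' Hypothetical helper: returns the text between the opening font tag
    ' and </font>, or Nothing when the pattern does not match.
    Function ExtractAddress(ByVal sHTML As String) As String
        Dim pattern As String = _
            "(?<=<font face=""MS Sans Serif"" size=""2"">)(.|\n)*?(?=</font>)"
        Dim m As Match = Regex.Match(sHTML, pattern, RegexOptions.IgnoreCase)
        If m.Success Then
            ' Group 0 (m.Value) is the whole match; the lookarounds are not part of it.
            Return m.Value.Trim()
        End If
        Return Nothing
    End Function

    Sub Main()
        ' Sample input assumed to mirror the block from the question.
        Dim sHTML As String = _
            "<td width=""275"" colspan=""2"">" & Environment.NewLine & _
            "<font face=""MS Sans Serif"" size=""2"">P.O. Box 1507</font>" & Environment.NewLine & _
            "</td>"
        Console.WriteLine(ExtractAddress(sHTML))
    End Sub
End Module
```

If the same font tag appears elsewhere on the page, `Regex.Match` returns only the first occurrence, so the pattern may need more surrounding context to stay unambiguous.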
_______________________
:-: Howard Boland :-:
Darren Neimke
You could be verbose if you know that you will *always* be matching *exactly* that text:

\<td width="275" colspan="2"\>\s*?\<font face="MS Sans Serif" size="2"\>(.*?)\<\/font\>\s*?<\/td\>

[Original message from Ian Vink, sent Tue 8/20/2002 12:10 AM, clipped]
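A verbose pattern that spells out the whole td/font block can return the address through a capture group instead of lookarounds. The sketch below is one way that might look from VB.NET; the module name and the sample string are assumptions for illustration. It also drops the backslashes before `<`, `>` and `/`, a simplification on the assumption that these characters need no escaping in a .NET pattern.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module VerbosePatternExample
    Sub Main()
        ' Sample input assumed to mirror the block from the question.
        Dim sHTML As String = _
            "<td width=""275"" colspan=""2"">" & Environment.NewLine & _
            "    <font face=""MS Sans Serif"" size=""2"">P.O. Box 1507</font>" & Environment.NewLine & _
            "</td>"

        ' Whole block spelled out; \s*? absorbs the line breaks and indentation
        ' between the tags, and (.*?) captures the address as group 1.
        Dim pattern As String = _
            "<td width=""275"" colspan=""2"">\s*?" & _
            "<font face=""MS Sans Serif"" size=""2"">(.*?)</font>\s*?</td>"

        Dim m As Match = Regex.Match(sHTML, pattern)
        If m.Success Then
            ' Groups(0) is the entire matched block; Groups(1) is just the address.
            Console.WriteLine(m.Groups(1).Value)
        Else
            Console.WriteLine("No match")
        End If
    End Sub
End Module
```

Because `.` does not match a line break by default, this only works while the address stays on one line; `RegexOptions.Singleline` would be needed for a multi-line address.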
System.Text.RegularExpressions.RegexOptions