Friday, December 25, 2009

Scraping Web Pages with VBScript and KiXtart

Here’s a quick example of how to scrape link (HREF) and image (IMG SRC) URLs from a web page using VBScript and KiXtart. Both scripts drive Internet Explorer through the InternetExplorer.Application COM object and read the results from the loaded document. Thanks to Paul Sadowski for the basis of the VBScript example, which I modified only slightly.

VBScript Code

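' Run with cscript.exe so each WScript.Echo prints to the console
' instead of opening a message box.
Option Explicit

Dim url, ie, link, img

url = "http://www.textpad.com"

' Start Internet Explorer through COM and load the page
Set ie = CreateObject("InternetExplorer.Application")
ie.Navigate url

' Wait until the page has finished loading (READYSTATE_COMPLETE = 4)
Do Until ie.ReadyState = 4
    WScript.Sleep 100   ' milliseconds
Loop

WScript.Echo "DOCUMENT HYPERLINKS" & vbCrLf
For Each link In ie.Document.Links
    ' Print each link's URL followed by its visible text
    WScript.Echo link.href, link.innerText
Next

WScript.Echo "------------------------------------"
WScript.Echo "DOCUMENT IMAGE TAGS" & vbCrLf

For Each img In ie.Document.Images
    WScript.Echo img.src
Next

' Close the browser so no iexplore.exe process is left running
ie.Quit
Set ie = Nothing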


KiXtart Code



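Break On

$url = "http://www.textpad.com"

; Start Internet Explorer through COM and load the page
$ie = CreateObject("InternetExplorer.Application")
$ie.Navigate($url)

; Wait until the page has finished loading (READYSTATE_COMPLETE = 4).
; Sleep takes seconds in KiXtart, unlike WScript.Sleep's milliseconds.
While $ie.ReadyState <> 4
    Sleep 1
Loop

? "DOCUMENT HYPERLINKS"

For Each $link In $ie.Document.Links
    ; Print each link's URL and its visible text
    ? $link.href + " = " + $link.innerText
Next

? "------------------------------------"
? "DOCUMENT IMAGE TAGS"

For Each $pix In $ie.Document.Images
    ? $pix.src
Next

; Close the browser so no iexplore.exe process is left running
$ie.Quit()
$ie = 0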
