Web Scraping intro with AutoHotKey 101-Getting data from a page, handles & pointers

Web Scraping Intro with AutoHotkey

Being able to programmatically navigate to a web page and scrape its contents reliably is one of the best things invented since sliced bread! I spent years manually going through pages, copying and pasting contents from IE to Excel, and then spent even more time trying to clean them up. Done properly, you can get the data very, very close to how it appears on the web with little effort.

The video below walks through using AutoHotkey to obtain basic values from a web page. It also demonstrates a script I wrote that helps write the syntax (yes, I’m that lazy!). The AutoHotkey script I wrote is further down this page and can also be found on the AHK forum here.

In this beginner tutorial I show how to:
1) get a pointer to IE
2) navigate to a page
3) get text from a page
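
The video does all three steps from AutoHotkey through Internet Explorer. As a rough, language-neutral illustration of step 3 only, here is a minimal Python sketch that pulls the text of one element out of HTML you already have in hand. Everything in it is an assumption made for illustration: the sample page, the element ids, and the helper names `TextById` and `get_text` are hypothetical, and it uses Python's standard `html.parser` rather than the IE pointer shown in the video.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not change nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class TextById(HTMLParser):
    """Collect the text inside the first element whose id equals target_id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0       # > 0 while the parser is inside the target element
        self.found = False   # True once the target element has been closed
        self.chunks = []     # pieces of text seen inside the target element

    def handle_starttag(self, tag, attrs):
        if self.found or tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1                      # nested tag inside the target
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1                       # entered the target element

    def handle_endtag(self, tag):
        if self.depth and tag not in VOID_TAGS:
            self.depth -= 1
            if self.depth == 0:
                self.found = True                # left the target element

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def get_text(html, element_id):
    """Return the stripped text of the element with this id, or '' if absent."""
    parser = TextById(element_id)
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks).strip()


# Hypothetical sample page standing in for a real site.
SAMPLE_PAGE = (
    '<html><body><h1 id="title">Quarterly <b>Sales</b></h1>'
    '<p id="total">1,234</p></body></html>'
)

if __name__ == "__main__":
    print(get_text(SAMPLE_PAGE, "title"))
    print(get_text(SAMPLE_PAGE, "total"))
```

Looking an element up by its id is usually the most stable choice, because ids tend to survive layout changes better than positions in the page.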

Webscraping with AHK 101-Pointer and getting values from page

Here is the script writer to use during your web scraping intro with AutoHotkey.

Web Scraping with AutoHotKey 105- trouble shooting web scraping

Troubleshooting Web Scraping

When building my first scraping scripts I used to waste a ton of time trying to figure out what was broken. Adding some structure to your diagnostic process can greatly speed up detecting what has gone wrong. A copy of the AutoHotkey syntax writer can be found here.

One excellent piece of advice, not exclusive to troubleshooting web scraping, is to have a bobblehead doll or something similar to talk to. Pretend you’re explaining your issue to a friend; often, when you hear yourself say the words, the problem will become obvious.

This video offers some general tips for troubleshooting web scraping when using AutoHotkey.

Webscraping with AHK 105- General trouble shooting tips

Programmatically interact with the SciTE editor via COM objects

SciTE editor via COM objects

SciTE is a great IDE that I use with AutoHotkey, SPSS, SQL, Python, XML, HTML, etc. I love being able to use regular expressions in it to manipulate text, and it has some very cool capabilities. This video is one of my favorite demonstrations of how powerful SciTE can be at manipulating text.

Here is a short tutorial and demonstration on how to manipulate the SciTE editor via COM objects and Windows commands with AutoHotkey.

SciTE editor via COM objects- Editor Windows Commands

A specific version of the SciTE editor for AutoHotKey can be downloaded here and more generic documentation can be found here.

Automating email metrics w/AutoHotKey, Excel, & RegEx on the naming convention

Automating email metrics

I’ve written an AutoHotkey script that logs into our vendor website and uses web scraping to export the recent email campaigns. It then breaks them out by region, campaign type, business unit, etc., and links them back to our SharePoint site for more information.

What’s really cool is that I’m putting in formulas (not static numbers). If some of the items are not classified (for example, someone used our mailing name incorrectly, so the RegEx broke), they can go back and update the code and all the numbers will adjust!
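
To make the idea concrete, here is a minimal Python sketch of classifying campaigns by a naming convention and emitting spreadsheet formulas instead of static totals. It is not the script from this post: the convention `REGION_CampaignType_BusinessUnit_YYYYMM`, the column letters, and the helper names `classify` and `sumif_formula` are all hypothetical, chosen only to illustrate the approach.

```python
import re

# Hypothetical naming convention: REGION_CampaignType_BusinessUnit_YYYYMM
# e.g. "EMEA_Newsletter_Retail_202401"
NAME_PATTERN = re.compile(
    r"^(?P<region>[A-Z]{2,4})_"
    r"(?P<campaign_type>[A-Za-z]+)_"
    r"(?P<business_unit>[A-Za-z]+)_"
    r"(?P<period>\d{6})$"
)

FIELDS = ("region", "campaign_type", "business_unit", "period")


def classify(name):
    """Split a mailing name into its parts, or mark every part Unclassified."""
    match = NAME_PATTERN.match(name.strip())
    if match is None:
        # The name does not follow the convention, so nothing can be trusted.
        return {field: "Unclassified" for field in FIELDS}
    return match.groupdict()


def sumif_formula(criteria_col, value, sum_col):
    """Build an Excel SUMIF formula string rather than a precomputed number.

    Because the cell holds a formula, correcting a misnamed row in the
    criteria column changes the total without re-running the script.
    """
    return f'=SUMIF({criteria_col}:{criteria_col},"{value}",{sum_col}:{sum_col})'


if __name__ == "__main__":
    for mailing in ("EMEA_Newsletter_Retail_202401", "Spring promo blast"):
        print(mailing, "->", classify(mailing))
    # Hypothetical layout: column B holds the region, column F the send count.
    print(sumif_formula("B", "EMEA", "F"))
```

Keeping an explicit "Unclassified" bucket makes naming mistakes visible in the report instead of silently dropping those campaigns.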
