Leveraging the Document Object Model
This third video on Web Scraping gets a little more advanced and shows how you can leverage the DOM to make extracting data from a webpage much easier and more reliable.
Leveraging the Document Object Model (DOM) will take some practice (especially if you're not familiar with object-oriented coding), but it is well worth it because it greatly reduces the amount of clean-up you have to do after you extract your data. I used to write some pretty crazy regular expressions to try to clean up what I had extracted. Once I learned how to better navigate the DOM, it negated the need for cleaning!
Video Leveraging the DOM plus looping over pages
The syntax for writing the web scraping code can be found in my first post here. There is also an AutoHotkey forum thread you might wish to review here.
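To give a rough idea of what navigating the DOM and looping over pages can look like, here is a minimal sketch in AutoHotkey v1. It assumes Internet Explorer's COM interface is available on your machine. The URL, the number of pages, and the position of the table and its columns are all hypothetical placeholders, so you would need to adjust them to the site you are scraping.

```autohotkey
; Sketch only (AutoHotkey v1). The URL and table layout are hypothetical placeholders.
wb := ComObjCreate("InternetExplorer.Application") ; start an IE instance via COM
wb.Visible := true

output := ""
Loop, 3 ; visit pages 1 to 3 of the hypothetical site
{
    wb.Navigate("https://example.com/listings?page=" . A_Index)
    while (wb.Busy || wb.ReadyState != 4) ; wait until the page has finished loading
        Sleep, 100

    rows := wb.document.getElementsByTagName("table")[0].rows ; first table on the page
    Loop, % rows.length
    {
        cells := rows[A_Index - 1].cells ; DOM collections are zero-based
        output .= cells[0].innerText . A_Tab . cells[1].innerText . "`n"
    }
}
wb.Quit()
MsgBox, % output
```

Because each cell is read directly through its innerText property, the values arrive without the surrounding HTML, which is why so little clean-up is needed afterwards.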
The term Web Scraping is well known, but I've yet to hear much talk about "Web Pasting". In this Web Scraping Example with AutoHotkey video, I demonstrate how I helped my real estate agent automate checking the status of a house on a website.
Web Scraping example with AutoHotkey
Here is a second version where, depending on the status of the home, I write a follow-up email via COM, either in Outlook or within the website itself.
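As an illustration of the Outlook route, the sketch below drafts a follow-up email through Outlook's COM interface. It is not the script from the video: the status text, the recipient address, and the wording of the message are hypothetical, and it assumes Outlook is installed.

```autohotkey
; Sketch only (AutoHotkey v1). The status text, recipient, and wording are hypothetical.
status := "Pending" ; in practice this would be read from the page's DOM

if (status = "Pending")
    body := "The listing now shows as pending. Can we follow up with the seller?"
else
    body := "The listing status is currently: " . status

outlook := ComObjCreate("Outlook.Application") ; requires Outlook to be installed
mail := outlook.CreateItem(0) ; 0 = olMailItem
mail.To := "agent@example.com"
mail.Subject := "House status: " . status
mail.Body := body
mail.Display() ; open the draft for review instead of sending it automatically
```

Using Display rather than Send leaves the final decision with you, which seems a sensible default while you are still testing the script.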
While it is easy to use Excel to transpose rows into columns, I've created an AutoHotkey script which removes the need to. It examines what is on the clipboard, replaces the tabs with line breaks, and then sends a paste to the active program, so the clipboard content is transposed as it is pasted. I have it triggered by a hotkey; it comes in pretty handy and can save some time if transposing is something you do frequently.
Below is a video which demonstrates its usage.
^+t:: ;Control, Shift and T trigger the script
StringReplace, clipboard, clipboard, %A_Tab%, `r, All ;replace all tabs with a carriage return
SendPlay, ^v ;send "paste"
return ;end of the hotkey
Transposing clipboard content