Installing Django and Pulling URLs and Titles from Safari Tabs - Parts 1 and 2

`youtube: https://www.youtube.com/watch?v=WoAD9nUEQJA`

`youtube: https://www.youtube.com/watch?v=pDlonZSL6q4`

### [Time: 00:00:00] TextExpander ISO 8601 Snippet Fix

I took a look at the source code for the page and it looked fine (though I realized when reviewing the stream that it wasn't). So, I SSH'd into my server and fixed it that way. I had to do that because the link actually pointed to the old instance of my site. It's still live at `http://alanwsmith.com`, as opposed to the new production version of my site, which is `https://www.alanwsmith.com`. (It's on my list to get everything set up so the non-secure, non-www version redirects. Just gotta get to it.)

### [Time: 00:19:00] Installing

I've decided to use Django to replace my local `launchpad` website that's just a bunch of PHP files. (Django, not Drupal, which I confuse the name of every time.) Someone asked why I was going with Django instead of sticking with PHP. My thinking:

- When I write code these days, it's mostly python and that's what Django uses
- I want to use a framework instead of just writing a bunch of code myself. (I've made my own frameworks. I'm happy to not have to do that anymore.)
- I'm not religious about any framework or language stuff so it just works for me

Once I figured out that I was after Django instead of Drupal, I did the quick install then spent about an hour going through the tutorial. Half the time was me getting frustrated with it. I'd looked at Django a few years back and remember the same frustrations. It's not broken. The code works, but there is so much room for improvement. I added "Make a Better Django Tutorial" to my list.

If it wasn't so frustrating, I would have spent more time on it. But, I was beat, so I bailed. I'll get back to it on another stream.

### [Time: 01:26:00] Getting Browser Tab URLs

This is a new one that occurred to me when writing up earlier stream notes. I spent a lot of time going back and forth between my text editor and my browser copying URLs and then typing in the titles and notes for the various links I used. My goal was to create a script to automate that process as much as possible.

What's awesome is that most of the work was already done for me. I found this AppleScript that grabs titles and URLs from all the tabs in all the windows of Safari and copies them to the clipboard. One quick edit to put the output in Markdown format and I could have stopped there, but I wanted a little more.
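The Markdown edit mentioned above was made inside the AppleScript itself. As a rough illustration of the same idea in Python, here is a minimal sketch that turns title/URL pairs into a Markdown list of links. The function name `tabs_to_markdown` and the sample data are hypothetical and not taken from the original script.

```python
def tabs_to_markdown(tabs):
    """Format (title, url) pairs as a Markdown bullet list of links."""
    lines = []
    for title, url in tabs:
        # Escape square brackets so they don't break the link text.
        safe_title = title.strip().replace("[", "\\[").replace("]", "\\]")
        # Fall back to the URL when a tab has no title.
        lines.append(f"- [{safe_title or url}]({url})")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical sample data standing in for real Safari tabs.
    sample_tabs = [
        ("Example Domain", "https://example.com/"),
        ("", "https://example.org/"),
    ]
    print(tabs_to_markdown(sample_tabs))
```

Keeping the formatting in a small function like this makes it easy to test without having Safari open.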

The first thing I was after was a way to fire off the AppleScript from a PHP page on my local launchpad tools site. I'd done some `osascript` calls to fire AppleScript over the past week so I was optimistic I could get it to work.

I couldn't.

I spent a decent amount of time on it and kept seeing behavior that looked like it should have worked, but didn't. I have some more ideas to try, but given that I'm moving to Django it wasn't worth spending more time on it. Instead, I moved over to using a plain old Python script. I went that route instead of just using the working AppleScript because I wanted to capture the meta tag descriptions of the pages as well.

It might be possible to pull down a web page and parse it with AppleScript, but I have no desire to try. Hence, Python. I started using Selenium to do the parsing but ran into some aggravating issues with getting the title of the page (which I didn't need; it was just what I was using as a test).

Seems that for every element on a page, Selenium uses:

Code

# Selenium 4 spells this driver.find_element(By.TAG_NAME, "title")
element = driver.find_element_by_tag_name("title")
print(element.text)

Every element except for the title that is. You have to get the title with `driver.title`. Here's an example:

Code

#!/usr/bin/env python3

from selenium import webdriver
from selenium.webdriver.firefox.options import Options

def get_details(url):
    options = Options()
    # Run Firefox without opening a window. Newer Selenium releases drop the
    # `headless` property in favor of options.add_argument("-headless").
    options.headless = True
    driver = webdriver.Firefox(options=options)
    try:
        driver.get(url)
        # The title comes from the driver itself, not from an element lookup.
        return driver.title
    finally:
        # Always shut the browser down, even if the page load fails.
        driver.quit()

print(get_details("https://www.alanwsmith.com/"))

Once I got that figured out, I moved on to getting the description. This is around the time I ran out of steam in the first stream and picked up in the second one. In that second stream, the code got complicated enough that I moved over from just a little script to a more structured one that included tests. I made a lot of progress on that, but haven't finished it up yet. I'll do that in the next stream.
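The description code from the stream isn't finished yet, so it isn't shown here. As a stand-in, here is a minimal sketch of pulling a meta description out of an HTML string. The stream's links point toward Beautiful Soup; this version swaps in the standard library's `html.parser` so it runs without extra installs. The names `MetaDescriptionParser` and `get_description` are hypothetical.

```python
from html.parser import HTMLParser


class MetaDescriptionParser(HTMLParser):
    """Collect the content of the first <meta name="description"> tag."""

    def __init__(self):
        super().__init__()
        self.description = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.description is not None:
            return
        attributes = dict(attrs)
        # HTMLParser lowercases attribute names but not their values.
        if (attributes.get("name") or "").lower() == "description":
            self.description = (attributes.get("content") or "").strip()


def get_description(html_doc):
    """Return the meta description, or an empty string if there isn't one."""
    parser = MetaDescriptionParser()
    parser.feed(html_doc)
    return parser.description or ""


if __name__ == "__main__":
    sample = '<html><head><meta name="description" content="Stream notes"></head></html>'
    print(get_description(sample))
```

Returning an empty string for a missing tag keeps the calling code simple, since plenty of pages don't set a description at all.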

### Miscellaneous

Random stuff from the stream.

Python snippet that calls an external command and captures its STDOUT output in a variable:

Code

import subprocess

command_response = subprocess.run(['osascript', 'tab-parser.scpt'], stdout=subprocess.PIPE).stdout.decode('utf-8')
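The one-liner above assumes the command exists and succeeds. Here is a hedged sketch of a slightly more defensive wrapper. The function name `run_command` is hypothetical, and the demo uses the Python interpreter as a stand-in command so it runs on machines without `osascript`.

```python
import subprocess
import sys


def run_command(command):
    """Run a command and return its STDOUT as text, or "" if it fails."""
    try:
        result = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            check=True,  # raise CalledProcessError on a non-zero exit code
        )
    except (OSError, subprocess.CalledProcessError):
        # OSError covers a missing executable; the other covers a failed run.
        return ""
    return result.stdout.decode("utf-8")


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere. On a Mac this could be
    # ["osascript", "tab-parser.scpt"] as in the snippet above.
    print(run_command([sys.executable, "-c", "print('hello')"]))
```

Passing the command as a list avoids shell quoting problems, which matters once script paths or arguments contain spaces.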

These PHP calls to `osascript` fired and got Safari to activate and bring itself to the front.

Code

shell_exec("osascript -e 'tell application \"Safari\" to activate'");

shell_exec("osascript -l JavaScript -e 'var Safari = new Application(\"/Applications/Safari.app\"); Safari.activate();'");

But, when I tried to run the AppleScript file, I couldn't get it to work.

Code

// no go
echo(shell_exec('osascript tab-parser.scpt'));

I tried this and about a thousand variations. I expect there's a security thing involved. There's probably a way to do it, but it wasn't worth more effort for me.

I discovered that when you use python's `urllib.request.urlopen(url)`, you need to wrap it in a `try`. Otherwise, your script will explode if it hits something like a 403 error. (And since we're talking about it, you also need to decode the `.read()` call to utf-8). Here's a sample:

Code

#!/usr/bin/env python3

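import urllib.error
import urllib.request

def get_web_page(url):
    try:
        with urllib.request.urlopen(url) as response:
            # .read() returns bytes, so decode them into a string
            return response.read().decode("utf-8")
    except (urllib.error.URLError, UnicodeDecodeError):
        # URLError also covers HTTP errors like 403 (HTTPError is a subclass)
        return ""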

if __name__ == "__main__":
    html_doc = get_web_page("https://www.alanwsmith.com/")
    print(html_doc)
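Some servers answer with a 403 when they see the default Python User-Agent, so one common workaround is to send a browser-like header. Here is a minimal sketch of that idea, which also reads the charset from the response instead of assuming UTF-8. The `fetch_text` name and the `USER_AGENT` value are hypothetical, and this wasn't shown on the stream.

```python
import urllib.error
import urllib.request

# Hypothetical User-Agent string; any browser-like value could be used.
USER_AGENT = "Mozilla/5.0 (stream-notes-sketch)"


def fetch_text(url, timeout=10):
    """Return the decoded body of `url`, or "" if the request fails."""
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            # Use the charset the response declares, defaulting to UTF-8.
            charset = response.headers.get_content_charset() or "utf-8"
            return response.read().decode(charset, errors="replace")
    except urllib.error.HTTPError as error:
        # HTTPError carries the status code (403, 404, and so on).
        print(f"HTTP {error.code} for {url}")
    except urllib.error.URLError as error:
        # URLError covers DNS failures, refused connections, and the like.
        print(f"Could not reach {url}: {error.reason}")
    return ""


if __name__ == "__main__":
    # A data: URL keeps the demo off the network.
    print(fetch_text("data:text/plain;charset=utf-8,hello"))
```

Catching `HTTPError` before `URLError` matters because the first is a subclass of the second.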

### Links From The Stream

Here are some of the links from the stream.

- AppleScript - How to execute a multi line applescript from Terminal – MacOS X Software – Forum
- Beautiful Soup 4.9.0 documentation
- Beautiful Soup Finding if a tag exists - Stack Overflow
- beautifulsoup4 · PyPI
- Built-in Types — Python 3.8.6 documentation
- Call a function from another file? - Stack Overflow
- Capture all tabs in Safari as URLs to the clipboard – theconsultant.net
- Code inspections - Help
- Convert bytes to a string - Stack Overflow
- Converting to one line AppleScript - Stack Overflow
- Daring Fireball
- Data Types — Python 3.8.6 documentation
- Errors and Exceptions — Python 3.8.6 documentation
- Extract title with BeautifulSoup - Stack Overflow
- ForLoop - Python Wiki
- Get data from webpage using applescript - Stack Overflow
- Get meta tag content property with BeautifulSoup and Python - Stack Overflow
- Get page title with Selenium WebDriver using Java - Stack Overflow
- Get webpage contents with Python? - Stack Overflow
- get_attribute() element method - Selenium Python - GeeksforGeeks
- GetTitle
- Glossary — Python 3.8.6 documentation
- How do I open a generic URL from AppleScript? - Ask Different
- How do you start an application in Javascript via osascript? - Stack Overflow
- How to call an external command? - Stack Overflow
- How to disable auto show hints in JetBrains IDEs (IntelliJ IDEA, PyCharm, WebStorm) on mouse over - Stack Overflow
- How to log to message window in Script Editor using JavaScript for Automation - Stack Overflow
- How to use string.replace() in python 3.x - Stack Overflow
- How to write applescript to print TextEdi… - Apple Community
- How to write to standard data out using JavaScript or AppleScript multiple times? - Stack Overflow
- HTTP error 403 in Python 3 Web Scraping - Stack Overflow
- Is there a built-in function to print all the current properties and values of an object? - Stack Overflow
- Linux see directory tree structure using tree command - nixCraft
- Locating Elements — Selenium Python Bindings 2 documentation
- Meta tags and BeautifulSoup
- Model field reference
- More Control Flow Tools — Python 3.8.6 documentation
- Ned Batchelder: Keep data out of your variable names
- PHP
- PHP: shell_exec - Manual
- Print to Stdout with applescript - Stack Overflow
- Pythex: a Python regular expression editor
- Python BeautifulSoup check if find returns Null object
- Python Data Types
- Python: Assign split value to multiple variables - Stack Overflow
- Running shell command and capturing the output - Stack Overflow
- Scraping Data on the Web with BeautifulSoup
- Scraping metadata with beautifulsoup : learnpython
- selenium - getTitle() returning current URL instead of page title - Software Quality Assurance & Testing Stack Exchange
- Selenium How To Get Title Text? - Stack Overflow
- Set up a Git repository - Help
- sets — Unordered collections of unique elements — Python 2.7.18 documentation
- Settings
- Sorting a set of values - Stack Overflow
- Sorting HOW TO — Python 3.3.7 documentation
- Split and Count a Python String - Stack Overflow
- Test if children tag exists in beautifulsoup - Stack Overflow
- The try statement
- Time Zone Abbreviations - Time Zones in North America
- Time Zone Abbreviations - Worldwide List
- Time Zones in New York, United States
- urllib.request — Extensible library for opening URLs — Python 3.8.6 documentation
- webdriver - Question about the Selenium getTitle() Method - Software Quality Assurance & Testing Stack Exchange
- WebDriver API — Selenium Python Bindings 2 documentation
- What is close() and quit() commands in Selenium Webdriver?
- Writing your first Django app, part 3