Pseudowork

I made an art:

I've been working on generating my own metadata for our local (television) programs based on closed captioning content. The first round of scraping text from the captions yielded just over 2 million words (after a long, dreadful cleanup), and I'm about a third of the way through our programming.
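For a rough idea of what that kind of cleanup and word count can involve, here is a minimal sketch. It is not the actual script: the function names are made up for illustration, and it assumes the raw caption text uses bracketed cues like `[MUSIC]` and `>>` speaker-change markers, which is common but not universal.

```python
import re
from collections import Counter


def clean_caption_text(raw: str) -> str:
    """Strip common caption debris (assumed conventions, see above)."""
    text = re.sub(r"\[[^\]]*\]", " ", raw)  # bracketed cues such as [MUSIC]
    text = re.sub(r">>+", " ", text)        # speaker-change markers
    text = re.sub(r"\s+", " ", text)        # collapse runs of whitespace
    return text.strip()


def count_words(text: str) -> Counter:
    """Lowercase the text and tally each word or number."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(words)
```

Summing the counter's values gives the total word count, and the counter itself is the word-frequency table a word-cloud generator works from.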

The database won't be ready for a while, so I crammed the raw text into one of those word-cloud generators. Not as neat as searching the archive for "Indy 500" and getting a list of every show to mention it (with links that start video playback 5 seconds before it appears), but still pretty neat.
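The archive search could look something like the sketch below, assuming each caption line is stored with its show and a start time in seconds. The `Cue` record, the `find_mentions` name, and the field names are all hypothetical; the real database may end up looking nothing like this.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    show: str             # program title (hypothetical field)
    start_seconds: float  # when the caption appears in the video
    text: str             # caption text for this cue


def find_mentions(cues, phrase, lead_in=5.0):
    """Return (show, playback_start) pairs for cues containing the phrase.

    Playback starts lead_in seconds before the caption appears,
    clamped so it never goes below zero.
    """
    needle = phrase.lower()
    hits = []
    for cue in cues:
        if needle in cue.text.lower():
            hits.append((cue.show, max(0.0, cue.start_seconds - lead_in)))
    return hits
```

One known gap in this sketch: a phrase split across two caption lines ("Indy" at the end of one, "500" at the start of the next) would be missed, so a real version would probably need to search across neighboring cues.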
