Month: January 2014

Scanned Automation with Hazel, Revisited

You may recall I was scanning documents and using AppleScript to extract a highlighted date from them. Since then, Hazel has started natively supporting date matching in its rules, which has been great. Tell Hazel the date format, and maybe some text just before or after the date so it can tell a start date from an end date, and it gets it right. Most of the time.

Sometimes, though, things go sideways. I can see the text in the document, but for whatever reason, Hazel isn’t matching. I ended up creating a service via Automator which extracts the text from the PDF and puts it up on the screen. Most times, I can then find the text I want and put it into the Hazel rule, and away we go. 
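A rough shell approximation of that "show me the text" check, as a sketch only. It assumes the text has already been pulled out of the PDF by some other tool (Poppler's pdftotext is one option, but it does not ship with OS X and is not necessarily what the Automator service does), and find_dates is a made-up helper name. It prints every line containing something shaped like 1/31/2014 or 01/31/14, so the words around the date are visible and can be copied into the Hazel rule.

```shell
#!/bin/sh
# find_dates: hypothetical helper, not part of Hazel or Automator.
# Reads plain text on stdin and prints, with line numbers, each line
# that contains a date shaped like 1/31/2014 or 01/31/14.
find_dates() {
    # "|| true" keeps the function from failing when nothing matches.
    grep -nE '[0-9]{1,2}/[0-9]{1,2}/([0-9]{4}|[0-9]{2})' || true
}

# Example usage (assumes pdftotext is installed; "-" sends the text to stdout):
#   pdftotext scanned-bill.pdf - | find_dates
```

The date pattern is only one possible format; a bill that spells out the month would need a different expression.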

This project is why I haven’t posted much the past few weeks – I’ve been running a backlog of scanned PDFs through a bunch of rules to make sure I’ve got enough of it covered. Right now, I’ve got about 40 rules to identify the company that sent me the bill, and slightly fewer than that to identify the dates. So far, 450 items have gone through the process, so I’m fairly confident that it’s working. A few more tweaks (tagging for things like “bills that aren’t automatically paid online” and watching for Social Security numbers) and I should be in good shape.

Importing Mavericks Tags to Evernote

I have a Hazel-based workflow defined for scanned documents. With the release of Mavericks, I added tagging to that flow for things like the company that generated the document, plus a few tags about what the doc covers (cars, kids, etc.). However, I like keeping the docs in Evernote. So, as I usually do, I turned to AppleScript and came up with this:

set theActualFile to choose file
set theFile to quoted form of (POSIX path of (theActualFile))

set origDelim to AppleScript's text item delimiters
set newDelim to ","
set AppleScript's text item delimiters to newDelim

-- Get tags from file
set userTags to (do shell script "mdls -raw -name kMDItemUserTags " & theFile)
set theTags to (every text item of userTags)

tell application "Evernote"
    set theNote to create note from file theActualFile

    -- Clean up the list, removing whitespace, quotes, parentheses, tabs, newlines
    repeat with theTag in theTags
        -- sed command in line below taken from stib's answer on this page: http://stackoverflow.com/questions/2783713/applescript-cleaning-a-string
        set theTag to do shell script "echo " & quoted form of theTag & "| tr -d '\n'"
        set theTag to do shell script "echo " & quoted form of theTag & "| tr -d '\r'"
        set theTag to do shell script "echo " & quoted form of theTag & "| tr -d '[:blank:]'"
        set theTag to do shell script "echo " & quoted form of theTag & "|sed \"s/[^[:alnum:][:space:]]//g\""

        -- TODO need to clean out newline as well
        if not (tag named theTag exists) then
            set my_new_tag to (make new tag with properties {name:theTag})
            assign my_new_tag to theNote
        else
            assign tag theTag to theNote
        end if
    end repeat
end tell

set AppleScript's text item delimiters to origDelim

 

The workflow asks for a file to be chosen (I’ll fix that so it accepts a file argument later when I’m ready to make it a part of the automated flow), gets the tags as a string (via mdls), splits that string on commas, and then strips the newlines, tabs, other whitespace and punctuation from each of the tags.

Two caveats.

1) I assume it will split apart any tag with a comma in it.
2) It will remove whitespace from inside the tag as well – so a tag of “credit card” will become “creditcard”… but I’m okay with that.
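To make the cleanup steps and both caveats concrete, here is the same pipeline as a standalone shell sketch. The function name clean_tags is hypothetical, and the sample input in the usage note is an assumption about what mdls prints for a tag list (a parenthesized, comma-separated list with quotes around tags that contain spaces), so check it against real output before relying on it.

```shell
#!/bin/sh
# clean_tags: hypothetical helper mirroring the tr/sed steps in the script above.
# Reads the raw text of `mdls -raw -name kMDItemUserTags <file>` on stdin
# and prints one cleaned tag per line.
clean_tags() {
    tr -d '\n\r' |                          # join the multi-line list into one line
    tr ',' '\n' |                           # split on commas (caveat 1)
    tr -d '[:blank:]' |                     # drop spaces and tabs (caveat 2)
    sed 's/[^[:alnum:][:space:]]//g' |      # drop quotes, parentheses, punctuation
    sed '/^$/d'                             # skip anything that ends up empty
}

# Example usage on a Mac:
#   mdls -raw -name kMDItemUserTags "/path/to/scan.pdf" | clean_tags
```

Fed an assumed tag list of Bills, "credit card" and "Acme, Inc.", this prints Bills, creditcard, Acme and Inc on separate lines: the comma splits the last tag in two, and the space inside "credit card" disappears.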

Here’s the AppleScript for your downloading pleasure: Import File to Evernote.scpt