You may recall I was scanning documents and using AppleScript to extract a highlighted date from them. Since then, Hazel has started natively supporting date matching in its rules, which has been great. Tell Hazel the date format and maybe some text just before or after it so it can tell the start date from the end date, and it gets it right. Most of the time.
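For anyone curious what "date format plus a bit of nearby text" looks like outside of Hazel, here is a rough Python sketch of the same idea. It is not how Hazel works internally. The `MM/DD/YYYY` format, the labels "from" and "through", the sample text, and the function name `find_labeled_date` are all made up for illustration.

```python
import re
from datetime import date, datetime
from typing import Optional

# Assumed date format for this sketch: MM/DD/YYYY.
DATE_PATTERN = r"(\d{1,2}/\d{1,2}/\d{4})"


def find_labeled_date(text: str, label: str) -> Optional[date]:
    """Return the first date that directly follows `label`, or None."""
    # The label is the "text just before" the date; allow an optional colon.
    pattern = re.escape(label) + r"\s*:?\s*" + DATE_PATTERN
    match = re.search(pattern, text, flags=re.IGNORECASE)
    if match is None:
        return None
    try:
        return datetime.strptime(match.group(1), "%m/%d/%Y").date()
    except ValueError:
        # Looked like a date but was not a valid one (e.g. 13/45/2014).
        return None


# Hypothetical line from a scanned bill.
sample = "Billing period from: 03/01/2014 through: 03/31/2014"
start = find_labeled_date(sample, "from")
end = find_labeled_date(sample, "through")
print(start, end)
```

The label is what separates the start date from the end date: both dates share a format, so the format alone cannot say which is which.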
Sometimes, though, things go sideways. I can see the text in the document, but for whatever reason, Hazel isn’t matching. I ended up creating a service via Automator which extracts the text from the PDF and puts it up on the screen. Most times, I can then find the text I want and put it into the Hazel rule, and away we go.
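Once the text is out of the PDF, a small helper can make the mismatch easier to spot. The Python sketch below assumes the extracted text is already in a string (it does no PDF extraction itself) and prints the area around a keyword using `repr()`, so invisible characters and OCR slips show up. The function name `show_context` and the sample text are hypothetical.

```python
import re
from typing import List


def show_context(text: str, keyword: str, width: int = 20) -> List[str]:
    """Return repr() snippets of the text around each occurrence of keyword."""
    snippets = []
    for match in re.finditer(re.escape(keyword), text, flags=re.IGNORECASE):
        lo = max(0, match.start() - width)
        hi = min(len(text), match.end() + width)
        # repr() exposes things like non-breaking spaces and newlines.
        snippets.append(repr(text[lo:hi]))
    return snippets


# Hypothetical extracted text: a non-breaking space after "Due" and a
# letter "O" where the scan should have produced a zero.
extracted = "Amount Due\u00a0Date: O3/15/2014\nThank you"
for snippet in show_context(extracted, "due"):
    print(snippet)
```

Whatever the snippet shows is the text to paste into the rule, rather than what the page appears to say.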
This project is why I haven’t posted much over the past few weeks – I’ve been running a backlog of scanned PDFs through a bunch of rules to make sure I’ve got enough of it covered. Right now, I’ve got about 40 rules to identify the company that sent me the bill, and slightly fewer than that to identify the dates. 450 items have gone through the process, so I’m fairly confident that it’s working. A few more tweaks, such as tagging for things like “bills that aren’t automatically paid online” and watching for Social Security numbers, and I should be in good shape.