
Creating an Efficient Cache That Monitors a Folder of Markdown Files in a Workflow



Hi all,

 

I am designing a workflow in Python 3 and have some questions about how to design part of its functionality.

 

At a high level, I want this workflow to search a collection of Markdown files and return results based on certain attributes. These Markdown files contain front matter, which in turn contains tags.

 

Extracting the tags is no problem. I'm just stuck on the best way to store the data efficiently for lookup.

 

  • I was thinking of implementing some sort of dict with tags as the keys and file paths/names as the values.
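That dict idea could look something like the minimal sketch below. It assumes YAML-style front matter fenced by `---` lines with a `tags: [a, b]` entry; the parsing would need adjusting to whatever your actual front matter looks like:

```python
import re
from collections import defaultdict
from pathlib import Path

def build_tag_index(notes_dir):
    """Map each tag to the list of note paths that carry it.

    Assumes front matter delimited by '---' lines containing a
    'tags: [a, b, c]' entry; adapt the regexes to your format.
    """
    index = defaultdict(list)
    for path in Path(notes_dir).glob("*.md"):
        text = path.read_text(encoding="utf-8")
        front = re.match(r"---\n(.*?)\n---", text, re.DOTALL)
        if not front:
            continue
        tag_line = re.search(r"^tags:\s*\[(.*?)\]", front.group(1), re.MULTILINE)
        if tag_line:
            for tag in tag_line.group(1).split(","):
                index[tag.strip()].append(str(path))
    return dict(index)
```

The resulting dict serialises cleanly to JSON, which matters for the caching discussed below.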

 

My issue is that I want to:

 

  • Not have to rebuild this dictionary every time a user queries for a tag. So some sort of cache comes to mind.
  • I want to use fuzzy matching (e.g. the query 'gi' should match 'git'). But since I am extracting the data a step before I send it to Alfred, I can't use Alfred's fuzzy searching.
  • I've seen solutions using SQLite, but I think that might be overkill at this point. I'm not going to have 2,500+ notes.
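For the fuzzy-matching piece, one lightweight option that avoids extra dependencies is the standard library's difflib, combined with a plain substring check so prefixes like 'gi' always hit. A rough sketch:

```python
import difflib

def match_tags(query, tags, cutoff=0.5):
    """Return tags matching the query: substring hits first, then
    close fuzzy matches from difflib (stdlib, no extra dependencies)."""
    query = query.lower()
    substring = [t for t in tags if query in t.lower()]
    fuzzy = difflib.get_close_matches(query, tags, n=10, cutoff=cutoff)
    # Merge the two lists, preserving order and dropping duplicates.
    seen, results = set(), []
    for tag in substring + fuzzy:
        if tag not in seen:
            seen.add(tag)
            results.append(tag)
    return results
```

For example, `match_tags("gi", ["git", "python", "github"])` returns `["git", "github"]`.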

 

Another consideration is that I want to update the cache when files are added or deleted. I'm not sure of the best way to do this with Alfred or Python: monitoring a folder of files and updating the cache on additions and deletions so that it stays current.
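One dependency-free way to catch additions and deletions, without any folder monitoring at all, is to compare the folder's current file set against the paths recorded when the cache was built, and rebuild on any difference. A sketch:

```python
from pathlib import Path

def cache_matches_folder(notes_dir, cached_paths):
    """Cheap freshness check: compare the current set of Markdown
    files against the paths recorded when the cache was built.
    A rebuild is needed whenever a file appeared or disappeared."""
    current = {str(p) for p in Path(notes_dir).glob("*.md")}
    return current == set(cached_paths)
```

This runs once per query rather than continuously, which may be enough at this scale; it does not detect edits to existing files, only additions and deletions.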

 

Any tips/suggestions would be much appreciated.

 

Thanks!

 


Welcome @kostyafarber,

 

A great way to keep a cache is to build the JSON the Script Filter expects into alfred_workflow_cache, then just read the file directly. Tick the Alfred filters results option and set the match mode (click the (…)) to Word matching - Any order. Set the match field in the JSON for even greater control over how results are filtered.
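As a sketch of that approach (the tags.json filename and the item fields are just examples; alfred_workflow_cache is the environment variable Alfred exports to workflow scripts):

```python
import json
import os
from pathlib import Path

def write_cache(items):
    """Write Script Filter JSON into Alfred's per-workflow cache dir.

    Alfred exports alfred_workflow_cache to workflow scripts; fall
    back to the current directory when run outside Alfred.
    """
    cache_dir = Path(os.environ.get("alfred_workflow_cache", "."))
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_file = cache_dir / "tags.json"  # example filename
    cache_file.write_text(json.dumps({"items": items}), encoding="utf-8")
    return cache_file

# Each item follows the Script Filter JSON format, e.g.
# {"title": "git", "subtitle": "2 notes", "arg": "/notes/git.md"}
```

The Script Filter itself then only has to cat that file, so queries stay fast.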


Then it depends on your exact needs. For example, each time you ↩ a result, you could check if the cache is older than X time and rebuild if it is.
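That age check might look like the following; the one-hour threshold is just an example to tune:

```python
import time
from pathlib import Path

def cache_is_stale(cache_file, max_age_seconds=3600):
    """True when the cache is missing or its mtime is older than
    max_age_seconds (an arbitrary example threshold)."""
    path = Path(cache_file)
    if not path.exists():
        return True
    return (time.time() - path.stat().st_mtime) > max_age_seconds
```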

 

1 hour ago, kostyafarber said:

Another consideration is that I would want to update the cache when files have been added and deleted.

 

That requires monitoring. The way I would recommend for that is using launchd, which is made by Apple and ships with macOS. launchd.info is a great reference. You can tell it to monitor a directory, though subdirectories aren't matched (probably for performance reasons); you can add each path individually instead.


You set it up so that on every change, it calls an External Trigger to rebuild the cache.
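A launchd job wired up that way might look roughly like this plist sketch, saved under ~/Library/LaunchAgents; the label, watched path, bundle ID, and trigger name are all placeholders to replace with your own:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.example.notes-watcher</string>
    <key>ProgramArguments</key>
    <array>
        <string>/usr/bin/osascript</string>
        <string>-e</string>
        <string>tell application id "com.runningwithcrayons.Alfred" to run trigger "rebuild-cache" in workflow "your.workflow.bundleid"</string>
    </array>
    <key>WatchPaths</key>
    <array>
        <string>/Users/you/Notes</string>
    </array>
</dict>
</plist>
```

WatchPaths makes launchd run the job whenever anything in the listed paths changes, and the AppleScript one-liner fires the workflow's External Trigger to rebuild the cache.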

