adding an RSS feed of recent activity to my org-roam digital garden with org-publish

planted: 13/08/2023
last tended: 19/08/2023

So, I wanted to add an RSS feed for activity in my digital garden.

1. Why?

Mainly so that I could include a widget on my WordPress site that would surface the latest changes from the garden. The garden tends to get more action than my stream at the moment, but it's a bit hidden away.

Also, potentially, I could add the RSS feed to a Mastodon bot, which could be kind of fun.

2. How?

I'm using org-roam to write and org-publish to publish my digital garden. So I need something that works with that setup.

ox-rss exists. However, it expects one file with a heading per entry in order to produce its RSS feed. That's not how org-roam works - you have one file per entry.

So you need to get something from your org-roam files in a format for ox-rss to work with. Luckily you can (ab)use org-publish's sitemap functionality.

I think this is the first post that described how to do that: Org mode blogging: RSS feed

I found a few other posts that use a similar setup, and they usually look to be based on that original one, e.g. Website With Emacs, Blogging with Emacs and Org.

I used that and it works.

("commonplace-rss"
 :base-directory ,temp-dir
 :base-extension "org"
 :publishing-directory ,publish-dir
 :publishing-function commonplace/publish-rss-feed
 :rss-extension "xml"
 :html-link-home ,commonplace/publish-url
 :html-link-use-abs-url t
 :html-link-org-files-as-html t
 :auto-sitemap t
 :sitemap-function commonplace/generate-org-for-rss-feed
 :sitemap-title "Recent activity in Neil's Digital Garden"
 :sitemap-filename "recentchanges-feed.org"
 :sitemap-style list
 :sitemap-sort-files anti-chronologically
 :sitemap-format-entry commonplace/format-rss-feed-entry)

Add a new component to your org-publish-project-alist.

(defun commonplace/generate-org-for-rss-feed (title sitemap)
  "Generate a sitemap of posts that is exported as an RSS feed.
TITLE is the title of the RSS feed.  SITEMAP is an internal
representation for the files to include."
  (let* ((posts (cdr sitemap))
         (last-hundred (seq-subseq posts 0 (min (length posts) 100))))
    (concat "#+TITLE: " title "\n\n"
            (org-list-to-subtree (cons (car sitemap) last-hundred)))))

Tweak from the original: I take only the hundred most recent posts from the date-ordered list. I was already doing this for my recent changes page, I think in an attempt to speed it up. (Not sure that it does, though.)

(defun commonplace/format-rss-feed-entry (entry _style project)
  "Format ENTRY for the posts RSS feed in PROJECT."
  (let* ((title (org-publish-find-title entry project))
         (link (concat (file-name-sans-extension entry) ".html"))
         (pubdate (format-time-string (car org-time-stamp-formats)
                                      (org-publish-find-date entry project))))
    (format "%s
:properties:
:rss_permalink: %s
:pubdate: %s
:end:\n"
            title
            link
            pubdate)))

This is used to format each entry that goes into the org file that's generated. I've not made any tweaks to this.

(defun commonplace/publish-rss-feed (plist filename dir)
  "Publish PLIST to RSS when FILENAME is recentchanges-feed.org.
DIR is the location of the output."
  (if (equal "recentchanges-feed.org" (file-name-nondirectory filename))
      (org-rss-publish-to-rss plist filename dir)))

This is the publishing function that you set up to be called from the component that builds your RSS feed in your org-publish-project-alist.

Some notes:

I had to (require 'ox-rss) at the top of my publish.el file, and I also had to add it to my Spacemacs additional packages.

The original uses rss.org as the name of the generated org page that the RSS XML file is built from. But I already have a page called rss.org - it's the page in my digital garden about RSS. So I changed the name to recentchanges-feed.org. You can use whatever name you like, as long as :sitemap-filename and the filename check in commonplace/publish-rss-feed match.

You'll note above that my :base-directory is a temporary directory. I'm playing with this as a way to only build the recent changes RSS from the files that have changed most recently. These are copied into the temp dir before the org-publish process runs, with:

# -r so that subfolders created by cp --parents (e.g. journal/) are cleared too
rm -rf tempdir/*
find . -mtime -28 -name "*.org" -not -path "./tempdir/*" -exec cp --parents -r -p '{}' tempdir \;

This is to avoid processing thousands of org-roam files just to build the recent changes list.

3. Some issues to be resolved

3.1. Backend confusion

A filter function that I have running on the html backend is sticking its stuff in here, which breaks the RSS file.

(defun commonplace/filter-body (text backend info)
  (when (org-export-derived-backend-p backend 'html)
    (concat "<div class='e-content'>" text "</div>")))

I'm guessing the rss backend piggybacks on the html backend or something.

Yeah looks like it: https://github.com/emacsmirror/ox-rss/blob/master/ox-rss.el#L119

(defun commonplace/filter-body (text backend info)
  (when (org-export-derived-backend-p backend 'html)
    (unless (org-export-derived-backend-p backend 'rss)
      (concat "<div class='e-content'>" text "</div>"))))

^ sorted it.

3.2. Subfolders

It doesn't currently produce the correct URL in the RSS feed for my journal pages, which are in a journal subfolder.

3.3. Duplicate IDs

Somehow org-roam seems to think that the IDs for various pages are those for the entries in the RSS org file, not the actual pages themselves.

Some info on that here: https://org-roam.discourse.group/t/possible-to-ignore-directories-within-the-org-directory/2454

(setq org-roam-file-exclude-regexp
      (concat "^" (expand-file-name org-roam-directory) "/tempdir/"))

This seems to have resolved the issue.

To be honest, having tempdir as a subdir of the current dir is causing lots of problems. Should try to just put it somewhere else.
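
A minimal sketch of what that could look like, assuming bash with GNU find and GNU cp (for --parents). The function name stage_recent_org and the example paths are made up for illustration; :base-directory in the commonplace-rss component would need to point at the same staging directory.

```shell
# Sketch only: stage recently changed .org files OUTSIDE the garden.
# Assumes bash plus GNU find and GNU cp (--parents is a GNU extension).
# stage_recent_org is a hypothetical helper, not part of org-publish.
stage_recent_org() {
  local garden_dir="$1"   # root of the org-roam files
  local stage_dir="$2"    # staging directory, outside garden_dir
  local days="${3:-28}"   # how many days back counts as "recent"

  if [ -z "$garden_dir" ] || [ -z "$stage_dir" ]; then
    echo "usage: stage_recent_org GARDEN_DIR STAGE_DIR [DAYS]" >&2
    return 1
  fi

  # Start from an empty staging directory, subfolders included.
  rm -rf -- "$stage_dir"
  mkdir -p -- "$stage_dir" || return 1
  # Resolve to an absolute path, because the copy below runs after a cd.
  stage_dir="$(cd -- "$stage_dir" && pwd)" || return 1

  # Run find from the garden root so that --parents recreates relative
  # paths (e.g. journal/some-day.org) underneath the staging directory.
  (
    cd -- "$garden_dir" || exit 1
    find . -type f -name '*.org' -mtime "-$days" \
      -exec cp --parents -p -- '{}' "$stage_dir" \;
  )
}

# Example call (both paths are placeholders):
# stage_recent_org "$HOME/commonplace" "${TMPDIR:-/tmp}/commonplace-rss" 28
```

With the staging directory outside the garden, the -not -path exclusion is no longer needed, and org-roam should not pick up the copies as duplicate IDs, since they are no longer under org-roam-directory.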


Peer Production License.