| NEW FAR SIDE IN FRESHRSS
2023-03-29
I got some great feedback to yesterday's post about using FreshRSS + XPath to
subscribe to Forward, including helpful comments from FreshRSS developer
Alexandre Alapetite and from somebody who appreciated it and my Far Side
"Daily Dose" recipe and wondered if it was possible to get the new Far Side
content in FreshRSS too.
Wait, there's new Far Side content? Yup: it turns out Gary Larson's dusted off
his pen and started drawing again. That's awesome! But the last thing I want
is to have to go to the website once every few... what: days? weeks? months?
He's not syndicated any more so he's not got a deadline to work to! If only
there were some way to have my feed reader, y'know, do it for me and let me
know whenever he draws something new.
|
|
Here's my setup for getting Larson's new funnies right where I want them:
* Feed URL: https://www.thefarside.com/new-stuff/1
This isn't a valid address for any of the new stuff, but always seems to
redirect to somewhere that is, so that's nice.
* XPath for finding news items: //div[@class="swiper-slide"]
Turns out all the "recent" new stuff gets loaded in the HTML and then
JavaScript turns it into a slider etc.; some of the CSS classes change when
the JavaScript runs so I needed to View Source rather than use my browser's
inspector to find everything.
* Item title: concat("Far Side #",
descendant::button[@aria-label="Share"]/@data-shareable-item)
Ugh. The easiest place I could find a "clean" comic ID number was in a data-
attribute of the "share" button, where it's presumably used for engagement
tracking. Still, whatever works right?
* Item content: descendant::figcaption
When Larson captions a comic, the caption is important.
* Item link (URL) and item unique ID: concat("https://www.thefarside.com",
./@data-path)
The URLs work as direct links to the content, and because they're unique, they
make a reasonable unique ID too (so long as their numbering scheme is
internally-consistent, this should stop a re-run of new content popping up in
your feed reader if the same comic comes around again).
* Item thumbnail:
concat("https://fox.q-t-a.uk/referer-faker.php?pw=YOUR-SECRET-PASSWORD-GOES-HERE&referer=https://www.thefarside.com/&url=",
descendant::img[@data-src]/@data-src)
The Far Side uses Referer: headers as an anti-hotlinking measure, which
prevents us easily loading the images directly in an RSS reader. I use this
tiny PHP script as a proxy to mitigate that. If you don't have such a proxy
set up, you could simply omit the "Item thumbnail" and "Item content" fields
and click the link to go to the original page.
* Item date:
normalize-space(descendant::div[@class="tfs-comic-new__meta"]/*[1])
The date is spread through two separate text nodes, so we get the content of
their wrapper and use normalize-space to tidy the whitespace up. The date
format then looks like "Wednesday, March 29, 2023", which we can parse using a
custom date/time format string:
* Custom date/time format: l, F j, Y
I promise I'll stop writing about how awesome FreshRSS + XPath is someday.
Today isn't that day.
Meanwhile: if you used to use a feed reader but gave up when the Web started
to become hostile to them and big social media systems started to wall you in,
you should really consider picking one up again. The stuff I write about is
complex edge-cases that most folks don't need to think about in order to
benefit from RSS... but it's super convenient to have the things you care
about online (news, blogs, social media, videos, newsletters, comics, search
trends...) collated and sorted for you... without interference from algorithms
that want to push "sticky" content, without invasive tracking or
advertisements (or cookie banners or privacy popups), without something
"disappearing" simply because you put off reading it for a few days.
LINKS
|