How do I make a feed from a webpage?

Not every website publishes an RSS feed. Web page feeds let you turn any HTML page into a feed by telling LightWatch where to find the content using CSS selectors.

This is a premium feature.

Webpages use HTML as the structure for their content. CSS selectors provide a way to target specific elements within that structure. The webpage feed feature combines these two concepts to let you pick content out of a webpage and turn it into a feed.

This feature works on static webpages. It may not be able to extract content from pages that load dynamically. If you need to create a feed from a dynamic site and the webpage feed feature doesn't work, see How do I create my own feed? for other options.

Setting up a web page feed

From the Home tab, tap the More menu in the top right.
Tap Add feed.
Enter the page URL and tap Search for Feeds.
If no RSS feed is found, tap Make feed from webpage.

Visual wizard

The easiest way to get started is the visual wizard. Tap Open Visual Wizard to load the page, then tap elements directly to prefill the selectors. LightWatch will try to figure out the right selectors for you. You can fine-tune them manually afterward.

The built-in webpage feed builder works on static webpages. If the page looks different in the visual wizard than it does in your normal browser, it likely relies on JavaScript and may not work. It still might, depending on what role the JavaScript plays, so give it a try.
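One rough way to predict whether a page is static is to check whether the raw HTML, before any JavaScript runs, already contains the repeating elements you want. The following is a minimal sketch of that check, not a description of how LightWatch works internally. The two sample pages and the names TagCounter and count_tag are hypothetical.

```python
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Counts how many times a given tag opens in raw HTML."""

    def __init__(self, tag):
        super().__init__()
        self.tag = tag
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag names in lowercase.
        if tag == self.tag:
            self.count += 1


def count_tag(html, tag):
    """Return the number of opening `tag` elements found in `html`."""
    counter = TagCounter(tag)
    counter.feed(html)
    return counter.count


# Hypothetical static page: the posts are already present in the HTML.
STATIC_PAGE = "<main><article>One</article><article>Two</article></main>"

# Hypothetical dynamic page: an empty shell that a script fills in later.
DYNAMIC_PAGE = '<div id="app"></div><script src="bundle.js"></script>'

print(count_tag(STATIC_PAGE, "article"))   # posts found in the raw HTML
print(count_tag(DYNAMIC_PAGE, "article"))  # nothing to select until scripts run
```

If the count is zero for a page that clearly shows posts in your browser, the content is probably added by JavaScript after the page loads.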

Configuring selectors manually

The configuration screen has fields for each part of a post that LightWatch needs to extract:

Post — The repeating element that wraps each item on the page (e.g., article, .post, .card). This is the only required field.
Title — The element containing the title text within each post. Has optional Attribute, Format, and Titlecase fields (see below).
Post URL — The element linking to the full post. Defaults to reading the href attribute.
Date — The element containing the publication date within each post.
Post Content — The element containing images and videos. Supports comma-separated selectors for content in different containers.
Post ID — A unique identifier for each post, used for deduplication.
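
To make the fields above concrete, here is a minimal sketch of how a Post selector and the per-post fields relate to each other. It is an illustration under stated assumptions, not LightWatch's implementation. Python's standard library has no CSS selector engine, so the sketch swaps in ElementTree's limited path syntax as a stand-in (for example, the CSS selector h2.title is written as .//h2[@class='title']). The sample page and the helper names read and extract_posts are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample page (not taken from any real site).
PAGE = """
<html><body>
  <article id="p1">
    <h2 class="title">FIRST POST</h2>
    <a class="more" href="/posts/1">Read more</a>
    <time datetime="2024-05-01">May 1</time>
  </article>
  <article id="p2">
    <h2 class="title">SECOND POST</h2>
    <a class="more" href="/posts/2">Read more</a>
    <time datetime="2024-05-02">May 2</time>
  </article>
</body></html>
"""


def read(post, path, attribute=None):
    """Read an attribute from the first match, or its text if attribute is blank."""
    element = post.find(path)
    if element is None:
        return None
    if attribute:
        return element.get(attribute)
    return (element.text or "").strip()


def extract_posts(page):
    """Find every repeating post element, then read each field inside it."""
    root = ET.fromstring(page)
    posts = []
    for post in root.findall(".//article"):  # Post: CSS `article`
        posts.append({
            "id": post.get("id"),  # Post ID: here, the `id` attribute of the post itself
            "title": read(post, ".//h2[@class='title']"),  # Title: CSS `h2.title`
            "url": read(post, ".//a", "href"),  # Post URL: CSS `a`, attribute `href`
            "date": read(post, ".//time", "datetime"),  # Date: CSS `time`, attribute `datetime`
        })
    return posts


for item in extract_posts(PAGE):
    print(item)
```

The point to notice is the nesting: the Post selector is matched against the whole page, and every other selector is matched only inside each post it finds.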

Additional fields

Some of the fields above have extra options for fine-tuning:

Attribute — Which attribute to read from the matched element. Leave blank to use the element’s text content.
Format — A regular expression with a single capture group () to extract part of the matched text. Whatever the group captures becomes the final value. For example, ^(.+?) \| on “My Photo | Blog Name” extracts “My Photo”.
Titlecase — Converts the result to title case.
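
The Format and Titlecase options can be tried out before you enter them. The sketch below uses Python's re module to apply a pattern with one capture group, then title-cases the result. The function name clean_title is hypothetical, and two behaviors are assumptions rather than documented LightWatch behavior: returning the text unchanged when the pattern does not match, and using str.title() as an approximation of title case (it capitalizes after apostrophes, for example, which a real title-casing routine may not).

```python
import re


def clean_title(text, pattern=None, titlecase=False):
    """Apply an optional Format regex, then optional title-casing."""
    if pattern:
        match = re.search(pattern, text)
        if match:
            # Whatever the single capture group matched becomes the value.
            text = match.group(1)
        # Assumption: with no match, the text is left as it was.
    if titlecase:
        # Assumption: str.title() stands in for the app's title-casing.
        text = text.title()
    return text


print(clean_title("My Photo | Blog Name", r"^(.+?) \|"))
print(clean_title("MY PHOTO | BLOG NAME", r"^(.+?) \|", titlecase=True))
```

Both calls print My Photo. The lazy quantifier in (.+?) matters: it stops at the first " |" instead of the last, so a title followed by several pipe-separated parts is still trimmed to its first part.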

LightWatch validates your selectors against the actual page and shows match counts as you type.

Article content

If the posts link to pages with additional images or content, turn on Enable article scraping to have LightWatch also visit each linked post and extract media from there. See getting images from linked posts for details; the setup uses the same selector-based approach described here.

Using an AI agent to find selectors

If you’re not sure what selectors to use, paste the following prompt into an AI agent like ChatGPT or Claude along with the page URL.

I'm configuring an RSS reader called LightWatch to create a feed
from a webpage. Please visit this URL and inspect the HTML to find
the right CSS selectors.

URL: [paste page URL here]

The page has a repeating list of items (posts, articles, photos,
etc). I need values for these specific text inputs in the app.
Leave a field blank if it's not needed.

**Post** (required)
- Selector: CSS selector targeting the repeating element that
  wraps each item (e.g. article, .post, .card). Must match
  multiple items on the page.

**Title**
- Selector: the element within each post containing the title.
- Attribute: which attribute to read, or blank for text content.
- Format: a regex with one capture group () to extract the clean
  title. Whatever the group captures becomes the final value. For
  example, ^(.+?) \| on "My Photo | Blog Name | 2024" extracts
  "My Photo". Leave blank if the title is already clean.
- Titlecase: yes or no. Use yes if the title is ALL CAPS or all
  lowercase.

**Post URL**
- Selector: the element within each post linking to the full
  post. The app reads the href attribute by default.
- Attribute: which attribute to read if not href.

**Date**
- Selector: the element within each post containing the date.
- Attribute: which attribute to read, or blank for text content.
- Format: a regex with one capture group () if the date needs
  extracting from surrounding text.

**Post Content**
- Selector: the element within each post containing images or
  videos. Supports comma-separated selectors if media is in
  different containers within the post.

**Post ID**
- Selector: an element with a unique identifier per post, used
  for deduplication.
- Attribute: which attribute to read (e.g. id, data-id), or
  blank for text content.

Test that the Post selector matches multiple items on the page.
Only include fields where you can find a match.

Limitations

Web page feeds depend on the structure of the source page staying stable. If the site redesigns its HTML, the selectors may need updating. When that happens, you will receive a status notification each time the feed checks for updates, telling you that it isn’t finding content where it expects to.