Create RSS Feed from Webpage • FiveFilters.org

Create an RSS feed for a web page which does not offer its own.

  1. Give us the web page URL and tell us which links you're interested in
  2. We'll produce a simple feed with those links as feed items
  3. Subscribe to the feed in your news reader to get notified of changes

You can also use the resulting feed with other services which take RSS as input, such as our Full-Text RSS and PDF Newspaper tools.

Request parameters

When making HTTP requests, you can pass the following parameters to extract.php in either a GET or a POST request. For most uses, supplying the page URL (url) and a selector (either in_id_or_class or item) is enough.

We do not provide form fields for all of these parameters, but you can modify the URL in your browser after clicking 'Preview' to use them. See our examples.
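As a rough illustration of editing the URL by hand, the sketch below assembles a GET request URL from a dictionary of parameters. The host name in BASE_URL is a placeholder (an assumption, not the real service address), and the page URL and substring are sample values only. Parameters whose names end in [] can appear multiple times, so they are given as lists here.

```python
from urllib.parse import urlencode

# Placeholder (assumed) address: replace with wherever extract.php is hosted.
BASE_URL = "https://feedcreator.example/extract.php"


def build_feed_url(base_url, params):
    """Return base_url followed by params encoded as a query string.

    List values are repeated once per element, which is how parameters
    such as url_contains[] can be supplied more than once.
    """
    return base_url + "?" + urlencode(params, doseq=True)


feed_url = build_feed_url(BASE_URL, {
    "url": "chomsky.info/articles.htm",   # page to extract links from
    "url_contains[]": ["articles/"],      # keep only matching links
    "max": 10,                            # cap the number of items
    "format": "rss",                      # or "json"
})
print(feed_url)
```

The square brackets and slashes are percent-encoded in the result; a browser address bar will usually accept them unencoded as well.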

Parameter | Value | Description
url | string (URL) | URL of the web page containing the links to extract for the RSS feed. This is required.
in_id_or_class | string (attribute value) | Find links inside elements whose id or class attribute matches this value. This translates to the following XPath, where string stands for the supplied value: //a[@href and ancestor-or-self::*[@id="string" or contains(concat(" ",normalize-space(@class)," "), " string ")]]
item | string (CSS selector) | Look inside element(s) matching this CSS expression (for example: div.news .item). Cannot be used in combination with in_id_or_class.
item_title | string (CSS selector) or 0 | Extract item title from element matching CSS selector. This is applied within the context of elements selected by item. If omitted, the text of the first matching <a> element will be used. If set to 0, titles will not be included in the output.
item_url | string (CSS selector) or 0 | Extract item URL from element matching CSS selector. This is applied within the context of elements selected by item. If omitted, the URL of the first matching <a> element will be used. If set to 0, URLs will not be included in the output.
item_desc | string (CSS selector) | Extract item description from element matching CSS selector. This is applied within the context of elements selected by item. If omitted, the generated feed will not include item descriptions.
url_contains[] | string (URL substring) | Filter out any item whose URL does not contain at least one of these substrings. This can appear multiple times. Example: /article/
text_contains[] | string (word or phrase) | Filter out any item whose title or description does not contain at least one of the supplied words or phrases. This can appear multiple times.
unique_url | 1 (default), 0 | If multiple matching items have the same URL, only the first encountered will be kept. Set to 0 to keep all matching URLs. Note: this is enabled by default, but always disabled when you ask for URLs to be excluded from output (item_url=0).
unique_title | 1 (default), 0 | If multiple matching items have the same title, only the first encountered will be kept. Set to 0 to keep all matching titles. Note: this is enabled by default, but always disabled when you ask for titles to be excluded from output (item_title=0).
strip | string (CSS selector) | Remove elements matching CSS selector. This will be processed before we start looking for items. To strip multiple elements, use commas. For example: .header,.footer
strip_if_url[] | string | Filter out any item whose URL contains any of the supplied words or phrases. This can appear multiple times.
strip_if_text[] | string | Filter out any item whose title or description contains any of the supplied words or phrases. This can appear multiple times.
feed_title | string | The feed title to use in the generated feed. If omitted, we'll use whatever's in the <title> element of the web page requested. Note: this should be the actual title, not a selector.
max | number (limit: 50) | The maximum number of items to return. The limit can be changed in the config file.
format | rss (default), json | By default the output will be formatted as RSS 2.0. Set this to json to get JSON output instead.
saved[] | string | Saved request parameters. In the config file, you can associate a set of request parameters with a given name. You can then pass that name in the request instead of the request parameters.

Required parameters: url must be supplied.
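To make the filtering parameters more concrete, here is a minimal sketch of how url_contains[], text_contains[], strip_if_url[], strip_if_text[], unique_url, unique_title and max could combine, going only by the descriptions above. It is not the tool's actual code: the Item class and filter_items function are hypothetical names, and case-sensitive substring matching and the order of the checks are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """Hypothetical stand-in for one extracted link."""
    url: str
    title: str
    desc: str = ""


def contains_any(text, needles):
    """True if text contains at least one needle (assumed case-sensitive)."""
    return any(needle in text for needle in needles)


def filter_items(items, url_contains=(), text_contains=(), strip_if_url=(),
                 strip_if_text=(), unique_url=True, unique_title=True,
                 max_items=50):
    kept, seen_urls, seen_titles = [], set(), set()
    for item in items:
        if len(kept) >= max_items:
            break
        # Inclusion filters: only applied when at least one value is given.
        if url_contains and not contains_any(item.url, url_contains):
            continue
        if text_contains and not (contains_any(item.title, text_contains)
                                  or contains_any(item.desc, text_contains)):
            continue
        # Exclusion filters: drop the item on any match.
        if contains_any(item.url, strip_if_url):
            continue
        if (contains_any(item.title, strip_if_text)
                or contains_any(item.desc, strip_if_text)):
            continue
        # Duplicate handling: keep only the first item seen per URL/title.
        if unique_url and item.url in seen_urls:
            continue
        if unique_title and item.title in seen_titles:
            continue
        seen_urls.add(item.url)
        seen_titles.add(item.title)
        kept.append(item)
    return kept


sample = [
    Item("/articles/a.htm", "First essay"),
    Item("/about.htm", "About"),
    Item("/articles/a.htm", "First essay (mirror)"),
    Item("/articles/print/b.htm", "Second essay"),
    Item("/articles/c.htm", "Third essay"),
]
result = filter_items(sample, url_contains=["articles/"],
                      strip_if_url=["/print/"])
print([item.url for item in result])
```

With the sample data, the About page fails url_contains[], the mirror is dropped as a duplicate URL, and the print version is removed by strip_if_url[].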


Quick start

  1. Enter a page URL in the form above and click 'Preview' (or try one of the examples).
  2. If the preview looks okay, use the RSS feed in your news reader or application.
  3. The form only provides fields for the basic parameters. To use the others, modify the URL in your browser (see the examples for ideas).
  4. That's it! (Although see below if you'd like to customise further.)
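
If you consume the feed from your own application rather than a news reader, the default RSS 2.0 output can be read with a standard XML parser. The sketch below parses a small hand-written sample; SAMPLE_FEED is an illustrative RSS 2.0 document, not real output from the tool, and read_items is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hand-written RSS 2.0 sample for illustration only (not real tool output).
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <link>http://example.org/</link>
    <description>Sample only</description>
    <item>
      <title>First article</title>
      <link>http://example.org/articles/1</link>
    </item>
    <item>
      <title>Second article</title>
      <link>http://example.org/articles/2</link>
    </item>
  </channel>
</rss>"""


def read_items(feed_xml):
    """Return a list of (title, link) pairs from an RSS 2.0 document."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iterfind("./channel/item"):
        # A missing element yields an empty string, for example when
        # titles or URLs were excluded with item_title=0 or item_url=0.
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        items.append((title, link))
    return items


for title, link in read_items(SAMPLE_FEED):
    print(title, link)
```

In practice you would download the feed text from your generated feed URL first and pass it to the same function.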