Python 3 script to create an RSS feed for Google News and Search discovery

The goal of this tutorial is to generate an rss.xml file that can be published as a discovery feed for Google News.

Why is it useful to have an RSS.xml file? Mainly because it automates publication tasks. Starting from this file, you can notify readers, feed a newsletter queue, monitor recent URLs, connect editorial dashboards and complement the XML sitemap with a fresh list of recently published content.

Current SEO context: Google Search accepts RSS 2.0 and Atom 1.0 feeds, but a feed does not replace a normal sitemap or good internal links. Treat RSS as the fast-update layer: recent articles, canonical URLs, clean titles, dates and descriptions that match the public page.

Repository: https://github.com/al118345/rss-python. Video: https://www.youtube.com/watch?v=k8mVioEJLL8

rss.xml structure

A good reference is the New York Times world RSS feed: https://rss.nytimes.com/services/xml/rss/nyt/World.xml

Its structure follows the RSS 2.0 layout: a channel element with descriptive fields, followed by one item per article. Of all these elements, the required channel fields are:

Required elements	Function
title	Contains the RSS channel title.
link	Contains the website URL.
description	Contains the RSS channel description.
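
As a minimal sketch of how these three fields fit together, the following builds an RSS 2.0 channel with the standard library module xml.etree.ElementTree. The function name build_channel and the example.com values are placeholders for illustration; they are not taken from the repository.

```python
import xml.etree.ElementTree as ET


def build_channel(title: str, link: str, description: str) -> ET.Element:
    """Return an <rss> root containing a channel with the three required fields."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    return rss


# Placeholder values, not the real site metadata.
root = build_channel(
    title="Example blog",
    link="https://example.com/",
    description="Recent articles from an example blog.",
)
print(ET.tostring(root, encoding="unicode"))
```

ElementTree escapes special characters such as & and < in the text values, which avoids most of the escaping errors that appear when a feed is assembled by string concatenation.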

Implementation

Building on this example, the full script is published in the repository https://github.com/al118345/rss-python.

The general idea is simple. First, a static section is filled with the website name, email and other identifying information. Those values are later used to identify the website and its author.

After that, the script loads a CSV document with the following structure:

title	url	topic
Bayesian network fundamentals	https://1938.com.es/redes-bayesianas	mathematics
Introduction to MongoDB. Document query examples.	https://1938.com.es/mongodb	mongodb nosql

The structure is intentionally simple: three columns with the title, URL and topic. This file is read by the script to generate the different feed entries automatically.
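
One way to turn that file into feed entries is sketched below. It is not the repository's code: the comma delimiter, the header row, the function name rows_to_items and the choice of mapping the topic column to a category element are all assumptions made for the example.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Sample data in the three-column layout described above. The comma delimiter
# and the header row are assumptions; adjust them to match the real file.
SAMPLE_CSV = """title,url,topic
Bayesian network fundamentals,https://1938.com.es/redes-bayesianas,mathematics
Introduction to MongoDB. Document query examples.,https://1938.com.es/mongodb,mongodb nosql
"""


def rows_to_items(csv_file, delimiter=","):
    """Turn each CSV row into an RSS <item> element."""
    items = []
    for row in csv.DictReader(csv_file, delimiter=delimiter):
        url = row["url"].strip()
        item = ET.Element("item")
        ET.SubElement(item, "title").text = row["title"].strip()
        ET.SubElement(item, "link").text = url
        # Mapping the topic column to <category> is one possible choice.
        ET.SubElement(item, "category").text = row["topic"].strip()
        # Reusing the URL as a permalink guid gives each entry a stable identity.
        ET.SubElement(item, "guid", isPermaLink="true").text = url
        items.append(item)
    return items


items = rows_to_items(io.StringIO(SAMPLE_CSV))
for item in items:
    print(ET.tostring(item, encoding="unicode"))
```

With a real file, the io.StringIO object would be replaced by open(path, newline="", encoding="utf-8"), and each returned item would be appended to the channel element.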

What Google News expects from the feed

A feed should not be treated as a random list of links. Google News, Google Search and other readers can use it as a structured signal about what has changed on the site, which URL is canonical, when an item was published and whether the entry belongs to a recognizable editorial source. For that reason, the script should generate stable URLs, meaningful titles, clean descriptions and dates in a standard format.
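
For the dates, RSS 2.0 uses the RFC 822 format. The sketch below shows one way to produce it with the standard library. Note that the CSV described earlier has no date column, so the publication date here is a hypothetical value, and rss_date is an illustrative helper name.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def rss_date(moment: datetime) -> str:
    """Format a timezone-aware datetime as an RFC 822 date string in GMT."""
    return format_datetime(moment.astimezone(timezone.utc), usegmt=True)


published = datetime(2024, 1, 2, 9, 30, tzinfo=timezone.utc)  # hypothetical date
print(rss_date(published))  # Tue, 02 Jan 2024 09:30:00 GMT
```

The resulting string can be used as the text of a pubDate element in each item or of lastBuildDate in the channel.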

The most common mistake is to create a valid XML document that is still poor from an editorial point of view. If every item has a generic description, if several titles are almost identical or if the feed points to pages that redirect or contain thin content, the feed will be technically correct but weak for discovery.

Validation checklist before publishing

Open the generated XML in a browser and check that it has no escaping errors or broken characters.
Verify that every link returns HTTP 200 and uses the same canonical URL that appears in the page HTML.
Use a descriptive channel title, a real site link, language metadata and an updated build date.
Avoid duplicate items: each article should appear once, with one permanent URL and one clear topic.
Keep the CSV source under version control so feed changes can be reviewed like any other content change.
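
Part of this checklist can be automated. The following sketch, with the hypothetical helper check_feed, covers only the offline checks: well-formed XML, the required channel fields and duplicate item links. Checking that each link returns HTTP 200 needs network requests and is left out here.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def check_feed(xml_text: str) -> list[str]:
    """Return a list of problems found in an RSS document (empty if none)."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as error:
        return [f"not well-formed XML: {error}"]

    channel = root.find("channel")
    if channel is None:
        return ["missing <channel> element"]

    problems = []
    for tag in ("title", "link", "description"):
        if not (channel.findtext(tag) or "").strip():
            problems.append(f"channel <{tag}> is missing or empty")

    links = [(item.findtext("link") or "").strip() for item in channel.iter("item")]
    for link, count in Counter(links).items():
        if count > 1:
            problems.append(f"duplicate item link: {link}")
    return problems


# Small example feed with one deliberate duplicate.
SAMPLE_FEED = """<rss version="2.0"><channel>
<title>Example blog</title><link>https://example.com/</link>
<description>Recent articles.</description>
<item><title>A</title><link>https://example.com/a</link></item>
<item><title>A again</title><link>https://example.com/a</link></item>
</channel></rss>"""

print(check_feed(SAMPLE_FEED))
```

An empty list means none of these particular problems were found; it does not replace opening the feed in a browser or a dedicated feed validator.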

Publication workflow

In a real project, this script can run after publishing a new article. The usual flow is: update the CSV or content database, regenerate the RSS file, upload it with the static assets, request a crawl if the article is important and monitor Search Console for indexing problems. This does not force Google to index a page, but it gives crawlers a clean and consistent discovery path.

It also connects well with other automation tasks: an RSS entry can feed a newsletter, a social post queue or a small internal dashboard that checks whether recent articles have title, description, canonical and sitemap coverage. The important part is that the RSS should reflect the public website, not become a separate source of truth with different URLs or summaries.

Another useful improvement is to add automated validation before writing the final file. The script can reject empty titles, relative URLs, missing topics or duplicated links before generating XML. This prevents the feed from publishing low-quality entries that later have to be removed from Google News or Search Console.
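
A minimal sketch of such a validation step is shown below, assuming each row is a dictionary with the title, url and topic keys from the CSV. The function name validate_rows and the rejection messages are illustrative choices, not part of the original script.

```python
from urllib.parse import urlparse


def validate_rows(rows):
    """Split rows into accepted entries and (row, reason) rejections."""
    accepted, rejected, seen_urls = [], [], set()
    for row in rows:
        title = (row.get("title") or "").strip()
        url = (row.get("url") or "").strip()
        topic = (row.get("topic") or "").strip()
        parsed = urlparse(url)
        if not title:
            rejected.append((row, "empty title"))
        elif parsed.scheme not in ("http", "https") or not parsed.netloc:
            rejected.append((row, "URL is not absolute"))
        elif not topic:
            rejected.append((row, "missing topic"))
        elif url in seen_urls:
            rejected.append((row, "duplicated link"))
        else:
            seen_urls.add(url)
            accepted.append(row)
    return accepted, rejected


rows = [
    {"title": "Bayesian network fundamentals",
     "url": "https://1938.com.es/redes-bayesianas", "topic": "mathematics"},
    {"title": "", "url": "https://1938.com.es/mongodb", "topic": "mongodb nosql"},
    {"title": "Relative link", "url": "/mongodb", "topic": "mongodb nosql"},
    {"title": "Repeated",
     "url": "https://1938.com.es/redes-bayesianas", "topic": "mathematics"},
]
accepted, rejected = validate_rows(rows)
for row, reason in rejected:
    print(reason, "->", row["url"])
```

Only the accepted rows would then be passed to the XML generation step, while the rejections can be logged or used to fail the build.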

If the feed is generated from a CMS or a static-site build, keep the same publication rules as the sitemap: only include pages that are indexable, canonical and useful for readers. Tag archives, temporary URLs, search results and tests should not be mixed with editorial articles. Clean discovery signals are boring, but they are exactly what crawlers need.

A final practical check is to compare the RSS, the sitemap and the visible article list. Important URLs should appear in all three places with the same final address. If a page is present in the feed but not linked internally, it may still be discovered, but it sends a weaker quality signal than an article connected from related content.
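
This comparison can be sketched with plain sets, assuming the three URL lists have already been extracted (from the feed, the sitemap and the article listing page). The function name coverage_gaps and the example.com URLs are hypothetical.

```python
def coverage_gaps(feed_urls, sitemap_urls, listed_urls):
    """Map each URL that is missing somewhere to the places it is missing from."""
    sources = {
        "feed": set(feed_urls),
        "sitemap": set(sitemap_urls),
        "article list": set(listed_urls),
    }
    all_urls = set().union(*sources.values())
    gaps = {}
    for url in sorted(all_urls):
        missing = [name for name, urls in sources.items() if url not in urls]
        if missing:
            gaps[url] = missing
    return gaps


# Placeholder URLs for illustration only.
feed = ["https://example.com/a", "https://example.com/b"]
sitemap = ["https://example.com/a", "https://example.com/b", "https://example.com/c"]
listed = ["https://example.com/a", "https://example.com/c"]

for url, missing in coverage_gaps(feed, sitemap, listed).items():
    print(url, "is missing from:", ", ".join(missing))
```

The comparison is exact, so a trailing slash or an http versus https difference counts as a different address, which is usually the kind of mismatch this check is meant to surface.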

The final generated RSS looks like this: