Add markdown output in preparation for llms.txt #1632

Merged: @paoloredis merged 3 commits into main from DOC-5280 on Jun 4, 2025
paoloredis
Collaborator

@paoloredis paoloredis commented May 29, 2025

@andy-stark-redis
Contributor

@paoloredis Looking great so far, but a few questions:

  • In the proposal section of the llms.txt "standard", it says that the files should have names like index.html.md (ie, the .md is added after the original suffix). This doesn't seem to work currently, although index.md works fine. Is this likely to be a problem?
  • Shortcodes other than relrefs still appear in the page. I'm not sure if there are any others worth keeping in the text (images, maybe?) but perhaps it's useful to have another regex in section.md/single.md that removes shortcodes other than the ones we want to use (eg, {{< note >}} can probably just be deleted).
  • I'm not sure if this requires a change to the page content more than anything, but currently, some pages have the university links at the bottom. In the Markdown version, these get rendered as shortcodes, but the heading that introduces them is part of the page Markdown (usually ## Continue learning with Redis University). If we think the university links might be useful to keep in the Markdown versions of the pages, then we need to handle this shortcode. If we don't want them in the Markdown, then maybe we need to remove the header from the page content and add it to the university links shortcode template. As it stands, the Markdown contains a header with nothing useful after it.

@paoloredis
Collaborator Author

Thanks for the feedback @andy-stark-redis.

> In the proposal section of the llms.txt "standard", it says that the files should have names like index.html.md (ie, the .md is added after the original suffix). This doesn't seem to work currently, although index.md works fine. Is this likely to be a problem?

I've changed config.toml so that the markdown files are named like index.html.md.
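
To illustrate the naming convention being discussed (this is not the Hugo configuration from the PR, only a minimal Python sketch), the `.md` suffix is appended after the existing `.html` suffix rather than replacing it. The function name and example path below are hypothetical.

```python
from pathlib import PurePosixPath


def markdown_twin(html_path: str) -> str:
    """Return the Markdown path that sits next to an HTML page.

    Hypothetical helper: appends ".md" after the existing suffix,
    so "index.html" becomes "index.html.md" (not "index.md").
    """
    path = PurePosixPath(html_path)
    return str(path.with_name(path.name + ".md"))


# Example path is made up for illustration.
print(markdown_twin("docs/get-started/index.html"))
```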

> Shortcodes other than relrefs still appear in the page. I'm not sure if there are any others worth keeping in the text (images, maybe?) but perhaps it's useful to have another regex in section.md/single.md that removes shortcodes other than the ones we want to use (eg, {{< note >}} can probably just be deleted).

I've removed all shortcode usage. The image shortcodes were replaced with links to the images.
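
As a rough sketch of the idea (the PR itself does this in the Hugo templates, which are not shown here), a regex pass could turn image shortcodes into plain links and then drop any remaining shortcode tags. The shortcode shapes assumed below, such as an `image` shortcode with a `filename` attribute, are guesses for illustration, and relref resolution is left out.

```python
import re

# Assumed shape of an image shortcode: {{< image filename="/path.png" ... >}}
IMAGE_SHORTCODE = re.compile(
    r'\{\{<\s*image\s+[^>]*?filename="([^"]+)"[^>]*?>\}\}'
)
# Any other opening or closing shortcode tag, e.g. {{< note >}} or {{< /note >}}
ANY_SHORTCODE = re.compile(r'\{\{[<%]\s*/?\s*[\w-]+[^>%]*?[>%]\}\}')


def strip_shortcodes(markdown: str) -> str:
    """Replace image shortcodes with links, then delete all other shortcode tags.

    Text between paired tags is kept; only the tags themselves are removed.
    relref shortcodes would need separate handling before this step.
    """
    text = IMAGE_SHORTCODE.sub(
        lambda m: f"[{m.group(1)}]({m.group(1)})", markdown
    )
    return ANY_SHORTCODE.sub("", text)


print(strip_shortcodes("{{< note >}}Be careful.{{< /note >}}"))
```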

> I'm not sure if this requires a change to the page content more than anything, but currently, some pages have the university links at the bottom. In the Markdown version, these get rendered as shortcodes, but the heading that introduces them is part of the page Markdown (usually ## Continue learning with Redis University). If we think the university links might be useful to keep in the Markdown versions of the pages, then we need to handle this shortcode. If we don't want them in the Markdown, then maybe we need to remove the header from the page content and add it to the university links shortcode template. As it stands, the Markdown contains a header with nothing useful after it.

Good point, I agree we should remove the header from the page content and add it to the shortcode template instead. I'll do that in a separate pull request.

Collaborator

@mich-elle-luna mich-elle-luna left a comment

thank you!

Contributor

@andy-stark-redis andy-stark-redis left a comment

Looking great now! Approved :-)

Just one other thought that's occurred to me - the links to other docs pages in the new .html.md files point to the HTML version of the page rather than the index.html.md equivalent. I'm just wondering if they should point to the index.html.md version or if it even makes any difference? I can't find any guidance about this on https://llmstxt.org/ or elsewhere, unfortunately. (Maybe I'm just overthinking this!)

@paoloredis
Collaborator Author

@andy-stark-redis I don't think that's an issue, as it's the llms.txt file that is going to link to the .html.md pages.
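
For context, a minimal sketch of how an llms.txt file might point at the `.html.md` pages, following the general layout described on llmstxt.org (a title heading, a blockquote summary, then sections of links). The function name, site title, and URLs are hypothetical and not taken from this PR.

```python
def build_llms_txt(title: str, summary: str, pages: list[tuple[str, str]]) -> str:
    """Build a small llms.txt body whose links target the Markdown twins.

    `pages` holds (link text, HTML URL) pairs; ".md" is appended to each URL
    so the link resolves to the index.html.md version of the page.
    """
    lines = [f"# {title}", "", f"> {summary}", "", "## Docs", ""]
    for name, html_url in pages:
        lines.append(f"- [{name}]({html_url}.md)")
    return "\n".join(lines) + "\n"


# Placeholder values for illustration only.
print(build_llms_txt(
    "Example Docs",
    "Documentation for an example project.",
    [("Get started", "https://example.com/docs/get-started/index.html")],
))
```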

@paoloredis paoloredis merged commit 2c085f0 into main Jun 4, 2025
4 checks passed
@paoloredis paoloredis deleted the DOC-5280 branch June 4, 2025 13:08
3 participants