fix: Markdown link parsing #2960

Merged 11 commits on Jun 1, 2025

Conversation

gibson042
Contributor

Fixes #2959

Improves alignment with CommonMark, although it does not include support for unescaped nesting.

Can be reviewed commit-by-commit.

@Gerrit0 Gerrit0 merged commit 3da819f into TypeStrong:master Jun 1, 2025
7 checks passed
@Gerrit0
Collaborator

Gerrit0 commented Jun 1, 2025

Thank you!

@Gerrit0
Collaborator

Gerrit0 commented Jun 1, 2025

This still isn't quite right. Generating a ludicrous number of checks slowed the tests down by about 15% on my machine (3.5 -> 4 seconds), so I was looking at reducing the number of test cases to a few key checks. While doing that I found these cases, which are incorrectly captured as links:

[

](./no)

[

`code`](./no)

[

text](./no)

@gibson042
Contributor Author

> This still isn't quite right... I was looking at reducing the number of test cases, picking key things to check as it slowed down the tests by about 15% on my machine to generate a ludicrous number of checks (3.5 -> 4 seconds) and found these cases, which are incorrectly captured as links:
>
> [
>
> ](./no)
>
> [
>
> `code`](./no)
>
> [
>
> text](./no)

Can you elaborate? Handling of those inputs looks correct to me:

code
$ npx tsx -e '
  import { lexCommentString } from "./src/lib/converter/comments/rawLexer.ts";
  import { lexBlockComment } from "./src/lib/converter/comments/blockLexer.ts";
  import { parseCommentString, parseComment } from "./src/lib/converter/comments/parser.ts";
  import { FileRegistry } from "./src/lib/models/FileRegistry.ts";
  import { TestLogger } from "./src/test/TestLogger.ts";
  import { MinimalSourceFile } from "#utils";

  const inputs = ["[\n\n](./no)", "[\n\n`code`](./no)", "[\n\ntext](./no)"];
  inputs.push(inputs.join("\n\n"));

  const makeParse = (lex, parse) => {
    const config = {
      blockTags: new Set("@param @remarks @module @inheritDoc @defaultValue".split(" ")),
      inlineTags: new Set(["@link"]),
      modifierTags: new Set("@public @private @protected @readonly @enum @event @packageDocumentation".split(" ")),
      jsDocCompatibility: { defaultTag: true, exampleTag: true, ignoreUnescapedBraces: false, inheritDocTag: false },
      suppressCommentWarningsInDeclarationFiles: false,
      useTsLinkResolution: false,
      commentStyle: "jsdoc",
    };
    return text => {
      const files = new FileRegistry();
      const logger = new TestLogger();
      const content = lex(text);
      const sourceFile = new MinimalSourceFile(text, "/dev/zero");
      const result = parse(content, config, sourceFile, logger, files);
      logger.expectNoOtherMessages();
      return result;
    };
  };
  const parseRaw = makeParse(lexCommentString, parseCommentString);
  const parseBlockComment = makeParse(lexBlockComment, parseComment);
  const embedInComment = input => {
    const lines = input.split("\n");
    const embedded = `/**\n${lines.map(line => " * " + line).join("\n")}\n */`;
    return embedded;
  };
  for (const rawInput of inputs) {
    console.log("\n\n==== raw input");
    console.log(rawInput);
    console.log("==== raw parse");
    console.log(parseRaw(rawInput));
    const commentInput = embedInComment(rawInput);
    console.log("\n==== comment input");
    console.log(commentInput);
    console.log("==== comment parse");
    console.log(parseBlockComment(commentInput));
  }
'
result
==== raw input
[

](./no)
==== raw parse
{
  content: [ { kind: 'text', text: '[\n\n](./no)' } ],
  frontmatter: {}
}

==== comment input
/**
 * [
 * 
 * ](./no)
 */
==== comment parse
Comment {
  summary: [ { kind: 'text', text: '[\n\n](./no)' } ],
  blockTags: [],
  modifierTags: Set(0) {},
  label: undefined
}


==== raw input
[

`code`](./no)
==== raw parse
{
  content: [
    { kind: 'text', text: '[\n\n' },
    { kind: 'code', text: '`code`' },
    { kind: 'text', text: '](./no)' }
  ],
  frontmatter: {}
}

==== comment input
/**
 * [
 * 
 * `code`](./no)
 */
==== comment parse
Comment {
  summary: [
    { kind: 'text', text: '[\n\n' },
    { kind: 'code', text: '`code`' },
    { kind: 'text', text: '](./no)' }
  ],
  blockTags: [],
  modifierTags: Set(0) {},
  label: undefined
}


==== raw input
[

text](./no)
==== raw parse
{
  content: [ { kind: 'text', text: '[\n\ntext](./no)' } ],
  frontmatter: {}
}

==== comment input
/**
 * [
 * 
 * text](./no)
 */
==== comment parse
Comment {
  summary: [ { kind: 'text', text: '[\n\ntext](./no)' } ],
  blockTags: [],
  modifierTags: Set(0) {},
  label: undefined
}


==== raw input
[

](./no)

[

`code`](./no)

[

text](./no)
==== raw parse
{
  content: [
    { kind: 'text', text: '[\n\n](./no)\n\n[\n\n' },
    { kind: 'code', text: '`code`' },
    { kind: 'text', text: '](./no)\n\n[\n\ntext](./no)' }
  ],
  frontmatter: {}
}

==== comment input
/**
 * [
 * 
 * ](./no)
 * 
 * [
 * 
 * `code`](./no)
 * 
 * [
 * 
 * text](./no)
 */
==== comment parse
Comment {
  summary: [
    { kind: 'text', text: '[\n\n](./no)\n\n[\n\n' },
    { kind: 'code', text: '`code`' },
    { kind: 'text', text: '](./no)\n\n[\n\ntext](./no)' }
  ],
  blockTags: [],
  modifierTags: Set(0) {},
  label: undefined
}

@Gerrit0
Collaborator

Gerrit0 commented Jun 4, 2025

Sorry about that! I had apparently added a test case with just one newline, but didn't have different links, so I didn't realize it was that link that caused it.
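
For readers following along, the distinction at issue here (a single newline inside link text versus a blank line) can be sketched as below. This is a simplified illustration, not TypeDoc's lexer: `findInlineLinks` and `InlineLink` are hypothetical names, and the regular expression deliberately ignores nested brackets, escapes, link titles, and code spans. It only assumes the CommonMark rule that a blank line ends a paragraph, so an inline link cannot span one.

```typescript
// Simplified sketch, NOT TypeDoc's implementation: finds inline links of the
// form [text](destination) and rejects any whose text contains a blank line.
interface InlineLink {
  text: string;
  destination: string;
}

// Hypothetical helper, for illustration only.
function findInlineLinks(markdown: string): InlineLink[] {
  const links: InlineLink[] = [];
  // Link text: any characters except brackets (nesting is ignored here).
  // Destination: any characters except whitespace and parentheses.
  const pattern = /\[([^\[\]]*)\]\(([^\s()]*)\)/g;
  for (const match of markdown.matchAll(pattern)) {
    const text = match[1] ?? "";
    const destination = match[2] ?? "";
    // A blank line (empty or only spaces/tabs) ends the paragraph,
    // so the bracketed span cannot be link text.
    if (/\n[ \t]*\n/.test(text)) continue;
    links.push({ text, destination });
  }
  return links;
}

// One newline inside the text: still treated as a link.
console.log(findInlineLinks("[\ntext](./yes)"));
// Blank line inside the text: not a link, matching the parse results above.
console.log(findInlineLinks("[\n\ntext](./no)"));
```

With this sketch, the three inputs from the discussion (`"[\n\n](./no)"`, ``"[\n\n`code`](./no)"``, `"[\n\ntext](./no)"`) all yield no links, while the single-newline variant yields one.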

Successfully merging this pull request may close these issues.

Link parsing deviates from Markdown