[PR #917] [MERGED] Replace ANTLR with hand-rolled parser #806

Closed
opened 2025-12-30 01:26:54 +01:00 by adam · 0 comments

📋 Pull Request Information

Original PR: https://github.com/apple/pkl/pull/917
Author: @stackoverflow
Created: 1/29/2025
Status: Merged
Merged: 2/12/2025
Merged by: @stackoverflow

Base: main ← Head: new-parser


📝 Commits (10+)

  • f37a904 Start new parser
  • 724219a ParserComparisonTest inherits from ParserComparisonTestInterface (#1)
  • d1fc7b9 Start parser visitor
  • b1be0be Spotless apply and rebase main
  • 6f5520e Assert that compare doesn't throw, and annotate the path with as (#3)
  • 05c147a Collect all test failures using SoftAssertions (#4)
  • e1fc451 Fix bug in parser and lexer
  • 0afd454 Fix bug in operator resolution
  • 95234f8 Add support for legacy syntax
  • 4e02b30 Change spans to be faster

📊 Changes

197 files changed (+11714 additions, -3310 deletions)

View changed files

📝 .gitignore (+1 -0)
📝 DEVELOPMENT.adoc (+2 -8)
📝 THIRD-PARTY-NOTICES.txt (+21 -32)
📝 bench/bench.gradle.kts (+1 -3)
📝 bench/gradle.lockfile (+0 -1)
📝 bench/src/jmh/java/org/pkl/core/parser/ParserBenchmark.java (+1 -1)
📝 buildSrc/src/main/kotlin/pklFatJar.gradle.kts (+0 -1)
📝 docs/modules/language-reference/pages/index.adoc (+0 -6)
📝 docs/src/test/kotlin/DocSnippetTests.kt (+7 -8)
📝 pkl-cli/gradle.lockfile (+0 -1)
📝 pkl-cli/pkl-cli.gradle.kts (+0 -3)
📝 pkl-cli/src/main/java/org/pkl/cli/svm/PolyglotContextImplTarget.java (+4 -4)
📝 pkl-codegen-java/gradle.lockfile (+0 -1)
📝 pkl-codegen-kotlin/gradle.lockfile (+0 -1)
📝 pkl-commons-cli/gradle.lockfile (+0 -1)
📝 pkl-config-java/gradle.lockfile (+0 -1)
📝 pkl-config-kotlin/gradle.lockfile (+0 -1)
📝 pkl-core/gradle.lockfile (+1 -1)
📝 pkl-core/pkl-core.gradle.kts (+18 -19)
📝 pkl-core/src/main/java/org/pkl/core/PcfRenderer.java (+1 -1)

...and 80 more files

📄 Description

Fixes #906
Fixes #888
Fixes #927
Fixes #931
Fixes #932

Incompatible changes:

  • Object entries on the same line now require a semicolon between them (properties and elements are unaffected):
    // not valid anymore
    foo { ["1"] = 1 ["2"] = 2 }

    // requires ;
    foo { ["1"] = 1; ["2"] = 2 }

Reason: accepting entries with no semicolon or newline between them forces the parser to backtrack too much, which degrades performance. In addition, neither our IntelliJ plugin nor the LSP accepts this syntax; both report an error.

  • Some invalid string escapes are no longer accepted (see #888)
  • Unseparated object members require a space, semicolon or newline (see #932)
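
To illustrate the semicolon rule for entries, here is a minimal sketch of a hand-written, single-pass loop that refuses a second entry unless a `;` or newline precedes it. It is not the parser from this PR: the class `EntryParserSketch`, its methods, and the restriction to `["key"] = integer` entries are all assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch, NOT Pkl's actual parser: parses `{ ["k"] = 1; ["j"] = 2 }`. */
final class EntryParserSketch {
  private final String src;
  private int pos = 0;

  private EntryParserSketch(String src) {
    this.src = src;
  }

  /** Returns the entries in source order, or throws if two entries are not separated. */
  static Map<String, Integer> parse(String src) {
    return new EntryParserSketch(src).parseBody();
  }

  private Map<String, Integer> parseBody() {
    Map<String, Integer> entries = new LinkedHashMap<>();
    skipSpaces();
    expect('{');
    boolean separated = true; // the first entry needs no separator
    while (true) {
      if (skipSeparators()) {
        separated = true;
      }
      if (peek() == '}') {
        pos++;
        return entries;
      }
      // One character of lookahead decides; the position is never rewound.
      if (!separated) {
        throw error("expected ';' or newline before next entry");
      }
      parseEntry(entries);
      separated = false;
    }
  }

  /** Parses one `["key"] = 123` entry (assumed shape: string key, integer value). */
  private void parseEntry(Map<String, Integer> entries) {
    expect('[');
    expect('"');
    int keyStart = pos;
    while (peek() != '"' && peek() != '\0') pos++;
    String key = src.substring(keyStart, pos);
    expect('"');
    expect(']');
    skipSpaces();
    expect('=');
    skipSpaces();
    int numStart = pos;
    while (Character.isDigit(peek())) pos++;
    if (numStart == pos) throw error("expected integer");
    entries.put(key, Integer.parseInt(src.substring(numStart, pos)));
  }

  /** Skips blanks, newlines and semicolons; reports whether a ';' or newline was seen. */
  private boolean skipSeparators() {
    boolean seen = false;
    while (true) {
      char c = peek();
      if (c == ';' || c == '\n') {
        seen = true;
      } else if (c != ' ' && c != '\t' && c != '\r') {
        return seen;
      }
      pos++;
    }
  }

  private void skipSpaces() {
    while (peek() == ' ' || peek() == '\t') pos++;
  }

  /** Current character, or '\0' at end of input. */
  private char peek() {
    return pos < src.length() ? src.charAt(pos) : '\0';
  }

  private void expect(char c) {
    if (peek() != c) throw error("expected '" + c + "'");
    pos++;
  }

  private IllegalArgumentException error(String message) {
    return new IllegalArgumentException(message + " at offset " + pos);
  }
}
```

With this sketch, `EntryParserSketch.parse("{ [\"1\"] = 1; [\"2\"] = 2 }")` returns two entries, while the same input without the semicolon throws an `IllegalArgumentException`. The loop only moves forward through the input, which is the property the separator requirement is meant to preserve.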

Our current ANTLR parser is quite slow, and, depending on the codebase, can be the slowest part of running Pkl.

Results from parsing all snippet tests in pkl-core show a roughly 90x speedup (7122 ms / 78.73 ms ≈ 90.5), close to two orders of magnitude:
ANTLR elapsed: 7122ms (7.1s)
New parser elapsed: 78.73ms

This is still a draft: many tests are still failing, and REPL parsing still uses the old parser.


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

adam added the pull-request label 2025-12-30 01:26:54 +01:00
adam closed this issue 2025-12-30 01:26:54 +01:00
Reference: starred/pkl#806