jesboat · 2026-03-07T16:33:45Z

PR Notes

This PR introduces two new functions in racket/private/stx,
stx-find-duplicate-ids and raise-if-duplicate-identifiers. Then,
throughout the codebase, places where duplicate identifier syntax errors
are raised are changed to use one of the two new functions. This causes
those syntax errors to highlight all copies of the duplicate identifier,
instead of just one.

It also centralizes two previous chunks of code which did duplicate ID
checking: the implementation of check-duplicate-identifier and an
implementation in the contract code. The implementation is written in
pure #%kernel in the intent that it will also soon replace a third
duplicate-identifier check in the implementation of let.

The new implementation uses straightforward loops instead of a let/ec
previously used in check-duplicate-identifier. I have not benchmarked
this, but would be quite surprised if the let/ec version wins.

Unlike the implementation in let which uses a list-based loop for
small input sizes, we use a hashtable for all input sizes (except for 0
and 1, for which there can be no duplicates). This is based on some
benchmarking done on lists of various sizes which have no duplicates:
https://gist.github.com/jesboat/a47f090dca2094b5cd36598ebb5e6406

There is an ad-hoc test suite which tests almost all the changes:
https://gist.github.com/jesboat/83e728cd39a818d9a977b7bb1bf3c9f9
It checks that duplicate identifiers are detected and that the
exn:fail:syntax contains the expected number of identifiers all of
which have distinct source locations. It is missing tests for three
changes:

"duplicate identifier" in racket/collects/racket/private/unit/unit-infer.rkt
2nd "duplicate declared identifier" in racket/collects/racket/private/class-internal.rkt
"duplicate identifier" in racket/collects/racket/private/class-internal.rkt

There is no fundamental reason those aren't tested; I'm just not
familiar enough with units and classes that writing a test which tickles
those particular places would have been straightforward.

Commit log

duplicate id handling: new helper functions

Introduce new helper functions:

stx-find-duplicate-ids is like check-duplicate-identifier, except it
returns all copies of the duplicated identifier, in a form which makes
it easy to pass them to raise-syntax-error and get a syntax error which
highlights all uses. (This commit does not introduce a public variant,
although it might be desirable to add one in the future.)

raise-if-duplicate-identifiers does exactly what it says on the tin.

These functions are tested in the rest of the PR series.

duplicate id handling: use new helpers in various places

Use the new helpers in various places across the codebase, where
the changes are unobtrusive. There are ~29 uses total, of which this
PR changes 28.

An uncommitted ad-hoc test suite tests all but 3 (one in racket/unit, two in
racket/class). I don't know how to integrate it into the Racket test infra,
but it's at:
https://gist.github.com/jesboat/83e728cd39a818d9a977b7bb1bf3c9f9

duplicate id handling: use new helpers in contract-in, contract-out

The change to use the new duplicate ID helper functions in the code for
(require (contract-in ...)) and (provide (contract-out ...)) is
not trivial, and is presented here as its own commit.

It is tested by the uncommitted test suite (along with most of the
changes in the previous commit) at:
https://gist.github.com/jesboat/83e728cd39a818d9a977b7bb1bf3c9f9

duplicate id handling: replace check-duplicate-identifier implementation

Replace the old implementation (which was written using an, um, odd style
using an escape continuation) with a simple delegate to the new function.

Add some basic test cases.

rfindler · 2026-03-16T14:40:59Z

racket/collects/racket/contract/private/in-out.rkt

+  (define (signal-dup-syntax-error!)
+    (define-values (dup origs) (stx-find-duplicate-identifiers (reverse (unbox bound-id-list))))
+    (when dup
+     (raise-syntax-error who "duplicate identifiers" stx dup origs)))


I'm not sure if this is a semantics-preserving change. But I am pretty sure we don't need the box (also the comment is out of sync with the box)

On second thought, I think this is okay. Probably okay to use set!, instead of set-box!, tho?

The behavior would change from highlighting all copies of the duplicate identifier to only highlighting an arbitrary two. It felt like that's an acceptable compromise for using the common stx-find-duplicate-identifiers function rather than custom code here.

Modifying stx-find-duplicate-identifiers to find all the duplicates would actually be pretty straightforward though. I can see that resulting in unwieldy errors if somebody has, say, 50 copies of the same identifier. But perhaps we say that if somebody has 50 copies of the same identifier, something has already gone wrong enough that we're okay with giving them an unwieldy error. :)

I'll modify to use set! instead of boxes. (I suspect the only reason I used a box was a vague tendency to prefer value mutation over variable mutation.)

Modifying stx-find-duplicate-identifiers to find all the duplicates would actually be pretty straightforward though. I can see that resulting in unwieldy errors if somebody has, say, 50 copies of the same identifier. But perhaps we say that if somebody has 50 copies of the same identifier, something has already gone wrong enough that we're okay with giving them an unwieldy error. :)

I think it would be great to include them all. We can let the code that renders the error message decide on some cutoff if one is needed, I'd say (since this won't be the only place that can generate big lists). In the case of DrRacket, you hit cmd-. to cycle between them so 50 seems totally fine. 10^6 may be a problem if there's some non-linear algorithm somewhere, tho. Not a new problem!

I'll modify to use set! instead of boxes. (I suspect the only reason I used a box was a vague tendency to prefer value mutation over variable mutation.)

Does it help to know that a set!d variable (probably) has a different representation at runtime that's more "value" like? From my perspective, I prefer the set! because then there's a little bit more information that's "obvious" to the compiler about the fact that the assignment doesn't escape a certain scope. Probably doesn't actually matter in this case, of course!

Current version of the PR includes all copies of the identifier, and uses set! here.

(I think the reason I lean towards boxes is because the mutation becomes an explicit part of the type and becomes explicit at the read sites. After a couple years reading spaghetti pushed me pretty far into the "make things explicit" camp. But I see your point about set! being easier for compilers to analyze.)

(I think the reason I lean towards boxes is because the mutation becomes an explicit part of the type and becomes explicit at the read sites. After a couple years reading spaghetti pushed me pretty far into the "make things explicit" camp. But I see your point about set! being easier for compilers to analyze.)

Oh, I'm totally on the same page in that sense! I just see a set! as explicit; although the syntax written by the programmer doesn't show it at the binding site of a set!d variable it is definitely a property of the syntax whether a variable is mutated or not. Check syntax shows it, for example, and that shows only things evident about the syntax. A box, in contrast can flow elsewhere and that requires more than just information about the syntax; it requires information about what happens at runtime.

Current version of the PR includes all copies of the identifier

Yay, thank you!

rfindler · 2026-03-16T14:44:49Z

I'm loving the better error messages that're going to come out of this! DrRacket's -. keyboard shortcut is going to get a lot more useful!

Introduce new helper functions: `stx-find-duplicate-ids` is like check-duplicate-identifier, except it returns all copies of the duplicated identifier, in a form which makes it easy to pass them to `raise-syntax-error` and get a syntax error which highlights all uses. (This commit does not introduce a public variant, although it might be desirable to add one in the future.) `raise-if-duplicate-identifiers` does exactly what it says on the tin. These functions are tested in the rest of the PR series.

Use the new helpers in various places across the codebase, where the changes are unobtrusive. There are ~29 uses total, of which this PR changes 28. An uncommitted ad-hoc test suite tests all but 3 (one in racket/unit, two in racket/class). I don't know how to integrate it into the Racket test infra, but it's at: https://gist.github.com/jesboat/83e728cd39a818d9a977b7bb1bf3c9f9

The change to use the new duplicate ID helper functions in the code for `(require (contract-in ...))` and `(provide (contract-out ...))` is not trivial, and is presented here as its own commit. It is tested by the uncommitted test suite (along with most of the changes in the previous commit) at: https://gist.github.com/jesboat/83e728cd39a818d9a977b7bb1bf3c9f9

Replace the old implementation (which was written using an, um, odd style using an escape continuation) with a simple delegate to the new function. Add some basic test cases.

jesboat · 2026-03-16T18:01:38Z

Opportunity for discussion: which IDs (first occurrence of the duplicated identifier, second occurrence i.e. first duplicate, other occurrences) should end up where (sub-expr argument or extra-sources argument) in the raise-syntax-error calls.

jesboat marked this pull request as draft March 7, 2026 16:33

jesboat force-pushed the pr-draft-dup-id-functions-and-uses branch 4 times, most recently from a20ce9e to a03cef3 Compare March 13, 2026 14:36

jesboat force-pushed the pr-draft-dup-id-functions-and-uses branch from a03cef3 to e07bf3c Compare March 16, 2026 14:28

rfindler reviewed Mar 16, 2026

View reviewed changes

jesboat force-pushed the pr-draft-dup-id-functions-and-uses branch 2 times, most recently from a378125 to a9455f7 Compare March 16, 2026 17:13

jesboat added 4 commits March 16, 2026 13:14

duplicate id handling: replace check-duplicate-identifier implementation

f4f1fb4

Replace the old implementation (which was written using an, um, odd style using an escape continuation) with a simple delegate to the new function. Add some basic test cases.

jesboat force-pushed the pr-draft-dup-id-functions-and-uses branch from a9455f7 to f4f1fb4 Compare March 16, 2026 17:33

jesboat changed the title ~~draft: duplicate ID handling~~ duplicate ID handling Mar 16, 2026

jesboat marked this pull request as ready for review March 16, 2026 17:56

jesboat mentioned this pull request Mar 24, 2026

discussion/proposal: merging some lower-level racket/base stuff together for perf reasons #5406

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

duplicate ID handling#5464

duplicate ID handling#5464
jesboat wants to merge 4 commits intoracket:masterfrom
jesboat:pr-draft-dup-id-functions-and-uses

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

Uh oh!

PR Notes

Commit log

duplicate id handling: new helper functions

duplicate id handling: use new helpers in various places

duplicate id handling: use new helpers in contract-in, contract-out

duplicate id handling: replace check-duplicate-identifier implementation

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants