Conversation
a20ce9e to
a03cef3
Compare
a03cef3 to
e07bf3c
Compare
| (define (signal-dup-syntax-error!) | ||
| (define-values (dup origs) (stx-find-duplicate-identifiers (reverse (unbox bound-id-list)))) | ||
| (when dup | ||
| (raise-syntax-error who "duplicate identifiers" stx dup origs))) |
There was a problem hiding this comment.
I'm not sure if this is a semantics-preserving change. But I am pretty sure we don't need the box (also the comment is out of sync with the box)
There was a problem hiding this comment.
On second thought, I think this is okay. Probably okay to use set!, instead of set-box!, tho?
There was a problem hiding this comment.
The behavior would change from highlighting all copies of the duplicate identifier to only highlighting an arbitrary two. It felt like that's an acceptable compromise for using the common stx-find-duplicate-identifiers function rather than custom code here.
Modifying stx-find-duplicate-identifiers to find all the duplicates would actually be pretty straightforward though. I can see that resulting in unwieldy errors if somebody has, say, 50 copies of the same identifier. But perhaps we say that if somebody has 50 copies of the same identifier, something has already gone wrong enough that we're okay with giving them an unwieldy error. :)
I'll modify to use set! instead of boxes. (I suspect the only reason I used a box was a vague tendency to prefer value mutation over variable mutation.)
There was a problem hiding this comment.
Modifying stx-find-duplicate-identifiers to find all the duplicates would actually be pretty straightforward though. I can see that resulting in unwieldy errors if somebody has, say, 50 copies of the same identifier. But perhaps we say that if somebody has 50 copies of the same identifier, something has already gone wrong enough that we're okay with giving them an unwieldy error. :)
I think it would be great to include them all. We can let the code that renders the error message decide on some cutoff if one is needed, I'd say (since this won't be the only place that can generate big lists). In the case of DrRacket, you hit cmd-. to cycle between them so 50 seems totally fine. 10^6 may be a problem if there's some non-linear algorithm somewhere, tho. Not a new problem!
I'll modify to use set! instead of boxes. (I suspect the only reason I used a box was a vague tendency to prefer value mutation over variable mutation.)
Does it help to know that a set!d variable (probably) has a different representation at runtime that's more "value" like? From my perspective, I prefer the set! because then there's a little bit more information that's "obvious" to the compiler about the fact that the assignment doesn't escape a certain scope. Probably doesn't actually matter in this case, of course!
There was a problem hiding this comment.
Current version of the PR includes all copies of the identifier, and uses set! here.
(I think the reason I lean towards boxes is because the mutation becomes an explicit part of the type and becomes explicit at the read sites. After a couple years reading spaghetti pushed me pretty far into the "make things explicit" camp. But I see your point about set! being easier for compilers to analyze.)
There was a problem hiding this comment.
(I think the reason I lean towards boxes is because the mutation becomes an explicit part of the type and becomes explicit at the read sites. After a couple years reading spaghetti pushed me pretty far into the "make things explicit" camp. But I see your point about set! being easier for compilers to analyze.)
Oh, I'm totally on the same page in that sense! I just see a set! as explicit; although the syntax written by the programmer doesn't show it at the binding site of a set!d variable it is definitely a property of the syntax whether a variable is mutated or not. Check syntax shows it, for example, and that shows only things evident about the syntax. A box, in contrast can flow elsewhere and that requires more than just information about the syntax; it requires information about what happens at runtime.
There was a problem hiding this comment.
Current version of the PR includes all copies of the identifier
Yay, thank you!
|
I'm loving the better error messages that're going to come out of this! DrRacket's -. keyboard shortcut is going to get a lot more useful! |
a378125 to
a9455f7
Compare
Introduce new helper functions: `stx-find-duplicate-ids` is like check-duplicate-identifier, except it returns all copies of the duplicated identifier, in a form which makes it easy to pass them to `raise-syntax-error` and get a syntax error which highlights all uses. (This commit does not introduce a public variant, although it might be desirable to add one in the future.) `raise-if-duplicate-identifiers` does exactly what it says on the tin. These functions are tested in the rest of the PR series.
Use the new helpers in various places across the codebase, where the changes are unobtrusive. There are ~29 uses total, of which this PR changes 28. An uncommitted ad-hoc test suite tests all but 3 (one in racket/unit, two in racket/class). I don't know how to integrate it into the Racket test infra, but it's at: https://gist.github.com/jesboat/83e728cd39a818d9a977b7bb1bf3c9f9
The change to use the new duplicate ID helper functions in the code for `(require (contract-in ...))` and `(provide (contract-out ...))` is not trivial, and is presented here as its own commit. It is tested by the uncommitted test suite (along with most of the changes in the previous commit) at: https://gist.github.com/jesboat/83e728cd39a818d9a977b7bb1bf3c9f9
Replace the old implementation (which was written using an, um, odd style using an escape continuation) with a simple delegate to the new function. Add some basic test cases.
a9455f7 to
f4f1fb4
Compare
|
Opportunity for discussion: which IDs (first occurrence of the duplicated identifier, second occurrence i.e. first duplicate, other occurrences) should end up where ( |
PR Notes
This PR introduces two new functions in racket/private/stx,
stx-find-duplicate-idsandraise-if-duplicate-identifiers. Then,throughout the codebase, places where duplicate identifier syntax errors
are raised are changed to use one of the two new functions. This causes
those syntax errors to highlight all copies of the duplicate identifier,
instead of just one.
It also centralizes two previous chunks of code which did duplicate ID
checking: the implementation of
check-duplicate-identifierand animplementation in the contract code. The implementation is written in
pure
#%kernelin the intent that it will also soon replace a thirdduplicate-identifier check in the implementation of
let.The new implementation uses straightforward loops instead of a
let/ecpreviously used in
check-duplicate-identifier. I have not benchmarkedthis, but would be quite surprised if the
let/ecversion wins.Unlike the implementation in
letwhich uses a list-based loop forsmall input sizes, we use a hashtable for all input sizes (except for 0
and 1, for which there can be no duplicates). This is based on some
benchmarking done on lists of various sizes which have no duplicates:
https://gist.github.com/jesboat/a47f090dca2094b5cd36598ebb5e6406
There is an ad-hoc test suite which tests almost all the changes:
https://gist.github.com/jesboat/83e728cd39a818d9a977b7bb1bf3c9f9
It checks that duplicate identifiers are detected and that the
exn:fail:syntax contains the expected number of identifiers all of
which have distinct source locations. It is missing tests for three
changes:
There is no fundamental reason those aren't tested; I'm just not
familiar enough with units and classes that writing a test which tickles
those particular places would have been straightforward.
Commit log
duplicate id handling: new helper functions
Introduce new helper functions:
stx-find-duplicate-idsis like check-duplicate-identifier, except itreturns all copies of the duplicated identifier, in a form which makes
it easy to pass them to
raise-syntax-errorand get a syntax error whichhighlights all uses. (This commit does not introduce a public variant,
although it might be desirable to add one in the future.)
raise-if-duplicate-identifiersdoes exactly what it says on the tin.These functions are tested in the rest of the PR series.
duplicate id handling: use new helpers in various places
Use the new helpers in various places across the codebase, where
the changes are unobtrusive. There are ~29 uses total, of which this
PR changes 28.
An uncommitted ad-hoc test suite tests all but 3 (one in racket/unit, two in
racket/class). I don't know how to integrate it into the Racket test infra,
but it's at:
https://gist.github.com/jesboat/83e728cd39a818d9a977b7bb1bf3c9f9
duplicate id handling: use new helpers in contract-in, contract-out
The change to use the new duplicate ID helper functions in the code for
(require (contract-in ...))and(provide (contract-out ...))isnot trivial, and is presented here as its own commit.
It is tested by the uncommitted test suite (along with most of the
changes in the previous commit) at:
https://gist.github.com/jesboat/83e728cd39a818d9a977b7bb1bf3c9f9
duplicate id handling: replace check-duplicate-identifier implementation
Replace the old implementation (which was written using an, um, odd style
using an escape continuation) with a simple delegate to the new function.
Add some basic test cases.