-
Notifications
You must be signed in to change notification settings - Fork 1
feat(mongodb): replace catch-all rules with explicit method definitions for C3 autocomplete #57
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…n this session: ## Session Summary This session continued the implementation of the plan to remove hardcoded method lists by leveraging C3 token candidates from the restructured grammar. ### What was done: **Task 3 — Fix `query_type.go` build failure:** - Rewrote `classifyMethodChain()` to use the new grammar API: `chain.CollectionExplainMethod()`, `chain.AllCursorMethodCall()`, `chain.CollectionMethodCall()` - Rewrote `classifyMethodCall()` → `classifyCollectionMethodCall()` to use `mongodb.ICollectionMethodCallContext` instead of the removed `mongodb.IMethodCallContext` - Removed the `GenericMethod()` check (no longer exists) - Added classification for the 20 new collection methods (bulkWrite, update, mapReduce, findAndModify, hideIndex, unhideIndex, reIndex, createSearchIndex, etc.) - Updated the `incomplete_find_statement` test expectation (parser now recovers `find` better → `Select` instead of `DML`) **Task 3 — Fix `convertCandidates()` token flooding:** - At `db.|`, C3 tokens leak from the `collectionOperation` path. Fixed by only using C3 tokens when `collectionAccess` is NOT present (i.e., at collection/cursor method positions) - At `db.|`, returns only hardcoded database methods + collections (unchanged) - At `db.users.|` and `db.users.find().|`, returns C3 tokens directly **Task 4 — Regenerate test expectations:** - Used record mode to regenerate `test_completion.yaml` - Key behavior change: `db.users.|` now shows 89 methods (both collection + cursor) instead of 34 collection-only. This is correct per the grammar, which allows either type as the first method. - `db.users.find().|` correctly shows only 35 cursor methods (grammar enforces `cursorMethodCall` after first method) **Task 5 — Cleanup verification:** - `isMethodToken()` — still needed for token → function type classification - `getDatabaseMethodCandidates()` — still needed because database methods from `dbStatement` aren't provided by C3 at `db.|` position - `previousNonDotToken()` — still needed for `db.` vs `db[` distinction - All 15 database methods verified as matching `dbStatement` grammar rule **All tests pass, lint clean.**
**1. Grammar fix (`MongoShellParser.g4`)**: Removed `cursorMethodCall` from the first position in `methodChain`. Now only `collectionMethodCall` is valid as the first method after a collection:
```
methodChain
: DOT collectionExplainMethod DOT collectionMethodCall (DOT cursorMethodCall)*
| DOT collectionMethodCall (DOT cursorMethodCall)*
;
```
There's no valid use case for `db.users.sort()` — cursor methods only make sense after a collection method like `find()`.
**2. Removed `isMethodToken` and `getTokenCandidateType`**: In the `else` branch of `convertCandidates`, we're at a method position where every C3 token is a method. So we treat all tokens as `CandidateTypeFunction` with `()` suffix. This eliminates the ~120-line `isMethodToken` switch block entirely.
This also fixed a bug: `getPlanCache` was previously classified as KEYWORD (no `()`). Now it correctly appears as `getPlanCache()` with FUNCTION type.
**3. Completion behavior now**:
- `db.|` → 15 database methods + collections (hardcoded, unchanged)
- `db.users.|` → only **collection methods** (48 methods — no cursor methods like `allowDiskUse`, `sort`, `limit`)
- `db.users.find().|` → only **cursor methods** (35 methods)
**Removed `genericDbMethod`** — the `identifier LPAREN arguments? RPAREN` catch-all rule from `dbStatement`. Same pattern we already removed for collection methods (`genericMethod`). **Added 45 new lexer tokens** for the 48 missing db methods (3 already had tokens: `AGGREGATE`, `WATCH`, `SET_WRITE_CONCERN`). **Added 48 new `dbStatement` alternatives** — each follows the explicit pattern `DB DOT TOKEN LPAREN arguments? RPAREN` with a labeled alternative name. **Updated `getDatabaseMethodCandidates()`** — expanded from 15 to 63 methods (all db methods now listed alphabetically). **Updated `identifier` rule** — all 45 new tokens added so they can still be used as identifiers (collection names, keys, etc.).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR refactors the MongoDB shell parser to replace generic catch-all grammar rules with explicit method definitions to improve C3 autocomplete context-awareness. The changes enable the autocomplete system to suggest only contextually-appropriate methods at each position (e.g., only db methods after "db.", only collection methods after "db.users.", only cursor methods after "db.users.find().").
Changes:
- Removed two catch-all rules (
genericMethodandgenericDbMethod) that were causing autocomplete to flood all keywords as suggestions - Added 61 new lexer tokens (16 collection methods + 45 database methods) to explicitly recognize method names
- Split the generic
methodCallrule into two context-specific rules:collectionMethodCall(52 alternatives) andcursorMethodCall(35 alternatives) - Added 48 new explicit database method alternatives to the
dbStatementrule
Reviewed changes
Copilot reviewed 6 out of 8 changed files in this pull request and generated 93 comments.
Show a summary per file
| File | Description |
|---|---|
| mongodb/MongoShellLexer.g4 | Added 16 collection method tokens (bulkWrite, update, mapReduce, hideIndex, etc.) and 45 database method tokens (auth, createUser, grantRolesToUser, etc.) |
| mongodb/MongoShellParser.g4 | Removed catch-all rules, split methodCall into collectionMethodCall/cursorMethodCall, added 48 explicit db method alternatives, added 20 new collection method parser rules |
| mongodb/mongoshellparser_visitor.go | Updated visitor interface with new Visit methods for all new parser rules (45 db methods, 20 collection methods, split methodCall) |
| mongodb/mongoshellparser_listener.go | Updated listener interface with Enter/Exit methods for all new parser rules |
| mongodb/mongoshellparser_base_visitor.go | Added default implementations of Visit methods for all new rules |
| mongodb/mongoshellparser_base_listener.go | Added default implementations of Enter/Exit methods for all new rules |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| methodChain | ||
| : DOT methodCall (DOT methodCall)* | ||
| : DOT collectionExplainMethod DOT collectionMethodCall (DOT cursorMethodCall)* | ||
| | DOT collectionMethodCall (DOT cursorMethodCall)* | ||
| ; |
Copilot
AI
Jan 30, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The methodChain grammar rule has a potential ambiguity issue. The first alternative DOT collectionExplainMethod DOT collectionMethodCall (DOT cursorMethodCall)* will never be matched because collectionExplainMethod is already included in collectionMethodCall (line 264). This means ANTLR will always choose the second alternative DOT collectionMethodCall (DOT cursorMethodCall)* when parsing explain() chains.
To fix this, collectionExplainMethod should be removed from the collectionMethodCall rule (line 264), so that the special case for explain().find() chains can be properly distinguished from normal method chains.
The upstream parser (bytebase/parser#57) replaced the unified MethodCallContext with CollectionMethodCallContext and CursorMethodCallContext, and removed the GenericMethod catch-all. - Split visitMethodCall into visitCollectionMethodCall and visitCursorMethodCall - Update visitMethodChain to use CollectionMethodCall() and AllCursorMethodCall() instead of AllMethodCall() - Replace GenericMethod fallback with extractMethodNameFromText Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The upstream parser (bytebase/parser#57) replaced the unified MethodCallContext with CollectionMethodCallContext and CursorMethodCallContext, and removed the GenericMethod catch-all. - Split visitMethodCall into visitCollectionMethodCall and visitCursorMethodCall - Update visitMethodChain to use CollectionMethodCall() and AllCursorMethodCall() instead of AllMethodCall() - Replace GenericMethod fallback with extractMethodNameFromText Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Summary
methodCallintocollectionMethodCall(48 methods) andcursorMethodCall(35 methods) so C3 returns context-appropriate completions at each positiongenericMethodandgenericDbMethodcatch-all rules (identifier LPAREN arguments? RPAREN) that caused C3 to flood all keyword tokens as method candidatesdbStatementalternatives to replace the removed catch-allsTest plan
make build && make testinmongodb/directory passesdb.|shows only db methods,db.users.|shows only collection methods,db.users.find().|shows only cursor methods🤖 Generated with Claude Code