Skip to content

WIP Mitigate hallucinations with codeblocks containing Typescript/Web Components/Javascript #64

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 14 commits into
base: main
Choose a base branch
from

Conversation

nelsonwittwer
Copy link
Contributor

@nelsonwittwer nelsonwittwer commented Jul 9, 2025

Why

There is nothing more costly to productivity than LLM hallucinations within codeblocks that accepted into a codebase.

We have mitigated that risk with GraphQL. We need to do the same with Typescript, Web components, and Javascript codeblocks using Shopify's APIs.

What

Even if users aren't using Typescript (web components with polaris app home), we can still use the types of the Typescript APIs that are directly or indirectly used.

This introduces a new validate_typescript_codeblocks MCP tool that accepts a TS package name that will be used to detect hallucinations / errors in codeblocks LLMs generated/modified.

In the case of app home, it will accept the Typescript library @shopify/app-bridge-ui-types that app home uses indirectly as our TS source of truth. We'll configure learn_shopify_api's response to require all the LLM to use this tool to validate any codeblocks they would offer return to a user.

WIP

What's still in question is how we will use TS .

This uses Typia to validate TS without having a user's machine. I need to stress test the implementation with more than the 3 weak sauce evals I've been testing this with.

This previous idea added a temporary TS file and using the user's CLI to typescript compile. I worry about how many more things can go wrong with this case: the user doesn't have typescript installed because they prefer JS/Web components. This could send an agent spiraling and modify the user's machine in a way that isn't necessary. It also has the risk of not cleaning up after itself (it temporariliy creates a TS file to run the typescript compile command.

@nelsonwittwer nelsonwittwer requested a review from a team as a code owner July 9, 2025 20:15
@nelsonwittwer nelsonwittwer changed the title Mitigate hallucinations with codeblocks containing Typescript/Web Components/Javascript WIP Mitigate hallucinations with codeblocks containing Typescript/Web Components/Javascript Jul 11, 2025
Copy link
Contributor

@matteodepalo matteodepalo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few comments but this looks cleaner!

@@ -0,0 +1,995 @@
import { z } from "zod";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've moved from storing big files likes schemas inside dev-mcp and we now store all schemas inside shopify.dev itself and fetch only what we need on demand. Do you think we could move this to shopify.dev and fetch it at runtime when needed? You can take a look at how we fetch and cache graphql schemas to get inspiration here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup! Hassan is going to work on this if I can't get to it sooner.

package.json Outdated
@@ -3,15 +3,15 @@
"version": "1.1.0",
"main": "dist/index.js",
"scripts": {
"build": "tsc && node -e \"require('fs').chmodSync('dist/index.js', '755')\"",
"build:watch": "tsc --watch",
"build": "tspc && node -e \"require('fs').chmodSync('dist/index.js', '755')\"",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the change to tpsc? Was there something failing with tsc?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shit -- great catch -- it was used when I was spiking with typia. I'll revert this

@nelsonwittwer nelsonwittwer force-pushed the typescript_validation branch from 0457de9 to 99d744e Compare July 30, 2025 08:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants