[compiler] source location validator (#35109)

@josephsavona this was briefly discussed in an old thread, lmk your
thoughts on the approach. I have some fixes ready as well but wanted to
get this test case in first... there's some things I don't _love_ about
this approach, but end of the day it's just a tool for the test suite
rather than something for end user folks so even if it does a 70% good
enough job that's fine.

### refresher on the problem
when we generate coverage reports with jest (istanbul), our coverage
ends up completely out of whack due to the AST missing a ton of (let's
call them "important") source locations after the compiler pipeline has
run.

At the moment to get around this, we've been doing something a bit
unorthodox and also running our test suite with istanbul running before
the compiler -- which results in its own set of issues (for eg, things
being memoized differently, or the compiler completely bailing out on
the instrumented code, etc).

before getting in fixes, I wanted to set up a test case to start
chipping away on as you had recommended.

### how it works

The validator basically:
1. Traverses the original AST and collects the source locations for some
"important" node types
- (excludes useMemo/useCallback calls, as those are stripped out by the
compiler)
3. Traverses the generated AST and looks for nodes with matching source
locations.
4. Generates errors for source locations missing nodes in the generated
AST

### caveats/drawbacks

There are some things that don't work super well with this approach. A
more natural test fit I think would be just having some explicit
assertions made against an AST in a test file, as you can just bake all
of the assumptions/nuance in there that are difficult to handle in a
generic manner. However, this is maybe "good enough" for now.

1. Have to be careful what you put into the test fixture. If you put in
some code that the compiler just removes (for eg, a variable assignment
that is unused), you're creating a failure case that's impossible to
fix. I added a skip for useMemo/useCallback.
2. "Important" locations must exactly match for validation to pass.
- Might get tricky making sure things are mapped correctly when a node
type is completely changed, for eg, when a block statement arrow
function body gets turned into an implicit return via the body just
being an expression/identifier.
- This can/could result in scenarios where more changes are needed to
shuttle the locations through due to HIR not having a 1:1 mapping all
the babel nuances, even if some combination of other data might be good
enough even if not 10000% accurate. This might be the _right_ thing
anyways so we don't end up with edge cases having incorrect source
locations.
This commit is contained in:
Nathan
2025-11-12 22:02:46 -05:00
committed by GitHub
parent bbe3f4d322
commit 3a495ae722
6 changed files with 472 additions and 0 deletions

View File

@@ -105,6 +105,7 @@ import {inferMutationAliasingRanges} from '../Inference/InferMutationAliasingRan
import {validateNoDerivedComputationsInEffects} from '../Validation/ValidateNoDerivedComputationsInEffects';
import {validateNoDerivedComputationsInEffects_exp} from '../Validation/ValidateNoDerivedComputationsInEffects_exp';
import {nameAnonymousFunctions} from '../Transform/NameAnonymousFunctions';
import {validateSourceLocations} from '../Validation/ValidateSourceLocations';
export type CompilerPipelineValue =
| {kind: 'ast'; name: string; value: CodegenFunction}
@@ -557,6 +558,10 @@ function runWithEnvironment(
log({kind: 'ast', name: 'Codegen (outlined)', value: outlined.fn});
}
if (env.config.validateSourceLocations) {
validateSourceLocations(func, ast).unwrap();
}
/**
* This flag should be only set for unit / fixture tests to check
* that Forget correctly handles unexpected errors (e.g. exceptions

View File

@@ -364,6 +364,13 @@ export const EnvironmentConfigSchema = z.object({
validateNoCapitalizedCalls: z.nullable(z.array(z.string())).default(null),
validateBlocklistedImports: z.nullable(z.array(z.string())).default(null),
/**
* Validates that AST nodes generated during codegen have proper source locations.
* This is useful for debugging issues with source maps and Istanbul coverage.
* When enabled, the compiler will error if important source locations are missing in the generated AST.
*/
validateSourceLocations: z.boolean().default(false),
/**
* Validate against impure functions called during render
*/

View File

@@ -0,0 +1,206 @@
/**
* Copyright (c) Meta Platforms, Inc. and affiliates.
*
* This source code is licensed under the MIT license found in the
* LICENSE file in the root directory of this source tree.
*/
import {NodePath} from '@babel/traverse';
import * as t from '@babel/types';
import {CompilerDiagnostic, CompilerError, ErrorCategory} from '..';
import {CodegenFunction} from '../ReactiveScopes';
import {Result} from '../Utils/Result';
/**
* IMPORTANT: This validation is only intended for use in unit tests.
* It is not intended for use in production.
*
* This validation is used to ensure that the generated AST has proper source locations
* for "important" original nodes.
*
* There's one big gotcha with this validation: it only works if the "important" original nodes
* are not optimized away by the compiler.
*
* When that scenario happens, we should just update the fixture to not include a node that has no
* corresponding node in the generated AST due to being completely removed during compilation.
*/
/**
* Some common node types that are important for coverage tracking.
* Based on istanbul-lib-instrument
*/
const IMPORTANT_INSTRUMENTED_TYPES = new Set([
'ArrowFunctionExpression',
'AssignmentPattern',
'ObjectMethod',
'ExpressionStatement',
'BreakStatement',
'ContinueStatement',
'ReturnStatement',
'ThrowStatement',
'TryStatement',
'VariableDeclarator',
'IfStatement',
'ForStatement',
'ForInStatement',
'ForOfStatement',
'WhileStatement',
'DoWhileStatement',
'SwitchStatement',
'SwitchCase',
'WithStatement',
'FunctionDeclaration',
'FunctionExpression',
'LabeledStatement',
'ConditionalExpression',
'LogicalExpression',
]);
/**
* Check if a node is a manual memoization call that the compiler optimizes away.
* These include useMemo and useCallback calls, which are intentionally removed
* by the DropManualMemoization pass.
*/
function isManualMemoization(node: t.Node): boolean {
// Check if this is a useMemo/useCallback call expression
if (t.isCallExpression(node)) {
const callee = node.callee;
if (t.isIdentifier(callee)) {
return callee.name === 'useMemo' || callee.name === 'useCallback';
}
if (
t.isMemberExpression(callee) &&
t.isIdentifier(callee.property) &&
t.isIdentifier(callee.object)
) {
return (
callee.object.name === 'React' &&
(callee.property.name === 'useMemo' ||
callee.property.name === 'useCallback')
);
}
}
return false;
}
/**
* Create a location key for comparison. We compare by line/column/source,
* not by object identity.
*/
function locationKey(loc: t.SourceLocation): string {
return `${loc.start.line}:${loc.start.column}-${loc.end.line}:${loc.end.column}`;
}
/**
* Validates that important source locations from the original code are preserved
* in the generated AST. This ensures that Istanbul coverage instrumentation can
* properly map back to the original source code.
*
* The validator:
* 1. Collects locations from "important" nodes in the original AST (those that
* Istanbul instruments for coverage tracking)
* 2. Exempts known compiler optimizations (useMemo/useCallback removal)
* 3. Verifies that all important locations appear somewhere in the generated AST
*
* Missing locations can cause Istanbul to fail to track coverage for certain
* code paths, leading to inaccurate coverage reports.
*/
export function validateSourceLocations(
func: NodePath<
t.FunctionDeclaration | t.ArrowFunctionExpression | t.FunctionExpression
>,
generatedAst: CodegenFunction,
): Result<void, CompilerError> {
const errors = new CompilerError();
// Step 1: Collect important locations from the original source
const importantOriginalLocations = new Map<
string,
{loc: t.SourceLocation; nodeType: string}
>();
func.traverse({
enter(path) {
const node = path.node;
// Only track node types that Istanbul instruments
if (!IMPORTANT_INSTRUMENTED_TYPES.has(node.type)) {
return;
}
// Skip manual memoization that the compiler intentionally removes
if (isManualMemoization(node)) {
return;
}
// Collect the location if it exists
if (node.loc) {
const key = locationKey(node.loc);
importantOriginalLocations.set(key, {
loc: node.loc,
nodeType: node.type,
});
}
},
});
// Step 2: Collect all locations from the generated AST
const generatedLocations = new Set<string>();
function collectGeneratedLocations(node: t.Node): void {
if (node.loc) {
generatedLocations.add(locationKey(node.loc));
}
// Use Babel's VISITOR_KEYS to traverse only actual node properties
const keys = t.VISITOR_KEYS[node.type as keyof typeof t.VISITOR_KEYS];
if (!keys) {
return;
}
for (const key of keys) {
const value = (node as any)[key];
if (Array.isArray(value)) {
for (const item of value) {
if (t.isNode(item)) {
collectGeneratedLocations(item);
}
}
} else if (t.isNode(value)) {
collectGeneratedLocations(value);
}
}
}
// Collect from main function body
collectGeneratedLocations(generatedAst.body);
// Collect from outlined functions
for (const outlined of generatedAst.outlined) {
collectGeneratedLocations(outlined.fn.body);
}
// Step 3: Validate that all important locations are preserved
for (const [key, {loc, nodeType}] of importantOriginalLocations) {
if (!generatedLocations.has(key)) {
errors.pushDiagnostic(
CompilerDiagnostic.create({
category: ErrorCategory.Todo,
reason: 'Important source location missing in generated code',
description:
`Source location for ${nodeType} is missing in the generated output. This can cause coverage instrumentation ` +
`to fail to track this code properly, resulting in inaccurate coverage reports.`,
}).withDetails({
kind: 'error',
loc,
message: null,
}),
);
}
}
return errors.asResult();
}

View File

@@ -12,4 +12,5 @@ export {validateNoCapitalizedCalls} from './ValidateNoCapitalizedCalls';
export {validateNoRefAccessInRender} from './ValidateNoRefAccessInRender';
export {validateNoSetStateInRender} from './ValidateNoSetStateInRender';
export {validatePreservedManualMemoization} from './ValidatePreservedManualMemoization';
export {validateSourceLocations} from './ValidateSourceLocations';
export {validateUseMemo} from './ValidateUseMemo';

View File

@@ -0,0 +1,224 @@
## Input
```javascript
// @validateSourceLocations
import {useEffect, useCallback} from 'react';
function Component({prop1, prop2}) {
const x = prop1 + prop2;
const y = x * 2;
const arr = [x, y];
const obj = {x, y};
const [a, b] = arr;
const {x: c, y: d} = obj;
useEffect(() => {
if (a > 10) {
console.log(a);
}
}, [a]);
const foo = useCallback(() => {
return a + b;
}, [a, b]);
function bar() {
return (c + d) * 2;
}
console.log('Hello, world!');
return [y, foo, bar];
}
```
## Error
```
Found 13 errors:
Todo: Important source location missing in generated code
Source location for VariableDeclarator is missing in the generated output. This can cause coverage instrumentation to fail to track this code properly, resulting in inaccurate coverage reports..
error.todo-missing-source-locations.ts:5:8
3 |
4 | function Component({prop1, prop2}) {
> 5 | const x = prop1 + prop2;
| ^^^^^^^^^^^^^^^^^
6 | const y = x * 2;
7 | const arr = [x, y];
8 | const obj = {x, y};
Todo: Important source location missing in generated code
Source location for VariableDeclarator is missing in the generated output. This can cause coverage instrumentation to fail to track this code properly, resulting in inaccurate coverage reports..
error.todo-missing-source-locations.ts:6:8
4 | function Component({prop1, prop2}) {
5 | const x = prop1 + prop2;
> 6 | const y = x * 2;
| ^^^^^^^^^
7 | const arr = [x, y];
8 | const obj = {x, y};
9 | const [a, b] = arr;
Todo: Important source location missing in generated code
Source location for VariableDeclarator is missing in the generated output. This can cause coverage instrumentation to fail to track this code properly, resulting in inaccurate coverage reports..
error.todo-missing-source-locations.ts:7:8
5 | const x = prop1 + prop2;
6 | const y = x * 2;
> 7 | const arr = [x, y];
| ^^^^^^^^^^^^
8 | const obj = {x, y};
9 | const [a, b] = arr;
10 | const {x: c, y: d} = obj;
Todo: Important source location missing in generated code
Source location for VariableDeclarator is missing in the generated output. This can cause coverage instrumentation to fail to track this code properly, resulting in inaccurate coverage reports..
error.todo-missing-source-locations.ts:8:8
6 | const y = x * 2;
7 | const arr = [x, y];
> 8 | const obj = {x, y};
| ^^^^^^^^^^^^
9 | const [a, b] = arr;
10 | const {x: c, y: d} = obj;
11 |
Todo: Important source location missing in generated code
Source location for VariableDeclarator is missing in the generated output. This can cause coverage instrumentation to fail to track this code properly, resulting in inaccurate coverage reports..
error.todo-missing-source-locations.ts:9:8
7 | const arr = [x, y];
8 | const obj = {x, y};
> 9 | const [a, b] = arr;
| ^^^^^^^^^^^^
10 | const {x: c, y: d} = obj;
11 |
12 | useEffect(() => {
Todo: Important source location missing in generated code
Source location for VariableDeclarator is missing in the generated output. This can cause coverage instrumentation to fail to track this code properly, resulting in inaccurate coverage reports..
error.todo-missing-source-locations.ts:10:8
8 | const obj = {x, y};
9 | const [a, b] = arr;
> 10 | const {x: c, y: d} = obj;
| ^^^^^^^^^^^^^^^^^^
11 |
12 | useEffect(() => {
13 | if (a > 10) {
Todo: Important source location missing in generated code
Source location for ExpressionStatement is missing in the generated output. This can cause coverage instrumentation to fail to track this code properly, resulting in inaccurate coverage reports..
error.todo-missing-source-locations.ts:12:2
10 | const {x: c, y: d} = obj;
11 |
> 12 | useEffect(() => {
| ^^^^^^^^^^^^^^^^^
> 13 | if (a > 10) {
| ^^^^^^^^^^^^^^^^^
> 14 | console.log(a);
| ^^^^^^^^^^^^^^^^^
> 15 | }
| ^^^^^^^^^^^^^^^^^
> 16 | }, [a]);
| ^^^^^^^^^^^
17 |
18 | const foo = useCallback(() => {
19 | return a + b;
Todo: Important source location missing in generated code
Source location for ExpressionStatement is missing in the generated output. This can cause coverage instrumentation to fail to track this code properly, resulting in inaccurate coverage reports..
error.todo-missing-source-locations.ts:14:6
12 | useEffect(() => {
13 | if (a > 10) {
> 14 | console.log(a);
| ^^^^^^^^^^^^^^^
15 | }
16 | }, [a]);
17 |
Todo: Important source location missing in generated code
Source location for VariableDeclarator is missing in the generated output. This can cause coverage instrumentation to fail to track this code properly, resulting in inaccurate coverage reports..
error.todo-missing-source-locations.ts:18:8
16 | }, [a]);
17 |
> 18 | const foo = useCallback(() => {
| ^^^^^^^^^^^^^^^^^^^^^^^^^
> 19 | return a + b;
| ^^^^^^^^^^^^^^^^^
> 20 | }, [a, b]);
| ^^^^^^^^^^^^^
21 |
22 | function bar() {
23 | return (c + d) * 2;
Todo: Important source location missing in generated code
Source location for ReturnStatement is missing in the generated output. This can cause coverage instrumentation to fail to track this code properly, resulting in inaccurate coverage reports..
error.todo-missing-source-locations.ts:19:4
17 |
18 | const foo = useCallback(() => {
> 19 | return a + b;
| ^^^^^^^^^^^^^
20 | }, [a, b]);
21 |
22 | function bar() {
Todo: Important source location missing in generated code
Source location for ReturnStatement is missing in the generated output. This can cause coverage instrumentation to fail to track this code properly, resulting in inaccurate coverage reports..
error.todo-missing-source-locations.ts:23:4
21 |
22 | function bar() {
> 23 | return (c + d) * 2;
| ^^^^^^^^^^^^^^^^^^^
24 | }
25 |
26 | console.log('Hello, world!');
Todo: Important source location missing in generated code
Source location for ExpressionStatement is missing in the generated output. This can cause coverage instrumentation to fail to track this code properly, resulting in inaccurate coverage reports..
error.todo-missing-source-locations.ts:26:2
24 | }
25 |
> 26 | console.log('Hello, world!');
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
27 |
28 | return [y, foo, bar];
29 | }
Todo: Important source location missing in generated code
Source location for ReturnStatement is missing in the generated output. This can cause coverage instrumentation to fail to track this code properly, resulting in inaccurate coverage reports..
error.todo-missing-source-locations.ts:28:2
26 | console.log('Hello, world!');
27 |
> 28 | return [y, foo, bar];
| ^^^^^^^^^^^^^^^^^^^^^
29 | }
30 |
```

View File

@@ -0,0 +1,29 @@
// @validateSourceLocations
import {useEffect, useCallback} from 'react';
function Component({prop1, prop2}) {
const x = prop1 + prop2;
const y = x * 2;
const arr = [x, y];
const obj = {x, y};
const [a, b] = arr;
const {x: c, y: d} = obj;
useEffect(() => {
if (a > 10) {
console.log(a);
}
}, [a]);
const foo = useCallback(() => {
return a + b;
}, [a, b]);
function bar() {
return (c + d) * 2;
}
console.log('Hello, world!');
return [y, foo, bar];
}