-
Notifications
You must be signed in to change notification settings - Fork 79
feat(evasive-transform): add meaning-preserving evasive transform for imports in strings #3026
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
… imports in strings
| if (options && options.preventHtmlCommentRegression) { | ||
| // Workaround for generator creating `-->` from `-- >` in code generated from the AST | ||
| result.code = result.code.replace(/-->/g, '-- >'); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kriskowal @mhofman is this controversial at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a test case covering that this won't affect the contents of strings in the output?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what you mean here. if a string contains an html coment sequence, it'll get disconnected. So there should be nothing left in strings to affect. Is that what you mean? Or what kind of fixture would the test need?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code like const htmlCommentClose = "-->"; (in which the sequence appears as string contents) must not be affected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OH, it would be affected.
It's not meaning-preserving, that's why I made it opt-in.
We've been doing that same transform in LavaMoat for a long time and it didn't bite us (this particular one at least) so I thought we could put it in at the much better time.
Maybe a scarier name is gonna make this feature more palatable.
Anyway, I'll extract it into a separate PR to proceed with the part we most care about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I've replaced this with a potentially cheaper trick that introduces an empty comment inbetween a--/**/>b
Now the only controversy is the new regexp evasion. @gibson042 check it out :)
| @@ -0,0 +1,171 @@ | |||
| const evadeRegexp = /import\s*\(|<!--|-->/g; | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note this now provides multiple evasions in one go
| const multilineimport =\`␊ | ||
| console.log(${a})␊ | ||
| await im${""}port('some-module');␊ | ||
| console.log(${b})␊ | ||
| \`;␊ | ||
| ␊ | ||
| const taggedtemplate = String.dedent\`␊ | ||
| await import('some-module');␊ | ||
| \`;␊ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is there this difference between tagged vs. untagged template literal contents?
And as an aside, I think this test would be better if the input and expected output appeared in the same file (i.e., not using t.snapshot). It's valuable for high volume, but this use seems to come down on the wrong side of the tradeoff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the tag function would see an extra split and empty string value in it. Where without a tag the effects of having an extra hole filled with an empty string are unobservable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know it's more code to get right, but if we're making this change, we should be thorough about it, e.g.:
const taggedtemplate = makeEvadedTemplateApplier(String.dedent)`
await ${IMPORT}('some-module');
`;where IMPORT is a unique global sentinel such as Symbol('import') and makeEvadedTemplateApplier is a global function like
const makeEvadedTemplateApplier =
fn =>
(T0, ...A0) => {
const argCount = A0.length;
let mapped = weakmapGet(evadedTemplateMappings, T0);
if (!mapped) {
const raw0 = T0.raw;
const T1 = [];
const raw1 = [];
const importIndexes = [];
for (let i = 0, j = 0; i <= argCount; ++i, ++j) {
T1[j] = T0[i];
raw1[j] = raw0[i];
while (i < argCount && A0[i] === IMPORT) {
importIndexes[importIndexes.length] = i++;
T1[j] += `import${T0[i]}`;
raw1[j] += `import${raw0[i]}`;
}
}
defineProperty(T1, 'raw', {
...{ writable: false, enumerable: false, configurable: false },
value: freeze(raw1),
);
mapped = freeze({ T1: freeze(T1), importIndexes });
weakmapSet(evadedTemplateMappings, T0, mapped);
}
const { T1, importIndexes } = mapped;
let i = 0;
let j = 0;
let nextImportIndex = j < importIndexes.length && importIndexes[j];
const next = () => {
while (i === nextImportIndex) {
i += 1;
j += 1;
nextImportIndex = j < importIndexes.length && importIndexes[j];
}
if (i < argCount) return { done: false, value: A0[i++] };
return { done: true, value: undefined };
};
const A1 = { [iteratorSymbol]: () => A1, next };
return fn(T1, ...A1);
};(which AFAIK is unobservable, until/unless https://github.com/tc39/proposal-array-is-template-object advances to Stage 4)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We would have to put all that runtime in a single line of code, and it’s a lot. Might have to do some hard trade-off calculus. I’m in favor of making monotonic progress until we hit a cost cliff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know to what extent minification is on the table for injected helpers like the above, but if we are comfortable with it then the cost would be just under 500 bytes, and only necessary when the source text being transformed has one or more tagged template literals with static contents that would be affected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's about the potential issues with sourcemaps or having to find a place to put it. Also, the level of complexity I'm not comfortable with introducing.
It's a complex fix for a very unlikely problem. And if a module contains a tagged template string with an import statement in it, it's likely problematic in many other ways.
HTML in tagged template strings is somewhat common, but as a virtual dom component, where HTML comments don't make sense.
I fail to find an example of tagged template strings containing hardcoded javascript with imports in it.
I'd like to avoid shipping a tagged template string transform for now.
|
I'm considering also ading a transform that changes a RegExp |
boneskull
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems reasonable but I think @gibson042's comment about --> needs addressing
8609251 to
6939389
Compare
| * @param {import('@babel/traverse').NodePath} p | ||
| * @returns {void} | ||
| */ | ||
| export const evadeRegexpLiteral = p => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gibson042 @mhofman this is a new addition to this PR
it's meaning preserving unless the regexp is listing characters to match, including an open brace like this: /[import(]+/ - which as regular expressions go seems unlikely enough. Do we want to avoid that with one extra test to see if a ( exists between matching [] ?
Description
With compartment-mapper introducing support for dynamic imports, the old regexp-based ways in which we evaded import censorship in SES are no longer viable and AST based transform can be the only meaning-preserving one.
Open questions
import (- with whitespace? We could add complexity to the transform or assume they're rare enough and let them hit censorship.splitby regexp unless somebody stops meimport(that are left need not be preserved? Could we use a simple regexp-replace there for a cheaper way to get the same result?eval('var a = import("somemodule")')or intend to ?Security Considerations
Scaling Considerations
AST transform is slower than regexp, but for cases where
import(doesn't exist in a string, it could be faster than a regexp search o the entire string.Documentation Considerations
Testing Considerations
Compatibility Considerations
with the transform being meaning-preserving, this should not be breaking for anyone. Also, considering opt-in