SALTO-315, SALTO-316: Move parser to worker thread #423
Conversation
Codecov Report
@@            Coverage Diff             @@
##           master     #423      +/-   ##
==========================================
- Coverage   94.08%   94.03%   -0.05%
==========================================
  Files         111      112       +1
  Lines        4899     4946      +47
  Branches      873      879       +6
==========================================
+ Hits         4609     4651      +42
- Misses        290      295       +5
Continue to review full report at Codecov.
- Moved calls to the WASM parser to a different thread to avoid blocking the main thread
- Changed the input for parse from a buffer to a string, because in fact we always needed a string
const loadWasmModule = async (): Promise<WebAssembly.Module> => {
  // Relative path from source location
  const modulePath = path.join(__dirname, '..', '..', '..', 'hcl.wasm')
  const data = await readFile(modulePath)
Breaks packaging...
This here should specifically be `fs.readFile` or `fs.readFileSync`: since we're reading a bundled "resource", it should be read from the bundle by the `fs` methods patched by nexe. `file.ts` uses the `fs.promises` API, which is not patched.
Probably best to add the above statement as a comment there :)
I thought the `readFileSync` was a mistake there. Will revert.
`promisify` on `fs.readFile` seems to work.
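For reference, a minimal sketch of that approach, assuming `loadWasmModule` compiles the module from the bytes as its return type suggests (the `promisify` wrapper replaces the `fs.promises` import; the path comes from the diff above):

```typescript
import * as fs from 'fs'
import * as path from 'path'
import { promisify } from 'util'

// Use the callback-style fs.readFile (patched by nexe in the bundled
// executable) rather than the unpatched fs.promises API
const readFile = promisify(fs.readFile)

const loadWasmModule = async (): Promise<WebAssembly.Module> => {
  // Relative path from source location
  const modulePath = path.join(__dirname, '..', '..', '..', 'hcl.wasm')
  const data = await readFile(modulePath)
  return WebAssembly.compile(data)
}
```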
// Generate call ID
const currMaxID = _.max([0, ...this.inFlightIDs]) as number
const callID = currMaxID + 1
this.inFlightIDs.add(callID)
Not sure if this simplifies - instead of having both `inFlightIDs` and `activeCalls`, store a single map of `callID` to `{ worker id, return func }`?
We don't always need the active calls map (single threaded is still an option), but we always need the call IDs.
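For illustration, a rough sketch of the single-map alternative under discussion (`ActiveCall`, `nextCallID`, and the import path are hypothetical names, not from this PR; `HclReturn` is the type used elsewhere in the diff):

```typescript
import _ from 'lodash'
import { HclReturn } from './types' // hypothetical import path

// One map from call ID to everything known about the in-flight call,
// instead of a separate ID set plus worker/return-func bookkeeping
type ActiveCall = {
  workerID?: number // would be absent in single-threaded mode
  resolve: (ret: HclReturn) => void
}

const activeCalls = new Map<number, ActiveCall>()

// Call IDs are still needed either way; only the stored value changes
const nextCallID = (): number => (_.max([0, ...activeCalls.keys()]) as number) + 1
```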
  return context.return as HclReturn
} finally {
  // cleanup call context from global scope
  delete hclParserCall[callId]
See the comment above regarding dynamic delete.
This is used for communication with Go. Not sure we can / want to change how that works as part of this change.
/* istanbul ignore next */
if (parentPort !== null) {
  const parent = parentPort
  const handleCall = async ([func, args]: ['call' | 'stop', unknown[]]): Promise<void> => {
I think you can type it:
['stop'] | ['call', wasmModule: WebAssembly.Module, context: HclCallContext]
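A sketch of how that could look with fully labeled tuples, which TypeScript requires once any tuple element is labeled (`WorkerMessage` and `runHclCall` are hypothetical names; `HclCallContext` is the type from this PR):

```typescript
import { HclCallContext } from './types' // hypothetical import path

type WorkerMessage =
  | [op: 'stop']
  | [op: 'call', wasmModule: WebAssembly.Module, context: HclCallContext]

// Stand-in for whatever actually executes the parser call
declare const runHclCall: (
  wasmModule: WebAssembly.Module, context: HclCallContext
) => Promise<void>

const handleCall = async (message: WorkerMessage): Promise<void> => {
  if (message[0] === 'stop') {
    return // nothing more to destructure for 'stop'
  }
  // TypeScript narrows the tuple here, so both elements are fully typed
  const [, wasmModule, context] = message
  await runHclCall(wasmModule, context)
}
```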
  singleThreaded: boolean
}
export const initializeCore = async (
  args: InitializeArgs = { singleThreaded: true },
IIUC the default value here causes tests to run in single-threaded mode?
If so, seems a bit implicit and buried, maybe add jest global setup?
The default value here causes the parser to run in single-threaded mode by default. That applies to most tests, as well as the vscode extension and generally anywhere else that might want to use the parser other than the CLI.
Unfortunately jest global setup doesn't run in every jest process, only in the main one, so if we change the default (which I think we should not, at least until we see what it does to the extension) and try to use it, we still end up with threaded parsers in the jest workers, and jest times out on a lot of tests.
There is a different method of using setup files that injects additional setup into every test file in a project, but that would have to be implemented separately for every project in the repository as far as I can understand, and IMO it won't really make it any less hidden, because you'd have to know about that feature in jest and look for it specifically.
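For reference, the per-project setup file method mentioned above would look roughly like this (a sketch, not something this PR adopts; file paths and the import are hypothetical):

```typescript
// test/setup.ts - referenced from each project's jest config, e.g.
//   module.exports = { setupFilesAfterEnv: ['<rootDir>/test/setup.ts'] }
import { initializeCore } from '../src' // hypothetical import path

beforeAll(async () => {
  // Pin the parser to single-threaded mode explicitly rather than
  // relying on the default value of InitializeArgs
  await initializeCore({ singleThreaded: true })
})
```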
const go = new Go()
go.env = GO_ENV
// eslint-disable-next-line no-undef
const inst = await WebAssembly.instantiate(wasmModule, go.importObject)
Can't we reuse the instance per worker?
Edit: Looks like this is because the Go main can only handle a single call. If so, is instantiating the module per call time consuming? I think it can be solved by applying the solution suggested here that we talked about, which will remove the `callId` mechanism, removing one stack of logic...
Once the module is loaded, instantiating doesn't take very long (compared to the amount of time it takes to run the Go code), around 25-40ms IIRC.
Edit: actually the number I had in mind included the actual parsing of a small file, so the instantiate probably takes even less...
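A sketch of the compile-once / instantiate-per-call split being measured here (`Go`, `GO_ENV`, and `loadWasmModule` come from the diffs above; `runSingleCall` is a hypothetical name):

```typescript
// Compile once up front - this is the slow part (reading and
// compiling the whole wasm binary)
const wasmModule = await loadWasmModule()

// Instantiate per call - fast by comparison (tens of ms at most, per
// the numbers above), since the Go main can only serve a single call
const runSingleCall = async (): Promise<void> => {
  const go = new Go()
  go.env = GO_ENV
  const inst = await WebAssembly.instantiate(wasmModule, go.importObject)
  await go.run(inst)
}
```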
}

// The following code only runs in the worker thread so it is never reported for coverage
/* istanbul ignore next */
I think the complete lack of coverage here might be slightly dangerous. Is there a way to run this code in tests, in "mock worker" mode?
It is covered, just not reported because, as the comment says, jest doesn't collect coverage data from the worker thread.
@@ -13,7 +13,7 @@ func convertAllDiagnostics(
 	parserErrs hcl.Diagnostics,
 	traversalErrs hcl.Diagnostics,
 ) []interface{} {
-	result := make([]interface{}, 0, len(parserErrs) + len(traversalErrs))
+	result := make([]interface{}, 0, len(parserErrs)+len(traversalErrs))
Is this gofmt?
yes
Trying to use worker pool libraries yielded too many strange errors on thread teardown, so this implementation is just a single worker thread (i.e. no real performance gain).
This can quite easily be changed to run on a worker pool for better performance, but hopefully that won't be very valuable once @roironn is done with the new parser.
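A minimal sketch of that single-worker setup (the worker file name is hypothetical; the message shape follows the `['call' | 'stop', unknown[]]` handler in the diff above):

```typescript
import { Worker } from 'worker_threads'

// One long-lived worker: the main thread stays unblocked, but calls
// are effectively serialized, so there is no parallelism
const worker = new Worker('./hcl_worker.js') // hypothetical worker entry point

// Assumes a single in-flight call at a time; a worker pool would need
// the callID-based dispatch discussed in the review comments above
const callWorker = (args: unknown[]): Promise<unknown> =>
  new Promise(resolve => {
    worker.once('message', resolve)
    worker.postMessage(['call', args])
  })
```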