v8: add v8.startupSnapshot utils

This adds several APIs under the `v8.startupSnapshot` namespace
for specifying hooks into the startup snapshot serialization
and deserialization.

- isBuildingSnapshot()
- addSerializeCallback()
- addDeserializeCallback()
- setDeserializeMainFunction()

PR-URL: https://github.com/nodejs/node/pull/43329
Fixes: https://github.com/nodejs/node/issues/42617
Refs: https://github.com/nodejs/node/issues/35711
Reviewed-By: Chengzhong Wu <legendecas@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
This commit is contained in:
Joyee Cheung
2022-04-18 12:36:36 +08:00
parent 027c2880a6
commit a36a5469c2
15 changed files with 414 additions and 30 deletions

@@ -1179,6 +1179,13 @@ because the `node:domain` module has been loaded at an earlier point in time.
The stack trace is extended to include the point in time at which the
`node:domain` module had been loaded.
<a id="ERR_DUPLICATE_STARTUP_SNAPSHOT_MAIN_FUNCTION"></a>
### `ERR_DUPLICATE_STARTUP_SNAPSHOT_MAIN_FUNCTION`
[`v8.startupSnapshot.setDeserializeMainFunction()`][] could not be called
because it had already been called before.
<a id="ERR_ENCODING_INVALID_ENCODED_DATA"></a>
### `ERR_ENCODING_INVALID_ENCODED_DATA`
@@ -2314,6 +2321,13 @@ has occurred when attempting to start the loop.
Once no more items are left in the queue, the idle loop must be suspended. This
error indicates that the idle loop has failed to stop.
<a id="ERR_NOT_BUILDING_SNAPSHOT"></a>
### `ERR_NOT_BUILDING_SNAPSHOT`
An attempt was made to use operations that can only be used when building
a V8 startup snapshot, even though Node.js isn't building one.
<a id="ERR_NO_CRYPTO"></a>
### `ERR_NO_CRYPTO`
@@ -3501,6 +3515,7 @@ The native call from `process.cpuUsage` could not be processed.
[`url.parse()`]: url.md#urlparseurlstring-parsequerystring-slashesdenotehost
[`util.getSystemErrorName(error.errno)`]: util.md#utilgetsystemerrornameerr
[`util.parseArgs()`]: util.md#utilparseargsconfig
[`v8.startupSnapshot.setDeserializeMainFunction()`]: v8.md#v8startupsnapshotsetdeserializemainfunctioncallback-data
[`zlib`]: zlib.md
[crypto digest algorithm]: crypto.md#cryptogethashes
[debugger]: debugger.md

@@ -876,6 +876,137 @@ Called immediately after a promise continuation executes. This may be after a
Called when the promise receives a resolution or rejection value. This may
occur synchronously in the case of `Promise.resolve()` or `Promise.reject()`.
## Startup Snapshot API
<!-- YAML
added: REPLACEME
-->
> Stability: 1 - Experimental
The `v8.startupSnapshot` interface can be used to add serialization and
deserialization hooks for custom startup snapshots. Currently the startup
snapshots can only be built into the Node.js binary from source.
```console
$ cd /path/to/node
$ ./configure --node-snapshot-main=entry.js
$ make node
# This binary contains the result of the execution of entry.js
$ out/Release/node
```
In the example above, `entry.js` can use methods from the `v8.startupSnapshot`
interface to specify how to save information for custom objects in the snapshot
during serialization and how the information can be used to synchronize these
objects during deserialization of the snapshot. For example, if `entry.js`
contains the following script:
```cjs
'use strict';
const fs = require('fs');
const zlib = require('zlib');
const path = require('path');
const assert = require('assert');
const {
  isBuildingSnapshot,
  addSerializeCallback,
  addDeserializeCallback,
  setDeserializeMainFunction
} = require('v8').startupSnapshot;

const filePath = path.resolve(__dirname, '../x1024.txt');
const storage = {};

assert(isBuildingSnapshot());

addSerializeCallback(({ filePath }) => {
  storage[filePath] = zlib.gzipSync(fs.readFileSync(filePath));
}, { filePath });

addDeserializeCallback(({ filePath }) => {
  storage[filePath] = zlib.gunzipSync(storage[filePath]);
}, { filePath });

setDeserializeMainFunction(({ filePath }) => {
  console.log(storage[filePath].toString());
}, { filePath });
```
The resulting binary will simply print the data deserialized from the snapshot
during startup:
```console
$ out/Release/node
# Prints content of ./test/fixtures/x1024.txt
```
Currently the API is only available to a Node.js instance launched from the
default snapshot. That is, an application deserialized from a user-land
snapshot cannot use these APIs again.
### `v8.startupSnapshot.addSerializeCallback(callback[, data])`
<!-- YAML
added: REPLACEME
-->
* `callback` {Function} Callback to be invoked before serialization.
* `data` {any} Optional data that will be passed to the `callback` when it
gets called.
Add a callback that will be called when the Node.js instance is about to
get serialized into a snapshot and exit. This can be used to release
resources that should not or cannot be serialized or to convert user data
into a form more suitable for serialization.
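As a rough sketch of the resource-release use case, the following registers a
serialize callback that empties an in-memory cache before the snapshot is
written. The helper name `releaseBeforeSnapshot` and its injectable `api`
parameter are illustrative assumptions and not part of Node.js. The guard makes
the helper a no-op outside of snapshot building, where
`addSerializeCallback()` would throw `ERR_NOT_BUILDING_SNAPSHOT`.

```js
'use strict';
const v8 = require('v8');

// Hypothetical helper: `api` defaults to the real namespace but can be
// swapped for a stub when experimenting outside of a snapshot build.
function releaseBeforeSnapshot(cache, api = v8.startupSnapshot) {
  // The namespace may be absent, e.g. in a deserialized application.
  if (!api || !api.isBuildingSnapshot()) return false;
  // `cache` is handed back to the callback as its only argument.
  api.addSerializeCallback((c) => c.clear(), cache);
  return true;
}

const cache = new Map([['greeting', 'hello']]);
releaseBeforeSnapshot(cache);
```

Passing the cache as `data` rather than closing over it keeps the callback
free of outer references, which mirrors the `{ filePath }` style used in the
example above.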
### `v8.startupSnapshot.addDeserializeCallback(callback[, data])`
<!-- YAML
added: REPLACEME
-->
* `callback` {Function} Callback to be invoked after the snapshot is
deserialized.
* `data` {any} Optional data that will be passed to the `callback` when it
gets called.
Add a callback that will be called when the Node.js instance is deserialized
from a snapshot. The `callback` and the `data` (if provided) will be
serialized into the snapshot. They can be used to re-initialize the state
of the application or to re-acquire resources that the application needs
when it is restarted from the snapshot.
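A minimal sketch of re-initializing state, assuming the application keeps
per-process values in a plain object. The helper `refreshOnDeserialize` and
the shape of `state` are hypothetical names chosen for illustration. Values
recorded while the snapshot is built would be stale in the restarted process,
so the callback overwrites them.

```js
'use strict';
const v8 = require('v8');

// Hypothetical helper; `api` can be replaced with a stub for experiments.
function refreshOnDeserialize(state, api = v8.startupSnapshot) {
  if (!api || !api.isBuildingSnapshot()) return false;
  // Both the arrow function and `state` end up inside the snapshot.
  api.addDeserializeCallback((s) => {
    s.pid = process.pid;       // Differs between the build and later runs.
    s.startedAt = Date.now();  // Reset the clock for the new process.
  }, state);
  return true;
}

// Values captured while the snapshot is being built.
const state = { pid: process.pid, startedAt: Date.now() };
refreshOnDeserialize(state);
```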
### `v8.startupSnapshot.setDeserializeMainFunction(callback[, data])`
<!-- YAML
added: REPLACEME
-->
* `callback` {Function} Callback to be invoked as the entry point after the
snapshot is deserialized.
* `data` {any} Optional data that will be passed to the `callback` when it
gets called.
This sets the entry point of the Node.js application when it is deserialized
from a snapshot. This can be called only once in the snapshot building
script. If called, the deserialized application no longer needs an additional
entry point script to start up and will simply invoke the callback along with
the deserialized data (if provided). Otherwise, an entry point script still
needs to be provided to the deserialized application.
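One possible pattern, sketched under the assumption that the same script
should work both as a snapshot entry and as an ordinary script: register the
main function when a snapshot is being built, and call it directly otherwise.
`installMain` and its `api` parameter are hypothetical and only serve the
illustration.

```js
'use strict';
const v8 = require('v8');

// Hypothetical helper; returns a label describing which path was taken.
function installMain(main, data, api = v8.startupSnapshot) {
  if (api && api.isBuildingSnapshot()) {
    // May only happen once per build; a second call throws
    // ERR_DUPLICATE_STARTUP_SNAPSHOT_MAIN_FUNCTION.
    api.setDeserializeMainFunction(main, data);
    return 'registered';
  }
  // Not building a snapshot: behave like a regular entry point.
  main(data);
  return 'ran-directly';
}

installMain(({ name }) => console.log(`hello, ${name}`), { name: 'world' });
```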
### `v8.startupSnapshot.isBuildingSnapshot()`
<!-- YAML
added: REPLACEME
-->
* Returns: {boolean}
Returns `true` if the Node.js instance is being run to build a snapshot.
[HTML structured clone algorithm]: https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Structured_clone_algorithm
[Hook Callbacks]: #hook-callbacks
[V8]: https://developers.google.com/v8/

@@ -53,7 +53,6 @@ function prepareMainThreadExecution(expandArgv1 = false,
setupCoverageHooks(process.env.NODE_V8_COVERAGE);
}
setupDebugEnv();
// Print stack trace on `SIGINT` if option `--trace-sigint` presents.
@@ -84,6 +83,8 @@ function prepareMainThreadExecution(expandArgv1 = false,
initializeDeprecations();
initializeWASI();
require('internal/v8/startup_snapshot').runDeserializeCallbacks();
if (!initialzeModules) {
return;
}

@@ -2,7 +2,10 @@
const { ObjectDefineProperty } = primordials;
const rawMethods = internalBinding('process_methods');
const {
addSerializeCallback,
isBuildingSnapshot
} = require('v8').startupSnapshot;
// TODO(joyeecheung): deprecate and remove these underscore methods
process._debugProcess = rawMethods._debugProcess;
process._debugEnd = rawMethods._debugEnd;
@@ -134,6 +137,12 @@ function refreshStderrOnSigWinch() {
stderr._refreshSize();
}
function addCleanup(fn) {
if (isBuildingSnapshot()) {
addSerializeCallback(fn);
}
}
function getStdout() {
if (stdout) return stdout;
stdout = createWritableStdioStream(1);
@@ -145,12 +154,14 @@ function getStdout() {
process.on('SIGWINCH', refreshStdoutOnSigWinch);
}
internalBinding('mksnapshot').cleanups.push(function cleanupStdout() {
addCleanup(function cleanupStdout() {
stdout._destroy = stdoutDestroy;
stdout.destroy();
process.removeListener('SIGWINCH', refreshStdoutOnSigWinch);
stdout = undefined;
});
// No need to add deserialize callback because stdout = undefined above
// causes the stream to be lazily initialized again later.
return stdout;
}
@@ -164,12 +175,14 @@ function getStderr() {
if (stderr.isTTY) {
process.on('SIGWINCH', refreshStderrOnSigWinch);
}
internalBinding('mksnapshot').cleanups.push(function cleanupStderr() {
addCleanup(function cleanupStderr() {
stderr._destroy = stderrDestroy;
stderr.destroy();
process.removeListener('SIGWINCH', refreshStderrOnSigWinch);
stderr = undefined;
});
// No need to add deserialize callback because stderr = undefined above
// causes the stream to be lazily initialized again later.
return stderr;
}
@@ -256,10 +269,12 @@ function getStdin() {
}
}
internalBinding('mksnapshot').cleanups.push(function cleanupStdin() {
addCleanup(function cleanupStdin() {
stdin.destroy();
stdin = undefined;
});
// No need to add deserialize callback because stdin = undefined above
// causes the stream to be lazily initialized again later.
return stdin;
}

@@ -978,6 +978,8 @@ E('ERR_DOMAIN_CANNOT_SET_UNCAUGHT_EXCEPTION_CAPTURE',
'The `domain` module is in use, which is mutually exclusive with calling ' +
'process.setUncaughtExceptionCaptureCallback()',
Error);
E('ERR_DUPLICATE_STARTUP_SNAPSHOT_MAIN_FUNCTION',
'Deserialize main function is already configured.', Error);
E('ERR_ENCODING_INVALID_ENCODED_DATA', function(encoding, ret) {
this.errno = ret;
return `The encoded data was not valid for encoding ${encoding}`;
@@ -1456,6 +1458,8 @@ E('ERR_NETWORK_IMPORT_BAD_RESPONSE',
"import '%s' received a bad response: %s", Error);
E('ERR_NETWORK_IMPORT_DISALLOWED',
"import of '%s' by %s is not supported: %s", Error);
E('ERR_NOT_BUILDING_SNAPSHOT',
'Operation cannot be invoked when not building startup snapshot', Error);
E('ERR_NO_CRYPTO',
'Node.js is not compiled with OpenSSL crypto support', Error);
E('ERR_NO_ICU',

@@ -9,7 +9,7 @@ const {
const binding = internalBinding('mksnapshot');
const { NativeModule } = require('internal/bootstrap/loaders');
const {
compileSnapshotMain,
compileSerializeMain,
} = binding;
const {
@@ -83,7 +83,7 @@ const supportedModules = new SafeSet(new SafeArrayIterator([
'v8',
// 'vm',
// 'worker_threads',
// 'zlib',
'zlib',
]));
const warnedModules = new SafeSet();
@@ -117,25 +117,22 @@ function main() {
} = require('internal/bootstrap/pre_execution');
prepareMainThreadExecution(true, false);
process.once('beforeExit', function runCleanups() {
for (const cleanup of binding.cleanups) {
cleanup();
}
});
const file = process.argv[1];
const path = require('path');
const filename = path.resolve(file);
const dirname = path.dirname(filename);
const source = readFileSync(file, 'utf-8');
const snapshotMainFunction = compileSnapshotMain(filename, source);
const serializeMainFunction = compileSerializeMain(filename, source);
require('internal/v8/startup_snapshot').initializeCallbacks();
if (getOptionValue('--inspect-brk')) {
internalBinding('inspector').callAndPauseOnStart(
snapshotMainFunction, undefined,
serializeMainFunction, undefined,
requireForUserSnapshot, filename, dirname);
} else {
snapshotMainFunction(requireForUserSnapshot, filename, dirname);
serializeMainFunction(requireForUserSnapshot, filename, dirname);
}
}

@@ -0,0 +1,111 @@
'use strict';
const {
validateFunction,
} = require('internal/validators');
const {
ERR_NOT_BUILDING_SNAPSHOT,
ERR_DUPLICATE_STARTUP_SNAPSHOT_MAIN_FUNCTION
} = require('internal/errors');
const {
setSerializeCallback,
setDeserializeCallback,
setDeserializeMainFunction: _setDeserializeMainFunction,
markBootstrapComplete
} = internalBinding('mksnapshot');
function isBuildingSnapshot() {
// For now this is the only way to build a snapshot.
return require('internal/options').getOptionValue('--build-snapshot');
}
function throwIfNotBuildingSnapshot() {
if (!isBuildingSnapshot()) {
throw new ERR_NOT_BUILDING_SNAPSHOT();
}
}
const deserializeCallbacks = [];
let deserializeCallbackIsSet = false;
function runDeserializeCallbacks() {
while (deserializeCallbacks.length > 0) {
const { 0: callback, 1: data } = deserializeCallbacks.shift();
callback(data);
}
}
function addDeserializeCallback(callback, data) {
throwIfNotBuildingSnapshot();
validateFunction(callback, 'callback');
if (!deserializeCallbackIsSet) {
// TODO(joyeecheung): when the main function handling is done in JS,
// the deserialize callbacks can always be invoked. For now only
// store it in C++ when it's actually used to avoid unnecessary
// C++ -> JS costs.
setDeserializeCallback(runDeserializeCallbacks);
deserializeCallbackIsSet = true;
}
deserializeCallbacks.push([callback, data]);
}
const serializeCallbacks = [];
function runSerializeCallbacks() {
while (serializeCallbacks.length > 0) {
const { 0: callback, 1: data } = serializeCallbacks.shift();
callback(data);
}
// Remove the hooks from the snapshot.
require('v8').startupSnapshot = undefined;
}
function addSerializeCallback(callback, data) {
throwIfNotBuildingSnapshot();
validateFunction(callback, 'callback');
serializeCallbacks.push([callback, data]);
}
function initializeCallbacks() {
// Only run the serialize callbacks in snapshot building mode, otherwise
// they throw.
if (isBuildingSnapshot()) {
setSerializeCallback(runSerializeCallbacks);
}
}
let deserializeMainIsSet = false;
function setDeserializeMainFunction(callback, data) {
throwIfNotBuildingSnapshot();
// TODO(joyeecheung): In lib/internal/bootstrap/node.js, create a default
// main function to run the lib/internal/main scripts and make sure that
// the main function set in the snapshot building process takes precedence.
validateFunction(callback, 'callback');
if (deserializeMainIsSet) {
throw new ERR_DUPLICATE_STARTUP_SNAPSHOT_MAIN_FUNCTION();
}
deserializeMainIsSet = true;
_setDeserializeMainFunction(function deserializeMain() {
const {
prepareMainThreadExecution
} = require('internal/bootstrap/pre_execution');
// This should be in sync with run_main_module.js until we make that
// a built-in main function.
prepareMainThreadExecution(true);
markBootstrapComplete();
callback(data);
});
}
module.exports = {
initializeCallbacks,
runDeserializeCallbacks,
// Exposed to require('v8').startupSnapshot
namespace: {
addDeserializeCallback,
addSerializeCallback,
setDeserializeMainFunction,
isBuildingSnapshot
}
};

@@ -40,6 +40,9 @@ const {
Serializer,
Deserializer
} = internalBinding('serdes');
const {
namespace: startupSnapshot
} = require('internal/v8/startup_snapshot');
let profiler = {};
if (internalBinding('config').hasInspector) {
@@ -372,4 +375,5 @@ module.exports = {
serialize,
writeHeapSnapshot,
promiseHooks,
startupSnapshot
};

@@ -3,6 +3,7 @@
#include "debug_utils-inl.h"
using v8::Context;
using v8::Function;
using v8::Global;
using v8::HandleScope;
using v8::Isolate;
@@ -44,6 +45,13 @@ Maybe<int> SpinEventLoop(Environment* env) {
if (EmitProcessBeforeExit(env).IsNothing())
break;
{
HandleScope handle_scope(isolate);
if (env->RunSnapshotSerializeCallback().IsEmpty()) {
break;
}
}
// Emit `beforeExit` if the loop became alive either after emitting
// event, or after running some callbacks.
more = uv_loop_alive(env->event_loop());
@@ -54,6 +62,11 @@ Maybe<int> SpinEventLoop(Environment* env) {
if (env->is_stopping()) return Nothing<int>();
env->set_trace_sync_io(false);
// Clear the serialize callback even though the JS-land queue should
// be empty at this point so that the deserialized instance won't
// attempt to call into JS again.
env->set_snapshot_serialize_callback(Local<Function>());
env->PrintInfoForSnapshotIfDebug();
env->VerifyNoStrongBaseObjects();
return EmitProcessExit(env);

@@ -34,6 +34,7 @@ using v8::Array;
using v8::Boolean;
using v8::Context;
using v8::EmbedderGraph;
using v8::EscapableHandleScope;
using v8::Function;
using v8::FunctionTemplate;
using v8::HandleScope;
@@ -671,6 +672,26 @@ void Environment::PrintSyncTrace() const {
isolate(), stack_trace_limit(), StackTrace::kDetailed));
}
MaybeLocal<Value> Environment::RunSnapshotSerializeCallback() const {
EscapableHandleScope handle_scope(isolate());
if (!snapshot_serialize_callback().IsEmpty()) {
Context::Scope context_scope(context());
return handle_scope.EscapeMaybe(snapshot_serialize_callback()->Call(
context(), v8::Undefined(isolate()), 0, nullptr));
}
return handle_scope.Escape(Undefined(isolate()));
}
MaybeLocal<Value> Environment::RunSnapshotDeserializeMain() const {
EscapableHandleScope handle_scope(isolate());
if (!snapshot_deserialize_main().IsEmpty()) {
Context::Scope context_scope(context());
return handle_scope.EscapeMaybe(snapshot_deserialize_main()->Call(
context(), v8::Undefined(isolate()), 0, nullptr));
}
return handle_scope.Escape(Undefined(isolate()));
}
void Environment::RunCleanup() {
started_cleanup_ = true;
TRACE_EVENT0(TRACING_CATEGORY_NODE1(environment), "RunCleanup");

@@ -558,6 +558,9 @@ class NoArrayBufferZeroFillScope {
V(promise_hook_handler, v8::Function) \
V(promise_reject_callback, v8::Function) \
V(script_data_constructor_function, v8::Function) \
V(snapshot_serialize_callback, v8::Function) \
V(snapshot_deserialize_callback, v8::Function) \
V(snapshot_deserialize_main, v8::Function) \
V(source_map_cache_getter, v8::Function) \
V(tick_callback_function, v8::Function) \
V(timers_callback_function, v8::Function) \
@@ -1333,6 +1336,10 @@ class Environment : public MemoryRetainer {
void RunWeakRefCleanup();
v8::MaybeLocal<v8::Value> RunSnapshotSerializeCallback() const;
v8::MaybeLocal<v8::Value> RunSnapshotDeserializeCallback() const;
v8::MaybeLocal<v8::Value> RunSnapshotDeserializeMain() const;
// Strings and private symbols are shared across shared contexts
// The getters simply proxy to the per-isolate primitive.
#define VP(PropertyName, StringValue) V(v8::Private, PropertyName)

@@ -485,6 +485,14 @@ MaybeLocal<Value> StartExecution(Environment* env, StartExecutionCallback cb) {
return scope.EscapeMaybe(cb(info));
}
// TODO(joyeecheung): move these conditions into JS land and let the
// deserialize main function take precedence. For workers, we need to
// move the pre-execution part into a different file that can be
// reused when dealing with user-defined main functions.
if (!env->snapshot_deserialize_main().IsEmpty()) {
return env->RunSnapshotDeserializeMain();
}
if (env->worker_context() != nullptr) {
return StartExecution(env, "internal/main/worker_thread");
}

@@ -457,7 +457,7 @@ void SerializeBindingData(Environment* env,
namespace mksnapshot {
static void CompileSnapshotMain(const FunctionCallbackInfo<Value>& args) {
void CompileSerializeMain(const FunctionCallbackInfo<Value>& args) {
CHECK(args[0]->IsString());
Local<String> filename = args[0].As<String>();
Local<String> source = args[1].As<String>();
@@ -485,23 +485,46 @@ static void CompileSnapshotMain(const FunctionCallbackInfo<Value>& args) {
}
}
static void Initialize(Local<Object> target,
Local<Value> unused,
Local<Context> context,
void* priv) {
Environment* env = Environment::GetCurrent(context);
Isolate* isolate = context->GetIsolate();
env->SetMethod(target, "compileSnapshotMain", CompileSnapshotMain);
target
->Set(context,
FIXED_ONE_BYTE_STRING(isolate, "cleanups"),
v8::Array::New(isolate))
.Check();
void SetSerializeCallback(const FunctionCallbackInfo<Value>& args) {
Environment* env = Environment::GetCurrent(args);
CHECK(env->snapshot_serialize_callback().IsEmpty());
CHECK(args[0]->IsFunction());
env->set_snapshot_serialize_callback(args[0].As<Function>());
}
static void RegisterExternalReferences(ExternalReferenceRegistry* registry) {
registry->Register(CompileSnapshotMain);
void SetDeserializeCallback(const FunctionCallbackInfo<Value>& args) {
Environment* env = Environment::GetCurrent(args);
CHECK(env->snapshot_deserialize_callback().IsEmpty());
CHECK(args[0]->IsFunction());
env->set_snapshot_deserialize_callback(args[0].As<Function>());
}
void SetDeserializeMainFunction(const FunctionCallbackInfo<Value>& args) {
Environment* env = Environment::GetCurrent(args);
CHECK(env->snapshot_deserialize_main().IsEmpty());
CHECK(args[0]->IsFunction());
env->set_snapshot_deserialize_main(args[0].As<Function>());
}
void Initialize(Local<Object> target,
Local<Value> unused,
Local<Context> context,
void* priv) {
Environment* env = Environment::GetCurrent(context);
env->SetMethod(target, "compileSerializeMain", CompileSerializeMain);
env->SetMethod(target, "markBootstrapComplete", MarkBootstrapComplete);
env->SetMethod(target, "setSerializeCallback", SetSerializeCallback);
env->SetMethod(target, "setDeserializeCallback", SetDeserializeCallback);
env->SetMethod(
target, "setDeserializeMainFunction", SetDeserializeMainFunction);
}
void RegisterExternalReferences(ExternalReferenceRegistry* registry) {
registry->Register(CompileSerializeMain);
registry->Register(MarkBootstrapComplete);
registry->Register(SetSerializeCallback);
registry->Register(SetDeserializeCallback);
registry->Register(SetDeserializeMainFunction);
}
} // namespace mksnapshot
} // namespace node

@@ -0,0 +1,32 @@
'use strict';
const fs = require('fs');
const zlib = require('zlib');
const path = require('path');
const assert = require('assert');
const {
  isBuildingSnapshot,
  addSerializeCallback,
  addDeserializeCallback,
  setDeserializeMainFunction
} = require('v8').startupSnapshot;

const filePath = path.resolve(__dirname, '../x1024.txt');
const storage = {};

assert(isBuildingSnapshot());

addSerializeCallback(({ filePath }) => {
  console.error('serializing', filePath);
  storage[filePath] = zlib.gzipSync(fs.readFileSync(filePath));
}, { filePath });

addDeserializeCallback(({ filePath }) => {
  console.error('deserializing', filePath);
  storage[filePath] = zlib.gunzipSync(storage[filePath]);
}, { filePath });

setDeserializeMainFunction(({ filePath }) => {
  console.log(storage[filePath].toString());
}, { filePath });

@@ -21,6 +21,7 @@ const expectedModules = new Set([
'Internal Binding fs_event_wrap',
'Internal Binding fs',
'Internal Binding heap_utils',
'Internal Binding mksnapshot',
'Internal Binding messaging',
'Internal Binding module_wrap',
'Internal Binding native_module',
@@ -167,6 +168,7 @@ const expectedModules = new Set([
'NativeModule url',
'NativeModule util',
'NativeModule v8',
'NativeModule internal/v8/startup_snapshot',
'NativeModule vm',
]);