Parsoid: Add the variant proxy #1207
Merged
11 commits
a8fb336
Parsoid: Move the Parsoid class to lib/parsoid.js
56cbde4
Parsoid: Create the modules for both variants
eec926e
Parsoid: Add the variant proxy
3a7cfeb
Parsoid: Use Parsoid/PHP for tests using Beta wikipedia
cce372e
Parsoid: Add fallback for transforms to the proxy
a5d5f5d
Parsoid: Throw errors for unimplemented methods
8cfead2
Parsoid: Stash: Honour our own ETag when retrieving the stash
3d63e77
Minor: Parsoid: No return JSDoc for abstract methods
1665ed0
Parsoid: Tests: Add Parsoid/PHP to the list of allowed URLs
ccdf409
Parsoid: Set content-language and vary if they are missing
cd991c8
Travis: Start Cassandra only for the `cassandra` test target
Parsoid: Add the variant proxy
The proxy allows directing requests to either variant. It loads both
variants' modules internally and uses their operations to complete
requests. It is designed to allow an easy transition from fully using JS
to fully using PHP with no config changes. When first introduced, its
defaults emulate the JS-only scenario. Once the switch is complete,
replacing `sys/parsoid.js` with `sys/parsoid-php.js` in
`projects/sys/default.wmf.yaml`, with no other config change, yields a
fully functional Parsoid/PHP module. The proxy can thus function
properly with only one of the variant modules loaded and configured.
In order to support the transition period, the proxy has three modes of
operation: single, mirror and split. In single mode, only one variant is
used, defined by the `default_variant` configuration value, defaulting
to `js`. This allows us to start using the proxy with no config changes.
In the final stages of the transition (before we remove the proxy), it
can be changed to `php` to only use the PHP variant. The mirror mode is
used to asynchronously mirror traffic to the PHP variant. Requests are
issued to both variants, but only the JS response is returned. The amount
of traffic to be mirrored can be tuned with the `percentage` configuration
parameter. The important caveat here is that only requests for
`/page/{format}` endpoints are mirrored - we cannot do so reliably for
transforms since they rely on stashed content, which is likely not to be
available for the PHP variant. Furthermore, when the proxy is configured
in mirror mode, dependency update events are emitted only for the JS
variant, so as to avoid duplicates. Finally, the split mode is used to
split the traffic between the two variants based on the request domain.
If one of the patterns given in the `pattern` configuration parameter is
matched, then the variant not defined in `default_variant` is used,
otherwise the default one is used. This mode supports the second stage
of the transition, where JS will be authoritative for the majority of
domains, while we will be slowly moving projects one by one (or group by
group) over to using Parsoid/PHP.
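
The routing rules of the three modes can be sketched as two small pure functions. This is a simplified illustration, not the module's actual code: `pickVariant` and `shouldMirror` are hypothetical names, `splitRegex` is assumed to be a `RegExp` built from the `pattern` option, and the random draw is passed in as an argument so that the behaviour is deterministic. The sampling comparison is likewise a simplification.

```javascript
// Hypothetical helper: returns the other variant.
const invert = (v) => (v === 'js' ? 'php' : 'js');

// Hypothetical helper: which variant's response is returned to the client.
// `conf` reuses the option names from the commit message.
function pickVariant(conf, domain) {
    const def = conf.default_variant || 'js';
    if (conf.mode === 'split' && conf.splitRegex && conf.splitRegex.test(domain)) {
        // a matching domain goes to the non-default variant
        return invert(def);
    }
    // single and mirror modes always answer from the default variant
    return def;
}

// Hypothetical helper: whether a request should also be mirrored.
// `draw` is a number in [0, 100), e.g. Math.random() * 100.
function shouldMirror(conf, operation, draw) {
    return conf.mode === 'mirror' &&
        !/transform/.test(operation) &&
        draw < conf.percentage;
}
```

Keeping the decision separate from the request dispatch makes each mode easy to check in isolation, for instance that transforms are never mirrored regardless of `percentage`.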
Apart from these modes, the proxy also supports clients directly telling
it which variant to use. If the incoming request has the
`PARSOID_VARIANT` cookie or the `X-Parsoid-Variant` header set, then the
request is sent directly to that variant regardless of the proxy's mode.
When deciding where to send the request, the proxy gives precedence to
the header in case both are set.
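
The precedence rule for client-selected variants can be illustrated with a minimal sketch. It assumes a plain object of lower-cased request headers; `stickyVariant` is a hypothetical name, and the error thrown here is a generic `Error` rather than the HTTP error the proxy would return.

```javascript
// Hypothetical helper: resolve a client-requested variant from a plain
// headers object (lower-cased header names assumed). Returns undefined
// when the client expressed no preference.
function stickyVariant(headers = {}) {
    // the header wins over the cookie when both are present
    let variant = headers['x-parsoid-variant'];
    if (!variant && headers.cookie) {
        // the cookie name is matched case-insensitively in this sketch
        const match = /(?:^|;\s*)parsoid_variant=([^;]+)/i.exec(headers.cookie);
        if (match) {
            variant = match[1];
        }
    }
    if (!variant) {
        return undefined;
    }
    variant = variant.trim().toLowerCase();
    if (variant !== 'js' && variant !== 'php') {
        throw new Error(`Unknown Parsoid variant: ${variant}`);
    }
    return variant;
}
```

A request carrying both `X-Parsoid-Variant: php` and a `PARSOID_VARIANT=js` cookie would therefore resolve to `php` in this sketch.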
Bug: T230791
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,204 @@ | ||
| 'use strict'; | ||
|
|
||
| const P = require('bluebird'); | ||
| const HyperSwitch = require('hyperswitch'); | ||
|
|
||
| const mwUtil = require('../lib/mwUtil'); | ||
|
|
||
| const HTTPError = HyperSwitch.HTTPError; | ||
| const spec = HyperSwitch.utils.loadSpec(`${__dirname}/parsoid.yaml`); | ||
|
|
||
| const OPERATIONS = [ | ||
| 'getHtml', | ||
| 'getDataParsoid', | ||
| 'getLintErrors', | ||
| 'transformHtmlToHtml', | ||
| 'transformHtmlToWikitext', | ||
| 'transformWikitextToHtml', | ||
| 'transformWikitextToLint', | ||
| 'transformChangesToWikitext' | ||
| ]; | ||
|
|
||
| const invert = (v) => v === 'js' ? 'php' : 'js'; | ||
|
|
||
| class ParsoidProxy { | ||
|
|
||
| constructor(opts = {}) { | ||
| const modOpts = this._initOpts(opts); | ||
| const jsOpts = Object.assign({}, modOpts); | ||
| const phpOpts = Object.assign({}, modOpts); | ||
| delete jsOpts.php_host; | ||
| phpOpts.host = phpOpts.php_host; | ||
| delete phpOpts.php_host; | ||
| this._initMods(jsOpts, phpOpts); | ||
| } | ||
|
|
||
| _initOpts(opts) { | ||
| const retOpts = Object.assign({}, opts); | ||
| retOpts.host = retOpts.host || retOpts.parsoidHost; | ||
| if (!retOpts.host && !retOpts.php_host) { | ||
| throw new Error('Parsoid proxy: no host option specified!'); | ||
| } | ||
| this.options = retOpts.proxy || {}; | ||
| // possible values are 'js' and 'php' | ||
| this.default_variant = this.options.default_variant || 'js'; | ||
| if (!['js', 'php'].includes(this.default_variant)) { | ||
| throw new Error('Parsoid proxy: valid variants are js and php!'); | ||
| } | ||
| // possible values are 'single', 'mirror' and 'split' | ||
| this.mode = this.options.mode || 'single'; | ||
| if (!['single', 'mirror', 'split'].includes(this.mode)) { | ||
| throw new Error('Parsoid proxy: valid modes are single, mirror and split!'); | ||
| } | ||
| this.percentage = parseFloat(this.options.percentage || 0); | ||
| if (isNaN(this.percentage) || this.percentage < 0 || this.percentage > 100) { | ||
| throw new Error('Parsoid proxy: percentage must be a number between 0 and 100!'); | ||
| } | ||
| if (this.percentage === 0 && this.mode === 'mirror') { | ||
| // a special case of mirror mode with 0% is in fact the single mode | ||
| this.mode = 'single'; | ||
| } | ||
| this.splitRegex = mwUtil.constructRegex(this.options.pattern); | ||
| if (!this.splitRegex && this.mode === 'split') { | ||
| // split mode with no pattern is single mode | ||
| this.mode = 'single'; | ||
| this.splitRegex = /^$/; | ||
| } else if (this.mode !== 'split') { | ||
| this.splitRegex = /^$/; | ||
| } | ||
| this.resources = []; | ||
| delete retOpts.parsoidHost; | ||
| delete retOpts.proxy; | ||
| return retOpts; | ||
| } | ||
|
|
||
| _initMods(jsOpts, phpOpts) { | ||
| if (!phpOpts.host) { | ||
| if (this.mode !== 'single') { | ||
| // php_host was not provided but the config expects | ||
| // both modules to be functional, so error out | ||
| throw new Error('Parsoid proxy: expected both host and php_host options!'); | ||
| } | ||
| if (this.default_variant === 'php') { | ||
| phpOpts.host = jsOpts.host; | ||
| delete jsOpts.host; | ||
| } | ||
| } | ||
| if (this.mode === 'mirror') { | ||
| if (this.default_variant === 'php') { | ||
| throw new Error('Parsoid proxy: when mirroring, only js can be the default variant!'); | ||
| } | ||
| // js is the default, so don't let php issue dependency update events | ||
| phpOpts.skip_updates = true; | ||
| } | ||
| this.mods = { | ||
| js: this._addMod('js', jsOpts), | ||
| php: this._addMod('php', phpOpts) | ||
| }; | ||
| } | ||
|
|
||
| _backendNotSupported() { | ||
| throw new HTTPError({ | ||
| status: 400, | ||
| body: { | ||
| type: 'bad_request', | ||
| description: 'Parsoid variant not configured!' | ||
| } | ||
| }); | ||
| } | ||
|
|
||
| _addMod(variant, opts) { | ||
| if (opts.host) { | ||
| const mod = require(`./parsoid-${variant}.js`)(opts); | ||
| // we are interested only in the operations and resources | ||
| this.resources = this.resources.concat(mod.resources); | ||
| return mod.operations; | ||
| } | ||
| // return operations that error out if no host is specified | ||
| const ret = {}; | ||
| OPERATIONS.forEach((o) => { | ||
| ret[o] = this._backendNotSupported; | ||
| }); | ||
| return ret; | ||
| } | ||
|
|
||
| _getStickyVariant(hyper, req) { | ||
| let variant = hyper._rootReq.headers['x-parsoid-variant'] || | ||
| req.headers['x-parsoid-variant']; | ||
| if (!variant && hyper._rootReq.headers.cookie) { | ||
| const match = /parsoid_variant=([^;]+)/i.exec(hyper._rootReq.headers.cookie); | ||
| if (match) { | ||
| variant = match[1]; | ||
| } | ||
| } | ||
| if (!variant) { | ||
| return undefined; | ||
| } | ||
| variant = variant.toLowerCase(); | ||
| if (!['js', 'php'].includes(variant)) { | ||
| throw new HTTPError({ | ||
| status: 400, | ||
| body: { | ||
| type: 'bad_request', | ||
| description: `Parsoid variant ${variant} not configured!` | ||
| } | ||
| }); | ||
| } | ||
| return variant; | ||
| } | ||
|
|
||
| _req(variant, operation, hyper, req, setHdr = true) { | ||
| if (setHdr) { | ||
| req.headers = req.headers || {}; | ||
| req.headers['x-parsoid-variant'] = variant; | ||
| } | ||
| return this.mods[variant][operation](hyper, req) | ||
| .then((res) => { | ||
| res.headers = res.headers || {}; | ||
| res.headers['x-parsoid-variant'] = variant; | ||
| return P.resolve(res); | ||
| }); | ||
| } | ||
|
|
||
| doRequest(operation, hyper, req) { | ||
| let variant = this._getStickyVariant(hyper, req); | ||
| if (variant) { | ||
| // the variant has been set explicitly by the client, honour it | ||
| return this._req(variant, operation, hyper, req); | ||
| } | ||
| variant = this.default_variant; | ||
| // mirror mode works only for getFormat, since for mirroring | ||
| // transforms we would need to be sure we have the php output | ||
| // stashed | ||
| if (this.mode === 'mirror' && !/transform/.test(operation)) { | ||
| if (Math.round(Math.random() * 100) <= this.percentage) { | ||
| // issue an async request to the second variant and | ||
| // don't wait for the return value | ||
| this._req(invert(variant), operation, hyper, req, false) | ||
| .catch((e) => hyper.logger.log(`info/parsoidproxy/${invert(variant)}`, e)); | ||
| } | ||
| } | ||
| // we can now safely check simply where to direct the request using | ||
| // splitRegex because it won't match anything for any mode other than split | ||
| variant = this.splitRegex.test(req.params.domain) ? invert(variant) : variant; | ||
| return this._req(variant, operation, hyper, req); | ||
| } | ||
|
|
||
| getOperations() { | ||
| const ret = {}; | ||
| OPERATIONS.forEach((o) => { | ||
| ret[o] = this.doRequest.bind(this, o); | ||
| }); | ||
| return ret; | ||
| } | ||
|
|
||
| } | ||
|
|
||
| module.exports = (options = {}) => { | ||
| const ps = new ParsoidProxy(options); | ||
| return { | ||
| spec, | ||
| operations: ps.getOperations(), | ||
| resources: ps.resources | ||
| }; | ||
| }; |
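
For orientation, the options read by `_initOpts` in the diff above suggest module configurations along the following lines, shown here as plain JavaScript objects. Only the option names (`host`, `php_host`, `proxy.mode`, `proxy.default_variant`, `proxy.percentage`, `proxy.pattern`) come from the diff. The host URLs and the domain are made up for illustration, and the pattern syntax accepted by `mwUtil.constructRegex` is not shown in this diff, so that value is an assumption.

```javascript
// Hypothetical hosts; only the option names come from the diff above.
const JS_HOST = 'http://parsoid-js.example:8000';
const PHP_HOST = 'http://parsoid-php.example:8001';

const exampleOptions = {
    // defaults: single mode, everything goes to the JS variant
    single: { host: JS_HOST },
    // mirror a share of /page/{format} requests to PHP, return JS responses
    mirror: {
        host: JS_HOST,
        php_host: PHP_HOST,
        proxy: { mode: 'mirror', percentage: 10 }
    },
    // send matching domains to PHP, everything else to JS
    split: {
        host: JS_HOST,
        php_host: PHP_HOST,
        proxy: {
            mode: 'split',
            default_variant: 'js',
            pattern: ['test.wikipedia.org'] // assumed pattern format
        }
    }
};
```

Per the checks in `_initMods`, the mirror and split configurations need both `host` and `php_host`, while single mode works with one host alone.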
This is unrelated I believe, but huge kudos for finding this :)