Admin Guide
Configuration
Setting a proxy for global use
To set up a global HTTP(S) or SOCKS5 proxy, use the following Storm command:
> playwright.setup.proxy myproxy:8080 --username $lib.globals.get(proxy:user) --password $lib.globals.get(proxy:pass)
Setting Playwright proxy for all users
Dependencies
Synapse-Playwright requires the following Power-Ups to be installed:
Name : synapse-fileparser
Version: >=4.2.1,<5.0.0
Desc : Synapse-FileParser is required in order to use the HTMLtoJSON API.
Permissions
Package (synapse-playwright) defines the following permissions:
power-ups.playwright.user : Allows a user to instantiate a Playwright browser and page. ( default: false )
power-ups.playwright.proxy : Allows a user to override the global proxy configuration. ( default: false )
You may add rules to users or roles directly from Storm:
> auth.user.addrule fred power-ups.playwright.user
Added rule power-ups.playwright.user to user fred.
or:
> auth.role.addrule ninjas power-ups.playwright.proxy
Added rule power-ups.playwright.proxy to role ninjas.
Exported Storm APIs
Synapse-Playwright exports the playwright.api module as a Storm API.
page(loadurl=(null), conf=(null))
---------------------------------
Create a new Page share.
Args:
loadurl (str or None): Optional URL to load on Page creation
conf (dict or None): Optional configuration to provide to the browser and page
Returns:
(bool, str or Page, dict): Tuple of (ok, Page or err mesg, excinfo)
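A minimal sketch of calling page() from Storm and unpacking its return tuple. It assumes the module is loaded with $lib.import(); the URL and variable names are placeholders chosen for illustration.

```storm
// Import the exported Storm API module.
$api = $lib.import(playwright.api)

// page() returns a tuple of (ok, Page or error message, excinfo).
($ok, $page, $excinfo) = $api.page(loadurl="https://example.com")

if $ok {
    $lib.print("Page created.")
} else {
    // On failure the second element holds the error message.
    $lib.warn("Failed to create page: {mesg}", mesg=$page)
}
```

The calling user needs the power-ups.playwright.user permission for this to succeed.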
htmlToJson(page, template)
--------------------------
Extract JSON from the Page HTML using FileParser.
Args:
page (Page): The Page share
template (dict): The FileParser htmlToJson template
Returns:
(bool, any, dict): Tuple of (ok, template data or err string, excinfo)
extractTables(page, table_selector=(null))
------------------------------------------
Extract table data from the HTML tables in the Page.
Args:
page (Page): The Page share
table_selector (str or None): Optional CSS selector to select the tables to extract. (Defaults to 'table')
Returns:
(bool, any, dict): Tuple of (ok, list of tables data or err string, excinfo)
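As an illustration, the following sketch loads a page and then extracts its tables using the default selector. The URL is a placeholder, and because the layout of each table's data is not described here, the example only prints whatever is returned.

```storm
$api = $lib.import(playwright.api)

($ok, $page, $excinfo) = $api.page(loadurl="https://example.com/report")
if $ok {
    // Omitting table_selector falls back to the default of 'table'.
    ($ok, $tables, $excinfo) = $api.extractTables($page)
    if $ok {
        for $table in $tables {
            $lib.print("table: {t}", t=$table)
        }
    } else {
        // On failure the second element holds the error string.
        $lib.warn("Table extraction failed: {mesg}", mesg=$tables)
    }
} else {
    $lib.warn("Failed to create page: {mesg}", mesg=$page)
}
```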
spawn(page)
-----------
Spawn a new Page that shares a context with the existing Page.
Args:
page (Page): The Page to spawn from
Returns:
(bool, str or Page, dict): Tuple of (ok, Page or err string, excinfo)
getMeta()
---------
Get meta information about the Playwright browser configuration.
Returns:
dict: Dictionary with the meta key/value pairs under a "browser" key.
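A short sketch of reading the browser metadata, assuming the module is imported with $lib.import(). The keys inside the "browser" value are not listed here, so the example prints the whole value.

```storm
$api = $lib.import(playwright.api)

// getMeta() returns a dict rather than an (ok, value, excinfo) tuple.
$meta = $api.getMeta()
$lib.print("browser: {b}", b=$meta.browser)
```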
wget2pdf(url, conf=(null), err_for_status=(false), norm=(false))
----------------------------------------------------------------
Load a page from a URL and save the generated PDF file to the Axon and the Cortex.
Response details can be accessed via the "resp" key in the info dict.
Args:
url (str): The URL to load.
conf (dict or None): Optional configuration to provide to the browser and page.
err_for_status (bool): Return an error if response code != 200 (default: false).
norm (bool): Normalize the page to optimize the PDF representation (default: false).
Returns:
(bool, str or storm:node, dict): Tuple of (ok, inet:urlfile or err, response info).
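The call can be sketched as follows, assuming a placeholder URL and illustrative variable names. Setting err_for_status makes any response code other than 200 come back as an error.

```storm
$api = $lib.import(playwright.api)

($ok, $node, $info) = $api.wget2pdf("https://example.com", err_for_status=$lib.true)
if $ok {
    // Response details are available under the "resp" key of the info dict.
    $lib.print("response: {r}", r=$info.resp)
    // On success the second element is the inet:urlfile node.
    yield $node
} else {
    $lib.warn("wget2pdf failed: {mesg}", mesg=$node)
}
```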
Node Actions
Synapse-Playwright provides the following node actions in Optic:
Name : wget2pdf
Desc : Save a generated PDF as an inet:urlfile
Forms: inet:url
Onload Events
Synapse-Playwright does not use any onload events.