extract_url_components
Extract components from a URL.
Let’s say we have http://www.cwi.nl:80/%7Eguido/Python.html;a=2;b=3?c=4,2&d=e#anchor as our URL.
Then these components will be the following:
scheme
: URL scheme specifier (http)

domain
: Network location part (www.cwi.nl:80)

path
: Hierarchical path (/%7Eguido/Python.html)

params
: Parameters for last path element (a=2;b=3)

query
: Query component (c=4,2&d=e)

fragment
: Fragment identifier (anchor)
For more information about these components, see the URL parsing section of Python’s urllib documentation.
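To make the breakdown above concrete, the following sketch splits the same URL with Python's standard `urllib.parse.urlparse`. It illustrates the components only. It is not the step's implementation, and the `components` dictionary is a name introduced here for illustration, with `domain` mapped to what urllib calls `netloc`.

```python
from urllib.parse import urlparse

url = "http://www.cwi.nl:80/%7Eguido/Python.html;a=2;b=3?c=4,2&d=e#anchor"
parts = urlparse(url)

# Map urllib's field names to the component names used on this page.
# "domain" corresponds to urllib's "netloc" (host plus optional port).
components = {
    "scheme": parts.scheme,
    "domain": parts.netloc,
    "path": parts.path,
    "params": parts.params,
    "query": parts.query,
    "fragment": parts.fragment,
}

for name, value in components.items():
    print(f"{name}: {value}")
```

Note that the path is returned percent-encoded (`%7E` is not decoded to `~`), which matches the values listed above.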
Usage
The following example shows how the step can be used in a recipe.
Use http as the default scheme.
Inputs & Outputs
The following are the inputs expected by the step and the outputs it produces. These are generally columns (ds.first_name), datasets (ds or ds[["first_name", "last_name"]]) or models (referenced by name e.g. "churn-clf").
Configuration
The following parameters can be used to configure the behaviour of the step by including them in a JSON object as the last “input” to the step, i.e. step(..., {"param": "value", ...}) -> (output).
URL Default Scheme. To add a scheme prefix (http, https…) to URLs that don’t have one, specify it here. To leave such URLs without a scheme, use null instead.
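The effect of a default scheme can be sketched in plain Python. The helper `with_default_scheme` and its `default` argument are hypothetical names used only for illustration, and the step itself may handle edge cases differently. The sketch assumes a URL "has a scheme" when it starts with a scheme name followed by `://`.

```python
import re
from typing import Optional
from urllib.parse import urlparse

# A scheme name followed by "://", e.g. "http://" or "https://".
_HAS_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*://")


def with_default_scheme(url: str, default: Optional[str] = "http") -> str:
    """Prefix `url` with `default` if it has no scheme (hypothetical helper).

    Passing default=None leaves the URL untouched, mirroring the null setting.
    """
    if default is None or _HAS_SCHEME.match(url):
        return url
    return f"{default}://{url}"


bare = "www.cwi.nl/index.html"

# Without a scheme, urllib treats the whole string as a path, so the
# domain component comes back empty.
domain_without_scheme = urlparse(bare).netloc

# With the default scheme added, the domain is recognised.
fixed = with_default_scheme(bare)
domain_with_scheme = urlparse(fixed).netloc

print(repr(domain_without_scheme), repr(domain_with_scheme))
```

This shows why a default scheme is useful: for a scheme-less URL such as the one above, a standard parser puts the host into the path, so the domain would otherwise be empty.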