- It will allow the resulting column to be used by steps only accepting the new type, e.g. when casting a column of concatenated texts to the "url" type, so that it may be used where URLs are expected (e.g. the step fetch_url_content).
- It will change any values not conformant with the new type to the missing value (NaN). E.g., casting a column of mixed data containing numbers to the "number" type will replace all values that cannot be read as numbers with NaN.
Depending on the desired type (the "type" parameter, e.g. "number", "category" etc.), the step accepts different configuration parameters. See the subsections under Parameters below for further details.
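The NaN behaviour described above can be illustrated outside of a recipe. The following is only a minimal Python sketch of the idea, not the step's actual implementation; the function name cast_to_number is hypothetical, and what counts as "readable as a number" is assumed here to be whatever Python's float() accepts.

```python
import math


def cast_to_number(values):
    """Convert each value to a float; anything unreadable becomes NaN.

    Hypothetical helper for illustration only (not part of the step's API).
    """
    result = []
    for value in values:
        try:
            result.append(float(value))
        except (TypeError, ValueError):
            # Non-conformant values are replaced by the missing value.
            result.append(math.nan)
    return result


# A column of mixed data: some entries are numbers, others are not.
mixed = ["3", 4.5, "abc", None, "7e2"]
numbers = cast_to_number(mixed)
```

In this sketch the entries "abc" and None end up as NaN, while the remaining entries are kept as floats.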
Usage
The following example shows how the step can be used in a recipe.
Examples
E.g. to simply convert a text column to a category column, set "type": "category" in the step's configuration.
Inputs & Outputs
The following are the inputs expected by the step and the outputs it produces. These are generally columns (ds.first_name), datasets (ds or ds[["first_name", "last_name"]]) or models (referenced by name e.g. "churn-clf").
Inputs
The column you wish to cast.
Outputs
A new column with original data cast to the desired type.
Configuration
The following parameters can be used to configure the behaviour of the step by including them in a json object as the last “input” to the step, i.e. step(..., {"param": "value", ...}) -> (output).
Parameters
Desired semantic type of the converted data.
Make data numerical with "type": "number".
Separator to mark the decimal part.
Use "." or "," to indicate how decimal values are separated when parsing text strings into numerical format. It is automatically assumed that the other character is used as the thousands separator. E.g. "decimal": "." assumes that the period "." is used to separate decimals and "," thousands, as in the number string “12,173.12”.
Values must be one of the following: "." or ",".
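To make the effect of the decimal separator concrete, here is a minimal Python sketch of how a text string might be parsed under each setting. It is an illustration under stated assumptions, not the step's real parser: the function name parse_number is hypothetical, and returning NaN for unparseable text is assumed to follow the general casting behaviour described earlier.

```python
import math


def parse_number(text, decimal="."):
    """Parse a number string using the given decimal separator.

    Hypothetical helper: the other character of the pair "." / ","
    is treated as the thousands separator and simply dropped.
    """
    if decimal not in (".", ","):
        raise ValueError('decimal must be "." or ","')
    thousand = "," if decimal == "." else "."
    # Drop thousands separators, then normalise the decimal mark to ".".
    cleaned = text.strip().replace(thousand, "").replace(decimal, ".")
    try:
        return float(cleaned)
    except ValueError:
        # Assumption: unreadable text becomes the missing value.
        return math.nan


english_style = parse_number("12,173.12", decimal=".")
continental_style = parse_number("12.173,12", decimal=",")
unreadable = parse_number("n/a", decimal=".")
```

Both number strings represent the same value once the matching separator is chosen; picking the wrong one would silently change the magnitude, which is why the setting matters.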
Separator to mark the thousands.
Use "." or "," to indicate how thousands are separated when parsing text strings into numerical format. It is automatically assumed that the other character is used as the decimal separator. E.g. "thousand": "." assumes that the period "." is used to separate thousands and "," decimals, as in the number string “12.173,12”.
Values must be one of the following: "." or ",".
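Since each of the two separator parameters implies the other, a configuration only needs one of them. The Python sketch below shows one way that relationship could be resolved. The function name resolve_separators is hypothetical; the fallback used when neither key is present and the precedence of "decimal" when both are given are assumptions, as the text above does not specify either case.

```python
def resolve_separators(config):
    """Return a (decimal, thousand) pair from a configuration dict.

    Hypothetical helper: whichever key is given determines the other,
    because the two separators are always "." and "," in some order.
    """
    other = {".": ",", ",": "."}
    if "decimal" in config:
        decimal = config["decimal"]
        if decimal not in other:
            raise ValueError('"decimal" must be "." or ","')
        return decimal, other[decimal]
    if "thousand" in config:
        thousand = config["thousand"]
        if thousand not in other:
            raise ValueError('"thousand" must be "." or ","')
        return other[thousand], thousand
    # Assumption: fall back to "." for decimals when nothing is configured.
    return ".", ","


from_thousand = resolve_separators({"type": "number", "thousand": "."})
from_decimal = resolve_separators({"type": "number", "decimal": "."})
```

Here from_thousand treats "," as the decimal mark (matching “12.173,12”), while from_decimal treats "." as the decimal mark (matching “12,173.12”).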