Bots From Extension: cfxdm

Data Management

This extension provides 174 bots.

Bot @dm:add-bounded-dataset

Bot Position In Pipeline: Source Sink

This is a bot that adds a bounded dataset. Bounded datasets are bound to a pre-defined schema so that data is always validated against a set of rules.

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
name*	Text		Name of the dataset
schema_name*	Text		Name of the schema to bind this dataset

Bot @dm:add-checksum

Bot Position In Pipeline: Sink

Add checksum to input dataframe. Checksum can be by rows-only or rows and then the entire dataset.

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
checksum_type	Text	dataset	Compute checksum for by row only or by row and then entire dataset. Valid values are 'rows-only', 'dataset'
row_checksum_column	Text	rda_row_checksum	Output column for computed row level checksum. If the column already exists, it will be replaced and not included in the checksum computation.
data_checksum_column	Text	rda_data_checksum	Output column for computed checksum for entire dataset. If the column already exists, it will be replaced and not included in the checksum computation.
key	Text		Optional key to be used in the computed hash.

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:add-checksum checksum_type='rows-only' &
            row_checksum_column='id'

Playground

Try this in RDA Playground

Bot @dm:add-missing-columns

Bot Position In Pipeline: Sink

Add columns if not found in the input

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
columns*	Text		Comma separated list of column names to be added if not already exist in input
value	Text		Value to be assigned if columns are not found in input. Default is None.

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:add-missing-columns columns = 'Serial_Number' &
            value = 'SSI151909K5'

Playground

Try this in RDA Playground

Example Pipelines Using this Bot

Bot @dm:add-schema

Bot Position In Pipeline: Source Sink

This is a bot that adds json schema to the system. The datasets can be bound to this schema so that adding/editing any rows to dataset are automatically validated against this schema.

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
schema_file*	Text		File path or URL of the json schema file
name*	Text		Name of the json schema

Bot @dm:add-template

Bot Position In Pipeline: Source Sink

Add a formatting template with 'name' and contents downloaded from a 'url'. If the template already exists, it will overwrite.

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Description
url*	Text	URL to contents of the Jinja2 style formatting template
name*	Text	Name of the formatting template
description	Text	Formattiong template description

Bot @dm:addrow

Bot Position In Pipeline: Sink

Append a row to input dataframe, with specified column = value parameters

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

This bot only accepts wildcard parameters. All name = 'value' parameters are passed to the bot.

Example usage:

@dm:empty
    --> @dm:addrow column1 = "Value1" & column2 = "Value2"

Example Pipelines Using this Bot

Bot @dm:apply-alert-rules

Bot Position In Pipeline: Source Sink

Unselect nodes in stack of given node types if there is no right link available

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
name*	Text		Name of the alert ruleset to apply
timestamp_column	Text		Name of the column to be used as timestamp

Bot @dm:apply-data-model

Bot Position In Pipeline: Sink

Apply specified data model to input dataframe. Example model name 'assetLCMMaster'

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
model*	Text		Name of the model to apply to input dataframe
removeUnmapped	Text	no	Remove columns that are not in the model. Specify 'yes' or 'no'
apply_for_empty	Text	no	Apply value for empty string ('yes' or 'no')

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:save name = "sample_servicenow_incident"

--> @c:new-block
    --> @dm:recall  name="sample_servicenow_incident"
    --> @dm:apply-data-model  model="assetLCMMaster" &
            removeUnmapped="no"
    --> @dm:save  name="sample_servicenow_incident"

Bot @dm:apply-snmp-trap-template

Bot Position In Pipeline: Sink

Apply template to incoming SNMP Trap objects. Template must be available in RDA Object repository.

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
timestamp_col*	Text		Column which has timestamp at which SNMP Trap was received. Must be in UTC epoch milliseconds format
version_col	Text		Column which has SNMP version.
address_col*	Text		Column which has IPAddress of the SNMP Trap source
varbinds_col*	Text		Column which has varbind list. Should be list of dict objects
template_folder	Text	snmp_trap_templates	Name of the folder for RDA Objects which contains the SNMP Trap template
template_name	Text	traps	Name of the RDA Object which has the SNMP Trap template

Bot @dm:apply-template-all-rows

Bot Position In Pipeline: Sink

Apply specified formatting template for all input rows and produce one rendered row

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Description
template_name*	Text	Name of the formatting template to be applied
output_col*	Text	Output column
status_col	Text	Template parsing status column. If not specified and any errors will cause pipeline to abort.

Example Pipelines Using this Bot

sample-incident-analytics

Bot @dm:apply-template-by-row

Bot Position In Pipeline: Sink

Apply specified formatting template for each input dataframe row and produce rendered output

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Description
template_name*	Text	Name of the formatting template to be applied
output_col*	Text	Output column
status_col	Text	Template parsing status column. If not specified and any errors will cause pipeline to abort.

Example Pipelines Using this Bot

Bot @dm:apply-topology-rci

Bot Position In Pipeline: Sink

Apply topology based Root Cause Inference model

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
node_weights_dict	Text		Name of the node weights dictionary. Expects columns node_id and weight
link_weights_dict	Text		Name of the link weights dictionary. Expects columns link_type and weight
severity_weights_dict	Text		Name of the severity weights dictionary. Expects columns severity and weight
select_top	Text	1	How many high score nodes to select. Default is 1
stack_name*	Text		Name of the stack

Bot @dm:bin

Bot Position In Pipeline: Sink

Create bins for numerical 'column' and bins specified by 'bins' parameter

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
column*	Text		Numerical value column
bins*	Text		Comma separated list of numerical values representing bins

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
      --> @dm:hist timestamp = 'sys_created_on' &
            interval = '30d'
      --> *dm:filter * get count as 'tcount'
      --> @dm:bin column = 'tcount' &
            bins = "0,1,5,10,100"

Playground

Try this in RDA Playground

Bot *dm:bookmark-list

Bot Position In Pipeline: Source

List of saved bookmarks

This bot expects a Full CFXQL.

Bot applies the Query on the data that is already loaded from previous bot or from a source.

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'

--> @c:new-block
    --> *dm:bookmark-list

Playground

Try this in RDA Playground(rda_docs_env)

Bot @dm:build-hierarchy

Bot Position In Pipeline: Sink

Builds relationships between the entities and populates hierarchy keys in the newly created 'hierarchy' column

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Description
entity_key_column*	Text	Attribute that represents the unique entity key column
relation_key_column*	Text	A relationship key column that should match to the entity_key_column
hierarchy_end_key_column	Text	Attribute that represent the hierarchy end key column to stop hierarchy building
hierarchy_end_value	Text	Hierarchy end value to stop hierarchy building
include_column_to_primary_key	Text	Column name that will be added to the entity key column that makes uniqueness to build the hierarchy

Bot @dm:change-time-format

Bot Position In Pipeline: Sink

Change datetime from one format to another for all specified columns

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Description
columns*	Text	Comma separated list of column names
from_format*	Text	From format, one of: datetimestr s ms ns
to_format*	Text	To format, one of: datetimestr s ms ns. Can also specify a custom format expression.

Bot @dm:check-columns

Bot Position In Pipeline: Sink

Check input columns for specific list of columns that must exist or must not exist and take an action

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Description
must_contain	Text	Comma separated list of columns which must exist in the input.
must_not_contain	Text	Comma separated list of columns which must not exist in the input.
action*	Text	Action to take if either of the column checks fails. Must be one of 'fail','skip-block', 'skip-pipeline'

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:check_columns must_contain = 'active,activity_due,approval,assigned_to' &
            must_not_contain ='assignment_group,business_duration' &
            action ='fail'

Playground

Try this in RDA Playground(rda_docs_env)

Bot @dm:check-integrity

Bot Position In Pipeline: Sink

Check integrity of input data using 'rules' dataset and save results to 'errors' dataset

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
rules*	Text		Name of the rules dataset
failfast	Text	yes	Specify 'yes' to abort quickly on first error or 'no' to keep validating rules even when some rules fail
errors*	Text		Name of the output errors dataset
failpipeline	Text	no	Specify 'no' fail the entire pipeline on errors, 'yes' to keep executing

Bot *dm:cohort-list

Bot Position In Pipeline: Source

List of Cohorts

This bot expects a Full CFXQL.

Bot applies the Query on the data that is already loaded from previous bot or from a source.

Bot @dm:cohort-load

Bot Position In Pipeline: Source Sink

Load cohort specified by 'name'

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
name*	Text		Name of the cohort to load

Bot @dm:concat

Bot Position In Pipeline: Source

Concatenate set of saved dataframes ('names'). Each dataframe must have been saved using dm:save

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
names*	Text		Name of the saved datasets (regex)
return_empty	Text	no	Return an empty dataframe if an error occurs loading the dataset

Example usage:

@dm:empty
    --> @dm:addrow  Id = 198273 &
            Name = Jhon &
            Age = 50
    --> @dm:save name = "concat_Dataset_1"

--> @c:new-block
    --> @dm:empty
    --> @dm:addrow  Id = 191217 &
            Name = Don &
            Age = 35
    --> @dm:addrow Id = 187654 &
            Name = Alex &
            Age = 30
    --> @dm:save name = "concat_Dataset-2"

--> @c:new-block
    --> @dm:concat  names = "concat_Data.*"
    --> @dm:save name = "concat_dataset"

Playground

Try this in RDA Playground

Example Pipelines Using this Bot

aws-dependency-mapper-inner-pipeline

Bot @dm:concat-input-dataset

Bot Position In Pipeline: Sink

Concatenate set of saved dataframes ('names') with the input dataset. Each dataframe must have been saved using dm:save

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
names*	Text		Name of the saved datasets (regex)
return_empty	Text	no	Return an empty dataframe if an error occurs loading the dataset

Bot @dm:content-to-object

Bot Position In Pipeline: Sink

Convert data from a column into objects

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Description
input_content_column*	Text	Name of the column in input that contains the data
output_column*	Text	Column name where object names will be inserted
output_folder*	Text	Folder name where objects will be stored

Bot @dm:copy-columns

Bot Position In Pipeline: Sink

copy values from one column to other

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
from*	Text		Get the value from specified column or columns (comma separated)
to*	Text		Store the value to specified column or columns (comma separated). The specified number of columns should match 'from' column(s).
func	Text		Supported operations are: [strip, upper, lower, append, lstrip, rstrip, replace, split, join, len]
value	Text	,	Specify a value for 'split & join' functions, specify 'oldvalue' and 'newvalue' for replace function. Default value is set to Comma.
prefix	Text		if function is append,Specify the string that has to append at the beginning
suffix	Text		if function is append,Specify the string that has to append at the end

Bot @dm:copy-config

Bot Position In Pipeline: Source Sink

Copy RDA Object to local file in worker, typically used to update a configuration file on a mounted folder

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Description
src_object*	Text	Object name
src_folder*	Text	Folder name on the object storage
dest_file*	Text	location of the destination file
backup_dir	Text	If dest_file exists, copy it to this backup directory

Bot @dm:counter

Bot Position In Pipeline: Sink

Adds COUNTER to each row of the input dataset

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Bot @dm:create-cohorts

Bot Position In Pipeline: Source Sink

Create cohorts from input stack

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
stack_name*	Text		Name of the Stack to use
groupby*	Text		comma separated column names to do groupby
cohort_name_prefix	Text	cohort	Cohort name prefix to use
cfxql_filter	Text		cfxql filter to apply on stack data

Bot @dm:create-logarchive-repo

Bot Position In Pipeline: Source Sink

Create logarchive repository on RDA Platform Minio, if not created already

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
repo*	Text		Name of the Log Archive repository to be created. If a repo already exists with this name, bot will not perform any action.
prefix*	Text		Object prefix on the platform Minio.
retention	Text	0	Retention period in number of days. If set to 0, RDA will not manage the log archive lifecycle.

Example Pipelines Using this Bot

dli-process-synthetic-syslogs

Bot @dm:create-persistent-stream

Bot Position In Pipeline: Source Sink

Create a Persistent Stream if not already created

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
name*	Text		Name of the Persistent Stream to create.
index_name	Text		Optional index name in OpenSearch. If not specified, index name will be automatically created from stream name.
retention_days	Text	31	Retention period in number of days. If set to 0, RDA will not manage the persistent stream lifecycle.
timestamp_column	Text		Name of timestamp column. Optional.
unique_cols	Text		Comma separated list of columns to be used as unique columns to make stream updatable

Example Pipelines Using this Bot

Bot @dm:create-zipfile

Bot Position In Pipeline: Source

Zip the contents of the given folder and place it at the specified location.

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
folder_name*	Text		Folder name to create a zip file
zipfile_name	Text		File name to the created zipfile. If the zipfile name is not specified, it will take it from the folder_name
save_to_location	Text	False	Location to place the created zip file. If it is not specified, zipfile will save at the given folder_name

Bot @dm:dataset-location

Bot Position In Pipeline: Source Sink

Get the location information for a previously saved dataset

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
name*	Text		Name of the dataset

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:save name = 'servicenow_incidents_example'
    --> @dm:dataset-location name = 'servicenow_incidents_example'

Playground

Try this in RDA Playground(rda_docs_env)

Bot @dm:dedup

Bot Position In Pipeline: Sink

Dedup rows using specified 'columns'

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
columns	Text		Comma separated list of column names. Default: All columns
keep	Text	first	Specify which duplicates (if any) to keep.Choose from first or last

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:dedup columns = 'state'

Playground

Try this in RDA Playground

Example Pipelines Using this Bot

aws-dependency-mapper-inner-pipeline

Bot @dm:delete-dataset

Bot Position In Pipeline: Sink

Delete a previously saved dataset

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
dataset_column	Text	dataset	Column Name of the dataset to delete

Bot @dm:describe

Bot Position In Pipeline: Sink

Describe the input dataframe using optional 'columns' attribute

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
columns	Text		Comma separated list of column names. Default all columns

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
   -->@dm:describe  columns = 'assignment_group'

Playground

Try this in RDA Playground

Bot @dm:diff

Bot Position In Pipeline: Sink

Compare input dataset against a 'base_dataset'

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
base_dataset*	Text		Name of the base dataset to compare the input dataset
key_cols*	Text		Comma separated columns to identity each row
exclude	Text		Exclude columns in the diff (regex pattern)
keep_data	Text	no	Keep the data columns in the diff output ('yes' or 'no')

Example usage:

@dm:empty
    --> @dm:addrow product="cucm" &
            id="1" &
            ip="10.95.112.50"
    --> @dm:save name="base_ds"

--> @c:new-block
    --> @dm:empty
    --> @dm:addrow product="host" &
            id="1" &
            ip="10.95"
    --> @dm:diff base_dataset="base_ds" &
            key_cols="id"

Playground

Try this in RDA Playground(rda_docs_env)

Bot @dm:dns-ip-to-name

Bot Position In Pipeline: Sink

Perform reverse DNS lookup to map IP Addresses to Hostnames on specified columns

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
from_cols*	Text		Comma separated list of columns with IP Address values
to_cols*	Text		Comma separated list of column names to store resolved Hostnames.
keep_value	Text	no	If lookup fails, store original value if 'yes' Or null if 'no'
num_threads	Text	5	Number of threads. Must be in the range of 1 to 20
additional_records	Text	false	Get additional domain names. true/false
record_type	Text	PTR	Comma separated record types for additional records. ex: PTR,A,CNAME

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:add-missing-columns columns = 'ipaddress' &
            value = '10.95.158.1'
    --> @dm:dns-ip-to-name from_cols = "ipaddress" &
            to_cols = "name_addr"

Playground

Try this in RDA Playground(rda_docs_env)

Bot @dm:dns-name-to-ip

Bot Position In Pipeline: Sink

Perform DNS lookup to map Hostnames to IP Addresses on specified columns

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
from_cols*	Text		Comma separated list of columns with Hostnames
to_cols*	Text		Comma separated list of column names to store resolved IP Addresses.
keep_value	Text	no	If lookup fails, store original value if 'yes' Or null if 'no'
num_threads	Text	5	Number of threads. Must be in the range of 1 to 20
additional_records	Text	false	Get additional ip addresses list. true/false

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:add-missing-columns columns = 'named_addr' &
            value = 'ip-10-95-158-1.us-west-2.compute.internal'
    --> @dm:dns-name-to-ip from_cols = "named_addr" &
            to_cols = "ipaddress"

Playground

Try this in RDA Playground(rda_docs_env)

Bot @dm:drop-null-columns

Bot Position In Pipeline: Sink

Drop columns with a specified % of null values

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
keep_columns	Text		Column name regex pattern. All columns matching this pattern must remain in output even if they have nulls
threshold	Text	100.0	Percent threshold for Null (NaN) values for each column. If % of null exceed this threshold they will be removed from output. Value must be > 0 and <= 100.0. Default is 100
empty_is_null	Text	no	Treat empty strings with white spaces only as Nulls. Valid values are 'yes' or 'no'. Default is 'no'

Bot @dm:dropnull

Bot Position In Pipeline: Sink

Drop rows if specified 'columns' have null values

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
columns*	Text		Comma separated list of column names

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:dropnull
            columns = 'assigned_to,assignment_group'

Playground

Try this in RDA Playground

Bot @dm:empty

Bot Position In Pipeline: Source Sink

Create an empty dataframe with optional columns

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
columns	Text		Comma separated list of column names to be included in the empty dataframe. By default no columns are included.

Example usage:

@dm:empty
    --> @dm:addrow
            ip_address = "10.95.103.125" &
            name = "john"

Playground

Try this in RDA Playground(rda_docs_env)

Example Pipelines Using this Bot

Bot @dm:enrich

Bot Position In Pipeline: Sink

Enrich the input dataframe using a saved dictionary dataset

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
dict*	Text		Name of the saved dataset to be used as dictionary
src_key_cols*	Text		Comma separated list of column names in input to use for join
dict_key_cols*	Text		Comma separated list of column names in dict to use for join
enrich_cols*	Text		Comma separated list of column names to bring from dictionary
enrich_cols_as	Text		Comma separated list of new column names in the enriched output
suffixes	Text	_x,_y	Comma separated list of suffixes to add to overlapping column names in left and right respectively
indicator	Text	False	Enable indicator to add a column to the output DataFrame called _merge with information on the source of each row. The column can be given a different name by providing a string argument
how_type	Text	left	Specify the type of merge to be performed example:(right, outer, inner).By default left merge is performed
dedup_dict	Text	yes	Disable the dedup_dict to not to drop the duplicate rows from dict_key_columns
case_insensitive	Text	no	Perform case insensitive match on the key values
replace_values	Text	no	If enabled actual column value will be replaced with _x or _y values if not null by dropping _x,_y columns
cache	Text	yes	Cache the dict for future recalls. 'yes' or 'no'.
cache_refresh_seconds	Text	120	Refresh the cache (if new update available) after specified seconds.

Example usage:

@dm:empty
    --> @dm:addrow IP="123" &
            name="john"
    --> @dm:save name="dataset_one"

--> @c:new-block
    --> @dm:empty
    --> @dm:addrow place="Hyderabad" &
            employee_id="145" &
            name = "john"
    --> @dm:save name="dataset_two"

--> @c:new-block
    --> @dm:recall name="dataset_one"
    --> @dm:enrich dict="dataset_two" &
            src_key_cols="name" &
            dict_key_cols="name" &
            enrich_cols="place,employee_id"

Playground

Try this in RDA Playground(rda_docs_env)

Example Pipelines Using this Bot

aws-dependency-mapper-inner-pipeline

Bot @dm:enrich-conditional

Bot Position In Pipeline: Sink

Enrich the input dataframe using a saved dataset based on CFXQL condition

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
dict*	Text		Name of the saved dataset to be used as dictionary
condition*	Text		Condition must be a valid CFXQL expression that is applied to input dataset. Supports GET operation for column filtering.
enrich_cols	Text		Comma separated list of enriched columns from dictionary if the rule matches. This will be applied after the condition parameter is applied.
enrich_cols_as	Text		Rename the enrich columns, should be specified in the same order as in enrich_cols param. This will be applied after the condition parameter is applied.
return_status	Text	no	Add a column 'meta_enrich_status' to the output with the status of the enrichment. If set, failures will be captured in this column.
cache	Text	no	Cache the result for future recalls. 'yes' or 'no'
cache_refresh_seconds	Text	120	Refresh the cache (if new update available) after specified seconds

Bot @dm:enrich-using-ip-cidr

Bot Position In Pipeline: Sink

Enrich the input dataframe using a saved dictionary dataset. Match IP address in input dataframe with CIDRs specified in the dictionary

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Description
dict*	Text	Name of the saved dataset to be used as dictionary
src_ip_col*	Text	Column name in input dataframe which has IPv4 or IPv6 address
dict_cidr_col*	Text	Comma separated list of column names in dict to use for join
enrich_cols*	Text	Comma separated list of column names to bring from dictionary
enrich_cols_as	Text	Comma separated list of new column names in the enriched output

Example usage:

@dm:empty
    --> @dm:addrow IP="19.23.97.102" and
            serial_number="11363"
    --> @dm:addrow IP="19.23.97.103" and
            serial_number="11364"
    --> @dm:save name="dataset_one"

--> @c:new-block
    --> @dm:empty
    --> @dm:addrow place="Hyderabad" and
            employee_id="13212" and
            name = "john" and
            ip_address ="19.23.97.102"
    --> @dm:addrow place="pune" and
            employee_id="13210" and
            name = "joy" and
            ip_address ="19.23.97.103"
    --> @dm:save name="dataset_two"

--> @c:new-block
    --> @dm:recall name="dataset_one"
    --> @dm:enrich-using-ip-cidr dict = 'dataset_two' and
            src_ip_col = 'IP' and
            dict_cidr_col = 'ip_address' and
            enrich_cols  ='place'

Playground

Try this in RDA Playground(rda_docs_env)

Bot @dm:enrich-using-ip-cidr-multi-proc

Bot Position In Pipeline: Sink

Enrich the input dataframe using a saved dictionary dataset. Match IP address in input dataframe with CIDRs specified in the dictionary. Use specified number of processes.

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
dict*	Text		Name of the saved dataset to be used as dictionary
src_ip_col*	Text		Column name in input dataframe which has IPv4 or IPv6 address
dict_cidr_col*	Text		Comma separated list of column names in dict to use for join
enrich_cols*	Text		Comma separated list of column names to bring from dictionary
enrich_cols_as	Text		Comma separated list of new column names in the enriched output
_max_procs	Text	0	Maximum number of CPUs to use. 0 means all available CPUs. Value should be >= 1

Bot @dm:enrich-using-pstream

Bot Position In Pipeline: Sink

Enrich the input dataframe using a persistent stream

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
dict*	Text		Name of the persistent stream to be used as dictionary
src_key_cols*	Text		Comma separated list of column names in input to use for join
dict_key_cols*	Text		Comma separated list of column names in dict to use for join
enrich_cols*	Text		Comma separated list of column names to bring from dictionary
enrich_cols_as	Text		Comma separated list of new column names in the enriched output
batch_lookup	Text		Specify how many unique rows to look up in the dictionary at a time. Example: 50. This option typically improves performance when the dictionary is very large.
suffixes	Text	_x,_y	Comma separated list of suffixes to add to overlapping column names in left and right respectively
indicator	Text	False	Enable indicator to add a column to the output DataFrame called _merge with information on the source of each row. The column can be given a different name by providing a string argument
how_type	Text	left	Specify the type of merge to be performed example:(right, outer, inner).By default left merge is performed
dedup_dict	Text	yes	Disable the dedup_dict to not to drop the duplicate rows from dict_key_columns
replace_values	Text	no	If enabled actual column value will be replaced with _x or _y values if not null by dropping _x,_y columns

Bot @dm:enrich-using-rule-dict

Bot Position In Pipeline: Sink

Enrich using rule based dictionary which contains 'rule' column

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
dict*	Text		Name of the saved dataset to be used as rules dictionary. dictionary should contain rule_id and rule column
rule_id_column	Text	rule_id	Rule ID column in dictionary. rules will be sorted in ascending order using this column
rule_column	Text	rule	Rule column in dictionary. Rule must be a valid CFXQL expression that is applied in input dataset.
enrich_columns	Text		Comma separated list of enriched columns from dictionary if the rule matches
template_columns	Text		Comma separated list of template column names. At least one of enrich_columns or template_columns must be specified.
cache	Text	yes	Cache the dict for future recalls. 'yes' or 'no'.
cache_refresh_seconds	Text	120	Refresh the cache (if new update available) after specified seconds.

Example Pipelines Using this Bot

Bot @dm:eval

Bot Position In Pipeline: Sink

Map values using evaluate function. Specify one of more column = 'expression' pairs

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
_skip_errors	Text	no	Specify 'yes' or 'no'. If 'yes', do not bailout if any of the row processing results in error. Check meta_eval_message field when it continues with an error.

This bot also accepts wildcard parameters. Any additional name = 'value' parameters are passed to the bot.

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:eval  state_descr = "'state is ' + (state)"

Playground

Try this in RDA Playground

Example Pipelines Using this Bot

Bot @dm:eval-multi-proc

Bot Position In Pipeline: Sink

Map values using evaluate function. Uses all available CPU cores to do parallel processing. Specify one of more column = 'expression' pairs

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
_max_procs	Text	0	Maximum number of CPUs to use. 0 means all available CPUs. Value should be >= 1
_skip_errors	Text	no	Specify 'yes' or 'no'. If 'yes', do not bailout if any of the row processing results in error. Check meta_eval_message field when it continues with an error.

This bot also accepts wildcard parameters. Any additional name = 'value' parameters are passed to the bot.

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:eval-multi-proc  state_descr = "'state is ' + (state)" &
            _max_procs = 2

Playground

Try this in RDA Playground(rda_docs_env)

Bot @dm:event-sampling

Bot Position In Pipeline: Source Sink

Sample events to create training dataset for classification

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
time_column	Text	timestamp	Event occurred time column
events_spread_dataset*	Text		Events Spread dataset name
critical_event_mnemonic_columns*	Text		Comma separated list of column names to use to filter critical events
sample_events_query	Text	*	CFXQL query to fetch related events in duration period
duration_hours	Text	1	Duration to fetch related events for each critical event
groupby*	Text		Column name to perform groupby to get counts

Bot @dm:eventcorr-intra-group

Bot Position In Pipeline: Sink

Compute noise reduction for each group using 'groupby', 'created', 'window' columns

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
groupby*	Text		Comma separated list of columns to do the grouping
timestamp*	Text		Timestamp column name, typically event created timestamp
unit	Text		Timestamp unit. Default is None means timestamp is a string. other units are s,ms,ns
id	Text		Identity column, if not specified will use timestamp column
window	Text	15	Sliding time window that groups events that occur within the window (in minutes). Multiple windows may be specified as comma separated list
window_type	Text	moving	Window type 'moving' or 'fixed'
group_label_dataset	Text		If specified, correlated group assignments will be written to the specified dataset

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:eventcorr-intra-group groupby = "state,description" &
            timestamp = "sys_created_on"

Playground

Try this in RDA Playground(rda_docs_env)

Bot @dm:eventzoning

Bot Position In Pipeline: Sink

Compute event zones using 'groupby', 'created', 'resolved' columns

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
groupby*	Text		Comma separated list of columns to do the grouping
created*	Text		Timestamp column name, typically event created timestamp
resolved	Text		Timestamp column name, typically event closed or resolved timestamp
unit	Text		Timestamp unit. Default is None means timestamp is a string. other units are s,ms,ns
id	Text		Identity column, if not specified will use timestamp column
freq	Text	5%	Frequency threshold for zoning. Default 5%. If % is ommitted it will be taken as absolute count threshold
mttr	Text	1d	MTTR threshold for zoning. Example: 1d, 2h, 90m, 9000s

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:eventzoning groupby = "state,approval" &
            created = "time_worked"

Playground

Try this in RDA Playground(rda_docs_env)

Bot @dm:explode

Bot Position In Pipeline: Sink

Explode a 'column' into rows by splitting the value using a 'sep' separator

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
column*	Text		Name of the column to explode into rows
sep	Text	,	Separator (default is comma)

Example Usage

@dm:empty
    --> @dm:addrow
            message = "AgentDevice=WindowsLog\tAgentLogFile=System\tPluginVersion=WC.MSEVEN6.10.0.1.276\tSource=Service Control Manager\tComputer=DESKTOP-5O3J61V\tOriginatingComputer=DESKTOP-5O3J61V\tUser=SYSTEM\tDomain=NT AUTHORITY\tEventID=1073748864\tEventIDCode=7040\tEventType=4\tEventCategory=0\tRecordNumber=19466\tTimeGenerated=1653682613\tTimeWritten=1653682613\tLevel=Informational\tKeywords=Classic\tTask=None\tOpcode=Info\tMessage=The start type of the Background Intelligent Transfer Service service was changed from demand start to auto start."
    --> @dm:eval
            msg = "message.replace('\\t', ';')"
    --> @dm:explode
            column = "msg" & sep = ";"
    --> @dm:eval
            name = "msg.split('=', 1)[0]" &
            value = "msg.split('=', 1)[1]"
    --> @dm:pivot-table
            columns = "name" &
            index = "message" &
            value = "value" &
            agg = "first"

Example Pipelines Using this Bot

Bot @dm:explode-json

Bot Position In Pipeline: Sink

Explode a 'column' that contains JSON object(s)into rows

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
column*	Text		Name of the column with JSON data to explode into rows
ignore_errors	Text	yes	Ignore JSON parsing or structural errors. 'yes' or no'
exclude_exploded_columns	Text		Regular expression to exclude a set of columns from exploded data
include_exploded_columns	Text	.*	Regular expression to include a set of columns from exploded data
prefix_parent_key	Text	no	Prefix the parent key to the exploded columns.

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:add-missing-columns columns = 'server_ipaddress' &
            value = '{"server_ip":"10.95.158.1","group":"network","duration_of_business":"2 hours"}'
    --> @dm:explode-json column = 'server_ipaddress'

Playground

Try this in RDA Playground(rda_docs_env)

Example Pipelines Using this Bot

sample-ecommerce-analytics

Bot @dm:explode-timerange-into-windows

Bot Position In Pipeline: Sink

Explode a specified timerange into windows for events that have created and resolved timestamps. Aggregate value for each window using specified function.

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
groupby*	Text		Comma separated list of columns to do the grouping
created_col*	Text		Event created Timestamp column name, must be in datetimestr format
resolved_col*	Text		Event resolved Timestamp column name, must be in datetimestr format
value_col*	Text		Event resolved Timestamp column name, must be in datetimestr format
window_start*	Text		Window start timestamp in datetimestr format
window_end	Text		Window end timestamp in datetimestr format. If not specified, current timestamp will be used.
interval*	Text		Bucketization interval expressed with days, hours, minutes, seconds. Ex: '1d, 4h, 15min'
agg	Text	sum	Value aggregation function. Valid values are sum, min, max

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:explode-timerange-into-windows groupby = "state,description" &
            created_col = "sys_created_on" &
            resolved_col = "resolved_at" &
            value_col = "resolved_at" &
            window_start = "2022-06-05 13:43:14" &
            interval = "1d, 4h, 15min"

Playground

Try this in RDA Playground(rda_docs_env)

Bot @dm:extract

Bot Position In Pipeline: Sink

Extract data using 'expr' regex pattern from 'columns'

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
expr*	Text		Regular expression with named patterns
columns*	Text		Comma separated list of columns from which to extract the data

Bot @dm:extract-contents-from-html

Bot Position In Pipeline: Sink

Extract contents from HTML content in the input dataset.

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
output_column*	Text		Name of the output column to save the extracted HTML content.
input_column*	Text		Name of the input column which contains HTML content.
path*	Text		The path to the element to extract, separated by periods (e.g. 'html.body.div').
index	Text	0	The index of the element to extract, if there are multiple elements at the specified path. Defaults to 0.

Bot @dm:extract-key-value

Bot Position In Pipeline: Sink

Extract Key-Value pairs from column and add to dataframe

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
name	Text	extract_kv	Name for this bot.
format	Text	kv_type1	Format of input data. Supported formats are syslog_kv_type1, cef, kv_type1. Ex: syslog_kv_type1 format: <150>device="SFW" date=2022-07-04 time=10:06:39 timezone="IST" device_name="AB690" ... kv_type1 format: device="SFW" date=2022-07-04 time=10:06:39 timezone="IST" device_name="AB690"
column*	Text		Column name in input dataset which contains key=value fields that need to be extracted.
_max_procs	Text	1	Maximum number of CPUs to use. 0 means all available CPUs.

Bot @dm:fail-if-shape

Bot Position In Pipeline: Sink

Fail the pipeline, if input dataframe shape meets criteria expressed using 'row_count' & 'column_count'

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
row_count	Text		Number of rows in input dataframe. This variable accepts all numeric operations.
column_count	Text		Number of columns in input dataframe. This variable accepts all numeric operations.

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:fail-if-shape row_count = 66 & 
                   column_count = 88

Playground

Try this in RDA Playground(rda_docs_env)

Bot @dm:file-to-object

Bot Position In Pipeline: Sink

Convert files from a column into objects

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Description
input_filename_column*	Text	Name of the column in input that contains the filenames
output_column*	Text	Column name where object names will be inserted
output_folder*	Text	Folder name where objects will be stored

Bot *dm:filter

Bot Position In Pipeline: Sink

Apply CFXQL filtering on the data

This bot expects a Full CFXQL.

Bot applies the Query on the data that is already loaded from previous bot or from a source.

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> *dm:filter assignment_group  contains 'Software'  get assigned_to , assignment_group as 'assigned'

Playground

Try this in RDA Playground

Example Pipelines Using this Bot

Bot @dm:filter-using-dict

Bot Position In Pipeline: Sink

Filter rows using a dictionary. Action can be 'include' or 'exclude'

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
dict*	Text		Name of the saved dataset to be used as dictionary
src_key_cols*	Text		Comma separated list of column names in input to use for join
dict_key_cols*	Text		Comma separated list of column names in dict to use for join
action	Text	include	Must be one of 'include' or 'exclude'. Include means keep the rows that match the dictionary, else drop the rows that match the dictionary.

Bot @dm:find-affected-child-nodes

Bot Position In Pipeline: Sink

Traverse CMDB relationship like table to identify potentially affected child nodes for each parent node

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
impacted_classes*	Text		Comma separated list of impacted classes in output. Ex: 'Server,Virtual Machine Instance'
max_depth	Text	3	Max number of hops from parent node

Bot @dm:find-and-replace

Bot Position In Pipeline: Sink

Search data for the given condition and replace column value for the specified column name

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Description
condition	Text	Search one or more queries to fetch the records for replacing data
column_name*	Text	Specify list of column names to replace the data
column_value*	Text	Specify the list of column values to replace for the specified column names
replace_if_column_exist	Text	Specify the column name to replace the data, only if this column exists
sep	Text	Specify the separator to list multiple conditions,column names & column values

Bot @dm:fixcolumns

Bot Position In Pipeline: Sink

Fix column names such that they contain only allowed characters

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
include	Text		Column name regex pattern to fix in the output, remaining columns are left as is

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:fixcolumns

Playground

Try this in RDA Playground(rda_docs_env)

Example Pipelines Using this Bot

aws-dependency-mapper-inner-pipeline

Bot @dm:fixnull

Bot Position In Pipeline: Sink

Replace null values in a comma separated column list

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
columns*	Text		Comma separated list of column names
value	Text		Value to be replaced with, default is empty string
apply_for_empty	Text	no	Apply value for for empty string ('yes' or 'no')

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:fixnull
            columns = 'assigned_to'  &
            value = 'unassigned'

Playground

Try this in RDA Playground

Bot @dm:fixnull-regex

Bot Position In Pipeline: Sink

Replace null values in all columns that match the specified regular expression

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
columns	Text	.*	Regular expresson for column names
value	Text		Value to be replaced with, default is empty string
apply_for_empty	Text	no	Apply value for for empty string ('yes' or 'no')

Example Pipelines Using this Bot

li-replay-logs-to-dev-env

Bot *dm:functions

Bot Position In Pipeline: Source

List of functions available for mapping in 'map' bots

This bot expects a Full CFXQL.

Bot applies the Query on the data that is already loaded from previous bot or from a source.

Bot @dm:gc

Bot Position In Pipeline: Sink

Perform immediate garbage collection. Useful when dealing with very large datasets.

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
generation	Text	2	Generation parameter. Must be 0 or 1 or 2

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:gc

Playground

Try this in RDA Playground(rda_docs_env)

Bot @dm:generate-metric-stats

Bot Position In Pipeline: Source

Generate usage stats (ex: hourly) for a given period of time

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
stream*	Text		Name of the Persistent Stream
cfxql_query	Text	*	CFXQL query to filter stream data.
ts_column	Text	timestamp	Timestamp column name in the stream
groupby*	Text		Comma separated list of column names to do the grouping. For better performance, have the first item in the group to the asset (ex: asset_id) for which the stats are generated
column*	Text		Name of the column that has the metric data value
threshold	Text	90	Metric (ex: CPU) Usage % alarm threshold
threshold_type	Text	over	Possible values: over or under. Use this to check if the threshold is over or under the provided threshold
clear_threshold	Text	75	Metric (ex: CPU) Usage % recovery threshold
bucket	Text	HOUR	Duration bucket. For now, only HOUR and MONTH are supported. Metrics insights/analysis (which provides recommendations based on thresholds) is available only for HOUR
freq	Text	MONTH	Frequency for data collection. For now, only MONTH is supported
skip_below_threshold	Text	yes	Skip processing groups which haven't crossed the threshold even once. Set it to 'no' to process all groups.
max_value	Text	100	Provide the maximum value possible for the metric. For metrics that provide the value as a %, default of 100 will do it. This helps in determining the value relative to the max value for threshold analysis
chunk_size	Text	1000	Number of rows to fetch in each chunk when we retrieve data to generate stats. Use larger number if we are dealing with more data points with relatively less data per row. Do not use more than 5000. Avoid using less than 1000
generate_alarm_times	Text	no	Set it to 'yes' to generate alarm times details which include the day of the week with counts. For example, this could be used to create suppression policy. Note: This could cause the bot to run very slow

Bot @dm:get-from-location

Bot Position In Pipeline: Source Sink

Retrieve dataset from a specified location

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
path*	Text		Path to object in Minio bucket.
empty_as_null	Text	yes	While reading the saved datasets, treat empty strings as Nulls. 'yes' or 'no'
_skip_errors	Text	no	Specify 'yes' or 'no'. If 'yes', do not bailout if any of the processing results in error.

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:save name = "servicenow_dataset"
    --> @dm:save-to-location name = "servicenow_dataset" &
            format = "csv" &
            location = "output"

--> @c:new-block
    --> @dm:get-from-location path = "output/servicenow_dataset.csv"

Bot @dm:get-tagged-dataset

Bot Position In Pipeline: Source

List the datasets that are tagged with given tag name

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
tag_name*	Text		Tag name to get list of tagged bounded dataset

Bot @dm:grok

Bot Position In Pipeline: Sink

Extract data using Grok syntax from a single column'

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
column*	Text		Column data which need to be parsed with grok pattern
pattern*	Text		Grok pattern. For more than one pattern to be used, use \| (pipe without spaces) between the patterns.
_skip_errors	Text	no	Specify 'yes' or 'no'. If 'yes', do not bailout if any of the row processing results in error. Check meta_grok_message field when it continues with an error.
exclude_unmatched_columns	Text	no	Specify 'yes' or 'no'. If 'yes', don't include unmatched columns.

List of pre-built grok patterns are listed here.

Example usage:

@dm:empty
    --> @dm:addrow
            description = "Server ip is 10.95.103.125"
    --> @dm:grok
            column = "description" &
            pattern = "Server ip is %{IP:server_ip}"

Playground

Try this in RDA Playground(rda_docs_env)

Example Pipelines Using this Bot

sample-grok-test

Bot @dm:grok-multi-proc

Bot Position In Pipeline: Sink

Extract data using Grok syntax from a single column. Uses all available CPU cores to do parallel processing.

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
column*	Text		Column data which need to be parsed with grok pattern
pattern*	Text		Grok pattern. For more than one pattern to be used, use \| (pipe without spaces) between the patterns.
_max_procs	Text	0	Maximum number of CPUs to use. 0 means all available CPUs. Value should be >= 1
_skip_errors	Text	no	Specify 'yes' or 'no'. If 'yes', do not bailout if any of the row processing results in error. Check meta_grok_message field when it continues with an error.
exclude_unmatched_columns	Text	no	Specify 'yes' or 'no'. If 'yes', don't include unmatched columns.

List of pre-built grok patterns are listed here.

Example usage:

@dm:empty
    --> @dm:addrow
            description = "Server ip is 10.95.103.125"
    --> @dm:grok-multi-proc
            column = "description" &
            pattern = "Server ip is %{IP:server_ip}" &
            _max_procs = 4

Playground

Try this in RDA Playground(rda_docs_env)

Bot @dm:groupby

Bot Position In Pipeline: Sink

Group rows using specified 'columns' and specified 'agg' function

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
columns*	Text		Comma separated list of column names to do the grouping
agg	Text	count	Aggregation function. Default 'count'. For multiple aggregations, specify comma separated values

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:groupby columns = 'state' &
            agg = 'count'

Playground

Try this in RDA Playground

Example Pipelines Using this Bot

Bot @dm:head

Bot Position In Pipeline: Sink

Get first 'n' rows

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
n	Text	10	Number of rows retain from head position

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    -->@dm:head n = 9

Playground

Try this in RDA Playground

See Data Guide for more details

Example Pipelines Using this Bot

Bot @dm:hist

Bot Position In Pipeline: Sink

Create histogram using 'timestamp' column and use 'interval' binning

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
timestamp*	Text		Timestamp column
interval*	Text		Interval expressed with days, hours, minutes, seconds. Ex: '1d, 4h, 15min'

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
       --> @dm:hist timestamp = 'sys_created_on' & interval = '30d'

Playground

Try this in RDA Playground

Bot @dm:hist-groupby

Bot Position In Pipeline: Sink

Perform Groupby and then create histogram using 'timestamp' column and use 'interval' binning

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
timestamp*	Text		Timestamp column
interval*	Text		Interval expressed with days, hours, minutes, seconds. Ex: '1d, 4h, 15min'
groupby*	Text		Comma separated list of columns to do the grouping
align	Text	yes	Align all metrics to same start and end time. Specify 'yes' or 'no'

Example Usage

@files:loadfile
            filename = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> *dm:filter
            sys_created_on after '2020-01-01 00:00:00'
            GET sys_created_on, priority
    --> @dm:hist-groupby
            timestamp = "sys_created_on" &
            interval = "30d" &
            groupby = "priority" &
            align = "no"

Bot @dm:identity-discovery

Bot Position In Pipeline: Sink

Discover identities in the input dataset

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
idtypes	Text		Comma separated list of Identity types. Default all identities (ex: ipaddress)

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:identity-discovery

Bot @dm:implode

Bot Position In Pipeline: Sink

Implode 'merge_cols' into comma separated list

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
key_columns*	Text		Comma separated list of primary key columns
merge_columns*	Text		Comma separated list of columns to merge
merge_sep	Text	,	Merge value using specified separator, default is comma
dedup_merge_values	Text	yes	Dedup merge values (yes or no)
keep_columns	Text		Comma separated list of columns to keep after the merge

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:implode key_columns='state' &
            merge_columns='caller_id' &
            merge_sep = ','

Playground

Try this in RDA Playground

Example Pipelines Using this Bot

aws-dependency-mapper-inner-pipeline

Bot @dm:ingest-from-location

Bot Position In Pipeline: Source

This is a bot that ingests data once from the files that match the file name pattern in a chunked manner. We currently support ingesting data from following data formats - [csv,json,parquet,orc]

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
external_storage_credential_name	Text		Name of the predefined credential for the external S3 or minio. If not specified, we assume its local platform minio
object_prefix	Text		Applicable for local platform minio
filename_pattern*	Text		File criteria in regex format
num_rows	Text	1000	Number of rows to fetch in each chunk
max_rows	Text		Read until this limit is reached
max_data_size_mb	Text		Read until this limit is reached. This can also be a fraction.
format	Text		Format is either csv/json/parquet/orc. If not specified, it will be derived from extension
line_read	Text	yes	Only applicable for JSON. By default file is read as a json object per line. If you want to load the whole file as JSON, set it to 'no'

Example usage:

Local Platform MinioExternal S3 or Minio

@dm:ingest-from-location object_prefix='test-data/temp'
            and filename_pattern='customer.json'

@dm:ingest-from-location external_storage_credential_name='s3-sa'
            and filename_pattern='.*\.json'
            and line_read='no'

Bot @dm:json-to-html

Bot Position In Pipeline: Sink

Converts JSON column in the input dataset to HTML table.

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
output_column*	Text		Name of the output column which captures the input column as HTML Table.
input_column*	Text		Name of the input column which contains JSON data.
include_column_name	Text	no	Include the column name in the HTML table.
parent_key_location	Text	top	For nested JSON, include the parent key at either 'side' or 'top' of the table.

Bot @dm:list-all-schemas

Bot Position In Pipeline: Source Sink

This is a bot that fetches all json schemas added in the system and lists the metadata of schemas as a dataframe

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Bot @dm:list-from-location

Bot Position In Pipeline: Source

List datasets in a specified location

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
location*	Text		Location in Minio bucket.

Example usage:

@c:new-block
    --> @dm:list-from-location location = "output/"

Playground

Try this in RDA Playground(rda_docs_env)

Bot @dm:load-bookmark

Bot Position In Pipeline: Source Sink

Load a previously saved bookmark

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
name*	Text		Name of the bookmark to load
default	Text		Name of the bookmark to load

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:load-bookmark name = 'sample-servicenow-example'

Bot @dm:load-ml-dataset

Bot Position In Pipeline: Source Sink

Load the ML Datasets

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
pipeline_type*	Text	description	ML Pipeline type. Must be one of 'Clustering', 'Regression', 'Classification'
tmp_path	Text		Temporary directory path where datasets are to be stored
minio_path	Text		Minio path to get datasets from

This bot also accepts wildcard parameters. Any additional name = 'value' parameters are passed to the bot.

Bot @dm:load-ml-model

Bot Position In Pipeline: Source Sink

Load the ML Model with 'name' and 'model_type'

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
name*	Text		Name of the ML Model to load
model_type*	Text		ML Model type. Must be one of 'clustering', 'regression', 'classification'

Bot @dm:load-template

Bot Position In Pipeline: Source Sink

Load the formatting template with 'name'

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
name*	Text		Name of the template to load

Bot @dm:logarchive-replay

Bot Position In Pipeline: Source

Read the data from given archive for a specified time interval

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
repo*	Text		Name of the Log Archive repository. Mandatory.
archive*	Text		Name of the archive within the repository. Mandatory.
from	Text		Date & Time in text format. Ex: ISO format. Must be in UTC timezone.
to	Text		Date & Time in text format. Ex: ISO format. Must be in UTC timezone.
minutes	Text	15	Number of minutes of the data to replay. Must be >= 0. this field will be ignored if 'to' is specified
max_rows	Text	0	Maximum rows to replay. If not specified, will replay all data in the specified intervals. If specified and > 0, it will stop once specified rows have been read.
speed	Text	1.0	Speed at which to replay the events. 1 means close to original speed. < 1 means slower than original. > 1 means faster than original. This is an approximate and cannot be guaranteed. 0 means no introduced latency and try to replay as fast as possible.
batch_size	Text	100	Number of rows to return for each iteration.
label	Text		Label for the replay. Used for reports to identify various replay actions.

Bot @dm:logarchive-save

Bot Position In Pipeline: Sink

Save the log data in given archive of given repository

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
repo*	Text		Name of the Log Archive repository. Mandatory.
archive*	Text		Name of the archive within the repository. Mandatory.

Example Pipelines Using this Bot

dli-process-synthetic-syslogs

Bot @dm:manipulate-string

Bot Position In Pipeline: Sink

Manipulate the column values

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
from	Text		Get the value from specified column, if 'func' param is set to 'eval', 'from' param is not mandatory.
func*	Text		Supported functions are: [strip, substring, lower, upper, split, join, match, length, eval, lstrip, rstrip, count_value, replace, concat_columns]
lower_limit	Text		if func is 'substring', specify lower limit from which string should be extracted
upper_limit	Text		if func is 'substring', specify upper limit till which string should be extracted
value	Text	,	Specify regex pattern(s) when 'func' is set to 'match', specify separator when 'func' is set to 'split'. Extracted values are assigned to 'to' column, Default value is comma
to*	Text		Store the value to the specified column

Bot @dm:map

Bot Position In Pipeline: Sink

Inline mapping of columns 'from' using 'func' and save output to 'to' column

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
from	Text		Get the value from the specified column or columns (comma separated)
to	Text		Store the mapped value into the specified column
attr	Text		If from & to is same variable, attr can be used instead
func	Text		Function to use during mapping
_skip_errors	Text	no	Specify 'yes' or 'no'. If 'yes', do not bailout if any of the row processing results in error. Check meta_map_message field when it continues with an error.

This bot also accepts wildcard parameters. Any additional name = 'value' parameters are passed to the bot.

Example Pipelines Using this Bot

Bot @dm:map-multi-proc

Bot Position In Pipeline: Sink

Inline mapping of columns 'from' using 'func' and save output to 'to' column. Uses all available CPU cores to do parallel processing.

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
from	Text		Get the value from the specified column or columns (comma separated)
to	Text		Store the mapped value into the specified column
attr	Text		If from & to is same variable, attr can be used instead
func	Text		Function to use during mapping
_max_procs	Text	0	Maximum number of CPUs to use. 0 means all available CPUs. Value should be >= 1
_skip_errors	Text	no	Specify 'yes' or 'no'. If 'yes', do not bailout if any of the row processing results in error. Check meta_map_message field when it continues with an error.

This bot also accepts wildcard parameters. Any additional name = 'value' parameters are passed to the bot.

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:map-multi-proc from = 'assignment_group' &
            to = 'assigned_group' &
            _max_procs = 2

Playground

Try this in RDA Playground(rda_docs_env)

Bot @dm:map-snmp-trap-to-alert

Bot Position In Pipeline: Sink

Enrich incoming snmp trap with alert related information. (Typically called after apply-snmp-trap-template)

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
template_folder	Text	snmp_trap_alerts	Name of the folder for RDA Objects which contains the template to convert SNMP Trap to alerts.
node_id_column	Text	rda_gw_client_ip	Column name that contains unique device id (default: rda_gw_client_ip)

Bot @dm:mask

Bot Position In Pipeline: Sink

Partially or completely mask all values in specified columns

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
columns*	Text		Comma separated list of columns for which values will be masked
pos	Text	5	Position from which start the masking
char	Text	#	Masking character

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:mask columns = 'sys_id' &
            pos = 8 &
            char = '*'

Playground

Try this in RDA Playground

Example usage:

@dm:empty
    --> @dm:addrow
            phone_number = "925-555-5565" &
            ssn = "123-56-8008" &
            email = "john.doe@acme.com"
    --> @dm:eval
            ssn = "'*****'+ssn.split('-')[-1]" &
            phone_number = "phone_number[:3]+'*-***-*'+phone_number[-3:]" &
            email = "'***'+email.split('@')[0][-5:]+'@xxxxx.yyy'"

Playground

Try this in RDA Playground

See Data Guide for more details

Bot @dm:math

Bot Position In Pipeline: Sink

This bot performs mathematical functions

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Description
columns	Text	Comma separated list of column names(Mandatory for ceil,floor and median functions)
func*	Text	Function name to perform mathematical tasks. Available functions are (ceil\|floor\|median\|row_count\|column_count\|month_range)
year_column	Text	Column name containing year value.(Mandatory only for month_range function)
month_column	Text	Column name containing month value.(Mandatory only for month_range function)

Bot @dm:melt

Bot Position In Pipeline: Sink

Unpivot a DataFrame from wide to long format

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
id_cols*	Text		Column names separated by comma to be used as id
var_col_name	Text		Header for variable column
value_cols	Text		Column names separated by comma to be used as values. If not specified, uses all columns other than id columns
value_col_name	Text	value	Value column header

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:melt id_cols= "approval,state"

Playground

Try this in RDA Playground(rda_docs_env)

Bot @dm:mergecolumns

Bot Position In Pipeline: Sink

Merge columns using 'include' regex and/or 'exclude' regex into 'to' column

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Description
include*	Text	Column name regex pattern to include in the merge operation
exclude	Text	Column name regex pattern to exclude in the merged column
to*	Text	Output column name for merged column

Example usage:

@files:loadfile
    filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:mergecolumns include = 'description|escalation*' &
            exclude = 'due_date' &
            to = 'combined_columns'

Playground

Try this in RDA Playground

Bot @dm:metadata

Bot Position In Pipeline: Sink

Analyze metadata for the input dataset

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
include	Text		Column name regex pattern to include in the output (include patterns are matched first and then exclude)
exclude	Text		Columns to exclude in the analysis, regular expression

mple usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    -->@dm:metadata

Playground

Try this in RDA Playground

Bot @dm:metric-corr

Bot Position In Pipeline: Sink

Computes correlation between columns specified as metric and value column

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
metric	Text		Comma separated list of columns. If none specified, it uses all columns other than value and timestamp.
timestamp*	Text	timestamp	Timestamp column name
unit	Text		Timestamp unit. Default is None means timestamp is a string. other units are s,ms,ns
interval*	Text		Bucketization interval expressed with days, hours, minutes, seconds. Ex: '1d, 4h, 15min'
agg	Text	mean	Data aggregator function for interval data : mean,median,sum,min,max

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:metric-corr timestamp = "sys_created_on"  &
            interval = "1d, 4h, 15min"

Playground

Try this in RDA Playground(rda_docs_env)

Bot @dm:metric-correlator

Bot Position In Pipeline: Sink

Computes correlation between metrics specified in metric label column

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
metric_label*	Text		Name of column which specifies metric label
timestamp_column*	Text	timestamp	Timestamp column name
value_column	Text		Comma separated list of columns. If none specified, it uses all columns other than timestamp.
unit	Text		Timestamp unit. Default is None means timestamp is a string. other units are s,ms,ns
detrend	Text	yes	Detrend involves removing trend / seasonality from the data. Valid values are 'yes' or 'no'. Default is 'yes'
correlation_threshold	Text	0.5	If correlation between two metrics is greater than this value then they are said to be correlated. Ranges between 0 to 1

Bot @dm:metric-statistical-analysis

Bot Position In Pipeline: Sink

Computes all statistical parameters for each metric

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
metric*	Text		Comma separated list of columns to identify each metric
timestamp*	Text	timestamp	Timestamp column name
unit	Text		Timestamp unit. Default is None means timestamp is a string. other units are s,ms,ns
value*	Text		Metric value column
precision	Text	1	Number of decimals for each numerical value in the output
anomaly_percentile	Text		If specified, compute number of anomalies above this percentile value. Value should be >0 and <=100.

Bot *dm:ml-model-list

Bot Position In Pipeline: Source

List of saved ML Models

This bot expects a Full CFXQL.

Bot applies the Query on the data that is already loaded from previous bot or from a source.

Bot @dm:object-add

Bot Position In Pipeline: Source Sink

Add object to a folder

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
name*	Text		Object name
folder*	Text		Folder name on the object storage
input_file*	Text		File from which object will be added
description	Text		Description
overwrite	Text	yes	If file already exists, overwrite without prompting

Bot @dm:object-delete

Bot Position In Pipeline: Source Sink

Delete object from a folder

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
name*	Text		Object name
folder*	Text		Folder name on the object storage

Bot @dm:object-delete-list

Bot Position In Pipeline: Sink

Delete list of objects

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
input_object_column*	Text		Column with object names

Bot @dm:object-get

Bot Position In Pipeline: Source Sink

Get Object from a folder

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Description
name*	Text	Object name
folder*	Text	Folder name on the object storage
save_to_file	Text	Save the downloaded object to specified file
save_to_dir	Text	Save the downloaded object to specified directory

Bot @dm:object-list

Bot Position In Pipeline: Source

List objects for a folder

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
folder	Text		Folder name on the object storage

Bot @dm:object-to-content

Bot Position In Pipeline: Sink

Convert object pointers from a column into content

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
input_object_column*	Text		Name of the column in input that contains the object name
output_column*	Text		Column name where content will be inserted

Bot @dm:object-to-file

Bot Position In Pipeline: Sink

Convert object pointers from a column into file

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
input_object_column*	Text		Name of the column in input that contains the objects
output_column*	Text		Column name where filenames need to be inserted

Bot @dm:object-to-inline-img

Bot Position In Pipeline: Sink

Convert object pointers from a column into inline HTML img tags

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
input_object_column*	Text		Name of the column in input that contains the JPEG or PNG Image
output_column*	Text		Column name where HTML img tag code need to be inserted

Bot @dm:parse-using-textfsm

Bot Position In Pipeline: Sink

Parse one or more rows of data text data using specified textfsm model

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
folder*	Text		RDA Object folder in which textfsm model is defined
object*	Text		RDA Object name in which textfsm model is defined
raw_data_col*	Text		Column name for input data in which raw data is expected
keep_cols	Text		Keep specified comma separated list of columns in output
status_col	Text	textfsm_status	Parsing status in the output

Bot @dm:pivot-table

Bot Position In Pipeline: Sink

Creates Pivot Table with index and columns

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
columns*	Text		Column names to pivot separated by comma
index*	Text		Column names to use as index in pivot
value	Text		Name of value column. If not specified will use all available columns other than index and columns above.
agg	Text	mean	Default is mean. User can specify sum, max, min, median.

Example Usage

Example 1Example 2

@dm:empty
    --> @dm:addrow
            message = "AgentDevice=WindowsLog\tAgentLogFile=System\tPluginVersion=WC.MSEVEN6.10.0.1.276\tSource=Service Control Manager\tComputer=DESKTOP-5O3J61V\tOriginatingComputer=DESKTOP-5O3J61V\tUser=SYSTEM\tDomain=NT AUTHORITY\tEventID=1073748864\tEventIDCode=7040\tEventType=4\tEventCategory=0\tRecordNumber=19466\tTimeGenerated=1653682613\tTimeWritten=1653682613\tLevel=Informational\tKeywords=Classic\tTask=None\tOpcode=Info\tMessage=The start type of the Background Intelligent Transfer Service service was changed from demand start to auto start."
    --> @dm:eval
            msg = "message.replace('\\t', ';')"
    --> @dm:explode
            column = "msg" & sep = ";"
    --> @dm:eval
            name = "msg.split('=', 1)[0]" &
            value = "msg.split('=', 1)[1]"
    --> @dm:pivot-table
            columns = "name" &
            index = "message" &
            value = "value" &
            agg = "first"

Playground

Try this in RDA Playground

@files:loadfile
            filename = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> *dm:filter
            sys_created_on after '2020-01-01 00:00:00'
            GET sys_created_on, priority
    --> @dm:hist-groupby
            timestamp = "sys_created_on" &
            interval = "30d" &
            groupby = "priority" &
            align = "no"
    --> @dm:change-time-format
            columns = "sys_created_on" &
            from_format = "ns" &
            to_format = "%Y-%m"
    ## use eval to split the priority and add a prefix
    --> @dm:eval
            priority = "'_' + priority.split(' ')[-1]"
    ## Pivot column priority while keeping index columns as is
    --> @dm:pivot-table
            columns = 'priority' &
            index = 'sys_created_on' &
            value = 'count' &
            agg = 'sum'

Playground

Try this in RDA Playground

Bot @dm:process-syslog-from-kv-list

Bot Position In Pipeline: Sink

Process syslogs that have information in a list of dicts and have RFC5424 payload

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
data_column*	Text		column which has list of dictionary with multiple attributes. One attribute for name and another for value
key_attr	Text	name	Name of the attribute in data_column dictionary which indicates Key
value_attr	Text	stringValue	Name of the attribute in data_column dictionary which indicates Value
rfc5424_attr	Text	RFC5424	Key name which indicates RFC5424 encoded syslog parameters.

Bot #dm:pstream-delete-data-by-query

Bot Position In Pipeline: Sink

Delete the data in a persistent stream via CFXQL.

This bot expects Full CFXQL.

Bot translates the Query to native query of the Data source supported by this extension.

This is a data sink which expects the following input parameters to be passed via input dataframe.

Input Dataframe Column Name	Type	Default Value	Description
`name`*	Text		Name of the Persistent Stream
`conflicts`	Text	abort	What to do when the delete by query hits version conflicts? Valid choices: abort, proceed Default: abort
`timeout`	Text	10	Timeout in seconds to wait for response

Example usage:

@dm:empty
    --> @dm:addrow name="people_ps"
    --> #dm:pstream-delete-data-by-query city is 'John Doe'

Bot #dm:pstream-update-data-by-query

Bot Position In Pipeline: Sink

Update the data in a persistent stream via CFXQL.

This bot expects Full CFXQL.

Bot translates the Query to native query of the Data source supported by this extension.

This is a data sink which expects the following input parameters to be passed via input dataframe.

Input Dataframe Column Name	Type	Default Value	Description
`name`*	Text		Name of the Persistent Stream
`columns`*	Text		Comma separated list of column names that needs to be updated for all the records that match the query. Example: city,state,zipcode
`values`*	Text		Set the value to specified column or columns (comma separated). The specified number of columns should match 'columns' column(s). Example: San Jose,CA,12345
`conflicts`	Text	abort	What to do when the update by query hits version conflicts? Valid choices: abort, proceed Default: abort
`timeout`	Text	10	Timeout in seconds to wait for response

Example usage:

@dm:empty
--> @dm:addrow name='people_1k' and columns='city,name' and values='New Delhi,Deepika' 
--> #dm:pstream-update-data-by-query city is 'Delhi' and name is 'Deepak'

Bot #dm:query-persistent-stream

Bot Position In Pipeline: Sink

Query the data in a persistent stream via CFXQL.

This bot expects Full CFXQL.

Bot translates the Query to native query of the Data source supported by this extension.

This is a data sink which expects the following input parameters to be passed via input dataframe.

Input Dataframe Column Name	Type	Default Value	Description
`name`*	Text		Name of the Persistent Stream
`max_rows`	Text	1000	Max rows in each batch. Ignored when 'aggs' is used
`limit`	Text	1000	Limit total output rows. If set to 0, will retrieve all rows from the stream that match the query. Ignored when 'aggs' is used
`sort_by_col`	Text		Name of the column to sort
`sort_type`	Text	desc	Must be one of 'asc' or 'desc'
`aggs`	Text		Specified as 'sum:field_name'. Supported functions are sum, cardinality, min, max, mean, value_count
`groupby`	Text		Comma seperated list of columns to groupby; used only when 'aggs' is used
`max_aggregation_groups`	Text	1000	Fetches upto 1000 groups aggregation results by default. Larger values (10,000+) results in use of more memory to compute
`retry_attempts_on_no_data`	Text	0	Number of retries with 2 seconds wait for each retry when there is no data. Default is 0

Example usage:

Simple QueryWith Aggregations

@dm:empty
    --> @dm:addrow name="dli-log-stats" and max_rows="2"
    --> #dm:query-persistent-stream *  
    # (1)!

* implies match all (no filtering of data)

@dm:empty
    --> @dm:addrow name="dli-log-stats" and groupby='device,mode' 
            and aggs='max:count,avg:count,sum:count'  # (1)!
    --> #dm:query-persistent-stream device in ['ipaddress1', 'ipaddress2']

Computes max for count, averge for count and sum for count

Supported Aggregations

Agg Function	Description
min	minimum value in the group. Supported on numeric values only
max	maximum value in the group. Supported on numeric values only
sum	sum of values in the group. Supported on numeric values only
avg	average of the values in the group. Supported on numeric values only
first	first value when sorted by ascending order for the field
last	last value when sorted by ascending order for the field

Bot @dm:query-persistent-stream-from-bookmark

Bot Position In Pipeline: Source

This is a streaming bot that reads one or more rows from the last bookmarked record in a persistent stream via CFXQL filter criteria

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
name*	Text		Name of the Persistent Stream
bookmark*	Text		Name of the bookmark
offset	Text	latest	Read data from beginning or current offset. This option is only applicable when the bookmark does not exist, i.e. once the bookmark offset it set, it cannot be changed. Default is 'latest' and designed to work when default sort by column '_RDA_Id' is used. Other option is 'earliest', which reads from the beginning.
query	Text	*	CFXQL query to filter results.
sort_by_col	Text	_RDA_Id	Comma separated list of column names to be sorted by. Its important for the set of sorting columns provided to uniquely identify a record so that last bookmarked record is unique. Default is _RDA_Id (internal unique id). Example: 'timestamp, some_unique_id'
sort_type	Text	asc	Must be one of 'asc' or 'desc'
max_rows	Text	1000	Max rows in each batch.

Example usage:

@dm:query-persistent-stream-from-bookmark name="dli-log-stats" and bookmark="stats-bookmark"

Bot @dm:query-persistent-stream-iterate-by-chunk

Bot Position In Pipeline: Source

Queries the data in a persistent stream and returns data in chunks

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
name*	Text		Name of the Persistent Stream
query	Text	*	CFXQL query to filter results.
batch_size	Text	1000	Rows in each batch. Default is 1000
sort_by_col	Text		Name of the column to sort
sort_type	Text	desc	Must be one of 'asc' or 'desc'

Bot @dm:query-persistent-stream-iterate-by-time

Bot Position In Pipeline: Source

Queries the data in a persistent stream and returns data in chunks based on time interval

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
name*	Text		Name of the Persistent Stream
from	Text		Date & Time must be in ISO format. Must be in UTC timezone. Example: 2023-06-14, 2023-06-14T05:00:00. If not provided, earliest available date will be used
to	Text		Date & Time must be in ISO format. Must be in UTC timezone. Example: 2023-06-14, 2023-06-14T05:00:00. If not provided, the latest available date will be used
query	Text	*	CFXQL query to filter results.
interval	Text	1d	Interval expressed with days or hours or mins. Ex: '1d' or '4h' or '30min'
timestamp_column	Text	timestamp	Name of timestamp column. Default is 'timestamp'
aggs	Text		Specified as 'function:field_name'. Supported functions are sum, cardinality, min, max, mean, value_count
groupby	Text		Comma separated list of columns to groupby; used only when 'aggs' is used
include_time_intervals	Text	no	Include from and to interval used for each iteration. The column names will be 'from_interval' and 'to_interval'
chunk_size	Text	1000	Number of rows to fetch in each chunk when we retrieve data. Use larger number if we are dealing with more data points with relatively less data per row. Do not use more than 5000. Avoid using less than 1000. This is not applicable when 'aggs' is used.
max_aggregation_groups	Text	1000	Fetches upto 1000 groups aggregation results by default. Larger values (10,000+) results in use of more memory to compute.

Bot @dm:recall

Bot Position In Pipeline: Source Sink

Recall (load) a previously saved dataset

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
name	Text		Name of the dataset to recall.(name is mandatory, if tag parameter is not given)
cache	Text	no	Cache the result for future recalls. 'yes' or 'no'
cache_refresh_seconds	Text	120	Refresh the cache (if new update available) after specified seconds
empty_as_null	Text	yes	While reading the saved datasets, treat empty strings as Nulls. 'yes' or 'no'
return_empty	Text	no	Return an empty dataframe if an error occurs loading the dataset
empty_df_columns	Text		Comma separated list of columns for empty dataframe
ignore_dtypes	Text	no	Ignore column data types during loading
make_copy	Text	yes	Specifies whether to make copy of dataset for temp datasets. 'yes' or 'no'
tag	Text		Name of the tag, return the first occurrence of the dataset having the given tag

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:save name = "servicenow_data"

--> @c:new-block
    --> @dm:recall     name="servicenow_data"

Playground

Try this in RDA Playground(rda_docs_env)

Example Pipelines Using this Bot

Bot @dm:recall-chunked

Bot Position In Pipeline: Source

Recall (load) a previously saved dataset as a data stream. Loads num_rows in each chunk.

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
name*	Text		Name of the dataset to recall
num_rows*	Text		Number of rows to fetch in each chunk
empty_as_null	Text	yes	While reading the saved datasets, treat empty strings as Nulls. 'yes' or 'no'
return_empty	Text	no	Return an empty dataframe if an error occurs loading the dataset
empty_df_columns	Text		Comma separated list of columns for empty dataframe
ignore_dtypes	Text	no	Ignore column data types during loading

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:save name = "servicenow_data"

--> @c:new-block
    --> @dm:recall-chunked   name="servicenow_data" &
            num_rows = 5

Playground

Try this in RDA Playground(rda_docs_env)

Bot @dm:recall-query

Bot Position In Pipeline: Source Sink

Recall (load) a previously saved dataset using CFXQL query

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
name*	Text		Name of the dataset to recall
empty_as_null	Text	yes	While reading the saved datasets, treat empty strings as Nulls. 'yes' or 'no'
return_empty	Text	no	Return an empty dataframe if an error occurs loading the dataset
empty_df_columns	Text		Comma separated list of columns for empty dataframe
make_copy	Text	yes	Specifies whether to make copy of dataset for temp datasets. 'yes' or 'no'
query	Text	*	CFXQL query to filter results.
max_rows	Text		Limit the number of rows to return.

Bot @dm:relations-child-to-parent-paths

Bot Position In Pipeline: Sink

Traverse CMDB relationship like table to identify all possible paths from each child to all parent node(s)

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
child_col*	Text		Input column name which identifies a child column in a relationship table
parent_col*	Text		Input column name which identifies a parent column in a relationship table

Bot @dm:relations-parent-to-children-paths

Bot Position In Pipeline: Sink

Traverse CMDB relationship like table to identify all possible paths from each parent to all child node(s)

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
child_col*	Text		Input column name which identifies a child column in a relationship table
parent_col*	Text		Input column name which identifies a parent column in a relationship table

Bot @dm:rename-columns

Bot Position In Pipeline: Sink

Rename specified column names using new_column_name = 'old_column_name' format

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

This bot only accepts wildcard parameters. All name = 'value' parameters are passed to the bot.

Bot @dm:replace-data

Bot Position In Pipeline: Sink

Replace data using 'expr' regex pattern from 'columns'

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Description
columns*	Text	Comma separated list of columns
expr*	Text	Regular expression to identify the part that need to be replaced
replace	Text	Replace with this value. If not specified, replaces with empty string

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:replace-data columns = 'description' &
            expr = 'to' &
            replace = 'two'

Playground

Try this in RDA Playground

Bot @dm:resample-timeseries

Bot Position In Pipeline: Sink

Resample time series data on specified timestamp column by aggregation function provided

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
ts_column*	Text		Timestamp column name.
value_column	Text		Comma separated list of columns. If none specified, it uses all columns other than timestamp.
interval	Text	1H	Bucketization interval expressed with days, hours, minutes, seconds. Ex: '1D, 4H, 15min'
agg	Text	sum	Value aggregation function. Valid values are sum, min, max, mean
interpolate	Text	no	Specify 'yes' or 'no'. If 'yes' then interpolate missing values after aggregation

Bot @dm:row-delta

Bot Position In Pipeline: Sink

Perform difference between 2 consecutive rows for the list of columns and replace them with the result

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
value_columns*	Text		Comma separated list of numeric column field names
groupby	Text		Perform delta within each group. Comma separated list of columns to groupby.
skip_first_row	Text	no	The column value for the first rows for the value columns provided in the result will be NaN as there is not row above them. You can skip the first row in the result by setting this to 'yes'
sort_columns	Text	timestamp	Comma separated list of column names. Default is 'timestamp' field
sort_order	Text	ascending	Sorting order ('ascending' or 'descending'). For multiple sort orders, specify comma separated values. If sort order list is shorter than column list, the last sort order will be used for the remaining columns

Bot *dm:safe-filter

Bot Position In Pipeline: Sink

Apply safe CFXQL filtering on the data

This bot expects a Full CFXQL.

Bot applies the Query on the data that is already loaded from previous bot or from a source.

Bot @dm:sample

Bot Position In Pipeline: Sink

Randomly sample 'n' number of rows

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
n	Text	1	Number of rows to return in output using random sampling. This can also be a fraction between 0 to 1.0 to indicate fraction of input rows to return.
re_use	Text	auto	Re-use already sampled row in output. Valid values are 'auto', 'yes' or 'no'.

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:sample
            n = '0.01'

Playground

Try this in RDA Playground

Example Pipelines Using this Bot

Bot @dm:sample-groupby

Bot Position In Pipeline: Sink

Randomly sample 'n' number of rows within each group

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
columns*	Text	1	Comma separated list of column names to do the grouping.
n	Text	1	Number of rows to return in each group using random sampling. This can also be a fraction between 0 to 1.0 to indicate fraction of input rows to return within each group.
re_use	Text	auto	Re-use already sampled row in output. Valid values are 'auto', 'yes' or 'no'.

Example usage:

SampleWith fraction of data

    @files:loadfile
            filename = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:sample-groupby columns='severity' and n=5
    --> *dm:filter * get severity,state,opened_at

    @files:loadfile
            filename = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:sample-groupby columns='severity,state' and n=0.5
    --> *dm:filter * get severity,state,opened_at

Playground

Try this in RDA Playground

Bot @dm:save

Bot Position In Pipeline: Sink

Save the dataset with 'name'

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
name*	Text		Name of the dataset to save
format	Text	csv	Data format on the object storage. Ignored for the temporary datasets. Must be one of 'csv' or 'parquet'
publish	Text		Deprecated. Name of the tag to publish in cfxDimensions platform. Can be used only with Dimensions configuration.
append	Text	no	If set to 'yes', appends the input dataset as a chunk to the existing dataset if any. Valid values are 'yes', 'no'
return_appended_dataset	Text	no	If set to 'yes' and append is 'yes', returns the full appended dataset. Valid values are 'yes', 'no'
tag	Text		Name of the tag, appends the given tag name in the metadata

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:save name = 'sample_servicenow_incidents_dataset'

Playground

Try this in RDA Playground(rda_docs_env)

Example Pipelines Using this Bot

Bot @dm:save-bookmark

Bot Position In Pipeline: Sink

Save the bookmark with 'name'

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
name*	Text		Name of the bookmark to save
value_column*	Text		Name of value column
value_type	Text	timestamp	Value type (timestamp, numeric, text)
ts_format	Text		If value_type is timestamp, format. Valid units are s, ms, ns or null for string format.
value_func	Text	max	Value functions are first, last, min, max

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
--> @dm:save-bookmark name = 'sample-servicenow-example' & 
         value_column = 'sys_created_on'

Playground

Try this in RDA Playground(rda_docs_env)

Bot @dm:save-ml-dataset

Bot Position In Pipeline: Source Sink

Save the ML Datasets

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
pipeline_type*	Text	description	ML Pipeline type. Must be one of 'clustering', 'regression', 'classification'
tmp_path	Text		Temporary directory path where datasets are stored
minio_path	Text		Minio path to store datasets

This bot also accepts wildcard parameters. Any additional name = 'value' parameters are passed to the bot.

Bot @dm:save-ml-model

Bot Position In Pipeline: Source Sink

Save the ML Model with 'name' and 'model_type'

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Description
name*	Text	Name of the ML Model to save
model_type*	Text	ML Model type. Must be one of 'clustering', 'regression', 'classification'
description	Text	Template content column
model_data_path*	Text	ML Model file path

This bot also accepts wildcard parameters. Any additional name = 'value' parameters are passed to the bot.

Bot @dm:save-template

Bot Position In Pipeline: Sink

Save the formatting template with 'name'

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
name*	Text		Name of the template to save
description_col	Text	description	Template description column
content_col	Text	content	Template content column
content_type_col	Text	content_type	Template content type column

Bot @dm:save-to-location

Bot Position In Pipeline: Sink

Save the dataset to a specified location

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
name	Text		Name of the dataset to save. If not provided, will be extracted from location
format	Text		Data format on the object storage. Supported formats are csv, paraquet, gzip, json and zip. If not provided, will be extracted from location
location	Text		Location in Minio bucket to save the object.
ignore_index	Text	no	Ignore index columns while saving file to location. Possible values are 'yes' or 'no'. (Applicable only for csv format)

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:save name = "servicenow_dataset"
    --> @dm:save-to-location name = "servicenow_dataset" &
            format = "csv" &
            location = "output"

Bot *dm:savedlist

Bot Position In Pipeline: Source

List of saved datasets

This bot expects a Full CFXQL.

Bot applies the Query on the data that is already loaded from previous bot or from a source.

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
  --> @dm:save name = 'sample_service_now_incidents'

--> @c:new-block
    --> *dm:savedlist

Playground

Try this in RDA Playground(rda_docs_env)

Bot @dm:selectcolumns

Bot Position In Pipeline: Sink

Select columns using 'include' regex and/or 'exclude' regex

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
include	Text		Column name regex pattern to include in the output (include patterns are matched first and then exclude)
exclude	Text	None	Column name regex pattern to exclude in the output (include pattern are matched first and then exclude)

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:selectcolumns
            include = 'assigned_to|approval.*|sys_*|.*description.*' & 
            exclude = 'sys_d.*'

Playground

Try this in RDA Playground

Example Pipelines Using this Bot

Bot @dm:set-tracing-context

Bot Position In Pipeline: Source Sink

Set the tracing context using name = value pairs

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

This bot only accepts wildcard parameters. All name = 'value' parameters are passed to the bot.

Bot @dm:set-tracing-context-from-input

Bot Position In Pipeline: Sink

Set the tracing context using input dataframe column values from first row

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
columns*	Text		Comma separated column names in input dataframe, to be propagated to context

Bot @dm:skip-block-if-shape

Bot Position In Pipeline: Sink

Skip rest of the current block if input dataframe shape meets criteria expressed using 'row_count' & 'column_count'

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
row_count	Text		Number of rows in input dataframe. This variable accepts all numeric operations.
column_count	Text		Number of columns in input dataframe. This variable accepts all numeric operations.

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:skip-block-if-shape row_count = 66 & column_count = 88

Playground

Try this in RDA Playground(rda_docs_env)

Example Pipelines Using this Bot

Bot @dm:skip-pipeline-if-shape

Bot Position In Pipeline: Sink

Skip rest of the pipeline if input dataframe shape meets criteria expressed using 'row_count' & 'column_count'

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
row_count	Text		Number of rows in input dataframe. This variable accepts all numeric operations.
column_count	Text		Number of columns in input dataframe. This variable accepts all numeric operations.

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:skip-pipeline-if-shape row_count = 66 and column_count = 88
    --> @dm:tail n = 5

Playground

Try this in RDA Playground(rda_docs_env)

Bot @dm:sleep

Bot Position In Pipeline: Sink

Wait for a specified number of seconds before executing next step. Useful for timed loops.

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
seconds	Text		Wait time in seconds, must be > 0, fractional seconds are allowed

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:sleep seconds = 5
    --> @dm:sample n='0.1'

Playground

Try this in RDA Playground(rda_docs_env)

Bot @dm:sort

Bot Position In Pipeline: Sink

Sort values using 'columns' with 'order'

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
columns	Text		Comma separated list of column names
order	Text	ascending	Sorting order ('ascending' or 'descending'). For multiple sort orders, specify comma separated values. If sort order list is shorter than column list, the last sort order will be used for the remaining columns

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:sort columns = 'assigned_to' and
            order = "desc"

Playground

Try this in RDA Playground

Example Pipelines Using this Bot

Bot @dm:span-creator

Bot Position In Pipeline: Sink

Create Spans from input timeseries data

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
unique_id*	Text		Comma separated list of columns
starttime*	Text		Start time column name
endtime*	Text		End time column name
unit	Text		Timestamp unit. Default is None means timestamp is a string. other units are s,ms,ns
status*	Text		Status column name
data_label*	Text		Label for data in spans
filter	Text		Comma separated list of filter columns
incident_id*	Text		Incident id column name
interval_start	Text	22	Interval start days/seconds/microseconds/milliseconds/minutes/hours/weeks
interval_end	Text	2	Interval end days/seconds/microseconds/milliseconds/minutes/hours/weeks
interval_unit	Text	hours	Interval unit days/seconds/microseconds/milliseconds/minutes/hours/weeks

Bot *dm:stack-connected-nodes

Bot Position In Pipeline: Source Sink

All connected nodes on previously selected Nodes on stack

This bot expects a Full CFXQL.

Bot applies the Query on the data that is already loaded from previous bot or from a source.

Bot @dm:stack-create

Bot Position In Pipeline: Source Sink

Create stack from input topology dataset

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Description
topology_nodes*	Text	Name of topology nodes dataset
topology_edges	Text	Name of topology edges dataset
name*	Text	Name for stack

Example Pipelines Using this Bot

aws-dependency-mapper-inner-pipeline

Bot *dm:stack-filter

Bot Position In Pipeline: Sink

Filter stack Nodes/Edges based on previously selected Nodes/Edges on stack

This bot expects a Full CFXQL.

Bot applies the Query on the data that is already loaded from previous bot or from a source.

Bot @dm:stack-find-impact-distances

Bot Position In Pipeline: Source Sink

Search a saved stack using asset-dependency service and get impact distances from the specified nodes

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
stack_name*	Text		Name of previously saved stack. This bot will use asset-dependency service to load the specified stack and perform a search
search_for*	Text		Comma separated list of values to search
attr_names*	Text		Comma separated list of attribute names Search the values specified in 'search_for' in this list of attribute names
node_types	Text		Comma separated list of node types to search
exclude_node_types	Text		Comma separated list of node types to exclude
depth	Text	10	Maximum depth from the selected nodes
operation	Text	equals	Type of value comparison operation. Must be one of 'equals', 'contains', or 'matches'
ignore_case	Text	yes	Ignore case while doing the search. Must be one of 'yes' or 'no'
max_matches	Text	1	Maximum number of matches per each search.
timeout	Text	120	Timeout in seconds

Bot @dm:stack-generate

Bot Position In Pipeline: Source Sink

Generate stack from input topology dataset

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
input_stack_name*	Text		Name of input stack json to generate stack from
app_type	Text	OIA	Name of application to publish stack to
output_stack_name*	Text		Name of application to publish stack to
room_id*	Text		Location to publish stack to
incident_id*	Text		Comma separated Incident id under which stack will be referenced
incident_summary*	Text		Comma separated summary for incidents
fqdn*	Text		FQDN
url*	Text		URL
ip_address*	Text		IP Address

Bot *dm:stack-impacted-nodes

Bot Position In Pipeline: Source Sink

Impact Analysis on previously selected Nodes on stack

This bot expects a Full CFXQL.

Bot applies the Query on the data that is already loaded from previous bot or from a source.

Bot @dm:stack-join

Bot Position In Pipeline: Source Sink

Create stack from input topology dataset

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Description
target_stack*	Text	Name of target stack to join
filter*	Text	Name of dictionary dataset with filters
name*	Text	Name of new stack

Bot *dm:stack-list

Bot Position In Pipeline: Source

List of application Stacks

This bot expects a Full CFXQL.

Bot applies the Query on the data that is already loaded from previous bot or from a source.

Bot @dm:stack-load

Bot Position In Pipeline: Source Sink

Load application Stack specified by 'name'

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
name*	Text		Name of the Stack to load

Bot @dm:stack-save

Bot Position In Pipeline: Source Sink

Save application Stack specified by 'name'

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
name*	Text		Name of the Stack to save
description	Text		Stack description
stack_data_column	Text	data	Column name where stack definition is stored in dataframe
additional_nodes_rules	Text		Name of dataset with rules to create and attach new node to matching existing nodes.

Example Pipelines Using this Bot

aws-dependency-mapper-inner-pipeline

Bot @dm:stack-search

Bot Position In Pipeline: Source Sink

Search a saved stack using asset-dependency service

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
stack_name*	Text		Name of previously saved stack. This bot will use asset-dependency service to load the specified stack and perform a search
search_for*	Text		Comma separated list of values to search
attr_names*	Text		Comma separated list of attribute names Search the values specified in 'search_for' in this list of attribute names
node_types	Text		Comma separated list of node types to search
exclude_node_types	Text		Comma separated list of node types to exclude
depth	Text	2	Maximum depth from the selected nodes
operation	Text	equals	Type of value comparison operation. Must be one of 'equals', 'contains', or 'matches'
ignore_case	Text	yes	Ignore case while doing the search. Must be one of 'yes' or 'no'
max_matches	Text	1	Maximum number of matches per each search.
timeout	Text	120	Timeout in seconds
result_stack_name	Text		Name of the result stack

Bot *dm:stack-select-nodes

Bot Position In Pipeline: Sink

Select Nodes from stack based on provided criteria

This bot expects a Full CFXQL.

Bot applies the Query on the data that is already loaded from previous bot or from a source.

Bot @dm:stack-unselect-nodes

Bot Position In Pipeline: Source Sink

Unselect nodes in stack of given node types if there is no right link available

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
node_type*	Text		Type of nodes to be unselected if there is no right link available

Bot @dm:staging-read

Bot Position In Pipeline: Source

This is a streaming bot that reads one or more rows from the files that match the criteria in the specified staging area in a chunked manner. We currently support ingesting data from following data formats - [csv,json,parquet,orc]

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
name*	Text		Staging area name
num_rows	Text	1000	Number of rows to fetch in each chunk
format	Text		Format is either csv/json/parquet/orc. If not specified, it will be derived from extension
line_read	Text	yes	Only applicable for JSON. By default file is read as a json object per line. If you want to load the whole file as JSON, set it to 'no'

Example usage:

@dm:staging-read name='staging-area-platform-sample'

Bot @dm:string-to-columns

Bot Position In Pipeline: Sink

It splits a column and assign values to columns

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
from*	Text		Specify the column name that has the string data
seps	Text	,	Specify the list of separators, example:\|$@/#!. Default is comma.
to*	Text		Specify the comma separated column names to which extracted strings need to be added as values. Once the 'from' column is split, the new columns are added to the existing dataframe, based on the order of the specified columns.
to_column_default	Text		Assign a value to 'to' column(s), when there is no string value extracted out of 'from' column, Default is None

Bot @dm:synthetic-dataset

Bot Position In Pipeline: Source

Generate a new dataframe with specified row_count and column_name = 'field_type' paris

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
row_count*	Text		Number of rows for the output dataframe

This bot also accepts wildcard parameters. Any additional name = 'value' parameters are passed to the bot.

List of supported synthetic data field types are listed here.

Example usage:

@c:new-block
    --> @dm:synthetic-dataset row_count = 5 &
            col_address = "address"

Playground

Try this in RDA Playground(rda_docs_env)

Bot *dm:synthetic-fields

Bot Position In Pipeline: Source

List of field types supported in synthetic data functions or bots

This bot expects a Full CFXQL.

Bot applies the Query on the data that is already loaded from previous bot or from a source.

Example usage:

@c:new-block
    --> *dm:synthetic-fields

Playground

Try this in RDA Playground(rda_docs_env)

Bot @dm:tail

Bot Position In Pipeline: Sink

Get last 'n' rows

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
n	Text	10	Number of rows retain from tail position

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:tail n = 9

Playground

Try this in RDA Playground

Bot @dm:telegraf-parser

Bot Position In Pipeline: Sink

Parse the telegraf data that is passed through input dataset

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
tags_prefix	Text	tags	Add a prefix to the tags block after flattening. By default, 'tags' prefix will be added.
fields_prefix	Text		Add a prefix to the fields block after flattening. By default, no prefix will be added.

Bot *dm:template-list

Bot Position In Pipeline: Source

List of saved formatting templates

This bot expects a Full CFXQL.

Bot applies the Query on the data that is already loaded from previous bot or from a source.

Bot @dm:time-filter

Bot Position In Pipeline: Sink

Apply time filter on specified timestamp column

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Description
column*	Text	Timestamp column name.
from	Text	From timestamp. Can be absolute format or relative to current time. Atleast one of 'from' or 'to' must be specified.
to	Text	To timestamp. Can be absolute format or relative to current time. Atleast one of 'from' or 'to' must be specified.

Bot @dm:to-json

Bot Position In Pipeline: Sink

Converts each row of the input dataset to JSON format.

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
exclude_columns	Text		Regular expression to exclude one or more columns from input data
include_columns	Text	.*	Regular expression to include one or more columns from input data
output_column*	Text		Name of the output column which captures the input dataset as JSON format.
keep_original_columns	Text	no	Keep existing columns in output or not. 'yes' or 'no'

Bot @dm:to-type

Bot Position In Pipeline: Sink

Change data type to str or int or float for specified columns

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
columns*	Text		Comma separated list of column names
type*	Text		Type to convert into: str / int / float

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:to_type  columns = 'activity_due' &
            type  = 'int'

Playground

Try this in RDA Playground

Bot @dm:transpose

Bot Position In Pipeline: Sink

Transposes columns to rows

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
columns*	Text		Comma-separated column names to set as index before transpose
value	Text		Name of value column

Bot @dm:validate-data

Bot Position In Pipeline: Sink

Check integrity of data data using 'schema' that has been uploaded previously

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
schema_name*	Text		Schema to verify the data against.
failfast	Text	yes	Specify 'yes'(default) to abort quickly on first error or 'no' to keep validating records
action	Text	none	Action to take if validation fails. Must be one of 'none' (default), 'fail','skip-block', 'skip-pipeline'

Bot @dm:vectorization

Bot Position In Pipeline: Sink

Compares data of the given columns and populates change state of data in the newly created 'change_state' column

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
dict	Text		JSON that contains 'column_name' and 'column_type' keys for actual column name and column type values. ex:[{'column_name':'name','column_type':'type'}]
suffixes	Text	_new,_old	Comma separated list of suffixes to identify old and new columns
modified_summary	Text	False	Enable modified_summary to compute number of modified rows in each column and populates summary in the 'modified_summary' column.(True/False)

Bot @dm:verify-checksum

Bot Position In Pipeline: Sink

Verify checksum of input dataframe. Checksum can be by rows-only or rows or for the entire dataset.

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Default Value	Description
checksum_type	Text	dataset	Verify checksum for by row only or by row and then entire dataset. Valid values are 'rows-only', 'dataset'
row_checksum_column	Text	rda_row_checksum	Input column for computed row level checksum.
data_checksum_column	Text	rda_data_checksum	Input column for computed checksum for entire dataset.
key	Text		Optional key to be used in the computed hash.
drop_checksum_columns	Text	yes	Use 'yes' or 'no' to specify if the checksum related columns should be dropped from the dataframe after a successful validation.

Bot @dm:xml-to-json

Bot Position In Pipeline: Sink

Parse XML document in a specified column and convert it into JSON

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name	Type	Description
column*	Text	Column data which need to be parsed for XML
output_column*	Text	Column name for the output JSON data
status_column	Text	Column name for parsing status
json_path	Text	Dot delimited JSON path to be traversed in the document. Default is entire document.

Example usage:

@dm:empty
    --> @dm:addrow Serial_Number="<number>124</number>" &
            name="<name>john</name>"
    --> @dm:addrow Serial_Number="<number>133</number>" &
            name="<name>smith</name>"
    --> @dm:xml-to-json column = 'name' &
            output_column = 'json_data'

Playground

Try this in RDA Playground(rda_docs_env)