
Bots From Extension: cfxdm

Data Management

This extension provides 174 bots.





Bot @dm:add-bounded-dataset

Bot Position In Pipeline: Source Sink

This bot adds a bounded dataset. Bounded datasets are bound to a pre-defined schema, so data is always validated against a set of rules.

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
name* Text Name of the dataset
schema_name* Text Name of the schema to bind this dataset
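
Example usage (a minimal sketch; the dataset and schema names are placeholders, and the schema is assumed to have been added earlier, e.g. with @dm:add-schema):

@dm:add-bounded-dataset name = 'incidents_bounded' &
        schema_name = 'incident_schema'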







Bot @dm:add-checksum

Bot Position In Pipeline: Sink

Add a checksum to the input dataframe. The checksum can be computed per row only, or per row and then for the entire dataset.

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
checksum_type Text dataset Compute checksum by row only, or by row and then for the entire dataset. Valid values are 'rows-only', 'dataset'
row_checksum_column Text rda_row_checksum Output column for computed row level checksum. If the column already exists, it will be replaced and not included in the checksum computation.
data_checksum_column Text rda_data_checksum Output column for computed checksum for entire dataset. If the column already exists, it will be replaced and not included in the checksum computation.
key Text Optional key to be used in the computed hash.

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:add-checksum checksum_type='rows-only' &
            row_checksum_column='id'







Bot @dm:add-missing-columns

Bot Position In Pipeline: Sink

Add columns if not found in the input

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
columns* Text Comma separated list of column names to be added if they do not already exist in input
value Text Value to be assigned if columns are not found in input. Default is None.

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:add-missing-columns columns = 'Serial_Number' &
            value = 'SSI151909K5'








Bot @dm:add-schema

Bot Position In Pipeline: Source Sink

This bot adds a JSON schema to the system. Datasets can be bound to this schema so that any rows added or edited are automatically validated against it.

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
schema_file* Text File path or URL of the json schema file
name* Text Name of the json schema
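
Example usage (a minimal sketch; the schema file URL and name are placeholders):

@dm:add-schema schema_file = 'https://example.com/schemas/incident-schema.json' &
        name = 'incident_schema'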







Bot @dm:add-template

Bot Position In Pipeline: Source Sink

Add a formatting template with 'name' and contents downloaded from a 'url'. If the template already exists, it will be overwritten.

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
url* Text URL to contents of the Jinja2 style formatting template
name* Text Name of the formatting template
description Text Formatting template description
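
Example usage (a minimal sketch; the template URL, name, and description are placeholders):

@dm:add-template url = 'https://example.com/templates/incident-summary.j2' &
        name = 'incident_summary' &
        description = 'Formats a one-line incident summary'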







Bot @dm:addrow

Bot Position In Pipeline: Sink

Append a row to input dataframe, with specified column = value parameters

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

This bot only accepts wildcard parameters. All name = 'value' parameters are passed to the bot.

Example usage:

@dm:empty
    --> @dm:addrow column1 = "Value1" & column2 = "Value2"








Bot @dm:apply-alert-rules

Bot Position In Pipeline: Source Sink

Apply the alert ruleset specified by 'name' to the input data.

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
name* Text Name of the alert ruleset to apply
timestamp_column Text Name of the column to be used as timestamp







Bot @dm:apply-data-model

Bot Position In Pipeline: Sink

Apply specified data model to input dataframe. Example model name 'assetLCMMaster'

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
model* Text Name of the model to apply to input dataframe
removeUnmapped Text no Remove columns that are not in the model. Specify 'yes' or 'no'
apply_for_empty Text no Apply value for empty string ('yes' or 'no')

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:save name = "sample_servicenow_incident"

--> @c:new-block
    --> @dm:recall  name="sample_servicenow_incident"
    --> @dm:apply-data-model  model="assetLCMMaster" &
            removeUnmapped="no"
    --> @dm:save  name="sample_servicenow_incident"







Bot @dm:apply-snmp-trap-template

Bot Position In Pipeline: Sink

Apply template to incoming SNMP Trap objects. Template must be available in RDA Object repository.

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
timestamp_col* Text Column which has timestamp at which SNMP Trap was received. Must be in UTC epoch milliseconds format
version_col Text Column which has SNMP version.
address_col* Text Column which has IPAddress of the SNMP Trap source
varbinds_col* Text Column which has varbind list. Should be list of dict objects
template_folder Text snmp_trap_templates Name of the folder for RDA Objects which contains the SNMP Trap template
template_name Text traps Name of the RDA Object which has the SNMP Trap template







Bot @dm:apply-template-all-rows

Bot Position In Pipeline: Sink

Apply specified formatting template for all input rows and produce one rendered row

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
template_name* Text Name of the formatting template to be applied
output_col* Text Output column
status_col Text Template parsing status column. If not specified, any errors will cause the pipeline to abort.








Bot @dm:apply-template-by-row

Bot Position In Pipeline: Sink

Apply specified formatting template for each input dataframe row and produce rendered output

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
template_name* Text Name of the formatting template to be applied
output_col* Text Output column
status_col Text Template parsing status column. If not specified, any errors will cause the pipeline to abort.








Bot @dm:apply-topology-rci

Bot Position In Pipeline: Sink

Apply topology based Root Cause Inference model

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
node_weights_dict Text Name of the node weights dictionary. Expects columns node_id and weight
link_weights_dict Text Name of the link weights dictionary. Expects columns link_type and weight
severity_weights_dict Text Name of the severity weights dictionary. Expects columns severity and weight
select_top Text 1 How many high score nodes to select. Default is 1
stack_name* Text Name of the stack







Bot @dm:bin

Bot Position In Pipeline: Sink

Create bins for the numerical 'column' using the bin boundaries specified by the 'bins' parameter

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
column* Text Numerical value column
bins* Text Comma separated list of numerical values representing bins

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
      --> @dm:hist timestamp = 'sys_created_on' &
            interval = '30d'
      --> *dm:filter * get count as 'tcount'
      --> @dm:bin column = 'tcount' &
            bins = "0,1,5,10,100"







Bot *dm:bookmark-list

Bot Position In Pipeline: Source

List of saved bookmarks

This bot expects a Full CFXQL.

Bot applies the Query on the data that is already loaded from previous bot or from a source.

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'

--> @c:new-block
    --> *dm:bookmark-list








Bot @dm:build-hierarchy

Bot Position In Pipeline: Sink

Builds relationships between the entities and populates hierarchy keys in the newly created 'hierarchy' column

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
entity_key_column* Text Attribute that represents the unique entity key column
relation_key_column* Text A relationship key column that should match the entity_key_column
hierarchy_end_key_column Text Attribute that represents the hierarchy end key column to stop hierarchy building
hierarchy_end_value Text Hierarchy end value to stop hierarchy building
include_column_to_primary_key Text Column name that will be added to the entity key column to make it unique when building the hierarchy







Bot @dm:change-time-format

Bot Position In Pipeline: Sink

Change datetime from one format to another for all specified columns

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
columns* Text Comma separated list of column names
from_format* Text From format, one of: datetimestr s ms ns
to_format* Text To format, one of: datetimestr s ms ns. Can also specify a custom format expression.
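
Example usage (a minimal sketch; assumes 'sys_created_on' in the sample file is a datetime string and converts it to epoch milliseconds):

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:change-time-format columns = 'sys_created_on' &
            from_format = 'datetimestr' &
            to_format = 'ms'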







Bot @dm:check-columns

Bot Position In Pipeline: Sink

Check input columns for specific list of columns that must exist or must not exist and take an action

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
must_contain Text Comma separated list of columns which must exist in the input.
must_not_contain Text Comma separated list of columns which must not exist in the input.
action* Text Action to take if either of the column checks fails. Must be one of 'fail','skip-block', 'skip-pipeline'

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:check-columns must_contain = 'active,activity_due,approval,assigned_to' &
            must_not_contain ='assignment_group,business_duration' &
            action ='fail'








Bot @dm:check-integrity

Bot Position In Pipeline: Sink

Check integrity of input data using 'rules' dataset and save results to 'errors' dataset

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
rules* Text Name of the rules dataset
failfast Text yes Specify 'yes' to abort quickly on first error or 'no' to keep validating rules even when some rules fail
errors* Text Name of the output errors dataset
failpipeline Text no Specify 'yes' to fail the entire pipeline on errors, 'no' to keep executing







Bot *dm:cohort-list

Bot Position In Pipeline: Source

List of Cohorts

This bot expects a Full CFXQL.

Bot applies the Query on the data that is already loaded from previous bot or from a source.







Bot @dm:cohort-load

Bot Position In Pipeline: Source Sink

Load cohort specified by 'name'

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
name* Text Name of the cohort to load







Bot @dm:concat

Bot Position In Pipeline: Source

Concatenate a set of saved dataframes ('names'). Each dataframe must have been saved using dm:save

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
names* Text Name of the saved datasets (regex)
return_empty Text no Return an empty dataframe if an error occurs loading the dataset

Example usage:

@dm:empty
    --> @dm:addrow  Id = 198273 &
            Name = Jhon &
            Age = 50
    --> @dm:save name = "concat_Dataset_1"

--> @c:new-block
    --> @dm:empty
    --> @dm:addrow  Id = 191217 &
            Name = Don &
            Age = 35
    --> @dm:addrow Id = 187654 &
            Name = Alex &
            Age = 30
    --> @dm:save name = "concat_Dataset-2"

--> @c:new-block
    --> @dm:concat  names = "concat_Data.*"
    --> @dm:save name = "concat_dataset"








Bot @dm:concat-input-dataset

Bot Position In Pipeline: Sink

Concatenate a set of saved dataframes ('names') with the input dataset. Each dataframe must have been saved using dm:save

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
names* Text Name of the saved datasets (regex)
return_empty Text no Return an empty dataframe if an error occurs loading the dataset
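
Example usage (a minimal sketch; 'extra_rows' is a placeholder dataset saved in the first block):

@dm:empty
    --> @dm:addrow Id = 198273 &
            Name = 'Jhon'
    --> @dm:save name = "extra_rows"

--> @c:new-block
    --> @files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:concat-input-dataset names = "extra_rows"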







Bot @dm:content-to-object

Bot Position In Pipeline: Sink

Convert data from a column into objects

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
input_content_column* Text Name of the column in input that contains the data
output_column* Text Column name where object names will be inserted
output_folder* Text Folder name where objects will be stored







Bot @dm:copy-columns

Bot Position In Pipeline: Sink

Copy values from one column to another

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
from* Text Get the value from specified column or columns (comma separated)
to* Text Store the value to specified column or columns (comma separated). The specified number of columns should match 'from' column(s).
func Text Supported operations are: [strip, upper, lower, append, lstrip, rstrip, replace, split, join, len]
value Text , Specify a value for the 'split' and 'join' functions; specify 'oldvalue' and 'newvalue' for the 'replace' function. Default value is comma.
prefix Text If function is 'append', specify the string to be appended at the beginning
suffix Text If function is 'append', specify the string to be appended at the end
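
Example usage (a minimal sketch; copies the 'state' column into a new 'state_copy' column):

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:copy-columns from = 'state' &
            to = 'state_copy'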







Bot @dm:copy-config

Bot Position In Pipeline: Source Sink

Copy RDA Object to local file in worker, typically used to update a configuration file on a mounted folder

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
src_object* Text Object name
src_folder* Text Folder name on the object storage
dest_file* Text location of the destination file
backup_dir Text If dest_file exists, copy it to this backup directory







Bot @dm:counter

Bot Position In Pipeline: Sink

Adds COUNTER to each row of the input dataset

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
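
Example usage (a minimal sketch; no parameters are listed for this bot, and it appends a counter to each row of the loaded dataset):

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:counter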







Bot @dm:create-cohorts

Bot Position In Pipeline: Source Sink

Create cohorts from input stack

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
stack_name* Text Name of the Stack to use
groupby* Text comma separated column names to do groupby
cohort_name_prefix Text cohort Cohort name prefix to use
cfxql_filter Text cfxql filter to apply on stack data







Bot @dm:create-logarchive-repo

Bot Position In Pipeline: Source Sink

Create logarchive repository on RDA Platform Minio, if not created already

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
repo* Text Name of the Log Archive repository to be created. If a repo already exists with this name, bot will not perform any action.
prefix* Text Object prefix on the platform Minio.
retention Text 0 Retention period in number of days. If set to 0, RDA will not manage the log archive lifecycle.








Bot @dm:create-persistent-stream

Bot Position In Pipeline: Source Sink

Create a Persistent Stream if not already created

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
name* Text Name of the Persistent Stream to create.
index_name Text Optional index name in OpenSearch. If not specified, index name will be automatically created from stream name.
retention_days Text 31 Retention period in number of days. If set to 0, RDA will not manage the persistent stream lifecycle.
timestamp_column Text Name of timestamp column. Optional.
unique_cols Text Comma separated list of columns to be used as unique columns to make stream updatable
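
Example usage (a minimal sketch; the stream name and column names are placeholders):

@dm:create-persistent-stream name = 'incident_stream' &
        retention_days = '31' &
        timestamp_column = 'timestamp' &
        unique_cols = 'number'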








Bot @dm:create-zipfile

Bot Position In Pipeline: Source

Zip the contents of the given folder and place it at the specified location.

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
folder_name* Text Folder name to create a zip file
zipfile_name Text File name for the created zipfile. If the zipfile name is not specified, it will be taken from the folder_name
save_to_location Text False Location to place the created zip file. If it is not specified, the zipfile will be saved in the given folder_name







Bot @dm:dataset-location

Bot Position In Pipeline: Source Sink

Get the location information for a previously saved dataset

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
name* Text Name of the dataset

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:save name = 'servicenow_incidents_example'
    --> @dm:dataset-location name = 'servicenow_incidents_example'








Bot @dm:dedup

Bot Position In Pipeline: Sink

Dedup rows using specified 'columns'

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
columns Text Comma separated list of column names. Default: All columns
keep Text first Specify which duplicates (if any) to keep. Choose from 'first' or 'last'

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:dedup columns = 'state'








Bot @dm:delete-dataset

Bot Position In Pipeline: Sink

Delete a previously saved dataset

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
dataset_column Text dataset Name of the column which contains the name of the dataset to delete







Bot @dm:describe

Bot Position In Pipeline: Sink

Describe the input dataframe using optional 'columns' attribute

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
columns Text Comma separated list of column names. Default all columns

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:describe columns = 'assignment_group'







Bot @dm:diff

Bot Position In Pipeline: Sink

Compare input dataset against a 'base_dataset'

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
base_dataset* Text Name of the base dataset to compare the input dataset
key_cols* Text Comma separated columns to identify each row
exclude Text Exclude columns in the diff (regex pattern)
keep_data Text no Keep the data columns in the diff output ('yes' or 'no')

Example usage:

@dm:empty
    --> @dm:addrow product="cucm" &
            id="1" &
            ip="10.95.112.50"
    --> @dm:save name="base_ds"

--> @c:new-block
    --> @dm:empty
    --> @dm:addrow product="host" &
            id="1" &
            ip="10.95"
    --> @dm:diff base_dataset="base_ds" &
            key_cols="id"








Bot @dm:dns-ip-to-name

Bot Position In Pipeline: Sink

Perform reverse DNS lookup to map IP Addresses to Hostnames on specified columns

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
from_cols* Text Comma separated list of columns with IP Address values
to_cols* Text Comma separated list of column names to store resolved Hostnames.
keep_value Text no If lookup fails, store the original value if 'yes', or null if 'no'
num_threads Text 5 Number of threads. Must be in the range of 1 to 20
additional_records Text false Get additional domain names. true/false
record_type Text PTR Comma separated record types for additional records. ex: PTR,A,CNAME

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:add-missing-columns columns = 'ipaddress' &
            value = '10.95.158.1'
    --> @dm:dns-ip-to-name from_cols = "ipaddress" &
            to_cols = "name_addr"








Bot @dm:dns-name-to-ip

Bot Position In Pipeline: Sink

Perform DNS lookup to map Hostnames to IP Addresses on specified columns

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
from_cols* Text Comma separated list of columns with Hostnames
to_cols* Text Comma separated list of column names to store resolved IP Addresses.
keep_value Text no If lookup fails, store the original value if 'yes', or null if 'no'
num_threads Text 5 Number of threads. Must be in the range of 1 to 20
additional_records Text false Get additional ip addresses list. true/false

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:add-missing-columns columns = 'named_addr' &
            value = 'ip-10-95-158-1.us-west-2.compute.internal'
    --> @dm:dns-name-to-ip from_cols = "named_addr" &
            to_cols = "ipaddress"








Bot @dm:drop-null-columns

Bot Position In Pipeline: Sink

Drop columns with a specified % of null values

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
keep_columns Text Column name regex pattern. All columns matching this pattern must remain in output even if they have nulls
threshold Text 100.0 Percent threshold for Null (NaN) values for each column. If the % of nulls exceeds this threshold, the column will be removed from output. Value must be > 0 and <= 100.0. Default is 100
empty_is_null Text no Treat empty strings with white spaces only as Nulls. Valid values are 'yes' or 'no'. Default is 'no'
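
Example usage (a minimal sketch; drops any column whose null percentage exceeds 90):

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:drop-null-columns threshold = '90'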







Bot @dm:dropnull

Bot Position In Pipeline: Sink

Drop rows if specified 'columns' have null values

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
columns* Text Comma separated list of column names

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:dropnull
            columns = 'assigned_to,assignment_group'







Bot @dm:empty

Bot Position In Pipeline: Source Sink

Create an empty dataframe with optional columns

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
columns Text Comma separated list of column names to be included in the empty dataframe. By default no columns are included.

Example usage:

@dm:empty
    --> @dm:addrow
            ip_address = "10.95.103.125" &
            name = "john"









Bot @dm:enrich

Bot Position In Pipeline: Sink

Enrich the input dataframe using a saved dictionary dataset

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
dict* Text Name of the saved dataset to be used as dictionary
src_key_cols* Text Comma separated list of column names in input to use for join
dict_key_cols* Text Comma separated list of column names in dict to use for join
enrich_cols* Text Comma separated list of column names to bring from dictionary
enrich_cols_as Text Comma separated list of new column names in the enriched output
suffixes Text _x,_y Comma separated list of suffixes to add to overlapping column names in left and right respectively
indicator Text False Enable indicator to add a column called _merge to the output DataFrame with information on the source of each row. The column can be given a different name by providing a string argument
how_type Text left Specify the type of merge to be performed, e.g. right, outer, inner. By default a left merge is performed
dedup_dict Text yes Set to 'no' to keep duplicate rows from dict_key_cols instead of dropping them
case_insensitive Text no Perform case insensitive match on the key values
replace_values Text no If enabled, the actual column value will be replaced with the _x or _y value when not null, and the _x/_y columns are dropped
cache Text yes Cache the dict for future recalls. 'yes' or 'no'.
cache_refresh_seconds Text Refresh the cache (if a new update is available) after the specified seconds. If not provided, dataset will be cached until pipeline termination.

Example usage:

@dm:empty
    --> @dm:addrow IP="123" &
            name="john"
    --> @dm:save name="dataset_one"

--> @c:new-block
    --> @dm:empty
    --> @dm:addrow place="Hyderabad" &
            employee_id="145" &
            name = "john"
    --> @dm:save name="dataset_two"

--> @c:new-block
    --> @dm:recall name="dataset_one"
    --> @dm:enrich dict="dataset_two" &
            src_key_cols="name" &
            dict_key_cols="name" &
            enrich_cols="place,employee_id"









Bot @dm:enrich-conditional

Bot Position In Pipeline: Sink

Enrich the input dataframe using a saved dataset based on CFXQL condition

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
dict* Text Name of the saved dataset to be used as dictionary
condition* Text Condition must be a valid CFXQL expression that is applied to input dataset. Supports GET operation for column filtering.
enrich_cols Text Comma separated list of enriched columns from dictionary if the rule matches. This will be applied after the condition parameter is applied.
enrich_cols_as Text Rename the enrich columns, should be specified in the same order as in enrich_cols param. This will be applied after the condition parameter is applied.
return_status Text no Add a column 'meta_enrich_status' to the output with the status of the enrichment. If set, failures will be captured in this column.
cache Text no Cache the result for future recalls. 'yes' or 'no'
cache_refresh_seconds Text 120 Refresh the cache (if new update available) after specified seconds







Bot @dm:enrich-using-ip-cidr

Bot Position In Pipeline: Sink

Enrich the input dataframe using a saved dictionary dataset. Match IP address in input dataframe with CIDRs specified in the dictionary

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
dict* Text Name of the saved dataset to be used as dictionary
src_ip_col* Text Column name in input dataframe which has IPv4 or IPv6 address
dict_cidr_col* Text Column name in dict which contains the CIDR values to match against
enrich_cols* Text Comma separated list of column names to bring from dictionary
enrich_cols_as Text Comma separated list of new column names in the enriched output

Example usage:

@dm:empty
    --> @dm:addrow IP="19.23.97.102" and
            serial_number="11363"
    --> @dm:addrow IP="19.23.97.103" and
            serial_number="11364"
    --> @dm:save name="dataset_one"

--> @c:new-block
    --> @dm:empty
    --> @dm:addrow place="Hyderabad" and
            employee_id="13212" and
            name = "john" and
            ip_address ="19.23.97.102"
    --> @dm:addrow place="pune" and
            employee_id="13210" and
            name = "joy" and
            ip_address ="19.23.97.103"
    --> @dm:save name="dataset_two"

--> @c:new-block
    --> @dm:recall name="dataset_one"
    --> @dm:enrich-using-ip-cidr dict = 'dataset_two' and
            src_ip_col = 'IP' and
            dict_cidr_col = 'ip_address' and
            enrich_cols  ='place'








Bot @dm:enrich-using-ip-cidr-multi-proc

Bot Position In Pipeline: Sink

Enrich the input dataframe using a saved dictionary dataset. Match IP address in input dataframe with CIDRs specified in the dictionary. Use specified number of processes.

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
dict* Text Name of the saved dataset to be used as dictionary
src_ip_col* Text Column name in input dataframe which has IPv4 or IPv6 address
dict_cidr_col* Text Column name in dict which contains the CIDR values to match against
enrich_cols* Text Comma separated list of column names to bring from dictionary
enrich_cols_as Text Comma separated list of new column names in the enriched output
_max_procs Text 0 Maximum number of CPUs to use. 0 means all available CPUs. Value should be >= 1







Bot @dm:enrich-using-pstream

Bot Position In Pipeline: Sink

Enrich the input dataframe using a persistent stream

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
dict* Text Name of the persistent stream to be used as dictionary
src_key_cols* Text Comma separated list of column names in input to use for join
dict_key_cols* Text Comma separated list of column names in dict to use for join
enrich_cols* Text Comma separated list of column names to bring from dictionary
enrich_cols_as Text Comma separated list of new column names in the enriched output
batch_lookup Text Specify how many unique rows to look up in the dictionary at a time. Example: 50. This option typically improves performance when the dictionary is very large.
suffixes Text _x,_y Comma separated list of suffixes to add to overlapping column names in left and right respectively
indicator Text False Enable indicator to add a column called _merge to the output DataFrame with information on the source of each row. The column can be given a different name by providing a string argument
how_type Text left Specify the type of merge to be performed, e.g. right, outer, inner. By default a left merge is performed
dedup_dict Text yes Set to 'no' to keep duplicate rows from dict_key_cols instead of dropping them
replace_values Text no If enabled, the actual column value will be replaced with the _x or _y value when not null, and the _x/_y columns are dropped







Bot @dm:enrich-using-rule-dict

Bot Position In Pipeline: Sink

Enrich using rule based dictionary which contains 'rule' column

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
dict* Text Name of the saved dataset to be used as rules dictionary. Dictionary should contain rule_id and rule columns
rule_id_column Text rule_id Rule ID column in dictionary. Rules will be sorted in ascending order using this column
rule_column Text rule Rule column in dictionary. Rule must be a valid CFXQL expression that is applied to the input dataset.
enrich_columns Text Comma separated list of enriched columns from dictionary if the rule matches
template_columns Text Comma separated list of template column names. At least one of enrich_columns or template_columns must be specified.
cache Text yes Cache the dict for future recalls. 'yes' or 'no'.
cache_refresh_seconds Text Refresh the cache (if a new update is available) after the specified seconds. If not provided, dataset will be cached until pipeline termination.








Bot @dm:eval

Bot Position In Pipeline: Sink

Map values using evaluate function. Specify one or more column = 'expression' pairs

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
_skip_errors Text no Specify 'yes' or 'no'. If 'yes', do not bailout if any of the row processing results in error. Check meta_eval_message field when it continues with an error.

This bot also accepts wildcard parameters. Any additional name = 'value' parameters are passed to the bot.

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:eval  state_descr = "'state is ' + (state)"








Bot @dm:eval-multi-proc

Bot Position In Pipeline: Sink

Map values using evaluate function. Uses all available CPU cores to do parallel processing. Specify one or more column = 'expression' pairs

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
_max_procs Text 0 Maximum number of CPUs to use. 0 means all available CPUs. Value should be >= 1
_skip_errors Text no Specify 'yes' or 'no'. If 'yes', do not bailout if any of the row processing results in error. Check meta_eval_message field when it continues with an error.

This bot also accepts wildcard parameters. Any additional name = 'value' parameters are passed to the bot.

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:eval-multi-proc  state_descr = "'state is ' + (state)" &
            _max_procs = 2








Bot @dm:event-sampling

Bot Position In Pipeline: Source Sink

Sample events to create training dataset for classification

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
time_column Text timestamp Event occurred time column
events_spread_dataset* Text Events Spread dataset name
critical_event_mnemonic_columns* Text Comma separated list of column names to use to filter critical events
sample_events_query Text * CFXQL query to fetch related events in duration period
duration_hours Text 1 Duration to fetch related events for each critical event
groupby* Text Column name to perform groupby to get counts







Bot @dm:eventcorr-intra-group

Bot Position In Pipeline: Sink

Compute noise reduction for each group using 'groupby', 'created', 'window' columns

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
groupby* Text Comma separated list of columns to do the grouping
timestamp* Text Timestamp column name, typically event created timestamp
unit Text Timestamp unit. Default is None, meaning timestamp is a string. Other units are s,ms,ns
id Text Identity column, if not specified will use timestamp column
window Text 15 Sliding time window that groups events that occur within the window (in minutes). Multiple windows may be specified as comma separated list
window_type Text moving Window type 'moving' or 'fixed'
group_label_dataset Text If specified, correlated group assignments will be written to the specified dataset

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:eventcorr-intra-group groupby = "state,description" &
            timestamp = "sys_created_on"








Bot @dm:eventzoning

Bot Position In Pipeline: Sink

Compute event zones using 'groupby', 'created', 'resolved' columns

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
groupby* Text Comma separated list of columns to do the grouping
created* Text Timestamp column name, typically event created timestamp
resolved Text Timestamp column name, typically event closed or resolved timestamp
unit Text Timestamp unit. Default is None, meaning timestamp is a string. Other units are s,ms,ns
id Text Identity column, if not specified will use timestamp column
freq Text 5% Frequency threshold for zoning. Default 5%. If % is omitted it will be taken as absolute count threshold
mttr Text 1d MTTR threshold for zoning. Example: 1d, 2h, 90m, 9000s

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:eventzoning groupby = "state,approval" &
            created = "time_worked"








Bot @dm:explode

Bot Position In Pipeline: Sink

Explode a 'column' into rows by splitting the value using a 'sep' separator

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
column* Text Name of the column to explode into rows
sep Text , Separator (default is comma)

Example usage:

@dm:empty
    --> @dm:addrow
            message = "AgentDevice=WindowsLog\tAgentLogFile=System\tPluginVersion=WC.MSEVEN6.10.0.1.276\tSource=Service Control Manager\tComputer=DESKTOP-5O3J61V\tOriginatingComputer=DESKTOP-5O3J61V\tUser=SYSTEM\tDomain=NT AUTHORITY\tEventID=1073748864\tEventIDCode=7040\tEventType=4\tEventCategory=0\tRecordNumber=19466\tTimeGenerated=1653682613\tTimeWritten=1653682613\tLevel=Informational\tKeywords=Classic\tTask=None\tOpcode=Info\tMessage=The start type of the Background Intelligent Transfer Service service was changed from demand start to auto start."
    --> @dm:eval
            msg = "message.replace('\\t', ';')"
    --> @dm:explode
            column = "msg" & sep = ";"
    --> @dm:eval
            name = "msg.split('=', 1)[0]" &
            value = "msg.split('=', 1)[1]"
    --> @dm:pivot-table
            columns = "name" &
            index = "message" &
            value = "value" &
            agg = "first"








Bot @dm:explode-json

Bot Position In Pipeline: Sink

Explode a 'column' that contains JSON object(s) into rows

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
column* Text Name of the column with JSON data to explode into rows
ignore_errors Text yes Ignore JSON parsing or structural errors. 'yes' or 'no'
exclude_exploded_columns Text Regular expression to exclude a set of columns from exploded data
include_exploded_columns Text .* Regular expression to include a set of columns from exploded data
prefix_parent_key Text no Prefix the parent key to the exploded columns.

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:add-missing-columns columns = 'server_ipaddress' &
            value = '{"server_ip":"10.95.158.1","group":"network","duration_of_business":"2 hours"}'
    --> @dm:explode-json column = 'server_ipaddress'









Bot @dm:explode-timerange-into-windows

Bot Position In Pipeline: Sink

Explode a specified timerange into windows for events that have created and resolved timestamps. Aggregate value for each window using specified function.

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
groupby* Text Comma separated list of columns to do the grouping
created_col* Text Event created Timestamp column name, must be in datetimestr format
resolved_col* Text Event resolved Timestamp column name, must be in datetimestr format
value_col* Text Column containing the value to aggregate for each window
window_start* Text Window start timestamp in datetimestr format
window_end Text Window end timestamp in datetimestr format. If not specified, current timestamp will be used.
interval* Text Bucketization interval expressed with days, hours, minutes, seconds. Ex: '1d, 4h, 15min'
agg Text sum Value aggregation function. Valid values are sum, min, max

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:explode-timerange-into-windows groupby = "state,description" &
            created_col = "sys_created_on" &
            resolved_col = "resolved_at" &
            value_col = "resolved_at" &
            window_start = "2022-06-05 13:43:14" &
            interval = "1d, 4h, 15min"








Bot @dm:extract

Bot Position In Pipeline: Sink

Extract data using 'expr' regex pattern from 'columns'

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
expr* Text Regular expression with named patterns
columns* Text Comma separated list of columns from which to extract the data
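
Example usage (a minimal sketch; mirrors the @dm:grok example later in this section using a Python-style named group):

@dm:empty
    --> @dm:addrow
            description = "Server ip is 10.95.103.125"
    --> @dm:extract
            expr = "Server ip is (?P<server_ip>[0-9.]+)" &
            columns = "description"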







Bot @dm:extract-contents-from-html

Bot Position In Pipeline: Sink

Extract contents from HTML content in the input dataset.

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
output_column* Text Name of the output column to save the extracted HTML content.
input_column* Text Name of the input column which contains HTML content.
path* Text The path to the element to extract, separated by periods (e.g. 'html.body.div').
index Text 0 The index of the element to extract, if there are multiple elements at the specified path. Defaults to 0.







Bot @dm:extract-key-value

Bot Position In Pipeline: Sink

Extract Key-Value pairs from column and add to dataframe

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
name Text extract_kv Name for this bot.
format Text kv_type1 Format of input data. Supported formats are syslog_kv_type1, cef, kv_type1. Ex: syslog_kv_type1 format: <150>device="SFW" date=2022-07-04 time=10:06:39 timezone="IST" device_name="AB690" ... kv_type1 format: device="SFW" date=2022-07-04 time=10:06:39 timezone="IST" device_name="AB690"
column* Text Column name in input dataset which contains key=value fields that need to be extracted.
_max_procs Text 1 Maximum number of CPUs to use. 0 means all available CPUs.
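
Example usage (a minimal sketch; the message follows the kv_type1 format shown above):

@dm:empty
    --> @dm:addrow
            message = 'device="SFW" date=2022-07-04 time=10:06:39 timezone="IST" device_name="AB690"'
    --> @dm:extract-key-value
            column = 'message' &
            format = 'kv_type1'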







Bot @dm:fail-if-shape

Bot Position In Pipeline: Sink

Fail the pipeline, if input dataframe shape meets criteria expressed using 'row_count' & 'column_count'

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
row_count Text Number of rows in input dataframe. This variable accepts all numeric operations.
column_count Text Number of columns in input dataframe. This variable accepts all numeric operations.

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:fail-if-shape row_count = 66 & 
                   column_count = 88








Bot @dm:file-to-object

Bot Position In Pipeline: Sink

Convert files from a column into objects

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
input_filename_column* Text Name of the column in input that contains the filenames
output_column* Text Column name where object names will be inserted
output_folder* Text Folder name where objects will be stored







Bot *dm:filter

Bot Position In Pipeline: Sink

Apply CFXQL filtering on the data

This bot expects a Full CFXQL.

Bot applies the Query on the data that is already loaded from previous bot or from a source.

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> *dm:filter assignment_group  contains 'Software'  get assigned_to , assignment_group as 'assigned'








Bot @dm:filter-using-dict

Bot Position In Pipeline: Sink

Filter rows using a dictionary. Action can be 'include' or 'exclude'

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
dict* Text Name of the saved dataset to be used as dictionary
src_key_cols* Text Comma separated list of column names in input to use for join
dict_key_cols* Text Comma separated list of column names in dict to use for join
action Text include Must be one of 'include' or 'exclude'. Include means keep the rows that match the dictionary, else drop the rows that match the dictionary.
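
Example usage (a minimal sketch; 'allowed_states' is a placeholder dictionary dataset built in the first block):

@dm:empty
    --> @dm:addrow state = "Closed"
    --> @dm:save name = "allowed_states"

--> @c:new-block
    --> @files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:filter-using-dict dict = "allowed_states" &
            src_key_cols = "state" &
            dict_key_cols = "state" &
            action = "include"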







Bot @dm:find-affected-child-nodes

Bot Position In Pipeline: Sink

Traverse CMDB relationship like table to identify potentially affected child nodes for each parent node

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
impacted_classes* Text Comma separated list of impacted classes in output. Ex: 'Server,Virtual Machine Instance'
max_depth Text 3 Max number of hops from parent node







Bot @dm:find-and-replace

Bot Position In Pipeline: Sink

Search data for the given condition and replace column value for the specified column name

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
condition Text One or more search conditions used to fetch the records for replacing data
column_name* Text Specify list of column names to replace the data
column_value* Text Specify the list of column values to replace for the specified column names
replace_if_column_exist Text Specify the column name to replace the data, only if this column exists
sep Text Specify the separator to list multiple conditions,column names & column values
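
Example usage (a minimal sketch; the condition and replacement value are illustrative):

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:find-and-replace condition = "assignment_group contains 'Software'" &
            column_name = 'assigned_to' &
            column_value = 'unassigned'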







Bot @dm:fixcolumns

Bot Position In Pipeline: Sink

Fix column names such that they contain only allowed characters

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
include Text Column name regex pattern to fix in the output, remaining columns are left as is

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:fixcolumns









Bot @dm:fixnull

Bot Position In Pipeline: Sink

Replace null values in a comma separated column list

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
columns* Text Comma separated list of column names
value Text Value to be replaced with, default is empty string
apply_for_empty Text no Apply value for empty string ('yes' or 'no')

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:fixnull
            columns = 'assigned_to'  &
            value = 'unassigned'







Bot @dm:fixnull-regex

Bot Position In Pipeline: Sink

Replace null values in all columns that match the specified regular expression

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
columns Text .* Regular expression for column names
value Text Value to be replaced with, default is empty string
apply_for_empty Text no Apply value for empty string ('yes' or 'no')
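
Example usage (a minimal sketch; fills nulls in every column whose name starts with 'assigned'):

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:fixnull-regex columns = 'assigned.*' &
            value = 'unassigned'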








Bot *dm:functions

Bot Position In Pipeline: Source

List of functions available for mapping in 'map' bots

This bot expects a Full CFXQL.

Bot applies the Query on the data that is already loaded from previous bot or from a source.







Bot @dm:gc

Bot Position In Pipeline: Sink

Perform immediate garbage collection. Useful when dealing with very large datasets.

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
generation Text 2 Generation parameter. Must be 0 or 1 or 2

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:gc








Bot @dm:generate-metric-stats

Bot Position In Pipeline: Source

Generate usage stats (ex: hourly) for a given period of time

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
stream* Text Name of the Persistent Stream
cfxql_query Text * CFXQL query to filter stream data.
ts_column Text timestamp Timestamp column name in the stream
groupby* Text Comma separated list of column names to do the grouping. For better performance, have the first
item in the group to the asset (ex: asset_id) for which the stats are generated
column* Text Name of the column that has the metric data value
threshold Text 90 Metric (ex: CPU) Usage % alarm threshold
threshold_type Text over Possible values: over or under. Use this to check if the threshold is over or under the provided
threshold
clear_threshold Text 75 Metric (ex: CPU) Usage % recovery threshold
bucket Text HOUR Duration bucket. For now, only HOUR and MONTH are supported. Metrics insights/analysis (which
provides recommendations based on thresholds) is available only for HOUR
freq Text MONTH Frequency for data collection. For now, only MONTH is supported
skip_below_threshold Text yes Skip processing groups which haven't crossed the threshold even once. Set it to 'no' to process
all groups.
max_value Text 100 Provide the maximum value possible for the metric. For metrics that report the value as a
%, the default of 100 is sufficient. This helps in determining the value relative to the max value
for threshold analysis
chunk_size Text 1000 Number of rows to fetch in each chunk when retrieving data to generate stats. Use a larger number
when dealing with many data points and relatively little data per row. Do not use more
than 5000. Avoid using less than 1000
generate_alarm_times Text no Set it to 'yes' to generate alarm time details, which include the day of the week with counts.
For example, this could be used to create a suppression policy. Note: this could cause the bot
to run very slowly
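
Example usage (an illustrative sketch; the stream name 'cpu-metrics-stream' and the columns 'asset_id' and 'cpu_utilization' are hypothetical and must exist in your environment):

@dm:generate-metric-stats stream = 'cpu-metrics-stream' &
            groupby = 'asset_id' &
            column = 'cpu_utilization' &
            threshold = '90' &
            clear_threshold = '75' &
            bucket = 'HOUR'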







Bot @dm:get-from-location

Bot Position In Pipeline: Source Sink

Retrieve dataset from a specified location

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
path* Text Path to object in Minio bucket.
empty_as_null Text yes While reading the saved datasets, treat empty strings as Nulls. 'yes' or 'no'
_skip_errors Text no Specify 'yes' or 'no'. If 'yes', do not bail out if any of the processing results in error.

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:save name = "servicenow_dataset"
    --> @dm:save-to-location name = "servicenow_dataset" &
            format = "csv" &
            location = "output"

--> @c:new-block
    --> @dm:get-from-location path = "output/servicenow_dataset.csv"







Bot @dm:get-tagged-dataset

Bot Position In Pipeline: Source

List the datasets that are tagged with given tag name

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
tag_name* Text Tag name to get list of tagged bounded dataset
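
Example usage (an illustrative sketch; the tag name 'servicenow' is hypothetical and assumes datasets were previously saved with that tag, for example via @dm:save with tag = 'servicenow'):

@dm:get-tagged-dataset tag_name = 'servicenow'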







Bot @dm:grok

Bot Position In Pipeline: Sink

Extract data using Grok syntax from a single column

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
column* Text Column whose data needs to be parsed with the grok pattern
pattern* Text Grok pattern. To use more than one pattern, use | (pipe without spaces) between
the patterns.
_skip_errors Text no Specify 'yes' or 'no'. If 'yes', do not bail out if any of the row processing results in error.
Check the meta_grok_message field when it continues with an error.
exclude_unmatched_columns Text no Specify 'yes' or 'no'. If 'yes', don't include unmatched columns.

A list of pre-built grok patterns is available here.

Example usage:

@dm:empty
    --> @dm:addrow
            description = "Server ip is 10.95.103.125"
    --> @dm:grok
            column = "description" &
            pattern = "Server ip is %{IP:server_ip}"

Playground

Try this in RDA Playground (rda_docs_env)

Example Pipelines Using this Bot







Bot @dm:grok-multi-proc

Bot Position In Pipeline: Sink

Extract data using Grok syntax from a single column. Uses all available CPU cores to do parallel processing.

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
column* Text Column whose data needs to be parsed with the grok pattern
pattern* Text Grok pattern. To use more than one pattern, use | (pipe without spaces) between
the patterns.
_max_procs Text 0 Maximum number of CPUs to use. 0 means all available CPUs. Value should be >= 1
_skip_errors Text no Specify 'yes' or 'no'. If 'yes', do not bail out if any of the row processing results in error.
Check the meta_grok_message field when it continues with an error.
exclude_unmatched_columns Text no Specify 'yes' or 'no'. If 'yes', don't include unmatched columns.

A list of pre-built grok patterns is available here.

Example usage:

@dm:empty
    --> @dm:addrow
            description = "Server ip is 10.95.103.125"
    --> @dm:grok-multi-proc
            column = "description" &
            pattern = "Server ip is %{IP:server_ip}" &
            _max_procs = 4

Playground

Try this in RDA Playground (rda_docs_env)







Bot @dm:groupby

Bot Position In Pipeline: Sink

Group rows using specified 'columns' and specified 'agg' function

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
columns* Text Comma separated list of column names to do the grouping
agg Text count Aggregation function. Default 'count'. For multiple aggregations, specify comma separated values

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:groupby columns = 'state' &
            agg = 'count'

Example Pipelines Using this Bot







Bot @dm:head

Bot Position In Pipeline: Sink

Get first 'n' rows

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
n Text 10 Number of rows to retain from the head position

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:head n = 9

See Data Guide for more details

Example Pipelines Using this Bot







Bot @dm:hist

Bot Position In Pipeline: Sink

Create histogram using 'timestamp' column and use 'interval' binning

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
timestamp* Text Timestamp column
interval* Text Interval expressed with days, hours, minutes, seconds. Ex: '1d, 4h, 15min'

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
       --> @dm:hist timestamp = 'sys_created_on' & interval = '30d'







Bot @dm:hist-groupby

Bot Position In Pipeline: Sink

Perform Groupby and then create histogram using 'timestamp' column and use 'interval' binning

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
timestamp* Text Timestamp column
interval* Text Interval expressed with days, hours, minutes, seconds. Ex: '1d, 4h, 15min'
groupby* Text Comma separated list of columns to do the grouping
align Text yes Align all metrics to the same start and end time. Specify 'yes' or 'no'

Example usage:

@files:loadfile
            filename = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> *dm:filter
            sys_created_on after '2020-01-01 00:00:00'
            GET sys_created_on, priority
    --> @dm:hist-groupby
            timestamp = "sys_created_on" &
            interval = "30d" &
            groupby = "priority" &
            align = "no"







Bot @dm:identity-discovery

Bot Position In Pipeline: Sink

Discover identities in the input dataset

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
idtypes Text Comma separated list of Identity types. Default is all identity types (ex: ipaddress)

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:identity-discovery







Bot @dm:implode

Bot Position In Pipeline: Sink

Implode 'merge_columns' into a comma separated list

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
key_columns* Text Comma separated list of primary key columns
merge_columns* Text Comma separated list of columns to merge
merge_sep Text , Merge value using specified separator, default is comma
dedup_merge_values Text yes Dedup merge values (yes or no)
keep_columns Text Comma separated list of columns to keep after the merge

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:implode key_columns='state' &
            merge_columns='caller_id' &
            merge_sep = ','

Example Pipelines Using this Bot







Bot @dm:ingest-from-location

Bot Position In Pipeline: Source

This is a bot that ingests data once, in a chunked manner, from the files that match the file name pattern. The following data formats are currently supported: csv, json, parquet, orc

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
external_storage_credential_name Text Name of the predefined credential for an external S3 or minio. If not specified, the
local platform minio is assumed
object_prefix Text Applicable for local platform minio
filename_pattern* Text File criteria in regex format
num_rows Text 1000 Number of rows to fetch in each chunk
max_rows Text Read until this limit is reached
max_data_size_mb Text Read until this limit is reached. This can also be a fraction.
format Text Format is either csv/json/parquet/orc. If not specified, it will be derived from extension
line_read Text yes Only applicable for JSON. By default the file is read as one JSON object per line. To
load the whole file as JSON, set it to 'no'

Example usage:

@dm:ingest-from-location object_prefix='test-data/temp'
            and filename_pattern='customer.json'

@dm:ingest-from-location external_storage_credential_name='s3-sa'
            and filename_pattern='.*\.json'
            and line_read='no'







Bot @dm:json-to-html

Bot Position In Pipeline: Sink

Converts JSON column in the input dataset to HTML table.

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
output_column* Text Name of the output column which captures the input column as HTML Table.
input_column* Text Name of the input column which contains JSON data.
include_column_name Text no Include the column name in the HTML table.
parent_key_location Text top For nested JSON, include the parent key at either 'side' or 'top' of the table.
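
Example usage (an illustrative sketch; the 'details' column and its JSON content are constructed inline for the example):

@dm:empty
    --> @dm:addrow
            details = '{"state": "New", "priority": "1 - Critical"}'
    --> @dm:json-to-html input_column = 'details' &
            output_column = 'details_html'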







Bot @dm:list-all-schemas

Bot Position In Pipeline: Source Sink

This is a bot that fetches all json schemas added in the system and lists the metadata of schemas as a dataframe

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
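
Example usage (a minimal sketch; this bot needs no parameters):

@dm:list-all-schemas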







Bot @dm:list-from-location

Bot Position In Pipeline: Source

List datasets in a specified location

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
location* Text Location in Minio bucket.

Example usage:

@c:new-block
    --> @dm:list-from-location location = "output/"

Playground

Try this in RDA Playground (rda_docs_env)







Bot @dm:load-bookmark

Bot Position In Pipeline: Source Sink

Load a previously saved bookmark

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
name* Text Name of the bookmark to load
default Text Default value to use if the bookmark is not found

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:load-bookmark name = 'sample-servicenow-example'







Bot @dm:load-ml-dataset

Bot Position In Pipeline: Source Sink

Load the ML Datasets

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
pipeline_type* Text ML Pipeline type. Must be one of 'Clustering', 'Regression', 'Classification'
tmp_path Text Temporary directory path where datasets are to be stored
minio_path Text Minio path to get datasets from

This bot also accepts wildcard parameters. Any additional name = 'value' parameters are passed to the bot.







Bot @dm:load-ml-model

Bot Position In Pipeline: Source Sink

Load the ML Model with 'name' and 'model_type'

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
name* Text Name of the ML Model to load
model_type* Text ML Model type. Must be one of 'clustering', 'regression', 'classification'







Bot @dm:load-template

Bot Position In Pipeline: Source Sink

Load the formatting template with 'name'

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
name* Text Name of the template to load
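
Example usage (an illustrative sketch; the template name 'incident-report' is hypothetical and assumes the template was previously added, for example via @dm:add-template):

@dm:load-template name = 'incident-report'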







Bot @dm:logarchive-replay

Bot Position In Pipeline: Source

Read the data from given archive for a specified time interval

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
repo* Text Name of the Log Archive repository. Mandatory.
archive* Text Name of the archive within the repository. Mandatory.
from Text Date & Time in text format. Ex: ISO format. Must be in UTC timezone.
to Text Date & Time in text format. Ex: ISO format. Must be in UTC timezone.
minutes Text 15 Number of minutes of the data to replay. Must be >= 0. This field will be ignored if 'to'
is specified
max_rows Text 0 Maximum rows to replay. If not specified, will replay all data in the specified intervals.
If specified and > 0, it will stop once specified rows have been read.
speed Text 1.0 Speed at which to replay the events. 1 means close to original speed. < 1 means slower than
original. > 1 means faster than original. This is an approximate and cannot be guaranteed.
0 means no introduced latency and try to replay as fast as possible.
batch_size Text 100 Number of rows to return for each iteration.
label Text Label for the replay. Used for reports to identify various replay actions.







Bot @dm:logarchive-save

Bot Position In Pipeline: Sink

Save the log data in given archive of given repository

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
repo* Text Name of the Log Archive repository. Mandatory.
archive* Text Name of the archive within the repository. Mandatory.

Example Pipelines Using this Bot







Bot @dm:manipulate-string

Bot Position In Pipeline: Sink

Manipulate the column values

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
from Text Get the value from the specified column. If 'func' is set to 'eval', the 'from' param is not
mandatory.
func* Text Supported functions are: [strip, substring, lower, upper, split, join, match, length, eval,
lstrip, rstrip, count_value, replace, concat_columns]
lower_limit Text If func is 'substring', specify the lower limit from which the string should be extracted
upper_limit Text If func is 'substring', specify the upper limit up to which the string should be extracted
value Text , Specify regex pattern(s) when 'func' is set to 'match'; specify the separator when 'func' is set
to 'split'. Extracted values are assigned to the 'to' column. Default value is comma
to* Text Store the value to the specified column
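
Example usage (an illustrative sketch using the 'upper' function on a column from the sample dataset):

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:manipulate-string from = 'state' &
            func = 'upper' &
            to = 'state_upper'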







Bot @dm:map

Bot Position In Pipeline: Sink

Inline mapping of columns 'from' using 'func' and save output to 'to' column

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
from Text Get the value from the specified column or columns (comma separated)
to Text Store the mapped value into the specified column
attr Text If 'from' and 'to' are the same column, 'attr' can be used instead
func Text Function to use during mapping
_skip_errors Text no Specify 'yes' or 'no'. If 'yes', do not bail out if any of the row processing results in error.
Check the meta_map_message field when it continues with an error.

This bot also accepts wildcard parameters. Any additional name = 'value' parameters are passed to the bot.
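
Example usage (an illustrative sketch; it assumes 'to_upper' is one of the mapping functions listed by *dm:functions, so verify the function name in your environment):

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:map from = 'state' &
            to = 'state_upper' &
            func = 'to_upper'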

Example Pipelines Using this Bot







Bot @dm:map-multi-proc

Bot Position In Pipeline: Sink

Inline mapping of columns 'from' using 'func' and save output to 'to' column. Uses all available CPU cores to do parallel processing.

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
from Text Get the value from the specified column or columns (comma separated)
to Text Store the mapped value into the specified column
attr Text If 'from' and 'to' are the same column, 'attr' can be used instead
func Text Function to use during mapping
_max_procs Text 0 Maximum number of CPUs to use. 0 means all available CPUs. Value should be >= 1
_skip_errors Text no Specify 'yes' or 'no'. If 'yes', do not bail out if any of the row processing results in error.
Check the meta_map_message field when it continues with an error.

This bot also accepts wildcard parameters. Any additional name = 'value' parameters are passed to the bot.

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:map-multi-proc from = 'assignment_group' &
            to = 'assigned_group' &
            _max_procs = 2

Playground

Try this in RDA Playground (rda_docs_env)







Bot @dm:map-snmp-trap-to-alert

Bot Position In Pipeline: Sink

Enrich an incoming SNMP trap with alert-related information. (Typically called after apply-snmp-trap-template)

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
template_folder Text snmp_trap_alerts Name of the folder for RDA Objects which contains the template to convert SNMP Trap to alerts.
node_id_column Text rda_gw_client_ip Column name that contains unique device id (default: rda_gw_client_ip)







Bot @dm:mask

Bot Position In Pipeline: Sink

Partially or completely mask all values in specified columns

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
columns* Text Comma separated list of columns for which values will be masked
pos Text 5 Position from which to start masking
char Text # Masking character

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:mask columns = 'sys_id' &
            pos = 8 &
            char = '*'

Example usage (achieving a similar effect with @dm:eval):

@dm:empty
    --> @dm:addrow
            phone_number = "925-555-5565" &
            ssn = "123-56-8008" &
            email = "john.doe@acme.com"
    --> @dm:eval
            ssn = "'*****'+ssn.split('-')[-1]" &
            phone_number = "phone_number[:3]+'*-***-*'+phone_number[-3:]" &
            email = "'***'+email.split('@')[0][-5:]+'@xxxxx.yyy'"

See Data Guide for more details







Bot @dm:math

Bot Position In Pipeline: Sink

This bot performs mathematical functions

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
columns Text Comma separated list of column names (mandatory for ceil, floor and median functions)
func* Text Function name to perform mathematical tasks. Available functions are (ceil|floor|median|row_count|column_count|month_range)
year_column Text Column name containing the year value. (Mandatory only for the month_range function)
month_column Text Column name containing the month value. (Mandatory only for the month_range function)
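
Example usage (a minimal sketch; 'row_count' needs no 'columns' parameter):

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:math func = 'row_count'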







Bot @dm:melt

Bot Position In Pipeline: Sink

Unpivot a DataFrame from wide to long format

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
id_cols* Text Column names separated by comma to be used as id
var_col_name Text Header for variable column
value_cols Text Column names separated by comma to be used as values. If not specified, uses all columns other
than id columns
value_col_name Text value Value column header

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:melt id_cols= "approval,state"

Playground

Try this in RDA Playground (rda_docs_env)







Bot @dm:mergecolumns

Bot Position In Pipeline: Sink

Merge columns using 'include' regex and/or 'exclude' regex into 'to' column

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
include* Text Column name regex pattern to include in the merge operation
exclude Text Column name regex pattern to exclude in the merged column
to* Text Output column name for merged column

Example usage:

@files:loadfile
    filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:mergecolumns include = 'description|escalation*' &
            exclude = 'due_date' &
            to = 'combined_columns'







Bot @dm:metadata

Bot Position In Pipeline: Sink

Analyze metadata for the input dataset

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
include Text Column name regex pattern to include in the output (include patterns are matched first and
then exclude)
exclude Text Columns to exclude in the analysis, regular expression

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    -->@dm:metadata







Bot @dm:metric-corr

Bot Position In Pipeline: Sink

Computes correlation between columns specified as metric and value column

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
metric Text Comma separated list of columns. If none specified, it uses all columns other than value and
timestamp.
timestamp* Text timestamp Timestamp column name
unit Text Timestamp unit. Default is None, meaning the timestamp is a string. Other units are s, ms, ns
interval* Text Bucketization interval expressed with days, hours, minutes, seconds. Ex: '1d, 4h, 15min'
agg Text mean Data aggregator function for interval data: mean, median, sum, min, max

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:metric-corr timestamp = "sys_created_on"  &
            interval = "1d, 4h, 15min"

Playground

Try this in RDA Playground (rda_docs_env)







Bot @dm:metric-correlator

Bot Position In Pipeline: Sink

Computes correlation between metrics specified in metric label column

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
metric_label* Text Name of column which specifies metric label
timestamp_column* Text timestamp Timestamp column name
value_column Text Comma separated list of columns. If none specified, it uses all columns other than timestamp.
unit Text Timestamp unit. Default is None, meaning the timestamp is a string. Other units are s, ms, ns
detrend Text yes Detrending removes trend / seasonality from the data. Valid values are 'yes' or 'no'.
Default is 'yes'
correlation_threshold Text 0.5 If the correlation between two metrics is greater than this value, they are considered correlated.
Ranges between 0 and 1







Bot @dm:metric-statistical-analysis

Bot Position In Pipeline: Sink

Computes all statistical parameters for each metric

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
metric* Text Comma separated list of columns to identify each metric
timestamp* Text timestamp Timestamp column name
unit Text Timestamp unit. Default is None, meaning the timestamp is a string. Other units are s, ms, ns
value* Text Metric value column
precision Text 1 Number of decimals for each numerical value in the output
anomaly_percentile Text If specified, compute number of anomalies above this percentile value. Value should be >0
and <=100.







Bot *dm:ml-model-list

Bot Position In Pipeline: Source

List of saved ML Models

This bot expects a Full CFXQL.

Bot applies the Query on the data that is already loaded from previous bot or from a source.







Bot @dm:object-add

Bot Position In Pipeline: Source Sink

Add object to a folder

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
name* Text Object name
folder* Text Folder name on the object storage
input_file* Text File from which object will be added
description Text Description
overwrite Text yes If file already exists, overwrite without prompting







Bot @dm:object-delete

Bot Position In Pipeline: Source Sink

Delete object from a folder

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
name* Text Object name
folder* Text Folder name on the object storage







Bot @dm:object-delete-list

Bot Position In Pipeline: Sink

Delete list of objects

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
input_object_column* Text Column with object names







Bot @dm:object-get

Bot Position In Pipeline: Source Sink

Get Object from a folder

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
name* Text Object name
folder* Text Folder name on the object storage
save_to_file Text Save the downloaded object to specified file
save_to_dir Text Save the downloaded object to specified directory







Bot @dm:object-list

Bot Position In Pipeline: Source

List objects for a folder

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
folder Text Folder name on the object storage
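
Example usage (a minimal sketch; the folder name 'templates' is hypothetical):

@dm:object-list folder = 'templates'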







Bot @dm:object-to-content

Bot Position In Pipeline: Sink

Convert object pointers from a column into content

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
input_object_column* Text Name of the column in input that contains the object name
output_column* Text Column name where content will be inserted







Bot @dm:object-to-file

Bot Position In Pipeline: Sink

Convert object pointers from a column into file

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
input_object_column* Text Name of the column in input that contains the objects
output_column* Text Column name where filenames need to be inserted







Bot @dm:object-to-inline-img

Bot Position In Pipeline: Sink

Convert object pointers from a column into inline HTML img tags

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
input_object_column* Text Name of the column in input that contains the JPEG or PNG Image
output_column* Text Column name where the HTML img tag code needs to be inserted







Bot @dm:parse-using-textfsm

Bot Position In Pipeline: Sink

Parse one or more rows of text data using the specified TextFSM model

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
folder* Text RDA Object folder in which textfsm model is defined
object* Text RDA Object name in which textfsm model is defined
raw_data_col* Text Column name for input data in which raw data is expected
keep_cols Text Keep specified comma separated list of columns in output
status_col Text textfsm_status Parsing status in the output







Bot @dm:pivot-table

Bot Position In Pipeline: Sink

Creates Pivot Table with index and columns

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
columns* Text Column names to pivot separated by comma
index* Text Column names to use as index in pivot
value Text Name of value column. If not specified will use all available columns other than index and
columns above.
agg Text mean Default is mean. User can specify sum, max, min, median.

Example usage:

@dm:empty
    --> @dm:addrow
            message = "AgentDevice=WindowsLog\tAgentLogFile=System\tPluginVersion=WC.MSEVEN6.10.0.1.276\tSource=Service Control Manager\tComputer=DESKTOP-5O3J61V\tOriginatingComputer=DESKTOP-5O3J61V\tUser=SYSTEM\tDomain=NT AUTHORITY\tEventID=1073748864\tEventIDCode=7040\tEventType=4\tEventCategory=0\tRecordNumber=19466\tTimeGenerated=1653682613\tTimeWritten=1653682613\tLevel=Informational\tKeywords=Classic\tTask=None\tOpcode=Info\tMessage=The start type of the Background Intelligent Transfer Service service was changed from demand start to auto start."
    --> @dm:eval
            msg = "message.replace('\\t', ';')"
    --> @dm:explode
            column = "msg" & sep = ";"
    --> @dm:eval
            name = "msg.split('=', 1)[0]" &
            value = "msg.split('=', 1)[1]"
    --> @dm:pivot-table
            columns = "name" &
            index = "message" &
            value = "value" &
            agg = "first"

@files:loadfile
            filename = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> *dm:filter
            sys_created_on after '2020-01-01 00:00:00'
            GET sys_created_on, priority
    --> @dm:hist-groupby
            timestamp = "sys_created_on" &
            interval = "30d" &
            groupby = "priority" &
            align = "no"
    --> @dm:change-time-format
            columns = "sys_created_on" &
            from_format = "ns" &
            to_format = "%Y-%m"
    ## use eval to split the priority and add a prefix
    --> @dm:eval
            priority = "'_' + priority.split(' ')[-1]"
    ## Pivot column priority while keeping index columns as is
    --> @dm:pivot-table
            columns = 'priority' &
            index = 'sys_created_on' &
            value = 'count' &
            agg = 'sum'







Bot @dm:process-syslog-from-kv-list

Bot Position In Pipeline: Sink

Process syslogs that have information in a list of dicts and have an RFC5424 payload

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
data_column* Text Column which has a list of dictionaries with multiple attributes: one attribute for the name and another
for the value
key_attr Text name Name of the attribute in data_column dictionary which indicates Key
value_attr Text stringValue Name of the attribute in data_column dictionary which indicates Value
rfc5424_attr Text RFC5424 Key name which indicates RFC5424 encoded syslog parameters.







Bot #dm:pstream-delete-data-by-query

Bot Position In Pipeline: Sink

Delete the data in a persistent stream via CFXQL.

This bot expects Full CFXQL.

Bot translates the Query to native query of the Data source supported by this extension.

This is a data sink which expects the following input parameters to be passed via input dataframe.

Input Dataframe Column Name Type Default Value Description
name* Text Name of the Persistent Stream
conflicts Text abort What to do when the delete by query hits version conflicts? Valid choices: abort, proceed. Default: abort
timeout Text 10 Timeout in seconds to wait for response

Example usage:

@dm:empty
    --> @dm:addrow name="people_ps"
    --> #dm:pstream-delete-data-by-query city is 'John Doe'







Bot #dm:pstream-update-data-by-query

Bot Position In Pipeline: Sink

Update the data in a persistent stream via CFXQL.

This bot expects Full CFXQL.

Bot translates the Query to native query of the Data source supported by this extension.

This is a data sink which expects the following input parameters to be passed via input dataframe.

Input Dataframe Column Name Type Default Value Description
name* Text Name of the Persistent Stream
columns* Text Comma separated list of column names that needs to be updated for all the records that match the query. Example: city,state,zipcode
values* Text Set the value to specified column or columns (comma separated). The specified number of columns should match 'columns' column(s). Example: San Jose,CA,12345
conflicts Text abort What to do when the update by query hits version conflicts? Valid choices: abort, proceed. Default: abort
timeout Text 10 Timeout in seconds to wait for response

Example usage:

@dm:empty
--> @dm:addrow name='people_1k' and columns='city,name' and values='New Delhi,Deepika' 
--> #dm:pstream-update-data-by-query city is 'Delhi' and name is 'Deepak'







Bot #dm:query-persistent-stream

Bot Position In Pipeline: Sink

Query the data in a persistent stream via CFXQL.

This bot expects Full CFXQL.

Bot translates the Query to native query of the Data source supported by this extension.

This is a data sink which expects the following input parameters to be passed via input dataframe.

Input Dataframe Column Name Type Default Value Description
name* Text Name of the Persistent Stream
max_rows Text 1000 Max rows in each batch. Ignored when 'aggs' is used
limit Text 1000 Limit total output rows. If set to 0, will retrieve all rows from the stream that match the query. Ignored when 'aggs' is used
sort_by_col Text Name of the column to sort
sort_type Text desc Must be one of 'asc' or 'desc'
aggs Text Specified as 'sum:field_name'. Supported functions are sum, cardinality, min, max, mean, value_count
groupby Text Comma separated list of columns to groupby; used only when 'aggs' is used
max_aggregation_groups Text 1000 Fetches up to 1000 group aggregation results by default. Larger values (10,000+) result in the use of more memory to compute
retry_attempts_on_no_data Text 0 Number of retries, with a 2 second wait between retries, when there is no data. Default is 0

Example usage:

@dm:empty
    --> @dm:addrow name="dli-log-stats" and max_rows="2"
    --> #dm:query-persistent-stream *
    ## '*' implies match all (no filtering of data)

@dm:empty
    --> @dm:addrow name="dli-log-stats" and groupby='device,mode'
            and aggs='max:count,avg:count,sum:count'
    ## Computes max for count, average for count and sum for count
    --> #dm:query-persistent-stream device in ['ipaddress1', 'ipaddress2']

Supported Aggregations
Agg Function Description
min minimum value in the group. Supported on numeric values only
max maximum value in the group. Supported on numeric values only
sum sum of values in the group. Supported on numeric values only
avg average of the values in the group. Supported on numeric values only
first first value when sorted by ascending order for the field
last last value when sorted by ascending order for the field







Bot @dm:query-persistent-stream-from-bookmark

Bot Position In Pipeline: Source

This is a streaming bot that reads one or more rows from the last bookmarked record in a persistent stream via CFXQL filter criteria

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
name* Text Name of the Persistent Stream
bookmark* Text Name of the bookmark
offset Text latest Read data from the beginning or from the current offset. This option is only applicable when the bookmark
does not exist, i.e. once the bookmark offset is set, it cannot be changed. Default is 'latest',
which is designed to work when the default sort by column '_RDA_Id' is used. The other option is 'earliest',
which reads from the beginning.
query Text * CFXQL query to filter results.
sort_by_col Text _RDA_Id Comma separated list of column names to sort by. It is important that the set of sorting
columns provided uniquely identifies a record, so that the last bookmarked record is unique. Default
is _RDA_Id (internal unique id). Example: 'timestamp, some_unique_id'
sort_type Text asc Must be one of 'asc' or 'desc'
max_rows Text 1000 Max rows in each batch.

Example usage:

@dm:query-persistent-stream-from-bookmark name="dli-log-stats" and bookmark="stats-bookmark"







Bot @dm:query-persistent-stream-iterate-by-chunk

Bot Position In Pipeline: Source

Queries the data in a persistent stream and returns data in chunks

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
name* Text Name of the Persistent Stream
query Text * CFXQL query to filter results.
batch_size Text 1000 Rows in each batch. Default is 1000
sort_by_col Text Name of the column to sort
sort_type Text desc Must be one of 'asc' or 'desc'
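
Example usage (an illustrative sketch; the stream name 'dli-log-stats' follows the other persistent-stream examples in this document):

@dm:query-persistent-stream-iterate-by-chunk name = 'dli-log-stats' &
            batch_size = 500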







Bot @dm:query-persistent-stream-iterate-by-time

Bot Position In Pipeline: Source

Queries the data in a persistent stream and returns data in chunks based on time interval

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
name* Text Name of the Persistent Stream
from Text Date & Time must be in ISO format. Must be in UTC timezone. Example: 2023-06-14, 2023-06-14T05:00:00.
If not provided, earliest available date will be used
to Text Date & Time must be in ISO format. Must be in UTC timezone. Example: 2023-06-14, 2023-06-14T05:00:00.
If not provided, the latest available date will be used
query Text * CFXQL query to filter results.
interval Text 1d Interval expressed with days or hours or mins. Ex: '1d' or '4h' or '30min'
timestamp_column Text timestamp Name of timestamp column. Default is 'timestamp'
aggs Text Specified as 'function:field_name'. Supported functions are sum, cardinality, min, max, mean,
value_count
groupby Text Comma separated list of columns to groupby; used only when 'aggs' is used
include_time_intervals Text no Include from and to interval used for each iteration. The column names will be 'from_interval'
and 'to_interval'
chunk_size Text 1000 Number of rows to fetch in each chunk when retrieving data. Use a larger number when dealing
with many data points and relatively little data per row. Do not use more than 5000. Avoid using
less than 1000. This is not applicable when 'aggs' is used.
max_aggregation_groups Text 1000 Fetches up to 1000 group aggregation results by default. Larger values (10,000+) result in
the use of more memory to compute.
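
Example usage (an illustrative sketch; the stream name 'dli-log-stats' follows the other persistent-stream examples, and the time range is hypothetical):

@dm:query-persistent-stream-iterate-by-time name = 'dli-log-stats' &
            from = '2023-06-14T00:00:00' &
            to = '2023-06-15T00:00:00' &
            interval = '4h'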







Bot @dm:recall

Bot Position In Pipeline: Source Sink

Recall (load) a previously saved dataset

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
name Text Name of the dataset to recall. (Mandatory if the 'tag' parameter is not given)
cache Text no Cache the result for future recalls. 'yes' or 'no'
cache_refresh_seconds Text 120 Refresh the cache (if new update available) after specified seconds
empty_as_null Text yes While reading the saved datasets, treat empty strings as Nulls. 'yes' or 'no'
return_empty Text no Return an empty dataframe if an error occurs loading the dataset
empty_df_columns Text Comma separated list of columns for empty dataframe
ignore_dtypes Text no Ignore column data types during loading
make_copy Text yes Specifies whether to make a copy of the dataset for temporary datasets. 'yes' or 'no'
tag Text Name of the tag; returns the first occurrence of a dataset having the given tag

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:save name = "servicenow_data"

--> @c:new-block
    --> @dm:recall     name="servicenow_data"

Playground

Try this in RDA Playground (rda_docs_env)

Example Pipelines Using this Bot







Bot @dm:recall-chunked

Bot Position In Pipeline: Source

Recall (load) a previously saved dataset as a data stream. Loads num_rows in each chunk.

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
name* Text Name of the dataset to recall
num_rows* Text Number of rows to fetch in each chunk
empty_as_null Text yes While reading the saved datasets, treat empty strings as Nulls. 'yes' or 'no'
return_empty Text no Return an empty dataframe if an error occurs loading the dataset
empty_df_columns Text Comma separated list of columns for empty dataframe
ignore_dtypes Text no Ignore column data types during loading

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:save name = "servicenow_data"

--> @c:new-block
    --> @dm:recall-chunked   name="servicenow_data" &
            num_rows = 5

Playground

Try this in RDA Playground (rda_docs_env)







Bot @dm:recall-query

Bot Position In Pipeline: Source Sink

Recall (load) a previously saved dataset using CFXQL query

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
name* Text Name of the dataset to recall
empty_as_null Text yes While reading the saved datasets, treat empty strings as Nulls. 'yes' or 'no'
return_empty Text no Return an empty dataframe if an error occurs loading the dataset
empty_df_columns Text Comma separated list of columns for empty dataframe
make_copy Text yes Specifies whether to make copy of dataset for temp datasets. 'yes' or 'no'
query Text * CFXQL query to filter results.
max_rows Text Limit the number of rows to return.
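
Example usage (an illustrative sketch; it assumes the dataset 'servicenow_data' was saved earlier with @dm:save, as in the @dm:recall example):

@dm:recall-query name = 'servicenow_data' &
            max_rows = 100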







Bot @dm:relations-child-to-parent-paths

Bot Position In Pipeline: Sink

Traverse a CMDB relationship-like table to identify all possible paths from each child to all parent node(s)

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
child_col* Text Input column name which identifies a child column in a relationship table
parent_col* Text Input column name which identifies a parent column in a relationship table
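
Example usage (an illustrative sketch; a one-edge relationship table with hypothetical 'child' and 'parent' columns is built inline):

@dm:empty
    --> @dm:addrow child = 'vm-01' & parent = 'host-01'
    --> @dm:relations-child-to-parent-paths child_col = 'child' &
            parent_col = 'parent'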







Bot @dm:relations-parent-to-children-paths

Bot Position In Pipeline: Sink

Traverse a CMDB relationship-like table to identify all possible paths from each parent to all child node(s)

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
child_col* Text Input column name which identifies a child column in a relationship table
parent_col* Text Input column name which identifies a parent column in a relationship table







Bot @dm:rename-columns

Bot Position In Pipeline: Sink

Rename specified column names using new_column_name = 'old_column_name' format

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

This bot only accepts wildcard parameters. All name = 'value' parameters are passed to the bot.
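
Example usage (an illustrative sketch; per the wildcard-parameter note above, each pair is new_column_name = 'old_column_name'):

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:rename-columns incident_state = 'state' &
            assignee = 'assigned_to'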







Bot @dm:replace-data

Bot Position In Pipeline: Sink

Replace data using 'expr' regex pattern from 'columns'

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
columns* Text Comma separated list of columns
expr* Text Regular expression to identify the part that needs to be replaced
replace Text Replace with this value. If not specified, replaces with empty string

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:replace-data columns = 'description' &
            expr = 'to' &
            replace = 'two'







Bot @dm:resample-timeseries

Bot Position In Pipeline: Sink

Resample time series data on specified timestamp column by aggregation function provided

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
ts_column* Text Timestamp column name.
value_column Text Comma separated list of columns. If none specified, it uses all columns other than timestamp.
interval Text 1H Bucketization interval expressed with days, hours, minutes, seconds. Ex: '1D, 4H, 15min'
agg Text sum Value aggregation function. Valid values are sum, min, max, mean
interpolate Text no Specify 'yes' or 'no'. If 'yes' then interpolate missing values after aggregation







Bot @dm:row-delta

Bot Position In Pipeline: Sink

Compute the difference between 2 consecutive rows for the list of columns and replace them with the result

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
value_columns* Text Comma separated list of numeric column field names
groupby Text Perform delta within each group. Comma separated list of columns to groupby.
skip_first_row Text no In the result, the first row's value for each of the provided value columns will be NaN,
as there is no row above it. You can skip the first row in the result by setting this to
'yes'
sort_columns Text timestamp Comma separated list of column names. Default is 'timestamp' field
sort_order Text ascending Sorting order ('ascending' or 'descending'). For multiple sort orders, specify comma separated
values. If sort order list is shorter than column list, the last sort order will be used for
the remaining columns
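
Example usage (an illustrative sketch; it assumes the sample dataset has a numeric 'sys_mod_count' column and sorts by 'sys_created_on' instead of the default 'timestamp'):

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:row-delta value_columns = 'sys_mod_count' &
            sort_columns = 'sys_created_on' &
            skip_first_row = 'yes'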







Bot *dm:safe-filter

Bot Position In Pipeline: Sink

Apply safe CFXQL filtering on the data

This bot expects a Full CFXQL.

Bot applies the Query on the data that is already loaded from previous bot or from a source.







Bot @dm:sample

Bot Position In Pipeline: Sink

Randomly sample 'n' number of rows

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
n Text 1 Number of rows to return in output using random sampling. This can also be a fraction between
0 and 1.0 to indicate the fraction of input rows to return.
re_use Text auto Re-use already sampled rows in output. Valid values are 'auto', 'yes' or 'no'.

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:sample
            n = '0.01'

Example Pipelines Using this Bot







Bot @dm:sample-groupby

Bot Position In Pipeline: Sink

Randomly sample 'n' number of rows within each group

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
columns* Text Comma separated list of column names to do the grouping.
n Text 1 Number of rows to return in each group using random sampling. This can also be a fraction between
0 and 1.0 to indicate the fraction of input rows to return within each group.
re_use Text auto Re-use already sampled rows in output. Valid values are 'auto', 'yes' or 'no'.

Example usage:

    @files:loadfile
            filename = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:sample-groupby columns='severity' and n=5
    --> *dm:filter * get severity,state,opened_at

    @files:loadfile
            filename = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:sample-groupby columns='severity,state' and n=0.5
    --> *dm:filter * get severity,state,opened_at







Bot @dm:save

Bot Position In Pipeline: Sink

Save the dataset with 'name'

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
name* Text Name of the dataset to save
format Text csv Data format on the object storage. Ignored for the temporary datasets. Must be one of 'csv'
or 'parquet'
publish Text Deprecated. Name of the tag to publish in cfxDimensions platform. Can be used only with Dimensions
configuration.
append Text no If set to 'yes', appends the input dataset as a chunk to the existing dataset if any. Valid
values are 'yes', 'no'
return_appended_dataset Text no If set to 'yes' and append is 'yes', returns the full appended dataset. Valid values are 'yes',
'no'
tag Text Name of the tag, appends the given tag name in the metadata

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:save name = 'sample_servicenow_incidents_dataset'

Playground

Try this in RDA Playground (rda_docs_env)

Example Pipelines Using this Bot







Bot @dm:save-bookmark

Bot Position In Pipeline: Sink

Save the bookmark with 'name'

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
name* Text Name of the bookmark to save
value_column* Text Name of value column
value_type Text timestamp Value type (timestamp, numeric, text)
ts_format Text If value_type is 'timestamp', the format of the timestamp. Valid units are s, ms, ns, or null for string format.
value_func Text max Value functions are first, last, min, max

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
--> @dm:save-bookmark name = 'sample-servicenow-example' & 
         value_column = 'sys_created_on'

Playground

Try this in RDA Playground (rda_docs_env)







Bot @dm:save-ml-dataset

Bot Position In Pipeline: Source Sink

Save the ML Datasets

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
pipeline_type* Text ML Pipeline type. Must be one of 'clustering', 'regression', 'classification'
tmp_path Text Temporary directory path where datasets are stored
minio_path Text Minio path to store datasets

This bot also accepts wildcard parameters. Any additional name = 'value' parameters are passed to the bot.







Bot @dm:save-ml-model

Bot Position In Pipeline: Source Sink

Save the ML Model with 'name' and 'model_type'

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
name* Text Name of the ML Model to save
model_type* Text ML Model type. Must be one of 'clustering', 'regression', 'classification'
description Text Description of the ML Model
model_data_path* Text ML Model file path

This bot also accepts wildcard parameters. Any additional name = 'value' parameters are passed to the bot.







Bot @dm:save-template

Bot Position In Pipeline: Sink

Save the formatting template with 'name'

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
name* Text Name of the template to save
description_col Text description Template description column
content_col Text content Template content column
content_type_col Text content_type Template content type column







Bot @dm:save-to-location

Bot Position In Pipeline: Sink

Save the dataset to a specified location

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
name Text Name of the dataset to save. If not provided, will be extracted from location
format Text Data format on the object storage. Supported formats are csv, parquet, gzip, json and zip.
If not provided, will be extracted from location
location Text Location in Minio bucket to save the object.
ignore_index Text no Ignore index columns while saving file to location. Possible values are 'yes' or 'no'. (Applicable
only for csv format)

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:save name = "servicenow_dataset"
    --> @dm:save-to-location name = "servicenow_dataset" &
            format = "csv" &
            location = "output"







Bot *dm:savedlist

Bot Position In Pipeline: Source

List of saved datasets

This bot expects a Full CFXQL.

Bot applies the Query on the data that is already loaded from previous bot or from a source.

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:save name = 'sample_service_now_incidents'

--> @c:new-block
    --> *dm:savedlist








Bot @dm:selectcolumns

Bot Position In Pipeline: Sink

Select columns using 'include' regex and/or 'exclude' regex

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
include Text Column name regex pattern to include in the output (include patterns are matched first and
then exclude)
exclude Text None Column name regex pattern to exclude from the output (include patterns are matched first and
then exclude)

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:selectcolumns
            include = 'assigned_to|approval.*|sys_*|.*description.*' & 
            exclude = 'sys_d.*'








Bot @dm:set-tracing-context

Bot Position In Pipeline: Source Sink

Set the tracing context using name = value pairs

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

This bot only accepts wildcard parameters. All name = 'value' parameters are passed to the bot.
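
Example usage (a minimal sketch; the context keys 'customer' and 'source' are illustrative):

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:set-tracing-context customer = 'acme' & source = 'servicenow'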







Bot @dm:set-tracing-context-from-input

Bot Position In Pipeline: Sink

Set the tracing context using input dataframe column values from the first row

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
columns* Text Comma separated column names in input dataframe, to be propagated to context
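
Example usage (a minimal sketch; assumes the input dataframe contains 'number' and 'priority' columns):

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:set-tracing-context-from-input columns = 'number,priority'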







Bot @dm:skip-block-if-shape

Bot Position In Pipeline: Sink

Skip rest of the current block if input dataframe shape meets criteria expressed using 'row_count' & 'column_count'

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
row_count Text Number of rows in input dataframe. This variable accepts all numeric operations.
column_count Text Number of columns in input dataframe. This variable accepts all numeric operations.

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:skip-block-if-shape row_count = 66 & column_count = 88








Bot @dm:skip-pipeline-if-shape

Bot Position In Pipeline: Sink

Skip rest of the pipeline if input dataframe shape meets criteria expressed using 'row_count' & 'column_count'

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
row_count Text Number of rows in input dataframe. This variable accepts all numeric operations.
column_count Text Number of columns in input dataframe. This variable accepts all numeric operations.

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:skip-pipeline-if-shape row_count = 66 and column_count = 88
    --> @dm:tail n = 5








Bot @dm:sleep

Bot Position In Pipeline: Sink

Wait for a specified number of seconds before executing the next step. Useful for timed loops.

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
seconds Text Wait time in seconds, must be > 0, fractional seconds are allowed

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:sleep seconds = 5
    --> @dm:sample n='0.1'








Bot @dm:sort

Bot Position In Pipeline: Sink

Sort values using 'columns' with 'order'

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
columns Text Comma separated list of column names
order Text ascending Sorting order ('ascending' or 'descending'). For multiple sort orders, specify comma separated
values. If sort order list is shorter than column list, the last sort order will be used for
the remaining columns

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:sort columns = 'assigned_to' and
            order = 'descending'








Bot @dm:span-creator

Bot Position In Pipeline: Sink

Create Spans from input timeseries data

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
unique_id* Text Comma separated list of columns
starttime* Text Start time column name
endtime* Text End time column name
unit Text Timestamp unit. Default is None, meaning the timestamp is a string. Other units are s, ms, ns
status* Text Status column name
data_label* Text Label for data in spans
filter Text Comma separated list of filter columns
incident_id* Text Incident id column name
interval_start Text 22 Interval start offset, expressed in 'interval_unit' units
interval_end Text 2 Interval end offset, expressed in 'interval_unit' units
interval_unit Text hours Interval unit. Must be one of days/seconds/microseconds/milliseconds/minutes/hours/weeks
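
Example usage (a minimal sketch; assumes an input timeseries dataset with 'host', 'start_time', 'end_time', 'status' and 'incident' columns, all names illustrative):

@dm:span-creator unique_id = 'host' &
        starttime = 'start_time' &
        endtime = 'end_time' &
        status = 'status' &
        data_label = 'alerts' &
        incident_id = 'incident'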







Bot *dm:stack-connected-nodes

Bot Position In Pipeline: Source Sink

Find all connected nodes for the previously selected Nodes on the stack

This bot expects a Full CFXQL.

Bot applies the Query on the data that is already loaded from previous bot or from a source.







Bot @dm:stack-create

Bot Position In Pipeline: Source Sink

Create stack from input topology dataset

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
topology_nodes* Text Name of topology nodes dataset
topology_edges Text Name of topology edges dataset
name* Text Name for stack
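
Example usage (a minimal sketch; assumes 'topo-nodes' and 'topo-edges' datasets were saved earlier, e.g. with @dm:save):

@c:new-block
    --> @dm:stack-create topology_nodes = 'topo-nodes' &
            topology_edges = 'topo-edges' &
            name = 'app-stack'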








Bot *dm:stack-filter

Bot Position In Pipeline: Sink

Filter stack Nodes/Edges based on previously selected Nodes/Edges on stack

This bot expects a Full CFXQL.

Bot applies the Query on the data that is already loaded from previous bot or from a source.







Bot @dm:stack-find-impact-distances

Bot Position In Pipeline: Source Sink

Search a saved stack using asset-dependency service and get impact distances from the specified nodes

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
stack_name* Text Name of previously saved stack. This bot will use asset-dependency service to load the specified
stack and perform a search
search_for* Text Comma separated list of values to search
attr_names* Text Comma separated list of attribute names. The values specified in 'search_for' are searched
in this list of attribute names
node_types Text Comma separated list of node types to search
exclude_node_types Text Comma separated list of node types to exclude
depth Text 10 Maximum depth from the selected nodes
operation Text equals Type of value comparison operation. Must be one of 'equals', 'contains', or 'matches'
ignore_case Text yes Ignore case while doing the search. Must be one of 'yes' or 'no'
max_matches Text 1 Maximum number of matches per search.
timeout Text 120 Timeout in seconds
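
Example usage (a minimal sketch; the stack name, search value and attribute names are illustrative):

@c:new-block
    --> @dm:stack-find-impact-distances stack_name = 'app-stack' &
            search_for = 'host-01' &
            attr_names = 'name,fqdn' &
            depth = 5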







Bot @dm:stack-generate

Bot Position In Pipeline: Source Sink

Generate stack from input topology dataset

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
input_stack_name* Text Name of input stack json to generate stack from
app_type Text OIA Name of application to publish stack to
output_stack_name* Text Name of the output stack to generate
room_id* Text Location to publish stack to
incident_id* Text Comma separated Incident id under which stack will be referenced
incident_summary* Text Comma separated summary for incidents
fqdn* Text FQDN
url* Text URL
ip_address* Text IP Address







Bot *dm:stack-impacted-nodes

Bot Position In Pipeline: Source Sink

Impact Analysis on previously selected Nodes on stack

This bot expects a Full CFXQL.

Bot applies the Query on the data that is already loaded from previous bot or from a source.







Bot @dm:stack-join

Bot Position In Pipeline: Source Sink

Join the input stack with 'target_stack' using the 'filter' dataset, creating a new stack

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
target_stack* Text Name of target stack to join
filter* Text Name of dictionary dataset with filters
name* Text Name of new stack







Bot *dm:stack-list

Bot Position In Pipeline: Source

List of application Stacks

This bot expects a Full CFXQL.

Bot applies the Query on the data that is already loaded from previous bot or from a source.
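
Example usage (a minimal sketch, mirroring the *dm:savedlist example):

@c:new-block
    --> *dm:stack-list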







Bot @dm:stack-load

Bot Position In Pipeline: Source Sink

Load application Stack specified by 'name'

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
name* Text Name of the Stack to load
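
Example usage (a minimal sketch; assumes a stack named 'app-stack' was saved earlier):

@c:new-block
    --> @dm:stack-load name = 'app-stack'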







Bot @dm:stack-save

Bot Position In Pipeline: Source Sink

Save application Stack specified by 'name'

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
name* Text Name of the Stack to save
description Text Stack description
stack_data_column Text data Column name where stack definition is stored in dataframe
additional_nodes_rules Text Name of dataset with rules to create and attach new nodes to matching existing nodes.
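
Example usage (a minimal sketch; assumes the loaded stack definition is available to the bot in the default 'data' column, and the stack names are illustrative):

@dm:stack-load name = 'app-stack'
    --> @dm:stack-save name = 'app-stack-copy' &
            description = 'Copy of app-stack'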








Bot @dm:stack-search

Bot Position In Pipeline: Source Sink

Search a saved stack using asset-dependency service

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
stack_name* Text Name of previously saved stack. This bot will use asset-dependency service to load the specified
stack and perform a search
search_for* Text Comma separated list of values to search
attr_names* Text Comma separated list of attribute names. The values specified in 'search_for' are searched
in this list of attribute names
node_types Text Comma separated list of node types to search
exclude_node_types Text Comma separated list of node types to exclude
depth Text 2 Maximum depth from the selected nodes
operation Text equals Type of value comparison operation. Must be one of 'equals', 'contains', or 'matches'
ignore_case Text yes Ignore case while doing the search. Must be one of 'yes' or 'no'
max_matches Text 1 Maximum number of matches per search.
timeout Text 120 Timeout in seconds
result_stack_name Text Name of the result stack







Bot *dm:stack-select-nodes

Bot Position In Pipeline: Sink

Select Nodes from stack based on provided criteria

This bot expects a Full CFXQL.

Bot applies the Query on the data that is already loaded from previous bot or from a source.







Bot @dm:stack-unselect-nodes

Bot Position In Pipeline: Source Sink

Unselect nodes of the given node types in the stack if there is no right link available

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
node_type* Text Type of nodes to be unselected if there is no right link available
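
Example usage (a minimal sketch; the node type is illustrative and assumes nodes were selected on the stack earlier in the pipeline):

@dm:stack-unselect-nodes node_type = 'virtual-machine'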







Bot @dm:staging-read

Bot Position In Pipeline: Source

This is a streaming bot that reads one or more rows, in a chunked manner, from the files that match the criteria in the specified staging area. Ingesting data is currently supported from the following formats: csv, json, parquet, orc

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
name* Text Staging area name
num_rows Text 1000 Number of rows to fetch in each chunk
format Text Format is either csv/json/parquet/orc. If not specified, it will be derived from the file extension
line_read Text yes Only applicable for JSON. By default file is read as a json object per line. If you want to
load the whole file as JSON, set it to 'no'

Example usage:

@dm:staging-read name='staging-area-platform-sample'







Bot @dm:string-to-columns

Bot Position In Pipeline: Sink

Splits a column on the specified separators and assigns the extracted values to new columns

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
from* Text Specify the column name that has the string data
seps Text , Specify the list of separators, for example: |$@/#!. Default is comma.
to* Text Specify the comma separated column names to which extracted strings need to be added as values.
Once the 'from' column is split, the new columns are added to the existing dataframe, based
on the order of the specified columns.
to_column_default Text Assign a value to 'to' column(s), when there is no string value extracted out of 'from' column,
Default is None
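
Example usage (a minimal sketch; the input row and the target column names are illustrative):

@dm:empty
    --> @dm:addrow location = 'US/CA/San Jose'
    --> @dm:string-to-columns from = 'location' &
            seps = '/' &
            to = 'country,state,city'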







Bot @dm:synthetic-dataset

Bot Position In Pipeline: Source

Generate a new dataframe with the specified row_count and column_name = 'field_type' pairs

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
row_count* Text Number of rows for the output dataframe

This bot also accepts wildcard parameters. Any additional name = 'value' parameters are passed to the bot.

The supported synthetic data field types can be listed using the *dm:synthetic-fields bot.

Example usage:

@c:new-block
    --> @dm:synthetic-dataset row_count = 5 &
            col_address = "address"








Bot *dm:synthetic-fields

Bot Position In Pipeline: Source

List of field types supported in synthetic data functions or bots

This bot expects a Full CFXQL.

Bot applies the Query on the data that is already loaded from previous bot or from a source.

Example usage:

@c:new-block
    --> *dm:synthetic-fields








Bot @dm:tail

Bot Position In Pipeline: Sink

Get last 'n' rows

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
n Text 10 Number of rows to retain from the tail position

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:tail n = 9







Bot @dm:telegraf-parser

Bot Position In Pipeline: Sink

Parse the telegraf data that is passed through the input dataset

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
tags_prefix Text tags Add a prefix to the tags block after flattening. By default, 'tags' prefix will be added.
fields_prefix Text Add a prefix to the fields block after flattening. By default, no prefix will be added.







Bot *dm:template-list

Bot Position In Pipeline: Source

List of saved formatting templates

This bot expects a Full CFXQL.

Bot applies the Query on the data that is already loaded from previous bot or from a source.
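
Example usage (a minimal sketch):

@c:new-block
    --> *dm:template-list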







Bot @dm:time-filter

Bot Position In Pipeline: Sink

Apply time filter on specified timestamp column

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
column* Text Timestamp column name.
from Text From timestamp. Can be absolute format or relative to current time. At least one of 'from' or
'to' must be specified.
to Text To timestamp. Can be absolute format or relative to current time. At least one of 'from' or
'to' must be specified.
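
Example usage (a minimal sketch; the absolute timestamp format shown is illustrative):

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:time-filter column = 'sys_created_on' &
            from = '2019-01-01 00:00:00'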







Bot @dm:to-json

Bot Position In Pipeline: Sink

Converts each row of the input dataset to JSON format.

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
exclude_columns Text Regular expression to exclude one or more columns from input data
include_columns Text .* Regular expression to include one or more columns from input data
output_column* Text Name of the output column which captures the input dataset as JSON format.
keep_original_columns Text no Keep existing columns in output or not. 'yes' or 'no'
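
Example usage (a minimal sketch; the column regex and output column name are illustrative):

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:to-json output_column = 'row_json' &
            include_columns = 'number|priority'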







Bot @dm:to-type

Bot Position In Pipeline: Sink

Change data type to str or int or float for specified columns

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
columns* Text Comma separated list of column names
type* Text Type to convert into: str / int / float

Example usage:

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:to-type columns = 'activity_due' &
            type = 'int'







Bot @dm:transpose

Bot Position In Pipeline: Sink

Transposes columns to rows

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
columns* Text Comma-separated column names to set as index before transpose
value Text Name of value column
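
Example usage (a minimal sketch; the input row and column names are illustrative):

@dm:empty
    --> @dm:addrow metric = 'cpu_percent' & host1 = '10' & host2 = '20'
    --> @dm:transpose columns = 'metric'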







Bot @dm:validate-data

Bot Position In Pipeline: Sink

Check integrity of data using a 'schema' that has been uploaded previously

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
schema_name* Text Schema to verify the data against.
failfast Text yes Specify 'yes'(default) to abort quickly on first error or 'no' to keep validating records
action Text none Action to take if validation fails. Must be one of 'none' (default), 'fail','skip-block', 'skip-pipeline'
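
Example usage (a minimal sketch; assumes a schema named 'incident-schema' was added earlier with @dm:add-schema):

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:validate-data schema_name = 'incident-schema' &
            failfast = 'no'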







Bot @dm:vectorization

Bot Position In Pipeline: Sink

Compares data of the given columns and populates the change state of the data in a newly created 'change_state' column

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
dict Text JSON that contains 'column_name' and 'column_type' keys for actual column name and column type
values. ex:[{'column_name':'name','column_type':'type'}]
suffixes Text _new,_old Comma separated list of suffixes to identify old and new columns
modified_summary Text False Enable modified_summary to compute the number of modified rows in each column and populate
the summary in the 'modified_summary' column (True/False)







Bot @dm:verify-checksum

Bot Position In Pipeline: Sink

Verify checksum of input dataframe. Checksum can be by rows-only, or by rows and then the entire dataset.

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
checksum_type Text dataset Verify checksum by row only, or by row and then the entire dataset. Valid values are 'rows-only',
'dataset'
row_checksum_column Text rda_row_checksum Input column for computed row level checksum.
data_checksum_column Text rda_data_checksum Input column for computed checksum for entire dataset.
key Text Optional key to be used in the computed hash.
drop_checksum_columns Text yes Use 'yes' or 'no' to specify if the checksum related columns should be dropped from the dataframe
after a successful validation.
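
Example usage (a minimal sketch pairing @dm:add-checksum with @dm:verify-checksum on the same data):

@files:loadfile
            filename  = 'https://bot-docs.cloudfabrix.io/data/datasets/sample-servicenow-incidents.csv'
    --> @dm:add-checksum checksum_type = 'rows-only'
    --> @dm:verify-checksum checksum_type = 'rows-only'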







Bot @dm:xml-to-json

Bot Position In Pipeline: Sink

Parse XML document in a specified column and convert it into JSON

This bot expects a Restricted CFXQL.

Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot

Parameter Name Type Default Value Description
column* Text Column whose data needs to be parsed as XML
output_column* Text Column name for the output JSON data
status_column Text Column name for parsing status
json_path Text Dot delimited JSON path to be traversed in the document. Default is entire document.

Example usage:

@dm:empty
    --> @dm:addrow Serial_Number="<number>124</number>" &
            name="<name>john</name>"
    --> @dm:addrow Serial_Number="<number>133</number>" &
            name="<name>smith</name>"
    --> @dm:xml-to-json column = 'name' &
            output_column = 'json_data'
