Bots From Extension: cfxdm
Data Management
This extension provides 175 bots.
Bot @dm:add-bounded-dataset
Bot Position In Pipeline: Source Sink
This is a bot that adds a bounded dataset. Bounded datasets are bound to a pre-defined schema so that data is always validated against a set of rules.
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
name* | Text | | Name of the dataset |
schema_name* | Text | | Name of the schema to bind this dataset |
Bot @dm:add-checksum
Bot Position In Pipeline: Sink
Add a checksum to the input dataframe. The checksum can be computed per row only, or per row and then over the entire dataset.
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
checksum_type | Text | dataset | Compute checksum for by row only or by row and then entire dataset. Valid values are 'rows-only', 'dataset' |
row_checksum_column | Text | rda_row_checksum | Output column for computed row level checksum. If the column already exists, it will be replaced and not included in the checksum computation. |
data_checksum_column | Text | rda_data_checksum | Output column for computed checksum for entire dataset. If the column already exists, it will be replaced and not included in the checksum computation. |
key | Text | | Optional key to be used in the computed hash. |
Example usage:
Playground
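A minimal usage sketch (pipeline syntax and the key value are illustrative):

```
@dm:add-checksum
        checksum_type = 'rows-only' and
        key = 'example-key'
```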
Bot @dm:add-missing-columns
Bot Position In Pipeline: Sink
Add columns if not found in the input
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
columns* | Text | | Comma separated list of column names to be added if they do not already exist in the input |
value | Text | | Value to be assigned if columns are not found in input. Default is None. |
Example usage:
Playground
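A minimal usage sketch (column names and the fill value are illustrative):

```
@dm:add-missing-columns
        columns = 'site,region' and
        value = 'unknown'
```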
Example Pipelines Using this Bot
- li-filebeat-events-to-prod-env
- li-http-events-to-prod-env
- li-replay-logs-to-dev-env
- li-stream-tcp-syslogs
- li-udp-syslog-events-to-prod-env
- li-windows-events-to-prod-env
Bot @dm:add-schema
Bot Position In Pipeline: Source Sink
This is a bot that adds json schema to the system. The datasets can be bound to this schema so that adding/editing any rows to dataset are automatically validated against this schema.
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
schema_file* | Text | | File path or URL of the json schema file |
name* | Text | | Name of the json schema |
Bot @dm:add-template
Bot Position In Pipeline: Source Sink
Add a formatting template with 'name' and contents downloaded from a 'url'. If the template already exists, it will be overwritten.
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
url* | Text | | URL to contents of the Jinja2 style formatting template |
name* | Text | | Name of the formatting template |
description | Text | | Formatting template description |
Bot @dm:addrow
Bot Position In Pipeline: Sink
Append a row to input dataframe, with specified column = value parameters
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
This bot only accepts wildcard parameters. All name = 'value' parameters are passed to the bot.
Example usage:
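A minimal usage sketch, since this bot takes wildcard column = value parameters (the column names and values here are illustrative):

```
@dm:empty
    --> @dm:addrow
            device = 'router-01' and status = 'up'
```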
Example Pipelines Using this Bot
- dli-process-synthetic-syslogs
- ebonding-servicenow-to-stream-v2
- sample-cato-networks-graphql
- sample-grok-test
- sample-mondaydotcom-graphql
- sample-nlp-example
Bot @dm:apply-alert-rules
Bot Position In Pipeline: Source Sink
Apply the specified alert ruleset to the input data.
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
name* | Text | | Name of the alert ruleset to apply |
timestamp_column | Text | | Name of the column to be used as timestamp |
Bot @dm:apply-data-model
Bot Position In Pipeline: Sink
Apply specified data model to input dataframe. Example model name 'assetLCMMaster'
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
model* | Text | | Name of the model to apply to input dataframe |
removeUnmapped | Text | no | Remove columns that are not in the model. Specify 'yes' or 'no' |
apply_for_empty | Text | no | Apply value for empty string ('yes' or 'no') |
Example usage:
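A minimal usage sketch, using the model name mentioned above:

```
@dm:apply-data-model
        model = 'assetLCMMaster' and
        removeUnmapped = 'yes'
```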
Bot @dm:apply-snmp-trap-template
Bot Position In Pipeline: Sink
Apply template to incoming SNMP Trap objects. Template must be available in RDA Object repository.
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
timestamp_col* | Text | | Column which has timestamp at which SNMP Trap was received. Must be in UTC epoch milliseconds format |
version_col | Text | | Column which has SNMP version. |
address_col* | Text | | Column which has IPAddress of the SNMP Trap source |
varbinds_col* | Text | | Column which has varbind list. Should be list of dict objects |
template_folder | Text | snmp_trap_templates | Name of the folder for RDA Objects which contains the SNMP Trap template |
template_name | Text | traps | Name of the RDA Object which has the SNMP Trap template |
Bot @dm:apply-template-all-rows
Bot Position In Pipeline: Sink
Apply specified formatting template for all input rows and produce one rendered row
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
template_name* | Text | | Name of the formatting template to be applied |
output_col* | Text | | Output column |
status_col | Text | | Template parsing status column. If not specified, any errors will cause the pipeline to abort. |
Bot @dm:apply-template-by-row
Bot Position In Pipeline: Sink
Apply specified formatting template for each input dataframe row and produce rendered output
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
template_name* | Text | | Name of the formatting template to be applied |
output_col* | Text | | Output column |
status_col | Text | | Template parsing status column. If not specified, any errors will cause the pipeline to abort. |
Example Pipelines Using this Bot
- ebonding-stream-to-email
- ebonding-stream-to-pagerduty
- ebonding-stream-to-slack
- sample-cato-networks-graphql
- sample-formatting-template-example
- sample-mondaydotcom-graphql
Bot @dm:apply-topology-rci
Bot Position In Pipeline: Sink
Apply topology based Root Cause Inference model
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
node_weights_dict | Text | | Name of the node weights dictionary. Expects columns node_id and weight |
link_weights_dict | Text | | Name of the link weights dictionary. Expects columns link_type and weight |
severity_weights_dict | Text | | Name of the severity weights dictionary. Expects columns severity and weight |
select_top | Text | 1 | How many high score nodes to select. Default is 1 |
stack_name* | Text | | Name of the stack |
Bot @dm:bin
Bot Position In Pipeline: Sink
Create bins for numerical 'column' and bins specified by 'bins' parameter
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
column* | Text | | Numerical value column |
bins* | Text | | Comma separated list of numerical values representing bins |
Example usage:
Playground
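A minimal usage sketch (the column name and bin edges are illustrative):

```
@dm:bin
        column = 'cpu_percent' and
        bins = '0,25,50,75,100'
```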
Bot *dm:bookmark-list
Bot Position In Pipeline: Source
List of saved bookmarks
This bot expects a Full CFXQL.
Bot applies the Query on the data that is already loaded from previous bot or from a source.
Bot @dm:build-hierarchy
Bot Position In Pipeline: Sink
Builds relationships between the entities and populates hierarchy keys in the newly created 'hierarchy' column
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
entity_key_column* | Text | | Attribute that represents the unique entity key column |
relation_key_column* | Text | | A relationship key column that should match the entity_key_column |
hierarchy_end_key_column | Text | | Attribute that represents the hierarchy end key column to stop hierarchy building |
hierarchy_end_value | Text | | Hierarchy end value to stop hierarchy building |
include_column_to_primary_key | Text | | Column name that will be added to the entity key column to make it unique for building the hierarchy |
Bot @dm:change-time-format
Bot Position In Pipeline: Sink
Change datetime from one format to another for all specified columns
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
columns* | Text | | Comma separated list of column names |
from_format* | Text | | From format, one of: datetimestr s ms ns |
to_format* | Text | | To format, one of: datetimestr s ms ns. Can also specify a custom format expression. |
Bot @dm:check-columns
Bot Position In Pipeline: Sink
Check input columns for specific list of columns that must exist or must not exist and take an action
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
must_contain | Text | | Comma separated list of columns which must exist in the input. |
must_not_contain | Text | | Comma separated list of columns which must not exist in the input. |
action* | Text | | Action to take if either of the column checks fails. Must be one of 'fail', 'skip-block', 'skip-pipeline' |
Example usage:
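A minimal usage sketch (the required column names are illustrative):

```
@dm:check-columns
        must_contain = 'timestamp,message' and
        action = 'skip-pipeline'
```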
Bot @dm:check-integrity
Bot Position In Pipeline: Sink
Check integrity of input data using 'rules' dataset and save results to 'errors' dataset
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
rules* | Text | | Name of the rules dataset |
failfast | Text | yes | Specify 'yes' to abort quickly on the first error or 'no' to keep validating rules even when some rules fail |
errors* | Text | | Name of the output errors dataset |
failpipeline | Text | no | Specify 'yes' to fail the entire pipeline on errors, 'no' to keep executing |
Bot *dm:cohort-list
Bot Position In Pipeline: Source
List of Cohorts
This bot expects a Full CFXQL.
Bot applies the Query on the data that is already loaded from previous bot or from a source.
Bot @dm:cohort-load
Bot Position In Pipeline: Source Sink
Load cohort specified by 'name'
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
name* | Text | | Name of the cohort to load |
Bot @dm:concat
Bot Position In Pipeline: Source
Concatenate set of saved dataframes ('names'). Each dataframe must have been saved using dm:save
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
names* | Text | | Name of the saved datasets (regex) |
return_empty | Text | no | Return an empty dataframe if an error occurs loading the dataset |
Example usage:
Playground
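A minimal usage sketch (the dataset name pattern is illustrative):

```
@dm:concat
        names = 'inventory-.*' and
        return_empty = 'yes'
```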
Bot @dm:concat-input-dataset
Bot Position In Pipeline: Sink
Concatenate set of saved dataframes ('names') with the input dataset. Each dataframe must have been saved using dm:save
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
names* | Text | | Name of the saved datasets (regex) |
return_empty | Text | no | Return an empty dataframe if an error occurs loading the dataset |
Bot @dm:content-to-object
Bot Position In Pipeline: Sink
Convert data from a column into objects
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
input_content_column* | Text | | Name of the column in input that contains the data |
output_column* | Text | | Column name where object names will be inserted |
output_folder* | Text | | Folder name where objects will be stored |
Bot @dm:copy-columns
Bot Position In Pipeline: Sink
Copy values from one column to another
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
from* | Text | | Get the value from specified column or columns (comma separated) |
to* | Text | | Store the value to specified column or columns (comma separated). The number of columns specified should match the 'from' column(s). |
func | Text | | Supported operations are: [strip, upper, lower, append, lstrip, rstrip, replace, split, join, len] |
value | Text | , | Specify a value for the 'split' and 'join' functions; specify 'oldvalue' and 'newvalue' for the 'replace' function. Default value is a comma. |
prefix | Text | | If the function is 'append', specify the string to be appended at the beginning |
suffix | Text | | If the function is 'append', specify the string to be appended at the end |
Bot @dm:copy-config
Bot Position In Pipeline: Source Sink
Copy RDA Object to local file in worker, typically used to update a configuration file on a mounted folder
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
src_object* | Text | | Object name |
src_folder* | Text | | Folder name on the object storage |
dest_file* | Text | | Location of the destination file |
backup_dir | Text | | If dest_file exists, copy it to this backup directory |
Bot @dm:counter
Bot Position In Pipeline: Sink
Adds COUNTER to each row of the input dataset
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Bot @dm:create-cohorts
Bot Position In Pipeline: Source Sink
Create cohorts from input stack
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
stack_name* | Text | | Name of the Stack to use |
groupby* | Text | | Comma separated column names to do groupby |
cohort_name_prefix | Text | cohort | Cohort name prefix to use |
cfxql_filter | Text | | cfxql filter to apply on stack data |
Bot @dm:create-logarchive-repo
Bot Position In Pipeline: Source Sink
Create logarchive repository on RDA Platform Minio, if not created already
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
repo* | Text | | Name of the Log Archive repository to be created. If a repo already exists with this name, bot will not perform any action. |
prefix* | Text | | Object prefix on the platform Minio. |
retention | Text | 0 | Retention period in number of days. If set to 0, RDA will not manage the log archive lifecycle. |
Bot @dm:create-persistent-stream
Bot Position In Pipeline: Source Sink
Create a Persistent Stream if not already created
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
name* | Text | | Name of the Persistent Stream to create. |
index_name | Text | | Optional index name in OpenSearch. If not specified, index name will be automatically created from stream name. |
retention_days | Text | 31 | Retention period in number of days. If set to 0, RDA will not manage the persistent stream lifecycle. |
timestamp_column | Text | | Name of timestamp column. Optional. |
unique_cols | Text | | Comma separated list of columns to be used as unique columns to make stream updatable |
Bot @dm:create-zipfile
Bot Position In Pipeline: Source
Zip the contents of the given folder and place it at the specified location.
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
folder_name* | Text | | Folder name to create a zip file |
zipfile_name | Text | | File name for the created zipfile. If the zipfile name is not specified, it will be taken from the folder_name |
save_to_location | Text | False | Location to place the created zip file. If it is not specified, the zipfile will be saved in the given folder_name |
Bot @dm:dataset-location
Bot Position In Pipeline: Source Sink
Get the location information for a previously saved dataset
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
name* | Text | | Name of the dataset |
Example usage:
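A minimal usage sketch (the dataset name is illustrative):

```
@dm:dataset-location name = 'my-saved-dataset'
```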
Bot @dm:dedup
Bot Position In Pipeline: Sink
Dedup rows using specified 'columns'
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
columns | Text | | Comma separated list of column names. Default: All columns |
keep | Text | first | Specify which duplicates (if any) to keep. Choose 'first' or 'last' |
Example usage:
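A minimal usage sketch (the column names are illustrative):

```
@dm:dedup
        columns = 'device,ip_address' and
        keep = 'last'
```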
Bot @dm:delete-dataset
Bot Position In Pipeline: Sink
Delete a previously saved dataset
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
dataset_column | Text | dataset | Column Name of the dataset to delete |
Bot @dm:describe
Bot Position In Pipeline: Sink
Describe the input dataframe using optional 'columns' attribute
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
columns | Text | | Comma separated list of column names. Default all columns |
Example usage:
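A minimal usage sketch (the column names are illustrative):

```
@dm:describe columns = 'latency_ms,packet_loss'
```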
Bot @dm:diff
Bot Position In Pipeline: Sink
Compare input dataset against a 'base_dataset'
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
base_dataset* | Text | | Name of the base dataset to compare the input dataset against |
key_cols* | Text | | Comma separated columns to identify each row |
exclude | Text | | Exclude columns in the diff (regex pattern) |
keep_data | Text | no | Keep the data columns in the diff output ('yes' or 'no') |
Example usage:
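A minimal usage sketch (the dataset and key column names are illustrative):

```
@dm:diff
        base_dataset = 'inventory-yesterday' and
        key_cols = 'device' and
        keep_data = 'yes'
```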
Bot @dm:dns-ip-to-name
Bot Position In Pipeline: Sink
Perform reverse DNS lookup to map IP Addresses to Hostnames on specified columns
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
from_cols* | Text | | Comma separated list of columns with IP Address values |
to_cols* | Text | | Comma separated list of column names to store resolved Hostnames. |
keep_value | Text | no | If lookup fails, store original value if 'yes' Or null if 'no' |
num_threads | Text | 5 | Number of threads. Must be in the range of 1 to 20 |
additional_records | Text | false | Get additional domain names. true/false |
record_type | Text | PTR | Comma separated record types for additional records. ex: PTR,A,CNAME |
Example usage:
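A minimal usage sketch (the column names are illustrative):

```
@dm:dns-ip-to-name
        from_cols = 'src_ip' and
        to_cols = 'src_hostname' and
        keep_value = 'yes'
```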
Bot @dm:dns-name-to-ip
Bot Position In Pipeline: Sink
Perform DNS lookup to map Hostnames to IP Addresses on specified columns
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
from_cols* | Text | | Comma separated list of columns with Hostnames |
to_cols* | Text | | Comma separated list of column names to store resolved IP Addresses. |
keep_value | Text | no | If lookup fails, store original value if 'yes' Or null if 'no' |
num_threads | Text | 5 | Number of threads. Must be in the range of 1 to 20 |
additional_records | Text | false | Get additional ip addresses list. true/false |
Example usage:
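A minimal usage sketch (the column names are illustrative):

```
@dm:dns-name-to-ip
        from_cols = 'hostname' and
        to_cols = 'ip_address'
```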
Bot @dm:drop-null-columns
Bot Position In Pipeline: Sink
Drop columns with a specified % of null values
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
keep_columns | Text | | Column name regex pattern. All columns matching this pattern must remain in output even if they have nulls |
threshold | Text | 100.0 | Percent threshold for Null (NaN) values for each column. If the % of nulls exceeds this threshold, the column is removed from output. Value must be > 0 and <= 100.0. Default is 100 |
empty_is_null | Text | no | Treat empty strings with white spaces only as Nulls. Valid values are 'yes' or 'no'. Default is 'no' |
Bot @dm:dropnull
Bot Position In Pipeline: Sink
Drop rows if specified 'columns' have null values
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
columns* | Text | | Comma separated list of column names |
Example usage:
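A minimal usage sketch (the column names are illustrative):

```
@dm:dropnull columns = 'ip_address,device'
```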
Bot @dm:empty
Bot Position In Pipeline: Source Sink
Create an empty dataframe with optional columns
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
columns | Text | | Comma separated list of column names to be included in the empty dataframe. By default no columns are included. |
Example usage:
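A minimal usage sketch (the column names are illustrative):

```
@dm:empty columns = 'device,ip_address,status'
```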
Example Pipelines Using this Bot
- dli-process-synthetic-syslogs
- ebonding-servicenow-to-stream-v2
- sample-cato-networks-graphql
- sample-grok-test
- sample-mondaydotcom-graphql
- sample-nlp-example
Bot @dm:enrich
Bot Position In Pipeline: Sink
Enrich the input dataframe using a saved dictionary dataset
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
dict* | Text | | Name of the saved dataset to be used as dictionary |
src_key_cols* | Text | | Comma separated list of column names in input to use for join |
dict_key_cols* | Text | | Comma separated list of column names in dict to use for join |
enrich_cols* | Text | | Comma separated list of column names to bring from dictionary |
enrich_cols_as | Text | | Comma separated list of new column names in the enriched output |
return_empty_dict | Text | no | Return empty dictionary if it doesn't exist. (yes/no) |
return_empty_cols | Text | no | Return empty columns if dict is empty or doesn't exist. (yes/no) |
suffixes | Text | _x,_y | Comma separated list of suffixes to add to overlapping column names in left and right respectively |
indicator | Text | False | Enable indicator to add a column to the output DataFrame called _merge with information on the source of each row. The column can be given a different name by providing a string argument |
how_type | Text | left | Specify the type of merge to be performed, e.g. right, outer, inner. By default a left merge is performed |
dedup_dict | Text | yes | Set to 'no' to keep duplicate rows from dict_key_cols instead of dropping them |
case_insensitive | Text | no | Perform case insensitive match on the key values |
replace_values | Text | no | If enabled, the actual column value will be replaced with the _x or _y value when not null, and the _x/_y columns are dropped |
cache | Text | yes | Cache the dict for future recalls. 'yes' or 'no'. |
cache_refresh_seconds | Text | 120 | Refresh the cache (if new update available) after specified seconds. |
Example usage:
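A minimal usage sketch (the dictionary and column names are illustrative):

```
@dm:enrich
        dict = 'site-lookup' and
        src_key_cols = 'site_code' and
        dict_key_cols = 'site_code' and
        enrich_cols = 'city,country' and
        enrich_cols_as = 'site_city,site_country'
```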
Bot @dm:enrich-conditional
Bot Position In Pipeline: Sink
Enrich the input dataframe using a saved dataset based on CFXQL condition
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
dict* | Text | | Name of the saved dataset to be used as dictionary |
condition* | Text | | Condition must be a valid CFXQL expression that is applied to input dataset. Supports GET operation for column filtering. |
enrich_cols | Text | | Comma separated list of enriched columns from dictionary if the rule matches. This will be applied after the condition parameter is applied. |
enrich_cols_as | Text | | Rename the enrich columns, should be specified in the same order as in enrich_cols param. This will be applied after the condition parameter is applied. |
return_status | Text | no | Add a column 'meta_enrich_status' to the output with the status of the enrichment. If set, failures will be captured in this column. |
return_empty_dict | Text | no | Return empty dictionary if it doesn't exist. (yes/no) |
return_empty_cols | Text | no | Return empty columns if dict is empty or doesn't exist. (yes/no) |
cache | Text | no | Cache the result for future recalls. 'yes' or 'no' |
cache_refresh_seconds | Text | 120 | Refresh the cache (if new update available) after specified seconds |
Bot @dm:enrich-using-ip-cidr
Bot Position In Pipeline: Sink
Enrich the input dataframe using a saved dictionary dataset. Match IP address in input dataframe with CIDRs specified in the dictionary
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
dict* | Text | | Name of the saved dataset to be used as dictionary |
src_ip_col* | Text | | Column name in input dataframe which has IPv4 or IPv6 address |
dict_cidr_col* | Text | | Comma separated list of column names in dict to use for join |
enrich_cols* | Text | | Comma separated list of column names to bring from dictionary |
enrich_cols_as | Text | | Comma separated list of new column names in the enriched output |
Example usage:
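A minimal usage sketch (the dictionary and column names are illustrative):

```
@dm:enrich-using-ip-cidr
        dict = 'subnet-lookup' and
        src_ip_col = 'src_ip' and
        dict_cidr_col = 'cidr' and
        enrich_cols = 'site,vlan'
```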
Bot @dm:enrich-using-ip-cidr-multi-proc
Bot Position In Pipeline: Sink
Enrich the input dataframe using a saved dictionary dataset. Match IP address in input dataframe with CIDRs specified in the dictionary. Use specified number of processes.
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
dict* | Text | | Name of the saved dataset to be used as dictionary |
src_ip_col* | Text | | Column name in input dataframe which has IPv4 or IPv6 address |
dict_cidr_col* | Text | | Comma separated list of column names in dict to use for join |
enrich_cols* | Text | | Comma separated list of column names to bring from dictionary |
enrich_cols_as | Text | | Comma separated list of new column names in the enriched output |
_max_procs | Text | 0 | Maximum number of CPUs to use. 0 means all available CPUs. Value should be >= 1 |
Bot @dm:enrich-using-pstream
Bot Position In Pipeline: Sink
Enrich the input dataframe using a persistent stream
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
dict* | Text | | Name of the persistent stream to be used as dictionary |
query | Text | | CFXQL query to filter the dictionary data. |
src_key_cols* | Text | | Comma separated list of column names in input to use for join |
dict_key_cols* | Text | | Comma separated list of column names in dict to use for join |
enrich_cols* | Text | | Comma separated list of column names to bring from dictionary |
enrich_cols_as | Text | | Comma separated list of new column names in the enriched output |
return_empty_dict | Text | no | Return empty dictionary if pstream doesn't exist. (yes/no) |
return_empty_cols | Text | no | Return empty columns if dict is empty or doesn't exist. (yes/no) |
batch_lookup | Text | Specify how many unique rows to look up in the dictionary at a time. Example: 50. This option typically improves performance when the dictionary is very large. | |
suffixes | Text | _x,_y | Comma separated list of suffixes to add to overlapping column names in left and right respectively |
indicator | Text | False | Enable indicator to add a column to the output DataFrame called _merge with information on the source of each row. The column can be given a different name by providing a string argument |
how_type | Text | left | Specify the type of merge to be performed. Example: right, outer, inner. By default, a left merge is performed |
dedup_dict | Text | yes | Set to 'no' to keep duplicate rows from dict_key_columns instead of dropping them |
replace_values | Text | no | If enabled, the actual column value is replaced with the _x or _y value (when not null) and the _x, _y columns are dropped |
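The join behaviour (left merge by default, suffixes on overlapping columns, dedup of dictionary keys) follows pandas-style merge semantics. A minimal pure-Python sketch, assuming a single join key on each side and first-match dedup (all names here are illustrative):

```python
def left_merge(left, right, left_key, right_key, suffixes=("_x", "_y")):
    """Minimal sketch of a left merge: every left row is kept; when a
    right row shares the join key, its columns are appended, and
    overlapping column names receive the configured suffixes."""
    index = {}
    for r in right:
        index.setdefault(r[right_key], r)  # dedup: keep first match per key
    merged = []
    for l in left:
        row = dict(l)
        match = index.get(l[left_key])
        if match:
            for col, val in match.items():
                if col == right_key:
                    continue
                if col in row:
                    # overlapping column: keep both sides under suffixed names
                    row[col + suffixes[0]] = row.pop(col)
                    row[col + suffixes[1]] = val
                else:
                    row[col] = val
        merged.append(row)
    return merged
```

With `replace_values` enabled, the bot would additionally coalesce the suffixed pairs back into a single column, preferring the non-null value.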
Bot @dm:enrich-using-rule-dict
Bot Position In Pipeline: Sink
Enrich using rule based dictionary which contains 'rule' column
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
dict* | Text | Name of the saved dataset to be used as rules dictionary. The dictionary should contain rule_id and rule columns | |
rule_id_column | Text | rule_id | Rule ID column in dictionary. Rules will be sorted in ascending order using this column |
rule_column | Text | rule | Rule column in dictionary. Rule must be a valid CFXQL expression that is applied to the input dataset. |
enrich_columns | Text | Comma separated list of enriched columns from dictionary if the rule matches | |
template_columns | Text | Comma separated list of template column names. At least one of enrich_columns or template_columns must be specified. | |
cache | Text | yes | Cache the dict for future recalls. 'yes' or 'no'. |
cache_refresh_seconds | Text | 120 | Refresh the cache (if new update available) after specified seconds. |
Example Pipelines Using this Bot
- dli-process-synthetic-syslogs
- ebonding-stream-to-pagerduty
- li-filebeat-events-to-prod-env
- li-http-events-to-prod-env
- li-replay-logs-to-dev-env
- li-stream-tcp-syslogs
- li-udp-syslog-events-to-prod-env
- li-windows-events-to-prod-env
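The rule-dictionary evaluation order can be sketched as follows. This is a simplified illustration: real rules are CFXQL expressions, whereas here each rule carries a plain Python predicate, and all names are hypothetical:

```python
def enrich_with_rules(rows, rules, enrich_cols):
    """Sketch of rule-dictionary enrichment: rules are evaluated in
    ascending rule_id order and the first matching rule supplies the
    enrichment column values for that row."""
    ordered = sorted(rules, key=lambda r: r["rule_id"])
    out = []
    for row in rows:
        enriched = dict(row)
        for rule in ordered:
            if rule["match"](row):  # stands in for the CFXQL 'rule' column
                for col in enrich_cols:
                    enriched[col] = rule[col]
                break  # first match wins
        out.append(enriched)
    return out
```

Because rules are sorted by `rule_id` and the first match wins, more specific rules should be given lower rule IDs than catch-all rules.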
Bot @dm:eval
Bot Position In Pipeline: Sink
Map values using evaluate function. Specify one or more column = 'expression' pairs
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
_skip_errors | Text | no | Specify 'yes' or 'no'. If 'yes', do not bailout if any of the row processing results in error. Check meta_eval_message field when it continues with an error. |
This bot also accepts wildcard parameters. Any additional name = 'value' parameters are passed to the bot.
Example usage:
Playground
Example Pipelines Using this Bot
- dli-generate-synthetic-syslogs
- dli-process-synthetic-syslogs
- ebonding-stream-to-email
- ebonding-stream-to-pagerduty
- ebonding-stream-to-slack
- li-replay-logs-to-dev-env
- li-stream-tcp-syslogs
- li-udp-syslog-events-to-prod-env
- sample-cato-networks-graphql
- sample-ecommerce-analytics
- sample-formatting-template-example
- sample-vm-analytics
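The per-row evaluation model can be sketched in Python, with `eval()` standing in for the bot's mapping functions (illustrative only; the function name and error-handling details below are assumptions, though the `meta_eval_message` field is documented above):

```python
def eval_columns(rows, exprs, skip_errors=False):
    """Sketch of per-row expression mapping: each column = 'expression'
    pair is evaluated with the row's existing columns in scope, and the
    result is stored in the named output column."""
    out = []
    for row in rows:
        new = dict(row)
        for col, expr in exprs.items():
            try:
                new[col] = eval(expr, {}, dict(new))
            except Exception as e:
                if not skip_errors:
                    raise  # default behaviour: bail out on the first error
                new["meta_eval_message"] = str(e)
        out.append(new)
    return out
```

With `_skip_errors = 'yes'`, a failing expression does not abort processing; the error detail is recorded in `meta_eval_message` instead.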
Bot @dm:eval-multi-proc
Bot Position In Pipeline: Sink
Map values using evaluate function. Uses all available CPU cores to do parallel processing. Specify one or more column = 'expression' pairs
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
_max_procs | Text | 0 | Maximum number of CPUs to use. 0 means all available CPUs. Value should be >= 1 |
_skip_errors | Text | no | Specify 'yes' or 'no'. If 'yes', do not bailout if any of the row processing results in error. Check meta_eval_message field when it continues with an error. |
This bot also accepts wildcard parameters. Any additional name = 'value' parameters are passed to the bot.
Example usage:
Playground
Try this in RDA Playground (rda_docs_env)
Bot @dm:event-sampling
Bot Position In Pipeline: Source Sink
Sample events to create training dataset for classification
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
time_column | Text | timestamp | Event occurrence time column |
events_spread_dataset* | Text | Events Spread dataset name | |
critical_event_mnemonic_columns* | Text | Comma separated list of column names to use to filter critical events | |
sample_events_query | Text | * | CFXQL query to fetch related events in duration period |
duration_hours | Text | 1 | Duration to fetch related events for each critical event |
groupby* | Text | Column name to perform groupby to get counts |
Bot @dm:eventcorr-intra-group
Bot Position In Pipeline: Sink
Compute noise reduction for each group using 'groupby', 'created', 'window' columns
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
groupby* | Text | Comma separated list of columns to do the grouping | |
timestamp* | Text | Timestamp column name, typically event created timestamp | |
unit | Text | Timestamp unit. Default is None, meaning the timestamp is a string. Other units are s, ms, ns | |
id | Text | Identity column, if not specified will use timestamp column | |
window | Text | 15 | Sliding time window that groups events that occur within the window (in minutes). Multiple windows may be specified as comma separated list |
window_type | Text | moving | Window type 'moving' or 'fixed' |
group_label_dataset | Text | If specified, correlated group assignments will be written to the specified dataset |
Example usage:
Playground
Try this in RDA Playground (rda_docs_env)
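How a moving window groups events can be sketched as below (a fixed window would instead measure the gap from the first event of the group rather than the previous event). The function name is illustrative:

```python
from datetime import datetime, timedelta

def moving_window_groups(timestamps, window_minutes):
    """Sketch of moving-window grouping: sorted events stay in the same
    group as long as each event falls within `window_minutes` of the
    previous one; a larger gap starts a new group."""
    groups, current = [], []
    for ts in sorted(timestamps):
        if current and (ts - current[-1]).total_seconds() > window_minutes * 60:
            groups.append(current)
            current = []
        current.append(ts)
    if current:
        groups.append(current)
    return groups
```

For example, with a 15 minute window, events at 00:00 and 00:05 correlate into one group, while an event at 00:40 starts a new group.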
Bot @dm:eventzoning
Bot Position In Pipeline: Sink
Compute event zones using 'groupby', 'created', 'resolved' columns
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
groupby* | Text | Comma separated list of columns to do the grouping | |
created* | Text | Timestamp column name, typically event created timestamp | |
resolved | Text | Timestamp column name, typically event closed or resolved timestamp | |
unit | Text | Timestamp unit. Default is None, meaning the timestamp is a string. Other units are s, ms, ns | |
id | Text | Identity column, if not specified will use timestamp column | |
freq | Text | 5% | Frequency threshold for zoning. Default 5%. If % is omitted, it will be taken as an absolute count threshold |
mttr | Text | 1d | MTTR threshold for zoning. Example: 1d, 2h, 90m, 9000s |
Example usage:
Playground
Try this in RDA Playground (rda_docs_env)
Bot @dm:explode
Bot Position In Pipeline: Sink
Explode a 'column' into rows by splitting the value using a 'sep' separator
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
column* | Text | Name of the column to explode into rows | |
sep | Text | , | Separator (default is comma) |
Example Usage
Example Pipelines Using this Bot
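The explode transformation can be sketched in a few lines of Python (illustrative; whitespace trimming around each element is an assumption here, not documented behaviour):

```python
def explode(rows, column, sep=","):
    """Sketch of exploding a column into rows: each row whose `column`
    holds a separated list becomes one output row per list element,
    with all other columns duplicated."""
    out = []
    for row in rows:
        for part in str(row[column]).split(sep):
            new = dict(row)
            new[column] = part.strip()  # trimming is an assumption
            out.append(new)
    return out
```

A row `{host: a, tags: "web,db"}` would thus become two rows, one with `tags = web` and one with `tags = db`.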
Bot @dm:explode-json
Bot Position In Pipeline: Sink
Explode a 'column' that contains JSON object(s) into rows
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
column* | Text | Name of the column with JSON data to explode into rows | |
ignore_errors | Text | yes | Ignore JSON parsing or structural errors. 'yes' or 'no' |
exclude_exploded_columns | Text | Regular expression to exclude a set of columns from exploded data | |
include_exploded_columns | Text | .* | Regular expression to include a set of columns from exploded data |
prefix_parent_key | Text | no | Prefix the parent key to the exploded columns. |
Example usage:
Playground
Try this in RDA Playground (rda_docs_env)
Example Pipelines Using this Bot
Bot @dm:explode-timerange-into-windows
Bot Position In Pipeline: Sink
Explode a specified timerange into windows for events that have created and resolved timestamps. Aggregate value for each window using specified function.
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
groupby* | Text | Comma separated list of columns to do the grouping | |
created_col* | Text | Event created Timestamp column name, must be in datetimestr format | |
resolved_col* | Text | Event resolved Timestamp column name, must be in datetimestr format | |
value_col* | Text | Name of the column whose value is aggregated for each window | |
window_start* | Text | Window start timestamp in datetimestr format | |
window_end | Text | Window end timestamp in datetimestr format. If not specified, current timestamp will be used. | |
interval* | Text | Bucketization interval expressed with days, hours, minutes, seconds. Ex: '1d, 4h, 15min' | |
agg | Text | sum | Value aggregation function. Valid values are sum, min, max |
Example usage:
Playground
Try this in RDA Playground (rda_docs_env)
Bot @dm:extract
Bot Position In Pipeline: Sink
Extract data using 'expr' regex pattern from 'columns'
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
expr* | Text | Regular expression with named patterns | |
columns* | Text | Comma separated list of columns from which to extract the data |
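The "regular expression with named patterns" mechanism corresponds to named capture groups: each `(?P<name>...)` group becomes a new column. A minimal sketch with the standard `re` module (function name illustrative):

```python
import re

def extract(rows, expr, columns):
    """Sketch of named-pattern extraction: apply a regex with named
    groups to each listed column and add the captured groups to the
    row as new columns."""
    pattern = re.compile(expr)
    out = []
    for row in rows:
        new = dict(row)
        for col in columns:
            m = pattern.search(str(row.get(col, "")))
            if m:
                new.update(m.groupdict())
        out.append(new)
    return out
```

For example, the pattern `(?P<device>/dev/\w+) at (?P<pct>\d+)%` applied to a message column would add `device` and `pct` columns.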
Bot @dm:extract-contents-from-html
Bot Position In Pipeline: Sink
Extract contents from HTML content in the input dataset.
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
output_column* | Text | Name of the output column to save the extracted HTML content. | |
input_column* | Text | Name of the input column which contains HTML content. | |
path* | Text | The path to the element to extract, separated by periods (e.g. 'html.body.div'). | |
index | Text | 0 | The index of the element to extract, if there are multiple elements at the specified path. Defaults to 0. |
Bot @dm:extract-key-value
Bot Position In Pipeline: Sink
Extract Key-Value pairs from column and add to dataframe
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
name | Text | extract_kv | Name for this bot. |
format | Text | kv_type1 | Format of input data. Supported formats are syslog_kv_type1, cef, kv_type1. Ex: syslog_kv_type1 format: <150>device="SFW" date=2022-07-04 time=10:06:39 timezone="IST" device_name="AB690" ... kv_type1 format: device="SFW" date=2022-07-04 time=10:06:39 timezone="IST" device_name="AB690" |
column* | Text | Column name in input dataset which contains key=value fields that need to be extracted. | |
_max_procs | Text | 1 | Maximum number of CPUs to use. 0 means all available CPUs. |
Bot @dm:fail-if-shape
Bot Position In Pipeline: Sink
Fail the pipeline, if input dataframe shape meets criteria expressed using 'row_count' & 'column_count'
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
row_count | Text | Number of rows in input dataframe. This variable accepts all numeric operations. | |
column_count | Text | Number of columns in input dataframe. This variable accepts all numeric operations. |
Example usage:
Playground
Try this in RDA Playground (rda_docs_env)
Bot @dm:file-to-object
Bot Position In Pipeline: Sink
Convert files from a column into objects
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
input_filename_column* | Text | Name of the column in input that contains the filenames | |
output_column* | Text | Column name where object names will be inserted | |
output_folder* | Text | Folder name where objects will be stored |
Bot *dm:filter
Bot Position In Pipeline: Sink
Apply CFXQL filtering on the data
This bot expects a Full CFXQL.
Bot applies the Query on the data that is already loaded from previous bot or from a source.
Example usage:
Playground
Example Pipelines Using this Bot
- aws-dependency-mapper-inner-pipeline
- dli-process-synthetic-syslogs
- li-filebeat-events-to-prod-env
- li-http-events-to-prod-env
- li-tcp-syslog-events-to-dev-env
- li-tcp-syslog-events-to-prod-env
- li-udp-syslog-events-to-prod-env
- li-windows-events-to-prod-env
- sample-ecommerce-analytics
- sample-incident-clustering
- sample-vm-analytics
- sample-vrops-alert-analytics
Bot @dm:filter-using-dict
Bot Position In Pipeline: Sink
Filter rows using a dictionary. Action can be 'include' or 'exclude'
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
dict* | Text | Name of the saved dataset to be used as dictionary | |
src_key_cols* | Text | Comma separated list of column names in input to use for join | |
dict_key_cols* | Text | Comma separated list of column names in dict to use for join | |
action | Text | include | Must be one of 'include' or 'exclude'. Include means keep the rows that match the dictionary, else drop the rows that match the dictionary. |
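The include/exclude semantics can be sketched as a key-set membership test (illustrative names; the real bot joins on the comma separated key columns):

```python
def filter_using_dict(rows, dict_rows, src_key_cols, dict_key_cols, action="include"):
    """Sketch of dictionary filtering: build the set of key tuples from
    the dictionary, then keep (include) or drop (exclude) the input
    rows whose key tuple appears in that set."""
    keys = {tuple(d[c] for c in dict_key_cols) for d in dict_rows}
    if action == "include":
        return [r for r in rows if tuple(r[c] for c in src_key_cols) in keys]
    return [r for r in rows if tuple(r[c] for c in src_key_cols) not in keys]
```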
Bot @dm:find-affected-child-nodes
Bot Position In Pipeline: Sink
Traverse CMDB relationship like table to identify potentially affected child nodes for each parent node
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
impacted_classes* | Text | Comma separated list of impacted classes in output. Ex: 'Server,Virtual Machine Instance' | |
max_depth | Text | 3 | Max number of hops from parent node |
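The traversal amounts to a breadth-first walk over (parent, child) relationship pairs, bounded by `max_depth` hops. A sketch under that assumption (names illustrative; the real bot also filters results by `impacted_classes`):

```python
from collections import deque

def affected_children(edges, parent, max_depth=3):
    """Sketch of the relationship traversal: breadth-first walk from the
    given parent node, collecting child nodes reachable within
    max_depth hops."""
    children = {}
    for p, c in edges:
        children.setdefault(p, []).append(c)
    seen, queue, affected = {parent}, deque([(parent, 0)]), []
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the hop limit
        for child in children.get(node, []):
            if child not in seen:
                seen.add(child)
                affected.append(child)
                queue.append((child, depth + 1))
    return affected
```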
Bot @dm:find-and-replace
Bot Position In Pipeline: Sink
Search data for the given condition and replace column value for the specified column name
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
condition | Text | One or more search queries to fetch the records whose data will be replaced | |
column_name* | Text | Specify list of column names to replace the data | |
column_value* | Text | Specify the list of column values to replace for the specified column names | |
replace_if_column_exist | Text | Specify the column name to replace the data, only if this column exists | |
sep | Text | Specify the separator used to list multiple conditions, column names & column values |
Bot @dm:fixcolumns
Bot Position In Pipeline: Sink
Fix column names such that they contain only allowed characters
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
include | Text | Column name regex pattern to fix in the output, remaining columns are left as is |
Example usage:
Playground
Try this in RDA Playground (rda_docs_env)
Example Pipelines Using this Bot
Bot @dm:fixnull
Bot Position In Pipeline: Sink
Replace null values in a comma separated column list
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
columns* | Text | Comma separated list of column names | |
value | Text | Value to be replaced with, default is empty string | |
apply_for_empty | Text | no | Apply value for empty strings ('yes' or 'no') |
Example usage:
Playground
Bot @dm:fixnull-regex
Bot Position In Pipeline: Sink
Replace null values in all columns that match the specified regular expression
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
columns | Text | .* | Regular expression for column names |
value | Text | Value to be replaced with, default is empty string | |
apply_for_empty | Text | no | Apply value for empty strings ('yes' or 'no') |
Example Pipelines Using this Bot
Bot *dm:functions
Bot Position In Pipeline: Source
List of functions available for mapping in 'map' bots
This bot expects a Full CFXQL.
Bot applies the Query on the data that is already loaded from previous bot or from a source.
Bot @dm:gc
Bot Position In Pipeline: Sink
Perform immediate garbage collection. Useful when dealing with very large datasets.
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
generation | Text | 2 | Generation parameter. Must be 0 or 1 or 2 |
Example usage:
Playground
Try this in RDA Playground (rda_docs_env)
Bot @dm:generate-metric-stats
Bot Position In Pipeline: Source
Generate usage stats (ex: hourly) for a given period of time
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
stream* | Text | Name of the Persistent Stream | |
cfxql_query | Text | * | CFXQL query to filter stream data. |
ts_column | Text | timestamp | Timestamp column name in the stream |
groupby* | Text | Comma separated list of column names to do the grouping. For better performance, have the first item in the group be the asset (ex: asset_id) for which the stats are generated | |
column* | Text | Name of the column that has the metric data value | |
threshold | Text | 90 | Metric (ex: CPU) Usage % alarm threshold |
threshold_type | Text | over | Possible values: over or under. Use this to specify whether the alarm condition is the value going over or under the provided threshold |
clear_threshold | Text | 75 | Metric (ex: CPU) Usage % recovery threshold |
bucket | Text | HOUR | Duration bucket. For now, only HOUR and MONTH are supported. Metrics insights/analysis (which provides recommendations based on thresholds) is available only for HOUR |
freq | Text | MONTH | Frequency for data collection. For now, only MONTH is supported |
skip_below_threshold | Text | yes | Skip processing groups which haven't crossed the threshold even once. Set it to 'no' to process all groups. |
max_value | Text | 100 | Provide the maximum value possible for the metric. For metrics that report a %, the default of 100 is sufficient. This helps in determining the value relative to the max value for threshold analysis |
chunk_size | Text | 1000 | Number of rows to fetch in each chunk when we retrieve data to generate stats. Use a larger number when dealing with more data points and relatively little data per row. Do not use more than 5000. Avoid using less than 1000 |
generate_alarm_times | Text | no | Set it to 'yes' to generate alarm times details which include the day of the week with counts. For example, this could be used to create suppression policy. Note: This could cause the bot to run very slow |
Bot @dm:get-from-location
Bot Position In Pipeline: Source Sink
Retrieve dataset from a specified location
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
path* | Text | Path to object in Minio bucket. | |
empty_as_null | Text | yes | While reading the saved datasets, treat empty strings as Nulls. 'yes' or 'no' |
_skip_errors | Text | no | Specify 'yes' or 'no'. If 'yes', do not bailout if any of the processing results in error. |
Example usage:
Bot @dm:get-tagged-dataset
Bot Position In Pipeline: Source
List the datasets that are tagged with given tag name
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
tag_name* | Text | Tag name to get list of tagged bounded dataset |
Bot @dm:grok
Bot Position In Pipeline: Sink
Extract data using Grok syntax from a single column
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
column* | Text | Column data which need to be parsed with grok pattern | |
pattern* | Text | Grok pattern. For more than one pattern to be used, use | (pipe without spaces) between the patterns. | |
_skip_errors | Text | no | Specify 'yes' or 'no'. If 'yes', do not bailout if any of the row processing results in error. Check meta_grok_message field when it continues with an error. |
exclude_unmatched_columns | Text | no | Specify 'yes' or 'no'. If 'yes', don't include unmatched columns. |
Pre-built grok patterns are listed here.
Example usage:
Playground
Try this in RDA Playground (rda_docs_env)
Example Pipelines Using this Bot
Bot @dm:grok-multi-proc
Bot Position In Pipeline: Sink
Extract data using Grok syntax from a single column. Uses all available CPU cores to do parallel processing.
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
column* | Text | Column data which need to be parsed with grok pattern | |
pattern* | Text | Grok pattern. For more than one pattern to be used, use | (pipe without spaces) between the patterns. | |
_max_procs | Text | 0 | Maximum number of CPUs to use. 0 means all available CPUs. Value should be >= 1 |
_skip_errors | Text | no | Specify 'yes' or 'no'. If 'yes', do not bailout if any of the row processing results in error. Check meta_grok_message field when it continues with an error. |
exclude_unmatched_columns | Text | no | Specify 'yes' or 'no'. If 'yes', don't include unmatched columns. |
Pre-built grok patterns are listed here.
Example usage:
Playground
Try this in RDA Playground (rda_docs_env)
Bot @dm:groupby
Bot Position In Pipeline: Sink
Group rows using specified 'columns' and specified 'agg' function
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
columns* | Text | Comma separated list of column names to do the grouping | |
agg | Text | count | Aggregation function. Default 'count'. For multiple aggregations, specify comma separated values |
Example usage:
Playground
Example Pipelines Using this Bot
- sample-ecommerce-analytics
- sample-incident-analytics
- sample-incident-clustering
- sample-vm-analytics
- sample-vrops-alert-analytics
Bot @dm:head
Bot Position In Pipeline: Sink
Get first 'n' rows
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
n | Text | 10 | Number of rows to retain from the head position |
Example usage:
Playground
See Data Guide for more details
Example Pipelines Using this Bot
Bot @dm:hist
Bot Position In Pipeline: Sink
Create histogram using 'timestamp' column and use 'interval' binning
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
timestamp* | Text | Timestamp column | |
interval* | Text | Interval expressed with days, hours, minutes, seconds. Ex: '1d, 4h, 15min' |
Example usage:
Playground
Bot @dm:hist-groupby
Bot Position In Pipeline: Sink
Perform Groupby and then create histogram using 'timestamp' column and use 'interval' binning
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
timestamp* | Text | Timestamp column | |
interval* | Text | Interval expressed with days, hours, minutes, seconds. Ex: '1d, 4h, 15min' | |
groupby* | Text | Comma separated list of columns to do the grouping | |
align | Text | yes | Align all metrics to same start and end time. Specify 'yes' or 'no' |
Example Usage
Bot @dm:identity-discovery
Bot Position In Pipeline: Sink
Discover identities in the input dataset
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
idtypes | Text | Comma separated list of Identity types. Default is all identities (ex: ipaddress) |
Example usage:
Bot @dm:implode
Bot Position In Pipeline: Sink
Implode 'merge_columns' into a comma separated list
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
key_columns* | Text | Comma separated list of primary key columns | |
merge_columns* | Text | Comma separated list of columns to merge | |
merge_sep | Text | , | Merge value using specified separator, default is comma |
dedup_merge_values | Text | yes | Dedup merge values (yes or no) |
keep_columns | Text | Comma separated list of columns to keep after the merge |
Example usage:
Playground
Example Pipelines Using this Bot
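Implode is the inverse of explode: rows sharing the key columns collapse into one row, with the merge columns joined by the separator. A sketch (names illustrative; order-preserving dedup is assumed):

```python
def implode(rows, key_columns, merge_columns, merge_sep=",", dedup=True):
    """Sketch of imploding rows: group rows by the key columns and join
    each merge column's values into one separated string, optionally
    deduplicating while preserving first-seen order."""
    groups = {}
    for row in rows:
        key = tuple(row[c] for c in key_columns)
        groups.setdefault(key, []).append(row)
    out = []
    for key, members in groups.items():
        merged = {c: k for c, k in zip(key_columns, key)}
        for col in merge_columns:
            values = [str(m[col]) for m in members]
            if dedup:
                values = list(dict.fromkeys(values))  # order-preserving dedup
            merged[col] = merge_sep.join(values)
        out.append(merged)
    return out
```

For example, three alerts for host `a` (`cpu`, `mem`, `cpu`) implode into one row with `alert = cpu,mem` when dedup is enabled.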
Bot @dm:ingest-from-location
Bot Position In Pipeline: Source
This is a bot that ingests data once from the files that match the file name pattern, in a chunked manner. Currently supported data formats: csv, json, parquet, orc
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
external_storage_credential_name | Text | Name of the predefined credential for the external S3 or Minio. If not specified, it is assumed to be the local platform Minio | |
object_prefix | Text | Applicable for local platform minio | |
filename_pattern* | Text | File criteria in regex format | |
num_rows | Text | 1000 | Number of rows to fetch in each chunk |
max_rows | Text | Read until this limit is reached | |
max_data_size_mb | Text | Read until this limit is reached. This can also be a fraction. | |
format | Text | Format is either csv/json/parquet/orc. If not specified, it will be derived from extension | |
line_read | Text | yes | Only applicable for JSON. By default file is read as a json object per line. If you want to load the whole file as JSON, set it to 'no' |
Example usage:
Bot @dm:json-to-html
Bot Position In Pipeline: Sink
Converts JSON column in the input dataset to HTML table.
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
output_column* | Text | Name of the output column which captures the input column as HTML Table. | |
input_column* | Text | Name of the input column which contains JSON data. | |
include_column_name | Text | no | Include the column name in the HTML table. |
parent_key_location | Text | top | For nested JSON, include the parent key at either 'side' or 'top' of the table. |
Bot @dm:list-all-schemas
Bot Position In Pipeline: Source Sink
This is a bot that fetches all JSON schemas added in the system and lists the metadata of the schemas as a dataframe
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Bot @dm:list-from-location
Bot Position In Pipeline: Source
List datasets in a specified location
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
location* | Text | Location in Minio bucket. |
Example usage:
Playground
Try this in RDA Playground (rda_docs_env)
Bot @dm:load-bookmark
Bot Position In Pipeline: Source Sink
Load a previously saved bookmark
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
name* | Text | Name of the bookmark to load | |
default | Text | Default value to use if the named bookmark does not exist |
Example usage:
Bot @dm:load-ml-dataset
Bot Position In Pipeline: Source Sink
Load the ML Datasets
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
pipeline_type* | Text | description | ML Pipeline type. Must be one of 'Clustering', 'Regression', 'Classification' |
tmp_path | Text | Temporary directory path where datasets are to be stored | |
minio_path | Text | Minio path to get datasets from |
This bot also accepts wildcard parameters. Any additional name = 'value' parameters are passed to the bot.
Bot @dm:load-ml-model
Bot Position In Pipeline: Source Sink
Load the ML Model with 'name' and 'model_type'
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
name* | Text | Name of the ML Model to load | |
model_type* | Text | ML Model type. Must be one of 'clustering', 'regression', 'classification' |
Bot @dm:load-template
Bot Position In Pipeline: Source Sink
Load the formatting template with 'name'
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
name* | Text | Name of the template to load |
Bot @dm:logarchive-replay
Bot Position In Pipeline: Source
Read the data from given archive for a specified time interval
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
repo* | Text | Name of the Log Archive repository. Mandatory. | |
archive* | Text | Name of the archive within the repository. Mandatory. | |
from | Text | Date & Time in text format. Ex: ISO format. Must be in UTC timezone. | |
to | Text | Date & Time in text format. Ex: ISO format. Must be in UTC timezone. | |
minutes | Text | 15 | Number of minutes of the data to replay. Must be >= 0. This field will be ignored if 'to' is specified |
max_rows | Text | 0 | Maximum rows to replay. If not specified, will replay all data in the specified intervals. If specified and > 0, it will stop once specified rows have been read. |
speed | Text | 1.0 | Speed at which to replay the events. 1 means close to original speed. < 1 means slower than original. > 1 means faster than original. This is an approximate and cannot be guaranteed. 0 means no introduced latency and try to replay as fast as possible. |
batch_size | Text | 100 | Number of rows to return for each iteration. |
label | Text | Label for the replay. Used for reports to identify various replay actions. |
Bot @dm:logarchive-save
Bot Position In Pipeline: Sink
Save the log data in given archive of given repository
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
repo* | Text | Name of the Log Archive repository. Mandatory. | |
archive* | Text | Name of the archive within the repository. Mandatory. |
Bot @dm:manipulate-string
Bot Position In Pipeline: Sink
Manipulate the column values
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
from | Text | Get the value from the specified column. If 'func' is set to 'eval', 'from' is not mandatory. | |
func* | Text | Supported functions are: [strip, substring, lower, upper, split, join, match, length, eval, lstrip, rstrip, count_value, replace, concat_columns] | |
lower_limit | Text | If func is 'substring', specify the lower limit from which the string should be extracted | |
upper_limit | Text | If func is 'substring', specify the upper limit up to which the string should be extracted | |
value | Text | , | Specify regex pattern(s) when 'func' is set to 'match'; specify the separator when 'func' is set to 'split'. Extracted values are assigned to the 'to' column. Default value is comma |
to* | Text | Store the value to the specified column |
Bot @dm:map
Bot Position In Pipeline: Sink
Inline mapping of columns 'from' using 'func' and save output to 'to' column
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
from | Text | Get the value from the specified column or columns (comma separated) | |
to | Text | Store the mapped value into the specified column | |
attr | Text | If from & to is same variable, attr can be used instead | |
func | Text | Function to use during mapping | |
_skip_errors | Text | no | Specify 'yes' or 'no'. If 'yes', do not bailout if any of the row processing results in error. Check meta_map_message field when it continues with an error. |
This bot also accepts wildcard parameters. Any additional name = 'value' parameters are passed to the bot.
Example Pipelines Using this Bot
- aws-dependency-mapper-inner-pipeline
- dli-generate-synthetic-syslogs
- ebonding-stream-to-pagerduty
- ebonding-stream-to-twilio-sms-v2
- sample-ml-classification-prediction
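Example usage (a sketch; the column names and mapping function shown are illustrative assumptions, not taken from this reference):

```
@dm:map from = 'severity' and to = 'severity_normalized' and func = 'lower'
```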
Bot @dm:map-multi-proc
Bot Position In Pipeline: Sink
Inline mapping of columns 'from' using 'func' and save output to 'to' column. Uses all available CPU cores to do parallel processing.
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
from | Text | Get the value from the specified column or columns (comma separated) | |
to | Text | Store the mapped value into the specified column | |
attr | Text | If from & to is same variable, attr can be used instead | |
func | Text | Function to use during mapping | |
_max_procs | Text | 0 | Maximum number of CPUs to use. 0 means all available CPUs. Value should be >= 1 |
_skip_errors | Text | no | Specify 'yes' or 'no'. If 'yes', do not bailout if any of the row processing results in error. Check meta_map_message field when it continues with an error. |
This bot also accepts wildcard parameters. Any additional name = 'value' parameters are passed to the bot.
Bot @dm:map-snmp-trap-to-alert
Bot Position In Pipeline: Sink
Enrich an incoming SNMP trap with alert related information. (Typically called after apply-snmp-trap-template)
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
template_folder | Text | snmp_trap_alerts | Name of the folder for RDA Objects which contains the template to convert SNMP Trap to alerts. |
node_id_column | Text | rda_gw_client_ip | Column name that contains unique device id (default: rda_gw_client_ip) |
Bot @dm:mask
Bot Position In Pipeline: Sink
Partially or completely mask all values in specified columns
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
columns* | Text | Comma separated list of columns for which values will be masked | |
pos | Text | 5 | Position from which start the masking |
char | Text | # | Masking character |
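Example usage (a sketch; the column names are hypothetical):

```
@dm:mask columns = 'ssn,credit_card' and pos = '3' and char = '*'
```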
See Data Guide for more details
Bot @dm:math
Bot Position In Pipeline: Sink
This bot performs mathematical functions
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
columns | Text | Comma separated list of column names (mandatory for ceil, floor and median functions) | |
func* | Text | Function name to perform mathematical tasks. Available functions are (ceil|floor|median|row_count|column_count|month_range) | |
year_column | Text | Column name containing year value (mandatory only for month_range function) | |
month_column | Text | Column name containing month value (mandatory only for month_range function) |
Bot @dm:melt
Bot Position In Pipeline: Sink
Unpivot a DataFrame from wide to long format
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
id_cols* | Text | Column names separated by comma to be used as id | |
var_col_name | Text | Header for variable column | |
value_cols | Text | Column names separated by comma to be used as values. If not specified, uses all columns other than id columns | |
value_col_name | Text | value | Value column header |
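Example usage (a sketch; the column names are hypothetical):

```
@dm:melt id_cols = 'host,timestamp' and var_col_name = 'metric' and value_col_name = 'value'
```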
Bot @dm:mergecolumns
Bot Position In Pipeline: Sink
Merge columns using 'include' regex and/or 'exclude' regex into 'to' column
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
include* | Text | Column name regex pattern to include in the merge operation | |
exclude | Text | Column name regex pattern to exclude in the merged column | |
to* | Text | Output column name for merged column |
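Example usage (a sketch; the regex pattern and output column are hypothetical):

```
@dm:mergecolumns include = 'tag_.*' and to = 'all_tags'
```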
Bot @dm:metadata
Bot Position In Pipeline: Sink
Analyze metadata for the input dataset
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
include | Text | Column name regex pattern to include in the output (include patterns are matched first and then exclude) | |
exclude | Text | Columns to exclude from the analysis, specified as a regular expression |
Bot @dm:metric-corr
Bot Position In Pipeline: Sink
Computes correlation between columns specified as metric and value column
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
metric | Text | Comma separated list of columns. If none specified, it uses all columns other than value and timestamp. | |
timestamp* | Text | timestamp | Timestamp column name |
unit | Text | Timestamp unit. Default is None means timestamp is a string. other units are s,ms,ns | |
interval* | Text | Bucketization interval expressed with days, hours, minutes, seconds. Ex: '1d, 4h, 15min' | |
agg | Text | mean | Data aggregator function for interval data : mean,median,sum,min,max |
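Example usage (a sketch; the metric columns and interval are hypothetical):

```
@dm:metric-corr metric = 'device,metric_name' and timestamp = 'timestamp' and interval = '15min' and agg = 'mean'
```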
Bot @dm:metric-correlator
Bot Position In Pipeline: Sink
Computes correlation between metrics specified in metric label column
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
metric_label* | Text | Name of column which specifies metric label | |
timestamp_column* | Text | timestamp | Timestamp column name |
value_column | Text | Comma separated list of columns. If none specified, it uses all columns other than timestamp. | |
unit | Text | Timestamp unit. Default is None means timestamp is a string. other units are s,ms,ns | |
detrend | Text | yes | Detrend involves removing trend / seasonality from the data. Valid values are 'yes' or 'no'. Default is 'yes' |
correlation_threshold | Text | 0.5 | If correlation between two metrics is greater than this value then they are said to be correlated. Ranges between 0 to 1 |
Bot @dm:metric-statistical-analysis
Bot Position In Pipeline: Sink
Computes all statistical parameters for each metric
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
metric* | Text | Comma separated list of columns to identify each metric | |
timestamp* | Text | timestamp | Timestamp column name |
unit | Text | Timestamp unit. Default is None means timestamp is a string. other units are s,ms,ns | |
value* | Text | Metric value column | |
precision | Text | 1 | Number of decimals for each numerical value in the output |
anomaly_percentile | Text | If specified, compute number of anomalies above this percentile value. Value should be >0 and <=100. |
Bot #dm:ml-model-list
Bot Position In Pipeline: Source
List of saved ML Models
This bot expects a Full CFXQL.
Bot applies the Query on the data that is already loaded from previous bot or from a source.
Bot @dm:object-add
Bot Position In Pipeline: Source Sink
Add object to a folder
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
name* | Text | Object name | |
folder* | Text | Folder name on the object storage | |
input_file* | Text | File from which object will be added | |
description | Text | Description | |
overwrite | Text | yes | If file already exists, overwrite without prompting |
retention_days | Text | Retention days for the folder. If set to 0, the folder will be excluded from purging. Retention days will not be updated if the folder already exists |
Bot @dm:object-delete
Bot Position In Pipeline: Source Sink
Delete object from a folder
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
name* | Text | Object name | |
folder* | Text | Folder name on the object storage |
Bot @dm:object-delete-list
Bot Position In Pipeline: Sink
Delete list of objects
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
input_object_column* | Text | Column with object names |
Bot @dm:object-get
Bot Position In Pipeline: Source Sink
Get Object from a folder
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
name* | Text | Object name | |
folder* | Text | Folder name on the object storage | |
save_to_file | Text | Save the downloaded object to specified file | |
save_to_dir | Text | Save the downloaded object to specified directory |
Bot @dm:object-list
Bot Position In Pipeline: Source
List objects for a folder
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
folder | Text | Folder name on the object storage |
Bot @dm:object-to-content
Bot Position In Pipeline: Sink
Convert object pointers from a column into content
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
input_object_column* | Text | Name of the column in input that contains the object name | |
output_column* | Text | Column name where content will be inserted |
Bot @dm:object-to-file
Bot Position In Pipeline: Sink
Convert object pointers from a column into file
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
input_object_column* | Text | Name of the column in input that contains the objects | |
output_column* | Text | Column name where filenames need to be inserted |
Bot @dm:object-to-inline-img
Bot Position In Pipeline: Sink
Convert object pointers from a column into inline HTML img tags
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
input_object_column* | Text | Name of the column in input that contains the JPEG or PNG Image | |
output_column* | Text | Column name where HTML img tag code need to be inserted |
Bot @dm:parse-using-textfsm
Bot Position In Pipeline: Sink
Parse one or more rows of text data using the specified textfsm model
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
folder | Text | RDA Object folder in which textfsm model is defined. | |
object* | Text | RDA Object name in which textfsm model is defined | |
raw_data_col* | Text | Column name for input data in which raw data is expected | |
keep_cols | Text | Keep specified comma separated list of columns in output | |
status_col | Text | textfsm_status | Parsing status in the output |
Bot @dm:pivot-table
Bot Position In Pipeline: Sink
Creates Pivot Table with index and columns
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
columns* | Text | Column names to pivot separated by comma | |
index* | Text | Column names to use as index in pivot | |
value | Text | Name of value column. If not specified, will use all available columns other than index and columns above. | |
agg | Text | mean | Default is mean. User can specify sum, max, min, median. |
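Example usage (a sketch; the column names are hypothetical):

```
@dm:pivot-table columns = 'metric' and index = 'host' and value = 'value' and agg = 'max'
```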
Bot @dm:process-syslog-from-kv-list
Bot Position In Pipeline: Sink
Process syslogs that have information in a list of dicts and have RFC5424 payload
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
data_column* | Text | Column which has a list of dictionaries with multiple attributes: one attribute for the name and another for the value | |
key_attr | Text | name | Name of the attribute in data_column dictionary which indicates Key |
value_attr | Text | stringValue | Name of the attribute in data_column dictionary which indicates Value |
rfc5424_attr | Text | RFC5424 | Key name which indicates RFC5424 encoded syslog parameters. |
Bot #dm:pstream-delete-data-by-query
Bot Position In Pipeline: Sink
Delete the data in a persistent stream via CFXQL.
This bot expects Full CFXQL.
Bot translates the Query to native query of the Data source supported by this extension.
This is a data sink which expects the following input parameters to be passed via input dataframe.
Input Dataframe Column Name | Type | Default Value | Description |
---|---|---|---|
name* | Text | Name of the Persistent Stream | |
conflicts | Text | abort | What to do when the delete by query hits version conflicts. Valid choices: abort, proceed |
timeout | Text | 10 | Timeout in seconds to wait for response |
Bot #dm:pstream-update-data-by-query
Bot Position In Pipeline: Sink
Update the data in a persistent stream via CFXQL.
This bot expects Full CFXQL.
Bot translates the Query to native query of the Data source supported by this extension.
This is a data sink which expects the following input parameters to be passed via input dataframe.
Input Dataframe Column Name | Type | Default Value | Description |
---|---|---|---|
name* | Text | Name of the Persistent Stream | |
columns* | Text | Comma separated list of column names that need to be updated for all the records that match the query. Example: city,state,zipcode | |
values* | Text | Values to set for the specified column or columns (comma separated). The number of values should match the 'columns' list. Example: San Jose,CA,12345 | |
conflicts | Text | abort | What to do when the update by query hits version conflicts. Valid choices: abort, proceed |
timeout | Text | 10 | Timeout in seconds to wait for response |
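Example usage (a sketch of the input-dataframe pattern; the stream name, columns, and query are hypothetical, and @dm:empty / @dm:addrow are assumed here as a way to construct the one-row parameter dataframe):

```
@dm:empty
    --> @dm:addrow name = 'incident-stream' and columns = 'status' and values = 'closed'
    --> #dm:pstream-update-data-by-query city = 'San Jose'
```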
Bot #dm:query-persistent-stream
Bot Position In Pipeline: Sink
Query the data in a persistent stream via CFXQL.
This bot expects Full CFXQL.
Bot translates the Query to native query of the Data source supported by this extension.
This is a data sink which expects the following input parameters to be passed via input dataframe.
Input Dataframe Column Name | Type | Default Value | Description |
---|---|---|---|
name* | Text | Name of the Persistent Stream | |
max_rows | Text | 1000 | Max rows in each batch. Ignored when 'aggs' is used |
limit | Text | 1000 | Limit total output rows. If set to 0, will retrieve all rows from the stream that match the query. Ignored when 'aggs' is used |
sort_by_col | Text | Name of the column to sort | |
sort_type | Text | desc | Must be one of 'asc' or 'desc' |
aggs | Text | Specified as 'sum:field_name'. Supported functions are sum, cardinality, min, max, mean, value_count | |
groupby | Text | Comma separated list of columns to group by; used only when 'aggs' is used | |
max_aggregation_groups | Text | 1000 | Fetches up to 1000 aggregation groups by default. Larger values (10,000+) result in more memory use to compute |
retry_attempts_on_no_data | Text | 0 | Number of retries, with a 2 second wait per retry, when there is no data. Default is 0 |
A query of * implies match all (no filtering of data).
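Example usage (a sketch; the stream name is hypothetical, and @dm:empty / @dm:addrow are assumed here as a way to construct the one-row parameter dataframe):

```
@dm:empty
    --> @dm:addrow name = 'alerts-stream'
    --> #dm:query-persistent-stream *
```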
Supported Aggregations
Agg Function | Description |
---|---|
min | minimum value in the group. Supported on numeric values only |
max | maximum value in the group. Supported on numeric values only |
sum | sum of values in the group. Supported on numeric values only |
avg | average of the values in the group. Supported on numeric values only |
first | first value when sorted by ascending order for the field |
last | last value when sorted by ascending order for the field |
Bot @dm:query-persistent-stream-from-bookmark
Bot Position In Pipeline: Source
This is a streaming bot that reads one or more rows from the last bookmarked record in a persistent stream via CFXQL filter criteria
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
name* | Text | Name of the Persistent Stream | |
bookmark* | Text | Name of the bookmark | |
offset | Text | latest | Read data from the beginning or the current offset. This option is only applicable when the bookmark does not exist; once the bookmark offset is set, it cannot be changed. Default is 'latest' and is designed to work when the default sort by column '_RDA_Id' is used. The other option is 'earliest', which reads from the beginning. |
query | Text | * | CFXQL query to filter results. |
sort_by_col | Text | _RDA_Id | Comma separated list of column names to sort by. It is important that the set of sorting columns uniquely identifies a record, so that the last bookmarked record is unique. Default is _RDA_Id (internal unique id). Example: 'timestamp, some_unique_id' |
sort_type | Text | asc | Must be one of 'asc' or 'desc' |
max_rows | Text | 1000 | Max rows in each batch. |
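Example usage (a sketch; the stream and bookmark names are hypothetical):

```
@dm:query-persistent-stream-from-bookmark name = 'alerts-stream' and bookmark = 'alert-reader' and max_rows = '500'
```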
Bot @dm:query-persistent-stream-iterate-by-chunk
Bot Position In Pipeline: Source
Queries the data in a persistent stream and returns data in chunks
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
name* | Text | Name of the Persistent Stream | |
query | Text | * | CFXQL query to filter results. |
batch_size | Text | 1000 | Rows in each batch. Default is 1000 |
sort_by_col | Text | Name of the column to sort | |
sort_type | Text | desc | Must be one of 'asc' or 'desc' |
Bot @dm:query-persistent-stream-iterate-by-time
Bot Position In Pipeline: Source
Queries the data in a persistent stream and returns data in chunks based on time interval
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
name* | Text | Name of the Persistent Stream | |
from | Text | Date & Time must be in ISO format and in the UTC timezone. Example: 2023-06-14, 2023-06-14T05:00:00. If not provided, the earliest available date will be used | |
to | Text | Date & Time must be in ISO format and in the UTC timezone. Example: 2023-06-14, 2023-06-14T05:00:00. If not provided, the latest available date will be used | |
query | Text | * | CFXQL query to filter results. |
interval | Text | 1d | Interval expressed with days or hours or mins. Ex: '1d' or '4h' or '30min' |
timestamp_column | Text | timestamp | Name of timestamp column. Default is 'timestamp' |
aggs | Text | Specified as 'function:field_name'. Supported functions are sum, cardinality, min, max, mean, value_count | |
groupby | Text | Comma separated list of columns to groupby; used only when 'aggs' is used | |
include_time_intervals | Text | no | Include from and to interval used for each iteration. The column names will be 'from_interval' and 'to_interval' |
chunk_size | Text | 1000 | Number of rows to fetch in each chunk when we retrieve data. Use larger number if we are dealing with more data points with relatively less data per row. Do not use more than 5000. Avoid using less than 1000. This is not applicable when 'aggs' is used. |
max_aggregation_groups | Text | 1000 | Fetches up to 1000 aggregation groups by default. Larger values (10,000+) result in more memory use to compute. |
Bot @dm:recall
Bot Position In Pipeline: Source Sink
Recall (load) a previously saved dataset
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
name | Text | Name of the dataset to recall (mandatory if the 'tag' parameter is not given) | |
cache | Text | no | Cache the result for future recalls. 'yes' or 'no' |
cache_refresh_seconds | Text | 120 | Refresh the cache (if new update available) after specified seconds |
empty_as_null | Text | yes | While reading the saved datasets, treat empty strings as Nulls. 'yes' or 'no' |
return_empty | Text | no | Return an empty dataframe if an error occurs loading the dataset |
empty_df_columns | Text | Comma separated list of columns for empty dataframe | |
ignore_dtypes | Text | no | Ignore column data types during loading |
make_copy | Text | yes | Specifies whether to make copy of dataset for temp datasets. 'yes' or 'no' |
tag | Text | Name of the tag, return the first occurrence of the dataset having the given tag |
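Example usage (a sketch; the dataset name is hypothetical):

```
@dm:recall name = 'enriched-alerts'
```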
Example Pipelines Using this Bot
- aws-dependency-mapper-inner-pipeline
- dli-generate-synthetic-syslogs
- dli-process-synthetic-syslogs
- sample-ecommerce-analytics
- sample-formatting-template-example
- sample-incident-analytics
- sample-incident-clustering
- sample-ml-classification-prediction
- sample-vm-analytics
- sample-vrops-alert-analytics
Bot @dm:recall-chunked
Bot Position In Pipeline: Source
Recall (load) a previously saved dataset as a data stream. Loads num_rows in each chunk.
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
name* | Text | Name of the dataset to recall | |
num_rows* | Text | Number of rows to fetch in each chunk | |
empty_as_null | Text | yes | While reading the saved datasets, treat empty strings as Nulls. 'yes' or 'no' |
return_empty | Text | no | Return an empty dataframe if an error occurs loading the dataset |
empty_df_columns | Text | Comma separated list of columns for empty dataframe | |
ignore_dtypes | Text | no | Ignore column data types during loading |
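Example usage (a sketch; the dataset name and chunk size are hypothetical):

```
@dm:recall-chunked name = 'large-dataset' and num_rows = '5000'
```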
Bot @dm:recall-query
Bot Position In Pipeline: Source Sink
Recall (load) a previously saved dataset using CFXQL query
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
name* | Text | Name of the dataset to recall | |
empty_as_null | Text | yes | While reading the saved datasets, treat empty strings as Nulls. 'yes' or 'no' |
return_empty | Text | no | Return an empty dataframe if an error occurs loading the dataset |
empty_df_columns | Text | Comma separated list of columns for empty dataframe | |
make_copy | Text | yes | Specifies whether to make copy of dataset for temp datasets. 'yes' or 'no' |
query | Text | * | CFXQL query to filter results. |
max_rows | Text | Limit the number of rows to return. |
Bot @dm:relations-child-to-parent-paths
Bot Position In Pipeline: Sink
Traverse a CMDB relationship-like table to identify all possible paths from each child to all parent node(s)
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
child_col* | Text | Input column name which identifies a child column in a relationship table | |
parent_col* | Text | Input column name which identifies a parent column in a relationship table |
Bot @dm:relations-parent-to-children-paths
Bot Position In Pipeline: Sink
Traverse a CMDB relationship-like table to identify all possible paths from each parent to all child node(s)
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
child_col* | Text | Input column name which identifies a child column in a relationship table | |
parent_col* | Text | Input column name which identifies a parent column in a relationship table |
Bot @dm:rename-columns
Bot Position In Pipeline: Sink
Rename specified column names using new_column_name = 'old_column_name' format
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
This bot only accepts wildcard parameters. All name = 'value' parameters are passed to the bot.
Bot @dm:replace-data
Bot Position In Pipeline: Sink
Replace data using 'expr' regex pattern from 'columns'
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
columns* | Text | Comma separated list of columns | |
expr* | Text | Regular expression to identify the part that need to be replaced | |
replace | Text | Replace with this value. If not specified, replaces with empty string |
Example usage:
Playground
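The columns/expr/replace behavior can be sketched with Python's `re.sub` on dict rows (the real bot operates on a dataframe; this is an assumption-laden illustration, not its implementation):

```python
import re

# Sketch: apply a regex substitution to the listed columns of each row.
# 'replace' defaults to the empty string, matching the documented default.
def replace_data(rows, columns, expr, replace=""):
    pattern = re.compile(expr)
    for row in rows:
        for col in (c.strip() for c in columns.split(",")):
            if col in row and isinstance(row[col], str):
                row[col] = pattern.sub(replace, row[col])
    return rows

rows = [{"msg": "error: disk /dev/sda1 full"}]
print(replace_data(rows, "msg", r"/dev/\S+", "<device>"))
```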
Bot @dm:resample-timeseries
Bot Position In Pipeline: Sink
Resample time series data on the specified timestamp column using the provided aggregation function
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
ts_column* | Text | Timestamp column name. | |
value_column | Text | Comma separated list of columns. If none specified, it uses all columns other than timestamp. | |
interval | Text | 1H | Bucketization interval expressed with days, hours, minutes, seconds. Ex: '1D, 4H, 15min' |
agg | Text | sum | Value aggregation function. Valid values are sum, min, max, mean |
interpolate | Text | no | Specify 'yes' or 'no'. If 'yes' then interpolate missing values after aggregation |
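A minimal sketch of the bucketization idea, using epoch-second timestamps, a fixed interval in seconds, and 'sum' aggregation (interval strings like '1H' and the interpolate option are omitted; this is not the bot's implementation):

```python
from collections import defaultdict

# Sketch: bucket rows into fixed time intervals on the timestamp column
# and sum the value column within each bucket.
def resample(rows, ts_column, value_column, interval_seconds=3600):
    buckets = defaultdict(float)
    for row in rows:
        bucket = (row[ts_column] // interval_seconds) * interval_seconds
        buckets[bucket] += row[value_column]
    return [{ts_column: ts, value_column: v}
            for ts, v in sorted(buckets.items())]

rows = [{"ts": 0, "hits": 5}, {"ts": 1800, "hits": 3}, {"ts": 3700, "hits": 2}]
print(resample(rows, "ts", "hits"))
```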
Bot @dm:row-delta
Bot Position In Pipeline: Sink
Compute the difference between two consecutive rows for the listed columns and replace their values with the result
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
value_columns* | Text | Comma separated list of numeric column field names | |
groupby | Text | Perform delta within each group. Comma separated list of columns to groupby. | |
skip_first_row | Text | no | The value columns in the first row of the result will be NaN, as there is no row above them. You can skip the first row in the result by setting this to 'yes' |
sort_columns | Text | timestamp | Comma separated list of column names. Default is 'timestamp' field |
sort_order | Text | ascending | Sorting order ('ascending' or 'descending'). For multiple sort orders, specify comma separated values. If sort order list is shorter than column list, the last sort order will be used for the remaining columns |
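The consecutive-row delta with skip_first_row can be sketched as follows (grouping and sorting options are omitted; None stands in for NaN; an illustrative sketch, not the bot's implementation):

```python
# Sketch: replace each value column with its difference from the
# previous row; the first row has no predecessor, so it gets None.
def row_delta(rows, value_columns, skip_first_row=False):
    cols = [c.strip() for c in value_columns.split(",")]
    out, prev = [], None
    for row in rows:
        new = dict(row)
        for c in cols:
            new[c] = None if prev is None else row[c] - prev[c]
        prev = row
        out.append(new)
    return out[1:] if skip_first_row else out

rows = [{"t": 1, "bytes": 100}, {"t": 2, "bytes": 150}, {"t": 3, "bytes": 180}]
print(row_delta(rows, "bytes", skip_first_row=True))
```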
Bot *dm:safe-filter
Bot Position In Pipeline: Sink
Apply safe CFXQL filtering on the data
This bot expects a Full CFXQL.
Bot applies the Query on the data that is already loaded from previous bot or from a source.
Bot @dm:sample
Bot Position In Pipeline: Sink
Randomly sample 'n' number of rows
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
n | Text | 1 | Number of rows to return in output using random sampling. This can also be a fraction between 0 to 1.0 to indicate fraction of input rows to return. |
re_use | Text | auto | Re-use already sampled row in output. Valid values are 'auto', 'yes' or 'no'. |
Example usage:
Playground
Example Pipelines Using this Bot
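The integer-vs-fraction semantics of 'n' can be sketched with the standard library (the 're_use' option is omitted; an assumption-based sketch, not the bot's implementation):

```python
import random

# Sketch: an integer n draws that many rows; a fraction in (0, 1)
# draws that share of the input, without re-use.
def sample(rows, n=1):
    k = int(len(rows) * n) if 0 < n < 1 else int(n)
    return random.sample(rows, min(k, len(rows)))

data = [{"id": i} for i in range(10)]
print(len(sample(data, 0.3)))
```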
Bot @dm:sample-groupby
Bot Position In Pipeline: Sink
Randomly sample 'n' number of rows within each group
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
columns* | Text | 1 | Comma separated list of column names to do the grouping. |
n | Text | 1 | Number of rows to return in each group using random sampling. This can also be a fraction between 0 to 1.0 to indicate fraction of input rows to return within each group. |
re_use | Text | auto | Re-use already sampled row in output. Valid values are 'auto', 'yes' or 'no'. |
Example usage:
Playground
Bot @dm:save
Bot Position In Pipeline: Sink
Save the dataset with 'name'
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
name* | Text | Name of the dataset to save | |
format | Text | csv | Data format on the object storage. Ignored for the temporary datasets. Must be one of 'csv' or 'parquet' |
publish | Text | Deprecated. Name of the tag to publish in cfxDimensions platform. Can be used only with Dimensions configuration. | |
append | Text | no | If set to 'yes', appends the input dataset as a chunk to the existing dataset if any. Valid values are 'yes', 'no' |
return_appended_dataset | Text | no | If set to 'yes' and append is 'yes', returns the full appended dataset. Valid values are 'yes', 'no' |
tag | Text | Name of the tag, appends the given tag name in the metadata |
Example usage:
Playground
Try this in RDA Playground(rda_docs_env)
Example Pipelines Using this Bot
- aws-dependency-mapper-inner-pipeline
- dli-generate-synthetic-syslogs
- dli-process-synthetic-syslogs
- sample-ecommerce-analytics
- sample-incident-analytics
- sample-ml-classification-prediction
- sample-vm-analytics
Bot @dm:save-bookmark
Bot Position In Pipeline: Sink
Save the bookmark with 'name'
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
name* | Text | Name of the bookmark to save | |
value_column* | Text | Name of value column | |
value_type | Text | timestamp | Value type (timestamp, numeric, text) |
ts_format | Text | Format when value_type is timestamp. Valid units are s, ms, ns, or null for string format. | |
value_func | Text | max | Value functions are first, last, min, max |
Example usage:
Playground
Try this in RDA Playground(rda_docs_env)
Bot @dm:save-ml-dataset
Bot Position In Pipeline: Source Sink
Save the ML Datasets
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
pipeline_type* | Text | description | ML Pipeline type. Must be one of 'clustering', 'regression', 'classification' |
tmp_path | Text | Temporary directory path where datasets are stored | |
minio_path | Text | Minio path to store datasets |
This bot also accepts wildcard parameters. Any additional name = 'value' parameters are passed to the bot.
Bot @dm:save-ml-model
Bot Position In Pipeline: Source Sink
Save the ML Model with 'name' and 'model_type'
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
name* | Text | Name of the ML Model to save | |
model_type* | Text | ML Model type. Must be one of 'clustering', 'regression', 'classification' | |
description | Text | Description of the ML Model | |
model_data_path* | Text | ML Model file path |
This bot also accepts wildcard parameters. Any additional name = 'value' parameters are passed to the bot.
Bot @dm:save-template
Bot Position In Pipeline: Sink
Save the formatting template with 'name'
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
name* | Text | Name of the template to save | |
description_col | Text | description | Template description column |
content_col | Text | content | Template content column |
content_type_col | Text | content_type | Template content type column |
Bot @dm:save-to-location
Bot Position In Pipeline: Sink
Save the dataset to a specified location
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
name | Text | Name of the dataset to save. If not provided, will be extracted from location | |
format | Text | Data format on the object storage. Supported formats are csv, parquet, gzip, json and zip. If not provided, will be extracted from location | |
location | Text | Location in Minio bucket to save the object. | |
ignore_index | Text | no | Ignore index columns while saving file to location. Possible values are 'yes' or 'no'. (Applicable only for csv format) |
Example usage:
Bot *dm:savedlist
Bot Position In Pipeline: Source
List of saved datasets
This bot expects a Full CFXQL.
Bot applies the Query on the data that is already loaded from previous bot or from a source.
Example usage:
Playground
Try this in RDA Playground(rda_docs_env)
Bot @dm:selectcolumns
Bot Position In Pipeline: Sink
Select columns using 'include' regex and/or 'exclude' regex
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
include | Text | Column name regex pattern to include in the output (include patterns are matched first and then exclude) | |
exclude | Text | None | Column name regex pattern to exclude in the output (include patterns are matched first and then exclude) |
Example usage:
Playground
Example Pipelines Using this Bot
- aws-dependency-mapper-inner-pipeline
- ebonding-stream-to-pagerduty
- ebonding-stream-to-twilio-sms-v2
- sample-cato-networks-graphql
- sample-ecommerce-analytics
- sample-formatting-template-example
- sample-ml-classification-prediction
- sample-mondaydotcom-graphql
- sample-vm-analytics
- sample-vrops-alert-analytics
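The include-then-exclude column selection can be sketched in Python (whether the bot anchors the regex or searches within names is not documented here, so `re.fullmatch` is an assumption):

```python
import re

# Sketch: keep a column if it matches the include pattern and does not
# match the exclude pattern; include is tested first, as documented.
def select_columns(rows, include=".*", exclude=None):
    def keep(name):
        if not re.fullmatch(include, name):
            return False
        if exclude and re.fullmatch(exclude, name):
            return False
        return True
    return [{k: v for k, v in row.items() if keep(k)} for row in rows]

rows = [{"cpu_user": 1, "cpu_sys": 2, "mem_used": 3}]
print(select_columns(rows, include="cpu_.*", exclude=".*_sys"))
```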
Bot @dm:set-tracing-context
Bot Position In Pipeline: Source Sink
Set the tracing context using name = value pairs
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
This bot only accepts wildcard parameters. All name = 'value' parameters are passed to the bot.
Bot @dm:set-tracing-context-from-input
Bot Position In Pipeline: Sink
Set the tracing context using input dataframe column values from first row
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
columns* | Text | Comma separated column names in input dataframe, to be propagated to context |
Bot @dm:skip-block-if-shape
Bot Position In Pipeline: Sink
Skip rest of the current block if input dataframe shape meets criteria expressed using 'row_count' & 'column_count'
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
row_count | Text | Number of rows in input dataframe. This variable accepts all numeric operations. | |
column_count | Text | Number of columns in input dataframe. This variable accepts all numeric operations. |
Example usage:
Playground
Try this in RDA Playground(rda_docs_env)
Example Pipelines Using this Bot
- li-filebeat-events-to-prod-env
- li-http-events-to-prod-env
- li-tcp-syslog-events-to-dev-env
- li-tcp-syslog-events-to-prod-env
- li-udp-syslog-events-to-prod-env
- li-windows-events-to-prod-env
Bot @dm:skip-pipeline-if-shape
Bot Position In Pipeline: Sink
Skip rest of the pipeline if input dataframe shape meets criteria expressed using 'row_count' & 'column_count'
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
row_count | Text | Number of rows in input dataframe. This variable accepts all numeric operations. | |
column_count | Text | Number of columns in input dataframe. This variable accepts all numeric operations. |
Example usage:
Playground
Try this in RDA Playground(rda_docs_env)
Bot @dm:sleep
Bot Position In Pipeline: Sink
Wait for a specified number of seconds before executing next step. Useful for timed loops.
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
seconds | Text | Wait time in seconds, must be > 0, fractional seconds are allowed |
Example usage:
Playground
Try this in RDA Playground(rda_docs_env)
Bot @dm:sort
Bot Position In Pipeline: Sink
Sort values using 'columns' with 'order'
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
columns | Text | Comma separated list of column names | |
order | Text | ascending | Sorting order ('ascending' or 'descending'). For multiple sort orders, specify comma separated values. If sort order list is shorter than column list, the last sort order will be used for the remaining columns |
Example usage:
Playground
Example Pipelines Using this Bot
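The rule that a shorter sort-order list reuses its last entry for the remaining columns can be sketched via a stable multi-key sort (an illustration under assumed row shapes, not the bot's implementation):

```python
# Sketch: sort by multiple columns; if fewer orders than columns are
# given, the last order is repeated. Sorting least-significant column
# first exploits Python's stable sort for a multi-key result.
def sort_rows(rows, columns, order="ascending"):
    cols = [c.strip() for c in columns.split(",")]
    orders = [o.strip() for o in order.split(",")]
    orders += [orders[-1]] * (len(cols) - len(orders))
    for col, ordr in reversed(list(zip(cols, orders))):
        rows = sorted(rows, key=lambda r: r[col],
                      reverse=(ordr == "descending"))
    return rows

rows = [{"a": 1, "b": 2}, {"a": 1, "b": 3}, {"a": 0, "b": 1}]
print(sort_rows(rows, "a,b", "ascending,descending"))
```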
Bot @dm:span-creator
Bot Position In Pipeline: Sink
Create Spans from input timeseries data
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
unique_id* | Text | Comma separated list of columns | |
starttime* | Text | Start time column name | |
endtime* | Text | End time column name | |
unit | Text | Timestamp unit. Default is None, meaning the timestamp is a string. Other units are s, ms, ns | |
status* | Text | Status column name | |
data_label* | Text | Label for data in spans | |
filter | Text | Comma separated list of filter columns | |
incident_id* | Text | Incident id column name | |
interval_start | Text | 22 | Interval start days/seconds/microseconds/milliseconds/minutes/hours/weeks |
interval_end | Text | 2 | Interval end days/seconds/microseconds/milliseconds/minutes/hours/weeks |
interval_unit | Text | hours | Interval unit days/seconds/microseconds/milliseconds/minutes/hours/weeks |
Bot *dm:stack-connected-nodes
Bot Position In Pipeline: Source Sink
Find all connected nodes for the previously selected nodes on the stack
This bot expects a Full CFXQL.
Bot applies the Query on the data that is already loaded from previous bot or from a source.
Bot @dm:stack-create
Bot Position In Pipeline: Source Sink
Create stack from input topology dataset
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
topology_nodes* | Text | Name of topology nodes dataset | |
topology_edges | Text | Name of topology edges dataset | |
name* | Text | Name for stack |
Example Pipelines Using this Bot
Bot *dm:stack-filter
Bot Position In Pipeline: Sink
Filter stack Nodes/Edges based on previously selected Nodes/Edges on stack
This bot expects a Full CFXQL.
Bot applies the Query on the data that is already loaded from previous bot or from a source.
Bot @dm:stack-find-impact-distances
Bot Position In Pipeline: Source Sink
Search a saved stack using asset-dependency service and get impact distances from the specified nodes
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
stack_name* | Text | Name of previously saved stack. This bot will use asset-dependency service to load the specified stack and perform a search | |
search_for* | Text | Comma separated list of values to search | |
attr_names* | Text | Comma separated list of attribute names. The values specified in 'search_for' are searched in this list of attribute names | |
node_types | Text | Comma separated list of node types to search | |
exclude_node_types | Text | Comma separated list of node types to exclude | |
depth | Text | 10 | Maximum depth from the selected nodes |
operation | Text | equals | Type of value comparison operation. Must be one of 'equals', 'contains', or 'matches' |
ignore_case | Text | yes | Ignore case while doing the search. Must be one of 'yes' or 'no' |
max_matches | Text | 1 | Maximum number of matches per each search. |
timeout | Text | 120 | Timeout in seconds |
Bot @dm:stack-generate
Bot Position In Pipeline: Source Sink
Generate stack from input topology dataset
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
input_stack_name* | Text | Name of input stack json to generate stack from | |
app_type | Text | OIA | Name of application to publish stack to |
output_stack_name* | Text | Name of the output stack to publish | |
room_id* | Text | Location to publish stack to | |
incident_id* | Text | Comma separated Incident id under which stack will be referenced | |
incident_summary* | Text | Comma separated summary for incidents | |
fqdn* | Text | FQDN | |
url* | Text | URL | |
ip_address* | Text | IP Address |
Bot *dm:stack-impacted-nodes
Bot Position In Pipeline: Source Sink
Perform impact analysis on the previously selected nodes on the stack
This bot expects a Full CFXQL.
Bot applies the Query on the data that is already loaded from previous bot or from a source.
Bot @dm:stack-join
Bot Position In Pipeline: Source Sink
Join the input stack with a target stack, using the provided filters, to create a new stack
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
target_stack* | Text | Name of target stack to join | |
filter* | Text | Name of dictionary dataset with filters | |
name* | Text | Name of new stack |
Bot *dm:stack-list
Bot Position In Pipeline: Source
List of application Stacks
This bot expects a Full CFXQL.
Bot applies the Query on the data that is already loaded from previous bot or from a source.
Bot @dm:stack-load
Bot Position In Pipeline: Source Sink
Load application Stack specified by 'name'
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
name* | Text | Name of the Stack to load |
Bot @dm:stack-save
Bot Position In Pipeline: Source Sink
Save application Stack specified by 'name'
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
name* | Text | Name of the Stack to save | |
description | Text | Stack description | |
stack_data_column | Text | data | Column name where stack definition is stored in dataframe |
additional_nodes_rules | Text | Name of dataset with rules to create and attach new node to matching existing nodes. |
Example Pipelines Using this Bot
Bot @dm:stack-search
Bot Position In Pipeline: Source Sink
Search a saved stack using asset-dependency service
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
stack_name* | Text | Name of previously saved stack. This bot will use asset-dependency service to load the specified stack and perform a search | |
search_for* | Text | Comma separated list of values to search | |
attr_names* | Text | Comma separated list of attribute names. The values specified in 'search_for' are searched in this list of attribute names | |
node_types | Text | Comma separated list of node types to search | |
exclude_node_types | Text | Comma separated list of node types to exclude | |
depth | Text | 2 | Maximum depth from the selected nodes |
operation | Text | equals | Type of value comparison operation. Must be one of 'equals', 'contains', or 'matches' |
ignore_case | Text | yes | Ignore case while doing the search. Must be one of 'yes' or 'no' |
max_matches | Text | 1 | Maximum number of matches per each search. |
timeout | Text | 120 | Timeout in seconds |
result_stack_name | Text | Name of the result stack |
Bot *dm:stack-select-nodes
Bot Position In Pipeline: Sink
Select Nodes from stack based on provided criteria
This bot expects a Full CFXQL.
Bot applies the Query on the data that is already loaded from previous bot or from a source.
Bot @dm:stack-unselect-nodes
Bot Position In Pipeline: Source Sink
Unselect nodes in stack of given node types if there is no right link available
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
node_type* | Text | Type of nodes to be unselected if there is no right link available |
Bot @dm:staging-read
Bot Position In Pipeline: Source
This is a streaming bot that reads one or more rows, in a chunked manner, from the files that match the criteria in the specified staging area. Ingestion is currently supported for the following data formats: csv, json, parquet, orc
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
name* | Text | Staging area name | |
num_rows | Text | 1000 | Number of rows to fetch in each chunk |
format | Text | Format is either csv/json/parquet/orc. If not specified, it will be derived from extension | |
line_read | Text | yes | Only applicable for JSON. By default file is read as a json object per line. If you want to load the whole file as JSON, set it to 'no' |
Example usage:
Bot @dm:string-to-columns
Bot Position In Pipeline: Sink
Split a column and assign the extracted values to new columns
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
from* | Text | Specify the column name that has the string data | |
seps | Text | , | Specify the list of separators, example:|$@/#!. Default is comma. |
to* | Text | Specify the comma separated column names to which extracted strings need to be added as values. Once the 'from' column is split, the new columns are added to the existing dataframe, based on the order of the specified columns. | |
to_column_default | Text | Assign a value to 'to' column(s), when there is no string value extracted out of 'from' column, Default is None |
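The from/seps/to/to_column_default behavior can be sketched with a regex character class built from the separator list (a single-row illustration, not the bot's implementation):

```python
import re

# Sketch: split the 'from' column on any listed separator character and
# assign the pieces to the 'to' columns in order; missing or empty
# pieces fall back to the default value.
def string_to_columns(row, from_col, to_cols, seps=",", default=None):
    parts = re.split("[" + re.escape(seps) + "]", row.get(from_col, ""))
    for i, col in enumerate(c.strip() for c in to_cols.split(",")):
        row[col] = parts[i] if i < len(parts) and parts[i] else default
    return row

row = {"raw": "web-1|10.0.0.5"}
print(string_to_columns(row, "raw", "host,ip", seps="|"))
```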
Bot @dm:synthetic-dataset
Bot Position In Pipeline: Source
Generate a new dataframe with specified row_count and column_name = 'field_type' pairs
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
row_count* | Text | Number of rows for the output dataframe |
This bot also accepts wildcard parameters. Any additional name = 'value' parameters are passed to the bot.
List of supported synthetic data field types are listed here.
Example usage:
Playground
Try this in RDA Playground(rda_docs_env)
Bot *dm:synthetic-fields
Bot Position In Pipeline: Source
List of field types supported in synthetic data functions or bots
This bot expects a Full CFXQL.
Bot applies the Query on the data that is already loaded from previous bot or from a source.
Example usage:
Playground
Try this in RDA Playground(rda_docs_env)
Bot @dm:tail
Bot Position In Pipeline: Sink
Get last 'n' rows
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
n | Text | 10 | Number of rows to retain from the tail |
Example usage:
Playground
Bot @dm:tail-logs
Bot Position In Pipeline: Source
Listen to the log files and return any new lines added.
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
log_filename* | Text | Name of the Log file. Mandatory. |
Bot @dm:telegraf-parser
Bot Position In Pipeline: Sink
Parse the telegraf data that is passed through input dataset
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
tags_prefix | Text | tags | Add a prefix to the tags block after flattening. By default, 'tags' prefix will be added. |
fields_prefix | Text | Add a prefix to the fields block after flattening. By default, no prefix will be added. |
Bot *dm:template-list
Bot Position In Pipeline: Source
List of saved formatting templates
This bot expects a Full CFXQL.
Bot applies the Query on the data that is already loaded from previous bot or from a source.
Bot @dm:time-filter
Bot Position In Pipeline: Sink
Apply time filter on specified timestamp column
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
column* | Text | Timestamp column name. | |
from | Text | From timestamp. Can be absolute format or relative to current time. At least one of 'from' or 'to' must be specified. | |
to | Text | To timestamp. Can be absolute format or relative to current time. At least one of 'from' or 'to' must be specified. |
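A behavioral sketch of the from/to filter, using epoch-second bounds for simplicity (the bot also accepts absolute and relative time formats; this is not its implementation):

```python
# Sketch: keep rows whose timestamp column falls within [from_ts, to_ts];
# at least one bound must be given, matching the bot's contract.
def time_filter(rows, column, from_ts=None, to_ts=None):
    if from_ts is None and to_ts is None:
        raise ValueError("at least one of 'from' or 'to' must be specified")
    return [r for r in rows
            if (from_ts is None or r[column] >= from_ts)
            and (to_ts is None or r[column] <= to_ts)]

rows = [{"ts": 100}, {"ts": 200}, {"ts": 300}]
print(time_filter(rows, "ts", from_ts=150))
```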
Bot @dm:to-json
Bot Position In Pipeline: Sink
Converts each row of the input dataset to JSON format.
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
exclude_columns | Text | Regular expression to exclude one or more columns from input data | |
include_columns | Text | .* | Regular expression to include one or more columns from input data |
output_column* | Text | Name of the output column which captures the input dataset as JSON format. | |
keep_original_columns | Text | no | Keep existing columns in output or not. 'yes' or 'no' |
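The include/exclude filtering plus output-column behavior can be sketched as follows (`re.fullmatch` and `sort_keys` serialization are assumptions, not the bot's documented behavior):

```python
import json
import re

# Sketch: serialize the selected columns of each row into a JSON string
# stored in output_column, optionally keeping the original columns.
def to_json(rows, output_column, include_columns=".*", exclude_columns=None,
            keep_original_columns=False):
    out = []
    for row in rows:
        picked = {k: v for k, v in row.items()
                  if re.fullmatch(include_columns, k)
                  and not (exclude_columns and re.fullmatch(exclude_columns, k))}
        new = dict(row) if keep_original_columns else {}
        new[output_column] = json.dumps(picked, sort_keys=True)
        out.append(new)
    return out

print(to_json([{"a": 1, "b": 2}], "payload", exclude_columns="b"))
```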
Bot @dm:to-type
Bot Position In Pipeline: Sink
Change data type to str or int or float for specified columns
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
columns* | Text | Comma separated list of column names | |
type* | Text | Type to convert into: str / int / float |
Example usage:
Playground
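The str/int/float conversion can be sketched on dict rows (the real bot casts dataframe columns; this illustrates only the documented type choices):

```python
# Sketch: cast the listed columns to the requested type
# ('str', 'int', or 'float'), skipping missing or null values.
def to_type(rows, columns, type_name):
    caster = {"str": str, "int": int, "float": float}[type_name]
    for row in rows:
        for col in (c.strip() for c in columns.split(",")):
            if col in row and row[col] is not None:
                row[col] = caster(row[col])
    return rows

print(to_type([{"count": "42", "load": "0.7"}], "count", "int"))
```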
Bot @dm:transpose
Bot Position In Pipeline: Sink
Transposes columns to rows
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
columns* | Text | Comma-separated column names to set as index before transpose | |
value | Text | Name of value column |
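A single-row sketch of the columns-to-rows idea (the real bot sets index columns before transposing a dataframe; the 'column'/'value' output names here are illustrative assumptions):

```python
# Sketch: turn one row's (column, value) pairs into a list of rows.
def transpose(row, value="value"):
    return [{"column": k, value: v} for k, v in row.items()]

print(transpose({"cpu": 0.5, "mem": 0.8}))
```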
Bot @dm:validate-data
Bot Position In Pipeline: Sink
Check integrity of data using a 'schema' that has been uploaded previously
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
schema_name* | Text | Schema to verify the data against. | |
failfast | Text | yes | Specify 'yes'(default) to abort quickly on first error or 'no' to keep validating records |
action | Text | none | Action to take if validation fails. Must be one of 'none' (default), 'fail','skip-block', 'skip-pipeline' |
Bot @dm:vectorization
Bot Position In Pipeline: Sink
Compares data of the given columns and populates change state of data in the newly created 'change_state' column
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
dict | Text | JSON that contains 'column_name' and 'column_type' keys for actual column name and column type values. ex:[{'column_name':'name','column_type':'type'}] | |
suffixes | Text | _new,_old | Comma separated list of suffixes to identify old and new columns |
modified_summary | Text | False | Enable modified_summary to compute the number of modified rows in each column and populate the summary in the 'modified_summary' column (True/False) |
Bot @dm:verify-checksum
Bot Position In Pipeline: Sink
Verify checksum of input dataframe. Checksum can be verified by rows only, or by rows and then the entire dataset.
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
checksum_type | Text | dataset | Verify checksum for by row only or by row and then entire dataset. Valid values are 'rows-only', 'dataset' |
row_checksum_column | Text | rda_row_checksum | Input column for computed row level checksum. |
data_checksum_column | Text | rda_data_checksum | Input column for computed checksum for entire dataset. |
key | Text | Optional key to be used in the computed hash. | |
drop_checksum_columns | Text | yes | Use 'yes' or 'no' to specify if the checksum related columns should be dropped from the dataframe after a successful validation. |
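The add-checksum / verify-checksum pair with an optional key can be sketched with a keyed HMAC over each row. The column name is the documented default, but the exact hashing and serialization scheme here is an assumption, not the bots' actual algorithm.

```python
import hashlib
import hmac
import json

# Sketch: per-row keyed checksum; the checksum column itself is
# excluded from the computation, as the docs describe.
def row_checksum(row, key=""):
    payload = json.dumps(row, sort_keys=True).encode()
    return hmac.new(key.encode(), payload, hashlib.sha256).hexdigest()

def add_checksums(rows, column="rda_row_checksum", key=""):
    for row in rows:
        data = {k: v for k, v in row.items() if k != column}
        row[column] = row_checksum(data, key)
    return rows

def verify_checksums(rows, column="rda_row_checksum", key=""):
    return all(
        row[column] == row_checksum(
            {k: v for k, v in row.items() if k != column}, key)
        for row in rows)

rows = add_checksums([{"a": 1}], key="s3cret")
print(verify_checksums(rows, key="s3cret"))
```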
Bot @dm:xml-to-json
Bot Position In Pipeline: Sink
Parse XML document in a specified column and convert it into JSON
This bot expects a Restricted CFXQL.
Each parameter may be specified using '=' operator and AND logical operation
Following are the parameters expected for this Bot
Parameter Name | Type | Default Value | Description |
---|---|---|---|
column* | Text | Column data which need to be parsed for XML | |
output_column* | Text | Column name for the output JSON data | |
status_column | Text | Column name for parsing status | |
json_path | Text | Dot delimited JSON path to be traversed in the document. Default is entire document. |
Example usage:
Playground
Try this in RDA Playground(rda_docs_env)
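A minimal sketch of XML-to-JSON conversion using the standard library. The exact mapping the bot applies to attributes, repeated tags, and mixed content is not documented here, so this simple scheme (leaf text, nested dicts) is an assumption.

```python
import json
import xml.etree.ElementTree as ET

# Sketch: parse an XML string and convert it to a JSON document,
# mapping leaf elements to their text and nested elements to objects.
def xml_to_json(xml_text):
    def to_dict(elem):
        children = list(elem)
        if not children:
            return elem.text
        return {child.tag: to_dict(child) for child in children}
    root = ET.fromstring(xml_text)
    return json.dumps({root.tag: to_dict(root)})

print(xml_to_json("<host><name>web-1</name><ip>10.0.0.5</ip></host>"))
```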