Guide to Data Ingestion in RDA Fabric
1. Ingesting Data into RDAF Using Event Gateway
RDA Event Gateway is an RDA Fabric component that can be deployed in the cloud or at on-premises / edge locations to ingest data from various sources.
RDA Event Gateway supports the following endpoint types:
| Endpoint Type | Protocol | Description |
|---|---|---|
| syslog_tcp | TCP/SSL | Syslog or Syslog-like event ingestion via TCP or SSL |
| syslog_udp | UDP | Syslog or Syslog-like event ingestion via UDP |
| http | HTTP/HTTPS | JSON or plain text formatted events via Webhook. Supports HTTP operations POST and PUT |
| tcp_json | TCP/SSL | JSON encoded messages with one message per line |
| filebeat | HTTP/HTTPS | Elasticsearch Filebeat / Winlogbeat based ingestion of data |
| file | | Ingestion of data from one or more file(s) or folder(s) |
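The tcp_json endpoint type expects JSON encoded messages with one message per line. A minimal sketch of that framing in Python is shown here; the helper name `encode_ndjson` and the sample events are hypothetical, and actually sending the bytes to the gateway over a TCP or SSL socket is left out.

```python
import json

def encode_ndjson(events):
    """Serialize each event as compact JSON on its own line (hypothetical helper)."""
    lines = []
    for event in events:
        # json.dumps escapes newlines that appear inside string values,
        # so every message occupies exactly one line on the wire.
        lines.append(json.dumps(event, separators=(",", ":")))
    return ("\n".join(lines) + "\n").encode("utf-8") if lines else b""

# Sample events are illustrative only.
payload = encode_ndjson([
    {"host": "web-01", "message": "disk usage high"},
    {"host": "web-02", "message": "multi\nline text"},
])
```

Terminating every message with a newline, including the last one, lets a line-oriented receiver treat each message as complete as soon as it arrives.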
RDA Event Gateway Endpoint configuration example for Webhook:
Explanation of configuration fields:
name: Name of the endpoint. Must be unique
enabled: If set to false, Event Gateway will shut down the endpoint
type: Type of endpoint. For this example, 'http'
A separate boolean setting, when set to true, runs the endpoint in HTTPS mode
content_type: Type of content to expect in incoming payload. Possible values are 'auto', 'json', 'text'. If set to auto, endpoint will detect the content using Content-Type HTTP header.
port: TCP port to listen for data
stream: Name of the RDA Stream where the data will be published for further consumption by RDA Pipelines or Persistent Streams
attrs: Optional dictionary of attributes that will be added to each message's payload.
In addition, Event Gateway automatically inserts the following attributes into each message:
rda_gw_ep_type: Endpoint Type (in this example: 'http')
rda_gw_ep_name: Endpoint Name
rda_gw_timestamp: Ingested timestamp in ISO format
rda_content_type: HTTP Content-Type header value
rda_url: HTTP URL
rda_path: Path part of HTTP URL
rda_gw_client_ip: IP Address of the client that posted the data
rda_user_agent: User-Agent of the client
rda_stream: RDA Stream where this message is being forwarded to
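From the client side, posting an event to an http endpoint could look like the sketch below. It only builds the request and does not send it. The gateway host name, port, and URL path are hypothetical placeholders, not values from this guide. Setting the Content-Type header explicitly matters when content_type is 'auto', since the endpoint then relies on that header to detect the payload format.

```python
import json
import urllib.request

def build_webhook_request(url, event):
    """Build (but do not send) a JSON POST for an Event Gateway http endpoint."""
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",  # the http endpoint type supports POST and PUT
        headers={"Content-Type": "application/json"},
    )

# Hypothetical gateway address, port, and path; substitute your endpoint's values.
req = build_webhook_request(
    "http://eventgw.example.com:8888/webhook",
    {"severity": "critical", "message": "link down on eth0"},
)
# To actually send the event: urllib.request.urlopen(req, timeout=5)
```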
Automatic Archival of Data from Event Gateway:
RDA Event Gateway can be configured to automatically archive ingested data using the RDA Log Archive feature.
The following is an example snippet for the main.yml configuration file:
The Log Archive repository (demo_logarchive) must be pre-created using the CLI or RDA Portal.
2. Ingesting Data into RDAF Using Message Queues
RDA Pipelines can continuously ingest data from many types of message queues. Some of the most commonly used approaches are:
See the pages above for the list of bots available for ingesting data from different types of queues.
3. Ingesting Data into RDAF Using Purpose Built Bots
RDA provides an extensive set of bots to retrieve data from various sources. The following are some of the integrations available:
File & Object Storages
- Files & URLs: Supports many formats like CSV, ZIP, GZIP, Parquet
- Minio / S3
ITOM & Observability
- Arista Bigswitch
- Cisco ACI
- Cisco Intersight
- Cisco IOS
- Cisco Meraki
- Cisco NXOS
- Cisco Support
- Cisco UCS CIMC
- Cisco UCS Manager
- Cisco Unified Call Manager
- EMC Isilon
- EMC Unity
- EMC XtremIO
- HPE 3Par
- IBM AIX
- Linux & Docker
- NetApp ONTAP 7-Mode
- NetApp ONTAP C-Mode
- Pure Storage
- VMware vCenter
4. Ingesting Data Using Staging Area
RDA Pipelines can continuously ingest data from a staging area (for example, S3 or minio). Data can be ingested directly from files in a specified bucket and folder path (or prefix).
A staging area definition specifies where data files are stored so that the data in the files can be ingested into RDA Fabric.
- Staging area definitions are stored in RDAF Object Storage.
- The staging area data can be in RDAF Object Storage or any external storage (S3 or minio). For an external staging area, the user needs to create a credential of type stagingarea-ingest for the RDA platform to access the bucket.
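To illustrate what "a specified bucket and folder path (or prefix)" means in practice, the sketch below selects object keys that fall under a folder path. It is not RDA's implementation; the function name `select_staged_files` and the sample keys are hypothetical, and listing the bucket contents is assumed to have happened elsewhere.

```python
def select_staged_files(object_keys, prefix):
    """Return keys under the folder path, skipping folder placeholder entries."""
    # Normalize so that "incoming/logs" does not also match "incoming/logs-old/".
    folder = prefix.rstrip("/") + "/" if prefix else ""
    return sorted(
        key for key in object_keys
        if key.startswith(folder) and not key.endswith("/")
    )

# Sample object keys are illustrative only.
keys = [
    "incoming/logs/2024-01-01.csv",
    "incoming/logs/2024-01-02.csv",
    "incoming/logs/",
    "incoming/logs-old/2023-12-31.csv",
    "other/data.csv",
]
selected = select_staged_files(keys, "incoming/logs")
```

Appending a trailing slash before matching is a deliberate choice in this sketch: a bare string prefix would also pick up sibling folders whose names merely begin with the same characters.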
Related RDA Client CLI Commands
- staging-area-add: Add or update staging area
- staging-area-delete: Delete a staging area
- staging-area-get: Get YAML data for a staging area
- staging-area-list: List all staging areas
See the RDA CLI Guide for installation instructions.
Sample YAML: For staging area in RDA platform
Sample YAML: For staging area that is external (S3 or minio)
Managing through RDA Portal
- In RDA Portal, click Data in the left menu
- Click on 'View Details' next to Data Staging Area
Managing through RDA Studio
- Studio does not have any user interface for managing the staging area.
5. Ingesting Data Once from a Location
RDA Pipelines can also ingest data once from a given location (S3 or minio). Data can be ingested directly from files in a specified bucket and a folder path (or prefix).
For an external location, the user needs to create a credential of type stagingarea-ingest for the RDA platform to access the bucket.