Within Technology Services we separate Splunk into multiple tiers. These tiers serve different purposes within the Splunk environment and are outlined below.
Endpoint/Collection Tier
This tier usually consists of a Splunk Universal Forwarder agent installed locally on a workstation or server.
A key component within this tier is the deployment server. The deployment server pushes and coordinates numerous settings on the individual endpoints, also known as deployment clients. The deployment server is sized to meet the needs of ~500 endpoints. Additional capacity is planned as the service grows.
Forwarding Tier
The Forwarding Tier is a “middle” tier. It is a collection of “forwarders”, servers running either Universal Forwarders (UFs) or Heavy Forwarders (HFs), which sit between machine data sources (servers, switches, IoT devices, software services, etc.) and the Splunk Cloud indexing tier. The forwarding tier includes servers both on-premises (VMware) and in the Illinois AWS environment. Future plans include forwarding tier servers in other cloud platforms such as Microsoft Azure.
Aside: As Splunk Cloud matures, Splunk is including some forwarding services in its Splunk Cloud offering. As of this writing, Splunk Cloud recently made available HTTP Event Collector (HEC) capability and a modified version of a Heavy Forwarder called the “Inputs Data Manager” (IDM).
“UF Tier” Forwarding
As the name implies, the “UF Tier” is a collection of load-balanced servers, each running the simpler, more lightweight (though highly tuned) Universal Forwarder agent. UF Tier servers simply receive events from the UF agents on endpoints across the environment and forward them directly to Splunk Cloud. Their purpose is to provide a consistent, managed, and highly available channel between endpoints and the Splunk Cloud indexing tier.
“Heavy Forwarder” Forwarding
Heavy Forwarders are full installations of Splunk Enterprise and are used for middle-tier forwarding in scenarios where the use of a Universal Forwarder at the source is infeasible. Unlike UF Tier servers, Heavy Forwarders are typically used to pull (or receive) events from sources such as the following:
- Software services (SaaS or other) which can be configured to forward events to an endpoint on a Splunk HF by API* (push)
- Software services (SaaS or other) which can forward events upon request by API (pull)
- Database sources (e.g., MS SQL, MySQL, Oracle, etc.) through JDBC (pull)
* Splunk offers a REST endpoint referred to as an “HTTP Event Collector” (HEC) for receiving events passively (i.e., receiving events pushed to it).
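As a rough illustration of the push case, the sketch below builds (but does not send) an HEC request using only the Python standard library. The host name, token, sourcetype, and event fields are hypothetical placeholders, and port 8088 is assumed because it is the Splunk default for HEC; the actual endpoint and token for this service would come from the Splunk administrators.

```python
import json
import urllib.request


def build_hec_request(base_url, token, event, sourcetype=None, index=None):
    """Build (but do not send) a POST request for an HTTP Event Collector."""
    payload = {"event": event}
    if sourcetype is not None:
        payload["sourcetype"] = sourcetype
    if index is not None:
        payload["index"] = index
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/services/collector/event",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # HEC authenticates with a token in the form "Splunk <token>".
            "Authorization": "Splunk " + token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical host and token, for illustration only.
req = build_hec_request(
    "https://hf.example.edu:8088",
    "00000000-0000-0000-0000-000000000000",
    {"message": "user login", "user": "jdoe"},
    sourcetype="_json",
)
# urllib.request.urlopen(req) would actually push the event.
```

Building the request separately from sending it makes the payload easy to inspect before anything is pushed to a forwarder.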
Members of the forwarding tier are among a select number of forwarders allowed to send events directly to the indexing tier in Splunk Cloud. This reduces the number of ports that the Splunk Cloud indexers expose to the world and provides additional security and manageability. Clients that are not allowed to connect to the public internet (and therefore cannot reach Splunk Cloud directly) for security reasons must use the forwarding tier.
Middle-tier forwarders do not retain the data they relay. For this reason, although we expect all tiers to be highly available, customers are advised to manage log rotation so that logs are retained locally at the endpoint for at least three days.
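One way to sanity-check a rotation policy against the three-day guidance is sketched below. It assumes time-based rotation at a fixed interval and counts only fully rotated files, since the active file may have just been started; the function names and sample policies are hypothetical.

```python
from datetime import timedelta

MIN_RETENTION = timedelta(days=3)  # the three-day guidance above


def retained_window(rotation_interval, rotated_files_kept):
    """Minimum time span guaranteed to be on disk under time-based rotation."""
    return rotation_interval * rotated_files_kept


def meets_guidance(rotation_interval, rotated_files_kept, minimum=MIN_RETENTION):
    """True if the rotation policy keeps at least `minimum` worth of logs."""
    return retained_window(rotation_interval, rotated_files_kept) >= minimum


# Hypothetical policies: daily rotation keeping 2 files vs. keeping 4 files.
daily = timedelta(days=1)
too_short = meets_guidance(daily, 2)
sufficient = meets_guidance(daily, 4)
```

Size-based rotation would need a different check, because the time covered then depends on how fast the log grows.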
The forwarding tier on campus is built to handle up to 2 TB of traffic per day and routes data out of the main campus internet connection. The forwarding tier in AWS is sized for 100 GB per day.
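To put these daily figures in perspective, the short sketch below converts them to an average sustained rate. It assumes decimal units (1 TB = 10^12 bytes) and perfectly even traffic over 24 hours, so real peak rates will be higher than these averages.

```python
SECONDS_PER_DAY = 24 * 60 * 60
BYTES_PER_TB = 1e12  # assumption: decimal terabytes


def average_mb_per_second(tb_per_day):
    """Average throughput in MB/s for a given daily volume in TB."""
    return tb_per_day * BYTES_PER_TB / SECONDS_PER_DAY / 1e6


campus_rate = average_mb_per_second(2.0)  # roughly 23 MB/s
aws_rate = average_mb_per_second(0.1)     # 100 GB/day, roughly 1.2 MB/s
```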
Indexing Tier
This aggregation tier is managed by Splunk in their Splunk Cloud offering. It is the most complicated and expensive part of the Splunk service offering, requiring multiple large indexing servers, each equipped with substantial SSD storage and compute. It is also the most important part of the Splunk environment, because this is where the events are stored.
At the time of this writing, the indexing tier has seven cluster members and is designed to handle up to 1.5 TB of data per day with a targeted 90-day retention period.
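As a back-of-the-envelope check, the figures above imply the following upper bound on raw ingested data held at any one time. This is only a sketch: it ignores compression, replication across cluster members, and index overhead, so it is not the actual disk footprint.

```python
DAILY_INGEST_TB = 1.5  # design maximum from the text above
RETENTION_DAYS = 90    # targeted retention period

# Raw ingest volume retained if the tier ran at its design maximum every day.
raw_retained_tb = DAILY_INGEST_TB * RETENTION_DAYS  # 135.0 TB
```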
Search Tier
The search tier, made up of “search heads”, delivers the Splunk Web interface and facilitates searching of the data that has been indexed at the indexing tier. From the customer perspective, this is the tier where dashboards, reports, and alerts are accessed and run. Our primary production “search head”, in Splunk Cloud, is available at the address https://illinois.splunkcloud.com. Each search head instance (e.g., one for Production, one for Test / Dev) has its own set of capabilities and is configured according to its role in the environment.
A quick picture…
Here is a simplified architecture diagram.
