Maintenance Window Scheduling Guidance for Splunk Support

This document is intended as communication to the Splunk Cloud Product and Support teams.

Maintenance Window Candidacy Calendars

Please use the following calendars (TZ = US/Central, i.e., America/Chicago) to identify when to schedule maintenance / implement changes for our environment(s).

0) The Maintenance Window Calendar for Development Instance (illinois-dev.splunkcloud.com)

The development Splunk Cloud instance can be updated at any time, at will. We ask only that Splunk give notice that maintenance will be performed, along with the standard notices sent when maintenance begins and when it ends.

1) The “Little-to-no impact change” Maintenance Window for Production Instance (illinois.splunkcloud.com)

Please use this calendar to identify candidate Maintenance Windows for changes that…

  • Have a 2-hour completion time
  • Take into account the (possible) need for a 1-week period of testing the change in the dev environment (see “Institutional Requirements Related to Change Management” below) – AND –
  • Include app installation requests
  • Include Search Head Cluster restarts
  • Complete within the designated windows

TZ = US/Central, i.e., America/Chicago

Sunday          Monday          Tuesday         Wednesday       Thursday        Friday          Saturday
                00:00 – 04:00   00:00 – 04:00   00:00 – 04:00   00:00 – 04:00   00:00 – 04:00
00:00 – 21:00   04:00 – 06:00   04:00 – 06:00   04:00 – 06:00   04:00 – 06:00   04:00 – 06:00   00:00 – 23:59
21:00 – 23:59   21:00 – 23:59   21:00 – 23:59   21:00 – 23:59   21:00 – 23:59   21:00 – 23:59
  • Green = Preferred / Encouraged.
  • Blue = Acceptable
  • Not listed = Not acceptable
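
As a rough illustration, a proposed change could be checked against this calendar programmatically. The sketch below is hypothetical: the names LITTLE_IMPACT_WINDOWS, to_central, and fits_window are ours for illustration only, and the assignment of time ranges to days (00:00 – 04:00 on Monday through Friday, 21:00 – 23:59 on Sunday through Friday) is an assumption read from the calendar rows above. It does not distinguish Green from Blue windows.

```python
from datetime import datetime, time, timedelta
from zoneinfo import ZoneInfo

# Windows for the "Little-to-no impact change" calendar, in US/Central.
# Keys follow datetime.weekday(): Monday=0 ... Sunday=6.
# ASSUMPTION: the day-to-range mapping is inferred from the calendar rows.
_WEEKDAY = [
    (time(0, 0), time(4, 0)),
    (time(4, 0), time(6, 0)),
    (time(21, 0), time(23, 59)),
]
LITTLE_IMPACT_WINDOWS = {
    0: _WEEKDAY, 1: _WEEKDAY, 2: _WEEKDAY, 3: _WEEKDAY, 4: _WEEKDAY,
    5: [(time(0, 0), time(23, 59))],                              # Saturday
    6: [(time(0, 0), time(21, 0)), (time(21, 0), time(23, 59))],  # Sunday
}


def to_central(moment: datetime) -> datetime:
    """Convert a timezone-aware datetime to America/Chicago local time."""
    return moment.astimezone(ZoneInfo("America/Chicago"))


def fits_window(start: datetime, duration: timedelta,
                windows=LITTLE_IMPACT_WINDOWS) -> bool:
    """Return True if the whole maintenance fits inside one listed window.

    `start` must already be expressed in US/Central local time.
    """
    end = start + duration
    if end.date() != start.date():
        return False  # crosses midnight; this simple check rejects that
    return any(
        opens <= start.time() and end.time() <= closes
        for opens, closes in windows.get(start.weekday(), [])
    )
```

For example, fits_window(datetime(2024, 1, 8, 4, 0), timedelta(hours=2)) checks a 2-hour change starting on a Monday at 04:00 Central; a time given in UTC can be passed through to_central first. A change that straddles two adjacent ranges (say 03:00 to 05:00 on a weekday) is rejected by this sketch, since the two ranges may carry different colors in the calendar.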

2) The “Disruption Possible and/or Testing Required” Maintenance Window for Production Instance (illinois.splunkcloud.com)

Please use this calendar to identify candidate Maintenance Windows for changes that…

  • Take into account the (likely) need for a 1-week period of testing the change in the dev environment (see “Institutional Requirements Related to Change Management” below) – AND –
  • May cause service disruption or degradation. (See “Some Disruptions and/or Testing Required” below.) – OR –
  • Require the customer to check for restored service. – OR –
  • Require the customer to test the impact (result) of the change.
  • Complete within the designated windows

TZ = US/Central, i.e., America/Chicago

Sunday Monday Tuesday Wednesday Thursday Friday Saturday
00:00 – 03:59 00:00 – 04:00 00:00 – 03:59
04:00 – 10:00

04:00 – 06:00

04:00 – 10:00
10:00 – 17:00 22:00 – 23:59 23:00 – 23:59 10:00 – 23:59
  • Green = Preferred / Encouraged
  • Blue = Acceptable for longer deployments that require coordination with Illinois to check-out & test
  • Red = Discouraged but Acceptable if coordinated with customer well in advance
  • Not listed = Not acceptable

3) The “Disruption Expected” Maintenance Window for Production Instance (illinois.splunkcloud.com)

Please use this calendar to identify candidate Maintenance Windows for changes that…

  • Have at least a 6-hour completion time
  • Take into account the (likely) need for a 1-week period of testing the change in the dev environment (see “Institutional Requirements Related to Change Management” below) – AND –
  • Are expected to disrupt or degrade service.
  • Require the customer to check for restored service.
  • Require the customer to test the impact (result) of the change.
  • Include version upgrades
  • Complete within the designated windows

TZ = US/Central, i.e., America/Chicago

Sunday Monday Tuesday Wednesday Thursday Friday Saturday
00:00 – 10:00 00:00 – 06:00 00:00 – 10:00
21:00 – 23:59 21:00 – 23:59

21:00 – 23:59

  • Green = Preferred / Encouraged
  • Blue = Acceptable
  • Not listed = Not acceptable


Definitions / Clarifications

“Little-to-no impact”

The change…

  • Is not intended to substantially change the behavior of ingest.
  • Does not introduce new versions of core Splunk or installed apps that require post-install changes and/or testing by customer.

At the time of change / maintenance…

  • There will be no disruption or degradation of ingestion activity – or no more than a 15-minute disruption to (or degradation of) ingestion.
  • There will be no disruption or degradation of service to current user sessions (e.g., developing ad hoc searches, building reports, dashboards, etc.) in the user interface (Splunk Web) – or no more than a 15-minute disruption to (or degradation of) the user interface. (See Info box about Rolling Restarts below.)
  • All scheduled searches and alerts/triggers will occur within the window of their scheduled execution … provided the scheduled search’s window of execution is 5 minutes or more.
  • The vendor (Splunk) will ensure the instance is restored to service after the change is implemented.

Rolling Restarts

It is our understanding that “Rolling Restarts” of our environment should meet the above expectations. That is, restarts a) should result in no disruption of ingestion and b) should result in less than 5 minutes of interruption for any search head. Further, c) a “Searchable Rolling Restart” should ensure that scheduled searches are not lost, discarded, or canceled (assuming they have a window of 5 minutes or more). Finally, it is our understanding that d) user sessions will be briefly interrupted (including potential loss of any unsaved edits) when a search head restarts, even with a Searchable Rolling Restart of the search head; this is undesirable.

“Some Disruptions and/or Testing Required”

The change…

  • May change the behavior of ingest.
  • Introduces new versions of core Splunk or installed apps that are expected to require post-install changes and/or testing.

At the time of change / maintenance…

  • There will be a disruption to or degradation of ingestion activity of more than 15 minutes.
  • The Splunk Web user interface (including navigating the site, editing knowledge objects, performing ad hoc searches, executing scheduled searches) may be disrupted (or degraded) for more than 15 minutes. (See Info Box about Rolling Restarts above.)
  • The customer may need to check to ensure that the service is restored after the change is implemented.
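
The two definitions above can be read as a simple decision rule. The sketch below is one hypothetical encoding of it: the names ChangeEstimate and classify and the field names are assumptions for illustration, while the 15-minute threshold comes from the definitions. It covers only the two categories defined here and is not a substitute for judgment about a specific change.

```python
from dataclasses import dataclass

# Threshold taken from the definitions: disruptions of up to 15 minutes
# still count as "Little-to-no impact".
DISRUPTION_LIMIT_MIN = 15

LITTLE = "Little-to-no impact"
SOME = "Some Disruptions and/or Testing Required"


@dataclass
class ChangeEstimate:
    """Hypothetical summary of a proposed change (field names are ours)."""
    ingest_disruption_min: float   # expected ingestion disruption, minutes
    ui_disruption_min: float       # expected Splunk Web disruption, minutes
    changes_ingest_behavior: bool  # may the change alter ingest behavior?
    needs_customer_testing: bool   # post-install changes/testing expected?


def classify(change: ChangeEstimate) -> str:
    """Return the category name that the estimate falls under."""
    little = (
        not change.changes_ingest_behavior
        and not change.needs_customer_testing
        and change.ingest_disruption_min <= DISRUPTION_LIMIT_MIN
        and change.ui_disruption_min <= DISRUPTION_LIMIT_MIN
    )
    return LITTLE if little else SOME
```

For instance, classify(ChangeEstimate(10, 15, False, False)) describes a change with short disruptions and no customer testing, whereas any estimate above 15 minutes, or one that needs customer testing, falls into the second category.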

Institutional Requirements Related to Change Management

  • At the University of Illinois, security updates are expected to be implemented within one week of public notice.
  • Traditionally, within Technology Services at Illinois, changes to services offered to campus, such as this one, are required (as part of Change Management protocol) to undergo a 1-week review in a test (dev) environment before the change is implemented in the production environment. With (non-traditional) Cloud services such as Splunk Cloud, we realize this may be inappropriate or redundant, since Splunk does its own testing of its (our) Cloud environment. However, not all aspects of our Splunk Cloud environment fall within the responsibility of the Splunk Cloud Support/Product teams to monitor or test (e.g., our combination of apps, our custom configurations, analytics, etc.), so we recognize that identifying which changes fall within this “1-week” testing expectation is a subjective challenge. We are eager to manage this challenge together as effectively as possible.

Splunk at Illinois
Email: splunk-admin@illinois.edu