The ServiceNow Health Check Checklist We Run for Every New Client
Why Every ServiceNow Instance Needs a Health Check
When we onboard a new client, the first thing we do is run a comprehensive health check on their ServiceNow instance. Not because we assume things are broken, but because every instance accumulates technical debt over time — skipped upgrades, abandoned customizations, orphaned scripts, and configurations that made sense three years ago but now create performance bottlenecks.
A health check gives you a baseline. It tells you what is working, what is fragile, and what needs immediate attention before you start any new implementation work. Without one, you are building on an unknown foundation.
Here is the exact 15-point checklist we use. We have refined it over 50+ enterprise engagements and it consistently surfaces the issues that matter most.
The 15-Point ServiceNow Health Check Checklist
1. Upgrade Currency
Check which ServiceNow release you are running and how many versions behind the latest release you are. ServiceNow releases two major versions per year. If you are more than two versions behind, you are accumulating upgrade debt that gets harder to resolve with every release you skip. Document the current version, the target version, and any skipped versions in between.
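As a rough illustration of that documentation step, here is a minimal Python sketch. The ordered release list is our own assumption for the example and needs to be kept current by hand; the function names are hypothetical and are not part of any ServiceNow API.

```python
# Release family names, oldest first. This list is an assumption made for
# illustration; extend it as new families ship.
RELEASES = ["Rome", "San Diego", "Tokyo", "Utah", "Vancouver",
            "Washington DC", "Xanadu", "Yokohama", "Zurich"]


def releases_behind(current, target):
    """Return every release after `current`, up to and including `target`."""
    start = RELEASES.index(current)
    end = RELEASES.index(target)
    if end < start:
        raise ValueError("target release is older than current release")
    return RELEASES[start + 1:end + 1]


def upgrade_report(current, target):
    """Summarize the three facts the checklist asks you to document."""
    gap = releases_behind(current, target)
    return {
        "current": current,
        "target": target,
        "versions_behind": len(gap),
        "skipped": gap[:-1],              # releases strictly between the two
        "upgrade_debt": len(gap) > 2,     # "more than two versions behind"
    }


if __name__ == "__main__":
    print(upgrade_report("Utah", "Xanadu"))
```

The `upgrade_debt` flag simply encodes the "more than two versions behind" rule from this section.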
2. Instance Scan Results
Run ServiceNow’s built-in Instance Scan with all scan suites enabled. This automatically flags best practice violations, performance issues, and security concerns. Pay special attention to critical and high-severity findings — these are the ones most likely to cause production incidents. Export the results and use them as your remediation backlog.
3. Customization Audit
Count the number of customized out-of-box records across business rules, client scripts, UI policies, and script includes. High customization counts (500+) indicate upgrade risk and maintenance overhead. For each customization, determine whether it is still needed, whether it can be replaced with a configuration, and whether it conflicts with newer platform features.
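Once the customized records are exported, the count itself is easy to automate. The sketch below assumes each exported record is a dictionary with a hypothetical `type` key; the 500 threshold comes from the guidance above, and everything else is illustrative.

```python
from collections import Counter

# Threshold from the checklist: 500+ customized records indicates risk.
HIGH_CUSTOMIZATION_THRESHOLD = 500


def summarize_customizations(records):
    """Count customized records per type and flag a high overall total.

    `records` is an iterable of dicts with a hypothetical 'type' key, e.g.
    'business_rule', 'client_script', 'ui_policy', 'script_include'.
    """
    by_type = Counter(record["type"] for record in records)
    total = sum(by_type.values())
    return {
        "by_type": dict(by_type),
        "total": total,
        "high_risk": total >= HIGH_CUSTOMIZATION_THRESHOLD,
    }


if __name__ == "__main__":
    sample = [{"type": "business_rule"}] * 3 + [{"type": "ui_policy"}] * 2
    print(summarize_customizations(sample))
```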
4. Performance Baseline
Measure average page load times, transaction response times, and slow query counts. Document these as your baseline so you can measure improvement after remediation. Key metrics include: average form load time (target: under 2 seconds), average list load time (target: under 3 seconds), and slow transaction count per day (target: zero).
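To keep the baseline comparable before and after remediation, it helps to record the measurements in a fixed shape. The following is a sketch only: the metric keys are names we made up for the example, and the targets are the ones listed above.

```python
# Targets from the checklist. The metric keys are hypothetical names.
TARGETS = {
    "avg_form_load_s": 2.0,            # target: under 2 seconds
    "avg_list_load_s": 3.0,            # target: under 3 seconds
    "slow_transactions_per_day": 0,    # target: zero
}


def compare_to_targets(measured):
    """Return, per metric, the measured value, the target, and a pass flag."""
    report = {}
    for metric, target in TARGETS.items():
        value = measured[metric]
        if metric == "slow_transactions_per_day":
            meets = value == 0          # the target is exactly zero
        else:
            meets = value < target      # load times must be under the target
        report[metric] = {"value": value, "target": target,
                          "meets_target": meets}
    return report


if __name__ == "__main__":
    baseline = {"avg_form_load_s": 2.8, "avg_list_load_s": 2.1,
                "slow_transactions_per_day": 14}
    for name, row in compare_to_targets(baseline).items():
        print(name, row)
```

Saving this report alongside the date gives you a like-for-like comparison when you measure again.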
5. Security Configuration Review
Audit Access Control Lists (ACLs) for overly permissive rules, check for hardcoded credentials in scripts, review user roles for excessive privileges, and verify that sensitive tables have proper field-level security. A single misconfigured ACL can expose confidential data to hundreds of users.
6. Unused Plugin Assessment
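List all activated plugins and determine which ones are actually in use. Unused plugins consume system resources, add upgrade complexity, and expand your attack surface. Flag any plugin that has zero active users or zero transactions in the past 90 days, and deactivate or uninstall it where the platform allows. Many plugins cannot be deactivated once activated, so confirm what is possible before you plan the cleanup.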
7. Integration Health
Review all active integrations — REST messages, SOAP messages, MID Server connections, IntegrationHub spokes. Check for failed transaction patterns, authentication expiry dates, and deprecated API versions. Document the business owner for each integration so you know who to contact when changes are needed.
8. Data Volume Analysis
Check table sizes for the largest tables in the instance. Tables with millions of records without proper archiving rules slow down queries and increase backup times. Identify tables that need data archiving policies and implement retention rules based on business requirements.
9. Scheduled Job Review
List all active scheduled jobs, their execution frequency, and their average run time. Look for jobs that run too frequently, jobs that overlap with other jobs, and jobs that take longer than expected. Consolidate or optimize jobs that are consuming excessive system resources.
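Finding overlapping jobs is a simple interval comparison once you have each job's start time and average run time. This sketch assumes you have exported that data into `(name, start, duration)` tuples; the tuple shape and the job names are our assumptions for the example.

```python
from datetime import datetime, timedelta
from itertools import combinations


def overlapping_jobs(runs):
    """Return pairs of job names whose run windows overlap.

    `runs` is a list of (name, start, duration) tuples. Two windows overlap
    when each one starts before the other ends. Back-to-back jobs, where one
    ends exactly as the next starts, are not counted.
    """
    pairs = []
    for (name_a, start_a, dur_a), (name_b, start_b, dur_b) in combinations(runs, 2):
        if start_a < start_b + dur_b and start_b < start_a + dur_a:
            pairs.append((name_a, name_b))
    return pairs


if __name__ == "__main__":
    base = datetime(2024, 1, 1, 2, 0)
    runs = [
        ("cleanup", base, timedelta(minutes=30)),
        ("report", base + timedelta(minutes=20), timedelta(minutes=15)),
        ("sync", base + timedelta(hours=1), timedelta(minutes=5)),
    ]
    print(overlapping_jobs(runs))
```

The pairwise comparison is quadratic in the number of jobs, which is fine for the few hundred scheduled jobs a typical instance has.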
10. Update Set Hygiene
Check for uncommitted update sets, update sets in the “In Progress” state for more than 30 days, and update set collision history. Poor update set management is the number one cause of configuration drift between environments. Establish a policy for update set naming, completion, and promotion.
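The 30-day rule is easy to check against an export of your update sets. The sketch below assumes each update set is a dictionary with hypothetical `name`, `state`, and `created` keys; adjust the state value to match whatever your export actually contains.

```python
from datetime import date, timedelta


def stale_update_sets(update_sets, as_of, max_age_days=30):
    """Return names of update sets still in progress after `max_age_days`.

    Each item is a dict with hypothetical keys: 'name', 'state', and
    'created' (a datetime.date).
    """
    cutoff = as_of - timedelta(days=max_age_days)
    return [
        us["name"]
        for us in update_sets
        if us["state"] == "in progress" and us["created"] < cutoff
    ]


if __name__ == "__main__":
    sets = [
        {"name": "INC form tweaks", "state": "in progress", "created": date(2024, 5, 1)},
        {"name": "New catalog item", "state": "in progress", "created": date(2024, 6, 15)},
        {"name": "ACL fixes", "state": "complete", "created": date(2024, 1, 10)},
    ]
    print(stale_update_sets(sets, as_of=date(2024, 6, 30)))
```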
11. Knowledge Base Assessment
Review knowledge article counts, usage statistics, and article age. A knowledge base with outdated articles is worse than no knowledge base — it actively misleads users and increases support ticket volume. Identify articles older than 12 months that have not been reviewed and flag them for update or retirement.
12. SLA Configuration Audit
Verify that SLA definitions match current business agreements. Check for SLAs that are never triggered (misconfigured conditions), SLAs with zero breaches (suspiciously low — may indicate paused timers), and SLAs without proper notification rules. Misaligned SLAs give management false confidence about service delivery.
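The three warning signs above can be turned into a simple rule check. This is a sketch under assumptions: the record keys (`times_triggered`, `breaches`, `has_notifications`) are hypothetical names for data you would pull from your own SLA reporting.

```python
def audit_slas(slas):
    """Return a dict of SLA name -> list of warnings for suspicious SLAs.

    Each SLA is a dict with hypothetical keys: 'name', 'times_triggered',
    'breaches', and 'has_notifications'.
    """
    findings = {}
    for sla in slas:
        flags = []
        if sla["times_triggered"] == 0:
            flags.append("never triggered: check conditions")
        elif sla["breaches"] == 0:
            # Only meaningful for SLAs that have actually been triggered.
            flags.append("zero breaches: check for paused timers")
        if not sla["has_notifications"]:
            flags.append("no notification rules")
        if flags:
            findings[sla["name"]] = flags
    return findings


if __name__ == "__main__":
    slas = [
        {"name": "P1 resolution", "times_triggered": 0, "breaches": 0, "has_notifications": True},
        {"name": "P2 resolution", "times_triggered": 120, "breaches": 0, "has_notifications": False},
        {"name": "P3 resolution", "times_triggered": 300, "breaches": 12, "has_notifications": True},
    ]
    for name, flags in audit_slas(slas).items():
        print(name, flags)
```

A flag here is a prompt to investigate, not proof of a misconfiguration: some SLAs genuinely never breach.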
13. Workflow and Flow Status
Identify stalled workflows, flows with high error rates, and automations that have not executed in 90+ days. Stalled workflows consume system resources and can block record processing. Decommission any automation that is no longer serving a business purpose.
14. CMDB Data Quality Score
Assess CMDB health using completeness (percentage of required fields populated), accuracy (CIs matching real-world state), freshness (last update date), and relationship coverage (CIs with proper upstream and downstream mappings). A CMDB that scores below 70% across these dimensions is not reliable enough to support incident and change management.
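One way to turn the four dimensions into a single number is a weighted average. The equal weighting below is our assumption for illustration, not a platform formula, and each dimension is expected as a percentage from 0 to 100 that you have measured separately.

```python
DIMENSIONS = ("completeness", "accuracy", "freshness", "relationship_coverage")
QUALITY_THRESHOLD = 70.0   # from the checklist: below 70% is not reliable


def cmdb_quality_score(scores, weights=None):
    """Weighted average of the four dimension scores (each 0 to 100).

    Equal weights are an assumption; pass `weights` to emphasize the
    dimensions that matter most to your processes.
    """
    weights = weights or {dim: 1.0 for dim in DIMENSIONS}
    total_weight = sum(weights[dim] for dim in DIMENSIONS)
    return sum(scores[dim] * weights[dim] for dim in DIMENSIONS) / total_weight


def is_reliable(scores, weights=None):
    """True when the overall score meets the 70% threshold."""
    return cmdb_quality_score(scores, weights) >= QUALITY_THRESHOLD


if __name__ == "__main__":
    scores = {"completeness": 80, "accuracy": 60,
              "freshness": 70, "relationship_coverage": 50}
    print(cmdb_quality_score(scores), is_reliable(scores))
```

A single score hides which dimension is dragging the total down, so report the four inputs alongside it.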
15. User Satisfaction Baseline
Pull the most recent user satisfaction survey results or, if unavailable, check incident reopen rates and self-service adoption rates. These indicate whether the platform is actually serving its users well. Technical health means nothing if users are frustrated and creating workarounds.
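If survey data is missing, the two proxy rates can be computed from ticket counts. The definitions used here (reopened incidents over resolved incidents, self-service tickets over all tickets) are our working assumptions; use whatever definitions your organization already reports on.

```python
def rate(part, whole):
    """Percentage of `part` in `whole`, or 0.0 when `whole` is zero."""
    return 0.0 if whole == 0 else 100.0 * part / whole


def satisfaction_proxies(resolved, reopened, self_service_tickets, total_tickets):
    """Compute the two proxy metrics from raw counts for the same period."""
    return {
        "reopen_rate_pct": rate(reopened, resolved),
        "self_service_adoption_pct": rate(self_service_tickets, total_tickets),
    }


if __name__ == "__main__":
    print(satisfaction_proxies(resolved=200, reopened=10,
                               self_service_tickets=300, total_tickets=1000))
```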
How to Interpret Your Results
After running through all 15 points, categorize findings into three priority buckets. Critical items are anything creating production risk, security vulnerabilities, or blocking business processes — fix these within 2 weeks. High items are performance issues, upgrade blockers, and compliance gaps — plan these for the next 30 days. Medium items are optimization opportunities and technical debt — schedule these quarterly.
The goal is not to fix everything at once. The goal is to understand your current state, prioritize based on business impact, and create a realistic remediation roadmap.
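The three priority buckets translate directly into a dated roadmap. In this sketch the 14-day and 30-day windows come from the guidance above, while treating "quarterly" as 90 days is our simplifying assumption; the finding fields are hypothetical.

```python
from datetime import date, timedelta

# Days allowed per bucket. 14 and 30 come from the text; 90 days is an
# assumed stand-in for "quarterly".
DEADLINE_DAYS = {"critical": 14, "high": 30, "medium": 90}
PRIORITY_ORDER = ["critical", "high", "medium"]


def build_roadmap(findings, start):
    """Sort findings by priority and attach a due date to each.

    Each finding is a dict with hypothetical keys 'title' and 'priority'.
    Returns a list of (title, priority, due_date) tuples.
    """
    ordered = sorted(findings, key=lambda f: PRIORITY_ORDER.index(f["priority"]))
    return [
        (f["title"], f["priority"],
         start + timedelta(days=DEADLINE_DAYS[f["priority"]]))
        for f in ordered
    ]


if __name__ == "__main__":
    findings = [
        {"title": "Archive old task records", "priority": "medium"},
        {"title": "Overly permissive ACL on HR table", "priority": "critical"},
        {"title": "Slow incident list query", "priority": "high"},
    ]
    for row in build_roadmap(findings, start=date(2024, 1, 1)):
        print(row)
```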
Get Your Free ServiceNow Health Check
If you would rather have an experienced team run this assessment for you, Milic Media offers a complimentary 30-minute ServiceNow health check for qualified organizations. We will walk through the critical findings, identify your top three priorities, and give you a roadmap you can act on immediately.
Book your free health check now