The Sr. Splunk Analyst position within the IT System Support and Control department is the top-level technical contributor in one or more highly specialized areas of systems administration. The Sr. Splunk Analyst is responsible for improving PJM's monitoring capabilities by expanding ITSI's capabilities, working directly with platform and application owners to onboard new data and develop KPIs.
Thresholding and Alerting:
Define and configure dynamic and static thresholds within Splunk to identify anomalies, security events, and operational issues.
Design and implement proactive alerting systems that notify stakeholders of critical conditions based on thresholds and event patterns.
Optimize thresholding configurations to minimize false positives and ensure that alerts are meaningful and actionable.
Implement KV stores, lookups, and data model acceleration to optimize search performance and reporting.
KPI Creation and Reporting:
Develop and maintain Key Performance Indicators (KPIs) in ITSI to measure system performance and business operations.
Collaborate with stakeholders to gather requirements and define KPIs that align with organizational goals and metrics.
Build and optimize dashboards, Service Analyzers, and visualizations that display KPIs, ensuring clarity, usability, and accessibility for key decision-makers.
Data Analysis and Troubleshooting:
Investigate and analyze log data to identify trends, patterns, and anomalies that require attention.
Provide insights and recommendations for improving operational efficiency and identifying security risks.
Work with internal teams to resolve data quality issues, ensuring accurate and reliable data for Splunk analysis.
Essential Functions:
Systems administration with a focus on management of the PJM Splunk service infrastructure.
Perform data ingest, ensuring appropriate source typing and data quality.
Provide subject matter expertise for Splunk SPL query writing, dashboard creation, and alert configuration.
Mentor team members in all aspects of the Splunk platform.
Work with information systems owners and project teams to understand their logging needs and assist with implementing practices and procedures consistent with PJM's policies.
Maintain current knowledge of industry trends and standards.
Periodic off-hours work is required, including weekends and holidays. Must be able to provide 24x7 on-call support as necessary.
Select and implement tools that support integration with the *nix platform, including Splunk and configuration management software such as Puppet and Ansible, to automate tasks including system recoveries.
Provide configuration management support for both hardware and software, leveraging best practices, software packaging, and version control systems.
Perform proactive performance management using state-of-the-art tools.
Manage projects for the implementation of new solutions at PJM.
Deployment and integration of Splunk apps according to business needs and strategic direction.
Participate in the development and completion of testing activities related to the Splunk platform.
Build strong relationships with ITS's business area clients to develop solutions that make a difference and get recognized.
Manage, maintain, and support middleware software, working closely with developers, third-party vendors, and customers to ensure a reliable and scalable platform.
Support PJM's IT Service Intelligence monitoring platform.
Respond to all identified system issues to reduce the mean time to recovery (MTTR).
Characteristics & Qualifications:
Required:
Bachelor's degree in Computer Science or Business, or equivalent work experience