Cloud Monitoring Engineer (Remote)

Oxley Enterprises
Stafford, VA, USA
Full-time
AI
$102k - $156k/year
Posted 1 day ago

Job Description

**The following states/districts are excluded from this job ad: AK, CA, CO, CT, DC, HI, LA, MA, MN, MO, NE, NV, NH, NJ, NM, NY, ND, OR, PR, RI, VT, WA, WY**

**Future Need - Actively Interviewing**

**Location:** Remote in any United States jurisdiction not excluded from this job advertisement.

Provide the eyes-on-glass excellence a mission-critical Department of Veterans Affairs (VA) platform demands. As a Cloud Monitoring Engineer, you will build, tune, and maintain the observability stack tracking latency, error rate, saturation, volume, and incident-free availability across 300+ applications.

**Position Description:** The Cloud Monitoring Engineer builds and maintains the Capabilities and Services Dashboard and supports monitoring infrastructure, ensuring automated alerting detects production issues before user-reported tickets arrive.

**Minimum/General Experience:** 5 years of experience in cloud monitoring and observability engineering

**Minimum Education:** Bachelor's Degree in computer science, information technology, or related field; Dynatrace Associate certification or equivalent observability platform certification (preferred)

**Essential Skills/Qualifications:**

* Excellent experience building and maintaining dashboards displaying latency, error rate, saturation, volume, and incident-free availability in real time (e.g., Dynatrace, Splunk)
* Excellent knowledge of the four Golden Signals
* Excellent ability to individually monitor and track latency, error rate, saturation, volume, and incident-free time
* Excellent ability to implement dependency tracking within monitoring dashboards including latency, error rates, and transaction volumes
* Excellent experience configuring automated alerts reflecting meaningful degradation or disruption while minimizing false positives
* Excellent ability to maintain an accurate, complete, and auditable log of all alerts including alerted system, cause, timestamps, corrective actions, and responsible system
* Above average experience supporting 24/7 monitoring operations and coordinating with on-call Site Reliability Engineers (SREs) during active incidents
* Above average knowledge of AWS CloudWatch and integration with third-party observability tools in a GovCloud environment
* Experience supporting federal government programs and enterprise-scale applications operating in cloud-based or hybrid environments
* Excellent verbal and written communication skills

**General Physical Requirements needed to perform the essential functions of this job may vary based on the location of the assignment.**

* Assignment Location - Remote
* Sedentary Work - Exerting up to 10 pounds of force occasionally and/or a negligible amount of force frequently or constantly to lift, carry, push, pull or otherwise move objects.
* Typing, communicating, repetitive motions.
* Close visual acuity to prepare and analyze data, view computer monitors and read. May need to view presentation screens and other visual aids in a virtual setting.
* Inside environmental conditions with protection from outside elements.

**Security:** Active Federal Civilian Public Trust clearance

* U.S. citizen, or permanent resident who has lived in the United States for at least 3 years

**Federal Civilian Public Trust**

Consists of a review of, but is not limited to:

* Covers 10 year period and in some instances lifetime events
* OPM Security Investigations Index (SII)
* DOD Defense Central Investigations Index (DCII)
* National Agency Check (NAC) records
* FBI name check
* FBI fingerprint check
* Credit report check
* Written inquiries to previous employers and references listed on the application for employment
* Potential interviews with the subject, spouse, neighbors, supervisor, coworkers
* Law enforcement check
* Court records check
* Education check - Attendance and Degrees Acceptable Credentials

**Tasks/activities include, but are not limited to:**

* Builds and maintains the Capabilities and Services Dashboard displaying real-time latency, error rate, saturation, volume, and incident-free availability for all capabilities and services
* Implements dependency tracking within the dashboard including latency, error rates, and transaction volumes for all capability and service dependencies
* Configures and tunes automated monitoring and alerting mechanisms ensuring personnel are alerted to production issues prior to receipt of user-reported tickets
* Ensures all capabilities and services are individually monitored and tracked for latency, error rate, saturation, volume, and incident-free time
* Maintains an accurate, complete, and auditable alert log including alerted system, description, timestamps, corrective actions, and responsible system
* Continuously tunes alert thresholds to reflect meaningful degradation or disruption while minimizing false positives
* Supports the Capabilities and Services Monitoring Plan defining alert conditions, thresholds, notification mechanisms, and escalation paths
* Coordinates with on-call SREs and the Monitoring and Incident Manager during active incidents to provide real-time dashboard data and historical trend analysis
* Implements additional or revised dashboard metrics

**Compensation and Benefits:**

The annual projected pay range for this position is $102,075 - $156,237, with consideration being given to various factors including but not limited to qualifications, experience, job responsibilities, and geographic location.

Oxley Enterprises, Inc. offers a full array of benefits including:

* Medical, dental, vision and prescription drug coverage for you and your family.
* Life Insurance, short-term disability and long-term disability paid for by the Company.
* Supplemental coverages including Accident, Critical Illness, and Hospital.
* Additional Life insurance coverage for you and your dependents.
* 401k plan with various options to select based on your retirement goals.

Oxley Enterprises, Inc. is a certified service-disabled veteran-owned (SDVOSB), veteran-owned (VOSB), and woman-owned small business (WOSB) that has 26 years of experience building and delivering quality IT systems and programs. Oxley has been ranked in the Inc. 5000 seven times (2016, 2017, 2018, 2021, 2023, 2024, 2025). Oxley is a 2019 - 2025 Department of Labor HIRE Vets Medallion Award Winner. Oxley is Virginia Values Veterans certified.

All qualified applicants will receive consideration for employment without regard to any status protected by applicable federal, state, or local law.

If you require a reasonable accommodation to apply for a position at Oxley Enterprises, Inc., please send an email to our Human Resources Department at careers@oxleyenterprises.com with the following information:

* Subject Line: Accommodation Request
* Provide a description of your accommodation request
* Include your contact information: Full name, Email address, Best number to reach you (optional)

We participate in the E-Verify program. http://www.dhs.gov/E-Verify