
  • Posted 10 days ago

Job Description

Responsibilities

Position:

  • Junior DevSecOps Engineer/Junior Site Reliability Engineer
  • Focus: day-to-day infrastructure operations, reliability, and baseline security for production environments.



Role Purpose

Operate and maintain production systems to ensure availability, performance, and security hygiene across:

  • Linux-hosted web applications (primarily Python-based, plus supporting API/services)
  • Integration services (service-based components, e.g., compiled services such as Go/Java) and workflow orchestration/scheduling
  • Relational databases (e.g., MySQL/MariaDB and PostgreSQL)
  • Observability tooling (dashboards, metrics, alerts, and logs, e.g., Grafana or equivalent)

This is an execution-oriented role: monitoring, incident response, routine maintenance, safe changes, and continuous operational improvement.



Key Responsibilities

  • Production Reliability & Operations
    • Perform daily health checks for applications, services, and infrastructure (CPU/memory/disk/network).
    • Monitor metrics and alerts; identify anomalies and take first-response actions based on runbooks.
    • Troubleshoot incidents using a structured approach (symptom → scope → evidence → mitigation → escalation).
    • Maintain service availability by applying safe operational actions (restart, rollback, failover steps where applicable).
  • Observability (Metrics, Logs, Alerts)
    • Read and interpret dashboards (latency, throughput, error rate, saturation, DB connections, queue depth).
    • Investigate issues using common log sources and system logs; collect evidence for post-incident review.
    • Maintain alert hygiene: reduce noise, validate thresholds, ensure alerts map to actionable playbooks.
    • Support creation/updating of operational dashboards and basic SLO/SLI tracking (where required).
  • Platform/Application Operations
    • Support application runtimes and background processing:
      • process/worker health, schedulers, job queues, cron/systemd services
      • configuration verification and environment consistency checks
    • Assist with deployments in collaboration with developers:
      • pre-checks, smoke tests, rollback readiness, and post-deploy monitoring
    • Perform routine maintenance: log rotation validation, cleanup tasks, certificate checks, and capacity housekeeping.
  • Database Operational Support (MySQL/MariaDB & PostgreSQL)
    • Perform operational checks:
      • connectivity, replication/HA status (if used), storage growth, connection usage
    • Verify and test backups:
      • ensure backups run, validate restore procedures periodically (under supervision)
    • Support troubleshooting:
      • identify symptoms of locking/contention, slow queries, and connection exhaustion
      • collect evidence (process list/activity, slow logs, relevant metrics)
  • DevSecOps Baseline Security Hygiene
    • Apply standard security practices:
      • SSH key management, least privilege, secure access patterns, secrets hygiene
    • Support patching routines:
      • OS updates, package updates, vulnerability remediation scheduling
    • Assist with hardening and exposure checks:
      • firewall/security group rules, port exposure reviews, TLS/cert validity checks
    • Ensure operational compliance basics:
      • access logs, change tracking, and minimal audit readiness
  • Documentation & Continuous Improvement
    • Maintain and follow runbooks/SOPs for recurring tasks and incidents.
    • Write short incident notes (what happened, impact, mitigation, follow-up actions).
    • Automate repetitive checks with scripts (Bash/Python) and simple tooling.

Must-Have Requirements

Technical

  • Comfortable operating Linux servers via SSH (strong terminal usage is mandatory; Termius or similar SSH client experience preferred).
  • Able to read dashboards and interpret metrics using Grafana or equivalent (Datadog/New Relic/Prometheus UI etc.).
  • Basic understanding of web/service fundamentals:
    • HTTP/HTTPS, TLS basics, reverse proxy concepts, ports, DNS basics
  • Basic operational knowledge of relational databases:
    • MySQL/MariaDB or PostgreSQL concepts (connections, queries, backups, locking symptoms)
  • Basic scripting:
    • Bash and/or Python for routine automation and checks
  • Familiar with version control basics (Git) and disciplined change practices.

Behavioral

  • Strong operational mindset: careful, systematic, and calm during incidents.
  • Can follow runbooks, communicate clearly, and escalate with context.
  • Willing to work with an on-call/standby rotation (if applicable).



Nice-to-Have (Preferred)

  • Exposure to any orchestration/scheduling tool (Airflow, cron-based platforms, CI schedulers, etc.).
  • Experience with containers (Docker) and/or virtualization (VMware/Proxmox).
  • Familiarity with common components:
    • Nginx/HAProxy, Redis/queues, message brokers, object storage
  • Basic security tooling familiarity:
    • Vulnerability scanning concepts, CIS-style hardening, MFA/SSO integration awareness
  • Any experience building/maintaining monitoring/alert rules and dashboards.



Tools & Practical Skills We Expect

  • SSH and Linux triage commands: systemctl, journalctl, top/htop, free, df/du, iostat, ss/netstat, curl, tail, grep
  • Working with monitoring: understanding of latency, error rate, saturation, throughput, resource bottlenecks
  • Basic DB checks: connection count, active queries, long-running queries, storage growth
  • Communication and documentation: incident updates, handover notes, minimal post-incident summary



Recommended Screening (Hands-on)

  • Linux triage: given a service-down or high-latency scenario, show step-by-step checks and safe actions.
  • Dashboard reading: interpret a scenario (latency spike + error increase) and propose likely causes + next checks.
  • DB ops basics: how to detect connection exhaustion, locking symptoms, and what evidence to collect.
  • Security hygiene: explain safe SSH access, key handling, secrets, and patching routines.



Experience & Education Guidance

  • 0–2 years of relevant experience (DevSecOps/SRE/Infra Support) or strong internship/homelab proof.
  • Fresh graduates are acceptable if they demonstrate strong terminal skills + monitoring literacy.

Job ID: 135172603