Junior DevSecOps Engineer

SwiftMind Indonesia

Tangerang, Banten, Indonesia

Fresher

Posted 10 days ago

Job Description

Responsibilities

Position:

Junior DevSecOps Engineer/Junior Site Reliability Engineer
Focus: day-to-day infrastructure operations, reliability, and baseline security for production environments.

Role Purpose

Operate and maintain production systems to ensure availability, performance, and security hygiene across:

Linux-hosted web applications (primarily Python-based, plus supporting API/services)
Integration services (service-based components, e.g., compiled services such as Go/Java) and workflow orchestration/scheduling
Relational databases (e.g., MySQL/MariaDB and PostgreSQL)
Observability tooling (dashboards, metrics, alerts, and logs, e.g., Grafana or equivalent)

This is an execution-oriented role: monitoring, incident response, routine maintenance, safe changes, and continuous operational improvement.

Key Responsibilities

Production Reliability & Operations
Perform daily health checks for applications, services, and infrastructure (CPU/memory/disk/network).
Monitor metrics and alerts; identify anomalies and take first-response actions based on runbooks.
Troubleshoot incidents using a structured approach (symptom → scope → evidence → mitigation → escalation).
Maintain service availability by applying safe operational actions (restart, rollback, failover steps where applicable).
Observability (Metrics, Logs, Alerts)
Read and interpret dashboards (latency, throughput, error rate, saturation, DB connections, queue depth).
Investigate issues using common log sources and system logs; collect evidence for post-incident review.
Maintain alert hygiene: reduce noise, validate thresholds, ensure alerts map to actionable playbooks.
Support creation/updating of operational dashboards and basic SLO/SLI tracking (where required).
Platform/Application Operations
Support application runtimes and background processing:
process/worker health, schedulers, job queues, cron/systemd services
configuration verification and environment consistency checks
Assist with deployments in collaboration with developers:
pre-checks, smoke tests, rollback readiness, and post-deploy monitoring
Perform routine maintenance: log rotation validation, cleanup tasks, certificate checks, and capacity housekeeping.
Database Operational Support (MySQL/MariaDB & PostgreSQL)
Perform operational checks:
connectivity, replication/HA status (if used), storage growth, connection usage
Verify and test backups:
ensure backups run, validate restore procedures periodically (under supervision)
Support troubleshooting:
identify symptoms of locking/contention, slow queries, and connection exhaustion
collect evidence (process list/activity, slow logs, relevant metrics)
DevSecOps Baseline Security Hygiene
Apply standard security practices:
SSH key management, least privilege, secure access patterns, secrets hygiene
Support patching routines:
OS updates, package updates, vulnerability remediation scheduling
Assist with hardening and exposure checks:
firewall/security group rules, port exposure reviews, TLS/cert validity checks
Ensure operational compliance basics:
access logs, change tracking, and minimal audit readiness.
Documentation & Continuous Improvement
Maintain and follow runbooks/SOPs for recurring tasks and incidents.
Write short incident notes (what happened, impact, mitigation, follow-up actions).
Automate repetitive checks with scripts (Bash/Python) and simple tooling.
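
As an illustration of the kind of repetitive check that could be scripted, here is a minimal Python sketch that flags high disk usage and high CPU load on a Linux host. The thresholds and the function names (`disk_used_fraction`, `evaluate`) are hypothetical and not part of this posting; real limits would come from the team's runbooks.

```python
import os
import shutil

# Hypothetical thresholds; real values would come from the team's runbooks.
DISK_USED_LIMIT = 0.90    # flag when a filesystem is more than 90% full
LOAD_PER_CPU_LIMIT = 1.5  # flag when 1-minute load exceeds 1.5x the CPU count


def disk_used_fraction(path="/"):
    """Return the used fraction (0.0 to 1.0) of the filesystem holding path."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def evaluate(disk_fraction, load_1m, cpu_count):
    """Return a list of findings; an empty list means every check passed."""
    findings = []
    if disk_fraction > DISK_USED_LIMIT:
        findings.append(
            f"disk usage {disk_fraction:.0%} exceeds {DISK_USED_LIMIT:.0%}"
        )
    if load_1m / cpu_count > LOAD_PER_CPU_LIMIT:
        findings.append(f"load {load_1m:.2f} is high for {cpu_count} CPU(s)")
    return findings


if __name__ == "__main__":
    # os.getloadavg() is available on Linux and other Unix-like systems.
    load_1m, _, _ = os.getloadavg()
    problems = evaluate(disk_used_fraction("/"), load_1m, os.cpu_count() or 1)
    print("\n".join(problems) if problems else "OK")
```

Keeping the decision logic in `evaluate` separate from the code that reads the system makes the thresholds easy to test without touching a production host.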

Must-Have Requirements

Technical

Comfortable operating Linux servers via SSH (strong terminal usage is mandatory; Termius or similar SSH client experience preferred).
Able to read dashboards and interpret metrics using Grafana or an equivalent (Datadog, New Relic, Prometheus UI, etc.).
Basic understanding of web/service fundamentals:
HTTP/HTTPS, TLS basics, reverse proxy concepts, ports, DNS basics
Basic operational knowledge of relational databases:
MySQL/MariaDB or PostgreSQL concepts (connections, queries, backups, locking symptoms)
Basic scripting:
Bash and/or Python for routine automation and checks
Familiar with version control basics (Git) and disciplined change practices.

Behavioral

Strong operational mindset: careful, systematic, and calm during incidents.
Can follow runbooks, communicate clearly, and escalate with context.
Willing to participate in an on-call/standby rotation (if applicable).

Nice-to-Have (Preferred)

Exposure to any orchestration/scheduling tool (Airflow, cron-based platforms, CI schedulers, etc.).
Experience with containers (Docker) and/or virtualization (VMware/Proxmox).
Familiarity with common components:
Nginx/HAProxy, Redis/queues, message brokers, object storage
Basic security tooling familiarity:
Vulnerability scanning concepts, CIS-style hardening, MFA/SSO integration awareness
Any experience building/maintaining monitoring/alert rules and dashboards.

Tools & Practical Skills We Expect

SSH and Linux triage commands: systemctl, journalctl, top/htop, free, df/du, iostat, ss/netstat, curl, tail, grep

Working With Monitoring

Understanding of latency, error rate, saturation, throughput, resource bottlenecks
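
To make two of these signals concrete, the sketch below computes an error rate and a latency percentile from raw request samples. It is an illustration only: the function names are hypothetical, and monitoring tools such as Grafana normally derive these values from their data sources, possibly with a different percentile method than the nearest-rank one used here.

```python
import math


def error_rate(status_codes):
    """Fraction of requests that ended in a server error (HTTP 5xx)."""
    if not status_codes:
        return 0.0
    errors = sum(1 for code in status_codes if 500 <= code <= 599)
    return errors / len(status_codes)


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


if __name__ == "__main__":
    # Made-up sample data for demonstration.
    codes = [200, 200, 500, 503]
    latencies_ms = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
    print(error_rate(codes))             # 0.5
    print(percentile(latencies_ms, 95))  # 100
    print(percentile(latencies_ms, 50))  # 50
```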

Basic DB Checks

Connection count, active queries, long-running queries, storage growth

Communication and Documentation

Incident updates, handover notes, minimal post-incident summary

Recommended Screening (Hands-on)

Linux triage: given a service-down or high-latency scenario, show step-by-step checks and safe actions.
Dashboard reading: interpret a scenario (latency spike + error increase) and propose likely causes + next checks.
DB ops basics: how to detect connection exhaustion, locking symptoms, and what evidence to collect.
Security hygiene: explain safe SSH access, key handling, secrets, and patching routines.
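
For the DB ops item, one common way to spot approaching connection exhaustion is to compare the current connection count with the configured maximum. In MySQL/MariaDB these are typically read with `SHOW STATUS LIKE 'Threads_connected'` and `SHOW VARIABLES LIKE 'max_connections'`; in PostgreSQL with `SELECT count(*) FROM pg_stat_activity` and `SHOW max_connections`. The Python sketch below only classifies numbers that have already been collected. The function name and the 80%/95% thresholds are assumptions for illustration, not values from this posting.

```python
def classify_connection_usage(current, maximum, warn_at=0.80, critical_at=0.95):
    """Classify connection usage as "ok", "warning", or "critical".

    current: number of open connections right now
    maximum: the server's configured connection limit
    warn_at and critical_at are hypothetical thresholds (fractions of maximum).
    """
    if maximum <= 0:
        raise ValueError("maximum must be a positive number")
    usage = current / maximum
    if usage >= critical_at:
        return "critical"
    if usage >= warn_at:
        return "warning"
    return "ok"


if __name__ == "__main__":
    # Made-up readings for demonstration.
    print(classify_connection_usage(50, 100))  # ok
    print(classify_connection_usage(85, 100))  # warning
    print(classify_connection_usage(97, 100))  # critical
```

A "warning" or "critical" result would normally be followed by collecting evidence, such as the process list or activity view, before any mitigation is attempted.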

Experience & Education Guidance

0-2 years of relevant experience (DevSecOps/SRE/Infra Support) or strong internship/homelab proof.
Fresh graduates are acceptable if they demonstrate strong terminal skills + monitoring literacy.