PT Indosat Tbk

Data Engineering

3-5 Years
  • Posted 10 hours ago

Job Description

Company Overview:

Our organization is a leading innovator in cybersecurity, cloud, and AI solutions, dedicated to developing cutting-edge products and services that address the evolving needs of the technology landscape. We operate in Indonesia, a rapidly developing market where demand for advanced technology solutions keeps growing. We are an AI-native company committed to continuous improvement, helping our customers unlock their full revenue potential.

Role Summary

As a Data Engineer, you will play a crucial role in building and managing the data pipelines used to train and fine-tune our Large Language Models (LLMs), with a specific focus on the Indonesian language. You will be responsible for designing, building, and maintaining a robust and scalable data infrastructure. You will collaborate closely with our team of Data Scientists and Machine Learning Engineers to ensure the availability of high-quality, clean, and structured Indonesian language data for developing accurate and locally relevant AI models.

Key Responsibilities

  • Build and Manage Data Pipelines: Design, develop, and maintain ETL (Extract, Transform, Load) processes to collect and process Indonesian text data from various sources, such as databases, APIs, and log files.
  • Data Collection and Integration: Gather complex and relevant datasets tailored to business needs, particularly Indonesian text data that covers a wide range of dialects and linguistic styles.
  • Data Cleaning and Pre-processing: Clean data to handle inconsistent, duplicate, or corrupted records, and transform raw data into a format usable for training machine learning models.
  • Data Architecture: Design and implement an efficient and scalable data architecture, including data warehouses and data lakes, to store and manage large volumes of data.
  • Ensure Data Quality: Develop data validation methods and analysis tools to ensure the integrity and accuracy of the data used for model training.
  • Team Collaboration: Work closely with Data Scientists to understand their data requirements and provide ready-to-use data for the fine-tuning and evaluation of LLMs.
  • Performance Optimization: Monitor and optimize the performance of data pipelines to ensure efficiency and scalability, especially when handling very large volumes of data.

Requirements

Qualifications & Experience:

Education

  • Required: Bachelor's degree in Computer Science, Engineering, Information Technology, or a related quantitative field.
  • Preferred: Master's degree in Computer Science.

Experience

  • Required: 3 to 5 years of hands-on experience in a data engineering role, particularly on projects involving big data and machine learning, with a proven track record of designing and implementing data pipelines and architecture, including data ingestion, storage, processing, and delivery.
  • Preferred: Experience with Big Data technologies (e.g., Hadoop, Spark). Experience with cloud platforms (e.g., AWS, GCP, Azure) and their associated data services. Familiarity with DevOps/DataOps principles for CI/CD.

Required Skills

Technical Skills:

  • Programming Languages: High proficiency in programming languages such as Python, SQL, and Scala.
  • Databases: Deep understanding of relational databases (like MySQL, PostgreSQL) and NoSQL databases (like MongoDB).
  • Big Data Tools: Hands-on experience with big data technologies such as Apache Spark, Hadoop, and Kafka.
  • Cloud Computing: Knowledge of cloud platforms like Amazon Web Services (AWS), Google Cloud Platform (GCP), or Microsoft Azure.
  • ETL Tools: Familiarity with ETL tools like Apache Airflow, Talend, or Stitch.

Soft Skills:

  • Strong analytical and problem-solving abilities.
  • Excellent communication and teamwork skills to collaborate effectively with various teams.
  • Ability to work independently in a dynamic environment.

Competencies

Technical:

  • Architecture Design
  • Business Needs Analysis
  • Data Analysis and Interpretation
  • Infrastructure Design
  • Software Design
  • Solution Architecture
  • System Architecture Design
  • System Configuration Management
  • System Integration

Leadership:

  • Applied Learning
  • Building Customer Loyalty
  • Business Awareness
  • Collaborating
  • Continuous Improvement
  • Planning & Organizing
  • Quality Orientation

Job ID: 145695351