
practical devsecops

AI Engineer

  • Posted 16 hours ago

Job Description

AI Engineer — Model Training & AI Exploration

About The Project

We are building a comprehensive Quran Recitation Learning Platform — a production system that helps users practice and improve their Quran recitation using real-time AI-powered speech recognition, Tajweed rule analysis, and personalized audio feedback. The platform consists of a React Native mobile app, a FastAPI backend, and multiple GPU-accelerated microservices.

Our AI pipeline currently processes thousands of audio recordings, combining ASR (Automatic Speech Recognition), Tajweed analysis, pronunciation validation, and TTS (Text-to-Speech) feedback generation — all running as containerized gRPC microservices with CUDA acceleration.

Role Overview

We are looking for an AI Engineer to own and advance the model training pipeline and explore new AI approaches to improve our Quran recitation system. You will work with production ASR models and Tajweed analysis — improving accuracy, reducing latency, and expanding capabilities.

This is a hands-on role focused on fine-tuning, evaluation, scoring improvement, and AI R&D, not just API integration. You will be the primary person responsible for improving our AI models and scoring.

What You'll Do

Scoring Improvement

  • Develop methods to improve Tajweed scoring and word error calculation
  • Write scripts for a test harness

Model Training & Fine-Tuning

  • Fine-tune ASR models for Quranic Arabic using NVIDIA NeMo (FastConformer Hybrid RNNT/CTC architecture)
  • Train and optimize custom models for Tajweed rule detection (currently Whisper-based)
  • Train pronunciation validation models using Wav2Vec2 for harakat (diacritics) error detection
  • Build and maintain training data pipelines — data collection, cleaning, augmentation, and quality control
  • Develop evaluation harnesses with automated metrics (WER, CER, Tajweed accuracy, speaker similarity)
  • Manage experiment tracking (MLflow / Weights & Biases) and model versioning
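
As a rough illustration of the WER and CER metrics mentioned above, here is a minimal, dependency-free sketch. It is not the project's actual harness: the function names are hypothetical, and whether to strip harakat or normalize text before comparison is left open. Ignoring spaces in CER is an assumption; conventions vary.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences (two-row DP)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate, ignoring spaces (an assumption of this sketch)."""
    ref_chars = list(reference.replace(" ", ""))
    if not ref_chars:
        raise ValueError("reference must contain at least one character")
    hyp_chars = list(hypothesis.replace(" ", ""))
    return edit_distance(ref_chars, hyp_chars) / len(ref_chars)


if __name__ == "__main__":
    # One of four reference words is missing from the hypothesis.
    print(wer("بسم الله الرحمن الرحيم", "بسم الله الرحيم"))
```

In a real harness these functions would run over a held-out set of reference/hypothesis pairs, with the aggregate logged per model version.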

AI Exploration & R&D

  • Research and prototype new architectures for Quranic Arabic ASR (conformer variants, Whisper fine-tuning, custom tokenizers)
  • Explore on-device / edge deployment of lightweight ASR models for mobile inference
  • Experiment with LLM-based approaches for contextual recitation feedback and error explanation
  • Benchmark alternative models (e.g., Whisper large-v3, SeamlessM4T, custom conformer) against the current pipeline
  • Research voice activity detection (VAD) and audio segmentation optimized for Quranic recitation patterns

Current System You'll Improve

Our AI Pipeline Today

Mobile App (React Native)
↓ Audio (WAV 16kHz)
Backend (FastAPI + Socket.IO)
↓ gRPC
├── QuranASRNemo (port 50051) -- NeMo FastConformer, streaming + offline
├── QuranASRTajweed (port 50053) -- Whisper-based Tajweed rule detection
├── QuranASRWav2Vec2 (port 50054) -- Raw pronunciation validation
└── QuranFeedback (port 50052) -- Coqui XTTS v2 TTS with voice cloning (disabled for now)
↓
Weighted Scoring → Accuracy + Tajweed Violations + Pronunciation Errors (needs improvement)
↓
Audio Feedback (TTS) + Text Feedback → Mobile App (disabled for now)

Known Areas For Improvement You'd Tackle

  • Hardcoded confidence scores (currently fixed at 0.9 regardless of actual model output)
  • GPU inference serialization bottleneck (single lock, no batching)
  • No model versioning or experiment tracking infrastructure
  • Scoring thresholds lack empirical calibration (current heuristic: 45/25/15/15 split)
  • TTS voice cloning path bug (hardcoded speaker reference)
  • No training data pipeline or data quality tooling exists yet
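
To make two of the items above concrete, the sketch below shows one possible way to derive a confidence value from model output instead of a fixed 0.9, and to combine component scores with an explicit weight table. Everything here is an assumption for illustration: the posting does not say which component receives which share of the 45/25/15/15 split, so the component names are hypothetical, and the geometric-mean confidence is just one common choice, not the project's method.

```python
import math
from typing import Mapping, Sequence


def sequence_confidence(token_logprobs: Sequence[float]) -> float:
    """Geometric mean of token probabilities, from natural-log probabilities."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(sum(token_logprobs) / len(token_logprobs))


def weighted_score(scores: Mapping[str, float],
                   weights: Mapping[str, float]) -> float:
    """Combine component scores in [0, 1] into a 0-100 score."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same components")
    if not math.isclose(sum(weights.values()), 1.0, abs_tol=1e-9):
        raise ValueError("weights must sum to 1")
    for name, value in scores.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"score for {name!r} must be in [0, 1]")
    return 100.0 * sum(weights[name] * scores[name] for name in weights)


# Hypothetical mapping of the 45/25/15/15 split onto component names.
EXAMPLE_WEIGHTS = {
    "word_accuracy": 0.45,
    "tajweed": 0.25,
    "pronunciation": 0.15,
    "fluency": 0.15,
}

if __name__ == "__main__":
    example_scores = {
        "word_accuracy": 0.8,
        "tajweed": 0.6,
        "pronunciation": 1.0,
        "fluency": 0.5,
    }
    print(weighted_score(example_scores, EXAMPLE_WEIGHTS))
    print(sequence_confidence([math.log(0.9), math.log(0.8)]))
```

Keeping the weights in a single table makes them easy to replace once thresholds are calibrated against labeled recitations.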

Notes

Model training and fine-tuning are not the primary focus for now, but they are welcome if you want to take them on.


Job ID: 146617289