Backend AI & Data Pipeline Engineer at Seeka Technology


Seeka Technology is Hiring

Job Info:
  • Company: Seeka Technology
  • Position: Backend AI & Data Pipeline Engineer
  • Location: Islamabad, Pakistan
  • Source: SmartRecruiters
  • Published: April 05, 2026
  • Category: Development
  • Type: Full-Time


Job Description

About the role

We are looking for a Backend AI & Data Pipeline Engineer to own the end-to-end data processing infrastructure that powers Yuzee's intelligent course and job matching platform. You will design and maintain scalable, event-driven pipelines that process tens of thousands of daily records, generate semantic embeddings, and feed a growing knowledge graph used for personalised career pathway recommendations.

What you'll do

  • Design and maintain three distinct processing pipelines — scheduled job ingestion, event-driven course processing, and a periodic knowledge graph builder — each with independent trigger logic and cost controls
  • Generate and manage semantic embeddings via Amazon Bedrock (Titan v2), index them in MongoDB Atlas Vector Search, and calibrate similarity thresholds to ensure match accuracy
  • Build and maintain a knowledge graph linking jobs, courses, skills, and industries using FP-Growth association rules and archetype-to-SOC code mapping
  • Build and improve a two-stage discovery and matching API on AWS Lambda — vector retrieval first, then deep eligibility scoring with LLM re-ranking
  • Right-size Fargate Spot instances and design resumable processing loops that tolerate interruption, keeping infrastructure costs under control as data volume scales
  • Maintain and improve daily job scrapers across multiple sources and build institution data scrapers with robust HTML cleaning pipelines

What we're looking for

  • 1+ years of backend engineering experience focused on data pipelines, ML infrastructure, or search systems
  • Hands-on experience with AWS serverless and container services — Lambda, ECS Fargate, EventBridge, and Step Functions
  • Strong Python skills — Pandas, async processing, bulk database operations, and text cleaning
  • Familiarity with vector databases and semantic similarity search; MongoDB Atlas Vector Search experience is a strong plus
  • Cost-conscious infrastructure mindset — you think in per-record compute costs, free tiers, Spot resilience, and right-sizing
  • Ability to document and communicate complex architecture clearly to both technical and non-technical stakeholders

Nice to have

  • Experience with knowledge graphs or association rule mining (FP-Growth, Apriori)
  • Experience using LLMs for re-ranking or eligibility assessment on top of vector retrieval results
  • Background in edtech, jobtech, or recommendation/matching systems

A degree or equivalent proven experience

Benefits

  • Fully remote / work-from-home role
  • Flexible working hours within the team’s expected schedule and business needs
  • Opportunity to work on real backend, data, and AI infrastructure projects
  • Exposure to practical engineering challenges in scraping, pipelines, retrieval, and cloud systems
  • Ongoing growth and development within a fast-moving technology environment
  • Opportunity to build long-term value and grow with the company based on performance, including progression and increased responsibility over time
