Data Scientist / Data Engineer

Job description

Responsibilities

  • Modeling & Analytics: Explore data, engineer features, develop and evaluate ML/AI models (supervised/unsupervised, NLP, generative/LLM use cases), and communicate insights clearly to stakeholders.
  • LLM & RAG Solutions: Build and optimize LLM‑based applications (prompting, fine‑tuning/adapter methods) and implement RAG pipelines for grounded responses and enterprise use.
  • Applications & Tooling: Develop end‑to‑end apps using Streamlit (frontend and backend) and build scalable workflows and Databricks Apps for data processing, experimentation, and deployment.
  • Data & MLOps: Work with Python/SQL and modern data platforms to productionize models (versioning, monitoring, retraining, CI/CD) with an emphasis on performance, reliability, and security.
  • Collaboration: Partner with product, engineering, and business teams to identify use cases, scope experiments, and translate outcomes into measurable impact.

Qualifications

Required

  • Bachelor's degree in a quantitative field (e.g., Data Science, Computer Science, Statistics) or equivalent practical experience.
  • Proficiency in Python and SQL; experience with ML libraries/frameworks (e.g., scikit‑learn, PyTorch or TensorFlow, Hugging Face).
  • Hands‑on experience with LLMs, RAG architectures, and evaluation methods.
  • Practical experience building Databricks workflows or Databricks Apps.
  • Ability to create Streamlit apps end‑to‑end (UI, business logic, and model integration).
  • Strong problem‑solving skills and the ability to communicate technical concepts to non‑technical audiences.

Preferred

  • Experience with vector databases and embeddings; familiarity with data lakes/Delta formats and distributed computing (e.g., Spark).
  • Knowledge of MLOps practices (MLflow, CI/CD, monitoring) and containerization (Docker).
  • Exposure to at least one major cloud platform (Azure/AWS/GCP).
  • Understanding of data governance, privacy, and responsible AI principles.