We analyze large-scale web data to deliver searchable insights. Our models power information extraction, anomaly detection, semantic search, and trend forecasting. We’re hiring a Data Scientist to push the next wave of ML features.
You will:
- Build and ship ML models: NLP (NER, text classification, relation extraction, topic modeling, embeddings), anomaly detection, and general regression/classification.
- Work with LLMs (RAG, lightweight fine-tuning/LoRA), including their evaluation and guardrails.
- Run experiments and A/B tests; define metrics and perform statistical validation.
- Productionize with MLflow/W&B, feature pipelines, and model serving (batch/REST).
- Partner with Data Engineering and Product on problem framing and metrics.
Must-have:
- 3+ years as a Data Scientist/ML Engineer.
- Strong Python (pandas/polars, scikit-learn) and solid statistics/experimentation.
- Experience with PyTorch/Transformers and NLP.
- Comfortable with messy web data: cleaning, deduplication, normalization, labeling.
- SQL and experience with cloud data warehouses (BigQuery/Snowflake/Redshift).
- Experience across the end-to-end ML lifecycle (from exploration to deployment and monitoring).
- Good command of English and a willingness to challenge assumptions.
Nice-to-have:
- RAG/LLM-Ops (LangChain/LlamaIndex), vector search (FAISS/pgvector/Pinecone).
- Time-series, causal inference, or graph ML.
- Feature stores (Feast), DVC/data versioning.
- Airflow/Dagster, dbt; AWS/GCP; Docker/Kubernetes.
We offer:
- Big influence on the ML roadmap and measurable product outcomes.
- Flexible, outcome-driven environment with quick decisions.
- ESOP, learning/conference budget, modern hardware, team offsites.
Apply: Send your CV/LinkedIn (+ 1–2 project links/notebooks) to hello@banachastreet.com with subject “Data Scientist – Your Name”.
