This role is for one of Weekday's clients.
Salary range: Rs 2500000 - Rs 3000000 (i.e., INR 25-30 LPA)
Min Experience: 3 years
Location: Bangalore
JobType: full-time
Key Responsibilities
Take full ownership of the data engineering function end to end, from data ingestion through the serving layer, operating as a highly autonomous individual contributor.
Develop both real-time and batch ingestion pipelines for prediction market APIs (Polymarket Gamma/CLOB, Kalshi), sportsbook odds feeds (Pinnacle), and statistical sources (HLTV, ESPN, Flashscore).
Architect and implement a Medallion architecture (Bronze → Silver → Gold) for market data, on-chain orderbook snapshots, historical odds, match results, and player/team statistics.
Create the feature store that powers our AI edge models, including Elo ratings, Bradley-Terry map-veto probabilities, Bayesian calibration indicators, and Kelly sizing inputs.
Implement WebSocket listeners and streaming infrastructure to track live odds fluctuations, in-play probability updates, and orderbook depth changes, targeting sub-second latency.
Construct nightly batch pipelines for model-retraining data, including historical odds versus outcomes, walk-forward backtesting datasets, and profit & loss reconciliation across exchanges.
Establish cloud infrastructure (AWS/GCP), manage job orchestration (Airflow/Prefect/cron), and deploy monitoring, alerting, and data quality checks throughout all pipelines.
Develop data APIs and caching layers that provide the trading terminal frontend with standardized, low-latency market data across all supported exchanges.
Conduct research and development on new data sources, scraping techniques, and tools to continuously broaden market coverage and enhance data freshness.
Qualifications
Minimum of 3 years of practical Data Engineering experience building scalable production pipelines.
Proficient in Python (including Pandas, asyncio, aiohttp, requests, BeautifulSoup/Scrapy), with advanced SQL skills.
Experience with PostgreSQL, Redis/DragonflyDB, and cloud platforms such as AWS (S3, Lambda, RDS) or their GCP equivalents.
Hands-on expertise with WebSockets, streaming data, and real-time event-driven system architectures.
Skilled in REST API integration, webhook configuration, and large-scale web scraping (handling rate limits, proxy rotation, anti-bot measures).
Familiarity with workflow orchestration tools (Airflow, Prefect, or Dagster) and CI/CD pipelines on Linux/Docker environments.
Strong foundation in database design, Medallion/lakehouse architectures, and data modelling for analytical purposes.
Clear, well-structured communication skills in English.
Ability to manage execution independently, proactively troubleshooting and resolving issues without supervision.
A meticulous data quality mindset, instinctively ensuring data accuracy and reliability.
Resourceful problem solver, capable of devising quick workarounds and fixes for API failures or changes in rate limits.
Maintains thorough documentation for schema definitions, pipeline DAGs, and failure runbooks.
Comfortable with rapid iteration cycles and designing extensible pipeline architectures.
Strong cross-functional communication skills to clearly convey technical information to non-technical stakeholders.
Skills
Python, SQL, PostgreSQL, Redis, AWS, GCP, API Integration, Web Scraping, Airflow, Prefect, Dagster, WebSockets, Streaming Data, Docker, Data Modelling