Job Information
ElasticSearch, Inc. Search - Principal Data Scientist Grove City, Ohio
Elastic, the Search AI Company, enables everyone to find the answers they need in real time, using all their data, at scale - unleashing the potential of businesses and people. The Elastic Search AI Platform, used by more than 50% of the Fortune 500, brings together the precision of search and the intelligence of AI to enable everyone to accelerate the results that matter. By taking advantage of all structured and unstructured data - securing and protecting private information more effectively - Elastic's complete, cloud-based solutions for search, security, and observability help organizations deliver on the promise of AI.

What is The Role:
The Search Data team is responsible for developing and integrating statistical tools and machine learning models within the Search domain in support of semantic search, RAG, query understanding, and ranking and recommendations. As a Principal Data Scientist, you will work closely with our Product teams to lead the innovation, incubation, and prototyping phases of how to evolve and transform our AI/ML driven Search experiences and solutions, with a focus on quickly bringing new ideas to production and into the hands of our customers.

Your primary focus will be driving forward research and development in support of improving semantic search with proprietary models and customized open source models, developing techniques and models for query understanding, increasing ranking flexibility through ranking and recommendation models and related tooling, and developing tooling to help customers design and implement successful end-to-end RAG systems. Furthermore, you'll be investigating aspects of modern agentic search, including reasoning engines, prompt engineering techniques, query understanding, and more. Doing this requires exploring and benchmarking new open source models and existing proprietary Elastic models while keeping up to date with the latest major advancements in the fields of NLP and information retrieval.
If this sounds interesting, we would love to hear from you! Please include whatever info you believe is relevant: resume, GitHub profile, code samples, blog posts and writing samples, links to personal projects, etc.

What You Will Be Doing:
- Explore, select, and benchmark open source and Elastic proprietary models
- Experiment with, evaluate, and test use cases for Generative AI, including RAG, chat, and agentic search
- Keep up to date with the most significant recent developments in the field of NLP and information retrieval
- Engage with the NLP and information retrieval communities (blogs, documentation, Python examples, conference talks, academic papers, etc.)
- Collaborate with cross-functional teams of data scientists, engineers, and product managers
- Promote knowledge sharing and collaboration in a distributed team

What You Bring:
- 8+ years of confirmed experience building and applying NLP to production use cases
- 8+ years of professional software development experience in Python
- Experience in Generative AI, Retrieval Augmented Generation, and information retrieval
- Experience with libraries and frameworks such as PyTorch, transformers, and Pandas
- Experience using collaborative notebook-based workflows (e.g. Jupyter) for prototyping and knowledge sharing
- Expertise in AI/ML quality evaluation and improvement, including balancing tuning techniques with cost/benefit tradeoffs
- Self-motivated, collaborative style, open communicator, experience in a distributed team
- Good attention to detail and highly organized
- Real passion for data, analysis, and achieving excellence
- Experience with Elasticsearch is useful
- An academic background in the domain is also a plus

Additional Information - We Take Care of Our People:
As a distributed company, diversity drives our identity. Whether you're looking to launch a new career or grow an existing one, Elastic is the type of company where you can balan