Kateryna Hrytsaienko
Multilingual NLP pipeline up and running from scratch
#1 about 3 minutes
The challenge of building end-to-end NLP pipelines
There is a lack of comprehensive guides for integrating multilingual NLP models into applications with proper CI/CD practices, especially for non-English languages.
#2 about 5 minutes
Understanding the core components of an NLP pipeline
A typical NLP pipeline consists of three key stages: pre-processing, feature extraction, and modeling, with pre-processing being critical for handling unstructured data.
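As a rough illustration of the three stages named above, the following minimal sketch chains a pre-processing step, a feature-extraction step, and a stand-in "model". The function names, the keyword table, and the keyword-counting model are assumptions made for this example; they are not taken from the talk.

```python
import re
from collections import Counter

# Hypothetical keyword lists standing in for a trained model.
KEYWORDS = {
    "sports": {"match", "goal", "team"},
    "tech": {"software", "model", "code"},
}

def preprocess(text):
    """Stage 1 (pre-processing): lowercase raw text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())

def extract_features(tokens):
    """Stage 2 (feature extraction): turn tokens into bag-of-words counts."""
    return Counter(tokens)

def predict(features, keywords=KEYWORDS):
    """Stage 3 (modeling): pick the label whose keywords occur most often."""
    scores = {
        label: sum(features[word] for word in words)
        for label, words in keywords.items()
    }
    return max(scores, key=scores.get)

def run_pipeline(text):
    """Run the three stages in order on unstructured input text."""
    return predict(extract_features(preprocess(text)))
```

Calling `run_pipeline("The team scored a late GOAL in the match!")` walks the text through all three stages; in a real system the last stage would be a trained classifier rather than a keyword table.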
#3 about 8 minutes
Why simply translating everything to English is not enough
Translating all text to English for NLP analysis can decrease accuracy by up to 20% due to lost semantic nuance and dialectal differences.
#4 about 10 minutes
Generalizing languages with stemming and bag-of-words
Handle similar languages by using stemming to find common root words and a bag-of-words model with a similarity index to treat them as a single language.
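A minimal sketch of this idea, under stated assumptions: the "stemmer" below simply truncates each word to its first five characters, and the similarity index is the Jaccard index over the sets of stems. The talk summary does not say which stemmer or which index was used, and the Spanish and Portuguese sample words are chosen only for illustration.

```python
import re
from collections import Counter

def crude_stem(word, length=5):
    """Assumed stand-in for a real stemmer: keep only the first `length` characters."""
    return word.lower()[:length]

def bag_of_stems(text):
    """Bag-of-words over stems instead of surface forms."""
    return Counter(crude_stem(w) for w in re.findall(r"\w+", text))

def jaccard_index(bag_a, bag_b):
    """Share of distinct stems that the two bags have in common (0.0 to 1.0)."""
    stems_a, stems_b = set(bag_a), set(bag_b)
    union = stems_a | stems_b
    return len(stems_a & stems_b) / len(union) if union else 0.0

# Illustrative samples: the same four words in Spanish and in Portuguese.
spanish = bag_of_stems("universidad nacional importante ciudad")
portuguese = bag_of_stems("universidade nacional importante cidade")

similarity = jaccard_index(spanish, portuguese)
```

Three of the five distinct stems are shared here, so the index comes out at 0.6. A pipeline could compare such a score against a threshold it has chosen to decide whether two languages can share one vocabulary and one model.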
#5 about 5 minutes
Achieving high accuracy with a unified language model
Training classifiers on stemmed and normalized vectors from multiple similar languages can reach around 90% accuracy in tasks like topic classification.
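To make the idea of one classifier over several similar languages concrete, here is a small sketch: a multinomial Naive Bayes with add-one smoothing, trained on stemmed tokens from mixed Spanish and Portuguese sentences. The choice of Naive Bayes, the truncation stemmer, and the four training sentences are all assumptions for illustration; the sketch says nothing about the accuracy figures mentioned in the talk.

```python
import math
import re
from collections import Counter, defaultdict

def stems(text, length=5):
    """Normalize text: lowercase, tokenize, truncate each token (crude stemming)."""
    return [w[:length] for w in re.findall(r"\w+", text.lower())]

class TinyNaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> stem counts
        self.doc_counts = Counter()              # label -> number of documents
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            tokens = stems(text)
            self.word_counts[label].update(tokens)
            self.doc_counts[label] += 1
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = stems(text)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus smoothed log likelihood of every token.
            score = math.log(n_docs / total_docs)
            score += sum(math.log((counts[t] + 1) / denom) for t in tokens)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Toy training set mixing Spanish and Portuguese (illustrative only).
TRAIN = [
    ("el equipo gana el partido", "sports"),
    ("a equipe ganha o jogo", "sports"),
    ("la economía y el mercado", "economy"),
    ("a economia e o mercado", "economy"),
]
model = TinyNaiveBayes().fit([t for t, _ in TRAIN], [l for _, l in TRAIN])
```

Because "equipo" and "equipe" collapse to the same stem, as do "economía" and "economia", evidence from one language contributes to predictions in the other, for example `model.predict("o mercado financeiro")`.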
#6 about 8 minutes
Choosing the right deployment strategy for your model
Decide between embedding your model or exposing it as an API, considering options like serverless for simple cases or Kubernetes for scalable, cloud-agnostic deployments.
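One way to picture the "expose it as an API" option is a small HTTP endpoint in front of the model. The sketch below uses only the Python standard library; the `/predict` path, the JSON shape, and the placeholder `classify` function are assumptions for this example, and a production service would more likely sit behind a proper web framework and server.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def classify(text):
    """Hypothetical placeholder for the real model call."""
    return "sports" if "team" in text.lower() else "other"

def handle_predict(raw_body):
    """Validate the request body and return (HTTP status, JSON-serializable dict)."""
    try:
        text = json.loads(raw_body)["text"]
    except (ValueError, KeyError, TypeError):
        return 400, {"error": "expected a JSON object with a 'text' field"}
    if not isinstance(text, str):
        return 400, {"error": "'text' must be a string"}
    return 200, {"label": classify(text)}

class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        status, body = handle_predict(self.rfile.read(length))
        data = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

def make_server(port=8080):
    """Build the server without starting it; the caller decides when to serve."""
    return HTTPServer(("0.0.0.0", port), PredictHandler)
```

Running `make_server().serve_forever()` starts the endpoint. Keeping the request logic in `handle_predict` separates it from the transport, so the same function could be reused inside a serverless handler or a container running on Kubernetes.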
#7 about 7 minutes
Implementing a CI/CD pipeline for your NLP model
Establish an MLOps workflow with continuous training, integration, and delivery by containerizing your model with Docker and automating builds with tools like GitHub Actions.
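As one possible building block for such a workflow, a CI job could run a quality gate after each retraining and refuse to build or publish the container image if the model falls below a chosen accuracy. The sketch below is an assumption about how that gate might look; the function names, the 0.9 threshold, and the evaluation examples are hypothetical and not taken from the talk.

```python
def accuracy(predict, examples):
    """Fraction of (text, expected_label) pairs the model labels correctly."""
    if not examples:
        raise ValueError("need at least one evaluation example")
    correct = sum(1 for text, label in examples if predict(text) == label)
    return correct / len(examples)

def quality_gate(predict, examples, min_accuracy=0.9):
    """Return a process exit code: 0 lets the pipeline continue, 1 stops it."""
    score = accuracy(predict, examples)
    print(f"accuracy={score:.3f} threshold={min_accuracy:.3f}")
    return 0 if score >= min_accuracy else 1

# Hypothetical held-out examples and a stub model, for demonstration only.
EVAL_SET = [("el equipo gana", "sports"), ("o mercado cai", "economy")]

def stub_predict(text):
    return "sports" if "equipo" in text else "economy"

exit_code = quality_gate(stub_predict, EVAL_SET)
```

In a CI step the script would end with `sys.exit(exit_code)`, so that a failing gate marks the job as failed before any image is built or delivered.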
#8 about 6 minutes
Q&A on slang processing, debugging, and transformers
The Q&A covers practical advice on handling slang with dictionaries, debugging with robust logging, and understanding the complexity gap between traditional methods and transformers like BERT.
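The dictionary and logging advice can be combined in a small normalization step, sketched here under assumptions: the slang entries are illustrative English examples, and the logger name and message format are invented for this sketch.

```python
import logging
import re

logger = logging.getLogger("slang_normalizer")

# Illustrative slang dictionary; a real one would be curated per language.
SLANG = {"thx": "thanks", "gr8": "great", "u": "you"}

def normalize_slang(text, slang=SLANG):
    """Replace known slang tokens and log each replacement for later debugging."""
    normalized = []
    for token in re.findall(r"\w+", text.lower()):
        replacement = slang.get(token)
        if replacement is None:
            normalized.append(token)
        else:
            logger.debug("slang %r replaced with %r", token, replacement)
            normalized.append(replacement)
    return " ".join(normalized)
```

With `logging.basicConfig(level=logging.DEBUG)` enabled, every replacement leaves a trace, which makes it easier to see why a given input produced a given prediction further down the pipeline.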