I am Shubham Kumar.
I build and explore machine learning systems with a focus on multilingual NLP, LLMs, and real-world AI applications.

---
🧠 Hugging Face Transformers
🔗 huggingface/transformers#9286
- Contributed to resolving a token alignment issue affecting multilingual NER pipelines
- Improved tokenizer behavior across languages

---
🗣️ Hugging Face Model Hub
🔗 https://huggingface.co/zyberg2091/distilbert-base-multilingual-toxicity-classifier
- Published a multilingual toxicity classifier (DistilBERT)
- Supports English, Hindi, and Hinglish
- Focused on real-world content moderation use cases

---
📊 TensorFlow Models (Community Contributions)
- Contributed to discussions on best practices for object detection on custom datasets

---
📚 Documentation Contributions (TensorFlow, Rasa)
- Improved developer experience through documentation fixes and clarity enhancements