shafiahmed/NLP-Tutorial

NLP-Tutorial

Document Classification in Python

A tutorial showing how to use two libraries, gensim and scikit-learn, to perform document similarity queries and document classification.
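To make the two tasks concrete, here is a minimal standard-library sketch of a similarity query and a nearest-neighbor classification. It is not the code in classifier.py and does not use gensim or scikit-learn: plain word counts and cosine similarity stand in for whatever models the tutorial builds, and the sample documents, labels, and function names are all invented for illustration.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercase the text, split on whitespace, and count the tokens."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical labelled documents, invented for this illustration.
TRAINING = [
    ("the cat sat on the mat", "animals"),
    ("dogs and cats are pets", "animals"),
    ("stocks fell as markets closed", "finance"),
    ("the bank raised interest rates", "finance"),
]


def most_similar(query, labelled_docs):
    """Similarity query: return (score, text, label) for the closest document."""
    query_vec = bag_of_words(query)
    scored = [
        (cosine_similarity(query_vec, bag_of_words(text)), text, label)
        for text, label in labelled_docs
    ]
    return max(scored)


def classify(query, labelled_docs):
    """Classification: reuse the label of the most similar document."""
    return most_similar(query, labelled_docs)[2]
```

Calling `classify("interest rates and markets", TRAINING)` picks the label of the closest training document. A real pipeline would typically replace the raw counts with a weighted or reduced representation and the nearest-neighbor rule with a trained classifier.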

===== Files

corpus -- A directory of 4 tiny text files
.gitignore -- Files in repo for Git to ignore
classifier.py -- The main file that does everything
requirements.txt -- File used by pip to download dependencies

======== Download

All you need to do is clone the repo:

git clone https://github.com/Scripted/NLP-Tutorial

============ Dependencies

In a perfect world, running "pip install -r requirements.txt" would download every dependency needed to run this code. Unfortunately, NumPy and SciPy don't always play nice with pip. So try "pip install -r requirements.txt" first, and if that doesn't work, check the installation instructions on each module's site: NumPy, SciPy, Gensim, scikit-learn.
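If it is unclear whether the install worked, a quick check like the sketch below can help. It is not part of the repo. It assumes the four libraries named above are exposed under their usual import names (scikit-learn is imported as `sklearn`), and the helper name `missing_modules` is made up for this example.

```python
import importlib.util

# Usual import names of the four dependencies named above (an assumption;
# requirements.txt may pin different or additional packages).
REQUIRED_MODULES = ["numpy", "scipy", "gensim", "sklearn"]


def missing_modules(names):
    """Return the top-level module names that Python cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All dependencies are importable.")
```

Any module it reports as missing is the one to install by hand from that module's site.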

======= Running

Easy enough:

python classifier.py

The output shows the various steps of the algorithm as it works.
