eNLP
This Python library is a collection of common Natural Language Processing (NLP) functions, ranging from text processing to visualisation. Its purpose is to gather commonly used functions in a single location and to provide a simple approach to processing and understanding textual data.
A number of example usages can be found in the eNLP gallery, whilst publications whose research used the package are detailed in the publications section.
Language Processing
The library has functions for basic language processing: some are homemade, for example for punctuation removal, and others leverage the following open-source packages:
These functions have been written so that they can be called individually or strung together to form a processing pipeline. For example, to remove punctuation and lemmatize the remaining tokens, an NLP pipeline can be set up as follows:
from enlp.pipeline import NLPPipeline
import spacy

# Load the spaCy language model (it must match the language of the text)
langmodel = spacy.load('en_core_web_md')
text = "Some exciting text to be processed - ensure the language matches the spacy model"

# Create the pipeline, then chain the processing steps
processed_text = NLPPipeline(langmodel, text)
processed_text.rm_punctuation().spacy_lemmatize()
The processed text can then be accessed via the text attribute:
processed_text.text
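The chaining shown above relies on each processing method returning the pipeline object itself. The following is a minimal, self-contained sketch of that pattern, not eNLP's actual implementation: the class `ChainablePipeline` and its `lowercase` method are hypothetical names introduced for illustration, and `rm_punctuation` here is a simplified stand-in that only strips ASCII punctuation.

```python
import string


class ChainablePipeline:
    """Hypothetical stand-in for a chainable text pipeline (not eNLP's code)."""

    def __init__(self, text):
        self.text = text

    def rm_punctuation(self):
        # Drop every ASCII punctuation character, then collapse runs of whitespace
        table = str.maketrans("", "", string.punctuation)
        self.text = " ".join(self.text.translate(table).split())
        return self  # returning self is what allows calls to be chained

    def lowercase(self):
        # Hypothetical second step, included only to demonstrate chaining
        self.text = self.text.lower()
        return self


pipeline = ChainablePipeline("Some exciting text - to be processed!")
pipeline.rm_punctuation().lowercase()
print(pipeline.text)  # some exciting text to be processed
```

Because each method mutates the stored text and returns the same object, steps can be called one at a time or strung together in any order, which matches the usage pattern described above.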
Understanding
The library also has a number of functions for language understanding, such as word vector creation, sentiment analysis, topic modelling and keyword extraction. In addition to the packages mentioned above, these functions leverage the following open-source packages: