How about some Text Mining?

Ever wondered what's behind the vague-sounding term "Text Mining"? You've probably heard of stemming and lemmatization, but what about TF-IDF? Or N-grams? Or NLP? Have you ever wondered how text is vectorized and then used for classification and prediction tasks? Or how you can apply Text Mining to your bugs or requirements?

Well, ponder no more: this presentation gives you a full overview of all the aforementioned techniques, following examples taken from the PHP ecosystem. As an added bonus, we'll demonstrate that it's even possible to do it all in pure PHP!

P.S. No previous knowledge of data science or statistics is needed. Come wondering and leave pondering!

Mihailo Joksimovic

Software Architect @ JAGGAER Direct