
Cork AI Meetup 3 - Introduction to Text Analytics and Natural Language Processing (plus hands-on)

Peter Elger

March 15th 2018


hosted by Peter Elger / Keelin Murphy


At this third meetup, sponsored by fourTheorem, our guest speaker Nick Grattan provided an introduction to text analytics techniques and their practical uses. This was followed by a hands-on tutorial implementing Word2Vec with TensorFlow.

The evening began with a welcome message from organisers Peter Elger (fourTheorem CEO) and Keelin Murphy (Research Fellow, UCC), and encouragement to dive into the beer and pizza whilst enjoying the material. Mark Hayes from the SQL User Group then followed with a message about an upcoming one-day training event called ‘SQL Saturday’, scheduled for 9th June 2018. The SQL User Group provides high-quality technical sessions delivered by elite professionals on related topics such as database administration, development, cloud, data platforms and more! This event is usually held in Dublin, but fortunately this year it will be held at University College Cork! The entry fee is €250.

Check out this link for more information!
http://www.sqlsaturday.com/742/EventHome.aspx


Nick Grattan is an Application Architect at Dassault Systèmes, working on document and process management systems. He is also studying part-time for a Ph.D. in Text Analytics at the Insight Centre, University College Cork, Ireland. Nick gave a hugely informative and easily understandable introduction to text analytics and natural language processing (NLP).

Nick covered topics such as:

  • Traditional “Frequentist” text analysis with Bag-of-Words and Vector Space Models, including measuring document/text similarity with distance metrics and clustering documents. He showed examples using Python and scikit-learn.
  • Word Embeddings with Word2Vec for semantic term analysis.
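
To make the first topic concrete, here is a minimal sketch of a Bag-of-Words representation with cosine similarity as the document similarity measure. It is not taken from Nick's slides: it uses only the Python standard library rather than scikit-learn, and the function names and sample sentences are made up for illustration.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lower-case the text, split on whitespace, and count each term."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up example documents (not from the talk).
docs = [
    "the cat sat on the mat",
    "the cat lay on the rug",
    "stock prices fell sharply today",
]
vectors = [bag_of_words(d) for d in docs]

sim_cats = cosine_similarity(vectors[0], vectors[1])
sim_unrelated = cosine_similarity(vectors[0], vectors[2])
print(f"doc0 vs doc1: {sim_cats:.3f}")
print(f"doc0 vs doc2: {sim_unrelated:.3f}")
```

The two cat sentences share several terms and score 0.750, while the unrelated sentence shares none and scores 0.000. Word order is ignored entirely, which is the main limitation that word embeddings try to address.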

Video of Nick from the meetup - https://youtu.be/ESMZDm9TPKg

Nick’s presentation slides from meetup - https://nickgrattandatascience.wordpress.com/

Cork AI introduction to Text Analytics and Natural Language Processing by Nick Grattan


After a short break for more pizza, the second part of the evening continued with Nick leading a hands-on Word2Vec implementation using TensorFlow. This process included creating word embeddings from a corpus and exploring word semantics. Prerequisites for this hands-on workshop included access to an Amazon Web Services (AWS) account for GPU EC2 instances, and installation of an SSH client (Windows users were steered to the PuTTY client, http://www.putty.org/).
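
For readers who missed the workshop, the sketch below shows one common way a corpus is prepared for Word2Vec training: generating (target, context) word pairs from a sliding window, as in the skip-gram variant. This is an assumption for illustration only; this post does not record whether the workshop used skip-gram or CBOW, and the code covers just the data preparation step in plain Python, not the TensorFlow model itself. The function name, window size, and toy sentence are hypothetical.

```python
def skipgram_pairs(tokens, window=2):
    """Return (target, context) pairs for every token and its neighbours
    within `window` positions on either side."""
    pairs = []
    for i, target in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs


# Toy corpus, made up for illustration.
tokens = "the quick brown fox jumps".split()

# Map each distinct word to an integer id, as an embedding lookup would need.
vocab = {word: idx for idx, word in enumerate(sorted(set(tokens)))}

pairs = skipgram_pairs(tokens, window=1)
id_pairs = [(vocab[t], vocab[c]) for t, c in pairs]
print(pairs)
print(id_pairs)
```

A model trained on pairs like these learns a vector for each word id, and words that appear in similar contexts tend to end up with similar vectors.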

Additional links referenced by Nick that might be of interest:

Links to videos from the last meetup:


The Cork AI meetups are non-profit events that allow our community to connect with like-minded individuals, share knowledge, and gain experience through a common interest in tech. The purpose of this blog post is to provide a summary of the evening’s events for anyone who could not be there in person. A huge thanks to our organisers, speakers, and attendees for another successful evening. We hope to see you all at Cork AI #4!

Follow us on Twitter for the latest updates: @fourtheorem @CorkAI