Hosted by Peter Elger and Keelin Murphy
At the third Cork AI meetup, sponsored by fourTheorem, our guest speaker Nick Grattan provided an introduction to text analytics techniques and their practical uses. This was followed by a hands-on tutorial implementing Word2Vec with TensorFlow.
The evening began with a welcome message from organisers Peter Elger (fourTheorem CEO) and Keelin Murphy (Research Fellow, UCC), and encouragement to dive into the beer and pizza whilst enjoying the material. Mark Hayes from the SQL User Group then followed with a message about an upcoming one-day training event called ‘SQL Saturday’, scheduled for 9th June 2018. The SQL User Group provides high-quality technical sessions delivered by elite professionals on topics such as database administration, development, cloud, data platforms and more. This event is usually held in Dublin, but fortunately this year it will be held at University College Cork! The entry fee is €250.
Nick Grattan is an Application Architect at Dassault Systèmes, working on document and process management systems. He is also studying part-time for a Ph.D. in Text Analytics at the Insight Centre, University College Cork, Ireland. Nick gave a hugely informative and easily understandable introduction to text analytics and natural language processing (NLP).
Nick covered topics such as:
- Traditional “Frequentist” text analysis with Bag-of-Words and Vector Space Models, including measuring document/text similarity with distance metrics and clustering documents. He showed examples using Python and scikit-learn.
- Word Embeddings with Word2Vec for semantic term analysis.
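To make the Bag-of-Words idea above concrete, here is a minimal sketch in plain Python. It is not the code Nick showed (his examples used scikit-learn); the function names `bag_of_words` and `cosine_similarity` and the three sample sentences are assumptions made up for illustration. Each document becomes a vector of term counts, and cosine similarity then measures how closely two of those vectors point in the same direction.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercase, split on whitespace, and count term occurrences."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    # Counter returns 0 for missing terms, so only shared terms contribute.
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy corpus, invented for this illustration.
docs = [
    "the cat sat on the mat",
    "the cat lay on the rug",
    "stock markets fell sharply today",
]
vectors = [bag_of_words(d) for d in docs]

sim_01 = cosine_similarity(vectors[0], vectors[1])  # documents share several terms
sim_02 = cosine_similarity(vectors[0], vectors[2])  # documents share no terms
print(sim_01, sim_02)
```

The first two sentences share most of their words and score high, while the third shares none and scores zero. Note that raw counts let common words such as "the" dominate the score, which is one reason weighting schemes like TF-IDF are often applied in practice.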
Nick’s presentation slides from the meetup: https://nickgrattandatascience.wordpress.com/
After a short break for more pizza, the second part of the evening continued with Nick leading a hands-on Word2Vec implementation using TensorFlow. This process included creating word embeddings from a corpus and exploring word semantics. Prerequisites for this hands-on workshop included access to an Amazon Web Services (AWS) account for GPU EC2 instances, and installation of an SSH client (Windows users were steered to the PuTTY client, http://www.putty.org/).
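For readers who missed the workshop, the sketch below illustrates one common first step in skip-gram Word2Vec: turning a tokenised corpus into (target, context) training pairs. It is not the workshop code, which lives in the CorkAI/Meetup3 repository; the function name `skipgram_pairs`, the `window` parameter and the sample sentence are assumptions chosen for illustration, and no TensorFlow is involved here.

```python
def skipgram_pairs(tokens, window=2):
    """Return (target, context) pairs for every token and its neighbours.

    Each token is paired with the tokens up to `window` positions
    to its left and right, excluding itself.
    """
    pairs = []
    for i, target in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs


# Hypothetical one-sentence corpus, invented for this illustration.
tokens = "the quick brown fox".split()
pairs = skipgram_pairs(tokens, window=1)
for target, context in pairs:
    print(target, "->", context)
```

A model trained to predict the context word from the target word ends up placing words that appear in similar contexts close together in the embedding space, which is what makes exploring word semantics possible afterwards.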
Additional links referenced by Nick that might be of interest:
- GitHub with code and hands-on instructions: https://github.com/CorkAI/Meetup3
- Description of the Document Clustering example: https://nickgrattandatascience.wordpress.com/2018/03/15/document-clustering-example/
- Description of the word2vec example: https://nickgrattandatascience.wordpress.com/2018/03/15/doc2vec-example/
The Cork AI meetups are non-profit events that allow our community to connect with like-minded individuals, share knowledge, and gain experience through a common interest in tech. The purpose of this blog post is to provide a summary of the evening’s events for anyone who might have missed being there live in person. A huge thanks to our organisers, speakers, and attendees for another successful evening. We hope to see you all at Cork AI #4!
Follow us on Twitter for the latest updates: @fourtheorem
To share your thoughts, or to speak to a member of the fourTheorem team, get in touch today.