As I write, we are putting together the final chapter of AI as a Service, the book Peter Elger and I have been working hard on over the course of the last year or so! As part of the book, we’ve taken what we know and used a combination of serverless technology and managed AI services in AWS to create all of these applications:
- An image recognition system with object detection
- A voice-driven task management app
- A chatbot
- An automated identity document scanner
- An AI integration for eCommerce systems to determine the sentiment behind customer product reviews, categorise them using a custom classifier and forward them to the correct department
- An event website crawler that uses entity recognition to find information on conferences, including speaker profiles and event location
When I look back at it, that’s a massive amount of capability for one book! Two years ago, this wouldn’t have been possible, which speaks to how far managed AI has come. In the projects we do at fourTheorem, we can’t always rely on off-the-shelf, managed services. Even when we can’t, building and deploying machine learning models with systems like Amazon SageMaker significantly lowers the time and cost to get started with serious, business-transforming machine learning capability.
Companies often say that a shortage of machine learning expertise holds them back from embracing AI. Machine learning skills are certainly highly prized, but the shortage can generally be addressed by partnering with professional consultants (yes, like us!) who have experience across multiple projects. In any case, the bulk of the work in an AI-based project is not necessarily in the development of algorithms and models. AI and Data Science projects are still software projects and need to be treated as such. Treating them as specialised research efforts can hinder your ability to get them into production. Remember, software in production is where it starts to matter. Anything else is overhead!
Development and Operations have already fused to become DevOps. Data science/ML should also be brought into the constantly moving cycle of deploying to production. Constant feedback from real-world, production users is what makes a really great product. If you notice a level of dissonance between data science and development in your organisation, it’s time to change the process and bring in a more consolidated approach. We have observed a number of cases where excellent algorithms are developed but fail to make it into the application with the expected velocity because of a mismatch in expectations between disciplines.
I want to share four major considerations for adopting AI in your organisation. These should be of interest to technology leaders, developers and machine learning aficionados alike.
Machine Learning and Data Science experts know this already, but data is the number one most important thing to consider. It’s estimated that 50-80% of a data scientist’s time is spent on data acquisition and cleaning [1, 2].
With managed AI services, the volume and quality of training data are less of an issue, since you’re relying on pre-trained models. Training a custom model is still frequently required for business applications, however. Then comes the data architecture. It’s critical to consider how data flows through the system and how it is stored. Data flow and storage decisions must optimise for performance, cost and security/privacy. How data is accessed and stored depends on whether the application requires on-demand, real-time machine learning or is able to process historical data at rest.
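To make the on-demand versus at-rest distinction concrete, here is a minimal sketch for the review-sentiment case, assuming Amazon Comprehend is the managed service. The function names are hypothetical. The `client` argument is assumed to behave like `boto3.client("comprehend")`, and it is passed in rather than created here so the data flow can be exercised without AWS credentials.

```python
from typing import Any, Iterator, List

# Assumption: Comprehend accepts at most 25 documents per batch call.
BATCH_LIMIT = 25


def sentiment_on_demand(client: Any, text: str, language: str = "en") -> str:
    """Real-time path: one synchronous call for each incoming review."""
    response = client.detect_sentiment(Text=text, LanguageCode=language)
    return response["Sentiment"]


def chunked(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def sentiment_at_rest(client: Any, texts: List[str], language: str = "en") -> List[str]:
    """Historical path: label stored reviews in batches to reduce call count."""
    labels: List[str] = []
    for chunk in chunked(texts, BATCH_LIMIT):
        response = client.batch_detect_sentiment(TextList=chunk, LanguageCode=language)
        by_index = {item["Index"]: item["Sentiment"] for item in response["ResultList"]}
        # Documents missing from ResultList failed, so mark them explicitly.
        labels.extend(by_index.get(i, "ERROR") for i in range(len(chunk)))
    return labels
```

The two paths return the same labels but place very different demands on latency, cost and storage. For large archives, an asynchronous job that reads from and writes to S3 is likely a better fit than repeated batch calls.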
We have come a long way in the software industry over the past decade or so. Since the advent of the cloud, Infrastructure-as-Code (IaC) has become widely recognised as the way to define and deploy any infrastructure your software depends on. The principle of “you build it, you run it” enabled development teams to move faster and build more robust, reliable products. There is no reason why this cannot be applied to your machine learning features! Any ML-specific infrastructure, models and data should also be subject to versioning and should be defined in code along with the rest of the application!
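One way to put models and data under version control alongside the application is to derive a model version from everything that produced it. The sketch below is an illustration rather than a prescribed tool: `build_manifest` and its field names are hypothetical, and it assumes the training data fits in memory as bytes and that the code version is something like a Git commit hash.

```python
import hashlib
import json
from typing import Dict


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 hex digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(
    training_data: bytes,
    hyperparameters: Dict[str, object],
    code_version: str,
) -> Dict[str, str]:
    """Describe a model by the code, data and parameters that produced it."""
    # Sorting keys makes the hash independent of dictionary ordering.
    params_json = json.dumps(hyperparameters, sort_keys=True).encode("utf-8")
    manifest = {
        "code_version": code_version,
        "data_sha256": fingerprint(training_data),
        "params_sha256": fingerprint(params_json),
    }
    # The model version changes whenever any of the three inputs changes.
    combined = json.dumps(manifest, sort_keys=True).encode("utf-8")
    manifest["model_version"] = fingerprint(combined)[:12]
    return manifest
```

Committing a manifest like this next to the infrastructure definitions means a deployed model can be traced back to the exact data and parameters used to train it.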
The old idea of building software in the lab and waiting until it was “perfect” before bringing it to production went out the window ages ago. It’s worth restating: software that is not in production does nothing for your business apart from cost money. For AI features, always aim to get to production as close as you can to day one. Do this by starting with a minimal, viable implementation. Measure as the product evolves. You will avoid many of the unwelcome surprises that inevitably appear if you start delaying deployment. Taking care of DevOps (IaC) and automated tests early on makes subsequent development significantly faster and easier to estimate.
Don’t fear machine learning. The cost of experimentation with AI is remarkably low. This is a direct result of the available managed AI services for everything from document processing to sales forecasting. Even where custom AI models need to be developed, the tools available in the cloud facilitate early results with almost no upfront investment.
If you’re in a business that has data and a reasonable application for machine learning, there’s no reason why you can’t start today. Talk to us if you want some guidance. No matter what your current technology is, there’s always a way to get started quickly and see how you can start leveraging machine learning right away.
Get in touch with us to share your thoughts.
Eoin Shanaghy (@eoins) is the CTO of fourTheorem, a company specialising in modern, cloud-native applications and machine learning. He is the author of AI as a Service, a book from Manning Publications.