By Heidi Maher
July 22, 2016
As data is located and copied from multiple sources, data scientists and subject matter experts must have access to a simplified process for locating, reviewing and tracking higher-quality information that will be used to train the A.I. “brain.”
Tagging and classifying
You need to tag and classify the data to ensure that it can be properly digested. Depending on the A.I. task, some metadata is more valuable than other metadata. If you are looking for marketing insights, you will likely value the EXIF metadata embedded in images posted to social media sites, including geolocation, timestamps, camera type and serial numbers. In medical settings, metadata elements including patient ID-date of birth, provenance-timestamp, and privacy-content are essential.
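To make the idea of task-dependent metadata concrete, here is a minimal sketch in Python. It assumes each record's metadata has already been extracted into a plain dictionary. The task names, field names and the `tag_record` helper are hypothetical, chosen only to mirror the examples above; they do not describe any particular product or standard.

```python
from dataclasses import dataclass, field

# Hypothetical mapping from A.I. task to the metadata fields it values most.
# The field names are illustrative and loosely follow the examples in the text.
PRIORITY_FIELDS = {
    "marketing": ["geolocation", "timestamp", "camera_type", "serial_number"],
    "medical": ["patient_id", "date_of_birth", "provenance", "timestamp", "privacy"],
}


@dataclass
class TaggedRecord:
    source: str                                  # where the item came from
    task: str                                    # the A.I. task it is curated for
    tags: dict = field(default_factory=dict)     # priority fields that were found
    missing: list = field(default_factory=list)  # priority fields that were absent

    @property
    def complete(self):
        # A record is complete when no priority field is missing.
        return not self.missing


def tag_record(source, metadata, task):
    """Keep only the fields that matter for this task and note the gaps."""
    wanted = PRIORITY_FIELDS[task]
    tags = {key: metadata[key] for key in wanted if metadata.get(key) is not None}
    missing = [key for key in wanted if key not in tags]
    return TaggedRecord(source=source, task=task, tags=tags, missing=missing)


if __name__ == "__main__":
    record = tag_record(
        "img_001.jpg",
        {"geolocation": (40.7, -74.0), "timestamp": "2016-07-22T10:00:00", "camera_type": "X100"},
        "marketing",
    )
    print(record.complete, record.missing)
```

Records that come back incomplete can be routed to a subject matter expert for review rather than fed straight into training, which keeps the human judgment in the loop.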
Track responses and update
Finally, you must have governance capabilities built into the system to track responses to the information used and adjust the diet accordingly.
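One way to picture this kind of tracking is a simple feedback log that records how reviewers rate the A.I.'s responses and flags the data sources behind too many poor ones. The sketch below is only an illustration under stated assumptions: the `FeedbackLog` class, the 20 percent error threshold and the minimum of five reviews are all hypothetical choices, not values drawn from any real governance tool.

```python
from collections import defaultdict


class FeedbackLog:
    """Hypothetical tracker of reviewer feedback per data source."""

    def __init__(self, max_error_rate=0.2, min_reviews=5):
        # Both thresholds are assumptions chosen for illustration.
        self.max_error_rate = max_error_rate
        self.min_reviews = min_reviews
        self._counts = defaultdict(lambda: [0, 0])  # source -> [acceptable, unacceptable]

    def record(self, source, acceptable):
        """Log one reviewed response that drew on the given source."""
        self._counts[source][0 if acceptable else 1] += 1

    def error_rate(self, source):
        """Share of reviewed responses for this source judged unacceptable."""
        good, bad = self._counts.get(source, (0, 0))
        total = good + bad
        return bad / total if total else 0.0

    def sources_to_review(self):
        """Sources with enough reviews and an error rate above the threshold."""
        flagged = []
        for source, (good, bad) in self._counts.items():
            total = good + bad
            if total >= self.min_reviews and bad / total > self.max_error_rate:
                flagged.append(source)
        return sorted(flagged)


if __name__ == "__main__":
    log = FeedbackLog()
    for ok in (True, True, False, True, False, True):
        log.record("forum_posts", ok)
    for _ in range(5):
        log.record("clinical_notes", True)
    print(log.sources_to_review())
```

The flagged list is a prompt for human review, not an automatic removal: an expert still decides whether a source should be dropped, corrected or re-tagged.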
The great irony of A.I. is that while it would seem to be the ultimate autonomous computing creation, and technology exists to automate parts of the data curation process, the job of parenting an A.I. is a particularly human one. It depends on extensive knowledge and expertise in the A.I.'s subject matter area. Only by recognizing the importance of the human element in data curation can we fully assess the difficulty of getting A.I. right and avoid hyped expectations and overconfidence in the implementation effort. As parents, we must therefore continually and patiently nurture A.I. as it matures through specific stages, mastering new and very specific capabilities that meet well-defined requirements.