Maximize business impact with Generative AI: Embrace automation, prioritize data quality, and preserve human oversight for success.
Discover the transformative potential of Generative AI with insights from Mark Van de Wiel, field CTO at Fivetran. Learn how to harness automation, ensure data quality, and preserve human oversight for optimal results.
Generative AI is poised to change how marketing, customer service, content creation, and even the legal field operate every day. For years now, artificial intelligence has helped uncover new business insights and optimize operations in everything from banking to ride-hailing services like Uber and Lyft, as companies use machine learning to evaluate transactions and predict demand.
The incredibly human-like output from generative AI, fueled by large language models, is poised to create unforeseen capabilities in rapidly distilling or sorting through data. This output, combined with the ability to create rough versions of written and visual content quickly, opens up whole new fields. Recent research from Vanson Bourne shows that 87% of organizations agree that AI is the future, and organizations that do not utilize it will fail to survive.
Already, we’re seeing preliminary results, although many companies are still in the experimentation stage. One survey found that 66% of marketers using generative AI tools like ChatGPT, Dall-E, and Bard have seen positive ROI: 44% use them to create email copy, 42% to create social media copy, and 39% to create social media images. Amazon has introduced tools to help sellers write product listings with generative AI, and according to news coverage, its employees have internally found ChatGPT useful for answering customer questions and for writing code and training materials.
Grasping the Essence of Generative AI
To capitalize on AI’s potential, companies will need a proper foundation of education and organization. Using a service like OpenAI’s ChatGPT or Anthropic’s Claude is so easy – a few typed words can generate pages of text – that these systems not only appear to be magic, they can also appear more authoritative than they are. These failings call for proper guidelines for users. Just because it’s quick and efficient doesn’t mean generative AI can do everything.
One challenge is that humans see intelligence in AI where it doesn’t exist. From millions of years of identifying threats in the natural environment to attributing human motivations and feelings to our family pets, people are prone to spotting human-like intelligence where it doesn’t exist. AI is intrinsically non-human, defined by a profound understanding of associations between data points. Whereas humans can rely on innate instinct to understand our world, AI relies on data to make new connections. For AI, data is everything.
Inaccuracy, or “hallucinations,” as the AI industry calls them, is one of the biggest challenges with the proliferation of large language models. When the AI system doesn’t have “common sense,” it’s only as reliable as the data going into the model. Generative AI should never be viewed as a definitive source of knowledge – its primary purpose is to predict what words are most likely to come next based on the training dataset. Data quality issues will inevitably make any AI output flawed. While a knowledgeable individual can easily dismiss nonsensical results, identifying erroneous information buried deep in a dataset isn’t as straightforward.
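To make that point concrete, here is a purely illustrative toy sketch of next-token prediction. The prompts and probabilities below are invented, not drawn from any real model; the aim is only to show that the output is a function of whatever the model has seen, so flawed data yields confidently flawed answers.

```python
# Toy illustration of next-token prediction (all prompts and probabilities are invented).
# A real LLM estimates these distributions from its training data, which is why
# flawed training data produces confidently flawed output.

toy_model = {
    "The capital of France is": {"Paris": 0.92, "Lyon": 0.05, "Berlin": 0.03},
    "Our Q3 revenue was": {"strong": 0.40, "$12M": 0.35, "unknown": 0.25},
}

def predict_next(prompt: str) -> str:
    """Return the most probable next token for a known prompt."""
    distribution = toy_model.get(prompt, {})
    if not distribution:
        return "<no data>"  # no training signal, nothing sensible to say
    return max(distribution, key=distribution.get)

print(predict_next("The capital of France is"))  # Paris
print(predict_next("Our Q3 revenue was"))        # "strong" -- plausible, not verified
```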
The AI Revolution is Here: Is Your Data Ready?
Success with generative AI inside your organization requires powerful data management, and sophistication in determining what data to include and how to process that data. First, eliminate data silos. If different sources of customer or BI data are isolated and stored independently, it’s much harder for AI to connect the datasets, so centralization is a priority – siloed data is inherently incomplete.
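As a minimal sketch of what centralization buys you – assuming hypothetical CSV exports from a CRM and a billing system that share a customer_id key (the file names and columns are illustrative, not tied to any particular stack) – joining the silos into a single table gives an AI model one complete view to work from:

```python
# Hypothetical example: two siloed exports joined into one customer view.
# File names and column names are assumptions for illustration only.
import pandas as pd

crm = pd.read_csv("crm_customers.csv")         # e.g. customer_id, name, segment
billing = pd.read_csv("billing_invoices.csv")  # e.g. customer_id, invoice_total

# Centralize: one table an AI model (or analyst) can reason over end to end.
customer_360 = crm.merge(
    billing.groupby("customer_id", as_index=False)["invoice_total"].sum(),
    on="customer_id",
    how="left",  # keep customers even if they have no invoices yet
)

customer_360.to_parquet("warehouse/customer_360.parquet", index=False)
```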
Second, utilize a data catalog to help define tables, organize metadata, and improve governance, so teams can manage AI datasets and respond efficiently to regulator requests as new AI privacy laws are codified. A data catalog can provide essential change tracking and indexing capabilities to save time and computing power when managing data.
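As a rough sketch of the kind of metadata a catalog tracks – the field names here are assumptions, not any particular catalog’s schema – even a simple record per table makes governance questions and privacy requests far easier to answer:

```python
# Minimal, hypothetical data catalog entry. Real catalog tools track far more,
# but the underlying idea is the same: metadata about each table in one place.
from dataclasses import dataclass, field
from datetime import date

@dataclass
class CatalogEntry:
    table_name: str
    description: str
    owner: str
    contains_personal_data: bool
    retention_days: int
    last_schema_change: date
    columns: dict = field(default_factory=dict)  # column name -> description

customer_360 = CatalogEntry(
    table_name="warehouse.customer_360",
    description="Unified customer view joining CRM and billing data",
    owner="data-platform@example.com",
    contains_personal_data=True,
    retention_days=730,
    last_schema_change=date(2023, 6, 1),
    columns={"customer_id": "Primary key", "invoice_total": "Lifetime billed amount"},
)
```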
Third, automate data integration and management. Managing data pipelines can be time-consuming and frustrating when schemas, endpoints, or APIs change on the fly. With automation, data teams can focus more on finding insights than on organizing and manually integrating data streams. A modern data stack can provide a real boost towards AI maturity. Automated integration, cloud data lakes, and visualization platforms facilitate easy, accurate data access. Organizations can confidently build AI models, knowing workloads can scale. Robust, automated data systems build confidence in the output coming from an AI model.
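Managed pipelines handle this for you, but as a hedged sketch of the underlying idea, here is the kind of schema-drift handling an automated loader performs so engineers don’t have to patch pipelines by hand (the table and column names are illustrative, and a real loader does far more than this):

```python
# Illustrative sketch of schema-drift handling in an automated loader.
# Real managed pipelines do this (and much more) without custom code.

def sync_records(records: list[dict], destination: dict[str, list[dict]], table: str) -> None:
    """Append source records to a destination table, adding new columns as they appear."""
    rows = destination.setdefault(table, [])
    known_columns = set().union(*(row.keys() for row in rows)) if rows else set()

    for record in records:
        new_columns = set(record) - known_columns
        if new_columns:
            # Schema drift: backfill the new columns as NULL for existing rows.
            for row in rows:
                for col in new_columns:
                    row.setdefault(col, None)
            known_columns |= new_columns
        rows.append({col: record.get(col) for col in known_columns})

destination: dict[str, list[dict]] = {}
sync_records([{"id": 1, "email": "a@example.com"}], destination, "contacts")
sync_records([{"id": 2, "email": "b@example.com", "plan": "pro"}], destination, "contacts")
print(destination["contacts"])  # the first row gains plan=None; the schema evolved automatically
```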
Regarding data integration, the Vanson Bourne survey found that data scientists spend 70% of their time working with and preparing data rather than building AI models, and 87% of respondents say data scientists are not being utilized to their full potential. Data scientists should be spending much more of their time building AI models for forecasting and decision-making. Improving data quality can give them more time to concentrate on building AI models that are reliable and accurate.
Navigating Compliance While Regulations Are Established
Companies are under pressure to adopt generative AI tools for productivity gains, but most nations are still taking input, and the EU is adjusting regulations already in effect. This fluid situation makes the landscape hard to navigate with only existing laws and frameworks as guidance. But many of the same data protection principles around transparency, notice, and privacy rights currently in effect will likely extend to generative AI where technically possible.
To prepare for this, companies can focus on several key practices today:
- Transparency and documentation: Communicate AI usage; document logic, intended uses, and potential impacts on data subjects; and maintain detailed logs of personal data for governance, data security, and privacy rights.
- Mitigate disclosure risk: Review tool Terms of Use and negotiate Data Protection Agreements with AI providers to protect proprietary and personal data. Create contractual protections around confidentiality and data reuse.
- Utilize localized LLMs: Train AI models on company-specific data to prevent leakage and address disclosure concerns. This enhances productivity by providing relevant insights while reducing data protection risks. Localized LLMs can be based on emerging open-source foundational models such as LLaMA or Falcon. These may comprise company-specific LLMs, along with industry-specific or task-specific LLMs trained for particular knowledge domains or tasks (see the sketch after this list).
- Start small and experiment: Test smaller AI models locally before integrating them into live data. Conduct security testing to identify issues, ensure accuracy, and address potential vulnerabilities.
- Preserve the human element: Use generative AI to augment human performance, not replace it entirely. Maintain human oversight, review critical decisions, and verify AI-created content.
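To make the localized-LLM idea above more concrete, here is a hedged sketch using the Hugging Face transformers library to run an open model entirely on your own infrastructure, so prompts and company data never leave your environment. The model name is a real open model used here only as a placeholder (depending on your transformers version, loading it may require extra options), the prompt is invented, and fine-tuning on company-specific data is a separate step not shown.

```python
# Hypothetical sketch: running an open LLM locally so prompts and data stay in-house.
# The model name is a placeholder; any locally available causal LM works the same way.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "tiiuae/falcon-7b-instruct"  # an open model; swap in your chosen checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

prompt = "Summarize our internal data-retention policy in two sentences:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=120)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Keeping inference local addresses the disclosure risk noted above, but the human-oversight practice still applies: the output should be reviewed before it reaches a customer or a decision.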
Building a Data-driven Foundation
To achieve success with generative AI, companies need to examine their assumptions about what AI can and can’t accomplish today, and then adjust existing data systems to maximize relevance and impact for the future. Data quality is essential for achieving maximum business impact while maintaining compliance with global regulations, and automating data movement is crucial to ensuring data quality. By taking advantage of automation and building a strong data foundation today, companies will get the best results from AI in the future. As generative AI takes center stage, it’s time to set the stage for success.
This perspective was provided by Mark Van de Wiel, field CTO at Fivetran, in 2023.