Data Platform:
uses of Generative AI

We illustrate an application of Generative AI for querying
a Data Platform powered by hybrid sources.

Why GenAI on Data Platforms?

A potentially revolutionary application of Generative AI concerns the field of Data Platforms: integrated technological platforms that allow different data resources to coexist and be queried together, selecting in each case the most appropriate sources to answer the user's queries.

Generative AI acts as an enabler by identifying patterns, correlations and trends within data; it can be used to generate scenarios, run complex simulations, optimize decision-making processes, identify inefficiencies and suggest innovative solutions.
In this article we will describe a tool we developed as an example of applying Generative AI to Data Platforms.


How can the data of interest be made accessible to the AI model that will then process it?

Our tool includes a dedicated section for uploading a file that will be used as the reference data lake.
If you want to extend access to an entire database, it is possible to integrate it within the framework.
In particular, for Oracle databases there is an ad hoc technology called AI Services, integrated within Oracle Cloud Infrastructure, which provides a machine learning and artificial intelligence infrastructure fully integrated into the Oracle ecosystem, guaranteeing extremely high levels of performance and security.
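As a minimal sketch of how an uploaded file can become the reference data lake, assuming a CSV upload (the file content, column names and the `load_reference_data` helper are illustrative, not the tool's actual API):

```python
import csv
import io

# Hypothetical uploaded file content; in the real tool this would come
# from the upload section of the application.
uploaded_csv = io.StringIO(
    "customer_id,name,portfolio\n"
    "1,Alice,75000\n"
    "2,Bob,32000\n"
)

def load_reference_data(source):
    """Read an uploaded CSV into a list of dicts acting as the reference data lake."""
    return list(csv.DictReader(source))

data_lake = load_reference_data(uploaded_csv)
print(len(data_lake))  # 2
```

Once loaded in a structure like this, the records can be handed to the model (or to a database integration) for querying.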

Once the data of interest is available to the application, it is possible to run queries.

The application allows you to query the data via prompts (textual interaction) and to reference the previously uploaded data by asking the model to use it to perform the operations.
The reference can be made explicitly (e.g. "Find customers with portfolio > €50,000") or inferred by the model from what is entered in the prompt.
The operations can be very simple or very complex; even in simple cases, however, the tool is often preferable to manual analysis, given how much faster the model can process the data than a human.
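To illustrate, the explicit prompt above would, behind the scenes, boil down to a filter like the following (the data and field names are hypothetical; how the model actually translates prompts into operations is not shown here):

```python
# Hypothetical data lake previously loaded from the uploaded file.
customers = [
    {"name": "Alice", "portfolio": 75000},
    {"name": "Bob", "portfolio": 32000},
    {"name": "Carla", "portfolio": 51000},
]

# The prompt "Find customers with portfolio > €50,000" amounts to
# selecting the records that satisfy the stated condition.
result = [c["name"] for c in customers if c["portfolio"] > 50000]
print(result)  # ['Alice', 'Carla']
```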

Another interesting aspect of the model is its ability to identify errors in the data that could escape the human eye, especially in large files.
The model can spot anomalies because it infers the semantic meaning of each column from its header.
Once queried, the model analyzes each record in the file and verifies that the data respects the conventions expected for its type (e.g. numeric, non-empty, etc.).
Finally, it is possible to intervene on specific records: since the model retains memory of previous interactions, it can highlight the records containing the anomalies it previously identified.
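A rough sketch of the per-column convention check described above, in plain Python (the conventions and records are invented for illustration; in the tool the model infers the expected conventions from the column headers):

```python
# Hypothetical conventions that the model would infer from the headers;
# here they are hard-coded for illustration.
conventions = {
    "customer_id": lambda v: isinstance(v, int),
    "name": lambda v: isinstance(v, str) and v.strip() != "",
    "portfolio": lambda v: isinstance(v, (int, float)),
}

records = [
    {"customer_id": 1, "name": "Alice", "portfolio": 75000},
    {"customer_id": 2, "name": "", "portfolio": 32000},       # empty name
    {"customer_id": 3, "name": "Carla", "portfolio": "n/a"},  # non-numeric portfolio
]

def find_anomalies(rows, rules):
    """Return (row_index, column) pairs whose values break the expected conventions."""
    return [
        (i, col)
        for i, row in enumerate(rows)
        for col, check in rules.items()
        if not check(row.get(col))
    ]

anomalies = find_anomalies(records, conventions)
print(anomalies)  # [(1, 'name'), (2, 'portfolio')]
```

Keeping the list of flagged (row, column) pairs around mirrors the model's "memory" of the anomalies, so specific records can be revisited later.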

Going further: using neural models

By implementing a neural model adapted to the context, it is possible to obtain classifiers capable of recognizing more complex anomalies automatically (i.e. without an explicit user request).
For example, the model can be trained to analyze bank data and verify that each relationship in the data sources relating to a mortgage is "coherent": not simultaneously in amortization and pre-amortization, a missed installment payment matched by an adequately assessed problem, and so on.
These anomalies are not formal errors; the problem lies in the contradictory information that a set of data conveys. By its nature, AI is a valuable ally in recognizing such issues.
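The coherence conditions above can be sketched as explicit checks; in practice a trained classifier would learn such patterns from data rather than have them hand-coded (the field names and records below are hypothetical):

```python
# Hypothetical simplified mortgage records; field names are illustrative.
mortgages = [
    {"id": "M1", "in_amortization": True, "in_pre_amortization": False,
     "missed_installments": 0, "risk_assessed": False},
    {"id": "M2", "in_amortization": True, "in_pre_amortization": True,
     "missed_installments": 0, "risk_assessed": False},   # contradictory phases
    {"id": "M3", "in_amortization": True, "in_pre_amortization": False,
     "missed_installments": 2, "risk_assessed": False},   # missed payment, no assessment
]

def incoherent(m):
    """Flag records whose fields are individually valid but mutually contradictory."""
    if m["in_amortization"] and m["in_pre_amortization"]:
        return True
    if m["missed_installments"] > 0 and not m["risk_assessed"]:
        return True
    return False

flagged = [m["id"] for m in mortgages if incoherent(m)]
print(flagged)  # ['M2', 'M3']
```

Each record here passes the formal checks (correct types, no empty fields); only the combination of values is contradictory, which is exactly the class of anomaly a trained model would target.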