APIs vs Data Virtualization: What’s the best infrastructure for a data marketplace? 

Engineering

06/12/2023

For decades, banks and other financial services organisations have dreamed of making data available to any application or user who needs it, in a secure and pain-free way. The vision? To create a single, discoverable marketplace that allows data consumers to browse and incorporate any piece of information available into their workflow, and to make the whole process as easy as shopping for goods online. 

As with many concepts, what sounds easy on paper is extremely difficult in reality. Data comes from so many different suppliers, and in so many incompatible formats, that creating and maintaining a single platform that satisfies everyone’s needs is a serious challenge. 

Until recently, the only way many organisations felt they could create a data marketplace was through an API gateway, but recent advancements in data virtualization offer organisations a more efficient and effective way of making the dream a reality. 

The API gateway: the “good enough” solution 

An API gateway is a server or software component that acts as an entry point for a collection of APIs. Simply put, an API gateway allows developers to browse the APIs in its collection, then select and integrate the relevant ones into their applications in a controlled and efficient way. 

Sounds like a simple way of creating a data marketplace, right? Just collect the documentation for a series of APIs designed to deliver data in one place, and you’ve got a data marketplace! Well, yes and no. API gateways are relatively simple to create, but the technological limitations of APIs themselves present a scalability and maintenance challenge. 

As point-to-point data exchanges between two systems, APIs are not naturally suited to communicating large quantities of data from multiple sources. 

Let’s say you’ve developed an application that tracks the price of equities, and you want to integrate market price data from four separate data providers via their APIs. Firstly, you must manually integrate each API into your application. Secondly, given that the data provided by each API will differ in structure and form, you will likely need to create four separate data quality and normalisation processes to make the outputs of each API consumable by your application. All this quickly adds up to a decent chunk of work. 
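To make that concrete, here is a minimal Python sketch of the direct-integration approach. The endpoints, payload shapes and field names are invented for illustration; every real provider would have its own, which is precisely why each one needs its own integration and normalisation code.

```python
import requests

# Hypothetical endpoints and payload shapes -- every real provider differs,
# which is exactly the problem.

def normalise_provider_a(raw: dict) -> dict:
    # Provider A nests the last price under "quote" and identifies by ISIN.
    return {"instrument": raw["isin"], "price": raw["quote"]["last"]}

def normalise_provider_b(raw: dict) -> dict:
    # Provider B uses a flat payload with entirely different field names.
    return {"instrument": raw["identifier"], "price": raw["last_traded_price"]}

# ...plus two more bespoke normalisers, and four sets of authentication,
# pagination, retry and error-handling logic to build and maintain.

PROVIDERS = [
    ("https://api.provider-a.example/prices", normalise_provider_a),
    ("https://api.provider-b.example/v2/quotes", normalise_provider_b),
]

def fetch_prices(ticker: str) -> list[dict]:
    prices = []
    for url, normalise in PROVIDERS:
        response = requests.get(url, params={"ticker": ticker}, timeout=10)
        response.raise_for_status()
        # Each payload needs its own normalisation step before the
        # application can use it.
        prices.append(normalise(response.json()))
    return prices
```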

As you integrate more APIs into an ever-growing number of applications, the scope of the work required grows rapidly. With so many dependencies and messily intertwined systems, the ecosystem soon becomes impossible to untangle. 

Want to add an API to your gateway that communicates data from a new data source? Every client application needs to be reconfigured to accept it. Want to decommission an old API? A massive comms exercise needs to be undertaken to ensure that client applications are prepared for the change. In both cases, lots of manual, painstaking, and slow work is needed to ensure that clients can take advantage of new data products, and that business as usual can continue unaffected. 

What is needed is a way to give clients access to all the data available in an easy-to-consume, consistent format, without the need to manage complex technological dependencies. 

Fortunately, cutting-edge data virtualization technology, such as FINBOURNE’s Luminesce, offers organisations the ability to finally create a stable, future-proof, scalable data marketplace.  

Enter data virtualization 

Data virtualization technology creates a virtual layer that ingests data from any source, collates it in a single place, and transforms it into an easy-to-use format. Data consumers then connect to this virtual layer, using a single interface to browse, access, and query every piece of data available. 

This means that, rather than having to integrate the outputs of multiple APIs into a client application, developers can use a single connection to a data virtualization layer to gather all the data their application could ever need. 

Sounds great, right? But what are the benefits? 

Firstly, data virtualization technology greatly reduces the workload placed on developers. Go back to the example of a market price application that consumes data from four sources: rather than integrating an API from each source into the application, a developer simply creates a single connection to the data virtualization tool and writes the query the application needs to access the data. The developer doesn’t even need to spend time creating a transformation process to get the data into a common format; the virtualization layer handles this on their behalf. 
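By way of contrast, here is a minimal sketch of the virtualization-layer approach. The connection string, view name, columns and query are hypothetical stand-ins for whatever SQL interface a given virtualization tool exposes; the point is one connection and one query, rather than four bespoke integrations and normalisers.

```python
import pyodbc  # assuming the virtualization layer exposes an ODBC/SQL endpoint

# One connection to the virtual layer replaces four separate API integrations.
# The DSN, view name and columns below are hypothetical.
connection = pyodbc.connect("DSN=DataVirtualizationLayer")

# A single query against the unified view; the virtual layer pulls from the
# underlying sources and returns the data in one consistent shape.
query = """
    SELECT Instrument, Price, Currency, Source
    FROM MarketData.EquityPrices
    WHERE Ticker = ?
"""

for instrument, price, currency, source in connection.execute(query, "VOD LN"):
    print(f"{instrument}: {price} {currency} (from {source})")
```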

Secondly, data virtualization layers adapt to change extremely well. 

As an underlying platform, a data virtualization layer manages the connections to data sources on a client application’s behalf. If a data source is added or decommissioned, the change is automatically reflected for every application connected to the virtualization tool (assuming it has the correct permissions). There’s no need to unpick old APIs or integrate new ones to adapt; the changes happen with little, if any, human intervention. 
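The following is a toy, in-memory model of that principle, not any particular product’s API: sources are registered centrally with the virtual layer, and the consumer’s query is identical however many sources sit behind it.

```python
# Toy model: the virtual layer owns the source registry; consumers only ever
# query the unified view and never touch the sources directly.
SOURCES = {}

def register_source(name, fetch):
    """Platform-side operation: add (or remove) a source centrally."""
    SOURCES[name] = fetch

def query_unified_view(ticker):
    """Client-side operation: unchanged however many sources exist."""
    return [record for fetch in SOURCES.values() for record in fetch(ticker)]

# Two existing sources...
register_source("provider_a", lambda t: [{"source": "A", "ticker": t, "price": 101.2}])
register_source("provider_b", lambda t: [{"source": "B", "ticker": t, "price": 101.4}])
print(query_unified_view("VOD LN"))   # two price records

# ...a new source is registered centrally, and the same client query now
# returns its data too -- no client reconfiguration required.
register_source("provider_e", lambda t: [{"source": "E", "ticker": t, "price": 101.3}])
print(query_unified_view("VOD LN"))   # three price records, same query
```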

In short, what takes hours of development time and constant, complex maintenance with an API gateway can be achieved at a fraction of the cost and time with a virtualization layer. What’s more, its ability to handle change with minimal disruption makes investing in data virtualization technology a no-brainer for organisations trying to realise their data marketplace dreams. 

The API gateway has proved a useful stopgap on the data marketplace journey: a quick way to get data from one system to the next. But the inherent limitations of APIs as a data medium are fast being overcome by data virtualization technology, which is finally realising the dream of ensuring that every user and system that needs access to data can get it quickly and efficiently. The table below summarises the key differences between the two approaches. 

| Topic | API Gateway | Data Virtualization |
| --- | --- | --- |
| Scope of functionality | API gateways focus on managing and securing APIs, handling requests, enforcing security policies, and routing data. | Data virtualization is broader, dealing with data integration, access, and abstraction from various sources, not limited to APIs. |
| Data integration | API gateways do not provide extensive data integration capabilities and are primarily focused on API-related interactions. | Data virtualization specialises in data integration: combining data from multiple sources, performing transformations and enrichments, and creating unified views. |
| Real-time data access | API gateways work well for real-time API interactions but may not handle complex data streaming or joining across multiple sources. | Data virtualization supports real-time data access and can provide real-time data streams or near-real-time access to integrated data. |
| Complex data transformations | API gateways handle basic data transformations for API responses but may lack advanced data management features. | Data virtualization platforms offer extensive data transformation capabilities, including complex joins, calculations, aggregations, and data enrichment. |
| Data governance | API gateways enforce security policies but may not provide the same level of data governance and quality controls as data virtualization. | Data virtualization platforms often offer robust data governance features, including data lineage, quality checks, and access control. |