
WS Critical Performance: Data access choices for populating object graphs based on multiple tables

dudu pontpont
Posts: 1

I'm part of a big architecture project, web-based, with users around the world. Within my small team of architects, we are in the process of making the design choice for how to build/populate the domain models with data within our RESTful services.

We have a very powerful (hardware-wise) Oracle RDBMS setup.

The domain models in the services' model layer often rely on multiple DB entities/tables, spread across multiple Oracle schemas.

We have a HUGE performance challenge: beating our benchmarks while serving distant users' browsers across the internet and across geographical locations.

In chasing these benchmarks, one architect on my team has proposed a design that I fear will be disastrous performance-wise. Since we won't have the data and conditions to replicate/test production for a while (so no meaningful POC), I would like your opinions on the design the architect proposes:


- Services exposing an API for a business domain object (e.g. a Booking object) rely on multiple DB schemas and tables, say 5 different schemas and even more tables.

- The object graph is composed of several levels (one-to-many lists of other objects, themselves hybrids spanning schemas); a sketch of such a graph follows this list.

- We have chosen to do the data access through stored procedures, so this is a given and not part of the design decision I am looking at.
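
For concreteness, here is a minimal sketch of the kind of object graph I mean; all class, field, and schema names are invented for illustration:

```java
import java.math.BigDecimal;
import java.util.List;

// Hypothetical multi-schema object graph (all names invented).
class Booking {
    long bookingId;            // BOOKING schema
    long customerFk;           // FK into the CUSTOMER schema
    Customer customer;         // resolved from the CUSTOMER schema
    List<BookingLine> lines;   // one-to-many, BOOKING schema
}

class BookingLine {
    long lineId;               // BOOKING schema
    long productFk;            // FK into the INVENTORY schema
    long priceFk;              // FK into the PRICING schema
    Product product;
    Price price;
}

class Customer { long customerId; String name; }
class Product  { long productId;  String sku; }
class Price    { long priceId;    BigDecimal amount; }
```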

The Architect's proposed design:

The API will consist vertically of 3 layers (DTO/Contract #1, Domain/Business #2, Data Access #3). But horizontally it will involve many more, as explained below, and that is the point where I am concerned.

Each API will only retrieve data through its own layer #3 from the schema it "owns" (here, the booking schema in our example). An API will be forbidden to access other schemas through its own layer #3.

For all the other data needed from other schemas (master data, other functional departments, etc.), which are many: the API will use the layers #2 and #3 of the other APIs' stacks, which it will load as libraries (jars, or assemblies in .NET).

So to build the hybrid object (e.g. the booking), the API will first access the data from its "own" schema through its own layer #3, and then call the layers #2 (and indirectly #3) of the four other stacks owning the other DB schemas, in memory (since they are libraries).

The reconciliation of all the data accessed (sequentially) through the 5 horizontal #2/#3 stacks will be done at layer #2 of the booking API in our example. So at least 5 stored procedures will be called sequentially, passing foreign keys to each other for the data to be retrieved, and all these result sets will then be used to populate the different levels of the object graph. A sketch of what I understand this to look like follows.
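
Here is a minimal sketch of how I read the proposal, reusing the invented names from the graph above (the DAO/service interfaces are mine, not the actual design):

```java
// Proposed design as I understand it: the booking domain layer (#2) reads
// its own schema via its own DAL (#3), then calls the other stacks' layer
// #2 services (linked in as jars, one stored procedure behind each), and
// reconciles everything in application memory.
interface BookingDao             { Booking findBooking(long bookingId); }    // layer #3, BOOKING schema only
interface CustomerDomainService  { Customer findCustomer(long customerId); } // another stack's layer #2
interface InventoryDomainService { Product findProduct(long productId); }
interface PricingDomainService   { Price findPrice(long priceId); }

class BookingDomainService {
    private final BookingDao bookingDao;
    private final CustomerDomainService customers;
    private final InventoryDomainService inventory;
    private final PricingDomainService pricing;

    BookingDomainService(BookingDao b, CustomerDomainService c,
                         InventoryDomainService i, PricingDomainService p) {
        this.bookingDao = b; this.customers = c;
        this.inventory = i;  this.pricing = p;
    }

    Booking loadBooking(long bookingId) {
        // SP call #1: rows from the "owned" BOOKING schema
        Booking booking = bookingDao.findBooking(bookingId);

        // SP calls #2..#n: sequential, each keyed by FKs from earlier results
        booking.customer = customers.findCustomer(booking.customerFk);
        for (BookingLine line : booking.lines) {
            line.product = inventory.findProduct(line.productFk);
            line.price   = pricing.findPrice(line.priceFk);
        }

        // "Reconciliation" happens here, in the application layer,
        // instead of as joins inside Oracle.
        return booking;
    }
}
```

(The fifth stack, reference data, is omitted for brevity.)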

Here are my concerns:

- Performance will suffer compared to a single data access layer that reduces the number of SP calls to potentially one, which I would prefer (returning multiple cursors, one for each level of the object graph).

I fear performance will degrade by a multi-fold factor, which seems evident to me at the conceptual level because of the multiple DB calls and the reconciliation of data in an application layer, not leveraging the DB's in-memory relational data manipulation. What do you think? For contrast, the single-call alternative I have in mind is sketched below.
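
Here is a rough sketch of that alternative. The package/procedure name and cursor layout are invented; it assumes the Oracle JDBC driver (oracle.jdbc.OracleTypes) and a PL/SQL procedure that does the cross-schema joins and opens one REF CURSOR per level of the graph:

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.ResultSet;
import oracle.jdbc.OracleTypes;

// Single-call alternative: one stored procedure joins across the five
// schemas inside Oracle and returns one REF CURSOR per graph level.
class SingleCallBookingDao {
    Booking findBooking(Connection conn, long bookingId) throws Exception {
        // booking_pkg.get_booking_graph is an invented name; it would do
        // the cross-schema joins in PL/SQL and open one cursor per level.
        try (CallableStatement cs = conn.prepareCall(
                "{ call booking_pkg.get_booking_graph(?, ?, ?, ?) }")) {
            cs.setLong(1, bookingId);
            cs.registerOutParameter(2, OracleTypes.CURSOR); // booking header
            cs.registerOutParameter(3, OracleTypes.CURSOR); // lines + product/price
            cs.registerOutParameter(4, OracleTypes.CURSOR); // customer/reference data
            cs.execute();

            try (ResultSet header = (ResultSet) cs.getObject(2);
                 ResultSet lines  = (ResultSet) cs.getObject(3);
                 ResultSet extras = (ResultSet) cs.getObject(4)) {
                return mapGraph(header, lines, extras);
            }
        }
    }

    private Booking mapGraph(ResultSet header, ResultSet lines, ResultSet extras) {
        // Mapping of the three result sets onto the graph levels,
        // omitted for brevity.
        return new Booking();
    }
}
```

One network round trip, and the joins stay inside Oracle where the optimizer can work on them.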

- The dependency structure of such an API design will be a nightmare: instead of 2-3 layers, each versioned, we get 11 versioned items in this example just to package (and test) the API. The number of possible version combinations grows like N^B, N being the number of releases and B the number of binaries (assuming the same number of releases for each library of course, but you get the point). This is huge; a worked example follows.
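
To put rough numbers on it (the release count is invented for illustration):

$$\text{version combinations} = N^{B}, \qquad 3^{11} = 177{,}147 \quad \text{vs.} \quad 3^{3} = 27$$

i.e. 3 releases of each of the 11 binaries, versus 3 releases of each layer in a 3-layer stack.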

What do you think?