I have a scenario like the following. An organization works with about 1,000 warehouses all over the continent. Every day the head office sends requests to all warehouses, each request asking for the stock of a specific item category at that warehouse (for our purposes: ‘what item types, and how many units of each, do you have for categoryID X?’). Warehouses check their inventory and respond over a period of several days, producing several million rows of item records in the shape of
(initialRequestId), (warehouseId), (itemId), (unitsPresent), ... (more supplementary data). Sometimes they send a response in multiple parts, or even send corrections invalidating a previous response.
On top of that, there are queries calculating report data from the responses – aggregating by country, warehouse chain, etc. Right now, those queries recalculate the full report on every new response to a given request, so up to 1,000 recalculations per day for a given request. I’m looking for a good way to calculate those aggregations incrementally (and trying to assess whether incremental recalculation will even benefit us here): when a new batch of items is added/removed for a given request, only the aggregates those specific rows affect should be recalculated. My platform is SQL Server (newest version), and most things available in the Azure cloud are an option.
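To make the idea concrete, here is a minimal Python sketch of what I mean by incremental recalculation (key and column names are made up for illustration, not our real schema): keep a running aggregate per reporting dimension, and when a batch of response rows arrives – with corrections expressed as negative deltas – only the keys present in that batch are touched.

```python
from collections import defaultdict

# Running aggregate: (requestId, country) -> total units.
# In SQL Server this would be a summary table; a dict stands in here.
aggregates = defaultdict(int)

def apply_batch(batch):
    """Apply a batch of response rows incrementally.

    Each row is (request_id, country, units_delta). A correction that
    invalidates a previous response arrives as a negative delta, so only
    the aggregate keys appearing in the batch get recalculated.
    """
    touched = set()
    for request_id, country, units_delta in batch:
        key = (request_id, country)
        aggregates[key] += units_delta
        touched.add(key)
    return touched  # the only keys whose report figures changed

# A warehouse responds, then later corrects part of its response.
apply_batch([("req-1", "DE", 100), ("req-1", "FR", 40)])
apply_batch([("req-1", "DE", -20)])  # correction: 20 fewer units
```

The point is that the second batch only touches the `("req-1", "DE")` aggregate; the France figure is never recomputed.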
I’m currently researching Azure Synapse Analytics and materialized views. I’m looking for advice: is this a good solution, or is there something better that can be built with SQL or on some other programming platform? I’m also looking for insights into incremental recalculation of aggregates – how to figure out whether it will be useful here.
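On the “will it be useful” question, my current thinking is a back-of-envelope comparison of rows touched per day under each approach (all numbers below are placeholders, not measurements from our system):

```python
def full_recompute_rows(rows_per_request, responses_per_day):
    # Every new response rescans all rows of the request accumulated so far.
    # Crude assumption: on average half the final row count is present.
    return responses_per_day * rows_per_request / 2

def incremental_rows(rows_per_request):
    # Each row is processed exactly once, when its batch arrives.
    return rows_per_request

# Placeholder figures: ~1,000 responses/day, ~5M rows per request in total.
full = full_recompute_rows(5_000_000, 1_000)
incr = incremental_rows(5_000_000)
speedup = full / incr  # rough ratio of work saved, ignoring constant factors
```

Under these made-up numbers the incremental approach touches roughly 500× fewer rows, but obviously the real answer depends on actual batch sizes and how selective the affected aggregates are.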
P.S. We will be using Power BI Embedded for visualization, and could possibly use it for aggregation as well.