A common case in enterprise systems and micro-services-based architecture is the need to sync data between 2 systems or services.
For example, a billing system in a telco may need updated status regarding the customers' plan.
Common Solutions:
1. A Low latency Service
A common approach might be having a scaleable, low latency service that will be able to respond to status queries within 1ms. This can be achieved using golang based service and a Redis backend for example.
2. A PubSub architecture
In this approach, we will update the billing system w/ recent updates and will need to assume the billing database is updated with the latest updates. This is a very efficient method, yet it is prone to discrepancies and data drifting.
3. Batch Queries
The last method that we'll discuss in this post is having a batch query reg, "hot" subscribers, and getting back the results to the billing system. It might be less sophisticated than the other approaches, yet it is simple and less prone to load data drifting.
The Integration Patttern
The billing system would like to retrieve the status of up to 30K subscribers, in order to enrich the CDR (call data records) files.
This approach may be chosen, in order to minimize the number of calls between the two systems and to avoid data drifting.
As the number of subscribers is too large, to include in a single query, a naive solution, might be to query all the subscribers from the database and filter them at the application level.
A better solution might be using the TEMPORARY TABLE mechanism to extract only the needed subscribers from the table
Step by Step Solution
1. Insert the subscriber ids we get from the billing to a temporary table
CREATE TEMPORARY TABLE temp_billing_sps_subs_id (
subscriber_id bigint
);
2. Add a JOIN to the current query w/ the temp table
3. Get back up to 30K records instead of 2M.
Few things to think about:
1. We may need to add a new index to match the updated queries
2. We may need to add permissions in production to create these temporary tables.
Bottom Line
It may improve query response time (fewer data to fetch from disk and return to app server) and app query time (less time to scan the 2M records and filter them).