Scale Hacking: Cloud Computing, Software and System Performance

Dec 28, 2022

How to Create an AWS MySQL RDS Read Replica

Why?

Read Replica provides you extra read-only power, so you can direct your heavy load SELECT (OLAP) queries to the read replica, while the main RDS cluster can serve your critical OLTP transactions.

How?

To create a MySQL read replica in Amazon Web Services (AWS) using Amazon Relational Database Service (RDS), you can follow these steps:

Sign in to the AWS Management Console and navigate to the RDS dashboard.
In the navigation pane, click on the "Instances" menu item.
Select the instance that you want to use as the source for the read replica.
From the "Actions" dropdown menu, select "Create read replica".
In the "Create Read Replica" dialogue box that appears, you can specify the following options:

The instance identifier for the read replica. This should be unique within your AWS account.
The instance class for the read replica. You can choose from a range of instance sizes based on your performance and capacity needs.
The storage type and size for the read replica. You can choose between magnetic and SSD storage, and specify the size in gigabytes.
The availability zone for the read replica. This determines the geographical region where the read replica will be located.
The security group and VPC for the read replica. You can use the same security group and VPC as the source instance, or choose different ones.

Click "Create read replica" to create the read replica. It will be created as a new RDS instance, and will be automatically replicating data from the source instance.

You can monitor the progress of the read replica creation process by checking the "Status" column on the RDS dashboard. When the status of the read replica changes to "available", it is ready to use.

Keep in mind that read replicas are designed to be used for read-only workloads, and you cannot write to them directly. Any writes to the source instance will be replicated to the read replica, but you cannot write directly to the read replica.

Keep Performing, Moshe Kaplan

Dec 3, 2022

Monitoring Dashboard Principles for AWS Serverless System

A monitoring dashboard for a serverless system would typically include information about the health and performance of the system, such as the number of requests being handled, the average response time, and any errors or issues that are occurring. The specific metrics and information included on a monitoring dashboard can vary depending on the specific system and the needs of the users. Some common metrics that might be included on a monitoring dashboard for a serverless system include:

Request count: The number of requests being handled by the system.
Error rate: The percentage of requests that are resulting in errors.
Latency: The average time it takes for a request to be processed.
CPU and memory usage: The amount of CPU and memory resources being used by the system.
Throughput: The amount of data being processed by the system.
Scalability: Information about how the system is scaling up or down in response to changes in workload.

These are just some examples of the types of information that might be included on a monitoring dashboard for a serverless system. The exact metrics and information displayed on the dashboard will depend on the specific needs of the system and the users.

Keep Performing,

Moshe Kaplan

Nov 6, 2022

Why you get 503 with your AWS Lambda@Edge?

The short answer: Lambda@Edge is limited to a 5 seconds timeout

The reason: If you configured your lambda to a larger timeout (let's say 58 sec), when you deploy it to Lambda@edge, AWS would automatically adjust the number to a lower number (for us it was 1 second).

The solution is simple: You should not configure your Lambda functions that will be deployed to Lambda@Edge for more than 5 seconds.

How Trace it?

1. You get a 503 when you try to login a Cognito secured site for example with "The Lambda function associated with the CloudFront distribution is invalid or doesn't have the required permissions"

2. When you go to the CloudWatch to examine the logs, you find out, that some function end after 1000 ms.

Keep Performing,
Moshe Kaplan

Oct 17, 2022

How to Enable AWS OpenSearch Slow Logs?

The first step to solving a data performance issue is tracing the slow queries.

OpenSearch support two types of queries:

Indexing: write queries
Searching: read queries

You will be able to find out what is your performance bottleneck in the AWS cluster health dashboard. If you will find out that indexing is your bottleneck, you may enable the indexing slow log according to the AWS docs.

If you find out that your bottleneck is the search queries, you may use the following:

Get The list of index:

curl -X PUT "https://YOUR-CLUSTER-NAME.eu-west-1.es.amazonaws.com/_cat/indices?pretty"

Enable the slow log on the relevant indexes:

curl -X PUT "https://YOUR-CLUSTER-NAME.eu-west-1.es.amazonaws.com/YOUR-INDEX-NAME/_settings?pretty" -H 'Content-Type: application/json' -d'

{

"search": {

"slowlog": {

"threshold": {

"query": {

"warn": "15s",

"trace": "750ms",

"debug": "3s",

"info": "10s"

}

"level": "TRACE"

}

Where query defines the thresholds, and the level the minimal threshold that will be logged to the file.

After enabling the slow query log, you may be able to find the link to the log itself in the AWS cluster logs tab.

Bottom Line

After analyzing the slow log, you may be able to get some light on your performance spots.

Keep Performing,

Moshe Kaplan

Sep 6, 2022

The Data Analyst Guide: How to Connect Your MongoDB Database

Install MongoDB Compass
Connect to your MongoDB Cluster
Run an aggregation query

You can find the detailed process in our video

Keep Performing,

Moshe Kaplan

Aug 31, 2022

How did We Manage to Integrate Two Large Data based Services? or How to Query Large Number of Records from the MySQL database?

A common case in enterprise systems and micro-services-based architecture is the need to sync data between 2 systems or services.

For example, a billing system in a telco may need updated status regarding the customers' plan.

Common Solutions:

1. A Low latency Service

A common approach might be having a scaleable, low latency service that will be able to respond to status queries within 1ms. This can be achieved using golang based service and a Redis backend for example.

2. A PubSub architecture

In this approach, we will update the billing system w/ recent updates and will need to assume the billing database is updated with the latest updates. This is a very efficient method, yet it is prone to discrepancies and data drifting.

3. Batch Queries

The last method that we'll discuss in this post is having a batch query reg, "hot" subscribers, and getting back the results to the billing system. It might be less sophisticated than the other approaches, yet it is simple and less prone to load data drifting.

The Integration Patttern

The billing system would like to retrieve the status of up to 30K subscribers, in order to enrich the CDR (call data records) files.

This approach may be chosen, in order to minimize the number of calls between the two systems and to avoid data drifting.

As the number of subscribers is too large, to include in a single query, a naive solution, might be to query all the subscribers from the database and filter them at the application level.

A better solution might be using the TEMPORARY TABLE mechanism to extract only the needed subscribers from the table

Step by Step Solution

1. Insert the subscriber ids we get from the billing to a temporary table

CREATE TEMPORARY TABLE temp_billing_sps_subs_id (

subscriber_id bigint

);

2. Add a JOIN to the current query w/ the temp table

3. Get back up to 30K records instead of 2M.

Few things to think about:

1. We may need to add a new index to match the updated queries

2. We may need to add permissions in production to create these temporary tables.

Bottom Line

It may improve query response time (fewer data to fetch from disk and return to app server) and app query time (less time to scan the 2M records and filter them).

Keep Performing,

Moshe Kaplan

May 9, 2022

Duplicate Your MongoDB ATLAS ReplicaSet/Cluster

Well, does a MongoDB ReplicaSet/Cluster duplication for backup or test purposes sound like an obvious task?

Think again :-)

There is no magic button or anything like that, but there are a few tools that can be used to complete this task:

Backup and Restore

Use mongodump and mongorestore to backup your current instance and restore to a new instance. To save bandwidth and minimize time, make sure you use an instance in the cloud vendor region matching your MongoDB ATLAS.

Mirroring

Use the mongomirror to migrate the data to the new cluster. Please notice the mongomirror do not copy config and permissions.

Bottom Line

I believe the backup/restore method might be best in most cases, but you can have it your way

Keep Performing

Moshe Kaplan

Scale Hacking: Cloud Computing, Software and System Performance

Pages

Dec 28, 2022

How to Create an AWS MySQL RDS Read Replica

Dec 3, 2022

Monitoring Dashboard Principles for AWS Serverless System

Nov 6, 2022

Why you get 503 with your AWS Lambda@Edge?

Oct 17, 2022

How to Enable AWS OpenSearch Slow Logs?

Sep 6, 2022

The Data Analyst Guide: How to Connect Your MongoDB Database

Aug 31, 2022

How did We Manage to Integrate Two Large Data based Services? or How to Query Large Number of Records from the MySQL database?

Common Solutions:

The Integration Patttern

Step by Step Solution

Bottom Line

May 9, 2022

Duplicate Your MongoDB ATLAS ReplicaSet/Cluster

ShareThis

Intense Debate Comments

Ratings and Recommendations

Tags