Performance Fault Tolerancy Deployment

What influences Performance

Developer/ DB Admin

Efficient Queries/Operations
Indexes
Fitting Data Schema

DB Admin/ System Admin

Hardware & Network
Sharding
Replica Sets

Understanding Capped Collections

💡 **Capped collections** are a special type of collection that have to create explicitly where limit the amount of data and old data is automatically deleted when uploads new data if the memory or data limit is exceeded.

The oldest data is automatically deleted when new data comes in.
This is efficient for application logs where the most recent logs are stored, or as caching some data.

For capped collection, it returns documents as the same order as it was inserted. But for normal collection it is not always true.

Let's create a collection with capped, max is optional

Default size 4 MB, max is the number of documents that can be stored there.

> use performance
switched to db performance

> db.createCollection('capped', {capped: true, size: 10000, max: 3})
**Output**
{ ok: 1 }

Access capped collections

> db.capped.insertOne({name: 'Max'})
**Output**
{
  acknowledged: true,
  insertedId: ObjectId("6353cc9a88121c3743e45cef")
}

> db.capped.insertOne({name: 'Manu'})
**Output**
{
  acknowledged: true,
  insertedId: ObjectId("6353ccb588121c3743e45cf0")
}

db.capped.insertOne({name: 'Anna'})
**Output**
{
  acknowledged: true,
  insertedId: ObjectId("6353cccc88121c3743e45cf1")
}

> db.capped.find().pretty()

**Output**
[
  { _id: ObjectId("6353cc9a88121c3743e45cef"), name: 'Max' },
  { _id: ObjectId("6353ccb588121c3743e45cf0"), name: 'Manu' },
  { _id: ObjectId("6353cccc88121c3743e45cf1"), name: 'Anna' }
]

If any time need to order the documents to descending order

> db.capped.find().sort({$natural: -1}).pretty()
**Output**
[
  { _id: ObjectId("6353cccc88121c3743e45cf1"), name: 'Anna' },
  { _id: ObjectId("6353ccb588121c3743e45cf0"), name: 'Manu' },
  { _id: ObjectId("6353cc9a88121c3743e45cef"), name: 'Max' }
]

Now if we insert an new document then the first document will automatically get deleted(name: 'Max') and will not show error msg.

> db.capped.insertOne({name: 'Maria'})

**Output**
{
  acknowledged: true,
  insertedId: ObjectId("6353fac444b95bf60083ef55")
}

> db.capped.find().pretty()

**Output**
[
  { _id: ObjectId("6353ccb588121c3743e45cf0"), name: 'Manu' },
  { _id: ObjectId("6353cccc88121c3743e45cf1"), name: 'Anna' },
  { _id: ObjectId("6353fac444b95bf60083ef55"), name: 'Maria' }
]

But the data is nor permanently deleted, If by using Cache the can retrieve the full data

Replica Sets

Client(shell , Driver) <--Read--> MongoDB Server Primary Node --> Replica Set(Primary Node --> 1. Secondary Node 2. Secondary Node

If we add replica node then if we make a write request then it will be go to the Client then MongoDB server then primary node where we have the database. And if we have secondary node MongoDB asynchronously(not instantly) will replicate(copy) to in those secondary node. If primary node goes offline then data read from replica (Secondary Node).

Why Replica Sets?

Backup / Fault Tolerance
Improve Read Performance

To read data as first as possible can over distribute read request between the primary and secondary Node. If there are huge amount of read request then this will help to reduce the pressure at primary node.

Sharding (Horizontal Scaling)

More memory into a single machine. Horizontal scaling which means need more servers.

           MongoDB Server
				        |
------------------------------------------------------
server1	  server2  server3   server4  server5  server6

That issue here of course is this might sound logical but severs now don't duplicate that data, they are not backups, they split that data. So server 1 on the left, let's say stores data for the same application as the other server but a different chunk of it. So with sharding, we have multiple computers who all run MongoDB servers but these servers don’t work alone but work together and split the available data so that data is distributed across shards not replicated. So queries where find data but also insert update and delete operations therefore have to be run against all the servers or the right servers because each chunk manages its data and its range of the data.

Data is distributed (not replicated) across Shards

Each chunk manages its data and its a range of data

Queries are run across all Shards

Data can be stored as A-J, J-K, K-S.. to the sequence of a chunk(server).

How Sharding Works

----------
								| Client |
								----------
									|
							------------------				 
							| mongos(Router) |
							------------------		
									|
 -------------------------------------------------------------------------
        |                           |                         |
 MongoDB(Server/Shard)  	MongoDB(Server/Shard)     MongoDB(Server/Shard)		    
 --------------------------------------------------------------------------
     | Shard Key  |            | Shard Key  |             | Shard Key  | 
 ---------------------------------------------------------------------------
 // shard key: shard key is essential just a field that's added to every document which kid of is important. 

 ------------------------Queries and Sharding-------------------------

                             	----------
								| find() |
								----------
									 |
							 ------------------				 
						 	 | mongos(Router) |
							 ------------------
                                     |
		    -------------------------------------------------------
	        |                                                     | 
 ---------------------------			            ---------------------------			 
 | Option1: Operation does |                        | Option2: Operation does |
 | not contain Shared Key  |                        | contain Shared Key      |
 ---------------------------                        ---------------------------
    |        |       |                          Directly Send   ---> |
BroadCast >	 |       |                          to right Shard       |
    |        |       |                                               |
--------------------------                               --------------------
shared    shared    shared                                         shared

Deploying a MongoDB Server

localhost(mongod) -------> Atlas(mongod)

Have to manage

Manage Shards
Secure User / Auth Setup
Project Web Server / Network
Manage Replica Set
Encryption(Transportation & Rest)
Regular Backups
Update Software

Performance & Fault Tolerancy

Consider Capped Collections for cases where want to clear old data automatically.
Performance is all about having efficient queries / operations, fitting data formats and a best-practice MongoDB server config.
Replica sets provide fault tolerancy (with automatic recovery) and improved read performance.
Sharding allows to scale MongoDB server horizontally.

Deployment $ MongoDB Atlas

Deployment is a complex matter since it involves many tasks - some of them are not even directly related to MongoDB.
Unless are a experienced admin (or got one), should consider a managed solution like MongoDB Atlas.
Atlas is a managed service where can configure a MongoDB environment and pay at a by-usage basis.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Performance_Fault_Tolerancy_Deployment.md

Performance_Fault_Tolerancy_Deployment.md

Performance Fault Tolerancy Deployment

What influences Performance

Understanding Capped Collections

Replica Sets

Sharding (Horizontal Scaling)

Data is distributed (not replicated) across Shards

Queries are run across all Shards

How Sharding Works

Deploying a MongoDB Server

Performance & Fault Tolerancy

Deployment $ MongoDB Atlas

Files

Performance_Fault_Tolerancy_Deployment.md

Latest commit

History

Performance_Fault_Tolerancy_Deployment.md

File metadata and controls

Performance Fault Tolerancy Deployment

What influences Performance

Understanding Capped Collections

Replica Sets

Sharding (Horizontal Scaling)

Data is distributed (not replicated) across Shards

Queries are run across all Shards

How Sharding Works

Deploying a MongoDB Server

Performance & Fault Tolerancy

Deployment $ MongoDB Atlas