MongoDB is a document-oriented NoSQL database that stores data as flexible JSON-like documents instead of rows and columns. Where relational databases require you to define a schema upfront and restructure tables as requirements change, MongoDB lets each document in a collection have different fields — making it ideal for applications where data structures evolve rapidly. This guide covers MongoDB installation, core CRUD operations, indexing strategies, the aggregation pipeline, and backup and restore for production deployments.
Prerequisites
- A Linux server (Ubuntu 22.04 or RHEL 8+ recommended) with at least 2 GB RAM
- Root or sudo access for installation
- Basic understanding of JSON data format
- Familiarity with command-line tools
Installing MongoDB
Ubuntu/Debian
# Import the MongoDB GPG key
curl -fsSL https://www.mongodb.org/static/pgp/server-7.0.asc | \
sudo gpg -o /usr/share/keyrings/mongodb-server-7.0.gpg --dearmor
# Add the repository
echo "deb [ arch=amd64,arm64 signed-by=/usr/share/keyrings/mongodb-server-7.0.gpg ] \
https://repo.mongodb.org/apt/ubuntu jammy/mongodb-org/7.0 multiverse" | \
sudo tee /etc/apt/sources.list.d/mongodb-org-7.0.list
# Install MongoDB
sudo apt update
sudo apt install -y mongodb-org
# Start and enable the service
sudo systemctl start mongod
sudo systemctl enable mongod
# Verify it is running
sudo systemctl status mongod
mongosh --eval "db.version()"
RHEL/CentOS
# Create the repository file
cat <<'EOF' | sudo tee /etc/yum.repos.d/mongodb-org-7.0.repo
[mongodb-org-7.0]
name=MongoDB Repository
baseurl=https://repo.mongodb.org/yum/redhat/$releasever/mongodb-org/7.0/x86_64/
gpgcheck=1
enabled=1
gpgkey=https://www.mongodb.org/static/pgp/server-7.0.asc
EOF
sudo dnf install -y mongodb-org
sudo systemctl start mongod
sudo systemctl enable mongod
Docker
docker run -d \
--name mongodb \
-p 27017:27017 \
-v mongodb-data:/data/db \
-e MONGO_INITDB_ROOT_USERNAME=admin \
-e MONGO_INITDB_ROOT_PASSWORD=secretpassword \
mongo:7
Core CRUD Operations
Connect to MongoDB using the shell:
# Connect to local MongoDB
mongosh
# Connect to a remote server with authentication
mongosh "mongodb://admin:password@192.168.1.100:27017/admin"
Create — Inserting Documents
// Switch to (or create) a database
use myapp
// Insert a single document
db.users.insertOne({
name: "Alice Johnson",
email: "alice@example.com",
role: "admin",
skills: ["python", "docker", "kubernetes"],
created: new Date()
})
// Insert multiple documents
db.users.insertMany([
{ name: "Bob Smith", email: "bob@example.com", role: "developer", skills: ["javascript", "react"] },
{ name: "Carol Lee", email: "carol@example.com", role: "devops", skills: ["terraform", "aws", "docker"] }
])
MongoDB creates the database and collection automatically on first insert — no CREATE TABLE or CREATE DATABASE needed.
Read — Querying Documents
// Find all documents in a collection
db.users.find()
// Find with a filter
db.users.find({ role: "admin" })
// Query nested fields and arrays
db.users.find({ skills: "docker" }) // Array contains "docker"
db.users.find({ "skills.0": "python" }) // First skill is "python"
// Comparison operators
db.users.find({ age: { $gt: 25 } }) // Greater than
db.users.find({ role: { $in: ["admin", "devops"] } }) // In array
// Projection — select specific fields
db.users.find({ role: "admin" }, { name: 1, email: 1, _id: 0 })
// Sort and limit
db.users.find().sort({ created: -1 }).limit(10)
// Count documents
db.users.countDocuments({ role: "developer" })
Update — Modifying Documents
// Update one document
db.users.updateOne(
{ email: "alice@example.com" },
{ $set: { role: "superadmin", lastLogin: new Date() } }
)
// Add to an array
db.users.updateOne(
{ email: "bob@example.com" },
{ $push: { skills: "typescript" } }
)
// Update multiple documents
db.users.updateMany(
{ role: "developer" },
{ $set: { department: "engineering" } }
)
// Upsert — update if exists, insert if not
db.users.updateOne(
{ email: "dave@example.com" },
{ $set: { name: "Dave Wilson", role: "intern" } },
{ upsert: true }
)
Delete — Removing Documents
// Delete one document
db.users.deleteOne({ email: "dave@example.com" })
// Delete multiple documents
db.users.deleteMany({ role: "intern" })
// Drop an entire collection
db.users.drop()
Indexing for Performance
Without indexes, MongoDB performs a collection scan — reading every document to find matches. On a collection with millions of documents, this takes seconds instead of milliseconds.
// Create a single-field index
db.users.createIndex({ email: 1 }) // Ascending
// Create a compound index (multiple fields)
db.users.createIndex({ role: 1, created: -1 })
// Create a unique index (enforces uniqueness)
db.users.createIndex({ email: 1 }, { unique: true })
// Create a text index for full-text search
db.articles.createIndex({ title: "text", body: "text" })
db.articles.find({ $text: { $search: "mongodb tutorial" } })
// List all indexes on a collection
db.users.getIndexes()
// Explain a query to see if it uses an index
db.users.find({ email: "alice@example.com" }).explain("executionStats")
Indexing rules of thumb:
- Index fields used in find() filters, sort(), and $match in aggregations
- Compound indexes should order fields from most selective to least selective
- Avoid indexing fields with low cardinality (e.g., boolean fields with only true/false)
- Each index consumes RAM — monitor with db.stats() and db.collection.stats()
Comparing MongoDB with Other Databases
| Feature | MongoDB | PostgreSQL | MySQL | Redis |
|---|---|---|---|---|
| Data model | Documents (JSON) | Relational (tables) | Relational (tables) | Key-value / data structures |
| Schema | Flexible (schema-less) | Strict (SQL DDL) | Strict (SQL DDL) | Schema-less |
| Query language | MQL + Aggregation | SQL | SQL | Commands |
| Joins | $lookup (limited) | Full SQL joins | Full SQL joins | None |
| Transactions | Multi-document ACID | Full ACID | Full ACID | Single-op atomic |
| Scaling | Horizontal (sharding) | Vertical (+ read replicas) | Vertical (+ read replicas) | In-memory, horizontal |
| Best for | Flexible schemas, rapid development | Complex queries, data integrity | Web applications, WordPress | Caching, sessions, real-time |
Aggregation Pipeline
The aggregation pipeline is MongoDB’s answer to complex SQL queries with GROUP BY, JOIN, and subqueries:
// Count users per role
db.users.aggregate([
{ $group: { _id: "$role", count: { $sum: 1 } } },
{ $sort: { count: -1 } }
])
// Find the most common skills across all users
db.users.aggregate([
{ $unwind: "$skills" },
{ $group: { _id: "$skills", count: { $sum: 1 } } },
{ $sort: { count: -1 } },
{ $limit: 10 }
])
// Join users with their orders (like SQL JOIN)
db.orders.aggregate([
{ $lookup: {
from: "users",
localField: "userId",
foreignField: "_id",
as: "user"
}},
{ $unwind: "$user" },
{ $project: { orderTotal: 1, "user.name": 1, "user.email": 1 } }
])
Backup and Restore
# Backup a specific database
mongodump --db myapp --out /backup/$(date +%Y-%m-%d)
# Backup with authentication
mongodump --uri="mongodb://admin:password@localhost:27017" --db myapp --out /backup/
# Backup all databases
mongodump --out /backup/full-$(date +%Y-%m-%d)
# Restore a database
mongorestore --db myapp /backup/2025-12-13/myapp
# Restore dropping existing data first
mongorestore --drop --db myapp /backup/2025-12-13/myapp
Gotchas and Edge Cases
Document size limit: A single MongoDB document cannot exceed 16 MB. If you need to store large files, use GridFS — MongoDB’s specification for storing files larger than 16 MB across multiple documents.
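To make the chunking idea concrete, here is a minimal plain-Node sketch of how GridFS splits a file into fixed-size pieces (255 KiB is the driver default chunk size). This is illustration only — real applications use the driver's GridFSBucket, which persists these chunks to the fs.chunks collection:

```javascript
// Sketch: how GridFS splits a file into fixed-size chunks.
// 255 KiB is the default chunk size used by official drivers.
const CHUNK_SIZE = 255 * 1024;

function splitIntoChunks(buffer, chunkSize = CHUNK_SIZE) {
  const chunks = [];
  for (let offset = 0; offset < buffer.length; offset += chunkSize) {
    chunks.push({
      n: chunks.length, // chunk sequence number, like GridFS's "n" field
      data: buffer.subarray(offset, offset + chunkSize),
    });
  }
  return chunks;
}

// A 1 MiB "file" becomes 5 chunks: four full 255 KiB chunks plus a remainder.
const chunks = splitIntoChunks(Buffer.alloc(1024 * 1024));
console.log(chunks.length); // 5
```

Because each chunk is itself a small document, no single document ever approaches the 16 MB cap, regardless of total file size.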
No joins by default: While $lookup provides basic join functionality, it is slower than relational joins. Design your document schema to embed related data within a single document when possible (denormalization).
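As a sketch of the trade-off (illustrative document shapes, not taken from a live database):

```javascript
// Embedded design: the user's addresses live inside the user document,
// so a single find() returns everything — no $lookup needed.
const userEmbedded = {
  name: "Alice Johnson",
  email: "alice@example.com",
  addresses: [
    { label: "home", city: "Oslo" },
    { label: "work", city: "Bergen" },
  ],
};

// Normalized design: addresses in their own collection, joined at read
// time with $lookup — an extra lookup per query that MongoDB is not
// optimized for the way a relational database is.
const userNormalized = { _id: 1, name: "Alice Johnson" };
const addresses = [
  { userId: 1, label: "home", city: "Oslo" },
  { userId: 1, label: "work", city: "Bergen" },
];

// With embedding, the related data is already in hand:
console.log(userEmbedded.addresses.length); // 2
```

Embedding fits data that is read together and bounded in size; prefer references when the related data grows without limit (it would otherwise push the document toward the 16 MB cap) or is shared across many parents.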
Write concern and data loss: With write concern w: 1, MongoDB acknowledges a write as soon as it reaches the primary; if the primary crashes before the write replicates, the write can be rolled back and lost. Since MongoDB 5.0 the implicit default for replica sets is w: "majority", but for critical data specify writeConcern: { w: "majority" } explicitly so the write is only acknowledged once a majority of nodes have it.
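In mongosh, the write concern is passed as an option on the write itself (the wtimeout value here is an illustrative choice, not a required setting):

```javascript
// Wait for a majority of replica set members to acknowledge the write,
// failing with an error if that takes longer than 5 seconds
db.users.insertOne(
  { name: "Eve Adams", email: "eve@example.com" },
  { writeConcern: { w: "majority", wtimeout: 5000 } }
)
```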
ObjectId is not sequential: MongoDB’s default _id field uses ObjectId, which encodes a timestamp but is not strictly sequential. Do not use _id ordering as a substitute for a created timestamp field.
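The timestamp inside an ObjectId is still recoverable when you need it. An ObjectId is 12 bytes, and the first 4 bytes are a big-endian Unix timestamp in seconds — this plain-Node sketch decodes it from the 24-character hex string, the same information mongosh exposes via ObjectId().getTimestamp():

```javascript
// Decode the creation time embedded in an ObjectId's first 4 bytes
function objectIdTimestamp(hexId) {
  const seconds = parseInt(hexId.slice(0, 8), 16); // big-endian uint32
  return new Date(seconds * 1000);
}

// Example ObjectId: first 8 hex chars 0x507f1f77 = 1350508407 seconds
const ts = objectIdTimestamp("507f1f77bcf86cd799439011");
console.log(ts.toISOString()); // 2012-10-17T21:13:27.000Z
```

This resolution is one second, and the remaining bytes are a random value plus a counter — another reason not to treat _id order as insertion order.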
WiredTiger cache sizing: MongoDB's WiredTiger storage engine keeps an in-RAM cache, by default the larger of 50% of (RAM minus 1 GB) or 256 MB. A MongoDB server with 8 GB RAM and a 100 GB database therefore only caches the hot working set. Monitor cache usage with db.serverStatus().wiredTiger.cache.
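As a quick check of how full the cache is, you can compare two of the WiredTiger counters in mongosh (field names as reported by db.serverStatus(); treat this as a starting point, not a full monitoring setup):

```javascript
// Rough cache fill percentage from serverStatus counters
const cache = db.serverStatus().wiredTiger.cache;
const used = cache["bytes currently in the cache"];
const max = cache["maximum bytes configured"];
print(`cache fill: ${(100 * used / max).toFixed(1)}%`);
```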
Troubleshooting
MongoDB fails to start
# Check the log for errors
sudo journalctl -u mongod --no-pager -n 50
# Common cause: insufficient disk space or wrong permissions
ls -la /var/lib/mongodb/
sudo chown -R mongodb:mongodb /var/lib/mongodb
Queries are slow
// Use explain to check if queries use indexes
db.users.find({ email: "alice@example.com" }).explain("executionStats")
// Look for "COLLSCAN" in the output — it means no index is used
// Create an index on the queried field
db.users.createIndex({ email: 1 })
Connection refused from remote clients
# MongoDB binds to localhost by default
# Edit /etc/mongod.conf to allow remote connections
# net:
# bindIp: 0.0.0.0 # Listen on all interfaces
# IMPORTANT: Enable authentication first!
sudo systemctl restart mongod
Summary
- MongoDB stores JSON-like documents in collections without requiring a predefined schema — each document can have different fields, making it ideal for evolving data structures
- CRUD operations use intuitive methods like insertOne, find, updateOne, and deleteOne with JSON query filters instead of SQL
- Indexes are critical for performance — create indexes on fields used in queries and sort operations, and use explain() to verify index usage
- The aggregation pipeline handles complex data transformations, grouping, and joins that would require GROUP BY and JOIN in SQL
- Back up regularly with mongodump and test restores with mongorestore — a backup you have not tested is not a backup
- Design documents to embed related data rather than normalizing across collections — this reduces the need for expensive $lookup joins