Data Modeling: Embedding vs Referencing
One of the most important decisions when designing a MongoDB database is how to structure relationships between data. Should you put related data inside the same document (embedding), or keep it in separate documents and link them (referencing)? Making this choice is a core part of **data modeling**.
Embedding vs Referencing
| Embedding | Referencing |
|---|---|
| Related data is stored inside the same document | Related data is stored in separate documents with references |
| Like a form with multiple sections | Like a library catalog with book IDs |
Think of embedding like putting all your photos in one album. Referencing is like having photo IDs and a separate filing system – you need to look up each photo.
When to Use Embedding
Embedding works well when:
- You have one-to-one relationships.
- You have one-to-many relationships where the "many" side is small and doesn't change often.
- You frequently access the related data together.
- You need atomic updates: a write to a single document is atomic, so embedded data changes together with its parent.
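The atomic-update point can be sketched in plain JavaScript. The helper below is hypothetical: `buildAddAddressUpdate`, the `updatedAt` field, and the `users` collection name are assumptions made for illustration. It only builds the filter and update documents that a single `updateOne` call would receive, so the new address and the timestamp would be written in one document operation.

```javascript
// Hypothetical helper: builds the arguments for ONE updateOne call.
// Both changes target the same document, so they are applied together.
function buildAddAddressUpdate(userId, address) {
  return {
    filter: { _id: userId },
    update: {
      $push: { addresses: address },   // append to the embedded array
      $set: { updatedAt: new Date() }, // assumed bookkeeping field
    },
  };
}

const addressUpdate = buildAddAddressUpdate("user-1", {
  street: "789 Pine Rd",
  city: "Chicago",
  zip: "60601",
});

// In mongosh this could be used as:
//   db.users.updateOne(addressUpdate.filter, addressUpdate.update)
console.log(JSON.stringify(addressUpdate.update.$push));
```

With referencing, the same change would touch two documents in two collections, and keeping them in step would need extra care.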
Example: User with Addresses (Embedding)

    {
      _id: ObjectId("65a1b2c3d4e5f6a7b8c9d0e1"),
      name: "John Doe",
      email: "john@example.com",
      addresses: [
        { street: "123 Main St", city: "New York", zip: "10001" },
        { street: "456 Oak Ave", city: "Boston", zip: "02101" }
      ]
    }

When to Use Referencing
Referencing (also called linking) works well when:
- You have one-to-many or many-to-many relationships with large or unbounded "many" sides.
- The related data is accessed independently.
- You need to avoid unbounded document growth (a MongoDB document cannot exceed 16 MB).
- The same data is referenced by multiple documents (store it once instead of duplicating it).
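Resolving a reference means an extra lookup. The sketch below imitates that with in-memory arrays, and also builds an aggregation pipeline that would ask the server to do the join with `$lookup`. The collection names `posts` and `comments`, the `postId` field, the string IDs, and both helper names are assumptions for illustration.

```javascript
// In-memory stand-ins for two collections (string IDs used for brevity).
const posts = [{ _id: "p1", title: "MongoDB Data Modeling" }];
const comments = [
  { _id: "c1", postId: "p1", user: "Alice", text: "Great article!" },
  { _id: "c2", postId: "p2", user: "Bob", text: "Comment on another post" },
];

// Application-side join: find the post, then the comments that point at it.
function findPostWithComments(postId) {
  const post = posts.find((p) => p._id === postId);
  if (!post) return null;
  return { ...post, comments: comments.filter((c) => c.postId === postId) };
}

// Server-side alternative: a pipeline that could be passed to db.posts.aggregate(...).
function buildLookupPipeline(postId) {
  return [
    { $match: { _id: postId } },
    {
      $lookup: {
        from: "comments",       // collection holding the referencing documents
        localField: "_id",      // field on the post
        foreignField: "postId", // field on each comment
        as: "comments",         // name of the output array
      },
    },
  ];
}

console.log(findPostWithComments("p1").comments.length); // 1
```

Either way, reading a post together with its comments costs more than reading one embedded document, which is the trade-off for keeping each document small.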
Example: Blog Posts with Comments (Referencing)

    // posts collection
    {
      _id: ObjectId("post123"),  // placeholder; real ObjectIds are 24-character hex strings
      title: "MongoDB Data Modeling",
      content: "This is a blog post...",
      author: "Jane Smith"
    }

    // comments collection
    {
      _id: ObjectId("comment456"),
      postId: ObjectId("post123"),  // reference to the post
      user: "Alice",
      text: "Great article!"
    }

Many-to-Many Relationships
For many-to-many (like students and courses), use referencing:

    // students collection (IDs are placeholders; real ObjectIds are 24-character hex strings)
    {
      _id: ObjectId("student1"),
      name: "John",
      courses: [ObjectId("course1"), ObjectId("course2")]
    }

    // courses collection
    {
      _id: ObjectId("course1"),
      name: "MongoDB 101",
      students: [ObjectId("student1"), ObjectId("student3")]
    }

Decision Guide
| Scenario | Recommended |
|---|---|
| One-to-one (profile + details) | Embed |
| One-to-few (user + addresses) | Embed |
| One-to-many (blog post + comments) | Reference |
| Many-to-many (students + courses) | Reference |
| Data that grows unbounded (logs) | Reference |
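The table can be read as a small rule of thumb. The function below is only a sketch of that reading; `recommendModel` and its option names are hypothetical, and real designs should still be checked against the application's query patterns.

```javascript
// Rule-of-thumb encoding of the decision table; not an official MongoDB algorithm.
function recommendModel({ relationship, unbounded = false }) {
  // Data that can grow without limit is referenced regardless of relationship type.
  if (unbounded) return "reference";

  switch (relationship) {
    case "one-to-one":
    case "one-to-few":
      return "embed";
    case "one-to-many":
    case "many-to-many":
      return "reference";
    default:
      throw new Error(`Unknown relationship: ${relationship}`);
  }
}

console.log(recommendModel({ relationship: "one-to-few" }));   // embed
console.log(recommendModel({ relationship: "many-to-many" })); // reference
```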
Two Minute Drill
- **Embedding** stores related data in the same document – good for one-to-few relationships and when you always access data together.
- **Referencing** stores related data in separate documents with IDs – good for large or unbounded relationships and when data is accessed independently.
- There's no "always right" choice – it depends on your application's query patterns.
- Consider the 16 MB document size limit when embedding.
- Referencing avoids duplicating data and keeps it consistent, at the cost of extra queries to fetch the related documents.
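To get a feel for the size limit, the sketch below estimates how many embedded items might fit in one document. It is a rough approximation under stated assumptions: it measures JSON text rather than the actual BSON encoding, and the helper names and sample documents are hypothetical.

```javascript
const MAX_DOCUMENT_BYTES = 16 * 1024 * 1024; // the 16 MB document limit

// Rough approximation: byte length of the JSON text, NOT the exact BSON size.
function approximateSizeBytes(doc) {
  return Buffer.byteLength(JSON.stringify(doc), "utf8");
}

// Estimates how many items of a typical size fit before the limit is reached.
function estimateMaxEmbedded(baseDoc, sampleItem) {
  const remaining = MAX_DOCUMENT_BYTES - approximateSizeBytes(baseDoc);
  return Math.floor(remaining / approximateSizeBytes(sampleItem));
}

const basePost = { _id: "p1", title: "MongoDB Data Modeling", comments: [] };
const sampleComment = { user: "Alice", text: "Great article!" };

console.log(estimateMaxEmbedded(basePost, sampleComment));
```

Documents usually become slow to read and update long before they reach the limit, so an estimate like this is an upper bound rather than a target.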
