Add DataFrame.merge() term entry #6800
Merged
10 commits
d362ed1 [Term Entry] Pandas DataFrame: .merge() - Add documentation for mergi… (Enrique-Macias)
019751a Update merge.md: Simplify code example and clarify output explanation… (Enrique-Macias)
a7f13d8 Add merge.md term entry for DataFrame.merge() method (Enrique-Macias)
b96a87d Merge branch 'main' into py-pd-df-merge (mamtawardhani)
e27b907 Apply suggestions from code review (Enrique-Macias)
db5e35a Add interactive Codebyte example for DataFrame.merge() in merge.md to… (Enrique-Macias)
2dd3d6b minor fixes (mamtawardhani)
78c3b23 Merge branch 'main' into py-pd-df-merge (mamtawardhani)
0d0b453 Merge branch 'main' into py-pd-df-merge (Sriparno08)
9640f1d Update merge.md (Sriparno08)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,117 @@ | ||
--- | ||
Title: '.merge()' | ||
Description: 'Merges two DataFrames based on a common key or index.' | ||
Subjects: | ||
- 'Computer Science' | ||
- 'Data Science' | ||
Tags: | ||
- 'Join' | ||
- 'Pandas' | ||
CatalogContent: | ||
- 'learn-python-3' | ||
- 'paths/computer-science' | ||
--- | ||
|
||
In Pandas, the **`.merge()`** method combines two DataFrames using a common key column or index, similar to a SQL `JOIN` operation. It is typically used to integrate datasets that share related fields. | ||
|
||
## Syntax | ||
|
||
The `.merge()` method provides a flexible way to combine DataFrames using different types of joins. The syntax below shows its most commonly used parameters: | ||
|
||
```pseudo | ||
DataFrame.merge(right, how='inner', on=None, left_on=None, right_on=None, left_index=False, right_index=False, suffixes=('_x', '_y')) | ||
``` | ||
|
||
**Parameters:** | ||
|
||
- `right`: The DataFrame or named Series to merge with. | ||
- `how`: Type of merge to perform: | ||
  - `'inner'`: Include only matching rows from both DataFrames. | ||
  - `'outer'`: Include all rows from both DataFrames, with `NaN`s where no match is found. | ||
  - `'left'`: Include all rows from the left DataFrame and matching ones from the right. | ||
  - `'right'`: Include all rows from the right DataFrame and matching ones from the left. | ||
- `on`: Index level or column names to join on. Must exist in both DataFrames. | ||
- `left_on`: Column(s) or index level(s) in the left DataFrame to use as join keys. | ||
- `right_on`: Column(s) or index level(s) in the right DataFrame to use as join keys. | ||
- `left_index`: If `True`, use the index from the left DataFrame as the join key. | ||
- `right_index`: If `True`, use the index from the right DataFrame as the join key. | ||
- `suffixes`: The suffixes to apply to overlapping column names from the left and right DataFrames. | ||
|
||
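As a rough illustration of `left_on`, `right_on`, and `suffixes`, the sketch below merges two hypothetical DataFrames (`employees` and `managers`, invented here for illustration) whose key columns have different names and which share a non-key column called `name`:

```py
import pandas as pd

# Hypothetical data: the key is 'manager_id' on the left and 'id' on the right
employees = pd.DataFrame({
    'name': ['Alice', 'Bob'],
    'manager_id': [10, 20]
})

managers = pd.DataFrame({
    'id': [10, 20],
    'name': ['Carol', 'Dan']
})

# Join on differently named key columns; 'name' exists in both inputs,
# so the custom suffixes distinguish the two copies in the result
result = employees.merge(
    managers,
    left_on='manager_id',
    right_on='id',
    suffixes=('_employee', '_manager')
)

print(result.columns.tolist())
```

Both key columns (`manager_id` and `id`) are kept in the result because they have different names, and the overlapping `name` column appears as `name_employee` and `name_manager`.
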
## Example | ||
|
||
This example demonstrates a basic merge operation between two DataFrames using a common `'id'` column: | ||
|
||
```py | ||
import pandas as pd | ||
|
||
# Create two sample DataFrames | ||
df1 = pd.DataFrame({ | ||
    'id': [1, 2, 3, 4], | ||
    'name': ['Alice', 'Bob', 'Charlie', 'David'] | ||
}) | ||
|
||
df2 = pd.DataFrame({ | ||
    'id': [1, 2, 3, 5], | ||
    'age': [25, 30, 35, 40] | ||
}) | ||
|
||
# Merge the DataFrames on the 'id' column | ||
merged_df = df1.merge(df2, on='id') | ||
|
||
# Print the merged DataFrame | ||
print(merged_df) | ||
``` | ||
|
||
The code produces this output: | ||
|
||
```shell | ||
   id     name  age | ||
0   1    Alice   25 | ||
1   2      Bob   30 | ||
2   3  Charlie   35 | ||
``` | ||
|
||
> **Notes:** | ||
> | ||
> - Only rows with matching `id` values (1, 2, and 3) are included in the result. | ||
> - The row with `id=4` from `df1` is excluded because it has no match in `df2`. | ||
> - The row with `id=5` from `df2` is excluded because it has no match in `df1`. | ||
|
||
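To inspect which rows an inner merge leaves out, one option is an outer merge with `indicator=True`, a pandas parameter that is not listed in the syntax above. This is a minimal sketch that reuses the `df1` and `df2` data from the example; the names `audit` and `unmatched` are chosen here for illustration:

```py
import pandas as pd

df1 = pd.DataFrame({
    'id': [1, 2, 3, 4],
    'name': ['Alice', 'Bob', 'Charlie', 'David']
})

df2 = pd.DataFrame({
    'id': [1, 2, 3, 5],
    'age': [25, 30, 35, 40]
})

# An outer merge keeps unmatched rows; indicator=True adds a '_merge'
# column recording whether each row came from the left, right, or both
audit = df1.merge(df2, on='id', how='outer', indicator=True)

# Rows that an inner merge would exclude
unmatched = audit[audit['_merge'] != 'both']
print(unmatched[['id', '_merge']])
```

The `_merge` column holds `left_only` for the `id=4` row and `right_only` for the `id=5` row, matching the exclusions described in the notes.
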
## Codebyte Example | ||
|
||
This codebyte example demonstrates different types of merges and their effects on the resulting DataFrame: | ||
|
||
```codebyte/python | ||
import pandas as pd | ||
|
||
# Create two sample DataFrames | ||
df1 = pd.DataFrame({ | ||
    'id': [1, 2, 3, 4], | ||
    'name': ['Alice', 'Bob', 'Charlie', 'David'] | ||
}) | ||
|
||
df2 = pd.DataFrame({ | ||
    'id': [1, 2, 3, 5], | ||
    'age': [25, 30, 35, 40] | ||
}) | ||
|
||
# Display original DataFrames | ||
print("Original DataFrames:") | ||
print("\nDataFrame 1:") | ||
print(df1) | ||
print("\nDataFrame 2:") | ||
print(df2) | ||
|
||
# Demonstrate different merge types | ||
print("\n1. Inner Merge (default):") | ||
print(df1.merge(df2, on='id')) | ||
|
||
print("\n2. Left Merge:") | ||
print(df1.merge(df2, on='id', how='left')) | ||
|
||
print("\n3. Right Merge:") | ||
print(df1.merge(df2, on='id', how='right')) | ||
|
||
print("\n4. Outer Merge:") | ||
print(df1.merge(df2, on='id', how='outer')) | ||
``` |
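As a quick sanity check on the four merge types in the codebyte, the following sketch (using the same `df1` and `df2`, with `row_counts` as an illustrative name) compares how many rows each one returns:

```py
import pandas as pd

df1 = pd.DataFrame({
    'id': [1, 2, 3, 4],
    'name': ['Alice', 'Bob', 'Charlie', 'David']
})

df2 = pd.DataFrame({
    'id': [1, 2, 3, 5],
    'age': [25, 30, 35, 40]
})

# Count the rows produced by each value of the 'how' parameter
row_counts = {
    how: len(df1.merge(df2, on='id', how=how))
    for how in ['inner', 'left', 'right', 'outer']
}

print(row_counts)
```

With this data, the inner merge keeps only the three shared `id` values, the left and right merges each keep four rows, and the outer merge keeps all five distinct `id` values.
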