[Edit] Pandas Built-in Functions: .unique() #7404

Merged: 4 commits, Aug 11, 2025
95 changes: 60 additions & 35 deletions content/pandas/concepts/built-in-functions/terms/unique/unique.md
@@ -1,81 +1,106 @@
---
Title: '.unique()'
Description: 'Returns an array containing all the unique elements in the data series, with no specific order.'
Description: 'Returns a NumPy array of the unique values in the order they appear in the Series.'
Subjects:
- 'Computer Science'
- 'Data Science'
- 'Data Visualization'
Tags:
- 'Arrays'
- 'Data'
- 'Encoding'
- 'Functions'
- 'Pandas'
CatalogContent:
- 'learn-python-3'
- 'paths/computer-science'
- 'paths/data-science'
- 'paths/data-science-foundations'
---

The **`.unique()`** function returns unique values from a data series using a hash table. It operates similarly to `numpy.unique()` but is notably faster, especially with large datasets, and it also includes NA values.
The Pandas **`.unique()`** function returns a [NumPy array](https://www.codecademy.com/resources/docs/numpy/ndarray) containing the unique elements of a Series in the order they first appear. Unlike [NumPy's](https://www.codecademy.com/resources/docs/numpy) `unique()` function, it does not sort the result, which can make it more efficient for large Series with repeated elements, and it includes `NaN` among the returned values.

## Syntax
## Pandas `.unique()` Syntax

```pseudo
pd.unique(data_series)
series.unique()
```

The `data_series` parameter represents a 1-dimensional array-like data structure from which unique elements will be returned by the function. The `dtype` of the return matches that of the input, which can be of Index, Categorical, or Series type. The function lists the unique elements in the order they appear in the input data series, and it does _NOT_ sort them.
**Parameters:**

## Example
The `.unique()` function takes no parameters.

The following example demonstrates the use of the `.unique()` function:
**Return value:**

Returns a NumPy array containing the unique values from a Pandas Series, in the order they appear.
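
As a rough illustration of the ordering behavior, the sketch below compares `Series.unique()` with `numpy.unique()` on the same data. The values and variable names are illustrative only:

```py
import numpy as np
import pandas as pd

series = pd.Series([3, -1, 5, -1, 2, 1, 3])

# Series.unique() keeps values in order of first appearance
pandas_unique = series.unique()

# numpy.unique() returns the unique values sorted in ascending order
numpy_unique = np.unique(series.to_numpy())

print(pandas_unique.tolist())  # [3, -1, 5, 2, 1]
print(numpy_unique.tolist())   # [-1, 1, 2, 3, 5]
```

If sorted output is needed from Pandas, one option is to sort the Series first or to sort the returned array afterwards.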

## Example 1: Basic Usage of `.unique()`

In this example, `.unique()` is used to return all the unique elements in `series`:

```py
import pandas as pd

series = pd.Series([3, -1, 5, -1, 2, 1, 3, 2, 1, 5, -2, 1, 2])
unique_elements = series.unique()
print(f"The unique elements in series {list(series)} are\n {unique_elements}")
print(unique_elements)
```

The above code outputs the following:
Here is the output:

```shell
The unique elements in series [3, -1, 5, -1, 2, 1, 3, -2, 1, 5, 2, 1, 2] are
[3 -1 5 2 1 3 -2]
[ 3 -1  5  2  1 -2]
```

## Codebyte Example
## Example 2: Using `.unique()` on a DataFrame Column

The code below shows off the effects of `unique()` on different kinds of data types: Index, Categorical, and Series. After defining the array-like objects, the `unique()` method is applied to list out the unique elements of each object, and the resulting data is printed out to the console.
In this example, `.unique()` is used to return all the unique names from the `Name` column of the `df` [DataFrame](https://www.codecademy.com/resources/docs/pandas/dataframe):

```codebyte/python
```py
import pandas as pd

index = pd.Index([
pd.Timestamp("20160101", tz="US/Eastern"),
pd.Timestamp("20160101", tz="US/Eastern"),
pd.Timestamp("20160102", tz="US/Eastern"),
pd.Timestamp("20160101", tz="US/Central"),
])
df = pd.DataFrame({
'Name': ['Alice', 'Bob', 'Alice', 'David', 'Bob'],
'Age': [25, 30, 25, 40, 30]
})

unique_names = df['Name'].unique()

print("Unique elements in Index:")
print(pd.unique(index))
print(unique_names)
```

Here is the output:

grades = pd.Categorical(['A', 'B', 'B+', 'C-', 'D', 'A', 'B', 'A', 'B-', 'F'], categories=['A', 'A-', 'B+', 'B', 'B-', 'C+', 'C', 'C-', 'D+', 'D', 'F'], ordered=True)
```shell
['Alice' 'Bob' 'David']
```

print("\nUnique elements in Categorical:")
print(pd.unique(grades))
## Codebyte Example: Dealing with Missing Values Using `.unique()`

string_series = pd.Series(['John', 'Jack', 'Ellen', 'Kirsten', 'Jack', 'John Jr', 'Kristen', 'Ellen'])
This codebyte example shows how `.unique()` deals with missing values:

print("\nUnique elements in String Series:")
print(pd.unique(string_series))
```codebyte/python
import pandas as pd

int_series = pd.Series([2 * n for n in range(10)] + [3 * n for n in range(5)])
data_with_nan = pd.Series([1, 2, 2, None, 3, None, 1])

print("\nUnique elements in Integer Series:")
print(pd.unique(int_series))
unique_with_nan = data_with_nan.unique()

print(unique_with_nan)
```
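
If missing values should not be reported as a unique value, one possible approach is to drop them before calling `.unique()`. The following is a minimal sketch using the same data as the codebyte; `unique_without_nan` is an illustrative name:

```py
import pandas as pd

# None is stored as NaN, so the Series has a float dtype
data_with_nan = pd.Series([1, 2, 2, None, 3, None, 1])

unique_values = data_with_nan.unique()
print(len(unique_values))  # 4: NaN counts once as a unique value

# Drop missing values first when NaN should be left out
unique_without_nan = data_with_nan.dropna().unique()
print(unique_without_nan.tolist())  # [1.0, 2.0, 3.0]
```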

## Frequently Asked Questions

### 1. Does `.unique()` work on DataFrames directly?

No. A DataFrame has no `.unique()` method; it is defined on Series (and Index) objects. To find the unique values in a DataFrame column, select the column first:

```py
df['column_name'].unique()
```
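
To collect unique values for every column at once, one option is to loop over the columns and call `.unique()` on each. This is only a sketch; the DataFrame and the `unique_per_column` name are illustrative:

```py
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Alice', 'David', 'Bob'],
    'Age': [25, 30, 25, 40, 30]
})

# Call .unique() on each column (a Series) and collect the results by column name
unique_per_column = {column: list(df[column].unique()) for column in df.columns}

print(unique_per_column['Name'])
print(unique_per_column['Age'])
```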

### 2. What is the difference between `.unique()` and `.nunique()`?

- `.unique()` returns a NumPy array of the unique values.
- `.nunique()` returns the count of unique values.
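
One detail worth checking when comparing the two is missing data: `.unique()` includes `NaN`, while `.nunique()` excludes it unless `dropna=False` is passed. A small sketch with illustrative values:

```py
import pandas as pd

series = pd.Series([1, 2, 2, None, 3, None, 1])

length_of_unique = len(series.unique())          # NaN is included
count_default = series.nunique()                 # NaN is excluded by default
count_with_nan = series.nunique(dropna=False)    # NaN is counted

print(length_of_unique)  # 4
print(count_default)     # 3
print(count_with_nan)    # 4
```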

### 3. What is the difference between `.unique()` and `.drop_duplicates()` in Pandas?

- `.unique()` is used on a single Series and returns a NumPy array of unique values in the order they appear.
- `.drop_duplicates()` is used on a Series or DataFrame and returns a Pandas object (Series or DataFrame) with duplicate rows or values removed.
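
The difference can be sketched as follows, using illustrative data. Note that `.drop_duplicates()` keeps the original index labels of the first occurrences:

```py
import pandas as pd

series = pd.Series(['Alice', 'Bob', 'Alice', 'David', 'Bob'])

# .unique() returns an array of values, with no index
as_array = series.unique()

# .drop_duplicates() returns a Series and keeps the index of each first occurrence
as_series = series.drop_duplicates()

print(list(as_array))            # ['Alice', 'Bob', 'David']
print(type(as_series).__name__)  # Series
print(as_series.index.tolist())  # [0, 1, 3]
```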