I recently tracked down a subtle bug in our code. The code I was debugging has a single purpose: take a list of items and remove the duplicates while preserving the order of the original list.

After adding tests, it turned out that the order wasn't always preserved as expected.

The code that stripped the duplicates out of the list was very simple:

>>> my_list = [1, 5, 5, 2, 6, 6]
>>> my_unique_list = list(set(my_list))
>>> print(my_unique_list)
[1, 2, 5, 6]

This looked fine, but the order of the original list was not preserved, because a set is an unordered collection. After trying several different approaches, I settled on:

>>> from collections import OrderedDict
>>> my_list = [1, 5, 5, 2, 6, 6]
>>> my_unique_list = list(OrderedDict.fromkeys(my_list))
>>> print(my_unique_list)
[1, 5, 2, 6]

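As of Python 3.7, where the language guarantees that a regular dict preserves insertion order (in CPython 3.6 this was only an implementation detail), you can change this to: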

>>> my_list = [1, 5, 5, 2, 6, 6]
>>> my_unique_list = list(dict.fromkeys(my_list))
>>> print(my_unique_list)
[1, 5, 2, 6]
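
Since the original bug only surfaced once tests were in place, it can help to wrap the one-liner in a small helper and pin the expected order down with a few assertions. This is a minimal sketch, assuming Python 3.7+ and hashable items; the names `unique_in_order` and `test_unique_in_order` are made up for illustration and are not from our codebase:

```python
def unique_in_order(items):
    """Return items with duplicates removed, keeping the first occurrence of each."""
    # dict keys are unique, and a dict keeps insertion order (guaranteed in 3.7+),
    # so the keys come back in the order they first appeared in `items`.
    return list(dict.fromkeys(items))


def test_unique_in_order():
    # The example from this post: order of first appearance is kept.
    assert unique_in_order([1, 5, 5, 2, 6, 6]) == [1, 5, 2, 6]
    # Strings work the same way, since they are hashable too.
    assert unique_in_order(["b", "a", "b", "c", "a"]) == ["b", "a", "c"]
    # An empty input yields an empty list.
    assert unique_in_order([]) == []


if __name__ == "__main__":
    test_unique_in_order()
    print(unique_in_order([1, 5, 5, 2, 6, 6]))
```

The test function uses plain asserts, so it can be run directly or picked up by a test runner such as pytest.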

Just be aware that this approach has one drawback: it tends to be slower than the set approach, so it's worth measuring if your lists are large.
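
How big the difference is depends on your data and Python version, so it's best to measure rather than guess. Here is a rough sketch using the standard library's `timeit`; the list size, value range, and repeat count are arbitrary choices for illustration, not figures from our code:

```python
import random
import timeit

# Hypothetical test data: 100,000 integers drawn from 1,000 distinct values.
random.seed(0)
data = [random.randrange(1000) for _ in range(100_000)]

# Time each approach over the same number of runs.
runs = 20
set_time = timeit.timeit(lambda: list(set(data)), number=runs)
dict_time = timeit.timeit(lambda: list(dict.fromkeys(data)), number=runs)

print(f"set:           {set_time:.3f}s for {runs} runs")
print(f"dict.fromkeys: {dict_time:.3f}s for {runs} runs")
```

Both approaches find the same unique values; only `dict.fromkeys` keeps them in order of first appearance.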
