“The SharePoint list is a mess, there’re so many duplicate items that it would take hours to delete them, can I build a Power Automate flow to do it for me?”
Power Automate flow doesn’t have to be always a repeatable flow, started on regular basis or on specific event. It can be a flow that you’ll start manually only once or twice, just to do a specific task. Some task that has clear rules, but is very time consuming. For example, deletion of duplicate items from a SharePoint list.
You could do it manually – create a view, sort the items, and go through all of them and delete duplicates. But the manual approach has a few downsides: it can take some time, and it’s easy to make a mistake. You can miss one duplicate, or delete a unique item without even noticing. That’s when it makes sense to delegate the work to a flow – to do it fast and without any mistakes.
Get all the items
If you want to search for duplicates and delete them, you must find them first. Get all the items using the ‘Get items’ action on the list. This is also the place where you define which of the duplicates you want to keep. Is it the first item, or the last item? Use the Order By field to keep the desired items at the top, e.g. to keep the latest items:
Modified desc
Find the duplicate items
Since you want to keep only the unique items, there must be a column that should stay unique. ‘Select’ only this column, e.g. the Title, and in a following ‘Compose’ keep only the unique values. It’s the same approach as used to send only one email per user.
union(body('Select'),body('Select'))
The result of ‘Compose’ will be only the unique values. Go through the values one by one inside ‘Apply to each’, and for each of them find the duplicates using ‘Filter array’.
You’re searching for all the items where the column that should be unique contains the currently processed unique value.
Remove only the duplicates
Here comes the tricky part. The ‘Filter array’ gives you all the items with the specific unique value, now you want to delete all of them except one – the one that should stay. And since you sorted the items already in the ‘Get items’, it’s the first item it’ll find.
Use the skip(…) expression to skip the first item and loop through the rest in another ‘Apply to each’.
skip(body('Filter_array'),1)
Then just delete all the remaining items using their ID.
item()?['ID']
At the end of the flow you’ll have only the unique items in the list.
Summary
This article described a flow to delete duplicate SharePoint items using Power Automate, but there’s one more idea behind this post: flows are not limited to repeated tasks, even though that’s the main use case. You can build a simple flow also for one-time tasks, like removing duplicates or copying files, the flow doesn’t care.
If the task has clearly defined steps that you’d have to do over and over and over again, you might consider building a flow. But always consider the amount of work! You shouldn’t spend 1 hour building a flow to do a 30 minutes manual task.
Great flow, Tom, thanks for sharing!
Hi
This was exactly what I was looking for, however, I can’t seem to get the filter to output anything. The only difference in my flow is that my Select box has a key/value entry, so I’ve put ‘Title’ as text for the key and the column ‘Title’ in the value.
Can you help?
Thanks
Hello Matt,
use the button on the right side of the ‘Select’ action to hide the ‘Key’ column and keep only the one for the value, as used in the post.
We receive work from List A to List B
I work on List B. I created a Choice column called status- and I added choices of New item, Duplicate item, completed, and Under review.
Sometimes we receive duplicate items in List B from List A, I would like to identify those duplicates every time when a new item is created in List B and I want to change the status to Duplicate item if the manufacturer’s part number is matching.
I used your workflow “Delete Duplicate Items” and made minor modifications. Instead, Delete item I added Update Item to change the status value and change the status value to “Duplicate Item”
The problem I am having all the new items added to my list now the status value has changed to Duplicate Item even though it’s not a duplicate Part#
Hello DT,
I can’t give advice in comments if you modify the flow as I don’t know what your flow looks like. If you’d like to take a look on the flow together you can book a consulting session.
Hi Tom!
Could you share a screenshot of this step, please? I’m having the same issue and I can’t find where to hide the ‘key’ column as you mentioned. Thank you!
Hello Mack,
in the ‘Select’ action click on the small icon highlighted in this article: https://tomriha.com/how-to-select-specific-values-from-an-array-in-power-automate/
Hiya, this is great thanks! Would it be possible to have this as an automated flow?
Hello Dom,
sure, just use Recurrence trigger instead of a manual one.
hi
could you advise if I want to remove the duplicate item which with more than 2 conditions.
For example, employee# and date, only remove when employee# and date are duplicated, and keep the last post one.
item1 employee#2 date 7/21,
item2 employee#2 date 7/21,
item3 employee#2 date 7/20
==>remove item1
Hello Louise,
just use multiple filters in the ‘Filter array’ action: https://tomriha.com/how-to-apply-multiple-conditions-in-the-filter-array-power-automate-action/
Hello,
I would to ask, in skip example.. i would like to keep the last iteam added and delete the oldest
Hello Aris,
that’s what the example does, keeps the newest item.
Hi Tom, this is great!!
I just need to keep the oldest item and delete the newer ones.
how can i edit the expression to do it?
In the get items Order By field, add Created Desc so this will list the items on newest to oldest then when the delete skips the first file it is skipping the newest one.
Hello Arie,
it always keeps the first item returned by the ‘Get items’, and the item at the top is selected by the ‘Order By’ field. If you sort the items in desc (descending order) it’ll keep the newest one, if you sort them in asc (ascending order) it’ll keep the oldest one.
Just replace the value in ‘Order By’ field with Created Desc as Rob wrote.
Great post, really useful. I managed to modify it and add it to my weekly flow that uploads new items to check just the new entries for duplicates in the existing list and remove all but the most recent.
The delete option at the bottom gets greyed out. Even though I have results from the previous filter. Any idea what could be causing this?
Hello pkd,
if it’s grey then there’s nothing to delete, the action was skipped. I’d double check the dynamic content used in the loop.
Hi Guys, found this and looked perfect for what I wanted but all looks ok (i’ve a novice however) and flow runs but nothing is deleted from the sharepoint list.
Basically I have a Microsoft Form which with a power automate, copies the content from the form on submit to a sharepoint list (works a treat); I even have one that runs at 00:00 and clears the sharepoint list for the next days input (again works a treat). However, based on user input in to the form (and it is expected behaviour) that a user could input more than once into the form which gives two (or more) entries in the form / sharepoint list. I only want to keep the most recent entry from the user. The user would input a “pin code” (just a number) to identify their entry. So in the sharepoint list i could have the pins stored (with other data elements of course) of 9000, 9000, 2000, 2000, 3000 (the order being based on newest being ordered first (so the second record of the 9000 is the first entry and the first 9000 is the most recent).
All looks good as far as this flow went and in my mind it would do the job. When I test and run the flow it is green and working but the duplicates in the sharepoint list aren’t deleted.
In the get items section i’m sorting on Modified desc
In the select its Value and I chose the sharepoint field of ParentPin (is that right?)
Compose is doing the union formula
Apply to Each is the compose output
Filter array is Value and ParentPin
Apply to Each 2 is the Skip
Apply to each 3 is the Value
Delete is SharepointID
Test works but no data deleted from the sharepoint list
When I look at the run test history it does appear (from my novice head) that it is taking all the values to start 9000, 9000, 2000, 2000, 3000 and the compose is showing only 9000, 2000,3000 which would be corrected based on the modified date.
Hoping your Obi-wans can save me from Darth Vader 😉
thank you in advance.
Hello Paul,
if you’ve got ‘Apply to each 3’ then it’s a different flow, the flow in the article has only 2 ‘Apply to each’ actions.
Im having an issue on the multi column filter (https://tomriha.com/how-to-apply-multiple-conditions-in-the-filter-array-power-automate-action/) can you create another tutorial which mapped exactly on the above steps just add two columns to filter. Thank you in advance.
Hello Sandman,
I’ll add it to topic ideas list. I’d probably use the concat() expression to connect the two columns together to get the unique values and work with them as a “couple” in the whole flow.
Excellent!! It really helped me!!! I wanted this long time ago and in the end i got it! Thanks!!!!!
Hello, I need help,
Subject: Duplicate item in List B
We receive work from List A to List B
I work on List B. I created a Choice column called status- and I added choices of New item, Duplicate item, completed, and Under review.
Sometimes we receive duplicate items in List B from List A, I would like to identify those duplicates every time when a new item is created in List B and I want to change the status to Duplicate item if the manufacturer’s part number is matching.
Can you some help?
I used your workflow “Delete Duplicate Items” and made minor modifications. Instead, Delete item I added Update Item to change the status value and change the status value to “Duplicate Item”
The problem I am having all the new items added to my list now the status value has changed to Duplicate Item even though it’s not a duplicate Part#
Hi Used the same workflow and replaced Delete with Update Status value to reflect as a Duplicate item.
I appreciate it if you can help me to build a workflow for this SharePoint list. I tried multiple methods but for some reason, it’s updating the status for all new items even though it’s not duplicate.
1. When the item is created
2. Search Testid duplicate
3. if a duplicate testid is found then update the most current date time item Status value to “Duplicate item”
4. Send an email to Manager- duplicate item found in the list take appropriate action.
My Filter Array output in Apply to Each is coming as an empty array. I am using this @equals(item()?[‘field_4’], item())
Could you please help
Hello Garima,
it must be the same column that you use to get the unique values, the one used in the ‘Select’ action. And make sure that you have the ‘Select’ action switched to the text mode only.
Much cleaner, flexible, & more efficient way to do this here…
https://powerusers.microsoft.com/t5/Power-Automate-Cookbook/Find-and-Remove-Duplicates/m-p/2191403#M1611
Awesome, thank you so much