Inventor Engineer-to-Order

Inventor Engineer-to-Order

Reply
Mentor
bsee1
Posts: 211
Registered: ‎11-14-2011
Message 1 of 8 (411 Views)
Accepted Solution

Efficient way to count duplicates in a list?

411 Views, 7 Replies
06-29-2012 06:51 AM

I have a list that has duplicate items.  I'd like to get a count of how many duplicates there are of each item in the list, then remove any duplicates.

 

For example list = {"a","b","b","a","c","a"} would give me something like {{"a",3}, {"b",2}, {"c",1}}

 

I've already done this using a double loop, but it's terribly inefficient. Is there some simple way to do this using prebuilt functions from ETO?  

 

My list will have upwards of 1000 items, which is why efficiency matters so much.  A double loop would mean a million comparisons. (1000 items each compared 1000 times)

*****************************
Win7 x64 - 16gb ram
i7 3610qm
FirePro M4000

Inventor 2013
ETO 6.1
Distinguished Contributor
ludesroc
Posts: 171
Registered: ‎05-03-2005
Message 2 of 8 (399 Views)

Re: Efficient way to count duplicates in a list?

06-29-2012 08:39 AM in reply to: bsee1

Try this...

 

Rule myList1 As List = {"a","b","b","a","c","a"}
Rule myList2 As List = Partition(myList1)
Rule myList3 As List
Dim myList As Any
For Each myList In myList2
myList3 = myList3 + {{first(myList), length(myList)}}
Next myList
End Rule

 

'To remove duplicates from original lst...

Rule myList5 As List = RemoveDuplicates(myList1)

 

Hope that helpd!

 

Regards,

Luc

Ludesroc
Mentor
FarrenYoung
Posts: 264
Registered: ‎07-13-2009
Message 3 of 8 (386 Views)

Re: Efficient way to count duplicates in a list?

06-29-2012 09:13 AM in reply to: ludesroc

While partition is a very handy function, my experience has shown it is extremely slow with large datasets.  I try to avoid it when I can.

 

 You have presented an elegant solution, but I believe the following should be 4-5x faster.  You wont notice much difference on smaller datasets, but when ran with 1,000 plus elements, you will gain a lot of time by not using partition.

 

    Uncached Rule qwe As List
        Dim tLst As List = {"a","b","b","a","c","a"}
        Dim nLst As List = {}
        
        Dim tLstLen As Integer = length(tLst)
        
        Dim uniques As List = removeDuplicates(tLst)
        Dim unique As Any
        For Each unique In uniques
            nLst = nLst + {{unique, tLstLen - length(removeAllOccurrencesOf(unique, tLst))}}
        Next
        
        Return nLst
    End Rule

 

Edit:

I wanted to make sure I wasn't remembering incorrectly so I went ahead and ran a test with/without partition.  Using the same 10,000 element list of single character strings and caching that list before running either test, I came up with 48623 milliseconds using partition and 8338 without partition.

--Farren

************************************************************************************
If this post helps, please click the "thumbs up" to give kudos
If this post answers your question, please click "Accept as Solution"
************************************************************************************
Distinguished Contributor
ludesroc
Posts: 171
Registered: ‎05-03-2005
Message 4 of 8 (381 Views)

Re: Efficient way to count duplicates in a list?

06-29-2012 09:18 AM in reply to: FarrenYoung

Great, I'll write this one down!

 

Thanks for the advise!

Luc

Ludesroc
Mentor
FarrenYoung
Posts: 264
Registered: ‎07-13-2009
Message 5 of 8 (379 Views)

Re: Efficient way to count duplicates in a list?

06-29-2012 09:21 AM in reply to: ludesroc

Yeah for what it's worth, if you're doing a lot of large list processing, it can be faster to write a .net function and call it.  Do your list processing in .net and then pass back the value/s.

--Farren

************************************************************************************
If this post helps, please click the "thumbs up" to give kudos
If this post answers your question, please click "Accept as Solution"
************************************************************************************
Mentor
bsee1
Posts: 211
Registered: ‎11-14-2011
Message 6 of 8 (372 Views)

Re: Efficient way to count duplicates in a list?

06-29-2012 01:53 PM in reply to: FarrenYoung

Wow this is much faster than my double loop method.  In C# I can write fairly efficient code, but with Intent I never know what features the language has.

 

Thanks!

*****************************
Win7 x64 - 16gb ram
i7 3610qm
FirePro M4000

Inventor 2013
ETO 6.1
Employee
venkatt
Posts: 13
Registered: ‎10-31-2007
Message 7 of 8 (305 Views)

Re: Efficient way to count duplicates in a list?

08-07-2012 03:02 PM in reply to: bsee1

Another possible way could be to sort the list first and then count the repeating items in the sorted list.

 

Parameter Rule InputList As List
  Dim inList As List={"a", "d", "b", "c", "aa", "d", "e", "c", "a", "c", "b"}
  Return inList
 End Rule
 
 Rule SortList As List
  SortList=sort(InputList, :Ascending)
 End Rule
 
 Rule CountofItems As List 
  Dim sortedList As List=SortList
  Dim countList As List={}
  Dim lengthSortedList=length(sortedList)
  
  Dim prevItem As Any=first(sortedList)
  Dim currItem As Any
  
  Dim count As Number = 1
  Dim itemcount As Number = 0
  While count <= lengthSortedList   
   currItem=nth(count, sortedList)
   If prevItem=currItem
    itemcount=itemcount+1
   Else
    countList=countList+{prevItem, itemcount}
    prevItem=currItem
    itemcount=0
    count=count-1
   End If
   count=count+1
  End While
  countList=countList+{prevItem, itemcount}
  
  Return countList
 End Rule

 

I'm fairly new to Intent so I can't compare this to other methods. Generally speaking though, sorting a list and iterating through it should be nlogn complexity (assuming the Sort function is nlogn) even otherwise I would think it would be possible to implement something like a QuickSort to sort the items.

 

Using a map (to use a C++ STL term) would be another way - iterating through the items in the list once and saving the count of each item in a map (can be accessed in constant time) but again since I'm not completely familiar with Intent, I don't know how to implement something that would be equivalen to a map (maybe child lists).

Employee
venkatt
Posts: 13
Registered: ‎10-31-2007
Message 8 of 8 (279 Views)

Re: Efficient way to count duplicates in a list?

08-08-2012 04:02 PM in reply to: bsee1

Here is another method that uses a map to store the occurrence of items in the input list. This can be comparable to using a sorted list and could be useful if the data set is large.

 

Parameter Rule InputList As List
  Dim inList As List={"a","d","b","c","aa","d","e","c","a","c","b"}
  Return inList
 End Rule

 

'Count items in a list - using a map
 Rule CountofItems_Map As List  
  Dim countList As List={}
  Dim createMap? As Boolean = defineMap(:countMap, reset?:=True)
  Dim lengthInputList As Number=length(InputList)
  
  Dim currItem As Any
  Dim count As Number = 1
  While count <= lengthInputList 
   currItem=nth(count, InputList)
   Dim currItemName As Name=MakeName(currItem)
   Dim currItemCount As Integer=getMapValue(:countMap, currItemName)
   If currItemCount=NoValue Then
    setMapValue(:countMap, currItemName,1)    
   Else
    setMapValue(:countMap, currItemName,currItemCount+1)
   End If
   count=count+1
  End While
  
  For Each currItem In getMapKeys(:countMap)
   countList=countList+{{currItem,getMapValue(:countMap,currItem)}}
  Next
  
  Return countList
 End Rule 

Announcements
Manufacturing Community
The Manufacturing Community provides additional access to tips, tutorials, blogs and networking with peers.
Need installation help?

Start with some of our most frequented solutions or visit the Installation and Licensing Forum to get help installing your software.