• Industries
  • Products
  • Buy
  • Services & Support
  • Communities
  • Discussion Groups

    Autodesk Inventor Engineer-to-Order

    Reply
    Mentor
    bsee1
    Posts: 172
    Registered: ‎11-14-2011
    Accepted Solution

    Efficient way to count duplicates in a list?

    318 Views, 7 Replies
    06-29-2012 06:51 AM

    I have a list that has duplicate items.  I'd like to get a count of how many duplicates there are of each item in the list, then remove any duplicates.

     

    For example list = {"a","b","b","a","c","a"} would give me something like {{"a",3}, {"b",2}, {"c",1}}

     

    I've already done this using a double loop, but it's terribly inefficient. Is there some simple way to do this using prebuilt functions from ETO?  

     

    My list will have upwards of 1000 items, which is why efficiency matters so much.  A double loop would mean a million comparisons. (1000 items each compared 1000 times)

    *****************************
    Win7 x64 - 16gb ram
    i7 3610qm
    FirePro M4000

    Inventor 2013
    ETO 6.1
    Please use plain text.
    Distinguished Contributor
    ludesroc
    Posts: 134
    Registered: ‎05-03-2005

    Re: Efficient way to count duplicates in a list?

    06-29-2012 08:39 AM in reply to: bsee1

    Try this...

     

    Rule myList1 As List = {"a","b","b","a","c","a"}
    Rule myList2 As List = Partition(myList1)
    Rule myList3 As List
    Dim myList As Any
    For Each myList In myList2
    myList3 = myList3 + {{first(myList), length(myList)}}
    Next myList
    End Rule

     

    'To remove duplicates from original lst...

    Rule myList5 As List = RemoveDuplicates(myList1)

     

    Hope that helpd!

     

    Regards,

    Luc

    Ludesroc
    Please use plain text.
    Mentor
    FarrenYoung
    Posts: 247
    Registered: ‎07-13-2009

    Re: Efficient way to count duplicates in a list?

    06-29-2012 09:13 AM in reply to: ludesroc

    While partition is a very handy function, my experience has shown it is extremely slow with large datasets.  I try to avoid it when I can.

     

     You have presented an elegant solution, but I believe the following should be 4-5x faster.  You wont notice much difference on smaller datasets, but when ran with 1,000 plus elements, you will gain a lot of time by not using partition.

     

        Uncached Rule qwe As List
            Dim tLst As List = {"a","b","b","a","c","a"}
            Dim nLst As List = {}
            
            Dim tLstLen As Integer = length(tLst)
            
            Dim uniques As List = removeDuplicates(tLst)
            Dim unique As Any
            For Each unique In uniques
                nLst = nLst + {{unique, tLstLen - length(removeAllOccurrencesOf(unique, tLst))}}
            Next
            
            Return nLst
        End Rule

     

    Edit:

    I wanted to make sure I wasn't remembering incorrectly so I went ahead and ran a test with/without partition.  Using the same 10,000 element list of single character strings and caching that list before running either test, I came up with 48623 milliseconds using partition and 8338 without partition.

    --Farren

    ************************************************************************************
    If this post helps, please click the "thumbs up" to give kudos
    If this post answers your question, please click "Accept as Solution"
    ************************************************************************************
    Please use plain text.
    Distinguished Contributor
    ludesroc
    Posts: 134
    Registered: ‎05-03-2005

    Re: Efficient way to count duplicates in a list?

    06-29-2012 09:18 AM in reply to: FarrenYoung

    Great, I'll write this one down!

     

    Thanks for the advise!

    Luc

    Ludesroc
    Please use plain text.
    Mentor
    FarrenYoung
    Posts: 247
    Registered: ‎07-13-2009

    Re: Efficient way to count duplicates in a list?

    06-29-2012 09:21 AM in reply to: ludesroc

    Yeah for what it's worth, if you're doing a lot of large list processing, it can be faster to write a .net function and call it.  Do your list processing in .net and then pass back the value/s.

    --Farren

    ************************************************************************************
    If this post helps, please click the "thumbs up" to give kudos
    If this post answers your question, please click "Accept as Solution"
    ************************************************************************************
    Please use plain text.
    Mentor
    bsee1
    Posts: 172
    Registered: ‎11-14-2011

    Re: Efficient way to count duplicates in a list?

    06-29-2012 01:53 PM in reply to: FarrenYoung

    Wow this is much faster than my double loop method.  In C# I can write fairly efficient code, but with Intent I never know what features the language has.

     

    Thanks!

    *****************************
    Win7 x64 - 16gb ram
    i7 3610qm
    FirePro M4000

    Inventor 2013
    ETO 6.1
    Please use plain text.
    Employee
    Posts: 7
    Registered: ‎10-31-2007

    Re: Efficient way to count duplicates in a list?

    08-07-2012 03:02 PM in reply to: bsee1

    Another possible way could be to sort the list first and then count the repeating items in the sorted list.

     

    Parameter Rule InputList As List
      Dim inList As List={"a", "d", "b", "c", "aa", "d", "e", "c", "a", "c", "b"}
      Return inList
     End Rule
     
     Rule SortList As List
      SortList=sort(InputList, :Ascending)
     End Rule
     
     Rule CountofItems As List 
      Dim sortedList As List=SortList
      Dim countList As List={}
      Dim lengthSortedList=length(sortedList)
      
      Dim prevItem As Any=first(sortedList)
      Dim currItem As Any
      
      Dim count As Number = 1
      Dim itemcount As Number = 0
      While count <= lengthSortedList   
       currItem=nth(count, sortedList)
       If prevItem=currItem
        itemcount=itemcount+1
       Else
        countList=countList+{prevItem, itemcount}
        prevItem=currItem
        itemcount=0
        count=count-1
       End If
       count=count+1
      End While
      countList=countList+{prevItem, itemcount}
      
      Return countList
     End Rule

     

    I'm fairly new to Intent so I can't compare this to other methods. Generally speaking though, sorting a list and iterating through it should be nlogn complexity (assuming the Sort function is nlogn) even otherwise I would think it would be possible to implement something like a QuickSort to sort the items.

     

    Using a map (to use a C++ STL term) would be another way - iterating through the items in the list once and saving the count of each item in a map (can be accessed in constant time) but again since I'm not completely familiar with Intent, I don't know how to implement something that would be equivalen to a map (maybe child lists).

    Please use plain text.
    Employee
    Posts: 7
    Registered: ‎10-31-2007

    Re: Efficient way to count duplicates in a list?

    08-08-2012 04:02 PM in reply to: bsee1

    Here is another method that uses a map to store the occurrence of items in the input list. This can be comparable to using a sorted list and could be useful if the data set is large.

     

    Parameter Rule InputList As List
      Dim inList As List={"a","d","b","c","aa","d","e","c","a","c","b"}
      Return inList
     End Rule

     

    'Count items in a list - using a map
     Rule CountofItems_Map As List  
      Dim countList As List={}
      Dim createMap? As Boolean = defineMap(:countMap, reset?:=True)
      Dim lengthInputList As Number=length(InputList)
      
      Dim currItem As Any
      Dim count As Number = 1
      While count <= lengthInputList 
       currItem=nth(count, InputList)
       Dim currItemName As Name=MakeName(currItem)
       Dim currItemCount As Integer=getMapValue(:countMap, currItemName)
       If currItemCount=NoValue Then
        setMapValue(:countMap, currItemName,1)    
       Else
        setMapValue(:countMap, currItemName,currItemCount+1)
       End If
       count=count+1
      End While
      
      For Each currItem In getMapKeys(:countMap)
       countList=countList+{{currItem,getMapValue(:countMap,currItem)}}
      Next
      
      Return countList
     End Rule 

    Please use plain text.