Speed issues when using GoExcel functions in large assembly - updating properties

Speed issues when using GoExcel functions in large assembly - updating properties

mcloughlin_b
Enthusiast Enthusiast
675 Views
7 Replies
Message 1 of 8

Speed issues when using GoExcel functions in large assembly - updating properties

mcloughlin_b
Enthusiast
Enthusiast

Hi Boffins

 

I have cobbled up some code to bulk update document properties (standard and custom) from an external spreadsheet using the GoExcel commands. I have this working at the moment but the code takes a heck of a long time for large assemblies. I am certain it is because of the calls to Excel to read in all the properties I am wanting to update (for each reference in the assembly). I have been looking into reading in the excel data (property names in row A and property values in row B) in one hit and storing this in an array or something similar and then updating all referenced doc properties to suit, without having to parse the spreadsheet with each document and each property. My thinking is that this will speed things up. 

 

Firstly, am I on the right track here and secondly how do I get my values into an array or list for processing?

 

Further information: the way i have set this up is that the current doc properties (assembly) are processed first (this is fast ) and then the user chooses whether to update all referenced docs. My spreadsheet is set up so that blank cells are ignored (document property will retain current value), cells with a space clear the document property and cells with text overwrite the property. I also check if the custom property exists in the document and add if missing.

 

I know for sure this is doable as i can find lots of mention along these lines but unfortunately have not tracked down any examples.

 

Any ideas, thoughts or such are most welcome.

 

Thanks in advance.

0 Likes
676 Views
7 Replies
Replies (7)
Message 2 of 8

Michael.Navara
Advisor
Advisor

Perhaps you are on the right way, but I'm not certain.

You can try this performance test. When I try to find and read data from Excel 1000x I got following result. It depends on what do you do with the value.

 

Performance test rule:

 

 

 

 

Dim xlsFileName = "C:\Path\To\Properties.xlsx"

Dim startTime = DateTime.Now
For partIndex = 1 To 100
	For propIndex = 1 To 10
		Dim rowIndex = GoExcel.FindRow(xlsFileName, "Sheet1", "PartNumber", "=", "Part" & partIndex)
		Dim propValue = GoExcel.CurrentRowValue("Prop" & propIndex)
		Logger.Debug(propValue)
	Next
Next

Dim duration = DateTime.Now - startTime
Logger.Info(duration.TotalMilliseconds)

 

 

 

 

 

Result (Inv 2023):

When logging level is set to Debug duration is about 600 ms for first run. It is required to clear log window otherwise the time grows.

When logging level is set to Info duration is about 20 ms.

 

 

Edit:

If you want to increase performance, you can find the row only once for each PartNumber

 

 

Dim xlsFileName = "C:\Path\To\Properties.xlsx"

Dim startTime = DateTime.Now
For partIndex = 1 To 100
	Dim rowIndex = GoExcel.FindRow(xlsFileName, "Sheet1", "PartNumber", "=", "Part" & partIndex)
	For propIndex = 1 To 10
		Dim propValue = GoExcel.CurrentRowValue("Prop" & propIndex)
		Logger.Debug(propValue)
	Next
Next

Dim duration = DateTime.Now - startTime
Logger.Info(duration.TotalMilliseconds)

 

 

 

In this case when logging level is set to Info duration is about 2-5 ms.

0 Likes
Message 3 of 8

mcloughlin_b
Enthusiast
Enthusiast

Hi michael

thanks for this.

the problem is that i am updating multiple properties across multiple sub-assemblies and their parts.

i also first need to verify the properties (mostly custom properties) exist in the files and then set values from the spreadsheet. So in effect i am querying the excel file many many times as each property is checked and then updated, times multiple files.

I was just wondering if there is a way to read in all of the excel fields in one hit and then parse through the model file properties.

 

Cheers

0 Likes
Message 4 of 8

Michael.Navara
Advisor
Advisor

This is no problem. You can build your own data structure based on XLS data. But it depends on how you want to find property value and how is your data structured. You need to specify this first. For example you can use function like this

 

Public Function GetPropertyValue(partNumber as String, propertyName as String) As Object 

 

And in implementation of this function you can query the value from underlying data structure.

 

At the beginning you read the data from XLS using similar method mentioned above and create your data structure. The fastest collection is Dictionary. In this case you can use two nested dictionaries the first for partNumber, the second for propertyName.

 

Dim xlsData as Dictionary(Of string, Dictionary(Of String, Object))

'Query sample:
Dim propValue = xlsData(partNumber)(propertyName)

 

 

Message 5 of 8

WCrihfield
Mentor
Mentor

Hi @mcloughlin_b.  Just adding another tip into the conversation.  If speed is a concern, and you are working with a lot of data checks/comparisons, in a lot of files, then the top suggestion would likely be to abandon the use of those GoExcel iLogic shortcut snippets entirely, in favor of accessing Excel through its own native API, and include the necessary 'AddReference' & 'Imports' lines in there, so your code will recognize the data Types unique to Excel.

 

AddReference "Microsoft.Office.Interop.Excel.dll"
Imports Microsoft.Office.Interop
Imports Microsoft.Office.Interop.Excel

 

There is a bit of a learning curve involved with doing things that way, but you will reap the rewards of the added speed you are looking for, and learn some useful stuff in the process.  The Dictionary idea, as Michael mentioned above, is always a good one when you want to keep sets of data together in a list type arrangement.  Maybe the quickest & simplest way to get get/set all data in an excel sheet though is by way of a 2-dimensional Array of Object.  Array's can be awkward to work with, because they are zero based (first item is at zero index, not 1), but it will retain the original layout of data in its simplest form.  You can get this from Dim oAllData() As Object = oWS.UsedRange.Value (where oWS represents the Worksheet).  And since that is a Read/Write property, after you have made any needed changes to the data in that Array, you can simply write it all back at one time too, by setting that Array as the value of the Range again.  Just another point of view.  Like Michael said, it all depends on your process, and what works best for your specific needs, because there are many ways of doing things.

 

Wesley Crihfield

EESignature

(Not an Autodesk Employee)

Message 6 of 8

mcloughlin_b
Enthusiast
Enthusiast

hi Michael and Wesley

many thanks for your suggestions, both of which I am keen to explore.

I had looked into accessing Excel through the API but could not quite work out how to get both of my excel rows into an array, so ended up using the GoExcel functions to read in each property for each model file. It does sound like I will need to wrap my mind around use of the native Excel functions to speed things up though so can either of you suggest where I can find some readings in this regard?

Also my requirements might be easier as I only intend to read values from the Excel file and update model properties accordingly.

As always many thanks for your time and sharing of knowledge.

 

0 Likes
Message 7 of 8

A.Acheson
Mentor
Mentor

Hi @mcloughlin_b 

Just a little info. Not a whole lot of nice beginners tutorials for VB.NET but I find if you do an internet search using stack overflow and VBA/VB.NET you can usually find what you need in a sample form. An example here.

Of course there is very complex ways to work with excel and very simple so choose something that is easy to understand and convert. 

 

Excel VBA tutorials are very popular and with some tweeking for the VB.NET environment you can usually get them to work. The Microsoft Help file is great also to get some obscure function/method.

 

Here is an example of some BOM processing using excel dll. In the subroutine you will see methods for getting last cell in a range and adding celldata etc. Maybe some of the subroutines could help you. 

If this solved a problem, please click (accept) as solution.‌‌‌‌
Or if this helped you, please, click (like)‌‌
Regards
Alan
Message 8 of 8

WCrihfield
Mentor
Mentor

Here is another good starter link for Microsoft's own VB.NET area for accessing Microsoft Excel.  These links are a bit harder to find, because most searches about coding or Excel take you to VBA stuff (or other programming languages).  This takes you right to the top of the pyramid, so you can work your way down through the objects, then to their methods & properties.

https://learn.microsoft.com/en-us/dotnet/api/microsoft.office.interop.excel?view=excel-pia 

Wesley Crihfield

EESignature

(Not an Autodesk Employee)