Yesterday I accidentally made a discovery
After working well, the Civil 3D software slowed to a crawl on the sampleLine process for the past week. The AutoCAD forum had no help for this specific slowdown, and an email to a C3D expert confirmed that this is a known issue with no real solution other than splitting the drawing into several dozen drawings linked with xrefs and data shortcuts.
So out of frustration I re-tuned the computer (for this machine we paid the extra cost for unlocked hardware and BIOS after the experience with my locked HP laptop), and it fixed the slowdown. But now some of the C3D corridor processes are unusably slow; it looks like C3D has a bimodal optimal hardware configuration.
For 4x4GB [16GB] of memory (two dual-channel kits), the two opposing memory settings are:
- a fast CAS latency of 7 and a slower clock speed of 1650MHz
- a faster memory speed of 1730MHz and a slow CAS latency of 9. I assume that if I turned the memory up to 2000MHz there would be continued improvement but an increasingly worse response time.
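To put the two settings on a common scale, here is a rough sketch that converts CAS cycles into nanoseconds and estimates peak bandwidth. It assumes the quoted MHz figures are effective DDR data rates (so the I/O clock is half that number) and that the modules run in dual-channel mode with 64-bit channels; neither is confirmed in this thread, and the function names are only illustrative.

```python
def cas_latency_ns(cas_cycles: int, data_rate_mts: float) -> float:
    """Approximate CAS delay in nanoseconds for DDR memory.

    Assumption: data_rate_mts is the effective data rate, so the I/O clock
    is data_rate_mts / 2 MHz and one cycle lasts 2000 / data_rate_mts ns.
    """
    return cas_cycles * 2000.0 / data_rate_mts


def peak_bandwidth_mb_s(data_rate_mts: float, channels: int = 2) -> float:
    """Theoretical peak bandwidth in MB/s, assuming 8 bytes per transfer per channel."""
    return data_rate_mts * 8 * channels


# The two settings from the post, plus the 2000MHz case mentioned as a guess.
settings = {
    "CAS 7 @ 1650": (7, 1650),
    "CAS 9 @ 1730": (9, 1730),
    "CAS 9 @ 2000": (9, 2000),
}

for name, (cas, rate) in settings.items():
    print(f"{name}: {cas_latency_ns(cas, rate):.2f} ns CAS delay, "
          f"{peak_bandwidth_mb_s(rate):.0f} MB/s theoretical peak")
```

Under those assumptions the CAS 7 setting responds in roughly 8.5 ns against roughly 10.4 ns for CAS 9 at 1730MHz, while the theoretical bandwidth differs by only about 5%.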
I am considering purchasing a third memory set for this machine to try to satisfy both a ~1800MHz memory speed and a CAS latency of 7:
- the memory I have now is marketed as 2000MHz CAS 9; it works at the above two settings, $210 http://www.newegg.com/Product/Product.aspx?Item=N8
- the original memory we purchased with this machine, PGV38G1600, 1600MHz CAS 9; I never checked what it was actually running at; slow, but $146
- proposed memory, 2133MHz or 2200MHz CAS 9 detuned to 1800MHz and CAS 7; should satisfy the Civil 3D needs at $240 to $600
What do you think?
I think you're chasing the wrong thing. C3D is a single-threaded app that can only utilize one core, so RAM speed is not the bottleneck, even at 1333 MHz. The bottleneck is the CPU. That's why C3D performs so much better with the new 2nd Gen i3/i5/i7 chips... One of the big improvements in that line of chips was the MMU.
Are you REALLY running C3D on Server 2008...? I suspect that might be more the cause of a lot of your problems... Server 2008 does not make an ideal OS for a CAD computer.
Some good points:
- Fairly certain Server 2008 R2 and Win7 have the same core source code (XP and Server 2003 were different): "The fundamental difference between Windows 7 and Windows Server 2008 R2 is how tasks are prioritize...Changing Windows Server 2008 R2 to act like its desktop counterpart is nothing more than flicking a switch." I have been using the server versions for the past five years without a problem. Why? Because it is free for students.
- The processor is clocked at 3.9GHz to compensate for the single-core limit; I tried 4.2GHz but did not see much change from the 300MHz increase.
- The i3/i5/i7 names are marketing designations and have no relation to the actual architecture; they refer somewhat randomly to the very different Nehalem and Sandy Bridge. But taking the gist of what you are saying, I researched and found that the AMD Phenom II and Nehalem have similar memory management unit (MMU) configurations, so I assume Sandy Bridge has the improved configuration you noted. But Sandy Bridge is rated for memory up to 1333MHz while the AMD Phenom is rated for up to 2000MHz, and both can be tuned to higher clocks. I will stop here since I am way out of my area of knowledge, and propose finding someone to answer the memory question if we can form a solid question to ask.
All that said, in theory you could be correct, but in real-world use I found a very big difference in performance for different C3D functions between the two memory configurations.
You are getting yourself into trouble, and chasing ghosts...
Yes, "Nehalem" referred to the early i3/i5/i7. But "Sandy Bridge" is "2nd Gen i3/i5/i7". Two very different things. It's also unwise to try direct comparisons between components of Intel and AMD CPUs, since so many things are structured differently.
In any case, you are running into very bizarre problems. They are not typical. And the thing that is weirdest about what you are doing is that you are trying to use Windows Server as an OS for a CAD workstation. I would view that as insane, if for no other reason than it takes Windows Server so much longer to boot up and shut down, let alone all the rest.
If anything, maybe you should concentrate on what Services you have running. You may be able to tune Windows Server to behave much like a CAD Workstation, but it's not intended to be used that way, so I suspect you'll have to do a lot of tweaking. That all seems like wasted time and expense to me.
What do you have that actually is defined in the drawing containing the Sample Lines?
Are you currently using data shortcuts/references to manage the design data?
How large is the alignment that you are sampling?
Are you sampling a corridor also? If so, does it reside in the same drawing?
As best I can tell, the sample lines are in the worst-case scenario. I am guessing the criteria you asked about are the things that can be done to resolve sample line performance problems, which validates that this is a well-known problem.
The point is that I accidentally solved the sampleLine performance issue by tuning the memory. Today I am going to test different memory settings to see the behavior of the sampleLine and corridor functions. Do you have any suggestions to help ensure the test results are useful?
Also, what is your opinion of this idea: on the AutoCAD C3D help wiki, a performance page that summarizes the topic like this, and at the bottom of the page a table where anyone who cares can place hardware and software configurations that work well for them, like this table. When I built my computer I researched configurations and found very little except for this website, http://www.c3dbenelux.org, which made an exception so I could join and review the test results. Even now I am still finding major new information like "Sandy Bridge has an exceptionally improved memory unit". Nice, but what does it mean in real-world use? If we post to the table what we 'feel' works and post some index scores, then we can compare, learn, and form a more effective community.
I am not surprised that tweaking the memory would have some impact in performance.
However, the performance improvement gained from the time, effort, and money spent on adjusting, tweaking, and improving the hardware configuration (beyond what you find in multiple blogs and even in posts here from Autodesk employees) seems minimal, and even costly, compared to looking at the project/data workflow.
What do you mean? In most cases several people are collaborating and their work has a workflow, so I see your point. In this specific case I am working by myself, so the workflow is nonexistent, unless you mean workflow as in: first build the entire model, then build the sampleLines last, and when it locks up on the last sampleLine, hand the C3D model off as complete. I did that, but did sampleLines before payItem mapping, so maybe I should have done payItem mapping before sampleLines and then run the QTO reports. But the point is that the model becomes unusable at some point.
For context, this specific case uses a location-based model with 450+ surfaces, 77 corridors, 10 alignments, and many sampleLines. The workflow for a location-based C3D model is without doubt complicated, and I am not going to test it to show how complicated it is or whether it is even possible. Hardware tests cost only once; after that the benefit is returned thousands of times. If I find through considerable time, effort, and cost that C3D 'likes' 1800MHz memory with CAS 7, and this is replicable for everyone, then there is no cost to anyone beyond myself. And if it mitigates the need to orchestrate workflows more complicated than the flea-flicker play, good.
To keep this objective and give some real values, attached is a video of the process I will use to test the sampleLine and corridor performance.
These are the steps I propose to use for the 2 opposing memory configurations. Documentation is a CamStudio screen capture, and the 4 measured metrics are save time for sampleLine and corridor, time to map a payItem, and time to rename a surface, all measured in seconds from the video. The test machine specifications are here.
- show computer properties (this view was asked for by Gavini in an email exchange several months ago)
- show advanced system settings performance visual effects, advanced, and virtual memory(Gavini)
- show device manager graphics properties driver (Gavini)
- show C3D performance tuner and manual tune (Gavini)
- show device manager graphics driver (Gavini)
- build a 'test' sample line 99CL1>NB>07 - this step took 80 seconds + 330 seconds to save (the temp drops from 47°C to 39°C, I turned on a floor fan pointed at the open case)
- Map sampleLine with objects (mistake due to mismatch in locations)
- assign a payItem to 99CL1>NB>07 corridor code code set style - several seconds
- rename a corridor surface - several seconds
- save - 70 seconds
- run geekbench score (universally accessible index allows comparing these results with other configurations)
- show CPU-z configuration (often asked for in forum threads)
- done - total duration per test 730 seconds (12 minutes)
- post results to this table; the video is 1GB so it is in a 160MB zip
- summary: this test was supposed to have a quick sampleLine time but it did not so maybe my memory observation is not the whole story - this is a different but similar configuration to what did work.
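Since the metrics are read off the screen capture, a small helper can turn video timestamps into step durations so anyone repeating the test reports them the same way. This is only a sketch: the function names are illustrative, and the timestamps below are hypothetical, chosen so the durations match the 80 s, 330 s, and 70 s figures listed in the steps above.

```python
def to_seconds(stamp: str) -> int:
    """Convert an 'mm:ss' or 'hh:mm:ss' video timestamp to whole seconds."""
    total = 0
    for part in stamp.split(":"):
        total = total * 60 + int(part)
    return total


def step_durations(marks):
    """Map each (label, start, end) timestamp triple to its duration in seconds."""
    return {label: to_seconds(end) - to_seconds(start)
            for label, start, end in marks}


# Hypothetical start/end marks; only the resulting durations come from the post.
marks = [
    ("sampleLine build", "01:30", "02:50"),
    ("sampleLine save", "02:50", "08:20"),
    ("corridor save", "09:00", "10:10"),
]

for label, seconds in step_durations(marks).items():
    print(f"{label}: {seconds} s")
```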
This seems complete, thorough, and documented, and allows a 3rd party to repeat it.
- 4.0GHz AMD Deneb C3, memory timing 9-9-9-24-41-2T 1886MHz, sampleLine = 360s, surfaceRename = <1s, corridorSave = 70s
- 3.7GHz AMD Deneb C3, memory timing 7-8-7-24-24-1T 1654MHz, sampleLine = 200s (note 1), surfaceRename = 10s (note 2), corridorSave = 40s
- Details and video posted here http://cife.stanford.edu/wiki/doku.php?id=granite:
- note 1: in the previous test the alignment sample line list was already open; opening it takes 70s and is included in the low-latency memory test time, meaning the actual spread between times is greater than shown
- note 2: added a space to the name; previous tests only opened and closed the name without editing, so the actual time was not measured, but the 'feel' of the 10s delay is noticeable compared to the faster processor speed setup
- test machine setup here http://cife.stanford.edu/wiki/doku.php?id=cee241:s
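For anyone who wants to post comparable results to a shared table, the two runs above could be kept as records and rendered as wiki markup. The sketch below assumes the wiki uses DokuWiki table syntax (a guess based on the doku.php URLs); the `Run` class and field names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Run:
    cpu: str
    cpu_ghz: float
    timing: str
    mem_mhz: int
    sample_line_s: str      # kept as text so "<1" can be recorded as reported
    surface_rename_s: str
    corridor_save_s: str


HEADERS = ["CPU", "GHz", "Memory timing", "Memory MHz",
           "sampleLine (s)", "surfaceRename (s)", "corridorSave (s)"]


def wiki_table(runs):
    """Render runs as DokuWiki table markup: '^' marks headers, '|' marks cells."""
    lines = ["^ " + " ^ ".join(HEADERS) + " ^"]
    for r in runs:
        cells = [r.cpu, f"{r.cpu_ghz:.1f}", r.timing, str(r.mem_mhz),
                 r.sample_line_s, r.surface_rename_s, r.corridor_save_s]
        lines.append("| " + " | ".join(cells) + " |")
    return "\n".join(lines)


# The two test runs exactly as listed above.
runs = [
    Run("AMD Deneb C3", 4.0, "9-9-9-24-41-2T", 1886, "360", "<1", "70"),
    Run("AMD Deneb C3", 3.7, "7-8-7-24-24-1T", 1654, "200", "10", "40"),
]

print(wiki_table(runs))
```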
By accident I noticed that corridor and sample line performance differed noticeably with different hardware configurations. After testing, a faster processor and memory speed with slower memory response resulted in good corridor processing but slow sampleLine processing, and the opposite, a slower processor and memory speed with faster memory response, resulted in slow corridor processing and faster sampleLine processing. Above are the results with context, using actual values rather than an arbitrary fast and slow.
The final results are not consistent with my initial 'discovery', in which the sampleLine time was better with a faster processor and slower memory response; in testing it has the best time with the opposite configuration. I think the difference in 'feel' is in the time to open the alignment sampleLine list rather than the overall sampleLine time. The difference in feel between the settings is noticeable and annoying enough that I took the time to look for a new configuration and then test, document, and post here for comments; essentially each setting results in an opposing function that feels like it 'doesn't work'. With the low-latency setting, the sampleLine is nearly twice as fast, the corridor surface rename is five times slower, and the corridorSave is nearly twice as fast.
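The ratios between the two runs can be checked directly from the posted timings. In this sketch the "<1s" rename time is entered as 1.0 s, which is an assumption that makes that one ratio a bound rather than a measurement; the dictionary and function names are illustrative.

```python
# Timings in seconds from the two posted runs.
fast_clock = {"sampleLine": 360.0, "surfaceRename": 1.0, "corridorSave": 70.0}
low_latency = {"sampleLine": 200.0, "surfaceRename": 10.0, "corridorSave": 40.0}


def speedup(baseline, candidate):
    """Return baseline/candidate per metric; values above 1 mean candidate is faster."""
    return {name: baseline[name] / candidate[name] for name in baseline}


for name, ratio in speedup(fast_clock, low_latency).items():
    print(f"{name}: low-latency setting is {ratio:.2f}x the speed of the fast-clock setting")
```

This gives 1.80x for sampleLine and 1.75x for corridorSave. The rename figure comes out at 0.10x, but since the faster run was only logged as under 1 s and was not measured the same way, that number should be read as a rough bound.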
I could test further, in greater detail, and generate better test runs with more accurate measurements for comparison, but what is here is sufficient to validate that there is a significant difference in opposing function performance with different hardware settings. Until there is more discussion and reproduction of these results, I am not going to test further.