
You can parse .rvt files

10x3x30r
Contributor


I've been playing with the raw .rvt format for the past few days and realized that it's actually fairly readable. I don't mean just basic metadata like the project version; it seems you can extract quite a lot of names of families, views and parameters. It would probably be more convenient to do this from the command line, but as a quick proof of concept you can just drill into the file with 7zip and keep decompressing until UTF-16 encoded strings start showing... quite a lot of them, at least more than I expected.

 

If you also want a ready-made API to cleanly access each data type, ODA BimRv claims 100% access.

 

I'm posting it as I couldn't find any information on this topic shared here before.

 

From the command line:

7z x FILE.rvt

fd -tf -x binwalk -e

fd -e bin . extractions -X strings -e l

 

You could just run binwalk on the .rvt directly, but I think that way it won't preserve the stream names in the .rvt file (it's in Compound File Binary Format), and they may be useful for later analysis.
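For readers without binwalk at hand, here is a minimal Python sketch of roughly what the `binwalk -e` step does to one extracted stream. It assumes that the embedded compressed data is gzip, which is my assumption and not something confirmed in this post, and the function name `carve_gzip_members` is made up for illustration. It is a sketch of signature carving, not a replacement for binwalk.

```python
import zlib

# gzip signature (1f 8b) followed by the "deflate" method byte (08).
GZIP_MAGIC = b"\x1f\x8b\x08"


def carve_gzip_members(blob: bytes) -> list[tuple[int, bytes]]:
    """Return (offset, decompressed_bytes) for each gzip member found in blob.

    Assumption: the stream embeds gzip data at arbitrary offsets.
    """
    results = []
    pos = blob.find(GZIP_MAGIC)
    while pos != -1:
        # wbits=31 makes zlib expect a gzip header and trailer.
        decompressor = zlib.decompressobj(wbits=31)
        try:
            payload = decompressor.decompress(blob[pos:])
        except zlib.error:
            payload = b""  # false positive: the magic bytes were ordinary data
        if payload:
            results.append((pos, payload))
        pos = blob.find(GZIP_MAGIC, pos + 1)
    return results
```

Calling `carve_gzip_members(open(path, "rb").read())` on a stream extracted by 7z would give you the decompressed chunks together with the offsets they came from, so the stream name stays attached to whatever you find.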


jeremy_tammik
Autodesk

Thank you for the interesting note. You may also be interested in the rvt-app that Peter Hirn just pointed out to me in the past few days:

 

 

Currently, it just reads and displays the basic file info; maybe it could be expanded to include some of the additional information you glean from RVT and RFA as well.

   

Jeremy Tammik Developer Advocacy and Support + The Building Coder + Autodesk Developer Network + ADN Open

10x3x30r
Contributor

Hi Jeremy, I was hoping to catch your interest with it. The holiday season is over, so I don't know if I'll have time to develop it further, but with some luck it would be good to share it via rvt-app; their interface is very nice.

 

rvt-app did come up in an internet search for "revit file format parse" when I was writing this post, so I assume it has already gained some clicks. When I tried it, though, the only non-confidential file I had was from 2015, so sadly it only yielded an error.

10x3x30r
Contributor

Thanks 😅 I've run it on racbasicsampleproject.rvt: results

Like you said, it's overall project info.

I then ran the above commands on this file and extracted what look like family names related to columns and beams from this file.

There are still a lot of mistakes in the output if you are just trying to get column family names, for example.

complete process log
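To make the "lots of mistakes" point concrete, here is a small Python sketch of the `strings -e l` step plus a naive keyword filter. The function names and the keyword list are hypothetical. It only catches runs of printable ASCII characters stored as UTF-16LE, so names with accented or non-Latin characters get split or dropped, and it still has no idea which strings are actually family names.

```python
import re


def utf16le_strings(blob: bytes, min_len: int = 4) -> list[str]:
    """Find runs of printable ASCII characters encoded as UTF-16LE.

    Each character is stored as its ASCII byte followed by 0x00.
    """
    pattern = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    return [m.group().decode("utf-16-le") for m in pattern.finditer(blob)]


def filter_candidates(strings: list[str], keywords: list[str]) -> list[str]:
    """Keep unique strings containing any keyword (case-insensitive)."""
    lowered = [k.lower() for k in keywords]
    seen = set()
    kept = []
    for s in strings:
        text = s.strip()
        if text and text not in seen and any(k in text.lower() for k in lowered):
            seen.add(text)
            kept.append(text)
    return kept
```

Something like `filter_candidates(utf16le_strings(data), ["column", "beam"])` narrows the output, but it is still type-unsafe string matching: a view name or parameter that happens to contain "column" passes the filter just the same.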

jeremy_tammik
Autodesk

Cool. Interesting data. Looks like the beginnings of a BOM extraction. Thank you for sharing, and congratulations on the results.

  


peter_hirn
Contributor

Hey! Nice to see some interest in my weekend project.

 

I didn't add support for Revit 2015 yet, but I hope it is identical to 2016 and therefore extremely easy to add.

 

Regarding confidential files: The app runs directly in the browser and no data is sent to any servers. You can check the source https://github.com/phi-ag/rvt-app but of course you have to trust me that this is actually deployed and there is no hidden build step or something.

 

I love the idea to extract more data from Revit files in the app. There are some low-hanging options, like plain xml files (I think project info and transmission data). Everything inside the partition files would probably need some serious work (just running "strings" or other brute-force extraction is something I don't want to do).


10x3x30r
Contributor

Hi @peter_hirn

 

> Regarding confidential files: The app runs directly in the browser and no data is sent to any servers. You can check the source

 

You are correct. It's just that web browsers run on my desktop with full access to the Internet, so data could leak in principle, but I'm not saying that it will in any way.

 

> I love the idea to extract more data from Revit files in the app. 

 

I will keep that in mind if I ever attempt clean-room reverse engineering of the .rvt format beyond simple type-unsafe strings. Right now I'm too time-constrained with everyday architecture to go further. But who knows, with so many .rvt files flying around on a daily basis there may be a chance to extract some quick insights while crunching another deadline (unless Autodesk purposefully obfuscates the format, which as far as I've seen hasn't been the case so far?).

peter_hirn
Contributor

Hey,

 

yeah, not trusting random browser apps with confidential data is very reasonable.

 

Regarding further reverse engineering I'm in the same situation as you - not enough time. Also guessing the structure of the basic file info was quite easy by just looking at a hex editor. Finding the right pointers into the partition data seems a lot harder and I currently wouldn't even know where to start.

 

My hope is that by publishing the app under MIT every now and then someone will have a little time to push the implementation a bit further and we eventually end up with a good shared - and easy to read - understanding of the file structure.


10x3x30r
Contributor

> we eventually end up with a good shared - and easy to read - understanding of the file structure

until Autodesk changes it next year... to no one's surprise of course, it's an evolving platform and so is its serialization format (e.g. the new toposolid type last year)

 

> I currently wouldn't even know where to start

for you or any curious reader: I found the ODA BimRv codebase to be a very useful starting point for understanding Revit's schema and other inner workings. The ODA guys (who happen to include a statistically significant share of Russians) have already reversed the .rvt format, allegedly to 100% compatibility, every year since 2012 (and maybe even earlier).

peter_hirn
Contributor

I'm aware of ODA and the incredible work they did. I'm currently under the impression that getting access to BimRv would require signing a lengthy contract with ODA, and consequently just porting concepts (not even code) to my app has the potential to end up in a complicated lawsuit. Therefore I'm not going to touch this project at all.

 

I assume ODA reverse-engineered the format by basically dropping Revit into Ghidra (or similar). That's another thing I'm never going to do, and if somebody does please don't contribute to my project.

 

In conclusion: Please don't sue me.


10x3x30r
Contributor

> getting access to BimRv would require signing a lengthy contract

if you mean using it on a project then yes, but if you just want to learn from their rich codebase then the trial version provides you with almost all of it

 

> consequently just porting concepts (not even code) to my app has the potential to end up in a complicated lawsuit. Therefore I'm not going to touch this project at all.

you can't copyright an idea, only its expression (the code), so if Autodesk couldn't destroy ODA (even when they used the TrustedDWG watermark) then I wouldn't worry too much at this stage

 

> I assume ODA reverse-engineered the format by basically dropping Revit into Ghidra (or similar)

yep, IDA I'd say, NSA were probably too busy with PRISM at the time 😅

 

> That's another thing I'm never going to do, and if somebody does please don't contribute to my project.

how else would you do it? Some might argue that rvt-app opening an .rvt file without Revit running already goes against the license. I'm unclear where you draw the line.

peter_hirn
Contributor

Thanks for the info. Your arguments make perfect sense to me and there probably is a lot more wiggle room than I think.

 

Because of the legal implications I'd like to draw the line all the way at the bottom for now. As far as I understand, the way we are currently reverse-engineering the format would be considered clean-room. I don't think it's possible to prohibit us from using 7z, binwalk and a hex editor, and we don't have any knowledge of the proprietary implementation.

 

I currently don't know how to make progress with the clean-room approach. Maybe by using a very small file and then trying to find pointers into the partition data; `Global/PartitionTable` and `Global/ElemTable` look promising.
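One layer that can be read without guessing anything is the container itself. As mentioned at the top of the thread, .rvt is a Compound File Binary file, and that header layout is publicly documented by Microsoft. The sketch below reads only that header; the function name `parse_cfb_header` and the returned keys are my own choices, and it says nothing about the Revit-specific contents of streams like `Global/PartitionTable`.

```python
import struct

# Fixed 8-byte signature at the start of every Compound File Binary file.
CFB_SIGNATURE = bytes.fromhex("d0cf11e0a1b11ae1")


def parse_cfb_header(data: bytes) -> dict:
    """Parse the fixed fields of a CFB header (first 76 bytes)."""
    if len(data) < 76 or data[:8] != CFB_SIGNATURE:
        raise ValueError("not a Compound File Binary file")
    # Offsets 24..33: minor version, major version, byte order mark,
    # sector shift and mini sector shift (all little-endian uint16).
    minor, major, byte_order, sector_shift, mini_shift = struct.unpack_from(
        "<HHHHH", data, 24
    )
    # Offsets 40..75: nine little-endian uint32 fields.
    (num_dir_sectors, num_fat_sectors, first_dir_sector, _transaction,
     mini_cutoff, first_minifat, num_minifat,
     first_difat, num_difat) = struct.unpack_from("<9I", data, 40)
    sector_size = 1 << sector_shift
    return {
        "major_version": major,
        "minor_version": minor,
        "byte_order": byte_order,
        "sector_size": sector_size,
        "mini_sector_size": 1 << mini_shift,
        "num_fat_sectors": num_fat_sectors,
        "first_dir_sector": first_dir_sector,
        # Sector n starts at (n + 1) * sector_size because the header
        # occupies the space of the first sector.
        "directory_offset": (first_dir_sector + 1) * sector_size,
        "mini_stream_cutoff": mini_cutoff,
    }
```

The `directory_offset` value is where the first directory sector starts, which is where the stream names live. Following the FAT chain from there to individual streams is the next step and is not shown here.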

 

The best I can do is ask contributors not to push code that was created through potentially illegal means. Since I don't know the related proprietary implementations, I won't be able to tell the difference just by inspecting the code.

mkoshutanski
Observer

Hey @jeremy_tammik ,

Thank you so much for sharing the sample Revit models.

The first Dropbox link is not working, probably because it points to the Dropbox home folder instead of the shared one. Is it possible to share it with a share link? It would be great to have more sample files, as I'm trying to test Revit Batch Processing with many files and it would be good to test it with more data. Thanks!


jeremy_tammik
Autodesk

Oh dear, sorry about that. Is this one better? 

  

  

I hope so!

   


mkoshutanski
Observer
Observer

Yep, it works.
Thanks, Jeremy!
