Why I don't trust Dropbox and what the Scrivener developer can do about it.

devinganger
Posts: 2027
Joined: Sat Nov 06, 2010 1:55 pm
Platform: Mac, Win + iOS
Location: Monroe, WA 98272 (CN97au)

Mon Apr 13, 2020 10:04 pm Post

dbrooks888 wrote: In my day job I run product management for a DevOps company that relies heavily on Git as the source of truth for large collaborative software projects. (Some of them with 100+ devs checking in changes to shared code.)

I would love to explore what it would take to set up a Git Repo that could handle the Merge and Conflict resolution for the underlying content.rtf and synopsis.txt files in a way that would be easy to use for people who don't even know what Git is.

Is there a resource out there that explains the file structure of Scrivener such that we could take a shot at it?


Until you can get Git to parse and handle RTF files, there's really no point in trying to go further. Once it can handle the RTF format, I suspect the rest would be fairly easy to do.
--
Devin L. Ganger, WA7DLG
Not a L&L employee; opinions are those of my cat
Life has a way of moving you past wants and hopes

nontroppo
Posts: 1217
Joined: Mon Mar 05, 2007 5:22 pm
Platform: Mac
Location: Airstrip One

Tue Apr 14, 2020 12:31 am Post

dbrooks888 wrote: Is there a resource out there that explains the file structure of Scrivener such that we could take a shot at it?


Devin is right: the issue is dealing reliably with RTF. RTF is great because it is, after all, plain text, but it is a pain because it is nowhere near as easy to parse as formats like JSON or XML. On top of that, Scrivener's RTF has a bunch of custom additions to support features that RTF itself can't. But this is all just parsing content; nothing is impossible. Several diff programs have converters for Word or PDF documents that extract the text for comparison.
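
To make the "extract the text, then compare" idea concrete, here is a rough Python sketch. It is not a real RTF parser and knows nothing about Scrivener's custom additions: the list of skipped destinations, the cp1252 decoding of \'hh escapes, and the names rtf_to_text and diff_rtf are all assumptions made for illustration.

```python
import difflib
import re

# Groups whose contents are not body text (fonts, colours, styles, metadata).
# Assumption: this short list is enough for a demo; real files need more.
SKIP_DESTINATIONS = {"fonttbl", "colortbl", "stylesheet", "info", "pict"}

TOKEN = re.compile(
    r"\\([a-z]+)(-?\d+)? ?"   # control word, optional number, optional space
    r"|\\'([0-9a-fA-F]{2})"   # hex-escaped byte such as \'e9
    r"|\\([^a-z'])"           # control symbol such as \{ \} \\ \*
    r"|([{}])"                # group open / close
    r"|([^\\{}]+)"            # run of plain text
)


def rtf_to_text(rtf):
    """Crudely strip RTF markup, keeping paragraph breaks as newlines."""
    out = []
    stack = []         # saved 'skipping' flag for every open group
    skipping = False   # True while inside a destination we ignore
    fallback = 0       # fallback characters to drop after a \uN escape
    for m in TOKEN.finditer(rtf):
        word, arg, hexbyte, symbol, brace, text = m.groups()
        if brace == "{":
            stack.append(skipping)
        elif brace == "}":
            skipping = stack.pop() if stack else False
        elif symbol == "*" or word in SKIP_DESTINATIONS:
            skipping = True
        elif skipping:
            continue
        elif word in ("par", "line"):
            out.append("\n")
        elif word == "tab":
            out.append("\t")
        elif word == "u" and arg is not None:
            code = int(arg)  # RTF stores code points above 32767 as negatives
            out.append(chr(code + 65536 if code < 0 else code))
            fallback = 1     # assumption: one fallback character (\uc1)
        elif hexbyte is not None:
            if fallback:
                fallback = 0
            else:
                # Assumption: the document uses the Windows-1252 code page.
                out.append(bytes.fromhex(hexbyte).decode("cp1252", errors="replace"))
        elif symbol in ("{", "}", "\\"):
            out.append(symbol)
        elif text is not None:
            # Raw line breaks in RTF source are not part of the text.
            text = text.replace("\r", "").replace("\n", "")
            if fallback and text:
                text, fallback = text[1:], 0
            out.append(text)
    return "".join(out)


def diff_rtf(old_rtf, new_rtf):
    """Unified diff of the extracted text of two RTF documents."""
    old = rtf_to_text(old_rtf).splitlines(keepends=True)
    new = rtf_to_text(new_rtf).splitlines(keepends=True)
    return "".join(difflib.unified_diff(old, new, "old", "new"))


SAMPLE = r"{\rtf1\ansi{\fonttbl{\f0 Helvetica;}}\f0 Hello, {\b world}!\par Caf\'e9\par}"
```

Calling diff_rtf(SAMPLE, SAMPLE.replace("world", "there")) gives an ordinary line diff of the visible text. Git can run a converter like this through its diff textconv setting, but that only changes how diffs are displayed; it does nothing for merging, which is the harder half of the problem.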

The "key" file is the .scrivx XML description file, which contains the map of all the unique document IDs in the bundle. This is really nice to parse, and gets you the binder names for all the otherwise anonymous content.rtf documents. Snapshots are quite straightforward: every document has a UUID, and snapshot directories link to that and have an index.xml file that stores the metadata for each snapshot. Settings are mostly text or XML.
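
As a rough illustration of how little code the .scrivx side needs, here is a Python sketch that maps document UUIDs to binder titles. The element and attribute names (BinderItem, UUID, Title, Children) and the sample XML are assumptions for the example, not a documented schema, so check them against a real project file first. The function name binder_titles is made up too.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed stand-in for a .scrivx file.
SAMPLE_SCRIVX = """
<ScrivenerProject>
  <Binder>
    <BinderItem UUID="AAAA-1111" Type="DraftFolder">
      <Title>Manuscript</Title>
      <Children>
        <BinderItem UUID="BBBB-2222" Type="Text">
          <Title>Chapter One</Title>
        </BinderItem>
      </Children>
    </BinderItem>
  </Binder>
</ScrivenerProject>
"""


def binder_titles(scrivx_xml):
    """Return {uuid: title} for every binder item, however deeply nested."""
    root = ET.fromstring(scrivx_xml)
    titles = {}
    # iter() walks the whole tree, so items under <Children> are included.
    for item in root.iter("BinderItem"):
        uuid = item.get("UUID")
        if uuid:
            titles[uuid] = item.findtext("Title", default="")
    return titles
```

With a lookup like this, a tool could label each anonymous content.rtf by its binder title when showing history or conflicts.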

So understanding the Scrivener bundle is really straightforward; the key to a nice Git interface is dealing with RTF. It would of course be great to get a nice Time Machine-like interface to Git+Scrivener 8) 8) 8)

dbrooks888
Posts: 2
Joined: Mon Apr 13, 2020 6:25 pm
Platform: Mac

Tue Apr 14, 2020 1:31 pm Post

I get the RTF file complexity. We wrote our own XML parser for merging for the same reason: Git does a passable but not perfect job of merging very large and regular XML files.

So it looks like that is where the most important investment of time will be.

Thanks also for the direction on the XML file.