From diff3 and elementary graph theory to DVCS in a few short steps
some code for LCA, and a model of revision history, to make a simple
supports named branches, merging, and import/export of revisions. So
far, there’s no network synchronisation protocol, although it’d be
easy to build a simple one using the revision import/export feature
XMLHttpRequest. The storage format and repository representation
is brutally naive, and because it doesn’t yet delta-compress
historical versions of files, it is a bit wasteful of memory.
The following LShift blog posts summarise some of the work I’ve done:
- Using the above to perform merging and DVCS.
a demo of diff, comm, and patch functionality.
a demo of three-way merge and conflict-handling functionality.
Giving TiddlyWiki an awareness of history
After building the heart of the system, I experimentally integrated it
with TiddlyWiki, providing version-control, history, and branch
merging features, as part of collaborative work done with Martin
Budden in June, 2008. You can try
out the demo
online, though to experiment with merging between instances you will
need to save it to your local hard disk using
History-awareness in other kinds of database
Systems like CouchDB and Google Wave already have rudimentary history-merging operations. It’d be a great addition to devise some deeper awareness of history for them, permitting flexible ‘offline’ operation, improving their partition tolerance and in effect making them even more decentralized and distributed than they already are.
All that would be required would be to make a note of parent document revision IDs each time a change is made to a document. Clearly, some protocol for compacting, thinning, or outright pruning overlong histories would also be required eventually, but a history-aware database could lead to some very interesting applications even in the absence of history compaction. Some of the new NoSQL database systems, Riak for example, support custom programmable merge operators, opening the door to full history-aware document merging.
A great application for such a DVCS-style database would be a bug tracker. Taking the step from centralized to decentralized bug tracking could be a boon comparable to the switch from centralized to decentralized source-code control…
Even various calendar- and address-book-synchronisation applications could make good use of history-awareness. Many current systems don’t cope well with more than two parties—generally, one client and one server—being permitted to edit the database.
Getting the Synchrotron code
Synchrotron is hosted on github.
The core interfaces, algorithms and internal structures of the DVCS code seem quite usable to me. In order to get to an efficient DVCS from here, the issues of storage and network formats will have to be addressed. Fortunately, storage and network formats are only about efficiency, not about features or correctness, and so they can be addressed separately from the core system. It will also eventually be necessary to revisit the naive LCA-computation code I’ve written, which is used to select an ancestor for use in a merge.
The code is split into a few different files:
The sources for the diff and diff3 demos and the DVCS demo. In the latter, check out the definition of
presets.preset1for an example of how to use the DVCS, and
presets.ambiguousLCAfor an example of the repository format and the use of the revision import feature.
Graph utilities (for computing LCA etc)
The DVCS and pseudo-file-system code.