
Thanks Tim! I definitely second your observation that there's room and reason for plenty of tools in this space. DVC probably belongs in your list too: https://github.com/iterative/dvc. Looking forward to checking out Open Images.


The coolest thing to do is diff the label_descriptions table between the V2 and V3 branches.

https://www.dolthub.com/repositories/Liquidata/open-images/c...

You can start to see the power of column-wise diffs. You can start to imagine what it would be like to change this table and then merge Google's changes in V3 onto your modified copy. Very powerful. We need a query interface on top of diffs. Lots to build...
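To make the idea of a column-wise diff concrete, here is a minimal Python sketch. It is not how Dolt computes diffs; the table layout (a dict keyed by primary key), the `cell_diff` helper, and the sample rows are all made up for illustration.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]
Table = Dict[str, Row]  # primary key -> row (hypothetical in-memory layout)


def cell_diff(old: Table, new: Table) -> List[Tuple[str, str, Any, Any]]:
    """Return (key, column, old_value, new_value) for every changed cell.

    A missing row or column is reported as None, so this toy version
    cannot tell an absent cell from a stored NULL.
    """
    changes = []
    for key in sorted(old.keys() | new.keys()):
        old_row = old.get(key, {})
        new_row = new.get(key, {})
        for col in sorted(old_row.keys() | new_row.keys()):
            before, after = old_row.get(col), new_row.get(col)
            if before != after:
                changes.append((key, col, before, after))
    return changes


# Invented sample data standing in for two versions of a label table.
v2 = {
    "/m/01": {"description": "cat", "source": "human"},
    "/m/02": {"description": "dog", "source": "human"},
}
v3 = {
    "/m/01": {"description": "Cat", "source": "human"},
    "/m/03": {"description": "bird", "source": "machine"},
}

changes = cell_diff(v2, v3)
for key, col, before, after in changes:
    print(f"{key}.{col}: {before!r} -> {after!r}")
```

Because each change is reported per cell rather than per row, an edit to one column of a row does not hide (or collide with) an edit to a different column of the same row.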


Great work on Dolt, and it's great to see temporal data stores emerging :-)

Regarding XML and JSON, SirixDB[1] already provides full-blown time-travel queries using a fork of Brackit[2], which basically uses XQuery to process and query both XML and JSON documents. That said, SirixDB could in principle also store relational or graph data. The storage engine has been built from scratch to offer the best possible versioning capabilities. However, I've never implemented branching/merging, as I didn't come up with good use cases. It seems it would then be more of a versioning system like Git, just more fine-grained.

I always struggled to implement this, as SirixDB currently allows only a single read-write transaction on a resource. Thus, if it supported branching and merging, users would have to handle conflicts manually when merging (or automatically, using a merge strategy, which is often not a good fit).
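As a rough illustration of that trade-off, here is a small Python sketch of a cell-level three-way merge. It is not SirixDB's or Dolt's implementation; the flat `(row key, column) -> value` layout, the `three_way_merge` function, and the `resolve` callback are assumptions made for the example.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Cell = Tuple[str, str]   # (row key, column), a hypothetical addressing scheme
Cells = Dict[Cell, Any]  # None / missing means "cell does not exist"


def three_way_merge(
    base: Cells,
    ours: Cells,
    theirs: Cells,
    resolve: Optional[Callable[[Any, Any, Any], Any]] = None,
) -> Tuple[Cells, List[Cell]]:
    """Merge two branches against their common ancestor, cell by cell.

    Returns (merged cells, unresolved conflicts). If `resolve` is given,
    it is called as resolve(base, ours, theirs) for every conflicting cell.
    """
    merged: Cells = {}
    conflicts: List[Cell] = []
    for cell in sorted(base.keys() | ours.keys() | theirs.keys()):
        b, o, t = base.get(cell), ours.get(cell), theirs.get(cell)
        if o == t:        # both sides agree (or neither changed)
            value = o
        elif o == b:      # only their side changed
            value = t
        elif t == b:      # only our side changed
            value = o
        elif resolve is not None:
            value = resolve(b, o, t)  # automatic merge strategy
        else:
            conflicts.append(cell)    # left for the user to handle
            continue
        if value is not None:
            merged[cell] = value
    return merged, conflicts


# Invented sample data: one common ancestor and two diverging branches.
base = {("r1", "name"): "cat", ("r1", "src"): "human", ("r2", "name"): "dog"}
ours = {("r1", "name"): "Cat", ("r1", "src"): "human", ("r2", "name"): "hound"}
theirs = {("r1", "name"): "cat", ("r1", "src"): "machine", ("r2", "name"): "puppy"}

merged, conflicts = three_way_merge(base, ours, theirs)
auto_merged, auto_conflicts = three_way_merge(
    base, ours, theirs, resolve=lambda b, o, t: o  # "ours wins" strategy
)
```

Without a strategy, the cell both branches changed is reported as a conflict; with "ours wins" it is silently settled, which is exactly the kind of automatic resolution that may discard a change someone cared about.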

There is, however, plenty of optimization potential, as SirixDB optionally stores a lot of metadata for each node (number of descendants, a rolling hash, Dewey IDs, number of children... as well as user-defined, typed secondary index structures). I'll have to look into how to build AST rewrite rules and implement a lot of optimizations in my Brackit binding in the future, so it's just the starting point (but everything should at least work already) :-)

[1] https://sirix.io and https://github.com/sirixdb/sirix

[2] http://wwwlgis.informatik.uni-kl.de/cms/fileadmin/publicatio...



