Data Version Control
Push and pull lightweight snapshots of datasets and code with GitHub-like version control.
Unlike Git, Dotscience is designed to version data at very large scale.
Along with a commit of the code executed and the data consumed by a model, Dotscience version control also captures run metadata, including model parameterization, metrics and summary statistics. This gives a full picture of every model run and makes it easy for managers and collaborators to identify runs of interest quickly.
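To make the idea of run metadata concrete, here is a minimal sketch of what a record for one model run might contain. It is an illustration only and not Dotscience's actual schema or API: the `RunRecord` class, its field names, and the `digest_bytes` and `best_run` helpers are all hypothetical, and the commit IDs, parameters and metric values are made-up sample data.

```python
import hashlib
import json
from dataclasses import dataclass, field, asdict


@dataclass
class RunRecord:
    """Hypothetical record of one model run (not Dotscience's real schema)."""
    code_commit: str      # identifier of the code version that was executed
    data_digest: str      # content hash of the data the model consumed
    parameters: dict      # model parameterization, e.g. learning rate
    metrics: dict         # evaluation metrics, e.g. accuracy
    summary_stats: dict = field(default_factory=dict)

    def run_id(self) -> str:
        # Derive a stable identifier from the full contents of the record.
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()[:12]


def digest_bytes(data: bytes) -> str:
    """Return a SHA-256 content hash for a blob of data."""
    return hashlib.sha256(data).hexdigest()


def best_run(runs, metric):
    """Pick the run with the highest value for the given metric."""
    return max(runs, key=lambda r: r.metrics[metric])


# Made-up sample runs, for illustration only.
runs = [
    RunRecord("a1b2c3", digest_bytes(b"train-v1"), {"lr": 0.1}, {"accuracy": 0.81}),
    RunRecord("d4e5f6", digest_bytes(b"train-v2"), {"lr": 0.01}, {"accuracy": 0.87}),
]

top = best_run(runs, "accuracy")
print(top.code_commit, top.parameters)
```

Because each record ties a code version and a data hash to parameters and metrics, finding a run of interest becomes a simple query over the records, as `best_run` does here.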
Dotscience uses Docker to snapshot runtime environments as well. This means that the model’s dependencies, such as specific library versions, are packaged with the code. So, when you’ve found the experiment you want, reproducing it is effortless.