Dotscience products.

Tools for machine learning model management.

Our products

We make data science teams more productive by enabling collaboration, flexible access to high-performance compute, and version control.

And, via our model governance and auditability tools, we ensure that every model data science teams build is suitable for deployment in highly regulated environments.

Get Dotscience

Dotscience products are open and interoperable. Read more about our full solutions for data science development and model governance.

Data Version Control

Push and pull lightweight snapshots of datasets and code with GitHub-like version control.

Unlike Git, Dotscience is designed to version data at massive scale.

Along with a commit of the code executed and data consumed by a model, Dotscience version control also captures run metadata, including model parameterization, metrics and summary stats. This gives the full picture of every model run, and makes it easy for managers and collaborators to quickly identify runs of interest.
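To make the idea concrete, here is a minimal sketch of what a single versioned run could look like as a record. It is an illustration only, not the Dotscience API: the RunRecord class, its field names, the find_runs helper and the sample values are all hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass, field, asdict


def content_hash(data: bytes) -> str:
    """Return a SHA-256 hex digest that identifies these exact bytes."""
    return hashlib.sha256(data).hexdigest()


@dataclass(frozen=True)
class RunRecord:
    """Hypothetical record of one model run (not the Dotscience schema)."""
    code_commit: str    # identifier of the code that was executed
    input_hashes: dict  # file name -> hash of the data consumed
    parameters: dict    # model parameterization
    metrics: dict       # metric name -> score
    summary: dict = field(default_factory=dict)  # optional summary stats

    def to_json(self) -> str:
        """Serialise the record with stable key order so it can be stored."""
        return json.dumps(asdict(self), sort_keys=True)


def find_runs(runs, metric, minimum):
    """Return the runs whose recorded metric is at least `minimum`."""
    return [r for r in runs if r.metrics.get(metric, float("-inf")) >= minimum]


# Made-up example data: two runs on the same code and data, different parameters.
train_csv = b"x,y\n1,2\n3,4\n"
inputs = {"train.csv": content_hash(train_csv)}
runs = [
    RunRecord("a1b2c3", inputs, {"learning_rate": 0.1}, {"accuracy": 0.81}),
    RunRecord("a1b2c3", inputs, {"learning_rate": 0.01}, {"accuracy": 0.93}),
]

# A manager or collaborator filtering for runs of interest.
best = find_runs(runs, "accuracy", 0.9)
```

Because each record carries the code identifier, the data hashes, the parameters and the metrics together, filtering for runs of interest becomes a simple query rather than a search through notebooks.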

Dotscience uses Docker to snapshot runtime environments as well. This means that the model’s dependencies, such as specific library versions, are packaged with the code. So, when you’ve found the experiment you want, reproducing it is effortless.

Provenance Graph

Auto-generate a full history of any model’s development for audit trails and for AI and machine learning regulatory compliance.

Provenance Graph uses persistent pointers to the code executed and the exact validation and training data consumed by models. Reports remain lightweight using the Dotscience filesystem, which is optimised to store only snapshots of changes to files.
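As a rough illustration of the idea, a provenance graph can be modelled as nodes that point at the exact versions of the code and data they were built from. The sketch below is an assumption about the general technique, not Dotscience's implementation: the Node class, the lineage function and the "name@version" identifiers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical provenance node: one version of a dataset, script or model."""
    node_id: str        # e.g. "train.py@d4e5f6" (illustrative naming only)
    kind: str           # "data", "code" or "model"
    inputs: tuple = ()  # ids of the nodes this one was built from


def lineage(graph, node_id):
    """Return every upstream node id, nearest first, without duplicates."""
    seen = []
    queue = list(graph[node_id].inputs)
    while queue:
        current = queue.pop(0)
        if current in seen:
            continue
        seen.append(current)
        queue.extend(graph[current].inputs)
    return seen


# Made-up history: raw data is cleaned by one script, then a model is trained.
nodes = [
    Node("raw.csv@v1", "data"),
    Node("clean.py@a1b2c3", "code"),
    Node("clean.csv@v1", "data", ("raw.csv@v1", "clean.py@a1b2c3")),
    Node("train.py@d4e5f6", "code"),
    Node("model@v1", "model", ("clean.csv@v1", "train.py@d4e5f6")),
]
graph = {n.node_id: n for n in nodes}

# Everything the production model depends on, for an audit report.
audit_trail = lineage(graph, "model@v1")
```

The nodes hold only identifiers, not copies of the files, which is why a report built this way can stay lightweight.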

As well as meeting audit requirements, Provenance Graph makes it easy to keep track of how and why models in production were built, removing the need for laborious and error-prone documentation.

Dashboard

Learn from every model run by anyone in your team with Dashboard.

Whether you test models one at a time, or autotune your model with hundreds of runs, we’ll capture the variables of every execution. Aggregated information about all runs of a model is made available in Dashboard, where you can flexibly visualise it to get deeper insights into model behaviour and make better decisions as you build and optimise your model.

Flexibly decide which aspects of runs to monitor. For instance, you can capture parameter settings and corresponding metric scores per run. Then, in Dashboard, compare runs side-by-side; isolate the effect of single parameters; and see immediately how changes to the code or data affect performance.
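Isolating the effect of a single parameter amounts to comparing runs that agree on every other setting. The sketch below shows that comparison on plain dictionaries. It is an assumption about the general approach, not how Dashboard is implemented: the isolate_parameter function, the run layout and the numbers are hypothetical.

```python
def isolate_parameter(runs, name, metric):
    """Group runs that share every parameter except `name`.

    Returns a dict mapping the shared settings to a sorted list of
    (value of `name`, metric score) pairs. Groups with a single run
    are dropped because they offer nothing to compare.
    """
    groups = {}
    for run in runs:
        params = run["parameters"]
        if name not in params or metric not in run["metrics"]:
            continue  # this run did not record what we are comparing
        others = tuple(sorted((k, v) for k, v in params.items() if k != name))
        groups.setdefault(others, []).append((params[name], run["metrics"][metric]))
    return {k: sorted(v) for k, v in groups.items() if len(v) > 1}


# Made-up runs: parameter settings and the corresponding metric score.
runs = [
    {"parameters": {"learning_rate": 0.1, "depth": 3}, "metrics": {"accuracy": 0.81}},
    {"parameters": {"learning_rate": 0.01, "depth": 3}, "metrics": {"accuracy": 0.93}},
    {"parameters": {"learning_rate": 0.01, "depth": 5}, "metrics": {"accuracy": 0.90}},
]

# Compare only the runs where learning_rate is the one thing that changed.
effect = isolate_parameter(runs, "learning_rate", "accuracy")
```

Here only the two runs with depth 3 are comparable, so the result contains a single group showing how accuracy moves with the learning rate.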

Runners

Attach any compute hardware to the development environments familiar to data scientists, including Jupyter and RStudio.

Specify a machine (a ‘runner’) from any cloud provider (AWS, Google Cloud, Azure) or a local cluster or server. Runners will seamlessly connect this compute to your model development environment. If you need distributed compute, Runners can provision a Kubernetes cluster and expose a user-friendly interface.

Runners gives data science teams self-service access to the latest top-spec processors and GPUs. Data scientists get their job done faster by accelerating model training and inference, with no additional support overheads. All work done via Runners can be placed under version control and backed up to your Dotscience cloud account, where you can explore your runs via Dashboard.