Data Science

Equip your team for success with lightweight, powerful infrastructure.

The potential of data science to boost performance across industries is old news. Now, the competition hinges on how well companies can integrate and facilitate their new data science teams.

As software development proves, getting the engineering infrastructure right – for collaborating, testing, delivering and monitoring – not only boosts productivity, but also makes the outputs more resilient.

Data science teams have different needs, but stand to gain just as much from the right infrastructure.

Interoperable. Integrate with the data science ecosystem.

Jupyter, Scikit, RStudio … and many more.

Win the race to incorporate big data.

Right now, the companies winning the race to operationalize big data are tech firms with the in-house expertise to support their data scientists. Typically, this requires building and maintaining custom tools. Dotscience makes these tools available, fully supported, to all data science teams.

Dotscience provides version control for model development, with deep analytics on model performance. Everything is designed for petabyte-scale datasets and collaborative working.

Flexible compute

Dotscience makes self-service, high-performance compute directly accessible to data scientists, cutting delivery time and overheads.

Running on any VM or local machine is as easy as point-and-shoot. Dotscience Runners expose a familiar development environment on top of the compute: a Jupyter instance, RStudio, or a text editor.

By giving control over compute resources to data teams, Dotscience cuts model training time by orders of magnitude, and gets time-critical models into production quickly.

One-click cluster

Dotscience’s one-click cluster setup gives data scientists the ability to access distributed training when they need it. We’ll provision a Kubernetes cluster in your cloud and expose a straightforward, user-friendly interface.

Version control for data and code

Version everything – code, datasets, hyperparameters, runtime environment – for reproducibility and easy model sharing. Our innovative Dotscience filesystem allows you to version large files and datasets at a scale impossible with existing version control systems such as Git.

Experiment freely as you clean datasets and add or engineer features. Collaborate via proven GitHub-like workflows.
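
To make the idea of versioning code, data, hyperparameters and environment together more concrete, here is a minimal sketch in plain Python. It is not Dotscience's API or filesystem: the names `fingerprint` and `build_manifest`, the manifest fields and the sample inputs are all hypothetical, chosen only to illustrate how one identifier can cover everything a run depends on.

```python
import hashlib
import json
import platform
import sys


def fingerprint(data: bytes) -> str:
    """Return a SHA-256 hex digest identifying the exact bytes given."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(code: bytes, dataset: bytes, hyperparameters: dict) -> dict:
    """Bundle everything a run depends on into one record (hypothetical layout)."""
    manifest = {
        "code_sha256": fingerprint(code),
        "dataset_sha256": fingerprint(dataset),
        "hyperparameters": hyperparameters,
        "environment": {
            "python": platform.python_version(),
            "platform": sys.platform,
        },
    }
    # Hash the canonical JSON form so identical inputs yield the same version id.
    canonical = json.dumps(manifest, sort_keys=True).encode("utf-8")
    manifest["version_id"] = fingerprint(canonical)[:12]
    return manifest


# Illustrative inputs only: a one-line script and a tiny CSV.
first = build_manifest(b"print('train')", b"a,b\n1,2\n", {"learning_rate": 0.01})
second = build_manifest(b"print('train')", b"a,b\n1,2\n", {"learning_rate": 0.1})
same = build_manifest(b"print('train')", b"a,b\n1,2\n", {"learning_rate": 0.01})
```

In this sketch, `first` and `same` share a version id because every input matches, while `second` gets a different id because one hyperparameter changed. Hashing whole files in memory would not suit very large datasets, which is the gap a purpose-built filesystem is meant to fill.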

Evidence-based model development

Dotscience can capture rich metadata from all your model runs executed in training and development.

Historical run data is made available in aggregate. Using the Dashboard, you can explore this data to get unrivaled insights into model behaviour, such as the effect of parameter choices on performance. This allows you to make better decisions as you optimize and deploy your model. Collaborators working on the same model can share metadata, enabling them to learn from runs executed across the team.
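
As a rough illustration of what exploring aggregated run metadata can look like, the sketch below groups runs from several collaborators by one parameter and averages a metric for each value. It uses only the Python standard library and does not represent the Dotscience Dashboard or its data format. The record fields, the function name `mean_metric_by_parameter` and all the numbers are made up for the example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical run records with invented values, standing in for captured metadata.
runs = [
    {"author": "alice", "learning_rate": 0.01, "accuracy": 0.90},
    {"author": "bob", "learning_rate": 0.01, "accuracy": 0.92},
    {"author": "alice", "learning_rate": 0.1, "accuracy": 0.80},
    {"author": "bob", "learning_rate": 0.1, "accuracy": 0.84},
]


def mean_metric_by_parameter(records, parameter, metric):
    """Group runs by one parameter value and average a metric per group."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record[parameter]].append(record[metric])
    return {value: mean(scores) for value, scores in grouped.items()}


# Average accuracy per learning rate, pooled across both authors.
summary = mean_metric_by_parameter(runs, "learning_rate", "accuracy")
best_learning_rate = max(summary, key=summary.get)
```

Pooling runs across authors is the point of the sketch: neither collaborator alone has more than one run per learning rate, but together their records show which value performed better on average.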