Please Stop Creating Data Science Departments

I started writing a post about everything that needs to change in the field of "Data Science". It's 3 days in, I'm on page 10, and I'm still busy writing. Since this topic is clearly a deep one for me, I'm going to break it into much smaller, digestible chunks. Consider this Part 1 of X, where X tends towards infinity. My rant today begins with a plea to enterprises:

Please stop creating Data Science departments (and if you've already made the mistake of creating one, please dismantle it).

Firstly, I want to let you know that it's not your fault. You were misled. All organisations were misled. In fact, we know where this practice started. It was part of the initial zeitgeist of Data Science, when all business consultants recommended that enterprises build their very own Data Science Center of Excellence. This idea is captured in the widely-used Gartner Data Science Capability Maturity Model, and has been shared and implemented widely.

The result: enterprises have followed this advice, hired a whole group of Data Scientists, put them in a room together and shouted "Go forth! Find us some insights!". This isolation of Data Scientists leads to a whole suite of problems:

  • Business priorities often change halfway through the Data Science cycle. By the time Data Scientists are done, their work is often pointless or needs to be redone. Often, Data Scientists end up working on what turn out to be low-impact problems, at a very high cost, because they do not have an ongoing relationship with business.
  • The separation of Data Science teams and software engineering teams results in models that are often infeasible to deploy into real-world systems. Time and time again, we see an adversarial relationship between Data Scientists and Software Engineers, where Data Scientists are putting loads of effort into creating undeployable models, and frustrated Software Engineers are throwing them back or trying to rework them themselves.
  • The Data Scientists do not have the expertise to run this process by themselves. Data Scientists require support from Data Stewards, DevOps engineers, and (most importantly) Data Engineers.

Instead of a bunch of Data Scientists in a room, we should be building Data-oriented Product and Research teams. These teams should be multidisciplinary and agile, where each team has a Product Owner, a couple of Data Engineers, some DevOps expertise, a Data Analyst, a Data Scientist, and preferably a UX designer. Here, the Data Scientists would be embedded into teams (and in some cases leading the team - especially for a research team) and business participation would be at EVERY STAGE OF THE DATA SCIENCE PROCESS - NOT JUST AT INCEPTION OR END. Our Data Scientists should be living the Agile Manifesto along with their teams.

So let's restructure our organisations so that we can start getting value from our Data Science teams. Thank you for listening to my rant. I'll be back soon for the next edition of "Why Data Science Projects Fail".

About the author

Jade Abbott

I’m the ML lead at Retro Rabbit where I’ve worked at every end of putting an ML model into production. By night, I lead Masakhane, a grassroots open research movement for African language technologies.