Data scientists prefer to spend their time working on a core set of data problems. Many dread long meetings without actionable outcomes. But shorter meetings can still be very useful in the data science process. Let’s look at some best practices for meetings and DS code reviews!
Best Practices for DS Meetings
The most important difference from ordinary meetings is that data science is an exploratory process. It takes time to train a model or find key insights, and success is often unpredictable. While many teams benefit from regular meetings, data science teams should only call a meeting when a significant milestone is reached. That might be when a data set has been cleaned or a data pipeline built, when a new iteration of a model achieves higher accuracy and lower loss, or when the team develops insights that can have a business impact.
This highlights that predetermined outcomes for data science projects are extremely important. Without clear objectives, teams will not hit any targets. Data science is so hyped that people think it magically solves their problems, but you need to define milestones and center meetings and huddles around accomplishing them or removing obstacles on the road to success.
The roles of the people present in a meeting depend on which milestone is being reviewed. If your data set needs wrangling, stick to the engineers who will build the model. If you’re reviewing a significant insight, you should involve stakeholders and business development. If you’re facing an ethical problem with the model, you should involve ethics and compliance. It should be evident to the data science manager who should join which meetings: the key contributors and the people who can follow up on results.
Avoid meetings with too many details and computational information. Data science is an interdisciplinary approach to solving problems, and not everyone is fluent in statistics or programming. While non-technical employees need some conceptual understanding, it’s better to focus the meetings on results and how to follow up on them. So instead of talking about which loss function you used for the classification algorithm, talk about the outcomes of your solution and how they can be applied to improve your company’s product or service.
Sam Altman, CEO of OpenAI, advocates for meetings of either 15-20 minutes or 2 hours. He found these timeframes far more effective than the regular 1-hour meeting. The generally recommended meeting size is 7 ± 2, so a minimum of 5 and a maximum of 9 people should be present. Any smaller and you’ll miss out on unique views and domain expertise, but any larger and you’ll risk conformist groupthink (e.g. people not speaking up).
Best Practices for Code Reviews
Code review meetings in data science can differ from those in traditional software development because the data science lifecycle is iterative by nature. They focus more on improving the codebase than on finding bugs. The following tactics can help you get the most out of code reviews:
First, recognize as a manager that all developers share responsibility for the codebase. Reviews may uncover an individual’s mistakes, but they should be taken as constructive feedback rather than personal criticism. Pairing a junior developer with a senior developer is often a good way to review code and provide mentoring. Senior employees can work collaboratively to review each other’s code, since a peer is more likely to spot errors in their more complex code.
As a manager, have developers talk through their code in front of the team. This can clarify confusing syntax or reveal mistakes that weren’t caught individually. This can also teach other developers new tricks or methods. The ability to walk someone through your thought process while coding is one of the reasons why data scientists with strong communication skills are highly desirable in the field.
Code reviews are also a good time to expand documentation in the form of comments or written documents. This is particularly important for maturing or growing teams. Often, code is written without clear comments about what’s happening. This can be frustrating or confusing for new or future team members. So use code reviews as an opportunity to annotate your codebase!
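As a small illustration of what annotating during a review can look like, here is a minimal sketch in Python. The function name `normalize` and its behavior are hypothetical, made up for this example rather than taken from any particular codebase. The point is the docstring and comments a reviewer might ask for, not the logic itself.

```python
def normalize(values):
    """Scale a list of numbers to the range [0, 1] (min-max normalization).

    Hypothetical example function, used only to show review-time annotation.

    Args:
        values: A non-empty list of ints or floats.

    Returns:
        A list of floats of the same length. If every input value is
        identical, all outputs are 0.0 to avoid dividing by zero.
    """
    lowest, highest = min(values), max(values)

    # Edge case raised in review: constant input would divide by zero.
    if highest == lowest:
        return [0.0 for _ in values]

    # Shift by the minimum, then divide by the range.
    return [(v - lowest) / (highest - lowest) for v in values]
```

Before the review, a function like this might have had no docstring and no note on the constant-input case. Writing those down while the author is walking the team through the code is usually cheaper than reconstructing the intent later.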
Vectice is your memory for meetings
Vectice, the platform-agnostic data science management software, empowers enterprises and their teams to establish best-in-class AI practices and accelerate the impact of AI on their business.
The centralized platform captures and annotates the most important assets in the data science process while keeping track of their lineage. This allows teams to confidently report and review the most important milestones, leading to more effective meetings.