An agile approach to Data Science

Marc St-Pierre

February 2, 2023 4 minutes read

Taking an agile approach to data science helps deal with rapidly changing environments, uncertainty, complex solutions, emerging technologies, and ambiguous requirements inherent in these projects. In OpenText Data Science methodology Part 1, we explored key agile elements to reduce risks and achieve early outcomes in unpredictable and complex scenarios by delivering high-value products early and frequently and reviewing the results with the customer on a regular basis. 

OpenText Data Science Methodology expands on the agile approach by focusing on six key phases:

  • Business understanding
  • Data understanding
  • Data preparation
  • Modeling
  • Evaluation
  • Deployment

1. Business understanding

What problem or question do we want to address with data?

The first phase is about understanding the background and business objectives for the project. At this stage, it is important to have an idea of what success might look like and to outline it as criteria for all other phases. Activities include taking an inventory of resources and identifying requirements, assumptions, constraints, risks, contingencies, terminology, costs and benefits. Once the situation has been assessed, a project plan can be produced, the data mining goals defined, and the subject matter experts assembled (gather the right people).

2. Data understanding

What data do we have that could answer our questions?

Every data science project has three must-have ingredients: data + people + technology. In this phase, the focus is on the first ingredient, data. Activities will include collecting the data, describing it, exploring it and verifying its quality. Once the initial data sets have been assessed and understood, the project plan can be updated with data preparation tasks or even tasks to look for additional data.
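As a rough illustration of the "describe, explore, verify quality" activities, the sketch below profiles a handful of records using only the Python standard library. The `profile` function, the column names and the sample records are all hypothetical and invented for this example; a real project would run this kind of check against its own data sets and tooling.

```python
from collections import Counter


def profile(records, columns):
    """Describe each column: row count, missing values, distinct values, most common value."""
    report = {}
    for col in columns:
        values = [r.get(col) for r in records]
        # Treat None and empty strings as missing (an assumption for this sketch).
        present = [v for v in values if v not in (None, "")]
        counts = Counter(present)
        report[col] = {
            "rows": len(values),
            "missing": len(values) - len(present),
            "distinct": len(counts),
            "most_common": counts.most_common(1)[0][0] if counts else None,
        }
    return report


# Hypothetical sample records, for illustration only.
sample = [
    {"customer_id": "C1", "region": "EMEA", "spend": 120.0},
    {"customer_id": "C2", "region": "", "spend": 80.5},
    {"customer_id": "C3", "region": "EMEA", "spend": None},
]

summary = profile(sample, ["customer_id", "region", "spend"])
for column, stats in summary.items():
    print(column, stats)
```

A report like this can feed directly into the updated project plan: columns with many missing values become data preparation tasks, or prompts to look for additional data.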

3. Data preparation

What do we need to do to prepare the data for mining?

This phase consists of getting the most out of the data that you have. Sometimes that means digitizing it, running OCR on it, or pre-processing it in other ways so it can be used in the modeling phase that follows. Activities will include:

  • Selecting the data – deciding on the rationale for including or excluding it
  • Cleaning the data – defining the rules to clean it and preparing a report if needed
  • Constructing data – deriving new attributes or generating new records
  • Integrating data – merging data from different sources
  • Formatting data – reformatting it as needed for modeling
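The five activities above can be sketched as one small pipeline. This is a minimal example in plain Python, not a prescribed implementation: the `prepare` function, the field names, the inclusion rule ("completed" orders only) and the 100-unit threshold are all assumptions made up for illustration.

```python
def prepare(orders, customers):
    """Apply the five preparation activities to two small in-memory tables."""
    # Select: keep only completed orders (hypothetical inclusion rule).
    selected = [o for o in orders if o["status"] == "completed"]

    # Clean: drop rows whose amount is missing or negative (hypothetical cleaning rule).
    cleaned = [o for o in selected if o["amount"] is not None and o["amount"] >= 0]

    # Construct: derive a new attribute from an existing one.
    constructed = [{**o, "is_large_order": o["amount"] >= 100} for o in cleaned]

    # Integrate: merge in customer attributes by key.
    by_id = {c["customer_id"]: c for c in customers}
    merged = [
        {**o, **by_id[o["customer_id"]]}
        for o in constructed
        if o["customer_id"] in by_id
    ]

    # Format: normalize text casing and round amounts.
    return [
        {**r, "region": r["region"].strip().upper(), "amount": round(r["amount"], 2)}
        for r in merged
    ]


# Hypothetical input tables, for illustration only.
orders = [
    {"order_id": 1, "customer_id": "C1", "amount": 150.456, "status": "completed"},
    {"order_id": 2, "customer_id": "C2", "amount": None, "status": "completed"},
    {"order_id": 3, "customer_id": "C1", "amount": 20.0, "status": "cancelled"},
    {"order_id": 4, "customer_id": "C2", "amount": 40.0, "status": "completed"},
]
customers = [
    {"customer_id": "C1", "region": " emea "},
    {"customer_id": "C2", "region": "amer"},
]

prepared = prepare(orders, customers)
for row in prepared:
    print(row)
```

Keeping each activity as a separate, commented step makes the selection and cleaning rules easy to review with subject matter experts and to report on if needed.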

4. Modeling

How can we mimic or enhance human knowledge or actions through technology?

This phase applies to any type of data science, whether it works with structured data or text-mined data. In conjunction with technology choices, select modeling techniques, design tests, evolve your model, and assess it against the criteria and data mining goals established in the business understanding phase.

5. Evaluation

What new information do we now know?

An agile approach calls for delivering high-value products early and frequently. In this phase, the development team assesses the data mining results against the business success criteria. Essentially, the model is tested to validate its performance against the problem defined in the business understanding phase. A review of the approved models and the process then yields a list of next actions and decisions, which is shared with the customer to obtain feedback and even begin initial usage that drives business value.
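To make "assessing results against business success criteria" concrete, here is a minimal sketch in plain Python. It assumes a binary classification model and assumes the success criteria were expressed as minimum precision and recall; the thresholds, labels and predictions below are hypothetical values chosen only to illustrate the check.

```python
def precision_recall(y_true, y_pred):
    """Compute precision and recall for binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def meets_criteria(metrics, criteria):
    """Return, per criterion, whether the measured value reaches the agreed minimum."""
    return {name: metrics[name] >= minimum for name, minimum in criteria.items()}


# Hypothetical success criteria agreed in the business understanding phase.
success_criteria = {"precision": 0.70, "recall": 0.80}

# Hypothetical held-out labels and model predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

precision, recall = precision_recall(y_true, y_pred)
verdict = meets_criteria({"precision": precision, "recall": recall}, success_criteria)
print(f"precision={precision:.2f} recall={recall:.2f} verdict={verdict}")
```

In this made-up case the model reaches the precision target but falls short on recall, which is the kind of result that would go into the list of next actions reviewed with the customer.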

6. Deployment

What actions should we trigger with the new information? What needs human validation?

With customer approval, this phase enables delivery of initial product value, and activities will include:

  • Planning the deployment
  • Planning monitoring and maintenance
  • Producing the final report and presentation
  • Reviewing the project

The OpenText data science difference

A true differentiator of OpenText’s Professional Services in the AI domain is its team of data scientists and cognitive analysts, who are not only experts in their domain (with PhDs and decades of experience) but also genuinely passionate about helping people understand and tap into the potential of their data.

Before jumping into a project, OpenText offers a Cognitive Strategy Workshop designed for IT executives, business analysts, and data owners. It is a four-day strategy workshop tailored to an organization’s needs. OpenText delivers advice and guidance based on best practices for implementing an AI project using Magellan to realize the most benefit from structured and unstructured data.

Above all else, remember that every data science project has three must-have ingredients: data + people + technology. The art is in bringing all three of these elements together through a data science methodology.

Learn more about the Cognitive Strategy Workshop or our AI & Analytics Services, or reach out to the team.

Marc St-Pierre

Marc is VP of Consulting Services for the Security + Artificial Intelligence + Linguistics & Translation practice. For more than 15 years, Marc has led services groups specialized in advanced and emerging technologies. He has lectured on semantic technologies and led solution development for offerings such as Ai-Augmented Voice of the Customer and Magellan Search+.
