Drive has 700+ articles for digital transformation leaders written by StarCIO Digital Trailblazer, Isaac Sacolick.

It sounds like a simple question. You have to load several data sets, implement some data cleansing, match against third-party data, compute several aggregates, develop some rankings, group by several dimensions, benchmark against another data set, analyze for trends, and then normalize the data for multiple data visualizations.

In all likelihood, the algorithms that perform these functions are going to be implemented by different people, in different technologies, and perhaps at different stages in the analysis. End to end, they represent a complex data flow that runs from data sources through computations and analysis to delivery.
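To make that data flow concrete, here is a minimal Python sketch that treats each function as a named step and records who owns it and where it runs. Everything in it is an illustrative assumption rather than a recommended design: the `Step` fields, the owner and tool labels, and the `region` and `revenue` columns are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Rows = List[Dict]


@dataclass
class Step:
    name: str                    # what the step does
    owner: str                   # hypothetical: team that maintains it
    tool: str                    # hypothetical: technology where it runs
    run: Callable[[Rows], Rows]  # the implementation itself


def cleanse(rows: Rows) -> Rows:
    # Drop records with missing revenue and normalize region names.
    return [
        {**r, "region": r["region"].strip().lower()}
        for r in rows
        if r.get("revenue") is not None
    ]


def aggregate(rows: Rows) -> Rows:
    # Sum revenue per region.
    totals: Dict[str, float] = {}
    for r in rows:
        totals[r["region"]] = totals.get(r["region"], 0) + r["revenue"]
    return [{"region": k, "revenue": v} for k, v in totals.items()]


def rank(rows: Rows) -> Rows:
    # Rank regions from highest to lowest revenue, starting at 1.
    ordered = sorted(rows, key=lambda r: r["revenue"], reverse=True)
    return [{**r, "rank": i} for i, r in enumerate(ordered, start=1)]


# The registry doubles as lightweight documentation of the flow.
PIPELINE = [
    Step("cleanse", "data engineering", "Python script", cleanse),
    Step("aggregate", "analytics", "Python script", aggregate),
    Step("rank", "analytics", "Python script", rank),
]


def run_pipeline(rows: Rows, steps: List[Step] = PIPELINE) -> Rows:
    # Apply each step in order, feeding its output to the next one.
    for step in steps:
        rows = step.run(rows)
    return rows


def describe(steps: List[Step] = PIPELINE) -> List[str]:
    # One line per step: where it lives and who to ask about it.
    return [f"{s.name}: {s.tool} ({s.owner})" for s in steps]
```

Calling `describe()` gives a quick answer to "where is this processing done, and who owns it?", even when the real steps live in several different tools.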

Key Data Architecture Considerations

So my question is, where are you implementing these data processing functions? Where are your algorithms stored? How are they documented? How do you answer the question, “Where should I do this data processing?” What is your big data culture? Are you more likely to let data scientists determine which tool to use for different needs, or are you centralizing these data architecture decisions?

Once implemented, how do you review the work to determine which parts of your data processing need to be refactored? Maybe a step isn’t performing well? Maybe a data visualization required some last-mile data cleansing that should be moved upstream to benefit other analyses? Perhaps some algorithm fails the “KT” (Knowledge Transfer) test and is so complex that it will be impossible to maintain?

Or maybe you’ve implemented something in a Big Data tool that has just released a major upgrade requiring substantial changes to the implementation? Or even worse, perhaps the tool you selected is in decline, having never achieved critical mass, and now you have to explore alternatives and consider switching costs.

The reverse question is equally important. Perhaps you’re bundling some activity in the wrong tool and should consider expanding your technical architecture? Perhaps you are spending too many cycles getting SQL to perform and should consider a NoSQL store? Maybe the Python scripts you developed for data integration are becoming unmanageable and an ETL tool is needed?

Managing the Evolving Big Data Landscape and Growing Business Need

So the business need is growing, the technology landscape is changing quickly, access to talent is volatile, and both standards and best practices are evolving. What does this mean for Big Data specialists and Digital Transformation leaders who need to prove results today while managing an evolving practice?

My simple answer is to rely on the basic practices that have carried application development through significant changes in demand, technologies, and methods. Some specifics:

    • Invest in basic version control so that you can track changed implementations across platforms and practices.
    • Evolve a data governance practice that starts with basic data dictionaries and documentation on algorithms.
    • Develop operational KPIs covering development cost, implementation complexity, and system performance to sense when an implementation shows signs of becoming a pain point.
    • Capture technical debt, data quality barriers, and other things that need improvement.

And most important:

    • Invest time/resources to perform R&D and experiment.
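As one way to picture the operational KPI practice listed above, the following Python sketch times each processing step and flags the ones that exceed a budget. The function names, the step names, and the per-step budgets are hypothetical, and a real practice would also track development cost and implementation complexity, which this sketch does not attempt.

```python
import time
from typing import Any, Callable, Dict, List, Tuple


def timed_run(
    steps: List[Tuple[str, Callable[[Any], Any]]],
    data: Any,
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[Any, Dict[str, float]]:
    """Run named steps in order and record the seconds spent in each."""
    timings: Dict[str, float] = {}
    for name, fn in steps:
        start = clock()
        data = fn(data)
        timings[name] = clock() - start
    return data, timings


def pain_points(
    timings: Dict[str, float], budget_seconds: Dict[str, float]
) -> List[str]:
    """Return the names of steps that ran longer than their budget.

    Steps without a budget entry are never flagged.
    """
    return sorted(
        name
        for name, secs in timings.items()
        if secs > budget_seconds.get(name, float("inf"))
    )
```

Logging these timings with every run, alongside the version of each implementation, gives a simple trend line for spotting a step that is drifting toward becoming a pain point.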

Thanks to Matt Turck: Is Big Data Still a Thing
