VIS logo secondary for white background
BLOG

Migrate ETL from DataStage - automatically!

Explore the hidden complexity of IBM DataStage migrations and discover how automated ETL conversion can significantly reduce development effort, accelerate modernization projects, and help teams migrate thousands of integration jobs to new platforms faster and with less risk.

Miljenko Vuković
Written by

Miljenko Vuković

Founder - CEO
Published on

Blog

Content type
Share this article
Migrate ETL from DataStage - automatically!

Migrate ETL from IBM DataStage Automatically

Many companies are in the process of changing integration technology for all sorts of reasons. One frequent case is migration from IBM DataStage, very widespread platform from a blue-chip vendor.

Imagine that you are developer at such a project. Or even worse, technical lead.

Company has hundreds if not 4-digit number of integration routines. Developing all of them from scratch in the new technology could potentially last years with the team you have. Obviously, some sort of automation and refactoring will need to be one.

You find a simple, typical DataStage job and start studying it. Since you are still deciding about the target technology, you decide to document it in a simple Excel format. One sheet will be select section – the destination table columns and all the transformations needed to get them. Another will be “from section” – all the source tables and joins. The third will contain where section and all the other parameters for that integration routine.

Ok, it looks like there are two source tables and one transformer. You put the stuff from transformer in the select sheet. And the joins between the sources in the from sheet.

But then you start examining the sources and get unpleasantly surprised. One of the sources is so-called “Hash file”, proprietary DataStage structure that gets loaded in another job. After some thinking and consultation, you decide to replace hash files with tmp_ tables. After you migrate the previous job which loads hash file to the new technology, you will have your temp table loaded, and you'll be able to use it.

You adjust the from sheet accordingly. Alright, one source solved.

But another source is so-called connector. Here the situation is even worse. It is actually an SQL query with even more source tables and transformations.

It is still doable – you need to further adjust your specification of the new job. You add newly discovered source tables to from section, and the second set of transformations to the select sheet.

Process of migration from Data Stage via EPIC Ingenioso

Now imagine if you had even more connectors, hash files and transformers – which is actually quite a common situation. Reverse engineering, specifying and developing new jobs would quickly become a real nightmare. Trust me, I've been there.

Luckily, there is solution. Using EPIC Ingenioso you can automatically transform any DataStage job to the excel format just like the one from our story. You just export the job to XML format. This is the part of exported XML file:

ds xml export

Select file, click.

EPIC DS parser

And there you go! The whole previously shown ordeal is now replaced with a couple of clicks. Neat.

Select section of the output file.

EPIC select section

From section of the output file.

EPIC from section of the output file

OK, but now we have an Excel. Nice for documenting and specifying, but don't forget what we actually want. We want our ETL routine in target ETL technology of our choice!

Don't worry, we can easily turn such a self-documenting input to a number of target technologies, like Snowflake, BigQuery or Oracle Data Integrator. In our next posts, we will show you how to do exactly that.

Tags

EPIC Ingenioso

You might also be interested in

View all articles