ETL tool comparison

Loading Type

OWB: ELT
ODI: ELT
Informatica: ETL, ELT with „Pushdown Optimization Option“
Pentaho: ETL
DataStage: ETL
Microsoft SSIS: ELT

System Plfatform

OWB: Windows, Unix
ODI: Windows, Unix
Informatica: Windows, Unix
Pentaho: Windows, Unix, Other Platforms (JDK)
DataStage: Windows, Unix, Other Platforms
Microsoft SSIS: Windows

Parallelism

OWB: Parallel steps in OWB flows (Mapping or subflows). Parallel SQL.
ODI: Parallel processes in scenarios.
Informatica: Partition Option (parallel threads). Asynchronous task steps in workflows.
Pentaho: Parallel job execution (asynchronous). Parallel mapping execution (e.g. parallel DML).
DataStage: Parallel jobs and parallel stages.
Microsoft SSIS: asynchronous data flows (tasks) in a package.

Scalability

OWB: RAC.
ODI: Load balancing for distributed agents executing etl scenarios.
Informatica: Informatica cluster. Public cloud version.
Pentaho: Cluster (carte server).
DataStage: MPP, cluster. AWS cloud.
Microsoft SSIS: SQL Server Cluster. Azure cloud.

ETL Proccess

OWB: jobs with processes(PL/SQL) and mappings (generated PL/SQL)
ODI: Scenarios containing processes (knowledge modules)
Pentaho: jobs with transformations
Informatica: sessions with workflows/worklets and tasks/tasklets
DataStage: Server job, DataStage job, Sequence Job (workflow)
Microsoft SSIS: package

Deployment facility

OWB: Yes (Configurations, Export/Import between repositories)
ODI: Yes
Informatica: Sessions with environment specific parameter files.
Pentaho: Export/Import XML files between environments. Environment properties (as file or db content).
DataStage: Information Server Manager – build, deploy. Transfer binaries (same platform). ODF and XML.
Microsoft SSIS: Yes

Engine based / code generator

OWB: PL/SQL packages executed by OWB repository.
ODI: Agents executing generated scenarios (Interpreter).
Informatica: Engine.
Pentaho: Engine interpreting XML jobs and transformations.
DataStage: Binaries (linked libraries).
Microsoft SSIS: SSIS engine executing packages.

Environment configuration

OWB: In repository of target environment.
ODI: ?
Informatica: Environment specific parameter files on session level. Multiple datasources/targets.
Pentaho: Environment properties (as file or db content). Any kind of JDBC connection.
DataStage: Configuration files (APT_CONFIG_FILE)
Microsoft SSIS: „Configurations“ in packages.

Numbers of available transformation functions

OWB: about 30
ODI:
Informatica: 58
Pentaho: 60
DataStage: 70
Microsoft SSIS: 30

Slowly Changing Dimension

OWB: SCD Mappings (Enterprise ETL Option)
ODI: Yes
Informatica: Supports Full history, recent values, Current & Prev values.
Pentaho: Yes
DataStage: SCD „stage“ for Type 1 and 2
Microsoft SSIS: Yes, with a Wizard but difficult to modify

Rejected Records

OWB: Error tables
ODI: Some (oracle) „integration knowledge modules“ has rejection logic. „check knowledge modules“ for business check rejection suggested.
Informatica:. Reject (bad) files. Rejected records in logs.
Pentaho: Error handling data connections and validation nodes
DataStage: Yes
Microsoft SSIS: Can use conditional split transformation and error redirection to either CSV files or tables.

Debugging Facility

OWB: basic debugging, breakpoints
ODI: simulation mode
Informatica: session debugger, breakpoints, data movement, data analysis
Pentaho: mapping debugger, breakpoints, data movement, data analysis
DataStage: Supports basic debugging facilities for testing.
Microsoft SSIS: breakpoints, Data Viewer and error handling

Languages (extensibility)

OWB: SQL, PL/SQL
ODI: Java, Jython, Java Bean Shell
Informatica: Java transformations.
Pentaho: Java, JavaScript
DataStage: Script, C++
Microsoft SSIS: C#, VB.NET (2005)

Application Integration Functionality

OWB: Difficult
ODI: JAX-WS, JMS connectors
Informatica: WebServices (XML, JSON), JMS, IBM MQ, Tibco Rendevous
Pentaho: JMS node, possible (e.g. java transformation nodes)
DataStage: WISD ingegration (web services), InfoSphere Streams, IBM MQ stages, Java/C++ API.
Microsoft SSIS: WebService Task, MSMQ integration. Any .NET functionality possible.

CDC and real time ETL

OWB: Not available
ODI: Oracle Golden Gate knowledge modules
Informatica: PowerExchange CDC
Pentaho: possible (e.g. java transformation nodes)
DataStage: IBM InfoSphere CDC. Always-on stages (e.g. ISD, MQClient or custom Java).
Microsoft SSIS: StreamInsight

Metadata: Ability to view & navigate metadata on the web

OWB: OWB Repository Browser
ODI: ODI Metadata Navigator (weblogic)
Informatica: Job sessions can be monitored using Informatica Classes
Pentaho: Pentaho Repository
DataStage: InfoSphere Metadata workbench. DataStage Operations Console for operational view (monitoring, statistics, performance)
Microsoft SSIS: Does not support

Ability to Customize views of metadata for different users

OWB: No
ODI: User roles (repository)
Informatica: Supports.
Pentaho: User rights (read,write,access) on jobs and transformations
DataStage: User roles
Microsoft SSIS: Does not support

Metadata repository can be stored in RDBMS

OWB: Yes (only)
ODI: Yes (only)
Informatica: Yes
Pentaho: Yes (XML, RDBMS, Enterprise Repository)
DataStage: No. But the proprietary meta data can be moved to a RDBMS using the DOC Tool
Microsoft SSIS: Does not support

Metadata API

OWB: OBMScript. Oracle repository tables.
ODI: Java API. DB repository tables.
Informatica:
Pentaho: XML files. DB repository tables (DB repository). JSR-283 (Enterprise repository).
DataStage:
Microsoft SSIS:

Command line operation

OWB: No
ODI: Yes (start, stop, sleep and many others)
Informatica: Yes (Pmcmd – start, stop, abort or schedule workflow )
Pentaho: Start/Stop of job and transformations (pan and kitchen)
DataStage: Start/Stop, compile, export/import metadata (many different commands)
Microsoft SSIS: Yes (Dtexec.exe)

Ability to maintain versions of mappings

OWB: Very Difficult (through export/import with OMBScript on object level possible).
ODI: Yes. Internal version management on object level.
Informatica: Yes (Additional licence needed). SVN throught XML export/import possible.
Pentaho: SVN (for XML repositories) or in basic locking/versioning on object level in „Enterprise repository“
DataStage: CSV and ClearCase support.
Microsoft SSIS: Difficult (binary dtsx files).

Job Controlling & Scheduling

OWB: No (external scheduler or Oracle scheduler needed)
ODI: Yes
Informatica: Very good
Pentaho: only for „Enterprise repository“
DataStage: Does not support directly (no option). But possible to call custom programs after the job get executed.
Microsoft SSIS: Job Agent, Email Notification

Usability

OWB: Average. Slow learning curve. Bugs (out of support). Slow UI.
ODI: Average. Slow learing curve. Own ETL tooling vision.
Pentaho: Very good. Steep learning curve. Less mature than competitors (e.g. weak logging and monitoring). Good UI. No installation needed.
Informatica: Very good. Steep learning curve. Clean and intuitive UI.
DataStage:
Microsoft SSIS: Good. Steap learning curve. VisualStudio as UI. C# knowledges needed.

License

OWB: Included in Oracle DB. Additional options could be necessary (Partitioning, Enterprise ETL, etc.). About 15.000 core/year.
ODI: Per Core or Named User. About 30.000 per core/year
Pentaho: Free. Support costs about 30.000 per CPU/year.
Informatica: Per Server. Additional options could be necessary (e.g. „Pushdown Optimization“). Expensive and intransparent.
DataStage: Per Processor (Processor Value Unit)
Microsoft SSIS: Part of SQL Server license (CAL or Core).

Veröffentlicht in DWH