Concerted Data Engineering Services for a US-Based International Manufacturing Company
Mahrife Panopio
September 26, 2018
Pointwest supported an international manufacturing company’s IT group in their data warehousing initiative by providing a full suite of data engineering services ranging from data modeling, ETL development, ETL testing, database administration, and reports development.
In order to address the needs of the client, Pointwest created a 2-stage data warehouse using MS SQL Server, to house data from multiple sources such as Salesforce DB and Oracle DB. The first stage was a staging area or a landing zone, which houses copies of the different source tables integrated into one database. SQL Server Integration Services was then used to develop change data capture jobs that retrieves all incoming data from the source to the data warehouse staging area.
ETL jobs using Informatica Cloud were then developed and used to transform data from the staging database to data marts, which are in line with the developed dimensional models in ERwin data modeler. Through all these, an ETL test engineer plans and develops test cases to test metadata, completeness, and validates if transformation rules are adhered to. Finally, the Business Objects and Tableau developers create their own respective reports and dashboards based on the developed data marts.
The project team was formed in such a way that most of the facets of data management are represented:
- Data Modeler
- Modeling best practices using ERwin data modeler
- Creation of Logical and Physical Dimensional Data models
- Apply Naming and Datatype Standards for the Data Warehouse
- Ensuring referential integrity through foreign key relationships
- Object management of 19 subject areas (business domains), 43 entity relationship diagrams, and 245 tables
- Script creation using ERwin’s forward engineering feature
- Reverse engineering of old Oracle DB sources for backup
- Modeling best practices using ERwin data modeler
- Database Administrator
- Performance tuning activities
- Disk management
- TempDB sizing
- Server Power Management
- Maintenance jobs such as: Database backup, Rebuild Indexes, Update Statistics
- Index management
- Study execution plans
- Use of SQL Server Database Engine Tuning Advisor
- Conversion and simplification of old report queries from Oracle to MSSQL
- Development of CDC jobs using MS SSIS
- Review and execution of scripts provided by the data modeler
- Development of adhoc ETL stored procedures
- Performance tuning activities
- ETL Developer
- ETL Development starting from Source to Staging, until the Data Marts
- Preparation of mapping sheets
- ETL Development using Informatica
- Setup connections
- Develop ETL mappings
- Validation of ETL jobs
- Deployment and Scheduling
- Creation of task flows
- Scheduling of jobs
- ETL monitoring (for errors and re-runs)
- ETL Development starting from Source to Staging, until the Data Marts
- ETL Tester
- Test Plan Creation
- Test Cases Development
- Test Execution
- Metadata Testing (Naming and Datatype Standards)
- ETL Queries Testing
- Data Quality checks
- Data Completeness checks
- Production Testing (end-to-end from source to end target)
- Tableau Developer
- Dashboard development using Tableau
- Reports development using Tableau
- BOBJ Developer
- Reports development using SAP Business Objects
With the Pointwest’s support, the manufacturing company’s IT group was able to develop a functional data mart (data warehouse subset – grouped by a business domain) in a month, and were able to churn out initial reports as early as 2 months. After the initial month of setup and configuration, the pace picked up and a new subject area data mart is developed from start to end (i.e. from data modeling to testing) every 2-3 weeks.