Data Integration Developer
The Data Integration Developer (DID) will join our Education & Training Division and be responsible for ensuring data integrity and governance across a variety of data sources while building and maintaining scalable data pipelines. The Data Integration Developer provides support in the following areas: 1) analyzing and interpreting requirements specifications; 2) developing Informatica workflows, mappings, and sessions; 3) following ETL development standards and guidelines; 4) providing support for production jobs; and 5) applying knowledge of data warehousing and reporting techniques.
The ideal candidate has hands-on experience building and supporting scalable ETL pipelines using SQL and Python, with strong knowledge of data warehousing tools such as Informatica, Snowflake, or similar platforms. They are skilled in integrating data from multiple sources, developing complex SQL queries, ensuring data integrity and governance, and supporting production data workflows in a secure, cloud-enabled environment.
MD Anderson offers employees:
* Paid employee medical benefits (zero premium) starting on the first day for employees who work 30 or more hours per week.
* Group Dental, Vision, Life, AD&D and Disability coverage.
* Paid time off (PTO) and Extended Illness Bank (EIB) paid leave accruals.
* Paid institutional holidays, wellness leave, childcare leave, and other paid leave programs.
* Tuition Assistance Program after six months of service.
* Teachers Retirement System defined-benefit pension plan and two voluntary retirement plans.
* Employer-paid life, AD&D and an illness-related reduced salary pay program.
* Extensive wellness, recognition, fitness, employee health programs and employee resource groups.
* Opportunities for professional growth through Career Development Center and Mentoring programs.
Key Functions:
* Work closely with cross-functional teams (engineering, product, analytics) and department leaders to define data models, optimize queries, and support their data analytics requirements.
* Design and develop code to extract, transform and load data from source systems to targets.
* Design, develop, and maintain scalable, efficient data pipelines in both development and production environments.
* Develop and/or debug complex queries using Structured Query Language (SQL).
* Perform web scraping and other external data ingestion techniques to extract valuable data from outside sources in a secure and reliable manner.
* Develop and support data APIs or data access packages to provide structured and secure data access for internal stakeholders and other systems.
* Develop and manage database architecture that supports analytics, reporting, and data science needs, optimizing performance and storage.
* Ensure data governance and integrity across multiple data sources and software systems, implementing practices to maintain high data quality.
* Utilize prompting and generative AI tools to automate and enhance data processing workflows, improving efficiency and reducing manual effort.
* Maintain security, compliance, and best practices in all data management and processing activities, ensuring that data handling meets organizational and regulatory standards.
* Coordinate with and train counterparts, such as data scientists, data analysts, end users, and other data consumers, in data pipelining and preparation techniques, and promote those techniques so that they can more easily integrate and consume the data they need for their own use cases.
* Other duties as assigned.
Core Competencies:
Behavioral:
* Business Insight: Ability to relate architectural decisions and data solutions to business needs, ensuring that engineering work has a positive impact on education and institutional outcomes.
* Communication: Excellent communication and collaboration skills, with the ability to work effectively across different teams and organizational levels.
* Adaptability: A growth mindset and willingness to learn new technologies or approaches. Able to adapt to new challenges in a fast-paced environment.
* Attention to Detail: Detail-oriented in documenting data definitions, processes, and validations to uphold data integrity and transparency.
Technical:
* Technical Skills: Proficiency in SQL and Python for data manipulation and analysis. Experience with data platforms and tools such as Snowflake, Informatica, or similar database/data warehousing solutions.
* ETL & APIs: Hands-on experience with building ETL processes, working with APIs, and creating data warehousing solutions to consolidate data from multiple sources.
* Data Handling: Knowledge of web scraping techniques and integrating external data sources. Comfortable working with both structured (relational databases) and unstructured data (logs, JSON, etc.).
* Problem-Solving: Strong problem-solving and troubleshooting skills to identify data issues, inconsistencies, or performance bottlenecks and implement timely fixes.
* Cloud Familiarity: Familiarity with cloud-based data platforms and services (AWS, Azure, or GCP) is a plus, as we utilize cloud infrastructure for some data storage and processing tasks.
EDUCATION:
Required: Bachelor's degree with a major in computer science, accounting, business, healthcare, mathematics, or related field.
Preferred: Master of Science (MS), Master’s in Information Systems or Management Information Systems (MIS), or Master of Business Administration (MBA).
EXPERIENCE:
Required: Four years of integration programming and/or systems-level experience. Additional years of equivalent experience may be substituted for the required education on a one-to-one basis.
Preferred: Experience with Informatica Cloud and Azure Data Factory. Certification in an ETL tool.
The University of Texas MD Anderson Cancer Center offers excellent benefits, including medical, dental, paid time off, retirement, tuition benefits, educational opportunities, and individual and team recognition.
This position may be responsible for maintaining the security and integrity of critical infrastructure, as defined in Section 113.001(2) of the Texas Business and Commerce Code, and therefore may require routine reviews and screening. The ability to satisfy and maintain all requirements necessary to ensure the continued security and integrity of such infrastructure is a condition of hire and continued employment.
It is the policy of The University of Texas MD Anderson Cancer Center to provide equal employment opportunity without regard to race, color, religion, age, national origin, sex, gender, sexual orientation, gender identity/expression, disability, protected veteran status, genetic information, or any other basis protected by institutional policy or by federal, state, or local laws unless such distinction is required by law. http://www.mdanderson.org/about-us/legal-and-policy/legal-statements/eeo-affirmative-action.html

