Dow Jones

Senior ML Data Engineer

11 October 2024
£84000 - £156000 / year

Job Description

About Our Organization:

Dow Jones is a global provider of news and business information, delivering content to consumers and organizations around the world across multiple formats, including print, digital, mobile and live events. Dow Jones has produced unrivaled quality content for more than 130 years and today has one of the world’s largest news-gathering operations globally. It is home to leading publications and products including the flagship Wall Street Journal, America’s largest newspaper by paid circulation; Barron’s, MarketWatch, Mansion Global, Financial News, Investor’s Business Daily, Factiva, Dow Jones Risk & Compliance, Dow Jones Newswires, OPIS and Chemical Market Analytics. Dow Jones is a division of News Corp (Nasdaq: NWS, NWSA; ASX: NWS, NWSLV).

About the Team:

Our Technology team drives the evolution of our Technology, Engineering, Data, Product and User Experience functions. With a keen focus on delivering cutting-edge solutions, we shape the digital landscape for our customers, readers, and users. From revolutionizing visuals to optimizing tools and harnessing the power of data, mobile, video, and social platforms, our team is committed to providing a seamless and immersive experience across all touchpoints. Collaborating closely with our newsrooms and strategic partners, we spearhead the development of groundbreaking products and technologies.

About the Role:

Dow Jones is seeking an experienced Data Engineer to join our AI Engineering Team. You will be responsible for designing, developing, and maintaining robust data pipelines for data scraping, processing, extraction, transformation, loading, and storage. You will collaborate within our team to ensure the efficient and reliable retrieval of data, enabling seamless integration with downstream systems for analysis and decision-making.

As a key team member, you will play a crucial role in operationalizing data solutions to meet our organization’s needs and deliver tangible value. You will leverage your strong data engineering skills to develop robust, secure, and scalable data pipelines, utilizing your expertise in data retrieval and processing techniques. This position will report to the Associate Director, Data Science.

You Will:

  • Collaborate with data scientists and ML engineers to design, develop, and maintain end-to-end data pipelines for extraction, transformation, loading (ETL), and storage.

  • Clean, transform, and structure data using industry-standard techniques, ensuring quality and consistency.

  • Work with APIs to retrieve data from external sources or integrate with third-party services, adhering to best practices.

  • Manage and optimize SQL and NoSQL database systems for data storage, ensuring integrity and performance.

  • Automate data fetching, processing, and storage by implementing data pipelines for ML/AI use cases, leveraging ETL principles.

  • Identify and troubleshoot issues related to data quality and pipeline performance, applying problem-solving skills.

  • Communicate effectively with stakeholders and data providers to gather requirements and ensure project alignment.

  • Stay updated with industry trends, emerging technologies, and best practices in data engineering for Machine Learning.

You Have:

  • At least 3 years of industry experience in a data engineering role.

  • Experience with cloud-based infrastructure and services (AWS, GCP preferred).

  • Experience in designing and implementing end-to-end data pipelines for ML/AI use cases, preferably in Airflow or GCP Cloud Composer.

  • Ability to work with APIs to retrieve data from external sources or integrate with third-party services.

  • Familiarity with NLP and Machine Learning frameworks and libraries (e.g., PyTorch, HuggingFace, LangChain, spaCy, NLTK, scikit-learn, etc.)

  • Experience working with Jenkins and/or Docker.

  • Familiarity with database systems, including SQL and NoSQL databases.

  • Bachelor’s Degree or higher in Computer Science, Computer Engineering, Data Science or related STEM field preferred.

  • Strong communication and collaboration skills to work effectively with team members and stakeholders.

  • Continuous learning mindset with a willingness to stay updated with industry trends and best practices.

Our Benefits

  • Comprehensive Insurance Plans

  • Paid Time Off

  • Family Care Benefits

  • Access to Dow Jones Products

  • Subscription Discounts

  • Employee Referral Program

  • Employee Well-being Support & Fitness Programs

Reasonable accommodation: Dow Jones, Making Careers Newsworthy – We are an equal opportunity employer and all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity or expression, pregnancy, age, national origin, disability status, genetic information, protected veteran status, or any other characteristic protected by law. EEO/AA/M/F/Disabled/Vets. Dow Jones is committed to providing reasonable accommodation for qualified individuals with disabilities, in our job application and/or interview process. If you need assistance or accommodation in completing your application, due to a disability, email us at [email protected]. Please put “Reasonable Accommodation” in the subject line and provide a brief description of the type of assistance you need. This inbox will not be monitored for application status updates.

Business Area:

Dow Jones – Technology

Job Category:

Data Analytics/Warehousing & Business Intelligence

Union Status:

Non-Union role