Data Engineer

Karachi, Pakistan

About the role

Location: Remote (Karachi, Pakistan)

Time zone: UTC+5

Hours: Full-time

Type: Permanent

Description

StackWeavers is seeking a skilled Data Engineer to join our team. You will analyze and organize raw data, build and optimize data systems and pipelines, and prepare data for advanced analytics. Ideal candidates have a degree in Computer Science or IT, 2-3 years of cloud computing experience, advanced Python and SQL skills, and proficiency with Apache Kafka, Apache Spark, and Apache Airflow. You will collaborate with data engineers, data scientists, and data analysts on data engineering pipelines, data warehouse/data lake systems, and data mesh architectures.

Roles and Responsibilities

  • Analyze and organize raw data 
  • Build data systems and pipelines
  • Evaluate business needs and objectives
  • Prepare data for prescriptive and predictive modeling
  • Build algorithms and prototypes
  • Combine raw information from different sources
  • Explore ways to enhance data quality and reliability
  • Identify opportunities for data acquisition
  • Develop analytical tools and programs
  • Collaborate with data scientists and architects on several projects

Qualifications and Skills

  • Degree in Computer Science, IT, or a similar field; a Master’s degree is a plus
  • 2-3 years of experience in cloud computing and data engineering
  • Advanced SQL knowledge, including query authoring, experience with relational databases, and working familiarity with a variety of database systems.
  • Experience building and optimizing data pipelines, architectures and data sets.
  • Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement.
  • Strong analytical skills for working with unstructured datasets.
  • Experience building processes supporting data transformation, data structures, metadata, dependency management, and workload management.
  • A successful history of manipulating, processing and extracting value from large disconnected datasets.
  • Working knowledge of message queuing, stream processing, and highly scalable ‘big data’ data stores.