Data Architect
Karachi / Islamabad / Lahore, Pakistan
Contracted
Technical Services
Experienced
KalSoft is looking for an experienced Data Architect with 10+ years of expertise in designing, developing, and managing on-premises Data Lake, Lakehouse, and Big Data platforms. The ideal candidate has deep knowledge of Apache Spark, distributed computing, and modern data architectures, and will ensure scalable, high-performance, and governed data environments.
Key Responsibilities:
1. Data Architecture & Strategy
- Design and implement scalable, high-performance Data Lake/Lakehouse architectures to support enterprise analytics and AI workloads.
- Define data partitioning, indexing, and storage strategies for efficient querying and processing (see the partitioning sketch after this list).
- Implement metadata management, data lineage, and data cataloging to ensure governance and compliance.
- Establish data pipeline architectures that support batch, real-time, and streaming data processing.
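For illustration, a minimal PySpark sketch of the kind of partitioned Lakehouse table design this role covers. It assumes a Spark session with the open-source Delta Lake extensions on the classpath; the paths and column names are hypothetical placeholders.

```python
from pyspark.sql import SparkSession

# Delta-enabled session; both configs are needed for Delta SQL support.
spark = (
    SparkSession.builder
    .appName("lakehouse-partitioning-sketch")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

events = spark.read.parquet("/data/raw/events")  # hypothetical source path

# Partition by a low-cardinality date column so queries filtering on
# event_date prune whole directories instead of scanning the full table.
(
    events.write
    .format("delta")
    .mode("overwrite")
    .partitionBy("event_date")
    .save("/data/lake/events")  # hypothetical on-premises lake path
)
```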
2. Big Data Engineering & Apache Spark
- Architect and optimize large-scale data processing pipelines using Apache Spark (PySpark, Scala, or Java).
- Deploy and manage Spark on distributed computing frameworks such as YARN, Kubernetes, or standalone clusters.
- Lead the development of ETL/ELT pipelines using Apache Spark, Hadoop, Trino (Presto), Apache Iceberg, Delta Lake, or Apache Hudi.
- Enable real-time data streaming using Apache Kafka, Spark Structured Streaming, Apache Flink, or Apache NiFi (a streaming ingestion sketch follows this list).
- Ensure data lake interoperability with data warehouses, BI tools, and AI/ML platforms.
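As referenced above, a minimal sketch of real-time ingestion with Apache Kafka and Spark Structured Streaming landing into a Delta table. It assumes a Delta-enabled session (configured as in the earlier sketch) and the spark-sql-kafka connector on the classpath; the broker address, topic name, and paths are hypothetical.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("streaming-ingest-sketch").getOrCreate()

stream = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # hypothetical broker
    .option("subscribe", "events")                     # hypothetical topic
    .option("startingOffsets", "latest")
    .load()
)

# Kafka delivers key/value as binary; cast the payload before persisting.
parsed = stream.select(col("value").cast("string").alias("payload"))

# Checkpointing provides restart safety and exactly-once delivery to the sink.
query = (
    parsed.writeStream
    .format("delta")
    .option("checkpointLocation", "/data/checkpoints/events")
    .start("/data/lake/events_stream")
)
query.awaitTermination()
```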
3. Performance Optimization & Scalability
- Optimize Apache Spark jobs by implementing RDD tuning, partitioning strategies, and caching mechanisms (see the tuning sketch after this list).
- Improve query performance using Apache Spark SQL, Delta Lake optimizations, and Z-Ordering.
- Implement data lifecycle management, compaction, and auto-tuning techniques for large-scale datasets.
- Ensure scalability, fault tolerance, and high availability of data platforms.
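A minimal sketch of the routine optimizations named above: repartitioning ahead of shuffle-heavy work, caching reused data, and compacting with Z-Ordering. It assumes a Delta-enabled session and open-source Delta Lake 2.0+, where OPTIMIZE ... ZORDER BY is available; the paths and column names are hypothetical.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("tuning-sketch").getOrCreate()

df = spark.read.format("delta").load("/data/lake/events")

# Right-size partitions before a wide, shuffle-heavy join or aggregation.
df = df.repartition(200, "customer_id")

# Cache only datasets reused across several actions, then materialize it.
df.cache()
df.count()

# Compact small files and cluster by a frequent filter column so data
# skipping can prune files at query time.
spark.sql("OPTIMIZE delta.`/data/lake/events` ZORDER BY (customer_id)")
```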
4. Data Security, Governance & Compliance
- Implement data security policies, role-based access control (RBAC), encryption, and tokenization.
- Ensure compliance with GDPR, HIPAA, or other industry regulatory frameworks.
- Enforce audit logging, data masking, and identity management for enterprise data security.
- Enable data versioning and time-travel capabilities in Lakehouse platforms for compliance and reproducibility (a time-travel sketch follows below).
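As noted in the last item above, a minimal sketch of data versioning and time travel on a Delta Lake table, which supports audits and reproducible reads. It assumes a Delta-enabled session; the path, version number, and timestamp are hypothetical.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("time-travel-sketch").getOrCreate()

# Read the table as of an earlier commit, by version or by timestamp.
v3 = (
    spark.read.format("delta")
    .option("versionAsOf", 3)
    .load("/data/lake/events")
)
snapshot = (
    spark.read.format("delta")
    .option("timestampAsOf", "2024-01-01 00:00:00")
    .load("/data/lake/events")
)

# The transaction log doubles as an audit trail of table changes.
spark.sql("DESCRIBE HISTORY delta.`/data/lake/events`").show(truncate=False)
```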
5. Collaboration & Leadership
- Work closely with Data Engineers, Data Scientists, DevOps, and Business Analysts to align on data needs.
- Guide teams on modern data engineering best practices and Apache Spark optimizations.
- Engage with stakeholders and leadership to define data architecture roadmaps.
Required Skills & Qualifications:
- 10+ years of experience in data architecture, big data engineering, and data management.
- Deep expertise in Apache Spark (PySpark, Scala, Java) for large-scale data processing.
- Strong knowledge of on-premises Data Lake/Lakehouse architectures using Apache Iceberg, Delta Lake, or Apache Hudi.
- Experience with Hadoop ecosystem (HDFS, YARN, Hive, Impala, HBase, Ozone).
- Hands-on experience with distributed query engines (Trino, Presto, Apache Drill).
- Experience with workflow orchestration tools (Apache Airflow, Oozie, Prefect).
- Strong knowledge of data lake governance frameworks and metadata management.
- Familiarity with containerization and orchestration (Docker, Kubernetes) for Spark-based workloads.
- Experience with enterprise data security, access control, and data compliance regulations.
- Programming skills in Python, Scala, Java, or SQL.
- Experience in highly regulated industries (Oil & Gas, Healthcare, Telecom, Banking) is a plus.