How To Set Up Zeppelin For Analytics And Visualization
In this article, you learn how to create and configure a Zeppelin instance on EC2, set up notebook storage on S3, and enable SSH access.
...always one step away. I’m comfortable with tools such as Python, Pandas, LangChain, Node, SQL, Power BI, Tableau, or any similar stack you can justify. Key deliverables • Deployed WhatsApp agent(s) connected through the WhatsApp Business API (the WhatsApp channel is ready) • Retrieval-augmented knowledge base so the bots surface the latest information without hallucinations (critical) • Automated ETL jobs (n8n, Airflow, or your suggested alternative) feeding a structured data store • Reusable analysis scripts/notebooks with documented logic • Interactive dashboards and geo-visualisations accessible via web link • Deployment guide and a brief hand-off walkthrough Please respond ASAP with links to past conversational AI or data-analyti...
... deployment, and monitoring of integration solutions Required Skills Strong hands-on experience with SnapLogic Integration Platform Experience with REST/SOAP APIs and Web Services Knowledge of JSON, XML, and data transformation techniques Experience working with databases such as SQL Server, MySQL, or Oracle Understanding of cloud platforms (AWS, Azure, or GCP is a plus) Familiarity with ETL/ELT concepts and integration patterns Strong debugging and problem-solving skills Preferred Qualifications Experience with CI/CD and version control tools Exposure to enterprise applications like Salesforce, NetSuite, SAP, or similar Good communication and documentation skills Ability to work independently and in a collaborative team environment Engagement Details Long-term opp...
...and loads it downstream for ETL processing. The transfer works, but network-level performance is far below what I need. Here is what I’m looking for: • Diagnose the current ADO.NET-to-Oracle connection, identify any network, buffer, or packet-size bottlenecks, and benchmark the baseline throughput. • Tune the SSIS data flow (buffers, rows per batch, commit size, async settings, etc.) and, if necessary, adjust the Oracle driver or provider configuration. • Provide an updated package or detailed change list so I can reproduce the performance gains in other environments. • Produce a concise report summarizing findings, before-and-after metrics, and next-step recommendations. Source: another Oracle database accessed via ADO.NET. Goal: reliable, hi...
Job Description We are looking for an experienced Azure Data Engineer / Data Integration Specialist to design and implement a robust, scalable data pipeline that pulls data from the Blackbaud API and loads it into an Azure SQL Database using Azure Data Factory (ADF). The goal is to have a fully automated, secure, and monitored ETL pipeline that runs on a scheduled basis and supports future scaling. Project Scope 1. Data Ingestion Connect to Blackbaud REST APIs (OAuth authentication) Handle pagination, rate limits, and API throttling Extract multiple endpoints (e.g., constituents, gifts, transactions, etc.) 2. Data Transformation Clean, normalize, and structure raw API JSON Handle nulls, schema drift, and data type conversions Add audit fields (load date, source system, batch id) 3. ...
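The ingestion step above (OAuth-authenticated REST extraction with pagination, throttling, and audit fields) can be sketched roughly as follows. The paging shape (`value`/`next_link`) mirrors the style of Blackbaud's SKY API list responses, but the exact contract, field names, and throttling policy are assumptions to verify against the real API; the page fetcher is injected so the stamping logic stays testable.

```python
from datetime import datetime, timezone

def extract_endpoint(fetch_page, batch_id, source_system="blackbaud"):
    """Drain a paginated API via an injected fetch_page(url) -> dict callable,
    stamping every record with audit fields (load date, source system, batch id).

    In production, fetch_page would wrap requests.get with the OAuth bearer
    token and back off on HTTP 429 using the Retry-After header.
    """
    rows, url = [], "page-1"  # "page-1" is a placeholder first-page URL
    while url:
        payload = fetch_page(url)
        for item in payload.get("value", []):
            item["load_date"] = datetime.now(timezone.utc).isoformat()
            item["source_system"] = source_system
            item["batch_id"] = batch_id
            rows.append(item)
        url = payload.get("next_link")  # absent/None ends the loop
    return rows
```

In ADF, the same loop is typically expressed as a Copy activity with pagination rules; a Python sketch like this is useful when the extraction runs in an Azure Function instead.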
Top Ranked Requirements 1. Adobe/AEM Architect: Expert-level structuring of Adobe Analytics (segments, dimensions, templates). This is the "heavy lift" of the role. 2. The "Storyteller" (Strategy Bridge): 5+ years of experience turning data into strategic recommendations for Creative and Strategy teams. 3. Technical Data Handling: Hands-on proficiency with SQL/Python for ETL and data cleaning, and BI tools (Domo, Tableau, Looker) for automation. 4. Governance Lead: The ability to create "durable" frameworks—naming conventions, tagging plans, and intake processes—that ensure data stays clean year-over-year. Core Responsibilities (The "What") • Design & Implement: Build measurement frameworks and "always-on" tr...
### Overview We are looking for **Python engineers focused on web scraping, data extraction, and data cleaning**. This is **NOT** a large system or customer-facing role. The work consists of: * Small, clearly scoped Python scripts * Web scraping (HTML, PDFs, APIs) * Data cleaning and transformation * ETL-style utilities All work is: * Async-first * Internal tools only * Clearly scoped with written requirements This is **ongoing contract work**. Strong performers may receive long-term work. --- ### What You’ll Be Doing * Build Python scripts to scrape public websites * Parse HTML, JSON, CSV, and PDF files * Clean and normalize messy real-world data * Write clear, maintainable utility scripts * Deliver working code (not just prototypes) --- ### Required Skills * Str...
...speak Spanish fluently. Please do not apply if you do not meet this requirement, as constant communication with the team is vital. We are looking for a freelance Power BI and n8n expert for a long-term collaboration. We have multiple automation projects queued up, but we will start with an operational Productivity Dashboard. We want a technical profile with a strong command of data integration (ETL) and high-level visualization. The First Project: Productivity Dashboard. The goal is to add a section to an existing report that calculates productivity per operator. Data sources: Odoo 17 Enterprise: extraction via API of absences (vacation, sick leave), public holidays, and a list of...
...to extract and aggregate content from my website, including blogs, podcasts, YouTube videos, books, and articles. The goal is to populate a structured spreadsheet (Excel or Google Sheets) with this data, making it easy for AI tools to analyze themes, trends, summaries, etc. Background on the Process: This is essentially a data extraction or content aggregation task (also known as web scraping or ETL: Extract, Transform, Load). It involves systematically collecting unstructured content from my site and organizing it into a spreadsheet. Each entry should include metadata like title, URL, summaries, tags, and more. I have a template spreadsheet ("new come and reason ") with columns such as: id type (e.g., blog, podcast, video, book, article) title date_published source_na...
...for me. Each paper must be: • 8 pages (≈ 5 000 words) • fully formatted in IEEE style and delivered as clean, compilable LaTeX • accompanied by the final PDF Scope for the three papers 1. AWS services and cloud-native architecture—cover core building blocks such as VPC, IAM, S3, Lambda, and how they interoperate in production-grade designs. 2. Data Engineering methodologies—drill into ETL processes and modern data-pipeline patterns on AWS (Glue, Step Functions, Kinesis, etc.) with diagrams and code snippets where helpful. 3. AI/ML algorithms and applications—tie in SageMaker, feature engineering, and model deployment on AWS, illustrating at least one end-to-end use case. Purpose These are personal papers for my own portfolio, ...
...search stack that starts with an ETL flow pulling exclusively from our internal PostgreSQL databases. The pipeline must ingest and transform 38 000+ B2B category records and 5 000–10 000 company profiles, then run cleaning, vectorization, and enrichment steps so every record is categorized and stored in a pgvector-enabled schema. Once the data is in place, a separate microservice should expose a REST API that supports hybrid search: dense vectors (OpenAI text-embedding-3-small) combined with BM25 and blended with RRF scoring. Results have to work equally well in Hungarian and English; huspacy, spaCy, and OpenAI are the preferred tools for language handling and any fallback generation. I expect the codebase in Python 3.10+, organised as two deployable units: • ET...
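The RRF blending the posting asks for is compact enough to sketch. This is the standard Reciprocal Rank Fusion formula, not code from the project itself: each result list (one from the dense-vector search, one from BM25) contributes 1/(k + rank) per document, and documents are re-ranked by the summed score.

```python
def rrf_merge(rankings, k=60):
    """Blend several ranked result lists with Reciprocal Rank Fusion.

    Each element of `rankings` is a list of doc ids ordered best-first.
    score(d) = sum over lists containing d of 1 / (k + rank_in_list);
    k=60 is the constant from the original RRF paper.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)
```

Because RRF only consumes ranks, it needs no score normalisation between the vector and BM25 lists, which is why it is a common default for hybrid search.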
...surface company or product metadata Verification rules must anchor to two primary sources: the official manufacturer websites themselves and relevant government databases. When information conflicts, the system should automatically weigh each source and assign a confidence score, storing the rationale so we can trace every decision later. What I expect you to deliver • Clean, well-documented ETL scripts (Python, SQL or comparable) that ingest, normalise and enrich my current tables • A modular rules engine where I can tweak source priority, matching logic and thresholds without touching core code • A confidence-scoring function that explains how each record was resolved, including the exact URLs or API records consulted • Logging and error-handling...
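A confidence-scoring function that records its rationale, as requested above, might look like the following sketch. The weighting scheme, field names, and score formula are illustrative assumptions, not the client's specification: source weights stand in for the tunable source-priority rules, and confidence is the weight share of sources agreeing with the winning value.

```python
def resolve_field(candidates, source_weights):
    """Pick a field value from conflicting sources and explain the choice.

    candidates: source name -> (value, evidence_url)
    source_weights: source name -> priority weight (e.g. manufacturer site
    vs. government database); both structures are placeholder shapes.
    Returns (value, rationale) where rationale is traceable later.
    """
    best = max(candidates, key=lambda s: source_weights.get(s, 0.0))
    value, url = candidates[best]
    total = sum(source_weights.get(s, 0.0) for s in candidates)
    # Agreement boosts confidence: weight of sources whose value matches the winner.
    supporting = sum(
        source_weights.get(s, 0.0) for s, (v, _) in candidates.items() if v == value
    )
    confidence = supporting / total if total else 0.0
    rationale = {
        "chosen_source": best,
        "evidence": url,
        "confidence": round(confidence, 3),
    }
    return value, rationale
```

Storing `rationale` alongside each resolved record gives exactly the traceability the brief demands: which source won, which URL was consulted, and how contested the value was.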
KEY RESPONSIBILITIES - Build an AI chatbot infrastructure on Amazon Bedrock using Anthropic Claude - Develop Knowledge Base infrastructure and ETL pipelines - Implement security controls: VPC isolation, PrivateLink endpoints, KMS encryption - Configure Bedrock Guardrails (content filters, PII masking, threat detection) - Set up monitoring and logging systems REQUIRED SKILLS - AWS: Bedrock, VPC/PrivateLink, S3, OpenSearch, IAM, KMS, Lambda, CloudTrail - Security: Entra ID/IAM Identity Center SSO, OIDC, encryption, network isolation, SIEM - Data Engineering: ETL pipelines, API integrations, data classification - IaC: Terraform or CloudFormation - Monitoring: CloudWatch, log analysis, KQL queries - ISO 27001, GDPR, data privacy principles - AI security (prompt injection, guardr...
I have direct access to our e-commerce sale...connections or scheduled refresh against the database. I’ll provide the connection string and a sample export; you’ll propose the data model, set up the transformations, and surface the insights in clean, intuitive visuals. Exact metrics and calculated fields can be finalised together once the framework is in place. Deliverables • Secure connection to the sales database, including any necessary ETL or query optimisation • Fully-interactive dashboard with filter panels, drill-through views, and export options • Clear hand-off documentation covering data model, refresh schedules, and how to extend the report Please outline the tool you prefer, the timeline you need, and one example of a similar dashboar...
...supporting the modernization of the system - Implementing and overseeing data migration and data transformation processes - Developing SQL-based and ETL-style solutions in PostgreSQL/MSSQL environments - Designing migration validation, data quality, and control mechanisms: reconciliation, verification reports, idempotent loads - Preparing technical decisions: batch vs. incremental approaches, reprocessing, rollback, cut-over strategies - Close collaboration with business- and IT-side stakeholders...
Project Title Data Integration & Visualization Specialist Project Description We need a person to work on ETL development (Informatica IDMC). Work Type Full-time or part-time, 3–5 days per week, around 4 hours per day, remote. Required Skills Informatica PowerCenter & IDMC, ETL development, SQL. Payment Based on work delivered; higher pay for strong performance.
...indexes) * Import setup from CSV/JSON (including ID strategy and validation) * 10–20 Cypher queries covering typical usage patterns (lookup, filtering, “context retrieval”) * Short documentation + recommended next structure Skills / Requirements Must-have * Neo4j + Cypher (strong hands-on experience) * Graph data modeling (ontology / domain modeling; clean labels, relationships, and boundaries) * ETL / ingestion from CSV/JSON (ID strategy, validation, deduplication) * Performance & quality fundamentals: constraints, indexing, and query profiling/optimization using EXPLAIN / PROFILE Nice-to-have * Python for scripting and transformations (import utilities, data cleanup, automation) * LLM / RAG integration (chunking strategies, metadata design, retrieval ...
I need my separate databases to behave as one reliable source of truth. The job starts with assessing the current schemas and finishes when a single, well-documented repository is live, fully populated, and syncing automatically. Core tasks include mapping tables and fields, building the ETL pipelines, migrating historical records, validating row-level accuracy, and putting monitoring in place so future updates flow without manual intervention. I am open to the tech stack—whether you prefer native SQL scripts, Python with Pandas and SQLAlchemy, Talend, Airflow, or another proven toolset—as long as the choice is justified and scalable. Please attach a detailed project proposal that walks through your approach, milestones, estimated timeline, and any comparable integ...
Project Title: Data Integration & Visualization Specialist for ETL, IDMC, Informatica & Qlik Project Description: We are seeking an experienced Data Integration & Visualization Specialist for a project involving ETL development, Informatica, IDMC, and Qlik. The goal of the project is to design, implement, and maintain data pipelines and dashboards for seamless data processing and reporting. Responsibilities: Design and develop ETL pipelines for data extraction, transformation, and loading. Work with Informatica PowerCenter and IDMC to manage and optimize data workflows. Develop and maintain Qlik dashboards and reports for data visualization. Ensure data quality, accuracy, and consistency across systems. Collaborate with project stakeholders to meet r...
Platinum by ETL: Combining our experience in the crypto/digital asset space with ETL Global’s expertise in legal, accountancy, and tax matters, we are looking to target HNW/UHNW individuals with significant assets in cryptocurrency. The goal is to assist them in incorporating digital assets into their day-to-day accounting and legal matters, with an additional focus on how these assets can be passed along to heirs as part of a trust/inheritance set-up. Core product: “The Vault”. This product will be a secure multi-sig/MPC wallet (hosted by DFNS). Core functionalities: Frontend design to be high-class, slick, professional. (Please see below a link to the website which is in development: , for you to get an idea of the design we have gone for)...
...the price of land in the area. The goal is for the system to process this data, generate a reliable estimated value, and present it in downloadable PDF reports ready to share with investors or internal clients. I need: 1. A clear, documented valuation model (it can be a statistical or machine-learning algorithm, as long as the accuracy is justified). 2. A small ETL flow to load and clean the three data sources. 3. A PDF report template with the main charts and metrics already embedded. 4. A usage and installation guide so the engine can run in my own environment (preferably Python with common libraries such as pandas, scikit-learn, and reportlab, but open to sugge...
...data operations tasks. Any experience building AI agents into the workflows, or with n8n, is a nice-to-have add-on. We are looking exclusively for people in Egypt. We agree weekly on 10-hour work packages and will assess work quality and deliverables on a weekly basis prior to payout. Skills & experience required: - Proven experience designing and implementing robust ETL pipelines for large-scale, heterogeneous data sources - Strong proficiency in building and maintaining custom web crawlers and data scrapers using tools like Power Automate Desktop, JavaScript, or similar frameworks - Expertise in handling unstructured and semi-structured data (e.g., JSON, APIs, flat files) - Familiarity with SQL, Excel, Power BI - Strong problem-solving skills an...
We are a fast-growing technology company building intelligent, AI-driven products that solve real-world business problems. Our focus is on automation, scalable systems, and p...workflows. You will work closely with product, engineering, and business stakeholders to develop reliable, production-grade AI solutions—not just experiments. What You’ll Do Design and implement AI-powered features and automation workflows Build and integrate LLM-based applications (OpenAI, Hugging Face, etc.) Develop scalable backend services and APIs Work with structured and unstructured data (ETL pipelines, embeddings, vector databases) Optimize performance, reliability, and cost of AI systems Collaborate on system architecture and technical decision-making Take ownership of features from con...
...similar), ingest the relevant tables, and then handle missing values, outliers, type casting, and duplicate detection. Once cleaned, the data should be written back to a new table in the same database and optionally exported to CSV so that downstream teams can verify the results in Excel if they choose. Key deliverables – Python script (Pandas, NumPy, SQLAlchemy preferred) that performs the full ETL/cleaning routine – Re-usable functions or class-based structure so future data drops can be processed with one command – Clear inline comments plus a short README explaining installation, execution, and configurable parameters Acceptance criteria 1. Script connects to the database with credentials provided at run-time or via .env file 2. All missing o...
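The cleaning routine described above (missing values, outliers, type casting, duplicate detection) can be sketched as a reusable Pandas function. The column names, median imputation, and IQR clipping policy are placeholder assumptions; the actual rules would be agreed with the client and driven by the configurable parameters the brief mentions.

```python
import pandas as pd

def clean(df, numeric_cols):
    """One pass of a generic cleaning routine: duplicate removal, type
    casting, missing-value fill, and IQR-based outlier clipping.
    Fill and clip policies here are illustrative defaults, not the spec.
    """
    out = df.drop_duplicates().copy()
    for col in numeric_cols:
        out[col] = pd.to_numeric(out[col], errors="coerce")   # bad strings -> NaN
        out[col] = out[col].fillna(out[col].median())          # simple imputation
        q1, q3 = out[col].quantile([0.25, 0.75])
        iqr = q3 - q1
        # Clip values outside the 1.5*IQR whiskers instead of dropping rows
        out[col] = out[col].clip(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    return out

# In the full script this would be bracketed by SQLAlchemy reads/writes, e.g.
# df = pd.read_sql_table("raw_table", engine)
# clean(df, numeric_cols).to_sql("clean_table", engine, if_exists="replace")
```

Keeping the routine as a pure DataFrame-in/DataFrame-out function is what makes the "one command for future data drops" acceptance criterion easy to meet.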
I’m looking for a Data Engineer with strong AWS native services experience to help build and support an event-driven data platform. This project focuses on automated batch data pipelines, data governance, and making data available in a secure and scalable way. This is not ad-hoc ETL — it’s a platform-style setup. Tech stack involved: • AWS: S3, SQS, Lambda, MWAA (Airflow), EMR Serverless • Data Processing: PySpark, Apache Spark • Data Lake: Apache Iceberg, AWS Glue Catalog • Governance & Security: Lake Formation, IAM, KMS • Querying: Amazon Athena
...Delivery Driver, Shoes for a Nurse). D. Compensation System (Data-to-Cash) Earnings: 1.00 per invoice / 1.50 per contract submitted. Purpose: pay the user for feeding the database. 3. The B2B Arm: Altro Insight (The Data Platform) Objective: an automated data-sales factory (SaaS + E-commerce). A. The Processing Engine (The Internal Factory) Automated Cleaning (ETL): every incoming record is cleaned, standardized, and anonymized before being stored. Organization: data is classified by tags: age bracket, gender, occupation, region, salary, employer. B. The Mod...
...analysts, engineers, or BI specialists with strong communication and writing skills to contribute technical articles to our blog. Your mission? Share hands-on experience, real-world examples, and practical tips to help fellow data professionals work smarter. Who are we? ClicData is an all-in-one data management and business intelligence platform (SaaS), offering data connectivity, warehousing, ETL, data visualization, and automation. Our audience includes data professionals and data-savvy business leaders, primarily in mid-market companies across North America. Why write with us? - We're looking for long-term collaborators, not one-off gigs. That means predictable, recurring income for you. - You'll be credited as the author of each piece you write; your expertise will be sho...
...domain to design, build, and maintain scalable data pipelines and reporting solutions. The ideal candidate will have hands-on experience across AWS and Microsoft Azure, strong Python/PySpark skills, and the ability to support integrated reporting and analytics using Power BI. Key Responsibilities Design, develop, and maintain end-to-end data pipelines for healthcare payer data Build and optimize ETL/ELT workflows using AWS Glue, Step Functions, and Python Work with Azure and AWS cloud services for data ingestion, processing, and storage Implement and manage Data Lake architecture (structured & unstructured data) Ensure high data quality, reliability, and performance across pipelines Support integrated reporting and analytics use cases Collaborate with business, analyt...
...appropriate for a small team (CloudWatch-based is fine) Governance & Access Configure permissions and table governance properly Maintain clear lineage from raw to curated Handoff Provide clear documentation: architecture diagram, runbook, and how to extend the pipeline Ideal Candidate (Must Have) Strong hands-on experience with S3, Athena, Glue, Python Experience building production ELT/ETL pipelines (not just ad hoc scripts) Solid understanding of data design (partitions, Parquet, table formats, cost/performance) Comfort with SQL and data modeling for analytics-ready datasets Ability to communicate tradeoffs clearly and propose a clean architecture Clear updates, short design review upfront, then implementation sprints After reviewing this proposal, I will proce...
...various sources into this new store. Think full-stack ETL: extract from APIs or flat files, transform for consistency and quality, then load into the MySQL tables on a defined schedule. I’m open to tools you’re comfortable with—Python scripts, Airflow, Fivetran, or a light-weight custom solution—as long as the result is reliable, monitorable, and easy to extend. Security, backup routines, and a concise hand-off document are part of the brief. If you’ve architected MySQL environments for analytics workloads before and can demonstrate tight, well-tested data pipelines, I’d like to see how you’d approach this. Deliverables (acceptance criteria): • Optimised MySQL schema tailored for analytics data • Automated, version-control...
...star or snowflake model that makes sense for reporting, then build the ETL logic to keep it populated and up to date. Our source of truth is MS SQL Server, and the semantic layer will live in Power BI; if you’re comfortable incorporating other sources such as flat files or APIs later, that flexibility will be a plus. Once the data model is solid, craft interactive Power BI dashboards that spotlight Sales performance—pipeline health, revenue trends, win-loss ratios, and regional drill-downs. I expect polished visual standards, sensible DAX, and refresh cycles that won’t keep users waiting. Deliverables I need to sign off on: • SQL scripts for all new or modified database objects and the dimensional model • Automated ETL process (T-SQL, SSIS, ...
Role: QA Engineer / QA Consultant · Engagement Type: Consultant basis (Temporary) · Duration: 2 months (extendable up to 3 months if needed) · Location – Work from office Experience & Skill Requirements: · Minimum 3 years of experience · Hands-on QA experience in Data Engineering projects (mandatory) · Experience validating data pipelines, ETL processes, data quality, and related workflows · Automation experience is good to have, but strong Data Engineering QA exposure is the key requirement
(This is not a real Project, If hired you will be training an individual in completing the follow...monitoring of pipeline health and bullet-proof error handling/recovery. Our stack is PostgreSQL 14 on Windows; feel free to suggest optimisations such as table-partitioning, COPY-based bulk inserts, or an intermediate message-queue if it helps sustain throughput. Deliverables • Updated C# modules that control the sensors/PCBs with automatic reconnect and status callbacks • A PostgreSQL-centric ETL pipeline covering transformation, live monitoring metrics and graceful retries • Clear set-up notes and concise inline code comments so the in-house team can maintain the solution If you’ve previously juggled hardware control and big SQL flows in the same project...
...on real data analysis and visualization projects. We work with operating companies (distributors, logistics, sales, administration), so we are looking for a profile that doesn't just build dashboards but also understands business processes and indicators. - Requirements: real-world experience with Power BI (modeling, DAX, dashboards); connecting to APIs, databases, and external sources; ETL processes (Power Query, data cleaning and transformation); ability to interpret business KPIs and metrics; fluent Spanish (preferably Argentina) - Work arrangement: 100% remote work; access to APSOL's private server (for confidentiality and performance); infrastructure optimized for data processing and...
Hello, I need to commission a Power BI dashboard. Project guidelines below. Topic: any, e.g. macroeconomic data, analysis of a company's operations, medicine, sport, ecology, demographics, geography, police statistics. Data sources: several data sources of different types should be used (e.g. flat files, databases, websites, web services). ETL: the data must be properly prepared for use; it may turn out that a dozen or so operations are needed on a single table. Relationships: tables from the different data sources must be linked with relationships. Visualizations: should be suited to what we want to present. There should be no fewer than 4 different visual obj...
...day (Monday to Friday) In US timings • Payment: Weekly • 60-70k Monthly Role Overview: The Data Engineer designs, builds, and maintains scalable data pipelines and architectures to support healthcare workflows. This role ensures data reliability, performance, and HIPAA compliance while working with modern cloud-based data engineering tools. Duties & Responsibilities: • Design, build, and maintain ETL/ELT pipelines using Python, Dagster, DBT, and AWS services • Develop and optimize data models in PostgreSQL and write high-performance SQL • Monitor pipeline health, troubleshoot failures, and implement preventive controls • Enforce data quality, governance, and HIPAA compliance for PHI data • Automate deployments and monitoring using Terraform...
Project Details ₹600.00 – 1,500.00 INR I need an experienced data professional to take raw tables living in our cloud warehouse and turn them into clean, analytics-ready models. Your day-to-day will center on writing highly efficient SQL, designing repeatable ETL logic, and orchestrating everything through Azure Data Factory so the pipelines run hands-free. The warehouse is already in place; what’s missing is the transformation layer that converts disparate source data into a single source of truth the business can trust. Expect to: • Build and document robust staging, cleansing, and dimensional models directly inside the cloud data warehouse • Optimise complex joins and long-running queries for both cost and speed • Drop into Python (Pandas, NumPy) whe...
Big Data Engineer with expertise in the Hadoop ecosystem across GCP, AWS, and Snowflake, including GCP BigQuery. Extensive experience deploying cloud-based applications using Amazon Web Services such as Amazon EC2, S3, RDS, IAM, Auto Scaling, CloudWatch, SNS, Athena, Glue, Kinesis, Lambda, EMR, Redshift, and DynamoDB. Worked on ETL migration by developing and deploying AWS Lambda functions to build a serverless data pipeline that writes to the Glue Catalog and can be queried from Athena. Proven expertise in deploying major software solutions for various high-end clients, meeting business requirements such as big data processing, ingestion, analytics, and cloud migration from on-prem to AWS using EMR, S3, and DynamoDB. Hands-on experience in GCP, BigQuery,...
...system using Python, PostgreSQL, OpenAI API, and other advanced technologies. The role involves building a robust ETL pipeline, implementing semantic search, and ensuring strict quality assurance. Scope of work - Build a robust ETL pipeline in Python to process 38k+ categories and company data with Hash Checking for incremental updates - Implement Semantic Search using Hybrid Search (Vector + BM25) - Ensure the system passes predefined 'Blind Test' protocol - Utilize strict type hinting and deterministic logic for code architecture - Deliver production-grade, containerized code - Project management tracking in Jira, with documentation and testing included Data engineering expertise ETL, Data validation, SQL Data technology tools Python, SQL Additional ...
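The "hash checking for incremental updates" requirement has a standard shape worth sketching: hash the business content of each record (excluding audit fields) and reload only when the hash changes. The excluded field names and the id key are illustrative assumptions, not the project's actual schema.

```python
import hashlib
import json

def record_hash(record, exclude=("load_date", "batch_id")):
    """Stable content hash for change detection. Audit-style fields are
    excluded so only genuine attribute changes trigger a reload; the
    canonical JSON form makes the hash independent of key order."""
    body = {k: v for k, v in record.items() if k not in exclude}
    canon = json.dumps(body, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(canon.encode("utf-8")).hexdigest()

def needs_update(record, stored_hashes):
    """True if the record is new or its content hash differs from the
    hash persisted during the previous load."""
    return stored_hashes.get(record["id"]) != record_hash(record)
```

Persisting the hash column next to each loaded row keeps the incremental check a single indexed comparison, which matters at the 38k+ record scale the brief describes.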
...visibility catalogue that covers everyday electronics, clothing, and household items. All of the raw product data already sits in the database; what they need is someone who is ready to learn and will submit the records through a pre-built AI tool that turns the data into consumer-ready listings for our various sales channels. The workflow is straightforward and requires no prior knowledge: the AI ETL framework pulls the SKU, descriptions, specs, and images from the database, the AI handles copy polishing and attribute mapping, and you simply submit the product data, which uploads the final set. The company will supply the database credentials, a field map, and a short text-based training session so you can start right away. I’m looking for someone who is willing to learn. This is...
...patterns, ETL tools, and event frameworks ● Strong understanding of CRM data models, automation, lifecycle processes, and governance ● Experience with security, compliance, consent frameworks, and global privacy regulations ● Ability to produce high-quality architecture documentation and lead cross-functional workshops ● Excellent communication skills to collaborate with customer leadership, engineering teams, and business stakeholders ● Demonstrated leadership experience guiding teams and driving large-scale architectural initiatives ● Certifications such as Salesforce CTA, Application Architect, System Architect, Marketing Cloud Architect, or Data Cloud Consultant ● Experience with Tableau, MuleSoft, or advanced Marketing Cloud features is a plus ● Experience with Snowflake, re...
...focused on AI/ML projects. Prior experience working with Snowflake or similar cloud data platforms. Retail industry experience strongly preferred.
Technical Skills
- Strong SQL skills for querying and correlating Snowflake tables.
- Experience with AI/ML frameworks and conversational AI (Cortex Agent, Dialogflow, Rasa, or similar).
- Python programming experience (preferred).
- Familiarity with data modeling, ETL, and BI dashboards....
...support our data operations team based in Egypt (remote), specifically focused on web crawling, APIs, ELT, BI and data operations tasks. We are looking exclusively for people in Egypt. We agree weekly on 10-hour work packages and will assess the work quality and deliverables on a weekly basis prior to payout.
Skills & experience required:
- Proven experience designing and implementing robust ETL pipelines for large-scale, heterogeneous data sources
- Strong proficiency in building and maintaining custom web crawlers and data scrapers using tools like Power Automate Desktop, JavaScript or similar frameworks
- Expertise in handling unstructured and semi-structured data (e.g., JSON, APIs, flat files)
- Familiarity with SQL, Excel, Power BI
- Strong problem-solving sk...
...goal of this workshop is to teach students how to bridge the gap between raw data analysis and advanced business intelligence. You will be responsible for guiding students through a hands-on capstone project.
Key Responsibilities:
- Conduct weekly 1-on-1 or group mentoring sessions to review project progress.
- Explain complex concepts like DAX measures, data modeling (star schema), and Power Query ETL in simple terms.
- Guide students in integrating Python/R scripts within Power BI for advanced analytics.
- Provide constructive feedback on dashboard design, storytelling, and data accuracy.
- Assist in troubleshooting data connection issues and complex formula errors.
Required Skills:
- Expertise in Power BI: DAX, Power Query, Row-Level Security, and Power BI Service.
- Data Science F...
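As a minimal sketch of the Python-in-Power-BI integration this posting mentions: Power BI hands the selected fields to a Python visual or a Power Query "Run Python script" step as a pandas DataFrame named `dataset`. The column names below are hypothetical stand-ins, not anything from the posting.

```python
import pandas as pd

# Stand-in for the `dataset` DataFrame that Power BI injects into a
# Python visual or "Run Python script" step (columns are hypothetical).
dataset = pd.DataFrame({"Month": ["Jan", "Feb", "Mar", "Jan"],
                        "Sales": [120, 95, 140, 30]})

# Example transform: aggregate sales per month. In Power Query, any
# DataFrame left in scope is offered back as the step's output table;
# in a Python visual you would instead plot it with matplotlib.
result = dataset.groupby("Month", sort=False, as_index=False)["Sales"].sum()
print(result)
```

`sort=False` keeps the months in their original order of appearance rather than sorting them alphabetically, which usually matters for time-ordered axes.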
...layouts, bullet points, font colors, hyperlinks, backlinks, and related features.
PRIORITY: These APIs and the Laravel business logic need to be built in collaboration with Syed (mobile) ASAP.
- Sign-up workflow = username, email, telephone, GA
- Login workflow = username, face match, Google Authenticator (GA) code
- Keep the login business logic on Laravel or something else? What do you recommend?
ETL (Extract, Transform, and Load), Python: an ETL pipeline in Python that runs or dumps every 24 hrs FROM site data and face data TO a PII storage warehouse and a face storage warehouse. PII = personally identifiable information.
- Ability to retrieve PII & face data; incremental or source-driven extraction.
- Transform the data, meaning clean, validate, standardize, de-duplicate...
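The daily extract-transform-load run described above could be skeletoned like this. The field names (`username`, `email`) and the validation/de-duplication rules are illustrative assumptions, not the project's actual schema; the scheduling itself would be handled externally (e.g. a cron entry firing every 24 hours).

```python
import re

def transform(rows: list[dict]) -> list[dict]:
    """Clean, validate, standardize and de-duplicate raw rows."""
    seen = set()
    out = []
    for row in rows:
        email = (row.get("email") or "").strip().lower()  # standardize
        if not re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", email):
            continue                                       # validate: drop bad emails
        if email in seen:
            continue                                       # de-duplicate on email
        seen.add(email)
        out.append({"username": (row.get("username") or "").strip(),
                    "email": email})
    return out

def run_pipeline(extract, load) -> int:
    """One daily run: extract -> transform -> load; returns rows loaded."""
    rows = transform(extract())
    load(rows)
    return len(rows)
```

Keeping `extract` and `load` as injected callables makes the same transform reusable for both the site-data and face-data sources, and makes the pipeline trivially testable without touching the warehouses.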
In this article, you learn how to create and configure a Zeppelin instance on EC2, set up notebook storage on S3, and enable SSH access.