Techfura - All about Programming and Data Engineering

SQL MERGE statement

January 28, 2021 (updated March 2, 2022) Kennedy

When performing ETL workloads against a data warehouse, some ETL activities can be handled with the SQL MERGE statement. MERGE is a special type of query in SQL Server …
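
As a rough illustration of the upsert pattern that MERGE covers in an ETL load, here is a minimal Python sketch using the standard-library sqlite3 module. SQLite has no MERGE statement, so the sketch swaps in INSERT ... ON CONFLICT DO UPDATE (available in SQLite 3.24 or later). The table dim_customer, its columns, and the function names are hypothetical and not taken from the post.

```python
import sqlite3


def upsert_customers(conn, staged_rows):
    """Insert new rows and update existing ones, keyed on id.

    Stand-in for the matched / not-matched branches of a MERGE;
    this is SQLite upsert syntax, not SQL Server syntax.
    """
    conn.executemany(
        """
        INSERT INTO dim_customer (id, name) VALUES (?, ?)
        ON CONFLICT(id) DO UPDATE SET name = excluded.name
        """,
        staged_rows,
    )
    conn.commit()


def demo():
    # Hypothetical target table with two existing rows.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE dim_customer (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO dim_customer VALUES (?, ?)", [(1, "Ann"), (2, "Bob")]
    )
    # Staged rows: id 2 already exists (update), id 3 is new (insert).
    upsert_customers(conn, [(2, "Robert"), (3, "Cy")])
    return conn.execute("SELECT id, name FROM dim_customer ORDER BY id").fetchall()
```

Calling demo() loads two staged rows into the target: the row with a matching key is updated in place and the row with a new key is inserted, which is the core behaviour an ETL step would ask of MERGE.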

Cloud Platforms Comparison Cheat Sheet

May 6, 2025 Kennedy

Here’s a comprehensive cloud comparison cheat sheet covering core services across Microsoft Azure, Amazon Web Services (AWS), and Google Cloud Platform (GCP), organized by major category, with brief descriptions and links to official docs for deeper dives.

SQL – CRUD and CT

February 3, 2025 techfura

When writing Microsoft SQL Server stored procedures, it is common to have a separate stored procedure for each CRUD operation (Create, Read, Update, Delete), that is, INSERT, SELECT, UPDATE and DELETE. However, is it possible to simplify this Transact-SQL logic into a single SQL Server stored …
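
The post's own procedure is not shown in this excerpt. Purely as an illustration of the idea of one entry point that dispatches on an action parameter, here is a small Python sketch against an in-memory SQLite table. The function names, the items table, and the action strings are all hypothetical, and the code is Python with SQLite rather than Transact-SQL.

```python
import sqlite3


def make_db():
    # Hypothetical table used only for this sketch.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
    return conn


def crud(conn, action, item_id=None, name=None):
    """Run one of the four CRUD operations from a single entry point."""
    if action == "create":
        cur = conn.execute("INSERT INTO items (name) VALUES (?)", (name,))
        conn.commit()
        return cur.lastrowid  # id of the newly inserted row
    if action == "read":
        return conn.execute(
            "SELECT id, name FROM items WHERE id = ?", (item_id,)
        ).fetchone()
    if action == "update":
        conn.execute("UPDATE items SET name = ? WHERE id = ?", (name, item_id))
    elif action == "delete":
        conn.execute("DELETE FROM items WHERE id = ?", (item_id,))
    else:
        raise ValueError(f"unknown action: {action}")
    conn.commit()
    return None
```

A caller would use it as crud(conn, "create", name="widget") followed by crud(conn, "read", item_id=...). The trade-off of this shape is fewer routines to maintain at the cost of a wider parameter list, since every call carries parameters that only some actions use.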

Building a Real-Time Data Pipeline with Python, Docker, Airflow, Spark, Kafka, and Cassandra

September 12, 2024 techfura

In today’s data-driven world, the ability to efficiently collect, process, and analyze large volumes of data is paramount. This blog post will delve into a data engineering project that leverages a powerful combination of tools: Python, Docker, Airflow, Spark, Kafka, …

Mastering Personalized Conversations with RAG

September 9, 2024 techfura

In today’s fast-evolving tech landscape, the demand for personalized, contextually aware conversational systems is higher than ever. Whether you’re developing customer support bots, virtual personal assistants, or interactive educational tools, leveraging advanced models like Retrieval-Augmented Generation (RAG) can transform your …

Top 10 Core Concepts in Every Programming Language

May 13, 2024 techfura

While there are many programming languages, they all share some fundamental building blocks. The post breaks down the top 10 core concepts you’ll find in pretty much every coding language. Understanding these core concepts is like learning the alphabet of …

Intro to Apache Spark

May 8, 2024 techfura

Apache Spark is a cluster computing engine designed for fast computation. It builds on the ideas of Hadoop MapReduce and extends the MapReduce model to efficiently support more types of computation, including interactive queries and stream processing.
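
To make the MapReduce model mentioned above concrete, here is a single-process word count in plain Python. It is only a sketch of the map, shuffle, and reduce phases and does not use the Spark or Hadoop APIs; all function names are hypothetical.

```python
from collections import defaultdict


def map_phase(lines):
    """Map: emit a (word, 1) pair for every word in every line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: collapse each key's list of values into one total."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(lines):
    return reduce_phase(shuffle(map_phase(lines)))
```

In a real cluster the map and reduce steps run in parallel on different machines and the shuffle moves data between them. Here everything runs in one process, so the sketch shows the shape of the model rather than its performance.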

Spark SQL – Part I

May 8, 2024 techfura

The world of data is exploding. Businesses are generating massive datasets from various sources – customer transactions, sensor readings, social media feeds, and more. Analyzing this data is crucial for uncovering valuable insights, informing decisions, and gaining a competitive edge.

ETL with PySpark – Intro

May 8, 2024 techfura

Data transformation is an essential step in the data processing pipeline, especially when working with big data platforms like PySpark. In this article, we’ll explore the different types of data transformations you can perform using PySpark, complete with easy-to-understand code …

Spark DataFrame Cheat Sheet

May 8, 2024 techfura

Core Concepts: DataFrame is simply a type alias of Dataset[Row].

Quick Reference:

    import org.apache.spark.sql.SparkSession

    // Build (or reuse) a local SparkSession
    val spark = SparkSession
      .builder()
      .appName("Spark SQL basic example")
      .master("local")
      .getOrCreate()

    // For implicit conversions like converting RDDs to DataFrames
    import spark.implicits._

Creation: create a Dataset from a Seq …

Basic Linux Commands – Cheat sheet

May 8, 2024 techfura

This cheat sheet covers some of the most essential commands you’ll encounter in Linux, grouped under File Management, Basic Operations, Permissions and Ownership, Network, Process Management, Finding Things, and Help and Information. This is a basic list. Many more commands exist for specific …