Cloud computing

Introduction

Spring Cloud Architecture

Eureka, Config Server, Zuul, Consul, Hystrix, Resilience4J

Spring Boot (BackEnd) Labs

Creation, Dependencies, Configuration, Example Video, RestController

Product REST API

Entity and Repository; Add, Display Product List, Details, Delete, Clear, Edit

Angular (FrontEnd)

Angular Refresher, Client CRUD

Spring Security

User Auth

CRUD

Sales CRUD

To be Continued...

Message-Oriented Middleware

Synchronous vs. Asynchronous Communication; JMS API: Java Message Service; JMS with ActiveMQ and HornetQ; Kafka

Spring Batch

Spring Batch

Stream Processing

Kafka Streams

Serverless Architectures

Serverless Architectures, Summary



Kafka

Kafka is primarily used to enable real-time data streaming and processing between systems. It ensures reliable, scalable, and fault-tolerant communication in data-driven environments, making it ideal for building event-driven architectures.

Data-Driven Environment

A data-driven environment is a setting where decisions and operations are guided by data and analytics. It involves collecting, centralizing, and analyzing data to gain insights and drive evidence-based decisions. Examples include:

  • Healthcare: Predicting disease outbreaks using patient data.
  • Finance: Detecting fraud through transaction analysis.
  • Retail: Optimizing inventory based on customer trends.
  • Aviation: Monitoring aircraft telemetry for real-time safety and predictive maintenance.

Functionalities of Kafka

  1. Publish-Subscribe Messaging:

    • Kafka allows applications to produce messages (events) to a topic.
    • Consumers can subscribe to these topics and receive the events in real-time.
  2. Scalability:

    • Kafka is horizontally scalable. Topics are divided into partitions, which can be distributed across multiple brokers (nodes) in the Kafka cluster.
    • This ensures high throughput by parallelizing reads and writes.
  3. Data Durability:

    • Kafka stores messages on disk, allowing consumers to retrieve them later.
    • Messages are retained for a configurable period, even after consumption, ensuring reliability.
  4. Fault Tolerance:

    • Kafka replicates partitions across multiple brokers for redundancy.
    • If a broker fails, the replicas take over seamlessly.
  5. High Throughput:

    • Designed to handle millions of events per second, Kafka ensures minimal performance impact even in high-traffic systems.
  6. Event Storage:

    • Kafka acts as a durable log for event storage, allowing downstream systems to replay or process historical events.
  7. Real-Time Processing:

    • Kafka supports stream processing using tools like Kafka Streams or external frameworks such as Apache Flink or Apache Spark.
  8. Decoupling Systems:

    • Kafka acts as a mediator, enabling independent evolution of producers and consumers.
    • Producers and consumers can operate asynchronously, decoupling their lifecycles.
  9. Integration:

    • Kafka connects heterogeneous systems by acting as a central hub for data exchange.
    • With Kafka Connect, it integrates with databases, storage systems, and other services for seamless data ingestion and export.
  10. Replayability:

    • Consumers can read data from a specific point (offset) in the stream, making Kafka suitable for debugging, audits, and reprocessing (see the sketch after this list).
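
To make replayability concrete, here is a minimal sketch using the plain Kafka Java client: it manually assigns one partition and re-reads it from a chosen offset. The topic name, partition number, and offset are illustrative assumptions, not values mandated by Kafka.

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;

import java.time.Duration;
import java.util.List;
import java.util.Properties;

public class ReplayConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // Manually assign one partition (no consumer-group rebalancing involved)
            TopicPartition partition = new TopicPartition("AircraftTelemetry", 0);
            consumer.assign(List.of(partition));
            consumer.seek(partition, 42L); // hypothetical offset: re-read everything from offset 42
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(5));
            for (ConsumerRecord<String, String> record : records) {
                System.out.printf("offset=%d key=%s value=%s%n",
                        record.offset(), record.key(), record.value());
            }
        }
    }
}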

Use Cases

  1. Log Aggregation:

    • Collects logs from different services and systems into a central platform for processing.
  2. Real-Time Analytics:

    • Enables event-driven analytics for dashboards, fraud detection, and recommendation systems.
  3. Stream Processing:

    • Transforms or aggregates event data for complex processing pipelines (a short Kafka Streams sketch follows this list).
  4. Event Sourcing:

    • Captures the state changes in applications as events for replaying and recovering state.
  5. Data Integration:

    • Acts as a bridge for real-time data movement between databases, cloud services, and applications.
  6. IoT and Monitoring:

    • Processes streams from IoT devices or monitoring systems in real-time.
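
As a small illustration of the stream-processing use case, the following minimal Kafka Streams sketch routes records from one topic to another. The topic names and the naive string-based filter are illustrative assumptions; a real pipeline would parse the JSON values and filter on the actual altitude field.

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

import java.util.Properties;

public class TelemetryStreams {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "telemetry-streams-demo");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> telemetry = builder.stream("AircraftTelemetry");
        // Naive string filter for illustration only; forward matching records to a second topic
        telemetry.filter((aircraftId, value) -> value.contains("\"altitude\""))
                 .to("HighAltitudeAlerts");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}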

Kafka Components

  1. Producer: Sends messages to Kafka topics.
  2. Broker: Servers that store and serve messages.
  3. Topic: A category where messages are published.
  4. Partition: A subset of a topic for parallelism.
  5. Consumer: Reads messages from topics.
  6. ZooKeeper (deprecated): Manages cluster metadata and leader election; newer Kafka versions replace it with the built-in KRaft metadata quorum.

graph LR
  A[Kafka Architecture] --> B[Core Components]
  A --> C[Data Flow]
  A --> D[Key Features]
  B --> B1[Producer]
  B1 --> B1a[Publishes messages to topics]
  B1 --> B1b[Defines message keys for partitioning]
  B --> B2[Broker]
  B2 --> B2a[Stores and manages messages]
  B2 --> B2b[Handles partition replication]
  B2 --> B2c[Distributes load across brokers in a cluster]
  B --> B3[Topic]
  B3 --> B3a[Logical category for messages]
  B3 --> B3b[Divided into partitions for scalability]
  B --> B4[Partition]
  B4 --> B4a[Subset of a topic]
  B4 --> B4b[Ensures message order within a partition]
  B --> B5[Consumer]
  B5 --> B5a[Subscribes to topics and reads messages]
  B5 --> B5b[Tracks offset for reprocessing or continuous reading]
  B --> B6[ZooKeeper/Metadata Quorum]
  B6 --> B6a[Manages metadata and leader election]
  B6 --> B6b[Ensures cluster coordination]
  C --> C1[Producers send messages to topics]
  C1 --> C1a[Messages are distributed across partitions]
  C --> C2[Brokers store messages persistently]
  C2 --> C2a[Messages are retained based on time or size limits]
  C --> C3[Consumers fetch messages]
  C3 --> C3a[Can read in real-time or replay old messages]
  D --> D1[High Throughput]
  D1 --> D1a[Handles millions of messages per second]
  D --> D2[Fault Tolerance]
  D2 --> D2a[Partition replication across brokers]
  D --> D3[Scalability]
  D3 --> D3a[Add more brokers and partitions]
  D --> D4[Replayability]
  D4 --> D4a[Consumers can re-read messages from any offset]
  D --> D5[Decoupled Systems]
  D5 --> D5a[Producers and consumers are independent]
  D --> D6[Integration with Stream Processing]
  D6 --> D6a[Kafka Streams, Flink, or Spark for real-time analytics]

  1. Core Components:

    • Producers: Applications that publish events/messages to Kafka topics.
    • Brokers: Kafka servers that store and serve messages to consumers.
    • Topics: Logical channels where messages are categorized.
    • Partitions: Subsets of topics for parallel processing and scalability.
    • Consumers: Applications that read messages from Kafka topics.
    • ZooKeeper/Metadata Quorum: Manages metadata and cluster coordination (newer versions replace ZooKeeper with Kafka's native quorum).
  2. Data Flow:

    • Producers send messages to topics, and these messages are distributed across partitions.
    • Brokers store the messages persistently, retaining them for a specified duration.
    • Consumers fetch these messages, either in real-time or by replaying historical data.
  3. Key Features:

    • High throughput, fault tolerance, and scalability.
    • Ability to replay messages and decouple producers and consumers.
    • Seamless integration with stream processing tools for analytics and transformation.
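
The partition count and replication factor discussed above are fixed when a topic is created. Below is a minimal sketch using the plain Kafka AdminClient; the topic name and the counts are illustrative (replication factor 1 assumes a single local broker).

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

import java.util.List;
import java.util.Map;

public class CreateTopic {
    public static void main(String[] args) throws Exception {
        try (AdminClient admin = AdminClient.create(
                Map.of(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"))) {
            // 3 partitions for parallelism; replication factor 1 for a single local broker
            NewTopic topic = new NewTopic("AircraftTelemetry", 3, (short) 1);
            admin.createTopics(List.of(topic)).all().get(); // blocks until the broker confirms
        }
    }
}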

graph LR
  A[Kafka Use Case: Real-Time Data Streaming Platform]
  %% Producers
  A --> B[Producers]
  B --> B1[Web Apps]
  B --> B2[IoT Devices]
  B --> B3[Logs & Monitoring]
  B --> B4[Database Change Events]
  %% Kafka Core
  A --> C[Kafka Core]
  C --> C1[Topics]
  C1 --> C2[Partitions]
  C --> C3[Brokers]
  C --> C4[Message Retention]
  C --> C5[Replication]
  %% Consumers
  A --> D[Consumers]
  D --> D1[Analytics & Dashboards]
  D --> D2[Stream Processing]
  D --> D3[Machine Learning Pipelines]
  D --> D4[Database Updates]
  D --> D5[Data Warehouses]
  %% Use Case Workflow
  subgraph Workflow
    E1[Producers publish events]
    E2[Kafka stores in partitions]
    E3[Consumers fetch messages]
    E4[Consumers process/store data]
  end
  C --> Workflow
  %% Monitoring and Integration
  A --> F[Monitoring & Integration]
  F --> F1[Prometheus & Grafana]
  F --> F2[Kafka Connect]
  F --> F3[ElasticSearch]
  %% Features
  subgraph Features
    G1[High Throughput]
    G2[Low Latency]
    G3[Scalability]
    G4[Fault Tolerance]
    G5[Message Replay]
  end
  A --> Features

  1. Producers:

    • Web Applications: Publish user interactions, transactions, or events to Kafka topics.
    • IoT Devices: Send sensor data for real-time processing.
    • Logs and Monitoring Tools: Centralize logs from various systems.
    • Database Change Events (CDC): Capture changes in relational databases (e.g., MySQL, PostgreSQL) to propagate updates.
  2. Kafka Core:

    • Topics act as logical channels for communication.
    • Partitions enable parallelism and scalability.
    • Brokers store and serve messages.
    • Replication ensures data durability and fault tolerance.
  3. Consumers:

    • Real-time dashboards for analytics.
    • Stream processing for data aggregation, filtering, or transformation.
    • Machine learning pipelines for real-time model training or inference.
    • Updates to relational databases.
    • Storing data in data warehouses for business intelligence.
  4. Workflow:

    • Events flow from producers to topics.
    • Kafka retains messages in partitions for a configurable period.
    • Consumers retrieve messages for processing or storage.
  5. Monitoring and Integration:

    • Use tools like Prometheus and Grafana for monitoring Kafka.
    • Integrate with external systems using Kafka Connect (e.g., ElasticSearch, Hadoop).
  6. Features:

    • Kafka’s high throughput, low latency, scalability, and fault tolerance make it ideal for real-time data streaming use cases.

Use Case in the Aviation Industry: Real-Time Flight Monitoring and Data Streaming

graph LR
  A[Kafka in Aviation Use Case: Real-Time Flight Monitoring]
  %% Producers
  A --> B[Producers]
  B --> B1[Aircraft Sensors]
  B --> B2[Air Traffic Control Systems]
  B --> B3[Airline Operations Centers]
  B --> B4[Weather Data Providers]
  %% Kafka Core
  A --> C[Kafka Core]
  C --> C1[Topics]
  C1 --> C1a[AircraftTelemetry]
  C1 --> C1b[FlightSchedules]
  C1 --> C1c[WeatherUpdates]
  C --> C2[Partitions]
  C --> C3[Brokers]
  C --> C4[Message Retention]
  C --> C5[Replication for Fault Tolerance]
  %% Consumers
  A --> D[Consumers]
  D --> D1[Airline Dashboards]
  D --> D2[Real-Time Alerts to Pilots and Ground Staff]
  D --> D3[Maintenance and Diagnostics Systems]
  D --> D4[Passenger Notification Systems]
  D --> D5[Predictive Analytics for Flight Efficiency]
  %% Use Case Workflow
  subgraph Workflow
    E1[Step 1: Sensors publish telemetry data to Kafka]
    E2[Step 2: Weather providers update weather information]
    E3[Step 3: Air traffic systems share flight schedules]
    E4[Step 4: Kafka stores and distributes data to consumers]
    E5[Step 5: Consumers process and display information]
  end
  C --> Workflow
  %% Features
  subgraph Features
    F1[Low Latency]
    F2[High Throughput]
    F3[Data Replay for Auditing]
    F4[Scalability for Multiple Flights]
    F5[Fault Tolerance]
  end
  A --> Features
  1. Producers:

    • Aircraft Sensors: Send real-time telemetry data, including speed, altitude, and system health.
    • Air Traffic Control Systems: Provide updates on flight schedules, airspace usage, and routing.
    • Airline Operations Centers: Share crew assignments, gate information, and ground operations data.
    • Weather Data Providers: Deliver live weather updates, turbulence warnings, and wind patterns.
  2. Kafka Core:

    • Topics like AircraftTelemetry, FlightSchedules, and WeatherUpdates organize data streams.
    • Partitions divide topics for scalability (e.g., telemetry by aircraft ID).
    • Brokers store messages, enabling real-time streaming and fault-tolerant data storage.
  3. Consumers:

    • Airline Dashboards: Display real-time flight statuses and operational insights.
    • Real-Time Alerts: Notify pilots or ground staff of critical updates (e.g., turbulence or maintenance needs).
    • Maintenance Systems: Analyze telemetry for predictive maintenance and diagnostics.
    • Passenger Notification Systems: Inform passengers of delays or gate changes.
    • Predictive Analytics: Use data to optimize flight paths, fuel efficiency, and scheduling.
  4. Workflow:

    • Data from producers (sensors, systems) is streamed to Kafka topics.
    • Consumers fetch this data for processing, visualization, and decision-making.
    • Kafka enables seamless communication between aviation systems.
  5. Features:

    • Kafka ensures low latency for critical updates, fault tolerance for reliability, and scalability to handle data from thousands of flights.

Project Setup

graph LR
  A[Spring Boot Application: Real-Time Aviation Monitoring]
  %% Producers
  A --> B[Telemetry Producer]
  B --> B1[REST API Endpoint: /api/telemetry/send]
  B --> B2[Kafka Template]
  B2 --> B3[Kafka Topic: AircraftTelemetry]
  %% Kafka Core
  A --> C[Kafka Core]
  C --> C1[Topics]
  C1 --> C2[AircraftTelemetry]
  C --> C3[Brokers]
  C --> C4[Partitions]
  C --> C5[Replication for Fault Tolerance]
  %% Consumers
  A --> D[Telemetry Consumer]
  D --> D1[Kafka Listener: AircraftTelemetry]
  D1 --> D2[Process Telemetry Data]
  D2 --> D3[Log Telemetry to Console]
  D2 --> D4[Save Data to Database]
  %% External Systems
  B1 --> E[Client Application]
  D --> F[Real-Time Dashboard]
  D --> G[Alerting System]
  D --> H[Data Analytics Pipeline]
  %% Enhancements
  subgraph Enhancements
    H1[Monitoring with Prometheus]
    H2[Grafana for Visualizing Metrics]
    H3[Stream Processing with Kafka Streams]
    H4[Database Integration for Historical Data]
  end
  A --> Enhancements

1. Spring Boot Application (A)

Central application for handling real-time aviation telemetry data.

2. Producers (B)

  • Telemetry Producer: Sends telemetry data to Kafka.
  • REST API Endpoint: Receives telemetry data via HTTP.
  • Kafka Template: Sends data to Kafka.
  • Kafka Topic (AircraftTelemetry): The channel where data is published.

3. Kafka Core (C)

  • Topics: Organizes data streams.
  • AircraftTelemetry: Specific topic for aircraft data.
  • Brokers: Kafka servers managing data.
  • Partitions: Allows parallel processing.
  • Replication: Ensures fault tolerance.

4. Consumers (D)

  • Telemetry Consumer: Subscribes to the Kafka topic to process data.
  • Kafka Listener: Listens for incoming data.
  • Process Telemetry Data: Processes the received data.
  • Log to Console: Logs data for monitoring.
  • Save to Database: Stores data for future use.

5. External Systems

  • Client Application: Sends data via the REST API.
  • Real-Time Dashboard: Displays data in real-time.
  • Alerting System: Notifies about anomalies.
  • Data Analytics Pipeline: Analyzes telemetry data.

6. Enhancements

  • Prometheus/Grafana: Monitoring and visualization.
  • Kafka Streams: Stream processing.
  • Database Integration: Stores historical data.

Create a Spring Boot Project

Add the following dependencies:
  • Spring Web
  • Spring Kafka
  • Spring Boot Actuator (for monitoring)
  • Spring DevTools (optional for development)
<dependency>
    <groupId>org.springframework.kafka</groupId>
    <artifactId>spring-kafka</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
</dependency>

application.properties

spring.kafka.bootstrap-servers=localhost:9092
spring.kafka.consumer.group-id=aviation-group
spring.kafka.consumer.auto-offset-reset=earliest
spring.kafka.producer.key-serializer=org.apache.kafka.common.serialization.StringSerializer
spring.kafka.producer.value-serializer=org.apache.kafka.common.serialization.StringSerializer
spring.kafka.consumer.key-deserializer=org.apache.kafka.common.serialization.StringDeserializer
spring.kafka.consumer.value-deserializer=org.apache.kafka.common.serialization.StringDeserializer
  1. spring.kafka.bootstrap-servers=localhost:9092

    • This specifies the Kafka broker (or cluster of brokers) the application connects to.
    • In this case, it’s a single Kafka broker running on localhost at port 9092.
  2. spring.kafka.consumer.group-id=aviation-group

    • Kafka consumers are organized into groups. Each consumer in a group is assigned a subset of partitions to read messages.
    • Here, the consumer group is identified as aviation-group.
  3. spring.kafka.consumer.auto-offset-reset=earliest

    • Determines what the consumer should do if there is no initial offset or if the current offset is invalid (e.g., messages are deleted).
    • earliest: Reads messages starting from the beginning of the topic's log.
  4. spring.kafka.producer.key-serializer and spring.kafka.producer.value-serializer

    • Producers need to serialize the keys and values of messages into a format Kafka can store and transmit.
    • These properties specify the serializers for keys and values:
      • org.apache.kafka.common.serialization.StringSerializer is used to convert data to String format.
  5. spring.kafka.consumer.key-deserializer and spring.kafka.consumer.value-deserializer

    • Consumers need to deserialize the keys and values of messages received from Kafka.
    • These properties specify the deserializers for keys and values:
      • org.apache.kafka.common.serialization.StringDeserializer is used to convert data from Kafka's format back to String.
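
Optionally, the application can declare the topic as a bean so that Spring Boot's auto-configured KafkaAdmin creates it at startup. A minimal sketch; the partition and replica counts are illustrative and assume the single local broker configured above:

package com.example.aviation.config;

import org.apache.kafka.clients.admin.NewTopic;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.config.TopicBuilder;

@Configuration
public class KafkaTopicConfig {

    // Created automatically at startup by Spring Boot's KafkaAdmin if it does not already exist
    @Bean
    public NewTopic aircraftTelemetryTopic() {
        return TopicBuilder.name("AircraftTelemetry")
                .partitions(3)  // illustrative: e.g., spread aircraft across three partitions
                .replicas(1)    // single local broker, so replication factor 1
                .build();
    }
}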

Step 1: Define Models

Create a model for telemetry data from aircraft sensors:

package com.example.aviation.model;

public class AircraftTelemetry {
    private String aircraftId;
    private double latitude;
    private double longitude;
    private double altitude;
    private double speed;

    // Getters and Setters
    public String getAircraftId() { return aircraftId; }
    public void setAircraftId(String aircraftId) { this.aircraftId = aircraftId; }
    public double getLatitude() { return latitude; }
    public void setLatitude(double latitude) { this.latitude = latitude; }
    public double getLongitude() { return longitude; }
    public void setLongitude(double longitude) { this.longitude = longitude; }
    public double getAltitude() { return altitude; }
    public void setAltitude(double altitude) { this.altitude = altitude; }
    public double getSpeed() { return speed; }
    public void setSpeed(double speed) { this.speed = speed; }

    // Readable output for the consumer's console log
    @Override
    public String toString() {
        return "AircraftTelemetry{aircraftId='" + aircraftId + "', latitude=" + latitude +
                ", longitude=" + longitude + ", altitude=" + altitude + ", speed=" + speed + "}";
    }
}

Step 2: Producer Implementation

Create a Kafka producer to send telemetry data.

package com.example.aviation.producer;

import com.example.aviation.model.AircraftTelemetry;
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.stereotype.Service;

@Service
public class TelemetryProducer {

    private static final String TOPIC = "AircraftTelemetry";

    private final ObjectMapper objectMapper = new ObjectMapper();

    @Autowired
    private KafkaTemplate<String, String> kafkaTemplate;

    public void sendTelemetry(AircraftTelemetry telemetry) {
        try {
            // Serialize to JSON so the payload matches the configured StringSerializer
            String payload = objectMapper.writeValueAsString(telemetry);
            // Keying by aircraftId keeps each aircraft's events ordered within one partition
            kafkaTemplate.send(TOPIC, telemetry.getAircraftId(), payload);
        } catch (JsonProcessingException e) {
            throw new IllegalStateException("Could not serialize telemetry", e);
        }
    }
}

Step 3: Consumer Implementation

Create a Kafka consumer to process telemetry data.

package com.example.aviation.consumer;

import com.example.aviation.model.AircraftTelemetry;
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Service;

@Service
public class TelemetryConsumer {

    private final ObjectMapper objectMapper = new ObjectMapper();

    @KafkaListener(topics = "AircraftTelemetry", groupId = "aviation-group")
    public void consumeTelemetry(String message) throws JsonProcessingException {
        // The record value arrives as a JSON string (StringDeserializer); map it back to the model
        AircraftTelemetry telemetry = objectMapper.readValue(message, AircraftTelemetry.class);
        System.out.println("Received telemetry: " + telemetry);
        // Add logic to process or save telemetry data
    }
}

Step 4: REST API for Sending Data

Expose an endpoint to allow external systems to send telemetry data.

package com.example.aviation.controller;

import com.example.aviation.model.AircraftTelemetry;
import com.example.aviation.producer.TelemetryProducer;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.*;

@RestController
@RequestMapping("/api/telemetry")
public class TelemetryController {

    @Autowired
    private TelemetryProducer producer;

    @PostMapping("/send")
    public String sendTelemetry(@RequestBody AircraftTelemetry telemetry) {
        producer.sendTelemetry(telemetry);
        return "Telemetry sent successfully!";
    }
}
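
To exercise the endpoint, any HTTP client works. Below is a hypothetical standalone client using the JDK's java.net.http; the field values are made up for illustration.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class TelemetryClient {
    public static void main(String[] args) throws Exception {
        // Illustrative telemetry payload matching the AircraftTelemetry model
        String json = """
                {"aircraftId":"AF-1234","latitude":48.8566,"longitude":2.3522,
                 "altitude":11000.0,"speed":830.5}
                """;
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8080/api/telemetry/send"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}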

Kafka vs RabbitMQ

graph LR
  A[Kafka vs RabbitMQ] --> B[Use Case]
  A --> C[Architecture]
  A --> D[Scalability and Performance]
  A --> E[Messaging Patterns]
  A --> F[Durability and Reliability]
  A --> G[Tooling and Ecosystem]
  A --> H[Protocols]
  B --> B1[Kafka: High-throughput, real-time streaming]
  B --> B2[RabbitMQ: Task queues and low-latency messaging]
  C --> C1[Kafka: Log-based storage]
  C --> C2[RabbitMQ: Queue-based storage]
  C --> C3[Kafka: Consumer-managed offsets]
  C --> C4[RabbitMQ: Explicit acknowledgment]
  D --> D1[Kafka: Extremely high throughput]
  D --> D2[RabbitMQ: Lower throughput, low latency]
  D --> D3[Kafka: Horizontal scalability with partitions]
  D --> D4[RabbitMQ: Scalable with queues]
  E --> E1[Kafka: Built-in pub-sub model]
  E --> E2[RabbitMQ: Advanced routing with exchanges]
  E --> E3[Kafka: Limited point-to-point capabilities]
  E --> E4[RabbitMQ: Designed for point-to-point messaging]
  F --> F1[Kafka: Retains messages for reprocessing]
  F --> F2[RabbitMQ: Messages typically deleted after consumption]
  F --> F3[Kafka: Fault-tolerant with replication]
  F --> F4[RabbitMQ: Reliable with HA queues]
  G --> G1[Kafka: More complex setup, high learning curve]
  G --> G2[RabbitMQ: Easier configuration and usage]
  G --> G3[Kafka: Ecosystem with Kafka Streams and Connect]
  G --> G4[RabbitMQ: AMQP protocol support and plugins]
  H --> H1[Kafka: Uses its own binary protocol]
  H --> H2[RabbitMQ: Supports AMQP, MQTT, STOMP, HTTP]