Friday, February 14, 2025

Building a Retrieval-Augmented Generation (RAG) Application with Ollama 3.2 and Spring Boot

This blog post demonstrates how to build a Retrieval-Augmented Generation (RAG) application using the Llama 3.2 model, served locally through Ollama, and Spring Boot for the REST API. RAG retrieves passages relevant to a question and passes them to the large language model (LLM) as context, which yields more accurate and contextually relevant answers. We'll use Docker Desktop for containerization and pgvector for vector storage.

Project Setup

We'll use Spring Boot version 3.3.7 for this project. Here's a breakdown of the key components and configurations:

1. Dependencies (Gradle):

dependencies {
    implementation 'org.springframework.boot:spring-boot-starter-jdbc'
    implementation 'org.springframework.boot:spring-boot-starter-web'
    implementation 'com.fasterxml.jackson.module:jackson-module-kotlin'
    implementation 'org.springframework.ai:spring-ai-ollama-spring-boot-starter'
    implementation 'org.springframework.ai:spring-ai-pgvector-store-spring-boot-starter'
}

This includes the necessary Spring Boot starters, Jackson for Kotlin support, and the Spring AI libraries for Ollama and pgvector integration.

2. application.properties:

spring.application.name=spring-boot-ai
server.port=8082

spring.ai.ollama.embedding.model=mxbai-embed-large
spring.ai.ollama.chat.model=llama3.2

spring.datasource.url=jdbc:postgresql://localhost:5432/sbdocs
spring.datasource.username=admin
spring.datasource.password=password

spring.ai.vectorstore.pgvector.initialize-schema=true
spring.ai.vectorstore.pgvector.index-type=HNSW
spring.ai.vectorstore.pgvector.distance-type=COSINE_DISTANCE
spring.ai.vectorstore.pgvector.dimensions=1024

spring.docker.compose.lifecycle-management=start_only

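This configuration sets the application name, port, Ollama model names, database connection details, and pgvector settings. The 1024 dimensions match the output size of the mxbai-embed-large embedding model. With spring.docker.compose.lifecycle-management=start_only, Spring Boot starts the services defined in the project's Docker Compose file when the application starts and leaves them running when it shuts down. The property takes effect only when the spring-boot-docker-compose dependency is on the classpath.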

3. RagConfiguration.kt:


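import java.io.File
import kotlin.io.path.Path
import org.springframework.ai.embedding.EmbeddingModel
import org.springframework.ai.reader.TextReader
import org.springframework.ai.transformer.splitter.TokenTextSplitter
import org.springframework.ai.vectorstore.SimpleVectorStore
import org.springframework.beans.factory.annotation.Value
import org.springframework.context.annotation.Bean
import org.springframework.context.annotation.Configuration
import org.springframework.core.io.Resource

@Configuration
open class RagConfiguration {

    // File name of the serialized vector store (a literal value, not a property placeholder).
    @Value("myDataVector.json")
    lateinit var myDataVectorName: String

    // Source document that is embedded on the first run.
    @Value("classpath:/docs/myData.txt")
    lateinit var originalArticle: Resource

    @Bean
    open fun getVector(embeddingModel: EmbeddingModel): SimpleVectorStore {
        val simpleVectorStore = SimpleVectorStore(embeddingModel)
        val vectorStoreFile = getVectorStoreFile()
        if (vectorStoreFile.exists()) {
            // Reuse the embeddings computed on a previous run.
            simpleVectorStore.load(vectorStoreFile)
        } else {
            // Read the document, split it into chunks, embed them, and cache the result on disk.
            val textReader = TextReader(originalArticle)
            textReader.customMetadata["filename"] = "myData.txt"
            val documents = textReader.get()
            val splitDocs = TokenTextSplitter()
                .split(documents)
            simpleVectorStore.add(splitDocs)
            simpleVectorStore.save(vectorStoreFile)
        }
        return simpleVectorStore
    }

    // Path is relative to the working directory, so this works when run from the project root.
    private fun getVectorStoreFile(): File {
        val path = Path("src", "main", "resources", "docs", myDataVectorName)
        return path.toFile()
    }
}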
    
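This configuration class creates a SimpleVectorStore bean, an in-memory vector store that can be saved to and loaded from a JSON file. If myDataVector.json already exists, the stored vectors are loaded from that file. Otherwise the class reads myData.txt, splits it into chunks, embeds the chunks with the configured embedding model, and saves the result to the file for the next run.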
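
To make the chunk-and-embed step more concrete, here is a minimal sketch in plain Kotlin of what a vector store does at query time: it compares the query vector with each stored vector and returns the closest ones. The helper names (chunkByWords, cosineSimilarity, topK) are hypothetical and not part of Spring AI. The real TokenTextSplitter splits on model tokens rather than words, and real vectors come from the embedding model rather than hand-written arrays.

```kotlin
import kotlin.math.sqrt

// Hypothetical stand-in for a text splitter: chunks of at most `maxWords` words.
fun chunkByWords(text: String, maxWords: Int): List<String> =
    text.trim().split(Regex("\\s+"))
        .filter { it.isNotEmpty() }
        .chunked(maxWords)
        .map { it.joinToString(" ") }

// Cosine similarity between two vectors of the same dimension.
fun cosineSimilarity(a: FloatArray, b: FloatArray): Double {
    require(a.size == b.size) { "vectors must have the same dimension" }
    var dot = 0.0
    var normA = 0.0
    var normB = 0.0
    for (i in a.indices) {
        dot += a[i] * b[i]
        normA += a[i] * a[i]
        normB += b[i] * b[i]
    }
    return dot / (sqrt(normA) * sqrt(normB))
}

// Indices of the k stored vectors most similar to the query, best match first.
fun topK(query: FloatArray, stored: List<FloatArray>, k: Int): List<Int> =
    stored.indices.sortedByDescending { cosineSimilarity(query, stored[it]) }.take(k)

fun main() {
    println(chunkByWords("one two three four five", 2)) // [one two, three four, five]
    val stored = listOf(floatArrayOf(1f, 0f), floatArrayOf(0f, 1f), floatArrayOf(1f, 1f))
    println(topK(floatArrayOf(1f, 0.1f), stored, 2))    // [0, 2]
}
```

A linear scan like this is fine for a single small document; index types such as HNSW exist to avoid scanning every vector once the collection grows.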

4. RagController.kt:


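import org.springframework.ai.chat.client.ChatClient
import org.springframework.ai.chat.client.advisor.QuestionAnswerAdvisor
import org.springframework.ai.vectorstore.SearchRequest
import org.springframework.ai.vectorstore.SimpleVectorStore
import org.springframework.beans.factory.annotation.Value
import org.springframework.core.io.Resource
import org.springframework.web.bind.annotation.GetMapping
import org.springframework.web.bind.annotation.RequestMapping
import org.springframework.web.bind.annotation.RequestParam
import org.springframework.web.bind.annotation.RestController

@RestController
@RequestMapping("/rag")
class RagController(chatClientBuilder: ChatClient.Builder, val vectorStore: SimpleVectorStore) {

    // Spring AI auto-configures a ChatClient.Builder rather than a ChatClient bean.
    private val chatClient: ChatClient = chatClientBuilder.build()

    // Custom prompt template; injected but not used by this endpoint.
    @Value("classpath:/prompts/ragPrompt.st")
    lateinit var ragPrompt: Resource

    @GetMapping("question")
    fun getAnswer(@RequestParam(name = "question", defaultValue = "What is the latest news about Olympics?") question: String): String? {
        // The advisor retrieves similar chunks from the vector store and adds them to the prompt.
        return chatClient.prompt()
            .advisors(QuestionAnswerAdvisor(vectorStore, SearchRequest.defaults()))
            .user(question)
            .call()
            .content()
    }
}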
    
This controller defines a /rag/question endpoint that takes a question as a parameter. It uses the ChatClient and QuestionAnswerAdvisor to query the Ollama model, retrieving relevant context from the vectorStore and generating an answer.
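
Conceptually, the advisor performs a retrieve-then-prompt step: it looks up the chunks most similar to the question and places them in the text sent to the model. The sketch below shows that idea in plain Kotlin. The function name buildAugmentedPrompt and the instruction wording are assumptions made for illustration; the advisor's actual default template differs.

```kotlin
// Hypothetical illustration of prompt augmentation; not the advisor's real template.
fun buildAugmentedPrompt(question: String, retrievedChunks: List<String>): String = buildString {
    appendLine(question)
    appendLine()
    appendLine("Answer using only the context below. If the context does not contain the answer, say so.")
    appendLine()
    appendLine("Context:")
    // Number the chunks so the model (and a human reader) can tell them apart.
    retrievedChunks.forEachIndexed { index, chunk -> appendLine("[${index + 1}] $chunk") }
}.trimEnd()

fun main() {
    val chunks = listOf("pgvector adds vector search to PostgreSQL.", "Ollama runs models locally.")
    println(buildAugmentedPrompt("What does pgvector do?", chunks))
}
```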

Running the Application with Docker

1. Start pgvector Docker Container:

docker run --name pgvector-container -e POSTGRES_USER=admin -e POSTGRES_PASSWORD=password -e POSTGRES_DB=sbdocs -d -p 5432:5432 pgvector/pgvector:0.8.0-pg1

2. Pull Ollama Models:

In Docker Desktop, open the Exec tab of the springboot-ai-ollama-1 container (or run docker exec -it springboot-ai-ollama-1 sh from a terminal) and pull the two models:

ollama pull llama3.2
ollama pull mxbai-embed-large

3. Run the Spring Boot Application:

Start your Spring Boot application. Because spring.docker.compose.lifecycle-management is set to start_only, Spring Boot starts the services defined in the Docker Compose file during startup and leaves them running when the application stops.

4. Access the API:

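You can now access the RAG API at http://localhost:8082/rag/question?question=Your+question+here. The question must be URL-encoded; if the parameter is omitted, the controller falls back to its default question.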
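
As a rough sketch of calling the endpoint from code, the snippet below builds the URL with a properly encoded question and sends a GET request using the JDK's built-in HTTP client. The function names buildQuestionUri and askQuestion are hypothetical, and the base URL assumes the server.port=8082 setting from this post.

```kotlin
import java.net.URI
import java.net.URLEncoder
import java.net.http.HttpClient
import java.net.http.HttpRequest
import java.net.http.HttpResponse
import java.nio.charset.StandardCharsets

// Builds the request URI; spaces and reserved characters in the question are form-encoded.
fun buildQuestionUri(question: String, baseUrl: String = "http://localhost:8082"): URI {
    val encoded = URLEncoder.encode(question, StandardCharsets.UTF_8)
    return URI.create("$baseUrl/rag/question?question=$encoded")
}

// Sends the question to the running application and returns the response body.
fun askQuestion(question: String): String {
    val request = HttpRequest.newBuilder(buildQuestionUri(question)).GET().build()
    return HttpClient.newHttpClient()
        .send(request, HttpResponse.BodyHandlers.ofString())
        .body()
}

fun main() {
    // Only prints the URI; call askQuestion(...) once the application is running.
    println(buildQuestionUri("What is RAG?"))
}
```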

This setup runs Llama 3.2 locally through Ollama and exposes it to a RAG workflow through a Spring Boot REST API, with Docker keeping the supporting services easy to start and reproduce. Remember to replace placeholder values like database credentials and file paths with your actual values. This example provides a foundation that you can extend to build more complex RAG applications.

Tuesday, December 31, 2024

Securing Microservices with JWT Authentication and Data Encryption

In modern microservices architectures, securing communication and protecting data integrity are paramount. This article explores how JWT (JSON Web Token) authentication and data encryption can bolster security, ensuring that data exchanges between services remain confidential and trusted.

What is JWT Authentication?

JWT is a compact, URL-safe token format that carries claims between parties as a signed JSON object. The signature makes tampering detectable, but the payload is only Base64URL-encoded, not encrypted. JWT is widely used in microservices for its simplicity and efficiency.

Parts of a JWT Token

A JSON Web Token (JWT) consists of three parts, separated by periods (.):

  • Header: Specifies the token type (JWT) and signing algorithm (e.g., HS256 or RS256).
    Example: { "alg": "HS256", "typ": "JWT" }
  • Payload: Contains claims about the user or the token itself. Claims can be:
    • Registered claims: Predefined fields like iss (issuer), sub (subject), exp (expiration time), etc.
    • Public claims: Custom claims with collision-resistant names, either registered in the IANA JSON Web Token Claims registry or namespaced with a URI.
    • Private claims: Custom claims agreed on between the parties that use the token, such as user roles or internal user IDs.
    Example: { "sub": "1234567890", "name": "John Doe", "admin": true, "iat": 1516239022 }
  • Signature: Ensures the token's integrity and authenticity. It is generated by signing the encoded header and payload with a secret or private key.
    Example for HMAC-SHA256:
    HMACSHA256( base64UrlEncode(header) + "." + base64UrlEncode(payload), secret )

A full JWT might look like this: eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ.SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c
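
To see the three parts in practice, the following sketch splits the sample token above and Base64URL-decodes its header and payload using only the JDK. The DecodedJwt class and decodeJwt function are hypothetical names chosen for this illustration. Decoding does not verify the signature, so it must never be used on its own to trust a token.

```kotlin
import java.util.Base64

// Hypothetical holder for the decoded parts; the signature is kept in its encoded form.
data class DecodedJwt(val header: String, val payload: String, val signature: String)

fun decodeJwt(token: String): DecodedJwt {
    val parts = token.split(".")
    require(parts.size == 3) { "a signed JWT has exactly three dot-separated parts" }
    val decoder = Base64.getUrlDecoder() // accepts the unpadded Base64URL used by JWTs
    return DecodedJwt(
        header = String(decoder.decode(parts[0]), Charsets.UTF_8),
        payload = String(decoder.decode(parts[1]), Charsets.UTF_8),
        signature = parts[2],
    )
}

fun main() {
    val token = "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9." +
        "eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ." +
        "SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c"
    val decoded = decodeJwt(token)
    println(decoded.header)  // {"alg":"HS256","typ":"JWT"}
    println(decoded.payload) // {"sub":"1234567890","name":"John Doe","iat":1516239022}
}
```

Because anyone holding the token can read these claims, sensitive data should not be placed in the payload of a signed-only JWT.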

Shared Key vs. Public Key JWT in Microservices

Shared Key-Based JWT:

  1. How It Works:
    • A single secret key is used for both signing and verifying the token.
    • This secret must be shared between the microservices.
  2. Advantages:
    • Simple setup.
    • Suitable for small-scale systems with fewer services.
  3. Disadvantages:
    • Security Risk: If the key is compromised, all services relying on it are at risk.
    • Key Distribution: Sharing the key securely across multiple services can be challenging.
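
The shared-key flow can be sketched with the JDK's HMAC support: the same secret both produces and checks the signature. This is a minimal illustration, not a full JWT library. The function names signHs256 and verifyHs256 are hypothetical, and a production verifier would also validate the alg header and the registered claims.

```kotlin
import java.security.MessageDigest
import java.util.Base64
import javax.crypto.Mac
import javax.crypto.spec.SecretKeySpec

private val urlEncoder = Base64.getUrlEncoder().withoutPadding()

private fun hmacSha256(data: String, secret: ByteArray): ByteArray {
    val mac = Mac.getInstance("HmacSHA256")
    mac.init(SecretKeySpec(secret, "HmacSHA256"))
    return mac.doFinal(data.toByteArray(Charsets.US_ASCII))
}

// Builds header.payload.signature, signing the first two parts with the shared secret.
fun signHs256(headerJson: String, payloadJson: String, secret: ByteArray): String {
    val signingInput = urlEncoder.encodeToString(headerJson.toByteArray()) + "." +
        urlEncoder.encodeToString(payloadJson.toByteArray())
    return signingInput + "." + urlEncoder.encodeToString(hmacSha256(signingInput, secret))
}

// Recomputes the signature with the same secret and compares in constant time.
fun verifyHs256(token: String, secret: ByteArray): Boolean {
    val parts = token.split(".")
    if (parts.size != 3) return false
    val expected = hmacSha256(parts[0] + "." + parts[1], secret)
    val actual = try {
        Base64.getUrlDecoder().decode(parts[2])
    } catch (e: IllegalArgumentException) {
        return false
    }
    return MessageDigest.isEqual(expected, actual)
}

fun main() {
    val secret = "demo-secret-shared-by-all-services".toByteArray() // placeholder value
    val token = signHs256("""{"alg":"HS256","typ":"JWT"}""", """{"sub":"1234567890"}""", secret)
    println(verifyHs256(token, secret))                       // true
    println(verifyHs256(token, "wrong-secret".toByteArray())) // false
}
```

Every service that can verify a token with this secret can also mint one, which is the security risk described above.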

Public Key-Based JWT:

  1. How It Works:
    • The authentication server uses a private key to sign the JWT.
    • Microservices use a public key to verify the token's signature.
  2. Advantages:
    • Better Security: The private key remains on the authentication server, and only the public key is distributed.
    • Scalability: New services can independently verify tokens without needing access to the private key.
    • No Shared Secrets: Eliminates the need to distribute a secret key.
  3. Disadvantages:
    • Slightly more complex setup due to key management.
    • Requires a system to distribute the public key, like a JWKS (JSON Web Key Set) endpoint.
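
A minimal sketch of the public-key flow, using the JDK's Signature API with SHA256withRSA (the algorithm behind RS256): the authentication server signs with the private key, and any service holding the public key can verify. The function names and the in-memory key pair are assumptions for illustration; in a real system the public key would typically be fetched from a JWKS endpoint.

```kotlin
import java.security.KeyPairGenerator
import java.security.PrivateKey
import java.security.PublicKey
import java.security.Signature
import java.util.Base64

// Authentication server side: sign "base64url(header).base64url(payload)" with the private key.
fun signRs256(signingInput: String, privateKey: PrivateKey): String {
    val signer = Signature.getInstance("SHA256withRSA")
    signer.initSign(privateKey)
    signer.update(signingInput.toByteArray(Charsets.US_ASCII))
    return Base64.getUrlEncoder().withoutPadding().encodeToString(signer.sign())
}

// Microservice side: only the public key is needed to check the signature.
fun verifyRs256(signingInput: String, signatureB64: String, publicKey: PublicKey): Boolean {
    val verifier = Signature.getInstance("SHA256withRSA")
    verifier.initVerify(publicKey)
    verifier.update(signingInput.toByteArray(Charsets.US_ASCII))
    return verifier.verify(Base64.getUrlDecoder().decode(signatureB64))
}

fun main() {
    // Demo key pair generated in memory; a real deployment loads keys from secure storage.
    val keyPair = KeyPairGenerator.getInstance("RSA").apply { initialize(2048) }.generateKeyPair()
    val signingInput = "encodedHeader.encodedPayload" // placeholder for the two encoded parts
    val signature = signRs256(signingInput, keyPair.private)
    println(verifyRs256(signingInput, signature, keyPair.public))                      // true
    println(verifyRs256("encodedHeader.tamperedPayload", signature, keyPair.public)) // false
}
```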

Data Encryption in Microservices

Encryption ensures sensitive data remains confidential and secure during transmission and storage.

Types of Encryption

  • Symmetric Encryption: Uses the same key for encryption and decryption.
  • Asymmetric Encryption: Utilizes a public key for encryption and a private key for decryption.
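
As an illustration of symmetric encryption, the sketch below encrypts and decrypts a byte array with AES in GCM mode using the JDK's javax.crypto API. The function names and the layout of the output (a 12-byte IV followed by the ciphertext) are choices made for this example, not a standard message format.

```kotlin
import java.security.SecureRandom
import javax.crypto.Cipher
import javax.crypto.KeyGenerator
import javax.crypto.SecretKey
import javax.crypto.spec.GCMParameterSpec

private const val IV_BYTES = 12   // recommended IV length for GCM
private const val TAG_BITS = 128  // authentication tag length

// Returns IV || ciphertext (the ciphertext includes the GCM authentication tag).
fun encryptAesGcm(plaintext: ByteArray, key: SecretKey): ByteArray {
    val iv = ByteArray(IV_BYTES).also { SecureRandom().nextBytes(it) } // fresh IV per message
    val cipher = Cipher.getInstance("AES/GCM/NoPadding")
    cipher.init(Cipher.ENCRYPT_MODE, key, GCMParameterSpec(TAG_BITS, iv))
    return iv + cipher.doFinal(plaintext)
}

// Reads the IV from the front of the message, then decrypts and authenticates the rest.
fun decryptAesGcm(message: ByteArray, key: SecretKey): ByteArray {
    val iv = message.copyOfRange(0, IV_BYTES)
    val cipher = Cipher.getInstance("AES/GCM/NoPadding")
    cipher.init(Cipher.DECRYPT_MODE, key, GCMParameterSpec(TAG_BITS, iv))
    return cipher.doFinal(message, IV_BYTES, message.size - IV_BYTES)
}

fun main() {
    val key = KeyGenerator.getInstance("AES").apply { init(256) }.generateKey()
    val encrypted = encryptAesGcm("account=42".toByteArray(), key)
    println(String(decryptAesGcm(encrypted, key))) // account=42
}
```

The same key is needed on both sides, so this approach inherits the key-distribution problem of any shared secret.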

Encryption in Microservices Communication

  • Transport-Level Encryption: Secures data in transit using TLS (HTTPS).
  • Message-Level Encryption: Encrypts specific message payloads for added confidentiality.
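
One common way to apply message-level encryption between services is a hybrid scheme: the payload is encrypted with a fresh symmetric content key, and that content key is encrypted with the recipient's public key. The sketch below covers only the key-wrapping half, using RSA-OAEP from the JDK. The function names wrapKey and unwrapKey are hypothetical, and the choice of RSA-OAEP with an AES content key is an assumption of this example.

```kotlin
import java.security.KeyPairGenerator
import java.security.PrivateKey
import java.security.PublicKey
import javax.crypto.Cipher
import javax.crypto.KeyGenerator
import javax.crypto.SecretKey
import javax.crypto.spec.SecretKeySpec

private const val RSA_OAEP = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding"

// Sender: encrypt the symmetric content key with the recipient's public key.
fun wrapKey(contentKey: SecretKey, recipientPublicKey: PublicKey): ByteArray {
    val cipher = Cipher.getInstance(RSA_OAEP)
    cipher.init(Cipher.ENCRYPT_MODE, recipientPublicKey)
    return cipher.doFinal(contentKey.encoded)
}

// Recipient: recover the content key with the private key, then use it to decrypt the payload.
fun unwrapKey(wrappedKey: ByteArray, recipientPrivateKey: PrivateKey): SecretKey {
    val cipher = Cipher.getInstance(RSA_OAEP)
    cipher.init(Cipher.DECRYPT_MODE, recipientPrivateKey)
    return SecretKeySpec(cipher.doFinal(wrappedKey), "AES")
}

fun main() {
    val recipient = KeyPairGenerator.getInstance("RSA").apply { initialize(2048) }.generateKeyPair()
    val contentKey = KeyGenerator.getInstance("AES").apply { init(256) }.generateKey()
    val wrapped = wrapKey(contentKey, recipient.public)
    val recovered = unwrapKey(wrapped, recipient.private)
    println(recovered.encoded.contentEquals(contentKey.encoded)) // true
}
```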

Combining JWT and Encryption

  • Token Encryption: Encrypting the token (JWE, JSON Web Encryption) keeps its claims unreadable if the token is intercepted, whereas a signed-only JWT can be decoded by anyone who obtains it.
  • Public Key Infrastructure: Manages keys securely for token validation and encrypted communication.

Best Practices

  • Set reasonable expiration times for tokens and use refresh tokens for longer sessions.
  • Rotate encryption keys periodically to minimize security risks.
  • Audit and log token usage to detect anomalies.
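
As a small example of the first practice, the check below decides whether a token's exp claim (seconds since the Unix epoch) has passed, with a small allowance for clock differences between services. The function name isExpired and the 30-second default skew are assumptions made for this sketch.

```kotlin
import java.time.Duration
import java.time.Instant

// Returns true once `now` is later than the expiration time plus the allowed clock skew.
fun isExpired(
    expEpochSeconds: Long,
    now: Instant = Instant.now(),
    allowedSkew: Duration = Duration.ofSeconds(30),
): Boolean = now.isAfter(Instant.ofEpochSecond(expEpochSeconds).plus(allowedSkew))

fun main() {
    val exp = 1_700_000_000L
    println(isExpired(exp, now = Instant.ofEpochSecond(exp + 10))) // false: within the skew window
    println(isExpired(exp, now = Instant.ofEpochSecond(exp + 31))) // true
}
```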

Conclusion

JWT authentication and encryption are foundational to building secure microservices. By combining these technologies, you can ensure robust authentication, data confidentiality, and integrity across your system. Follow best practices to simplify implementation and focus on delivering high-quality services.
