{"id":2233,"date":"2025-07-11T12:36:50","date_gmt":"2025-07-11T12:36:50","guid":{"rendered":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/2025\/07\/11\/build-a-genai-app-with-java-using-spring-ai-and-docker-model-runner\/"},"modified":"2025-07-11T12:36:50","modified_gmt":"2025-07-11T12:36:50","slug":"build-a-genai-app-with-java-using-spring-ai-and-docker-model-runner","status":"publish","type":"post","link":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/2025\/07\/11\/build-a-genai-app-with-java-using-spring-ai-and-docker-model-runner\/","title":{"rendered":"Build a GenAI App With Java Using Spring AI and Docker Model Runner"},"content":{"rendered":"<p>When thinking about starting a Generative AI (GenAI) project, you might assume that Python is required to get started in this new space. However, if you\u2019re already a Java developer, there\u2019s no need to learn a new language. The Java ecosystem offers robust tools and libraries that make building GenAI applications both accessible and productive. <\/p>\n<p>In this blog, you\u2019ll learn how to build a GenAI app using Java. We\u2019ll do a step-by-step demo to show you how RAG enhances the model response, using Spring AI and Docker tools. Spring AI integrates with many model providers (for both chat and embeddings), vector databases, and more. In our example, we\u2019ll use the OpenAI and Qdrant modules provided by the Spring AI project to take advantage of built-in support for these integrations. Additionally, we\u2019ll use <a href=\"https:\/\/www.docker.com\/products\/model-runner\/\">Docker Model Runner<\/a> (instead of a cloud-hosted OpenAI model), which offers an OpenAI-compatible API, making it easy to <a href=\"https:\/\/www.docker.com\/blog\/run-llms-locally\/\">run AI models locally<\/a>. We\u2019ll automate the testing process using Testcontainers and Spring AI\u2019s tools to ensure the LLM\u2019s answers are contextually grounded in the documents we\u2019ve provided. 
Last, we\u2019ll show you how to use Grafana for observability and ensure our app behaves as designed.<\/p>\n<h2 class=\"wp-block-heading\">Getting started<\/h2>\n<p>Let\u2019s start building a sample application by going to <a href=\"https:\/\/start.spring.io\/\" target=\"_blank\">Spring Initializr<\/a> and choosing the following dependencies: Web, OpenAI, Qdrant Vector Database, and Testcontainers.<\/p>\n<p>It\u2019ll have two endpoints: a \u201c\/chat\u201d endpoint that interacts directly with the model and a \u201c\/rag\u201d endpoint that provides the model with additional context from documents stored in the vector database.<\/p>\n<h2 class=\"wp-block-heading\">Configuring Docker Model Runner<\/h2>\n<p>Enable Docker Model Runner in your Docker Desktop or Docker Engine as described in the<a href=\"https:\/\/docs.docker.com\/model\/\" target=\"_blank\"> official documentation<\/a>.<\/p>\n<p>Then pull the following two models:<\/p>\n\n<div class=\"wp-block-syntaxhighlighter-code \">\ndocker model pull ai\/llama3.1<br \/>\ndocker model pull ai\/mxbai-embed-large\n<\/div>\n<p>ai\/llama3.1 \u2013 chat model<\/p>\n<p>ai\/mxbai-embed-large \u2013 embedding model<\/p>\n<p>Both models are hosted on Docker Hub under the <a href=\"https:\/\/hub.docker.com\/u\/ai\" target=\"_blank\"><strong>ai<\/strong><\/a> namespace. You can also pick a specific tag for a model; tags usually correspond to different quantizations of the model. 
If you don\u2019t know which tag to pick, the default one is a good starting point.<\/p>\n\n<h2 class=\"wp-block-heading\">Building the GenAI app<\/h2>\n<p>Let\u2019s create a <strong>ChatController<\/strong> under <strong>\/src\/main\/java\/com\/example<\/strong>, which will be our entry point to interact with the chat model:<\/p>\n\n<div class=\"wp-block-syntaxhighlighter-code \">\n@RestController<br \/>\npublic class ChatController {\n<p>\tprivate final ChatClient chatClient;<\/p>\n<p>\tpublic ChatController(ChatModel chatModel) {<br \/>\n\t\tthis.chatClient = ChatClient.builder(chatModel).build();<br \/>\n\t}<\/p>\n<p>\t@GetMapping(&#8220;\/chat&#8221;)<br \/>\n\tpublic String generate(@RequestParam(value = &#8220;message&#8221;, defaultValue = &#8220;Tell me a joke&#8221;) String message) {<br \/>\n\t\treturn this.chatClient.prompt().user(message).call().content();<br \/>\n\t}<\/p>\n<p>}<\/p>\n<\/div>\n<p>ChatClient is the interface that provides the available operations to interact with the model. 
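Under the hood, those ChatClient calls go over Docker Model Runner\u2019s OpenAI-compatible API. As a rough, non-authoritative illustration (not part of the article\u2019s code), the sketch below builds the kind of JSON body an OpenAI-style chat completion request carries; the class name is made up for this example, and the endpoint path in the comment is an assumption based on the base URL configured later in this post.

```java
// Hypothetical sketch: the OpenAI-style chat request body that clients such as
// Spring AI send to an OpenAI-compatible server like Docker Model Runner.
// A real client would use a JSON library rather than string concatenation.
public class ChatRequestSketch {

    // Builds a minimal chat completion request body for the given model and message.
    public static String chatRequestBody(String model, String userMessage) {
        return "{"
                + "\"model\":\"" + model + "\","
                + "\"messages\":[{\"role\":\"user\",\"content\":\"" + userMessage + "\"}]"
                + "}";
    }

    public static void main(String[] args) {
        // Assumed endpoint (derived from the base URL used later in this post):
        // POST http://localhost:12434/engines/v1/chat/completions
        System.out.println(chatRequestBody("ai/llama3.1", "Tell me a joke"));
    }
}
```

This is only meant to show why no SDK-specific setup is needed: any HTTP client that can POST such a body can talk to the locally running model.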
We\u2019ll be injecting the actual model value (which model to use) via configuration properties.<\/p>\n<p>If no message query param is provided, then we\u2019ll ask the model to tell a joke (as seen in the defaultValue).<\/p>\n<p>Let\u2019s configure our application to point to Docker Model Runner and use the \u201cai\/llama3.1\u201d model by adding the following properties to <strong>\/src\/test\/resources\/application.properties<\/strong><\/p>\n\n<div class=\"wp-block-syntaxhighlighter-code \">\nspring.ai.openai.base-url=http:\/\/localhost:12434\/engines<br \/>\nspring.ai.openai.api-key=test<br \/>\nspring.ai.openai.chat.options.model=ai\/llama3.1\n<\/div>\n<p><a href=\"http:\/\/spring.ai\/\" target=\"_blank\">spring.ai<\/a>.openai.api-key is required by the framework, but we can use any value here since it is not needed for Docker Model Runner.<\/p>\n<p>Let\u2019s start our application by running <strong>.\/mvnw spring-boot:test-run<\/strong> or<strong> .\/gradlew bootTestRun <\/strong>and ask it about Testcontainers:<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\nhttp :8080\/chat message==&#8221;What\u2019s testcontainers?&#8221;\n<\/div>\n<p>Below, we can find the answer provided by the LLM (ai\/llama3.1)<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\nTestcontainers is a fantastic and increasingly popular library for **local testing with containers**. Let&#8217;s break down what it is, why it&#8217;s useful, and how it works:\n<p>**What is Testcontainers?**<\/p>\n<p>Testcontainers provides a way to run real, fully functional containerized services (like databases, message queues, web servers, etc.) directly within your tests. Instead of relying on mocked or stubbed versions of these services, you&#8217;re using the *actual* services, which leads to much more realistic and reliable test results.<\/p>\n<p>**Why Use Testcontainers?**<\/p>\n<p>* **Realistic Testing:** This is the biggest benefit.  
Mocking databases or message queues can be brittle and difficult to maintain.  Testcontainers provides a service that behaves exactly like the real thing, leading to tests that more accurately reflect how your application will perform in production.<br \/>\n* **Simplified Test Setup:**  Forget about manually setting up and configuring databases or other services on your test machine. Testcontainers automatically handles the container creation, configuration, and cleanup for you.<br \/>\n* **Faster Tests:** Because the services are running locally, there\u2019s no network latency involved, resulting in significantly faster test execution times.<br \/>\n* **Consistent Environments:**  You eliminate the &#8220;it works on my machine&#8221; problem. Everyone running the tests will be using the same, pre-configured environment.<br \/>\n* **Supports Many Services:** Testcontainers supports a huge range of services, including:<br \/>\n    * **Databases:** PostgreSQL, MySQL, MongoDB, Redis, Cassandra, MariaDB<br \/>\n    * **Message Queues:** RabbitMQ, Kafka, ActiveMQ<br \/>\n    * **Web Servers:**  Tomcat, Jetty, H2 (for in-memory databases)<br \/>\n    * **And many more!**  The list is constantly growing.<\/p>\n<p>**How Does It Work?**<\/p>\n<p>1. **Client Library:** Testcontainers provides client libraries for various programming languages (Java, Python, JavaScript, Ruby, Go, .NET, and more).<br \/>\n2. **Container Run:** When you use the Testcontainers client library in your test, it automatically starts the specified container (e.g., a PostgreSQL database) in the background.<br \/>\n3. **Connection:** Your test code then connects to the running container using standard protocols (e.g., JDBC for PostgreSQL, HTTP for a web server).<br \/>\n4. **Test Execution:**  You execute your tests as usual.<br \/>\n5. 
**Cleanup:**  When the tests are finished, Testcontainers automatically shuts down the container, ensuring a clean state for the next test run.<\/p>\n<p>**Example (Conceptual &#8211; Python):**<\/p>\n<p>&#8220;`python<br \/>\nfrom testcontainers.postgresql import PostgreSQLEnvironment<\/p>\n<p># Create a PostgreSQL environment<br \/>\nenv = PostgreSQLEnvironment()<\/p>\n<p># Start the container<br \/>\nenv.start()<\/p>\n<p># Connect to the database<br \/>\ndb = env.db()  #  This creates a connection object to the running PostgreSQL container<\/p>\n<p># Perform database operations in your test<br \/>\n# &#8230;<\/p>\n<p># Stop the container (cleanup)<br \/>\nenv.shutdown()<br \/>\n&#8220;`<\/p>\n<p>**Key Concepts:**<\/p>\n<p>* **Environment:**  A Testcontainers environment is a configuration that defines which containers to run and how they should be configured.<br \/>\n* **Container:**  A running containerized service (e.g., a database instance).<br \/>\n* **Connection:** An object that represents a connection to a specific container.<\/p>\n<p>**Resources to Learn More:**<\/p>\n<p>* **Official Website:** [https:\/\/testcontainers.io\/](https:\/\/testcontainers.io\/) &#8211; This is the best place to start.<br \/>\n* **GitHub Repository:** [https:\/\/github.com\/testcontainers\/testcontainers](https:\/\/github.com\/testcontainers\/testcontainers) &#8211;  See the source code and contribute.<br \/>\n* **Documentation:** [https:\/\/testcontainers.io\/docs\/](https:\/\/testcontainers.io\/docs\/) &#8211; Comprehensive documentation with examples for various languages.<\/p>\n<p>**In short, Testcontainers is a powerful tool that dramatically improves the quality and reliability of your local tests by allowing you to test against real, running containerized services.**<\/p>\n<p>Do you want me to delve deeper into a specific aspect of Testcontainers, such as:<\/p>\n<p>*   A specific language implementation (e.g., Python)?<br \/>\n*   A particular service it supports (e.g., 
PostgreSQL)?<br \/>\n*   How to integrate it with a specific testing framework (e.g., JUnit, pytest)?<\/p>\n<\/div>\n<p>We can see that the answer provided by the LLM contains some mistakes. For example, PostgreSQLEnvironment doesn\u2019t exist in testcontainers-python, and the documentation links are wrong: testcontainers.io doesn\u2019t exist. In other words, the answer contains hallucinations.<\/p>\n<p>Of course, LLM responses are non-deterministic, and since each model is trained until a certain cutoff date, the information may be outdated and the answers might not be accurate.<\/p>\n<p>To improve this situation, let\u2019s provide the model with some curated context about Testcontainers!<\/p>\n<p>We\u2019ll create another controller, <strong>RagController<\/strong>, which will retrieve documents from the vector search database.<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n@RestController<br \/>\npublic class RagController {\n<p>\tprivate final ChatClient chatClient;<\/p>\n<p>\tprivate final VectorStore vectorStore;<\/p>\n<p>\tpublic RagController(ChatModel chatModel, VectorStore vectorStore) {<br \/>\n\t\tthis.chatClient = ChatClient.builder(chatModel).build();<br \/>\n\t\tthis.vectorStore = vectorStore;<br \/>\n\t}<\/p>\n<p>\t@GetMapping(&#8220;\/rag&#8221;)<br \/>\n\tpublic String generate(@RequestParam(value = &#8220;message&#8221;, defaultValue = &#8220;What&#8217;s Testcontainers?&#8221;) String message) {<br \/>\n\t\treturn callResponseSpec(this.chatClient, this.vectorStore, message).content();<br \/>\n\t}<\/p>\n<p>\tstatic ChatClient.CallResponseSpec callResponseSpec(ChatClient chatClient, VectorStore vectorStore,<br \/>\n\t\t\tString question) {<br \/>\n\t\tQuestionAnswerAdvisor questionAnswerAdvisor = QuestionAnswerAdvisor.builder(vectorStore)<br \/>\n\t\t\t.searchRequest(SearchRequest.builder().topK(1).build())<br \/>\n\t\t\t.build();<br \/>\n\t\treturn chatClient.prompt().advisors(questionAnswerAdvisor).user(question).call();<br 
\/>\n\t}<\/p>\n<p>}<\/p>\n<\/div>\n<p>Spring AI provides many advisors. In this example, we are going to use the QuestionAnswerAdvisor to perform the query against the vector search database. It takes care of all the individual integrations with the vector database.<\/p>\n<h2 class=\"wp-block-heading\">Ingesting documents into the vector database<\/h2>\n<p>First, we need to load the relevant documents into the vector database. Under <strong>src\/test\/java\/com\/example<\/strong>, let\u2019s create an IngestionConfiguration class:<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n@TestConfiguration(proxyBeanMethods = false)<br \/>\npublic class IngestionConfiguration {\n<p>\t@Value(&#8220;classpath:\/docs\/testcontainers.txt&#8221;)<br \/>\n\tprivate Resource testcontainersDoc;<\/p>\n<p>\t@Bean<br \/>\n\tApplicationRunner init(VectorStore vectorStore) {<br \/>\n\t\treturn args -&gt; {<br \/>\n\t\t\tvar javaTextReader = new TextReader(this.testcontainersDoc);<br \/>\n\t\t\tjavaTextReader.getCustomMetadata().put(&#8220;language&#8221;, &#8220;java&#8221;);<\/p>\n<p>\t\t\tvar tokenTextSplitter = new TokenTextSplitter();<br \/>\n\t\t\tvar testcontainersDocuments = tokenTextSplitter.apply(javaTextReader.get());<\/p>\n<p>\t\t\tvectorStore.add(testcontainersDocuments);<br \/>\n\t\t};<br \/>\n\t}<\/p>\n<p>}<\/p>\n<\/div>\n<p>The testcontainers.txt file under the <strong>\/src\/test\/resources\/docs<\/strong> directory will have the following Testcontainers-specific content. For a real-world use case, you would probably have a more extensive collection of documents.<\/p>\n\n<div class=\"wp-block-syntaxhighlighter-code \">\nTestcontainers is a library that provides easy and lightweight APIs for bootstrapping local development and test dependencies with real services wrapped in Docker containers. 
Using Testcontainers, you can write tests that depend on the same services you use in production without mocks or in-memory services.\n<p>Testcontainers provides modules for a wide range of commonly used infrastructure dependencies including relational databases, NoSQL datastores, search engines, message brokers, etc. See https:\/\/testcontainers.com\/modules\/ for a complete list.<\/p>\n<p>Technology-specific modules are a higher-level abstraction on top of GenericContainer which help configure and run these technologies without any boilerplate, and make it easy to access their relevant parameters.<\/p>\n<p>Official website: https:\/\/testcontainers.com\/<br \/>\nGetting Started: https:\/\/testcontainers.com\/getting-started\/<br \/>\nModule Catalog: https:\/\/testcontainers.com\/modules\/<\/p>\n<\/div>\n<p>Now, let\u2019s add the following additional properties to the<strong> src\/test\/resources\/application.properties<\/strong> file.<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\nspring.ai.openai.embedding.options.model=ai\/mxbai-embed-large<br \/>\nspring.ai.vectorstore.qdrant.initialize-schema=true<br \/>\nspring.ai.vectorstore.qdrant.collection-name=test\n<\/div>\n<p><a href=\"https:\/\/hub.docker.com\/r\/ai\/mxbai-embed-large\" target=\"_blank\">ai\/mxbai-embed-large<\/a> is an embedding model that will be used to create the embeddings of the documents. The embeddings will be stored in the vector search database, in our case, Qdrant. 
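To build some intuition for what the vector store does with these embeddings: each document chunk and each query is turned into a vector, and the store ranks stored vectors by their similarity to the query vector (cosine similarity is a common choice). The sketch below is an illustration only, with tiny made-up vectors standing in for real embeddings, which have far more dimensions; it is not how Qdrant is implemented internally.

```java
// Illustration only: how a vector store ranks documents by similarity to a
// query embedding. The three-dimensional vectors are made up for the example;
// real embedding models produce vectors with hundreds or thousands of dimensions.
public class SimilaritySketch {

    // Cosine similarity: dot(a, b) / (|a| * |b|), ranging over [-1, 1].
    public static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        double[] query = {0.9, 0.1, 0.0};  // embedding of the question
        double[] docA  = {0.8, 0.2, 0.1};  // similar direction -> relevant
        double[] docB  = {0.0, 0.1, 0.9};  // different direction -> less relevant
        // A topK(1) search, as configured in the advisor, would return docA here,
        // because its cosine score against the query is higher than docB's.
        System.out.println("docA score: " + cosine(query, docA));
        System.out.println("docB score: " + cosine(query, docB));
    }
}
```

This is why the choice of embedding model matters: both the ingested documents and the incoming question must be embedded by the same model for their vectors to be comparable.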
Spring AI will initialize the Qdrant schema and use the collection named test.<\/p>\n\n<p>Let\u2019s update our <strong>TestDemoApplication<\/strong> Java class and add the IngestionConfiguration.class<\/p>\n\n<div class=\"wp-block-syntaxhighlighter-code \">\npublic class TestDemoApplication {\n<p>\tpublic static void main(String[] args) {<br \/>\n\t\tSpringApplication.from(DemoApplication::main)<br \/>\n\t\t\t.with(TestcontainersConfiguration.class, IngestionConfiguration.class)<br \/>\n\t\t\t.run(args);<br \/>\n\t}<\/p>\n<p>}<\/p>\n<\/div>\n<p>Now we start our application by running <strong>.\/mvnw spring-boot:test-run <\/strong>or<strong> .\/gradlew bootTestRun<\/strong> and ask it again about Testcontainers:<\/p>\n\n<div class=\"wp-block-syntaxhighlighter-code \">\nhttp :8080\/rag message==&#8221;What\u2019s testcontainers?&#8221;\n<\/div>\n<p>This time, the answer contains references from the docs we have provided and is more accurate.<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\nTestcontainers is a library that helps you write tests for your applications by bootstrapping real services in Docker containers, rather than using mocks or in-memory services. This allows you to test your applications as they would run in production, but in a controlled and isolated environment.\n<p>It provides modules for commonly used infrastructure dependencies such as relational databases, NoSQL datastores, search engines, and message brokers.<\/p>\n<p>If you have any specific questions about how to use Testcontainers or its features, I&#8217;d be happy to help.<\/p>\n<\/div>\n<h2 class=\"wp-block-heading\">Integration testing<\/h2>\n<p>Testing is a key part of software development. Fortunately, Testcontainers and Spring AI\u2019s utilities support testing of GenAI applications. So far, we\u2019ve been testing the application manually, starting the application and performing requests to the given endpoints, verifying the correctness of the response ourselves. 
Now, we\u2019re going to automate it by writing an integration test to check if the answer provided by the LLM is more contextual, augmented by the information provided in the documents.<\/p>\n\n<div class=\"wp-block-syntaxhighlighter-code \">\n@SpringBootTest(classes = { TestcontainersConfiguration.class, IngestionConfiguration.class },<br \/>\n\t\twebEnvironment = SpringBootTest.WebEnvironment.RANDOM_PORT)<br \/>\nclass RagControllerTest {\n<p>\t@LocalServerPort<br \/>\n\tprivate int port;<\/p>\n<p>\t@Autowired<br \/>\n\tprivate VectorStore vectorStore;<\/p>\n<p>\t@Autowired<br \/>\n\tprivate ChatClient.Builder chatClientBuilder;<\/p>\n<p>\t@Test<br \/>\n\tvoid verifyTestcontainersAnswer() {<br \/>\n\t\tvar question = &#8220;Tell me about Testcontainers&#8221;;<br \/>\n\t\tvar answer = retrieveAnswer(question);<\/p>\n<p>\t\tassertFactCheck(question, answer);<br \/>\n\t}<\/p>\n<p>\tprivate String retrieveAnswer(String question) {<br \/>\n\t\tRestClient restClient = RestClient.builder().baseUrl(&#8220;http:\/\/localhost:%d&#8221;.formatted(this.port)).build();<br \/>\n\t\treturn restClient.get().uri(&#8220;\/rag?message={question}&#8221;, question).retrieve().body(String.class);<br \/>\n\t}<\/p>\n<p>\tprivate void assertFactCheck(String question, String answer) {<br \/>\n\t\tFactCheckingEvaluator factCheckingEvaluator = new FactCheckingEvaluator(this.chatClientBuilder);<br \/>\n\t\tEvaluationResponse evaluate = factCheckingEvaluator.evaluate(new EvaluationRequest(docs(question), answer));<br \/>\n\t\tassertThat(evaluate.isPass()).isTrue();<br \/>\n\t}<\/p>\n<p>\tprivate List&lt;Document&gt; docs(String question) {<br \/>\n\t\tvar response = RagController<br \/>\n\t\t\t.callResponseSpec(this.chatClientBuilder.build(), this.vectorStore, question)<br \/>\n\t\t\t.chatResponse();<br \/>\n\t\treturn response.getMetadata().get(QuestionAnswerAdvisor.RETRIEVED_DOCUMENTS);<br \/>\n\t}<\/p>\n<p>}<\/p>\n<\/div>\n<p>By importing TestcontainersConfiguration, a Qdrant container will be 
provided.<\/p>\n<p>Importing IngestionConfiguration will load the documents into the vector database.<\/p>\n<p>We\u2019re going to use <a href=\"https:\/\/docs.spring.io\/spring-ai\/reference\/api\/testing.html#_factcheckingevaluator\" target=\"_blank\">FactCheckingEvaluator<\/a> to tell the chat model (ai\/llama3.1) to check the answer provided by the LLM and verify it against the documents stored in the vector database.<\/p>\n<p>Note: The integration test uses the same model we declared in the previous steps, but we could certainly use a different one.<\/p>\n<p>Automating your tests ensures consistency and reduces the risk of errors that often come with manual execution.<\/p>\n<h2 class=\"wp-block-heading\">Observability with the Grafana LGTM Stack<\/h2>\n<p>Finally, let\u2019s introduce some observability into our application. By introducing metrics and tracing, we can understand whether our application behaves as designed during development and in production.<\/p>\n<p>Add the following dependencies to the pom.xml:<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n&lt;dependency&gt;<br \/>\n\t&lt;groupId&gt;org.springframework.boot&lt;\/groupId&gt;<br \/>\n\t&lt;artifactId&gt;spring-boot-starter-actuator&lt;\/artifactId&gt;<br \/>\n&lt;\/dependency&gt;<br \/>\n&lt;dependency&gt;<br \/>\n\t&lt;groupId&gt;io.micrometer&lt;\/groupId&gt;<br \/>\n\t&lt;artifactId&gt;micrometer-registry-otlp&lt;\/artifactId&gt;<br \/>\n&lt;\/dependency&gt;<br \/>\n&lt;dependency&gt;<br \/>\n\t&lt;groupId&gt;io.micrometer&lt;\/groupId&gt;<br \/>\n\t&lt;artifactId&gt;micrometer-tracing-bridge-otel&lt;\/artifactId&gt;<br \/>\n&lt;\/dependency&gt;<br \/>\n&lt;dependency&gt;<br \/>\n\t&lt;groupId&gt;io.opentelemetry&lt;\/groupId&gt;<br \/>\n\t&lt;artifactId&gt;opentelemetry-exporter-otlp&lt;\/artifactId&gt;<br \/>\n&lt;\/dependency&gt;<br \/>\n&lt;dependency&gt;<br \/>\n\t&lt;groupId&gt;org.testcontainers&lt;\/groupId&gt;<br 
\/>\n\t&lt;artifactId&gt;grafana&lt;\/artifactId&gt;<br \/>\n\t&lt;scope&gt;test&lt;\/scope&gt;<br \/>\n&lt;\/dependency&gt;\n<\/div>\n<p>Now, let\u2019s create <strong>GrafanaContainerConfiguration<\/strong> under src\/test\/java\/com\/example.<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n@TestConfiguration(proxyBeanMethods = false)<br \/>\npublic class GrafanaContainerConfiguration {\n<p>\t@Bean<br \/>\n\t@ServiceConnection<br \/>\n\tLgtmStackContainer lgtmContainer() {<br \/>\n\t\treturn new LgtmStackContainer(&#8220;grafana\/otel-lgtm:0.11.4&#8221;);<br \/>\n\t}<\/p>\n<p>}<\/p>\n<\/div>\n<p>Grafana provides the <a href=\"https:\/\/hub.docker.com\/r\/grafana\/otel-lgtm\" target=\"_blank\">grafana\/otel-lgtm<\/a> image, which starts Prometheus, Tempo, the OpenTelemetry Collector, and other related services, all combined into a single convenient Docker image.<\/p>\n<p>For the sake of our demo, let\u2019s add a couple of properties in <strong>\/src\/test\/resources\/application.properties<\/strong> to sample 100% of requests.<\/p>\n\n<div class=\"wp-block-syntaxhighlighter-code \">\nspring.application.name=demo<br \/>\nmanagement.tracing.sampling.probability=1\n<\/div>\n<p>Update the <strong>TestDemoApplication<\/strong> class to include <strong>GrafanaContainerConfiguration.class<\/strong><\/p>\n\n<div class=\"wp-block-syntaxhighlighter-code \">\npublic class TestDemoApplication {\n<p>\tpublic static void main(String[] args) {<br \/>\n\t\tSpringApplication.from(DemoApplication::main)<br \/>\n\t\t\t.with(TestcontainersConfiguration.class, IngestionConfiguration.class, GrafanaContainerConfiguration.class)<br \/>\n\t\t\t.run(args);<br \/>\n\t}<\/p>\n<p>}<\/p>\n<\/div>\n<p>Now, run <strong>.\/mvnw spring-boot:test-run<\/strong> or<strong> .\/gradlew bootTestRun <\/strong>one more time, and perform a request.<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\nhttp :8080\/rag message==&#8221;What\u2019s testcontainers?&#8221;\n<\/div>\n<p>Then, 
look for the following text in the logs.<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\no.t.grafana.LgtmStackContainer           : Access to the Grafana dashboard:<br \/>\nhttp:\/\/localhost:64908\n<\/div>\n<p>The port may be different for you, but clicking on the link should open the Grafana dashboard. This is where you can query for metrics related to the model or vector search, and also see the traces.<\/p>\n<div class=\"wp-block-ponyo-image\"><\/div>\n<p class=\"has-xs-font-size\"><strong>Figure 1: Grafana dashboard showing model metrics, vector search performance, and traces<\/strong><\/p>\n<p>We can also display the token usage metrics for the chat endpoint.<\/p>\n<div class=\"wp-block-ponyo-image\"><\/div>\n<p class=\"has-xs-font-size\"><strong>Figure 2: Grafana dashboard panel displaying token usage metrics for the chat endpoint<\/strong><\/p>\n<p>Listing the traces for the service named \u201cdemo\u201d shows the operations executed as part of each trace. You can use the trace ID of the trace named <strong>http get \/rag<\/strong> to see the full control flow within the same HTTP request.<\/p>\n<div class=\"wp-block-ponyo-image\"><\/div>\n<p class=\"has-xs-font-size\"><strong>Figure 3: Grafana dashboard showing trace details for a \/rag endpoint in a Java GenAI application<\/strong><\/p>\n\n<h3 class=\"wp-block-heading\">Conclusion<\/h3>\n<p>Docker offers powerful capabilities that complement the Spring AI project, allowing developers to build GenAI applications efficiently with Docker tools that they know and trust. It simplifies the startup of service dependencies, including the Docker Model Runner, which exposes an OpenAI-compatible API for running local models. 
Testcontainers helps you quickly spin up integration tests to evaluate your app by providing lightweight containers for your services and dependencies. From development to testing, Docker and Spring AI have proven to be a reliable and productive combination for building modern AI-driven applications.<\/p>\n\n<h3 class=\"wp-block-heading\">Learn more<\/h3>\n<p>Get an inside look at the<a href=\"https:\/\/www.docker.com\/blog\/how-we-designed-model-runner-and-whats-next\/\"> design architecture of the Docker Model Runner<\/a>.<\/p>\n<p>Explore the<a href=\"https:\/\/www.docker.com\/blog\/oci-artifacts-for-ai-model-packaging\/\"> story<\/a> behind our model distribution specification.<\/p>\n<p>Read our quickstart guide to<a href=\"https:\/\/www.docker.com\/blog\/run-llms-locally\/\"> Docker Model Runner<\/a>.<\/p>\n<p>Find documentation for<a href=\"https:\/\/docs.docker.com\/model-runner\/\" target=\"_blank\"> Model Runner<\/a>.<\/p>\n<p>Visit our new <a href=\"https:\/\/www.docker.com\/solutions\/docker-ai\/\">AI solution page<\/a>.<\/p>","protected":false},"excerpt":{"rendered":"<p>When thinking about starting a Generative AI (GenAI) project, you might assume that Python is required to get started in 
[&hellip;]<\/p>\n","protected":false},"author":0,"featured_media":0,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"site-sidebar-layout":"default","site-content-layout":"","ast-site-content-layout":"default","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","ast-disable-related-posts":"","theme-transparent-header-meta":"","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"default","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"footnotes":""},"categories":[4],"tags":[],"class_list":["post-2233","post","type-post","status-publish","format-standard","hentry","category-docker"],"_links":{"self":[{"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/posts\/2233","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/comments?post=2233"}],"version-history":[{"count":0,"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/posts\/2233\/revisions"}],"wp:attachment":[{"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/media?parent=2233"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/categories?post=2233"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/tags?post=2233"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}