{"id":1074,"date":"2024-07-23T18:10:38","date_gmt":"2024-07-23T18:10:38","guid":{"rendered":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/2024\/07\/23\/local-llm-messenger-chat-with-genai-on-your-iphone\/"},"modified":"2024-07-23T18:10:38","modified_gmt":"2024-07-23T18:10:38","slug":"local-llm-messenger-chat-with-genai-on-your-iphone","status":"publish","type":"post","link":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/2024\/07\/23\/local-llm-messenger-chat-with-genai-on-your-iphone\/","title":{"rendered":"Local LLM Messenger: Chat with GenAI on Your iPhone"},"content":{"rendered":"<p>In this <a href=\"https:\/\/www.docker.com\/blog\/tag\/ai-ml-hackathon\/\" target=\"_blank\" rel=\"noopener\">AI\/ML Hackathon post<\/a>, we want to share another winning project from last year\u2019s <a href=\"https:\/\/docker.devpost.com\/\" target=\"_blank\" rel=\"noopener\">Docker AI\/ML Hackathon<\/a>. This time we will dive into <a href=\"https:\/\/devpost.com\/software\/local-lingo-messenger\" target=\"_blank\" rel=\"noopener\">Local LLM Messenger<\/a>, an honorable mention winner created by <a href=\"https:\/\/justingarrison.com\/\" target=\"_blank\" rel=\"noopener\">Justin Garrison<\/a>.<\/p>\n<p>Developers are pushing the boundaries to bring the power of artificial intelligence (AI) to everyone. One exciting approach involves integrating Large Language Models (LLMs) with familiar messaging platforms like Slack and iMessage. This isn\u2019t just about convenience; it\u2019s about transforming these platforms into launchpads for interacting with powerful AI tools.<\/p>\n<p>Imagine this: You need a quick code snippet or some help brainstorming solutions to coding problems. With LLMs integrated into your messaging app, you can chat with your AI assistant directly within the familiar interface to generate creative ideas or get help brainstorming solutions. 
No more complex commands or clunky interfaces \u2014 just a natural conversation to unlock the power of AI.<\/p>\n<p>Integrating with messaging platforms can be a time-consuming task, especially for macOS users. That\u2019s where Local LLM Messenger (LoLLMM) steps in, offering a streamlined solution for connecting with your AI via iMessage.\u00a0<\/p>\n<h2 class=\"wp-block-heading\">What makes LoLLM Messenger unique?<\/h2>\n<p>The following demo, which was submitted to the AI\/ML Hackathon, provides an overview of LoLLM Messenger (Figure 1).<\/p>\n<div class=\"wp-block-embed__wrapper\">\n<\/div>\n<p><strong>Figure 1: <\/strong>Demo of LoLLM Messenger as submitted to the AI\/ML Hackathon.<\/p>\n<p>The LoLLM Messenger bot allows you to send iMessages to Generative AI (GenAI) models running directly on your computer. This approach eliminates the need for complex setups and cloud services, making it easier for developers to experiment with LLMs locally.<\/p>\n<h2 class=\"wp-block-heading\">Key features of LoLLM Messenger<\/h2>\n<p>LoLLM Messenger includes impressive features that make it a standout among similar projects, such as:<\/p>\n<p><strong>Local execution<\/strong>: Runs on your computer, eliminating the need for cloud-based services and ensuring data privacy.<\/p>\n<p><strong>Scalability<\/strong>: Handles multiple AI models simultaneously, allowing users to experiment with different models and switch between them easily.<\/p>\n<p><strong>User-friendly interface<\/strong>: Offers a simple and intuitive interface, making it accessible to users of all skill levels.<\/p>\n<p><strong>Integration with Sendblue<\/strong>: Integrates seamlessly with <a href=\"https:\/\/sendblue.co\/\" target=\"_blank\" rel=\"noopener\">Sendblue<\/a>, enabling users to send iMessages to the bot and receive responses directly in their inbox.<\/p>\n<p><strong>Support for OpenAI models<\/strong>: Supports the GPT-3.5 Turbo and DALL-E 2 models, providing users with access to powerful AI 
capabilities.<\/p>\n<p><strong>Customization:<\/strong> Allows users to customize the bot\u2019s behavior by modifying the available commands and integrating their own AI models.<\/p>\n<h2 class=\"wp-block-heading\">How does it work?<\/h2>\n<p>The architecture diagram shown in Figure 2 provides a high-level overview of the components and interactions within the LoLLM Messenger project. It illustrates how the main application, AI models, messaging platform, and external APIs work together to enable users to send iMessages to AI models running on their computers.<\/p>\n<p><a href=\"https:\/\/www.docker.com\/wp-content\/uploads\/2024\/08\/F2-LoLLM-overview.png\" target=\"_blank\" rel=\"noopener\"><\/a><strong>Figure 2:<\/strong> Overview of the components and interactions in the LoLLM Messenger project.<\/p>\n<p>By leveraging Docker, Sendblue, and Ollama, LoLLM Messenger offers a seamless and efficient solution for those seeking to explore AI models without the need for cloud-based services. LoLLM Messenger utilizes <a href=\"https:\/\/docs.docker.com\/compose\/\" target=\"_blank\" rel=\"noopener\">Docker Compose<\/a> to manage the required services.\u00a0<\/p>\n<p>Docker Compose simplifies the process by handling the setup and configuration of multiple containers, including the main application, ngrok (for creating a secure tunnel), and Ollama (a server that bridges the gap between messaging apps and AI models).<\/p>\n<h2 class=\"wp-block-heading\">Technical stack<\/h2>\n<p>The LoLLM Messenger tech stack includes:<\/p>\n<p><strong>Lollm service<\/strong>: This service runs the main application. It handles incoming iMessages, processes user requests, and interacts with the AI models served by Ollama, a powerful platform for text and image generation.<\/p>\n<p><strong>Ngrok<\/strong>: This service exposes the main application\u2019s port 8000 to the internet using ngrok. It runs the ngrok Alpine image and forwards traffic from port 8000 to the ngrok tunnel. The service is set to run in the host network mode.<\/p>\n<p><strong>Ollama:<\/strong> This service runs the Ollama server, which serves AI models for text and image generation. It listens on port 11434 and mounts a volume from .\/run\/ollama to \/home\/ollama. The service is set to deploy with GPU resources, ensuring that it can utilize an NVIDIA GPU if available.<\/p>\n<p><strong>Sendblue:<\/strong> The project integrates with Sendblue to handle iMessages. You can set up Sendblue by adding your API Key and API Secret in the app\/.env file and adding your phone number as a Sendblue contact.<\/p>\n<h2 class=\"wp-block-heading\">Getting started<\/h2>\n<p>To get started, ensure that you have installed and set up the following components:<\/p>\n<p>Install the latest <a href=\"https:\/\/www.docker.com\/products\/docker-desktop\/\" target=\"_blank\" rel=\"noopener\">Docker Desktop<\/a>.<\/p>\n<p>Register for a Sendblue account at <a href=\"https:\/\/app.sendblue.co\/auth\/login\" target=\"_blank\" rel=\"noopener\">https:\/\/app.sendblue.co\/auth\/login<\/a>.\u00a0<\/p>\n<p>Create an ngrok account and get your authtoken at <a href=\"https:\/\/dashboard.ngrok.com\/signup\" target=\"_blank\" rel=\"noopener\">https:\/\/dashboard.ngrok.com\/signup<\/a>.<\/p>\n<h2 class=\"wp-block-heading\">Clone the repository<\/h2>\n<p>Open a terminal window and run the following command to clone this sample application:<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\ngit clone https:\/\/github.com\/dockersamples\/local-llm-messenger\n<\/div>\n<p>You should now have the following files in your local-llm-messenger directory:<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n.<br \/>\n\u251c\u2500\u2500 LICENSE<br \/>\n\u251c\u2500\u2500 README.md<br \/>\n\u251c\u2500\u2500 app<br \/>\n\u2502   \u251c\u2500\u2500 Dockerfile<br \/>\n\u2502   \u251c\u2500\u2500 Pipfile<br \/>\n\u2502   
\u251c\u2500\u2500 Pipfile.lock<br \/>\n\u2502   \u251c\u2500\u2500 default.ai<br \/>\n\u2502   \u251c\u2500\u2500 log_conf.yaml<br \/>\n\u2502   \u2514\u2500\u2500 main.py<br \/>\n\u251c\u2500\u2500 docker-compose.yaml<br \/>\n\u251c\u2500\u2500 img<br \/>\n\u2502   \u251c\u2500\u2500 banner.png<br \/>\n\u2502   \u251c\u2500\u2500 lasers.gif<br \/>\n\u2502   \u2514\u2500\u2500 lollm-demo-1.gif<br \/>\n\u251c\u2500\u2500 justfile<br \/>\n\u2514\u2500\u2500 test<br \/>\n    \u251c\u2500\u2500 msg.json<br \/>\n    \u2514\u2500\u2500 ollama.json\n<p>4 directories, 15 files\n<\/p><\/div>\n<p>The main.py script under the \/app directory uses the FastAPI framework to create a web server for an AI-powered messaging application. The script interacts with OpenAI\u2019s GPT-3.5 Turbo model and an Ollama endpoint for generating responses, and it uses Sendblue\u2019s API for sending messages.<\/p>\n<p>The script first imports the necessary libraries, including FastAPI, requests, logging, and other required modules.<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\nfrom dotenv import load_dotenv<br \/>\nimport os, requests, time, openai, json, logging<br \/>\nfrom pprint import pprint<br \/>\nfrom typing import Union, List\n<p>from fastapi import FastAPI<br \/>\nfrom pydantic import BaseModel<\/p>\n<p>from sendblue import Sendblue\n<\/p><\/div>\n<p>The next section sets up configuration variables, such as API keys, the callback URL, the Ollama API endpoint, and maximum context and word limits.<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\nSENDBLUE_API_KEY = os.environ.get(&#8220;SENDBLUE_API_KEY&#8221;)<br \/>\nSENDBLUE_API_SECRET = os.environ.get(&#8220;SENDBLUE_API_SECRET&#8221;)<br \/>\nopenai.api_key = os.environ.get(&#8220;OPENAI_API_KEY&#8221;)<br \/>\nOLLAMA_API = os.environ.get(&#8220;OLLAMA_API_ENDPOINT&#8221;, &#8220;http:\/\/ollama:11434\/api&#8221;)<br \/>\n# could also use request.headers.get(&#8216;referer&#8217;) to do dynamically<br 
\/>\nCALLBACK_URL = os.environ.get(&#8220;CALLBACK_URL&#8221;)<br \/>\nMAX_WORDS = os.environ.get(&#8220;MAX_WORDS&#8221;)\n<\/div>\n<p>Next, the script configures logging, setting the log level to INFO and creating a file handler that writes log messages to a file named app.log.<\/p>\n<p>It then defines various functions for interacting with the AI models, managing context, sending messages, handling callbacks, and executing slash commands.<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\ndef set_default_model(model: str):<br \/>\n    try:<br \/>\n        with open(&#8220;default.ai&#8221;, &#8220;w&#8221;) as f:<br \/>\n            f.write(model)<br \/>\n            f.close()<br \/>\n            return<br \/>\n    except IOError:<br \/>\n        logger.error(&#8220;Could not open file&#8221;)<br \/>\n        exit(1)\n<p>def get_default_model() -&gt; str:<br \/>\n    try:<br \/>\n        with open(&#8220;default.ai&#8221;) as f:<br \/>\n            default = f.readline().strip(&#8220;\\n&#8221;)<br \/>\n            f.close()<br \/>\n            if default != &#8220;&#8221;:<br \/>\n                return default<br \/>\n            else:<br \/>\n                set_default_model(&#8220;llama2:latest&#8221;)<br \/>\n                return &#8220;&#8221;<br \/>\n    except IOError:<br \/>\n        logger.error(&#8220;Could not open file&#8221;)<br \/>\n        exit(1)<\/p>\n<p>def validate_model(model: str) -&gt; bool:<br \/>\n    available_models = get_model_list()<br \/>\n    if model in available_models:<br \/>\n        return True<br \/>\n    else:<br \/>\n        return False<\/p>\n<p>def get_ollama_model_list() -&gt; List[str]:<br \/>\n    available_models = []<\/p>\n<p>    tags = requests.get(OLLAMA_API + &#8220;\/tags&#8221;)<br \/>\n    all_models = json.loads(tags.text)<br \/>\n    for model in all_models[&#8220;models&#8221;]:<br \/>\n        available_models.append(model[&#8220;name&#8221;])<br \/>\n    return available_models<\/p>\n<p>def 
get_openai_model_list() -&gt; List[str]:<br \/>\n    return [&#8220;gpt-3.5-turbo&#8221;, &#8220;dall-e-2&#8221;]<\/p>\n<p>def get_model_list() -&gt; List[str]:<br \/>\n    ollama_models = []<br \/>\n    openai_models = []<br \/>\n    all_models = []<br \/>\n    if &#8220;OPENAI_API_KEY&#8221; in os.environ:<br \/>\n        # print(openai.Model.list())<br \/>\n        openai_models = get_openai_model_list()<\/p>\n<p>    ollama_models = get_ollama_model_list()<br \/>\n    all_models = ollama_models + openai_models<br \/>\n    return all_models<\/p>\n<p>DEFAULT_MODEL = get_default_model()<\/p>\n<p>if DEFAULT_MODEL == &#8220;&#8221;:<br \/>\n    # This is probably the first run so we need to install a model<br \/>\n    if &#8220;OPENAI_API_KEY&#8221; in os.environ:<br \/>\n        print(&#8220;No default model set. openai is enabled. using gpt-3.5-turbo&#8221;)<br \/>\n        DEFAULT_MODEL = &#8220;gpt-3.5-turbo&#8221;<br \/>\n    else:<br \/>\n        print(&#8220;No model found and openai not enabled. 
Installing llama2:latest&#8221;)<br \/>\n        pull_data = &#8216;{&#8220;name&#8221;: &#8220;llama2:latest&#8221;,&#8221;stream&#8221;: false}&#8217;<br \/>\n        try:<br \/>\n            pull_resp = requests.post(OLLAMA_API + &#8220;\/pull&#8221;, data=pull_data)<br \/>\n            pull_resp.raise_for_status()<br \/>\n        except requests.exceptions.HTTPError as err:<br \/>\n            raise SystemExit(err)<br \/>\n        set_default_model(&#8220;llama2:latest&#8221;)<br \/>\n        DEFAULT_MODEL = &#8220;llama2:latest&#8221;<\/p>\n<p>if validate_model(DEFAULT_MODEL):<br \/>\n    logger.info(&#8220;Using model: &#8221; + DEFAULT_MODEL)<br \/>\nelse:<br \/>\n    logger.error(&#8220;Model &#8221; + DEFAULT_MODEL + &#8221; not available.&#8221;)<br \/>\n    logger.info(get_model_list())<\/p>\n<p>    pull_data = &#8216;{&#8220;name&#8221;: &#8220;&#8216; + DEFAULT_MODEL + &#8216;&#8221;,&#8221;stream&#8221;: false}&#8217;<br \/>\n    try:<br \/>\n        pull_resp = requests.post(OLLAMA_API + &#8220;\/pull&#8221;, data=pull_data)<br \/>\n        pull_resp.raise_for_status()<br \/>\n    except requests.exceptions.HTTPError as err:<br \/>\n        raise SystemExit(err)<\/p>\n<p>def set_msg_send_style(received_msg: str):<br \/>\n    &#8220;&#8221;&#8221;Will return a style for the message to send based on matched words in received message&#8221;&#8221;&#8221;<br \/>\n    celebration_match = [&#8220;happy&#8221;]<br \/>\n    shooting_star_match = [&#8220;star&#8221;, &#8220;stars&#8221;]<br \/>\n    fireworks_match = [&#8220;celebrate&#8221;, &#8220;firework&#8221;]<br \/>\n    lasers_match = [&#8220;cool&#8221;, &#8220;lasers&#8221;, &#8220;laser&#8221;]<br \/>\n    love_match = [&#8220;love&#8221;]<br \/>\n    confetti_match = [&#8220;yay&#8221;]<br \/>\n    balloons_match = [&#8220;party&#8221;]<br \/>\n    echo_match = [&#8220;what did you say&#8221;]<br \/>\n    invisible_match = [&#8220;quietly&#8221;]<br \/>\n    gentle_match = []<br \/>\n    
loud_match = [&#8220;hear&#8221;]<br \/>\n    slam_match = []<\/p>\n<p>    received_msg_lower = received_msg.lower()<br \/>\n    if any(x in received_msg_lower for x in celebration_match):<br \/>\n        return &#8220;celebration&#8221;<br \/>\n    elif any(x in received_msg_lower for x in shooting_star_match):<br \/>\n        return &#8220;shooting_star&#8221;<br \/>\n    elif any(x in received_msg_lower for x in fireworks_match):<br \/>\n        return &#8220;fireworks&#8221;<br \/>\n    elif any(x in received_msg_lower for x in lasers_match):<br \/>\n        return &#8220;lasers&#8221;<br \/>\n    elif any(x in received_msg_lower for x in love_match):<br \/>\n        return &#8220;love&#8221;<br \/>\n    elif any(x in received_msg_lower for x in confetti_match):<br \/>\n        return &#8220;confetti&#8221;<br \/>\n    elif any(x in received_msg_lower for x in balloons_match):<br \/>\n        return &#8220;balloons&#8221;<br \/>\n    elif any(x in received_msg_lower for x in echo_match):<br \/>\n        return &#8220;echo&#8221;<br \/>\n    elif any(x in received_msg_lower for x in invisible_match):<br \/>\n        return &#8220;invisible&#8221;<br \/>\n    elif any(x in received_msg_lower for x in gentle_match):<br \/>\n        return &#8220;gentle&#8221;<br \/>\n    elif any(x in received_msg_lower for x in loud_match):<br \/>\n        return &#8220;loud&#8221;<br \/>\n    elif any(x in received_msg_lower for x in slam_match):<br \/>\n        return &#8220;slam&#8221;<br \/>\n    else:<br \/>\n        return\n<\/p><\/div>\n<p>Two classes, Msg and Callback, are defined to represent the structure of incoming messages and callback data. The code also includes various functions and classes to handle different aspects of the messaging platform, such as setting default models, validating models, interacting with the Sendblue API, and processing messages. 
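<\/p>\n<p>To make the message flow concrete, here is a small, self-contained sketch of the routing decision the bot makes for each incoming webhook payload. The function name route_incoming is ours for illustration; in the project, this logic lives inside the FastAPI \/msg handler in main.py:<\/p>

```python
def route_incoming(content: str, is_outbound: bool) -> str:
    """Illustrative sketch: decide how an incoming Sendblue payload is handled.

    In the real main.py, this decision is made inside the FastAPI /msg route.
    """
    if is_outbound:
        # Sendblue also delivers our own outbound messages to the webhook;
        # skip them so the bot does not reply to itself.
        return "ignore"
    if content.startswith("/"):
        # Slash commands such as /list and /install are dispatched separately.
        return "command:" + content.split()[0][1:]
    # Everything else is forwarded to the current default AI model.
    return "chat"

print(route_incoming("/list", False))
print(route_incoming("write a haiku", False))
```

<p>Messages that don\u2019t start with a slash fall through to the model, which matches the behavior described above.<\/p>\n<p>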
It also includes functions to handle slash commands, create messages from context, and append context to a file.<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\nclass Msg(BaseModel):<br \/>\n    accountEmail: str<br \/>\n    content: str<br \/>\n    media_url: str<br \/>\n    is_outbound: bool<br \/>\n    status: str<br \/>\n    error_code: int | None = None<br \/>\n    error_message: str | None = None<br \/>\n    message_handle: str<br \/>\n    date_sent: str<br \/>\n    date_updated: str<br \/>\n    from_number: str<br \/>\n    number: str<br \/>\n    to_number: str<br \/>\n    was_downgraded: bool | None = None<br \/>\n    plan: str\n<p>class Callback(BaseModel):<br \/>\n    accountEmail: str<br \/>\n    content: str<br \/>\n    is_outbound: bool<br \/>\n    status: str<br \/>\n    error_code: int | None = None<br \/>\n    error_message: str | None = None<br \/>\n    message_handle: str<br \/>\n    date_sent: str<br \/>\n    date_updated: str<br \/>\n    from_number: str<br \/>\n    number: str<br \/>\n    to_number: str<br \/>\n    was_downgraded: bool | None = None<br \/>\n    plan: str<\/p>\n<p>def msg_openai(msg: Msg, model=&#8221;gpt-3.5-turbo&#8221;):<br \/>\n    &#8220;&#8221;&#8221;Sends a message to openai&#8221;&#8221;&#8221;<br \/>\n    message_with_context = create_messages_from_context(&#8220;openai&#8221;)<\/p>\n<p>    # Add the user&#8217;s message and system context to the messages list<br \/>\n    messages = [<br \/>\n        {&#8220;role&#8221;: &#8220;user&#8221;, &#8220;content&#8221;: msg.content},<br \/>\n        {&#8220;role&#8221;: &#8220;system&#8221;, &#8220;content&#8221;: &#8220;You are an AI assistant. 
You will answer in haiku.&#8221;},<br \/>\n    ]<\/p>\n<p>    # Convert JSON strings to Python dictionaries and add them to messages<br \/>\n    messages.extend(<br \/>\n        [<br \/>\n            json.loads(line)  # Convert each JSON string back into a dictionary<br \/>\n            for line in message_with_context<br \/>\n        ]<br \/>\n    )<\/p>\n<p>    # Send the messages to the OpenAI model<br \/>\n    gpt_resp = client.chat.completions.create(<br \/>\n        model=model,<br \/>\n        messages=messages,<br \/>\n    )<\/p>\n<p>    # Append the system context to the context file<br \/>\n    append_context(&#8220;system&#8221;, gpt_resp.choices[0].message.content)<\/p>\n<p>    # Send a message to the sender<br \/>\n    msg_response = sendblue.send_message(<br \/>\n        msg.from_number,<br \/>\n        {<br \/>\n            &#8220;content&#8221;: gpt_resp.choices[0].message.content,<br \/>\n            &#8220;status_callback&#8221;: CALLBACK_URL,<br \/>\n        },<br \/>\n    )<\/p>\n<p>    return<\/p>\n<p>def msg_ollama(msg: Msg, model=None):<br \/>\n    &#8220;&#8221;&#8221;Sends a message to the ollama endpoint&#8221;&#8221;&#8221;<br \/>\n    if model is None:<br \/>\n        logger.error(&#8220;Model is None when calling msg_ollama&#8221;)<br \/>\n        return  # Optionally handle the case more gracefully<\/p>\n<p>    ollama_headers = {&#8220;Content-Type&#8221;: &#8220;application\/json&#8221;}<br \/>\n    ollama_data = (<br \/>\n        &#8216;{&#8220;model&#8221;:&#8221;&#8216; + model +<br \/>\n        &#8216;&#8221;, &#8220;stream&#8221;: false, &#8220;prompt&#8221;:&#8221;&#8216; +<br \/>\n        msg.content +<br \/>\n        &#8221; in under &#8221; +<br \/>\n        str(MAX_WORDS) +  # Make sure MAX_WORDS is a string<br \/>\n        &#8216; words&#8221;}&#8217;<br \/>\n    )<br \/>\n    ollama_resp = requests.post(<br \/>\n        OLLAMA_API + &#8220;\/generate&#8221;, headers=ollama_headers, data=ollama_data<br \/>\n    )<br \/>\n   
 response_dict = json.loads(ollama_resp.text)<br \/>\n    if ollama_resp.ok:<br \/>\n        send_style = set_msg_send_style(msg.content)<br \/>\n        append_context(&#8220;system&#8221;, response_dict[&#8220;response&#8221;])<br \/>\n        msg_response = sendblue.send_message(<br \/>\n            msg.from_number,<br \/>\n            {<br \/>\n                &#8220;content&#8221;: response_dict[&#8220;response&#8221;],<br \/>\n                &#8220;status_callback&#8221;: CALLBACK_URL,<br \/>\n                &#8220;send_style&#8221;: send_style,<br \/>\n            },<br \/>\n        )<br \/>\n    else:<br \/>\n        msg_response = sendblue.send_message(<br \/>\n            msg.from_number,<br \/>\n            {<br \/>\n                &#8220;content&#8221;: &#8220;I&#8217;m sorry, I had a problem processing that question. Please try again.&#8221;,<br \/>\n                &#8220;status_callback&#8221;: CALLBACK_URL,<br \/>\n            },<br \/>\n        )<br \/>\n    return\n<\/p><\/div>\n<p>Navigate to the app\/ directory and create a new file for adding environment variables.<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\ntouch .env<br \/>\nSENDBLUE_API_KEY=your_sendblue_api_key<br \/>\nSENDBLUE_API_SECRET=your_sendblue_api_secret<br \/>\nOLLAMA_API_ENDPOINT=http:\/\/host.docker.internal:11434\/api<br \/>\nOPENAI_API_KEY=your_openai_api_key\n<\/div>\n<p>Next, add the ngrok authtoken to the Docker Compose file. 
You can get the <a href=\"https:\/\/dashboard.ngrok.com\/get-started\/your-authtoken\" target=\"_blank\" rel=\"noopener\">authtoken from this link<\/a>.<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\nservices:<br \/>\n  lollm:<br \/>\n    build: .\/app<br \/>\n    # command:<br \/>\n      # - sleep<br \/>\n      # - 1d<br \/>\n    ports:<br \/>\n      - 8000:8000<br \/>\n    env_file: .\/app\/.env<br \/>\n    volumes:<br \/>\n      - .\/run\/lollm:\/run\/lollm<br \/>\n    depends_on:<br \/>\n      - ollama<br \/>\n    restart: unless-stopped<br \/>\n    network_mode: \"host\"<br \/>\n  ngrok:<br \/>\n    image: ngrok\/ngrok:alpine<br \/>\n    command:<br \/>\n      - \"http\"<br \/>\n      - \"8000\"<br \/>\n      - \"--log\"<br \/>\n      - \"stdout\"<br \/>\n    environment:<br \/>\n      - NGROK_AUTHTOKEN=2i6iXXXXXXXXhpqk1aY1<br \/>\n    network_mode: \"host\"<br \/>\n  ollama:<br \/>\n    image: ollama\/ollama<br \/>\n    ports:<br \/>\n      - 11434:11434<br \/>\n    volumes:<br \/>\n      - .\/run\/ollama:\/home\/ollama<br \/>\n    network_mode: \"host\"\n<\/div>\n<h2 class=\"wp-block-heading\">Running the application stack<\/h2>\n<p>Next, you can run the application stack, as follows:<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n$ docker compose up\n<\/div>\n<p>You will see output similar to the following:<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\n[+] Running 4\/4<br \/>\n \u2714 Container local-llm-messenger-ollama-1                           Create&#8230;                                          0.0s<br \/>\n \u2714 Container local-llm-messenger-ngrok-1                            Created                                            0.0s<br \/>\n \u2714 Container local-llm-messenger-lollm-1                            Recreat&#8230;                                         0.1s<br 
\/>\n ! lollm Published ports are discarded when using host network mode                                                    0.0s<br \/>\nAttaching to lollm-1, ngrok-1, ollama-1<br \/>\nollama-1  | 2024\/06\/20 03:14:46 routes.go:1011: INFO server config env=&#8221;map[OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_HOST:http:\/\/0.0.0.0:11434 OLLAMA_KEEP_ALIVE: OLLAMA_LLM_LIBRARY: OLLAMA_MAX_LOADED_MODELS:1 OLLAMA_MAX_QUEUE:512 OLLAMA_MAX_VRAM:0 OLLAMA_MODELS:\/root\/.ollama\/models OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http:\/\/localhost https:\/\/localhost http:\/\/localhost:* https:\/\/localhost:* http:\/\/127.0.0.1 https:\/\/127.0.0.1 http:\/\/127.0.0.1:* https:\/\/127.0.0.1:* http:\/\/0.0.0.0 https:\/\/0.0.0.0 http:\/\/0.0.0.0:* https:\/\/0.0.0.0:* app:\/\/* file:\/\/* tauri:\/\/*] OLLAMA_RUNNERS_DIR: OLLAMA_TMPDIR:]&#8221;<br \/>\nollama-1  | time=2024-06-20T03:14:46.308Z level=INFO source=images.go:725 msg=&#8221;total blobs: 0&#8243;<br \/>\nollama-1  | time=2024-06-20T03:14:46.309Z level=INFO source=images.go:732 msg=&#8221;total unused blobs removed: 0&#8243;<br \/>\nollama-1  | time=2024-06-20T03:14:46.309Z level=INFO source=routes.go:1057 msg=&#8221;Listening on [::]:11434 (version 0.1.44)&#8221;<br \/>\nollama-1  | time=2024-06-20T03:14:46.309Z level=INFO source=payload.go:30 msg=&#8221;extracting embedded files&#8221; dir=\/tmp\/ollama2210839504\/runners<br \/>\nngrok-1   | t=2024-06-20T03:14:46+0000 lvl=info msg=&#8221;open config file&#8221; path=\/var\/lib\/ngrok\/ngrok.yml err=nil<br \/>\nngrok-1   | t=2024-06-20T03:14:46+0000 lvl=info msg=&#8221;open config file&#8221; path=\/var\/lib\/ngrok\/auth-config.yml err=nil<br \/>\nngrok-1   | t=2024-06-20T03:14:46+0000 lvl=info msg=&#8221;starting web service&#8221; obj=web addr=0.0.0.0:4040 allow_hosts=[]<br \/>\nngrok-1   | t=2024-06-20T03:14:46+0000 lvl=info msg=&#8221;client session established&#8221; obj=tunnels.session<br \/>\nngrok-1   | 
t=2024-06-20T03:14:46+0000 lvl=info msg=&#8221;tunnel session started&#8221; obj=tunnels.session<br \/>\nngrok-1   | t=2024-06-20T03:14:46+0000 lvl=info msg=&#8221;started tunnel&#8221; obj=tunnels name=command_line addr=http:\/\/localhost:8000 url=https:\/\/94e1-223-185-128-160.ngrok-free.app<br \/>\nollama-1  | time=2024-06-20T03:14:48.602Z level=INFO source=payload.go:44 msg=&#8221;Dynamic LLM libraries [cpu cuda_v11]&#8221;<br \/>\nollama-1  | time=2024-06-20T03:14:48.603Z level=INFO source=types.go:71 msg=&#8221;inference compute&#8221; id=0 library=cpu compute=&#8221;&#8221; driver=0.0 name=&#8221;&#8221; total=&#8221;7.7 GiB&#8221; available=&#8221;3.9 GiB&#8221;<br \/>\nlollm-1   | INFO:     Started server process [1]<br \/>\nlollm-1   | INFO:     Waiting for application startup.<br \/>\nlollm-1   | INFO:     Application startup complete.<br \/>\nlollm-1   | INFO:     Uvicorn running on http:\/\/0.0.0.0:8000 (Press CTRL+C to quit)<br \/>\nngrok-1   | t=2024-06-20T03:16:58+0000 lvl=info msg=&#8221;join connections&#8221; obj=join id=ce119162e042 l=127.0.0.1:8000 r=[2401:4900:8838:8063:f0b0:1866:e957:b3ba]:54384<br \/>\nlollm-1   | OLLAMA API IS http:\/\/host.docker.internal:11434\/api<br \/>\nlollm-1   | INFO:     2401:4900:8838:8063:f0b0:1866:e957:b3ba:0 &#8211; &#8220;GET \/ HTTP\/1.1&#8221; 200 OK\n<\/div>\n<p>If you\u2019re testing it on a system without an NVIDIA GPU, then you can skip the deploy attribute of the Compose file.\u00a0<\/p>\n<p>Watch the output for your ngrok endpoint. 
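<\/p>\n<p>If you\u2019d rather not scan the logs, the ngrok agent also exposes a local inspection API at http:\/\/127.0.0.1:4040\/api\/tunnels that returns the active tunnels as JSON. The small helper below (extract_public_url is our name, not part of the project) pulls the public URL out of that payload:<\/p>

```python
import json

def extract_public_url(tunnels_payload: str):
    """Return the first https public URL from an ngrok /api/tunnels response."""
    tunnels = json.loads(tunnels_payload).get("tunnels", [])
    for tunnel in tunnels:
        url = tunnel.get("public_url", "")
        if url.startswith("https://"):
            return url
    return None

# Example payload shaped like the ngrok agent's /api/tunnels response
sample = '{"tunnels": [{"public_url": "https://94e1-223-185-128-160.ngrok-free.app", "proto": "https"}]}'
print(extract_public_url(sample))
```

<p>In a live setup you would fetch the payload with requests.get(&#8220;http:\/\/127.0.0.1:4040\/api\/tunnels&#8221;).text while the Compose stack is running.<\/p>\n<p>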
In our case, it shows: <a href=\"https:\/\/94e1-223-185-128-160.ngrok-free.app\/\" target=\"_blank\" rel=\"noopener\">https:\/\/94e1-223-185-128-160.ngrok-free.app\/<\/a><\/p>\n<p>Next, append \/msg to your ngrok URL to form the webhook URL:\u00a0<a href=\"https:\/\/94e1-223-185-128-160.ngrok-free.app\/\" target=\"_blank\" rel=\"noopener\">https:\/\/94e1-223-185-128-160.ngrok-free.app\/<\/a>msg<\/p>\n<p>Then, add it under the webhooks URL section on Sendblue and save it (Figure 3).\u00a0The ngrok service exposes the lollm service on port 8000 and provides a secure tunnel to the public internet, authenticated with the ngrok authtoken you configured.\u00a0<\/p>\n<p>The ngrok service logs confirm that the web service has started, that a client session has been established, and that the tunnel to the lollm service is up and running correctly.<\/p>\n<p><a href=\"https:\/\/www.docker.com\/wp-content\/uploads\/2024\/08\/F3-ngrok-authentication.png\" target=\"_blank\" rel=\"noopener\"><\/a><strong>Figure 3: <\/strong>Adding ngrok authentication token to webhooks.<\/p>\n<p>Ensure that there are no error logs when you run the ngrok container (Figure 4).<\/p>\n<p><a href=\"https:\/\/www.docker.com\/wp-content\/uploads\/2024\/08\/F4-error-logs.png\" target=\"_blank\" rel=\"noopener\"><\/a><strong>Figure 4: <\/strong>Checking the logs for errors.<\/p>\n<p>Ensure that the LoLLM Messenger container is actively up and running (Figure 5).<\/p>\n<p><a href=\"https:\/\/www.docker.com\/wp-content\/uploads\/2024\/08\/F5-LoLLM-running.png\" target=\"_blank\" rel=\"noopener\"><\/a><strong>Figure 5: <\/strong>Ensure the LoLLM Messenger container is running. 
<\/p>\n<p>The logs show that the Ollama service has opened the specified port (11434) and is listening for incoming connections. The logs also indicate that the Ollama service has mounted the .\/run\/ollama directory from the host machine to the \/home\/ollama directory within the container.<\/p>\n<p>Overall, the Ollama service is running correctly and is ready to provide AI models for inference.<\/p>\n<h2 class=\"wp-block-heading\">Testing the functionality<\/h2>\n<p>To test the functionality of the lollm service, you first need to add your contact number to the Sendblue dashboard. Then you should be able to send messages to the Sendblue number and observe the responses from the lollm service (Figure 6).<\/p>\n<p><a href=\"https:\/\/www.docker.com\/wp-content\/uploads\/2024\/08\/F6-test-message.png\" target=\"_blank\" rel=\"noopener\"><\/a><strong>Figure 6:<\/strong> Testing functionality of lollm service.<\/p>\n<p>The Sendblue platform will send HTTP requests to the \/msg endpoint of your lollm service, which processes these requests and returns the appropriate responses.<\/p>\n<p>The lollm service is set up to listen on port 8000.<\/p>\n<p>The ngrok tunnel is started and provides a public URL, such as <a href=\"https:\/\/94e1-223-185-128-160.ngrok-free.app\/\" target=\"_blank\" rel=\"noopener\">https:\/\/94e1-223-185-128-160.ngrok-free.app<\/a>.<\/p>\n<p>The lollm service receives HTTP requests from the ngrok tunnel, including GET requests to the root path (\/) and other paths, such as \/favicon.ico, \/predict, \/mdg, and \/msg.<\/p>\n<p>The lollm service responds to these requests with appropriate HTTP status codes, such as <strong>200 OK<\/strong> for successful requests and <strong>404 Not Found<\/strong> for requests to paths that do not exist.<\/p>\n<p>The ngrok tunnel logs the join connections, indicating that clients are connecting to the lollm service through the ngrok tunnel.<\/p>\n<p><strong>Figure 7: <\/strong>Sending requests and receiving responses.<\/p>\n<p>The first time you chat with the LLM by typing \/list (Figure 7), you can check the logs as shown:<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\nngrok-1   | t=2024-07-09T02:34:30+0000 lvl=info msg=&#8221;join connections&#8221; obj=join id=12bd50a8030b l=127.0.0.1:8000 r=18.223.220.3:44370<br \/>\nlollm-1   | OLLAMA API IS http:\/\/host.docker.internal:11434\/api<br \/>\nlollm-1   | INFO:     18.223.220.3:0 &#8211; &#8220;POST \/msg HTTP\/1.1&#8221; 200 OK<br \/>\nngrok-1   | t=2024-07-09T02:34:53+0000 lvl=info msg=&#8221;join connections&#8221; obj=join id=259fda936691 l=127.0.0.1:8000 r=18.223.220.3:36712<br \/>\nlollm-1   | INFO:     18.223.220.3:0 &#8211; &#8220;POST \/msg HTTP\/1.1&#8221; 200 OK\n<\/div>\n<p>Next, let\u2019s install the codellama model by typing \/install codellama:latest (Figure 8).<\/p>\n<p><strong>Figure 8: <\/strong>Installing the codellama model.<\/p>\n<p>You can see the following container logs once you set the default model to codellama:latest as shown:<\/p>\n<div class=\"wp-block-syntaxhighlighter-code \">\nngrok-1   | t=2024-07-09T03:39:23+0000 lvl=info msg=&#8221;join connections&#8221; obj=join id=026d8fad5c87 l=127.0.0.1:8000 r=18.223.220.3:36282<br \/>\nlollm-1   | setting default model<br \/>\nlollm-1   | INFO:     18.223.220.3:0 &#8211; &#8220;POST \/msg HTTP\/1.1&#8221; 200 OK\n<\/div>\n<p>The lollm service is running correctly and can handle HTTP requests from the ngrok tunnel. You can use the ngrok tunnel URL to test the functionality of the lollm service by sending HTTP requests to the appropriate paths (Figure 9).<\/p>\n<p><strong>Figure 9: <\/strong>Testing the messaging functionality.<\/p>\n<h2 class=\"wp-block-heading\">Conclusion<\/h2>\n<p>LoLLM Messenger is a valuable tool for developers and enthusiasts looking to push the boundaries of LLM integration within messaging apps. 
It allows developers to craft custom chatbots for specific needs, add real-time sentiment analysis to messages, or explore entirely new AI features in their messaging experience.\u00a0<\/p>\n<p>To get started, you can explore the <a href=\"https:\/\/github.com\/rothgar\/local-llm-messenger\" target=\"_blank\" rel=\"noopener\">LoLLM Messenger project on GitHub<\/a> and discover the potential of local LLMs.<\/p>\n<h2 class=\"wp-block-heading\">Learn more<\/h2>\n<p>Subscribe to the <a href=\"https:\/\/www.docker.com\/newsletter-subscription\/\">Docker Newsletter<\/a>.\u00a0<\/p>\n<p>Read the <a href=\"https:\/\/www.docker.com\/blog\/tag\/ai-ml-hackathon\/\" target=\"_blank\" rel=\"noopener\">AI\/ML Hackathon<\/a> collection.<\/p>\n<p>Get the latest release of <a href=\"https:\/\/www.docker.com\/products\/docker-desktop\/\">Docker Desktop<\/a>.<\/p>\n<p>Vote on what\u2019s next! Check out our <a href=\"https:\/\/github.com\/docker\/roadmap\" target=\"_blank\" rel=\"noopener\">public roadmap<\/a>.<\/p>\n<p>Have questions? The <a href=\"https:\/\/www.docker.com\/community\/\">Docker community is here to help<\/a>.<\/p>\n<p>New to Docker? <a href=\"https:\/\/docs.docker.com\/desktop\/\" target=\"_blank\" rel=\"noopener\">Get started<\/a>.<\/p>","protected":false},"excerpt":{"rendered":"<p>In this AI\/ML Hackathon post, we want to share another winning project from last year\u2019s Docker AI\/ML Hackathon. 
This time [&hellip;]<\/p>\n","protected":false},"author":0,"featured_media":0,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"site-sidebar-layout":"default","site-content-layout":"","ast-site-content-layout":"default","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","ast-disable-related-posts":"","theme-transparent-header-meta":"","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"default","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"footnotes":""},"categories":[4],"tags":[],"class_list":["post-1074","post","type-post","status-publish","format-standard","hentry","category-docker"],"_links":{"self":[{"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/posts\/1074","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/comments?post=1074"}],"version-history":[{"count":0,"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/posts\/1074\/revisions"}],"wp:attachment":[{"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/media?parent=1074"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/categories?post=1074"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/tags?post=1074"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}