Teaching this "local RAG" thing about Mackerel

Setup

Download the Phi-3 model from https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-onnx. I used the cpu-int4-rtn-block-32 variant this time.

Likewise, grab the embedding model from https://huggingface.co/TaylorAI/bge-micro-v2.

Apparently Semantic Kernel and its companion packages are a good way to use these.

> dotnet add package Microsoft.SemanticKernel --version 1.20.0
> dotnet add package Microsoft.SemanticKernel.Connectors.Onnx --version 1.20.0-alpha
> dotnet add package Microsoft.SemanticKernel.Plugins.Memory --version 1.20.0-alpha

About Mackerel

The Mackerel help pages are available as Markdown files at https://github.com/mackerelio/documents, so pull them in however you like, for example as a Git submodule.

The models used here don't seem to support Japanese, so I load the English help pages.
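
Collecting the Markdown files can be sketched as below. This is a minimal, self-contained sketch and not the code used in this post: `listMarkdown` is a hypothetical helper name, and the temporary directory merely stands in for the cloned documents repository, whose actual layout is not assumed here.

```fsharp
open System.IO

// Hypothetical helper: list every Markdown file below `root`, recursively.
let listMarkdown (root: string) : string list =
    Directory.EnumerateFiles(root, "*.md", SearchOption.AllDirectories)
    |> Seq.sort
    |> Seq.toList

// Throwaway directory standing in for the cloned documents repository.
let demoRoot = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName())
Directory.CreateDirectory(Path.Combine(demoRoot, "sub")) |> ignore
File.WriteAllText(Path.Combine(demoRoot, "a.md"), "# A")
File.WriteAllText(Path.Combine(demoRoot, "sub", "b.md"), "# B")
File.WriteAllText(Path.Combine(demoRoot, "notes.txt"), "not Markdown, so it is skipped")

// Keep only the file names so the result is easy to read.
let found = listMarkdown demoRoot |> List.map (fun (p: string) -> Path.GetFileName p)
printfn "%A" found
```

`SearchOption.AllDirectories` lets the framework do the recursion, so no hand-written directory walk is needed in this variant.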

Implementation

Thrown together quickly, without much ceremony.

open System
open System.IO
open System.Threading.Tasks
open Microsoft.SemanticKernel
open Microsoft.SemanticKernel.ChatCompletion
open Microsoft.SemanticKernel.Connectors.OpenAI
open Microsoft.SemanticKernel.Embeddings
open Microsoft.SemanticKernel.Memory
open Microsoft.SemanticKernel.Plugins.Memory

module Task =
    let bind f t = task {
        let! t' = t
        return! f t'
    }

    let map f = bind (f >> Task.FromResult)

type MemoryInfo = { Id: string; Text: string }

module MemoryInfo =
    let create id text = { Id = id; Text = text }

type MackerelHelp = { Path: string; Content: string }

module MackerelHelp =
    let create path content = { Path = path; Content = content }

    let rec list dir = seq {
        for file in Directory.EnumerateFiles(dir, "*.md") do
            yield file

        for subdir in Directory.EnumerateDirectories dir do
            yield! list subdir
    }

    let read path = task {
        let! content = File.ReadAllTextAsync path
        return create path content
    }

    let toInfo h = MemoryInfo.create h.Path h.Content

    let load = list >> Seq.map (read >> Task.map toInfo)

module MemoryStore =
    let save (memory: SemanticTextMemory) collection info =
        memory.SaveInformationAsync(collection, info.Text, info.Id)

    let populate (memory: SemanticTextMemory) collection =
        Seq.map (Task.bind (save memory collection))

let phi3modelPath = Path.Join("path", "to", "cpu-int4-rtn-block-32")
let bgeModelPath = Path.Join("path", "to", "bge-micro-v2", "onnx", "model.onnx")
let vocabPath = Path.Join("path", "to", "bge-micro-v2", "vocab.txt")
let builder = Kernel.CreateBuilder()
builder
    .AddOnnxRuntimeGenAIChatCompletion("phi-3", phi3modelPath)
    .AddBertOnnxTextEmbeddingGeneration(bgeModelPath, vocabPath)
|> ignore

let kernel = builder.Build()
let chatCompletionService = kernel.GetRequiredService<IChatCompletionService>()
let embeddingGenerator = kernel.GetRequiredService<ITextEmbeddingGenerationService>()
let memoryStore = VolatileMemoryStore()
let memory = SemanticTextMemory(memoryStore, embeddingGenerator)
kernel.ImportPluginFromObject(TextMemoryPlugin(memory)) |> ignore

let docsPath = Path.Join("path", "to", "documents", "content", "docs")
let collection = "mackerel-help"
// Load every help page into the in-memory store and block until all of them have been saved.
MackerelHelp.load docsPath
|> MemoryStore.populate memory collection
|> Task.WhenAll
|> Async.AwaitTask
|> Async.RunSynchronously
|> ignore

let executionSettings =
    OpenAIPromptExecutionSettings(ToolCallBehavior = ToolCallBehavior.EnableKernelFunctions, MaxTokens = 200)

while true do
    Console.ForegroundColor <- ConsoleColor.White
    Console.Write "\n>> "

    let question = Console.ReadLine()
    let arguments = KernelArguments executionSettings
    arguments.Add("input", question)
    arguments.Add("collection", collection)

    let response =
        kernel.InvokePromptStreamingAsync(
            """
            Question: {{$input}}
            Answer the question using the memory content: {{Recall}}
            """,
            arguments
        )

    Console.ForegroundColor <- ConsoleColor.Cyan
    Console.Write "\n> "

    task {
        let enumerator = response.GetAsyncEnumerator()
        while! enumerator.MoveNextAsync() do
            Console.Write enumerator.Current
    }
    |> Async.AwaitTask
    |> Async.RunSynchronously

    Console.WriteLine()
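
The `Task.bind` and `Task.map` helpers in the listing above do most of the plumbing for loading the help pages, so here is a small sketch of how they compose on their own. It has no Semantic Kernel dependency: `readText` and `save` are hypothetical stand-ins for the file read and the memory save, and the type annotations are added only for readability.

```fsharp
open System.Threading.Tasks

module Task =
    // Await `t`, then pass its result to the task-returning function `f`.
    let bind (f: 'a -> Task<'b>) (t: Task<'a>) : Task<'b> = task {
        let! x = t
        return! f x
    }

    // Await `t`, then transform its result with the plain function `f`.
    let map (f: 'a -> 'b) (t: Task<'a>) : Task<'b> = bind (f >> Task.FromResult) t

// Hypothetical stand-ins for reading a file and saving it to memory.
let readText (path: string) : Task<string> = Task.FromResult("contents of " + path)
let save (text: string) : Task<int> = Task.FromResult text.Length

// Build one task per path, then wait for all of them together.
let pipeline : Task<int[]> =
    [ "a.md"; "b.md" ]
    |> Seq.map (readText >> Task.map (fun (s: string) -> s.ToUpperInvariant()) >> Task.bind save)
    |> Task.WhenAll

let results = pipeline.Result
printfn "%A" results
```

`map` is for steps that return a plain value and `bind` is for steps that return another task, which is why the listing uses `Task.map toInfo` but `Task.bind (save memory collection)`.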

Trying it out

First things first: what is Mackerel?

>> what is mackerel

>  Mackerel is a service that uses the concept of "roles" to manage and monitor hosts. It provides a web GUI and API for visualizing the statuses of hosts and applications with a high degree of detail and customization. Mackerel-agent is a program installed on hosts to gather and send host information and metrics to Mackerel for display and management. Mackerel supports various environments, including Linux and Windows, and is well-suited for public IaaS cloud-based systems. Metric data is saved in a time series database, with 1-minute intervals and retention periods of 460 days.

How to install the agent on Windows

>> how to install agent to windows

>  To install the mackerel-agent on a Windows host, you can follow these steps:

1. Download the installer from the provided URL: https://repo-kcps.mackerel.io/file/msi/mackerel-agent-latest-kcps.msi

2. Copy the downloaded msi file to the host that you want to monitor.

3. Install the agent using the GUI method:
   - Double-click on the downloaded msi file.
   - Follow the installation dialog.
   - When prompted for an API key, enter the API key obtained from the [API tab in the Organization page of the Mackerel Management screen](https://mackerel.io/my?tab=apikeys).

Alternatively, you can install the agent using the command line method:

1. Open the administrator's command prompt.
2. Run the following command, replacing the path with the actual path to the downloaded msi file and the API key with your actual API key:

   ```

   msiexec /qn /i "path\to\mackerel-agent-latest-kcps.msi" APIKEY="&quot;&lt;YOUR API KEY&gt;&quot;"

   ```

3. After installation, the mackerel-agent will start automatically. You can check if the agent is running by accessing the Windows Service management screen. The service name should be "mackerel-agent," and its status should be "running."

To check the mackerel-agent's log, refer to the Windows Event Log. The log will provide information about the agent's behavior.

Remember to uninstall the mackerel-agent and delete the ID files if you wish to remove the agent from the host.

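Not bad at all.

Thoughts

It's slow, though. There were also models that can use CUDA, but I couldn't get them working with this code, so I'd like to keep experimenting over time.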