LLM Integration with a TypeScript App

LLM integration in TypeScript turns static applications into AI-powered tools. This article walks through one way to do it with WebLLM.

WebLLM

Dependencies

Adding the LLM Configuration File

The configuration file tells the application to use a predefined list of models and enables web workers. It sets two properties:

model_list: Set to the model_list from prebuiltAppConfig. It contains the models the application can use. The primary model families currently supported are:

Llama: Llama 3, Llama 2, Hermes-2-Pro-Llama-3

Phi: Phi 3, Phi 2, Phi 1.5

Gemma: Gemma-2B

Mistral: Mistral-7B-v0.3, Hermes-2-Pro-Mistral-7B, NeuralHermes-2.5-Mistral-7B, OpenHermes-2.5-Mistral-7B

Qwen: Qwen2 0.5B, 1.5B, 7B

use_web_worker: Set to true, so the application uses a web worker for running tasks. Web workers run scripts in background threads, which can improve responsiveness by offloading work from the main thread.
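The two properties above can be sketched as a small typed object. This is a minimal sketch, not the original file: it assumes the model list is passed in rather than imported, so the snippet runs without the library. `ModelRecordLike`, `ChatAppConfig` and `buildAppConfig` are hypothetical names used only for illustration, and the single model entry is a placeholder.

```typescript
// Local stand-in for one entry of a model list (assumption: only the id matters here).
interface ModelRecordLike {
  model_id: string;
}

// Shape of the configuration described above.
interface ChatAppConfig {
  model_list: ModelRecordLike[];
  use_web_worker: boolean;
}

// Hypothetical helper: builds the config from a model list, web worker on by default.
function buildAppConfig(
  modelList: ModelRecordLike[],
  useWebWorker: boolean = true,
): ChatAppConfig {
  return { model_list: modelList, use_web_worker: useWebWorker };
}

// In the real app the list would come from the library instead of a literal:
//   import { prebuiltAppConfig } from "@mlc-ai/web-llm";
//   const appConfig = buildAppConfig(prebuiltAppConfig.model_list);
const appConfig = buildAppConfig([
  { model_id: "Llama-3.2-1B-Instruct-q4f32_1-MLC" }, // placeholder entry
]);
```

Keeping the flag in the config file means the worker can be switched off for debugging without touching the engine code.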

Instantiate the Engine

The engine setup performs the following four steps:

Step 1. Import All Exported Members

The import statement brings in all the exported members (functions, classes, constants, etc.) from the @mlc-ai/web-llm package and makes them available under the namespace webllm.

Step 2. Determine Whether to Use a Web Worker

Next, the use_web_worker setting is read from the appConfig object into the useWebWorker variable. This setting determines whether the application should use a web worker for running tasks.

Step 3. Declare the Engine Variable

A variable engine of type webllm.MLCEngineInterface is then declared. This variable will hold the engine instance.

Step 4. Instantiate the Engine

If useWebWorker is true:

It creates an instance of webllm.WebWorkerMLCEngine.

This instance is initialized with a new web worker, created from the worker.ts file.

The web worker is set up to run as a module.

The engine is also configured with appConfig and a log level of "INFO".

If useWebWorker is false:

It creates an instance of webllm.MLCEngine directly, without using a web worker.

This instance is also configured with appConfig.
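The branching in Step 4 can be sketched as follows. To keep the snippet runnable outside a browser, the two constructors are injected as factory functions. `EngineFactories` and `instantiateEngine` are hypothetical names, not part of WebLLM. The commented wiring follows the steps described above and assumes the constructor arguments match that description.

```typescript
// Hypothetical pair of factories, one per branch of Step 4.
interface EngineFactories<E> {
  createWorkerEngine: () => E;
  createMainThreadEngine: () => E;
}

// Calls exactly one factory, so no Worker is created when it is not needed.
function instantiateEngine<E>(
  useWebWorker: boolean,
  factories: EngineFactories<E>,
): E {
  return useWebWorker
    ? factories.createWorkerEngine()
    : factories.createMainThreadEngine();
}

// Wiring with the library, following the four steps above (browser only):
//
//   import * as webllm from "@mlc-ai/web-llm";
//
//   const useWebWorker = appConfig.use_web_worker;
//   const engine: webllm.MLCEngineInterface = instantiateEngine(useWebWorker, {
//     createWorkerEngine: () =>
//       new webllm.WebWorkerMLCEngine(
//         new Worker(new URL("./worker.ts", import.meta.url), { type: "module" }),
//         { appConfig, logLevel: "INFO" },
//       ),
//     createMainThreadEngine: () => new webllm.MLCEngine({ appConfig }),
//   });
```

A plain if/else works just as well. The factory form is used here only so that the selection logic can be exercised without a browser.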

Main Entry Point

The entry point in this example is the asynchronous CreateAsync method, which creates the ChatUI instance from the engine passed as an argument. This method sets up the UI elements for the specified engine and registers event handlers.
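The shape of such an entry point can be illustrated with a static async factory. This is a rough sketch, not the ChatUI class from the example: `ChatUISketch` and `ResettableEngine` are hypothetical, handlers are kept in a Map instead of being attached to DOM elements, and the "reset" handler is an assumed example of what might be registered.

```typescript
// Local stand-in for the engine; only the method this sketch calls is declared.
interface ResettableEngine {
  resetChat(): Promise<void>;
}

class ChatUISketch {
  // Event name -> handler. A real UI would attach these to buttons and inputs.
  readonly handlers = new Map<string, () => Promise<void>>();
  private readonly engine: ResettableEngine;

  // Private constructor: instances are only created through CreateAsync.
  private constructor(engine: ResettableEngine) {
    this.engine = engine;
  }

  // Async factory: builds the UI object for the given engine and registers handlers.
  static async CreateAsync(engine: ResettableEngine): Promise<ChatUISketch> {
    const ui = new ChatUISketch(engine);
    ui.handlers.set("reset", () => ui.engine.resetChat());
    return ui;
  }
}

// Usage (inside an async context): const ui = await ChatUISketch.CreateAsync(engine);
```

A factory method is used because a constructor cannot be awaited, so any setup that needs to wait has to happen in a separate async function.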

Chat Completion

Once the engine is successfully initialized, you can call chat completions in the OpenAI style through the engine.chat.completions interface.
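A call of this kind might look like the sketch below. The interfaces are local stand-ins that mirror the OpenAI-style request and response shape, so the snippet does not import the library. `ChatEngineLike`, `ask` and the system prompt text are assumptions made for illustration.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Only the part of the response that this sketch reads.
interface ChatCompletionLike {
  choices: { message: { content: string | null } }[];
}

// Local stand-in for the engine's OpenAI-style surface.
interface ChatEngineLike {
  chat: {
    completions: {
      create(request: { messages: ChatMessage[] }): Promise<ChatCompletionLike>;
    };
  };
}

// Sends one user question and returns the assistant's reply text ("" if there is none).
async function ask(engine: ChatEngineLike, question: string): Promise<string> {
  const reply = await engine.chat.completions.create({
    messages: [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: question },
    ],
  });
  return reply.choices[0]?.message.content ?? "";
}
```

In the application, the initialized WebLLM engine would take the place of the stand-in, provided its types line up with this shape.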

Streaming

WebLLM also supports streaming chat completions. To use it, include stream: true in the engine.chat.completions.create call.
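Consuming such a stream can be sketched as below. The function takes any async iterable of OpenAI-style chunks, so it can be tried with a fake stream. `ChatChunkLike`, `collectStream` and the `updateUi` callback in the comment are hypothetical names.

```typescript
// Only the part of a streamed chunk that this sketch reads.
interface ChatChunkLike {
  choices: { delta: { content?: string | null } }[];
}

// Appends each delta to the running text and reports the partial result after every chunk.
async function collectStream(
  chunks: AsyncIterable<ChatChunkLike>,
  onPartial: (textSoFar: string) => void,
): Promise<string> {
  let text = "";
  for await (const chunk of chunks) {
    text += chunk.choices[0]?.delta.content ?? "";
    onPartial(text);
  }
  return text;
}

// With the engine (browser only), the chunks would come from a streaming request:
//   const chunks = await engine.chat.completions.create({ messages, stream: true });
//   const answer = await collectStream(chunks, (partial) => updateUi(partial));
```

Reporting the accumulated text rather than the single delta lets the UI simply overwrite the message element on every update.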

Testing

Run `npm install` and `npm start` in CMD or PowerShell to start the application. In our case, the system automatically selected the Llama-3.2-1B-Instruct-q4f32_1-MLC model. A chatbot client had also already been developed, so it only needed to be integrated with the WebLLM interface described above.

The integrated LLM copes well with general questions that are covered by the data it was trained on. However, the model has no access to real-time data, so it cannot provide, for example, current weather updates.

The example demonstrates how to invoke chat completions using OpenAI-style chat APIs and how to enable streaming for real-time responses. These make the chat experience more dynamic and responsive.

Conclusion

LLM integration: Terms Explained

Prompt Engineering

Retrieval-Augmented Generation (RAG)

Embeddings

Vector Database

Function Calling

Context Window

Agent Frameworks

FAQ

What is LLM integration in a TypeScript application?

How do you manage prompts in a TypeScript project?

What are best practices for LLM integration in TypeScript?
