
How I Created TMDX

This article explains how I came up with the idea for TMDX and how I implemented it.


Origin

I've been writing blogs for a long time, starting with Jekyll and later moving to Gatsby. However, these static site generators were cumbersome to use: I had to configure Git, learn the build process, and upload images by hand. So I switched to Medium, but its writing experience was always subpar, lacking Markdown support, code highlighting, and a clean interface. I was constantly looking for a better writing platform.

In July, I created a blog donation plugin called TonTip, which helps individual webmasters earn blockchain donation income. After promoting it, however, I found the response underwhelming. So I decided to build a blog platform that better met my own needs, one into which I could integrate any plugin I wanted.

Beginning

There are many tutorials online for building personal blogs. After reviewing some, I had a basic technical idea and decided to use:

  1. Svelte + SvelteKit
  2. Bun + TypeScript
  3. Cloudflare
  4. TailwindCSS
  5. DaisyUI
  6. Shiki
  7. Markdown-it
  8. Monaco Editor
  9. MathJax

I chose Svelte because it compiles template code into plain HTML and JS, eliminating unnecessary runtime dependencies. I picked Bun because it's a fast, increasingly popular JS runtime. TailwindCSS is a simple atomic CSS framework that lets beginners quickly build good-looking interfaces, and DaisyUI is a TailwindCSS component library that saves time on writing basic components. Markdown-it compiles the Markdown; RemarkJS was an option, but I chose Markdown-it for its simplicity. Shiki handles code syntax highlighting, enabling beautiful code blocks in Markdown. MathJax is a math typesetting library, necessary for rendering LaTeX formulas. The most important part, the editor, is based on Monaco, Microsoft's open-source editor that can be embedded in browsers and is highly extensible.

Product Design

The product design was straightforward. I referenced many blogs when designing the blog interface. For the editor, I modeled it entirely on GitHub Copilot. I'm a GitHub Copilot subscriber; it completes high-quality code and is a great productivity tool for programmers. I wanted my editor to provide intelligent completion and suggestions in the same way, improving writing efficiency and making content more professional and accurate.

Here's the editing interface of VSCode + GitHub Copilot:

[Screenshot: the VSCode + GitHub Copilot editing interface]

And here's the editing interface of TMDX:

[Screenshot: the TMDX editing interface]

The overall functionality and interface design are similar.

Image Library Implementation

An image library is crucial for the editing experience. Nobody wants to dig through local folders to find and upload images while writing. So I built in an image upload and management feature: users upload images directly in the image library, the images are automatically stored in the cloud, and the links are inserted into the article. This simplifies the workflow and gives images unified management and quick access.

Here's the interface of the image library:

[Screenshot: the image library interface]

It supports file selection via mouse click, Ctrl + V paste upload, and drag-and-drop file upload.

Sometimes users don't want to go through the image library, so I also listen to editor events to support paste and drag-and-drop uploads directly inside the editor:

editor.getDomNode()?.addEventListener("paste", handlePaste);
editor.getDomNode()?.addEventListener("dragover", handleDragOver);
editor.getDomNode()?.addEventListener("drop", handleDrop);
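
For illustration, here is a minimal sketch of what a paste handler like this could look like. The /api/upload endpoint, its response shape, and the edit source id are assumptions, not the exact implementation:

// Sketch of a paste handler: upload the pasted image, then insert a Markdown link.
// Assumes a hypothetical /api/upload endpoint that returns { url }.
async function handlePaste(e: ClipboardEvent) {
    const file = Array.from(e.clipboardData?.files ?? []).find((f) =>
        f.type.startsWith("image/"),
    );
    if (!file) return; // plain-text paste: let Monaco handle it
    e.preventDefault();

    const form = new FormData();
    form.append("file", file);
    const res = await fetch("/api/upload", { method: "POST", body: form });
    const { url } = await res.json();

    // insert a Markdown image link at the current cursor position
    const position = editor.getPosition();
    if (!position) return;
    editor.executeEdits("image-upload", [
        {
            range: new monaco.Range(
                position.lineNumber,
                position.column,
                position.lineNumber,
                position.column,
            ),
            text: `![image](${url})`,
        },
    ]);
}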

AI Chat Implementation

Since the release of ChatGPT in late 2022, many chatbots have emerged and API costs have dropped significantly. Implementing a chat interface is simple: display questions and answers alternately in a container, with a textarea at the bottom for editing and sending questions. Because chat API responses are slow and arrive token by token, I used Svelte's rune feature for reactive programming, so each token returned by the API is displayed on the interface immediately.

This screenshot shows how I combined reactive state with the SSE (server-sent events) streaming of the HTTP interface to update messages and refresh the UI:

[Screenshot: reactive message updates driven by the SSE stream]
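
In code, the pattern looks roughly like the sketch below; the /api/chat endpoint and message shape are illustrative, and the SSE framing is simplified to reading the response body as raw text:

// Sketch of streaming chat with Svelte 5 runes ($state makes the array deeply reactive).
let messages = $state<{ role: string; content: string }[]>([]);

async function ask(question: string) {
    messages.push({ role: "user", content: question });
    messages.push({ role: "assistant", content: "" });
    const answer = messages[messages.length - 1];

    const res = await fetch("/api/chat", {
        method: "POST",
        body: JSON.stringify({ messages }),
    });
    const reader = res.body!.getReader();
    const decoder = new TextDecoder();

    while (true) {
        const { done, value } = await reader.read();
        if (done) break;
        // every chunk appended here re-renders the chat immediately
        answer.content += decoder.decode(value, { stream: true });
    }
}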

Initially I only integrated the DeepSeek model, but I later tested other models such as ChatGLM, found them good, and added them as well. The supported models now include:

  1. DeepSeek: Text-only dialogue
  2. GLM: Text-only dialogue
  3. Flux: Text-to-image generation
  4. CogView: Text-to-image generation
  5. CogVideoX: Text-to-video generation

For text-to-image, I personally like using it to generate blog cover images; it makes producing a high-quality cover easy, with no manual design or hunting for stock photos.

For example, I asked Flux to generate an image with the prompt:

"blue sky, white cloud"

It returned this image:

[Generated image: blue sky, white cloud]

I just need to copy the link and set it as the cover field in the frontmatter.
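
As a hypothetical illustration only, here is one way Flux could be called if it were served through Cloudflare Workers AI; the binding names (AI, BUCKET), the model id, the storage step, and the returned URL are all assumptions rather than the actual setup:

// Hypothetical sketch: generate a cover image with Flux on Workers AI and store it in R2.
export async function generateCover(env: Env, prompt: string): Promise<string> {
    const result = (await env.AI.run("@cf/black-forest-labs/flux-1-schnell", {
        prompt, // e.g. "blue sky, white cloud"
    })) as { image: string };
    // the model returns the image as a base64 string
    const bytes = Uint8Array.from(atob(result.image), (c) => c.charCodeAt(0));
    const key = `covers/${crypto.randomUUID()}.png`;
    await env.BUCKET.put(key, bytes, { httpMetadata: { contentType: "image/png" } });
    return `/images/${key}`; // link to paste into the cover field
}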

AI Completion Implementation

AI completion is a technically challenging task. I still remember the praise and shock when TabNine first appeared in 2019. Then came GitHub Copilot, but LLM technology was not yet mature and its completions were not very accurate. With the emergence of ChatGPT in late 2022, people realized that large models could be used for code completion, and Microsoft quickly updated GitHub Copilot and iterated until the completions were excellent. I decided to integrate AI completion into my editor so users could enjoy intelligent completion similar to GitHub Copilot while writing. To achieve this, I chose the DeepSeek model as the backend, as it excels at natural language processing and code completion. Through API calls, the editor fetches completion suggestions in real time and displays them inline; the user just presses Tab to accept a suggestion.

Monaco Editor provides completion interfaces, and we just need to implement these interfaces to get the desired completion content:

monaco.languages.registerInlineCompletionsProvider("markdown", {
    provideInlineCompletions: async (model, position, context, token) => {
        // generate completions... then return them as inline completion items,
        // e.g. { items: [{ insertText: suggestion }] }
    },
    // required by the provider interface; nothing to clean up here
    freeInlineCompletions: () => {},
});

The core challenge is to quickly, efficiently, and accurately generate completion content. I referenced many implementations, including:

  1. https://spencerporter2.medium.com/building-copilot-on-the-web-f090ceb9b20b
  2. https://github.com/arshad-yaseen/monacopilot
  3. https://sourcegraph.com/blog/the-lifecycle-of-a-code-ai-completion

Currently, I use the simplest method: periodically sending completion requests to AI and caching the content for display when needed.
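
A rough sketch of that approach is below; the debounce interval, endpoint, and cache shape are assumptions, not the exact implementation:

// Debounced completion requests with a simple cache that
// provideInlineCompletions can read from on its next call.
let cachedCompletion = "";
let timer: ReturnType<typeof setTimeout> | undefined;

function scheduleCompletion(prefix: string) {
    clearTimeout(timer);
    timer = setTimeout(async () => {
        const res = await fetch("/api/complete", {
            method: "POST",
            body: JSON.stringify({ prefix }),
        });
        const { completion } = await res.json();
        cachedCompletion = completion; // served to the editor when completions are requested
    }, 500);
}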

I have a table recording completion times. Here's a screenshot of some records:

[Screenshot: completion-time records]

The third column is the length of the generated completion, and the fourth column is the API response time. Completions generally return within 5 seconds, which is acceptable for content creators, though there's room for optimization.

I haven't formally measured completion quality or acceptance rate. While editing this article I took a few screenshots of completion results, and personally I think they're good:

[Screenshots: AI completion suggestions captured while editing this article]

AI Search Implementation

I placed a search box in the Navbar:

[Screenshot: the search box in the Navbar]

Unlike traditional search built on Elasticsearch or Algolia, I implemented AI search on top of Cloudflare's Vectorize vector database. Each time a user publishes an article, the AI analyzes it, generates embeddings, and saves them to the vector database. When a user searches, the query is compared by vector similarity rather than by keyword matching, which captures the search intent more accurately and returns more relevant results.
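
Here is a minimal sketch of that flow on a Cloudflare Worker; the binding names (AI, VECTORIZE), the embedding model, and the metadata shape are assumptions:

// Index an article: embed its text with Workers AI and upsert the vector into Vectorize.
export async function indexPost(env: Env, id: string, text: string) {
    const { data } = await env.AI.run("@cf/baai/bge-base-en-v1.5", { text: [text] });
    await env.VECTORIZE.upsert([{ id, values: data[0], metadata: { id } }]);
}

// Search: embed the query and return the most similar articles.
export async function search(env: Env, query: string) {
    const { data } = await env.AI.run("@cf/baai/bge-base-en-v1.5", { text: [query] });
    const result = await env.VECTORIZE.query(data[0], { topK: 5 });
    return result.matches; // [{ id, score, ... }]
}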

AI Translation Implementation

As a non-native English speaker with limited English, I want my articles to be understood by people worldwide, so I decided to integrate AI translation into the platform. Users can choose to translate an article into multiple languages when publishing, which broadens the audience. Initially I considered Google Translate, but it couldn't understand context well and the translations came out stiff. After testing several AI translation APIs, I decided to use DeepSeek for translation:

import OpenAI from "openai";

// DeepSeek exposes an OpenAI-compatible API, so the official SDK works as the client
const aiClient = new OpenAI({
    baseURL: "https://api.deepseek.com",
    apiKey: process.env.DEEPSEEK_API_KEY,
});

const systemPrompt = `You are an AI translation assistant, please translate the user's markdown file into <lang>${lang}</lang>,

Requirements:
1. Skip images, links, quotes, and code in Markdown
2. If there is frontmatter, only translate the title and description fields
3. If unable to translate to the target language, keep the original text
`;
const output = await aiClient.chat.completions.create({
    model: "deepseek-chat",
    messages: [
        { role: "system", content: systemPrompt },
        { role: "user", content: mdContent },
    ],
    stream: false,
});
const translated = output.choices[0].message.content;

Isn't it simple?

Using translation is also easy. There's a language selection bar at the top of the article. Just select the language you want to read:

[Screenshot: the language selection bar above the article]

Conclusion

I've spent two months building TMDX on my own, and its functionality is still rudimentary; it needs continuous iteration. What do you think of the idea?

