Bundling issues with tiktoken (Error: Missing tiktoken_bg.wasm) #1127

marcusschiesser · 2024-08-19T03:35:35Z

I am opening this ticket to gather all issues related to bundling the WASM from https://github.com/dqbd/tiktoken:

In an AWS Node.js serverless project: see "Node Serverless deployment fails due to bundling issue" #1110 (comment)
In a Next.js app deployed on Vercel: see "Error: Missing tiktoken_bg.wasm" create-llama#164 (fixed by copying the WASM file; see https://github.com/run-llama/create-llama/pull/201/files)

If you encounter this issue, please post your setup and configuration here.

LeonhardZehetgruber · 2024-08-30T12:28:13Z

I am encountering this issue when trying to integrate llamaindex into my Obsidian plugin. The build output for the plugin is a bundled main.js file.

package.json (the relevant part):

{
	"type": "module",
	"scripts": {
		"dev": "node esbuild.config.mjs"
	},
	"dependencies": {
		"llamaindex": "0.5.20"
	}
}

esbuild.config.mjs:

import esbuild from "esbuild";
import process from "node:process";
import builtins from "builtin-modules";

const context = await esbuild.context({
	entryPoints: { main: "src/main.ts" },
	bundle: true,
	platform: "node",
	external: [
		"obsidian",
		"electron",
		"sharp",
		"onnxruntime-node",
		"./xhr-sync-worker.js",
		...builtins,
	],
	mainFields: ["browser", "module", "main"],
	conditions: ["browser"],
	format: "cjs",
	target: "es2022",
	logLevel: "info",
	treeShaking: true,
	outdir: "."
});

await context.rebuild();
process.exit(0);

tsconfig.json:

{
	"compilerOptions": {
		"baseUrl": "./src",
		"target": "es2022",
		"module": "ESNext",
		"moduleResolution": "bundler",
		"esModuleInterop": true,
		"skipLibCheck": true,
		"types": [
			"node",
			"jest"
		],
		"lib": [
			"DOM",
			"ES5",
			"ES6",
			"ES7",
			"ES2021",
			"ES2022"
		]
	},
	"include": [
		"**/*.ts"
	]
}

If I now use the following in my main.ts:

import { HuggingFaceEmbedding, Settings } from 'llamaindex';

Settings.embedModel = new HuggingFaceEmbedding({
	modelType: 'nomic-ai/nomic-embed-text-v1.5',
	quantized: false
});

I get the error "Error: Missing tiktoken_bg.wasm" at node_modules/tiktoken/tiktoken.cjs in the developer console.
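
A possible workaround, not verified inside Obsidian: esbuild inlines tiktoken.cjs into main.js, but the loader most likely still reads tiktoken_bg.wasm from disk at runtime, and the bundle does not carry that file along. The sketch below is a small esbuild plugin that copies the file into outdir after each successful build. The plugin name copyTiktokenWasmPlugin and its projectRoot parameter are hypothetical, and the source path assumes tiktoken is installed directly under node_modules of the project root, which may not hold in hoisted or pnpm workspaces.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical plugin (not part of esbuild or tiktoken): after a successful
// build, copy tiktoken_bg.wasm from node_modules into esbuild's outdir so
// that it sits next to the bundled main.js.
function copyTiktokenWasmPlugin(projectRoot = process.cwd()) {
  return {
    name: "copy-tiktoken-wasm",
    setup(build) {
      build.onEnd((result) => {
        if (result.errors.length > 0) return; // skip failed builds

        // Assumption: flat node_modules layout under projectRoot.
        const source = path.join(
          projectRoot,
          "node_modules",
          "tiktoken",
          "tiktoken_bg.wasm"
        );
        const outdir = path.resolve(
          projectRoot,
          build.initialOptions.outdir ?? "."
        );

        fs.mkdirSync(outdir, { recursive: true });
        fs.copyFileSync(source, path.join(outdir, "tiktoken_bg.wasm"));
      });
    },
  };
}
```

In the esbuild.config.mjs above this would be registered as plugins: [copyTiktokenWasmPlugin()] inside the esbuild.context({...}) call. Because that config is an ES module, the two require lines would become import statements there.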

AndreMaz · 2024-09-25T13:07:24Z

In case someone else runs into the same issue, this is how I solved it.

My next.config.mjs:

import path from "path";
import { fileURLToPath } from "url";
import _jiti from "jiti";

import { withLlamaIndex } from "@web/chatbot/next";

const jiti = _jiti(fileURLToPath(import.meta.url));

// Import env files to validate at build time. Use jiti so we can load .ts files in here.
jiti("./src/env");

const isStaticExport = "false";

// Get __dirname equivalent for ES modules
const __filename = fileURLToPath(import.meta.url);
const __dirname = path.dirname(__filename);

/**
 * @type {import("next").NextConfig}
 */
const nextConfig = {
  basePath: process.env.NEXT_PUBLIC_BASE_PATH,
  serverRuntimeConfig: {
    PROJECT_ROOT: __dirname,
  },
  env: {
    BUILD_STATIC_EXPORT: isStaticExport,
  },
  // Trailing slashes must be disabled for Next Auth callback endpoint to work
  // https://stackoverflow.com/a/78348528
  trailingSlash: false,
  modularizeImports: {
    "@mui/icons-material": {
      transform: "@mui/icons-material/{{member}}",
    },
    "@mui/material": {
      transform: "@mui/material/{{member}}",
    },
    "@mui/lab": {
      transform: "@mui/lab/{{member}}",
    },
  },
  webpack(config) {
    config.module.rules.push({
      test: /\.svg$/,
      use: ["@svgr/webpack"],
    });

    // To allow chatbot to work
    // Extracted from: https://github.com/neondatabase/examples/blob/main/ai/llamaindex/rag-nextjs/next.config.mjs
    config.resolve.alias = {
      ...config.resolve.alias,
      sharp$: false,
      "onnxruntime-node$": false,
    };

    // From: https://github.com/dqbd/tiktoken?tab=readme-ov-file#nextjs
    config.experiments = {
      asyncWebAssembly: true,
      layers: true,
    };

    return config;
  },
  ...(isStaticExport === "true" && {
    output: "export",
  }),

  experimental: {
    outputFileTracingIncludes: {
      "/*": ["./cache/**/*"],
      "/api/**/*": ["./node_modules/**/*.wasm"],
    },
    serverComponentsExternalPackages: ["tiktoken", "onnxruntime-node"],
  },

  /** Enables hot reloading for local packages without a build step */
  transpilePackages: [
    "@web/api",
    "@web/auth",
    "@web/db",
    "@web/ui",
    "@web/validators",
    "@web/services",
    "@web/utils",
    "@web/logger",
    "@web/certs",
    "@web/chatbot",
  ],
  /** We already do linting and typechecking as separate tasks in CI */
  eslint: { ignoreDuringBuilds: true },
  typescript: { ignoreBuildErrors: true },
};

const withLlamaIndexConfig = withLlamaIndex(nextConfig);

export default withLlamaIndexConfig;

In my case, everything related to llamaindex lives in the package @web/chatbot, which is why even withLlamaIndex is imported from @web/chatbot/next.

Here's what the package.json of @web/chatbot looks like:

{
  "name": "@web/chatbot",
  "private": true,
  "version": "0.1.0",
  "type": "module",
  "exports": {
    ".": "./src/index.ts",
    "./next": "./src/with-lama-index.mjs"
  },
  "license": "MIT",
  "scripts": {
    "clean": "rm -rf .turbo node_modules",
    "format": "prettier --check . --ignore-path ../../.gitignore --ignore-path ../../.prettierignore",
    "lint": "eslint .",
    "typecheck": "tsc --emitDeclarationOnly"
  },
  "devDependencies": {
    "@web/eslint-config": "workspace:*",
    "@web/prettier-config": "workspace:*",
    "@web/tsconfig": "workspace:*",
    "@web/utils": "workspace:*",
    "eslint": "catalog:",
    "prettier": "catalog:",
    "typescript": "catalog:"
  },
  "prettier": "@web/prettier-config",
  "dependencies": {
    "@web/logger": "workspace:*",
    "@t3-oss/env-nextjs": "catalog:",
    "js-tiktoken": "^1.0.14",
    "llamaindex": "catalog:",
    "pg": "^8.13.0",
    "tiktoken": "^1.0.16"
  }
}

For reference: the next.config.mjs and my repo structure are based on the create-t3-turbo repo.

For more context, see #1226.
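
Most of the next.config.mjs above is project-specific. The entries that appear to concern tiktoken are the webpack experiments block, outputFileTracingIncludes for .wasm files, and serverComponentsExternalPackages. A minimal sketch isolating just those follows. The names applyTiktokenWebpack and tiktokenNextOptions are hypothetical, and unlike the original config the helper merges into any existing experiments object instead of replacing it, which is a deliberate deviation. Newer Next.js releases may have renamed or moved these experimental keys, so they should be checked against the documentation for the version in use.

```javascript
// Hypothetical helper: applies only the tiktoken-related webpack tweaks.
// It returns a new config object and leaves the input untouched.
function applyTiktokenWebpack(config) {
  return {
    ...config,
    experiments: {
      ...config.experiments, // keep experiments that were already enabled
      asyncWebAssembly: true,
      layers: true,
    },
  };
}

// The tiktoken-related Next.js options from the config above, on their own.
const tiktokenNextOptions = {
  webpack: (config) => applyTiktokenWebpack(config),
  experimental: {
    outputFileTracingIncludes: {
      // Ship .wasm files with the serverless functions behind /api routes.
      "/api/**/*": ["./node_modules/**/*.wasm"],
    },
    // Load these packages from node_modules at runtime instead of bundling.
    serverComponentsExternalPackages: ["tiktoken", "onnxruntime-node"],
  },
};
```

These options could then be spread into an existing nextConfig object before it is passed to withLlamaIndex.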

marcusschiesser mentioned this issue Aug 19, 2024

Error: Missing tiktoken_bg.wasm run-llama/create-llama#164

Closed

AndreMaz mentioned this issue Sep 20, 2024

@next/bundle-analyzer throws an error with nextjs-node-runtime example #1226

Closed

himself65 added the "bug" label Sep 30, 2024
