Merge branch 'main' of https://github.com/Daethyra/gpt-crawler

Daemon, 1 year ago
Parent
Current commit
fa2b026602
1 file changed, 9 insertions and 2 deletions

+ 9 - 2
Dockerfile

@@ -40,12 +40,19 @@ RUN npm --quiet set progress=false \
     && echo "NPM version:" \
     && npm --version
 
+# Install Python and required dependencies for the Python module
+RUN apt-get update \
+    && apt-get install -y python3 python3-pip \
+    && pip3 install beautifulsoup4 markdownify
+
+# Copy the Python script
+COPY --chown=myuser conv_html_to_markdown.py ./
+
 # Next, copy the remaining files and directories with the source code.
 # Since we do this after NPM install, quick build will be really fast
 # for most source file changes.
 COPY --chown=myuser . ./
 
-
 # Run the image. If you know you won't need headful browsers,
 # you can remove the XVFB start script for a micro perf gain.
-CMD ./start_xvfb_and_run_cmd.sh && npm run start:prod --silent
+CMD ./start_xvfb_and_run_cmd.sh && npm run start:prod --silent && python3 conv_html_to_markdown.py
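
A possible tidy-up of the added lines, offered as a sketch and not as the author's intent. It assumes the build runs as root at this point (the hunk does not show the active `USER`), that the base image is Debian-based with a pip that permits system-wide installs, and that `conv_html_to_markdown.py` is not excluded by `.dockerignore`, in which case the existing `COPY --chown=myuser . ./` already brings the script in and the separate `COPY` is redundant:

```dockerfile
# Install Python and the converter's dependencies in a single layer.
# Assumption: the build user is root here; if it is not, switch with
# `USER root` before this step and restore the original user afterwards.
RUN apt-get update \
    && apt-get install -y --no-install-recommends python3 python3-pip \
    && pip3 install --no-cache-dir beautifulsoup4 markdownify \
    && rm -rf /var/lib/apt/lists/*

# Copy the source tree; this also carries conv_html_to_markdown.py
# as long as .dockerignore does not exclude it.
COPY --chown=myuser . ./

# Shell-form CMD: each step runs only if the previous one exits with status 0,
# so the Python conversion starts only after the crawl finishes successfully.
CMD ./start_xvfb_and_run_cmd.sh && npm run start:prod --silent && python3 conv_html_to_markdown.py
```

Removing the apt package lists and skipping the pip cache keeps the added layer smaller. Because the steps are chained with `&&`, a failed crawl means the conversion script never runs.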