Skip to content

Commit b811c39

Browse files
committed
feat: expand file import to 25 formats — add TXT, YAML, TOML, TSV, and 11 more
- Added .txt, .text, .log, .rst, .ini, .conf, .cfg, .env, .properties as plain text imports - Added .yaml/.yml import with yaml code block wrapping - Added .toml import with toml code block wrapping - Added .tsv import with tab-separated Markdown table conversion - Updated file input accept attribute and dropzone label - Updated convertFileToMarkdown() public API with new converters
1 parent aaa682f commit b811c39

5 files changed

Lines changed: 140 additions & 12 deletions

File tree

CHANGELOG-file-import.md

Lines changed: 42 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,42 @@
1+
# File Import Format Expansion — 15 New File Types
2+
3+
- Added `.txt` and `.text` plain text file import (loaded directly as text, same as `.md`)
4+
- Added `.log`, `.rst`, `.ini`, `.conf`, `.cfg`, `.env`, `.properties` as plain text file types
5+
- Added `.yaml` / `.yml` import with YAML code block wrapping
6+
- Added `.toml` import with TOML code block wrapping
7+
- Added `.tsv` (tab-separated values) import with Markdown table conversion
8+
- Updated file input `accept` attribute to include all 15 new extensions + MIME types
9+
- Updated dropzone label from "MD, DOCX, XLSX, CSV, HTML, JSON, XML, PDF" to "TXT, MD, DOCX, XLSX, CSV, TSV, HTML, JSON, XML, YAML, TOML, PDF"
10+
- Updated error toast to list all supported formats
11+
- Updated `convertFileToMarkdown()` public API with new TSV/YAML/TOML converters
12+
13+
---
14+
15+
## Summary
16+
Expanded file import support from 10 to 25 file extensions, adding `.txt` and 14 other common text-based formats that were previously rejected with an "Unsupported file format" error.
17+
18+
---
19+
20+
## 1. Plain Text File Import
21+
**Files:** `js/file-converters.js`
22+
**What:** Added `text` type to `SUPPORTED_EXTENSIONS` for `.txt`, `.text`, `.log`, `.rst`, `.ini`, `.conf`, `.cfg`, `.env`, `.properties`. These are routed through `importMarkdownFile()` (reads raw text into editor, no conversion needed).
23+
**Impact:** Users can now drag-and-drop or upload plain text files directly — the most commonly expected file type that was missing.
24+
25+
## 2. YAML / TOML / TSV Converters
26+
**Files:** `js/file-converters.js`
27+
**What:** Added three new converter functions: `convertYamlToMarkdown()` (wraps in ```yaml block), `convertTomlToMarkdown()` (wraps in ```toml block), `convertTsvToMarkdown()` (parses tab-separated rows into Markdown table). All three also added to the `convertFileToMarkdown()` public API used by the Memory indexer.
28+
**Impact:** YAML configs, TOML configs, and TSV data files now import with proper syntax highlighting or table formatting.
29+
30+
## 3. HTML & Accept Attribute Updates
31+
**Files:** `index.html`
32+
**What:** Updated the `#file-input` accept attribute to include `.txt,.text,.log,.rst,.ini,.conf,.cfg,.env,.properties,.tsv,.yaml,.yml,.toml` and corresponding MIME types (`text/plain`, `text/tab-separated-values`, `application/x-yaml`). Updated the dropzone label to show the expanded format list.
33+
**Impact:** File browser dialog now shows all supported file types when clicking "Browse"; dropzone label accurately reflects supported formats.
34+
35+
---
36+
37+
## Files Changed (2 total)
38+
39+
| File | Lines Changed | Type |
40+
|------|:---:|------|
41+
| `js/file-converters.js` | +43 −4 | New extensions, converters, API |
42+
| `index.html` | +4 −4 | Accept attribute, dropzone label |

README.md

Lines changed: 9 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -30,7 +30,7 @@
3030
| **📌 AI Annotations** | Right-click context menu on selected text → 5 annotation types: ⭐ Highlight, 📝 Sticky Note, ❓ Ask AI, 🔖 Bookmark, 📖 Define; color-coded pills render inline in preview; sliding thread panel for multi-turn AI Q&A with document context, web search, and model selector; annotations stored as HTML comments in markdown source (portable, no external DB); **Study Copy** workflow for annotating shared read-only documents; `findBlockEnd()` structural insertion prevents markdown syntax breakage |
3131
| **🎤 Voice Dictation** | Dual-engine speech-to-text: **Voxtral Mini 3B** (WebGPU, primary, 13 languages, ~2.7 GB) or **Whisper Large V3 Turbo** (WASM fallback, ~800 MB) with consensus scoring; download consent popup with model info before first use; 50+ Markdown-aware voice commands — natural phrases ("heading one", "bold…end bold", "add table", "undo"); auto-punctuation via AI refinement or built-in fallback; streaming partial results |
3232
| **🔊 Text-to-Speech** | Hybrid Kokoro TTS engine — 9 languages (English, Japanese, Chinese, Spanish, French, Hindi, Italian, Portuguese) via [Kokoro 82M v1.0 ONNX](https://huggingface.co/textagent/Kokoro-82M-v1.0-ONNX) (~80 MB, off-thread WebWorker), Korean, German & others via Web Speech API fallback; **chunked synthesis** for long text (sentence-boundary splitting, ~500 chars/chunk, sequential synthesis with per-chunk progress); TTS card with separate ▶ Run (generate audio) / ▷ Play (replay) / 💾 Save (WAV download) buttons; hover any preview text and click 🔊 to hear pronunciation; voice auto-selection by language |
33-
| **Import** | MD, DOCX, XLSX/XLS, CSV, HTML, JSON, XML, PDF — drag & drop or click Upload to import |
33+
| **Import** | TXT, MD, DOCX, XLSX/XLS, CSV, TSV, HTML, JSON, XML, YAML, TOML, PDF — drag & drop or click Upload to import |
3434
| **Export** | Markdown, self-contained styled HTML, PDF (smart page-breaks, shared rendering pipeline), LLM Memory (5 formats: XML, JSON, Compact JSON, Markdown, Plain Text + shareable link) |
3535
| **Sharing** | AES-256-GCM encrypted sharing via Firebase; **compact share links** (`#s=<id>`, ~36 chars vs ~111 chars) with encryption key stored server-side; **custom named links** — optionally choose your own memorable name (e.g. `#s=mynotes`) with case-insensitive uniqueness, slug validation, and reserved-word protection; read-only shared links with auto-dismiss banner + floating "Read-only" pill indicator, **clean read-only view** (composer FAB + agent panel hidden when header collapsed), optional password protection (zero-knowledge — passphrase-derived key never stored); **view-locked links** (lock recipients to PPT or Preview mode, stored in Firestore to prevent URL tampering); **editor links** — cryptographic edit key system (`&ek=<token>`) grants write access to trusted collaborators (SHA-256 verified, AES-GCM encrypted write-token, auto-save to same document); **shared versions tracking** ("Previously Shared" panel with timestamps, view-mode badges, copy/delete actions); backward-compatible with legacy `#id=...&k=...` links |
3636
| **Presentation** | Slide mode using `---` separators, keyboard navigation, multiple layouts & transitions, speaker notes, overview grid, 20+ PPT templates with image backgrounds |
@@ -80,12 +80,16 @@ Import files directly — they're auto-converted to Markdown client-side:
8080

8181
| Format | Library | Notes |
8282
|:-------|:--------|:------|
83+
| **TXT / LOG / RST / INI / CONF** | Native | Loaded directly as text (same as .md) |
8384
| **DOCX** | Mammoth.js + Turndown.js | Preserves formatting, tables, images |
8485
| **XLSX / XLS** | SheetJS | Multi-sheet support with markdown tables |
8586
| **CSV** | Native parser | Auto-detection of delimiters |
87+
| **TSV** | Native parser | Tab-separated values → Markdown table |
8688
| **HTML** | Turndown.js | Extracts body content from full pages |
8789
| **JSON** | Native | Pretty-printed code block |
8890
| **XML** | Native | Formatted code block |
91+
| **YAML / YML** | Native | Wrapped in yaml code block |
92+
| **TOML** | Native | Wrapped in toml code block |
8993
| **PDF** | pdf.js | Page-by-page text extraction |
9094

9195
## 📤 Export Options
@@ -197,9 +201,9 @@ Import files directly — they're auto-converted to Markdown client-side:
197201
</details>
198202

199203
<details open>
200-
<summary><strong>📂 Import & Export — 8 Formats In, PDF/HTML Out</strong></summary>
204+
<summary><strong>📂 Import & Export — 15 Formats In, PDF/HTML Out</strong></summary>
201205

202-
**Import anything, export everything.** Drag and drop files in 8 formats (MD, DOCX, XLSX, CSV, HTML, JSON, XML, PDF) — all converted to Markdown client-side. Export as Markdown, HTML, or smart PDF with intelligent page breaks.
206+
**Import anything, export everything.** Drag and drop files in 15+ formats (TXT, MD, DOCX, XLSX, CSV, TSV, HTML, JSON, XML, YAML, TOML, PDF, and more) — all converted to Markdown client-side. Export as Markdown, HTML, or smart PDF with intelligent page breaks.
203207

204208
<img src="public/assets/demos/08_import_export.webp" alt="Import & Export — dropzone with 8 supported formats and export options" width="100%">
205209

@@ -442,7 +446,7 @@ graph LR
442446
|:-------|:----|
443447
| **Write** | Type or paste Markdown in the left editor panel |
444448
| **Preview** | See live rendered output in the right panel |
445-
| **Import** | Click ☁️ Upload or drag & drop — supports MD, DOCX, XLSX, CSV, HTML, JSON, XML, PDF |
449+
| **Import** | Click ☁️ Upload or drag & drop — supports TXT, MD, DOCX, XLSX, CSV, TSV, HTML, JSON, XML, YAML, TOML, PDF |
446450
| **Export** | Use the ⬇️ Export dropdown → Markdown, HTML, PDF, or LLM Memory |
447451
| **AI Assistant** | Click ✨ AI → choose a model → ask questions or use quick actions |
448452
| **Dark Mode** | Click the 🌙 moon icon |
@@ -539,7 +543,7 @@ TextAgent has undergone significant evolution since its inception. What started
539543

540544
| Date | Commits | Feature / Update |
541545
|------|---------|-----------------:|
542-
| **2026-03-30** | | 📌 **AI Annotations**new `ai-tags.js` module (1129 lines) with 5 annotation types (Highlight, Sticky Note, Ask AI, Bookmark, Define) via right-click context menu; color-coded pills in preview (⭐📝❓🔖📖); sliding thread panel for multi-turn AI Q&A on selected passages with document context, web search, and model selector; annotations stored as portable `<!-- @ai-tag: ... -->` HTML comments; Study Copy workflow for annotating shared read-only docs; `findBlockEnd()` structural block boundary detection prevents tag insertion from breaking markdown syntax (lists, tables, headings); tiered `requestAiTask` safety timeout (180s local / 60s cloud) with `_taskMessageIds` message isolation to prevent global worker listener cross-talk |
546+
| **2026-03-30** | | 📂 **File Import Expansion**added 15 new file extensions to the import pipeline: `.txt`, `.text`, `.log`, `.rst`, `.ini`, `.conf`, `.cfg`, `.env`, `.properties` as plain text; `.yaml`/`.yml` wrapped in yaml code block; `.toml` wrapped in toml code block; `.tsv` parsed as tab-separated Markdown table; updated file input `accept` attribute with all new extensions and MIME types; updated dropzone label and error toast; total supported extensions now 25 |
543547
| **2026-03-29** | | 🔑 **Secure Session Editing** — cryptographic Edit Key (`ek`) system for collaborative editing of shared documents; 24-char random edit key hashed via SHA-256 and stored as `ekHash` in Firestore; write-token encrypted with edit key via AES-GCM stored as `eWt`; editor links (`&ek=<token>`) grant write access without exposing raw write-tokens; verification in both compact (`#s=`) and secure (`#id=&secure=1`) share paths; edit mode bypasses form/quiz access gate; auto-save preserves `ekHash`/`eWt` fields; "Editor Link" section in share result modal with purple badge; "Copy All Links" includes editor link; email/download credentials include editor link; Firestore rules updated for new fields |
544548
| **2026-03-29** | | 🚀 **21× Cloud Context Window** — raised cloud AI worker context limits from 6K to 128K chars for Groq (Llama 3.3 70B) and OpenRouter (GPT-5.4, Claude, Qwen 35B); Gemini raised from 32K to 128K (4×); chat history per-message doubled from 4K to 8K chars; Gemini worker migrated to shared `ai-worker-common.js` (eliminated ~80 lines of duplicate system prompts); local Qwen workers deliberately kept at 32K chars — Qwen 3.5 hybrid GDN architecture degrades beyond 25-50K tokens for small (0.8B-4B) models; existing degenerate output circuit breaker and `presence_penalty: 2.0` safety nets preserved |
545549
| **2026-03-28** | | 🤖 **AI-Assisted Quiz Generation** — new Skill Injection system for `{{Quiz:}}` tags: `QUIZ_SYNTAX_SKILL` constant injected into AI prompts with exact pipe-format documentation for all 8 question types; `postProcessQuizLines()` auto-fixes common AI output errors (missing colons, swapped MCQ pipes, wrong types); `@prompt:` field with prompt textarea for natural-language quiz creation; per-card AI model selector dropdown; Practice/Test mode toggle button; creator Next button always enabled; flexible `{{Quiz :}}` parsing (space before colon); `requestAiTask` with try/catch and toast error handling |
Lines changed: 42 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,42 @@
1+
# File Import Format Expansion — 15 New File Types
2+
3+
- Added `.txt` and `.text` plain text file import (loaded directly as text, same as `.md`)
4+
- Added `.log`, `.rst`, `.ini`, `.conf`, `.cfg`, `.env`, `.properties` as plain text file types
5+
- Added `.yaml` / `.yml` import with YAML code block wrapping
6+
- Added `.toml` import with TOML code block wrapping
7+
- Added `.tsv` (tab-separated values) import with Markdown table conversion
8+
- Updated file input `accept` attribute to include all 15 new extensions + MIME types
9+
- Updated dropzone label from "MD, DOCX, XLSX, CSV, HTML, JSON, XML, PDF" to "TXT, MD, DOCX, XLSX, CSV, TSV, HTML, JSON, XML, YAML, TOML, PDF"
10+
- Updated error toast to list all supported formats
11+
- Updated `convertFileToMarkdown()` public API with new TSV/YAML/TOML converters
12+
13+
---
14+
15+
## Summary
16+
Expanded file import support from 10 to 25 file extensions, adding `.txt` and 14 other common text-based formats that were previously rejected with an "Unsupported file format" error.
17+
18+
---
19+
20+
## 1. Plain Text File Import
21+
**Files:** `js/file-converters.js`
22+
**What:** Added `text` type to `SUPPORTED_EXTENSIONS` for `.txt`, `.text`, `.log`, `.rst`, `.ini`, `.conf`, `.cfg`, `.env`, `.properties`. These are routed through `importMarkdownFile()` (reads raw text into editor, no conversion needed).
23+
**Impact:** Users can now drag-and-drop or upload plain text files directly — the most commonly expected file type that was missing.
24+
25+
## 2. YAML / TOML / TSV Converters
26+
**Files:** `js/file-converters.js`
27+
**What:** Added three new converter functions: `convertYamlToMarkdown()` (wraps in ```yaml block), `convertTomlToMarkdown()` (wraps in ```toml block), `convertTsvToMarkdown()` (parses tab-separated rows into Markdown table). All three also added to the `convertFileToMarkdown()` public API used by the Memory indexer.
28+
**Impact:** YAML configs, TOML configs, and TSV data files now import with proper syntax highlighting or table formatting.
29+
30+
## 3. HTML & Accept Attribute Updates
31+
**Files:** `index.html`
32+
**What:** Updated the `#file-input` accept attribute to include `.txt,.text,.log,.rst,.ini,.conf,.cfg,.env,.properties,.tsv,.yaml,.yml,.toml` and corresponding MIME types (`text/plain`, `text/tab-separated-values`, `application/x-yaml`). Updated the dropzone label to show the expanded format list.
33+
**Impact:** File browser dialog now shows all supported file types when clicking "Browse"; dropzone label accurately reflects supported formats.
34+
35+
---
36+
37+
## Files Changed (2 total)
38+
39+
| File | Lines Changed | Type |
40+
|------|:---:|------|
41+
| `js/file-converters.js` | +43 −4 | New extensions, converters, API |
42+
| `index.html` | +4 −4 | Accept attribute, dropzone label |

index.html

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -210,7 +210,7 @@
210210
<i class="bi bi-list-nested"></i>
211211
</button>
212212
<input type="file" id="file-input" class="file-input"
213-
accept=".md,.markdown,.docx,.xlsx,.xls,.csv,.html,.htm,.json,.xml,.pdf,text/markdown,application/vnd.openxmlformats-officedocument.wordprocessingml.document,application/vnd.openxmlformats-officedocument.spreadsheetml.sheet,application/vnd.ms-excel,text/csv,text/html,application/json,text/xml,application/pdf">
213+
accept=".md,.markdown,.txt,.text,.log,.rst,.ini,.conf,.cfg,.env,.properties,.docx,.xlsx,.xls,.csv,.tsv,.html,.htm,.json,.xml,.yaml,.yml,.toml,.pdf,text/markdown,text/plain,application/vnd.openxmlformats-officedocument.wordprocessingml.document,application/vnd.openxmlformats-officedocument.spreadsheetml.sheet,application/vnd.ms-excel,text/csv,text/tab-separated-values,text/html,application/json,text/xml,application/x-yaml,application/pdf">
214214

215215
<div class="dropdown">
216216
<button class="tool-button dropdown-toggle" type="button" id="exportDropdown"
@@ -426,8 +426,8 @@ <h5>Menu</h5>
426426
<i class="bi bi-x-lg"></i>
427427
</button>
428428
<p class="mb-0"><i class="bi bi-cloud-arrow-up me-1"></i>Drop your file here or click to browse
429-
<small class="ms-2" style="font-size:0.8rem; opacity:0.55; color: var(--text-color)">MD, DOCX, XLSX,
430-
CSV, HTML, JSON, XML,
429+
<small class="ms-2" style="font-size:0.8rem; opacity:0.55; color: var(--text-color)">TXT, MD, DOCX, XLSX,
430+
CSV, TSV, HTML, JSON, XML, YAML, TOML,
431431
PDF</small>
432432
</p>
433433
</div>

0 commit comments

Comments
 (0)