Skip to content

feat: Files Plugin#115

Open
atilafassina wants to merge 24 commits intomainfrom
plugin/files
Open

feat: Files Plugin#115
atilafassina wants to merge 24 commits intomainfrom
plugin/files

Conversation

@atilafassina
Copy link

@atilafassina atilafassina commented Feb 18, 2026

This PR implements a first version of the Files API plugin according to the internal RFC.

Features

  • list files
  • create directory
  • upload file
  • delete file
  • preview images or text
  • supports streaming for large files

Important

this PR also touches the server plugin to bypass the express.json middleware for upload routes, otherwise the stream does not passthrough

For testing functionality, I recommend using the adjacent PR that adds it to the Dev Playground ( #114 ), I split PR so code is easier to review here.

@atilafassina atilafassina marked this pull request as ready for review February 20, 2026 13:03
Copilot AI review requested due to automatic review settings February 20, 2026 13:03
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces a comprehensive Files plugin for AppKit that enables Unity Catalog volume file operations. The plugin provides both HTTP routes and a programmatic API for common file operations including list, read, download, upload, delete, and preview. It integrates with AppKit's execution interceptor pipeline for caching, retry, and timeout handling.

The PR also modifies the server plugin to support bypassing JSON body parsing for specific routes (needed for file upload streaming), introducing a new skipBodyParsing route configuration option and getSkipBodyParsingPaths() plugin interface method.

Changes:

  • New Files plugin with full CRUD operations for Unity Catalog volumes
  • Server plugin modification to support raw body streaming for file uploads
  • Comprehensive test coverage including unit, integration, and connector tests
  • Documentation updates including README, API docs, and plugin guide

Reviewed changes

Copilot reviewed 32 out of 33 changed files in this pull request and generated 9 comments.

Show a summary per file
File Description
packages/appkit/src/plugins/files/plugin.ts Main FilesPlugin class with HTTP routes and programmatic API
packages/appkit/src/plugins/files/types.ts TypeScript type definitions for file operations
packages/appkit/src/plugins/files/defaults.ts Execution defaults for different operation tiers (read/download/write)
packages/appkit/src/plugins/files/manifest.json Plugin manifest declaring resource requirements
packages/appkit/src/connectors/files/client.ts FilesConnector implementing core file operations with telemetry
packages/appkit/src/connectors/files/defaults.ts Content-type resolution and text detection helpers
packages/appkit/src/plugins/server/index.ts Modified to support selective body parsing bypass
packages/shared/src/plugin.ts Added getSkipBodyParsingPaths() to BasePlugin interface
packages/appkit/src/plugin/plugin.ts Implementation of skipBodyParsing tracking and path registration
packages/appkit/src/plugins/files/tests/* Comprehensive test suites for plugin, connector, and integration
packages/appkit/src/plugins/server/tests/server.test.ts Tests for body parsing bypass functionality
docs/docs/plugins.md Plugin guide with Files plugin documentation and examples
packages/appkit/src/index.ts Export of files plugin and contentTypeFromPath helper
apps/dev-playground/server/index.ts Dev playground integration

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 32 out of 33 changed files in this pull request and generated 7 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@atilafassina atilafassina requested a review from Copilot February 20, 2026 15:58
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 32 out of 32 changed files in this pull request and generated 4 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 32 out of 32 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@atilafassina atilafassina requested a review from fjakobs February 23, 2026 17:30
Copy link
Member

@pkosiec pkosiec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The plugin looks awesome, great work! 👌
Could you please record a short demo how it works with the databricks apps init command and whether the apps deploy passes? Thanks in advance!

# in cli repo
make build
sudo ln -s "$(pwd)/cli" /usr/local/bin/dbx

# in appkit repo
DATABRICKS_APPKIT_TEMPLATE_PATH="/Users/pawel.kosiec/repositories/databricks-os/appkit/template" dbx apps init

- `format: "JSON"` (default) returns JSON rows
- `format: "ARROW"` returns an Arrow "statement_id" payload over SSE, then the client fetches binary Arrow from `/api/analytics/arrow-result/:jobId`

### Files plugin
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a heads-up - no changes needed for now: I split the plugins.md into multiple files here: #126

Not sure which PR will get merged first, probably yours, so of course I'll update the docs to split the files into a dedicated document 👍

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh great! I was thinking about doing it on a follow-up PR as well 🏆

telemetryExamples(),
analytics({}),
lakebaseExamples(),
files({ defaultVolume: process.env.DATABRICKS_DEFAULT_VOLUME }),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to pass the DATABRICKS_DEFAULT_VOLUME env value if it is already read by the plugin itself (at least that's what the plugin manifest says)? Looks redundant unless I'm missing something 🤔

host: "127.0.0.1",
autoStart: false,
}),
files({ defaultVolume: "/Volumes/catalog/schema/vol" }),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should it be empty if we set up the DATABRICKS_DEFAULT_VOLUME env?

Suggested change
files({ defaultVolume: "/Volumes/catalog/schema/vol" }),
files(),

?

Copy link
Collaborator

@fjakobs fjakobs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blocking - we need to solve access control before we can merge it

totalSpend: number;
/** @sqlType DATE */
createdAt: string;
;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the reverted types look like a regression

telemetryExamples(),
analytics({}),
lakebaseExamples(),
files({ defaultVolume: process.env.DATABRICKS_DEFAULT_VOLUME }),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how to we manage permissions. Can all users read and upload all the files?


Routes are mounted at `/api/files/*`.

#### Configuration
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see to problems with the proposed API:

  1. We only support a single volume. We have no way of serving files from multiple vlumes
  2. We have no access control. The API suggests that every user can read, delete, upload every file and upload to every.

I've tried to address this in my PoC like this:

https://github.com/databricks/appkit/pull/6/changes#diff-9d086cc651399510c1379ba2b4fee6aaa060fb066f4d190cb97b5be7a5a47facR18-R43

cc @MarioCadenas

});
```

#### User-scoped operations in a route handler
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how do these work?

| `GET` | `/api/files/preview?path=` | Get a file preview with text excerpt. |
| `POST` | `/api/files/upload?path=` | Upload a file (stream the request body). |
| `POST` | `/api/files/mkdir` | Create a directory (`{ path }` in body). |
| `POST` | `/api/files/delete` | Delete a file or directory (`{ path }` in body). |
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also need something like createSignedUrl() (see https://supabase.com/docs/reference/javascript/storage-from-createsignedurl) so we can download files that take longer than 60s to download.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants