Dell PowerScale Document Loader
Dell PowerScale is an enterprise scale-out storage system that hosts the industry-leading OneFS filesystem and can be deployed on-premises or in the cloud.
This document loader uses PowerScale capabilities to determine which files have been modified since an application's last run and returns only those files for processing. This eliminates the need to re-process (chunk and embed) files that have not changed, improving the overall data ingestion workflow.
This loader requires PowerScale's MetadataIQ feature to be enabled. Additional information can be found in our GitHub repo: https://github.com/dell/powerscale-rag-connector
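To make the incremental idea concrete, here is a minimal, generic sketch of "only return files modified since the last run". It is an illustration only: it scans a local directory and compares modification times using the Python standard library, and it is not how PowerScale or MetadataIQ detects changes. The helper name `changed_since` and the file names are hypothetical.

```python
import os
import tempfile
import time
from pathlib import Path


def changed_since(root: Path, last_run: float) -> list[Path]:
    """Return files under root whose modification time is newer than last_run."""
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.stat().st_mtime > last_run
    )


# Hypothetical demo data: one file untouched since the last run, one edited after it.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    old, new = root / "old.txt", root / "new.txt"
    old.write_text("unchanged")
    new.write_text("edited")

    last_run = time.time()
    os.utime(old, (last_run - 60, last_run - 60))  # modified before the last run
    os.utime(new, (last_run + 60, last_run + 60))  # modified after the last run

    # Only the file modified after last_run needs to be chunked and embedded again.
    to_process = [p.name for p in changed_since(root, last_run)]
    print(to_process)
```

In this sketch only `new.txt` is selected for processing, so the unchanged file is never re-chunked or re-embedded.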
Overview
Integration details
Class | Package | Local | Serializable | JS support |
---|---|---|---|---|
PowerScaleDocumentLoader | powerscale-rag-connector | โ | โ | โ |
PowerScaleUnstructuredLoader | powerscale-rag-connector | โ | โ | โ |
Loader features
Source | Document Lazy Loading | Native Async Support |
---|---|---|
PowerScaleDocumentLoader | โ | โ |
PowerScaleUnstructuredLoader | โ | โ |
Setup
This document loader requires a Dell PowerScale system with MetadataIQ enabled. Additional information can be found on our GitHub page: https://github.com/dell/powerscale-rag-connector
Installation
The document loader lives in an external pip package and can be installed using standard tooling:
%pip install --upgrade --quiet powerscale-rag-connector
Initialization
Now we can instantiate the document loader: