Commit 2cd21e6

NickL77 authored and Undertone0809 committed
Change Default GoogleDriveLoader Behavior to not Load Trashed Files (issue langchain-ai#5104) (langchain-ai#5220)
Fixes langchain-ai#5104

If you need the previous behavior of loading files that used to live in the folder but are now trashed, you can set the `load_trashed_files` parameter:

```
loader = GoogleDriveLoader(
    folder_id="1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5",
    recursive=False,
    load_trashed_files=True
)
```

As not loading trashed files should be the expected behavior, should we

1. even provide the `load_trashed_files` parameter?
2. add documentation? It feels like most users will stick with the default behavior.

## Who can review?

Community members can review the PR once tests pass. Tag maintainers/contributors who might be interested:

DataLoaders
- @eyurtsev

Twitter: [@nicholasliu77](https://twitter.com/nicholasliu77)
1 parent 2f36b0b commit 2cd21e6

1 file changed: +6 -4 lines changed
langchain/document_loaders/googledrive.py

Lines changed: 6 additions & 4 deletions
@@ -31,6 +31,7 @@ class GoogleDriveLoader(BaseLoader, BaseModel):
     file_ids: Optional[List[str]] = None
     recursive: bool = False
     file_types: Optional[Sequence[str]] = None
+    load_trashed_files: bool = False
 
     @root_validator
     def validate_inputs(cls, values: Dict[str, Any]) -> Dict[str, Any]:
@@ -215,16 +216,17 @@ def _load_documents_from_folder(
             _files = files
 
         returns = []
-        for file in _files:
-            if file["mimeType"] == "application/vnd.google-apps.document":
+        for file in files:
+            if file["trashed"] and not self.load_trashed_files:
+                continue
+            elif file["mimeType"] == "application/vnd.google-apps.document":
                 returns.append(self._load_document_from_id(file["id"]))  # type: ignore
             elif file["mimeType"] == "application/vnd.google-apps.spreadsheet":
                 returns.extend(self._load_sheet_from_id(file["id"]))  # type: ignore
             elif file["mimeType"] == "application/pdf":
                 returns.extend(self._load_file_from_id(file["id"]))  # type: ignore
             else:
                 pass
-
         return returns
 
     def _fetch_files_recursive(
@@ -238,7 +240,7 @@ def _fetch_files_recursive(
                     pageSize=1000,
                     includeItemsFromAllDrives=True,
                     supportsAllDrives=True,
-                    fields="nextPageToken, files(id, name, mimeType, parents)",
+                    fields="nextPageToken, files(id, name, mimeType, parents, trashed)",
                 )
                 .execute()
             )
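
The trashed-file check added in `_load_documents_from_folder` can be illustrated in isolation. The following is a minimal sketch, not the loader's actual code: `select_loadable_files` and `sample_files` are hypothetical names introduced here, and the dictionaries only mimic the `id`, `mimeType`, and `trashed` fields that the diff requests from the Drive API. Unlike the commit, which indexes `file["trashed"]` directly, the sketch uses `.get()` so that entries without the field are treated as not trashed; that is an assumption made for illustration.

```python
from typing import Any, Dict, List


def select_loadable_files(
    files: List[Dict[str, Any]], load_trashed_files: bool = False
) -> List[Dict[str, Any]]:
    """Hypothetical helper: keep the files that pass the trashed check."""
    selected = []
    for file in files:
        # Same condition as in the diff: skip trashed files unless the
        # caller opted in with load_trashed_files=True.
        # Assumption: a missing "trashed" key counts as not trashed.
        if file.get("trashed", False) and not load_trashed_files:
            continue
        selected.append(file)
    return selected


# Made-up metadata shaped like entries from a Drive files listing.
sample_files = [
    {"id": "a", "mimeType": "application/vnd.google-apps.document", "trashed": False},
    {"id": "b", "mimeType": "application/pdf", "trashed": True},
]

default_ids = [f["id"] for f in select_loadable_files(sample_files)]
all_ids = [f["id"] for f in select_loadable_files(sample_files, load_trashed_files=True)]
```

With the default setting only the non-trashed entry is kept; passing `load_trashed_files=True` restores the earlier behavior of keeping both.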
