Build an End-to-End Data Capture Pipeline using Document AI

In this project, learners will do the following.
1 hour
Beginner
No download needed
Shareable certificate
English
Desktop only

This is a self-paced lab that takes place in the Google Cloud console. In this lab, you use Cloud Functions and Pub/Sub to create an end-to-end document processing pipeline using Document AI. The Document AI API is a document understanding solution that takes unstructured data, such as documents and emails, and makes the data easier to understand, analyze, and consume.

The pipeline automatically processes documents that are uploaded to Cloud Storage. It consists of a primary Cloud Function that processes each new file uploaded to Cloud Storage with a Document AI form processor and then saves the form data detected in that file to BigQuery. If the form data includes any address fields, the address data is written to a Pub/Sub topic, which in turn triggers a second Cloud Function. That function uses the Geocoding API to look up geographic coordinates for the address, and the coordinates are also written to BigQuery.

This is a simple pipeline that uses a general form processor, which detects basic form data such as a labelled field containing address information. Document AI processors that use one of the specialized parsers, which are beyond the scope of this lab, provide enhanced entity information for specific document types even when those documents do not include labelled fields. For example, a Document AI Invoice parser can provide detailed address and supplier information from an unlabelled invoice document because it understands the layout of invoices.
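The routing step in the primary Cloud Function can be pictured with a small sketch. This is not the lab's actual code: the function name route_form_fields, the row columns (file_name, field_name, field_value), the message layout, and the rule that a label containing the word "address" marks an address field are all assumptions made for illustration. The sketch leaves out the Google Cloud client calls and only shows how detected form fields might be split into rows destined for BigQuery and byte payloads destined for the Pub/Sub topic.

```python
import json
from typing import Dict, List, Tuple

# Assumption: a field counts as an address if its label contains this word.
ADDRESS_KEYWORD = "address"


def route_form_fields(
    file_name: str, fields: Dict[str, str]
) -> Tuple[List[dict], List[bytes]]:
    """Split detected form fields into BigQuery rows and Pub/Sub payloads.

    `fields` maps a detected label to its detected value, as a form
    processor might return them after parsing one uploaded file.
    """
    rows: List[dict] = []
    messages: List[bytes] = []
    for label, value in fields.items():
        # Form labels often carry trailing colons and stray whitespace.
        clean_label = label.strip().rstrip(":").strip()
        clean_value = value.strip()

        # Every detected field becomes one row (hypothetical schema).
        rows.append(
            {
                "file_name": file_name,
                "field_name": clean_label,
                "field_value": clean_value,
            }
        )

        # Address fields are additionally queued for geocoding.
        # Pub/Sub message data is bytes, so the payload is JSON-encoded.
        if ADDRESS_KEYWORD in clean_label.lower():
            payload = {"file_name": file_name, "address": clean_value}
            messages.append(json.dumps(payload).encode("utf-8"))
    return rows, messages


if __name__ == "__main__":
    sample = {"Name:": "Jane Doe", "Address:": "1 Main St, Springfield"}
    rows, messages = route_form_fields("form.pdf", sample)
    print(rows)
    print(messages)
```

In a real deployment, the rows would be passed to a BigQuery insert call and each payload would be published to the topic that triggers the geocoding function. Keeping the routing logic free of client calls makes it easy to test without a Google Cloud project.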

Skills you will develop

  • Cloud Storage

  • Business Process

  • Cloud API

  • Automations

How the project works

Learn new tools or skills in an interactive, hands-on environment.

Access software and tools in a cloud workspace with no download required.

Offered by:

Google Cloud

Frequently asked questions