Data Storage Connectors

Karini AI supports out-of-the-box integration with the following data connectors. This gives you the flexibility to access data from disparate data sources.

The access credentials for the data connectors must be configured in the Organization settings.

Amazon S3

  • Amazon Simple Storage Service (S3) is a scalable object storage service provided by Amazon Web Services (AWS).

  • To set up access to your data source in S3, specify the path to your S3 bucket, or to a folder within the bucket, in the recipe's storage connector. You can also use the recursive option to access data from the bucket path and all of its subfolders.
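
As a rough illustration (outside of Karini AI), the sketch below uses boto3 to list the objects that a bucket path with the recursive option would cover; the bucket name and prefix are placeholders.

  import boto3

  s3 = boto3.client("s3")
  paginator = s3.get_paginator("list_objects_v2")

  # Listing with a prefix returns objects in that folder and all of its
  # subfolders, which is what the recursive option covers.
  for page in paginator.paginate(Bucket="my-data-bucket", Prefix="documents/"):
      for obj in page.get("Contents", []):
          print(obj["Key"])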

Azure Cloud Storage

  • Azure Storage is a Microsoft-managed cloud service that provides scalable and secure storage solutions.

  • To set up access to your data source in Azure Cloud Storage, specify the Azure Cloud Storage Container Path in the recipe's storage connector.
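
For reference, the sketch below uses the azure-storage-blob SDK to browse the contents of a container path; it is only an illustration, and the account URL, container name, and credential are placeholders.

  from azure.storage.blob import ContainerClient

  container = ContainerClient(
      account_url="https://myaccount.blob.core.windows.net",
      container_name="my-container",
      credential="<account-key-or-sas-token>",
  )

  # Blobs under the given path are what the storage connector would read.
  for blob in container.list_blobs(name_starts_with="documents/"):
      print(blob.name)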

Google Cloud Storage

  • Google Cloud Storage is a service provided by Google Cloud Platform that offers highly durable and available object storage.

  • To access your data source from Google Cloud Storage, specify the full Google Cloud Storage bucket path in the recipe's storage connector.
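
A Google Cloud Storage bucket path typically takes the form gs://<bucket>/<prefix>. The sketch below uses the google-cloud-storage client to list what such a path contains; the bucket name and prefix are placeholders.

  from google.cloud import storage

  client = storage.Client()

  # Objects under gs://my-bucket/documents/ are what the connector would read.
  for blob in client.list_blobs("my-bucket", prefix="documents/"):
      print(blob.name)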

Confluence

  • Confluence is a collaboration and content management tool used by teams to create, share, and manage their work in one place. It's often used for documentation, project planning, and team collaboration.

  • In Confluence, a space is a designated area where users can organize and manage related content, such as pages, documents, and discussions. To access your data from Confluence, specify the Confluence space name in the recipe's storage connector.
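
To check which pages a space contains, you can query the Confluence Cloud REST API directly, as in the sketch below; the site URL, space key, and credentials are placeholders, and this is not how Karini AI reads the space internally.

  import requests

  base_url = "https://your-domain.atlassian.net/wiki"
  response = requests.get(
      f"{base_url}/rest/api/content",
      params={"spaceKey": "DOCS", "limit": 25},
      auth=("user@example.com", "<api-token>"),
  )

  # Titles of the pages stored in the "DOCS" space.
  for page in response.json().get("results", []):
      print(page["title"])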

Dropbox

  • Dropbox is a file hosting service that provides cloud storage, file synchronization, personal cloud, and client software. It allows users to create a special folder on their computers, which Dropbox then synchronizes so that it appears to be the same folder (with the same contents) regardless of which device is used to view it. Dropbox is often used for file sharing and collaboration.

  • To access your data from Dropbox, specify the Dropbox folder name in the recipe's storage connector.
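
The sketch below, using the official Dropbox Python SDK with a placeholder access token and folder name, shows what the named folder contains; it is an illustration only.

  import dropbox

  dbx = dropbox.Dropbox("<access-token>")

  # List the entries in the folder named in the storage connector.
  result = dbx.files_list_folder("/shared-docs")
  for entry in result.entries:
      print(entry.name)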

Box

  • Box is a cloud-based file storage and collaboration service that allows users to store, access, and share files from anywhere.

  • To access your data from Box, you need to specify the Box credentials JSON as global credentials.
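
The Box credentials JSON is the app settings file downloaded from the Box Developer Console. As a minimal sketch (assuming a JWT app configuration and a placeholder file name), the snippet below uses the boxsdk package to authenticate with such a file.

  from boxsdk import JWTAuth, Client

  # Authenticate with the credentials JSON downloaded from the Box Developer Console.
  auth = JWTAuth.from_settings_file("box_config.json")
  client = Client(auth)

  # Print the name of the service account the credentials belong to.
  print(client.user().get().name)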

Google Drive

  • Google Drive is a file storage and synchronization service developed by Google. It allows users to store files in the cloud, synchronize files across devices, and share files. Google Drive includes Google Docs, Sheets, and Slides, which enable collaborative editing of documents, spreadsheets, and presentations.

  • To access your data from Google Drive, specify the Google Drive folder ID in the recipe's storage connector. The folder ID identifies the specific folder within your Google Drive where the files or folders you want to access are stored.
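
The folder ID is the last path segment of a Google Drive folder link, i.e. https://drive.google.com/drive/folders/<FOLDER_ID>. The small sketch below extracts it from a share URL; the example URL is hypothetical.

  from urllib.parse import urlparse

  def folder_id_from_url(share_url: str) -> str:
      # Folder links look like https://drive.google.com/drive/folders/<FOLDER_ID>
      return urlparse(share_url).path.rstrip("/").split("/")[-1]

  print(folder_id_from_url("https://drive.google.com/drive/folders/1AbCdEfGhIjKlMnOp"))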

Website

A Website connector typically allows you to extract and manage data directly from websites. This can include scraping data, integrating with APIs provided by websites, or embedding website content into other applications.

Karini AI's website connector enables you to crawl your website data source using the following options.

Source Type

  1. URLs: Add up to 10 seed/starting point URLs of the websites you want to crawl. You can also include website subdomains.

  2. Sitemap: Add up to 3 sitemap URLs of the websites you want to crawl. Sitemaps help in systematically crawling and extracting data from all pages listed in the sitemap file.

  3. Source URL Files: Add up to 100 seed/starting point URLs listed in a text file stored in Amazon S3 or provided as an HTTP/HTTPS link. Each URL should be on a separate line in the text file. You can also upload the file from a local device (see the sketch after this list).

  4. Source Sitemap Files: Add up to 3 sitemap XML files stored in Amazon S3 or on a local device. Upload a file containing multiple sitemap URLs to crawl and extract data from.
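
As a minimal sketch of the expected source URL file layout (one URL per line, up to 100 entries), the snippet below checks such a file locally before uploading it; the file name is a placeholder.

  # seed_urls.txt contains one URL per line, for example:
  #   https://www.example.com/
  #   https://docs.example.com/getting-started
  with open("seed_urls.txt") as f:
      urls = [line.strip() for line in f if line.strip()]

  assert len(urls) <= 100, "Source URL files are limited to 100 URLs"
  for url in urls:
      assert url.startswith(("http://", "https://")), f"Not a valid URL: {url}"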

Configuration Settings

  • Crawl Depth: The depth, or number, of levels from the seed level to crawl. For example, the seed URL page is depth 1 and any hyperlinks on this page that are also crawled are depth 2.

  • Maximum File Size (MB): The maximum size in MB of a webpage or attachment to crawl.

  • Maximum Number of URLs Crawled per Minute per Host: Limits the rate at which the connector accesses URLs on the same host.

  • Include files in web page links: Choose to crawl files that the webpages link to.

  • Include URL Patterns: Add regular expression patterns for URLs to include in the crawl; matching URLs are crawled and any hyperlinks on those pages are also indexed (see the sketch after this list).

  • Exclude URL Patterns: Add regular expression patterns for URLs to exclude from the crawl; matching URLs are skipped and any hyperlinks on those pages are not indexed.
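
To illustrate how include and exclude URL patterns typically behave (a generic sketch, not Karini AI's exact matching logic), the snippet below filters candidate URLs with regular expressions; the patterns and URLs are placeholders.

  import re

  include_patterns = [re.compile(r"https://docs\.example\.com/.*")]
  exclude_patterns = [re.compile(r".*\.(png|jpg|zip)$")]

  def should_crawl(url: str) -> bool:
      # A URL must match at least one include pattern (if any are set)
      # and must not match any exclude pattern.
      if include_patterns and not any(p.match(url) for p in include_patterns):
          return False
      return not any(p.match(url) for p in exclude_patterns)

  print(should_crawl("https://docs.example.com/guide.html"))  # True
  print(should_crawl("https://docs.example.com/logo.png"))    # False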

Manifest

You can provide an S3 manifest file as a data source in the recipe's storage connector. The manifest file is expected to be in CSV format, with each line containing a URL as the source.
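
As a rough sketch of the expected manifest layout (one source URL per line; the file name and exact column layout are assumptions), the snippet below reads such a file:

  import csv

  # manifest.csv contains one source URL per line, for example:
  #   https://www.example.com/page-1
  #   https://www.example.com/page-2
  with open("manifest.csv", newline="") as f:
      for row in csv.reader(f):
          if row:
              print(row[0])  # the source URL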
