2024-03-25

The first quarter of this year has been extremely busy, which has kept me from blogging as much as I would like. I have been carving out whatever time I can over the weekends to continue testing new Azure AI Services, but I haven’t found the time to clear my backlog of blog topics.

One of my recent tests was of the in-preview feature that uses Azure AI Search (previously known as Cognitive Search) to index a SharePoint Online document library. This feature interests me because of the large number of SharePoint libraries I work with across different clients, and the ability to use Azure OpenAI to tap into those libraries is very attractive. Copilot Studio offers a similar feature where you can easily configure a SharePoint URL for a copilot to tap into, but I still prefer Azure AI Services because the flexibility that development offers allows for much more creative solutions.

With the above said, the purpose of this post is to provide 2 scripts:

Script to create the App Registration with the appropriate permissions to access SharePoint Online

Script that will create the AI Search data source, indexer, and index

The deployment follows the steps in the following Microsoft document: https://learn.microsoft.com/en-us/azure/search/search-howto-index-sharepoint-online

Please take the time to read the supported document formats and limitations of this feature.

Step 1 – Enable system-assigned managed identity

This step is optional and isn’t needed for the configuration in this blog post because we’ll be including the tenant ID in the connection string of the data source. If your environment already uses AI Search to access storage accounts, the managed identity will likely be enabled already.
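To make the role of the tenant ID concrete, here is a minimal sketch of how the data source connection string can be assembled. The key names (SharePointOnlineEndpoint, ApplicationId, ApplicationSecret, TenantId) follow the format described in the Microsoft document; the function name and the sample values are hypothetical and only for illustration.

```python
def build_spo_connection_string(site_url, application_id, tenant_id,
                                application_secret=None):
    """Assemble the connection string for a SharePoint Online data source.

    Passing application_secret gives the application-permissions form.
    Omitting it gives the delegated-permissions form, which has no secret.
    """
    parts = [
        f"SharePointOnlineEndpoint={site_url}",
        f"ApplicationId={application_id}",
    ]
    if application_secret:
        parts.append(f"ApplicationSecret={application_secret}")
    # Including the tenant ID here is what makes the managed identity optional.
    parts.append(f"TenantId={tenant_id}")
    return ";".join(parts)


# Hypothetical placeholder values, not real identifiers.
connection_string = build_spo_connection_string(
    site_url="https://contoso.sharepoint.com/sites/Docs",
    application_id="<app-registration-client-id>",
    tenant_id="<tenant-id>",
    application_secret="<client-secret>",
)
```

In practice the secret should come from a secure store rather than being hard-coded.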

Step 2 – Decide which permissions the indexer requires

The indexer can use either delegated or application permissions, and each has advantages and disadvantages. This post demonstrates application permissions, but the script also includes commented-out code for delegated permissions.

Step 3 – Create a Microsoft Entra application registration that will be used to access the SharePoint document library

Use the following script to create the App Registration and configure the required permissions: https://github.com/terenceluk/Azure/blob/main/AI%20Services/SharePoint%20Online%20Indexer/Create-App-Registration.sh

Note that Azure CLI does not appear to provide a way to set the platform configurations, so you'll need to perform the following steps manually after the App Registration is created:

Navigate to the Authentication tab of the App Registration

Set Allow public client flows to Yes then select Save.

Select + Add a platform, then Mobile and desktop applications, then check https://login.microsoftonline.com/common/oauth2/nativeclient, then Configure.

Steps 4 to 7 – Create the SharePoint data source, indexer, and index, and get the properties of the index

Use the following PowerShell script to create and configure the above components in the desired AI Search service: https://github.com/terenceluk/Azure/blob/main/AI%20Services/SharePoint%20Online%20Indexer/Configure-AI-Search-for-SharePoint.ps1
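For readers who want to see what these three objects look like, below is a rough Python sketch of the request bodies, modelled on the samples in the Microsoft document. It is not the PowerShell script linked above. The object names, the file extension filters, the helper function names, and the API version are assumptions for illustration; check the Microsoft document for the preview API version that is current when you run this.

```python
import json
import urllib.request

# Assumption: a preview API version that supports the SharePoint indexer.
API_VERSION = "2023-10-01-Preview"


def data_source_payload(name, connection_string):
    # "defaultSiteLibrary" indexes the default document library of the site.
    return {
        "name": name,
        "type": "sharepoint",
        "credentials": {"connectionString": connection_string},
        "container": {"name": "defaultSiteLibrary", "query": None},
    }


def index_payload(name):
    def field(field_name, edm_type="Edm.String", **flags):
        base = {"name": field_name, "type": edm_type, "key": False,
                "searchable": False, "filterable": False,
                "sortable": False, "facetable": False}
        base.update(flags)
        return base

    return {
        "name": name,
        "fields": [
            field("id", key=True),
            field("metadata_spo_item_name", searchable=True),
            field("metadata_spo_item_path"),
            field("metadata_spo_item_content_type", filterable=True, facetable=True),
            field("metadata_spo_item_last_modified", "Edm.DateTimeOffset", sortable=True),
            field("metadata_spo_item_size", "Edm.Int64"),
            field("content", searchable=True),
        ],
    }


def indexer_payload(name, data_source_name, index_name):
    return {
        "name": name,
        "dataSourceName": data_source_name,
        "targetIndexName": index_name,
        "parameters": {
            "configuration": {
                # Sample filters only; adjust to the library's content.
                "indexedFileNameExtensions": ".pdf, .docx",
                "excludedFileNameExtensions": ".png, .jpg",
                "dataToExtract": "contentAndMetadata",
            }
        },
        # The SharePoint item ID is base64-encoded to form a valid document key.
        "fieldMappings": [{
            "sourceFieldName": "metadata_spo_site_library_item_id",
            "targetFieldName": "id",
            "mappingFunction": {"name": "base64Encode"},
        }],
    }


def put_resource(service_name, admin_key, collection, payload):
    """PUT one definition; collection is 'datasources', 'indexes' or 'indexers'."""
    url = (f"https://{service_name}.search.windows.net/"
           f"{collection}/{payload['name']}?api-version={API_VERSION}")
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": admin_key},
        method="PUT",
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

The objects would be sent in dependency order: the data source and index first, then the indexer that references both.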

The following components should be displayed when successfully configured:

AI Search Data Source



AI Search Indexer



AI Search Index



Test Chatbot with SharePoint Online document library data

With AI Search configured to tap into the SharePoint Online library, we can now use Azure OpenAI Studio to test chatting with the data.

Launch Azure OpenAI Studio:

Select Add your data and Add a data source:

Select Azure AI Search as the data source:

Select the appropriate subscription, AI Search service, and the index that was created.

There is also an option to customize the field mapping rather than using the default.

These two screenshots show the customization options:

Most of the YouTube videos demonstrating this configuration select “content” for all the fields. As shown in the screenshot below, this is no longer allowed as of March 23, 2024, and attempting it displays the following error message:

You cannot use the same column data in multiple fields
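If you plan the field mapping ahead of time, a small local check can catch this before you reach the Studio. The sketch below only illustrates the rule stated in the error message; it is not how Azure OpenAI Studio validates the mapping, and the slot names (content_data, file_name, title, url) are hypothetical labels for the mapping fields.

```python
from collections import Counter


def find_reused_columns(field_mapping):
    """Return the index columns that are assigned to more than one slot."""
    # Slots left empty (None or "") are ignored.
    counts = Counter(column for column in field_mapping.values() if column)
    return sorted(column for column, n in counts.items() if n > 1)


# Mapping every slot to "content", as in the older videos, is flagged.
old_style = {"content_data": "content", "file_name": "content",
             "title": "content", "url": "content"}
print(find_reused_columns(old_style))

# A mapping that uses a distinct column per slot passes the check.
distinct = {"content_data": "content",
            "file_name": "metadata_spo_item_name",
            "title": None,
            "url": "metadata_spo_item_path"}
print(find_reused_columns(distinct))
```

An empty list means no column is reused across slots.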

Proceeding to the Data management configuration will reveal that Semantic search is not available:

Only Keyword is available:

Review the configuration and complete the setup:

You should now be able to chat with your data:

Thoughts and Options

As noted at the beginning of the Microsoft document, this preview feature isn’t recommended for production workloads, and the limitations section is very clear about the alternative:

If you need a SharePoint content indexing solution in a production environment, consider creating a custom connector with SharePoint Webhooks, calling Microsoft Graph API to export the data to an Azure Blob container, and then use the Azure Blob indexer for incremental indexing.

Using the indexer against an Azure Storage account opens up text embedding model capabilities that provide vector and semantic search, which would yield much better results. However, if the requirement is simply to gain some light insight into a SharePoint document library, then piloting this preview feature while waiting for it to reach GA may be a reasonable approach.
