Azure Cognitive Search Scripts

This directory contains automation scripts for indexing blog posts into Azure Cognitive Search, enabling enhanced search capabilities for the Azure OSS Developer Support blog.

🎯 Overview

The Azure Cognitive Search integration provides powerful search functionality by automatically indexing blog post content, metadata, and tags. The indexing process extracts structured data from Jekyll markdown files and uploads it to a configured Azure Search service.

Key Features

📋 Prerequisites

Before using these scripts, ensure you have:

🚀 Quick Start

  1. Navigate to the scripts directory:
    cd azcogsearch-scripts
    
  2. Install dependencies:
    npm install
    
  3. Configure environment variables:
    cp .env.example .env
    # Edit .env with your Azure Search service details
    
  4. Run the indexing script:
    npm run index
    

⚙️ Configuration

Environment Variables

Create a .env file in this directory with the following variables:

# Azure Search Service Configuration
AZ_SEARCH_SERVICE_NAME=your-search-service-name
AZ_SEARCH_ADMIN_KEY=your-admin-key-here
AZ_SEARCH_INDEX_NAME=blog-index

# Environment Configuration
NODE_ENV=development

Configuration Details

Variable Required Description Example
AZ_SEARCH_SERVICE_NAME Name of your Azure Search service myblog-search
AZ_SEARCH_ADMIN_KEY Admin API key for write operations 1234567890ABCDEF...
AZ_SEARCH_INDEX_NAME Target index name (defaults to blog-index) blog-posts-prod
NODE_ENV Environment for logging verbosity development or production

📂 File Structure

azcogsearch-scripts/
├── README.md           # This documentation
├── package.json        # Node.js project configuration
├── package-lock.json   # Dependency lock file
├── .env.example        # Environment variable template
├── .gitignore         # Git ignore rules
├── feed-index.js      # Main indexing script
└── blog-data.js       # Blog post processing utilities

🔧 Usage

Basic Indexing

Index all blog posts to Azure Cognitive Search:

npm run index

Development Mode

Run with enhanced logging for troubleshooting:

NODE_ENV=development npm run index

Production Mode

Run with minimal logging for CI/CD environments:

NODE_ENV=production npm run index

📊 Index Schema

The script creates documents in Azure Search with the following structure:

{
  "@search.action": "mergeOrUpload",
  "id": "2023-01-01-example-post",
  "title": "Example Blog Post Title",
  "tags": ["azure", "nodejs", "tutorial"],
  "categories": ["development", "cloud"],
  "description": "Brief description of the post (120 characters max)",
  "content": "Full post content with markdown removed",
  "url": "/2023/01/01/example-post/index.html",
  "pubDate": "2023-01-01T00:00:00.000Z"
}

Field Descriptions

🛠 Troubleshooting

Common Issues

Missing Configuration Error

❌ Missing required configuration: searchServiceName, adminApiKey

Solution: Ensure your .env file exists and contains all required variables.

Authentication Errors

❌ Error during Azure Search indexing:
   Message: Access denied
   Code: 403

Solution: Verify your admin API key is correct and has write permissions.

Network Connectivity Issues

❌ Error during Azure Search indexing:
   Message: getaddrinfo ENOTFOUND your-service.search.windows.net

Solution: Check your search service name and network connectivity.

No Posts Found Warning

⚠️ No posts found to index

Solution: Ensure blog posts exist in the _posts directory and follow naming conventions.

Debug Mode

Enable detailed error logging by setting:

NODE_ENV=development npm run index

This will provide:

🔒 Security Considerations

Best Practices

  1. Environment Variables: Never commit .env files to version control
  2. Admin Keys: Rotate admin keys regularly and use least-privilege access
  3. Managed Identity: Consider using managed identity in production environments
  4. Network Security: Restrict Azure Search service access to trusted IPs when possible

CI/CD Integration

For automated deployments, store credentials as encrypted secrets:

GitHub Actions Example:

env:
  AZ_SEARCH_SERVICE_NAME: $
  AZ_SEARCH_ADMIN_KEY: $

Azure DevOps Example:

variables:
  - group: azure-search-secrets

🧰 Development

Code Structure

Key Dependencies

Testing Changes

  1. Use development environment variables
  2. Create a test index to avoid affecting production
  3. Run with sample posts to verify functionality

📈 Performance Notes

🤝 Contributing

When modifying these scripts:

  1. Follow existing error handling patterns
  2. Maintain environment variable validation
  3. Update this README for any configuration changes
  4. Test thoroughly with both development and production configurations

📋 Available Scripts

Command Description
npm run index Run the main indexing script
npm run audit Check for security vulnerabilities
npm run audit-fix Automatically fix security issues

Need Help? Check the troubleshooting section above or review the Azure Search service logs for detailed error information.