Original Architecture v1

Overview

BILL is a plugin-based AI agent that maintains a consistent personality across multiple platforms while keeping platform-specific conversations separate through a dual-layer memory architecture. The system supports multiple LLM providers via OpenRouter, uses GPT-4o for image analysis, and generates images with DALL-E 3.

Core Architecture

(Architecture diagram)

Enhanced Components

LLM Router

Intelligent routing between multiple LLM providers based on task requirements:

interface LLMTask {
  type: 'text' | 'code' | 'analysis' | 'creative';
  complexity: 'simple' | 'medium' | 'complex';
  platform: string;
  requiresVision?: boolean;
}

// LLMProvider is the adapter contract implemented by each configured model
// (the OpenRouter models and the direct OpenAI client); its definition lives
// with the provider configuration below.
class LLMRouter {
  private providers: Map<string, LLMProvider>;
  private fallbackChain: string[];

  constructor(providers: Map<string, LLMProvider>, fallbackChain: string[] = []) {
    this.providers = providers;
    this.fallbackChain = fallbackChain; // tried in order if the selected provider fails
  }

  async selectProvider(task: LLMTask): Promise<LLMProvider> {
    // Vision tasks always go to GPT-4o.
    if (task.requiresVision) {
      return this.providers.get('gpt-4o')!;
    }

    // Complex coding work goes to Claude 3.5 Sonnet.
    if (task.type === 'code' && task.complexity === 'complex') {
      return this.providers.get('claude-3.5-sonnet')!;
    }

    // Simple tasks use the cheaper GPT-4 Turbo tier.
    if (task.complexity === 'simple') {
      return this.providers.get('gpt-4-turbo')!;
    }

    // Default for everything else.
    return this.providers.get('claude-3.5-sonnet')!;
  }
}
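
For example, once the providers map has been populated from the OpenRouter configuration, a complex coding request routes to Claude 3.5 Sonnet (a usage sketch; `openRouterProviders` is a hypothetical map built at startup):

// Hypothetical startup wiring; provider keys mirror those used in selectProvider().
const router = new LLMRouter(openRouterProviders, ['claude-3.5-sonnet', 'gpt-4-turbo']);

const provider = await router.selectProvider({
  type: 'code',
  complexity: 'complex',
  platform: 'telegram',
});
// -> resolves to the 'claude-3.5-sonnet' provider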

Image Generation System

Handles image analysis (GPT-4o vision) and image creation (DALL-E 3 via the OpenAI API):
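
A minimal sketch of the two halves, assuming the official `openai` Node SDK; the `ImageGenerator` class and its method names are illustrative, not the project's actual module:

import OpenAI from 'openai';

// Illustrative sketch: analysis uses GPT-4o vision, creation uses DALL-E 3.
class ImageGenerator {
  constructor(private openai: OpenAI) {}

  // Describe an attached image so the description can be stored in memory.
  async analyze(imageUrl: string): Promise<string> {
    const res = await this.openai.chat.completions.create({
      model: 'gpt-4o',
      messages: [{
        role: 'user',
        content: [
          { type: 'text', text: 'Describe this image in detail.' },
          { type: 'image_url', image_url: { url: imageUrl } },
        ],
      }],
    });
    return res.choices[0].message.content ?? '';
  }

  // Generate an image; defaults mirror the cost guidance later in this document.
  async generate(prompt: string): Promise<string> {
    const res = await this.openai.images.generate({
      model: 'dall-e-3',
      prompt,
      size: '1024x1024',
      quality: 'standard',
      n: 1,
    });
    return res.data?.[0]?.url ?? '';
  }
}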

Dual-Layer Memory Architecture

Memory Isolation Strategy

The system implements a dual-layer memory approach to prevent context bleeding between platforms while still enabling knowledge sharing (a type sketch of the two layers follows the lists below):

Platform-Specific Memory

  • Purpose: Store conversation history and context per platform

  • Scope: Isolated to individual platforms (Twitter, Telegram)

  • Storage: Supabase tables with platform-specific schemas

  • Access Pattern: Recent conversation retrieval, thread context

  • Image Support: Store image URLs and analysis results

Shared Memory

  • Purpose: Cross-platform knowledge base and semantic search

  • Scope: Available to all platforms

  • Storage: Pinecone vector database with embeddings

  • Access Pattern: Semantic similarity search, fact retrieval

  • Image Support: Store image descriptions and generated content metadata
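
The split can be captured as two record shapes, sketched below; the field names are illustrative assumptions rather than the actual Supabase columns or Pinecone metadata keys:

// Illustrative shapes only; the real schemas live in the configuration sections below.

// Platform-specific layer: one row per message, isolated by platform.
interface PlatformMemoryRecord {
  platform: 'twitter' | 'telegram';
  conversationId: string;        // thread / chat identifier
  role: 'user' | 'agent';
  content: string;
  imageUrls?: string[];          // attached or generated images
  imageAnalysis?: string;        // GPT-4o vision output
  createdAt: string;
}

// Shared layer: embedded knowledge available to every platform.
interface SharedMemoryRecord {
  id: string;
  embedding: number[];           // 1536-dim text-embedding-ada-002 vector
  text: string;
  metadata: {
    sourcePlatform: string;
    kind: 'fact' | 'image-description' | 'generated-content';
  };
}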

Enhanced Memory Flow

(Enhanced memory flow diagram)

Database Configuration

Supabase Schema Design

Platform-Specific Tables with Image Support

Indexes for Performance

Pinecone Vector Database Configuration

Index Structure

  • Index Name: bill-agent

  • Dimensions: 1536 (OpenAI text-embedding-ada-002)

  • Metric: Cosine similarity

  • Pod Type: s1.x1 (starter)
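
A sketch of creating this index with the Pinecone Node client; the request shape follows the SDK's pod-based `spec`, and the `environment` value is a placeholder assumption:

import { Pinecone } from '@pinecone-database/pinecone';

const pinecone = new Pinecone({ apiKey: process.env.PINECONE_API_KEY! });

// One-time setup matching the configuration above.
await pinecone.createIndex({
  name: 'bill-agent',
  dimension: 1536,                   // text-embedding-ada-002
  metric: 'cosine',
  spec: {
    pod: {
      environment: 'us-east-1-aws',  // placeholder; use the project's Pinecone environment
      podType: 's1.x1',
      pods: 1,
    },
  },
});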

Namespace Organization

Metadata Schema

LLM Provider Configuration

OpenRouter Integration

Provider Selection Logic

Component Details

Agent Runtime

The core processing engine that coordinates all system components:
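
A sketch of how the runtime could be composed; the class and member names (including `PlatformPlugin`) are illustrative assumptions rather than the actual implementation:

// Illustrative composition of the components described in this document.
class AgentRuntime {
  constructor(
    private plugins: Map<string, PlatformPlugin>, // twitter, telegram, ...
    private memory: MemoryManager,
    private llmRouter: LLMRouter,
    private images: ImageGenerator,
  ) {}

  // handleMessage() would implement the flow described under
  // "Message Processing Pipeline" below.
}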

Memory Manager

Coordinates access to both platform-specific and shared memory systems:
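
A sketch of that coordination, assuming the official `@supabase/supabase-js` and `@pinecone-database/pinecone` clients; the method names and the `messages` table name are illustrative:

import { SupabaseClient } from '@supabase/supabase-js';
import { Index } from '@pinecone-database/pinecone';

class MemoryManager {
  constructor(
    private supabase: SupabaseClient,   // platform-specific layer
    private sharedIndex: Index,         // Pinecone 'bill-agent' index, shared layer
  ) {}

  // Recent platform-local history (illustrative table name 'messages').
  async recentHistory(platform: string, conversationId: string, limit = 20) {
    const { data, error } = await this.supabase
      .from('messages')
      .select('*')
      .eq('platform', platform)
      .eq('conversation_id', conversationId)
      .order('created_at', { ascending: false })
      .limit(limit);
    if (error) throw error;
    return data;
  }

  // Cross-platform semantic search over the shared layer.
  async searchShared(embedding: number[], topK = 5) {
    const res = await this.sharedIndex.query({ vector: embedding, topK, includeMetadata: true });
    return res.matches;
  }
}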

Context Building Strategy

The system builds rich context by combining multiple sources (a sketch of the assembly step follows the list):

  1. Character System Prompt: Base personality and expertise

  2. Platform-Specific History: Recent conversation in the same thread/chat

  3. Platform Memory Search: Relevant past interactions on the platform

  4. Shared Knowledge: Cross-platform facts and learned information

  5. User Profile: Known preferences and interaction patterns
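
As a sketch, the assembly step is essentially a concatenation of those five sources in order; the function and field names below are illustrative:

// Illustrative assembly of the five context sources into a single system prompt.
function assembleContext(parts: {
  characterPrompt: string;     // 1. base personality
  platformHistory: string[];   // 2. recent messages in this thread/chat
  platformMatches: string[];   // 3. relevant past interactions on this platform
  sharedKnowledge: string[];   // 4. cross-platform facts
  userProfile: string;         // 5. known preferences
}): string {
  return [
    parts.characterPrompt,
    '## Recent conversation', ...parts.platformHistory,
    '## Relevant past interactions', ...parts.platformMatches,
    '## Shared knowledge', ...parts.sharedKnowledge,
    '## User profile', parts.userProfile,
  ].join('\n');
}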

Data Flow Patterns

Message Processing Pipeline

  1. Reception: Platform plugin receives raw message

  2. Image Analysis: Analyze any attached images using GPT-4o vision

  3. Transformation: Convert to common Message interface

  4. Context Retrieval: Parallel fetch from platform and shared memory

  5. Task Analysis: Determine message type, complexity, and requirements

  6. LLM Selection: Route to optimal provider (Claude/GPT-4/Llama) based on task

  7. Context Assembly: Combine all context sources with image analysis

  8. Response Generation: Generate response using selected LLM provider

  9. Image Generation: Create images if response indicates need

  10. Storage: Store interaction in both memory layers with metadata

  11. Response Formatting: Platform-specific formatting with image attachment

  12. Delivery: Send response via platform API
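
Steps 4 through 8 are the core of the pipeline; using the sketches from earlier sections they reduce to roughly the following (the `embed` and `analyzeTask` helpers, the `CHARACTER_PROMPT` constant, and the `complete()` method on the selected provider are all hypothetical):

// Steps 4-8 only; `memory`, `llmRouter`, and `assembleContext` are the sketches
// from earlier sections, the remaining helpers are hypothetical.
async function respond(message: Message): Promise<string> {
  // 4. Context retrieval: platform and shared memory are fetched in parallel.
  const [history, shared] = await Promise.all([
    memory.recentHistory(message.platform, message.conversationId),
    memory.searchShared(await embed(message.content)),
  ]);

  // 5-6. Task analysis and provider selection.
  const provider = await llmRouter.selectProvider(analyzeTask(message));

  // 7-8. Context assembly and response generation.
  const prompt = assembleContext({
    characterPrompt: CHARACTER_PROMPT,
    platformHistory: history.map((m) => m.content),
    platformMatches: [],                                   // platform memory search omitted here
    sharedKnowledge: shared.map((s) => String(s.metadata?.text ?? '')),
    userProfile: '',
  });
  return provider.complete(prompt);
}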

Memory Storage Strategy

  • Immediate Storage: All interactions stored in platform-specific tables

  • Embedding Generation: Async generation of embeddings for vector storage

  • Batch Processing: Vector upserts batched for efficiency

  • Knowledge Extraction: Important facts extracted to shared knowledge
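
A minimal sketch of the async embed-and-batch step, assuming the OpenAI embeddings endpoint and the Pinecone index described above; the batcher itself is illustrative:

import OpenAI from 'openai';
import { Index } from '@pinecone-database/pinecone';

// Illustrative batcher: interactions are queued immediately and flushed periodically.
class EmbeddingBatcher {
  private queue: { id: string; text: string; metadata: Record<string, string> }[] = [];

  constructor(private openai: OpenAI, private index: Index, private batchSize = 50) {}

  enqueue(item: { id: string; text: string; metadata: Record<string, string> }) {
    this.queue.push(item);
  }

  // Called on an interval or when the queue reaches batchSize.
  async flush(namespace: string): Promise<void> {
    const batch = this.queue.splice(0, this.batchSize);
    if (batch.length === 0) return;

    const res = await this.openai.embeddings.create({
      model: 'text-embedding-ada-002',
      input: batch.map((b) => b.text),
    });

    await this.index.namespace(namespace).upsert(
      batch.map((b, i) => ({
        id: b.id,
        values: res.data[i].embedding,
        metadata: b.metadata,
      })),
    );
  }
}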

Technology Stack

  • Runtime: Node.js with TypeScript

  • Package Manager: Bun

  • Primary Database: Supabase (PostgreSQL)

  • Vector Database: Pinecone

  • Caching: Redis (for session management)

  • LLM Providers:

    • OpenRouter (Claude 3.5, GPT-4, Llama 3.1)

    • OpenAI Direct (GPT-4o for vision)

  • Image Generation: DALL-E 3 via the OpenAI API

  • Image Storage: Supabase Storage

  • Deployment: Railway/Render

MVP Scope

Included:

  • Dual-layer memory architecture

  • Multiple LLM providers via OpenRouter

  • Image analysis (GPT-4o) and image generation (DALL-E 3)

  • Twitter mentions and replies with image support

  • Telegram bot with image capabilities

  • Character system with platform adaptations

  • Vector-based semantic search

  • Basic user profiling

  • Cost tracking and optimization

Not Included:

  • Load balancing

  • CDN (images served via Supabase Storage)

  • Advanced analytics dashboard

  • Auto-scaling infrastructure

  • Real-time image processing

  • Video generation

Cost Optimization Strategy

LLM Cost Management

  • Intelligent Routing: Use cheaper models for simple tasks

  • Caching: Cache similar responses to reduce API calls

  • Rate Limiting: Prevent abuse and control costs

  • Usage Tracking: Monitor costs per platform and user
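
As a sketch, response caching and per-user cost tracking could both sit on the Redis instance already in the stack; the key formats and TTLs below are assumptions:

import { createClient } from 'redis';

const redis = createClient({ url: process.env.REDIS_URL });
await redis.connect();

// Cache: identical (platform, prompt-hash) pairs reuse the previous completion for an hour.
async function cachedCompletion(key: string, generate: () => Promise<string>): Promise<string> {
  const hit = await redis.get(`llm:cache:${key}`);
  if (hit) return hit;
  const fresh = await generate();
  await redis.set(`llm:cache:${key}`, fresh, { EX: 3600 });
  return fresh;
}

// Usage tracking: accumulate estimated cost per platform and user for daily reporting.
async function trackCost(platform: string, userId: string, usd: number): Promise<void> {
  const day = new Date().toISOString().slice(0, 10);
  await redis.incrByFloat(`llm:cost:${day}:${platform}:${userId}`, usd);
}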

Image Generation Limits

  • Daily Limits: Reasonable limits per user/platform

  • Quality Settings: Use 'standard' quality for cost efficiency

  • Size Optimization: Default to 1024x1024 for most use cases

  • Prompt Enhancement: Improve prompts to reduce regeneration needs
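
A sketch of the daily-limit check, reusing the Redis client from the previous sketch; the limit value and key format are assumptions:

// Illustrative daily image-generation limit per user and platform.
const DAILY_IMAGE_LIMIT = 10; // assumption; tune per platform

async function canGenerateImage(platform: string, userId: string): Promise<boolean> {
  const day = new Date().toISOString().slice(0, 10);
  const key = `img:count:${day}:${platform}:${userId}`;
  const count = await redis.incr(key);
  if (count === 1) {
    await redis.expire(key, 86400); // counter resets with the day
  }
  return count <= DAILY_IMAGE_LIMIT;
}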
