Testing Scripts Guide

This guide documents all available testing, debugging, and development scripts for the BILL Agent. All scripts use the unified bun start <script-name> format for consistency.

🚀 Quick Reference

To see all available scripts:

bun start

📋 Script Categories

🔧 Core Services

Start Main Services

bun start logger-ui         # Start logging UI for monitoring
bun start twitter           # Start Twitter agent with auto-posting
bun start twitter:debug     # Start Twitter agent with debug logging
bun start website           # Start website with API server
bun start dev               # Start full development environment
bun start dev:logging       # Start logging UI only
bun start telegram:ngrok    # Start Telegram with ngrok tunnel

Use Cases:

  • logger-ui: Monitor all agent activity in a real-time web interface

  • twitter: Run BILL's Twitter bot with automatic posting/replies

  • dev: Start everything needed for development

  • website: Test the web interface with chat functionality

🧪 Testing Scripts

Twitter Integration Tests

Use Cases:

  • test:twitter:dry-run: Safe testing without posting to Twitter

  • test:twitter:debug: Detailed debugging when tweets aren't working

  • test:twitter:quick: Fast verification that Twitter API is working
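
The three scripts above can be chained from safest to most verbose. The following is a minimal sketch, not part of the project: the function name twitter_check is hypothetical, and it assumes each script exits with a non-zero status on failure, which this guide does not confirm.

```shell
# Hypothetical helper: escalate from the safe test to the verbose one.
# Assumes each script returns a non-zero exit status when it fails.
twitter_check() {
  # 1. Safe pass: nothing is posted to Twitter.
  bun start test:twitter:dry-run || { echo "dry run failed" >&2; return 1; }

  # 2. Fast check that the Twitter API responds.
  #    On failure, rerun with detailed output and report the failure.
  if ! bun start test:twitter:quick; then
    bun start test:twitter:debug
    return 1
  fi
}
```

Calling twitter_check runs the dry run first and only reaches test:twitter:debug when the quick check fails, so the verbose output appears only when it is needed.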

Core Functionality Tests

📊 Context & Analysis Scripts

Context System Testing

Key Script: context:inspect

  • Purpose: Diagnostic tool to see what context data BILL has access to

  • Shows: Timeline tweets, posted tweets, market analysis, event detection

  • Use: Troubleshoot context issues, verify timeline monitoring is working

  • Note: Read-only. It doesn't collect new data; it only shows what is already stored

Timeline Monitoring

Examples:

Use Cases:

  • Populate timeline database for testing

  • Get fresh market data for context generation

  • Test image analysis on accounts with charts/graphs

🔒 Security & Safety Scripts

Use Cases:

  • Verify BILL doesn't leak system prompts or internal instructions

  • Test that safety mechanisms are working correctly

🛠️ Utility Scripts

Database & Cache Management

Authentication


🎯 Common Testing Workflows

1. First Time Setup

2. Context System Testing

3. Twitter Agent Testing

4. Debugging Issues

5. Context Pipeline Validation


🔍 Diagnostic Tools

Timeline Monitoring Status

  • context:inspect: Shows what tweets are in database and how they're being analyzed

  • timeline:trigger: Manually collects tweets to populate database

  • Environment variable: Setting TWITTER_IMMEDIATE_FETCH_CONTEXT=true enables immediate timeline fetching
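
The variable can also be set for a single run instead of being added to bill-agent/.env. A minimal sketch follows. The helper name with_immediate_fetch is hypothetical, and pairing it with the twitter script is an assumption, since this guide does not state which scripts read the variable.

```shell
# Hypothetical helper: run one command with immediate timeline fetching
# enabled for that process only, leaving the current shell untouched.
with_immediate_fetch() {
  env TWITTER_IMMEDIATE_FETCH_CONTEXT=true "$@"
}
```

For example, with_immediate_fetch bun start twitter would start the Twitter agent with the variable set, and context:inspect can then show whether tweets arrived.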

Context Analysis

  • Shows: Market analysis, trending topics, recent events

  • Includes: AI analysis of images in tweets (charts, graphs)

  • Sources: @elonmusk, @MarketWatch, @zerohedge, @federalreserve, etc.

Logging & Monitoring

  • Real-time logs: Available at http://localhost:3003 when logger-ui is running

  • Debug output: Many scripts support :debug variants for detailed logging

  • Safety checks: Built-in prompt leakage and safety detection
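
Before relying on the real-time logs, it can help to confirm that the logging UI is reachable. The sketch below uses curl against the address given above. The function name logger_ui_up is hypothetical, and it assumes the UI answers a plain GET on its root path with a success status.

```shell
# Hypothetical helper: succeed only if the logging UI answers over HTTP.
# The default URL is the one given in this guide; pass another as $1.
logger_ui_up() {
  url=${1:-http://localhost:3003}
  # -f: treat HTTP errors as failure, -s: no progress output,
  # --max-time 2: give up after two seconds.
  curl -fs -o /dev/null --max-time 2 "$url" 2>/dev/null
}
```

A typical use is logger_ui_up || echo "logger UI is not running; start it with: bun start logger-ui".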


⚡ Performance Tips

  1. Use dry-run tests for development to avoid rate limits

  2. Start logger-ui first to monitor all activity

  3. Clear cache regularly when testing context changes

  4. Use specific timeline:trigger commands rather than waiting for automatic monitoring

  5. Check context:inspect before testing context-driven features
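
Tips 1, 2, and 5 can be combined into one short pre-flight routine. This is a sketch under stated assumptions, not a project script. The function name dev_session and the LOGGER_WAIT grace period are hypothetical, and the logging UI is assumed to keep running in the background until it is stopped manually.

```shell
# Hypothetical pre-flight routine combining tips 2, 5, and 1.
dev_session() {
  # Tip 2: start the logging UI first, in the background.
  bun start logger-ui &
  logger_pid=$!
  echo "logger-ui started in background (PID $logger_pid)" >&2

  # Assumed grace period so the UI can come up before other scripts log to it.
  sleep "${LOGGER_WAIT:-2}"

  # Tip 5: check what context data already exists (read-only).
  bun start context:inspect

  # Tip 1: exercise the Twitter path without posting anything.
  bun start test:twitter:dry-run
}
```

The function returns the exit status of the dry-run test, and the background logging UI can be stopped afterwards with kill and the printed PID.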


🚨 Important Notes

  • All scripts use Bun: Run them with bun, not npm

  • Environment variables: Located in bill-agent/.env (not accessible to Cursor)

  • Character source of truth: Always reference agent/src/character/bill.ts

  • Timeline monitoring: Automatic in main agent, manual with timeline:trigger

  • Context inspection: Read-only diagnostic, doesn't collect new data


Last updated